@Prasanna721 Prasanna721 commented Jan 19, 2026

RE-RAISING Pipecat live speech PR

Added native speech-to-speech model support

Summary:

  • Speech-to-speech support - Auto-detects audio frames and injects memories into the system prompt for native audio models (Gemini Live, etc.)
  • Fix memory bloating - Replaces the memory block each turn using XML tags instead of accumulating memories across turns
  • Add temporal context - Shows recency on search results ([2d ago], [15 Jan])
  • New inject_mode param - auto (default), system, or user
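
The recency labels mentioned in the summary ([2d ago], [15 Jan]) could be produced along these lines. This is a minimal sketch, not the code from this PR: the helper name `format_recency`, the "just now" and minute/hour labels, and the seven-day cutoff for switching to an absolute date are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def format_recency(created_at: datetime, now: Optional[datetime] = None) -> str:
    """Return a short bracketed label such as "[2d ago]" or "[15 Jan]".

    Hypothetical helper: the thresholds below are illustrative assumptions.
    """
    now = now or datetime.now(timezone.utc)
    delta = now - created_at
    if delta < timedelta(minutes=1):
        return "[just now]"
    if delta < timedelta(hours=1):
        return f"[{int(delta.total_seconds() // 60)}m ago]"
    if delta < timedelta(days=1):
        return f"[{int(delta.total_seconds() // 3600)}h ago]"
    if delta < timedelta(days=7):
        return f"[{delta.days}d ago]"
    # Older results fall back to an absolute day and abbreviated month.
    return f"[{created_at.day} {created_at.strftime('%b')}]"
```

Passing `now` explicitly keeps the function deterministic in tests; both datetimes are expected to be timezone-aware so the subtraction is well defined.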

Docs update

  • Update the docs for native speech-to-speech models

cloudflare-workers-and-pages bot commented Jan 19, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ⛔ Deployment terminated | supermemory-app | 0a8c5fa | Jan 21 2026, 04:19 AM |

How to use the Graphite Merge Queue

Add the label Main to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Comment on lines +252 to +256
if MEMORY_TAG_PATTERN.search(existing_content):
    messages[system_idx]["content"] = MEMORY_TAG_PATTERN.sub(
        tagged_memory, existing_content
    )
else:

This comment was marked as outdated.
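
For readers following the "replace instead of accumulate" change, the pattern in the snippet above can be sketched in isolation. This is a sketch under stated assumptions rather than the SDK's code: the tag name `<user_memories>`, the helper name `upsert_memory_block`, and the append-on-miss behaviour of the truncated `else` branch are all hypothetical.

```python
import re

# Assumed tag name; the actual tag used by the SDK is not shown in this thread.
MEMORY_TAG_PATTERN = re.compile(r"<user_memories>.*?</user_memories>", re.DOTALL)


def upsert_memory_block(existing_content: str, memory_text: str) -> str:
    """Replace the tagged memory block if present, otherwise append it."""
    tagged_memory = f"<user_memories>\n{memory_text}\n</user_memories>"
    if MEMORY_TAG_PATTERN.search(existing_content):
        # A callable replacement keeps backslashes in memory_text literal.
        return MEMORY_TAG_PATTERN.sub(
            lambda _match: tagged_memory, existing_content, count=1
        )
    # Appending on a miss is an assumption; the original else branch is truncated.
    return f"{existing_content}\n\n{tagged_memory}".strip()
```

One design note: when `re.sub` receives a plain string as the replacement, it processes backslash escapes and group references in it, so memory text containing something like `\1` can raise an error or be altered. Passing a callable avoids that.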

claude bot commented Jan 19, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

graphite-app bot commented Jan 21, 2026

Merge activity

…ve) (#683)

@graphite-app graphite-app bot merged commit 0a8c5fa into main Jan 21, 2026
5 of 8 checks passed
Comment on lines 1 to 7
"""Utility functions for Supermemory Pipecat integration."""

-from typing import Dict, List
+from datetime import datetime, timezone
+from typing import Any, Dict, List, Union


def get_last_user_message(messages: List[Dict[str, str]]) -> str | None:

Bug: The get_last_user_message function doesn't handle multimodal message content (lists), which will cause memory retrieval to fail for those messages.
Severity: MEDIUM

Suggested Fix

Update get_last_user_message to handle cases where message['content'] is a list. Check if the content is a list and, if so, iterate through its parts to extract and join the text content into a single string, similar to the implementation in supermemory_openai/utils.py.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: packages/pipecat-sdk-python/src/supermemory_pipecat/utils.py#L1-L7

Potential issue: The function `get_last_user_message` in `supermemory_pipecat/utils.py`
assumes that the `content` of a message is always a string. However, Pipecat messages
can also contain a list for multimodal content, a scenario more likely in
speech-to-speech pipelines which this pull request supports. When a message with list
content is processed, the function will return a list instead of a string. This list is
then passed to `_retrieve_memories`, which expects a string query, causing the memory
retrieval API call to fail. The failure is caught and logged, but it silently breaks the
memory feature for any multimodal user inputs.
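
One way the suggested fix might look is sketched below. It is not the implementation in `supermemory_openai/utils.py`, which is not shown in this thread: the helper name `content_to_text` and the assumption that list content uses OpenAI-style parts such as `{"type": "text", "text": "..."}` are illustrative only.

```python
from typing import Any, Dict, List, Optional, Union

MessageContent = Union[str, List[Dict[str, Any]]]


def content_to_text(content: MessageContent) -> str:
    """Flatten string or list-of-parts content into plain text."""
    if isinstance(content, str):
        return content
    # Assumes OpenAI-style parts such as {"type": "text", "text": "..."}.
    texts = [
        part["text"]
        for part in content
        if isinstance(part, dict)
        and part.get("type") == "text"
        and isinstance(part.get("text"), str)
    ]
    return " ".join(texts)


def get_last_user_message(messages: List[Dict[str, Any]]) -> Optional[str]:
    """Return the text of the most recent user message, or None."""
    for message in reversed(messages):
        if message.get("role") == "user":
            text = content_to_text(message.get("content") or "")
            # An audio-only message yields None instead of falling back
            # to an older turn.
            return text or None
    return None
```

With this shape the caller always receives either a string query or `None`, so the memory search is never handed a list.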

3 participants