feat(tts): add FallbackAdapter for TTS failover support #1022
gokuljs wants to merge 24 commits into livekit:main from
…anisms for TTS instances
…ror handling during TTS synthesis
…kAdapter for enhanced TTS synthesis
…e method for task cancellation and resource cleanup
…ror reporting for TTS instance failures
…apter for better readability and maintainability
…module accessibility
…tries and cleanup
📝 Walkthrough
Adds an exported FallbackAdapter TTS that orchestrates multiple TTS providers with per-provider availability tracking and recovery, unified sample-rate selection and optional per-provider resampling, and public APIs: synthesize, stream, getStreamingInstance, close, plus an AvailabilityChangedEvent type.
Sequence Diagram
sequenceDiagram
participant Client
participant FallbackAdapter
participant TTS1 as TTS_Instance_1
participant TTS2 as TTS_Instance_2
participant Resampler
participant OutputStream
Client->>FallbackAdapter: synthesize(text) / stream(options)
activate FallbackAdapter
FallbackAdapter->>TTS1: request synthesis / start stream
activate TTS1
alt TTS1 returns audio
TTS1-->>FallbackAdapter: audio chunks
else TTS1 errors/unavailable
TTS1-->>FallbackAdapter: error
FallbackAdapter->>FallbackAdapter: mark TTS1 unavailable & schedule recovery
FallbackAdapter->>TTS2: attempt synthesis / start stream (fallback)
activate TTS2
TTS2-->>FallbackAdapter: audio chunks or error
deactivate TTS2
end
deactivate TTS1
alt resampling required
FallbackAdapter->>Resampler: resample chunks to target sample rate/channels
Resampler-->>FallbackAdapter: resampled chunks
end
FallbackAdapter->>OutputStream: write audio chunks
OutputStream-->>Client: deliver audio
deactivate FallbackAdapter
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@agents/src/tts/fallback_adapter.ts`:
- Around line 1-3: Update the SPDX header year from 2024 to 2025 in the new
file's top-of-file comments: replace the existing '// SPDX-FileCopyrightText:
2024 LiveKit, Inc.' with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and
ensure the accompanying '// SPDX-License-Identifier: Apache-2.0' header remains
present (these are the two header comment lines at the top of the file).
- Around line 183-186: In the async close() method of fallback_adapter.ts the
this._recoveryTimeouts.forEach callback returns a value which triggers
lint/suspicious/useIterableCallbackReturn; fix it by replacing the forEach call
with an explicit loop (e.g., for (const timeoutId of this._recoveryTimeouts)
clearTimeout(timeoutId)) or by changing the forEach callback to use a block body
with no return, then keep the subsequent this._recoveryTimeouts.clear() call
intact so all timeouts are cleared and the lint warning is resolved.
- Around line 226-283: In run() detect when a TTS instance produced no audio by
adding a boolean (e.g., receivedAudio) that is set to true inside the for-await
loop when any audio/frame is queued; after the stream loop and before
flushing/resolving (before the "Flush any remaining resampled frames" block and
before the "TTS synthesis succeeded" log) check if receivedAudio is false and
treat it as a failure: call this.adapter.markUnAvailable(i) (or throw the caught
error to trigger the outer catch/continuation) and continue to the next TTS
instead of returning success; ensure flushing only runs when there was
audio/resampler activity and lastRequestId/lastSegmentId are valid so empty
streams don’t produce silent success.
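The third item above (treating a stream that ends without audio as a failure) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Provider` interface, `synthesizeWithFallback`, and the `markUnavailable` callback are hypothetical names, and frames are buffered in an array here only to keep the example short, whereas the adapter queues them as they arrive.

```typescript
// Hypothetical minimal provider shape; the real TTS classes in the PR are richer.
interface Provider {
  label: string;
  synthesize(text: string): AsyncIterable<Int16Array>;
}

// Try each provider in order. A stream that completes without yielding any
// frame is treated exactly like a thrown error: mark it and move on.
async function synthesizeWithFallback(
  providers: Provider[],
  text: string,
  markUnavailable: (index: number) => void,
): Promise<Int16Array[]> {
  for (const [index, provider] of providers.entries()) {
    const frames: Int16Array[] = [];
    let receivedAudio = false;
    try {
      for await (const frame of provider.synthesize(text)) {
        receivedAudio = true;
        frames.push(frame);
      }
      if (!receivedAudio) {
        throw new Error(`${provider.label} returned no audio`);
      }
      return frames;
    } catch {
      // Covers both real errors and the "silent success" case above.
      markUnavailable(index);
    }
  }
  throw new Error('all TTS providers failed');
}
```

Routing the empty-stream case through the same `catch` keeps a single failure path, so the availability bookkeeping cannot diverge between the two kinds of failure.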
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
agents/src/tts/fallback_adapter.ts
agents/src/tts/index.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
Files:
agents/src/tts/index.ts
agents/src/tts/fallback_adapter.ts
**/*.{ts,tsx}?(test|example|spec)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')
Files:
agents/src/tts/index.ts
agents/src/tts/fallback_adapter.ts
**/*.{ts,tsx}?(test|example)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'
Files:
agents/src/tts/index.ts
agents/src/tts/fallback_adapter.ts
🧬 Code graph analysis (1)
agents/src/tts/fallback_adapter.ts (3)
agents/src/utils.ts (2)
Task (420-532), error (243-245)
agents/src/types.ts (1)
DEFAULT_API_CONNECT_OPTIONS (22-26)
agents/src/_exceptions.ts (2)
APIError (30-52), APIConnectionError (102-114)
🪛 Biome (2.3.13)
agents/src/tts/fallback_adapter.ts
[error] 185-185: This callback passed to forEach() iterable method should not return a value.
Either remove this return or remove the returned value.
(lint/suspicious/useIterableCallbackReturn)
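For illustration, the lint finding above concerns an expression-bodied arrow passed to `forEach`, which implicitly returns the result of the call. A small sketch of the loop-based alternative suggested in the review, where `clearRecoveryTimeouts` is a hypothetical free function standing in for the relevant part of the adapter's `close()`:

```typescript
type TimeoutHandle = ReturnType<typeof setTimeout>;

// Hypothetical stand-in for the cleanup of the adapter's _recoveryTimeouts.
function clearRecoveryTimeouts(timeouts: Set<TimeoutHandle>): void {
  // Flagged form: the arrow body is an expression, so the callback returns a value.
  //   timeouts.forEach((id) => clearTimeout(id));
  // An explicit loop has no callback, so there is nothing to return.
  for (const id of timeouts) {
    clearTimeout(id);
  }
  timeouts.clear();
}
```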
🔇 Additional comments (3)
agents/src/tts/fallback_adapter.ts (2)
46-83: LGTM: capability aggregation and resampler setup look solid.
312-432: LGTM: streaming path prevents mid‑utterance fallback and detects silent output.
agents/src/tts/index.ts (1)
13-13: LGTM: public re-export keeps the API surface consistent.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@agents/src/tts/fallback_adapter.ts`:
- Around line 131-137: The recovery loop consumes testStream but never checks if
any audio frames were yielded, so update the logic in the recovery routine that
iterates testStream (the for await (const _ of testStream) loop) to track
whether at least one frame was received (e.g., set a boolean like gotFrame =
true when a chunk is yielded), and only set status.available = true, clear
status.recoveringTask and call this._logger.info({ tts: tts.label }, 'TTS
recovered') and this.emitAvailabilityChanged(tts, true) when gotFrame is true;
if no frames are received after the stream completes, leave status.available
false (or set an appropriate failure state), clear/adjust status.recoveringTask
and log/emit a recovery-failed message instead so a TTS that connects but
returns no audio is not marked recovered.
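A rough sketch of the suggested recovery check, assuming a simplified status object. `ProviderStatus`, `probeRecovery`, and the `onChange` callback are illustrative names rather than the PR's actual members, and retry scheduling is left out:

```typescript
// Hypothetical, trimmed-down version of the per-provider status record.
interface ProviderStatus {
  available: boolean;
}

// Consume a test stream and only report recovery if at least one frame arrived.
async function probeRecovery(
  testStream: AsyncIterable<unknown>,
  status: ProviderStatus,
  onChange: (available: boolean) => void,
): Promise<boolean> {
  let gotFrame = false;
  try {
    for await (const _frame of testStream) {
      gotFrame = true;
    }
  } catch {
    // A failing probe leaves the provider unavailable.
    return false;
  }
  if (gotFrame) {
    status.available = true;
    onChange(true);
  }
  // A provider that connects but yields nothing is still treated as down.
  return gotFrame;
}
```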
🧹 Nitpick comments (1)
agents/src/tts/fallback_adapter.ts (1)
109-115: Simplify the emit call by removing unnecessary type assertion.
FallbackAdapter extends TTS, which extends EventEmitter, so this.emit() should be directly callable without the complex type assertion.
♻️ Proposed simplification
 private emitAvailabilityChanged(tts: TTS, available: boolean): void {
   const event: AvailabilityChangedEvent = { tts, available };
-  (this as unknown as { emit: (event: string, data: AvailabilityChangedEvent) => void }).emit(
-    'tts_availability_changed',
-    event,
-  );
+  this.emit('tts_availability_changed', event);
 }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
agents/src/tts/fallback_adapter.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
Files:
agents/src/tts/fallback_adapter.ts
**/*.{ts,tsx}?(test|example|spec)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')
Files:
agents/src/tts/fallback_adapter.ts
**/*.{ts,tsx}?(test|example)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'
Files:
agents/src/tts/fallback_adapter.ts
🧠 Learnings (1)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
Applied to files:
agents/src/tts/fallback_adapter.ts
🔇 Additional comments (5)
agents/src/tts/fallback_adapter.ts (5)
1-12: LGTM! License header and imports are correctly structured. The SPDX header uses the required 2025 year per coding guidelines.
13-33: LGTM! Interface definitions and default options are well-structured. Setting maxRetry: 0 at the adapter level appropriately delegates retry responsibility to the per-instance fallback mechanism.
46-77: LGTM! Constructor validation is thorough, enforcing at least one TTS instance and consistent channel counts. The sample rate normalization strategy (using maximum) with per-instance resampler caching is efficient and correct.
151-161: LGTM! The markUnAvailable method correctly guards against redundant state changes and triggers recovery appropriately.
299-439: LGTM! The FallbackSynthesizeStream implementation is well-designed:
- Token buffering enables replay on fallback.
- The audioPushed flag correctly detects silent TTS failures.
- Mid-utterance failures are properly handled by throwing rather than attempting fallback (lines 415-421), which prevents corrupted audio output.
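One way to picture how the three points above fit together is the sketch below. `ReplayableInput` is a hypothetical helper invented for this example; the review indicates that the PR keeps the equivalent state inside FallbackSynthesizeStream, and its exact shape there is not shown in this conversation.

```typescript
type FailureDecision =
  | { action: 'fallback'; replay: string[] }
  | { action: 'throw' };

// Hypothetical helper combining token buffering with the audioPushed flag.
class ReplayableInput {
  private readonly tokens: string[] = [];
  private audioPushed = false;

  // Record every token so a later provider can be fed the same input.
  pushToken(token: string): void {
    this.tokens.push(token);
  }

  // Call when the first audio frame reaches the output queue.
  markAudioPushed(): void {
    this.audioPushed = true;
  }

  // Decide what to do when the current provider errors or ends silently.
  onProviderFailure(): FailureDecision {
    if (this.audioPushed) {
      // Audio already reached the listener: switching voices mid-utterance
      // would corrupt the output, so surface the error instead.
      return { action: 'throw' };
    }
    return { action: 'fallback', replay: [...this.tokens] };
  }
}
```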
…sing block syntax for clearTimeout
…ures during TTS synthesis
…nal options for TTS instance management
@toubatbrian This is ready for review. Please let me know if anything needs to be changed.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@agents/src/tts/fallback_adapter.ts`:
- Around line 359-418: When running forwardInput() and processOutput()
concurrently, ensure forwardInput stops consuming this.input if processOutput
throws: catch errors inside processOutput (and forwardInput) and on any failure
call this.abortController.abort() (or set a shared cancellation flag) so
forwardInput breaks out of its for-await loop, close the stream, and rethrow the
error; await both tasks with Promise.all and handle rejections so no silent
token consumption occurs. Apply the same guarded cancellation pattern to the
analogous block around lines 427-444 (the other forwardInput/processOutput
pair).
- Around line 123-151: The recovery Task created in status.recoveringTask via
Task.from ignores abort signals, so close() may hang and the catch can schedule
retries after shutdown; to fix, create an AbortController inside the Task (or
obtain the Task's signal) and pass its signal into tts.synthesize (the
synthesize call inside the recovery Task), check signal.aborted in the
loop/after await to throw/exit early, and in the catch skip scheduling a retry
(do not set a timeout in _recoveryTimeouts or call tryRecovery) if the
controller.signal.aborted; ensure status.recoveringTask is cleared and no retry
timeout is created when aborted (affects status.recoveringTask, Task.from
recovery code, tts.synthesize call, tryRecovery, recoveryDelayMs, and
_recoveryTimeouts).
- Around line 278-295: The code throws APIConnectionError when audioReceived is
false even though status.resampler.flush() may have produced frames during the
flush loop; update the flush handling in fallback_adapter.ts (the block
iterating over status.resampler.flush()) to mark audioReceived = true when any
frame is yielded before calling this.queue.put (so the variable reflects flushed
audio), ensuring subsequent verification of audioReceived does not falsely
trigger the error.
- Around line 403-413: When draining the resampler (status.resampler.flush())
ensure you set the appropriate audio flags when frames are yielded: in the
streaming path set audioPushed = true inside the flush loop (the same place you
call this.queue.put for each flushed frame) so the streaming logic recognizes
audio was produced, and in the chunked-synthesis path set audioReceived = true
inside its flush loop so chunked synthesis won’t falsely throw "no audio was
received"; locate the flush loops where status.resampler.flush() is iterated and
add the flag assignment alongside the existing queue.put / frame handling.
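The guarded cancellation pattern from the first item above might look roughly like this. It is a sketch under assumptions: `runGuarded`, `send`, and the shape of `processOutput` are hypothetical, and the abort check only runs when the next token arrives, so a real implementation may also need to cancel a pending read.

```typescript
// Run input forwarding and output processing concurrently; if output
// processing fails, abort so the forwarder stops consuming tokens.
async function runGuarded(
  input: AsyncIterable<string>,
  send: (token: string) => void,
  processOutput: (signal: AbortSignal) => Promise<void>,
): Promise<void> {
  const controller = new AbortController();

  const forwardInput = async (): Promise<void> => {
    for await (const token of input) {
      if (controller.signal.aborted) break; // stop silently consuming tokens
      send(token);
    }
  };

  const guardedOutput = async (): Promise<void> => {
    try {
      await processOutput(controller.signal);
    } catch (error) {
      controller.abort(); // tell forwardInput to stop
      throw error; // rethrow so the caller can fall back or fail
    }
  };

  // Promise.all rejects as soon as either task rejects.
  await Promise.all([forwardInput(), guardedOutput()]);
}
```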
🧬 Code graph analysis (1)
agents/src/tts/fallback_adapter.ts (4)
agents/src/utils.ts (2)
Task (420-532), error (243-245)
agents/src/types.ts (1)
DEFAULT_API_CONNECT_OPTIONS (22-26)
agents/src/tokenize/index.ts (1)
basic (16-16)
agents/src/_exceptions.ts (2)
APIConnectionError (102-114), APIError (30-52)
🔇 Additional comments (7)
agents/src/tts/fallback_adapter.ts (7)
1-12: SPDX header and imports look good.
13-33: Type definitions and default fallback options are clear.
35-115: Constructor validation, sample-rate normalization, and event forwarding are solid.
155-185: Availability tracking and stream factories look good.
203-211: Listener cleanup and TTS shutdown are thorough.
214-231: Chunked stream wiring is straightforward.
315-326: Streaming adapter setup looks good.
… and improve retry logic
…ream and FallbackSynthesizeStream
…rt signals and ensure proper stream closure
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac026870e7
agents/src/tts/fallback_adapter.ts
Outdated
// Use cached resampler for this TTS instance
const resampler = status.resampler;
if (resampler) {
  for (const frame of resampler.push(audio.frame)) {
Create a resampler per stream to avoid cross-talk
The cached resampler is stored per TTS instance and then reused by every FallbackChunkedStream/FallbackSynthesizeStream call. AudioResampler is stateful (it buffers samples until flush()), so if two syntheses run concurrently on the same TTS instance, their frames will be interleaved in the shared resampler and the flush() from one stream can drain buffered audio from the other. This leads to corrupted or missing audio when the adapter is used for parallel syntheses (a common pattern with multiple speakers/requests). Consider instantiating a new resampler per stream or per call instead of sharing the instance from status.
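To make the cross-talk concern concrete, here is a toy sketch. `ToyResampler` is a hypothetical stand-in that only mimics the buffering behaviour described above (it does no real resampling), and `createResamplerForStream` is an illustrative factory in the spirit of a per-stream resampler, not the PR's implementation.

```typescript
// Toy stand-in for a stateful resampler: buffers input and emits only on flush().
class ToyResampler {
  private buffered: number[] = [];

  push(samples: number[]): number[][] {
    this.buffered.push(...samples);
    return []; // nothing emitted until flush in this toy model
  }

  flush(): number[][] {
    const out = this.buffered;
    this.buffered = [];
    return out.length > 0 ? [out] : [];
  }
}

// Hypothetical factory: each stream gets its own instance, so one stream's
// flush() cannot drain samples buffered by another stream.
function createResamplerForStream(needsResampling: boolean): ToyResampler | undefined {
  return needsResampling ? new ToyResampler() : undefined;
}
```

With a single shared instance, two concurrent streams would push into the same buffer and the first `flush()` would return both streams' samples; with one instance per stream, each flush returns only that stream's audio.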
…e createResamplerForTTS method for better management of audio resampling
@toubatbrian added a fix for that edge case
This adds a FallbackAdapter for TTS that lets you configure multiple TTS providers and automatically switches to the next one if the current one fails. It handles both connection errors and silent failures where the TTS connects but doesn't return any audio. Failed providers are automatically tested in the background and restored when they come back online. It also normalizes sample rates across different providers so you can mix and match TTS services without worrying about audio format differences.
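As a rough illustration of the sample-rate normalization mentioned here, the sketch below picks the highest sample rate among the providers and notes which ones would need resampling, following the "using maximum" strategy and the channel-count check described in the review comments. `ProviderInfo` and `planSampleRates` are hypothetical names, not part of the adapter's API.

```typescript
// Hypothetical description of a provider's output format.
interface ProviderInfo {
  label: string;
  sampleRate: number;
  numChannels: number;
}

interface SampleRatePlan {
  targetSampleRate: number;
  needsResampling: boolean[]; // one entry per provider, in input order
}

function planSampleRates(providers: ProviderInfo[]): SampleRatePlan {
  const [first] = providers;
  if (first === undefined) {
    throw new Error('at least one TTS instance is required');
  }
  // All providers must agree on channel count; only the rate is normalized.
  if (providers.some((p) => p.numChannels !== first.numChannels)) {
    throw new Error('all TTS instances must have the same number of channels');
  }
  // Use the maximum rate so no provider's output has to be downsampled.
  const targetSampleRate = Math.max(...providers.map((p) => p.sampleRate));
  return {
    targetSampleRate,
    needsResampling: providers.map((p) => p.sampleRate !== targetSampleRate),
  };
}
```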
Summary by CodeRabbit
New Features
Bug Fixes