LCORE-1179: Update metadata extraction from chunks #1073
Conversation
⌛ Warning: Rate limit exceeded

How to resolve this issue? After the wait time has elapsed, a review can be triggered. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

Walkthrough

Removed text-item metadata parsing from query endpoints; changed referenced-document extraction in query_v2 to prefer doc_url/docs_url/url/link, extract doc_title, and dedupe by (doc_url, doc_title); renamed and expanded BYOK RAG config wiring to centralized storage/backends and registered_resources with new helper constructors; tests and docs updated accordingly.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/app/endpoints/query_v2.py (1)
547-588: Include doc_id in de-duplication to avoid dropping distinct documents.
If two documents share title/URL (or URL is missing) but have different doc_id values, the current `(final_url, doc_title)` key will collapse them, which undermines the new doc_id linkage.

✅ Suggested fix
```diff
-        # Use a set to track unique documents by (doc_url, doc_title) tuple
-        seen_docs: set[tuple[Optional[str], Optional[str]]] = set()
+        # Use a set to track unique documents by (doc_id, doc_url, doc_title) tuple
+        seen_docs: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
@@
-            if (final_url, doc_title) not in seen_docs:
+            unique_key = (doc_id, final_url, doc_title)
+            if unique_key not in seen_docs:
                 documents.append(
                     ReferencedDocument(
                         doc_url=final_url, doc_title=doc_title, doc_id=doc_id
                     )
                 )
-                seen_docs.add((final_url, doc_title))
+                seen_docs.add(unique_key)
```
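The effect of widening the dedupe key is easy to see in isolation. A minimal sketch (the `dedupe_documents` helper and the dict-shaped documents are illustrative, not the project's actual types): with `doc_id` in the key, documents that share a title/URL but differ in ID survive deduplication.

```python
from typing import Optional


def dedupe_documents(docs: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (doc_id, doc_url, doc_title) triple."""
    seen: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
    unique: list[dict] = []
    for doc in docs:
        key = (doc.get("doc_id"), doc.get("doc_url"), doc.get("doc_title"))
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


docs = [
    {"doc_id": "a", "doc_url": None, "doc_title": "Guide"},
    {"doc_id": "b", "doc_url": None, "doc_title": "Guide"},  # same title, distinct id
    {"doc_id": "a", "doc_url": None, "doc_title": "Guide"},  # true duplicate
]
print(len(dedupe_documents(docs)))  # prints 2
```

Under the old `(doc_url, doc_title)` key, the same input would collapse to a single document.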
```python
                )
                seen_docs.add((final_url, filename))

        # 2. Parse from message content annotations
```
What is the reason for dropping this section?
Check also the `OpenAIResponseOutputMessageContent` type that can be part of `OpenAIResponseMessage` content. Isn't this also relevant? More specifically, the `OpenAIResponseAnnotations` object.
We already get the references to the chunks used for the response.
Annotations are useful if we want to determine exactly which files the model used to give the answer (as determined by the model) and at which position in the text.
Personally, I think we can use this data, but right now it is not needed, because the scope of referenced_docs is to keep data on all chunks retrieved and used as input for reference. The data we get from citations is redundant since it is a small subset of these docs.
I think we can make use of citations in a separate feature, after agreement across our customers, since it adds new functionality.
+1 for handling "annotations" separately
asimurka left a comment
LGTM
Force-pushed from 88ed424 to a9ccff5
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@docs/openapi.md`:
- Around line 5324-5328: The ReferencedDocument markdown table pipes are
misaligned with the header row causing MD060; fix by aligning the pipe
separators vertically and ensuring the header separator row matches the number
of columns and header styles—update the table containing the fields doc_url,
doc_title, and doc_id so the header row "| Field | Type | Description |" and the
separator row (the row of dashes) have matching columns and properly placed
pipes, and then ensure each data row (the lines with doc_url, doc_title, doc_id)
uses the same pipe alignment and column count.
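As an illustration of the alignment the lint fix asks for, a table over those three fields might look like the following (the Type and Description cells are illustrative; only the field names come from the schema):

```markdown
| Field     | Type              | Description                 |
| --------- | ----------------- | --------------------------- |
| doc_url   | AnyUrl (optional) | URL of the referenced doc   |
| doc_title | string (optional) | Title of the referenced doc |
| doc_id    | string (optional) | ID of the referenced doc    |
```

The separator row has the same column count as the header, and every data row uses matching pipes, which satisfies MD060.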
In `@src/app/endpoints/query_v2.py`:
- Around line 561-566: The code sets attributes = result.get("attributes", {})
or uses getattr(result, "attributes", {}), but if the key/attribute exists with
value None the default isn't applied and attributes becomes None causing
attributes.get(...) to raise AttributeError; change the assignment so attributes
is always a dict by applying the "or {}" pattern after both retrievals (i.e.,
ensure attributes = result.get("attributes", {}) or {} and attributes =
getattr(result, "attributes", {}) or {}), referencing the variables result and
attributes and the subsequent attributes.get(...) usage to locate where to apply
the fix.
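The pitfall this fix targets is that `dict.get(key, default)` only applies the default when the key is absent, not when its value is `None`. A minimal stdlib sketch of the failure mode and the `or {}` normalization:

```python
result = {"attributes": None}  # key present, but value is None

# The .get() default does NOT kick in, because the key exists.
attributes = result.get("attributes", {})
print(attributes)  # prints None -> attributes.get(...) would raise AttributeError

# The "or {}" pattern normalizes both a missing key and a None value to {}.
attributes = result.get("attributes", {}) or {}
print(attributes.get("title"))  # safe: prints None instead of raising
```

The same pattern applies to the `getattr(result, "attributes", {}) or {}` variant: `getattr` likewise returns the stored `None` rather than the default.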
- Around line 575-586: The current filter only appends ReferencedDocument when
doc_title or doc_url exists, which drops entries that only have doc_id; change
the if-condition to check any of doc_title, doc_url, or doc_id so documents with
only a document_id are preserved. Keep the existing final_url normalization
(treat empty string as None), use the same dedupe tuple (final_url, doc_title,
doc_id) against seen_docs, and append ReferencedDocument(doc_url=final_url,
doc_title=doc_title, doc_id=doc_id) when that tuple is not in seen_docs.
🧹 Nitpick comments (1)
src/app/endpoints/query_v2.py (1)
503-507: Consider using `doc_id` as `RAGChunk.source` (with filename fallback).

If clients are expected to join `rag_chunks` with `referenced_documents` via `doc_id`, `source` should carry that ID.

🔄 Suggested linkage-friendly source
```diff
-            rag_chunk = RAGChunk(
-                content=result.text, source=result.filename, score=result.score
-            )
+            attributes = getattr(result, "attributes", None) or {}
+            doc_id = None
+            if isinstance(attributes, dict):
+                doc_id = attributes.get("document_id") or attributes.get("doc_id")
+            source = doc_id or result.filename
+            rag_chunk = RAGChunk(content=result.text, source=source, score=result.score)
```
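The client-side join this suggestion enables can be sketched as follows (plain dicts stand in for `RAGChunk` and `ReferencedDocument`; the field values are hypothetical):

```python
# Chunks carry a document identifier in `source` when one is available,
# otherwise they fall back to the filename.
rag_chunks = [
    {"content": "chunk text", "source": "doc-123", "score": 0.92},
    {"content": "other text", "source": "report.pdf", "score": 0.71},  # filename fallback
]
referenced_documents = [
    {"doc_id": "doc-123", "doc_title": "Install Guide", "doc_url": "https://example.com/guide"},
]

# Join chunks to their documents via the shared identifier in `source`.
by_id = {d["doc_id"]: d for d in referenced_documents}
for chunk in rag_chunks:
    doc = by_id.get(chunk["source"])
    label = doc["doc_title"] if doc else chunk["source"]
    print(f"{label}: score={chunk['score']}")
```

Without `doc_id` in `source`, the lookup key would be a filename that `referenced_documents` does not expose, and the join would fail for every chunk.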
```diff
+                    doc_title = attributes.get("title")
+                    doc_id = attributes.get("document_id") or attributes.get("doc_id")

-                    # If we have at least a filename or url
-                    if filename or doc_url:
+                    if doc_title or doc_url:
                         # Treat empty string as None for URL to satisfy Optional[AnyUrl]
                         final_url = doc_url if doc_url else None
-                        if (final_url, filename) not in seen_docs:
+                        if (final_url, doc_title, doc_id) not in seen_docs:
                             documents.append(
-                                ReferencedDocument(doc_url=final_url, doc_title=filename)
+                                ReferencedDocument(
+                                    doc_url=final_url, doc_title=doc_title, doc_id=doc_id
+                                )
                             )
-                            seen_docs.add((final_url, filename))

-        # 2. Parse from message content annotations
-        elif item_type == "message":
-            content = getattr(output_item, "content", None)
-            if isinstance(content, list):
-                for part in content:
-                    # Skip if part is a string or doesn't have annotations
-                    if isinstance(part, str):
-                        continue
-
-                    annotations = getattr(part, "annotations", []) or []
-                    for annotation in annotations:
-                        # Handle both object and dict access for annotations
-                        if isinstance(annotation, dict):
-                            anno_type = annotation.get("type")
-                            anno_url = annotation.get("url")
-                            anno_title = annotation.get("title") or annotation.get(
-                                "filename"
-                            )
-                        else:
-                            anno_type = getattr(annotation, "type", None)
-                            anno_url = getattr(annotation, "url", None)
-                            anno_title = getattr(annotation, "title", None) or getattr(
-                                annotation, "filename", None
-                            )
-
-                        if anno_type == "url_citation":
-                            # Treat empty string as None
-                            final_url = anno_url if anno_url else None
-                            if (final_url, anno_title) not in seen_docs:
-                                documents.append(
-                                    ReferencedDocument(
-                                        doc_url=final_url, doc_title=anno_title
-                                    )
-                                )
-                                seen_docs.add((final_url, anno_title))
-
-                        elif anno_type == "file_citation":
-                            if (None, anno_title) not in seen_docs:
-                                documents.append(
-                                    ReferencedDocument(
-                                        doc_url=None, doc_title=anno_title
-                                    )
-                                )
-                                seen_docs.add((None, anno_title))
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's look at the actual code around lines 575-586 in query_v2.py
cat -n src/app/endpoints/query_v2.py | sed -n '570,590p'
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 1147

🏁 Script executed:
```shell
# Let's also check the ReferencedDocument class definition to understand its structure
rg -A 10 'class ReferencedDocument' src/
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 708

🏁 Script executed:
```shell
# Let's see more context around this code section to understand the full logic
cat -n src/app/endpoints/query_v2.py | sed -n '550,600p'
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 2444

🏁 Script executed:
```shell
# Get full ReferencedDocument class definition
cat -n src/models/responses.py | grep -A 20 'class ReferencedDocument'
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 840

🏁 Script executed:
```shell
# Search for where ReferencedDocument is used to understand the intent
rg 'ReferencedDocument' src/ --type py -B 2 -A 2
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 10260

🏁 Script executed:
```shell
# Check if there are any tests that show expected behavior with doc_id-only documents
fd 'test.*query' src/ --type py | head -5
```
Repository: lightspeed-core/lightspeed-stack
Length of output: 245
Include doc_id-only documents to prevent data loss.
Documents with only document_id but no title or url are currently filtered out. Since doc_id is explicitly extracted and used in deduplication, it should be preserved. Update the condition to include doc_id:
Suggested fix
```diff
-                    if doc_title or doc_url:
+                    if doc_title or doc_url or doc_id:
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
                    doc_title = attributes.get("title")
                    doc_id = attributes.get("document_id") or attributes.get("doc_id")

                    if doc_title or doc_url or doc_id:
                        # Treat empty string as None for URL to satisfy Optional[AnyUrl]
                        final_url = doc_url if doc_url else None
                        if (final_url, doc_title, doc_id) not in seen_docs:
                            documents.append(
                                ReferencedDocument(
                                    doc_url=final_url, doc_title=doc_title, doc_id=doc_id
                                )
                            )
                            seen_docs.add((final_url, doc_title, doc_id))
```
🤖 Prompt for AI Agents
In `@src/app/endpoints/query_v2.py` around lines 575 - 586, The current filter
only appends ReferencedDocument when doc_title or doc_url exists, which drops
entries that only have doc_id; change the if-condition to check any of
doc_title, doc_url, or doc_id so documents with only a document_id are
preserved. Keep the existing final_url normalization (treat empty string as
None), use the same dedupe tuple (final_url, doc_title, doc_id) against
seen_docs, and append ReferencedDocument(doc_url=final_url, doc_title=doc_title,
doc_id=doc_id) when that tuple is not in seen_docs.
referenced_docs are used because we need to show in the frontend what documents have been referenced; IMO it does not make sense to return empty documents without Title / Url.
@are-ces, that makes perfect sense! You're right that a document reference with only an ID wouldn't provide meaningful information for the frontend to display. The current condition ensures that only documents with at least a title or URL—something actionable and human-readable—are returned to the UI. Thank you for the clarification!
✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1073
File: src/app/endpoints/query_v2.py:575-586
Timestamp: 2026-01-30T13:33:34.749Z
Learning: In `src/app/endpoints/query_v2.py`, the `parse_referenced_documents_from_responses_api` function filters referenced documents to include only those with `doc_title` or `doc_url` because these documents are displayed in the frontend. Documents with only `doc_id` are intentionally excluded as they wouldn't provide useful information to end users.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/openapi.json (1)
4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure unique operationId values for GET vs POST.

Line 4293 and Line 4311 both use `handle_a2a_jsonrpc_a2a_post`, which can break client codegen (operationId must be unique). Consider distinct IDs per method.

🔧 Suggested fix
```diff
-        "operationId": "handle_a2a_jsonrpc_a2a_post",
+        "operationId": "handle_a2a_jsonrpc_a2a_get",
```
🧹 Nitpick comments (2)
src/llama_stack_configuration.py (1)
310-347: Avoid in-place mutation of `ls_config` in `enrich_byok_rag`.

Returning a new mapping reduces side-effects and aligns with the project guideline.

♻️ Proposed refactor to avoid in-place mutation
```diff
-def enrich_byok_rag(ls_config: dict[str, Any], byok_rag: list[dict[str, Any]]) -> None:
+def enrich_byok_rag(
+    ls_config: dict[str, Any], byok_rag: list[dict[str, Any]]
+) -> dict[str, Any]:
     """Enrich Llama Stack config with BYOK RAG settings.

     Args:
-        ls_config: Llama Stack configuration dict (modified in place)
+        ls_config: Llama Stack configuration dict
         byok_rag: List of BYOK RAG configurations
+
+    Returns:
+        dict[str, Any]: Enriched configuration mapping.
     """
     if len(byok_rag) == 0:
         logger.info("BYOK RAG is not configured: skipping")
-        return
+        return ls_config
+
+    output = {**ls_config}

     logger.info("Enriching Llama Stack config with BYOK RAG")

     # Add storage backends
-    if "storage" not in ls_config:
-        ls_config["storage"] = {}
-    ls_config["storage"]["backends"] = construct_storage_backends_section(
-        ls_config, byok_rag
-    )
+    storage = {**output.get("storage", {})}
+    storage["backends"] = construct_storage_backends_section(output, byok_rag)
+    output["storage"] = storage

     # Add vector_io providers
-    if "providers" not in ls_config:
-        ls_config["providers"] = {}
-    ls_config["providers"]["vector_io"] = construct_vector_io_providers_section(
-        ls_config, byok_rag
-    )
+    providers = {**output.get("providers", {})}
+    providers["vector_io"] = construct_vector_io_providers_section(output, byok_rag)
+    output["providers"] = providers

     # Add registered vector stores
-    if "registered_resources" not in ls_config:
-        ls_config["registered_resources"] = {}
-    ls_config["registered_resources"]["vector_stores"] = (
-        construct_vector_stores_section(ls_config, byok_rag)
-    )
-
-    # Add embedding models
-    ls_config["registered_resources"]["models"] = construct_models_section(
-        ls_config, byok_rag
-    )
+    registered_resources = {**output.get("registered_resources", {})}
+    registered_resources["vector_stores"] = construct_vector_stores_section(
+        output, byok_rag
+    )
+    registered_resources["models"] = construct_models_section(output, byok_rag)
+    output["registered_resources"] = registered_resources
+
+    return output
```

🔁 Caller update
```diff
-    enrich_byok_rag(ls_config, config.get("byok_rag", []))
+    ls_config = enrich_byok_rag(ls_config, config.get("byok_rag", []))
```

As per coding guidelines: Avoid in-place parameter modification anti-patterns; instead return new data structures.
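The shallow-copy idiom the refactor relies on can be demonstrated in isolation (a generic sketch; `with_backends` and the config shapes are illustrative, not the project's schema):

```python
from typing import Any


def with_backends(config: dict[str, Any], backends: dict[str, Any]) -> dict[str, Any]:
    """Return a new config with storage.backends set, leaving the input untouched."""
    # Copy the nested section before writing into it, so the caller's dict
    # (including its nested "storage" mapping) is never mutated.
    storage = {**config.get("storage", {}), "backends": backends}
    return {**config, "storage": storage}


original = {"storage": {"kind": "sqlite"}}
updated = with_backends(original, {"byok_store": {"type": "kv_sqlite"}})

print(original)  # prints {'storage': {'kind': 'sqlite'}} -- unchanged
print(updated["storage"]["backends"])  # prints {'byok_store': {'type': 'kv_sqlite'}}
```

Note that `{**config}` alone is a shallow copy; the nested `storage` mapping must also be copied before assignment, otherwise the original's inner dict would still be mutated.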
docs/openapi.json (1)
8434-8451: Clarify when `doc_id` will actually appear.

The schema allows `doc_id` alone, but the API filters out entries lacking `doc_title`/`doc_url`. A short note will prevent consumer confusion.

📝 Suggested doc clarification
```diff
-        "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n    doc_id: ID of the referenced doc."
+        "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n    doc_id: ID of the referenced doc.\n\nNote: referenced documents are only emitted when doc_title or doc_url is present; doc_id-only entries are filtered."
```

Based on learnings: In `src/app/endpoints/query_v2.py`, the `parse_referenced_documents_from_responses_api` function filters referenced documents to include only those with `doc_title` or `doc_url` because these documents are displayed in the frontend. Documents with only `doc_id` are intentionally excluded as they wouldn't provide useful information to end users.
asimurka left a comment
The update LGTM
```python
                        or attributes.get("url")
                        or attributes.get("doc_url")
                        or attributes.get("link")
                    )
```
This also implements attribute key priority, worth documenting it?
The rag-content tool creates a docs_url field. Users should not care about this, they should just care about creating the vector store via the tool and register the BYOK store in LCORE
So you can't connect an arbitrary index to LCORE? If you have one, you need to recreate it via the rag-content tool?
You can, but we don't support that. AFAIK we only support creation of RAG DBs from rag-content.
```python
                )
                seen_docs.add((final_url, filename))

        # 2. Parse from message content annotations
```
+1 for handling "annotations" separately
Force-pushed from f715457 to c259964
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/run.yaml (1)
111-112: ⚠️ Potential issue | 🟡 Minor

Duplicate `telemetry:` key in YAML configuration.

The file contains two `telemetry:` sections at lines 111-112 and 167-168. In YAML, duplicate keys cause the second occurrence to silently override the first. This results in `telemetry.enabled: true` (line 168) overwriting the earlier definition (line 112).

If both are intentional, consolidate them into a single `telemetry:` block.

🔧 Proposed fix: remove duplicate telemetry block
```diff
 safety:
   default_shield_id: llama-guard
-telemetry:
-  enabled: true
```

Also applies to: 167-168
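The silent last-wins behavior is not specific to YAML; Python's stdlib `json` parser treats duplicate keys the same way, which makes for a dependency-free demonstration (using JSON here as a stand-in, since PyYAML may not be installed):

```python
import json

# Like most YAML loaders, json.loads silently keeps the LAST duplicate key,
# mirroring how the second `telemetry:` block wins in the config file.
doc = '{"telemetry": {"sinks": "sqlite"}, "telemetry": {"enabled": true}}'
print(json.loads(doc))  # prints {'telemetry': {'enabled': True}}


# A duplicate-detecting hook turns the silent override into a hard error.
def reject_duplicates(pairs):
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key}")
        seen[key] = value
    return seen


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)  # prints: duplicate key: telemetry
```

A similar guard in the config loader would surface duplicate `telemetry:` blocks at startup instead of letting one silently shadow the other.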
docs/openapi.json (1)
4287-4312: ⚠️ Potential issue | 🟠 Major

OperationId must be unique across GET/POST.

Line 4293 and Line 4311 now share the same `operationId`, which breaks OpenAPI uniqueness guarantees and can confuse client generators.

🛠️ Suggested fix (keep GET unique)
```diff
-        "operationId": "handle_a2a_jsonrpc_a2a_post",
+        "operationId": "handle_a2a_jsonrpc_a2a_get",
```
🤖 Fix all issues with AI agents
In `@docs/openapi.json`:
- Around line 8434-8451: Update the OpenAPI docs to clarify that doc_id alone is
supplemental and that runtime only emits referenced_documents that include
doc_title or doc_url: edit the ReferencedDocument description (and/or add a
note) to state that entries with only doc_id are not emitted by
parse_referenced_documents_from_responses_api and that consumers should expect
either doc_title or doc_url to be present for displayed documents;
alternatively, if you prefer a schema-level guarantee, change the
ReferencedDocument schema to require at least one of doc_title or doc_url (e.g.,
document-level note or a oneOf/anyOf constraint) so the API contract matches
parse_referenced_documents_from_responses_api behavior.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/openapi.json (1)
4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure `operationId` values are unique across methods.

Both GET and POST for `/a2a` currently use the same `operationId`, which breaks OpenAPI uniqueness guarantees and can confuse client generators.

🔧 Suggested fix
```diff
 /a2a:
   get:
     ...
-    "operationId": "handle_a2a_jsonrpc_a2a_post"
+    "operationId": "handle_a2a_jsonrpc_a2a_get"
   post:
     ...
     "operationId": "handle_a2a_jsonrpc_a2a_post"
```

tests/configuration/run.yaml (1)
84-90: ⚠️ Potential issue | 🟠 Major

Remove unsupported telemetry provider block from the `providers` section.

The `telemetry` provider block at lines 84–90 is not supported in Llama Stack 0.3.x. Telemetry should be configured only via the top-level `telemetry.enabled: true` setting (already present at line 146), without requiring a provider entry. Remove this block to prevent configuration warnings or failures.

🛠️ Proposed change
```diff
-  telemetry:
-  - config:
-      service_name: ''
-      sinks: sqlite
-      sqlite_db_path: .llama/distributions/ollama/trace_store.db
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
```
🤖 Fix all issues with AI agents
In `@src/llama_stack_configuration.py`:
- Around line 138-145: The loop that builds BYOK backends uses vector_store_id
from each byok_rag entry and currently allows empty or missing values (producing
names like byok__storage and silent overwrites); update the code in this loop
(the byok_rag iteration where backend_name = f"byok_{vector_store_id}_storage"
and output[...] is set) to validate that vector_store_id is present and
non-empty and raise/throw a clear exception (fail fast) if not, and apply the
same validation/early-fail pattern in the other helpers mentioned
(construct_vector_stores_section, construct_vector_io_providers_section,
construct_models_section) so each function checks its vector_store_id (or
equivalent ID fields) before using them to build keys or provider/model entries.
🧹 Nitpick comments (3)
src/llama_stack_configuration.py (1)
323-347: Consider returning a new config instead of mutating `ls_config` in place.

The function currently rewrites nested sections on the passed-in dict, which makes side effects harder to reason about. Returning a new dict would better isolate changes.

As per coding guidelines, "Avoid in-place parameter modification anti-patterns; instead return new data structures".
docs/openapi.json (2)
5338-5356: Consider noting the rename from `vector_db_id` to `vector_store_id`.

This is a public schema change; adding a brief migration note in the description can help clients discover the rename without hunting release notes.

8434-8451: Clarify when `doc_id` is emitted.

The API filters out referenced documents that lack `doc_title` or `doc_url`, so `doc_id` effectively appears only alongside a title or URL. Consider adding a short note to avoid clients expecting doc_id-only entries.

Based on learnings: "In `src/app/endpoints/query_v2.py`, the `parse_referenced_documents_from_responses_api` function filters referenced documents to include only those with `doc_title` or `doc_url` because these documents are displayed in the frontend. Documents with only `doc_id` are intentionally excluded as they wouldn't provide useful information to end users."
```python
    # add new backends for each BYOK RAG
    for brag in byok_rag:
        vector_store_id = brag.get("vector_store_id", "")
        backend_name = f"byok_{vector_store_id}_storage"
        output[backend_name] = {
            "type": "kv_sqlite",
            "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
        }
```
Guard against missing `vector_store_id` to avoid invalid backend/provider IDs.

If a BYOK entry is missing `vector_store_id`, the code generates names like `byok__storage` and can silently overwrite entries. Please validate and fail fast (and mirror the same guard in `construct_vector_stores_section`, `construct_vector_io_providers_section`, and `construct_models_section`).
🛠️ Proposed fix (apply similarly in the other construct_* helpers)
```diff
     for brag in byok_rag:
-        vector_store_id = brag.get("vector_store_id", "")
+        vector_store_id = brag.get("vector_store_id")
+        if not vector_store_id:
+            raise ValueError("byok_rag entry missing vector_store_id")
         backend_name = f"byok_{vector_store_id}_storage"
         output[backend_name] = {
             "type": "kv_sqlite",
             "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
         }
```

🤖 Prompt for AI Agents
In `@src/llama_stack_configuration.py` around lines 138 - 145, The loop that
builds BYOK backends uses vector_store_id from each byok_rag entry and currently
allows empty or missing values (producing names like byok__storage and silent
overwrites); update the code in this loop (the byok_rag iteration where
backend_name = f"byok_{vector_store_id}_storage" and output[...] is set) to
validate that vector_store_id is present and non-empty and raise/throw a clear
exception (fail fast) if not, and apply the same validation/early-fail pattern
in the other helpers mentioned (construct_vector_stores_section,
construct_vector_io_providers_section, construct_models_section) so each
function checks its vector_store_id (or equivalent ID fields) before using them
to build keys or provider/model entries.
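The fail-fast guard described above can be sketched as a self-contained helper (the function name and surrounding plumbing are illustrative; only the loop body mirrors the reviewed code):

```python
from typing import Any


def build_backends(byok_rag: list[dict[str, Any]]) -> dict[str, Any]:
    """Construct BYOK backend entries, failing fast on a missing vector_store_id."""
    output: dict[str, Any] = {}
    for brag in byok_rag:
        vector_store_id = brag.get("vector_store_id")
        if not vector_store_id:
            # Raise instead of producing a "byok__storage" key that would
            # silently collide with other incomplete entries.
            raise ValueError("byok_rag entry missing vector_store_id")
        backend_name = f"byok_{vector_store_id}_storage"
        output[backend_name] = {
            "type": "kv_sqlite",
            "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
        }
    return output


print(build_backends([{"vector_store_id": "docs"}]))
# A missing ID now fails loudly instead of overwriting entries:
try:
    build_backends([{}])
except ValueError as err:
    print(err)  # prints: byok_rag entry missing vector_store_id
```

Two entries that both omit the ID would previously collapse onto the single `byok__storage` key; with the guard, the first bad entry aborts config construction.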
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/openapi.json (1)
4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure unique `operationId` values for /a2a GET and POST.

Both methods now share the same `operationId`, which can break OpenAPI tooling/codegen due to collisions. Assign a distinct ID to the GET operation (or remove/deprecate the GET route if it is no longer supported).

🔧 Suggested fix (unique GET operationId)
```diff
-        "operationId": "handle_a2a_jsonrpc_a2a_post",
+        "operationId": "handle_a2a_jsonrpc_a2a_get",
```
🧹 Nitpick comments (4)
docs/rag_guide.md (1)
242-242: Minor inconsistency: documentation still referencesvector_dbs.Line 242 mentions
vector_dbswhile the rest of the document has been updated to usevector_stores. Consider updating for consistency:-> [!NOTE] -> When experimenting with different `models`, `providers` and `vector_dbs`, you might need to manually unregister the old ones with the Llama Stack client CLI (e.g. `llama-stack-client vector_dbs list`) +> [!NOTE] +> When experimenting with different `models`, `providers` and `vector_stores`, you might need to manually unregister the old ones with the Llama Stack client CLI (e.g. `llama-stack-client vector_stores list`)tests/unit/test_llama_stack_configuration.py (1)
283-283: Consider usingtmp_pathfixture for thedb_path.Using
/tmp/store1.dbas a hardcoded path works but could be improved by leveraging pytest'stmp_pathfixture for better cross-platform compatibility and automatic cleanup.tests/unit/cache/test_postgres_cache.py (2)
518-543: Assert topic summary removal after delete.

The test name/docstring says the summary is removed, but there's no assertion verifying it. Add a post-delete list check to prevent regressions.

🧪 Suggested assertion

```diff
     deleted = cache.delete(USER_ID_1, CONVERSATION_ID_1, False)
     assert deleted is True
+    mock_cursor.fetchall.return_value = []
+    conversations = cache.list(USER_ID_1, False)
+    assert conversations == []
```
573-626: Add doc_id coverage for referenced_documents persistence.

PR objectives highlight doc_id linkage, but this test only verifies doc_title/doc_url. Adding doc_id to the test would prevent regressions in cache serialization/deserialization.

🧪 Add doc_id to the persisted payload check

```diff
-    docs = [
-        ReferencedDocument(doc_title="Test Doc", doc_url=AnyUrl("http://example.com/"))
-    ]
+    docs = [
+        ReferencedDocument(
+            doc_title="Test Doc",
+            doc_url=AnyUrl("http://example.com/"),
+            doc_id="doc_1",
+        )
+    ]
@@
-    assert json.loads(inserted_json_str) == [
-        {"doc_url": "http://example.com/", "doc_title": "Test Doc"}
-    ]
+    assert json.loads(inserted_json_str) == [
+        {"doc_url": "http://example.com/", "doc_title": "Test Doc", "doc_id": "doc_1"}
+    ]
@@
-    [{"doc_url": "http://example.com/", "doc_title": "Test Doc"}],
+    [{"doc_url": "http://example.com/", "doc_title": "Test Doc", "doc_id": "doc_1"}],
@@
     assert retrieved_entries[0] == entry_with_docs
     assert retrieved_entries[0].referenced_documents[0].doc_title == "Test Doc"
+    assert retrieved_entries[0].referenced_documents[0].doc_id == "doc_1"
```
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/app/endpoints/query_v2.py (1)
546-584: ⚠️ Potential issue | 🟠 Major — Include `document_id` in referenced documents for chunk linkage.

`attributes.document_id` is currently ignored, so `referenced_documents` can't be linked back to the originating RAG chunks. Extract it, include it in the dedupe key, and pass it through (while keeping the title/url filter). Also extend `ReferencedDocument` to store `doc_id`.

🔧 Suggested update (plus model change in `models/responses.py`)

```diff
-    seen_docs: set[tuple[Optional[str], Optional[str]]] = set()
+    seen_docs: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
     ...
     doc_title = attributes.get("title")
+    doc_id = attributes.get("doc_id") or attributes.get("document_id")
     if doc_title or doc_url:
         # Treat empty string as None for URL to satisfy Optional[AnyUrl]
         final_url = doc_url if doc_url else None
-        if (final_url, doc_title) not in seen_docs:
+        dedupe_key = (final_url, doc_title, doc_id)
+        if dedupe_key not in seen_docs:
             documents.append(
-                ReferencedDocument(doc_url=final_url, doc_title=doc_title)
+                ReferencedDocument(
+                    doc_url=final_url, doc_title=doc_title, doc_id=doc_id
+                )
             )
-            seen_docs.add((final_url, doc_title))
+            seen_docs.add(dedupe_key)
```

docs/openapi.json (2)
4287-4312: ⚠️ Potential issue | 🟠 Major — Avoid duplicate `operationId` values for /a2a GET/POST.

OpenAPI operationIds must be unique; duplicating them can break client generation and tooling.

🔧 Proposed fix

```diff
     "get": {
       "tags": [
         "a2a"
       ],
       "summary": "Handle A2A Jsonrpc",
@@
-      "operationId": "handle_a2a_jsonrpc_a2a_post",
+      "operationId": "handle_a2a_jsonrpc_a2a_get",
```
8407-8439: ⚠️ Potential issue | 🟠 Major — ReferencedDocument schema is missing `doc_id`.

PR objectives indicate `doc_id` is now part of `referenced_documents`; the OpenAPI contract should expose it.

🔧 Proposed fix

```diff
     "ReferencedDocument": {
       "properties": {
+        "doc_id": {
+          "anyOf": [
+            {
+              "type": "string",
+              "minLength": 1
+            },
+            {
+              "type": "null"
+            }
+          ],
+          "title": "Doc Id",
+          "description": "ID of the referenced document"
+        },
         "doc_url": {
           "anyOf": [
             {
               "type": "string",
@@
       "type": "object",
       "title": "ReferencedDocument",
-      "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc."
+      "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n    doc_id: ID of the referenced doc."
     },
```

Also update the referenced_documents examples (e.g., /v1/query and QueryResponse) to include doc_id when available.
🤖 Fix all issues with AI agents
In `@docs/byok_guide.md`:
- Around line 480-485: Example 2 still defines top-level keys (vector_stores,
models, tool_groups) which conflict with the new registered_resources schema;
move those blocks under the registered_resources map in Example 2 so the entries
become registered_resources:<resource_name>: { vector_stores / models /
tool_groups ... } and ensure the vector_store_id (enterprise-docs) and
provider_id (enterprise-knowledge) remain referenced inside the corresponding
registered_resources entry rather than at top level.
In `@docs/rag_guide.md`:
- Around line 83-101: The snippets place the vector_stores mapping at the top
level but the schema expects registered_resources.vector_stores; move the entire
vector_stores block under registered_resources.vector_stores to match the rest
of the config (keep keys like embedding_dimension, embedding_model, provider_id:
custom-index, and vector_store_id unchanged) and update any comments or
references that point to the top-level vector_stores so they now refer to
registered_resources.vector_stores (apply the same change for both FAISS and
pgvector examples and ensure storage.backends.rag_backend and vector_io
provider_id references remain consistent).
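Applied to a config file, the move described in the fix above might look roughly like this. This is a hedged sketch only: the backend name `rag_backend` and provider id `custom-index` come from the review comments, while the store name, path, and embedding settings are illustrative and should be checked against the project's config reference.

```yaml
storage:
  backends:
    rag_backend:                       # referenced by vector_io providers
      type: faiss
      db_path: /tmp/faiss_store.db     # illustrative path
registered_resources:
  vector_stores:                       # moved from the top level
    - vector_store_id: my-docs        # hypothetical store id
      provider_id: custom-index
      embedding_model: all-MiniLM-L6-v2
      embedding_dimension: 384
```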
In `@tests/configuration/run.yaml`:
- Around line 145-146: The config currently defines both a providers.telemetry
block and telemetry.enabled: true, which is incompatible with Llama Stack 0.3.x;
remove the providers.telemetry block (the entire providers.telemetry
configuration) and keep only telemetry.enabled: true so the file uses the
supported telemetry flag; locate the providers object in the YAML (the
providers.telemetry section) and delete that subsection, leaving the top-level
telemetry.enabled setting intact.
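After that deletion, the telemetry configuration would contain only the supported flag (sketch, assuming no other telemetry keys are needed):

```yaml
# providers.telemetry block removed entirely; only the top-level flag remains
telemetry:
  enabled: true
```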
🧹 Nitpick comments (3)
tests/unit/app/endpoints/test_query_v2.py (1)
912-951: Add doc_id assertions to lock in the new linkage.

The fixtures now carry `document_id`, but the test doesn't verify it on `ReferencedDocument`. If `doc_id` is part of the contract, add assertions so regressions are caught.

✅ Suggested test additions (once `ReferencedDocument.doc_id` exists)

```diff
     assert doc1.doc_title == "Title 1"
     assert str(doc1.doc_url) == "http://example.com/doc2"
+    assert doc1.doc_id == "doc-123"
     ...
     assert doc2.doc_title == "Title 2"
     assert str(doc2.doc_url) == "http://example.com/doc3"
+    assert doc2.doc_id == "doc-456"
```

Also applies to: 994-1007
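The extraction-and-dedupe behavior these assertions target can be sketched roughly as follows. This is not the actual implementation: `ReferencedDocument` here is a simplified stand-in for the real response model, and the attribute names (`doc_url`/`url`/`link`, `title`, `doc_id`/`document_id`) follow the review diffs rather than confirmed code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferencedDocument:
    """Simplified stand-in for the real response model."""
    doc_url: Optional[str]
    doc_title: Optional[str]
    doc_id: Optional[str] = None


def extract_referenced_documents(results: list[dict]) -> list[ReferencedDocument]:
    """Collect unique referenced documents from result attributes."""
    documents: list[ReferencedDocument] = []
    seen: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
    for result in results:
        attributes = result.get("attributes", {})
        # Prefer an explicit doc_url-style attribute, falling back to url/link.
        doc_url = (
            attributes.get("doc_url")
            or attributes.get("url")
            or attributes.get("link")
        )
        doc_title = attributes.get("title")
        doc_id = attributes.get("doc_id") or attributes.get("document_id")
        if not (doc_title or doc_url):
            continue  # nothing useful to reference
        key = (doc_url or None, doc_title, doc_id)
        if key not in seen:
            documents.append(
                ReferencedDocument(
                    doc_url=doc_url or None, doc_title=doc_title, doc_id=doc_id
                )
            )
            seen.add(key)
    return documents
```

With `doc_id` in the dedupe key, two chunks sharing a URL and title but coming from different documents both survive deduplication.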
tests/unit/test_llama_stack_configuration.py (1)
9-222: Solid coverage for the new `construct_*` helpers.

src/llama_stack_configuration.py (1)
323-347: Avoid in-place mutation of `ls_config` in `enrich_byok_rag`.

Consider returning a new enriched config instead of mutating the input dict. It will reduce side effects and align with the project's style expectations.
As per coding guidelines: "Avoid in-place parameter modification anti-patterns; instead return new data structures".
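The non-mutating alternative could be sketched like this. The function name matches the review comment, but the config keys and merge logic are assumptions, not the project's actual implementation.

```python
import copy
from typing import Any


def enrich_byok_rag(
    ls_config: dict[str, Any], byok_rag: dict[str, Any]
) -> dict[str, Any]:
    """Return a new config with BYOK RAG resources merged in.

    The caller's ls_config is deep-copied and left untouched.
    """
    enriched = copy.deepcopy(ls_config)
    registered = enriched.setdefault("registered_resources", {})
    stores = registered.setdefault("vector_stores", [])
    stores.extend(byok_rag.get("vector_stores", []))
    return enriched
```

Because the input is deep-copied, callers can compare the original and enriched configs or retry enrichment without hidden state changes.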
I hope it works ;) /lgtm
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/openapi.json (1)
4287-4312: ⚠️ Potential issue | 🟠 Major — operationId must be unique across GET/POST on /a2a.

Line 4293 and Line 4311 use the same operationId, which breaks OpenAPI uniqueness and can collide in code generators.

🔧 Suggested fix

```diff
-      "operationId": "handle_a2a_jsonrpc_a2a_post",
+      "operationId": "handle_a2a_jsonrpc_a2a_get",
```
🧹 Nitpick comments (1)
docs/openapi.json (1)
3667-3677: Clarify server-only parameters in the /v1/infer description.

Line 3667 lists `request`, `background_tasks`, and `auth` under "Args", which reads like client-supplied inputs in OpenAPI. Consider labeling them as server-side context or removing to avoid client confusion.

💡 Suggested doc tweak

```diff
-      "description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n    infer_request: The inference request containing question and context.\n    request: The FastAPI request object for accessing headers and state.\n    background_tasks: FastAPI background tasks for async Splunk event sending.\n    auth: Authentication tuple from the configured auth provider.\n\nReturns:\n    RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n    HTTPException: 503 if the LLM service is unavailable.",
+      "description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n    infer_request: The inference request containing question and context.\n\nServer-side context (not provided by clients):\n    request: The FastAPI request object for accessing headers and state.\n    background_tasks: FastAPI background tasks for async Splunk event sending.\n    auth: Authentication tuple from the configured auth provider.\n\nReturns:\n    RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n    HTTPException: 503 if the LLM service is unavailable.",
```
- now extracting doc_title, doc_url from the "attributes" in response
- removed chunk metadata extraction from results which are not "type": "file_search_call"
- citations could be added in future, needs discussion with other teams
- Updated BYOK config generation
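The first two points could be sketched roughly as follows; the response-item shape (`type`, `results`, `attributes`) is an assumption based on this description, not the actual code.

```python
def collect_chunk_attributes(output_items: list[dict]) -> list[dict]:
    """Gather per-chunk attribute dicts, but only from file_search_call items."""
    attrs: list[dict] = []
    for item in output_items:
        if item.get("type") != "file_search_call":
            continue  # other output item types no longer contribute metadata
        for result in item.get("results", []):
            attrs.append(result.get("attributes", {}))
    return attrs
```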
tisnik
left a comment
LGTM
Description
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
I have registered a vector store, called the `/v1/query` and `/v1/streaming_query` endpoints, and made sure that the `referenced_docs` field is well structured.

E.g.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation