
Conversation

@are-ces (Contributor) commented Jan 28, 2026

Description

  • now extracting doc_title and doc_url from the "attributes" field in the response
  • removed chunk metadata extraction from results that are not of "type": "file_search_call"
  • deleted deprecated method in old query endpoint
  • updated the BYOK config generation
  • updated documentation for BYOK and RAG

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude

Related Tickets & Documents

  • Related Issue: LCORE-1179
  • Closes: LCORE-1179

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

I registered a vector store, called the /v1/query and /v1/streaming_query endpoints, and verified that the referenced_docs field is well structured.

E.g.

"referenced_documents": [
    {
      "doc_url": "https://www.redhat.com/openshift_docs/virt/about_virt/about-virt.txt",
      "doc_title": "About OpenShift Virtualization"
    },

Summary by CodeRabbit

  • New Features

    • Referenced-document extraction now prefers document titles and improved URL resolution (doc_url, docs_url, url, link); deduplication uses title+URL instead of filenames.
    • Centralized configuration: added storage.backends and registered_resources (vector_stores, models); BYOK RAG wiring now populates storage, providers, vector stores, and model registrations.
    • Inference endpoint accepts additional request context inputs for richer integrations.
  • Bug Fixes

    • A2A endpoint mapping corrected to use POST.
  • Documentation

    • Docs/OpenAPI updated: vector_db_id renamed to vector_store_id and new storage/registered_resources schema documented.

@are-ces are-ces requested review from asimurka and tisnik January 28, 2026 15:57
@coderabbitai (Contributor)

coderabbitai bot commented Jan 28, 2026

Warning: rate limit exceeded. @are-ces has exceeded the number of commits that can be reviewed per hour; after the wait time elapses, a review can be re-triggered with the @coderabbitai review command or by pushing new commits.

Walkthrough

Removed text-item metadata parsing from query endpoints; changed referenced-document extraction in query_v2 to prefer doc_url/docs_url/url/link, extract doc_title, and dedupe by (doc_url, doc_title); renamed and expanded BYOK RAG config wiring to centralized storage/backends and registered_resources with new helper constructors; tests and docs updated accordingly.
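The extraction rules described above can be sketched as a standalone function. This is a simplified approximation of the described query_v2 behavior, not the actual implementation; plain dicts stand in for the file_search_call results and for ReferencedDocument:

```python
from typing import Any, Optional


def extract_referenced_documents(
    results: list[dict[str, Any]],
) -> list[dict[str, Optional[str]]]:
    """Sketch: pick a URL by precedence, require a title or URL, dedupe."""
    documents: list[dict[str, Optional[str]]] = []
    seen: set[tuple[Optional[str], Optional[str]]] = set()
    for result in results:
        # "or {}" also guards against an explicit None attributes value
        attributes = result.get("attributes") or {}
        doc_url = (
            attributes.get("doc_url")
            or attributes.get("docs_url")
            or attributes.get("url")
            or attributes.get("link")
        )
        doc_title = attributes.get("title")
        if doc_title or doc_url:  # doc_id-only entries are skipped
            final_url = doc_url or None  # treat "" as None
            if (final_url, doc_title) not in seen:
                documents.append({"doc_url": final_url, "doc_title": doc_title})
                seen.add((final_url, doc_title))
    return documents
```

For example, a result carrying both doc_url and docs_url resolves to doc_url, a repeated (url, title) pair is emitted once, and a result with only document_id is dropped.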

Changes

  • Query endpoints (src/app/endpoints/query.py, src/app/endpoints/query_v2.py): Removed parse_metadata_from_text_item and all TextContentItem usage from query.py. In query_v2.py, updated referenced-document parsing: URL precedence doc_url → docs_url → url → link, extract doc_title from attributes, eligibility based on doc_title or doc_url, dedupe by (final_url, doc_title), and construct ReferencedDocument with doc_url/doc_title. Removed filename/annotation branches.
  • Models & responses (src/models/responses.py, src/models/config.py): Minor import reorder in responses.py. Updated the ByokRag.vector_db_id description wording to "Vector database identification." in config.py.
  • BYOK / Llama Stack config (src/llama_stack_configuration.py): Renamed construct_vector_dbs_section → construct_storage_backends_section; added construct_models_section and construct_vector_stores_section, and expanded construct_vector_io_providers_section. enrich_byok_rag now populates storage.backends, providers.vector_io, registered_resources.vector_stores, and registered_resources.models.
  • Tests, endpoints (tests/unit/app/endpoints/test_query.py, tests/unit/app/endpoints/test_query_v2.py): Removed tests/imports referencing parse_metadata_from_text_item and TextContentItem. Updated test_query_v2 fixtures/assertions to use docs_url, title, document_id attributes and new referenced-doc expectations (reduced referenced docs and added RAG chunk content/source/score assertions).
  • Tests, config & cache (tests/unit/cache/test_postgres_cache.py, tests/unit/test_llama_stack_configuration.py): Import adjustments in the Postgres cache test. Config tests updated to validate the new vector_stores, storage.backends, providers.vector_io, and registered_resources.models wiring and ID/name mappings.
  • YAML examples & CI configs (tests/configuration/run.yaml, examples/run.yaml, run.yaml, tests/e2e/configs/run-ci.yaml): Removed top-level conversations_store, inference_store, metadata_store. Replaced vector_dbs with vector_stores under registered_resources. Introduced storage.backends/stores and adjusted provider persistence and vector_io persistence/backends.
  • Docs / OpenAPI / Guides (docs/openapi.json, docs/openapi.md, docs/byok_guide.md, docs/rag_guide.md): Renamed vector_db_id → vector_store_id and updated descriptions; extended /v1/infer to include request and background_tasks; updated A2A operationIds to POST variants; reworked the BYOK/RAG docs to the central storage/backends and registered_resources model.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • tisnik
  • radofuchs
🚥 Pre-merge checks | ✅ 3 passed

  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled).
  • Title Check: ✅ Passed. The title accurately summarizes the primary change: updating metadata extraction logic to extract doc_title and doc_url from chunk attributes instead of filenames, which is reflected in the substantial refactoring across query endpoints and configuration files.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


coderabbitai bot left a comment (Contributor)

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/endpoints/query_v2.py (1)

547-588: Include doc_id in de-duplication to avoid dropping distinct documents.
If two documents share title/URL (or URL is missing) but have different doc_id values, the current (final_url, doc_title) key will collapse them, which undermines the new doc_id linkage.

✅ Suggested fix
-    # Use a set to track unique documents by (doc_url, doc_title) tuple
-    seen_docs: set[tuple[Optional[str], Optional[str]]] = set()
+    # Use a set to track unique documents by (doc_id, doc_url, doc_title) tuple
+    seen_docs: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
@@
-                if (final_url, doc_title) not in seen_docs:
+                unique_key = (doc_id, final_url, doc_title)
+                if unique_key not in seen_docs:
                     documents.append(
                         ReferencedDocument(
                             doc_url=final_url, doc_title=doc_title, doc_id=doc_id
                         )
                     )
-                    seen_docs.add((final_url, doc_title))
+                    seen_docs.add(unique_key)

@are-ces are-ces requested a review from onmete January 28, 2026 16:15
Context (src/app/endpoints/query_v2.py):

            )
            seen_docs.add((final_url, filename))

    # 2. Parse from message content annotations
@asimurka (Contributor) commented Jan 29, 2026

What is the reason for dropping this section?
Also check the OpenAIResponseOutputMessageContent type, which can be part of OpenAIResponseMessage content; isn't that relevant too? More specifically, the OpenAIResponseAnnotations object.

@are-ces (Contributor, Author) replied:

We already get the references to the chunks used for the response.

Annotations are useful if we want to determine exactly which files the model used to produce the answer (as determined by the model) and at which position in the text.

Personally I think we can use this data, but right now it is not needed: the scope of referenced_docs is to keep data for all chunks retrieved and used as input for reference. The data we get from citations is redundant, since it is a small subset of these docs.

I think we can make use of citations in a separate feature, after agreement across our customers, since it adds new functionality.

Contributor:
+1 for handling "annotations" separately

@asimurka (Contributor) left a comment:
LGTM

@are-ces are-ces marked this pull request as draft January 30, 2026 09:31
@are-ces are-ces force-pushed the fix-metadata branch 3 times, most recently from 88ed424 to a9ccff5 Compare January 30, 2026 12:09
@are-ces are-ces marked this pull request as ready for review January 30, 2026 12:10
coderabbitai bot left a comment (Contributor)

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/openapi.md`:
- Around line 5324-5328: The ReferencedDocument markdown table pipes are
misaligned with the header row causing MD060; fix by aligning the pipe
separators vertically and ensuring the header separator row matches the number
of columns and header styles—update the table containing the fields doc_url,
doc_title, and doc_id so the header row "| Field | Type | Description |" and the
separator row (the row of dashes) have matching columns and properly placed
pipes, and then ensure each data row (the lines with doc_url, doc_title, doc_id)
uses the same pipe alignment and column count.

In `@src/app/endpoints/query_v2.py`:
- Around line 561-566: The code sets attributes = result.get("attributes", {})
or uses getattr(result, "attributes", {}), but if the key/attribute exists with
value None the default isn't applied and attributes becomes None causing
attributes.get(...) to raise AttributeError; change the assignment so attributes
is always a dict by applying the "or {}" pattern after both retrievals (i.e.,
ensure attributes = result.get("attributes", {}) or {} and attributes =
getattr(result, "attributes", {}) or {}), referencing the variables result and
attributes and the subsequent attributes.get(...) usage to locate where to apply
the fix.
- Around line 575-586: The current filter only appends ReferencedDocument when
doc_title or doc_url exists, which drops entries that only have doc_id; change
the if-condition to check any of doc_title, doc_url, or doc_id so documents with
only a document_id are preserved. Keep the existing final_url normalization
(treat empty string as None), use the same dedupe tuple (final_url, doc_title,
doc_id) against seen_docs, and append ReferencedDocument(doc_url=final_url,
doc_title=doc_title, doc_id=doc_id) when that tuple is not in seen_docs.
🧹 Nitpick comments (1)
src/app/endpoints/query_v2.py (1)

503-507: Consider using doc_id as RAGChunk.source (with filename fallback).

If clients are expected to join rag_chunks with referenced_documents via doc_id, source should carry that ID.

🔄 Suggested linkage-friendly source
-            rag_chunk = RAGChunk(
-                content=result.text, source=result.filename, score=result.score
-            )
+            attributes = getattr(result, "attributes", None) or {}
+            doc_id = None
+            if isinstance(attributes, dict):
+                doc_id = attributes.get("document_id") or attributes.get("doc_id")
+            source = doc_id or result.filename
+            rag_chunk = RAGChunk(content=result.text, source=source, score=result.score)

Comment on lines 575 to 583

+                doc_title = attributes.get("title")
+                doc_id = attributes.get("document_id") or attributes.get("doc_id")
+
-                # If we have at least a filename or url
-                if filename or doc_url:
+                if doc_title or doc_url:
                     # Treat empty string as None for URL to satisfy Optional[AnyUrl]
                     final_url = doc_url if doc_url else None
-                    if (final_url, filename) not in seen_docs:
+                    if (final_url, doc_title, doc_id) not in seen_docs:
                         documents.append(
-                            ReferencedDocument(doc_url=final_url, doc_title=filename)
+                            ReferencedDocument(
+                                doc_url=final_url, doc_title=doc_title, doc_id=doc_id
+                            )
                         )
-                        seen_docs.add((final_url, filename))
+                        seen_docs.add((final_url, doc_title, doc_id))
-
-                # 2. Parse from message content annotations
-                elif item_type == "message":
-                    content = getattr(output_item, "content", None)
-                    if isinstance(content, list):
-                        for part in content:
-                            # Skip if part is a string or doesn't have annotations
-                            if isinstance(part, str):
-                                continue
-
-                            annotations = getattr(part, "annotations", []) or []
-                            for annotation in annotations:
-                                # Handle both object and dict access for annotations
-                                if isinstance(annotation, dict):
-                                    anno_type = annotation.get("type")
-                                    anno_url = annotation.get("url")
-                                    anno_title = annotation.get("title") or annotation.get(
-                                        "filename"
-                                    )
-                                else:
-                                    anno_type = getattr(annotation, "type", None)
-                                    anno_url = getattr(annotation, "url", None)
-                                    anno_title = getattr(annotation, "title", None) or getattr(
-                                        annotation, "filename", None
-                                    )
-
-                                if anno_type == "url_citation":
-                                    # Treat empty string as None
-                                    final_url = anno_url if anno_url else None
-                                    if (final_url, anno_title) not in seen_docs:
-                                        documents.append(
-                                            ReferencedDocument(
-                                                doc_url=final_url, doc_title=anno_title
-                                            )
-                                        )
-                                        seen_docs.add((final_url, anno_title))
-
-                                elif anno_type == "file_citation":
-                                    if (None, anno_title) not in seen_docs:
-                                        documents.append(
-                                            ReferencedDocument(
-                                                doc_url=None, doc_title=anno_title
-                                            )
-                                        )
-                                        seen_docs.add((None, anno_title))
coderabbitai bot commented Jan 30, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed against lightspeed-core/lightspeed-stack (outputs omitted):

# Look at the code around lines 575-586 in query_v2.py
cat -n src/app/endpoints/query_v2.py | sed -n '570,590p'

# Check the ReferencedDocument class definition and its structure
rg -A 10 'class ReferencedDocument' src/

# More context around this code section to understand the full logic
cat -n src/app/endpoints/query_v2.py | sed -n '550,600p'

# Get the full ReferencedDocument class definition
cat -n src/models/responses.py | grep -A 20 'class ReferencedDocument'

# Search for where ReferencedDocument is used to understand the intent
rg 'ReferencedDocument' src/ --type py -B 2 -A 2

# Check for tests that show expected behavior with doc_id-only documents
fd 'test.*query' src/ --type py | head -5

Include doc_id-only documents to prevent data loss.

Documents with only document_id but no title or url are currently filtered out. Since doc_id is explicitly extracted and used in deduplication, it should be preserved. Update the condition to include doc_id:

Suggested fix
-                if doc_title or doc_url:
+                if doc_title or doc_url or doc_id:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (resulting code):

                doc_title = attributes.get("title")
                doc_id = attributes.get("document_id") or attributes.get("doc_id")

                if doc_title or doc_url or doc_id:
                    # Treat empty string as None for URL to satisfy Optional[AnyUrl]
                    final_url = doc_url if doc_url else None
                    if (final_url, doc_title, doc_id) not in seen_docs:
                        documents.append(
                            ReferencedDocument(
                                doc_url=final_url, doc_title=doc_title, doc_id=doc_id
                            )
                        )
                        seen_docs.add((final_url, doc_title, doc_id))
🤖 Prompt for AI Agents
In `@src/app/endpoints/query_v2.py` around lines 575 - 586, The current filter
only appends ReferencedDocument when doc_title or doc_url exists, which drops
entries that only have doc_id; change the if-condition to check any of
doc_title, doc_url, or doc_id so documents with only a document_id are
preserved. Keep the existing final_url normalization (treat empty string as
None), use the same dedupe tuple (final_url, doc_title, doc_id) against
seen_docs, and append ReferencedDocument(doc_url=final_url, doc_title=doc_title,
doc_id=doc_id) when that tuple is not in seen_docs.

@are-ces (Contributor, Author) replied:

referenced_docs are used because we need to show in the frontend which documents have been referenced; IMO it does not make sense to return empty documents without a title / URL.

coderabbitai bot replied:

@are-ces, that makes perfect sense! You're right that a document reference with only an ID wouldn't provide meaningful information for the frontend to display. The current condition ensures that only documents with at least a title or URL—something actionable and human-readable—are returned to the UI. Thank you for the clarification!


✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1073
File: src/app/endpoints/query_v2.py:575-586
Timestamp: 2026-01-30T13:33:34.749Z
Learning: In `src/app/endpoints/query_v2.py`, the `parse_referenced_documents_from_responses_api` function filters referenced documents to include only those with `doc_title` or `doc_url` because these documents are displayed in the frontend. Documents with only `doc_id` are intentionally excluded as they wouldn't provide useful information to end users.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@are-ces are-ces marked this pull request as draft January 30, 2026 13:55
@are-ces are-ces marked this pull request as ready for review January 30, 2026 16:10
coderabbitai bot left a comment (Contributor)

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/openapi.json (1)

4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure unique operationId values for GET vs POST.
Line 4293 and Line 4311 both use handle_a2a_jsonrpc_a2a_post, which can break client codegen (operationId must be unique). Consider distinct IDs per method.

🔧 Suggested fix
-                "operationId": "handle_a2a_jsonrpc_a2a_post",
+                "operationId": "handle_a2a_jsonrpc_a2a_get",
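Duplicate operationIds like this can also be caught mechanically. A minimal sketch (not part of the PR; the spec shape is the standard OpenAPI paths/methods layout):

```python
from typing import Any


def find_duplicate_operation_ids(spec: dict[str, Any]) -> list[str]:
    """Return operationIds that appear more than once across all paths/methods."""
    counts: dict[str, int] = {}
    for methods in spec.get("paths", {}).values():
        for operation in methods.values():
            # Skip non-operation entries such as path-level "parameters"
            if isinstance(operation, dict) and "operationId" in operation:
                op_id = operation["operationId"]
                counts[op_id] = counts.get(op_id, 0) + 1
    return [op_id for op_id, count in counts.items() if count > 1]
```

Running this over docs/openapi.json (e.g. via json.load) in a unit test would flag the shared handle_a2a_jsonrpc_a2a_post before it reaches client codegen.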
🧹 Nitpick comments (2)
src/llama_stack_configuration.py (1)

310-347: Avoid in-place mutation of ls_config in enrich_byok_rag.
Returning a new mapping reduces side‑effects and aligns with the project guideline.

♻️ Proposed refactor to avoid in-place mutation
-def enrich_byok_rag(ls_config: dict[str, Any], byok_rag: list[dict[str, Any]]) -> None:
+def enrich_byok_rag(ls_config: dict[str, Any], byok_rag: list[dict[str, Any]]) -> dict[str, Any]:
     """Enrich Llama Stack config with BYOK RAG settings.

     Args:
-        ls_config: Llama Stack configuration dict (modified in place)
+        ls_config: Llama Stack configuration dict
         byok_rag: List of BYOK RAG configurations
+    Returns:
+        dict[str, Any]: Enriched configuration mapping.
     """
-    if len(byok_rag) == 0:
-        logger.info("BYOK RAG is not configured: skipping")
-        return
+    if len(byok_rag) == 0:
+        logger.info("BYOK RAG is not configured: skipping")
+        return ls_config
+
+    output = {**ls_config}

     logger.info("Enriching Llama Stack config with BYOK RAG")

     # Add storage backends
-    if "storage" not in ls_config:
-        ls_config["storage"] = {}
-    ls_config["storage"]["backends"] = construct_storage_backends_section(
-        ls_config, byok_rag
-    )
+    storage = {**output.get("storage", {})}
+    storage["backends"] = construct_storage_backends_section(output, byok_rag)
+    output["storage"] = storage

     # Add vector_io providers
-    if "providers" not in ls_config:
-        ls_config["providers"] = {}
-    ls_config["providers"]["vector_io"] = construct_vector_io_providers_section(
-        ls_config, byok_rag
-    )
+    providers = {**output.get("providers", {})}
+    providers["vector_io"] = construct_vector_io_providers_section(output, byok_rag)
+    output["providers"] = providers

     # Add registered vector stores
-    if "registered_resources" not in ls_config:
-        ls_config["registered_resources"] = {}
-    ls_config["registered_resources"]["vector_stores"] = (
-        construct_vector_stores_section(ls_config, byok_rag)
-    )
-
-    # Add embedding models
-    ls_config["registered_resources"]["models"] = construct_models_section(
-        ls_config, byok_rag
-    )
+    registered_resources = {**output.get("registered_resources", {})}
+    registered_resources["vector_stores"] = construct_vector_stores_section(
+        output, byok_rag
+    )
+    registered_resources["models"] = construct_models_section(output, byok_rag)
+    output["registered_resources"] = registered_resources
+
+    return output
🔁 Caller update
-    enrich_byok_rag(ls_config, config.get("byok_rag", []))
+    ls_config = enrich_byok_rag(ls_config, config.get("byok_rag", []))

As per coding guidelines: Avoid in-place parameter modification anti-patterns; instead return new data structures.
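A minimal sketch of the non-mutating pattern the guideline calls for (names are illustrative, not the actual helpers in src/llama_stack_configuration.py):

```python
from typing import Any


def enrich(config: dict[str, Any], extra_backends: dict[str, Any]) -> dict[str, Any]:
    """Return a new config with backends merged in; the input is left untouched."""
    storage = {**config.get("storage", {})}
    storage["backends"] = {**storage.get("backends", {}), **extra_backends}
    return {**config, "storage": storage}
```

The caller then rebinds the result (config = enrich(config, ...)), which keeps the original mapping available for comparison or logging and makes the data flow explicit.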

docs/openapi.json (1)

8434-8451: Clarify when doc_id will actually appear.
The schema allows doc_id alone, but the API filters out entries lacking doc_title/doc_url. A short note will prevent consumer confusion.

📝 Suggested doc clarification
-                "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n    doc_id: ID of the referenced doc."
+                "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n    doc_id: ID of the referenced doc.\n\nNote: referenced documents are only emitted when doc_title or doc_url is present; doc_id-only entries are filtered."

Based on learnings: In src/app/endpoints/query_v2.py, the parse_referenced_documents_from_responses_api function filters referenced documents to include only those with doc_title or doc_url because these documents are displayed in the frontend. Documents with only doc_id are intentionally excluded as they wouldn't provide useful information to end users.

@asimurka (Contributor) left a comment:
The update LGTM

@are-ces are-ces requested review from onmete and removed request for onmete February 2, 2026 08:08
Context (src/app/endpoints/query_v2.py):

                or attributes.get("url")
                or attributes.get("doc_url")
                or attributes.get("link")
            )
Contributor:
This also implements attribute key priority, worth documenting it?

@are-ces (Contributor, Author) replied:

The rag-content tool creates a docs_url field. Users should not need to care about this; they should just create the vector store via the tool and register the BYOK store in LCORE.

Contributor:

So you can't connect an arbitrary index to LCORE? If you have one, you need to recreate it via the rag-content tool?

@are-ces (Contributor, Author) replied:

You can, but we don't support that. AFAIK we only support creating RAG DBs with rag-content.


@are-ces are-ces force-pushed the fix-metadata branch 2 times, most recently from f715457 to c259964 Compare February 2, 2026 09:39
coderabbitai bot left a comment (Contributor)

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
examples/run.yaml (1)

111-112: ⚠️ Potential issue | 🟡 Minor

Duplicate telemetry: key in YAML configuration.

The file contains two telemetry: sections at lines 111-112 and 167-168. In YAML, duplicate keys cause the second occurrence to silently override the first. This results in telemetry.enabled: true (line 168) overwriting the earlier definition (line 112).

If both are intentional, consolidate them into a single telemetry: block.

🔧 Proposed fix: remove duplicate telemetry block
 safety:
   default_shield_id: llama-guard
-telemetry:
-  enabled: true

Also applies to: 167-168
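The silent-override behavior can be illustrated with plain Python: building a dict from key/value pairs keeps only the last value for a duplicated key, which mirrors what most YAML loaders do with duplicate mapping keys (the enabled: false value for the first block is illustrative; only the second block's enabled: true is stated above):

```python
# Two "telemetry" entries, as in the duplicated YAML sections
pairs = [
    ("safety", {"default_shield_id": "llama-guard"}),
    ("telemetry", {"enabled": False}),  # first occurrence (illustrative value)
    ("telemetry", {"enabled": True}),   # second occurrence silently wins
]
config = dict(pairs)  # later duplicates override earlier ones
```

This is why consolidating into a single telemetry: block is the safe fix: the loader gives no warning that the first definition was discarded.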

docs/openapi.json (1)

4287-4312: ⚠️ Potential issue | 🟠 Major

OperationId must be unique across GET/POST.
Line 4293 and Line 4311 now share the same operationId, which breaks OpenAPI uniqueness guarantees and can confuse client generators.

🛠️ Suggested fix (keep GET unique)
-                "operationId": "handle_a2a_jsonrpc_a2a_post",
+                "operationId": "handle_a2a_jsonrpc_a2a_get",
🤖 Fix all issues with AI agents
In `@docs/openapi.json`:
- Around line 8434-8451: Update the OpenAPI docs to clarify that doc_id alone is
supplemental and that runtime only emits referenced_documents that include
doc_title or doc_url: edit the ReferencedDocument description (and/or add a
note) to state that entries with only doc_id are not emitted by
parse_referenced_documents_from_responses_api and that consumers should expect
either doc_title or doc_url to be present for displayed documents;
alternatively, if you prefer a schema-level guarantee, change the
ReferencedDocument schema to require at least one of doc_title or doc_url (e.g.,
document-level note or a oneOf/anyOf constraint) so the API contract matches
parse_referenced_documents_from_responses_api behavior.

coderabbitai bot left a comment (Contributor)

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/openapi.json (1)

4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure operationId values are unique across methods.

Both GET and POST for /a2a currently use the same operationId, which breaks OpenAPI uniqueness guarantees and can confuse client generators.

🔧 Suggested fix
 /a2a:
   get:
     ...
-    "operationId": "handle_a2a_jsonrpc_a2a_post"
+    "operationId": "handle_a2a_jsonrpc_a2a_get"
   post:
     ...
     "operationId": "handle_a2a_jsonrpc_a2a_post"
tests/configuration/run.yaml (1)

84-90: ⚠️ Potential issue | 🟠 Major

Remove unsupported telemetry provider block from providers section.

The telemetry provider block at lines 84–90 is not supported in Llama Stack 0.3.x. Telemetry should be configured only via the top-level telemetry.enabled: true setting (already present at line 146), without requiring a provider entry. Remove this block to prevent configuration warnings or failures.

🛠️ Proposed change
-  telemetry:
-  - config:
-      service_name: ''
-      sinks: sqlite
-      sqlite_db_path: .llama/distributions/ollama/trace_store.db
-    provider_id: meta-reference
-    provider_type: inline::meta-reference
🤖 Fix all issues with AI agents
In `@src/llama_stack_configuration.py`:
- Around line 138-145: The loop that builds BYOK backends uses vector_store_id
from each byok_rag entry and currently allows empty or missing values (producing
names like byok__storage and silent overwrites); update the code in this loop
(the byok_rag iteration where backend_name = f"byok_{vector_store_id}_storage"
and output[...] is set) to validate that vector_store_id is present and
non-empty and raise/throw a clear exception (fail fast) if not, and apply the
same validation/early-fail pattern in the other helpers mentioned
(construct_vector_stores_section, construct_vector_io_providers_section,
construct_models_section) so each function checks its vector_store_id (or
equivalent ID fields) before using them to build keys or provider/model entries.
🧹 Nitpick comments (3)
src/llama_stack_configuration.py (1)

323-347: Consider returning a new config instead of mutating ls_config in place.

The function currently rewrites nested sections on the passed-in dict, which makes side effects harder to reason about. Returning a new dict would better isolate changes.

As per coding guidelines, "Avoid in-place parameter modification anti-patterns; instead return new data structures".
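A pure-function version of that pattern, sketched below; the real `enrich_byok_rag` signature and section layout in the repo may differ, so treat the parameter shapes as assumptions:

```python
import copy


def enrich_byok_rag(ls_config: dict, byok_sections: dict) -> dict:
    """Return an enriched copy of the config instead of mutating the input.

    byok_sections maps a top-level section name (e.g. "providers") to the
    entries that should be merged into it. (Sketch: shapes are assumptions.)
    """
    enriched = copy.deepcopy(ls_config)
    for section, entries in byok_sections.items():
        # setdefault keeps any existing entries; update merges the new ones
        enriched.setdefault(section, {}).update(entries)
    return enriched
```

Callers then reassign (`ls_config = enrich_byok_rag(ls_config, sections)`), which keeps the original dict untouched and makes the data flow explicit.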

docs/openapi.json (2)

5338-5356: Consider noting the rename from vector_db_id → vector_store_id.

This is a public schema change; adding a brief migration note in the description can help clients discover the rename without hunting release notes.


8434-8451: Clarify when doc_id is emitted.

The API filters out referenced documents that lack doc_title or doc_url, so doc_id effectively appears only alongside a title or URL. Consider adding a short note to avoid clients expecting doc_id-only entries.

Based on learnings: “In src/app/endpoints/query_v2.py, the parse_referenced_documents_from_responses_api function filters referenced documents to include only those with doc_title or doc_url because these documents are displayed in the frontend. Documents with only doc_id are intentionally excluded as they wouldn't provide useful information to end users.”
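Mirroring that filter at the model level would look roughly like this; a dataclass sketch only — the project's actual model is a pydantic class, so the field names follow the review but the validation hook is an assumption:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferencedDocument:
    """A referenced document; requires a title or a URL, matching the runtime filter."""

    doc_url: Optional[str] = None
    doc_title: Optional[str] = None
    doc_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Mirror parse_referenced_documents_from_responses_api:
        # doc_id-only entries are never emitted, so reject them here too.
        if not (self.doc_title or self.doc_url):
            raise ValueError("doc_title or doc_url must be present")
```

With a guarantee like this in the schema, the OpenAPI description and the runtime behavior can't drift apart.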

Comment on lines 138 to 145
# add new backends for each BYOK RAG
for brag in byok_rag:
    vector_store_id = brag.get("vector_store_id", "")
    backend_name = f"byok_{vector_store_id}_storage"
    output[backend_name] = {
        "type": "kv_sqlite",
        "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
    }
Contributor

⚠️ Potential issue | 🟠 Major

Guard against missing vector_store_id to avoid invalid backend/provider IDs.

If a BYOK entry is missing vector_store_id, the code generates names like byok__storage and can silently overwrite entries. Please validate and fail fast (and mirror the same guard in construct_vector_stores_section, construct_vector_io_providers_section, and construct_models_section).

🛠️ Proposed fix (apply similarly in the other construct_* helpers)
-    for brag in byok_rag:
-        vector_store_id = brag.get("vector_store_id", "")
+    for brag in byok_rag:
+        vector_store_id = brag.get("vector_store_id")
+        if not vector_store_id:
+            raise ValueError("byok_rag entry missing vector_store_id")
         backend_name = f"byok_{vector_store_id}_storage"
         output[backend_name] = {
             "type": "kv_sqlite",
             "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
         }
🤖 Prompt for AI Agents
In `@src/llama_stack_configuration.py` around lines 138 - 145, The loop that
builds BYOK backends uses vector_store_id from each byok_rag entry and currently
allows empty or missing values (producing names like byok__storage and silent
overwrites); update the code in this loop (the byok_rag iteration where
backend_name = f"byok_{vector_store_id}_storage" and output[...] is set) to
validate that vector_store_id is present and non-empty and raise/throw a clear
exception (fail fast) if not, and apply the same validation/early-fail pattern
in the other helpers mentioned (construct_vector_stores_section,
construct_vector_io_providers_section, construct_models_section) so each
function checks its vector_store_id (or equivalent ID fields) before using them
to build keys or provider/model entries.
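Standalone, the fail-fast guard that prompt describes could be sketched as follows; the helper name and the extra duplicate check are assumptions, not the repo's actual code:

```python
def build_byok_backends(byok_rag: list[dict]) -> dict:
    """Build storage backend entries, failing fast on missing or duplicate IDs."""
    output: dict[str, dict] = {}
    for brag in byok_rag:
        vector_store_id = brag.get("vector_store_id")
        if not vector_store_id:
            # Fail fast instead of generating names like "byok__storage"
            raise ValueError("byok_rag entry missing vector_store_id")
        backend_name = f"byok_{vector_store_id}_storage"
        if backend_name in output:
            # Prevent one entry silently overwriting another
            raise ValueError(f"duplicate BYOK backend: {backend_name}")
        output[backend_name] = {
            "type": "kv_sqlite",
            "db_path": brag.get("db_path", f".llama/{vector_store_id}.db"),
        }
    return output
```

The same check would then be repeated (or factored into a shared validator) in the other `construct_*` helpers.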

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/openapi.json (1)

4287-4312: ⚠️ Potential issue | 🟠 Major

Ensure unique operationId values for /a2a GET and POST.

Both methods now share the same operationId, which can break OpenAPI tooling/codegen due to collisions. Assign a distinct ID to the GET operation (or remove/deprecate the GET route if it is no longer supported).

🔧 Suggested fix (unique GET operationId)
-                "operationId": "handle_a2a_jsonrpc_a2a_post",
+                "operationId": "handle_a2a_jsonrpc_a2a_get",
🧹 Nitpick comments (4)
docs/rag_guide.md (1)

242-242: Minor inconsistency: documentation still references vector_dbs.

Line 242 mentions vector_dbs while the rest of the document has been updated to use vector_stores. Consider updating for consistency:

-> [!NOTE]
-> When experimenting with different `models`, `providers` and `vector_dbs`, you might need to manually unregister the old ones with the Llama Stack client CLI (e.g. `llama-stack-client vector_dbs list`)
+> [!NOTE]
+> When experimenting with different `models`, `providers` and `vector_stores`, you might need to manually unregister the old ones with the Llama Stack client CLI (e.g. `llama-stack-client vector_stores list`)
tests/unit/test_llama_stack_configuration.py (1)

283-283: Consider using tmp_path fixture for the db_path.

Using /tmp/store1.db as a hardcoded path works but could be improved by leveraging pytest's tmp_path fixture for better cross-platform compatibility and automatic cleanup.

tests/unit/cache/test_postgres_cache.py (2)

518-543: Assert topic summary removal after delete.

The test name/docstring says the summary is removed, but there’s no assertion verifying it. Add a post-delete list check to prevent regressions.

🧪 Suggested assertion
     deleted = cache.delete(USER_ID_1, CONVERSATION_ID_1, False)
     assert deleted is True
+    mock_cursor.fetchall.return_value = []
+    conversations = cache.list(USER_ID_1, False)
+    assert conversations == []

573-626: Add doc_id coverage for referenced_documents persistence.

PR objectives highlight doc_id linkage, but this test only verifies doc_title/doc_url. Adding doc_id to the test would prevent regressions in cache serialization/deserialization.

🧪 Add doc_id to the persisted payload check
-    docs = [
-        ReferencedDocument(doc_title="Test Doc", doc_url=AnyUrl("http://example.com/"))
-    ]
+    docs = [
+        ReferencedDocument(
+            doc_title="Test Doc",
+            doc_url=AnyUrl("http://example.com/"),
+            doc_id="doc_1",
+        )
+    ]
@@
-    assert json.loads(inserted_json_str) == [
-        {"doc_url": "http://example.com/", "doc_title": "Test Doc"}
-    ]
+    assert json.loads(inserted_json_str) == [
+        {"doc_url": "http://example.com/", "doc_title": "Test Doc", "doc_id": "doc_1"}
+    ]
@@
-        [{"doc_url": "http://example.com/", "doc_title": "Test Doc"}],
+        [{"doc_url": "http://example.com/", "doc_title": "Test Doc", "doc_id": "doc_1"}],
@@
     assert retrieved_entries[0] == entry_with_docs
     assert retrieved_entries[0].referenced_documents[0].doc_title == "Test Doc"
+    assert retrieved_entries[0].referenced_documents[0].doc_id == "doc_1"

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/app/endpoints/query_v2.py (1)

546-584: ⚠️ Potential issue | 🟠 Major

Include document_id in referenced documents for chunk linkage.
attributes.document_id is currently ignored, so referenced_documents can’t be linked back to the originating RAG chunks. Extract it, include it in the dedupe key, and pass it through (while keeping the title/url filter). Also extend ReferencedDocument to store doc_id.

🔧 Suggested update (plus model change in models/responses.py)
-    seen_docs: set[tuple[Optional[str], Optional[str]]] = set()
+    seen_docs: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
...
                 doc_title = attributes.get("title")
+                doc_id = attributes.get("doc_id") or attributes.get("document_id")

                 if doc_title or doc_url:
                     # Treat empty string as None for URL to satisfy Optional[AnyUrl]
                     final_url = doc_url if doc_url else None
-                    if (final_url, doc_title) not in seen_docs:
+                    dedupe_key = (final_url, doc_title, doc_id)
+                    if dedupe_key not in seen_docs:
                         documents.append(
-                            ReferencedDocument(doc_url=final_url, doc_title=doc_title)
+                            ReferencedDocument(
+                                doc_url=final_url, doc_title=doc_title, doc_id=doc_id
+                            )
                         )
-                        seen_docs.add((final_url, doc_title))
+                        seen_docs.add(dedupe_key)
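Taken on its own, the extraction-and-dedupe logic in that diff behaves like this sketch; the attribute keys and the plain-dict return are assumptions — the real code builds ReferencedDocument models from file_search_call results:

```python
from typing import Optional


def dedupe_referenced_docs(attribute_sets: list[dict]) -> list[dict]:
    """Collect unique (url, title, id) referenced documents, keeping order."""
    seen: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()
    documents: list[dict] = []
    for attributes in attribute_sets:
        doc_url = attributes.get("doc_url") or attributes.get("url")
        doc_title = attributes.get("title")
        doc_id = attributes.get("doc_id") or attributes.get("document_id")
        # Keep the existing filter: doc_id-only entries are not emitted
        if doc_title or doc_url:
            final_url = doc_url if doc_url else None
            key = (final_url, doc_title, doc_id)
            if key not in seen:
                documents.append(
                    {"doc_url": final_url, "doc_title": doc_title, "doc_id": doc_id}
                )
                seen.add(key)
    return documents
```

Including `doc_id` in the dedupe key means two chunks from distinct source documents that happen to share a title and URL are both preserved.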
docs/openapi.json (2)

4287-4312: ⚠️ Potential issue | 🟠 Major

Avoid duplicate operationId values for /a2a GET/POST.

OpenAPI operationIds must be unique; duplicating them can break client generation and tooling.

🔧 Proposed fix
             "get": {
                 "tags": [
                     "a2a"
                 ],
                 "summary": "Handle A2A Jsonrpc",
@@
-                "operationId": "handle_a2a_jsonrpc_a2a_post",
+                "operationId": "handle_a2a_jsonrpc_a2a_get",

8407-8439: ⚠️ Potential issue | 🟠 Major

ReferencedDocument schema is missing doc_id.

PR objectives indicate doc_id is now part of referenced_documents; the OpenAPI contract should expose it.

🔧 Proposed fix
             "ReferencedDocument": {
                 "properties": {
+                    "doc_id": {
+                        "anyOf": [
+                            {
+                                "type": "string",
+                                "minLength": 1
+                            },
+                            {
+                                "type": "null"
+                            }
+                        ],
+                        "title": "Doc Id",
+                        "description": "ID of the referenced document"
+                    },
                     "doc_url": {
                         "anyOf": [
                             {
                                 "type": "string",
@@
                 "type": "object",
                 "title": "ReferencedDocument",
                 "description": "Model representing a document referenced in generating a response.\n\nAttributes:\n    doc_url: Url to the referenced doc.\n    doc_title: Title of the referenced doc.\n+    doc_id: ID of the referenced doc."
             },

Also update the referenced_documents examples (e.g., /v1/query and QueryResponse) to include doc_id when available.

🤖 Fix all issues with AI agents
In `@docs/byok_guide.md`:
- Around line 480-485: Example 2 still defines top-level keys (vector_stores,
models, tool_groups) which conflict with the new registered_resources schema;
move those blocks under the registered_resources map in Example 2 so the entries
become registered_resources:<resource_name>: { vector_stores / models /
tool_groups ... } and ensure the vector_store_id (enterprise-docs) and
provider_id (enterprise-knowledge) remain referenced inside the corresponding
registered_resources entry rather than at top level.

In `@docs/rag_guide.md`:
- Around line 83-101: The snippets place the vector_stores mapping at the top
level but the schema expects registered_resources.vector_stores; move the entire
vector_stores block under registered_resources.vector_stores to match the rest
of the config (keep keys like embedding_dimension, embedding_model, provider_id:
custom-index, and vector_store_id unchanged) and update any comments or
references that point to the top-level vector_stores so they now refer to
registered_resources.vector_stores (apply the same change for both FAISS and
pgvector examples and ensure storage.backends.rag_backend and vector_io
provider_id references remain consistent).

In `@tests/configuration/run.yaml`:
- Around line 145-146: The config currently defines both a providers.telemetry
block and telemetry.enabled: true, which is incompatible with Llama Stack 0.3.x;
remove the providers.telemetry block (the entire providers.telemetry
configuration) and keep only telemetry.enabled: true so the file uses the
supported telemetry flag; locate the providers object in the YAML (the
providers.telemetry section) and delete that subsection, leaving the top-level
telemetry.enabled setting intact.
🧹 Nitpick comments (3)
tests/unit/app/endpoints/test_query_v2.py (1)

912-951: Add doc_id assertions to lock in the new linkage.
The fixtures now carry document_id, but the test doesn’t verify it on ReferencedDocument. If doc_id is part of the contract, add assertions so regressions are caught.

✅ Suggested test additions (once ReferencedDocument.doc_id exists)
     assert doc1.doc_title == "Title 1"
     assert str(doc1.doc_url) == "http://example.com/doc2"
+    assert doc1.doc_id == "doc-123"
...
     assert doc2.doc_title == "Title 2"
     assert str(doc2.doc_url) == "http://example.com/doc3"
+    assert doc2.doc_id == "doc-456"

Also applies to: 994-1007

tests/unit/test_llama_stack_configuration.py (1)

9-222: Solid coverage for the new `construct_*` helpers.

src/llama_stack_configuration.py (1)

323-347: Avoid in-place mutation of ls_config in enrich_byok_rag.

Consider returning a new enriched config instead of mutating the input dict. It will reduce side effects and align with the project’s style expectations.

As per coding guidelines: "Avoid in-place parameter modification anti-patterns; instead return new data structures".

Contributor

onmete commented Feb 2, 2026

I hope it works ;)

/lgtm

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/openapi.json (1)

4287-4312: ⚠️ Potential issue | 🟠 Major

operationId must be unique across GET/POST on /a2a.

Line 4293 and Line 4311 use the same operationId, which breaks OpenAPI uniqueness and can collide in code generators.

🔧 Suggested fix
-                "operationId": "handle_a2a_jsonrpc_a2a_post",
+                "operationId": "handle_a2a_jsonrpc_a2a_get",
🧹 Nitpick comments (1)
docs/openapi.json (1)

3667-3677: Clarify server-only parameters in the /v1/infer description.

Line 3667 lists request, background_tasks, and auth under “Args”, which reads like client-supplied inputs in OpenAPI. Consider labeling them as server-side context or removing to avoid client confusion.

💡 Suggested doc tweak
-                "description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n    infer_request: The inference request containing question and context.\n    request: The FastAPI request object for accessing headers and state.\n    background_tasks: FastAPI background tasks for async Splunk event sending.\n    auth: Authentication tuple from the configured auth provider.\n\nReturns:\n    RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n    HTTPException: 503 if the LLM service is unavailable.",
+                "description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n    infer_request: The inference request containing question and context.\n\nServer-side context (not provided by clients):\n    request: The FastAPI request object for accessing headers and state.\n    background_tasks: FastAPI background tasks for async Splunk event sending.\n    auth: Authentication tuple from the configured auth provider.\n\nReturns:\n    RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n    HTTPException: 503 if the LLM service is unavailable.",

@are-ces are-ces marked this pull request as draft February 2, 2026 15:16
- now extracting doc_title, doc_url from the "attributes" in response
- removed chunk metadata extraction from results which are not "type": "file_search_call"
- citations could be added in future, needs discussion with other teams
- Updated BYOK config generation
@are-ces are-ces marked this pull request as ready for review February 2, 2026 15:57
Contributor

@tisnik tisnik left a comment

LGTM

@tisnik tisnik merged commit 73ad33a into lightspeed-core:main Feb 2, 2026
21 of 22 checks passed
