Skip to content

Conversation

@JslYoon
Copy link
Contributor

@JslYoon JslYoon commented Feb 3, 2026

Description

As a developer using Lightspeed Core Stack, I want to specify which safety shields to apply per query, so that I can use different shield configurations for different use cases without restarting the service.

Updated query v1, query v2, and streaming query to take in an extra parameter to identify which shields to apply. If the request parameter is not provided, it will apply all safety shields (same as before).

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude code

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Users can specify which safety shields to apply per query: select specific shields, use all defaults, or disable shields by sending an empty list.
  • Behavior Change
    • Moderation now respects per-query shield selection when evaluating input.
  • Tests
    • Added unit tests covering shield selection, bypass, and invalid-shield handling.

Signed-off-by: Lucas <lyoon@redhat.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 3, 2026

Walkthrough

An optional shield_ids field was added to the QueryRequest model and threaded through A2A, query, and streaming endpoints into run_shield_moderation, which now accepts and applies shield ID filtering (including empty-list and invalid-ID handling). Tests for the new behavior were added.

Changes

Cohort / File(s) Summary
Request Model
src/models/requests.py
Add optional shield_ids: Optional[list[str]] to QueryRequest with docs/examples (None = all shields, [] = skip all, list = run specified shields).
Shield Moderation Utility
src/utils/shields.py
Update run_shield_moderation signature to accept shield_ids. Implement filtering: empty list → skip all (return unblocked), None → run all, list → filter available shields, warn about unknown IDs, raise 422 if none matched. Preserve existing error/violation handling.
Endpoint Integration
src/app/endpoints/a2a.py, src/app/endpoints/query_v2.py, src/app/endpoints/streaming_query_v2.py
Thread shield_ids into QueryRequest construction and pass query_request.shield_ids into run_shield_moderation calls. Minor routing/metadata plumbing only.
Tests
tests/unit/utils/test_shields.py
Add async tests covering: empty-list skips shields, invalid shield_ids → 422 UnprocessableEntity, and filtering to a specific shield triggers expected moderation call.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Endpoint
  participant ShieldsUtil as run_shield_moderation
  participant ShieldsAPI as Shields Service / List

  Client->>Endpoint: Send query (may include shield_ids)
  Endpoint->>Endpoint: Build QueryRequest (includes shield_ids)
  Endpoint->>ShieldsUtil: run_shield_moderation(input_text, shield_ids)
  ShieldsUtil->>ShieldsAPI: list available shields
  ShieldsAPI-->>ShieldsUtil: return shields
  ShieldsUtil->>ShieldsUtil: filter shields by shield_ids (None/all, []/skip, list/filter)
  ShieldsUtil->>ShieldsAPI: run moderation on selected shield(s)
  ShieldsAPI-->>ShieldsUtil: moderation result
  ShieldsUtil-->>Endpoint: moderation result (blocked / unblocked / error)
  Endpoint-->>Client: return response (respecting moderation outcome)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Safety shield config' is vague and generic. While it broadly relates to the shield configuration feature, it lacks specificity about the key change: enabling per-query shield specification. The title uses non-descriptive phrasing that doesn't clearly convey what was changed or why. Consider a more specific title such as 'Add per-query safety shield configuration' or 'Enable runtime shield selection via query parameters' to clearly communicate the main change.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/utils/shields.py`:
- Around line 63-104: In run_shield_moderation, when a non-None shield_ids list
is provided but no shields match (shields_to_run is empty), fail fast instead of
silently skipping: after computing shields_to_run and the missing set, if
shields_to_run is empty raise an HTTPException (or appropriate client error)
indicating the requested shield IDs were not found so callers cannot bypass
moderation due to typos/misconfiguration; keep the existing logger.warning but
follow it with the exception throw using the function name run_shield_moderation
and the shield_ids/shields_to_run identifiers to locate the change.

Signed-off-by: Lucas <lyoon@redhat.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/utils/shields.py (1)

115-158: ⚠️ Potential issue | 🟠 Major

Add APIConnectionError handling to all Llama Stack API calls.

Network failures from shields.list(), models.list(), or moderations.create() currently bubble as 500s. Import APIConnectionError and handle connection errors consistently (map to HTTP 503) following the established pattern:

Recommended fix
-from llama_stack_client import AsyncLlamaStackClient, BadRequestError
+from llama_stack_client import APIConnectionError, AsyncLlamaStackClient, BadRequestError
+from models.responses import ServiceUnavailableResponse

Then add this exception handler after the existing BadRequestError/ValueError handler:

         except (BadRequestError, ValueError):
             logger.warning(
                 "Shield '%s' violation detected, treating as blocked",
                 shield.identifier,
             )
             metrics.llm_calls_validation_errors_total.inc()
             return ShieldModerationResult(
                 blocked=True,
                 message=DEFAULT_VIOLATION_MESSAGE,
                 shield_model=shield.provider_resource_id,
             )
+        except APIConnectionError as e:
+            logger.error("Unable to connect to Llama Stack: %s", e)
+            response = ServiceUnavailableResponse(backend_name="Llama Stack", cause=str(e))
+            raise HTTPException(**response.model_dump()) from e

This matches the error handling pattern established in src/app/endpoints/shields.py and other endpoint modules.

🤖 Fix all issues with AI agents
In `@src/utils/shields.py`:
- Around line 68-85: The docstring for run_shield_moderation is missing
documentation of the new HTTP 422 error path for invalid shield_ids; update the
Google-style docstring for run_shield_moderation to include in the Raises
section that an HTTPException with status 422 is raised when shield_ids is
invalid (e.g., contains unknown IDs or is malformed), and ensure the Parameters,
Returns, and Raises sections are present and concise per Google Python docstring
conventions, referencing the ShieldModerationResult return type and the existing
500/404 or other HTTPException cases for missing provider_resource_id or
model-not-found.

Comment on lines 68 to 85
"""
Run shield moderation on input text.

Iterates through all configured shields and runs moderation checks.
Iterates through configured shields and runs moderation checks.
Raises HTTPException if shield model is not found.

Parameters:
client: The Llama Stack client.
input_text: The text to moderate.
shield_ids: Optional list of shield IDs to use. If None, uses all shields.
If empty list, skips all shields.

Returns:
ShieldModerationResult: Result indicating if content was blocked and the message.

Raises:
HTTPException: If shield's provider_resource_id is not configured or model not found.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Document the new 422 raise path in the docstring.
The function now raises HTTP 422 for invalid shield_ids, but the Raises section only mentions missing models.

✏️ Proposed docstring update
     Raises:
-        HTTPException: If shield's provider_resource_id is not configured or model not found.
+        HTTPException: If shield's provider_resource_id is not configured, model not found,
+            or the requested shield_ids do not match any available shields.
As per coding guidelines, Follow Google Python docstring conventions including Parameters, Returns, Raises, and Attributes sections as needed.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"""
Run shield moderation on input text.
Iterates through all configured shields and runs moderation checks.
Iterates through configured shields and runs moderation checks.
Raises HTTPException if shield model is not found.
Parameters:
client: The Llama Stack client.
input_text: The text to moderate.
shield_ids: Optional list of shield IDs to use. If None, uses all shields.
If empty list, skips all shields.
Returns:
ShieldModerationResult: Result indicating if content was blocked and the message.
Raises:
HTTPException: If shield's provider_resource_id is not configured or model not found.
"""
"""
Run shield moderation on input text.
Iterates through configured shields and runs moderation checks.
Raises HTTPException if shield model is not found.
Parameters:
client: The Llama Stack client.
input_text: The text to moderate.
shield_ids: Optional list of shield IDs to use. If None, uses all shields.
If empty list, skips all shields.
Returns:
ShieldModerationResult: Result indicating if content was blocked and the message.
Raises:
HTTPException: If shield's provider_resource_id is not configured, model not found,
or the requested shield_ids do not match any available shields.
"""
🤖 Prompt for AI Agents
In `@src/utils/shields.py` around lines 68 - 85, The docstring for
run_shield_moderation is missing documentation of the new HTTP 422 error path
for invalid shield_ids; update the Google-style docstring for
run_shield_moderation to include in the Raises section that an HTTPException
with status 422 is raised when shield_ids is invalid (e.g., contains unknown IDs
or is malformed), and ensure the Parameters, Returns, and Raises sections are
present and concise per Google Python docstring conventions, referencing the
ShieldModerationResult return type and the existing 500/404 or other
HTTPException cases for missing provider_resource_id or model-not-found.

@JslYoon JslYoon closed this Feb 3, 2026
@JslYoon JslYoon reopened this Feb 3, 2026
@JslYoon JslYoon closed this Feb 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant