Fix/agent engine custom basellm validation #4224

ppgranger · 2026-01-21T15:53:07Z

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

Closes: adk deploy agent_engine with custom BaseLlm but query methods not registered - "Default method query not found" #4208

Description of the change:

Problem:
When deploying an agent with a custom BaseLlm implementation using adk deploy agent_engine, the deployment succeeds but querying the deployed agent fails with:
Agent Engine Error: Default method query not found. Available methods are:
['async_delete_session', 'get_session', 'delete_session', 'create_session',
'async_create_session', 'async_search_memory', 'async_get_session', 'list_sessions',
'async_list_sessions', 'async_add_session_to_memory']

The same agent works correctly:

- In ADK Playground (adk web agents)
- When deployed inline with agent_engines.create() where all code is in a single file

Root Cause:
cloudpickle serializes imported classes by reference (import path) instead of by value (full code). When Agent Engine deserializes the agent at runtime, it can't find the custom class because the import path doesn't match the deployed module structure.
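
To make the root cause concrete, here is a minimal sketch of by-reference serialization failing when the defining module is missing at load time. It uses the standard-library `pickle` as a stand-in, since it also stores classes from importable modules by reference. The module name `demo_custom_client` and the class `CustomLlm` are hypothetical and are not taken from the PR.

```python
import pickle
import sys
import types

# Hypothetical module name standing in for something like clients/custom_client.py.
MODULE_NAME = "demo_custom_client"


def build_demo_module():
    """Create an in-memory module that defines a small class."""
    module = types.ModuleType(MODULE_NAME)
    source = (
        "class CustomLlm:\n"
        "    def __init__(self, model):\n"
        "        self.model = model\n"
    )
    exec(source, module.__dict__)
    return module


def pickle_then_load_without_module():
    """Pickle an instance while the module is importable, then load without it."""
    module = build_demo_module()
    sys.modules[MODULE_NAME] = module
    try:
        # The payload records the module and class names, not the class body.
        payload = pickle.dumps(module.CustomLlm("my-model"))
    finally:
        # Simulate a runtime where the import path no longer resolves.
        del sys.modules[MODULE_NAME]
    try:
        pickle.loads(payload)
    except ImportError as exc:
        return payload, exc
    return payload, None
```

Calling `pickle_then_load_without_module()` returns the payload together with the import error raised on load. Serializing by value avoids this because the class definition travels inside the payload.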

Solution:
Use cloudpickle.register_pickle_by_value() in the generated Agent Engine app template to force value-based serialization for the agent module and all its submodules. This ensures custom BaseLlm classes are serialized with their full definition, making them available at Agent Engine runtime.

Additionally, add pre-deployment validation to catch issues early with clear error messages.

Testing Plan

Unit Tests:

I have added or updated unit tests for my change.
All unit tests pass locally.

======================== 39 passed, 1 warning in 1.26s =========================

New tests added:

TestValidateAgentImport (10 tests):

test_skips_config_agents - Config agents skip validation
test_raises_on_missing_agent_module - Error when agent.py is missing
test_raises_on_missing_export - Error when root_agent/app export is missing
test_success_with_root_agent_export - Success with root_agent
test_success_with_app_export - Success with app
test_raises_on_import_error - Helpful message on ImportError
test_raises_on_basellm_import_error - Specific guidance for BaseLlm errors
test_raises_on_syntax_error - Error on syntax errors
test_cleans_up_sys_modules - Cleanup verification
test_restores_sys_path - sys.path restoration verification

TestValidateAgentObject (6 tests):

test_skips_app_export - Skips validation for 'app' exports
test_warns_on_non_baseagent - Warns for non-BaseAgent objects
test_skips_string_models - Skips validation when model is a string
test_validates_custom_basellm_serialization - Validates serializable custom BaseLlm
test_raises_on_non_serializable_custom_basellm - Raises on non-serializable custom BaseLlm
test_skips_builtin_models - Skips check for built-in ADK models

TestAgentEngineAppTemplate (5 tests):

test_template_includes_cloudpickle_imports - Template imports cloudpickle and sys
test_template_registers_agent_module_for_pickle_by_value - Registers agent module
test_template_registers_submodules_for_pickle_by_value - Registers submodules (clients/, tools/)
test_template_handles_non_registerable_modules - Handles non-registerable modules gracefully
test_template_skips_cloudpickle_for_config_agents - Config agents skip cloudpickle registration

TestCloudpickleSerializationFix (2 tests):

test_custom_basellm_in_submodule_can_be_serialized - Custom BaseLlm in submodule serializes correctly
test_agent_with_custom_basellm_can_be_serialized - Agent with custom BaseLlm serializes correctly

Manual End-to-End (E2E) Tests:

Test setup with custom BaseLlm in submodule:
agents/
├── __init__.py
├── agent.py
├── clients/
│   ├── __init__.py
│   └── custom_client.py  # Custom BaseLlm implementation
└── requirements.txt

| Scenario | Before Fix | After Fix |
| --- | --- | --- |
| `adk deploy agent_engine` | ✅ Deploys | ✅ Deploys |
| `agent.stream_query()` | ❌ "query method not found" | ✅ Works correctly |
| `agent.async_stream_query()` | ❌ "query method not found" | ✅ Works correctly |

Checklist

I have read the CONTRIBUTING.md document.
I have performed a self-review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have added tests that prove my fix is effective or that my feature works.
New and existing unit tests pass locally with my changes.
I have manually tested my changes end-to-end.
Any dependent changes have been merged and published in downstream modules.

Additional context

Changes made:

_AGENT_ENGINE_APP_TEMPLATE - Added cloudpickle registration:

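import sys

import cloudpickle

from . import agent as _agent_module

# Serialize the agent module by value so its classes travel with the pickle.
cloudpickle.register_pickle_by_value(_agent_module)

# Also register the other modules of the agent package (clients/, tools/, etc.).
_package_prefix = _agent_module.__name__.rsplit('.', 1)[0] + '.'
for name, module in list(sys.modules.items()):
  if module is not None and name.startswith(_package_prefix):
    try:
      cloudpickle.register_pickle_by_value(module)
    except Exception:
      pass  # Skip modules that cannot be registered.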
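
The prefix match in the loop above can be checked in isolation. The following is a small sketch of just the name-selection logic. The helper names `package_prefix` and `modules_to_register` are hypothetical and do not appear in the PR.

```python
def package_prefix(agent_module_name):
    """Return the dotted prefix shared by the agent module and its siblings."""
    # "agents.agent" -> "agents."
    return agent_module_name.rsplit(".", 1)[0] + "."


def modules_to_register(agent_module_name, loaded_module_names):
    """Select the loaded module names that live under the agent's package."""
    prefix = package_prefix(agent_module_name)
    return sorted(name for name in loaded_module_names if name.startswith(prefix))


# Example input mirroring the kind of names found in sys.modules.
EXAMPLE_LOADED = [
    "agents",
    "agents.agent",
    "agents.clients",
    "agents.clients.custom_client",
    "agentsx.other",
    "json",
]
```

With this input, `modules_to_register("agents.agent", EXAMPLE_LOADED)` selects `agents.agent`, `agents.clients`, and `agents.clients.custom_client`. The trailing dot in the prefix keeps unrelated packages such as `agentsx` out, and it also means the bare package name `agents` is not matched.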

_validate_agent_import() - Pre-deployment validation that checks:
- Agent module exists and can be imported
- Expected export (root_agent or app) is present
- Provides specific guidance for BaseLlm-related import errors
_validate_agent_object() - Deep validation that checks:
- Custom BaseLlm implementations can be serialized by cloudpickle
- Warns about module path structure for proper relative imports
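
As a rough illustration of the serialization check described for `_validate_agent_object()`, a validator might attempt a trial dump and turn any failure into a readable message. This is a sketch under assumptions, not the PR's code: `check_serializable` is a hypothetical helper, and it defaults to the standard-library `pickle.dumps` so it runs without extra dependencies.

```python
import pickle
import threading


def check_serializable(obj, dumps=pickle.dumps):
    """Return None if obj serializes, otherwise a human-readable error message."""
    try:
        dumps(obj)
    except Exception as exc:  # Pickling can raise TypeError, PicklingError, and others.
        return f"{type(obj).__name__} cannot be serialized: {exc}"
    return None


# Plain data serializes; an object such as a lock does not.
ok_message = check_serializable({"model": "my-model"})
lock_message = check_serializable(threading.Lock())
```

Here `ok_message` is `None` and `lock_message` is an error string. With cloudpickle installed, passing `dumps=cloudpickle.dumps` would bring the check closer to what the deployment actually uses.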

Adds pre-deployment validation to `adk deploy agent_engine` to catch import errors (especially with custom BaseLlm implementations) before deployment. This provides clearer error messages and prevents deployments that would fail at runtime with cryptic "query method not found" errors.

The validation:
- Checks that agent.py exists and can be imported
- Verifies the expected export (root_agent or app) is present
- Provides specific guidance for BaseLlm-related import errors
- Properly cleans up sys.path and sys.modules after validation

Fixes google#4208
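
The import validation described in this commit could be sketched roughly as follows. This is an illustration under assumptions rather than the PR's implementation: the function name `check_agent_import` and its parameters are hypothetical, and it covers only the existence check, the export check, and the cleanup of `sys.path` and `sys.modules`.

```python
import importlib
import os
import sys


def check_agent_import(parent_dir, package_name):
    """Import <package_name>.agent and return the name of the export it provides."""
    agent_file = os.path.join(parent_dir, package_name, "agent.py")
    if not os.path.isfile(agent_file):
        raise FileNotFoundError(f"agent module not found: {agent_file}")

    saved_path = list(sys.path)
    sys.path.insert(0, parent_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(f"{package_name}.agent")
        for export in ("root_agent", "app"):
            if hasattr(module, export):
                return export
        raise AttributeError(
            f"{package_name}.agent exports neither root_agent nor app"
        )
    finally:
        # Restore sys.path and drop the modules imported from the package.
        sys.path[:] = saved_path
        for name in list(sys.modules):
            if name == package_name or name.startswith(package_name + "."):
                del sys.modules[name]
```

Doing the cleanup in `finally` means a failed import leaves the interpreter state as it was, so validation cannot affect the deployment step that follows.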

Fixes the "query method not found" error when deploying agents with custom BaseLlm implementations via `adk deploy agent_engine`.

Root cause: cloudpickle serializes imported classes by reference (import path) instead of by value (full code). When Agent Engine deserializes the agent at runtime, it can't find the custom class because the import path doesn't match.

The fix:
- Uses cloudpickle.register_pickle_by_value() to force value-based serialization for the agent module and all its submodules
- This ensures custom BaseLlm classes are serialized with their full definition, making them available at Agent Engine runtime

Also includes pre-deployment validation:
- Checks that agent.py exists and can be imported
- Verifies the expected export (root_agent or app) is present
- Validates custom BaseLlm implementations can be serialized
- Provides specific guidance for BaseLlm-related errors
- Properly cleans up sys.path and sys.modules after validation

Fixes google#4208

gemini-code-assist · 2026-01-21T15:53:28Z

Summary of Changes

Hello @ppgranger, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical deployment issue for agents utilizing custom BaseLlm implementations within the Agent Engine. Previously, such deployments would fail at runtime due to cloudpickle's default serialization behavior. The core change ensures that custom LLM classes are serialized by value, making them properly accessible in the deployed environment. Furthermore, the PR significantly enhances the developer experience by integrating robust pre-deployment validation, which proactively identifies and guides users through common configuration and serialization problems, preventing failed deployments and providing actionable feedback.

Highlights

Custom BaseLlm Serialization Fix: Implemented a fix to ensure custom BaseLlm implementations are correctly serialized by value using cloudpickle.register_pickle_by_value() within the Agent Engine app template. This resolves the 'query method not found' error encountered when deploying agents with custom LLM clients.
Pre-deployment Validation: Introduced new pre-deployment validation steps (_validate_agent_import and _validate_agent_object) to catch common issues early. This includes checking for agent module importability, presence of expected exports, and serializability of custom BaseLlm instances, providing clear error messages and guidance.
Enhanced Error Messaging: Improved error messages for import failures, especially those related to BaseLlm, offering specific advice on module structure, dependencies, and serialization requirements.
Comprehensive Unit Testing: Added extensive unit tests (23 new tests across four new test classes) to cover the new validation logic and confirm the cloudpickle serialization fix, ensuring robustness and preventing regressions.


gemini-code-assist

Code Review

This pull request introduces crucial pre-deployment validation for Agent Engine deployments, particularly addressing issues with custom BaseLlm implementations and cloudpickle serialization. The changes include modifying the Agent Engine app template to register modules for pickle-by-value serialization and adding robust validation functions (_validate_agent_import and _validate_agent_object) to catch common errors early. The accompanying unit tests are comprehensive and cover various edge cases, ensuring the reliability of the new validation logic and the serialization fix. Overall, these changes significantly improve the developer experience by providing clearer error messages and preventing runtime failures.

src/google/adk/cli/cli_deploy.py

Address code review feedback:
- Add logger.debug for module registration failures in the generated Agent Engine app template to aid debugging
- Add logger.debug when ADK imports fail in _validate_agent_object to clarify why validation is skipped

ppgranger added 2 commits January 21, 2026 15:25

adk-bot added the "tools" label ([Component] This issue is related to tools) Jan 21, 2026

ppgranger mentioned this pull request Jan 21, 2026

adk deploy agent_engine with custom BaseLlm but query methods not registered - "Default method query not found" #4208

Open

gemini-code-assist bot reviewed Jan 21, 2026

src/google/adk/cli/cli_deploy.py (outdated, resolved)

src/google/adk/cli/cli_deploy.py (resolved)
