-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Fix/agent engine custom basellm validation #4224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix/agent engine custom basellm validation #4224
Conversation
Adds pre-deployment validation to `adk deploy agent_engine` to catch import errors (especially with custom BaseLlm implementations) before deployment. This provides clearer error messages and prevents deployments that would fail at runtime with cryptic "query method not found" errors. The validation: - Checks that agent.py exists and can be imported - Verifies the expected export (root_agent or app) is present - Provides specific guidance for BaseLlm-related import errors - Properly cleans up sys.path and sys.modules after validation Fixes google#4208
Fixes the "query method not found" error when deploying agents with custom BaseLlm implementations via `adk deploy agent_engine`. Root cause: cloudpickle serializes imported classes by reference (import path) instead of by value (full code). When Agent Engine deserializes the agent at runtime, it can't find the custom class because the import path doesn't match. The fix: - Uses cloudpickle.register_pickle_by_value() to force value-based serialization for the agent module and all its submodules - This ensures custom BaseLlm classes are serialized with their full definition, making them available at Agent Engine runtime Also includes pre-deployment validation: - Checks that agent.py exists and can be imported - Verifies the expected export (root_agent or app) is present - Validates custom BaseLlm implementations can be serialized - Provides specific guidance for BaseLlm-related errors - Properly cleans up sys.path and sys.modules after validation Fixes google#4208
Summary of ChangesHello @ppgranger, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical deployment issue for agents utilizing custom Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces crucial pre-deployment validation for Agent Engine deployments, particularly addressing issues with custom BaseLlm implementations and cloudpickle serialization. The changes include modifying the Agent Engine app template to register modules for pickle-by-value serialization and adding robust validation functions (_validate_agent_import and _validate_agent_object) to catch common errors early. The accompanying unit tests are comprehensive and cover various edge cases, ensuring the reliability of the new validation logic and the serialization fix. Overall, these changes significantly improve the developer experience by providing clearer error messages and preventing runtime failures.
Address code review feedback: - Add logger.debug for module registration failures in generated Agent Engine app template to aid debugging - Add logger.debug when ADK imports fail in _validate_agent_object to clarify why validation is skipped
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
cc @GWeale
Description of the change:
Problem:
When deploying an agent with a custom
BaseLlmimplementation usingadk deploy agent_engine, the deployment succeeds but querying the deployed agent fails with:Agent Engine Error: Default method query not found. Available methods are:
['async_delete_session', 'get_session', 'delete_session', 'create_session',
'async_create_session', 'async_search_memory', 'async_get_session', 'list_sessions',
'async_list_sessions', 'async_add_session_to_memory']
The same agent works correctly:
adk web agents)agent_engines.create()where all code is in a single fileRoot Cause:
cloudpickleserializes imported classes by reference (import path) instead of by value (full code). When Agent Engine deserializes the agent at runtime, it can't find the custom class because the import path doesn't match the deployed module structure.Solution:
Use
cloudpickle.register_pickle_by_value()in the generated Agent Engine app template to force value-based serialization for the agent module and all its submodules. This ensures customBaseLlmclasses are serialized with their full definition, making them available at Agent Engine runtime.Additionally, add pre-deployment validation to catch issues early with clear error messages.
Testing Plan
Unit Tests:
======================== 39 passed, 1 warning in 1.26s =========================
New tests added:
TestValidateAgentImport(10 tests):test_skips_config_agents- Config agents skip validationtest_raises_on_missing_agent_module- Error when agent.py is missingtest_raises_on_missing_export- Error when root_agent/app export is missingtest_success_with_root_agent_export- Success with root_agenttest_success_with_app_export- Success with apptest_raises_on_import_error- Helpful message on ImportErrortest_raises_on_basellm_import_error- Specific guidance for BaseLlm errorstest_raises_on_syntax_error- Error on syntax errorstest_cleans_up_sys_modules- Cleanup verificationtest_restores_sys_path- sys.path restoration verificationTestValidateAgentObject(6 tests):test_skips_app_export- Skips validation for 'app' exportstest_warns_on_non_baseagent- Warns for non-BaseAgent objectstest_skips_string_models- Skips validation when model is a stringtest_validates_custom_basellm_serialization- Validates serializable custom BaseLlmtest_raises_on_non_serializable_custom_basellm- Raises on non-serializable custom BaseLlmtest_skips_builtin_models- Skips check for built-in ADK modelsTestAgentEngineAppTemplate(5 tests):test_template_includes_cloudpickle_imports- Template imports cloudpickle and systest_template_registers_agent_module_for_pickle_by_value- Registers agent moduletest_template_registers_submodules_for_pickle_by_value- Registers submodules (clients/, tools/)test_template_handles_non_registerable_modules- Handles non-registerable modules gracefullytest_template_skips_cloudpickle_for_config_agents- Config agents skip cloudpickle registrationTestCloudpickleSerializationFix(2 tests):test_custom_basellm_in_submodule_can_be_serialized- Custom BaseLlm in submodule serializes correctlytest_agent_with_custom_basellm_can_be_serialized- Agent with custom BaseLlm serializes correctlyManual End-to-End (E2E) Tests:
Test setup with custom BaseLlm in submodule:
agents/
├── init.py
├── agent.py
├── clients/
│ ├── init.py
│ └── custom_client.py # Custom BaseLlm implementation
└── requirements.txt
adk deploy agent_engineagent.stream_query()agent.async_stream_query()Checklist
Additional context
Changes made:
_AGENT_ENGINE_APP_TEMPLATE- Added cloudpickle registration:- Agent module exists and can be imported
- Expected export (root_agent or app) is present
- Provides specific guidance for BaseLlm-related import errors
- Custom BaseLlm implementations can be serialized by cloudpickle
- Warns about module path structure for proper relative imports