Danny/kernel 742 create yutori n1 computer use cli templates (ts/python) #89
Open
dprevoznik wants to merge 14 commits into main from danny/kernel-742-create-yutori-n1-computer-use-cli-templates-typescript
Add new CLI templates for Yutori's n1 computer use model, enabling users to quickly scaffold browser automation projects using Kernel's infrastructure.
Templates (TypeScript & Python):
- Agentic sampling loop with n1's OpenAI-compatible API
- Computer tool mapping n1 actions (click, type, scroll, drag, etc.) to Kernel's Computer Controls API
- Coordinate scaling from n1's 1000x1000 relative space to actual viewport
- Session management with replay recording support
- read_texts_and_links action using Playwright execution API (with fallback)
Key implementation details:
- n1 requires screenshots sent with role 'observation' (not 'user')
- Model: n1-preview-2025-11 outputs coordinates in 1000x1000 space
- Viewport: 1200x800 at 25Hz (closest to Yutori's recommended 1280x800)
- Navigation actions (refresh, go_back, goto_url) use keyboard shortcuts via Computer Controls since n1 doesn't use Playwright directly
Also updated:
- .gitignore: Added qa-* to exclude QA testing directories
- pkg/create/templates.go: Registered new yutori-computer-use templates
- .cursor/commands/qa.md: Added Yutori templates to QA testing matrix
Closes KERNEL-742
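The coordinate scaling described in this commit can be illustrated with a minimal Python sketch. It assumes the 1000x1000 model space and the 1200x800 viewport stated above. The function name `scale_coordinates` and the clamping step are hypothetical and not taken from the templates.

```python
def scale_coordinates(
    x: float,
    y: float,
    viewport_width: int = 1200,
    viewport_height: int = 800,
    model_space: int = 1000,
) -> tuple[int, int]:
    """Map a point from n1's relative space to viewport pixels.

    Hypothetical helper: the real templates may name and structure this differently.
    """
    px = round(x / model_space * viewport_width)
    py = round(y / model_space * viewport_height)
    # Assumption: clamp so that a coordinate on the far edge stays inside the viewport.
    px = min(max(px, 0), viewport_width - 1)
    py = min(max(py, 0), viewport_height - 1)
    return px, py


if __name__ == "__main__":
    # The centre of the model space lands on the centre of the viewport.
    print(scale_coordinates(500, 500))  # (600, 400)
```

Because the two axes scale by different factors (1.2 horizontally, 0.8 vertically), each axis has to be scaled independently rather than with a single ratio.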
Replace page.accessibility.snapshot() with page._snapshotForAI() which is specifically designed for AI agents and documented in Kernel's MCP server. The previous implementation used the experimental/deprecated accessibility API which failed silently and fell back to screenshot-only mode. _snapshotForAI() returns a structured representation of the page optimized for LLM consumption, including visible text, interactive elements (links, buttons, inputs), and page structure - exactly what n1 needs for reading texts and saving URLs for citation.
Add PlaywrightComputerTool adapter that connects via CDP WebSocket for
browser-only screenshots, optimized for Yutori n1's training data per
their documentation recommendations.
Changes:
- Add PlaywrightComputerTool class (TS + Python) using CDP connection
- Add 'mode' parameter to sampling loop ('computer_use' | 'playwright')
- Default to 'computer_use' mode (stable); 'playwright' is opt-in
- Add configurable viewport dimensions (1200x800)
- Expose cdp_ws_url from session for Playwright connection
- Add playwright-core (TS) and playwright (Python) dependencies
The playwright mode provides viewport-only screenshots without OS UI or
browser chrome, improving n1 model performance per Yutori's docs:
https://docs.yutori.com/reference/n1#screenshot-requirements
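As a rough illustration of the opt-in 'mode' parameter, the following Python sketch resolves the mode from an invocation payload and falls back to 'computer_use'. The helper name `resolve_mode` and the validation behaviour are assumptions made for illustration. Only the two mode values and the default come from the commit message.

```python
from typing import Any, Literal, Mapping

Mode = Literal["computer_use", "playwright"]
VALID_MODES: tuple[str, ...] = ("computer_use", "playwright")
DEFAULT_MODE: Mode = "computer_use"  # stable default per the commit message


def resolve_mode(payload: Mapping[str, Any]) -> Mode:
    """Return the screenshot mode requested in the payload (hypothetical helper)."""
    mode = payload.get("mode", DEFAULT_MODE)
    if mode not in VALID_MODES:
        # Assumption: reject unknown modes instead of silently using the default.
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    return mode


if __name__ == "__main__":
    print(resolve_mode({"query": "..."}))
    print(resolve_mode({"query": "...", "mode": "playwright"}))
```

Failing loudly on an unrecognised value is one possible design choice. A template could equally log a warning and continue in the default mode.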
Add templates + modes for Yutori to QA file
Fix drag operations that previously did not work properly in Playwright mode.
Use ariaSnapshot instead of the existing method, as ariaSnapshot is stably available in both Python and TypeScript versions.
Issue: The ComputerTool.screenshot() method was a synchronous function, but:
- The N1ComputerToolProtocol expected it to be async
- The PlaywrightComputerTool.screenshot() was async
- The loop.py code tried to await it
Fix:
- Changed def screenshot() to async def screenshot()
- Updated all handler methods to await self.screenshot() instead of return self.screenshot()
Update default delays for actions and screenshots
… moving. Clarified instructions for both computer_use and playwright modes to enhance user understanding and execution accuracy.
The cleanup removed ~300 lines of redundant inline comments and verbose method docstrings while keeping the useful class-level documentation you restored. The templates now match the minimal-comment style of the existing anthropic/openai templates in the codebase.
#88) This PR updates the Go SDK to cee2050be3f8136505d41c20c2903dfca2cbc479 and adds CLI commands for new SDK methods.
## SDK Update
- Updated kernel-go-sdk to cee2050be3f8136505d41c20c2903dfca2cbc479
## Coverage Analysis
This PR was generated by performing a full enumeration of SDK methods and CLI commands.
## New Commands
- `kernel credential-providers list` - List configured external credential providers
- `kernel credential-providers get <id>` - Get a credential provider by ID
- `kernel credential-providers create` - Create a new credential provider (supports 1Password)
- `kernel credential-providers update <id>` - Update a credential provider's configuration
- `kernel credential-providers delete <id>` - Delete a credential provider
- `kernel credential-providers test <id>` - Test a credential provider connection
## Breaking Changes Fixed
- Fixed `browsers.Get()` calls to pass new required `BrowserGetParams` parameter
Triggered by: kernel/kernel-go-sdk@cee2050
Reviewer: @masnwilliams
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces new CLI surfaces and updates for latest SDK.
>
> - **Agent Auth CLI**: `kernel agents auth` with `create/get/list/delete`, `invocations {create/get/exchange/submit}`, and end‑to‑end `run` flow (auto field submission, TOTP, optional live view); docs and examples added to `README.md`.
> - **Credential Providers CLI**: `kernel credential-providers {list/get/create/update/delete/test}` (supports 1Password), wired into root.
> - **Browsers API updates**: adapt to SDK breaking change (`browsers.Get` now requires `BrowserGetParams`); add `process resize` and filesystem watch (`fs watch start/stop/events`) commands; tests updated accordingly.
> - **Dependencies**: bump `kernel-go-sdk` to cee2050… and add `pquerna/otp`; regenerate `go.sum`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 0b27df6. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Mason Williams <43387599+masnwilliams@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursor-agent@kernel.sh>
Co-authored-by: Cursor Agent <cursor-agent@onkernel.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
…se-cli-templates-typescript
Contributor
Author
Working on fixing comments from Bugbot, then will request review.
… and remove unused dependencies from Python and TypeScript templates.
…se-cli-templates-typescript
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
pkg/templates/typescript/yutori-computer-use/tools/playwright-computer.ts
Add Yutori n1 Computer Use CLI Templates
This PR adds new CLI templates for Yutori's n1 computer use model, enabling users to quickly scaffold browser automation projects using Kernel's infrastructure.
New Templates
kernel create --template ts-yutori-cua
kernel create --template python-yutori-cua
Features
Both templates include:
- Computer tool mapping n1 actions (click, type, scroll, drag, hover, key_press, wait, refresh, go_back, goto_url, stop) to Kernel's Computer Controls API
Dual Screenshot Modes
- computer_use (default)
- playwright
Implementation Details
- n1-preview-2025-11 outputs coordinates in 1000×1000 space
With Playwright Mode for viewport-only screenshots
kernel invoke ts-yutori-cua cua-task --payload '{"query": "...", "mode": "playwright"}'
Files Changed
- pkg/templates/typescript/yutori-computer-use/ - TypeScript template
- pkg/templates/python/yutori-computer-use/ - Python template
- pkg/create/templates.go - Template registration
Closes KERNEL-742
Note
Introduces ready-to-deploy Yutori n1 computer use templates with dual screenshot modes and full browser session/replay support.
- typescript/yutori-computer-use and python/yutori-computer-use with n1 sampling loops, action-to-Computer Controls mapping, coordinate scaling, and optional Playwright CDP mode
- yutori-computer-use in pkg/create/templates.go (names, sorting priority, deploy/invoke configs for both languages)
- .gitignore ignores qa-* dirs
Written by Cursor Bugbot for commit 7e4ce52. This will update automatically on new commits. Configure here.