browser
npx machina-cli add skill brsbl/ottonomous/browser --openclaw
Argument: $ARGUMENTS
| Command | Behavior |
|---|---|
| {url} | Navigate to URL, capture screenshot and ARIA snapshot |
| explore | Interactive exploration: navigate, inspect, understand UI |
| verify {description} | Verify specific UI behavior or state |
| extract {description} | Extract specific data from the frontend |
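The table above maps the argument string to one of four behaviors. A minimal sketch of that dispatch follows. The names `Command` and `parseArguments` are hypothetical, and treating any unrecognized argument as a URL is an assumption; the skill's actual argument handling is not shown in this document.

```typescript
// Hypothetical sketch: Command and parseArguments are not part of the skill.
type Command =
  | { mode: 'navigate'; url: string }
  | { mode: 'explore' }
  | { mode: 'verify'; description: string }
  | { mode: 'extract'; description: string }

function parseArguments(raw: string): Command {
  const text = raw.trim()
  if (text.length === 0) throw new Error('No argument given')
  if (text === 'explore') return { mode: 'explore' }

  // "verify ..." and "extract ..." carry a free-text description.
  for (const mode of ['verify', 'extract'] as const) {
    if (text === mode) throw new Error(`"${mode}" needs a description`)
    if (text.startsWith(`${mode} `)) {
      return { mode, description: text.slice(mode.length).trim() }
    }
  }

  // Assumption: anything else is a URL to navigate to.
  return { mode: 'navigate', url: text }
}

console.log(parseArguments('verify the Submit button is disabled'))
console.log(parseArguments('http://localhost:5173'))
```

Modelling the result as a discriminated union lets the caller switch on `mode` and have the compiler check that `url` or `description` is only read where it exists.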
Choosing Your Approach
- Local/source-available sites: Read source code first to write selectors directly
- Unknown layouts: Use getAISnapshot() for element discovery and selectSnapshotRef() for interactions
- Visual feedback: Take screenshots to observe results
Setup
Start the browser server before running scripts:
node skills/otto/lib/browser/server.js &
Wait for "Ready" message, then connect:
import { connect, waitForPageLoad } from 'skills/otto/lib/browser/client.js'
const client = await connect({ headless: true })
Writing Scripts
Run scripts using npx tsx with heredocs for inline execution.
Key Principles:
- Small scripts doing one action each
- Evaluate state at completion
- Use descriptive page names
- Call await client.disconnect() to exit (pages persist)
- Use plain JavaScript in page.evaluate() (no TypeScript syntax)
Workflow Loop
- Write script performing one action
- Run and observe output
- Evaluate results and current state
- Decide: complete or need another script?
- Repeat until task complete
Navigate & Capture
Determine the dev server URL from package.json scripts, running processes, or project config.
const page = await client.page('main')
await page.goto(url) // e.g., http://localhost:5173
await waitForPageLoad(page)
// Screenshot
await page.screenshot({ path: '.otto/screenshots/page.png' })
// ARIA snapshot
const snapshot = await client.getAISnapshot('main')
console.log(snapshot)
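For the step of determining the dev server URL from package.json scripts, one possible heuristic is to look for an explicit port in the script commands. This is a sketch only: `findDevServerPort` and `devServerUrl` are hypothetical helpers, the script names checked (`dev`, `start`, `serve`) are a guess, and the fallback port has to come from knowledge of the tool in use (for example 5173 for Vite).

```typescript
// Hypothetical helper: looks for an explicit port in package.json scripts.
function findDevServerPort(scripts: Record<string, string>): number | null {
  // Script names that commonly start a dev server (an assumption).
  const candidates = ['dev', 'start', 'serve']
  for (const name of candidates) {
    const command = scripts[name]
    if (command === undefined) continue
    // Matches "--port 3000", "--port=3000", "-p 3000", and "PORT=3000".
    const match = /(?:--port[=\s]+|-p\s+|\bPORT=)(\d{2,5})\b/.exec(command)
    if (match) return Number(match[1])
  }
  return null
}

// Builds a localhost URL, falling back to a caller-supplied default port.
function devServerUrl(scripts: Record<string, string>, fallbackPort: number): string {
  const port = findDevServerPort(scripts) ?? fallbackPort
  return `http://localhost:${port}`
}

console.log(devServerUrl({ dev: 'vite --port 4000' }, 5173)) // http://localhost:4000
console.log(devServerUrl({ dev: 'vite' }, 5173)) // http://localhost:5173
```

A heuristic like this cannot see ports set in config files or environment files, so checking running processes or project config, as noted above, remains the more reliable route.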
Interact
// Click by ref
const btn = await client.selectSnapshotRef('main', 'e3')
await btn.click()
// Fill input by ref
const input = await client.selectSnapshotRef('main', 'e5')
await input.fill('user@example.com')
// Re-capture after interaction
await waitForPageLoad(page)
const newSnapshot = await client.getAISnapshot('main')
Waiting
await waitForPageLoad(page)
await page.waitForSelector('.results')
await page.waitForURL('**/success')
No TypeScript in Browser Context
Code in page.evaluate() runs in the browser context, which has no TypeScript support. Use plain JavaScript only; type annotations break at runtime.
// ✓ Correct
await page.evaluate(() => {
const items = document.querySelectorAll('.item')
return items.length
})
// ✗ Wrong - TypeScript syntax fails
await page.evaluate(() => {
const items: NodeListOf<Element> = document.querySelectorAll('.item')
return items.length
})
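The underlying reason is that a type annotation is not valid JavaScript syntax. The standalone check below illustrates this by asking the JavaScript engine to compile two function bodies without running them. `parsesAsJavaScript` is a hypothetical helper for illustration and is not how the skill evaluates code.

```typescript
// Returns true if `source` parses as a plain JavaScript function body.
function parsesAsJavaScript(source: string): boolean {
  try {
    new Function(source) // compiles the body without executing it
    return true
  } catch (error) {
    if (error instanceof SyntaxError) return false
    throw error
  }
}

const plain = "const items = document.querySelectorAll('.item'); return items.length"
const typed = "const items: NodeListOf<Element> = document.querySelectorAll('.item'); return items.length"

console.log(parsesAsJavaScript(plain)) // true
console.log(parsesAsJavaScript(typed)) // false
```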
Cleanup
await client.disconnect()
After completing the workflow, remove screenshots:
rm -rf .otto/screenshots
ARIA Snapshot Format
- banner:
  - link "Home" [ref=e1]
- main:
  - heading "Welcome" [ref=e2]
  - form:
    - textbox "Email" [ref=e3]
    - button "Submit" [disabled] [ref=e4]
Use [ref=eN] values with selectSnapshotRef() to interact.
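When a snapshot is long, it can help to list the refs programmatically before choosing one. The sketch below parses snapshot text in the format shown above. `SnapshotElement` and `listRefs` are hypothetical names, and the regular expression assumes each element sits on one line as `- role "name" [...] [ref=eN]`; real snapshots may contain variations it does not handle.

```typescript
interface SnapshotElement {
  role: string
  name: string | null
  ref: string
}

// Collects every line of the snapshot that carries a [ref=eN] marker.
function listRefs(snapshot: string): SnapshotElement[] {
  const elements: SnapshotElement[] = []
  for (const line of snapshot.split('\n')) {
    // e.g. `- button "Submit" [disabled] [ref=e4]`
    const match = /^\s*-\s+(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/.exec(line)
    if (match) {
      elements.push({ role: match[1], name: match[2] ?? null, ref: match[3] })
    }
  }
  return elements
}

const sample = [
  '- banner:',
  '  - link "Home" [ref=e1]',
  '- main:',
  '  - heading "Welcome" [ref=e2]',
  '  - form:',
  '    - textbox "Email" [ref=e3]',
  '    - button "Submit" [disabled] [ref=e4]',
].join('\n')

// Find the ref to pass to selectSnapshotRef() for the Submit button.
const submit = listRefs(sample).find((e) => e.role === 'button' && e.name === 'Submit')
console.log(submit?.ref) // e4
```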
Error Recovery
Page state persists after failures. Before the next action, reconnect and inspect the current state with an ARIA snapshot and a screenshot.
// After an error, reconnect and inspect
const client = await connect({ headless: true })
const snapshot = await client.getAISnapshot('main')
console.log(snapshot)
const page = await client.page('main')
await page.screenshot({ path: '.otto/screenshots/debug.png' })
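The recovery steps above can also be folded into the script itself, so that a failing action leaves a screenshot behind. The wrapper below is a sketch: `withDebugScreenshot` and `ScreenshotCapable` are hypothetical names, and the only assumption about the page object is that it has a `screenshot({ path })` method, as the pages in this document do. A fake page stands in for a real one so the example runs on its own.

```typescript
// Minimal shape this sketch needs from a page object (an assumption).
interface ScreenshotCapable {
  screenshot(options: { path: string }): Promise<unknown>
}

async function withDebugScreenshot<T>(
  page: ScreenshotCapable,
  label: string,
  action: () => Promise<T>,
): Promise<T> {
  try {
    return await action()
  } catch (error) {
    const path = `.otto/screenshots/debug-${label}.png`
    // Best effort: a failed screenshot must not hide the original error.
    await page.screenshot({ path }).catch(() => undefined)
    console.error(`Action "${label}" failed; screenshot attempted at ${path}`)
    throw error
  }
}

// Demo with a fake page; a real script would pass the page from client.page().
const fakePage: ScreenshotCapable = {
  screenshot: async ({ path }) => {
    console.log(`(fake) saved ${path}`)
  },
}

withDebugScreenshot(fakePage, 'submit', async () => {
  throw new Error('button not found')
}).catch((error: Error) => console.log(`Caught: ${error.message}`))
```

The original error is rethrown, so the script still exits with a failure and the next script can start from the saved screenshot.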
Scraping Data
For large datasets, intercept and replay network requests rather than scrolling the DOM. See references/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
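As a rough illustration of paginated API replay, the sketch below walks a paginated endpoint through an injected fetch function. Everything about the response shape is hypothetical (`items`, `nextPage`, page numbers starting at 1); a real API will differ and should be worked out from captured requests as described in references/scraping.md. A fake fetcher is used so the example runs without a network.

```typescript
// Hypothetical response shape; adapt it after inspecting captured requests.
interface PageResponse<T> {
  items: T[]
  nextPage: number | null
}

type FetchPage<T> = (page: number) => Promise<PageResponse<T>>

// Follows nextPage links until the API reports no more pages or maxPages is hit.
async function replayAllPages<T>(fetchPage: FetchPage<T>, maxPages = 100): Promise<T[]> {
  const all: T[] = []
  let page: number | null = 1
  let requests = 0
  while (page !== null && requests < maxPages) {
    const response: PageResponse<T> = await fetchPage(page)
    all.push(...response.items)
    page = response.nextPage
    requests += 1
  }
  return all
}

// Fake three-page API standing in for a replayed network request.
const fakeApi: FetchPage<string> = async (page) => ({
  items: [`item-${page}a`, `item-${page}b`],
  nextPage: page < 3 ? page + 1 : null,
})

replayAllPages(fakeApi).then((items) => console.log(items.length)) // 6
```

The `maxPages` cap is a safety net against an API that never returns a terminating page.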
Overview
Automate browser tasks with a persistent page state to navigate, capture screenshots, fill forms, and extract data. This skill helps inspect UI, verify frontend behavior, and automate browser workflows efficiently.
How This Skill Works
You run a local browser server and connect a client to control it. Interact by selecting element references, clicking, filling inputs, waiting for page loads, and capturing ARIA snapshots. Pages persist across scripts, even after the client disconnects, which enables multi-step workflows.
When to Use It
- Inspect complex UI and layouts during development
- Capture screenshots and ARIA snapshots for visual QA
- Fill forms and trigger frontend flows to test behavior
- Extract specific data or page state for validation
- Automate repetitive browser workflows and frontend verifications
Quick Start
- Step 1: Start the browser server: node skills/otto/lib/browser/server.js &
- Step 2: Connect the client: const client = await connect({ headless: true })
- Step 3: Navigate to a URL, wait for page load, and capture a screenshot
Best Practices
- Write small scripts that perform one action at a time
- Evaluate and confirm the final state after each run
- Use descriptive, stable page names for snapshots
- Exit each script with await client.disconnect(); pages persist for follow-up scripts
- Use plain JavaScript inside page.evaluate(); avoid TypeScript syntax
Example Use Cases
- Navigate to a URL, capture a screenshot and ARIA snapshot
- Interact with the UI by clicking a ref and filling an input, then re-capture the ARIA snapshot
- Explore an unknown layout with getAISnapshot() and selectSnapshotRef() for interactions
- Verify a UI condition with verify and waitForURL, then log results
- Extract a data field (e.g., item count) from the frontend using extract