What do I need to run browser-use?

You need Node.js (to run the Playwright MCP server via npx), Python 3 (for scripts/mcp-client.py), and a free port (e.g., 8808). Start the server with the provided script or the npx command, and make sure the shared browser context is enabled.

How do I preserve state across calls?

Always start the server with the --shared-browser-context flag; subsequent calls reuse the same browser context to keep session state.

How can I run complex workflows?

Use browser_run_code to execute multiple actions atomically; for simpler tasks, sequence individual browser_* calls.

browser-use

npx machina-cli add skill aiskillstore/marketplace/browser-use --openclaw

Browser Automation

Automate browser interactions via Playwright MCP server.

Server Lifecycle

Start Server

# Using helper script (recommended)
bash scripts/start-server.sh

# Or manually
npx @playwright/mcp@latest --port 8808 --shared-browser-context &

Stop Server

# Using helper script (closes browser first)
bash scripts/stop-server.sh

# Or manually
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_close -p '{}'
pkill -f "@playwright/mcp"

When to Stop

End of task: Stop when browser work is complete
Long sessions: Keep running if doing multiple browser tasks
Errors: Stop and restart if browser becomes unresponsive

Important: The --shared-browser-context flag is required to maintain browser state across multiple mcp-client.py calls. Without it, each call gets a fresh browser context.
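Because the manual start command runs the server in the background, the first client call can race the server's startup. A minimal sketch of a readiness check, assuming that an open TCP port is a good enough signal; `wait_for_port` is a hypothetical helper and not part of this skill's scripts:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0,
                  interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 8808)` before the first `mcp-client.py` call would cover the common case. An open port only shows that a process is listening, not that a browser has launched.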

Quick Reference

Navigation

# Go to URL
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_navigate \
  -p '{"url": "https://example.com"}'

# Go back
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_navigate_back -p '{}'
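Every call in this skill has the same command shape: a tool name plus a JSON parameter object. When driving the client from Python instead of a shell, that shape could be wrapped as below. This is a sketch: `build_call_argv` and `call_tool` are hypothetical helper names, and the client path and URL are the ones used in this document.

```python
import json
import subprocess
from typing import Any, Dict, List, Optional

CLIENT = "scripts/mcp-client.py"      # client path used throughout this skill
SERVER_URL = "http://localhost:8808"  # port from the start command


def build_call_argv(tool: str, params: Optional[Dict[str, Any]] = None,
                    url: str = SERVER_URL) -> List[str]:
    """Build the argv for one `mcp-client.py call` invocation."""
    payload = json.dumps(params or {})  # serializes to '{}' when no params are given
    return ["python3", CLIENT, "call", "-u", url, "-t", tool, "-p", payload]


def call_tool(tool: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Run the client and return its stdout.

    check=True raises CalledProcessError if the client exits non-zero.
    """
    result = subprocess.run(build_call_argv(tool, params),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing an argv list to `subprocess.run` avoids shell quoting entirely, so the JSON payload needs no extra escaping.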

Get Page State

# Accessibility snapshot (returns element refs for clicking/typing)
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_snapshot -p '{}'

# Screenshot
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_take_screenshot \
  -p '{"type": "png", "fullPage": true}'

Interact with Elements

Use ref from snapshot output to target elements:

# Click element
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_click \
  -p '{"element": "Submit button", "ref": "e42"}'

# Type text
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_type \
  -p '{"element": "Search input", "ref": "e15", "text": "hello world", "submit": true}'

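# Fill form (multiple fields; each field needs a name, a type, a ref, and a value)
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_fill_form \
  -p '{"fields": [{"name": "Email", "type": "textbox", "ref": "e10", "value": "john@example.com"}, {"name": "Password", "type": "textbox", "ref": "e12", "value": "password123"}]}'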

# Select dropdown
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_select_option \
  -p '{"element": "Country dropdown", "ref": "e20", "values": ["US"]}'
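The refs in the commands above (e42, e15, and so on) have to be read out of the snapshot text. A rough sketch of pulling them out automatically, assuming snapshot lines look like `- button "Submit" [ref=e42]`. That line shape is an assumption about the snapshot output, and `refs_by_label` is a hypothetical helper:

```python
import re
from typing import Dict

# Assumed line shape: - role "accessible name" [other attrs] [ref=e42]
REF_LINE = re.compile(
    r'-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)".*?\[ref=(?P<ref>\w+)\]'
)


def refs_by_label(snapshot_text: str) -> Dict[str, str]:
    """Map 'role name' labels (e.g. 'button Submit') to their ref ids.

    If two elements share a label, the later one overwrites the earlier one.
    """
    refs: Dict[str, str] = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.search(line)
        if match:
            refs[f'{match["role"]} {match["name"]}'] = match["ref"]
    return refs
```

Lines without a quoted name or a `[ref=...]` marker are skipped, so plain text nodes in the snapshot do not produce entries.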

Wait for Conditions

# Wait for text to appear
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_wait_for \
  -p '{"text": "Success"}'

# Wait for a fixed time (seconds)
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_wait_for \
  -p '{"time": 2}'

Execute JavaScript

# The function parameter must be a JavaScript function expression, not a bare statement
python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_evaluate \
  -p '{"function": "() => document.title"}'

Multi-Step Playwright Code

For complex workflows, use browser_run_code to run multiple actions in one call:

python3 scripts/mcp-client.py call -u http://localhost:8808 -t browser_run_code \
  -p '{"code": "async (page) => { await page.goto(\"https://example.com\"); await page.click(\"text=Learn more\"); return await page.title(); }"}'

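Tip: Use browser_run_code for complex multi-step operations that should run as a single unit. The whole function executes in one tool call, but steps that have already run are not rolled back if a later step throws, so "all-or-nothing" describes the call's result rather than the page's state.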
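Hand-escaping the quotes inside the code string, as in the command above, gets error-prone as snippets grow. One possible approach is to write the Playwright code as a plain string and let a JSON serializer do the escaping. `run_code_payload` is a hypothetical helper name:

```python
import json

# Playwright snippet as a plain Python string: no manual backslash escaping.
code = """async (page) => {
  await page.goto("https://example.com");
  await page.click("text=Learn more");
  return await page.title();
}"""


def run_code_payload(source: str) -> str:
    """Serialize the snippet into the JSON string passed to -p."""
    return json.dumps({"code": source})
```

The returned string can be passed as the `-p` argument. Inner quotes and newlines are escaped by `json.dumps`, and parsing the payload yields the original snippet unchanged.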

Workflow: Form Submission

Navigate to page
Get snapshot to find element refs
Fill form fields using refs
Click submit
Wait for confirmation
Screenshot result
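The six steps above map one-to-one onto tool calls. A sketch that lays them out as data, assuming the refs are already known; in a real run they would be read from the snapshot in step 2 before steps 3 and 4 are built. `form_submission_steps` is a hypothetical helper, and the tool names and parameters are the ones from the Quick Reference:

```python
from typing import Any, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]  # (tool name, parameters)


def form_submission_steps(url: str, fields: List[Dict[str, str]],
                          submit_ref: str, success_text: str) -> List[Step]:
    """Return the six workflow steps as (tool, params) pairs, in order."""
    return [
        ("browser_navigate", {"url": url}),
        ("browser_snapshot", {}),
        ("browser_fill_form", {"fields": fields}),
        ("browser_click", {"element": "Submit button", "ref": submit_ref}),
        ("browser_wait_for", {"text": success_text}),
        ("browser_take_screenshot", {"type": "png", "fullPage": True}),
    ]
```

Each pair can then be sent through `mcp-client.py call` in sequence, stopping at the first failure.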

Workflow: Data Extraction

Navigate to page
Get snapshot (contains text content)
Use browser_evaluate for complex extraction
Process results
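For the browser_evaluate step, the extraction logic has to be sent as a JavaScript function string. A sketch that builds such a payload for "return the text of every element matching a CSS selector", assuming browser_evaluate accepts an arrow-function string; `extract_text_payload` is a hypothetical helper:

```python
import json


def extract_text_payload(css_selector: str) -> str:
    """Build the -p JSON for a browser_evaluate call that returns the
    trimmed text of every element matching css_selector."""
    # json.dumps produces a quoted, escaped literal that JavaScript also accepts.
    selector_literal = json.dumps(css_selector)
    function = (
        "() => Array.from(document.querySelectorAll(" + selector_literal + "))"
        ".map((el) => el.textContent.trim())"
    )
    return json.dumps({"function": function})
```

Serializing the selector separately keeps quotes inside it (for example in attribute selectors) from breaking the generated JavaScript.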

Tool Reference

See references/playwright-tools.md for complete tool documentation.

Troubleshooting

Issue	Solution
Element not found	Run browser_snapshot first to get current refs
Click fails	Try browser_hover first, then click
Form not submitting	Use `"submit": true` with browser_type
Page not loading	Increase wait time or use browser_wait_for

Source

SKILL.md: https://github.com/aiskillstore/marketplace/blob/main/skills/92bilal26/browser-use/SKILL.md

Overview

Automate browser tasks via the Playwright MCP server, including navigation, form submission, interactions, screenshots, and data extraction. It is suited to web scraping, UI testing, and browser-driven workflows that need to keep state across calls.

How This Skill Works

Start the Playwright MCP server and drive it with mcp-client calls such as browser_navigate, browser_click, and browser_type. A shared browser context (--shared-browser-context) preserves state across calls; use browser_snapshot to map element refs and browser_run_code for atomic multi-step workflows.

When to Use It

Automate login and multi-page user flows (navigation, form submission, and verification).
Fill and submit forms across sites and capture confirmations.
Scrape dynamic data and take screenshots for catalogs or reports.
Run UI tests by clicking through elements and asserting results.
Maintain session state across multiple browser tasks in long-running workflows.

Quick Start

Step 1: Start the MCP server with a shared browser context, using either bash scripts/start-server.sh or the npx command.
Step 2: Navigate to a URL and run a snapshot to locate element refs.
Step 3: Interact with elements (click/type) and optionally take a screenshot or run code.

Best Practices

Use the shared browser context to keep session state across calls.
Take a fresh snapshot before acting: refs come from the most recent snapshot and can change once the page updates.
Leverage browser_run_code for atomic, multi-step workflows that must be all-or-nothing.
Wait for specific text or elements and verify results with screenshots or evaluations.
Gracefully stop the server when tasks finish or errors occur, and implement simple retries.
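The "simple retries" mentioned above could look like the following sketch. `with_retries` is a hypothetical helper, and the retried action is any zero-argument callable, for example a function that runs one `mcp-client.py` call:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(action: Callable[[], T], attempts: int = 3,
                 delay: float = 1.0) -> T:
    """Call action until it succeeds or attempts run out.

    The error from the final attempt is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)  # fixed pause between attempts
    raise ValueError("attempts must be at least 1")
```

A fixed delay keeps the sketch short. Retrying a click or type action is only safe when repeating it has no unwanted side effects, such as submitting a form twice.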

Example Use Cases

Automated login to a user dashboard, navigate to reports, and extract data.
Auto-fill and submit signup forms, then verify success messages.
Scrape product titles and prices from a dynamic catalog and save screenshots for a report.
End-to-end UI regression test of a multi-step form workflow across pages.
Extract page text content using browser_evaluate for structured data processing.
