What do I need to run Pinchtab?

You need a running Pinchtab binary (written in Go), which exposes the HTTP API. Environment variables such as BRIDGE_PORT and BRIDGE_HEADLESS tailor its behavior.

Can Pinchtab run in headless mode?

Yes. Headless is the default; set BRIDGE_HEADLESS=false to run a visible (headed) Chrome window. CHROME_FLAGS passes extra flags to Chrome.

What format is the result data returned in?

Results are provided as a flat JSON accessibility tree with stable references, plus page data like text and screenshots when available.

Pinchtab

Verified

@luigi-agosti

npx machina-cli add skill @luigi-agosti/pinchtab --openclaw

Pinchtab

Fast, lightweight browser control for AI agents via HTTP + accessibility tree.

Setup

Start Pinchtab in one of these modes:

# Headless (default) — no UI, pure automation (lowest token cost when using /text and filtered snapshots)
pinchtab &

# Headed — visible Chrome for human + agent workflows
BRIDGE_HEADLESS=false pinchtab &

# Dashboard/orchestrator — profile manager + launcher, no browser in dashboard process
pinchtab dashboard &

Default port: 9867. Override with BRIDGE_PORT=9868. Auth: set BRIDGE_TOKEN=<secret> and pass Authorization: Bearer <secret>.

Base URL for all examples: http://localhost:9867

Token savings come from the API shape (/text, /snapshot?filter=interactive&format=compact), not from headless vs headed alone.
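
The port and token settings above can be wrapped in a small client-side helper so that every call uses the right base URL and sends the Bearer header only when a token is set. This is a minimal sketch: the function names pt_base_url and pt_get are hypothetical (not part of Pinchtab), and it assumes the client shell sees the same BRIDGE_PORT and BRIDGE_TOKEN values as the server.

```shell
# Hypothetical client-side helpers. Only BRIDGE_PORT, BRIDGE_TOKEN and the
# "Authorization: Bearer <secret>" header come from the text above.

# Print the base URL, honouring BRIDGE_PORT (default 9867).
pt_base_url() {
  printf 'http://localhost:%s\n' "${BRIDGE_PORT:-9867}"
}

# GET a path such as /health, sending the Bearer header only when a token is set.
pt_get() {
  if [ -n "${BRIDGE_TOKEN:-}" ]; then
    curl -s -H "Authorization: Bearer ${BRIDGE_TOKEN}" "$(pt_base_url)$1"
  else
    curl -s "$(pt_base_url)$1"
  fi
}
```

With these defined, `pt_get /health` and `pt_get /text` behave the same whether or not auth is enabled.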

Headed mode definition

Headed mode means a real visible Chrome window managed by Pinchtab.

A human opens one or more profiles, logs in, passes 2FA or a captcha, and validates page state
The agent then calls Pinchtab HTTP APIs against that same running profile instance
Session state persists in the profile directory, so follow-up runs reuse cookies and storage

In dashboard workflows, the dashboard process itself does not launch Chrome; it launches profile instances that run Chrome (headed or headless).

To resolve a running profile endpoint from dashboard state:

pinchtab connect <profile-name>

Recommended human + agent flow:

# human
pinchtab dashboard
# setup profile + launch profile instance

# agent
PINCHTAB_BASE_URL="$(pinchtab connect <profile-name>)"
curl "$PINCHTAB_BASE_URL/health"
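
An agent that starts right after the human launches a profile may reach the instance before it is ready. The sketch below polls the /health endpoint until it answers. The function name wait_for_health and the retry defaults are hypothetical, and it assumes /health returns a 2xx status once the instance is usable.

```shell
# wait_for_health BASE_URL [TRIES] [DELAY_SECONDS]
# Polls BASE_URL/health until curl reports success or TRIES run out.
# Assumption: a ready instance answers /health with a 2xx status.
wait_for_health() {
  base_url=$1
  tries=${2:-10}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl -sf "$base_url/health" >/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A typical call would be `wait_for_health "$PINCHTAB_BASE_URL" 20 1 || exit 1` before the first navigate request.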

Profile Management (Dashboard Mode)

When running pinchtab dashboard, profiles are managed via the dashboard API on port 9867.

List profiles

curl http://localhost:9867/profiles

Returns an array of profiles with id, name, accountEmail, useWhen, and other fields.

Start a profile by ID

# Auto-allocate port (recommended)
curl -X POST http://localhost:9867/start/278be873adeb

# With specific port and headless mode
curl -X POST http://localhost:9867/start/278be873adeb \
  -H 'Content-Type: application/json' \
  -d '{"port": "9868", "headless": true}'

Returns the instance info including the allocated port. Use that port for all subsequent API calls (navigate, snapshot, action, etc.).

Stop a profile by ID

curl -X POST http://localhost:9867/stop/278be873adeb

Check profile instance status

curl http://localhost:9867/profiles/Pinchtab%20org/instance

Typical agent flow with profiles

# 1. List profiles to find the right one
PROFILES=$(curl -s http://localhost:9867/profiles)
# Pick the profile ID you need (here: the first profile in the list)
PROFILE_ID=$(echo "$PROFILES" | jq -r '.[0].id')

# 2. Start the profile
INSTANCE=$(curl -s -X POST "http://localhost:9867/start/$PROFILE_ID")
PORT=$(echo "$INSTANCE" | jq -r .port)

# 3. Use the instance (all API calls go to the instance port)
curl -X POST "http://localhost:$PORT/navigate" -H 'Content-Type: application/json' \
  -d '{"url": "https://mail.google.com"}'
curl "http://localhost:$PORT/snapshot?maxTokens=4000"

# 4. Stop when done
curl -s -X POST "http://localhost:9867/stop/$PROFILE_ID"

Profile IDs

Each profile gets a stable 12-char hex ID (SHA-256 of name, truncated) stored in profile.json. The ID is generated at creation time and never changes. Use IDs instead of names in automation — they're URL-safe and stable.
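
To illustrate what "SHA-256 of name, truncated" to 12 hex characters looks like, here is a rough sketch. It is an assumption that the name is hashed as raw bytes with no trailing newline and that the first 12 hex characters are kept; Pinchtab's actual derivation may differ. Because the ID is fixed at creation time, automation should read it from /profiles rather than recompute it. The function name profile_id_sketch is hypothetical.

```shell
# profile_id_sketch NAME
# Prints the first 12 hex characters of SHA-256(NAME).
# Assumption: the name is hashed as raw bytes with no trailing newline;
# Pinchtab's real derivation may differ in detail.
profile_id_sketch() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -c1-12
  else
    # macOS and some BSDs ship shasum instead of sha256sum.
    printf '%s' "$1" | shasum -a 256 | cut -c1-12
  fi
}
```

The result is always 12 lowercase hex characters, which is why such IDs are safe to place directly in a URL path.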

Core Workflow

The typical agent loop:

Navigate to a URL
Snapshot the accessibility tree (get refs)
Act on refs (click, type, press)
Snapshot again to see results

Refs (e.g. e0, e5, e12) are cached per tab after each snapshot — no need to re-snapshot before every action unless the page changed significantly.
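
The loop above can be sketched as a few shell wrappers around the documented endpoints. The names PT_BASE, pt_post, pt_navigate, pt_snapshot, pt_click and agent_step are hypothetical, and the sketch assumes the URL and ref contain no characters that need JSON escaping. In a real agent the ref comes from reading the first snapshot; it is passed in here only to keep the example short.

```shell
# Hypothetical wrappers around the endpoints listed in the API Reference.
PT_BASE="${PT_BASE:-http://localhost:9867}"

# POST a JSON body to a path and print the response.
pt_post() {
  curl -s -X POST "$PT_BASE$1" -H 'Content-Type: application/json' -d "$2"
}

# Assumption: $1 needs no JSON escaping (no quotes or backslashes).
pt_navigate() { pt_post /navigate "{\"url\": \"$1\"}"; }
pt_snapshot() { curl -s "$PT_BASE/snapshot?filter=interactive&format=compact"; }
pt_click()    { pt_post /action "{\"kind\": \"click\", \"ref\": \"$1\"}"; }

# agent_step URL REF
# One pass of the loop: navigate, snapshot, act on a ref, snapshot again.
agent_step() {
  pt_navigate "$1"
  pt_snapshot
  pt_click "$2"
  pt_snapshot
}
```

For example, `agent_step https://example.com e5` issues four requests in order; an agent would normally inspect the first snapshot before choosing the ref.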

API Reference

Navigate

curl -X POST http://localhost:9867/navigate \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com"}'

# With options: custom timeout, block images, open in new tab
curl -X POST http://localhost:9867/navigate \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com", "timeout": 60, "blockImages": true, "newTab": true}'

Snapshot (accessibility tree)

# Full tree
curl http://localhost:9867/snapshot

# Interactive elements only (buttons, links, inputs) — much smaller
curl "http://localhost:9867/snapshot?filter=interactive"

# Limit depth
curl "http://localhost:9867/snapshot?depth=5"

# Smart diff — only changes since last snapshot (massive token savings)
curl "http://localhost:9867/snapshot?diff=true"

# Text format — indented tree, ~40-60% fewer tokens than JSON
curl "http://localhost:9867/snapshot?format=text"

# Compact format — one-line-per-node, 56-64% fewer tokens than JSON (recommended)
curl "http://localhost:9867/snapshot?format=compact"

# YAML format
curl "http://localhost:9867/snapshot?format=yaml"

# Scope to CSS selector (e.g. main content only)
curl "http://localhost:9867/snapshot?selector=main"

# Truncate to ~N tokens
curl "http://localhost:9867/snapshot?maxTokens=2000"

# Combine for maximum efficiency
curl "http://localhost:9867/snapshot?format=compact&selector=main&maxTokens=2000&filter=interactive"

# Disable animations before capture
curl "http://localhost:9867/snapshot?noAnimations=true"

# Write to file
curl "http://localhost:9867/snapshot?output=file&path=/tmp/snapshot.json"

By default, returns a flat JSON array of nodes, each with ref, role, name, depth, value, and nodeId.

Token optimization: Use ?format=compact for best token efficiency. Add ?filter=interactive for action-oriented tasks (~75% fewer nodes). Use ?selector=main to scope to relevant content. Use ?maxTokens=2000 to cap output. Use ?diff=true on multi-step workflows to see only changes. Combine all params freely.
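
Since the parameters combine freely, a tiny URL builder keeps long query strings readable. This is a sketch with a hypothetical function name, snapshot_url. It joins the given key=value pairs as they are and does not URL-encode them, so selectors containing characters such as # or spaces would need encoding first (for instance with curl's -G and --data-urlencode options).

```shell
# snapshot_url BASE [key=value ...]
# Joins query parameters onto BASE/snapshot. Values are used verbatim
# (no URL-encoding), so keep them to simple tokens.
snapshot_url() {
  url="$1/snapshot"
  shift
  sep='?'
  for param in "$@"; do
    url="$url$sep$param"
    sep='&'
  done
  printf '%s\n' "$url"
}
```

Usage: `curl -s "$(snapshot_url http://localhost:9867 format=compact filter=interactive maxTokens=2000)"`.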

Act on elements

# Click by ref
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "click", "ref": "e5"}'

# Type into focused element (click first, then type)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "click", "ref": "e12"}'
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "type", "ref": "e12", "text": "hello world"}'

# Press a key
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "press", "key": "Enter"}'

# Focus an element
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "focus", "ref": "e3"}'

# Fill (set value directly, no keystrokes)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "fill", "selector": "#email", "text": "user@example.com"}'

# Hover (trigger dropdowns/tooltips)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "hover", "ref": "e8"}'

# Select dropdown option (by value or visible text)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "select", "ref": "e10", "value": "option2"}'

# Scroll to element
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "scroll", "ref": "e20"}'

# Scroll by pixels (infinite scroll pages)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "scroll", "scrollY": 800}'

# Click and wait for navigation (link clicks)
curl -X POST http://localhost:9867/action \
  -H 'Content-Type: application/json' \
  -d '{"kind": "click", "ref": "e5", "waitNav": true}'
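
When the text to type comes from a variable, quotes or backslashes in it would break a hand-written JSON body. The sketch below builds a type payload with minimal escaping. The function names json_escape and type_payload are hypothetical, and only backslashes and double quotes are handled; for newlines or control characters a JSON tool such as jq (`jq -n --arg`) is the safer choice.

```shell
# json_escape STRING
# Escapes backslashes and double quotes only (not newlines or control chars).
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# type_payload REF TEXT
# Prints a JSON body for a "type" action on the given ref.
type_payload() {
  printf '{"kind": "type", "ref": "%s", "text": "%s"}\n' "$1" "$(json_escape "$2")"
}
```

Usage: `curl -X POST http://localhost:9867/action -H 'Content-Type: application/json' -d "$(type_payload e12 "$USER_INPUT")"`, where USER_INPUT is a placeholder variable.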

Extract text

# Readability mode (default) — strips nav/footer/ads, keeps article/main content
curl http://localhost:9867/text

# Raw innerText (old behavior)
curl "http://localhost:9867/text?mode=raw"

Returns {url, title, text}. Cheapest option (roughly 800 to 1K tokens for most pages).

Screenshot

# Raw JPEG bytes
curl "http://localhost:9867/screenshot?raw=true" -o screenshot.jpg

# With quality setting (default 80)
curl "http://localhost:9867/screenshot?raw=true&quality=50" -o screenshot.jpg

Evaluate JavaScript

curl -X POST http://localhost:9867/evaluate \
  -H 'Content-Type: application/json' \
  -d '{"expression": "document.title"}'

Tab management

# List tabs
curl http://localhost:9867/tabs

# Open new tab
curl -X POST http://localhost:9867/tab \
  -H 'Content-Type: application/json' \
  -d '{"action": "new", "url": "https://example.com"}'

# Close tab
curl -X POST http://localhost:9867/tab \
  -H 'Content-Type: application/json' \
  -d '{"action": "close", "tabId": "TARGET_ID"}'

Multi-tab: pass ?tabId=TARGET_ID to snapshot/screenshot/text, or "tabId" in POST body.
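
For GET endpoints, appending tabId by hand is easy to get wrong when the URL already has a query string. A small sketch, with a hypothetical function name with_tab, that picks the right separator:

```shell
# with_tab URL TAB_ID
# Appends tabId to URL, using '&' if the URL already has a query string.
with_tab() {
  case "$1" in
    *\?*) printf '%s&tabId=%s\n' "$1" "$2" ;;
    *)    printf '%s?tabId=%s\n' "$1" "$2" ;;
  esac
}
```

Usage: `curl -s "$(with_tab 'http://localhost:9867/snapshot?filter=interactive' TARGET_ID)"`.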

Tab locking (multi-agent)

# Lock a tab (default 30s timeout, max 5min)
curl -X POST http://localhost:9867/tab/lock \
  -H 'Content-Type: application/json' \
  -d '{"tabId": "TARGET_ID", "owner": "agent-1", "timeoutSec": 60}'

# Unlock
curl -X POST http://localhost:9867/tab/unlock \
  -H 'Content-Type: application/json' \
  -d '{"tabId": "TARGET_ID", "owner": "agent-1"}'

Locked tabs show owner and lockedUntil in /tabs. Returns 409 on conflict.
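
An agent usually wants to tell "someone else holds the lock" apart from other failures. The sketch below maps the HTTP status of /tab/lock to an exit code. The function name lock_tab and the exit-code convention are hypothetical, and it assumes a successful lock is reported with a 2xx status; only the 409 conflict response and the 30 second default are taken from the text above.

```shell
# lock_tab BASE TAB_ID OWNER [TIMEOUT_SEC]
# Exit codes: 0 = locked, 2 = held by another owner (HTTP 409), 1 = other failure.
# Assumption: a successful lock returns a 2xx status.
lock_tab() {
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  http_code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$1/tab/lock" \
    -H 'Content-Type: application/json' \
    -d "{\"tabId\": \"$2\", \"owner\": \"$3\", \"timeoutSec\": ${4:-30}}")
  case "$http_code" in
    2??) return 0 ;;
    409) return 2 ;;
    *)   return 1 ;;
  esac
}
```

A caller can then back off and retry on exit code 2, and treat exit code 1 as a hard error.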

Batch actions

# Execute multiple actions in sequence
curl -X POST http://localhost:9867/actions \
  -H 'Content-Type: application/json' \
  -d '[{"kind":"click","ref":"e3"},{"kind":"type","ref":"e3","text":"hello"},{"kind":"press","key":"Enter"}]'

Cookies

# Get cookies for current page
curl http://localhost:9867/cookies

# Set cookies
curl -X POST http://localhost:9867/cookies \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","cookies":[{"name":"session","value":"abc123"}]}'

Stealth

# Check stealth status and score
curl http://localhost:9867/stealth/status

# Rotate browser fingerprint
curl -X POST http://localhost:9867/fingerprint/rotate \
  -H 'Content-Type: application/json' \
  -d '{"os":"windows"}'
# os: "windows", "mac", or omit for random

Health check

curl http://localhost:9867/health

Token Cost Guide

Method	Typical tokens	When to use
`/text`	~800	Reading page content
`/snapshot?filter=interactive`	~3,600	Finding buttons/links to click
`/snapshot?diff=true`	varies	Multi-step workflows (only changes)
`/snapshot?format=compact`	~56-64% less	One-line-per-node, best token efficiency
`/snapshot?format=text`	~40-60% less	Indented tree, cheaper than JSON
`/snapshot`	~10,500	Full page understanding
`/screenshot`	~2K (vision)	Visual verification

Strategy: Start with /snapshot?filter=interactive. Use ?diff=true on subsequent snapshots in multi-step tasks. Use /text when you only need the readable content. Use ?format=compact (or ?format=text) to cut token costs further. Use full /snapshot only for complete page understanding.
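
The strategy can be written down as a small chooser that maps a task type to a request path. This is only a sketch of one reasonable reading of the guidance: the function name pick_request and the task labels read, act and full are hypothetical.

```shell
# pick_request TASK [STEP]
# TASK: read (page content), act (find things to click), full (whole page).
# STEP: 1 for the first snapshot in a task (default), higher for follow-ups.
pick_request() {
  case "$1" in
    read) printf '/text\n' ;;
    act)
      if [ "${2:-1}" -gt 1 ]; then
        # Follow-up snapshots only need what changed.
        printf '/snapshot?filter=interactive&format=compact&diff=true\n'
      else
        printf '/snapshot?filter=interactive&format=compact\n'
      fi
      ;;
    full) printf '/snapshot\n' ;;
    *) return 1 ;;
  esac
}
```

Usage: `curl -s "http://localhost:9867$(pick_request act 2)"`.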

Environment Variables

Core runtime

Var	Default	Description
`BRIDGE_BIND`	`127.0.0.1`	Bind address — localhost only by default. Set `0.0.0.0` for network access
`BRIDGE_PORT`	`9867`	HTTP port
`BRIDGE_HEADLESS`	`true`	Run Chrome headless
`BRIDGE_TOKEN`	(none)	Bearer auth token (recommended when using `BRIDGE_BIND=0.0.0.0`)
`BRIDGE_PROFILE`	`~/.pinchtab/chrome-profile`	Chrome profile dir
`BRIDGE_STATE_DIR`	`~/.pinchtab`	State/session storage
`BRIDGE_NO_RESTORE`	`false`	Skip tab restore on startup
`BRIDGE_STEALTH`	`light`	Stealth level: `light` or `full`
`BRIDGE_MAX_TABS`	`20`	Max open tabs (0 = unlimited)
`BRIDGE_BLOCK_IMAGES`	`false`	Block image loading
`BRIDGE_BLOCK_MEDIA`	`false`	Block all media (images + fonts + CSS + video)
`BRIDGE_NO_ANIMATIONS`	`false`	Disable CSS animations/transitions
`BRIDGE_TIMEZONE`	(none)	Force browser timezone (IANA tz)
`BRIDGE_CHROME_VERSION`	`144.0.7559.133`	Chrome version string used by fingerprint rotation
`CHROME_BINARY`	(auto)	Path to Chrome/Chromium binary
`CHROME_FLAGS`	(none)	Extra Chrome flags (space-separated)
`BRIDGE_CONFIG`	`~/.pinchtab/config.json`	Path to config JSON file
`BRIDGE_TIMEOUT`	`15`	Action timeout (seconds)
`BRIDGE_NAV_TIMEOUT`	`30`	Navigation timeout (seconds)
`CDP_URL`	(none)	Connect to existing Chrome DevTools
`BRIDGE_NO_DASHBOARD`	`false`	Disable dashboard/orchestrator endpoints on instance processes
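
Since BRIDGE_TOKEN is recommended whenever BRIDGE_BIND opens the API beyond localhost, a launch script can check for that combination before starting Pinchtab. A minimal sketch; the function name check_bind_auth is hypothetical and the check is a convention of this example, not something Pinchtab enforces.

```shell
# check_bind_auth
# Returns 1 (and warns on stderr) when the API would listen on a non-local
# address without a token; returns 0 otherwise.
check_bind_auth() {
  bind=${BRIDGE_BIND:-127.0.0.1}
  if [ "$bind" != "127.0.0.1" ] && [ "$bind" != "localhost" ] && [ -z "${BRIDGE_TOKEN:-}" ]; then
    echo "warning: BRIDGE_BIND=$bind without BRIDGE_TOKEN; the API is unauthenticated" >&2
    return 1
  fi
  return 0
}
```

Usage: `check_bind_auth && pinchtab &`.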

Dashboard mode (`pinchtab dashboard`)

Var	Default	Description
`PINCHTAB_AUTO_LAUNCH`	`false`	Auto-launch a default profile at dashboard startup
`PINCHTAB_DEFAULT_PROFILE`	`default`	Profile name for auto-launch
`PINCHTAB_DEFAULT_PORT`	`9867`	Port for auto-launched profile
`PINCHTAB_HEADED`	(unset)	If set, auto-launched profile is headed; unset means headless
`PINCHTAB_DASHBOARD_URL`	`http://localhost:$BRIDGE_PORT`	CLI helper base URL for `pinchtab connect`

Tips

Always pass tabId explicitly when working with multiple tabs — active tab tracking can be unreliable
Refs are stable between snapshot and actions — no need to re-snapshot before clicking
After navigation or major page changes, take a new snapshot to get fresh refs
Use filter=interactive by default, fall back to full snapshot when needed
Pinchtab persists sessions — tabs survive restarts (disable with BRIDGE_NO_RESTORE=true)
Chrome profile is persistent — cookies/logins carry over between runs
Chrome uses its native User-Agent by default — BRIDGE_CHROME_VERSION only affects fingerprint rotation
Use BRIDGE_BLOCK_IMAGES=true or "blockImages": true on navigate for read-heavy tasks — reduces bandwidth and memory

Source

git clone https://clawhub.ai/luigi-agosti/pinchtab

Overview

Pinchtab lets you control a headless or headed Chrome browser through an HTTP API. It exposes the accessibility tree as flat JSON with stable references, optimized for AI agents and low token cost. Use Pinchtab for web automation tasks such as browsing, form filling, navigation, and multi-tab workflows. It requires a running Pinchtab Go binary.

How This Skill Works

Clients send HTTP requests to the Pinchtab API to launch or attach to a Chrome instance, then perform actions such as navigate, click, and fill forms. The API returns data from the page (text, state, screenshots) using a flat JSON accessibility tree with stable refs, designed for fast parsing by AI agents. Pinchtab supports both headless and headed Chrome and can manage multiple tabs in a single session.

When to Use It

Automate browsing and form filling across multi-tab sessions
Scrape page text and extract structured data
Navigate complex sites and capture screenshots for QA
Run AI-driven tasks that rely on a stable, low-cost page representation
Automated UI testing and repetitive web tasks in headless mode

Quick Start

Step 1: Start Pinchtab by running the Go binary to expose the HTTP API (default: http://127.0.0.1:9867).
Step 2: Optionally configure environment variables such as BRIDGE_HEADLESS, BRIDGE_STATE_DIR, BRIDGE_TIMEOUT, and BRIDGE_NAV_TIMEOUT.
Step 3: Send API commands to navigate, click, and fill forms, then fetch results (text, page state, or screenshots) using the flat JSON accessibility tree.

Best Practices

Run Pinchtab with the Go binary and target its HTTP API
Tune BRIDGE_TIMEOUT and BRIDGE_NAV_TIMEOUT for task timing
Block images/media to speed up scraping when appropriate
Limit tabs with BRIDGE_MAX_TABS to control resource use
Rely on the flat JSON accessibility tree's stable refs for reliable element targeting; use BRIDGE_STEALTH if needed

Example Use Cases

Login to an e-commerce site, navigate categories, and extract prices
Fill multi-step forms across pages and submit results
Open several product pages simultaneously and compile a catalog
Take screenshots of product pages for visual QA
AI agent scrapes article text and metadata from news sites

Frequently Asked Questions
