What is firefox-browser?

A bridge that lets you control the user's actual Firefox browser via WebSocket, preserving login state and cookies.

Can I run in headless mode?

No. This only controls the real Firefox session, not a headless browser.

How do I fill forms with multiple fields?

Use fillForm with a fields array that specifies selectors and values for inputs, textareas, and selects.

firefox-browser

Flagged

{"isSafe":false,"isSuspicious":true,"riskLevel":"high","findings":[{"category":"data_exfiltration","severity":"high","description":"The Firefox Browser Bridge can access sensitive user data from the actual browser session, including login data, cookies, and page content via actions like getAuthContext and getContent, as well as screenshots. This creates a potential leakage path if misused.","evidence":"This uses their real browser with existing logins and cookies - not a headless browser. getAuthContext Detect login pages, available accounts. getContent Get page content. screenshot Capture visible area as PNG."},{"category":"data_exfiltration","severity":"high","description":"The component can detect authentication contexts and read available accounts, which could enable credential theft or unauthorized access across sites if data is exfiltrated or misused.","evidence":"Authentication section shows getAuthContext | Detect login pages, available accounts. The description emphasizes leveraging existing browser sessions (logins and cookies)."}],"summary":"The content describes a browser-automation bridge that operates on the user’s live Firefox session, with access to login pages, cookies, and page content. This provides powerful capabilities that could enable data exfiltration or credential misuse if abused. While the tool enables legitimate automation, it requires strong safeguards (consent, least privilege, auditing, and domain restrictions) to mitigate privacy and security risks."}

npx machina-cli add skill aiskillstore/marketplace/firefox-browser --openclaw


Firefox Browser Agent Bridge

Control the user's actual Firefox browser session via WebSocket. This uses their real browser with existing logins and cookies - not a headless browser.

Quick Start

# 0. If Firefox isn't running, start it first
nohup firefox &>/dev/null &

# 1. Check connection
browser ping

# 2. See what tabs are open
browser listTabs '{}'

# 3. Start a new session (recommended)
browser newSession '{"url": "https://example.com"}'

# 4. Read the page with interactable elements marked
browser getContent '{"format": "annotated"}'
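Firefox and the extension can take a moment to come up after step 0, so a script may want to retry `browser ping` before sending other commands. The following is a minimal sketch, not part of the skill itself: `wait_for_bridge` and the `BROWSER_CMD` variable are hypothetical names, and the loop assumes `browser ping` exits with a non-zero status while the bridge is unreachable, which the source does not state.

```shell
# Hypothetical helper: retry `browser ping` until it succeeds or attempts run out.
# Assumption: `browser ping` exits non-zero while the bridge is unreachable.
BROWSER_CMD="${BROWSER_CMD:-browser}"

wait_for_bridge() (
  attempts="${1:-10}"   # how many pings to try
  delay="${2:-1}"       # seconds to sleep between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$BROWSER_CMD" ping >/dev/null 2>&1; then
      echo "bridge reachable after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "bridge not reachable after $attempts attempt(s)" >&2
  return 1
)
```

A script could then gate its first real command on the helper, for example `wait_for_bridge 10 1 && browser listTabs '{}'`.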

Client Usage

browser <action> '<json_params>'

Actions Reference

Session & Tab Management

Action	Description	Key Params
`listTabs`	List all open tabs across windows	-
`newSession`	Create new tab to work in	`url` (optional)
`setActiveTab`	Switch which tab agent works on	`tabId`, `focus`
`getActiveTab`	Get current tab info	-

Navigation & Page Info

Action	Description	Key Params
`navigate`	Go to URL in current tab	`url`, `wait`, `newTab`
`getContent`	Get page content	`format`: `annotated`, `text`, `html`
`getInteractables`	List clickable elements and inputs	`selector` (optional scope)
`screenshot`	Capture visible area as PNG	`filename` (optional)

Interaction

Action	Description	Key Params
`click`	Click element	`selector`, `text`, or `x`/`y` coords
`type`	Type into focused/selected input	`selector`, `text`, `submit`, `clear`
`fillForm`	Fill form fields (inputs, textareas, selects)	`fields[]` array with selector/value
`waitFor`	Wait for element/text	`selector`, `text`, `timeout`

fillForm - The Right Way to Fill Forms

IMPORTANT: There is no fill command. Use fillForm with a fields array:

# Fill a single field
browser fillForm '{"fields": [{"selector": "#email", "value": "test@example.com"}]}'

# Fill multiple fields at once (text inputs, textareas, AND select dropdowns)
browser fillForm '{"fields": [
  {"selector": "#name", "value": "John Doe"},
  {"selector": "#email", "value": "john@example.com"},
  {"selector": "#subject", "value": "support"},
  {"selector": "#message", "value": "Hello world"}
]}'

Works with: <input>, <textarea>, <select>, checkboxes, radio buttons.
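Hand-writing the `fields` array gets error-prone once values contain quotes. As a rough sketch, the payload can be assembled from `selector=value` arguments. `json_escape` and `fill_form_payload` are hypothetical helpers, not part of the `browser` client; the sketch escapes only backslashes and double quotes, and it splits each argument at the first `=`, so selectors that themselves contain `=` would need a different separator.

```shell
# Escape backslashes and double quotes so a value is safe inside a JSON string.
# (Control characters such as newlines are not handled in this sketch.)
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: turn "selector=value" arguments into a fillForm payload.
fill_form_payload() (
  sep=""
  printf '{"fields": ['
  for pair in "$@"; do
    selector="${pair%%=*}"   # text before the first "="
    value="${pair#*=}"       # text after the first "="
    printf '%s{"selector": "%s", "value": "%s"}' \
      "$sep" "$(json_escape "$selector")" "$(json_escape "$value")"
    sep=", "
  done
  printf ']}\n'
)
```

The result is passed straight to the documented action, for example `browser fillForm "$(fill_form_payload '#name=John Doe' '#email=john@example.com')"`.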

Control Flow

Action	Description	Key Params
`fork`	Duplicate tab into multiple paths	`paths[]` with name + commands
`killFork`	Close a fork	`fork` (name)
`listForks`	List active forks	-
`tryUntil`	Try alternatives until one succeeds	`alternatives[]`, `timeout`
`parallel`	Run commands on multiple URLs	`branches[]` with url + commands

Authentication

Action	Description	Key Params
`getAuthContext`	Detect login pages, available accounts	-
`requestAuth`	Request user approval for auth	`reason`
`configureAuth`	Set auth preferences	`authMode`, `setSiteRule`, `domain`

Recommended Workflow

1. Start by Inspecting Available Tabs

browser listTabs '{}'

Returns:

{
  "activeTabId": 123,
  "windows": [
    {
      "windowId": 1,
      "focused": true,
      "tabs": [
        {"tabId": 123, "url": "https://...", "title": "...", "active": true}
      ]
    }
  ],
  "totalTabs": 5
}

2. Start Fresh or Pick Existing Tab

# Start fresh
browser newSession '{"url": "https://amazon.com"}'

# Or switch to existing tab
browser setActiveTab '{"tabId": 456}'

3. Read Page with Annotated Format (Recommended)

browser getContent '{"format": "annotated"}'

Returns content with interactive elements marked inline:

Product Name Here
$4.99
[button: "Add to cart" | selector: #add-btn]
[input:text: "search" | value: "" | selector: #search-box]
[link: "View details" | href: /product/123 | selector: a.details-link]

This shows what's clickable and where it is in context.

4. Interact Using Selectors

# Click using selector from annotated output
browser click '{"selector": "#add-btn"}'

# Or by text (prefers visible elements)
browser click '{"text": "Add to cart"}'

# Type into input
browser type '{"selector": "#search-box", "text": "query", "submit": true}'
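On pages that render asynchronously, a click can race the element it targets. One possible pattern is to pair `waitFor` with `click`, sketched below. `click_when_ready` and `BROWSER_CMD` are hypothetical names. The sketch assumes `browser waitFor` exits non-zero when the wait fails, and that the selector contains no double quotes or backslashes. The timeout is passed through unchanged; the `tryUntil` example in this document uses `3000`, which suggests milliseconds, but the unit is not stated.

```shell
BROWSER_CMD="${BROWSER_CMD:-browser}"

# Hypothetical helper: wait for a selector to appear, then click it.
# Assumption: `browser waitFor` exits non-zero if the element never appears.
click_when_ready() (
  selector="$1"
  timeout="${2:-5000}"   # passed through as the documented `timeout` param
  "$BROWSER_CMD" waitFor "{\"selector\": \"$selector\", \"timeout\": $timeout}" || return 1
  "$BROWSER_CMD" click "{\"selector\": \"$selector\"}"
)
```

For the annotated output above, `click_when_ready '#add-btn'` would wait for the button and then click it.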

Fork: Speculative Parallel Execution

When you're not sure which path is right, fork the tab and try both:

# Create forks
browser fork '{
  "paths": [
    {
      "name": "google-auth",
      "commands": [{"action": "click", "params": {"text": "Sign in with Google"}}]
    },
    {
      "name": "email-auth",
      "commands": [{"action": "click", "params": {"text": "Sign in with Email"}}]
    }
  ]
}'

Returns:

{
  "forked": true,
  "sourceTabId": 123,
  "forks": [
    {"name": "google-auth", "tabId": 456, "url": "...", "commandResults": [...]},
    {"name": "email-auth", "tabId": 789, "url": "...", "commandResults": [...]}
  ]
}

Work on a specific fork:

browser getContent '{"format": "annotated", "fork": "google-auth"}'
browser click '{"text": "Continue", "fork": "google-auth"}'

Kill the wrong path:

browser killFork '{"fork": "email-auth"}'
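With more than two forks, closing the losing paths one by one gets repetitive. A small sketch of a cleanup loop follows; `keep_fork` and `BROWSER_CMD` are hypothetical names, and the loop only issues the documented `killFork` action for each fork name other than the one being kept.

```shell
BROWSER_CMD="${BROWSER_CMD:-browser}"

# Hypothetical helper: keep_fork <winner> <fork names...>
# Closes every listed fork except the winner.
keep_fork() (
  winner="$1"
  shift
  for name in "$@"; do
    if [ "$name" != "$winner" ]; then
      "$BROWSER_CMD" killFork "{\"fork\": \"$name\"}"
    fi
  done
)
```

For the example above, `keep_fork google-auth google-auth email-auth` would close only `email-auth`.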

TryUntil: Handle Uncertain UI

When the exact button varies (cookie banners, A/B tests):

browser tryUntil '{
  "alternatives": [
    {"action": "click", "params": {"selector": "#accept-cookies"}},
    {"action": "click", "params": {"text": "Accept All"}},
    {"action": "click", "params": {"selector": ".cookie-dismiss"}}
  ],
  "timeout": 3000
}'

Tries each until one succeeds.

Parallel: Multiple URLs at Once

Compare prices across sites:

browser parallel '{
  "branches": [
    {"url": "https://amazon.com/product", "commands": [{"action": "getContent", "params": {"format": "text"}}]},
    {"url": "https://walmart.com/product", "commands": [{"action": "getContent", "params": {"format": "text"}}]}
  ]
}'
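When the list of sites is longer or comes from a variable, the `branches` array can be generated rather than typed. The sketch below builds the same payload shape as the example above from a list of URLs. `parallel_text_payload` is a hypothetical helper, and it assumes the URLs contain no double quotes or backslashes.

```shell
# Hypothetical helper: build a `parallel` payload that fetches text content
# from each URL given as an argument.
parallel_text_payload() (
  sep=""
  printf '{"branches": ['
  for url in "$@"; do
    printf '%s{"url": "%s", "commands": [{"action": "getContent", "params": {"format": "text"}}]}' \
      "$sep" "$url"
    sep=", "
  done
  printf ']}\n'
)
```

It would be used as `browser parallel "$(parallel_text_payload https://amazon.com/product https://walmart.com/product)"`.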

Authentication

The bridge detects auth pages and leverages existing browser sessions:

# Check if on login page
browser getAuthContext '{}'

# Returns available accounts, OAuth options, etc.

Isolated Sessions (for Parallel Execution)

When running multiple tasks in parallel, use tabId to avoid conflicts:

# 1. Create isolated session - get a unique tabId
browser newSession '{"url": "https://example.com"}'
# Returns: {"tabId": 15, "url": "...", "windowId": 1}

# 2. Use that tabId in ALL subsequent commands
browser navigate '{"url": "https://example.com/page", "tabId": 15}'
browser getContent '{"format": "annotated", "tabId": 15}'
browser click '{"selector": "#btn", "tabId": 15}'
browser type '{"selector": "#input", "text": "hello", "tabId": 15}'

This lets multiple agents work in parallel without stepping on each other.
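Repeating the `tabId` in every command is easy to forget. One way to avoid that is a thin wrapper that captures the id once and injects it into each params object, sketched here. `extract_tab_id`, `with_tab_id`, `in_tab`, and `BROWSER_CMD` are all hypothetical names. The text matching assumes a response containing a single numeric `tabId`, as in the `newSession` output shown above, and params that form a flat JSON object ending in `}`. A JSON tool such as `jq` would be more robust where available.

```shell
BROWSER_CMD="${BROWSER_CMD:-browser}"

# Pull a numeric tabId out of a JSON response read from stdin.
# Assumes the response contains one tabId, as newSession returns.
extract_tab_id() {
  sed -n 's/.*"tabId": *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Add "tabId": N to a flat JSON params object such as '{"selector": "#btn"}'.
with_tab_id() (
  tab_id="$1"
  params="$2"
  case "$params" in
    ''|'{}') printf '{"tabId": %s}\n' "$tab_id" ;;
    # Drop the closing brace, then append the tabId member.
    *)       printf '%s, "tabId": %s}\n' "${params%?}" "$tab_id" ;;
  esac
)

# Run an action pinned to one tab: in_tab <tabId> <action> [json_params]
in_tab() (
  tab_id="$1"
  action="$2"
  params="${3:-}"
  "$BROWSER_CMD" "$action" "$(with_tab_id "$tab_id" "$params")"
)
```

A task could capture its id with `tab="$(browser newSession '{"url": "https://example.com"}' | extract_tab_id)"` and then call `in_tab "$tab" click '{"selector": "#btn"}'` for each step.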

Tips

Start with listTabs to see what's open
Use newSession for a clean start
Use tabId for parallel/isolated execution
Use annotated format - shows content + clickable elements together
Use selectors from annotated output - more reliable than text matching
Fork when uncertain - try multiple paths, kill the wrong ones

Troubleshooting

Firefox not running? Start it: nohup firefox &>/dev/null &
Check connection: browser ping
Connection refused? The extension may need to be reloaded in about:debugging
Element not found? Use browser getContent '{"format": "annotated"}' to see what's on the page

Source

git clone https://github.com/aiskillstore/marketplace/blob/main/skills/1jehuang/firefox-browser/SKILL.md

Overview

Firefox Browser Agent Bridge lets you drive the user's real Firefox session via WebSocket, preserving logins and cookies. It enables browsing as the user, interacting with authenticated pages, filling forms, clicking elements, taking screenshots, and extracting page content.

How This Skill Works

The agent communicates with the local Firefox process over WebSocket, using the live browser rather than a headless instance. It exposes commands like listTabs, newSession, navigate, getContent, click, type, fillForm, and screenshot to operate the user session while preserving existing logins and cookies.

When to Use It

Browse websites exactly as the user to access authenticated pages without signing in again
Interact with forms, buttons, and dynamic UI on live pages in the user's session
Fill multi-field forms and submit them using fillForm with a fields array
Capture screenshots or extract page content from the user's active session
Manage and coordinate multiple tabs or forks to test or automate flows across real pages

Quick Start

Step 1: Ensure Firefox is running, or start it with a command like nohup firefox &>/dev/null &
Step 2: Connect and inspect the session with browser ping and browser listTabs '{}'
Step 3: Start a new session to a target URL and fetch content, e.g. browser newSession '{"url": "https://example.com"}' and then browser getContent '{"format": "annotated"}'

Best Practices

Ensure Firefox is running before issuing commands
Use newSession to start work in a clean context while keeping the user's session intact
Prefer getContent with annotated format for structured element visibility
Use waitFor to handle dynamic content and asynchronous page updates
Leverage getAuthContext and configureAuth for handling login flows and permissions

Example Use Cases

Open a product page in the user's session and take a screenshot for a QA report
Fill a support form across multiple fields and submit using the fillForm workflow
Navigate to a protected account page and verify content without logging out
List current tabs, switch to a relevant one, and extract page HTML for data scraping
Run parallel branches to compare page layouts across different tabs in the same session
