
freecrawl

A production-ready MCP server for web scraping and document processing. A drop-in replacement for FireCrawl.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio dylan-gluck-freecrawl-mcp \
  --env FREECRAWL_CACHE=true \
  --env FREECRAWL_API_KEYS=key1,key2 \
  --env FREECRAWL_HEADLESS=true \
  --env FREECRAWL_CACHE_DIR=/tmp/freecrawl-cache \
  --env FREECRAWL_CACHE_TTL=3600 \
  --env FREECRAWL_LOG_LEVEL=INFO \
  --env FREECRAWL_ROTATE_UA=true \
  --env FREECRAWL_TRANSPORT=stdio \
  --env FREECRAWL_CACHE_SIZE=104857600 \
  --env FREECRAWL_RATE_LIMIT=60 \
  --env FREECRAWL_ANTI_DETECT=true \
  --env FREECRAWL_MAX_BROWSERS=3 \
  --env FREECRAWL_MAX_CONCURRENT=10 \
  --env FREECRAWL_MAX_PER_DOMAIN=2 \
  --env FREECRAWL_BLOCKED_DOMAINS=example.com \
  --env FREECRAWL_REQUIRE_API_KEY=false \
  -- uvx freecrawl-mcp

All environment variables are optional; the values above are illustrative examples. Omit any flags you don't need.

Environment variables:

  • FREECRAWL_CACHE: enable caching (true/false).
  • FREECRAWL_API_KEYS: comma-separated list of API keys.
  • FREECRAWL_HEADLESS: run browsers in headless mode (true/false).
  • FREECRAWL_CACHE_DIR: directory for cache storage.
  • FREECRAWL_CACHE_TTL: cache time-to-live in seconds.
  • FREECRAWL_LOG_LEVEL: logging level (DEBUG, INFO, WARN, ERROR).
  • FREECRAWL_ROTATE_UA: rotate User-Agent strings (true/false).
  • FREECRAWL_TRANSPORT: transport method for MCP (stdio or http; default stdio).
  • FREECRAWL_CACHE_SIZE: cache size limit in bytes.
  • FREECRAWL_RATE_LIMIT: requests per minute allowed per domain.
  • FREECRAWL_ANTI_DETECT: enable anti-detection features (true/false).
  • FREECRAWL_MAX_BROWSERS: maximum concurrent browser instances (default 3).
  • FREECRAWL_MAX_CONCURRENT: overall maximum concurrent scrape requests.
  • FREECRAWL_MAX_PER_DOMAIN: maximum concurrent requests per domain.
  • FREECRAWL_BLOCKED_DOMAINS: comma-separated list of blocked domains.
  • FREECRAWL_REQUIRE_API_KEY: require an API key for access (true/false).
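Once added, Claude Code records the server in its MCP configuration. A sketch of what the resulting entry may look like (exact file and field names vary by Claude Code version; the env values are illustrative):

```json
{
  "mcpServers": {
    "dylan-gluck-freecrawl-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["freecrawl-mcp"],
      "env": {
        "FREECRAWL_CACHE": "true",
        "FREECRAWL_LOG_LEVEL": "INFO"
      }
    }
  }
}
```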

How to use

FreeCrawl is an MCP server focused on JavaScript-enabled web scraping and document processing. It runs as a self-contained service managed by uvx and exposes the following MCP tools:

  • freecrawl_scrape: scrape a single URL; output formats include markdown, html, text, screenshot, and structured.
  • freecrawl_batch_scrape: scrape large lists of URLs with concurrency control.
  • freecrawl_extract: extract structured data from pages against a defined schema.
  • freecrawl_process_document: process documents (PDF/DOCX) with optional OCR and table extraction.
  • freecrawl_health_check: report server status and metrics.

Typical workflows include scraping content with anti-detection features enabled, caching results for repeated access, and configuring per-domain rate limits to minimize blocking.
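Over MCP, a client invokes these tools with a standard JSON-RPC `tools/call` request. A sketch of what a freecrawl_scrape call might look like; the argument names (`url`, `formats`) are assumptions inferred from the tool descriptions, not a confirmed schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "freecrawl_scrape",
    "arguments": {
      "url": "https://example.com",
      "formats": ["markdown"]
    }
  }
}
```

In practice Claude Code issues these requests for you; the shape is only relevant if you drive the server from your own MCP client.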

How to install

Prerequisites:

  • git
  • internet access to install dependencies
  • uv/uvx (the Python packaging tool used to run the server)

  1. Clone the repository:
     git clone https://github.com/dylan-gluck/freecrawl-mcp.git
     cd freecrawl-mcp
  2. Install and set up dependencies with uvx (recommended):
     # Sync dependencies and install browser dependencies on first run
     uvx freecrawl-mcp --install-browsers
  3. Run the tests (optional, for local validation):
     uv run freecrawl-mcp --test
  4. Run the server:
     uv run freecrawl-mcp
  5. Optional: run via uvx persistently (for production deployments):
     # Ensure browsers are installed and the MCP server is ready
     uvx freecrawl-mcp

Note: The FreeCrawl MCP server is designed to be run with uvx for automatic dependency management and browser setup. If you prefer a local development workflow, you can clone the repository, install dependencies manually, and run the MCP with uv run as shown above.

Additional notes

Tips and common considerations:

  • Enable caching to improve repeat fetch performance and reduce load on target sites.
  • Tune FREECRAWL_MAX_BROWSERS and FREECRAWL_MAX_CONCURRENT based on target site behavior and your hosting capacity.
  • Use FREECRAWL_RATE_LIMIT to avoid triggering anti-scraping defenses; adjust per-domain rate limits as needed.
  • If target sites detect and block your scraper, ensure FREECRAWL_ANTI_DETECT=true and FREECRAWL_ROTATE_UA=true are set.
  • For API security, consider enabling FREECRAWL_REQUIRE_API_KEY and managing keys via FREECRAWL_API_KEYS.
  • freecrawl_scrape supports multiple output formats (markdown, html, text, screenshot, structured); freecrawl_batch_scrape adds concurrency control for large URL lists.
  • When integrating with Claude Code, treat freecrawl_scrape as the primary scraping tool.
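FREECRAWL_RATE_LIMIT caps requests per minute per domain. The idea can be sketched as a sliding-window limiter; this illustrates the concept only and is not FreeCrawl's actual implementation (the class and method names below are invented for the example):

```python
import time
from collections import defaultdict, deque


class DomainRateLimiter:
    """Allow at most `per_minute` requests per domain in any 60 s window.

    Conceptual sketch of what a per-domain rate limit does; not taken
    from the FreeCrawl codebase.
    """

    def __init__(self, per_minute, clock=time.monotonic):
        self.per_minute = per_minute
        self.clock = clock
        # domain -> timestamps of requests made in the current window
        self.hits = defaultdict(deque)

    def allow(self, domain):
        now = self.clock()
        window = self.hits[domain]
        # Evict timestamps that fell out of the 60-second window.
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) < self.per_minute:
            window.append(now)
            return True  # request may proceed
        return False     # caller should back off or queue
```

A scraper would call `allow(domain)` before each fetch and sleep or requeue the URL when it returns False; limits apply independently per domain, so a slow site never starves the others.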
