crawl4ai
An efficient internet search and content-fetching MCP Server for LLMs in local development, saving you tokens.
claude mcp add --transport stdio weidwonder-crawl4ai-mcp-server python -m crawl4ai_mcp_server
How to use
Crawl4AI MCP Server provides powerful web search and content extraction capabilities tuned for LLM workflows. It exposes a search tool that can query multiple engines (DuckDuckGo by default, with optional Google integration via API keys) and a read_url tool that fetches web pages and returns optimized content in formats suitable for LLM processing, including markdown with citations, simplified markdown, raw HTML to markdown, and references sections. Use these tools to gather diverse sources and convert them into concise, structured outputs that preserve traceability to original URLs.
To use the server, first start it in a Python environment as described in the installation steps. Then call the search tool with a query, optionally specifying the engine and number of results. You can request multiple engines simultaneously for broader coverage. For read_url, provide a URL and choose the preferred format for output (for example, markdown_with_citations to maintain inline references). The server will filter out non-essential content, preserve the article body and key points, and keep URL references so your AI assistant can trace back to sources. These tools together enable robust web information retrieval and content transformation tailored for LLM consumption.
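As a rough sketch of what the two tool calls described above look like, the payloads below use argument names (query, engine, num_results, url, format) inferred from the description; the README's tool schema is authoritative.

```python
# Hypothetical tool-call payloads for the search and read_url tools.
# Field names are assumptions based on the description above, not a
# verified schema; consult the server's README for the exact shapes.

search_call = {
    "tool": "search",
    "arguments": {
        "query": "vector database benchmarks",
        "engine": "duckduckgo",  # default engine; "google" requires API keys
        "num_results": 5,        # cap results to manage tokens and latency
    },
}

read_call = {
    "tool": "read_url",
    "arguments": {
        "url": "https://example.com/article",
        # markdown_with_citations keeps inline references for traceability
        "format": "markdown_with_citations",
    },
}
```

Your MCP client would send these as standard tool invocations over stdio after the server is started.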
How to install
Prerequisites:
- Python 3.9 or newer
- Optional virtual environment tool (e.g., venv)
Installation steps:
- Clone the repository:
git clone https://github.com/weidwonder/crawl4ai-mcp-server.git
cd crawl4ai-mcp-server
- Create and activate a virtual environment:
python -m venv crawl4ai_env
# Linux/macOS
source crawl4ai_env/bin/activate
# Windows
.\crawl4ai_env\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Install Playwright browsers (for web automation if used by the server):
playwright install
- Run the MCP server (development):
python -m crawl4ai_mcp_server
- Optional: If you plan to integrate via Smithery (Claude desktop or other clients) see installation via Smithery in the README:
npx -y @smithery/cli install @weidwonder/crawl4ai-mcp-server --client claude
Notes:
- Ensure you have API keys configured for Google search if you intend to use the Google engine (see config.json example in the README).
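A config.json along these lines would hold the Google credentials; the field layout here is an assumption based on the api key and cse_id fields mentioned in the README, which remains the authoritative example.

```json
{
  "google": {
    "api_key": "YOUR_GOOGLE_API_KEY",
    "cse_id": "YOUR_CUSTOM_SEARCH_ENGINE_ID"
  }
}
```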
- The repository uses a FastMCP-based asynchronous design for performance.
Additional notes
Tips and caveats:
- If you enable Google search, populate the API key and Custom Search Engine (cse_id) in config.json as shown in README examples.
- Keep Python virtual environments isolated to avoid dependency conflicts.
- The default content format for output is markdown_with_citations to maximize traceability; you can switch formats in read_url calls.
- For long-running searches, consider paging results with num_results to manage token usage and latency.
- If encountering environment issues with Playwright, ensure all browser dependencies are installed and compatible with your OS.
- The project uses FastMCP for high-performance asynchronous operations; monitor resource usage under heavy concurrent requests.
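The asynchronous fan-out that makes FastMCP-based servers fast can be illustrated with a minimal stdlib-only sketch; fetch_markdown below is a stand-in stub, not the project's real fetcher.

```python
import asyncio

# Illustrative sketch of the async fan-out pattern an asynchronous MCP
# server relies on. fetch_markdown is a hypothetical stub standing in
# for real network I/O.
async def fetch_markdown(url: str) -> str:
    await asyncio.sleep(0)  # placeholder for an actual HTTP fetch
    return f"# content of {url}"

async def read_many(urls: list[str]) -> list[str]:
    # Launch all fetches concurrently; gather preserves input order.
    return await asyncio.gather(*(fetch_markdown(u) for u in urls))

results = asyncio.run(read_many(["https://a.example", "https://b.example"]))
```

Because requests run concurrently rather than sequentially, total latency tracks the slowest fetch instead of the sum of all fetches, which is why resource usage should still be monitored under heavy concurrent load.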