crawl
A complete WeChat article crawling MCP server: an intelligent web crawling tool based on the Model Context Protocol (MCP), designed for Cursor IDE and AI tools.
claude mcp add --transport stdio wutongci-crawl-mcp npx -y crawl-mcp-server@1.1.0 \
  --env NODE_ENV="production"
How to use
Crawl-MCP is an MCP server that crawls WeChat articles and downloads their images locally. It supports an instruction-driven mode and an automated mode, for single-article or batch crawling, and handles WeChat image domains, request headers, and retries. The server exposes MCP tools including crawl_wechat_article (single article), crawl_wechat_batch (multiple articles), and crawl_get_status (session state). Invoke them from any MCP-compatible client, such as Cursor IDE, with the required parameters. In Cursor, enable the server with an npx command, then issue either a guided instruction prompt or an automated request to download the content and produce localized Markdown/HTML output together with the downloaded images.
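As a rough illustration of what a client sends when it invokes one of these tools, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0). The tool names and `outputFormat` come from this page. The argument names `url`, `urls`, and `sessionId`, and the example values, are assumptions made for illustration, so check the server's own tool schema before relying on them.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request for the named tool with the given arguments.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// NOTE: `url`, `urls`, and `sessionId` are hypothetical argument names,
// and the URLs and session id are placeholders.
const single = buildToolCall("crawl_wechat_article", {
  url: "https://mp.weixin.qq.com/s/EXAMPLE",
  outputFormat: "markdown",
});

const batch = buildToolCall("crawl_wechat_batch", {
  urls: ["https://mp.weixin.qq.com/s/EXAMPLE-1", "https://mp.weixin.qq.com/s/EXAMPLE-2"],
  outputFormat: "json",
});

const status = buildToolCall("crawl_get_status", { sessionId: "example-session" });

console.log(JSON.stringify(single, null, 2));
```

In practice an MCP client such as Cursor constructs these messages for you, so the sketch is mainly useful for understanding or debugging the traffic.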
Typical workflows:
- Single article: provide the WeChat article URL, plus an optional output format (markdown/json/html) and download strategy.
- Batch articles: provide an array of URLs, plus an optional output format and concurrency.
- Status: query a session to monitor progress or retrieve results.
The server prioritizes image downloads, using correct headers, domain recognition (mmbiz.qpic.cn), and controlled concurrency (up to 3 concurrent image downloads) so that content and assets are localized reliably.
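The "up to 3 concurrent image downloads" limit can be pictured with a small worker-pool helper. This is a minimal sketch of the general technique, not the server's actual implementation. `mapWithConcurrency` and `fakeDownload` are hypothetical names, and the download is simulated with a timer rather than a real request.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the input order. A rejected worker rejects the whole call.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, before any await
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Simulated download that records how many calls overlap.
let active = 0;
let peak = 0;
async function fakeDownload(url: string): Promise<string> {
  active++;
  peak = Math.max(peak, active);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  active--;
  return `saved:${url}`;
}

// Placeholder image URLs on the WeChat image domain.
const imageUrls = Array.from({ length: 7 }, (_, i) => `https://mmbiz.qpic.cn/example-${i}`);

const done = mapWithConcurrency(imageUrls, 3, fakeDownload).then((saved) => {
  console.log(saved.length, "images processed, peak concurrency", peak);
  return saved;
});
```

A fixed pool of runners pulling from a shared index keeps the limit exact without needing a queue library.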
How to install
Prerequisites:
- Node.js 18+ installed on your machine
- npm (comes with Node.js) or pnpm if you prefer
Installation steps:
- Install or run via npx (recommended for quick start):
    # Run the MCP server directly without local installation
    npx crawl-mcp-server@1.1.0
- Global install (optional):
    npm install -g crawl-mcp-server@1.1.0
    crawl-mcp-server
- Local project install (optional):
    npm install crawl-mcp-server@1.1.0
    npx crawl-mcp-server
Verify installation by running the server and ensuring the MCP endpoints become available. For Cursor integration, you will point the MCP client at the installed server and use the provided tools to perform crawling and image download.
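For the Cursor integration, one plausible setup is an entry in Cursor's MCP configuration file (commonly `.cursor/mcp.json`). The sketch below generates such an entry. The server key `crawl-mcp` is an assumed name chosen for illustration, while the command, version, and `NODE_ENV` value are taken from this page.

```typescript
// Minimal types for an MCP client configuration file.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

const config: McpConfig = {
  mcpServers: {
    // "crawl-mcp" is a hypothetical key; any unique name should work.
    "crawl-mcp": {
      command: "npx",
      args: ["-y", "crawl-mcp-server@1.1.0"],
      env: { NODE_ENV: "production" },
    },
  },
};

// Print the JSON to paste into the MCP configuration file.
const configJson = JSON.stringify(config, null, 2);
console.log(configJson);
```

After saving the generated JSON and restarting the client, the crawl tools should appear in its tool list if the server started correctly.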
Additional notes
Tips and notes:
- Prefer using the specific version crawl-mcp-server@1.1.0 to ensure you have all image download capabilities and the latest fixes.
- In MCP config (Cursor), set NODE_ENV to production to optimize behavior for automated tasks.
- The server uses WeChat image domains (e.g., mmbiz.qpic.cn) and proper HTTP headers (Referer, User-Agent) to ensure successful image downloads.
- The tool supports different output formats for article results (markdown, json, html); adjust outputFormat accordingly when using crawl_wechat_article or crawl_wechat_batch.
- If you encounter rate limits or download failures, leverage the retry mechanism and consider adjusting strategy (fast, basic, conservative) for stability.
- This server is designed to work with Node.js 18+ and uses the built-in fetch API; no additional HTTP libraries are required.
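The notes on headers, domain recognition, and retries can be sketched as follows. This is an illustration under stated assumptions, not the server's code. The retry counts and delays per strategy, the header values, and the names `isWeChatImage`, `downloadImage`, and `FetchLike` are all hypothetical. Only the strategy names, the `mmbiz.qpic.cn` domain, and the use of Referer and User-Agent headers come from this page.

```typescript
type Strategy = "fast" | "basic" | "conservative";

// Hypothetical retry settings; the server's real values are not documented here.
const RETRY_SETTINGS: Record<Strategy, { retries: number; delayMs: number }> = {
  fast: { retries: 1, delayMs: 100 },
  basic: { retries: 3, delayMs: 500 },
  conservative: { retries: 5, delayMs: 2000 },
};

// Minimal subset of the fetch API used here, so a fake can be injected in tests.
interface MinimalResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<MinimalResponse>;

// Recognize the WeChat image domain named in the notes (others may exist).
function isWeChatImage(url: string): boolean {
  try {
    return new URL(url).hostname === "mmbiz.qpic.cn";
  } catch {
    return false; // not a parseable URL
  }
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Download one image, retrying on HTTP errors and network failures.
async function downloadImage(
  fetchFn: FetchLike,
  url: string,
  strategy: Strategy = "basic",
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<ArrayBuffer> {
  const { retries, delayMs } = RETRY_SETTINGS[strategy];
  // Assumed header values; WeChat image hosts typically check the Referer.
  const headers: Record<string, string> = isWeChatImage(url)
    ? { Referer: "https://mp.weixin.qq.com/", "User-Agent": "Mozilla/5.0" }
    : {};

  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const res = await fetchFn(url, { headers });
      if (res.ok) return await res.arrayBuffer();
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err;
    }
    if (attempt < retries) await sleep(delayMs);
  }
  throw new Error(`Failed to download ${url}: ${String(lastError)}`);
}
```

On Node.js 18+, the built-in `fetch` can be passed as `fetchFn`, for example `downloadImage(fetch, imageUrl, "conservative")`. Taking the fetch function as a parameter keeps the retry logic testable without network access.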