AI-Cursor-Scraping-Assistant

A powerful tool that leverages Cursor AI and MCP (Model Context Protocol) to easily generate web scrapers for various types of websites.

Installation
Run this command in your terminal to add the MCP server to Claude Code.
claude mcp add --transport stdio \
  --env CAMOUFOX_FILE_PATH="path to Camoufox_template.py" \
  thewebscrapingclub-ai-cursor-scraping-assistant -- python xpath_server.py

How to use

AI-Cursor-Scraping-Assistant is a Python-based MCP server that lets Cursor AI generate web scrapers automatically. It combines Cursor Rules and MCP tools to analyze websites, detect their structure, and produce Scrapy or Camoufox scrapers with minimal user input. The server exposes an XPath selector generator and an anti-bot analysis workflow, so Cursor can fetch page content, identify JSON data, and create scraper templates tailored to PLP (Product Listing Page) and PDP (Product Detail Page) patterns. When anti-bot protections are present, you can opt for Camoufox for stealth scraping.

To use it, make sure the MCP server is running and connected to Cursor, then prompt Cursor to generate an e-commerce scraper, for example "Write an e-commerce PDP scraper for nike.com". Cursor will guide you through analysis, selector extraction, and code generation.
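To make the "identify JSON data" step more concrete, here is a minimal sketch of one way to pull schema.org JSON-LD blocks out of a page using only the Python standard library. It is an illustration, not the server's actual implementation: the class name JsonLdExtractor, the helper extract_json_ld, and the sample HTML are all hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags.

    Hypothetical helper for illustration; not part of the MCP server.
    """

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script bodies arrive here as raw text while we are inside the tag.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD blocks


def extract_json_ld(html):
    """Return a list of decoded JSON-LD payloads found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.items


# Made-up sample page resembling a PDP with product markup.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Example Shoe", "offers": {"price": "99.00"}}
</script>
</head><body><h1>Example Shoe</h1></body></html>
"""

if __name__ == "__main__":
    for item in extract_json_ld(SAMPLE_HTML):
        if isinstance(item, dict):
            print(item.get("@type"), item.get("name"))
```

When a page carries structured data like this, a scraper can often read product fields directly from it and fall back to XPath selectors only for what is missing.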

How to install

Prerequisites:

  • Python 3.10+
  • Cursor AI installed
  • Basic knowledge of web scraping concepts

  1. Clone the repository and install dependencies:

     git clone https://github.com/TheWebScrapingClub/AI-Cursor-Scraping-Assistant.git
     cd AI-Cursor-Scraping-Assistant

     # Install MCP tooling and required packages
     pip install mcp camoufox scrapy

  2. Optional: set up the Camoufox browser binary (if you plan to use Camoufox):

     python -m camoufox fetch

  3. Start the MCP server:

     cd MCPfiles
     python xpath_server.py

  4. In Cursor, configure the MCP server (usually via the MCP panel) to point at the running server. You should see the server name AI-Cursor-Scraping-Assistant in the MCP registry.

Note: If you adjust Camoufox paths, ensure CAMOUFOX_FILE_PATH is updated in the MCP configuration.
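As a rough illustration of the Cursor configuration step, the sketch below builds an MCP server entry in the mcpServers layout that Cursor reads from its mcp.json file. The function name build_cursor_mcp_config and both paths are placeholders, not values taken from the repository; substitute the real locations on your machine.

```python
import json
from pathlib import Path


def build_cursor_mcp_config(repo_dir, template_path,
                            server_name="AI-Cursor-Scraping-Assistant"):
    """Return a dict shaped like a Cursor mcp.json file.

    repo_dir and template_path are placeholders supplied by the caller.
    """
    server_script = Path(repo_dir) / "MCPfiles" / "xpath_server.py"
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": [str(server_script)],
                # Keep this in sync if the Camoufox template moves.
                "env": {"CAMOUFOX_FILE_PATH": str(template_path)},
            }
        }
    }


if __name__ == "__main__":
    config = build_cursor_mcp_config(
        repo_dir="/path/to/AI-Cursor-Scraping-Assistant",  # placeholder
        template_path="/path/to/Camoufox_template.py",     # placeholder
    )
    # Paste the printed JSON into your Cursor MCP configuration.
    print(json.dumps(config, indent=2))
```

Using the full path to xpath_server.py in args avoids depending on the directory Cursor launches the command from.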

Additional notes

Tips and considerations:

  • The server relies on Camoufox for stealth scrapers. If you plan to bypass anti-bot measures, fetch the browser binary with python -m camoufox fetch and make sure network access to the target sites is permitted.
  • Ensure CAMOUFOX_FILE_PATH points to a valid Camoufox_template.py before starting the MCP server.
  • The MCP workflow includes multiple MDC rule sets (prerequisites, website-analysis, scrapy, scraper-models) that guide Cursor in analysis and scraper generation.
  • If you encounter anti-bot blocks, use the advanced rules to analyze cookies, JSON data, and schema.org markup as indicated in the Cursor rules documentation.
  • The configuration refers to xpath_server.py by a relative path, so run it from the MCPfiles directory to match the intended setup.
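The CAMOUFOX_FILE_PATH tip above can be checked before launching the server. The following is a small preflight sketch under the assumption that the variable should name an existing .py file; check_camoufox_template is a hypothetical helper and not something the repository ships.

```python
import os
from pathlib import Path


def check_camoufox_template(env=None):
    """Return (ok, message) describing whether CAMOUFOX_FILE_PATH looks usable.

    env defaults to os.environ; pass a dict to test other values.
    """
    env = os.environ if env is None else env
    raw = env.get("CAMOUFOX_FILE_PATH")
    if not raw:
        return False, "CAMOUFOX_FILE_PATH is not set"
    path = Path(raw).expanduser()
    if not path.is_file():
        return False, f"CAMOUFOX_FILE_PATH does not point to a file: {path}"
    if path.suffix != ".py":
        return False, f"CAMOUFOX_FILE_PATH is not a Python file: {path}"
    return True, f"Using Camoufox template at {path}"


if __name__ == "__main__":
    ok, message = check_camoufox_template()
    print(message)
    # A non-zero exit code lets a wrapper script stop before starting the server.
    raise SystemExit(0 if ok else 1)
```

Running this from the same shell that starts xpath_server.py confirms the server will see the same environment value.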
