scrap

An MCP (Model Context Protocol) server that can scrape web pages and extract content using CSS selectors. Built with deno-dom for fast HTML parsing.

Installation
Run this command in your terminal to add the MCP server to Claude Code.
claude mcp add --transport stdio sigmasd-scrap-mcp -- npx -y @sigma/scrap-mcp

How to use

This MCP server provides a focused web scraping capability. It fetches publicly accessible web pages and extracts content using CSS selectors, returning only the text content of matched elements. The main tool is scrape_page, which takes a URL and a CSS selector to locate elements. Use it when you want to pull specific pieces of information from pages (for example, headings, paragraphs, or links) without loading the entire page content into context.

The server handles common errors (network issues, HTTP errors, parsing problems, and invalid selectors) and returns readable error messages over MCP, making it suitable for integration into larger LLM workflows that need structured, targeted data extraction. Typical usage involves simple selectors like h1, p, or a, or more complex selectors such as .article-content p or nav a, to collect exactly the content you need.
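To make "returning only the text content of matched elements" concrete, here is a minimal sketch of the idea in Python. It is not the server's implementation (which the description says is built on deno-dom): it uses the standard library's html.parser as a stand-in and supports only bare tag-name selectors such as h1 or p, not full CSS. The names TagTextExtractor and extract_text are hypothetical.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collects the text content of every element with a given tag name.

    Illustrative stand-in only: handles bare tag names, not full CSS selectors.
    """

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()
        self.results = []  # text of each matched element, in document order
        self._open = []    # indexes into results for matched elements still open

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.results.append("")
            self._open.append(len(self.results) - 1)

    def handle_endtag(self, tag):
        if tag == self.tag and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every matched element that currently encloses it.
        for index in self._open:
            self.results[index] += data


def extract_text(html, tag):
    """Return whitespace-normalized text for each element named `tag`."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return [" ".join(text.split()) for text in parser.results]


if __name__ == "__main__":
    page = "<h1>Title</h1><p>First <b>bold</b> para</p><p>Second</p>"
    for text in extract_text(page, "p"):
        print(text)
```

Markup inside a matched element (the b tag above) is dropped and only its text survives, which is the behaviour the description attributes to scrape_page.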

To use the tool, call scrape_page with the URL and a CSS selector. The response lists how many elements matched and provides the text content for each element, in order. This enables simple pipelines: fetch the page, apply selectors to pull the exact data you want, and feed the resulting text into your LLM or downstream processor.
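As a rough illustration of such a call, the sketch below builds the JSON-RPC message an MCP client would send to invoke a tool. The tools/call method and the name/arguments layout come from the MCP specification; the argument keys url and selector are assumptions inferred from the description above, so check the tool's published input schema before relying on them. The helper build_scrape_request is hypothetical.

```python
import json


def build_scrape_request(url, selector, request_id=1):
    """Build an MCP tools/call request for scrape_page.

    The argument keys "url" and "selector" are assumed, not confirmed
    by the server's documentation.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "scrape_page",
            "arguments": {"url": url, "selector": selector},
        },
    }


if __name__ == "__main__":
    request = build_scrape_request("https://example.com", "h1")
    # Over the stdio transport, each message is sent as a single line of JSON.
    print(json.dumps(request))
```

In practice an MCP client library or host application builds and sends this message for you; the sketch only shows what travels over the wire.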

How to install

Prerequisites:

  • Node.js with npm (npx is bundled with npm)
  • Internet access for fetching pages

Installation steps:

  1. Ensure Node.js and npm are installed. Verify:

    • node -v
    • npm -v
  2. Use npx to run the MCP server directly (no global install required):

npx -y @sigma/scrap-mcp
  3. If you prefer a persistent install, you can install the package globally (optional):
npm install -g @sigma/scrap-mcp

Then start with:

npx @sigma/scrap-mcp
  4. In your MCP manager or orchestrator, reference the mcp_config snippet to connect to this server under the name you chose (e.g., scrap).
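The mcp_config snippet itself is not reproduced here. As a hedged sketch, many MCP clients accept a JSON entry with an mcpServers map whose values give a launch command and its arguments; the Python below generates an entry of that shape. Whether your client uses exactly this layout is an assumption, and the helper name mcp_config is hypothetical.

```python
import json


def mcp_config(server_name="scrap"):
    """Return a client config entry that launches the server through npx.

    The "mcpServers" layout is a common client convention, assumed here;
    consult your client's documentation for the exact format it expects.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "npx",
                "args": ["-y", "@sigma/scrap-mcp"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_config(), indent=2))
```

The printed JSON can be merged into the client's configuration file; the key under mcpServers is the name the server will appear under.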

Note: The server needs outbound network access to fetch web pages; make sure the environment it runs in permits it.

Additional notes

Tips and common issues:

  • Permissions: Ensure outbound network access is allowed (for example, that no firewall rule blocks it). The server relies on network access to fetch pages.
  • Dynamic content: Some pages render content via JavaScript. CSS selectors only match elements present in the HTML returned by the server, so text inserted later by scripts will not be found. If you get empty results, try alternative selectors or verify the page’s source HTML.
  • Selector accuracy: Complex selectors can fail if the page structure changes. Start with simple selectors (e.g., h1, p) and progressively refine.
  • Robots.txt and courtesy: Respect robots.txt and rate-limit requests to avoid overloading target sites.
  • Error handling: If you encounter network issues, HTTP errors, or parsing failures, those errors are surfaced as readable MCP messages; check the selector validity and page URL correctness.
  • Security: The server executes only CSS selectors and does not run arbitrary code from the page; outbound requests are sandboxed and limited to HTTP/HTTPS.
  • Versioning: The readme references dependencies like @modelcontextprotocol/sdk and deno-dom in the upstream project; ensure you’re using compatible versions in your environment if you integrate or extend the server.
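The robots.txt and rate-limiting tip can be handled on the caller's side before each scrape_page call. The sketch below is one possible approach using only the Python standard library; it is not part of the server. The helper names allowed_by_robots and Throttle are hypothetical, as is the user agent string in the example.

```python
import time
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, user_agent, url):
    """Check a URL against the text of a robots.txt file already in hand."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class Throttle:
    """Enforces a minimum interval, in seconds, between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    throttle = Throttle(min_interval=1.0)
    for url in ["https://example.com/", "https://example.com/private/page"]:
        if allowed_by_robots(rules, "my-scraper", url):
            throttle.wait()
            print("ok to scrape:", url)
        else:
            print("skipping (disallowed):", url)
```

Call throttle.wait() immediately before each scrape so that requests to the same site stay at least min_interval seconds apart.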
