scraper
🌐 Streamline web scraping with Scraper MCP, a server that optimizes content for AI by filtering data and reducing token usage for LLMs.
claude mcp add --transport stdio jessaminesimple608-scraper-mcp node server.js \
  --env PORT="3000" \
  --env LOG_LEVEL="info"
How to use
scraper-mcp simplifies web data extraction by running a lightweight scraper behind an MCP server. It retrieves only the content you target with CSS selectors, and it can convert scraped pages into Markdown for easier reading or export. The server exposes a simple interface for configuring target URLs, applying CSS filters to narrow the content, and retrieving structured results. Built-in export options produce clean, portable output suitable for reporting or further analysis.
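To give a feel for why Markdown output is smaller than raw HTML, here is a minimal sketch of an HTML-to-Markdown pass. It is not scraper-mcp's implementation: the function names `htmlToMarkdown` and `decodeEntities` are hypothetical, the conversion is regex-based and only copes with simple, well-formed markup, and it does not perform CSS-selector filtering at all.

```javascript
// Naive HTML-to-Markdown sketch (illustration only, not scraper-mcp's code).

// Decode the handful of entities that commonly appear in page text.
function decodeEntities(text) {
  return text
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

function htmlToMarkdown(html) {
  const markdown = html
    // Drop elements that carry no readable content.
    .replace(/<(script|style|nav|footer)\b[^>]*>[\s\S]*?<\/\1>/gi, '')
    // <h1>..<h6> become "#".."######" headings.
    .replace(/<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1>/gi,
      (_, level, text) => `\n\n${'#'.repeat(Number(level))} ${text.trim()}\n\n`)
    // <a href="url">text</a> becomes [text](url).
    .replace(/<a\b[^>]*\bhref="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, '[$2]($1)')
    // List items become bullets; paragraph ends and <br> become blank lines.
    .replace(/<li\b[^>]*>/gi, '\n- ')
    .replace(/<\/p>|<br\s*\/?>/gi, '\n\n')
    // Remove every remaining tag.
    .replace(/<[^>]+>/g, '');

  return decodeEntities(markdown)
    .replace(/[ \t]+/g, ' ')    // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, '\n')   // trim spaces around newlines
    .replace(/\n{3,}/g, '\n\n') // at most one blank line in a row
    .trim();
}

const sampleHtml =
  '<html><head><style>body{color:red}</style></head><body>' +
  '<nav><a href="/">Home</a></nav><h1>Title</h1>' +
  '<p>See <a href="https://example.com">the docs</a> &amp; more.</p>' +
  '<ul><li>One</li><li>Two</li></ul><script>track()</script></body></html>';

console.log(htmlToMarkdown(sampleHtml));
```

The navigation, style, and script content disappear entirely, which is where most of the size reduction comes from. A real scraper would normally rely on an HTML parser rather than regular expressions.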
Once running, you can connect to the scraper server through your MCP dashboard and supply the URLs you want to scrape. The tool will apply the configured CSS filters, fetch the page content, and return the extracted data in a structured format. If needed, you can adjust settings like the output format (Markdown or raw structured data) and the depth of the scrape to balance detail with performance.
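The trade-off between scrape depth and performance can be sketched as a depth-limited, breadth-first traversal. This assumes "depth" means the number of link hops from the starting URL, which the description above does not confirm. The function `crawlToDepth`, the `getLinks` callback, and the in-memory `site` map are all hypothetical and stand in for real page fetching.

```javascript
// Breadth-first crawl that stops following links after `maxDepth` hops.
// `getLinks` is a hypothetical callback returning the URLs found on a page;
// it is injected so the traversal can be tried without network access.
function crawlToDepth(startUrl, maxDepth, getLinks) {
  const visited = new Set([startUrl]);
  const order = [];
  let frontier = [startUrl];

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const url of frontier) {
      order.push({ url, depth });
      if (depth === maxDepth) continue; // deepest level: record, do not expand
      for (const link of getLinks(url)) {
        if (!visited.has(link)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return order;
}

// Tiny in-memory "site" standing in for real pages.
const site = {
  '/': ['/docs', '/blog'],
  '/docs': ['/docs/api', '/'],
  '/blog': ['/blog/post-1'],
  '/docs/api': [],
  '/blog/post-1': [],
};
const linksOf = (url) => site[url] || [];

console.log(crawlToDepth('/', 1, linksOf).map((page) => page.url));
```

With a depth of 1 the sketch visits `/`, `/docs`, and `/blog`; raising the depth to 2 adds the two leaf pages. Each extra level can multiply the number of pages fetched, which is why a small depth is usually the cheaper starting point.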
How to install
Prerequisites:
- Node.js installed on your machine (v12+ recommended)
- Internet access to install dependencies
- Basic understanding of command-line usage
Installation steps:
- Ensure Node.js is installed by running: node -v
- Create a project directory and navigate into it: mkdir scraper-mcp && cd scraper-mcp
- Initialize a new Node.js project (if not already provided in the distribution): npm init -y
- Install the scraper package or dependencies (if published as an npm package; otherwise unpack the provided distribution): npm install --save scraper-mcp
- If you have a prebuilt server file, ensure server.js exists in your project root. If you are using a package distribution, follow the included setup instructions to place the files appropriately.
- Start the server locally to verify it runs: node server.js
- Verify the server is listening on the configured port (default 3000) and accessible from your MCP interface.
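The final verification step can be sketched with a small TCP probe built on Node's built-in `net` module. The helper name `isPortOpen` is hypothetical, and the check only shows that something accepts connections on the port; it cannot confirm that the listener is the scraper server.

```javascript
const net = require('net');

// Resolve to true if a TCP connection to host:port succeeds within timeoutMs,
// and to false on refusal, error, or timeout.
function isPortOpen(port, host = '127.0.0.1', timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (open) => {
      socket.destroy(); // release the socket whatever the outcome
      resolve(open);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));
  });
}

// PORT mirrors the variable used in the install command; 3000 is the default
// mentioned in the steps above.
const port = Number(process.env.PORT) || 3000;
isPortOpen(port).then((open) => {
  console.log(open
    ? `Port ${port} is accepting connections`
    : `Nothing is listening on port ${port}`);
});
```

Run it in a second terminal while `node server.js` is running. If it reports that nothing is listening, check that the PORT value matches the one the server was started with.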
Additional notes
Tips:
- If you encounter network or SSL issues, ensure your environment allows outbound HTTP(S) requests to target websites.
- Adjust the CSS selectors carefully to maximize relevant data and minimize noise.
- When exporting to Markdown, verify that long text blocks are truncated or wrapped as needed for readability.
- Common environment variables to consider: PORT (port to run the server), LOG_LEVEL (info, warn, error), and RATE_LIMIT (requests per second).
- If you update the server, restart the MCP integration to pick up changes.
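As a rough illustration of how the environment variables listed above might be read and validated at startup, consider the sketch below. The function `loadConfig` is hypothetical, and the fallback values for LOG_LEVEL and RATE_LIMIT are assumptions chosen for the example; only the PORT default of 3000 comes from the installation steps.

```javascript
// Read PORT, LOG_LEVEL, and RATE_LIMIT into a validated settings object.
// Illustration only: the real server may parse these differently.
const LOG_LEVELS = ['info', 'warn', 'error'];

function loadConfig(env = process.env) {
  const port = Number(env.PORT || 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer from 1 to 65535, got "${env.PORT}"`);
  }

  const logLevel = (env.LOG_LEVEL || 'info').toLowerCase(); // assumed default
  if (!LOG_LEVELS.includes(logLevel)) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(', ')}, got "${env.LOG_LEVEL}"`);
  }

  const rateLimit = Number(env.RATE_LIMIT || 5); // requests per second, assumed default
  if (!Number.isFinite(rateLimit) || rateLimit <= 0) {
    throw new Error(`RATE_LIMIT must be a positive number, got "${env.RATE_LIMIT}"`);
  }

  return { port, logLevel, rateLimit };
}

console.log(loadConfig({ PORT: '3000', LOG_LEVEL: 'info' }));
```

Failing fast on a malformed value makes a typo in the MCP registration command show up at startup rather than as a confusing runtime error.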