
mcp-server-webcrawl

An MCP server for connecting web crawler data and archives to LLM workflows

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio pragmar-mcp-server-webcrawl python -m mcp_server_webcrawl

How to use

mcp-server-webcrawl provides an advanced, search-enabled interface for working with data crawled from the web. It exposes a fulltext search capability with boolean operators, and supports resource filtering by type, HTTP status, and other attributes. The server is designed to be driven by an LLM, giving it a ready-made prompt toolkit and routines for tasks such as SEO analysis, 404 auditing, performance reviews, and data extraction from multiple crawler backends. This makes it suitable for building knowledge bases from crawled content, running curated prompts, and performing guided queries against diverse crawl datasets.

To use it, install the package via pip and start the MCP server entry point. Once running, you can query the indexed crawl data using the built-in boolean search syntax, field-based filters, and content queries, and invoke the bundled prompts and routines (for example, SEO audits or performance analyses) directly from your LLM workflow. Because the server is compatible with a variety of crawlers and archive formats, the same searches and filters work across diverse crawl datasets.
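To make the search syntax concrete, here is a small helper that composes fulltext terms with the `type:` and `status:` field filters mentioned above. This is an illustrative sketch only: the helper function and the exact query grammar accepted by mcp-server-webcrawl are assumptions, so consult the project documentation for the authoritative syntax.

```python
# Hypothetical query-builder illustrating boolean search with field filters.
# The field names (type, status) come from this page; the grammar is assumed.

def build_query(terms, type_=None, status=None):
    """Combine fulltext terms and field filters into one boolean query string."""
    parts = [" AND ".join(terms)] if terms else []
    if type_ is not None:
        parts.append(f"type: {type_}")
    if status is not None:
        parts.append(f"status: {status}")
    return " AND ".join(parts)

# e.g. find HTML pages mentioning both words that returned HTTP 200
print(build_query(["privacy", "policy"], type_="html", status=200))
```

A query like this would be passed to the server's search tool by the LLM; the point is that filters narrow by metadata while terms match crawled content.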

How to install

Prerequisites:

  • Python 3.10 or newer
  • pip (Python package manager)
  • Internet access to install dependencies

Installation steps:

  1. Create and activate a Python virtual environment (recommended):

     python -m venv venv
     source venv/bin/activate    # macOS/Linux
     venv\Scripts\activate       # Windows

  2. Install the MCP server package from PyPI: pip install mcp-server-webcrawl

  3. Confirm installation and available entry point (example): python -m mcp_server_webcrawl --help

  4. Start the MCP server (as configured in mcp_config): python -m mcp_server_webcrawl

  5. Optional: integrate with your orchestration or compose with other MCP servers as needed.
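If your MCP client reads a JSON configuration file rather than using the `claude mcp add` command above, the equivalent stdio entry can be sketched as follows. The `mcpServers` layout shown is the common schema used by Claude clients; the server name mirrors the registration command on this page, and your client's actual config path and schema may differ.

```python
import json

# Mirror the `claude mcp add --transport stdio` invocation as a JSON entry.
config = {
    "mcpServers": {
        "pragmar-mcp-server-webcrawl": {
            "command": "python",
            "args": ["-m", "mcp_server_webcrawl"],
        }
    }
}

print(json.dumps(config, indent=2))
```

Paste the printed object into your client's MCP configuration (merging with any existing `mcpServers` entries) and restart the client to pick it up.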

Additional notes

  • The server requires Python 3.10+ and communicates over stdio, making it straightforward to integrate with your MCP CLI or orchestration tooling.
  • If you plan to run multiple crawlers or data sources, ensure your environment variables or configuration reflect the specific backends you intend to index.
  • Common environment variables may include paths or credentials for crawled data sources; consult the project docs for crawler-specific setup guides.
  • When building prompts or routines for the LLM, take advantage of the provided audit and analysis prompts (SEO Audit, 404 Audit, Performance Audit, etc.) to derive structured outputs from raw crawl data.
  • Monitor resource usage for large crawl indices, as fulltext search and field filtering can be memory-intensive depending on the dataset size.
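As a minimal sketch of what a routine like the 404 Audit produces, the snippet below filters crawl records down to broken resources. The record fields (`url`, `status`) are assumptions for illustration, not the server's actual result schema; in practice the LLM drives the server's own audit prompt and receives structured output directly.

```python
# Illustrative post-processing for a 404 audit over crawl records.
# Record shape is hypothetical; real results come from the server's search tool.

def audit_404(records):
    """Return the sorted URLs of crawled resources that returned HTTP 404."""
    return sorted(r["url"] for r in records if r["status"] == 404)

records = [
    {"url": "https://example.com/", "status": 200},
    {"url": "https://example.com/old-page", "status": 404},
    {"url": "https://example.com/about", "status": 200},
]
print(audit_404(records))
```

The same filter-then-summarize pattern underlies the other audit prompts (SEO, performance): narrow by metadata first, then have the LLM reason over the remaining records.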
