
MediaCrawler_MCP_Server

An MCP server for MediaCrawler, letting you drive MediaCrawler conveniently from MCP clients.

Installation
Run this command in your terminal to add the MCP server to Claude Code.
claude mcp add --transport stdio bowenwin-mediacrawler_mcp_server python path/to/MediaCrawler_MCP_Server/main.py \
  --env MYSQL_DB_PWD="your_password" \
  --env MYSQL_DB_HOST="localhost" \
  --env MYSQL_DB_NAME="your_database" \
  --env MYSQL_DB_PORT="3306" \
  --env MYSQL_DB_USER="your_username" \
  --env ENABLE_GET_COMMENTS="true" \
  --env MAX_CONCURRENCY_NUM="1" \
  --env CRAWLER_MAX_NOTES_COUNT="20"

How to use

This MCP server provides three tools that drive the MediaCrawler workflow:

  • crawl_search — start a keyword-based crawl for a given platform and store type.
  • crawl_detail — fetch details for a list of video/note IDs.
  • crawl_creator — collect a creator's published content by creator ID.

The server is intended to run in a Python environment managed by the uv tool, and it expects a running MySQL database to persist crawl results. Configure your MCP client to call the appropriate tool with the required parameters and environment variables, including the database connection details. Typical usage patterns include searching by keyword on a platform such as Bilibili, fetching details for a list of video IDs, or collecting a creator's content portfolio. The tools are exposed over standard input/output, so the server integrates naturally into Claude Desktop/CLI workflows.
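As a concrete illustration, the snippet below sketches how a client might assemble the arguments for a crawl_search call. The parameter names (platform, keywords, store_type), the comma-separated keyword format, and the platform codes are assumptions based on the tool descriptions above and on MediaCrawler's conventions — check the tool schema the server actually reports before relying on them.

```python
# Hypothetical helper for building a crawl_search payload.
# Platform codes below are assumed from MediaCrawler's conventions.
SUPPORTED_PLATFORMS = {"bili", "xhs", "dy", "ks", "wb"}

def build_crawl_search_args(platform: str, keywords: list[str], store_type: str = "db") -> dict:
    """Validate inputs and build the arguments dict for the crawl_search tool."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    if not keywords:
        raise ValueError("at least one keyword is required")
    return {
        "platform": platform,
        "keywords": ",".join(keywords),  # assumed comma-separated keyword format
        "store_type": store_type,
    }

# Example: a Bilibili keyword search stored in the database
args = build_crawl_search_args("bili", ["python tutorial"])
print(args)
```

Your MCP client would then pass a dict like this as the tool-call arguments; the validation step simply fails fast on obviously malformed input before a crawl is launched.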

How to install

Prerequisites:

  • Python 3.13 or newer installed on your system
  • Access to a MySQL database with credentials
  • The uv tool for Python environment management (used by the steps below)
  1. Clone the repository or prepare the project directory

  2. Install dependencies (in the project directory)

# If using a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install Python dependencies as specified by MediaCrawler_MCP_Server (adjust as needed)
pip install -r requirements.txt
  3. Prepare the MySQL database
  • Ensure the database exists and the credentials match the environment variables described in the README.
  • Create the tables if they do not exist (the project notes that initialization will not overwrite existing data).
  4. Install browser drivers for Playwright (required by the crawler)
uv run playwright install
  5. Run uv tooling to ensure environment parity (as described in the README)
uv sync
  6. Start the MCP server
# Example using the server entry point via Python
python path/to/MediaCrawler_MCP_Server/main.py
  7. Verify operation by invoking the MCP client with the configured settings (see "How to use" for details).
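Before starting the server, it can help to confirm that the database environment variables are all present. The sketch below is a minimal, hypothetical pre-flight check written against the MYSQL_DB_* variable names used in the installation command above; it is not part of the project itself.

```python
# Minimal pre-flight check for the MYSQL_DB_* environment variables the
# server expects; fails fast with a clear message if a required one is unset.
REQUIRED = ("MYSQL_DB_HOST", "MYSQL_DB_USER", "MYSQL_DB_PWD", "MYSQL_DB_NAME")

def load_db_config(env: dict) -> dict:
    """Collect database settings, applying the default MySQL port of 3306."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return {
        "host": env["MYSQL_DB_HOST"],
        "port": int(env.get("MYSQL_DB_PORT", "3306")),
        "user": env["MYSQL_DB_USER"],
        "password": env["MYSQL_DB_PWD"],
        "database": env["MYSQL_DB_NAME"],
    }

# Example with placeholder values; in practice pass os.environ
cfg = load_db_config({
    "MYSQL_DB_HOST": "localhost",
    "MYSQL_DB_USER": "your_username",
    "MYSQL_DB_PWD": "your_password",
    "MYSQL_DB_NAME": "your_database",
})
print(cfg["port"])  # falls back to the default 3306 when MYSQL_DB_PORT is unset
```

Running this once with `os.environ` before launching main.py surfaces credential typos immediately, rather than as a failed write deep inside a crawl.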

Additional notes

Tips and common issues:

  • Ensure MYSQL_DB_HOST/PORT/USER/NAME/PWD are correctly set in the environment; incorrect DB credentials will prevent the crawler from storing results.
  • Adjust CRAWLER_MAX_NOTES_COUNT and MAX_CONCURRENCY_NUM to control load; start with conservative values (e.g., 20 notes, 1 concurrent crawler).
  • ENABLE_GET_COMMENTS governs whether comments are crawled; set to true if you need engagement data.
  • The README notes that if the MySQL tables already exist, initialization will not overwrite existing data. Ensure proper backups before reinitializing.
  • When running via UV, use the provided main entry point (main.py) and ensure the path in the mcp_config matches your project structure.
  • If you encounter environment-related issues, verify Python 3.13 compatibility as mentioned in the README.
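The load-tuning advice above can be captured as a small sanity check. The bounds in this sketch (concurrency of 1, 20 notes) come from the conservative starting values suggested in the tips; treating anything beyond them as a warning is an assumption for illustration, not a limit imposed by the project.

```python
# Sanity check for the two load-related settings, using the conservative
# starting values from the tips above as (assumed) warning thresholds.
def sanity_check_tuning(max_concurrency: int, max_notes: int) -> list:
    """Return warnings for settings that may overload the target platform."""
    if max_concurrency < 1:
        raise ValueError("MAX_CONCURRENCY_NUM must be at least 1")
    if max_notes < 1:
        raise ValueError("CRAWLER_MAX_NOTES_COUNT must be at least 1")
    warnings = []
    if max_concurrency > 1:
        warnings.append("MAX_CONCURRENCY_NUM > 1: start with 1 and raise gradually")
    if max_notes > 20:
        warnings.append("CRAWLER_MAX_NOTES_COUNT > 20: large crawls may hit rate limits")
    return warnings

print(sanity_check_tuning(1, 20))  # the conservative defaults pass with no warnings
```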
