mcp_pdf_processor
MCP PDF Processor: fetches PDFs, processes them to llm.txt, and loads the llm.txt into your AI
claude mcp add --transport stdio michaellevinson-mcp_pdf_processor python pdf_tool_server.py \
  --env OUTPUT_DIR="Directory to store processed PDFs (default: llm_output)" \
  --env PYTHONPATH="Set to the directory containing the mcp_pdf_processor package"
How to use
The PDF Processor MCP server enables Claude to fetch PDFs from URLs, extract text, and identify LaTeX equations for downstream analysis. It exposes MCP-driven commands under the PDF_TOOLS namespace that let you fetch a PDF, process it (with optional LaTeX extraction), and then read the processed content. Typical use cases include retrieving a document, extracting mathematical expressions in LaTeX form, and summarizing or analyzing the content for further insights. When registered with Claude, you can invoke these tools directly in conversations to perform end-to-end PDF processing without leaving the chat.
How to install
Prerequisites:
- Python 3.9 or higher
- pip (Python package manager)
- Optional: MCP CLI tools if you plan to integrate with Claude Desktop/Claude Code
- Install the package in editable mode (from the repository root):
pip install -e .
- (Optional for Claude Desktop/Code) Install MCP CLI tools:
pip install "mcp[cli]"
- Run the server locally:
# Run the server directly (standalone mode)
python pdf_tool_server.py
- If you want to register/install with Claude Desktop/Claude Code via MCP CLI:
# Install the server using the MCP CLI tool
mcp install /path/to/pdf_tool_server.py --with-editable /path/to/mcp_pdf_processor
Example with your repo cloned at ~/mcp_pdf_processor:
mcp install ~/mcp_pdf_processor/pdf_tool_server.py --with-editable ~/mcp_pdf_processor
- For development with the MCP Inspector:
mcp dev /path/to/pdf_tool_server.py --with-editable /path/to/mcp_pdf_processor
Note: Ensure that the PDF processing dependencies (pymupdf, torch, pix2tex, etc.) specified in pyproject.toml are installed as needed for full functionality.
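One way to see which of the dependencies named in the note are present is a quick check from Python. The snippet below is a minimal sketch, not part of the server: the helper names are hypothetical, and the import names are assumptions (PyMuPDF has historically been imported as fitz, with newer releases also exposing pymupdf).

```python
import importlib.util

# Import names are assumptions: PyMuPDF has historically been imported as
# "fitz" (newer releases also expose "pymupdf"); torch and pix2tex are
# assumed to use their package names.
OPTIONAL_MODULES = {
    "pymupdf": ("pymupdf", "fitz"),
    "torch": ("torch",),
    "pix2tex": ("pix2tex",),
}


def module_available(name: str) -> bool:
    """Return True if `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def check_dependencies(modules=OPTIONAL_MODULES) -> dict:
    """Map each dependency to True if any of its import names is found."""
    return {
        dep: any(module_available(n) for n in names)
        for dep, names in modules.items()
    }


if __name__ == "__main__":
    for dep, found in check_dependencies().items():
        print(f"{dep}: {'ok' if found else 'missing'}")
```

Locating modules with find_spec avoids actually importing torch, which can be slow, while still showing what is installed in the current environment.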
Additional notes
Environment variables:
- OUTPUT_DIR controls where processed PDFs are stored. If not set, the default is llm_output.
- PYTHONPATH should point to the directory containing the mcp_pdf_processor package to ensure imports resolve correctly.
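The OUTPUT_DIR fallback described above can be illustrated in a few lines of Python. This is a sketch of the documented behavior, not the server's actual code: resolve_output_dir is a hypothetical helper, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import Path

DEFAULT_OUTPUT_DIR = "llm_output"  # default named in this README


def resolve_output_dir(env=None) -> Path:
    """Return the directory named by OUTPUT_DIR, or the default."""
    env = os.environ if env is None else env
    # Assumption of this sketch: an empty value counts as unset.
    value = env.get("OUTPUT_DIR") or DEFAULT_OUTPUT_DIR
    return Path(value).expanduser()


if __name__ == "__main__":
    print(resolve_output_dir())
```

Passing a plain dict as env makes the helper easy to exercise without touching the real environment.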
Common issues:
- Ensure Python 3.9+ is installed; some dependencies may require newer versions.
- If LaTeX extraction relies on optional components (torch, pix2tex), install them or disable those features if not needed.
- When using Claude, the server name (PDF_TOOLS in instructions) should map to the registered mcpServer key (pdf_tool_server in this config).
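For reference, Claude Desktop-style configuration files register servers under an mcpServers object keyed by server name. The sketch below builds such an entry in Python using the pdf_tool_server key mentioned above. build_server_entry is a hypothetical helper and the paths are placeholders, so adapt both to your setup.

```python
import json


def build_server_entry(script_path, package_root,
                       output_dir="llm_output", key="pdf_tool_server"):
    """Build an mcpServers-style config entry for the PDF server."""
    return {
        "mcpServers": {
            key: {
                "command": "python",
                "args": [script_path],
                "env": {
                    "OUTPUT_DIR": output_dir,
                    "PYTHONPATH": package_root,
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    entry = build_server_entry(
        "/path/to/pdf_tool_server.py", "/path/to/mcp_pdf_processor"
    )
    print(json.dumps(entry, indent=2))
```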
Configuration tips:
- You can adjust OUTPUT_DIR to a writable path in your environment to avoid permission errors.
- If you encounter import errors, verify PYTHONPATH includes the root of the mcp_pdf_processor package.
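Both tips can be checked before starting the server. The following is a minimal diagnostic sketch with hypothetical helper names. It tests whether a directory is writable by creating a temporary file in it, and lists the PYTHONPATH entries that contain a directory named after the package.

```python
import os
import tempfile
from pathlib import Path


def is_writable(directory) -> bool:
    """Create the directory if needed and try writing a temp file in it."""
    path = Path(directory)
    try:
        path.mkdir(parents=True, exist_ok=True)
        with tempfile.TemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


def find_package_roots(package, pythonpath=None):
    """Return PYTHONPATH entries that contain a directory named `package`."""
    raw = os.environ.get("PYTHONPATH", "") if pythonpath is None else pythonpath
    entries = [e for e in raw.split(os.pathsep) if e]
    return [e for e in entries if (Path(e) / package).is_dir()]


if __name__ == "__main__":
    out = os.environ.get("OUTPUT_DIR") or "llm_output"
    print(f"{out} writable: {is_writable(out)}")
    print("package found in:", find_package_roots("mcp_pdf_processor"))
```

Note that is_writable creates the directory as a side effect. An empty result from find_package_roots suggests PYTHONPATH does not yet point at the directory containing the package.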