pdf-reader
📄 Production-ready MCP server for PDF processing - 5-10x faster with parallel processing and 94%+ test coverage
claude mcp add --transport stdio sylphxai-pdf-reader-mcp npx @sylphx/pdf-reader-mcp
How to use
pdf-reader-mcp is a production-ready MCP server that exposes a PDF processing toolkit to AI agents. It extracts text, images, and metadata from PDFs and preserves the order in which content appears on each page by sorting it by Y-coordinate. The server is built for parallel processing and batch workloads, so multiple PDFs can be processed concurrently. Typical usage is to start the server via npx and then send requests that specify the sources (file paths or URLs) and the outputs you want (text, images, metadata, page counts, and so on). The API centers on a single consolidated tool that handles whole documents or specific page ranges and accepts both absolute and relative paths.
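To make the Y-coordinate ordering concrete, here is a minimal TypeScript sketch. It is not the server's implementation: the `PositionedItem` type and the `orderByLayout` function are hypothetical names introduced for illustration, and the sketch assumes PDF default user space, where the origin is at the bottom-left and larger Y values sit higher on the page.

```typescript
// Hypothetical shape for a piece of page content with a position.
// These names are illustrative and are not taken from pdf-reader-mcp.
interface PositionedItem {
  kind: "text" | "image";
  x: number; // horizontal position, increasing to the right
  y: number; // vertical position, increasing upward (assumed PDF user space)
  content: string;
}

// Return a new array ordered top-to-bottom, then left-to-right.
// Because y grows upward, "top first" means larger y first.
function orderByLayout(items: PositionedItem[]): PositionedItem[] {
  return [...items].sort((a, b) => b.y - a.y || a.x - b.x);
}
```

Sorting text and images together by position is one way an image can end up between the paragraphs that surround it, rather than after all of the text.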
Once the server is running, use the Quick Start examples in the README as templates for your request payloads. A single request can return full text, metadata, and page counts together, or only the pieces you need (for example, just images and page-level metadata). Errors are isolated per page, so a failure on one page does not abort processing of the rest of the document. This makes the server a reasonable fit for document ingestion pipelines where reliability and predictable performance matter.
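The per-page error isolation mentioned above can be sketched as follows. This is a generic illustration of the pattern, not the server's code: `extractPage` is a hypothetical callback standing in for whatever per-page extraction actually runs, and `PageResult` is an assumed result shape.

```typescript
// Outcome for one page: either the extracted text or an error message.
// Both shapes are assumptions made for this sketch.
type PageResult =
  | { page: number; ok: true; text: string }
  | { page: number; ok: false; error: string };

// Run extraction for every page concurrently. Each page has its own
// try/catch, so one failing page yields an error entry instead of
// rejecting the whole batch.
async function extractAllPages(
  pageCount: number,
  extractPage: (page: number) => Promise<string>, // hypothetical extractor
): Promise<PageResult[]> {
  const pages = Array.from({ length: pageCount }, (_, i) => i + 1);
  return Promise.all(
    pages.map(async (page): Promise<PageResult> => {
      try {
        return { page, ok: true, text: await extractPage(page) };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { page, ok: false, error: message };
      }
    }),
  );
}
```

With this pattern a caller always gets one entry per page and can decide whether to retry, skip, or report the failed pages.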
How to install
Prerequisites:
- Node.js installed (recommended LTS version)
- Internet access to fetch the MCP package from npm
Installation steps:
- Install/run the MCP server via npx (no local installation required):
npx @sylphx/pdf-reader-mcp
- (Optional) Install globally for repeated use:
npm install -g @sylphx/pdf-reader-mcp
- If you prefer manual configuration in Claude/Windsurf/etc., you can reference the following example to add the server:
Example configuration for Claude Desktop, Windsurf, or similar clients:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "npx",
      "args": ["@sylphx/pdf-reader-mcp"]
    }
  }
}
- Validate installation by running a basic quick start request (as shown in the README) to ensure the MCP server responds and processes a sample PDF.
Additional notes
Tips and notes:
- The server is designed for parallel processing; you can batch multiple PDFs in a single request.
- Both absolute and relative paths are supported; absolute paths require v1.3.0 or later.
- For production use, check the package docs for any environment variables that tune performance or logging, and set them if they exist.
- If path resolution fails, check the working directory of the process that launches the server, since relative paths typically resolve against it, or use URL-based sources instead.
- The npm package name to reference in configurations is @sylphx/pdf-reader-mcp; use the provided examples to wire it into Claude, Windsurf, Cline, Warp, or other MCP managers.
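As an illustration of the batching and path notes above, the sketch below builds a single request that covers several PDFs. The field names (`sources`, `path`, `url`, `pages`, and the `include_*` flags), the page-range string format, and the `buildBatchRequest` helper are all assumptions made for this example; check the README for the actual request schema before relying on them.

```typescript
// Assumed source shape: exactly one of `path` or `url`, plus an optional
// page selection. Field names are illustrative, not confirmed API.
interface PdfSource {
  path?: string; // relative or absolute file path
  url?: string; // remote PDF
  pages?: string; // assumed range syntax, e.g. "1-3"; omit for all pages
}

// Assumed request shape with one flag per output selection.
interface ReadPdfRequest {
  sources: PdfSource[];
  include_full_text: boolean;
  include_metadata: boolean;
  include_page_count: boolean;
  include_images: boolean;
}

type OutputOptions = Partial<Omit<ReadPdfRequest, "sources">>;

// Hypothetical helper: validate sources and fill in default selections.
function buildBatchRequest(
  sources: PdfSource[],
  options: OutputOptions = {},
): ReadPdfRequest {
  for (const source of sources) {
    const hasPath = source.path !== undefined;
    const hasUrl = source.url !== undefined;
    if (hasPath === hasUrl) {
      throw new Error("Each source needs exactly one of `path` or `url`");
    }
  }
  return {
    sources,
    include_full_text: options.include_full_text ?? false,
    include_metadata: options.include_metadata ?? true,
    include_page_count: options.include_page_count ?? true,
    include_images: options.include_images ?? false,
  };
}
```

Mixing a local path and a URL in one `sources` array is the kind of batch the first note describes; whether the server accepts exactly this shape is something to verify against the package documentation.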
Related MCP Servers
XActions
⚡ The Complete X/Twitter Automation Toolkit — Scrapers, MCP server for AI agents (Claude/GPT), CLI, browser scripts. No API fees. Open source. Unfollow people who don't follow back. Monitor real-time analytics. Auto follow, like, comment, scrape, without API.
mcp-ssh
🔐 SSH MCP Tool - AI-powered SSH management through the MCP protocol | An MCP-based SSH tool that gives AI remote SSH operation capabilities
glasses
Glasses MCP is a simple MCP server that lets your AI agent see and capture the web 👓
MCP-Client -Project-using-NodeJS
A minimal Model Context Protocol (MCP) implementation built with Node.js and TypeScript. This project demonstrates client–server communication over stdio, structured message handling, and local data access, developed with VS Code and GitHub Copilot to explore modern AI tool integration workflows.
filesystem
📁 Secure, efficient MCP filesystem server - token-saving batch operations with project root confinement
warp-sql
🗄️ Model Context Protocol (MCP) server for SQL Server integration with Warp terminal. Execute queries, explore schemas, export data, and analyze performance with natural language commands.