ai-spark
MCP server from vgiri2015/ai-spark-mcp-server
claude mcp add --transport stdio vgiri2015-ai-spark-mcp-server python v1/run_server.py \
  --env ANTHROPIC_API_KEY="your-anthropic-api-key"
How to use
ai-spark is an MCP (Model Context Protocol) server that optimizes and analyzes Apache Spark (PySpark) code using Claude AI. Clients submit PySpark code and request optimization and performance analysis; the server routes each request through its registered tools and Claude AI for code-level improvements and validation. The workflow produces an optimized Spark script together with a performance analysis report, so you can see the impact of the optimizations. To use it, make sure Python is installed, set the required environment variables (such as your Anthropic API key for Claude), then start the server and run the client to submit your code. The server coordinates tool invocation, code generation, and validation, returning both the optimized code and a performance report.
How to install
Prerequisites:
- Python 3.8+ and pip
- PySpark 3.2.0+ (for runtime validation of Spark code)
- Anthropic Claude API access (via ANTHROPIC_API_KEY)
Installation steps:
1. Create and activate a Python virtual environment:
   - python -m venv venv
   - source venv/bin/activate # macOS/Linux
   - .\venv\Scripts\activate # Windows
2. Install required dependencies:
   - pip install -r requirements.txt
3. Set the required environment variable for Claude AI access:
   - export ANTHROPIC_API_KEY=your-anthropic-api-key # macOS/Linux
   - set ANTHROPIC_API_KEY=your-anthropic-api-key # Windows
4. Start the MCP server:
   - python v1/run_server.py
5. In another terminal, run the MCP client to optimize your code:
   - python v1/run_client.py
6. The process will generate:
   - output/optimized_spark_code.py (or a similarly named file)
   - output/performance_analysis.md with a performance comparison
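Before starting the server, it can help to verify the prerequisites in one place. The helper below is a hypothetical preflight script (not part of the repository) that checks the Python version, the ANTHROPIC_API_KEY variable, and that PySpark is importable:

```python
# preflight.py — a hypothetical helper to verify prerequisites before
# starting the MCP server; not part of the ai-spark repository.
import importlib.util
import os
import sys

def preflight():
    """Return a list of problems; an empty list means ready to run."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not os.environ.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    if importlib.util.find_spec("pyspark") is None:
        problems.append("pyspark is not installed (pip install -r requirements.txt)")
    return problems

problems = preflight()
print("ready" if not problems else "\n".join(problems))
```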
Additional notes
Tips and common considerations:
- Ensure your ANTHROPIC_API_KEY is valid and has access to Claude AI features used by the server.
- The MCP server expects the Spark code to be accessible by the client and compatible with the PySpark runtime used during validation.
- If you change the project layout, update the run_server.py path in the mcp_config accordingly.
- Use the performance_analysis.md file to gauge real-world improvements and validate correctness after optimization.
- Check for version compatibility between Python, PySpark, and any dependencies in requirements.txt.
- If you encounter network or API rate limits with Claude AI, consider adding retry/backoff logic or adjusting tool timeouts.
- Secure your API keys and don’t commit them to version control.
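The retry/backoff tip above can be sketched as a small wrapper around any flaky call, such as a Claude API request that hits a rate limit. This is a generic pattern with hypothetical names, not code from the repository:

```python
# A minimal exponential-backoff sketch for retrying flaky API calls
# (e.g. Claude rate limits). The names here are illustrative only.
import time

def with_backoff(func, *, retries=3, base_delay=1.0, retriable=(Exception,)):
    """Wrap func so retriable errors are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(retries + 1):
            try:
                return func(*args, **kwargs)
            except retriable:
                if attempt == retries:
                    raise  # out of retries: surface the original error
                # Wait 1s, 2s, 4s, ... before the next attempt.
                time.sleep(base_delay * (2 ** attempt))
    return wrapper
```

In practice you would restrict `retriable` to the specific rate-limit or network exceptions raised by your client library, and cap the delay so long outages fail fast.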
Related MCP Servers
mcp-vegalite
MCP server from isaacwasserman/mcp-vegalite-server
github-chat
A Model Context Protocol (MCP) for analyzing and querying GitHub repositories using the GitHub Chat API.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
pagerduty
PagerDuty's official local MCP (Model Context Protocol) server which provides tools to interact with your PagerDuty account directly from your MCP-enabled client.
futu-stock
MCP server for Futu NiuNiu stock
mcp-boilerplate
Boilerplate using one of the 'better' ways to build MCP Servers. Written using FastMCP