ai-spark
MCP server from vgiri2015/ai-spark-mcp-server
claude mcp add --transport stdio vgiri2015-ai-spark-mcp-server python v1/run_server.py \
  --env ANTHROPIC_API_KEY="your-anthropic-api-key"
How to use
ai-spark is an MCP (Model Context Protocol) server that optimizes and analyzes Apache Spark (PySpark) code using Claude AI. Clients submit PySpark code and request optimization and performance analysis; the server routes each request through its registered tools and Claude AI for code-level improvements and validation. The workflow produces an optimized Spark script together with a performance analysis report, so you can see the impact of the optimizations. To use it, make sure Python is installed, set the required environment variables (such as your Anthropic API key for Claude), then start the server and run the client to submit your code. The server coordinates tool invocation, code generation, and validation, returning both the optimized code and a performance report.
How to install
Prerequisites:
- Python 3.8+ and pip
- PySpark 3.2.0+ (for runtime validation of Spark code)
- Anthropic Claude API access (via ANTHROPIC_API_KEY)
Installation steps:
1. Create and activate a Python virtual environment:
   - python -m venv venv
   - source venv/bin/activate # macOS/Linux
   - .\venv\Scripts\activate # Windows
2. Install required dependencies:
   - pip install -r requirements.txt
3. Set the required environment variable for Claude AI access:
   - export ANTHROPIC_API_KEY=your-anthropic-api-key # macOS/Linux
   - set ANTHROPIC_API_KEY=your-anthropic-api-key # Windows
4. Start the MCP server:
   - python v1/run_server.py
5. In another terminal, run the MCP client to optimize your code:
   - python v1/run_client.py
6. The process will generate:
   - output/optimized_spark_code.py (or a similarly named file)
   - output/performance_analysis.md with a performance comparison
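Before starting the server, it can help to verify the prerequisites in one place. The helper below is a hypothetical preflight script (not part of the repository) that checks the Python version, the ANTHROPIC_API_KEY variable, and that PySpark is importable:

```python
# preflight.py — a hypothetical helper to verify prerequisites before
# starting the MCP server; not part of the ai-spark repository.
import importlib.util
import os
import sys

def preflight():
    """Return a list of problems; an empty list means ready to run."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not os.environ.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    if importlib.util.find_spec("pyspark") is None:
        problems.append("pyspark is not installed (pip install -r requirements.txt)")
    return problems

problems = preflight()
print("ready" if not problems else "\n".join(problems))
```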
Additional notes
Tips and common considerations:
- Ensure your ANTHROPIC_API_KEY is valid and has access to Claude AI features used by the server.
- The MCP server expects the Spark code to be accessible by the client and compatible with the PySpark runtime used during validation.
- If you change the project layout, update the run_server.py path in the mcp_config accordingly.
- Use the performance_analysis.md file to gauge real-world improvements and validate correctness after optimization.
- Check for version compatibility between Python, PySpark, and any dependencies in requirements.txt.
- If you encounter network or API rate limits with Claude AI, consider adding retry/backoff logic or adjusting tool timeouts.
- Secure your API keys and don’t commit them to version control.
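The retry/backoff tip above can be sketched as a small wrapper around any flaky call, such as a Claude API request that hits a rate limit. This is a generic pattern with hypothetical names, not code from the repository:

```python
# A minimal exponential-backoff sketch for retrying flaky API calls
# (e.g. Claude rate limits). The names here are illustrative only.
import time

def with_backoff(func, *, retries=3, base_delay=1.0, retriable=(Exception,)):
    """Wrap func so retriable errors are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(retries + 1):
            try:
                return func(*args, **kwargs)
            except retriable:
                if attempt == retries:
                    raise  # out of retries: surface the original error
                # Wait 1s, 2s, 4s, ... before the next attempt.
                time.sleep(base_delay * (2 ** attempt))
    return wrapper
```

In practice you would restrict `retriable` to the specific rate-limit or network exceptions raised by your client library, and cap the delay so long outages fail fast.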
Related MCP Servers
mcp-vegalite
MCP server from isaacwasserman/mcp-vegalite-server
github-chat
A Model Context Protocol (MCP) for analyzing and querying GitHub repositories using the GitHub Chat API.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
pagerduty
PagerDuty's official local MCP (Model Context Protocol) server which provides tools to interact with your PagerDuty account directly from your MCP-enabled client.
futu-stock
MCP server for Futu NiuNiu stock
mcp-boilerplate
Boilerplate using one of the 'better' ways to build MCP Servers. Written using FastMCP