beam

MCP server for managing Apache Beam workflows across multiple runners

Installation
Run the following command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio souravch-beam-mcp-server python main.py --debug --port 8888 \
  --env GCP_REGION="us-central1" \
  --env GCP_PROJECT_ID="your-gcp-project-id" \
  --env PYTHONUNBUFFERED="1"

How to use

This MCP server provides a unified API for managing Apache Beam pipelines across multiple runners (Flink, Spark, Dataflow, and Direct) using the MCP standard. It exposes endpoints to list, create, monitor, and manage jobs and resources, as well as tools and contexts to enable AI-assisted pipeline orchestration. Core endpoints include /tools for registering and configuring processing tools, /resources for datasets and other inputs, and /contexts for defining execution environments and runner configurations. You can interact with the server to submit WordCount-style pipelines, switch between runners, and observe pipeline status and metrics through the MCP-compliant API.
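As a sketch of how those endpoints fit together, the snippet below builds request payloads for /tools and /contexts. The endpoint paths come from the description above, but the payload field names are illustrative assumptions, not the server's documented schema:

```python
import json
from urllib.parse import urljoin

BASE_URL = "http://localhost:8888/"

def build_request(endpoint: str, payload: dict) -> tuple:
    """Build a (url, body) pair for a POST to one of the MCP endpoints."""
    return urljoin(BASE_URL, endpoint), json.dumps(payload).encode()

# Register a processing tool (field names are assumptions for illustration)
tool_url, tool_body = build_request("tools", {
    "name": "sentiment_analyzer",
    "description": "Scores text sentiment inside a Beam pipeline",
})

# Define an execution context for a specific runner
ctx_url, ctx_body = build_request("contexts", {
    "runner": "dataflow",
    "options": {"project": "your-gcp-project-id", "region": "us-central1"},
})
```

Swap the runner name and options to target Flink, Spark, or the Direct runner instead.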

To start using it, run the server (for example: python main.py --debug --port 8888) and then issue REST requests to the documented endpoints. The server provides a Python client example to help you integrate programmatic control into your applications. Once running, you can add tools (like a sentiment analyzer), register datasets, create execution contexts for Dataflow or Flink, and submit jobs via the /jobs endpoint. The API is designed to be AI-friendly, so you can orchestrate pipelines and leverage LLM-driven decision making to select runners, configure options, and monitor progress.
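A minimal, hypothetical job-submission payload for the /jobs endpoint might look like the following. The key names here are assumptions for illustration; consult the server's README and Python client example for the real schema:

```python
def build_wordcount_job(runner: str, input_path: str, output_path: str) -> dict:
    """Assemble an illustrative WordCount job spec for POSTing to /jobs.

    The keys below are illustrative assumptions, not the documented schema.
    """
    return {
        "pipeline": "wordcount",
        "runner": runner,  # e.g. "direct", "flink", "spark", or "dataflow"
        "inputs": {"text": input_path},
        "outputs": {"counts": output_path},
    }

job = build_wordcount_job(
    "direct", "gs://bucket/shakespeare.txt", "gs://bucket/counts"
)
```

Because the runner is just a field in the spec, switching the same pipeline between runners is a one-line change.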

How to install

Prerequisites:

  • Python 3.9 or higher
  • pip (bundled with Python)
  • Internet access for installing dependencies

Step-by-step installation:

  1. Clone the repository and enter it:

    git clone https://github.com/yourusername/beam-mcp-server.git
    cd beam-mcp-server

  2. Create a virtual environment (recommended) and activate it: python -m venv beam-mcp-venv

    macOS/Linux

    source beam-mcp-venv/bin/activate

    Windows

    beam-mcp-venv\Scripts\activate

  3. Install dependencies: pip install -r requirements.txt

  4. (Optional) If you plan to run with Docker, build images as per the Docker instructions in the README.

  5. Run the server: python main.py --debug --port 8888

  6. Verify startup by hitting the API root or /health endpoint, e.g.: curl http://localhost:8888/health
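Step 6 can also be automated with a small Python probe. The expected response shape ({"status": "healthy"}) is an assumption about the server's reply; adjust the status field to whatever your server actually returns:

```python
import json
from urllib.request import urlopen

def check_health(url: str = "http://localhost:8888/health", fetch=None) -> bool:
    """Return True if the health endpoint reports a healthy status.

    `fetch` is injectable for testing; by default it performs a real
    HTTP GET against the running server.
    """
    fetch = fetch or (lambda u: urlopen(u, timeout=5).read())
    try:
        body = json.loads(fetch(url))
    except Exception:
        return False
    if not isinstance(body, dict):
        return False
    return str(body.get("status", "")).lower() in ("ok", "healthy", "up")
```

This is handy in a startup script: poll check_health() in a loop before submitting the first job.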

Additional notes

Notes and tips:

  • The server supports multiple runners via endpoints; use /contexts to configure runner-specific parameters and /jobs to submit pipelines.
  • For Docker deployments, ensure environment variables like GCP_PROJECT_ID and GCP_REGION are set if your pipelines interact with Google Cloud resources.
  • If you encounter port binding issues, check for existing processes using port 8888 and update the port accordingly in the command.
  • Enable logging and monitoring by using the /metrics endpoint for Prometheus and reading the JSON-formatted logs emitted by the server.
  • When using the Docker image, mount a config directory (config/) to /app/config and set environment variables as needed to point to your resources and credentials.
  • If you expand to Kubernetes, refer to the included Kubernetes Deployment Guide for manifests and Helm charts.
  • Ensure your Python environment includes compatible versions of dependencies listed in requirements.txt to avoid import errors.
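Since the server emits JSON-formatted logs (per the tips above), a small filter can surface errors from stdout or a log file. This is a sketch assuming common field names like "level" and "message", which are conventions rather than a guarantee of this server's log format:

```python
import json
from typing import Optional

def parse_log_line(line: str) -> Optional[dict]:
    """Parse one JSON-formatted log line; return None if it is not valid JSON."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None

def errors_only(lines):
    """Yield parsed records whose level looks like an error."""
    for line in lines:
        record = parse_log_line(line)
        if record and str(record.get("level", "")).upper() in ("ERROR", "CRITICAL"):
            yield record
```

For example, piping the server's log file through errors_only() gives you just the failures, ready to feed into an alerting hook.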
