Forge: Swarm Agents That Turn Slow PyTorch Into Fast CUDA/Triton Kernels
claude mcp add --transport stdio rightnow-ai-forge-mcp-server npx -y @rightnow/forge-mcp-server
How to use
The Forge MCP Server connects an MCP-capable coding agent to Forge, which optimizes PyTorch workloads into production-grade CUDA/Triton kernels. Forge runs 32 parallel swarm agents that race to discover performant kernel implementations, benchmarking candidates on real datacenter GPUs to deliver a best-in-class drop-in kernel. The server exposes three tools: forge_auth for authentication, forge_optimize for optimization, and forge_generate for kernel generation. Through an MCP client you can submit PyTorch code or a natural-language kernel description and receive an optimized kernel with speedup metrics and correctness guarantees. The server works with a wide range of MCP clients, including Claude Code/Desktop, OpenCode, Cursor, Windsurf, VS Code Copilot, and OpenAI-style MCP integrations, making it straightforward to plug into existing workflows.
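As an illustration, a forge_optimize invocation from an MCP client might carry arguments like the sketch below. Only the parameter names mentioned elsewhere on this page (target_speedup, max_iterations, gpu) are taken from the source; the overall request shape and the field name `code` are assumptions for illustration, not the documented tool schema:

```json
{
  "tool": "forge_optimize",
  "arguments": {
    "code": "def forward(x, w):\n    return torch.softmax(x @ w, dim=-1)",
    "target_speedup": 2.0,
    "max_iterations": 10,
    "gpu": "H100"
  }
}
```

In practice the MCP client constructs this call for you when you ask the agent to optimize a snippet; you rarely write the JSON by hand.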
How to install
Prerequisites:
- Node.js v14 or later (latest LTS recommended) and npm installed on your system
- Access to an MCP client (Claude Code/Desktop, VS Code MCP, Cursor, Windsurf, OpenCode, etc.) or an MCP-compatible integration
Installation steps:
- Install Node.js and npm from the official website if you don’t already have them.
- Use your MCP client to register or configure the Forge MCP server. The recommended method is via npx, which runs the MCP server without a global install:
# Claude integration (example)
claude mcp add forge-mcp -- npx -y @rightnow/forge-mcp-server
# VS Code / Copilot / other MCP clients (example snippet)
# In the respective MCP config, point to the server using:
# command: npx
# args: ["-y", "@rightnow/forge-mcp-server"]
- If you prefer to run manually in a local environment (without an MCP client):
npx -y @rightnow/forge-mcp-server
- Verify the server starts and is reachable by your MCP client. Follow client-specific prompts to authorize and connect Forge as the active MCP server.
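For clients that use a JSON configuration file (Claude Desktop, Cursor, and similar), the npx invocation above can be registered roughly as follows. This is a sketch: the `mcpServers` key and the server name `forge-mcp` follow common client conventions, and the exact file name and location vary by client:

```json
{
  "mcpServers": {
    "forge-mcp": {
      "command": "npx",
      "args": ["-y", "@rightnow/forge-mcp-server"]
    }
  }
}
```

Restart the client after editing its config so the new server is picked up.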
Additional notes
Tips and considerations:
- The server uses npx to fetch and run the latest Forge MCP server package; ensure your environment has network access to the npm registry.
- The forge_optimize tool accepts a PyTorch code snippet and returns a tuned kernel along with speedup metrics; it is recommended to set a reasonable target_speedup and max_iterations to control compute cost.
- If your MCP client requires Windows compatibility, you may need to wrap commands with cmd /c as shown in the README examples for specific clients.
- The server supports multiple GPU types (B200, H200/H100, L40S, A100, L4, A10, T4); you can steer the optimization toward a particular GPU with the gpu parameter of forge_optimize.
- For production use, ensure proper authentication via forge_auth before attempting optimizations to avoid quota or access issues.
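On Windows, the cmd /c wrapper mentioned above translates into a config along these lines. The structure is again a sketch assuming a Claude-Desktop-style `mcpServers` file; only the cmd /c wrapping itself comes from this page:

```json
{
  "mcpServers": {
    "forge-mcp": {
      "command": "cmd",
      "args": ["/c", "npx", "-y", "@rightnow/forge-mcp-server"]
    }
  }
}
```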