Forge: Swarm Agents That Turn Slow PyTorch Into Fast CUDA/Triton Kernels
claude mcp add --transport stdio rightnow-ai-forge-mcp-server npx -y @rightnow/forge-mcp-server
How to use
The Forge MCP Server connects an MCP-capable coding agent to Forge, which optimizes PyTorch workloads into production-grade CUDA/Triton kernels. Forge runs 32 parallel swarm agents that race to discover performant kernel implementations, benchmarking candidates on real datacenter GPUs to deliver a best-in-class drop-in kernel. The server exposes three tools: forge_auth for authentication, forge_optimize for optimization, and forge_generate for kernel generation. Through an MCP client you can submit PyTorch code or a natural-language kernel description and receive an optimized kernel with speedup metrics and correctness guarantees. The server works with a wide range of MCP clients, including Claude Code/Desktop, OpenCode, Cursor, Windsurf, VS Code Copilot, and OpenAI-style MCP integrations, making it straightforward to plug into existing workflows.
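As an illustration, a forge_optimize invocation from an MCP client might carry arguments like the sketch below. Only the parameter names mentioned elsewhere on this page (target_speedup, max_iterations, gpu) are taken from the source; the overall request shape and the field name `code` are assumptions for illustration, not the documented tool schema:

```json
{
  "tool": "forge_optimize",
  "arguments": {
    "code": "def forward(x, w):\n    return torch.softmax(x @ w, dim=-1)",
    "target_speedup": 2.0,
    "max_iterations": 10,
    "gpu": "H100"
  }
}
```

In practice the MCP client constructs this call for you when you ask the agent to optimize a snippet; you rarely write the JSON by hand.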
How to install
Prerequisites:
- Node.js v14 or later (latest LTS recommended) and npm installed on your system
- Access to an MCP client (Claude Code/Desktop, VS Code MCP, Cursor, Windsurf, OpenCode, etc.) or an MCP-compatible integration
Installation steps:
- Install Node.js and npm from the official website if you don’t already have them.
- Use your MCP client to register or configure the Forge MCP server. The recommended method is via npx, which runs the MCP server without a global install:
# Claude integration (example)
claude mcp add forge-mcp -- npx -y @rightnow/forge-mcp-server
# VS Code / Copilot / other MCP clients (example snippet)
# In the respective MCP config, point to the server using:
# command: npx
# args: ["-y", "@rightnow/forge-mcp-server"]
- If you prefer to run manually in a local environment (without an MCP client):
npx -y @rightnow/forge-mcp-server
- Verify the server starts and is reachable by your MCP client. Follow client-specific prompts to authorize and connect Forge as the active MCP server.
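For clients that use a JSON configuration file (Claude Desktop, Cursor, and similar), the npx invocation above can be registered roughly as follows. This is a sketch: the `mcpServers` key and the server name `forge-mcp` follow common client conventions, and the exact file name and location vary by client:

```json
{
  "mcpServers": {
    "forge-mcp": {
      "command": "npx",
      "args": ["-y", "@rightnow/forge-mcp-server"]
    }
  }
}
```

Restart the client after editing its config so the new server is picked up.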
Additional notes
Tips and considerations:
- The server uses npx to fetch and run the latest Forge MCP server package; ensure your environment has network access to the npm registry.
- The forge_optimize tool accepts a PyTorch code snippet and returns a tuned kernel along with speedup metrics; it is recommended to set a reasonable target_speedup and max_iterations to control compute cost.
- If your MCP client requires Windows compatibility, you may need to wrap commands with cmd /c as shown in the README examples for specific clients.
- The server supports multiple GPU types (B200, H200/H100, L40S, A100, L4, A10, T4); you can steer the optimization toward a particular GPU with the gpu parameter of forge_optimize.
- For production use, ensure proper authentication via forge_auth before attempting optimizations to avoid quota or access issues.
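On Windows, the cmd /c wrapper mentioned above translates into a config along these lines. The structure is again a sketch assuming a Claude-Desktop-style `mcpServers` file; only the cmd /c wrapping itself comes from this page:

```json
{
  "mcpServers": {
    "forge-mcp": {
      "command": "cmd",
      "args": ["/c", "npx", "-y", "@rightnow/forge-mcp-server"]
    }
  }
}
```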