
k8s-gpu

NVIDIA GPU hardware introspection for Kubernetes clusters via MCP

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio arangogutierrez-k8s-gpu-mcp-server npx -y k8s-gpu-mcp-server@latest

How to use

The k8s-gpu-mcp-server is an ephemeral diagnostic agent that provides real-time NVIDIA GPU hardware introspection for Kubernetes clusters over the Model Context Protocol (MCP). It exposes a low-footprint HTTP server that serves MCP tools and workflows, enabling AI-assisted troubleshooting of complex GPU issues. The server ships with a suite of NVML-based tools for inventory, health, error analysis, topology, timing data, and Kubernetes-aware diagnostics, all accessible through a consistent MCP interface. Built-in prompts guide operators through common GPU triage workflows, making it easier to gather context, identify root causes, and surface remediation steps.

To use it, install the MCP server via npx or npm and point your MCP client (Claude Code, Cursor, Claude Desktop, or another MCP host) at the server's endpoint. The server is designed to operate in read-only mode by default, with operator-mode options for deeper access where supported. Once running, you can invoke the available tools such as get_gpu_inventory, get_gpu_health, analyze_xid_errors, get_nvlink_topology, get_gpu_timeline, describe_gpu_node, get_pod_gpu_allocation, explain_failure, get_incident_report, and other MCP prompts to orchestrate comprehensive GPU diagnostics.
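As a quick sanity check before wiring up a client, you can speak raw JSON-RPC to the stdio transport yourself. This is a sketch, not an official test harness: the payloads follow the standard MCP handshake (initialize, initialized notification, then tools/list), and it assumes npx can fetch the package.

```shell
# Sketch: exercise the stdio transport with raw MCP JSON-RPC messages.
# The handshake shape follows the MCP spec; the protocol version shown
# is one published revision -- the server may negotiate a different one.
{
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.1"}}}'
  printf '%s\n' '{"jsonrpc":"2.0","method":"notifications/initialized"}'
  printf '%s\n' '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
} | npx -y k8s-gpu-mcp-server@latest
```

The tools/list response should enumerate the tools named above (get_gpu_inventory, get_gpu_health, and so on) if the server started cleanly.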

How to install

Prerequisites:

  • Node.js (version compatible with npx/npm) and npm installed on your host or CI environment
  • Internet access to fetch the MCP server package
  • Optional: Docker if you prefer containerized runs
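For the Docker option, a containerized run looks roughly like the following. The image reference here is an assumption for illustration; check the repository for a published image before using it.

```shell
# Sketch: containerized run. The image name is hypothetical -- substitute
# whatever image the repository actually publishes. --gpus all exposes the
# host GPUs so NVML can see them; -i keeps stdin open for the stdio transport.
docker run --rm -i --gpus all \
  ghcr.io/arangogutierrez/k8s-gpu-mcp-server:latest
```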

Install methods:

  1. One-line (recommended):
  • Ensure Node.js and npm are installed
  • Run: npx k8s-gpu-mcp-server@latest
  2. Global installation (alternative):
  • Install the package globally and run from anywhere: npm install -g k8s-gpu-mcp-server
  • Then start the server (the exact start command may be exposed by the package, e.g., k8s-gpu-mcp-server or a node script).
  3. Install from source (advanced):
  • Clone the repository: git clone https://github.com/ArangoGutierrez/k8s-gpu-mcp-server.git
  • Build/install as needed per repository instructions (example shown in README): cd k8s-gpu-mcp-server && make agent
  • Run the locally built agent binary (as shown in the quickstart): cat examples/gpu_inventory.json | ./bin/agent --nvml-mode=mock

    or

    cat examples/gpu_inventory.json | ./bin/agent --nvml-mode=real

Notes:

  • If you are deploying to Kubernetes, you can use the provided Helm charts or OCI deployments as shown in the README to run the MCP server in a cluster.
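An in-cluster deployment via Helm might look roughly like this. The OCI chart reference and namespace below are assumptions for illustration; the gpu.runtimeClass value is mentioned in the README notes, but consult the repository for the canonical chart location and values.

```shell
# Sketch: Helm-based deployment. The chart URL is hypothetical -- replace it
# with the OCI reference published in the repository's README.
helm install k8s-gpu-mcp \
  oci://ghcr.io/arangogutierrez/k8s-gpu-mcp-server \
  --namespace gpu-diagnostics --create-namespace \
  --set gpu.runtimeClass=nvidia
```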

Additional notes

Tips and caveats:

  • The MCP server is designed to be low-footprint; it keeps a persistent HTTP front-end and performs GPU work on-demand when tools are invoked.
  • GPU access in Kubernetes typically requires a RuntimeClass (e.g., nvidia) or equivalent GPU operator setup. If a RuntimeClass is not available, you may need to enable fallbacks as described in the README (disable gpu.runtimeClass and enable gpu.resourceRequest).
  • There are multiple deployment paths: one-line npx install, global npm install, or deploying via Helm charts to Kubernetes.
  • The MCP prompts (gpu-health-check, diagnose-xid-errors, gpu-triage) guide orchestration of multiple tools to produce actionable insights.
  • For Claude Desktop integration, you can configure a mcpServers entry that executes kubectl to run the agent inside a pod, enabling remote querying via Claude.
  • If you encounter issues with tool availability, ensure NVML access is functioning and that you are running a compatible GPU driver stack.
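For the Claude Desktop integration mentioned above, an mcpServers entry along these lines is one way to run the agent inside a pod via kubectl. The namespace, deployment name, and binary path are placeholders, so adjust them to match your actual deployment:

```json
{
  "mcpServers": {
    "k8s-gpu": {
      "command": "kubectl",
      "args": [
        "exec", "-i", "-n", "gpu-diagnostics",
        "deploy/k8s-gpu-mcp-server", "--",
        "/bin/agent", "--nvml-mode=real"
      ]
    }
  }
}
```

The -i flag keeps stdin open so the MCP stdio transport can flow through kubectl exec to the agent in the pod.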
