gro
LLM agent runtime with paged virtual memory and spatial context awareness
claude mcp add --transport stdio tjamescouch-gro npx -y @tjamescouch/gro
How to use
gro is a provider-agnostic runtime for running persistent agent loops against any major LLM provider (Anthropic, OpenAI, Google, xAI, Groq, or local models), with virtual memory management, semantic retrieval, streaming tool use, and MCP tool integration. It is designed to run in a containerized environment for host isolation and security. By default, gro runs interactively with a virtual memory system, automatically managing context through memory paging and, when embedding keys are available, semantic retrieval. You can drive gro from the CLI with one-shot prompts, interactive sessions, and piped input, and configure providers through model selection and environment keys. gro also supports response streaming, prompt caching, and explicit search over prior memory pages to retrieve relevant context during conversations.
To use gro, install the package globally and invoke the gro CLI. You can start an interactive session with memory, specify a model, or pipe input to gro: run a one-shot prompt against a chosen provider, open an interactive session that carries running memory, or resume a previous session. gro infers the provider from the model you select, manages API keys through environment variables or the macOS Keychain, and, when semantic retrieval is enabled, surfaces relevant context before each turn.
In practice, gro provides capabilities such as: (1) virtual memory with paging and summarization to maintain long-running conversations, (2) semantic retrieval to surface relevant past context when embedding keys are configured, (3) streaming tool usage where the agent can call external tools or functions as part of its reasoning, and (4) an AgentChat network integration for distributing work among multiple agents. These features together enable robust long-running assistant workflows across diverse LLM providers.
How to install
Prerequisites:
- Node.js 18+ and npm installed on your system
- Optional: Docker if you prefer containerized usage (not required if using the Node.js CLI)
Step 1: Install gro globally via npm
npm install -g @tjamescouch/gro
Step 2: Ensure you have a provider API key configured if you plan to use hosted models
- Anthropic: export ANTHROPIC_API_KEY
- OpenAI: export OPENAI_API_KEY
- Google: export GOOGLE_API_KEY
- xAI: export XAI_API_KEY
- Groq: export GROQ_API_KEY
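As a quick preflight before Step 3, the following shell sketch (a convenience helper, not part of gro — gro reads these variables directly) reports which of the provider keys above are set in the current environment:

```shell
# Preflight helper (not part of gro): report which provider keys are set.
check_key() {
  # Print "<name>: set" if the named environment variable is non-empty.
  if eval "[ -n \"\${$1}\" ]"; then
    echo "$1: set"
  else
    echo "$1: missing"
  fi
}

for var in ANTHROPIC_API_KEY OPENAI_API_KEY GOOGLE_API_KEY XAI_API_KEY GROQ_API_KEY; do
  check_key "$var"
done
```

Only the keys for providers you actually plan to use need to be set.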
Step 3: Run gro with your desired workflow
- One-shot prompt (Anthropic by default)
export ANTHROPIC_API_KEY=sk-...
gro "explain the CAP theorem in two sentences"
- Interactive memory session
gro -i
- Resume last session
gro -c
Step 4: (Optional) Configure memory or model options
gro -i --gro-memory virtual # default virtual memory with semantic retrieval
gro -i -m claude-sonnet-4-5 # explicit model selection (Anthropic)
gro -m gpt-4.1 "hello" # OpenAI model selection (provider inferred by model name)
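Provider inference works from the model name, as the gpt-4.1 example above shows. The sketch below illustrates the idea with a prefix-to-provider mapping; the prefixes and fallback behavior here are assumptions for illustration, not gro's actual mapping:

```shell
# Illustrative sketch of model-name-based provider inference.
# gro's actual mapping may differ; the prefixes here are assumptions.
infer_provider() {
  case "$1" in
    claude-*)          echo anthropic ;;
    gpt-*|o[0-9]*)     echo openai ;;
    gemini-*)          echo google ;;
    grok-*)            echo xai ;;
    llama-*|mixtral-*) echo groq ;;
    *)                 echo unknown ;;
  esac
}

infer_provider claude-sonnet-4-5   # anthropic
infer_provider gpt-4.1             # openai
```

If a model name does not map to a provider, pass a model that does and make sure the matching API key is exported.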
If you prefer containerized usage, you can run gro inside a container following your preferred container workflow, but the npm-based CLI is the recommended starting point for development and testing.
Additional notes
Tips and notes:
- gro supports multiple memory modes: virtual (default), simple, fragmentation, and hnsw. Choose the mode that fits your latency and memory requirements.
- Semantic retrieval requires embedding API keys; if not configured, retrieval is gracefully disabled but all other memory features still work.
- The system uses a memory paging approach with page summaries that can be loaded back on demand via page references (@@ref@@). This enables long-running conversations without token blow-up.
- You can explicitly search within the memory with @@search('query')@@ to retrieve and load relevant pages into context.
- Prompt caching can significantly reduce repeated prompt costs; you can disable it with gro --no-prompt-caching when needed.
- For macOS users, keys can be stored in Keychain via gro --set-key <provider> for persistent secure storage.
- If you encounter provider/model compatibility issues, verify that your model name maps correctly to a provider and that you have the necessary API key configured in the environment.
- gro is designed to operate in containerized environments to protect the host; ensure appropriate resource limits and network access in your container runtime.
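The container note above can be sketched as a minimal Dockerfile. The base image and entrypoint are assumptions for illustration, not an official gro image; only the npm install line comes from this document:

```dockerfile
# Sketch of a containerized gro setup (assumed layout, not an official image)
FROM node:20-slim
RUN npm install -g @tjamescouch/gro
# Supply provider keys at run time with docker run -e ...; never bake them in.
ENTRYPOINT ["gro"]
```

When running the image, apply resource limits and pass keys from the host environment, for example: docker run --rm -it -e ANTHROPIC_API_KEY --memory 1g --cpus 1 <image> -i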
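The @@search('query')@@ directive mentioned above embeds a query inside a message. As a sketch of the parsing idea only (not gro's implementation), the query text can be pulled out of such a directive like this:

```shell
# Illustrative only: pull the query out of an @@search('...')@@ directive.
# A sketch of the parsing idea, not gro's implementation.
extract_search_query() {
  printf '%s\n' "$1" | sed -n "s/.*@@search('\([^']*\)')@@.*/\1/p"
}

extract_search_query "please @@search('CAP theorem')@@ before answering"
# prints: CAP theorem
```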
Related MCP Servers
openfang
Open-source Agent Operating System
ai
One-stop shop for building AI-powered products and businesses with Stripe.
sre
The SmythOS Runtime Environment (SRE) is an open-source, cloud-native runtime for agentic AI. Secure, modular, and production-ready, it lets developers build, run, and manage intelligent agents across local, cloud, and edge environments.
golf
Production-Ready MCP Server Framework • Build, deploy & scale secure AI agent infrastructure • Includes Auth, Observability, Debugger, Telemetry & Runtime • Run real-world MCPs powering AI Agents
DeepMCPAgent
Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.
llm-functions
Easily create LLM tools and agents using plain Bash/JavaScript/Python functions.