gro
LLM agent runtime with paged virtual memory and spatial context awareness
claude mcp add --transport stdio tjamescouch-gro npx -y @tjamescouch/gro
How to use
gro is a provider-agnostic runtime for running persistent agent loops against any major LLM provider (Anthropic, OpenAI, Google, xAI, Groq, or local models), with virtual memory management, semantic retrieval, streaming tool use, and MCP tool integration. It is designed to run in a containerized environment for host isolation and security. By default, gro runs interactively with a virtual memory system, automatically managing context through memory paging and, when embedding keys are available, semantic retrieval. You can drive gro from the CLI with one-shot prompts, interactive sessions, and piped input, and configure providers through model selection and environment keys. gro also supports response streaming, prompt caching, and explicit search over prior memory pages to retrieve relevant context during conversations.
To use gro, install the package globally and invoke the gro CLI. You can start an interactive session with memory, specify a model, or pipe input to gro: run a one-shot prompt against a chosen provider, open an interactive session that carries running memory, or resume a previous session. gro infers the provider from the model you select, manages API keys through environment variables or the macOS Keychain, and, when semantic retrieval is enabled, surfaces relevant context before each turn.
In practice, gro provides capabilities such as: (1) virtual memory with paging and summarization to maintain long-running conversations, (2) semantic retrieval to surface relevant past context when embedding keys are configured, (3) streaming tool usage where the agent can call external tools or functions as part of its reasoning, and (4) an AgentChat network integration for distributing work among multiple agents. These features together enable robust long-running assistant workflows across diverse LLM providers.
How to install
Prerequisites:
- Node.js 18+ and npm installed on your system
- Optional: Docker if you prefer containerized usage (not required if using the Node.js CLI)
Step 1: Install gro globally via npm
npm install -g @tjamescouch/gro
Step 2: Ensure you have a provider API key configured if you plan to use hosted models
- Anthropic: export ANTHROPIC_API_KEY
- OpenAI: export OPENAI_API_KEY
- Google: export GOOGLE_API_KEY
- xAI: export XAI_API_KEY
- Groq: export GROQ_API_KEY
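As a quick preflight before Step 3, the following shell sketch (a convenience helper, not part of gro — gro reads these variables directly) reports which of the provider keys above are set in the current environment:

```shell
# Preflight helper (not part of gro): report which provider keys are set.
check_key() {
  # Print "<name>: set" if the named environment variable is non-empty.
  if eval "[ -n \"\${$1}\" ]"; then
    echo "$1: set"
  else
    echo "$1: missing"
  fi
}

for var in ANTHROPIC_API_KEY OPENAI_API_KEY GOOGLE_API_KEY XAI_API_KEY GROQ_API_KEY; do
  check_key "$var"
done
```

Only the keys for providers you actually plan to use need to be set.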
Step 3: Run gro with your desired workflow
- One-shot prompt (Anthropic by default)
export ANTHROPIC_API_KEY=sk-...
gro "explain the CAP theorem in two sentences"
- Interactive memory session
gro -i
- Resume last session
gro -c
Step 4: (Optional) Configure memory or model options
gro -i --gro-memory virtual # default virtual memory with semantic retrieval
gro -i -m claude-sonnet-4-5 # explicit model selection (Anthropic)
gro -m gpt-4.1 "hello" # OpenAI model selection (provider inferred by model name)
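Provider inference works from the model name, as the gpt-4.1 example above shows. The sketch below illustrates the idea with a prefix-to-provider mapping; the prefixes and fallback behavior here are assumptions for illustration, not gro's actual mapping:

```shell
# Illustrative sketch of model-name-based provider inference.
# gro's actual mapping may differ; the prefixes here are assumptions.
infer_provider() {
  case "$1" in
    claude-*)          echo anthropic ;;
    gpt-*|o[0-9]*)     echo openai ;;
    gemini-*)          echo google ;;
    grok-*)            echo xai ;;
    llama-*|mixtral-*) echo groq ;;
    *)                 echo unknown ;;
  esac
}

infer_provider claude-sonnet-4-5   # anthropic
infer_provider gpt-4.1             # openai
```

If a model name does not map to a provider, pass a model that does and make sure the matching API key is exported.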
If you prefer containerized usage, you can run gro inside a container following your preferred container workflow, but the npm-based CLI is the recommended starting point for development and testing.
Additional notes
Tips and notes:
- gro supports multiple memory modes: virtual (default), simple, fragmentation, and hnsw. Choose the mode that fits your latency and memory requirements.
- Semantic retrieval requires embedding API keys; if not configured, retrieval is gracefully disabled but all other memory features still work.
- The system uses a memory paging approach with page summaries that can be loaded back on demand via page references (@@ref@@). This enables long-running conversations without token blow-up.
- You can explicitly search within the memory with @@search('query')@@ to retrieve and load relevant pages into context.
- Prompt caching can significantly reduce repeated prompt costs; you can disable it with gro --no-prompt-caching when needed.
- For macOS users, keys can be stored in Keychain via gro --set-key <provider> for persistent secure storage.
- If you encounter provider/model compatibility issues, verify that your model name maps correctly to a provider and that you have the necessary API key configured in the environment.
- gro is designed to operate in containerized environments to protect the host; ensure appropriate resource limits and network access in your container runtime.
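The container note above can be sketched as a minimal Dockerfile. The base image and entrypoint are assumptions for illustration, not an official gro image; only the npm install line comes from this document:

```dockerfile
# Sketch of a containerized gro setup (assumed layout, not an official image)
FROM node:20-slim
RUN npm install -g @tjamescouch/gro
# Supply provider keys at run time with docker run -e ...; never bake them in.
ENTRYPOINT ["gro"]
```

When running the image, apply resource limits and pass keys from the host environment, for example: docker run --rm -it -e ANTHROPIC_API_KEY --memory 1g --cpus 1 <image> -i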
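The @@search('query')@@ directive mentioned above embeds a query inside a message. As a sketch of the parsing idea only (not gro's implementation), the query text can be pulled out of such a directive like this:

```shell
# Illustrative only: pull the query out of an @@search('...')@@ directive.
# A sketch of the parsing idea, not gro's implementation.
extract_search_query() {
  printf '%s\n' "$1" | sed -n "s/.*@@search('\([^']*\)')@@.*/\1/p"
}

extract_search_query "please @@search('CAP theorem')@@ before answering"
# prints: CAP theorem
```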
Related MCP Servers
openfang
Open-source Agent Operating System
ai
One-stop shop for building AI-powered products and businesses with Stripe.
sre
The SmythOS Runtime Environment (SRE) is an open-source, cloud-native runtime for agentic AI. Secure, modular, and production-ready, it lets developers build, run, and manage intelligent agents across local, cloud, and edge environments.
golf
Production-Ready MCP Server Framework • Build, deploy & scale secure AI agent infrastructure • Includes Auth, Observability, Debugger, Telemetry & Runtime • Run real-world MCPs powering AI Agents
DeepMCPAgent
Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.
llm-functions
Easily create LLM tools and agents using plain Bash/JavaScript/Python functions.