Clawd Throttle
npx machina-cli add skill @liekzejaws/clawd-throttle --openclaw
Route every LLM request to the cheapest model that can handle it. Stop paying Opus prices for "hello" and "summarize this."
Supports 8 providers and 25+ models: Anthropic (Claude), Google (Gemini), OpenAI (GPT / o-series), xAI (Grok), DeepSeek, Moonshot (Kimi), Mistral, and Ollama (local).
How It Works
- Your prompt arrives
- The classifier scores it on 8 dimensions (token count, code presence, reasoning markers, simplicity indicators, multi-step patterns, question count, system prompt complexity, conversation depth) in under 1 millisecond
- The router maps the resulting tier (simple / standard / complex) to a model based on your active mode and configured providers
- The request is proxied to the correct API
- The routing decision and cost are logged to a local JSONL file
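The classification step above can be sketched as a fast heuristic scorer. This is an illustrative assumption, not clawd-throttle's actual classifier: the real dimensions and weights are internal, so the regexes and thresholds below are made up to show the shape of sub-millisecond tiering.

```typescript
type Tier = "simple" | "standard" | "complex";

// Hypothetical heuristic scorer covering a few of the dimensions named
// above (length as a token-count proxy, code presence, reasoning markers,
// question count, simplicity indicators). Weights are invented.
function classify(prompt: string): Tier {
  let score = 0;
  if (prompt.length > 400) score += 2;                                  // token count proxy
  if (/```|function |class |def /.test(prompt)) score += 2;             // code presence
  if (/\bstep by step\b|\btherefore\b|\bprove\b/i.test(prompt)) score += 2; // reasoning markers
  score += Math.min((prompt.match(/\?/g) ?? []).length, 2);             // question count, capped
  if (/^(hi|hello|thanks?)\b/i.test(prompt.trim())) score -= 2;         // simplicity indicators
  if (score >= 4) return "complex";
  if (score >= 2) return "standard";
  return "simple";
}
```

Because everything is plain string inspection with no LLM call, a pass like this easily fits the under-1 ms budget.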
Routing Modes
| Mode | Simple | Standard | Complex |
|---|---|---|---|
| eco | Grok 4.1 Fast | Gemini Flash | Haiku |
| standard | Grok 4.1 Fast | Haiku | Sonnet |
| gigachad | Haiku | Sonnet | Opus 4.6 |
Each cell shows the first-choice model. The router tries a preference list and falls through to the next available provider if the first is not configured.
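The fall-through behavior can be sketched as a per-mode preference list. Only the first entry of each list comes from the table above; the fallback entries, names, and function signature are illustrative assumptions, not clawd-throttle's actual internals.

```typescript
type Tier = "simple" | "standard" | "complex";

// First entry per list matches the table's first-choice model; the
// remaining entries are assumed fallbacks for illustration only.
const preferences: Record<string, Record<Tier, string[]>> = {
  eco:      { simple: ["grok-4.1-fast", "gemini-flash"], standard: ["gemini-flash", "haiku"], complex: ["haiku", "sonnet"] },
  standard: { simple: ["grok-4.1-fast", "haiku"],        standard: ["haiku", "sonnet"],       complex: ["sonnet", "opus-4.6"] },
  gigachad: { simple: ["haiku", "sonnet"],               standard: ["sonnet", "opus-4.6"],    complex: ["opus-4.6"] },
};

// Walk the preference list and take the first model whose provider is configured.
function pickModel(mode: string, tier: Tier, configured: Set<string>): string | undefined {
  return preferences[mode]?.[tier]?.find((m) => configured.has(m));
}
```

For example, in eco mode with only a Google key configured, a simple prompt would fall through past Grok to Gemini Flash.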
Available Commands
| Command | What It Does |
|---|---|
| route_request | Send a prompt and get a response from the cheapest capable model |
| classify_prompt | Analyze prompt complexity without making an LLM call |
| get_routing_stats | View cost savings and model distribution stats |
| get_config | View current configuration (keys redacted) |
| set_mode | Change routing mode at runtime |
| get_recent_routing_log | Inspect recent routing decisions |
Overrides
- Heartbeats and summaries always route to the cheapest model
- Type /opus, /sonnet, /haiku, /flash, or /grok-fast to force a specific model
- Sub-agent calls automatically step down one tier from their parent
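A slash override like the ones above can be peeled off the front of the prompt before classification. This is a hedged sketch; the actual parsing rules and return shape in clawd-throttle are assumptions here.

```typescript
// The five override commands listed above. Parsing details are assumed.
const OVERRIDES = ["/opus", "/sonnet", "/haiku", "/flash", "/grok-fast"];

// If the prompt starts with an override, return the forced model name
// (without the slash) and the remaining prompt text.
function parseOverride(prompt: string): { model?: string; rest: string } {
  const hit = OVERRIDES.find((o) => prompt === o || prompt.startsWith(o + " "));
  if (!hit) return { rest: prompt };
  return { model: hit.slice(1), rest: prompt.slice(hit.length).trim() };
}
```

When an override is present, the router would skip classification entirely and send the request straight to the forced model.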
Setup
- Get at least one API key (Anthropic or Google required; others optional):
- Anthropic: https://console.anthropic.com/settings/keys
- Google AI: https://aistudio.google.com/app/apikey
- xAI: https://console.x.ai
- OpenAI: https://platform.openai.com/api-keys
- DeepSeek: https://platform.deepseek.com
- Moonshot: https://platform.moonshot.cn
- Mistral: https://console.mistral.ai
- Run the setup script: npm run setup
- Choose your routing mode (eco / standard / gigachad)
Privacy
- Prompt content is never stored. Only a SHA-256 hash is logged.
- All data stays local in ~/.config/clawd-throttle/
- API keys are stored in your local config file
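The hash-only logging described above can be sketched as follows. The field names (ts, promptSha256, model, costUsd) are assumptions for illustration; clawd-throttle's actual log schema may differ.

```typescript
import { createHash } from "node:crypto";
import { appendFileSync } from "node:fs";

// Append one routing decision to a JSONL log. Only a SHA-256 hash of the
// prompt is recorded, never the prompt content itself.
function logDecision(logPath: string, prompt: string, model: string, costUsd: number) {
  const record = {
    ts: new Date().toISOString(),
    promptSha256: createHash("sha256").update(prompt).digest("hex"), // hash only
    model,
    costUsd,
  };
  appendFileSync(logPath, JSON.stringify(record) + "\n"); // one JSON object per line (JSONL)
  return record;
}
```

The hash still lets you correlate repeated prompts across log entries (identical prompts produce identical hashes) without ever persisting the text.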
Overview
Clawd Throttle routes every LLM request to the cheapest model that can handle it, spanning eight providers and 25+ models. It scores prompts on eight dimensions in under 1ms, supports eco, standard, and gigachad modes, and logs routing decisions for cost tracking.
How This Skill Works
When a prompt arrives, the classifier scores it on eight dimensions (token count, code presence, reasoning markers, simplicity indicators, multi-step patterns, question count, system prompt complexity, conversation depth) in under 1 ms. The router then maps the resulting tier (simple/standard/complex) to a model based on the active mode and configured providers, proxies the request to the chosen API, and logs the routing decision and cost to a local JSONL file.
When to Use It
- You want to minimize spend by routing to the cheapest model that can handle the prompt across eight providers.
- Prompt complexity varies and you want automatic tiering to simple, standard, or complex within eco/standard/gigachad modes.
- You need cost awareness and auditing by logging routing decisions locally.
- You prefer local or private deployments (e.g., Ollama) to avoid sending prompts to external dashboards.
- You want to switch modes on the fly (eco, standard, gigachad) to balance cost and performance.
Quick Start
- Step 1: Install clawd-throttle and gather API keys for at least one provider (Anthropic or Google required).
- Step 2: Run npm run setup and choose your routing mode (eco, standard, or gigachad).
- Step 3: Route prompts with route_request or classify_prompt and review the locally stored logs at ~/.config/clawd-throttle/.
Best Practices
- Define and routinely update the provider/model roster to reflect current pricing.
- Tune the 8-dimension classifier to align with your typical prompts (token count, code, reasoning, etc.).
- Enable local JSONL logging and monitor cost savings with get_routing_stats.
- Test simple vs. complex prompts across modes to validate routing decisions.
- Secure API keys in local config and rotate them periodically.
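Monitoring savings from the local log, as suggested above, amounts to a simple pass over the JSONL records. This is a sketch in the spirit of get_routing_stats, assuming the hypothetical model/costUsd fields; the real record shape may differ.

```typescript
// Aggregate total spend and per-model request counts from JSONL log lines.
function summarize(lines: string[]): { totalCostUsd: number; byModel: Record<string, number> } {
  const byModel: Record<string, number> = {};
  let totalCostUsd = 0;
  for (const line of lines) {
    if (!line.trim()) continue; // skip blank lines
    const rec = JSON.parse(line) as { model: string; costUsd: number };
    totalCostUsd += rec.costUsd;
    byModel[rec.model] = (byModel[rec.model] ?? 0) + 1;
  }
  return { totalCostUsd, byModel };
}
```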
Example Use Cases
- A customer-support chatbot that routes common inquiries to low-cost models while preserving accuracy.
- A research project that tracks cost savings across prompts to optimize prompt design.
- An enterprise chat tool using eco mode for routine queries and gigachad for high-complexity tasks.
- A privacy-focused deployment using Ollama/local models for sensitive conversations.
- An analytics dashboard that surfaces routing stats and cost per provider for governance.