
Clawd Throttle


@liekzejaws

npx machina-cli add skill @liekzejaws/clawd-throttle --openclaw
Files (1): SKILL.md (3.3 KB)

Clawd Throttle

Route every LLM request to the cheapest model that can handle it. Stop paying Opus prices for "hello" and "summarize this."

Supports 8 providers and 25+ models: Anthropic (Claude), Google (Gemini), OpenAI (GPT / o-series), xAI (Grok), DeepSeek, Moonshot (Kimi), Mistral, and Ollama (local).

How It Works

  1. Your prompt arrives
  2. The classifier scores it on 8 dimensions (token count, code presence, reasoning markers, simplicity indicators, multi-step patterns, question count, system prompt complexity, conversation depth) in under 1 millisecond
  3. The router maps the resulting tier (simple / standard / complex) to a model based on your active mode and configured providers
  4. The request is proxied to the correct API
  5. The routing decision and cost are logged to a local JSONL file
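
The classify-then-tier step above can be sketched in Python. The dimension checks, weights, and thresholds below are illustrative assumptions, not Clawd Throttle's actual scoring:

```python
# A minimal sketch of prompt classification, assuming hypothetical dimension
# checks and thresholds; the skill's real scoring logic is not published here.

def score_prompt(prompt: str, history_depth: int = 0) -> int:
    """Score a prompt on a few of the listed dimensions; higher means harder."""
    score = 0
    if len(prompt.split()) > 200:                # token count, approximated by words
        score += 2
    if "```" in prompt or "def " in prompt:      # code presence
        score += 2
    if any(m in prompt.lower() for m in ("step by step", "prove", "why")):
        score += 2                               # reasoning markers
    score += prompt.count("?") // 2              # question count
    score += min(history_depth, 3)               # conversation depth
    return score

def tier(score: int) -> str:
    """Map a raw score to the simple / standard / complex tiers."""
    if score <= 1:
        return "simple"
    if score <= 4:
        return "standard"
    return "complex"
```

Because the checks are plain string operations with no LLM call, classification stays well under a millisecond.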

Routing Modes

Mode       Simple          Standard       Complex
eco        Grok 4.1 Fast   Gemini Flash   Haiku
standard   Grok 4.1 Fast   Haiku          Sonnet
gigachad   Haiku           Sonnet         Opus 4.6

Each cell shows the first-choice model. The router tries a preference list and falls through to the next available provider if the first is not configured.
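
The fall-through behavior can be sketched as follows; the preference lists and the model-to-provider mapping are assumptions for illustration (eco mode shown), not the skill's actual tables:

```python
# Hypothetical fallback routing: walk the tier's preference list and pick the
# first model whose provider has a configured API key.

PROVIDER = {
    "grok-4.1-fast": "xai",
    "gemini-flash": "google",
    "haiku": "anthropic",
    "sonnet": "anthropic",
}

PREFERENCES = {  # (mode, tier) -> ordered model choices
    ("eco", "simple"): ["grok-4.1-fast", "gemini-flash", "haiku"],
    ("eco", "standard"): ["gemini-flash", "haiku"],
    ("eco", "complex"): ["haiku", "sonnet"],
}

def pick_model(mode: str, tier: str, configured_providers: set) -> str:
    """Return the first preferred model backed by a configured provider."""
    for model in PREFERENCES[(mode, tier)]:
        if PROVIDER[model] in configured_providers:
            return model
    raise RuntimeError(f"no configured provider can serve {mode}/{tier}")
```

For example, with only Google and Anthropic keys configured, an eco/simple request skips Grok 4.1 Fast and lands on Gemini Flash.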

Available Commands

Command                  What It Does
route_request            Send a prompt and get a response from the cheapest capable model
classify_prompt          Analyze prompt complexity without making an LLM call
get_routing_stats        View cost savings and model distribution stats
get_config               View current configuration (keys redacted)
set_mode                 Change routing mode at runtime
get_recent_routing_log   Inspect recent routing decisions
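
The kind of aggregation get_routing_stats reports can be sketched as a pass over the local JSONL log. The record field names ("model", "cost_usd") are assumptions for illustration:

```python
# Illustrative stats aggregation over JSONL routing records.
import json
from collections import Counter

def routing_stats(jsonl_lines):
    """Summarize request counts per model and total spend."""
    by_model = Counter()
    total_cost = 0.0
    for line in jsonl_lines:
        record = json.loads(line)
        by_model[record["model"]] += 1
        total_cost += record["cost_usd"]
    return {
        "requests": sum(by_model.values()),
        "by_model": dict(by_model),
        "total_cost_usd": round(total_cost, 6),
    }
```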

Overrides

  • Heartbeats and summaries always route to the cheapest model
  • Type /opus, /sonnet, /haiku, /flash, or /grok-fast to force a specific model
  • Sub-agent calls automatically step down one tier from their parent
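
The two override rules above can be sketched like this; the command-to-model mapping is assumed for illustration:

```python
# Sketch of slash-command model forcing and the sub-agent one-tier step-down.

FORCED = {
    "/opus": "opus",
    "/sonnet": "sonnet",
    "/haiku": "haiku",
    "/flash": "gemini-flash",
    "/grok-fast": "grok-4.1-fast",
}

TIERS = ["simple", "standard", "complex"]  # cheapest to priciest

def forced_model(prompt: str):
    """Return the forced model if the prompt starts with a slash command, else None."""
    words = prompt.split()
    return FORCED.get(words[0]) if words else None

def subagent_tier(parent_tier: str) -> str:
    """Sub-agents step down one tier from their parent, floored at 'simple'."""
    return TIERS[max(TIERS.index(parent_tier) - 1, 0)]
```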

Setup

  1. Get at least one API key (Anthropic or Google required; others optional).
  2. Run the setup script:
    npm run setup
    
  3. Choose your routing mode (eco / standard / gigachad)

Privacy

  • Prompt content is never stored. Only a SHA-256 hash is logged.
  • All data stays local in ~/.config/clawd-throttle/
  • API keys are stored in your local config file
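
Hash-only logging as described above can be sketched like this; the record field names are assumptions for illustration:

```python
# Store a SHA-256 digest of the prompt in the log, never the prompt text itself.
import hashlib
import json
import time

def routing_log_line(prompt: str, model: str, cost_usd: float) -> str:
    """Build one JSONL record with a prompt hash instead of prompt content."""
    record = {
        "ts": time.time(),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "model": model,
        "cost_usd": cost_usd,
    }
    return json.dumps(record)  # appended to a local JSONL file in the real skill
```

The digest lets you deduplicate and audit routing decisions without the log ever containing recoverable prompt text.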

Source

git clone https://clawhub.ai/liekzejaws/clawd-throttle

Overview

Clawd Throttle routes every LLM request to the cheapest model that can handle it, spanning eight providers and 25+ models. It scores prompts on eight dimensions in under 1ms, supports eco, standard, and gigachad modes, and logs routing decisions for cost tracking.

How This Skill Works

When a prompt arrives, the classifier scores it on eight dimensions (token count, code presence, reasoning markers, simplicity indicators, multi-step patterns, question count, system prompt complexity, conversation depth) in under 1 ms. The router then maps the resulting tier (simple/standard/complex) to a model based on the active mode and configured providers, proxies the request to the chosen API, and logs the routing decision and cost to a local JSONL file.

When to Use It

  • You want to minimize spend by routing to the cheapest model that can handle the prompt across eight providers.
  • Prompt complexity varies and you want automatic tiering to simple, standard, or complex within eco/standard/gigachad modes.
  • You need cost awareness and auditing by logging routing decisions locally.
  • You prefer local or private deployments (e.g., Ollama) to avoid sending prompts to external dashboards.
  • You want to switch modes on the fly (eco, standard, gigachad) to balance cost and performance.

Quick Start

  1. Install clawd-throttle and gather API keys for at least one provider (Anthropic or Google required).
  2. Run npm run setup and choose your routing mode (eco, standard, or gigachad).
  3. Route prompts with route_request or classify_prompt and review the locally stored logs at ~/.config/clawd-throttle/.

Best Practices

  • Define and routinely update the provider/model roster to reflect current pricing.
  • Tune the 8-dimension classifier to align with your typical prompts (token count, code, reasoning, etc.).
  • Enable local JSONL logging and monitor cost savings with get_routing_stats.
  • Test simple vs. complex prompts across modes to validate routing decisions.
  • Secure API keys in local config and rotate them periodically.

Example Use Cases

  • A customer-support chatbot that routes common inquiries to low-cost models while preserving accuracy.
  • A research project that tracks cost savings across prompts to optimize prompt design.
  • An enterprise chat tool using eco mode for routine queries and gigachad for high-complexity tasks.
  • A privacy-focused deployment using Ollama/local models for sensitive conversations.
  • An analytics dashboard that surfaces routing stats and cost per provider for governance.
