# 💰 Cost Tracker

```bash
npx machina-cli add skill suryast/free-ai-agent-skills/cost-tracker --openclaw
```
Compatible with Claude Code, Codex CLI, Cursor, Windsurf, and any SKILL.md-compatible agent.
Track what your AI sessions actually cost: estimate token usage, tally cumulative spend, and get warned before you hit budget thresholds, across OpenAI, Anthropic, Google, and other major providers.
## Triggers
Activate this skill when:
- User asks "how much has this session cost?"
- User asks "what's my token usage?"
- User sets a session budget ("keep this under $2")
- User wants a cost estimate before a large task
- Cumulative session spend needs tracking
- "track my costs", "budget check", "token count", "how much am I spending"
## Pricing Reference (update as models change)
Use these rates to estimate costs. All prices are per 1M tokens (input / output).
### Anthropic
| Model | Input | Output |
|---|---|---|
| claude-opus-4 | $15.00 | $75.00 |
| claude-sonnet-4 | $3.00 | $15.00 |
| claude-haiku-4 | $0.80 | $4.00 |
| claude-opus-3 | $15.00 | $75.00 |
| claude-sonnet-3.5 | $3.00 | $15.00 |
| claude-haiku-3.5 | $0.80 | $4.00 |
### OpenAI
| Model | Input | Output |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| gpt-4 | $30.00 | $60.00 |
| gpt-3.5-turbo | $0.50 | $1.50 |
| o1 | $15.00 | $60.00 |
| o1-mini | $3.00 | $12.00 |
| o3-mini | $1.10 | $4.40 |
### Google

| Model | Input | Output |
|---|---|---|
| gemini-2.0-flash | $0.075 | $0.30 |
| gemini-2.0-pro | $1.25 | $5.00 |
| gemini-1.5-pro | $1.25 | $5.00 |
| gemini-1.5-flash | $0.075 | $0.30 |
### Other
| Model | Input | Output |
|---|---|---|
| mistral-large | $3.00 | $9.00 |
| mistral-small | $0.20 | $0.60 |
| llama-3.3-70b (Groq) | $0.59 | $0.79 |
| deepseek-r1 | $0.55 | $2.19 |
⚠️ Prices change frequently. Always verify at the provider's pricing page before making financial decisions.
## How It Works

### Session Tracking
When activated, maintain a running cost ledger in the conversation context:

```
SESSION COST LEDGER
===================
Model: claude-sonnet-4
Started: [timestamp]

Turn  | Input tok | Output tok | Cost
------|-----------|------------|--------
1     | 2,340     | 450        | $0.0138
2     | 4,120     | 890        | $0.0257
3     | 1,870     | 340        | $0.0107
------|-----------|------------|--------
Total | 8,330     | 1,680      | $0.0502

Budget: $2.00 | Used: $0.05 (2.5%) | Remaining: $1.95
```
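The per-row arithmetic can be sketched as a small shell helper. This is an illustration, not part of the skill; `turn_cost` is a hypothetical name, and the rates are claude-sonnet-4's from the pricing table above:

```bash
# Hypothetical helper: cost of one exchange at claude-sonnet-4 rates
# ($3.00 input / $15.00 output per 1M tokens, from the table above).
turn_cost() {
  awk -v i="$1" -v o="$2" \
    'BEGIN { printf "$%.4f\n", (i * 3.00 + o * 15.00) / 1000000 }'
}

turn_cost 10000 2000   # 10k in / 2k out → $0.0600
```

Swap in another model's rates from the table to compare the same exchange across providers.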
### Token Estimation
When you can't read token counts directly from the API response, estimate:
Quick estimates (rough, for planning):
- 1 token ≈ 4 characters of English text
- 1 token ≈ ¾ of a word
- Code is denser: 1 token ≈ 3 characters
- 1 page of plain text ≈ 500–750 tokens
- 1,000-word article ≈ 1,300–1,500 tokens
File size estimates:
- Small file (<50 lines): ~500–1,000 tokens
- Medium file (50–200 lines): ~1,000–4,000 tokens
- Large file (200–500 lines): ~4,000–10,000 tokens
- Full codebase context: count with `wc -c`, then divide by 4
Pre-task estimate commands:

```bash
# Estimate tokens in a file
wc -c myfile.py | awk '{printf "~%d tokens\n", $1/4}'

# Estimate tokens in an entire codebase
find . -name "*.py" -o -name "*.ts" -o -name "*.js" | xargs wc -c 2>/dev/null | tail -1 | awk '{printf "~%d tokens (input)\n", $1/4}'

# Count words as a rough proxy
wc -w myfile.txt | awk '{printf "~%d tokens\n", $1*1.3}'
```
### Budget Warnings

Issue warnings at these thresholds:
- 50% of budget: ℹ️ Heads up, halfway through budget
- 80% of budget: ⚠️ Approaching limit, consider wrapping up
- 95% of budget: 🚨 Budget nearly exhausted, stop or expand
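The threshold logic above can be sketched as a shell function. A minimal sketch, assuming spend and budget are tracked in dollars; `budget_warning` is a hypothetical name:

```bash
# Sketch: map (used, budget) in dollars to the warning tier above.
budget_warning() {
  awk -v used="$1" -v budget="$2" 'BEGIN {
    pct = 100 * used / budget
    if      (pct >= 95) msg = "🚨 Budget nearly exhausted, stop or expand"
    else if (pct >= 80) msg = "⚠️ Approaching limit, consider wrapping up"
    else if (pct >= 50) msg = "ℹ️ Heads up, halfway through budget"
    else                msg = "OK"
    printf "%.1f%% used: %s\n", pct, msg
  }'
}

budget_warning 1.05 2.00   # 52.5% used → halfway warning
```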
### Cost Estimation Before Large Tasks
Before any task involving large files or long conversations, estimate upfront:
```
📊 PRE-TASK ESTIMATE
====================
Task: Refactor entire codebase
Files to read: 23 files (~180,000 chars)
Estimated input: ~45,000 tokens
Expected output: ~8,000 tokens (code changes + explanation)
Model: claude-sonnet-4

Estimated cost: $0.255
  Input:  45,000 × $3.00/M  = $0.135
  Output:  8,000 × $15.00/M = $0.120

Proceed? This is ~13% of your $2.00 budget.
```
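The estimate's arithmetic, reproduced as a one-off awk computation. The 180,000-char input and 8,000-token output figures are the example's assumptions, not measured values:

```bash
# Chars → tokens (÷4), then claude-sonnet-4 rates ($3/M in, $15/M out).
awk 'BEGIN {
  in_tok  = 180000 / 4     # ~45,000 input tokens
  out_tok = 8000           # expected output tokens (assumed)
  cost = (in_tok * 3.00 + out_tok * 15.00) / 1000000
  printf "Estimated cost: $%.3f (%.0f%% of a $2.00 budget)\n", cost, 100 * cost / 2.00
}'
```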
## Output Format

### Quick status (inline, on request)

```
💰 This session: ~$0.05 (8,330 tokens in / 1,680 out) | Budget: $1.95 remaining
```
### Full report (on request or at session end)
```
╔══════════════════════════════════════╗
║ SESSION COST REPORT                  ║
╠══════════════════════════════════════╣
║ Model: claude-sonnet-4               ║
║ Duration: 23 minutes                 ║
╠══════════════════════════════════════╣
║ INPUT TOKENS                         ║
║   Turns: 12                          ║
║   Total tokens: 42,840               ║
║   Cost: $0.1285                      ║
╠══════════════════════════════════════╣
║ OUTPUT TOKENS                        ║
║   Total tokens: 8,920                ║
║   Cost: $0.1338                      ║
╠══════════════════════════════════════╣
║ TOTAL COST: $0.2623                  ║
║ Budget used: 13.1% of $2.00          ║
║ Remaining: $1.74                     ║
╚══════════════════════════════════════╝
```
### Multi-Provider Session

If a session spans multiple models or providers:

```
MULTI-MODEL SESSION SUMMARY
===========================
gpt-4o         → 12,000 in /  2,400 out → $0.054
claude-haiku-4 → 45,000 in /  8,000 out → $0.068
gemini-flash   →  8,000 in /  1,200 out → $0.001
────────────────────────────────────────────────
TOTAL          → 65,000 in / 11,600 out → $0.123
```
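The per-model totals can be aggregated mechanically. A sketch: the whitespace-separated input format (model, input rate, output rate, input tokens, output tokens) is an assumption made for this example:

```bash
# Sum tokens and cost across models; rates are per 1M tokens.
# Columns: model  in_rate  out_rate  in_tokens  out_tokens
awk '{
  in_tok  += $4; out_tok += $5
  total   += ($2 * $4 + $3 * $5) / 1000000
}
END { printf "TOTAL  %d in / %d out  $%.3f\n", in_tok, out_tok, total }' <<'EOF'
gpt-4o          2.50   10.00  12000  2400
claude-haiku-4  0.80    4.00  45000  8000
gemini-flash    0.075   0.30   8000  1200
EOF
```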
## Common Scenarios

### "How much did that last task cost?"

Calculate the tokens in the most recent exchange, apply the current model's rates, and report inline.
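One way to sketch that calculation from raw character counts (chars ÷ 4 ≈ tokens, claude-sonnet-4 rates; the 9,400/1,800 character counts are placeholders, not real measurements):

```bash
# Estimate the last exchange's cost from character counts.
awk -v in_chars=9400 -v out_chars=1800 'BEGIN {
  cost = ((in_chars / 4) * 3.00 + (out_chars / 4) * 15.00) / 1000000
  printf "Last exchange: ~$%.4f\n", cost
}'
```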
"Estimate the cost of indexing my repo"
find . -type f \( -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.md" \) \
| xargs wc -c 2>/dev/null | tail -1 \
| awk '{
tokens = $1/4
cost_sonnet = (tokens/1000000) * 3.00
cost_haiku = (tokens/1000000) * 0.80
cost_gpt4o = (tokens/1000000) * 2.50
printf "Repo size: ~%.0f tokens\n", tokens
printf "claude-sonnet-4: $%.4f\n", cost_sonnet
printf "claude-haiku-4: $%.4f\n", cost_haiku
printf "gpt-4o: $%.4f\n", cost_gpt4o
}'
"Set a $5 budget for this session"
Acknowledge the budget, start tracking, and proactively warn at 50%, 80%, and 95% thresholds. If the budget would be exceeded by a planned task, warn before proceeding.
## Notes

- Token counts are estimates unless the model API returns exact counts in its response metadata
- Output tokens are typically 3–10× more expensive per token than input, so optimize accordingly
- Caching (where available) can reduce input costs by 80–90% for repeated context
- Streaming responses don't change token costs; you pay for tokens regardless
- System prompts count as input tokens on every turn
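To make the caching note concrete, here is an illustrative calculation. The 50,000-token context, 20-turn session, and flat 90% cache discount are assumptions chosen for the example, not any provider's actual caching terms:

```bash
# A 50k-token system context re-sent over 20 turns, with vs. without
# a hypothetical 90% discount on cached input reads (sonnet-4 input rate).
awk 'BEGIN {
  turns = 20; ctx = 50000; rate = 3.00        # $ per 1M input tokens
  full   = turns * ctx * rate / 1000000
  cached = full * 0.10                        # 90% discount assumed
  printf "Uncached: $%.2f  Cached: $%.2f\n", full, cached
}'
```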
## Source

[View on GitHub](https://github.com/suryast/free-ai-agent-skills/blob/main/cost-tracker/SKILL.md)

## Overview
Cost Tracker monitors AI session spend across providers, estimates token usage when direct counts arenβt available, and warns you before you blow your budget. It maintains a running ledger of costs per session and supports major providers like OpenAI, Anthropic, and Google.
## How This Skill Works
When activated, Cost Tracker maintains a SESSION COST LEDGER with model and timing, logging per-turn input/output tokens and costs. If direct token counts arenβt exposed by the API, it uses quick estimates (e.g., 1 token β 4 characters) and file-size heuristics to project usage, then aggregates it against your budget and triggers warnings at defined thresholds.
## Quick Start
- Step 1: Activate cost-tracker in your agent and select a session or task
- Step 2: Set a budget and enable percentage-based warnings
- Step 3: Run your task; monitor the running ledger and adjust as needed
## Best Practices
- Always enable per-session tracking at the start of a conversation
- Record the model and budget thresholds in the ledger
- Use the quick-estimate rules when raw token counts arenβt available
- Cross-check estimated costs with provider pricing pages for accuracy
- Trigger budget warnings at 50%, 80%, and 95% thresholds
## Example Use Cases
- A support bot tracks spend per customer session to avoid surprises
- A coding assistant estimates tokens before running a large refactor task
- An analytics bot compares costs across models like gpt-4-turbo and claude for long prompts
- A tutoring bot monitors token usage to cap monthly API spend
- A team audits monthly LLM costs using the running cost ledger