# 💰 Cost Tracker

```bash
npx machina-cli add skill suryast/free-ai-agent-skills/cost-tracker --openclaw
```
Compatible with Claude Code, Codex CLI, Cursor, Windsurf, and any SKILL.md-compatible agent.
Track what your AI sessions actually cost: estimate token usage, tally cumulative spend, and get warned before you hit budget thresholds, across OpenAI, Anthropic, Google, and other major providers.
## Triggers
Activate this skill when:
- User asks "how much has this session cost?"
- User asks "what's my token usage?"
- User sets a session budget ("keep this under $2")
- User wants a cost estimate before a large task
- Cumulative session spend needs tracking
- "track my costs", "budget check", "token count", "how much am I spending"
## Pricing Reference (update as models change)
Use these rates to estimate costs. All prices are per 1M tokens (input / output).
### Anthropic
| Model | Input | Output |
|---|---|---|
| claude-opus-4 | $15.00 | $75.00 |
| claude-sonnet-4 | $3.00 | $15.00 |
| claude-haiku-4 | $0.80 | $4.00 |
| claude-opus-3 | $15.00 | $75.00 |
| claude-sonnet-3.5 | $3.00 | $15.00 |
| claude-haiku-3.5 | $0.80 | $4.00 |
### OpenAI
| Model | Input | Output |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| gpt-4 | $30.00 | $60.00 |
| gpt-3.5-turbo | $0.50 | $1.50 |
| o1 | $15.00 | $60.00 |
| o1-mini | $3.00 | $12.00 |
| o3-mini | $1.10 | $4.40 |
### Google

| Model | Input | Output |
|---|---|---|
| gemini-2.0-flash | $0.075 | $0.30 |
| gemini-2.0-pro | $1.25 | $5.00 |
| gemini-1.5-pro | $1.25 | $5.00 |
| gemini-1.5-flash | $0.075 | $0.30 |
### Other
| Model | Input | Output |
|---|---|---|
| mistral-large | $3.00 | $9.00 |
| mistral-small | $0.20 | $0.60 |
| llama-3.3-70b (Groq) | $0.59 | $0.79 |
| deepseek-r1 | $0.55 | $2.19 |
⚠️ Prices change frequently. Always verify at the provider's pricing page before making financial decisions.
## How It Works

### Session Tracking
When activated, maintain a running cost ledger in the conversation context:

```
SESSION COST LEDGER
===================
Model: claude-sonnet-4
Started: [timestamp]

Turn  | Input tok | Output tok | Cost
------|-----------|------------|--------
1     | 2,340     | 450        | $0.0138
2     | 4,120     | 890        | $0.0257
3     | 1,870     | 340        | $0.0107
------|-----------|------------|--------
Total | 8,330     | 1,680      | $0.0502

Budget: $2.00 | Used: $0.05 (2.5%) | Remaining: $1.95
```
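The per-row arithmetic can be sketched as a small shell helper. This is an illustration, not part of the skill; `turn_cost` is a hypothetical name, and the rates are claude-sonnet-4's from the pricing table above:

```bash
# Hypothetical helper: cost of one exchange at claude-sonnet-4 rates
# ($3.00 input / $15.00 output per 1M tokens, from the table above).
turn_cost() {
  awk -v i="$1" -v o="$2" \
    'BEGIN { printf "$%.4f\n", (i * 3.00 + o * 15.00) / 1000000 }'
}

turn_cost 10000 2000   # 10k in / 2k out → $0.0600
```

Swap in another model's rates from the table to compare the same exchange across providers.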
### Token Estimation
When you can't read token counts directly from the API response, estimate:
Quick estimates (rough, for planning):
- 1 token ≈ 4 characters of English text
- 1 token ≈ ¾ of a word
- Code is denser: 1 token ≈ 3 characters
- 1 page of plain text ≈ 500–750 tokens
- 1,000-word article ≈ 1,300–1,500 tokens
File size estimates:
- Small file (<50 lines): ~500–1,000 tokens
- Medium file (50–200 lines): ~1,000–4,000 tokens
- Large file (200–500 lines): ~4,000–10,000 tokens
- Full codebase context: count with `wc -c`, then divide by 4
Pre-task estimate commands:

```bash
# Estimate tokens in a file
wc -c myfile.py | awk '{printf "~%d tokens\n", $1/4}'

# Estimate tokens in an entire codebase
find . -name "*.py" -o -name "*.ts" -o -name "*.js" | xargs wc -c 2>/dev/null | tail -1 | awk '{printf "~%d tokens (input)\n", $1/4}'

# Count words as a rough proxy
wc -w myfile.txt | awk '{printf "~%d tokens\n", $1*1.3}'
```
### Budget Warnings

Issue warnings at these thresholds:
- 50% of budget: ℹ️ Heads up, halfway through budget
- 80% of budget: ⚠️ Approaching limit, consider wrapping up
- 95% of budget: 🚨 Budget nearly exhausted, stop or expand
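The threshold logic above can be sketched as a shell function. A minimal sketch, assuming spend and budget are tracked in dollars; `budget_warning` is a hypothetical name:

```bash
# Sketch: map (used, budget) in dollars to the warning tier above.
budget_warning() {
  awk -v used="$1" -v budget="$2" 'BEGIN {
    pct = 100 * used / budget
    if      (pct >= 95) msg = "🚨 Budget nearly exhausted, stop or expand"
    else if (pct >= 80) msg = "⚠️ Approaching limit, consider wrapping up"
    else if (pct >= 50) msg = "ℹ️ Heads up, halfway through budget"
    else                msg = "OK"
    printf "%.1f%% used: %s\n", pct, msg
  }'
}

budget_warning 1.05 2.00   # 52.5% used → halfway warning
```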
### Cost Estimation Before Large Tasks
Before any task involving large files or long conversations, estimate upfront:
```
📊 PRE-TASK ESTIMATE
====================
Task: Refactor entire codebase
Files to read: 23 files (~180,000 chars)
Estimated input: ~45,000 tokens
Expected output: ~8,000 tokens (code changes + explanation)
Model: claude-sonnet-4

Estimated cost: $0.255
  Input:  45,000 × $3.00/M  = $0.135
  Output:  8,000 × $15.00/M = $0.120

Proceed? This is ~13% of your $2.00 budget.
```
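The estimate's arithmetic, reproduced as a one-off awk computation. The 180,000-char input and 8,000-token output figures are the example's assumptions, not measured values:

```bash
# Chars → tokens (÷4), then claude-sonnet-4 rates ($3/M in, $15/M out).
awk 'BEGIN {
  in_tok  = 180000 / 4     # ~45,000 input tokens
  out_tok = 8000           # expected output tokens (assumed)
  cost = (in_tok * 3.00 + out_tok * 15.00) / 1000000
  printf "Estimated cost: $%.3f (%.0f%% of a $2.00 budget)\n", cost, 100 * cost / 2.00
}'
```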
## Output Format

### Quick status (inline, on request)

```
💰 This session: ~$0.05 (8,330 tokens in / 1,680 out) | Budget: $1.95 remaining
```
### Full report (on request or at session end)
```
╔══════════════════════════════════════╗
║ SESSION COST REPORT                  ║
╠══════════════════════════════════════╣
║ Model: claude-sonnet-4               ║
║ Duration: 23 minutes                 ║
╠══════════════════════════════════════╣
║ INPUT TOKENS                         ║
║   Turns: 12                          ║
║   Total tokens: 42,840               ║
║   Cost: $0.1285                      ║
╠══════════════════════════════════════╣
║ OUTPUT TOKENS                        ║
║   Total tokens: 8,920                ║
║   Cost: $0.1338                      ║
╠══════════════════════════════════════╣
║ TOTAL COST: $0.2623                  ║
║ Budget used: 13.1% of $2.00          ║
║ Remaining: $1.74                     ║
╚══════════════════════════════════════╝
```
### Multi-Provider Session

If a session spans multiple models or providers:

```
MULTI-MODEL SESSION SUMMARY
===========================
gpt-4o         → 12,000 in /  2,400 out → $0.054
claude-haiku-4 → 45,000 in /  8,000 out → $0.068
gemini-flash   →  8,000 in /  1,200 out → $0.001
────────────────────────────────────────────────
TOTAL          → 65,000 in / 11,600 out → $0.123
```
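The per-model totals can be aggregated mechanically. A sketch: the whitespace-separated input format (model, input rate, output rate, input tokens, output tokens) is an assumption made for this example:

```bash
# Sum tokens and cost across models; rates are per 1M tokens.
# Columns: model  in_rate  out_rate  in_tokens  out_tokens
awk '{
  in_tok  += $4; out_tok += $5
  total   += ($2 * $4 + $3 * $5) / 1000000
}
END { printf "TOTAL  %d in / %d out  $%.3f\n", in_tok, out_tok, total }' <<'EOF'
gpt-4o          2.50   10.00  12000  2400
claude-haiku-4  0.80    4.00  45000  8000
gemini-flash    0.075   0.30   8000  1200
EOF
```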
## Common Scenarios

### "How much did that last task cost?"

Calculate the tokens in the most recent exchange, apply the current model's rates, and report inline.
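One way to sketch that calculation from raw character counts (chars ÷ 4 ≈ tokens, claude-sonnet-4 rates; the 9,400/1,800 character counts are placeholders, not real measurements):

```bash
# Estimate the last exchange's cost from character counts.
awk -v in_chars=9400 -v out_chars=1800 'BEGIN {
  cost = ((in_chars / 4) * 3.00 + (out_chars / 4) * 15.00) / 1000000
  printf "Last exchange: ~$%.4f\n", cost
}'
```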
"Estimate the cost of indexing my repo"
find . -type f \( -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.md" \) \
| xargs wc -c 2>/dev/null | tail -1 \
| awk '{
tokens = $1/4
cost_sonnet = (tokens/1000000) * 3.00
cost_haiku = (tokens/1000000) * 0.80
cost_gpt4o = (tokens/1000000) * 2.50
printf "Repo size: ~%.0f tokens\n", tokens
printf "claude-sonnet-4: $%.4f\n", cost_sonnet
printf "claude-haiku-4: $%.4f\n", cost_haiku
printf "gpt-4o: $%.4f\n", cost_gpt4o
}'
"Set a $5 budget for this session"
Acknowledge the budget, start tracking, and proactively warn at 50%, 80%, and 95% thresholds. If the budget would be exceeded by a planned task, warn before proceeding.
## Notes

- Token counts are estimates unless the model API returns exact counts in its response metadata
- Output tokens are typically 3–10× more expensive per token than input, so optimize accordingly
- Caching (where available) can reduce input costs by 80–90% for repeated context
- Streaming responses don't change token costs; you pay for tokens regardless
- System prompts count as input tokens on every turn
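To make the caching note concrete, here is an illustrative calculation. The 50,000-token context, 20-turn session, and flat 90% cache discount are assumptions chosen for the example, not any provider's actual caching terms:

```bash
# A 50k-token system context re-sent over 20 turns, with vs. without
# a hypothetical 90% discount on cached input reads (sonnet-4 input rate).
awk 'BEGIN {
  turns = 20; ctx = 50000; rate = 3.00        # $ per 1M input tokens
  full   = turns * ctx * rate / 1000000
  cached = full * 0.10                        # 90% discount assumed
  printf "Uncached: $%.2f  Cached: $%.2f\n", full, cached
}'
```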
## Source

[View on GitHub](https://github.com/suryast/free-ai-agent-skills/blob/main/cost-tracker/SKILL.md)

## Overview
Cost Tracker monitors AI session spend across providers, estimates token usage when direct counts arenβt available, and warns you before you blow your budget. It maintains a running ledger of costs per session and supports major providers like OpenAI, Anthropic, and Google.
## How This Skill Works
When activated, Cost Tracker maintains a SESSION COST LEDGER with model and timing, logging per-turn input/output tokens and costs. If direct token counts arenβt exposed by the API, it uses quick estimates (e.g., 1 token β 4 characters) and file-size heuristics to project usage, then aggregates it against your budget and triggers warnings at defined thresholds.
## Quick Start
- Step 1: Activate cost-tracker in your agent and select a session or task
- Step 2: Set a budget and enable percentage-based warnings
- Step 3: Run your task; monitor the running ledger and adjust as needed
## Best Practices
- Always enable per-session tracking at the start of a conversation
- Record the model and budget thresholds in the ledger
- Use the quick-estimate rules when raw token counts arenβt available
- Cross-check estimated costs with provider pricing pages for accuracy
- Trigger budget warnings at 50%, 80%, and 95% thresholds
## Example Use Cases
- A support bot tracks spend per customer session to avoid surprises
- A coding assistant estimates tokens before running a large refactor task
- An analytics bot compares costs across models like gpt-4-turbo and claude for long prompts
- A tutoring bot monitors token usage to cap monthly API spend
- A team audits monthly LLM costs using the running cost ledger