# token-optimizer

> Reduce OpenClaw token usage and API costs by 85-95% through smart model routing, lazy context loading, heartbeat optimization, multi-provider support, and local model fallback. Supports Anthropic, OpenAI, Google, OpenRouter, and Ollama (local).

- Author: M Asif Rahman
- Repository: Asif2BD/OpenClaw-Token-Optimizer
- Version: 20260206234228
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/Asif2BD/OpenClaw-Token-Optimizer
- Web: https://mule.run/skillshub/@@Asif2BD/OpenClaw-Token-Optimizer~token-optimizer:20260206234228

---

---
name: token-optimizer
description: Reduce OpenClaw token usage and API costs by 85-95% through smart model routing, lazy context loading, heartbeat optimization, multi-provider support, and local model fallback. Supports Anthropic, OpenAI, Google, OpenRouter, and Ollama (local).
version: 1.2.0
homepage: https://github.com/Asif2BD/OpenClaw-Token-Optimizer
metadata: {"openclaw":{"emoji":"🪙","homepage":"https://github.com/Asif2BD/OpenClaw-Token-Optimizer","requires":{"bins":["python3"]}}}
---

# 🪙 Token Optimizer

**Reduce OpenClaw token usage and API costs by 85-95%**

## One-Line Installation

```bash
git clone https://github.com/Asif2BD/OpenClaw-Token-Optimizer.git ~/.openclaw/skills/token-optimizer
```

**That's it!** The skill is now available. Tell your agent:
> "I have the token-optimizer skill installed. Use it to optimize my token usage."

Or manually run the scripts to start saving immediately.

---

## What This Skill Does

| Feature | Savings | How It Works |
|---------|---------|---------|
| **Context Optimization** | 70-90% | Loads only needed files, not everything |
| **Model Routing** | 60-98% | Uses cheap models for simple tasks |
| **Heartbeat Optimization** | 90-95% | Smart intervals, quiet hours |
| **Multi-Provider** | Variable | Falls back to cheaper providers |
| **Local Fallback** | 100% | Zero cost when cloud APIs fail |

## Quick Start Commands

### 1. Generate Optimized AGENTS.md (Biggest Win!)
```bash
python3 ~/.openclaw/skills/token-optimizer/scripts/context_optimizer.py generate-agents
# Review AGENTS.md.optimized and replace your current AGENTS.md
```

### 2. Route Tasks to Appropriate Models
```bash
# Simple greeting → Use cheap model (Haiku/Nano/Flash)
python3 ~/.openclaw/skills/token-optimizer/scripts/model_router.py "thanks!"

# Complex task → Use smart model (Opus/GPT-4.1/Pro)  
python3 ~/.openclaw/skills/token-optimizer/scripts/model_router.py "design a microservices architecture"
```

### 3. Install Optimized Heartbeat
```bash
cp ~/.openclaw/skills/token-optimizer/assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
```

### 4. Check Token Budget
```bash
python3 ~/.openclaw/skills/token-optimizer/scripts/token_tracker.py check
```

---

## Scripts Reference

### context_optimizer.py

Recommends minimal context files based on prompt complexity.

```bash
# Recommend context for a prompt
context_optimizer.py recommend "hi"
# → Load only: SOUL.md, IDENTITY.md (savings: ~80%)

context_optimizer.py recommend "analyze our codebase"
# → Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY.md (savings: ~30%)

# Generate optimized AGENTS.md
context_optimizer.py generate-agents
# Creates AGENTS.md.optimized with lazy loading instructions

# View usage statistics
context_optimizer.py stats
```
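
The recommendation step can be pictured as a lookup from a context level to a file bundle. The following is a minimal sketch, not the script's actual logic: the bundles come from the example output above, while the `"full"` level name, the `files` key, the function name `sketch_recommend`, and the word-count heuristic are all assumptions made for illustration.

```python
from typing import Dict, List

# Bundles taken from the example output above. The level name "full" is an
# assumption made for this sketch.
CONTEXT_BUNDLES: Dict[str, List[str]] = {
    "minimal": ["SOUL.md", "IDENTITY.md"],
    "full": ["SOUL.md", "IDENTITY.md", "MEMORY.md", "memory/TODAY.md"],
}


def sketch_recommend(prompt: str, minimal_max_words: int = 2) -> dict:
    """Pick a context bundle with a crude, hypothetical word-count heuristic."""
    level = "minimal" if len(prompt.split()) <= minimal_max_words else "full"
    return {"context_level": level, "files": CONTEXT_BUNDLES[level]}


print(sketch_recommend("hi"))
print(sketch_recommend("analyze our codebase"))
```

The real script presumably uses richer signals than word count; the point is only that short conversational prompts do not need memory files loaded.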

### model_router.py

Routes tasks to appropriate model tiers. Supports multiple providers.

```bash
# Auto-detect provider and route
model_router.py "read the config file"
# → cheap tier (Haiku/Nano/Flash)

model_router.py "write a Python function"
# → balanced tier (Sonnet/Mini/Flash)

model_router.py "design system architecture"
# → smart tier (Opus/GPT-4.1/Pro)

# Force specific provider
model_router.py "thanks" --provider openai
# → openai/gpt-4.1-nano

# Compare all providers
model_router.py compare

# List providers
model_router.py providers
```

**Supported Providers:**

| Provider | Cheap | Balanced | Smart |
|----------|-------|----------|-------|
| Anthropic | claude-haiku-4 | claude-sonnet-4-5 | claude-opus-4 |
| OpenAI | gpt-4.1-nano | gpt-4.1-mini | gpt-4.1 |
| Google | gemini-2.0-flash | gemini-2.5-flash | gemini-2.5-pro |
| OpenRouter | gemini-2.0-flash | claude-sonnet-4-5 | claude-opus-4 |
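
To make the table concrete, here is a small sketch of a tier lookup. The dictionary mirrors the table above, but its shape is an assumption: the real `PROVIDER_MODELS` in `scripts/model_router.py` may be structured differently, and `model_for` is a hypothetical helper.

```python
# Hypothetical mapping that mirrors the provider table above.
PROVIDER_MODELS = {
    "anthropic": {"cheap": "claude-haiku-4", "balanced": "claude-sonnet-4-5", "smart": "claude-opus-4"},
    "openai": {"cheap": "gpt-4.1-nano", "balanced": "gpt-4.1-mini", "smart": "gpt-4.1"},
    "google": {"cheap": "gemini-2.0-flash", "balanced": "gemini-2.5-flash", "smart": "gemini-2.5-pro"},
    "openrouter": {"cheap": "gemini-2.0-flash", "balanced": "claude-sonnet-4-5", "smart": "claude-opus-4"},
}


def model_for(provider: str, tier: str) -> str:
    """Return a 'provider/model' identifier for the given tier."""
    try:
        return f"{provider}/{PROVIDER_MODELS[provider][tier]}"
    except KeyError as exc:
        raise ValueError(f"unknown provider or tier: {provider}/{tier}") from exc


print(model_for("openai", "cheap"))
```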

### heartbeat_optimizer.py

Manages heartbeat check intervals with quiet hours.

```bash
# Plan which checks should run now
heartbeat_optimizer.py plan

# Check if specific type should run
heartbeat_optimizer.py check email
heartbeat_optimizer.py check calendar

# Record that a check was performed
heartbeat_optimizer.py record email

# Adjust interval (seconds)
heartbeat_optimizer.py interval email 7200  # 2 hours

# Reset all state
heartbeat_optimizer.py reset
```

**Default Intervals:**
- Email: 60 minutes
- Calendar: 2 hours
- Weather: 4 hours
- Social: 2 hours
- Monitoring: 30 minutes

**Quiet Hours:** 23:00-08:00 (skips non-urgent checks)
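
The interval and quiet-hours rules above can be sketched as a single decision function. This is an illustration under stated assumptions, not the script's code: the `urgent` flag, the function names, and the choice to run a check that has never run before are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default intervals from the list above, converted to seconds.
DEFAULT_INTERVALS = {
    "email": 3600,
    "calendar": 7200,
    "weather": 14400,
    "social": 7200,
    "monitoring": 1800,
}
QUIET_START_HOUR, QUIET_END_HOUR = 23, 8  # 23:00-08:00


def in_quiet_hours(now: datetime) -> bool:
    """The quiet window wraps past midnight, so test both sides of it."""
    return now.hour >= QUIET_START_HOUR or now.hour < QUIET_END_HOUR


def should_run(check: str, last_run: Optional[datetime], now: datetime,
               urgent: bool = False) -> bool:
    """Decide whether a check is due (assumed semantics, see lead-in)."""
    if in_quiet_hours(now) and not urgent:
        return False  # skip non-urgent checks during quiet hours
    if last_run is None:
        return True  # never run before: treat as due
    return now - last_run >= timedelta(seconds=DEFAULT_INTERVALS[check])


noon = datetime(2026, 2, 6, 12, 0)
print(should_run("email", datetime(2026, 2, 6, 11, 30), noon))
print(should_run("email", datetime(2026, 2, 6, 11, 0), noon))
```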

### token_tracker.py

Monitors daily token budget and usage.

```bash
# Check current usage
token_tracker.py check

# Get model suggestions for task type
token_tracker.py suggest general

# Reset daily tracking
token_tracker.py reset
```

**Status Levels:**
- `ok` — Below 80% of daily limit
- `warning` — 80-99% of daily limit
- `exceeded` — Over limit, switch to cheaper models
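
As a rough sketch, the three status levels map onto a simple threshold check. The function name is hypothetical, and treating usage of exactly 100% as `exceeded` is an assumption, since the list above does not say which side that boundary falls on.

```python
def budget_status(used_tokens: int, daily_limit: int) -> str:
    """Classify usage against the daily limit using the thresholds above."""
    if daily_limit <= 0:
        raise ValueError("daily_limit must be positive")
    ratio = used_tokens / daily_limit
    if ratio < 0.80:
        return "ok"
    if ratio < 1.0:
        return "warning"
    return "exceeded"  # assumption: reaching the limit counts as exceeded


print(budget_status(50_000, 100_000))
print(budget_status(85_000, 100_000))
print(budget_status(120_000, 100_000))
```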

---

## Configuration

### Environment Variables

Set your preferred provider's API key:

```bash
# Anthropic (default)
export ANTHROPIC_API_KEY="sk-ant-..."

# OpenAI
export OPENAI_API_KEY="sk-proj-..."

# Google
export GOOGLE_API_KEY="AIza..."

# OpenRouter (unified API)
export OPENROUTER_API_KEY="sk-or-v1-..."
```

The model router auto-detects which provider to use based on available keys.
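
Key-based detection can be sketched as follows. The priority order is an assumption (Anthropic first, because it is labeled the default above), and `detect_provider` is a hypothetical name rather than the router's actual function.

```python
import os
from typing import Mapping, Optional

# Assumed priority order; the environment variable names come from the
# block above.
PROVIDER_ENV_KEYS = [
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key is set and non-empty."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_KEYS:
        if env.get(var):
            return provider
    return None


print(detect_provider({"OPENAI_API_KEY": "sk-proj-example"}))
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test without touching real credentials.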

### Local Model Fallback (Ollama)

Set up zero-cost fallback when cloud APIs fail:

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:3b
```

Add to `~/.openclaw/config.json`:

```json
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434/v1",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [{"id": "qwen2.5:3b", "name": "Qwen 2.5 3B (Local)"}]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "fallbacks": ["ollama/qwen2.5:3b"]
      }
    }
  }
}
```

See [docs/LOCAL-FALLBACK.md](docs/LOCAL-FALLBACK.md) for complete setup guide.
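
If you already have a `config.json`, the Ollama block needs to be merged into it rather than pasted over it. The sketch below shows one way to do that with a recursive merge. It is only an illustration: `deep_merge` is a hypothetical helper, the `primary` key in the sample config is invented for the example, and lists (such as `fallbacks`) are replaced rather than combined.

```python
import json
from copy import deepcopy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a copy of base with patch merged in; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)  # scalars and lists are replaced
    return merged


# The Ollama settings from the JSON block above.
OLLAMA_PATCH = {
    "models": {"providers": {"ollama": {
        "baseUrl": "http://localhost:11434/v1",
        "apiKey": "ollama-local",
        "api": "openai-completions",
        "models": [{"id": "qwen2.5:3b", "name": "Qwen 2.5 3B (Local)"}],
    }}},
    "agents": {"defaults": {"model": {"fallbacks": ["ollama/qwen2.5:3b"]}}},
}

# Hypothetical existing config; "primary" is an invented key for illustration.
existing = {"agents": {"defaults": {"model": {"primary": "anthropic/claude-sonnet-4-5"}}}}
merged = deep_merge(existing, OLLAMA_PATCH)
print(json.dumps(merged, indent=2))
```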

### Customization

Edit patterns in `scripts/model_router.py`:
- `COMMUNICATION_PATTERNS` — Patterns that always use cheap tier
- `BACKGROUND_TASK_PATTERNS` — Heartbeat/cron patterns
- `ROUTING_RULES` — Task classification rules
- `PROVIDER_MODELS` — Model mappings per provider
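
For orientation, pattern-based routing of this kind usually amounts to a few regular-expression lists checked in priority order. The sketch below is not the contents of `model_router.py`: the patterns, the shape of `ROUTING_RULES`, the `classify` function, and the `balanced` default are all assumptions chosen to reproduce the examples earlier in this document.

```python
import re

# Illustrative patterns only; edit the real lists in scripts/model_router.py.
COMMUNICATION_PATTERNS = [r"^(hi|hello|hey|thanks|thank you|ok|okay)\b[\s!.]*$"]
ROUTING_RULES = [
    ("smart", [r"\b(design|architect\w*|refactor)\b"]),
    ("balanced", [r"\b(write|implement|fix|debug)\b"]),
    ("cheap", [r"\b(read|list|show|check)\b"]),
]


def classify(prompt: str, default: str = "balanced") -> str:
    """Return the tier of the first matching rule, checking chat patterns first."""
    text = prompt.strip().lower()
    if any(re.search(p, text) for p in COMMUNICATION_PATTERNS):
        return "cheap"  # greetings and acknowledgements never need a big model
    for tier, patterns in ROUTING_RULES:
        if any(re.search(p, text) for p in patterns):
            return tier
    return default


for prompt in ["thanks!", "read the config file", "write a Python function",
               "design system architecture"]:
    print(prompt, "->", classify(prompt))
```

Checking the most expensive tier first means a prompt that mentions both "read" and "design" is routed to the smart tier, which errs on the side of quality.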

---

## Integration Patterns

### Before Every Response

```python
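import sys
from pathlib import Path

# Make the skill's scripts importable (install path from the installation step)
sys.path.insert(0, str(Path.home() / ".openclaw/skills/token-optimizer/scripts"))

from context_optimizer import recommend_context_bundle
from model_router import route_task

# 1. Get context recommendation (user_prompt is the incoming message text)
rec = recommend_context_bundle(user_prompt)

# 2. Load only recommended files
#    (load_only is a placeholder for your agent's own file-loading hook)
if rec["context_level"] == "minimal":
    load_only(["SOUL.md", "IDENTITY.md"])
    # Skip everything else!

# 3. Get model recommendation
routing = route_task(user_prompt)

# 4. Use recommended model
model = routing["recommended_model"]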
```

### In HEARTBEAT.md

```bash
# Check if we should run any checks
result=$(python3 scripts/heartbeat_optimizer.py plan)
should_run=$(echo "$result" | jq -r '.should_run')

if [ "$should_run" = "false" ]; then
    echo "HEARTBEAT_OK"
    exit 0
fi

# Run only planned checks
# ...
```

### In Cronjobs

Always specify the cheapest model that can handle the task:

```bash
# Good: Use Haiku for routine tasks
cron add --schedule "0 * * * *" \
  --payload '{"kind":"agentTurn","message":"Check server health","model":"anthropic/claude-haiku-4"}' \
  --sessionTarget isolated

# Bad: Using Opus for simple checks (60x more expensive!)
```

---

## Expected Savings

### Why 85-95% Savings? (v1.2.0 Analysis)

**Combined effect is multiplicative:**
- Context reduction: ~78% (loads 22% of original)
- Model cost reduction: ~64% (pays 36% of original rate)
- Combined: 1 - (0.22 × 0.36) = **92% savings**
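
The multiplication above can be checked with a few lines of Python. `combined_savings` is a hypothetical helper written for this example; it assumes the reductions are independent, so the remaining cost fractions multiply.

```python
def combined_savings(*reductions: float) -> float:
    """Combine independent cost reductions by multiplying what remains."""
    remaining = 1.0
    for reduction in reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining


# Context reduction of 78% and model cost reduction of 64%, as listed above.
total = combined_savings(0.78, 0.64)
print(f"{total:.1%}")
```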

### Example: 100K tokens/day workload

| Strategy | Context | Model | Monthly Cost | Savings |
|----------|---------|-------|--------------|---------|
| Baseline (no optimization) | 50K | Sonnet | $9.00 | 0% |
| Context optimization only | 11K | Sonnet | $2.00 | 78% |
| Model routing only | 50K | Mixed | $3.20 | 64% |
| **Both (this skill)** | **11K** | **Mixed** | **$0.72** | **92%** |
| + Local fallback (offline) | Any | Local | $0.00 | **100%** |

### Cronjob Savings

Using Haiku instead of Opus for 10 daily cronjobs:
- Opus: 10 × 5K tokens × $15/MTok = $0.75/day = **$22.50/month**
- Haiku: 10 × 5K tokens × $0.25/MTok = $0.0125/day = **$0.38/month**
- **Savings: about $22/month per agent (98% reduction)**
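
The same arithmetic as a small script, useful for plugging in your own job counts. It assumes a 30-day month and the flat per-million-token prices quoted above; `monthly_cost` is a hypothetical helper, and real provider pricing distinguishes input from output tokens.

```python
def monthly_cost(jobs_per_day: int, tokens_per_job: int,
                 usd_per_mtok: float, days: int = 30) -> float:
    """Monthly cost in USD at a flat price per million tokens."""
    total_tokens = jobs_per_day * tokens_per_job * days
    return total_tokens * usd_per_mtok / 1_000_000


opus = monthly_cost(10, 5_000, 15.00)
haiku = monthly_cost(10, 5_000, 0.25)
reduction = (opus - haiku) / opus

print(f"Opus:  ${opus:.2f}/month")
print(f"Haiku: ${haiku:.3f}/month")
print(f"Saved: ${opus - haiku:.2f}/month ({reduction:.1%})")
```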

---

## Troubleshooting

**Scripts fail with "module not found"**
→ Ensure Python 3.7+ is installed. Scripts use stdlib only.

**State files not persisting**
→ Check `~/.openclaw/workspace/memory/` exists and is writable.

**Routing suggests wrong tier**
→ Customize patterns in `scripts/model_router.py`.

**Budget shows $0.00**
→ Token tracker needs manual usage recording or integration with session_status.

---

## Files Included

```
token-optimizer/
├── SKILL.md              # This file
├── README.md             # Quick start guide
├── CHANGELOG.md          # Version history
├── LICENSE               # MIT License
├── scripts/
│   ├── context_optimizer.py   # Context loading optimization
│   ├── model_router.py        # Multi-provider model routing
│   ├── heartbeat_optimizer.py # Heartbeat interval management
│   └── token_tracker.py       # Budget monitoring
├── assets/
│   ├── HEARTBEAT.template.md  # Drop-in heartbeat template
│   ├── cronjob-model-guide.md # Cronjob model selection guide
│   └── config-patches.json    # Advanced config examples
├── docs/
│   ├── LOCAL-FALLBACK.md      # Local model setup guide (NEW!)
│   └── RESEARCH-NOTES.md      # Research and methodology
└── references/
    └── PROVIDERS.md           # Provider comparison guide
```

---

## Requirements

- Python 3.7+ (stdlib only, no external dependencies)
- OpenClaw installation
- Write access to `~/.openclaw/workspace/memory/`

---

## Credits

Part of the **SuperSkills** collection for OpenClaw.

Created by:
- **Oracle** — Research, analysis, and documentation
- **Morpheus** — Code review and publication

---

*"The best token is the one you don't spend."* 🪙