# cerebras

> Fast LLM inference via Cerebras Cloud. Use as a "junior coder" for well-defined tasks like boilerplate generation, refactoring, test writing, documentation, and format conversions. Triggers when a task is clearly specified and needs fast execution rather than deep reasoning. GLM-4.7 at 1000 tok/s.

- Author: m2
- Repository: machine-machine/openclaw-cerebras-skill
- Version: 20260131071659
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/machine-machine/openclaw-cerebras-skill
- Web: https://mule.run/skillshub/@@machine-machine/openclaw-cerebras-skill~cerebras:20260131071659

---

---
name: cerebras
description: Fast LLM inference via Cerebras Cloud. Use as a "junior coder" for well-defined tasks like boilerplate generation, refactoring, test writing, documentation, and format conversions. Triggers when a task is clearly specified and needs fast execution rather than deep reasoning. GLM-4.7 at 1000 tok/s.
---

# Cerebras Skill

Fast inference workhorse for well-defined tasks.

## When to Use

✅ **Good fit:**
- Boilerplate code generation
- Simple refactoring
- Test case writing
- Documentation generation
- Format conversions (JSON↔YAML, etc.)
- Code translation between languages
- Regex/pattern generation
- Simple CRUD operations

❌ **Not ideal for:**
- Complex architectural decisions
- Ambiguous requirements
- Multi-step reasoning chains
- Security-sensitive code review

## Quick Start

```bash
# Simple completion
python3 scripts/cerebras_client.py complete "Write a Python function to reverse a string"

# Code task with context
python3 scripts/cerebras_client.py code "Add error handling" --context "$(cat myfile.py)"

# Chat mode
python3 scripts/cerebras_client.py chat "Explain this code" --context "$(cat myfile.py)"
```

## Python API

```python
import asyncio

from scripts.cerebras_client import CerebrasClient


async def main():
    client = CerebrasClient()

    # Simple completion
    result = await client.complete("Write pytest tests for a calculator class")
    print(result)

    # With system prompt
    result = await client.chat(
        messages=[{"role": "user", "content": "Convert this to TypeScript"}],
        system="You are a code translator. Output only code, no explanations.",
    )
    print(result)

    # Streaming: print chunks as they arrive
    async for chunk in client.stream("Write a long function..."):
        print(chunk, end="")


# The client methods are coroutines, so they must run inside an event loop
asyncio.run(main())
```

## Models

| Model | Speed | Best For |
|-------|-------|----------|
| `glm-4.7` | 1000 tok/s | Coding tasks, agents |
| `llama-3.3-70b` | 2000 tok/s | General tasks |
| `qwen-3-32b` | 2500 tok/s | Fast general tasks |

Default: `glm-4.7` (best coding performance)

## Configuration

```bash
export CEREBRAS_API_KEY="your-api-key"
```

Or store it in `~/.config/cerebras/config`:

```
CEREBRAS_API_KEY=your-api-key
CEREBRAS_MODEL=glm-4.7
```

## Integration with m2

Use Cerebras as a junior coder in your workflow:

```python
# Illustrative pseudocode: `claude`, `cerebras`, and `save_file` stand in for your own objects.

# Senior (Claude) decides what to do
plan = await claude.think("How should we refactor this module?")

# Junior (Cerebras) executes each well-defined task
for task in plan.tasks:
    result = await cerebras.complete(task.prompt)
    await save_file(task.path, result)
```

## References

- [API Details](references/api.md)
- [Prompt Templates](references/templates.md)
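
## Configuration Lookup Sketch

The Configuration section names two places a setting can live: an environment variable or a `KEY=value` line in `~/.config/cerebras/config`. The sketch below shows one way a client might resolve them. It is not taken from `scripts/cerebras_client.py`: the helper names `parse_config` and `resolve_setting` are hypothetical, and the precedence order (environment first, then file, then default) is an assumption rather than documented behavior.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Path taken from the Configuration section; the helpers below are hypothetical.
DEFAULT_CONFIG_PATH = Path.home() / ".config" / "cerebras" / "config"


def parse_config(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate values wrapped in single or double quotes.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def resolve_setting(
    name: str,
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = DEFAULT_CONFIG_PATH,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the value from the environment, else the config file, else the default."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value  # Assumed precedence: environment wins over the file.
    if config_path.is_file():
        file_value = parse_config(config_path.read_text(encoding="utf-8")).get(name)
        if file_value:
            return file_value
    return default


if __name__ == "__main__":
    # Falls back to the documented default model when nothing is configured.
    print(resolve_setting("CEREBRAS_MODEL", default="glm-4.7"))
```

Passing `env` and `config_path` as parameters keeps the lookup easy to exercise without touching the real environment or home directory.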