# model-configuration

> SDK/API patterns for configuring LLM models on Letta agents. Use when setting model handles, adjusting temperature/tokens, configuring provider-specific settings (reasoning, extended thinking), or setting up custom endpoints.

- Author: Cameron
- Repository: letta-ai/skills
- Version: 20260121103813
- Stars: 49
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/letta-ai/skills
- Web: https://mule.run/skillshub/@@letta-ai/skills~model-configuration:20260121103813

---

---
name: model-configuration
description: SDK/API patterns for configuring LLM models on Letta agents. Use when setting model handles, adjusting temperature/tokens, configuring provider-specific settings (reasoning, extended thinking), or setting up custom endpoints.
license: MIT
---

# Letta Model Configuration

Patterns for configuring LLM models on Letta agents via SDK/API. Covers model handles, settings, provider-specific configuration, and custom endpoints.

## When to Use This Skill

Use this skill when:
- Creating agents with specific model configurations
- Adjusting model settings (temperature, max tokens, context window)
- Configuring provider-specific features (OpenAI reasoning, Anthropic thinking)
- Setting up custom OpenAI-compatible endpoints
- Changing models on existing agents
- Configuring embedding models for self-hosted deployments

**Not covered here:** Model selection advice (which model to choose) - see `agent-development` skill's `references/model-recommendations.md`.

## Model Handles

Models use a `provider/model-name` format:

| Provider | Handle Prefix | Example |
|----------|---------------|---------|
| OpenAI | `openai/` | `openai/gpt-4o`, `openai/gpt-4o-mini` |
| Anthropic | `anthropic/` | `anthropic/claude-sonnet-4-5-20250929` |
| Google AI | `google_ai/` | `google_ai/gemini-2.0-flash` |
| Azure OpenAI | `azure/` | `azure/gpt-4o` |
| AWS Bedrock | `bedrock/` | `bedrock/anthropic.claude-3-5-sonnet` |
| Groq | `groq/` | `groq/llama-3.3-70b-versatile` |
| Together | `together/` | `together/meta-llama/Llama-3-70b` |
| OpenRouter | `openrouter/` | `openrouter/anthropic/claude-3.5-sonnet` |
| Ollama (local) | `ollama/` | `ollama/llama3.2` |

## Basic Model Configuration

### Python
```python
from letta_client import Letta

client = Letta(api_key="your-api-key")

agent = client.agents.create(
    model="openai/gpt-4o",
    model_settings={
        "provider_type": "openai",  # Required - must match model provider
        "temperature": 0.7,
        "max_output_tokens": 4096,
    },
    context_window_limit=128000
)
```

### TypeScript
```typescript
import Letta from "@letta-ai/letta-client";

const client = new Letta({ apiKey: "your-api-key" });

const agent = await client.agents.create({
  model: "openai/gpt-4o",
  model_settings: {
    provider_type: "openai",  // Required - must match model provider
    temperature: 0.7,
    max_output_tokens: 4096,
  },
  context_window_limit: 128000,
});
```

## Common Settings

| Setting | Type | Description |
|---------|------|-------------|
| `provider_type` | string | **Required.** Must match model provider (`openai`, `anthropic`, `google_ai`, etc.) |
| `temperature` | float | Controls randomness (0.0-2.0). Lower = more deterministic. |
| `max_output_tokens` | int | Maximum tokens in the response. |

## Context Window Limit

Set at agent level (not inside `model_settings`):

```python
agent = client.agents.create(
    model="anthropic/claude-sonnet-4-5-20250929",
    context_window_limit=200000  # Use 200K of Claude's context
)
```

**Important:**
- Must be <= model's maximum context size
- Default: 32,000 tokens if not specified
- Larger windows increase latency and may reduce reliability
- When context fills up, Letta automatically summarizes older messages

## Changing an Agent's Model

Update existing agents with `agents.update()`:

### Python
```python
# Change model only
client.agents.update(
    agent_id=agent.id,
    model="anthropic/claude-sonnet-4-5-20250929"
)

# Change model and settings
client.agents.update(
    agent_id=agent.id,
    model="openai/gpt-4o",
    model_settings={
        "provider_type": "openai",
        "temperature": 0.5
    },
    context_window_limit=64000
)
```

### TypeScript
```typescript
// Change model only
await client.agents.update(agent.id, {
  model: "anthropic/claude-sonnet-4-5-20250929",
});

// Change model and settings
await client.agents.update(agent.id, {
  model: "openai/gpt-4o",
  model_settings: {
    provider_type: "openai",
    temperature: 0.5,
  },
  context_window_limit: 64000,
});
```

**Note:** Agents retain memory and tools when changing models.

## Provider-Specific Settings

For OpenAI reasoning models and Anthropic extended thinking, see `references/provider-settings.md`.

## Custom Endpoints

For OpenAI-compatible endpoints (vLLM, LM Studio, LocalAI), see `references/custom-endpoints.md`.

## Embedding Models

Required for self-hosted deployments (Letta Cloud handles automatically):

```python
agent = client.agents.create(
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small"
)
```

Common embedding models:
- `openai/text-embedding-3-small` (recommended)
- `openai/text-embedding-3-large`
- `openai/text-embedding-ada-002`

## Anti-Hallucination Checklist

Before configuring models, verify:

- [ ] Model handle uses correct `provider/model-name` format
- [ ] `model_settings` includes required `provider_type` field
- [ ] `context_window_limit` is set at agent level, not in `model_settings`
- [ ] Provider-specific settings use correct nested structure (see references)
- [ ] For self-hosted: embedding model is specified
- [ ] Temperature is within valid range (0.0-2.0)

## Example Scripts

See `scripts/` for runnable examples:
- `scripts/basic_config.py` - Basic model configuration
- `scripts/basic_config.ts` - TypeScript equivalent
- `scripts/change_model.py` - Changing models on existing agents
- `scripts/provider_specific.py` - OpenAI reasoning, Anthropic thinking