# elevenlabs-tts

> Generate audio tracks from text using ElevenLabs eleven_v3 model. Presents available voices, allows language selection, and lets you review/modify text before generating. Requires ELEVENLABS_API_KEY environment variable.

- Author: eovidiu
- Repository: eovidiu/agents-skills
- Version: 20260204225927
- Stars: 2
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/eovidiu/agents-skills
- Web: https://mule.run/skillshub/@@eovidiu/agents-skills~elevenlabs-tts:20260204225927

---

---
name: elevenlabs-tts
description: Generate audio tracks from text using ElevenLabs eleven_v3 model. Presents available voices, allows language selection, and lets you review/modify text before generating. Requires ELEVENLABS_API_KEY environment variable.
---

# ElevenLabs Text-to-Speech

## Overview

This skill generates audio tracks from text using ElevenLabs' `eleven_v3` model, their most advanced speech synthesis model. It produces natural, lifelike speech with a wide emotional range and supports 70+ languages.

## Prerequisites

Before using this skill, ensure:
1. The `ELEVENLABS_API_KEY` environment variable is set
2. You have an ElevenLabs account with API access
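
The first prerequisite can be checked before any request is made. The following is a minimal sketch, not the script's actual code: `get_api_key` is a hypothetical helper name, and the error text simply mirrors the message shown under Troubleshooting.

```python
import os


def get_api_key(env=os.environ):
    """Return the ElevenLabs API key, or raise if it is missing or blank."""
    # get_api_key is a hypothetical helper; the real script may check differently.
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "No API key found. Set ELEVENLABS_API_KEY environment variable."
        )
    return key
```

Accepting the environment as a parameter keeps the check easy to exercise without touching the real process environment.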

## IMPORTANT: Interactive Workflow

When using this skill, you MUST follow this workflow using AskUserQuestion:

### Step 1: Fetch available voices
```bash
python scripts/elevenlabs_tts.py voices
```

### Step 2: Ask user to select a voice
Use AskUserQuestion to present voice options. Include 3-4 popular voices plus "Other".

### Step 3: Ask user to select a language
Use AskUserQuestion to present language options:
- English (en), Spanish (es), French (fr), German (de), etc.
- Include "Other" for custom ISO 639-1 codes

### Step 4: Get the text to convert
Use the text the user has already supplied, or ask them to provide it.

### Step 5: Show text for confirmation
Use AskUserQuestion to display the final text and ask user to confirm or edit.

### Step 6: Generate audio
Only after user confirms, call:
```bash
python scripts/elevenlabs_tts.py generate "TEXT" --voice VOICE_ID --language CODE --output file.mp3 --yes
```
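
If the command in Step 6 is launched from Python rather than typed into a shell, passing the arguments as a list avoids quoting problems when the confirmed text contains quotes or `$` characters. This is a sketch under assumptions: `build_generate_command` and `run_generate` are hypothetical names, and it assumes the script exits with a non-zero status on failure.

```python
import subprocess
import sys


def build_generate_command(text, voice_id, language, output_path):
    """Build the argument list for the generate command shown above."""
    return [
        sys.executable, "scripts/elevenlabs_tts.py", "generate", text,
        "--voice", voice_id,
        "--language", language,
        "--output", output_path,
        "--yes",
    ]


def run_generate(text, voice_id, language, output_path):
    """Run the script; check=True raises CalledProcessError on a non-zero exit."""
    # Assumption: the script signals failure through its exit status.
    return subprocess.run(
        build_generate_command(text, voice_id, language, output_path),
        check=True,
    )
```

Because no shell is involved, the text reaches the script exactly as the user confirmed it.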

## Script Commands

### List Voices
```bash
python scripts/elevenlabs_tts.py voices
```

### Generate Audio (with explicit options)
```bash
python scripts/elevenlabs_tts.py generate "Your text" \
  --voice VOICE_ID \
  --language CODE \
  --output file.mp3 \
  --yes
```

### Setup Dependencies
```bash
python scripts/elevenlabs_tts.py setup
```

## Core Capabilities

### 1. Voice Selection

The skill fetches all voices available to your account and displays them with:
- Voice name
- Voice ID
- Labels (gender, age, accent, use case)

Select by entering the number or voice ID.
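
To illustrate the selection described above, here is a sketch of a numbered menu and a resolver that accepts either the number or the voice ID. The record shape (`name`, `voice_id`, `labels`) is an assumption based on the fields listed above, and both function names are hypothetical; the script's real output format may differ.

```python
def format_voice_menu(voices):
    """Return numbered menu lines: index, name, voice ID, then labels."""
    lines = []
    for number, voice in enumerate(voices, start=1):
        labels = ", ".join(
            f"{key}: {value}"
            for key, value in sorted(voice.get("labels", {}).items())
        )
        lines.append(f"{number}. {voice['name']} ({voice['voice_id']}) [{labels}]")
    return lines


def pick_voice(voices, choice):
    """Resolve a menu number or a voice ID to a voice ID."""
    if choice.isdigit() and 1 <= int(choice) <= len(voices):
        return voices[int(choice) - 1]["voice_id"]
    if choice in {voice["voice_id"] for voice in voices}:
        return choice
    raise ValueError(f"Unknown voice: {choice}")
```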

### 2. Language Selection

Eleven v3 supports 70+ languages. Common languages are presented for quick selection:

| Code | Language |
|------|----------|
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| it | Italian |
| pt | Portuguese |
| zh | Chinese |
| ja | Japanese |
| ko | Korean |

Enter `0` to supply a custom ISO 639-1 code. See `references/language_codes.md` for the full list.
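
A custom code can be sanity-checked before it is passed to `--language`. The sketch below only verifies the two-letter shape of an ISO 639-1 code; it does not confirm that Eleven v3 supports the language. `normalize_language_code` is a hypothetical helper, and `COMMON_LANGUAGES` just restates the table above.

```python
import re

# Restates the quick-selection table above.
COMMON_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French",
    "de": "German", "it": "Italian", "pt": "Portuguese",
    "zh": "Chinese", "ja": "Japanese", "ko": "Korean",
}


def normalize_language_code(raw):
    """Lowercase and trim a code, rejecting anything that is not two letters."""
    code = raw.strip().lower()
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"Not a two-letter ISO 639-1 code: {raw!r}")
    return code
```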

### 3. Text Review

Before generating, use AskUserQuestion to display the text for final review. The user can:
- Confirm the text as-is
- Request modifications

This prevents wasted API calls on typos or last-minute changes.

### 4. Audio Generation

The script calls the ElevenLabs TTS API with:
- Model: `eleven_v3` (latest, most expressive)
- Output format: MP3 (44.1 kHz, 128 kbps)
- Your selected voice and language
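
The script's source is not reproduced here, so the following is only a sketch of how the settings above might map onto ElevenLabs' public text-to-speech HTTP endpoint. The use of the `language_code` body field and the `mp3_44100_128` output format value are assumptions about how the script passes these options, and `build_tts_request` and `synthesize` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(api_key, voice_id, text, language):
    """Build (but do not send) a POST request for one text-to-speech call."""
    # Assumption: MP3 at 44.1 kHz / 128 kbps is requested via output_format.
    query = urllib.parse.urlencode({"output_format": "mp3_44100_128"})
    url = f"{API_BASE}/{urllib.parse.quote(voice_id, safe='')}?{query}"
    payload = {"text": text, "model_id": "eleven_v3", "language_code": language}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(request, output_path):
    """Send the request and write the returned audio bytes to output_path."""
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as out:
        out.write(response.read())
```

Separating request construction from sending makes it possible to inspect the URL and payload without spending API credits.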

## Command Reference

```bash
# List voices
python scripts/elevenlabs_tts.py voices

# Generate with explicit options (use after collecting choices via AskUserQuestion)
#   --language: ISO 639-1 code (e.g., en, fr, de, ro)
#   --yes: skip stdin confirmation (already confirmed via AskUserQuestion)
python scripts/elevenlabs_tts.py generate "Text" \
  --voice VOICE_ID \
  --language CODE \
  --output file.mp3 \
  --yes

# Setup dependencies
python scripts/elevenlabs_tts.py setup
python scripts/elevenlabs_tts.py setup --force
```

## Audio Tags (Eleven v3 Feature)

Eleven v3 supports inline audio tags for expressive control:

```
[slowly] Back then... [chuckles] we had no phones.
[whispers] Just dirt roads and [coughs] big dreams.
[sad] Then it happened...
```

Include these tags in your text to control delivery.
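
During text review it can help to show which tags the text contains. The sketch below treats any bracketed span as a tag; it has no knowledge of which tags Eleven v3 actually recognizes, and both helper names are hypothetical.

```python
import re

# Any non-nested [bracketed] span is treated as an audio tag.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_audio_tags(text):
    """Return the tag names in the order they appear."""
    return TAG_PATTERN.findall(text)


def strip_audio_tags(text):
    """Remove tags and collapse leftover whitespace, e.g. for a plain preview."""
    return re.sub(r"\s+", " ", TAG_PATTERN.sub("", text)).strip()
```

For the first example line above, `find_audio_tags` returns `["slowly", "chuckles"]`.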

## Workflow Integration

When using this skill in a workflow:

1. **For articles/blog posts**: Generate audio narration
   ```bash
   python scripts/elevenlabs_tts.py generate "$(cat article.txt)" --voice VOICE_ID --language en --output narration.mp3 --yes
   ```

2. **For podcasts**: Generate intro/outro segments
   ```bash
   python scripts/elevenlabs_tts.py generate "[upbeat] Welcome to Tech Talk, your weekly dose of innovation!" --voice VOICE_ID --language en --output intro.mp3 --yes
   ```

3. **For multilingual content**: Generate in target language
   ```bash
   python scripts/elevenlabs_tts.py generate "Bienvenue sur notre site" --voice VOICE_ID --language fr --output french_welcome.mp3 --yes
   ```

## Resources

- `references/language_codes.md`: Complete list of 70+ supported language codes

## Troubleshooting

### API Key Not Found

```
Error: No API key found. Set ELEVENLABS_API_KEY environment variable.
```

Set your API key:
```bash
export ELEVENLABS_API_KEY="your-key-here"
```

### Voice Not Found

Use the `voices` command to list available voices, then copy the exact voice ID.

### Rate Limiting

If you encounter rate limits, wait before retrying. Consider upgrading your ElevenLabs plan for higher quotas.
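
"Wait before retrying" can be automated with exponential backoff. This is a generic sketch, not something the script is known to do: `RateLimitError` is a hypothetical exception that the caller would raise on detecting a rate-limit response (typically HTTP 429), and the attempt count and delays are arbitrary defaults.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by the caller on a rate-limit response."""


def with_backoff(call, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `call()`, retrying on RateLimitError with doubling delays."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between the four attempts are 1, 2, and 4 seconds.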

### Language Not Supported

Eleven v3 supports 70+ languages. If your language isn't working:
- Verify the ISO 639-1 code is correct (two-letter codes such as `en`, `fr`, `de`)
- Check `references/language_codes.md` for supported languages
- Some languages may require specific voice types