# skill-speaker > Run a local microphone↔speaker voice dialog using OpenAI Realtime audio models (Node.js). Uses the `mic` and `speaker` npm packages. Pass a detailed prompt that includes all dialog context + what must be learned from the user. - Author: Eugene-E0a80fd8080ff8e - Repository: Eugene-E0a80fd8080ff8e/codex-skills - Version: 20251224190955 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-07 - Source: https://github.com/Eugene-E0a80fd8080ff8e/codex-skills - Web: https://mule.run/skillshub/@@Eugene-E0a80fd8080ff8e/codex-skills~skill-speaker:20251224190955 --- --- name: skill-speaker description: Run a local microphone↔speaker voice dialog using OpenAI Realtime audio models (Node.js). Uses the `mic` and `speaker` npm packages. Pass a detailed prompt that includes all dialog context + what must be learned from the user. --- # Local Voice Dialog (mic ↔ OpenAI Realtime ↔ speaker) This skill runs a two-way voice conversation: - Your microphone streams audio to an OpenAI Realtime audio model. - The model streams audio back to your speakers. - Optional live transcripts are printed to stdout (useful for Codex to capture what was learned). ## Prompt requirement (important) During the call, the Realtime model does **not** have access to most of the context that Codex has (repo files, prior reasoning, etc). Put *everything the voice agent needs* into the startup prompt, including: - Relevant background/context for the conversation - What the agent must learn from the user (explicit questions or a checklist) - Any constraints (tone, length, allowed topics, how to summarize learned info) ## One-time setup 1. Install the skill into Codex: - Copy `skill-speaker/` into `$CODEX_HOME/skills` (default `~/.codex/skills/`) 2. Install Node deps: - `cd scripts && npm install` 3. Hardcode your OpenAI key: - Edit `scripts/config.js` and set `OPENAI_API_KEY` - Avoid committing the key to a public repo. ## Run - Prompt from an argument: - `node scripts/voice-dialog.js --prompt "You are a voice interviewer. Context: ... Learn: ..."` - Prompt from a file (best for long prompts): - `node scripts/voice-dialog.js --prompt-file /path/to/prompt.txt` - Prompt from stdin: - `cat /path/to/prompt.txt | node scripts/voice-dialog.js --prompt-stdin` - Test microphone level (no OpenAI connection): - `node scripts/voice-dialog.js --mictest` While running: - Press `q` or `space` to end the conversation and print `RESULT_JSON_*`. - Saying “goodbye” (or similar) should cause the model to call `end_conversation` and end automatically. ## Notes / troubleshooting - Requires network access (OpenAI Realtime). - In network-restricted Codex runs, request approval to allow the connection. - Uses the modern Realtime WebSocket endpoint (`wss://api.openai.com/v1/realtime`) via `openai/realtime/ws` (not the deprecated beta module path). - Use headphones to avoid speaker→mic feedback loops. - `mic` typically shells out to `arecord` (Linux) or `sox`; install those if startup fails. - If you see silence in `--mictest`, set the capture device (ALSA) with `--mic-device` (default is `default`), e.g. `--mic-device plughw:0,0`. Use `arecord -L` to list devices. - `speaker` is a native module; you may need build tooling for `npm install`. - If the model name changes or you lack access, edit `scripts/config.js` (`REALTIME_MODEL`).