# gettr-transcribe-summarize

> Download audio from a GETTR post (via HTML og:video), transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT), and summarize the transcript into bullet points and/or a timestamped outline. Use when given a GETTR post URL and asked to produce a transcript or summary.

- Author: purpose
- Repository: purpose168/moltbot-skills
- Version: 20260131094917
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-08
- Source: https://github.com/purpose168/moltbot-skills
- Web: https://mule.run/skillshub/@@purpose168/moltbot-skills~gettr-transcribe-summarize:20260131094917

---

---
name: gettr-transcribe-summarize
description: Download audio from a GETTR post (via HTML og:video), transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT), and summarize the transcript into bullet points and/or a timestamped outline. Use when given a GETTR post URL and asked to produce a transcript or summary.
homepage: https://gettr.com
metadata: {"clawdbot":{"emoji":"📺","requires":{"bins":["mlx_whisper","ffmpeg"]},"install":[{"id":"mlx-whisper","kind":"pip","package":"mlx-whisper","bins":["mlx_whisper"],"label":"Install mlx-whisper (pip)"},{"id":"ffmpeg","kind":"brew","formula":"ffmpeg","bins":["ffmpeg"],"label":"Install ffmpeg (brew)"}]}}
---

# Gettr Transcribe + Summarize (MLX Whisper)

## Quick start (single command)

Run the full pipeline (Steps 1–3) with one command:

```bash
bash scripts/run_pipeline.sh ""
```

To explicitly set the transcription language (recommended for non-English content):

```bash
bash scripts/run_pipeline.sh --language zh ""
```

Common language codes: `zh` (Chinese), `en` (English), `ja` (Japanese), `ko` (Korean), `es` (Spanish), `fr` (French), `de` (German), `ru` (Russian).

This outputs:

- `./out/gettr-transcribe-summarize//audio.wav`
- `./out/gettr-transcribe-summarize//audio.vtt`

Then proceed to Step 4 (Summarize) to generate the final deliverable.
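
The output paths above are keyed by the post slug, which is the last segment of the post URL path (for example, `p1abc2def` for `https://gettr.com/post/p1abc2def`). As a rough illustration of that convention, the sketch below derives the two Quick start paths from a post URL. The helper names `slug_from_url` and `quick_start_outputs` are hypothetical and are not taken from the bundled scripts, which may compute the slug differently.

```python
from pathlib import Path
from urllib.parse import urlparse

# Root of the output convention used by this skill.
OUT_ROOT = Path("out") / "gettr-transcribe-summarize"


def slug_from_url(post_url: str) -> str:
    """Return the last non-empty path segment of a GETTR URL (hypothetical helper)."""
    segments = [part for part in urlparse(post_url).path.split("/") if part]
    if not segments:
        raise ValueError(f"no slug found in URL: {post_url!r}")
    return segments[-1]


def quick_start_outputs(post_url: str) -> dict:
    """Map the two Quick start artifacts to their per-post paths."""
    out_dir = OUT_ROOT / slug_from_url(post_url)
    return {name: out_dir / name for name in ("audio.wav", "audio.vtt")}


if __name__ == "__main__":
    for path in quick_start_outputs("https://gettr.com/post/p1abc2def").values():
        print(path)
```

Filtering out empty path segments means a trailing slash on the URL does not produce an empty slug.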

---

## Workflow (GETTR URL → transcript → summary)

### Inputs to confirm

Ask for:

- GETTR post URL
- Output format: **bullets only** or **bullets + timestamped outline**
- Summary size: **short**, **medium** (default), or **detailed**
- Language (optional): if the video is non-English and auto-detection fails, ask for the language code (e.g., `zh` for Chinese)

Notes:

- This skill does **not** handle authentication-gated GETTR posts.
- This skill does **not** translate; outputs stay in the video's original language.
- If transcription quality is poor or mixed with English, re-run with an explicit `--language` flag.

### Prereqs (local)

- `mlx_whisper` installed and on PATH
- `ffmpeg` installed (recommended: `brew install ffmpeg`)

### Step 0 — Pick an output directory

Recommended convention: `./out/gettr-transcribe-summarize//`

Extract the slug from the GETTR post URL (e.g., `https://gettr.com/post/p1abc2def` → slug = `p1abc2def`).

Directory structure:

- `./out/gettr-transcribe-summarize//audio.wav`
- `./out/gettr-transcribe-summarize//audio.vtt`
- `./out/gettr-transcribe-summarize//summary.md`

### Step 1 — Extract the media URL and slug

Preferred: fetch the post HTML and read `og:video*`.

```bash
python3 scripts/extract_gettr_og_video.py ""
```

This prints the best candidate video URL (often an HLS `.m3u8`) and the post slug.

Extract the slug from the URL path (e.g., `/post/p1abc2def` → `p1abc2def`) to create the output directory.

**Important: Streaming URLs require browser extraction**

For streaming URLs (`gettr.com/streaming/`), the Python script may return a stale/invalid `og:video` URL that fails with HTTP 412. This is because GETTR dynamically generates signed stream URLs via JavaScript.

If the URL is a streaming link OR if the download fails with HTTP 412:

1. Open the streaming URL in a browser and wait for the page to fully load (JavaScript must execute)
2. Extract the `og:video` meta tag content from the rendered DOM:
   ```javascript
   document.querySelector('meta[property="og:video"]').getAttribute('content')
   ```
3. Use that fresh URL for the download step

The browser-extracted URL will have a valid signature and work with ffmpeg.

If extraction fails, ask the user to provide the `.m3u8`/MP4 URL directly (common if the post is private/gated or the HTML is dynamic).

### Step 2 — Download audio with ffmpeg

Extract audio-only (16kHz mono WAV) for faster and more stable transcription:

```bash
bash scripts/download_audio.sh "" "./out/gettr-transcribe-summarize//audio.wav"
```

This directly extracts audio without intermediate video, reducing disk I/O and processing time.

### Step 3 — Transcribe with MLX Whisper

Generate VTT output with timestamps:

```bash
mlx_whisper "./out/gettr-transcribe-summarize//audio.wav" \
  -f vtt \
  -o "./out/gettr-transcribe-summarize/" \
  --model mlx-community/whisper-large-v3-turbo \
  --condition-on-previous-text False \
  --word-timestamps True
```

To explicitly set the language (recommended when auto-detection fails):

```bash
mlx_whisper "./out/gettr-transcribe-summarize//audio.wav" \
  -f vtt \
  -o "./out/gettr-transcribe-summarize/" \
  --model mlx-community/whisper-large-v3-turbo \
  --condition-on-previous-text False \
  --word-timestamps True \
  --language zh
```

Flags explained:

- `-f vtt`: VTT format provides timestamps for building the outline.
- `--condition-on-previous-text False`: prevents hallucination errors from propagating across segments.
- `--word-timestamps True`: more precise timing for section boundaries.
- `--language `: explicit language code (e.g., `zh`, `en`, `ja`, `ko`). Use when auto-detection fails.

Notes:

- By default, language is auto-detected. For non-English content where detection fails, use `--language`.
- Common language codes: `zh` (Chinese), `en` (English), `ja` (Japanese), `ko` (Korean), `es` (Spanish), `fr` (French), `de` (German), `ru` (Russian).
- If too slow or memory-heavy, try smaller models: `mlx-community/whisper-medium` or `mlx-community/whisper-small`.
- If quality is poor, try the full model: `mlx-community/whisper-large-v3` (slower but more accurate).
- If `--word-timestamps` causes issues, omit it (the pipeline script handles this automatically).

### Step 4 — Summarize

Write the final deliverable to `./out/gettr-transcribe-summarize//summary.md`.

Pick a **summary size** (user-selectable):

- **Short:** 5–8 bullets; (if outline) 4–6 sections
- **Medium (default):** 8–20 bullets; (if outline) 6–15 sections
- **Detailed:** 20–40 bullets; (if outline) 15–30 sections

Include:

- **Bullets** (per size above)
- Optional **timestamped outline** (per size above)

Timestamped outline format (default heading style):

```
[00:00 - 02:15] Section heading
- 1–3 sub-bullets
```

When building the outline from VTT cues:

- Group adjacent cues into coherent sections.
- Use the start time of the first cue and end time of the last cue in the section.

## Bundled scripts

- `scripts/run_pipeline.sh`: full pipeline wrapper (Steps 1–3 in one command)
- `scripts/extract_gettr_og_video.py`: fetch GETTR HTML and extract `og:video*` URL + post slug (with retry/backoff)
- `scripts/download_audio.sh`: download/extract audio from HLS or MP4 URL to 16kHz mono WAV

### Error handling

- **Non-video posts**: The extraction script detects image/text posts and provides a helpful error message.
- **Network errors**: Automatic retry with exponential backoff (up to 3 attempts).
- **No audio track**: The download script validates output and reports if the source has no audio.
- **HTTP 412 errors**: The extracted `og:video` URL has an expired/invalid signature. Use browser extraction to get a fresh URL (see Step 1 and `references/troubleshooting.md`).
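
To make the "retry with exponential backoff (up to 3 attempts)" behavior concrete, here is a minimal sketch of the general pattern. It is not the code in `scripts/extract_gettr_og_video.py`: the function name `retry_with_backoff`, the 1-second base delay, and the choice to retry on any `OSError` are all assumptions made for illustration.

```python
import time


def retry_with_backoff(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on OSError wait base_delay * 2**n seconds, then retry.

    Re-raises the last error once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:  # includes urllib.error.URLError and ConnectionError
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = []

    def flaky_fetch():
        # Simulated network call: fails twice, then succeeds.
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionError("temporary failure")
        return "<html>...</html>"

    print(retry_with_backoff(flaky_fetch, base_delay=0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. Note that retrying is only useful for transient failures: an HTTP 412 caused by a stale signed URL will fail the same way on every attempt and still needs a fresh URL from the browser.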

## Troubleshooting

See `references/troubleshooting.md` for detailed solutions to common issues including:

- HTTP 412 errors (stale signed URLs)
- Extraction failures
- Download errors
- Transcription quality issues
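
As a supplement to Step 4, the rule for outline timestamps (start time of the first cue, end time of the last cue in a section) can be sketched as follows. This is an illustrative sketch, not part of the bundled scripts: it assumes standard WebVTT cue timings (`MM:SS.mmm` or `HH:MM:SS.mmm`), truncates milliseconds, and leaves the decision of which cues form a coherent section to the caller. All function names and the sample transcript are hypothetical.

```python
import re

# Cue timing line, e.g. "00:04.500 --> 02:15.250" or "00:00:04.500 --> 00:02:15.250".
CUE_TIMING = re.compile(
    r"((?:\d+:)?\d{2}:\d{2})\.\d{3}\s+-->\s+((?:\d+:)?\d{2}:\d{2})\.\d{3}"
)


def to_seconds(stamp):
    """Convert "MM:SS" or "HH:MM:SS" to whole seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


def parse_cues(vtt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = CUE_TIMING.match(line)
            if match:
                start, end = (to_seconds(stamp) for stamp in match.groups())
                cues.append((start, end, " ".join(lines[index + 1:]).strip()))
                break
    return cues


def format_stamp(seconds):
    """Format whole seconds as MM:SS (minutes are not wrapped into hours)."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def outline_heading(cues, first, last, heading):
    """Build "[start - end] heading" from cues[first] through cues[last]."""
    return f"[{format_stamp(cues[first][0])} - {format_stamp(cues[last][1])}] {heading}"


# Made-up sample transcript, for illustration only.
SAMPLE_VTT = """WEBVTT

00:00.000 --> 00:04.500
Welcome to the stream.

00:04.500 --> 02:15.250
Here is today's agenda.

02:15.250 --> 03:40.000
First topic.
"""

if __name__ == "__main__":
    cues = parse_cues(SAMPLE_VTT)
    print(outline_heading(cues, 0, 1, "Introduction"))  # [00:00 - 02:15] Introduction
    print(outline_heading(cues, 2, 2, "First topic"))   # [02:15 - 03:40] First topic
```

The section boundaries (here cue indices `0..1` and `2..2`) are supplied by whoever writes the summary; the code only handles the timestamp arithmetic and formatting.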