# daedalus-simple

> Quick website extraction and site generation. Use when user wants to crawl, extract, process, or migrate a website. Handles the full flow from URL to running local site.

- Author: Tyler Kim
- Repository: tylertaewook/dispatch
- Version: 20260201162450
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/tylertaewook/dispatch
- Web: https://mule.run/skillshub/@@tylertaewook/dispatch~daedalus-simple:20260201162450

---

---
name: daedalus-simple
description: Quick website extraction and site generation. Use when user wants to crawl, extract, process, or migrate a website. Handles the full flow from URL to running local site.
---

# Daedalus Simple

Extract any website and generate a local Next.js site in two steps.

---

## IMPORTANT: Always Run in Background

**Daedalus commands are long-running and MUST be run in background terminal sessions using the PTY tools.**

```bash
# CORRECT: Use pty_spawn for all daedalus commands
# (<url> and <session-id> are placeholders)
pty_spawn: command="daedalus", args=["crawl-and-extract", "--url", "<url>"], title="Daedalus Crawl"

# Then monitor progress with pty_read
pty_read: id="<session-id>", limit=50

# Check if still running
pty_list
```

**Why?**

- Crawling can take 10-30+ minutes depending on site size
- Extraction phases process pages one by one with LLM calls
- Running synchronously will time out and fail

**Workflow Pattern:**

1. `pty_spawn` the daedalus command
2. Periodically `pty_read` to check progress
3. Use `pty_read` with `pattern="error|ERROR"` to check for issues
4. Wait for the process to exit before starting the next phase

---

## When to Use

Invoke this skill when the user:

- Wants to "crawl" or "extract" a website
- Wants to "process" or "migrate" a site
- Provides a URL and wants to create a site from it
- Asks to "scrape" or "copy" a website's content

---

## Workflow

### Step 1: Crawl and Extract

Spawn the `crawl-and-extract` command in a background PTY session:

```bash
pty_spawn: command="daedalus", args=["crawl-and-extract", "--url", "<url>"], title="Daedalus Crawl"
```

Monitor progress periodically:

```bash
pty_read: id="<session-id>", limit=50
```

This command:

1. Crawls the website and saves raw HTML
2. Extracts site configuration (name, navigation, footer)
3. Discovers page templates automatically
4. Extracts structured content from all pages

**After completion, summarize results to the user:**

- Number of pages crawled
- Number of templates discovered
- Site name extracted
- Output location

Example summary:

> Extraction complete:
>
> - **Site**: Example Government Agency
> - **Pages**: 47 pages extracted
> - **Templates**: 5 templates discovered
> - **Output**: `/path/to/output-folder`

Then ask: **"Would you like me to generate a local site you can preview?"**

---

### Step 2: Generate Site (if user agrees)

Spawn the `generate-site` command in a background PTY session:

```bash
pty_spawn: command="daedalus", args=["generate-site"], title="Daedalus Generate"
```

Monitor until completion:

```bash
pty_read: id="<session-id>", limit=50
```

This command:

1. Uses AI to generate a Next.js site from extracted data
2. Creates a minimal, editorial-style design
3. Installs dependencies automatically
4. Starts a local dev server

**After completion, provide the clickable link:**

> Site is ready: [http://localhost:4000](http://localhost:4000)

---

## Command Reference

All commands should be run via `pty_spawn` to handle long-running processes:

| Command | Purpose |
| ------- | ------- |
| `pty_spawn: command="daedalus", args=["crawl-and-extract", "--url", "<url>"]` | Crawl site and extract all content |
| `pty_spawn: command="daedalus", args=["generate-site"]` | Generate Next.js site and start dev server |
| `pty_spawn: command="daedalus", args=["generate-site", "--no-server"]` | Generate files only, don't start server |
| `pty_spawn: command="daedalus", args=["generate-site", "--port", "<port>"]` | Use a specific starting port |

Use `pty_read` and `pty_list` to monitor progress.

---

## Example Conversation Flow

**User**: "Can you extract https://example.gov for me?"

**Agent**:

1. Set `DAEDALUS_OUTPUT` if needed
2. Spawn: `pty_spawn: command="daedalus", args=["crawl-and-extract", "--url", "https://example.gov"]`
3. Monitor progress with `pty_read` periodically
4. Wait for completion (check with `pty_list`)
5. Summarize: "Extracted 47 pages across 5 templates from Example Gov"
6. Ask: "Would you like me to generate a local preview site?"

**User**: "Yes"

**Agent**:

1. Spawn: `pty_spawn: command="daedalus", args=["generate-site"]`
2. Monitor with `pty_read` until the server starts
3. Respond: "Your site is ready: [http://localhost:4000](http://localhost:4000)"

---

## Output Structure

After both commands complete:

```
$DAEDALUS_OUTPUT/
├── raw/                 # Crawled HTML pages
├── site-config.json     # Extracted site configuration
├── templates.json       # Discovered templates
├── extracted/           # Structured JSON for each page
└── site/                # Generated Next.js site
    ├── app/
    ├── lib/
    ├── package.json
    └── node_modules/
```

---

## Troubleshooting

**Crawl returns 0 pages**: The site may block crawlers. Check that the URL is accessible.

**Generate-site fails**: Ensure extracted data exists. Run `crawl-and-extract` first.

**Port already in use**: Use `--port 5000` to try a different port.

**AI generation fails**: The command falls back to default templates automatically.
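
The "Generate-site fails" note can be turned into a quick pre-flight check before Step 2. The following is a minimal sketch, not part of Daedalus: the function name `missing_extraction_outputs` is hypothetical, and the list of required entries is simply read off the Output Structure listing, so whether `generate-site` needs exactly these files is an assumption.

```python
from pathlib import Path

# Entries taken from the Output Structure listing. Whether the CLI itself
# requires exactly these is an assumption made for this sketch.
REQUIRED_FILES = ("site-config.json", "templates.json")
REQUIRED_DIRS = ("raw", "extracted")


def missing_extraction_outputs(output_dir):
    """Return the expected extraction outputs that are absent or empty."""
    root = Path(output_dir)
    missing = []
    for name in REQUIRED_FILES:
        path = root / name
        # A zero-byte file is treated the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    for name in REQUIRED_DIRS:
        path = root / name
        # A directory with no entries is treated the same as a missing one.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name + "/")
    return missing
```

Calling `missing_extraction_outputs(os.environ["DAEDALUS_OUTPUT"])` and getting an empty list back suggests the extraction step left something for `generate-site` to work with; a non-empty list names what to look for before re-running `crawl-and-extract`.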
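
For the "Port already in use" case, it may help to find a free port before passing `--port`. This is a rough sketch using only the Python standard library; `first_free_port` is a hypothetical helper, the default of 4000 mirrors the URL shown earlier, and it says nothing about how Daedalus itself picks ports.

```python
import socket


def first_free_port(start=4000, attempts=20, host="127.0.0.1"):
    """Return the first bindable port in [start, start + attempts), else None."""
    # Cap the range at 65535, the highest valid TCP port.
    for port in range(start, min(start + attempts, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken or not permitted; try the next one
            return port
    return None
```

The returned number can then be used as the `--port` argument. Another process could still grab the port between the check and the server start, so treat the result as a hint, not a guarantee.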