# website-replicator > Clone websites locally for study and analysis. Use when users ask to replicate, clone, mirror, download, or save a website for studying its structure, design, CSS, JavaScript, or overall architecture. Also use when asked to analyze a website's code structure or create an offline copy of a site. - Author: Totó Busnello - Repository: totobusnello/Toto-Code - Version: 20260202135438 - Stars: 1 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/totobusnello/Toto-Code - Web: https://mule.run/skillshub/@@totobusnello/Toto-Code~website-replicator:20260202135438 --- --- name: website-replicator description: Clone websites locally for study and analysis. Use when users ask to replicate, clone, mirror, download, or save a website for studying its structure, design, CSS, JavaScript, or overall architecture. Also use when asked to analyze a website's code structure or create an offline copy of a site. --- # Website Replicator Clone websites locally to study their structure, styling, and implementation. Supports both public websites and login-protected sites. ## Available Scripts | Script | Use Case | |--------|----------| | `replicate_website.py` | Public websites (no login required) | | `authenticated_website_replicator.py` | Sites requiring authentication (login, MFA, SSO) | --- ## Basic Replicator (Public Sites) For websites that don't require login. ### Quick Start ```bash python3 ~/.claude/skills/website-replicator/scripts/replicate_website.py https://example.com -o ./output ``` ### Options | Argument | Default | Description | |----------|---------|-------------| | `url` | required | Target website URL | | `-o, --output` | `./replicated-site` | Output directory | | `-d, --depth` | 2 | Max crawl depth (0 = single page) | | `-p, --pages` | 50 | Max pages to download | ### Examples ```bash # Single page only python3 ~/.claude/skills/website-replicator/scripts/replicate_website.py https://example.com -d 0 # Deep crawl python3 ~/.claude/skills/website-replicator/scripts/replicate_website.py https://example.com -d 5 -p 200 ``` --- ## Authenticated Replicator (Login-Protected Sites) For websites requiring login, MFA, SSO, or other authentication. ### Quick Start **Interactive mode (recommended for first use):** ```bash python3 ~/.claude/skills/website-replicator/scripts/authenticated_website_replicator.py \ https://app.example.com/login \ https://app.example.com/dashboard \ --interactive ``` **Automated login with credentials:** ```bash python3 ~/.claude/skills/website-replicator/scripts/authenticated_website_replicator.py \ https://app.example.com/login \ https://app.example.com/dashboard \ --username user@email.com \ --password "mypassword" ``` ### Options | Option | Default | Description | |--------|---------|-------------| | `login_url` | required | URL of the login page | | `target_url` | required | URL to start replicating from after login | | `-o, --output` | `./replicated-site` | Output directory | | `-d, --depth` | 2 | Max crawl depth | | `-p, --pages` | 50 | Max pages to download | | `-u, --username` | - | Username/email for login | | `--password` | - | Password for login | | `-i, --interactive` | false | Interactive mode - manually complete login | | `--visible` | false | Show browser window during automated login | | `--use-playwright` | false | Use Playwright for all pages (slower, handles JS) | | `--wait` | 3 | Seconds to wait after login | ### Custom Selectors For non-standard login forms: | Option | Description | |--------|-------------| | `--username-selector` | CSS selector for username field | | `--password-selector` | CSS selector for password field | | `--submit-selector` | CSS selector for submit button | ### Examples **Login with MFA (Interactive):** ```bash python3 ~/.claude/skills/website-replicator/scripts/authenticated_website_replicator.py \ https://secure-app.com/signin \ https://secure-app.com/home \ --interactive ``` A browser window opens. Complete login including MFA, then press Enter. **Custom login form selectors:** ```bash python3 ~/.claude/skills/website-replicator/scripts/authenticated_website_replicator.py \ https://custom-app.com/auth \ https://custom-app.com/main \ -u admin --password admin123 \ --username-selector "#login-field" \ --password-selector "#pass-field" \ --submit-selector ".submit-btn" ``` **JavaScript-heavy SPA:** ```bash python3 ~/.claude/skills/website-replicator/scripts/authenticated_website_replicator.py \ https://spa-app.com/login \ https://spa-app.com/app \ --interactive --use-playwright -d 3 -p 100 ``` --- ## What Gets Downloaded - **HTML pages** - With rewritten links to local files - **CSS** - Stylesheets and their referenced assets (fonts, images) - **JavaScript** - External script files - **Images** - Including srcset variants - **Media** - Video/audio sources - **Favicons** - All icon variants ## Output Structure ``` output-dir/ ├── index.html # Entry point ├── css/ # Stylesheets ├── js/ # Scripts ├── images/ # Image assets ├── _manifest.txt # Download summary └── _cookies.json # Session cookies (authenticated only) ``` ## Workflow 1. Run the appropriate script with target URL(s) 2. For authenticated sites, complete login (auto or manual) 3. Open `output-dir/index.html` in browser to view 4. Examine source files to study implementation 5. Check `_manifest.txt` for download summary ## Troubleshooting - **Login fails?** Use `--interactive` and `--visible` to debug - **Missing content?** Try `--use-playwright` for JS-rendered pages - **Timeouts?** Increase `--wait` for slow-loading apps - **Wrong selectors?** Inspect the login page and use custom `--*-selector` options ## Limitations - Same-domain only (won't follow external links) - Some dynamic/SPA content may require `--use-playwright` - Does not handle CAPTCHA automatically (use interactive mode) - File downloads/binary attachments may not work