# tiktok-video-scraper

> Scrapes TikTok videos to extract content, author info, and engagement metrics (likes, comments, shares, favorites). Use when user wants to scrape TikTok videos, extract video data, or analyze TikTok content.

- Author: dingbohan89-coder
- Repository: dingbohan89-coder/dingbohan123
- Version: 20260105185550
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/dingbohan89-coder/dingbohan123
- Web: https://mule.run/skillshub/@@dingbohan89-coder/dingbohan123~tiktok-video-scraper:20260105185550

---

---
name: tiktok-video-scraper
description: Scrapes TikTok videos to extract content, author info, and engagement metrics (likes, comments, shares, favorites). Use when user wants to scrape TikTok videos, extract video data, or analyze TikTok content.
---

# TikTok Video Scraper

## Quick Start

Scrape a TikTok video by URL:

```bash
node scripts/scrape.js "https://www.tiktok.com/@username/video/VIDEO_ID"
```

Output is saved to: `results/tiktok_video_TIMESTAMP.json`

## First Time Setup

### Install Dependencies

```bash
cd tiktok-skill
npm install
```

### Login (Required)

```bash
node scripts/login.js
```

This opens a browser window. Complete the login process manually. Cookies are saved automatically for future use.

## How It Works

The scraper uses Playwright to automate a real browser:

1. Loads saved cookies (or prompts for login)
2. Opens the video URL in a browser
3. Waits for the page to fully load (20 seconds by default)
4. Extracts data from the page DOM and embedded JSON
5. Saves results to a JSON file
6. Updates cookies for next use

## Available Scripts

### scrape.js - Main Scraper

Scrapes a single video:

```bash
node scripts/scrape.js "VIDEO_URL"
```

**Extracts:**

- Title/Description
- Author (@username)
- Author URL
- Likes
- Comments
- Shares
- Favorites
- Hashtags
- View count (if available; usually empty, see Data Limitations)

**Output format:**

```json
{
  "url": "video_url",
  "timestamp": "2025-12-31T...",
  "data": {
    "title": "video_title",
    "description": "video_description",
    "author": "username",
    "authorUrl": "https://www.tiktok.com/@username",
    "likes": "1.3K",
    "comments": "11",
    "shares": "430",
    "favorites": "1064",
    "views": "",
    "hashtags": ["tag1", "tag2"]
  }
}
```

### get-stats.js - Statistics Only

Fetches engagement metrics only:

```bash
node scripts/get-stats.js "VIDEO_URL"
```

**Output:**

```json
{
  "url": "video_url",
  "timestamp": "2025-12-31T...",
  "stats": {
    "likes": "1.3K",
    "comments": "11",
    "shares": "430",
    "favorites": "1064",
    "views": ""
  }
}
```

## Data Limitations

**Platform Restrictions:**

- ❌ **View count** - Only visible to the video creator, cannot be extracted
- ✅ **Likes/Comments/Shares/Favorites** - Usually available
- ⚠️ **Hashtags** - Extracted when available in the description

**Access Issues:**

Some videos may be unavailable due to:

- Private account settings
- Regional restrictions
- Video removed
- Age restrictions

## Directory Structure

```
tiktok-skill/
├── SKILL.md              # This file
├── scripts/
│   ├── scrape.js         # Main scraper
│   ├── get-stats.js      # Stats extraction
│   └── login.js          # Login helper
├── results/              # Scraped data (auto-created)
└── tiktok_cookies.json   # Saved login state (auto-created)
```

## Troubleshooting

### Data extraction incomplete

**Possible causes:**

1. Login state expired
2. Video is private or restricted
3. Page structure changed

**Solutions:**

```bash
# Delete old cookies and re-login
rm tiktok_cookies.json
node scripts/login.js
```

### Browser doesn't open

Make sure Playwright Chromium is installed:

```bash
npx playwright install chromium
```

### Extraction fails or times out

1. Check that the video URL is accessible in a browser
2. Verify you're logged in to the correct account
3. Try waiting longer by increasing the timeout in `scripts/scrape.js`

## Examples

### Example 1: Scrape a Video

```bash
node scripts/scrape.js "https://www.tiktok.com/@goodluck0311/video/7523066743128968455"
```

**Result:** `results/tiktok_video_1767173479719.json`

### Example 2: Get Statistics Only

```bash
node scripts/get-stats.js "https://www.tiktok.com/@goodluck0311/video/7523066743128968455"
```

**Result:** Just the engagement metrics

### Example 3: Batch Scraping

```bash
# Read URLs from urls.txt (one per line) and scrape each in turn
while IFS= read -r url; do
  node scripts/scrape.js "$url"
  sleep 5  # pause between requests; adjust as needed
done < urls.txt
```

## Technical Notes

**Technology Stack:**

- Playwright - Browser automation
- Node.js - Runtime environment
- Chromium - Headless browser

**Why Playwright:**

- Simulates real user behavior
- Handles dynamic content and JavaScript rendering
- Works with logged-in sessions
- Reliable for complex web apps

**Data Extraction Methods:**

1. **Primary**: Extract from embedded JSON in page source (`__RENDER_DATA__`, `__UNIVERSAL_DATA_FOR_REHYDRATION__`)
2. **Fallback**: Parse DOM elements with `[data-e2e]` attributes
3. **Final**: Extract from meta tags

**Wait Times:**

- 20 seconds default for page load
- Gives JavaScript-rendered content time to load
- Can be adjusted in `scripts/scrape.js`

## Best Practices

1. **Always log in with a real account** - Some content requires login
2. **Respect rate limits** - Add delays between requests
3. **Check cookies periodically** - Re-login if scraping fails
4. **Verify URLs** - Make sure videos are public when sharing scripts
5. **Handle errors gracefully** - Check for extraction failures in results

## Related Skills

For Xiaohongshu scraping, see: `../xiaohongshu-skill/SKILL.md`
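
## Post-Processing Results

The `likes`, `comments`, `shares`, `favorites`, and `views` fields in the output format are strings, sometimes abbreviated (for example `"1.3K"`) and sometimes empty. For analysis it can help to convert them to numbers. The following is a minimal sketch, not part of the skill's scripts: `parseCount` and `normalizeStats` are hypothetical helper names, and the K/M/B suffix handling is an assumption about how counts are abbreviated on the page.

```javascript
// Hypothetical helpers (not part of scrape.js or get-stats.js).
// Assumption: counts are plain digits or use a K/M/B suffix, e.g. "1.3K".
const SUFFIX_MULTIPLIERS = { K: 1e3, M: 1e6, B: 1e9 };

// Convert a count string to a number; return null if it cannot be parsed.
function parseCount(text) {
  if (typeof text !== "string") return null;
  const cleaned = text.trim().replace(/,/g, "").toUpperCase();
  const match = cleaned.match(/^(\d+(?:\.\d+)?)([KMB])?$/);
  if (!match) return null; // empty string (e.g. views) or unexpected format
  const value = parseFloat(match[1]);
  const multiplier = match[2] ? SUFFIX_MULTIPLIERS[match[2]] : 1;
  return Math.round(value * multiplier);
}

// Apply parseCount to every field of a stats object.
function normalizeStats(stats) {
  const result = {};
  for (const [key, value] of Object.entries(stats)) {
    result[key] = parseCount(value);
  }
  return result;
}

// Sample values taken from the output format shown in this document.
const sample = {
  likes: "1.3K",
  comments: "11",
  shares: "430",
  favorites: "1064",
  views: "",
};
console.log(normalizeStats(sample));
```

Abbreviated values are already rounded on the page, so a parsed `"1.3K"` is an approximation rather than the exact count. Empty or unrecognized strings become `null` so that missing data is not mistaken for zero.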