# newsletter-events-research

> Research events from Instagram, web aggregators, and Facebook event URLs. Use when scraping event sources, downloading flyer images, or extracting event details.

- Author: Aniket Panjwani
- Repository: aniketpanjwani/local_media_tools
- Version: 20251226012256
- Stars: 39
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/aniketpanjwani/local_media_tools
- Web: https://mule.run/skillshub/@@aniketpanjwani/local_media_tools~newsletter-events-research:20251226012256

---

---
name: newsletter-events-research
description: Research events from Instagram, web aggregators, and Facebook event URLs. Use when scraping event sources, downloading flyer images, or extracting event details.
---

## How This Skill Works

This skill gathers raw event data from configured sources. It does NOT write newsletter content - use `newsletter-events-write` for that.

### Data Sources

1. **Instagram** - Via ScrapeCreators API (requires API key)
2. **Web Aggregators** - Via Firecrawl (requires API key)
3. **Facebook Events** - Pass event URLs directly (e.g., `https://facebook.com/events/123456`)

### Output

Research produces structured data saved to `~/.config/local-media-tools/data/`:

- `data/raw/instagram_.json` - Raw API responses
- `data/images/instagram//` - Downloaded flyer images
- `data/events.db` - SQLite database with profiles, posts, events, venues

### Key Principle

**Images are critical.** Many venues post event details only in flyer images, not captions. Always analyze downloaded images with Claude's vision.

**Image Download Requirement:** Instagram CDN URLs return 403 when accessed via WebFetch. Images MUST be downloaded using Python's `requests` library with proper User-Agent headers, then analyzed locally using the Read tool.

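As a rough illustration of the image download requirement above, the following is a minimal sketch, not the skill's actual implementation. The function name `download_flyer`, the `HEADERS` constant, and the specific User-Agent string are assumptions made for this example; the skill's own scripts may use different names and headers.

```python
from pathlib import Path

# Assumption: a generic browser-like User-Agent. The exact header value the
# skill's own tooling sends is not documented in this file.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}


def download_flyer(url, dest, session=None, timeout=30):
    """Download one image to `dest` and return the saved path (hypothetical helper)."""
    if session is None:
        # `requests` is a third-party package; it is imported lazily so a
        # stub session can be passed in when testing.
        import requests

        session = requests.Session()

    response = session.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()  # raise on 403 and other HTTP errors

    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)  # create the image folder if needed
    dest.write_bytes(response.content)  # save the raw image bytes
    return dest
```

Once the file is on disk, it can be opened locally with the Read tool for vision analysis, as described above.
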
## Use CLI Tools - Never curl

**NEVER use curl or raw API calls.** Always use the CLI tools provided:

**Instagram:**

```bash
# Scrape all configured accounts
uv run python scripts/cli_instagram.py scrape --all

# Scrape specific account
uv run python scripts/cli_instagram.py scrape --handle wayside_cider

# List posts from database
uv run python scripts/cli_instagram.py list-posts --handle wayside_cider

# Show database statistics
uv run python scripts/cli_instagram.py show-stats

# Classify posts (single or batch)
uv run python scripts/cli_instagram.py classify --post-id 123 --classification event --reason "Has future date"
uv run python scripts/cli_instagram.py classify --batch-json '[{"post_id": "123", "classification": "event", "reason": "..."}]'
```

The CLI tools ensure:

- Correct API parameters (`handle`, not `username`)
- Rate limiting (2 calls/second)
- Automatic retry on 429/5xx errors
- Proper database storage with FK relationships
- Raw responses saved to `~/.config/local-media-tools/data/raw/`

**Do NOT:**

- Use `curl` to call ScrapeCreators API directly
- Write raw SQL to insert data
- Guess API parameter names

What would you like to research?

1. **Instagram** - Scrape Instagram accounts for events
2. **Web Aggregators** - Scrape web event aggregator sites
3. **All configured sources** - Full research from all sources in config
4. **Facebook event URLs** - Pass specific event URLs to scrape

You can also paste Facebook event URLs directly:

- `https://facebook.com/events/123456`
- `https://facebook.com/events/789012`

**Wait for response before proceeding.**

| Response | Workflow |
|----------|----------|
| 1, "instagram", "ig" | `workflows/research-instagram.md` |
| 2, "web", "aggregator", "websites" | `workflows/research-web-aggregator.md` |
| 3, "all", "both", "full" | `workflows/research-all.md` |
| 4, "facebook", contains `facebook.com/events/` | `workflows/research-facebook.md` |

All domain knowledge in `references/`:

**APIs:** scrapecreators-api.md, facebook-scraper-api.md, firecrawl-api.md

**Detection:** event-detection.md

| Workflow | Purpose |
|----------|---------|
| research-instagram.md | Scrape Instagram, download images, extract events |
| research-facebook.md | Scrape individual Facebook event URLs |
| research-web-aggregator.md | Dispatcher for web scraping (calls scrape + extract) |
| research-web-scrape.md | Phase 1: Scrape pages, return JSON |
| research-web-extract.md | Phase 2: Extract events from JSON, save via CLI |
| research-all.md | Run all research workflows |

Research is complete when:

- [ ] CLI tool used to scrape accounts (not curl)
- [ ] Raw data saved to `~/.config/local-media-tools/data/raw/`
- [ ] Posts saved to database with profiles
- [ ] Posts classified as event/not_event/ambiguous
- [ ] Events extracted from classified posts
- [ ] Data ready for `newsletter-events-write` skill
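
The response-to-workflow routing table earlier in this skill can be read as a small lookup. The sketch below is only an illustration of that mapping: the names `WORKFLOWS` and `route` are hypothetical, and it assumes a reply is matched as a whole keyword after trimming and lowercasing, which the skill itself does not specify.

```python
# Keyword sets and workflow paths copied from the routing table.
WORKFLOWS = [
    ({"1", "instagram", "ig"}, "workflows/research-instagram.md"),
    ({"2", "web", "aggregator", "websites"}, "workflows/research-web-aggregator.md"),
    ({"3", "all", "both", "full"}, "workflows/research-all.md"),
    ({"4", "facebook"}, "workflows/research-facebook.md"),
]


def route(response):
    """Return the workflow file for a user reply, or None if nothing matches."""
    text = response.strip().lower()

    # A pasted Facebook event URL always selects the Facebook workflow.
    if "facebook.com/events/" in text:
        return "workflows/research-facebook.md"

    # Assumption: the whole reply must equal one of the listed keywords.
    for keywords, workflow in WORKFLOWS:
        if text in keywords:
            return workflow
    return None
```

A reply that matches nothing returns `None`, which corresponds to asking the question again rather than guessing a workflow.
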