# emerging-topic-scout > Monitor bioRxiv/medRxiv preprints and academic discussions to identify emerging research hotspots before they appear in mainstream journals - Author: Rowtion - Repository: aipoch/skills-collection - Version: 20260210095832 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-10 - Source: https://github.com/aipoch/skills-collection - Web: https://mule.run/skillshub/@@aipoch/skills-collection~emerging-topic-scout:20260210095832 --- --- name: emerging-topic-scout description: Monitor bioRxiv/medRxiv preprints and academic discussions to identify emerging research hotspots before they appear in mainstream journals version: 1.0.0 category: Research tags: [] author: AIPOCH license: MIT status: Draft risk_level: High skill_type: Hybrid (Tool/Script + Network/API) owner: AIPOCH reviewer: '' last_updated: '2026-02-06' --- # Emerging Topic Scout A real-time monitoring system for identifying "incubation period" research hotspots in biological and medical sciences before they are defined by mainstream journals. ## Overview This skill continuously monitors: - **bioRxiv**: Biology preprints via RSS/API - **medRxiv**: Medicine preprints via RSS/API - **Academic discussions**: Social media and forum mentions It uses trend analysis algorithms to detect sudden spikes in topic frequency, cross-platform mentions, and emerging keyword clusters. ## Installation ```bash cd /Users/z04030865/.openclaw/workspace/skills/emerging-topic-scout pip install -r scripts/requirements.txt ``` ## Usage ### Basic Scan ```bash python scripts/main.py --sources biorxiv medrxiv --days 7 --output json ``` ### Advanced Configuration ```bash python scripts/main.py \ --sources biorxiv medrxiv \ --keywords "CRISPR,gene editing,long COVID" \ --days 14 \ --min-score 0.7 \ --output markdown \ --notify ``` ## Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `--sources` | list | `biorxiv,medrxiv` | Data sources to monitor | | `--keywords` | string | (auto-detect) | Comma-separated keywords to track | | `--days` | int | `7` | Lookback period in days | | `--min-score` | float | `0.6` | Minimum trending score (0-1) | | `--max-topics` | int | `20` | Maximum topics to return | | `--output` | string | `markdown` | Output format: `json`, `markdown`, `csv` | | `--notify` | flag | `false` | Send notification for high-priority topics | | `--config` | path | `config.yaml` | Path to configuration file | ## Output Format ### JSON Output ```json { "scan_date": "2026-02-06T05:57:00Z", "sources": ["biorxiv", "medrxiv"], "hot_topics": [ { "topic": "gene editing therapy", "keywords": ["CRISPR", "base editing", "prime editing"], "trending_score": 0.89, "velocity": "rapid", "preprint_count": 34, "cross_platform_mentions": 127, "related_papers": [ { "title": "New CRISPR variant shows promise", "authors": ["Smith J.", "Lee K."], "doi": "10.1101/2026.01.15.xxxxx", "source": "biorxiv", "published": "2026-01-15", "abstract_summary": "..." } ], "emerging_since": "2026-01-20" } ], "summary": { "total_papers_analyzed": 1247, "new_topics_detected": 8, "high_priority_alerts": 2 } } ``` ### Markdown Output ```markdown # Emerging Topics Report - 2026-02-06 ## 🔥 High Priority Topics ### 1. Gene Editing Therapy (Score: 0.89) - **Keywords**: CRISPR, base editing, prime editing - **Growth Rate**: Rapid (+145% vs last week) - **Preprints**: 34 papers - **Cross-platform mentions**: 127 #### Key Papers 1. "New CRISPR variant shows promise" - Smith J. et al. - DOI: 10.1101/2026.01.15.xxxxx - Source: bioRxiv ``` ## Configuration File Create `config.yaml` for persistent settings: ```yaml sources: biorxiv: enabled: true rss_url: "https://www.biorxiv.org/rss/recent.rss" api_endpoint: "https://api.biorxiv.org/details/" medrxiv: enabled: true rss_url: "https://www.medrxiv.org/rss/recent.rss" api_endpoint: "https://api.medrxiv.org/details/" trending: min_papers_threshold: 5 velocity_window_days: 3 novelty_weight: 0.4 momentum_weight: 0.6 keywords: auto_detect: true custom_trackers: - "artificial intelligence" - "machine learning" - "single cell" - "spatial transcriptomics" output: default_format: markdown save_history: true history_path: "./data/history.json" notifications: enabled: false high_score_threshold: 0.8 ``` ## Trending Score Algorithm The trending score (0-1) is calculated using: ``` Score = (Novelty × 0.4) + (Momentum × 0.4) + (CrossRef × 0.2) Where: - Novelty: Inverse frequency of topic in historical data - Momentum: Rate of increase in mentions over velocity window - CrossRef: Mentions across multiple platforms ``` ## API Endpoints ### bioRxiv API - Base: `https://api.biorxiv.org/` - Details: `/details/[server]/[DOI]/[format]` - Publication: `/pub/[DOI]/[format]` ### medRxiv API - Same structure as bioRxiv ## Data Storage Historical data is stored in `data/history.json` for: - Trend comparison - Velocity calculation - Duplicate detection ## Examples ### Example 1: Quick Daily Scan ```bash python scripts/main.py --days 1 --output markdown ``` ### Example 2: Weekly Deep Analysis ```bash python scripts/main.py \ --days 7 \ --min-score 0.7 \ --max-topics 50 \ --output json \ > weekly_report.json ``` ### Example 3: Track Specific Research Area ```bash python scripts/main.py \ --keywords "Alzheimer,neurodegeneration,amyloid" \ --days 30 \ --min-score 0.5 ``` ## Troubleshooting ### Rate Limiting If you encounter rate limits, increase the `--delay` parameter (default: 1s between requests). ### Missing Papers Ensure RSS feeds are accessible. Some institutional firewalls may block preprint servers. ### Low Trending Scores For niche topics, lower `--min-score` threshold or increase `--days` for more data. ## References See `references/README.md` for: - API documentation links - Research papers on trend detection - Related tools and resources ## License MIT License - Part of OpenClaw Skills Collection ## Risk Assessment | Risk Indicator | Assessment | Level | |----------------|------------|-------| | Code Execution | Python scripts with tools | High | | Network Access | External API calls | High | | File System Access | Read/write data | Medium | | Instruction Tampering | Standard prompt guidelines | Low | | Data Exposure | Data handled securely | Medium | ## Security Checklist - [ ] No hardcoded credentials or API keys - [ ] No unauthorized file system access (../) - [ ] Output does not expose sensitive information - [ ] Prompt injection protections in place - [ ] API requests use HTTPS only - [ ] Input validated against allowed patterns - [ ] API timeout and retry mechanisms implemented - [ ] Output directory restricted to workspace - [ ] Script execution in sandboxed environment - [ ] Error messages sanitized (no internal paths exposed) - [ ] Dependencies audited - [ ] No exposure of internal service architecture ## Prerequisites ```bash # Python dependencies pip install -r requirements.txt ``` ## Evaluation Criteria ### Success Metrics - [ ] Successfully executes main functionality - [ ] Output meets quality standards - [ ] Handles edge cases gracefully - [ ] Performance is acceptable ### Test Cases 1. **Basic Functionality**: Standard input → Expected output 2. **Edge Case**: Invalid input → Graceful error handling 3. **Performance**: Large dataset → Acceptable processing time ## Lifecycle Status - **Current Stage**: Draft - **Next Review Date**: 2026-03-06 - **Known Issues**: None - **Planned Improvements**: - Performance optimization - Additional feature support