# debug > Diagnose and fix issues with running hackathon infrastructure. Use when services are unhealthy, users report problems, OAuth fails, video doesn't work, or maps don't load. Covers SSH access, Docker logs, TLS certificates, and network connectivity troubleshooting. - Author: Phil Ewels - Repository: nf-core/hackathon-infra - Version: 20260127081731 - Stars: 1 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/nf-core/hackathon-infra - Web: https://mule.run/skillshub/@@nf-core/hackathon-infra~debug:20260127081731 --- --- name: debug description: > Diagnose and fix issues with running hackathon infrastructure. Use when services are unhealthy, users report problems, OAuth fails, video doesn't work, or maps don't load. Covers SSH access, Docker logs, TLS certificates, and network connectivity troubleshooting. --- # Debug Hackathon Infrastructure Diagnose and fix issues with the running stack. **Reference:** [common-issues.md](common-issues.md) - Quick lookup table for common problems --- ## Quick Health Check Start here for any issue: ```bash ./scripts/status.sh ``` **Don't panic if unhealthy immediately after deploy** - services take 5-15 minutes to initialize. --- ## SSH Access ```bash ./scripts/ssh.sh wa # WorkAdventure ./scripts/ssh.sh lk # LiveKit ./scripts/ssh.sh turn # Coturn ./scripts/ssh.sh jitsi # Jitsi ``` **If SSH fails:** - Check `ssh-add -l` shows keys from 1Password - Verify instance is running in AWS Console - Check security group allows port 22 --- ## General Debugging Once SSH'd in: ### Check Initialization Status ```bash cloud-init status # Success: status: done # Problem: status: running (still initializing) or status: error ``` ### View Startup Logs ```bash cat /var/log/cloud-init-output.log # Look for errors near the end ``` ### Check Docker Containers ```bash docker ps # All containers should show "Up X minutes" ``` ### View Container Logs ```bash docker compose logs -f # All containers docker compose logs # Specific service ``` ### Check Resources ```bash df -h # Disk space (problem if >90%) free -h # Memory (problem if very low) ``` --- ## Service-Specific Issues ### WorkAdventure **OAuth shows "Access Denied" XML:** ```bash # Sync templates then restart ./scripts/sync-maps.sh ./scripts/ssh.sh wa cd /opt/workadventure && docker compose restart ``` **Containers keep restarting:** ```bash ./scripts/ssh.sh wa docker compose logs --tail 100 play docker compose logs --tail 100 back ``` **Map not loading:** ```bash aws s3 ls s3://nfcore-hackathon-maps/default/ --profile nf-core # If empty: ./scripts/sync-maps.sh ``` ### LiveKit **"OK" not returned:** ```bash ./scripts/ssh.sh lk curl -s http://localhost:7880 # Check local docker logs caddy # Check TLS ``` **TLS certificate errors:** ```bash docker restart caddy ``` ### Coturn **No relay candidates in Trickle ICE test:** Most common cause - certificate permissions: ```bash ./scripts/ssh.sh turn ls -la /etc/coturn/certs/ # Must be 644, if not: sudo chmod 644 /etc/coturn/certs/* docker restart coturn ``` ### Jitsi **Let's Encrypt failed:** ```bash ./scripts/ssh.sh jitsi sudo /usr/share/jitsi-meet/scripts/install-letsencrypt-cert.sh ``` **"service-unavailable" errors:** ```bash ./scripts/ssh.sh jitsi docker compose restart prosody sleep 10 docker compose restart jicofo docker compose restart jvb ``` --- ## Network Debugging ### DNS Not Resolving ```bash dig +short app.hackathon.nf-co.re dig +short livekit.hackathon.nf-co.re dig +short jitsi.hackathon.nf-co.re dig +short turn.hackathon.nf-co.re ``` ### Test Port Connectivity ```bash nc -zv livekit.hackathon.nf-co.re 443 nc -zv jitsi.hackathon.nf-co.re 443 nc -zuv turn.hackathon.nf-co.re 3478 # UDP ``` ### Check Security Groups ```bash aws ec2 describe-security-groups --profile nf-core --region eu-west-1 \ --filters "Name=tag:Name,Values=nfcore-hackathon-*" \ --query 'SecurityGroups[].[GroupName,IpPermissions]' --output json ``` --- ## Terraform State Issues ### Recovery from State Corruption If Terraform state doesn't match AWS reality: 1. **Don't panic** - AWS resources still exist 2. Compare state vs reality (commands below) 3. For resources in AWS but not state: `terraform import
` 4. For resources in state but not AWS: `terraform state rm
` 5. Run `terraform plan` to verify alignment **For complete manual cleanup:** See [teardown skill](../teardown/SKILL.md) and [manual-cleanup-scripts.md](../teardown/manual-cleanup-scripts.md) ### State Doesn't Match AWS ```bash # What Terraform thinks exists terraform state list # What actually exists aws ec2 describe-instances --profile nf-core --region eu-west-1 \ --filters "Name=tag:Name,Values=nfcore-hackathon-*" \ --query 'Reservations[].Instances[].[InstanceId,Tags[?Key==`Name`].Value|[0],State.Name]' \ --output table ``` **Resource in AWS but not state:** `terraform import
` **Resource in state but not AWS:** `terraform state rm
` ### State Lock Errors 1. Wait 15 minutes (auto-expire) 2. Check no other terraform process running 3. **NEVER force-unlock** without explicit user approval --- ## Recovery Actions ### Restart Single Service ```bash ./scripts/ssh.sh docker compose restart ``` ### Full Redeploy ```bash terraform destroy ./scripts/sync-maps.sh # CRITICAL: sync BEFORE apply terraform apply # Wait 10-15 minutes ./scripts/status.sh ``` **Why sync first?** WorkAdventure downloads OAuth templates from S3 during boot. Without syncing first, the sign-in page shows "Access Denied" XML instead of the login form. --- ## Browser-Side Debugging ### Check Console (F12) - JavaScript errors - Failed network requests - CORS errors ### Common Browser Issues | Issue | Cause | Fix | |-------|-------|-----| | WebSocket failed | Back service not reachable | Check `docker compose logs back` | | CORS errors | S3 bucket policy | Check bucket is public | | Access Denied for maps | Maps not synced | Run `./scripts/sync-maps.sh` | | Audio/video denied | HTTPS or permissions | Ensure HTTPS, grant permissions |