# troubleshoot

> Read-only diagnostics and troubleshooting for SignalRoom. Use when debugging issues, checking system health, analyzing logs, or verifying connections. This skill restricts modifications to prevent accidental changes during investigation.

- Author: Marco
- Repository: mmbianco78/signalroom
- Version: 20251221203513
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/mmbianco78/signalroom
- Web: https://mule.run/skillshub/@@mmbianco78/signalroom~troubleshoot:20251221203513

---

---
name: troubleshoot
description: Read-only diagnostics and troubleshooting for SignalRoom. Use when debugging issues, checking system health, analyzing logs, or verifying connections. This skill restricts modifications to prevent accidental changes during investigation.
allowed-tools: Read, Grep, Glob, Bash
---

# Troubleshooting & Diagnostics

## Quick Health Checks

### 1. Fly.io Worker Status

```bash
fly status
fly logs --app signalroom-worker
```

### 2. Temporal Connection

```bash
python scripts/test_temporal_connection.py
```

### 3. Supabase Connection

```bash
python -c "
from signalroom.common import settings
print(f'Host: {settings.supabase_db_host}')
print(f'Port: {settings.supabase_db_port}')
print(f'User: {settings.supabase_db_user}')
"
```

### 4. Recent Pipeline Runs

```sql
SELECT load_id, schema_name, status, inserted_at
FROM s3_exports._dlt_loads
ORDER BY inserted_at DESC
LIMIT 5;
```

## Common Error Patterns

### Database Errors

| Error | Cause | Check |
|-------|-------|-------|
| "password authentication failed" | Wrong user format | User should be `postgres.{project_ref}` |
| "connection refused" | Wrong host/port | Pooler: port 6543, Direct: port 5432 |
| "too many connections" | Connection leak | Use pooler, check for unclosed connections |
| "relation does not exist" | Table not created | Check schema name, run pipeline first |

### Temporal Errors

| Error | Cause | Check |
|-------|-------|-------|
| "No worker available" | Worker not running | `fly status`, `fly logs` |
| "Activity timed out" | Pipeline too slow | Check activity duration, add heartbeats |
| "RestrictedWorkflowAccessError" | Sandbox blocking imports | Use `UnsandboxedWorkflowRunner` |
| "asyncio.run() cannot be called" | Nested event loop | Use `await` directly in activities |

### Pipeline Errors

| Error | Cause | Check |
|-------|-------|-------|
| "Unknown source" | Source not registered | Check `SOURCES` dict in runner.py |
| "Primary key violation" | Duplicate data with merge | Check source data, primary key definition |
| "Column type mismatch" | Schema evolution conflict | Check dlt schema, may need table drop |

## Log Analysis

### Fly.io Logs

```bash
# Recent logs
fly logs

# Follow logs
fly logs -f

# Filter by level
fly logs | grep -i error
```

### Local Worker Logs

```bash
make logs-worker
```

### Structured Log Fields

```json
{
  "event": "pipeline_completed",
  "source": "everflow",
  "load_id": "1705312345",
  "row_counts": {"daily_stats": 523}
}
```

Search by event:

```bash
fly logs | grep "pipeline_failed"
fly logs | grep "activity_failed"
```

## Verification Commands

### Verify Environment

```bash
# Check required env vars are set
python -c "
from signalroom.common import settings
required = ['supabase_db_host', 'supabase_db_password', 'temporal_address']
for var in required:
    val = getattr(settings, var, None)
    status = '✓' if val else '✗'
    print(f'{status} {var}')
"
```

### Verify Imports

```bash
python -c "from signalroom.workers.main import main; print('OK')"
```

### Verify Temporal Activities

```bash
python -c "
from signalroom.temporal.activities import run_pipeline_activity
print('Activities import OK')
"
```

### Verify dlt Sources

```bash
python -c "
from signalroom.pipelines.runner import SOURCES
print('Registered sources:', list(SOURCES.keys()))
"
```

## Database Diagnostics

### Check Table Exists

```sql
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_name = 'daily_stats';
```

### Check Recent Data

```sql
-- Everflow
SELECT date, COUNT(*) as rows
FROM everflow.daily_stats
GROUP BY date
ORDER BY date DESC
LIMIT 7;

-- Redtrack
SELECT date, COUNT(*) as rows
FROM redtrack.daily_spend
GROUP BY date
ORDER BY date DESC
LIMIT 7;
```

### Check dlt Load History

```sql
SELECT load_id, inserted_at, status
FROM everflow._dlt_loads
ORDER BY inserted_at DESC
LIMIT 10;
```

## Temporal UI Diagnostics

**URL**: https://cloud.temporal.io/namespaces/signalroom-713.nzg5u/workflows

### Check Workflow Status

1. Open workflow by ID
2. Look at "Event History"
3. Find failed activity
4. Expand to see error details

### Check Pending Activities

1. Go to workflow detail
2. Look for "Pending Activities" section
3. Check if worker is processing

## Network Diagnostics

### DNS Resolution

```bash
nslookup aws-0-us-east-1.pooler.supabase.com
nslookup ap-northeast-1.aws.api.temporal.io
```

### Port Connectivity

```bash
nc -zv aws-0-us-east-1.pooler.supabase.com 6543
```

## Recovery Procedures

### Restart Fly.io Worker

```bash
fly apps restart signalroom-worker
```

### Clear Stuck Pipeline State

```bash
dlt pipeline {pipeline_name} drop-pending-packages
```

### Revert Recent Changes

```bash
git log --oneline -5
git revert
```

## When to Escalate

If you cannot resolve after:

1. Checking logs for specific error
2. Verifying connections
3. Testing locally
4. Reviewing recent changes

Document findings and escalate with:

- Exact error message
- Relevant log snippets
- What you've tried
- Timeline of when it started

## References

- **API Reference**: `docs/API_REFERENCE.md`: Live docs, auth, request/response examples
- **Source Details**: `docs/SOURCES.md`: Schema, queries, implementation notes
- **Data Patterns**: `docs/DATA_ORGANIZATION.md`: Client data structure
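
## Example: Error Pattern Lookup

The Common Error Patterns tables can be applied mechanically while reading logs. The sketch below is a hypothetical helper, not part of the SignalRoom codebase: `KNOWN_ERRORS` and `triage` are illustrative names, and the entries are a subset copied from the Database and Temporal tables. It does plain case-insensitive substring matching, so messages that embed an identifier mid-phrase (for example a quoted relation name) would need a looser pattern.

```python
from typing import Optional, Tuple

# Hypothetical lookup table (assumption: not part of SignalRoom).
# Keys are lowercased error fragments; values are (cause, check) pairs
# copied from the Common Error Patterns tables.
KNOWN_ERRORS = {
    "password authentication failed": (
        "Wrong user format",
        "User should be postgres.{project_ref}",
    ),
    "connection refused": (
        "Wrong host/port",
        "Pooler: port 6543, Direct: port 5432",
    ),
    "too many connections": (
        "Connection leak",
        "Use pooler, check for unclosed connections",
    ),
    "no worker available": (
        "Worker not running",
        "fly status, fly logs",
    ),
    "activity timed out": (
        "Pipeline too slow",
        "Check activity duration, add heartbeats",
    ),
    "asyncio.run() cannot be called": (
        "Nested event loop",
        "Use await directly in activities",
    ),
}


def triage(log_line: str) -> Optional[Tuple[str, str, str]]:
    """Return (pattern, cause, check) for the first known pattern found
    in log_line, or None when nothing matches."""
    lowered = log_line.lower()
    for pattern, (cause, check) in KNOWN_ERRORS.items():
        if pattern in lowered:
            return pattern, cause, check
    return None
```

Calling `triage(line)` on each line captured from `fly logs` gives a first guess at the cause and the check to run. Because the helper only reads strings, it stays within the read-only scope of this skill.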