# sentry-debugger

> Debug and analyze {{PROJECT_NAME}} errors and exceptions. Use when investigating crashes, error patterns, performance issues, or understanding what went wrong in production. Covers error queries by agent, exception types, and common debugging scenarios.

- Author: Arif Khan
- Repository: ak-eyther/model-project
- Version: 20260107025203
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/ak-eyther/model-project
- Web: https://mule.run/skillshub/@@ak-eyther/model-project~sentry-debugger:20260107025203

---

---
name: sentry-debugger
description: Debug and analyze {{PROJECT_NAME}} errors and exceptions. Use when investigating crashes, error patterns, performance issues, or understanding what went wrong in production. Covers error queries by agent, exception types, and common debugging scenarios.
---

# Sentry Debugger for {{PROJECT_NAME}}

## When to Use Sentry vs LangSmith

| Question | Use |
|----------|-----|
| "What crashed?" | **Sentry** - see exceptions |
| "Why did the agent think X?" | **LangSmith** - see inputs/outputs |
| "Why did the app error out?" | **Sentry** - see stacktrace |
| "Why is the response wrong?" | **LangSmith** - trace the reasoning |
| "Is the DB connection failing?" | **Sentry** - connection errors |
| "Is Analyst slow?" | **LangSmith** - latency analysis |
| "What errors happen most?" | **Sentry** - error frequency |
| "What patterns is Analyst missing?" | **LangSmith** - compare evidence across runs |

**Rule of thumb:**
- Sentry = **crashing** problems (exceptions, failures)
- LangSmith = **thinking** problems (wrong output, bad reasoning)

## Quick Start

- `SENTRY_AUTH_TOKEN=... SENTRY_QUERY="is:unresolved environment:production" \`
  `python .claude/skills/sentry-debugger/scripts/query_sentry_issues.py`
- `SENTRY_AUTH_TOKEN=... SENTRY_ISSUE_ID=... \`
  `python .claude/skills/sentry-debugger/scripts/get_sentry_issue.py`

## Bundled Resources

### References

- `references/production-errors.md`: production env vars + query examples.

### Scripts

- `scripts/query_sentry_issues.py`: list recent issues via SENTRY_* env vars.
- `scripts/get_sentry_issue.py`: fetch a single issue (optionally with its latest event).

## Quick Start for Agents

### Where to Find Auth Token (Local Development)

**Location:** `backend/.env` (in project root)

**Required variables:**

```bash
SENTRY_AUTH_TOKEN=sntryu_...   # API query token (REQUIRED for this skill)
SENTRY_DSN=https://...         # Error reporting DSN (different purpose)
```

**How to check if token exists:**

```bash
grep "SENTRY_AUTH_TOKEN" backend/.env
```

### Quick Test (Verify Skill Works)

Run this to test the skill immediately:

```bash
# From project root
cd /path/to/Mission-Inbox

# Extract token and run test
# (-f2- keeps everything after the first '=', in case the token itself contains '=')
SENTRY_AUTH_TOKEN=$(grep "^SENTRY_AUTH_TOKEN=" backend/.env | cut -d'=' -f2-) \
./backend/venv/bin/python3 << 'EOF'
import os
import requests

token = os.environ.get("SENTRY_AUTH_TOKEN")
headers = {"Authorization": f"Bearer {token}"}
url = "https://sentry.io/api/0/projects/zappian-media/python-serverless/issues/"

response = requests.get(url, headers=headers, params={"query": "is:unresolved", "limit": 3})
print(f"Status: {response.status_code}")

if response.status_code == 200:
    issues = response.json()
    print(f"✅ Found {len(issues)} issues")
    for issue in issues:
        print(f"  • {issue['title'][:60]}")
else:
    print(f"❌ Error: {response.text[:200]}")
EOF
```

**Expected output:** `Status: 200` followed by a list of recent issues

### Common Issues When Testing

| Problem | Solution |
|---------|----------|
| `SENTRY_AUTH_TOKEN not found` | Check that `backend/.env` exists and contains the token |
| `ModuleNotFoundError: requests` | Use `./backend/venv/bin/python3` (not system python) |
| `401 Unauthorized` | Token invalid or expired - regenerate in Sentry.io settings |
| `404 Project not found` | Verify org=`zappian-media`, project=`python-serverless` |

## Environment Variables (Railway)

```
SENTRY_AUTH_TOKEN=
SENTRY_ORG=zappian-media
SENTRY_PROJECT=python-serverless
SENTRY_DSN=
```

## Agent Architecture Context

Errors may originate from any node in the pipeline:

```
memory_hydrate_node → orchestrator_node → analyst_node → response
```

Common error sources:
- `memory_hydrate_node`: ChromaDB connection, collection not found
- `orchestrator_node`: LLM timeout, JSON parse errors
- `analyst_node`: PostgreSQL queries, tool failures, LLM errors

## Troubleshooting Decision Tree

### Start Here: What's the symptom?

```
Symptom                          → Sentry Search
─────────────────────────────────────────────────────────────
"App crashed"                    → is:unresolved level:fatal
"Something silently failed"      → is:unresolved level:error
"Intermittent issues"            → Sort by count, look for spikes
"Slow responses"                 → Search: timeout OR slow
"Database problems"              → Search: postgresql OR database
```

### By Error Type: Where to Look

#### Database Errors (PostgreSQL)

| Error Message | Likely Cause | Where to Check |
|---------------|--------------|----------------|
| `connection refused` | Railway DB down or connection limit | Railway dashboard → Postgres metrics |
| `relation does not exist` | Missing table/migration | Check `rollup_*` tables exist |
| `timeout expired` | Query too slow | Check query in `diagnostic_tools.py` or `health_tools.py` |
| `too many connections` | Connection pool exhausted | Check `connections.py` pool settings |

**Files to check:** `backend/app/database/connections.py`, `backend/app/analytics/tools/`

#### LLM Errors (OpenAI/Anthropic/OpenRouter)

| Error Message | Likely Cause | Where to Check |
|---------------|--------------|----------------|
| `RateLimitError` | API quota exceeded | Check usage dashboard, add retry logic |
| `Timeout` | LLM overloaded | Increase timeout, check prompt length |
| `InvalidRequestError` | Prompt too long or malformed | Check token count in prompt |
| `AuthenticationError` | Bad API key | Check Railway env vars |

**Files to check:** `backend/app/core/llm_clients.py`

#### ChromaDB Errors

| Error Message | Likely Cause | Where to Check |
|---------------|--------------|----------------|
| `Collection not found` | Collection doesn't exist | Run collection init script |
| `Connection refused` | ChromaDB not running | Check Railway service status |
| `Embedding dimension mismatch` | Embedding model changed | Rebuild collection with the same model |

**Files to check:** `backend/app/core/memory.py`

#### JSON/Parsing Errors

| Error Message | Likely Cause | Where to Check |
|---------------|--------------|----------------|
| `JSONDecodeError` | LLM returned non-JSON | Check LLM response, add output validation |
| `KeyError` | Missing field in response | LLM didn't follow schema |
| `ValidationError` | Pydantic schema mismatch | Check `workflow_state.py` |

**Files to check:** `backend/app/agents/*/langgraph_node.py`

### By Agent: Common Errors

#### memory_hydrate_node

| Frequent Error | Search Query | Fix |
|----------------|--------------|-----|
| ChromaDB connection | `chromadb OR chroma` | Check ChromaDB service, CHROMA_HOST env var |
| Empty results | `collection OR embedding` | Verify collection has data |

#### orchestrator_node

| Frequent Error | Search Query | Fix |
|----------------|--------------|-----|
| LLM timeout | `timeout orchestrator` | Increase timeout, simplify prompt |
| JSON parse fail | `JSONDecodeError orchestrator` | Add retry with format reminder |

#### analyst_node

| Frequent Error | Search Query | Fix |
|----------------|--------------|-----|
| Tool failures | `tool OR postgresql analyst` | Check tool function, DB connection |
| LLM timeout | `timeout analyst` | Analyst prompt may be too long |

### Error Severity Guide

```
🔴 FATAL (level:fatal)
   → App crashed, user got no response
   → Action: Fix immediately

🟠 ERROR (level:error)
   → Something failed but app recovered
   → Action: Fix soon, check frequency

🟡 WARNING (level:warning)
   → Degraded experience, fallback used
   → Action: Monitor, fix if frequent

⚪ INFO (level:info)
   → Logged for debugging
   → Action: Ignore unless investigating
```

## SDK Setup

```python
import os

import requests

# Fail fast if the token is missing; org and project fall back to the defaults above
SENTRY_AUTH_TOKEN = os.environ["SENTRY_AUTH_TOKEN"]
SENTRY_ORG = os.environ.get("SENTRY_ORG", "zappian-media")
SENTRY_PROJECT = os.environ.get("SENTRY_PROJECT", "python-serverless")

headers = {"Authorization": f"Bearer {SENTRY_AUTH_TOKEN}"}
base_url = "https://sentry.io/api/0"
```

## Common Queries

### List Recent Issues

```python
# Get latest issues (errors grouped by type)
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/issues/"
params = {"query": "is:unresolved", "limit": 10}
response = requests.get(url, headers=headers, params=params)
issues = response.json()

for issue in issues:
    print(f"{issue['title'][:60]}")
    print(f"  Count: {issue['count']} | First: {issue['firstSeen'][:10]}")
    print(f"  Link: {issue['permalink']}")
```

### Get Recent Events (Individual Errors)

```python
# Get actual error events (not grouped)
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/events/"
params = {"limit": 20}
response = requests.get(url, headers=headers, params=params)
events = response.json()

for event in events:
    print(f"{event['title'][:50]} | {event['dateCreated'][:16]}")
```

### Search by Error Type

```python
# Find specific exception types
queries = {
    "database": "is:unresolved postgresql OR database OR sqlalchemy",
    "llm": "is:unresolved timeout OR openai OR anthropic OR LLM",
    "chromadb": "is:unresolved chromadb OR collection OR embedding",
    "json": "is:unresolved JSONDecodeError OR parse OR json",
}

query_type = "llm"  # change as needed
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/issues/"
params = {"query": queries[query_type], "limit": 10}
response = requests.get(url, headers=headers, params=params)

for issue in response.json():
    print(f"{issue['title'][:60]} (count: {issue['count']})")
```

### Get Error Details

```python
# Get full details for a specific issue
issue_id = ""  # set this to the ID of the issue you want to inspect
url = f"{base_url}/issues/{issue_id}/"
response = requests.get(url, headers=headers)
issue = response.json()

print(f"Title: {issue['title']}")
print(f"Count: {issue['count']}")
print(f"Users affected: {issue['userCount']}")

# Get latest event for this issue (with full stacktrace)
url = f"{base_url}/issues/{issue_id}/events/latest/"
response = requests.get(url, headers=headers)
event = response.json()

# Stacktrace: print the last 5 frames of each exception
for entry in event.get("entries", []):
    if entry["type"] == "exception":
        for exc in entry["data"]["values"]:
            print(f"\nException: {exc['type']}: {exc['value']}")
            # "stacktrace" can be present but null, so guard before reading frames
            frames = (exc.get("stacktrace") or {}).get("frames") or []
            for frame in frames[-5:]:
                print(f"  {frame.get('filename')}:{frame.get('lineNo')} in {frame.get('function')}")
```

### Error Frequency Over Time

```python
# Get error stats for last 24 hours
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/stats/"
params = {"stat": "received", "resolution": "1h"}
response = requests.get(url, headers=headers, params=params)
stats = response.json()

# Each entry is a [timestamp, count] pair
print("Errors per hour (last 24h):")
for timestamp, count in stats[-24:]:
    print(f"  {timestamp}: {count}")
```

## Debugging Scenarios

### "What's crashing in production right now?"

```python
# Unresolved issues, sorted by last seen
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/issues/"
params = {
    "query": "is:unresolved",
    "sort": "date",
    "limit": 5
}
response = requests.get(url, headers=headers, params=params)

for issue in response.json():
    print(f"🔴 {issue['title'][:50]}")
    print(f"   Last seen: {issue['lastSeen']}")
    print(f"   Events: {issue['count']}")
```

### "Are LLM calls timing out?"

```python
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/issues/"
params = {
    "query": "is:unresolved timeout OR TimeoutError OR ReadTimeout",
    "limit": 10
}
response = requests.get(url, headers=headers, params=params)
timeouts = response.json()

if timeouts:
    print(f"⚠️ Found {len(timeouts)} timeout-related issues")
else:
    print("✅ No timeout issues found")
```

### "What errors happened during a specific query?"
```python
# Search by custom tag (works only if you tag events with session_id)
session_id = "abc123"
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/issues/"
params = {"query": f"session_id:{session_id}"}
response = requests.get(url, headers=headers, params=params)

for issue in response.json():
    print(f"{issue['title'][:60]} | {issue['permalink']}")
```

### "Compare error rates: this week vs last week"

```python
from datetime import datetime, timedelta, timezone

# Request 14 days of daily stats explicitly so both weeks are covered
since = datetime.now(timezone.utc) - timedelta(days=14)
url = f"{base_url}/projects/{SENTRY_ORG}/{SENTRY_PROJECT}/stats/"
params = {"stat": "received", "resolution": "1d", "since": int(since.timestamp())}
response = requests.get(url, headers=headers, params=params)
stats = response.json()

# Each entry is a [timestamp, count] pair: last 7 days vs the 7 days before
this_week = sum(count for _, count in stats[-7:])
last_week = sum(count for _, count in stats[-14:-7])

print(f"This week: {this_week} errors")
print(f"Last week: {last_week} errors")
print(f"Change: {((this_week - last_week) / max(last_week, 1)) * 100:+.1f}%")
```

## Query Syntax Reference

| Filter | Example |
|--------|---------|
| Unresolved only | `is:unresolved` |
| By text | `timeout` or `database error` |
| By tag | `environment:production` or `session_id:abc123` |
| By level | `level:error` or `level:fatal` |
| Combined | `is:unresolved level:error timeout` |
| Time range | `firstSeen:-24h` |

## Adding Context to Your Code

To make Sentry errors more debuggable, add context in your agent nodes:

```python
import sentry_sdk

# In analyst_node or orchestrator_node (state is the node's workflow state dict)
sentry_sdk.set_context("agent", {
    "name": "analyst",
    "session_id": state.get("session_id"),
    # Guard against a missing query before truncating it
    "query": (state.get("query") or "")[:100],
})
sentry_sdk.set_tag("agent_type", "analyst")
sentry_sdk.set_tag("session_id", state.get("session_id"))
```

This allows filtering errors by agent type and session in Sentry.
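
Once events carry the `agent_type` and `session_id` tags, the search strings used throughout this skill can be assembled programmatically. The following is a minimal sketch, not part of the bundled scripts: `build_issue_query` is a hypothetical helper name, and it assumes the tag names match the ones set with `sentry_sdk.set_tag` above.

```python
def build_issue_query(agent_type=None, session_id=None, level=None, unresolved=True):
    """Compose a Sentry search string from optional filters (hypothetical helper)."""

    def term(key, value):
        value = str(value)
        # Wrap values containing whitespace in quotes so they stay one search token
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        return f"{key}:{value}"

    parts = []
    if unresolved:
        parts.append("is:unresolved")
    if level:
        parts.append(term("level", level))
    if agent_type:
        # Assumes events were tagged with agent_type, as in the snippet above
        parts.append(term("agent_type", agent_type))
    if session_id:
        # Assumes events were tagged with session_id, as in the snippet above
        parts.append(term("session_id", session_id))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_issue_query(agent_type="analyst", session_id="abc123", level="error"))
    # prints: is:unresolved level:error agent_type:analyst session_id:abc123
```

The returned string can be passed as `params={"query": ...}` to the project issues endpoint used in the Common Queries section.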