# optimizing-queries > Analyzes and optimizes SQL/NoSQL queries for performance. Use when reviewing query performance, optimizing slow queries, analyzing EXPLAIN output, suggesting indexes, identifying N+1 problems, recommending query rewrites, or improving database access patterns. Supports PostgreSQL, MySQL, SQLite, MongoDB, Redis, DynamoDB, and Elasticsearch. - Author: galihcitta - Repository: galihcitta/dotclaudeskills - Version: 20260112141943 - Stars: 4 - Forks: 0 - Last Updated: 2026-02-07 - Source: https://github.com/galihcitta/dotclaudeskills - Web: https://mule.run/skillshub/@@galihcitta/dotclaudeskills~optimizing-queries:20260112141943 --- --- name: optimizing-queries description: Analyzes and optimizes SQL/NoSQL queries for performance. Use when reviewing query performance, optimizing slow queries, analyzing EXPLAIN output, suggesting indexes, identifying N+1 problems, recommending query rewrites, or improving database access patterns. Supports PostgreSQL, MySQL, SQLite, MongoDB, Redis, DynamoDB, and Elasticsearch. --- # Query Optimization Analyze database queries and provide actionable optimization recommendations. ## Workflow Copy and track progress: ``` Query Optimization: - [ ] Identify database type - [ ] Analyze query structure - [ ] Check for red flags - [ ] Provide recommendations - [ ] Suggest verification ``` 1. **Identify database type** from syntax or user context 2. **Analyze query structure** - parse SELECT, FROM, WHERE, JOIN, ORDER BY, GROUP BY clauses 3. **Check for red flags** (see Quick Reference) 4. **Provide concrete recommendations** with copy-pasteable code 5. **Suggest verification** - EXPLAIN command or metric to confirm **For complex queries (3+ JOINs or nested subqueries):** - Decompose into logical units - Analyze each unit's access pattern - Synthesize recommendations in priority order ## Analysis Checklist For each query, evaluate: - **Index usage**: Missing or inefficient indexes - **Scan type**: Full scans on large tables - **Join efficiency**: Join order, type, missing conditions - **Filtering**: WHERE placement, function usage on columns - **Projection**: SELECT * vs specific columns - **Pagination**: OFFSET vs keyset/cursor - **N+1 detection**: Loop patterns in code context ## Quick Reference ### SQL Red Flags - `SELECT *` on large tables - `WHERE function(column)` prevents index use - `LIKE '%pattern'` (leading wildcard) - `OR` conditions spanning different columns - Large `OFFSET` values - `NOT IN` with nullable subquery - Missing `LIMIT` on unbounded queries ### NoSQL Red Flags - MongoDB: COLLSCAN, `$where`, unbounded arrays - Redis: `KEYS *`, missing TTL, wrong data structure - DynamoDB: Scan operations, hot partitions - Elasticsearch: Deep pagination, wildcard prefix ## Examples **Example 1: Missing Index** Input: ```sql SELECT * FROM orders WHERE customer_email = 'user@example.com'; ``` Output: ``` ## Issue [HIGH] Full table scan on `orders` - no index on `customer_email` ## Impact Query time grows linearly with table size; causes lock contention ## Solution CREATE INDEX idx_orders_customer_email ON orders(customer_email); ## Verification EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = 'user@example.com'; -- Should show "Index Scan" instead of "Seq Scan" ``` **Example 2: N+1 Query Pattern** Input: ```python users = db.query("SELECT * FROM users") for user in users: orders = db.query(f"SELECT * FROM orders WHERE user_id = {user.id}") ``` Output: ``` ## Issue [HIGH] N+1 query pattern - executes 1 + N queries instead of 1 ## Impact 100 users = 101 queries; network latency multiplied ## Solution SELECT u.*, o.* FROM users u LEFT JOIN orders o ON u.id = o.user_id; ## Verification Check query count before/after; should reduce from N+1 to 1 ``` **Example 3: When NOT to Optimize** Input: ```sql SELECT * FROM config WHERE key = 'app_version'; ``` Output: ``` ## No Issues Detected This query on a small lookup table (typically <100 rows) is acceptable. Adding an index would add write overhead without meaningful read benefit. ## When to Reconsider - If table grows beyond 10,000 rows - If query is called >100 times/second ``` **Example 4: Incomplete Information** Input: ```sql SELECT * FROM large_table WHERE status = ?; ``` Output: ``` ## Issue [MEDIUM] Preliminary analysis without schema information ## Potential Issues 1. `SELECT *` - may fetch unnecessary columns 2. `status` column - if low cardinality (e.g., 5 values), index may not help ## Information Needed - Table row count - Cardinality of `status` column - Existing indexes: `SHOW INDEX FROM large_table;` ``` **Example 5: MongoDB COLLSCAN** Input: ```javascript db.orders.find({ customer_email: "user@example.com" }) ``` Output: ``` ## Issue [HIGH] COLLSCAN on `orders` - no index on `customer_email` ## Impact Query scans entire collection; O(n) performance ## Solution db.orders.createIndex({ customer_email: 1 }) ## Verification db.orders.find({ customer_email: "..." }).explain("executionStats") // Should show "stage": "IXSCAN" instead of "COLLSCAN" ``` ## Response Format Structure recommendations as: ``` ## Issue [HIGH/MEDIUM/LOW] [Specific problem identified] ## Impact [Performance/resource cost] ## Solution [Concrete fix with code] ## Verification [EXPLAIN command or metric to confirm] ``` ## Confidence Levels When uncertainty exists, indicate confidence: - **HIGH**: Pattern clearly matches known anti-pattern - **MEDIUM**: Likely issue, depends on data distribution/schema - **LOW**: Potential issue, needs EXPLAIN to confirm ## Response Constraints 1. **Max 5 recommendations per query** - prioritize by impact 2. **Order by severity**: HIGH → MEDIUM → LOW 3. **One primary fix per issue** - mention alternatives briefly 4. **Code must be copy-pasteable** - no placeholders like `` 5. **Verification commands must be complete** - include all flags ## Prioritization Matrix | Issue Type | Impact | Priority | |------------|--------|----------| | Missing JOIN index | Query blocks | P0 | | Full table scan (>100k rows) | Slow response | P0 | | N+1 pattern | Latency × N | P1 | | SELECT * | Memory/bandwidth | P1 | | Missing LIMIT | Resource exhaustion | P1 | | Large OFFSET | Slow pagination | P2 | | Suboptimal ORDER BY | Sort overhead | P2 | ## Reference Files Consult these for detailed patterns (use grep to find specific sections): - `references/sql-patterns.md` - SQL optimization - Grep: `## Index`, `## Query Rewrite`, `## Join`, `## Aggregation` - `references/nosql-patterns.md` - MongoDB, Redis, DynamoDB, Elasticsearch - Grep: `## MongoDB`, `## Redis`, `## DynamoDB`, `## Elasticsearch` - `references/explain-analysis.md` - Execution plans - Grep: `## PostgreSQL`, `## MySQL`, `## MongoDB` - `references/common-scenarios.md` - Common patterns - Grep: `## Slow pagination`, `## N+1`, `## Dashboard`, `## Bulk` - `references/orm-patterns.md` - ORM fixes - Grep: `### Django`, `### SQLAlchemy`, `### ActiveRecord`, `### Prisma` ## Utility Scripts All scripts use Python standard library only (no pip install required). Run these for automated analysis: ### Query Analysis **analyze_query.py** - Detect anti-patterns and score complexity: ```bash python scripts/analyze_query.py "SELECT * FROM orders WHERE ..." python scripts/analyze_query.py --file query.sql --json ``` **suggest_index.py** - Generate index recommendations: ```bash python scripts/suggest_index.py "SELECT ... FROM orders WHERE status = 'pending'" ``` ### EXPLAIN & Logs **parse_explain.py** - Analyze EXPLAIN output: ```bash python scripts/parse_explain.py --pg explain_output.txt python scripts/parse_explain.py --mysql explain_output.txt ``` **diff_explain.py** - Compare before/after EXPLAIN outputs: ```bash python scripts/diff_explain.py --pg before.txt after.txt python scripts/diff_explain.py --mysql before.txt after.txt --json ``` **parse_slow_log.py** - Analyze slow query logs: ```bash python scripts/parse_slow_log.py --pg /var/log/postgresql.log --top 20 python scripts/parse_slow_log.py --mysql slow-query.log --json ``` ### Schema & Code **analyze_schema.py** - Find schema optimization opportunities: ```bash python scripts/analyze_schema.py schema.sql ``` **detect_orm_issues.py** - Find ORM anti-patterns in code: ```bash python scripts/detect_orm_issues.py app.py python scripts/detect_orm_issues.py --dir ./models ``` **check_migration.py** - Check migration safety: ```bash python scripts/check_migration.py --pg migration.sql python scripts/check_migration.py --mysql migration.sql ``` ### Workflow with Scripts **For query optimization:** 1. `analyze_query.py` → detect issues 2. `suggest_index.py` → generate indexes 3. User runs EXPLAIN (before) 4. User applies optimization 5. User runs EXPLAIN (after) 6. `diff_explain.py` → verify improvement **For verifying optimization impact:** ```bash # Save before state psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > before.txt # Apply optimization (add index, rewrite query, etc.) # Save after state psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > after.txt # Compare python scripts/diff_explain.py --pg before.txt after.txt ``` **For codebase audit:** 1. `detect_orm_issues.py --dir ./` → find N+1 patterns 2. `analyze_schema.py` → check schema 3. Prioritize fixes by severity **For migration review:** 1. `check_migration.py` → safety check 2. Fix critical issues before deployment 3. Use safe alternatives (CONCURRENTLY, etc.) ## EXPLAIN Commands Provide appropriate commands: ```sql -- PostgreSQL EXPLAIN (ANALYZE, BUFFERS) SELECT ...; -- MySQL EXPLAIN ANALYZE SELECT ...; -- SQLite EXPLAIN QUERY PLAN SELECT ...; ``` ```javascript // MongoDB db.collection.find({...}).explain("executionStats") ``` ## Index Recommendation Format When suggesting indexes: ```sql -- PostgreSQL/MySQL CREATE INDEX [CONCURRENTLY] idx_table_columns ON table(col1, col2, col3); -- Rationale: [why this column order, covering considerations] ``` ```javascript // MongoDB (ESR: Equality, Sort, Range) db.collection.createIndex({ equality_field: 1, sort_field: -1, range_field: 1 }) ``` ## Handling Missing Information When user provides incomplete context: **No schema provided:** > "To give precise recommendations, I need the table schema. Could you share: > 1. Column data types > 2. Existing indexes > 3. Approximate row counts > > Meanwhile, here's what I can infer from the query..." **No EXPLAIN output:** > "Run this command and share the output: > ```sql > EXPLAIN (ANALYZE, BUFFERS) ; > ``` > This will show actual vs estimated rows and scan types." **Unknown database type:** > "Which database are you using? (PostgreSQL, MySQL, SQLite, MongoDB, etc.) > The optimization strategies differ significantly between them." ## Conditional Workflows **Analyzing a query:** 1. Run `analyze_query.py` first 2. Check for anti-patterns 3. Suggest fixes with examples **Interpreting EXPLAIN output:** 1. Run `parse_explain.py` with appropriate flag 2. Identify warnings 3. Correlate with query structure **Full optimization request:** 1. Analyze query → identify issues 2. Suggest indexes → generate CREATE statements 3. Request EXPLAIN → interpret results 4. Provide before/after comparison **Complex optimization (3+ issues or architectural decisions):** Use extended reasoning to: 1. Map full query execution path 2. Identify all optimization opportunities 3. Evaluate trade-offs between approaches 4. Prioritize by impact and implementation cost ## Script Error Handling **If scripts fail:** 1. Check Python version (requires 3.7+) 2. Verify input format (SQL string, valid file path) 3. Fall back to manual analysis using reference files **Common errors:** - `FileNotFoundError` → Check file path, use absolute paths - `json.JSONDecodeError` → Input not valid JSON, check --pg/--mysql flags - Empty output → Query may be valid, no issues detected **When scripts are unavailable:** Perform manual analysis using Quick Reference red flags and reference files. ## Unsupported Databases For databases not explicitly covered (Oracle, SQL Server, CockroachDB, etc.): 1. Apply general SQL patterns from `references/sql-patterns.md` 2. Note: "Optimization patterns based on PostgreSQL/MySQL; verify syntax for [database]" 3. Recommend user consult database-specific documentation for EXPLAIN syntax and index creation