# k8s-troubleshoot

> Debug Kubernetes issues using kubectl, logs, events, and resource inspection. Use on 'pod not starting', 'crash loop', 'OOMKilled', 'debug k8s', 'why is pod failing'.

- Author: Forgejo Mirror Bot
- Repository: pypeaday/dotfiles
- Version: 20260123143044
- Stars: 2
- Forks: 0
- Last Updated: 2026-02-07
- Source: https://github.com/pypeaday/dotfiles
- Web: https://mule.run/skillshub/@@pypeaday/dotfiles~k8s-troubleshoot:20260123143044

---

---
name: k8s-troubleshoot
description: "Debug Kubernetes issues using kubectl, logs, events, and resource inspection. Use on 'pod not starting', 'crash loop', 'OOMKilled', 'debug k8s', 'why is pod failing'."
---

# Kubernetes Troubleshooting Skill

Systematic debugging for Kubernetes issues.

## When to Use

- Pods stuck in Pending/CrashLoopBackOff
- OOMKilled containers
- Service connectivity issues
- Deployment rollout failures
- PVC/storage problems

## Diagnostic Flow

### 1. Get Status

```bash
kubectl get pods -o wide
kubectl get events --sort-by='.lastTimestamp'
kubectl describe pod <pod>
```

### 2. Check Logs

```bash
kubectl logs <pod> --previous        # previous (crashed) container instance
kubectl logs <pod> -c <container>    # specific container
stern <pod-query>                    # tail multiple pods (requires stern)
```

### 3. Resource Issues

```bash
kubectl top pods                     # requires metrics-server
kubectl describe node | grep -A5 "Allocated resources"
```

## Common Issues

### Pending Pod

| Cause | Check | Fix |
|-------|-------|-----|
| Insufficient resources | `kubectl describe pod` -> Events | Reduce resource requests or add nodes |
| No matching node | Check nodeSelector/affinity | Fix selectors |
| PVC not bound | `kubectl get pvc` | Check storage class |

### CrashLoopBackOff

| Cause | Check | Fix |
|-------|-------|-----|
| App error | `kubectl logs --previous` | Fix app code |
| Missing config | Check ConfigMap/Secret mounts | Create missing resources |
| Bad command | Check `command`/`args` in spec | Fix entrypoint |
| OOMKilled | `kubectl describe pod` -> State | Increase memory limit |

### ImagePullBackOff

| Cause | Check | Fix |
|-------|-------|-----|
| Wrong image | Check image name/tag | Fix image reference |
| Private registry | Check imagePullSecrets | Add registry credentials |
| Rate limit | Check events | Use registry mirror |

### Service Not Reachable

```bash
# Check endpoints exist
kubectl get endpoints <service>

# Check selector matches pods
kubectl get pods -l <label-selector>

# Test from inside cluster
kubectl run debug --rm -it --image=alpine -- wget -qO- <service>:<port>
```

## Quick Commands

```bash
# Pods not in the Running phase (also lists Succeeded pods;
# CrashLoopBackOff pods may still report Running)
kubectl get pods --field-selector=status.phase!=Running

# Events for namespace
kubectl get events --sort-by='.lastTimestamp' -n <namespace>

# Resource usage
kubectl top pods --sort-by=memory

# Shell into pod
kubectl exec -it <pod> -- /bin/sh

# Port forward for debugging (local 8080 -> pod 80)
kubectl port-forward <pod> 8080:80

# Restart deployment
kubectl rollout restart deployment/<name>

# Check rollout status
kubectl rollout status deployment/<name>
```

## Log Patterns to Search

```bash
# Errors
kubectl logs <pod> | grep -i error

# Python tracebacks
kubectl logs <pod> | grep -A 20 "Traceback"

# OOM
kubectl logs <pod> | grep -i "out of memory\|oom\|killed"
```
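
The three log searches above can be folded into a single pass when all that is needed is a quick count per category. The following is a minimal sketch, not part of the original skill: `scan_logs` is a hypothetical helper name, and it assumes a POSIX `awk` is available. It reads log text from stdin, so it works with any `kubectl logs` invocation piped into it. Like the `grep` patterns it mirrors, the `oom` match is loose and will also hit words such as "room".

```shell
#!/bin/sh
# Hypothetical helper (not from the original skill): count log lines
# matching the OOM, traceback, and error patterns, case-insensitively.
# Each line is counted once, in the first category it matches.
scan_logs() {
  awk '
    { line = tolower($0) }
    line ~ /out of memory|oom|killed/ { oom++; next }
    line ~ /traceback/                { tb++;  next }
    line ~ /error/                    { err++ }
    END { printf "oom=%d traceback=%d error=%d\n", oom, tb, err }
  '
}
```

Assumed usage, with the pod name left as a placeholder: `kubectl logs <pod> --previous | scan_logs`. A non-zero `oom` count is only a hint; the authoritative signal for OOMKilled remains the container state shown by `kubectl describe pod`.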