# powergraph-gnn-research

> Research pipeline for topology-aware GNN representation learning on power grids using the PowerGraph benchmark. Use when (1) building physics-guided GNNs for power flow (PF), optimal power flow (OPF), or cascading failure prediction, (2) implementing self-supervised pretraining for power systems, (3) evaluating cascade explanation fidelity against ground-truth masks, or (4) conducting reproducible ML-for-power-systems research. Triggers include "PowerGraph", "power flow GNN", "OPF surrogate", "cascade prediction", "physics-guided GNN", "grid analytics ML", "power system representation learning".

- Author: Mohammed
- Repository: mhdhazmi/GNNPowerSystem
- Version: 20260109181926
- Stars: 1
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/mhdhazmi/GNNPowerSystem
- Web: https://mule.run/skillshub/@@mhdhazmi/GNNPowerSystem~powergraph-gnn-research:20260109181926

---

---
name: powergraph-gnn-research
description: Research pipeline for topology-aware GNN representation learning on power grids using the PowerGraph benchmark. Use when (1) building physics-guided GNNs for power flow (PF), optimal power flow (OPF), or cascading failure prediction, (2) implementing self-supervised pretraining for power systems, (3) evaluating cascade explanation fidelity against ground-truth masks, or (4) conducting reproducible ML-for-power-systems research. Triggers include "PowerGraph", "power flow GNN", "OPF surrogate", "cascade prediction", "physics-guided GNN", "grid analytics ML", "power system representation learning".
---

# PowerGraph GNN Research Pipeline

**Primary claim**: A grid-specific self-supervised, physics-consistent GNN encoder improves PF/OPF learning (especially low-label/OOD), and transfers to cascading-failure prediction and explanation.
## Scripts

| Task | Script |
|------|--------|
| Data ingestion | `scripts/load_powergraph.py` |
| PF baseline | `scripts/train_pf_baseline.py` |
| Physics metrics | `scripts/physics_residual.py` |
| SSL pretraining | `scripts/pretrain_ssl.py` |
| Multi-task training | `scripts/train_multitask.py` |
| Explanation eval | `scripts/eval_cascade_explanation.py` |

## Workflow

1. **Data** → PowerGraph → PyG (PF/OPF node targets + cascade graph labels + exp masks)
2. **Baseline** → PF regression with sin/cos angles + physics residual metric
3. **Multi-task** → Shared encoder + PF/OPF/Cascade heads
4. **SSL** → Masked injection/edge reconstruction → fine-tune
5. **Evaluation** → Explanation AUC vs ground-truth masks + robustness tests

## Validity Anchors (Critical)

**Angle handling**: Predict `sin(θ), cos(θ)`, recover via `atan2`. Direct MSE on raw angles fails at ±π wrap-around.

**Physics residual**: Report KCL mismatch alongside accuracy. Ground truth ≈ 0, random >> 1.

**Blocked splits**: PowerGraph uses 1-year load @ 15-min. Use months 1-9 train / 10 val / 11-12 test. Random splits leak seasonal patterns.

**Explanation fidelity**: Use PowerGraph `exp.mat` ground-truth masks. Report AUC + Precision@K.
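
The angle-handling and blocked-split anchors above can be illustrated with a small standard-library sketch. It is not taken from the repository scripts: the helper names (`encode_angle`, `decode_angle`, `wrapped_error`, `blocked_split`) are hypothetical, and it assumes each sample can be tagged with a calendar timestamp, which may not match how PowerGraph indexes its samples.

```python
import math
from datetime import datetime


def encode_angle(theta):
    """Map an angle in radians to its (sin, cos) regression target."""
    return math.sin(theta), math.cos(theta)


def decode_angle(sin_pred, cos_pred):
    """Recover an angle in [-pi, pi] from predicted sin/cos via atan2."""
    return math.atan2(sin_pred, cos_pred)


def wrapped_error(theta_pred, theta_true):
    """Smallest signed angular difference, in [-pi, pi]."""
    diff = theta_pred - theta_true
    return math.atan2(math.sin(diff), math.cos(diff))


def blocked_split(timestamp):
    """Assign a sample to a split by month: 1-9 train, 10 val, 11-12 test.

    Assumption: samples carry a timestamp; this mapping is hypothetical.
    """
    if timestamp.month <= 9:
        return "train"
    if timestamp.month == 10:
        return "val"
    return "test"


# Two angles on opposite sides of the +/-pi boundary are physically close,
# but their raw difference is almost a full turn.
theta_true = math.pi - 0.01
theta_pred = -math.pi + 0.01
raw_error = abs(theta_pred - theta_true)
wrap_error = abs(wrapped_error(theta_pred, theta_true))

# Encoding and decoding should return the original angle.
round_trip = decode_angle(*encode_angle(theta_true))

print(f"raw error:     {raw_error:.4f} rad")
print(f"wrapped error: {wrap_error:.4f} rad")
print(f"round trip:    {round_trip:.4f} rad")
print(blocked_split(datetime(2021, 10, 5, 12, 15)))  # hypothetical timestamp
```

The raw difference is close to 2π while the wrapped difference stays small, which is why a loss on `(sin θ, cos θ)` tends to behave better near the boundary than MSE on raw angles.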
## Reference Docs

- `references/data_pipeline.md`: PowerGraph → PyG conversion, splits
- `references/model_architecture.md`: Physics-guided message passing, heads
- `references/ssl_pretraining.md`: Masked tasks, low-label experiments
- `references/uncertainty_quantification.md`: Ensembles, MC dropout, calibration
- `references/evaluation_protocols.md`: Metrics, robustness, statistical tests
- `references/publication_soundness.md`: Reviewer risks, claim framing
- `references/experiment_configs.md`: YAML config structure, sweeps

## Common Pitfalls

| Issue | Fix |
|-------|-----|
| Angle wrap-around | sin/cos representation |
| Data leakage | Blocked time splits |
| Cascade imbalance | Weighted/focal loss |
| OOM large grids | Gradient checkpointing |
| SSL collapse | Stop-gradient + EMA encoder |
| Physics violations | Residual regularization |

## Publication Checklist

- [ ] Ablation: single-task vs multi-task vs SSL+multi-task
- [ ] Low-label curves: 10/20/50/100% training data
- [ ] Physics residual alongside accuracy metrics
- [ ] Blocked splits (not random)
- [ ] Explanation AUC against exp.mat ground truth
- [ ] Robustness under edge-drop perturbations
- [ ] Statistical significance with 95% CI
- [ ] One-command reproducibility (`python analysis/run_all.py`)
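
One way to approach the "Statistical significance with 95% CI" item is a percentile bootstrap over per-seed scores. The sketch below is a minimal illustration under stated assumptions, not the contents of `analysis/run_all.py`: the function name `bootstrap_ci` is hypothetical, and the scores are placeholder numbers chosen only to make the example run, not results from this project.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Resamples with replacement, records the mean of each resample,
    and returns the (alpha/2, 1 - alpha/2) percentiles of those means.
    """
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Placeholder per-seed scores (hypothetical, e.g. explanation AUC over 5 seeds).
scores = [0.81, 0.79, 0.84, 0.80, 0.83]
mean_score = statistics.fmean(scores)
ci_low, ci_high = bootstrap_ci(scores)

print(f"mean = {mean_score:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With only a handful of seeds the interval is coarse, so reporting the number of seeds next to it is advisable. For comparing two models trained on the same seeds, the same function could be applied to the per-seed score differences instead of the raw scores.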