# brainstorming-operations

> Use before planning infrastructure operations - explores requirements, risks, verification strategies, and rollback plans before implementation

- Author: yg
- Repository: yg-codes/srepowers
- Version: 20260209134745
- Stars: 0
- Forks: 0
- Last Updated: 2026-02-09
- Source: https://github.com/yg-codes/srepowers
- Web: https://mule.run/skillshub/@@yg-codes/srepowers~brainstorming-operations:20260209134745

---

---
name: brainstorming-operations
description: "Use before planning infrastructure operations - explores requirements, risks, verification strategies, and rollback plans before implementation"
---

# Brainstorming Infrastructure Operations

## Overview

Help turn infrastructure operation ideas into fully formed designs and execution plans through natural collaborative dialogue.

Start by understanding the current infrastructure state, then ask questions one at a time to refine the operation. Once you understand what you're executing, present the design in small sections (200-300 words), checking after each section whether it looks right so far.

**Announce at start:** "I'm using the brainstorming-operations skill to design this infrastructure operation."

**Context:** This should be run before creating detailed operation plans.

**Save designs to:** `docs/plans/YYYY-MM-DD-<operation-name>-design.md`

## The Process

**Understanding the operation:**
- Check out the current infrastructure state first (kubectl, configs, recent changes)
- Ask questions one at a time to refine the operation
- Prefer multiple choice questions when possible, but open-ended is fine too
- Only one question per message - if a topic needs more exploration, break it into multiple questions
- Focus on understanding: purpose, scope, constraints, risk level

**Exploring approaches:**
- Propose 2-3 different approaches with trade-offs
- Present options conversationally with your recommendation and reasoning
- Lead with your recommended option and explain why
- Consider: downtime, rollback complexity, verification strategies

**Presenting the design:**
- Once you understand what you're executing, present the design
- Break it into sections of 200-300 words
- Ask after each section whether it looks right so far
- Cover: current state, desired state, operation steps, verification commands, rollback plan, risk assessment
- Be ready to go back and clarify if something doesn't make sense

## Design Document Structure

Every operation design should include:

**Current State:**
- What infrastructure exists now
- Recent changes that are relevant
- Known issues or constraints

**Desired State:**
- What the operation achieves
- Success criteria (how you'll know it worked)
- Rollback criteria (when to abort)

**Operation Approach:**
- High-level steps (not detailed commands yet)
- Verification strategies for each step
- Rollback strategy for each step

**Risk Assessment:**
- Risk level: Low/Medium/High
- What could go wrong
- How to detect failures
- Rollback triggers

**Prerequisites:**
- Tools or access needed
- Information to gather first
- Dependencies on other systems

## After the Design

**Documentation:**
- Write the validated design to `docs/plans/YYYY-MM-DD-<operation-name>-design.md`
- Commit the design document to git

**Planning (if continuing):**
- Ask: "Ready to create the execution plan?"
- Use srepowers:writing-operation-plans to create detailed operation plan

## Key Principles

- **One question at a time** - Don't overwhelm with multiple questions
- **Multiple choice preferred** - Easier to answer than open-ended when possible
- **Risk-focused** - Always consider what could go wrong and how to detect it
- **Verification-first** - Design verification strategies before operation steps
- **Rollback-aware** - Every operation should have a rollback plan
- **Incremental validation** - Present design in sections, validate each
- **Be flexible** - Go back and clarify when something doesn't make sense

## Infrastructure Operation Examples

### Kubernetes Deployment Update
- Current: app v1.0.0 running on 3 pods
- Desired: app v1.1.0 with updated ConfigMap
- Approach: Rolling update with verification
- Risk: Medium (traffic disruption possible)
- Verification: Pod health checks, API smoke tests

### Keycloak Realm Migration
- Current: legacy Keycloak with manual realm config
- Desired: new Keycloak with CRD-based realm import
- Approach: Export from old, import via CRD
- Risk: High (authentication disruption)
- Verification: User login tests, token validation

### Database Migration
- Current: PostgreSQL 14 with schema v1
- Desired: PostgreSQL 14 with schema v2
- Approach: pg migrations with rollback script
- Risk: High (data loss potential)
- Verification: Row counts, checksums, application queries

### Git Control Repo Reorganization
- Current: monolithic manifests/ directory
- Desired: split by environment (dev/staging/prod)
- Approach: Create new structure, move manifests, update ArgoCD
- Risk: Medium (config drift possible)
- Verification: ArgoCD sync status, pod configs

## Questions to Ask

**Understanding scope:**
- What infrastructure components are affected?
- What's the current state? What's the desired state?
- Are there dependencies or prerequisites?

**Risk assessment:**
- What's the worst-case scenario?
- How would we detect failure?
- What's the rollback strategy?

**Verification strategy:**
- How will we verify each step?
- What commands confirm success?
- What indicators show failure?

**Execution approach:**
- Can this be done incrementally?
- Are there maintenance windows?
- Who needs to be notified?

## Red Flags

- Proceeding without understanding current infrastructure state
- Skipping rollback planning
- Not considering verification strategies
- Ignoring dependencies between systems
- Assuming things will "just work"
- Not asking about maintenance windows
- Forgetting about monitoring/alerting during operation