developer operations user risk: low
Cascading System Failure Simulator
Instructs the AI to simulate a complex system with interdependent services facing cascading failures, presenting status updates and consequences to the user's one action or questio…
PROMPT
============================================================ PROMPT NAME: Cascading Failure Simulator VERSION: 1.3 AUTHOR: Scott M LAST UPDATED: January 15, 2026 ============================================================ CHANGELOG - 1.3 (2026-01-15) Added changelog section; minor wording polish for clarity and flow - 1.2 (2026-01-15) Introduced FUN ELEMENTS (light humor, stability points); set max turns to 10; added subtle hints and replayability via randomizable symptoms - 1.1 (2026-01-15) Original version shared for review – core rules, turn flow, postmortem structure established - 1.0 (pre-2026) Initial concept draft GOAL You are responsible for stabilizing a complex system under pressure. Every action has tradeoffs. There is no perfect solution. Your job is to manage consequences, not eliminate them—but bonus points if you keep it limping along longer than expected. AUDIENCE Engineers, incident responders, architects, technical leaders. CORE PREMISE You will be presented with a live system experiencing issues. On each turn, you may take ONE meaningful action. Fixing one problem may: - Expose hidden dependencies - Trigger delayed failures - Change human behavior - Create organizational side effects Some damage will not appear immediately. Some causes will only be obvious in hindsight. RULES OF PLAY - One action per turn (max 10 turns total). - You may ask clarifying questions instead of taking an action. - Not all dependencies are visible, but subtle hints may appear in status updates. - Organizational constraints are real and enforced. - The system is allowed to get worse—embrace the chaos! FUN ELEMENTS To keep it engaging: - AI may inject light humor in consequences (e.g., “Your quick fix worked... until the coffee machine rebelled.”). - Earn “stability points” for turns where things don’t worsen—redeem in postmortem for fun insights. - Variable starts: AI can randomize initial symptoms for replayability. SYSTEM MODEL (KNOWN TO YOU) The system includes: - Multiple interdependent services - On-call staff with fatigue limits - Security, compliance, and budget constraints - Leadership pressure for visible improvement SYSTEM MODEL (KNOWN TO THE AI) The AI tracks: - Hidden technical dependencies - Human reactions and workarounds - Deferred risk introduced by changes - Cross-team incentive conflicts You will not be warned when latent risk is created, but watch for foreshadowing. TURN FLOW At the start of each turn, the AI will provide: - A short system status summary - Observable symptoms - Any constraints currently in effect You then respond with ONE of the following: 1. A concrete action you take 2. A specific question you ask to learn more After your response, the AI will: - Apply immediate effects - Quietly queue delayed consequences (if any) - Update human and organizational state FEEDBACK STYLE The AI will not tell you what to do. It will surface consequences such as: - “This improved local performance but increased global fragility—classic Murphy’s Law strike.” - “This reduced incidents but increased on-call burnout—time for virtual pizza?” - “This solved today’s problem and amplified next week’s—plot twist!” END CONDITIONS The simulation ends when: - The system becomes unstable beyond recovery - You achieve a fragile but functioning equilibrium - 10 turns are reached There is no win screen. There is only a postmortem (with stability points recap). POSTMORTEM At the end of the simulation, the AI will analyze: - Where you optimized locally and harmed globally - Where you failed to model blast radius - Where non-technical coupling dominated outcomes - Which decisions caused delayed failure - Bonus: Smart moves that bought time or mitigated risks The postmortem will reference specific past turns. START You are on-call for a critical system. Initial symptoms (randomizable for fun): - Latency has increased by 35% over the last hour - Error rates remain low - On-call reports increased alert noise - Finance has flagged infrastructure cost growth - No recent deployments are visible What do you do? ============================================================
REQUIRED CONTEXT
- user action or clarifying question
OPTIONAL CONTEXT
- stability points
- randomized symptoms
ROLES & RULES
- Provide a short system status summary at the start of each turn.
- Provide observable symptoms at the start of each turn.
- Provide any constraints currently in effect at the start of each turn.
- Allow only one action or clarifying question per turn.
- Apply immediate effects after user response.
- Quietly queue delayed consequences if any.
- Update human and organizational state after user response.
- Do not tell the user what to do.
- Inject light humor in consequences.
- Randomize initial symptoms for replayability.
- Track stability points for turns where things don’t worsen.
- End simulation on instability, equilibrium, or 10 turns.
- Analyze specific postmortem points referencing past turns
EXPECTED OUTPUT
- Format
- plain_text
- Schema
- bullet_list · system status summary, Observable symptoms, Any constraints currently in effect
- Constraints
-
- short system status summary
- observable symptoms only
- inject light humor
- no direct advice
- postmortem with turn references
SUCCESS CRITERIA
- Surface consequences with tradeoffs
- Enforce organizational constraints
- Track hidden dependencies and human reactions
- Deliver postmortem analyzing local vs global optimization, blast radius, non-technical coupling, delayed failures
FAILURE MODES
- Provide direct advice on actions
- Reveal hidden dependencies prematurely
- Fail to introduce tradeoffs or delayed consequences
- Ignore turn limit or end conditions
- Lack engagement from missing fun elements
QUALITY
- OVERALL
- 0.94
- CLARITY
- 0.95
- SPECIFICITY
- 0.95
- REUSABILITY
- 0.90
- COMPLETENESS
- 0.95
IMPROVEMENT SUGGESTIONS
- Include 1-2 example turn exchanges to calibrate AI response style and user expectations.
USAGE
Copy the prompt above and paste it into your AI of choice — Claude, ChatGPT, Gemini, or anywhere else you're working. Replace any placeholder sections with your own context, then ask for the output.
MORE FOR DEVELOPER
- DevOps CI/CD Pipeline Automatordeveloperoperations
- Playwright Web App Testing Toolkitdeveloperoperations
- DevOps Dependency Manager and Auditordeveloperoperations
- NixOS Specialist for Linux Expertsdeveloperoperations
- Web Launch Readiness Checklist Generatordeveloperoperations
- API Performance Load Chaos Testing Expertdeveloperoperations
- DevOps Environment Configuration Specialistdeveloperoperations
- AWS Cloud Architecture Expertdeveloperoperations
- DevOps CI/CD Automation Pipeline Architectdeveloperoperations
- E-commerce MVP Quick DevOps Practices Advisordeveloperoperations