Autonomous EDA Design Space Explorer
Directs the model to autonomously execute an iterative design space exploration loop that runs a target program, parses results, tunes parameters, and repeats until an objective is…
---
name: auto-claude-code-research-in-sleep-dse-loop
description: "Autonomous design space exploration loop for computer architecture and EDA. Runs a program, analyzes results, tunes parameters, and iterates until objective is met or timeout. Use when user says \"DSE\", \"design space exploration\", \"sweep parameters\", \"optimize\", \"find best config\", or wants"
---
# DSE Loop: Autonomous Design Space Exploration
Autonomously explore a design space: run → analyze → pick next parameters → repeat, until the objective is met or timeout is reached. Designed for computer architecture and EDA problems.
## Context: $ARGUMENTS
## Safety Rules — READ FIRST
**NEVER do any of the following:**
- `sudo` anything
- `rm -rf`, `rm -r`, or any recursive deletion
- `rm` any file you did not create in this session
- Overwrite existing source files without reading them first
- `git push`, `git reset --hard`, or any destructive git operation
- Kill processes you did not start
**If a step requires any of the above, STOP and report to the user.**
## Constants (override via $ARGUMENTS)
| Constant | Default | Description |
|----------|---------|-------------|
| `TIMEOUT` | 2h | Total wall-clock budget. Stop exploring after this. |
| `MAX_ITERATIONS` | 50 | Hard cap on number of design points evaluated. |
| `PATIENCE` | 10 | Stop early if no improvement for this many consecutive iterations. |
| `OBJECTIVE` | minimize | `minimize` or `maximize` the target metric. |
Override inline: `/dse-loop "task desc — timeout: 4h, max_iterations: 100, patience: 15"`
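A minimal sketch of how the inline overrides could be pulled out of `$ARGUMENTS`. The function names, the `key: value` matching, and the `s`/`m`/`h` duration suffixes are assumptions made for illustration; the skill itself does not prescribe a parser.

```python
import re

# Defaults mirror the Constants table above.
DEFAULTS = {"timeout": "2h", "max_iterations": 50, "patience": 10, "objective": "minimize"}
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}  # assumed duration suffixes


def parse_duration(text: str) -> int:
    """Convert a duration such as '90m' or '2h' to whole seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([smh])\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(float(match.group(1)) * _UNIT_SECONDS[match.group(2)])


def parse_overrides(arguments: str) -> dict:
    """Return the constants, replaced by any 'key: value' overrides in the task string."""
    config = dict(DEFAULTS)
    for key in DEFAULTS:
        # Case-insensitive, so 'Timeout: 3h' and 'timeout: 4h' both match.
        found = re.search(rf"\b{key}\s*:\s*([\w.]+)", arguments, flags=re.IGNORECASE)
        if found:
            config[key] = found.group(1)
    config["timeout_s"] = parse_duration(str(config["timeout"]))
    config["max_iterations"] = int(config["max_iterations"])
    config["patience"] = int(config["patience"])
    # Only the direction is kept; anything else falls back to the default.
    direction = str(config["objective"]).lower()
    if direction.startswith("max"):
        config["objective"] = "maximize"
    elif direction.startswith("min"):
        config["objective"] = "minimize"
    else:
        config["objective"] = DEFAULTS["objective"]
    return config
```

Calling `parse_overrides("task desc, timeout: 4h, max_iterations: 100, patience: 15")` yields a 14400-second budget with the other two constants overridden and the direction left at `minimize`.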
## Typical Use Cases
| Problem | Program | Parameters | Objective |
|---------|---------|-----------|-----------|
| Microarch DSE | gem5 simulation | cache size, assoc, pipeline width, ROB size, branch predictor | maximize IPC or minimize area×delay |
| Synthesis tuning | yosys/DC script | optimization passes, target freq, effort level | minimize area at timing closure |
| RTL parameterization | verilator sim | data width, FIFO depth, pipeline stages, buffer sizes | meet throughput target at min area |
| Compiler flags | gcc/llvm build + benchmark | -O levels, unroll factor, vectorization, scheduling | minimize runtime or code size |
| Placement/routing | openroad/innovus | utilization, aspect ratio, layer config | minimize wirelength / timing |
| Formal verification | abc/sby | bound depth, engine, timeout per property | maximize coverage in time budget |
| Memory subsystem | cacti / ramulator | bank count, row buffer policy, scheduling | optimize bandwidth/energy |
## Workflow
### Phase 0: Parse Task & Setup
1. **Parse $ARGUMENTS** to extract:
- **Program**: what to run (command, script, or Makefile target)
- **Parameter space**: which knobs to tune and their ranges/options (may be incomplete — see step 2)
- **Objective metric**: what to optimize (and how to extract it from output)
- **Constraints**: hard limits that must not be violated (e.g., timing must close)
- **Timeout**: wall-clock budget
- **Success criteria**: when is the result "good enough" to stop early?
2. **Infer missing parameter ranges** — If the user provides parameter names but NOT ranges/options, you MUST infer them before exploring:
a. **Read the source code** — search for the parameter names in the codebase:
- Look for argparse/click definitions, config files, Makefile variables, module parameters, `#define`, `parameter` (SystemVerilog), `localparam`, etc.
- Extract defaults, types, and any comments hinting at valid values
b. **Apply domain knowledge** to set reasonable ranges:
| Parameter type | Inference strategy |
|---------------|-------------------|
| Cache/memory sizes | Powers of 2, typically 1KB–16MB |
| Associativity | Powers of 2: 1, 2, 4, 8, 16 |
| Pipeline width / issue width | Small integers: 1, 2, 4, 8 |
| Buffer/queue/FIFO depth | Powers of 2: 4, 8, 16, 32, 64 |
| Clock period / frequency | Based on technology node; try ±50% from default |
| Bound depth (BMC/formal) | Geometric: 5, 10, 20, 50, 100 |
| Timeout values | Geometric: 10s, 30s, 60s, 120s, 300s |
| Boolean/enum flags | Enumerate all options found in source |
| Continuous (learning rate, threshold) | Log-scale sweep: 5 points spanning 2 orders of magnitude around default |
| Integer counts (threads, cores) | Linear: from 1 to hardware max |
c. **Start conservative** — begin with 3-5 values per parameter. Expand range later if the best result is at a boundary.
d. **Log inferred ranges** — write the inferred parameter space to `dse_results/inferred_params.md` so the user can review:
```markdown
# Inferred Parameter Space
| Parameter | Source | Default | Inferred Range | Reasoning |
|-----------|--------|---------|---------------|-----------|
| CACHE_SIZE | config.py:42 | 32768 | [8192, 16384, 32768, 65536, 131072] | powers of 2, two steps either side of default (¼× to 4×) |
| ASSOC | config.py:43 | 4 | [1, 2, 4, 8] | standard associativities |
| BMC_DEPTH | run_bmc.py:15 | 10 | [5, 10, 20, 50] | geometric, common BMC depths |
```
e. **Boundary expansion** — during the search, if the best result is at the min or max of a range, automatically extend that range by one step in that direction (but log the extension).
3. **Read the project** to understand:
- How to run the program
- Where results are produced (stdout, log files, reports)
- How to parse the objective metric from output
- Current/baseline configuration (if any)
4. **Create working directory**: `dse_results/` in project root
- `dse_results/dse_log.csv` — one row per design point
- `dse_results/DSE_REPORT.md` — final report
- `dse_results/DSE_STATE.json` — state for recovery
- `dse_results/inferred_params.md` — inferred parameter space (if ranges were not provided)
- `dse_results/configs/` — config files for each run
- `dse_results/outputs/` — raw output for each run
5. **Write a metric extraction script** (`dse_results/parse_result.py` or similar) that takes a run's output and returns the objective metric as a number. Test it on a baseline run first.
6. **Run baseline** (iteration 0): run the program with default/current parameters. Record the baseline metric. This is the point to beat.
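The range inference and boundary expansion from step 2 can be sketched for the power-of-two case as follows. Both helper names are hypothetical, and the sketch assumes the default is itself a power of two; other parameter types in the inference table would need their own generators.

```python
def power_of_two_range(default: int, steps: int = 2) -> list[int]:
    """Candidate values `steps` halvings below and `steps` doublings above the default."""
    values = []
    for k in range(-steps, steps + 1):
        value = default << k if k >= 0 else default >> -k
        if value >= 1:  # drop candidates that would fall below 1
            values.append(value)
    return values


def extend_at_boundary(values: list[int], best: int) -> list[int]:
    """If `best` sits at either end of the range, add one more step in that direction."""
    ordered = sorted(values)
    if best == ordered[0] and best > 1:
        return [best // 2] + ordered
    if best == ordered[-1]:
        return ordered + [best * 2]
    return ordered
```

With a default of 32768, `power_of_two_range` reproduces the `CACHE_SIZE` row of the example table, and `extend_at_boundary` adds 262144 only when the best point so far is 131072. Any extension should still be written to the log, as step 2e requires.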
### Phase 1: Initial Exploration
**Goal**: Quickly survey the space to understand which parameters matter most.
**Strategy**: Latin Hypercube Sampling or structured sweep of key parameters.
1. Pick 5-10 diverse design points that span the parameter ranges
2. Run them: in parallel as background processes if the runs are independent, otherwise sequentially
3. Record all results in `dse_log.csv`:
```
iteration,param1,param2,...,metric,constraint_met,timestamp,notes
0,default,default,...,baseline_val,yes,2026-03-13T10:00:00,baseline
1,val1a,val2a,...,result1,yes,2026-03-13T10:05:00,initial sweep
...
```
4. Analyze: which parameters have the most impact on the objective?
5. Narrow the search to the most sensitive parameters
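One way to pick the initial design points is a Latin Hypercube sample over the discrete value lists. This is a standard-library sketch under the assumption that every parameter is given as an ordered list of options; `latin_hypercube` is an illustrative name, not something the skill defines.

```python
import random


def latin_hypercube(space: dict[str, list], n_points: int, seed: int = 0) -> list[dict]:
    """Draw n_points configurations so each parameter's range is covered evenly."""
    rng = random.Random(seed)
    columns = {}
    for name, values in space.items():
        # One random position inside each of n_points equal strata of [0, 1).
        positions = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(positions)  # decouple the strata across parameters
        columns[name] = [
            values[min(int(p * len(values)), len(values) - 1)] for p in positions
        ]
    return [{name: columns[name][i] for name in space} for i in range(n_points)]
```

Because every stratum is used exactly once per parameter, each option appears equally often whenever `n_points` is a multiple of the option count. The sample can still contain repeated combinations, so it should be checked against `dse_log.csv` before running.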
### Phase 2: Directed Search
**Goal**: Converge toward the optimum by making informed choices.
**Strategy**: Adaptive — pick the approach that fits the problem:
- **Few parameters (≤3)**: Fine-grained grid search around the best region from Phase 1
- **Many parameters (>3)**: Coordinate descent — optimize one parameter at a time, holding others at current best
- **Binary/categorical params**: Enumerate promising combinations
- **Continuous params**: Binary search or golden section between best neighbors
- **Multi-objective**: Track Pareto frontier, explore along the front
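The coordinate descent strategy can be sketched as below. The `evaluate` callback is a stand-in for "run the program and parse the metric" and is an assumption of this example, as is the cache keyed on the parameter values; constraint handling is left out for brevity.

```python
from typing import Callable


def coordinate_descent(
    space: dict[str, list],
    start: dict,
    evaluate: Callable[[dict], float],
    minimize: bool = True,
    max_sweeps: int = 5,
) -> tuple[dict, float, int]:
    """Optimize one parameter at a time, holding the others at the current best."""
    cache: dict[tuple, float] = {}

    def score(point: dict) -> float:
        key = tuple(sorted(point.items()))
        if key not in cache:  # never evaluate an identical configuration twice
            cache[key] = evaluate(point)
        return cache[key]

    best = dict(start)
    best_score = score(best)
    for _ in range(max_sweeps):
        improved = False
        for name, values in space.items():
            for value in values:
                candidate = {**best, name: value}
                candidate_score = score(candidate)
                is_better = candidate_score < best_score if minimize else candidate_score > best_score
                if is_better:
                    best, best_score, improved = candidate, candidate_score, True
        if not improved:  # a full sweep without improvement means convergence
            break
    return best, best_score, len(cache)
```

The third return value is the number of distinct configurations evaluated, which is what counts against `MAX_ITERATIONS`.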
For each iteration:
1. **Select next design point** based on results so far:
- Look at the trend: which direction improves the metric?
- Avoid re-running configurations already evaluated
- Balance exploration (untested regions) vs exploitation (near current best)
2. **Modify parameters**: edit config file, command-line args, or source constants
3. **Run the program**: execute and capture output
4. **Parse results**: extract the objective metric and check constraints
5. **Log to `dse_log.csv`**: append the new row
6. **Check stopping conditions**:
- Timeout reached? → stop
- Max iterations reached? → stop
- Patience exhausted (no improvement in N iterations)? → stop
- Success criteria met (metric is "good enough")? → stop
- Constraint violation pattern detected? → adjust search bounds
7. **Update `DSE_STATE.json`**:
```json
{
"iteration": 15,
"status": "in_progress",
"best_metric": 1.23,
"best_params": {"cache_size": 32768, "assoc": 4, "pipeline_width": 2},
"total_iterations": 15,
"start_time": "2026-03-13T10:00:00",
"timeout": "2h",
"patience_counter": 3
}
```
8. **Decide next step** → back to step 1
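Steps 6 and 7 of the loop can be sketched as two small functions operating on the `DSE_STATE.json` fields shown above. The function names, the `est_next_run_s` estimate, and the optional `target` threshold are assumptions introduced for this example.

```python
from datetime import datetime
from typing import Optional


def record_result(state: dict, metric: float, params: dict,
                  constraint_met: bool, minimize: bool = True) -> dict:
    """Update the best point and the patience counter after one design point."""
    state["iteration"] += 1
    best = state.get("best_metric")
    improved = constraint_met and (
        best is None or (metric < best if minimize else metric > best)
    )
    if improved:
        state["best_metric"], state["best_params"] = metric, params
        state["patience_counter"] = 0
    else:  # constraint violations never count as improvements
        state["patience_counter"] += 1
    return state


def stop_reason(state: dict, now: datetime, timeout_s: float, max_iterations: int,
                patience: int, est_next_run_s: float = 0.0,
                target: Optional[float] = None, minimize: bool = True) -> Optional[str]:
    """Return the stopping reason, or None if the loop should continue."""
    elapsed = (now - datetime.fromisoformat(state["start_time"])).total_seconds()
    if elapsed + est_next_run_s >= timeout_s:
        return "timeout"
    if state["iteration"] >= max_iterations:
        return "max_iterations"
    if state["patience_counter"] >= patience:
        return "patience"
    best = state.get("best_metric")
    if target is not None and best is not None:
        if (best <= target) if minimize else (best >= target):
            return "success_criteria_met"
    return None
```

Passing the expected duration of the next run as `est_next_run_s` makes the timeout check conservative: the loop stops before starting a run that would likely overshoot the budget.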
### Phase 3: Refinement (if time allows)
If the search converged and there's still time budget:
1. **Local perturbation**: try ±1 step on each parameter from the best point
2. **Sensitivity analysis**: which parameters can be relaxed without hurting the metric?
3. **Constraint boundary**: if a constraint is nearly binding, explore near-feasible points
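The local perturbation step can be sketched as a neighbor generator over the ordered value lists. `one_step_neighbors` is a hypothetical helper, and it assumes the best point's values all appear in the lists.

```python
def one_step_neighbors(space: dict[str, list], best: dict) -> list[dict]:
    """All configurations that differ from `best` by one step in one parameter."""
    neighbors = []
    for name, values in space.items():
        index = values.index(best[name])
        for offset in (-1, 1):
            j = index + offset
            if 0 <= j < len(values):  # skip steps that fall off the range
                neighbors.append({**best, name: values[j]})
    return neighbors
```

For a best point in the interior of every range this yields two neighbors per parameter; points already in the log can then be filtered out before running.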
### Phase 4: Report
Write `dse_results/DSE_REPORT.md`:
```markdown
# Design Space Exploration Report
**Task**: [description]
**Date**: [start] → [end]
**Total iterations**: N
**Wall-clock time**: X hours Y minutes
## Objective
- **Metric**: [what was optimized]
- **Direction**: minimize / maximize
- **Baseline**: [value]
- **Best found**: [value] ([improvement]% better than baseline)
## Best Configuration
| Parameter | Baseline | Best |
|-----------|----------|------|
| param1 | default | best_val |
| param2 | default | best_val |
| ... | ... | ... |
## Search Trajectory
| Iteration | param1 | param2 | ... | Metric | Notes |
|-----------|--------|--------|-----|--------|-------|
| 0 (baseline) | ... | ... | ... | ... | baseline |
| 1 | ... | ... | ... | ... | initial sweep |
| ... | ... | ... | ... | ... | ... |
| N (best) | ... | ... | ... | ... | ★ best |
## Parameter Sensitivity
- **param1**: [high/medium/low impact] — [brief explanation]
- **param2**: [high/medium/low impact] — [brief explanation]
## Pareto Frontier (if multi-objective)
[Table or description of non-dominated points]
## Stopping Reason
[timeout / max_iterations / patience / success_criteria_met]
## Recommendations
- [actionable insights from the exploration]
- [which parameters matter most]
- [suggested follow-up explorations]
```
Also generate a summary plot if matplotlib is available:
- Convergence curve (metric vs iteration)
- Parameter sensitivity bar chart
- Pareto frontier scatter (if multi-objective)
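Before a Pareto frontier can be tabulated or plotted, the non-dominated points have to be extracted from the log. A minimal sketch, assuming each logged point is a dict of metric values and that `objectives` maps each metric name to `minimize` or `maximize` (both conventions are assumptions of this example):

```python
def pareto_front(points: list[dict], objectives: dict[str, str]) -> list[dict]:
    """Return the points that no other point dominates."""

    def signed(point: dict) -> list[float]:
        # Flip maximized metrics so that smaller is always better.
        return [point[m] if d == "minimize" else -point[m] for m, d in objectives.items()]

    def dominates(a: dict, b: dict) -> bool:
        sa, sb = signed(a), signed(b)
        no_worse = all(x <= y for x, y in zip(sa, sb))
        strictly_better = any(x < y for x, y in zip(sa, sb))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]
```

The pairwise comparison is quadratic in the number of points, which is negligible at the default cap of 50 iterations.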
## State Recovery
If the context window compacts mid-run, the loop recovers from `DSE_STATE.json` + `dse_log.csv`:
1. Read `DSE_STATE.json` for current iteration, best params, patience counter
2. Read `dse_log.csv` for full history
3. Resume from next iteration
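A sketch of the recovery step, assuming the file names above and the `dse_log.csv` column layout from Phase 1. The reserved column names and the `config_key` helper are illustrative choices; the log is treated as the ground truth for which iteration comes next.

```python
import csv
import json
from pathlib import Path

# Columns of dse_log.csv that are not tunable parameters (per the Phase 1 header).
RESERVED = {"iteration", "metric", "constraint_met", "timestamp", "notes"}


def config_key(row: dict) -> tuple:
    """Hashable identity of a configuration, comparing values as strings."""
    return tuple(sorted((k, str(v)) for k, v in row.items() if k not in RESERVED))


def load_progress(results_dir="dse_results"):
    """Return (state, log rows, seen configurations, next iteration number)."""
    root = Path(results_dir)
    state_path, log_path = root / "DSE_STATE.json", root / "dse_log.csv"
    state = json.loads(state_path.read_text()) if state_path.exists() else None
    rows = []
    if log_path.exists():
        with log_path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
    seen = {config_key(row) for row in rows}
    last_logged = max((int(row["iteration"]) for row in rows), default=-1)
    return state, rows, seen, last_logged + 1
```

The `seen` set doubles as the duplicate check: a candidate is skipped when `config_key(candidate) in seen`.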
## Key Rules
- Work AUTONOMOUSLY — do not ask the user for permission at each iteration
- **Every run must be logged** — even failed runs, constraint violations, errors. The log is the ground truth.
- **Never re-run an identical configuration** — check `dse_log.csv` before each run
- **Respect the timeout** — check elapsed time before starting a new iteration. If the next run is likely to exceed the timeout, stop and report.
- **Parse metrics programmatically** — write a parsing script, don't eyeball logs
- **Keep raw outputs** — save each run's full output in `dse_results/outputs/iter_N/`
- **Constraint violations are not improvements** — a design point that violates constraints is never "best", regardless of the metric
- If a run crashes, log the error, skip that point, and continue with the next
- If the same crash repeats 3 times with different configs, stop and report the issue
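As an illustration of the "parse metrics programmatically" rule, a parsing helper might look like the sketch below. The `name = value` or `name: value` line format is purely an assumption for the example; real tools each have their own report layout, so the pattern has to be adapted to the actual output.

```python
import re
from typing import Optional


def parse_metric(output: str, name: str) -> Optional[float]:
    """Return the last reported value of `name`, or None if it never appears."""
    number = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
    pattern = rf"^\s*{re.escape(name)}\s*[:=]?\s*({number})"
    matches = re.findall(pattern, output, flags=re.MULTILINE)
    return float(matches[-1]) if matches else None
```

A `None` result signals a crashed or malformed run, which should still be logged as a failed row rather than silently dropped.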
## Example Invocations
```
# Minimal — just name the parameters, let the agent figure out ranges
/dse-loop "Run gem5 mcf benchmark. Tune: L1D_SIZE, L2_SIZE, ROB_ENTRIES. Objective: maximize IPC. Timeout: 3h"
# Partial — some ranges given, some not
/dse-loop "Run make synth. Tune: CLOCK_PERIOD [5ns, 4ns, 3ns, 2ns], FLATTEN, ABC_SCRIPT. Objective: minimize area at timing closure. Timeout: 1h"
# Fully specified — explicit ranges for everything
/dse-loop "Simulate processor with FIFO_DEPTH [4,8,16,32], ISSUE_WIDTH [1,2,4], PREFETCH [on,off]. Run: make sim. Objective: max throughput/area. Timeout: 2h"
# Real-world: PDAG-SFA formal verification tuning
/dse-loop "Run python run_bmc.py. Tune: BMC_DEPTH, ENGINE, TIMEOUT_PER_PROP. Objective: maximize properties proved. Timeout: 2h"
```
INPUTS
- $ARGUMENTS REQUIRED
user-provided task description and overrides
e.g. Run gem5 mcf benchmark. Tune: L1D_SIZE, L2_SIZE. Objective: maximize IPC. Timeout: 3h
REQUIRED CONTEXT
- $ARGUMENTS containing task description, program to run, parameters, objective metric
OPTIONAL CONTEXT
- TIMEOUT
- MAX_ITERATIONS
- PATIENCE
- OBJECTIVE direction
EXPECTED OUTPUT
- Format: structured_report
- Schema: markdown_sections · Objective, Best Configuration, Search Trajectory, Parameter Sensitivity, Pareto Frontier, Stopping Reason, Recommendations
- Constraints:
  - write dse_log.csv, DSE_REPORT.md, DSE_STATE.json and supporting files
  - log every run including failures
  - never re-run identical configurations
SUCCESS CRITERIA
- Stop when objective metric is good enough
- Stop when timeout reached
- Stop when max iterations reached
- Stop when patience exhausted
FAILURE MODES
- May fail to infer valid parameter ranges from source code
- May re-run duplicate configurations if log is not checked
- May exceed timeout if run time is underestimated
- May treat constraint violations as valid improvements
EXAMPLES
Includes tables of typical use cases and example invocations plus sample DSE_REPORT.md, dse_log.csv, and DSE_STATE.json formats.
CAVEATS
- Dependencies:
  - $ARGUMENTS
  - project source files
  - baseline program execution
QUALITY
- OVERALL: 0.90
- CLARITY: 0.85
- SPECIFICITY: 0.95
- REUSABILITY: 0.90
- COMPLETENESS: 0.95
IMPROVEMENT SUGGESTIONS
- Add explicit placeholder syntax example for $ARGUMENTS at the top for faster reuse.