agent analysis skill risk: low
ML Experiment Results Analyzer
Analyzes ML experiment results from JSON/CSV files by locating outputs, building comparison tables with independent/dependent variables and deltas, performing statistical analysis,…
SKILL 1 file
SKILL.md
---
name: auto-claude-code-research-in-sleep-analyze-results
description: "Analyze ML experiment results, compute statistics, generate comparison tables and insights. Use when user says \"analyze results\", \"compare\", or needs to interpret experimental data."
---

# Analyze Experiment Results

Analyze: $ARGUMENTS

## Workflow

### Step 1: Locate Results

Find all relevant JSON/CSV result files:
- Check `figures/`, `results/`, or project-specific output directories
- Parse JSON results into structured data

### Step 2: Build Comparison Table

Organize results by:
- **Independent variables**: model type, hyperparameters, data config
- **Dependent variables**: primary metric (e.g., perplexity, accuracy, loss), secondary metrics
- **Delta vs baseline**: always compute relative improvement

### Step 3: Statistical Analysis

- If multiple seeds: report mean +/- std, check reproducibility
- If sweeping a parameter: identify trends (monotonic, U-shaped, plateau)
- Flag outliers or suspicious results

### Step 4: Generate Insights

For each finding, structure as:
1. **Observation**: what the data shows (with numbers)
2. **Interpretation**: why this might be happening
3. **Implication**: what this means for the research question
4. **Next step**: what experiment would test the interpretation

### Step 5: Update Documentation

If findings are significant:
- Propose updates to project notes or experiment reports
- Draft a concise finding statement (1-2 sentences)

## Output Format

Always include:
1. Raw data table
2. Key findings (numbered, concise)
3. Suggested next experiments (if any)
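Steps 2 and 3 of the workflow (seed aggregation and delta vs baseline) can be illustrated with a minimal sketch. The skill does not prescribe an implementation, so everything here is an assumption made for illustration: the function name `summarize_runs`, the record keys `config` and the metric name, the `higher_is_better` flag, and the choice of sample standard deviation. The values in the example are made up and are not real results.

```python
import statistics
from collections import defaultdict


def summarize_runs(runs, metric, baseline, higher_is_better=True):
    """Aggregate runs over seeds and compute relative delta vs a baseline.

    `runs` is a list of dicts with a hypothetical "config" key and a metric key.
    """
    by_config = defaultdict(list)
    for run in runs:
        by_config[run["config"]].append(float(run[metric]))

    if baseline not in by_config:
        raise ValueError(f"baseline config {baseline!r} not found in runs")

    summary = {}
    for config, values in by_config.items():
        summary[config] = {
            "mean": statistics.mean(values),
            # Sample std needs at least two seeds; report 0.0 for a single run.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
            "n_seeds": len(values),
        }

    base_mean = summary[baseline]["mean"]
    if base_mean == 0:
        raise ValueError("baseline mean is zero; relative delta is undefined")

    for row in summary.values():
        delta = (row["mean"] - base_mean) / abs(base_mean)
        row["rel_delta_pct"] = 100.0 * delta
        # For metrics such as loss or perplexity, lower is better.
        row["improved"] = delta > 0 if higher_is_better else delta < 0
    return summary


if __name__ == "__main__":
    # Illustrative, made-up values.
    example = [
        {"config": "baseline", "seed": 0, "accuracy": 0.80},
        {"config": "baseline", "seed": 1, "accuracy": 0.82},
        {"config": "variant", "seed": 0, "accuracy": 0.85},
        {"config": "variant", "seed": 1, "accuracy": 0.87},
    ]
    for name, row in summarize_runs(example, "accuracy", "baseline").items():
        print(f"{name}: {row['mean']:.3f} +/- {row['std']:.3f} "
              f"({row['rel_delta_pct']:+.1f}% vs baseline, n={row['n_seeds']})")
```

Passing `higher_is_better=False` keeps the sign of the delta unchanged but marks a decrease as an improvement, which suits loss and perplexity.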
INPUTS
- $ARGUMENTS (required): description of the experiment results to analyze
REQUIRED CONTEXT
- $ARGUMENTS
TOOLS REQUIRED
- file_search
- code_execution
ROLES & RULES
- Find all relevant JSON/CSV result files
- Organize results by independent variables, dependent variables and delta vs baseline
- Always compute relative improvement against the baseline
- Report mean +/- std and check reproducibility if multiple seeds
- Identify trends (monotonic, U-shaped, plateau) if sweeping a parameter
- Flag outliers or suspicious results
- Structure each finding as Observation, Interpretation, Implication, Next step
- Propose updates to documentation if findings are significant
- Always include raw data table, key findings and suggested next experiments
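The sweep and outlier rules above leave the method open. The following is a minimal sketch under stated assumptions: `classify_trend` labels an ordered sweep by the signs of consecutive differences, and `flag_outliers` uses a modified z-score based on the median absolute deviation with a threshold of 3.5. Both function names, the label strings, the `tol` parameter, and the choice of outlier test are illustrative choices, not something the skill specifies.

```python
import statistics


def classify_trend(values, tol=0.0):
    """Label metric values, ordered by the swept parameter, with a coarse trend."""
    if len(values) < 2:
        raise ValueError("need at least two points to classify a trend")
    diffs = [b - a for a, b in zip(values, values[1:])]
    # Differences within `tol` count as no change.
    signs = [0 if abs(d) <= tol else (1 if d > 0 else -1) for d in diffs]
    nonzero = [s for s in signs if s != 0]

    if not nonzero:
        return "flat"
    if all(s == nonzero[0] for s in nonzero):
        # Same direction throughout; a flat final step suggests saturation.
        return "plateau" if signs[-1] == 0 else "monotonic"
    changes = sum(1 for a, b in zip(nonzero, nonzero[1:]) if a != b)
    if changes == 1:
        return "U-shaped" if nonzero[0] < 0 else "inverted U-shaped"
    return "mixed"


def flag_outliers(values, threshold=3.5):
    """Return indices whose MAD-based modified z-score exceeds `threshold`."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # Spread is degenerate; this test cannot separate outliers.
        return []
    return [i for i, d in enumerate(deviations) if 0.6745 * d / mad > threshold]
```

A flagged index is a prompt to inspect that run (diverged training, wrong config, logging error), not proof that the result is invalid.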
EXPECTED OUTPUT
- Format: structured_report
- Schema: numbered_list · Raw data table, Key findings, Suggested next experiments
- Constraints:
  - always include raw data table
  - always include numbered key findings
  - always include suggested next experiments if any
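One way to satisfy the output constraints above is to assemble the three required parts programmatically. This is a sketch, assuming markdown output (the page lists the preferred output format as unspecified); the function name `render_report` and the section headings are hypothetical.

```python
def render_report(rows, findings, next_experiments=()):
    """Render a three-part report: raw data table, key findings, next experiments.

    `rows` is a non-empty list of dicts that share the same keys.
    """
    if not rows:
        raise ValueError("the raw data table is required and cannot be empty")

    headers = list(rows[0].keys())
    lines = ["## Raw data table", ""]
    lines.append("| " + " | ".join(headers) + " |")
    lines.append("| " + " | ".join("---" for _ in headers) + " |")
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")

    lines += ["", "## Key findings", ""]
    lines += [f"{i}. {text}" for i, text in enumerate(findings, start=1)]

    # The last section is only emitted when there is something to suggest.
    if next_experiments:
        lines += ["", "## Suggested next experiments", ""]
        lines += [f"{i}. {text}" for i, text in enumerate(next_experiments, start=1)]
    return "\n".join(lines)
```

Each entry in `findings` would normally carry the Observation, Interpretation, Implication, and Next step structure from Step 4 in condensed form.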
SUCCESS CRITERIA
- Locate and parse result files
- Build comparison tables with deltas
- Perform statistical analysis
- Generate structured insights
- Update documentation when appropriate
CAVEATS
- Dependencies:
  - $ARGUMENTS
  - result files in figures/, results/ or project-specific directories
- Missing context:
  - Target project root or experiment naming conventions
  - Preferred statistical libraries or output formats (e.g., markdown vs LaTeX)
- Ambiguities:
  - Does not specify exact project-specific output directories or how they are discovered.
  - "Delta vs baseline" does not define how the baseline is identified.
QUALITY
- OVERALL: 0.79
- CLARITY: 0.85
- SPECIFICITY: 0.75
- REUSABILITY: 0.80
- COMPLETENESS: 0.78
IMPROVEMENT SUGGESTIONS
- Add a configurable list of result directories as a prompt variable instead of hard-coded examples.
- Specify how the baseline run is selected (e.g., first entry, lowest loss, or user-provided name).
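Both suggestions can be sketched together. The code below assumes a configurable `result_dirs` parameter that defaults to the two directories named in the skill, and a `policy` argument covering the three baseline options listed above. The function names, the policy strings, and the `name` and `loss` record keys are hypothetical and only illustrate the idea.

```python
import csv
import json
from pathlib import Path


def find_result_files(root, result_dirs=("figures", "results")):
    """Collect JSON/CSV files under each configured directory, recursively."""
    root = Path(root)
    files = []
    for directory in result_dirs:
        base = root / directory
        if not base.is_dir():
            continue  # Skip directories that do not exist in this project.
        for pattern in ("*.json", "*.csv"):
            files.extend(sorted(base.rglob(pattern)))
    return files


def load_records(path):
    """Parse one result file into a list of dicts."""
    path = Path(path)
    if path.suffix == ".json":
        with path.open() as handle:
            data = json.load(handle)
        return data if isinstance(data, list) else [data]
    with path.open(newline="") as handle:
        # CSV values arrive as strings; convert metrics to float before use.
        return list(csv.DictReader(handle))


def select_baseline(runs, policy="first", name=None, metric="loss"):
    """Pick the baseline run: first entry, lowest metric, or a named run."""
    if not runs:
        raise ValueError("no runs to choose a baseline from")
    if policy == "first":
        return runs[0]
    if policy == "lowest_metric":
        return min(runs, key=lambda run: float(run[metric]))
    if policy == "named":
        for run in runs:
            if run.get("name") == name:
                return run
        raise ValueError(f"no run named {name!r}")
    raise ValueError(f"unknown baseline policy: {policy!r}")
```

Stating the policy explicitly in the report keeps the delta column interpretable, since "first" and "lowest_metric" can select different runs from the same data.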
USAGE
Copy the prompt above and paste it into your AI of choice — Claude, ChatGPT, Gemini, or anywhere else you're working. Replace any placeholder sections with your own context, then ask for the output.