Gemini MCP Multi-Round Research Reviewer
The prompt defines a workflow to install a gemini-review MCP bridge and obtain multi-round critical reviews of ML research from Gemini, covering context gathering, initial high-rig…
SKILL.md
---
name: auto-claude-code-research-in-sleep-research-review
description: "Get a deep critical review of research from Gemini via gemini-review MCP. Use when user says \"review my research\", \"help me review\", \"get external review\", or wants critical feedback on research ideas, papers, or experimental results."
---
> Override for Codex users who want **Gemini**, not a second Codex agent, to act as the reviewer. Install this package **after** `skills/skills-codex/*`.
# Research Review via `gemini-review` MCP (high-rigor review)
Get a multi-round critical review of research work from an external LLM with maximum reasoning depth.
## Constants
- **REVIEWER_MODEL = `gemini-review`** — Gemini reviewer invoked through the local `gemini-review` MCP bridge. Set `GEMINI_REVIEW_MODEL` if you need a specific Gemini model override.
## Context: $ARGUMENTS
## Prerequisites
- Install the base Codex-native skills first: copy `skills/skills-codex/*` into `~/.codex/skills/`.
- Then install this overlay package: copy `skills/skills-codex-gemini-review/*` into `~/.codex/skills/` and allow it to overwrite the same skill names.
- Register the local reviewer bridge:
```bash
codex mcp add gemini-review -- python3 ~/.codex/mcp-servers/gemini-review/server.py
```
- This gives Codex access to `mcp__gemini-review__review_start`, `mcp__gemini-review__review_reply_start`, and `mcp__gemini-review__review_status`.
## Workflow
### Step 1: Gather Research Context
Before calling the external reviewer, compile a comprehensive briefing:
1. Read project narrative documents (e.g., STORY.md, README.md, paper drafts)
2. Read any memory/notes files for key findings and experiment history
3. Identify: core claims, methodology, key results, known weaknesses
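The briefing step above can be sketched as a small helper. This is a minimal illustration, not part of the skill: `compile_briefing`, the `DEFAULT_SOURCES` tuple, and the `max_chars` cap are hypothetical names and values; only the file names `STORY.md` and `README.md` come from the examples above.

```python
from pathlib import Path

# Hypothetical defaults: file names are taken from the examples in Step 1;
# the per-file size cap is an assumption, not something the skill specifies.
DEFAULT_SOURCES = ("STORY.md", "README.md")


def compile_briefing(project_root, sources=DEFAULT_SOURCES, questions=(), max_chars=20000):
    """Concatenate the narrative files that exist into one briefing string."""
    root = Path(project_root)
    parts = []
    for name in sources:
        path = root / name
        if not path.is_file():
            continue  # skip missing files rather than failing
        text = path.read_text(encoding="utf-8")
        if len(text) > max_chars:
            text = text[:max_chars] + "\n[truncated]"
        parts.append(f"## {name}\n{text.strip()}")
    if questions:
        bullet_list = "\n".join(f"- {q}" for q in questions)
        parts.append("## Specific questions\n" + bullet_list)
    return "\n\n".join(parts)
```

The core claims, methodology, results, and known weaknesses still need to be written by hand; the helper only gathers the raw material that the reviewer cannot read on its own.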
### Step 2: Initial Review (Round 1)
Send a detailed prompt that asks for a high-rigor review:
```
mcp__gemini-review__review_start:
  prompt: |
    [Full research context + specific questions]

    Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:
    1. Logical gaps or unjustified claims
    2. Missing experiments that would strengthen the story
    3. Narrative weaknesses
    4. Whether the contribution is sufficient for a top venue

    Please be brutally honest.
```
After this start call, immediately save the returned `jobId` and poll `mcp__gemini-review__review_status` with a bounded `waitSeconds` until `done=true`. Treat the completed status payload's `response` as the reviewer output, and save the completed `threadId` for any follow-up round.
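The start-then-poll pattern described above might look like the following sketch. It assumes a hypothetical `call_tool(name, arguments)` callable that invokes an MCP tool and returns its payload as a dict; in practice the agent issues these tool calls directly. The field names `jobId`, `waitSeconds`, `done`, `response`, and `threadId` come from the text, while the `prompt` argument name follows the call template above, and the poll limits are assumptions.

```python
START_TOOL = "mcp__gemini-review__review_start"
STATUS_TOOL = "mcp__gemini-review__review_status"


def wait_for_job(call_tool, job_id, wait_seconds=30, max_polls=40):
    """Poll review_status with a bounded wait until done=true.

    Returns (response, threadId) from the completed status payload.
    `call_tool` is a hypothetical stand-in for the MCP client.
    """
    for _ in range(max_polls):
        status = call_tool(STATUS_TOOL, {"jobId": job_id, "waitSeconds": wait_seconds})
        if status.get("done"):
            return status["response"], status["threadId"]
    raise TimeoutError(f"review job {job_id} not done after {max_polls} polls")


def run_initial_review(call_tool, prompt, **poll_kwargs):
    """Start Round 1, save the jobId immediately, then wait for completion."""
    job_id = call_tool(START_TOOL, {"prompt": prompt})["jobId"]
    return wait_for_job(call_tool, job_id, **poll_kwargs)
```

Capping the number of polls keeps a stalled job from blocking the session indefinitely; the cap of 40 is arbitrary and should be tuned to how long a review usually takes.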
### Step 3: Iterative Dialogue (Rounds 2-N)
To continue the conversation, call `mcp__gemini-review__review_reply_start` with the saved `threadId` from the completed round, then poll `mcp__gemini-review__review_status` with the returned `jobId` until `done=true`.
For each round:
1. **Respond** to criticisms with evidence/counterarguments
2. **Ask targeted follow-ups** on the most actionable points
3. **Request specific deliverables**: experiment designs, paper outlines, claims matrices
Key follow-up patterns:
- "If we reframe X as Y, does that change your assessment?"
- "What's the minimum experiment to satisfy concern Z?"
- "Please design the minimal additional experiment package (highest acceptance lift per GPU week)"
- "Please write a mock NeurIPS/ICML review with scores"
- "Give me a results-to-claims matrix for possible experimental outcomes"
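One way to assemble a follow-up round from the three parts listed above (responses, targeted questions, requested deliverable) is sketched below. `build_followup` is a hypothetical helper, and the argument names `threadId` and `prompt` for `mcp__gemini-review__review_reply_start` are assumptions based on the rest of this document rather than a confirmed schema.

```python
def build_followup(thread_id, rebuttals, questions, deliverable=None):
    """Build the arguments for one review_reply_start call.

    rebuttals: list of (criticism, response) pairs
    questions: list of targeted follow-up questions
    deliverable: optional description of a requested artifact
    """
    sections = []
    if rebuttals:
        body = "\n".join(f"- On '{c}': {r}" for c, r in rebuttals)
        sections.append("Responses to your criticisms:\n" + body)
    if questions:
        body = "\n".join(f"- {q}" for q in questions)
        sections.append("Follow-up questions:\n" + body)
    if deliverable:
        sections.append(f"Please provide: {deliverable}")
    # Argument names below are assumed, not taken from a published schema.
    return {"threadId": thread_id, "prompt": "\n\n".join(sections)}
```

Keeping rebuttals paired with the criticism they answer makes it easier for the reviewer to tell which concerns are considered resolved.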
### Step 4: Convergence
Stop iterating when:
- Both sides agree on the core claims and their evidence requirements
- A concrete experiment plan is established
- The narrative structure is settled
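The stopping rule can be made explicit with a small state record. This sketch reads the three conditions above as jointly required, which is an interpretation, and adds a hypothetical `max_rounds` safety cap that the skill itself does not define.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    claims_agreed: bool = False          # core claims and evidence requirements agreed
    experiment_plan_ready: bool = False  # concrete experiment plan established
    narrative_settled: bool = False      # narrative structure settled


def should_stop(state, round_no, max_rounds=5):
    """Stop when all three conditions hold, or when the assumed round cap is hit."""
    converged = (
        state.claims_agreed
        and state.experiment_plan_ready
        and state.narrative_settled
    )
    return converged or round_no >= max_rounds
```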
### Step 5: Document Everything
Save the full interaction and conclusions to a review document in the project root:
- Round-by-round summary of criticisms and responses
- Final consensus on claims, narrative, and experiments
- Claims matrix (what claims are allowed under each possible outcome)
- Prioritized TODO list with estimated compute costs
- Paper outline if discussed
Update project memory/notes with key review conclusions.
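A review document with the sections listed above could be rendered as follows. The layout, heading names, and the `render_review_doc` function are illustrative assumptions; the skill does not prescribe an exact schema, and expressing compute estimates in GPU hours is likewise a choice made for this sketch.

```python
def render_review_doc(rounds, consensus, claims_matrix, todos, thread_id, outline=None):
    """Render a self-contained markdown review document.

    rounds: list of (criticism, response) summaries, one per round
    claims_matrix: dict mapping experimental outcome -> allowed claim
    todos: list of (task, estimated_gpu_hours), already in priority order
    """
    lines = ["# Research Review", "", f"Reviewer thread: `{thread_id}`", ""]
    lines += ["## Round-by-round summary", ""]
    for i, (criticism, response) in enumerate(rounds, start=1):
        lines.append(f"- Round {i}: {criticism} / Response: {response}")
    lines += ["", "## Final consensus", "", consensus, ""]
    lines += ["## Claims matrix", "", "| Outcome | Allowed claim |", "| --- | --- |"]
    for outcome, claim in claims_matrix.items():
        lines.append(f"| {outcome} | {claim} |")
    lines += ["", "## Prioritized TODO", ""]
    for rank, (task, gpu_hours) in enumerate(todos, start=1):
        lines.append(f"{rank}. {task} (est. {gpu_hours} GPU hours)")
    if outline:  # the paper outline is included only if it was discussed
        lines += ["", "## Paper outline", "", outline]
    return "\n".join(lines) + "\n"
```

Recording the `threadId` at the top of the document keeps the information needed to resume the review alongside its conclusions.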
## Key Rules
- Always ask the Gemini reviewer for strict, high-rigor feedback.
- Send comprehensive context in Round 1, because the external model cannot read your files.
- Be honest about weaknesses; hiding them leads to worse feedback.
- Push back on criticisms you disagree with, but accept valid ones.
- Focus on actionable feedback: ask "what experiment would fix this?"
- Document the completed `threadId` for potential future resumption.
- The review document should be self-contained (readable without the conversation).
## Prompt Templates
### For initial review:
"I'm going to present a complete ML research project for your critical review. Please act as a senior ML reviewer (NeurIPS/ICML level)..."
### For experiment design:
"Please design the minimal additional experiment package that gives the highest acceptance lift per GPU week. Our compute: [describe]. Be very specific about configurations."
### For paper structure:
"Please turn this into a concrete paper outline with section-by-section claims and figure plan."
### For claims matrix:
"Please give me a results-to-claims matrix: what claim is allowed under each possible outcome of experiments X and Y?"
### For mock review:
"Please write a mock NeurIPS review with: Summary, Strengths, Weaknesses, Questions for Authors, Score, Confidence, and What Would Move Toward Accept."
INPUTS
- $ARGUMENTS REQUIRED
research context passed to the workflow
REQUIRED CONTEXT
- project narrative documents (STORY.md, README.md, paper drafts)
- memory/notes files
- core claims/methodology/results/weaknesses
OPTIONAL CONTEXT
- GEMINI_REVIEW_MODEL override
- specific questions for reviewer
TOOLS REQUIRED
- mcp__gemini-review__review_start
- mcp__gemini-review__review_reply_start
- mcp__gemini-review__review_status
ROLES & RULES
Role assignments
- Please act as a senior ML reviewer (NeurIPS/ICML level).
- Always ask the Gemini reviewer for strict, high-rigor feedback.
- Send comprehensive context in Round 1 — the external model cannot read your files.
- Be honest about weaknesses — hiding them leads to worse feedback.
- Push back on criticisms you disagree with, but accept valid ones.
- Focus on ACTIONABLE feedback — "what experiment would fix this?"
- Document the completed `threadId` for potential future resumption.
- The review document should be self-contained (readable without the conversation).
EXPECTED OUTPUT
- Format
- structured_report
- Schema
- markdown_sections · Round-by-round summary of criticisms and responses, Final consensus on claims, narrative, and experiments, Claims matrix, Prioritized TODO list with estimated compute costs, Paper outline if discussed
- Constraints
- round-by-round summary
- final consensus on claims
- claims matrix
- prioritized TODO list with compute costs
- paper outline if discussed
- self-contained document
SUCCESS CRITERIA
- Both sides agree on the core claims and their evidence requirements
- A concrete experiment plan is established
- The narrative structure is settled
- Focus on actionable feedback
EXAMPLES
Includes multiple prompt templates for initial review, experiment design, paper structure, claims matrix, and mock review, plus example follow-up question patterns.
CAVEATS
- Dependencies
- Requires previous installation of skills/skills-codex/* and skills/skills-codex-gemini-review/*
- Requires registered gemini-review MCP bridge
- Requires project narrative documents (STORY.md, README.md, paper drafts)
- Requires memory/notes files
- Missing context
- Target project type or domain (only ML/NeurIPS examples given)
- Exact schema or required fields for the final review document
- Error handling or fallback behavior if MCP calls fail
- Ambiguities
- Does not specify exact format or length for the compiled briefing in Step 1.
- Assumes familiarity with MCP tool outputs (jobId, threadId, status payload) without defining them.
- $ARGUMENTS placeholder is mentioned but never explained or given an example.
QUALITY
- OVERALL
- 0.68
- CLARITY
- 0.78
- SPECIFICITY
- 0.82
- REUSABILITY
- 0.38
- COMPLETENESS
- 0.79
IMPROVEMENT SUGGESTIONS
- Replace the single $ARGUMENTS placeholder with explicit input variables (e.g., {{research_context_files}}, {{specific_questions}}).
- Add a short 'Success Criteria' section that defines when the review process is considered complete.
- Extract the reviewer prompt templates into a reusable 'Prompt Library' subsection with clear placeholders.