agent marketing skill risk: low
Firecrawl SEO Site Crawling Commands
Defines Firecrawl MCP tools (crawl, map, scrape, search) with parameters, SEO usage patterns, cross-skill integrations, error handling, and cost notes for website analysis.
- External action: medium
SKILL.md
---
name: seo-firecrawl
description: ")"
---

# Firecrawl Extension for Claude SEO

This skill requires the Firecrawl extension to be installed:

```bash
./extensions/firecrawl/install.sh
```

**Check availability:** Before using any Firecrawl tool, verify the MCP server is connected by checking if `firecrawl_scrape` or any Firecrawl tool is available. If tools are not available, inform the user the extension is not installed and provide install instructions.

## Quick Reference

| Command | Purpose |
|---------|---------|
| `/seo firecrawl crawl <url>` | Full-site crawl with content extraction |
| `/seo firecrawl map <url>` | Discover site structure (URLs only, fast) |
| `/seo firecrawl scrape <url>` | Single-page scrape with JS rendering |
| `/seo firecrawl search <query> <url>` | Search within a crawled site |

## Commands

### crawl -- Full-Site Crawl

Crawl an entire website starting from the given URL. Returns page content, metadata, and links for all discovered pages.

**MCP Tool:** `firecrawl_crawl`

**Parameters:**
- `url` (required): Starting URL to crawl
- `limit`: Max pages to crawl (default: 100, max: 500)
- `maxDepth`: Max link depth from start URL (default: 3)
- `includePaths`: Array of glob patterns to include (e.g., `["/blog/*"]`)
- `excludePaths`: Array of glob patterns to exclude (e.g., `["/admin/*", "/api/*"]`)
- `scrapeOptions.formats`: Output formats -- `["markdown", "html", "links"]`

**SEO Usage Patterns:**
1. **Comprehensive audit crawl**: Crawl full site, extract all pages for subagent analysis
2. **Section-focused crawl**: Use `includePaths` to audit only `/blog/*` or `/products/*`
3. **Broken link detection**: Crawl with `["links"]` format, check all hrefs for 404s
4. **Content inventory**: Extract all page titles, meta descriptions, H1s at scale
5. **SPA/JS-rendered sites**: Firecrawl renders JavaScript, solving the Issue #11 problem

**Example orchestration for `/seo audit`:**
```
1. firecrawl_map(url) -> get all URLs (fast, no content)
2. Filter to top 50 most important pages (homepage, key sections)
3. firecrawl_crawl(url, limit=50) -> get full content
4. Feed content to seo-technical, seo-content, seo-schema agents
```

**Cost awareness:**
- Free tier: 500 credits/month
- 1 credit = 1 page crawled or scraped
- Map operations are cheaper (0.5 credits per URL discovered)
- Always inform user of estimated credit usage before large crawls

### map -- Site Structure Discovery

Discover all URLs on a website without fetching content. Fast and credit-efficient.

**MCP Tool:** `firecrawl_map`

**Parameters:**
- `url` (required): Website URL to map
- `limit`: Max URLs to discover (default: 5000)
- `search`: Optional search term to filter URLs

**SEO Usage Patterns:**
1. **Sitemap comparison**: Map site, compare discovered URLs vs XML sitemap
2. **Orphan page detection**: URLs in sitemap but not linked from any page
3. **Crawl budget analysis**: Total indexable pages vs pages linked from homepage
4. **URL pattern analysis**: Identify URL structure patterns, duplicates, parameter bloat
5. **Pre-audit discovery**: Run map first, then targeted crawl on key sections

**Output:** Array of URLs. Present as:
```
Site: example.com
Pages discovered: 342

URL Pattern Breakdown:
  /blog/*          - 128 pages (37%)
  /products/*      - 89 pages (26%)
  /category/*      - 45 pages (13%)
  /pages/*         - 32 pages (9%)
  / (root pages)   - 48 pages (14%)
```

### scrape -- Single-Page Deep Scrape

Scrape a single page with full JavaScript rendering. More thorough than `fetch_page.py` because it executes JS and waits for dynamic content.

**MCP Tool:** `firecrawl_scrape`

**Parameters:**
- `url` (required): Page URL to scrape
- `formats`: Output formats -- `["markdown", "html", "links", "screenshot"]`
- `onlyMainContent`: Strip nav/footer/sidebar (default: true)
- `waitFor`: CSS selector or milliseconds to wait for content
- `timeout`: Request timeout in ms (default: 30000)
- `actions`: Browser actions before scraping (click, scroll, wait)

**SEO Usage Patterns:**
1. **SPA content extraction**: Scrape JS-rendered React/Vue/Angular pages
2. **Dynamic content audit**: Pages with lazy-loaded content below the fold
3. **Paywall/login detection**: Identify content behind authentication walls
4. **Main content extraction**: Use `onlyMainContent` for clean E-E-A-T analysis
5. **Screenshot capture**: Use `screenshot` format for visual analysis

**When to use scrape vs fetch_page.py:**

| Scenario | Use |
|----------|-----|
| Static HTML page | `fetch_page.py` (no API cost) |
| JS-rendered SPA | `firecrawl_scrape` (renders JS) |
| Need response headers | `fetch_page.py` (returns headers) |
| Need clean markdown | `firecrawl_scrape` (better extraction) |
| Rate-limited/blocked | `firecrawl_scrape` (handles anti-bot) |

### search -- Site-Scoped Search

Search within a website for specific content. Useful for finding pages related to a topic without crawling everything.

**MCP Tool:** `firecrawl_search`

**Parameters:**
- `query` (required): Search query
- `url` (required): Website to search within
- `limit`: Max results (default: 10)
- `scrapeOptions.formats`: Output format for matched pages

**SEO Usage Patterns:**
1. **Content gap validation**: Search for a keyword on the site to check if content exists
2. **Internal linking opportunities**: Find pages mentioning a topic that could link to each other
3. **Duplicate content detection**: Search for key phrases to find near-duplicates
4. **Competitor content research**: Search competitor site for specific topics

## Cross-Skill Integration

### With seo-audit (full audit)

When Firecrawl is available during `/seo audit`:
1. Use `firecrawl_map` to discover all site URLs
2. Compare with XML sitemap (seo-sitemap) to find orphan/missing pages
3. Select top pages for deep analysis
4. Feed crawled content to all subagents (technical, content, schema, geo)
5. Report total crawlable pages, URL patterns, and crawl depth

### With seo-technical

- Broken link detection: crawl all internal links, check for 404s
- Redirect chain mapping: follow all redirects, flag chains > 2 hops
- Mixed content detection: check HTTP resources on HTTPS pages
- Canonical verification: compare canonical URLs with actual URLs

### With seo-sitemap

- Sitemap coverage: % of crawled pages present in sitemap
- Orphan pages: pages found by crawl but missing from sitemap
- Stale sitemap entries: URLs in sitemap that return 404/410

### With seo-content

- Content extraction: feed clean markdown to E-E-A-T analysis
- Thin content detection: identify pages with < 300 words at scale
- Duplicate content: compare content across pages for near-duplicates

### With seo-schema

- Schema extraction: pull JSON-LD from all crawled pages
- Schema coverage: % of pages with structured data
- Schema validation: batch-validate extracted schemas

## Error Handling

| Error | Cause | Resolution |
|-------|-------|-----------|
| `FIRECRAWL_API_KEY not set` | MCP not configured | Run `./extensions/firecrawl/install.sh` |
| `402 Payment Required` | Credits exhausted | Check usage at firecrawl.dev/app, upgrade plan |
| `429 Too Many Requests` | Rate limited | Wait 60s, reduce crawl concurrency |
| `408 Timeout` | Page too slow to render | Increase `timeout`, try without JS rendering |
| `403 Forbidden` | Site blocks crawling | Check robots.txt, may need to skip this site |

**Graceful fallback:** If Firecrawl is unavailable, inform the user and suggest:
1. Use `fetch_page.py` for single-page analysis (no API cost)
2. Use `WebFetch` tool for basic HTML retrieval
3. Install Firecrawl: `./extensions/firecrawl/install.sh`
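
The Error Handling table can be read as a small policy: retry after a wait on `429`, and surface the listed resolution for everything else. Below is a minimal Python sketch of that policy. The `FirecrawlError` class and `call_with_retry` helper are hypothetical names for illustration; how the MCP server actually surfaces status codes is an assumption, not something the skill specifies.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FirecrawlError(Exception):
    """Hypothetical error type carrying an HTTP status code (an assumption)."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"HTTP {status}")
        self.status = status


# Resolutions taken from the Error Handling table.
RESOLUTIONS = {
    402: "Credits exhausted: check usage at firecrawl.dev/app, upgrade plan",
    403: "Site blocks crawling: check robots.txt, may need to skip this site",
    408: "Page too slow to render: increase timeout, try without JS rendering",
    429: "Rate limited: wait 60s, reduce crawl concurrency",
}


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying only on 429; any other error is raised at once."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except FirecrawlError as err:
            # Only rate limiting is worth retrying; give up on the last attempt.
            if err.status != 429 or attempt == max_attempts:
                raise
            sleep(wait_seconds)
    raise RuntimeError("unreachable")
```

A caller would wrap each tool invocation in `call_with_retry` and, on a raised error, look up `RESOLUTIONS.get(err.status)` to tell the user what to do next. The `sleep` parameter is injectable so the wait can be stubbed out in tests.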
ROLES & RULES
- Before using any Firecrawl tool, verify the MCP server is connected by checking if firecrawl_scrape or any Firecrawl tool is available.
- If tools are not available, inform the user the extension is not installed and provide install instructions.
- Always inform user of estimated credit usage before large crawls
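
The credit rule can be made concrete with a small estimator. This Python sketch takes the rates stated in the skill at face value (1 credit per page crawled or scraped, 0.5 credits per URL discovered by map, 500 free credits per month); actual Firecrawl pricing may differ and should be checked. The function names are hypothetical.

```python
FREE_TIER_CREDITS = 500        # per month, as stated in the skill
CREDITS_PER_PAGE = 1.0         # crawl or scrape, as stated in the skill
CREDITS_PER_MAPPED_URL = 0.5   # map, as stated in the skill


def estimate_credits(pages_to_crawl: int, urls_to_map: int = 0) -> float:
    """Estimate credits for a crawl/scrape of N pages plus a map of M URLs."""
    return pages_to_crawl * CREDITS_PER_PAGE + urls_to_map * CREDITS_PER_MAPPED_URL


def usage_notice(pages_to_crawl: int, urls_to_map: int = 0,
                 credits_used: float = 0.0) -> str:
    """Build the message shown to the user before a large crawl."""
    cost = estimate_credits(pages_to_crawl, urls_to_map)
    remaining = FREE_TIER_CREDITS - credits_used - cost
    if remaining < 0:
        return (f"Estimated cost: {cost:g} credits, which exceeds the "
                f"remaining free-tier allowance by {-remaining:g}.")
    return (f"Estimated cost: {cost:g} credits ({remaining:g} of "
            f"{FREE_TIER_CREDITS} free-tier credits left afterwards).")
```

For the `/seo audit` orchestration in the skill (map the site, then crawl 50 pages), `usage_notice(50, urls_to_map=342)` would report the combined map and crawl cost before anything runs.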
EXPECTED OUTPUT
- Format
  - markdown
- Schema
  - markdown_sections · Quick Reference table, Parameters list, SEO Usage Patterns numbered list, Output example, Error Handling table
- Constraints
  - include tables for commands and errors
  - provide usage patterns and examples
SUCCESS CRITERIA
- Verify MCP server connection before tool use
- Inform user if extension not installed with install instructions
- Inform user of estimated credit usage before large crawls
- Present map output with URL pattern breakdown
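
The URL pattern breakdown criterion can be sketched in Python as a grouping by first path segment. The grouping rule here is an assumption, since the skill does not define it: URLs with fewer than two path segments count as root pages, and everything else falls under `/<first-segment>/*`. The function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def pattern_for(url: str) -> str:
    """Map a URL to a pattern bucket based on its first path segment."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 2:
        return "/ (root pages)"   # assumption: "/" and "/about" are root pages
    return f"/{segments[0]}/*"


def breakdown(urls: list[str]) -> list[tuple[str, int, int]]:
    """Return (pattern, page count, rounded percentage), largest bucket first."""
    counts = Counter(pattern_for(u) for u in urls)
    total = len(urls)
    return [(p, n, round(100 * n / total)) for p, n in counts.most_common()]


def format_report(site: str, urls: list[str]) -> str:
    """Render the breakdown in the layout used by the map output example."""
    lines = [f"Site: {site}", f"Pages discovered: {len(urls)}", "",
             "URL Pattern Breakdown:"]
    for pattern, n, pct in breakdown(urls):
        lines.append(f"  {pattern:<16} - {n} pages ({pct}%)")
    return "\n".join(lines)
```

Passing the array returned by `firecrawl_map` to `format_report("example.com", urls)` yields a report in the same shape as the sample output. Percentages are rounded independently, so they may not sum to exactly 100.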
EXAMPLES
Includes command syntax examples, orchestration steps for /seo audit, sample output for map, usage pattern lists, comparison tables, and error table.
CAVEATS
- Dependencies
  - Firecrawl extension installed
  - MCP server connected
  - FIRECRAWL_API_KEY set
- Missing context
  - Target user or environment assumptions (e.g., Claude with specific MCP setup)
- Ambiguities
  - description field contains only ")" which appears malformed
  - Reference to "Issue #11" without prior context or definition
QUALITY
- OVERALL: 0.82
- CLARITY: 0.85
- SPECIFICITY: 0.90
- REUSABILITY: 0.70
- COMPLETENESS: 0.80
IMPROVEMENT SUGGESTIONS
- Replace the malformed description field with a concise one-sentence summary of the skill
- Add explicit output format expectations for each command (e.g., markdown tables, JSON)