sitemapkit
Install: `npx machina-cli add skill aiskillstore/marketplace/sitemapkit --openclaw`
Use the SitemapKit MCP tools to discover and extract URLs from any website's sitemaps.
Tools available
- discover_sitemaps — finds all sitemap files for a domain (checks robots.txt, common paths, sitemap indexes). Use this first when you just want to know what sitemaps exist.
- extract_sitemap — fetches all URLs from a specific sitemap URL. Use when the user gives you a direct sitemap URL.
- full_crawl — discovers all sitemaps for a domain and returns every URL across all of them in one call. Use this when the user wants the complete list of pages on a site.
When to use which tool
| User says | Use |
|---|---|
| "find sitemaps for X" / "does X have a sitemap?" | discover_sitemaps |
| "extract URLs from X/sitemap.xml" | extract_sitemap |
| "get all pages on X" / "crawl X" / "list all URLs on X" | full_crawl |
Usage guidelines
- Always pass a full URL including protocol: `https://example.com`
- `full_crawl` and `discover_sitemaps` only use the domain — paths are ignored
- `extract_sitemap` needs the exact sitemap URL, e.g. `https://example.com/sitemap.xml`
- Default `max_urls` is 1000. If the user wants more, pass a higher value (up to the plan limit)
- If `truncated: true` appears in the result, tell the user there are more URLs and suggest increasing `max_urls`
- Check `meta.quota.remaining` in the response — if it's low, warn the user proactively
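The guidelines above can be sketched as a small post-processing step. The field names (`totalUrls`, `truncated`, `meta.quota.remaining`) come from this document; the sample response dict and the `summarize` helper are fabricated for illustration:

```python
def summarize(result: dict, low_quota: int = 100) -> list[str]:
    """Derive user-facing notes from a SitemapKit response."""
    notes = [f"Found {result['totalUrls']} URLs."]
    if result.get("truncated"):
        # More URLs exist beyond max_urls; suggest raising the limit.
        notes.append("More URLs exist; consider increasing max_urls.")
    remaining = result.get("meta", {}).get("quota", {}).get("remaining")
    if remaining is not None and remaining < low_quota:
        # Warn proactively before the quota runs out mid-task.
        notes.append(f"Warning: only {remaining} quota units remaining.")
    return notes

# Fabricated sample response for demonstration.
sample = {"totalUrls": 1000, "truncated": True,
          "meta": {"quota": {"remaining": 42}}}
for note in summarize(sample):
    print(note)
```

The `low_quota` threshold is an arbitrary choice; pick whatever margin suits the user's plan.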
Error handling
| Error | What to tell the user |
|---|---|
| Unauthorized | API key is missing or invalid. Get one at https://app.sitemapkit.com/settings/api |
| Monthly quota exceeded | Plan limit reached. Upgrade at https://sitemapkit.com/pricing |
| Rate limit exceeded | Too many requests per minute. Wait and retry — the response includes a retryAfter timestamp |
Example interactions
"What pages does stripe.com have?"
→ Call full_crawl with url: "https://stripe.com", present the URL list.
"Find all sitemaps for shopify.com"
→ Call discover_sitemaps with url: "https://shopify.com", list the sitemap URLs found and which sources they came from (robots.txt, common paths, etc.).
"Extract https://example.com/sitemap-posts.xml"
→ Call extract_sitemap with url: "https://example.com/sitemap-posts.xml", present the URLs with lastmod dates if available.
"How many pages does vercel.com have?"
→ Call full_crawl with url: "https://vercel.com", report totalUrls and whether the result was truncated.
Source
https://github.com/aiskillstore/marketplace/blob/main/skills/0nl1n1n/sitemapkit/SKILL.md
Overview
SitemapKit provides focused tools to discover sitemap files for a domain, extract URLs from a specific sitemap, or crawl an entire site’s URL set across all sitemaps. You use it to audit site structure, perform URL discovery, and validate coverage for SEO or crawling purposes. A valid SITEMAPKIT_API_KEY on your MCP server is required.
How This Skill Works
You call one of three tools against the SitemapKit MCP server: discover_sitemaps to identify available sitemaps for a domain, extract_sitemap to fetch URLs from a given sitemap URL, or full_crawl to aggregate all URLs across all discovered sitemaps. For domain-based tools, only the domain is used (paths are ignored). Results include the URLs and metadata such as totalUrls, truncated status, and quota information.
When to Use It
- Find sitemaps for a domain (e.g., "does X have a sitemap?")
- Extract URLs from a specific sitemap URL you already have
- Get a complete list of pages on a site by crawling all sitemaps
- Audit site structure to understand coverage and page availability
- Perform URL discovery for SEO, indexing, or migration planning
Quick Start
- Ensure your SitemapKit MCP server has a valid API key configured.
- Choose the appropriate tool based on your goal: discover_sitemaps for domain-wide sitemap discovery, extract_sitemap for a specific sitemap URL, or full_crawl for all URLs across all sitemaps.
- Provide input with full URLs (including https://). For domain-based tools, pass the domain (e.g., https://example.com) and let the tool fetch relevant sitemaps.
- Review the response for totalUrls, truncated, and quota.remaining; if truncated or quota is low, adjust max_urls or retry after quota refresh.
Best Practices
- Always pass a full URL including protocol (https://).
- Prefer discover_sitemaps first to map available sitemaps before targeted extraction or full crawl.
- Use max_urls to limit results when you don’t need every URL, especially on large domains.
- If you see truncated: true, increase max_urls or run a subsequent call to retrieve more URLs.
- Monitor meta.quota.remaining and plan requests to avoid interruptions during critical crawls.
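The first best practice above can be enforced with a small normalizer before any tool call. This uses only the Python standard library; the helper name is our own:

```python
from urllib.parse import urlparse

def normalize_url(raw: str) -> str:
    """Ensure the URL has a protocol, as the tools require.

    The path is kept as-is: domain-based tools ignore it anyway,
    and extract_sitemap needs it.
    """
    if "://" not in raw:
        raw = "https://" + raw  # default to https when no scheme given
    parts = urlparse(raw)
    if not parts.netloc:
        raise ValueError(f"Cannot extract a host from {raw!r}")
    return raw
```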
Example Use Cases
- What pages does stripe.com have? → Use full_crawl with url: https://stripe.com to obtain a complete URL list across all sitemaps.
- Find all sitemaps for shopify.com → Use discover_sitemaps with url: https://shopify.com and review found sitemap URLs and their sources (robots.txt, common paths, etc.).
- Extract https://example.com/sitemap-posts.xml → Use extract_sitemap with url: https://example.com/sitemap-posts.xml and present the URLs (with lastmod if available).
- How many pages does vercel.com have? → Use full_crawl with url: https://vercel.com and report totalUrls and whether the result was truncated.
- Audit a client domain’s site structure → Run full_crawl to collect all URLs, then analyze sitemap coverage, duplicate paths, and crawl priority.