What is markdown.new used for?

It converts public URLs into Markdown suitable for AI workflows, enabling summarization, RAG ingestion, extraction, and archiving.

How do I handle rate limits?

429 is treated as a rate limit; the documented limit is 500 requests per day per IP. Respect robots.txt and terms of service, and space out requests as needed.

What delivery modes are supported?

Delivery supports auto/POST API modes and a deliver-md option to output Markdown files. Prefix mode builds a URL; POST mode sends a JSON body for automation.

Markdown.new Skill

Scanned

@joelchance

npx machina-cli add skill @joelchance/markdown-convert --openclaw

Files (1)

SKILL.md

3.0 KB

Markdown.new

Use this skill to convert public URLs into LLM-ready Markdown via markdown.new.

Path Resolution (Critical)

Resolve relative paths like scripts/... and references/... from the skill directory, not workspace root.
If current directory is unknown, use an absolute script path.

python3 ~/.codex/skills/markdown-new/scripts/markdown_new_fetch.py 'https://example.com'

cd ~/.codex/skills/markdown-new
python3 scripts/markdown_new_fetch.py 'https://example.com'

Avoid this pattern from an arbitrary workspace root:

python3 scripts/markdown_new_fetch.py 'https://example.com'

Workflow

Validate the input URL is public http or https.
Run scripts/markdown_new_fetch.py with --method auto first.
Re-run with --method browser if output misses JS-rendered content.
Enable --retain-images only when image links are required.
Capture response metadata (x-markdown-tokens, x-rate-limit-remaining, and JSON metadata when present) for downstream planning.

Quick Start

Commands below assume current directory is the skill root (~/.codex/skills/markdown-new).

python3 scripts/markdown_new_fetch.py 'https://example.com' > page.md

python3 scripts/markdown_new_fetch.py 'https://example.com' --method browser --retain-images --output page.md

python3 scripts/markdown_new_fetch.py 'https://example.com' --deliver-md

Method Selection

auto: default. Let markdown.new use its fastest successful pipeline.
ai: force Workers AI HTML-to-Markdown conversion.
browser: force headless browser rendering for JS-heavy pages.

Use auto first, then retry with browser only when needed.

Delivery Mode

Use --deliver-md to force file output in .md format.
In delivery mode, content is wrapped as:
- <url>
- ...markdown...
- </url>
If --output is omitted, the script auto-generates a filename from the URL.

API Modes

Prefix mode:
- https://markdown.new/https://example.com?method=browser&retain_images=true
POST mode:
- POST https://markdown.new/
- JSON body: {"url":"https://example.com","method":"auto","retain_images":false}

Prefer POST mode for automation and explicit parameters.

Limits And Safety

Treat 429 as rate limiting (documented limit: 500 requests/day/IP).
Convert only publicly accessible pages.
Respect robots.txt, terms of service, and copyright constraints.
Do not treat markdown.new output as guaranteed complete for every page; verify critical extractions.

References

references/markdown-new-api.md

Source

git clone https://clawhub.ai/joelchance/markdown-convertView on GitHub

Overview

Markdown.new converts public web pages into LLM-ready Markdown for AI workflows such as summarization, RAG ingestion, extraction, and archiving. It supports multiple conversion methods (auto, ai, browser), optional image retention, and safety controls like rate limits and copyright constraints.

How This Skill Works

The tool validates that the input URL is public. It runs the fetch script with --method auto, retrying with --method browser if JS-rendered content is needed; enable --retain-images only when images are required. It also captures metadata (x-markdown-tokens, x-rate-limit-remaining) for downstream planning and delivery in Markdown format.

When to Use It

You need a concise Markdown version of a public article for summarization.
You want to ingest a page into a RAG/pipeline for retrieval and QA.
You are archiving public pages in Markdown for compliance or offline use.
You want to extract specific content while minimizing token usage.
You are processing JavaScript-heavy pages that require a headless browser render.

Quick Start

Step 1: python3 scripts/markdown_new_fetch.py 'https://example.com' > page.md
Step 2: python3 scripts/markdown_new_fetch.py 'https://example.com' --method browser --retain-images --output page.md
Step 3: python3 scripts/markdown_new_fetch.py 'https://example.com' --deliver-md

Best Practices

Always validate that the target URL is public and accessible; respect robots.txt and terms of service.
Start with auto method; switch to browser only if JS-rendered content is missing or incomplete.
Use --deliver-md when you need a ready-to-use Markdown file output.
Enable --retain-images only if image content is required for your workflow to avoid larger outputs.
Run from the skill root (~/.codex/skills/markdown-new) so relative path resolution works correctly.

Example Use Cases

Ingest a product docs page into a knowledge base by converting it to Markdown for quick search and retrieval.
Summarize a public news article and feed the summary into an AI briefing workflow.
Archive a public research landing page in Markdown for offline access and compliance.
Process a JS-heavy webpage by rendering with a browser and retaining images for visual context.
Automate batch processing of multiple URLs using POST mode with explicit parameters for integration pipelines.

Frequently Asked Questions

Add this skill to your agents