Markdown.new Skill
Scanned @joelchance
npx machina-cli add skill @joelchance/markdown-convert --openclaw
Markdown.new
Use this skill to convert public URLs into LLM-ready Markdown via markdown.new.
Path Resolution (Critical)
- Resolve relative paths like scripts/... and references/... from the skill directory, not workspace root.
- If current directory is unknown, use an absolute script path.
python3 ~/.codex/skills/markdown-new/scripts/markdown_new_fetch.py 'https://example.com'
cd ~/.codex/skills/markdown-new
python3 scripts/markdown_new_fetch.py 'https://example.com'
Avoid this pattern from an arbitrary workspace root:
python3 scripts/markdown_new_fetch.py 'https://example.com'
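The path rule above can also be applied programmatically when the script is launched from other tooling. The following is a minimal sketch, assuming the skill lives at the install location used in the commands above; the helper names resolve_skill_path and build_command are illustrative and not part of the skill.

```python
from pathlib import Path

# Assumption: install location taken from the commands shown above.
SKILL_DIR = Path.home() / ".codex" / "skills" / "markdown-new"


def resolve_skill_path(relative: str, skill_dir: Path = SKILL_DIR) -> Path:
    """Resolve a skill-relative path (e.g. 'scripts/...') against the skill directory."""
    candidate = Path(relative).expanduser()
    if candidate.is_absolute():
        return candidate  # already absolute; nothing to resolve
    return skill_dir / candidate


def build_command(url: str, skill_dir: Path = SKILL_DIR) -> list[str]:
    """Build an argv list that does not depend on the current working directory."""
    script = resolve_skill_path("scripts/markdown_new_fetch.py", skill_dir)
    return ["python3", str(script), url]
```

Passing the resulting list to subprocess.run avoids the failure mode of a bare scripts/... path being resolved against an arbitrary workspace root.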
Workflow
- Validate the input URL is public http or https.
- Run scripts/markdown_new_fetch.py with --method auto first.
- Re-run with --method browser if output misses JS-rendered content.
- Enable --retain-images only when image links are required.
- Capture response metadata (x-markdown-tokens, x-rate-limit-remaining, and JSON metadata when present) for downstream planning.
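The first workflow step, checking that a URL is public http or https, might look roughly like the sketch below. It is an illustration rather than the check the script performs: the function name is_public_http_url is hypothetical, and the test only rejects obviously local hosts and non-global IP literals without resolving DNS names.

```python
import ipaddress
from urllib.parse import urlsplit


def is_public_http_url(url: str) -> bool:
    """Return True for http(s) URLs whose host is not obviously private or local."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith((".localhost", ".local")):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; this sketch does not resolve it
    return address.is_global  # rejects loopback, private, and link-local literals
```

A stricter check would resolve the hostname and test every returned address, at the cost of a network lookup before each conversion.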
Quick Start
Commands below assume current directory is the skill root (~/.codex/skills/markdown-new).
python3 scripts/markdown_new_fetch.py 'https://example.com' > page.md
python3 scripts/markdown_new_fetch.py 'https://example.com' --method browser --retain-images --output page.md
python3 scripts/markdown_new_fetch.py 'https://example.com' --deliver-md
Method Selection
- auto: default. Let markdown.new use its fastest successful pipeline.
- ai: force Workers AI HTML-to-Markdown conversion.
- browser: force headless browser rendering for JS-heavy pages.
Use auto first, then retry with browser only when needed.
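The auto-then-browser rule can be sketched as a small wrapper. This assumes a caller-supplied fetch(url, method) function that returns Markdown text, for example one that shells out to the fetch script; the names convert_with_fallback and has_min_text, and the 200-character threshold, are hypothetical choices for illustration.

```python
from typing import Callable


def has_min_text(markdown: str, min_chars: int = 200) -> bool:
    """Hypothetical completeness heuristic: enough non-whitespace text came back."""
    return len("".join(markdown.split())) >= min_chars


def convert_with_fallback(
    url: str,
    fetch: Callable[[str, str], str],
    looks_complete: Callable[[str], bool] = has_min_text,
) -> tuple[str, str]:
    """Try method 'auto' first; retry with 'browser' only if the result looks incomplete.

    Returns (method_used, markdown).
    """
    markdown = fetch(url, "auto")
    if looks_complete(markdown):
        return "auto", markdown
    return "browser", fetch(url, "browser")
```

A length threshold is a crude signal. Pages with known structure may be better served by checking for an expected heading or keyword instead.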
Delivery Mode
- Use --deliver-md to force file output in .md format.
- In delivery mode, content is wrapped as:
<url>...markdown...</url>
- If --output is omitted, the script auto-generates a filename from the URL.
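For readers who want to reproduce delivery mode outside the script, here is a rough sketch. Two things are assumptions: that the wrapper is a literal <url>...</url> pair around the Markdown, and the filename scheme, which is a hypothetical slug of host plus path and is not confirmed to match what the script generates.

```python
import re
from urllib.parse import urlsplit


def wrap_for_delivery(markdown: str) -> str:
    # Assumption: a literal <url>...</url> pair surrounds the Markdown body.
    return f"<url>{markdown}</url>"


def filename_from_url(url: str) -> str:
    """Hypothetical naming scheme: host plus path, slugified, with a .md suffix."""
    parts = urlsplit(url)
    raw = f"{parts.hostname or 'page'}{parts.path}"
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    return f"{slug or 'page'}.md"
```

When the exact filename matters, passing --output explicitly sidesteps any guess about the auto-generated name.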
API Modes
- Prefix mode:
https://markdown.new/https://example.com?method=browser&retain_images=true
- POST mode:
POST https://markdown.new/
- JSON body:
{"url":"https://example.com","method":"auto","retain_images":false}
Prefer POST mode for automation and explicit parameters.
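Both API modes can be exercised from the Python standard library. The sketch below builds the requests using only the endpoint, parameters, and JSON fields shown above. The Content-Type header, the 30-second timeout, and the helper names are assumptions.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://markdown.new/"


def build_post_request(url: str, method: str = "auto",
                       retain_images: bool = False) -> urllib.request.Request:
    """Build a POST-mode request carrying the documented JSON fields."""
    body = json.dumps({"url": url, "method": method,
                       "retain_images": retain_images}).encode("utf-8")
    return urllib.request.Request(
        API_ROOT,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption: JSON content type expected
        method="POST",
    )


def build_prefix_url(url: str, method: str = "auto",
                     retain_images: bool = False) -> str:
    """Build a prefix-mode URL: the API root followed by the target URL and options."""
    query = urlencode({"method": method, "retain_images": str(retain_images).lower()})
    return f"{API_ROOT}{url}?{query}"


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return (status, headers, body text).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, including 429.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        headers = dict(response.headers.items())
        return response.status, headers, response.read().decode("utf-8")
```

Keeping request construction separate from sending makes the parameters easy to inspect or log before any network call is made.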
Limits And Safety
- Treat 429 as rate limiting (documented limit: 500 requests/day/IP).
- Convert only publicly accessible pages.
- Respect robots.txt, terms of service, and copyright constraints.
- Do not treat markdown.new output as guaranteed complete for every page; verify critical extractions.
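The 429 rule and the response metadata can be combined into a simple decision about whether to keep sending requests. The following is a sketch under stated assumptions: header values are taken to be plain integers, and ResponseInfo, inspect_response, should_continue, and the reserve parameter are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ResponseInfo:
    rate_limited: bool
    tokens: Optional[int]      # from x-markdown-tokens, if present and numeric
    remaining: Optional[int]   # from x-rate-limit-remaining, if present and numeric


def _to_int(value: Optional[str]) -> Optional[int]:
    try:
        return int(value)
    except (TypeError, ValueError):
        return None  # header missing or not an integer


def inspect_response(status: int, headers: Mapping[str, str]) -> ResponseInfo:
    """Extract rate-limit state and token count from a status code and headers."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return ResponseInfo(
        rate_limited=(status == 429),
        tokens=_to_int(lowered.get("x-markdown-tokens")),
        remaining=_to_int(lowered.get("x-rate-limit-remaining")),
    )


def should_continue(info: ResponseInfo, reserve: int = 0) -> bool:
    """Stop on 429, or once the remaining quota falls to the chosen reserve."""
    if info.rate_limited:
        return False
    return info.remaining is None or info.remaining > reserve
```

In a batch job, calling should_continue after each response lets the run stop cleanly before the daily limit is exhausted.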
References
references/markdown-new-api.md
Overview
Markdown.new converts public web pages into LLM-ready Markdown for AI workflows such as summarization, RAG ingestion, extraction, and archiving. The skill supports three conversion methods (auto, ai, browser) and optional image retention, and it documents the limits to observe: rate limiting, robots.txt, terms of service, and copyright.
How This Skill Works
The skill first validates that the input URL is a public http or https address. It then runs the fetch script with --method auto and retries with --method browser if JS-rendered content is missing; --retain-images is enabled only when image links are required. Response metadata (x-markdown-tokens, x-rate-limit-remaining) is captured for downstream planning, and --deliver-md writes the result to a .md file.
When to Use It
- You need a concise Markdown version of a public article for summarization.
- You want to ingest a page into a RAG pipeline for retrieval and QA.
- You are archiving public pages in Markdown for compliance or offline use.
- You want to extract specific content while minimizing token usage.
- You are processing JavaScript-heavy pages that require a headless browser render.
Quick Start
- Default conversion, redirected to a file: python3 scripts/markdown_new_fetch.py 'https://example.com' > page.md
- Browser rendering with images retained: python3 scripts/markdown_new_fetch.py 'https://example.com' --method browser --retain-images --output page.md
- Delivery mode with an auto-generated filename: python3 scripts/markdown_new_fetch.py 'https://example.com' --deliver-md
Best Practices
- Always validate that the target URL is public and accessible; respect robots.txt and terms of service.
- Start with auto method; switch to browser only if JS-rendered content is missing or incomplete.
- Use --deliver-md when you need a ready-to-use Markdown file output.
- Enable --retain-images only if your workflow needs image content; leaving it off keeps outputs smaller.
- Run from the skill root (~/.codex/skills/markdown-new) so relative path resolution works correctly.
Example Use Cases
- Ingest a product docs page into a knowledge base by converting it to Markdown for quick search and retrieval.
- Summarize a public news article and feed the summary into an AI briefing workflow.
- Archive a public research landing page in Markdown for offline access and compliance.
- Process a JS-heavy webpage by rendering with a browser and retaining images for visual context.
- Automate batch processing of multiple URLs using POST mode with explicit parameters for integration pipelines.