Md To Json Parser
npx machina-cli add skill kay-ou/ClaudeSkills/md-to-json-parser --openclaw
name: md-to-json-parser
description: Structure Markdown documents into JSON, supporting headings, paragraphs, tables, code blocks, etc. This skill should be used when users need to parse markdown files, convert markdown to structured data, extract content from markdown documents, analyze markdown structure, or process .md files programmatically. Keywords: markdown parsing, markdown to JSON, extract markdown structure, analyze markdown documents, parse markdown, convert md to json, extract markdown content, markdown structure analysis, process markdown files
inputs:
- md_file_path
outputs:
- structured_json
instructions: |
Markdown to JSON Parser
This skill parses Markdown documents and converts them into structured JSON format, making it easier for Claude to understand and work with document content programmatically.
When to Use This Skill
Use this skill when you need to:
- Extract structured data from Markdown files
- Convert documentation into JSON for processing
- Analyze document structure and content
- Process Markdown files programmatically
Processing Steps
Step 1: Load and Validate Markdown File
- Read the Markdown file from the provided path
- Validate that the file exists and is readable
- Handle encoding issues (default to UTF-8)
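Step 1 could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `load_markdown` is hypothetical, and falling back to replacement characters for undecodable bytes is an assumed policy.

```python
from pathlib import Path


def load_markdown(md_file_path: str) -> str:
    """Read a Markdown file as text, preferring UTF-8 (hypothetical helper)."""
    path = Path(md_file_path)
    if not path.is_file():
        raise FileNotFoundError(f"Markdown file not found: {md_file_path}")
    raw = path.read_bytes()
    try:
        # utf-8-sig decodes UTF-8 and also strips a leading byte-order mark
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback: keep going, marking undecodable bytes with U+FFFD
        return raw.decode("utf-8", errors="replace")
```

Reading bytes first and decoding explicitly keeps the encoding decision in one place, so a different fallback (for example, raising an error) is a one-line change.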
Step 2: Parse Document Structure
- Extract headings (H1-H6) with their hierarchy
- Identify paragraphs and text content
- Locate tables and convert to structured format
- Find code blocks with their language annotations
- Detect lists (ordered and unordered)
- Identify links, images, and other inline elements
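As one possible sketch of the heading part of Step 2, the snippet below scans for ATX headings (`#` through `######`) and skips lines inside fenced code blocks. The names `extract_headings` and `slugify` are hypothetical, the slug rule is an assumption chosen to match the `main-title` style of id shown in the output format, and the fence tracking is deliberately simplified (it does not handle Setext headings or nested fences).

```python
import re

# ATX heading: 1-6 hashes, whitespace, text, optional closing hashes
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")
# Opening or closing code fence (simplified: any line starting a fence toggles state)
FENCE_RE = re.compile(r"^ {0,3}(`{3}|~{3})")


def slugify(text: str) -> str:
    """Assumed id rule: lowercase, drop punctuation, join words with hyphens."""
    slug = re.sub(r"[^\w\s-]", "", text.lower()).strip()
    return re.sub(r"[\s_]+", "-", slug)


def extract_headings(markdown: str) -> list[dict]:
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if FENCE_RE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # a '#' inside code is a comment, not a heading
        match = HEADING_RE.match(line)
        if match:
            text = match.group(2).strip()
            headings.append(
                {"level": len(match.group(1)), "text": text, "id": slugify(text)}
            )
    return headings
```

The `level` field is enough to rebuild the hierarchy later: a heading's parent is the nearest preceding heading with a smaller level.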
Step 3: Convert Tables to Arrays
- Parse table headers as column names
- Convert each row to a JSON object
- Preserve cell content with proper escaping
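- Handle merged cells appropriately (GFM pipe tables have no merged-cell syntax, so these arise only from embedded HTML tables or parser extensions)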
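A rough sketch of the table conversion in Step 3, assuming GFM-style pipe tables whose second line is the delimiter row. The helper names `split_row` and `parse_table` are hypothetical, and padding short rows with empty strings is an assumed policy rather than documented behavior.

```python
import re


def split_row(line: str) -> list[str]:
    """Split one pipe-table row into trimmed cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|") and not line.endswith("\\|"):
        line = line[:-1]
    # Split on unescaped pipes only, then unescape \| inside each cell
    cells = re.split(r"(?<!\\)\|", line)
    return [cell.strip().replace("\\|", "|") for cell in cells]


def parse_table(lines: list[str]) -> dict:
    """Convert the lines of one table into headers plus row objects."""
    headers = split_row(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| delimiter row
        cells = split_row(line)
        # Assumed policy: pad short rows, drop extra cells
        cells = (cells + [""] * len(headers))[: len(headers)]
        rows.append(dict(zip(headers, cells)))
    return {"headers": headers, "rows": rows}
```

Escaping of quotes and backslashes in cell text does not need special handling here; it is applied when the resulting dictionary is serialized with `json.dumps`.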
Step 4: Preserve Code Blocks
- Maintain original formatting and indentation
- Preserve language annotations
- Keep special characters and syntax intact
- Handle multi-line code blocks correctly
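Step 4 might look something like the sketch below, which collects fenced code blocks while leaving their contents untouched. The function name `extract_code_blocks` is hypothetical, and the sketch assumes fences start at column zero; indented fences and indented (non-fenced) code blocks are not handled.

```python
import re

# Opening fence: three or more backticks or tildes, then an optional info string
FENCE_OPEN = re.compile(r"^(?P<fence>`{3,}|~{3,})[ \t]*(?P<info>[^`\n]*)$")


def extract_code_blocks(markdown: str) -> list[dict]:
    blocks = []
    fence = None  # the opening fence string while inside a block
    language = ""
    body: list[str] = []
    for line in markdown.split("\n"):
        if fence is None:
            match = FENCE_OPEN.match(line)
            if match:
                fence = match.group("fence")
                info = match.group("info").strip()
                # The first word of the info string is the language annotation
                language = info.split()[0] if info else ""
                body = []
        elif line.startswith(fence) and line.strip(fence[0] + " \t") == "":
            # Closing fence: same character, at least as long as the opener
            blocks.append({"language": language, "content": "\n".join(body)})
            fence = None
        else:
            body.append(line)  # kept verbatim, including indentation
    return blocks
```

Because body lines are appended unchanged and joined with `\n`, indentation and special characters survive exactly as written.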
Step 5: Generate Structured JSON Output
- Create hierarchical structure reflecting document organization
- Include metadata such as word count, heading count, table count, and code block count
- Preserve relationships between elements
- Ensure JSON is valid and well-formed
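The assembly in Step 5 could be sketched as below. The names `build_document` and `to_json` are hypothetical, and two details are assumptions: the title is taken from the first level-1 heading, and the word count is a whitespace split over paragraph text only.

```python
import json


def build_document(headings, paragraphs, tables, code_blocks) -> dict:
    """Combine parsed elements into one dictionary with summary metadata."""
    paragraph_items = [
        {"text": text, "word_count": len(text.split())} for text in paragraphs
    ]
    # Assumption: the document title is the first H1, or empty if there is none
    title = next((h["text"] for h in headings if h["level"] == 1), "")
    return {
        "metadata": {
            "title": title,
            "word_count": sum(p["word_count"] for p in paragraph_items),
            "heading_count": len(headings),
            "table_count": len(tables),
            "code_block_count": len(code_blocks),
        },
        "structure": {
            "headings": headings,
            "paragraphs": paragraph_items,
            "tables": tables,
            "code_blocks": code_blocks,
        },
    }


def to_json(document: dict) -> str:
    """Serialize and round-trip once to confirm the output is valid JSON."""
    text = json.dumps(document, ensure_ascii=False, indent=2)
    json.loads(text)  # raises ValueError if the output were malformed
    return text
```

Using `ensure_ascii=False` keeps non-ASCII text readable in the output instead of turning it into `\uXXXX` escapes.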
Output Format
The structured JSON includes:
{
  "metadata": {
    "title": "Document Title",
    "word_count": 1500,
    "heading_count": 8,
    "table_count": 3,
    "code_block_count": 5
  },
  "structure": {
    "headings": [
      {"level": 1, "text": "Main Title", "id": "main-title"}
    ],
    "paragraphs": [
      {"text": "Content...", "word_count": 120}
    ],
    "tables": [
      {
        "headers": ["Column1", "Column2"],
        "rows": [
          {"Column1": "data1", "Column2": "data2"}
        ]
      }
    ],
    "code_blocks": [
      {
        "language": "python",
        "content": "def example():\n    pass"
      }
    ]
  }
}
Error Handling
- Handle missing files gracefully
- Manage encoding issues
- Deal with malformed Markdown
- Report parsing errors clearly
- Validate JSON output format
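One way to make these failure modes visible to the caller is to return a structured error instead of raising. The sketch below is illustrative only: `MarkdownParseError`, `safe_convert`, and the error `type` strings are all hypothetical names, and `parse` stands in for whatever function turns Markdown text into a dictionary.

```python
import json
from pathlib import Path


class MarkdownParseError(Exception):
    """Hypothetical error type carrying a message and an optional line number."""

    def __init__(self, message, line=None):
        super().__init__(message)
        self.line = line


def safe_convert(md_file_path, parse) -> str:
    """Run parse (text -> dict) and report any failure as a JSON error object."""
    try:
        text = Path(md_file_path).read_text(encoding="utf-8")
        document = parse(text)
        return json.dumps({"ok": True, "document": document}, ensure_ascii=False)
    except FileNotFoundError:
        error = {"type": "file_not_found", "message": f"No such file: {md_file_path}"}
    except UnicodeDecodeError as exc:
        error = {"type": "encoding_error", "message": str(exc)}
    except MarkdownParseError as exc:
        error = {"type": "parse_error", "message": str(exc), "line": exc.line}
    except (TypeError, ValueError) as exc:
        # json.dumps rejects values it cannot serialize
        error = {"type": "invalid_json", "message": str(exc)}
    return json.dumps({"ok": False, "error": error}, ensure_ascii=False)
```

`UnicodeDecodeError` is a subclass of `ValueError`, so it is caught first to keep encoding problems distinct from serialization problems.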
Source
https://github.com/kay-ou/ClaudeSkills/blob/main/.claude/skills/md-to-json-parser/SKILL.md
Overview
Md To Json Parser converts Markdown documents into a structured JSON representation, capturing headings, paragraphs, tables, code blocks, lists, links, and inline elements. It’s designed to extract content, analyze document structure, and process .md files programmatically for downstream workflows.
How This Skill Works
The tool loads and validates the Markdown file (default UTF-8), then parses the document structure to extract headings, paragraphs, tables, code blocks, lists, and inline elements. Tables are converted to arrays with headers and rows, code blocks retain language annotations and formatting, and the result is a hierarchical JSON with metadata about the document.
When to Use It
- Extract structured data from Markdown files for downstream processing
- Convert documentation into JSON for ingestion by apps and services
- Analyze document structure and content to improve navigation and searchability
- Process Markdown files programmatically in automated pipelines
- Prepare Markdown content for APIs, databases, or CMS migrations
Quick Start
- Step 1: Load and validate the Markdown file from the given path
- Step 2: Parse headings, paragraphs, tables, code blocks, and inline elements
- Step 3: Generate and output the structured JSON with metadata
Best Practices
- Validate input file paths and ensure the file exists before parsing
- Normalize encoding to UTF-8 and handle potential encoding issues
- Preserve code blocks with language annotations and exact formatting
- Escape and preserve content when converting tables and inline text
- Validate the resulting JSON to ensure it is well-formed and complete
Example Use Cases
- Convert a project's README.md into JSON for a documentation platform
- Extract content from API reference docs into a structured data format for a search index
- Audit and analyze the structure of internal knowledge bases stored as Markdown
- Migrate Markdown specifications into a structured dataset for automated test generation
- Process Markdown reports to feed into a data catalog or analytics pipeline