file-converter
npx machina-cli add skill next-open-ai/openclawx/file-converter --openclaw
File Converter Skill
Use this skill to convert a file from one format to another or extract content from files.
Workflow
- Identify the source file format and the target format.
- Determine which system tool is available and best suited for the conversion:
  - For document conversion (Markdown, HTML, Word): pandoc
  - For images (resize, format change): convert or magick (ImageMagick)
  - For PDF text extraction: pdftotext
  - For OCR on images: tesseract
- Use the bash tool to check if the required tool is installed (e.g., which pandoc). If not, inform the user they need to install the dependency.
- Execute the conversion command via bash.
- Verify the output file exists and has content, then notify the user.
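The dependency check and the output check in the workflow above can be sketched as two small shell helpers. This is a minimal sketch, not the skill's own implementation: the function names require_tool and verify_output are hypothetical, and command -v is used here as a portable stand-in for the which check mentioned in the workflow.

```shell
# Hypothetical helpers; the names are illustrative and not part of the skill.

# require_tool TOOL: succeed if TOOL is on PATH, otherwise print an install hint.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Missing dependency: $1. Please install it and retry." >&2
    return 1
  fi
}

# verify_output FILE: succeed only if FILE exists and is non-empty.
verify_output() {
  if [ -s "$1" ]; then
    echo "Created $1"
  else
    echo "Conversion failed: $1 is missing or empty." >&2
    return 1
  fi
}
```

A typical call chain, assuming a file named notes.md exists in the current directory, would be: require_tool pandoc && pandoc notes.md -o notes.html && verify_output notes.html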
Source
git clone https://github.com/next-open-ai/openclawx/blob/main/presets/workspaces/file-assistant/skills/file-converter/SKILL.md
Overview
File-converter converts files between formats and extracts content from them using common command-line tools run from Bash. It uses pandoc for document conversion, ImageMagick for image tasks, pdftotext for PDF text extraction, and tesseract for OCR. The workflow checks that the required tool is installed before running a conversion.
How This Skill Works
The skill identifies the source and target formats, selects the appropriate tool (pandoc for documents, convert/magick for images, pdftotext for PDF text, or tesseract for OCR), and runs which (e.g., which pandoc) to verify that the tool is installed. It then runs the conversion command in Bash and finally checks that the output file exists and contains content.
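One way to picture the tool-selection step is a lookup from the source extension and target extension to a tool name. The sketch below is an assumption about how such a mapping could look, not the skill's actual logic: the function name pick_tool and the specific extension lists are hypothetical. It prints magick for image work; on ImageMagick 6 installs the equivalent command is convert.

```shell
# pick_tool SRC_FILE TARGET_EXT: print the tool suggested for the conversion.
# Hypothetical mapping for illustration; extend the patterns as needed.
pick_tool() {
  # Lowercase both extensions so that scan.PNG and scan.png are treated alike.
  src_ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  dst_ext=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$src_ext:$dst_ext" in
    pdf:txt) echo pdftotext ;;                              # PDF text extraction
    png:txt|jpg:txt|jpeg:txt|tif:txt|tiff:txt) echo tesseract ;;  # OCR on images
    md:*|markdown:*|html:*|htm:*|docx:*) echo pandoc ;;     # document conversion
    png:*|jpg:*|jpeg:*|gif:*|tif:*|tiff:*) echo magick ;;   # image resize or reformat
    *) echo "Unsupported conversion: $src_ext -> $dst_ext" >&2; return 1 ;;
  esac
}
```

For example, pick_tool report.pdf txt prints pdftotext, while an unrecognized pair returns a non-zero status with a message on stderr.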
When to Use It
- Convert Markdown, HTML, or Word documents to other formats using pandoc
- Convert DOCX to PDF or HTML to Markdown via pandoc
- Resize, reformat, or convert images with ImageMagick (convert/magick)
- Extract text from a PDF using pdftotext
- Perform OCR on an image or screenshot with tesseract to produce searchable text
Quick Start
- Step 1: Identify the source format and the desired target format
- Step 2: Check that the required tool is installed, e.g., which pandoc, which convert, which pdftotext, or which tesseract
- Step 3: Run the conversion command in Bash and verify the output file exists and has content
Best Practices
- Check tool availability with which before attempting a conversion
- Test conversions on small samples to verify formatting and output quality
- Verify the output file exists and contains content after the command runs
- Handle unsupported format pairs gracefully with a clear error message
- Document the exact command used and output file path for reproducibility
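The last practice, recording the exact command and output path, could be handled by a thin wrapper that writes a log line before running the command. This is only a sketch under assumptions: the function name run_logged and the log line format are hypothetical, and the logged command is joined with spaces, so original quoting of arguments is not preserved in the log.

```shell
# run_logged LOGFILE OUTFILE CMD [ARGS...]
# Appends a timestamped record of the command and its output path to LOGFILE,
# then runs the command and returns its exit status.
run_logged() {
  log="$1"
  out="$2"
  shift 2
  printf '%s | output: %s | command: %s\n' \
    "$(date '+%Y-%m-%d %H:%M:%S')" "$out" "$*" >> "$log"
  "$@"
}
```

A call such as run_logged conversions.log notes.html pandoc notes.md -o notes.html (file names assumed for illustration) leaves one line in conversions.log that can be replayed later.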
Example Use Cases
- Convert a Markdown file to HTML using pandoc
- Transform a DOCX into PDF with pandoc
- Resize a JPEG and save as PNG using ImageMagick
- Extract text from a PDF with pdftotext
- Run OCR on a PNG image with tesseract to produce a TXT file
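The use cases above map onto commands along the following lines. The wrapper function names are hypothetical and exist only to label each case; the 800x600 size is an arbitrary example value. Two caveats apply: pandoc needs a PDF engine (for instance a LaTeX installation) to write PDF, and ImageMagick 6 provides convert rather than magick.

```shell
# Each wrapper takes an input path and an output path. Names are illustrative.

# Markdown to HTML; pandoc infers both formats from the file extensions.
md_to_html() { pandoc "$1" -o "$2"; }

# DOCX to PDF; requires a PDF engine to be available to pandoc.
docx_to_pdf() { pandoc "$1" -o "$2"; }

# Resize a JPEG to fit within 800x600 (aspect ratio kept) and save as PNG.
jpg_to_png() { magick "$1" -resize 800x600 "$2"; }

# Extract the text layer of a PDF into a text file.
pdf_to_text() { pdftotext "$1" "$2"; }

# OCR an image; tesseract appends .txt to the output base name itself,
# so the extension is stripped from the requested output path first.
ocr_to_text() { tesseract "$1" "${2%.txt}"; }
```

For instance, ocr_to_text scan.png scan.txt runs tesseract scan.png scan, which writes scan.txt.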