Get the FREE Ultimate OpenClaw Setup Guide →

docx-processing-openai

npx machina-cli add skill lawvable/awesome-legal-skills/docx-processing-openai --openclaw
Files (1)
SKILL.md
3.2 KB

DOCX Skill

When to use

  • Read or review DOCX content where layout matters (tables, diagrams, pagination).
  • Create or edit DOCX files with professional formatting.
  • Validate visual layout before delivery.

Workflow

  1. Prefer visual review (layout, tables, diagrams).
    • If soffice and pdftoppm are available, convert DOCX -> PDF -> PNGs.
    • Or use scripts/render_docx.py (requires pdf2image and Poppler).
    • If these tools are missing, install them or ask the user to review rendered pages locally.
  2. Use python-docx for edits and structured creation (headings, styles, tables, lists).
  3. After each meaningful change, re-render and inspect the pages.
  4. If visual review is not possible, extract text with python-docx as a fallback and call out layout risk.
  5. Keep intermediate outputs organized and clean up after final approval.

Temp and output conventions

  • Use tmp/docs/ for intermediate files; delete when done.
  • Write final artifacts under output/doc/ when working in this repo.
  • Keep filenames stable and descriptive.

Dependencies (install if missing)

Prefer uv for dependency management.

Python packages:

uv pip install python-docx pdf2image

If uv is unavailable:

python3 -m pip install python-docx pdf2image

System tools (for rendering):

# macOS (Homebrew)
brew install libreoffice poppler

# Ubuntu/Debian
sudo apt-get install -y libreoffice poppler-utils

If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.

Environment

No required environment variables.

Rendering commands

DOCX -> PDF:

soffice -env:UserInstallation=file:///tmp/lo_profile_$$ --headless --convert-to pdf --outdir $OUTDIR $INPUT_DOCX

PDF -> PNGs:

pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME

Bundled helper:

python3 scripts/render_docx.py /path/to/file.docx --output_dir /tmp/docx_pages

Quality expectations

  • Deliver a client-ready document: consistent typography, spacing, margins, and clear hierarchy.
  • Avoid formatting defects: clipped/overlapping text, broken tables, unreadable characters, or default-template styling.
  • Charts, tables, and visuals must be legible in rendered pages with correct alignment.
  • Use ASCII hyphens only. Avoid U+2011 (non-breaking hyphen) and other Unicode dashes.
  • Citations and references must be human-readable; never leave tool tokens or placeholder strings.

Final checks

  • Re-render and inspect every page at 100% zoom before final delivery.
  • Fix any spacing, alignment, or pagination issues and repeat the render loop.
  • Confirm there are no leftovers (temp files, duplicate renders) unless the user asks to keep them.

Source

git clone https://github.com/lawvable/awesome-legal-skills/blob/main/skills/docx-processing-openai/SKILL.mdView on GitHub

Overview

DOCX Processing OpenAI is a toolkit for reading, editing, and creating Word documents (.docx) with built in visual quality control. It supports extracting content from existing files, generating professionally formatted documents, and editing with precise typography and layout to meet delivery standards.

How This Skill Works

The workflow uses python-docx for structured edits and document creation (headings, styles, tables, lists). For visual validation, it can render DOCX to PDF and PNG via soffice and pdf2image, or run the bundled render_docx.py script. If rendering tools are unavailable, it can fall back to text extraction with python-docx and highlight layout risks.

When to Use It

  • Read DOCX content where layout matters (tables, diagrams, pagination).
  • Create or edit DOCX files with professional formatting (styles, headings, tables).
  • Validate visual layout before client delivery with render checks.
  • Ensure typography and margins align with brand guidelines during production.
  • Produce PDF/PNG previews of DOCX for reviews and handoffs.

Quick Start

  1. Step 1: Install required Python packages and rendering tools (python-docx, pdf2image, soffice or LibreOffice).
  2. Step 2: Use python-docx to read, edit, or create a .docx; apply headings, styles, tables, and lists.
  3. Step 3: Render to PDF/PNG for QA and save final artifacts under output/doc/.

Best Practices

  • Prioritize visual review after every meaningful change.
  • Keep intermediates in tmp/docs/ and final artifacts in output/doc/.
  • Use python-docx for structured edits (headings, styles, tables, lists).
  • Render to PDF/PNG for page-level QA and catch spacing or pagination issues.
  • Use ASCII hyphens only and avoid placeholder tokens in final artifacts.

Example Use Cases

  • Legal contracts formatted with precise typography and tables.
  • Corporate policies with diagrams and multi column layouts.
  • Annual reports requiring consistent typography and margins.
  • Client deliverables with brand aligned styles and templates.
  • Team documents with reusable styles and structured content.

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers