docx-template-filling

npx machina-cli add skill belumume/claude-skills/docx-template-filling --openclaw

DOCX Template Filling - Forensic Preservation

Fill template forms programmatically with zero detectable artifacts. The filled document must be indistinguishable from manual typing in the original template.

When to Use This Skill

Invoke when:

  • Filling standardized forms and templates
  • Completing application forms
  • Responding to questionnaires and surveys
  • Processing template-based documents
  • Any scenario where the recipient must not detect programmatic manipulation

Critical requirement: Template integrity must be 100% preserved (logos, footers, headers, styles, metadata, element structure).

Core Philosophy: Preservation Over Recreation

WRONG approach: Extract content from template, generate new document

  • Loses metadata
  • Changes element IDs
  • Alters styles subtly
  • Creates detectable artifacts

RIGHT approach: Load template, insert content at anchor points using XML API

  • Preserves all original elements
  • Maintains metadata
  • Zero structural changes
  • Indistinguishable from manual entry

Critical Anti-Patterns

❌ NEVER: Use pandoc with --reference-doc

# This SEEMS correct but ONLY copies styles, NOT structure
pandoc content.md -o output.docx --reference-doc=template.docx

What happens:

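  • Template's tables disappear
  • Logos and other content in the document body are lost
  • Only style definitions and document properties (margins, page size, headers, footers) carry over
  • Looks completely different

Why it fails: --reference-doc means "copy the styles and document properties," NOT "preserve the document structure." Pandoc ignores the reference document's body content.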

❌ NEVER: Append content at the end

# This destroys template structure
template = Document('template.docx')

# Remove content after markers
# ... (deletion logic)

# Append all new content at end
for para in new_content:
    template.add_paragraph(para.text)  # WRONG!

What happens:

  • Template questions appear unanswered
  • All answers grouped at end
  • Structure broken
  • Obviously programmatic

❌ NEVER: Recreate tables

# DON'T copy table structure and rebuild
new_table = template.add_table(rows=3, cols=2)
# Even if copying all properties, it's not the original!

What happens:

  • Loses original element IDs
  • Style inheritance breaks
  • Metadata changes
  • Detectable as modified

Essential Workflow

Step 1: Inspect Template Structure FIRST

Always inspect before modifying. Never assume structure.

Use the provided inspection script:

python scripts/inspect_template.py template.docx

This prints:

  • All tables with identities
  • Potential anchor points (paragraphs ending with ":", "Answer:", etc.)
  • Headers and footers
  • Document element counts

Why critical: Inspecting first prevents modifying the wrong tables, missing anchors, and breaking structure.
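The bundled script's source is not reproduced here, so the following is only a minimal sketch of the kind of checks such an inspection might perform, using the standard library to read the .docx package directly. The function name inspect_docx and the rule that an anchor is any body-level paragraph ending in a colon are assumptions for illustration, not the behaviour of scripts/inspect_template.py.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def inspect_docx(path_or_file):
    """Summarise the body-level structure of a .docx (hypothetical helper)."""
    with zipfile.ZipFile(path_or_file) as zf:
        names = zf.namelist()
        body = ET.fromstring(zf.read("word/document.xml")).find(W + "body")

    paragraphs, tables = [], 0
    for child in body:  # direct children only, so table cells are not counted
        if child.tag == W + "p":
            paragraphs.append("".join(t.text or "" for t in child.iter(W + "t")))
        elif child.tag == W + "tbl":
            tables += 1

    # Assumption: an anchor is any body paragraph whose text ends with ":"
    anchors = [(i, text) for i, text in enumerate(paragraphs)
               if text.strip().endswith(":")]
    return {
        "tables": tables,
        "paragraphs": len(paragraphs),
        "anchors": anchors,
        "headers": sorted(n for n in names if re.fullmatch(r"word/header\d*\.xml", n)),
        "footers": sorted(n for n in names if re.fullmatch(r"word/footer\d*\.xml", n)),
    }
```

Because only direct children of the body are counted, the anchor indices should line up with template.paragraphs in python-docx for simple templates. Content wrapped in content controls would need extra handling.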

Step 2: Selective Table Filling

Modify cells in place. Never recreate tables.

from docx import Document

template = Document('template.docx')

# Fill specific cells in existing table
info_table = template.tables[0]
info_table.rows[0].cells[1].text = "Jane Smith"
info_table.rows[1].cells[1].text = "S12345"

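# Table structure, table style, and borders are preserved.
# Note: assigning cell.text rewrites the cell's paragraph content, so check
# that any formatting inside the cell still matches the template.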

Principle: Modify existing cells. Never remove and recreate.
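Assigning to cell.text in python-docx replaces the cell's contents with a single new paragraph and run, which can drop formatting that the template author applied inside the cell. A more conservative option, sketched here, writes into the cell's existing first run instead. The helper name set_cell_text is hypothetical, and it assumes the cell's visible text lives in its first paragraph.

```python
def set_cell_text(cell, text):
    """Write text into a table cell while reusing its existing paragraph and run.

    Hypothetical helper: `cell` is expected to behave like a python-docx
    table cell (it exposes .paragraphs, whose items expose .runs and .add_run).
    """
    paragraph = cell.paragraphs[0]
    if paragraph.runs:
        # Reuse the first run so its character formatting is kept
        paragraph.runs[0].text = text
        # Blank any remaining runs rather than deleting elements
        for extra in paragraph.runs[1:]:
            extra.text = ""
    else:
        # Empty cell: add a run that inherits the paragraph's formatting
        paragraph.add_run(text)


# Usage with python-docx (assumes the template's first table has this layout):
#   template = Document('template.docx')
#   set_cell_text(template.tables[0].rows[0].cells[1], "Jane Smith")
```

Leaving the extra runs in place with empty text keeps the element structure closer to the template than removing them would.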

Step 3: Anchor-Based Content Insertion

Insert content at specific positions using XML API.

# Find anchor paragraphs
anchor_positions = []
for i, para in enumerate(template.paragraphs):
    if para.text.strip() == "Answer:":
        anchor_positions.append(i)

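# Insert content after anchor using XML API
def insert_after(doc, anchor_idx, content_paras):
    """Move content_paras so they follow the paragraph at anchor_idx, in order.

    The underlying XML elements are moved, not copied, so they leave
    the document they came from.
    """
    anchor_elem = doc.paragraphs[anchor_idx]._element
    parent = anchor_elem.getparent()
    first_slot = parent.index(anchor_elem) + 1  # position right after the anchor

    for offset, para in enumerate(content_paras):
        parent.insert(first_slot + offset, para._element)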

# Load content to insert
content_doc = Document('my_content.docx')
section_paragraphs = content_doc.paragraphs[5:64]

# Insert at anchor
insert_after(template, anchor_positions[0], section_paragraphs)

# Save
template.save('completed.docx')

Why XML API:

  • doc.add_paragraph() appends at end → wrong position
  • para.insert_paragraph_before() has stale reference issues
  • XML API: direct element manipulation → correct position, zero artifacts

Step 4: Multi-Anchor Insertion (Reverse Order)

When inserting at multiple positions, insert from bottom to top to preserve earlier indices.

# Template has anchors at paragraphs 18, 27, 37

# Insert in REVERSE order
insert_after(template, 37, section3_content)  # Last anchor first
insert_after(template, 27, section2_content)  # Middle still at 27
insert_after(template, 18, section1_content)  # First still at 18

Why reverse: Inserting content shifts later paragraph indices but not earlier ones.
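The index shift is easy to see on a plain list. The sketch below stands in for the document with a list of strings, so insert_after_index and fill are illustrative names rather than part of the skill. It applies the same recorded anchor indices in both orders.

```python
def insert_after_index(items, index, new_items):
    """Insert new_items directly after items[index], keeping their order."""
    items[index + 1:index + 1] = new_items


def fill(jobs, paragraphs, reverse):
    """Apply every insertion using anchor indices recorded BEFORE any insertion."""
    result = list(paragraphs)
    for index, content in sorted(jobs, key=lambda job: job[0], reverse=reverse):
        insert_after_index(result, index, content)
    return result


template = ["Q1", "Answer:", "Q2", "Answer:"]
jobs = [(1, ["a1", "a1 cont."]), (3, ["a2"])]

print(fill(jobs, template, reverse=True))
# ['Q1', 'Answer:', 'a1', 'a1 cont.', 'Q2', 'Answer:', 'a2']

print(fill(jobs, template, reverse=False))
# ['Q1', 'Answer:', 'a1', 'a1 cont.', 'a2', 'Q2', 'Answer:']
```

In top-to-bottom order the second answer lands under the first question, because index 3 no longer points at the second "Answer:" once two paragraphs have been inserted above it.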

Advanced Patterns

For detailed implementations, see references/patterns.md:

  • Content range extraction - Extract multi-section content between markers
  • Table identity detection - Identify tables when no IDs exist
  • Robust anchor matching - exact/partial/smart modes
  • Table repositioning - Move tables without recreating
  • Verification - Ensure zero artifacts after filling
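references/patterns.md is not reproduced here, so what the three anchor matching modes do can only be guessed at. As a rough sketch under that caveat, one possible reading is shown below. The function names and, in particular, the definition of "smart" (ignore case, extra whitespace, and a trailing colon) are assumptions, not the skill's implementation.

```python
import re


def _normalize(text):
    """Collapse whitespace, drop a trailing colon, and ignore case."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed.rstrip(":").strip().casefold()


def matches_anchor(paragraph_text, anchor, mode="exact"):
    """Return True if paragraph_text counts as the anchor in the given mode."""
    text = paragraph_text.strip()
    if mode == "exact":
        return text == anchor
    if mode == "partial":
        return anchor in text
    if mode == "smart":  # assumed meaning, see the note above
        return _normalize(text) == _normalize(anchor)
    raise ValueError(f"unknown mode: {mode!r}")


def find_anchors(paragraph_texts, anchor, mode="exact"):
    """Return the indices of all paragraphs that match the anchor."""
    return [i for i, text in enumerate(paragraph_texts)
            if matches_anchor(text, anchor, mode)]
```

With python-docx the texts would come from [p.text for p in template.paragraphs], so the returned indices can be passed straight to insert_after.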

Common Scenarios

Scenario 1: Form with Info Table + Q&A

template = Document('form_template.docx')

# Fill info table
info_table = template.tables[0]
info_table.rows[0].cells[1].text = "Applicant Name"

# Find "Answer:" anchors
anchors = [i for i, p in enumerate(template.paragraphs)
           if p.text.strip() == "Answer:"]

# Insert responses
responses = Document('my_responses.docx')
response_content = responses.paragraphs[5:30]

insert_after(template, anchors[0], response_content)

template.save('form_completed.docx')

Scenario 2: Report with Table Repositioning

template = Document('report_template.docx')

# Fill team table
team_table = template.tables[0]
team_table.rows[0].cells[1].text = "Team 5"

# Insert section content at anchors
# ... (insertion code)

# Move summary table to correct position
summary_heading_idx = next(i for i, p in enumerate(template.paragraphs)
                           if "Summary Table:" in p.text)

# Move table from end to after summary heading
# See references/patterns.md for move_table_to_position()

template.save('report_completed.docx')

Bundled Resources

Scripts

  • scripts/inspect_template.py - Inspect template structure before modification
    • Usage: python scripts/inspect_template.py <template.docx>
    • Prevents destructive mistakes by showing all tables, anchors, headers/footers

References

  • references/patterns.md - Detailed technical patterns
    • Content range extraction
    • Table identity detection strategies
    • XML-level insertion patterns
    • Multi-anchor workflows
    • Verification procedures
    • Complete code examples

Load patterns.md when implementing specific operations beyond basic workflow.

Verification Checklist

Template filling is successful if:

  • Filled document indistinguishable from manual entry
  • All template tables preserved (count unchanged unless expected)
  • Headers/footers unchanged
  • Logo(s) intact
  • Scoring/grading tables empty (if they should be)
  • Styles identical to original
  • Content inserted at correct anchor points (not at end)
  • Template owner cannot detect programmatic manipulation
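Some of these checks can be automated. The sketch below is one possible approach, not the verification procedure from references/patterns.md: it compares the template and the filled file at the package level using only the standard library. The name compare_to_template and the choice of which parts must stay unchanged are assumptions. XML parts are compared in canonical form because a library may re-serialise them on save without changing their meaning.

```python
import hashlib
import re
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
# Assumption: these package parts should be unchanged by filling
PRESERVED = re.compile(r"word/(header\d*|footer\d*|styles)\.xml|word/media/.+")


def _fingerprint(zf, name):
    """Hash a part; XML is canonicalised first so formatting noise is ignored."""
    data = zf.read(name)
    if name.endswith(".xml"):
        data = ET.canonicalize(xml_data=data).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def _fingerprints(zf):
    return {n: _fingerprint(zf, n) for n in zf.namelist() if PRESERVED.fullmatch(n)}


def _table_count(zf):
    root = ET.fromstring(zf.read("word/document.xml"))
    return sum(1 for _ in root.iter(W + "tbl"))


def compare_to_template(template, filled):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    with zipfile.ZipFile(template) as t, zipfile.ZipFile(filled) as f:
        before, after = _fingerprints(t), _fingerprints(f)
        for name in sorted(before.keys() | after.keys()):
            if before.get(name) != after.get(name):
                problems.append(f"{name} differs from template")
        tables_before, tables_after = _table_count(t), _table_count(f)
        if tables_before != tables_after:
            problems.append(
                f"table count changed from {tables_before} to {tables_after}")
    return problems
```

An empty result covers only the headers, footers, styles, media, and table count. Anchor placement and the contents of scoring tables still need to be checked separately.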

Key Lessons

This skill documents patterns where:

  • Templates have info tables (to fill) and evaluation/scoring tables (preserve empty)
  • Multiple anchor points like "Answer:", "Response:", or "Solution:" for content insertion
  • Tables may need repositioning to correct sections
  • Document structure must remain intact (headers, footers, logos, branding)
  • Zero artifacts requirement (recipient cannot detect automation)

Use cases: Forms, questionnaires, standardized documents, applications, reports.

Core principle: Preservation over recreation. Never rebuild - always modify in place.

Source

https://github.com/belumume/claude-skills/blob/main/web-desktop-exports/docx-template-filling/SKILL.md

Overview

docx-template-filling fills template forms programmatically while preserving 100% of the original structure. The filled document keeps logos, footers, headers, styles, metadata and element IDs, producing an output indistinguishable from manual entry. This preservation is essential for compliance, auditability, and form integrity.

How This Skill Works

The workflow loads the existing DOCX template and inserts content at anchor points using an XML-based API rather than recreating content. It relies on inspecting the template to locate anchors, then updates table cells and paragraphs in place while preserving the template structure, metadata, and element IDs. This approach prevents artifacts and keeps the document indistinguishable from manual typing.

Quick Start

  1. Inspect the template structure first using the inspection script: python scripts/inspect_template.py template.docx
  2. Fill content in place, e.g., modify existing table cells or insert after anchor points using an XML-based approach with the python-docx library
  3. Save the result as a new DOCX and validate it by re-running the inspection script on the output to confirm zero artifacts

Best Practices

  • Inspect template structure before modifying using the inspection script
  • Modify existing cells and anchors in place instead of recreating elements
  • Fill only specific known anchor points to preserve IDs
  • Use XML-API-driven insertion and avoid end-of-document appends
  • Validate preserved structure by re-inspecting the output and checking logos, headers, footers, and metadata

Example Use Cases

  • Filling a job application form without changing the template layout while preserving branding
  • Populating a standardized insurance claim form with responses while keeping logos and metadata intact
  • Completing a client onboarding questionnaire in a template without altering styles or IDs
  • Updating vendor intake forms while maintaining header and footer integrity
  • Replacing answers in a legal template while preserving all element structure and metadata
