epub-translate

npx machina-cli add skill eugenepyvovarov/mcpbundler-agent-skills-marketplace/epub-translate --openclaw

EPUB Translate

Workflow

  1. One-time setup (writes .skills-data/epub-translate/.env): scripts/epub-translate setup
  2. Extract translation units: scripts/epub-translate extract --epub /path/book.epub
  3. Translate via OpenAI API: scripts/epub-translate translate --job-dir <job-dir> --target-lang <bcp47> [--fraction 0.1]
  4. Apply + repack: scripts/epub-translate apply --job-dir <job-dir> --translations <job-dir>/translations.jsonl --target-lang <bcp47> --out-epub /path/book.<bcp47>.epub
  5. Optional: validate output: scripts/epub-translate validate --epub /path/book.<bcp47>.epub

Commands

  • scripts/epub-translate setup:
    • Prompts for OPENAI_API_KEY (saved to .skills-data/epub-translate/.env).
    • Prompts for default OPENAI_MODEL (e.g. gpt-5.1, gpt-5-mini).
  • scripts/epub-translate extract ...:
    • Unzips the EPUB into a per-run job dir under .skills-data/epub-translate/tmp/.
    • Parses META-INF/container.xml → package document (.opf) → manifest/spine.
    • Extracts XHTML <head><title> and leaf block-level XHTML fragments (inner HTML) in reading order into units.jsonl.
    • Optional: include OPF dc:title via --include-opf-title.
  • scripts/epub-translate translate ...:
    • Reads <job-dir>/units.jsonl and writes <job-dir>/translations.jsonl.
    • Uses the OpenAI Responses API (https://api.openai.com/v1/responses).
    • Resume-safe by default: if translations.jsonl already contains some ids, they are skipped.
  • scripts/epub-translate apply ...:
    • Applies translated fragments back into the unpacked XHTML files.
    • Updates dc:language in the OPF and xml:lang in the XHTML.
    • Repackages as a valid EPUB (writes mimetype first, stored/uncompressed).
  • scripts/epub-translate validate ...: checks basic EPUB container invariants.
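The container.xml → package document → manifest/spine lookup described for extract can be sketched with the standard library alone. This is an illustrative sketch, not the skill's actual code: the function names and the sample documents are hypothetical, and it ignores details such as percent-encoded hrefs.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespaces defined by the EPUB container and package formats.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def find_opf_path(container_xml: str) -> str:
    """Return the package document path named in META-INF/container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]


def spine_documents(opf_xml: str, opf_path: str) -> list[str]:
    """Return manifest hrefs in spine (reading) order, relative to the EPUB root."""
    root = ET.fromstring(opf_xml)
    manifest = {
        item.attrib["id"]: item.attrib["href"]
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    opf_dir = posixpath.dirname(opf_path)
    return [
        posixpath.normpath(posixpath.join(opf_dir, manifest[ref.attrib["idref"]]))
        for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


# Hypothetical sample documents, for illustration only.
SAMPLE_CONTAINER = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""

opf_path = find_opf_path(SAMPLE_CONTAINER)
reading_order = spine_documents(SAMPLE_OPF, opf_path)
```

Reading order comes from the spine, not from the order of items in the manifest, which is why the sample manifest lists the chapters in reverse.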

JSONL formats

units.jsonl (input to translation) has one JSON object per line, e.g.:

{"id":1,"kind":"xhtml-fragment","doc_path":"EPUB/chapter1.xhtml","xpath":"/html[1]/body[1]/p[3]","tag":"p","source_inner_html":"Hello <em>world</em>!","source_markup_hash":"..."}

translations.jsonl (output from translation) must contain:

{"id":1,"translated_inner_html":"Hola <em>mundo</em>!"}
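Given these two formats, the resume-safe behaviour of translate (skipping ids already present in translations.jsonl) could look roughly like the sketch below. The helper names are hypothetical and the skill's real implementation may differ; the functions take file contents as strings to keep the example self-contained.

```python
import json


def load_done_ids(translations_text: str) -> set:
    """Collect the ids that already have a line in translations.jsonl."""
    done = set()
    for line in translations_text.splitlines():
        if line.strip():
            done.add(json.loads(line)["id"])
    return done


def pending_units(units_text: str, translations_text: str) -> list:
    """Return the units from units.jsonl that still need translating."""
    done = load_done_ids(translations_text)
    units = [json.loads(line) for line in units_text.splitlines() if line.strip()]
    return [unit for unit in units if unit["id"] not in done]


# Hypothetical job state: three extracted units, one already translated.
units_text = "\n".join(
    json.dumps({"id": i, "kind": "xhtml-fragment", "source_inner_html": text})
    for i, text in [(1, "Hello <em>world</em>!"), (2, "Chapter One"), (3, "The end.")]
)
translations_text = json.dumps(
    {"id": 1, "translated_inner_html": "Hola <em>mundo</em>!"}
) + "\n"

todo = pending_units(units_text, translations_text)
todo_ids = [unit["id"] for unit in todo]
```

Appending one line per finished unit keeps translations.jsonl usable after an interruption, since every complete line is independently parseable.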

Translation rules (markup-safe)

  • Translate only human-readable text, not markup.
  • Do not change any tags, nesting, attribute names, attribute values, URLs, IDs, or filenames.
  • Keep entities/character references as-is (e.g. &amp;, &#160;).
  • If a fragment contains non-translatable text (code, formulas, URLs), leave it unchanged.
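One way to check these rules mechanically is to compare the markup "skeleton" of the source and translated fragments. The sketch below is an assumption about how such a check could be written, not part of the skill: the class and function names are hypothetical. It treats tags and attributes as an ordered sequence and entities/character references as an unordered count, so a translation that reorders inline elements would be flagged.

```python
from collections import Counter
from html.parser import HTMLParser


class MarkupSkeleton(HTMLParser):
    """Records tags (with attributes) and entity/character references, ignoring text."""

    def __init__(self):
        # convert_charrefs=False keeps &amp; and &#160; visible as separate events.
        super().__init__(convert_charrefs=False)
        self.tags = []
        self.refs = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag, tuple(attrs)))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_entityref(self, name):
        self.refs["&" + name + ";"] += 1

    def handle_charref(self, name):
        self.refs["&#" + name + ";"] += 1


def skeleton(fragment: str):
    parser = MarkupSkeleton()
    parser.feed(fragment)
    parser.close()
    return parser.tags, parser.refs


def same_markup(source: str, translated: str) -> bool:
    """True if both fragments have identical tags, attributes, and references."""
    return skeleton(source) == skeleton(translated)


ok = same_markup("Hello <em>world</em>!", "Hola <em>mundo</em>!")
changed_url = same_markup('<a href="a.html">link</a>', '<a href="b.html">enlace</a>')
lost_entity = same_markup("Salt &amp; pepper", "Sal y pimienta")
```

A check like this catches changed attribute values and dropped entities, but it cannot tell whether code or formulas inside the text were left untranslated.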

Local data and env

  • Store all mutable state under <project_root>/.skills-data/<skill-name>/.
  • Keep config and registries in .skills-data/<skill-name>/ (for example: config.json, <feature>.json).
  • Use .skills-data/<skill-name>/.env for SKILL_ROOT, SKILL_DATA_DIR, and any per-skill env keys.
  • Install local tools into .skills-data/<skill-name>/bin and prepend it to PATH when needed.
  • Install dependencies under .skills-data/<skill-name>/venv:
    • Python: .skills-data/<skill-name>/venv/python
    • Node: .skills-data/<skill-name>/venv/node_modules
    • Go: .skills-data/<skill-name>/venv/go (modcache, gocache)
    • PHP: .skills-data/<skill-name>/venv/php (cache, vendor)
  • Write logs/cache/tmp under .skills-data/<skill-name>/logs, .skills-data/<skill-name>/cache, .skills-data/<skill-name>/tmp.
  • Keep automation in <skill-root>/scripts and read SKILL_DATA_DIR (default to <project_root>/.skills-data/<skill-name>/).
  • Do not write outside <skill-root> and <project_root>/.skills-data/<skill-name>/ unless the user requests it.

OpenAI config keys

Stored in .skills-data/epub-translate/.env (created by scripts/epub-translate setup):

  • OPENAI_API_KEY
  • OPENAI_MODEL (default: gpt-5-mini; change via setup --model ..., editing .env, or translate --model ...)
  • OPENAI_BASE_URL (default: https://api.openai.com/v1)
  • OPENAI_REASONING_EFFORT (default: low; change via setup --reasoning-effort ... or editing .env)
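As an illustration of how these keys and their documented defaults might be combined, here is a minimal sketch of a .env reader. It assumes the file uses plain KEY=VALUE lines, which the source does not spell out, and the function names are hypothetical.

```python
# Defaults as documented above; OPENAI_API_KEY has no default.
DEFAULTS = {
    "OPENAI_MODEL": "gpt-5-mini",
    "OPENAI_BASE_URL": "https://api.openai.com/v1",
    "OPENAI_REASONING_EFFORT": "low",
}


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(env_text: str) -> dict:
    """Overlay .env values on the documented defaults and require an API key."""
    config = {**DEFAULTS, **parse_env(env_text)}
    if not config.get("OPENAI_API_KEY"):
        raise ValueError("OPENAI_API_KEY is missing; run the setup command first")
    return config


# Hypothetical .env contents; the key is a placeholder, not a real credential.
config = load_config("# written by setup\nOPENAI_API_KEY=sk-placeholder\nOPENAI_MODEL=gpt-5.1\n")
```

In this sketch a value in .env overrides the default, and a command-line flag such as translate --model would be applied on top of the returned dictionary.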

Source

https://github.com/eugenepyvovarov/mcpbundler-agent-skills-marketplace/blob/main/epub-translate/SKILL.md

Overview

epub-translate unpacks an EPUB, extracts XHTML fragments into JSONL units, and translates them via the OpenAI Responses API while preserving markup. It then updates language metadata and repackages a valid, mimetype-first EPUB for distribution. The result is an automated translation that keeps the book's formatting and structure intact.

How This Skill Works

It parses META-INF/container.xml to locate the package document (.opf), then collects XHTML head titles and leaf block-level fragments in reading order into units.jsonl. The translate step sends the units from units.jsonl to the OpenAI Responses API and writes the returned translated_inner_html for each id to translations.jsonl. The apply step re-injects those fragments into the original XHTML while keeping tags and attributes intact. Finally, it updates the language metadata in the OPF (dc:language) and in the XHTML (xml:lang), and repackages the EPUB with the mimetype entry first and stored uncompressed so that the container is valid.
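The mimetype-first, stored/uncompressed repackaging can be sketched with Python's zipfile module. This is a minimal illustration under assumptions, not the skill's own code: the function names and the in-memory file dictionary are hypothetical.

```python
import io
import zipfile

MIMETYPE = b"application/epub+zip"


def pack_epub(files: dict) -> bytes:
    """Zip {archive_path: bytes} into an EPUB with a stored mimetype entry first."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr(zipfile.ZipInfo("mimetype"), MIMETYPE,
                    compress_type=zipfile.ZIP_STORED)
        for name in sorted(files):
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, files[name], compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def mimetype_entry_ok(epub_bytes: bytes) -> bool:
    """Check the basic invariant: first entry is an uncompressed 'mimetype' file."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        first = zf.infolist()[0]
        return (
            first.filename == "mimetype"
            and first.compress_type == zipfile.ZIP_STORED
            and zf.read(first) == MIMETYPE
        )


# Hypothetical minimal contents, for illustration only.
epub_bytes = pack_epub({
    "META-INF/container.xml": b"<container/>",
    "EPUB/chapter1.xhtml": b"<html/>",
})
valid = mimetype_entry_ok(epub_bytes)
```

Passing an explicit ZipInfo for the mimetype entry avoids extra header fields, so the file name and its content sit at fixed offsets at the start of the archive where reading systems expect them.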

When to Use It

  • Bulk translate multiple EPUBs into a new language for a library.
  • Preserve all markup, tags, and structure while translating only human-readable text.
  • Update dc:language and xml:lang metadata to reflect the new language.
  • Resume safely after an interrupted translation run: ids already present in translations.jsonl are skipped.
  • Validate the final EPUB to ensure the container invariants before distribution.

Quick Start

  1. Run scripts/epub-translate setup to configure the API key and model.
  2. Run scripts/epub-translate extract --epub /path/book.epub to create units.jsonl.
  3. Run scripts/epub-translate translate --job-dir <job-dir> --target-lang <bcp47>, then run apply and validate to produce the new EPUB.

Best Practices

  • Test on a small EPUB before large batches to tune translation quality.
  • Use --include-opf-title if you want to translate the OPF title as well.
  • Keep non-translatable content such as code and URLs unchanged, per the translation rules.
  • Verify that the ids in translations.jsonl match those in units.jsonl before applying.
  • Always run the validate command on the final EPUB.
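The id check recommended above can be done before apply with a few lines of Python. This is a hedged sketch with a hypothetical function name, operating on the file contents as strings; it reports units without a translation, translations for unknown ids, and duplicated ids.

```python
import json


def check_alignment(units_text: str, translations_text: str) -> dict:
    """Compare ids in units.jsonl and translations.jsonl and report mismatches."""
    unit_ids = {
        json.loads(line)["id"] for line in units_text.splitlines() if line.strip()
    }
    seen, duplicates = set(), set()
    for line in translations_text.splitlines():
        if not line.strip():
            continue
        item_id = json.loads(line)["id"]
        if item_id in seen:
            duplicates.add(item_id)
        seen.add(item_id)
    return {
        "missing": sorted(unit_ids - seen),     # units with no translation yet
        "unknown": sorted(seen - unit_ids),     # translations for ids never extracted
        "duplicates": sorted(duplicates),       # ids translated more than once
    }


# Hypothetical contents, for illustration only.
units_text = '{"id":1,"source_inner_html":"Hello"}\n{"id":2,"source_inner_html":"Bye"}\n'
translations_text = (
    '{"id":1,"translated_inner_html":"Hola"}\n'
    '{"id":1,"translated_inner_html":"Hola"}\n'
    '{"id":7,"translated_inner_html":"?"}\n'
)
report = check_alignment(units_text, translations_text)
```

An empty list under every key means the two files cover exactly the same ids.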

Example Use Cases

  • Translate a 350-page English travel guide to Spanish while preserving headings, lists, and emphasis.
  • Localize a German math textbook to French, ensuring math elements remain intact.
  • Convert an English children's story EPUB to Italian with preserved typography and markup.
  • Publish a bilingual distribution by producing new EPUBs with updated language metadata.
  • Process a catalog of EPUBs for a regional library in multiple languages using separate runs.
