paper-slide-deck
npx machina-cli add skill Ronitnair/research-skills/paper-slide-deck --openclaw
Paper Slide Deck Generator
Transform academic papers and content into professional slide deck images with automatic figure extraction.
Usage
/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck # Then paste content
Script Directory
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
- Determine this SKILL.md file's directory path as SKILL_DIR
- Script path = ${SKILL_DIR}/scripts/<script-name>.ts (generate-slides.py uses the .py extension)
- Replace all ${SKILL_DIR} in this document with the actual path
Script Reference:
| Script | Purpose |
|---|---|
| scripts/generate-slides.py | Generate AI slides via Gemini API (Python) |
| scripts/merge-to-pptx.ts | Merge slides into PowerPoint |
| scripts/merge-to-pdf.ts | Merge slides into PDF |
| scripts/detect-figures.ts | Auto-detect figures/tables in PDF |
| scripts/extract-figure.ts | Extract figure from PDF page (uses PyMuPDF fallback) |
| scripts/apply-template.ts | Apply figure container template |
Options
| Option | Description |
|---|---|
| --style <name> | Visual style (see Style Gallery) |
| --audience <type> | Target audience: beginners, intermediate, experts, executives, general |
| --lang <code> | Output language (en, zh, ja, etc.) |
| --slides <number> | Target slide count |
| --outline-only | Generate outline only, skip image generation |
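As a rough illustration of how the options above fit together, the following sketch models them with Python's argparse. The skill itself is invoked as a slash command, so this parser is purely hypothetical; the function name build_parser and the choice to leave --style unset for auto-selection are assumptions, not part of the skill.

```python
import argparse

# Audience values copied from the Options table.
AUDIENCES = ["beginners", "intermediate", "experts", "executives", "general"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="paper-slide-deck")
    # The source is optional because content may be pasted afterwards.
    parser.add_argument("source", nargs="?", help="path to paper.pdf or content.md")
    # Assumption: leaving --style unset means the style is auto-selected.
    parser.add_argument("--style", help="visual style name from the Style Gallery")
    parser.add_argument("--audience", choices=AUDIENCES, help="target audience")
    parser.add_argument("--lang", help="output language code, e.g. en, zh, ja")
    parser.add_argument("--slides", type=int, help="target slide count")
    parser.add_argument("--outline-only", action="store_true",
                        help="generate the outline and skip image generation")
    return parser


args = build_parser().parse_args(
    ["paper.pdf", "--style", "academic-paper", "--slides", "10"]
)
print(args.source, args.style, args.slides, args.outline_only)
```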
Style Gallery
| Style | Description | Best For |
|---|---|---|
| academic-paper | Clean professional, precise charts | Conference talks, thesis defense |
| blueprint (Default) | Technical schematics, grid texture | Architecture, system design |
| chalkboard | Black chalkboard, colorful chalk | Education, tutorials, classroom |
| notion | SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| bold-editorial | Magazine cover, bold typography, dark | Product launches, keynotes |
| corporate | Navy/gold, structured layouts | Investor decks, proposals |
| dark-atmospheric | Cinematic dark mode, glowing accents | Entertainment, gaming |
| editorial-infographic | Magazine explainers, flat illustrations | Tech explainers, research |
| fantasy-animation | Ghibli/Disney style, hand-drawn | Educational, storytelling |
| intuition-machine | Technical briefing, bilingual labels | Technical docs, academic |
| minimal | Ultra-clean, maximum whitespace | Executive briefings, premium |
| pixel-art | Retro 8-bit, chunky pixels | Gaming, developer talks |
| scientific | Academic diagrams, precise labeling | Biology, chemistry, medical |
| sketch-notes | Hand-drawn, warm & friendly | Educational, tutorials |
| vector-illustration | Flat vector, retro & cute | Creative, children's content |
| vintage | Aged-paper, historical styling | Historical, heritage, biography |
| watercolor | Hand-painted textures, natural warmth | Lifestyle, wellness, travel |
Auto Style Selection
| Content Signals | Selected Style |
|---|---|
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | academic-paper |
| tutorial, learn, education, guide, intro, beginner | sketch-notes |
| classroom, teaching, school, chalkboard, blackboard | chalkboard |
| architecture, system, data, analysis, technical | blueprint |
| creative, children, kids, cute, illustration | vector-illustration |
| briefing, bilingual, infographic, concept | intuition-machine |
| executive, minimal, clean, simple, elegant | minimal |
| saas, product, dashboard, metrics, productivity | notion |
| investor, quarterly, business, corporate, proposal | corporate |
| launch, marketing, keynote, bold, impact, magazine | bold-editorial |
| entertainment, music, gaming, creative, atmospheric | dark-atmospheric |
| explainer, journalism, science communication | editorial-infographic |
| story, fantasy, animation, magical, whimsical | fantasy-animation |
| gaming, retro, pixel, developer, nostalgia | pixel-art |
| biology, chemistry, medical, pathway, scientific | scientific |
| history, heritage, vintage, expedition, historical | vintage |
| lifestyle, wellness, travel, artistic, natural | watercolor |
| Default | blueprint |
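The table above does not say how competing signals are weighed, so the following is only a minimal sketch of one plausible reading: count whole-word keyword hits per row, pick the row with the most hits, let the earlier row win ties, and fall back to blueprint. The function name select_style and the scoring rule are assumptions.

```python
import re

# (keywords, style) pairs copied from the Auto Style Selection table, in table order.
STYLE_SIGNALS = [
    (("paper", "thesis", "defense", "conference", "ieee", "acm", "icml",
      "neurips", "cvpr", "acl", "aaai", "iclr"), "academic-paper"),
    (("tutorial", "learn", "education", "guide", "intro", "beginner"), "sketch-notes"),
    (("classroom", "teaching", "school", "chalkboard", "blackboard"), "chalkboard"),
    (("architecture", "system", "data", "analysis", "technical"), "blueprint"),
    (("creative", "children", "kids", "cute", "illustration"), "vector-illustration"),
    (("briefing", "bilingual", "infographic", "concept"), "intuition-machine"),
    (("executive", "minimal", "clean", "simple", "elegant"), "minimal"),
    (("saas", "product", "dashboard", "metrics", "productivity"), "notion"),
    (("investor", "quarterly", "business", "corporate", "proposal"), "corporate"),
    (("launch", "marketing", "keynote", "bold", "impact", "magazine"), "bold-editorial"),
    (("entertainment", "music", "gaming", "creative", "atmospheric"), "dark-atmospheric"),
    (("explainer", "journalism", "science communication"), "editorial-infographic"),
    (("story", "fantasy", "animation", "magical", "whimsical"), "fantasy-animation"),
    (("gaming", "retro", "pixel", "developer", "nostalgia"), "pixel-art"),
    (("biology", "chemistry", "medical", "pathway", "scientific"), "scientific"),
    (("history", "heritage", "vintage", "expedition", "historical"), "vintage"),
    (("lifestyle", "wellness", "travel", "artistic", "natural"), "watercolor"),
]
DEFAULT_STYLE = "blueprint"


def select_style(text: str) -> str:
    """Return the style whose keywords appear most often as whole words."""
    lowered = text.lower()
    best_style, best_hits = DEFAULT_STYLE, 0
    for keywords, style in STYLE_SIGNALS:
        hits = sum(
            1 for kw in keywords
            if re.search(rf"\b{re.escape(kw)}\b", lowered)
        )
        # Strict '>' keeps the earlier table row when two rows tie (assumption).
        if hits > best_hits:
            best_style, best_hits = style, hits
    return best_style
```

Under this reading, a shared keyword such as gaming resolves to dark-atmospheric because that row comes first; a different tie-break would be equally defensible.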
Layout Gallery
Optional layout hints for individual slides. Specify in outline's // LAYOUT section.
Slide-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| title-hero | Large centered title + subtitle | Cover slides, section breaks |
| quote-callout | Featured quote with attribution | Testimonials, key insights |
| key-stat | Single large number as focal point | Impact statistics, metrics |
| split-screen | Half image, half text | Feature highlights, comparisons |
| icon-grid | Grid of icons with labels | Features, capabilities, benefits |
| two-columns | Content in balanced columns | Paired information, dual points |
| three-columns | Content in three columns | Triple comparisons, categories |
| image-caption | Full-bleed image + text overlay | Visual storytelling, emotional |
| agenda | Numbered list with highlights | Session overview, roadmap |
| bullet-list | Structured bullet points | Simple content, lists |
Infographic-Derived Layouts
| Layout | Description | Best For |
|---|---|---|
| linear-progression | Sequential flow left-to-right | Timelines, step-by-step |
| binary-comparison | Side-by-side A vs B | Before/after, pros-cons |
| comparison-matrix | Multi-factor grid | Feature comparisons |
| hierarchical-layers | Pyramid or stacked levels | Priority, importance |
| hub-spoke | Central node with radiating items | Concept maps, ecosystems |
| bento-grid | Varied-size tiles | Overview, summary |
| funnel | Narrowing stages | Conversion, filtering |
| dashboard | Metrics with charts/numbers | KPIs, data display |
| venn-diagram | Overlapping circles | Relationships, intersections |
| circular-flow | Continuous cycle | Recurring processes |
| winding-roadmap | Curved path with milestones | Journey, timeline |
| tree-branching | Parent-child hierarchy | Org charts, taxonomies |
| iceberg | Visible vs hidden layers | Surface vs depth |
| bridge | Gap with connection | Problem-solution |
Academic-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| paper-title | Title, authors, affiliations, venue | Conference paper cover |
| outline-agenda | Numbered section list with highlights | Talk structure overview |
| methods-diagram | Central architecture/pipeline diagram | Methods, system design |
| results-chart | Chart area + data annotations | Quantitative results |
| equation-focus | Centered equation + variable definitions | Mathematical derivations |
| qualitative-grid | 2x2 or 3x2 image comparison grid | Visual results, ablations |
| references-list | Numbered citation list | Key references slide |
| contributions | Numbered contribution points | Contributions summary |
Usage: Add Layout: <name> in slide's // LAYOUT section to guide visual composition.
Design Philosophy
This deck is designed for reading and sharing, not live presentation:
- Each slide must be self-explanatory without verbal commentary
- Structure content for logical flow when scrolling
- Include all necessary context within each slide
- Optimize for social media sharing and offline reading
File Management
Output Directory
Each session creates an independent directory named by content slug:
slide-deck/{topic-slug}/
├── source-{slug}.{ext} # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md # Style variant outlines
├── prompts/
│ └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf
Slug Generation:
- Extract main topic from content (2-4 words, kebab-case)
- Example: "Introduction to Machine Learning" → intro-machine-learning
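The kebab-casing step can be sketched as below. Picking the main topic from the content is left to the agent; this sketch only normalizes an already chosen topic phrase. The function name to_slug, the stopword list, and the abbreviation map are hypothetical and were chosen so that the documented example comes out as shown.

```python
import re

# Hypothetical helpers: neither list is defined by the skill itself.
STOPWORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on", "with"}
ABBREVIATIONS = {"introduction": "intro"}


def to_slug(topic: str, max_words: int = 4) -> str:
    """Turn a topic phrase into a kebab-case slug of at most max_words words."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    kept = [ABBREVIATIONS.get(w, w) for w in words if w not in STOPWORDS]
    return "-".join(kept[:max_words])


print(to_slug("Introduction to Machine Learning"))
```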
Conflict Resolution
If slide-deck/{topic-slug}/ already exists:
- Append timestamp: {topic-slug}-YYYYMMDD-HHMMSS
- Example: intro-ml exists → intro-ml-20260118-143052
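A minimal sketch of this conflict rule, assuming the timestamp is taken from local time at the moment of the check. The function name resolve_output_dir and the injectable now parameter are illustrative and not part of the skill.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(root: Path, slug: str, now: Optional[datetime] = None) -> Path:
    """Return root/slug, or root/slug-YYYYMMDD-HHMMSS if root/slug already exists."""
    candidate = root / slug
    if not candidate.exists():
        return candidate
    # Format matches the documented pattern {topic-slug}-YYYYMMDD-HHMMSS.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return root / f"{slug}-{stamp}"
```

Calling resolve_output_dir(Path("slide-deck"), "intro-ml") returns slide-deck/intro-ml on a first run and a timestamped sibling on later runs, so earlier sessions are never overwritten.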
Source Files
Copy all sources with naming source-{slug}.{ext}:
- source-article.md (main text content)
- source-diagram.png (image from conversation)
- source-data.xlsx (additional file)
Multiple sources supported: text, images, files from conversation.
Workflow
Step 1: Analyze Content
- Save source content (if pasted, save as source.md)
- Follow references/analysis-framework.md for deep content analysis
- Determine style (use --style or auto-select from signals)
- Detect languages (source vs. user preference)
- Plan slide count (--slides or dynamic)
- For academic papers (PDF with figures): Run automatic figure detection:
  npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
  This outputs a JSON file with all detected figures/tables, their page numbers, and captions.
Step 2: Generate Outline Variants
- Generate 3 style variant outlines based on content analysis
- Follow references/outline-template.md for structure
- Auto-populate IMAGE_SOURCE for academic papers:
  - Read figures.json from Step 1
  - Map figures to slides using rules in references/analysis-framework.md Section 8
  - Automatically add // IMAGE_SOURCE blocks to appropriate slides:
    - Architecture/pipeline figures → Methods slides (Source: extract)
    - Results tables → Quantitative results slides (Source: extract)
    - Comparison images → Qualitative results slides (Source: extract)
    - Conceptual/simple diagrams → Leave for AI generation (Source: generate or omit)
- Save as outline-{style}.md for each variant
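The figure-to-slide mapping above could look roughly like the sketch below. The real rules live in references/analysis-framework.md Section 8 and the real figures.json schema is not shown in this document, so everything here is an assumption: the entry fields (number, kind, page, caption), the caption keywords, the slide labels, and the simplification that every table counts as a results table.

```python
import json

# Hypothetical caption keywords; the actual rules are in analysis-framework.md Section 8.
ARCHITECTURE_WORDS = ("architecture", "pipeline")
COMPARISON_WORDS = ("comparison", "qualitative", "ablation")


def map_figure(entry: dict) -> dict:
    """Decide which slide type a detected figure feeds and how it is sourced."""
    caption = entry["caption"].lower()
    if entry["kind"] == "table":
        # Simplification: treat every table as a results table.
        slide, source = "quantitative-results", "extract"
    elif any(word in caption for word in ARCHITECTURE_WORDS):
        slide, source = "methods", "extract"
    elif any(word in caption for word in COMPARISON_WORDS):
        slide, source = "qualitative-results", "extract"
    else:
        # Conceptual or simple diagrams are left for AI generation.
        slide, source = None, "generate"
    return {"figure": entry["number"], "page": entry["page"],
            "slide": slide, "source": source}


# Assumed shape of figures.json, for illustration only.
sample = json.loads("""
[
  {"number": 1, "kind": "figure", "page": 3, "caption": "Overall architecture of the proposed model"},
  {"number": 2, "kind": "table", "page": 6, "caption": "Accuracy on the benchmark"},
  {"number": 3, "kind": "figure", "page": 7, "caption": "Qualitative comparison with baselines"},
  {"number": 4, "kind": "figure", "page": 2, "caption": "Intuition behind the method"}
]
""")
plan = [map_figure(entry) for entry in sample]
for row in plan:
    print(row)
```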
Step 3: User Confirmation
Single AskUserQuestion with all applicable options:
| Question | When to Ask |
|---|---|
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |
After selection:
- Copy selected outline-{style}.md to outline.md
- Regenerate in different language if requested
- User may edit outline.md for fine-tuning
If --outline-only, stop here.
Step 4: Generate Prompts
- Read references/base-prompt.md
- Combine with style instructions from outline
- Add slide-specific content
- If Layout: specified in outline, include layout guidance in prompt:
  - Reference layout characteristics for image composition
  - Example: Layout: hub-spoke → "Central concept in middle with related items radiating outward"
- Save to prompts/ directory
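The assembly order described in Step 4 can be sketched as a plain string join. The exact prompt format is defined by references/base-prompt.md, which is not reproduced here, so the function name build_prompt, the blank-line separator, and the LAYOUT_HINTS table are assumptions; only the hub-spoke wording is taken from the example above.

```python
from typing import Optional

# Hypothetical lookup table; the hub-spoke text is quoted from the Step 4 example.
LAYOUT_HINTS = {
    "hub-spoke": "Central concept in middle with related items radiating outward",
}


def build_prompt(base_prompt: str, style_instructions: str,
                 slide_content: str, layout: Optional[str] = None) -> str:
    """Join base prompt, style instructions, slide content, and optional layout guidance."""
    sections = [base_prompt.strip(), style_instructions.strip(), slide_content.strip()]
    if layout:
        hint = LAYOUT_HINTS.get(layout, "")
        sections.append(f"Layout: {layout}" + (f" ({hint})" if hint else ""))
    return "\n\n".join(sections) + "\n"


prompt = build_prompt("BASE", "STYLE", "CONTENT", layout="hub-spoke")
print(prompt)
```

The returned text would then be written to a file such as prompts/02-slide-{slug}.md.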
Step 5: Image Generation Method Selection
Before generating images, ask user to choose generation method:
Use AskUserQuestion with options:
| Option | Label | Description |
|---|---|---|
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires GOOGLE_API_KEY or GEMINI_API_KEY env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |
Based on selection:
Option 1: Gemini API (Python)
- Verify API key: Check GOOGLE_API_KEY or GEMINI_API_KEY environment variable
- Run generation script:
  python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
Script Features:
- Auto-installs google-genai package if missing
- Retry logic with exponential backoff (3 retries)
- Skips already-generated slides (> 10KB)
- Supports custom model via --model flag
- Outputs to slides/ subdirectory
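The retry and skip behavior listed above can be illustrated with the sketch below. It is not the code of generate-slides.py: the function names, the 2-second base delay, and the reading of "3 retries" as one initial attempt plus three retries are assumptions. The generate callable stands in for the actual API call.

```python
import time
from pathlib import Path
from typing import Callable

MIN_VALID_BYTES = 10 * 1024  # files larger than 10KB count as already generated
MAX_RETRIES = 3


def already_generated(path: Path) -> bool:
    """True if the slide image exists and is larger than the size threshold."""
    return path.exists() and path.stat().st_size > MIN_VALID_BYTES


def generate_with_retry(generate: Callable[[], bytes], out_path: Path,
                        base_delay: float = 2.0,
                        sleep: Callable[[float], None] = time.sleep) -> bool:
    """Write the generated image to out_path, retrying with exponential backoff."""
    if already_generated(out_path):
        return True  # skip completed slides so a re-run resumes where it stopped
    for attempt in range(MAX_RETRIES + 1):  # one initial try plus MAX_RETRIES retries
        try:
            out_path.write_bytes(generate())
            return True
        except Exception:
            if attempt == MAX_RETRIES:
                return False
            # Delay doubles each time: 2s, 4s, 8s with the default base_delay.
            sleep(base_delay * 2 ** attempt)
    return False
```

Because completed slides are skipped, re-running after a persistent failure only retries the slides that are still missing.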
Troubleshooting:
- If server disconnection errors occur, script auto-retries
- For persistent failures, re-run the script (it skips completed slides)
- Check API quota if many failures occur
Option 2: Gemini Web Skill
- Consent Check: Read consent file at:
  - Windows: $APPDATA/baoyu-skills/gemini-web/consent.json
  - macOS: ~/Library/Application Support/baoyu-skills/gemini-web/consent.json
  - Linux: ~/.local/share/baoyu-skills/gemini-web/consent.json
- If no consent or version mismatch, display disclaimer and ask:
  ⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official). Risks: May break anytime, no support, possible account risk.
- For each slide, run:
  npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
    --promptfiles prompts/01-slide-cover.md \
    --image 01-slide-cover.png \
    --sessionId slides-{topic-slug}-{timestamp}
  Where GEMINI_WEB_SKILL_DIR = path to baoyu-danger-gemini-web skill directory.
- Proxy support: If user is in restricted network, prepend:
  HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
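The per-platform consent file locations listed above can be resolved as in this sketch. The function name consent_path and its parameters are illustrative; the paths are the ones given in the consent check. The contents of consent.json, including the version field, are not specified here, so the sketch stops at locating the file.

```python
import os
import platform
from pathlib import Path
from typing import Optional


def consent_path(system: Optional[str] = None, home: Optional[Path] = None,
                 appdata: Optional[str] = None) -> Path:
    """Return the consent.json location for the given (or current) platform."""
    system = system or platform.system()  # "Windows", "Darwin", or "Linux"
    home = home or Path.home()
    tail = Path("baoyu-skills") / "gemini-web" / "consent.json"
    if system == "Windows":
        return Path(appdata or os.environ["APPDATA"]) / tail
    if system == "Darwin":
        return home / "Library" / "Application Support" / tail
    # Any other system is treated like Linux (assumption).
    return home / ".local" / "share" / tail


print(consent_path("Linux", Path("/home/user")))
```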
Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)
For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.
Automatic Execution:
- Parse outline to identify slides with Source: extract
- Create figures directory: mkdir -p figures
- For each extract slide, automatically:
  - Read the Figure number, Page, and Caption from metadata
  - Run figure extraction script:
    npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
      --pdf source-paper.pdf \
      --page <page-number> \
      --output figures/figure-<N>.png
  - Run template application script:
    npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
      --figure figures/figure-<N>.png \
      --title "<slide-headline>" \
      --caption "Figure <N>: <caption-text>" \
      --output <NN>-slide-<slug>.png
  - Report: "Extracted: Figure N → slide NN"
- For slides with Source: generate (or no IMAGE_SOURCE):
  - Proceed to Step 6 for AI generation
Note: Source PDF must be saved as source-paper.pdf in output directory.
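For scripted runs, the two commands used per extracted slide can be built as argument lists, as in this sketch. The flags are the ones documented in Step 5.5; the function name extraction_commands and its parameters are hypothetical.

```python
from typing import List


def extraction_commands(skill_dir: str, pdf: str, figure_no: int, page: int,
                        title: str, caption: str, slide_file: str) -> List[List[str]]:
    """Build the extract-figure and apply-template commands for one slide."""
    figure_png = f"figures/figure-{figure_no}.png"
    extract = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/extract-figure.ts",
        "--pdf", pdf,
        "--page", str(page),  # page numbers are 1-indexed
        "--output", figure_png,
    ]
    template = [
        "npx", "-y", "bun", f"{skill_dir}/scripts/apply-template.ts",
        "--figure", figure_png,
        "--title", title,
        "--caption", f"Figure {figure_no}: {caption}",
        "--output", slide_file,
    ]
    return [extract, template]


cmds = extraction_commands("/path/to/skill", "source-paper.pdf", 2, 5,
                           "Method Overview", "Model pipeline", "04-slide-method.png")
for cmd in cmds:
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True). Using argument lists rather than a shell string avoids quoting problems when titles or captions contain spaces or quotes.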
Troubleshooting:
- If figure detection missed a figure: manually add // IMAGE_SOURCE block to outline
- If wrong figure mapped: edit the Figure: and Page: values in outline
- If extraction fails: check PDF page number (1-indexed)
PyMuPDF Fallback for Page Extraction:
If extract-figure.ts fails with "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:
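from pathlib import Path
import fitz  # PyMuPDF

page_num = 3  # 1-indexed page that contains the figure (example value)
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
Path("extracted").mkdir(exist_ok=True)  # make sure the output directory exists
pix.save(f"extracted/page-{page_num}.png")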
Then apply template using apply-template.ts.
Step 6: Generate Images
- Use selected method from Step 5
- Skip slides already processed in Step 5.5 (those with Source: extract)
- Generate session ID: slides-{topic-slug}-{timestamp}
- Generate each remaining slide with same session ID
- Report progress: "Generated X/N"
- Auto-retry once on generation failure
Step 7: Merge to PPTX and PDF
npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>
Step 8: Output Summary
Slide Deck Complete!
Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total
- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover
Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf
Slide Modification
See references/modification-guide.md for:
- Edit single slide workflow
- Add new slide (with renumbering)
- Delete slide (with renumbering)
- File naming conventions
Image Generation Dependencies
Gemini API (Option 1 - Recommended)
Requires:
- GOOGLE_API_KEY or GEMINI_API_KEY environment variable
- Python 3.8+ with pip
- google-genai package (auto-installed by script)
Model: gemini-3-pro-image-preview (default)
Gemini Web Skill (Option 2)
Requires:
- baoyu-danger-gemini-web skill installed at .claude/skills/baoyu-danger-gemini-web
- Google Chrome browser with logged-in Google account
- User consent for reverse-engineered API disclaimer
PDF Figure Extraction
Requires:
- Primary: pdfjs-dist npm package (use legacy build for Node.js)
- Fallback: pymupdf Python package (more reliable for complex PDFs)
- canvas npm package for apply-template.ts
References
| File | Content |
|---|---|
| references/analysis-framework.md | Deep content analysis for presentations |
| references/outline-template.md | Outline structure and STYLE_INSTRUCTIONS format |
| references/modification-guide.md | Edit, add, delete slide workflows |
| references/content-rules.md | Content and style guidelines |
| references/base-prompt.md | Base prompt for image generation |
| references/figure-container-template.md | Visual specs for extracted figure containers |
| references/styles/<style>.md | Full style specifications |
Notes
Image Generation
- Gemini API (Nano Banana Pro): Recommended. Stable, reliable, requires API key
- Gemini Web: No API key needed, but uses reverse-engineered API with account risk
- Generation time: 10-30 seconds per slide
- Auto-retry once on generation failure
- Maintain style consistency via session ID
Content Guidelines
- Use stylized alternatives for sensitive public figures
- Both methods use the same underlying Gemini model for image generation
Extension Support
Custom styles and configurations via EXTEND.md.
Check paths (priority order):
- .paper-skills/paper-slide-deck/EXTEND.md (project)
- ~/.paper-skills/paper-slide-deck/EXTEND.md (user)
If found, load before Step 1. Extension content overrides defaults.
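The priority lookup can be sketched as follows, assuming the project path is resolved relative to the project root and that the first file found wins. The function name find_extension is illustrative.

```python
from pathlib import Path
from typing import Optional

RELATIVE = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"


def find_extension(project_dir: Path, home_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the first EXTEND.md found, checking the project before the user home."""
    home_dir = home_dir or Path.home()
    for base in (project_dir, home_dir):  # priority order: project, then user
        candidate = base / RELATIVE
        if candidate.is_file():
            return candidate
    return None
```

A None result means no extension is present and the defaults described in this document apply.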
Source
https://github.com/Ronitnair/research-skills/blob/main/paper-slide-deck/SKILL.md
Overview
Transforms academic papers and content into professional slide decks as images. It automatically detects figures from PDFs, creates detailed outlines with style instructions, and renders each slide image. Ideal for turning papers into presentations, conference talks, or content-to-slides workflows.
How This Skill Works
Feed paper.pdf or content.md; the tool auto-detects figures with detect-figures.ts and extracts them as needed. It then generates slide images via an AI pipeline (Gemini API) using the selected style, audience, language, and slide count, and can merge results into PPTX or PDF. An outline-only option lets you craft structure first.
When to Use It
- User asks to create slides or a deck from a research paper.
- Prepare a conference or thesis-defense presentation from a paper.
- Convert content.md notes into a styled slide deck (e.g., for executives).
- Produce an outline-first deck to validate structure before full slide generation.
- Generate decks in different languages using --lang (e.g., en, zh).
Quick Start
- Step 1: Run the CLI with a PDF or content file, e.g., /paper-slide-deck path/to/paper.pdf.
- Step 2: Choose --style, --audience, --lang, and optional flags like --slides or --outline-only.
- Step 3: Export the final deck to PPTX or PDF (use merge-to-pptx.ts or merge-to-pdf.ts).
Best Practices
- Choose a matching style from the Style Gallery (e.g., academic-paper for conferences, sketch-notes for tutorials).
- Use --outline-only first to validate structure before image generation.
- Review auto-detected figures and captions; add alt text for accessibility.
- Specify audience (--audience) to tailor tone and depth.
- Limit slides with --slides to control pacing and avoid overload.
Example Use Cases
- Convert a NeurIPS paper into a 12-slide AI conference talk with auto-detected figures.
- Turn a journal article into a 15-slide corporate deck using the corporate style.
- Generate an 8-slide executive briefing from content.md in English or Chinese.
- Create chalkboard-style slides from classroom notes for a tutorial.
- Produce slides in zh for a bilingual research presentation.