What inputs are required?

The paper's Abstract and Methodology text (or the full text), which the skill uses to choose a layout and generate the prompt.

What image generators does this target?

DALL-E 3, Midjourney v6, and Stable Diffusion; the skill outputs a ready-to-use prompt.

paper-visualizer


npx machina-cli add skill WilsonWukz/paper-visualizer-skill/visual-architect --openclaw


Paper Visualizer Skill

Top-tier Scientific Visual Architect. Transforms text into geometric, structural visual instructions.

1. What This Skill Does

Takes research paper content (Methodology/Abstract) and produces a Structured Visual Schema—a high-precision prompt optimized for DALL-E 3, Midjourney v6, or Stable Diffusion.

2. Execution Logic (The Brain)

Phase 1: Layout Pattern Recognition

You must analyze the text and select exactly one of these layout patterns:

Linear Pipeline: Left→Right flow (Data Processing, Encoding-Decoding).
Cyclic/Iterative: Center loop (Optimization, RL, Feedback Loops).
Hierarchical Stack: Vertical stack (Multiscale features, Tree structures).
Parallel Dual-Stream: Parallel rows (Multi-modal fusion, Contrastive Learning).
Central Hub: Core connecting peripherals (Agent-Environment).
Matrix Grid: Rows and columns of cells (Comparison studies, Ablation components).
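
The six patterns above can be written down as a small enumeration. The sketch below is only an illustration: the skill relies on the model's own reading of the text, whereas `guess_layout` and its `CUES` table are hypothetical, a crude keyword count loosely based on the example use cases listed for each pattern. The Linear Pipeline fallback is likewise an assumption.

```python
from enum import Enum


class Layout(Enum):
    LINEAR_PIPELINE = "Linear Pipeline"
    CYCLIC_ITERATIVE = "Cyclic/Iterative"
    HIERARCHICAL_STACK = "Hierarchical Stack"
    PARALLEL_DUAL_STREAM = "Parallel Dual-Stream"
    CENTRAL_HUB = "Central Hub"
    MATRIX_GRID = "Matrix Grid"


# Hypothetical cue words, not part of the skill itself.
CUES = {
    Layout.LINEAR_PIPELINE: ("pipeline", "encoder", "decoder", "preprocess"),
    Layout.CYCLIC_ITERATIVE: ("iterative", "feedback", "reinforcement", "optimization"),
    Layout.HIERARCHICAL_STACK: ("multiscale", "multi-scale", "hierarch", "tree"),
    Layout.PARALLEL_DUAL_STREAM: ("multi-modal", "multimodal", "contrastive", "dual"),
    Layout.CENTRAL_HUB: ("agent", "environment", "hub"),
    Layout.MATRIX_GRID: ("ablation", "comparison", "baseline"),
}


def guess_layout(text: str) -> Layout:
    """Return the layout whose cue words occur most often in the text."""
    lowered = text.lower()
    scores = {
        layout: sum(lowered.count(cue) for cue in cues)
        for layout, cues in CUES.items()
    }
    best = max(scores, key=scores.get)
    # Assumed default when no cue word matches at all.
    return best if scores[best] > 0 else Layout.LINEAR_PIPELINE
```

Returning a single enum member mirrors the rule that exactly one pattern is chosen per diagram.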

Phase 2: Schema Generation Rules

Dynamic Zoning: Define 2-5 physical zones based on layout.
Internal Visualization: Use concrete objects (Icons, Grids, Stacks), NOT abstract concepts.
Explicit Connections: Describe the shape and direction of each flow (e.g., "Curved arrow looping back").
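
One way to picture these three rules is as a small data model with a checker. This is a minimal sketch, not the skill's implementation: the `Zone` and `Connection` classes and the `validate_schema` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    location: str            # e.g. "TOP-LEFT"
    label: str               # zone title shown in the diagram
    container: str           # shape description
    visual_structure: str    # concrete objects, not abstract concepts
    text_labels: list[str] = field(default_factory=list)


@dataclass
class Connection:
    source: str              # label of the zone the arrow leaves
    target: str              # label of the zone the arrow enters
    description: str         # e.g. "Curved arrow looping back"


def validate_schema(zones: list[Zone], connections: list[Connection]) -> list[str]:
    """Return a list of rule violations; an empty list means the schema passes."""
    problems = []
    # Dynamic Zoning: 2-5 zones.
    if not 2 <= len(zones) <= 5:
        problems.append(f"expected 2-5 zones, got {len(zones)}")
    # Internal Visualization: every zone needs a concrete visual description.
    for zone in zones:
        if not zone.visual_structure.strip():
            problems.append(f"zone '{zone.label}' has no visual structure")
    # Explicit Connections: both ends must be known zones, with a description.
    known = {zone.label for zone in zones}
    for conn in connections:
        for end in (conn.source, conn.target):
            if end not in known:
                problems.append(f"connection references unknown zone '{end}'")
        if not conn.description.strip():
            problems.append(f"connection {conn.source} -> {conn.target} has no description")
    return problems
```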

3. Output Format (The Golden Schema)

You MUST respond strictly using this Markdown template. Use the examples in brackets [...] as a guide for the level of detail required, but replace them with your generated content.

---BEGIN PROMPT---

[Style & Meta-Instructions]
High-fidelity scientific schematic, technical vector illustration, clean white background, distinct boundaries, academic textbook style. High resolution 4k, strictly 2D flat design with subtle isometric elements.

**[TEXT RENDERING RULES]**
* **Typography**: Use bold, sans-serif font (e.g., Helvetica/Roboto style) for maximum legibility.
* **Hierarchy**: Prioritize correct spelling for MAIN HEADERS (Zone Titles). For small sub-labels, if space is tight, use numeric annotations (1, 2, 3) or clear abstract lines rather than gibberish text.
* **Contrast**: Text must be dark grey/black on light backgrounds. Avoid overlapping text on complex textures.

[LAYOUT CONFIGURATION]
* **Selected Layout**: [e.g., Cyclic Iterative Process with 3 Nodes]
* **Composition Logic**: [e.g., A central triangular feedback loop surrounded by input/output panels]
* **Color Palette**: [e.g., Professional Pastel (Azure Blue, Slate Grey, Coral Orange, Mint Green)]

[ZONE 1: LOCATION - LABEL]
* **Container**: [Shape description, e.g., Top-Left Rectangular Panel]
* **Visual Structure**: [Concrete objects, e.g., A stack of 3 layered documents with binary code patterns]
* **Key Text Labels**: "[Text 1]"

[ZONE 2: LOCATION - LABEL]
* **Container**: [Shape description, e.g., Central Circular Engine]
* **Visual Structure**: [Concrete objects, e.g., A clockwise loop connecting 3 internal modules: A (Gear), B (Graph), C (Filter)]
* **Key Text Labels**: "[Text 2]", "[Text 3]"

[ZONE 3: LOCATION - LABEL]
... (Add Zone 4 or 5 if necessary based on the selected layout)

[CONNECTIONS]
1. [Connection description, e.g., A curved dotted arrow looping from Zone 2 back to Zone 1 labeled "Feedback"]
2. [Connection description, e.g., A wide flow arrow branching from Zone 2 to Zone 3]

---END PROMPT---
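
To make the template's structure concrete, here is a rough sketch of how such a prompt could be assembled from structured values. The function `render_prompt` and its dictionary keys are assumptions made for this example; the skill itself fills the template through the language model. The text rendering rules block is left out for brevity, and the style line is shortened.

```python
STYLE = (
    "High-fidelity scientific schematic, technical vector illustration, "
    "clean white background, distinct boundaries, academic textbook style."
)


def render_prompt(layout, composition, palette, zones, connections):
    """Assemble a prompt in the template's section order.

    zones: list of dicts with keys location, label, container,
           visual_structure, text_labels (hypothetical key names).
    connections: list of plain-text connection descriptions.
    """
    lines = [
        "---BEGIN PROMPT---",
        "",
        "[Style & Meta-Instructions]",
        STYLE,
        "",
        "[LAYOUT CONFIGURATION]",
        f"* **Selected Layout**: {layout}",
        f"* **Composition Logic**: {composition}",
        f"* **Color Palette**: {palette}",
    ]
    for number, zone in enumerate(zones, start=1):
        quoted = ", ".join(f'"{text}"' for text in zone["text_labels"])
        lines += [
            "",
            f"[ZONE {number}: {zone['location']} - {zone['label']}]",
            f"* **Container**: {zone['container']}",
            f"* **Visual Structure**: {zone['visual_structure']}",
            f"* **Key Text Labels**: {quoted}",
        ]
    lines += ["", "[CONNECTIONS]"]
    lines += [f"{number}. {text}" for number, text in enumerate(connections, start=1)]
    lines += ["", "---END PROMPT---"]
    return "\n".join(lines)
```

Keeping the zone and connection data separate from the rendering step makes it easy to edit one zone and regenerate the whole prompt.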

4. Usage Guide for User

Input: Upload your paper PDF and say:

"Generate a visual schema for this paper's methodology section"

Pro Tips for Best Results:

Text Correction: AI image generators often misspell complex scientific terms. Use the generated image as a base layer, then overlay correct text in PowerPoint/Canva/Illustrator.
Simplification: If the diagram is too cluttered, tell the skill: "Simplify Zone 2 to show only high-level blocks."

Advanced Constraints:

Add --svg to request a Mermaid/SVG code block representation (Experimental).
Add --style "poster" for simplified, bold layouts.
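
The two options above are written like command-line flags but are typed inside a chat message, so the model interprets them. If you wanted to pre-process a message yourself, a minimal sketch could look like the following. `parse_constraints` is a hypothetical helper, and it assumes the style value is always wrapped in double quotes as in the example above.

```python
import re

# Flags must be preceded by whitespace or the start of the message.
SVG_FLAG = r"(?<!\S)--svg(?!\S)"
STYLE_FLAG = r'(?<!\S)--style\s+"([^"]+)"'


def parse_constraints(message: str) -> dict:
    """Split a user message into the plain request and the optional flags."""
    style_match = re.search(STYLE_FLAG, message)
    stripped = re.sub(f"{SVG_FLAG}|{STYLE_FLAG}", "", message)
    return {
        "svg": re.search(SVG_FLAG, message) is not None,
        "style": style_match.group(1) if style_match else None,
        # Collapse the whitespace left behind by the removed flags.
        "request": " ".join(stripped.split()),
    }
```

A regular expression is used instead of a shell-style tokenizer because ordinary prose often contains apostrophes ("this paper's methodology"), which shell-style quoting rules would treat as an unclosed quote.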

5. Technical Limitations

SVG output is an approximation; prioritize the Text Schema for Image Generation models.
Best results come from inputting specific "Methodology" sections rather than full PDFs.
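
To see why a Mermaid rendering is only an approximation, consider what survives the conversion. The sketch below uses a hypothetical helper, `to_mermaid`, which is not part of the skill. It keeps zone titles and labeled arrows but drops containers, internal visuals, and colors, which is the detail the text schema is designed to carry.

```python
def to_mermaid(zones, connections, direction="LR"):
    """Build Mermaid flowchart source from zones and connections.

    zones: dict mapping a node id (e.g. "Z1") to its title.
    connections: list of (source_id, target_id, label) tuples;
                 an empty label produces a plain arrow.
    """
    lines = [f"flowchart {direction}"]
    for node_id, title in zones.items():
        lines.append(f'    {node_id}["{title}"]')
    for source, target, label in connections:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {source} {arrow} {target}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_mermaid(
        {"Z1": "Input", "Z2": "Optimizer"},
        [("Z1", "Z2", ""), ("Z2", "Z1", "Feedback")],
    ))
```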

Source

https://github.com/WilsonWukz/paper-visualizer-skill/blob/main/skills/visual-architect/SKILL.md

Overview

Paper Visualizer converts research papers' Methodology and Abstract into a Structured Visual Schema. It analyzes the paper's logic, selects an optimal layout pattern, and defines 2-5 physical zones with concrete objects, producing detailed prompts for DALL-E 3, Midjourney v6, or Stable Diffusion.

How This Skill Works

First, the tool parses the paper text to pick one of six layout patterns: Linear Pipeline, Cyclic/Iterative, Hierarchical Stack, Parallel Dual-Stream, Central Hub, or Matrix Grid. Then it generates the 2-5 zone schema with internal visuals (icons, grids, stacks) and explicit connections that describe the flow between zones.

When to Use It

You need a visual diagram of a paper's methodology/abstract for a shareable figure or slide.
You want a publication-ready schematic with a chosen layout pattern (linear, hub, cyclic, etc.) to illustrate the study.
You are comparing components (ablations, comparison studies) using a Matrix Grid to show relationships.
You need to illustrate an optimization loop or RL feedback as a Cyclic/Iterative pattern.
You want a detailed, AI image generation prompt ready for DALL-E 3, Midjourney v6, or Stable Diffusion.

Quick Start

Step 1: Paste the paper Abstract and Methodology text into the skill input.
Step 2: Choose a layout pattern (or let the skill auto-detect) for the schematic.
Step 3: Use the generated Golden Schema prompt with DALL-E 3, Midjourney v6, or Stable Diffusion.

Best Practices

Provide the paper's Abstract and Methodology text to maximize structural accuracy.
Prefer a single layout pattern to minimize clutter and improve readability.
Keep the dynamic zoning to 2-5 zones and use concrete objects (icons, grids, stacks) instead of abstract terms.
Add clear labels for zones and explicit connections describing data and flow (e.g., curved arrows, loops).
Review and refine the generated prompt to fix domain-specific terms before rendering.
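
The last practice, checking domain-specific terms before rendering, can be partly automated. As a minimal sketch, the hypothetical helpers below pull every quoted label out of the "Key Text Labels" lines of a generated prompt and list those missing from a glossary you supply. The glossary and the function names are assumptions made for this example.

```python
import re


def extract_text_labels(prompt: str) -> list[str]:
    """Collect the quoted strings from every 'Key Text Labels' line."""
    labels = []
    for line in prompt.splitlines():
        if "Key Text Labels" in line:
            labels += re.findall(r'"([^"]+)"', line)
    return labels


def flag_unknown(labels: list[str], glossary: set[str]) -> list[str]:
    """Return the labels that do not appear in the glossary, in order."""
    return [label for label in labels if label not in glossary]


if __name__ == "__main__":
    sample = '* **Key Text Labels**: "Encoder", "Latnet Space"'
    found = extract_text_labels(sample)
    print(flag_unknown(found, {"Encoder", "Latent Space"}))
```

Catching a misspelled label in the prompt is cheaper than correcting it in the rendered image afterwards.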

Example Use Cases

Visualize an ML methodology from a CV/NLP paper as a Linear Pipeline diagram.
Depict a reinforcement learning loop with a Cyclic/Iterative pattern.
Create a matrix grid to compare ablation studies in a vision paper.
Illustrate a multi-modal fusion schematic with Parallel Dual-Stream layout.
Represent a hierarchical multiscale feature architecture as a Hierarchical Stack diagram.
