How strict are the 6 gates?

All gates must pass. If any gate fails, iterate with principle-based feedback up to three times.

Do I need to know code to use this?

No heavy coding is required. A guided sequence walks you through the browser-use workflow and the Gemini session.

Image Generator

npx machina-cli add skill aiskillstore/marketplace/image-generator --openclaw

Generate professional teaching visuals using Gemini 3 with multi-turn reasoning partnership.

Quick Start

# 1. Start browser (via browser-use skill)
bash .claude/skills/browser-use/scripts/start-server.sh

# 2. Navigate to Gemini
# Use browser_navigate to https://gemini.google.com/

# 3. Generate image from creative brief
# Paste creative brief → Wait 30-35s → Verify 6 gates → Download

Core Principles

Reasoning over prediction - Creative briefs (Story/Intent/Metaphor) activate reasoning; pixel specs don't
Multi-turn partnership - Teach Gemini your standards through principle-based feedback
6-gate quality - Explicit pass/fail before download
Autonomous batch - No permission-asking between visuals

Input: Creative Brief Format

Receive from visual-asset-workflow:

## The Story
[Narrative about what's visualized]

## Emotional Intent
[What it should FEEL like]

## Visual Metaphor
[Universal concept for instant comprehension]

## Subject / Composition / Action / Location / Style
[Gemini 3 prompt structure]

## Color Semantics
Blue (#2563eb) = Authority | Green (#10b981) = Execution

## Typography Hierarchy
Largest: Key insight | Medium: Supporting | Smallest: Context

Do NOT convert to pixel specs - use as-is to activate reasoning.
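The brief format above can be held in a small structure and rendered back to text unchanged before pasting. This is a minimal sketch: the `CreativeBrief` class, the `HEADINGS` mapping, and both helper functions are hypothetical names introduced for illustration and are not part of the skill.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from field name to the section heading used in the brief.
HEADINGS = {
    "story": "The Story",
    "emotional_intent": "Emotional Intent",
    "visual_metaphor": "Visual Metaphor",
    "prompt_structure": "Subject / Composition / Action / Location / Style",
    "color_semantics": "Color Semantics",
    "typography_hierarchy": "Typography Hierarchy",
}


@dataclass
class CreativeBrief:
    """Hypothetical container with one field per section of the brief."""
    story: str
    emotional_intent: str
    visual_metaphor: str
    prompt_structure: str
    color_semantics: str
    typography_hierarchy: str


def missing_sections(brief: CreativeBrief) -> list[str]:
    """Return the headings of any sections that are empty or whitespace-only."""
    return [HEADINGS[f.name] for f in fields(brief)
            if not getattr(brief, f.name).strip()]


def render(brief: CreativeBrief) -> str:
    """Render the brief as text, section by section, without rewriting it."""
    missing = missing_sections(brief)
    if missing:
        raise ValueError(f"Brief is missing sections: {', '.join(missing)}")
    return "\n\n".join(
        f"## {HEADINGS[f.name]}\n{getattr(brief, f.name).strip()}"
        for f in fields(brief)
    )
```

The renderer only joins the sections; it deliberately does no translation into sizes or coordinates, in keeping with the "use as-is" rule.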

Workflow (Per Visual)

Step	Action	Tool
1	Navigate to gemini.google.com	browser_navigate
2	Select "🍌 Create Image"	browser_click
3	Paste creative brief	browser_type
4	Wait 30-35 seconds	browser_wait_for
5	Verify 6 gates (below)	Visual inspection
6	If fail: Iterate with feedback (max 3)	browser_type
7	If pass: Download full size	browser_click
8	Copy to `apps/learn-app/static/img/part-{N}/chapter-{NN}/`	Bash
9	Embed in lesson immediately	Edit
10	NEW CHAT for next visual	browser_navigate
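Step 8 copies the download into a part/chapter directory. A minimal sketch of building that path follows, assuming `{N}` is an unpadded part number and `{NN}` is a chapter number zero-padded to two digits; the function name `image_destination` is hypothetical.

```python
from pathlib import PurePosixPath

# Base directory taken from step 8 of the workflow table.
IMAGE_ROOT = PurePosixPath("apps/learn-app/static/img")


def image_destination(part: int, chapter: int, filename: str) -> PurePosixPath:
    """Build the target path for a downloaded image.

    Assumption: {N} is rendered unpadded and {NN} is zero-padded to two digits.
    """
    if part < 1 or chapter < 1:
        raise ValueError("part and chapter must be positive")
    return IMAGE_ROOT / f"part-{part}" / f"chapter-{chapter:02d}" / filename
```

For example, `image_destination(2, 5, "agent-layers.png")` yields `apps/learn-app/static/img/part-2/chapter-05/agent-layers.png`.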

Quality Gates (ALL Must Pass)

Gate	Criterion	Fail Action
1. Spelling	99% accuracy, including proper nouns and technical terms (e.g. Y-Combinator, Kubernetes)	Iterate
2. Layout	Proportions match prompt (2×2 not 3×1)	Iterate
3. Color	Brand colors match (#2563eb not #002050)	Iterate
4. Typography	Largest = key concept (not decoration)	Iterate
5. Teaching	<5 sec concept grasp at target proficiency	Iterate
6. Uniqueness	Not duplicate of existing chapter image	New chat

Decision: ALL pass → Download | ANY fail → Iterate (max 3 tries)
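The decision rule can be sketched as a small function. The gate names come from the table above; the `Action` enum, the `decide` function, and the choice to defer once the iteration budget is spent are assumptions made for illustration.

```python
from enum import Enum

GATES = ("Spelling", "Layout", "Color", "Typography", "Teaching", "Uniqueness")
MAX_ITERATIONS = 3  # "max 3" from the decision rule


class Action(Enum):
    DOWNLOAD = "download"
    ITERATE = "iterate"
    NEW_CHAT = "new chat"
    DEFER = "defer"  # assumption: give up once the iteration budget is spent


def decide(results: dict[str, bool], iterations_used: int) -> Action:
    """Map six pass/fail results to the next action."""
    missing = [gate for gate in GATES if gate not in results]
    if missing:
        raise ValueError(f"Unverified gates: {', '.join(missing)}")
    failed = [gate for gate in GATES if not results[gate]]
    if not failed:
        return Action.DOWNLOAD
    if iterations_used >= MAX_ITERATIONS:
        return Action.DEFER
    # The table lists "New chat" as the fail action for the Uniqueness gate.
    if "Uniqueness" in failed:
        return Action.NEW_CHAT
    return Action.ITERATE
```

Requiring a result for every gate makes it impossible to download after checking only some of them.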

Iteration: Principle-Based Feedback

When a gate fails, provide teaching feedback:

Gate 4 FAILED: Typography hierarchy incorrect

The largest text is "$100K" (supporting detail) but should be "$3T"
(key insight students must grasp).

Increase '$3T' to dominant size. Reduce '$100K' to supporting size.
Information importance drives sizing.
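The feedback above follows a repeatable shape: the failed gate, what was observed versus expected, the concrete change, and the principle behind it. A minimal sketch of a formatter for that shape is below; `principle_feedback` and its parameter names are hypothetical.

```python
def principle_feedback(gate_number: int, problem: str, explanation: str,
                       instruction: str, principle: str) -> str:
    """Format gate feedback that states the principle, not just the fix."""
    if not 1 <= gate_number <= 6:
        raise ValueError("gate_number must be between 1 and 6")
    if not principle.strip():
        raise ValueError("feedback must name the underlying principle")
    return (
        f"Gate {gate_number} FAILED: {problem}\n\n"
        f"{explanation}\n\n"
        f"{instruction}\n"
        f"{principle}"
    )


message = principle_feedback(
    4,
    "Typography hierarchy incorrect",
    'The largest text is "$100K" (supporting detail) but should be "$3T" '
    "(key insight students must grasp).",
    "Increase '$3T' to dominant size. Reduce '$100K' to supporting size.",
    "Information importance drives sizing.",
)
```

Rejecting an empty principle keeps the feedback from collapsing into a bare instruction.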

Batch Mode

When invoked with "generate all visuals":

For EACH visual in list:
  A. NEW CHAT (context isolation)
  B. Generate (paste brief)
  C. Verify 6 gates
  D. Iterate if needed (max 3)
  E. Download when pass
  F. Embed in lesson
  G. Log "✅ N/M"
  H. NEXT (no stopping)

Never ask: "Continue?" "Pause here?" "Review?"

Report at END only:

BATCH COMPLETE
✅ Generated: 16/18
⚠️ Deferred: 2 (quality issues)
Location: apps/learn-app/static/img/part-{N}/
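The batch loop and end-of-run report can be sketched as follows. The `attempt` callable is a hypothetical stand-in for "generate in a new chat, then verify the six gates", and the sketch assumes one initial generation followed by up to three feedback iterations.

```python
from typing import Callable

MAX_ITERATIONS = 3  # feedback rounds allowed after the initial generation (assumption)


def run_batch(visuals: list[str], attempt: Callable[[str, int], bool]) -> str:
    """Process every visual without pausing and return one final report.

    attempt(visual, iteration) is a hypothetical callable that returns True
    when all six gates pass; iteration 0 is the initial generation.
    """
    total = len(visuals)
    generated: list[str] = []
    deferred: list[str] = []
    for index, visual in enumerate(visuals, start=1):
        passed = any(attempt(visual, i) for i in range(MAX_ITERATIONS + 1))
        if passed:
            generated.append(visual)
            print(f"✅ {index}/{total}")
        else:
            deferred.append(visual)
            print(f"⚠️ {index}/{total} deferred")
        # No prompt here: the loop moves straight on to the next visual.
    lines = ["BATCH COMPLETE", f"✅ Generated: {len(generated)}/{total}"]
    if deferred:
        lines.append(f"⚠️ Deferred: {len(deferred)} (quality issues)")
    return "\n".join(lines)
```

Because `any` stops at the first passing attempt, a visual that passes on its initial generation costs a single call.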

Proficiency Limits

Level	Max Elements	Grasp Time
A2	5-7	<5 sec
B1	7-10	<10 sec
C2	No limit	N/A
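These limits lend themselves to a simple lookup. The sketch below assumes the upper end of each range is the hard cap (7 for A2, 10 for B1); `ELEMENT_CAPS` and `within_limit` are hypothetical names.

```python
from typing import Optional

# Upper bound of each "Max Elements" range; None means no limit.
ELEMENT_CAPS: dict[str, Optional[int]] = {"A2": 7, "B1": 10, "C2": None}


def within_limit(level: str, element_count: int) -> bool:
    """Check whether a visual's element count fits the proficiency level."""
    if level not in ELEMENT_CAPS:
        raise ValueError(f"Unknown proficiency level: {level}")
    cap = ELEMENT_CAPS[level]
    return cap is None or element_count <= cap
```

A check like this could run before generation, so that an overloaded brief is trimmed rather than failing the Teaching gate afterwards.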

Token Conservation (Batch Mode)

For >8 visuals, condense briefs:

Original (250 tokens):

"Top Layer shows Coordinator at center top with label 'Orchestrator'
featuring conductor icon, with role 'Strategic oversight'..."

Condensed (80 tokens):

"Top Layer - Coordinator: Center top, 'Orchestrator' (conductor),
Role: 'Strategic oversight', Gold (#fbbf24), Large hexagon."

Keep: Story, Intent, Metaphor, Colors, Reasoning
Condense: Long examples → Short labels

Anti-Patterns

Don't	Why
Accept first output without 6 gates	Quality standard violation
Ask permission between batch items	Breaks autonomous agency
Convert briefs to pixel specs	Defeats reasoning activation
Skip embedding step	Creates orphan images
Reuse same chat for next visual	Context contamination

Session Interruption

If the session ends mid-batch, create a checkpoint:

# Checkpoint: Part {N}
Status: INTERRUPTED at 8/18

## Completed:
- ✅ Image 1: filename (embedded lesson-01.md)
- ✅ Image 2: filename (embedded lesson-02.md)

## Remaining:
- ⏳ Image 8: filename

On continuation: Read checkpoint → Resume → Update incrementally
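A checkpoint in the format above can be produced from two lists. This is a sketch under assumptions: `render_checkpoint` is a hypothetical helper, and the "at X/Y" position is taken to be the number of the first image still remaining.

```python
def render_checkpoint(part: int, completed: list[tuple[str, str]],
                      remaining: list[str]) -> str:
    """Render a checkpoint file body.

    completed holds (filename, lesson_file) pairs; remaining holds filenames.
    Assumption: the interruption point is the first remaining image.
    """
    total = len(completed) + len(remaining)
    position = len(completed) + 1
    lines = [
        f"# Checkpoint: Part {part}",
        f"Status: INTERRUPTED at {position}/{total}",
        "",
        "## Completed:",
    ]
    for number, (filename, lesson) in enumerate(completed, start=1):
        lines.append(f"- ✅ Image {number}: {filename} (embedded {lesson})")
    lines += ["", "## Remaining:"]
    for number, filename in enumerate(remaining, start=position):
        lines.append(f"- ⏳ Image {number}: {filename}")
    return "\n".join(lines)
```

Rewriting the file after each finished image keeps the "update incrementally" step cheap, since the whole body is regenerated from the two lists.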

Success Indicators

✅ All 6 gates verified before download
✅ Batch completion without permission-asking
✅ Principle-based iteration feedback
✅ Images organized by part/chapter
✅ Immediate embedding (no orphans)
✅ >85% production-ready rate

Source

https://github.com/aiskillstore/marketplace/blob/main/skills/92bilal26/image-generator/SKILL.md

Overview

Generates professional teaching visuals using Gemini 3 via a browser-automation workflow. It targets chapter illustrations, diagrams, and teaching visuals, not stock photos or decorative images, and enforces six quality gates to guarantee consistency and instructional clarity.

How This Skill Works

The skill runs a browser-use session against Gemini and submits a structured creative brief (Story, Emotional Intent, Visual Metaphor, Subject/Composition/Action/Location/Style, Color Semantics, Typography Hierarchy) as-is. It validates each generated image against six gates, iterating with principle-based feedback up to three times, then downloads the final asset and copies it into the part/chapter image directory for immediate embedding in the lesson. Throughout, it relies on a multi-turn reasoning partnership with Gemini 3.

When to Use It

When creating chapter illustrations to accompany a lesson.
When designing diagrams or process visuals needing a clear visual metaphor.
When you require visuals that adhere to branded color semantics and typography hierarchy.
When generating visuals in batch for a multi-chapter course, with gate-driven QA.
When you need visuals for teaching, not stock photos or decorative imagery.

Quick Start

Step 1: Start browser (via browser-use skill) using: bash .claude/skills/browser-use/scripts/start-server.sh
Step 2: Navigate to Gemini: browser_navigate to https://gemini.google.com/
Step 3: Generate image from creative brief: Paste creative brief → Wait 30-35s → Verify 6 gates → Download

Best Practices

Fill out the Creative Brief in full: The Story, Emotional Intent, Visual Metaphor, Subject/Composition/Action/Location/Style, Color Semantics, and Typography Hierarchy.
Rely on color semantics: Blue for Authority, Green for Execution.
Run all six quality gates and fix any gate failures via principle-based feedback (up to 3 iterations).
Do NOT convert to pixel specs — use the creative brief as-is to activate reasoning.
After download, copy to apps/learn-app/static/img/part-{N}/chapter-{NN}/ and embed immediately.

Example Use Cases

Illustration for a chapter on Gemini 3 capabilities and reasoning-based prompts.
Diagram showing data flow for a machine learning lesson.
Visual metaphor image illustrating the concept of Authority vs Execution with brand colors.
Step-by-step process image for a teaching module requiring clear sequencing.
Batch-generated visuals set for a 12-week course syllabus.

Frequently Asked Questions
