What is the difference between Gemini conversational prompts and Imagen direct prompts?

Gemini prompts are tailored for natural-language, interactive guidance, while Imagen prompts are direct, generation-focused instructions. This skill outputs both styles from a single brief, ensuring compatibility with each workflow.

Can I use this if I’m not targeting Google tools?

The workflow is designed for Gemini and Imagen within Google generation stacks, and its trigger rules exclude requests that target only non-Google tools. If you're not targeting Google tools, the prompts may be less aligned with your pipeline.

How are risks and constraints handled?

The workflow includes coherence and policy-safety validation and explicitly returns assumptions, risks, and next actions to guide safe, reliable visual generation.

gemini-visual-director

npx machina-cli add skill cloudaipro/openclaw-agent-skills/gemini-visual-director --openclaw

Files (1)

SKILL.md

2.0 KB

Gemini Visual Director

Objective

Generate Gemini- and Imagen-optimized prompts with high semantic clarity and strong cinematic detail.

Trigger Rules

Use when Google ecosystem targeting is explicit.

Positive cues:

Gemini prompt optimization.
Imagen image or video prompt drafting.
Story-rich visual prompts for Google generation stacks.

Do not use when:

The user targets non-Google tools only.
The request is pure film treatment planning.
The request is market strategy instead of visual generation.
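
The trigger rules above can be approximated with a simple keyword check. This is a minimal sketch, not the skill's actual matching logic, which SKILL.md does not show: the cue lists and the function name `should_trigger` are assumptions made for illustration.

```python
import re

# Hypothetical cue lists; the skill's real matching logic is not shown in SKILL.md.
GOOGLE_CUES = ("gemini", "imagen")
EXCLUDED_TOPICS = ("film treatment", "market strategy")


def should_trigger(request: str) -> bool:
    """Return True when the request names a Google target and is not an excluded topic."""
    text = request.lower()
    names_google_tool = any(re.search(rf"\b{cue}\b", text) for cue in GOOGLE_CUES)
    is_excluded = any(topic in text for topic in EXCLUDED_TOPICS)
    return names_google_tool and not is_excluded


print(should_trigger("Draft an Imagen prompt for a rainy alley"))     # True
print(should_trigger("Write a Midjourney prompt for a rainy alley"))  # False
print(should_trigger("Plan a market strategy around Gemini"))         # False
```

A request that never mentions a Google tool fails the first condition, which covers the "non-Google tools only" case without a separate list of competing products.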

Inputs

Required:

Concept statement.
Subject and environment.

Optional:

Style references.
Camera and lighting preferences.
Output format constraints.
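
One way to picture the required and optional inputs is as a small data structure. The sketch below is illustrative only: the class name `VisualBrief`, its field names, and the `missing_required` helper are assumptions, not part of the skill.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VisualBrief:
    """Hypothetical container for the skill's inputs (names are illustrative)."""

    # Required inputs
    concept: str
    subject: str
    environment: str
    # Optional inputs
    style_references: List[str] = field(default_factory=list)
    camera: Optional[str] = None
    lighting: Optional[str] = None
    output_format: Optional[str] = None

    def missing_required(self) -> List[str]:
        """Return the names of required fields that are empty or whitespace-only."""
        required = {
            "concept": self.concept,
            "subject": self.subject,
            "environment": self.environment,
        }
        return [name for name, value in required.items() if not value.strip()]


brief = VisualBrief(concept="Cyberpunk city at dusk", subject="a rogue hacker", environment="")
print(brief.missing_required())  # ['environment']
```

Checking for missing required fields before drafting any prompt gives the "clarify" step of the workflow something concrete to ask about.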

Output Schema

Return the universal envelope from ../shared/references/output-schemas.md.

Artifacts order:

Narrative visual brief.
Gemini conversational prompt.
Imagen-ready direct prompt.
Prompt tuning variants.
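
The artifact ordering can be checked mechanically. The universal envelope is defined in ../shared/references/output-schemas.md, which is not reproduced here, so the `artifacts` and `kind` keys and the kind identifiers below are hypothetical; only the order itself comes from the list above.

```python
from typing import Dict, List

# Order taken from the "Artifacts order" list; the identifiers are hypothetical.
EXPECTED_ORDER = [
    "narrative_visual_brief",
    "gemini_conversational_prompt",
    "imagen_direct_prompt",
    "prompt_tuning_variants",
]


def artifacts_in_order(artifacts: List[Dict[str, str]]) -> bool:
    """Check that the artifact kinds appear exactly in the expected order."""
    return [artifact.get("kind") for artifact in artifacts] == EXPECTED_ORDER


# Assumed envelope shape, for illustration only.
envelope = {
    "artifacts": [{"kind": kind, "content": "..."} for kind in EXPECTED_ORDER],
}
print(artifacts_in_order(envelope["artifacts"]))                   # True
print(artifacts_in_order(list(reversed(envelope["artifacts"]))))   # False
```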

Workflow

Clarify subject, scene, and intent.
Enrich with spatial and sensory detail.
Separate Gemini conversational and Imagen direct prompt styles.
Add composition, lens, light, and material cues.
Validate coherence and policy safety.
Return assumptions, risks, and next actions.
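
The step that separates the two prompt styles might look like the sketch below. The templates are one interpretation of "conversational" versus "direct", and the function names and dictionary keys are hypothetical; the skill's real output is not shown in the source.

```python
from typing import Dict


def gemini_conversational(brief: Dict[str, str]) -> str:
    """Phrase the brief as a natural-language request that invites follow-up."""
    return (
        f"I'd like an image of {brief['subject']} in {brief['environment']}. "
        f"Please aim for a {brief['style']} look, framed with a {brief['lens']} "
        f"and lit with {brief['light']}. Ask me if anything is unclear."
    )


def imagen_direct(brief: Dict[str, str]) -> str:
    """Compress the same brief into a comma-separated descriptor string."""
    keys = ("subject", "environment", "style", "lens", "light")
    return ", ".join(brief[key] for key in keys)


brief = {
    "subject": "a rogue hacker",
    "environment": "a neon-lit, rain-soaked alley",
    "style": "Blade Runner-inspired",
    "lens": "35mm lens",
    "light": "volumetric neon lighting",
}
print(gemini_conversational(brief))
print(imagen_direct(brief))
```

Both functions read from the same dictionary, so the two prompts cannot drift apart on subject, lens, or lighting even though their phrasing differs.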

Quality Bar

Keep subject and action unambiguous.
Keep composition and lighting explicit.
Keep style cues compatible and non-conflicting.
Keep variants purposefully different.
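
"Purposefully different" is a judgment call, but near-duplicate variants can at least be flagged automatically. The sketch below uses the standard library's `difflib.SequenceMatcher`; the 0.9 threshold and the function name are assumptions, and textual similarity is only a rough proxy for creative difference.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple

MAX_SIMILARITY = 0.9  # hypothetical threshold; tune it for your own prompts


def near_duplicate_variants(
    variants: List[str], max_similarity: float = MAX_SIMILARITY
) -> List[Tuple[int, int]]:
    """Return index pairs of variants whose text similarity exceeds the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(variants), 2):
        if SequenceMatcher(None, a, b).ratio() > max_similarity:
            pairs.append((i, j))
    return pairs


variants = [
    "rogue hacker, neon alley, 35mm lens, volumetric light",
    "rogue hacker, neon alley, 35mm lens, volumetric light",
    "rogue hacker, rooftop at dawn, 85mm lens, soft rim light",
]
print(near_duplicate_variants(variants))  # [(0, 1)]
```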

Safety Rules

Refuse harmful visual manipulation requests.
Avoid private-data reconstruction instructions.
Mark uncertainty and constraints clearly.

Resources

Domain framework: references/domain.md
Envelope validator: scripts/validate_output.py
Creative artifact validator: scripts/validate_creative_artifacts.py

Source

git clone https://github.com/cloudaipro/openclaw-agent-skills
SKILL.md: https://github.com/cloudaipro/openclaw-agent-skills/blob/main/skills/gemini-visual-director/SKILL.md

Overview

Gemini Visual Director converts rough concepts into narrative-rich prompts for Google generation workflows. It delivers semantically precise visual direction with cinematic detail, written in the separate prompt styles that Gemini and Imagen each expect.

How This Skill Works

The workflow starts by clarifying the subject, scene, and intent; it enriches the brief with spatial and sensory details; then it separates two prompt styles (Gemini conversational vs Imagen direct) and adds composition, lens, lighting, and material cues. It concludes with a coherence and safety check and returns assumptions, risks, and next actions.

When to Use It

You explicitly target Gemini or Imagen in a Google generation workflow and need highly precise visual direction.
You have rough concepts and require narrative-rich prompts with explicit composition, lighting, and camera cues.
You need both Gemini conversational prompts and Imagen-ready direct prompts derived from a single brief.
You want prompts aligned with Google generation stack constraints, including output formats and safety checks.
You want prompt tuning variants and a clear list of assumptions, risks, and next actions.

Quick Start

Step 1: Clarify subject, scene, and intent.
Step 2: Enrich with spatial and sensory detail; set style references and camera/light cues.
Step 3: Generate Gemini conversational prompt and Imagen-ready direct prompt, add variants, then run safety and coherence checks.

Best Practices

Define subject, scene, and intent up front to anchor the brief.
Specify style references, camera/lens, lighting, and material cues explicitly.
Keep Gemini and Imagen prompts separate and coherent to avoid conflicts.
Iterate with prompt tuning variants to explore different interpretations.
Perform a safety and coherence check and document assumptions, risks, and next actions.

Example Use Cases

Cyberpunk city at dusk: a rogue hacker in a neon-lit, rain-soaked alley, Blade Runner-inspired style, 35mm lens, volumetric neon lighting.
Product launch stills: futuristic gadget in a clean studio, high-contrast macro look, 100mm macro, key lights for crisp reflections.
Historical exploration scene: a lone explorer in misty ancient ruins, chiaroscuro lighting, 85mm lens, soft rim light.
Brand storytelling visuals: environmental sustainability ad, natural light, wide-angle environment, documentary realism.
Fantasy city concept art: floating islands, painterly style, 24mm wide lens, golden hour glow.

Frequently Asked Questions
