
gemini-visual-director

npx machina-cli add skill cloudaipro/openclaw-agent-skills/gemini-visual-director --openclaw

Gemini Visual Director

Objective

Generate Gemini and Imagen-optimized prompts with high semantic clarity and strong cinematic detail.

Trigger Rules

Use when Google ecosystem targeting is explicit.

Positive cues:

  • Gemini prompt optimization.
  • Imagen image or video prompt drafting.
  • Story-rich visual prompts for Google generation stacks.

Do not use when:

  • The user targets non-Google tools only.
  • The request is pure film treatment planning.
  • The request is market strategy instead of visual generation.
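The trigger rules above could be approximated with a simple keyword check. This is a minimal sketch, not the skill's actual routing logic: the function name `should_trigger` and both cue lists are assumptions made for illustration, and a real router would likely need more than substring matching.

```python
# Hypothetical cue lists; the skill itself does not define them.
GOOGLE_CUES = ("gemini", "imagen")
EXCLUDED_CUES = ("film treatment", "market strategy")


def should_trigger(request: str) -> bool:
    """Return True when the request explicitly targets the Google stack."""
    text = request.lower()
    # Out-of-scope topics win, even if a Google tool is mentioned.
    if any(cue in text for cue in EXCLUDED_CUES):
        return False
    # Require an explicit Gemini or Imagen mention.
    return any(cue in text for cue in GOOGLE_CUES)
```

Checking the exclusions first mirrors the "Do not use when" list: a request about market strategy stays out of scope even when it names Gemini.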

Inputs

Required:

  • Concept statement.
  • Subject and environment.

Optional:

  • Style references.
  • Camera and lighting preferences.
  • Output format constraints.
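One way to picture the required and optional inputs is as a small data class. This is only a sketch under assumptions: the class name `VisualBrief`, its field names, and the `missing_required` helper are hypothetical and do not come from the skill.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VisualBrief:
    # Required inputs
    concept: str
    subject: str
    environment: str
    # Optional inputs
    style_references: list[str] = field(default_factory=list)
    camera: Optional[str] = None
    lighting: Optional[str] = None
    output_format: Optional[str] = None

    def missing_required(self) -> list[str]:
        """Names of required fields that are empty or whitespace-only."""
        required = {
            "concept": self.concept,
            "subject": self.subject,
            "environment": self.environment,
        }
        return [name for name, value in required.items() if not value.strip()]
```

A caller could ask the user for anything listed by `missing_required()` before drafting prompts.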

Output Schema

Return the universal envelope from ../shared/references/output-schemas.md.

Artifacts order:

  1. Narrative visual brief.
  2. Gemini conversational prompt.
  3. Imagen-ready direct prompt.
  4. Prompt tuning variants.
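The artifact order above could be checked mechanically. The sketch below is illustrative only: the envelope schema in output-schemas.md is not shown here, so the artifact keys and the `"type"` field are assumed names, and this is not the logic of the bundled validators.

```python
# Hypothetical artifact keys; the real names are defined in output-schemas.md.
EXPECTED_ORDER = [
    "narrative_visual_brief",
    "gemini_conversational_prompt",
    "imagen_direct_prompt",
    "prompt_tuning_variants",
]


def artifacts_in_order(artifacts: list[dict]) -> bool:
    """True when the artifacts' "type" values match the expected order exactly."""
    return [artifact.get("type") for artifact in artifacts] == EXPECTED_ORDER
```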

Workflow

  1. Clarify subject, scene, and intent.
  2. Enrich with spatial and sensory detail.
  3. Separate Gemini conversational and Imagen direct prompt styles.
  4. Add composition, lens, light, and material cues.
  5. Validate coherence and policy safety.
  6. Return assumptions, risks, and next actions.
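Step 3 of the workflow, separating the two prompt styles, might look like the following. This is a sketch under assumptions: both function names and the exact phrasing templates are hypothetical, and the skill does not prescribe a particular wording for either style.

```python
def imagen_direct_prompt(subject: str, environment: str, *cues: str) -> str:
    """Compact, comma-separated prompt: subject first, then scene and cues."""
    parts = [subject, environment, *cues]
    return ", ".join(part.strip() for part in parts if part.strip())


def gemini_conversational_prompt(subject: str, environment: str, *cues: str) -> str:
    """Full-sentence request carrying the same information."""
    request = f"Create an image of {subject.strip()} in {environment.strip()}."
    details = [cue.strip() for cue in cues if cue.strip()]
    if details:
        request += " Please use these visual cues: " + "; ".join(details) + "."
    return request


if __name__ == "__main__":
    cues = ("Blade Runner-inspired style", "35mm lens", "volumetric neon lighting")
    print(imagen_direct_prompt("a rogue hacker", "a neon-lit rain-soaked alley", *cues))
    print(gemini_conversational_prompt("a rogue hacker", "a neon-lit rain-soaked alley", *cues))
```

Building both prompts from the same arguments keeps the two artifacts consistent with each other while letting each follow its own style.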

Quality Bar

  • Keep subject and action unambiguous.
  • Keep composition and lighting explicit.
  • Keep style cues compatible and non-conflicting.
  • Keep variants purposefully different.
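The last point, keeping variants purposefully different, can be roughly screened with a text-similarity check. This is a crude proxy, not the skill's own validation: the function name `too_similar_pairs` and the 0.9 threshold are assumptions, and surface similarity says nothing about whether two variants differ in intent.

```python
from difflib import SequenceMatcher
from itertools import combinations


def too_similar_pairs(variants: list[str], threshold: float = 0.9) -> list[tuple[int, int]]:
    """Index pairs of variants whose character-level similarity meets the threshold."""
    pairs = []
    for (i, first), (j, second) in combinations(enumerate(variants), 2):
        if SequenceMatcher(None, first, second).ratio() >= threshold:
            pairs.append((i, j))
    return pairs
```

Any pair this flags is a candidate for rewriting so that each variant changes something deliberate, such as lens, lighting, or style.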

Safety Rules

  • Refuse harmful visual manipulation requests.
  • Avoid private-data reconstruction instructions.
  • Mark uncertainty and constraints clearly.

Resources

  • Domain framework: references/domain.md
  • Envelope validator: scripts/validate_output.py
  • Creative artifact validator: scripts/validate_creative_artifacts.py

Source

git clone https://github.com/cloudaipro/openclaw-agent-skills
Skill file: https://github.com/cloudaipro/openclaw-agent-skills/blob/main/skills/gemini-visual-director/SKILL.md

Overview

Gemini Visual Director converts rough concepts into narrative-rich prompts optimized for Google generation workflows. It produces semantically precise visual direction with cinematic detail, written for Gemini and Imagen.

How This Skill Works

The workflow starts by clarifying the subject, scene, and intent; it enriches the brief with spatial and sensory details; then it separates two prompt styles (Gemini conversational vs Imagen direct) and adds composition, lens, lighting, and material cues. It concludes with a coherence and safety check and returns assumptions, risks, and next actions.

When to Use It

  • You explicitly target Gemini or Imagen in a Google generation workflow and need highly precise visual direction.
  • You have rough concepts and require narrative-rich prompts with explicit composition, lighting, and camera cues.
  • You need both Gemini conversational prompts and Imagen-ready direct prompts derived from a single brief.
  • You want prompts aligned with Google generation stack constraints, including output formats and safety checks.
  • You want prompt tuning variants and a clear list of assumptions, risks, and next actions.

Quick Start

  1. Clarify subject, scene, and intent.
  2. Enrich with spatial and sensory detail; set style references and camera/light cues.
  3. Generate the Gemini conversational prompt and the Imagen-ready direct prompt, add variants, then run safety and coherence checks.

Best Practices

  • Define subject, scene, and intent up front to anchor the brief.
  • Specify style references, camera/lens, lighting, and material cues explicitly.
  • Keep Gemini and Imagen prompts separate and coherent to avoid conflicts.
  • Iterate with prompt tuning variants to explore different interpretations.
  • Perform a safety and coherence check and document assumptions, risks, and next actions.

Example Use Cases

  • Cyberpunk city at dusk: subject a rogue hacker, environment neon-lit rain-soaked alley, style Blade Runner-inspired, 35mm lens, volumetric neon lighting.
  • Product launch stills: futuristic gadget in a clean studio, high-contrast macro look, 100mm macro, key lights for crisp reflections.
  • Historical exploration scene: a lone explorer in misty ancient ruins, chiaroscuro lighting, 85mm lens, soft rim light.
  • Brand storytelling visuals: environmental sustainability ad, natural light, wide-angle environment, documentary realism.
  • Fantasy city concept art: floating islands, painterly style, 24mm wide lens, golden hour glow.
