
gemini-image

npx machina-cli add skill akrindev/google-studio-skills/gemini-image --openclaw

Gemini Image Generation

Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.

When to Use This Skill

Use this skill when you need to:

  • Create visual content from text descriptions
  • Generate multiple image variations
  • Create images at specific resolutions (1K, 2K, 4K)
  • Produce images for different aspect ratios (social media, banners, etc.)
  • Generate photorealistic images or artistic visuals
  • Create images with person generation controls
  • Batch generate multiple images at once
  • Combine with text generation for complete content creation

Available Scripts

scripts/generate_image.py

Purpose: Generate images using Gemini image models (Gemini 3 Pro Image, Gemini 2.5 Flash Image) or Imagen 4

When to use:

  • Any image generation task
  • Multiple image generation (1-4 per request)
  • Custom resolution and aspect ratio needs
  • Professional asset creation
  • Photorealistic or artistic image generation

Key parameters:

| Parameter | Description | Example |
| --- | --- | --- |
| prompt | Text description (required) | "A futuristic city at sunset" |
| --model, -m | Model to use | gemini-3-pro-image-preview |
| --output-dir, -o | Output directory for images | images/ |
| --name, -n | Base name for output files | artwork |
| --no-timestamp | Disable auto timestamp | Flag |
| --aspect, -a | Aspect ratio | 16:9 |
| --size, -s | Resolution | 2K or 4K |
| --num | Number of images (1-4) | 4 |
| --person | Person generation policy | allow_adult |

Output: List of saved PNG file paths
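The parameter table can be read as a command-line interface. The block below is a minimal sketch of such an interface using argparse; it is not the script's actual source. The function name build_parser is hypothetical, the defaults for --model, --output-dir, and --name are taken from Workflow 1, and the None defaults for --aspect, --size, and --person are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table (not the real script)."""
    parser = argparse.ArgumentParser(description="Generate images from a text prompt")
    parser.add_argument("prompt", help="Text description (required)")
    parser.add_argument("--model", "-m", default="gemini-3-pro-image-preview")
    parser.add_argument("--output-dir", "-o", default="images/")
    parser.add_argument("--name", "-n", default="generated_image")
    parser.add_argument("--no-timestamp", action="store_true")
    # Assumed: no explicit default, so the model's own default applies.
    parser.add_argument("--aspect", "-a", default=None)
    parser.add_argument("--size", "-s", default=None)
    parser.add_argument("--num", type=int, choices=range(1, 5), default=1)
    parser.add_argument(
        "--person",
        choices=["dont_allow", "allow_adult", "allow_all"],
        default=None,
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(
    ["A futuristic city at sunset", "-a", "16:9", "-s", "2K", "--num", "4"]
)
print(args.model, args.aspect, args.size, args.num)
```

Restricting --num with choices means an out-of-range value is rejected before any request is made.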

Workflows

Workflow 1: Basic Image Generation

python scripts/generate_image.py "A futuristic city at sunset with flying cars"
  • Best for: Quick image generation, prototypes
  • Model: gemini-3-pro-image-preview (default, highest quality)
  • Output: images/generated_image_YYYYMMDD_HHMMSS.png

Workflow 2: Social Media (Instagram, Facebook)

python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
  • Best for: Instagram posts, profile pictures
  • Aspect: 1:1 (square format)
  • Resolution: 2K (2048x2048)
  • Output: images/coffee-shop_YYYYMMDD_HHMMSS.png

Workflow 3: YouTube Thumbnails (16:9)

python scripts/generate_image.py "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
  • Best for: YouTube, video thumbnails
  • Aspect: 16:9 (widescreen)
  • Resolution: 2K (2752x1536)
  • Output: images/thumbnail_YYYYMMDD_HHMMSS.png

Workflow 4: Multiple Variations

python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract
  • Best for: A/B testing, design options
  • Generates: 4 distinct variations
  • Output: images/abstract_YYYYMMDD_HHMMSS_0.png, images/abstract_YYYYMMDD_HHMMSS_1.png, etc.

Workflow 5: Custom Output Directory

python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum
  • Best for: Print materials, high-end assets, organized projects
  • Model: gemini-3-pro-image-preview (the only model that supports 4K)
  • Resolution: 4K (5504x3072 for 16:9)
  • Directory created automatically if it doesn't exist

Workflow 6: Photorealistic Images (Imagen 4)

python scripts/generate_image.py "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate
  • Best for: Realistic photos, product shots
  • Model: imagen-4.0-generate-001 (photorealistic)
  • Notes: English prompts only; max 4 images per request

Workflow 7: Blog Post Featured Image

python scripts/generate_image.py "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image
  • Best for: Blog headers, article images
  • Combines well with: gemini-text for blog content generation

Workflow 8: Content Creation Pipeline (Text + Image)

# 1. Generate content (gemini-text skill)
python skills/gemini-text/scripts/generate.py "Write a product description for smart home device"

# 2. Generate product image (this skill)
python scripts/generate_image.py "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product

# 3. Create social media post
  • Best for: E-commerce, marketing campaigns
  • Combines with: gemini-text, gemini-batch for batch production

Workflow 9: Disable Timestamp

python scripts/generate_image.py "Fixed filename image" --name my-image --no-timestamp
  • Best for: When you want complete control over the filename
  • Output: images/my-image.png (no timestamp)
  • Use when: Generating files for specific naming schemes or automated pipelines
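The workflows above all call the same script with different flags, so they can be driven from Python when they need to run inside a larger pipeline. This is a sketch under stated assumptions: build_command, run_generation, and saved_paths are hypothetical helper names, and the parsing relies on the "Saved: /path/to/file.png" lines described under Output Interpretation.

```python
import subprocess
import sys


def build_command(prompt, script="scripts/generate_image.py", **options):
    """Map keyword options to the documented flags, e.g. output_dir -> --output-dir."""
    cmd = [sys.executable, script, prompt]
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        if value is True:  # boolean switches such as --no-timestamp
            cmd.append(flag)
        elif value is not None and value is not False:
            cmd.extend([flag, str(value)])
    return cmd


def saved_paths(stdout):
    """Collect file paths from the 'Saved: ...' lines the script prints."""
    return [
        line.split("Saved:", 1)[1].strip()
        for line in stdout.splitlines()
        if line.startswith("Saved:")
    ]


def run_generation(prompt, **options):
    """Run the script and return the saved paths (needs the script and its dependencies)."""
    result = subprocess.run(
        build_command(prompt, **options), check=True, capture_output=True, text=True
    )
    return saved_paths(result.stdout)


# Workflow 9 expressed as an argument list; nothing is executed here.
cmd = build_command("Fixed filename image", name="my-image", no_timestamp=True)
print(cmd[2:])
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when prompts contain quotes or spaces.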

Parameters Reference

Model Selection

| Model | Nickname | Quality | Max Size | Best For |
| --- | --- | --- | --- | --- |
| gemini-3-pro-image-preview | Nano Banana Pro | Highest | 4K | Professional assets, advanced text rendering |
| gemini-2.5-flash-image | Nano Banana | Good | 2K | High-volume, low-latency |
| imagen-4.0-generate-001 | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |

Aspect Ratios

| Ratio | Use Case | 1K Size | 2K Size |
| --- | --- | --- | --- |
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |

Note: 4K resolution only available with gemini-3-pro-image-preview
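When downstream code needs the pixel dimensions for layout or validation, the aspect ratio table can be turned into a lookup. The sketch below copies its values from the table above; the names DIMENSIONS and dimensions are hypothetical, and the 21:9 row is left out because its 1K size is not listed.

```python
# Pixel sizes copied from the aspect ratio table: (aspect, size) -> (width, height).
DIMENSIONS = {
    ("1:1", "1K"): (1024, 1024),
    ("1:1", "2K"): (2048, 2048),
    ("16:9", "1K"): (1376, 768),
    ("16:9", "2K"): (2752, 1536),
    ("9:16", "1K"): (768, 1376),
    ("9:16", "2K"): (1536, 2752),
    ("4:3", "1K"): (1024, 768),
    ("4:3", "2K"): (2048, 1536),
    ("3:4", "1K"): (768, 1024),
    ("3:4", "2K"): (1536, 2048),
}


def dimensions(aspect, size):
    """Return (width, height) for a documented aspect/size pair."""
    try:
        return DIMENSIONS[(aspect, size)]
    except KeyError:
        raise ValueError(f"No documented size for aspect {aspect!r} at {size!r}") from None


width, height = dimensions("16:9", "2K")
print(f"{width}x{height}")
```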

Resolution Guide

| Size | Use Case | Best Model |
| --- | --- | --- |
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | gemini-3-pro only |

Person Generation Policy

| Policy | Description | Notes / Restrictions |
| --- | --- | --- |
| dont_allow | No people in images | None |
| allow_adult | Adults only | Recommended default |
| allow_all | All ages | Restricted in EU, UK, CH, MENA |

Output Interpretation

File Naming

  • Default format: {name}_YYYYMMDD_HHMMSS.png (auto timestamp)
  • Single image example: artwork_20260130_031643.png
  • Multiple images: {name}_YYYYMMDD_HHMMSS_0.png, {name}_YYYYMMDD_HHMMSS_1.png, etc.
  • Without timestamp (--no-timestamp): {name}.png
  • Script prints: "Saved: /path/to/file.png"
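The naming rules above can be reproduced in a few lines, which is useful when another tool has to predict where a file will land. This is a sketch of the documented scheme, not the script's own code: output_paths and its now parameter are hypothetical, and the behaviour for several images combined with --no-timestamp is an assumption, since the source does not describe that case.

```python
from datetime import datetime
from pathlib import Path


def output_paths(name, num=1, output_dir="images", timestamp=True, now=None):
    """Build the expected output paths for one generation request."""
    stem = name
    if timestamp:
        # Timestamp format from the docs: YYYYMMDD_HHMMSS
        stem += "_" + (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    if num == 1:
        return [Path(output_dir) / f"{stem}.png"]
    # Multiple images get a zero-based index suffix.
    # Assumption: the same suffix is used when the timestamp is disabled.
    return [Path(output_dir) / f"{stem}_{i}.png" for i in range(num)]


fixed = datetime(2026, 1, 30, 3, 16, 43)
for path in output_paths("artwork", num=2, now=fixed):
    print(path.as_posix())
```

Passing a fixed datetime through now keeps the result reproducible in tests; omitting it uses the current time.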

Image Quality

  • All images include a SynthID watermark for authenticity
  • PNG format for lossless quality
  • Can be converted to JPEG/WEBP if needed
  • 4K images have significantly larger file sizes

Error Messages

  • "Model not available": Check model name spelling
  • "Unsupported size": Verify size/model combination
  • "Aspect ratio error": Use supported ratios for selected model

Common Issues

"google-genai or pillow not installed"

pip install google-genai pillow
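To catch this error before starting a long batch, the two dependencies can be checked up front. The sketch below assumes the usual import names (google-genai is imported as google.genai, Pillow as PIL); missing_modules is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "google") is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing


missing = missing_modules(["google.genai", "PIL"])
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Install with: pip install google-genai pillow")
else:
    print("All dependencies found")
```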

"Image generation failed"

  • Check prompt length (overly verbose prompts can fail)
  • Try simpler, more focused prompts
  • Verify model availability in your region
  • Check API quota limits

"Unsupported aspect ratio"

  • Check if ratio is supported by selected model
  • Imagen 4 has fewer ratio options than Gemini
  • Use 16:9 or 1:1 for best compatibility

"4K not supported"

  • 4K only works with gemini-3-pro-image-preview
  • Use --size 2K for other models
  • Try --model gemini-3-pro-image-preview --size 4K
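A pre-flight check can reject an unsupported size/model pair before any request is sent. The limits below are copied from the Model Selection table; MAX_SIZE, SIZE_ORDER, and check_size are hypothetical names, and this sketch does not reflect how the script itself validates input.

```python
# Maximum size per model, from the Model Selection table.
MAX_SIZE = {
    "gemini-3-pro-image-preview": "4K",
    "gemini-2.5-flash-image": "2K",
    "imagen-4.0-generate-001": "2K",
}
SIZE_ORDER = ["1K", "2K", "4K"]


def check_size(model, size):
    """Raise ValueError if `size` exceeds what the table lists for `model`."""
    if model not in MAX_SIZE:
        raise ValueError(f"Unknown model: {model!r}")
    if size not in SIZE_ORDER:
        raise ValueError(f"Unknown size: {size!r}")
    if SIZE_ORDER.index(size) > SIZE_ORDER.index(MAX_SIZE[model]):
        raise ValueError(
            f"{size} is not supported by {model}; maximum is {MAX_SIZE[model]}"
        )


check_size("gemini-3-pro-image-preview", "4K")  # passes silently
try:
    check_size("imagen-4.0-generate-001", "4K")
except ValueError as err:
    print(err)
```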

"Imagen prompt language error"

  • Imagen models support English prompts only
  • Use gemini-3-pro-image-preview for other languages
  • Translate prompt to English for Imagen

File too large for storage

  • Use --size 1K for smaller files
  • Compress images after generation
  • Convert PNG to JPEG for web use

Best Practices

Prompt Engineering

  • Be specific and descriptive
  • Include style descriptors (e.g., "photorealistic", "digital art")
  • Mention lighting, mood, and composition
  • Use analogies for complex concepts
  • Avoid negative prompts (describe what you want, not what to avoid)

Model Selection

  • Use gemini-3-pro-image-preview for: High quality, text rendering, 4K
  • Use gemini-2.5-flash-image for: Speed, high volume
  • Use imagen-4.0-generate-001 for: Photorealism, product shots

Performance Optimization

  • Generate multiple images at once with --num
  • Use lower resolution for previews
  • Batch requests for high-volume needs (gemini-batch skill)
  • Cache results for repeated requests
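One way to cache results for repeated requests is to derive a stable key from the prompt and its options and reuse any file already stored under that key. This is a sketch of that idea only; cache_key and cached_file are hypothetical names, and the cache directory layout is an assumption rather than anything the script provides.

```python
import hashlib
import json
from pathlib import Path


def cache_key(prompt, **params):
    """Stable hash of the prompt and options; keyword order does not matter."""
    payload = json.dumps({"prompt": prompt, **params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_file(cache_dir, prompt, **params):
    """Return the cached PNG path for this request, or None if absent."""
    path = Path(cache_dir) / f"{cache_key(prompt, **params)}.png"
    return path if path.is_file() else None


key = cache_key("Minimalist coffee shop interior", aspect="1:1", size="2K")
print(key[:12])
```

Because generation is not deterministic, a cache hit returns the earlier image rather than a fresh variation, so this fits previews and repeated pipeline runs better than exploration.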

Quality Tips

  • Use 2K resolution for most web uses
  • 4K only when maximum detail is needed
  • Combine specific prompts with style guidance
  • Test prompts with --num 1 before generating batches

Cost Management

  • Use the Flash model (gemini-2.5-flash-image) for cost efficiency
  • 4K generation costs significantly more
  • Batch multiple requests when possible
  • Generate at 1K for testing, 2K/4K for final

Related Skills

  • gemini-text: Generate text content alongside images
  • gemini-tts: Create audio for image-based content
  • gemini-batch: Process multiple image requests efficiently
  • gemini-embeddings: Generate image embeddings for similarity search

Quick Reference

# Basic
python scripts/generate_image.py "Your prompt"

# Social media (1:1)
python scripts/generate_image.py "Prompt" --aspect 1:1 --size 2K --name social-post

# YouTube thumbnail (16:9)
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 2K --name thumbnail

# 4K high quality
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 4K --name high-res

# Multiple variations
python scripts/generate_image.py "Prompt" --num 4 --name variations

# Custom directory
python scripts/generate_image.py "Prompt" --output-dir ./my-images/ --name custom

# Photorealistic
python scripts/generate_image.py "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo

# No timestamp
python scripts/generate_image.py "Prompt" --name fixed-name --no-timestamp

Reference

Source

https://github.com/akrindev/google-studio-skills/blob/main/skills/gemini-image/SKILL.md

Overview

Create high-quality visuals from text prompts using Google's Gemini and Imagen models through a simple Python script. It supports multiple outputs, custom aspect ratios, resolutions up to 4K, and batch generation, making it ideal for prototyping visuals and producing assets for social, video, and print.

How This Skill Works

Run the Python script scripts/generate_image.py with a text prompt and optional flags. It interfaces with Gemini 3 Pro Image or Imagen 4 models and saves PNG outputs to a specified directory. Use parameters like --aspect, --size, --num, --output-dir, --name, and --person to control composition, resolution, and batch variations.

When to Use It

  • Create visual content from text descriptions
  • Generate multiple image variations for exploration or A/B testing
  • Produce images at specific resolutions (1K, 2K, 4K) and custom aspect ratios
  • Design social media assets, banners, or thumbnails
  • Generate photorealistic or artistic visuals with optional person-generation controls

Quick Start

  1. Write a clear text prompt describing the image you want to generate.
  2. Run python scripts/generate_image.py "PROMPT" with desired options (e.g., --model, --aspect, --size, --num, --output-dir).
  3. Check the output PNGs in the specified directory and select the best variation for use.
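For the final step, a short helper can list the PNGs in the output directory with the most recent first, which makes it easier to find the files from the latest run. This is a sketch; list_pngs is a hypothetical name and images is the default output directory shown in the workflows.

```python
from pathlib import Path


def list_pngs(directory):
    """Return the PNG files in `directory`, newest first (empty if it does not exist)."""
    folder = Path(directory)
    if not folder.is_dir():
        return []
    return sorted(
        folder.glob("*.png"),
        key=lambda p: p.stat().st_mtime,  # sort by modification time
        reverse=True,
    )


for path in list_pngs("images"):
    print(path)
```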

Best Practices

  • Start with a precise, descriptive prompt and specify style cues if needed
  • Leverage --num (1-4) and experiment with different --aspect and --size values
  • Organize outputs with --output-dir and meaningful --name; use --no-timestamp if needed
  • Review generated PNGs and iterate on prompts to improve quality
  • For print-ready assets, select 4K resolution and an appropriate aspect ratio

Example Use Cases

  • python scripts/generate_image.py "A futuristic city at sunset with flying cars" --aspect 16:9 --size 2K --name city-sunset
  • python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
  • python scripts/generate_image.py "Tech gadget thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
  • python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract
  • python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --name museum
