What models can I use with this skill?

You can generate images using Gemini 3 Pro Image or Imagen 4 models via the CLI (e.g., gemini-3-pro-image-preview or imagen-4.0-ge options).

Can I produce 4K images?

Yes. Pass --size 4K with gemini-3-pro-image-preview (the default model), which is the only model listed here that supports 4K output.

What outputs does the script generate?

The script saves PNG files to the output directory and prints the path of each saved file.

gemini-image

npx machina-cli add skill akrindev/google-studio-skills/gemini-image --openclaw

Gemini Image Generation

Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.

When to Use This Skill

Use this skill when you need to:

Create visual content from text descriptions
Generate multiple image variations
Create images at specific resolutions (1K, 2K, 4K)
Produce images for different aspect ratios (social media, banners, etc.)
Generate photorealistic images or artistic visuals
Create images with person generation controls
Batch generate multiple images at once
Combine with text generation for complete content creation

Available Scripts

scripts/generate_image.py

Purpose: Generate images using Gemini 3 Pro Image or Imagen 4 models

When to use:

Any image generation task
Multiple image generation (1-4 per request)
Custom resolution and aspect ratio needs
Professional asset creation
Photorealistic or artistic image generation

Key parameters:

Parameter	Description	Example
`prompt`	Text description (required)	`"A futuristic city at sunset"`
`--model`, `-m`	Model to use	`gemini-3-pro-image-preview`
`--output-dir`, `-o`	Output directory for images	`images/`
`--name`, `-n`	Base name for output files	`artwork`
`--no-timestamp`	Disable auto timestamp	Flag
`--aspect`, `-a`	Aspect ratio	`16:9`
`--size`, `-s`	Resolution	`2K` or `4K`
`--num`	Number of images (1-4)	`4`
`--person`	Person generation policy	`allow_adult`

Output: List of saved PNG file paths
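
The parameter table above can be turned into a small command builder when the script is driven from other Python code. This is a minimal sketch, not part of the skill: `build_command` is a hypothetical helper, and it assumes the script lives at `scripts/generate_image.py` relative to the working directory. Only flags listed in the table are used.

```python
import sys


def build_command(prompt, model=None, output_dir=None, name=None,
                  aspect=None, size=None, num=None, person=None,
                  no_timestamp=False, script="scripts/generate_image.py"):
    """Return an argv list for generate_image.py (hypothetical helper)."""
    # The table documents 1-4 images per request.
    if num is not None and not 1 <= num <= 4:
        raise ValueError("--num must be between 1 and 4")

    argv = [sys.executable, script, prompt]
    options = [
        ("--model", model),
        ("--output-dir", output_dir),
        ("--name", name),
        ("--aspect", aspect),
        ("--size", size),
        ("--num", num),
        ("--person", person),
    ]
    # Only emit flags the caller actually set, so script defaults still apply.
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    if no_timestamp:
        argv.append("--no-timestamp")
    return argv


if __name__ == "__main__":
    cmd = build_command("A futuristic city at sunset", aspect="16:9", size="2K")
    print(cmd)  # pass this list to subprocess.run(cmd, check=True) to execute
```

Building the command as a list rather than a single string avoids shell quoting problems with prompts that contain spaces or quotes.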

Workflows

Workflow 1: Basic Image Generation

python scripts/generate_image.py "A futuristic city at sunset with flying cars"

Best for: Quick image generation, prototypes
Model: gemini-3-pro-image-preview (default, highest quality)
Output: images/generated_image_YYYYMMDD_HHMMSS.png

Workflow 2: Social Media (Instagram, Facebook)

python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop

Best for: Instagram posts, profile pictures
Aspect: 1:1 (square format)
Resolution: 2K (2048x2048)
Output: images/coffee-shop_YYYYMMDD_HHMMSS.png

Workflow 3: YouTube Thumbnails (16:9)

python scripts/generate_image.py "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail

Best for: YouTube, video thumbnails
Aspect: 16:9 (widescreen)
Resolution: 2K (2752x1536)
Output: images/thumbnail_YYYYMMDD_HHMMSS.png

Workflow 4: Multiple Variations

python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract

Best for: A/B testing, design options
Generates: 4 distinct variations
Output: images/abstract_YYYYMMDD_HHMMSS_0.png, images/abstract_YYYYMMDD_HHMMSS_1.png, etc.

Workflow 5: Custom Output Directory

python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum

Best for: Print materials, high-end assets, organized projects
Model: gemini-3-pro-image-preview only (for 4K)
Resolution: 4K (5504x3072 for 16:9)
Directory created automatically if it doesn't exist

Workflow 6: Photorealistic Images (Imagen 4)

python scripts/generate_image.py "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate

Best for: Realistic photos, product shots
Model: imagen-4.0-generate-001 (photorealistic)
Notes: English prompts only
Max 4 images per request

Workflow 7: Blog Post Featured Image

python scripts/generate_image.py "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image

Best for: Blog headers, article images
Combines well with: gemini-text for blog content generation

Workflow 8: Content Creation Pipeline (Text + Image)

# 1. Generate content (gemini-text skill)
python skills/gemini-text/scripts/generate.py "Write a product description for smart home device"

# 2. Generate product image (this skill)
python scripts/generate_image.py "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product

# 3. Create social media post

Best for: E-commerce, marketing campaigns
Combines with: gemini-text, gemini-batch for batch production

Workflow 9: Disable Timestamp

python scripts/generate_image.py "Fixed filename image" --name my-image --no-timestamp

Best for: When you want complete control over filename
Output: images/my-image.png (no timestamp)
Use when: Generating files for specific naming schemes or automated pipelines

Parameters Reference

Model Selection

Model	Nickname	Quality	Max Size	Best For
`gemini-3-pro-image-preview`	Nano Banana Pro	Highest	4K	Professional assets, advanced text rendering
`gemini-2.5-flash-image`	Nano Banana	Good	2K	High-volume, low-latency
`imagen-4.0-generate-001`	Imagen 4	Photorealistic	2K	Realistic photos, product shots

Aspect Ratios

Ratio	Use Case	1K Size	2K Size
1:1	Instagram, avatars	1024x1024	2048x2048
16:9	YouTube, presentations	1376x768	2752x1536
9:16	Instagram Stories, TikTok	768x1376	1536x2752
4:3	Traditional displays	1024x768	2048x1536
3:4	Portrait orientation	768x1024	1536x2048
21:9	Ultrawide	-	5504x2400

Note: 4K resolution is only available with gemini-3-pro-image-preview

Resolution Guide

Size	Use Case	Best Model
1K (1024px)	Web thumbnails, previews	Any model
2K (2048px)	Standard web, social media	Any model
4K (4096px)	Print, high-end assets	gemini-3-pro only
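
The size and model constraints in the tables above can be checked before a request is sent. The following is a sketch under stated assumptions: `MAX_SIZE` and `size_supported` are hypothetical names, and the limits are copied from the Model Selection table in this document rather than queried from the API.

```python
# Maximum size per model, as listed in the Model Selection table.
MAX_SIZE = {
    "gemini-3-pro-image-preview": "4K",
    "gemini-2.5-flash-image": "2K",
    "imagen-4.0-generate-001": "2K",
}

# Sizes in ascending order so they can be compared by position.
SIZE_ORDER = ["1K", "2K", "4K"]


def size_supported(model, size):
    """Return True if the documented limit for `model` allows `size`."""
    if model not in MAX_SIZE:
        raise ValueError(f"unknown model: {model}")
    if size not in SIZE_ORDER:
        raise ValueError(f"unknown size: {size}")
    return SIZE_ORDER.index(size) <= SIZE_ORDER.index(MAX_SIZE[model])


if __name__ == "__main__":
    for model in MAX_SIZE:
        print(model, "4K ok:", size_supported(model, "4K"))
```

A check like this catches the "4K not supported" case locally instead of spending an API call on it.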

Person Generation Policy

Policy	Description	Restrictions
`dont_allow`	No people in images	None
`allow_adult`	Adults only	Recommended default
`allow_all`	All ages	Restricted in EU, UK, CH, MENA

Output Interpretation

File Naming

Default format: {name}_YYYYMMDD_HHMMSS.png (auto timestamp)
Single image example: artwork_20260130_031643.png
Multiple images: {name}_YYYYMMDD_HHMMSS_0.png, {name}_YYYYMMDD_HHMMSS_1.png, etc.
Without timestamp (--no-timestamp): {name}.png
Script prints: "Saved: /path/to/file.png"
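
The naming rules above can be sketched as a function that predicts output paths. This is an illustration, not the script's actual code: `expected_paths` is a hypothetical helper, the defaults (`generated_image`, `images`) are taken from Workflow 1, and the handling of multiple images combined with `--no-timestamp` is an assumption, since the document does not describe that case.

```python
from datetime import datetime
from pathlib import Path


def expected_paths(name="generated_image", num=1, output_dir="images",
                   timestamp=True, now=None):
    """Predict output paths from the documented naming scheme (hypothetical helper)."""
    stem = name
    if timestamp:
        now = now or datetime.now()
        stem = f"{name}_{now.strftime('%Y%m%d_%H%M%S')}"
    if num == 1:
        return [Path(output_dir) / f"{stem}.png"]
    # Assumption: the _0, _1, ... suffix is also used when the timestamp is disabled.
    return [Path(output_dir) / f"{stem}_{i}.png" for i in range(num)]


if __name__ == "__main__":
    for path in expected_paths("abstract", num=4):
        print(path)
```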

Image Quality

All images include a SynthID watermark for authenticity
PNG format for lossless quality
Can be converted to JPEG/WEBP if needed
4K images produce significantly larger files

Error Messages

"Model not available": Check model name spelling
"Unsupported size": Verify size/model combination
"Aspect ratio error": Use supported ratios for selected model

Common Issues

"google-genai or pillow not installed"

pip install google-genai pillow

"Image generation failed"

Check prompt length (overly long prompts can fail)
Try simpler, more focused prompts
Verify model availability in your region
Check API quota limits

"Unsupported aspect ratio"

Check if ratio is supported by selected model
Imagen 4 has fewer ratio options than Gemini
Use 16:9 or 1:1 for best compatibility

"4K not supported"

4K only works with gemini-3-pro-image-preview
Use --size 2K for other models
Try --model gemini-3-pro-image-preview --size 4K

"Imagen prompt language error"

Imagen models support English prompts only
Use gemini-3-pro-image-preview for other languages
Translate prompt to English for Imagen

File too large for storage

Use --size 1K for smaller files
Compress images after generation
Convert PNG to JPEG for web use

Best Practices

Prompt Engineering

Be specific and descriptive
Include style descriptors (e.g., "photorealistic", "digital art")
Mention lighting, mood, and composition
Use analogies for complex concepts
Avoid negative prompts (describe what you want, not what to avoid)
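
One way to apply the tips above consistently is to assemble prompts from named parts. The helper below is a hypothetical sketch; `compose_prompt` and its parameter names are not part of the skill, and joining the parts with commas is just one reasonable convention.

```python
def compose_prompt(subject, style=None, lighting=None, mood=None, composition=None):
    """Join a subject with optional style, lighting, mood, and composition cues."""
    parts = [subject.strip()]
    for extra in (style, lighting, mood, composition):
        if extra:  # skip cues that were not provided
            parts.append(extra.strip())
    return ", ".join(parts)


if __name__ == "__main__":
    print(compose_prompt(
        "Serene mountain lake at sunrise",
        style="photorealistic",
        lighting="soft golden light",
        composition="wide shot with reflections in the foreground",
    ))
```

Keeping each cue in its own argument makes it easy to vary one aspect at a time while comparing results.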

Model Selection

Use gemini-3-pro-image-preview for: High quality, text rendering, 4K
Use gemini-2.5-flash-image for: Speed, high volume
Use imagen-4.0-generate-001 for: Photorealism, product shots

Performance Optimization

Generate multiple images at once with --num
Use lower resolution for previews
Batch requests for high-volume needs (gemini-batch skill)
Cache results for repeated requests
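
Caching repeated requests can be approximated with deterministic filenames. This sketch assumes the caller passes the computed name via `--name` together with `--no-timestamp`, so the output lands at `{name}.png` as described under File Naming. `cache_name` and `cached_image` are hypothetical helpers, not features of the script.

```python
import hashlib
import json
from pathlib import Path


def cache_name(prompt, **options):
    """Derive a stable base name from the prompt and generation options."""
    # sort_keys makes the name independent of keyword argument order.
    payload = json.dumps({"prompt": prompt, **options}, sort_keys=True)
    return "img-" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def cached_image(prompt, output_dir="images", **options):
    """Return the cached PNG path if it already exists, else None."""
    path = Path(output_dir) / f"{cache_name(prompt, **options)}.png"
    return path if path.exists() else None


if __name__ == "__main__":
    name = cache_name("Minimalist coffee shop interior", aspect="1:1", size="2K")
    hit = cached_image("Minimalist coffee shop interior", aspect="1:1", size="2K")
    print(name, "cached" if hit else "not cached yet")
```

On a cache miss, the same name can be handed to the script so the next identical request finds the file.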

Quality Tips

Use 2K resolution for most web uses
4K only when maximum detail is needed
Combine specific prompts with style guidance
Test prompts with --num 1 before generating batches

Cost Management

Use flash models for cost efficiency
4K generation costs significantly more
Batch multiple requests when possible
Generate at 1K for testing, 2K/4K for final

Related Skills

gemini-text: Generate text content alongside images
gemini-tts: Create audio for image-based content
gemini-batch: Process multiple image requests efficiently
gemini-embeddings: Generate image embeddings for similarity search

Quick Reference

# Basic
python scripts/generate_image.py "Your prompt"

# Social media (1:1)
python scripts/generate_image.py "Prompt" --aspect 1:1 --size 2K --name social-post

# YouTube thumbnail (16:9)
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 2K --name thumbnail

# 4K high quality
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 4K --name high-res

# Multiple variations
python scripts/generate_image.py "Prompt" --num 4 --name variations

# Custom directory
python scripts/generate_image.py "Prompt" --output-dir ./my-images/ --name custom

# Photorealistic
python scripts/generate_image.py "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo

# No timestamp
python scripts/generate_image.py "Prompt" --name fixed-name --no-timestamp

Reference

See references/ for model documentation (if available)
Get API key: https://aistudio.google.com/apikey
Documentation: https://ai.google.dev/gemini-api/docs/image-generation
SynthID: https://deepmind.google/technologies/synthid/

Source

git clone https://github.com/akrindev/google-studio-skills
Skill file: https://github.com/akrindev/google-studio-skills/blob/main/skills/gemini-image/SKILL.md

Overview

Create high-quality visuals from text prompts using Google's Gemini and Imagen models through a simple Python script. It supports multiple outputs, custom aspect ratios, resolutions up to 4K, and batch generation, making it ideal for prototyping visuals and producing assets for social, video, and print.

How This Skill Works

Run the Python script scripts/generate_image.py with a text prompt and optional flags. It interfaces with Gemini 3 Pro Image or Imagen 4 models and saves PNG outputs to a specified directory. Use parameters like --aspect, --size, --num, --output-dir, --name, and --person to control composition, resolution, and batch variations.

When to Use It

Create visual content from text descriptions
Generate multiple image variations for exploration or A/B testing
Produce images at specific resolutions (1K, 2K, 4K) and custom aspect ratios
Design social media assets, banners, or thumbnails
Generate photorealistic or artistic visuals with optional person-generation controls

Quick Start

Step 1: Write a clear text prompt describing the image you want to generate.
Step 2: Run python scripts/generate_image.py "PROMPT" with desired options (e.g., --model, --aspect, --size, --num, --output-dir).
Step 3: Check the output PNGs in the specified directory and select the best variation for use.
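
The three steps above can also be driven from Python. The sketch below assumes the script prints one `Saved: /path/to/file.png` line per image on standard output, as the File Naming section describes. `parse_saved_paths` and `generate` are hypothetical helpers.

```python
import subprocess
import sys


def parse_saved_paths(stdout):
    """Extract file paths from 'Saved: ...' lines in the script's output."""
    prefix = "Saved: "
    return [
        line.strip()[len(prefix):]
        for line in stdout.splitlines()
        if line.strip().startswith(prefix)
    ]


def generate(prompt, *flags, script="scripts/generate_image.py"):
    """Run the script and return the saved paths; raises if the script exits non-zero."""
    result = subprocess.run(
        [sys.executable, script, prompt, *flags],
        capture_output=True, text=True, check=True,
    )
    return parse_saved_paths(result.stdout)


if __name__ == "__main__":
    # Requires the skill's script and an API key to be set up.
    paths = generate("A futuristic city at sunset", "--num", "2", "--name", "city")
    print(paths)
```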

Best Practices

Start with a precise, descriptive prompt and specify style cues if needed
Leverage --num (1-4) and experiment with different --aspect and --size values
Organize outputs with --output-dir and meaningful --name; use --no-timestamp if needed
Review generated PNGs and iterate on prompts to improve quality
For print-ready assets, select 4K resolution and an appropriate aspect ratio

Example Use Cases

python scripts/generate_image.py "A futuristic city at sunset with flying cars" --aspect 16:9 --size 2K --name city-sunset
python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
python scripts/generate_image.py "Tech gadget thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract
python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --name museum
