Gemini Image Generation
npx machina-cli add skill Ibrahim-3d/nano-banana-claude-plugin/genimage --openclaw
Gemini Image Generation Skill
Provides comprehensive knowledge for generating and editing images using the Gemini API through 8 specialized Python scripts.
How It Works
All generation and editing is text-guided through the Gemini API. There is no visual UI, mask painter, or interactive editor. You describe what you want in natural language, and Gemini produces the image from that description.
For image editing, Gemini interprets your text instruction and identifies which regions of the image to modify. For example, given "replace the sky with a sunset", Gemini infers which region is the sky and replaces only that region.
Available Modes
Text-to-Image (no input image)
- Standard (texttoimage.py): Fast generation via gemini-2.5-flash-image at 1K
- High-res (hires.py): 2K or 4K via gemini-3-pro-image-preview
- Search-grounded (searchground.py): Uses real-time Google Search data
Image Editing (input image + text)
- General edit (imageedit.py): All text-guided editing, including inpainting, adding or removing objects, background replacement, detail-preserving edits, bringing sketches to life, changing angles, and any other modification
- Style transfer (styletransfer.py): Apply the artistic style of one image onto another (requires 2 images)
Multi-Image
- Compose (compose.py): Combine elements from multiple images
- Multi-reference (multiref.py): Up to 14 reference images (gemini-3-pro-image-preview)
Interactive
- Multi-turn (multiturn.py): Chat-based iterative editing with memory
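The mode list above can be read as a lookup from task to script. The sketch below is only an illustration of that reading: the `MODE_SCRIPTS` table and the `script_for` helper are hypothetical names, not part of the plugin. Only the script filenames come from the list.

```python
# Hypothetical lookup table; only the script filenames come from the mode list.
MODE_SCRIPTS = {
    "standard": "texttoimage.py",
    "high-res": "hires.py",
    "search-grounded": "searchground.py",
    "general-edit": "imageedit.py",
    "style-transfer": "styletransfer.py",
    "compose": "compose.py",
    "multi-reference": "multiref.py",
    "multi-turn": "multiturn.py",
}


def script_for(mode: str) -> str:
    """Return the script filename for a mode name, ignoring case and spaces."""
    key = mode.strip().lower().replace(" ", "-")
    try:
        return MODE_SCRIPTS[key]
    except KeyError:
        known = ", ".join(sorted(MODE_SCRIPTS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {known}") from None
```

Keeping the mapping in one table makes it easy to reject a mistyped mode before any script is launched.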
Running Scripts
python "$CLAUDE_PLUGIN_ROOT/scripts/<script>.py" --prompt "..." [options]
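If the scripts are driven from Python rather than a shell, the same command pattern can be assembled as an argument list. This is a minimal sketch, assuming `CLAUDE_PLUGIN_ROOT` is set in the environment as in the command above; `build_command` and its `root` parameter are hypothetical helpers, and extra options are passed through unchanged rather than checked.

```python
import os
import sys
from pathlib import Path
from typing import List, Optional


def build_command(script: str, prompt: str, *options: str,
                  root: Optional[str] = None) -> List[str]:
    """Build the argument list for one plugin script.

    `root` defaults to the CLAUDE_PLUGIN_ROOT environment variable.
    Extra options (for example "--image", "photo.png") are appended as given.
    """
    if root is None:
        root = os.environ["CLAUDE_PLUGIN_ROOT"]  # KeyError if it is not set
    script_path = Path(root) / "scripts" / script
    return [sys.executable, str(script_path), "--prompt", prompt, *options]


# Example (not executed here):
#   import subprocess
#   cmd = build_command("imageedit.py", "Replace the sky with a sunset",
#                       "--image", "photo.png")
#   subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when the prompt contains quotes or the plugin path contains spaces.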
Editing Examples
All of these use imageedit.py --image photo.png --prompt "...":
- Inpainting: "Replace the sky with dramatic storm clouds"
- Remove object: "Remove the person on the left and fill naturally"
- Add object: "Add a golden retriever sitting on the couch"
- Background swap: "Replace the background with a tropical beach"
- Bring to life: "Transform this pencil sketch into a photorealistic image"
- Detail preserve: "Place this logo on a billboard in Times Square, keep the logo sharp"
- Style change: "Make this photo look like an oil painting"
Prompting Guide
Photorealistic Scenes
Use photography terms: camera angles, lens types, lighting, fine details.
Template: "A photorealistic [shot type] of [subject], [action], set in [environment]. Illuminated by [lighting], creating a [mood] atmosphere. Captured with [camera/lens]. [aspect ratio] format."
Stylized Illustrations
Specify art style, color palette, medium, background.
Template: "A [style] illustration of [subject] in the style of [medium]. Color palette: [colors]. Background: [description]. No text."
Text-Heavy Images
Put desired text in quotes. Specify font style and placement.
Template: "An image containing the text '[text]' in [font]. [Describe scene around it]."
Product Photography
Describe materials, lighting setup, environment, camera angle.
Template: "Professional product photography of [product] on [surface]. [Lighting] lighting, [background]. Shot from [angle] with [lens]."
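The bracketed templates in this guide map naturally onto string formatting. The sketch below shows two of them as Python format strings. The field names (`shot_type`, `camera_lens`, and so on) and the `fill` helper are assumptions made for illustration; only the template wording comes from the guide.

```python
# Templates copied from the prompting guide; field names are derived from the
# bracketed placeholders, with spaces and slashes replaced by underscores.
PHOTOREALISTIC = (
    "A photorealistic {shot_type} of {subject}, {action}, set in {environment}. "
    "Illuminated by {lighting}, creating a {mood} atmosphere. "
    "Captured with {camera_lens}. {aspect_ratio} format."
)
PRODUCT = (
    "Professional product photography of {product} on {surface}. "
    "{lighting} lighting, {background}. Shot from {angle} with {lens}."
)


def fill(template: str, **fields: str) -> str:
    """Fill a template; str.format raises KeyError if a placeholder is missing."""
    return template.format(**fields)


prompt = fill(
    PRODUCT,
    product="a ceramic mug",
    surface="a walnut table",
    lighting="Soft window",
    background="blurred kitchen interior",
    angle="a 45-degree angle",
    lens="an 85mm lens",
)
print(prompt)
```

Because a missing field raises an error, a half-filled template never reaches the API by accident.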
Aspect Ratios
Supported: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Resolution (gemini-3-pro-image-preview only)
- 1K: Default (1024px)
- 2K: High resolution (2048px)
- 4K: Ultra resolution (4096px)
Must use uppercase K.
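Because the accepted values are a closed set, they can be checked before a script is called. The following is a sketch only: `check_options` is a hypothetical helper, and it encodes nothing beyond the aspect ratios and resolution labels listed above, including the uppercase-K rule.

```python
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}
SUPPORTED_RESOLUTIONS = {"1K", "2K", "4K"}


def check_options(aspect_ratio: str, resolution: str = "1K") -> None:
    """Raise ValueError if either value is outside the documented sets."""
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if resolution not in SUPPORTED_RESOLUTIONS:
        hint = ""
        # A lowercase "k" is the most likely mistake, so point it out.
        if resolution.upper() in SUPPORTED_RESOLUTIONS:
            hint = f" (did you mean {resolution.upper()!r}? the K must be uppercase)"
        raise ValueError(f"unsupported resolution: {resolution!r}{hint}")
```

For example, `check_options("16:9", "4K")` passes silently, while `check_options("16:9", "4k")` raises with a hint about the uppercase K.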
Models
- gemini-2.5-flash-image (Nano Banana): Fast, 1K, high-volume
- gemini-3-pro-image-preview (Nano Banana Pro): Pro quality, up to 4K, thinking mode, search grounding, 14 reference images
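The capability split between the two models suggests a simple selection rule. The sketch below is a heuristic under stated assumptions, not the plugin's logic: `pick_model` is a hypothetical helper that chooses the pro model only when 2K or 4K output or search grounding is requested, and the flash model otherwise.

```python
FLASH = "gemini-2.5-flash-image"
PRO = "gemini-3-pro-image-preview"


def pick_model(resolution: str = "1K", search_grounding: bool = False) -> str:
    """Pick a model ID from the listed capabilities (a heuristic only).

    Assumption: anything that does not need 2K/4K output or search
    grounding is served by the faster flash model.
    """
    needs_pro = resolution in {"2K", "4K"} or search_grounding
    return PRO if needs_pro else FLASH
```

Defaulting to the flash model keeps high-volume jobs fast, and the pro model is selected only when a listed pro-only feature is requested.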
Source
https://github.com/Ibrahim-3d/nano-banana-claude-plugin/blob/main/skills/genimage/SKILL.md
Overview
Gemini Image Generation provides text-guided image creation and editing via the Gemini API using eight specialized Python scripts. It supports rapid text-to-image generation, high-res outputs up to 4K, style transfers, background edits, and multi-image compositions—all without a visual UI. This enables fast, repeatable visual content workflows that you can automate.
How This Skill Works
Tasks are driven by natural language prompts passed to Gemini through scripts like texttoimage.py, hires.py, searchground.py, imageedit.py, styletransfer.py, compose.py, multiref.py, and multiturn.py. Gemini semantically interprets prompts to identify regions for edits, enabling actions like replacing the sky or swapping the background. Run a script with python "$CLAUDE_PLUGIN_ROOT/scripts/<script>.py" --prompt "..." [options], then select the desired aspect ratio and, on gemini-3-pro-image-preview, the resolution (1K/2K/4K).
When to Use It
- Generate a brand-new image from a descriptive prompt (Text-to-Image).
- Edit an existing image via inpainting, object removal, background changes, or detail edits (Image Editing).
- Apply a style transfer from one image to another.
- Assemble elements from multiple images (Multi-Image: Compose or Multi-reference).
- Engage in interactive, memory-based editing across turns (Interactive: Multi-turn).
Quick Start
- Step 1: Pick a script and craft a prompt, e.g., python "$CLAUDE_PLUGIN_ROOT/scripts/hires.py" --prompt 'A photorealistic sunset over mountains.' --resolution 4K --aspect 16:9
- Step 2: If editing, run imageedit.py with an input image and a prompt, e.g., python "$CLAUDE_PLUGIN_ROOT/scripts/imageedit.py" --image scene.jpg --prompt 'Replace the sky with a sunset' --resolution 2K
- Step 3: Review results, adjust the prompt or options, and, if you need more detail, regenerate at 2K or 4K with a gemini-3-pro-image-preview script such as hires.py.
Best Practices
- Start with a concise prompt and clearly specify the target aspect ratio and resolution (e.g., 4K in a 16:9 frame).
- Use the provided templates to guide prompts for photorealistic scenes, stylized illustrations, text-heavy images, or product photography.
- In editing, reference explicit regions (e.g., 'replace the sky' or 'remove the person on the left') to leverage semantic understanding.
- For multi-image tasks, supply well-labeled reference images and choose the appropriate mode (Compose or Multi-reference).
- Iterate with more detailed prompts, and regenerate at 4K with gemini-3-pro-image-preview if ultra-high resolution is required.
Example Use Cases
- Replace the sky with dramatic storm clouds (Inpainting).
- Remove the person on the left and fill naturally (Object removal).
- Add a golden retriever sitting on the couch (Add object).
- Replace the background with a tropical beach (Background swap).
- Make this photo look like an oil painting (Style change).