Which modalities are returned?

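The API can return both TEXT and IMAGE parts. Request them by setting response_modalities in the Python SDK's GenerateContentConfig, or responseModalities under generationConfig in a REST request.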

How do I configure image size?

Use image_config in GenerateContentConfig to set aspect_ratio and image_size (e.g., 16:9, 2K).

nano-banana-pro

npx machina-cli add skill Makiya1202/ai-agents-skills/nano-banana-pro --openclaw

Nano Banana Pro (Gemini 3 Pro Image)

Generate high-quality images with Google's Gemini 3 Pro Image API.

Overview

Nano Banana Pro is the marketing name for Gemini 3 Pro Image (gemini-3-pro-image-preview), Google's state-of-the-art image generation and editing model built on Gemini 3 Pro.

Quick Start

Get API Key

Go to Google AI Studio
Click "Get API Key"
Store securely as environment variable
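
A minimal sketch of reading the key back from the environment, assuming it was exported as GEMINI_API_KEY (the name used in the cURL example in this document). The helper name `load_api_key` is illustrative and not part of any SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    # Accept an explicit mapping so the function is easy to test;
    # fall back to the real process environment otherwise.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running")
    return key
```

Failing at startup with a clear message tends to be easier to debug than an authentication error from the first API call.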

Basic Image Generation (Python)

import os

from google import genai
from google.genai import types

# Read the key from the environment rather than hardcoding it
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="A serene Japanese garden with cherry blossoms and a koi pond",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

# Process the response: each part carries either text or image data.
# Both attributes always exist on a part, so test for None rather than hasattr.
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(f"Description: {part.text}")
    elif part.inline_data is not None:
        # The SDK returns image data as raw bytes, so write it directly
        mime_type = part.inline_data.mime_type  # e.g. image/png
        with open("output.png", "wb") as f:
            f.write(part.inline_data.data)

REST API (cURL)

curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [{"text": "Create a vibrant infographic about photosynthesis"}]
    }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
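
The same request can be sketched in Python using only the standard library, which may help when the SDK is not available. The names `ENDPOINT`, `build_payload`, and `generate` are illustrative; the `imageConfig` keys are taken from the Next.js route in this document, and the 120-second timeout is an arbitrary assumption.

```python
import json
import urllib.request
from typing import Optional

ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3-pro-image-preview:generateContent"
)


def build_payload(prompt: str,
                  aspect_ratio: Optional[str] = None,
                  image_size: Optional[str] = None) -> dict:
    """Build the JSON body for a generateContent request."""
    generation_config = {"responseModalities": ["TEXT", "IMAGE"]}
    # Only include imageConfig when the caller asked for something specific.
    image_config = {}
    if aspect_ratio:
        image_config["aspectRatio"] = aspect_ratio
    if image_size:
        image_config["imageSize"] = image_size
    if image_config:
        generation_config["imageConfig"] = image_config
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": generation_config,
    }


def generate(prompt: str, api_key: str, **image_options: str) -> dict:
    """POST the payload and return the parsed JSON response."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt, **image_options)).encode("utf-8"),
        headers={"x-goog-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)
```

Keeping the payload builder separate from the network call makes it possible to inspect or unit-test the request body without spending API quota.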

TypeScript/JavaScript

const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

async function generateImage(prompt: string) {
  const response = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
    {
      method: 'POST',
      headers: {
        'x-goog-api-key': GEMINI_API_KEY!,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        contents: [{ 
          role: 'user', 
          parts: [{ text: prompt }] 
        }],
        generationConfig: {
          responseModalities: ['TEXT', 'IMAGE'],
        },
      }),
    }
  );

  const data = await response.json();
  return data;
}

Configuration Options

Image Configuration

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Professional product photo of a coffee mug",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # e.g. 1:1, 3:2, 16:9, 9:16, 21:9
            image_size="2K"       # Options: 1K, 2K, 4K
        )
    )
)
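
Checking these two values before sending a request can catch typos early. The sketch below is a hypothetical helper, not part of the SDK; its allow-lists contain only the values named in the comments above, which may not be exhaustive, so extend them as the official documentation dictates.

```python
# Values named in this document; treat them as a starting point, not a full list.
ASPECT_RATIOS = {"1:1", "3:2", "16:9", "9:16", "21:9"}
IMAGE_SIZES = {"1K", "2K", "4K"}


def validate_image_config(aspect_ratio: str, image_size: str) -> dict:
    """Return keyword arguments for an image config, or raise ValueError."""
    # Normalize to the uppercase spelling used above ("2k" becomes "2K").
    size = image_size.strip().upper()
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if size not in IMAGE_SIZES:
        raise ValueError(f"unsupported image_size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": size}
```

The returned dictionary could then be unpacked into the config, for example `types.ImageConfig(**validate_image_config("16:9", "2k"))`.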

With Google Search Grounding

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Create an infographic showing today's stock market trends",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]  # Enable search grounding
    )
)

Multi-Turn Conversations (Iterative Editing)

# Create a chat session
chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

# Initial generation
response1 = chat.send_message(
    "Create a vibrant infographic explaining photosynthesis"
)

# Edit the image
response2 = chat.send_message(
    "Update this infographic to be in Spanish. Keep all other elements the same."
)
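
The chat session above keeps the conversation history for you. When calling the REST endpoint directly, the usual approach is to resend the whole conversation in `contents` on every request. The sketch below shows one way to maintain that history; the helper names are hypothetical, and it assumes the model's parts should be passed back verbatim so that any extra fields the API attached to them are preserved.

```python
from typing import List


def add_user_turn(history: List[dict], text: str) -> List[dict]:
    """Return a new history with a user message appended."""
    return history + [{"role": "user", "parts": [{"text": text}]}]


def add_model_turn(history: List[dict], response_json: dict) -> List[dict]:
    """Return a new history with the model's reply appended unchanged."""
    candidates = response_json.get("candidates") or []
    if not candidates:
        raise ValueError("response has no candidates")
    # Copy the parts as-is (text and image data alike) so the next
    # request carries the image the user wants edited.
    parts = candidates[0].get("content", {}).get("parts", [])
    return history + [{"role": "model", "parts": parts}]
```

Each request body would then use the accumulated list as its `contents` value.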

Key Capabilities

1. Superior Text Rendering

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="""Create a professional poster with:
    - Title: "Annual Tech Summit 2025"
    - Date: March 15-17, 2025
    - Location: San Francisco Convention Center
    """,
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

2. Character Consistency (Up to 5 Subjects)

def load_image(path: str) -> bytes:
    # Read the reference image as raw bytes; the SDK handles the encoding
    with open(path, "rb") as f:
        return f.read()

character_ref = load_image("character.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Generate an image of this person at a tech conference",
        types.Part.from_bytes(data=character_ref, mime_type="image/png"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
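
When calling the REST endpoint directly, reference images travel as base64 strings inside inline_data parts. The sketch below builds such a parts list and enforces the five-subject limit from the heading above; `build_reference_parts` and `MAX_REFERENCE_SUBJECTS` are hypothetical names, and the limit is taken from this document rather than verified against the API.

```python
import base64
from typing import List, Tuple

# Limit taken from the section heading; adjust if the official docs differ.
MAX_REFERENCE_SUBJECTS = 5


def build_reference_parts(prompt: str,
                          images: List[Tuple[bytes, str]]) -> List[dict]:
    """Build request parts from a prompt and (image_bytes, mime_type) pairs."""
    if len(images) > MAX_REFERENCE_SUBJECTS:
        raise ValueError(
            f"at most {MAX_REFERENCE_SUBJECTS} reference images are supported here"
        )
    parts: List[dict] = [{"text": prompt}]
    for data, mime_type in images:
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                # JSON cannot carry raw bytes, so encode them as base64 text.
                "data": base64.b64encode(data).decode("ascii"),
            }
        })
    return parts
```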

Next.js API Route

// app/api/generate-image/route.ts
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const { prompt, aspectRatio = '1:1', imageSize = '2K' } = await request.json();

  try {
    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': process.env.GEMINI_API_KEY!,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          contents: [{ role: 'user', parts: [{ text: prompt }] }],
          generationConfig: {
            responseModalities: ['TEXT', 'IMAGE'],
            imageConfig: { aspectRatio, imageSize },
          },
        }),
      }
    );

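    const data = await response.json();
    if (!response.ok) {
      // Surface upstream API errors instead of silently returning a null image
      return NextResponse.json(
        { error: data.error?.message ?? 'Generation failed' },
        { status: response.status }
      );
    }
    const parts = data.candidates?.[0]?.content?.parts || [];
    // The REST API returns camelCase keys (inlineData, mimeType)
    const imagePart = parts.find((p: any) => p.inlineData);

    return NextResponse.json({
      image: imagePart ? {
        data: imagePart.inlineData.data,
        mimeType: imagePart.inlineData.mimeType,
        url: `data:${imagePart.inlineData.mimeType};base64,${imagePart.inlineData.data}`,
      } : null,
    });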
  } catch (error) {
    return NextResponse.json({ error: 'Generation failed' }, { status: 500 });
  }
}

Model Comparison

Feature	Nano Banana (2.5 Flash)	Nano Banana Pro (3 Pro Image)
Model ID	gemini-2.5-flash-image	gemini-3-pro-image-preview
Quality	Good	Best
Speed	Faster	Slower
Cost	Lower	Higher
Best For	Previews, high-volume	Production, professional

Resources

Documentation: https://ai.google.dev/gemini-api/docs/image-generation
Google AI Studio: https://aistudio.google.com
Prompt Guide: https://ai.google.dev/gemini-api/docs/prompting-intro

Source

https://github.com/Makiya1202/ai-agents-skills/blob/master/skills/nano-banana-pro/SKILL.md

Overview

Nano Banana Pro is the marketing name for Gemini 3 Pro Image (gemini-3-pro-image-preview). It enables generation of high-quality AI images via Google's Gemini API, suitable for professional visuals and image-generation features in apps.

How This Skill Works

Requests go to the Gemini API using the gemini-3-pro-image-preview model. You send a prompt plus optional settings: the response modalities (TEXT, IMAGE) and an image_config that sets aspect_ratio and image_size. The response contains text parts and image parts, with images delivered as base64-encoded data.
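
As a rough sketch of the last step, the function below separates text from images in a parsed REST response. It assumes the JSON uses camelCase keys (`inlineData`, `mimeType`) and also accepts the snake_case spelling as a fallback; `split_parts` is an illustrative name, not an SDK function.

```python
import base64
from typing import List, Tuple


def split_parts(response_json: dict) -> Tuple[List[str], List[Tuple[str, bytes]]]:
    """Return (texts, images) where each image is a (mime_type, raw_bytes) pair."""
    texts: List[str] = []
    images: List[Tuple[str, bytes]] = []
    candidates = response_json.get("candidates") or []
    parts = candidates[0].get("content", {}).get("parts", []) if candidates else []
    for part in parts:
        blob = part.get("inlineData") or part.get("inline_data")
        if blob:
            mime = (blob.get("mimeType") or blob.get("mime_type")
                    or "application/octet-stream")
            # Image data arrives as base64 text; decode it to raw bytes.
            images.append((mime, base64.b64decode(blob["data"])))
        elif "text" in part:
            texts.append(part["text"])
    return texts, images
```

Returning empty lists for a response without candidates lets the caller decide whether that counts as an error.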

When to Use It

Building AI image features powered by Gemini API
Generating professional product photos or marketing visuals
Creating infographics or illustrated content from prompts
Prototyping prompts and iterating with gemini-3-pro-image-preview
Integrating with Google image generation workflows and grounding

Quick Start

Step 1: Get API Key – Go to Google AI Studio and click Get API Key, then export as an environment variable
Step 2: Basic Image Generation (Python) – Use the Python client and call generate_content with gemini-3-pro-image-preview
Step 3: REST API (cURL) – POST to the Gemini endpoint with contents and generationConfig

Best Practices

Always specify response_modalities to include IMAGE when you need visuals
Use image_config to set aspect_ratio (e.g., 16:9) and image_size (e.g., 2K)
Enable google_search grounding when the image must reflect current or factual information
Handle and securely store base64-encoded images returned by the API
Secure API keys (prefer environment variables) and rotate credentials regularly
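
For the point about handling base64-encoded images, a minimal sketch of writing one to disk with an extension that matches its MIME type. The `EXTENSIONS` table and the `save_image` helper are assumptions made for illustration; extend the table for whatever formats the API actually returns in your setup.

```python
import base64
from pathlib import Path

# Hypothetical mapping; only the formats listed here are accepted.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_image(b64_data: str, mime_type: str, directory: str,
               stem: str = "output") -> Path:
    """Decode a base64 image and write it to directory/stem.<ext>."""
    extension = EXTENSIONS.get(mime_type)
    if extension is None:
        raise ValueError(f"unexpected MIME type: {mime_type!r}")
    target = Path(directory) / f"{stem}{extension}"
    # validate=True rejects strings containing non-base64 characters
    # instead of silently skipping them.
    target.write_bytes(base64.b64decode(b64_data, validate=True))
    return target
```

Rejecting unknown MIME types avoids saving unexpected content under an image file name.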

Example Use Cases

Python: Generate a serene Japanese garden image using gemini-3-pro-image-preview
REST API: Create a vibrant infographic about photosynthesis with TEXT and IMAGE
JavaScript: Build a product photo gallery with 16:9, 2K images
Grounded generation: Infographics of stock market trends using google_search grounding
Iterative editing: Refine prompts in a multi-turn session for style tweaks

Frequently Asked Questions
