What models are available?

The skill currently uses gemini-3-pro-image. To switch to another model, such as gemini-3-pro-image-preview, change the model name in generate.py.

What input formats are supported?

Supported input formats are PNG, JPG, JPEG, GIF, and WEBP.

What are the environment requirements?

Set GOOGLE_PROXY_API_KEY and GOOGLE_PROXY_BASE_URL, and run on Python 3.10 or later with the openai package installed.

Gemini Image Proxy

@YspCoder

npx machina-cli add skill @YspCoder/gemini-image-proxy --openclaw

Gemini Image Simple

Generate and edit images using Gemini 3 Pro Image via the OpenAI Python SDK and an OpenAI-compatible API endpoint.

Why This Skill

Feature	This Skill	Others (nano-banana-pro, etc.)
Dependencies	openai (SDK)	google-genai, pillow, etc.
Requires pip/uv	✅ Yes	✅ Yes
Works on Fly.io free	✅ Yes (with pip)	❌ Fails
Works in containers	✅ Yes (with pip)	❌ Often fails
Image generation	✅ Full	✅ Full
Image editing	✅ Yes	✅ Yes
Setup complexity	Install SDK + set API key	Install packages first

Bottom line: This skill uses the OpenAI SDK, so you must install openai once with pip.

Install

python3 -m pip install openai

Quick Start

# Set env
export GOOGLE_PROXY_API_KEY="your_api_key"
export GOOGLE_PROXY_BASE_URL="https://example.com/v1"

# Generate
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "A cat wearing a tiny hat" cat.png

# Edit existing image
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "Make it sunset lighting" edited.png --input original.png

Usage

Generate new image

python3 {baseDir}/scripts/generate.py "your prompt" output.png

Edit existing image

python3 {baseDir}/scripts/generate.py "edit instructions" output.png --input source.png

Supported input formats: PNG, JPG, JPEG, GIF, WEBP
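
The command shape above (a prompt, an output path, and an optional --input) could be parsed and checked along these lines. This is a minimal sketch, not the contents of generate.py: the names parse_args and SUPPORTED_SUFFIXES are hypothetical, and rejecting unsupported extensions before any request is sent is an assumption about desirable behavior.

```python
import argparse
from pathlib import Path

# Extensions taken from the supported-formats list above.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def parse_args(argv=None):
    """Parse `prompt output [--input SOURCE]` (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generate or edit an image.")
    parser.add_argument("prompt", help="text prompt or edit instructions")
    parser.add_argument("output", type=Path, help="where to write the result")
    parser.add_argument("--input", type=Path, default=None,
                        help="existing image to edit (omit to generate)")
    args = parser.parse_args(argv)
    # Compare case-insensitively so photo.JPG is treated like photo.jpg.
    if args.input is not None and args.input.suffix.lower() not in SUPPORTED_SUFFIXES:
        parser.error(f"unsupported input format: {args.input.suffix or '(none)'}")
    return args
```

When --input is omitted, args.input is None, which a caller could use to choose between generating and editing.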

Environment

Set these environment variables:

GOOGLE_PROXY_API_KEY (your API key)
GOOGLE_PROXY_BASE_URL (OpenAI-compatible base URL, e.g. https://example.com/v1)
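
A script can fail early with a clear message when either variable is missing. The following is a small sketch under that assumption; load_proxy_config is a hypothetical helper name and is not taken from generate.py.

```python
import os


def load_proxy_config(env=None):
    """Return (api_key, base_url), or raise if either variable is missing or empty."""
    env = os.environ if env is None else env
    names = ("GOOGLE_PROXY_API_KEY", "GOOGLE_PROXY_BASE_URL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["GOOGLE_PROXY_API_KEY"], env["GOOGLE_PROXY_BASE_URL"]
```

Accepting an optional mapping instead of always reading os.environ keeps the helper easy to exercise without touching the real environment.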

How It Works

Uses Gemini 3 Pro Image (gemini-3-pro-image) via the OpenAI Python SDK:

client.images.generate(...) for new images
client.images.edit(...) for edits
Requires the openai package

That's it. Works on any Python 3.10+ installation with openai installed.
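
The two SDK calls might be wired together roughly as follows. This is a sketch and not the actual generate.py: the helper names are hypothetical, and it assumes the proxy returns the image as base64 in the b64_json field of the response, which may not hold for every OpenAI-compatible endpoint. In the OpenAI Python SDK the edit method is named client.images.edit.

```python
import base64
import os
from pathlib import Path

MODEL = "gemini-3-pro-image"  # model name from this document


def make_client():
    """Build an OpenAI client pointed at the proxy (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily so the helpers below stay importable
    return OpenAI(
        api_key=os.environ["GOOGLE_PROXY_API_KEY"],
        base_url=os.environ["GOOGLE_PROXY_BASE_URL"],
    )


def save_first_image(response, output_path):
    """Decode the first image of a response (assumes base64 data in `b64_json`)."""
    b64 = response.data[0].b64_json
    if not b64:
        raise RuntimeError("response contained no base64 image data")
    Path(output_path).write_bytes(base64.b64decode(b64))


def generate_image(client, prompt, output_path):
    """Create a new image from a text prompt."""
    response = client.images.generate(model=MODEL, prompt=prompt)
    save_first_image(response, output_path)


def edit_image(client, prompt, input_path, output_path):
    """Edit an existing image according to the prompt."""
    with open(input_path, "rb") as image_file:
        response = client.images.edit(model=MODEL, image=image_file, prompt=prompt)
    save_first_image(response, output_path)
```

A call such as generate_image(make_client(), "A cat wearing a tiny hat", "cat.png") would then mirror the first Quick Start command.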

Model

Currently using: gemini-3-pro-image

Other available models (can be changed in generate.py if needed):

gemini-3-pro-image-preview - Preview variant
imagen-4.0-ultra-generate-001 - Imagen 4.0 Ultra
imagen-4.0-generate-001 - Imagen 4.0
gemini-2.5-flash-image - Gemini 2.5 Flash with image gen

Examples

# Landscape
python3 {baseDir}/scripts/generate.py "Misty mountains at sunrise, photorealistic" mountains.png

# Product shot
python3 {baseDir}/scripts/generate.py "Minimalist product photo of a coffee cup, white background" coffee.png

# Edit: change style
python3 {baseDir}/scripts/generate.py "Convert to watercolor painting style" watercolor.png --input photo.jpg

# Edit: add element
python3 {baseDir}/scripts/generate.py "Add a rainbow in the sky" rainbow.png --input landscape.png

Source

git clone https://clawhub.ai/YspCoder/gemini-image-proxy

Overview

Gemini Image Proxy lets you generate new images and edit existing ones using Gemini 3 Pro Image through the OpenAI Python SDK and an OpenAI-compatible API endpoint. This setup uses environment variables GOOGLE_PROXY_API_KEY and GOOGLE_PROXY_BASE_URL to authenticate and route requests to the proxy.

How This Skill Works

The skill calls Gemini 3 Pro Image through the OpenAI client: client.images.generate(...) for new images and client.images.edit(...) for edits. It needs the openai package, Python 3.10+ as the runtime, and GOOGLE_PROXY_API_KEY and GOOGLE_PROXY_BASE_URL set so that requests reach the proxy endpoint.

When to Use It

Generate a new image from a prompt (e.g., 'Misty mountains at sunrise').
Edit an existing image with new instructions (e.g., 'Convert to watercolor').
Integrate image generation into a Python project via an OpenAI-compatible endpoint.
Operate in containerized environments or Fly.io where dependencies are installed with pip.
Experiment with different Gemini image models (gemini-3-pro-image and variants).

Quick Start

Step 1: python3 -m pip install openai
Step 2: export GOOGLE_PROXY_API_KEY='your_api_key' and export GOOGLE_PROXY_BASE_URL='https://example.com/v1'
Step 3: Generate an image with python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "A cat wearing a tiny hat" cat.png, or edit an existing one with python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "Make it sunset lighting" edited.png --input original.png

Best Practices

Install the openai package (python3 -m pip install openai), pin it to a version you have tested, and upgrade it deliberately rather than automatically.
Securely set GOOGLE_PROXY_API_KEY and GOOGLE_PROXY_BASE_URL; verify the proxy is reachable.
Use the supported input formats (PNG, JPG, JPEG, GIF, WEBP) and provide a valid output path.
Pass --input only when you intend to edit an existing image; without it, the prompt generates a new image.
Ensure your runtime is Python 3.10+ and the environment can load the openai package.
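
The last point can be checked before any request is made. The sketch below is one possible preflight check; preflight_problems is a hypothetical name, and it only verifies the Python version and that the openai package can be found, not that the proxy is reachable.

```python
import importlib.util
import sys


def preflight_problems(min_version=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_version:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ required, found {found}"
        )
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec("openai") is None:
        problems.append(
            "the openai package is not installed (python3 -m pip install openai)"
        )
    return problems
```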

Example Use Cases

Landscape: 'Misty mountains at sunrise, photorealistic' → mountains.png
Product shot: 'Minimalist product photo of a coffee cup, white background' → coffee.png
Edit: 'Convert to watercolor painting style' → watercolor.png
Edit: 'Add a rainbow in the sky' → rainbow.png
Edit: 'Apply sunset lighting' → sunset.png

Frequently Asked Questions
