image-generation

npx machina-cli add skill shinpr/mcp-image/image-generation --openclaw

Image Generation Prompt Best Practices

Prompt Structure

Build every image generation prompt around three core elements:

1. SUBJECT (What)

The main focus of the image.

  • Physical characteristics: textures, materials, colors, scale
  • Actions, poses, expressions if applicable
  • Distinctive features that define the subject

2. CONTEXT (Where/When)

The environment and conditions.

  • Setting, background, spatial relationships (foreground, midground, background)
  • Time of day, weather, atmospheric conditions
  • Mood and emotional tone of the scene

3. STYLE (How)

The visual treatment.

  • Artistic or photographic approach: reference specific artists, movements, or styles
  • Lighting design: direction, quality, color temperature, shadows
  • Camera/lens choices: specify focal length, aperture, and shooting angle when photographic
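The three elements above can be sketched as a small data structure. This is only an illustration, not part of the skill itself: `PromptParts` and `compose_prompt` are hypothetical names, and joining the parts as sentences is one simple way to get a single flowing description.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the three core elements."""
    subject: str  # What: the main focus of the image
    context: str  # Where/When: environment and conditions
    style: str    # How: the visual treatment


def compose_prompt(parts: PromptParts) -> str:
    """Join the non-empty elements into one flowing description."""
    sentences = []
    for text in (parts.subject, parts.context, parts.style):
        text = text.strip().rstrip(".")
        if text:
            sentences.append(text + ".")
    return " ".join(sentences)


prompt = compose_prompt(PromptParts(
    subject="Golden retriever mid-leap catching a red frisbee",
    context="Sunlit urban park with soft morning light through oak trees",
    style="Shot from a low angle with motion blur on the paws",
))
print(prompt)
```

Keeping the parts separate until the last step makes it easy to see which element is still thin before the prompt is sent.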

Core Principles

  • Preserve intent — Enrich the user's original vision, never override it
  • Positive descriptions only — Describe what should be present; rephrase any exclusion as an inclusion
  • Specific over vague — "golden hour sunlight at 15° angle" beats "nice lighting"
  • Natural flow — Weave elements into a single flowing description, not a bullet list
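The "positive descriptions only" principle can be checked mechanically before a prompt is sent. The sketch below is a rough heuristic, assuming a hand-picked list of exclusion cue words; `find_exclusions` and `NEGATION_PATTERN` are hypothetical names, and the rephrasing itself is still left to the writer.

```python
import re

# Hypothetical list of exclusion cues; extend it for your own prompts.
NEGATION_PATTERN = re.compile(
    r"\b(no|not|without|never|don't|avoid)\b", re.IGNORECASE
)


def find_exclusions(prompt: str) -> list[str]:
    """Return the exclusion cue words found in a prompt, in order."""
    return [m.group(0).lower() for m in NEGATION_PATTERN.finditer(prompt)]


# A prompt that excludes things is flagged for rephrasing as an inclusion.
flagged = find_exclusions("A street with no cars, without people")
clean = find_exclusions("An empty street at dawn")
```

Here the first prompt is flagged and the second, which states the same idea as an inclusion, passes.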

Enhancement Patterns

Hyper-Specific Details

Add concrete visual details where the user left gaps:

  • Lighting → direction, quality, color temperature, shadow behavior
  • Textures → surface materials, weathering, reflectivity
  • Atmosphere → particulates, humidity, depth haze
  • Scale → relative sizes, distances, proportions

Camera Control Terminology

When a photographic look is appropriate:

  • Lens type: "shot with 85mm portrait lens", "wide-angle 24mm"
  • Aperture: "shallow depth of field at f/1.8", "deep focus at f/11"
  • Angle: "low angle emphasizing height", "bird's eye view"
  • Motion: "motion blur on the paws", "frozen mid-action"
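The camera terms above follow a regular pattern, so they can be generated from a few parameters. This is a minimal sketch: `camera_phrase` is a hypothetical helper, and the f/2.8 cutoff between "shallow depth of field" and "deep focus" is an illustrative assumption rather than a photographic rule.

```python
def camera_phrase(focal_length_mm: int, aperture: float, angle: str = "") -> str:
    """Build a camera clause from focal length, aperture, and optional angle."""
    # Assumption: treat f/2.8 and wider as shallow, anything narrower as deep.
    depth = "shallow depth of field" if aperture <= 2.8 else "deep focus"
    phrase = f"shot with {focal_length_mm}mm lens, {depth} at f/{aperture:g}"
    if angle:
        phrase += f", {angle}"
    return phrase


portrait = camera_phrase(85, 1.8)
landscape = camera_phrase(24, 11.0, "low angle emphasizing height")
```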

Atmospheric Enhancement

Convey mood through environmental details:

  • Emotional tone: "serene", "ominous", "jubilant"
  • Light quality: "dappled shadows", "harsh midday sun", "soft diffused overcast"
  • Weather/air: "morning mist", "dust particles in a sunbeam"

Text in Images

When the image should contain readable text (signs, labels, titles, typography):

  • Specify the exact text content in quotes: "OPEN 24 HOURS" in bold sans-serif
  • Describe visual treatment: font style, weight, size relative to the scene
  • Define placement and integration: "centered on the storefront awning", "hand-lettered on the chalkboard"

Feature Patterns

Character Consistency

When the same character must be recognizable across multiple images:

  • Include at least 3 recognizable visual markers (distinctive scar, signature clothing, unique hairstyle, characteristic accessory)
  • Use anchoring words: "distinctive", "signature", "always wears", "always has"
  • Be specific: "round tortoiseshell glasses" not just "glasses"
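One way to apply these rules consistently is to build the character description once and reuse it in every prompt of the series. The sketch below assumes a hypothetical `character_clause` helper; it enforces only the three-marker minimum and leaves the choice of anchoring words to the caller.

```python
MIN_MARKERS = 3  # from the guideline above: at least 3 recognizable markers


def character_clause(name: str, markers: list[str]) -> str:
    """Build a reusable character description from anchored visual markers."""
    if len(markers) < MIN_MARKERS:
        raise ValueError(f"need at least {MIN_MARKERS} markers, got {len(markers)}")
    *head, last = markers
    return f"{name} with {', '.join(head)}, and {last}"


fox = character_clause("a fox-like character", [
    "a signature scarf",
    "distinctive round tortoiseshell glasses",
    "a silver pendant it always wears",
])
```

Pasting the same `fox` string into each prompt keeps the markers identical from image to image.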

Compositional Integration (Multi-Element Blending)

When combining multiple visual elements in one scene:

  • Define spatial relationships with proportions: "foreground (40% of frame)", "midground", "background"
  • Use integration language: "seamlessly blending", "harmoniously composed", "naturally integrated"
  • Specify relative scale and interaction between elements

Real-World Accuracy

When depicting real places, cultures, or historical elements:

  • Use specific terminology: "traditional Edo-period architecture", "authentic Moroccan zellige tilework"
  • Include culturally accurate details
  • Reference geographical or historical specifics

Purpose-Driven Enhancement

Tailor the prompt to the intended use:

Purpose            | Emphasis
Product photo      | Clean background, studio lighting, commercial appeal
UI mockup          | Flat design elements, consistent spacing, screen-appropriate
Presentation slide | Bold composition, clear focal point, text-friendly layout
Social media       | Eye-catching, vibrant, crop-friendly aspect ratio
Book/album cover   | Typography space, dramatic mood, symbolic elements
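The purpose table lends itself to a simple lookup. In this sketch the emphasis strings are copied from the table, while the dictionary keys and the `add_purpose` helper are hypothetical; appending the emphasis as a final sentence is just one possible way to fold it into the prompt.

```python
# Emphasis phrases copied from the table; the keys are hypothetical identifiers.
PURPOSE_EMPHASIS = {
    "product_photo": "clean background, studio lighting, commercial appeal",
    "ui_mockup": "flat design elements, consistent spacing, screen-appropriate",
    "presentation_slide": "bold composition, clear focal point, text-friendly layout",
    "social_media": "eye-catching, vibrant, crop-friendly aspect ratio",
    "book_album_cover": "typography space, dramatic mood, symbolic elements",
}


def add_purpose(prompt: str, purpose: str) -> str:
    """Append the emphasis for a known purpose; return the prompt unchanged otherwise."""
    emphasis = PURPOSE_EMPHASIS.get(purpose)
    if emphasis is None:
        return prompt
    base = prompt.rstrip(". ")
    return f"{base}. {emphasis[0].upper()}{emphasis[1:]}."


mug = add_purpose("Ceramic mug on a wooden table.", "product_photo")
```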

Image Editing

When modifying an existing image:

  • Preserve the original's core characteristics: color palette, lighting style, composition
  • Use anchoring phrases: "maintain the existing...", "preserve the original...", "keep the same..."
  • Be specific about what to change vs what to keep unchanged
  • Describe modifications relative to the existing image, not from scratch

Example

Input: "A happy dog in a park"

Enhanced: "Golden retriever mid-leap catching a red frisbee, ears flying, tongue out in joy, in a sunlit urban park. Soft morning light filtering through oak trees creates dappled shadows on emerald grass. Background shows families on picnic blankets, slightly out of focus. Shot from low angle emphasizing the dog's athletic movement, with motion blur on the paws suggesting speed."

Source

https://github.com/shinpr/mcp-image/blob/main/skills/image-generation/SKILL.md

Overview

This skill teaches you to craft image prompts around a Subject-Context-Style structure. By focusing on What, Where/When, and How, you preserve the user's intent and replace vague wording with specific visual detail. The same structure applies to illustrations, photos, visual assets, and edits to existing images.

How This Skill Works

You compose a single flowing description that covers Subject, Context, and Style without splitting them into a list. Use enhancement patterns to fill gaps with hyper-specific lighting, textures, atmosphere, and, when appropriate, camera control terms. The resulting prompt reads naturally and gives the image generator concrete details to work from, which helps it realize the user's vision.

When to Use It

  • Designing a new character or concept art for a project
  • Generating photorealistic scenes with precise lighting and lens cues
  • Producing multi-element visuals with explicit foreground, midground, and background relationships
  • Creating visuals that include readable text, signs, or typography
  • Maintaining subject consistency across a series of images with identifiable markers

Quick Start

  1. Identify the Subject (What), Context (Where/When), and Style (How) for your image
  2. Add hyper-specific details to lighting, textures, atmosphere, and any needed camera terms
  3. If text is required, insert exact wording and define its placement within the scene

Best Practices

  • Start with a clear Subject, Context, and Style outline and weave them into one flowing prompt
  • Add hyper-specific details for lighting, textures, atmosphere, and scale to fill gaps
  • When photography is desired, incorporate camera control terminology (lens, aperture, angle, motion) as appropriate
  • If text must appear, specify exact content in quotes and its visual treatment
  • For consistent characters across images, include at least three recognizable markers (e.g., distinctive scarf, signature glasses, unique accessory)

Example Use Cases

  • "OPEN 24 HOURS" in bold sans-serif on a storefront awning, centered and legible in the scene
  • Portrait of a fox-like character with a signature scarf, round tortoiseshell glasses, and a silver pendant, shot with an 85mm portrait lens at f/1.8
  • A serene lakeside at sunrise with morning mist, soft diffused light, and gentle ripples in the water
  • A product shot on a clean white background with crisp shadows and warm studio lighting
  • A traditional Edo-period street scene with tiled roofs and wooden storefronts, atmospheric haze and detailed architecture
