
gemini-image-coder

Gemini image coder - Generate and edit images using Google's Gemini API. Supports text-to-image generation, image editing, multi-turn refinement, and composition from multiple reference images. Use when the user asks to generate, create, or edit images, or mentions "gemini image coder".

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/gemini-image-coder

SKILL.md

Gemini Image Generation

Generate and edit images using Google's Gemini API. Requires the GEMINI_API_KEY environment variable.

Quick Reference

Setting	Default	Options
Model	`gemini-3-pro-image-preview`	Use this for all generation
Resolution	1K	1K, 2K, 4K
Aspect Ratio	1:1	1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
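The options in the table can be checked before a request is sent. The following is a minimal sketch, not part of the skill itself: `validate_image_settings` and the two constant names are hypothetical, and the allowed values are copied from the table above.

```python
# Allowed values copied from the Quick Reference table.
ALLOWED_SIZES = ("1K", "2K", "4K")
ALLOWED_ASPECT_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def validate_image_settings(aspect_ratio: str = "1:1", image_size: str = "1K") -> dict:
    """Return the settings as a dict, or raise ValueError for an unknown value.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    if image_size not in ALLOWED_SIZES:
        raise ValueError(f"Unsupported image size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": image_size}


settings = validate_image_settings("16:9", "2K")
print(settings)
```

The returned keys match the `aspect_ratio` and `image_size` arguments used in the Custom Resolution & Aspect Ratio section, so the dict could be unpacked into `types.ImageConfig(**settings)`.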

CLI Scripts

Generate Image

bash

python scripts/generate_image.py "A cat in space" output.jpg
python scripts/generate_image.py "Epic landscape" landscape.jpg --aspect 16:9 --size 2K
python scripts/generate_image.py "Logo for Acme Corp" logo.jpg --aspect 1:1

Edit Image

bash

python scripts/edit_image.py input.jpg "Add a rainbow" output.jpg
python scripts/edit_image.py photo.jpg "Make it look like Van Gogh" artistic.jpg

Core API Pattern

python

import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Your prompt here"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.jpg")  # Always use .jpg!

Custom Resolution & Aspect Ratio

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="2K"
        ),
    )
)

Editing Images

python

# Reuses `client` and `types` from the Core API Pattern.
from PIL import Image

img = Image.open("input.jpg")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Add a sunset to this scene", img],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

Multi-Turn Refinement

python

chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE'])
)

response = chat.send_message("Create a logo for 'Acme Corp'")
# Save first image...

response = chat.send_message("Make the text bolder and add a blue gradient")
# Save refined image...
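The "save" steps left as comments above could be handled by one small helper that is called after each turn. This is a sketch under assumptions: `save_image_parts` is a hypothetical name, and it assumes each response part exposes `inline_data` with `data` (bytes) and `mime_type` (str). Stand-in objects replace a real response so the example runs without an API call.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace

# File extension chosen from the MIME type reported alongside the image bytes.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png", "image/webp": ".webp"}


def save_image_parts(parts, stem: str, out_dir: str = ".") -> list:
    """Write every image part to out_dir and return the paths written.

    Hypothetical helper; assumes part.inline_data has .data and .mime_type.
    """
    written = []
    for part in parts:
        blob = getattr(part, "inline_data", None)
        if blob is None or not blob.data:
            continue  # text-only part, nothing to save
        suffix = EXTENSIONS.get(blob.mime_type, ".bin")
        path = Path(out_dir) / f"{stem}_{len(written)}{suffix}"
        path.write_bytes(blob.data)  # bytes are written unchanged
        written.append(path)
    return written


# Stand-in parts (assumption: same attribute layout as real response parts).
fake_parts = [
    SimpleNamespace(inline_data=None),
    SimpleNamespace(
        inline_data=SimpleNamespace(data=b"\xff\xd8\xff", mime_type="image/jpeg")
    ),
]
with tempfile.TemporaryDirectory() as tmp:
    saved = save_image_parts(fake_parts, "logo_turn1", tmp)
    print([p.name for p in saved])
```

With a real chat, the call after each turn would look like `save_image_parts(response.parts, "logo_turn1")`, using a different stem per turn so earlier images are not overwritten.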

Prompting Best Practices

Style	Prompt Pattern
Photorealistic	Include camera: lens, lighting, angle, mood
Stylized Art	Specify style explicitly: "kawaii-style", "cel-shading"
Text in Images	Be explicit: font style, placement, colors
Product Mockups	Describe lighting setup and surface

Examples

# Photorealistic
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"

# Stylized
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"

# Logo with text
"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"

# Product mockup
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
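The photorealistic pattern from the table (lens, lighting, angle, mood) can be assembled programmatically so that no element is forgotten. A minimal sketch follows; `photorealistic_prompt` is a hypothetical helper, and the wording it produces is only one possible phrasing.

```python
def photorealistic_prompt(subject: str, lens: str, lighting: str,
                          angle: str, mood: str) -> str:
    """Join a subject with the camera details the table recommends.

    Hypothetical helper; empty details are skipped.
    """
    details = [lens, lighting, angle, mood]
    pieces = [subject] + [d for d in details if d]
    return "A photorealistic " + ", ".join(pieces)


prompt = photorealistic_prompt(
    subject="close-up portrait",
    lens="85mm lens",
    lighting="soft golden hour light",
    angle="eye-level angle",
    mood="calm mood",
)
print(prompt)
```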

Advanced Features

Google Search Grounding

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Visualize today's weather in Tokyo as an infographic"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

Multiple Reference Images (Up to 14)

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.jpg"),
        Image.open("person2.jpg"),
        Image.open("person3.jpg"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
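Because the heading caps reference images at 14, it may help to check the count before building `contents`. The sketch below assumes nothing about the SDK: `build_contents` and `MAX_REFERENCE_IMAGES` are hypothetical names, and the references can be any objects (for example the `Image.open(...)` results used above).

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in the section heading


def build_contents(prompt: str, references: list) -> list:
    """Return [prompt, *references], enforcing the reference-image limit.

    Hypothetical helper; it does not inspect the reference objects.
    """
    if not prompt.strip():
        raise ValueError("Prompt must not be empty")
    if len(references) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Got {len(references)} reference images; "
            f"the limit is {MAX_REFERENCE_IMAGES}"
        )
    return [prompt, *references]


# File names stand in for opened images so the sketch runs on its own.
contents = build_contents(
    "Create a group photo of these people in an office",
    ["person1.jpg", "person2.jpg", "person3.jpg"],
)
print(len(contents))
```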

Critical: File Format

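Gemini returns JPEG data by default, and saving writes those bytes as they are, so the file extension does not change the format. Use a .jpg extension unless you convert explicitly.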

python

# CORRECT
image.save("output.jpg")

# WRONG - causes "Image does not match media type" errors
image.save("output.png")  # Creates JPEG with PNG extension!

If PNG is Required

python

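import io

from PIL import Image

for part in response.parts:
    if part.inline_data:
        # Decode the returned bytes with Pillow, then re-encode them as PNG.
        img = Image.open(io.BytesIO(part.inline_data.data))
        img.save("output.png", format="PNG")  # Explicit conversion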

Verify Format

bash

file image.png
# If output shows "JPEG image data" - rename to .jpg!
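The same check can be done in Python by looking at the first bytes of the data: JPEG files begin with `FF D8 FF` and PNG files with the eight-byte signature `89 50 4E 47 0D 0A 1A 0A`. The helper names below are hypothetical, and only these two formats are recognised in this sketch.

```python
from typing import Optional

JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

# Extensions considered correct for each detected format.
EXTENSIONS = {"jpeg": (".jpg", ".jpeg"), "png": (".png",)}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return "jpeg" or "png" based on the leading bytes, else None."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None


def extension_matches(filename: str, data: bytes) -> bool:
    """True when the file name's extension agrees with the detected format."""
    fmt = sniff_image_format(data)
    if fmt is None:
        return False
    return filename.lower().endswith(EXTENSIONS[fmt])


jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 8  # stand-in for real image data
print(sniff_image_format(jpeg_bytes), extension_matches("output.png", jpeg_bytes))
```

For a file on disk, the bytes could be read with `pathlib.Path("image.png").read_bytes()` and passed to `extension_matches`.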

Notes

All generated images include SynthID watermarks
Default to 1K for speed; use 2K/4K when quality is critical
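For editing, describe changes conversationally; the model understands semantic masking, so no explicit mask is needed
Image-only output (response_modalities=['IMAGE']) won't work with Google Search grounding; keep 'TEXT' in response_modalities for grounded requests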
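The last note can be enforced with a small guard that runs before a grounded request is built. This is a sketch only: `check_modalities` is a hypothetical helper, and it encodes nothing beyond the rule stated in the note.

```python
def check_modalities(response_modalities: list, use_search_grounding: bool) -> list:
    """Return normalised modalities, rejecting image-only output with grounding.

    Hypothetical helper based on the note above.
    """
    modalities = [m.upper() for m in response_modalities]
    if use_search_grounding and "TEXT" not in modalities:
        raise ValueError(
            "Google Search grounding needs 'TEXT' in response_modalities"
        )
    return modalities


print(check_modalities(["text", "image"], use_search_grounding=True))
```

The returned list could then be passed as `response_modalities` in `types.GenerateContentConfig`, as in the grounding example.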

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/development/gemini-image-coder
License: MIT License
