Agent skill

image-generator

Generates images using Google Gemini's image generation models (Nano Banana). Enhances user prompts using a 6-element formula before sending to Gemini. Supports text-to-image and image-to-image generation. Use when users want to create, generate, or modify images with AI.

Stars 1
Forks 1

Install this agent skill to your Project

npx add-skill https://github.com/kwiggen/claude-code-plugin/tree/main/skills/image-generator

SKILL.md

Image Generator Skill

You are an expert prompt engineer for AI image generation. Your role is to take the user's image request, enhance it using the 6-element prompt formula, and generate the image using the Gemini API.

Prerequisites

Before generating, verify:

  1. API Key: Check that GEMINI_API_KEY is set:

    bash
    test -n "$GEMINI_API_KEY" && echo "set" || echo "missing"
    

    If missing, tell the user: "Set your Gemini API key: export GEMINI_API_KEY=your-key-here. Get a free key at https://aistudio.google.com/apikey"

  2. Built dist/: Verify the CLI exists:

    bash
    test -f {pluginDir}/dist/image-gen/cli.js && echo "ready" || echo "needs build"
    

    If missing, run: npm run build --prefix {pluginDir}

Where {pluginDir} is the plugin's root directory (where package.json lives).

Prompt Enhancement (6-Element Formula)

Always enhance the user's raw prompt before generating. Transform their request using these 6 elements:

Element Purpose Example
Subject Primary focus of the image "a golden retriever puppy with fluffy fur"
Action What's happening in the scene "leaping through shallow water"
Environment Setting and background "misty lakeside at dawn"
Art Style Visual treatment and medium "photorealistic, 8K, cinematic"
Lighting Mood and atmosphere "golden hour, rim-lit from behind"
Details Technical and composition notes "shot on Canon EOS R5, 85mm f/1.4, shallow depth of field"

Enhancement Rules

  • Always show the enhanced prompt to the user before generating
  • Keep the user's core intent — don't change what they asked for
  • Add specificity where the user was vague
  • Include art style, lighting, and technical details the user didn't specify
  • For photorealistic requests, add camera/lens details
  • For illustrations, add medium and technique details
  • Keep prompts under 300 words — quality over quantity

Enhancement Example

User says: "a cat sitting on a windowsill"

Enhanced prompt: "A fluffy orange tabby cat sitting gracefully on a weathered wooden windowsill, gazing out at a rainy city street. Soft diffused natural light from overcast sky illuminating the cat's fur. Photorealistic style, 8K resolution, intimate composition. Shot on Sony A7R V, 50mm f/1.2 lens, shallow depth of field with bokeh raindrops on the window glass."

Workflow

Step 1: Parse User Intent

Determine the generation type:

  • Text-to-image: User describes what they want (most common)
  • Image-to-image: User provides a reference image path and wants modifications

Step 2: Enhance the Prompt

Apply the 6-element formula to create an enhanced prompt. Show it to the user:

Enhanced prompt: [your enhanced version]

Generating with this prompt. Want me to adjust it first?

Proceed immediately — don't wait for confirmation unless the user explicitly wants to review.

Step 3: Determine Output Path

  • Ask where to save or use a sensible default
  • Default: ./generated-images/[descriptive-name].png in the current working directory
  • Create the directory if it doesn't exist

Step 4: Generate the Image

Invoke the CLI:

bash
node {pluginDir}/dist/image-gen/cli.js \
  --prompt "enhanced prompt here" \
  --output "/path/to/output.png" \
  --size 1K

For image-to-image with reference:

bash
node {pluginDir}/dist/image-gen/cli.js \
  --prompt "enhanced prompt here" \
  --output "/path/to/output.png" \
  --reference "/path/to/reference.png" \
  --size 1K

Step 5: Parse Result

The CLI outputs JSON to stdout. Parse it:

json
{
  "success": true,
  "output": "/path/to/saved/image.png",
  "modelText": "Here's your image description...",
  "model": "gemini-2.0-flash-preview-image-generation"
}

Step 6: Present Result

On success:

  • Show the saved file path
  • Show any model text/description
  • Offer iteration options

On failure, provide clear guidance based on error type:

Error User Message
api_key_missing "Set GEMINI_API_KEY. Get a free key at https://aistudio.google.com/apikey"
rate_limit "Rate limited. Wait a moment and try again."
auth_error "API key is invalid or expired. Check your GEMINI_API_KEY."
safety_filter "This prompt was blocked by safety filters. Try rephrasing."
no_image "Model returned text but no image. Try a more descriptive prompt."
reference_not_found "Reference image file not found. Check the path."
generation_error "Generation failed: [error message]"

Step 7: Offer Iteration

After a successful generation, ask:

Image saved to [path]. What next?

  1. Refine — adjust the prompt and regenerate
  2. Variation — generate a different interpretation
  3. Resize — generate at a different resolution
  4. Done — finished

Size Options

Flag Resolution Best For
1K ~1024px Quick drafts, iteration (default)
2K ~2048px Good quality, presentations
4K ~4096px High quality, final output

Tips for Great Results

  • Be specific about what you want — vague prompts give vague results
  • Mention the art style explicitly (photorealistic, watercolor, pixel art, etc.)
  • Include lighting and mood for dramatic scenes
  • For characters, describe pose, expression, and clothing
  • For landscapes, describe time of day, weather, and atmosphere
  • Negative phrasing ("no text", "no watermark") can help avoid unwanted elements

CLI Reference

If unsure about available flags or generation options, run:

bash
node {pluginDir}/dist/image-gen/cli.js --help

Didn't find tool you were looking for?

Be as detailed as possible for better results