Agent skill

image-generation

Generate images using Google Gemini (gemini-3-pro-image-preview). Requires GEMINI_API_KEY.

Install this agent skill in your project:

npx add-skill https://github.com/SawyerHood/middleman/tree/main/apps/backend/src/swarm/skills/builtins/image-generation

SKILL.md

Image Generation

Generate images using Google Gemini (gemini-3-pro-image-preview) for both text-to-image and image-to-image workflows.

Use the packaged CLI:

```bash
middleman image generate \
  --prompt "a cute robot bee in a garden" \
  --output "/path/to/output.png"
```

Image-to-image generation is supported with repeated --input-image flags:

```bash
middleman image generate \
  --prompt "turn this sketch into a painted poster with a limited teal and coral palette" \
  --input-image "/path/to/sketch.png" \
  --input-image "/path/to/reference.jpg" \
  --output "/path/to/output.png"
```

Options

--prompt (required): text description of the image to generate
--output (required): output file path (if the path has no extension, one is auto-detected)
--input-image (optional, repeatable): local source image(s) to send alongside the prompt
--aspect-ratio (optional): aspect ratio like 16:9, 1:1, 4:3
--size (optional): image size, default 1K
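
When the CLI is driven from another program, the options above map naturally onto an argument list. The following is a minimal sketch in Python, not part of the skill itself: the helper name `build_generate_command` is hypothetical, and it assumes `middleman` is available on `PATH`. It uses only the flags documented above and emits one `--input-image` flag per source image.

```python
from typing import List, Optional, Sequence


def build_generate_command(
    prompt: str,
    output: str,
    input_images: Sequence[str] = (),
    aspect_ratio: Optional[str] = None,
    size: Optional[str] = None,
) -> List[str]:
    """Assemble argv for `middleman image generate` (hypothetical helper)."""
    if not prompt:
        raise ValueError("--prompt is required")
    if not output:
        raise ValueError("--output is required")

    cmd = ["middleman", "image", "generate", "--prompt", prompt]
    # --input-image is repeatable: emit one flag per source image.
    for path in input_images:
        cmd += ["--input-image", path]
    # Optional flags are left out entirely so the CLI applies its own defaults.
    if aspect_ratio is not None:
        cmd += ["--aspect-ratio", aspect_ratio]
    if size is not None:
        cmd += ["--size", size]
    cmd += ["--output", output]
    return cmd
```

Passing the result as a list to a process launcher avoids shell quoting issues with prompts that contain spaces or quotes.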

Output

The command prints JSON:

Success: { "ok": true, "file": "/path/to/output.png", "mimeType": "image/png" }
Failure: { "ok": false, "error": "..." }
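
A caller can branch on the `ok` field to separate the two cases. The sketch below is one possible way to do that in Python; the names `parse_result`, `run_generate`, and `ImageGenerationError` are hypothetical. It assumes the JSON is written to stdout, which the text above does not state explicitly, and it does not rely on the process exit code because that is not documented.

```python
import json
import subprocess
from typing import Sequence


class ImageGenerationError(RuntimeError):
    """Raised when the CLI reports { "ok": false, ... } (hypothetical type)."""


def parse_result(stdout: str) -> dict:
    """Parse the JSON printed by the CLI and raise on a reported failure."""
    result = json.loads(stdout)
    if not isinstance(result, dict) or "ok" not in result:
        raise ValueError("unexpected output shape: %r" % (result,))
    if not result["ok"]:
        raise ImageGenerationError(result.get("error", "unknown error"))
    return result


def run_generate(cmd: Sequence[str]) -> dict:
    """Run the command and return the parsed success object.

    Assumption: the JSON object is the only content on stdout.
    """
    completed = subprocess.run(list(cmd), capture_output=True, text=True)
    return parse_result(completed.stdout)
```

On success the returned dictionary carries the `file` and `mimeType` fields shown above, so the caller can pick up the written image from `result["file"]`.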

Maintainer

SawyerHood Core maintainer

Source details

Full Name: SawyerHood/middleman
Branch: main
Path in repo: apps/backend/src/swarm/skills/builtins/image-generation
License: Apache License 2.0
Topics: claude-code ai-agents agents
