Agent skill
video-generation
Gemini video generation with Veo 3.1 via the Python SDK. Use when generating videos from text or images, using reference images, first/last frame interpolation, or video extension, and when tuning Veo parameters (aspect ratio, resolution, duration, negative prompts, personGeneration, seed).
Install this agent skill to your Project
npx add-skill https://github.com/Xiangyu-CAS/Vision-Skills/tree/main/skills/video-generation
SKILL.md
Video Generation with Gemini (Veo 3.1)
Use this skill when the user asks to generate or extend videos with Gemini using the Python SDK.
Default to veo-3.1-fast-generate-preview, resolution="720p", and duration_seconds=4, unless the user asks otherwise or the task requires different settings (e.g., extension, interpolation, reference images, 1080p/4k).
Workflow
- Identify the task type: text-to-video, image-to-video, reference images, first/last frames (interpolation), or video extension.
- Ensure GEMINI_API_KEY is available (env or local .env), then use the Python SDK.
- When using images, pass types.Image(imageBytes=..., mimeType=...) (not PIL.Image or types.Part) to avoid input type errors.
- Call client.models.generate_videos(...) with the correct inputs/config (see references).
- Poll the operation until done, then download and save the video.
- If no videos are returned, surface a clear error and suggest checking the API key, model, and config.
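The call, poll, and save steps above can be sketched as a single helper. This is a minimal sketch, not the skill's own code: the function name `run_generation`, the timeout values, and the error wording are assumptions made for illustration. The client is passed in rather than constructed here, and is expected to be a `genai.Client()` from the `google-genai` package.

```python
import time


def run_generation(client, *, model, prompt, config=None, out_path="output.mp4",
                   poll_seconds=10.0, timeout_seconds=900.0, **inputs):
    """Start a Veo job, poll until it is done, then save the first video.

    `client` is expected to be a google-genai `genai.Client()`.
    `inputs` carries optional keyword inputs such as `image=` or `video=`.
    The helper name and the timeout defaults are illustrative assumptions.
    """
    operation = client.models.generate_videos(
        model=model, prompt=prompt, config=config, **inputs
    )

    # Video generation can take minutes, so poll with a generous deadline.
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() > deadline:
            raise TimeoutError(f"Generation not finished after {timeout_seconds}s")
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)

    response = operation.response
    videos = getattr(response, "generated_videos", None) if response else None
    if not videos:
        # Surface a clear error instead of failing later on an empty list.
        raise RuntimeError(
            "No videos were returned; check GEMINI_API_KEY, the model name, "
            "and the config."
        )

    video = videos[0].video
    client.files.download(file=video)  # fetch the bytes for the generated file
    video.save(out_path)               # write them to disk
    return out_path
```

Assuming `google-genai` is installed and GEMINI_API_KEY is set, a call might look like `run_generation(genai.Client(), model="veo-3.1-fast-generate-preview", prompt="A paper boat drifting down a rainy street")`.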
Use these references (by task type)
- Common setup and workflow: references/overview.md
- Parameters and constraints: references/parameters.md
- Model versions and limits: references/model-versions-and-limitations.md
- Prompting guidance: references/prompt-guide.md
Task types
- Text-to-video: examples/text-to-video.md
- Image-to-video: examples/image-to-video.md
- Reference images: examples/reference-images.md
- First/last frames (interpolation): examples/first-last-frames.md
- Video extension: examples/video-extension.md
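For the image-based task types above, the skill asks for raw bytes plus a MIME type rather than a PIL.Image. A minimal sketch of preparing those two fields follows. The helper name `load_image_fields` is hypothetical, and guessing the MIME type from the file extension is an assumption made for this example.

```python
import mimetypes
from pathlib import Path


def load_image_fields(path):
    """Read an image file and return the two fields types.Image expects.

    Hypothetical helper: the MIME type is guessed from the file extension,
    which is an assumption of this sketch.
    """
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    return {"imageBytes": Path(path).read_bytes(), "mimeType": mime_type}
```

With the SDK imported as `from google.genai import types`, the result can be unpacked as `types.Image(**load_image_fields("frame.png"))` and passed as the `image=` input.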
Tuning examples
- Aspect ratio: examples/aspect-ratio.md
- Resolution (4k): examples/resolution.md
- Negative prompt: examples/negative-prompt.md
Defaults and notes
- Default model: veo-3.1-fast-generate-preview.
- Default output: 720p, 4 seconds.
- For image inputs, always provide imageBytes + mimeType via types.Image to prevent INVALID_ARGUMENT errors.
- 1080p/4k, reference images, interpolation, and video extension require duration_seconds=8.
- Video extension is limited to 720p inputs and requires a video from a previous Veo generation.
- Video generation can take minutes; allow longer timeouts when running commands.
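The defaults and the 8-second rule above can be captured in a small settings resolver. This is only a sketch: the function name, the task labels (borrowed from the example file names), and the choice to raise the duration to 8 automatically instead of rejecting the request are assumptions, not behavior defined by the skill.

```python
DEFAULT_MODEL = "veo-3.1-fast-generate-preview"

# Task labels are hypothetical; they mirror the example file names.
EIGHT_SECOND_TASKS = {"reference-images", "first-last-frames", "video-extension"}
EIGHT_SECOND_RESOLUTIONS = {"1080p", "4k"}


def resolve_settings(task="text-to-video", resolution="720p", duration_seconds=4):
    """Apply the skill's defaults and its duration/resolution constraints."""
    if task == "video-extension" and resolution != "720p":
        raise ValueError("Video extension is limited to 720p inputs.")
    if task in EIGHT_SECOND_TASKS or resolution in EIGHT_SECOND_RESOLUTIONS:
        # These modes require duration_seconds=8, so override the request.
        duration_seconds = 8
    return {
        "model": DEFAULT_MODEL,
        "resolution": resolution,
        "duration_seconds": duration_seconds,
    }
```

The returned `resolution` and `duration_seconds` values can then be passed to `types.GenerateVideosConfig(...)`, and `model` to `client.models.generate_videos(...)`.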