AI Video Generation Mastery
Text-to-Video, Image-to-Video, and Video Editing with AI (2025-2026)
Quick Reference
| Platform | Best For | Max Duration | API | Cost Range |
|---|---|---|---|---|
| Sora 2 | Photorealism, complex motion | 20s (Plus), 60s (Pro) | OpenAI | $20-200/mo subscription |
| Runway Gen-4 | Professional, consistent | 10s | Yes | $0.05-0.10/second |
| Pika Labs 2.5 | Speed, effects | 5s (extendable) | Yes | $0.20-0.50/video |
| Kling 2.0 | Alternative, good value | 10s | Limited | Freemium |
| Luma Dream Machine | Fast iteration | 5s | Yes | Credits-based |
Platform Comparison
Feature Matrix
| Feature | Sora 2 | Runway Gen-4 | Pika 2.5 | Kling 2.0 |
|---|---|---|---|---|
| Text-to-Video | Yes | Yes | Yes | Yes |
| Image-to-Video | Yes | Yes | Yes | Yes |
| Video-to-Video | Yes | Yes | Partial | No |
| Camera Controls | Advanced | Advanced | Basic | Basic |
| Motion Brush | Yes | Yes | No | No |
| Lip Sync | Yes | No | Yes | Yes |
| Audio Generation | Yes | No | Yes (SFX) | No |
| Storyboard Mode | Yes | Multi-shot | No | No |
| Resolution | 1080p | 4K | 1080p | 1080p |
| API Access | OpenAI | Yes | Yes | Limited |
When to Use Each
| Use Case | Recommended Platform |
|---|---|
| Cinematic quality, complex scenes | Sora 2 |
| Professional production, API integration | Runway Gen-4 |
| Quick iterations, social content | Pika Labs 2.5 |
| Budget-conscious, testing | Kling 2.0 |
| Rapid prototyping | Luma Dream Machine |
Sora 2 (OpenAI)
Overview
OpenAI's flagship video generation model. Best for photorealistic output and complex scene understanding.
Access: ChatGPT Plus ($20/mo) or Pro ($200/mo)
Capabilities
| Feature | Description |
|---|---|
| Text-to-Video | Generate from detailed prompts |
| Image-to-Video | Animate static images |
| Video-to-Video | Remix, edit, extend existing videos |
| Blend Mode | Combine two video sources |
| Re-cut | Edit and re-render existing videos |
| Storyboard | Multi-shot timeline planning |
Prompt Engineering
Effective Prompt Structure:
[Scene Description] + [Camera Movement] + [Lighting] + [Style] + [Duration Hint]
Example Prompts:
# Cinematic Scene
A woman with dark skin and wavy hair walks through a neon-lit Tokyo alley
at night. Camera follows from behind, tracking shot. Cyberpunk aesthetic,
moody lighting with pink and blue neon reflections on wet pavement.
# Product Shot
A sleek smartphone rotates slowly on a white marble surface. Camera orbits
around the device. Studio lighting with soft shadows, minimalist aesthetic,
commercial quality.
# Nature Documentary
A monarch butterfly emerging from its chrysalis in extreme close-up.
Time-lapse style, macro photography, natural morning light filtering
through leaves.
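The prompt structure above can be assembled programmatically so every shot follows the same component order. This is a minimal sketch: `build_prompt` and its parameter names are hypothetical helpers, not part of any platform SDK.

```python
from typing import Optional


def build_prompt(scene: str, camera: str, lighting: str, style: str,
                 duration_hint: Optional[str] = None) -> str:
    """Join prompt components in the order: scene, camera, lighting, style, duration."""
    parts = [scene, camera, lighting, style, duration_hint]
    # Skip empty components, strip whitespace and trailing periods,
    # then end each remaining component with a single period.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return " ".join(f"{p}." for p in cleaned)


prompt = build_prompt(
    scene="A sleek smartphone rotates slowly on a white marble surface",
    camera="Camera orbits around the device",
    lighting="Studio lighting with soft shadows",
    style="Minimalist aesthetic, commercial quality",
)
print(prompt)
```

Keeping the components as separate arguments makes it easy to change only the camera or lighting between iterations while the rest of the prompt stays fixed.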
Best Practices
- Be specific about movement - Describe camera and subject motion explicitly
- Specify lighting - "Golden hour", "studio lighting", "neon", etc.
- Reference styles - "cinematic", "documentary", "commercial", "music video"
- Duration awareness - Shorter clips generate faster and stay more coherent
- Iterate - Use re-cut and blend for refinement
Limitations
- Max 20 seconds (Plus), 60 seconds (Pro)
- Text rendering still imperfect
- Human hands/fingers can have artifacts
- No direct API (use ChatGPT interface or Sora interface)
- Limited exports per month based on subscription
Runway Gen-3 / Gen-4
Overview
Industry-standard for professional video production. Best API support and control options.
Pricing:
- Standard: $0.05/second generated
- Turbo: $0.02/second (lower quality, faster)
- Unlimited plan: $96/month
Gen-4 Features
| Feature | Description |
|---|---|
| Extended Duration | Up to 40 seconds per generation |
| Multi-Shot | Plan and generate connected shots |
| Camera Controls | Pan, tilt, zoom, dolly, orbit |
| Motion Brush | Paint motion onto specific areas |
| Structure Reference | Maintain subject consistency |
API Integration
# Runway Gen-4 Python SDK
# Parameter names follow this guide's examples; verify them against the
# SDK version you have installed before relying on them.
import base64
import time

import runwayml

client = runwayml.RunwayML()  # Reads the API key from the environment

# Text-to-Video
text_task = client.text_to_video.create(
    model="gen4",
    prompt="A serene lake at sunrise, mist rising from the water, camera slowly pushes forward",
    duration=10,
    aspect_ratio="16:9",
    resolution="1080p"
)

# Poll for completion
while text_task.status not in ["SUCCEEDED", "FAILED"]:
    time.sleep(5)
    text_task = client.tasks.retrieve(text_task.id)
    print(f"Status: {text_task.status}")

if text_task.status == "SUCCEEDED":
    video_url = text_task.output[0]
    print(f"Video ready: {video_url}")

# Image-to-Video
# Assumption: the image is passed as a base64 data URI rather than raw bytes
with open("scene.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")
image_data = f"data:image/png;base64,{encoded}"

image_task = client.image_to_video.create(
    model="gen4",
    prompt_image=image_data,
    prompt_text="Camera slowly pans right, birds fly across the sky",
    duration=10,
    aspect_ratio="16:9"
)

# With Camera Controls
controlled_task = client.image_to_video.create(
    model="gen4",
    prompt_image=image_data,
    prompt_text="Gentle wind moves the grass",
    camera_motion={
        "horizontal": 0.3,  # Pan right
        "vertical": 0.0,    # No tilt
        "zoom": 0.1,        # Slight zoom in
        "roll": 0.0         # No roll
    },
    duration=10
)
Camera Motion Parameters
| Parameter | Range | Effect |
|---|---|---|
| horizontal | -1.0 to 1.0 | Pan left/right |
| vertical | -1.0 to 1.0 | Tilt up/down |
| zoom | -1.0 to 1.0 | Zoom out/in |
| roll | -1.0 to 1.0 | Rotate camera |
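Since every camera parameter shares the same -1.0 to 1.0 range, it can help to validate a motion dictionary before sending it. The sketch below assumes the four axis names from the table; `normalize_camera_motion` is a hypothetical helper, not an SDK function.

```python
CAMERA_AXES = ("horizontal", "vertical", "zoom", "roll")


def normalize_camera_motion(motion: dict) -> dict:
    """Return a complete camera-motion dict with every axis clamped to [-1.0, 1.0]."""
    unknown = set(motion) - set(CAMERA_AXES)
    if unknown:
        raise ValueError(f"Unknown camera axes: {sorted(unknown)}")
    # Axes that are not supplied default to 0.0 (no movement on that axis).
    return {
        axis: max(-1.0, min(1.0, float(motion.get(axis, 0.0))))
        for axis in CAMERA_AXES
    }


motion = normalize_camera_motion({"horizontal": 0.3, "zoom": 1.7})
print(motion)
```

Out-of-range values are clamped rather than rejected here; raising an error instead is an equally reasonable choice if silent correction would hide prompt mistakes.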
Motion Brush Workflow
- Upload source image
- Paint mask over areas to animate
- Describe motion for masked areas
- Generate with motion applied only to selection
Pika Labs 2.5
Overview
Fast, cost-effective video generation with unique features like lip sync and sound effects.
Pricing:
- Free tier: 250 credits/month
- Basic: $8/month (700 credits)
- Pro: $28/month
- Unlimited: $58/month
Key Features
| Feature | Description |
|---|---|
| Lip Sync | Sync character lips to audio |
| Sound Effects | AI-generated SFX matching video |
| Modify Region | Edit specific parts of video |
| Expand Canvas | Outpaint video frames |
| Pikaffects | Special effects library |
API Integration
import requests
import time

PIKA_API_KEY = "your-api-key"
PIKA_BASE_URL = "https://api.pika.art/v1"

headers = {
    "Authorization": f"Bearer {PIKA_API_KEY}",
    "Content-Type": "application/json"
}

# Text-to-Video
response = requests.post(
    f"{PIKA_BASE_URL}/generate",
    headers=headers,
    json={
        "prompt": "A cat playing piano in a jazz club, cinematic lighting",
        "aspect_ratio": "16:9",
        "motion_strength": 3,  # 1-5 scale
        "guidance_scale": 12,
        "negative_prompt": "blurry, distorted, low quality"
    }
)
response.raise_for_status()  # Stop early on HTTP errors
task_id = response.json()["task_id"]

# Poll for result
while True:
    status_response = requests.get(
        f"{PIKA_BASE_URL}/tasks/{task_id}",
        headers=headers
    )
    status = status_response.json()
    if status["status"] == "completed":
        video_url = status["output"]["video_url"]
        break
    elif status["status"] == "failed":
        raise RuntimeError(f"Generation failed: {status['error']}")
    time.sleep(3)

# Image-to-Video with motion
# No Content-Type header here: requests sets the multipart boundary itself
with open("image.png", "rb") as f:
    files = {"image": f}
    data = {
        "prompt": "The character turns and smiles",
        "motion_strength": 2,
        "fps": 24
    }
    # The upload must happen while the file is still open
    response = requests.post(
        f"{PIKA_BASE_URL}/image-to-video",
        headers={"Authorization": f"Bearer {PIKA_API_KEY}"},
        files=files,
        data=data
    )
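The polling loop above runs until the task completes or fails, so a stuck task would block forever. A possible refinement is a reusable helper with a timeout. This sketch assumes the same "completed" and "failed" status strings used in the example; `wait_for_task` and the `fetch_status` callable are hypothetical names.

```python
import time
from typing import Callable


def wait_for_task(fetch_status: Callable[[], dict],
                  interval: float = 3.0,
                  timeout: float = 600.0) -> dict:
    """Call fetch_status() until the task completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("status") == "completed":
            return status
        if status.get("status") == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still pending after {timeout} seconds")
        time.sleep(interval)


# Demo with a fake status source that completes on the third call.
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "output": {"video_url": "https://example.com/video.mp4"}},
])
result = wait_for_task(lambda: next(responses), interval=0.01, timeout=5)
print(result["output"]["video_url"])
```

In real use, `fetch_status` would wrap the `requests.get(...).json()` call for the task, which keeps the waiting logic testable without network access.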
Pikaffects (Special Effects)
| Effect | Description |
|---|---|
| Melt | Object melts into liquid |
| Explode | Particle explosion |
| Inflate | Object inflates like balloon |
| Crush | Object gets crushed |
| Cake-ify | Transform into cake |
| Squish | Squeeze and release |
Kling 2.0
Overview
Chinese-developed alternative with competitive quality and pricing. Good for budget-conscious projects.
Access: Web interface, limited API
Features
| Feature | Description |
|---|---|
| Motion Templates | Pre-built motion patterns |
| Style Transfer | Apply artistic styles |
| Character Animation | Consistent character motion |
| Inpainting | Edit specific regions |
Best For
- Testing and prototyping
- Budget-conscious production
- Simple animations
- Social media content
Limitations
- API access requires application
- Documentation primarily in Chinese
- Some features geo-restricted
- Processing queue can be slow
Workflows
Text-to-Video Workflow
1. CONCEPT
├── Define scene objective
├── Write detailed description
└── Choose platform based on needs
2. PROMPT ENGINEERING
├── Subject: Who/what is in the scene
├── Action: What happens
├── Setting: Where, when
├── Camera: Movement, angle
├── Style: Aesthetic, mood
└── Technical: Duration, aspect ratio
3. GENERATION
├── Start with shorter duration (5s)
├── Iterate on prompt
├── Try variations
└── Select best result
4. REFINEMENT
├── Extend if needed
├── Apply fixes (inpainting)
├── Color grade
└── Add audio
5. POST-PRODUCTION
├── Upscale if needed
├── Add music/SFX
├── Export in required format
└── Archive prompts and settings
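The last step of the workflow, archiving prompts and settings, can be as simple as appending one JSON line per generation. The following is a sketch under that assumption; the record fields and the function names `archive_generation` and `load_archive` are illustrative, not a prescribed format.

```python
import json
import tempfile
import time
from pathlib import Path


def archive_generation(log_path, platform, prompt, settings, output_url=None):
    """Append one generation record as a JSON line and return the record."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "platform": platform,
        "prompt": prompt,
        "settings": settings,
        "output_url": output_url,
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_archive(log_path):
    """Read every archived record back as a list of dicts."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Demo in a temporary directory so nothing is written to the project folder.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "archive" / "generations.jsonl"
    archive_generation(log, "Runway", "A serene lake at sunrise",
                       {"duration": 5, "aspect_ratio": "16:9"})
    archive_generation(log, "Pika", "A cat playing piano",
                       {"motion_strength": 3}, "https://example.com/cat.mp4")
    records = load_archive(log)

print(len(records), records[1]["platform"])
```

An append-only log keeps failed attempts alongside successful ones, which is useful when comparing prompt variations later.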
Image-to-Video Workflow
1. IMAGE PREPARATION
├── High resolution (min 1024px)
├── Clean composition
├── Consider what will move
└── Remove artifacts
2. MOTION PLANNING
├── What moves: subject, background, camera
├── Direction of movement
├── Speed/intensity
└── Duration needed
3. GENERATION
├── Upload image
├── Describe motion in prompt
├── Set camera controls if available
└── Generate and review
4. ITERATION
├── Adjust motion strength
├── Try different angles
├── Combine multiple generations
└── Use motion brush for precision
Multi-Shot Production
1. SCRIPT BREAKDOWN
├── Write or obtain script
├── Break into scenes
├── Break scenes into shots
└── Note required visuals per shot
2. SHOT LIST
Shot # | Description | Duration | Camera | Platform
01 | Estab. wide | 5s | Static | Runway
02 | Subject enters | 10s | Track right | Sora
03 | Close-up face | 5s | Push in | Runway
04 | POV shot | 5s | Handheld | Pika
3. GENERATION ORDER
├── Generate establishing shots first
├── Then action sequences
├── Finally close-ups and details
└── Keep consistent style prompts
4. EDITING
├── Import all clips to timeline (Premiere, DaVinci)
├── Arrange in sequence
├── Add transitions
├── Color match all clips
└── Add audio track
5. EXPORT
├── Master in highest quality
├── Create platform-specific versions
└── Archive project files
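The shot list from step 2 maps naturally onto a small data structure, which makes it easy to check the total runtime and see which shots go to which platform. This sketch uses the four shots from the table; the `Shot` class and helper names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Shot:
    number: str
    description: str
    duration_s: int
    camera: str
    platform: str


SHOTS = [
    Shot("01", "Estab. wide", 5, "Static", "Runway"),
    Shot("02", "Subject enters", 10, "Track right", "Sora"),
    Shot("03", "Close-up face", 5, "Push in", "Runway"),
    Shot("04", "POV shot", 5, "Handheld", "Pika"),
]


def total_duration(shots):
    """Sum of all shot durations in seconds."""
    return sum(shot.duration_s for shot in shots)


def group_by_platform(shots):
    """Map each platform to the shot numbers assigned to it."""
    groups = defaultdict(list)
    for shot in shots:
        groups[shot.platform].append(shot.number)
    return dict(groups)


print(total_duration(SHOTS))
print(group_by_platform(SHOTS))
```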
Style Consistency
Maintaining Visual Coherence
| Technique | Description |
|---|---|
| Style Prompt Base | Use consistent style descriptors across all shots |
| Reference Image | Use same source image for related shots |
| Character Sheets | Generate reference images first, use for all videos |
| Color Palette | Specify exact colors in prompts |
| Lighting Consistency | Same lighting description across shots |
Style Prompt Template
Base Style Prompt (append to every shot description):
"[Shot description], cinematic film grain, color graded in teal and orange,
professional lighting, 24fps motion blur, shallow depth of field,
shot on RED camera --style [consistent_style_id]"
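One way to apply the template above consistently is to keep the style descriptors in a single constant and attach them to every shot description in code. This is a sketch: `styled_prompt` is a hypothetical helper, and the `--style [consistent_style_id]` suffix is left out because its value is a placeholder in the template.

```python
BASE_STYLE = (
    "cinematic film grain, color graded in teal and orange, "
    "professional lighting, 24fps motion blur, shallow depth of field, "
    "shot on RED camera"
)


def styled_prompt(shot_description: str, base_style: str = BASE_STYLE) -> str:
    """Attach the shared style descriptors to a single shot description."""
    # Remove trailing punctuation so the join reads as one sentence.
    return f"{shot_description.strip().rstrip('.,')}, {base_style}"


shots = ["Wide shot of a harbor at dusk.", "Close-up of a fisherman's hands"]
prompts = [styled_prompt(shot) for shot in shots]
for p in prompts:
    print(p)
```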
Character Consistency
- Generate Character Reference
  - Create detailed character image first
  - Document exact appearance details
  - Save as reference for all shots
- Description Template
  [Character: young woman, dark curly hair, brown eyes, wearing olive green jacket and white t-shirt] + [action/scene]
- Use Image-to-Video
  - Keep character image as starting frame
  - Describe motion, not appearance
  - Maintain same source across shots
Video Editing Automation
Bulk Generation Scripts
# Batch video generation with Runway
import base64
import json
import time

import runwayml

client = runwayml.RunwayML()

# Load shot list
with open("shot_list.json") as f:
    shots = json.load(f)

# Example shot_list.json:
# [
#   {"id": "01", "prompt": "...", "duration": 5, "image": "shot01.png"},
#   {"id": "02", "prompt": "...", "duration": 10, "image": null}
# ]

results = []
for shot in shots:
    if shot.get("image"):
        # Image-to-video
        # Assumption: the image is passed as a base64 data URI, not raw bytes
        with open(shot["image"], "rb") as f:
            encoded = base64.b64encode(f.read()).decode("ascii")
        task = client.image_to_video.create(
            model="gen4",
            prompt_image=f"data:image/png;base64,{encoded}",
            prompt_text=shot["prompt"],
            duration=shot["duration"]
        )
    else:
        # Text-to-video
        task = client.text_to_video.create(
            model="gen4",
            prompt=shot["prompt"],
            duration=shot["duration"]
        )
    results.append({
        "shot_id": shot["id"],
        "task_id": task.id
    })
    print(f"Started shot {shot['id']}: {task.id}")
    time.sleep(2)  # Rate limiting

# Poll all tasks until each one has succeeded or failed
pending = {r["shot_id"]: r["task_id"] for r in results}
completed = []
while pending:
    for shot_id, task_id in list(pending.items()):
        task = client.tasks.retrieve(task_id)
        if task.status == "SUCCEEDED":
            completed.append({"shot_id": shot_id, "url": task.output[0]})
            print(f"Completed shot {shot_id}")
            del pending[shot_id]
        elif task.status == "FAILED":
            print(f"Failed shot {shot_id}: {getattr(task, 'error', 'unknown error')}")
            completed.append({"shot_id": shot_id, "url": None})
            del pending[shot_id]
    if pending:
        time.sleep(10)

# Save results
with open("generated_videos.json", "w") as f:
    json.dump(completed, f, indent=2)
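Because each API call in a batch costs credits, it may be worth validating the shot list before anything is submitted. The sketch below checks the fields shown in the example `shot_list.json`; `validate_shots` is a hypothetical helper and the rules (unique ids, non-empty prompt, positive duration) are assumptions rather than platform requirements.

```python
def validate_shots(shots):
    """Return a list of problems; an empty list means the shot list looks usable."""
    problems = []
    seen_ids = set()
    for index, shot in enumerate(shots):
        shot_id = shot.get("id")
        label = shot_id or f"#{index}"
        if not shot_id:
            problems.append(f"shot {label}: missing id")
        elif shot_id in seen_ids:
            problems.append(f"shot {label}: duplicate id")
        seen_ids.add(shot_id)

        prompt = shot.get("prompt")
        if not isinstance(prompt, str) or not prompt.strip():
            problems.append(f"shot {label}: empty prompt")

        duration = shot.get("duration")
        is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
        if not is_number or duration <= 0:
            problems.append(f"shot {label}: duration must be a positive number")

        image = shot.get("image")
        if image is not None and not isinstance(image, str):
            problems.append(f"shot {label}: image must be a path string or null")
    return problems


example = [
    {"id": "01", "prompt": "Wide shot", "duration": 5, "image": "shot01.png"},
    {"id": "01", "prompt": "", "duration": 0, "image": None},
]
issues = validate_shots(example)
for issue in issues:
    print(issue)
```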
FFmpeg Post-Processing
# Concatenate multiple clips (all clips must share codec, resolution, and frame rate for -c copy)
ffmpeg -f concat -safe 0 -i clips.txt -c copy output.mp4
# clips.txt format:
# file 'shot01.mp4'
# file 'shot02.mp4'
# file 'shot03.mp4'
# Upscale to 4K
ffmpeg -i input.mp4 -vf "scale=3840:2160:flags=lanczos" -c:v libx264 -crf 18 output_4k.mp4
# Add audio track
ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac -shortest output.mp4
# Create loop (-stream_loop 3 repeats the input 3 extra times, 4 plays in total)
ffmpeg -stream_loop 3 -i input.mp4 -c copy output_looped.mp4
# Adjust speed (2x faster; -an drops the audio, which would otherwise go out of sync)
ffmpeg -i input.mp4 -filter:v "setpts=0.5*PTS" -an output_fast.mp4
# Add fade in/out (1 second each; st=4 assumes a 5-second clip)
ffmpeg -i input.mp4 -vf "fade=t=in:st=0:d=1,fade=t=out:st=4:d=1" output.mp4
# Convert to GIF (for previews)
ffmpeg -i input.mp4 -vf "fps=15,scale=480:-1:flags=lanczos" -c:v gif output.gif
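Two of the commands above depend on values that change per project: the contents of `clips.txt` and the fade-out start time, which must equal the clip duration minus the fade length. The sketch below generates both; `concat_list_text` and `fade_filter` are hypothetical helper names.

```python
def concat_list_text(clip_paths) -> str:
    """Build the text of an FFmpeg concat list, one "file '...'" line per clip."""
    lines = []
    for clip in clip_paths:
        # Inside single quotes, a literal quote is written as '\''
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def fade_filter(clip_duration_s: float, fade_s: float = 1.0) -> str:
    """Build a fade in/out filter whose fade-out ends exactly at the clip end."""
    if clip_duration_s < 2 * fade_s:
        raise ValueError("clip is too short for both fades")
    fade_out_start = clip_duration_s - fade_s
    return (f"fade=t=in:st=0:d={fade_s:g},"
            f"fade=t=out:st={fade_out_start:g}:d={fade_s:g}")


print(concat_list_text(["shot01.mp4", "shot02.mp4", "shot03.mp4"]), end="")
print(fade_filter(5))
print(fade_filter(10))
```

The list text can be saved with `pathlib.Path("clips.txt").write_text(...)` and the filter string passed to `-vf` in place of the hard-coded one.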
DaVinci Resolve Automation
# DaVinci Resolve script (run inside Resolve)
import DaVinciResolveScript as dvr

resolve = dvr.scriptapp("Resolve")
project_manager = resolve.GetProjectManager()
project = project_manager.GetCurrentProject()

# Import the generated clips (file paths are placeholders)
media_pool = project.GetMediaPool()
clips_list = media_pool.ImportMedia(["/path/to/shot01.mp4", "/path/to/shot02.mp4"])

# Create timeline
timeline = media_pool.CreateTimelineFromClips(
    "AI Generated Sequence",
    clips_list  # List of imported clips
)

# Add transitions
# Note: confirm that AddTransition exists in your Resolve version's scripting API
for i, clip in enumerate(timeline.GetItemListInTrack("video", 1)):
    if i > 0:
        timeline.AddTransition(
            clip,
            "Cross Dissolve",
            duration=24  # frames
        )

# Color match
project.SetCurrentTimeline(timeline)
# Arguments: DRX path, grade mode (0 = no keyframes), timeline items to grade
timeline.ApplyGradeFromDRX(
    "color_grade.drx", 0, timeline.GetItemListInTrack("video", 1)
)

# Render
project.SetCurrentRenderFormatAndCodec("mp4", "H265_NVIDIA")
project.SetRenderSettings({
    "TargetDir": "/output/",
    "CustomName": "final_output"
})
project.AddRenderJob()
project.StartRendering()
Storyboarding
Pre-Production Planning
# Storyboard Template
## Project: [Title]
## Total Duration: [X seconds/minutes]
## Platform: [Sora/Runway/Pika]
---
### Shot 01
- **Duration:** 5s
- **Visual:** Wide establishing shot of city at dawn
- **Camera:** Slow push forward
- **Motion:** Cars moving on streets below
- **Audio:** Ambient city sounds
- **Prompt:** "Aerial view of Manhattan at dawn, golden hour lighting,
camera slowly pushes forward toward the skyline, cars visible on
streets below, cinematic, 4K quality"
- **Reference:** [image link]
---
### Shot 02
- **Duration:** 3s
- **Visual:** Close-up of protagonist's eyes opening
- **Camera:** Static, then slight push in
- **Motion:** Eyes open, blink
- **Audio:** Alarm clock sound
- **Prompt:** "Extreme close-up of human eyes opening, morning light
falling across face, eyes blink twice, photorealistic, shallow DOF"
- **Reference:** [image link]
Sora Storyboard Mode
- Access Storyboard
  - Open Sora interface
  - Select "Storyboard" mode
  - Define timeline length
- Add Shots
  - Click to add shot markers
  - Write prompt for each shot
  - Upload reference images
- Connect Shots
  - Define transitions between shots
  - Set camera continuity
  - Preview full sequence
- Generate
  - Generate all shots in sequence
  - Review and regenerate as needed
  - Export final video
Resolution and Export
Supported Resolutions
| Platform | Max Resolution | Aspect Ratios |
|---|---|---|
| Sora 2 | 1920x1080 | 16:9, 9:16, 1:1 |
| Runway Gen-4 | 4096x2160 | 16:9, 9:16, 1:1, 4:5, 21:9 |
| Pika 2.5 | 1920x1080 | 16:9, 9:16, 1:1 |
| Kling 2.0 | 1920x1080 | 16:9, 9:16 |
Platform-Specific Exports
| Platform | Resolution | Aspect | Duration | Notes |
|---|---|---|---|---|
| YouTube | 1920x1080+ | 16:9 | Any | Include 2s intro/outro |
| TikTok | 1080x1920 | 9:16 | 15-60s | Vertical, fast-paced |
| Instagram Reels | 1080x1920 | 9:16 | 15-90s | Vertical |
| Instagram Post | 1080x1350 | 4:5 | 3-60s | Square or tall |
| Twitter/X | 1280x720 | 16:9 | 2:20 max | Keep under 512MB |
| | 1920x1080 | 16:9 | 3-10min | Professional content |
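The export targets in the table can be turned into FFmpeg arguments that scale a clip to fit the target frame and pad the remainder. This is a sketch: the preset dictionary copies the resolutions from the table, `export_command` is a hypothetical helper, and the encoder settings (`libx264`, CRF 18) are assumptions carried over from the upscaling example.

```python
EXPORT_PRESETS = {
    "YouTube": (1920, 1080),
    "TikTok": (1080, 1920),
    "Instagram Reels": (1080, 1920),
    "Instagram Post": (1080, 1350),
    "Twitter/X": (1280, 720),
}


def export_command(src: str, dst: str, platform: str) -> list:
    """Build an ffmpeg argument list that fits src into the platform's frame size."""
    width, height = EXPORT_PRESETS[platform]
    # Scale down to fit inside the target frame, then pad to the exact size, centered.
    video_filter = (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2"
    )
    return ["ffmpeg", "-i", src, "-vf", video_filter,
            "-c:v", "libx264", "-crf", "18", "-c:a", "copy", dst]


cmd = export_command("master.mp4", "tiktok.mp4", "TikTok")
print(" ".join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`. Padding adds black bars when the aspect ratios differ; cropping instead is a common alternative for vertical formats.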
Upscaling
For higher resolution output:
# Using Topaz Video AI (via CLI)
import os
import subprocess

subprocess.run([
    "tvai",
    "--input", "generated_1080p.mp4",
    "--output", "upscaled_4k.mp4",
    "--model", "proteus-3",
    "--scale", "2",
    "--format", "mp4"
], check=True)

# Using Real-ESRGAN for frames
# The frame directories must exist before the tools write into them
os.makedirs("frames", exist_ok=True)
os.makedirs("upscaled_frames", exist_ok=True)

# 1. Extract frames
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-vf", "fps=24",
    "frames/frame_%04d.png"
], check=True)

# 2. Upscale frames
subprocess.run([
    "realesrgan-ncnn-vulkan",
    "-i", "frames/",
    "-o", "upscaled_frames/",
    "-n", "realesrgan-x4plus"
], check=True)

# 3. Reassemble video
subprocess.run([
    "ffmpeg", "-framerate", "24",
    "-i", "upscaled_frames/frame_%04d.png",
    "-c:v", "libx264", "-pix_fmt", "yuv420p",
    "upscaled_video.mp4"
], check=True)
Cost Comparison
Monthly Subscription Comparison
| Platform | Free Tier | Basic | Pro | Unlimited |
|---|---|---|---|---|
| Sora 2 | - | $20/mo (Plus) | $200/mo (Pro) | - |
| Runway | 125 credits | $15/mo | $35/mo | $96/mo |
| Pika | 250 credits | $8/mo | $28/mo | $58/mo |
| Kling | Yes | $5/mo | $10/mo | - |
Per-Video Cost Estimate
| Video Type | Duration | Sora | Runway | Pika |
|---|---|---|---|---|
| Social clip | 5s | ~$1 | $0.25 | $0.20 |
| Product demo | 30s | ~$5 | $1.50 | $1.00 |
| Short film | 2min | ~$20 | $6.00 | $4.00 |
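The Runway column in the table follows directly from per-second pricing: duration multiplied by the rate. The sketch below reproduces those figures using the $0.05/second standard rate quoted earlier; `clip_cost` and its `takes` parameter (for counting retries) are hypothetical.

```python
def clip_cost(duration_s: float, rate_per_second: float, takes: int = 1) -> float:
    """Estimated cost in USD for generating a clip `takes` times."""
    return round(duration_s * rate_per_second * takes, 2)


RUNWAY_STANDARD_RATE = 0.05  # USD per generated second

for name, seconds in [("Social clip", 5), ("Product demo", 30), ("Short film", 120)]:
    print(f"{name}: ${clip_cost(seconds, RUNWAY_STANDARD_RATE):.2f}")

# Budgeting for three attempts per shot triples the estimate.
print(f"Social clip, 3 takes: ${clip_cost(5, RUNWAY_STANDARD_RATE, takes=3):.2f}")
```

Since most shots need several iterations before one is usable, multiplying by the expected number of takes gives a more realistic budget than the single-generation figures.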
Cost Optimization Tips
- Prototype cheap, produce premium
  - Use Pika/Kling for concept testing
  - Generate final in Sora/Runway
- Optimize duration
  - Shorter clips = lower cost
  - Combine clips in post-production
- Batch processing
  - Generate during off-peak hours
  - Use API for bulk discounts
- Cache and reuse
  - Save successful prompts
  - Extend existing clips vs. generating new
Common Issues and Solutions
Quality Issues
| Issue | Cause | Solution |
|---|---|---|
| Blurry output | Low motion, poor prompt | Increase motion strength, add detail |
| Morphing faces | Complex motion, profile views | Use front-facing shots, shorter duration |
| Flickering | Frame inconsistency | Reduce motion, use image-to-video |
| Artifacts | Complex scene, hands/text | Simplify scene, avoid text generation |
| Wrong style | Vague prompt | Add explicit style references |
Motion Issues
| Issue | Cause | Solution |
|---|---|---|
| No movement | Motion strength too low | Increase motion parameter |
| Chaotic motion | Too many moving elements | Focus on one subject |
| Unnatural motion | Over-prompting | Simplify motion description |
| Camera drift | Default behavior | Specify "static camera" |
Consistency Issues
| Issue | Cause | Solution |
|---|---|---|
| Character changes | No reference | Use same source image |
| Color mismatch | Different generations | Add color palette to prompt |
| Style drift | Inconsistent prompts | Create base style prompt |
Integration with Other Skills
| Skill | Integration Point |
|---|---|
| faion-image-gen-skill | Generate source images for image-to-video |
| faion-audio-skill | Add voiceover, music, sound effects |
| faion-langchain-skill | Automate video generation pipelines |
| faion-openai-api-skill | Access Sora API when available |
Tools and Resources
| Tool | Purpose | Link |
|---|---|---|
| FFmpeg | Video processing CLI | ffmpeg.org |
| DaVinci Resolve | Professional editing | blackmagicdesign.com |
| Topaz Video AI | Upscaling | topazlabs.com |
| Runway API | Programmatic access | docs.runwayml.com |
| Pika API | Programmatic access | pika.art/api |
Skill Version: 1.0 Last Updated: 2026-01-18 Part of Faion Network AI/LLM Skills