Agent skill
youtube-to-markdown
Use when user asks YouTube video extraction, get, fetch, transcripts, subtitles, or captions. Writes video details and transcription into structured markdown file.
Install this agent skill to your Project
npx add-skill https://github.com/vre/flow-state/tree/main/docs/youtube-to-markdown/research/2026.01.14 token usage test/code/no-polish
SKILL.md
YouTube to Markdown (No-Polish Variant)
Test variant: Removes transcript polish steps (4, 7, 8). Keeps separate summary steps (5, 6) and comment steps (10a, 10b).
Execute all steps sequentially without asking for user approval.
Step 0: Check extracted before
python3 ./check_existing.py "<YOUTUBE_URL>" "<output_directory>"
If returns exists: true: Skip to Step 10 (comments).
Step 1: Extract data (metadata, description, chapters)
python3 extract_data.py "<YOUTUBE_URL>" "<output_directory>"
Creates: youtube_{VIDEO_ID}metadata.md, youtube{VIDEO_ID}description.md, youtube{VIDEO_ID}_chapters.json
Step 2: Extract transcript
If video language is en, proceed directly. If non-English, ask user which language.
python3 extract_transcript.py "<YOUTUBE_URL>" "<output_directory>" "<LANG_CODE>"
Creates: youtube_{VIDEO_ID}_transcript.vtt
Fallback (only if transcript unavailable)
Ask user: "No transcript available. Proceed with Whisper transcription?"
python3 extract_transcript_whisper.py "<YOUTUBE_URL>" "<output_directory>"
Step 3: Deduplicate transcript
Set BASE_NAME from Step 1 output (youtube_{VIDEO_ID})
python3 ./deduplicate_vtt.py "<output_directory>/${BASE_NAME}_transcript.vtt" "<output_directory>/${BASE_NAME}_transcript_dedup.md" "<output_directory>/${BASE_NAME}_transcript_no_timestamps.txt"
Copy dedup as final transcript (no polish steps):
cp "<output_directory>/${BASE_NAME}_transcript_dedup.md" "<output_directory>/${BASE_NAME}_transcript.md"
Step 5: Summarize transcript
task_tool:
- subagent_type: "general-purpose"
- model: "sonnet"
- prompt:
INPUT: <output_directory>/${BASE_NAME}_transcript_no_timestamps.txt
OUTPUT: <output_directory>/${BASE_NAME}_summary.md
FORMATS: ./summary_formats.md
1. Classify content type:
- TIPS: gear reviews, rankings, "X ways to...", practical advice lists
- INTERVIEW: podcasts, conversations, Q&A, multiple perspectives
- EDUCATIONAL: concept explanations, analysis, "how X works"
- TUTORIAL: step-by-step instructions, coding, recipes
2. Analyze content structure:
- Identify meaningful content units (topic shifts, argument structure, narrative breaks)
- If single continuous topic, omit content unit headers
- Skip ads, sponsors, self-promotion
3. Read FORMATS file and use format for detected content type. Target <10% of transcript bytes.
ACTION REQUIRED: Use the Write tool NOW to save output to OUTPUT file.
Step 6: Review and tighten summary
task_tool:
- subagent_type: "general-purpose"
- model: "sonnet"
- prompt:
INPUT: <output_directory>/${BASE_NAME}_summary.md
OUTPUT: <output_directory>/${BASE_NAME}_summary_tight.md
FORMATS: ./summary_formats.md
You are an adversarial copy editor. Cut fluff, enforce quality.
Rules:
- Read FORMATS - preserve format
- Byte budget: <10% of transcript bytes
- Hidden Gems: Remove if duplicates main content
- Tightness: Cut filler words, compress verbose explanations, prefer lists over prose
Preserve original language - do not translate.
ACTION REQUIRED: Use the Write tool NOW to save output to OUTPUT file.
Step 9: Create output files
python3 finalize.py "${BASE_NAME}" "<output_directory>"
Outputs:
youtube - {title} ({video_id}).md- Main summaryyoutube - {title} - transcript ({video_id}).md- Description and transcript
Use --debug flag to keep intermediate work files.
Step 10a: Extract comments
python3 ./extract_comments.py "<YOUTUBE_URL>" "<output_directory>"
Creates: youtube_{VIDEO_ID}_comments.md
python3 ./prefilter_comments.py "<output_directory>/${BASE_NAME}_comments.md" "<output_directory>/${BASE_NAME}_comments_prefiltered.md"
Step 10b: Analyze comments
task_tool:
- subagent_type: "general-purpose"
- model: "sonnet"
- prompt:
SUMMARY: <output_directory>/${BASE_NAME}_summary_tight.md
COMMENTS: <output_directory>/${BASE_NAME}_comments_prefiltered.md
OUTPUT: Append to summary file
Read SUMMARY to understand video content.
Detect video type:
- TIPS: gear reviews, rankings, practical advice
- INTERVIEW: podcasts, conversations, Q&A
- EDUCATIONAL: concept explanations, analysis
- TUTORIAL: step-by-step instructions
Extract insights from COMMENTS that ADD VALUE beyond summary.
Format output as:
## Comment Insights
**Key Takeaway**: [one sentence]
[Category based on video type]:
- **[Insight type]**: [insight with @username attribution]
Rules:
- Only insights NOT already in summary
- Prioritize actionable over opinions
- Be ruthlessly concise
ACTION REQUIRED: Read the summary file and append Comment Insights section using Write tool.
Didn't find tool you were looking for?