Agent skill

whisper

Local AI speech recognition based on whisper.cpp, supporting transcription and subtitle generation. Core Scenario: When the user needs to transcribe audio to text, generate SRT subtitles, or merge subtitles into video.

Install this agent skill to your Project

npx add-skill https://github.com/x-cmd/skill/tree/main/data/x-cmd/whisper

SKILL.md

whisper - Local Speech-to-Text & Subtitles

The whisper module provides high-performance local speech recognition built on whisper.cpp. It covers the full workflow, from model management to merging subtitles into video.

When to Activate

  • When the user wants to transcribe an audio file into text.
  • When generating .srt subtitle files from audio/video.
  • When merging generated subtitles into a video file.
  • When performing real-time speech-to-text using LiveKit or Streaming.

Core Principles & Rules

  • Local Processing: Emphasize that transcription happens locally without uploading data.
  • Model Selection: Let users choose a model size (tiny, base, small, medium, large) to trade speed against accuracy; smaller models run faster, larger ones are more accurate.
  • File Integrity: Confirm that the input audio file exists and is readable before transcribing.

Additional Scenarios

  • SRT Generation: Use dictate --srt to create industry-standard subtitle files.
  • Video Integration: Use merge to embed subtitles into a video stream.

Patterns & Examples

Simple Transcription

```bash
# Interactively choose a model and transcribe an audio file
x whisper ./meeting_record.mp3
```

Generate Subtitles

```bash
# Create an SRT subtitle file from audio
x whisper dictate --srt -o my_subtitles ./interview.wav
```
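When several recordings need subtitles, the same command can be wrapped in a loop. The sketch below is illustrative only: `srt_all` is a hypothetical helper name, and it assumes that `-o` accepts an output name without an extension, as in the example above. Check how `x whisper` names its output files before relying on it.

```bash
# Hypothetical helper: run the documented SRT command for every .wav file
# in a directory (defaults to the current directory).
srt_all() {
    dir=${1:-.}
    for f in "$dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files
        [ -e "$f" ] || continue
        # Assumption: -o takes an extension-less output name,
        # mirroring "-o my_subtitles" in the example above
        x whisper dictate --srt -o "${f%.wav}" "$f"
    done
}
```

Calling `srt_all ./recordings` would then issue one `x whisper dictate --srt` command per `.wav` file found in `./recordings`.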

Merge Subtitles

```bash
# Embed an SRT file into a video
x whisper merge ./subtitles.srt ./video.mp4
```

Checklist

  • Confirm that the user has downloaded the required whisper model.
  • Verify that the audio file format is supported by whisper.cpp.
  • Check that ffmpeg is available for the merge subcommand.
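
Parts of this checklist can be automated before invoking the module. The following is a minimal sketch, not part of the skill itself: `check_input_readable` and `check_tool` are hypothetical helper names, and the sketch covers only the input file and ffmpeg checks. The model check is omitted because the source does not say where `x whisper` stores its models.

```bash
# Hypothetical preflight helpers (names are illustrative, not part of x-cmd).

# Succeeds when the given path is a regular file that can be read.
check_input_readable() {
    if [ -f "$1" ] && [ -r "$1" ]; then
        return 0
    fi
    echo "input file not found or not readable: $1" >&2
    return 1
}

# Succeeds when the named program is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "required tool not found on PATH: $1" >&2
    return 1
}
```

A merge could then be guarded with `check_input_readable ./video.mp4 && check_tool ffmpeg && x whisper merge ./subtitles.srt ./video.mp4`, so the command runs only when both checks pass.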
