Agent skill
assemblyai-performance-tuning
Optimize AssemblyAI API performance with caching, parallel processing, and model selection. Use when experiencing slow transcriptions, implementing caching strategies, or optimizing throughput for batch transcription workloads. Trigger with phrases like "assemblyai performance", "optimize assemblyai", "assemblyai latency", "assemblyai caching", "assemblyai slow", "assemblyai batch".
Install this agent skill to your Project
npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/assemblyai-pack/skills/assemblyai-performance-tuning
AssemblyAI Performance Tuning
Overview
Optimize AssemblyAI transcription performance through model selection, parallel processing, caching, and webhook-based architectures.
Prerequisites
- assemblyai package installed
- Understanding of async patterns
- Redis or in-memory cache available (optional)
Latency Benchmarks (Actual)
Async Transcription
| Audio Duration | Approx. Processing Time | Notes |
|---|---|---|
| 30 seconds | ~10-15 seconds | Includes queue time |
| 5 minutes | ~30-60 seconds | Scales sub-linearly |
| 1 hour | ~3-5 minutes | Depends on queue load |
| 10 hours | ~15-30 minutes | Max async duration |
Streaming
| Metric | Value |
|---|---|
| First partial transcript | ~300ms (P50) |
| Final transcript latency | ~500ms (P50) |
| End-of-turn detection | Automatic with endpointing |
Model Speed vs. Accuracy
| Model | Speed | Accuracy | Price/hr |
|---|---|---|---|
| nano | Fastest | Good | $0.12 |
| best (Universal-3) | Standard | Highest | $0.37 |
| nova-3 (streaming) | Real-time | High | $0.47 |
| nova-3-pro (streaming) | Real-time | Highest | $0.47 |
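The per-hour prices above translate directly into a batch estimate. The following is a minimal sketch, assuming the table's prices are still current and that billing is linear in audio duration; `PRICE_PER_HOUR_USD` and `estimateBatchCostUsd` are hypothetical names, not part of the SDK.

```typescript
// Per-hour prices copied from the table above. Treat them as a snapshot, not live pricing.
const PRICE_PER_HOUR_USD = {
  nano: 0.12,
  best: 0.37,
} as const;

type AsyncModel = keyof typeof PRICE_PER_HOUR_USD;

// Hypothetical helper: estimated cost in USD for a batch of audio durations given in seconds.
function estimateBatchCostUsd(durationsSec: number[], model: AsyncModel): number {
  const totalHours = durationsSec.reduce((sum, s) => sum + s, 0) / 3600;
  // Round to cents so floating-point noise does not leak into reports.
  return Math.round(totalHours * PRICE_PER_HOUR_USD[model] * 100) / 100;
}

// 100 one-hour recordings
const hundredHours = Array.from({ length: 100 }, () => 3600);
console.log(estimateBatchCostUsd(hundredHours, 'nano')); // 12
console.log(estimateBatchCostUsd(hundredHours, 'best')); // 37
```

For a workload of this size the choice of model changes the bill by roughly a factor of three, which is worth weighing against the accuracy column.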
Instructions
Step 1: Choose the Right Model
import { AssemblyAI } from 'assemblyai';
const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});
// For highest accuracy (default)
const accurate = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
});
// For fastest processing and lowest cost
const fast = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
});
Step 2: Parallel Batch Processing
import PQueue from 'p-queue';
const queue = new PQueue({ concurrency: 10 });
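async function batchTranscribe(audioUrls: string[]) {
  // allSettled keeps one failed request from rejecting the whole batch
  const settled = await Promise.allSettled(
    audioUrls.map(url =>
      queue.add(() =>
        client.transcripts.transcribe({ audio: url, speech_model: 'nano' })
      )
    )
  );
  return settled
    .flatMap(r => (r.status === 'fulfilled' && r.value ? [r.value] : []))
    .filter(t => t.status === 'completed');
}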
// Process 100 files with 10 concurrent jobs
const urls = Array.from({ length: 100 }, (_, i) => `https://storage.example.com/audio-${i}.mp3`);
const transcripts = await batchTranscribe(urls);
console.log(`Completed: ${transcripts.length}/${urls.length}`);
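For readers who want to see what the concurrency limit is doing, here is a dependency-free sketch of the same bounded-concurrency pattern. It is not how p-queue is implemented; `mapWithConcurrency` and `Outcome` are hypothetical names, and the worker is injected so the helper can be exercised without calling the API.

```typescript
// Per-item result: a failure is captured instead of rejecting the whole batch.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<I, O>(
  items: I[],
  limit: number,
  worker: (item: I) => Promise<O>,
): Promise<Outcome<O>[]> {
  const results: Outcome<O>[] = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, before any await
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With the client from Step 1, the worker would be something like `url => client.transcripts.transcribe({ audio: url, speech_model: 'nano' })`. Keeping per-item outcomes makes it easy to retry only the failed URLs.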
Step 3: Use Webhooks Instead of Polling
// SLOW: transcribe() polls every 3 seconds until done
const slow = await client.transcripts.transcribe({ audio: audioUrl });
// FAST: submit() returns immediately, webhook notifies on completion
const fast = await client.transcripts.submit({
  audio: audioUrl,
  webhook_url: 'https://your-app.com/webhooks/assemblyai',
});
// Your webhook handler processes the result, so there is no polling overhead
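The receiving side of the webhook can be sketched as a framework-agnostic function. This assumes the webhook body is JSON of the form `{ transcript_id, status }` with status `completed` or `error`; verify that against the current AssemblyAI webhook documentation. `parseWebhookBody`, `handleWebhook`, and `FetchTranscript` are hypothetical names, and the transcript loader is injected (for example `id => client.transcripts.get(id)`).

```typescript
// Assumed webhook body shape: { transcript_id, status }.
interface WebhookPayload {
  transcript_id: string;
  status: 'completed' | 'error';
}

// Hypothetical loader type, e.g. (id) => client.transcripts.get(id).
type FetchTranscript<T> = (id: string) => Promise<T>;

interface WebhookResult<T> {
  httpStatus: number; // status code to send back to the webhook caller
  transcript?: T;     // set when the transcript completed and was fetched
  failedId?: string;  // set when the webhook reported an error
}

// Returns null for anything that is not a well-formed payload.
function parseWebhookBody(rawBody: string): WebhookPayload | null {
  try {
    const data = JSON.parse(rawBody) as Partial<WebhookPayload>;
    if (typeof data.transcript_id !== 'string') return null;
    if (data.status !== 'completed' && data.status !== 'error') return null;
    return { transcript_id: data.transcript_id, status: data.status };
  } catch {
    return null; // not JSON, or JSON that is not an object
  }
}

async function handleWebhook<T>(
  rawBody: string,
  fetchTranscript: FetchTranscript<T>,
): Promise<WebhookResult<T>> {
  const payload = parseWebhookBody(rawBody);
  if (!payload) return { httpStatus: 400 };
  if (payload.status === 'error') {
    // Acknowledge the delivery; there is no finished transcript to fetch.
    return { httpStatus: 200, failedId: payload.transcript_id };
  }
  const transcript = await fetchTranscript(payload.transcript_id);
  return { httpStatus: 200, transcript };
}
```

In an Express or Fastify route, the raw request body goes in and `httpStatus` comes out; the transcript is fetched once, only after the service reports completion.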
Step 4: Cache Transcript Results
import { LRUCache } from 'lru-cache';
import type { Transcript } from 'assemblyai';
const transcriptCache = new LRUCache<string, Transcript>({
  max: 500,
  ttl: 60 * 60 * 1000, // 1 hour
});
async function getCachedTranscript(transcriptId: string): Promise<Transcript> {
  const cached = transcriptCache.get(transcriptId);
  if (cached) return cached;
  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    transcriptCache.set(transcriptId, transcript);
  }
  return transcript;
}
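One gap in a plain read-through cache is that several concurrent requests for the same uncached ID each trigger their own API call. A possible refinement, sketched here under the assumption that the loader is safe to share between callers, is to deduplicate in-flight requests. `singleFlight` and `Loader` are hypothetical names, not part of lru-cache or the SDK.

```typescript
// Hypothetical loader type, e.g. (id) => client.transcripts.get(id).
type Loader<T> = (id: string) => Promise<T>;

// Wraps a loader so that concurrent calls for the same ID share one underlying request.
function singleFlight<T>(load: Loader<T>): Loader<T> {
  const inFlight = new Map<string, Promise<T>>();
  return (id: string) => {
    const pending = inFlight.get(id);
    if (pending) return pending;
    // Remove the entry once the request settles so later calls start a fresh load.
    const request = load(id).finally(() => {
      inFlight.delete(id);
    });
    inFlight.set(id, request);
    return request;
  };
}
```

The wrapper would sit between the cache and the client: the cache-miss branch calls the wrapped loader rather than the raw one, so a burst of lookups for a new transcript produces a single GET.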
Step 5: Redis Cache for Distributed Systems
import Redis from 'ioredis';
import type { Transcript } from 'assemblyai';
const redis = new Redis(process.env.REDIS_URL!);
async function getCachedTranscriptRedis(transcriptId: string): Promise<Transcript> {
  const cached = await redis.get(`transcript:${transcriptId}`);
  if (cached) return JSON.parse(cached);
  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    await redis.setex(
      `transcript:${transcriptId}`,
      3600, // 1 hour TTL
      JSON.stringify(transcript)
    );
  }
  return transcript;
}
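The Redis version can be hardened in two small ways: tolerate a corrupt cache entry instead of throwing from `JSON.parse`, and expose an explicit invalidation hook for re-processed transcripts. The sketch below is written against a minimal store interface so it can run against an in-memory fake. `KeyValueStore`, `readThrough`, and `invalidate` are hypothetical names; the assumption is that an ioredis client satisfies the interface through its `get`, `setex`, and `del` commands.

```typescript
// Minimal subset of a Redis-like client used by this sketch.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  setex(key: string, seconds: number, value: string): Promise<unknown>;
  del(key: string): Promise<unknown>;
}

// Only the fields this sketch needs from a transcript.
interface CacheableTranscript {
  id: string;
  status: string;
}

const keyFor = (id: string): string => `transcript:${id}`;

async function readThrough<T extends CacheableTranscript>(
  store: KeyValueStore,
  id: string,
  load: (id: string) => Promise<T>,
  ttlSeconds = 3600,
): Promise<T> {
  const cached = await store.get(keyFor(id));
  if (cached !== null) {
    try {
      return JSON.parse(cached) as T;
    } catch {
      // Corrupt entry: drop it and fall through to a fresh load.
      await store.del(keyFor(id));
    }
  }
  const fresh = await load(id);
  // Cache only finished transcripts; queued or processing ones will still change.
  if (fresh.status === 'completed') {
    await store.setex(keyFor(id), ttlSeconds, JSON.stringify(fresh));
  }
  return fresh;
}

// Intended to be called when a transcript is known to have been re-processed.
async function invalidate(store: KeyValueStore, id: string): Promise<void> {
  await store.del(keyFor(id));
}
```

Calling `invalidate` from the webhook handler keeps the cache from serving an older version of a transcript that was re-submitted.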
Step 6: Minimize Feature Overhead
// Only enable features you actually need; each one adds processing time
// Minimal (fastest)
const minimal = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
  punctuate: true,
  format_text: true,
});
// Full intelligence (slower, more expensive)
const full = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  content_safety: true,
  iab_categories: true,
  summarization: true,
  summary_type: 'bullets',
});
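One way to keep optional features from creeping into every request is to build the parameters from an explicit list, so nothing is enabled by default. This is a small sketch using only the flag names shown above; `buildParams` and `Feature` are hypothetical, and passing the result to `client.transcripts.transcribe` assumes the SDK accepts these fields as in the examples above.

```typescript
// Optional feature flags taken from the "full intelligence" example above.
const OPTIONAL_FEATURES = [
  'speaker_labels',
  'sentiment_analysis',
  'entity_detection',
  'auto_highlights',
  'content_safety',
  'iab_categories',
] as const;

type Feature = (typeof OPTIONAL_FEATURES)[number];

type RequestParams = {
  audio: string;
  speech_model: 'nano' | 'best';
} & Partial<Record<Feature, true>>;

// Builds request parameters with only the requested features switched on.
function buildParams(
  audio: string,
  model: 'nano' | 'best',
  features: Feature[] = [],
): RequestParams {
  const params: RequestParams = { audio, speech_model: model };
  for (const feature of features) {
    params[feature] = true;
  }
  return params;
}

const diarizedOnly = buildParams('https://storage.example.com/audio-0.mp3', 'nano', [
  'speaker_labels',
]);
console.log(Object.keys(diarizedOnly)); // [ 'audio', 'speech_model', 'speaker_labels' ]
```

Because the feature list is typed, a misspelled flag fails at compile time rather than being silently ignored.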
Step 7: Performance Monitoring
async function timedTranscribe(audioUrl: string, options: Record<string, any> = {}) {
  const start = Date.now();
  const transcript = await client.transcripts.transcribe({
    audio: audioUrl,
    ...options,
  });
  const durationMs = Date.now() - start;
  const stats = {
    transcriptId: transcript.id,
    status: transcript.status,
    audioDuration: transcript.audio_duration,
    processingTimeMs: durationMs,
    // Seconds of wall-clock time per second of audio; lower is faster
    ratio: transcript.audio_duration
      ? (durationMs / 1000 / transcript.audio_duration).toFixed(2)
      : 'N/A',
    wordCount: transcript.words?.length ?? 0,
    model: options.speech_model ?? 'best',
  };
  console.log('Transcription stats:', stats);
  return { transcript, stats };
}
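Individual ratios are noisy, so it usually helps to summarize many runs. The sketch below computes nearest-rank percentiles over a list of numeric ratios; `percentile` and `summarizeRatios` are hypothetical helpers, and the sample numbers are made up for illustration. Since `timedTranscribe` returns the ratio as a string, convert it with `Number(...)` and skip `'N/A'` before collecting.

```typescript
// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

interface RatioSummary {
  count: number;
  p50: number;
  p95: number;
  max: number;
}

// Summarizes processing-time ratios (processing seconds per second of audio).
function summarizeRatios(ratios: number[]): RatioSummary | null {
  if (ratios.length === 0) return null;
  const sorted = [...ratios].sort((a, b) => a - b);
  return {
    count: sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    max: sorted[sorted.length - 1],
  };
}

// Illustrative values only, not measurements.
const summary = summarizeRatios([0.05, 0.08, 0.06, 0.5, 0.07]);
console.log(summary); // { count: 5, p50: 0.07, p95: 0.5, max: 0.5 }
```

A p95 far above the p50, as in this made-up sample, points at occasional queueing delays rather than a uniformly slow model.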
Output
- Optimal model selection based on speed/accuracy/cost trade-offs
- Parallel batch processing with concurrency control
- Webhook-based architecture (eliminates polling overhead)
- In-memory and Redis caching for transcript retrieval
- Performance monitoring with processing time ratios
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Slow transcription | Large file + best model | Use nano model or split audio |
| Queue backlog | Too many concurrent submissions | Limit concurrency with p-queue |
| Cache stale data | Transcript re-processed | Set appropriate TTL, invalidate on webhook |
| Polling overhead | Using transcribe() for many files | Switch to submit() + webhooks |
Next Steps
For cost optimization, see assemblyai-cost-tuning.