Agent skill

performance

This skill should be used when profiling code, optimizing bottlenecks, benchmarking, or when "performance", "profiling", "optimization", or "--perf" are mentioned.

View SKILL.md on GitHub Repository

Stars 26

Forks 0

Install this agent skill to your Project

npx add-skill https://github.com/outfitter-dev/agents/tree/main/plugins/outfitter/skills/performance

Metadata

Additional technical details for this skill

version: 1.0.0

SKILL.md

Performance Engineering

Evidence-based performance optimization → measure → profile → optimize → validate.

<when_to_use>

Profiling slow code paths or bottlenecks
Identifying memory leaks or excessive allocations
Optimizing latency-critical operations (P95, P99)
Benchmarking competing implementations
Database query optimization
Reducing CPU usage in hot paths
Improving throughput (RPS, ops/sec)

NOT for: premature optimization, optimization without measurement, guessing at bottlenecks

</when_to_use>

<iron_law>

NO OPTIMIZATION WITHOUT MEASUREMENT

Required workflow:

Measure baseline performance with realistic workload
Profile to identify actual bottleneck
Optimize the bottleneck (not what you think is slow)
Measure again to verify improvement
Document gains and tradeoffs

Optimizing unmeasured code wastes time and introduces bugs.

</iron_law>

Load the maintain-tasks skill for stage tracking:

Stage 1: Establishing baseline

content: "Establish performance baseline with realistic workload"
activeForm: "Establishing performance baseline"

Stage 2: Profiling bottlenecks

content: "Profile code to identify actual bottlenecks"
activeForm: "Profiling code to identify bottlenecks"

Stage 3: Analyzing root cause

content: "Analyze profiling data to determine root cause"
activeForm: "Analyzing profiling data"

Stage 4: Implementing optimization

content: "Implement targeted optimization for identified bottleneck"
activeForm: "Implementing optimization"

Stage 5: Validating improvement

content: "Measure performance gains and verify no regressions"
activeForm: "Validating performance improvement"

Key Performance Indicators

Latency (response time):

P50 (median) — typical case
P95 — most users
P99 — tail latency
P99.9 — outliers
TTFB — time to first byte
TTLB — time to last byte

Throughput:

RPS — requests per second
ops/sec — operations per second
bytes/sec — data transfer rate
queries/sec — database throughput

Memory:

Heap usage — allocated memory
GC frequency — garbage collection pauses
GC duration — stop-the-world time
Allocation rate — memory churn
Resident set size (RSS) — total memory

CPU:

CPU time — total compute
Wall time — elapsed time
Hot paths — frequently executed code
Time complexity — algorithmic efficiency
CPU utilization — percentage used

Always measure:

Before optimization (baseline)
After optimization (improvement)
Under realistic load (not toy data)
Multiple runs (account for variance)

<profiling_tools>

TypeScript/Bun

Built-in timing:

typescript

console.time('operation')
// ... code to measure
console.timeEnd('operation')

// High precision
const start = Bun.nanoseconds()
// ... code to measure
const elapsed = Bun.nanoseconds() - start
console.log(`Took ${elapsed / 1_000_000}ms`)

Performance API:

typescript

const mark1 = performance.mark('start')
// ... code to measure
const mark2 = performance.mark('end')
performance.measure('operation', 'start', 'end')
const measure = performance.getEntriesByName('operation')[0]
console.log(`Duration: ${measure.duration}ms`)

Memory profiling:

Chrome DevTools → Memory tab → heap snapshots
Node.js --inspect flag + Chrome DevTools
process.memoryUsage() for RSS/heap tracking

CPU profiling:

Chrome DevTools → Performance tab → record session
Node.js --prof flag + node --prof-process
Flamegraphs for visualization

Rust

Benchmarking:

rust

#[cfg(test)]
mod benches {
    use criterion::{black_box, criterion_group, criterion_main, Criterion};

    fn benchmark_function(c: &mut Criterion) {
        c.bench_function("my_function", |b| {
            b.iter(|| my_function(black_box(42)))
        });
    }

    criterion_group!(benches, benchmark_function);
    criterion_main!(benches);
}

Profiling:

cargo bench — criterion benchmarks
perf record + perf report — Linux profiling
cargo flamegraph — visual flamegraphs
cargo bloat — binary size analysis
valgrind --tool=callgrind — detailed profiling
heaptrack — memory profiling

Instrumentation:

rust

use std::time::Instant;

let start = Instant::now();
// ... code to measure
let duration = start.elapsed();
println!("Took: {:?}", duration);

</profiling_tools>

<optimization_patterns>

Algorithm Improvements

Time complexity:

O(n²) → O(n log n) — sorting, searching
O(n) → O(log n) — binary search, trees
O(n) → O(1) — hash maps, memoization

Space-time tradeoffs:

Cache computed results (memoization)
Precompute expensive operations
Index data for faster lookup
Use hash maps for O(1) access

Memory Optimization

Reduce allocations:

typescript

// Bad: creates new array each iteration
for (const item of items) {
  const results = []
  results.push(process(item))
}

// Good: reuse array
const results = []
for (const item of items) {
  results.push(process(item))
}

rust

// Bad: allocates String every time
fn format_user(name: &str) -> String {
    format!("User: {}", name)
}

// Good: reuses buffer
fn format_user(name: &str, buf: &mut String) {
    buf.clear();
    buf.push_str("User: ");
    buf.push_str(name);
}

Memory pooling:

Reuse expensive objects (connections, buffers)
Object pools for frequently allocated types
Arena allocators for batch allocations

Lazy evaluation:

Compute only when needed
Stream processing vs loading all data
Iterators over materialized collections

I/O Optimization

Batching:

Batch API calls (1 request vs 100)
Batch database writes (bulk insert)
Batch file operations (single write vs many)

Caching:

Cache expensive computations
Cache database queries (Redis, in-memory)
Cache API responses (HTTP caching)
Invalidate stale cache entries

Async I/O:

Non-blocking operations (async/await)
Concurrent requests (Promise.all, tokio::spawn)
Connection pooling (reuse connections)

Database Optimization

Query optimization:

Add indexes for common queries
Use EXPLAIN/EXPLAIN ANALYZE
Avoid N+1 queries (use joins or batch loading)
Select only needed columns
Filter at database level (WHERE vs client filter)

Schema design:

Normalize to reduce duplication
Denormalize for read-heavy workloads
Partition large tables
Use appropriate data types

Connection management:

Connection pooling (don't create per request)
Prepared statements (avoid SQL parsing)
Transaction batching (reduce round trips)

</optimization_patterns>

Loop: Measure → Profile → Analyze → Optimize → Validate

Define performance goal — target metric (e.g., P95 < 100ms)
Establish baseline — measure current performance under realistic load
Profile systematically — identify actual bottleneck (not guesses)
Analyze root cause — understand why code is slow
Design optimization — plan targeted improvement
Implement optimization — make focused change
Measure improvement — verify gains, check for regressions
Document results — record baseline, optimization, gains, tradeoffs

At each step:

Document measurements with methodology
Note profiling tool output
Track optimization attempts (what worked/failed)
Update performance documentation

Before declaring optimization complete:

Check gains:

✓ Measured improvement meets target?
✓ Improvement statistically significant?
✓ Tested under realistic load?
✓ Multiple runs confirm consistency?

Check regressions:

✓ No degradation in other metrics?
✓ Memory usage still acceptable?
✓ Code complexity still manageable?
✓ Tests still pass?

Check documentation:

✓ Baseline measurements recorded?
✓ Optimization approach explained?
✓ Gains quantified with numbers?
✓ Tradeoffs documented?

ALWAYS:

Measure before optimizing (baseline)
Profile to find actual bottleneck
Use realistic workload (not toy data)
Measure multiple runs (account for variance)
Document baseline and improvements
Check for regressions in other metrics
Consider readability vs performance tradeoff
Verify statistical significance

NEVER:

Optimize without measuring first
Guess at bottleneck without profiling
Benchmark with unrealistic data
Trust single-run measurements
Skip documentation of results
Sacrifice correctness for speed
Optimize without clear performance goal
Ignore algorithmic improvements

Methodology:

benchmarking.md — rigorous benchmarking methodology

Related skills:

codebase-recon — evidence-based investigation (foundation)
debugging — structured bug investigation
typescript-dev — correctness before performance

Maintainer

outfitter-dev Core maintainer

Source details

Full Name: outfitter-dev/agents
Branch: main
Path in repo: plugins/outfitter/skills/performance
License: MIT License

Featured Tools

Join Our Newsletter

Reference for Outfitter Stack patterns including Result types, Handler contract, Error taxonomy, and @outfitter/* package conventions. Use when learning the stack, looking up patterns, understanding packages, or when "Result", "Handler", "error taxonomy", "OutfitterError", "CLI output", "pagination", "MCP server", "MCP tool", "structured logging", "redaction", "test handler", "daemon", "IPC", or "@outfitter/*" are mentioned.

26 0

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

Metadata

SKILL.md

Performance Engineering

Key Performance Indicators

TypeScript/Bun

Rust

Algorithm Improvements

Memory Optimization

I/O Optimization

Database Optimization

Recommended Agent Skills

stack-feedback

stack-architecture

stack-templates

stack-audit

stack-review

stack-patterns