Agent skill

mlops-engineer

Expert in Machine Learning Operations bridging data science and DevOps. Use when building ML pipelines, model versioning, feature stores, or production ML serving. Triggers include "MLOps", "ML pipeline", "model deployment", "feature store", "model versioning", "ML monitoring", "Kubeflow", "MLflow".

Install this agent skill in your project

npx add-skill https://github.com/404kidwiz/claude-supercode-skills/tree/main/mlops-engineer-skill

SKILL.md

MLOps Engineer

Purpose

Provides expertise in Machine Learning Operations, bridging data science and DevOps practices. Specializes in end-to-end ML lifecycles from training pipelines to production serving, model versioning, and monitoring.

When to Use

  • Building ML training and serving pipelines
  • Implementing model versioning and registry
  • Setting up feature stores
  • Deploying models to production
  • Monitoring model performance and drift
  • Automating ML workflows (CI/CD for ML)
  • Implementing A/B testing for models
  • Managing experiment tracking

Quick Start

Invoke this skill when:

  • Building ML pipelines and workflows
  • Deploying models to production
  • Setting up model versioning and registry
  • Implementing feature stores
  • Monitoring production ML systems

Do NOT invoke when:

  • Model development and training → use /ml-engineer
  • Data pipeline ETL → use /data-engineer
  • Kubernetes infrastructure → use /kubernetes-specialist
  • General CI/CD without ML → use /devops-engineer

Decision Framework

ML Lifecycle Stage?
├── Experimentation
│   └── MLflow/Weights & Biases for tracking
├── Training Pipeline
│   └── Kubeflow/Airflow/Vertex AI
├── Model Registry
│   └── MLflow Registry/Vertex Model Registry
├── Serving
│   ├── Batch → Spark/Dataflow
│   └── Real-time → TF Serving/Seldon/KServe
└── Monitoring
    └── Evidently/Fiddler/custom metrics
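
The tree above can also be expressed as a lookup, which may be convenient when the choice needs to be made programmatically. This is a minimal sketch: the tool names come from the tree, while the `TOOLING` dictionary and the `suggest_tools` helper are hypothetical names introduced only for illustration.

```python
from typing import List, Optional

# Tool options copied from the decision tree; the dictionary layout is an assumption.
TOOLING = {
    "experimentation": ["MLflow", "Weights & Biases"],
    "training pipeline": ["Kubeflow", "Airflow", "Vertex AI"],
    "model registry": ["MLflow Registry", "Vertex Model Registry"],
    "serving": {
        "batch": ["Spark", "Dataflow"],
        "real-time": ["TF Serving", "Seldon", "KServe"],
    },
    "monitoring": ["Evidently", "Fiddler", "custom metrics"],
}


def suggest_tools(stage: str, mode: Optional[str] = None) -> List[str]:
    """Return candidate tools for a lifecycle stage (hypothetical helper)."""
    options = TOOLING[stage.strip().lower()]
    if isinstance(options, dict):
        # Serving branches into batch and real-time, so a mode is required.
        if mode is None:
            raise ValueError("serving needs a mode: 'batch' or 'real-time'")
        return options[mode.strip().lower()]
    return options
```

For example, `suggest_tools("Serving", "batch")` returns the batch options from the tree.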

Core Workflows

1. ML Pipeline Setup

  1. Define pipeline stages (data prep, training, eval)
  2. Choose orchestrator (Kubeflow, Airflow, Vertex)
  3. Containerize each pipeline step
  4. Implement artifact storage
  5. Add experiment tracking
  6. Configure automated retraining triggers
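
The steps above can be sketched without committing to an orchestrator. This is a minimal sketch under stated assumptions: the stage functions (`prep_data`, `train`, `evaluate`), the `run_pipeline` helper, and the in-memory `ARTIFACTS` dictionary are all hypothetical stand-ins. In a real pipeline each stage would likely become a containerized step in Kubeflow, Airflow, or Vertex AI, and artifacts would live in durable storage.

```python
import hashlib
import json

# Hypothetical in-memory artifact store; a real pipeline would use durable storage.
ARTIFACTS = {}


def artifact_key(stage, payload):
    """Content-addressed key, so identical outputs map to the same artifact."""
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return f"{stage}/{digest[:12]}"


def prep_data(raw):
    # Toy data prep: drop negative values, then split 80/20.
    clean = [x for x in raw if x >= 0]
    cut = int(len(clean) * 0.8)
    return {"train": clean[:cut], "test": clean[cut:]}


def train(data):
    # Toy "model": a constant predictor equal to the training mean.
    values = data["train"]
    return {"mean": sum(values) / len(values), "test": data["test"]}


def evaluate(model):
    # Mean absolute error of the constant predictor on the held-out split.
    errors = [abs(x - model["mean"]) for x in model["test"]]
    return {"mae": sum(errors) / len(errors)}


def run_pipeline(raw, stages):
    """Run stages in order and store every stage output as an artifact."""
    payload, keys = raw, []
    for stage in stages:
        payload = stage(payload)
        key = artifact_key(stage.__name__, payload)
        ARTIFACTS[key] = payload
        keys.append(key)
    return payload, keys


metrics, keys = run_pipeline([1.0, 2.0, -5.0, 3.0, 4.0, 5.0], [prep_data, train, evaluate])
```

Keeping each stage a pure function of its input is one way to make the later containerization step mechanical, since every stage already has an explicit input and output artifact.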

2. Model Deployment

  1. Register model in model registry
  2. Build serving container
  3. Deploy to serving infrastructure
  4. Configure autoscaling
  5. Implement canary/shadow deployment
  6. Set up monitoring and alerts
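
The canary step can be illustrated with a deterministic traffic split. This is a minimal sketch, not how any particular serving platform implements it: `route_request` and `should_promote` are hypothetical helpers, and the promotion rule (canary error rate within a fixed tolerance of the stable one) is an assumption chosen for simplicity.

```python
import hashlib


def route_request(request_id: str, canary_percent: int) -> str:
    """Return 'canary' or 'stable' for a request, deterministically."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the ID into a bucket 0..99; the same ID always lands in the same bucket.
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


def should_promote(canary_error_rate: float, stable_error_rate: float,
                   tolerance: float = 0.01) -> bool:
    """Assumed rule: promote only if the canary is not worse than stable by more than tolerance."""
    return canary_error_rate <= stable_error_rate + tolerance
```

Hashing the request or user ID, rather than sampling at random, keeps a given caller on the same model version for the duration of the rollout, which tends to make comparisons between the two versions easier to interpret.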

3. Model Monitoring

  1. Define key metrics (latency, throughput, accuracy)
  2. Implement data drift detection
  3. Set up prediction monitoring
  4. Create alerting thresholds
  5. Build dashboards for visibility
  6. Automate retraining triggers
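
As one possible way to implement the drift detection step, the sketch below computes a Population Stability Index (PSI) between a reference sample and a live sample of a single numeric feature. PSI is swapped in here as an illustrative technique; the skill does not prescribe it. The function names, the bin count, and the alert threshold of 0.2 are assumptions to tune per feature.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_alert(expected, actual, threshold=0.2):
    """Assumed alerting rule: flag drift when PSI exceeds the threshold."""
    return psi(expected, actual) > threshold


reference = [i / 100 for i in range(100)]      # training-time feature values
live = [0.5 + i / 200 for i in range(100)]     # production values shifted upward
```

A check like `drift_alert(reference, live)` could feed both the alerting thresholds and the retraining trigger listed above.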

Best Practices

  • Version everything: code, data, models, configs
  • Use feature stores for consistency between training and serving
  • Implement CI/CD specifically designed for ML workflows
  • Monitor data drift and model performance continuously
  • Use canary deployments for model rollouts
  • Keep training and serving environments consistent
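
To make "version everything" concrete, one option is to derive a single fingerprint from the code version, the data, and the config, and attach it to every registered model. This is a minimal sketch: the `fingerprint` helper and its parameters are hypothetical, and tools such as a model registry or data versioning system would normally track these pieces for you.

```python
import hashlib
import json


def fingerprint(code_version: str, data_bytes: bytes, config: dict) -> str:
    """Combine code, data, and config into one reproducibility fingerprint."""
    parts = [
        hashlib.sha256(code_version.encode()).digest(),   # e.g. a commit hash
        hashlib.sha256(data_bytes).digest(),               # raw training data
        # sort_keys makes the hash independent of dictionary key order.
        hashlib.sha256(json.dumps(config, sort_keys=True).encode()).digest(),
    ]
    return hashlib.sha256(b"".join(parts)).hexdigest()
```

If any of the three inputs changes, the fingerprint changes, so two models with the same fingerprint were trained from the same code, data, and config.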

Anti-Patterns

Anti-Pattern           | Problem                     | Correct Approach
Manual deployments     | Error-prone, slow           | Automated ML CI/CD
Training-serving skew  | Prediction errors           | Feature stores
No model versioning    | Can't reproduce or rollback | Model registry
Ignoring data drift    | Silent degradation          | Continuous monitoring
Notebook-to-production | Unmaintainable              | Proper pipeline code
