Agent skill

experiment-tracking

Master ML experiment tracking - MLflow, W&B, Neptune, versioning, reproducibility

Install this agent skill to your Project

npx add-skill https://github.com/pluginagentmarketplace/custom-plugin-mlops/tree/main/skills/experiment-tracking

SKILL.md

Experiment Tracking Skill

Learn: Master ML experiment tracking for reproducibility and collaboration.

Skill Overview

Attribute	Value
Bonded Agent	02-experiment-tracking
Difficulty	Intermediate
Duration	30 hours
Prerequisites	mlops-basics

Learning Objectives

Set up experiment tracking infrastructure
Log parameters, metrics, and artifacts systematically
Compare experiments and identify best models
Use model registry for version management
Collaborate with team using shared tracking

Topics Covered

Module 1: Platform Setup (6 hours)

Platform Comparison:

Feature	MLflow	W&B	Neptune
Self-hosted	✅	⚠️ (self-managed server available)	❌
Free tier	✅	✅	✅
Real-time	❌	✅	✅
Git integration	⚠️	✅	✅

Setup Exercises:

Install MLflow and start local server
Create W&B account and initialize project
Compare UI/UX of both platforms

Module 2: Experiment Logging (10 hours)

What to Log:

python

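# Complete logging example.
# Assumes `model` is a trained PyTorch module, `train_one_epoch` is your own
# (hypothetical) function returning (train_loss, val_loss), and
# confusion_matrix.png has already been written to disk.
import mlflow

with mlflow.start_run():
    # 1. Parameters (hyperparameters, configs)
    mlflow.log_params({
        "learning_rate": 0.001,
        "batch_size": 32,
        "model_type": "transformer"
    })

    # 2. Metrics (logged once per epoch; the last step holds the final values)
    for epoch in range(10):
        train_loss, val_loss = train_one_epoch(model)
        mlflow.log_metrics({
            "train_loss": train_loss,
            "val_loss": val_loss
        }, step=epoch)

    # 3. Artifacts (models, plots, configs)
    mlflow.log_artifact("confusion_matrix.png")
    mlflow.pytorch.log_model(model, "model")

    # 4. Tags (for filtering)
    mlflow.set_tags({
        "experiment_type": "baseline",
        "dataset_version": "v2.1"
    })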

Module 3: Model Registry (8 hours)

Registry Workflow:

┌─────────────────────────────────────────────────────────────────┐
│                       MODEL REGISTRY FLOW                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Train → Log Model → Register → Staging → Production → Archive  │
│                          │         │           │                │
│                          ▼         ▼           ▼                │
│                      Version 1  Validate     Deploy             │
│                      Version 2  A/B Test     Monitor            │
│                      Version N  Approve      Rollback           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Exercises:

Register a trained model
Promote model through stages
Implement rollback procedure
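
The promote and rollback exercises can be sketched as thin helpers around a registry client. This is a minimal sketch under stated assumptions, not the skill's reference solution: it assumes a client exposing MLflow's `MlflowClient.transition_model_version_stage` method and the stage names `Staging` and `Production`. The helper names `promote` and `rollback`, the `RecordingClient` stand-in, and the model name `churn-model` are hypothetical. Newer MLflow releases deprecate stages in favor of model version aliases, so read this as an illustration of the flow in the diagram rather than a recommended API choice.

```python
from typing import Any


def promote(client: Any, name: str, version: int, stage: str = "Staging") -> None:
    """Move one model version into `stage`, archiving versions already in that stage."""
    client.transition_model_version_stage(
        name=name,
        version=str(version),
        stage=stage,
        archive_existing_versions=True,
    )


def rollback(client: Any, name: str, previous_version: int) -> None:
    """Roll back by promoting a known-good earlier version to Production."""
    promote(client, name, previous_version, stage="Production")


class RecordingClient:
    """Hypothetical stand-in for MlflowClient so the sketch runs without a server."""

    def __init__(self) -> None:
        self.calls = []

    def transition_model_version_stage(
        self, name, version, stage, archive_existing_versions=False
    ):
        # Record the request instead of contacting a registry.
        self.calls.append((name, version, stage, archive_existing_versions))


client = RecordingClient()  # real use: mlflow.tracking.MlflowClient()
promote(client, "churn-model", 2, stage="Staging")
promote(client, "churn-model", 2, stage="Production")
rollback(client, "churn-model", previous_version=1)
```

With a real tracking server, the model would first be registered, for example with `mlflow.register_model(f"runs:/{run_id}/model", "churn-model")`, where `run_id` is the run that logged it.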

Module 4: Best Practices (6 hours)

Naming Conventions:

experiments/
├── {project_name}/
│   ├── {experiment_type}_{date}/
│   │   ├── run_{config_hash}/
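
One way to generate paths that follow this layout is to hash a canonical serialization of the run config. The sketch below is illustrative only: the function names, the 8-character SHA-256 prefix, the ISO date format, and the sample project and config values are assumptions, not part of the skill.

```python
import hashlib
import json
from datetime import date
from pathlib import PurePosixPath


def config_hash(config: dict, length: int = 8) -> str:
    """Short, stable hash of a config; key order does not change the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def run_path(project: str, experiment_type: str, config: dict, day: date) -> str:
    """Build experiments/{project_name}/{experiment_type}_{date}/run_{config_hash}."""
    experiment_dir = f"{experiment_type}_{day.isoformat()}"
    run_dir = f"run_{config_hash(config)}"
    return str(PurePosixPath("experiments", project, experiment_dir, run_dir))


# Hypothetical project, experiment type, and config values.
path = run_path("churn", "baseline", {"lr": 0.001, "batch_size": 32}, date(2024, 12, 1))
print(path)
```

Because the hash depends only on the config contents, two runs with identical settings land in the same `run_` directory, which makes accidental duplicates easy to spot.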

Reproducibility Checklist:

Log git commit hash
Capture environment (pip freeze)
Set and log random seeds
Log data version/hash
Save config files as artifacts
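
The checklist can be collected in one place before a run starts. The following is a minimal standard-library sketch; the function names, the dictionary keys, and the sample `train.csv` file are hypothetical, and only Python's built-in `random` module is seeded here (NumPy or PyTorch seeds would need to be set separately if you use them).

```python
import hashlib
import random
import subprocess
import sys
import tempfile
from pathlib import Path


def git_commit() -> str:
    """Return the current commit hash, or "unknown" outside a git repository."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], stderr=subprocess.DEVNULL, text=True
        ).strip()
    except (subprocess.CalledProcessError, OSError):
        return "unknown"


def pip_freeze() -> str:
    """Return `pip freeze` output for the running interpreter, or "" if pip fails."""
    try:
        return subprocess.check_output(
            [sys.executable, "-m", "pip", "freeze"],
            stderr=subprocess.DEVNULL,
            text=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return ""


def file_sha256(path: Path) -> str:
    """Hash a data file in 1 MiB chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reproducibility_snapshot(seed: int, data_path: Path) -> dict:
    """Set the random seed and gather the checklist items into one dict."""
    random.seed(seed)
    return {
        "git_commit": git_commit(),
        "python_version": sys.version.split()[0],
        "pip_freeze": pip_freeze(),
        "seed": seed,
        "data_sha256": file_sha256(data_path),
    }


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "train.csv"  # hypothetical dataset file
    data_file.write_bytes(b"id,label\n1,0\n")
    snapshot = reproducibility_snapshot(seed=42, data_path=data_file)
```

The short values (commit, seed, data hash) fit naturally into `mlflow.set_tags`, while the longer `pip_freeze` text is better stored as an artifact, for instance with `mlflow.log_text`.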

Code Templates

Template: Production Experiment Tracker

python

# templates/experiment_tracker.py
import subprocess
from datetime import datetime
from typing import Optional

import mlflow


class ProductionExperimentTracker:
    """Production-ready experiment tracking wrapper."""

    def __init__(self, experiment_name: str, tracking_uri: str):
        mlflow.set_tracking_uri(tracking_uri)
        mlflow.set_experiment(experiment_name)
        self.run = None

    def start_run(self, run_name: Optional[str] = None):
        """Start a new tracked run and return the tracker for chaining."""
        self.run = mlflow.start_run(run_name=run_name)

        # Auto-log environment info
        self._log_environment()
        return self

    def _log_environment(self):
        """Capture reproducibility information."""
        # Git info; skipped when git is missing or the code is not in a repository
        try:
            git_hash = subprocess.check_output(
                ["git", "rev-parse", "HEAD"],
                stderr=subprocess.DEVNULL,
            ).decode().strip()
            mlflow.set_tag("git_commit", git_hash)
        except (subprocess.CalledProcessError, OSError):
            pass

        # Timestamp
        mlflow.set_tag("run_timestamp", datetime.now().isoformat())

    def log_config(self, config: dict):
        """Log configuration as parameters."""
        # Flatten nested config into dotted keys, e.g. {"optimizer.lr": 0.001}
        flat_config = self._flatten_dict(config)
        mlflow.log_params(flat_config)

    def log_metrics(self, metrics: dict, step: Optional[int] = None):
        """Log metrics with optional step."""
        mlflow.log_metrics(metrics, step=step)

    def log_model(self, model, artifact_path: str = "model"):
        """Log a PyTorch model as a run artifact (no signature is attached)."""
        mlflow.pytorch.log_model(model, artifact_path)

    def end_run(self):
        """End the current run, if one is active."""
        if self.run:
            mlflow.end_run()
            self.run = None

    def _flatten_dict(self, d: dict, parent_key: str = '') -> dict:
        """Flatten nested dictionary."""
        items = []
        for k, v in d.items():
            new_key = f"{parent_key}.{k}" if parent_key else k
            if isinstance(v, dict):
                items.extend(self._flatten_dict(v, new_key).items())
            else:
                items.append((new_key, v))
        return dict(items)

Troubleshooting Guide

Issue	Cause	Solution
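Runs not syncing	Network issue	Check connectivity to the tracking server; with W&B, log in offline mode (WANDB_MODE=offline) and upload later with wandb sync
Large artifacts fail	Size limit	Store large files in cloud object storage (e.g. S3, GCS) configured as the artifact store
Duplicate run names	No uniqueness enforced	Add a timestamp or hash to run names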
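
For the duplicate run name case, a small helper can make names unique at creation time. This is a sketch, assuming a UTC timestamp plus a short random suffix is acceptable in your naming scheme; the function name `unique_run_name` and the name format are hypothetical.

```python
import secrets
from datetime import datetime, timezone


def unique_run_name(base: str) -> str:
    """Append a UTC timestamp and a random suffix to a base run name."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    suffix = secrets.token_hex(4)  # 8 hex chars; separates runs started in the same second
    return f"{base}-{stamp}-{suffix}"


name = unique_run_name("baseline")
print(name)
```

The result can be passed directly as `mlflow.start_run(run_name=name)`.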

Resources

MLflow Documentation
W&B Documentation
[See: training-pipelines] - Integrate tracking with pipelines

Version History

Version	Date	Changes
2.0.0	2024-12	Production-grade with templates
1.0.0	2024-11	Initial release

Maintainer

pluginagentmarketplace Core maintainer

Source details

Full Name: pluginagentmarketplace/custom-plugin-mlops
Branch: main
Path in repo: skills/experiment-tracking
License: Other
