Agent skill

deploy-monitoring

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/deploy-monitoring

SKILL.md

📊 Deploy Monitoring

Monitoring, alerting, and rollback strategies.

❤️ Health Checks

typescript

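// Liveness: the process is up and able to answer requests.
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', version: process.env.APP_VERSION });
});

// Readiness: answer 503 while the database is unreachable so traffic is held back.
app.get('/ready', async (req, res) => {
  try {
    await db.$queryRaw`SELECT 1`;
    res.json({ status: 'ready' });
  } catch {
    res.status(503).json({ status: 'not ready' });
  }
});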

📈 Metrics (Prometheus)

typescript

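// Assumes the prom-client package, which provides the Histogram class.
import { Histogram } from 'prom-client';

// Request latency in seconds, labelled by method, route, and status code.
const httpDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests',
  labelNames: ['method', 'route', 'status'],
});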

🚨 Alert Rules

yaml

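- alert: HighErrorRate
  # Ratio of 5xx responses to all responses over the last 5 minutes
  expr: |
    sum(rate(http_requests_total{status=~"5.."}[5m]))
      / sum(rate(http_requests_total[5m])) > 0.05
  for: 5m
  labels:
    severity: critical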
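
An error-rate alert of this kind is usually read as "more than 5% of requests are failing". The arithmetic behind that reading can be sketched as follows; the `CounterSample` shape and the `errorRatio` helper are illustrative assumptions, not part of the skill or of Prometheus.

```typescript
// Hypothetical shape: two readings of cumulative request counters taken some time apart.
interface CounterSample {
  total: number;  // all requests served so far
  errors: number; // requests that ended with a 5xx status so far
}

// Fraction of the requests between two samples that failed.
// Returns 0 when no requests arrived, to avoid dividing by zero.
function errorRatio(earlier: CounterSample, later: CounterSample): number {
  const requests = later.total - earlier.total;
  const failures = later.errors - earlier.errors;
  if (requests <= 0) return 0;
  return failures / requests;
}

const ratio = errorRatio({ total: 1000, errors: 10 }, { total: 2000, errors: 90 });
// 80 failures out of 1000 requests is 8%, which is above the 5% threshold.
console.log(ratio > 0.05); // prints: true
```

Comparing a ratio rather than a raw per-second count keeps the threshold meaningful at both low and high traffic.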

⏪ Rollback

bash

# Kubernetes
kubectl rollout undo deployment/app

# Vercel
vercel rollback

🔄 Workflow

Source: Google SRE Book - Monitoring & Prometheus Best Practices

Phase 1: Observability Instrumentation

Health Checks: Define the /health (liveness) and /ready (readiness) endpoints.
Custom Metrics: Export application-specific critical metrics (e.g. order count, error rate) for Prometheus/Grafana.
Log Centralization: Collect scattered logs in a central store such as ELK (Elasticsearch/Logstash/Kibana) or Datadog.

Phase 2: SLI/SLO & Alerting Setup

Defining SLIs: Define the service level indicators and their targets (e.g. latency < 200 ms, error rate < 1%).
Alert Groups: Route critical failures (P0) to phone/PagerDuty and informational alerts to Slack.
Error Budget: Calculate how far you may fall outside your SLO (the error budget) and halt deploys as the budget nears exhaustion.
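
The error budget step can be made concrete with a little arithmetic. Below is a minimal sketch for a request-based SLO; the `errorBudget` function, the `BudgetStatus` fields, and the 80% freeze threshold are all assumptions chosen for illustration, not values taken from the skill.

```typescript
// Error budget for a request-based SLO over one measurement window.
// All names here are illustrative.
interface BudgetStatus {
  allowedFailures: number; // failures the SLO tolerates in the window
  remaining: number;       // failures still available before the SLO is breached
  consumed: number;        // fraction of the budget already spent (may exceed 1)
  freezeDeploys: boolean;  // true once consumption reaches the freeze threshold
}

function errorBudget(
  sloTarget: number,       // e.g. 0.99 for "error rate < 1%"
  totalRequests: number,
  failedRequests: number,
  freezeAt = 0.8,          // assumed policy: stop deploying at 80% consumed
): BudgetStatus {
  const allowedFailures = (1 - sloTarget) * totalRequests;
  let consumed: number;
  if (allowedFailures > 0) {
    consumed = failedRequests / allowedFailures;
  } else {
    // No budget at all: any failure exhausts it.
    consumed = failedRequests > 0 ? Infinity : 0;
  }
  return {
    allowedFailures,
    remaining: allowedFailures - failedRequests,
    consumed,
    freezeDeploys: consumed >= freezeAt,
  };
}

// A 99% SLO over 1,000,000 requests tolerates about 10,000 failures;
// 8,500 failures means roughly 85% of the budget is gone.
const budget = errorBudget(0.99, 1_000_000, 8_500);
console.log(budget.freezeDeploys); // prints: true
```

Freezing before the budget is fully spent leaves room for the failures a rollback or hotfix may itself cause.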

Phase 3: Analysis & Incident Response

Dashboarding: Build real-time Grafana dashboards that show system health.
Post-Mortem: After every major incident, perform a root cause analysis and document it.
Automated Rollback: Have the system return automatically to the previous stable version when a critical alert fires.
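
One way to approach the automated rollback step is to separate the decision from the action. The sketch below is only an assumption about how that could look: the `FiringAlert` shape, the `shouldRollback` and `rollbackCommand` helpers, and the 300-second grace period are hypothetical, while the command string is the Kubernetes one from the Rollback section.

```typescript
// Hypothetical alert shape, loosely modelled on the fields of an alert rule.
interface FiringAlert {
  name: string;
  severity: 'critical' | 'warning' | 'info';
  firingForSeconds: number;
}

// Decide whether to roll back: only critical alerts that have been firing for at
// least `minFiringSeconds` count, so a brief blip does not trigger an undo.
function shouldRollback(alerts: FiringAlert[], minFiringSeconds = 300): boolean {
  return alerts.some(
    (a) => a.severity === 'critical' && a.firingForSeconds >= minFiringSeconds,
  );
}

// Build the rollback command for a Kubernetes deployment; running it is left to the caller.
function rollbackCommand(deployment: string): string {
  return `kubectl rollout undo deployment/${deployment}`;
}

const firing: FiringAlert[] = [
  { name: 'HighErrorRate', severity: 'critical', firingForSeconds: 420 },
];
if (shouldRollback(firing)) {
  console.log(rollbackCommand('app')); // prints: kubectl rollout undo deployment/app
}
```

Keeping the decision as a pure function makes it easy to test the policy without touching a cluster.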

Checkpoints

Phase	Verification
1	Does monitoring kick in automatically when a new service is added?
2	Do alerts contain actionable information?
3	Is PII (personal data) masked in logs?
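
For the PII checkpoint, masking can be applied to each log line before it leaves the process. The following is a minimal sketch under simple assumptions: the `maskPii` helper and both patterns are illustrative, and real masking rules depend on the kinds of personal data a service actually handles.

```typescript
// Mask e-mail addresses and long digit runs (e.g. card or phone numbers) in a log line.
// Both patterns are deliberately simple assumptions, not a complete PII policy.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const LONG_DIGITS = /\b\d{9,}\b/g;

function maskPii(line: string): string {
  return line.replace(EMAIL, '[email]').replace(LONG_DIGITS, '[number]');
}

console.log(maskPii('order 42 placed by jane@example.com, phone 905551234567'));
// prints: order 42 placed by [email], phone [number]
```

Short numbers such as the order ID are left alone so the log line stays useful for debugging.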

Deploy Monitoring v1.5 - With Workflow

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/deploy-monitoring
License: MIT License
