Agent skill

prometheus-monitoring

Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.

Stars 151
Forks 20

Install this agent skill to your Project

npx add-skill https://github.com/aj-geddes/useful-ai-prompts/tree/main/skills/prometheus-monitoring

SKILL.md

Prometheus Monitoring

Table of Contents

  • Overview
  • When to Use
  • Quick Start
  • Reference Guides
  • Best Practices

Overview

Implement comprehensive Prometheus monitoring infrastructure for collecting, storing, and querying time-series metrics from applications and infrastructure.

When to Use

  • Setting up metrics collection
  • Creating custom application metrics
  • Configuring scraping targets
  • Implementing service discovery
  • Building monitoring infrastructure

Quick Start

Minimal working example:

yaml
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: production

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  - "/etc/prometheus/alert_rules.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]

  - job_name: "api-service"
// ... (see reference guides for full implementation)

Reference Guides

Detailed implementations in the references/ directory:

Guide Contents
Prometheus Configuration Prometheus Configuration
Node.js Metrics Implementation Node.js Metrics Implementation
Python Prometheus Integration Python Prometheus Integration
Alert Rules Alert Rules
Docker Compose Setup Docker Compose Setup

Best Practices

✅ DO

  • Use consistent metric naming conventions
  • Add comprehensive labels for filtering
  • Set appropriate scrape intervals (10-60s)
  • Implement retention policies
  • Monitor Prometheus itself
  • Test alert rules before deployment
  • Document metric meanings

❌ DON'T

  • Add unbounded cardinality labels
  • Scrape too frequently (< 10s)
  • Ignore metric naming conventions
  • Create alerts without runbooks
  • Store raw event data in Prometheus
  • Use counters for gauge-like values

Expand your agent's capabilities with these related and highly-rated skills.

aj-geddes/useful-ai-prompts

websocket-implementation

Implement real-time bidirectional communication with WebSockets including connection management, message routing, and scaling. Use when building real-time features, chat systems, live notifications, or collaborative applications.

151 20
Explore
aj-geddes/useful-ai-prompts

refactor-legacy-code

Modernize and improve legacy codebases while maintaining functionality. Use when you need to refactor old code, reduce technical debt, modernize deprecated patterns, or improve code maintainability without breaking existing behavior.

151 20
Explore
aj-geddes/useful-ai-prompts

Sentiment Analysis

Classify text sentiment using NLP techniques, lexicon-based analysis, and machine learning for opinion mining, brand monitoring, and customer feedback analysis

151 20
Explore
aj-geddes/useful-ai-prompts

flask-api-development

Develop lightweight Flask APIs with routing, blueprints, database integration, authentication, and request/response handling. Use when building RESTful APIs, microservices, or lightweight web services with Flask.

151 20
Explore
aj-geddes/useful-ai-prompts

ML Model Explanation

Interpret machine learning models using SHAP, LIME, feature importance, partial dependence, and attention visualization for explainability

151 20
Explore
aj-geddes/useful-ai-prompts

Statistical Hypothesis Testing

Conduct statistical tests including t-tests, chi-square, ANOVA, and p-value analysis for statistical significance, hypothesis validation, and A/B testing

151 20
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results