Agent skill

gcloud-usage

This skill should be used when the user asks about "GCloud logs", "Cloud Logging queries", "Google Cloud metrics", "GCP observability", "trace analysis", or "debugging production issues on GCP".

Install this agent skill to your Project

npx add-skill https://github.com/fcakyon/claude-codex-settings/tree/main/plugins/gcloud-tools/skills/gcloud-usage

SKILL.md

GCP Observability Best Practices

Structured Logging

JSON Log Format

Use structured JSON logging for better queryability:

json
{
  "severity": "ERROR",
  "message": "Payment failed",
  "httpRequest": { "requestMethod": "POST", "requestUrl": "/api/payment" },
  "labels": { "user_id": "123", "transaction_id": "abc" },
  "timestamp": "2025-01-15T10:30:00Z"
}
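
A minimal sketch of emitting entries in this shape from Python, assuming the runtime captures stdout and parses each line as JSON. The helper names `format_log_entry` and `log` are hypothetical and not part of any Google library.

```python
import json
import sys
from datetime import datetime, timezone


def format_log_entry(severity, message, **fields):
    """Return one JSON log line; extra keyword arguments become top-level fields."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    entry = {
        "severity": severity,
        "message": message,
        # Render UTC as a trailing "Z", matching the example above.
        "timestamp": now.replace("+00:00", "Z"),
    }
    entry.update(fields)
    return json.dumps(entry)


def log(severity, message, **fields):
    """Write one JSON object per line to stdout (hypothetical helper)."""
    print(format_log_entry(severity, message, **fields), file=sys.stdout, flush=True)


log(
    "ERROR",
    "Payment failed",
    httpRequest={"requestMethod": "POST", "requestUrl": "/api/payment"},
    labels={"user_id": "123", "transaction_id": "abc"},
)
```

Keeping each entry on a single line matters here, because a multi-line JSON object would likely be split into several unstructured text entries.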

Severity Levels

Use appropriate severity for filtering:

  • DEBUG: Detailed diagnostic info
  • INFO: Normal operations, milestones
  • NOTICE: Normal but significant events
  • WARNING: Potential issues, degraded performance
  • ERROR: Failures that don't stop the service
  • CRITICAL: Severe failures or outages
  • ALERT: A person must take action immediately
  • EMERGENCY: System is unusable
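
One way to apply these levels from application code is to translate Python's standard logging levels into severity names. The mapping below is an assumption for illustration: Python has no NOTICE, ALERT, or EMERGENCY level, so those are never produced, and levels below DEBUG fall back to DEFAULT (the severity Cloud Logging uses for entries with no assigned level).

```python
import logging

# Assumed mapping from Python's standard levels to the severity names above.
LEVEL_TO_SEVERITY = {
    logging.DEBUG: "DEBUG",
    logging.INFO: "INFO",
    logging.WARNING: "WARNING",
    logging.ERROR: "ERROR",
    logging.CRITICAL: "CRITICAL",
}


def to_severity(levelno):
    """Map a numeric Python log level to the nearest severity at or below it."""
    chosen = "DEFAULT"
    for level in sorted(LEVEL_TO_SEVERITY):
        if levelno >= level:
            chosen = LEVEL_TO_SEVERITY[level]
    return chosen


print(to_severity(logging.ERROR))  # ERROR
print(to_severity(35))             # WARNING (custom level between WARNING and ERROR)
```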

Log Filtering Queries

Common Filters

-- By severity
severity >= WARNING

-- By resource
resource.type="cloud_run_revision"
resource.labels.service_name="my-service"

-- By time
timestamp >= "2025-01-15T00:00:00Z"

-- By text content
textPayload =~ "error.*timeout"

-- By JSON field
jsonPayload.user_id = "123"

-- Combined
severity >= ERROR AND resource.labels.service_name="api"
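
When filters are assembled in scripts, a small helper can keep the quoting and AND-joining consistent. This is a sketch only; `build_filter` and `quote` are hypothetical names, and each clause is wrapped in parentheses so that a clause containing OR does not change the meaning of the combined filter.

```python
def quote(value):
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_filter(*clauses):
    """Join non-empty filter clauses with AND, parenthesizing each one."""
    parts = [f"({c.strip()})" for c in clauses if c and c.strip()]
    return " AND ".join(parts)


service = "api"
log_filter = build_filter(
    "severity >= ERROR",
    f"resource.labels.service_name={quote(service)}",
    'timestamp >= "2025-01-15T00:00:00Z"',
)
print(log_filter)
# (severity >= ERROR) AND (resource.labels.service_name="api") AND (timestamp >= "2025-01-15T00:00:00Z")
```

The resulting string can then be passed as the filter argument to `gcloud logging read` or pasted into the Logs Explorer.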

Advanced Queries

-- Regex matching
textPayload =~ "status=[45][0-9]{2}"

-- Substring search
textPayload : "connection refused"

-- Multiple values
severity = (ERROR OR CRITICAL)
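
Before running a regex filter against a large log volume, it can help to check the pattern against a few sample lines locally. The sketch below uses Python's `re` module, which is not the same engine as the RE2 syntax used by Cloud Logging, so it is only a sanity check for simple patterns like this one. The sample lines are made up, and the unanchored `search` reflects the assumption that `=~` matches anywhere in the field.

```python
import re

# Same pattern as the regex filter above: any 4xx or 5xx status.
STATUS_4XX_5XX = re.compile(r"status=[45][0-9]{2}")

# Hypothetical log lines for illustration.
samples = [
    "GET /api/items status=200 12ms",
    "POST /api/payment status=502 upstream error",
    "GET /missing status=404",
]

# search() matches anywhere in the line rather than only at the start.
matches = [line for line in samples if STATUS_4XX_5XX.search(line)]
for line in matches:
    print(line)
```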

Metrics vs Logs vs Traces

When to Use Each

Metrics: Aggregated numeric data over time

  • Request counts, latency percentiles
  • Resource utilization (CPU, memory)
  • Business KPIs (orders/minute)

Logs: Detailed event records

  • Error details and stack traces
  • Audit trails
  • Debugging specific requests

Traces: Request flow across services

  • Latency breakdown by service
  • Identifying bottlenecks
  • Distributed system debugging

Alert Policy Design

Alert Best Practices

  • Avoid alert fatigue: Only alert on actionable issues
  • Use multi-condition alerts: Reduce noise from transient spikes
  • Set appropriate windows: 5-15 min for most metrics
  • Include runbook links: Help responders act quickly

Common Alert Patterns

Error rate:

  • Condition: Error rate > 1% for 5 minutes
  • Good for: Service health monitoring

Latency:

  • Condition: P99 latency > 2s for 10 minutes
  • Good for: Performance degradation detection

Resource exhaustion:

  • Condition: Memory > 90% for 5 minutes
  • Good for: Capacity planning triggers
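
The "threshold for a duration" shape shared by these conditions can be sketched as follows. This is a simplified model for reasoning about why a duration suppresses transient spikes, not how Cloud Monitoring evaluates alert policies; `sustained_breach` and the sample data are hypothetical.

```python
def sustained_breach(samples, threshold, duration_s):
    """Return True if the value has stayed above threshold for at least
    duration_s seconds, measured back from the most recent sample.

    samples: list of (timestamp_seconds, value) tuples sorted by time.
    """
    if not samples:
        return False
    last_ts = samples[-1][0]
    run_start = None
    # Walk backwards through the trailing run of breaching samples.
    for ts, value in reversed(samples):
        if value <= threshold:
            break
        run_start = ts
    return run_start is not None and last_ts - run_start >= duration_s


# Error rate sampled once per minute (hypothetical data).
steady = [(0, 0.005), (60, 0.02), (120, 0.03), (180, 0.02),
          (240, 0.015), (300, 0.02), (360, 0.04)]
spike = [(0, 0.001), (60, 0.001), (120, 0.05)]

print(sustained_breach(steady, threshold=0.01, duration_s=300))  # True
print(sustained_breach(spike, threshold=0.01, duration_s=300))   # False
```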

Cost Optimization

Reducing Log Costs

  • Exclusion filters: Drop verbose logs at ingestion
  • Sampling: Log only a percentage of high-volume events
  • Retention: Avoid extending retention beyond the default 30 days unless required, since longer retention adds storage charges
  • Cheaper storage: Route low-priority logs through a sink to cheaper storage, such as a Cloud Storage bucket
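
Sampling can also be applied in the application before a log line is written. The sketch below hashes a key such as a request ID, so that every log line for the same request is either kept or dropped together. `should_log` is a hypothetical helper, and the choice of key is an assumption.

```python
import hashlib


def should_log(key, sample_rate):
    """Deterministically keep roughly sample_rate (0.0 to 1.0) of all keys."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the top 53 bits so the division is exact and lands in [0, 1).
    bucket = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return bucket < sample_rate


kept = sum(should_log(f"req-{i}", 0.1) for i in range(10_000))
print(f"kept {kept} of 10000 requests")
```

Hashing rather than calling a random number generator keeps the decision stable across services, which helps when following one request through several logs.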

Exclusion Filter Examples

-- Exclude health checks
resource.type="cloud_run_revision" AND httpRequest.requestUrl="/health"

-- Exclude debug logs in production
severity = DEBUG

Debugging Workflow

  1. Start with metrics: Identify when issues started
  2. Correlate with logs: Filter logs around problem time
  3. Use traces: Follow specific requests across services
  4. Check resource logs: Look for infrastructure issues
  5. Compare baselines: Check against known-good periods
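
Steps 2 and 5 both come down to a time-bounded filter: one around the incident and one around a known-good period. A minimal sketch, assuming a 15-minute margin on each side and a baseline taken one week earlier; `window_filter` is a hypothetical helper and both values are arbitrary choices.

```python
from datetime import datetime, timedelta, timezone


def window_filter(center, before=timedelta(minutes=15), after=timedelta(minutes=15)):
    """Build a timestamp range clause around a timezone-aware datetime."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = (center - before).astimezone(timezone.utc).strftime(fmt)
    end = (center + after).astimezone(timezone.utc).strftime(fmt)
    return f'timestamp >= "{start}" AND timestamp <= "{end}"'


incident = datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc)

print(window_filter(incident))
# timestamp >= "2025-01-15T10:15:00Z" AND timestamp <= "2025-01-15T10:45:00Z"

# Same window one week earlier, as a known-good baseline for comparison.
print(window_filter(incident - timedelta(days=7)))
# timestamp >= "2025-01-08T10:15:00Z" AND timestamp <= "2025-01-08T10:45:00Z"
```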
