vps-deployment-specialist

Full-stack VPS deployment: Docker, Caddy/Nginx, SSL, CI/CD, monitoring, security hardening. Use for server setup, deployment failures, SSL issues, Docker problems, reverse proxy config, GitHub Actions CI/CD, permission errors, firewall config. Triggers on: "deploy failed", "SSL error", "502 bad gateway", "connection refused", "docker build failed", "permission denied", "CI/CD", "Caddy", "Nginx", "firewall", "fail2ban", "UFW", "SSH", "server hardening", "reverse proxy".

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/vps-deployment-specialist

VPS Deployment Specialist Skill

Full-stack VPS deployment covering server setup, Docker orchestration, reverse proxy (Nginx/Caddy), SSL/TLS, CI/CD pipelines, monitoring, and troubleshooting.

Trigger Patterns

Deploy failed / deployment error
SSL certificate error / HTTPS not working
502 Bad Gateway / 503 Service Unavailable
Connection refused / timeout
Docker build failed / container won't start
Permission denied (SSH, Docker, files)
GitHub Actions CI/CD issues
Caddy / Nginx configuration
Firewall / UFW blocking connections
Server hardening / security setup

Quick Diagnostics

First Steps (Always Start Here)

bash

# 1. Check service status
docker ps                        # Running containers
systemctl status docker          # Docker daemon
systemctl status caddy           # Reverse proxy (if using Caddy)

# 2. Check logs
docker compose logs -f --tail=100  # Container logs
journalctl -u docker -n 50         # Docker daemon logs
tail -f /var/log/syslog            # System logs

# 3. Verify network
curl -I localhost                # Local endpoint
ss -tlnp                         # Listening ports
ufw status                       # Firewall rules

# 4. Check resources
df -h                            # Disk space
free -h                          # Memory
docker system df                 # Docker disk usage

1. Server Initial Setup & Hardening

Create Deploy User

bash

# Create user with sudo
adduser deploy
usermod -aG sudo deploy

# Set up SSH key auth
mkdir -p /home/deploy/.ssh
chmod 700 /home/deploy/.ssh
echo "YOUR_PUBLIC_KEY" >> /home/deploy/.ssh/authorized_keys
chmod 600 /home/deploy/.ssh/authorized_keys
chown -R deploy:deploy /home/deploy/.ssh
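
With the default StrictModes setting, sshd ignores keys when the .ssh directory or authorized_keys file has overly permissive modes. A minimal sketch for checking the modes set above, assuming GNU stat (Linux); check_ssh_perms is a hypothetical helper name, not part of any tool:

```shell
# Sketch: verify that an .ssh directory is 700 and its authorized_keys is 600.
# Assumes GNU stat (stat -c). check_ssh_perms is a hypothetical helper.
check_ssh_perms() (
  ssh_dir=$1
  dir_mode=$(stat -c '%a' "$ssh_dir") || exit 2
  key_mode=$(stat -c '%a' "$ssh_dir/authorized_keys") || exit 2
  status=0
  if [ "$dir_mode" != "700" ]; then
    echo "WARN: $ssh_dir is $dir_mode, expected 700"
    status=1
  fi
  if [ "$key_mode" != "600" ]; then
    echo "WARN: $ssh_dir/authorized_keys is $key_mode, expected 600"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "OK: $ssh_dir"
  exit "$status"
)
```

Usage on the server: check_ssh_perms /home/deploy/.ssh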

Harden SSH (/etc/ssh/sshd_config)

bash

# Disable root login
PermitRootLogin no

# Disable password auth (key only)
PasswordAuthentication no
PubkeyAuthentication yes

# Use strong key types
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512

# Optional: Change port (obscurity, not security)
# Port 2222

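# Apply changes (run in a shell, not part of sshd_config).
# Validate first, and keep the current session open until a new login works.
sshd -t && systemctl restart ssh   # the unit is named sshd on RHEL-family systems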
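
Before restarting the daemon, it can help to confirm the file really contains the settings above. A rough sketch, assuming a single flat config file: audit_sshd_config is a hypothetical helper that only reads the file and does not follow Include directives or Match blocks.

```shell
# Sketch: report whether a config file sets the three hardening options above.
# audit_sshd_config is a hypothetical helper; it never modifies the file.
audit_sshd_config() (
  conf=$1
  status=0
  for want in "PermitRootLogin no" "PasswordAuthentication no" "PubkeyAuthentication yes"; do
    key=${want% *}
    val=${want#* }
    # sshd uses the first value it finds for a keyword; keywords are case-insensitive.
    got=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$conf")
    if [ "$got" = "$val" ]; then
      echo "ok   $key $val"
    else
      echo "FAIL $key (found: ${got:-unset}, want: $val)"
      status=1
    fi
  done
  exit "$status"
)
```

For the effective configuration, including defaults and included files, `sshd -T` run as root prints what the daemon will actually use.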

Install Docker

bash

# Official Docker install
curl -fsSL https://get.docker.com | sh
usermod -aG docker deploy
# Log out and back in for group to take effect

Configure UFW Firewall

bash

ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp    # SSH (or custom port)
ufw allow 80/tcp    # HTTP
ufw allow 443/tcp   # HTTPS
ufw enable

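# Verify
ufw status verbose
# Note: ports published by Docker (-p / ports:) bypass UFW, because Docker writes its own iptables rules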

Install Fail2ban

bash

apt install fail2ban -y

# Create jail config
cat > /etc/fail2ban/jail.local << 'EOF'
[sshd]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
findtime = 600
bantime = 3600
# Progressive: increase ban time on repeat offenders
bantime.increment = true
bantime.factor = 24
EOF

systemctl enable fail2ban
systemctl start fail2ban

Enable Automatic Security Updates

bash

apt install unattended-upgrades -y
dpkg-reconfigure -plow unattended-upgrades

Set Up Swap (if low RAM)

bash

fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab
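
Re-running the last command appends a duplicate /etc/fstab entry. One way to make that step idempotent is sketched below; ensure_line is a hypothetical helper name.

```shell
# Sketch: append a line to a file only if that exact line is not already there.
# ensure_line is a hypothetical helper.
ensure_line() (
  file=$1
  line=$2
  # -x: match the whole line, -F: fixed string (no regex), -q: no output
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)

# Usage on the server (as root):
#   ensure_line /etc/fstab '/swapfile none swap sw 0 0'
```

Afterwards `swapon --show` and `free -h` should both list the swap file.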

2. Docker Configuration

Common Issues

Issue	Solution
Build context too large	Add `.dockerignore` with node_modules, .git, etc.
Layer caching broken	Order Dockerfile: deps first, code last
Network in containers	Use Docker networks, check DNS
Volume permissions	Match UID/GID or use named volumes
Running as root	Add USER directive, create non-root user
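
For the first row of the table, a starter .dockerignore might look like the sketch below. Only node_modules and .git come from the table; the other entries and the helper name write_dockerignore are assumptions to adjust per project.

```shell
# Sketch: write a starter .dockerignore into a project directory.
# Entries other than node_modules and .git are assumptions.
write_dockerignore() (
  project_dir=${1:-.}
  cat > "$project_dir/.dockerignore" <<'EOF'
node_modules
.git
dist
.env
*.log
EOF
)
```

Excluding .env also keeps local secrets out of the build context and therefore out of image layers.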

Multi-stage Build Pattern (Secure)

dockerfile

# Build stage
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

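# Production stage
FROM nginx:alpine
# Create non-root user and give it the paths nginx writes to at runtime
RUN addgroup -g 1001 -S appgroup && \
    adduser -u 1001 -S appuser -G appgroup && \
    touch /var/run/nginx.pid && \
    chown -R appuser:appgroup /var/cache/nginx /var/run/nginx.pid
COPY --from=build --chown=appuser:appgroup /app/dist /usr/share/nginx/html
USER appuser
# Binding port 80 as non-root relies on Docker's default unprivileged-port sysctl;
# nginxinc/nginx-unprivileged (listens on 8080) is an alternative base image
EXPOSE 80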

Security Best Practices (2025-2026)

dockerfile

# Use specific versions, not :latest
FROM node:20.10-alpine AS build

# BuildKit is the default builder since Docker Engine 23.0;
# on older versions enable it with: DOCKER_BUILDKIT=1 docker build .

# Use cache mounts for package managers
RUN --mount=type=cache,target=/root/.npm npm ci

# Health check
HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost/ || exit 1

Image Scanning

bash

# Docker Scout (bundled with Docker Desktop; a separate CLI plugin on Linux servers)
docker scout cves <image>

# Trivy (popular open source)
trivy image <image>

Docker Content Trust (signed images)

bash

export DOCKER_CONTENT_TRUST=1
docker pull nginx:alpine  # Only pulls if signed

3. Reverse Proxy (Caddy)

Why Caddy: Automatic HTTPS (Let's Encrypt/ZeroSSL), HTTP/3 support, configuration roughly 15-25% the size of an equivalent Nginx config, built-in health checks, graceful reloads.

Production Setup with Health Checks

caddyfile

example.com {
    reverse_proxy app:80 {
        health_uri /health
        health_interval 30s
        health_timeout 5s
    }
    encode gzip zstd

    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "DENY"
        Referrer-Policy "strict-origin-when-cross-origin"
        -Server  # Remove server header
    }
}

WebSocket + API Routing

caddyfile

example.com {
    handle /api/* {
        reverse_proxy backend:3000
    }
    handle /ws/* {
        reverse_proxy backend:3000 {
            header_up Connection {http.request.header.Connection}
            header_up Upgrade {http.request.header.Upgrade}
        }
    }
    handle {
        reverse_proxy frontend:80
    }
}

Rate Limiting & Circuit Breaking

caddyfile

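example.com {
    # Rate limit: 100 requests per minute per client IP.
    # rate_limit is not in stock Caddy: it needs a custom build with the
    # third-party mholt/caddy-ratelimit module (e.g. via xcaddy). Check the
    # module README for exact syntax and directive ordering.
    rate_limit {
        zone per_ip {
            key    {remote_host}
            events 100
            window 1m
        }
    }

    reverse_proxy app:80 {
        lb_try_duration 5s
        lb_try_interval 250ms
        fail_duration 30s
    }
}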

Trusted Proxies (important for real IP)

caddyfile

{
    servers {
        trusted_proxies static 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
    }
}

Local SSL with sslip.io

caddyfile

# Use IP-based domain for automatic SSL
flowstate.84.46.253.137.sslip.io {
    reverse_proxy app:80
}
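
After editing any of the Caddyfiles above, the config can be validated and reloaded without restarting the container. A minimal sketch, assuming a Compose service named caddy and the config mounted at /etc/caddy/Caddyfile (as in the Compose file in this guide); caddy_reload_safely is a hypothetical helper name.

```shell
# Sketch: validate the Caddyfile inside the running container, then reload it.
# Assumes a Compose service named "caddy" and the path /etc/caddy/Caddyfile.
# -T disables TTY allocation so this also works from CI or cron.
caddy_reload_safely() {
  docker compose exec -T caddy caddy validate --config /etc/caddy/Caddyfile &&
    docker compose exec -T caddy caddy reload --config /etc/caddy/Caddyfile
}
```

The reload only runs when validation succeeds, so a typo in the Caddyfile leaves the running config untouched.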

4. Reverse Proxy (Nginx)

Basic Setup

nginx

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

SSL with Let's Encrypt

nginx

server {
    listen 443 ssl http2;  # nginx 1.25.1+: prefer "listen 443 ssl;" with a separate "http2 on;"
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # Modern SSL config
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    location / {
        proxy_pass http://app:80;
    }
}

WebSocket Support

nginx

location /ws/ {
    proxy_pass http://backend:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
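
Nginx config changes are safest when tested before they are applied. A small sketch, assuming Nginx is installed on the host and managed by systemd; nginx_reload_safely is a hypothetical helper name.

```shell
# Sketch: reload Nginx only if the configuration passes its syntax test.
# Assumes a host-installed nginx managed by systemd.
nginx_reload_safely() {
  nginx -t && systemctl reload nginx
}
```

A reload keeps existing connections open, unlike a restart.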

5. SSL/TLS Configuration

Caddy (Automatic)

Caddy handles SSL automatically. Just use a domain name:

caddyfile

example.com {
    reverse_proxy app:80
}
# SSL is automatically configured

Manual with Certbot (Nginx)

bash

# Install
apt install certbot python3-certbot-nginx -y

# Generate cert
certbot --nginx -d example.com

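# Auto-renewal: the Debian/Ubuntu certbot package installs a systemd timer
# (check with: systemctl list-timers | grep certbot). Without it, add this crontab entry:
# 0 0 * * * certbot renew --quiet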

Testing SSL

bash

# Check certificate
openssl s_client -connect example.com:443 -servername example.com

# Test configuration
curl -vI https://example.com

# Check expiry
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
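
The expiry check above can be turned into a number of days remaining, which is easier to alert on. A sketch assuming GNU date (Linux, not BusyBox); days_until and cert_days_left are hypothetical helper names.

```shell
# Sketch: days remaining on a certificate. Assumes GNU date (date -d).
# days_until and cert_days_left are hypothetical helpers.
days_until() (
  # $1: date string as printed by openssl, e.g. "Jan  1 00:00:00 2030 GMT"
  # $2: "now" in epoch seconds (defaults to the current time)
  expiry=$(date -u -d "$1" +%s) || exit 2
  now=${2:-$(date -u +%s)}
  echo $(( (expiry - now) / 86400 ))
)

cert_days_left() (
  host=$1
  # -enddate prints a line of the form: notAfter=<date>
  not_after=$(echo | openssl s_client -connect "$host:443" -servername "$host" 2>/dev/null |
    openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
  [ -n "$not_after" ] || exit 2
  days_until "$not_after"
)

# Usage: cert_days_left example.com
```

For a simple yes/no check, `openssl x509 -checkend <seconds>` exits non-zero when the certificate expires within that many seconds.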

6. GitHub Actions CI/CD

CRITICAL: Supply Chain Security (March 2025)

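In March 2025 the tj-actions/changed-files action was compromised (CVE-2025-30066): a malicious commit dumped CI secrets into workflow logs, putting the 23,000+ repositories that used it at risk. Best practices: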

Pin actions to SHA, not version tags
Review third-party actions before use
Use OIDC instead of long-lived secrets
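
The first practice is easy to audit mechanically. A rough sketch that lists `uses:` references not pinned to a full 40-character commit SHA; find_unpinned_actions is a hypothetical helper, and it also reports docker:// image references, which may need manual review.

```shell
# Sketch: print "line-number:line" for each action reference in a workflow
# file that is not pinned to a 40-character commit SHA.
# find_unpinned_actions is a hypothetical helper. Local actions (./path)
# have no @ref and are skipped.
find_unpinned_actions() {
  grep -nE '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^[:space:]#]+@' "$1" |
    grep -vE '@[0-9a-f]{40}([[:space:]]|$)' || true
}

# Usage: for f in .github/workflows/*.yml; do find_unpinned_actions "$f"; done
```

Empty output means every external action in that file is SHA-pinned.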

Secure Deploy Workflow

yaml

name: Deploy to VPS
on:
  push:
    branches: [master]

# Explicitly declare minimal permissions
permissions:
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Use environment for production secrets
    environment: production
    steps:
      # Pin to SHA for security
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

      - name: Setup Node
        # Tag shown for brevity; pin to a full commit SHA as with checkout
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Build
        run: |
          npm ci
          npm run build

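      - name: Deploy via SSH
        # Tag shown for brevity; pin third-party actions to a full commit SHA
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.VPS_HOST }}
          username: ${{ secrets.VPS_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            set -e
            cd /app
            git pull --ff-only   # assumes /app is a git checkout of this repo
            docker compose pull
            docker compose up -d --build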

Secrets Best Practices (2025-2026)

Practice	Why
Rotate every 30-90 days	Limits exposure window
Use OIDC over long-lived tokens	Eliminates static credentials
Environment secrets for prod	Can be gated behind required reviewers
Never store structured data (JSON/YAML) as a secret	Redaction matches the exact value, so parsed fields may appear unmasked in logs
Pin actions to SHA	Prevents supply chain attacks

OIDC for Cloud Providers (no secrets needed)

yaml

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions  # placeholder 12-digit account ID
          aws-region: us-east-1

7. Docker Compose Production

yaml

# The top-level `version` key is obsolete in Compose v2 and is omitted here

services:
  caddy:
    image: caddy:alpine
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"  # HTTP/3 (QUIC)
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      app:
        condition: service_healthy

  app:
    build: .
    restart: unless-stopped
    environment:
      - NODE_ENV=production
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'

volumes:
  caddy_data:
  caddy_config:

8. Monitoring & Health Checks

Docker Health Check (Dockerfile)

dockerfile

HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
  CMD curl -f http://localhost/ || exit 1
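
Once a health check is defined, a deploy script can wait for the container to report healthy before treating the release as done. A sketch under the assumption that whole-second intervals are enough; wait_healthy is a hypothetical helper name.

```shell
# Sketch: poll a container's health status until it is "healthy" or a timeout
# passes. wait_healthy is a hypothetical helper; the container needs a
# HEALTHCHECK (or Compose healthcheck) for .State.Health to exist.
wait_healthy() (
  container=$1
  timeout=${2:-60}   # seconds (integer)
  interval=${3:-2}   # seconds between polls (integer)
  elapsed=0
  status=unknown
  while [ "$elapsed" -le "$timeout" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) ||
      status=unknown
    if [ "$status" = "healthy" ]; then
      echo "healthy after ${elapsed}s"
      exit 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "not healthy after ${timeout}s (last status: $status)" >&2
  exit 1
)

# Usage: wait_healthy "$(docker compose ps -q app)" 90
```

A non-zero exit can be used to trigger a rollback step.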

Log Aggregation

bash

# View all compose logs
docker compose logs -f

# Specific service with timestamps
docker compose logs -f --timestamps app

# Last 100 lines
docker compose logs --tail=100 app

Fail2ban Monitoring

bash

# Check banned IPs
fail2ban-client status sshd

# Unban an IP
fail2ban-client set sshd unbanip 1.2.3.4

# View ban log
tail -f /var/log/fail2ban.log

Resource Monitoring

bash

# Disk usage (alert above 80%)
df -h | awk 'NR > 1 && $5+0 > 80'

# Memory usage
free -h

# Docker disk usage
docker system df

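# Clean up unused resources (destructive: removes ALL unused images and,
# with --volumes, unused volumes; review `docker system df -v` first)
docker system prune -a --volumes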

Simple Uptime Monitoring

Uptime Kuma - Self-hosted, easy setup
Better Uptime - SaaS with free tier
Simple curl check in cron:

bash

*/5 * * * * curl -sf https://example.com/health || echo "Site down" | mail -s "Alert" admin@example.com

9. Troubleshooting Decision Tree

Deployment failed
├── Build failed?
│   ├── npm ci failed → Check package-lock.json, clear cache
│   ├── Docker build failed → Check Dockerfile, .dockerignore
│   └── Out of space → docker system prune, expand disk
├── Deploy succeeded but site down?
│   ├── 502 Bad Gateway → App container not running/healthy
│   │   └── Check: docker ps, docker compose logs app
│   ├── Connection refused → Firewall, wrong port binding
│   │   └── Check: ufw status, ss -tlnp
│   ├── SSL error → Certificate expired/misconfigured
│   │   └── Check: openssl s_client, certbot certificates
│   └── Timeout → Resource exhaustion, infinite loop
│       └── Check: free -h, top, docker stats
├── SSH failed?
│   ├── Permission denied → Check key, known_hosts, sshd_config
│   └── Connection timeout → Firewall, wrong IP, SSH port
└── Works locally but not on VPS?
    ├── Environment variables missing → Check .env on VPS
    ├── Port conflicts → ss -tlnp, check other services
    └── Different file paths → Check absolute paths in config

10. Quick Commands Reference

Task	Command
Check running containers	`docker ps`
View logs	`docker compose logs -f`
Rebuild and restart	`docker compose up -d --build`
Check disk space	`df -h`
Check memory	`free -h`
Check listening ports	`ss -tlnp`
Test local endpoint	`curl -I localhost`
Restart compose stack	`docker compose restart`
Remove unused images	`docker image prune -a`
Check firewall	`ufw status`
Check fail2ban	`fail2ban-client status`
View SSH attempts	`journalctl -u ssh -n 50` (unit is `sshd` on RHEL-family systems)

11. Rollback Strategy

bash

# Option 1: Git-based rollback
git checkout <previous-commit>
docker compose up -d --build

# Option 2: Docker image tags (if using registry)
# Point the service's image: at the last known-good tag (not :latest), then:
docker compose pull
docker compose up -d

# Option 3: Keep backup of working state
cp docker-compose.yml docker-compose.yml.backup
cp Caddyfile Caddyfile.backup

# Restore from backup
cp docker-compose.yml.backup docker-compose.yml
docker compose up -d --build
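
Option 3 overwrites the same .backup file on every run, so only one previous state survives. A timestamped variant is sketched below; backup_file is a hypothetical helper name.

```shell
# Sketch: copy a config file to a timestamped backup and print the new path.
# backup_file is a hypothetical helper.
backup_file() (
  src=$1
  dest="$src.$(date +%Y%m%d-%H%M%S).bak"
  # -p preserves mode and timestamps of the original
  cp -p -- "$src" "$dest" && echo "$dest"
)

# Usage before each deploy:
#   backup_file docker-compose.yml
#   backup_file Caddyfile
```

Restoring is then a plain cp from whichever timestamped copy was last known to work.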

Related Skills

supabase-debugger - Database and Supabase-specific issues
dev-debugging - Application-level Vue/Pinia debugging
tauri-debugger - Desktop app deployment issues

Sources

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/vps-deployment-specialist
License: MIT License
