Agent skill
vps-deployment-specialist
Full-stack VPS deployment: Docker, Caddy/Nginx, SSL, CI/CD, monitoring, security hardening. Use for server setup, deployment failures, SSL issues, Docker problems, reverse proxy config, GitHub Actions CI/CD, permission errors, firewall config. Triggers on: "deploy failed", "SSL error", "502 bad gateway", "connection refused", "docker build failed", "permission denied", "CI/CD", "Caddy", "Nginx", "firewall", "fail2ban", "UFW", "SSH", "server hardening", "reverse proxy".
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/vps-deployment-specialist
SKILL.md
VPS Deployment Specialist Skill
Full-stack VPS deployment covering server setup, Docker orchestration, reverse proxy (Nginx/Caddy), SSL/TLS, CI/CD pipelines, monitoring, and troubleshooting.
Trigger Patterns
- Deploy failed / deployment error
- SSL certificate error / HTTPS not working
- 502 Bad Gateway / 503 Service Unavailable
- Connection refused / timeout
- Docker build failed / container won't start
- Permission denied (SSH, Docker, files)
- GitHub Actions CI/CD issues
- Caddy / Nginx configuration
- Firewall / UFW blocking connections
- Server hardening / security setup
Quick Diagnostics
First Steps (Always Start Here)
# 1. Check service status
docker ps # Running containers
systemctl status docker # Docker daemon
systemctl status caddy # Reverse proxy (if using Caddy)
# 2. Check logs
docker compose logs -f --tail=100 # Container logs
journalctl -u docker -n 50 # Docker daemon logs
tail -f /var/log/syslog # System logs
# 3. Verify network
curl -I localhost # Local endpoint
ss -tlnp # Listening ports
ufw status # Firewall rules
# 4. Check resources
df -h # Disk space
free -h # Memory
docker system df # Docker disk usage
1. Server Initial Setup & Hardening
Create Deploy User
# Create user with sudo
adduser deploy
usermod -aG sudo deploy
# Set up SSH key auth
mkdir -p /home/deploy/.ssh
chmod 700 /home/deploy/.ssh
echo "YOUR_PUBLIC_KEY" >> /home/deploy/.ssh/authorized_keys
chmod 600 /home/deploy/.ssh/authorized_keys
chown -R deploy:deploy /home/deploy/.ssh
Harden SSH (/etc/ssh/sshd_config)
# Disable root login
PermitRootLogin no
# Disable password auth (key only)
PasswordAuthentication no
PubkeyAuthentication yes
# Use strong key types
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512
# Optional: Change port (obscurity, not security)
# Port 2222
# Apply changes
systemctl restart sshd
Install Docker
# Official Docker install
curl -fsSL https://get.docker.com | sh
usermod -aG docker deploy
# Log out and back in for group to take effect
Configure UFW Firewall
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp # SSH (or custom port)
ufw allow 80/tcp # HTTP
ufw allow 443/tcp # HTTPS
ufw enable
# Verify
ufw status verbose
Install Fail2ban
apt install fail2ban -y
# Create jail config
cat > /etc/fail2ban/jail.local << 'EOF'
[sshd]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
findtime = 600
bantime = 3600
# Progressive: increase ban time on repeat offenders
bantime.increment = true
bantime.factor = 24
EOF
systemctl enable fail2ban
systemctl start fail2ban
Enable Automatic Security Updates
apt install unattended-upgrades -y
dpkg-reconfigure -plow unattended-upgrades
Set Up Swap (if low RAM)
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab
2. Docker Configuration
Common Issues
| Issue | Solution |
|---|---|
| Build context too large | Add .dockerignore with node_modules, .git, etc. |
| Layer caching broken | Order Dockerfile: deps first, code last |
| Network in containers | Use Docker networks, check DNS |
| Volume permissions | Match UID/GID or use named volumes |
| Running as root | Add USER directive, create non-root user |
Multi-stage Build Pattern (Secure)
# Build stage
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine
# Create non-root user
RUN addgroup -g 1001 -S appgroup && \
adduser -u 1001 -S appuser -G appgroup
COPY --from=build --chown=appuser:appgroup /app/dist /usr/share/nginx/html
USER appuser
EXPOSE 80
Security Best Practices (2025-2026)
# Use specific versions, not :latest
FROM node:20.10-alpine AS build
# Enable BuildKit for faster builds
# DOCKER_BUILDKIT=1 docker build .
# Use cache mounts for package managers
RUN --mount=type=cache,target=/root/.npm npm ci
# Health check
HEALTHCHECK --interval=30s --timeout=3s \
CMD curl -f http://localhost/ || exit 1
Image Scanning
# Docker Scout (built-in)
docker scout cves <image>
# Trivy (popular open source)
trivy image <image>
Docker Content Trust (signed images)
export DOCKER_CONTENT_TRUST=1
docker pull nginx:alpine # Only pulls if signed
3. Reverse Proxy (Caddy)
Why Caddy: Automatic HTTPS (Let's Encrypt/ZeroSSL), HTTP/3 support, ~15-25% the config size of nginx, built-in health checks, graceful reloads.
Production Setup with Health Checks
example.com {
reverse_proxy app:80 {
health_uri /health
health_interval 30s
health_timeout 5s
}
encode gzip zstd
header {
Strict-Transport-Security "max-age=31536000; includeSubDomains"
X-Content-Type-Options "nosniff"
X-Frame-Options "DENY"
Referrer-Policy "strict-origin-when-cross-origin"
-Server # Remove server header
}
}
WebSocket + API Routing
example.com {
handle /api/* {
reverse_proxy backend:3000
}
handle /ws/* {
reverse_proxy backend:3000 {
header_up Connection {http.request.header.Connection}
header_up Upgrade {http.request.header.Upgrade}
}
}
handle {
reverse_proxy frontend:80
}
}
Rate Limiting & Circuit Breaking
example.com {
# Rate limit: 100 requests per minute per IP
rate_limit {remote.ip} 100r/m
reverse_proxy app:80 {
lb_try_duration 5s
lb_try_interval 250ms
fail_duration 30s
}
}
Trusted Proxies (important for real IP)
{
servers {
trusted_proxies static 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
}
}
Local SSL with sslip.io
# Use IP-based domain for automatic SSL
flowstate.84.46.253.137.sslip.io {
reverse_proxy app:80
}
4. Reverse Proxy (Nginx)
Basic Setup
server {
listen 80;
server_name example.com;
location / {
proxy_pass http://app:80;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
SSL with Let's Encrypt
server {
listen 443 ssl http2;
server_name example.com;
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
# Modern SSL config
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
ssl_prefer_server_ciphers off;
location / {
proxy_pass http://app:80;
}
}
WebSocket Support
location /ws/ {
proxy_pass http://backend:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
5. SSL/TLS Configuration
Caddy (Automatic)
Caddy handles SSL automatically. Just use a domain name:
example.com {
reverse_proxy app:80
}
# SSL is automatically configured
Manual with Certbot (Nginx)
# Install
apt install certbot python3-certbot-nginx -y
# Generate cert
certbot --nginx -d example.com
# Auto-renewal (in cron)
0 0 * * * certbot renew --quiet
Testing SSL
# Check certificate
openssl s_client -connect example.com:443 -servername example.com
# Test configuration
curl -vI https://example.com
# Check expiry
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
6. GitHub Actions CI/CD
CRITICAL: Supply Chain Security (March 2025)
The tj-actions/changed-files action was compromised, exposing secrets for 23,000+ repos. Best practices:
- Pin actions to SHA, not version tags
- Review third-party actions before use
- Use OIDC instead of long-lived secrets
Secure Deploy Workflow
name: Deploy to VPS
on:
push:
branches: [master]
# Explicitly declare minimal permissions
permissions:
contents: read
jobs:
deploy:
runs-on: ubuntu-latest
# Use environment for production secrets
environment: production
steps:
# Pin to SHA for security
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- name: Setup Node
uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- name: Build
run: |
npm ci
npm run build
- name: Deploy via SSH
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.VPS_HOST }}
username: ${{ secrets.VPS_USER }}
key: ${{ secrets.SSH_PRIVATE_KEY }}
script: |
cd /app
docker compose pull
docker compose up -d --build
Secrets Best Practices (2025-2026)
| Practice | Why |
|---|---|
| Rotate every 30-90 days | Limits exposure window |
| Use OIDC over long-lived tokens | Eliminates static credentials |
| Environment secrets for prod | Requires reviewer approval |
| Never use structured data (JSON/YAML) | May not be redacted in logs |
| Pin actions to SHA | Prevents supply chain attacks |
OIDC for Cloud Providers (no secrets needed)
permissions:
id-token: write
contents: read
jobs:
deploy:
steps:
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/github-actions
aws-region: us-east-1
7. Docker Compose Production
version: '3.8'
services:
caddy:
image: caddy:alpine
restart: unless-stopped
ports:
- "80:80"
- "443:443"
volumes:
- ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_data:/data
- caddy_config:/config
depends_on:
app:
condition: service_healthy
app:
build: .
restart: unless-stopped
environment:
- NODE_ENV=production
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
deploy:
resources:
limits:
memory: 512M
cpus: '0.5'
volumes:
caddy_data:
caddy_config:
8. Monitoring & Health Checks
Docker Health Check (Dockerfile)
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
CMD curl -f http://localhost/ || exit 1
Log Aggregation
# View all compose logs
docker compose logs -f
# Specific service with timestamps
docker compose logs -f --timestamps app
# Last 100 lines
docker compose logs --tail=100 app
Fail2ban Monitoring
# Check banned IPs
fail2ban-client status sshd
# Unban an IP
fail2ban-client set sshd unbanip 1.2.3.4
# View ban log
tail -f /var/log/fail2ban.log
Resource Monitoring
# Disk usage (alert at 80%)
df -h | awk '$5 > 80 {print}'
# Memory usage
free -h
# Docker disk usage
docker system df
# Clean up unused resources
docker system prune -a --volumes
Simple Uptime Monitoring
- Uptime Kuma - Self-hosted, easy setup
- Better Uptime - SaaS with free tier
- Simple curl check in cron:
*/5 * * * * curl -sf https://example.com/health || echo "Site down" | mail -s "Alert" admin@example.com
9. Troubleshooting Decision Tree
Deployment failed
├── Build failed?
│ ├── npm ci failed → Check package-lock.json, clear cache
│ ├── Docker build failed → Check Dockerfile, .dockerignore
│ └── Out of space → docker system prune, expand disk
├── Deploy succeeded but site down?
│ ├── 502 Bad Gateway → App container not running/healthy
│ │ └── Check: docker ps, docker compose logs app
│ ├── Connection refused → Firewall, wrong port binding
│ │ └── Check: ufw status, ss -tlnp
│ ├── SSL error → Certificate expired/misconfigured
│ │ └── Check: openssl s_client, certbot certificates
│ └── Timeout → Resource exhaustion, infinite loop
│ └── Check: free -h, top, docker stats
├── SSH failed?
│ ├── Permission denied → Check key, known_hosts, sshd_config
│ └── Connection timeout → Firewall, wrong IP, SSH port
└── Works locally but not on VPS?
├── Environment variables missing → Check .env on VPS
├── Port conflicts → ss -tlnp, check other services
└── Different file paths → Check absolute paths in config
10. Quick Commands Reference
| Task | Command |
|---|---|
| Check running containers | docker ps |
| View logs | docker compose logs -f |
| Rebuild and restart | docker compose up -d --build |
| Check disk space | df -h |
| Check memory | free -h |
| Check listening ports | ss -tlnp |
| Test local endpoint | curl -I localhost |
| Restart compose stack | docker compose restart |
| Remove unused images | docker image prune -a |
| Check firewall | ufw status |
| Check fail2ban | fail2ban-client status |
| View SSH attempts | journalctl -u sshd -n 50 |
11. Rollback Strategy
# Option 1: Git-based rollback
git checkout <previous-commit>
docker compose up -d --build
# Option 2: Docker image tags (if using registry)
docker compose pull # Gets :latest
docker compose up -d
# Option 3: Keep backup of working state
cp docker-compose.yml docker-compose.yml.backup
cp Caddyfile Caddyfile.backup
# Restore from backup
cp docker-compose.yml.backup docker-compose.yml
docker compose up -d --build
Related Skills
supabase-debugger- Database and Supabase-specific issuesdev-debugging- Application-level Vue/Pinia debuggingtauri-debugger- Desktop app deployment issues
Sources
Didn't find tool you were looking for?