production-debugging
Connect to and debug the Orient production server on Oracle Cloud. Use this skill when asked to check production logs, restart containers, debug deployment issues, SSH into the server, view container status, troubleshoot 502 errors, rollback deployments, fix WhatsApp pairing issues, debug dashboard loading issues, or investigate why production is down. Covers SSH access, Docker container management, log analysis, Express 5 errors, SPA asset issues, and deployment troubleshooting.
Production Server Debugging
Debug and manage the Orient production environment on Oracle Cloud.
Prerequisites
SSH access requires these environment variables (stored in .env):
- OCI_HOST - Oracle Cloud server IP (default: 152.70.172.33)
- OCI_USER - SSH username (default: opc)
- SSH key configured in ~/.ssh/id_rsa
Quick setup (if not in .env):
export OCI_HOST=152.70.172.33
export OCI_USER=opc
SSH Connection
Connect to Production
ssh -o StrictHostKeyChecking=no $OCI_USER@$OCI_HOST
Run Remote Command
ssh -o StrictHostKeyChecking=no $OCI_USER@$OCI_HOST "command"
Example: Check if containers are running
ssh $OCI_USER@$OCI_HOST "docker ps --format 'table {{.Names}}\t{{.Status}}'"
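The remote-command pattern above can be wrapped in a small shell function so that the host and user defaults live in one place. This is a minimal sketch, not part of the Orient tooling: the name `remote` is hypothetical, the 10-second `ConnectTimeout` is an assumed value, and the fallback host and user are the defaults listed under Prerequisites.

```shell
# Hypothetical helper: run a command on the production server over SSH.
# Falls back to the defaults listed under Prerequisites when the
# environment variables are unset.
remote() {
  local host="${OCI_HOST:-152.70.172.33}"
  local user="${OCI_USER:-opc}"
  # ConnectTimeout is an assumed value so an unreachable host fails fast.
  ssh -o ConnectTimeout=10 "${user}@${host}" "$@"
}
```

With the function loaded, `remote "docker ps -a"` would run the command on the server without repeating the connection details.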
Server Directory Structure
~/orient/
├── docker/ # Compose files and configs
│ ├── docker-compose.yml
│ ├── docker-compose.v2.yml
│ ├── docker-compose.prod.yml
│ ├── docker-compose.r2.yml
│ ├── nginx-ssl.conf
│ └── deploy-server.sh
├── data/ # Persistent data (volumes)
├── logs/ # Application logs
├── backups/ # Deployment backups
└── .env # Environment variables
Working directory for Docker commands:
cd ~/orient/docker
Container Overview
| Container | Port | Purpose |
|---|---|---|
| orienter-nginx | 80, 443 | Reverse proxy, SSL termination |
| orienter-dashboard | 4098 | Dashboard web app + API |
| orienter-opencode | 4099, 8765 | AI agent API + OAuth callback |
Container Logs
View Recent Logs
# Core containers
docker logs orienter-dashboard --tail 100
docker logs orienter-bot-whatsapp --tail 100
docker logs orienter-opencode --tail 100
docker logs orienter-nginx --tail 50
Follow Logs in Real-Time
docker logs -f orienter-dashboard
Logs Since Specific Time
docker logs orienter-dashboard --since 1h
docker logs orienter-dashboard --since "2024-01-07T10:00:00"
Parse JSON Logs
docker logs orienter-dashboard --tail 50 2>&1 | jq -R 'fromjson? // .'
Search for Errors
docker logs orienter-dashboard 2>&1 | grep -i error | tail -20
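When it is unclear which container is misbehaving, the same grep can be run across all containers to get a rough error count per service. The sketch below assumes the four container names used in this document; `count_errors` is a hypothetical helper, and matching on the word "error" is only a coarse signal.

```shell
# Hypothetical helper: count log lines matching "error" per container.
# Usage: count_errors [since]   (since defaults to 1h)
count_errors() {
  local since="${1:-1h}"
  local container count
  for container in orienter-dashboard orienter-bot-whatsapp \
                   orienter-opencode orienter-nginx; do
    # 2>&1 merges stderr, where many services write their errors.
    # "|| true" keeps a zero-match grep from counting as a failure.
    count=$(docker logs "$container" --since "$since" 2>&1 | grep -ci 'error' || true)
    printf '%s\t%s\n' "$container" "$count"
  done
}
```

A container with a much higher count than the others is a reasonable place to start reading the full logs.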
Quick Debugging Reference
| Issue | Command |
|---|---|
| Check all containers | docker ps -a |
| Container crash loop | docker logs <container> --tail 100 |
| View nginx errors | docker logs orienter-nginx --tail 50 |
| Check dashboard logs | docker logs orienter-dashboard --tail 100 |
| Check OpenCode logs | docker logs orienter-opencode --tail 100 |
| Restart dashboard | docker restart orienter-dashboard |
| Check database | docker exec orienter-dashboard sqlite3 /app/data/orient.db "SELECT 1;" |
| View env vars | cat ~/orient/.env |
| Check container env vars | docker exec <container> printenv <VAR_NAME> |
| Missing env var (crash loop) | Add to .env, then docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.prod.yml up -d <service> |
| Check GitHub Actions | gh run list --limit 5 (run locally) |
| WhatsApp pairing stuck | docker restart orienter-bot-whatsapp |
| WhatsApp factory reset | rm -rf ~/orient/data/whatsapp-auth/* && docker restart orienter-bot-whatsapp |
Common Production Issues
1. Dashboard Container Crash Loop
Symptoms: Dashboard shows "Restarting" status, nginx returns 502
Check logs:
ssh $OCI_USER@$OCI_HOST "docker logs orienter-dashboard --tail 100 2>&1"
Common causes:
Express 5 / path-to-regexp Error
Error: TypeError: Missing parameter name at index 1: *
This happens when using bare * wildcards in Express 5 routes:
// BROKEN
app.get('*', (req, res) => { ... });
// FIXED
app.get('/{*splat}', (req, res) => { ... });
Fix: Update the SPA catch-all route in packages/dashboard/src/server/index.ts
Database Connection Error
Error: SQLITE_CANTOPEN or database file not found
# Check SQLite database file exists
docker exec orienter-dashboard ls -la /app/data/orient.db
# Check SQLITE_DB_PATH
docker exec orienter-dashboard env | grep SQLITE_DB_PATH
2. Dashboard Assets Not Loading (text/html error)
Symptoms: Browser console shows:
Failed to load module script: Expected a JavaScript-or-Wasm module script
but the server responded with a MIME type of "text/html"
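Cause: Nginx is proxying asset requests to the wrong path, so the request typically falls through to the SPA catch-all and index.html is returned in place of the JavaScript file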
Debug:
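# Look up the real hashed asset name on the server (curl does not expand * in URLs)
ASSET=$(docker exec orienter-dashboard ls /app/packages/dashboard/public/assets/ | grep -m1 '^index-.*\.js$')
# Check what content-type is returned for the JS file
curl -sI "https://app.orient.bot/dashboard/assets/$ASSET" | grep -i content-type
# Expected: content-type: text/javascript; charset=utf-8
# If: content-type: text/html -> nginx routing issue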
3. 502 Bad Gateway
Check which service is down:
docker ps --format "table {{.Names}}\t{{.Status}}"
Check nginx upstream:
docker logs orienter-nginx --tail 20 2>&1 | grep -i "upstream\|502"
Common causes:
- Dashboard not started (crash loop)
- Wrong port in nginx config
- Container name mismatch
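To narrow down which upstream is behind a 502, it can help to list only the containers that are not running cleanly. A minimal sketch, assuming the standard `docker ps --format` template fields; `down_containers` is a hypothetical helper name.

```shell
# Hypothetical helper: print every container that is not "Up", plus any
# container that is up but reports an unhealthy health check.
down_containers() {
  docker ps -a --format '{{.Names}}|{{.Status}}' |
    awk -F'[|]' '$2 !~ /^Up/ || $2 ~ /unhealthy/ { print $1 ": " $2 }'
}
```

Empty output suggests every container is up, in which case the nginx port or container-name configuration is the more likely culprit.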
4. WhatsApp Pairing Issues
Symptoms:
- Dashboard shows "Connection Closed" when requesting pairing code
- QR code stops refreshing
- Logs show QR refs attempts ended error
Quick fix:
ssh $OCI_USER@$OCI_HOST "docker restart orienter-bot-whatsapp"
Full reset:
ssh $OCI_USER@$OCI_HOST "rm -rf /home/opc/orient/data/whatsapp-auth/* && docker restart orienter-bot-whatsapp"
Diagnosis:
ssh $OCI_USER@$OCI_HOST "docker logs orienter-bot-whatsapp --tail 100 2>&1 | grep -i -E '(pair|code|connection|closed|error)'"
5. Missing Environment Variables (Crash Loop)
Symptoms: Container in crash loop with "Restarting" status, logs show missing environment variable errors
Example error:
Error: DASHBOARD_JWT_SECRET environment variable is required
Common missing variables:
- DASHBOARD_JWT_SECRET (production) or DASHBOARD_JWT_SECRET_STAGING (staging)
- SQLITE_DB_PATH
- OAuth credentials (GOOGLE_OAUTH_CLIENT_ID, etc.)
- API keys (Anthropic, OpenAI, etc.)
Quick checklist for staging environment:
# Check if variable exists in .env
ssh $OCI_USER@$OCI_HOST "grep DASHBOARD_JWT_SECRET_STAGING /home/opc/orient/.env"
# Check what env vars the container actually has
ssh $OCI_USER@$OCI_HOST "docker exec orienter-dashboard-staging env | grep DASHBOARD"
Critical knowledge about environment variables:
- Staging uses _STAGING suffix: Staging containers expect environment variables with the _STAGING suffix
  - Production: DASHBOARD_JWT_SECRET
  - Staging: DASHBOARD_JWT_SECRET_STAGING
- docker restart does NOT reload .env: Simply restarting a container won't pick up new environment variables
  # WRONG - Won't reload env vars
  docker restart orienter-dashboard-staging
  # CORRECT - Recreates container with new env vars
  cd ~/orient/docker
  docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.staging.yml up -d dashboard
- docker-compose needs --env-file flag: When .env is in the parent directory, it must be specified explicitly
Fix missing environment variables:
# 1. Add missing variable to .env file
ssh $OCI_USER@$OCI_HOST "echo 'DASHBOARD_JWT_SECRET_STAGING=\"your-secret-here\"' >> /home/opc/orient/.env"
# 2. Recreate container with updated env (from docker directory)
ssh $OCI_USER@$OCI_HOST "cd /home/opc/orient/docker && docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.staging.yml up -d dashboard"
# 3. Verify container is healthy
ssh $OCI_USER@$OCI_HOST "docker ps | grep dashboard"
# 4. Check logs for successful startup
ssh $OCI_USER@$OCI_HOST "docker logs orienter-dashboard-staging --tail 20"
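Before recreating a container, it may be worth checking the whole list of required variables against the env file in one pass instead of grepping for them one at a time. The sketch below is an illustration only: `missing_env_vars` is a hypothetical helper, and it assumes each variable is defined on its own line as `NAME=value`.

```shell
# Hypothetical helper: print each variable name that has no assignment
# in the given env file. Returns non-zero if anything is missing.
# Usage: missing_env_vars FILE VAR [VAR...]
missing_env_vars() {
  local env_file="$1"
  shift
  local var status=0
  for var in "$@"; do
    # Anchor on "NAME=" so DASHBOARD_JWT_SECRET does not match
    # DASHBOARD_JWT_SECRET_STAGING.
    if ! grep -q "^${var}=" "$env_file"; then
      printf '%s\n' "$var"
      status=1
    fi
  done
  return "$status"
}
```

On the server this could be called as `missing_env_vars ~/orient/.env DASHBOARD_JWT_SECRET_STAGING SQLITE_DB_PATH`.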
7. SSH Command Not Found (Monitoring Failure)
Symptoms: Monitoring page shows "Failed to load server metrics" with error logs showing:
/bin/sh: ssh: not found
Cause: The dashboard container doesn't have openssh-client installed. This is required for the monitoring service which SSHes into the production server to collect metrics.
Quick check:
# Verify SSH is available in container
docker exec orienter-dashboard which ssh
# Expected: /usr/bin/ssh
# If empty or "not found": SSH client missing
Fix for Dockerfile: The packages/dashboard/Dockerfile must include openssh-client in the alpine packages:
# In the runner stage
RUN apk add --no-cache curl bash openssh-client
Workaround for staging (build patched image locally):
# 1. Build a patched image with SSH support
cd ~/orient
docker build -t dashboard-staging-ssh -f packages/dashboard/Dockerfile .
# 2. Update staging compose to use local image
sed -i 's|image: ghcr.io/orient-code/orient/dashboard:staging|image: dashboard-staging-ssh|' docker/docker-compose.staging.yml
# 3. Recreate container
cd ~/orient/docker
docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.staging.yml up -d dashboard
8. SSH Key Mounting for Containers
Problem: Container needs SSH access to production server for monitoring, but SSH keys need proper permissions.
Docker Compose configuration:
# In docker-compose.v2.yml
dashboard:
  environment:
    - SSH_KEY_PATH=/app/.ssh/id_rsa
  volumes:
    # Mount SSH key from docker/.ssh directory (NOT ~/.ssh which has permission issues)
    - ./.ssh:/app/.ssh:ro
Setup steps on server:
# 1. Create docker/.ssh directory
mkdir -p ~/orient/docker/.ssh
# 2. Generate or copy SSH key (must be readable by container's non-root user)
ssh-keygen -t rsa -f ~/orient/docker/.ssh/id_rsa -N ""
# 3. Set readable permissions (container runs as uid 1001)
chmod 644 ~/orient/docker/.ssh/id_rsa
chmod 644 ~/orient/docker/.ssh/id_rsa.pub
# 4. Add public key to authorized_keys for localhost access
cat ~/orient/docker/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# 5. Recreate container to pick up the mount
cd ~/orient/docker
docker compose --env-file ../.env -f docker-compose.v2.yml up -d dashboard
Why ./.ssh instead of ~/.ssh?
- Tilde (~) doesn't expand in docker-compose volume paths
- User's ~/.ssh typically has strict permissions (600) that block container access
- Container runs as non-root user (nodejs, uid 1001), needs readable permissions
Required environment variables (in .env):
OCI_HOST=152.70.172.33 # or SSH_HOST
OCI_USER=opc # or SSH_USER
# SSH_KEY_PATH is set in compose file: /app/.ssh/id_rsa
Verify SSH works from container:
docker exec orienter-dashboard ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 -i /app/.ssh/id_rsa opc@152.70.172.33 'echo SSH works'
9. Container Environment Variable Reload
Critical: docker restart does NOT reload environment variables from .env. You must recreate the container.
Wrong approach (env vars won't update):
# This won't pick up new .env values!
docker restart orienter-dashboard
Correct approach (recreates container with fresh env):
cd ~/orient/docker
docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.prod.yml up -d dashboard
For staging containers:
cd ~/orient/docker
docker compose --env-file ../.env -f docker-compose.v2.yml -f docker-compose.staging.yml up -d dashboard
Verify environment variables loaded:
# Check specific variable
docker exec orienter-dashboard env | grep SSH_KEY_PATH
docker exec orienter-dashboard env | grep OCI_HOST
# For staging
docker exec orienter-dashboard-staging env | grep SSH_KEY_PATH
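Because a restarted container keeps its old environment, a quick way to tell whether a recreate is still needed is to compare the value in .env with the value inside the running container. This is a sketch under stated assumptions: `env_in_sync` is a hypothetical helper, values are assumed to be unquoted or wrapped in plain double quotes, and the last assignment in the file is assumed to be the effective one.

```shell
# Hypothetical helper: succeed only if VAR has the same value in the env
# file and inside the running container.
# Usage: env_in_sync CONTAINER VAR ENV_FILE
env_in_sync() {
  local container="$1" var="$2" env_file="$3"
  local file_val container_val
  # Take the last assignment in the file and strip optional double quotes
  # (both are assumptions about how the file is written).
  file_val=$(grep "^${var}=" "$env_file" | tail -n 1 | cut -d= -f2- |
    sed -e 's/^"//' -e 's/"$//')
  container_val=$(docker exec "$container" printenv "$var")
  [ "$file_val" = "$container_val" ]
}
```

For example, `env_in_sync orienter-dashboard OCI_HOST ~/orient/.env || echo "recreate needed"` would flag a container that is still running with a stale value.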
Common mistake with staging: Forgetting the _STAGING suffix for environment variables:
- Production uses: DASHBOARD_JWT_SECRET
- Staging uses: DASHBOARD_JWT_SECRET_STAGING
6. Deployment Failure
Check recent GitHub Actions:
gh run list --limit 5
gh run view <run-id>
Check which images were built:
gh run view <run-id> | grep -E "Build.*Image"
If builds show 0s with - prefix, they were skipped (no changes detected).
Force rebuild if needed:
gh workflow run deploy.yml -f force_build_all=true
Container Management
List Running Containers
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
Restart a Container
docker restart orienter-dashboard
docker restart orienter-bot-whatsapp
docker restart orienter-opencode
docker restart orienter-nginx
Restart All Services
cd ~/orient/docker
COMPOSE_FILES="-f docker-compose.v2.yml -f docker-compose.prod.yml -f docker-compose.r2.yml"
docker compose --env-file ../.env ${COMPOSE_FILES} restart
Execute Command in Container
docker exec -it orienter-dashboard sh
docker exec -it orienter-opencode bash
Check Container Files
# Check dashboard assets exist
docker exec orienter-dashboard ls -la /app/packages/dashboard/public/assets/
# Check nginx config
docker exec orienter-nginx cat /etc/nginx/conf.d/default.conf
View Container Resource Usage
docker stats --no-stream
Inspect Container
docker inspect orienter-dashboard | jq '.[0].State'
Health Checks
Verify All Services
# Nginx
curl -sf https://app.orient.bot/health && echo "Nginx: OK" || echo "Nginx: FAIL"
# Dashboard
curl -sf https://app.orient.bot/dashboard/api/health && echo "Dashboard: OK" || echo "Dashboard: FAIL"
# OpenCode
curl -sf https://code.orient.bot/global/health && echo "OpenCode: OK" || echo "OpenCode: FAIL"
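The three checks above can be folded into one function that also returns a non-zero status when any endpoint fails, which makes it usable from scripts. A minimal sketch: `check_health` is a hypothetical helper, the URLs are the ones listed above, and the 10-second `--max-time` is an assumed limit.

```shell
# Hypothetical helper: probe each health endpoint and report OK/FAIL.
# Returns non-zero if any endpoint fails.
check_health() {
  local name url status=0
  while read -r name url; do
    # --max-time is an assumed limit so a hung service cannot stall the loop.
    if curl -sf --max-time 10 -o /dev/null "$url"; then
      echo "${name}: OK"
    else
      echo "${name}: FAIL"
      status=1
    fi
  done <<'EOF'
Nginx https://app.orient.bot/health
Dashboard https://app.orient.bot/dashboard/api/health
OpenCode https://code.orient.bot/global/health
EOF
  return "$status"
}
```

Running `check_health || echo "at least one service is unhealthy"` gives a single pass/fail signal after a deployment or rollback.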
Check Database
docker exec orienter-dashboard sqlite3 /app/data/orient.db "SELECT 1;"
Deployment Management
Check Deployment Script Options
~/orient/docker/deploy-server.sh --help
Rollback to Previous Deployment
cd ~/orient/docker
./deploy-server.sh rollback
Manual Full Restart
cd ~/orient/docker
COMPOSE_FILES="-f docker-compose.v2.yml -f docker-compose.prod.yml -f docker-compose.r2.yml"
docker compose --env-file ../.env ${COMPOSE_FILES} down
docker compose --env-file ../.env ${COMPOSE_FILES} up -d
Check Disk Space
df -h /home
docker system df
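A full disk is easy to miss in raw `df` output, so a threshold check can make the result explicit. The sketch below is illustrative: `disk_usage_check` is a hypothetical helper and the 80% default threshold is an assumption, not a documented limit for this server.

```shell
# Hypothetical helper: warn when a filesystem is fuller than a threshold.
# Usage: disk_usage_check [mount] [max_percent]
disk_usage_check() {
  local mount="${1:-/home}" max="${2:-80}"   # 80% is an assumed threshold
  local used
  # df -P prints one POSIX-format line per filesystem; column 5 is the
  # percentage used.
  used=$(df -P "$mount" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
  if [ "$used" -ge "$max" ]; then
    echo "WARN: ${mount} is ${used}% full"
    return 1
  fi
  echo "OK: ${mount} is ${used}% full"
}
```

If the check warns, the cleanup commands in the next section are the usual first step.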
Clean Up Docker Resources
docker system prune -f
docker image prune -a -f --filter "until=168h" # Remove images older than 7 days
Error Pattern Reference
| Error Message | Likely Cause | Solution |
|---|---|---|
| Missing parameter name at index 1: * | Express 5 bare wildcard | Use /{*splat} instead of * |
| Failed to load module script: text/html | Nginx proxy_pass wrong | Fix nginx to strip path prefix |
| Connection Closed (WhatsApp) | WebSocket died | Restart bot-whatsapp container |
| SQLITE_CANTOPEN | Database file missing | Check SQLITE_DB_PATH and /app/data/ directory |
| 502 Bad Gateway | Upstream container down | Check container status and logs |
| container is unhealthy | Health check failing | Check container logs |
| DASHBOARD_JWT_SECRET variable is required | Missing env var | Add to .env, recreate with docker-compose |
| environment variable is required | Missing env var in container | Check staging uses _STAGING suffix, use docker-compose up -d (not restart) |
| /bin/sh: ssh: not found | Missing openssh-client | Add openssh-client to Dockerfile, rebuild image |
| SSH command failed (monitoring) | SSH key or config missing | Mount SSH key to /app/.ssh, set OCI_HOST in .env |
| Permission denied (publickey) | SSH key permissions wrong | chmod 644 on mounted key, ensure container can read it |
Session Data Locations
/home/opc/orient/data/
├── orient.db # SQLite database
├── media/ # Uploaded media files
└── oauth-tokens/ # OAuth tokens
Log Locations
- Container logs: docker logs <container>
- Nginx access: docker exec orienter-nginx cat /var/log/nginx/access.log
- Application logs: ~/orient/logs/ (if configured)
Server Details
- Host: 152.70.172.33
- User: opc
- SSH Key: ~/.ssh/id_rsa
- Deploy Directory: ~/orient
- Docker Directory: ~/orient/docker
- Domains:
  - app.orient.bot - Dashboard
  - code.orient.bot - OpenCode
  - staging.orient.bot - Staging Dashboard
  - code-staging.orient.bot - Staging OpenCode