Agent skill
cloud-infrastructure
Cloud infrastructure design and deployment patterns for AWS, Azure, and GCP. Use when designing cloud architectures, implementing IaC with Terraform, optimizing costs, or setting up multi-region deployments.
Install this agent skill to your Project
npx add-skill https://github.com/aiskillstore/marketplace/tree/main/skills/89jobrien/cloud-infrastructure
Cloud Infrastructure
Comprehensive cloud infrastructure skill covering multi-cloud architecture, Infrastructure as Code, cost optimization, and production deployment patterns.
When to Use This Skill
- Designing cloud architecture for new applications
- Implementing Infrastructure as Code (Terraform, CloudFormation, Pulumi)
- Cost optimization and resource right-sizing
- Multi-region and high-availability deployments
- Cloud migration planning
- Security and compliance implementation
- Auto-scaling and performance optimization
Cloud Architecture Patterns
Compute Patterns
| Pattern | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Serverless | Lambda | Functions | Cloud Functions | Event-driven, variable load |
| Containers | ECS/EKS | AKS | GKE | Microservices, consistent env |
| VMs | EC2 | Virtual Machines | Compute Engine | Legacy apps, full control |
| Batch | AWS Batch | Azure Batch | Batch | Large-scale processing |
Storage Patterns
| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Object | S3 | Blob Storage | Cloud Storage | Static files, backups |
| Block | EBS | Managed Disks | Persistent Disk | Database storage |
| File | EFS | Azure Files | Filestore | Shared file systems |
| Archive | S3 Glacier | Archive tier | Coldline / Archive | Long-term retention |
Database Patterns
| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Relational | RDS, Aurora | Azure SQL Database | Cloud SQL | ACID transactions |
| NoSQL | DynamoDB | Cosmos DB | Firestore | Flexible schema |
| Cache | ElastiCache | Azure Cache for Redis | Memorystore | Sessions, caching |
| Data Warehouse | Redshift | Synapse | BigQuery | Analytics |
Infrastructure as Code
Terraform Best Practices
Project Structure:
infrastructure/
├── modules/
│ ├── networking/
│ ├── compute/
│ └── database/
├── environments/
│ ├── dev/
│ ├── staging/
│ └── prod/
├── main.tf
├── variables.tf
├── outputs.tf
└── versions.tf
State Management:
- Use remote state (S3, Azure Blob, GCS)
- Enable state locking (DynamoDB with S3, blob leases with Azure Blob; GCS locks natively)
- Separate state per environment
- Never commit state files
Module Design:
- Single responsibility per module
- Expose minimal required variables
- Document inputs/outputs
- Version modules with git tags
Cost Optimization
Compute Savings:
- Reserved Instances (1- or 3-year commitment): typically 30-60% savings, depending on term and payment option
- Spot/Preemptible instances: typically 60-90% savings for interruptible workloads
- Right-sizing: Match instance size to actual usage
- Auto-scaling: Scale down during low usage
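To make the savings ranges above concrete, here is a small arithmetic sketch. The hourly rate, the 730-hour month, and the chosen discount values are illustrative assumptions, not published prices; real discounts vary by provider, region, and instance type.

```python
def monthly_cost(hourly_rate: float, hours: float = 730.0, discount: float = 0.0) -> float:
    """Return the monthly cost of one instance after a fractional discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * hours * (1.0 - discount)


# Assumed on-demand rate in USD per hour (hypothetical, for illustration only).
ON_DEMAND_RATE = 0.10

on_demand = monthly_cost(ON_DEMAND_RATE)
reserved = monthly_cost(ON_DEMAND_RATE, discount=0.40)  # a value inside the 30-60% range
spot = monthly_cost(ON_DEMAND_RATE, discount=0.70)      # a value inside the 60-90% range

for label, cost in [("on-demand", on_demand), ("reserved", reserved), ("spot", spot)]:
    print(f"{label}: ${cost:.2f} per month")
```

Running the same calculation against actual utilization data is usually a better guide than the headline percentages, since an idle reserved instance still costs money.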
Storage Savings:
- Lifecycle policies: Auto-transition to cheaper tiers
- Compression: Reduce storage footprint
- Deduplication: Eliminate redundant data
- Delete unused resources: Orphaned volumes, snapshots
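A lifecycle policy boils down to a set of age-based rules. The sketch below shows that decision logic in isolation; the tier names and day thresholds are hypothetical and should be tuned to real access patterns and each provider's minimum storage durations.

```python
from datetime import date

# Hypothetical rules, ordered from oldest threshold to newest: (minimum age in days, tier).
LIFECYCLE_RULES = [
    (365, "archive"),
    (90, "cold"),
    (30, "infrequent-access"),
]


def target_tier(last_modified: date, today: date) -> str:
    """Return the storage tier an object should live in, based on its age."""
    age_days = (today - last_modified).days
    for min_age, tier in LIFECYCLE_RULES:
        if age_days >= min_age:
            return tier
    return "standard"  # younger than every threshold


print(target_tier(date(2024, 1, 1), date(2024, 5, 1)))
```

In practice the cloud provider evaluates rules like these for you; expressing them in code first can help when reviewing which thresholds are worth configuring.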
Network Savings:
- Use CDN for static content
- Optimize data transfer paths
- Use private endpoints
- Compress API responses
High Availability Patterns
Multi-AZ Deployment
- Deploy across 2-3 availability zones
- Use load balancers for distribution
- Database replication across AZs
- Automatic failover configuration
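As a rough illustration of spreading capacity across zones, the following sketch distributes a fleet as evenly as possible and reports how many instances survive the loss of the largest zone. The zone names are placeholders, and the functions are hypothetical helpers rather than any provider's API.

```python
def spread_across_zones(instance_count: int, zones: list[str]) -> dict[str, int]:
    """Distribute instances as evenly as possible across the given zones."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each receive one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def capacity_after_zone_loss(placement: dict[str, int]) -> int:
    """Return the instances left if the most heavily loaded zone fails."""
    return sum(placement.values()) - max(placement.values())


placement = spread_across_zones(7, ["zone-a", "zone-b", "zone-c"])
print(placement, capacity_after_zone_loss(placement))
```

Sizing the fleet so that the remaining capacity still covers peak load is one way to decide between two and three zones.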
Multi-Region Deployment
- Active-active or active-passive
- DNS-based routing (Route 53, Azure Traffic Manager)
- Data replication strategy
- Disaster recovery procedures
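For the active-passive case, the routing decision can be sketched as "send traffic to the healthiest, most-preferred region". The `Region` type, the priority scheme, and the region names below are assumptions made for illustration; a managed DNS service would normally make this decision from its own health checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    name: str
    priority: int  # lower number = preferred region
    healthy: bool  # result of an external health check (assumed to exist)


def pick_active_region(regions: list[Region]) -> Optional[Region]:
    """Return the preferred healthy region, or None if every region is down."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda r: r.priority)


regions = [
    Region("primary-region", priority=1, healthy=False),
    Region("standby-region", priority=2, healthy=True),
]
active = pick_active_region(regions)
print(active.name if active else "no healthy region")
```

Failing over traffic is only half of the pattern: the standby region also needs data that is recent enough, which is why the replication strategy appears in the same list.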
Resilience Patterns
- Circuit breakers for external dependencies
- Retry with exponential backoff
- Bulkhead isolation
- Graceful degradation
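The first two patterns above can be sketched in a few lines each. This is a minimal, single-threaded illustration, not a production library: the thresholds, delays, and class names are assumptions, and the `sleep` and `clock` parameters exist only so the behaviour can be tested without waiting.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call func(); after each failure wait up to base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))  # random jitter avoids synchronized retries


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so one trial call is allowed through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A common arrangement is to wrap the dependency call in the breaker and retry only errors that are likely to be transient, so that retries do not keep hammering a dependency the breaker has already marked as down.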
Security Best Practices
Identity & Access
- Principle of least privilege
- Use IAM roles, not long-term credentials
- Enable MFA for privileged accounts
- Regular access reviews
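One way to support access reviews is a simple lint over policy documents. The sketch below assumes the AWS IAM JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`) and flags `Allow` statements that use wildcards. It is a heuristic with a hypothetical function name, not a replacement for a provider's own policy analysis tools, and some actions legitimately require `"Resource": "*"`.

```python
def find_wildcard_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose actions or resources use broad wildcards."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        if broad_action or "*" in resources:
            flagged.append(stmt)
    return flagged


# Example policy; the bucket name is a placeholder.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": ["s3:*"], "Resource": "*"},
    ],
}
flagged = find_wildcard_statements(policy)
print(f"{len(flagged)} statement(s) need review")
```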
Network Security
- VPC/VNet isolation
- Security groups as firewalls
- Private subnets for backend services
- VPN or dedicated links (Direct Connect, ExpressRoute, Cloud Interconnect) for hybrid connectivity
Data Protection
- Encryption at rest (KMS)
- Encryption in transit (TLS)
- Key rotation policies
- Backup and recovery testing
Monitoring & Observability
Key Metrics
- CPU, Memory, Disk utilization
- Network throughput and latency
- Error rates and types
- Cost per service/team
Alerting Strategy
- Set thresholds based on baselines
- Alert on symptoms, not causes
- Runbooks for each alert
- Escalation paths defined
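Setting a threshold from a baseline can be as simple as "mean plus a few standard deviations" over recent samples of a symptom metric such as latency. The three-sigma default and the sample values below are assumptions for illustration; skewed or seasonal metrics usually call for percentiles or time-of-day baselines instead.

```python
from statistics import mean, stdev


def baseline_threshold(samples: list[float], sigmas: float = 3.0) -> float:
    """Return mean + sigmas * sample standard deviation of the baseline samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate a baseline")
    return mean(samples) + sigmas * stdev(samples)


def should_alert(current: float, samples: list[float], sigmas: float = 3.0) -> bool:
    """True when the current value exceeds the baseline-derived threshold."""
    return current > baseline_threshold(samples, sigmas)


# Hypothetical p95 latency readings in milliseconds.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
print(should_alert(110.0, baseline), should_alert(103.0, baseline))
```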
Reference Files
- references/terraform_patterns.md - IaC patterns and examples
- references/cost_optimization.md - Detailed cost reduction strategies
Integration with Other Skills
- security-engineering - For security architecture
- network-engineering - For network design
- performance - For optimization strategies
- devops-runbooks - For operational procedures