
cloud-infrastructure

npx machina-cli add skill aiskillstore/marketplace/cloud-infrastructure --openclaw

Cloud Infrastructure

Comprehensive cloud infrastructure skill covering multi-cloud architecture, Infrastructure as Code, cost optimization, and production deployment patterns.

When to Use This Skill

  • Designing cloud architecture for new applications
  • Implementing Infrastructure as Code (Terraform, CloudFormation, Pulumi)
  • Cost optimization and resource right-sizing
  • Multi-region and high-availability deployments
  • Cloud migration planning
  • Security and compliance implementation
  • Auto-scaling and performance optimization

Cloud Architecture Patterns

Compute Patterns

| Pattern | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Serverless | Lambda | Functions | Cloud Functions | Event-driven, variable load |
| Containers | ECS/EKS | AKS | GKE | Microservices, consistent env |
| VMs | EC2 | Virtual Machines | Compute Engine | Legacy apps, full control |
| Batch | Batch | Batch | Batch | Large-scale processing |

Storage Patterns

| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Object | S3 | Blob Storage | Cloud Storage | Static files, backups |
| Block | EBS | Managed Disks | Persistent Disk | Database storage |
| File | EFS | Azure Files | Filestore | Shared file systems |
| Archive | Glacier | Archive | Coldline | Long-term retention |

Database Patterns

| Type | AWS | Azure | GCP | Use Case |
|---|---|---|---|---|
| Relational | RDS, Aurora | SQL Database | Cloud SQL | ACID transactions |
| NoSQL | DynamoDB | Cosmos DB | Firestore | Flexible schema |
| Cache | ElastiCache | Cache for Redis | Memorystore | Session, caching |
| Data Warehouse | Redshift | Synapse | BigQuery | Analytics |

Infrastructure as Code

Terraform Best Practices

Project Structure:

infrastructure/
├── modules/
│   ├── networking/
│   ├── compute/
│   └── database/
├── environments/
│   ├── dev/
│   ├── staging/
│   └── prod/
├── main.tf
├── variables.tf
├── outputs.tf
└── versions.tf

State Management:

  • Use remote state (S3, Azure Blob, GCS)
  • Enable state locking (DynamoDB, Blob lease)
  • Separate state per environment
  • Never commit state files
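As a rough illustration of keeping remote state separate per environment, the sketch below builds the arguments for `terraform init` using its `-backend-config` option with S3 backend settings. The bucket name, region, lock table name, and the `<env>/terraform.tfstate` key layout are assumptions chosen for the example, not something this skill prescribes.

```python
ENVIRONMENTS = {"dev", "staging", "prod"}


def backend_init_args(env: str, bucket: str, region: str, lock_table: str) -> list:
    """Build a `terraform init` command that points at one state object per environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    settings = {
        "bucket": bucket,                     # remote state bucket (assumed name)
        "key": f"{env}/terraform.tfstate",    # assumed layout: one state key per environment
        "region": region,
        "dynamodb_table": lock_table,         # table used for state locking
        "encrypt": "true",
    }
    args = ["terraform", "init"]
    for name, value in settings.items():
        args.append(f"-backend-config={name}={value}")
    return args


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(backend_init_args("dev", "example-tf-state", "us-east-1", "example-tf-locks")))
```

Passing the settings at init time keeps the backend block in the code identical across environments, so only the key changes between dev, staging, and prod.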

Module Design:

  • Single responsibility per module
  • Expose minimal required variables
  • Document inputs/outputs
  • Version modules with git tags
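One way to make "version modules with git tags" concrete is to pin each module source to a tag through Terraform's `git::` source syntax with a `?ref=` parameter. The helper below is a small sketch that builds such a source string; the repository URL, the module path, and the `vMAJOR.MINOR.PATCH` tag convention are assumptions for illustration.

```python
import re

# Assumed tag convention: v<major>.<minor>.<patch>
SEMVER_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def module_source(repo_url: str, module_path: str, tag: str) -> str:
    """Return a Terraform module source string pinned to a git tag."""
    if not SEMVER_TAG.match(tag):
        raise ValueError(f"tag {tag!r} does not look like vMAJOR.MINOR.PATCH")
    # "//" separates the repository from the subdirectory holding the module.
    return f"git::{repo_url}//{module_path}?ref={tag}"


if __name__ == "__main__":
    # Hypothetical repository, for illustration only.
    print(module_source("https://example.com/infra.git", "modules/networking", "v1.2.0"))
```

Rejecting anything that is not a tag (for example a branch name such as `main`) keeps deployments reproducible, since a tag is expected to keep pointing at the same commit.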

Cost Optimization

Compute Savings:

  • Reserved Instances (1-3 year commitment): roughly 30-60% savings versus on-demand pricing
  • Spot/Preemptible instances: roughly 60-90% savings versus on-demand pricing, for interruptible workloads
  • Right-sizing: Match instance size to actual usage
  • Auto-scaling: Scale down during low usage
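To put the discount ranges above into numbers, the sketch below computes the monthly cost of one always-on instance under each pricing model. The hourly rate of 0.10 and the 730 hours per month are assumptions for the arithmetic, not real prices; only the discount ranges come from the list.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def monthly_cost_range(on_demand_hourly: float, min_discount: float,
                       max_discount: float, hours: int = HOURS_PER_MONTH) -> tuple:
    """Return (best_case, worst_case) monthly cost for a discount range."""
    full_price = on_demand_hourly * hours
    best_case = round(full_price * (1 - max_discount), 2)
    worst_case = round(full_price * (1 - min_discount), 2)
    return best_case, worst_case


if __name__ == "__main__":
    rate = 0.10  # hypothetical on-demand hourly rate
    print("on-demand:", round(rate * HOURS_PER_MONTH, 2))
    print("reserved (30-60% off):", monthly_cost_range(rate, 0.30, 0.60))
    print("spot (60-90% off):", monthly_cost_range(rate, 0.60, 0.90))
```

The same function can be pointed at real rates from a provider's price list to compare a commitment against the expected usage before buying it.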

Storage Savings:

  • Lifecycle policies: Auto-transition to cheaper tiers
  • Compression: Reduce storage footprint
  • Deduplication: Eliminate redundant data
  • Delete unused resources: Orphaned volumes, snapshots
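A lifecycle policy is essentially a rule that maps object age to a storage tier. The sketch below shows that decision in isolation; the tier names and the 30-day and 90-day thresholds are assumptions picked for the example, and real policies are configured in the provider rather than in application code.

```python
def storage_tier(age_days: int, cool_after: int = 30, archive_after: int = 90) -> str:
    """Pick a storage tier from object age, using assumed thresholds."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= archive_after:
        return "archive"   # cheapest tier, slow retrieval
    if age_days >= cool_after:
        return "cool"      # infrequent-access tier
    return "hot"           # default tier for recently written objects


if __name__ == "__main__":
    for age in (5, 45, 400):
        print(age, "days ->", storage_tier(age))
```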

Network Savings:

  • Use CDN for static content
  • Optimize data transfer paths
  • Use private endpoints
  • Compress API responses

High Availability Patterns

Multi-AZ Deployment

  • Deploy across 2-3 availability zones
  • Use load balancers for distribution
  • Database replication across AZs
  • Automatic failover configuration
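The idea of spreading capacity across availability zones can be sketched as a round-robin placement, so that losing one zone removes at most about a third of the instances in a three-zone setup. The zone names below are placeholders, and in practice an auto-scaling group or managed instance group would do this placement for you.

```python
def spread_across_zones(instance_count: int, zones: list) -> dict:
    """Distribute instances across zones round-robin and return a count per zone."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement = {zone: 0 for zone in zones}
    for index in range(instance_count):
        placement[zones[index % len(zones)]] += 1
    return placement


if __name__ == "__main__":
    # Placeholder zone names, for illustration only.
    print(spread_across_zones(7, ["zone-a", "zone-b", "zone-c"]))
```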

Multi-Region Deployment

  • Active-active or active-passive
  • DNS-based routing (Route 53, Traffic Manager)
  • Data replication strategy
  • Disaster recovery procedures
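For an active-passive setup, DNS failover boils down to sending traffic to the highest-priority region whose health check passes. The sketch below models only that selection logic; the region names and the health map are hypothetical inputs, and a managed DNS service would normally evaluate the health checks itself.

```python
from typing import Optional


def pick_active_region(priority: list, healthy: dict) -> Optional[str]:
    """Return the first region in priority order that is reported healthy."""
    for region in priority:
        if healthy.get(region, False):  # unknown regions are treated as unhealthy
            return region
    return None  # no healthy region: time for the disaster recovery procedure


if __name__ == "__main__":
    order = ["primary", "secondary"]  # hypothetical region labels
    print(pick_active_region(order, {"primary": True, "secondary": True}))
    print(pick_active_region(order, {"primary": False, "secondary": True}))
```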

Resilience Patterns

  • Circuit breakers for external dependencies
  • Retry with exponential backoff
  • Bulkhead isolation
  • Graceful degradation
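Of the patterns above, retry with exponential backoff is the easiest to show in a few lines. The following is a minimal sketch, assuming a base delay of 0.5 seconds, a 30-second cap, and "full jitter" (a random wait between zero and the current delay); these values and the function names are illustrative choices, not part of the skill.

```python
import random
import time


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> list:
    """Upper bound of the wait before each retry: base * 2^attempt, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(operation, retries: int = 4, base: float = 0.5,
                    cap: float = 30.0, sleep=time.sleep):
    """Call operation(); on failure wait a jittered, growing delay and try again."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            # Full jitter spreads out clients that failed at the same moment.
            sleep(random.uniform(0, delays[attempt]))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky():
        # Hypothetical dependency that fails twice before succeeding.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    print(call_with_retry(flaky, sleep=lambda seconds: None))
```

The `sleep` parameter is injectable so the logic can be tested without real waiting. In production code it is usually worth catching only the exception types that are known to be transient.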

Security Best Practices

Identity & Access

  • Principle of least privilege
  • Use IAM roles, not long-term credentials
  • Enable MFA for privileged accounts
  • Regular access reviews
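As a small aid for access reviews, the sketch below scans a policy document in the AWS IAM JSON shape (`Statement`, `Effect`, `Action`, `Resource`) and flags `Allow` statements that use wildcards. It is a simplified check under that assumption: it ignores conditions, `NotAction`, and other provider formats, so treat it as a starting point rather than a complete audit.

```python
def _as_list(value) -> list:
    """IAM fields may be a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def wildcard_statements(policy: dict) -> list:
    """Return Allow statements whose actions or resources contain a wildcard."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        broad_resource = "*" in resources
        if broad_action or broad_resource:
            flagged.append(statement)
    return flagged


if __name__ == "__main__":
    # Hypothetical policy, for illustration only.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": "arn:aws:s3:::example-bucket/*"},
            {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        ],
    }
    print(len(wildcard_statements(policy)), "statement(s) need review")
```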

Network Security

  • VPC/VNet isolation
  • Security groups as firewalls
  • Private subnets for backend services
  • VPN/Direct Connect for hybrid connectivity

Data Protection

  • Encryption at rest (KMS)
  • Encryption in transit (TLS)
  • Key rotation policies
  • Backup and recovery testing

Monitoring & Observability

Key Metrics

  • CPU, Memory, Disk utilization
  • Network throughput and latency
  • Error rates and types
  • Cost per service/team

Alerting Strategy

  • Set thresholds based on baselines
  • Alert on symptoms, not causes
  • Runbooks for each alert
  • Escalation paths defined
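Setting thresholds from a baseline can be as simple as taking the mean of recent samples plus a few standard deviations. The sketch below assumes three standard deviations and a small list of latency samples; both are illustrative choices, and real baselines usually need to account for daily or weekly seasonality.

```python
import statistics


def alert_threshold(baseline_samples: list, sigmas: float = 3.0) -> float:
    """Threshold = baseline mean + sigmas * population standard deviation."""
    if len(baseline_samples) < 2:
        raise ValueError("need at least two baseline samples")
    mean = statistics.mean(baseline_samples)
    spread = statistics.pstdev(baseline_samples)
    return mean + sigmas * spread


def should_alert(value: float, baseline_samples: list, sigmas: float = 3.0) -> bool:
    """True when the observed value exceeds the baseline-derived threshold."""
    return value > alert_threshold(baseline_samples, sigmas)


if __name__ == "__main__":
    latency_ms = [40, 42, 38, 41, 39]  # hypothetical baseline samples
    print(round(alert_threshold(latency_ms), 2))
    print(should_alert(45, latency_ms), should_alert(44, latency_ms))
```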

Reference Files

  • references/terraform_patterns.md - IaC patterns and examples
  • references/cost_optimization.md - Detailed cost reduction strategies

Integration with Other Skills

  • security-engineering - For security architecture
  • network-engineering - For network design
  • performance - For optimization strategies
  • devops-runbooks - For operational procedures

Source

https://github.com/aiskillstore/marketplace/blob/main/skills/89jobrien/cloud-infrastructure/SKILL.md

Overview

Design and deploy cloud architectures across AWS, Azure, and GCP using Infrastructure as Code (Terraform, CloudFormation, Pulumi). This skill emphasizes modular patterns, cost optimization, multi-region deployments, and security/compliance to drive scalable, resilient cloud environments.

How This Skill Works

It combines architecture patterns for compute, storage, and databases with IaC practices. Teams organize resources into modular Terraform/CloudFormation/Pulumi templates, manage remote state per environment, and implement cost, HA, and security controls across providers, enabling consistent multi-cloud deployments.

When to Use It

  • Architecting cloud topology for a new application across AWS, Azure, and GCP
  • Implementing Infrastructure as Code (Terraform, CloudFormation, or Pulumi) to provision resources
  • Cost optimization and resource right-sizing in a multi-cloud setup
  • Planning multi-region and high-availability deployments with DR strategies
  • Enforcing security, access controls, and compliance during cloud migrations

Quick Start

  1. Define target cloud patterns and required services (compute, storage, DB) across providers.
  2. Organize your repo into infrastructure/modules and environments (dev/staging/prod) and implement modular templates.
  3. Initialize and deploy with your IaC tool (terraform init/plan/apply or equivalent) and verify across environments.

Best Practices

  • Adopt a modular Terraform project structure (modules for networking, compute, database; environments dev/staging/prod).
  • Use remote state with S3/Azure Blob/GCS and enable locking; keep state per environment; never commit state files.
  • Design modules with a single responsibility, expose minimal inputs/outputs, and document usage.
  • Version modules with git tags to enable reproducible deployments.
  • Incorporate cost optimization and HA patterns (auto-scaling, multi-AZ/region, lifecycle policies, CDN, private endpoints) from the start.

Example Use Cases

  • Deploy a 3-tier web app on AWS with VPC, ALB, Auto Scaling, and RDS across multiple AZs using Terraform modules.
  • Implement active-active multi-region DR with data replication and DNS routing via Route 53/Traffic Manager.
  • Apply cost optimization by right-sizing instances, enabling reserved/spot instances, and configuring lifecycle policies.
  • Manage Terraform remote state for dev/staging/prod using S3 with DynamoDB locking.
  • Enforce security basics: IAM roles rather than long-term credentials and MFA for privileged accounts.
