infrastructure-validation

npx machina-cli add skill lgbarn/shipyard/infrastructure-validation --openclaw
<!-- TOKEN BUDGET: 140 lines / ~420 tokens -->

Infrastructure Validation

<activation>

Activation Triggers

  • Files matching: *.tf, *.tfvars, Dockerfile, docker-compose.yml, playbook*.yml, roles/, inventory/
  • Config: .shipyard/config.json has iac_validation set to "auto" or true
  • Templates with AWSTemplateFormatVersion (CloudFormation)
  • YAML with apiVersion: (Kubernetes)

Natural Language Triggers

  • "validate terraform", "check docker", "lint ansible", "IaC validation", "infrastructure check"
</activation>
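
The file-based triggers above could be approximated with a small shell function. This is only a sketch: the function name `detect_iac_type` and the labels it prints are hypothetical, and the skill's real detection logic is not shown in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a path to the workflow named in the triggers above.
# The function name and the printed labels are illustrative, not part of the skill.
detect_iac_type() {
  local path="$1" name
  name="$(basename "$path")"

  # Name-based triggers.
  case "$name" in
    *.tf|*.tfvars)                 echo terraform; return ;;
    Dockerfile|docker-compose.yml) echo docker;    return ;;
    playbook*.yml)                 echo ansible;   return ;;
  esac

  # Directory-based triggers (roles/, inventory/).
  case "$path" in
    roles/*|*/roles/*|inventory/*|*/inventory/*) echo ansible; return ;;
  esac

  # Content-based triggers: only checked for files that exist.
  if [ -f "$path" ]; then
    if grep -q 'AWSTemplateFormatVersion' "$path"; then
      echo cloudformation; return
    fi
    if grep -Eq '^apiVersion:' "$path"; then
      echo kubernetes; return
    fi
  fi

  echo unknown
}
```

For example, `detect_iac_type modules/network/main.tf` prints `terraform`. The config-based trigger (`iac_validation` in .shipyard/config.json) is left out of this sketch.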

Overview

IaC mistakes don't cause test failures -- they cause outages, breaches, and cost overruns. Validate before every change.

Core principle: Never apply without plan review. Just as TDD requires tests before code, IaC requires validation before apply.

<instructions>

Terraform Workflow

Run in order. Each step must pass before proceeding.

terraform fmt -check          # 1. Format (auto-fix with fmt if needed)
terraform validate            # 2. Syntax validation
terraform plan -out=tfplan    # 3. Review every change -- NEVER skip
tflint --recursive            # 4. Lint (if installed)
tfsec .                       # 5. Security scan (if installed); alternative: checkov -d .

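Drift detection: terraform plan -detailed-exitcode returns 0 when nothing would change, 1 on error, and 2 when changes are present. If the configuration itself has not changed, exit code 2 means drift. Document what drifted and why before overwriting.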
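
A minimal sketch of how a wrapper script might interpret the `-detailed-exitcode` result. The function name `plan_status` and its labels are hypothetical. The exit code meanings (0, 1, 2) are the ones Terraform documents for this flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate the exit status of
# `terraform plan -detailed-exitcode` into a label.
plan_status() {
  case "$1" in
    0) echo "no-changes" ;;       # plan succeeded, nothing to change
    1) echo "error" ;;            # plan failed
    2) echo "changes-present" ;;  # plan succeeded, a diff exists
    *) echo "unexpected"; return 1 ;;
  esac
}

# Usage (assumes terraform is installed and the directory is initialised):
#   terraform plan -detailed-exitcode -out=tfplan
#   plan_status "$?"
```

When no configuration change is pending, a `changes-present` result is the drift signal described above.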

Ansible Workflow

yamllint .                              # 1. YAML syntax
ansible-lint                            # 2. Best practices
ansible-playbook --syntax-check *.yml   # 3. Playbook syntax
ansible-playbook --check *.yml          # 4. Dry run (where supported)
molecule test                           # 5. Role tests (if configured)

Docker Workflow

hadolint Dockerfile                     # 1. Lint (if installed)
docker build -t test-build .            # 2. Build
trivy image test-build                  # 3. Security scan (if installed)
docker compose config                   # 4. Validate compose (if applicable)
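
The workflows above share two rules: steps run in order and each must pass, and tools marked "if installed" are skipped when absent. One way to sketch that in shell, using the Docker workflow as the example. The helper names `run_step`, `run_optional`, and `validate_docker` are hypothetical, and treating a missing optional tool as a pass is an assumption of this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative and not part of the skill.

# Run a required step: print it, run it, and propagate its exit status.
run_step() {
  echo "==> $*"
  "$@"
}

# Run an optional step only when its tool is on PATH. A missing tool is
# reported and treated as a pass; a tool that is present must still succeed.
run_optional() {
  if command -v "$1" >/dev/null 2>&1; then
    run_step "$@"
  else
    echo "skip: $1 not installed"
  fi
}

# The Docker workflow as a gated chain: `&&` stops at the first failure.
validate_docker() {
  run_optional hadolint Dockerfile &&
    run_step docker build -t test-build . &&
    run_optional trivy image test-build || return 1

  # Compose validation only applies when a compose file exists.
  if [ -f docker-compose.yml ]; then
    run_step docker compose config
  fi
}
```

Calling `validate_docker` from a directory containing a Dockerfile runs the four steps in order. The same two helpers can wrap the Terraform and Ansible sequences.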
</instructions>

Common Mistakes

Terraform

| Mistake | Fix |
| --- | --- |
| Local state file | Use remote backend (S3+DynamoDB, GCS) |
| No state locking | Enable lock table |
| Hardcoded secrets | Use variables + secret manager |
| * in security groups | Restrict to specific CIDRs |
| Unpinned provider version | Pin in required_providers |
| Missing tags | Require via policy or module defaults |

Ansible

| Mistake | Fix |
| --- | --- |
| Plaintext secrets | ansible-vault encrypt |
| shell instead of modules | Use native modules (apt, copy, etc.) |
| Everything as root | become: false by default, escalate only when needed |

Docker

| Mistake | Fix |
| --- | --- |
| FROM ubuntu:latest | Pin to digest: FROM ubuntu:22.04@sha256:... |
| Running as root | Add USER nonroot |
| COPY . . | Use .dockerignore, copy specific files |
| Secrets in ENV/ARG | Use build secrets or runtime injection |
| No health check | Add HEALTHCHECK instruction |
| Single-stage build | Use multi-stage builds |
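
A few of the Docker mistakes above can be caught with plain grep when hadolint is not available. The sketch below is a rough approximation, not a replacement for a real linter. The function name `check_dockerfile` is hypothetical, and it only covers the latest tag, a missing USER, and a missing HEALTHCHECK.

```shell
#!/usr/bin/env bash
# Hypothetical checker: a rough grep-based sketch, not a substitute for hadolint.
# Prints one FAIL line per finding and returns non-zero if any were found.
check_dockerfile() {
  local file="$1" status=0

  # Base image referenced by the latest tag (digest-pinned images do not match).
  if grep -Eiq '^FROM[[:space:]]+[^[:space:]@]+:latest([[:space:]]|$)' "$file"; then
    echo "FAIL: base image uses the latest tag"; status=1
  fi

  # No USER instruction at all means the container runs as root.
  # Limitation: an explicit `USER root` is not detected by this sketch.
  if ! grep -Eiq '^USER[[:space:]]+' "$file"; then
    echo "FAIL: no USER instruction (container runs as root)"; status=1
  fi

  if ! grep -Eiq '^HEALTHCHECK[[:space:]]+' "$file"; then
    echo "FAIL: no HEALTHCHECK instruction"; status=1
  fi

  return "$status"
}
```

Usage: `check_dockerfile Dockerfile || echo "fix findings before building"`.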
<rules>

Red Flags -- STOP

  • terraform apply -auto-approve without prior plan review
  • Security group with 0.0.0.0/0 on non-HTTP ports
  • IAM policy with * action or * resource
  • Secrets in .tf, .yml, or Dockerfile
  • State file committed to git
  • latest tag on any base image
  • Container running as root in production
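
The "state file committed to git" red flag is easy to check mechanically. A minimal sketch, assuming the list of tracked paths is piped in (for example from `git ls-files`). The function name `find_tracked_state` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tracked paths on stdin and prints any Terraform
# state files among them. Returns non-zero when at least one is found.
find_tracked_state() {
  local hits
  # Matches *.tfstate and *.tfstate.backup at any directory depth.
  hits="$(grep -E '(^|/)[^/]*\.tfstate(\.backup)?$' || true)"
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
}

# Usage (inside a git repository):
#   git ls-files | find_tracked_state || echo "STOP: state file is tracked"
```

Reading paths from stdin keeps the check independent of git itself, so the same function works on any file listing.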
</rules> <examples>

Validation Finding Examples

Good Finding -- specific, shows the problem, gives the fix

**IaC-Critical: Overly permissive security group in modules/network/main.tf**

Resource: aws_security_group_rule.allow_all (line 34)
Problem: Ingress rule allows 0.0.0.0/0 on port 22 (SSH).
         This exposes SSH to the entire internet.
Fix: Restrict to bastion host CIDR or VPN range:
     cidr_blocks = [var.vpn_cidr]
Validation: `tfsec .` flagged this as HIGH severity (AWS018).

Bad Finding -- vague, no location, no evidence

**Security Issue: Network configuration may be too open.**

Review the security groups for potential issues.
</examples>

Integration

Referenced by: shipyard:builder (detects IaC files, follows appropriate workflow), shipyard:verifier (IaC validation mode), shipyard:auditor (IaC security checks)

Pairs with: shipyard:security-audit (security lens for IaC), shipyard:shipyard-verification (IaC claims need validation evidence)

Source

View on GitHub: https://github.com/lgbarn/shipyard/blob/main/skills/infrastructure-validation/SKILL.md

Overview

This skill provides structured validation workflows for Terraform, Ansible, and Docker, and also activates on CloudFormation templates and Kubernetes manifests. It enforces plan reviews and pre-apply checks to prevent costly errors.

How This Skill Works

For each supported tool, the skill runs a recommended validation sequence before apply: formatting, syntax checks, linting, and a plan or dry-run. It also supports drift detection (e.g., terraform plan -detailed-exitcode) and optional security scans, plus guidance on common mistakes to fix before deployment.

When to Use It

  • Working on Terraform (.tf, .tfvars) files
  • Authoring Ansible playbooks, roles, or inventories
  • Building Docker images from Dockerfile or docker-compose.yml
  • Working with CloudFormation templates
  • Validating Kubernetes YAML manifests (apiVersion present)

Quick Start

  1. Identify the IaC files and triggers (Terraform, Ansible, Docker, CloudFormation, Kubernetes).
  2. Run the appropriate validation sequence for each tool (format/syntax, lint, plan/dry-run, security checks).
  3. Review results, fix findings, and re-run until all checks pass before applying changes.

Best Practices

  • Always run format, syntax, and lint steps before plan or apply
  • Review Terraform plans and perform dry runs (or syntax checks) before applying changes
  • Use remote backends and secret management; avoid hardcoded secrets
  • Incorporate security scanning (tfsec, checkov, trivy) when available
  • Enforce governance: require tags, access controls, and document drift when it occurs

Example Use Cases

  • Terraform: drift detection and plan review to prevent unintended changes (terraform plan -detailed-exitcode).
  • Ansible: yamllint and ansible-lint catch syntax and best-practice issues before playbook execution.
  • Docker: hadolint lints the Dockerfile, followed by a test build and a trivy scan of the built image.
  • CloudFormation: template checks ensuring correct syntax and required fields before deployment.
  • Kubernetes: YAML manifests validated for apiVersion presence and syntax prior to kubectl apply.
