
How to Use AI Agents to Automate Terraform Infrastructure Changes in 2026

AI agents can now plan, review, and apply Terraform changes from natural language. Here's how agentic AI is transforming infrastructure-as-code workflows.

DevOpsBoys · Mar 22, 2026 · 7 min read

Imagine typing "add a Redis cache to the staging environment with 2GB memory and private subnet access" and having an AI agent write the Terraform code, run the plan, get approval, and apply it — all while following your organization's security policies and naming conventions.

This isn't a demo. It's happening in production at organizations using agentic AI for infrastructure management. And it's changing how DevOps teams think about Terraform.

What Agentic Terraform Looks Like

Traditional Terraform workflow:

Engineer writes HCL → terraform plan → review → terraform apply → verify

Agentic Terraform workflow:

Engineer describes intent → AI agent writes HCL → agent runs plan →
agent checks policies → human approves → agent applies → agent verifies

The key difference: the engineer describes what they want, not how to build it. The AI agent handles the translation from intent to infrastructure code.

The Tools Making This Possible

1. Claude/GPT with Tool Use + Terraform CLI

The simplest approach: give an LLM access to the Terraform CLI and your codebase.

python
import anthropic
 
client = anthropic.Anthropic()
 
tools = [
    {
        "name": "read_terraform_file",
        "description": "Read a Terraform file from the codebase",
        "input_schema": {
            "type": "object",
            "properties": {
                "file_path": {"type": "string", "description": "Path to .tf file"}
            },
            "required": ["file_path"]
        }
    },
    {
        "name": "write_terraform_file",
        "description": "Write or update a Terraform file",
        "input_schema": {
            "type": "object",
            "properties": {
                "file_path": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["file_path", "content"]
        }
    },
    {
        "name": "terraform_plan",
        "description": "Run terraform plan and return the output",
        "input_schema": {
            "type": "object",
            "properties": {
                "working_dir": {"type": "string"}
            },
            "required": ["working_dir"]
        }
    },
    {
        "name": "terraform_apply",
        "description": "Run terraform apply (requires human approval)",
        "input_schema": {
            "type": "object",
            "properties": {
                "working_dir": {"type": "string"},
                "plan_file": {"type": "string"}
            },
            "required": ["working_dir", "plan_file"]
        }
    }
]
 
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=tools,
    messages=[{
        "role": "user",
        "content": "Add a Redis ElastiCache cluster to staging. 2GB, cache.r7g.large, private subnet, encryption at rest enabled."
    }],
    system="You are a Terraform infrastructure agent. You have access to read and write .tf files and run terraform commands. Follow the existing code style and naming conventions in the codebase."
)
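
When the model responds, its tool calls arrive as `tool_use` blocks that your code must execute and feed back. A minimal local dispatcher for the tools defined above might look like this; the handler behavior (return strings, plan flags) is a sketch, not a fixed API, and `terraform_apply` is deliberately left out so applies always pass through the approval flow:

```python
import subprocess
from pathlib import Path

# Local implementations backing the tool definitions above.
# Return formats are illustrative choices, not a fixed API.
def read_terraform_file(file_path):
    return Path(file_path).read_text()

def write_terraform_file(file_path, content):
    Path(file_path).write_text(content)
    return f"wrote {file_path}"

def terraform_plan(working_dir):
    result = subprocess.run(
        ["terraform", "plan", "-no-color"],
        cwd=working_dir, capture_output=True, text=True,
    )
    return result.stdout

TOOL_HANDLERS = {
    "read_terraform_file": lambda args: read_terraform_file(**args),
    "write_terraform_file": lambda args: write_terraform_file(**args),
    "terraform_plan": lambda args: terraform_plan(**args),
}

def dispatch_tool_call(name, args):
    """Route a tool_use block from the model to its local handler."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(args)
```

Each handler's return value goes back to the model as a `tool_result` message, and the loop repeats until the model stops requesting tools.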

2. Atlantis + AI Review

Atlantis already automates terraform plan on pull requests. Adding an AI review layer:

yaml
# atlantis.yaml with AI review
version: 3
projects:
  - name: staging
    dir: environments/staging
    workflow: ai-reviewed
    autoplan:
      when_modified: ["*.tf", "*.tfvars"]
 
workflows:
  ai-reviewed:
    plan:
      steps:
        - init
        - plan
        - run: |
            # Send plan output to AI for review
            terraform show -json $PLANFILE | \
            curl -X POST https://your-api.com/review-plan \
              -H "Content-Type: application/json" \
              -d @- | \
            tee plan-review.md
        - run: |
            # Post AI review as PR comment
            gh pr comment $PULL_NUM --body-file plan-review.md

The AI reviews the plan for:

  • Security issues (open security groups, unencrypted resources)
  • Cost implications (expensive instance types, over-provisioned resources)
  • Naming convention violations
  • Missing tags
  • Blast radius concerns (too many resources changing at once)
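
A simple way to implement that review endpoint is to condense the JSON plan into a prompt before sending it to the model. A sketch, where the field names (`resource_changes`, `change.actions`, `address`) follow Terraform's JSON plan format but the prompt wording is just one option:

```python
import json

def build_review_prompt(plan_json: dict) -> str:
    """Condense a Terraform JSON plan into a review prompt covering the
    checklist above: security, cost, naming, tags, and blast radius."""
    changes = plan_json.get("resource_changes", [])
    lines = [
        "Review this Terraform plan for: security issues, cost implications,",
        "naming convention violations, missing tags, and blast radius.",
        "",
    ]
    for rc in changes:
        actions = "/".join(rc.get("change", {}).get("actions", []))
        lines.append(f"- {actions}: {rc.get('address', '?')}")
    lines.append("")
    lines.append(f"Total resources changing: {len(changes)}")
    return "\n".join(lines)
```

Keeping the prompt to one line per resource also keeps large plans within the model's context window.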

3. Spacelift AI Assist

Spacelift's built-in AI features can:

  • Generate Terraform from natural language descriptions
  • Review plans and flag risks
  • Suggest optimizations
  • Auto-remediate drift

4. env0 AI Terraform Generator

env0 offers AI-powered Terraform generation integrated into their IaC management platform, with policy enforcement and cost estimation built in.

Building Your Own Terraform AI Agent

Here's a practical architecture for a Terraform AI agent:

Architecture

┌─────────────────────────────────────────────┐
│                 Slack / Chat                  │
│     "Add Redis to staging, 2GB, encrypted"   │
└──────────────────┬──────────────────────────┘
                   │
┌──────────────────▼──────────────────────────┐
│              Agent Orchestrator              │
│  1. Parse intent                            │
│  2. Read existing Terraform                 │
│  3. Generate new HCL                        │
│  4. Run terraform plan                      │
│  5. Check OPA policies                      │
│  6. Request human approval                  │
│  7. Apply if approved                       │
│  8. Verify deployment                       │
└──────────────────┬──────────────────────────┘
                   │
┌─────────┬────────┼────────┬─────────────────┐
│ Codebase│ TF CLI │ OPA    │ Cloud APIs      │
│ (Git)   │        │Policies│ (AWS/GCP/Azure) │
└─────────┴────────┴────────┴─────────────────┘

The Agent Loop

python
import subprocess
import json
 
class TerraformAgent:
    def __init__(self, workspace_dir, llm_client):
        self.workspace_dir = workspace_dir
        self.llm = llm_client
 
    def handle_request(self, user_request):
        # Step 1: Understand existing infrastructure
        existing_tf = self.read_existing_terraform()
 
        # Step 2: Generate Terraform code
        new_code = self.generate_terraform(user_request, existing_tf)
 
        # Step 3: Write to file
        self.write_terraform(new_code)
 
        # Step 4: Run terraform plan
        plan_output = self.terraform_plan()
 
        # Step 5: AI reviews the plan
        review = self.review_plan(plan_output, user_request)
 
        # Step 6: Check policies
        policy_result = self.check_opa_policies(plan_output)
 
        # Step 7: Present to human for approval
        approval = self.request_approval(
            plan=plan_output,
            review=review,
            policy=policy_result
        )
 
        if approval:
            # Step 8: Apply
            result = self.terraform_apply()
            # Step 9: Verify
            self.verify_deployment()
            return f"Infrastructure updated: {result}"
        else:
            self.rollback_code_changes()
            return "Changes cancelled by user"
 
    def terraform_plan(self):
        result = subprocess.run(
            ["terraform", "plan", "-out=tfplan", "-no-color"],
            cwd=self.workspace_dir,
            capture_output=True, text=True
        )
        return result.stdout
 
    def check_opa_policies(self, plan_output):
        # Convert the saved plan to JSON and write it where OPA can read it
        show = subprocess.run(
            ["terraform", "show", "-json", "tfplan"],
            cwd=self.workspace_dir,
            capture_output=True, text=True
        )
        with open(f"{self.workspace_dir}/tfplan.json", "w") as f:
            f.write(show.stdout)
        # Evaluate the deny rules against the plan JSON
        result = subprocess.run(
            ["opa", "eval", "-d", "policies/", "-i",
             f"{self.workspace_dir}/tfplan.json", "data.terraform.deny"],
            capture_output=True, text=True
        )
        return json.loads(result.stdout)

OPA Guardrail Policies

The agent needs guardrails. Use OPA policies to prevent dangerous changes:

rego
# policies/terraform.rego
package terraform
 
import rego.v1
 
# Deny public S3 buckets
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    resource.change.after.acl == "public-read"
    msg := sprintf("Public S3 bucket not allowed: %s", [resource.address])
}
 
# Deny overly permissive security groups
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group_rule"
    resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
    resource.change.after.type == "ingress"
    msg := sprintf("Open ingress rule not allowed: %s", [resource.address])
}
 
# Deny expensive instance types without approval tag
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    expensive := {"x1", "p4", "p5", "dl1", "trn1"}
    instance_family := split(resource.change.after.instance_type, ".")[0]
    instance_family in expensive
    not resource.change.after.tags.cost_approved
    msg := sprintf("Expensive instance %s requires cost_approved tag: %s",
                   [resource.change.after.instance_type, resource.address])
}
 
# Limit blast radius: max 10 resources per apply
deny contains msg if {
    changes := [r | r := input.resource_changes[_]; r.change.actions != ["no-op"]]
    count(changes) > 10
    msg := sprintf("Too many changes (%d). Split into smaller applies.", [count(changes)])
}
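
Back in Python, the agent has to turn `opa eval` output into a pass/fail decision. `opa eval` emits JSON by default, wrapping results in a `result` → `expressions` → `value` structure; a small parser might be:

```python
import json

def parse_opa_denials(opa_output: str) -> list:
    """Extract deny messages from `opa eval` JSON output.

    The default output wraps values as
    {"result": [{"expressions": [{"value": [...]}]}]};
    an empty list means the plan passed every policy.
    """
    doc = json.loads(opa_output)
    denials = []
    for result in doc.get("result", []):
        for expr in result.get("expressions", []):
            denials.extend(expr.get("value") or [])
    return denials
```

The returned list can be posted verbatim to the approval request so the human sees exactly which policies fired.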

Safety Guardrails for AI Terraform

This is infrastructure. Mistakes can take down production. Essential guardrails:

1. Never Auto-Apply to Production

python
def request_approval(self, plan, review, policy):
    if self.environment == "production":
        # Always require human approval for production
        return self.wait_for_human_approval(plan, review, policy)
    elif self.environment == "staging":
        # Auto-apply if policies pass and no destructive changes
        if not policy["violations"] and not self.has_destructive_changes(plan):
            return True
        return self.wait_for_human_approval(plan, review, policy)
    else:  # dev
        # Auto-apply if policies pass
        return not policy["violations"]

2. Destructive Change Detection

python
def has_destructive_changes(self, plan_json):
    """Flag any destroy or replace actions.

    In Terraform's JSON plan, a replacement appears as the action pair
    ["delete", "create"] (or ["create", "delete"]), so checking for
    "delete" covers both destroys and replacements.
    """
    for resource in plan_json.get("resource_changes", []):
        actions = resource.get("change", {}).get("actions", [])
        if "delete" in actions:
            return True
    return False

3. Cost Estimation Before Apply

bash
# Use Infracost to estimate cost impact
infracost diff --path . --format json | jq '.totalMonthlyCost'
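
That number is only useful if something acts on it. A small gate that blocks applies above a monthly-cost threshold could look like this; it reads the `diffTotalMonthlyCost` field from Infracost's JSON output, and the $500 limit is an arbitrary example:

```python
import json

def cost_gate(infracost_json: str, max_monthly_delta: float = 500.0) -> bool:
    """Return True when the estimated monthly cost increase is acceptable.

    diffTotalMonthlyCost is a string dollar amount in Infracost's JSON
    output, or null when nothing changed. The $500 threshold is an
    arbitrary example value.
    """
    report = json.loads(infracost_json)
    delta = float(report.get("diffTotalMonthlyCost") or 0)
    return delta <= max_monthly_delta
```

Wire the gate into the approval step so a failed check downgrades an auto-apply into a human-approval request rather than silently proceeding.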

4. Automatic Rollback

python
def apply_with_rollback(self):
    # Snapshot current state so it can be inspected or restored later
    state = subprocess.run(
        ["terraform", "state", "pull"],
        cwd=self.workspace_dir,
        capture_output=True, text=True
    )
    with open(f"{self.workspace_dir}/state-backup.json", "w") as f:
        f.write(state.stdout)
 
    result = subprocess.run(
        ["terraform", "apply", "tfplan"],
        cwd=self.workspace_dir,
        capture_output=True, text=True
    )
 
    if result.returncode != 0:
        # Apply failed — state is unchanged, report error
        return {"success": False, "error": result.stderr}
 
    # Verify deployment
    if not self.verify_deployment():
        # Deployment verification failed — destroy new resources
        self.terraform_destroy_new_resources()
        return {"success": False, "error": "Deployment verification failed"}
 
    return {"success": True}
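
`verify_deployment` is referenced above but never defined. One tool-agnostic implementation (an assumption, not the only option) is a refresh-only plan: `terraform plan -refresh-only -detailed-exitcode` exits 0 when real infrastructure matches state, 2 when drift is detected, and 1 on error:

```python
import subprocess

def verify_deployment(workspace_dir: str) -> bool:
    """Return True if real infrastructure matches Terraform state.

    Uses -detailed-exitcode: 0 = in sync, 2 = drift detected, 1 = error.
    """
    result = subprocess.run(
        ["terraform", "plan", "-refresh-only", "-detailed-exitcode", "-no-color"],
        cwd=workspace_dir,
        capture_output=True, text=True,
    )
    return result.returncode == 0
```

For stronger guarantees, layer service-level checks on top, such as connecting to the new Redis endpoint before declaring success.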

What This Means for DevOps Engineers

AI agents writing Terraform doesn't eliminate the DevOps engineer. It changes the job:

| Before | After |
| --- | --- |
| Write HCL for every change | Define policies and guardrails |
| Review every PR manually | Review AI-generated plans for edge cases |
| Debug syntax errors | Design infrastructure patterns the AI follows |
| Copy-paste modules | Build reusable modules the AI composes |
| Respond to infra tickets | Set up self-service with AI agent |

The engineer becomes the architect and guardrail designer, not the code writer.

Getting Started Today

  1. Start with plan review — use AI to review terraform plan output in PRs. No risk, immediate value.

  2. Add policy checks — define OPA policies for your security and cost requirements.

  3. Enable generation for dev — let the AI generate Terraform for development environments. Low risk, fast iteration.

  4. Expand to staging — add human approval gates and blast radius limits.

  5. Production last — only after months of proven reliability in lower environments.

Wrapping Up

AI agents writing Terraform isn't science fiction — it's happening now. The combination of LLMs with tool use, policy engines like OPA, and mature Terraform tooling makes it practical and safe.

The key is guardrails. Never give an AI agent unrestricted access to production infrastructure. Always have policy checks, human approval gates, and automatic rollback.

Start with AI-powered plan review in your PRs. That alone will catch security issues and cost surprises that humans miss.

Want to master Terraform, IaC best practices, and infrastructure automation? The KodeKloud Terraform course covers everything from basics to advanced patterns with hands-on labs. For cloud infrastructure to practice Terraform, DigitalOcean has a great Terraform provider and predictable pricing.
