Terraform + GitHub Actions: The Modern IaC CI/CD Pipeline Every DevOps Team Needs

Published 2 months ago
DevOps and Infrastructure

Introduction

You've probably been there. An engineer runs terraform apply from their local machine, state drifts, another engineer runs a conflicting plan, and suddenly your staging environment is half-provisioned at 2 AM.

Manual Terraform workflows have a dirty secret — they don't scale. As soon as you have more than two engineers touching infrastructure, you need a single source of truth for every change, peer review before anything touches production, full audit trails, automated validation, and zero long-lived credentials sitting on developer laptops.

This is exactly what a well-designed Terraform + GitHub Actions pipeline solves. By treating infrastructure changes the same way you treat application code — with PRs, reviews, automated tests, and controlled deployments — you eliminate the chaos and build a repeatable, auditable system.

Why Manual Terraform is Killing Your Team

Manual workflows create four serious problems at scale:

No guardrails. Anyone with AWS credentials can run terraform apply directly. There's no review, no approval, no rollback plan.

State conflicts. Two engineers running apply at the same time against the same state file, with no locking, race each other and can corrupt the state.

Zero visibility. When something breaks, you have no audit trail. Who applied what? When? What changed?

Credential sprawl. Every developer has long-lived AWS keys on their laptop. Each one is a potential security incident waiting to happen.

A CI/CD pipeline fixes all four — automatically.

Pipeline Architecture Overview

Before writing any YAML, here's the mental model. The pipeline has two distinct moments:

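On Pull Request: Format check → Validate → Lint → terraform plan (posted as a PR comment so reviewers see exactly what will change before approving).

On Merge to main: The reviewed change is applied automatically. No human runs commands; the pipeline is the only entity that ever runs terraform apply. Note that the apply workflow shown later runs a fresh plan at apply time rather than reusing the plan file from the PR, so the applied plan can differ from the reviewed one if the infrastructure changed in between.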

Engineer opens PR
        ↓
terraform fmt + validate + tflint
        ↓
terraform plan (posted to PR as comment)
        ↓
Code Review + Approval
        ↓
Merge → main → terraform apply

Prerequisites & Repo Structure

Here's the repository layout we'll work with throughout this guide:


.
├── .github/
│   └── workflows/
│       ├── terraform-ci.yml      # runs on PR
│       └── terraform-apply.yml   # runs on merge to main
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/
├── modules/
│   ├── vpc/
│   ├── eks/
│   └── rds/
└── README.md

Each environment lives in its own directory with its own remote state file (S3 + DynamoDB for AWS). Shared modules live in modules/ and are consumed by each environment.

Tip: Store Terraform state remotely and use state locking. Never commit .terraform/ or *.tfstate to Git. Add them to .gitignore from day one.
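As a minimal sketch of what environments/prod/main.tf might contain under this layout: an S3 backend with DynamoDB locking, plus one call into a shared module. The bucket name tfstate-prod matches the prod row of the state backend table in this guide; the state key, the lock table name terraform-locks, and the module inputs (name, cidr_block) are hypothetical placeholders, not values from the original setup.

```hcl
terraform {
  backend "s3" {
    bucket         = "tfstate-prod"           # one state bucket per environment
    key            = "prod/terraform.tfstate" # hypothetical object key
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"        # hypothetical lock table (hash key: LockID)
    encrypt        = true
  }
}

module "vpc" {
  # Relative path from environments/prod/ to the shared module
  source = "../../modules/vpc"

  # Hypothetical inputs; the real ones depend on the variables modules/vpc declares
  name       = "prod"
  cidr_block = "10.0.0.0/16"
}
```

Each environment directory would carry the same shape with its own bucket or key, which is what keeps dev, staging, and prod state isolated from one another.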

Writing the GitHub Actions CI Workflow

This workflow fires on every pull request. It runs format checks, validation, linting, and terraform plan — then posts the plan output as a PR comment.

yaml

name: Terraform CI

on:
  pull_request:
    branches:
      - main
    paths:
      - 'environments/**'
      - 'modules/**'

permissions:
  contents: read
  pull-requests: write # needed to post plan comments
  id-token: write      # needed for OIDC auth

env:
  TF_VERSION: "1.8.0"
  WORKING_DIR: ./environments/prod

jobs:
  terraform-ci:
    name: Terraform Plan
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-terraform
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Terraform Format Check
        id: fmt
        run: terraform fmt -check -recursive
        working-directory: ${{ env.WORKING_DIR }}
        continue-on-error: true # report the result in the comment first

      - name: Terraform Init
        id: init
        run: terraform init -input=false
        working-directory: ${{ env.WORKING_DIR }}

      - name: Terraform Validate
        id: validate
        run: terraform validate -no-color
        working-directory: ${{ env.WORKING_DIR }}

      - name: Terraform Plan
        id: plan
        run: terraform plan -no-color -input=false -out=tfplan
        working-directory: ${{ env.WORKING_DIR }}
        continue-on-error: true # let the next step post a failed plan

      - name: Post Plan to PR
        uses: actions/github-script@v7
        env:
          # Passed through the environment so backticks in the plan output
          # cannot break the JavaScript template literal below
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            const output = `#### 🖌 Terraform Format: \`${{ steps.fmt.outcome }}\`
            #### Terraform Init: \`${{ steps.init.outcome }}\`
            #### Terraform Validate: \`${{ steps.validate.outcome }}\`
            #### Terraform Plan: \`${{ steps.plan.outcome }}\`

            <details><summary>Show Plan</summary>

            \`\`\`terraform
            ${process.env.PLAN}
            \`\`\`

            </details>

            *Pushed by: @${{ github.actor }}, Action: \`${{ github.event_name }}\`*`;

            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            });

      # continue-on-error keeps the job green, so fail it explicitly
      - name: Fail if format check or plan failed
        if: steps.fmt.outcome == 'failure' || steps.plan.outcome == 'failure'
        run: exit 1

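The paths filter ensures the pipeline only runs when Terraform files actually change, not on every PR. The continue-on-error: true on the plan step means even a failed plan gets posted to the PR, so reviewers can see exactly why it failed. Because continue-on-error also lets the job finish green, end the job with a step that exits non-zero when steps.plan.outcome is failure; otherwise a broken plan will not block the merge.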

Adding tflint for Static Analysis

Beyond terraform validate, use tflint to catch provider-specific issues, deprecated syntax, and bad practices:

yaml

- name: Setup TFLint
  uses: terraform-linters/setup-tflint@v4
  with:
    tflint_version: v0.50.3

- name: Init TFLint
  run: tflint --init
  working-directory: ${{ env.WORKING_DIR }}

- name: Run TFLint
  run: tflint -f compact
  working-directory: ${{ env.WORKING_DIR }}

Terraform Apply on Merge

When a PR is merged to main, the apply workflow fires automatically. No engineer needs to remember to run anything.

yaml

name: Terraform Apply

on:
  push:
    branches:
      - main
    paths:
      - 'environments/**'
      - 'modules/**'

permissions:
  contents: read
  id-token: write

jobs:
  terraform-apply:
    name: Terraform Apply
    runs-on: ubuntu-latest
    environment: production # enables GitHub environment protection rules

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-terraform
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.8.0"

      - name: Terraform Init
        run: terraform init -input=false
        working-directory: ./environments/prod

      - name: Terraform Apply
        run: terraform apply -auto-approve -input=false
        working-directory: ./environments/prod

The environment: production key ties this job to a GitHub Environment, giving you required reviewers, deployment protection rules, and environment-specific secrets — all managed from the GitHub UI.

Secrets Management & OIDC Auth

This is the most critical security decision in the entire pipeline. Do not use long-lived AWS access keys stored as GitHub Secrets. Use OpenID Connect (OIDC) instead — GitHub Actions authenticates directly with AWS without any static credentials.

Step 1: Create an OIDC Identity Provider in AWS IAM using https://token.actions.githubusercontent.com as the provider URL with audience sts.amazonaws.com.

Step 2: Create an IAM Role with a trust policy scoped to your specific GitHub org and repo.

Step 3: Attach a least-privilege IAM policy — only the permissions your Terraform code actually needs, not AdministratorAccess.

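Step 4: Store the role ARN as a GitHub Secret and reference it in the workflow. (The CI and apply workflows in this guide store only the account ID as AWS_ACCOUNT_ID and build the ARN inline, while the drift detection workflow reads a full AWS_ROLE_ARN secret. Pick one convention and use it everywhere.)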

The IAM trust policy:

json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:YOUR-ORG/YOUR-REPO:*"
        }
      }
    }
  ]
}
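The same trust policy can be expressed in Terraform. The sketch below assumes the OIDC provider from Step 1 already exists in the account and narrows the wildcard subject to the three contexts the workflows in this guide run in: pull requests, jobs bound to the production environment, and scheduled runs on main. YOUR-ORG/YOUR-REPO and the resource names are placeholders.

```hcl
# Look up the provider created in Step 1 (assumed to exist already)
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values = [
        "repo:YOUR-ORG/YOUR-REPO:pull_request",           # plan jobs on PRs
        "repo:YOUR-ORG/YOUR-REPO:environment:production", # jobs with environment: production
        "repo:YOUR-ORG/YOUR-REPO:ref:refs/heads/main",    # scheduled or push runs on main
      ]
    }
  }
}

resource "aws_iam_role" "github_actions_terraform" {
  name               = "github-actions-terraform"
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

A stricter variant would split this into a read-only role for pull request plans and a separate write role that only the production environment subject can assume, so that code from an unmerged PR can never obtain apply permissions.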

For non-AWS providers: Use HashiCorp Vault with GitHub Actions OIDC to issue short-lived dynamic credentials for any cloud — keeping zero static secrets anywhere in your pipeline.

Multi-Environment Strategy

Most teams need at least dev, staging, and prod. The cleanest pattern is separate workflow triggers per environment with separate state files, rather than one monolithic workflow with conditionals.

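| Environment | Trigger         | Auto Apply | Requires Approval | State Backend       |
|-------------|-----------------|------------|-------------------|---------------------|
| dev         | Push to develop | Yes        | No                | S3: tfstate-dev     |
| staging     | Push to main    | Yes        | No                | S3: tfstate-staging |
| prod        | Push to main    | No         | Yes (manual)      | S3: tfstate-prod    |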

For production specifically, configure a GitHub Environment with required reviewers. This puts a manual approval gate in front of the apply job: even after a PR merges, a designated approver must click "Approve" in GitHub before Terraform touches prod.

yaml

terraform-apply-prod:
  name: Apply → Production
  runs-on: ubuntu-latest
  environment:
    name: production
    url: https://console.aws.amazon.com/
  needs: terraform-apply-staging # prod runs only after staging succeeds

Drift Detection

Configuration drift happens when real cloud resources diverge from your Terraform state — usually because someone made a manual change in the console. Left unchecked, drift causes plan/apply failures and inconsistent environments.

The fix is a scheduled drift detection workflow that runs terraform plan every weekday morning and notifies your team when changes are detected:

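name: Drift Detection

on:
  schedule:
    - cron: '0 8 * * 1-5' # 8 AM UTC on weekdays
  workflow_dispatch: # allow manual trigger

permissions:
  contents: read
  id-token: write # needed for OIDC auth

jobs:
  detect-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.8.0" # match required_version
          terraform_wrapper: false   # let the shell see Terraform's real exit code

      - name: Terraform Init
        run: terraform init -input=false
        working-directory: ./environments/prod

      - name: Run Drift Detection Plan
        id: plan
        run: |
          # After a pipe, $? is tee's status; Terraform's is in PIPESTATUS[0]
          terraform plan -detailed-exitcode -no-color 2>&1 | tee plan.txt
          echo "exitcode=${PIPESTATUS[0]}" >> "$GITHUB_OUTPUT"
        working-directory: ./environments/prod
        continue-on-error: true

      - name: Notify Slack on Drift
        if: steps.plan.outputs.exitcode == '2' # exit 2 = changes detected
        # The version tag is an assumption (a v1.x release, matching the payload syntax below)
        uses: slackapi/slack-github-action@v1.26.0
        with:
          payload: |
            {
              "text": "⚠️ *Infrastructure drift detected in prod!* Review: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

      - name: Fail on plan error
        if: steps.plan.outputs.exitcode == '1' # exit 1 = the plan itself errored
        run: exit 1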
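A daily drift alert is only useful if it fires on changes nobody expected. One way to keep it quiet about attributes that are meant to change outside Terraform is a lifecycle ignore_changes rule. The sketch below is illustrative only: the autoscaling group, its sizes, and both variables are hypothetical and not part of the original setup.

```hcl
variable "private_subnet_ids" {
  type = list(string) # hypothetical input
}

variable "launch_template_id" {
  type = string # hypothetical input
}

resource "aws_autoscaling_group" "workers" {
  name                = "workers"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  lifecycle {
    # Scaling policies adjust this at runtime; without this rule every
    # scheduled plan would report it as drift
    ignore_changes = [desired_capacity]
  }
}
```

Use this sparingly: every ignored attribute is one the drift check can no longer vouch for.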

Best Practices & Gotchas

1. Pin Your Terraform and Provider Versions

Always pin exact versions in required_providers. Unexpected provider upgrades have broken more production environments than almost anything else.

terraform {
  required_version = "= 1.8.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.50.0"
    }
  }
}

2. Never Run terraform apply in the CI Job

Your CI (pull request) job should only ever run plan. Apply lives in a separate workflow, triggered only on merge to main. This separation is what gives you the safety net of peer review.

3. Use -target Sparingly

terraform apply -target=aws_instance.web is tempting in emergencies, but it creates partial state that confuses future applies. If you find yourself reaching for -target, it's usually a sign your modules are too tightly coupled.

4. Protect Your State Backend

Your S3 state bucket needs versioning enabled (for instant rollback on corrupted state), server-side encryption (SSE-KMS), all public access blocked, and a DynamoDB table for state locking to prevent concurrent applies.
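Those four requirements map onto a handful of AWS provider resources. This is a minimal sketch, assuming the bucket name tfstate-prod from the environment table and a hypothetical lock table name terraform-locks; it uses the AWS-managed KMS key because no key of your own is specified.

```hcl
resource "aws_s3_bucket" "tfstate" {
  bucket = "tfstate-prod"
}

# Versioning: lets you restore an earlier state object
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Server-side encryption with KMS
resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms" # AWS-managed key unless kms_master_key_id is set
    }
  }
}

# Block every form of public access
resource "aws_s3_bucket_public_access_block" "tfstate" {
  bucket                  = aws_s3_bucket.tfstate.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table: the S3 backend expects a string hash key named LockID
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks" # hypothetical table name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Because a backend cannot create the bucket it stores its own state in, these resources are usually applied once from a small bootstrap configuration with local state, then left alone.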

5. Add Checkov for Security Scanning

Use checkov in your CI pipeline to catch security misconfigurations — open security groups, unencrypted S3 buckets, public RDS instances — before they ever reach your cloud account.

yaml

- name: Run Checkov Security Scan
  uses: bridgecrewio/checkov-action@v12
  with:
    directory: ./environments/prod
    framework: terraform
    output_format: github_failed_only
    soft_fail: false # fail the build on security issues

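Pro tip: Suppress known false positives explicitly, with a documented reason, instead of letting them fail every build. The action's skip_check input (the CLI's --skip-check flag) disables a check for the whole scan, while an inline checkov:skip=<CHECK_ID>:<reason> comment suppresses it on a single resource and records why. This keeps signal-to-noise high and prevents engineers from ignoring all checks due to irrelevant failures.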
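An inline suppression might look like the sketch below. The bucket, the chosen check ID (CKV_AWS_18, the S3 access logging check), and the reason text are illustrative assumptions, not values from the original pipeline.

```hcl
resource "aws_s3_bucket" "static_assets" {
  #checkov:skip=CKV_AWS_18:Access logging is handled by the CDN in front of this bucket
  bucket = "example-static-assets" # hypothetical bucket name
}
```

The comment lives next to the resource it exempts, so the justification goes through the same PR review as the code.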

Conclusion

Building a Terraform + GitHub Actions pipeline isn't just about automation — it's about building a culture where infrastructure changes are as rigorously reviewed as application code. When every change goes through a PR, every plan is visible to reviewers, and every apply is logged and traceable, you eliminate the entire class of incidents that come from "someone ran something manually."

Here's what this pipeline gives you:

  • A CI workflow that validates, lints, and plans on every PR — with the plan posted as a comment
  • A CD workflow that applies on merge to main, with environment protection gates for prod
  • OIDC authentication — no static credentials anywhere
  • A multi-environment strategy with separate state and IAM roles per environment
  • A drift detection workflow that keeps you honest about your live infrastructure
  • Security scanning with Checkov integrated directly into the CI gate

This pipeline is production-ready as written. Adapt the environment paths and IAM role ARNs to your setup, and you have a foundation that scales from a solo engineer to a full platform team.

Written by

Subhash Tiwari, DevOps Engineer