{"id":19,"date":"2026-01-10T02:24:50","date_gmt":"2026-01-10T02:24:50","guid":{"rendered":"https:\/\/platformsignals.dev\/?p=19"},"modified":"2026-01-10T02:27:37","modified_gmt":"2026-01-10T02:27:37","slug":"terraform-is-not-enough-why-infrastructure-as-code-drifts","status":"publish","type":"post","link":"https:\/\/platformsignals.dev\/?p=19","title":{"rendered":"Terraform is Not Enough: Why &#8220;Infrastructure as Code&#8221; Drifts"},"content":{"rendered":"\n<p><strong>The lie we tell ourselves about IaC, and how to fix it with GitOps.<\/strong><\/p>\n\n\n\n<p>If I had a dollar for every time a team claimed to be &#8220;100% Infrastructure as Code&#8221; while simultaneously holding a root terminal open in another window, I could retire early.<\/p>\n\n\n\n<p>We tell ourselves a comforting lie in SRE:&nbsp;<em>If it is in the Terraform file, it is in production.<\/em><\/p>\n\n\n\n<p>The reality is much messier. The reality is&nbsp;<strong>Configuration Drift<\/strong>. And if you aren&#8217;t actively detecting it, your Terraform state is nothing more than a historical document\u2014a snapshot of what you&nbsp;<em>hoped<\/em>&nbsp;the infrastructure looked like three months ago.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The &#8220;ClickOps&#8221; Reality<\/h3>\n\n\n\n<p>Here is the classic scenario:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>You define an AWS Security Group in Terraform allowing port 443.<\/li>\n\n\n\n<li>At 2:00 AM, the site goes down. The on-call engineer (maybe you) realizes the database needs to talk to a new microservice immediately.<\/li>\n\n\n\n<li>Do they write a Pull Request, wait for CI, get approval, and apply? No. The site is down.<\/li>\n\n\n\n<li>They log into the AWS Console, manually add the rule, fix the outage, and go back to sleep.<\/li>\n<\/ol>\n\n\n\n<p><strong>Result:<\/strong>&nbsp;Your infrastructure is now running a configuration that exists nowhere in your code. 
The next time someone runs&nbsp;<code>terraform apply<\/code>, Terraform will either revert that fix because the code never mentions it (causing another outage) or fail because the real resources no longer match what the state expects.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Why Drift Happens<\/h3>\n\n\n\n<p>It isn&#8217;t just panic fixes. Drift happens because:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>External changes:<\/strong>\u00a0Cloud providers sometimes change default behaviors or resource IDs.<\/li>\n\n\n\n<li><strong>Manual tinkering:<\/strong>\u00a0&#8220;I&#8217;ll just change this instance size to test something quick.&#8221; (It is never quick.)<\/li>\n\n\n\n<li><strong>Untracked resources:<\/strong>\u00a0S3 buckets created by scripts or other teams that Terraform doesn&#8217;t even know exist.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"683\" src=\"https:\/\/platformsignals.dev\/wp-content\/uploads\/2026\/01\/terraform-deploy-pipeline-1024x683.png\" alt=\"\" class=\"wp-image-20\" srcset=\"https:\/\/platformsignals.dev\/wp-content\/uploads\/2026\/01\/terraform-deploy-pipeline-1024x683.png 1024w, https:\/\/platformsignals.dev\/wp-content\/uploads\/2026\/01\/terraform-deploy-pipeline-300x200.png 300w, https:\/\/platformsignals.dev\/wp-content\/uploads\/2026\/01\/terraform-deploy-pipeline-768x512.png 768w, https:\/\/platformsignals.dev\/wp-content\/uploads\/2026\/01\/terraform-deploy-pipeline.png 1536w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">The Fix: From &#8220;IaC&#8221; to &#8220;GitOps&#8221;<\/h3>\n\n\n\n<p>Writing HCL (HashiCorp Configuration Language) is the easy part. The hard part is the&nbsp;<strong>lifecycle of the apply<\/strong>. 
To be a true Platform Engineer, you need to move beyond running Terraform locally on your laptop.<\/p>\n\n\n\n<p>Here is the maturity model for fixing drift:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Phase 1: The Ban (Policy)<\/h4>\n\n\n\n<p>Remove write access to the cloud console for humans. This is extreme but effective. If you can&#8217;t click &#8220;Edit Security Group&#8221; in the AWS console because you literally don&#8217;t have the permission, you&nbsp;<em>must<\/em>&nbsp;use Terraform.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><em>Pros:<\/em>\u00a0Stops manual drift cold.<\/li>\n\n\n\n<li><em>Cons:<\/em>\u00a0Makes emergency firefighting much harder. You need a &#8220;Break Glass&#8221; procedure for 2 AM incidents.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Phase 2: Automated Drift Detection (Observability)<\/h4>\n\n\n\n<p>If you can&#8217;t lock the doors, at least install a security camera. Set up a scheduled job (Cron \/ Jenkins \/ GitHub Actions) that runs&nbsp;<code>terraform plan -detailed-exitcode<\/code>&nbsp;every hour. With that flag, exit code 0 means no changes, 2 means changes are pending, and 1 means the plan itself failed.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If the plan reports no changes (exit code 0), send a \u2705 to Slack.<\/li>\n\n\n\n<li>If the plan reports pending changes (exit code 2, meaning reality doesn&#8217;t match code), send a \ud83d\udea8 alert to the SRE channel.<\/li>\n<\/ul>\n\n\n\n<p>This turns drift from a &#8220;deployment surprise&#8221; into a &#8220;monitoring alert.&#8221;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Phase 3: The GitOps Pipeline (Atlantis \/ Terraform Cloud)<\/h4>\n\n\n\n<p>Stop running Terraform from your laptop. 
Use a tool like&nbsp;<strong>Atlantis<\/strong>&nbsp;or&nbsp;<strong>Terraform Cloud<\/strong>&nbsp;that ties the&nbsp;<code>apply<\/code>&nbsp;to a GitHub Pull Request. With Atlantis, the flow looks like this:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>You open a PR.<\/li>\n\n\n\n<li>Atlantis automatically runs\u00a0<code>plan<\/code>\u00a0and comments the output on the PR.<\/li>\n\n\n\n<li>Your colleague reviews the code and the plan.<\/li>\n\n\n\n<li>You comment\u00a0<code>atlantis apply<\/code>.<\/li>\n\n\n\n<li>The bot applies the code and, if automerge is enabled, merges the PR.<\/li>\n<\/ol>\n\n\n\n<p>This creates a reviewable audit trail. We know&nbsp;<em>who<\/em>&nbsp;changed the infra,&nbsp;<em>what<\/em>&nbsp;the output was, and&nbsp;<em>when<\/em>&nbsp;it happened.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Conclusion<\/h3>\n\n\n\n<p>Terraform code is static; infrastructure is dynamic. If you treat your&nbsp;<code>.tf<\/code>&nbsp;files as &#8220;fire and forget,&#8221; you are building a house of cards. True reliability comes when you stop trusting the code blindly and start verifying the state continuously.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The lie we tell ourselves about IaC, and how to fix it with GitOps. If I had a dollar for every time a team claimed to be &#8220;100% Infrastructure as Code&#8221; while simultaneously holding a root terminal open in another window, I could retire early. 
We tell ourselves a comforting lie in SRE:&nbsp;If it is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[9,7,8,6,5,4],"class_list":["post-19","post","type-post","status-publish","format-standard","hentry","category-war-stories","tag-atlantis","tag-aws","tag-ci-cd","tag-gitops","tag-infrastructure-as-code","tag-terraform"],"_links":{"self":[{"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/posts\/19","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=19"}],"version-history":[{"count":1,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/posts\/19\/revisions"}],"predecessor-version":[{"id":21,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=\/wp\/v2\/posts\/19\/revisions\/21"}],"wp:attachment":[{"href":"https:\/\/platformsignals.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=19"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=19"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/platformsignals.dev\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=19"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}