Why This Topic Matters Now
Infrastructure as Code has moved from a niche practice to a standard expectation in most engineering organizations. But as teams grow and systems become more distributed, the simple act of managing configuration files turns into a coordination problem. The question is no longer whether to use IaC, but how to structure the workflow around it. Workflow patterns—the sequence of steps from code change to infrastructure update—directly affect deployment speed, error rates, and team autonomy.
Many teams start with a basic push model: someone runs a script or a CI job that applies changes directly to production. That works for a single developer or a small project, but it quickly breaks down when multiple people touch the same resources. Conflicts, unapproved changes, and configuration drift become daily headaches. Pull-based workflows, where changes are reviewed and merged before applying, offer more control but can slow down delivery. Hybrid approaches try to balance both, but they introduce their own complexity.
This guide compares the three dominant workflow patterns for IaC—push, pull, and hybrid—by showing how they work, where they fit, and what trade-offs they carry. We focus on conceptual understanding rather than tool-specific instructions, so you can apply these patterns to Terraform, Pulumi, Ansible, or any other IaC tooling. By the end, you should be able to identify which pattern matches your team's size, risk tolerance, and operational maturity.
Core Idea in Plain Language
At its simplest, an IaC workflow is the path a configuration change takes from a developer's local machine to the live infrastructure. The core idea is to treat infrastructure definitions like application code: version-controlled, reviewed, tested, and deployed through a repeatable process. The pattern determines who initiates the deployment and how changes are approved.
In a push-based workflow, an operator or a CI pipeline directly applies changes to a target environment. There's no separate review step before the apply—the code is already reviewed in the pull request, and the apply happens automatically or manually after merge. This is common in small teams or for non-critical environments where speed matters more than strict controls.
A pull-based workflow uses a software agent (such as a GitOps controller) that continuously compares the desired state in a repository with the actual state of the infrastructure. When a change is merged, the agent pulls the new configuration and applies it. No one runs an apply command directly. This pattern is popular in Kubernetes ecosystems with tools like Argo CD or Flux.
Hybrid workflows combine elements of both. For example, a team might use push for development environments and pull for production, or they might use a push-based CI pipeline that triggers a pull-based operator. The goal is to get the speed of push where safety is less critical and the reliability of pull where it matters most.
The key insight is that the workflow pattern influences not just technical outcomes but also team dynamics. Push patterns give developers more autonomy but require strong discipline and monitoring. Pull patterns enforce consistency but can create bottlenecks if the review process is too heavy. Hybrid patterns are flexible but require careful design to avoid confusion about which process applies when.
How It Works Under the Hood
Push-Based Mechanics
In a push workflow, the deployment pipeline is triggered by an event—typically a merge to a main branch or a manual trigger. The pipeline runs a plan or dry-run step (e.g., terraform plan), then executes the apply. The state file is updated immediately. This works well when the pipeline has direct network access to the infrastructure API, and when the team trusts that the review process before merge is sufficient.
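The plan-then-apply sequence can be sketched in Python as follows. This is a minimal illustration rather than a production pipeline: run_push_pipeline, shell_runner, and the injected run callable are hypothetical names. The only Terraform behaviour it relies on is that terraform plan -detailed-exitcode exits with 0 when nothing would change, 1 on error, and 2 when changes are pending.

```python
import subprocess
from typing import Callable, List

# A runner takes a command (list of arguments) and returns its exit code.
Runner = Callable[[List[str]], int]


def shell_runner(cmd: List[str]) -> int:
    """Run a command for real; swap in a fake runner for tests."""
    return subprocess.run(cmd).returncode


def run_push_pipeline(run: Runner = shell_runner) -> str:
    """Plan, then apply only if the plan reports pending changes."""
    plan_code = run(["terraform", "plan", "-input=false",
                     "-detailed-exitcode", "-out=tfplan"])
    if plan_code == 0:
        return "no-changes"      # nothing to apply
    if plan_code != 2:
        return "plan-failed"     # exit code 1 (or anything unexpected)
    # Apply exactly the plan that was just produced, not a fresh one.
    apply_code = run(["terraform", "apply", "-input=false", "tfplan"])
    return "applied" if apply_code == 0 else "apply-failed"
```

Injecting the runner keeps the control flow testable without Terraform installed, and applying the saved plan file means the apply step executes the plan that was produced a moment earlier.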
The main risk is that if the pipeline fails mid-apply, or if someone bypasses the review process, the infrastructure can end up in an inconsistent state. Also, because the apply happens from a CI runner, the state file must be stored centrally (e.g., in S3 or Terraform Cloud) and locked during the operation to prevent concurrent modifications.
Pull-Based Mechanics
A pull workflow inverts the control: instead of pushing changes to infrastructure, an agent inside the target environment polls a Git repository for changes. When a new commit is detected, the agent reconciles the actual state with the desired state defined in the repo. This means the agent must have credentials to modify the infrastructure, and those credentials are often stored as secrets within the cluster or environment.
The advantage is that the apply happens inside the environment, so the infrastructure API does not need to be reachable from an external CI runner; the agent only needs outbound access to the Git repository. The agent can also detect drift (changes made outside the workflow) and revert them automatically. However, the agent needs to be deployed and maintained, and it adds latency: changes take effect only after the next polling cycle (usually a few minutes).
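The reconciliation step at the heart of a pull agent can be sketched as a diff between two dictionaries. This is a simplified model under stated assumptions: diff_states and reconcile are hypothetical names, and real agents compare live API objects rather than plain dictionaries.

```python
from typing import Dict, List, Tuple

State = Dict[str, dict]  # resource name -> resource spec


def diff_states(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return (action, resource) pairs needed to make actual match desired."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Apply the diff to a copy of actual and return the converged state."""
    converged = {name: dict(spec) for name, spec in actual.items()}
    for action, name in diff_states(desired, actual):
        if action == "delete":
            del converged[name]
        else:  # create and update both write the desired spec
            converged[name] = dict(desired[name])
    return converged
```

An agent would run this comparison on every polling cycle. Because a manual change shows up as an "update" action, drift correction falls out of the same loop that applies new commits.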
Hybrid Mechanics
Hybrid workflows often use a push pipeline for initial provisioning or for environments where the pull agent is not yet set up. For example, a CI pipeline might apply the base infrastructure (VPC, IAM roles) using push, then deploy a pull-based agent into a Kubernetes cluster to manage applications. Another common hybrid is to use push for staging environments and pull for production, with a manual approval gate before the production apply.
The challenge is maintaining consistency across the two modes. If a team uses push for development and pull for production, they need to ensure that the same configuration is applied in both, which can be tricky if the push pipeline uses different parameters or if drift occurs in the development environment.
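One way to check that both modes apply the same configuration is to compare a fingerprint of the rendered configuration, ignoring the parameters that are expected to differ per environment. The sketch below assumes the configuration can be represented as a JSON-serializable dictionary; config_fingerprint and env_specific_keys are hypothetical names.

```python
import hashlib
import json
from typing import FrozenSet


def config_fingerprint(config: dict,
                       env_specific_keys: FrozenSet[str] = frozenset()) -> str:
    """Hash the configuration, skipping keys that legitimately vary by environment."""
    shared = {k: v for k, v in config.items() if k not in env_specific_keys}
    # Sorted keys and fixed separators give a stable, canonical encoding.
    canonical = json.dumps(shared, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

If the push pipeline for development and the pull agent for production both report this fingerprint, a mismatch is an early signal that the two paths have diverged.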
Worked Example or Walkthrough
Let's walk through a realistic scenario: provisioning a multi-environment Kubernetes cluster (dev, staging, production) using Terraform. The team has five engineers and is adopting IaC for the first time.
Scenario: Push-Based Approach
The team creates a single repository with directories for each environment. They set up a CI pipeline that runs terraform plan on every pull request and terraform apply on merge to the main branch. The pipeline uses Terraform Cloud for state management and locking. This works well initially: changes are fast, and the team can iterate quickly.
After a few weeks, a developer accidentally merges a change that modifies the production VPC CIDR block. The pipeline applies it immediately, causing a brief network outage. The team realizes they need an approval gate for production. They add a manual approval step in the pipeline before the production apply. The result is still a push workflow, but a gated one: automatic push for dev and staging, push with manual approval for production.
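An approval gate of this kind can be sketched as a small policy check that the pipeline runs before the apply step. The names here (GATED_ENVIRONMENTS, may_apply) and the rule that an author cannot approve their own change are assumptions for illustration, not features of any particular CI system.

```python
from typing import Set

# Hypothetical policy: only production requires an explicit approval.
GATED_ENVIRONMENTS = {"production"}


def may_apply(environment: str, author: str, approvers: Set[str]) -> bool:
    """Allow the apply unless a gated environment lacks an independent approval."""
    if environment not in GATED_ENVIRONMENTS:
        return True
    # The author's own approval does not count toward the gate.
    return len(approvers - {author}) >= 1
```

Most CI systems offer a built-in equivalent (protected environments or manual jobs); the point of the sketch is only that the gate is evaluated per environment, so dev and staging stay fast.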
Scenario: Pull-Based Approach
Another team adopts a pull-based workflow from the start. They use Argo CD to manage Kubernetes manifests and Crossplane for cloud resources. The configuration is stored in a Git repository, and Argo CD continuously syncs the cluster state. When a developer merges a change, Argo CD picks it up within three minutes and applies it.
The team finds that drift detection is a big win: if someone manually changes a cloud resource via the console, Crossplane's reconciliation reverts it, and Argo CD (with self-healing enabled) does the same for objects changed directly in the cluster. However, they struggle with the initial setup, because configuring Argo CD and Crossplane requires deep knowledge of both tools. Also, they cannot easily apply changes that require manual intervention (like rotating a database password) because the agents always reconcile to the desired state.
Scenario: Hybrid Approach
A third team uses a hybrid approach: they push the base infrastructure (VPC, subnets, IAM) using a CI pipeline, then deploy a pull-based agent (Flux) into each cluster to manage application workloads. The CI pipeline also handles secrets by storing them in a vault and injecting them into the cluster during the initial push.
This works well for separating concerns: the platform team owns the push pipeline for core infrastructure, while application teams own the Git repositories that Flux syncs. The downside is that there are two workflows to maintain, and debugging issues requires understanding both the CI pipeline and the GitOps operator.
Edge Cases and Exceptions
No workflow pattern handles every situation gracefully. Here are common edge cases where each pattern can break down.
Drift Detection
In push workflows, drift is not automatically detected. If someone makes a manual change via the cloud console, the next push will either overwrite it (if the change is not in the configuration) or cause a conflict. Teams using push often need to run periodic drift detection, for example a scheduled terraform plan -refresh-only (the replacement for the deprecated terraform refresh command) or a terraform plan -detailed-exitcode job that alerts when changes are pending. Pull workflows handle drift natively, but they can be too aggressive: if a manual change is intentional (e.g., a temporary scaling adjustment), the agent will revert it, potentially causing an incident.
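The tension between catching drift and tolerating intentional changes can be illustrated with a field-level comparison that accepts an ignore list. This is a sketch under stated assumptions: find_drift and the ignore parameter are hypothetical, and the field names in the usage note are made up for the example.

```python
from typing import Dict, FrozenSet, Tuple


def find_drift(desired: dict, actual: dict,
               ignore: FrozenSet[str] = frozenset()) -> Dict[str, Tuple[object, object]]:
    """Return {field: (desired_value, actual_value)} for fields that differ.

    Fields listed in `ignore` are skipped, which models a temporary,
    deliberate exception such as a manual scaling adjustment.
    """
    drift = {}
    for field in sorted(set(desired) | set(actual)):
        if field in ignore:
            continue
        want, have = desired.get(field), actual.get(field)
        if want != have:
            drift[field] = (want, have)
    return drift
```

With desired = {"replicas": 3} and actual = {"replicas": 10}, the function reports the replica count as drift unless "replicas" is in the ignore set. Keeping the ignore list in version control makes such exceptions visible and reviewable instead of silent.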
Secret Management
Secrets are a challenge in any workflow. In push workflows, secrets can be injected at runtime from a vault, but the pipeline must have access to the vault. In pull workflows, secrets are often stored encrypted in Git (using tools like Sealed Secrets or SOPS) and decrypted by the agent. If the agent's decryption key is compromised, every secret encrypted with that key is exposed. Hybrid workflows can use push for initial secret injection and pull for ongoing management, but this adds complexity.
State Locking and Concurrency
In push workflows, state locking is critical to prevent two pipelines from applying changes simultaneously. Most remote state backends support locking, but if a pipeline crashes, the lock may not be released automatically, blocking subsequent deployments. Pull workflows typically use a single agent per environment, so locking is less of an issue, but if the agent is down, no changes can be applied.
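The locking behaviour described here, including the stuck-lock failure mode, can be modelled with a small in-memory class. It is a stand-in for a remote backend's lock, not an implementation of one: StateLock, its methods, and the 15-minute staleness threshold are all assumptions made for the sketch.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lock:
    holder: str
    acquired_at: float


class StateLock:
    """In-memory stand-in for a remote state lock."""

    def __init__(self, stale_after_seconds: float = 900.0):
        self.stale_after_seconds = stale_after_seconds
        self._lock: Optional[Lock] = None

    def acquire(self, holder: str, now: Optional[float] = None) -> bool:
        """Take the lock if it is free; never steal it from another holder."""
        now = time.time() if now is None else now
        if self._lock is not None:
            return False
        self._lock = Lock(holder, now)
        return True

    def release(self, holder: str) -> bool:
        """Release the lock, but only for the run that holds it."""
        if self._lock is not None and self._lock.holder == holder:
            self._lock = None
            return True
        return False

    def is_stale(self, now: Optional[float] = None) -> bool:
        """Report a lock held longer than the threshold, e.g. by a crashed run."""
        now = time.time() if now is None else now
        return (self._lock is not None
                and now - self._lock.acquired_at > self.stale_after_seconds)
```

The sketch deliberately reports a stale lock instead of breaking it automatically. With Terraform, clearing a stuck lock is a manual decision made with terraform force-unlock, because an automated override could let two applies run at once.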
Approval Gates
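In pull workflows, approval is typically done at the pull request level before merge. If you need an additional approval before the apply (e.g., for production), you have to add it outside the merge itself: for example, by disabling automated sync in Argo CD so that an authorized person triggers the sync manually, or by promoting changes to a protected production branch through a second reviewed pull request. (Argo CD's sync waves only order resources within a sync; they do not provide approval.) This can be done, but it adds complexity and slows down delivery.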
Limits of the Approach
Each workflow pattern has fundamental limits that teams should understand before committing.
Push Workflow Limits
Push workflows scale poorly as the number of environments and teams grows. Each pipeline run must have network access to the target environment, which can be a security concern in multi-account setups. Also, because the apply runs from a CI runner, the state file must be stored in a shared location, creating a single point of failure. If the state file is corrupted or locked, no one can deploy.
Pull Workflow Limits
Pull workflows introduce latency—changes are not applied immediately. This can be a problem for incident response where you need to make a quick change. Also, pull agents consume resources in the target environment, and they need to be updated regularly. If the agent is misconfigured, it might not sync correctly, leading to configuration drift that goes unnoticed until the next audit.
Hybrid Workflow Limits
Hybrid workflows inherit the limits of both patterns and add complexity. Teams must maintain two sets of tooling and processes, which can lead to confusion about which workflow applies to which resource. For example, if a developer makes a change to a resource that is managed by the pull agent but the change is pushed via CI, the agent will revert it. This can cause frustrating ping-pong between tools.
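A simple guard against this ping-pong is to declare which workflow owns each resource and fail fast when the two lists overlap. The sketch below assumes resources can be identified by string names; find_ownership_conflicts, owner_of, and the resource names in the usage note are hypothetical.

```python
from typing import Set


def find_ownership_conflicts(push_managed: Set[str], pull_managed: Set[str]) -> Set[str]:
    """Return resources claimed by both the push pipeline and the pull agent."""
    return push_managed & pull_managed


def owner_of(resource: str, push_managed: Set[str], pull_managed: Set[str]) -> str:
    """Return 'push', 'pull', or 'unmanaged'; raise if ownership is ambiguous."""
    if resource in find_ownership_conflicts(push_managed, pull_managed):
        raise ValueError(f"{resource!r} is claimed by both workflows")
    if resource in push_managed:
        return "push"
    if resource in pull_managed:
        return "pull"
    return "unmanaged"
```

Running the conflict check in CI, for instance with push_managed = {"vpc", "iam-roles"} and pull_managed = {"web-deployment"}, turns an ambiguous ownership boundary into a failed build instead of a revert loop in production.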
Another limit is that hybrid workflows often require custom scripting to bridge the two modes. For instance, if you want to use a push pipeline to bootstrap a pull agent, you need to ensure that the agent's configuration is stored in the same Git repository and that the pipeline does not overwrite it later.
Reader FAQ
What is the best workflow pattern for a small team (2-3 people)?
For a small team, a push-based workflow is often the most pragmatic choice. It's simple to set up, and the team can communicate directly about changes. Use a remote state backend with locking to avoid conflicts. As the team grows, consider adding approval gates for production environments.
Can I use a pull-based workflow with Terraform?
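Yes, but it requires additional tooling. Terraform is not natively pull-based: something has to run plan and apply on its behalf. A true pull model needs an in-cluster controller that reconciles Terraform configuration from Git, such as the Flux ecosystem's Tofu Controller (formerly the Weave TF-controller). Crossplane is a related option, but it replaces Terraform with Kubernetes-native resource definitions rather than running it. Atlantis and Terraform Cloud's VCS-driven runs are often mentioned in this context, but they are closer to automated push: they run plans and applies from a central service in response to pull request comments or merges rather than reconciling continuously.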
How do I handle secrets in a pull-based workflow?
Use a tool like Sealed Secrets or SOPS to encrypt secrets in the repository. The pull agent decrypts them using a key stored in the environment (e.g., a Kubernetes secret). Ensure that the decryption key is rotated regularly and that access to it is restricted.
What happens if the pull agent goes down?
If the agent goes down, no changes are applied until it is restored. The infrastructure remains in its last known state. To mitigate this, run multiple replicas of the agent and monitor its health. Also, have a manual fallback plan (e.g., a push pipeline) for emergency changes.
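Monitoring the agent's health can be as simple as alerting when the last successful sync is older than a few polling cycles. The sketch below assumes the agent exposes a last-sync timestamp; agent_is_stale and the default of three missed cycles are hypothetical choices.

```python
from datetime import datetime, timedelta


def agent_is_stale(last_sync: datetime, now: datetime,
                   sync_interval: timedelta, missed_cycles: int = 3) -> bool:
    """Return True when the agent has missed more than `missed_cycles` sync intervals."""
    return now - last_sync > sync_interval * missed_cycles
```

With a three-minute interval, a last sync five minutes ago is still healthy, while one ten minutes ago would trigger the alert and, if needed, the manual fallback.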
Should I use a monorepo or multiple repos for IaC?
Both work, but the choice affects your workflow. A monorepo simplifies cross-resource dependencies but can lead to larger blast radius (one change affects many resources). Multiple repos reduce blast radius but require more coordination. For pull-based workflows, multiple repos are easier to manage because each repo maps to a single application or environment.
As a next step, audit your current IaC workflow: map out the steps from code change to infrastructure update, and identify where delays, errors, or manual interventions occur. Then pick one pattern from this guide and run a trial on a non-critical environment. Measure deployment frequency and failure rate before and after. That data will tell you whether the pattern fits your team's context.
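The two measurements suggested above can be computed from a simple deployment log. This is a minimal sketch: the Deployment record and both function names are hypothetical, and the log format is an assumption about what your pipeline can export.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Deployment:
    day: date
    succeeded: bool


def deployments_per_week(deployments: List[Deployment], period_days: int) -> float:
    """Average number of deployments per seven days over the observed period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deployments) * 7 / period_days


def failure_rate(deployments: List[Deployment]) -> float:
    """Fraction of deployments that failed; 0.0 when there were none."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if not d.succeeded)
    return failed / len(deployments)
```

Computing both numbers for the same length of time before and after the trial keeps the comparison fair.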