Why Ansible still matters in a Kubernetes world
Every time I mention Ansible, someone brings up Kubernetes. They're solving different problems. Kubernetes manages containerised workloads. Ansible manages the servers those workloads run on — patching OS packages, configuring system settings, managing users, installing agents, handling the 100+ on-prem servers that aren't containerised and never will be. Here's how to use Ansible properly for server automation at scale.
Project structure that doesn't fall apart
ansible/
  inventories/
    production/
      hosts.yaml          # production host groups
      group_vars/
        all.yaml          # vars for all hosts
        webservers.yaml
    staging/
      hosts.yaml
  roles/
    common/               # applied to every server
      tasks/main.yaml
      handlers/main.yaml
      defaults/main.yaml
    monitoring-agent/     # install node exporter
    hardening/            # CIS benchmark baseline
  playbooks/
    site.yaml             # master playbook
    patch.yaml            # OS patching only
    deploy-agent.yaml     # agent rollout
  ansible.cfg
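As a rough sketch of how these pieces connect, playbooks/site.yaml could apply the three roles from the layout above. Which role goes to which hosts is an assumption made for illustration; the layout itself does not say.

```yaml
# playbooks/site.yaml (illustrative sketch; the role assignment is assumed)
- name: Baseline for every server
  hosts: all
  become: true
  roles:
    - common            # roles/common
    - hardening         # roles/hardening
    - monitoring-agent  # roles/monitoring-agent
```

Because the playbooks sit in a sibling directory of roles/, this sketch also assumes ansible.cfg points roles_path at the roles/ directory and that you run ansible-playbook from the ansible/ directory.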
Inventory as code
# inventories/production/hosts.yaml
all:
  children:
    webservers:
      hosts:
        web-01.prod: {ansible_host: 10.0.1.10}
        web-02.prod: {ansible_host: 10.0.1.11}
      vars:
        nginx_worker_processes: 4
    databases:
      hosts:
        db-01.prod: {ansible_host: 10.0.2.10}
    monitoring:
      hosts:
        prometheus-01.prod: {ansible_host: 10.0.3.10}
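Group variables can also live in the group_vars directory that the project layout already provides, which keeps hosts.yaml focused on hosts. A minimal sketch, assuming you move the webserver setting out of the inventory file:

```yaml
# inventories/production/group_vars/webservers.yaml
# Applies to every host in the webservers group of this inventory.
nginx_worker_processes: 4
```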
Idempotent tasks — the most important principle
Every Ansible task must be safe to run multiple times with the same result. This is idempotency. It's what makes Ansible useful for automation rather than just scripting.
# Bad: not idempotent, appends a duplicate line on every run
- name: Add line to config
  shell: echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf

# Good: idempotent, only changes if needed
- name: Enable IP forwarding
  ansible.posix.sysctl:
    name: net.ipv4.ip_forward
    value: '1'
    state: present
    reload: yes
Treat the shell and command modules as a last resort. Purpose-built modules such as ansible.builtin.package, ansible.builtin.template, and ansible.posix.sysctl (which ships in the ansible.posix collection rather than ansible-core) are idempotent by design.
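When no module fits and a command is unavoidable, you can still tell Ansible when the task should run and when it counts as a change. The sketch below uses the creates argument and the changed_when keyword; the installer path and marker file are hypothetical, and the first task only works if the installer really writes that marker.

```yaml
# Hypothetical paths and commands, for illustration only
- name: Run one-time installer only if its marker file is missing
  ansible.builtin.command:
    cmd: /opt/vendor/install.sh --silent
    creates: /opt/vendor/.installed   # task is skipped when this file exists

- name: Read current kernel version (read-only, never reports "changed")
  ansible.builtin.command:
    cmd: uname -r
  register: kernel_version
  changed_when: false
```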
Automated patching playbook
# playbooks/patch.yaml
- name: Rolling OS patch
  hosts: all
  serial: "20%"   # patch 20% of hosts at a time
  become: yes
  tasks:
    - name: Update all packages
      ansible.builtin.package:
        name: "*"
        state: latest
        update_cache: yes

    # /var/run/reboot-required is written by Debian/Ubuntu, not RHEL
    - name: Check if reboot required (Debian/Ubuntu)
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot if required
      ansible.builtin.reboot:
        reboot_timeout: 300
      when: reboot_required.stat.exists
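The serial: "20%" setting is critical for production. Without it, Ansible runs each task across every targeted host before moving on to the next task (throttled only by the forks setting), so the whole fleet is mid-patch at the same time. With it, Ansible finishes the entire play on 20% of hosts before starting the next batch, so if something goes wrong, the other 80% of your fleet is still running.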
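Two refinements are worth sketching here. By default a failed host is dropped from the play while the remaining batches continue, so max_fail_percentage can be used to stop the rollout early. Also, the /var/run/reboot-required marker is a Debian/Ubuntu convention; on RHEL-family hosts one option is needs-restarting -r, which exits with 1 when a reboot is needed. The variant below assumes that utility is installed (it usually comes from the yum-utils or dnf-utils package) and is a sketch, not a drop-in replacement.

```yaml
# Sketch only: a RHEL-family variant of playbooks/patch.yaml
- name: Rolling OS patch (RHEL-family variant)
  hosts: all
  serial: "20%"
  max_fail_percentage: 0   # abort the run if any host in a batch fails
  become: true
  tasks:
    - name: Update all packages
      ansible.builtin.package:
        name: "*"
        state: latest

    - name: Check if reboot required (RHEL family)
      ansible.builtin.command:
        cmd: needs-restarting -r
      register: needs_restart
      check_mode: false        # read-only, so safe to run even with --check
      changed_when: false      # a check never counts as a change
      failed_when: needs_restart.rc not in [0, 1]

    - name: Reboot if required
      ansible.builtin.reboot:
        reboot_timeout: 300
      when: needs_restart.rc == 1
```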
Ansible Vault for secrets
# Encrypt a secrets file
ansible-vault encrypt group_vars/all/secrets.yaml
# Edit encrypted file
ansible-vault edit group_vars/all/secrets.yaml
# Run playbook, reading the vault password from a file (in CI, write that file from a pipeline secret)
ansible-playbook site.yaml --vault-password-file ~/.vault_pass
Use ansible-vault for anything sensitive: database passwords, API keys, certificates. The encrypted file is safe to commit; the vault password is not.
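One common convention, sketched here with hypothetical variable names, is to keep a plain-text vars file that points at vault-prefixed variables. Reviewers can then see which variables exist without being able to read their values.

```yaml
# group_vars/all/vars.yaml (plain text, readable in code review)
db_password: "{{ vault_db_password }}"

# group_vars/all/secrets.yaml (shown before encryption; run
# ansible-vault encrypt on this file before committing it)
vault_db_password: "example-only-not-a-real-secret"
```

Tasks then reference db_password as usual, and Ansible decrypts the vaulted file at runtime when given the vault password.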
Running Ansible from CI/CD
# GitHub Actions example
- name: Run Ansible patching
  uses: dawidd6/action-ansible-playbook@v2
  with:
    playbook: playbooks/patch.yaml
    directory: ansible/
    key: ${{ secrets.SSH_PRIVATE_KEY }}
    vault_password: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
    options: |
      --inventory inventories/production/
      --limit webservers
Running Ansible from CI gives you an audit trail — every playbook run is a pipeline execution with logs, triggered by a commit, reviewable by the team. This is how you get infrastructure changes out of individuals' laptops and into a controlled, repeatable process.
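To make changes reviewable before they touch a server, the same step can run as a dry run on pull requests using Ansible's --check and --diff flags. The workflow below is a sketch: the file name, trigger paths, and job name are assumptions. It also assumes the runner can reach your hosts over SSH, which for on-prem servers usually means a self-hosted runner rather than ubuntu-latest.

```yaml
# .github/workflows/ansible-check.yaml (hypothetical file name)
name: Ansible dry run
on:
  pull_request:
    paths:
      - "ansible/**"

jobs:
  check:
    runs-on: ubuntu-latest   # assumption: swap for a self-hosted runner if needed
    steps:
      - uses: actions/checkout@v4

      - name: Dry-run patch playbook against webservers
        uses: dawidd6/action-ansible-playbook@v2
        with:
          playbook: playbooks/patch.yaml
          directory: ansible/
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          vault_password: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
          options: |
            --inventory inventories/production/
            --limit webservers
            --check
            --diff
```

Check mode reports what would change without applying it, so the pull request shows the planned impact alongside the code review.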