Ansible · IaC · Automation · Jul 2024 · 9 min read

Ansible for Server Automation at Scale: A Practical Guide

How to use Ansible to manage 100+ servers reliably — project structure, idempotent tasks, automated patching with serial execution, and running it all from CI/CD.

Why Ansible still matters in a Kubernetes world

Every time I mention Ansible, someone brings up Kubernetes. They're solving different problems. Kubernetes manages containerised workloads. Ansible manages the servers those workloads run on — patching OS packages, configuring system settings, managing users, installing agents, handling the 100+ on-prem servers that aren't containerised and never will be. Here's how to use Ansible properly for server automation at scale.

Project structure that doesn't fall apart

ansible/
  inventories/
    production/
      hosts.yaml           # production host groups
      group_vars/
        all.yaml           # vars for all hosts
        webservers.yaml
    staging/
      hosts.yaml
  roles/
    common/              # applied to every server
      tasks/main.yaml
      handlers/main.yaml
      defaults/main.yaml
    monitoring-agent/    # install node exporter
    hardening/           # CIS benchmark baseline
  playbooks/
    site.yaml            # master playbook
    patch.yaml           # OS patching only
    deploy-agent.yaml    # agent rollout
  ansible.cfg
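
To show how the pieces of this layout connect, here is a minimal sketch of what playbooks/site.yaml might contain. The choice to apply all three roles to every host is an assumption for illustration. It also assumes ansible.cfg sets roles_path to the roles/ directory, since the playbooks live in a sibling directory rather than next to the roles.

```yaml
# playbooks/site.yaml (sketch; the role-to-host mapping is an assumption)
- name: Baseline configuration for every server
  hosts: all
  become: true
  roles:
    - common             # base packages, users, system settings
    - hardening          # CIS benchmark baseline
    - monitoring-agent   # node exporter install
```

Roles run in the order listed, so common finishes before hardening starts on each host.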

Inventory as code

# inventories/production/hosts.yaml
all:
  children:
    webservers:
      hosts:
        web-01.prod: {ansible_host: 10.0.1.10}
        web-02.prod: {ansible_host: 10.0.1.11}
      vars:
        nginx_worker_processes: 4
    databases:
      hosts:
        db-01.prod: {ansible_host: 10.0.2.10}
    monitoring:
      hosts:
        prometheus-01.prod: {ansible_host: 10.0.3.10}
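
Group variables can also live in the group_vars/ files from the project structure instead of inline in the inventory. As a small sketch, the webservers setting above could move to its own file:

```yaml
# inventories/production/group_vars/webservers.yaml
# Ansible loads this file for every host in the webservers group
# of this inventory, so the inline "vars:" block becomes optional.
nginx_worker_processes: 4
```

Keeping host lists in hosts.yaml and settings in group_vars/ tends to make diffs easier to review as the fleet grows.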

Idempotent tasks — the most important principle

Every Ansible task must be safe to run multiple times with the same result. This is idempotency. It's what makes Ansible useful for automation rather than just scripting.

# Bad: not idempotent, appends a duplicate line on every run
- name: Add line to config
  shell: echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf

# Good — idempotent, only changes if needed
- name: Enable IP forwarding
  ansible.posix.sysctl:
    name: net.ipv4.ip_forward
    value: '1'
    state: present
    reload: yes

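Treat the shell and command modules as a last resort. Purpose-built modules such as ansible.builtin.package, ansible.builtin.template, and ansible.posix.sysctl (from the ansible.posix collection) check the current state first and only make a change when it differs from what you asked for.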
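
When no module fits and a command is unavoidable, the task can still be made safe to re-run. The sketch below uses the command module's creates argument and the changed_when keyword. The script path and marker file are hypothetical placeholders.

```yaml
# Hypothetical example: /opt/vendor/setup.sh and its marker file are placeholders.
- name: Run one-time vendor setup
  ansible.builtin.command:
    cmd: /opt/vendor/setup.sh --install
    creates: /opt/vendor/.installed   # task is skipped once this file exists

# Read-only command: gather a value without ever reporting "changed".
- name: Read the running kernel version
  ansible.builtin.command:
    cmd: uname -r
  register: kernel_version
  changed_when: false
```

The first task only helps if the script really does create the marker file. Otherwise it will run on every play.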

Automated patching playbook

# playbooks/patch.yaml
- name: Rolling OS patch
  hosts: all
  serial: "20%"          # patch 20% of hosts at a time
  become: yes

  tasks:
    - name: Update all packages
      ansible.builtin.package:
        name: "*"
        state: latest
        update_cache: yes

    - name: Check if reboot required (Debian/Ubuntu flag file)
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot if required
      ansible.builtin.reboot:
        reboot_timeout: 300
      when: reboot_required.stat.exists

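The serial: "20%" setting is critical for production. Without it, Ansible runs each task against every host in the play before moving to the next task, so a bad update reaches the whole fleet in one pass. With it, Ansible finishes the play on 20% of hosts before starting the next batch, so if a batch fails and the run stops, the remaining 80% of your fleet is untouched and still running.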
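
Batching limits the blast radius only if a failed batch actually stops the run. One way to get that, sketched here as a variant of the play above, is the max_fail_percentage keyword plus a simple reachability check. The SSH port and timeout are assumptions.

```yaml
- name: Rolling OS patch with a failure threshold
  hosts: all
  serial: "20%"
  max_fail_percentage: 0   # any failed host in a batch aborts the remaining batches
  become: true

  tasks:
    - name: Update all packages
      ansible.builtin.package:
        name: "*"
        state: latest

    # Assumption: SSH listens on port 22 and 120 seconds is long enough.
    - name: Confirm the host still accepts SSH connections
      ansible.builtin.wait_for:
        host: "{{ ansible_host | default(inventory_hostname) }}"
        port: 22
        timeout: 120
      delegate_to: localhost
      become: false
```

A real health check would usually probe the service the host provides rather than just SSH.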

Ansible Vault for secrets

# Encrypt a secrets file
ansible-vault encrypt group_vars/all/secrets.yaml

# Edit encrypted file
ansible-vault edit group_vars/all/secrets.yaml

# Run playbook with vault password from CI secret
ansible-playbook site.yaml --vault-password-file ~/.vault_pass

Never commit unencrypted secrets to your Ansible repo. Use ansible-vault for anything sensitive: database passwords, API keys, certificates. The encrypted file is safe to commit; the vault password is not.
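
A common convention, sketched below with hypothetical variable names, is to keep a plain-text variable that points at a vault_-prefixed one. The name stays searchable with grep while the value stays encrypted.

```yaml
# group_vars/all/vars.yaml (plain text; variable names are hypothetical)
db_password: "{{ vault_db_password }}"

# group_vars/all/secrets.yaml (encrypt with ansible-vault before committing)
vault_db_password: "replace-me"
```

Roles and templates then reference db_password and never need to know which file holds the real value.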

Running Ansible from CI/CD

# GitHub Actions example
- name: Run Ansible patching
  uses: dawidd6/action-ansible-playbook@v2
  with:
    playbook: playbooks/patch.yaml
    directory: ansible/
    key: ${{ secrets.SSH_PRIVATE_KEY }}
    vault_password: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
    options: |
      --inventory inventories/production/
      --limit webservers

Running Ansible from CI gives you an audit trail — every playbook run is a pipeline execution with logs, triggered by a commit, reviewable by the team. This is how you get infrastructure changes out of individuals' laptops and into a controlled, repeatable process.
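
For context, the step above has to sit inside a complete workflow. This is a minimal sketch. The file name, triggers, branch name, and runner label are assumptions, and it presumes Ansible is available on the runner.

```yaml
# .github/workflows/patch.yaml (hypothetical file name)
name: Patch webservers

on:
  workflow_dispatch:        # manual runs from the Actions tab
  push:
    branches: [main]        # assumption: main is the protected branch
    paths: ["ansible/**"]   # only run when Ansible code changes

jobs:
  patch:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Run Ansible patching
        uses: dawidd6/action-ansible-playbook@v2
        with:
          playbook: playbooks/patch.yaml
          directory: ansible/
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          vault_password: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
          options: |
            --inventory inventories/production/
            --limit webservers
```

A hosted runner can only reach on-prem servers if the network allows it. In many environments this job would need a self-hosted runner inside the network instead.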
