Project Specific Ansible Overrides Via Preprocessor

September 19, 2021

Intro

Ansible is a declarative DevOps tool for configuration management and for running tasks across many machines. I’ve used it for things as varied as developer machine setup and as a task runner for CloudFormation automation, and I’ve thought about using it as an abstraction layer over plain Dockerfiles.

Ansible breaks configuration up into “playbooks”, each of which has many “tasks” inside it. In these tasks you can call shell scripts, CloudFormation templates, and tons of other built-in or third-party actions. You can likewise look up variable definitions from files or databases, or write your own lookup (as I did to look up exported CloudFormation variables!).

Ansible reads a playbook and executes every task it finds, in order, until the end. The tasks in a playbook are mostly static: there is include_tasks, import_playbook, and a task tagging feature, but it’s a simple paradigm.

Today I had a radical thought: “What if I could run a pre-processor over an Ansible playbook, like C’s pre-processor?” I could externally substitute variables, turn whole sections of code on or off, or build a collection of tasks outside of the playbook paradigm.

Pinky, are you pondering what I’m pondering?

“Uhhh, I think so, Brain, but where are we going to find a YAML pre-processor at this hour?”

Astute readers will notice that Ansible already provides a template processing language on top of YAML, via Jinja. True! Which is exactly why we can’t use Python and Jinja as our pre-processor: its template syntax would collide with the Jinja that Ansible evaluates itself.

The answer: GNU m4

M4 is a reasonably good generic text preprocessor. I think I last used it to generate Apiary API documentation for something. (I took notes on M4 too.)

Too Long, Didn’t Read, Show Me The Code

GitHub repo: rwilcox/ansible_preprocessor_treatise

Ok… walk me through this

If we write a playbook like so:

- name: playbook
  hosts: localhost
  tasks:
    - name: simple_variable_preprocess
      debug:
       msg: M4_MESSAGE

Then run it through m4 like so:

$ m4 --define=M4_MESSAGE="dynamic" playbooks/main.yml.m4 > playbooks/main.yml

we get:

- name: playbook
  hosts: localhost
  tasks:
    - name: simple_variable_preprocess
      debug:
       msg: dynamic

Which we then run through ansible-playbook, and it prints “dynamic” for us.

Ok, the simplest use case done.

Easily reusing task definitions

Preprocessing with m4 means we can break the playbook -> task coupling, and store tasks outside of playbooks.

$ ls
playbooks/ task_inventory/

m4 lets you include files in other files:

- name: playbook
  hosts: localhost
  tasks:
    - name: simple_variable_preprocess
      debug:
       msg: "hi"

include(`simple_task_include.yml')

task_inventory/simple_task_include.yml looks like so (note all the padding whitespace! It has to line the task up under tasks: once it is pasted in):


    - name: test
      debug:
        msg: "I'm from this other file!"

And is pulled together with the following m4 command

$ m4 -I task_inventory playbooks/main.yml.m4 > playbooks/main.yml

Looking like:

- name: playbook
  hosts: localhost
  tasks:
    - name: simple_variable_preprocess
      debug:
       msg: "hi"


    - name: test
      debug:
        msg: "I'm from this other file!"

Project Specific Overrides In The Middle Of A Playbook

The real science is here. I wondered if using Ansible playbooks as a CI/CD runner might be an interesting way to organize a pipeline, and make more of it locally runnable.

A playbook executes tasks from start to finish, failing the run if a task fails. Ansible is still infrastructure as code, so it fulfills that requirement too.

If your herd is either relatively stable or has relatively little technology sprawl, you might have a very unified pipeline with only a few microservices wanting to do something custom.

Using Ansible in the pipeline may mean having your CI/CD system clone the current Ansible CI/CD playbook you’ve put together (perhaps to .cicd_ansible), then run ansible-playbook .cicd_ansible/playbooks/main.yml. This main playbook has tasks for compiling, running unit tests, generating coverage reports, uploading artifacts, etc.

But what if some project wants to use SBT instead of the herd standard Maven?! How does this one project provide its own implementation of Ansible build tasks?

Preprocessing again to the rescue.

Our Ansible playbook wants to use standard defaults unless there’s a project specific override file specified.

- name: playbook
  hosts: localhost
  tasks:

    - name: task_preprocess
      debug:
        msg: yo

esyscmd(`.cicd_ansible/bin/cator.sh project_specific_overrides/override.yml .cicd_ansible/task_inventory/default_task_include.yml')

Now, the if-statement abilities of m4 are not great. What I wanted to write was an “if file X exists use that, else use this”. I couldn’t get that working in m4, so I wrote a shell script: a “cat or” shell script.

Contents of .cicd_ansible/bin/cator.sh


#!/bin/bash

if test -f "$1"; then
    cat "$1"
else
    cat "$2"
fi
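
To see the fallback behave both ways, here is a small sketch with the same logic written as a shell function so it is self-contained. The mvn and sbt task snippets are hypothetical stand-ins for whatever your default and override tasks really are.

```shell
#!/bin/sh
set -eu

# Same logic as cator.sh, as a function: print $1 if it exists, otherwise $2.
cator() {
    if test -f "$1"; then
        cat "$1"
    else
        cat "$2"
    fi
}

# Scratch layout (illustrative only).
workdir="$(mktemp -d)"
mkdir -p "$workdir/project_specific_overrides" "$workdir/task_inventory"

# Hypothetical herd-standard default task.
cat > "$workdir/task_inventory/default_task_include.yml" <<'EOF'
    - name: build
      command: mvn package
EOF

# Case 1: no override file exists, so the default is emitted.
without_override="$(cator "$workdir/project_specific_overrides/override.yml" \
                          "$workdir/task_inventory/default_task_include.yml")"

# Case 2: the project drops in its own task snippet (hypothetical SBT build).
cat > "$workdir/project_specific_overrides/override.yml" <<'EOF'
    - name: build
      command: sbt package
EOF
with_override="$(cator "$workdir/project_specific_overrides/override.yml" \
                       "$workdir/task_inventory/default_task_include.yml")"

printf 'without override:\n%s\n\nwith override:\n%s\n' "$without_override" "$with_override"
```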

We use m4’s esyscmd macro to splice the stdout of that process into the rendered template. Now a default build process can happen, or a project can override it with an Ansible task snippet at an agreed-upon location (project_specific_overrides/).

Conclusion

It’s fun to make things work in ways that they’re not actually supposed to! Even as just experiments!

It would be interesting to see how this works out in practice! I also wonder if this preprocessing could be a way to avoid some of the deeply nested “I’m in a Python context inside a Jinja context inside a YAML string” inception rabbit hole I sometimes felt when writing Ansible code. Or maybe not!

Anyway, too much fun!