How to write quality code in Ansible?

2020-08-15|By Stanisław Szymański|Code

More play, less book. Ask about the DevOps way of automation and Ansible will pop up, sure thing. It has its bright sides and ones that aren't as bright, but that's not what we're touching on today. In this article, I will try to explain the principles of Ansible role building that you can easily put into practice to build elegant, flexible code that is easy to use and troubleshoot. A real joy to maintain.

I do also want this article to be understandable and simple - so that you won't need 10 years of experience, nor a Senior job title, to reap the benefits and become better at coding. When I started, such articles helped me very much, and now I am glad to return the favor and help others.

Granulate - but do not overdo it

The first thing you should keep an eye out for is granularity. Do not copy the same task into every playbook that needs it. Create a "boilerplate" role to fulfill the task instead, and make the main role depend on it:

A boilerplate of yum-package install

- name: 'put {{ yum__pkgs | to_json }} into the state {{ yum__state }}'
  yum:
    name: '{{ yum__pkgs }}'
    state: '{{ yum__state }}'
    update_cache: '{{ yum__update_cache }}'

A boilerplate of the main role's meta file

dependencies:
  - role: 'centsible.dep-yum'
    yum__pkgs: '{{ elastic__pkgs }}'
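
For the dependency above to work while passing only yum__pkgs, the boilerplate role has to supply the remaining variables itself. A minimal sketch of what its defaults file might look like; the file path and the chosen values are assumptions, not taken from the original role:

```yaml
# roles/centsible.dep-yum/defaults/main.yml (hypothetical contents)
yum__pkgs: []                # list of package names, supplied by the depending role
yum__state: 'present'        # any state accepted by the yum module
yum__update_cache: false     # set to true to refresh the cache before installing
```

With defaults like these, a depending role only overrides what it actually cares about.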

However, not every role you need for the main goal should be set as a dependency. There's another option, prerequisites: roles listed in the playbook instead of in the main role's meta file. What's the difference, and when to use either one?

  • dependency -- means that the role is crucial to the proper execution of the role that depends on it. You absolutely must run it in order to accomplish the bigger goal of the main role.
  • prerequisite -- comes into play when you want something done that is not really tied to the main role; basically, when the task you'd want to accomplish is out of the scope of the main role.

Properly split Elasticsearch setup roles

- hosts: es-masters
  roles:
    - role: 'centsible.elastic-stack/centsible.es-elastic-repository'
    - role: 'centsible.elastic-stack/centsible.es-elastic'

- hosts: es-masters
  roles:
    - role: 'centsible.elastic-stack/centsible.es-filebeat'
    - role: 'centsible.elastic-stack/centsible.es-auditbeat'
    - role: 'centsible.elastic-stack/centsible.es-packetbeat'

Think of roles as large, heavy shopping bags filled with a lot of different food products. Carrying them all at once is impossible, or at least very uncomfortable and cumbersome. So, you split them - the one with your lunch goes first, as lunchtime is near and you're starving, so it's crucial for you. Then, you can grab the rest - they're not as important, yet you still need them for dinner and tomorrow's menu.

This example is also great to explain why over-granulation should definitely be avoided. Imagine that you split one 20 kg bag into 100 separate small bags. Things got heavier, as bags have some weight too. Comfort? None. And you still have a serious problem on your hands: because they're even harder to carry, you cannot take all 100 bags in one go. If you'd split this into two, maybe three or four bags - sure, that would help. The lesson is: you should split things to make work easier, not harder. Don't split everything you touch, or you'll end up with a mini-zoo of partially useful, partially useless roles. A hell-ish mini-zoo.

Roles should have one main responsibility. Do one thing, and do it right. Depend on dependencies, give roles their roles.

Easy to change, perfectly flexible - this is the way of simple Ansible.
-- Me, 2021

Ansible loves Jinja2

- it is a fact. Besides variable templating, which you have probably already heard of, there's much, much more - filtering variable values, basic testing, etc. Check out this example:

Useful jinja2-filters

# Value manipulation
{{ eye_color | mandatory }} - makes setting 'eye_color' required; templating fails if it is undefined
{{ variable | default(256) }} - yields '256' if 'variable' is not set
{{ comment_value | quote }} - quotes 'comment_value' so that it is safe to pass to a shell
{{ (color == "Blue") | ternary('Yes', 'No') }} - 'Yes' if color is 'Blue', 'No' if it isn't
{{ is_blue | ternary('Yes', 'No', 'Cannot determine') }} - same for a boolean 'is_blue' (illustrative name); the optional third value is used when the input is null
{{ the_address | ipv4 }} - returns 'the_address' if it is a valid IPv4 address, false otherwise
{{ '127.0.0.1/24' | ipaddr('address') }} - will provide '127.0.0.1'

To put it simply - Jinja makes playbooks easily reusable and very flexible, when done right.

The green reed which bends in the wind is stronger than the mighty oak which breaks in a storm.
-- Confucius, a reaaallly long time ago

Keeping things easy to bend makes them harder to break. Ansible is no exception. Templated mandatory values won't get lost, input tested via templated checks won't spoil anything… And there's also the comfort of getting some values without huge code blocks or workarounds. It's definitely worth spending a little while learning how Jinja works.
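
One way to put "input tested via templated checks" into practice is an assert task at the very top of a role. This is only a sketch: elastic__http_port is a hypothetical variable introduced for illustration, and the conditions are examples rather than a complete validation.

```yaml
# tasks/main.yml (excerpt) -- fail early, before anything is changed on the host
- name: 'validate role input'
  assert:
    that:
      - elastic__pkgs is defined
      - elastic__pkgs | length > 0
      - elastic__http_port | int > 0        # hypothetical variable
      - elastic__http_port | int < 65536
    fail_msg: 'elastic__pkgs must be a non-empty list and elastic__http_port a valid port number'
```

A failed assertion stops the role with a readable message instead of a cryptic error halfway through the run.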

Keep secrets… secret

Sometimes, you might run into values you would not want to disclose - API keys, SSH keys, usernames, passwords, authentication keys; maybe even private endpoint addresses… Stuff that should remain hidden, or else it would pose a threat to your infrastructure's security, for example. There are two ways to sort it out in Ansible:

  • Dead men tell no tales (If by a twist of fate you happen to be Jack Sparrow. Jack, why is the rum gone, have you determined that yet? Just curious.)
  • Use ansible-vault (If you're not a Caribbean Pirate. Let's face it, straight up, you probably aren't.)

I've never captained a ship, except a small training yacht, and a large cargo ship if you count Docker with a lot of pods as a container sea vessel - so let's focus on the vault. The easiest method you could leverage is to encrypt whole files, but I'd propose a slightly harder, yet much more… polite, elegant way to handle the situation: store secret variable values in a separate file, reference them using variable substitution, and let Ansible take care of the rest:

Example of secret file contents

# Dummy value, put into a yaml file, e.g. secrets.yml
vault__shell_root_password: asd12345

Example of secret file contents encrypted by ansible-vault

$ANSIBLE_VAULT;1.1;AES256
33363634633461383735333035303166663332633433306338323332323461343936363864333665
6431353634613433373666373266386231303439376433650a633964323135353363383034303666
63646165373833373839373833663466356562616439356132316533653031353836373338666562
6138663961393437610a616566616436643932643534303263626535383464353063383038373966
31386236653233636466623533363063373236313530323764626265356231316232

There are a few ways to attach the secrets to your playbook - we'll go with the simplest one this time: attaching the secrets as a vars file for all hosts. That will work, and it's common - but, as your project grows, you'll surely need a better way to handle this. Does every host and role need the forbidden knowledge of secrets? Not really. You'll find something more appropriate for larger or more formal environments in the docs, for sure - the keywords are vault IDs and vault labels.

Example of secrets file inclusion

- hosts: '< all your target hosts >'
  vars_files:
    - '< path to secrets file, can be templated too! >/secrets.yml'
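
The "variable substitution" part could look roughly like this: a plain-text variable points at the vaulted one, and tasks only ever use the plain-text name. The file path, the shell__root_password name and the task itself are assumptions made for illustration.

```yaml
---
# group_vars/all/main.yml (hypothetical path) -- readable, easy to grep
shell__root_password: '{{ vault__shell_root_password }}'

---
# tasks/main.yml (excerpt) -- consumes the value without ever printing it
- name: 'set the root password'
  user:
    name: 'root'
    password: "{{ shell__root_password | password_hash('sha512') }}"
  no_log: true
```

Because no salt is passed to password_hash here, a new hash is generated on every run, so the task will likely report a change each time; pass a fixed salt if that bothers you.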

That's it. Seriously, it's that easy. One discomfort persists though - a blind guess tells me you don't like to provide the password each time you deploy anything. Lazy lazy, naughty naughty. But it's understandable. To fix that, create a file called .vault_pass anywhere you want, put the vault password in it, then tell Ansible where it is, and it will automatically read it when needed:

Vault password auto-gather setting from Ansible config

# Put this into ansible.cfg - the Ansible config file
[defaults]
vault_password_file = < path to the .vault_pass file >

Make sure it is well guarded, and most importantly - if you are using git, add it to .gitignore or encrypt it with git-secret, otherwise you won't protect anything! Treat the password like a password - tell it only to those who need to know it, keep it safe and secure. Anyway, it's much easier than protecting multiple variable values, isn't it?

Do not use vars by default(s)

The basic structure of an Ansible role contains two directories with variable sources:

The basic structure of Ansible role

├── defaults
│   └── main.yml
└── vars
    └── main.yml

  • vars/main.yml - that's where you should set variable values that you don't want the playbook operator to tamper with. Putting them here signals that they are the more "constant" ones, probably even essential to the role's operation, and shouldn't be changed by a "player" who wants to be sure that everything works as intended.
  • defaults/main.yml - those values are used only when there's no other source of the variable's value - if you simply didn't set them anywhere else. They are pretty much intended to be set or overwritten by the "player".

And why would that be, what's the deal? The answer lies in the so-called "variable precedence" - basically, it means that there is an order in which variable values override one another, depending on where they were set. Role defaults have the lowest precedence, so nearly anything overwrites them; role vars sit much higher and have to be overridden by deliberate, intentional action, such as extra vars passed on the command line. Keep an eye on this - maintenance of roles with proper variable segregation is much, much easier.
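
A small sketch of how this plays out, written as three files in one listing. The role name and the timezone__pkg variable are hypothetical; the behaviour shown follows from role vars ranking above play vars, and play vars ranking above role defaults.

```yaml
---
# roles/timezone/defaults/main.yml -- meant to be overridden
timezone__zone: 'UTC'

---
# roles/timezone/vars/main.yml -- internal to the role
timezone__pkg: 'tzdata'              # hypothetical variable

---
# playbook.yml
- hosts: all
  vars:
    timezone__zone: 'Europe/Warsaw'  # wins over the role default
    timezone__pkg: 'something-else'  # ignored, role vars beat play vars
  roles:
    - role: 'timezone'
```

To really change timezone__pkg, the operator would have to reach for something stronger, for example extra vars on the command line.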

Ghost in the shell

Ah, yes. The shell module. It is awesome - sometimes even mandatory (still awesome!) - but, when you can, you should use the appropriate dedicated Ansible modules instead, as they are much more predictable - they were built with automation in mind. Using shell may yield varying results (it just shoots whatever you put there into /bin/sh - this may even be destructive, watch out!) and is often impractical, as it's easier to just set a toggle in a module than to play with pipes, echo stuff, duct-tape output capture, escaping characters…

Setting host timezone via shell module

- name: 'set timezone to {{ your_timezone }}'
  shell: 'timedatectl set-timezone {{ your_timezone }}'

Setting host timezone via dedicated timezone module

- name: 'set timezone to -- {{ timezone__zone }}'
  timezone:
    name: '{{ timezone__zone }}'

Want the nail in a perfect spot, just to hang your painting? Use a proper hammer. A sledgehammer won't do you any good. Moreover, you'll have to explain why there's a giant hole in the wall. Hiding it under the painting just delays the fuss.
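
When there really is no dedicated module, the raw command can at least be made better behaved. The sketch below reuses the timezone example purely for illustration (the timezone module remains the better choice) and assumes a systemd recent enough to support `timedatectl show`; the timezone__current variable name is made up.

```yaml
- name: 'read the current timezone'
  command: 'timedatectl show --property=Timezone --value'
  register: timezone__current
  changed_when: false   # a read-only query never changes the host

- name: 'set timezone to {{ your_timezone }}'
  command: 'timedatectl set-timezone {{ your_timezone }}'
  # run only when the host is not already in the desired state
  when: timezone__current.stdout | trim != your_timezone
```

Using command instead of shell avoids going through /bin/sh at all, and the guard keeps the second task from reporting a change on every run.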

Take "OK's" with a grain of salt

Everybody lies.
-- Gregory House, House M.D.

Trust is expensive - mostly because it's dangerous. It's much safer to make sure that when you get an "OK", it is a real, true "OK" - systemd does not know everything. A simple set of smoke tests can be the guarantee you need. Properly implemented, they will of course make the role execution longer, but they will also assure you that the role is indeed reliable. Ansible handlers fit perfectly here - define the tests as handlers, add a notify statement to the appropriate task, and voilà. Keep in mind that handlers run in the order in which they are defined in the handlers file, not in the order of the notify list.

A set of basic smoke-tests

# Starts the service if it is stopped; a failure is recorded instead of
# aborting the play, so that the next handler can report it
- name: '(smoke-test) -- service is running'
  systemd:
    name: '{{ haproxy__service_name }}'
    state: 'started'
  register: haproxy__status
  ignore_errors: true


# The systemd module returns no stdout/stderr, so its error message is shown
- name: '(smoke-test) -- report fail'
  fail:
    msg: |
      Service {{ haproxy__service_name }} is not running.
      Module output:
      {{ haproxy__status.msg | default('no message returned') }}
  when: haproxy__status is failed


# The checks below only report problems, they do not stop the play
- name: '(smoke-test) -- frontend service is listening'
  wait_for:
    port: '{{ haproxy__frontend_port }}'
    delay: 3
    timeout: 5
  ignore_errors: true


- name: '(smoke-test) -- backend service is listening'
  wait_for:
    host: '{{ item.address }}'
    port: '{{ item.port }}'
    delay: 3
    timeout: 5
  loop: '{{ haproxy__backend_servers }}'
  ignore_errors: true


# uri expects status 200 by default
- name: '(smoke-test) -- frontend service responds with 200 OK'
  uri:
    url: 'https://sysdogs.com:{{ haproxy__frontend_port }}'
  ignore_errors: true

Example of smoke tests usage

- name: 'deploy configuration -- {{ haproxy__conf_dest }}'
  template:
    src: '{{ haproxy__conf_src }}'
    dest: '{{ haproxy__conf_dest }}'
    validate: '{{ haproxy__conf_validate_cmd }}'
  notify:
    - 'restart -- {{ haproxy__service_name }}'
    - '(smoke-test) -- service is running'
    - '(smoke-test) -- report fail'
    - '(smoke-test) -- frontend service is listening'
    - '(smoke-test) -- backend service is listening'
    - '(smoke-test) -- frontend service responds with 200 OK'
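
The restart handler notified first is not shown in the listing of smoke tests. A sketch of how it might be defined, plus a way to run all notified handlers right away instead of at the end of the play; file locations are assumptions.

```yaml
---
# handlers/main.yml (excerpt) -- defined above the smoke tests, so it runs first
- name: 'restart -- {{ haproxy__service_name }}'
  systemd:
    name: '{{ haproxy__service_name }}'
    state: 'restarted'

---
# tasks/main.yml (excerpt) -- placed right after the task that notifies
- name: 'run pending handlers immediately'
  meta: flush_handlers
```

Flushing early means a broken configuration is caught before the rest of the role piles more changes on top of it.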

Making sure that everything's alright is always worth the time. Remember - forewarned is forearmed. You'll often have to predict whether something will or won't happen - and it's easier to fix tiny mistakes in development than huge failures in production.

Perfection requires style selection

To avoid confusion, interpretation problems and a lot of headaches, select one code style and keep it. With a common standard, it is easy to catch typos - odd elements stand out and look conspicuous. Someone will probably one day read, maybe review the code, stumble upon the different style, and start pondering... "Is it not in the set standard because it's faulty, or maybe it's monkey-patched and should be rewritten later?" There is no guarantee that you'll be there to clear things up. Hence, if you've decided on something once, stick with it. Make your code speak for itself, so you won't have to.
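
A linter configuration is one way to write the chosen style down so that it is enforced rather than remembered. The sketch below is for yamllint; the specific limits are arbitrary examples, not a recommendation from this article.

```yaml
# .yamllint (hypothetical project settings)
extends: default

rules:
  indentation:
    spaces: 2
  line-length:
    max: 120
  truthy:
    allowed-values: ['true', 'false']
```

Whatever the rules end up being, keeping them in the repository lets every contributor check against the same standard.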

Use pre-commit, to avoid crying post-commit

I am 100% sure that everyone has their "big brains time" once in a while. You know, that little while in which you become a mastermind and crunch code like a supercomputer. When the ideas flow and quantity wins, quality usually loses. Later, when the dust settles, you've got a lot of code, made up of crystallized concepts transferred straight to reality. It's working... Great. But it's a huge mess too... Not great. That's when pre-commit saves the day. To put it briefly - people tend to forget about rules in the heat of passion. A machine won't. With pre-commit you can set a list of rules and checks that will be performed on certain actions - for instance, before a commit is created, or just before git push. If the checks fail, nothing goes further, so you won't be sending those pesky double quotes or invalid indentation anywhere. It can even correct some issues for you, so that you don't have to touch anything again. Git hooks to the rescue!
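
A minimal sketch of a configuration covering a handful of such checks, using hooks from the pre-commit-hooks repository. The pinned rev and the release branch pattern are assumptions; pin whichever release you have verified, and adjust the names to taste.

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0  # assumption, not the version used for this article
    hooks:
      - id: trailing-whitespace
        name: '(code) -- fix trailing whitespace'
      - id: end-of-file-fixer
        name: '(code) -- fix end-of-line'
      - id: check-yaml
        name: '(code) -- verify YAML syntax'
      - id: check-added-large-files
        name: '(git) -- forbid adding large files'
      - id: detect-private-key
        name: '(security) -- detect private keys'
      - id: no-commit-to-branch
        name: '(git) -- forbid committing to master, develop or release directly'
        args: ['--branch', 'master', '--branch', 'develop', '--pattern', 'release/.*']
```

The name field only changes the label printed during a run; the id decides which check is executed.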

Failed pre-commit test validation

(files) -- yaml files has to have yml extension....................(no files to check)Skipped
(pre-commit) -- validate manifest..................................(no files to check)Skipped
(git) -- forbid adding large files.....................................................Passed
(git) -- forbid using submodules.......................................................Passed
(git) -- forbid committing with merge-conflicts........................................Passed
(git) -- forbid committing to master, develop or release directly......................Passed
(file) -- check if executables have interpreter mentioned..........(no files to check)Skipped
(file) -- verify case-sensitive conflicts..............................................Passed
(file) -- look for empty symlinks..................................(no files to check)Skipped
(code) -- fix trailing whitespace......................................................Passed
(code) -- fix end-of-line..............................................................Failed
hookid: end-of-file-fixer

Files were modified by this hook. Additional output:

Fixing roles/centos/es-elastic/tasks/main.yml

(code) -- check document-string........................................................Passed
(code) -- verify JSON syntax...........................................................Passed
(code) -- verify YAML syntax...........................................................Passed
(code) -- sort and fix requirements-txt................................................Passed
(code) -- fix double-quote-string......................................................Passed
(security) -- detect AWS credentials...................................................Passed
(security) -- detect private keys......................................................Passed
(code) -- fix dependencies order.......................................................Passed
(code) -- check trailing-coma..........................................................Passed
(code) -- forbid tabs to be commited...................................................Passed
(code) -- replace tabs with spaces automatically.......................................Passed
(ansible) -- lint the code.............................................................Passed
(ansible) -- verify that vault files are really encrypted..............................Passed
(shell) -- lint the code...........................................(no files to check)Skipped
(shell) -- format the code.........................................(no files to check)Skipped
(code) -- lint the yaml files..........................................................Passed
[...]

As you can see, the test has failed - I forgot to end the tasks file of the es-elastic role with a newline. pre-commit fixed it for me, and if I run the test again, provided that there are no other issues or changes, it will pass:

Passed pre-commit test validation

(files) -- yaml files has to have yml extension....................(no files to check)Skipped
(pre-commit) -- validate manifest..................................(no files to check)Skipped
(git) -- forbid adding large files.....................................................Passed
(git) -- forbid using submodules.......................................................Passed
(git) -- forbid committing with merge-conflicts........................................Passed
(git) -- forbid committing to master, develop or release directly......................Passed
(file) -- check if executables have interpreter mentioned..........(no files to check)Skipped
(file) -- verify case-sensitive conflicts..............................................Passed
(file) -- look for empty symlinks..................................(no files to check)Skipped
(code) -- fix trailing whitespace......................................................Passed
(code) -- fix end-of-line..............................................................Passed
(code) -- check document-string........................................................Passed
(code) -- verify JSON syntax...........................................................Passed
(code) -- verify YAML syntax...........................................................Passed
(code) -- sort and fix requirements-txt................................................Passed
(code) -- fix double-quote-string......................................................Passed
(security) -- detect AWS credentials...................................................Passed
(security) -- detect private keys......................................................Passed
(code) -- fix dependencies order.......................................................Passed
(code) -- check trailing-coma..........................................................Passed
(code) -- forbid tabs to be commited...................................................Passed
(code) -- replace tabs with spaces automatically.......................................Passed
(ansible) -- lint the code.............................................................Passed
(ansible) -- verify that vault files are really encrypted..............................Passed
(shell) -- lint the code...........................................(no files to check)Skipped
(shell) -- format the code.........................................(no files to check)Skipped
(code) -- lint the yaml files..........................................................Passed
[...]

Details are important, but they consume time. Here's an option to both get a cookie and eat a cookie - no strings attached. Consider adding pre-commit to your daily code-crunching, you'll surely love it.

Focus on test, be blessed

There is no way to be 100% sure that nothing will go crazy when you do basically... well, anything. But a solid test session can provide 90% of such assurance, if not more. Ansible has its own test tool - it's called molecule. It worked pretty well for a long time - but currently its usability has somewhat degraded, due to quirks that haven't been worked out when stepping up from version 2 to 3.

Pros:

  • Somewhat simplified testing - a few things are taken care of.
  • python3 in use - up-to-date is good.
  • Pick your favorite test verifier - there are (or were) a few available.
  • Idempotence testing - you can check if you'll get the same result every time, which is something you really, really want

Cons:

  • Test execution takes time. When run on VMs provisioned with Vagrant, I mean minutes. Not seconds. Not great.
  • Some things you still have to take care of - want to do something other than running the roles? Write a playbook for that.
  • Python3 in use - some things changed, you'll have to keep that in mind.
  • Favorite test verifier means either Ansible-driven tests (if you can call such primitive assertions a test - spoiler warning, you cannot) or testinfra; goss support was cut in version 3.
  • If you decide on testinfra (I am not surprised, great choice), you may be forced to adopt bad practices. testaid is no longer maintained; its successor, takeltest, is fine, but some things may work differently.
  • docker is the default driver, even though it cannot do everything - to test some roles, you will need a full VM - which could be neatly provided by Vagrant as in molecule v2, but…
  • The Vagrant driver was separated from the molecule core and isn't available "by default" - it is available as a separate plugin instead. You'll have to tinker with it and with the main config to get it up, the docs won't help you, and as I've recently discovered, it has its own issues which may render the driver unusable.
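
To make the moving parts above more concrete, here is a sketch of a scenario config for molecule 3 that picks the docker driver and the testinfra verifier. The platform name and image are assumptions chosen for illustration.

```yaml
# molecule/default/molecule.yml
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: 'instance'        # hypothetical instance name
    image: 'centos:7'       # hypothetical image
provisioner:
  name: ansible
verifier:
  name: testinfra
```

Swapping the driver or the verifier is a matter of changing the respective name, provided the matching plugin is installed.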

Sample Ansible-based test validating package installation

package:
  package_1:
    installed: true
  package_2:
    installed: true
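
For comparison, an Ansible-driven check can also be written as an ordinary play made of tasks. This is a sketch under assumptions: the file location is hypothetical, and package_1 and package_2 are the placeholder names from the sample above.

```yaml
# molecule/default/verify.yml (hypothetical)
- name: 'Verify'
  hosts: all
  gather_facts: false
  tasks:
    - name: 'collect facts about installed packages'
      package_facts:
        manager: 'auto'

    - name: 'required packages are installed'
      assert:
        that:
          - item in ansible_facts.packages
        fail_msg: '{{ item }} is not installed'
      loop:
        - 'package_1'
        - 'package_2'
```

package_facts stores the installed packages as a dictionary keyed by package name, which is what the `in` test relies on.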

Sample Testinfra-based test validating package installation

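# -*- coding: utf-8 -*-
import testaid

# Target hosts for testinfra, resolved by testaid
testinfra_hosts = testaid.hosts()


def test_packages(host, testvars):
    # testvars exposes the Ansible variables of the role under test;
    # apt__pkgs is assumed to be a list of package names
    for pkg_name in testvars['apt__pkgs']:
        apt__pkg = host.package(pkg_name)

        assert apt__pkg.is_installed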

As you see, molecule is far from perfect, and there are a lot of difficult choices. See for yourself whether something like that suits you - it may really be worth losing some time and nerves to profit from the assurance that your roles are solid and reliable. I also have one more general Python tip, since you'll probably be dealing with Python if you want to get Molecule running. Use venv - it's much better than installing everything "as is" on top of everything else and then figuring out why something doesn't work. It also makes separating different Python versions, managing dependencies, and cleaning up later easier.

The best defense is to use common sense

The times we're living in are pretty... wild. There's a lot to be lost, a lot to be gained… I'll provide a few tips here - to keep safe, as better safe than a drawer (sorry for bad jokes, spent time learning Ansible, didn't focus too much on the comedy).

  • Sensitive data, even protected by ansible-vault, isn't safe from you and your mistakes, as you can disclose it by pure accident. Use no_log to tell Ansible that it shouldn't show the output of some tasks/commands and keep those "Peeping Toms" away.
  • Debug stuff where it's safe, validate roles on VMs or by other means of testing before you put them to use. You wouldn't want to see your empire of infrastructure crumble to dust, would you?
  • Always think before you do - ask yourself, "Is there any risk of casualties if I do it this way?"
  • When something goes wrong, don't panic - it won't help you, it actually only makes things worse. You need to stay cold as stone, quickly evaluate the facts, and take swift action.
  • Don't mindlessly copy-paste things into Ansible. Learn how a snippet works first, then evaluate whether you want to use it or not. Think of such snippets as black boxes whose contents you don't know. There might be a delicious cake in there... But there might be a hand grenade with the pin pulled, too.
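
As a sketch of the no_log tip: the task below sends a token to an API, so its parameters and result are hidden from the output. The URL and the vault__inventory_api_token variable are made up for illustration.

```yaml
- name: 'register the node in the inventory API'
  uri:
    url: 'https://inventory.example.com/api/nodes'   # hypothetical endpoint
    method: 'POST'
    headers:
      Authorization: 'Bearer {{ vault__inventory_api_token }}'
    body_format: 'json'
    body:
      name: '{{ inventory_hostname }}'
    status_code: [200, 201]
  no_log: true   # censors this task's arguments and result in the output
```

The trade-off is that a failure of such a task is harder to debug, so it pays to switch no_log off temporarily only in a safe environment.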

The end

I hope that this article will one day prove useful to you - and I'm glad I could share my knowledge. Good luck with your playbooks. Now you're past [Gathering Facts], time to [Play] - happy tinkering!

About the author

Stanisław Szymański