Multi Environment Ansible Layout – Part 3: The Reasoning
I got a response on Twitter:
Interesting. For #CentOS infra we decided to have each role being a git repo, and inventories (for ci, public, test, staging) completely seperated : https://t.co/dtUrPHrtl1. See also https://t.co/tJs8a2z7B0 using ansible-roles-ctl to updates roles (which galaxy doesn't !)
— @[email protected] - Fabian Arrotin 🪫 (@Arrfab) November 5, 2019
I wanted to talk a little bit about the decisions I made and the thinking behind them.
Inline Roles
One of the more controversial things I did in this layout was to "bake in" the Ansible roles, including them directly in the layout instead of making each role a separate Git repository.
I had initially housed each role in a separate Git repo, hosted on a private GitLab server. The thinking behind this was to grant access to small components of our infrastructure instead of granting access to the entirety of our Ansible setup.
In practice, the only people making changes to the roles were my team. The organization at large wasn’t contributing to our Ansible roles in any meaningful way.
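For context, that earlier setup pulled each role from its own repository with a `requirements.yml`, roughly like this (the server name and role names are placeholders, not our actual repos):

```yaml
# requirements.yml -- illustrative only; URLs and role names are hypothetical
- src: git+https://gitlab.example.com/ansible-roles/common.git
  version: master
  name: common

- src: git+https://gitlab.example.com/ansible-roles/web.git
  version: master
  name: web
```

The roles were then installed with something like `ansible-galaxy install -r requirements.yml -p roles/`.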
Another issue we faced was a compliance requirement: changes to code, including "infrastructure code", need to be vetted by more than one person. Our main Ansible Git repo had a policy preventing merges to the `master` branch without a code review. However, not all of the role repositories had the same policy, which allowed code changes to reach production without review. Enforcing the policy on all of the role repos would have been a bit of a pain.
The "nail-in-the-coffin" for the separate repos was an issue with our private
gitlab server which is hosting all of these Ansible repos. About two months ago
the ansible-galaxy
command was just silently failing. It would get through
downloading about 75% of the roles then just exit with no logs of the issue.
I thought I tracked this down to some sort of rate-limiting on the Gitlab server,
and it went away for about a week then it re-occurred.
I spent a good day writing a Bash script to stand in for `ansible-galaxy`, which bought us another week or so of work. When the problem recurred, we decided to put all the roles into the single repository.
I was unaware of the `ansible-roles-ctl` tool until Fabian posted about it on Twitter. Maybe I'll try it the next time I refresh this layout.
The last point with regard to Ansible role management is that all of the roles we use, save one, are "first-party" roles, meaning we created them entirely ourselves. If we were using roles written by others, we might have made a different decision.
To manage the one "third-party" role we use, we include it as a Git submodule.
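Wiring that up looks roughly like this (the repository URL and role name are placeholders, not the actual role we use):

```sh
# Add the third-party role as a submodule under roles/ (URL is hypothetical)
git submodule add https://github.com/example/ansible-role-thirdparty.git roles/thirdparty

# After a fresh clone of the main repo, pull the submodule contents
git submodule update --init
```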
Inventory for all environments in one place
Most of our infrastructure is in AWS; however, we also manage a handful of servers and other on-prem devices with this setup.
In AWS we have resources spread across five different accounts, with some environments sharing accounts. To make management easier we use region and instance filters in the `ec2.ini` configuration file.
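For example, a production environment's `ec2.ini` might constrain the dynamic inventory like this (a minimal sketch; the region and tag values are assumptions, not our actual filters):

```ini
# inv/prod/ec2.ini -- illustrative values only
[ec2]
regions = us-east-1
regions_exclude =
instance_filters = tag:environment=prod
```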
With all of the environments in one folder structure, changing environments is as easy as passing a different sub-folder to the `-i` switch:
```sh
ansible-playbook -i inv/int playbooks/web.yml   # runs against integration
ansible-playbook -i inv/prod playbooks/web.yml  # runs against production
```
Using the `environ` and `environment_long` variables mentioned previously, we can easily make tasks conditional inside our playbooks and roles.
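For example, a production-only task might be gated like this (a minimal sketch; the task and service name are hypothetical, and it assumes `environ` holds the short environment name such as `prod`):

```yaml
# Only runs in production (illustrative task, not from our playbooks)
- name: Enable the production-only monitoring agent
  service:
    name: monitoring-agent
    state: started
    enabled: yes
  when: environ == "prod"
```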
Shared `group_vars` and the silly symlinks
The symlinking in each environment's sub-folder can seem a bit convoluted, but it is designed to avoid spreading multiple copies of the same files about.
First, we only want a single copy of the dynamic inventory script `ec2.py`. Each environment gets its own copy of the configuration, but the script itself needs to stay in sync across environments.
The `host_vars` and `group_vars` folders are symlinked into each environment's sub-folder because that's one of the places Ansible expects to find them (at the same level as the inventory).
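Put together, each environment's sub-folder ends up looking roughly like this (a sketch of the idea using the `prod` environment; the exact layout may differ):

```
inv/
├── ec2.py                   # single copy of the dynamic inventory script
├── group_vars/              # shared across environments
├── host_vars/               # shared across environments
└── prod/
    ├── ec2.ini              # per-environment configuration
    ├── ec2.py -> ../ec2.py
    ├── group_vars -> ../group_vars
    ├── host_vars -> ../host_vars
    └── localhost
```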
The `localhost` file is there to help when doing one-off, ad-hoc-style things which may or may not connect to remote infrastructure. It also enables local playbooks that configure the machines Ansible itself runs on.
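The file itself can be as simple as a one-line static inventory (a sketch; the exact contents may differ):

```ini
# inv/int/localhost -- target the control machine without touching remote hosts
localhost ansible_connection=local
```

An ad-hoc run like `ansible -i inv/int/localhost localhost -m setup` then works entirely on the local machine.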