Automate All the Things With Ansible: Part One

Overview

This is part one of a two-part tutorial on Ansible. In this part, you will learn what Ansible is, how to install and configure it, and how to set up a local Vagrant cluster to test it. Then you'll explore the inventory, modules, ad-hoc commands, playbooks, run strategies, blocks, and the vault.

What Is Ansible?

Ansible is a configuration management and orchestration tool. It operates in the same domain as Puppet, Chef, and SaltStack. This means that with Ansible you can provision a whole fleet of remote servers, install and deploy software on them, and keep track of them, all from a single control machine.

Ansible is an open-source project implemented in Python and has a pluggable architecture with modules that can manage pretty much any operating system, cloud environment, and system administration tool or framework. You can also pretty easily extend it with your own plugins if you want to do something special.

One of the unique features of Ansible is that it doesn't install any software on managed machines. It manages the machines remotely over SSH. To manage a remote machine, you just need to make sure that your public SSH key is in the authorized_keys file of that machine.

Getting Started With Ansible

Ansible runs on a control machine and can manage servers running any operating system, but the control machine can't be a Windows machine at the moment. I'll use Mac OS X in this tutorial as the control machine.

Installation

Ansible 2.0, the release used in this tutorial, requires Python 2.6 or 2.7. To install it, type:

pip install ansible

On Mac OS X, it is recommended to increase the number of file handles:

sudo launchctl limit maxfiles 1024 unlimited

If you see an error like "Too many open files" you probably need to do that.

To verify Ansible was installed properly, type ansible --version. You should see:

ansible 2.0.0.2
  config file =
  configured module search path = Default w/o overrides

The version number may be different, of course.

The Ansible Configuration File

Ansible has a configuration file that lets you control many options. The search order is:

ANSIBLE_CONFIG (an environment variable)
ansible.cfg (in the current directory)
.ansible.cfg (in the home directory)
/etc/ansible/ansible.cfg

You can also override specific settings using individual environment variables, which take precedence over the configuration file.
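
To make the search order concrete, here is a minimal Python sketch that models it. It is not Ansible's own lookup code: the function name find_ansible_config is made up for illustration, and I'm assuming that a candidate which doesn't exist is simply skipped.

```python
import os
from pathlib import Path


def find_ansible_config(environ, cwd, home):
    """Return the first existing config file in the search order above, or None.

    Illustrative model only; assumes missing candidates are skipped.
    """
    candidates = []
    if environ.get("ANSIBLE_CONFIG"):
        candidates.append(Path(environ["ANSIBLE_CONFIG"]))   # 1. environment variable
    candidates.append(Path(cwd) / "ansible.cfg")              # 2. current directory
    candidates.append(Path(home) / ".ansible.cfg")            # 3. home directory
    candidates.append(Path("/etc/ansible/ansible.cfg"))       # 4. system-wide file
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Show which file would win on this machine (None if there is no config file).
print(find_ansible_config(os.environ, os.getcwd(), os.path.expanduser("~")))
```

The practical upshot is that an ansible.cfg placed in a project's work directory shadows both the one in your home directory and the system-wide one.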

Check out the Ansible documentation to learn about all the options.

Set Up the Vagrant Cluster

To really understand the power of Ansible, you need a bunch of servers to manage. For the purpose of this tutorial I'll use a Vagrant cluster of 3 VMs, but as far as Ansible is concerned those are just some hosts it needs to manage. To learn more about Vagrant, check out Introduction to Vagrant.

First, install VirtualBox and Vagrant. Then put the following in a file called 'Vagrantfile' in a work directory:

# -*- mode: ruby -*-
# vi: set ft=ruby :

# Host name => private IP address of each VM in the cluster.
hosts = {
  "larry" => "192.168.88.10",
  "curly" => "192.168.88.11",
  "moe" => "192.168.88.12"
}

Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"

  # Define one VM per entry, each on the private network with its own IP.
  hosts.each do |name, ip|
    config.vm.define name do |machine|
      machine.vm.network :private_network, ip: ip
      machine.vm.provider "virtualbox" do |v|
        v.name = name
      end
    end
  end
end

Then type vagrant up. Vagrant will create three virtual machines for you, available as larry, curly and moe. To verify, type vagrant status. You should see:

Current machine states:

larry                     running (virtualbox)
curly                     running (virtualbox)
moe                       running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

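To make sure you can SSH into your cluster hosts, type: vagrant ssh-config >> ~/.ssh/config. This appends the SSH settings Vagrant generated for each VM to your SSH configuration file.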

Now you can SSH into any of your virtual servers using their hostname. For example: ssh curly. This will allow Ansible to connect to your cluster hosts over SSH without any issues with usernames, passwords, or keys.

Inventory

Now that we have a cluster, we need to tell Ansible about it. This is done using an inventory file. The inventory file is a list of host names organized in groups using an INI file format. Put the following in a file called 'hosts' in your work directory.

[funny]
larry

[funnier]
curly
moe

I put 'larry' in a group called 'funny' and the other hosts in a group called 'funnier'. That organization will allow us to perform actions on these groups. You can also perform actions on individual hosts and on all the hosts.
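
To see how the groups map to hosts, here is a small Python sketch that reads the same inventory text with the standard library's configparser. This is only an approximation for this simple file: real Ansible inventories support syntax that plain INI parsing doesn't cover, and the helper names parse_inventory and select_hosts are my own, not part of Ansible.

```python
import configparser

INVENTORY = """\
[funny]
larry

[funnier]
curly
moe
"""


def parse_inventory(text):
    """Map each group name to its list of hosts (simple INI inventories only)."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # host lines are bare names without "= value"
        delimiters=("=",),     # don't split on ":" so host:port would survive
        interpolation=None,
    )
    parser.optionxform = str   # keep host names exactly as written
    parser.read_string(text)
    return {group: list(parser[group]) for group in parser.sections()}


def select_hosts(groups, pattern):
    """Resolve 'all', a group name, or a single host name to a list of hosts."""
    all_hosts = [host for hosts in groups.values() for host in hosts]
    if pattern == "all":
        return all_hosts
    if pattern in groups:
        return groups[pattern]
    return [host for host in all_hosts if host == pattern]


groups = parse_inventory(INVENTORY)
print(select_hosts(groups, "all"))
print(select_hosts(groups, "funnier"))
print(select_hosts(groups, "larry"))
```

The three calls at the end correspond to the three kinds of targets mentioned above: all hosts, a group, and an individual host.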

Modules

Ansible has a very modular and extensible architecture. All its capabilities are organized in modules. There are core modules and extra modules. Each module represents a command, and most take arguments. You can use modules directly in ad-hoc commands or in playbooks. You can read about all modules in the documentation.

Ad-Hoc Commands

It's time to get hands-on. The simplest way to use Ansible is to run ad-hoc commands. Ad-hoc commands use modules. The format of an ad-hoc command is:

ansible <host group> -i <inventory file> -m <module> [-a "<argument 1> ... <argument N>"]

For example, to see if all the hosts in your inventory are up, you can use the ping module (without arguments):

ansible all -i hosts -m ping

curly | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
larry | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
moe | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Ansible has many modules for all common system administration tasks like file management, user management, and package management, as well as many uncommon tasks. But if you can't find what you need, or just feel more comfortable with plain shell commands, you can use the shell module directly, pipes included. The following command extracts the internal and external IP addresses of all hosts:

ansible all -i hosts -m shell -a '/sbin/ifconfig | grep inet.*Bcast'

larry | SUCCESS | rc=0 >>
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet addr:192.168.88.10  Bcast:192.168.88.255  Mask:255.255.255.0

curly | SUCCESS | rc=0 >>
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet addr:192.168.88.11  Bcast:192.168.88.255  Mask:255.255.255.0

moe | SUCCESS | rc=0 >>
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet addr:192.168.88.12  Bcast:192.168.88.255  Mask:255.255.255.0
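
If you want to launch ad-hoc commands from a script, the quoting is the part that is easy to get wrong: everything after -a has to reach Ansible as a single argument. Here is a minimal Python sketch that builds the two commands used above. The helper name adhoc_command is hypothetical, and the sketch only assembles the command line; it doesn't run anything.

```python
import shlex


def adhoc_command(pattern, inventory, module, args=None):
    """Build the argv list for an ad-hoc run.

    The module arguments travel as ONE string after -a, pipes and all.
    """
    argv = ["ansible", pattern, "-i", inventory, "-m", module]
    if args is not None:
        argv += ["-a", args]
    return argv


ping = adhoc_command("all", "hosts", "ping")
ifconfig = adhoc_command("all", "hosts", "shell", "/sbin/ifconfig | grep inet.*Bcast")

# shlex.join shows how each list would have to be typed in a shell.
print(shlex.join(ping))
print(shlex.join(ifconfig))
```

Passing such a list to subprocess.run sidesteps shell quoting on the control machine entirely, because each list element is delivered as exactly one argument.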

Playbooks

Ad-hoc commands are nice when you want to quickly do something on a bunch of hosts, but the real power of Ansible is in its playbooks. Playbooks are YAML files where you define collections of tasks to accomplish goals like provisioning, configuring, deploying and orchestrating your infrastructure.

Example Playbook

Let's take a look at what a typical playbook looks like before we get down to the details.

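---
- hosts: funnier
  tasks:
    # update_cache=true refreshes the apt package index before installing.
    - name: Install Nginx
      apt: pkg=nginx state=present update_cache=true
      notify: Start Nginx
    - name: Install Python 3
      apt: pkg=python3-minimal state=present
  handlers:
    # Runs only if a task that notifies it reports a change.
    - name: Start Nginx
      service: name=nginx state=started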

The playbook has a hosts section where you specify hosts from the inventory file. In this case, the group name is "funnier". Then there is a tasks section with two tasks that install Nginx and Python 3. Finally, there is a handlers section with a handler that starts Nginx. A handler runs only when a task that notifies it reports a change, so Nginx is started after it has been installed.

Running Playbooks

You run playbooks with the ansible-playbook command. You still need to provide an inventory file and the playbook you want to run. Save the playbook into a file called "playbook.yml" in your working directory. Let's give it a try:

ansible-playbook -i hosts playbook.yml

PLAY ***************************************************************************

TASK [setup] *******************************************************************
ok: [moe]
ok: [curly]

TASK [Install Nginx] ***********************************************************
fatal: [moe]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to lock apt for exclusive operation"}
fatal: [curly]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to lock apt for exclusive operation"}

PLAY RECAP *********************************************************************
curly                      : ok=1    changed=0    unreachable=0    failed=1
moe                        : ok=1    changed=0    unreachable=0    failed=1

Oh, no. What happened? Ansible gives a decent error message here: "Failed to lock apt for exclusive operation". Many playbooks require sudo privileges, and this playbook is no exception. To run the playbook with sudo privileges, you just add the --sudo flag (Ansible 1.9 introduced --become as its replacement, and later releases expect that spelling instead):

ansible-playbook -i hosts playbook.yml --sudo

PLAY ***************************************************************************

TASK [setup] *******************************************************************
ok: [curly]
ok: [moe]

TASK [Install Nginx] ***********************************************************
changed: [moe]
changed: [curly]

TASK [Install Python 3] ********************************************************
changed: [moe]
changed: [curly]

RUNNING HANDLER [Start Nginx] **************************************************
changed: [moe]
changed: [curly]

PLAY RECAP *********************************************************************
curly                      : ok=4    changed=3    unreachable=0    failed=0
moe                        : ok=4    changed=3    unreachable=0    failed=0

Ansible aims to be idempotent, which means that if something is already in the desired state, Ansible will leave it alone. In the output of ansible-playbook, you can see which tasks succeeded or failed and which hosts were changed.

Let's run the same playbook again. Nothing should change, and because no task reports a change, the handler is not notified:

ansible-playbook -i hosts playbook.yml --sudo

PLAY ***************************************************************************

TASK [setup] *******************************************************************
ok: [moe]
ok: [curly]

TASK [Install Nginx] ***********************************************************
ok: [curly]
ok: [moe]

TASK [Install Python 3] ********************************************************
ok: [curly]
ok: [moe]

PLAY RECAP *********************************************************************
curly                      : ok=3    changed=0    unreachable=0    failed=0
moe                        : ok=3    changed=0    unreachable=0    failed=0
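
The difference between the two runs comes down to "check the current state first, act only if needed". Here is a toy Python model of that behavior for a single host. It is a sketch under simplifying assumptions, not how the apt module is implemented: a host is just a set of installed package names, and run_play is a made-up function.

```python
def run_play(host_state, tasks, handlers):
    """Apply install tasks to one host and return (results, handlers_run).

    Toy model: a task is 'ok' if the package is already installed, otherwise
    it installs the package, reports 'changed', and notifies its handler.
    """
    results = []
    notified = []
    for task in tasks:
        package = task["package"]
        if package in host_state["installed"]:
            results.append((task["name"], "ok"))        # desired state reached already
            continue
        host_state["installed"].add(package)
        results.append((task["name"], "changed"))
        handler = task.get("notify")
        if handler and handler not in notified:
            notified.append(handler)
    # Handlers run after the tasks, and only if something notified them.
    handlers_run = [name for name in handlers if name in notified]
    return results, handlers_run


tasks = [
    {"name": "Install Nginx", "package": "nginx", "notify": "Start Nginx"},
    {"name": "Install Python 3", "package": "python3-minimal"},
]
handlers = ["Start Nginx"]
host = {"installed": set()}

print(run_play(host, tasks, handlers))  # first run: both tasks changed, handler runs
print(run_play(host, tasks, handlers))  # second run: both tasks ok, no handler
```

This mirrors the two outputs above: the first run reports changes and runs the handler, while the second run finds everything in place and leaves the host alone.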

Run Strategies

Prior to Ansible 2.0, plays executed in a linear fashion, task by task. All the target hosts executed the first task. Only when all the hosts were done with the first task could they start on the second task.

Ansible 2.0 added the concept of run strategies. There are currently two strategies: the "linear" strategy described above, which is the default, and the "free" strategy, where each host still executes the tasks of the play in order but not in lockstep with the other hosts.

This could be useful if hundreds of hosts need to download several files from some FTP servers. The first host may finish downloading the first file and move on to the next one, while other hosts are still busy downloading the first file. By the time the other hosts get to download the next file, the first host is done already, and there is less contention.

The free strategy seems superior in most situations, as long as a task on one host doesn't depend on the other hosts having finished the previous task. You just add a strategy: free key-value pair to the play.

- hosts: all
  strategy: free
  tasks:
  ...
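
A quick back-of-the-envelope model shows where the time goes under each strategy. In this Python sketch the per-task durations are made-up numbers, every host is assumed to run in parallel (no limit on the number of forks), and contention between hosts is ignored.

```python
def linear_total(durations):
    """Total run time when every host waits for the slowest host after each task."""
    per_task = zip(*durations.values())          # group the durations task by task
    return sum(max(task_times) for task_times in per_task)


def free_total(durations):
    """Total run time when each host works through its tasks at its own pace."""
    return max(sum(host_times) for host_times in durations.values())


# Hypothetical seconds each host needs for task 1 and task 2.
durations = {
    "larry": [2, 9],
    "curly": [9, 2],
    "moe": [5, 5],
}

print(linear_total(durations))  # 9 + 9 = 18
print(free_total(durations))    # max(11, 11, 10) = 11
```

With the linear strategy the slowest host of every task sets the pace, so the slow spots add up. With the free strategy only the slowest host overall matters.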

Blocks

Another new Ansible 2.0 feature is blocks. Blocks let you group tasks together. This is very useful if you have tasks that need to execute only under a certain condition. Previously, you had to do it for every task separately.

---
- hosts: all
  tasks:
    - debug: msg='Task 1 here'
      when: ansible_distribution == 'Ubuntu'

    - debug: msg='Task 2 here'
      when: ansible_distribution == 'Ubuntu'

    - debug: msg='Task 3 here'
      when: ansible_distribution == 'Ubuntu'

With blocks, you can group all these debug tasks together and put the "when" condition at the block level.

- hosts: all
  tasks:
    - block:
      - debug: msg='Task 1 here'
      - debug: msg='Task 2 here'
      - debug: msg='Task 3 here'
      when: ansible_distribution == 'Ubuntu'
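
One way to think about a block-level "when" is that the condition is handed down to every task inside the block. The following Python sketch makes that explicit by expanding blocks into plain tasks. It works on plain dictionaries that mimic the YAML above; the function name flatten is my own, and this is a mental model rather than Ansible's actual implementation.

```python
def flatten(tasks, inherited_when=()):
    """Expand blocks into plain tasks, attaching every enclosing condition."""
    flat = []
    for task in tasks:
        conditions = tuple(inherited_when)
        if "when" in task:
            conditions += (task["when"],)
        if "block" in task:
            # Recurse so nested blocks pass their conditions further down.
            flat.extend(flatten(task["block"], conditions))
        else:
            plain = {key: value for key, value in task.items() if key != "when"}
            if conditions:
                plain["when"] = list(conditions)   # all conditions must hold
            flat.append(plain)
    return flat


play_tasks = [
    {
        "block": [
            {"debug": "msg='Task 1 here'"},
            {"debug": "msg='Task 2 here'"},
            {"debug": "msg='Task 3 here'"},
        ],
        "when": "ansible_distribution == 'Ubuntu'",
    }
]

for task in flatten(play_tasks):
    print(task)
```

Each of the three debug tasks ends up carrying the Ubuntu condition, which is exactly what the longer playbook without blocks spelled out by hand.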

The Vault

Ansible communicates with remote machines over SSH, so the traffic is encrypted, but the playbooks themselves may contain secrets like usernames, passwords, and API keys. Since you typically store playbooks in source control systems like Git, this information will be visible to anyone who has read access.

Ansible helps with the ansible-vault program, which lets you create, edit, and rekey encrypted files. These files are decrypted on the fly when you run the playbook and provide a password. If you add the --ask-vault-pass flag to ansible-playbook, it will prompt you for the vault password.

Alternatively, you may add --vault-password-file <password file> and Ansible will read the password from your file. If you use the password file, don't store it in source control!

Now, you can safely store the encrypted files in source control and not worry about anyone finding your secrets. You need to manage your vault password carefully. If you lose it, you won't be able to decrypt the files in the vault.
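
Before committing, it can be reassuring to check that the files you meant to encrypt really are encrypted. Vault files are text files whose first line starts with a $ANSIBLE_VAULT; header, and the Python sketch below relies on that. The file names and helper functions are hypothetical, the sample content is a placeholder rather than real ciphertext, and the check only looks at the header; it doesn't verify that the file can be decrypted.

```python
import tempfile
from pathlib import Path

VAULT_HEADER_PREFIX = "$ANSIBLE_VAULT;"


def is_vault_encrypted(path):
    """Return True if the file's first line looks like an ansible-vault header."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline()
    return first_line.startswith(VAULT_HEADER_PREFIX)


def unencrypted_secrets(paths):
    """List the given files that do NOT carry a vault header."""
    return [str(path) for path in paths if not is_vault_encrypted(path)]


# Demo with two throwaway files (names are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    encrypted = Path(tmp) / "secrets.yml"
    encrypted.write_text("$ANSIBLE_VAULT;1.1;AES256\n61626364\n")  # placeholder body
    plain = Path(tmp) / "plain.yml"
    plain.write_text("db_password: example\n")
    print(unencrypted_secrets([encrypted, plain]))  # lists only plain.yml
```

A check like this fits naturally into a pre-commit hook, so a plain-text secrets file never reaches source control by accident.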

Conclusion

Ansible is a great tool. It is lightweight. It can be used interactively with ad-hoc commands, and it scales very well to massive systems. It also has a lot of momentum and a great community. If you manage or even just work with remote servers, you want Ansible.

Stay tuned for part two.
