Jul 18, 2014

Lightweight Provisioning of Maven artifacts using Ansible in a VirtualBox

I am currently working on a fairly large software project involving 16+ Solr cores running in a master/slave setup on a zoo of servers. I recently had to upgrade the whole environment from Solr 3.6.1 to Solr 4.6.1. Manually!
We have a tar.gz archive with an install shell script that does most of the work. But still: distributing and extracting the archive, then running the install script manually on several servers is tedious, to say the least. Also, depending on the machine, I had to make certain host-specific adjustments.

Never again! My requirement was clear: I need an automatic deployment solution that does all the manual work for me in upcoming releases.

Using Ansible for Provisioning

Currently, there are quite a few provisioning solutions out there to choose from. At QAware, Puppet is used to provision the internal server infrastructure. I had also heard of Chef as being the other key player when it comes to provisioning. However, it quickly turned out that both tools have one major drawback for my specific use case: they need to be installed on the target machines.

The problem: we do not have the required sudo permissions on any of the hosts to install the required software packages for either Puppet or Chef.

After a tip from a colleague and a bit of further research on the web, Ansible quickly emerged as the prime candidate to provision our Solr server infrastructure. Ansible currently has the status Adopt on the ThoughtWorks Technology Radar from July 2014 and is considered a stable and mature tool. Its main benefit: no specific software or agent needs to be installed on the target machines. Everything is automated via SSH. So I decided to give Ansible a try.

Running Ansible with Vagrant and VirtualBox

There was only one minor problem: almost all of our developers, myself included, use Windows 8. But Ansible requires a Linux environment to run in. No big deal: Go Go Gadgeto Virtualization!

A good and free choice for running virtual machines is Oracle's VirtualBox. You can download the current version from the VirtualBox website. Don't forget to install the Extension Pack as well. Next you will need Vagrant, which you can likewise download in its current version from the Vagrant website. This is an awesome tool to set up, configure and run consistent virtual machines. Once you have those two tools installed you are ready to continue.

First we need to create a fresh virtual machine instance using Vagrant. There are a lot of preconfigured so-called boxes to choose from; for this tutorial we use a default Ubuntu 12.04 box. Issue the following commands to initialize, download and start the VM:
$ vagrant init hashicorp/precise64
$ vagrant up --color
These commands will take some time to complete: Vagrant downloads the specified box and then starts the VM using VirtualBox. You won't see the box running, because it starts headless by default, but you can connect via SSH on localhost:2222.
Next, we enable GUI mode for the virtual machine so we can log on using a username and password. We will also specify the amount of memory assigned to the VM. Simply open the Vagrantfile in an editor of your choice and make the following changes:
config.vm.provider "virtualbox" do |vb|
    # Don't boot with headless mode
    vb.gui = true
    # Use VBoxManage to customize the VM: change memory to 512MB
    vb.customize ["modifyvm", :id, "--memory", "512"]
end

Installing Ansible using Vagrant provisioning

Next we need to install Ansible itself. Vagrant provides several provisioning options out of the box, such as Puppet or Chef. But to keep things simple we'll use basic shell provisioning. You simply write all the installation steps you would normally execute in a Linux shell into a plain shell script. The Vagrant shell provisioning then merely executes this script to perform the installation. Again, open the Vagrantfile and insert the following code snippet:
config.vm.provision "shell" do |s|
    s.path = "src/main/vagrant/provision-vm.sh"
end
You will probably have to adjust the file path to match your project. In my case the shell script is part of a Maven project; for more details about the Maven integration see the section on integrating Ansible provisioning into a Maven project below. The installation of Ansible itself is performed by a series of shell commands:

sudo apt-get -y update
sudo apt-get -y install sshpass
sudo apt-get -y install vim

# install the official Ansible package
sudo apt-get -y install ansible

# install the latest Ansible from source
sudo apt-get -y install git
sudo apt-get -y install python-pip
sudo pip install paramiko PyYAML jinja2 httplib2

# GitHub no longer serves the unencrypted git:// protocol, so clone via HTTPS
git clone https://github.com/ansible/ansible.git
chown -R vagrant:vagrant ansible

# write additional settings into .bashrc 
echo 'export ANSIBLE_HOST_KEY_CHECKING=False' >> /home/vagrant/.bashrc
echo 'source ansible/hacking/env-setup' >> /home/vagrant/.bashrc
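One caveat with the two echo lines above: each run of the provisioner (for example via 'vagrant provision') appends them to .bashrc again. A minimal sketch of a way around this is a helper that only appends a line if it is not there yet. The name append_once is made up for this illustration and is not part of Vagrant or Ansible.

```shell
#!/bin/sh
# append_once LINE FILE: add LINE to FILE unless an identical line already exists.
# (append_once is an illustrative helper, not part of Vagrant or Ansible.)
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Only touch the real file when running inside the Vagrant box.
if [ -d /home/vagrant ]; then
    append_once 'export ANSIBLE_HOST_KEY_CHECKING=False' /home/vagrant/.bashrc
    append_once 'source ansible/hacking/env-setup' /home/vagrant/.bashrc
fi
```

With this in place the provisioning script can be re-run any number of times without .bashrc growing.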
Ansible is now installed in your provisioning VM and ready to use. Well, almost. After using Ansible for some initial tests, I realized that I also want to provision a preconfigured Ansible configuration file as well as a known_hosts file with all the SSH host keys for our server infrastructure. Vagrant provides another provisioning mechanism to achieve this easily: file provisioning. This simply copies files from your host machine into the VM. Again, open your Vagrantfile and insert the following snippet:
config.vm.provision "file" do |f|
    f.source = "src/main/vagrant/known_hosts"
    f.destination = ".ssh/known_hosts"
end

config.vm.provision "file" do |f|
    f.source = "src/main/vagrant/.ansible.cfg"
    f.destination = ".ansible.cfg"
end
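
The known_hosts file itself has to come from somewhere. One possible way to build it, sketched here under the assumption that the target hosts are reachable from the machine you run it on, is OpenSSH's ssh-keyscan. The function name collect_host_keys and the example host names are made up for illustration, and keys gathered this way should be compared against fingerprints you trust before you rely on them.

```shell
#!/bin/sh
# collect_host_keys OUTFILE HOST...: write the public host keys of all given
# hosts to OUTFILE in known_hosts format.
# (collect_host_keys is an illustrative helper; ssh-keyscan ships with OpenSSH.)
collect_host_keys() {
    out=$1
    shift
    : > "$out"    # start with an empty file
    for host in "$@"; do
        # -H hashes the host names in the output; error messages for
        # unreachable hosts are discarded and the loop moves on.
        ssh-keyscan -H "$host" >> "$out" 2>/dev/null
    done
}

# Example call with placeholder host names (uncomment to run):
# collect_host_keys src/main/vagrant/known_hosts solr-01.example.com solr-02.example.com
```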

A quick introduction to Ansible

Writing Ansible files isn't hard. Ansible uses YAML for its syntax, which keeps playbooks quite simple. The main element is a list of tasks. These tasks make up the individual installation actions that are executed in the specified order on the specified hosts. I think the following listing is pretty self-explanatory.
- hosts: solr-slaves
  remote_user: qqair00
  vars:
    home: /home/qqair00
    apps: /usr/local
    solr_tarball: Solr_Slave-1.2.3.tar.gz
  tasks:
    - name: Shutdown Solr instances if running ...
      shell: bin/{{item.script}} stop chdir={{home}} removes=bin/
      with_items:
        - { script: 'solr-01.sh' }
      ignore_errors: yes
    - name: Copy and extract tarball to apps directory
      unarchive: src={{solr_tarball}} dest={{apps}}/
      notify:
        - Start Solr 01 instance
  handlers:
    - name: Start Solr 01 instance
      shell: bin/solr-01.sh start chdir={{home}}
      async: 30
      poll: 0

I won't go into more detail here; every installation is different anyway. The official Ansible documentation is a good reference for the available features and modules. Also, have a look at the Ansible playbook examples on GitHub.

Integrating Ansible provisioning into a Maven project

Each installation artifact in our project is built by an individual Maven module as a tar.gz archive (using the maven-assembly-plugin). So the idea was simple: we need an additional Maven module that gathers all these installation artifacts together with our Ansible playbooks and makes all these files accessible from within the Vagrant VM. The following diagram shows the directory layout of our provisioning-vm Maven module.

All the source files of the Ansible playbooks are contained in the src/main/ansible/ directory. The directory layout of the Ansible source directory follows the best practices from the Ansible documentation (see the best practices section there for more details). For simple provisioning tasks you could simply use individual YML files. But as soon as the infrastructure gets more complex it is definitely better to partition your playbooks into individual roles for better reuse. For example, we have a separate Java role to provision the JDK on all of our different servers.

To prepare and assemble the final provisioning directory with all required playbooks and installation artifacts we will use two basic Maven plugins (see POM snippet for details).
  • maven-resources-plugin: this plugin is used to copy all our playbook files to the build directory. Additionally, we can use filtering to process all our YML files and insert dynamic, build-specific variables such as ${project.version}.
  • maven-dependency-plugin: this plugin is used to copy all the required installation artifacts into the files/ directory of the Ansible role directories. The artifacts are declared as dependencies in the provisioning module's POM.

Next, we need to make the Ansible source files available in our provisioning VM so that we can execute the playbooks for our infrastructure. To achieve this, we simply mount the target/ directory of our Maven module as a directory into the VM using Vagrant. Open your Vagrantfile and insert the following line:
config.vm.synced_folder "target/", "/home/vagrant/provision",
                        :mount_options => ["dmode=755", "fmode=644"]

And that's it. Now you can call 'mvn install', followed by 'vagrant up', in a console from the provisioning module's directory. Once the provisioning VM has started you can log in at the VM's console window and change into the mounted provision/ directory. Finally, you can provision your infrastructure:
$ ansible-playbook all-solr-slaves.yml -i INT --ask-pass
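
The -i INT argument names the inventory to use, presumably a file listing the hosts of the integration environment; its contents are not shown in this post. A minimal sketch of what such a file could look like follows. The host names are placeholders and the INVENTORY variable is only an assumption for this example; the one fixed requirement is that the group name matches the hosts: value of the playbook.

```shell
#!/bin/sh
# Sketch: create a minimal inventory file for the integration environment.
# The default file name INT matches the -i argument used above; the host
# names are placeholders, not real servers.
inventory="${INVENTORY:-INT}"

cat > "$inventory" <<'EOF'
# The group name must match the "hosts:" value of the playbook.
[solr-slaves]
solr-slave-01.example.com
solr-slave-02.example.com
EOF

# Count host entries: lines that are not empty, not comments and not group headers.
host_count=$(grep -Evc '^(#|\[|$)' "$inventory")
echo "Wrote $host_count hosts to $inventory"
```

Before running the playbook against such an inventory, a quick connectivity check with Ansible's ping module can save some time: ansible solr-slaves -i INT -m ping --ask-pass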


  1. Hi Mario, I was wondering if Ansible is able to do the opposite, i.e., to run the command mvn clean install on a given path.

    1. Hi, you can probably do something like this with Ansible even though I do not understand your use case fully.
      You could use Ansible to install Maven on the target machine you are provisioning, and then issue a shell command on that target machine to do the 'mvn clean install' in a directory of your choice.
      Perhaps that helps.