High Availability Architecture on AWS Using the HAProxy Load Balancer and Ansible

In the real world, when a website goes viral overnight, a surge of traffic hits the web servers, and we have to launch extra web servers to keep up. A common solution is an orchestration platform such as Kubernetes, Amazon EKS, or Red Hat OpenShift. But all of these tools use load balancers behind the scenes to spread traffic across multiple servers. To understand how load balancers work, let’s build a high availability architecture using the HAProxy load balancer on an AWS EC2 instance and automate the configuration with an Ansible playbook.

Ansible is a configuration management tool. HAProxy is software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications, spreading requests across multiple servers.

Objectives:

  1. Automatically configure the HAProxy load balancer using an Ansible playbook.
  2. When a new web server is added to the Ansible inventory, the HAProxy configuration file should be updated with the new web server.

Prerequisites:

  1. The web servers should already be configured.
  2. One EC2 instance should be provisioned for the Ansible controller node.

Let’s begin with Ansible Installation…

Step 1: Installing Ansible on an AWS EC2 instance

Log in to the EC2 instance that you want to use as the Ansible controller node.
We will install and configure Ansible on this instance.

Because Ansible is written in Python, we can install it with pip, the Python package installer.

Before this, make sure Python 3 is installed.

$ pip3 install ansible

Step 2: Configure Ansible

In the previous step, we installed Ansible. But before we can use it, we have to configure it. We need to provide two things to Ansible: the inventory, which contains the IP addresses of the web servers, and the Ansible configuration file.

Creating an Ansible inventory file:

$ mkdir /etc/myinventories
$ vim /etc/myinventories/myHosts.txt

myHosts.txt is the Ansible inventory file, in which we will write the IP addresses of the web servers.

We will create a group named webservers and add the IP of one web server to the inventory file.
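
As a rough sketch, the inventory described above could be created like this. The address 192.0.2.10 is a placeholder from the documentation range, not a real server from this setup, and the snippet writes to a scratch directory (a hypothetical ./ansible-demo) so it can be tried without root; on the controller node the file lives at /etc/myinventories/myHosts.txt.

```shell
# Sketch only: the scratch directory and the IP address are placeholders.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

# One group named "webservers" containing a single web server.
cat > "$DEMO_DIR/myHosts.txt" <<'EOF'
[webservers]
192.0.2.10
EOF

# Show the result.
cat "$DEMO_DIR/myHosts.txt"
```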

Now we have the inventory file ready. But how will Ansible know that this is the inventory file?

To tell Ansible which inventory file to use, we have to add the location of the inventory file to the Ansible configuration file.

Configure Ansible:

The Ansible configuration file lives in the /etc/ansible folder. If the file is absent, create an ansible.cfg file in the /etc/ansible folder.

Here, we will start configuring Ansible from an empty configuration file.

$ vi /etc/ansible/ansible.cfg

Let’s add the inventory file entry under the [defaults] section in the ansible.cfg file.
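
A minimal sketch of that entry, assuming the inventory path from the previous step. The file is written to a scratch directory (a hypothetical ./ansible-demo) here; on the controller node it belongs in /etc/ansible/ansible.cfg. Note that the section header Ansible reads is spelled [defaults].

```shell
# Sketch only: the scratch directory is a placeholder for /etc/ansible.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

# Point Ansible at the inventory file created earlier.
cat > "$DEMO_DIR/ansible.cfg" <<'EOF'
[defaults]
inventory = /etc/myinventories/myHosts.txt
EOF

# Show the result.
cat "$DEMO_DIR/ansible.cfg"
```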

Now Ansible knows where to find the inventory file.

At this point, if we try to ping the web server with the Ansible ping module, we get an authentication error. Our web servers run on AWS EC2 instances, and we log in to them with a private key. Before Ansible can ping a web server it has to log in to it, and for that it needs the private key associated with that instance. So far, we haven’t given Ansible a private key.

So, let’s provide a private key to the ansible first.

Our controller node is running on an AWS EC2 instance and my private key is on my local system, so I first have to transfer the private key to the controller node. For this, I will use a Windows program called WinSCP.

Open WinSCP on the local system and log in to the Ansible controller node. Remember that to log in to the controller node, you have to use the private key associated with the controller node in .ppk format.

After this, transfer the private key in the .pem format to the ansible controller node.

Now we have transferred the private key to the Ansible controller node. Next, we have to update the Ansible configuration file with the location of the private key file.
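
The updated configuration might look like the sketch below. The key path /root/mykey.pem and the ec2-user login are assumptions (use the path where you copied your .pem file and the default user of your AMI). Turning off host_key_checking is an optional extra that is not necessarily part of the original setup; it only keeps the first connection from stopping at the SSH host key prompt.

```shell
# Sketch only: key path, remote user and scratch directory are assumptions.
# SSH rejects private keys that other users can read, so on the controller
# node restrict the key first, for example: chmod 400 /root/mykey.pem
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

cat > "$DEMO_DIR/ansible.cfg" <<'EOF'
[defaults]
inventory = /etc/myinventories/myHosts.txt
# Private key Ansible uses to log in to the web servers (placeholder path)
private_key_file = /root/mykey.pem
# Login user on the web servers (ec2-user is the Amazon Linux default)
remote_user = ec2-user
# Optional: skip the interactive SSH host key prompt
host_key_checking = False
EOF

# Show the result.
cat "$DEMO_DIR/ansible.cfg"
```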

Now, try to ping the web servers using the ansible ping module.

$ ansible all -m ping

So, we have successfully configured the Ansible controller.

Now, we can move forward to write ansible playbooks.

Step 3: Writing an Ansible Playbook for configuring HAProxy

I will configure the HAProxy load balancer on the controller node itself. So, I will run this playbook on the localhost.

The playbook file, haproxy.yml, contains the code that installs and configures the HAProxy load balancer and starts the HAProxy service on the controller node.
I have added a handler to this playbook that restarts the haproxy service only if the haproxy configuration file has changed.

The loadbalancer_port variable will have the port number on which the load balancer will run. In this case, the load balancer port will be 8080.
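
A playbook along these lines could be sketched as follows. Only the file name haproxy.yml, the loadbalancer_port variable, and the task name Install HAProxy Software come from this article; the other task and handler names, and the assumption that the haproxy.cfg template sits next to the playbook, are illustrative. The snippet writes the playbook to a scratch directory (a hypothetical ./ansible-demo).

```shell
# Sketch only: task/handler names other than "Install HAProxy Software"
# and the scratch directory are assumptions.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

cat > "$DEMO_DIR/haproxy.yml" <<'EOF'
- hosts: localhost
  vars:
    # Port on which the load balancer listens
    loadbalancer_port: 8080
  tasks:
    - name: Install HAProxy Software
      package:
        name: haproxy
        state: present

    # Render the Jinja template; notify the handler only when the file changes
    - name: Configure HAProxy
      template:
        src: haproxy.cfg
        dest: /etc/haproxy/haproxy.cfg
      notify: Restart HAProxy

    - name: Start HAProxy Service
      service:
        name: haproxy
        state: started
        enabled: true

  handlers:
    - name: Restart HAProxy
      service:
        name: haproxy
        state: restarted
EOF

# Show the result.
cat "$DEMO_DIR/haproxy.yml"
```

Because the restart is a handler, it runs at the end of the play and only when the template task reports a change.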

Step 4: Writing the HAProxy configuration file

The haproxy configuration file is named haproxy.cfg.

Get the haproxy.cfg file here.

On line 44, we provide the haproxy load balancer port number, which is taken from a variable defined in the haproxy.yml file.

Next, we have to provide the backend details, that is, the IP addresses of the servers where our application is running. We have multiple servers and want to balance load across all of them, so we group them under a single inventory group name, ‘webservers’.

To use all of these IP addresses, we use Jinja templating (lines 63 to 65), which loops over the list of IP addresses and adds each one as a backend server. Our application servers listen on port 80.
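
The two templated parts could be sketched like this. It is only the frontend and backend fragment, not the full haproxy.cfg: the global and defaults sections of the stock file are assumed to stay as they are, and the section names main and app, the server labels, and the scratch file name are illustrative.

```shell
# Sketch only: a fragment of the template, written to a scratch file.
# Section names, server labels and the file name are assumptions.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

cat > "$DEMO_DIR/haproxy-frontend-backend.cfg" <<'EOF'
frontend main
    # Port comes from the loadbalancer_port variable in haproxy.yml
    bind *:{{ loadbalancer_port }}
    default_backend app

backend app
    balance roundrobin
{% for ip in groups['webservers'] %}
    server app{{ loop.index }} {{ ip }}:80 check
{% endfor %}
EOF

# Show the result.
cat "$DEMO_DIR/haproxy-frontend-backend.cfg"
```

When Ansible renders the template, the loop emits one server line per host in the webservers group, so adding an IP to the inventory adds a backend server on the next run.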

Finally, we have the ansible-playbook for configuring the HAProxy load balancer.

Step 5: Running the Ansible Playbook

We have already added the IP of the web server to the Ansible inventory.

If we run the playbook now, it fails with an error saying that the task Install HAProxy Software failed. Installing software requires superuser permissions, and we haven’t yet told Ansible to acquire them. We can fix this by adding privilege escalation settings to the Ansible configuration file.

This makes sure that superuser permissions are available to tasks wherever they are required.
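
One way to express this, as a sketch, is a [privilege_escalation] section appended to ansible.cfg. The values below assume passwordless sudo for the login user, which is the usual default on EC2 images; the scratch directory (a hypothetical ./ansible-demo) stands in for /etc/ansible.

```shell
# Sketch only: assumes passwordless sudo; the scratch directory is a placeholder.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

# Append the section. Run this once, otherwise the section is duplicated.
cat >> "$DEMO_DIR/ansible.cfg" <<'EOF'

[privilege_escalation]
become = true
become_method = sudo
become_user = root
become_ask_pass = false
EOF

# Show the result.
cat "$DEMO_DIR/ansible.cfg"
```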

Now, run the ansible-playbook haproxy.yml.

$ ansible-playbook haproxy.yml

Step 6: Fixing the package module issue by using the command module and making it idempotent

The package module failed for me with a dependency conflict. The error is specific to Amazon Linux 2 and dnf: Amazon Linux 2 does not support dnf, and to use yum and rpm from Ansible there we would have to switch to Python 2.7.

So, instead of the package module, I will use the command module to install HAProxy. But the command module is not idempotent.

To work around this, I first check whether HAProxy is already installed. The install command runs only when that check shows HAProxy is missing; otherwise it is skipped.

First try the previous playbook with the package module. If you get the Python and dnf dependency error, use this version of the playbook with the command module instead.
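
The check-then-install logic could be sketched as two tasks that take the place of the package task in haproxy.yml. The use of rpm -q for the check, the variable name haproxy_check, and the scratch file name are assumptions for illustration; the rest of the playbook stays unchanged.

```shell
# Sketch only: these two tasks replace the package task in haproxy.yml.
# The check command, variable name and file name are assumptions.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

cat > "$DEMO_DIR/install_haproxy_tasks.yml" <<'EOF'
# rpm -q exits non-zero when the package is not installed
- name: Check whether HAProxy is already installed
  command: rpm -q haproxy
  register: haproxy_check
  failed_when: false
  changed_when: false

# Run the installer only when the check found nothing
- name: Install HAProxy Software
  command: yum install haproxy -y
  when: haproxy_check.rc != 0
EOF

# Show the result.
cat "$DEMO_DIR/install_haproxy_tasks.yml"
```

Setting failed_when and changed_when to false keeps the check from failing the play or reporting a change, so a second run of the playbook reports the install task as skipped.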

Now, run the ansible-playbook.

$ ansible-playbook haproxy.yml

This will install and configure haproxy automatically on the controller node.

Step 7: Check the website

Go to the load balancer IP on port 8080 and check if the website is up and running.

The IP address of the system with a load balancer is 65.1.1.111 and as our load balancer is running on port 8080, we have to go to port 8080 on this IP address.

So, open up the browser and go to http://65.1.1.111:8080

The above image shows that our website is accessible through the load balancer.

Step 8: Add another web server

Now, for testing the high availability, let's add another webserver to the load balancer.

The web server is already configured, so we just have to add its IP to the Ansible inventory file and re-run the playbook. This updates the load balancer with the new web server, and after that traffic is distributed equally between the two web servers.

Add the IP of the new web server in the inventory file:
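
As a sketch, the inventory with the second server added could look like this. Both addresses are placeholders from the documentation range, and the scratch directory (a hypothetical ./ansible-demo) stands in for /etc/myinventories.

```shell
# Sketch only: the scratch directory and both IP addresses are placeholders.
DEMO_DIR=./ansible-demo
mkdir -p "$DEMO_DIR"

# The webservers group now lists two servers.
cat > "$DEMO_DIR/myHosts.txt" <<'EOF'
[webservers]
192.0.2.10
192.0.2.11
EOF

# Show the result.
cat "$DEMO_DIR/myHosts.txt"
```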

Now, run the ansible-playbook again.

You can see in the above image that the haproxy configuration file has been updated and the haproxy service has been restarted for the changes to take effect.

To test the load balancer, go to the load balancer IP and refresh the page a couple of times; you will see the response switch back and forth between the web servers. This means each request is being served by a different server, which shows that the load balancer is working behind the scenes and balancing the load equally between the web servers. The algorithm used by this load balancer is round-robin.

Balancing load between multiple servers using a load balancer

You can add as many web servers as you want and the load balancer will keep on balancing the load among them.

Glad you made it to the end! Thanks for reading this article. I hope you learned something new and interesting from it.


For any help or suggestions connect with me on Twitter at @TheNameIsAnkush or find me on LinkedIn.

Tech blogger, researcher and integrator