When we developed the TPC 2017 mobile application, we wanted to create a repeatable process for delivering white-labeled mobile applications in this space. This new delivery model did not end with the mobile application’s UI and data. The backend had to be configuration-driven and easy to redeploy as well. This way we can spin up a mobile application with a working backend in minutes.
To do this, we embraced the idea of “infrastructure as code” (aka “IaC”). IaC allows us to get away from the manual configuration and custom scripts that have traditionally been used to manage infrastructure. Instead, we can apply the same practices we use for the rest of a software project’s assets, from code to images: version control, continuous integration, and automated testing. By bringing the infrastructure components into that process, we can make infrastructure changes more reliably and easily.
Here at Infinity, we are language and platform agnostic. With that in mind, we chose Terraform because it is a multi-platform solution for writing infrastructure as code. It provides a high-level configuration syntax that abstracts away command-line tools and complex web portals.
At its essence, Terraform is driven by a three-step process. First, you write configuration files that describe the components your application or system needs. From these, Terraform creates an execution plan that determines the steps it will perform. Finally, Terraform executes the plan to build the infrastructure. Because Terraform is designed with change automation in mind, when the infrastructure configuration changes, Terraform can determine the specifics of what changed and design an incremental execution plan to maximize efficiency. You’re also able to review the changes that will be made before they happen.
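To make the plan step more concrete, here is a minimal Python sketch of the idea behind an incremental execution plan. It is not Terraform's actual implementation. The names `Action`, `make_plan`, and `apply_plan` are our own for illustration, and resources are modeled as plain dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "update", or "delete"
    name: str  # hypothetical resource name


def make_plan(desired: dict, recorded: dict) -> list:
    """Compare the desired configuration with the recorded state."""
    plan = []
    for name in sorted(desired):
        if name not in recorded:
            plan.append(Action("create", name))
        elif desired[name] != recorded[name]:
            plan.append(Action("update", name))
    for name in sorted(recorded):
        if name not in desired:
            plan.append(Action("delete", name))
    return plan


def apply_plan(plan: list, desired: dict, recorded: dict) -> dict:
    """Execute the plan against a copy of the state and return the new state."""
    new_state = dict(recorded)
    for action in plan:
        if action.kind == "delete":
            del new_state[action.name]
        else:
            new_state[action.name] = desired[action.name]
    return new_state


# Made-up example data: one changed resource and one that is no longer wanted.
desired = {
    "resource_group": {"location": "eastus"},
    "database": {"tier": "basic"},
}
recorded = {
    "resource_group": {"location": "eastus"},
    "database": {"tier": "free"},
    "old_vm": {"size": "small"},
}

plan = make_plan(desired, recorded)
for action in plan:
    print(action.kind, action.name)  # the plan can be reviewed before applying
```

In this sketch the unchanged resource group produces no action at all, which is the point of an incremental plan: only the database update and the removal of the old VM are listed for review.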
To understand Terraform, it’s important to understand how the concepts of “providers” and “resources” apply to it.
A “provider” is used to interact with or manage the resources supported by the services to which you connect, often via an API. Providers are usually Infrastructure as a Service (IaaS), but they are not limited to that. Vault, for example, is a credential and secret key storage service. There are many providers that are available out-of-the-box with Terraform, including Azure, AWS, Datadog, Mailgun and many more. Additionally, you can write your own provider.
In the case of the TPC 2017 application, we are using Microsoft Azure services. With Terraform, we can generate most of the services we need in Azure. There are a few exceptions, like the application service. Fortunately, Terraform allows you to use Azure Resource Manager (“ARM”) templates for these. ARM templates work inside the Terraform scripts and maintain state, allowing change sets to be generated. Additionally, you can pass parameters into and out of these templates, making them almost seamless to use in the Terraform scripts. To learn more about ARM templates, see my previous blog post Azure Automation Made Easy.
Generally, “resources” are logical instances (components) of infrastructure. Examples of these are load balancers, databases, and virtual machines. Because Terraform is focused on infrastructure items, the built-in resources were written for these. Since cloud providers continually add new types of services, Terraform may not support every one of them out of the box, but it allows custom items to be created using resources.
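One way to picture the relationship between the two concepts is a small Python sketch in which a provider is an adapter over a service's API and a resource is one instance that the provider manages. This is a toy model, not Terraform's plugin interface. `Provider`, `InMemoryProvider`, and the `fake-cloud` prefix are hypothetical names we made up for illustration.

```python
from typing import Protocol


class Provider(Protocol):
    """What a provider is assumed to offer in this toy model."""

    def create(self, resource_type: str, name: str, settings: dict) -> str: ...

    def delete(self, resource_id: str) -> None: ...


class InMemoryProvider:
    """Stand-in for a real service API; keeps its resources in a dictionary."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.resources = {}

    def create(self, resource_type: str, name: str, settings: dict) -> str:
        # A real provider would call the service's API here.
        resource_id = f"{self.prefix}/{resource_type}/{name}"
        self.resources[resource_id] = dict(settings)
        return resource_id

    def delete(self, resource_id: str) -> None:
        del self.resources[resource_id]


# Usage: one provider managing one "database" resource.
cloud: Provider = InMemoryProvider("fake-cloud")
db_id = cloud.create("database", "scores", {"tier": "basic"})
```

Swapping `InMemoryProvider` for another class with the same two methods would leave the calling code unchanged, which is roughly the benefit a provider abstraction gives you across platforms.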
Terraform is a great way to write infrastructure as code. Your infrastructure can be stored in Git along with the rest of your code, which means that you can now version your infrastructure. And just like versioned code, you can revert to previous versions, easily see what has changed, and know who changed your infrastructure. Another benefit is the ability to hand over responsibility for the infrastructure to your developers without giving them access to the production environment. A developer can write code to set up servers, firewalls, etc. without having to access the system itself. The Terraform scripts can be reviewed or executed by a system administrator or even built into continuous integration.
We have found some key issues with Terraform as well. If you do not maintain the state file between executions, Terraform has to rescan your infrastructure entirely, which can take a long time. Additionally, if the Terraform script is not executed with full privileges, it might not be able to see all the resources it has created, which can cause issues. We did not run into this issue with the TPC project, but we have seen others run into it. The main problem we ran into is that Terraform does not have a debug mode. To test a script, you must run it against providers and create resources, which can be expensive, especially if you forget to destroy them afterwards.