Over the last three years, the technology platform at Motus has undergone a radical transformation. This time three years ago, we had three physical servers in a rack somewhere in Texas hosting two monolithic applications and one database. Every developer and tester had their own full Linux VM that required constant care and feeding to manage and update. Our production deployment process involved two people, three wiki pages of manual steps, and a lot of crossed fingers. We didn’t dare touch our systems for a quarter of the month lest something go wrong from which we could not recover.
Today, the story is different. Over the next few posts, we’d like to share with you the choices we made, the tools we’ve used (and built), and how those decisions have held up to the realities of production. This post will talk about our internal dev and QA systems, which were the first to undergo a change.
In the summer of 2014, we had just started moving from a monolithic application platform to a more service-oriented one. We decided to move our authentication and authorization functions to this new architecture, and it was getting complicated. Our developers and QA testers each had a VM that hosted their copy of a database, our legacy PHP application, and our Java mileage entry application. All of a sudden, we were asking them to manage not only those three components, but three more Java service and frontend applications. It became very clear very quickly that this was a) very time-consuming, and b) a royal pain in the rear. When we did production deployments, we had huge Puppet changes to make to configure every service in each environment. The complication factor made it very easy to make mistakes and very hard for certain key people to go on vacation.
Also in the summer of 2014, a technology called Docker was starting to really take hold. It was developed by a hosting company to make Linux containers more usable and portable, and it looked really interesting. Here was a way to package up not only your application but your server components, configuration bits, and dependencies, and ship it around your various environments. We had been playing around with it for a few months internally, mainly as a way of trying to make our dev databases easier to manage. But all of a sudden we thought that if we started packaging our apps as Docker containers including Java, Tomcat, and the app, we could make deploying the correct versions of everything much easier. Thus was born DynEnv.
Motus DynEnv is a mechanism for our internal users to manage suites of application versions, all wired together with a single domain name. It starts with the user creating a manifest that looks something like this:
This defines a list of services, their code branch, and build number (or latest build) that the user wants to deploy. The user can go to a web portal, specify the manifest they want, and deploy it. After deployment is successful, they are presented with a URL like http://myenv.dynenv that they can use. This radically simplifies the process of managing software builds and deployments because the versioning, configuration, and installation all happen behind the scenes.
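To make the manifest idea concrete, here is a minimal sketch in Python of how such a list of services, branches, and build numbers could be modeled. The field names (`services`, `branch`, `build`), the service names, and the `<branch>-<build>` tag scheme are all assumptions made for illustration; they are not DynEnv's actual format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceSpec:
    """One entry in a hypothetical manifest: a service, its branch, and a build."""
    name: str
    branch: str
    build: Optional[int] = None  # None means "use the latest build"

    def image_tag(self) -> str:
        # Hypothetical tag scheme: <branch>-<build>, or <branch>-latest.
        suffix = "latest" if self.build is None else str(self.build)
        return f"{self.branch}-{suffix}"


def parse_manifest(raw: dict) -> List[ServiceSpec]:
    """Turn a plain dict (for example, loaded from JSON) into ServiceSpec entries."""
    specs = []
    for name, entry in raw["services"].items():
        build = entry.get("build", "latest")
        specs.append(
            ServiceSpec(
                name=name,
                branch=entry["branch"],
                build=None if build == "latest" else int(build),
            )
        )
    return specs


# Illustrative manifest; every name and number here is made up.
manifest = {
    "name": "myenv",
    "services": {
        "userservice": {"branch": "master", "build": 142},
        "authservice": {"branch": "feature-login", "build": "latest"},
    },
}
specs = parse_manifest(manifest)
```

Treating a missing or `"latest"` build as `None` keeps the "pin a build or follow the newest one" choice explicit in one place.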
How does it work? We used a few key technologies:
- Docker. Each application or service was bundled into a Docker container, and every time we did a CI build, we pushed out a new Docker image.
- Hipache. Hipache is a name-based virtual host server with a Redis backend, which allowed us to dynamically add and remove virtual hosts. When we spun up a service in a Docker container, we added an entry in Hipache that pointed https://userservice-myenv.dynenv to http://server123:24521. That way each service had a consistent URL no matter where it was running and on what port.
- Groovy/Grails. We built the UI to DynEnv in Grails, and it was extremely quick and easy to use. Having Java support was key as we transitioned from running Docker commands via exec() to using a Java Docker API.
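The Hipache step above can be sketched as a small Python helper that builds the Redis commands for one route. It assumes the key layout described in Hipache's README, where a `frontend:<hostname>` list holds an identifier followed by one or more backend URLs. The function name, the `<service>-<env>.dynenv` hostname pattern, and the decision to reset the key first are illustrative assumptions rather than DynEnv's actual code. The helper only returns the commands, so it runs without a Redis server.

```python
from typing import List, Tuple


def hipache_route_commands(
    service: str, env: str, backend_url: str, domain: str = "dynenv"
) -> List[Tuple[str, ...]]:
    """Build the Redis commands that would register one virtual host."""
    hostname = f"{service}-{env}.{domain}"
    key = f"frontend:{hostname}"
    return [
        ("DEL", key),                       # drop any stale route for this host
        ("RPUSH", key, f"{service}-{env}"),  # first list element: an identifier
        ("RPUSH", key, backend_url),         # following elements: backend URLs
    ]


commands = hipache_route_commands(
    "userservice", "myenv", "http://server123:24521"
)
for command in commands:
    print(" ".join(command))
```

A real deployment would send these commands through a Redis client when a container starts and delete the key when it stops.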
One of the biggest challenges in moving to a services architecture is service discovery. Since all of our services talk to each other over HTTP, we decided early on that DNS would be our discovery mechanism. That proved to be a fortunate decision, because in the summer of 2014, the Docker/microservices/service discovery landscape was still very much in flux. Sticking with DNS meant that we didn’t need to test something too bleeding edge. It also made wiring together our dynamic environments pretty easy once we discovered Hipache. Each service is configured using environment variables, so when we create a dynamic environment, we simply pass the unique service URLs to all the clients, and when the services come up, they can talk to each other.
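As a rough illustration of that wiring, the sketch below derives one URL per service from the environment name and turns the result into `-e KEY=value` arguments for `docker run`. The variable naming scheme (`<SERVICE>_URL`), the helper names, and the service names are assumptions made for this example, not the names DynEnv actually uses.

```python
from typing import Dict, Iterable, List


def service_url(service: str, env: str, domain: str = "dynenv") -> str:
    """Name-based URL for one service in one environment (assumed pattern)."""
    return f"https://{service}-{env}.{domain}"


def build_environment(env: str, services: Iterable[str]) -> Dict[str, str]:
    """Map each service to a hypothetical <SERVICE>_URL environment variable."""
    return {f"{name.upper()}_URL": service_url(name, env) for name in services}


def docker_env_flags(variables: Dict[str, str]) -> List[str]:
    """Render variables as '-e KEY=value' arguments for 'docker run'."""
    flags: List[str] = []
    for key in sorted(variables):
        flags += ["-e", f"{key}={variables[key]}"]
    return flags


variables = build_environment("myenv", ["userservice", "authservice"])
flags = docker_env_flags(variables)
```

Because every container in the environment receives the same set of variables, each client can reach its peers by name without knowing which host or port they landed on.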
What did we learn?
DynEnv wasn’t automatically a bucket of sunshine and roses. There were quite a few things that we learned:
- Docker was not a mature technology in the summer of 2014. We had countless issues with DNS resolution, stability, disk space leakage, Docker Hub reliability, and more. It was pretty painful for a while – we stuck with it only because it was just a little less painful than the alternative.
- Hipache is not a well-maintained piece of software. It looked like it was going places, but since it was maintained by dotCloud, the company that became Docker, it quickly fell by the wayside. We’ve actually replaced it in production, but that’s another story.
- Date flipping. Due to some quirks of our legacy application, each developer and QA person still needs a database and Docker host that they can change the system date on (to simulate certain conditions in our code). This prevents us from moving to a more clustered approach.
- System composition. When we started on this, there wasn’t really a good tool (like Docker Compose) to wire apps together. We had to roll our own, and that was kind of painful.
Despite all the above, this mechanism is the only way that we have been able to scale our service development. We currently have north of 30 services, and it would be impossible to manage without this kind of system.
Stay tuned for stories from our production environment!