We began our hosting journey, as many do, on Heroku because they make it easy to deploy your application and automate some of the routine tasks associated with deployments, etc. However, as our team grew and our product matured, our needs have outgrown Heroku. I will dive into the history and reasons for this in a future blog post.
We decided to migrate our infrastructure to Kubernetes running on Amazon EKS. Although Google Kubernetes Engine has a slightly more mature Kubernetes offering and is more user-friendly; we decided to go with EKS because we already using other AWS services (including a previous migration from Heroku Postgres to AWS RDS). We are still in the process of moving our main website workloads to EKS, however we have successfully migrate all our staging and testing PR apps to run in a staging cluster. We developed a Slack chatops application (also running in the cluster) which automates all the common tasks of spinning up and managing a production-like cluster for a pull request. This allows our engineering team to iterate quickly and safely test code in a full production environment. Helm plays a central role when deploying our staging apps into the cluster. We use CircleCI to build docker containers for each PR push, which are then published to Amazon EC2 Container Service (ECR). An
upgrade-operator process watches the ECR repository for new containers and then uses Helm to rollout updates to the staging environments. All this happens automatically and makes it really easy for developers to get code onto servers quickly. The immutable and isolated nature of our staging environments means that we can do anything we want in that environment and quickly re-create or restore the environment to start over.
The next step in our journey is to migrate our production workloads to an EKS cluster and build out the CD workflows to get our containers promoted to that cluster after our QA testing is complete in our staging environments.