Editor's Note: This is Part 1 of a three-part interview we did with Jérôme Petazzoni, Senior Engineer at Docker, in San Francisco last month. Check out Part 2: How Docker Fits Into The Current DevOps Landscape and Part 3: How Docker Manages Its Massive Open Source Project.
dotCloud set out to be a Platform as a Service (PaaS) provider that would let you easily deploy and host apps written in any language. The team had always wanted to open source the platform, so they started releasing its source code piece by piece. When they released dotCloud's container engine as an open source project called Docker and shared it on Hacker News last year, they realized they had struck a nerve and solved a major pain point. Packaging and deploying apps in a reliable and repeatable way has always been a pain. Docker addressed this by packaging your app, along with everything it needs to run, into a container built from a Dockerfile, and then making it easy to ship that container to different servers and share it with other developers.
400,000+ downloads, 300+ contributors, and 10,000+ GitHub stars later, Docker is now one of the most popular open source projects in the world.
We sat down with Jérôme Petazzoni, a Senior Engineer who's been at Docker since they were dotCloud, to hear more about Docker's origins and find out about how they build Docker today. This is the first of three parts of our interview.
LS: So let's start at the beginning: how did Docker first come about?
J: When we were building the dotCloud PaaS, our edge against the competition was the polyglot aspect: the ability to easily add support for new languages, new databases, message queues, anything. This edge came from the fact that we were using containers.
At some point, we started to refactor and rebuild this container engine, which was the core of the dotCloud PaaS. The code was then three years old, and we wanted to adapt it to be more suited to the issues and challenges that we were facing. We also had this really strong urge to release this code as an open source product. Releasing the original code would have been extremely difficult, because it was tied to many internal mechanisms, such as service discovery, naming, or load balancing. Everything was deeply intertwined, and very specific to our platform. It would have been hard to open source something and have it be really useful for other people.
The first stab at open sourcing the platform was Hipache, the load balancer that we use for the dotCloud PaaS. (The name means "it's Apache, but for hipsters.") Hipache was designed from day one to be useful outside of the platform — and indeed, it has been used by others outside of the platform.
It was an HTTP load balancer. We needed something like Nginx, but allowing dynamic reconfiguration. We needed to be able to add new virtual hosts and new back-ends without having to restart anything. Our initial Nginx setup required us to regenerate a big configuration file for each minor configuration change. Each time we deployed applications, scaled them up or down, or moved back-ends, we had to generate this huge configuration file. Eventually, it became a major scalability issue.
So Hipache was completely dynamic, letting you add, remove, change configurations without restarting. And it wasn't tied at all to the code of the dotCloud platform. The dotCloud platform was more like an endorsement; a kind of "hey look, we're using that in production, with significant traffic!" And soon, other people started to use it as well.
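Editor's sketch: to make the idea of "dynamic reconfiguration" concrete, here is a minimal Python illustration of a routing table that can gain and lose back-ends while it keeps answering lookups, with no configuration file to regenerate and nothing to restart. This is not Hipache's code; the `DynamicRouter` class, its method names, and the host names and addresses are all hypothetical and chosen only for illustration.

```python
import threading


class DynamicRouter:
    """Hypothetical in-memory routing table, editable while lookups continue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._backends = {}  # virtual host -> list of back-end addresses
        self._counters = {}  # virtual host -> round-robin position

    def add_backend(self, vhost, address):
        # Register a back-end; the virtual host is created on first use.
        with self._lock:
            self._backends.setdefault(vhost, []).append(address)
            self._counters.setdefault(vhost, 0)

    def remove_backend(self, vhost, address):
        # Drop a back-end; forget the virtual host once it has none left.
        with self._lock:
            backends = self._backends.get(vhost, [])
            if address in backends:
                backends.remove(address)
            if not backends:
                self._backends.pop(vhost, None)
                self._counters.pop(vhost, None)

    def pick_backend(self, vhost):
        # Round-robin over whatever back-ends are registered right now.
        with self._lock:
            backends = self._backends.get(vhost)
            if not backends:
                return None
            index = self._counters[vhost] % len(backends)
            self._counters[vhost] += 1
            return backends[index]


router = DynamicRouter()
router.add_backend("app.example.com", "10.0.0.1:8080")
router.add_backend("app.example.com", "10.0.0.2:8080")
first = router.pick_backend("app.example.com")
second = router.pick_backend("app.example.com")
```

Every change is a small in-place update, so scaling an application up or down touches only that application's entry rather than one large shared configuration file.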
For our new container engine (which would eventually become Docker), we wanted the same process. The first versions looked more like a proof of concept than a polished product, obviously; but people got really excited about it nonetheless. The general consensus was "This is really great. We want that." It quickly became apparent that it made sense for us to focus more and more energy on that specific project; and eventually, we renamed the company to become Docker instead of dotCloud because that reflected what we were actually doing.
LS: Right. It comes down to the fact that setting up a virtual environment was a bigger pain point, right? And particularly for development purposes.
J: We see Docker being useful for development, and more. But at the same time, we had to have realistic expectations. We knew that a young project wouldn't be deployed straight to production like that. But it made a lot of sense to use it for development and testing workflows. Little by little, it would be suitable for more things: staging environments, then eventually production, all the way to large-scale applications, big data, huge clusters. But we would recommend the early adopters start with something which is not mission critical, and doesn't necessarily have huge uptime and reliability requirements. It's easier to deal with the occasional rough edges of the project in those scenarios. But from the beginning, we were thinking about Docker in production setups.
LS: So, question about that. You guys were actually using it in production, right? You were actually using Docker to deploy to production, right? For dotCloud.
J: The core technology behind Docker is the same as the one behind dotCloud. We had been running containers in production for more than three years. The key differences between dotCloud and Docker are mostly the bindings, the APIs, and some new concepts. (And, of course, the fact that Docker is written in Go!) For instance, we brought closer together the way we run containers, and the way we author those containers.
LS: Right. So having a registry, for instance.
J: Yes, the registry as well. The container engine of dotCloud looked a lot like Docker, but had a very different build process. There was no Dockerfile.
LS: Because it wasn't meant to be a platform in the beginning.
J: Right. The building and authoring tools for dotCloud containers put a strong emphasis on accountability and auditing. That made the whole process very cumbersome and complicated. Docker gave us an opportunity to reinvent that. We wanted the authoring process to be simple, fast, and easy to use by anyone.
Then we could reintroduce accountability and tracking (for instance with the Trusted Builds mechanism), and automate the build process with Dockerfiles. All that stuff came kind of naturally. As you fill one need, you see the next thing appear, and you fill it as well. You go on step by step like that.
LS: How important were the platform aspects early on? The ability to say "docker push" and "docker pull", for instance?
J: The ability to push and pull and share container images is something that we wanted to have very early, and it was actually in version 0.1, the first public release. It's something that we were envisioning at the very beginning. That was one of the pain points with the old system. Authoring used to be complicated, and slow. Faster than manual deployment, but still much slower than what you can see today when you "docker pull" something. We knew that it would be important, and needed to be there.
LS: Actually, it's funny because just the other day I was reading a blog post about Docker and someone had a really good analogy. What GitHub has done for git, you guys are doing for Linux. Right?
J: It's a nice analogy. I like the way it sounds. I hope it will eventually prove to be true.