By Gregory Koberger, Founder at ReadMe.
One feature request ReadMe kept getting was for a single-tenant, private instance that customers could deploy on their own servers, aka “on-premise.” The request made sense (HIPAA compliance, secret projects, etc.), but we felt we wouldn’t be able to do it until we had a whole Enterprise team. So, we told anyone who asked that we unfortunately weren’t able to do it. Eventually we got enough requests that we decided to investigate what it would take. It turned out to be much easier than we expected.
Dockerizing ReadMe
We knew the easiest way to deploy ReadMe would be to Dockerize it. Our original plan was to just send out Docker containers that people could install locally, so we started there.
Since we weren’t currently using Docker, we had to containerize our app. It sounds a lot scarier than it really is. We created a Dockerfile that looks like this:
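# Base image: Node on Debian wheezy
FROM node:wheezy
# Create the app directory and work from it
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# Copy the package manifest, then the rest of the source
COPY package.json /usr/src/app
COPY . /usr/src/app
# Install gulp and run our deploy task
RUN npm install -g gulp
RUN gulp deploy
# Environment and mail configuration
ENV NODE_ENV "enterprise"
ENV MAILGUN_USER "postmaster@readme.io"
ENV MAILGUN_PASS "********"
# The app listens on port 3000
EXPOSE 3000
CMD ["node", "server.js"]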
It’s basically just a deploy script, with some environment variables and configuration mixed in.
Overall, Dockerizing our app took us about 2-3 hours. The biggest blocker was the database. We had been using Compose.io for our backend database, so getting Mongo running inside Docker was the hardest part.
Managing Dockerfiles
We originally thought about doing it ourselves by sending customers a Docker container. However, it quickly spiraled: licensing, configuration, orchestration, updates, backups, reporting, monitoring, debugging… there was a lot to worry about. Also, our first customer was pretty adamant about ensuring the integrity of their private environment, meaning we wouldn’t have access to the machine it would be running on. Without being able to view the logs or the server, we felt like we’d be too in the dark.
We found a relatively new service called Replicated on StackShare that a bunch of companies we liked (NPM, Travis, etc) were already using. We decided to give it a shot and see how fast we could get up and running. It’s great; you just provide Replicated with a Dockerized version of your app and they provide the features that make it an enterprise-ready, installable application. After spending a few hours getting our services Dockerized, it took us less than a day to integrate and get our on-premise version of our product ready to go.
Deployments
For our customers, the experience is pretty simple. They run an install command on whatever server they want to deploy to, and it just works. Our customers get it set up and running almost as quickly as signing up on the site.
We didn’t want our customers to be stuck on old builds of ReadMe, which would happen if we just sent out a plain Dockerized version. Replicated made it easy for each individual customer to pick their own policy. We just upload our code, and the customer can either have it auto-deployed or manually approved. Either way, upgrading is incredibly simple: it’s one-click for the customer.
This is what the customer sees. They can click "Check Now" to look for a new version, and see their update history.
On our end, it’s as simple as deploying to Heroku. Dependencies are installed via our install script, and our Dockerfile is similar to Heroku’s buildpacks. We didn’t have to change our package.json at all; NPM and Node worked like normal.
Managing Environments
We use the same codebase for our on-premise deployments. Much like how we have a production, staging and development environment, we also have an enterprise environment.
There are two major differences in our enterprise builds:
- Customer-entered environment variables - These are things specific to the deployment, such as the URL or their Mailgun credentials. Replicated lets the customer manage these variables, and we just read them the same way as normal environment variables.
- Turning on/off irrelevant features - Not every feature, such as pricing, is relevant for enterprise builds. Enterprise builds also get some additional features that normally only our support staff would see, such as the ability to toggle on beta features or view boring metadata we normally hide away. We also didn’t want it to phone home at all, so those features are disabled.
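The two differences above can be sketched in a few lines of Node. This is a minimal illustration under stated assumptions, not ReadMe's actual code: `loadConfig`, the `BASE_URL` variable, and the flag names are hypothetical, while `NODE_ENV` and the `MAILGUN_*` names mirror the Dockerfile shown earlier.

```javascript
// Hypothetical config loader: builds settings and feature flags
// from ordinary environment variables.
function loadConfig(env) {
  const isEnterprise = env.NODE_ENV === 'enterprise';

  // Customer-entered values arrive as normal environment variables.
  // BASE_URL is an assumed name; MAILGUN_* mirror the Dockerfile.
  const required = isEnterprise ? ['BASE_URL', 'MAILGUN_USER', 'MAILGUN_PASS'] : [];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing required settings: ' + missing.join(', '));
  }

  return {
    isEnterprise,
    baseUrl: env.BASE_URL || 'http://localhost:3000',
    mailgun: { user: env.MAILGUN_USER, pass: env.MAILGUN_PASS },
    // Flag names are illustrative only.
    features: {
      pricing: !isEnterprise,    // not relevant on-premise
      analytics: !isEnterprise,  // enterprise builds do not phone home
      betaToggles: isEnterprise, // normally visible to support staff only
    },
  };
}

// Example with made-up values for an enterprise deployment.
const config = loadConfig({
  NODE_ENV: 'enterprise',
  BASE_URL: 'https://docs.example.internal',
  MAILGUN_USER: 'postmaster@example.internal',
  MAILGUN_PASS: 'secret',
});
console.log(config.features.pricing); // false
```

Failing fast on a missing setting is one possible design choice here: it surfaces a misconfigured install at startup rather than later, which matters more when you cannot see the customer's server.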
One potential negative is that there are old versions floating around. As a SaaS app, we normally have it easy, since everyone is on the same version. But since we're sending out snapshots, customers can end up on different versions. Every release is tracked by Replicated and associated with a git commit, so it's easy for us to get the matching version running locally.
But for us, the lack of updates is actually a feature. Enterprises don’t like when things change without warning, so we can now push new updates to all of our regular customers while enterprise customers don’t have to deal with buttons moving around or features showing up. Our SLA dictates that we don’t support older versions (except for security updates), and we haven’t had any issues with this yet. Most of our customers are happy to stay up-to-date; they just want to be in control of pressing the button.
Third-party Apps
We’re big fans of using third-party services for as much as possible: Stripe for billing, Mailgun for emails, Segment for analytics, etc. (full stack listed here). This is great for the public cloud version of your app, but becomes problematic when going on-premise, as your end customer will likely not want to set up a litany of third-party services to run your app.
So, we split things into two categories: things that could be removed (analytics, billing), and things that could be replaced (emails).
For the former, we just put those features behind a flag and hid them when the environment was Enterprise. We were already using basic feature flags to hide things like analytics – they’re just “if” statements in our code for the most part.
script(src=asset_url('js/bundle-dash.js'))
if env !== 'enterprise'
  script.
    ga('create', 'UA-52479696-1', 'auto');
    ga('send', 'pageview');
For the latter, we allowed the customer to configure the application with their own keys (you can also abstract things like SMTP out to allow your customer to use any SMTP server if needed).
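One way to picture the "replaceable" case is a small function that turns customer-supplied settings into mail transport options. This is a sketch under assumptions rather than our production code: `mailSettings` and the `SMTP_HOST`, `SMTP_PORT`, `SMTP_USER` and `SMTP_PASS` variables are hypothetical names, and the returned object is simply shaped like the host/port/auth options an SMTP library such as Nodemailer typically accepts.

```javascript
// Hypothetical helper: choose mail settings from customer-provided
// environment variables.
function mailSettings(env) {
  // Prefer the customer's own SMTP server when one is configured.
  if (env.SMTP_HOST) {
    const port = Number(env.SMTP_PORT || 587);
    return {
      host: env.SMTP_HOST,
      port,
      secure: port === 465, // implicit TLS is conventional on port 465
      auth: { user: env.SMTP_USER, pass: env.SMTP_PASS },
    };
  }
  // Otherwise use the customer's own Mailgun credentials over SMTP.
  return {
    host: 'smtp.mailgun.org',
    port: 587,
    secure: false,
    auth: { user: env.MAILGUN_USER, pass: env.MAILGUN_PASS },
  };
}

// Example with made-up values.
const settings = mailSettings({ SMTP_HOST: 'mail.example.internal', SMTP_PORT: '465' });
console.log(settings.secure); // true
```

Keeping the choice in one function means the rest of the app never needs to know which mail provider a given install uses.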
Communication and Support
One thing we failed at early on was encouraging our customers to come to us as soon as there was a problem. Since we couldn’t monitor the servers the way we do in production, we had some issues early on with things breaking and us not knowing about it. It caused a bit of frustration on their end and we were blissfully unaware (for a time).
So, we’ve worked to be better at communicating with customers using our on-premise version. Replicated lets you do some basic monitoring (since anything more would defeat the purpose of being on-premise), and we’ve gotten better at spotting issues.
Companies get a nice dashboard, so they know what's running and what's broken. We have four services running (main site, NGINX, Mongo and Redis), and they're easily monitored.
When you need to debug, you can have the company send you a “support bundle” zip file that contains logs for all these running services.
On our end, we budgeted two weeks of a developer’s time for each install. It’s a lot of time, but once everything is up and running it tends to keep working well. If we end up doing more of these, I’d likely hire an engineer to do this full-time (and potentially on-site). We haven’t had a need for that yet, though.
I would recommend doing your first installation with a friendly company. That way you can polish any rough edges in the process or product, before installing it at larger companies. Like anything, the first time never goes as smoothly as you expected!
Support and Contracts
We added a line to our ToS that everyone had to agree to before they could install the on-premise version. We don't obfuscate our code, and trust our customers to not violate the ToS. In the future, if this becomes a problem, we'll start minifying our Node code before shipping.
We also created a support SLA for companies using our on-prem version. We give everyone our direct phone number for 24/7 support (luckily this hasn’t been abused!), and guarantee 24-hour responses for non-vital issues. It’s important to be proactive: since you can’t see the site or detect issues yourself, early on you should check in every few days to make sure there are no issues.
Getting Stuck
We got stuck a lot, since it was our first time doing this. Luckily, Docker has a ton of stuff on Stack Overflow and their own site, so nothing blocked us too badly. Replicated had a Slack channel that was immensely useful. And if you get stuck, we're happy to help! Just email us (support@readme.io), and we'll do what we can to help you get started.
Why You Should Too
Companies often chase after that elusive next feature which will change everything and make revenue shoot up. Josh Pigford recently wrote a great article about it. And that feature doesn’t exist… with one exception! On-premise was a simple feature to add that directly impacted our growth. We went from our normal $59 plan to being able to sell our product for two orders of magnitude more. It’s had the single biggest ROI for us of anything we’ve done.
If you have customers asking you for on-premise, you don’t have an excuse… it’s a lot easier than you’d think!
Note: there's a similar service out there called Gravitational, but we didn't evaluate it since Replicated seemed to fit our needs.