Scaling Clearbit to 2M API Requests Per Day

Clearbit
APIs for determining who's behind an email address

By Harlow Ward, Developer and Co-founder at Clearbit.


Clearbit builds Business Intelligence APIs. Our suite of APIs is focused on Lead Enrichment and Automated Research.

Clearbit lookup example

Our goal is to help modern businesses make better data-driven decisions. Our platform aggregates data from hundreds of public sources and packages it up into beautifully hand-crafted JSON payloads.

Customers use our APIs to:

  • Give their sales team more information on customers, leads, and prospects.
  • Integrate and surface person/company data to the end-users of their systems.
  • Underwrite transactions and reduce fraud.

Outside of our paid products we also love releasing free products. These bite-sized APIs are hyper-focused on helping designers and developers enhance the user experience of their tools and systems.

A few of these freebies include:


Engineering at Clearbit

Our engineering team consists of three developers: Alex MacCaw (also our fearless CEO), Rob Holland, and myself.

We are a small dev team, and that means we all wear a lot of hats. Day-to-day, it’s not uncommon to jump between Frontend HTML/JS/CSS, API design, Service administration, DB administration, Infrastructure management, and of course a little customer support.


Services Everywhere

We made the decision early on to build a microservice-first architecture. This means our system is composed of lots of tiny Single Responsibility Services (SRS anyone?).

In general these services are written in Ruby, leverage Sinatra to expose JSON endpoints, and use RSpec to verify accuracy. Each service maintains its own datastore; depending on the service's needs we’ll typically choose from Amazon RDS, Amazon DynamoDB, or hosted Elasticsearch with Found.
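
As a rough illustration of the shape of such a service (not Clearbit's actual code), here is a minimal sketch of a single-responsibility JSON endpoint. To keep it runnable with the standard library alone, a plain Rack-style callable stands in for a Sinatra route. The `CompanyLookup` class, the `/companies/` path, and the in-memory `COMPANIES` hash are all hypothetical.

```ruby
require "json"

# Hypothetical single-responsibility service: look up a company by domain.
# A plain Rack-style callable stands in for a Sinatra route here.
class CompanyLookup
  # Stand-in for the service's own datastore (RDS, DynamoDB, ...).
  COMPANIES = {
    "example.com" => { "name" => "Example Inc", "domain" => "example.com" }
  }.freeze

  # Rack contract: take an env hash, return [status, headers, body].
  def call(env)
    domain = env["PATH_INFO"].to_s.delete_prefix("/companies/")
    company = COMPANIES[domain]
    return json(404, { "error" => "not_found" }) unless company

    json(200, company)
  end

  private

  # Build a Rack response triple with a JSON-encoded body.
  def json(status, payload)
    [status, { "content-type" => "application/json" }, [JSON.generate(payload)]]
  end
end
```

Because the service is just an object that responds to `call`, a spec can exercise it directly with a hand-built env hash and no HTTP server.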

There are some great arguments to be made about a MonolithFirst architecture. However, in our case, we felt our data boundaries were reasonably clear from the beginning, and this allowed us to make a few low-risk bets around building and running a microservice-first architecture. So far so good!

Our web services fall into two categories:

  1. External (publicly accessible, authenticated via API keys).
  2. Internal (accessible within VPC, locked down to specific security groups).

At any given time we’re running 70+ different internal services across a cluster of 18 machines. Our external (customer-facing) APIs are serving upwards of 2 million requests per day, and that number is rapidly increasing.


Early Days

When working with a microservice architecture it's difficult to overstate how important it is for a developer to be able to quickly push a new web service.

Our initial architecture was built on Amazon EC2 and leveraged dokku-alt (a Docker-powered mini-Heroku) to manage deployments.

Dokku-alt covered our basic requirements:

  • Git based deploys.
  • Managing ENV vars outside of config files.
  • Ability to rollback in case of emergency.

However, as the number of servers grew, some shortcomings of dokku-alt began to emerge. This was no fault of dokku-alt; we were just outgrowing our architecture.

As we added more machines the problems compounded. The per-machine configuration management we had initially loved quickly became unsustainable. On top of that, running git push production master simultaneously to every box in the cluster made for some nerve-racking deploys.

The state of our deployment system was beginning to take a toll on the team's productivity. It was time to make a change. We collectively decided to explore our options.


Current Stack

As our infrastructure grew, our deployment requirements also evolved:

  • Distributed configuration management.
  • Git push to only one repository.
  • Blue/Green style deploys.
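
The blue/green requirement can be sketched in a few lines. This is a generic illustration under stated assumptions, not the deployer described in this article: the `BlueGreenDeploy` class is hypothetical, and the start, health-check, and traffic-switch steps are injected as callables because the real mechanics are not shown here.

```ruby
# Blue/green deploy sketch: start the release on the idle color, check its
# health, and only then switch traffic. The three steps are injected
# callables so the sketch stays independent of any scheduler or proxy.
class BlueGreenDeploy
  COLORS = %w[blue green].freeze

  attr_reader :live

  def initialize(live:, start:, healthy:, switch:)
    @live = live
    @start = start
    @healthy = healthy
    @switch = switch
  end

  # The color that is not currently serving traffic.
  def idle
    (COLORS - [live]).first
  end

  # Returns true when traffic was moved to the new release.
  def deploy(release)
    target = idle
    @start.call(target, release)
    # On a failed health check the live color simply keeps serving.
    return false unless @healthy.call(target)

    @switch.call(target)
    @live = target
    true
  end
end
```

The point of the pattern is that a bad release never receives traffic, and rolling back is a matter of switching to the previous color.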

After looking into solutions like Deis and Flynn, we decided we'd feel happier with something with simpler semantics. We were attracted to Fleet because of its simplicity and flexibility, and the reputation of the CoreOS team.

Co-ordinating configuration between machines became a breeze with the use of etcd. Now when our deployer app builds a new docker container we can inject environment variables from etcd directly into the container.
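
A minimal sketch of the injection step, assuming the configuration has already been read out of etcd into a Ruby hash (the etcd client call is deliberately left out). The `docker_run_command` helper, the image name, and the key names are hypothetical.

```ruby
# Turn a hash of config values (assumed to have been fetched from etcd
# already) into the argument list for `docker run`.
def docker_run_command(image, config)
  # One `-e KEY=VALUE` pair per setting, sorted for a stable command line.
  env_flags = config.sort.flat_map { |key, value| ["-e", "#{key}=#{value}"] }
  ["docker", "run", "-d", *env_flags, image]
end

command = docker_run_command(
  "registry.example.com/person-api:abc123",
  { "RACK_ENV" => "production", "DATABASE_URL" => "postgres://db.internal/person" }
)
```

Returning an argument array rather than a single string means it can be passed as `system(*command)`, which skips the shell and avoids quoting problems with values that contain spaces or special characters.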

From there, we use Fleet to distribute the units across our cluster of servers. We’ve found fleet-ui super handy for visualizing the distribution of units across our cluster.


fleetui


To keep our operational expenses down, we have a static pool of on-demand EC2 instances running the etcd quorum, HAProxy, and several of the HTTP front ends. On top of that, we leverage a dynamic pool of EC2 Spot Instances to handle the dynamic nature of our workloads during times of extremely high throughput.

Word to the wise: don’t use Spot Instances as part of your etcd quorum. When the Spot Price rises above your bid (and it will), the Spot Instances will disappear without warning.
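
The arithmetic behind that warning is worth spelling out. etcd needs a majority of its members to agree before it can commit a write, so a small cluster can only lose a few machines. The helper names below are illustrative.

```ruby
# Majority quorum for an etcd cluster of `members` nodes.
def quorum_size(members)
  members / 2 + 1
end

# How many members can be lost while the cluster can still reach quorum.
def failure_tolerance(members)
  members - quorum_size(members)
end

[3, 5, 7].each do |n|
  puts "#{n} members: quorum #{quorum_size(n)}, tolerates #{failure_tolerance(n)} lost"
end
```

A three-member cluster tolerates one lost member and a five-member cluster tolerates two, so a batch of Spot Instances vanishing together can take the whole quorum down.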


Monitoring

It’s hard to overstate how important it’s been for us to have a deep and instantly available understanding of the current state of all our services.

Starting from the outside, we use Runscope to continually ping and analyze responses from our services. It’s been instrumental in verifying and maintaining the APIs with dynamic date versioning.

Digging a level deeper, we use Librato for measuring and monitoring lower level system behaviour. We’re diligent about creating alerts that will notify the team if anything seems awry.

Sentry notifies us immediately via Slack and Email if any of our services are throwing errors. We’re big believers in the Broken windows theory, and try to keep Sentry as clean as possible.

Finally, we use SumoLogic as our log aggregation platform. We run Sumo Collectors on each of our hosts. SumoLogic is our last line of defense for spotting inconsistent system behaviour and debugging historical issues.


Looking Forward

We have a private contrib repo with a handful of rack middlewares that are shared across our services. These middlewares dramatically cut down on duplication of code around Authentication, Authorization, Rate Limiting, and IP Restrictions.
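
To give a feel for what such a shared middleware can look like, here is a minimal Rack-style sketch that combines an API key check with a fixed-window rate limit. It is an illustration only, not the contents of the contrib repo: the `ApiKeyRateLimit` class, the `X-API-Key` header, the limits, and the in-memory counters are all assumptions.

```ruby
require "json"

# Rack-style middleware: rejects requests without a known API key and
# enforces a fixed-window rate limit per key. Counters live in memory, so
# this sketch is neither thread-safe nor shared between processes, and old
# windows are never evicted.
class ApiKeyRateLimit
  def initialize(app, api_keys:, limit:, window: 60, clock: -> { Time.now.to_i })
    @app = app
    @api_keys = api_keys
    @limit = limit     # max requests per key per window
    @window = window   # window length in seconds
    @clock = clock     # injectable so tests can control time
    @counts = Hash.new(0)
  end

  def call(env)
    key = env["HTTP_X_API_KEY"]
    return reject(401, "unauthorized") unless @api_keys.include?(key)

    # All requests in the same window share one counter per key.
    bucket = [key, @clock.call / @window]
    @counts[bucket] += 1
    return reject(429, "rate_limited") if @counts[bucket] > @limit

    @app.call(env)
  end

  private

  def reject(status, code)
    [status, { "content-type" => "application/json" }, [JSON.generate({ "error" => code })]]
  end
end
```

A service would wrap its app once, for example `ApiKeyRateLimit.new(app, api_keys: keys, limit: 600)`, and every request passes through the same checks before reaching the service's own code.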

In general, the shared middleware approach has worked well for us. However, as we look to the future and the team continues to experiment with new languages, the Ruby middlewares can’t be shared across new languages in the polyglot system.

Our goal is to push this shared logic out of the services and into the proxy layer (possibly with the help of VulcanD, Kong, or some custom HAProxy foo).

If you have made a transition like this before, or have an elegant idea of how to somersault over this hurdle, I’d love to buy you a beverage. harlow@clearbit.com

