Scaling Clearbit to 2M API Requests Per Day


By Harlow Ward, Developer and Co-founder at Clearbit.


Clearbit builds Business Intelligence APIs. Our suite of APIs is focused on Lead Enrichment and Automated Research.

[Image: Clearbit lookup example]

Our goal is to help modern businesses make better data-driven decisions. Our platform aggregates data from hundreds of public sources and packages it up into beautifully hand-crafted JSON payloads.

Customers use our APIs to:

  • Give their sales team more information on customers, leads, and prospects.
  • Integrate and surface person/company data to the end-users of their systems.
  • Underwrite transactions and reduce fraud.

Outside of our paid products, we also love releasing free products. These bite-sized APIs are hyper-focused on helping designers and developers enhance the user experience of their tools and systems.

A few of these freebies include:


Engineering at Clearbit

Our engineering team consists of three developers: Alex MacCaw (also our fearless CEO), Rob Holland, and myself.

We are a small dev team, and that means we all wear a lot of hats. Day-to-day, it’s not uncommon to jump between Frontend HTML/JS/CSS, API design, Service administration, DB administration, Infrastructure management, and of course a little customer support.


Services Everywhere

We made the decision early on to build a microservice-first architecture. This means our system is composed of lots of tiny Single Responsibility Services (SRS anyone?).

In general these services are written in Ruby, leverage Sinatra to expose JSON endpoints, and use RSpec to verify accuracy. Each service maintains its own datastore; depending on the service's needs we’ll typically choose from Amazon RDS, Amazon DynamoDB, or hosted Elasticsearch with Found.

There are some great arguments to be made for a MonolithFirst architecture. However, in our case, we felt our data boundaries were reasonably clear from the beginning, and this allowed us to make a few low-risk bets around building and running a microservice-first architecture. So far so good!

Our web services fall into two categories:

  1. External (publicly accessible, authenticated via API keys).
  2. Internal (accessible within VPC, locked down to specific security groups).

At any given time we’re running 70+ different internal services across a cluster of 18 machines. Our external (customer-facing) APIs are serving upwards of 2 million requests per day, and that number is rapidly increasing.


Early Days

When working with a microservice architecture it's difficult to overstate how important it is for a developer to be able to quickly push a new web service.

Our initial architecture was built on Amazon EC2 and leveraged dokku-alt (a Docker-powered mini-Heroku) to manage deployments.

Dokku-alt covered our basic requirements:

  • Git based deploys.
  • Managing ENV vars outside of config files.
  • Ability to rollback in case of emergency.

However, as the number of servers grew, some shortcomings of dokku-alt began to emerge. This was no fault of dokku-alt; we were just outgrowing our architecture.

As we added more machines the problems compounded. The per-machine configuration management we had initially loved quickly became unsustainable. On top of that, running git push production master simultaneously to every box in the cluster made for some nerve-racking deploys.

The state of our deployment system was beginning to take a toll on the team's productivity. It was time to make a change. We collectively decided to explore our options.


Current Stack

As our infrastructure grew, our deployment requirements also evolved:

  • Distributed configuration management.
  • Git push to only one repository.
  • Blue/Green style deploys.

After looking into solutions like Deis and Flynn, we decided we'd feel happier with something with simpler semantics. We were attracted to Fleet because of its simplicity and flexibility, and the reputation of the CoreOS team.

Coordinating configuration between machines became a breeze with the use of etcd. Now when our deployer app builds a new Docker container we can inject environment variables from etcd directly into the container.
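As a rough illustration of that injection step, here is a minimal sketch that turns a set of configuration values into `docker run` arguments. The `docker_run_command` helper, the image name, and the config keys are all hypothetical, and the hash simply stands in for values that would be read from etcd; this is not our actual deployer code.

```ruby
# Hypothetical helper: builds a `docker run` command that passes each
# config value into the container as an environment variable (-e KEY=VALUE).
def docker_run_command(image, config)
  # Sort the keys so the generated command is stable between deploys.
  env_flags = config.sort.flat_map { |key, value| ["-e", "#{key}=#{value}"] }
  ["docker", "run", "-d", *env_flags, image]
end

# Stand-in for key/value pairs fetched from etcd (assumed names and values).
config = {
  "RACK_ENV" => "production",
  "DATABASE_URL" => "postgres://db.internal/example"
}

command = docker_run_command("example/service:abc123", config)

# An argument array can be handed straight to Kernel#system without a shell:
#   system(*command)
puts command.inspect
```

Keeping the command as an array rather than one interpolated string avoids shell-quoting problems when a value contains spaces or special characters.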

From there, we use Fleet to distribute the units across our cluster of servers. We’ve found fleet-ui super handy for visualizing the distribution of units across our cluster.


[Image: fleet-ui]


To keep our operational expenses down, we have a static pool of on-demand EC2 instances running the etcd quorum, HAProxy, and several of the HTTP front ends. On top of that, we leverage a dynamic pool of EC2 Spot Instances to handle the dynamic nature of our workloads during times of extremely high throughput.

Word to the wise: Don’t use Spot Instances as part of your etcd quorum. When the Spot Price rises above your bid (and it will), your Spot Instances will disappear with little or no warning.
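The reason this matters is that etcd needs a majority of its members to agree before it can accept writes. The sketch below works through that arithmetic; the cluster sizes are illustrative only and are not a statement of how many etcd members we run.

```ruby
# Number of members that must be reachable for the cluster to make progress.
def quorum_size(members)
  members / 2 + 1
end

# Number of members that can be lost while a majority still remains.
def failure_tolerance(members)
  members - quorum_size(members)
end

[3, 5, 7].each do |members|
  puts "#{members} members: quorum #{quorum_size(members)}, " \
       "tolerates #{failure_tolerance(members)} failures"
end
# 3 members: quorum 2, tolerates 1 failures
# 5 members: quorum 3, tolerates 2 failures
# 7 members: quorum 4, tolerates 3 failures
```

If a batch of Spot Instances is reclaimed at once, a cluster can easily lose more members than it tolerates, which is why the quorum belongs on the on-demand pool.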


Monitoring

It’s hard to overstate how important it’s been for us to have a deep and instantly available understanding of the current state of all our services.

Starting from the outside, we use Runscope to continually ping and analyze responses from our services. It’s been instrumental in verifying and maintaining the APIs with dynamic date versioning.

Digging a level deeper, we use Librato for measuring and monitoring lower level system behaviour. We’re diligent about creating alerts that will notify the team if anything seems awry.

Sentry notifies us immediately via Slack and Email if any of our services are throwing errors. We’re big believers in the Broken Windows theory, and try to keep Sentry as clean as possible.

Finally, we use SumoLogic as our log aggregation platform. We run Sumo Collectors on each of our hosts. SumoLogic is our last line of defense for spotting inconsistent system behaviour and debugging historical issues.


Looking Forward

We have a private contrib repo with a handful of rack middlewares that are shared across our services. These middlewares dramatically cut down on duplication of code around Authentication, Authorization, Rate Limiting, and IP Restrictions.
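To give a feel for what such a middleware looks like, here is a minimal sketch that combines API key authentication with a fixed-window rate limit. It is not code from our contrib repo: the `ApiKeyRateLimiter` class, the `X-API-Key` header, the limits, and the in-memory counter are all assumptions made for illustration. It relies only on the Rack calling convention (`call(env)` returning status, headers, and body), so it needs no gems to run.

```ruby
require "json"

# Hypothetical Rack middleware: rejects unknown API keys and enforces a
# fixed-window request limit per key. Counts live in memory, so this sketch
# only suits a single process and never expires old windows.
class ApiKeyRateLimiter
  def initialize(app, valid_keys:, limit:, window: 60, clock: -> { Time.now.to_i })
    @app = app
    @valid_keys = valid_keys
    @limit = limit      # max requests per key per window
    @window = window    # window length in seconds
    @clock = clock      # injectable so tests can control time
    @counts = Hash.new(0)
  end

  def call(env)
    key = env["HTTP_X_API_KEY"] # header name is an assumption
    return error(401, "invalid API key") unless @valid_keys.include?(key)

    # All requests in the same window share one counter bucket.
    bucket = [key, @clock.call / @window]
    @counts[bucket] += 1
    return error(429, "rate limit exceeded") if @counts[bucket] > @limit

    @app.call(env)
  end

  private

  def error(status, message)
    [status, { "content-type" => "application/json" }, [JSON.generate({ "error" => message })]]
  end
end

# A lambda stands in for a Sinatra service; any Rack-compatible app works.
service = ->(_env) { [200, { "content-type" => "application/json" }, ['{"ok":true}']] }
app = ApiKeyRateLimiter.new(service, valid_keys: ["secret"], limit: 2)

status, _headers, body = app.call("HTTP_X_API_KEY" => "wrong")
puts "#{status} #{body.first}"
```

Because the middleware wraps any Rack app, one class like this can sit in front of every Ruby service, which is exactly the duplication it removes.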

In general, the shared middleware approach has worked well for us. However, as we look to the future and the team continues to experiment with new languages, the Ruby middlewares can’t be shared across new languages in the polyglot system.

Our goal is to push this shared logic out of the services and into the proxy layer (possibly with the help of VulcanD, Kong, or some custom HAProxy foo).

If you have made a transition like this before, or have an elegant idea of how to somersault this hurdle, I’d love to buy you a beverage. harlow@clearbit.com

