Trey Tacon

Originally, we had a single MongoDB replica set that we stored everything on. As we scaled, we realized two things:

  • A single Mongo replica set wasn’t going to cut it for our many fast-growing collections.
  • Analytics and rich searching don’t scale well in Mongo.

To solve the first problem, we now run multiple large-scale Mongo deployments with a mix of replica sets and sharded replica sets (depending on the application activity for the given database). To solve the second, we now run multiple large Elasticsearch deployments to provide the majority of our rich searching functionality.

We also heavily use Redis across the entire platform for things like distributed locking, caching, and backing part of our job queuing layer. This has led to our most recent (and ongoing!) scaling challenge.
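As an illustration of the distributed-locking use case, here is a minimal sketch of the common lock pattern built on Redis's `SET key value NX PX ttl`: acquire only if the key is absent, and release only if the caller still owns the token. To stay runnable without a server, it uses an in-memory stand-in; `fakeRedis`, `acquire`, `release`, and the key and token names are all hypothetical, and this is not necessarily how Mixmax implements its locks.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is one lock record: who holds it and when it expires.
type entry struct {
	token   string
	expires time.Time
}

// fakeRedis is a hypothetical in-memory stand-in for a Redis server,
// used only so this sketch runs on its own.
type fakeRedis struct {
	mu   sync.Mutex
	keys map[string]entry
}

func newFakeRedis() *fakeRedis {
	return &fakeRedis{keys: make(map[string]entry)}
}

// acquire mirrors `SET key token NX PX ttl`: it succeeds only if the key
// is absent or its previous holder's TTL has already expired.
func (r *fakeRedis) acquire(key, token string, ttl time.Duration, now time.Time) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if e, held := r.keys[key]; held && now.Before(e.expires) {
		return false
	}
	r.keys[key] = entry{token: token, expires: now.Add(ttl)}
	return true
}

// release deletes the key only if it still holds the caller's token, so a
// worker whose lock expired cannot release a lock that now belongs to
// someone else. Against a real server this check-and-delete must be atomic.
func (r *fakeRedis) release(key, token string, now time.Time) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	e, held := r.keys[key]
	if !held || !now.Before(e.expires) || e.token != token {
		return false
	}
	delete(r.keys, key)
	return true
}

// demo runs a small two-worker scenario and returns each call's outcome.
func demo() []bool {
	r := newFakeRedis()
	start := time.Unix(0, 0)
	ttl := 5 * time.Second
	late := start.Add(6 * time.Second) // after worker-a's TTL has passed
	return []bool{
		r.acquire("lock:job-42", "worker-a", ttl, start), // first caller wins
		r.acquire("lock:job-42", "worker-b", ttl, start), // lock is held
		r.release("lock:job-42", "worker-b", start),      // not the owner
		r.acquire("lock:job-42", "worker-b", ttl, late),  // old lock expired
		r.release("lock:job-42", "worker-a", late),       // stale owner
	}
}

func main() {
	fmt.Println(demo())
}
```

The TTL is what keeps a crashed worker from holding a lock forever, and the token check is what keeps a slow worker from releasing a lock it no longer owns.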

How Mixmax Uses Node and Go to Process 250M Events a day - Mixmax Tech Stack (stackshare.io)
Shared insights on Meteor and Node.js

Mixmax was originally built using Meteor as a single monolithic app. As more users began to onboard, we started noticing scaling issues, and so we broke out our first microservice: our Compose service, for writing emails and Sequences, was born as a Node.js service. Soon after that, we broke out all recipient searching and storage functionality to another Node.js microservice, our Contacts service. We continued this practice and broke out many more microservices, which helped the system scale more appropriately by making each microservice’s responsibilities explicit.

Shared insights on strongDM

In a distributed world, auditing database access, credential management and rotation, and onboarding can be a nightmare. Someone running a query on a staging DB that’s taking down the test environment forever? Good luck hunting that down. Have a new engineer onboard and they need to run an audit query on the staging DB to see if their new code might break an old schema? Have fun configuring that. Need to run your periodic credential rotation? Enjoy. This was a huge pain point not only for our team but for me personally, and then strongDM came into the picture.

strongDM acts as a control plane to manage access to every database and server. By centralizing all database credentials and SSH keys in strongDM, onboarding and offboarding become much faster.

I seriously cannot imagine working without strongDM now. It’s one of those tools that seamlessly fits into your workflow and you can’t envision work without it.


A huge part of our continuous deployment practices is to have granular alerting and monitoring across the platform. To do this, we run Sentry on-premise, inside our VPCs, for our event alerting, and we run an awesome observability and monitoring system consisting of StatsD, Graphite and Grafana. We have dashboards using this system to monitor our core subsystems so that we can know the health of any given subsystem at any moment. This system ties into our PagerDuty rotation, as well as alerts from some of our Amazon CloudWatch alarms (we’re looking to migrate all of these to our internal monitoring system soon).
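To make the StatsD part of that pipeline concrete, here is a small sketch of the plain-text StatsD line format (`name:value|type`) sent over UDP, which a StatsD daemon can aggregate and forward to Graphite for Grafana to chart. The metric names and the local address are hypothetical (8125 is the conventional StatsD port), and the helper functions are illustrative rather than taken from Mixmax's code.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// counterLine formats a counter increment, e.g. "compose.emails_sent:1|c".
func counterLine(name string, n int64) string {
	return fmt.Sprintf("%s:%d|c", name, n)
}

// gaugeLine formats a gauge (a point-in-time value), e.g. "queue.depth:42|g".
func gaugeLine(name string, v int64) string {
	return fmt.Sprintf("%s:%d|g", name, v)
}

// timingLine formats a timer in milliseconds, e.g. "compose.latency:250|ms".
func timingLine(name string, ms int64) string {
	return fmt.Sprintf("%s:%d|ms", name, ms)
}

// send writes newline-separated metrics as a single UDP datagram.
// UDP is fire-and-forget, so no error here does not prove delivery.
func send(addr string, lines ...string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(strings.Join(lines, "\n")))
	return err
}

func main() {
	// Hypothetical metric names for a Compose-like service.
	lines := []string{
		counterLine("compose.emails_sent", 1),
		gaugeLine("compose.queue_depth", 42),
		timingLine("compose.send_latency", 250),
	}
	fmt.Println(strings.Join(lines, "\n"))

	// Assumes a StatsD daemon listening locally on the conventional port.
	if err := send("127.0.0.1:8125", lines...); err != nil {
		fmt.Println("send failed:", err)
	}
}
```

Because the metrics travel over UDP, instrumented services do not block on the monitoring system, which is one reason this style of instrumentation is cheap to add broadly.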


As Mixmax began to scale super quickly, with more and more customers joining the platform, we started to see that the Meteor app was still having a lot of trouble scaling due to how it tried to provide its reactivity layer. To be honest, this led to a brutal summer of playing Galaxy container whack-a-mole as containers would saturate their CPU and become unresponsive. I’ll never forget hacking away at building a new microservice to relieve the load on the system so that we’d stop getting paged every 30-40 minutes. Luckily, we’ve never had to do that again! After stabilizing the system, we had to build out two more microservices to provide the necessary reactivity and authentication layers as we rebuilt our Meteor app from the ground up in Node.js. This also had the added benefit of letting us deploy the entire application in the same AWS VPCs. Thankfully, AWS had also released their ALB product, so we didn’t have to build and maintain our own websocket layer in Amazon EC2. All of our microservices, except for one special Go one, are now in Node with an nginx frontend on each instance, all behind AWS Elastic Load Balancing (ELB) or ALBs running in AWS Elastic Beanstalk.

Wojciech Bator · November 6th 2020 at 10:14AM

You can consider using Go for more mission-critical services when extending the product. Recently, I moved from Node to Go when building a tool that processes a large number of various files concurrently, and the goroutines plus channels combo is pretty powerful. For a web application server I'd stick with Node.js for its easy prototyping, wider ecosystem, and good frameworks, but any CPU-heavy processing I'd pass to Go. Node just chokes when there's longer synchronous processing to be done.

Go also fits quite well in a distributed ecosystem with its Circuit.
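The goroutines-plus-channels pattern mentioned in the comment above can be sketched as a small worker pool. This is only an illustration under assumptions: `processFile`, `processAll`, and the in-memory "files" are hypothetical stand-ins (word counting takes the place of real CPU-heavy work), not the commenter's actual tool.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// result carries one file's outcome back to the collector.
type result struct {
	name  string
	words int
}

// processFile is a placeholder for CPU-heavy work; here it counts words.
func processFile(name, contents string) result {
	return result{name: name, words: len(strings.Fields(contents))}
}

// processAll fans file names out to a fixed number of worker goroutines
// over one channel and collects their results over another.
func processAll(files map[string]string, workers int) map[string]int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The map is only read concurrently, never written, so this is safe.
			for name := range jobs {
				results <- processFile(name, files[name])
			}
		}()
	}

	// Feed jobs, then close the channel so the workers' range loops end.
	go func() {
		for name := range files {
			jobs <- name
		}
		close(jobs)
	}()

	// Close results once every worker has finished, ending the loop below.
	go func() {
		wg.Wait()
		close(results)
	}()

	out := make(map[string]int, len(files))
	for r := range results {
		out[r.name] = r.words
	}
	return out
}

func main() {
	files := map[string]string{
		"a.txt": "one two three",
		"b.txt": "four five",
		"c.txt": "",
	}
	fmt.Println(processAll(files, 2))
}
```

Capping the number of workers keeps CPU-bound work from oversubscribing the machine, while the channels handle hand-off between goroutines without explicit locks.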

Shared insights on Node.js and Golang

Building a communication platform means processing a TON of data. Our backend, built primarily in Node.js and Go, processes up to 250M events a day with 200k/minute at peak load. As the glue for an organization’s communication, not only are we processing a huge number of internal events, but we’re also processing data from external sources like CRMs and ATSs totaling 3.2 million events and amounting to a data volume exceeding 14 GB each hour. We've already scaled our platform up 2x in the past 3 months and plan to grow another 10x this year, all while maintaining strict "three 9's" uptime that our customers expect, as they rely on Mixmax all day to get their work done.
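For a rough sense of scale, the headline figures can be converted into per-minute and per-second rates. The sketch below assumes the 250M figure is spread evenly over a 24-hour day, which is a simplification; the function names are illustrative.

```go
package main

import "fmt"

// perMinute converts a daily event count into an average per-minute rate,
// assuming events are spread evenly over 24 hours.
func perMinute(eventsPerDay float64) float64 {
	return eventsPerDay / (24 * 60)
}

// perSecond converts a per-minute rate into a per-second rate.
func perSecond(eventsPerMinute float64) float64 {
	return eventsPerMinute / 60
}

func main() {
	const dailyEvents = 250e6 // "up to 250M events a day"
	const peakPerMin = 200e3  // "200k/minute at peak load"

	avg := perMinute(dailyEvents)
	fmt.Printf("average: %.0f events/minute (%.0f/second)\n", avg, perSecond(avg))
	fmt.Printf("peak:    %.0f events/minute (%.0f/second)\n", peakPerMin, perSecond(peakPerMin))
	fmt.Printf("peak is %.2fx the daily average\n", peakPerMin/avg)
}
```

Under that even-spread assumption, a 250M-event day averages roughly 174k events per minute, so the quoted 200k/minute peak is about 1.15 times the average on the busiest days.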
