Using Kafka to Throttle QPS on MySQL Shards in Bulk Write APIs

Pinterest
Pinterest is a social bookmarking site where users collect and share photos of their favorite events, interests and hobbies. One of the fastest growing social networks online, Pinterest is the third-largest such network behind only Facebook and Twitter.

At Pinterest, backend core services are in charge of various operations on Pins, boards, and users from both Pinners and internal services. Pinners’ operations are classified as online traffic because they require a real-time response, while internal traffic is classified as offline because it is processed asynchronously and no real-time response is required.

The services’ read and write APIs are shared between these two kinds of traffic. The majority of Pinners’ operations on a single object (such as creating a board, saving a Pin, or editing user settings through web or mobile) are routed to one of the APIs to fetch and update data in datastores. Meanwhile, internal services use these APIs to take actions on a large number of objects on behalf of users (such as deactivating spam accounts or removing spam Pins).

To offload internal offline traffic from the APIs, so that online traffic can be handled exclusively with better reliability and performance, write APIs should support batches of objects. We proposed and implemented a bulk write platform on top of Kafka. The platform also allows high QPS from internal services to be supported more efficiently, without being rate limited, so that throughput stays high. In this post, we’ll cover the characteristics of internal offline traffic, the challenges we faced, and how we attacked them by building a bulk write platform in backend core services.

Datastores and write APIs

At Pinterest, MySQL is one of the major datastores for content created by users. To store billions of Pins, boards and other data for hundreds of millions of Pinners, many MySQL database instances form a MySQL cluster, which is split into logical shards to manage and serve the data more efficiently. All data is split across these shards.

To read and write data efficiently for one user, that user’s data is stored in the same shard, so that APIs only need to fetch data from one shard without fan-out queries to various shards. To prevent any single request from occupying MySQL database resources for a long time, every query is configured with a timeout.
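The effect of keeping one user's data on one shard can be sketched as follows. This is a minimal illustration, not Pinterest's actual sharding scheme: the shard count `NUM_SHARDS` and the modulo mapping in `shard_for_user` are assumptions made only for the example.

```python
# Hypothetical shard count; the article does not state the real number.
NUM_SHARDS = 4096


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user to one logical shard (modulo mapping is an assumption)."""
    return user_id % num_shards


def shards_for_objects(owner_ids):
    """Return the set of shards a request touches, given each object's owner."""
    return {shard_for_user(uid) for uid in owner_ids}


# A request on one user's objects touches a single shard (no fan-out),
# while a request spanning many users may touch several shards.
single_user = shards_for_objects([42, 42, 42])
many_users = shards_for_objects([1, 2, 3])
```

With this mapping, `single_user` contains one shard and `many_users` contains three, which is why bulk operations across users need to be grouped by shard before they are written.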

All write APIs of core services were originally built for online traffic from Pinners. They work well because only a single object is accepted (a Pinner operates on a single object most of the time) and the operation is lightweight. Even when Pinners perform a bulk operation, e.g. moving a number of Pins to a section of a board, performance is still good because the number of objects isn’t very big and the write APIs can handle them one by one.

Challenges

The situation changes as more and more internal services use existing write APIs for various bulk operations (such as removing many Pins for a spam user within a short period of time or backfilling a new field for a huge number of existing Pins). Because the write APIs can only handle one object at a time, these APIs see much higher traffic, with spikes.

To handle more traffic, the services can be autoscaled, but this does not necessarily solve the problem completely because the capacity of the system is restricted by the MySQL cluster. With the existing architecture, it’s hard to autoscale the MySQL cluster.

To protect the services and MySQL cluster, rate limiting is applied to write APIs.

Although throttling can help to some extent, it has several drawbacks that prevent backend core services from being more reliable and scalable.

  1. Online and offline traffic to an API affect each other. If a spike of internal offline traffic happens, online traffic to the same API sees higher latency and degraded performance, which impacts the user’s experience.
  2. As more and more internal traffic is sent to the API, rate limits need to be raised repeatedly and carefully so that the APIs can serve more traffic without affecting existing traffic.
  3. Rate limiting does not stop hot shards. When internal services write data for a specific user, e.g. ingesting a large number of feed Pins for a partner, all requests target the same shard. A hot shard is to be expected because of the spike of requests in a short period of time. The situation gets worse when update operations in MySQL are expensive.

As internal services need to handle a large number of objects within a short period of time and do not need a real-time response, requests that target the same shard can be combined and handled asynchronously with one shared query to MySQL, which improves efficiency and saves MySQL connection resources. All combined batch requests should be processed at a controlled rate to avoid hot shards.
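As an illustration of combining many single-object writes into one shared query, the sketch below builds a single parameterized `UPDATE` for a list of object IDs. The function name `build_bulk_update` and the table and column names are hypothetical; the `%s` placeholder style is the one used by common Python MySQL drivers, and the table and column names are assumed to come from trusted code rather than user input.

```python
def build_bulk_update(table, column, value, object_ids):
    """Build one UPDATE statement covering every object in the batch.

    Returns (sql, params) suitable for a DB-API cursor.execute call.
    Table and column names are interpolated directly, so they must be
    trusted constants; values are always passed as parameters.
    """
    if not object_ids:
        raise ValueError("object_ids must not be empty")
    placeholders = ", ".join(["%s"] * len(object_ids))
    sql = f"UPDATE {table} SET {column} = %s WHERE id IN ({placeholders})"
    params = [value, *object_ids]
    return sql, params


# One round trip to the shard instead of three separate queries.
sql, params = build_bulk_update("pins", "is_spam", 1, [10, 11, 12])
```

The batch only makes sense when every ID in `object_ids` lives on the same shard, which is what the batching described later guarantees.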

Bulk write architecture

The bulk write platform was designed to support high QPS from internal services with high throughput and zero hot shards. Migrating to the platform should also be straightforward: clients simply call the new APIs.

Bulk write APIs and Proxy

To support write operations (update, delete and create) on a batch of objects, a set of bulk write APIs is provided for internal services. These APIs accept a list of objects instead of a single object, which dramatically reduces QPS to the APIs compared to the regular write APIs.

The proxy is a Finagle service that maps incoming requests to different batching modules according to the type of objects. The batching modules combine requests to the same shard.
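The routing role of the proxy can be sketched in a few lines. The real proxy is a Finagle service; the plain Python below only illustrates dispatch by object type, and the `WriteRequest` fields, the `Proxy` class, and its `register` and `dispatch` methods are all hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    object_type: str  # e.g. "pin" or "board" (illustrative values)
    operation: str    # "create", "update" or "delete"
    object_id: int


class Proxy:
    """Routes each request to the batching module for its object type."""

    def __init__(self):
        self._modules = {}

    def register(self, object_type, module):
        # `module` is any callable that accepts a list of requests.
        self._modules[object_type] = module

    def dispatch(self, requests):
        by_type = defaultdict(list)
        for request in requests:
            by_type[request.object_type].append(request)
        results = {}
        for object_type, group in by_type.items():
            if object_type not in self._modules:
                raise KeyError(f"no batching module for {object_type!r}")
            results[object_type] = self._modules[object_type](group)
        return results
```

A caller would register one batching module per object type and then pass a mixed list of requests to `dispatch`.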

Batching Module

The batching module splits a batch request into small batches based on the operation type and object type, so that one batch of objects can be processed efficiently in MySQL, which has a timeout configured for each query.

The design rests on two major considerations:

  • Firstly, the write rate to every shard should be configurable to avoid hot shards, because shards may contain different numbers of records and perform differently. One batch request from the proxy contains objects on different shards. To control QPS accurately at the shard level, the batch request is split into batches by target shard. The ‘Shard Batching’ module splits requests by affected MySQL shard.
  • Secondly, each write operation has its own batch size. Operations on different object types perform differently because they update different numbers of tables. For instance, creating a new Pin may change four to five different tables, while updating an existing Pin may change only two. Also, update queries to different tables may take different lengths of time. Thus, a batch update for one object type may see different latencies at different batch sizes. To make batch updates efficient, the batch size is configured separately for each write operation. ‘Operation Batching’ further splits these requests by type of operation.
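The two splitting stages above can be sketched as one function: group by shard, then by operation, then cut each group into chunks of the batch size configured for that operation. The names `WriteRequest`, `BATCH_SIZES` and `split_batches`, and the batch sizes themselves, are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    operation: str  # "create", "update" or "delete"
    object_id: int
    shard_id: int


# Hypothetical per-operation batch sizes; real values would be tuned
# against the per-query timeout configured in MySQL.
BATCH_SIZES = {"create": 2, "update": 3, "delete": 5}


def chunk(items, size):
    """Cut a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def split_batches(requests, batch_sizes=BATCH_SIZES):
    """Shard batching, then operation batching, then size-limited chunks."""
    groups = defaultdict(list)
    for request in requests:
        groups[(request.shard_id, request.operation)].append(request)
    batches = []
    for (shard_id, operation), group in sorted(groups.items()):
        for part in chunk(group, batch_sizes[operation]):
            batches.append((shard_id, operation, part))
    return batches
```

Every resulting batch contains objects from exactly one shard and one operation type, so it can be written with one query and counted against that shard's rate limit.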

Rate Limiter with Kafka

All objects in a batch request from the batching module are on the same shard. A hot shard is to be expected if too many requests hit one specific shard. A hot shard affects all other queries to the same shard and degrades the performance of the system. To avoid this, all requests to one shard should be sent at a controlled rate, so that the shard is not overwhelmed and can handle requests efficiently. To achieve this, every shard needs its own rate limiter that controls all requests to that shard.

To support high QPS from internal clients at the same time, all of their requests should be stored temporarily in the platform and processed at a controlled speed. Kafka is a good fit for these purposes.

  1. Kafka can handle reads and writes at very high QPS.
  2. Kafka is a reliable distributed message storage system that can buffer batch requests so that they are processed at a controlled rate.
  3. Kafka rebalances load and manages consumers automatically.
  4. Each partition is assigned exclusively to one consumer (within the same consumer group), so the consumer can rate-limit its requests accurately.
  5. Requests in all partitions are processed by different consumer processors simultaneously, so throughput is very high.

Figure legend: P = partition, C = consumer processor.

Kafka Configuration

Firstly, each shard in the MySQL cluster has a matching partition in Kafka, so that all requests to that shard are published to the corresponding partition and processed by one dedicated consumer processor with accurate QPS. Secondly, a large number of consumer processors are running, so that at most one or two partitions are assigned to each consumer processor, to achieve maximum throughput.
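The sizing implied by this configuration can be worked through with a small sketch. Kafka's group coordinator performs the real partition assignment; the round-robin function below only illustrates how many consumer processors are needed so that none owns more than two partitions. The identity mapping in `partition_for_shard` and the helper names are assumptions.

```python
import math


def partition_for_shard(shard_id: int) -> int:
    """One partition per shard; the identity mapping is an assumption."""
    return shard_id


def min_consumers(num_partitions: int, max_per_consumer: int = 2) -> int:
    """Fewest consumers such that none owns more than `max_per_consumer`."""
    return math.ceil(num_partitions / max_per_consumer)


def assign_round_robin(num_partitions: int, num_consumers: int):
    """Illustrative assignment; Kafka's own assignor does this in practice."""
    assignment = {consumer: [] for consumer in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


# Five shards -> five partitions -> at least three consumer processors.
consumers_needed = min_consumers(5)
assignment = assign_round_robin(5, consumers_needed)
```

Because each partition has exactly one owner, the per-shard rate limit can be enforced inside a single process without coordination between consumers.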

Consumer Processor

The consumer processor rate-limits QPS on a shard in two steps:

  • Firstly, the number of requests that a consumer can pull from its partition at a time is configured.
  • Secondly, the consumer consults the shard configuration to get the precise number of batch requests that one shard can handle, and uses Guava RateLimiter to do rate control. For instance, some shards may only be able to handle low traffic because hot users are stored on them.
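The second step can be sketched with a small evenly spaced limiter. The platform itself uses Guava's `RateLimiter` in Java; the Python class below is a simplified stand-in, not a port of Guava, and the names `ShardRateLimiter`, `SHARD_QPS`, `DEFAULT_QPS` and `limiter_for_shard`, as well as the rates, are hypothetical. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time

# Hypothetical per-shard limits (batch requests per second).
SHARD_QPS = {7: 2.0}   # e.g. a shard that stores hot users gets a low rate
DEFAULT_QPS = 10.0


class ShardRateLimiter:
    """Spaces permits evenly: at most `rate` acquisitions per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_free = clock()

    def acquire(self):
        """Block until the next permit is available; return the time waited."""
        now = self._clock()
        wait = max(0.0, self._next_free - now)
        if wait > 0:
            self._sleep(wait)
        self._next_free = max(now, self._next_free) + self._interval
        return wait


def limiter_for_shard(shard_id, clock=time.monotonic, sleep=time.sleep):
    """Look up the configured rate for a shard and build its limiter."""
    rate = SHARD_QPS.get(shard_id, DEFAULT_QPS)
    return ShardRateLimiter(rate, clock=clock, sleep=sleep)
```

A consumer processor would call `acquire()` once before writing each batch to MySQL, so the shard never receives more batches per second than its configured rate.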

Consumer processors handle different failures with appropriate actions. To handle congestion in the thread pool, the consumer processor retries the task after a configured back-off time if the thread pool is full and busy with existing tasks. To handle failures in MySQL shards, it checks the response from the MySQL cluster to catch errors and exceptions and takes appropriate action for each kind of failure. For instance, when it sees two consecutive timeout failures, it sends an alert to the system admin and stops pulling and processing requests for a configured wait time. With these mechanisms, the success rate of request processing is high.
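Both failure paths can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: a bounded `queue.Queue` stands in for the thread pool, `ShardTimeout` is a hypothetical exception raised by the write function, retrying the same batch after a pause is an assumption, and `max_pauses` is added only so the sketch cannot loop forever.

```python
import queue


class ShardTimeout(Exception):
    """Raised by the (hypothetical) write function when MySQL times out."""


def submit_with_backoff(task_queue, task, sleep,
                        backoff_seconds=0.5, max_retries=5):
    """Enqueue a task; back off and retry while the bounded queue is full."""
    for attempt in range(max_retries + 1):
        try:
            task_queue.put_nowait(task)
            return attempt  # number of retries that were needed
        except queue.Full:
            if attempt == max_retries:
                raise
            sleep(backoff_seconds)


def consume(batches, write_batch, alert, sleep,
            wait_seconds=30.0, timeout_threshold=2, max_pauses=3):
    """Write batches in order; alert and pause after consecutive timeouts."""
    consecutive_timeouts = 0
    pauses = 0
    processed = 0
    index = 0
    while index < len(batches):
        try:
            write_batch(batches[index])
        except ShardTimeout:
            consecutive_timeouts += 1
            if consecutive_timeouts >= timeout_threshold:
                if pauses == max_pauses:
                    raise  # give up instead of retrying forever
                alert(f"{consecutive_timeouts} consecutive timeouts")
                sleep(wait_seconds)  # stop pulling for the configured wait
                pauses += 1
                consecutive_timeouts = 0
            continue  # retry the same batch (assumed behaviour)
        consecutive_timeouts = 0
        processed += 1
        index += 1
    return processed
```

Passing `sleep` and `alert` in as arguments keeps the policy separate from how waiting and alerting are actually done in a given deployment.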

Results

Several use cases from internal teams have been launched on the bulk write platform with good performance. For instance, feed ingestion for partners now uses the platform, with clear improvements in both the time spent and the success rate of the process. The result of ingesting around 4.3 million Pins is shown as follows.

Also, hot shards, which previously caused many issues during feed ingestion, are no longer seen.

What’s next

As more internal traffic is moved from the existing write APIs to the new bulk write APIs, the APIs for online traffic perform better, with less downtime and lower latency. This helps make systems more reliable and efficient.

The next step for the new platform is to support more cases by extending existing operations on more object types.

Acknowledgments

Thanks to Kapil Bajaj, Carlo De Guzman, Zhihuang Chen and the rest of the Core Services team at Pinterest! Also special thanks to Brian Pin and Sam Meder from the Shopping Infra team for providing support.
