Optimizing Pinterest’s Data Ingestion Stack: Findings and Learnings

Pinterest
Pinterest's profile on StackShare is not actively maintained, so the information here may be out of date.

By Ping-Min Lin | Software Engineer, Logging Platform


At Pinterest, the Logging Platform team maintains the backbone of the data ingestion infrastructure, which ingests terabytes of data per day. Because these systems are so widespread and sit so deep in the stack, it is extremely important that the services powering these pipelines are efficient. Along our journey of continuous improvement, we’ve figured out basic but useful patterns and learnings that can be applied in general, and hopefully to your systems as well.

MemQ: Achieving memory-efficient batch data delivery using Netty

MemQ is the next-gen data ingestion platform built in-house and recently open-sourced by the Logging Platform team. When designing the service, we tried hard to maximize the efficiency of our resources; specifically, we focused on reducing GC pressure by using off-heap memory. Netty was chosen as our low-level networking framework due to its great balance between flexibility, performance, and sophisticated out-of-the-box features. For example, we used ByteBuf heavily throughout the project. ByteBufs are the building blocks of data within Netty. They are similar to Java NIO ByteBuffers, but give developers much more control over the lifecycle of the objects by providing a “smart pointer” approach to customized memory management using manual reference counting. By using ByteBufs, we managed to transport messages with a single copy of the data by passing off-heap network buffer pointers, further reducing the cycles spent on garbage collection.
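To make the “smart pointer” idea concrete, here is a minimal sketch of manual reference counting over off-heap memory. It uses only the JDK; `RefCountedBuffer` is a hypothetical stand-in written for illustration and is not Netty's ByteBuf API, which is considerably richer.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted buffer; not Netty's ByteBuf API.
final class RefCountedBuffer {
    private final ByteBuffer data;                              // off-heap (direct) storage
    private final AtomicInteger refCnt = new AtomicInteger(1);  // the creator holds the first reference

    RefCountedBuffer(byte[] payload) {
        data = ByteBuffer.allocateDirect(payload.length);
        data.put(payload);
        data.flip();
    }

    // Take an additional reference before handing the buffer to another owner.
    RefCountedBuffer retain() {
        int prev;
        do {
            prev = refCnt.get();
            if (prev == 0) throw new IllegalStateException("buffer already released");
        } while (!refCnt.compareAndSet(prev, prev + 1));
        return this;
    }

    // Drop one reference; returns true when the last reference is gone.
    // A real implementation would hand the memory back to its allocator at that point.
    boolean release() {
        int prev;
        do {
            prev = refCnt.get();
            if (prev == 0) throw new IllegalStateException("buffer already released");
        } while (!refCnt.compareAndSet(prev, prev - 1));
        return prev == 1;
    }

    int refCnt() {
        return refCnt.get();
    }

    // Read-only view that shares the same memory; no bytes are copied.
    ByteBuffer view() {
        if (refCnt.get() == 0) throw new IllegalStateException("buffer already released");
        return data.asReadOnlyBuffer();
    }
}
```

Each stage that keeps the buffer calls `retain()` and later `release()`, so the payload stays in one place while ownership moves through the pipeline.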

The typical journey of a message in the MemQ broker is as follows. Each message received from the network is reconstructed via a length-encoded protocol and allocated into a ByteBuf that lives off the JVM heap (direct memory in Netty terms); this is the only copy of the payload throughout the whole pipeline. The ByteBuf reference is passed into the topic processor and put into a Batch along with other messages that are also waiting to be uploaded to the storage destination. Once the upload constraints are met, either the time threshold or the size threshold, the Batch is dispatched. In the case of uploading to a remote object store like S3, the whole batch of messages is kept in a CompositeByteBuf (a virtual wrapper ByteBuf consisting of multiple ByteBufs) and uploaded to the destination using the reactor-netty library, so no additional copies of the data are created within the processing path. By building on top of ByteBufs and other Netty constructs, we were able to iterate rapidly without sacrificing performance and avoided reinventing the wheel.
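The batching step above can be sketched roughly as follows. This is an illustration using plain JDK buffers, not MemQ's code: `MessageBatch`, its thresholds, and the explicit `nowMillis` parameter are all assumptions made to keep the example small and deterministic.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch that dispatches on a size or time threshold; names are illustrative.
final class MessageBatch {
    private final long maxBytes;
    private final long maxAgeMillis;
    private final long createdAtMillis;
    private final List<ByteBuffer> messages = new ArrayList<>();
    private long totalBytes;

    MessageBatch(long maxBytes, long maxAgeMillis, long nowMillis) {
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAgeMillis;
        this.createdAtMillis = nowMillis;
    }

    // Store a reference to the message buffer; the payload itself is not copied.
    void add(ByteBuffer message) {
        messages.add(message);
        totalBytes += message.remaining();
    }

    // The batch is ready once either the size or the time threshold is met.
    boolean readyToDispatch(long nowMillis) {
        return totalBytes >= maxBytes || nowMillis - createdAtMillis >= maxAgeMillis;
    }

    // Expose the buffers as one logical sequence, similar in spirit to a composite buffer.
    ByteBuffer[] asGatheringArray() {
        return messages.toArray(new ByteBuffer[0]);
    }
}
```

In the JDK, an array like this can be passed to `GatheringByteChannel.write(ByteBuffer[])`, which plays a comparable role to writing a CompositeByteBuf: several buffers go out as one payload without being merged first.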

Singer: Leveraging asynchronous processing to reduce thread overheads

Singer has been around at Pinterest for a long time, reliably delivering messages to PubSub backends. With more and more use cases onboarded to Singer, we started to hit bottlenecks on memory usage that led to frequent OOM issues and incidents. Singer’s memory and CPU resources are constrained on nearly all fleets at Pinterest to avoid impacting the host service, e.g., our API serving layer. After inspecting the code and leveraging debugging tools such as VisualVM, Native Memory Tracking (NMT), and pmap, we identified various potential improvements, most notably reducing the number of threads. Analyzing the NMT results, we noticed how many threads there were and how much memory their stacks consumed (the threads being allocated by the Singer executor and producer thread pools).

Taking a deeper look into the source of these threads, we found that the majority come from the thread pools created for each Kafka cluster Singer publishes to. The threads in these thread pools are used to wait for Kafka to complete writing messages to a partition and then report the status of the writes. While the threads do the job, each thread in the JVM will (by default) allocate 1MB of memory for its stack.

A Singer NMT report showing the different memory regions a JVM process allocates. The Thread entry represents the thread stack. Arena contains the off-heap/direct memory portion managed outside of the JVM heap.

Even though the underlying operating system allocates stack memory lazily, only once the thread actually uses it, this still quickly adds up to hundreds of MBs of the process’ memory. When there are a lot of log streams publishing to multiple partitions on different clusters, the memory used by thread stacks can easily become comparable to the 800MB default heap size of Singer and eat into the resources of the application.
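As a back-of-the-envelope illustration of how this adds up, the sketch below multiplies the 1MB default stack size by a thread count. The deployment numbers (10 clusters, 20 threads per pool) are hypothetical and chosen only to show the arithmetic; the result is an upper bound, since stack pages are committed lazily.

```java
// Back-of-the-envelope estimate of memory reserved for thread stacks.
final class ThreadStackEstimate {
    static final long DEFAULT_STACK_BYTES = 1024L * 1024L; // 1MB per thread, as stated above

    // Worst-case stack reservation for a set of equally sized thread pools.
    static long reservedBytes(int pools, int threadsPerPool) {
        return (long) pools * threadsPerPool * DEFAULT_STACK_BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical deployment: 10 Kafka clusters, 20 writer threads per pool.
        long bytes = reservedBytes(10, 20);
        System.out.println(bytes / (1024 * 1024) + " MB reserved for thread stacks");
    }
}
```

Under these assumed numbers the stacks reserve up to 200MB, a quarter of the 800MB default heap.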

Each submission of KafkaWriteTask will occupy a thread. Full code can be found here

By closely examining the usage of these threads, it quickly becomes clear that most of them are doing non-blocking operations such as updating metrics and are perfectly suited to asynchronous processing using CompletableFuture, which has been available since Java 8. CompletableFuture allows us to resolve the blocking calls by chaining stages asynchronously, thus replacing the threads that had to wait until the results came back from Kafka. By utilizing the callback in the KafkaProducer.send(record, callback) method, we rely on the Kafka producer’s network client to be completely in control of multiplexing the network traffic.

A brief example of the result code after using CompletableFutures. Full code can be found here

Once we convert the original logic into several chained non-blocking stages, the obvious choice is to use a single common thread pool to handle them regardless of the logstream, so we use the common ForkJoinPool that is already at our disposal from the JVM. This dramatically reduces the thread usage for Singer, from a couple of hundred threads to virtually no additional threads. This improvement demonstrates the power of asynchronous processing and how network-bound applications can benefit from it.
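The callback-to-future conversion described above can be sketched as follows. This is not Singer's code: `CallbackSender` is a hypothetical stand-in for a callback-based producer such as `KafkaProducer.send(record, callback)`, simplified so the example runs with the JDK alone.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Sketch of turning a callback-style send into chained asynchronous stages.
final class AsyncWriteSketch {

    // Hypothetical stand-in for a callback-based producer; the callback receives (offset, error).
    interface CallbackSender {
        void send(String message, BiConsumer<Long, Exception> callback);
    }

    // Bridge the callback into a future; no thread is parked waiting for the result.
    static CompletableFuture<Long> sendAsync(CallbackSender sender, String message) {
        CompletableFuture<Long> future = new CompletableFuture<>();
        sender.send(message, (offset, error) -> {
            if (error != null) {
                future.completeExceptionally(error);
            } else {
                future.complete(offset);
            }
        });
        return future;
    }

    // Follow-up work (e.g. reporting status) runs on the common ForkJoinPool by default.
    static CompletableFuture<String> sendAndReport(CallbackSender sender, String message) {
        return sendAsync(sender, message)
            .thenApplyAsync(offset -> "written at offset " + offset)
            .exceptionally(err -> "write failed: " + err.getMessage());
    }
}
```

Because `thenApplyAsync` is called without an executor, the follow-up stages share the JVM-wide common pool instead of requiring a dedicated pool per cluster, which is the source of the thread savings.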

Kafka and Singer: Balancing performance and efficiency with controllable variance

Operating our Kafka clusters has always been a delicate balance between performance, fault tolerance, and efficiency. Our logging agent Singer, at the front line of publishing messages to Kafka, is a crucial component that plays a heavy role in these factors, especially in routing the traffic by deciding which partitions we deliver data to for a topic.

The Default Partitioner: Evenly Distributed Traffic

In Singer, logs from a machine are picked up, routed to the topic they belong to, and published to that topic in Kafka. In the early days, Singer would publish uniformly to all the partitions of a topic in a round-robin fashion using our default partitioner. For example, if there were 3000 messages on a particular host that needed to be published to a 30-partition topic, each partition would receive roughly 100 messages. This worked pretty well for most use cases and has the nice benefit that all partitions receive the same number of messages, which is great for the consumers of these topics since the workload is evenly distributed amongst them.

DefaultPartitioner: Producers and Partitions are fully connected
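The round-robin behaviour in the example above can be sketched in a few lines. `RoundRobinSelector` is an illustrative class, not Singer's actual partitioner, and it ignores details such as partition availability.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative round-robin selector; not Singer's actual partitioner class.
final class RoundRobinSelector {
    private final AtomicLong counter = new AtomicLong();

    // Cycle through partitions 0..numPartitions-1 in order.
    int nextPartition(int numPartitions) {
        return (int) (counter.getAndIncrement() % numPartitions);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector();
        int[] counts = new int[30];
        // 3000 messages over a 30-partition topic, as in the example in the text.
        for (int i = 0; i < 3000; i++) {
            counts[selector.nextPartition(30)]++;
        }
        System.out.println("messages on partition 0: " + counts[0]);
    }
}
```

Every partition ends up with exactly 100 of the 3000 messages, but the host also has to talk to every broker that leads one of those partitions.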

The Single Partition Partitioner: In Favor of the Law of Large Numbers

SinglePartitionPartitioner: Ideal scenario where connections are evenly distributed

As Pinterest grew, we had fleets expanding to thousands of hosts, and this evenly distributed approach started to cause issues for our Kafka brokers: high connection counts and large numbers of produce requests started to elevate the brokers’ CPU usage, and spreading out the messages meant smaller batch sizes for each partition and therefore less efficient compression, resulting in higher aggregate network traffic. To tackle this, we implemented a new partitioner: the SinglePartitionPartitioner. This partitioner solves the issue by forcing Singer to write to only one random partition per topic per host, reducing the fanout from all brokers to a single broker. This partition remains the same throughout the producer’s lifetime until Singer restarts.
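A minimal sketch of this selection logic, assuming one selector instance per topic on each host: the partition is drawn once at construction and reused for every message. `SinglePartitionSelector` is a hypothetical name and the class does not implement Kafka's partitioner interface.

```java
import java.util.Random;

// Illustrative single-partition selector: one random partition, fixed for the selector's lifetime.
final class SinglePartitionSelector {
    private final int partition;

    SinglePartitionSelector(int numPartitions, Random random) {
        // Chosen once; stays the same until the process (and thus the selector) is recreated.
        this.partition = random.nextInt(numPartitions);
    }

    int nextPartition() {
        return partition;
    }
}
```

Since all messages for the topic go to one partition, the host needs a connection only to the broker leading that partition, and its batches for the topic are no longer split across partitions.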

For pipelines that had a large producer fleet and relatively uniform message rates across hosts, this was extremely effective: the law of large numbers worked in our favor, and statistically, if the number of producers is significantly larger than the number of partitions, each partition will still receive a similar amount of traffic. Connection count went down from (number of brokers serving the topic) times (number of producers) to only (number of producers), which could be up to a hundred times fewer for larger topics. Meanwhile, batching up all messages per producer to a single partition improved compression ratios by at least 10% in most use cases.
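Both the connection arithmetic and the statistical argument can be explored with a small simulation. The sketch below is an illustration only; the fleet sizes used in the usage note are hypothetical and the class name is made up.

```java
import java.util.Random;

// Illustrative simulation of producers each picking one random partition.
final class SinglePartitionSkew {

    // Each producer picks one partition uniformly at random; returns producers per partition.
    static int[] assign(int producers, int partitions, Random random) {
        int[] counts = new int[partitions];
        for (int p = 0; p < producers; p++) {
            counts[random.nextInt(partitions)]++;
        }
        return counts;
    }

    // Number of partitions that received no producer at all.
    static int emptyPartitions(int[] counts) {
        int empty = 0;
        for (int c : counts) {
            if (c == 0) empty++;
        }
        return empty;
    }

    // Total connections when every producer connects to the given number of brokers.
    static long connections(int producers, int brokersPerProducer) {
        return (long) producers * brokersPerProducer;
    }
}
```

For example, `connections(5000, 100)` versus `connections(5000, 1)` gives 500,000 versus 5,000 connections, the hundredfold reduction mentioned above. Calling `assign(10, 30, ...)` shows the other side of the trade-off: with fewer producers than partitions, at least 20 partitions necessarily stay empty.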

SinglePartitionPartitioner: Skewed scenario where there are too few producers vs. partitions

The Fixed Partitions Partitioner: Configurable variance for adjusting trade-offs

Even with this new solution, some pipelines still lay in the middle ground where both partitioners are subpar, such as when the number of producers is not large enough to outnumber the partitions. In this case, the SinglePartitionPartitioner introduces significant skew between partitions: some partitions have multiple producers writing to them, and some are assigned very few or even no producers. This skew can cause unbalanced workloads for the downstream consumers, and also increases the burden on our team to manage the cluster, especially when storage is tight. We thus recently introduced a new partitioner that can be used in these cases and can even cover the original use cases: the FixedPartitionsPartitioner, which publishes not to a single fixed partition like the SinglePartitionPartitioner, but randomly across a fixed number of partitions.

This approach is somewhat similar to the concept of virtual nodes in consistent hashing, where we artificially create more “effective producers” to achieve a more continuous distribution. Since the number of partitions for each host can be configured, we can tune it to the sweet spot where the efficiency and performance are both at desired levels. This partitioner could also help with “hot producers” by spreading traffic out while still maintaining a reasonable connection count. Although a simple concept, it turns out that having the ability to configure the degree of variance could be a powerful tool to manage trade-offs.
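The idea can be sketched as follows: choose a configurable number of distinct partitions once, then pick randomly among them for each write. `FixedPartitionsSelector` and its constructor parameters are hypothetical and this is not Singer's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative selector that spreads writes randomly over a fixed subset of partitions.
final class FixedPartitionsSelector {
    private final int[] fixed;
    private final Random random;

    FixedPartitionsSelector(int numPartitions, int fixedCount, Random random) {
        if (fixedCount < 1 || fixedCount > numPartitions) {
            throw new IllegalArgumentException("fixedCount must be between 1 and numPartitions");
        }
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            all.add(i);
        }
        Collections.shuffle(all, random);   // the first fixedCount entries form a random distinct subset
        this.fixed = new int[fixedCount];
        for (int i = 0; i < fixedCount; i++) {
            this.fixed[i] = all.get(i);
        }
        this.random = random;
    }

    // Pick one of the fixed partitions at random for the next write.
    int nextPartition() {
        return fixed[random.nextInt(fixed.length)];
    }

    // Copy of the chosen subset, for inspection.
    int[] fixedPartitions() {
        return fixed.clone();
    }
}
```

With `fixedCount` set to 1 this degenerates to the single-partition behaviour, and with `fixedCount` equal to the partition count it spreads traffic over the whole topic, so one setting covers the range between the two earlier partitioners.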

FixedPartitionsPartitioner: Less skew while still keeping connection count lower than the default

Relative compression ratio and request rate skew with different numbers of fixed partitions on a 120 partition topic on 30 brokers

Conclusion and Acknowledgements

These learnings are just a few examples of improvements the Logging Platform team has been making. Despite their seemingly different nature, the ultimate goal of all these improvements was to achieve better results for our team and our customers. We hope that these findings are inspiring and could spark a few ideas for you.

None of the content in this article could have been delivered without the in-depth discussions and candid feedback from Ambud Sharma, Eric Lopez, Henry Cai, Jeff Xiang, and Vahid Hashemian on the Logging Platform team. We also deeply appreciate the external teams that provided support and input on the various improvements we’ve been working on. As we strive for continuous improvement within our architecture, we hope to share more interesting findings from our pursuit of perfecting our system.
