3 Innovations While Unifying Pinterest’s Key-Value Storage

Pinterest
Pinterest is a social bookmarking site where users collect and share photos of their favorite events, interests and hobbies. One of the fastest growing social networks online, Pinterest is the third-largest such network behind only Facebook and Twitter.

By Jessica Chan | Engineering Manager, MySQL & Key-Value Storage


Engineers hate migrations. What do engineers hate more than migrations? Data migrations. Especially critical, terabyte-scale, online serving migrations which, if done badly, could bring down the site, enrage customers, or cripple hundreds of critical internal services.

So why did the Key-Value Systems Team at Pinterest embark on a two-year realtime migration of all our online key-value serving data to a single unified storage system? Because the cost of not migrating was too high. In 2019, Pinterest had four separate key-value systems owned by different teams with different APIs and feature sets. This resulted in duplicated development effort, high operational overhead and incident counts, and confusion among engineering customers.

In unifying all of Pinterest’s 500+ key-value use cases (over 4PB of unique data serving 100Ms of QPS) onto one single interface, we not only made huge gains in reducing system complexity and lowering operational overhead, we also achieved a 40–90% performance improvement by moving to the most efficient storage engine, and we saved the company a significant amount per year by moving to a more efficient replication and versioning architecture.

In this blog post, we selected three (out of many more) innovations to dive into that helped us notch all these wins.

But first, some background

Before this effort, Pinterest used to have four key-value storage systems:

  • Terrapin: a read-only, batch-load, key-value storage built at Pinterest on top of HDFS and featured in Designing Data-Intensive Applications
  • Rockstore: a multi-mode (readonly, read-write, streaming-write) key-value storage also built at Pinterest, based on the open-source Rocksplicator framework, written in C++, and using RocksDB as a storage engine
  • UserMetaStore: a read-write key-value storage with a simplified thrift API on top of HBase
  • Rocksandra: a read-write, key-value storage based on a version of Cassandra, which used RocksDB under the hood

One of the biggest challenges when consolidating to a single system is assessing the feasibility of both achieving feature parity across all systems and integrating those features well into a single platform. Another challenge is to determine which system to consolidate to, and whether to go with an existing system or to consider something that doesn’t already exist at Pinterest. And a final, nontrivial challenge is to convince leadership and hundreds of engineers that migrating in the first place is a good idea.

Before embarking on such a large undertaking, we had to step back. A working group dedicated a few months to deep-dive on requirements and technologies, analyze tradeoffs and benefits, and come up with a final proposal that was ultimately approved. Rockstore, which was the most cost-efficient and performant, simplest to operate and extend, and provided the lowest migration cost, was chosen as the one storage system to rule them all.

We won’t describe the entire migration project in this post, but we’ll highlight some of the best parts.

Innovation 1: API abstractions allow us to seamlessly migrate customer data

We know that in code, strong abstractions lead to cleaner interfaces and more flexibility to make changes “under the hood” without disruption. The same is true of organizations. Each of the four storage systems had its own thrift API abstraction, but the fact that there were four interfaces, and that some of them, like Terrapin, still required customers to know internal details of the architecture in order to use them (a leaky abstraction), made life difficult for both customers and platform owners.

A diagram might be helpful to illustrate the complexity of maintaining four separate, key-value storage systems. If you were a customer, which would you choose?

Figure 1: Four separate Key Value Systems at Pinterest, each with their own APIs, set of unique features and underlying architectures, and varying degrees of performance and cost.

We introduced a new API, aptly called the KVStore API, to be the new unified thrift interface that would absorb the rest. Once everyone is on a single unified API that is built with the intention to be general, the platform team can have the flexibility to make changes, even change storage engines, under the hood without involving customers. This is the ideal state:

Figure 2: The ideal state is a single unified Key-Value interface, reducing the complexity both for customers and for platform owners. When we can consolidate our resources as a company and invest in a single platform, we can move faster and build better.

The migration to get from four systems to the ideal one above was split into two phases: the first, targeting read-only data, and the second, targeting read-write data. Each phase required its own unique migration strategy to be the least disruptive to customers.

Phase 1: Read-only data migration (totally seamless)

The read-only phase was first because it was simpler (immutable data is easier to migrate than mutable data receiving live writes) and because it targeted the majority of customers (about 70% were using Terrapin). Because Terrapin was so pervasive and established in our code base, having everyone migrate their APIs to access KVStore would have taken a ton of time and effort with very little incremental value.

We decided to instead migrate most Terrapin customers seamlessly: no changes were required of users calling Terrapin APIs, but unbeknownst to callers, the Terrapin API service was augmented with an embedded KVStore API library to retrieve data from Rockstore. And because Terrapin is a batch-loaded system, we also found a central base class and rerouted workflows to double-load data into Rockstore instead of Terrapin (and then eventually we cut Terrapin off).

Figure 3: By introducing a routing layer between the Terrapin APIs and the Terrapin leaf storage, we can achieve a data migration and eliminate the costly and less stable Terrapin storage system for immediate business impact, all without asking customers to take any action. The tradeoff here is the tech debt and layer of indirection: we are now asking customers to clean up their usage of the Terrapin API in order to directly call KVStore API.
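The routing idea in Figure 3 can be sketched in a few lines of Python. This is only an illustration under assumptions: the class and method names (`TerrapinApiService`, `InMemoryBackend`, `mark_migrated`, `batch_load`) are hypothetical and are not Pinterest's actual thrift interfaces, and in-memory dictionaries stand in for the real storage clusters. The point it shows is that callers keep the same `get` call while the service decides, per dataset, which backend answers.

```python
from typing import Dict, Iterable, Optional, Set


class InMemoryBackend:
    """Stand-in for a storage cluster (Terrapin leaf storage or Rockstore)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, Dict[bytes, bytes]] = {}

    def load(self, dataset: str, rows: Dict[bytes, bytes]) -> None:
        # A batch load replaces the whole dataset, as in a read-only store.
        self._data[dataset] = dict(rows)

    def get(self, dataset: str, key: bytes) -> Optional[bytes]:
        return self._data.get(dataset, {}).get(key)


class TerrapinApiService:
    """Keeps the legacy call signature and routes each dataset internally."""

    def __init__(self, terrapin: InMemoryBackend, rockstore: InMemoryBackend) -> None:
        self._terrapin = terrapin
        self._rockstore = rockstore
        self._migrated: Set[str] = set()

    def mark_migrated(self, dataset: str) -> None:
        # Flip reads for one dataset once its data is verified in Rockstore.
        self._migrated.add(dataset)

    def get(self, dataset: str, key: bytes) -> Optional[bytes]:
        backend = self._rockstore if dataset in self._migrated else self._terrapin
        return backend.get(dataset, key)


def batch_load(dataset: str, rows: Dict[bytes, bytes],
               targets: Iterable[InMemoryBackend]) -> None:
    """Double-load: write the same batch output to every target cluster."""
    for target in targets:
        target.load(dataset, rows)


if __name__ == "__main__":
    terrapin, rockstore = InMemoryBackend("terrapin"), InMemoryBackend("rockstore")
    service = TerrapinApiService(terrapin, rockstore)
    batch_load("pin_stats", {b"pin:1": b"42"}, [terrapin, rockstore])
    service.mark_migrated("pin_stats")
    print(service.get("pin_stats", b"pin:1"))
```

Once every dataset is marked as migrated, the old backend receives no reads and can be dropped from the load targets, which mirrors the eventual cutoff described above.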

Because Rockstore was more performant and cost-efficient than Terrapin, users saw a 30–90% decrease in latency. When we decommissioned the storage infrastructure of Terrapin, the company also saw $7M of annualized savings, all without users needing to lift a finger (with just a few exceptions). The tradeoff is that we now have some tech debt of ensuring that users clean up their code by moving off of deprecated Terrapin APIs and onto KVStore API so that we no longer have a layer of indirection.

Phase 2: Read-write data migration (partially seamless)

The read-write side presented a different picture: there were fewer than 200 use cases to tackle, and the number of call sites was less extreme, but building feature parity for a read-write system as opposed to read-only involved some serious development. In order to be on par with UserMetaStore (essentially HBase), Rockstore needed a brand new wide-column format, increased consistency modes, offline snapshot support, and higher durability guarantees.

While the team took the time to develop these features, we decided to “bite the bullet” and ask all users to migrate from UserMetaStore’s API to KVStore API from the get-go. The benefit of doing this is it’s a low-risk, low-effort move. Thanks again to the power of abstraction, we implemented a reverse proxy so that customers moving to KVStore API were actually still calling UserMetaStore under the hood. By making this small change now, customers were buying a lasting contract that wouldn’t require such changes again for the foreseeable future.

Figure 4: Instead of taking the same approach as we did with Terrapin in Figure 3, we decided asking customers to migrate their APIs up front made more sense for unifying the read-write storage systems. Once customers moved to our KVStore API abstraction layer, we were free to move their data from UserMetaStore to Rockstore under the hood.
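A rough Python sketch of the Figure 4 approach follows. All names here (`KVStoreApi`, `DictStore`, `cut_over`, `backfill`) are hypothetical and chosen for illustration; the real KVStore API is a thrift interface whose shape is not described in this post. The sketch also ignores writes that arrive during the backfill, which a real online migration would have to handle (for example with dual writes).

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (table, key)


class DictStore:
    """Stand-in for a read-write backend such as UserMetaStore or Rockstore."""

    def __init__(self) -> None:
        self.rows: Dict[Key, bytes] = {}

    def get(self, key: Key) -> Optional[bytes]:
        return self.rows.get(key)

    def put(self, key: Key, value: bytes) -> None:
        self.rows[key] = value


class KVStoreApi:
    """Unified interface: callers never name the backend that serves them."""

    def __init__(self, default_backend: DictStore) -> None:
        self._default = default_backend
        self._overrides: Dict[str, DictStore] = {}  # table -> new backend

    def _backend(self, table: str) -> DictStore:
        return self._overrides.get(table, self._default)

    def get(self, table: str, key: str) -> Optional[bytes]:
        return self._backend(table).get((table, key))

    def put(self, table: str, key: str, value: bytes) -> None:
        self._backend(table).put((table, key), value)

    def cut_over(self, table: str, new_backend: DictStore) -> None:
        # Platform-side switch; no customer code changes.
        self._overrides[table] = new_backend


def backfill(source: DictStore, target: DictStore, table: str) -> int:
    """Copy one table's rows to the new backend and return the row count."""
    copied = 0
    for (row_table, row_key), value in list(source.rows.items()):
        if row_table == table:
            target.put((row_table, row_key), value)
            copied += 1
    return copied
```

In this sketch a customer calls `api.put("user_prefs", "u1", ...)` before and after the cutover with identical code; only the platform team calls `backfill` and `cut_over`.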

Some of the biggest challenges were actually not technical. Finding owners of the data was an archeological exercise, and holding hundreds of owners accountable for completing their part was difficult due to competing priorities. But when it was done, and when the Rockstore platform was ready, the team was completely unblocked to backfill the data from UserMetaStore to Rockstore without any customer involvement. We also vowed to make sure all data was attributed to owners going forward.

Innovation 2: A wide-column format eliminated both CPU and network load for large payloads

Some of the most popular Terrapin workloads had an interesting property: use cases would store values consisting of large blobs of thrift structures but only need to retrieve a very small piece of that data when read.

At first, these callers would download the huge values that they stored, deserialize them on the client side, and read the property they needed. This very quickly revealed itself to be inefficient in terms of unnecessary network load, throughput degradation, and wasteful client CPU utilization.

The Terrapin solution to this was to introduce an API feature called “trimmer,” where you could specify a Thrift struct and the fields you wanted from it in the request itself. Terrapin would not only retrieve the object, it would also deserialize it and return only the fields requested. This was better in that the network bandwidth was reduced, important especially for reducing cross-AZ traffic costs, but it was worse in terms of both platform cost and leaky abstractions. More CPU utilization meant more machines were needed, and business logic in the platform meant that Terrapin needed to know about required thrift structures. Performance also took a hit, since clients had to wait for this additional processing time.

To solve this in Rockstore and unblock the migration, the team decided against simply re-implementing the trimmer. Instead, we introduced a new file format that accommodated a wide-column access pattern. This means that instead of storing a binary blob of data that can be deserialized into a thrift structure, you can actually store and encode your data structure in a native format that can be retrieved like a key-value pair using a combination of primary keys and local keys. For example, if you have a struct UserData that is a mapping of 30 fields keyed to a user id, instead of storing a key-value pair of (key: user id, value: UserData), you can store (key: user id, (local key: UserData field 1, local value: UserData value 1), (local key: UserData field 2, local value: UserData value 2), etc.).

The API is then designed to allow you to either access the entire row (all columns associated with user id) or only certain properties (UserData field 3 and 12 of user id). Under the hood, Rockstore is performing a blazing fast range scan or single-point key-value lookup. This accounted for some of the more extreme performance improvements that we ultimately observed. Goodbye network and CPU costs!
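To make the two access patterns concrete, here is a small Python sketch of a wide-column layout over an ordered key space. It is an assumption-laden model, not Rockstore's file format: `WideColumnStore` and its methods are hypothetical names, and a sorted in-memory list stands in for the ordered keys a storage engine like RocksDB provides. Each column is stored under a composite key of (primary key, local key), so a full row is a contiguous range and a single column is a point lookup.

```python
import bisect
from typing import Dict, Iterable, List, Tuple

CompositeKey = Tuple[str, str]  # (primary key, local key)


class WideColumnStore:
    def __init__(self) -> None:
        self._keys: List[CompositeKey] = []           # kept sorted
        self._values: Dict[CompositeKey, bytes] = {}

    def put(self, primary: str, local: str, value: bytes) -> None:
        key = (primary, local)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_columns(self, primary: str, local_keys: Iterable[str]) -> Dict[str, bytes]:
        """Point lookups: fetch only the requested columns of one row."""
        found = {}
        for local in local_keys:
            value = self._values.get((primary, local))
            if value is not None:
                found[local] = value
        return found

    def get_row(self, primary: str) -> Dict[str, bytes]:
        """Range scan: walk the contiguous keys that share the primary key."""
        start = bisect.bisect_left(self._keys, (primary, ""))
        row = {}
        for key in self._keys[start:]:
            if key[0] != primary:
                break  # left this row's key range
            row[key[1]] = self._values[key]
        return row


if __name__ == "__main__":
    store = WideColumnStore()
    store.put("user:1", "field_03", b"a")
    store.put("user:1", "field_12", b"b")
    store.put("user:1", "field_01", b"c")
    print(store.get_columns("user:1", ["field_03", "field_12"]))
    print(store.get_row("user:1"))
```

Neither path deserializes a blob on the server or ships unrequested fields to the client, which is where the network and CPU savings described above come from.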

Innovation 3: A versioning system for batch-loaded, read-only data unblocked instant data migrations between clusters

One of the biggest pain points of the read-only mode of Rockstore was the inability to move data once it was loaded onto a cluster. If customer data grew beyond what was provisioned for it, or if a certain cluster became unstable, it took two weeks and two or three teams to coordinate changes to workflows, reconfigure thrift call sites, and budget time to double-upload, monitor, and read data to and from the new location.

Another pain point of Rockstore’s read-only mode was that it supported exactly two versions due to how it implemented versioning. This was incompatible with the requirements of Terrapin, which supported fewer than two versions for cost savings and more than two for critical datasets that require instant on-disk rollback.

The solution to this is what we call “timestamp-based versioning.” Rockstore read-only used to have “round-robin versioning,” where each new version uploaded into the system would either be version One or version Two. Once all the partitions of an uploaded version were online, the version map would simply flip. This created the exactly-two version constraint. Another constraint that bound customers to a specific cluster was the fact that customers needed to specify a serverset address that corresponded to the cluster on which their data lived. Another leaky abstraction! When the data moved, customers needed to make changes to follow it.
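The exactly-two constraint of round-robin versioning can be seen in a toy Python model. This is a simplification with hypothetical names (`RoundRobinDataset`, `upload`); it treats an upload as instantly online rather than waiting for all partitions, and it is not Rockstore's actual implementation.

```python
from typing import Dict, Optional


class RoundRobinDataset:
    """Two fixed slots; each upload overwrites the slot not being served."""

    def __init__(self) -> None:
        self.slots: Dict[int, Optional[str]] = {1: None, 2: None}
        self.serving: Optional[int] = None  # slot currently served

    def upload(self, data: str) -> int:
        target = 2 if self.serving == 1 else 1
        self.slots[target] = data  # the version previously in this slot is lost
        self.serving = target      # "flip" the version map
        return target
```

After three uploads the first version is gone and only the two most recent remain, so there is no way to keep one version for cost savings or more than two for deeper rollback.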

In timestamp-based versioning, every upload is assigned a timestamp and registered to a central metastore called Key-Value Store Manager (KVSM), which is used to coordinate cluster map configurations. Once more, the power of abstraction comes in: by calling KVStore APIs, as a customer you no longer need to know on which cluster your data lives. KVStore figures that out for you using the cluster map configuration.

Not only does this abstraction allow for as few as one version or as many as 10 to be stored on disk or in S3 (to trade off cost savings and rollback safety), but moving a dataset from one cluster to another is as simple as making a single API call to change the cluster metadata in KVSM and kicking off a new upload. Once the metadata is updated, the new upload will automatically be loaded to the new cluster. And once online, all serving maps will point requests to that location. Thanks to timestamp-based versioning, two weeks of effort has been reduced to a single API call.
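The flow above can be sketched as a small metastore in Python. Everything named here is an assumption made for illustration: `KVStoreManager`, `register_upload`, `set_cluster`, `resolve`, and `rollback` are hypothetical and do not describe KVSM's real API. The sketch shows the three properties the text relies on: uploads are ordered by timestamp, the number of retained versions is configurable per dataset, and readers resolve the serving cluster from metadata instead of hard-coding it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Version = Tuple[int, str]  # (upload timestamp, cluster holding that version)


@dataclass
class DatasetMeta:
    cluster: str        # where the NEXT upload will be loaded
    max_versions: int   # how many versions to retain
    versions: List[Version] = field(default_factory=list)  # oldest first


class KVStoreManager:
    """Toy central metastore for version and cluster metadata."""

    def __init__(self) -> None:
        self._datasets: Dict[str, DatasetMeta] = {}

    def register_dataset(self, name: str, cluster: str, max_versions: int) -> None:
        if max_versions < 1:
            raise ValueError("at least one version must be retained")
        self._datasets[name] = DatasetMeta(cluster, max_versions)

    def set_cluster(self, name: str, cluster: str) -> None:
        # The "single API call": only affects where future uploads land.
        self._datasets[name].cluster = cluster

    def register_upload(self, name: str, timestamp: int) -> str:
        meta = self._datasets[name]
        meta.versions.append((timestamp, meta.cluster))
        meta.versions.sort()
        excess = len(meta.versions) - meta.max_versions
        if excess > 0:
            del meta.versions[:excess]  # garbage-collect the oldest versions
        return meta.cluster

    def resolve(self, name: str) -> Version:
        """What a KVStore client would serve: the newest registered version."""
        return self._datasets[name].versions[-1]

    def rollback(self, name: str) -> Version:
        meta = self._datasets[name]
        if len(meta.versions) < 2:
            raise ValueError("no older version retained to roll back to")
        meta.versions.pop()
        return meta.versions[-1]
```

In this model, moving a dataset is `set_cluster(...)` followed by the next scheduled upload; readers keep calling `resolve` and follow the data to its new cluster without any change on their side.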

Thank you for reading about our journey to a single, abstracted, key-value storage at Pinterest. I’d like to acknowledge all the people who contributed to this critical and technically challenging project: Rajath Prasad, Kangnan Li, Indy Prentice, Harold Cabalic, Madeline Nguyen, Jia Zhan, Neil Enriquez, Ramesh Kalluri, Tim Jones, Gopal Rajpurohit, Guodong Han, Prem Thangamani, Lianghong Xu, Alberto Ordonez Pereira, Kevin Lin, all our partners in SRE, security, and Eng Productivity, and all of our engineering customers at Pinterest, who span teams from ads to homefeed, machine learning to signal platform. None of this would be possible without the teamwork and collaboration from everyone here.
