Powering Inclusive Search & Recommendations with Our New Visual Skin Tone Model

Pinterest
Pinterest is a social bookmarking site where users collect and share photos of their favorite events, interests and hobbies. One of the fastest growing social networks online, Pinterest is the third-largest such network behind only Facebook and Twitter.

By Nadia Fawaz | Research Scientist & Tech Lead, Applied Science, Bhawna Juneja | Software Engineer, Search Quality, David Xue | Software Engineer, Visual Search


To truly bring everyone the inspiration to create a life they love, Pinterest is committed to content diversity and to developing inclusive search and recommendation engines. A top request we hear from Pinners is that they want to feel represented in the product, which is why we built our first version of skin tone ranges, an inclusive search feature, in 2018. We’re proud to introduce the latest version of skin tone ranges, a newly built in-house technology. These new skin tone ranges are paving the way for more inclusive inspirations to be recommended in search, as well as in our augmented reality technology, Try on, and are driving initiatives for more diverse recommendations across the platform.

Skin tone ranges in Beauty Search and in AR Try-on Similar Looks

Developing more inclusive skin tone ranges

Trying to understand the skin tone range in an image is a complex challenge for computer vision systems, given the impact of shadows, different lighting, and a variety of other impediments. Developing inclusive skin tone ranges required an end-to-end iterative process to build, evaluate and improve performance over several versions. While qualitative evaluation could help reveal issues, in order to make progress, we needed to measure performance gaps across skin tone ranges and understand the error patterns for each range.

A variety of lighting conditions

Starting with diverse data

We labeled a diverse set of beauty images covering a wide range of skin tones to evaluate system performance during development. Measuring performance is important for assessing progress. However, coarse aggregate metrics over the entire dataset, such as accuracy, are not sufficient, because aggregation may hide performance discrepancies between skin tone ranges. To quantify performance biases, we went beyond overall aggregates and computed granular metrics per skin tone range, including precision, recall, and F1-score. Per-range metrics show whether errors disproportionately affect some ranges. We also used confusion matrices to analyze error patterns for each range. The matrices reveal whether a model fails to predict a skin tone for images in a range, leading to very low recall and F1-score for that range, or whether it fails to distinguish images from different ranges and misclassifies them, impacting recall and precision for several ranges, as in the examples below.
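To make the gap between aggregate and per-range metrics concrete, here is a minimal sketch in plain Python. The range labels, the "none" marker for images with no prediction, and the toy data are all assumptions made up for illustration; they are not taken from the evaluation set described above.

```python
from collections import Counter

RANGES = ["range_1", "range_2", "range_3", "range_4"]  # hypothetical labels
NO_PREDICTION = "none"  # hypothetical marker: the system produced no range


def confusion_matrix(y_true, y_pred):
    """Count (true, predicted) pairs; a Counter returns 0 for unseen pairs."""
    return Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Aggregate accuracy over the whole dataset."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_range_metrics(y_true, y_pred):
    """Precision, recall and F1-score computed separately for each range."""
    cm = confusion_matrix(y_true, y_pred)
    metrics = {}
    for r in RANGES:
        tp = cm[(r, r)]
        predicted = sum(n for (_, p), n in cm.items() if p == r)
        actual = sum(n for (t, _), n in cm.items() if t == r)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[r] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy data (made up): range_4 images are either missed or confused with range_3.
y_true = ["range_1"] * 4 + ["range_2"] * 2 + ["range_3"] * 2 + ["range_4"] * 2
y_pred = ["range_1"] * 4 + ["range_2"] * 2 + ["range_3"] * 2 + [NO_PREDICTION, "range_3"]

overall = accuracy(y_true, y_pred)
by_range = per_range_metrics(y_true, y_pred)
```

On this toy data the overall accuracy is 0.8, which looks acceptable, while recall and F1-score for range_4 are both 0. The misclassified range_4 image also lowers precision for range_3, the kind of cross-range effect a confusion matrix makes visible.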

Examples of issues

To understand the root causes of issues, we performed an error analysis of the components of the skin tone system based on their output. At a high level, a skin tone system may include:

  • a detection model that attempts to determine the presence and location of a face in a beauty image, but does not attempt to recognize an individual person’s face
  • a color extraction module
  • a scorer and thresholder to estimate the skin tone range

Analyzing the score distributions per skin tone range over the diverse dataset can show if the score distributions are separable or if they overlap, and if the thresholds are out-of-phase with the diverse data, as in the example above. Both issues can be amplified by color extraction failures in challenging lighting conditions. Studying face detection errors can reveal if the model fails to detect faces in beauty images with a darker skin tone at significantly higher rates than in images with lighter skin tones, which would preclude the system from generating a skin tone range for these images. This type of bias in face detection models can carry over to the skin tone system, and no amount of downstream post-processing for fairness on the output of the system can correct such upstream bias. Biases in face detection have been analyzed previously in the Gender Shades study by Joy Buolamwini and Timnit Gebru. Requiring face detection to predict skin tone also limits the scope of the system, as it cannot handle images of other body parts such as manicured hands, and it contributes to the overall system latency and scalability challenges.
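The face detection part of this error analysis can be sketched as a per-range coverage check. This is only an illustration: the record format, the range names, and the toy numbers are assumptions, not data from the study or from the system described here.

```python
from collections import defaultdict


def detection_rate_by_range(records):
    """records: iterable of (labeled_range, face_detected) pairs.

    Returns the fraction of images per labeled range in which a face was found.
    """
    totals = defaultdict(int)
    detected = defaultdict(int)
    for labeled_range, face_detected in records:
        totals[labeled_range] += 1
        detected[labeled_range] += int(face_detected)
    return {r: detected[r] / totals[r] for r in totals}


def coverage_gap(rates):
    """Largest difference in detection rate between any two ranges."""
    return max(rates.values()) - min(rates.values())


# Toy records (made up): detection succeeds far less often on the darker range.
records = [
    ("lighter", True), ("lighter", True), ("lighter", True), ("lighter", True),
    ("darker", True), ("darker", False), ("darker", False), ("darker", False),
]

rates = detection_rate_by_range(records)
gap = coverage_gap(rates)
```

A large gap means that many images in one range never reach the color extraction and scoring stages, so no downstream threshold tuning can recover them.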

Through analysis, we reached the conclusion that to improve fairness in performance across all skin tone ranges, we needed to build an end-to-end system with bias mitigation.

Developing new skin tone ranges by mitigating biases

Visual skin tone ranges V1: Mitigating bias

We developed the new visual skin tone v1 ranges based on visual input and focused on:

  • mitigating biases so that skin tone prediction performs well across all ranges
  • creating a signal that doesn’t require the presence of a full front-facing face, but also works for partial faces or other body parts
  • extending to applications beyond beauty, such as fashion
  • leveraging this more reliable signal as a building block to improve fairness and reduce potential bias in other ML models

The visual skin tone v1 leverages several computer vision techniques to estimate the skin tone range in a beauty image. After exposure correction, a face detection model identifies the face area and landmarks corresponding to facial features such as eyes, eyebrows, nose, mouth and face edge. This face detection model has better coverage on images with darker skin tones. Some facial features, such as eyes and lips, are then cropped out, and binary erosion is applied to remove hair and edge noise and finally produce a face skin mask. If face detection fails to identify a face in the image, for example in images of other body parts, Hue Saturation Value (HSV) processing attempts to locate skin pixels and produces a skin mask. The color extraction module then estimates a dominant color based on the RGB distribution of the skin mask pixels. The dominant color is converted to the CIELAB (L*a*b*) color space, and the individual typology angle (ITA) is computed as a nonlinear function of the L* and b* coordinates. The resulting ITA scores are more separable across ranges. Using a diverse dataset of images, fairness-aware tuning is performed on the ITA scores to produce a skin tone prediction while mitigating biases in performance between ranges.
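The last steps of this pipeline (dominant color, CIELAB conversion, ITA, thresholding) can be sketched as follows. The ITA is conventionally defined as arctan((L* − 50) / b*) × 180/π. The sketch makes several assumptions that are not from the article: the input is 8-bit sRGB with a D65 white point, the dominant color is approximated by a per-channel median, and the three thresholds are placeholder values rather than the fairness-tuned thresholds used in production.

```python
import math
from statistics import median


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (assumes a D65 white point)."""
    def to_linear(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear RGB -> XYZ, each component normalized by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    return 116 * f(y) - 16, 500 * (f(x) - f(y)), 200 * (f(y) - f(z))


def dominant_color(mask_pixels):
    """Stand-in for the color extraction module: per-channel median (assumption)."""
    return tuple(median(channel) for channel in zip(*mask_pixels))


def individual_typology_angle(lab):
    """ITA in degrees; atan2 equals arctan((L* - 50) / b*) when b* > 0."""
    lightness, _, b_star = lab
    return math.degrees(math.atan2(lightness - 50.0, b_star))


def ita_to_range(ita, thresholds=(41.0, 28.0, 10.0)):
    """Map an ITA score to one of four ranges (0 = lightest).

    The thresholds are placeholders for illustration, not tuned values.
    """
    for index, threshold in enumerate(thresholds):
        if ita > threshold:
            return index
    return len(thresholds)


# Made-up skin mask pixels for two images.
light_pixels = [(240, 210, 190), (238, 208, 188), (242, 212, 192)]
dark_pixels = [(90, 60, 45), (88, 58, 43), (92, 62, 47)]

ita_light = individual_typology_angle(srgb_to_lab(dominant_color(light_pixels)))
ita_dark = individual_typology_angle(srgb_to_lab(dominant_color(dark_pixels)))
```

Because ITA depends only on L* and b*, lighter tones yield larger angles and darker tones yield smaller or negative ones, so the prediction reduces to placing thresholds on a single score.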

Evaluation of the visual skin tone v1 on the diverse set of beauty Pins showed ~3x higher accuracy on the predicted skin tone. Moreover, per-range precision, recall and F1-score metrics increased for all ranges. We observed ~10x higher recall and ~6x higher F1-score on darker skin tones. The new model reduced biases in performance across skin tone ranges, and led to a major increase in coverage of skin tone ranges for billions of images in our beauty, women’s and men’s fashion corpora.

Beyond offline evaluation, having humans in the loop can significantly improve performance by integrating feedback from human evaluation, users and communities. For instance, we conducted several rounds of qualitative review and annotation of the skin tone inference results on diverse images to identify new error patterns and inform training data collection and modeling choices, as we iterated on the model. We also leveraged side-by-side comparisons of results in inclusive bug bashes with a diverse group of participants. Regular quantitative and qualitative evaluations help improve quality over time. In production, we ran experiments to evaluate the new skin tone v1, and built dashboards to monitor the diversity of content served.

Visual skin tone ranges V2: Keep learning

While iterating on skin tone v1, we first focused on getting the simpler cases right, such as front-facing faces in beauty portrait images. As we later expanded to the broader cases of rotated faces, different lighting conditions, occlusions such as facial hair, sunglasses, face masks, other body parts, and integrated more images from diverse communities, we learned from the errors of skin tone v1 to develop a more robust skin tone v2. We worked closely with designers to iterate and develop clear labeling guidelines for tens of thousands of images. Iterating on the model and the collection of its training and evaluation data by actively integrating learnings from earlier versions allowed the model to improve over time. This helped expand its application beyond beauty images to the broader context of fashion.

The need to handle more complex images led us to move away from face detection and to take a new approach for skin tone v2, based on an end-to-end CNN model that operates directly on raw images. We first trained a ResNet model to learn skin tone from a more diverse set of beauty and fashion images, including v1 error cases. This model outperformed v1 when evaluated on larger, more challenging data. We then considered adding skin tone prediction as a new jointly trained head in the multi-task Unified Embedding model. This approach led to further performance improvements, but at the cost of increased complexity and of coupling with the multi-head development and release schedule. Eventually, we used the 2048-dimensional binarized Unified Embedding as input to a multilayer perceptron (MLP), trained using dropout and a softmax with cross-entropy loss to predict skin tone ranges. This led to significant performance enhancements for all ranges, benefiting from the information captured in our existing embedding while requiring far less computation.
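The shape of the final v2 classifier can be illustrated with a forward pass written in plain Python. This is a sketch under stated assumptions, not the production model: the single hidden layer, its width, the ReLU activation, the dropout rate, and the random weights and input are all hypothetical. Only the 2048-dimensional binarized input, the four output ranges, dropout, and the softmax with cross-entropy loss come from the description above. No training loop is shown.

```python
import math
import random

NUM_RANGES = 4        # four skin tone ranges, per the article
EMBEDDING_DIM = 2048  # binarized Unified Embedding size, per the article
HIDDEN_DIM = 16       # hypothetical hidden width


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the labeled range."""
    return -math.log(probs[label])


def forward(bits, hidden_layer, output_layer, rng, training=False, rate=0.5):
    hidden = dropout(relu(dense(bits, hidden_layer)), rate, rng, training)
    return softmax(dense(hidden, output_layer))


rng = random.Random(0)
hidden_layer = init_layer(rng, EMBEDDING_DIM, HIDDEN_DIM)
output_layer = init_layer(rng, HIDDEN_DIM, NUM_RANGES)
embedding_bits = [float(rng.getrandbits(1)) for _ in range(EMBEDDING_DIM)]  # made-up input

probs = forward(embedding_bits, hidden_layer, output_layer, rng)        # inference mode
probs_train = forward(embedding_bits, hidden_layer, output_layer, rng, training=True)
loss = cross_entropy(probs, label=2)                                    # made-up label
predicted_range = max(range(NUM_RANGES), key=lambda i: probs[i])
```

Since the embedding is already computed for other purposes, the classifier only adds a few small matrix multiplications per image, which is consistent with the lower computation cost noted above.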

Productionizing visual skin tone at scale

To productionize skin tone v1 for billions of beauty and fashion images, we first identified which Pin images were relevant for skin tone prediction. We leveraged several Pinterest signals: Pin2Interest to gather beauty and fashion content, and our embedding-based visual Image Style and Shopping Style signals to filter out irrelevant Pins, such as product images. Narrowing the image corpus this way helped with both scale and precision.

To generate skin tone ranges for existing and new images for skin tone v1, we used our GPU-enabled C++ service for image-based models, which supports both real-time online extraction and offline extraction in two stages: an ad hoc backfill and a scheduled incremental workflow.

For visual skin tone v2, our embedding-based feature extractor uses the pre-computed Unified Embedding as the input feature to the MLP. This approach uses Spark and CPU Hadoop clusters to significantly speed up skin tone classification in a cost-effective manner. Without having to process the image pixels, our embedding-based approach reduces the time needed to compute the backfill for billions of Pin images from nearly a week to under an hour.

Applications

Improving skin tone ranges in search for global audiences

Skin tone ranges provide Pinners the option to filter beauty results by a skin tone range of their choice, represented by four palettes. The improved skin tone models gave us the confidence to make skin tone ranges more prominent in the product and launch internationally in search.

Deploying the new skin tone v1 for beauty search queries first required indexing the skin tone signal as a discrete feature among four ranges, along with the prediction method (face detection or HSV processing). To evaluate skin tone v1 in search, we first gathered qualitative feedback from a diverse set of internal participants and then launched an experiment to assess the online performance at scale. The internal evaluation and the experiment analysis showed a clear improvement in precision and recall for the new model. The model was more accurate at classifying Pin images into their respective skin tone ranges, especially the darker ranges, leading to large gains in precision and coverage in search results. We also noticed that skin tone range adoption rates in English-speaking countries were comparable to those in the U.S., and both increased with the combined launches of the redesigned skin tone range UI and the new skin tone range model.
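As an illustration of what indexing the signal as a discrete feature could look like, here is a minimal sketch. The `IndexedPin` record, its field names, and the `filter_by_skin_tone` helper are hypothetical and do not describe the actual search index schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class PredictionMethod(Enum):
    """How the range was produced (the two v1 paths named in the article)."""
    FACE_DETECTION = "face_detection"
    HSV_PROCESSING = "hsv_processing"


@dataclass(frozen=True)
class IndexedPin:
    """Hypothetical index record for one Pin image."""
    pin_id: str
    skin_tone_range: Optional[int]        # 0..3, or None if no range was predicted
    method: Optional[PredictionMethod]    # None if no range was predicted


def filter_by_skin_tone(results: List[IndexedPin], selected_range: int) -> List[IndexedPin]:
    """Keep results in the selected range, preserving the incoming ranking order."""
    return [pin for pin in results if pin.skin_tone_range == selected_range]


# Made-up ranked results for a beauty query.
ranked_results = [
    IndexedPin("pin_a", 0, PredictionMethod.FACE_DETECTION),
    IndexedPin("pin_b", 3, PredictionMethod.FACE_DETECTION),
    IndexedPin("pin_c", None, None),
    IndexedPin("pin_d", 3, PredictionMethod.HSV_PROCESSING),
]

filtered = filter_by_skin_tone(ranked_results, selected_range=3)
filtered_ids = [pin.pin_id for pin in filtered]
```

Storing the prediction method next to the range makes it possible to analyze quality separately for the face detection path and the HSV fallback path.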

Skin tone ranges in similar looks for AR Try on

Try on was developed with inclusion in mind at the outset of Pinterest AR, supported by visual skin tone v1. The Similar Looks module in the AR Try on for lipstick experience allows users to discover makeup looks with similar lip styles. By integrating skin tone ranges in Similar Looks, users can filter inspiration looks by a skin tone range of their choice.

To build Similar Looks, the makeup parameters of a beauty Pin are estimated by DNN models trained on a high-quality, human-curated diverse set of tens of thousands of beauty images spanning a wide range of skin tones. First, an embedding-based DNN classifier for the Try-On Taxonomy of Image Style is trained with PyTorch using the Unified Embedding as input. Lipstick parameter extraction is performed using a cascade consisting of a face detector, landmark detector, and DNN-based parameter regressor. The visual skin tone v1 is indexed and combined with a lightweight approach to retrieve Makeup Look Pins in the selected skin tone range with lipstick parameters most similar to the color of the query makeup product in perceptual color space. Together these components form a new kind of visual discovery experience for makeup try-on, connecting individual products to an inspirational and diverse set of beauty Pins.
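The retrieval step can be sketched as a range filter followed by a nearest-color ranking. The article does not say which perceptual color space or distance is used, so this sketch assumes CIELAB coordinates and the CIE76 color difference (Euclidean distance in L*a*b*). The `MakeupLookPin` record and all values are made up for illustration.

```python
import math
from typing import List, NamedTuple, Tuple

Lab = Tuple[float, float, float]


class MakeupLookPin(NamedTuple):
    """Hypothetical record: a Pin with its indexed range and lipstick color."""
    pin_id: str
    skin_tone_range: int
    lipstick_lab: Lab


def delta_e_cie76(c1: Lab, c2: Lab) -> float:
    """CIE76 color difference: Euclidean distance in L*a*b* (assumed metric)."""
    return math.dist(c1, c2)


def similar_looks(query_lab: Lab, pins: List[MakeupLookPin],
                  selected_range: int, k: int = 3) -> List[MakeupLookPin]:
    """Return the k Pins in the selected range closest to the query color."""
    in_range = [p for p in pins if p.skin_tone_range == selected_range]
    return sorted(in_range, key=lambda p: delta_e_cie76(query_lab, p.lipstick_lab))[:k]


# Made-up query product color and candidate Pins.
query_lab = (45.0, 60.0, 30.0)
pins = [
    MakeupLookPin("pin_a", 1, (46.0, 58.0, 31.0)),
    MakeupLookPin("pin_b", 1, (70.0, 20.0, 10.0)),
    MakeupLookPin("pin_c", 2, (45.0, 60.0, 30.0)),  # exact color match, other range
    MakeupLookPin("pin_d", 1, (40.0, 65.0, 35.0)),
]

top_ids = [p.pin_id for p in similar_looks(query_lab, pins, selected_range=1, k=2)]
```

Filtering by range before ranking by color keeps the result set within the skin tone range the user selected, even when a closer color match exists in another range.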

Content diversity understanding and diversification

Leveraging diversity signals such as skin tone helps us analyze and understand the diversity of our content, as well as how it is surfaced and engaged with. With skin tone v1, we quadrupled our skin tone range coverage of beauty and fashion content [Source: Pinterest Internal Data, April 2020]. Our skin tone signal is now 3x as likely to detect multiple skin tone ranges in the top search results [Source: Pinterest Internal Data, July 2020], allowing more accurate measurements of the diversity of content served. Such analysis can help inform work around diversification of content inventory and its distribution on Pinterest.
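One simple way to express such a measurement is to count the distinct skin tone ranges detected in the top results of each query. The sketch below is an assumption about how this could be computed, not the metric definition behind the figures above. The function names, the cutoff `k`, and the toy result lists are hypothetical.

```python
def distinct_ranges_in_top_k(ranked_ranges, k):
    """Number of distinct skin tone ranges among the top-k results.

    ranked_ranges holds one entry per result: a range index, or None when
    no range was predicted for that image.
    """
    return len({r for r in ranked_ranges[:k] if r is not None})


def share_with_multiple_ranges(result_lists, k):
    """Fraction of queries whose top-k results span more than one range."""
    if not result_lists:
        return 0.0
    multi = sum(distinct_ranges_in_top_k(ranges, k) > 1 for ranges in result_lists)
    return multi / len(result_lists)


# Made-up per-query results: each list is the predicted range of each ranked Pin.
result_lists = [
    [0, 0, None, 0],
    [0, 2, 1, None],
    [None, None, 3, 3],
]

counts = [distinct_ranges_in_top_k(ranges, k=4) for ranges in result_lists]
share = share_with_multiple_ranges(result_lists, k=4)
```

Images without a predicted range are ignored in the count, so higher signal coverage directly makes this kind of diversity measurement more accurate.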

The road ahead

Through our experience developing skin tone ranges and integrating them in our search and AR Try on products, we learned the importance of building ML systems with inclusion by design and respect for user privacy at the heart of technical choices. In a multi-disciplinary collaboration between engineering and teams spanning many organizations, we are building on this foundation to further improve skin tone ranges, develop diversity signals, diversify search results and recommendations in various surfaces, and expand the inclusive product experience to more content and domains globally.

Acknowledgments

This work is the result of a cross-functional collaboration between many teams. Many thanks to Josh Beal, Laksh Bhasin, Lulu Cheng, Nadia Fawaz, Angela Guo, Edmarc Hedrick, Emma Herold, Ryan James, Nancy Jeng, Bhawna Juneja, Dmitry Kislyuk, Molly Marriner, Candice Morgan, Monica Pangilinan, Seth Dong Huk Park, Zhdan Philippov, Rajat Raina, Chuck Rosenberg, Marta Scotto, Annie Ta, Michael Tran, Eric Tzeng, David Xue.
