How Pinterest Fights Spam Using Machine Learning

Pinterest
Pinterest is a social bookmarking site where users collect and share photos of their favorite events, interests and hobbies. One of the fastest growing social networks online, Pinterest is the third-largest such network behind only Facebook and Twitter.

By Vishwakarma Singh | Trust and Safety Machine Learning Lead


Hundreds of millions of people regularly visit Pinterest to visually discover inspiring ideas among billions of Pins. Inspiration is a high bar and we must be vigilant in ensuring that Pinners don’t see spam, harmful content or misinformation. To enforce our community policies and maintain an inspiring environment, we use the latest in machine learning technology to build automated systems that swiftly detect and act against both spammy content and spammers.

Our anti-spam system consists of both reactive and proactive components to effectively counter adversarial abusers (users who intentionally try to evade the system). The proactive system consists of sophisticated machine learning models, whereas the reactive system includes both rules executed in a real-time rules engine and lightweight machine learning models. We not only use the latest modeling techniques but also iterate on these models at regular intervals, adding new data and exploring new technical breakthroughs, to maintain or improve their performance against spam over time.

One tactic malicious actors use is to misuse a Pin’s image by linking it to a malicious external website. Our models detect spam vectors, like Pin links, as well as users engaging in spammy behaviors. We quickly limit distribution of Pins with spam links and take direct action against users identified with high confidence as engaging in spammy behavior. For users identified with low confidence, we perform a manual review to limit false positives. We notify users of our actions to maintain transparency and give them the option to appeal our decision.
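The confidence-based routing described above can be sketched as a simple thresholding step. This is a minimal illustration, not Pinterest's implementation: the threshold values, the function name `route_user`, and the action labels are all assumptions, since the post does not state them.

```python
# Hypothetical thresholds; the post does not give the actual values.
HIGH_CONFIDENCE = 0.95
REVIEW_THRESHOLD = 0.60


def route_user(spam_score: float) -> str:
    """Map a model score in [0, 1] to an enforcement path.

    "enforce"       -> direct action; the user is notified and can appeal
    "manual_review" -> a human reviewer decides, to limit false positives
    "none"          -> no action
    """
    if spam_score >= HIGH_CONFIDENCE:
        return "enforce"
    if spam_score >= REVIEW_THRESHOLD:
        return "manual_review"
    return "none"


if __name__ == "__main__":
    for score in (0.99, 0.70, 0.10):
        print(score, route_user(score))
```

Keeping the review band separate from the enforcement band is what lets automated action stay limited to high-confidence cases.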

Machine Learning Models

Spam Domain Model

We proactively identify spam Pin links using a Deep Neural Network classifier (shown in Figure 1). To maximize impact, our model learns to classify a domain as spam rather than a link. We apply the same enforcement to all Pins with links belonging to the same domain. This model is trained interactively on manually labeled domains to achieve higher recall and a lower false positive rate. We use features created from links, web page text and media, user-domain interactions, and user behavior as inputs. For each domain, we sample links and webpages to create features. We split links into semantic tokens and use only frequent tokens as features. We analyze outlying patterns in user actions over time to create behavioral features. We periodically run batch inference with this model at scale in a PySpark job that uses TensorFlow, Spark SQL, and a UDF.

Figure 1. Deep Neural Network for domain classification
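As a rough illustration of the link-token features, the sketch below splits sampled links into tokens and keeps only tokens that occur in several links. It is a simplification under stated assumptions: splitting on non-alphanumeric characters stands in for whatever semantic tokenization the real system uses, and the function names and the `min_count` cutoff are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Set
from urllib.parse import urlparse


def tokenize_link(link: str) -> List[str]:
    """Split a URL's host, path, and query into lowercase alphanumeric tokens."""
    parsed = urlparse(link)
    text = " ".join([parsed.netloc, parsed.path, parsed.query])
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def frequent_tokens(links: List[str], min_count: int = 2) -> Set[str]:
    """Keep tokens that appear in at least `min_count` distinct links."""
    counts: Counter = Counter()
    for link in links:
        counts.update(set(tokenize_link(link)))
    return {tok for tok, n in counts.items() if n >= min_count}


def link_features(link: str, vocab: Set[str]) -> Dict[str, int]:
    """Binary bag-of-tokens features restricted to the frequent vocabulary."""
    present = set(tokenize_link(link))
    return {tok: int(tok in present) for tok in sorted(vocab)}


if __name__ == "__main__":
    sample = [
        "http://cheap-pills.example.com/buy/now?id=1",
        "http://example.com/buy/cheap",
        "https://blog.example.org/recipes",
    ]
    vocab = frequent_tokens(sample)
    print(sorted(vocab))
    print(link_features(sample[2], vocab))
```

Restricting features to frequent tokens keeps the input vocabulary small and avoids memorizing one-off strings such as random IDs.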

Spam User Model

Identifying users engaging in spam activities is the ultimate solution for fighting spam, but it is extremely hard to achieve. We leverage both supervised and unsupervised models to build an effective spam user identification system.

Classification Model

Our spam user classification model is a Deep Neural Network (shown in Figure 2) and is part of our proactive system. It is trained on synthetically labeled data, generated with a minimal amount of human supervision to ensure quality. We use features created from user attributes and their past behaviors as inputs. We also use user-domain interactions as an input, summarized for each user as a distribution of domain scores reused from the spam domain model. We periodically run batch inference with this model to score millions of Pinners in a PySpark job that uses TensorFlow, Spark SQL, and a UDF.

Figure 2. Deep Neural Network for user classification
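One way to summarize a user's domain interactions as a score distribution is a normalized histogram over the spam scores of the domains that user has linked to. The sketch below assumes scores in [0, 1] and a fixed number of equal-width bins; the binning scheme and the function name are illustrative assumptions, not details from the post.

```python
from typing import Dict, Iterable, List


def domain_score_histogram(
    user_domains: Iterable[str],
    domain_scores: Dict[str, float],
    num_bins: int = 5,
) -> List[float]:
    """Return the fraction of a user's scored domains falling in each score bin."""
    counts = [0] * num_bins
    for domain in user_domains:
        score = domain_scores.get(domain)
        if score is None:
            continue  # domain has not been scored by the domain model
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        index = min(int(score * num_bins), num_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * num_bins
    return [count / total for count in counts]


if __name__ == "__main__":
    scores = {"a.com": 0.05, "b.com": 0.95, "c.com": 1.0}
    print(domain_score_histogram(["a.com", "b.com", "c.com", "unknown.com"], scores))
```

A fixed-length histogram gives the classifier the same input shape for every user, regardless of how many domains that user has interacted with.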

Clustering

We have developed lightweight clustering models for early detection of suspicious users and bots. This technique also addresses gaps in our classification models, which are unaware of emerging patterns unless re-trained with fresh labeled data. We cluster users on attributes that isolate suspicious groups with high accuracy. Experts identify these attributes by exploring the behavior of suspicious users and their use of resources for creating spammy content. This model is implemented using PySpark and Spark SQL and runs daily.
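A lightweight version of this idea is to group users by a tuple of shared attributes and flag unusually large groups. The sketch below uses plain Python rather than PySpark so it stays self-contained; the attribute names (`signup_ip_block`, `user_agent`) and the `min_size` cutoff are hypothetical stand-ins for the expert-chosen attributes, which the post does not name.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def suspicious_clusters(
    users: List[dict],
    keys: Sequence[str],
    min_size: int = 3,
) -> Dict[Tuple, List[str]]:
    """Group users by shared attribute values and keep groups of `min_size` or more."""
    groups: Dict[Tuple, List[str]] = defaultdict(list)
    for user in users:
        signature = tuple(user.get(key) for key in keys)
        if any(value is None for value in signature):
            continue  # skip users with missing attributes
        groups[signature].append(user["user_id"])
    return {sig: ids for sig, ids in groups.items() if len(ids) >= min_size}


if __name__ == "__main__":
    users = [
        {"user_id": "u1", "signup_ip_block": "10.0.0", "user_agent": "bot/1.0"},
        {"user_id": "u2", "signup_ip_block": "10.0.0", "user_agent": "bot/1.0"},
        {"user_id": "u3", "signup_ip_block": "10.0.0", "user_agent": "bot/1.0"},
        {"user_id": "u4", "signup_ip_block": "10.0.1", "user_agent": "browser"},
    ]
    print(suspicious_clusters(users, keys=("signup_ip_block", "user_agent")))
```

In a Spark setting the same grouping would typically be expressed as a group-by with a count filter.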

Spam User-Domain Model

Interactions of users with domains are explicitly captured by a heterogeneous bipartite graph, as shown in Figure 3. We represent users and domains as nodes in the graph and create an edge between a user and a domain if the user has created or saved a Pin with the domain’s link. This graph facilitates simultaneous identification of spam users and domains using semi-supervised learning. We use a small set of labeled users and domains to run a label propagation algorithm and learn scores for the unlabeled users and domains. We implement this iterative algorithm in Spark and run it periodically.

Figure 3. Bipartite graph of users and domains for label propagation
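To make the propagation step concrete, here is a simplified label propagation sketch on a user-domain bipartite graph. It is one common variant (labeled seeds stay clamped, each unlabeled node takes the mean score of its neighbors), written in plain Python instead of Spark. The function name, the neutral prior of 0.5, and the fixed iteration count are assumptions; the post does not specify the exact update rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def propagate_labels(
    edges: Iterable[Tuple[str, str]],
    user_seeds: Dict[str, float],
    domain_seeds: Dict[str, float],
    iterations: int = 20,
    prior: float = 0.5,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Spread spam scores (1.0 = spam, 0.0 = benign) across (user, domain) edges."""
    user_nbrs = defaultdict(set)
    domain_nbrs = defaultdict(set)
    for user, domain in edges:
        user_nbrs[user].add(domain)
        domain_nbrs[domain].add(user)

    # Labeled nodes start at their label; unlabeled nodes start at the prior.
    user_scores = {u: user_seeds.get(u, prior) for u in user_nbrs}
    domain_scores = {d: domain_seeds.get(d, prior) for d in domain_nbrs}

    for _ in range(iterations):
        # Update unlabeled domains from their users, then users from their domains.
        for domain, users in domain_nbrs.items():
            if domain not in domain_seeds:
                domain_scores[domain] = sum(user_scores[u] for u in users) / len(users)
        for user, domains in user_nbrs.items():
            if user not in user_seeds:
                user_scores[user] = sum(domain_scores[d] for d in domains) / len(domains)
    return user_scores, domain_scores


if __name__ == "__main__":
    edges = [
        ("spammer", "bad.com"), ("u2", "bad.com"),
        ("good", "ok.com"), ("u3", "ok.com"),
    ]
    users, domains = propagate_labels(edges, {"spammer": 1.0, "good": 0.0}, {})
    print(users)
    print(domains)
```

Because both node types are scored in the same loop, a handful of labeled users can surface spam domains and vice versa, which is the "simultaneous identification" the graph is meant to enable.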

Measurement

We measure spam prevalence on Pinterest by counting the Pin impressions that either have spam links or were created by users engaging in spammy activities. We periodically sample and manually review both impressed Pins and users. We scaled our measurement by first sampling and reviewing highly impressed head domains and then extending coverage to tail domains over time. These samples are used for measuring overall spam prevalence as well as for training our machine learning models.
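As a small worked example of an impression-based metric, the sketch below computes the share of impressions in a reviewed sample that went to Pins with a spam link or a spammy creator. The record fields (`impressions`, `spam_link`, `spam_creator`) are hypothetical, and the sketch ignores any sampling weights the real measurement may apply.

```python
from typing import Dict, List


def spam_prevalence(reviewed: List[Dict]) -> float:
    """Fraction of sampled impressions on Pins judged spam by link or by creator."""
    total = sum(row["impressions"] for row in reviewed)
    if total == 0:
        return 0.0
    spam = sum(
        row["impressions"]
        for row in reviewed
        if row["spam_link"] or row["spam_creator"]
    )
    return spam / total


if __name__ == "__main__":
    sample = [
        {"impressions": 100, "spam_link": False, "spam_creator": False},
        {"impressions": 50, "spam_link": True, "spam_creator": False},
        {"impressions": 50, "spam_link": False, "spam_creator": True},
    ]
    print(spam_prevalence(sample))
```

Weighting by impressions rather than by Pin count means a single widely seen spam Pin moves the metric more than many unseen ones.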

Conclusion

Pinterest’s mission is to bring everyone the inspiration to create a life they love. We strive to protect our Pinners’ experiences by swiftly and appropriately acting against malicious users and spam content identified by our array of machine learning models. We plan to keep investing in our community guidelines and technology to address inevitably emerging challenges and bring the best experience to our millions of valued users.

Acknowledgements

Thanks to Yuanfang Song, Omkar Panhalkar, Rundong Liu, Qinglong Zeng, Attila Dobi, Abhijit Mahabal, Alok Singhal, Maisy Samuelson, and the rest of the Trust and Safety team for their contributions in developing machine learning models for spam! Thanks to Harry Shamansky for helping with the publication of the blog post!
