DOI: 10.1145/2908131.2908134
invited-talk

Applying machine learning to ads integrity at Facebook

Published: 22 May 2016

ABSTRACT

More than 1.5 billion people use Facebook monthly (1 billion daily) to stay connected with friends and family, to discover what is going on in the world, and to share and express what matters to them. They have access to content such as pages' updates, group posts, products and ads. Maintaining a great experience requires that the content shown to them is of the highest quality.

Facebook's Advertising system is designed to foster a positive user experience in which ads are shown to the people most likely to care about the content. Unfortunately, a small number of the submitted ads are not suitable to be shown to the people using Facebook: some contain low-quality creatives, some are spammy or misleading, others may run afoul of local customs and laws, and others still may prey on people's emotions or contain excessively shocking content.

The mission of the Integrity team at Facebook consists of identifying and blocking, at scale, low-quality creatives and content that violates Facebook's policies before they enter the matching and ranking algorithms and are potentially displayed to people on our platform. Towards the goal of protecting people and advertisers by creating a safe, high-quality ad experience, we review all ads created and decide whether they meet our quality bar or violate our policies. However, at Facebook's scale it is not feasible to manually review each new ad. Instead, the team uses a combination of automated machine learning models and human computing to detect policy-violating and low-quality ads, block their distribution within the platform, and notify the content creator with hints about how to remedy the issue.

All ads are scored by hundreds of supervised and unsupervised machine learning models and classified by a complex set of rule-based engines, and only those most likely to contain low-quality content are reviewed by humans (who must ensure the ad complies with the high standards expected from advertisers on Facebook's platform). To make the process more efficient, given that ads might reuse creatives (e.g., the same image across different ads), the system considers the full ad as well as each of its components individually (e.g., title, description, image, video, etc.). Given the complexity of this process, multiple challenges must be addressed, such as:

Large class imbalance. Most created ads are of good quality, which yields a very skewed distribution on which to train our machine learning models.

Global reach and internationalization. Integrity needs to understand the languages and identify patterns from all regions where Facebook operates.

Feature engineering on ad content. Understanding in detail content such as text, image, video or audio is difficult and requires a large variety of techniques.

Human reviewer accuracy. Labeling by humans is not perfectly accurate and typically introduces noisy or wrong labels that can distort metrics, decisions and training data.

Dynamic ecosystem and evolving patterns. New products, global scale and changes in advertiser behavior require a system that adapts and evolves to detect new and potentially unseen patterns.

Machine learning at scale. Facebook operates at exabyte scale and therefore requires solutions capable of generating features and of training and executing machine learning models on very large volumes of data.
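
The review flow described above (score each ad and its individual components, reuse work when a creative is shared across ads, and send only the riskiest ads to human review) can be illustrated with a minimal sketch. This is not Facebook's implementation: the names `Ad`, `Triage`, `score_component` and `review_threshold`, the use of a content hash as the reuse key, and taking the maximum component score as the ad-level risk are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Ad:
    ad_id: str
    # Component kind -> raw content, e.g. {"title": b"...", "image": b"..."}
    components: Dict[str, bytes]


def content_key(kind: str, payload: bytes) -> str:
    """Stable key for a component, so identical creatives share one score."""
    return kind + ":" + hashlib.sha256(payload).hexdigest()


@dataclass
class Triage:
    # Hypothetical scoring function standing in for the models and rules;
    # it is assumed to return a risk value in [0, 1].
    score_component: Callable[[str, bytes], float]
    # Hypothetical cut-off above which an ad is sent to human review.
    review_threshold: float = 0.8
    cache: Dict[str, float] = field(default_factory=dict)

    def component_risk(self, kind: str, payload: bytes) -> float:
        # Score each distinct component once; reused creatives hit the cache.
        key = content_key(kind, payload)
        if key not in self.cache:
            self.cache[key] = self.score_component(kind, payload)
        return self.cache[key]

    def ad_risk(self, ad: Ad) -> float:
        # Simplifying assumption: an ad is as risky as its riskiest component.
        return max(
            (self.component_risk(k, p) for k, p in ad.components.items()),
            default=0.0,
        )

    def route_to_review(self, ads: List[Ad]) -> List[str]:
        # Only ads at or above the threshold are queued for human reviewers.
        return [ad.ad_id for ad in ads if self.ad_risk(ad) >= self.review_threshold]
```

A caller would construct `Triage` with its own scoring function and pass a batch of ads to `route_to_review`. In this sketch, two ads that share the same image bytes trigger only one scoring call for that image.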

This talk provides an overview of the ad review process, introduces some of the challenges Facebook faces as well as some solutions in these areas.

    • Published in

      WebSci '16: Proceedings of the 8th ACM Conference on Web Science
      May 2016
      392 pages
      ISBN:9781450342087
      DOI:10.1145/2908131

      Copyright © 2016 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WebSci '16 Paper Acceptance Rate: 13 of 70 submissions, 19%
      Overall Acceptance Rate: 218 of 875 submissions, 25%
