Facebook Pushes Back Against Report That Claims Its AI Sucks at Detecting Hate Speech

Photo: Carl Court, Getty Images

On Sunday, Facebook vice president of integrity Guy Rosen tooted the social media company’s own horn for moderating toxic content, writing in a blog post that the prevalence of hate speech on the platform has fallen by nearly half since July 2020. The post appeared to be in response to a series of damning Wall Street Journal reports and testimony from whistleblower Frances Haugen outlining the ways the social media company is knowingly poisoning society.

“Data pulled from leaked documents is being used to create a narrative that the technology we use to fight hate speech is inadequate and that we deliberately misrepresent our progress,” Rosen said. “This is not true.”

“We don’t want to see hate on our platform, nor do our users or advertisers, and we are transparent about our work to remove it,” he continued. “What these documents demonstrate is that our integrity work is a multi-year journey. While we will never be perfect, our teams continually work to develop our systems, identify issues and build solutions.”

He argued that it was “wrong” to judge Facebook’s success in tackling hate speech based solely on content removal, and that the declining visibility of this content is a more significant metric. For its internal metrics, Facebook tracks the prevalence of hate speech across its platform, which has dropped by nearly 50% over the past three quarters to 0.05% of content viewed, or about five views out of every 10,000, according to Rosen.

That’s because when it comes to removing content, the company often errs on the side of caution, he explained. If Facebook suspects a piece of content — whether that be a single post, a page, or an entire group — violates its regulations but is “not confident enough” that it warrants removal, the content may still remain on the platform, but Facebook’s internal systems will quietly limit the post’s reach or drop it from recommendations for users.

“Prevalence tells us what violating content people see because we missed it,” Rosen said. “It’s how we most objectively evaluate our progress, as it provides the most complete picture.”
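For readers who want the arithmetic spelled out, the prevalence figure Rosen cites is a simple ratio: views of violating content divided by all content views. The Python sketch below is illustrative only. The function names and the sample counts are hypothetical, and the article does not describe how Facebook actually collects or samples these numbers.

```python
def prevalence(violating_views: int, total_views: int) -> float:
    """Share of all content views that landed on violating content.

    Hypothetical helper: an illustration of the ratio, not Facebook's method.
    """
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return violating_views / total_views


def views_per_10k(violating_views: int, total_views: int) -> float:
    """Restate the same ratio as violating views per 10,000 content views."""
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return violating_views * 10_000 / total_views


# Made-up sample counts chosen to match the figure quoted in the article.
rate = prevalence(violating_views=5, total_views=10_000)
print(f"{rate:.2%}")              # prints 0.05%
print(views_per_10k(5, 10_000))   # prints 5.0
```

Note that this ratio counts views, not posts, so a single widely seen post moves it more than many posts that almost nobody sees.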

Sunday also saw the release of the Journal’s latest Facebook exposé. In it, Facebook employees told the outlet they were concerned the company isn’t capable of reliably screening for offensive content. Two years ago, Facebook cut the amount of time its teams of human reviewers had to focus on hate-speech complaints from users and reduced the overall number of complaints, shifting instead to AI enforcement of the platform’s regulations, according to the Journal. This served to inflate the apparent success of Facebook’s moderation tech in its public statistics, the employees claimed.

According to a previous Journal report, an internal research team found in March that Facebook’s automated systems were removing posts that generated between 3% and 5% of the views of hate speech on the platform. These same systems flagged and removed an estimated 0.6% of all content that violated Facebook’s policies against violence and incitement.

In her testimony before a Senate subcommittee earlier this month, Haugen echoed these stats. She said Facebook’s algorithmic systems can only catch “a very tiny minority” of offensive material, which is still concerning even if, as Rosen claims, only a fraction of users ever come across this content. Haugen previously worked as Facebook’s lead product manager for civic misinformation and later joined the company’s threat intelligence team. As part of her whistleblowing efforts, she’s provided a trove of internal documents to the Journal revealing the inner workings of Facebook and how its own internal research proved its products are toxic for users.

Facebook has vehemently disputed these reports, with the company’s vice president of global affairs, Nick Clegg, calling them “deliberate mischaracterizations” that use cherry-picked quotes from leaked material to create “a deliberately lop-sided view of the wider facts.”