The internet never forgets. A single image, a few seconds of video, or even a simple tweet can wreak havoc in our interconnected world. The advent of deepfakes greatly exacerbated this problem over recent years, creating the most convincing method of disruption and disinformation in today’s world.
Recently, Twitter's initiative to expand its crowdsourced fact-checking program to include images has been touted as a solution to the platform's relatively unchecked deepfake problem. Its “Community Notes” system allows contributors to add context to potentially misleading content on the platform, which now includes deepfakes.
We commend Twitter for taking this step, but the implementation falls short of even the bare minimum. By the time a tweet goes viral, the harm is already done. A viral deepfake, like the recent fake image of an explosion at the Pentagon, spreads like wildfire: that single photo triggered a $100 billion drop in the stock market within minutes (as we discussed on CNBC), causing widespread panic and confusion long before any corrective action could be taken. In essence, by the time the crowdsourced army of fact-checkers identifies a deepfake, the damage is already done.
To explain the inherent flaws in Twitter's system, let's dissect the anatomy of a viral deepfake. These fakes often emerge from less-than-reputable sources (which can appear reputable thanks to Twitter's “everyone can pay for a blue checkmark” approach), quickly gain momentum through posts and reposts from high-profile or verified accounts, and are then rapidly disseminated to millions of users. Once a deepfake reaches viral status, crowdsourcing becomes all but ineffective, as the speed of propagation far outpaces the rate of detection.
Crowdsourcing is a reactive approach, and in this case one that deflects blame and responsibility from Twitter's now-nonexistent Trust and Safety teams (dissolved by new owner Elon Musk last December). It waits for a deepfake to appear and gain traction, then attempts to counteract it. This process can take hours or even days, depending on the complexity of the deepfake and the time it takes for enough people to flag it.
In stark contrast, proactive deepfake detection technology intercepts and scrutinizes potential deepfakes before they reach a single user. This approach detects minute discrepancies and media idiosyncrasies that largely go unnoticed by the human eye, providing a line of defense at the source, before any malicious intent can be realized. At Reality Defender, we offer best-in-class detection methods across all media types (backed by an ensemble of corresponding models) to ensure that deepfakes are detected and flagged to a platform's moderation team in real time. As a company fighting dangerous deepfakes and AI-generated misinformation, we believe in safeguarding our online spaces not only for the users of today, but for future generations as well.
Twitter's crowdsourcing solution offers a false sense of security, a band-aid on a wound that requires a tourniquet. By the time a Community Note appears, the deepfake has already embedded itself in the collective memory of users. Any correction leaves less of an impact and reaches fewer users than the original content.
It's not enough to put out the fire. We must prevent it from starting in the first place. Twitter and all platforms like it need to change their approach from a reactive one to a proactive one, creating a solution that works ahead of the problem instead of following in its wake. Not doing this will only harm millions of users and spread disinformation at an unprecedented rate.
Proactive deepfake detection is the only viable way forward in the fight against deepfakes. It is a necessity in an age of ever-advancing AI, where reality can be twisted and manipulated simply by “feeding the model.” We welcome any and all industry players to join us in taking proactive steps to detect deepfakes on platforms, in media, in content across their companies, and in any setting where humans are susceptible to manipulation through any medium. Once we ensure that deepfakes are stopped at the source, we can provide a safer, more trustworthy world for all.
If you are interested in collaborating with Reality Defender to solve the many challenges of AI, deepfakes, and generative content, please contact us here.