Symptoms and Cure of Fake News

Fake News: a phrase that started, grew and blew out of proportion. ‘Fake News’ is what the intelligentsia would call ‘disinformation’. It has caused quite a stir in recent years, but the phrase’s origin was lost in the uprising against it.

First Use

In mid-2016, BuzzFeed’s media editor, Craig Silverman, observed a stream of whimsically concocted untrue stories originating from the small Eastern European town of Veles in Macedonia. The ensuing investigation uncovered 140 fake news websites that teens in Veles were using to make money from Facebook advertising. This gave birth to the faux journalism that we now call ‘Fake News’ [1].

But we’re not here to discuss the nomenclature of ‘Fake News’ and its colossal political ramifications. The focus has now shifted towards identifying its source(s) and, more importantly, curbing it.

Automation - Dynamo of good and bad

Automation plays a key role in this arena, both in the generation of fake news and in its identification and termination. AI systems came to the forefront with GPT-2 by OpenAI, the AI research organization co-founded by Elon Musk. Even though OpenAI did a good job with language modeling, the incoherence of the textual output indicated that it was algorithm-generated. Thus, it didn’t prove to be an imminent threat in the world of news content [2].

The next advancement came with Grover. A team of researchers from the University of Washington and the Allen Institute for AI built the tool to prevent AI-generated fake news from spreading over the internet [3]. The biggest peril that came with Grover was its ability to generate eerily plausible fake news, with content that can read better than human-written content. This stems from the fact that, in addition to the body of an article, Grover takes into account its headline, publication name, author name, and other details. It also adapts to the writing styles of different news outlets and writes convincing articles in those styles. Yet, at the same time, the best antidote to the propaganda that Grover could generate is Grover itself. It distinguishes algorithm-generated fake news from human-written content with an accuracy of 92% [3]. Thus, the moderated use of tools like Grover becomes essential, since only then do the benefits outweigh the risks.

Figure 1: Grover is potent enough to detect both human-written and self-written fake news with impressively high accuracy [4]

This is a good point to turn to the collaborative initiative of the MIT-IBM Watson AI Lab and HarvardNLP. They built the ‘Giant Language Model Test Room (GLTR)’ to detect automatically generated text. Like a language model, the tool predicts the next word in a sentence, but it puts that prediction to an unusual use. It relies on the unpredictability of human word choice and tests whether each word in the text is the most obvious choice of word in that position [5]. Text in which almost every word is a highly predictable choice is likely to be machine-generated, whereas human writers pick surprising words more often. This method isn’t foolproof, but it lets the algorithm identify AI-generated text by testing for its generic nature.
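To make the idea concrete, here is a minimal sketch of the rank test described above. It is only an illustration under loud assumptions: a toy bigram counter trained on a tiny made-up corpus stands in for the large language model that GLTR relies on, and the names train_bigram_model, token_ranks and top_k_fraction are our own, not part of GLTR.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus_tokens):
    """Count which word follows which (a toy stand-in for a real language model)."""
    following = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        following[prev][nxt] += 1
    return following


def token_ranks(model, tokens):
    """Rank of each actual word among the model's predictions for its position.

    Rank 1 means the word was the model's most likely choice.
    None means the model never saw this word after the previous one.
    """
    ranks = []
    for prev, actual in zip(tokens, tokens[1:]):
        ranked = [word for word, _ in model[prev].most_common()]
        ranks.append(ranked.index(actual) + 1 if actual in ranked else None)
    return ranks


def top_k_fraction(ranks, k=1):
    """Share of words that were among the model's k most likely choices."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Made-up training text and two made-up sentences to score.
corpus = "the cat sat on the mat the cat ate the fish the cat sat".split()
model = train_bigram_model(corpus)

predictable = "the cat sat on the mat".split()
surprising = "the fish ate the mat".split()

print(top_k_fraction(token_ranks(model, predictable)))
print(top_k_fraction(token_ranks(model, surprising)))
```

A high top-k fraction means the text keeps choosing the obvious next word, which is the generic quality this kind of test looks for. A real detector would need a far stronger language model and a carefully chosen threshold.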

Fabula – Stepping on to fabulous?

The most fascinating piece of research in this space came from the London startup Fabula AI. Their patented work revolves around the concept of ‘Geometric Deep Learning’. The technology behind this concept, and the path that Fabula AI drew from there to its unexpected application in detecting fake news, is particularly interesting. The common notion had been that detecting automatically generated fake news is achievable, but that the process would involve perpetual fine-tuning because of the gigantic volume of data out there and the nuances of each format.

Fabula AI upturned that idea by devising a method that eliminates the need to read and understand news altogether [6]. Instead, it relies on the propagation patterns of such news, which form a non-Euclidean geometry. In 2018, Fabula AI Ltd. and École Polytechnique Fédérale de Lausanne published their patent [7]. The patent, titled ‘System and a method for learning features on geometric domains’, specifically lays out a method to analyze non-Euclidean geometry. A generic instance shared in the patent depicts how a volumetric convolutional neural network is used to analyze the deformation of a deformable shape by applying a 4×4×4 3D filter before and after the deformation to view the differences in the filter elements (Figure 2). Let’s dive a little deeper into this approach and see how it relates to the identification of fake news.

Figure 2: Difference in the elements of a 4×4×4 3D filter before and after the deformation of a deformable shape [7]

Non-Euclidean Geometry of Social Networks

According to Michael Bronstein, the co-founder and Chief Scientist of Fabula, fake news and true news circulate in characteristically different patterns. Geometric deep learning is useful here because it can work with heterogeneous network-structured data, such as the interactions between users and the paths along which news travels [6].

According to an MIT study, truth spreads more slowly than fake news, and the difference in pace is attributed to the human psyche: humans are more prone to retweet false news [8].

The study examined millions of tweets posted between 2006 and 2017 and narrowed them down to those related to the 126,000 stories that were checked by six fact-checking organizations: Snopes, PolitiFact, FactCheck.org, Truth or Fiction, Hoax Slayer and About.com [8]. Some of the statistics that this study generated were as follows:

· Truth mostly spreads to fewer than 1,000 people

· The most widely shared false news (the top 1% of cascades) spreads to between 1,000 and 100,000 people

· Truth took 6 times longer than false information to reach 1,500 people

· Falsehood spreads more broadly and is retweeted by more unique users [8]
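Statistics like these come from measuring retweet cascades. As a rough sketch of what such measurements involve, the snippet below computes the size, depth and maximum breadth of a single cascade. The user names and retweet pairs are made up, and cascade_stats is a hypothetical helper of our own, not code from the study.

```python
from collections import defaultdict, deque


def cascade_stats(root, retweets):
    """Summarize one retweet cascade.

    retweets is a list of (parent_user, child_user) pairs, meaning
    child_user retweeted the story from parent_user.
    """
    children = defaultdict(list)
    for parent, child in retweets:
        children[parent].append(child)

    # Breadth-first walk from the original poster, recording each user's depth.
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        user = queue.popleft()
        for child in children[user]:
            if child not in depth_of:
                depth_of[child] = depth_of[user] + 1
                queue.append(child)

    # Count how many users sit at each depth to find the widest level.
    users_per_level = defaultdict(int)
    for depth in depth_of.values():
        users_per_level[depth] += 1

    return {
        "size": len(depth_of),                         # unique users reached
        "depth": max(depth_of.values()),               # longest retweet chain
        "max_breadth": max(users_per_level.values()),  # widest single level
    }


# Made-up cascade: "a" posts, "b" and "c" retweet "a", and so on.
example = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("b", "f"), ("d", "g")]
print(cascade_stats("a", example))
```

Comparing these numbers across many true and false stories is what surfaces differences of the kind listed above.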

As seen in Figure 3, these patterns are so prominent that, when plotted, users with a greater propensity to propagate fake news are easily distinguishable. Thus, news emanating from these users has a greater chance of being misinformation.

Figure 3: Visualization of fake news vs real news distribution patterns. Users who predominantly share fake news are marked red and users who never share fake news are marked blue. This clear difference makes propagation patterns a good basis for identifying fake news [6].

Fabula claims that its algorithms were able to identify fake news within hours of diffusion with an accuracy of 93% [6]. The figure is not plain accuracy but ROC AUC (Receiver Operating Characteristic, Area Under the Curve), a standard aggregate measure of the performance of a machine learning classification model.
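For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen fake story receives a higher score from the classifier than a randomly chosen real one. The sketch below computes it directly from that definition. The labels and scores are made up for illustration, and roc_auc is our own helper, not Fabula’s evaluation code.

```python
def roc_auc(labels, scores):
    """ROC AUC from its pairwise definition.

    labels: 1 for fake, 0 for real.
    scores: the classifier's 'fakeness' score for each story.
    Returns the share of (fake, real) pairs in which the fake story
    scored higher; ties count as half a win.
    """
    fake = [s for y, s in zip(labels, scores) if y == 1]
    real = [s for y, s in zip(labels, scores) if y == 0]
    if not fake or not real:
        raise ValueError("need at least one fake and one real example")

    wins = 0.0
    for f in fake:
        for r in real:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake) * len(real))


# Made-up example: three fake and three real stories.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))  # 8 of the 9 fake/real pairs are ordered correctly
```

A score of 1.0 means every fake story outranks every real one, while 0.5 is what random guessing would achieve. That is why a ROC AUC of 93% is not the same claim as "93% of stories are classified correctly".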

One patent, diverse applications

Social media wasn’t the only area where Fabula used this approach. Fabula’s Federico Monti, a member of Prof. Michael Bronstein’s research group, was posed with the following problem: the network of sensors positioned in the IceCube Neutrino Observatory is irregular, with different sensor densities in different regions of the ice, which makes it difficult to process the data collected [9]. He directed his solution towards Graph Convolutional Neural Networks (GCNN), which are used to study non-Euclidean geometry such as graphs or surfaces [10]. Using this approach, Fabula has also identified a classifier that has been trained to detect the spread of infectious diseases through a population [6]. Similar irregular dynamics can be found in the propagation of fake news through a social network. Identifying these dynamics helps nip this transmission in the bud.
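To give a feel for what a graph convolution does, here is a minimal sketch of one layer in plain Python. It is a generic mean-aggregation variant chosen for illustration, not the method from Fabula’s patent or from the IceCube work. The tiny graph, the single feature per user and the weight are all made up.

```python
def graph_conv_layer(adjacency, features, weights):
    """One mean-aggregation graph convolution layer.

    adjacency: n x n matrix, adjacency[i][j] is 1 if nodes i and j are linked.
    features:  n x f matrix, one feature vector per node.
    weights:   f x k matrix shared by all nodes (learned in a real model).

    Each node averages the features of itself and its neighbours,
    applies the shared linear map, then a ReLU.
    """
    n = len(adjacency)
    num_features = len(features[0])
    num_outputs = len(weights[0])
    output = []
    for i in range(n):
        neighbourhood = [i] + [j for j in range(n) if adjacency[i][j] and j != i]
        mean = [
            sum(features[j][f] for j in neighbourhood) / len(neighbourhood)
            for f in range(num_features)
        ]
        projected = [
            sum(mean[f] * weights[f][k] for f in range(num_features))
            for k in range(num_outputs)
        ]
        output.append([max(0.0, value) for value in projected])
    return output


# Made-up graph of three users in a chain: 0 - 1 - 2.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# One hypothetical feature per user, e.g. 1.0 if the user shared the story.
features = [[1.0], [0.0], [0.0]]
weights = [[2.0]]

print(graph_conv_layer(adjacency, features, weights))
```

Because the same weights are applied around every node, the layer works on graphs of any shape or density, which is the property that makes this family of networks suited to irregular sensor grids and social networks alike. Stacking several layers lets information travel further along the graph.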

Advancing from this research, Michael Bronstein recently filed another patent application, published by the World Intellectual Property Organization [11]. This one relates more specifically to the evaluation of news in social media networks. It elaborates on the user descriptors that need to be collected and on the families of neural networks to which the graph neural network can belong.

When the fake news wave started gaining momentum globally, most multinational companies started researching this area and added to the global pool of solutions to curb it. In the past two years, a sizable number of large companies have been directing their efforts towards joining this fight against fakery.

Tech giants tapping into Fake News Detection

Companies such as Google and Twitter are making sure that they take a front seat in the fight against fake news. On June 3rd, 2019, Twitter announced that it had acquired Fabula AI to enhance its ML expertise [12]. It is evident that Twitter understands its role in the fake news ecosystem and thus wants to safeguard its platform and its users against lies, deceit, and misinformation.

Google, on the other hand, is using its search engine strength and putting simpler measures in place to fight fake news. Moving away from implementing AI algorithms that detect false articles, it released a tool that helps news agencies tag articles that expose misinformation so that Google News can explicitly feature them. As an extension to this tool, Google also provides journalists with a database of these stories for fact-checking purposes [13]. The Google News Initiative is a big venture, and these steps were a few of the many that Google took in this direction.

The eastern hemisphere is fighting this battle with equal rigor. The Chinese government has imposed serious regulations against misinformation. In 2016, it criminalized creating or spreading false news that undermines economic and social order. Last August, the authorities launched an app that offers a platform for reporting such content [14].

Jinri Toutiao, China’s biggest news aggregator mobile app, is working on AI algorithms to create a bot that generates fake news. This provides researchers with the continuous supply of misinformation required to train fact-checking models and thus improves the identification of ‘false news’, or ‘rumors’ as the Chinese government refers to them [15][16]. Baidu, the behemoth search engine company headquartered in Beijing, has filed a couple of patents in this space too. One of them relates closely to the identification and classification of low-quality news sources [17]. More specifically, the method does so by checking the information of the to-be-recognized news source against a pre-built repository of low-quality news information. This approach is predicted to completely automate the news quality-check process and increase efficiency significantly.

Humans vs Machines

From time immemorial, humans have been trapped in self-created problems. Fake news is one of them. Thus, in this era where AI acts as our friend and foe, will the positives ever outweigh the damages? If yes, when? Who would be responsible for this victory? Humans or machines?

References

[1] https://www.bbc.com/news/blogs-trending-42724320

[2] https://www.theverge.com/2019/2/14/18224704/ai-machine-learning-language-models-read-write-openai-gpt2

[3] https://futurism.com/ai-generates-fake-news

[4] https://medium.com/ai2-blog/counteracting-neural-disinformation-with-grover-6cf6690d463b

[5] http://gltr.io/

[6] https://techcrunch.com/2019/02/06/fabula-ai-is-using-social-spread-to-spot-fake-news/

[7] https://patents.google.com/patent/US10013653B2/en?oq=US10013653

[8] https://techcrunch.com/2018/03/08/false-news-spreads-faster-than-truth-online-thanks-to-human-nature/

[9] https://www.fabula.ai/news-index/2019/2/19/fabulas-geometric-deep-learning-helping-detect-neutrinos

[10] https://www.usi.ch/en/feeds/9771

[11] https://patents.google.com/patent/WO2019183191A1/en?q=fake&q=news&oq=fake+news