April Fools hoax stories could offer linguistic clues to spotting fake news articles, say scientists who identified the similarities in the language used in humorous spoofs and malicious stories. Researchers from Lancaster University in the UK compiled a dataset of more than 500 April Fools articles sourced from more than 370 websites and written over 14 years.
They found that there are similarities in the written structure of humorous April Fools hoaxes published by media outlets and fake news stories. “April Fools hoaxes are very useful because they provide us with a verifiable body of deceptive texts that give us an opportunity to find out about the linguistic techniques used when an author writes something fictitious disguised as a factual account,” said Edward Dearden from Lancaster University.
“By looking at the language used in April Fools and comparing them with fake news stories we can get a better picture of the kinds of language used by authors of disinformation,” said Dearden.
A comparison of April Fools hoax texts against genuine news articles written in the same period revealed stylistic differences. Researchers focused on specific features within the texts, such as the amount of detail used, vagueness, formality of writing style and complexity of language.
They then compared the April Fools stories with a ‘fake news’ dataset, and found a number of similar characteristics. Both April Fools hoaxes and fake news articles tend to use less complex language, to be easier to read, and to contain longer sentences than genuine news.
Important details for news stories, such as names, places, dates and times, appear less frequently in April Fools hoaxes and fake news than in genuine news. First person pronouns, such as ‘we’, are also a prominent feature of both April Fools hoaxes and fake news, the researchers said.
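To make the kinds of features described above concrete, here is a minimal sketch of how such stylistic signals might be measured. It is not the researchers' feature set: the function name `stylistic_features`, the pronoun list and the regular-expression heuristics are all assumptions chosen for illustration, and a real study would likely use a proper tokeniser and named-entity recogniser.

```python
import re

# Assumed, illustrative list of first-person pronouns.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def stylistic_features(text):
    """Return rough per-text style measures (a crude approximation)."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)

    # Capitalised words that do not open a sentence: a crude proxy
    # for names and places. The pronoun "I" is excluded.
    proper = 0
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z']+", sentence)
        proper += sum(
            1 for t in tokens[1:]
            if t[0].isupper() and t.lower() not in FIRST_PERSON
        )

    # Runs of digits: a crude proxy for dates and times.
    numbers = len(re.findall(r"\d+", text))

    return {
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "first_person_rate": sum(w.lower() in FIRST_PERSON for w in words) / n_words,
        "proper_noun_rate": proper / n_words,
        "number_rate": numbers / n_words,
    }
```

Calling `stylistic_features(article_text)` on a hoax and on a genuine report would give two dictionaries that can be compared measure by measure. Rates are normalised by word count so that texts of different lengths remain comparable.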
The team also created a machine learning ‘classifier’ to identify if articles are April Fools hoaxes, fake news or genuine news stories. The classifier achieved 75 per cent accuracy in identifying April Fools articles and 72 per cent in identifying fake news stories.
When the classifier was trained on April Fools hoaxes and set the task of identifying fake news, it recorded an accuracy of more than 65 per cent.
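The idea of training on one kind of deceptive text and testing on another can be sketched in a few lines. The article does not say which model the team used, so the example below substitutes a simple nearest-centroid classifier, and every feature vector in it is made-up toy data (first-person pronoun rate, proper-noun rate), not figures from the study.

```python
from statistics import mean


def train(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    # The centroid is the per-feature mean of each label's vectors.
    return {
        label: [mean(col) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(model, vec):
    """Return the label whose centroid is closest to vec."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, v) == y for v, y in samples) / len(samples)


# Toy, invented data: [first_person_rate, proper_noun_rate].
april_fools_train = [
    ([0.05, 0.04], "deceptive"),
    ([0.06, 0.05], "deceptive"),
    ([0.01, 0.12], "genuine"),
    ([0.02, 0.10], "genuine"),
]
fake_news_test = [
    ([0.04, 0.06], "deceptive"),
    ([0.01, 0.11], "genuine"),
    ([0.03, 0.09], "deceptive"),
]

# Train on one domain (April Fools), evaluate on another (fake news).
model = train(april_fools_train)
cross_domain_accuracy = accuracy(model, fake_news_test)
```

Because the training and test sets come from different sources, `cross_domain_accuracy` measures how well patterns learned from one form of deception transfer to the other. That is the kind of question the cross-domain figure reported above addresses.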
“Looking at details and complexities within a text are crucial when trying to determine if an article is a hoax,” said Alistair Baron, from Lancaster University.
“Although there are many differences, our results suggest that April Fools and fake news articles share some similar features, mostly involving structural complexity,” Baron said.
“Our findings suggest that there are certain features in common between different forms of disinformation and exploring these similarities may provide important insights for future research into deceptive news stories,” he said.