This AI found trends hidden in British history for more than 150 years

Researchers from the University of Bristol used AI to analyse British newspapers from 1800 to 1950
Data analysis of newspapers by researchers at the University of Bristol pinpointed the year when electricity took over from steam in the UK. Science & Society Picture Library / Contributor / Getty

When did electricity take over from steam in the UK? When did football replace cricket as the most popular sport? And what year did women start to become more frequently mentioned in the press?

We can turn to big data for the answers. Specifically, a new paper by a team of artificial intelligence researchers at the University of Bristol, who used AI to analyse the news from 100 different British regional newspapers over a 150-year period.

The team of academics, led by professor Nello Cristianini, collaborated closely with the company findmypast, which is digitising historical newspapers from the British Library as part of their British Newspaper Archive project.

More than 35 million articles and 28.6 billion words - around 14 per cent of local newspapers from 1800-1950 - were used for the study, which aimed to establish whether major historical and cultural changes could be detected from statistical footprints in the content of the local papers. Cristianini, professor of AI from the department of engineering and mathematics at Bristol, said the study aimed to “demonstrate an approach to understanding continuity and change in history, based on the distant reading of a vast body of news, which complements what is traditionally done by historians.”


Simple content analysis of the newspaper articles meant researchers could detect key events including wars, epidemics and coronations with high accuracy. AI techniques enabled the team to move beyond counting words by detecting references to named entities, such as individuals, companies and locations. In addition, the paper compared the results from the newspaper study with text from books written at the time, in order to determine whether newspapers could be “more sensitive to certain culture shifts”, as newspapers had a closer relation to current events.

The results reflected the social and political feelings of the time. For instance, until the 1930s, the Liberals were mentioned more in the papers than the Conservatives. But after 1922, the Liberals did not return a majority to Parliament, and this was when the Labour party took off as the main rival to the Conservatives.

In addition, the research team found that 1898 was the crossing point when steam declined and electricity rose in popularity; 1902 was pinpointed as the year trains overtook horses; and in 1909, football became more prominent than cricket.
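To make the idea of a "crossing point" concrete, here is a minimal sketch of how one might find the first period in which one word overtakes another in relative frequency. It is not the researchers' pipeline: the function names `yearly_relative_frequency` and `first_crossover_year` are hypothetical, the tokenisation is a naive whitespace split, and the four-period toy corpus is invented purely for illustration.

```python
from collections import Counter


def yearly_relative_frequency(articles, term):
    """Return {year: share of all words in that year equal to `term`}.

    `articles` is an iterable of (year, text) pairs. Tokenisation is a
    naive lowercase whitespace split, which is an assumption of this sketch.
    """
    hits = Counter()
    totals = Counter()
    for year, text in articles:
        words = text.lower().split()
        totals[year] += len(words)
        hits[year] += words.count(term)
    return {year: hits[year] / totals[year] for year in totals if totals[year]}


def first_crossover_year(rising, falling):
    """Return the first year where `rising` exceeds `falling` after being
    at or below it in the previous year, or None if that never happens."""
    years = sorted(set(rising) & set(falling))
    for prev, curr in zip(years, years[1:]):
        if rising[prev] <= falling[prev] and rising[curr] > falling[curr]:
            return curr
    return None


# Invented toy corpus keyed by period index 1..4. NOT data from the study.
toy_articles = [
    (1, "steam engine steam power steam"),
    (2, "steam engine and electricity supply"),
    (3, "electricity supply electricity steam"),
    (4, "electricity electricity lighting"),
]

steam = yearly_relative_frequency(toy_articles, "steam")
electricity = yearly_relative_frequency(toy_articles, "electricity")
crossover = first_crossover_year(electricity, steam)
print(crossover)
```

On real newspaper data the yearly series would be far noisier than this toy example, so some form of smoothing would probably be needed before a single crossing year could be read off.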

A family listening to the radio in Britain in 1931. Radio mentions in the newspapers peaked during the war years. Daily Herald Archive / Getty

In terms of technology, mentions of the radio appeared to peak in the war years, particularly World War Two, and there was a rapid uptake in mentions of the television at the end of the data set, when broadcasting became more popular.

Unsurprisingly, males were systematically more represented in the local press than females during the entire 150-year period studied. However, there was a slow increase in the presence of women after 1900, although this wasn’t found to be caused by any particular incident. The UK Parliament dates the start of suffragette militancy to 1905, so it would make sense that the 1900s marked a turning point when women would be mentioned more in the press.

Something the researchers did note was that the amount of gender bias in the news over the 150-year period was not very different from current levels.

Though the study demonstrated how big data can be beneficial for historical studies, the research team stresses that “the practice of close reading cannot be replaced by algorithmic means.” Tom Lansdall-Welfare, research associate in machine learning in the department of computer science, who led the computational part of the study, said: “We have demonstrated that computational approaches can establish meaningful relationships between a given signal in large-scale textual corpora and verifiable historical moments.

“However, what cannot be automated is the understanding of the implications of these findings for people, and that will always be the realm of the humanities and social sciences, and never that of machines.”

The research was published in the Proceedings of the National Academy of Sciences and was part of the University of Bristol’s ThinkBIG project, which aims to explore the interplay between social sciences, humanities, and large-scale data-driven AI.

This article was originally published by WIRED UK