Improved News Generation Bot, Valtteri 2.0 – Soon Reporting Both Crime and Election News

In early summer 2018, the Immersive Automation project will debut Valtteri 2.0, an improved version of the election news generation bot that we released in April 2017 (see http://www.vaalibotti.fi). In addition to language improvements, Valtteri 2.0 will showcase its ability to generate news articles in a new domain, crime statistics, and to automatically generate visualizations to accompany each article. For example, Valtteri 2.0 will be able to write an article on the current state of, and interesting trends in, motor vehicle theft statistics for any municipality in Finland.

Crime has always interested the public. On a typical day, crime and justice stories make up 15% of the reported news (Katz, 1987).

Often, however, crime news reported in newspapers can give readers misleading, exaggerated, or biased notions of crime. In other words, news does not always present crimes in the proportions in which they are actually committed (Graber, 1979), whereas looking at data in context may give a completely different picture.

Mockup of Valtteri 2.0 (Source of text and graphic is Statistics Finland)

Using data to paint an accurate picture motivated this second version of Valtteri. The prototype is currently being implemented and is planned to be ready in early summer 2018. Like Valtteri 1.0, version 2.0 will take in structured data, this time extracted from Statistics Finland, analyze the data, and generate hundreds of thousands of news articles as a result, an impossible feat for human journalists. In addition, users will still be able to select the news they would like to read and to interact with the included visualizations.
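To make the "structured data in, news text out" idea concrete, here is a minimal sketch in Python. It is not Valtteri's actual implementation: the `CrimeStat` record, the `describe_trend` and `generate_sentences` functions, and the figures in the usage note are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CrimeStat:
    """One row of structured input data (hypothetical schema)."""
    municipality: str
    offence: str
    year: int
    count: int


def describe_trend(previous: CrimeStat, current: CrimeStat) -> str:
    """Turn two yearly figures into one sentence using a fixed template."""
    if current.count > previous.count:
        comparison = "up from"
    elif current.count < previous.count:
        comparison = "down from"
    else:
        comparison = "the same as"
    return (
        f"{current.municipality} recorded {current.count} cases of "
        f"{current.offence} in {current.year}, {comparison} the "
        f"{previous.count} recorded in {previous.year}."
    )


def generate_sentences(stats: list[CrimeStat]) -> dict[tuple[str, str], str]:
    """Group rows by (municipality, offence) and describe the latest change."""
    groups: dict[tuple[str, str], list[CrimeStat]] = defaultdict(list)
    for stat in stats:
        groups[(stat.municipality, stat.offence)].append(stat)

    sentences = {}
    for key, rows in groups.items():
        rows.sort(key=lambda row: row.year)
        if len(rows) >= 2:  # a trend needs at least two data points
            sentences[key] = describe_trend(rows[-2], rows[-1])
    return sentences
```

Calling `generate_sentences` with two invented rows for the same municipality and offence, say 10 cases in 2016 and 14 in 2017, yields one sentence per (municipality, offence) pair. Because every pair gets its own text, the number of possible articles grows with the number of municipalities and offence types, which is how a system of this kind can reach article counts no newsroom could write by hand.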

Valtteri 1.0 showed us the possibilities and gave us experience in automatically generating natural language news articles from structured data. However, we know of no existing systems that automatically generate news from criminal offence statistics, let alone in multiple languages. Stay tuned for the release of Valtteri 2.0!

References:

  • Katz, J. (1987). What makes crime ‘news’? Media, Culture & Society, 9(1), 47-75.
  • Graber, D. A. (1979). Is crime news coverage excessive? Journal of Communication, 29(3), 81-92.

Will robots write the news a hundred years from now?

The original text for this blogpost is an article that post-doc researcher Lauri Haapanen wrote for The Institute for the Languages of Finland.

The automation of news has been in the pipeline for decades. However, it is still humans who are writing the news. Why is that so?

The most apparent challenge for automation is language. Algorithms are already able to conjugate words successfully. However, the subtle nuances of human language cannot be captured by conditional rules of the form “if A, then B”. This limitation makes the language stiff and, in the long run, rather monotonous. An even bigger problem is content. Algorithms work from numeric and highly structured result data from companies, sports and elections. This enables news about these subjects to be successfully automated, including in Finland and in Finnish. However, a real scoop is all about new, unexpected, and hard-earned information. A pre-coded algorithm cannot get a grip on such issues.
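The stiffness that comes from “if A, then B” rules can be shown with a small Python sketch. The function `describe_change` and its 5% thresholds are hypothetical, invented for this illustration, and do not describe any particular news automation system.

```python
def describe_change(percent_change: float) -> str:
    """Pick a phrase with fixed 'if A, then B' rules (thresholds are assumed)."""
    if percent_change > 5:
        return "increased sharply"
    if percent_change < -5:
        return "decreased sharply"
    return "remained roughly stable"


# Feed the rules 401 different values between -20.0% and +20.0%.
changes = [step / 10 for step in range(-200, 201)]
phrasings = {describe_change(change) for change in changes}
print(len(changes), "inputs ->", len(phrasings), "distinct phrasings")
```

However many different figures go in, only the phrases written into the rules can come out, which is why text generated this way starts to read as monotonous once a reader has seen enough of it.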

Thirdly, the hesitation of media companies and software developers is hindering development. “One would imagine that there is a lot going on in the industry,” says news automation researcher Carl-Gustav Lindén, “but with a few exceptions, there really isn’t”. Technology itself is nothing new to editorial work, as newsrooms are full of it. However, the talk of “robotic journalists” has frightened human publishers, although “there is no sign that development in automation would have reduced journalists’ work,” Lindén points out. “We should rather see this development as a step forward in the co-operation between journalists and technology.”

It is certain that “robots” will not be writing analytical and engaging stories in a matter of years or even decades. It is also certain that the collaboration between people and software will continue to develop. A few Finnish editorial offices are already locating potential news topics from public protocols with the help of algorithms. Why could the same software not also compile background material, reveal hidden correlations between distant variables, produce copy and make different versions of finished texts for diverse distribution channels?

Given the speed at which technology is developing, predicting a hundred years into the future feels quite ridiculous. Still, we are going to need to select and communicate news a hundred years from now. Perhaps we’ll be transmitting news to human consciousness directly, without the use of verbal language, which we will think of as a useless bottleneck in the process.