Researchers are clearing language barriers for automated news, aiming for an increasingly varied view of the world

By Aino Pekkarinen

Researchers at the University of Helsinki are developing news automation and information retrieval from text masses in cooperation with five other universities and the Finnish News Agency STT.

How can the essential content of news be found automatically in various languages? How might a computer produce news smoothly, and can the technology adapt to small linguistic areas such as Finland?

These are among the challenges to be tackled by EMBEDDIA, a research project launching in 2019 with EU funding, with the University of Helsinki participating. The three-year project will be developing methods for automated text analysis and generation. This means that the Immersive Automation project will continue in another context, with a new name.

One of the goals of the project is to simplify searching for information from online news, regardless of its language.

“Combining news written in several languages widens the perspective on the subject at hand, while making it possible to find out what is written on the item in different languages and in different media. The goal is to improve people’s access to information,” says Professor of Computer Science Hannu Toivonen, whose research group is taking part in the project.

The University of Helsinki’s Swedish School of Social Science is also a project participant, focused on investigating the needs of media companies.

“This project opens fascinating avenues into developing entirely new solutions for media to utilise. Ensuring a genuine demand for them is also important,” says Docent Carl-Gustav Lindén, a researcher of media and journalism.

Computers have the capacity to report on every single game

Many media companies are already employing automated news for reporting on sports and elections. Using structured data, computers are able to write news articles. For example, ice hockey games are a comfortably regular phenomenon from a computational viewpoint: they consist of three periods, resulting in an unambiguous number of goals.
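
The idea of writing an article from structured data can be illustrated with a minimal Python sketch. It is not the system used by any of the project partners: the `GameResult` schema, the `write_report` function, the sentence templates and the team names are all hypothetical, chosen only to show how a fixed template can be filled in from period-by-period goal counts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GameResult:
    """Structured data for one game (hypothetical schema for illustration)."""
    home: str
    away: str
    period_goals: List[Tuple[int, int]]  # (home goals, away goals) per period


def write_report(game: GameResult) -> str:
    """Fill a fixed sentence template from the structured game data."""
    home_total = sum(h for h, _ in game.period_goals)
    away_total = sum(a for _, a in game.period_goals)
    periods = ", ".join(f"{h}-{a}" for h, a in game.period_goals)

    if home_total == away_total:
        lead = (f"{game.home} and {game.away} are level at "
                f"{home_total}-{away_total} after regulation time.")
    elif home_total > away_total:
        lead = f"{game.home} beat {game.away} {home_total}-{away_total}."
    else:
        lead = f"{game.away} beat {game.home} {away_total}-{home_total}."

    return f"{lead} Period scores ({game.home}-{game.away}): {periods}."


if __name__ == "__main__":
    # Made-up teams and scores, three periods as in ice hockey.
    game = GameResult("Team A", "Team B", [(1, 0), (0, 1), (2, 0)])
    print(write_report(game))
```

Because the input always has the same shape, the same function can be run over every game in a league, which is what makes coverage of even very local matches feasible.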

According to Toivonen, news automation is useful because it enables the production of a great amount of news from consistent data. Computers can write articles on local hockey games even for a handful of readers.

“In such cases, the audience of a single piece of news may be small, but when the number of articles is great, media businesses both achieve extensive coverage and respond to specific needs,” Toivonen explains.

For now, automated news consists of election and sports coverage and the like, generated in a structured manner from structured data. In-depth profiles and news analyses produced by computers are still some way off, since computers are not yet able to handle the linguistic and content variation of these text types.

“Reporters are still needed. The nature of the profession may evolve, and meta-editorial elements will be involved. For example, journalists may instruct computers on reporting various subject matter,” Toivonen says.

“Such developments don’t necessarily apply to all journalists, but everyone must understand the direction the world of media is taking and the possibilities generated by new technologies,” Lindén adds.

Increasingly creative content through metaphors

In the EMBEDDIA project, Toivonen’s group is focusing on how to enable computers to produce news automatically, as efficiently as possible and in several languages. This is a continuation of the group’s earlier research on news automation (in Finnish and Swedish only).

Modern technology provides computers with the ability to create relatively smooth content on election results, but they are not yet good at writing vividly. A creative touch is now being sought for both text structures and word choice.

“Metaphors employ structures that can be taught to computers, at least to a degree. This is how we hope to put a little colour into the language,” says Toivonen.

Background: A partnership of universities and media businesses

The University of Helsinki is participating in EMBEDDIA, a research project to be launched in 2019, developing news automation across language boundaries.

The three-year project is funded by the EU’s Horizon 2020 research programme. The University of Helsinki’s share of the funding totals approximately €450,000.

In addition to the University of Helsinki, five other European universities are taking part in EMBEDDIA, as are the Finnish News Agency STT and three other media businesses.

The name of the project derives from machine learning technologies known as word embeddings, which learn relations between words based on the contexts of their occurrences. The multilingual word-embedding models to be developed in the project will help computers find connections between texts written in different languages.
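
As a rough illustration of how word embeddings in a shared multilingual space can connect words across languages, the following Python sketch compares toy vectors with cosine similarity. The vectors are made up for the example and are not taken from any trained model, the `EMBEDDINGS` table and the `cosine` and `nearest` functions are hypothetical names, and the sketch shows only the comparison step, not how the embeddings are learned.

```python
import math

# Toy 3-dimensional vectors in an assumed shared English-Finnish space.
# The numbers are invented for illustration, not learned from text.
EMBEDDINGS = {
    ("en", "election"): [0.9, 0.1, 0.0],
    ("fi", "vaalit"): [0.8, 0.2, 0.1],
    ("en", "hockey"): [0.1, 0.9, 0.2],
    ("fi", "jääkiekko"): [0.0, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word_key, target_lang):
    """Return the (language, word) key in target_lang closest to word_key."""
    query = EMBEDDINGS[word_key]
    candidates = [key for key in EMBEDDINGS if key[0] == target_lang]
    return max(candidates, key=lambda key: cosine(query, EMBEDDINGS[key]))


if __name__ == "__main__":
    print(nearest(("en", "election"), "fi"))
    print(nearest(("en", "hockey"), "fi"))
```

With these toy values the English word "election" lands closest to the Finnish "vaalit" and "hockey" closest to "jääkiekko"; a real multilingual model would use vectors with hundreds of dimensions learned from large text collections.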

“Automation enables journalists to do things that ordinary people cannot do”

By Hanna Tuulonen

After two years of hard work and excellent results, the Immersive Automation research project received a worthy final seminar with guests from all over Europe. One of the seminar’s most anticipated guests was David Caswell, executive product manager of BBC News Labs and founder of Structured Stories.

BBC News Labs was founded in 2012, and in the past six years it has grown from a few part-time staff to about 20 team members, including journalists, developers, scientists, developer-journalists and broadcast craft experts. According to Caswell, one of its main goals, and one of the main goals for the future of media, is to restore a privileged position for newsrooms.

In his presentation, Caswell said that the media field needs to find new artefacts for news that can restore and maintain a privileged position in a ‘many-to-many’ communications environment.

“One-to-many news artefacts, such as articles and programmes, cannot maintain a privileged position in a many-to-many communication environment because anyone can create them. We have to find and create something that is not easy to copy”, Caswell said at the seminar.

As a solution, Caswell presented several options. One of the most common yet interesting is personalisation. According to Caswell, it is an efficient use of attention because it reduces cognitive friction.

Caswell also talked about how authority is moving from authorship to evidence, and about the need to shift from a ‘trust me’ attitude to a ‘see for yourself’ approach. In addition, newsrooms have to provide context on top of content.

“Content is abundant and cheap. Context is rare and valuable, and it can be assembled from networks of information such as connected data and integrated automation. It enables journalists to do things that ordinary people cannot do”, Caswell explained.

At BBC News Labs, these ideas have been put into practice by providing many different kinds of news artefacts for the same story. The audience can, for example, choose to read a short or a long version of a particular story, or watch a video instead.

In addition to Caswell, two Nordic pioneers, editor Magnus Aabech from the Norwegian News Agency NTB and CEO Sören Karlsson from United Robots in Sweden, spoke at the seminar. In his presentation, Aabech echoed Caswell’s view that one size does not fit all, illustrating the point with a picture.

For his part, Sören Karlsson from United Robots talked about what news automation means for journalists’ work in practice. He presented a picture in which an automated chatbot asks the team leader for comments after a match.

“Using these kinds of tools gives us reliable, relevant and high-quality data. It also increases the quality of the texts, gives a continuous flow of news and personalized distribution”, Karlsson said.

Besides Caswell, Aabech and Karlsson, business developer Maija Paikkala from the Finnish news agency STT and Jukka Niva, head of Yle News Lab at the Finnish public broadcasting company, also talked about how they are experimenting with new forms of structured journalism.

The seminar ended with a presentation by Professor Hannu Toivonen, who introduced Embeddia, the news project of the University of Helsinki Department of Computer Science, followed by a panel discussion moderated by PhD student Stefanie Sirén-Heikel. During the seminar, the Immersive Automation research project’s WAN-IFRA report was also presented by Docent Carl-Gustav Lindén and PhD student Hanna Tuulonen.

The Immersive Automation research project’s final seminar was held in Helsinki at the Swedish School of Social Science, University of Helsinki on the 28th of November 2018.

News production becomes automatic – meta editors are coming

News production is changing as the routine parts of editorial work are being automated. VTT and the University of Helsinki will explore how interesting and high-quality news can be produced automatically, as well as what kind of new user experiences can be offered.

In order to serve the increasingly demanding audiences in multiple digital channels, media houses are trying to automate the most routine editorial work. This way, the editors can concentrate on writing more challenging special stories and giving their audiences opportunities to immerse themselves in increasingly personalized news experiences.

The University of Helsinki and VTT Technical Research Centre of Finland Ltd will research automatic news production where a personalised news experience is enabled by data and machine learning. Hyper locality and audience participation are the key elements here.

“Semi-automatic solutions will be the common practice: the editor will finalise the automatically produced text and define templates for automatic news generating programs. In the future, all editors will be, to some extent, meta editors”, believes VTT’s research professor Caj Södergård.

The degree of automation rises gradually

So far, automation has been trialed in news production by big actors such as the American press agency AP (Associated Press), for example for writing analyses of financial statements. In addition to financial news, sports news is already produced automatically around the world.

“One can expect that producing other types of news can be automated up to a certain point depending on the availability of data. More demanding journalism – such as leading articles and in-depth articles – will remain the task of human journalists,” states journalism researcher Carl-Gustav Lindén from the Swedish School of Social Science, part of the University of Helsinki.

“The University of Helsinki studies how data science can be applied to news production and its automation. We develop tools based on data mining and machine learning for journalists to streamline their work,” says Professor Hannu Toivonen from the Department of Computer Science at the University of Helsinki.

VTT Technical Research Centre of Finland Ltd studies how automatically produced content affects the audience and what promotes and prevents an immersive experience. VTT is also responsible for the demonstration of a news ecosystem and studies new ways to distribute content in cooperation with the technology companies participating in the project.

The main financier of the Immersive Automation project is Tekes through their Media remake program. Other financiers of the project are Media Industry Research Foundation of Finland, The Swedish Cultural Foundation in Finland, Sanoma, Alma Media, Conmio, Keski-Pohjanmaan Kirjapaino, and KSF Media as well as the research institutions.