Friday May 29, 2026 11:30am - 12:45pm CEST
Processing natural language is often seen as the task that artificial intelligence is most adept at. However, as journalists and researchers, we need our technologies to be explainable, understandable, and deterministic. Because of this, not all artificial intelligence algorithms are well-suited for our work. And, when every company promises that their AI software is extraordinary, it's difficult to distinguish the empty promises from what the technology can actually do. Working on OpenAleph, an open-source tool for investigative journalism, has taught us a lot about processing natural language. We extract names of people and companies from raw text. We try to infer the language a text is written in. The names of places, cities, and countries are crucial to us, in order to situate data geographically. All of this is heavily reliant on algorithms. But not all algorithms are equally good at getting us what we want!

In this session, we'll show you what works and what doesn't. Everything we demonstrate can be used independently of OpenAleph, and integrated into your own workflows. Some machine learning algorithms are excellent at getting us more insights from our data. In addition to this, data that we already have, or public data, can be harnessed to help us identify names of people and places, just based on similarity - no AI required!
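
To make the "based on similarity" idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not OpenAleph's actual method: the reference list `KNOWN_NAMES`, the helper names `normalize` and `best_match`, and the 0.85 threshold are all hypothetical choices made for this example.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical reference list; in practice this could come from data you
# already hold or from a public dataset.
KNOWN_NAMES = ["Angela Merkel", "Emmanuel Macron", "São Paulo"]


def normalize(name: str) -> str:
    """Strip accents, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


def best_match(candidate: str, known_names=KNOWN_NAMES, threshold: float = 0.85):
    """Return (known_name, score) for the most similar known name,
    or None if no known name reaches the (assumed) threshold."""
    best_name, best_score = None, 0.0
    for known in known_names:
        score = SequenceMatcher(
            None, normalize(candidate), normalize(known)
        ).ratio()
        if score > best_score:
            best_name, best_score = known, score
    if best_score >= threshold:
        return best_name, best_score
    return None


if __name__ == "__main__":
    for text in ["Angela Merkl", "Sao Paulo", "Berlin"]:
        print(text, "->", best_match(text))
```

Because the score is a plain character-overlap ratio, the result is deterministic and easy to explain: a misspelled or unaccented name maps to the closest known entry, and anything below the threshold is left unmatched rather than guessed.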

Finally, we'll discuss how these approaches compare to using large language models and generative AI. This session is half teaching and discussing common solutions, half workshop. For the workshop part, bring a laptop running Python if possible.
Speakers

Simon Wörpel

Director of Technology, Data and Research Center – DARC


Natalie Widmann

Data Journalist, SWR Data Lab
I'm a Data Journalist supporting journalists and human rights activists with data, tools and automation.
I'm happy to talk about scraping data, extracting the most relevant information from it, understanding algorithms and using them for investigations.
3.13
