Venue: Z1.13 - Aula Hanswijk
Friday, May 29
 

10:00am CEST

Opening of the conference
Friday May 29, 2026 10:00am - 10:30am CEST
The opening of the conference will take place in the Aula Hanswijk (Z1.13, first floor) and will be streamed to the Aula Donche (Z1.15, first floor).
Z1.13 - Aula Hanswijk

11:30am CEST

OSINT 101: The latest tools, tricks & tactics
Friday May 29, 2026 11:30am - 12:45pm CEST
Drawing on real cross-border investigations from OCCRP's Research & Data team, this session will share the tools, techniques, and workflows the team relies on daily to support hundreds of journalists around the world. From geolocating images and tracking assets to social media investigations and smart browser hacks, this session will offer a practical, field-tested OSINT toolkit.

With shrinking newsroom budgets and a constant stream of "must-have" tools, it's harder than ever for journalists to know what OSINT tools might actually be worth using or paying for. This session cuts through the noise and focuses on what works right now.

Whether conference attendees are new to open-source research or looking for a sharp refresher, they will leave with concrete skills, trusted tools, and time-saving methods they can immediately apply to their own investigations.
Speakers
Shaya Laughlin

Research Director, OCCRP
Z1.13 - Aula Hanswijk

2:00pm CEST

Inside the European Defence Fund: hidden decisions, weak ethics, and funding for an Israeli arms manufacturer
Friday May 29, 2026 2:00pm - 3:15pm CEST
This session will explain the methods behind an investigation into the European Defence Fund (EDF) that uncovered how structural weaknesses allowed Israel’s largest state-owned weapons manufacturer, directly involved in the war in Gaza, to receive millions in EU funding, despite rules meant to support only European companies. The reporters will explain how they identified relevant projects and traced the flow of funds.

They will break down their data work (scraping the tenders portal and building a dataset) and guide you through how EDF policy works, its loopholes and legal framework, and how to navigate the EU’s defence expenditure processes.

In today’s political environment of increasing militarisation, where a huge share of EU funding is directed toward defence, journalists need to learn how to access and analyse European public defence tender data, understand the EDF tendering and decision-making process, and see how European governments fund and benefit from defence projects.

This session will demonstrate how combining data work, investigative reporting, cross-border collaboration, and legal analysis can uncover hidden practices and reveal intentional gaps and inconsistencies that favour the arms industry over EU principles and international law.
Speakers
Maria Maggiore

Investigate Europe

Konstantina Maltepioti

Data Journalist, Reporters United
Konstantina Maltepioti is a data journalist at Reporters United, an independent network of investigative journalists based in Greece. Her work focuses on political corruption, environmental issues, and human rights. She specialises in open-source investigations, ship-tracking, scraping...
Z1.13 - Aula Hanswijk

3:45pm CEST

Investigating with trade data
Friday May 29, 2026 3:45pm - 5:00pm CEST
This session will cover how to use trade information for investigative reporting, from the theory behind commercial flows to its application to real investigative cases. The first part of the presentation will focus on the "dictionary" needed to read and interpret trade data. It will then explore how to source official customs statistics for free, understand key variables in import-export sheets, and find workarounds for expensive third-party commercial providers.

Examples from real investigations will show the power of trade data in covering topics such as deforestation, sanctions evasion, the military industry, and cocaine trafficking, as well as more "ordinary" commercial flows that might be linked to pollution and other environmental issues. Throughout the session, participants will be welcome to bring examples of commodities they would like to track and will be guided through a few hands-on exercises to familiarise themselves with finding and understanding this kind of data.

The session is suitable for beginners. No technical/coding experience is needed, but the participants should be familiar with spreadsheets.


Speakers
Edoardo Anziano

Investigative Reporter, IrpiMedia
Investigative journalist covering transnational organised crime & illicit economies
Z1.13 - Aula Hanswijk
 
Saturday, May 30
 

9:30am CEST

AI-Assisted OSINT: Automating the investigative workflow
Saturday May 30, 2026 9:30am - 10:45am CEST
Most investigative workflows still rely on manually juggling dozens of tools. In this session, we'll walk through a live demo of a semi-automated pipeline built for real casework: web search and archiving with Playwright, face extraction, reverse image search, database cross-referencing with Telegram bots, social media analysis, and structured reporting via Obsidian MCP. All of this is orchestrated by Claude, an AI layer you can teach your own investigative methodology. At the end, participants will work through a simplified case using a workflow of their own.

Before the session, please install Python and Claude Code. This session will teach participants to combine several smaller OSINT tools so they work together efficiently without requiring much manual effort. No other special tools are needed.

Materials: https://github.com/anastasiiamorozova/ai-osint
Speakers
Anastasiia Morozova

Data and investigative journalist, Onet.pl/Ringier Axel Springer
I’m a data and investigative journalist with a background in tracking Russian influence, disinformation operations and sanctions evasion in Europe. I’m especially interested in projects where I can combine data analysis and visual storytelling to expose hidden networks or financial...

Jeremy Crowlesmith

Data journalist / AI specialist, KRO-NCRV
hi, i'm jeremy. i build tools and tell stories with data. from scraping to analysis to visualization, the whole stack. i have twenty years of building for the web. now i'm focused on investigative data journalism: using code to find stories hidden in documents and datasets. - based...
Z1.13 - Aula Hanswijk

11:15am CEST

Embracing agents with Pydantic AI
Saturday May 30, 2026 11:15am - 12:30pm CEST
"Agentic AI" is all the rage, but what does it offer beyond traditional LLM workflows? In this hands-on session we'll answer this question (and more) while leveraging Python's Pydantic AI library to build a start-to-finish agentic AI workflow.

Participants will learn how agents work, when they're useful, how to build custom tools, and options for tracing and evaluation. You'll leave able to write agentic workflows to extract information from texts, do semi-autonomous research, and deliver clean, structured results.

Basic experience with Python/LLMs is helpful but not required. After attending this session, participants will be able to understand when and how to apply agentic approaches to problems. Participants should have Python/Jupyter installed or a Google account for working in the cloud.

Having a GitHub account (https://github.com) will allow you to easily run the code without installing anything on your own computer.

Workshop materials can be found at https://jsoma.github.io/workshop-ai-agents/
Speakers
Jonathan Soma

Knight Chair in Data Journalism, Columbia University
Jonathan Soma is the Knight Chair in Data Journalism at Columbia University, where he serves as Director of the Data Journalism MS program and the Lede Program, an intensive data journalism summer course. His lectures cover everything from basic Python and data analysis to interactive...

Jan van der Burgt

Investigative coder / AI specialist, Freelance / Open State Foundation
I leverage AI technologies to collect and analyse data at scale, uncovering the hidden patterns that build stories.

Investigative focus: lobbying, government overreach, migration, global food supply chains.
Z1.13 - Aula Hanswijk

1:45pm CEST

Who's behind this website?
Saturday May 30, 2026 1:45pm - 3:00pm CEST
Reporting online today, journalists have to battle astroturf campaigns, fake news sites and sketchy shell companies to find out who is behind the story. It frequently leads to a frustratingly common question: who is behind this website?

Popular tools and approaches to investigating websites have become less reliable lately. There is more opacity in areas where there should be more transparency, crypto payments add a layer of confusion, and generative AI makes it easy for adversarial actors to operate hundreds of websites.

Using a range of OSINT tools and real-world investigations, we will walk you through investigating the provenance and ownership of a website. What is the scope and scale of the network it belongs to, if any? Who’s behind the site, now and in the past? Who are the main actors promoting it? Is it AI slop? Are foreign actors likely behind the domain?

While it is not always possible to fully unmask the owner of a site, a thorough checklist of tools and techniques that we have used in real-world investigations can help you reveal as much as possible about a website and potentially uncover important clues. We will also walk you through how to conduct these investigations safely depending on your threat model, and how to document your findings reliably.

This session is suitable for beginners and doesn't assume existing technical knowledge.
Speakers
Priyanjana Bengani

Computational Journalism Fellow, Columbia University
Priyanjana Bengani is the Tow Computational Journalism Fellow at Columbia University's Tow Center for Digital Journalism. Her work focuses on using computational techniques to research the digital media landscape, including partisan local news and the intersection of platform companies...
Z1.13 - Aula Hanswijk

3:30pm CEST

Which schools are the most exposed to pesticides in your country? How to investigate with data, maps and scientists
Saturday May 30, 2026 3:30pm - 4:45pm CEST
In this session, we’ll show you how we approached a sensitive topic: mapping the potential exposure of all schools to pesticides used in surrounding agricultural activities. We will explain how, working with scientists, we came up with a robust methodology, found the data we needed, and then crunched it to tell us where to go to do truly data-driven reporting on the ground.

This investigation, published in Le Monde in December 2025, has sparked a lot of national and local interest thanks to its interactive map. Attendees will leave the session with a clear step-by-step guide to get started and adapt the ambition of the investigation to their own capacity. Feel free to bring with you any datasets that would help to map similar issues in your country.

The main story is available in English.
The map is free to access in French.

Speakers
Raphaelle Aubert

Data journalist, Le Monde
I'm an investigative data journalist at Le Monde. My most recent cross-border collaborations include: Green to Grey: how Europe is destroying the little nature it has left; Forever Lobbying Project: revealing the cost of PFAS remediation in Europe; Under the Surface: 300 Contaminants...
Z1.13 - Aula Hanswijk

5:15pm CEST

Mining data from unstructured documents
Saturday May 30, 2026 5:15pm - 5:45pm CEST
You have a folder of documents and you want to extract data points from each one, but the data isn't in a structured table with neat rows and columns. This is where string functions and regular expressions can help. The demonstration will be in R, but the skills transfer to other languages.
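
As a rough illustration of the idea, and not taken from the session materials, here is a minimal Python sketch of pulling labelled fields out of free text with regular expressions. The sample text, the field labels, and the helper name `extract_fields` are all hypothetical and chosen only for this example; real documents will need patterns tuned to their own layout.

```python
import re

# Hypothetical document text: the labels and layout are assumptions for illustration.
SAMPLE = """\
INSPECTION REPORT
Facility: Riverside Care Home
Date of visit: 12 March 2024
Total fine: $1,250.00
Inspector notes: two violations found.
"""

# One pattern per data point; each captures the value that follows a known label.
PATTERNS = {
    "facility": re.compile(r"^Facility:\s*(.+)$", re.MULTILINE),
    "visit_date": re.compile(r"^Date of visit:\s*(\d{1,2} \w+ \d{4})$", re.MULTILINE),
    "fine": re.compile(r"^Total fine:\s*\$([\d,]+\.\d{2})$", re.MULTILINE),
}


def extract_fields(text):
    """Return one dict per document; a field is None when its pattern is absent."""
    row = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        row[name] = match.group(1).strip() if match else None
    return row


record = extract_fields(SAMPLE)
# String functions finish the job: drop the thousands separator, then convert.
if record["fine"] is not None:
    record["fine"] = float(record["fine"].replace(",", ""))
print(record)
```

For a whole folder, the same function could be applied to the text of each file in turn and the resulting dicts collected into a table.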

Materials: https://github.com/gebelo/Dataharvest2026
Speakers
Robert Gebeloff

Reporter, New York Times
Robert Gebeloff has worked as a data projects reporter for The New York Times since 2008 and has taught data journalism for many years in newsrooms and at conferences. He was co-winner of the George Polk Award in 2015 and was a Pulitzer Prize finalist in both 2015 and 2016 for projects...
Z1.13 - Aula Hanswijk

6:00pm CEST

Modern document processing with Natural PDF
Saturday May 30, 2026 6:00pm - 6:30pm CEST
Say hello to Natural PDF, a new Python library for wrangling PDFs that's focused on usability and feature-completeness. Process PDFs with scraping-like selectors and spatially-aware queries, asking for "the red alphanumeric string" or "the content below the big Summary header." Beyond the basics, Natural PDF is also full of modern conveniences like table detection, multiple OCR engines, and citation-aware LLM data extraction.

To get the most out of this session, participants should have experience with Python and with struggling through terrible PDFs.

Materials: https://jsoma.github.io/natural-pdf-workshop/
Speakers
Jonathan Soma

Knight Chair in Data Journalism, Columbia University
Jonathan Soma is the Knight Chair in Data Journalism at Columbia University, where he serves as Director of the Data Journalism MS program and the Lede Program, an intensive data journalism summer course. His lectures cover everything from basic Python and data analysis to interactive...
Z1.13 - Aula Hanswijk
 