Name: Build web scrapers with AI for non-coding journalists
Start: 2026-05-29T11:30:00+0200
End: 2026-05-29T12:45:00+0200

Build web scrapers with AI for non-coding journalists

Friday May 29, 2026 11:30am - 12:45pm CEST

3.02

Scraping data from the internet has become a key skill for many data-driven investigations and reporting projects. Building custom web scrapers used to require solid coding skills, but in two recent environmental investigations supported by the Pulitzer Center, we used Large Language Models (LLMs) such as ChatGPT, Google Gemini, or Claude to help us build scrapers for online content with little coding experience. This hands-on workshop will teach you how to inspect a website and choose a scraping strategy. It will then demonstrate, step by step, how to build the web scrapers used in those investigations. The LLM prompts will be shared, and participants can follow along to create their first custom web scraper.

After attending, you will understand how websites are structured for scraping and be able to use LLMs to build basic web scrapers.
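For a sense of what a basic scraper can look like, here is a minimal sketch. It is not taken from the workshop materials. It assumes the data of interest sits in an HTML table; the sample HTML, the company names in it, and the names `TableScraper` and `scrape_table` are all hypothetical. It uses only Python's standard library, so it should run in a Google Colab notebook without installing anything.

```python
from html.parser import HTMLParser

# Hypothetical sample page with made-up data. A real scraper would
# download the HTML from a URL instead of using a hard-coded string.
SAMPLE_HTML = """
<table id="permits">
  <tr><th>Company</th><th>Hectares</th></tr>
  <tr><td>Example Timber Ltd</td><td>1200</td></tr>
  <tr><td>Sample Agro Inc</td><td>850</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return all table rows found in the given HTML string."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = scrape_table(SAMPLE_HTML)
header, records = rows[0], rows[1:]
for record in records:
    # Pair each value with its column name from the header row.
    print(dict(zip(header, record)))
```

Inspecting a page first, for example with the browser's developer tools, shows which tags hold the data; here that is `tr`, `th`, and `td`. Pages that load their content with JavaScript or require a login would need a different strategy.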

Participants should bring their own laptops, register for a free account with any of the main LLM services (e.g. ChatGPT, Google Gemini, Claude), and have a free Google Colab account at colab.research.google.com.

No coding skills are required, but basic familiarity with LLMs is recommended.

Materials: https://github.com/kuangkeng/dataharvest2026-ai-scraper

Speakers

Kuang Keng Kuek Ser

Senior Editor for Rainforest Investigations, Pulitzer Center

Kuang Keng Kuek Ser is the Senior Editor for Rainforest Investigations at the Pulitzer Center, a non-profit organization based in Washington, DC that supports independent journalists globally. He supports and mentors three fellowships investigating issues related to tropical rainforest...

Anastasiia Morozova

Data and investigative journalist, Onet.pl/Ringier Axel Springer

I’m a data and investigative journalist with a background in tracking Russian influence, disinformation operations and sanctions evasion in Europe. I’m especially interested in projects where I can combine data analysis and visual storytelling to expose hidden networks or financial...

Data skills, Workshop

Attendees (50)

Dataharvest 2026 - the European Investigative Journalism Conference
