Name: No download button? Getting web data without writing a scraper
Start: 2026-05-30T16:15:00+0200
End: 2026-05-30T16:45:00+0200

No download button? Getting web data without writing a scraper

Saturday May 30, 2026 4:15pm - 4:45pm CEST

3.09

Journalists often run into data that is visible on a website but impossible to download directly: a table buried in a government page, a list of public records, or search results that change with every query. Writing a full scraper can be time-consuming and technically demanding for what is often a one-time task.

This session introduces three lightweight approaches that cover most of these cases: reading a table directly from a page using pandas, downloading raw HTML and parsing it into a dataframe, and pulling data through network requests. These techniques are practical tools for everyday newsroom situations. Participants will take home a GitHub repository with a working notebook to try on their own data, though some adaptation will be needed to apply it to different websites.
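As a rough illustration of the second and third approaches (not the session's own notebook), the sketch below parses a table out of raw HTML and reads records from a JSON payload of the kind a page's background network request might return. It uses only the Python standard library so it runs without a network connection. The sample HTML, the sample JSON, the "results" key, and the helper names are all invented for illustration. For the first approach, pandas.read_html can often return such tables directly.

```python
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in an HTML document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Treat the first row as the header and return the rest as dicts."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def json_to_records(payload, key):
    """Pull the list of records stored under `key` in a JSON response body."""
    return json.loads(payload)[key]


# Hypothetical stand-ins for a downloaded page and an API response.
SAMPLE_HTML = """
<table>
  <tr><th>Municipality</th><th>Contracts</th></tr>
  <tr><td>Alpha</td><td>12</td></tr>
  <tr><td>Beta</td><td>7</td></tr>
</table>
"""
SAMPLE_JSON = '{"results": [{"Municipality": "Alpha", "Contracts": 12}]}'

html_records = table_to_records(SAMPLE_HTML)
json_records = json_to_records(SAMPLE_JSON, "results")
```

In real use the HTML or JSON string would come from a download step (for example with urllib or the requests library), and either list of records can be passed to pandas.DataFrame to continue the analysis. Note that values parsed from HTML arrive as strings, while JSON keeps its numeric types.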

The three approaches vary in complexity. Basic Python knowledge is enough to follow along, but participants with more experience will be able to go further, and the code can be adapted with the help of an LLM.

Materials: https://github.com/teodoracurcic/dh2026-getting-data

Speakers

Teodora Curcic

BBC

Teodora Ćurčić is an investigative and data journalist from Serbia with over seven years of experience reporting on corruption, political finance, gender-based violence, and social justice. She spent most of her career at the award-winning Center for Investigative Journalism of...

Data skills, Mini

Dataharvest 2026 - the European Investigative Journalism Conference
