Loading…
Friday May 29, 2026 2:00pm - 3:15pm CEST
Scraped data can often be the backbone of an investigation, but some websites are more difficult to scrape than others. This session will cover how to approach dealing with tricky sites, including coping with captchas, IP blocking, and browser fingerprinting. We'll cover how to figure out what might be preventing you from scraping a site, and what options you have to proceed, with their pros, cons, and costs.

This is an advanced session aimed at people who already have experience of writing code to scrape websites and want to move up to the next level: participants will leave with an understanding of how to deal with hard-to-scrape websites, plus the tradeoffs of different approaches. No tools are required to follow along, just a web browser.
Speakers
avatar for Max Harlow

Max Harlow

Bloomberg News
Max Harlow is a data reporter at Bloomberg News. He also runs Journocoders, a community group for journalists to develop technical skills for use in their reporting.
Friday May 29, 2026 2:00pm - 3:15pm CEST
3.02

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link