Saturday May 30, 2026 9:30am - 10:45am CEST
You need to harvest data from a website, but there's no download button. It's time to scrape! There are many options, but one of the most consistently effective is launching an automated browser. You tell the browser where to go, what to click, and when to ingest the content. To follow along, participants should have some knowledge of coding in any language.

Participants will come away from this class knowing the basics of Web scraping with a browser emulator. Before the session, participants should install RStudio (https://posit.co/download/rstudio-desktop/), create a new project, download the file selenium-server-standalone-3.5.3.jar into the project directory, and download the appropriate Chrome binary into the same directory (https://googlechromelabs.github.io/chrome-for-testing/last-known-good-versions-with-downloads.json).
Speakers
Robert Gebeloff

Reporter, New York Times
Robert Gebeloff has worked as a data projects reporter for The New York Times since 2008 and has taught data journalism for many years in newsrooms and at conferences. He was co-winner of the George Polk Award in 2015 and was a Pulitzer Prize finalist in both 2015 and 2016 for projects...
Simon Wörpel

Director of Technology, Data and Research Center – DARC

3.04
