Name: How to look up named entities in text – fast
Start: 2026-05-30T17:15:00+0200
End: 2026-05-30T17:45:00+0200

How to look up named entities in text – fast

Saturday May 30, 2026 5:15pm - 5:45pm CEST

3.09

Have you ever run into the problem "I have a bunch of documents, give me all the politicians named in them"? If so, you know the hassle: named entity recognition (NER) is noisy, and qualifying names (is this a politician or not?) requires external services, APIs or a large language model.

Or use "Juditha", an open-source poor man's entity extraction and resolution tool. No external service is required: just put in your list of names, then extract them from arbitrary unstructured content. It works on any laptop and is super fast. Of course it works with names of criminals, too. Or company names. Whatever you need.
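To make the idea of list-based lookup concrete, here is a minimal sketch in plain Python. It is not Juditha's implementation or API; the function names `normalize`, `build_index` and `find_names` are hypothetical and chosen only for illustration. It assumes that a name counts as found when its normalized tokens (lowercased, accents stripped) appear consecutively in the text.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and drop punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.findall(r"\w+", no_accents.lower()))


def build_index(names):
    """Map each normalized name to its original spelling (hypothetical helper)."""
    return {normalize(name): name for name in names}


def find_names(text, index):
    """Return the known names that occur in the text, in order of first appearance."""
    tokens = normalize(text).split()
    # Longest name in the list, measured in tokens, bounds the window size.
    max_len = max((len(key.split()) for key in index), default=0)
    found = []
    for start in range(len(tokens)):
        for size in range(max_len, 0, -1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in index and index[candidate] not in found:
                found.append(index[candidate])
    return found


names_of_interest = ["Angela Merkel", "Emmanuel Macron", "Siemens AG"]
index = build_index(names_of_interest)
document = "Yesterday ANGELA MERKEL met Emmanuel Macrón and staff of Siemens AG."
matches = find_names(document, index)
```

A sketch like this only catches exact matches after normalization; it does not handle reordered name parts, misspellings or ambiguous names.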

In this session I'll walk through how to use the "juditha" command line and how to populate it with names of interest. By the end, you can take it home to detect the names that matter in your own material.

Knowing how to use a command line and install Python packages helps. If you have ever suffered through the problems of named entity recognition, you'll have even more fun.

Juditha: https://github.com/dataresearchcenter/juditha

Speakers

Simon Wörpel

Director of Technology, Data and Research Center – DARC


Data skills, Mini

Dataharvest 2026 - the European Investigative Journalism Conference
