Journalism and machine learning: a powerful tool for investigations

DETAILED EXAMPLES OF INVESTIGATIONS CARRIED OUT WITH THE HELP OF MACHINE LEARNING

Satellite images on which mining spots appear
Satellite image divided into superpixels
Interactive map of illegal mining in northwest Ukraine, produced by Texty from the results of their investigation

The project:
duration: 1 month
team: 5 people (1 journalist, 1 art director, 2 data journalists/developers and 1 person in charge of the model)
amount of data analyzed: 450,000 satellite images
programming language: Python
type of algorithms: unsupervised (SLIC) and supervised (XGBoost)
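
To make the "superpixel" step more concrete, here is a minimal sketch of the idea behind SLIC: pixels are grouped by a k-means-style clustering that weighs both color difference and spatial distance. This is not Texty's code. The function name `slic_like_superpixels` and its parameters are hypothetical, and the sketch simplifies the real algorithm (grayscale instead of color, and a global search over all centers instead of a local window). The supervised step that follows in the project (XGBoost) is not shown.

```python
import math


def slic_like_superpixels(image, grid_step, compactness=10.0, iterations=5):
    """Group the pixels of a grayscale image into superpixels.

    Simplified, SLIC-inspired sketch: k-means over (intensity, row, col),
    seeded on a regular grid. `image` is a list of rows of numbers.
    """
    height, width = len(image), len(image[0])

    # Seed one cluster center per grid cell: (intensity, row, col).
    centers = []
    for r in range(grid_step // 2, height, grid_step):
        for c in range(grid_step // 2, width, grid_step):
            centers.append((float(image[r][c]), float(r), float(c)))

    labels = [[0] * width for _ in range(height)]
    for _ in range(iterations):
        # Assignment step: each pixel joins the closest center, where
        # "closest" mixes intensity difference and spatial distance.
        for r in range(height):
            for c in range(width):
                best, best_dist = 0, math.inf
                for k, (c_val, c_row, c_col) in enumerate(centers):
                    d_color = image[r][c] - c_val
                    d_space = math.hypot(r - c_row, c - c_col)
                    dist = math.hypot(d_color, compactness * d_space / grid_step)
                    if dist < best_dist:
                        best, best_dist = k, dist
                labels[r][c] = best

        # Update step: move each center to the mean of its pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for r in range(height):
            for c in range(width):
                acc = sums[labels[r][c]]
                acc[0] += image[r][c]
                acc[1] += r
                acc[2] += c
                acc[3] += 1
        centers = [
            (acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3]) if acc[3] else old
            for acc, old in zip(sums, centers)
        ]
    return labels


# Toy 8x8 "image": dark on the left half, bright on the right half.
toy_image = [[0] * 4 + [255] * 4 for _ in range(8)]
toy_labels = slic_like_superpixels(toy_image, grid_step=4)
for row in toy_labels:
    print(row)
```

In practice, a project like this would more likely rely on a library implementation (scikit-image provides a `slic` function) and then compute features for each superpixel to feed a supervised classifier such as XGBoost.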

The comments are sometimes more violent than the posts they are published under.

The project:
duration: several months
team: 11 people (3 developers and 8 journalists)
amount of data analyzed: 2.6 million posts and comments
programming language: Python
type of algorithm: semi-supervised

Display a search result based on a company name

“The model can give you leads, but the story has to be built through investigation,” insists Gianfranco Rossi, a member of the Ojo Publico team who worked on Funes.

The project:
duration: 15 months
team: 4 people (1 statistician, 2 journalists and 1 editor)
amount of data analyzed: 245,000 contracts (52 GB of data)
programming language: R
type of algorithm: supervised

The project:
duration: not specified
team: 2 people (1 data journalist and 1 editor)
amount of data analyzed: 578,000 ads
programming language: Python
type of algorithm: supervised

LESSONS THAT CAN BE LEARNED

To evaluate the feasibility of an ML project,

Other examples of ML used in investigations:
