
Data and code for: Uncovering commercial activity in informal cities

Cite this dataset

Straulino, Daniel et al. (2022). Data and code for: Uncovering commercial activity in informal cities [Dataset]. Dryad.


Knowledge of the spatial organisation of economic activity within a city is key to policy concerns. However, in developing cities with high levels of informality, this information is often unavailable. Recent progress in machine learning together with the availability of street imagery offers an affordable and easily automated solution. Here we propose an algorithm that can detect what we call visible firms using street view imagery. Using Medellín, Colombia as a case study, we illustrate how this approach can be used to uncover previously unseen economic activity.

This dataset contains the data and code required to replicate our analysis for the manuscript: "Uncovering commercial activity in informal cities."


The dataset has two components. More information on each can be found in the corresponding README files.

  1. Code and data related to detecting visible firms in Medellín.
    • Random locations were sampled from the city's street grid, and a model was trained on the corresponding Google Street View images. The model code is in the folder best_model, and the location coordinates are in the data folder.
    • The model's predictions are found in the prediction tab, and the corresponding metrics are found in the main folder.
  2. Code and data for the analysis of the spatial distribution of the detections produced by the model described in component 1.
    • Data on the location of all detections in the city of Medellín.
    • Code to recreate the analysis/figures in the paper.
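
As an illustration of the location-sampling step in component 1, the sketch below shows one possible way to draw random points from a street grid. It is not the code shipped in this dataset: the function name sample_points_on_segments, the representation of streets as straight segments in planar coordinates, and the length-proportional weighting are all assumptions made for the example. Real latitude/longitude data would first need to be projected to a planar coordinate system.

```python
import math
import random


def sample_points_on_segments(segments, n, seed=0):
    """Draw n points uniformly by length along straight street segments.

    segments: list of ((x1, y1), (x2, y2)) tuples in planar coordinates.
    This layout is a hypothetical stand-in for a real street grid.
    """
    rng = random.Random(seed)  # seeded so that a sample can be reproduced
    lengths = [math.dist(a, b) for a, b in segments]
    points = []
    for _ in range(n):
        # Choose a segment with probability proportional to its length,
        # so long streets are not under-sampled relative to short ones.
        (x1, y1), (x2, y2) = rng.choices(segments, weights=lengths, k=1)[0]
        # Choose a uniform position along the chosen segment.
        t = rng.random()
        points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return points


# Toy grid with two streets (assumed data, not taken from the dataset).
toy_grid = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (1.0, 2.0))]
sampled = sample_points_on_segments(toy_grid, 5, seed=42)
```

Each sampled coordinate could then be used to request the nearest street-level image; that step depends on the imagery provider and is left out here.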


UK Research and Innovation, Award: ES/P011055/1