Accurate estimates of historical forest extent and associated deforestation rates are crucial for quantifying tropical carbon cycles and formulating conservation policy. In Africa, data-driven estimates of historical closed-canopy forest extent and deforestation at the continental scale are lacking, and existing modelled estimates diverge substantially. Here, we synthesize available palaeo-proxies and historical maps to reconstruct forest extent in tropical Africa around 1900, when European colonization accelerated markedly, and compare these historical estimates with modern forest extent to estimate deforestation. We find that forests were less extensive in 1900 than bioclimatic models predict. Consequently, across tropical Africa, ~21.7% of forests have been deforested, implying substantially slower deforestation than previous estimates (35–55%). However, deforestation was heterogeneous: West and East African forests have undergone almost complete decline (~83.3% and ~93.0%, respectively), while Central African forests have expanded at the expense of savannahs (~1.4% net forest expansion, with ~135,270 km² of savannahs encroached). These results suggest that climate alone does not determine savannah and forest distributions and that many savannahs hitherto considered to be degraded forests are instead relatively old. These data-driven reconstructions of historical biome distributions will inform tropical carbon cycle estimates, carbon mitigation initiatives and conservation planning in both forest and savannah systems.

#### Probability of forest for 1900, 70% tree cover

This raster contains the median modelled probability of forest for the year 1900, computed using a tree cover threshold of 70%.

Proba_for_1900_70TC.tif

#### Probability of forest for 2000, 70% tree cover

This raster contains the median modelled probability of forest for the year 2000, computed using a tree cover threshold of 70%.

Proba_for_2000_70TC.tif

#### Forest and savanna distribution for 1900, 70% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 1900, using a probability threshold of 0.5 and a tree cover threshold of 70%.

Forest1900_thres05_70TC.tif
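
As a rough illustration of how a probability raster relates to the forest/savanna coding described above, the sketch below applies a 0.5 threshold to a small in-memory array. It is a minimal example under stated assumptions, not the authors' processing chain: the function name `classify_forest` and the toy values are hypothetical, reading the GeoTIFF is left out, and the strict `>` comparison is assumed from the "probability > 0.5" wording used for the confidence rasters.

```python
import numpy as np

def classify_forest(prob, threshold=0.5):
    """Code cells as 1 (forest) where prob > threshold, else 0 (savanna).

    Assumption: the comparison is strict. No-data handling is not addressed;
    NaN cells would fall into class 0 here and should be masked beforehand.
    """
    prob = np.asarray(prob, dtype=float)
    return (prob > threshold).astype(np.uint8)

# Toy 2 x 2 probability grid (hypothetical values, not taken from the dataset).
toy_prob = np.array([[0.10, 0.49],
                     [0.51, 0.95]])
forest_1900 = classify_forest(toy_prob)
```

The same function would apply to any of the probability rasters once loaded as an array, whichever tree cover threshold was used to produce them.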

#### Forest and savanna distribution for 2000, 70% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 2000, using a probability threshold of 0.5 and a tree cover threshold of 70%.

Forest2000_thres05_70TC.tif

#### Forest changes between 1900 and 2000, 70% tree cover

This raster contains the changes in forest area between 1900 and 2000 for a tree cover threshold of 70%. The corresponding codes are: 0 = savanna, 1 = deforestation, 2 = afforestation and 3 = forest.

Forest_change_70TC.tif
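
The four change codes are consistent with combining the two binary maps as `forest_1900 + 2 * forest_2000`. The sketch below shows that combination and a per-class cell count on toy arrays. It is offered as an assumption about how the codes relate to the binary rasters, not as the authors' documented method; the names `change_codes`, `count_classes` and `CHANGE_LABELS` are hypothetical.

```python
import numpy as np

# Labels as given in the raster description.
CHANGE_LABELS = {0: "savanna", 1: "deforestation", 2: "afforestation", 3: "forest"}

def change_codes(forest_1900, forest_2000):
    """Combine two binary maps (1 = forest, 0 = savanna) into change codes.

    Assumed mapping: (0,0) -> 0, (1,0) -> 1, (0,1) -> 2, (1,1) -> 3.
    """
    f1900 = np.asarray(forest_1900, dtype=np.uint8)
    f2000 = np.asarray(forest_2000, dtype=np.uint8)
    return f1900 + 2 * f2000

def count_classes(change):
    """Count cells per change class, keyed by label."""
    values, counts = np.unique(change, return_counts=True)
    return {CHANGE_LABELS[int(v)]: int(c) for v, c in zip(values, counts)}

# Toy 2 x 2 grids (hypothetical values, not taken from the dataset).
toy_1900 = np.array([[1, 1], [0, 0]])
toy_2000 = np.array([[1, 0], [1, 0]])
toy_change = change_codes(toy_1900, toy_2000)
toy_counts = count_classes(toy_change)
```

Cell counts translate into areas only if every cell covers the same ground area, which depends on the raster's projection; that should be checked before reporting areas.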

#### Probability of forest for 1900, 75% tree cover

This raster contains the median modelled probability of forest for the year 1900, computed using a tree cover threshold of 75%.

Proba_for_1900_75TC.tif

#### Probability of forest for 2000, 75% tree cover

This raster contains the median modelled probability of forest for the year 2000, computed using a tree cover threshold of 75%.

Proba_for_2000_75TC.tif

#### Forest and savanna distribution for 1900, 75% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 1900, using a probability threshold of 0.5 and a tree cover threshold of 75%.

Forest1900_thres05_75TC.tif

#### Forest and savanna distribution for 2000, 75% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 2000, using a probability threshold of 0.5 and a tree cover threshold of 75%.

Forest2000_thres05_75TC.tif

#### Forest changes between 1900 and 2000, 75% tree cover

This raster contains the changes in forest area between 1900 and 2000 for a tree cover threshold of 75%. The corresponding codes are: 0 = savanna, 1 = deforestation, 2 = afforestation and 3 = forest.

Forest_change_75TC.tif

#### Probability of forest for 1900, 65% tree cover

This raster contains the median modelled probability of forest for the year 1900, computed using a tree cover threshold of 65%.

Proba_for_1900_65TC.tif

#### Probability of forest for 2000, 65% tree cover

This raster contains the median modelled probability of forest for the year 2000, computed using a tree cover threshold of 65%.

Proba_for_2000_65TC.tif

#### Forest and savanna distribution for 1900, 65% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 1900, using a probability threshold of 0.5 and a tree cover threshold of 65%.

Forest1900_thres05_65TC.tif

#### Forest and savanna distribution for 2000, 65% tree cover

This raster contains the modelled distribution of forest (coded 1) and savanna (coded 0) for the year 2000, using a probability threshold of 0.5 and a tree cover threshold of 65%.

Forest2000_thres05_65TC.tif

#### Forest changes between 1900 and 2000, 65% tree cover

This raster contains the changes in forest area between 1900 and 2000 for a tree cover threshold of 65%. The corresponding codes are: 0 = savanna, 1 = deforestation, 2 = afforestation and 3 = forest.

Forest_change_65TC.tif

#### Confidence levels for 1900 and 70% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 1900, with forest defined by a 70% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest1900_probthres05_70TC_UNC.tif

#### Confidence levels for 2000 and 70% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 2000, with forest defined by a 70% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest2000_probthres05_70TC_UNC.tif

#### Confidence levels for 1900 and 75% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 1900, with forest defined by a 75% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest1900_probthres05_75TC_UNC.tif

#### Confidence levels for 2000 and 75% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 2000, with forest defined by a 75% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest2000_probthres05_75TC_UNC.tif

#### Confidence levels for 1900 and 65% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 1900, with forest defined by a 65% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest1900_probthres05_65TC_UNC.tif

#### Confidence levels for 2000 and 65% tree cover

This raster contains the confidence levels in defining forest and savanna distribution and boundaries for 2000, with forest defined by a 65% tree cover threshold and a probability > 0.5. Confidence levels range from 0 (least confident) to 1 (most confident).

Forest2000_probthres05_65TC_UNC.tif
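
One possible use of the confidence rasters listed above is to keep only cells whose forest/savanna assignment is reasonably certain. The sketch below masks a classified grid wherever confidence falls below a cut-off. It is only an illustration: the function name `mask_low_confidence`, the 0.8 cut-off and the toy values are hypothetical choices, not values recommended by the dataset.

```python
import numpy as np

def mask_low_confidence(classes, confidence, min_confidence=0.8):
    """Replace cells with confidence below min_confidence by NaN.

    Assumptions: both grids share the same shape and alignment, and
    min_confidence is an arbitrary illustrative cut-off on the 0-1 scale.
    """
    classes = np.asarray(classes, dtype=float)
    confidence = np.asarray(confidence, dtype=float)
    return np.where(confidence >= min_confidence, classes, np.nan)

# Toy 2 x 2 grids (hypothetical values, not taken from the dataset).
toy_classes = np.array([[1, 0], [1, 0]])
toy_conf = np.array([[0.9, 0.3], [0.5, 1.0]])
masked = mask_low_confidence(toy_classes, toy_conf)
```

Because the result holds NaN for masked cells, downstream summaries would need NaN-aware functions such as `np.nanmean`.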