High risk of extinction across the flowering plant tree of life
Data files
Mar 19, 2026 version, total size 8.09 GB
- EDGEangio_DataS1_EDGEspp_RLPred.csv (47 MB)
- EDGEangio_DataS10_Pred_200draws.csv (151.37 MB)
- EDGEangio_DataS11_200_Weighted_trees.tre (3.41 GB)
- EDGEangio_DataS2_EDGEspp_RLonly.csv (55.71 MB)
- EDGEangio_DataS3_EDGEfam_RLPred.csv (76.95 KB)
- EDGEangio_DataS4_EDGEfam_RLonly.csv (76.49 KB)
- EDGEangio_DataS5_Maps_summary.csv (24.63 KB)
- EDGEangio_DataS6_200EDGEscores.csv (1.23 GB)
- EDGEangio_DataS7_backbone_tree.tre (2.74 MB)
- EDGEangio_DataS8_Missing_genera.csv (85.64 KB)
- EDGEangio_DataS9_200trees.tre (3.19 GB)
- README.md (20.18 KB)
Abstract
Global biodiversity policies recognize the necessity to preserve evolutionary lineages as their diversity underpins current and future benefits to people and the future of life on earth. Plants are mostly absent from global biodiversity assessments resulting in a taxonomic imbalance that has undermined their conservation for decades. We present a tree of life and extinction risk estimates for all species of angiosperms, representing a global assessment of their threatened evolutionary history. We estimate that 21.2 % of angiosperm evolutionary history is at risk of extinction and identify 9,945 priority species that disproportionately account for total threatened evolutionary history. These prioritizations serve to redress imbalances between plants and animals, monitor conservation effectiveness, and optimize resource allocation in the face of increasing human pressures on biodiversity.
Dataset DOI: 10.5061/dryad.0rxwdbsdv
Description of the data and file structure
The following provides all the data used to compile EDGE scores for all angiosperm species, including the cleaned backbone tree with missing families imputed, the list of missing genera, the 200 draws of extinction risk predictions, the 200 complete species-level phylogenetic trees, and the 200 extinction-risk-weighted species-level phylogenetic trees. It also includes the comprehensive results obtained from these compilations, including the EDGE scores for each species and associated data (Data S1-S2) and the EDGE family compilation (Data S3-S4). Details for each data set are as follows:
Data S1: Complete species list with Evolutionarily Distinct and Globally Endangered (EDGE) scores and rankings, including EDGE species, based on the compilation of EDGE scores considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
Data S2: Complete species list with Evolutionarily Distinct and Globally Endangered (EDGE) scores and rankings, including EDGE species, based on the compilation of EDGE scores considering only International Union for Conservation of Nature (IUCN) Red List assessments (i.e., omitting extinction risk predictions).
Data S3: Compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores per family, with EDGE families identified, based on the compilation of EDGE scores considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
Data S4: Compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores per family, with EDGE families identified, based on the compilation of EDGE scores considering only International Union for Conservation of Nature (IUCN) Red List assessments (i.e., omitting extinction risk predictions).
Data S5: Summary of geographical information from the World Checklist of Vascular Plants (1) used to produce distribution maps of Evolutionarily Distinct and Globally Endangered (EDGE) species (Fig. 4) and EDGE family richness (fig. S4), for each botanical country, including codes from the World Geographical Scheme for Recording Plant Distributions (2).
Data S6: The 200 replicates of Evolutionarily Distinct and Globally Endangered (EDGE) scores obtained from the compilation considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
Data S7: Backbone phylogenetic tree (Newick format) obtained from the “GenBank-Magallón-backbone” (GBMB) tree of Smith & Brown (3); families not included in the original tree were imputed manually (see Materials and Methods and table S3).
Data S8: List of genera missing in the “GenBank-Magallón-backbone” (GBMB) tree of Smith & Brown (3) and that were imputed.
Data S9: The 200 complete species-level phylogenetic trees of angiosperms (Newick format) used in the compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores.
Data S10: Extinction risk predictions for the 200 draws used in compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores.
Data S11: The 200 complete species-level phylogenetic trees of angiosperms (Newick format) with branches weighted by the probability of extinction of the species they subtend; used in the compilation of threatened evolutionary history.
Files and variables
File: EDGEangio_DataS1_EDGEspp_RLPred.csv
Description: Complete species list with Evolutionarily Distinct and Globally Endangered (EDGE) scores and rankings, including EDGE species, based on the compilation of EDGE scores considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
Variables
Column headings as follows:
- EDGE.rank, EDGE rank based on EDGE median value (edge.med)
- edge.med, EDGE median value across 200 trees
- ed.med, ED median value across 200 trees
- tbl.med, median value of terminal branch length across 200 trees
- above.median.tot, number of replicates (out of 200) where the EDGE value of a species is above the EDGE median values of all species within a given replicate
- above.median.perc, proportion of replicates in which a species is above the median EDGE value of all species within the replicate (= above.median.tot/200)
- pext.med, median probability of extinction assigned to a species across the 200 replicates
- total.thr.draws, number of draws in which a species was predicted as threatened (score of 1); species with IUCN Red List assessments are marked as "0"
- perc.thr.draws, proportion of draws in which a species is predicted to be threatened (= total.thr.draws/200)
- threat, IUCN Red List category (CR, EN, VU, NT, LC, EX, or EW) or extinction risk prediction (thr = threatened; not = not threatened); a species predicted to be threatened (= thr) is threatened in at least 100 of the 200 draws (perc.thr.draws >= 0.5)
- RL.ERP, indicates whether the extinction risk of a species was obtained from the IUCN Red List (RL) or predicted (ERP)
- thr.or.not, indicates whether a species is threatened (thr) or not threatened (not); used to determine which species are assigned to the various EDGE lists
- in.backbone, indicate if a species is found in the backbone tree (y) or has been imputed (n)
- above.med, indicate species that have an EDGE value above the median of all species in 95 % or more of the 200 replicates (y)
- EDGE.List, identifies the species that are on the EDGE List (ys = strict EDGE species; yc = candidate EDGE species)
- EDGE.Borderline, identifies borderline EDGE species (y)
- EDGE.Research, identifies the species that are on the EDGE Research List (y)
- EDGE.Watch, identifies the species that are on the EDGE Watch List (y)
- useful.plants, identifies species with at least one recorded use (y).
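The derived fields above follow simple rules stated in the column descriptions. As a minimal sketch of those rules, assuming the raw counts are available as integers, the function below recomputes the proportions and the two threshold-based flags. The function name and the empty string used for an unflagged species are assumptions; this README only documents the "y" value.

```python
N_REPLICATES = 200  # number of trees / draws stated in this README


def summarize_species(above_median_tot, total_thr_draws, n=N_REPLICATES):
    """Recompute the derived per-species fields from their raw counts."""
    above_median_perc = above_median_tot / n
    perc_thr_draws = total_thr_draws / n
    return {
        "above.median.perc": above_median_perc,
        "perc.thr.draws": perc_thr_draws,
        # Predicted threatened when threatened in at least half of the draws.
        "threat": "thr" if perc_thr_draws >= 0.5 else "not",
        # Flagged when above the median in 95 % or more of the replicates.
        # The empty string for "not flagged" is an assumption.
        "above.med": "y" if above_median_perc >= 0.95 else "",
    }
```

The "threat" value computed here applies only to species whose extinction risk was predicted (RL.ERP = ERP); species with Red List assessments carry their IUCN category instead.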
File: EDGEangio_DataS2_EDGEspp_RLonly.csv
Description: Complete species list with Evolutionarily Distinct and Globally Endangered (EDGE) scores and rankings, including EDGE species, based on the compilation of EDGE scores considering only International Union for Conservation of Nature (IUCN) Red List assessments (i.e. omitting extinction risk predictions).
Variables
Column headings as follows:
- EDGE.rank, EDGE rank based on EDGE median value (edge.med)
- edge.med, EDGE median value across 200 trees
- ed.med, ED median value across 200 trees
- tbl.med, median value of terminal branch length across 200 trees
- above.median.tot, number of replicates (out of 200) where the EDGE value of a species is above the EDGE median values of all species within a given replicate
- above.median.perc, proportion of replicates in which a species is above the median EDGE value of all species within the replicate (= above.median.tot/200)
- pext.med, median probability of extinction assigned to a species across the 200 replicates
- total.thr.draws, number of draws in which a species was predicted as threatened (score of 1); species with IUCN Red List assessments are marked as "0"
- perc.thr.draws, proportion of draws in which a species is predicted to be threatened (= total.thr.draws/200)
- threat, IUCN Red List category (CR, EN, VU, NT, LC, EX, or EW) or extinction risk prediction (thr = threatened; not = not threatened); a species predicted to be threatened (= thr) is threatened in at least 100 of the 200 draws (perc.thr.draws >= 0.5)
- RL.ERP, indicates whether the extinction risk of a species was obtained from the IUCN Red List (RL) or predicted (ERP)
- thr.or.not, indicates whether a species is threatened (thr) or not threatened (not); used to determine which species are assigned to the various EDGE lists
- in.backbone, indicate if a species is found in the backbone tree (y) or has been imputed (n)
- above.med, indicate species that have an EDGE value above the median of all species in 95 % or more of the 200 replicates (y)
- EDGE.List, identifies the species that are on the EDGE List (y, EDGE species)
- EDGE.Borderline, identifies borderline EDGE species (y)
- EDGE.Research, identifies the species that are on the EDGE Research List (y)
- EDGE.Watch, identifies the species that are on the EDGE Watch List (y)
- useful.plants, identifies species with at least one recorded use (y).
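The EDGE.List column uses different flag values in the two species files (ys/yc in Data S1, y in Data S2). A minimal sketch of selecting EDGE species from either file is shown below. It assumes comma-delimited files with a header row; the "species" column name and the sample rows are made up for illustration, since this README does not list the name column.

```python
import csv
import io


def edge_list_rows(handle, flags=("ys", "yc", "y")):
    """Return the rows whose EDGE.List value marks them as EDGE species."""
    reader = csv.DictReader(handle)
    return [row for row in reader if row["EDGE.List"] in flags]


# Made-up rows for illustration only; "species" is a hypothetical column name.
sample = io.StringIO(
    "species,EDGE.rank,edge.med,EDGE.List\n"
    "Species A,1,9.5,ys\n"
    "Species B,2,8.1,yc\n"
    "Species C,3,0.2,\n"
)
rows = edge_list_rows(sample)
print([row["species"] for row in rows])
```

With the real files, the in-memory sample would be replaced by `open(path, newline="")`; passing `flags=("ys",)` would restrict the selection to strict EDGE species in Data S1.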
File: EDGEangio_DataS3_EDGEfam_RLPred.csv
Description: Compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores per family, with EDGE families identified, based on the compilation of EDGE scores considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
Variables
Column headings as follows:
- N.spp, number of species
- N.edgespp, number of EDGE species
- P.edgespp, proportion of EDGE species
- N.assessed, number of species assessed by the Red List
- P.assessed, proportion of species assessed by the IUCN Red List
- N.redlist.thrt, number of species assessed as threatened by the IUCN Red List
- P.redlist.thrt, proportion of species assessed as threatened by the IUCN Red List
- N.thrt, number of threatened species assessed as threatened by both the IUCN Red List and the extinction risk predictions
- P.thrt, proportion of threatened species assessed as threatened by both the IUCN Red List and the extinction risk predictions
- edge.med, median EDGE score of all species in the family
- edge.mean, mean EDGE score of all species in the family
- edge.max, maximum EDGE score
- is.lineage, indicate families that have been identified as EDGE lineages following Gumbs et al. (2024), i.e. families with 1) all IUCN Red List data-sufficient species threatened, 2) with EDGE score for the family above median of all families, and 3) with at least half the species in the family assessed and assigned a data-sufficient category
- data.suff.thrt, number of species assessed by the IUCN Red List and assigned a threatened data-sufficient category
- is.EDGE.fam, indicate families that have been identified as an EDGE family, i.e. a family that has 1) a family EDGE score (i.e. mean EDGE score of all species in the family – see edge.mean) above the median EDGE scores of all families (as in the original EDGE lineage concept), and 2) has a higher-than-median proportion of threatened species, which in the present study is 0.316
- N.regions, number of botanical countries in which the family is recorded
- PD.med, median evolutionary history (phylogenetic diversity, PD) of 200 trees in million years
- PD.med.spp, median evolutionary history (phylogenetic diversity, PD) per species of 200 trees in million years
- Thrt.PD.med, median threatened evolutionary history of 200 trees in million years
- Thrt.PD.spp, median threatened evolutionary history per species of 200 trees in million years
- Thrt.perc, proportion of evolutionary history that is threatened.
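The is.EDGE.fam rule combines two thresholds: a mean EDGE score above the median across families, and a proportion of threatened species above the median proportion (0.316 in the present study). The sketch below illustrates that rule on a list of family records. The "family" key, the function name, and the choice to recompute the threshold from the data when none is given are assumptions.

```python
from statistics import median


def flag_edge_families(families, thrt_threshold=None):
    """Map each family name to True when it meets both EDGE family criteria."""
    # Criterion 1: family EDGE score (edge.mean) above the median of all families.
    edge_median = median(f["edge.mean"] for f in families)
    # Criterion 2: higher-than-median proportion of threatened species.
    if thrt_threshold is None:
        thrt_threshold = median(f["P.thrt"] for f in families)
    return {
        f["family"]: f["edge.mean"] > edge_median and f["P.thrt"] > thrt_threshold
        for f in families
    }
```

Passing `thrt_threshold=0.316` applies the value reported in this study rather than a median recomputed from a subset of families.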
File: EDGEangio_DataS4_EDGEfam_RLonly.csv
Description: Compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores per family, with EDGE families identified, based on the compilation of EDGE scores considering only International Union for Conservation of Nature (IUCN) Red List assessments (i.e. omitting extinction risk predictions).
Variables
Column headings as follows:
- N.spp, number of species
- N.edgespp, number of EDGE species
- P.edgespp, proportion of EDGE species
- N.assessed, number of species assessed by the Red List
- P.assessed, proportion of species assessed by the IUCN Red List
- N.redlist.thrt, number of species assessed as threatened by the IUCN Red List
- P.redlist.thrt, proportion of species assessed as threatened by the IUCN Red List
- N.thrt, number of threatened species assessed as threatened by both the IUCN Red List and the extinction risk predictions
- P.thrt, proportion of threatened species assessed as threatened by both the IUCN Red List and the extinction risk predictions
- edge.med, median EDGE score of all species in the family
- edge.mean, mean EDGE score of all species in the family
- edge.max, maximum EDGE score
- is.lineage, indicate families that have been identified as EDGE lineages following Gumbs et al. (2024), i.e. families with 1) all IUCN Red List data-sufficient species threatened, 2) with EDGE score for the family above median of all families, and 3) with at least half the species in the family assessed and assigned a data-sufficient category
- data.suff.thrt, number of species assessed by the IUCN Red List and assigned a threatened data-sufficient category
- is.EDGE.fam, indicate families that have been identified as an EDGE family, i.e. a family that has 1) a family EDGE score (i.e. mean EDGE score of all species in the family – see edge.mean) above the median EDGE scores of all families (as in the original EDGE lineage concept), and 2) has a higher-than-median proportion of threatened species, which in the present study is 0.316
- N.regions, number of botanical countries in which the family is recorded
- PD.med, median evolutionary history (phylogenetic diversity, PD) of 200 trees in million years
- PD.med.spp, median evolutionary history (phylogenetic diversity, PD) per species of 200 trees in million years
- Thrt.PD.med, median threatened evolutionary history of 200 trees in million years
- Thrt.PD.spp, median threatened evolutionary history per species of 200 trees in million years
- Thrt.perc, proportion of evolutionary history that is threatened.
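The three is.lineage criteria listed above can be written as a single boolean test. This is a sketch under stated assumptions: the number of data-sufficient species is passed in explicitly because it is not one of the documented columns, and all parameter names are hypothetical.

```python
def is_edge_lineage(n_spp, n_data_sufficient, n_data_sufficient_thrt,
                    family_edge, median_family_edge):
    """Apply the three EDGE lineage criteria described for is.lineage."""
    # 1) All data-sufficient Red List species in the family are threatened.
    all_threatened = (n_data_sufficient > 0
                      and n_data_sufficient_thrt == n_data_sufficient)
    # 2) Family EDGE score above the median of all families.
    above_median = family_edge > median_family_edge
    # 3) At least half of the species assessed with a data-sufficient category.
    half_assessed = 2 * n_data_sufficient >= n_spp
    return all_threatened and above_median and half_assessed
```

For example, a family of 4 species with 2 data-sufficient assessments, both threatened, and an EDGE score above the family median would satisfy all three conditions.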
File: EDGEangio_DataS5_Maps_summary.csv
Description: Summary of geographical information from the World Checklist of Vascular Plants (1) used to produce distribution maps of Evolutionarily Distinct and Globally Endangered (EDGE) species (Fig. 4) and EDGE family richness (fig. S4), for each botanical country, including codes from the World Geographical Scheme for Recording Plant Distributions (2).
Variables
Column headings as follows:
- EDGE.spp, number of EDGE species (strict and candidate)
- Endemic.EDGE.spp, number of endemic EDGE species
- higher.EDGE.spp, botanical countries with a higher-than-expected proportion of EDGE species for their total species richness (y)
- EDGE.families, number of EDGE families
- Native, number of native species
- Endemic, number of endemic species
- Families, number of families
- Prop.EDGE.spp, proportion of EDGE species in the flora
- Prop.EDGE.families, proportion of EDGE families in the flora
- Prop.endemic.EDGE.spp, proportion of endemic EDGE species in the flora.
File: EDGEangio_DataS6_200EDGEscores.csv
Description: The 200 replicates of Evolutionarily Distinct and Globally Endangered (EDGE) scores obtained from the compilation considering both International Union for Conservation of Nature (IUCN) Red List assessments and extinction risk predictions.
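The summary columns edge.med, above.median.tot and EDGE.rank in Data S1 are described as being derived from replicate scores such as these. The sketch below shows one way such summaries could be computed from a species-by-replicate table held in memory. The layout of the CSV file is not documented here, so the input structure is an assumption, as is ranking from the highest median downwards.

```python
from statistics import median


def summarize_replicates(scores):
    """scores: dict mapping species name -> list of EDGE scores, one per replicate."""
    n_rep = len(next(iter(scores.values())))
    # Median EDGE score of all species within each replicate.
    rep_medians = [median(vals[i] for vals in scores.values()) for i in range(n_rep)]
    summary = {}
    for species, vals in scores.items():
        summary[species] = {
            "edge.med": median(vals),
            # Replicates in which the species is above that replicate's median.
            "above.median.tot": sum(v > m for v, m in zip(vals, rep_medians)),
        }
    # Assumption: rank 1 goes to the species with the highest median EDGE score.
    ranked = sorted(summary, key=lambda sp: summary[sp]["edge.med"], reverse=True)
    for rank, species in enumerate(ranked, start=1):
        summary[species]["EDGE.rank"] = rank
    return summary
```

At 1.23 GB the real file may be too large to hold comfortably in memory this way; reading it in column or row batches would be a reasonable adaptation.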
File: EDGEangio_DataS7_backbone_tree.tre
Description: Backbone phylogenetic tree (Newick format) obtained from the “GenBank-Magallón-backbone” (GBMB) tree of Smith & Brown (3); families not included in the original tree were imputed manually (see Materials and Methods and table S3).
File: EDGEangio_DataS8_Missing_genera.csv
Description: List of genera missing in the “GenBank-Magallón-backbone” (GBMB) tree of Smith & Brown (3) and that were imputed.
File: EDGEangio_DataS9_200trees.tre
Description: The 200 complete species-level phylogenetic trees of angiosperms (Newick format) used in the compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores.
File: EDGEangio_DataS10_Pred_200draws.csv
Description: Extinction risk predictions for the 200 draws used in compilation of Evolutionarily Distinct and Globally Endangered (EDGE) scores.
File: EDGEangio_DataS11_200_Weighted_trees.tre
Description: The 200 complete species-level phylogenetic trees of angiosperms (Newick format) with branches weighted by the probability of extinction of the species they subtend; used in the compilation of threatened evolutionary history.
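The two multi-tree files (Data S9 and Data S11) are several gigabytes each, so reading one tree at a time can be more practical than loading a whole file. The sketch below streams semicolon-terminated Newick strings from a text handle and sums the branch lengths of each tree. It assumes plain, unquoted tip labels that contain no ";" or ":" characters; the function names are hypothetical, and a dedicated parser (for example ape in R) is the safer choice for real analyses.

```python
import io
import re


def iter_newick(handle, chunk_size=1 << 20):
    """Yield one Newick string at a time from a stream of ';'-terminated trees."""
    buffer = ""
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last ';' is complete; keep the remainder.
        *complete, buffer = buffer.split(";")
        for tree in complete:
            tree = tree.strip()
            if tree:
                yield tree + ";"


# Matches the number that follows each ':' in a Newick string.
BRANCH_LENGTH = re.compile(r":\s*([0-9.eE+-]+)")


def total_branch_length(newick):
    """Sum all branch lengths found in one Newick string."""
    return sum(float(x) for x in BRANCH_LENGTH.findall(newick))


demo = io.StringIO("((A:1.0,B:2.0):0.5,C:3.5);\n(A:1,B:2);\n")
for tree in iter_newick(demo):
    print(total_branch_length(tree))
```

With the real files, `demo` would be replaced by an opened file handle, and the same helper applies to either the unweighted or the extinction-risk-weighted trees.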
Code/software
Analyses were primarily run using several packages for the R programming language (https://www.r-project.org/):
- rWCVP (v1.2.4): Name matching for cleaning of tip labels in the backbone tree (https://cloud.r-project.org/web/packages/rWCVP/index.html)
- ape (v5.8.1): Manipulation of phylogenetic trees in R, including removal of unwanted tips from the backbone tree following the cleaning step (https://cran.r-project.org/web/packages/ape/index.html)
- MonoPhy (v1.3.2): Investigation of monophyly of taxonomic groups in phylogenetic trees (https://cran.r-project.org/web/packages/MonoPhy/index.html)
- phytools (v2.5.2): Various tools for phylogenetic comparative analysis; used for binding missing family tips to the backbone tree using the function bind.tip (https://cran.r-project.org/web/packages/phytools/index.html)
- V.PhyloMaker (version V): Approach used for the imputation of missing genera and species to the backbone tree to produce 200 complete species-level trees (https://github.com/jinyizju/V.PhyloMaker)
- PDcalc (v0.5.0): Resolution of polytomies in the 200 complete species-level trees using the function bifurcatr (https://github.com/davidnipperess/PDcalc)
- brms (v2.23.0): Used to fit a Bayesian generalized mixed model to the source data (species’ geographic distributions, lifeforms and documented human uses) used to compile extinction risk predictions (https://cran.r-project.org/web/packages/brms/index.html).
Access information
Data was derived from the following sources:
- The backbone tree used here is derived from the GBMB tree of Smith & Brown (3): https://bsapubs.onlinelibrary.wiley.com/doi/full/10.1002/ajb2.1019
- The complete species list for angiosperms was sourced from version 11 of the World Checklist of Vascular Plants found here (1): https://sftp.kew.org/pub/data-repositories/WCVP/
- Useful plant data were sourced from Pironon et al. (2024) (4): https://www.science.org/doi/full/10.1126/science.adg8028; https://zenodo.org/records/10345634
- IUCN Red List assessments for angiosperms were sourced from The IUCN Red List of Threatened Species, Version 2022-2 (2022): https://www.iucnredlist.org
References
(1) R. Govaerts, E. Nic Lughadha, N. Black, R. Turner, A. Paton, The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity. Sci Data 8, 215 (2021).
(2) R. Brummitt, P. Francisco, S. Hollis, N. A. Brummitt, World Geographic Scheme for Recording Plant Distributions (Hunt Institute for Botanical Documentation, Carnegie Mellon University, Pittsburgh, PA, USA, ed. 2nd, 2001).
(3) S. A. Smith, J. W. Brown, Constructing a broadly inclusive seed plant phylogeny. Am J Bot 105, 302–314 (2018).
(4) S. Pironon, I. Ondo, M. Diazgranados, R. Allkin, A. C. Baquero, R. Cámara-Leret, C. Canteiro, Z. Dennehy-Carr, R. Govaerts, S. Hargreaves, A. J. Hudson, R. Lemmens, W. Milliken, M. Nesbitt, K. Patmore, G. Schmelzer, R. M. Turner, T. R. van Andel, T. Ulian, A. Antonelli, K. J. Willis, The global distribution of plants used by humans. Science 383, 293–297 (2024).
