Dryad

Data from: The invisible species: Big data unveil coverage gaps in the Atlantic forest hotspot

Data files

Oct 02, 2025 version files 451.12 MB

Abstract

Rapid technological advancements and the biodiversity crisis have motivated efforts to document species before their extinction. However, taxonomic coverage gaps, where certain species are underrepresented in biodiversity databases, can distort our understanding of ecosystems. This dataset includes the data we used to quantify how many plant species within a biodiversity hotspot are "invisible," meaning they are excluded from studies due to insufficient occurrence data. We also identified factors that influence the invisibility of species.

We downloaded and filtered occurrence data for 15,010 plant species from online biodiversity databases. We used multiple thresholds, each representing a minimum required number of records, and classified a species as "invisible" at a given threshold if its record count fell below that threshold. We fitted logistic models to estimate how factors such as life form, presence of a vernacular name, geographical distribution, endemism, and year of taxonomic publication influence the odds of species exclusion.
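
The threshold classification described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `invisible_proportions`, the species labels, and the record counts are hypothetical and chosen only for illustration. The sketch assumes a species counts as invisible at a threshold when it has strictly fewer records than that threshold.

```python
def invisible_proportions(record_counts, thresholds):
    """Return {threshold: share of species with fewer records than the threshold}.

    record_counts maps a species name to its number of occurrence records.
    """
    n_species = len(record_counts)
    return {
        t: sum(1 for n in record_counts.values() if n < t) / n_species
        for t in thresholds
    }


# Hypothetical record counts per species (illustrative, not from the dataset).
counts = {"species_a": 1, "species_b": 5, "species_c": 40, "species_d": 120}

# Thresholds of 3 and 60 records mirror the two extremes mentioned in the abstract.
print(invisible_proportions(counts, thresholds=[3, 60]))
# prints {3: 0.25, 60: 0.75}
```

In this toy example, one of the four species falls below the three-record threshold, while three of the four fall below the 60-record threshold, so the invisible share grows as the minimum record requirement becomes stricter.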

The proportion of invisible species ranged from 14% for simple tools requiring just three records to 64% for more demanding tools requiring at least 60 records. Species with certain characteristics are more prone to invisibility: non-tree species, species without vernacular names, species with restricted distributions within the Atlantic Forest, endemic species, and species with more recently published names. A significant portion of these invisible species is distributed along the coastline. In contrast, the continental portion of the biome exhibits fewer taxonomic coverage gaps among known species, most likely due to lower rates of new species descriptions. Coverage gaps are shaped by the interaction of biological traits, societal preferences, limited technical support, and human activities.