Do marine mammals diversify more slowly than non-marine mammals?
Data files
Version of Jan 27, 2026 (total size of files: 1.26 GB)
- 4705sp.tree (515.16 KB)
- mam_hab_analyses.R (17.94 KB)
- mam_hab_functions.R (40.89 KB)
- mam_hab_maps.R (7.66 KB)
- MAMMALS.rds (1.25 GB)
- MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_target.tre (3.09 MB)
- ne_10m_land.zip (3.27 MB)
- README.md (14.22 KB)
- trait_data_imputed.csv (1.89 MB)
Abstract
Species richness is generally lower in marine than in terrestrial ecosystems, but the reasons behind this disparity remain unclear. This study examines whether marine mammals diversify at a slower pace than their non-marine counterparts, aiming to shed light on the factors explaining potential diversification differences among them. We combined time-calibrated phylogenies, species distribution data, and life-history traits to compare variation in diversification rate (DR) among marine and non-marine mammals, and to assess the correlation of DR with ecological realm and species traits. Contrary to previous findings at higher taxonomic scales, our results show that marine mammals do not exhibit lower DR than non-marine mammals, and may even exhibit higher DR, depending on the phylogenetic framework. Our regression analyses indicate that taxonomy (particularly family), rather than the ecological realm, is the dominant predictor of DR variation among mammals. Still, DR appears negatively correlated with body mass in marine mammals and with range size in non-marine mammals. In addition, the geographic distribution of DR points to a more uniform pattern in marine than in non-marine artiodactyls, for which high DR values concentrate in the Northern Hemisphere. Conversely, high DR values for marine carnivores are clustered around the poles, while a more homogeneous distribution is observed across continents for their terrestrial relatives. These findings challenge the conventional view that marine ecosystems inherently constrain species diversification. Instead, they suggest that taxonomy and species-specific traits, rather than the ecological realm alone, are the primary drivers of mammalian diversification. Our study emphasises the complexity of mammalian evolutionary patterns and the importance of integrating taxonomic, ecological, and biogeographic factors in macroevolutionary analyses.
https://doi.org/10.5061/dryad.wpzgmsc1h
This repository contains the data and code used for running the analyses in the study "Do marine mammals diversify more slowly than non-marine mammals?", available at https://doi.org/10.1111/jbi.70114
Authors: Adriana Oliver, Graciela Sotelo, Sofía Galván, Iván Rey-Rodríguez, Adrián Castro-Insua, Sara Gamboa, Marta Matamala-Pagès, Pedro Beca-Carretero, Eduardo Méndez-Quintas, Andrea Serodio, Sara Varela
Year: 2026
Contact: Graciela Sotelo, gsotelo.fdz@gmail.com
Description
The data consist of the following files:
- MAMMALS.rds: shapefile of mammal species distributions from the IUCN Red List of Threatened Species (version 2024-1), provided as an R object (.rds file format) for convenience. It is read in R as indicated in the script mam_hab_analyses.R. It consists of 12,950 species observations (rows) with 29 attributes (columns) that conform to the IUCN Mapping Standards. These are the attributes:
- id_no: identification number of the record
- sci_name: scientific name
- presence: presence distribution codes (1 - extant, 2 - probably extant, 3 - possibly extant, 4 - possibly extinct, 5 - extinct, 6 - presence uncertain, 7 - expected additional range)
- origin: origin distribution codes (1 - native, 2 - reintroduced, 3 - introduced, 4 - vagrant, 5 - origin uncertain, 6 - assisted colonisation)
- seasonal: seasonality distribution codes (1 - resident, 2 - breeding season, 3 - non-breeding season, 4 - passage, 5 - seasonal occurrence uncertain)
- compiler: name of the individual/s or institution responsible for generating the distribution
- yrcompiled: year in which the taxon distribution was mapped, compiled, or modified
- citation: individual/s or institution/s responsible for providing the map data for the Red List
- subspecies: subspecies name/epithet
- subpop: subpopulation name/epithet
- source: primary source of the distribution
- island: name of the island on which the polygon is located
- tax_comm: taxonomic comments that refer directly to the polygon
- dist_comm: distribution comments that refer directly to the polygon
- generalisd: flag to indicate whether the polygon is generalised (1 - true, 0 - false)
- legend: combinations of the presence, origin, and seasonality codes used to create legends for the final distribution maps
- kingdom: taxonomic rank, kingdom
- phylum: taxonomic rank, phylum
- class: taxonomic rank, class
- order_: taxonomic rank, order
- family: taxonomic rank, family
- genus: taxonomic rank, genus
- category: IUCN Red List species category (NE - not evaluated, DD - data deficient, LC - least concern, NT - near threatened, VU - vulnerable, EN - endangered, CR - critically endangered, EW - extinct in the wild, EX - extinct)
- marine: presence in marine environments (true - present, false - absent)
- terrestria: presence in terrestrial environments (true - present, false - absent)
- freshwater: presence in freshwater environments (true - present, false - absent)
- Shape_Len: total shape (polygon) perimeter in decimal degrees
- Shape_Area: system generated shape (polygon) area in decimal degrees (not useful)
- geometry: distribution range as multipolygon data
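As a rough illustration of how the coded attributes above can be used, the following sketch filters a toy table to extant, native records and flags each species as marine or not. The toy rows, the placeholder species names, the choice of codes to keep, and the assumption that `marine` is stored as the strings "true"/"false" are all illustrative; the filters actually applied in the study are those in mam_hab_analyses.R.

```r
# Toy stand-in for the attribute table of MAMMALS.rds (geometry omitted).
# Species names and values are invented for illustration only.
ranges <- data.frame(
  sci_name = c("Genus alpha", "Genus alpha", "Genus beta", "Genus beta", "Genus gamma"),
  presence = c(1, 5, 1, 1, 1),
  origin   = c(1, 1, 1, 3, 1),
  marine   = c("true", "true", "false", "false", "true"),
  stringsAsFactors = FALSE
)

# Keep extant (presence == 1) and native (origin == 1) polygons.
# Assumption: this particular choice of codes is illustrative.
kept <- ranges[ranges$presence == 1 & ranges$origin == 1, ]

# One row per species, flagged as marine if any kept polygon is marine.
is_marine <- tapply(kept$marine == "true", kept$sci_name, any)
species <- data.frame(
  sci_name  = names(is_marine),
  marine    = as.vector(is_marine),
  row.names = NULL
)
print(species)
```

With the real object, the same subsetting would apply after loading it with `readRDS("MAMMALS.rds")`, provided the column types match the assumptions stated here.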
- trait_data_imputed.csv: database of mammal traits from COMBINE, by Soria et al. 2021 (https://doi.org/10.1002/ecy.3344). It is a plain text file, with each row representing a record and each value within a row separated by a comma. It is read in R as indicated in the script mam_hab_analyses.R, but it can also be opened with other software such as Microsoft Excel. It consists of 6,234 observations (rows) for extant and recently extinct mammal species, with data on taxonomy (first six columns) and traits (next 54 columns). The traits inform on morphology, reproduction, diet, biogeography, life-habit, phenology, behavior, home range, and density. These are the columns:
- order: order name of the species
- family: family name of the species
- genus: genus name of the species
- species: specific epithet name of the species
- iucn2020_binomial: IUCN v. 2020-2 binomial name
- phylacine_binomial: PHYLACINE v. 1.2 binomial name
- adult_mass_g: body mass of an adult individual in grams
- brain_mass_g: weight of the brain of an adult individual in grams
- adult_body_length_mm: total length from tip of the nose to anus or base of the tail of an adult individual in millimeters
- adult_forearm_length_mm: total length from elbow to wrist of an adult individual in millimeters, specific to order Chiroptera
- max_longevity_d: maximum reported age at death for the species in days
- maturity_d: the amount of time needed to reach sexual maturity in days
- female_maturity_d: the amount of time needed for a female to reach sexual maturity in days
- male_maturity_d: the amount of time needed for a male to reach sexual maturity in days
- age_first_reproduction_d: age at which females give birth to their first litter or their young attach to teats in days
- gestation_length_d: length of time of fetal growth in days
- teat_number_n: total number of teats present in an individual of the species
- litter_size_n: number of offspring born per litter per female
- litters_per_year_n: number of litters per female per year
- interbirth_interval_d: time between reproduction events in days
- neonate_mass_g: weight of an individual at birth in grams
- weaning_age_d: age at which primary nutritional dependency on the mother ends and independent foraging begins in days
- weaning_mass_g: weight at weaning in grams
- generation_length_d: average age of parents of the current cohort in days
- dispersal_km: the distance an animal travels between its place of birth to the place where it reproduces in kilometers
- density_n_km2: number of individuals of the species per square kilometer
- hibernation_torpor: individuals of the species go through hibernation or torpor (1 - yes, 0 - no)
- fossoriality: the species is above ground dwelling or ground/fossorial dwelling (1 - fossorial and/or ground dwelling, 2 - above ground dwelling)
- home_range_km2: size of the area within which everyday activities of individuals or groups of individuals are typically restricted in square kilometers
- social_group_n: number of individuals in a group that spends most of their daily time together
- dphy_invertebrate: percentage of the diet composed of invertebrates
- dphy_vertebrate: percentage of the diet composed of vertebrates
- dphy_plant: percentage of the diet composed of plants and/or fungi
- det_inv: percentage of the diet composed of invertebrates
- det_vend: percentage of the diet composed of mammals, birds
- det_vect: percentage of the diet composed of reptiles, snakes, amphibians, salamanders
- det_vfish: percentage of the diet composed of fish
- det_vunk: percentage of the diet composed of vertebrates – general or unknown
- det_scav: percentage of the diet composed of scavenge, garbage, offal, carcasses, trawlers, carrion
- det_fruit: percentage of the diet composed of fruit, drupes
- det_nect: percentage of the diet composed of nectar, pollen, plant exudates, gums
- det_seed: percentage of the diet composed of seed, maize, nuts, spores, wheat, grains
- det_plantother: percentage of the diet composed of other plant elements
- det_diet_breadth_n: number of prevalent (≥ 20%) EltonTraits dietary categories consumed
- trophic_level: trophic level of the species (1 - herbivore, 2 - omnivore, 3 - carnivore)
- foraging_stratum: assignment to one of five foraging stratum categories (M - marine, G - ground level, including aquatic foraging, S - scansorial, Ar - arboreal, A - aerial)
- activity_cycle: activity cycle of each species (1 - nocturnal only, 2 - nocturnal/crepuscular, cathemeral, crepuscular or diurnal/crepuscular, 3 - diurnal only)
- freshwater: the species spends a significant amount of time in freshwater bodies (1 - yes, 0 - no)
- marine: the species spends a significant amount of time in oceans and/or seas (1 - yes, 0 - no)
- terrestrial_non-volant: the species spends a significant amount of time on land (1 - yes, 0 - no)
- terrestrial_volant: the species is capable of powered flight and spends a significant amount of time flying in the air (1 - yes, 0 - no)
- upper_elevation_m: upper elevation limit at which the species can be found in meters
- lower_elevation_m: lower elevation limit at which the species can be found in meters
- altitude_breadth_m: difference between the upper and lower elevation limits of a species in meters
- island_dwelling: 20% or more of the breeding range occurs on an island (1 - yes, 0 - no)
- island_endemicity: score of island endemicity obtained from species’ ranges and historical and fossil occurrence records (Exclusively marine; Occurs on mainland; Occurs on large land bridge islands: the species occurs on islands greater than 1,000 km2 that are separated from the mainland by water no more than 110 m deep. The islands would have been part of the mainland during the last glacial maximum; Occurs on small land bridge islands: the species occurs on islands smaller than 1,000 km2 that are separated from the mainland by water no more than 110 m deep. The islands would have been part of the mainland during the last glacial maximum; Occurs only on isolated islands: the species occurs on islands separated from the mainland by water deeper than 110 m)
- disected_by_mountains: range dissected by mountains (based on elevation gradients with slopes equal or higher than 5 degrees) (1 - yes, 0 - no)
- glaciation: historical exposure to glaciation, considered as more than 20% range overlap with areas glaciated in the last 21,000 years (1 - yes, 0 - no)
- biogeographical_realm: biogeographical realms in which the species can be encountered (Afrotropical, Antarctic, Australasian, Indomalayan, Nearctic, Neotropical, Oceanian, Palearctic)
- habitat_breadth_n: number of distinct suitable level 1 IUCN habitats (from 1 to 9)
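The column list above can be exercised with a short sketch that reads a few of these columns from comma-separated text and derives two helper variables. The rows, the placeholder species names, and the derived columns `log10_mass` and `realm` are hypothetical and only illustrate the layout; they are not taken from the authors' scripts.

```r
# Toy CSV text with a handful of the columns described above.
# Species names and values are invented for illustration only.
csv_text <- "genus,species,iucn2020_binomial,adult_mass_g,marine,terrestrial_non-volant
Genus,alpha,Genus alpha,1000000,1,0
Genus,beta,Genus beta,10000,0,1
Genus,gamma,Genus gamma,100,0,1"

# check.names = FALSE keeps the hyphen in 'terrestrial_non-volant';
# by default read.csv() would rename it to 'terrestrial_non.volant'.
traits <- read.csv(text = csv_text, check.names = FALSE, stringsAsFactors = FALSE)

# Log-transform body mass (hypothetical derived column).
traits$log10_mass <- log10(traits$adult_mass_g)

# Label each species by realm using the binary 'marine' column
# (hypothetical derived column).
traits$realm <- ifelse(traits$marine == 1, "marine", "non-marine")

print(traits[, c("iucn2020_binomial", "log10_mass", "realm", "terrestrial_non-volant")])
```

For the real file, `read.csv("trait_data_imputed.csv", check.names = FALSE)` would replace the `text =` argument. Columns whose names contain a hyphen must then be accessed with backticks or `[[ ]]`.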
- 4705sp.tree: consensus phylogeny of mammals from Álvarez-Carretero et al. 2022 (https://doi.org/10.1038/s41586-021-04341-1), in nexus format. The file is read in R as indicated in the script mam_hab_analyses.R, but it can also be opened with a text editor or visualised with software such as FigTree.
- MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_target.tre: consensus phylogeny of mammals from Upham et al. 2019 (https://doi.org/10.1371/journal.pbio.3000494), derived from "DNA-only" data and calibrated using a node dating approach, in nexus format. The file is read in R as indicated in the script mam_hab_analyses.R, but it can also be opened with a text editor or visualised with software such as FigTree.
- ne_10m_land.zip: compressed folder containing the shapefile for land areas from Natural Earth. A shapefile consists of multiple file types beyond the .shp (specifically, .cpg, .dbf, .prj, .sbn, and .sbx). The user only interacts directly with the .shp file but the other files need to be in the same folder. This information is used to generate the maps shown in the publication (Figure 4). The shapefile is read in R as indicated in the script mam_hab_maps.R, but it can be opened and used in any GIS software or in Python as well.
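For readers unfamiliar with DR, the sketch below shows one common way a tip-level rate can be derived from a phylogeny such as the two listed above. It assumes that DR refers to the tip DR metric of Jetz et al. (2012), that is, the inverse of the equal-splits measure, and that the tree is strictly bifurcating. The function name `tip_dr` and the toy tree are hypothetical; the list mirrors the `edge`, `edge.length`, and `tip.label` components of an ape `phylo` object. The functions in mam_hab_functions.R remain the reference for what was actually computed.

```r
# Toy tree ((A:1,B:1):1,C:2); written as a plain list that mirrors the
# structure of an ape "phylo" object. Tips are numbered 1..3, the root is 4.
toy_tree <- list(
  edge = matrix(c(4, 5,
                  5, 1,
                  5, 2,
                  4, 3), ncol = 2, byrow = TRUE),
  edge.length = c(1, 1, 1, 2),
  tip.label = c("A", "B", "C")
)

# Hypothetical helper: inverse equal-splits per tip.
# Each branch on the path from tip to root is down-weighted by 1/2 for
# every node passed (assumes a strictly bifurcating tree).
tip_dr <- function(tree) {
  n_tips <- length(tree$tip.label)
  dr <- numeric(n_tips)
  for (tip in seq_len(n_tips)) {
    node <- tip
    weight <- 1
    es <- 0
    repeat {
      row <- which(tree$edge[, 2] == node)
      if (length(row) == 0) break          # reached the root
      es <- es + tree$edge.length[row] * weight
      weight <- weight / 2
      node <- tree$edge[row, 1]
    }
    dr[tip] <- 1 / es
  }
  names(dr) <- tree$tip.label
  dr
}

dr <- tip_dr(toy_tree)
print(dr)
```

In the toy tree, tips A and B share a recent split and therefore receive a higher value than the long, isolated branch leading to C.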
The code consists of the following R scripts:
- mam_hab_functions.R: functions required for processing, analysing, and plotting data, as used in the script mam_hab_analyses.R.
- mam_hab_analyses.R: script used for processing, analysing, and plotting data (except for Figure 4).
- mam_hab_maps.R: script used for generating Figure 4, which corresponds to the global distribution of DR values across marine and non-marine mammal species, from results obtained with the script mam_hab_analyses.R.
Code/Software
All analyses were conducted under R version 4.4.1, using the following packages (in alphabetical order): caper v.1.0.3, caret v.6.0-94, dplyr v.1.1.4, ggplot2 v.3.5.1, ggpubr v.0.6.0, phytools v.2.3-0, picante v.1.8.2, phylolm v.2.6.5, randomForest v.4.7-1.1, sf v.1.0-16, stringr v.1.5.1, terra v.1.7-78, viridis v.0.6.5.
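To compare a local installation against the versions listed above, a small check along these lines may help. It is only a convenience sketch using base R; the variable names are arbitrary, and differing versions do not necessarily prevent the scripts from running.

```r
# Package versions as listed above.
required <- c(
  caper = "1.0.3", caret = "6.0-94", dplyr = "1.1.4", ggplot2 = "3.5.1",
  ggpubr = "0.6.0", phytools = "2.3-0", picante = "1.8.2", phylolm = "2.6.5",
  randomForest = "4.7-1.1", sf = "1.0-16", stringr = "1.5.1",
  terra = "1.7-78", viridis = "0.6.5"
)

installed <- rownames(installed.packages())

# For each package: "missing", "match", or "differs".
status <- vapply(names(required), function(pkg) {
  if (!pkg %in% installed) return("missing")
  if (packageVersion(pkg) == package_version(required[[pkg]])) "match" else "differs"
}, character(1))

print(data.frame(
  package = names(required),
  listed  = unname(required),
  status  = unname(status)
))
```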
Access Information
The data were sourced from:
- Mammal distribution data: https://www.iucnredlist.org/resources/spatial-data-download
- Mammal trait data: https://figshare.com/articles/dataset/COMBINE_a_Coalesced_Mammal_Database_of_Intrinsic_and_extrinsic_traits/13028255/4
- Mammal phylogeny (Álvarez-Carretero et al. 2022): https://github.com/sabifo4/mammals_dating/blob/main/02_SeqBayes_S2/03_Generate_final_mammal_tree/4705sp.tree
- Mammal phylogeny (Upham et al. 2019): https://github.com/n8upham/MamPhy_v1/tree/master/_DATA/MamPhy_fullPosterior_BDvr_DNAonly_4098sp_topoFree_NDexp_MCC_v2_target.tre
- Land polygons: https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-land/
