Identifying traits that enable lizard adaptation to different habitats
Lanna, Flávia; Colli, Guarino; Burbrink, Frank; Carstens, Bryan (2021), Identifying traits that enable lizard adaptation to different habitats, Dryad, Dataset, https://doi.org/10.5061/dryad.t1g1jwt3c
Aim: Species adapt differently to contrasting environments, such as open habitats with sparse vegetation and forested habitats with dense forest cover. We investigated colonization patterns in the open and forested environments in the Diagonal of Open Formations and surrounding rain forests (i.e., Amazon and Atlantic Forest) in Brazil, tested whether the diversification rates were affected by the environmental conditions, and identified traits that enabled species to persist in those environments.
Location: South America, Brazil.
Taxon: Squamata, Lizards
Methods: We estimated ancestral ranges to identify range shifts relative to traditional open and forested habitats for all species. We used phylogenetic information and the current distribution of species in open and forested environments. To evaluate whether these environments influenced species diversification, we tested 12 models using a Hidden Geographic State Speciation and Extinction analysis. Finally, we combined phylogenetic relatedness and species traits in a machine learning framework to identify the traits permitting adaptation in those contrasting environments.
Results: We identified 41 total transitions between open and forested habitats, of which 80% were from forested to open habitats. Widely distributed species had lower speciation and extinction rates than species restricted to forested or open habitats, with open-habitat species having the highest overall rates. Mean body temperature, microhabitat, female snout-vent length (SVL), and diet were identified as putative traits that enabled adaptation to different environments, and phylogenetic relatedness was an important predictor of species occurrence.
Main conclusions: Our results indicate that transitions from forested to open habitats are most common. The combination of phylogenetic reconstruction of ancestral distributions and the machine learning framework enables us to integrate organismal trait data, environmental data, and evolutionary history in a manner that could be applied on a global scale.
This repository contains the scripts and data used to conduct the analyses performed by Lanna et al.
For more details on the methodology, see the Lanna et al. manuscript.
1. Meiri 2018 (https://doi.org/10.1111/geb.12773) trait dataset: "Meiri_2018_traits_copy.csv". This file is a copy of the Meiri 2018 trait dataset, used to get most of the traits used in the random forest analysis.
2. "Names_reptile_database_to_merge.csv" contains the species names used in the random forest analysis and the habitat that each species occupies. This file was used to collect the traits for the same species from the Meiri 2018 trait dataset. The script "Merge_csv_files.py" was used to merge the two files mentioned here.
3. "Merged_Data.csv" is the raw trait dataset produced by that merge. After careful filtering it became "Species_Open_or_Forested_traits_with_citations.csv", which in turn was used to produce "For_RF_raw_data_2.csv", the file used in the random forest analysis.
4. Raw trait dataset with citations: "Species_Open_or_Forested_traits_with_citations.csv".
This file contains the raw data used to create the trait dataset for the random forest analyses. Most traits were collected from Meiri 2018 (https://doi.org/10.1111/geb.12773). Traits taken from other sources are accompanied by their source citation.
5. Trait dataset: "For_RF_raw_data_2.csv" includes all 13 traits for the 235 species used in the random forest analysis.
The traits are described in Table 1 of the manuscript.
The habitats are coded as: open = 0, forested = 1.
Values for the categorical traits are:
Leg development:
- Four-legged = 0
- Leg-reduced = 1
- Hind limbs only = 2
- Limbless = 3
Activity time:
- Diurnal = 0
- Nocturnal = 1
- Cathemeral = 2
Microhabitat:
- Terrestrial = 0
- Arboreal = 1
- Fossorial = 2
- Saxicolous = 3
- Semi-Arboreal = 4
- Semi-Aquatic = 5
Diet:
- Carnivorous = 0
- Herbivorous = 1
- Omnivorous = 2
Foraging mode:
- Active foraging = 0
- Sit and wait = 1
- Mixed = 2
Reproductive mode:
- Oviparous = 0
- Viviparous = 1
6. Phylogenetic tree: "Corrected_names_Binary_forest_open_tree_from_Tonini9755_tree_minBL.newick" was used in all the analyses.
The original tree can be found in Tonini et al. 2016 (https://doi.org/10.1016/j.biocon.2016.03.039).
7. "Correct_names_2.txt" was used in the script "Keep_tip_for_244spp_2.R" to update the names on the tree to match the lizard names in the Reptile Database (http://www.reptile-database.org/).
8. "all_244spp_2_habitats_2_BioGeoBEARS.txt" was used to run the BioGeoBEARS analysis.
9. "244spp_2_habitats_GEOHiSSE_2.txt" was used to run the GeoHiSSE analysis.
Range 0 = both habitats
Range 1 = open habitats
Range 2 = forested habitats
10. "Pairwise_distance_matrix_all_species_2.csv" represents the pairwise phylogenetic distance matrix among all the species present in the phylogenetic tree. This file was later edited to contain only the species used in the random forest analysis (see file 11 description).
11. "Pairwise_dist_matrix_235spp.csv" was used in the random forest analysis and represents the pairwise phylogenetic distance matrix among the 235 species used in this analysis.
12. "transitions_count_per_family.xlsx" was used to create figure 2.
It has information about the number of transitions that occurred from open to forested or from forested to open habitats within each family.
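The integer codes listed under item 5 can be mapped back to their labels when the trait table is read. The following is a minimal Python sketch using only the standard library. It is not part of the repository, and the column names (`habitat`, `leg_development`, `activity_time`, `microhabitat`, `diet`, `foraging_mode`, `reproductive_mode`) are hypothetical; replace them with the actual headers in "For_RF_raw_data_2.csv".

```python
# Code-to-label maps copied from the coding scheme described for item 5.
# The dictionary keys (column names) are hypothetical placeholders.
CODES = {
    "habitat": {0: "open", 1: "forested"},
    "leg_development": {0: "Four-legged", 1: "Leg-reduced",
                        2: "Hind limbs only", 3: "Limbless"},
    "activity_time": {0: "Diurnal", 1: "Nocturnal", 2: "Cathemeral"},
    "microhabitat": {0: "Terrestrial", 1: "Arboreal", 2: "Fossorial",
                     3: "Saxicolous", 4: "Semi-Arboreal", 5: "Semi-Aquatic"},
    "diet": {0: "Carnivorous", 1: "Herbivorous", 2: "Omnivorous"},
    "foraging_mode": {0: "Active foraging", 1: "Sit and wait", 2: "Mixed"},
    "reproductive_mode": {0: "Oviparous", 1: "Viviparous"},
}


def decode_row(row, codes=CODES):
    """Return a copy of `row` with known integer codes replaced by labels.

    Values that are missing, non-numeric, or outside the coding scheme
    are left unchanged.
    """
    out = dict(row)
    for col, mapping in codes.items():
        if col not in out:
            continue
        try:
            key = int(float(out[col]))
        except (TypeError, ValueError, OverflowError):
            continue  # e.g. "NA" or an empty cell
        out[col] = mapping.get(key, out[col])
    return out
```

Each dictionary produced by `csv.DictReader` can be passed to `decode_row`; missing values such as "NA" pass through untouched, so imputation can still be applied afterwards.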
1. "Supporting_information.docx" contains supporting information not included in the main manuscript file, which can be found here: "manuscript link".
1. "Map.R" was used to create the map of figure 1.
2. "Keep_tip_for_244spp_2.R" was used to prune the original consensus phylogenetic tree from Tonini et al. 2016 (https://doi.org/10.1016/j.biocon.2016.03.039) so that it contains only the species used in this study.
The script also corrected the polytomies and replaced outdated species names with those used by the Reptile Database (http://www.reptile-database.org/).
3. "Script_BioGeoBEARS_2.R" (adapted from http://phylo.wikidot.com/biogeobears#script) was used to run the BioGeoBEARS analysis.
It was also used to create figure S1.
4. "geom_bar_histogram_transitions.R" was used to create figure 2.
5. "GeoHiSSE_12_models_NEW_SCRIPT.R" was used to run the GeoHiSSE analysis, adapted from Caetano et al. 2018 (https://doi.org/10.1111/evo.13602).
It was also used to create figure 3.
6. "Compute_phylo_distance_from_tree.R" was used to create the "Pairwise_distance_matrix_all_species_2.csv" and the "Pairwise_dist_matrix_235spp.csv" files.
7. "Merge_excel_files.py" was run in Python 3 to merge the Meiri 2018 trait dataset with the "Names_reptile_database_to_merge.csv" file. The resulting .csv file is "Merged_Data.csv".
8. "RF_model_type_1_with_loop.R" was used to run the first three models of the random forest classification analysis, which identifies the traits that are important for predicting species occurrence in open and forested habitats.
It was also used to create the plots corresponding to models 1, 2, and 3 in figure 4 and figure S2.
9. "missForest_Boruta_RF_235spp_allTraits.R" was used to run models 4 and 5 of the random forest classification analysis.
In this script, we imputed the data using the missForest function before running the random forest classification algorithm.
It was also used to create the plots corresponding to models 4 and 5 in figure 4 and figure S2.
10. "Important_traits_RF.R" was used to check for trait differences between open and forested habitats and how the variation is distributed.
It was also used to create figure 5.
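The merge described in item 7 amounts to attaching trait columns to the species list by matching on species name. The sketch below illustrates that idea with the Python standard library only. It is not the repository script, and the key column name `species`, the other column names, and the two placeholder rows are assumptions made for illustration.

```python
import csv
import io


def left_join(names_rows, trait_rows, key="species"):
    """Attach trait columns to every row of `names_rows`, matching on `key`.

    Species with no match in `trait_rows` keep empty strings for the
    trait columns, so no species is dropped from the list.
    """
    trait_by_key = {r[key]: r for r in trait_rows}
    trait_cols = [c for c in (trait_rows[0].keys() if trait_rows else [])
                  if c != key]
    merged = []
    for row in names_rows:
        match = trait_by_key.get(row[key], {})
        out = dict(row)
        for col in trait_cols:
            out[col] = match.get(col, "")
        merged.append(out)
    return merged


# Placeholder tables standing in for the species list and the trait dataset.
names_csv = "species,habitat\nGenus_a species_a,0\nGenus_b species_b,1\n"
traits_csv = "species,diet\nGenus_a species_a,Carnivorous\n"

names = list(csv.DictReader(io.StringIO(names_csv)))
traits = list(csv.DictReader(io.StringIO(traits_csv)))
merged = left_join(names, traits)
```

Keeping unmatched species with empty trait cells, rather than dropping them, makes it easy to see which species still need trait values from other sources.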