Two main hypotheses have been proposed to explain the diversification of the Caatinga biota. The riverine barrier hypothesis (RBH) claims that the São Francisco River (SFR) is a major biogeographic barrier to gene flow. The Pleistocene climatic fluctuation hypothesis (PCH) states that gene flow, geographic genetic structure, and demographic signatures of endemic Caatinga taxa were influenced by Quaternary climate fluctuation cycles. Herein we analyze genetic diversity and structure, phylogeographic history, and diversification of a widespread Caatinga lizard (Cnemidophorus ocellifer) based on large geographical sampling for multiple loci to test the predictions derived from the RBH and PCH. We inferred two well-delimited lineages (Northeast and Southwest) that diverged along the Cerrado-Caatinga border during the Mid-Late Miocene (6–14 Ma) despite the presence of gene flow. We reject both major hypotheses proposed to explain diversification in the Caatinga. Surprisingly, our results revealed a strikingly complex diversification pattern in which the Northeast lineage originated through a founder effect from a few individuals located along the edge of the Southwest lineage and eventually expanded throughout the Caatinga. The Southwest lineage is more diverse, older, and associated with the Cerrado-Caatinga boundaries. Finally, we suggest that C. ocellifer from the Caatinga is composed of two distinct species. Our data support speciation in the presence of gene flow and highlight the role of environmental gradients in the diversification process.
Align_12S_62samples_substitution.rate.estimation
This file contains aligned sequences used to estimate a 12S substitution rate for Teiidae.
Align_12S_137samples.outgroup_gblock
This file contains aligned sequences used to estimate the 12S gene tree.
Align_ATPSB_126samples.outgroup_gblock
This file contains aligned sequences used to estimate the ATPSB gene tree.
Align_NKTR_115samples.outgroup
This file contains aligned sequences used to estimate the NKTR gene tree.
Align_R35_134samples.outgroup
This file contains aligned sequences used to estimate the R35 gene tree.
Align_RP40_132samples.outgroup
This file contains aligned sequences used to estimate the RP40 gene tree.
Align_NE.SW_12S_398samples_gblock
This file contains 12S aligned sequences used to calculate the number of polymorphic sites (S), haplotype number (h), haplotype diversity (Hd), nucleotide diversity (π), and Tajima’s D for the Northeast (NE) and Southwest (SW) lineages. This file was also used to estimate uncorrected pairwise genetic distances between and within lineages and to investigate genetic structure with analyses of molecular variance (AMOVA).
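For readers who want to sanity-check the summary statistics named above, the following is a minimal Python sketch of how S, h, Hd, and π could be computed from a set of aligned sequences. It is not the software used in the study; the function name `diversity_stats` and the choice to drop every column containing a gap or ambiguity code are assumptions made for illustration. Tajima's D is omitted to keep the sketch short.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(seqs):
    """Summary statistics for equal-length aligned DNA strings.

    Hypothetical helper: columns containing anything other than
    A, C, G, or T in any sequence are excluded (an assumption).
    """
    n = len(seqs)
    valid = set("ACGT")
    # Indices of alignment columns with no gaps or ambiguity codes
    cols = [i for i in range(len(seqs[0]))
            if all(s[i].upper() in valid for s in seqs)]
    clean = ["".join(s[i].upper() for i in cols) for s in seqs]

    # S: number of polymorphic (segregating) sites
    S = sum(1 for i in range(len(cols)) if len({s[i] for s in clean}) > 1)

    # h: number of distinct haplotypes; Hd: haplotype diversity
    counts = Counter(clean)
    h = len(counts)
    Hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # pi: mean number of pairwise differences per retained site
    pairs = list(combinations(clean, 2))
    total_diff = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    pi = total_diff / len(pairs) / len(cols)

    return {"S": S, "h": h, "Hd": Hd, "pi": pi}


# Toy alignment (made-up sequences, not data from this archive)
stats = diversity_stats(["ACGT", "ACGA", "ACGT", "TCGA"])
print(stats)
```

The per-site π computed this way is also the mean uncorrected pairwise distance (p-distance) within the sample; restricting the pairs to sequences from different lineages would give the between-lineage distance.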
Align_NE.SW_12S_137samples_gblock
This file contains 12S aligned sequences used in most phylogeographic analyses: population assignment, haplotype genealogy, species tree estimation, migration estimate, species validation, and model-based approach. Northeast (NE) and Southwest (SW) sequence clusters were used separately in historical demography analyses and phylogeographic reconstruction.
Align_NE.SW_ATPSB_252.phased.sequences_gblock
This file contains ATPSB aligned sequences used in most phylogeographic analyses: population assignment, haplotype genealogy, species tree estimation, migration estimate, species validation, and model-based approach. The Northeast (NE) sequence cluster was used separately in phylogeographic reconstruction.
Align_NE.SW_NKTR_230.phased.sequences
This file contains NKTR aligned sequences used in most phylogeographic analyses: population assignment, haplotype genealogy, species tree estimation, migration estimate, species validation, and model-based approach. The Northeast (NE) sequence cluster was used separately in phylogeographic reconstruction.
Align_NE.SW_R35_268.phased.sequeces
This file contains R35 aligned sequences used in most phylogeographic analyses: population assignment, haplotype genealogy, species tree estimation, migration estimate, species validation, and model-based approach. The Northeast (NE) sequence cluster was used separately in phylogeographic reconstruction.
Align_NE.SW_RP40_264.phased.sequences
This file contains RP40 aligned sequences used in most phylogeographic analyses: population assignment, haplotype genealogy, species tree estimation, migration estimate, species validation, and model-based approach. The Northeast (NE) sequence cluster was used separately in phylogeographic reconstruction.
Tree_12S_62samples_substitution.rate
Tree_12S_137samples.outgroup_gblock
Tree_ATPSB_126samples.outgroup_gblock
Tree_NKTR_115samples.outgroup
Tree_R35_134samples.outgroup
Tree_RP40_132samples.outgroup
Geneland_137samples_code
Sample codes used in Geneland analyses.
Geneland_137samples_4nuDNA_input
Geneland input file based on four nuclear genes.
Geneland_137samples_mtDNA_input
Geneland input file based on the mitochondrial gene (12S).
Geneland_137samples_LongLat_input
Geneland input file based on geographic coordinates (longitude and latitude) of each sample.
Structure_137samples_4nuDNA_input
Structure input file based on four nuclear genes (fasta format).
Structure_137samples_4nuDNA_input
Structure input file (genotype matrix format) obtained from xmfas2struct program; -9 is used for missing data.
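As an illustration of how the -9 missing-data code might be handled when inspecting a genotype matrix, here is a minimal Python sketch that reports the fraction of missing entries per locus. The layout assumed here (whitespace-separated rows, sample label in the first column, one allele code per locus in the remaining columns) and the function name `missing_fraction` are assumptions for illustration; the actual file may include additional header rows or columns and should be checked before use.

```python
def missing_fraction(lines, missing="-9"):
    """Fraction of missing allele codes per locus.

    Assumes (hypothetically) that each non-empty line is
    '<sample> <locus1> <locus2> ...', separated by whitespace.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    n_loci = len(rows[0]) - 1  # first column is the sample label
    fractions = []
    for j in range(1, n_loci + 1):
        column = [row[j] for row in rows]
        fractions.append(column.count(missing) / len(column))
    return fractions


# Toy matrix (made-up values): two diploid samples, three loci
example = [
    "s1  1  2 -9",
    "s1  1  3 -9",
    "s2 -9  2  4",
    "s2  2  2  4",
]
print(missing_fraction(example))
```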
IMa2_137samples_5genes_input
IMa2 input file based on five genes.