Genetic studies are increasingly detecting cryptic taxa that likely represent a significant component of global biodiversity. However, cryptic taxa are often criticized because they are typically detected serendipitously and may not receive the follow-up study required to verify their geographic or evolutionary limits. Here, we follow up on a study of Eucalyptus salubris that unexpectedly detected two divergent lineages but was not sampled sufficiently to allow clear interpretation. We undertook comprehensive sampling for an independent genomic analysis (3,605 SNPs) to investigate whether the two purported lineages remain discrete genetic entities or intergrade throughout the species’ range. We also assessed morphological and ecological traits, and sequenced chloroplast DNA. SNP results showed strong genome-wide divergence (FST=0.252) between two discrete lineages: one dominated the northern and the other the southern regions of the species’ range. Within lineages, gene flow was high, with low differentiation (mean FST=0.056) spanning hundreds of kilometres. In the central region, the lineages were interspersed but maintained their genomic distinctiveness: an indirect demonstration of reproductive isolation. Populations of the southern lineage exhibited significantly lower specific leaf area and occurred on soils with lower phosphorus relative to the northern lineage. Finally, two major chloroplast haplotypes were associated with each lineage but were shared between lineages in the central distribution. Together, these results suggest that these lineages have non-contemporary origins and that ecotypic adaptive processes strengthened their divergence more recently. We conclude that these lineages warrant taxonomic recognition as separate species and provide fascinating insight into eucalypt speciation.
Here, we present three datasets from a study of cryptic species in Eucalyptus salubris: SNP genotypes, specific leaf area measurements and soil phosphorus content. Additional chloroplast sequence data for this study are available on GenBank under accessions MT104517-MT104558.
1. SNP genotypes: this is the filtered dataset of 3,605 SNP loci used in all population genomic analyses. The data are in 2-row STRUCTURE input file format, with missing data coded as -9.
2. Specific leaf area: individual measurements of specific leaf area (ratio of leaf area to dry mass). Ten leaves were measured per tree, on the same trees sampled for DNA. Lineage classification is also indicated.
3. Soil phosphorus: soil phosphorus content (mg/kg) for each of the sites sampled. Lineage classification is also indicated.
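As an illustration of how the SNP genotype file in item 1 might be read, the following is a minimal Python sketch, not part of the original analysis. It assumes the common 2-row STRUCTURE layout in which each individual occupies two consecutive rows (one allele per locus on each row), the first column is an individual label, the second is a population or lineage code, and there is no header row of marker names. These column assumptions, the function names and the toy data are hypothetical and should be checked against the actual file.

```python
import io
from collections import defaultdict

MISSING = -9  # missing-data code stated in the dataset description


def read_structure_two_row(handle, n_meta_cols=2):
    """Parse a two-row-per-individual STRUCTURE file.

    Assumes whitespace-delimited fields, no header row, and that the first
    n_meta_cols columns are metadata (label, population); both are assumptions.
    Returns {individual_label: [(allele1, allele2), ...]} with one tuple per locus.
    """
    rows = defaultdict(list)
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        rows[label].append([int(x) for x in fields[n_meta_cols:]])

    genotypes = {}
    for label, pair in rows.items():
        if len(pair) != 2:
            raise ValueError(f"{label}: expected 2 rows, found {len(pair)}")
        genotypes[label] = list(zip(pair[0], pair[1]))
    return genotypes


def missing_rate(genotype):
    """Fraction of loci at which either allele is coded as missing."""
    n_missing = sum(1 for a, b in genotype if a == MISSING or b == MISSING)
    return n_missing / len(genotype)


# Toy data (invented for illustration): 2 individuals, 4 loci
example = io.StringIO(
    "tree1 1 1 2 -9 4\n"
    "tree1 1 1 3 -9 4\n"
    "tree2 2 2 2 1 4\n"
    "tree2 2 2 3 1 -9\n"
)
genos = read_structure_two_row(example)
rates = {label: missing_rate(g) for label, g in genos.items()}
```

To read the real file, pass an open file handle in place of the `io.StringIO` object; if the file does include a header row of marker names, skip it before calling the parser.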