The formation and spread of the Australian arid zone during the Neogene was a profoundly transformative event in the biogeographic history of Australia, resulting in extinction or range contraction in lineages adapted to mesic habitats, as well as diversification and range expansion in arid-adapted taxa (most of which evolved from mesic ancestors). However, the geographic origins of the arid zone biota are still relatively poorly understood, especially among highly diverse invertebrate lineages, many of which are themselves poorly documented at the species level. Spiny trapdoor spiders (Idiopidae: Arbanitinae) are one such lineage, having mesic ‘on-the-continent’ Gondwanan origins, while also having experienced major arid zone radiations in select clades. In this study, we present new orthologous nuclear markers for the phylogenetic inference of mygalomorph spiders, and use them to infer the phylogeny of Australasian Idiopidae with a 12-gene parallel tagged amplicon next-generation sequencing approach. We use these data to test the mode and timing of diversification of arid-adapted idiopid lineages across mainland Australia, and employ a continent-wide sampling of the fauna’s phylogenetic and geographic diversity to facilitate ancestral area inference. We further explore the evolution of phenotypic and behavioural characters associated with both arid and mesic environments, and test an ‘out of south-western Australia’ hypothesis for the origin of arid zone clades. Three lineages of Idiopidae are shown to have diversified in the arid zone during the Miocene, one (genus Euoplos) exclusively in Western Australia. Arid zone Blakistonia likely had their origins in South Australia, whereas in the most widespread genus Aganippe, a more complex scenario is evident, with likely range expansion from southern Western Australia to southern South Australia, from where the bulk of the arid zone fauna then originated. 
In Aganippe, remarkable adaptations to phragmotic burrow-plugging in transitional arid zone taxa have evolved twice independently in Western Australia, while in Misgolas and Cataxia, burrow door-building behaviours have likely been independently lost at least three times in the eastern Australian mesic zone. We also show that the presence of idiopids in New Zealand (Cantuaria) is likely to be the result of recent dispersal from Australia, rather than ancient continental vicariance. By providing the first comprehensive, continental synopsis of arid zone biogeography in an Australian arachnid lineage, we show that the diversification of arbanitine Idiopidae was intimately associated with climate shifts during the Neogene, resulting in multiple Mio-Pliocene radiations.
Supplementary File 1 - DNA orthologs Initial Assessment Subset
Supplementary File 1. Concatenated alignment of nucleotide sequence data for the ‘initial assessment’ subset of 151 orthologous loci. See character set partitions at end of matrix for the gene-specific breakdown of putative orthologs. See Supplementary File 4 for taxon abbreviations.
Supplementary File 2 - AA orthologs Initial Assessment Subset
Supplementary File 2. Concatenated alignment of amino acid sequence data for the ‘initial assessment’ subset of 151 orthologous loci. See character set partitions at end of matrix for the gene-specific breakdown of putative orthologs. See Supplementary File 4 for taxon abbreviations.
Supplementary File 3 - Comparison of phylogenetic informativeness measures
Supplementary File 3. Comparison of phylogenetic informativeness measures across three epochs (deep = Epoch 3; taxonomy = Epoch 2; species = Epoch 1; see Fig. 2, Supplementary File 4), using the ‘initial assessment’ subset of 151 orthologous loci (Supplementary Files 1-2).
Supplementary File 4 - PhyDesign Informativeness Quantitation IA Subset
Supplementary File 4. Phylogenetic informativeness quantitation outputs from PhyDesign (López-Giráldez and Townsend, 2011) for nucleotide (Tab 1) and amino acid (Tab 2) data, using the ‘initial assessment’ subset of 151 orthologous loci (Supplementary Files 1-2). Loci are listed and ranked according to their per-site informativeness over three epochs (0-25 Ma; 25-160 Ma; 160-294.965 Ma); the top-10 ranked loci for each epoch are highlighted in red.
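The per-epoch ranking described for Supplementary File 4 can be illustrated with a minimal Python sketch. It is not the PhyDesign workflow itself: the locus names and scores are hypothetical placeholders, and the mapping of Epoch 1-3 onto the three time intervals (youngest to oldest) is an assumption based on the labels given for Supplementary File 3.

```python
from typing import Dict, List

# Epoch boundaries in Ma, taken from the file description. Assigning
# Epoch 1 to the youngest interval is an assumption, not stated outright.
EPOCHS = {
    "Epoch 1": (0.0, 25.0),
    "Epoch 2": (25.0, 160.0),
    "Epoch 3": (160.0, 294.965),
}


def top_ranked(scores: Dict[str, Dict[str, float]], epoch: str, n: int = 10) -> List[str]:
    """Return the n loci with the highest per-site informativeness in one epoch."""
    ranked = sorted(scores, key=lambda locus: scores[locus][epoch], reverse=True)
    return ranked[:n]


# Hypothetical per-site informativeness values, for illustration only.
example_scores = {
    "locus_A": {"Epoch 1": 0.9, "Epoch 2": 0.4, "Epoch 3": 0.1},
    "locus_B": {"Epoch 1": 0.2, "Epoch 2": 0.8, "Epoch 3": 0.3},
    "locus_C": {"Epoch 1": 0.5, "Epoch 2": 0.1, "Epoch 3": 0.7},
}

for epoch in EPOCHS:
    print(epoch, top_ranked(example_scores, epoch, n=2))
```

Ranking each epoch separately reflects the point that a locus informative at shallow divergences need not be informative at deep ones.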
Supplementary File 5 - Specimen & laboratory data
Supplementary File 5. Collection and repository data for every specimen sequenced as part of this study, along with conditions for polymerase chain reaction (PCR) amplification of amplicons and GenBank accession numbers. F = forward primer; R = reverse primer; Ta = annealing temperature.
Supplementary File 6 - Loci
Supplementary File 6. Locus summaries for genes used in the final optimised 12-gene ‘FULL’ dataset (see Tables 1-2). Genes are organised according to their phylogenetic informativeness over three epochs (see Table 2, Supplementary File 3), followed by additional unranked nuclear loci from previous studies. Test taxa listed are those sequenced as part of Phase 6 (see Materials and Methods).
Supplementary File 7 - FULL dataset MrBayes
Supplementary File 7. Concatenated alignment of nucleotide sequence data for the final 12-gene 129-taxon (FULL) dataset, including Bayesian command block. See character set partitions at end of matrix for the gene- and partition-specific breakdowns.
Supplementary File 8 - MAJORITY dataset MrBayes
Supplementary File 8. Concatenated alignment of nucleotide sequence data for the reduced representation 12-gene 120-taxon (MAJORITY) dataset, including Bayesian command block. See character set partitions at end of matrix for the gene- and partition-specific breakdowns.
Supplementary File 9 - BEAST_9 analysis
Supplementary File 9. BEAST .xml input file used for analysis of the ‘BEAST_9’ dataset.
Supplementary File 10 - BEAST_7 analysis
Supplementary File 10. BEAST .xml input file used for analysis of the ‘BEAST_7’ dataset.
Supplementary File 11 - FULL analysis MrBayes topology detail
Supplementary File 11. Bayesian majority-rule consensus tree (presented as two parts) resulting from a partitioned phylogenetic analysis of the ‘FULL’ (12-gene) dataset (129 taxa; 9,601 bp; 20 million generations). Non-idiopid outgroups have been omitted for ease of presentation, and posterior probability values are > 0.98 unless otherwise stated. The type species of valid genera are shown in red, and the type species of junior generic synonyms are shown in dark blue; specimens of named species from their respective type localities are highlighted (#). The schematic summary phylogeny of Figure 5 is shown at left for comparison, and clades not recovered as presented in the Bayesian and/or likelihood analyses of the ‘MAJORITY’ dataset are highlighted (*).
Supplementary File 12 - Parsimony ancestral states
Supplementary File 12. Mesquite input file of characters and the tree resulting from Bayesian analysis of the ‘FULL’ dataset, used for parsimony ancestral state reconstruction. Character coding is as follows: CHARACTER 1 (distribution): 0 = south-western Western Australia; 1 = arid zone; 2 = temperate south-eastern Australia; 3 = eastern Australia. CHARACTER 2 (phragmosis): 0 = unmodified abdomen; 1 = phragmotic abdominal morphology. CHARACTER 3 (burrow morphology): 0 = closed, with a hinged trapdoor lid; 1 = open burrow; 2 = semi-closed burrow, with a sock-like entrance.
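The character coding above can be written down as plain lookup tables, and the idea behind parsimony ancestral state reconstruction can be sketched with a Fitch-style downpass. This is only an illustration, not the Mesquite analysis itself: it assumes unordered characters (the description does not say whether characters were treated as ordered or unordered), and the tree and tip names are hypothetical.

```python
# State labels copied from the character coding in the file description.
DISTRIBUTION = {
    0: "south-western Western Australia",
    1: "arid zone",
    2: "temperate south-eastern Australia",
    3: "eastern Australia",
}
PHRAGMOSIS = {0: "unmodified abdomen", 1: "phragmotic abdominal morphology"}
BURROW = {
    0: "closed, with a hinged trapdoor lid",
    1: "open burrow",
    2: "semi-closed burrow, with a sock-like entrance",
}


def fitch_downpass(tree, tip_states):
    """Fitch downpass on a binary tree given as nested 2-tuples of tip names.

    Returns (candidate state set at this node, minimum number of changes below it).
    Assumes an unordered character; a full reconstruction also needs an uppass.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_downpass(left, tip_states)
    right_set, right_cost = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-tip tree and CHARACTER 1 (distribution) codes.
toy_tree = (("tip_A", "tip_B"), ("tip_C", "tip_D"))
toy_states = {"tip_A": 0, "tip_B": 0, "tip_C": 1, "tip_D": 0}

root_set, steps = fitch_downpass(toy_tree, toy_states)
print([DISTRIBUTION[s] for s in sorted(root_set)], steps)
```

In this toy case the root is reconstructed as state 0 with a single change, the same kind of pattern that an ‘out of south-western Australia’ scenario would predict for an arid zone clade nested among south-western taxa.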
Supplementary File 13 - BEAST_7 tree for Cantuaria
Supplementary File 13. Chronogram resulting from a fossil-calibrated BEAST analysis of the reduced-representation ‘BEAST_7’ dataset, showing New Zealand taxa in the genus Cantuaria. The time-scale highlights 35 Ma and the beginning of the Miocene at 23 Ma. See Supplementary File 5 for the identification of numbered taxa.