A new sectional classification of Lachenalia (Asparagaceae) based on a multilocus DNA phylogeny

Duncan, Graham D.; Schlichting, Carl D.; Forest, Felix; Ellis, Allan G.; Lemmon, Alan R.; Moriarty Lemmon, Emily; Verboom, G. Anthony

Published Dec 05, 2021 on Dryad. https://doi.org/10.5061/dryad.9kd51c5jf

Data files

Dec 05, 2021 version files (3.53 MB in total)

Appendix_1.xlsx: 14.07 KB
APPENDIX_2.txt: 24.65 KB
AstralTree.tre: 12.69 KB
RAxML_bestTree_T467_morphology.nex: 50.94 KB
RAxML_bipartitions.T467_ConcatLoci: 15.99 KB
readme.txt: 993 B
Table_taxa.xlsx: 16.23 KB
TrimmedAlignments_conSeqs.tar.gz: 3.40 MB

Abstract

Lachenalia J.Jacq. ex Murray (Asparagaceae; Scilloideae; Hyacintheae) is a large and morphologically diverse genus of more than 140 bulbous species endemic to southern Africa. Previous attempts to infer a well resolved and robustly supported phylogeny of Lachenalia using Sanger sequencing of candidate loci and/or morphological characters have been largely unsuccessful. Consequently, the current infrageneric classification is artificial and there is a need to explore alternative avenues to produce a phylogenetic classification. In this paper we present a novel phylogenetic hypothesis for Lachenalia inferred using maximum likelihood and coalescent-based species tree estimation (ASTRAL) as applied to 378 hybrid-enrichment loci. Our tree is well resolved and well supported, providing strong support for a monophyletic radiation of the genus in southern Africa and a solid foundation for a revised infrageneric classification. The well-supported placement of L. isopetala Jacq. as sister to Lachenalia + Massonia supports the establishment of a new monotypic genus, Pseudolachenalia G.D.Duncan, to accommodate this species. Conversely, the inclusion of species previously classified as Polyxena Kunth within the Lachenalia clade supports the transfer of these species to Lachenalia. Within Lachenalia, the delimitation of subgenera and sections is complicated by the highly imbalanced character of the phylogeny and by the high levels of homoplasy shown by most morphological characters traditionally used to delimit species in this group. Nonetheless, we propose an infrageneric taxonomy comprising 10 morphologically distinct, monophyletic sections. The largest of these, section Lachenalia, is further divided into 13 more-or-less diagnosable, monophyletic subsections. Keys to the sections of Lachenalia, and to the subsections of section Lachenalia, are provided.

Sampling

Actively growing leaf material of 132 Lachenalia accessions, representing 118 species plus a further five subspecies, was sampled from the wild-provenance collection of indigenous bulbous taxa in cultivation in the bulb nursery at Kirstenbosch National Botanical Garden (Table 1). The balance of the Lachenalia samples were replicate collections of selected widespread and/or morphologically variable species. In addition, we sampled, as outgroups, Massonia bifolia (Jacq.) J.C.Manning & Goldblatt, M. depressa Houtt., M. pustulata Jacq. and Veltheimia capensis (L.) DC., giving a total of 136 accessions. Voucher specimens of sampled accessions are lodged in the Compton Herbarium (NBG).

DNA isolation and sequencing

DNA was isolated from plant tissue according to the protocol of Healey & al. (2014). Two grams of fresh leaf material were frozen and pulverized in liquid nitrogen using a mortar and pestle. As recommended by Healey & al. (2014), all steps were performed in 50 ml Falcon tubes, with the final DNA pellets eluted in 200 µl of Tris-EDTA (TE) buffer. Where pellets did not resuspend fully, additional TE buffer (typically 100-200 µl) was added until the pellet was fully dissolved. The DNA samples were then purified using the Qiagen Genomic DNEasy Clean and Concentrator Kit (Qiagen N.V., Venlo, the Netherlands) according to the manufacturer’s instructions. Thereafter, the quality and quantity of DNA were assessed both by running and visualizing samples in a 1% TBE agarose gel containing ethidium bromide and by using a NanoDrop spectrophotometer.

Sequence data collection

DNA sequences were collected in collaboration with the Center for Anchored Phylogenomics (www.anchoredphylogeny.com). Libraries were prepared following a protocol originally developed by Meyer & Kircher (2010) and later modified by Prum & al. (2015). Extracted DNA was quantified using a Qubit fluorometer (ThermoFisher Scientific), then sonicated to a size distribution of 250-400 bp using a Covaris ultrasonicator. Following end repair, common Illumina adapters were ligated onto the blunt ends, after which sample-specific 8 bp single indexes were introduced using an indexing PCR step. Libraries were pooled in equal concentration in three batches of ca. 16 samples, prior to enrichment using the AHE Angiosperm kit (v.1) described in Buddenhagen & al. (2016), Mitchell & al. (2017) and Léveillé-Bourret & al. (2018). This kit targets 517 AHE loci and was derived from 25 genomes from species across the angiosperm clade. The targets for this kit are exons from low-copy genes that align well across the angiosperms due to their intermediate levels of variation. These loci were not selected based on function. Although the target exons are somewhat short (typically <500 bp each), the method also enriches for flanking regions, which contain higher levels of sequence variation and are useful for resolving shallow-level divergences. Enriched library pools were then combined into a single sequencing pool (again, in equal concentration), which was sequenced on a single Illumina HiSeq2500 sequencing lane with a paired-end 150 bp protocol.

Read assembly

Reads were processed following Hamilton & al. (2016). Reads passing the CASAVA high-chastity filter were demultiplexed with no mismatches tolerated. Paired reads were then checked for overlap and merged when appropriate, following Rokyta & al. (2012). This process corrects sequencing errors and removes sequencing adapters. Reads were then assembled using the quasi-de novo assembler described in Hamilton & al. (2016), with probe region sequences from Arabidopsis thaliana (L.) Heynh., Billbergia nutans H.Wendl. ex Regel and Carex lurida Wahlenb. serving as divergent references. Following assembly, consensus sequences were produced for assembly clusters containing at least 20 reads. Orthology was assessed for each locus using a neighbor-joining approach that utilized alignment-free distance matrices (see Hamilton & al., 2016 for details). Orthologous sequences were aligned using Mafft v. 7.023b (Katoh & Standley, 2013), then trimmed and masked following Hamilton & al. (2016). Geneious v. R9 (Kearse & al., 2012) was used to manually inspect the trimmed alignments to check for alignment and trimming errors. The bioinformatics pipeline produced 380 raw alignments, averaging 1337 bp in length (min = 741 bp, max = 4665 bp). After trimming and masking, 378 aligned loci remained, averaging 628 bp in length (min = 98 bp, max = 1994 bp). In total, these alignments contained 231,198 sites (77,330 informative). Only 15% of the characters in these alignments were ambiguous.
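The site counts reported above can be illustrated with a minimal Python sketch. It assumes that "informative" means parsimony-informative (at least two character states, each present in at least two sequences) and that gaps, "?" and N are treated as missing data; the function names and the toy alignment are hypothetical and are not part of the deposited pipeline.

```python
from collections import Counter

# Assumption: these symbols are treated as missing data and ignored.
MISSING = set("-?NnXx")


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(base.upper() for base in column if base not in MISSING)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def alignment_summary(seqs):
    """Return (number of sites, number of parsimony-informative sites)."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; not an alignment")
    n_sites = lengths.pop()
    informative = sum(is_parsimony_informative(col) for col in zip(*seqs.values()))
    return n_sites, informative


# Hypothetical toy alignment, for illustration only.
toy = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTAT",
    "taxon_C": "ATGTGT",
    "taxon_D": "ATG-GC",
}
print(alignment_summary(toy))
```

Summing the two returned values over all trimmed loci would give totals comparable to those quoted above, provided the same definition of "informative" was used.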

Phylogenetic analysis

A phylogenetic tree was estimated by maximum likelihood, as implemented in RAxML v. 2.2.3 (Stamatakis, 2006), using default settings. DNA was assumed to follow a general time-reversible model of evolution. Data were partitioned by locus. One hundred bootstrap replicates were performed to assess uncertainty in the phylogenetic estimate. Gene trees, estimated for each locus using the same procedure, were then used to estimate a species tree using ASTRAL (Mirarab & Warnow, 2015).
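Partitioning by locus amounts to telling the likelihood program which columns of the concatenated alignment belong to which locus. The sketch below, in Python, builds partition lines in the "DNA, name = start-end" style that RAxML reads from a partition file; the helper name, the locus names and the three example lengths are hypothetical, and the exact files used by the authors are not reproduced here.

```python
def raxml_partitions(locus_lengths):
    """Build one partition line per locus for a concatenated alignment.

    locus_lengths: list of (locus_name, aligned_length) in concatenation order.
    Coordinates are 1-based and inclusive.
    """
    lines = []
    start = 1
    for name, length in locus_lengths:
        if length <= 0:
            raise ValueError(f"locus {name!r} has non-positive length")
        end = start + length - 1
        lines.append(f"DNA, {name} = {start}-{end}")
        start = end + 1
    return lines


# Hypothetical loci; the lengths are only illustrative.
example = [("L1", 628), ("L2", 98), ("L3", 1994)]
for line in raxml_partitions(example):
    print(line)
```

In RAxML releases that support it, such a file is supplied with the -q option; whether the authors invoked the program this way is an assumption.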

Morphological matrix

All species included in the phylogenetic analysis were scored for 76 characters describing variation in the morphology of bulbs, leaves, seeds, inflorescences and flowers (Appendices 1, 2). For this purpose, we made use of Mesquite version 3.6 (Maddison & Maddison, 2018). Forty-eight characters were binary and 28 multistate, with seven of the latter (i.e. leaf number; leaf orientation; ventral tepal extension; seed length; seed width; strophiole length; chalazal collar length) being treated as ordered because they are fundamentally quantitative. Where character states were unknown or inapplicable, they were scored as “?”. Ancestral character states were then reconstructed in Mesquite using parsimony and the ASTRAL species tree for the purpose of identifying clade-specific synapomorphies.
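As an illustration of how parsimony identifies a clade-specific synapomorphy, the following Python sketch applies the downpass of Fitch's algorithm to a single unordered character on a small rooted binary tree. It is not Mesquite's implementation: the tree, the taxon names and the scoring are hypothetical, and unknown ("?") states and ordered characters are deliberately not handled.

```python
def fitch_downpass(node, tip_states):
    """Fitch downpass for one unordered character.

    node: a tip name (str) or a 2-tuple of child nodes.
    tip_states: dict mapping tip name to its observed state.
    Returns (possible state set at this node, minimum changes below it).
    """
    if isinstance(node, str):
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_steps = fitch_downpass(left, tip_states)
    right_set, right_steps = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_steps + right_steps
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical five-taxon tree and binary character (0 = absent, 1 = present).
tree = ((("sp_A", "sp_B"), "sp_C"), ("sp_D", "sp_E"))
states = {"sp_A": 1, "sp_B": 1, "sp_C": 0, "sp_D": 0, "sp_E": 0}

root_set, steps = fitch_downpass(tree, states)
print(root_set, steps)
```

In this toy case the root is reconstructed as state 0 with a single change, so state 1 would be read as a synapomorphy of the clade sp_A + sp_B.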
