
Extremely low genetic diversity in the European clade of the model bryophyte Anthoceros agrestis

Cite this dataset

Dawes, Tom; Villarreal A., Juan Carlos; Forrest, Laura L. (2020). Extremely low genetic diversity in the European clade of the model bryophyte Anthoceros agrestis [Dataset]. Dryad.


The hornwort Anthoceros agrestis is emerging as a model system for the study of symbiotic interactions and carbon fixation processes. It is an annual species with a remarkably small and compact genome. Single accessions of the plant have been shown to be related to the cosmopolitan perennial hornwort Anthoceros punctatus. We provide the first detailed insight into the evolutionary history of the two species. Because organellar loci are rather conserved, we sequenced multiple accessions in the Anthoceros agrestis-A. punctatus complex using three nuclear regions: the ribosomal spacer ITS2, and exon and intron regions from the single-copy coding genes rbcS and phytochrome. We used phylogenetic and dating analyses to uncover the relationships between these two taxa. Our analyses resolve a lineage of genetically near-uniform European A. agrestis accessions and two non-European A. agrestis lineages. In addition, the cosmopolitan species Anthoceros punctatus forms two lineages, one of mostly European accessions, and another from India. All studied European A. agrestis accessions have a single origin, radiated relatively recently (less than 1 million years ago), and are currently strictly associated with agroecosystem habitats.


Molecular methods: We sequenced 51 accessions representing Anthoceros punctatus and A. agrestis from Europe, the Americas, Australia and Asia (Table S1, Figure 1). Given some uncertainty as to what the closest extant relative of A. agrestis and A. punctatus is, we included several related species of Anthoceros as outgroup taxa, based on previous studies (Duff et al. 2007; Villarreal et al. 2017). We included five accessions of A. neesii, three accessions of A. cristatus, and one accession each of A. venosus, A. lamellatus and A. fragilis. All those species share similar spore morphology and have been associated with the A. punctatus complex (Schuster 1992; Villarreal et al. 2015). Voucher information and GenBank accession numbers are available in Supplementary Table 1. The DNA extraction and PCR amplification followed standard protocols (available upon request). The data set includes nucleotide sequences from three nuclear loci: part of the RuBisCO small subunit (rbcS), which has been shown to be single copy in European accessions of A. agrestis, part of the phytochrome gene, which again is single copy in European A. agrestis (Szövényi unpublished data), and the internal transcribed spacer 2 region (ITS2). Primers to amplify rbcS and phytochrome were newly generated as part of this study (Table S2). We used Geneious 9.0.5 (Biomatters Limited) to generate an initial alignment of the nucleotide sequences, with subsequent manual augmentation. Due to some failed amplifications, the final data matrix (hereafter called dataset I) comprises forty-seven accessions that have sequence data for at least two loci, including thirty accessions of A. agrestis, twelve accessions of A. punctatus, two accessions of A. neesii, and one each of A. cristatus, A. lamellatus, A. venosus and A. fragilis. A subset of the matrix used for the dating analyses (dataset II) contains 20 accessions and includes representatives of A. agrestis, A. punctatus, and the outgroup species A. neesii, A. cristatus and A. fragilis.
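
The inclusion rule for dataset I (an accession is kept only if it has sequence data for at least two of the three loci) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the function names, the dictionary layout and the accession labels are hypothetical, and treating gap, "?" and "N" characters as missing data is an assumption.

```python
from typing import Dict, List, Optional

# The three nuclear loci named in the methods.
LOCI = ("ITS2", "rbcS", "phytochrome")

# Characters assumed (for this sketch) to carry no sequence information.
MISSING_CHARS = set("-?Nn")


def has_data(seq: Optional[str]) -> bool:
    """Return True if the sequence contains at least one informative character."""
    return bool(seq) and any(ch not in MISSING_CHARS for ch in seq)


def accessions_with_min_loci(
    matrix: Dict[str, Dict[str, Optional[str]]], min_loci: int = 2
) -> List[str]:
    """Keep accessions that have data for at least `min_loci` of the loci."""
    kept = []
    for accession, seqs in matrix.items():
        n_loci = sum(1 for locus in LOCI if has_data(seqs.get(locus)))
        if n_loci >= min_loci:
            kept.append(accession)
    return kept


# Hypothetical accessions, for illustration only.
example = {
    "agrestis_01": {"ITS2": "ACGT", "rbcS": "GGCA", "phytochrome": "TTAG"},
    "agrestis_02": {"ITS2": "ACGT", "rbcS": None, "phytochrome": "TTAG"},
    "punctatus_01": {"ITS2": "ACGT", "rbcS": "----", "phytochrome": None},
}
kept_accessions = accessions_with_min_loci(example)
```

In this toy input the third accession is dropped because only ITS2 carries data.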

Phylogenetic analyses: We analysed each locus separately, and also as a single partitioned dataset with models suggested by PartitionFinder (Lanfear et al. 2012), under the maximum likelihood criterion (ML), using RAxML black box (Stamatakis 2014) with 500 bootstrap replicates (MLB). Basic statistics on each dataset were retrieved using PAUP* (Swofford 2002). Bayesian analyses on the combined dataset were conducted in MrBayes 3.2 (Ronquist et al. 2012), using the default two runs and four chains, with default priors on most parameters. To assess burn-in and convergence we compared the bipartitions across the two runs. Convergence was usually achieved in MrBayes after 2 × 10^7 generations, with trees sampled every 10,000th generation for a total length of 10 × 10^6 generations; we discarded 25% of each run and then pooled runs. All analyses were run under the CIPRES platform (Warnow 2010).
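
The burn-in step described above (discard the first 25% of each run, then pool the runs) can be illustrated with a short Python sketch. It assumes each run is simply an ordered list of sampled trees or parameter values; the function name and the toy runs are hypothetical, and the sketch does not reproduce how MrBayes itself applies burn-in.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pool_after_burnin(
    runs: Sequence[Sequence[T]], burnin_fraction: float = 0.25
) -> List[T]:
    """Drop the first `burnin_fraction` of each run, then concatenate the runs."""
    if not 0.0 <= burnin_fraction < 1.0:
        raise ValueError("burnin_fraction must be in [0, 1)")
    pooled: List[T] = []
    for samples in runs:
        # Number of leading samples treated as burn-in (rounded down).
        n_discard = int(len(samples) * burnin_fraction)
        pooled.extend(samples[n_discard:])
    return pooled


# Two toy runs of eight samples each stand in for the two MrBayes runs.
run_a = list(range(8))
run_b = list(range(100, 108))
pooled_samples = pool_after_burnin([run_a, run_b])
```

Applying the burn-in per run before pooling, rather than to the concatenated samples, ensures that the early, unconverged portion of every run is removed.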

Molecular clock dating: We used a root prior from a previous study (Villarreal et al. 2015) to date a phylogeny produced using dataset II, with A. cristatus, A. fragilis and A. neesii as outgroup taxa. There is no reliable fossil described from Anthoceros, although spores similar to those of A. punctatus and referred to as Rudolphisporis rudolfii have been reported from Late Miocene (Machnín, Bohemia), Pliocene and Pleistocene deposits from Europe (Krutzsch 1963). We decided to use a root prior because of the uncertainty around this fossil spore and the lack of precise dating of the stratum. We gave the root a normal prior, with a mean of 20 Ma and sd of 3, to account for the highest posterior density (HPD) from a previous study (Villarreal et al. 2015). In addition, we used the crown age of the A. punctatus – agrestis group from this previous study, applying it to the ingroup with a mean of 6 Ma and sd of 1. The last calibration used was the age of Ascension Island, as a proxy age for the neo-endemic A. cristatus. Ascension Island is the tip of an undersea volcano that is thought to have emerged from the ocean one million years ago (Ashmole and Ashmole 2000). We used a normal prior with a mean of 1 Ma and sd of 0.25. To explore the effective priors, we ran analyses with either one or two calibrations at a time, and one on an empty alignment, to compare the frequency distribution of age estimates for each calibrated node with the prior. Bayesian divergence time estimation used a Yule process tree prior with unlinked data partitions, using the same substitution models suggested by PartitionFinder. The analyses were done using an uncorrelated log-normal (UCLN) relaxed clock model. The MCMC chains were run for 900 million generations, with parameters sampled every 10,000th generation using BEAST 1.8.3 (Drummond et al. 2012). 
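
To make the three normal calibration priors more concrete, the following Python sketch computes the central 95% range implied by each stated mean and standard deviation (in Ma). The dictionary keys are informal labels chosen for this illustration, and the calculation describes only the specified priors, not the effective priors or posteriors produced by BEAST.

```python
from statistics import NormalDist

# (mean, sd) in Ma, as stated in the text; the labels are informal.
CALIBRATIONS = {
    "root": (20.0, 3.0),
    "A. punctatus-agrestis crown": (6.0, 1.0),
    "A. cristatus (Ascension Island)": (1.0, 0.25),
}


def central_interval(mean: float, sd: float, mass: float = 0.95):
    """Return the central interval of a normal prior holding `mass` probability."""
    dist = NormalDist(mu=mean, sigma=sd)
    tail = (1.0 - mass) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


prior_ranges = {
    name: central_interval(mean, sd) for name, (mean, sd) in CALIBRATIONS.items()
}

for name, (low, high) in prior_ranges.items():
    print(f"{name}: {low:.2f}-{high:.2f} Ma")
```

Under these assumptions the root prior places 95% of its mass between roughly 14.1 and 25.9 Ma, the crown prior between roughly 4.0 and 8.0 Ma, and the island prior between roughly 0.5 and 1.5 Ma.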
Tracer 1.5 was used to assess effective sample sizes (ESS) for all estimated parameters and to decide the appropriate percentages of burn-in. We verified that all ESS values were >200. Trees were combined in TreeAnnotator 1.8 (part of the BEAST package), and maximum clade credibility trees with mean node heights were visualized using FigTree 1.4.0. We report the HPD intervals (the interval containing 95% of the sampled values). 
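
One common way to compute a 95% HPD interval from posterior samples is to take the narrowest window that contains 95% of the sorted values. The Python sketch below illustrates that idea under the assumption of a unimodal posterior; it is not the code used by Tracer or TreeAnnotator, and the function name and toy samples are hypothetical.

```python
import math
from typing import Sequence, Tuple


def hpd_interval(samples: Sequence[float], mass: float = 0.95) -> Tuple[float, float]:
    """Return the narrowest interval containing at least `mass` of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    xs = sorted(samples)
    n = len(xs)
    # Number of samples that the interval must cover.
    k = max(1, math.ceil(mass * n))
    best_low, best_high = xs[0], xs[k - 1]
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    for i in range(1, n - k + 1):
        low, high = xs[i], xs[i + k - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Toy node-age samples (in Ma), for illustration only.
toy_ages = [1.0, 2.0, 2.5, 10.0]
toy_hpd = hpd_interval(toy_ages, mass=0.5)
```

Unlike a central interval, the HPD interval need not be symmetric around the mean, which matters for skewed node-age posteriors.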


Smithsonian Tropical Research Institute, Award: Earl S Tupper Fellowship 2015

Conseil de recherches en sciences naturelles et en génie du Canada, Award: RGPIN/05967-2016

Rural and Environment Science and Analytical Services Division

Scottish Government's Rural and Environmental Science and Analytical Services Division