Disentangling lousy relationships: Comparative phylogenomics of two sucking louse lineages parasitizing chipmunks

Cite this dataset

Bell, Kayce et al. (2021). Disentangling lousy relationships: Comparative phylogenomics of two sucking louse lineages parasitizing chipmunks [Dataset]. Dryad.


The evolution of obligate parasites is often interpreted in light of their hosts’ evolutionary history. An expanded approach is to examine the histories of multiple lineages of parasites that inhabit similar environments on a particular host lineage. Western North American chipmunks (genus Tamias) have a broad distribution, a history of divergence with gene flow, and host two species of sucking lice (Anoplura), Hoplopleura arboricola and Neohaematopinus pacificus. From total genomic sequencing, we obtained sequences of over 1100 loci sampled across the genomes of these lice to compare their evolutionary histories and examine the roles of host association in structuring louse relationships. Within each louse species, clades are largely associated with closely related chipmunk host species. Exceptions to this pattern appear to have a biogeographic component, but differ between the two louse species. Phylogenetic relationships among these major louse clades, in both species, are not congruent with chipmunk relationships. In the context of host associations, each louse lineage has a different evolutionary history, supporting the hypothesis that host-parasite assemblages vary both across the landscape and with the taxa under investigation. In addition, the louse Hoplopleura erratica (parasitizing the eastern Tamias striatus) is embedded within H. arboricola, rendering H. arboricola paraphyletic. This phylogenetic result, together with comparable divergences within H. arboricola, indicates a need for taxonomic revision. Both host divergence and biogeographic components shape parasite diversification, as demonstrated by the distinctive diversification patterns of these two independently evolving lineages that parasitize the same hosts.


DNA was extracted from sucking lice by grinding one individual in extraction buffer, following the Qiagen QIAamp Micro kit (Qiagen, Hilden, Germany) protocols with the following exceptions: samples were digested for 48 h at 72°C, and the final elution buffer was heated to 55°C and incubated on the column membrane for 5 min at 55°C. Louse DNA was prepared for whole genome sequencing with the KAPA Hyper Prep Kit (Kapa Biosystems, Wilmington, MA). Libraries for 9 or 10 samples were pooled, and 150 bp paired-end reads were run on six Illumina HiSeq 2500 lanes at the Roy J. Carver Biotechnology Center at the University of Illinois at Urbana-Champaign. Sequencing reads were first examined using FastQC v0.10.1 (Babraham Bioinformatics, Andrews 2010) to screen for sequencing anomalies. We removed duplicated sequence read pairs using the script available from the mcscript GitHub package. The de-duplicated reads were then trimmed in the FASTX Toolkit v0.0.14 (Hannon Lab) by removing the first 3 bases, which had consistently lower scores, from the 5’ end of each read. All reads were then quality trimmed from the 3’ end to remove bases with a Phred score below 28 using a sliding window of 1 nt. Finally, trimmed reads with fewer than 75 nt were removed from the dataset. A curated set of 1,107 genes from the human louse, Pediculus humanus, was assembled in aTRAM v1.0 (Allen et al. 2015) using the ABySS v1.5 assembler (Simpson et al. 2009) and 3 iterations, with the protein sequences from Pediculus humanus as the targets for tblastn searches of quality-trimmed reads. Following assembly of loci, the exons of each locus were assembled together using the exon_stitching program in aTRAM as described in Allen et al. (2017). In this exon-stitching step, we used the program Exonerate v2.2 (Slater and Birney 2005) to identify the exonic regions in each of the aTRAM assemblies and then stitched them together into one contig that contained all the exons per gene.
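The three read-trimming rules described above (drop the first 3 bases at the 5’ end, trim 3’ bases with a Phred score below 28 using a 1-nt window, discard reads shorter than 75 nt) can be illustrated with a minimal Python sketch. This is not the FASTX Toolkit code. The function names, the Phred+33 quality encoding, and the choice to stop trimming at the first 3’ base that meets the threshold are assumptions made for illustration.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - 33 for char in quality_string]


def trim_read(seq, quality_string, head=3, min_q=28, min_len=75):
    """Trim one read; return (seq, quality_string), or None if it ends up too short.

    Hypothetical helper: the defaults mirror the thresholds stated in the text.
    """
    # Step 1: drop the first `head` bases from the 5' end.
    seq, quality_string = seq[head:], quality_string[head:]

    # Step 2: with a 1-nt window, walk inward from the 3' end and stop at
    # the first base whose score reaches `min_q`.
    scores = decode_phred33(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    seq, quality_string = seq[:end], quality_string[:end]

    # Step 3: discard reads shorter than `min_len` after trimming.
    if len(seq) < min_len:
        return None
    return seq, quality_string
```

For example, a 100-nt read whose last 5 bases score 20 would lose 3 bases at the 5’ end and 5 at the 3’ end, leaving 92 nt, which passes the 75-nt length filter.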
These loci were aligned with a translated alignment in PRANK v.170427 (Löytynoja 2014) and back-translated to DNA. Following that alignment, we removed sites with over 90% missing data or gaps in trimAl v1.2 (Capella-Gutierrez et al. 2009). Each locus was then aligned with MAFFT v7 (Katoh et al. 2002; Katoh and Standley 2013). COI was assembled in aTRAM v2.0 (Allen et al. 2018) using ABySS v2.0.2 (Simpson et al. 2009), with the COI protein sequence from Hoplopleura kitti (GenBank accession KJ648943) as the target for tblastn searches of quality-trimmed reads. Sequence assemblies were trimmed to just COI and aligned with MAFFT v7. Four of the samples would not assemble using ABySS in aTRAM, so we assembled those manually by running aTRAM without an assembler, taking the tblastn hits, mapping those to the target read in Geneious Prime 2020.2, and extracting the consensus sequence. COI sequences were aligned in MAFFT v7 and trimmed to the open reading frame corresponding to the target sequence.
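The site-filtering rule applied to the nuclear loci (removing alignment columns with over 90% missing data or gaps) can be sketched in Python as follows. This is a simplified illustration, not trimAl itself. The function name, the dictionary input format, and the set of characters counted as missing (`-`, `?`, `N`, `n`) are assumptions.

```python
def drop_sparse_columns(alignment, max_missing=0.9, missing_chars="-?Nn"):
    """Remove columns whose fraction of missing characters exceeds `max_missing`.

    `alignment` maps sequence names to aligned sequences of equal length.
    Hypothetical helper: which characters count as missing is an assumption.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")

    # Keep a column unless strictly more than `max_missing` of its
    # characters are gaps or missing data.
    keep = [
        i
        for i in range(length)
        if sum(seq[i] in missing_chars for seq in seqs) / len(seqs) <= max_missing
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}
```

The comparison is strict, so a column is dropped only when more than 90% of its characters are missing, matching the "over 90%" wording. A column at exactly 90% is kept.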