Diversity and host specificity of Borrelia burgdorferi's outer surface protein C (ospC) alleles in synanthropic mammals, with a notable ospC allele U absence from mixed infections
Data files
Dec 11, 2023 version; files: 301.91 MB
Abstract
Interactions among pathogen genotypes that vary in host specificity may affect overall transmission dynamics in multi-host systems. Borrelia burgdorferi, a bacterium that causes Lyme disease, is typically transmitted among wildlife by Ixodes ticks. Despite the existence of many alleles of the outer surface protein C (ospC) gene of B. burgdorferi sensu stricto, most human infections are caused by a small number of ospC alleles [“human infectious alleles” (HIAs)], suggesting variation in host specificity associated with ospC. To characterize the wildlife host associations of B. burgdorferi’s ospC alleles, we used metagenomics to sequence ospC alleles from 68 infected individuals belonging to eight mammalian species trapped at three sites in suburban New Brunswick, New Jersey (USA). We found that multiple-allele (“mixed”) infections were common. HIAs were most common in mice (Peromyscus spp.), and only one HIA was detected at a site where mice were rarely captured. OspC allele U was found exclusively in chipmunks (Tamias striatus), and although many different alleles were observed in chipmunks, including HIAs, allele U never co-occurred with other alleles in mixed infections. Our results suggest that allele U may be excluding other alleles, thereby reducing the capacity of chipmunks to act as reservoirs for HIAs.
README: Diversity and host specificity of Borrelia burgdorferi's outer surface protein C (ospC) alleles in synanthropic mammals, with a notable ospC allele U absence from mixed infections
https://doi.org/10.5061/dryad.dr7sqvb54
This repository includes raw sequence reads, SAMtools coverage data, coverage data graphs, metadata, and R code that allow one to reproduce all of the analyses in our manuscript.
Description of the data and file structure
Raw sequence reads (fastq format) are presented in a tarred/compressed file called "221028_KLY7P_Ellis copy.tar".
"Shifflett et al. 2023_metadata.xlsx" is the main data file. There are two sheets that are called separately in the R code. They describe the captured animals and provide additional information including whether they were infected and what species they belong to and where and when they were captured. The first sheet, “Data”, lists the date of capture, the site an individual was captured at, the common name for the species (under “animal”), the scientific name for the species, the individual’s weight (in grams; blank cells indicate that weight was not recorded), the individual’s sex (M=male, F=female, juv= juvenile and unable to sex; blanks in the Sex column are due to an inability to determine the sex of the individuals), The ear-tag ID given (tags with "R" at the end represent recaptured individuals), recapture status (no means the animal was caught for the first time and yes means the animal had been previously captured), followed by a unique ID. The second sheet, "Infected" lists the individual's unique ID, B. burgdorferi infection status as determined by PCR protocol (PCR_Infected), and B. burgdorferi infection status as determined by the qPCR protocol (qPCR_Infected).
The directory "samtools_coverage_data" contains the SAMtools coverage data as txt files, which are read by the R code. This directory is zipped/compressed.
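For readers who want to inspect the coverage files outside of R, the sketch below parses the default tab-delimited output of `samtools coverage` using only the Python standard library. It assumes the txt files use that default format (a header line starting with `#rname`, followed by one row per reference sequence) and that each reference sequence corresponds to an ospC allele. The thresholds `min_coverage` and `min_depth` and the reference names in the example are hypothetical and are not taken from the manuscript.

```python
import csv
import io

# Column names written by `samtools coverage` in its default tabular output.
COLUMNS = ["rname", "startpos", "endpos", "numreads", "covbases",
           "coverage", "meandepth", "meanbaseq", "meanmapq"]


def read_coverage(handle):
    """Parse samtools coverage tabular output into a list of dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the '#rname ...' header line
        row = dict(zip(COLUMNS, fields))
        for key in COLUMNS[1:5]:   # startpos, endpos, numreads, covbases
            row[key] = int(row[key])
        for key in COLUMNS[5:]:    # coverage (%), meandepth, meanbaseq, meanmapq
            row[key] = float(row[key])
        rows.append(row)
    return rows


def alleles_detected(rows, min_coverage=90.0, min_depth=10.0):
    """Return reference names passing both thresholds (illustrative cutoffs only)."""
    return [r["rname"] for r in rows
            if r["coverage"] >= min_coverage and r["meandepth"] >= min_depth]


# Invented two-reference example in the samtools coverage layout.
EXAMPLE = (
    "#rname\tstartpos\tendpos\tnumreads\tcovbases\tcoverage\tmeandepth\tmeanbaseq\tmeanmapq\n"
    "ospC_A\t1\t600\t1500\t600\t100\t250.5\t36.2\t60\n"
    "ospC_U\t1\t600\t3\t120\t20\t0.4\t35.9\t12.3\n"
)

example_rows = read_coverage(io.StringIO(EXAMPLE))
example_hits = alleles_detected(example_rows)
```

To run this on a real file, pass an open file handle to `read_coverage` instead of the `io.StringIO` wrapper. The criteria actually used to call alleles are defined in the manuscript and the accompanying R code.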
The directory "ospC_Coverage_Rraphs" includes plots of the sequencing coverage for each sample. This directory is zipped/compressed.
The file "RecaptureData.xlsx" includes additional information about each of the recaptured individuals, which can also be found in the main data file. The first sheet, “ID”, reports the ear-tag ID given to each captured individual, followed by the individual’s unique ID. An “R” is added to the original ear-tag ID for each additional capture of the same individual; while the ear-tag ID changes with each capture, the individual’s unique ID stays the same. The second sheet, “meta_data”, includes two tables. The first table lists all the recaptured individuals whose B. burgdorferi infection status and ospC allele composition did NOT change between captures. The second table lists recaptured individuals who developed a B. burgdorferi infection between captures and individuals whose ospC allele composition changed between captures. Both tables list the date of capture, the site, the scientific name of the species caught, the ear-tag ID, the unique ID, B. burgdorferi infection status determined by PCR followed by qPCR (1 = infected, 0 = not infected), and the ospC allele ID. In the second table, empty cells visually separate different capture dates.
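The ear-tag convention can be decoded programmatically. The sketch below is a hypothetical Python helper, not part of the deposited R code. It assumes that exactly one "R" is appended per additional capture and that no original ear-tag ID itself ends in "R"; the tag values shown are invented.

```python
def split_ear_tag(tag):
    """Return (original_tag, number_of_recaptures) for an ear-tag ID.

    Assumes one trailing "R" per additional capture and that original
    tags never end in "R"; verify both against the "ID" sheet before use.
    """
    base = tag.rstrip("R")
    return base, len(tag) - len(base)


# Invented tags: a first capture and a second recapture of the same animal.
first_capture = split_ear_tag("1042")
second_recapture = split_ear_tag("1042RR")
```

Because the unique ID is already provided in the spreadsheets, this helper is mainly useful as a consistency check between ear-tag IDs and unique IDs.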
Code/Software
The R code that conducts the statistical analyses in our manuscript can be found in the file "Rutgers's paper supplementary R code.R". This R code uses the SAMtools coverage txt files and the metadata file.