Population genomic analysis of an emerging pathogen Lonsdalea quercina affecting various species of oaks in western North America
Data files
Jul 24, 2023 version files 162.41 MB
Abstract
Previously unrecognized diseases continue to threaten the health of forest ecosystems globally. Understanding the processes that lead to disease emergence is important for effective disease management and for preventing future epidemics. Using whole genome sequencing, we studied the phylogenetic relationships and within-population diversity of two populations of the bacterial oak pathogen Lonsdalea quercina from western North America (Colorado and California) and compared these populations to other Lonsdalea species found worldwide. Phylogenetic analysis separated the Colorado and California populations into two well-supported clades within the genus Lonsdalea, with an average nucleotide identity between them (95.31%) near the species boundary for bacteria, suggesting long isolation. The two populations show distinct patterns of genetic structure and distribution. Genotypes collected from different host species and habitats were randomly distributed within the California cluster, while most Colorado isolates from introduced planted trees were distinct from isolates collected from a natural stand of the Colorado-native Q. gambelii, indicating the presence of cryptic population structure. The distribution of clones in California varied, while Colorado clones were always collected from neighboring trees. Despite its recent emergence, the Colorado population had higher nucleotide diversity, possibly due to migrants moving with nursery stock. Overall, the results suggest independent pathogen emergence in the two states, likely driven by changes in host-microbe interactions caused by changing ecosystem conditions. To our knowledge, this is the first study of L. quercina population structure. Further studies are warranted to understand evolutionary relationships among L. quercina populations from different areas, including the native habitat of red oak in the northeastern USA.
Methods
The pangenome analyses were carried out with the Roary pipeline v.3.13.0, using GFF3 input files produced by Prokka annotation. Three datasets of core genes were used for further analyses; these are described in detail in the main text of the manuscript.
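
As an illustration of how a core gene set can be derived from Roary output, the following is a minimal Python sketch, not the procedure used to build the three datasets in this deposit. It assumes a table in the layout of Roary's gene_presence_absence.csv, with 14 leading metadata columns followed by one column per isolate, and it assumes a core threshold of 99% of isolates, which is Roary's default and not a value stated here. The function name core_genes and the min_fraction parameter are hypothetical names chosen for this example.

```python
import csv
import io

# Number of leading metadata columns before the per-isolate columns.
# Assumption: matches the Roary 3.x gene_presence_absence.csv layout;
# check the header of your own file before relying on it.
N_METADATA_COLUMNS = 14


def core_genes(csv_text, min_fraction=0.99):
    """Return names of gene clusters present in at least min_fraction of isolates.

    csv_text is the full text of a presence/absence table in which an empty
    cell in an isolate column means the gene cluster is absent from that isolate.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    n_isolates = len(header) - N_METADATA_COLUMNS
    if n_isolates <= 0:
        raise ValueError("no isolate columns found after the metadata columns")

    core = []
    for row in reader:
        # Count isolates with a non-empty entry for this gene cluster.
        cells = row[N_METADATA_COLUMNS:]
        present = sum(1 for cell in cells if cell.strip())
        if present / n_isolates >= min_fraction:
            core.append(row[0])  # first column holds the cluster name
    return core
```

In practice the file text would be read from disk and passed in, for example core_genes(open(path, newline="").read()), where path is a placeholder for the location of the table. Lowering min_fraction gives a relaxed ("soft") core, which is one way different core gene datasets can arise from the same Roary run.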