Core genome phylogenetic tree of two Campylobacter novaezeelandiae and four unclassified thermophilic Campylobacter isolates from Canadian agricultural surface water
Ivanova, Mirena et al. (2021), Core genome phylogenetic tree of two Campylobacter novaezeelandiae and four unclassified thermophilic Campylobacter isolates from Canadian agricultural surface water, Dryad, Dataset, https://doi.org/10.5061/dryad.qrfj6q5f2
This dataset includes 1) a concatenated alignment of 135 core gene sequences and 2) a phylogenomic tree of the 38 currently described Campylobacter spp., including Campylobacter novaezeelandiae and four additional novel Campylobacter species isolated from agricultural water in Canada. The original alignment of 120 kb was produced by Roary v.3.13.0. Gblocks v.0.91b was used to remove ambiguously aligned regions and phylogenetically uninformative positions. The final alignment (110 kb) was used as input to RAxML-NG v.0.9.0 to infer a phylogenetic tree under the GTR+G substitution model with 100 bootstrap replicates.
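A quick sanity check on either alignment is to confirm that every sequence has the same number of columns and to count the taxa. The minimal Python sketch below does this with the standard library only. The helper names (read_fasta, alignment_length) and the two-taxon demo input are hypothetical and not part of the dataset. It assumes the 120 kb and 110 kb figures refer to alignment length in columns, which the description does not state explicitly.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence id to sequence."""
    chunks = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token as the id.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def alignment_length(records):
    """Return the shared column count, or raise if sequences differ in length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


# Hypothetical two-taxon demo; in practice, open the alignment file instead,
# e.g. open("core_gene_alignment_Gblocks.fasta").
demo = io.StringIO(">taxonA\nACGT-A\nCGT\n>taxonB\nACGTTA\nCGA\n")
records = read_fasta(demo)
n_taxa = len(records)
n_columns = alignment_length(records)
print(n_taxa, n_columns)
```

Run against the real files, the column count of the Gblocks-trimmed alignment should be smaller than that of the original Roary alignment, and the taxon count should match in both.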
Isolates CW4409, CW4516, CW4519, and CW4600 were sequenced as part of the GenomeTrakr Project at the Texas Department of State Health Services (Austin, Texas), and isolates CW4087 and CW4073 were sequenced at the Center for Biotechnology & Genomics, Texas Tech University (Lubbock, Texas). The remaining Campylobacter spp. genomes (fasta files) were downloaded from GenBank.
All genomes were annotated using Prokka v.1.13.3, and the resulting .gff files were used as input to Roary v.3.13.0 to create a core gene alignment with the following parameters: -e -n -i 55, where "-e" creates a multiFASTA alignment of core genes using PRANK; "-n" performs a fast core gene alignment with MAFFT (used together with -e); and "-i 55" sets the minimum percentage identity for blastp to 55. The resulting "core_gene_alignment.fasta" file (120 kb) was edited in Gblocks v.0.91b with the following parameters: the minimum length of a block was set to its minimum value, and positions where at least 50% of the sequences had a gap were treated as gap positions. The resulting alignment, "core_gene_alignment_Gblocks.fasta" (110 kb), was used as input to RAxML-NG v.0.9.0 to infer a phylogenetic tree under the GTR+G substitution model with 100 bootstrap replicates ("Tree.raxml.support-core_with accessions.newick").
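The Roary and RAxML-NG steps described above could be scripted roughly as follows. This is a sketch under stated assumptions, not the authors' exact invocation. The Roary flags are taken from the description. The RAxML-NG all-in-one mode (--all), the output prefix "Tree", and the .gff file names are assumptions chosen for illustration. The Gblocks step is omitted because its exact command-line options are not given. The commands are only assembled and printed here, not executed.

```python
import shlex


def roary_command(gff_files):
    """Build the Roary call with the flags listed in the description."""
    # -e: core gene multiFASTA alignment (PRANK); -n: fast alignment with MAFFT;
    # -i 55: minimum blastp percentage identity of 55.
    return ["roary", "-e", "-n", "-i", "55", *gff_files]


def raxml_ng_command(alignment, prefix):
    """Build a RAxML-NG call for GTR+G with 100 bootstrap replicates."""
    # ASSUMPTION: --all (ML search, bootstrapping, and support mapping in one
    # run) is one way to obtain a support tree; the description does not say
    # which RAxML-NG mode was used.
    return [
        "raxml-ng", "--all",
        "--msa", alignment,
        "--model", "GTR+G",
        "--bs-trees", "100",
        "--prefix", prefix,
    ]


# Hypothetical Prokka output names; replace with the real .gff files.
gff_files = ["CW4087.gff", "CW4073.gff"]
roary_cmd = roary_command(gff_files)
# The prefix "Tree" is an assumption based on the output file name.
raxml_cmd = raxml_ng_command("core_gene_alignment_Gblocks.fasta", "Tree")

# Print shell-quoted commands for inspection; pass the lists to
# subprocess.run(cmd, check=True) to execute them.
print(shlex.join(roary_cmd))
print(shlex.join(raxml_cmd))
```

Keeping each command as an argument list avoids shell quoting problems with file names that contain spaces.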