Viability selection yields adult populations that are more genetically variable than juvenile populations, producing a positive correlation between heterozygosity and survival. Viability selection could be the result of decreased heterozygosity across many loci in inbred individuals and a subsequent decrease in survivorship resulting from the expression of deleterious alleles. Alternatively, locus-specific differences in genetic variability between adults and juveniles may be driven by forms of balancing selection, including heterozygote advantage, frequency-dependent selection or selection across temporal and spatial scales. We use a pooled sequencing approach to compare genome-wide and locus-specific genetic variability between juveniles and adults in 74 golden eagles (Aquila chrysaetos), 62 imperial eagles (Aquila heliaca) and 69 prairie falcons (Falco mexicanus). Although genome-wide genetic variability is comparable between juvenile and adult golden eagles and prairie falcons, imperial eagle adults are significantly more heterozygous than juveniles. This evidence of viability selection may stem from a relatively smaller imperial eagle effective population size and potentially greater genetic load. We additionally identify ~2000 SNPs across the three species with extreme differences in heterozygosity between juveniles and adults. Many of these markers are associated with genes implicated in immune function or olfaction. These loci represent potential targets for studies of how heterozygote advantage, frequency-dependent selection and selection over spatial and temporal scales influence survivorship in avian species. Overall, our genome-wide data extend previous studies that used allozyme or microsatellite markers and indicate that viability selection may be a more common evolutionary phenomenon than often appreciated.
Imperial eagle genome assembly
To generate an imperial eagle genome assembly, we conducted one lane of paired-end (PE) sequencing and one lane of mate-pair (MP) sequencing on an Illumina HiSeq2500, producing 100 bp reads. We used Trimmomatic 0.35 to remove adaptors and discard low-quality bases as in Doyle et al. (2018). We then used ABySS 1.5.2 to conduct several preliminary assemblies of PE and MP reads, using k-mer lengths ranging from 41 to 61. MP reads were used only during the scaffolding step, and at least 10 read pairs were required to join two contigs. We determined that a k-mer length of 61 produced the best assembly by considering both N50 values and the length of the longest scaffold.
ie_ABySS_min10000.fasta
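The two criteria used to compare the preliminary assemblies (N50 and longest scaffold) can be computed from a list of scaffold lengths. The following is a minimal sketch rather than the script used for this dataset; the function names and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths supplied")


def assembly_summary(lengths):
    """Summarise one assembly by the two criteria named in the description."""
    return {"n50": n50(lengths), "longest": max(lengths)}


# Hypothetical scaffold lengths for two preliminary assemblies, keyed by k-mer size.
assemblies = {41: [10, 8, 6, 4, 2], 61: [20, 5, 3, 1, 1]}
summaries = {k: assembly_summary(v) for k, v in assemblies.items()}
```

How the two statistics are weighed against each other when they disagree is not stated in the description, so the sketch only reports them side by side.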
Imperial eagle annotated genes - protein sequences
Imperial eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. Briefly, we used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This file contains the protein sequences associated with the annotations.
ie_Galaxy-min10000.all.maker.proteins.fasta
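Restricting an assembly to scaffolds longer than 10 kb before annotation can be sketched with a small FASTA reader. This is an illustrative sketch only: the function names are hypothetical, the 10,000 bp cutoff is taken from the description, and whether a scaffold of exactly 10,000 bp was retained is an assumption (here it is excluded).

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def long_scaffolds(handle, min_len=10_000):
    """Keep scaffolds strictly longer than min_len bases (assumed cutoff rule)."""
    return [(name, seq) for name, seq in read_fasta(handle) if len(seq) > min_len]


# Usage with an in-memory example; a real run would pass open("assembly.fasta").
example = io.StringIO(">scaf1\n" + "A" * 10_001 + "\n>scaf2 short\n" + "C" * 10_000 + "\n")
kept = long_scaffolds(example)
```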
Imperial eagle annotated genes - transcripts
Imperial eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. Briefly, we used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This file contains the transcript sequences associated with the annotations.
ie_Galaxy-min10000.all.maker.transcripts.fasta
Imperial eagle annotated genes - .gff file
Imperial eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. Briefly, we used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This .gff file describes the gene annotations.
ie_genome.all.gff
Golden eagle annotated genes - protein sequences
The golden eagle genome assembly was downloaded from GenBank (Accession: GCA_000766835.1). Golden eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. We used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This file contains the protein sequences associated with the annotations.
okge_min10000.all.maker.proteins.fasta
Golden eagle annotated genes - transcripts
The golden eagle genome assembly was downloaded from GenBank (Accession: GCA_000766835.1). Golden eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. We used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This file contains the transcript sequences associated with the annotations.
okge_min10000.all.maker.transcripts.fasta
Golden eagle annotated genes - .gff file
The golden eagle genome assembly was downloaded from GenBank (Accession: GCA_000766835.1). Golden eagle scaffolds greater than 10 kb were annotated using the MAKER 2.31.9 pipeline. We used RepeatMasker 4.0.7 to identify and mask stretches of repetitive DNA, BLAST 2.3.0 to align avian ESTs and proteins to the genome, SNAP 0.15.4 to generate ab initio gene predictions, and InterProScan 5.25-64.0 to identify putative protein domains. This .gff file describes the gene annotations.
genome.all.gff
Prairie falcon SNPs
This file describes SNPs identified in adult and juvenile prairie falcons (including scaffold, SNP location, major/minor alleles, allele frequencies and more). We used Trimmomatic 0.35 to remove adaptors and discard low-quality bases from pooled sequencing reads. Reads were scanned using a 4 bp window and cut whenever the average phred quality score dropped below 20. Reads shorter than 40 bases were subsequently discarded. High-quality reads were mapped to the relevant genome using BWA 0.7.13. We subsequently used Picard to merge male and female reads within each life stage group (i.e., nestling or adult) to increase coverage, and to remove duplicate reads. Samtools was used to remove ambiguously mapped reads (e.g., reads with mapping quality below 20) and generate an mpileup file. PoPoolation2 was used to filter indels, identify SNPs and calculate allele frequencies. When calculating allele frequencies, we discarded SNPs with a minor allele frequency below 2, coverage below 15 in either cohort, or coverage greater than 99. See README file for additional information.
pf_2alleles_diff_het_freq_min2_worc_splitalleles4.txt
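The read-trimming rule described above (a 4 bp window, an average phred cutoff of 20, and a 40-base minimum length) can be illustrated with a short sketch. This is a simplified approximation written for illustration, not Trimmomatic's implementation, and the function names are hypothetical.

```python
def bases_to_keep(quals, window=4, min_avg=20):
    """Scan from the 5' end and return how many leading bases survive:
    the read is cut at the start of the first window whose mean quality
    falls below min_avg."""
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)


def trim_read(seq, quals, window=4, min_avg=20, min_len=40):
    """Trim one read; return (seq, quals) or None if it ends up too short."""
    keep = bases_to_keep(quals, window, min_avg)
    if keep < min_len:
        return None  # reads shorter than min_len are discarded
    return seq[:keep], quals[:keep]


# A 60-base read whose last 10 bases have very low quality (invented values).
read_quals = [30] * 50 + [2] * 10
trimmed = trim_read("A" * 60, read_quals)
```

In this example the first failing window starts at position 48 (two bases of quality 30 followed by two of quality 2, mean 16), so 48 bases are kept.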
Imperial eagle SNPs
This file describes SNPs identified in adult and juvenile imperial eagles (including scaffold, SNP location, major/minor alleles, allele frequencies and more). We used Trimmomatic 0.35 to remove adaptors and discard low-quality bases from pooled sequencing reads. Reads were scanned using a 4 bp window and cut whenever the average phred quality score dropped below 20. Reads shorter than 40 bases were subsequently discarded. High-quality reads were mapped to the relevant genome using BWA 0.7.13. We subsequently used Picard to merge male and female reads within each life stage group (i.e., nestling or adult) to increase coverage, and to remove duplicate reads. Samtools was used to remove ambiguously mapped reads (e.g., reads with mapping quality below 20) and generate an mpileup file. PoPoolation2 was used to filter indels, identify SNPs and calculate allele frequencies. When calculating allele frequencies, we discarded SNPs with a minor allele frequency below 2, coverage below 15 in either cohort, or coverage greater than 99. See README file for additional information.
ie_2alleles_diff_het_freq_min2_worc_splitalleles4.txt
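The SNP filters listed above can be expressed as a single predicate over per-cohort allele counts. The sketch below rests on two assumptions that the description does not confirm: the minor allele threshold of 2 is treated as a read count summed across both cohorts, and the coverage bounds (15 to 99) are applied to each cohort separately. The function name and the example counts are hypothetical.

```python
def keep_snp(juvenile, adult, min_minor=2, min_cov=15, max_cov=99):
    """Decide whether a biallelic SNP passes the filters.

    juvenile and adult are (major_count, minor_count) read-count tuples.
    Assumptions: min_minor is a read count summed over both cohorts, and
    the coverage bounds apply to each cohort on its own.
    """
    if juvenile[1] + adult[1] < min_minor:
        return False
    for major, minor in (juvenile, adult):
        coverage = major + minor
        if coverage < min_cov or coverage > max_cov:
            return False
    return True


# Invented counts: (major, minor) reads in juveniles and in adults.
candidates = {
    "passes": ((20, 5), (30, 3)),
    "rare_minor": ((20, 0), (30, 1)),
    "low_coverage": ((10, 4), (30, 3)),
    "high_coverage": ((90, 10), (30, 3)),
}
retained = [name for name, (juv, ad) in candidates.items() if keep_snp(juv, ad)]
```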
Golden eagle SNPs
This file describes SNPs identified in adult and juvenile golden eagles (including scaffold, SNP location, major/minor alleles, allele frequencies and more). We used Trimmomatic 0.35 to remove adaptors and discard low-quality bases from pooled sequencing reads. Reads were scanned using a 4 bp window and cut whenever the average phred quality score dropped below 20. Reads shorter than 40 bases were subsequently discarded. High-quality reads were mapped to the relevant genome using BWA 0.7.13. We subsequently used Picard to merge male and female reads within each life stage group (i.e., nestling or adult) to increase coverage, and to remove duplicate reads. Samtools was used to remove ambiguously mapped reads (e.g., reads with mapping quality below 20) and generate an mpileup file. PoPoolation2 was used to filter indels, identify SNPs and calculate allele frequencies. When calculating allele frequencies, we discarded SNPs with a minor allele frequency below 2, coverage below 15 in either cohort, or coverage greater than 99. See README file for additional information.
ge_2alleles_diff_het_freq_min2_worc_splitalleles4.txt
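The allele frequencies in these SNP files can be used to compare heterozygosity between cohorts at a single locus. The sketch below assumes expected heterozygosity for a biallelic SNP, H = 2p(1 - p), where p is the minor allele frequency. The description does not state which estimator was used, so this is one common choice rather than the dataset's method; the function names and the example frequencies are hypothetical.

```python
def expected_heterozygosity(minor_freq):
    """Expected heterozygosity of a biallelic SNP, H = 2p(1 - p)."""
    if not 0.0 <= minor_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2 * minor_freq * (1 - minor_freq)


def heterozygosity_difference(juvenile_freq, adult_freq):
    """Adult minus juvenile expected heterozygosity at one SNP.
    Positive values mean adults are more heterozygous at that locus."""
    return expected_heterozygosity(adult_freq) - expected_heterozygosity(juvenile_freq)


# Invented frequencies: minor allele at 0.10 in juveniles and 0.50 in adults.
diff = heterozygosity_difference(0.10, 0.50)
```

With these example frequencies the difference is 0.5 - 0.18 = 0.32, the kind of adult excess that viability selection would be expected to produce.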