Code from: A genome assembly of the North American golden eagle, Aquila chrysaetos canadensis
Data files
Mar 06, 2026 version; files total 740.16 MB
- bAquChr2.0.hap1.liftover_polished.gtf.txt -- 740.16 MB
- CCGP_busco.sbatch -- 1.16 KB
- chroms.txt -- 810 B
- README.md -- 3.89 KB
- run_liftoff.sbatch -- 884 B
- run_minimap2.sh -- 678 B
Abstract
The golden eagle (Aquila chrysaetos) is an apex predator across its Holarctic range. Although chromosome-level reference genome assemblies are available for two of the six golden eagle subspecies (European and Japanese), current assemblies for the North American subspecies (A. c. canadensis) were generated using short-read sequencing technology, limiting completeness, contiguity, and accuracy. Here we present a chromosome-length de novo genome assembly for A. c. canadensis as part of the California Conservation Genomics Project (CCGP). We used Pacific Biosciences HiFi reads and Omni-C chromatin-proximity sequencing to produce a high-quality assembly consistent with the standard CCGP reference genome protocol. Our assembly spans 1.28 Gbp and comprises 316 scaffolds with a scaffold N50 of 47.3 Mbp, a contig N50 of 47.0 Mbp, and a benchmarking universal single-copy ortholog (BUSCO) completeness score of 97.4%. This reference genome assembly offers a valuable resource for delineating genomic variation and assessing conservation needs in golden eagle populations across California and its North American range more broadly.
Dataset DOI: 10.5061/dryad.2280gb65r
Description of the data and file structure
This dataset contains the North American golden eagle genome assembly liftover annotation, as well as accompanying code for generating the annotation and analytical code for comparing assembly quality to other Accipitridae assemblies.
Data Files
1. chroms.txt -- list of syntenic scaffolds between European and CCGP North American golden eagle genome assemblies
2. bAquChr2.0.hap1.liftover_polished.gtf.txt -- liftover annotation of the assembly using the European golden eagle reference genome (bAquChr1.4) annotation; generated by the script run_liftoff.sbatch
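For a quick sanity check of the liftover annotation, feature types can be tallied straight from the file. The sketch below is not part of the deposited code. It assumes the file is a standard nine-column, tab-delimited GTF with the feature type in column 3, and the function name count_gtf_features is made up for illustration.

```shell
# Tally GTF feature types (column 3), skipping comment lines.
# Assumption: standard nine-column, tab-delimited GTF.
count_gtf_features() {
    awk -F'\t' '
        !/^#/ && NF >= 9 { n[$3]++ }                      # count each feature type
        END { for (t in n) printf "%s\t%d\n", t, n[t] }   # one "type<TAB>count" row per type
    ' "$1" | LC_ALL=C sort
}
```

Calling `count_gtf_features bAquChr2.0.hap1.liftover_polished.gtf.txt` would print one row per feature type, sorted by name.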
Analysis Script Files
1. CCGP_busco.sbatch -- code for performing the BUSCO analysis; slurm submission script
2. run_minimap2.sh -- code for aligning the CCGP North American golden eagle assembly (bAquChr2.0.hap1) to the European golden eagle reference genome (bAquChr1.4) for synteny assessment before performing liftover annotation (run_liftoff.sbatch); not a submission script, so view the script (e.g., with cat) and run each command manually on the command line
3. run_liftoff.sbatch -- code for generating the liftover annotation; slurm submission script
Code/software Requirements
CCGP_busco.sbatch
Dependencies: BUSCO v5.0.0+
Inputs:
- aves_odb10 -- ortholog database; download to working directory using
    busco download -l aves_odb10
- directory containing all genome assemblies for generating BUSCO scores:
  - bAquChr2.0.hap1 -- North American golden eagle (CCGP assembly)
  - Aquila_chrysaetos-1.0.2 -- North American golden eagle (2014 Van Den Bussche assembly)
  - AquilaChrysaetos1 -- North American golden eagle (2014 Doyle assembly)
  - bAquChr1.4 -- European golden eagle
  - Ajapo_1 -- Japanese golden eagle
  - bHalAlb1.1 -- White-tailed eagle
  - bGypBar2.pri -- Bearded vulture
  - bHarHar1_primary_haplotype -- Harpy eagle
  - ASM2749777v1 -- Black-mantled goshawk
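As a rough illustration of how the inputs above fit together, the sketch below prints one BUSCO command per assembly FASTA found in a directory, so the commands can be reviewed before anything is run. It is not the contents of CCGP_busco.sbatch. The function names, the file extensions (.fa, .fasta, .fna), the `<name>_busco` output naming, and the default of 8 CPUs are all assumptions; only the standard BUSCO options `-i`, `-l`, `-m genome`, `-o`, and `-c` are used.

```shell
# Print a BUSCO genome-mode command for one assembly FASTA.
# Hypothetical helper: the output name is the file name without its last extension.
busco_cmd() {
    local fasta="$1" lineage="$2" cpus="${3:-8}" name
    name="$(basename "$fasta")"
    name="${name%.*}"
    printf "busco -i '%s' -l %s -m genome -o %s_busco -c %s\n" \
        "$fasta" "$lineage" "$name" "$cpus"
}

# Print one command per FASTA file in a directory (extensions are an assumption).
busco_cmds_for_dir() {
    local dir="$1" lineage="${2:-aves_odb10}" cpus="${3:-8}" fasta
    for fasta in "$dir"/*.fa "$dir"/*.fasta "$dir"/*.fna; do
        [ -e "$fasta" ] || continue    # skip patterns that matched nothing
        busco_cmd "$fasta" "$lineage" "$cpus"
    done
}
```

For example, `busco_cmds_for_dir assemblies aves_odb10 16` would list the commands for every FASTA in a hypothetical `assemblies` directory; piping that output to a shell would execute them one after another.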
run_minimap2.sh
Dependencies: minimap2 v2.29+, samtools v1.17+
Inputs: reference genome fasta ${ref} and output directory ${out}
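One way a whole-genome alignment can be summarized into scaffold correspondences for synteny assessment is sketched below. This is an illustration only and not necessarily how chroms.txt was produced. It assumes the alignment was written in minimap2's default PAF format (query name in column 1, target name in column 6, matching bases in column 10), with the European reference as target and the CCGP assembly as query. The function name, the file names in the comment, and the asm5 preset are assumptions.

```shell
# Hypothetical alignment step producing PAF (file names and preset are assumptions):
#   minimap2 -x asm5 -t 8 "${ref}" ccgp_assembly.fasta > "${out}/ccgp_vs_european.paf"

# For each target (reference) scaffold, report the query scaffold with the most
# matching bases summed over all alignments, as "reference,query" lines.
best_hits_from_paf() {
    awk -F'\t' '
        NF >= 12 { m[$6 SUBSEP $1] += $10 }      # sum matches per (target, query) pair
        END {
            for (k in m) {
                split(k, p, SUBSEP)
                if (m[k] > best[p[1]]) { best[p[1]] = m[k]; who[p[1]] = p[2] }
            }
            for (r in who) print r "," who[r]
        }
    ' "$1" | LC_ALL=C sort
}
```

The resulting pairs are only a starting point: a best hit by summed matches does not by itself establish synteny, so each pair would still need to be checked, for instance with a dot plot.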
run_liftoff.sbatch
Dependencies: liftoff v1.6.3+
Inputs:
- chroms.txt -- list of syntenic scaffolds between European and CCGP North American golden eagle assemblies
- European golden eagle reference assembly and genome annotation
- CCGP North American golden eagle genome assembly
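
To show how these three inputs combine, the sketch below first checks the scaffold list and then prints a liftoff command for review. It is not the contents of run_liftoff.sbatch. It assumes chroms.txt uses the comma-separated reference,target layout that liftoff's `-chroms` option expects, and that the `-polish` option was used (a guess based on the "_polished" output file name). The function names and any file names passed in are hypothetical.

```shell
# Verify that every line of the scaffold list has exactly two non-empty,
# comma-separated fields (assumed layout: reference_scaffold,target_scaffold).
check_chroms() {
    awk -F',' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }                 # non-zero exit if any line was malformed
    ' "$1"
}

# Print a liftoff command. liftoff takes the target assembly first and the
# annotated reference assembly second as positional arguments.
liftoff_cmd() {
    local ref_fa="$1" ref_gtf="$2" target_fa="$3" chroms="$4" out="$5"
    printf "liftoff -g '%s' -chroms '%s' -polish -o '%s' '%s' '%s'\n" \
        "$ref_gtf" "$chroms" "$out" "$target_fa" "$ref_fa"
}
```

A typical use would be `check_chroms chroms.txt && liftoff_cmd european.fasta european.gtf ccgp.fasta chroms.txt liftover.gtf`, where the FASTA and GTF names stand in for the actual files.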
