Dryad

A chromosome-scale reference genome and genome-wide genetic variations elucidate adaptation in yak

Abstract

Yak is an important livestock species for the people who live on the harsh, oxygen-deprived Qinghai-Tibetan Plateau and in the Hindu Kush Himalayan Mountains. Although a yak genome was sequenced in 2012, that assembly is quite fragmented because of the limitations of Illumina sequencing technology. An accurate and complete reference genome is critical for studying the genetic variation of a species. Long-read sequencing yields more complete assemblies than short-read sequencing and has been used successfully for high-quality genome assembly in several species. Here, we present a high-quality, chromosome-scale assembly of the yak genome (PB_v1.0), constructed using long-read sequencing technology assisted by chromatin interaction technology. Compared with the previous yak genome assembly (BosGru_v2.0), the PB_v1.0 assembly has substantially improved chromosome sequence continuity, reduced ambiguity in repetitive structures, and improved gene model completeness. To characterize yak genetic variation in depth, we generated de novo genome assemblies from Illumina short reads of seven recognized domestic yak breeds from Tibet and Sichuan and one wild yak from Hoh Xil. By comparing these eight assemblies with the PB_v1.0 genome, we obtained a comprehensive map of yak genetic diversity at the whole-genome level and identified a few protein-coding genes that are absent from the PB_v1.0 assembly. Although wild yak have experienced a population bottleneck, their genetic diversity is still higher than that of domestic yak. Through whole-genome alignment, we identified breed-specific sequences and genes, which will aid yak breed identification.