Data from: Haplotype sequence collection of ABO blood group alleles by long-read sequencing reveals putative A1-diagnostic variants

Citation

Mattle-Greminger, Maja; Gueuning, Morgan; Thun, Gian Andri (2022), Data from: Haplotype sequence collection of ABO blood group alleles by long-read sequencing reveals putative A1-diagnostic variants, Dryad, Dataset, https://doi.org/10.5061/dryad.q573n5tkj

Abstract

In the era of blood group genomics, reference collections of complete and fully-resolved blood group gene alleles have gained high importance. For most blood groups, however, such collections are currently lacking, as resolving full-length gene sequences as haplotypes (i.e. separated maternal/paternal origin) remains exceedingly difficult with both Sanger and short-read next-generation sequencing. Using the latest third-generation long-read sequencing, we generated a collection of fully-resolved sequences for all six main ABO allele groups: ABO*A1/A2/B/O.01.01/O.01.02/O.02. We selected 77 samples from an ABO genotype dataset (n=25,200) of serologically-typed Swiss blood donors. The entire ABO gene was amplified in two overlapping long-range PCRs (covering ~23.6 kb) and sequenced by long-read Oxford Nanopore sequencing. For quality validation, two samples per ABO group were re-sequenced using Illumina and PacBio technology. All 154 full-length ABO sequences were resolved as haplotypes. We observed novel, distinct sequence patterns for each ABO group. Most genetic diversity was found between, not within, ABO groups. Phylogenetic tree and haplotype network analyses highlighted distinct clades of each ABO group. Strikingly, our data uncovered four genetic variants putatively specific for ABO*A1, for which direct diagnostic targets are currently lacking. We validated A1-diagnostic potential using whole-genome data (n=4,872) of a multi-ethnic cohort. Overall, our sequencing strategy proved powerful for producing high-quality ABO haplotypes and holds promise for generating similar collections for other blood groups. The publicly available collection of 154 haplotypes will serve as a valuable resource for molecular analyses of ABO, as well as studies about the function and evolutionary history of ABO.

Usage Notes

This README_Gueuning-et-al.txt file was generated on 2022-01-23.


GENERAL INFORMATION


1. Title of Dataset:

Data from: Haplotype sequence collection of ABO blood group alleles by long-read sequencing reveals putative A1-diagnostic variants


2. Description of Dataset:

The dataset contains accompanying data from the following publication:
Gueuning M, Thun GA, et al. (in review) Haplotype sequence collection of ABO blood group alleles by long-read sequencing reveals putative A1-diagnostic variants.


3. Content of Dataset:

ABO haplotype sequence alignments in FASTA format. All details are provided in the publication.

Gueuning-et-al_ABO_raw_sequence_alignment.fasta
Gueuning-et-al_ABO_analysis_sequence_alignment.fasta
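
Because both files are FASTA alignments, a quick sanity check after download is to count the records and confirm that every sequence has the same number of columns. The following is a minimal sketch using only the Python standard library. The header names in the toy input are hypothetical, since the README does not describe the header format of the actual files, and the helper names read_fasta and alignment_length are illustrative only.

```python
from io import StringIO
from typing import Dict, List, Optional, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Parse FASTA records into {header: sequence}; wrapped sequence lines are joined."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def alignment_length(records: Dict[str, str]) -> int:
    """Return the shared column count, or raise if the sequences differ in length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"not an alignment: lengths {sorted(lengths)}")
    return lengths.pop()


# Toy input with hypothetical headers, standing in for a downloaded file.
toy = StringIO(">hap_1\nACGT-A\nCC\n>hap_2\nACGTTACC\n")
records = read_fasta(toy)
print(len(records), alignment_length(records))
```

To apply the same check to the dataset, open one of the files listed above (for example with open("Gueuning-et-al_ABO_analysis_sequence_alignment.fasta")) and pass the file handle to read_fasta. The abstract reports 154 haplotypes, so a record count near that figure would be expected, although the README does not state the exact number of records per file.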


SHARING/ACCESS INFORMATION


1. Data usage

Any data usage requires citing the following publication:

Gueuning M, Thun GA, et al. (in review) Haplotype sequence collection of ABO blood group alleles by long-read sequencing reveals putative A1-diagnostic variants.

Journal-specific restrictions may apply.