Deep evolutionary analysis reveals the design principles of fold A glycosyltransferases
Taujale, Rahil et al. (2020), Deep evolutionary analysis reveals the design principles of fold A glycosyltransferases, Dryad, Dataset, https://doi.org/10.5061/dryad.v15dv41sh
The GT-A sequences were collected by a similarity search using manually curated, multiply aligned GT-A fold profiles. The collected sequences were then aligned to these profiles to determine the GT-A domain boundaries and insertions.
This dataset includes all putative GT-A fold sequences that belong to one of the 53 GT-A fold families. These were collected by searching the NCBI nr and the UniProt proteomes databases. The file hierarchy.tsv contains a table that lists the hierarchy of the families. For each level and family, there are corresponding _nr.fasta, _nr.tsv, _uniprot.fasta and _uniprot.tsv files, which contain the sequences from the NCBI nr and the UniProt proteomes databases in FASTA and TSV formats, respectively.
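The file naming scheme described above can be sketched in Python as a starting point for loading the data. This is a minimal illustration, not part of the dataset: it assumes the four suffixes are _nr.fasta, _nr.tsv, _uniprot.fasta and _uniprot.tsv, that each file name is the level or family name followed by the suffix, and that the FASTA files follow the usual ">" header convention. The helper names expected_files and read_fasta and the example name "GT2" are hypothetical. The column layout of hierarchy.tsv is not documented here, so the sketch does not parse it.

```python
from pathlib import Path
from typing import Dict, Iterable, Iterator, Tuple

# Suffixes named in the usage notes (assumed to follow the level/family name directly).
SUFFIXES = ("_nr.fasta", "_nr.tsv", "_uniprot.fasta", "_uniprot.tsv")


def expected_files(name: str, root: str = ".") -> Dict[str, Path]:
    """Map each suffix to the path expected for one level or family name."""
    return {suffix: Path(root) / f"{name}{suffix}" for suffix in SUFFIXES}


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may wrap across several lines
    if header is not None:
        yield header, "".join(chunks)
```

For example, expected_files("GT2") returns the four paths expected for a family named "GT2" (a hypothetical file stem), and opening the resulting _uniprot.fasta path and passing the file handle to read_fasta would iterate over its records.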
NIH, Award: R01 GM130915
NIH, Award: T32 GM107004