Dryad

Data from: Why is inverse symmetry the fundamental architecture of DNA?

Data files

Mar 26, 2021 version files 43.95 KB

Abstract

A striking global property of genomes observable across all domains of life is their universal inverse symmetry, manifest as equivalent frequencies of inverse complementary sequence motifs on the same strand of duplex DNA, as originally stated in Chargaff’s Second Parity Rule (CPR2) for mononucleotides. Simple mechanistic explanations of CPR2 have proven unsatisfactory.
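As a rough illustration of what CPR2 asserts, the sketch below counts n-tuples on a single strand and compares each motif's count with that of its inverse complement. The function names and the normalized deviation score are hypothetical choices made for this example, not the measure used in the dataset; the code also assumes an upper-case sequence containing only A, C, G and T.

```python
from collections import Counter

# Watson-Crick complement table; assumes upper-case A/C/G/T input only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse_complement(motif: str) -> str:
    """Return the inverse (reverse) complement of a motif."""
    return motif.translate(_COMPLEMENT)[::-1]


def count_ntuples(strand: str, n: int) -> Counter:
    """Count all overlapping n-tuples on a single strand."""
    return Counter(strand[i:i + n] for i in range(len(strand) - n + 1))


def cpr2_deviation(strand: str, n: int) -> float:
    """Hypothetical score: 0.0 means every motif is exactly as frequent as
    its inverse complement; 1.0 means no motif's inverse complement occurs."""
    counts = count_ntuples(strand, n)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Include inverse complements that never occur, so their absence counts.
    motifs = set(counts) | {inverse_complement(m) for m in counts}
    diff = sum(abs(counts[m] - counts[inverse_complement(m)]) for m in motifs)
    return diff / (2 * total)
```

For n = 1 this reduces to comparing the counts of A with T and of C with G on the same strand, which is the mononucleotide form of the rule.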

In contrast, we use a conservation principle to explain not only inverse symmetry and its global nature, but also how it breaks down. CoHSI (Conservation of Hartley-Shannon Information) theory, when applied to the structure of dsDNA considered as a homogeneous discrete system, predicts a power-law relationship between frequency and rank order of sequence motifs (n-tuples) in a single strand. We show how this combines with inter-strand Watson-Crick base-pairing to predict a genome-wide power-law relationship between frequency and rank order of stepped pairs of inverse complementary motifs (i.e., universal inverse symmetry). These predictions were tested and validated on 175 genomes drawn from the three domains of life plus viruses. We find that CPR2 holds closely for genomes over 10^5 bp in length, and that CPR2 compliance declines in shorter genomes in inverse proportion to genome length and in direct proportion to n-tuple size, regardless of whether the genome is DNA or RNA, single- or double-stranded.
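A frequency versus rank-order relationship of the kind described above can be examined with a few lines of code. The following is a minimal sketch, not the authors' analysis: it rank-orders n-tuple counts on one strand and estimates the slope of log(frequency) against log(rank) by ordinary least squares, which would be roughly constant over the range where a power law holds. The function names are hypothetical, and a least-squares fit on log-log axes is only one simple way to estimate such a slope.

```python
import math
from collections import Counter


def rank_frequency(strand: str, n: int) -> list:
    """Return (motif, count) pairs sorted by descending count.

    Ties are broken alphabetically so the ordering is deterministic."""
    counts = Counter(strand[i:i + n] for i in range(len(strand) - n + 1))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def loglog_slope(frequencies: list) -> float:
    """Least-squares slope of log(frequency) against log(rank).

    `frequencies` must be positive and already in rank order (rank 1 first)."""
    if len(frequencies) < 2:
        raise ValueError("need at least two ranked frequencies")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A typical use would be `loglog_slope([count for _, count in rank_frequency(strand, n)])`, where `strand` is a genome sequence read from one of the data files.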