Data from: The twenty amino acids are identified by unique numbers assigned to the uracil, cytosine, adenine, and guanine found in the three base positions of the sixty-four messenger RNA genetic codons
Data files
Jan 15, 2024 version files 285.46 KB
- 1-1-24_Figure_2.pdf (64.96 KB)
- Bayles_Table_5_1_9a_24.xlsx (87.45 KB)
- codon_9_24a_23.pdf (128.20 KB)
- README.md (4.86 KB)
Abstract
A codon’s three bases consist of any combination of uracil, cytosine, adenine, or guanine, and these encode the twenty amino acids. When the codon’s first two bases are given specific values and those values are multiplied, the third base of the codon is used during translation only when the product is greater than three. Here we show that those values, plus additional variables within the ribosomal decoding site, result in specific flow values for each of the twenty amino acid groups. These results are demonstrated in a flow chart showing the unidirectional flow that is expected during the translation process. All twenty amino acids can be represented by numbers that describe their relationship to each other and to the decoding site. We anticipate our findings will increase discussion about using a number system to better understand the translation process.
README: Each of the twenty amino acids in the codon dictionary has its own numerical value
https://doi.org/10.5061/dryad.tht76hf5r
Table 1. The standard codon dictionary. The triplet codons consist of three bases of uracil (U), cytosine (C), adenine (A), and guanine (G). This creates sixty-four codon triplets, sixty-one of which encode the twenty amino acids. UAA, UAG, and UGA are termination codons that do not encode protein.
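The counts in the Table 1 caption can be reproduced with a short sketch. It builds the standard codon dictionary from the usual one-letter amino acid string in U, C, A, G order; the names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are illustrative and do not come from the dataset files.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with bases 1, 2, 3 each
# cycling through U, C, A, G (third base fastest). "*" marks termination.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each triplet such as "AUG" to its one-letter amino acid.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
encoding = {c: aa for c, aa in CODON_TABLE.items() if aa != "*"}

print(len(CODON_TABLE))             # 64
print(len(encoding))                # 61
print(len(set(encoding.values())))  # 20
print(stop_codons)                  # ['UAA', 'UAG', 'UGA']
```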
Fig. 1. The codon dictionary flow chart. The sixty-one encoding triplets follow a three-step flow chart. First, triplets whose second base is A always use their third base during translation. Second, codons with C as their second base do not use their third base during the encoding process. Finally, codons with first base C or G do not use their third base unless their second base is A, a case already handled in the first step. The remaining two first base possibilities are U and A, and these do use their third base unless their second base is C, a case already handled in the second step of the flow chart.
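The three steps of the Fig. 1 flow chart can be written as a small function and compared with the standard codon dictionary. This is a minimal sketch, not the authors' spreadsheet logic: the function names are illustrative, and a group of four codons is treated as "using its third base" whenever its four codons do not all give the same outcome, with termination counted as an outcome.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def uses_third_base(first: str, second: str) -> bool:
    """Follow the three flow-chart steps for a codon's first two bases."""
    if second == "A":      # step 1: second base A always uses the third base
        return True
    if second == "C":      # step 2: second base C never uses the third base
        return False
    return first in "UA"   # step 3: first base C or G does not; U or A does


def group_is_split(first: str, second: str) -> bool:
    """True when the four codons sharing two bases differ in what they encode."""
    return len({CODON_TABLE[first + second + third] for third in BASES}) > 1


# Pairs where the flow chart and the codon dictionary disagree.
mismatches = [
    f + s
    for f, s in product(BASES, repeat=2)
    if uses_third_base(f, s) != group_is_split(f, s)
]
print(mismatches)  # []
```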
Table 2. Base letter values. See Table 5. Assigning certain numbers to the codons’ first and second bases and then multiplying them predicts which amino acids need their third bases to encode. Because the flow chart begins with analysis of the second base letter A, it is given the highest value. Next, a second base C never uses the third base, so its value is assigned zero. The remaining bases fall in between these two extremes. When the codon’s first base value is multiplied by its second base value and the product is greater than 3, that codon’s third base is part of the encoding process.
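The product rule itself is a one-line comparison. The sketch below takes the base values as inputs because the actual Table 2 numbers are not reproduced in this description and should be read from the spreadsheet; the function names and the numbers in the example calls are illustrative only.

```python
def third_base_is_used(b1_value: float, b2_value: float, threshold: float = 3) -> bool:
    """Product rule: the third base takes part when B1 x B2 exceeds the threshold."""
    return b1_value * b2_value > threshold


def pairs_using_third_base(first_values: dict, second_values: dict) -> list:
    """List the first-two-base pairs whose product exceeds the threshold.

    Both arguments map a base letter to its assigned value (to be filled in
    from Table 2; no values are assumed here).
    """
    return sorted(
        f + s
        for f, b1 in first_values.items()
        for s, b2 in second_values.items()
        if third_base_is_used(b1, b2)
    )


# A second base C is assigned zero, so the product is 0 for any first base
# value (the 2 below is an arbitrary stand-in, not a Table 2 value).
print(third_base_is_used(2, 0))  # False
# Products of 8 and 2 are the two named in the Table 3 description.
print(third_base_is_used(8, 1))  # True
print(third_base_is_used(2, 1))  # False
```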
Table 3. Third base value. See Table 5. When the third base does not encode, 4 is added to its value. When the third base does encode, 0.9 is added to the codon’s flow value as long as the third base letter is U or C. No amount is added when the third base letter is A or G. There are two special circumstances when calculating the flow amount for the third base. First, when the first base value multiplied by the second base value equals 8 and the third base is an A, an additional 0.9 is added to the total flow amount for that codon. Second, when the product of the first and second bases equals 2, an additional 0.1 is added to that codon’s total flow calculation.
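One possible reading of the Table 3 rules is sketched below as a function of the B1 x B2 product and the third base letter. It assumes that "4 is added" means the third base contributes 4 when it does not encode, and that the two special circumstances are added on top; the function name is hypothetical, and the spreadsheet columns remain the authoritative calculation.

```python
def third_base_term(b1_b2_product: float, third_base: str) -> float:
    """Third-base contribution to a codon's flow value (one reading of Table 3)."""
    if b1_b2_product > 3:
        # Third base encodes: 0.9 for U or C, nothing for A or G.
        term = 0.9 if third_base in "UC" else 0.0
        # Special circumstance 1: product of 8 with third base A adds 0.9.
        if b1_b2_product == 8 and third_base == "A":
            term += 0.9
    else:
        # Third base does not encode: 4 is added.
        term = 4.0
    # Special circumstance 2: a product of 2 adds 0.1.
    if b1_b2_product == 2:
        term += 0.1
    return round(term, 1)


print(third_base_term(8, "U"))  # 0.9
print(third_base_term(8, "A"))  # 0.9
print(third_base_term(8, "G"))  # 0.0
print(third_base_term(2, "U"))  # 4.1
print(third_base_term(0, "A"))  # 4.0
```

Under this reading, a first-two-base pair with product 8 gives the same term for third bases U, C, and A and a different one for G, which is the pattern needed to separate a single G-ending codon such as AUG from its three neighbours.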
Table 4. Ribosomal Flip Values. See Table 5. To determine values for the twenty amino acid categories that the sixty-one codons fall within, the ribosome is also assigned some values. The ribosome’s A-site tunnel holds two adenine molecules (A) that move freely, or “flip”. They are assigned flip values that react to a codon’s first and second bases, identified as “B1 Flipper” and “B2 Flipper”.
Table 5. Arranged in descending order, the amino acids break up into twenty flow field values when combining a codon’s three base values with the ribosomal flip values. The result is twenty Codon Flow Values, one value for each of the twenty amino acids. For example, leucine (leu), serine (ser), and arginine (arg) are encoded by six codons each, and each set of six has its own unique number. Methionine (met) is encoded by only one codon, AUG, and it is also the only encoding codon with a value of 10. The first column on the left, column 1, lists the codon. Column 2 shows the amino acid encoded by that codon. Columns 3 and 4 are the values assigned to the first (B1) and second (B2) bases of the codon, see Table 2. The next three columns, 5, 6, and 7, show the numbers assigned to the codon's third base (B3), see Table 3. Columns 8 and 9 hold the codon's first base (B1 Flip) and second base (B2 Flip) ribosome flip values, see Table 4. The far right column, column 10, shows the resulting Codon Flow Value, created by combining all the columns. The remaining columns are raw data for setting up the commands used to create the column graph, and each is labeled according to its task.
Table 5 was prepared using an XLSX spreadsheet to compute each column.
Fig. 2. Flow Chart describing the values assigned to each codon. A flow chart shows an orderly pattern for describing a codon’s flow value. The value for (B1 X B2) must be established first so that B3 can be calculated. After the third base’s value is calculated, it is added to Base 1 and 2 values and Flip 1 & 2 values. The codon’s flow value is established.
Fig. 3. The 64 codons with numbers assigned in graph form. This graph places all 64 codons in descending order for easy analysis. For example, leucine has six codons that encode it and all six have the same flow value number. The remaining nineteen amino acids each also possess one flow value that is unique to them.
Data was derived from the following sources:
- Each table, figure, and graph was created by the authors B. Bayles and C. Heckert.
Methods
There are sixty-one codons that are read by the ribosome, and these encode the twenty amino acids. Our dataset assigns numbers to each of the 61 codons. These numbers are calculated by assigning numerical values to the uracil, guanine, adenine, and cytosine that comprise mRNA during translation. Values are also assigned to the site within the ribosome that reads these codons. Twenty values result, and they break the sixty-one codons into their respective twenty amino acid classes. For example, there are six codons that encode serine, and the value chart shows all six have the same value of 9.1, which is unique to serine and is not found among any of the other nineteen amino acid groups. Flow charts are used to show these values calculated in a unidirectional flow, as is expected during translation.
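The claim that each amino acid has exactly one flow value, shared with no other amino acid, can be checked mechanically once the codon, amino acid, and Codon Flow Value columns are exported from the spreadsheet. The sketch below assumes the rows arrive as simple tuples; the function name is hypothetical, and the example rows use only the two values stated in this description (9.1 for the six serine codons and 10 for AUG).

```python
from collections import defaultdict


def check_flow_values(rows):
    """Report amino acids with several values, or values shared by amino acids.

    rows: iterable of (codon, amino_acid, flow_value) tuples.
    Returns a list of problem descriptions; an empty list means the mapping
    between amino acids and flow values is one-to-one.
    """
    values_by_aa = defaultdict(set)
    aas_by_value = defaultdict(set)
    for _codon, amino_acid, value in rows:
        values_by_aa[amino_acid].add(value)
        aas_by_value[value].add(amino_acid)

    problems = []
    for amino_acid, values in values_by_aa.items():
        if len(values) > 1:
            problems.append(f"{amino_acid} has several values: {sorted(values)}")
    for value, amino_acids in aas_by_value.items():
        if len(amino_acids) > 1:
            problems.append(f"value {value} is shared by {sorted(amino_acids)}")
    return problems


# Only the values quoted in the text are used here.
serine_codons = ("UCU", "UCC", "UCA", "UCG", "AGU", "AGC")
rows = [(codon, "ser", 9.1) for codon in serine_codons] + [("AUG", "met", 10)]
print(check_flow_values(rows))  # []
```

Running the same check over all sixty-one rows of Table 5 would confirm, or flag exceptions to, the twenty-value grouping described above.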