Determination of protein secondary structure using a 3D genetic code table. Kushelev Alexander, Nanoworld laboratory.


Until now, the genetic code has been considered degenerate, with up to 6 synonymous codons for some amino acids (Ser, Arg, Leu). We decided to test this hypothesis. To do so, we compiled a table of well-studied proteins and determined which codons encode amino acids in alpha-helices, beta structures, 310-helices, and pi-helices.


It turned out that glycine is encoded by the GGC triplet in the alpha-helix, by the GGA triplet in the beta-sheet, by the GGG triplet in the 310-helix, and by the GGT triplet in the pi-helix.
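The glycine assignments above can be written as a small lookup table. The sketch below is only an illustration: the names `GLYCINE_CODON_CLASS` and `classify_codons` are hypothetical, and the table holds only the four glycine entries stated in the text, not the full 3D table.

```python
# Only the four glycine assignments stated in the text are included here;
# the full 3D table is not reproduced (illustrative, hypothetical names).
GLYCINE_CODON_CLASS = {
    "GGC": "alpha",
    "GGA": "beta",
    "GGG": "310",
    "GGT": "pi",
}


def classify_codons(dna):
    """Split a coding DNA string into triplets and look each one up.

    Codons that are missing from the table map to None.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing triplet
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [GLYCINE_CODON_CLASS.get(codon) for codon in codons]


print(classify_codons("GGCGGAGGGGGT"))  # ['alpha', 'beta', '310', 'pi']
```

Returning None for unknown codons keeps the sketch honest about the entries it does not contain.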

In 1992, we compiled a 3D table of the genetic code [1].


Thus, the triplet encodes not only the amino acid, but also the phi angle.



Sample (Collagen, fragment)

In 1992, we had no reliable evidence for the existence of a three-dimensional genetic code, but Aviv A. Rosenberg, Ailie Marx, and Alex M. Bronstein have since found consistent evidence that Val encoded by the GTA triplet is more likely to occur in a beta-sheet, and Val encoded by the GTT triplet is more likely to occur in a helix. Thus, two cells of our 3D genetic code table are reliably confirmed [2].


If the code for the 310-helix is repeated three times in a row, the first hydrogen bond forms and a 310-helix is built. In the secondary structure diagram, the helix is indicated by ones. If the subsequent codes are alpha or 310, the helix continues as a sequence of ones. If a beta or pi code is encountered, the helix ends and zeros follow in the diagram.

If the pi code is repeated five times in a row, a pi-helix begins to form, which is also denoted by ones: 11111. If any other code (alpha, beta, 310) follows instead of the pi code, the pi-helix ends and zeros begin.
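The two rules above can be sketched as a short function. This is a minimal illustration, not the original program: the name `helix_diagram` and the class labels "alpha", "beta", "310", and "pi" are hypothetical, and it assumes that the residues of the triggering run (three 310 codes or five pi codes) are themselves marked with ones.

```python
def helix_diagram(classes):
    """Return a string of 0s and 1s, one character per residue.

    Assumption: the run that triggers a helix is itself marked with ones.
    """
    n = len(classes)
    marks = ["0"] * n
    i = 0
    while i < n:
        # Find the end of the run of identical codes that starts at i.
        j = i
        while j < n and classes[j] == classes[i]:
            j += 1
        run_length = j - i
        if classes[i] == "310" and run_length >= 3:
            # A 310-helix continues through alpha or 310 codes
            # and ends at the first beta or pi code.
            while j < n and classes[j] in ("alpha", "310"):
                j += 1
            marks[i:j] = ["1"] * (j - i)
        elif classes[i] == "pi" and run_length >= 5:
            # A pi-helix lasts only as long as the pi code repeats.
            marks[i:j] = ["1"] * (j - i)
        i = j
    return "".join(marks)


codes = ["beta", "310", "310", "310", "alpha", "alpha", "beta",
         "pi", "pi", "pi", "pi", "pi", "alpha"]
print(helix_diagram(codes))  # 0111110111110
```

Runs that are too short (for example, two 310 codes or four pi codes) leave the diagram at zero under these assumptions.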

This is what the diagram of the secondary structure of the protein with the identifier CAG5119892.1 looks like:

secondary structure code:





Let us calculate the correlation with X-ray diffraction analysis / nuclear magnetic resonance.

The results of the correlation analysis are summarized in Table 2:



Full table: https://disk.yandex.ru/d/A0kxSUCKyKmnXw

Considering that X-ray diffraction analysis has a spatial error of fractions of an angstrom, the helix boundaries it reports may be off along the primary sequence by up to one turn of the helix. We therefore discarded 5 amino acid residues from each end of each helical segment. As expected, the correlation increased significantly; a 100% correlation was obtained for proteins with extended helical regions.
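The trimming step can be sketched as follows. The exact correlation measure is not defined in the text, so this sketch assumes it is the percentage of experimentally helical residues that the diagram also marks as helical; the function names and the 0/1 string encoding of the experimental annotation are likewise assumptions.

```python
import re


def trim_helix_ends(reference, k=5):
    """Indices of helical residues in `reference` (a 0/1 string)
    that remain after dropping k residues from each end of every helix."""
    kept = []
    for match in re.finditer(r"1+", reference):
        # Helices shorter than 2*k residues contribute nothing.
        kept.extend(range(match.start() + k, match.end() - k))
    return kept


def helix_agreement(predicted, reference, k=5):
    """Percentage of the trimmed reference-helix residues that are
    also marked as helix ('1') in the predicted diagram."""
    indices = trim_helix_ends(reference, k)
    if not indices:
        return None  # no helical residues left to compare
    hits = sum(predicted[i] == "1" for i in indices)
    return 100.0 * hits / len(indices)


reference = "0" * 3 + "1" * 14 + "0" * 3  # experimental annotation (made up)
predicted = "0" * 5 + "1" * 10 + "0" * 5  # diagram from the table (made up)
print(round(helix_agreement(predicted, reference, k=0), 1))  # 71.4
print(helix_agreement(predicted, reference, k=5))            # 100.0
```

In this made-up example the disagreement sits entirely at the helix ends, so trimming removes it; the strings are not data from the study.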

If we take a sample of proteins that contain no helices, the correlation cannot be calculated from the presence of helices, because there are none. In this case, the correlation can be calculated from the absence of helices, i.e. for the regions where X-ray diffraction analysis / nuclear magnetic resonance shows that there are no helices.

The data for a sample of 250 random proteins are summarized in Table 3:


Pruning 5 amino acid residues from each end of each helix increases the maximum of the helix and non-helix correlations from 72% to 96%.


A graph of the correlation value versus the number of amino acid residues cut is shown in Figure 3: 

The graph shows that the middle sections of the helices are determined by X-ray diffraction analysis with a reliability of 96%.

It is known that even a 100% correlation does not guarantee causation, so we continued testing using other "wet" methods, for example, chemical analysis. Chemical analysis makes it possible to determine the presence of disulfide bridges.

The program automatically built 6 protein fragments closed through disulfide bridges using the 3D genetic code table. At each step of building a model from the triplet code, the phi angle is taken from the table. If a cycle, for example in lysozyme, contains 22 amino acid residues, then the number of structural variants is three to the power of the number of residues minus 1, i.e. 3^(22-1) = 3^21 ~= 1.05*10^10 (about 10 billion). The disulfide bridge closes in only one of these variants. The probability of this event is 1/(1.05*10^10), i.e. roughly one ten-billionth. The probability of closing all 6 fragments that the program built from the code in automatic mode is 1/(10^84). It is clear that the closure of disulfide bridges in the model is not accidental. If the 3D genetic code table is changed, all the disulfide bridges in the model open, and the correlation with X-ray diffraction analysis drops to zero.
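The variant count for a single 22-residue cycle can be checked directly. The sketch below only evaluates 3^(22-1); the figure of three variants per step is taken from the text as stated, and the variable names are illustrative.

```python
residues = 22                   # cycle length from the lysozyme example
variants_per_step = 3           # as stated in the text
variants = variants_per_step ** (residues - 1)

print(variants)                 # 10460353203
print(f"{variants:.2e}")        # 1.05e+10
print(f"{1 / variants:.2e}")    # chance of one specific variant
```

For comparison, 3^22 is about 3.1*10^10, so the result changes by a factor of three depending on whether the exponent is 21 or 22.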

In 2021, we discovered a protein structure in PDB that is a fractal helix:


This is another confirmation of the operability and predictive power of the 3D coding algorithm.

How can the correctness of the 3D genetic code table be guaranteed 100%? It is enough to grow crystals from sequences in which a single code from the table is repeated. For glycine, for example, these are the repeats n(gga), n(ggc), n(ggg), n(ggt). There are 61 codes and therefore 61 crystals in total. In 2024, it was found that such repeat regions are typical of membrane proteins [3].


The secondary structure of this protein is determined from the 3D genetic code table:


Despite the difficulty of obtaining crystals of membrane proteins, their typical structure is known: it consists of numerous helical segments. This is another confirmation of the 3D table of the genetic code.

One advantage of the 3D coding algorithm is the higher reliability of protein structure determination. Another advantage is speed, which is about a billion times higher than that of determining protein structure by X-ray crystallography / nuclear magnetic resonance and other "wet" methods.



For some classes of proteins that do not contain Pro, our algorithm makes it possible to determine not only the secondary but also the tertiary structure [4].

Our model experiments help determine the geometric parameters not only of proteins but also of a wide class of chemical compounds [5].

The 3D genetic code algorithm makes it possible to determine the structures of all proteins encoded by a chromosome, and even the entire genome, in one run of the program. This scientific achievement raises civilization to a new level of development.


1. Kushelev A.Yu., Kozhevnikov D.N. Competition project "Nanoworld" // Ekonomika segodnya i zavtra (Economy Today and Tomorrow). 1992. P. 30. https://img-fotki.yandex.ru/get/5905/nanoworld2003.23/0_4a672_1493e464_orig.jpg

2. Rosenberg A.A., Marx A., Bronstein A.M. Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon // Nature Communications. 2022. https://www.nature.com/articles/s41467-022-30390-9

3. https://www.uniprot.org/uniprotkb/A0A1Q9E695/entry

4. Sokolik V., Kushelev A. Geometry of the Living Nanoworld: Picotechnology of Proteins // LAP LAMBERT Academic Publishing. 2016. 292 pp. https://books.google.ru/books?id=2P7IzQEACAAJ

5. Kozhevnikov D.N. Ring-faced models of molecules // Zhurnal fizicheskoi khimii (Journal of Physical Chemistry). 1996. Vol. 70. No. 6. Pp. 1134-1137. http://nanoworld88.ru/files/700-800/791.htm

