
REVIEW: The Influence of the Nucleotide Composition of Genes and Gene Regulatory Elements on the Efficiency of Protein Expression in Escherichia coli


Artur I. Zabolotskii1,a*, Stanislav V. Kozlovskiy1, and Alexey G. Katrukha1

1Faculty of Biology, Lomonosov Moscow State University, 119991 Moscow, Russia

* To whom correspondence should be addressed.

Received May 25, 2022; Revised June 23, 2022; Accepted June 29, 2022
Recombinant proteins expressed in Escherichia coli are widely used in biochemical research and industrial processes. At the same time, achieving higher protein expression levels and correct protein folding remains a key problem, since optimization of nutrient media, growth conditions, and methods for induction of protein synthesis does not always lead to the desired result. Often, low protein expression is determined by the sequences of the expressed genes and their regulatory regions. The genetic code is degenerate: 18 out of 20 amino acids are encoded by more than one codon. The choice between synonymous codons in the coding sequence can significantly affect the level of protein expression and protein folding, because the nucleotide composition of the gene influences the probability of formation of mRNA secondary structures. These structures affect ribosome binding at the translation initiation phase, as well as ribosome movement along the mRNA during elongation, which, in turn, influences mRNA degradation and folding of the nascent protein. The nucleotide composition of gene regulatory elements and mRNA untranslated regions, in particular the promoter and Shine–Dalgarno sequences, also affects the efficiency of transcription, translation, and mRNA degradation. In this review, we describe the genetic principles that determine the efficiency of protein production in Escherichia coli.
KEY WORDS: recombinant proteins, codon composition, codon optimization, Escherichia coli, expression activation, solubility
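
The degeneracy stated in the abstract (18 of 20 amino acids having more than one codon) can be checked with a short sketch. It assumes the standard genetic code (NCBI translation table 1); the variable names are illustrative and are not taken from the review.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1), listed with the
# bases in T, C, A, G order; the third codon position varies fastest.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 codons to its one-letter amino acid ("*" marks a stop).
codon_table = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group sense codons by the amino acid they encode.
synonymous = defaultdict(list)
for codon, aa in codon_table.items():
    if aa != "*":
        synonymous[aa].append(codon)

# Amino acids with more than one codon versus exactly one codon.
degenerate = sorted(aa for aa, codons in synonymous.items() if len(codons) > 1)
single = sorted(aa for aa, codons in synonymous.items() if len(codons) == 1)

print(len(degenerate), "amino acids have synonymous codons")
print("single-codon amino acids:", single)
```

Under this table, only methionine (ATG) and tryptophan (TGG) have a single codon, so any other residue in a coding sequence offers a choice between synonymous codons.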

DOI: 10.1134/S0006297923140109