Biochemistry, Replication and Transcription


The flow of genetic information in biological systems from DNA to RNA to protein is the central dogma of molecular biology. It describes how genetic information stored as DNA in a cell is copied into RNA and then translated into protein.

Replication allows cells to generate new genetic material (DNA) using the original DNA as a template. The cell cycle consists of four phases: G1, S, G2, and M. During the G1 phase, cells grow and produce materials such as nucleotide precursors in preparation for DNA replication. Replication occurs in the S phase, when new genetic material is synthesized in preparation for cell division. Synthesis of histones and other DNA-associated proteins is markedly increased in the S phase. The process is highly regulated and requires many different enzymes, including DNA polymerase, primase, ligase, helicase, and topoisomerase. Replication is semiconservative: the two strands of the parental DNA separate, each serves as a template for a new strand, and each daughter molecule retains one parental strand.[1]

Replication in prokaryotic and eukaryotic cells is quite similar. The main differences are that eukaryotes have a larger amount of DNA and that this DNA is associated with histones. Prokaryotic DNA is circular and typically has a single origin, from which replication proceeds bidirectionally. Eukaryotic DNA, on the other hand, is linear, is packaged around histones into tightly organized chromosomes, and is replicated from multiple origins.[2][3]

Transcription is the process in which a specific segment of DNA is used as a template and copied into an RNA molecule. This synthesis is carried out by an enzyme known as RNA polymerase. In eukaryotes, the newly synthesized messenger RNA then exits the nucleus and enters the cytoplasm, where it is translated into protein.[4]


Eukaryotic DNA replication occurs in the nucleus, where new DNA is made using the original DNA as a template. The process occurs in three stages: initiation, elongation, and termination. The DNA in turn serves as the template for transcription, which synthesizes messenger RNA (mRNA) that is then used to generate proteins. Like replication, transcription follows a three-step process of initiation, elongation, and termination. In eukaryotic cells, mRNA formation is followed by post-transcriptional modifications; together, these steps form the basis of gene expression.

Issues of Concern

Several issues can arise during replication or transcription, the most common being mutations that lead to a loss of function. To avoid these problems, enzymes carry out proofreading and DNA repair as the DNA is being replicated. This ensures high-fidelity replication with very few base-pair mismatches. However, defects in the proofreading and repair mechanisms can result in mutations that cause the cell to behave aberrantly. Many mutations go unnoticed because they are silent or are detected and corrected by the cell's regulatory mechanisms. Some mutations, however, directly affect the expression of genes and cause diseases such as cancer.



Replication is a highly complex process that requires the concerted effort of many different proteins, including but not limited to DNA polymerases, primase, helicase, and DNA ligase. In eukaryotes, polymerases δ and ε are the major replicative enzymes.

DNA has a double-helical structure in which the two strands are held together by hydrogen bonds between complementary base pairs. Each strand is made up of nucleotides joined by phosphodiester linkages. For replication to begin, the DNA helix must first unwind, which requires breaking the hydrogen bonds that hold the two strands together. Unwinding is accomplished by DNA helicase, which often starts separating the strands in a region that is rich in adenine (A) and thymine (T). Because A-T pairs are held by only two hydrogen bonds, instead of the three in G-C pairs, such a region is an ideal place for the helicase to start unwinding the DNA. Unwinding creates a replication fork that has a leading and a lagging strand, a result of the antiparallel nature of the strands.[5][6][7]
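
The difference in hydrogen bonding described above can be made concrete with a minimal sketch. It assumes the standard counts of two bonds per A-T pair and three per G-C pair; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Hydrogen bonds per Watson-Crick base pair: A-T has 2, G-C has 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


def weakest_window(strand: str, width: int) -> Tuple[int, str]:
    """Return (start index, subsequence) of the window with the fewest
    hydrogen bonds, i.e. the most AT-rich stretch of the given width.
    If several windows tie, the first one is returned."""
    if width <= 0 or width > len(strand):
        raise ValueError("width must be between 1 and len(strand)")
    starts = range(len(strand) - width + 1)
    best = min(starts, key=lambda i: hydrogen_bonds(strand[i:i + width]))
    return best, strand[best:best + width]


# Hypothetical sequence with an AT-rich stretch in the middle.
example = "GCGCGGATATTAAGCGCC"
start, window = weakest_window(example, 6)
```

In this toy sequence the AT-rich window carries fewer hydrogen bonds than a GC-rich window of the same width, which is the property the text attributes to favorable unwinding sites.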

DNA polymerase then synthesizes the new strand in the 5' to 3' direction, but it requires a short RNA sequence, approximately 10 nucleotides in length, as a primer to begin the process. The enzyme primase therefore generates an RNA primer by placing RNA bases complementary to the bases of the template strand. DNA strands are directional: one end (3') carries a hydroxyl group and the other (5') carries a phosphate group. The lagging strand needs multiple RNA primers, each of which DNA polymerase uses to begin elongation. DNA replication is high fidelity owing to the proofreading ability of DNA polymerase, which detects, removes, and corrects errors made during replication.[2][8]

On the leading strand, DNA polymerase synthesizes continuously, while the lagging strand has to be copied in short segments in the opposite direction because DNA polymerase synthesizes only in the 5' to 3' direction. These short lagging-strand segments are known as Okazaki fragments. Once synthesis of an Okazaki fragment is complete, its RNA primer is no longer required and must be removed and replaced with the proper DNA sequence. Flap endonuclease 1 (FEN1) and RNase H remove the RNA primer, and the gap is filled by polymerase δ, which uses the parental DNA strand as a template to add the remaining bases. Finally, DNA ligase joins the Okazaki fragments on the lagging strand by creating phosphodiester bonds. The generation of these bonds is the final step in DNA replication and results in two daughter DNA double helices. Replication is referred to as semiconservative because each double helix consists of one original parental template strand and one newly synthesized strand.[7][9]
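
The semiconservative outcome described above can be sketched in a few lines. This is a simplified illustration, not a model of the enzymology: it ignores primers, Okazaki fragments, and ligation, and the names `complement_strand` and `replicate` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template: str) -> str:
    """Return the new strand written 5'->3', given a template written 5'->3'.

    The two strands are antiparallel, so the new strand is the reverse
    complement of the template.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template.upper()))


def replicate(duplex):
    """Semiconservative replication of a duplex given as (top, bottom),
    with both strands written 5'->3'.

    Each daughter duplex keeps one parental strand and gains one new strand.
    """
    top, bottom = duplex
    daughter_1 = (top, complement_strand(top))        # parental top, new bottom
    daughter_2 = (complement_strand(bottom), bottom)  # new top, parental bottom
    return daughter_1, daughter_2


# Hypothetical parental duplex.
parent_top = "ATGCCGTT"
parent = (parent_top, complement_strand(parent_top))
d1, d2 = replicate(parent)
```

Both daughters carry the same sequence as the parent, and each contains exactly one strand taken from the parental duplex.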

Towards the completion of replication, the newly synthesized lagging strand is shorter than its template, which leaves a 3' overhang on the parental strand. This is thought to be due to degradation of the RNA primer at the end of the chromosome or the inability of primase to lay down a primer at the very end. To prevent the loss of genes during successive replications, telomeres are added to the ends of the chromosomes by an enzyme called telomerase. Telomerase contains both protein and an RNA component that can base-pair with the 3' overhang and extend it. Without telomeres, chromosomes gradually shorten with each replication, and the cell undergoes premature senescence.[10][11]
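
The gradual shortening described above can be illustrated with a toy calculation. All numbers below are hypothetical parameters rather than measured values, and the assumption that a fixed number of base pairs is lost per division is a simplification introduced here.

```python
def divisions_until_senescence(telomere_bp, loss_per_division_bp,
                               critical_bp, telomerase_gain_bp=0):
    """Count the divisions a cell completes before its telomere would drop
    below a critical length.

    All lengths are in base pairs and are illustrative parameters only.
    Returns None when the gain from telomerase offsets the loss, so that
    no progressive shortening occurs.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None  # telomerase fully compensates for the loss
    divisions = 0
    while telomere_bp - net_loss >= critical_bp:
        telomere_bp -= net_loss
        divisions += 1
    return divisions


# Hypothetical values: 10,000 bp telomere, 100 bp lost per division,
# senescence once the telomere would fall below 5,000 bp.
without_telomerase = divisions_until_senescence(10_000, 100, 5_000)
partial_telomerase = divisions_until_senescence(10_000, 100, 5_000,
                                                telomerase_gain_bp=50)
full_telomerase = divisions_until_senescence(10_000, 100, 5_000,
                                             telomerase_gain_bp=100)
```

Under these assumed values, partial compensation by telomerase doubles the number of divisions, and full compensation removes the limit altogether.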


RNA is synthesized from the DNA template by a process known as transcription. This requires enzymes called RNA polymerases, which generate a single-stranded RNA from one of the parental DNA strands. Unlike DNA polymerase, RNA polymerase can initiate RNA synthesis without a primer. While prokaryotes have a single RNA polymerase, eukaryotic cells possess three: RNA polymerase I synthesizes ribosomal RNA, RNA polymerase II synthesizes mRNA and microRNA, and RNA polymerase III synthesizes tRNA and other small RNAs. Transcription is less accurate than replication and produces more errors, whereas replication is highly accurate owing to the proofreading ability of DNA polymerase.[12]

To initiate transcription, RNA polymerase must first identify the gene to be transcribed and the correct strand of the dsDNA to copy, and it must bind to a specific DNA sequence called the promoter. RNA polymerase reads the template strand in the 3' to 5' direction and synthesizes the RNA transcript in the 5' to 3' direction. Many promoters contain an element known as the TATA box, named for its high frequency of adenine and thymine. In prokaryotes, the corresponding element (the Pribnow box, or -10 element) has the consensus sequence TATAAT. A similar consensus sequence, TATA(A/T)A, is present in eukaryotes. Several other proteins, such as transcription factors and binding proteins, also participate at the initiation site and function together as a complex.[12][13]
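
As a minimal sketch, the two consensus sequences quoted above can be written as regular expressions and searched for in a sequence. Exact matching is an assumption made here for simplicity; real promoter elements often deviate from the consensus. The function name `find_motif` and both example sequences are hypothetical.

```python
import re

# Consensus sequences from the text, written as regular expressions.
PROKARYOTIC_BOX = "TATAAT"
EUKARYOTIC_TATA_BOX = "TATA[AT]A"


def find_motif(pattern: str, sequence: str):
    """Return the 0-based start positions of every exact match,
    including overlapping matches (hence the lookahead)."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Hypothetical upstream regions, written 5'->3'.
eukaryotic_upstream = "GGCCTATAAAGGCGCTATATAGC"
prokaryotic_upstream = "TTGACATATAATG"

eukaryotic_hits = find_motif(EUKARYOTIC_TATA_BOX, eukaryotic_upstream)
prokaryotic_hits = find_motif(PROKARYOTIC_BOX, prokaryotic_upstream)
```

The character class `[AT]` encodes the (A/T) position of the eukaryotic consensus, so both TATAAA and TATATA are reported as matches.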

Once RNA polymerase has bound to the section of the gene that will undergo transcription, it continues to separate the double helix and synthesizes RNA in the 5' to 3' direction. The polymerase places bases complementary to the template strand, except that it pairs every adenine with uracil (U) instead of thymine. This is the elongation phase, in which RNA polymerase moves along the template and creates a new complementary single-stranded RNA. Elongation proceeds until the polymerase meets a hairpin loop structure known as the termination sequence, which causes the polymerase to dissociate from the template and marks the termination phase.
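
The base-pairing rule used during elongation can be sketched as follows. The function assumes its input is the template strand written in the 3' to 5' direction, so the returned RNA reads 5' to 3'; the function name and the example sequence are hypothetical.

```python
# Template DNA base -> RNA base placed opposite it (uracil pairs with adenine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA transcript (5'->3') for a template strand that is
    written in the 3'->5' direction, the direction in which it is read."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand (3'->5') and its coding-strand partner (5'->3').
template = "TACGGATTC"
coding = "ATGCCTAAG"
rna = transcribe(template)
```

The transcript has the same sequence as the coding (non-template) strand, with uracil in place of thymine.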

Once synthesized, the RNA undergoes post-transcriptional modification, which protects it from degradation during its exit from the nucleus to the cytoplasm, where it will be translated into protein. The single-stranded mRNA receives a 7-methylguanosine cap at its 5' end and a poly-A tail at its 3' end. In eukaryotic cells, the transcript also undergoes splicing, in which portions of the RNA called introns are cut out, leaving only the sequences required for translation, known as exons. The processed RNA strand then travels through the pores in the nuclear envelope to the cytosol to participate in protein synthesis by the process known as translation.[14][15]
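
The three modifications described above can be sketched on a string representation of the transcript. This is an illustration under stated assumptions: intron positions are supplied by the caller rather than recognized from sequence, the cap is represented by the text label "m7G-", and the poly-A tail length is an arbitrary parameter. All names and sequences are hypothetical.

```python
def splice(pre_mrna: str, introns) -> str:
    """Remove introns given as (start, end) pairs, 0-based and half-open,
    and join the remaining exons in order."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or start >= end or end > len(pre_mrna):
            raise ValueError("introns must be valid and non-overlapping")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def process_transcript(pre_mrna: str, introns, poly_a_length: int = 20) -> str:
    """Splice the transcript, then add a 5' cap label and a 3' poly-A tail."""
    return "m7G-" + splice(pre_mrna, introns) + "A" * poly_a_length


# Hypothetical pre-mRNA built from two exons separated by one intron.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUCAG", "UUUUAA"
pre_mrna = exon_1 + intron + exon_2
intron_span = (len(exon_1), len(exon_1) + len(intron))

mature = process_transcript(pre_mrna, [intron_span], poly_a_length=5)
```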


Southern blotting is a laboratory technique used to detect specific DNA sequences in a sample such as blood or tissue. The DNA is separated by size using electrophoresis, transferred to a membrane, and exposed to a specific probe containing the sequence complementary to the target DNA. Binding of the probe to DNA on the membrane indicates the presence of the target sequence in that sample.[16]
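
The logic of probe hybridization can be sketched computationally. This is a loose analogy rather than a model of the laboratory procedure: fragments are represented by one strand of each duplex, sorting by length stands in for electrophoretic separation, and hybridization is reduced to an exact match of the probe's complementary sequence on either strand. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def southern_blot(fragments, probe):
    """Order fragments from longest to shortest and report, for each,
    its length and whether the probe could hybridize to either strand.

    Each fragment is given as one strand of the duplex. The probe binds
    that strand if it contains the probe's reverse complement, and binds
    the opposite strand if the given strand contains the probe itself.
    """
    probe = probe.upper()
    target = reverse_complement(probe)
    ordered = sorted(fragments, key=len, reverse=True)
    return [(len(f), target in f.upper() or probe in f.upper())
            for f in ordered]


# Hypothetical fragments and probe.
fragments = ["GGATCCAAGT", "TTACGGATCGATTACG", "CCGGAA"]
bands = southern_blot(fragments, "AATCGATC")
```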

Polymerase chain reaction (PCR) is a technique that allows amplification of a specific section of DNA. It is commonly used in a wide range of applications, such as cloning, mutation detection and analysis, sequencing, forensics, paternity testing, pathogen detection, genotyping, gene expression analysis, and gene therapy. Each cycle involves denaturation to separate the strands, annealing (primers base-pair with their complementary target sequences in the DNA), and extension, in which DNA polymerase replicates the target region.[17][18]
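
The amplification achieved by repeated cycles can be estimated with a simple model. The sketch below assumes that each cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 corresponds to ideal doubling; this model and the parameter names are illustrative assumptions, not values taken from the text.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate target copies after a number of PCR cycles.

    Assumes copies = initial_copies * (1 + efficiency) ** cycles, with
    efficiency between 0.0 (no amplification) and 1.0 (perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1 + efficiency) ** cycles


# One template molecule after 30 ideal cycles.
ideal_30_cycles = pcr_copies(1, 30)
```

Under ideal doubling, a single template molecule yields roughly a billion copies after 30 cycles, which is why PCR can detect very small amounts of starting DNA.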

DNA sequencing is a laboratory technique used to identify the exact sequence of bases (adenine, thymine, cytosine, and guanine) in a DNA molecule. Because the DNA sequence is used to make RNA and proteins, this information is useful for investigating the normal function of genes and how that function might be perturbed by mutations that change the DNA sequence.[19][20]
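
Once a sequence has been determined, comparing it with a reference is one simple way to locate changes. The sketch below reports single-base differences between two sequences of equal length; it does not handle insertions or deletions, which would require alignment. The function name and the example sequences are hypothetical.

```python
def point_differences(reference: str, sample: str):
    """Return (position, reference base, sample base) for every position
    at which two equal-length sequences differ. Positions are 0-based."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i, ref, alt) for i, (ref, alt) in enumerate(pairs) if ref != alt]


# Hypothetical reference and sample reads.
reference = "ATGGTGCACCTG"
sample = "ATGGTGCACCTA"
differences = point_differences(reference, sample)
```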

Clinical Significance

Nucleoside analogs are used as chemotherapeutic agents against viral and other diseases because they inhibit DNA replication or transcription (RNA synthesis). These analogs are competitive inhibitors of DNA polymerases and cause termination of the growing nucleotide chain when incorporated into a nucleic acid. Acyclovir is a guanosine analog that is incorporated into the growing viral DNA chain by the DNA polymerase of the herpes simplex virus and thereby inhibits replication of the viral DNA. Azidothymidine (AZT) is a synthetic pyrimidine nucleoside analog used against the human immunodeficiency virus (HIV); it terminates the viral DNA chain by inhibiting reverse transcriptase, the enzyme responsible for reverse transcription of viral RNA into DNA. Cytarabine is a pyrimidine analog used for the treatment of hematological malignancies such as acute lymphocytic leukemia, chronic myelogenous leukemia, acute myeloid leukemia, and non-Hodgkin's lymphoma. Its mechanism of action involves the inhibition of DNA polymerase.[21]

RNA polymerase inhibitors: The anti-tuberculosis drug rifampicin is an effective inhibitor of RNA polymerase in Mycobacterium tuberculosis and works by inhibiting bacterial transcription [PMID: 28392175]. Alpha-amanitin, a deadly toxin from the death cap mushroom (Amanita phalloides), is a cyclic peptide that inhibits eukaryotic RNA polymerase II and blocks gene transcription [PMID: 26375431].

Beta-thalassemia is a blood disorder in which there is a deficiency of hemoglobin caused by molecular defects. Hemoglobin is a tetramer composed of two alpha and two beta chains. Gene defects such as insertions, deletions, or substitutions can affect the transcription, processing, and translation of beta-globin mRNA and result in deficient production of the beta-globin protein.[22]

Gene therapy: DNA sequencing has become much more effective in recent years, opening the possibility of gene therapy. Gene therapy aims to treat various diseases by replacing an incorrect or mutated gene sequence with the correct sequence so that the right protein is expressed. Its application is currently limited, but it could bring a significant change in the medical treatment of conditions such as recessive gene disorders and acquired genetic diseases like cancer.[23][24][25][26][27][28]

Article Details

Article Authors: Anthony Mercadante, Manjari Dimri

Article Editor: Shamim Mohiuddin

8/28/2020 8:51:22 AM



Achar YJ, Foiani M. Coordinating Replication with Transcription. Advances in experimental medicine and biology. 2017 [PubMed PMID: 29357070]

O'Donnell M, Langston L, Stillman B. Principles and concepts of DNA replication in bacteria, archaea, and eukarya. Cold Spring Harbor perspectives in biology. 2013 Jul 1 [PubMed PMID: 23818497]

Kang S, Kang MS, Ryu E, Myung K. Eukaryotic DNA replication: Orchestrated action of multi-subunit protein complexes. Mutation research. 2018 May [PubMed PMID: 28501329]

Makurath MA, Whitley KD, Nguyen B, Lohman TM, Chemla YR. Regulation of Rep helicase unwinding by an auto-inhibitory subdomain. Nucleic acids research. 2019 Jan 23 [PubMed PMID: 30690484]

Jain R, Aggarwal AK, Rechkoblit O. Eukaryotic DNA polymerases. Current opinion in structural biology. 2018 Dec [PubMed PMID: 30005324]

Kelly T. Historical Perspective of Eukaryotic DNA Replication. Advances in experimental medicine and biology. 2017 [PubMed PMID: 29357051]

Bernardes de Jesus B, Blasco MA. Telomerase at the intersection of cancer and aging. Trends in genetics : TIG. 2013 Sep [PubMed PMID: 23876621]

Alfonzo JD. Post-transcriptional RNA modification methods. Methods (San Diego, Calif.). 2016 Sep 1 [PubMed PMID: 27600834]

Zhao BS, Roundtree IA, He C. Post-transcriptional gene regulation by mRNA modifications. Nature reviews. Molecular cell biology. 2017 Jan [PubMed PMID: 27808276]

Southern EM. Detection of specific sequences among DNA fragments separated by gel electrophoresis. Journal of molecular biology. 1975 Nov 5 [PubMed PMID: 1195397]

Chmielecki J, Meyerson M. DNA sequencing of cancer: what have we learned? Annual review of medicine. 2014 [PubMed PMID: 24274178]

Ciccolini J, Serdjebi C, Le Thi Thu H, Lacarelle B, Milano G, Fanciullino R. Nucleoside analogs: ready to enter the era of precision medicine? Expert opinion on drug metabolism & toxicology. 2016 Aug [PubMed PMID: 27218825]

Thein SL. The molecular basis of β-thalassemia. Cold Spring Harbor perspectives in medicine. 2013 May 1 [PubMed PMID: 23637309]

Contesse MG, Valentine JE, Wall TE, Leffler MG. The Case for the Use of Patient and Caregiver Perception of Change Assessments in Rare Disease Clinical Trials: A Methodologic Overview. Advances in therapy. 2019 May [PubMed PMID: 30879250]

Pearlman R, Haraldsdottir S, de la Chapelle A, Jonasson JG, Liyanarachchi S, Frankel WL, Rafnar T, Stefansson K, Pritchard CC, Hampel H. Clinical characteristics of patients with colorectal cancer with double somatic mismatch repair mutations compared with Lynch syndrome. Journal of medical genetics. 2019 Jul [PubMed PMID: 30877237]

Cartier N, Hacein-Bey-Abina S, Bartholomae CC, Veres G, Schmidt M, Kutschera I, Vidaud M, Abel U, Dal-Cortivo L, Caccavelli L, Mahlaoui N, Kiermer V, Mittelstaedt D, Bellesme C, Lahlou N, Lefrère F, Blanche S, Audit M, Payen E, Leboulch P, l'Homme B, Bougnères P, Von Kalle C, Fischer A, Cavazzana-Calvo M, Aubourg P. Hematopoietic stem cell gene therapy with a lentiviral vector in X-linked adrenoleukodystrophy. Science (New York, N.Y.). 2009 Nov 6 [PubMed PMID: 19892975]