The ability of deoxyribonucleic acid (DNA) to function as the material through which genetic information is stored and transmitted is a direct result of its elegant structure. In their seminal 1953 paper, Watson and Crick unveiled two aspects of DNA’s structure, namely the pairing of the nucleotide bases in a complementary fashion (i.e., adenine with thymine, and cytosine with guanine) and its double-helical nature. These insights into DNA’s structure made sense of previous observations, such as the equivalent ratios of purines and pyrimidines found in the molecule, and provided a framework for the subsequent elucidation of the mechanism of DNA replication. Ultimately, the remarkable structure of DNA, from the nucleotide up to the chromosome, plays a crucial role in its biological function.
Issues of Concern
The primary issue of concern regarding the structure of DNA arises when that structure is changed, or mutated, such that the proteins encoded by the mutated DNA are altered in a manner that adversely affects the survival of the cell or organism. Mutations in DNA structure can take many forms, such as large or small insertions or deletions of base pairs, or inversions and insertions of whole DNA segments between or within chromosomes. As a consequence of these aberrant changes in DNA structure, the proteins encoded by genes with mutated sequences may carry modifications in their amino acid sequence that alter their function, leading to adverse consequences in the cell.
One chief difference between DNA structure in prokaryotes versus eukaryotes is that prokaryotic DNA molecules are circular and thus do not have free 5’ and 3’ ends. In contrast, the ends of eukaryotic nuclear DNA molecules do not connect and are thus “free.” Circular molecules of DNA are also present in eukaryotic mitochondria and chloroplasts, evidence that supports the endosymbiotic theory of eukaryotic evolution. Additionally, prokaryotes typically have one main circular chromosome, while eukaryotes have many linear chromosomes of varying sizes. To compact its DNA so that it fits inside the cell, a prokaryote relies on supercoiling. However, because eukaryotes have much more DNA than prokaryotes (for example, about 3234 mega-base pairs in the human genome versus about 4.4 mega-base pairs in a typical bacterial genome), they need a different strategy to ensure that their DNA, which in a single human cell would be about two meters long if stretched from end to end, fits inside a microscopic space. Specifically, this is done by sequential levels of coiling, starting with DNA wrapping around histone proteins to form a structure known as a nucleosome, then nucleosomes coiling to form chromatin fibers, and then chromatin further condensing into densely packed chromosomes.
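The two-meter figure can be checked with a short calculation. The sketch below assumes the 0.34 nm rise per base pair of B-form DNA and a diploid cell carrying two copies of the genome; the constant and function names are illustrative, not taken from any library.

```python
# Rough length of the DNA in one human cell, assuming B-form geometry.
RISE_PER_BP_NM = 0.34        # axial rise per base pair in B-form DNA
HAPLOID_GENOME_BP = 3234e6   # 3234 mega-base pairs, the figure quoted in the text
COPIES_PER_CELL = 2          # assumption: a diploid cell holds two genome copies


def contour_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Return the end-to-end length, in meters, of a fully extended double helix."""
    return base_pairs * rise_nm * 1e-9  # convert nanometers to meters


haploid_m = contour_length_m(HAPLOID_GENOME_BP)
diploid_m = COPIES_PER_CELL * haploid_m
print(f"haploid: {haploid_m:.2f} m, diploid: {diploid_m:.2f} m")
```

Under these assumptions, one genome copy is a little over one meter long, so two copies come to roughly two meters, consistent with the estimate in the text.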
A molecule of DNA is made up of two long polynucleotide chains consisting of subunits known as nucleotides. A nucleotide is composed of a nitrogenous base, a pentose sugar, and at least one phosphate group (Figure 1a). In the case of DNA, the sugar is 2’-deoxyribose, which has no hydroxyl group attached to its 2’ (pronounced “two prime”) carbon; in contrast, the pentose sugar of RNA is not reduced (deoxygenated) at the 2’ position and retains its hydroxyl group there, making it ribose. Covalently bonded to the 5’ carbon of 2’-deoxyribose is a phosphate group. Since the 2’-deoxyribose and phosphate group are always present, what distinguishes the four DNA nucleotides is the nitrogenous base each one incorporates.
There are four main nitrogenous bases that a nucleotide can incorporate, two of which are purines and two of which are pyrimidines (Figure 1b). Both purines and pyrimidines are heterocyclic aromatic compounds: their carbon-based rings contain nitrogen atoms, which are important for the hydrogen bonding that holds the two strands of a DNA molecule together. However, while pyrimidines are six-membered rings, purines consist of a five-membered ring fused to a six-membered ring. The two pyrimidines found in DNA are thymine (T) and cytosine (C), while the two purines are adenine (A) and guanine (G). While the different purines and pyrimidines differ slightly in structure, their functional groups are attached to the same basic heterocyclic form. These nitrogenous bases are covalently bonded via a nitrogen atom to the 1’ carbon of the deoxyribose sugar in a nucleotide (Figure 1a).
Although four major nitrogenous bases make up the nucleotides of DNA, other, less common modified bases have been found to exist in nature. The most common modified bases found in bacterial genomes are 5-methylcytosine, N6-methyladenine, and N4-methylcytosine. These have been shown to protect DNA from restriction enzymes, whose function is to cleave DNA at specific sites. However, the only modified base found in all eukaryotic genomes is 5-methylcytosine.
Each strand of DNA is made up of a string of nucleotide subunits linked at their sugar moieties (Figure 2a). Specifically, nucleotides in a strand of DNA are bound together via ester bonds between the phosphate group attached to their 5’ carbon and the hydroxyl group on the 3’ carbon of an adjacent nucleotide. This bond is known as a phosphodiester bond, and it forms via a condensation reaction during DNA synthesis. As a result, each strand of a DNA molecule has a series of nucleotides with their 5’ phosphate and 3’ hydroxyl groups participating in phosphodiester bonds. Each strand of a linear DNA molecule, such as a eukaryotic chromosome, has a “free” 5’ phosphate group on one end, not bonded to a hydroxyl group, and a “free” 3’ hydroxyl group on the other end, not bonded to a phosphate group. This asymmetry has led to the adoption of the convention whereby DNA is read in a particular direction, namely from its 5’ end to its 3’ end. The sequence of nucleotides that make up a molecule of DNA is referred to as its primary structure.
A DNA molecule consists of two of these chains of polymerized nucleotides running side-by-side, joined together by hydrogen bonds that form between their nitrogenous bases (Figure 2a). Notably, the nucleotides bond together in a very specific fashion, with A pairing with T, and G pairing with C; A and T pair via two hydrogen bonds, and C and G via three. Because every purine pairs with a pyrimidine, these specific pairings result in about a 1 to 1 ratio of pyrimidines and purines in the DNA of any given cell, a concept known as Chargaff’s rule. This scheme of pairing is referred to as complementary base-pairing and is the most energetically favorable pairing possible. Additionally, DNA is structured so that the sugars and phosphates of each strand are on the outside, while the bases hydrogen bond on the inside, resulting in what is known as the sugar-phosphate backbone. Thus, what arises is two chains of sugar-phosphate backbones running side-by-side with complementary paired nitrogenous bases hydrogen bonding between them. Importantly, the two strands of a DNA molecule run in an antiparallel fashion, so that the 5’ end of one strand lies opposite the 3’ end of the other. This base pairing of nucleotides between the two strands of a single DNA molecule is referred to as DNA’s secondary structure.
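The pairing rules above lend themselves to a short sketch. The example below derives the antiparallel partner strand (read 5’ to 3’), counts the hydrogen bonds holding the duplex together, and tallies purines and pyrimidines across both strands. The function names and the sample sequence are illustrative assumptions, not drawn from the text or from any particular library.

```python
# Watson-Crick pairing rules: A with T (2 hydrogen bonds), G with C (3).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3' (hence the reversal)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds between a strand and its complement."""
    return sum(H_BONDS[base] for base in strand)


def purine_pyrimidine_counts(strand: str) -> tuple[int, int]:
    """Count purines (A, G) and pyrimidines (C, T) over both strands of the duplex."""
    duplex = strand + reverse_complement(strand)
    purines = sum(duplex.count(base) for base in "AG")
    pyrimidines = sum(duplex.count(base) for base in "CT")
    return purines, pyrimidines


top = "ATGCGTAC"  # hypothetical sequence, written 5' to 3'
print(reverse_complement(top))        # GTACGCAT
print(hydrogen_bonds(top))            # 20
print(purine_pyrimidine_counts(top))  # (8, 8)
```

Whatever sequence is chosen for one strand, the purine and pyrimidine counts over the whole duplex come out equal, because each purine is matched by a pyrimidine on the opposite strand.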
The three-dimensional shape of a DNA molecule, or its tertiary structure, is a right-handed double helix (Figure 2b). The hydrogen-bonded bases on each strand are stacked in parallel and run perpendicular to the sugar-phosphate backbone. As indicated by its x-ray diffraction pattern, the bases are regularly spaced 0.34 nm apart along the axis of the helix. Additionally, there are about ten pairs of bases per turn, as a complete turn of the helix is made every 3.4 nm. The helix has a +36-degree rotation per base pair (bp) and a diameter of 1.9 nm. When focusing on the backbone of the DNA helix, two helical grooves of different widths exist, known as the minor and major grooves (Figure 2b). The minor groove describes the space between the two antiparallel DNA strands where they run closest together, while the major groove describes the space where they are furthest apart. These specific dimensions describe the B form of DNA, the major form present in most stretches of DNA in a cell. This is in contrast to DNA’s much rarer A and Z forms. The A form is a right-handed double helix with less distance between the bases (0.256 nm), and thus more bases per turn (11 bp per turn) and a smaller helical rotation per base pair (+33 degrees). Z DNA is a left-handed double helix and is most prevalent in regions of the human genome where purines and pyrimidines alternate in succession (e.g., in a sequence such as GCGCGCGCGCG). DNA primarily takes the B form, rather than any other form, because it is the most energetically stable tertiary structure.
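The helical parameters quoted above are related by simple arithmetic: the pitch is the rise per base pair multiplied by the number of base pairs per turn, and the rotation per base pair is 360 degrees divided by the base pairs per turn. The sketch below works this out for the B and A forms using the values from the text; the `HelixForm` class is a hypothetical helper introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixForm:
    """Idealized helical parameters for one form of DNA."""
    name: str
    rise_nm: float       # distance between adjacent base pairs along the axis
    bp_per_turn: float   # base pairs in one complete turn

    @property
    def pitch_nm(self) -> float:
        """Axial length of one complete turn."""
        return self.rise_nm * self.bp_per_turn

    @property
    def rotation_deg(self) -> float:
        """Helical rotation contributed by each base pair."""
        return 360.0 / self.bp_per_turn


# Values as quoted in the text.
B_FORM = HelixForm("B", rise_nm=0.34, bp_per_turn=10.0)
A_FORM = HelixForm("A", rise_nm=0.256, bp_per_turn=11.0)

for form in (B_FORM, A_FORM):
    print(f"{form.name} form: pitch {form.pitch_nm:.1f} nm, "
          f"rotation {form.rotation_deg:.1f} degrees per bp")
```

For the B form this reproduces the 3.4 nm pitch and 36-degree rotation; for the A form the rotation comes to about 32.7 degrees, which rounds to the 33 degrees quoted.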
A notable property of DNA is the ease with which its two strands can be reversibly separated, a result of hydrogen bonds being relatively weak compared to covalent bonds. This is important because fundamental cellular processes such as DNA replication and the transcription of RNA rely on proteins being able to access individually separated strands of DNA. Thus, during these processes, proteins known as helicases move along the DNA molecule and unwind the two strands by disrupting the hydrogen bonding between bases. However, when the cellular processes requiring strand separation are complete, the complementary strands can easily re-anneal. This reversible separation can also be induced experimentally: heating a DNA molecule separates its strands, a process referred to as denaturation or “melting,” and cooling allows them to re-anneal.
One notable structural phenomenon of DNA tertiary structure is known as supercoiling, or the coiling of the larger, already coiled, DNA molecule. Specifically, in a DNA molecule that has its ends fixed, such as the circular DNA found in prokaryotes or the smaller DNA segments that make up a larger chromosome in eukaryotes, separation of the individual strands of DNA during cellular processes causes the DNA to twist up past the points of strand separation, leading to strain on the larger DNA structure. This transient over-winding of the larger DNA structure when separating individual strands is known as positive supercoiling (Figure 5). To compensate for this, every cell has enzymes, known as topoisomerases, that keep DNA actively underwound, resulting in perpetual negative supercoiling, where the larger DNA structure coils in a left-handed fashion. As a result, less energy is needed to separate the strands of DNA, and the molecule is kept primed for easy separation during transcription and DNA replication.
The unique structure of DNA is ultimately responsible for its function as the material that stores and transmits genetic information from one generation to the next. Specifically, the four nitrogenous bases that comprise the sequence of nucleotides in a DNA molecule enable an enormous amount of information storage in a minimal space. Additionally, while DNA’s sugar-phosphate backbone and helical structure make it more stable, less prone to damage, and more compact, the hydrogen bonds that hold the strands of DNA together keep it accessible for its biological functions, as they are individually weak but cumulatively strong. Also, the complementary base pairing of nucleotides in DNA enables accurate semiconservative replication, as each strand carries the same genetic information in complementary form and serves as an independent template during DNA replication.
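The storage claim can be made concrete with a rough estimate. With four possible bases, each position can encode at most log2(4) = 2 bits, and a sequence of n positions can take 4^n distinct values. The sketch below applies this to the 3234 mega-base-pair figure used earlier in the text. It is an upper bound under the assumption that every base is independent and equally likely, which real genomes do not satisfy; the function names are illustrative.

```python
import math

BASES = "ACGT"
BITS_PER_BASE = math.log2(len(BASES))  # 2.0 bits for a four-letter alphabet


def capacity_bits(n_bases: int) -> float:
    """Upper bound on the information carried by a sequence of n_bases positions."""
    return n_bases * BITS_PER_BASE


def distinct_sequences(n_bases: int) -> int:
    """Number of different sequences of the given length."""
    return len(BASES) ** n_bases


genome_bp = 3_234_000_000  # 3234 mega-base pairs, the figure quoted in the text
megabytes = capacity_bits(genome_bp) / 8 / 1e6

print(f"{distinct_sequences(10):,} possible 10-base sequences")
print(f"about {megabytes:.1f} MB per genome copy (upper bound)")
```

Only one strand is counted here, since the second strand is fully determined by the first through complementary base pairing and adds no independent information.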
Many pathologies are a direct result of gene mutations resulting in an altered protein structure. One clear example is sickle-cell anemia, a genetic disease inherited from one’s parents that predominates in individuals of African descent. This condition is a direct result of a single point mutation of an A to a T in the gene that codes for beta-globin, which changes the sixth amino acid in beta-globin’s polypeptide chain from glutamic acid to valine. Consequently, an individual homozygous for this mutation will have hemoglobin with mutated beta-globin subunits, known as HbS, that aggregate into crystalline arrays when deoxygenated. This aggregation results in the deformation of erythrocytes into a sickle-like shape, making them prone to blocking capillaries and leading to hemolytic anemia and episodes of vascular occlusion that result in often debilitating life-long pain as well as organ damage from reduced blood flow.
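The effect of a single-base change on the encoded protein can be illustrated with a minimal sketch. It assumes the commonly described codon change for this mutation, GAG (glutamic acid) to GTG (valine), and uses a short three-codon fragment chosen for illustration rather than a verified excerpt of the beta-globin gene. The codon table is deliberately limited to the three codons needed, and the function names are hypothetical.

```python
# Minimal codon table (coding-strand DNA codons); only the entries needed here.
CODON_TABLE = {"CCT": "Pro", "GAG": "Glu", "GTG": "Val"}


def translate(dna: str) -> list[str]:
    """Translate a coding-strand sequence codon by codon, ignoring trailing bases."""
    usable = len(dna) - len(dna) % 3
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


def point_mutation(dna: str, index: int, new_base: str) -> str:
    """Return a copy of the sequence with one base substituted at index (0-based)."""
    return dna[:index] + new_base + dna[index + 1:]


normal = "CCTGAGGAG"                      # illustrative fragment: Pro-Glu-Glu
mutant = point_mutation(normal, 4, "T")   # A to T in the middle codon: GAG -> GTG

print(translate(normal))  # ['Pro', 'Glu', 'Glu']
print(translate(mutant))  # ['Pro', 'Val', 'Glu']
```

A change to one base out of nine alters exactly one amino acid in the translated fragment, which mirrors how a single substitution in the gene yields the altered beta-globin subunit described above.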