Key terms

RNA (ribonucleic acid): Single-stranded nucleic acid that carries out the instructions coded in DNA
Central dogma of biology: The process by which the information in genes flows into proteins: DNA → RNA → protein
Polypeptide: A chain of amino acids
Codon: A sequence of three nucleotides that corresponds with a specific amino acid or start/stop signal during translation
Transcription: Process during which a DNA sequence of a gene is copied to make an RNA molecule
Translation: Process during which an mRNA molecule is used to assemble amino acids into polypeptide chains
Mutation: A change in a genetic sequence

Structure of RNA

DNA alone cannot account for the expression of genes. RNA is needed to help carry out the instructions in DNA.
Like DNA, RNA is made up of nucleotides, each consisting of a 5-carbon sugar, a phosphate group, and a nitrogenous base. However, there are three main differences between DNA and RNA:
  1. RNA uses the sugar ribose instead of deoxyribose.
  2. RNA is generally single-stranded instead of double-stranded.
  3. RNA contains uracil in place of thymine.
These differences help enzymes in the cell to distinguish DNA from RNA.
Image comparing the structure of single-stranded RNA with double-stranded DNA.
Comparison of RNA and DNA molecules. Image modified from Wikimedia, CC BY-SA 3.0.

Types of RNA

Messenger RNA (mRNA): Carries information from DNA in the nucleus to ribosomes in the cytoplasm
Ribosomal RNA (rRNA): Structural component of ribosomes
Transfer RNA (tRNA): Carries amino acids to the ribosome during translation to help build an amino acid chain

Central dogma of biology

A gene that encodes a polypeptide is expressed in two steps: transcription and translation. In this process, information flows from DNA → RNA → protein, a directional relationship known as the central dogma of molecular biology.

The genetic code

The first step in decoding genetic messages is transcription, during which a nucleotide sequence is copied from DNA to RNA. The next step, translation, is to join amino acids together to form a protein.
The order in which amino acids are joined together determines the shape, properties, and function of a protein.
RNA forms a language with just four nucleotide bases: adenine (A), cytosine (C), guanine (G), and uracil (U). The genetic code is read in three-base words called codons. Each codon corresponds to a single amino acid (or signals the starting and stopping points of a sequence).
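To make the idea of three-base "words" concrete, here is a minimal Python sketch. It counts how many distinct codons four bases can form and splits an example mRNA string into codons. The function name split_codons and the example string are illustrative assumptions, not part of any standard library.

```python
from itertools import product

RNA_BASES = "ACGU"

# Every possible three-base "word" built from the four RNA bases.
ALL_CODONS = ["".join(bases) for bases in product(RNA_BASES, repeat=3)]
print(len(ALL_CODONS))  # 64, i.e. 4 * 4 * 4


def split_codons(mrna):
    """Split an mRNA string into consecutive three-base codons.

    Leftover bases at the end (fewer than three) are ignored.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


print(split_codons("AUGAUCUCGUAA"))  # ['AUG', 'AUC', 'UCG', 'UAA']
```

Because there are 64 codons but far fewer amino acids, several codons can specify the same amino acid.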
Genetic code table. Each three-letter sequence of mRNA nucleotides corresponds to a specific amino acid, or to a stop codon. UGA, UAA, and UAG are stop codons. AUG is the codon for methionine, and is also the start codon.
Codon chart. Image from OpenStax, CC BY 3.0.
The codon table may look kind of intimidating at first. Fortunately, it's organized in a logical way, and it's not too hard to use once you understand this organization.
To see how the codon table works, let's walk through an example. Suppose that we are interested in the codon CAG and want to know which amino acid it specifies.
  1. First, we look at the left side of the table. The axis on the left side refers to the first letter of the codon, so we find C along the left axis. This tells us the (broad) row of the table in which our codon will be found.
  2. Next, we look at the top of the table. The upper axis refers to the second letter of the codon, so we find A along the upper axis. This tells us the column of the table in which our codon will be found.
The row and column from steps 1 and 2 intersect in a single box in the codon table, one containing four codons. It's often easiest to simply look at these four codons and see which one is the one you're looking for.
If you want to use the structure of the table to the maximum, however, you can use the third axis (on the right side of the table), which refers to the third letter of the codon. By finding the third nucleotide of the codon on this axis, you can identify the exact row within the box where your codon is found. For instance, if we look for G on this axis in our example above, we find that CAG encodes the amino acid glutamine (Gln).
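The lookup steps above can be sketched in Python. This is only an illustration: it assumes the chart's axes list the bases in the order U, C, A, G (a common layout), and it uses one-letter amino acid codes with * for stop. The names TABLE and box are hypothetical helpers.

```python
from itertools import product

BASES = "UCAG"  # assumed axis order: first, second, and third letter

# Standard genetic code, listed in the same U, C, A, G order.
# One-letter amino acid codes; "*" marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"  # codons starting with U
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def box(first, second):
    """Return the four codons in the box where a row and column meet."""
    return {first + second + third: TABLE[first + second + third]
            for third in BASES}


# Steps 1 and 2: row C, column A gives a box of four codons.
print(box("C", "A"))  # {'CAU': 'H', 'CAC': 'H', 'CAA': 'Q', 'CAG': 'Q'}

# Third axis: G picks out the exact codon.
print(TABLE["CAG"])   # Q, the one-letter code for glutamine (Gln)
```

The box for row C, column A holds two histidine codons (H) and two glutamine codons (Q), which is why the third letter is needed to finish the lookup.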

Transcription and translation

Simplified schematic of central dogma, showing the sequences of the molecules involved.
The two strands of DNA have the following sequences:
5'-ATGATCTCGTAA-3'
3'-TACTAGAGCATT-5'
Transcription of one of the strands of DNA produces an mRNA that nearly matches the other strand of DNA in sequence. However, due to a biochemical difference between DNA and RNA, the Ts of DNA are replaced with Us in the mRNA. The mRNA sequence is:
5'-AUGAUCUCGUAA-3'
Translation involves reading the mRNA nucleotides in groups of three, each of which specifies an amino acid (or provides a stop signal indicating that translation is finished).
AUG → Methionine
AUC → Isoleucine
UCG → Serine
UAA → "Stop"
Polypeptide sequence: (N-terminus) Methionine-Isoleucine-Serine (C-terminus)
In transcription, a DNA sequence is rewritten, or transcribed, into a similar RNA "alphabet." In eukaryotes, the RNA molecule must undergo processing to become a mature messenger RNA (mRNA).
In translation, the sequence of the mRNA is decoded to specify the amino acid sequence of a polypeptide. The name translation reflects that the nucleotide sequence of the mRNA must be translated into the completely different "language" of amino acids.
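The two steps can be sketched in Python as follows. This is a simplified model, not a description of the cellular machinery: it ignores RNA processing, assumes translation starts at the first base, and uses one-letter amino acid codes. The template strand below is an assumption, derived by complementing the codons in the example above (AUG, AUC, UCG, UAA), and the function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Each template DNA base pairs with a complementary RNA base (A pairs with U).
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5):
    """Build the mRNA (5' to 3') from a template strand written 3' to 5'."""
    return template_3_to_5.translate(DNA_TO_RNA)


def translate(mrna):
    """Read codons from the first base until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


template = "TACTAGAGCATT"  # assumed template strand, written 3' to 5'
mrna = transcribe(template)
print(mrna)             # AUGAUCUCGUAA
print(translate(mrna))  # MIS (methionine, isoleucine, serine)
```

Note that the mRNA matches the non-template DNA strand except that each T is replaced with U, as described above.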


Mutations

Sometimes cells make mistakes in copying their genetic information, causing mutations. Mutations can have no effect, or they can affect the way proteins are made and genes are expressed.


Substitutions

A substitution changes a single base pair by replacing one base with another.
There are three kinds of substitution mutations:
  • Silent mutations do not affect the sequence of amino acids during translation.
  • Nonsense mutations result in a stop codon where an amino acid should be, causing translation to stop prematurely.
  • Missense mutations change the amino acid specified by a codon.
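The three kinds of substitution can be told apart by comparing what a codon specifies before and after the change. Here is a minimal Python sketch under simplifying assumptions: it looks at a single codon at the mRNA level, and the function name classify_substitution is hypothetical. A change that turns a stop codon into an amino acid codon is not covered by the three categories above; this sketch would label it "missense".

```python
from itertools import product

BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon, position, new_base):
    """Classify replacing the base at `position` (0, 1, or 2) in a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    if after == before:
        return "silent"    # same amino acid as before
    if after == "*":
        return "nonsense"  # amino acid codon became a stop codon
    return "missense"      # a different amino acid is specified


print(classify_substitution("CAG", 2, "A"))  # silent: CAA is also glutamine
print(classify_substitution("CAG", 0, "U"))  # nonsense: UAG is a stop codon
print(classify_substitution("CAG", 2, "U"))  # missense: CAU is histidine
```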

Insertions and deletions

An insertion occurs when one or more bases are added to a DNA sequence. A deletion occurs when one or more bases are removed from a DNA sequence.
Because the genetic code is read in codons (three bases at a time), inserting or deleting bases may change the "reading frame" of the sequence. These types of mutations are called frameshift mutations.
A frameshift mutation “shifts” how a sequence of nucleotides is read as triplets (codons) during translation. This may, in turn, alter which amino acids are added to the polypeptide. In this example, the original reading frame of a gene encodes an mRNA with codons that specify the amino acid sequence: methionine (Met), isoleucine (Ile), arginine (Arg), and asparagine (Asn). A deletion of the 4th nucleotide (T) shifts the reading frame at the point of the deletion. This produces a new reading frame in the DNA template after the 3rd nucleotide. The mRNA of the new frame bears different codons past the point of the mutation (the first methionine-specifying codon remains unchanged). These codons specify the amino acid sequence: methionine (Met), tyrosine (Tyr), and glycine (Gly).
As this example illustrates, a frameshift mutation changes how nucleotides are interpreted as codons beyond the point of the mutation, and this, in turn, may change the amino acid sequence.
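The frameshift described above can be reproduced with a short Python sketch. The mRNA sequence used here is an assumption: the original figure's sequence is not given in the text, so the string below was chosen only because it is consistent with the amino acids named in the example. The deleted T in the DNA template corresponds to an A in the mRNA.

```python
from itertools import product

BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Read complete codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "AUGAUACGGAAU"              # assumed sequence: Met, Ile, Arg, Asn
shifted = original[:3] + original[4:]  # delete the 4th nucleotide

print(translate(original))  # MIRN (Met, Ile, Arg, Asn)
print(shifted)              # AUGUACGGAAU
print(translate(shifted))   # MYG (Met, Tyr, Gly)
```

Only the first codon (AUG) survives unchanged. The two bases left over at the end of the shifted sequence do not form a complete codon, so the sketch ignores them.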

Common mistakes and misconceptions

  • Amino acids are not made during protein synthesis. Some students think that the purpose of protein synthesis is to create amino acids. However, amino acids are not being made during translation; they are being used as building blocks to make proteins.
  • Mutations do not always have drastic or negative effects. Often people hear the term "mutation" in the media and understand it to mean that a person will have a disease or disfigurement. Mutations are the source of genetic variation, so although some mutations are harmful, most are unnoticeable, and some are even beneficial!
  • Insertions and deletions that are multiples of three nucleotides will not cause frameshift mutations. Rather, one or more amino acids will just be added to or deleted from the protein. Insertions and deletions that are not multiples of three nucleotides, however, can dramatically alter the amino acid sequence of the protein.
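The last point can be checked with a small Python sketch that compares the codons before and after a deletion. The sequence and the helper names (split_codons, delete_bases, shifts_reading_frame) are illustrative assumptions.

```python
def split_codons(mrna):
    """Split an mRNA string into complete three-base codons."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(mrna, start, count):
    """Remove `count` bases beginning at index `start` (0-based)."""
    return mrna[:start] + mrna[start + count:]


def shifts_reading_frame(indel_length):
    """An insertion or deletion shifts the frame unless it is a multiple of 3."""
    return indel_length % 3 != 0


original = "AUGAUACGGAAU"  # assumed example sequence
print(split_codons(original))                      # ['AUG', 'AUA', 'CGG', 'AAU']

# Deleting three bases removes one codon; the downstream codons are unchanged.
print(split_codons(delete_bases(original, 3, 3)))  # ['AUG', 'CGG', 'AAU']

# Deleting one base changes every codon after the deletion.
print(split_codons(delete_bases(original, 3, 1)))  # ['AUG', 'UAC', 'GGA']

print(shifts_reading_frame(3), shifts_reading_frame(1))  # False True
```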