The genetic code

The genetic code links groups of nucleotides in an mRNA to amino acids in a protein. Start codons, stop codons, reading frame.

Introduction

Have you ever written a secret message to one of your friends? If so, you may have used a code to keep the message hidden. For instance, you may have replaced the letters of each word with numbers or symbols, following a particular set of rules. In order for your friend to understand the message, they would need to know the code and apply the same set of rules, in reverse, to decode it.
Decoding messages is also a key step in gene expression, in which information from a gene is read out to build a protein. In this article, we'll take a closer look at the genetic code, which allows DNA and RNA sequences to be "decoded" into the amino acids of a protein.

Background: Making a protein

Genes that provide instructions for proteins are expressed in a two-step process.
  • In transcription, the DNA sequence of a gene is "rewritten" in RNA. In eukaryotes, the RNA must go through additional processing steps to become a messenger RNA, or mRNA.
  • In translation, the sequence of nucleotides in the mRNA is "translated" into a sequence of amino acids in a polypeptide (protein chain).
If this is a new concept for you, you may want to learn more by watching Sal's video on transcription and translation.
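As a rough illustration of the "rewriting" step, here's a minimal Python sketch. It assumes we're given the coding (non-template) strand of the gene, written 5' to 3', so the mRNA has the same sequence with U in place of T. The function name `transcribe` and the example sequence are just for illustration, and the sketch ignores the extra processing steps that eukaryotic RNAs go through.

```python
def transcribe(coding_strand_dna):
    """Rewrite a coding-strand DNA sequence (5' to 3') as mRNA by swapping T for U."""
    return coding_strand_dna.upper().replace("T", "U")


print(transcribe("ATGATCTCGTAA"))  # AUGAUCUCGUAA
```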

Codons

Cells decode mRNAs by reading their nucleotides in groups of three, called codons. Here are some features of codons:
  • Most codons specify an amino acid
  • Three "stop" codons mark the end of a protein
  • One "start" codon, AUG, marks the beginning of a protein and also encodes the amino acid methionine
Codons in an mRNA are read during translation, beginning with a start codon and continuing until a stop codon is reached. mRNA codons are read from 5' to 3', and they specify the order of amino acids in a protein from N-terminus (methionine) to C-terminus.
The mRNA sequence is:
5'-AUGAUCUCGUAA-3'
Translation involves reading the mRNA nucleotides in groups of three; each group specifies an amino acid (or provides a stop signal indicating that translation is finished).
5'-AUG AUC UCG UAA-3'
  • AUG: Methionine (Start)
  • AUC: Isoleucine
  • UCG: Serine
  • UAA: "Stop"
Polypeptide sequence: (N-terminus) Methionine-Isoleucine-Serine (C-terminus)
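The same decoding can be sketched in a few lines of Python. This is a simplified model, not a description of how a ribosome works: it assumes translation starts at the first AUG in the sequence and stops at the first in-frame stop codon. The names `CODON_TABLE` and `translate` are made up for this example, and amino acids are written with their standard one-letter codes (M = methionine, I = isoleucine, S = serine).

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the standard table order (UUU, UUC, UUA, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon (or the end of the sequence)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: translation is finished
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGAUCUCGUAA"))  # MIS
```

The output `MIS` reads N-terminus to C-terminus, matching Methionine-Isoleucine-Serine above.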

The genetic code table

The full set of relationships between codons and amino acids (or stop signals) is called the genetic code. The genetic code is often summarized in a table.
Genetic code table. Each three-letter sequence of mRNA nucleotides corresponds to a specific amino acid, or to a stop codon. UGA, UAA, and UAG are stop codons. AUG is the codon for methionine, and is also the start codon.
Image credit: "The genetic code," by OpenStax College, Biology (CC BY 3.0).
Notice that many amino acids are represented in the table by more than one codon. For instance, there are six different ways to "write" leucine in the language of mRNA (see if you can find all six).
An important point about the genetic code is that it's universal. That is, with minor exceptions, virtually all species (from bacteria to you!) use the genetic code shown above for protein synthesis.
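If you'd like to check your answer for leucine, here's a small Python sketch that builds the standard table and groups codons by amino acid. The variable names (`CODON_TABLE`, `codons_for`) are made up for this example, amino acids use one-letter codes (L = leucine, M = methionine), and "*" stands for a stop codon.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the standard table order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the 64 codons by the amino acid (or stop signal) they specify.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(codons_for["L"])  # ['UUA', 'UUG', 'CUU', 'CUC', 'CUA', 'CUG']
print(codons_for["*"])  # ['UAA', 'UAG', 'UGA']
print(codons_for["M"])  # ['AUG']
```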

Reading frame

To reliably get from an mRNA to a protein, we need one more concept: that of reading frame. Reading frame determines how the mRNA sequence is divided up into codons during translation.
That's a pretty abstract concept, so let's look at an example to understand it better. The mRNA below can encode three totally different proteins, depending on the frame in which it's read:
mRNA sequence: 5'-UCAUGAUCUCGUAAGA-3'
Read in Frame 1:
5'-UCA UGA UCU CGU AAG A-3'
Ser-STOP-Ser-Arg-Lys
Read in Frame 2:
5'-U CAU GAU CUC GUA AGA-3'
His-Asp-Leu-Val-Arg
Read in Frame 3:
5'-UC AUG AUC UCG UAA GA-3'
Met(Start)-Ile-Ser-STOP
The start codon's position ensures that Frame 3 is chosen for translation of the mRNA.
So, how does a cell know which of these proteins to make? The start codon is the key signal. Because translation begins at the start codon and continues in successive groups of three, the position of the start codon ensures that the mRNA is read in the correct frame (in the example above, in Frame 3).
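Here's a short Python sketch that reproduces the three readings above. It simply decodes every complete codon in each frame, without any start or stop logic, and then uses the position of the first AUG to pick the frame. The names `read_frame`, `CODON_TABLE`, and `ONE_TO_THREE` are made up for this example.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the standard table order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "STOP",
}


def read_frame(mrna, frame):
    """Decode every complete codon in the given frame (1, 2, or 3)."""
    offset = frame - 1  # Frame 1 starts at the first nucleotide
    codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
    return "-".join(ONE_TO_THREE[CODON_TABLE[codon]] for codon in codons)


mrna = "UCAUGAUCUCGUAAGA"
for frame in (1, 2, 3):
    print(f"Frame {frame}: {read_frame(mrna, frame)}")
# Frame 1: Ser-STOP-Ser-Arg-Lys
# Frame 2: His-Asp-Leu-Val-Arg
# Frame 3: Met-Ile-Ser-STOP

# The position of the first AUG picks out the frame used for translation.
print(mrna.find("AUG") % 3 + 1)  # 3
```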
Mutations (changes in DNA) that insert or delete one or two nucleotides can change the reading frame, causing an incorrect protein to be produced "downstream" of the mutation site:
Illustration shows a frameshift mutation in which the reading frame is altered by the deletion of two nucleotides.
Image credit: "The genetic code: Figure 3," by OpenStax College, Biology, CC BY 4.0.
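To see why the number of deleted nucleotides matters, here's a small Python sketch that compares a two-nucleotide deletion with a three-nucleotide deletion. The mRNA sequence is invented for this example, and so are the helper names `translate` and `delete`. For simplicity the deletion is applied directly to the mRNA, although the mutation itself would occur in the DNA. Amino acids are written with one-letter codes.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the standard table order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon (or the end of the sequence)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete(mrna, position, count):
    """Return the sequence with `count` nucleotides removed at `position` (0-based)."""
    return mrna[:position] + mrna[position + count:]


original = "AUGGCUAAAGAACUGUGGUAA"  # made-up: AUG GCU AAA GAA CUG UGG UAA

print(translate(original))                # MAKELW
print(translate(delete(original, 4, 2)))  # MERTVV: frame shifted, downstream codons all change
print(translate(delete(original, 3, 3)))  # MKELW: one amino acid lost, frame preserved
```

In this example the two-nucleotide deletion also shifts the original stop codon out of frame, so the sketch keeps reading until it runs out of sequence.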

How was the genetic code discovered?

The story of how the genetic code was discovered is a pretty cool and epic one. If you have some time, it's definitely worth reading about.
I always like to imagine how cool it would have been to be one of the people who discovered the basic molecular code of life. Although we now know the code, there are many other biological mysteries still waiting to be solved (perhaps by you!).
