The genetic code links groups of nucleotides in an mRNA to amino acids in a protein. Start codons, stop codons, reading frame.
Have you ever written a secret message to one of your friends? If so, you may have used a code to keep the message hidden. For instance, you may have replaced the letters of the word with numbers or symbols, following a particular set of rules. In order for your friend to understand the message, they would need to know the code and apply the same set of rules, in reverse, to decode it.
Decoding messages is also a key step in gene expression, in which information from a gene is read out to build a protein. In this article, we'll take a closer look at the genetic code, which allows DNA and RNA sequences to be "decoded" into the amino acids of a protein.
Background: Making a protein
Genes that provide instructions for proteins are expressed in a two-step process.
- In transcription, the DNA sequence of a gene is "rewritten" in RNA. In eukaryotes, the RNA must go through additional processing steps to become a messenger RNA, or mRNA.
- In translation, the sequence of nucleotides in the mRNA is "translated" into a sequence of amino acids in a polypeptide (protein chain).
If this is a new concept for you, you may want to learn more by watching Sal's video on transcription and translation.
Cells decode mRNAs by reading their nucleotides in groups of three, called codons. Here are some features of codons:
- Most codons specify an amino acid
- Three "stop" codons mark the end of a protein
- One "start" codon, AUG, marks the beginning of a protein and also encodes the amino acid methionine
Codons in an mRNA are read during translation, beginning with a start codon and continuing until a stop codon is reached. mRNA codons are read from 5' to 3', and they specify the order of amino acids in a protein from N-terminus (methionine) to C-terminus.
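As a rough illustration of the reading rule just described, here is a minimal Python sketch that scans an mRNA for the first AUG and reads codons until it hits a stop codon. The function name `translate` and the example sequence are made up for illustration. Amino acids are written with one-letter codes (`*` marks a stop), and simply picking the first AUG is a simplification of how real cells choose a start site.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G ordering.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA (written 5' to 3') from the first AUG to the first stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: nothing is translated in this simplified model
    protein = []
    # Step through the sequence three nucleotides at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the protein ends here
        protein.append(amino_acid)
    return "".join(protein)


print(translate("CCAUGGCUUGGUAACC"))  # MAW
```

In this hypothetical sequence, the two nucleotides before AUG and the two after UAA are not translated. The protein starts with methionine (M) at its N-terminus.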
The genetic code table
The full set of relationships between codons and amino acids (or stop signals) is called the genetic code. The genetic code is often summarized in a table.
Notice that many amino acids are represented in the table by more than one codon. For instance, there are six different ways to "write" leucine in the language of mRNA (see if you can find all six).
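If you'd like to check your answer, the short Python sketch below builds the standard codon table and groups the codons by the amino acid they specify. The variable names are our own and the amino acids use one-letter codes (L for leucine, M for methionine, `*` for stop), so treat it as one possible way to represent the table, not the only one.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

print(codons_for["L"])  # ['UUA', 'UUG', 'CUU', 'CUC', 'CUA', 'CUG']
print(codons_for["*"])  # ['UAA', 'UAG', 'UGA']
print(codons_for["M"])  # ['AUG']
```

The 64 possible codons cover 20 amino acids plus the stop signal, which is why most amino acids end up with more than one codon.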
An important point about the genetic code is that it's universal. That is, with minor exceptions, virtually all species (from bacteria to you!) use the genetic code shown above for protein synthesis.
To reliably get from an mRNA to a protein, we need one more concept: that of reading frame. Reading frame determines how the mRNA sequence is divided up into codons during translation.
That's a pretty abstract concept, so let's look at an example to understand it better. The mRNA below can encode three totally different proteins, depending on the frame in which it's read:
So, how does a cell know which of these proteins to make? The start codon is the key signal. Because translation begins at the start codon and continues in successive groups of three, the position of the start codon ensures that the mRNA is read in the correct frame (in the example above, in Frame 3).
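To make the idea of reading frames concrete, here is a small Python sketch that splits one mRNA into codons in each of the three frames. The sequence and the helper name `split_into_codons` are illustrative assumptions, chosen so that the start codon falls in Frame 3.

```python
def split_into_codons(mrna, frame):
    """Split an mRNA into complete codons, starting at frame 1, 2, or 3."""
    offset = frame - 1  # Frame 1 starts at the first nucleotide
    # Leftover nucleotides at the end that do not fill a codon are dropped.
    return [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]


mrna = "UCAUGAUCUCGUAAGA"  # hypothetical example sequence, written 5' to 3'

for frame in (1, 2, 3):
    print(f"Frame {frame}:", " ".join(split_into_codons(mrna, frame)))

# Frame 1: UCA UGA UCU CGU AAG
# Frame 2: CAU GAU CUC GUA AGA
# Frame 3: AUG AUC UCG UAA
```

Only Frame 3 begins with AUG and ends with a stop codon (UAA), so in this made-up example it is the frame a ribosome would be expected to use.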
Mutations (changes in DNA) that insert or delete one or two nucleotides can change the reading frame, causing an incorrect protein to be produced "downstream" of the mutation site:
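Here is a hedged Python sketch of a frameshift, using a made-up mRNA and a simplified `translate_from_start` helper (our own name) that reads from the first nucleotide. It compares the original sequence with a one-nucleotide deletion and with a three-nucleotide insertion. Amino acids are shown as one-letter codes.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_start(mrna):
    """Read codons from the first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "AUGGCUUGGAAAUAA"                     # AUG GCU UGG AAA UAA
deletion = original[:3] + original[4:]           # remove one nucleotide after AUG
insertion = original[:3] + "GGG" + original[3:]  # add three nucleotides after AUG

print(translate_from_start(original))   # MAWK
print(translate_from_start(deletion))   # MLGN
print(translate_from_start(insertion))  # MGAWK
```

The single deletion changes every codon after the mutation site and the original stop codon is no longer read in frame. The three-nucleotide insertion adds one amino acid but leaves the rest of the protein unchanged.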
How was the genetic code discovered?
The story of how the genetic code was discovered is a pretty cool and epic one. We've stashed our version in the pop-up below, so as not to distract you if you're in a hurry. However, if you have some time, it's definitely interesting reading.
I always like to imagine how cool it would have been to be one of the people who discovered the basic molecular code of life. Although we now know the code, there are many other biological mysteries still waiting to be solved (perhaps by you!).
Want to join the conversation?
- Are Glutamate (Glu) and Glutamine (Gln) interchangeable? Or is there something wrong with the example on reading the codon table? CAG codes for Gln, not Glu.(8 votes)
- How does the tRNA know when to use AUG as a start codon and when to code for methionine? Are there other influences?(6 votes)
- Excellent question!
Translation is quite a bit more complicated than this introductory material can cover.
The sequence of the mRNA around a potential start codon influences whether or not it will be used§. These sequences are bound by proteins that help guide the ribosome to assemble at the correct place to start translation.
(In fact, codons other than AUG are sometimes used as start codons!)
This is covered in a bit more detail in another article:
I also encourage you to look at some of the references for that section, which will help give you more detail on this highly complex process that is still being actively studied.
§Note: The mechanisms are very different in prokaryotic and eukaryotic organisms — they can also vary between different species and even for different genes!(5 votes)
- would it be possible to use the "coding language" of RNA to synthesize chemicals?(5 votes)
- Yes, proteins are made of amino acids which are coded within the DNA sequence, so yes, recombinant DNA may be used.
Also, there are already efforts to use DNA as a digital store of information:
- Why is AUG a start codon and UAG a stop codon?(5 votes)
- No one knows exactly why evolution chose which specific codons represent each amino acid. This likely happened in an arbitrary manner very early in evolution and has been maintained ever since.(5 votes)
- I have heard that the 3' end of mRNA is rich in stop codons so that, in case of a mutation, the peptide still gets released, but I am unable to find an article about that. Can someone confirm whether this is true?(4 votes)
- You are correct.
Usually, nucleotides in the mRNA channel downstream of the A site help determine the outcome at a stop codon.
The expected hierarchy in the intrinsic fidelity of the stop codons (UAA>UAG>>UGA) was observed, with highly influential effects on termination readthrough mediated by nucleotides at position +4 and position
There are also cases where a mutation changes a stop codon into a non-stop codon, so translation cannot stop normally.
- Why does leucine happen to have 6 ways to code while many other amino acids only have 2? Does it have to do with how essential it is?(4 votes)
- In the section, Reading Frame, frameshift mutations are mentioned.
Point mutations will shift the frame of reference.
The insertion or deletion of three (or a multiple of three) bases would insert or delete one or more codons or amino acids, without shifting the reading frame. But adding or removing amino acids from a polypeptide would still change it. How is this dealt with?(3 votes)
- How small "in frame" indels (insertions and deletions) are dealt with depends on many factors including where in the gene the indel happens — so the short answer is "it depends".
For example, if you disrupt the catalytic site of an enzyme the effect will probably be the same as if the protein was never produced at all — this is likely to lead to a complete loss (assuming the mutation is homozygous) of that enzyme activity — the effect on the cell could be anything from fatal to unnoticeable (depending on how critical that enzyme activity is in that cell).
On the other hand, some proteins have loops of amino acid sequences on their surfaces that do not appear to be critically important and making those loops a little longer or shorter might have little or no effect on the protein function.
(Note that we only use "point mutation" to refer to mutations that change a base — not for deletions of a single base pair.)(2 votes)
- How do mutations occur in the genetic code?(3 votes)
- Mutations are caused by mutagens: mutation-causing agents, including radiation, viruses, chemicals and more. These alter the molecular structure and composition of the DNA, causing a mutation in the genetic code.(2 votes)
- if there are 999 bases in an rna that codes for a protein with 333 amino acids and the base at position 901 is deleted such that the length of the rna becomes 998 bases, how many codons will be altered ?(2 votes)
- If you are reading forward from base 1 to 999, then base 901 is the first base in codon 301, so the shift affects 33 codons (301 through 333). Since there are multiple codons for a specific amino acid, there may or may not be "errors" in each of the amino acid choices.(3 votes)
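The arithmetic in this answer can be double-checked with a few lines of Python. This is only a sketch: the function name is made up, and it assumes codon 1 starts at base 1 and that every codon from the deletion onward counts as shifted.

```python
def codons_affected_by_deletion(total_bases, deleted_position):
    """Return (first shifted codon, number of shifted codons) for a 1-base deletion."""
    total_codons = total_bases // 3
    # Bases 1-3 form codon 1, bases 4-6 form codon 2, and so on.
    first_affected = (deleted_position - 1) // 3 + 1
    return first_affected, total_codons - first_affected + 1


first, count = codons_affected_by_deletion(999, 901)
print(first, count)  # 301 33
```

Whether each shifted codon actually changes its amino acid depends on the sequence, as noted above.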
- If the mRNA is coded from the 5' end to the 3', how is the code similar to that of the coding strand of the DNA? Since it reads from the other end, it should be the reverse of it right?(2 votes)
- Actually, the mRNA is transcribed from the template strand of the DNA, which is read from its 3' end to its 5' end. The coding strand is the other strand of the DNA helix. It runs 5' to 3', in the same direction as the mRNA, so it has the same sequence as the mRNA (with T in place of U).(2 votes)
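A small Python sketch may help show why the mRNA matches the coding strand. The DNA sequence and the function names here are hypothetical, and transcription is modeled very simply as building the complement of the template strand and using U instead of T.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(template):
    """Build the mRNA (5' to 3') from a template strand given 5' to 3'."""
    # The polymerase reads the template 3' to 5', so the new strand is the
    # reverse complement of the template, with U in place of T.
    return reverse_complement(template).replace("T", "U")


coding = "ATGGCTTGGTAA"                # coding strand, 5' to 3' (made-up example)
template = reverse_complement(coding)  # template strand, 5' to 3'
mrna = transcribe(template)

print(template)  # TTACCAAGCCAT
print(mrna)      # AUGGCUUGGUAA
print(mrna == coding.replace("T", "U"))  # True
```

Because the mRNA and the coding strand are both complementary to the template and both written 5' to 3', neither one is the reverse of the other.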