The genetic code
The genetic code links groups of nucleotides in an mRNA to amino acids in a protein. Start codons, stop codons, reading frame.
Have you ever written a secret message to one of your friends? If so, you may have used a code to keep the message hidden. For instance, you may have replaced the letters of the word with numbers or symbols, following a particular set of rules. In order for your friend to understand the message, they would need to know the code and apply the same set of rules, in reverse, to decode it.
Decoding messages is also a key step in gene expression, in which information from a gene is read out to build a protein. In this article, we'll take a closer look at the genetic code, which allows DNA and RNA sequences to be "decoded" into the amino acids of a protein.
Background: Making a protein
Genes that provide instructions for proteins are expressed in a two-step process.
- In transcription, the DNA sequence of a gene is "rewritten" in RNA. In eukaryotes, the RNA must go through additional processing steps to become a messenger RNA, or mRNA.
- In translation, the sequence of nucleotides in the mRNA is "translated" into a sequence of amino acids in a polypeptide (protein chain).
If this is a new concept for you, you may want to learn more by watching Sal's video on transcription and translation.
Cells decode mRNAs by reading their nucleotides in groups of three, called codons. Here are some features of codons:
- Most codons specify an amino acid
- Three "stop" codons mark the end of a protein
- One "start" codon, AUG, marks the beginning of a protein and also encodes the amino acid methionine
Codons in an mRNA are read during translation, beginning with a start codon and continuing until a stop codon is reached. mRNA codons are read from 5' to 3', and they specify the order of amino acids in a protein from N-terminus (methionine) to C-terminus.
Translation involves reading the mRNA nucleotides in groups of three; each group specifies an amino acid (or provides a stop signal indicating that translation is finished).
mRNA sequence: 5'-AUG AUC UCG UAA-3'
AUG: Methionine (Start); AUC: Isoleucine; UCG: Serine; UAA: "Stop"
Polypeptide sequence: (N-terminus) Methionine-Isoleucine-Serine (C-terminus)
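The decoding steps above can be sketched in a few lines of Python. This is only an illustration, not how a ribosome works: the names `CODON_TABLE` and `translate` are made up for this example, amino acids are written as one-letter codes, and `*` stands for a stop codon. The sketch assumes translation starts at the first AUG it finds.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate an mRNA (written 5' to 3') from the first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    # Step through the sequence three nucleotides at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: translation is finished
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGAUCUCGUAA"))  # MIS (Methionine-Isoleucine-Serine)
```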
The genetic code table
The full set of relationships between codons and amino acids (or stop signals) is called the genetic code. The genetic code is often summarized in a table.
Genetic code table. Each three-letter sequence of mRNA nucleotides corresponds to a specific amino acid, or to a stop codon. UGA, UAA, and UAG are stop codons. AUG is the codon for methionine, and is also the start codon.
Notice that many amino acids are represented in the table by more than one codon. For instance, there are six different ways to "write" leucine in the language of mRNA (see if you can find all six).
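If you'd like to check your answer, here is a small Python sketch that groups codons by the amino acid they encode. The variable names (`CODON_TABLE`, `codons_for`) are invented for this example, and amino acids are written as one-letter codes (`L` is leucine, `M` is methionine, `*` is stop).

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(sorted(codons_for["L"]))  # ['CUA', 'CUC', 'CUG', 'CUU', 'UUA', 'UUG']
print(codons_for["M"])          # ['AUG']
print(sorted(codons_for["*"]))  # ['UAA', 'UAG', 'UGA']
```

Of the 64 possible codons, 61 specify amino acids and 3 are stop signals, which is why most of the 20 amino acids end up with more than one codon.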
An important point about the genetic code is that it's universal. That is, with minor exceptions, virtually all species (from bacteria to you!) use the genetic code shown above for protein synthesis.
Reading frame
To reliably get from an mRNA to a protein, we need one more concept: that of reading frame. Reading frame determines how the mRNA sequence is divided up into codons during translation.
That's a pretty abstract concept, so let's look at an example to understand it better. The mRNA below can encode three totally different proteins, depending on the frame in which it's read:
mRNA sequence: 5'-UCAUGAUCUCGUAAGA-3'
Read in Frame 1:
5'-UCA UGA UCU CGU AAG A-3'
Read in Frame 2:
5'-U CAU GAU CUC GUA AGA-3'
Read in Frame 3:
5'-UC AUG AUC UCG UAA GA-3'
The start codon's position ensures that Frame 3 is chosen for translation of the mRNA.
So, how does a cell know which of these proteins to make? The start codon is the key signal. Because translation begins at the start codon and continues in successive groups of three, the position of the start codon ensures that the mRNA is read in the correct frame (in the example above, in Frame 3).
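To make the idea of reading frame concrete, here is a rough Python sketch that splits the example mRNA into codons in each of the three frames and then uses the position of AUG to pick the frame. The function name `codons_in_frame` is made up for this illustration, and leftover nucleotides that don't fill a complete codon are simply dropped.

```python
def codons_in_frame(mrna, offset):
    """Split an mRNA into complete codons, starting `offset` nucleotides in (0, 1, or 2)."""
    return [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]

mrna = "UCAUGAUCUCGUAAGA"

for offset in range(3):
    print(f"Frame {offset + 1}:", " ".join(codons_in_frame(mrna, offset)))
# Frame 1: UCA UGA UCU CGU AAG
# Frame 2: CAU GAU CUC GUA AGA
# Frame 3: AUG AUC UCG UAA

# The start codon's position fixes the frame that is actually used.
start = mrna.find("AUG")
print("Frame set by the start codon:", start % 3 + 1)  # Frame set by the start codon: 3
```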
Mutations (changes in DNA) that insert or delete one or two nucleotides can change the reading frame, causing an incorrect protein to be produced "downstream" of the mutation site:
Illustration shows a frameshift mutation in which the reading frame is altered by the deletion of two nucleotides.
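The effect of a frameshift can also be sketched in Python. This is a toy example, not the mutation shown in the illustration: the sequence and the `translate` helper are assumptions made for demonstration, amino acids are written as one-letter codes, and `*` marks a stop codon. It compares deleting one nucleotide (which shifts the frame) with deleting three (which removes one codon but keeps the frame).

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

original = "AUGAUCUCGUAA"                # AUG AUC UCG UAA
one_deleted = original[:3] + original[4:]    # remove 1 nucleotide: AUG UCU CGU AA
three_deleted = original[:3] + original[6:]  # remove 1 whole codon: AUG UCG UAA

print(translate(original))       # MIS
print(translate(one_deleted))    # MSR (frame shifted; the stop codon is no longer in frame)
print(translate(three_deleted))  # MS  (one amino acid lost, frame preserved)
```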
How was the genetic code discovered?
The story of how the genetic code was discovered is a pretty cool and epic one. We've stashed our version in the pop-up below, so as not to distract you if you're in a hurry. However, if you have some time, it's definitely interesting reading.
I always like to imagine how cool it would have been to be one of the people who discovered the basic molecular code of life. Although we now know the code, there are many other biological mysteries still waiting to be solved (perhaps by you!).
Want to join the conversation?
- Are Glutamate (Glu) and Glutamine (Gln) interchangeable? or there is something wrong with the example on reading the codon table, because CAG codes for Gln, not Glu.(9 votes)
- They are two different amino acids, so no, they cannot be used interchangeably.(6 votes)
- How does the tRNA know when to use AUG as a start codon and when to code for methionine? Are there other influences?(6 votes)
- Excellent question!
Translation is quite a bit more complicated than this introductory material can cover.
The sequence of the mRNA around a potential start codon influences whether or not it will be used§. These sequences are bound by proteins that help guide the ribosome to assemble at the correct place to start translation.
(In fact, codons other than AUG are sometimes used as start codons!)
This is covered in a bit more detail in another article:
I also encourage you to look at some of the references for that section, which will give you more detail on this highly complex process that is still being actively studied.
§Note: The mechanisms are very different in prokaryotic and eukaryotic organisms — they can also vary between different species and even for different genes!(5 votes)
- would it be possible to use the "coding language" of RNA to synthesize chemicals?(5 votes)
- Yes, proteins are made of amino acids which are coded within the DNA sequence, so yes, recombinant DNA may be used.
Also, there are already efforts to use DNA as a digital store of information:
- Why is AUG a start codon and UAG a stop codon?(5 votes)
- No one knows exactly why evolution chose which specific codons represent each amino acid. This likely happened in an arbitrary manner very early in evolution and has been maintained ever since.(5 votes)
- I have heard that the 3' end of mRNA is rich in stop codons so that, in case of a mutation, the peptide still gets released, but I am unable to find an article about that. Can someone confirm whether this is true?(4 votes)
- You are correct.
Nucleotides in the mRNA channel downstream of the A site usually help determine how termination proceeds.
The expected hierarchy in the intrinsic fidelity of the stop codons (UAA>UAG>>UGA) was observed, with highly influential effects on termination readthrough mediated by nucleotides at position +4 and position
There are also nonstop mutations, in which a stop codon is mutated so that translation cannot terminate normally.
- In the section, Reading Frame, frameshift mutations are mentioned.
Point mutations will shift the frame of reference.
The insertion or deletion of three (or a multiple of three) bases would insert or delete one or more codons, and so one or more amino acids, without shifting the reading frame. But adding or removing amino acids from a polypeptide would change it. How is this dealt with?(3 votes)
- How small "in frame" indels (insertions and deletions) are dealt with depends on many factors including where in the gene the indel happens — so the short answer is "it depends".
For example, if you disrupt the catalytic site of an enzyme the effect will probably be the same as if the protein was never produced at all — this is likely to lead to a complete loss (assuming the mutation is homozygous) of that enzyme activity — the effect on the cell could be anything from fatal to unnoticeable (depending on how critical that enzyme activity is in that cell).
On the other hand, some proteins have loops of amino acid sequences on their surfaces that do not appear to be critically important and making those loops a little longer or shorter might have little or no effect on the protein function.
(Note that we only use "point mutation" to refer to mutations that change a base — not for deletions of a single base pair.)(2 votes)
- So the genetic code is the mRNA sequence of bases and it starts from the 5' to the 3' and it is the coding strand. Now if we want to find the tRNA sequence, which is the template or the non-coding, for ACU, for example, we start at 3' to 5' and we write it as TGA? Is that the correct way or am I missing something?(2 votes)
- Just one correction. You do not write it TGA but UGA.
There is no Thymine in RNA, but Uracil. Everything else is right. :D
Let's study together.(2 votes)
- How many alleles are expressed when a B cell carries two alleles encoding immunoglobulin heavy and light chains?(2 votes)
- Only one.
This holds whether the mechanism is monoallelic expression or allelic exclusion of immunoglobulin chains.
Monoallelic expression means that one allele is always expressed at 100% and the other at 0%.
In contrast, immunoglobulin H and L chains show allelic exclusion. Even though both alleles are expressed, 50% each, only one can be found in the phenotype, not both. One silences the other.
- Are proteins made at the same time as new DNA? Does DNA unwind when it makes proteins?(1 vote)
- DNA that isn't being used is very tightly packaged, while DNA that is being used is unwound, so yes, in a sense, but your wording is slightly off. DNA unwinds to be transcribed into RNA, which eventually makes its way to a ribosome, where it is translated into protein. Don't forget the central dogma: DNA->RNA->protein. That middle molecule is essential.(3 votes)
- How do mutations occur in the genetic code?(2 votes)
- Mutations can arise from errors in DNA replication or be caused by mutagens (mutation-causing agents), including radiation, viruses, and chemicals. These alter the molecular structure and composition of the DNA, changing its sequence.(1 vote)