Eukaryotic pre-mRNA processing

5' cap and poly-A tail. Splicing, introns, and exons.

Key points:

  • When an RNA transcript is first made in a eukaryotic cell, it is considered a pre-mRNA and must be processed into a messenger RNA (mRNA).
  • A 5' cap is added to the beginning of the RNA transcript, and a 3' poly-A tail is added to the end.
  • In splicing, some sections of the RNA transcript (introns) are removed, and the remaining sections (exons) are stuck back together.
  • Some genes can be alternatively spliced, leading to the production of different mature mRNA molecules from the same initial transcript.

Introduction

Imagine that you run a book-making factory, and you've just printed up all the pages of your favorite book. Now that you have the pages, is the book ready to go? Well...books usually have front and back covers. So you might want to put those on. Also, were there any blank or messed-up pages made during printing? You should probably check for those and remove them before selling your books, or you might end up with some unhappy customers.
The steps we just talked about are pretty similar to what happens to RNA transcripts in the cells of your body. In humans and other eukaryotes, a freshly made RNA transcript (hot off the RNA polymerase "presses") is not quite ready to go. Instead, it's called a pre-mRNA and has to go through some processing steps to become a mature messenger RNA (mRNA) that can be translated into a protein. These include:
  • Addition of cap and tail molecules to the two ends of the transcript. These play a protective role, like a book's front and back covers.
  • Removal of "junk" sequences called introns. Introns are sort of like blank or messed-up pages made during a book's printing, which have to be removed in order for the book to be readable.
In this article, we'll take a closer look at the cap, tail, and splicing modifications that eukaryotic RNA transcripts receive, seeing how they're carried out and why they are important for making sure we get the right protein from our RNA.

Overview of pre-mRNA processing in eukaryotes

As a quick review, gene expression (the "reading out" of a gene to make a protein, or chunk of a protein) happens a little bit differently in bacteria and eukaryotes such as humans.
Left panel: eukaryotic cell. In the nucleus, a pre-mRNA is produced through transcription of a region of DNA from a linear chromosome. This transcript must undergo processing (splicing and addition of 5' cap and poly-A tail) while it is still in the nucleus in order to become a mature mRNA. The mature mRNA is exported from the nucleus to the cytosol, where it is translated at a ribosome to make a polypeptide.
Right panel: bacterium. The DNA takes the form of a circular chromosome and is located in the cytosol. While the DNA is being transcribed to make an RNA, the RNA (which is already considered an mRNA at this point) can associate with a ribosome and start being translated to make a polypeptide.
In bacteria, RNA transcripts are ready to act as messenger RNAs and get translated into proteins right away. In eukaryotes, things are a little more complex, though in a pretty interesting way. The molecule that's directly made by transcription in one of your (eukaryotic) cells is called a pre-mRNA, reflecting that it needs to go through a few more steps to become an actual messenger RNA (mRNA). These are:
  • Addition of a 5' cap to the beginning of the RNA
  • Addition of a poly-A tail (tail of A nucleotides) to the end of the RNA
  • Chopping out of introns, or "junk" sequences, and pasting together of the remaining, good sequences (exons)
Once it's completed these steps, the RNA is a mature mRNA. It can travel out of the nucleus and be used to make a protein.

5' cap and poly-A tail

Both ends of a pre-mRNA are modified by the addition of chemical groups. The group at the beginning (5' end) is called a cap, while the group at the end (3' end) is called a tail. Both the cap and the tail protect the transcript and help it get exported from the nucleus and translated on the ribosomes (protein-making "machines") found in the cytosol [1].
The 5' cap is added to the first nucleotide in the transcript during transcription. The cap is a modified guanine (G) nucleotide, and it protects the transcript from being broken down. It also helps the ribosome attach to the mRNA and start reading it to make a protein.
Image of a pre-mRNA with a 5' cap and 3' poly-A tail. The 5' cap is on the 5' end of the pre-mRNA and is a modified G nucleotide. The poly-A tail is on the 3' end of the pre-mRNA and consists of a long string of A nucleotides (only a few of which are shown).
How is the poly-A tail added? The 3' end of the RNA forms in kind of a bizarre way. When a sequence called a polyadenylation signal shows up in an RNA molecule during transcription, an enzyme chops the RNA in two at that site. Another enzyme adds about 100 to 200 adenine (A) nucleotides to the cut end, forming a poly-A tail. The tail makes the transcript more stable and helps it get exported from the nucleus to the cytosol.
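To make the cut-and-extend idea concrete, here's a minimal Python sketch. It treats the RNA as a string of letters, looks for the signal, throws away everything after it, and appends a run of A's. The function name `add_poly_a_tail`, the example transcript, and the choice to cut immediately after the signal are all assumptions made for illustration. AAUAAA is the signal sequence most commonly seen in real transcripts, but real cells cut a short distance downstream of it, so treat this as a cartoon rather than a model of the actual enzymes.

```python
# Toy model of 3' end formation: find the signal, cut, add a poly-A tail.
# Assumption: the cut happens right after the signal (real cells cut a bit downstream).
POLY_A_SIGNAL = "AAUAAA"


def add_poly_a_tail(transcript: str, tail_length: int = 150) -> str:
    """Return the transcript cut after the signal, with `tail_length` A's appended."""
    site = transcript.find(POLY_A_SIGNAL)
    if site == -1:
        raise ValueError("no polyadenylation signal found")
    cut = site + len(POLY_A_SIGNAL)           # position where the RNA is chopped in two
    return transcript[:cut] + "A" * tail_length


# Made-up transcript: the signal sits near the end, followed by RNA that gets discarded.
pre_mrna = "GGCAUGGCUUAAGCAAUAAAGUCUGUGUU"
tailed = add_poly_a_tail(pre_mrna, tail_length=150)

print(len(pre_mrna), "->", len(tailed))       # 29 -> 170
print(tailed[:20])                            # GGCAUGGCUUAAGCAAUAAA
```

The default of 150 A's is just a value picked from the middle of the 100 to 200 range mentioned above.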

RNA splicing

The third big RNA processing event that happens in your cells is RNA splicing. In RNA splicing, specific parts of the pre-mRNA, called introns, are recognized and removed by a protein-and-RNA complex called the spliceosome. Introns can be viewed as "junk" sequences that must be cut out so the "good parts version" of the RNA molecule can be assembled.
What are the "good parts"? The pieces of the RNA that are not chopped out are called exons. The exons are pasted together by the spliceosome to make the final, mature mRNA that is shipped out of the nucleus.
Diagram of a pre-mRNA showing exons and introns. Along the length of the pre-mRNA, there is an alternating pattern of exons and introns: Exon 1 - Intron 1 - Exon 2 - Intron 2 - Exon 3. Each consists of a stretch of RNA nucleotides. During splicing, the introns are removed from the pre-mRNA, and the exons are stuck together to form a mature mRNA that does not contain the intron sequences.
A key point here is that it's only the exons of a gene that encode a protein. Not only do the introns not carry information to build a protein, they actually have to be removed in order for the mRNA to encode a protein with the right sequence. If the spliceosome fails to remove an intron, an mRNA with extra "junk" in it will be made, and a wrong protein will get produced during translation.
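Here's a small Python sketch of that idea, using the Exon 1 - Intron 1 - Exon 2 - Intron 2 - Exon 3 layout. The sequences are made up, and the helper names `splice` and `codons` are hypothetical. The sketch assumes we already know where each intron starts and ends, which skips the hard part the spliceosome actually does (recognizing the introns).

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove each intron, given as a (start, end) pair, and join what is left."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])   # keep the exon before this intron
        position = end                            # skip over the intron itself
    exons.append(pre_mrna[position:])             # keep the final exon
    return "".join(exons)


def codons(rna: str) -> list:
    """Split an RNA into complete three-letter groups, starting at the beginning."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# Made-up sequences for illustration only.
exon1, intron1 = "AUGGCU", "GUAAGUCCAG"
exon2, intron2 = "GAACUU", "GUGAGUUUAG"
exon3 = "UGGUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3
introns = [(6, 16), (22, 32)]                     # start and end positions of each intron

mature = splice(pre_mrna, introns)
print(codons(mature))        # ['AUG', 'GCU', 'GAA', 'CUU', 'UGG', 'UAA']
print(codons(pre_mrna)[:5])  # ['AUG', 'GCU', 'GUA', 'AGU', 'CCA']
```

Notice that the unspliced version starts out the same but goes wrong as soon as the first intron begins: the extra letters are read as codons too, and they shift how everything after them is grouped.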

Alternative splicing

Why splice? We don't know for sure why splicing exists, and in some ways, it seems like a wasteful system. However, splicing does allow for a process called alternative splicing, in which more than one mRNA can be made from the same gene. Through alternative splicing, we (and other eukaryotes) can sneakily encode more different proteins than we have genes in our DNA.
In alternative splicing, one pre-mRNA may be spliced in either of two (or sometimes many more than two!) different ways. For example, in the diagram below, the same pre-mRNA can be spliced in three different ways, depending on which exons are kept. This results in three different mature mRNAs, each of which translates into a protein with a different structure.
Diagram of alternative splicing.
A sequence of DNA encodes a pre-mRNA transcript that contains five regions that may potentially be used as exons: Exon 1, Exon 2, Exon 3, Exon 4, and Exon 5. The exons are arranged in linear order along the pre-mRNA and have introns in between them.
In splicing event #1, all five exons are retained in the mature mRNA. It consists of Exon 1 - Exon 2 - Exon 3 - Exon 4 - Exon 5. When it is translated, it specifies Protein A, a protein with five domains: Coil 1 (specified by Exon 1), Coil 2 (specified by Exon 2), Loop 3 (specified by Exon 3), Loop 4 (specified by Exon 4), and Coil 5 (specified by Exon 5).
In splicing event #2, Exon 3 is not included in the mature mRNA. It consists of Exon 1 - Exon 2 - Exon 4 - Exon 5. When it is translated, it specifies Protein B, a protein with four domains: Coil 1 (specified by Exon 1), Coil 2 (specified by Exon 2), Loop 4 (specified by Exon 4), and Coil 5 (specified by Exon 5). It does not contain Loop 3 because Exon 3 is not present in the mRNA.
In splicing event #3, Exon 4 is not included in the mature mRNA. It consists of Exon 1 - Exon 2 - Exon 3 - Exon 5. When it is translated, it specifies Protein C, a protein with four domains: Coil 1 (specified by Exon 1), Coil 2 (specified by Exon 2), Loop 3 (specified by Exon 3), and Coil 5 (specified by Exon 5). It does not contain Loop 4 because Exon 4 is not present in the mRNA.
_Image credit: "DNA, alternative splicing," by the National Human Genome Research Institute (public domain)._
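The bookkeeping in the diagram can be written out as a short Python sketch. The exon-to-domain pairings and the three splicing events come straight from the diagram description above; the names `EXON_DOMAINS`, `SPLICING_EVENTS`, and `domains_for` are hypothetical, and the one-exon-per-domain mapping is a simplification that real genes don't always follow.

```python
# Which protein domain each exon specifies, as described in the diagram.
EXON_DOMAINS = {1: "Coil 1", 2: "Coil 2", 3: "Loop 3", 4: "Loop 4", 5: "Coil 5"}

# Which exons are kept in each of the three splicing events.
SPLICING_EVENTS = {
    "Protein A": [1, 2, 3, 4, 5],   # all five exons retained
    "Protein B": [1, 2, 4, 5],      # Exon 3 left out
    "Protein C": [1, 2, 3, 5],      # Exon 4 left out
}


def domains_for(exons_kept: list) -> list:
    """List the domains of the protein made from an mRNA with these exons."""
    return [EXON_DOMAINS[exon] for exon in exons_kept]


for protein, exons_kept in SPLICING_EVENTS.items():
    print(protein, "->", ", ".join(domains_for(exons_kept)))
# Protein A -> Coil 1, Coil 2, Loop 3, Loop 4, Coil 5
# Protein B -> Coil 1, Coil 2, Loop 4, Coil 5
# Protein C -> Coil 1, Coil 2, Loop 3, Coil 5
```

One pre-mRNA, three lists of exons, three different proteins: that's the whole trick of alternative splicing.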

Try it yourself: Splice the message

Your mission, should you choose to accept it: decode the following top-secret message. First, remove the "junk" letters, colored in purple and underlined. Second, put the remaining letters into groups of three, starting at the beginning.
THEDOGRAMAPQANANDAZAPTQMTETHEHAT
Have you given it a try?
  • If you remove the purple sequences, you should get this series of letters: THEDOGRANANDATETHEHAT
  • If you group the remaining letters into sets of three, you should get this message: THE DOG RAN AND ATE THE HAT
The process you just went through is basically what your cells must do when they express a gene. As we discussed earlier in the article, most eukaryotic pre-mRNAs contain "junk" sequences called introns, which are like the purple letters in the message. These sequences must be removed, and the meaningful sequences (exons), equivalent to the maroon letters in the message above, must be stuck back together to make a mature mRNA.
During translation, the mRNA sequence is read in groups of three nucleotides. Each three-letter "word" corresponds to an amino acid that's added to a polypeptide (protein or protein subunit). If an RNA hasn't been spliced, it will contain extra nucleotides that it shouldn't, leading to an incorrect protein "message." Something similar happens if we try to decode the message above without removing the purple letters:
THE DOG RAM APQ ANA NDA ZAP TQM TET HEH AT
Just as removing the purple letters from the sentence is key to ending up with the right message, so splicing is key to ensuring that an mRNA carries the right information (and directs production of the correct polypeptide).
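If you'd like to see the same exercise done by a computer, here's a minimal Python sketch. The positions in `junk_spans` are an assumption: they mark the two stretches (MAPQA and ZAPTQM) whose removal leaves a readable sentence, standing in for the purple letters. The helper names `remove_junk` and `group_in_threes` are hypothetical.

```python
def remove_junk(message: str, junk_spans: list) -> str:
    """Cut out each (start, end) stretch of junk letters and join the rest."""
    kept = []
    position = 0
    for start, end in sorted(junk_spans):
        kept.append(message[position:start])
        position = end
    kept.append(message[position:])
    return "".join(kept)


def group_in_threes(letters: str) -> list:
    """Read the letters three at a time, starting at the beginning."""
    return [letters[i:i + 3] for i in range(0, len(letters), 3)]


message = "THEDOGRAMAPQANANDAZAPTQMTETHEHAT"
junk_spans = [(8, 13), (18, 24)]   # assumed junk: "MAPQA" and "ZAPTQM"

spliced = remove_junk(message, junk_spans)
print(" ".join(group_in_threes(spliced)))   # THE DOG RAN AND ATE THE HAT
print(" ".join(group_in_threes(message)))   # THE DOG RAM APQ ANA NDA ZAP TQM TET HEH AT
```

The second line of output is what you get when the "splicing" step is skipped: the first two words survive, and everything after the first junk stretch turns to nonsense.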
