Eukaryotic gene transcription: Going from DNA to mRNA

Genes are stored deep inside a cell, in a locked room called the nucleus. Ribosomes, the machines that assemble proteins, live outside the nucleus, floating around in a soup of chemicals called the cytosol. This spatial separation presents a logistical hurdle for the cell. A ribosome needs the instructions in a gene to put the corresponding protein together, but genes are trapped inside the nucleus. How do the instructions in a gene get out of the nucleus and to the ribosome?
The solution is simple (if you ignore the details). The instructions in a gene (written in the language of DNA nucleotides) are transcribed into a portable gene, called an mRNA transcript. These mRNA transcripts escape the nucleus and travel to the ribosomes, where they deliver their protein assembly instructions. The creation of mRNA transcripts (the creation of these portable genes) is called gene transcription. Let’s learn about it.

An analogy for understanding transcription

Imagine that you are the owner of an Italian restaurant. You store all the recipes that your cooks use in one big book and every night, when the kitchen closes down, you store the book in your office for safekeeping. Now imagine that one Saturday afternoon the door to your office has a major malfunction, and you and the recipe book get locked inside. The restaurant opens for business in a few hours. You try calling every locksmith in town, but nobody is open on the weekend. How are you going to get the recipes from inside your locked office to the cooks on the outside so that they can make the dishes your customers order?
You come up with the following system. When a customer places an order for a particular dish, you have the waiter come knock on your door and tell you. You turn around and find the relevant recipe in the book, and then you write it down on a 3x5 notecard. Because space is tight and time is of the essence, you use some shorthand and abbreviations but make sure to include all the essential elements of the recipe. You then put the notecard in a sealable plastic bag (to protect it from damage in the kitchen) and slide it under the crack at the bottom of the door. The waiter takes the card to the cook waiting in the kitchen, and the cook has the information he needs to make the customer’s dish. Problem solved. (If you ignore the fact that you're still locked inside your office!)

The mechanics of transcription

In cells, transcription is the process that resembles copying a recipe onto a 3x5 card and sliding it under the office door. The 3x5 card, with the recipe written on it, is analogous to a messenger RNA transcript (mRNA transcript, for short). An mRNA transcript is a single strand of RNA that encapsulates the information contained in a gene. Think of an mRNA transcript as a portable gene: smaller and more mobile than the DNA sequence that it is built from, but containing the same information.

What does an mRNA transcript look like?

When learning about something new, it’s good to see if you can put it in terms of something you already understand. In the case of mRNA transcripts, the thing you already understand is a single strand of DNA (assuming you have read our article on DNA structure and function).
If you hold a picture of such a DNA strand in your mind, you can turn it into an mRNA transcript by making two changes.
  • First, add a hydroxyl group to the 2’ carbon of each deoxyribose. In biochemist speak, you need to hydroxylate the 2’ deoxyriboses.
  • Second, snip the methyl group off of every thymine that occurs in the nucleotide strand. In biochemist speak, you need to demethylate each thymine.
Hydroxylated deoxyribose is called ribose. Demethylated thymine is called uracil. A single strand of RNA is exactly like a single strand of DNA, in terms of the chemicals it is made out of, except that it uses ribose in place of deoxyribose and uracil in place of thymine.
It’s worth noting that cells don’t make mRNA transcripts by starting with a single strand of DNA and then making the changes we just described. Instead, they use a pre-existing supply of ribose and uracil, together with the other components of nucleotides, to make mRNA from scratch. We are simply suggesting that the best way to understand the chemical structure of mRNA is to start with a strand of DNA and make the two changes described.
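If it helps to see the two changes written out, here is a minimal sketch in Python. It assumes a strand is represented as a plain string of letters; the function name dna_strand_to_rna is made up for this illustration. As noted above, this mirrors the thought experiment, not how cells actually build RNA.

```python
def dna_strand_to_rna(dna: str) -> str:
    """Rewrite a single DNA strand as the RNA strand with the same sequence."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"not a DNA sequence: unexpected {sorted(unexpected)}")
    # In the letters, the only visible change is thymine (T) -> uracil (U).
    # The deoxyribose -> ribose change applies to the sugar of every
    # nucleotide, so it does not show up in the sequence text at all.
    return dna.replace("T", "U")


print(dna_strand_to_rna("ATGGCTTAA"))  # AUGGCUUAA
```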
As another way of wrapping your head around the subtle differences between DNA and RNA, have a look at the following chart.
                            DNA                                  RNA
bonds between nucleotides   phosphodiester                       phosphodiester
sugar in nucleotides        deoxyribose                          ribose
nucleobases                 adenine, thymine, guanine, cytosine  adenine, uracil, guanine, cytosine
primary function            information storage

How is an mRNA transcript made?

An mRNA transcript is made by an enzyme called RNA polymerase II. As you can tell from the name, the function of RNA polymerase II is broadly similar to that of DNA polymerase. The only high-level difference is in the building blocks used.
DNA polymerase uses a single strand of DNA as a template and synthesizes a strand of DNA. Each nucleotide in the synthesized DNA strand is complementary to the nucleotide in the template strand. RNA polymerase II also uses a strand of DNA as a template. Instead of using this template to make a complementary strand of DNA, it uses it to make a complementary strand of RNA — the mRNA transcript.
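The template-to-transcript step can be sketched in a few lines of Python. This is a toy model under stated assumptions: strands are plain strings written 5’ to 3’, and the names RNA_PARTNER and transcribe are invented for this example. Because paired strands run in opposite directions, the sketch reads the template backwards while pairing each DNA base with its RNA partner.

```python
# Base-pairing rules for building RNA on a DNA template:
# the DNA base on the left pairs with the RNA base on the right.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """Return the RNA strand complementary to a DNA template strand.

    Both strands are written 5' to 3'. Paired strands run in opposite
    directions, so the template is read from its last letter to its first.
    """
    return "".join(RNA_PARTNER[base] for base in reversed(template_dna.upper()))


print(transcribe("CATTGC"))  # GCAAUG
```

Notice that the result contains U wherever the template has an A, which is the building-block difference between the two polymerases showing up in the output.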

mRNA processing

Once RNA polymerase II is done, the mRNA transcript has to be processed before it can make its journey out of the nucleus and to the ribosome. Processing has two phases: protection and splicing.


During the protection phase, nucleotides are added to each end of the mRNA transcript to protect it from degradation that can occur outside of the nucleus. The 5’ end of a single G nucleotide is attached to the 5’ end of the transcript. This is called the 5’ cap. At the 3’ end of the transcript, a long sequence of A nucleotides is attached. This is called the poly-A tail. The 5’ cap and the poly-A tail protect the mRNA transcript from attack by enzymes in the cytoplasm called exonucleases, which target RNA molecules with exposed ends.
Think of this protection phase of processing in terms of our restaurant analogy. You know that you have to protect the 3x5 card with the recipe on it from the damage that might occur to it in the kitchen, so you put the notecard inside a plastic bag, in order to shield it from any water, oil, or other stray ingredients that could compromise the integrity of the ink the message is written in. The 5’ cap and poly-A tail have the same protective purpose.
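Here is a small Python sketch of the protection phase, under a few assumptions. The class ProtectedTranscript and the function protect are hypothetical names, and the default tail length of 20 is an arbitrary value chosen for illustration, not a figure from this article. The cap is kept in its own field because it is attached 5’ end to 5’ end and so is not just one more letter at the front of the sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectedTranscript:
    cap: str          # the single G nucleotide attached at the 5' end
    body: str         # the transcribed sequence itself
    poly_a_tail: str  # the run of A nucleotides attached at the 3' end


def protect(transcript: str, tail_length: int = 20) -> ProtectedTranscript:
    """Add a 5' cap and a poly-A tail to a transcript (toy model)."""
    if tail_length < 0:
        raise ValueError("tail_length must not be negative")
    return ProtectedTranscript(
        cap="G",
        body=transcript,
        poly_a_tail="A" * tail_length,
    )


protected = protect("AUGGCUUAA", tail_length=5)
print(protected.cap, protected.body, protected.poly_a_tail)  # G AUGGCUUAA AAAAA
```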


The other phase of mRNA processing is called splicing. The purpose of splicing is to remove the introns from the mRNA transcript. Introns are sequences of RNA that don’t contain any information about how to construct a protein.
Introns are snipped out of an mRNA transcript by a complex of RNA and proteins called a spliceosome. A spliceosome locates introns, cuts them out, and then fuses the remaining parts of the mRNA transcript back together. The parts of the mRNA transcript that aren’t spliced out by the spliceosome are called exons. In contrast to introns, exons are the parts of an mRNA transcript that actually contain assembly instructions for a protein. Many call the mRNA transcript that still contains introns pre-mRNA (or the primary transcript), and the intron-free transcript that the spliceosome produces mature mRNA.
Think of intron splicing in terms of our restaurant analogy. The recipes in the big book may contain extraneous information, such as where the recipe came from, its history in your family, or what other dishes or drinks it can be paired with. This sort of information isn’t relevant to your primary purpose. So you leave it out when you make the notecard, and only include the need-to-know information for making the dish. In the case of transcription, this need-to-know info is contained in the exons, and all the rest (the introns) can be left out. The result is a smaller and more mobile version of the mRNA transcript.
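The cut-and-rejoin step can be sketched in Python as follows. This is a toy model: it assumes the intron positions are already known and handed in as (start, end) pairs, whereas a real spliceosome has to locate them itself. The function name splice and the example sequence are made up for illustration.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Cut out each intron and join the remaining exons in order.

    Each intron is a (start, end) pair of 0-based positions, with `end`
    exclusive, as in Python slicing. Introns must not overlap.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end < start or end > len(pre_mrna):
            raise ValueError(f"invalid or overlapping intron: ({start}, {end})")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Two exons with one 12-letter intron between them (positions 6 to 17).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 18)]))  # AUGGCCUUUUAA
```

The output is shorter than the input, which is the "smaller and more mobile" result described above.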
Once an mRNA has been protected and spliced, it is ready to leave the nucleus and begin the second phase of protein synthesis, called translation.

Consider the following: RNA interference

In our restaurant example, what would happen if the notecard you slipped under the door never made it to the kitchen? The dish wouldn’t get made, because the cook wouldn’t have the recipe, right?
In recent years, researchers have used a version of this idea to develop a promising class of therapies for a number of formidable diseases. Falling under the name of RNA interference, these therapies disrupt the production of harmful proteins by intercepting and incapacitating mRNA transcripts before they make it to ribosomes. This prevents the corresponding proteins from being made in the first place!
A recently proposed treatment for Ebola represents perhaps the most spectacular application of RNA interference techniques.
Ebola, like most viruses, is basically a transcription machine. It contains a viral genome (essentially a very small and simple chromosome) together with a polymerase enzyme. Once the virus infiltrates one of your cells, the polymerase enzyme synthesizes mRNA transcripts for each of the genes in its genome. It then commandeers your own ribosomes and uses them to build its own proteins. Once you make these proteins for the Ebola virus, they direct the construction of new Ebola viruses.
One obvious idea for fighting Ebola would be to throw a monkey wrench into this whole process and stop replication before it ever gets off the ground.
Using a class of engineered molecules called short interfering RNA (siRNA, for short), scientists have shown that the viral mRNAs synthesized by the Ebola virus inside infected cells can be captured and destroyed before they are able to deliver their genetic messages to the ribosomes of a host cell. No Ebola mRNA, no Ebola proteins. No Ebola proteins, and Ebola loses its ability to replicate inside host cells!
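To get a feel for how an siRNA can single out one particular mRNA, here is a rough Python sketch. It assumes, as a simplification, that an siRNA recognizes its target purely by perfect base pairing, and it ignores the cellular machinery that actually carries out the capture and destruction. The sequences are invented for the example (they are not real Ebola sequences), and the names RNA_COMPLEMENT and find_target_sites are hypothetical.

```python
# RNA-to-RNA base-pairing rules.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_target_sites(mrna: str, sirna: str) -> list[int]:
    """Return the 0-based positions in `mrna` where `sirna` can pair perfectly.

    Both sequences are written 5' to 3'.
    """
    # The siRNA pairs with the stretch of mRNA that is its reverse complement.
    target = "".join(RNA_COMPLEMENT[base] for base in reversed(sirna))
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start)
        start = mrna.find(target, start + 1)
    return sites


viral_mrna = "AUGGCUACGUUAGC"  # made-up sequence
sirna = "AACGUAGC"             # made-up sequence, much shorter than a real siRNA
print(find_target_sites(viral_mrna, sirna))  # [3]
```

An mRNA with no matching stretch returns an empty list, which is the sketch’s version of a transcript that the siRNA leaves alone.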
As with all other new therapies, siRNA-based treatments for Ebola were initially validated in non-human animal models. However, during the latest outbreak of the virus in West Africa, and its subsequent spread to North America, authorities in the U.S. Food and Drug Administration took the radical step of issuing a so-called “compassionate use” exemption for an siRNA-based Ebola therapy called TKM-Ebola. While the details are sketchy, we know that TKM-Ebola was administered to several different patients, and it could have played a role in their subsequent recoveries.
This article is licensed under a CC-BY-NC-SA 4.0 license.