
Eukaryotic gene transcription: Going from DNA to mRNA

Genes are stored deep inside a cell, in a locked room called the nucleus. Ribosomes, the machines that assemble proteins, live outside the nucleus, floating around in a soup of chemicals called the cytosol. This spatial separation presents a logistical hurdle for the cell. A ribosome needs the instructions in a gene to put the corresponding protein together, but genes are trapped inside the nucleus. How do the instructions in a gene get out of the nucleus and to the ribosome?
The solution is simple (if you ignore the details). The instructions in a gene (written in the language of DNA nucleotides) are transcribed into a portable gene, called an mRNA transcript. These mRNA transcripts escape the nucleus and travel to the ribosomes, where they deliver their protein assembly instructions. The creation of mRNA transcripts (the creation of these portable genes) is called gene transcription. Let’s learn about it.

An analogy for understanding transcription

Imagine that you are the owner of an Italian restaurant. You store all the recipes that your cooks use in one big book and every night, when the kitchen closes down, you store the book in your office for safe keeping. Now imagine that one Saturday afternoon the door to your office has a major malfunction, and you and the recipe book get locked inside. The restaurant opens for business in a few hours. You try calling every locksmith in town, but nobody is open on the weekend. How are you going to get the recipes from inside your locked office to the cooks on the outside so that they can make the dishes your customers order?
You come up with the following system. When a customer places an order for a particular dish, you have the waiter come knock on your door and tell you. You turn around and find the relevant recipe in the book, and then you write it down on a 3x5 notecard. Because space is tight and time is of the essence, you use some shorthand and abbreviations but make sure to include all the essential elements of the recipe. You then put the notecard in a sealable plastic bag (to protect it from damage in the kitchen) and slide it under the crack at the bottom of the door. The waiter takes the card to the cook waiting in the kitchen, and the cook has the information he needs to make the customer’s dish. Problem solved. (If you ignore the fact that you're still locked inside your office!)

The mechanics of transcription

In cells, transcription is the process that resembles copying a recipe onto a 3x5 card and sliding it under the office door. The 3x5 card, with the recipe written on it, is analogous to a messenger RNA transcript (mRNA transcript, for short). An mRNA transcript is a single strand of RNA that encapsulates the information contained in a gene. Think of an mRNA transcript as a portable gene: smaller and more mobile than the DNA sequence that it is built from, but containing the same information.

What does an mRNA transcript look like?

When learning about something new, it’s good to see if you can put it in terms of something you already understand. In the case of mRNA transcripts, the thing you already understand is a single strand of DNA (assuming you have read our article on DNA structure and function).
If you hold a picture of such a DNA strand in your mind, you can turn it into an mRNA transcript by making two changes.
  • First, add a hydroxyl group to the 2’ carbon of each deoxyribose. In biochemist speak, you need to hydroxylate the 2’ deoxyriboses.
  • Second, snip the methyl group off of every thymine that occurs in the nucleotide strand. In biochemist speak, you need to demethylate each thymine.
Hydroxylated deoxyribose is called ribose. Demethylated thymine is called uracil. A single strand of RNA is exactly like a single strand of DNA, in terms of the chemicals it is made out of, except that it uses ribose in place of deoxyribose and uracil in place of thymine.
It’s worth noting that cells don’t make mRNA transcripts by starting with a single strand of DNA and then making the changes we just described. Instead, they use a pre-existing supply of ribose and uracil, together with the other components of nucleotides, to make mRNA from scratch. We are simply suggesting that the best way to understand the chemical structure of mRNA is to start with a strand of DNA and make the two changes described.
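If it helps to see the "two changes" idea in code, here is a minimal sketch in Python. It assumes a strand is written as a string of the letters A, C, G, and T; the function name `dna_strand_to_rna` is made up for this illustration. Only the thymine-to-uracil change shows up in the letters, because the sugar (deoxyribose versus ribose) is part of the backbone and isn't spelled out in a sequence string.

```python
def dna_strand_to_rna(dna: str) -> str:
    """Rewrite a single DNA strand in RNA letters.

    The order of the bases is unchanged; every thymine (T) becomes uracil (U).
    The deoxyribose-to-ribose change is not visible in a letter sequence.
    """
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"not DNA letters: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(dna_strand_to_rna("ATGCTTAG"))  # AUGCUUAG
```

As the text says, this is a way of picturing the structure, not a description of how cells actually build RNA.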
As another way of wrapping your head around the subtle differences between DNA and RNA, have a look at the following chart.
Property | DNA | RNA
bonds between nucleotides | phosphodiester | phosphodiester
sugar in nucleotides | deoxyribose | ribose
nucleobases | adenine, thymine, guanine, cytosine | adenine, uracil, guanine, cytosine
primary function | information storage |

How is an mRNA transcript made?

An mRNA transcript is made by an enzyme called RNA polymerase II. As you can tell from the name, the function of RNA polymerase II is broadly similar to that of DNA polymerase. At a high level, the main difference is in the building blocks used: RNA nucleotides rather than DNA nucleotides.
DNA polymerase uses a single strand of DNA as a template and synthesizes a strand of DNA. Each nucleotide in the synthesized DNA strand is complementary to the nucleotide in the template strand. RNA polymerase II also uses a strand of DNA as a template. Instead of using this template to make a complementary strand of DNA, it uses it to make a complementary strand of RNA — the mRNA transcript.
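The template-to-transcript step can be sketched in a few lines of Python. This is only an illustration under some assumptions: the template strand is written 3’ to 5’ from left to right, so that the transcript comes out 5’ to 3’ in the usual reading order, and the names `PAIRING` and `transcribe` are invented for the example.

```python
# Base-pairing rules for reading a DNA template and writing RNA.
# Note that an A in the template pairs with U (not T) in the transcript.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA strand complementary to a DNA template strand.

    The template is assumed to be written 3' to 5', so the returned
    transcript reads 5' to 3' from left to right.
    """
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Each letter in the output is the partner of the letter in the same position of the template, which is all that "complementary" means here.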

mRNA processing

Once RNA polymerase is done, the mRNA transcript has to be processed before it can make its journey out of the nucleus and to the ribosome. Processing has two phases: protection and splicing.


During the protection phase, nucleotide sequences are added to each end of the mRNA transcript to protect it from degradation that can occur outside of the nucleus. The 5’ end of a single G nucleotide is attached to the 5’ end of the transcript. This is called the 5’ cap. At the 3’ end of the transcript, a long sequence of A nucleotides is attached. This is called the poly-A tail. The 5’ cap and the poly-A tail protect the mRNA transcript from attack by enzymes in the cytoplasm called exonucleases, which degrade RNA molecules from their exposed ends.
Think of this protection phase of processing in terms of our restaurant analogy. You know that you have to protect the 3x5 card with the recipe on it from the damage that might occur to it in the kitchen, so you put the notecard inside a plastic bag, in order to shield it from any water, oil, or other stray ingredients that could compromise the integrity of the ink the message is written in. The 5’ cap and poly-A tail have the same protective purpose.
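As a rough picture of what the protection phase does to the sequence, here is a small Python sketch. The function name `add_cap_and_tail` and the default tail length are assumptions made for the illustration (the text above only says the tail is "long"), and a flat string cannot show that the cap G is attached through its own 5’ end.

```python
def add_cap_and_tail(transcript: str, tail_length: int = 12) -> str:
    """Return the transcript with a cap G in front and a run of A's behind.

    tail_length is an illustrative value, not a biological constant.
    The leading "G" stands in for the 5' cap; its unusual 5'-to-5'
    attachment is not represented in a plain string.
    """
    if tail_length < 1:
        raise ValueError("the poly-A tail needs at least one A")
    return "G" + transcript + "A" * tail_length


print(add_cap_and_tail("AUGCCUAAG", tail_length=5))  # GAUGCCUAAGAAAAA
```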


The other phase of mRNA processing is called splicing. The purpose of splicing is to remove the introns from the mRNA transcript. Introns are sequences of RNA that don’t contain any information about how to construct a protein.
Introns are snipped out of an mRNA transcript by a complex of small RNAs and proteins called a spliceosome. A spliceosome locates introns, cuts them out, and then fuses the remaining parts of the mRNA transcript back together. The parts of the mRNA transcript that aren’t spliced out by the spliceosome are called exons. In contrast to introns, exons are the parts of an mRNA transcript that actually contain assembly instructions for a protein. Many call the mRNA transcript that still contains introns pre-mRNA (or the primary transcript), and the intron-free, fully processed transcript mature mRNA.
Think of intron splicing in terms of our restaurant analogy. The recipes in the big book may contain extraneous information, such as where the recipe came from, its history in your family, or what other dishes or drinks it can be paired with. This sort of information isn’t relevant to your primary purpose. So you leave it out when you make the notecard, and only include the need-to-know information for making the dish. In the case of transcription, this need-to-know info is contained in the exons, and all the rest (the introns) can be left out. The result is a smaller and more mobile version of the mRNA transcript.
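The cut-and-rejoin idea can be sketched in Python as well. This sketch assumes the intron positions are already known and handed in as (start, end) index pairs; a real spliceosome has to find them by recognizing signals in the RNA itself. The function name `splice` and the toy sequence are made up for the example, and the introns are written in lowercase only to make them easy to spot.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Cut out the given intron spans and join the remaining exons.

    Each intron is a (start, end) pair of string indices, with end
    exclusive. Spans must lie inside the transcript and must not overlap.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if not (position <= start <= end <= len(pre_mrna)):
            raise ValueError(f"bad or overlapping intron span: {(start, end)}")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Toy pre-mRNA: exons in uppercase, introns in lowercase.
pre_mrna = "AUGGCCguaaguagUUUGAAguccagUAA"
print(splice(pre_mrna, [(6, 14), (20, 26)]))  # AUGGCCUUUGAAUAA
```

The output is shorter than the input, which matches the notecard picture: the same need-to-know instructions, with the extras left out.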
Once an mRNA has been protected and spliced, it is ready to leave the nucleus and begin the second phase of protein synthesis, called translation.

Consider the following: RNA interference

In our restaurant example, what would happen if the notecard you slipped under the door never made it to the kitchen? The dish wouldn’t get made, because the cook wouldn’t have the recipe, right?
In recent years, researchers have used a version of this idea to develop a promising class of therapies for a number of formidable diseases. Falling under the name of RNA interference, these therapies disrupt the production of harmful proteins by intercepting and incapacitating mRNA transcripts before they make it to ribosomes. This prevents the corresponding proteins from being made in the first place!
A recently proposed treatment for Ebola represents perhaps the most spectacular application of RNA interference techniques.
Ebola, like most viruses, is basically a transcription machine. It contains a viral genome (basically a very small and simple chromosome) together with a polymerase enzyme. Once the virus infiltrates one of your cells, the polymerase enzyme synthesizes mRNA transcripts for each of the genes in its genome. It then commandeers your own ribosomes and uses them to build its own proteins. Once your ribosomes make these proteins for the Ebola virus, the proteins direct the construction of new Ebola viruses.
One obvious idea for fighting Ebola would be to throw a monkey wrench into this whole process and stop replication before it ever gets off the ground.
Using a class of engineered molecules called short interfering RNA (siRNA, for short), scientists have shown that the viral mRNAs synthesized by the Ebola virus inside infected cells can be captured and destroyed before they are able to deliver their genetic messages to the ribosomes of a host cell. No Ebola mRNA, no Ebola proteins. No Ebola proteins, and Ebola loses its ability to replicate inside host cells!
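To get a feel for how an siRNA can pick out one particular mRNA, here is a toy Python sketch. It assumes that the siRNA's guide strand recognizes its target by base pairing, so the target site is the reverse complement of the guide. The names `reverse_complement` and `find_target`, the sequences, and the short 9-letter guide are all invented for the illustration, and the actual capture-and-destroy machinery inside the cell is not modeled at all.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the RNA strand that would base-pair with the given strand."""
    return "".join(RNA_PAIR[base] for base in reversed(rna.upper()))


def find_target(mrna: str, guide: str) -> int:
    """Return the index where the guide can pair with the mRNA, or -1 if none."""
    return mrna.upper().find(reverse_complement(guide))


# Hypothetical viral mRNA and a guide designed against positions 4 to 12.
viral_mrna = "AUGGCUACGUUAGCCAUAA"
guide = reverse_complement(viral_mrna[4:13])

print(find_target(viral_mrna, guide))        # 4  (a matching site exists)
print(find_target(viral_mrna, "GGGGGGGGG"))  # -1 (no site to pair with)
```

The point of the sketch is specificity: a guide only finds an mRNA that contains its complementary sequence, which is why an siRNA has to be designed for the particular transcript it is meant to intercept.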
As with all other new therapies, siRNA-based treatments for Ebola were initially validated in non-human animal models. However, during the latest outbreak of the virus in West Africa, and its subsequent spread to North America, authorities in the U.S. Food and Drug Administration took the radical step of issuing a so-called “compassionate use” exemption for an siRNA-based Ebola therapy called TKM-ebola. While the details are sketchy, we know that TKM-ebola was administered to several different patients, and it could have played a role in their subsequent recoveries.

Want to join the conversation?

  • nadia.addasi
    Is there a reason that a deoxyribose and Thymine are changed, in order for mRNA to leave the nucleus? I can't see how that one hydroxyl or, moreso, that methyl group makes a difference; is this known?
    (7 votes)
    • sikorsky fan
      This is kind of a basic explanation, and it would be hard to get into the nitty gritty details, but I hope it helps you understand.
      Thymine is changed to Uracil because Uracil is easier to produce. DNA contains Thymine because it is more stable, but RNA does not need to be around as long.
      The hydroxyl group on RNA is there for a similar reason. This group makes the molecule more susceptible to hydrolysis (to recap, the splitting of a molecule using water), so RNA can be more easily decomposed. DNA uses deoxyribose because, like Thymine, it is more stable.
      Even though this question is 3 years old, I hope it helps people who might be wondering the same thing.
      (65 votes)
  • Mark Falina
    I'm sure RNA interference has been tried in the treatment of Covid-19. Is there a particular reason why the process worked for one virus, Ebola, but does not work for another virus, Covid-19? Granted, they are different types of viruses, but the idea is to interfere with the purpose of mRNA in general not a specific type of mRNA.
    (9 votes)
  • kaystinweisenberger
    The 5'Cap G is different than a regular G found in the DNA or mRNA, right? It's methylated? Does that affect the function or reading at all? Does that stay on the mRNA for translation?
    (4 votes)
    • Juan Macias
      It is methylated, so it is a little different from a regular G in DNA/mRNA. It does not affect the function/reading because the ribosome docks onto the Shine-Dalgarno sequence (in prokaryotes) and the Kozak sequence (in eukaryotes). Once it docks here it will start translating at the start site (AUG). It does stay on the mRNA while translation happens and actually serves as a site for the docking of proteins, but that is beyond the scope of the material on the test.
      (11 votes)
  • KEVIN
    If the introns are going to be removed anyways, what's the purpose of having them in the DNA?
    (4 votes)
    • Krish
      Here is the answer that I found in my textbook:
      Introns are crucial because the protein repertoire or variety is greatly enhanced by alternative splicing in which introns take partly important roles. Alternative splicing is a controlled molecular mechanism producing multiple variant proteins from a single gene in a eukaryotic cell.
      (9 votes)
  • Shruti Patel
    Not related to the topic, but the way this module/lesson is organized is way out of order. That makes it a little confusing to review. At the beginning of this article, it is assumed that students have read the DNA Structure and Function article. I wish there was a way to rearrange the topics in the appropriate order.
    (6 votes)
  • ff142
    The article says the mRNA transcript after splicing is primary mRNA aka mature mRNA, but the "Transcription 1" video and my textbook say the primary mRNA has to be processed to become mature mRNA, and Wikipedia says primary mRNA = pre-mRNA
    (4 votes)
  • Logan Tritto
    is this for transcription in a Eukaryote or a prokaryote?
    (3 votes)
  • alishahussain.AH
    If you have to transcribe DNA into RNA do you transcribe the template or the coding?
    (4 votes)
    • KRSikoraIII
      RNA polymerase reads the "template" strand (also called the "antisense" or "non-coding" strand). The "coding" strand (aka the "sense" strand) is not read; the mRNA ends up with the same sequence as the coding strand, with uracil in place of thymine.
      (0 votes)
  • ratul.khan4433
    How are mRNAs protected from premature degradation in prokaryotes? As we know, prokaryotes do not undergo post-transcriptional modification.
    (2 votes)
  • scmukilan
    In a published book, it says "Introns are removed by small nuclear RNA (SnRNA) and protein complex called small nuclear ribonucleoproteins or SnRNPs (Snurps)". Is that the same as the spliceosome? And is heterogenous nuclear RNA the same as pre-mRNA?
    (2 votes)