An in-depth look at how transcription works. Initiation (promoters), elongation, and termination.

Key points:

  • Transcription is the process in which a gene's DNA sequence is copied (transcribed) to make an RNA molecule.
  • RNA polymerase is the main transcription enzyme.
  • Transcription begins when RNA polymerase binds to a promoter sequence near the beginning of a gene (directly or through helper proteins).
  • RNA polymerase uses one of the DNA strands (the template strand) as a template to make a new, complementary RNA molecule.
  • Transcription ends in a process called termination. Termination depends on sequences in the RNA, which signal that the transcript is finished.

Introduction

What makes death cap mushrooms deadly? These mushrooms get their lethal effects by producing one specific toxin, which attaches to a crucial enzyme in the human body: RNA polymerase.^1
Photograph of Amanita phalloides (death cap) mushrooms.
_Image modified from "Amanita phalloides," by Archenzo (CC BY-SA 3.0). The modified image is licensed under a CC BY-SA 3.0 license._
RNA polymerase is crucial because it carries out transcription, the process of copying DNA (deoxyribonucleic acid, the genetic material) into RNA (ribonucleic acid, a similar but more short-lived molecule).
Transcription is an essential step in using the information from genes in our DNA to make proteins. Proteins are the key molecules that give cells structure and keep them running. Blocking transcription with mushroom toxin causes liver failure and death, because no new RNAs (and thus no new proteins) can be made.^2
Transcription is essential to life, and understanding how it works is important to human health. Let's take a closer look at what happens during transcription.

Transcription overview

Transcription is the first step of gene expression. During this process, the DNA sequence of a gene is copied into RNA.
Before transcription can take place, the DNA double helix must unwind near the gene that is getting transcribed. The region of opened-up DNA is called a transcription bubble.
In transcription, a region of DNA opens up. One strand, the template strand, serves as a template for synthesis of a complementary RNA transcript. The other strand, the coding strand, is identical to the RNA transcript in sequence, except that the RNA has uracil (U) bases in place of thymine (T) bases.
Example:
Coding strand: 5'-ATGATCTCGTAA-3'
Template strand: 3'-TACTAGAGCATT-5'
RNA transcript: 5'-AUGAUCUCGUAA-3'
In translation, the RNA transcript is read to produce a polypeptide.
Example:
RNA transcript: 5'-AUG AUC UCG UAA-3'
Polypeptide: (N-terminus) Met - Ile - Ser - [STOP] (C-terminus)
Transcription uses one of the two exposed DNA strands as a template; this strand is called the template strand. The RNA product is complementary to the template strand and is almost identical to the other DNA strand, called the nontemplate (or coding) strand. However, there is one important difference: in the newly made RNA, all of the T nucleotides are replaced with U nucleotides.
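The relationship between the three strands can be checked with a few lines of Python. This is just a sketch: the function name `transcribe` and the pairing table are illustrative choices, and the sequences are the ones from the example above.

```python
# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the RNA (written 5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

coding = "ATGATCTCGTAA"      # coding strand, 5'->3'
template = "TACTAGAGCATT"    # template strand, 3'->5'

rna = transcribe(template)
print(rna)                               # AUGAUCUCGUAA
print(rna == coding.replace("T", "U"))   # True
```

The second line of output confirms the point made above: the RNA matches the coding strand once every T is swapped for a U.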
The site on the DNA from which the first RNA nucleotide is transcribed is called the +1 site, or the initiation site. Nucleotides that come before the initiation site are given negative numbers and said to be upstream. Nucleotides that come after the initiation site are marked with positive numbers and said to be downstream.
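Here is one way the numbering scheme could be sketched in Python. The helper name `position_label`, the short sequence, and the chosen start index are all made up for illustration. The sketch also assumes the usual convention that there is no position 0: the nucleotide just before +1 is -1.

```python
def position_label(index, start_index):
    """Label a 0-based string index relative to the initiation site.

    The first transcribed nucleotide is +1; the nucleotide just before
    it is -1 (assumed convention: there is no position 0).
    """
    offset = index - start_index
    number = offset + 1 if offset >= 0 else offset
    return f"{number:+d}"

seq = "ACGTAC"   # hypothetical coding-strand fragment
start = 3        # assumed 0-based index of the initiation site

labels = [position_label(i, start) for i in range(len(seq))]
print(labels)    # ['-3', '-2', '-1', '+1', '+2', '+3']
```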
If the gene that's transcribed encodes a protein (which many genes do), the RNA molecule will be read to make a protein in a process called translation.
Eukaryotic cells, like those of our bodies, do a few extra processing steps between transcription and translation. You can learn more about those in the article on RNA processing, but you don't need to worry about them right now.

RNA polymerase

RNA polymerases are enzymes that transcribe DNA into RNA. Using a DNA template, RNA polymerase builds a new RNA molecule through base pairing. For instance, if there is a G in the DNA template, RNA polymerase will add a C to the new, growing RNA strand.
RNA polymerase synthesizes an RNA strand complementary to a template DNA strand. It synthesizes the RNA strand in the 5' to 3' direction, while reading the template DNA strand in the 3' to 5' direction. The template DNA strand and RNA strand are antiparallel.
RNA transcript: 5'-UGGUAGU...-3' (dots indicate where nucleotides are still being added at 3' end)
DNA template: 3'-ACCATCAGTC-5'
RNA polymerase always builds a new RNA strand in the 5’ to 3’ direction. That is, it can only add RNA nucleotides (A, U, C, or G) to the 3' end of the strand.
The two ends of a DNA or RNA strand are different from each other. That is, a DNA or RNA strand has directionality.
  • At the 5’ end of the chain, the phosphate group of the first nucleotide in the chain sticks out. The phosphate group is attached to the 5' carbon of the sugar ring, which is why this is called the 5' end.
  • At the other end, called the 3’ end, the hydroxyl of the last nucleotide added to the chain is exposed. The hydroxyl group is attached to the 3' carbon of the sugar ring, which is why this is called the 3' end.
Many processes, such as DNA replication and transcription, can only take place in one particular direction relative to the directionality of a DNA or RNA strand.
You can learn more in the article on nucleic acids.
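The 5' to 3' rule described above can be pictured as a loop that only ever appends to one end of a growing string. The sketch below is a simplified model, not a description of the enzyme's actual mechanism; the name `elongate` is hypothetical, and the template is the one from the example above.

```python
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def elongate(template_3to5, steps):
    """Yield the growing RNA after each of `steps` nucleotide additions."""
    rna = ""  # written 5'->3', so the 3' end is the right-hand end
    for base in template_3to5[:steps]:
        rna += PAIRING[base]   # a new nucleotide joins the 3' end only
        yield rna

template = "ACCATCAGTC"   # template strand, read 3'->5'
for partial in elongate(template, 7):
    print("5'-" + partial + "...-3'")
```

The last line printed is 5'-UGGUAGU...-3', the partial transcript shown in the example. Notice that the code never inserts anything at the left (5') end.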
RNA polymerases are large enzymes with multiple subunits, even in simple organisms like bacteria. In addition, humans and other eukaryotes have three different kinds of RNA polymerases: I, II, and III. Each one specializes in transcribing certain classes of genes.

Transcription initiation

To begin transcribing a gene, RNA polymerase binds to the DNA of the gene at a region called the promoter. Basically, the promoter tells the polymerase where to "sit down" on the DNA and begin transcribing.
The promoter region comes before (and slightly overlaps with) the transcribed region whose transcription it specifies. It contains recognition sites for RNA polymerase or its helper proteins to bind to. The DNA opens up in the promoter region so that RNA polymerase can begin transcription.
Each gene (or, in bacteria, each group of genes transcribed together) has its own promoter. A promoter contains DNA sequences that let RNA polymerase or its helper proteins attach to the DNA. Once the transcription bubble has formed, the polymerase can start transcribing.

Promoters in bacteria

To get a better sense of how a promoter works, let's look at an example from bacteria. A typical bacterial promoter contains two important DNA sequences, the -10 and -35 elements.
RNA polymerase recognizes and binds directly to these sequences. The sequences position the polymerase in the right spot to start transcribing a target gene, and they also make sure it's pointing in the right direction.
Basically, the rear part of the enzyme binds to the -35 element, while the front part binds to the -10 element. Thus, RNA polymerase can only bind to the promoter if it's pointing in a particular direction, one in which it faces towards the region to be transcribed.
Once the RNA polymerase has bound, it can open up the DNA and get to work. DNA opening occurs at the -10 element, where the strands are easy to separate due to the many As and Ts (which bind to each other using just two hydrogen bonds, rather than the three hydrogen bonds of Gs and Cs).
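To see why an A/T-rich stretch is easier to open, we can count the hydrogen bonds that hold a short stretch of double helix together. In this sketch, TATAAT is the -10 element sequence from the example promoter, while GCGCGC is a made-up stretch of the same length for comparison. Treat the count as a rough proxy only, since other factors (such as base stacking) also affect how easily real DNA opens.

```python
# Hydrogen bonds formed by each base with its partner on the other strand.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def hydrogen_bonds(strand):
    """Total base-pair hydrogen bonds along a double-stranded segment."""
    return sum(HYDROGEN_BONDS[base] for base in strand)

print(hydrogen_bonds("TATAAT"))  # 12
print(hydrogen_bonds("GCGCGC"))  # 18
```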
Bacterial promoter. The promoter lies at the start of the transcribed region, encompassing the DNA before it and slightly overlapping with the transcriptional start site. The promoter contains two elements, the -35 element and the -10 element. The -35 element is centered about 35 nucleotides upstream of (before) the transcriptional start site (+1), while the -10 element is centered about 10 nucleotides before the transcriptional start site. In this particular example, the sequence of the -35 element (on the coding strand) is 5'-TTGACG-3', while the sequence of the -10 element (on the coding strand) is 5'-TATAAT-3'. The RNA polymerase has regions that specifically bind to the -10 and -35 elements.
The -10 and the -35 elements get their names because they come about 10 and 35 nucleotides, respectively, before the initiation site (+1 in the DNA). The minus signs just mean that they are before, not after, the initiation site.
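As a rough illustration, here is how one might scan a coding-strand sequence for the two elements from the example promoter described above. Everything besides the two element sequences is an assumption made for this sketch: the function name `find_promoter`, the toy sequence, and the allowed spacer length between the elements. Real promoters vary and often do not match the motifs exactly.

```python
MINUS_35 = "TTGACG"   # -35 element from the example promoter
MINUS_10 = "TATAAT"   # -10 element from the example promoter

def find_promoter(coding, spacer_range=(16, 18)):
    """Find a -35 element followed, after a spacer, by a -10 element.

    Returns the 0-based start indices (minus_35, minus_10) on the coding
    strand, or None. The spacer range is an assumption of this sketch.
    """
    start = coding.find(MINUS_35)
    while start != -1:
        after_35 = start + len(MINUS_35)
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            pos_10 = after_35 + spacer
            if coding[pos_10:pos_10 + len(MINUS_10)] == MINUS_10:
                return start, pos_10
        start = coding.find(MINUS_35, start + 1)   # try the next -35 match
    return None

# Hypothetical sequence: filler, -35 element, 17-nt spacer, -10 element, filler.
toy = "GGC" + MINUS_35 + "CTAGCTCAGTCCTAGGT" + MINUS_10 + "GCTAGCA"
print(find_promoter(toy))   # (3, 26)
```

Because the search requires the -35 element to come first and the -10 element to follow it, a match also tells you which way transcription would run, echoing how the two elements orient the polymerase.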

Promoters in humans

In eukaryotes like humans, the main RNA polymerase in your cells does not attach directly to promoters the way bacterial RNA polymerase does. Instead, helper proteins called basal (general) transcription factors bind to the promoter first, helping the RNA polymerase get a foothold on the DNA.
Many eukaryotic promoters have a sequence called a TATA box. The TATA box plays a role much like that of the -10 element in bacteria. It's recognized by one of the general transcription factors, allowing other transcription factors and eventually RNA polymerase to bind. It also contains lots of As and Ts, which make it easy to pull the strands of DNA apart.
The promoter of a eukaryotic gene is shown. The promoter lies upstream of and slightly overlaps with the transcriptional start site (+1). It contains a TATA box, which has a sequence (on the coding strand) of 5'-TATAAA-3'. The first eukaryotic general transcription factor binds to the TATA box. Then, other general transcription factors bind. Finally, RNA polymerase II and some additional transcription factors bind to the promoter.

Elongation

Once RNA polymerase is in position at the promoter, the next step of transcription—elongation—can begin. Basically, elongation is the stage when the RNA strand gets longer, thanks to the addition of new nucleotides.
During elongation, RNA polymerase "walks" along one strand of DNA, known as the template strand, in the 3' to 5' direction. For each nucleotide in the template, RNA polymerase adds a matching (complementary) RNA nucleotide to the 3' end of the RNA strand.
Here is the reaction that adds an RNA nucleotide to the chain:
Polymerization reaction in which an RNA nucleotide triphosphate is added to the existing RNA strand. The RNA nucleotide triphosphate has a series of three phosphate groups attached to it. The innermost phosphate group reacts with the 3' hydroxyl on the nucleotide at the end of the existing strand, forming a phosphodiester bond that attaches the new nucleotide to the end of the chain. A pyrophosphate (molecule consisting of two phosphate groups) is lost in this process, and is later cleaved into two individual inorganic phosphates. In general, this reaction will occur only when an incoming nucleotide is complementary to the next exposed nucleotide in the DNA strand that serves as a template for RNA synthesis.
The RNA strand looks similar to DNA, except that it contains the base uracil in place of thymine and has ribose sugars (which have a hydroxyl group on the 2' carbon) in place of deoxyribose sugars.
RNA polymerase synthesizes an RNA transcript complementary to the DNA template strand in the 5' to 3' direction. It moves forward along the template strand in the 3' to 5' direction, opening the DNA double helix as it goes. The synthesized RNA only remains bound to the template strand for a short while, then exits the polymerase as a dangling string, allowing the DNA to close back up and form a double helix.
In this example, the sequences of the coding strand, template strand, and RNA transcript are:
Coding strand: 5'-ATGATCTCGTAA-3'
Template strand: 3'-TACTAGAGCATT-5'
RNA: 5'-AUGAUC...-3' (the dots indicate where nucleotides are still being added to the RNA strand at its 3' end)
The RNA transcript is nearly identical to the non-template, or coding, strand of DNA. However, RNA strands have the base uracil (U) in place of thymine (T), as well as a slightly different sugar in the nucleotide. So, as we can see in the diagram above, each T of the coding strand is replaced with a U in the RNA transcript.
DNA nucleotide: lacks a hydroxyl group on the 2' carbon of the sugar (i.e., sugar is deoxyribose). Bears a thymine base that has a methyl group attached to its ring.
RNA nucleotide: has a hydroxyl group on the 2' carbon of the sugar (i.e., sugar is ribose). Bears a uracil base that is very similar in structure to thymine, but does not have a methyl group attached to the ring.
Image based on similar image from CyberBridge.^3
RNA nucleotides are similar to DNA nucleotides, but not identical. They have a ribose sugar rather than deoxyribose, so they have a hydroxyl group on the 2' carbon of the sugar ring. Also, in RNA, there is no T (thymine). Instead, RNA nucleotides carry the base uracil (U), which is structurally similar to thymine and forms complementary base pairs with adenine (A).
The picture below shows DNA being transcribed by many RNA polymerases at the same time, each with an RNA "tail" trailing behind it. The polymerases near the start of the gene have short RNA tails, which get longer and longer as the polymerase transcribes more of the gene.
In the microscope image shown here, a gene is being transcribed by many RNA polymerases at once. The RNA chains are shortest near the beginning of the gene, and they become longer as the polymerases move towards the end of the gene. This pattern creates a kind of wedge-shaped structure made by the RNA transcripts fanning out from the DNA of the gene.
_Image modified from "Transcription label en," by Dr. Hans-Heinrich Trepte (CC BY-SA 3.0). The modified image is licensed under a CC BY-SA 3.0 license._

Transcription termination

RNA polymerase will keep transcribing until it gets signals to stop. The process of ending transcription is called termination, and it happens once the polymerase transcribes a sequence of DNA known as a terminator.

Termination in bacteria

There are two major termination strategies found in bacteria: Rho-dependent and Rho-independent.
In Rho-dependent termination, the RNA contains a binding site for a protein called Rho factor. Rho factor binds to this sequence and starts "climbing" up the transcript towards RNA polymerase.
Rho-dependent termination. The terminator is a region of DNA that includes the sequence that codes for the Rho binding site in the mRNA, as well as the actual transcription stop point (which is a sequence that causes the RNA polymerase to pause so that Rho can catch up to it). Rho binds to the Rho binding site in the mRNA and climbs up the RNA transcript, in the 5' to 3' direction, towards the transcription bubble where the polymerase is. When it catches up to the polymerase, it will cause the transcript to be released, ending transcription.
When it catches up with the polymerase at the transcription bubble, Rho pulls the RNA transcript and the template DNA strand apart, releasing the RNA molecule and ending transcription. Another sequence found later in the DNA, called the transcription stop point, causes RNA polymerase to pause and thus helps Rho catch up.^4
Rho-independent termination depends on specific sequences in the DNA template strand. As the RNA polymerase approaches the end of the gene being transcribed, it hits a region rich in C and G nucleotides. The RNA transcribed from this region folds back on itself, and the complementary C and G nucleotides bind together. The result is a stable hairpin that causes the polymerase to stall.
Rho-independent termination. The terminator DNA sequence encodes a region of RNA that folds back on itself to form a hairpin. The hairpin is followed by a series of U nucleotides in the RNA (not pictured). The hairpin causes the polymerase to stall, and the weak base pairing between the A nucleotides of the DNA template and the U nucleotides of the RNA transcript allows the transcript to separate from the template, ending transcription.
In a terminator, the hairpin is followed by a stretch of U nucleotides in the RNA, which match up with A nucleotides in the template DNA. The complementary U-A region of the RNA transcript forms only a weak interaction with the template DNA. This, coupled with the stalled polymerase, produces enough instability for the enzyme to fall off and liberate the new RNA transcript.
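The two features of a Rho-independent terminator, a hairpin followed by a run of Us, can be searched for with a short Python sketch. The function names, the toy RNA, and the stem, loop, and U-run lengths are all assumptions chosen for illustration. The sketch also only accepts perfectly matched stems, which is stricter than what real RNA needs.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the sequence that would base-pair with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))

def find_terminator(rna, stem=6, loop_range=(3, 8), min_u=4):
    """Return the start index of the first hairpin followed by Us, or None.

    A hairpin here is a stem arm of `stem` bases, a loop, and the arm's
    reverse complement. All parameters are assumptions of this sketch.
    """
    for i in range(len(rna) - stem + 1):
        second_arm = reverse_complement(rna[i:i + stem])
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop    # where the second stem arm would start
            tail = j + stem        # first base after the hairpin
            if rna[j:tail] == second_arm and rna[tail:tail + min_u] == "U" * min_u:
                return i
    return None

# Hypothetical transcript end: filler, G/C-rich arm, loop, matching arm, Us.
toy = "AUGGCU" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
print(find_terminator(toy))   # 6
```

The index returned marks where the first arm of the G/C-rich stem begins. The run of Us right after the second arm corresponds to the weakly paired U-A region described above.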
In eukaryotes like humans, transcription termination happens differently depending on the type of gene involved. Here, we'll see how termination works for protein-coding genes.
Note: This is a pretty weird mechanism. It does not make complete sense, even to the biologists who study it in great detail. You have been warned!
Termination begins when a polyadenylation signal appears in the RNA transcript. This is a sequence of nucleotides that marks where an RNA transcript should end. The polyadenylation signal is recognized by an enzyme that cuts the RNA transcript nearby, releasing it from RNA polymerase.
Oddly enough, RNA polymerase continues transcribing after the transcript is released, often making 500 to 2,000 more nucleotides' worth of RNA^5. Eventually, it detaches from the DNA through mechanisms that are not yet fully understood^6. The extra RNA is not usually translated and seems to be a wasteful byproduct of transcription.

What happens to the RNA transcript?

After termination, transcription is finished. An RNA transcript that is ready to be used in translation is called a messenger RNA (mRNA). In bacteria, RNA transcripts are ready to be translated right after transcription. In fact, they're actually ready a little sooner than that: translation may start while transcription is still going on!
In the diagram below, mRNAs are being transcribed from several different genes. Although transcription is still in progress, ribosomes have attached to each mRNA and begun to translate it into protein. When an mRNA is being translated by multiple ribosomes, the mRNA and ribosomes together are said to form a polyribosome.
Illustration shows mRNAs being transcribed off of genes. Ribosomes attach to the mRNAs before transcription is done and begin making protein.
_Image modified from "Prokaryotic transcription: Figure 3," by OpenStax College, Biology, CC BY 4.0._
Why can transcription and translation happen simultaneously for an mRNA in bacteria? One reason is that these processes occur in the same 5' to 3' direction. That means one can follow or "chase" another that's still occurring. Also, in bacteria, there are no internal membrane compartments to separate transcription from translation.
The picture is different in the cells of humans and other eukaryotes. That's because transcription happens in the nucleus of human cells, while translation happens in the cytosol. Also, in eukaryotes, RNA molecules need to go through special processing steps before translation. That means translation can't start until transcription and RNA processing are fully finished. You can learn more about these steps in the transcription and RNA processing video.

Attribution:

This article is a modified derivative of the following articles:
The modified article is licensed under a CC BY-NC-SA 4.0 license.

Works cited:

  1. Berger, S. (2006). The mushroom Amanita phalloides. In Transcription and RNA polymerase II. Retrieved from http://www.chem.uwec.edu/Webpapers2006/sites/bergersl/pages/amanitin.html.
  2. Amanita phalloides. (2016, February 6). Retrieved February 13, 2016 from Wikipedia: https://en.wikipedia.org/wiki/Amanita_phalloides.
  3. CyberBridge. (2007). RNA structure. In Structure of DNA. Retrieved from http://cyberbridge.mcb.harvard.edu/dna_3.html.
  4. Rho factor. (2016, October 19). Retrieved November 20, 2016 from Wikipedia: https://en.wikipedia.org/wiki/Rho_factor.
  5. Lodish, H., Berk, A., Zipursky, S. L., Matsudaira, P., Baltimore, D., and Darnell, J. (2000). Three eukaryotic RNA polymerases employ different termination mechanisms. In Molecular cell biology (4th ed., section 11.1). New York, NY: W. H. Freeman. Retrieved from http://www.ncbi.nlm.nih.gov/books/NBK21601/#_A2857_.
  6. Richard, P. and Manley, J. L. (2009). Transcription termination by nuclear RNA polymerases. Genes & Dev., 23, 1247-1269. http://genesdev.cshlp.org/content/23/11/1247.full.
