DNA libraries & generating cDNA

Visit us (http://www.khanacademy.org/science/healthcare-and-medicine) for health and medicine content or (http://www.khanacademy.org/test-prep/mcat) for MCAT related content. These videos do not provide medical advice and are for informational purposes only. The videos are not intended to be a substitute for professional medical advice, diagnosis or treatment. Always seek the advice of a qualified health provider with any questions you may have regarding a medical condition. Never disregard professional medical advice or delay in seeking it because of something you have read or seen in any Khan Academy video. Created by Ronald Sahyouni.

Video transcript

- [Voiceover] Alright, so let's say that you've got this little guy over here and he's got his shoes and he's just happy, smilin'. This guy right here is our protein. So, let's look at how this protein was created. In order to make protein we have to start out with our base, and in this case our base is DNA. From DNA we generate messenger RNA, then that messenger RNA eventually leads to the formation of a protein. And protein is this happy guy over here. This is pretty straightforward, but what if we wanted to go in reverse? What if we started out with a protein and we wanted to figure out what its DNA sequence was?

Now scientists thought it would be nice to be able to type in the name of any protein that they're interested in and have the DNA sequence of that protein automatically pop up. That is known as a DNA library. And a DNA library would be beneficial for researchers, and scientists, and clinicians. So, let's look at how this is done.

We'll start out with our protein, and our protein is basically a chain of amino acids. The order of those amino acids is specified by messenger RNA. So, if we know the amino acid sequence of our protein, we can work back to a messenger RNA sequence that encodes it based on the codon table, which we are all too familiar with. Once we have the messenger RNA, what we do is add an enzyme known as reverse transcriptase, and reverse transcriptase takes this messenger RNA and makes a complementary DNA sequence to the messenger RNA. That's known as cDNA; the 'c' stands for complementary. One thing to keep in mind is that complementary DNA is single-stranded. Normally DNA in our cells is double-stranded, but complementary DNA is single-stranded. So, in order to generate double-stranded DNA we need to add another enzyme known as DNA polymerase. DNA polymerase synthesizes the second strand, which gives us double-stranded DNA.

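
The base-pairing logic behind step one can be sketched in a few lines of Python. This is only an illustration of the pairing rules, not a model of the enzymes themselves. The function names and the nine-base example sequence are made up for this sketch, and each strand is written 5' to 3'.

```python
# Base-pairing rules. Reverse transcriptase reads RNA and writes DNA,
# DNA polymerase reads DNA and writes DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return the single-stranded cDNA complementary to an mRNA.

    Both strands are written 5'->3', so the template is read backwards.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(cdna):
    """Return the DNA strand complementary to the cDNA (5'->3')."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


# Hypothetical toy mRNA: start codon, one alanine codon, stop codon.
mrna = "AUGGCCUAA"
cdna = reverse_transcribe(mrna)   # single-stranded cDNA
coding = second_strand(cdna)      # second strand made by DNA polymerase

print(cdna)    # TTAGGCCAT
print(coding)  # ATGGCCTAA
```

Notice that the second strand has the same sequence as the original messenger RNA, with T in place of U.
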
So, this is basically step one of the process of creating a DNA library. Now let's look at step two. Now that we have our double-stranded DNA, what we need to do is sequence it. In order to sequence it, we'll start out with our double-stranded DNA and we'll insert it into some sort of cloning vector, such as a plasmid or a virus. Then you can take that cloning vector and add it to some bacteria. Once the vector is inside the bacteria, the bacteria will produce lots and lots of copies of this double-stranded DNA, and that's a process known as amplification. Once we have lots of double-stranded DNA, we'll go and sequence it, and once we have that sequence we'll put it into a large database that's readily accessible online, and that database will populate the DNA library. So, now anybody who is interested in the DNA sequence of a particular protein can just go into this library and pull up the genetic sequence of the protein of interest.
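
The "type in a protein name, get back a DNA sequence" idea can be sketched as a simple lookup table. This is a toy, assuming an in-memory dictionary; real sequence databases are far larger and are searched through their own interfaces. The class name, the method names, the protein name, and the sequence below are all invented for illustration.

```python
class DnaLibrary:
    """Toy in-memory library mapping protein names to DNA sequences."""

    def __init__(self):
        self._records = {}

    def deposit(self, protein_name, sequence):
        """Store a sequenced DNA under the protein's name."""
        sequence = sequence.upper()
        if set(sequence) - set("ACGT"):
            raise ValueError("a DNA sequence may only contain A, C, G and T")
        self._records[protein_name.lower()] = sequence

    def lookup(self, protein_name):
        """Return the stored sequence, or None if nothing was deposited."""
        return self._records.get(protein_name.lower())


library = DnaLibrary()
# Hypothetical entry: the name and sequence are placeholders, not real data.
library.deposit("Example Protein", "atggcctaa")

print(library.lookup("example protein"))  # ATGGCCTAA
print(library.lookup("unknown protein"))  # None
```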