How chatbots and large language models work

Large language models like ChatGPT have remarkable abilities to generate content based on training data, but do they have actual intelligence? Find out more about how LLMs and chatbots work as we explore this question.

Featuring:
Cristóbal Valenzuela, the creator of Runway
Mira Murati, the CTO of OpenAI

Presented by: Code.org, ETS, ISTE, Khan Academy

Start learning at code.org today!

Stay in touch with us on social media:
• Twitter: https://twitter.com/codeorg
• Facebook: https://www.facebook.com/Code.org
• Instagram: https://instagram.com/codeorg
• TikTok: https://tiktok.com/@code.org
• LinkedIn: https://www.linkedin.com/company/code-org
• Medium: https://medium.com/@codeorg

Help make our work possible with a donation at http://code.org/donate!
Created by Code.org.

Want to join the conversation?

  • alisher.alihon
    It's really interesting to know about how AI works and how it is able to analyze all these tasks.
    (10 votes)
  • pshirt
    How does an AI understand the prompt given by a human?
    (4 votes)
    • JustClassa
      AIs built on "Large Language Models" are trained on readings, texts, Wikipedia, webpages, and human interactions, typically billions of them. So when a human prompts the bot, the bot takes the prompt and generates conversational connections, like neurons in a brain firing, that seem, statistically, to relate to your question. This is done through Natural Language Processing, a collection of highly advanced techniques for creating connections between words from conversations such as the one that I am having with you now.

      One of the pitfalls of this is that nothing that the AI generates is original. Everything in its database has happened before within its field of study. So you have to be careful when using the information that it provides in a professional setting.
      (3 votes)
  • THE GIRL KNOW IT ALL01
    ok i know a lot but not this
    (4 votes)
  • juliannagonzalez425
    How does an AI understand the prompt given by a human?
    (2 votes)
    • ✿ Tati ✿ stan skz & newjeans
      When an AI receives a prompt from a human, it uses a combination of natural language processing (NLP) techniques and machine learning algorithms to understand and process the input. NLP helps the AI analyze the structure, context, and meaning of the text. However, if there are spelling mistakes or missing punctuation, it can sometimes affect the AI's understanding and response. The AI may still attempt to provide a relevant answer, but it's important to ensure clear and accurate communication to get the best results. Hope this helps! :)
      (5 votes)
  • KEVIN
    link removed

    My original post- ca. early Oct 2023

    What makes us, collectively, think that AI can actually recreate, or create from scratch, prose such as Shakespeare wrote, or by extension, symphonies such as Mozart composed, or theories such as Faraday developed, or see into the subconscious as Freud postulated, or write poems such as those by Maya Angelou? Just because the program, AI, can mimic the procedures that these people used to create their works doesn't mean that what AI spits out will have any value. It will always and only be a copy, a derivative - in the non-mathematical sense - and an imitation that appropriates the labor and hard-wrought creativity of others.

    Can AI create by candlelight, or with bombs going off close by, or under religious persecution, or without food or water or shelter? (Do you know what? It probably can. Those very human experiences/conditions don't shape how AI gets from point A to point B. That is why, from the purely, let's say, Shakespeare prose vantage point, what AI presents to us reads as hyper real, which determines that it is hyper fake/false.)

    I appreciate that KA is putting this in front of us, but at the moment, this is snake oil. Let's see.
    (4 votes)
  • ibad.2551
    What is AI, and what is its full form?
    (2 votes)
  • Donald Vonderheide
    How does the AI understand the prompt given by a person?
    (3 votes)
  • sanvi.krishna25
    How complex will AI get?
    (2 votes)
  • 108213048
    idk because I didn't understand anything
    (1 vote)
  • singh.andy86
    Do we need AI in the future?
    (1 vote)

Video transcript

Hi, I'm Mira Murati. I'm the chief technology officer at OpenAI, the company that created ChatGPT. I really wanted to work on AI because it has the potential to really improve almost every aspect of life and help us tackle really hard challenges. Hi, I'm Cristóbal Valenzuela, CEO and co-founder of Runway. Runway is a research company that builds AI algorithms for storytelling and video creation. Chatbots like ChatGPT are based on a new type of AI technology that's called large language models. So instead of a typical neural network, which trains on a specific task like how to recognize faces or images, a large language model is trained on the largest amount of information possible, such as everything available on the Internet. It uses this training to then be able to generate completely new information, like to write essays or poems, have conversations, or even write code. The possibilities seem endless, but how does this work and what are its shortcomings? Let's dive in. While a chatbot built on a large language model may seem magical, it works based on some really simple ideas. In fact, most of the magic of AI is based on very simple math concepts from statistics applied billions of times using fast computers. The AI uses probabilities to predict the text that you want it to produce based on all the previous text that it has been trained on. Suppose that we want to train a large language model to read every play written by William Shakespeare so that it could write new plays in the same style. We'd start with all the texts from Shakespeare's plays stored letter by letter in a sequence. Next, we'd analyze each letter to see what letter is most likely to come next. After an I, the next most likely letters to show up in Shakespeare plays are S or N. After an S, it's T, C, or H, and so on. This creates a table of probabilities. With just this, we can try to generate new writing. We pick a random letter to start: that's our first letter.
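
The table of probabilities described above can be sketched in a few lines of Python. This is only an illustration, not the code behind any real system: the sample text is a single well-known Shakespeare line standing in for the full plays, and the names `SAMPLE_TEXT` and `build_table` are made up for this example.

```python
from collections import Counter, defaultdict

# Stand-in training text (an assumption for this sketch); a real run
# would use the full text of the plays.
SAMPLE_TEXT = "to be, or not to be, that is the question"


def build_table(text):
    """Map each character to the probabilities of the character after it."""
    counts = defaultdict(Counter)
    # Pair every character with the one that follows it and count the pairs.
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    table = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Turn raw counts into probabilities that sum to 1.
        table[current] = {ch: n / total for ch, n in followers.items()}
    return table


table = build_table(SAMPLE_TEXT)
# Show which characters follow "t" in the sample, and how often.
print(table["t"])
```

With so little text the probabilities are crude, but the structure is the same one the transcript describes: one row per letter, one probability per possible next letter.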
We can see what's most likely to come next. We don't always have to pick the most popular choice because that would lead to repetitive cycles. Instead, we pick randomly. Once we have the next letter, we repeat the process to find the next letter and then the next one and so on. Okay, well, that doesn't look at all like Shakespeare. It's not even English, but it's a first step. This simple system might not seem even remotely intelligent, but as we build up from here, you'll be surprised where it goes. The problem in the last example is that at any point the AI only considers a single letter to pick what comes next. That's not enough context, and so the output is not helpful. What if we could train it to consider a sequence of letters, like sentences or paragraphs, to give it more context to pick the next one? To do this, we don't use a simple table of probabilities. We use a neural network. A neural network is a computer system that is loosely inspired by the neurons in the brain. It is trained on a body of information, and with enough training, it can learn to take in new information and give simple answers. The answers always include probabilities because there can be many options. Now let's take a neural network and train it on all the letter sequences in Shakespeare's plays to learn what letter is likely to come next at any point. Once we do this, the neural network can take any new sequence and predict what could be a good next letter. Sometimes the answer is obvious, but usually it's not. It turns out this new approach works better, much better. By looking at a long enough sequence of letters, the AI can learn complicated patterns, and it uses those to produce all-new texts. It starts the same way, with a starting letter, and then uses probabilities to pick the next letter and so on. But this time, the probabilities are based on the entire context of what came beforehand. As you see, this works surprisingly well.
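
The effect of context on generation can be tried out with a small Python sketch. Note the substitution: this is not a neural network. It is a plain lookup table keyed on the last `context_len` characters, used here only as a stand-in to show why more context gives more plausible output. The sample text and the names `build_context_table` and `generate` are assumptions made for this example.

```python
import random
from collections import Counter, defaultdict

# Stand-in training text (an assumption); a real run would use far more.
SAMPLE_TEXT = (
    "to be, or not to be, that is the question: "
    "whether 'tis nobler in the mind to suffer "
    "the slings and arrows of outrageous fortune"
)


def build_context_table(text, context_len):
    """Count which character follows each run of `context_len` characters."""
    counts = defaultdict(Counter)
    for i in range(len(text) - context_len):
        context = text[i:i + context_len]
        counts[context][text[i + context_len]] += 1
    return counts


def generate(text, context_len, length, seed=0):
    """Generate up to `length` characters, sampling each one at random."""
    rng = random.Random(seed)
    counts = build_context_table(text, context_len)
    # Start from a random position in the training text.
    start = rng.randrange(len(text) - context_len)
    output = text[start:start + context_len]
    while len(output) < length:
        followers = counts.get(output[-context_len:])
        if not followers:
            break  # this context only appears at the very end of the text
        chars = list(followers)
        weights = [followers[c] for c in chars]
        # Weighted random pick, not always the most popular choice.
        output += rng.choices(chars, weights=weights)[0]
    return output


print(generate(SAMPLE_TEXT, context_len=1, length=60))
print(generate(SAMPLE_TEXT, context_len=4, length=60))
```

With `context_len=1` the output is mostly gibberish; with `context_len=4` it reads much closer to the source, though on a text this small it largely repeats it. A neural network plays the same role as the table but can generalize to contexts it has never seen exactly.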
Now, a system like ChatGPT uses a similar approach, but with three very important additions. First, instead of just training on Shakespeare, it looks at all the information it can find on the Internet, including all the articles on Wikipedia or all the code on GitHub. Second, instead of learning and predicting letters from just the 26 choices in the alphabet, it looks at tokens, which are either full words or word parts or even code. And the third difference is that a system of this complexity needs a lot of human tuning to make sure it produces reasonable results in a wide variety of situations, while also protecting against problems like producing highly biased or even dangerous content. Even after we do this tuning, it's important to note that this system is still just using random probabilities to choose words. A large language model can produce unbelievable results that seem like magic, but because it's not actually magic, it can often get things wrong. And when it gets things wrong, people ask, does a large language model have actual intelligence? Discussions about AI often spark philosophical debates about the meaning of intelligence. Some argue that a neural network producing words using probabilities doesn't have real intelligence. But what isn't under debate is that large language models produce amazing results with applications in many fields. This technology is already being used to create apps and websites, help produce movies and video games, and even discover new drugs. The rapid acceleration of AI will have enormous impacts on society, and it's important for everybody to understand this technology. What I'm looking forward to is the amazing things people will create with AI, and I hope you dive in to learn more about how AI works and explore what you can build with it.
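
The idea of tokens as "full words or word parts" mentioned in the transcript can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the vocabulary `VOCAB` is invented, and the greedy longest-match rule in `tokenize` is a simplification. Real systems learn much larger vocabularies from data, and this is not how ChatGPT's tokenizer is implemented.

```python
# Hypothetical vocabulary, invented for this example; real systems learn
# theirs from data and have many thousands of entries.
VOCAB = {"un", "believ", "able", "result", "s", " ", "magic"}


def tokenize(text, vocab):
    """Split text into the longest matching vocabulary pieces, left to right."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # No vocabulary piece matches: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("unbelievable results", VOCAB))
```

The model then predicts the next token rather than the next letter, so each step of generation can add a whole word or word part at once.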