Machine learning algorithms

Machine learning (ML) is a type of algorithm that improves automatically through experience, rather than through a programmer rewriting it. The algorithm gains experience by processing more and more data and then adjusting itself based on the properties of that data.

Types of machine learning

There are many varieties of machine learning techniques, but here are three general approaches:
  • reinforcement learning: The algorithm learns, by trial and error, to choose the actions that earn the most reward. Often used by game-playing AI or navigational robots.
  • unsupervised machine learning: The algorithm finds patterns in unlabeled data by clustering and identifying similarities. Popular uses include recommendation systems and targeted advertising.
  • supervised machine learning: The algorithm analyzes labeled data and learns how to map input data to an output label. Often used for classification and prediction.
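To make the supervised case concrete, here is a minimal sketch of mapping input data to an output label. It uses a nearest-neighbor lookup, which is a stand-in technique chosen for brevity and is not the method this article goes on to describe. The animal measurements and the names `TRAINING_DATA` and `predict` are made up for illustration.

```python
# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
TRAINING_DATA = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    def squared_distance(example):
        example_features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, example_features))

    _closest_features, closest_label = min(TRAINING_DATA, key=squared_distance)
    return closest_label


print(predict((24.0, 3.8)))   # a small, light animal -> cat
print(predict((58.0, 28.0)))  # a large, heavy animal -> dog
```

The labels in the training data are what make this "supervised": the program never receives a rule for telling cats from dogs, only examples with the answer attached.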
Let's dive into one of the most common approaches to understand more about how a machine learning algorithm works.

Neural networks

An increasingly popular approach to supervised machine learning is the neural network. A neural network operates similarly to how we think brains work, with input flowing through many layers of "neurons" and eventually leading to an output.
Diagram of a neural network, with circles representing each neuron and lines representing connections between neurons. The network starts on the left with a column of 3 neurons labeled "Input". Those neurons are connected to another column of 4 neurons, which itself connects to another column of 4, and those neurons are labeled "Hidden layers". The second hidden layer of neurons is connected to a column of 3 neurons labeled "Output".
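As a rough sketch of the structure in the diagram, the layers can be written as a list of sizes. The code assumes every neuron connects to every neuron in the next layer, which the diagram suggests but the text does not state. `LAYER_SIZES` and `count_connections` are illustrative names.

```python
# Layer sizes taken from the diagram: 3 inputs, two hidden layers of 4, 3 outputs.
LAYER_SIZES = [3, 4, 4, 3]


def count_connections(layer_sizes):
    """Count connections, assuming each neuron links to every neuron in the next layer."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


print(count_connections(LAYER_SIZES))  # 3*4 + 4*4 + 4*3 = 40
```

Under that assumption, even this tiny network has 40 connections, and each one carries its own weight.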

Training a network

Computer programmers don't actually program each neuron. Instead, they train a neural network using a massive amount of labeled data.
The training data depends on the goal of the network. If its purpose is to classify images, a training data set could contain thousands of images labeled as "bird", "airplane", etc.
A grid of images in 10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck).
Images from the CIFAR10 training data set. Image source: CIFAR10
The goal of the training phase is to determine weights for the connections between neurons that will correctly classify the training data.
A diagram of a neural network classifying an image of a plane. Parts of the image are fed into the first layer of neurons, those neurons lead to a middle layer, and those neurons lead to a final layer of neurons. Each edge between neurons is labeled with a question mark, denoting an unknown weight.
The weights between the neurons are unknown (labeled with a "?" here), and the neural network wants to find weights that will result in classifying each image correctly.
The neural network starts off with all the weights set to random values, so its initial classifications are way off. It learns from its mistakes, however, and eventually comes up with a set of weights that do the best job at classifying all of the training data.
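The article does not say which procedure adjusts the weights. As one hedged illustration, the sketch below trains a single threshold neuron with the classic perceptron learning rule: start from random weights, and after every wrong answer nudge each weight in the direction that would have reduced the error. The tiny data set, the `bias` term (which plays the role of a threshold), and the learning rate are all assumptions made for this example. Real image classifiers use far larger networks and different update rules.

```python
import random

# Hypothetical labeled data: two inputs -> label 1 only when both inputs are 1.
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def classify(weights, bias, inputs):
    """Output 1 if the weighted sum plus bias is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(data, learning_rate=0.1, max_passes=1000, seed=0):
    """Perceptron rule: adjust the weights only when an example is misclassified."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(2)]  # random starting weights
    bias = rng.uniform(-1, 1)
    for _ in range(max_passes):
        mistakes = 0
        for inputs, label in data:
            error = label - classify(weights, bias, inputs)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, bias


weights, bias = train(TRAINING_DATA)
print([classify(weights, bias, inputs) for inputs, _ in TRAINING_DATA])  # [0, 0, 0, 1]
```

The loop stops as soon as a full pass over the training data produces no mistakes, which mirrors the idea of settling on weights that classify the training data correctly.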
A diagram of a neural network classifying an image of a plane. Parts of the image are fed into the first layer of neurons, those neurons lead to a middle layer, and those neurons lead to a final layer of neurons. Each neuron shows a value (from 0 to 1). In the final layer, the neuron labeled "plane" has the highest value.
Each of the connections between neurons is assigned a weight (represented by shades of green). A neuron multiplies each connection weight by the value of the input neuron, and sums up all of those to come up with a single number (shown on each neuron). The neuron will only send its value to the next layer if it's above a threshold.
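The calculation described above can be sketched in a few lines. This is a simplified illustration: the function names, the example numbers, and the choice to represent "not sending a value" as 0.0 are assumptions, not details from the article.

```python
def neuron_output(inputs, weights, threshold):
    """Multiply each input by its connection weight, sum, and apply the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    # Assumption: a neuron that stays at or below the threshold passes on 0.0.
    return total if total > threshold else 0.0


def layer_output(inputs, weight_rows, threshold):
    """Compute one layer: each row of weights belongs to one neuron in the layer."""
    return [neuron_output(inputs, weights, threshold) for weights in weight_rows]


inputs = [1.0, 0.5, 0.0]
weight_rows = [
    [0.5, 0.5, 0.5],  # 0.5 + 0.25 + 0.0 = 0.75, above the threshold
    [0.0, 0.5, 1.0],  # 0.0 + 0.25 + 0.0 = 0.25, below the threshold
]
print(layer_output(inputs, weight_rows, threshold=0.5))  # [0.75, 0.0]
```

Feeding the output list of one layer in as the `inputs` of the next is how a value moves through the whole network.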

Using the network

When the neural network is asked to classify an image, it uses the learned weights and outputs the possible classes along with a score for how likely each one is.
Diagram of a neural network, with circles representing each neuron and lines representing connections between neurons. The network starts on the left with an image of a fox. The image is broken into 4 parts, and those parts are connected to a column of 4 neurons, which itself connects to another column of 4. The second column is connected to 3 possible outputs: "Fox (0.85)", "Dog (0.65)", and "Cat (0.25)".
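The three scores in the diagram add up to more than 1, so they behave like independent confidence scores rather than a single probability distribution. The sketch below picks the top-scoring class and, as one common option, rescales the scores with a softmax so they sum to 1. Whether the network in the diagram uses softmax is not stated, so treat that step as an assumption.

```python
import math

# Scores copied from the three outputs in the diagram.
SCORES = {"Fox": 0.85, "Dog": 0.65, "Cat": 0.25}


def best_class(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


def softmax(scores):
    """Rescale scores into positive values that sum to 1, preserving their order."""
    exps = {label: math.exp(value) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


print(best_class(SCORES))  # Fox
probabilities = softmax(SCORES)
print(round(sum(probabilities.values()), 6))  # 1.0
```

Either way, the predicted class is the same: the label with the largest score.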

Accuracy

The accuracy of a neural network is highly dependent on its training data, both the amount and diversity. Has the network seen the object from multiple angles and lighting conditions? Has it seen the object against many different backgrounds? Has it really seen all varieties of that object? If we want a neural network to truly understand the world, we need to expose it to the huge diversity of our world.
Companies, governments, and institutions are increasingly using machine learning to make decisions for them. They often call it "artificial intelligence," but a machine learning algorithm is only as intelligent as its training data. If the training data is biased, then the algorithm is biased. And unfortunately, training data is biased more often than it's not.
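One simple check that can hint at this kind of problem is counting how often each label appears in the training data. The sketch below uses a made-up list of labels. A lopsided count is only one form of bias and a balanced count does not prove the data is fair, so this is a starting point rather than a test for bias.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of the data set that carries each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical training labels: 90 dog images and only 10 cat images.
labels = ["dog"] * 90 + ["cat"] * 10
print(label_shares(labels))  # {'dog': 0.9, 'cat': 0.1}
```

A network trained on data like this sees nine dogs for every cat, so it has far less evidence about what cats look like.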
In the following articles, we'll explore the ramifications of letting machines make decisions for us based on biased data.

Want to join the conversation?

  • zleahy247
    Can you do a simple video course on writing neural networks in JavaScript?
    (23 votes)
  • gspersonal.com
    Please reveal the equations required to adjust the weights. Thx.
    (4 votes)
  • gavin.dunbar
    How does the AI know which node to pass the information to?
    (2 votes)
    • Jeremy Nielsen
      That's what the learning part of machine learning is for. Some neural networks are trained through thousands of slightly different iterations, using Darwinian (evolutionary) algorithms: each group keeps its best-performing method, which is then slightly modified into a new group of iterations. The process repeats to get the "perfect model", which is, of course, not perfect. But given tens or hundreds of thousands, maybe even millions, of total iterations over hundreds of "generations", it tends to come out well, as long as the training data is solid.
      (4 votes)
  • akarthik6
    Can you make a simple video course on making neural networks in Python?
    (3 votes)
  • coliam
    In the labeled data sets, how is a dog distinguished from a wolf? Are the pictures reduced to mathematical points, for example, the distance between the eyes or the length of the ears?
    (2 votes)
  • Boyd Jayden
    Can you do a simple video course on writing neural networks in JavaScript?
    (1 vote)
  • danahm1
    I have so many unanswered questions.
    (1 vote)
  • Ameen Ahmad
    What is a hidden layer in a neural network used for? And how do you determine how many layers you need for a particular model? Can you please explain with an example?
    (1 vote)
  • England, Damon
    In the labeled data sets, how is a dog distinguished from a fox? Are the pictures reduced to mathematical points, for example, the distance between the eyes or the length of the ears?
    (0 votes)
  • G. Tarun
    Hi! Does Khan Academy use machine learning and AI anywhere on the platform? Does the Course Mastery system, and the decisions about which questions to display in an attempt of the Course Challenge, Unit Test, or Mastery Challenge, use ML or AI? Does the LSAT and SAT practice use machine learning? I'm looking for concrete examples of ML and AI that I engage with every day. And Khan Academy is the best place to start!
    (0 votes)
    • Jme
      The course mastery system is a basic algorithm: It presents questions (each tagged with the practice it's from), and at the end it goes through all the practices. It raises your level on the ones you did well on and lowers your level on the ones you did badly on. If you got all the questions right, it raises your level twice on every practice that was asked about.

      Well, that's how it works for the quizzes, unit tests, and course tests. The individual quizzes are much simpler. If you get a certain share of the questions right (I think it's 75% or above), your level is set to one bar. If you get all the questions right, your level is set to two bars.

      At least, that's from my experience. I don't actually know Khan Academy's internal code, but that's the algorithm that makes the most sense with what I've seen.
      (1 vote)