User Datagram Protocol (UDP)

The User Datagram Protocol (UDP) is a lightweight data transport protocol that works on top of IP.
UDP provides a mechanism to detect corrupt data in packets, but it does not attempt to solve other problems that arise with packets, such as lost or out-of-order packets. That's why UDP is sometimes known as the Unreliable Data Protocol.
UDP is simple but fast, at least in comparison to other protocols that work over IP. It's often used for time-sensitive applications (such as real-time video streaming) where speed is more important than accuracy.
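To get a feel for how little ceremony UDP involves, here is a minimal sketch using Python's standard `socket` module. It sends a single datagram to a socket on the same machine and reads it back. The function name `udp_roundtrip`, the loopback address, and the one-second timeout are illustrative assumptions, not part of UDP itself.

```python
import socket


def udp_roundtrip(message: bytes, timeout: float = 1.0) -> bytes:
    """Send one datagram to a local UDP socket and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        # Port 0 asks the operating system to pick any free port.
        receiver.bind(("127.0.0.1", 0))
        # UDP makes no delivery promise, so never wait forever.
        receiver.settimeout(timeout)
        address = receiver.getsockname()

        # No handshake and no connection: the sender just fires the datagram.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(message, address)

        data, _sender_address = receiver.recvfrom(65535)
        return data


if __name__ == "__main__":
    print(udp_roundtrip(b"Hola"))
```

On a single machine the datagram will almost always arrive, but nothing in the code guarantees it. If the datagram were lost, `recvfrom` would simply time out.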

Packet format

When sending packets using UDP over IP, the data portion of each IP packet is formatted as a UDP segment.
Each UDP segment contains an 8-byte header and variable-length data.

Port numbers

The first four bytes of the UDP header store the port numbers for the source and destination.
A networked device can receive messages on different virtual ports, similar to how an ocean harbor can receive boats on different ports. The different ports help distinguish different types of network traffic.
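As a rough sketch of how software might read those first four bytes, the snippet below uses Python's standard `struct` module. The helper name `parse_ports` and the example header bytes are made up for illustration.

```python
import struct


def parse_ports(header: bytes) -> tuple:
    """Return (source_port, destination_port) from the first 4 header bytes."""
    # "!" means network (big-endian) byte order; "H" is an unsigned 16-bit integer.
    source_port, destination_port = struct.unpack("!HH", header[:4])
    return source_port, destination_port


# A made-up 8-byte header: source port 5385, destination port 53,
# followed by placeholder length and checksum bytes.
example_header = bytes([0b00010101, 0b00001001, 0b00000000, 0b00110101, 0, 12, 0, 0])

print(parse_ports(example_header))  # (5385, 53)
```

Since each port number gets two bytes, port numbers range from 0 to 65,535.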
Here's a listing of some ports in use by UDP on my laptop:
Each row starts with the name of the process that's using the port and ends with the protocol and port number.
🔍 What sort of network traffic do those processes handle? If you search the web for the process name plus the port number, you can probably figure it out. You could even try it on the computer you're using now.

Segment Length

The next two bytes of the UDP header store the length (in bytes) of the segment (including the header).
Two bytes is 16 bits, so the length can be as high as this binary number:
1111111111111111
In decimal, that's (2^{16} - 1) or 65,535. Thus, the maximum length of a UDP segment is 65,535 bytes.
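The arithmetic can be checked in a couple of lines of Python. The variable names are just for illustration. Because the length counts the 8-byte header too, the data portion has slightly less room than the full 65,535 bytes.

```python
HEADER_BYTES = 8         # size of the UDP header
LENGTH_FIELD_BITS = 16   # the length field is two bytes wide

# The largest 16-bit number is sixteen 1s in binary.
max_segment_bytes = int("1" * LENGTH_FIELD_BITS, 2)
assert max_segment_bytes == 2**16 - 1

# The length includes the header, so the data gets what is left over.
max_data_bytes = max_segment_bytes - HEADER_BYTES

print(max_segment_bytes)  # 65535
print(max_data_bytes)     # 65527
```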

Checksum

The final two bytes of the UDP header store the checksum, a field that's used by the sender and receiver to check for data corruption.
Before sending off the segment, the sender:
  1. Computes the checksum based on the data in the segment.
  2. Stores the computed checksum in the field.
Upon receiving the segment, the recipient:
  1. Computes the checksum based on the received segment.
  2. Compares the checksums to each other. If the checksums aren't equal, it knows the data was corrupted.
To understand how a checksum can detect corrupted data, let's follow the process to compute a checksum for a very short string of data: "Hola".
First, the sender would encode "Hola" into binary somehow. The following encoding uses the ASCII/UTF-8 encoding:
H        o        l        a
01001000 01101111 01101100 01100001
That encoding gives these 4 bytes:
01001000 01101111 01101100 01100001
Next, the sender segments the bytes into 2-byte (16-bit) binary numbers:
0100100001101111 0110110001100001
To compute the checksum, the sender adds up the 16-bit binary numbers:
  0100100001101111
+ 0110110001100001
------------------
  1011010011010000
The computer can now send a UDP segment with the encoded "Hola" as the data and 1011010011010000 as the checksum.
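The same steps can be sketched in Python. This follows the simplified scheme used in this article, not the full UDP algorithm. The function name `simple_checksum`, the zero-byte padding for odd-length data, and the truncation to 16 bits are assumptions made so that the sketch works for any input.

```python
def simple_checksum(data: bytes) -> int:
    """Add the data up as 16-bit numbers (the simplified scheme from this article)."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Treat each pair of bytes as one 16-bit big-endian number.
        total += int.from_bytes(data[i:i + 2], "big")
    return total & 0xFFFF  # assumption: keep only the low 16 bits so it fits the field


checksum = simple_checksum("Hola".encode("ascii"))
print(format(checksum, "016b"))  # 1011010011010000
```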
The entire UDP segment could look like this:
Field                     Value
Source port number        00010101 00001001
Destination port number   00010101 00001001
Length                    00000000 00001100
Checksum                  10110100 11010000
Data                      01001000 01101111 01101100 01100001
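Here is one way the segment could be assembled in Python with the standard `struct` module. The helper `build_segment` is hypothetical. It computes the length field as the 8 header bytes plus the data, following the definition in the Segment Length section, and it takes the simplified checksum from the walkthrough as given.

```python
import struct

HEADER_BYTES = 8


def build_segment(source_port: int, destination_port: int,
                  checksum: int, data: bytes) -> bytes:
    """Pack the four 2-byte header fields, then append the data."""
    length = HEADER_BYTES + len(data)  # the length counts the header too
    # "!" means network (big-endian) byte order; each "H" is one 16-bit field.
    header = struct.pack("!HHHH", source_port, destination_port, length, checksum)
    return header + data


segment = build_segment(
    source_port=0b0001010100001001,
    destination_port=0b0001010100001001,
    checksum=0b1011010011010000,  # simplified checksum from the walkthrough
    data=b"Hola",
)

print(len(segment))   # 12
print(segment.hex())  # 15091509000cb4d0486f6c61
```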
What if the data got corrupted from "Hola" to "Mola" on the way?
First let's see what the corrupted data would look like in binary.
"Mola" encoded into binary...
M        o        l        a
01001101 01101111 01101100 01100001
...and then segmented into 16-bit numbers:
0100110101101111 0110110001100001
Now let's see what checksum the recipient would compute:
  0100110101101111
+ 0110110001100001
------------------
  1011100111010000
The recipient can now programmatically compare the checksum they received in the UDP segment with the checksum they just computed:
  • Received: 1011010011010000
  • Computed: 1011100111010000
Do you see the difference?
When the recipient discovers that the two checksums are different, it knows that the data was corrupted somehow along the way. Unfortunately, the recipient cannot use the computed checksum to reconstruct the original data, so it will likely just discard the packet entirely.
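The recipient's side of the comparison might look like the sketch below, again using this article's simplified checksum. The names `simple_checksum` and `is_intact` are illustrative, and the padding and 16-bit truncation are assumptions.

```python
def simple_checksum(data: bytes) -> int:
    """Add the data up as 16-bit numbers (the simplified scheme from this article)."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length data with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return total & 0xFFFF  # assumption: keep only the low 16 bits


def is_intact(data: bytes, received_checksum: int) -> bool:
    """Recompute the checksum and compare it with the one in the header."""
    return simple_checksum(data) == received_checksum


received_checksum = 0b1011010011010000  # what the sender stored for "Hola"

print(is_intact(b"Hola", received_checksum))  # True
print(is_intact(b"Mola", received_checksum))  # False

# XOR marks exactly which bits of the two checksums differ.
difference = simple_checksum(b"Mola") ^ received_checksum
print(format(difference, "016b"))  # 0000110100000000
```

A mismatch only says that something changed. It does not say which bits, so a recipient like this one can do little more than drop the segment.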
The actual UDP checksum computation process includes a few more steps than shown here, but this is the general process of how we can use checksums to detect corrupted data.
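For the curious, here is a sketch of two of those extra steps: adding with "end-around carry" (ones' complement addition) and then flipping every bit of the result. The function names are illustrative. This is still not the complete UDP checksum, which also covers the UDP header and a pseudo-header containing the IP addresses. The sketch below covers only the data.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add 16-bit words, wrapping any overflow back into the low bits."""
    if len(data) % 2:
        data += b"\x00"  # odd-length data is padded with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
        # End-around carry: fold anything above 16 bits back in.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Flip every bit of the ones' complement sum (data only, no headers)."""
    return ~ones_complement_sum16(data) & 0xFFFF


checksum = internet_checksum(b"Hola")
print(format(checksum, "016b"))  # 0100101100101111

# The receiver sums the data together with the checksum.
# If nothing changed, every bit of the result is 1.
check = ones_complement_sum16(b"Hola" + checksum.to_bytes(2, "big"))
print(format(check, "016b"))  # 1111111111111111
```

Flipping the bits is what makes the receiver's check so simple: it adds up everything it received, checksum included, and looks for a result of all 1s.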
🙋🏽🙋🏻‍♀️🙋🏿‍♂️Do you have any questions about this topic? We'd love to answer—just ask in the questions area below!

Want to join the conversation?

  • Yizuhi Galaviz
    UDP doesn't do anything about packets arriving out of order, right? Is that why sometimes, in live streaming, the audio and video are not synchronized? Or why the live streams sometimes lag?
    (56 votes)
    • Martin
      Yes, UDP does without handshakes. That means the information received is somewhat unreliable when it comes to ordering, duplicates and packets arriving at all.
      And you correctly identified the problems that arise with that :)
      (50 votes)
  • layaz7717
    What might cause data to become corrupted? Also, when you get the notification that a file is corrupted, does it have the same meaning? The data has gotten messed up somehow?
    (9 votes)
    • Shane McGookey
      It might be helpful to consider what "data" is. In this context, we are talking about data that is transmitted over a network. If you are using a personal computer, then the process of transmitting data over a network involves transitioning data from your HDD or SSD to your computer's RAM via internal buses, and then to the NIC (Network Interface Card) to communicate it across a network. The data (which is represented using binary) must then traverse the connection between your device and the device you are sending data to. This traversal could entail moving the data over a WiFi network, over Ethernet cables, etc.

      This is an extensive process, and we certainly take for granted its complexity when we interact with the Internet each and every day. If during this process any bits of the data (again, data is being represented as binary; 1s and 0s) were to flip (go from 0 to 1, or 1 to 0) or were to be lost then the data received by the receiving machine wouldn't be the same as what was originally sent. In this case, the machine may not be able to interpret the data anymore - as it has lost its meaning - and so you end up with data corruption.

      To your latter question, yes. When information saved to the machine's non-volatile storage has been saved incorrectly and has thus lost its meaning, then the computer will notify you that the data has been corrupted.
      (25 votes)
  • layaz7717
    What exactly happens when videos start to glitch? For example, in Zoom meetings, sometimes people tend to look all "blocky" and the details in their video are not defined. Is that something to do with data corruption?
    (10 votes)
    • Martin
      Yes, that generally indicates issues with packets, and a lot of them. A single dropped packet can be easily dealt with because of error-correcting encodings; generally it will just cause a little blurriness, wobbly sound, or something similarly barely noticeable.
      (11 votes)
  • Dhairya Patel
    Is it possible for the data in the checksum (the two bytes) to be corrupt?
    (5 votes)
  • Aland Soran
    Are the terms package, packet, and segment referring to the same thing? If not, could you please define each one? Thank you.
    (2 votes)
    • KLaudano
      "Package" is an informal term that people seem to use in place of "packet".

      Segments, packets, and frames are created at different layers in the OSI model and each adds its own header to the data with more information. Segments are created at the transport layer and include port numbers. Packets are created at the network layer from segments and have IP addresses. Frames are created at the data link layer from packets and have MAC addresses.

      (I know you didn't ask about frames, but for the sake of completeness I felt it should be added.)
      (10 votes)
  • green_ninja
    Hi!

    I'm having trouble with adding binary numbers. I watched a few of Khan Academy's videos on YouTube, but I don't understand the mathematical reasoning behind it. Can someone please explain?

    Thanks!
    (5 votes)
    • washiwalajulius
      In decimal, the largest value a single digit can hold is 9. Once a sum goes past 9, you carry into the next column on the left, e.g.
      4
      +5
      = 9
      but
      1
      +9
      =10 (we passed the maximum single digit (9), so we go back to zero and carry 1 to the left, which gives 10)
      In binary (base 2) it's the same thing, except that the largest single digit is 1.
      1
      +0
      =1
      1
      +1
      =10 (remember, the largest value you can represent with a single digit in binary is 1, just like 9 in base 10 or decimal)
      After reaching the maximum, we carry 1 to the next digit on the left.
      Another binary example:
      10
      + 11
      =101
      Hope you can get it from here.
      One last example to cement it:
      111
      + 111
      = 1110
      Notice that 1+1+1 = 11, so we write 1 and carry 1 to the left,
      and 1+1 = 10, so we write 0 and carry 1 to the left.
      (1 vote)
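The carrying rule described above can be sketched as a short Python function that adds two binary strings one column at a time, from right to left. The name `add_binary` is made up for this illustration.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, carrying to the left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with leading 0s

    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry  # 0, 1, 2, or 3
        digits.append(str(column % 2))            # the digit written in this column
        carry = column // 2                       # 1 if the column overflowed
    if carry:
        digits.append("1")                        # a final carry adds a new column
    return "".join(reversed(digits))


print(add_binary("10", "11"))    # 101
print(add_binary("111", "111"))  # 1110
```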
  • Neev Badu
    (2 votes)
    • pamela ❤
      Good question! The 4 bytes is the width of the header. Together, the source port number and destination port number in the first row take up 4 bytes. Since they're shown equally sized, each of them takes up 2 bytes (16 bits). Similarly, the segment length and checksum together take up 4 bytes, and each takes up 2 bytes.
      (8 votes)
  • joseandresdurand
    If UDP performs the checksum when the recipient receives the data, is this calculation done for every segment it receives? Is it done in the routers? And finally, does this verification not affect the data transmission speed?
    (1 vote)
    • anonymous
      UDP (User Datagram Protocol) is a connectionless protocol used for transmitting data over a network. Unlike TCP, UDP does not include mechanisms for flow control, error recovery, or ensuring data delivery in order. Let's address your questions regarding UDP checksum, its verification, and its impact on data transmission speed:

      Checksum Calculation:
      When data is sent using UDP, a checksum is calculated over the data and included in the UDP header. This checksum is used to detect errors in the data during transmission. The checksum is calculated based on the data in each segment, along with some header information. Note that the value stored in the header is calculated by the sender before the data is transmitted; the recipient recomputes it on arrival only to compare against that stored value.

      Checksum Verification:
      Upon receiving a UDP segment, the recipient uses the checksum value in the UDP header to verify the integrity of the received data. If the calculated checksum at the receiver's end does not match the checksum in the header, it indicates that the data might have been corrupted during transmission. In such cases, the receiver may choose to discard the corrupted segment or take other appropriate actions based on the application's requirements.

      Checksum Calculation Frequency:
      The checksum is calculated for each individual UDP segment sent by the sender. Each segment is treated separately, and the checksum calculation is not done at the routers. Routers typically operate at the network layer (IP layer) and don't usually involve themselves with the details of transport-layer protocols like UDP.

      Impact on Data Transmission Speed:
      Calculating and verifying the checksum does add some overhead to the data transmission process, as it involves performing mathematical operations on the data. However, the impact on data transmission speed is relatively low compared to the benefits of error detection. UDP is often used in scenarios where speed and low latency are prioritized over the guaranteed delivery and error recovery mechanisms of TCP.

      That being said, the impact of checksum calculation and verification on data transmission speed can vary depending on factors such as the processing power of the sender and receiver, the network's speed, and the frequency at which data is being transmitted.

      In summary, UDP performs checksum calculation on the sender's side for each segment, and the checksum is verified on the recipient's side. While the checksum calculation does introduce some overhead, the impact on data transmission speed is generally manageable, especially in applications where real-time communication and low latency are important.
      (9 votes)
  • Morteza Saharkhiz
    Section Segment Length says that "Two bytes is 2^{16} bits" which is typo, right? 2 bytes is 16 bits and can store 2^{16} possible values.
    (3 votes)
  • MaryTheBest
    Can the checksum be corrupted, and if so, is there a way for the UDP to deal with that?
    (1 vote)
    • KLaudano
      Yes, any part of the packet could be corrupted, including the checksum. If the checksum were corrupted, the computer has no way of knowing whether the checksum was corrupted or the data was. All the computer would know is that the given checksum does not match the data's checksum, so the packet would likely be discarded.
      (6 votes)