Transmission Control Protocol (TCP)
The Transmission Control Protocol (TCP) is a transport protocol that is used on top of IP to ensure reliable transmission of packets.
TCP includes mechanisms to solve many of the problems that arise from packet-based messaging, such as lost packets, out of order packets, duplicate packets, and corrupted packets.
Since TCP is the protocol used most commonly on top of IP, the Internet protocol stack is sometimes referred to as TCP/IP.
Packet format
When sending packets using TCP/IP, the data portion of each IP packet is formatted as a TCP segment.
Each TCP segment contains a header and data. The TCP header contains many more fields than the UDP header and can range in size from 20 to 60 bytes, depending on the size of the options field.
The TCP header shares some fields with the UDP header: source port number, destination port number, and checksum. To remember how those are used, review the UDP article.
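To make the header layout more concrete, here is a minimal Python sketch that packs and unpacks the fixed 20-byte part of a TCP header using the standard `struct` module. The field order follows the TCP specification, but the helper name `parse_tcp_header` and the sample values (ports, window size) are made up for illustration, and any options after the first 20 bytes are ignored.

```python
import struct

# Fixed 20-byte part of the header: source port, destination port,
# sequence number, acknowledgement number, data offset + flags,
# window size, checksum, urgent pointer (all in network byte order).
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed header fields (hypothetical helper, options ignored)."""
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = TCP_HEADER.unpack_from(segment)
    data_offset = offset_and_flags >> 12  # header length in 32-bit words
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_length": data_offset * 4,  # 20 to 60 bytes
        "flags": [name for name, bit in FLAG_BITS.items()
                  if offset_and_flags & bit],
        "window": window,
        "checksum": checksum,
    }


# A made-up SYN segment header: data offset 5 (no options), SYN flag set.
sample = TCP_HEADER.pack(49152, 80, 1, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```

A data offset of 5 words corresponds to the minimum 20-byte header; the maximum of 15 words gives the 60-byte upper bound mentioned above.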
From start to finish
Let's step through the process of transmitting a packet with TCP/IP.
Step 1: Establish connection
When two computers want to send data to each other over TCP, they first need to establish a connection using a three-way handshake.
The first computer sends a packet with the SYN bit set to 1 (SYN = "synchronize?"). The second computer sends back a packet with the ACK bit set to 1 (ACK = "acknowledge!") plus the SYN bit set to 1. The first computer replies with an ACK.
The SYN and ACK bits are both flags in the TCP header.
In fact, the three packets involved in the three-way handshake do not typically include any data. Once the computers are done with the handshake, they're ready to receive packets containing actual data.
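The handshake can be sketched as three small records, one per packet. This is only a model of the exchange, not a real network stack: the `Segment` class, the function name, and the starting sequence numbers (100 and 300) are assumptions chosen for illustration. Real implementations pick their initial sequence numbers unpredictably. The sketch relies on one detail the text above does not spell out: a SYN counts as one sequence number, so it is acknowledged with "sequence number + 1".

```python
from dataclasses import dataclass


@dataclass
class Segment:
    syn: bool = False
    ack: bool = False
    seq: int = 0
    ack_num: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments of the handshake (illustrative model only)."""
    # 1. The first computer asks to synchronize.
    syn = Segment(syn=True, seq=client_isn)
    # 2. The second computer acknowledges that SYN and sends its own SYN.
    syn_ack = Segment(syn=True, ack=True, seq=server_isn, ack_num=syn.seq + 1)
    # 3. The first computer acknowledges the second computer's SYN.
    final_ack = Segment(ack=True, seq=client_isn + 1, ack_num=syn_ack.seq + 1)
    return [syn, syn_ack, final_ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```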
Step 2: Send packets of data
When a packet of data is sent over TCP, the recipient must always acknowledge what they received.
The first computer sends a packet with data and a sequence number. The second computer acknowledges it by setting the ACK bit and increasing the acknowledgement number by the length of the received data.
The sequence and acknowledgement numbers are both fields in the TCP header.
Those two numbers help the computers to keep track of which data was successfully received, which data was lost, and which data was accidentally sent twice.
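The arithmetic behind the acknowledgement number is small enough to show directly. In this sketch the function name `next_ack`, the starting sequence number of 1, and the payloads are assumptions for illustration. The modulo is there because the sequence and acknowledgement fields are 32 bits wide and wrap around.

```python
def next_ack(segment_seq: int, payload: bytes) -> int:
    """Acknowledgement number for a received segment: the next byte expected."""
    return (segment_seq + len(payload)) % 2**32


seq = 1
for payload in (b"Hello, ", b"world!"):
    ack = next_ack(seq, payload)
    print(f"sent seq={seq} len={len(payload)} -> recipient replies ack={ack}")
    seq = ack  # the sender's next segment starts where the last one ended
# prints:
# sent seq=1 len=7 -> recipient replies ack=8
# sent seq=8 len=6 -> recipient replies ack=14
```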
Step 3: Close the connection
Either computer can close the connection when they no longer want to send or receive data.
A computer initiates closing the connection by sending a packet with the FIN bit set to 1 (FIN = finish). The other computer replies with an ACK and another FIN. After one more ACK from the initiating computer, the connection is closed.
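The closing exchange can be written out as a short list of who sends which flag. This is a simplified model with hypothetical names (`closing_exchange`, "A", "B"). In practice the responder's ACK and FIN may travel in the same packet or be separated by more data.

```python
def closing_exchange(initiator: str, responder: str) -> list[tuple[str, str]]:
    """Return (sender, flag) pairs for the close described above (a sketch)."""
    return [
        (initiator, "FIN"),  # initiator has nothing more to send
        (responder, "ACK"),  # responder acknowledges the FIN
        (responder, "FIN"),  # responder is finished sending too
        (initiator, "ACK"),  # final acknowledgement; the connection is closed
    ]


for sender, flag in closing_exchange("A", "B"):
    print(f"{sender} sends {flag}")
```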
Detecting lost packets
TCP connections can detect lost packets using a timeout.
After sending off a packet, the sender starts a timer and puts the packet in a retransmission queue. If the timer runs out and the sender has not yet received an ACK from the recipient, it sends the packet again.
The retransmission may lead to the recipient receiving duplicate packets, if a packet was not actually lost but just very slow to arrive or be acknowledged. If so, the recipient can simply discard duplicate packets. It's better to have the data twice than not at all!
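Here is a small simulation of the timer and retransmission queue described above. It uses a simulated clock rather than real time, and everything about it is an assumption made for illustration: the function name, the fixed timeout of 3 ticks, the 1-tick acknowledgement delay, and the rule that a "lost" packet is dropped only on its first attempt. Real TCP stacks adapt their timeout to the measured round-trip time.

```python
def simulate_sender(packet_ids, dropped_once, timeout=3, ack_delay=1):
    """Simulate timeout-based retransmission and return a list of events.

    Packets listed in `dropped_once` are lost on their first transmission.
    Each event is a (tick, kind, packet_id) tuple.
    """
    events = []
    retransmission_queue = {}  # packet_id -> tick at which its timer expires
    attempts = {}              # packet_id -> number of transmissions so far
    acks_due = {}              # tick -> packet ids whose ACK arrives then
    clock = 0

    def transmit(packet_id):
        attempts[packet_id] = attempts.get(packet_id, 0) + 1
        retransmission_queue[packet_id] = clock + timeout  # start the timer
        events.append((clock, "send", packet_id))
        lost = packet_id in dropped_once and attempts[packet_id] == 1
        if not lost:
            acks_due.setdefault(clock + ack_delay, []).append(packet_id)

    for packet_id in packet_ids:
        transmit(packet_id)

    while retransmission_queue:
        clock += 1
        # An ACK removes the packet from the retransmission queue.
        for packet_id in acks_due.pop(clock, []):
            if packet_id in retransmission_queue:
                del retransmission_queue[packet_id]
                events.append((clock, "ack", packet_id))
        # Any packet whose timer ran out without an ACK is sent again.
        for packet_id, deadline in list(retransmission_queue.items()):
            if deadline <= clock:
                events.append((clock, "timeout", packet_id))
                transmit(packet_id)
    return events


for event in simulate_sender([1, 2, 3], dropped_once={2}):
    print(event)
```

In this run, packets 1 and 3 are acknowledged at tick 1, while packet 2 times out at tick 3, is sent again, and is acknowledged at tick 4.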
Handling out of order packets
TCP connections can detect out of order packets by using the sequence and acknowledgement numbers.
When the recipient sees a higher sequence number than what they have acknowledged so far, they know that they are missing at least one packet in between. For example, suppose the recipient sees a sequence number of #73 but expected a sequence number of #37. The recipient lets the sender know there's something amiss by sending a packet with an acknowledgement number set to the expected sequence number.
Sometimes the missing packet is simply taking a slower route through the Internet and it arrives soon after.
Other times, the missing packet may actually be a lost packet and the sender must retransmit the packet.
In both situations, the recipient has to deal with out of order packets. Fortunately, the recipient can use the sequence numbers to reassemble the packet data in the correct order.
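Reassembly by sequence number can be sketched in a few lines. The function name `reassemble` and the sample segments are hypothetical. The sketch also assumes that each sequence number marks the position of a segment's first byte and that segments never partially overlap, which keeps the logic short but is simpler than what a real TCP stack handles.

```python
def reassemble(segments, first_seq):
    """Join (sequence_number, payload) pairs in order, dropping duplicates.

    Returns the contiguous data received so far and the next expected
    sequence number, which is what the recipient would put in its ACK.
    """
    buffered = {}
    for seq, payload in segments:
        if payload:
            buffered.setdefault(seq, payload)  # a duplicate keeps the first copy

    data = bytearray()
    expected = first_seq
    while expected in buffered:  # stop at the first gap
        payload = buffered[expected]
        data += payload
        expected += len(payload)
    return bytes(data), expected


# Segments arrive out of order, and one arrives twice.
arrived = [(1, b"TCP "), (11, b"order"), (5, b"keeps "), (5, b"keeps ")]
print(reassemble(arrived, first_seq=1))      # (b'TCP keeps order', 16)

# With the middle segment still missing, the recipient keeps asking for #5.
print(reassemble(arrived[:2], first_seq=1))  # (b'TCP ', 5)
```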
Want to join the conversation?
- When handling out-of-order packets, how does sending the expected acknowledgement number indicate to the sender that something is amiss? How would the sender know if it had to re-send the packet if it was lost?(9 votes)
- Imagine you want to send the letters of the alphabet to a friend over the Internet.
You send ('a', 1), ('b', 2), ('c', 3), one by one to your friend. The numbers are used in case the packets/messages arrive out of order.
Now, suppose the friend gets ('b', 2), but then ('d', 4). It's missing 'c' because it expects a continuous increase of numbers and 3 is missing. So your friend asks you to resend the letter at position 3 (this is the idea behind the expected acknowledgement number).
As mentioned in the article, ('c', 3) may just be taking longer to arrive. In that case the sender ends up sending a duplicate message, but the receiver typically drops duplicates.
A helpful way to think about these numbers is that they synchronize the data so both parties have the same "view" of it.
Imagine if we didn't have a universal notion of time. Then, the sender's view of time would be different from the receiver's. Hence, they synchronize their "view" of time by communicating numbers.
Hope this helps!(33 votes)
- What does the article mean "setting the ACK bit and increasing the acknowledgement number by the length of the data received"?(6 votes)
- Say you want to send a message that's 32 bytes long.
So you check if the receiver is there. The receiver answers with ACK #1. That means something like "hullo I can hear you".
Now you send the message of 32 bytes and the receiver responds with ACK #33, which means "I got your message and received a total of 33 bytes from you". (The additional byte comes from the introduction.)
Now if you send another message that's 100 bytes long, the receiver would respond with ACK #133, meaning "I got 133 bytes from you".
That way you can see if the connection works without the receiver sending you the entire message back and you having to read it.(14 votes)
- Hi. Following up on Carita's question below. How does the sender know that a packet is missing if the recipient only responds with "Ack [expected packet number]"? Is an Ack for a missing packet somehow different from an Ack for a received packet to trigger the sender to resend the missing packet?(4 votes)
- Good question, this is a central concern in protocol development: how to deal with ambiguity.
A sender also keeps their sequence number to synchronize, so if they receive an Ack[packet_number] not matching their current sequence number, they can distinguish between a missing packet and sending new packets.
An Ack for a missing packet is not any different from an Ack for a received packet. It is the fact that the sender has their own current sequence number, which in some sense is their own "time", that enables them to distinguish.
Hope this helps!(7 votes)
- Hello,
I'm looking at the Transmission Control Protocol page (I'm trying to ask a question under that page, but all I'm seeing is the comments from the User Datagram Protocol page, so I'm not sure where this is actually going to post). It keeps mentioning a sequence number. What exactly is a sequence number, and how does it change as more data is sent? For example, in the last section, it has the numbers 1 and 37 for the sequence, and 73 and 37 for the acknowledgement. Where are these numbers coming from?(4 votes)
- I believe that these numbers represent different packets and the order they were sent in. For example, if you send 3 text messages, they're flagged as messages 1, 2, and 3 in the order they were sent.(3 votes)
- How can we find out whether we are using TCP or UDP?(2 votes)
- Wireshark is a free tool that enables you to inspect the Internet packets (UDP or TCP based) flowing in and out of your device. Here's a tutorial I used at some point to get started: (https://www2.cs.siu.edu/~cs441/lectures/Wireshark%20Tutorial.pdf)
Additionally, some well-known Internet activity defaults to a particular protocol. For instance, surfing the web generally uses TCP, whereas live streams often use UDP.
Hope this helps!(5 votes)
- Why bring in Transmission Control Protocol when it can lead to bigger problems than it's used to having?(2 votes)
- TCP gives a reliable network connection, ensuring that all packets arrive (if possible) and are assembled in the correct order. Generally, these benefits outweigh its extra network usage which is why TCP is usually used instead of UDP or just IP.(3 votes)
- "In the situation pictured above, the recipient sees a sequence number of #73 but expected a sequence number of #37.”
Why is the expected sequence number 37 in this case? Is the sequence number also incremented like the acknowledgment number based upon the bytes received in the data? Or is the sequence number incremented according to the sequence of the packets (which makes more sense here)?(3 votes)
- While the recipient is reassembling the packets, does it keep all those packets in the RAM?(3 votes)
- What is meant by the term "offset" mentioned in the TCP segment illustration?(3 votes)
- The example in the article works for two computers connected on a LAN, or for a computer that has accessed a website.
My question is: when I connect my computer to the internet but haven't opened any web browser or app, with which device or computer does my device perform the SYN handshake?
Is it a random computer, a router, or my ISP's server?(3 votes)
- When you turn on your PC, it starts services (or daemons, if you use Linux) that bring up the network stack. DHCP, for example, which acquires an IP address, starts as soon as your network port comes up (unless you have explicitly disabled it), and that usually happens at device startup.
So a device doesn't need a browser to communicate over a network. Many other programs require a network connection.(1 vote)