TCP Handshake
SYN (Synchronise sequence numbers)
The initiating client (let's call it Client A) sends a segment with the SYN flag set to the server (Server B). This SYN flag indicates an attempt to establish a connection. Additionally, Client A includes a sequence number (let's call it x), which is a random value generated by the client. This number is essential for the ordered exchange of data during the session.
Client A ---> (SYN, Seq=X) ---> Server B
SYN + ACK (Acknowledgment)
Upon receiving the SYN segment from Client A, Server B responds with a segment that has both the SYN and ACK flags set. The acknowledgement number is set to one more than the received initial sequence number (x + 1) from Client A. Server B also generates its own random sequence number (let's call it y).
Client A <--- (SYN-ACK, Seq=Y, Ack=X+1) <--- Server B
ACK (Acknowledgment)
Finally, Client A sends an ACK segment back to Server B. The sequence number is Client A's initial sequence number incremented by one (x + 1), and the acknowledgment number is one more than the sequence number received from Server B (y + 1).
Client A ---> (ACK, Seq=X+1, Ack=Y+1) ---> Server B
Now, both Client A and Server B have synchronised sequence and acknowledgment numbers which are used to manage data transfer throughout the session.
Each TCP segment's sequence number will refer to the byte in the stream of data that it's carrying, and each recipient will acknowledge bytes as they are received.
Once data transfer is complete, the connection is terminated using a similar exchange, typically initiated by a segment with the FIN (Finish) flag set.
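The seq/ack arithmetic above can be sketched in a few lines of Python. This is just a model of the numbers exchanged, not a real TCP stack; the ISNs x and y stand for the random initial sequence numbers each side picks.

```python
# Toy model of the three-way handshake arithmetic described above.
# x and y are the random initial sequence numbers (ISNs) chosen by
# Client A and Server B respectively.

def three_way_handshake(x, y):
    """Return the three segments as (sender, flags, seq, ack) tuples."""
    syn     = ("A", "SYN",     x,     None)   # Client A -> Server B
    syn_ack = ("B", "SYN-ACK", y,     x + 1)  # Server B acks A's ISN + 1
    ack     = ("A", "ACK",     x + 1, y + 1)  # Client A acks B's ISN + 1
    return [syn, syn_ack, ack]

for sender, flags, seq, ack in three_way_handshake(x=1000, y=5000):
    print(f"{sender}: {flags:8} Seq={seq} Ack={ack}")
```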
A TCP connection doesn't necessarily alternate packets between sender and receiver every time.
The TCP window size controls the back-and-forth mechanism of TCP.
The TCP window size is the amount of data that can be sent before an acknowledgement (ACK) is required. If the window size is large enough, the sender can send multiple packets without waiting for an ACK from the receiver. The receiver, in turn, can acknowledge multiple packets at once.
The number of segments that can be sent before requiring an acknowledgement is determined by the TCP window size. Its maximum value is 65,535 bytes, as the Window Size header field is 16 bits.
Other features of TCP:
TCP Window Scaling
A mechanism that allows for a larger "receive window", which is the buffer where received data is temporarily stored.
Using the Window Size field, a TCP endpoint (client or server) can advertise at most 2^16 − 1 = 65,535 bytes (just under 64 KiB) of buffer space to the other end.
This 16-bit field was enough during the initial design of the TCP in the 1980s, but modern devices have a lot more memory.
Solution: use a "scale factor" to convey a window size of up to 30 bits using the same 16-bit field in the TCP header. *The scale factor is carried in a TCP option named 'Window Scale' during the initial TCP handshake.
How it works:
The TCP "Window Size" field in the TCP header is 16 bits long, which means it can represent values up to 2^16 − 1 (65,535). This number is the number of bytes that a sender can transmit without receiving an acknowledgement – in other words, the size of the receive window.
To allow larger windows, TCP uses an option called Window Scaling, a factor used to multiply the value in the "Window Size" field. This scale factor is actually a shift count that tells us how many positions we need to shift the "Window Size" field value to the left.
- The Window Scaling option allows for a shift count up to 14.
- If we apply a shift count of 14 to the maximum "Window Size" field value of 65,535, we get 65,535 × 2^14 ≈ 2^30 bytes, which is just under 1 GiB.
So, in practice, when a sender wants to use a window of about 1 GiB, it would set the "Window Size" field to 65,535 and the shift count in the Window Scaling option to 14.
The receiver, knowing the scale factor, would then interpret the "Window Size" field as representing a window of roughly 1 GiB.
Window scaling up to a ~1 GiB window is really only useful in "long fat networks" (LFNs) that have large bandwidth-delay products (BDPs); such large windows are not commonly needed.
BDP is the product of a data link's capacity (in bits per second) and its round-trip delay time (in seconds). If the BDP is large, the system can send more data before waiting for acknowledgements, improving the network's efficiency and throughput.
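The scaling and BDP arithmetic above can be checked with a short sketch. The function name and the example link numbers (1 Gbit/s, 100 ms RTT) are illustrative, not from any real API.

```python
# Sketch of the window-scaling arithmetic: the advertised 16-bit value
# is shifted left by the scale factor negotiated in the handshake.

def effective_window(window_field, shift_count):
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("Window Size field is only 16 bits")
    if not 0 <= shift_count <= 14:
        raise ValueError("the shift count is capped at 14")
    return window_field << shift_count

print(effective_window(0xFFFF, 14))   # 1073725440 bytes, just under 1 GiB

# BDP example: a 1 Gbit/s link with a 100 ms round-trip time.
bandwidth_bps = 1_000_000_000
rtt_s = 0.100
bdp_bytes = int(bandwidth_bps * rtt_s / 8)
print(bdp_bytes)   # 12500000 bytes in flight -- far beyond 65,535
```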
TCP "Splitting"
Also known as TCP termination or TCP proxying, this is a technique that breaks a TCP connection into two separate connections: one between the client and the proxy, and the other between the proxy and the server.
Client > TCP Connection 1 > Server 🚫 ! Nuh uh !
Client > TCP Connection 1 > Proxy > Inspect > Proxy > TCP Connection 2 > Server ✅
- The client initiates a TCP connection to the server, but it's intercepted by the proxy.
- The proxy completes the TCP handshake with the client, establishing a TCP connection.
- The proxy then initiates a separate TCP connection to the server, completing a handshake independently of the client.
- Now there are two separate TCP connections: one from the client to the proxy, and one from the proxy to the server.
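The steps above can be sketched as a tiny relay on localhost. This is a minimal illustration assuming plain TCP (no TLS) and a single client; the function names are made up for the sketch.

```python
# Minimal sketch of a splitting proxy: one accepted client connection,
# one independent upstream connection, bytes relayed both ways.
import socket
import threading

def pipe(src, dst):
    """Relay bytes from src to dst until src signals EOF."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)   # propagate the half-close
    except OSError:
        pass

def serve_one(listen_sock, server_addr):
    """Accept one client and relay it to the real server through a
    second, independent TCP connection (two separate handshakes)."""
    client, _ = listen_sock.accept()                   # connection 1: client <-> proxy
    upstream = socket.create_connection(server_addr)   # connection 2: proxy <-> server
    t = threading.Thread(target=pipe, args=(upstream, client))
    t.start()
    pipe(client, upstream)   # the proxy could inspect traffic here
    t.join()
    client.close()
    upstream.close()
```

Because the proxy completes its own handshake with the client before (or independently of) the server, it can also ACK the client's data on its own schedule, which is where the latency benefit below comes from.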
An advantage of this is that each TCP connection can be optimised independently based on its specific network conditions. For example, the proxy can quickly acknowledge packets to the client, allowing the client to send more data without waiting for acknowledgements from the server. This is beneficial in high-latency networks where waiting for server acknowledgements could significantly slow down traffic.
TCP splitting can interfere with certain applications or protocols that rely on TCP's specific behaviors. This includes:
TCP Timestamps
Proxy alteration of TCP timing can disrupt round-trip time measurements and protection against wrapped sequence numbers. 😟
Selective Acknowledgement (SACK)
Proxies might not correctly relay SACK information during TCP splitting, affecting efficient recovery from packet loss. Boo.
Transport Layer Security (TLS)
TCP splitting may pose security concerns by necessitating the termination and re-establishment of TLS connections, essentially causing a man-in-the-middle operation.
This is common with banks and stuff. Very annoying if you don't allowlist those sites and accidentally proxy all traffic for users.
TCP Flow Control
This control is a mechanism that prevents the sender from overwhelming the receiver by sending too much data too quickly. This is done through acknowledgements and window sizes.
The receiver's operating system allocates part of the system's memory to temporarily hold incoming network data for the specific TCP connection until the application is ready to process it. This is called the receive buffer.
Establishing a Connection
When a TCP connection is established, both the sender and receiver agree on a window size.
Sending Data
When data is sent, the sender must wait for an acknowledgment from the receiver before it can send data beyond the window size.
Receiving Data
When the receiver gets the data, it sends an acknowledgement back to the sender. This acknowledgement includes a window update, which tells the sender how much more data can be sent. The window size might be decreased if the receiver is processing data slowly, or it might be increased if the receiver is ready to handle more data.
Adjusting the Window Size
If the receiver is unable to process incoming data quickly enough (maybe it's busy with other tasks, or its buffer is full), it can advertise a window size of 0. This tells the sender to stop sending data. Once the receiver is ready for more data, it can send a window update with a larger size, and the sender can resume transmitting.
The sender can send a probe segment to check if the receiver is ready to resume transmission; this mechanism is called the persist timer.
Basically, it's a way for the RECEIVER to slow down or speed up the rate at which the sender transmits, so the receiver is able to process all the incoming data properly and efficiently.
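The advertise/stall/resume cycle above can be modelled with a toy loop. The function and the window numbers are illustrative only; real TCP tracks this in bytes of in-flight data.

```python
# Toy model of flow control: each round, the receiver's ACK advertises
# a window, and the sender may transmit at most that many new bytes.
# A window of 0 means "stop sending until a window update arrives".

def transfer(data, advertised_windows):
    """Return the chunk the sender may transmit after each ACK."""
    chunks, pos = [], 0
    for win in advertised_windows:
        chunk = data[pos:pos + win]   # never exceed the advertised window
        chunks.append(chunk)
        pos += len(chunk)
    return chunks

# 4 bytes, then a zero-window stall, then transmission resumes.
print(transfer(b"abcdefghij", [4, 0, 4, 4]))
```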
Other Protocols just as an FYI:
Datagram Transport Layer Security (DTLS)
Operates at the Transport layer of the TCP/IP model, the same layer where TCP and UDP reside.
Based on the Transport Layer Security (TLS) protocol, but designed to work with datagram protocols like UDP too.
DTLS is commonly used to secure non-TCP traffic and provide CIA: confidentiality, integrity, and authentication.
DTLS implements a retransmission scheme for handling packet loss and reordering, which are common issues for UDP-like traffic.
It cannot guarantee delivery like TCP, but it does provide CIA (confidentiality, integrity, and authentication).
User Datagram Protocol (UDP)
Considered a connectionless protocol because it doesn't establish a connection before data transmission occurs.
- Lower Overhead compared to TCP, due to smaller headers, meaning a higher percentage of data transferred is the data payload instead of the header information.
- No Connection Setup: UDP does not establish a connection before transmitting data, which saves time.
- No congestion or flow control, which is good for real-time applications like video streaming and voice over IP.
- UDP supports broadcast and multicast transmission because it's a connectionless protocol.
Some applications may provide their own mechanisms for ensuring data reliability, making TCP's reliability features unnecessary.
Headers in UDP
- Source Port is an optional field; when meaningful, it indicates the port of the sending process and may be assumed to be the port to which a reply should be addressed in the absence of any other information.
- Destination Port indicates the port of the destination process.
- Length specifies the length in bytes of the entire datagram: header and data.
- Checksum is used for error-checking of the header and data.
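The four fields above make up the fixed 8-byte UDP header, each one a big-endian 16-bit integer. A quick sketch of packing and parsing it (the checksum is left at 0 here, which in IPv4 means "not computed"; the function names are made up for the example):

```python
# Pack and parse the 8-byte UDP header: source port, destination port,
# length (header + data, in bytes), checksum -- all 16-bit big-endian.
import struct

UDP_HEADER = struct.Struct("!HHHH")

def build_udp(src_port, dst_port, payload):
    length = UDP_HEADER.size + len(payload)   # header + data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_udp(datagram):
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src": src, "dst": dst, "length": length,
            "checksum": checksum, "payload": datagram[UDP_HEADER.size:length]}

dg = build_udp(5353, 53, b"hello")
print(parse_udp(dg))   # length is 8 + 5 = 13 bytes
```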