IP's role is to get a packet/datagram from the source host to the destination host.
IP is a packet delivery system: each packet carries some payload in its Data field.
The payload could be an ICMP message, but usually it is something more
interesting and important than error messages or Echoes: the data/message
of some networking application like Web, Email or Telnet.
This message is not carried raw in IP's Data field; it is carried inside another
layer of envelope: TCP or UDP.
So IP usually carries a TCP or UDP envelope.
Inside the Ethernet frame is the IP packet, inside of which is a TCP segment, inside
of which is the application's "message", which usually is itself an envelope defined by the application protocol.
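To make the nesting concrete, here is a minimal Python sketch of how a receiver can tell which envelope sits in IP's Data field: the IPv4 header has a Protocol field (byte offset 9) holding 1 for ICMP, 6 for TCP and 17 for UDP. The helper names, the addresses and the fake payload are made up for illustration, and the header checksum is left at 0 rather than computed.

```python
import socket
import struct

# IANA protocol numbers for the payload types mentioned in these notes.
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def build_ipv4_header(protocol, payload_len, src="192.168.0.101", dst="192.168.0.1"):
    """Pack a minimal 20-byte IPv4 header (illustrative; checksum not computed)."""
    version_ihl = (4 << 4) | 5            # IPv4, header length = 5 x 32-bit words
    total_length = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,     # version/IHL, DSCP/ECN, total length
        0, 0,                             # identification, flags/fragment offset
        64, protocol, 0,                  # TTL, protocol, header checksum (left 0)
        socket.inet_aton(src), socket.inet_aton(dst),
    )

def payload_protocol(packet):
    """Return (name of the inner envelope, bytes of IP's Data field)."""
    header_len = (packet[0] & 0x0F) * 4   # IHL counts 32-bit words
    proto = packet[9]                     # Protocol field
    return IP_PROTOCOLS.get(proto, f"other ({proto})"), packet[header_len:]

segment = b"pretend this is a TCP segment"
packet = build_ipv4_header(6, len(segment)) + segment
name, data = payload_protocol(packet)
print(name, data)
```

The same peeling step repeats one layer out (Ethernet frame to IP packet) and one layer in (TCP segment to application message).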
IP is an unreliable and connectionless datagram
service. Packets can be lost, delivered out of order, or even duplicated.
IP is a best-effort protocol: it tries to deliver its payload; if it
succeeds, good; if it fails, it doesn't care or do anything about it (except for maybe
an ICMP error message from a router).
IP is essentially one-way: fire the packet into the network and forget about it. It doesn't know what it's carrying.
So some additional value-added service/protocol is needed to deal with these issues.
That's TCP.
(For realtime and streaming applications, e.g. VoIP and video, and for
some short request-and-response protocols like DHCP, DNS, SNMP and NTP, these
issues aren't dealt with by the transport layer; these applications use UDP.)
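A short request and response over UDP might look like the following Python sketch. It runs both ends on the loopback address inside one process purely for illustration; the function name and the "upper-case the request" reply are invented. Note there is no connection setup and no acknowledgement: if a datagram were lost, the only symptom would be the receive timing out.

```python
import socket

def udp_request_response(message: bytes) -> bytes:
    """Send one datagram to a local 'server' socket and return its reply."""
    # Port 0 asks the OS to pick any free port for the server side.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    try:
        # Fire the datagram: no handshake, no delivery guarantee.
        client.sendto(message, server.getsockname())

        # Server side: read the request and answer the sender's address.
        request, client_addr = server.recvfrom(2048)
        server.sendto(request.upper(), client_addr)

        # Client side: wait for the reply (raises a timeout error if none arrives).
        reply, _ = client.recvfrom(2048)
        return reply
    finally:
        client.close()
        server.close()

print(udp_request_response(b"what time is it?"))
```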
TCP turns the network into a reliable and
connection-oriented service that guarantees delivery of the message by
using a complex system of acknowledgements and sequence numbers.
It also performs flow control and congestion control (even more complex).
Positive acknowledgment with retransmission is used to guarantee reliability of data transfers.
The receiver responds with an acknowledgment message as it receives the data
(or after it receives several segments' worth of data;
this is the flow control feature of TCP, specified by the automatically-adjusted "window" size).
The sender keeps a record of each segment it sends.
The sender also maintains a timer from when the segment was sent,
and retransmits a segment if the timer expires before the message has been acknowledged.
This timeout value is adjusted by TCP to accommodate fluctuating network traffic congestion
and is a key factor in the functioning of the network (and Internet).
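The send / wait for ACK / retransmit-on-timeout loop can be sketched as a small Python simulation. This is a deliberately simplified stop-and-wait model, not how a real TCP stack is written: real TCP keeps many segments in flight under the window and adapts its timeout. The function name, the lost_attempts parameter and the log strings are all invented for illustration.

```python
def send_reliably(segments, lost_attempts, max_tries=5):
    """Simulate positive acknowledgment with retransmission.

    segments:      payloads to deliver, in order.
    lost_attempts: set of (segment_index, try_number) pairs that the
                   simulated network drops (so no ACK comes back).
    Returns (delivered_payloads, event_log).
    """
    delivered = []
    log = []
    for index, payload in enumerate(segments):
        for attempt in range(1, max_tries + 1):
            log.append(f"send seg {index} try {attempt}")
            if (index, attempt) in lost_attempts:
                # No ACK arrives, the sender's timer expires, so it retransmits.
                log.append(f"timeout seg {index}")
                continue
            # The segment arrived and the receiver's ACK came back in time.
            delivered.append(payload)
            log.append(f"ack seg {index}")
            break
        else:
            raise TimeoutError(f"segment {index} never acknowledged")
    return delivered, log

# Segment 1 is dropped on its first two tries, then gets through.
delivered, log = send_reliably(["a", "b", "c"], {(1, 1), (1, 2)})
print(delivered)
print("\n".join(log))
```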
Web, email, telnet/ssh remote login, ftp are some application protocols
that use TCP. The goal is accurate "file transfer" activity, where every
bit and byte of the message must be delivered.
The segments of a (multi-segment) message must arrive and be reassembled in order before
the data is given to the application.
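Reassembly by sequence number can be sketched in a few lines of Python. The function below is illustrative only: it treats each segment's sequence number as a byte offset starting at 0, assumes a duplicate carries the same bytes as the original, and hands over only the contiguous prefix received so far.

```python
def reassemble(segments):
    """segments: iterable of (sequence_number, data) in arrival order.

    Returns (in_order_bytes, next_expected_sequence_number).
    """
    buffered = {}
    for seq, data in segments:
        if data:
            # Keep the first copy; a duplicate segment is simply ignored.
            buffered.setdefault(seq, data)

    stream = b""
    next_seq = 0
    # Deliver bytes only while there is no gap in the sequence space.
    while next_seq in buffered:
        data = buffered[next_seq]
        stream += data
        next_seq += len(data)
    return stream, next_seq

# Out-of-order arrival plus one duplicate still yields the original bytes.
arrivals = [(5, b" worl"), (0, b"hello"), (5, b" worl"), (10, b"d")]
print(reassemble(arrivals))

# With a gap (bytes 2-3 missing), only the first two bytes can be handed over.
print(reassemble([(0, b"he"), (4, b"o")]))
```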
TCP is another 'envelope' and so has a header with various fields, including
a Data field that carries the payload (the [portion of] web page, email, file etc).
To identify the application/protocol
of the payload, ports are used. (These are software constructs that are completely
different from hardware NIC ports.) The TCP header's Source port field indicates the sending
application; the Destination port field indicates which application at the destination
host this TCP segment is intended for.
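The two port fields are the first four bytes of the TCP header (16 bits each, network byte order), so they are easy to pull out. The Python sketch below does that for a hand-built sample header; the helper names and the sample values (client port 49152, a SYN to port 80) are invented for illustration, and the small port table only lists the servers named in these notes.

```python
import struct

# Well-known server ports named in these notes.
WELL_KNOWN = {21: "ftp", 23: "telnet", 25: "smtp", 80: "http"}

def parse_tcp_ports(segment: bytes):
    """Return (source_port, destination_port) from the start of a TCP header."""
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    return src_port, dst_port

def describe(segment: bytes) -> str:
    src, dst = parse_tcp_ports(segment)
    service = WELL_KNOWN.get(dst, "unknown service")
    return f"from port {src} to port {dst} ({service})"

# Hand-built 20-byte TCP header: ports, seq, ack, data offset, flags (SYN),
# window, checksum (left 0), urgent pointer.
sample = struct.pack("!HHIIBBHHH", 49152, 80, 0, 0, 5 << 4, 0x02, 65535, 0, 0)
print(describe(sample))   # from port 49152 to port 80 (http)
```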
The analogy is that the IP network is the street (with a router at each intersection),
the IP host is the house, and the ports are the doors (up to 64K, i.e. 65,536,
of them). Behind some doors is a particular application. Servers listen
at a well-known port (e.g. web HTTP server at port 80, telnet server at port 23,
ftp at port 21, SMTP email server at port 25) so that a client knows to whom
(i.e. which server/listener)
to send the message.
Clients such as a web browser or telnet client are assigned an unused port number
that will be used just for the duration of this networking activity.
These short-lived ports are called ephemeral ports.
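Ephemeral port assignment can be observed with a few lines of Python. This sketch runs a throwaway listener and a client on the loopback address in one process. As an assumption for the demo, the listener binds to port 0 (let the OS choose) instead of a well-known port, since binding ports below 1024 usually needs administrator rights. The function name is made up.

```python
import socket

def show_ephemeral_port():
    """Return (server_port, client_port, client_port_as_seen_by_server)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # demo listener on an OS-chosen port
    server.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The client never picks its own port: the OS assigns an unused
        # (ephemeral) one at connect time.
        client.connect(server.getsockname())
        conn, peer = server.accept()
        conn.close()
        return server.getsockname()[1], client.getsockname()[1], peer[1]
    finally:
        client.close()
        server.close()

server_port, client_port, seen_by_server = show_ephemeral_port()
print("server listened on", server_port)
print("client was assigned", client_port, "and the server saw", seen_by_server)
```

Running it several times typically shows a different client port each time.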
The combination of an IP address and a port is called a socket, e.g. 192.168.0.101:80
The TCPs on the two hosts establish a connection so they "know" they are communicating
back and forth. They can determine that all data sent was received.
The analogy is establishing a conversation, conversing back and forth, then ending the conversation.
A host can be in many simultaneous conversations (i.e. TCP connections)
with another or many other hosts.
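What keeps simultaneous connections apart is the pair of sockets at the two ends: (local IP, local port, remote IP, remote port). The Python sketch below opens two connections from one process to the same listener on the loopback address and prints both 4-tuples; the function name is invented and the listener uses an OS-chosen port for the demo.

```python
import socket

def open_two_connections():
    """Open two TCP connections to one listener; return each one's 4-tuple."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    clients = [socket.socket(socket.AF_INET, socket.SOCK_STREAM) for _ in range(2)]
    accepted = []
    try:
        for client in clients:
            client.connect(listener.getsockname())
            accepted.append(listener.accept()[0])
        # (local IP, local port) + (remote IP, remote port) for each client.
        return [c.getsockname() + c.getpeername() for c in clients]
    finally:
        for sock in clients + accepted + [listener]:
            sock.close()

first, second = open_two_connections()
print(first)
print(second)
```

Both connections share the same server socket; only the client-side ports differ, and that is enough for TCP to tell the two conversations apart. These are the same address:port pairs that netstat lists.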
netstat           show all connections
netstat -n        numerically rather than with port names and DNS names
netstat -f        don't truncate the names to fit column width
netstat -a        show connections and listening servers
netstat -aob      w/PID and program
netstat -ano      show all connections and listeners numerically w/PID
netstat -p TCP    restrict to TCP (no UDPs)
1. SYN: The active open is performed by the client sending a SYN to the server.
The client sets the segment's sequence number to a random value A.
2. SYN-ACK: In response, the server replies with a SYN-ACK.
The acknowledgment number is set to one more than the received sequence number i.e. A+1
(a SYN segment is considered to have one byte of "ghost data"),
and the sequence number that the server chooses for the packet is another random number, B.
3. ACK: Finally, the client sends an ACK back to the server.
The sequence number is set to the received acknowledgement value i.e. A+1,
and the acknowledgement number is set to one more than the received sequence number i.e. B+1.
Both the client and server have received an acknowledgment of the connection.
Steps 1 and 2 establish the connection parameter (sequence number) for one direction and it is acknowledged.
Steps 2 and 3 establish the connection parameter (sequence number) for the other direction and it is acknowledged.
With these, a full-duplex communication is established.
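The sequence and acknowledgment arithmetic of the three steps above can be written out as a small Python sketch. It only models the numbers, not real packets; the function name and the dictionary layout are invented for illustration. Sequence numbers are 32-bit values, so the "+1" wraps around at 2^32.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits wide and wrap around

def three_way_handshake(a, b):
    """Return the three handshake segments for client ISN a and server ISN b."""
    syn = {"flags": "SYN", "seq": a, "ack": None}
    # The SYN counts as one byte of "ghost data", hence A+1.
    syn_ack = {"flags": "SYN-ACK", "seq": b, "ack": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],                      # A+1
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,    # B+1
    }
    return [syn, syn_ack, ack]

for segment in three_way_handshake(a=1000, b=5000):
    print(segment)
```

With A = 1000 and B = 5000 the final ACK carries seq 1001 and ack 5001.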
NB. Wireshark by default displays relative sequence numbers, so the initial sequence numbers are reported as 0 for our ease-of-use.
The sequence numbers increment by the number of data payload bytes that are sent in each direction
by subsequent segments.
The two sequence numbers are independent.
TCP is more about sending bytes than packets.
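How the two independent counters advance can be traced with a short Python sketch. It uses relative numbers (each side starts at 1 right after the handshake, as Wireshark would show) and assumes every segment arrives in order with nothing lost. The function name and the tuple layout are invented for illustration.

```python
def trace_sequence_numbers(exchanges):
    """exchanges: list of (sender, payload_bytes), sender is 'client' or 'server'.

    Returns rows of (sender, seq, ack, payload_bytes) using relative numbers.
    """
    next_seq = {"client": 1, "server": 1}   # first data byte in each direction
    rows = []
    for sender, nbytes in exchanges:
        receiver = "server" if sender == "client" else "client"
        seq = next_seq[sender]       # number of the first byte in this segment
        ack = next_seq[receiver]     # next byte expected from the other side
        rows.append((sender, seq, ack, nbytes))
        next_seq[sender] += nbytes   # advance by bytes sent, not by packets
    return rows

# Client sends a 100-byte request, server a 500-byte reply, client 50 more bytes.
for row in trace_sequence_numbers([("client", 100), ("server", 500), ("client", 50)]):
    print(row)
```

The client's third segment starts at seq 101 (it has sent 100 bytes) and acknowledges 501 (it has received 500 bytes).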
A TCP connection is full-duplex: both ends can be sending and receiving simultaneously
(but usually it is running half-duplex?)