DOUG LLOYD: In this video we're going to talk about the Transmission Control Protocol, TCP. If you haven't watched the video on Internet Protocol, IP, you may wish to do so before watching this video, because the two are pretty interrelated. So, the Internet Protocol, again, as a quick summary, is the protocol that moves information from a sending machine to a receiving machine through the network.
So what's TCP? Well, just moving information from a sending machine to a receiving machine isn't the full story. We also know that our computers, for example, are running multiple programs, and have multiple services running on those machines. And so, if we want to get a packet, or information, to a specific program on a specific machine, we need more than IP, which only gets information from point A to point B. So, TCP can be thought of as directing the packet to the correct program, or the correct service, on the receiving machine. And so it's important, as you might expect, to know where the packet is supposed to go and what it's for at the same time.
And so, frequently, when you talk about Transmission Control Protocol, TCP, you really often hear it in the context TCP slash IP, or just TCP/IP. These two protocols are so interrelated that they're basically treated as a single unit. But they are two separate protocols that do two separate things. Again, IP is responsible for getting data from one machine to another. And TCP is responsible for getting it to the correct program, or the correct service, on a machine. And it does something else that IP doesn't do, which is guarantee delivery.
So, we now couple a machine's IP address with a so-called port number, and a port number is how a specific service, or utility, or program is identified on a machine. If we have an IP address plus a port number, we can uniquely identify a particular service running on a particular machine.
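To make the IP address plus port number idea concrete, here is a minimal Python sketch using the standard `socket` module. It starts a tiny service on the loopback address 127.0.0.1, lets the operating system pick a free port, and then connects to that exact pair. The function name `demo_address_pair` and the message are hypothetical, chosen only for illustration.

```python
import socket


def demo_address_pair(message: bytes = b"hello") -> tuple[tuple[str, int], bytes]:
    """Send `message` to a service named by an (IP address, port number) pair."""
    # The "service": a TCP socket bound to an IP address and a port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.settimeout(5)
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        address = server.getsockname()  # the (ip, port) pair naming this service

        # The "sender": connects to that exact (ip, port) pair.
        with socket.create_connection(address, timeout=5) as client:
            client.sendall(message)
            conn, _ = server.accept()
            with conn:
                conn.settimeout(5)
                received = b""
                # Keep reading until the whole message has arrived.
                while len(received) < len(message):
                    chunk = conn.recv(1024)
                    if not chunk:
                        break
                    received += chunk
    return address, received


if __name__ == "__main__":
    address, received = demo_address_pair()
    print("service address:", address)
    print("received:", received)
```

The IP address alone would only name the machine; it is the port in the second half of the pair that picks out this one service on it.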
So that's why TCP and IP are so frequently interrelated, because a port number on its own doesn't really mean anything. You need the port number and the machine that you're talking about: what machine is supposed to be using this particular port, for example.
The other thing that TCP does, as I said, is guarantee delivery. So, in addition to specifying the port number, it also indicates how many packets the data has been split into. And it orders those packets so they can be reconstructed on the receiving machine, even if they're received in a different order than they were sent. Which can happen because IP is a connectionless protocol, and so different packets can take different paths through the system.
Some of these port numbers are very commonly used, and they've been standardized across pretty much all computers. So something called FTP, the File Transfer Protocol, which is used to transmit files, as you might expect, from one machine to another, uses port 21 conventionally. Email, SMTP, uses port 25. DNS, the Domain Name System, which we talked about in our internet primer video, uses port 53. If you're ever browsing the web, you're pretty much always using port 80, unless you're browsing the web securely, in which case you're using port 443.
So what's this TCP/IP process? What's happening with both of these protocols together? Well, let's talk about it. When a program wants to send data, TCP helps break it into chunks, and communicates those packets to the computer's network software. So it takes the data and wraps information around it that indicates what port it's supposed to go to, and where that packet falls in the order of all the packets. So maybe packet one of 10, two of 10, three of 10, and so on. IP gets those data chunks that have been wrapped with TCP, and wraps more information around them about where the packet is supposed to go. We might call this the IP layer surrounding the packet.
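The wrapping just described can be sketched in Python. This follows the simplified "packet one of 10" model used in this video rather than the real header formats, and the class names `TCPLayer` and `IPLayer`, the field names, the `wrap` helper, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TCPLayer:
    port: int    # which service on the receiving machine
    index: int   # this packet's position, counting from 1
    total: int   # how many packets the data was split into
    data: bytes  # the chunk of data in the middle


@dataclass(frozen=True)
class IPLayer:
    source_ip: str       # where the packet came from
    destination_ip: str  # which machine should get the packet
    payload: TCPLayer    # the TCP-wrapped chunk inside


def wrap(data: bytes, port: int, index: int, total: int,
         source_ip: str, destination_ip: str) -> IPLayer:
    """Wrap a chunk in a TCP layer, then wrap that in an IP layer."""
    tcp = TCPLayer(port=port, index=index, total=total, data=data)
    return IPLayer(source_ip=source_ip, destination_ip=destination_ip, payload=tcp)


if __name__ == "__main__":
    # Packet one of 10, headed for the web service (port 80) on another machine.
    packet = wrap(b"GET /", 80, 1, 10, "192.0.2.1", "192.0.2.2")
    print(packet)
```

The data sits inside the TCP layer, and the TCP layer sits inside the IP layer, so each layer only has to answer its own question: which service, or which machine.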
So, it's sort of like one of those nesting dolls. We have the data in the middle, and then the TCP layer on top of it, saying where the data inside is supposed to go: to what port, or what service, on a machine. Around that is the IP layer: what IP address, what machine, is actually getting this. So then, that packet that's been wrapped with all those layers is sent via Internet Protocol through the system of routers, getting from point A to point B.
When the receiving machine, or device, gets it, it looks at the IP layer, and it says, yup, that's my IP address. So it sort of cracks the egg and takes off the IP layer. Then it sees that there's a TCP layer, and it says, OK, looks like this is going to port x, or port y. And apparently it's packet number eight of 15. So that's good to know. So then it can take that information, take off the TCP layer, now knowing that it's for port x and it's packet number eight, and get at the data inside. And it can prepare the data to be organized in the correct way. And once all of the data is received, TCP can hand it off to the correct service, and say, here you go. Here's the data that you received.
That process might look something like this. So let's send an email from a sender to a receiver. And let's say this email is pretty small, so we only need to break it into four packets, and we'll call them A, B, C, and D. Well, we want to move that first packet. What happens? Well, we take that chunk of data, the data that is part of packet A, and we're going to wrap it with a TCP layer. Emails, you may recall, are sent via port 25, and we have four chunks of data here that we're going to be using, and this is the first of them. So maybe our TCP layer contains information about, well, we're going to port 25, and this is packet number one of four.
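The TCP layer for each of the four packets might be sketched like this in Python. It is only an illustration of the "one of four" labeling in this example: the function name `split_into_packets`, the dictionary keys, and the stand-in email contents are hypothetical.

```python
import math

SMTP_PORT = 25  # the conventional email port mentioned above


def split_into_packets(message: bytes, count: int, port: int) -> list[dict]:
    """Split `message` into `count` chunks, each labeled "index of total"."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Size of each chunk, rounded up so that `count` chunks cover the message.
    size = max(1, math.ceil(len(message) / count))
    chunks = [message[i * size:(i + 1) * size] for i in range(count)]
    return [
        {"port": port, "index": i + 1, "total": count, "data": chunk}
        for i, chunk in enumerate(chunks)
    ]


if __name__ == "__main__":
    email = b"AAAABBBBCCCCDDDD"  # stand-in for packets A, B, C, and D
    for packet in split_into_packets(email, 4, SMTP_PORT):
        print(packet)
```

Every packet carries the same port and the same total, and only the index and the data differ, which is what lets the receiver put them back in order later.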
Around that, now that we have all that information bundled up together, we're going to say where we want it to go: what machine, what IP address, is supposed to get this packet. And that's part of the IP layer. And there's other information in there as well, such as the return address, so that in case something goes wrong, the receiver knows where to send information back, and so on. But the IP layer goes around all of that. That entire thing is bundled together as one big unit and sent through an IP transfer. So it gets routed through the router network, using Internet Protocol. And the receiver receives the entire thing.
And then it can start to deconstruct what's happening here. It looks at the IP layer, the outside layer of this data, and says, yep, that's my IP address, so we can discard that. It can kind of ignore it, it doesn't need it anymore, and it can look one level deeper. It sees that, OK, this is data that is intended to be received on port 25. It's apparently the first part of four. So, I'm going to keep that in mind, and look at the data, and slot it roughly where I think it's going to go.
Now, because of the Internet Protocol, it's not necessarily the case that the next packet the receiver gets is packet two. In fact, the next thing the receiver gets might be packet number three, because these packets took different paths because of different traffic on the network. And so, I'm not going to go through the diagram of building it up again, but packet three moves, gets stripped of all of its layers, the IP layer and the TCP layer, and the data gets put in the right spot. And then, let's say it receives packet four.
Now let's say that's it, it doesn't get any more data. What is it going to do? IP doesn't do anything for us. But TCP does. TCP knows, well, I've received one of four, three of four, and four of four. I'm not getting any more data. So something has gone wrong. But I can guarantee delivery. I know that packet number two is missing.
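How the receiver can tell that packet two is missing might be sketched like this in Python. It assumes the simplified "index of total" labels from this example; the function names `place_in_slots` and `missing_packets` and the stand-in data are hypothetical.

```python
from typing import Optional


def place_in_slots(arrivals: list[tuple[int, bytes]], total: int) -> list[Optional[bytes]]:
    """Put each arriving chunk in the slot for its packet number (1-based)."""
    slots: list[Optional[bytes]] = [None] * total
    for index, data in arrivals:
        slots[index - 1] = data
    return slots


def missing_packets(slots: list[Optional[bytes]]) -> list[int]:
    """Return the packet numbers whose slots are still empty."""
    return [i + 1 for i, data in enumerate(slots) if data is None]


if __name__ == "__main__":
    # Packets one, three, and four arrive; packet two never does.
    arrivals = [(1, b"AAAA"), (3, b"CCCC"), (4, b"DDDD")]
    slots = place_in_slots(arrivals, total=4)
    print(missing_packets(slots))  # prints [2]
```

Because every packet says how many packets there are in total, the receiver knows how many slots to expect, and any slot that stays empty names exactly the packet to ask for again.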
And so TCP can now make a request, sort of in the reverse direction, bundling up its request in much the same way and sending it via IP, which, I know, could lead to some sort of infinite loop of everybody dropping packets on the way. But suffice it to say that TCP says, I'm missing a packet. I need to send information back to the sender. Fortunately the sender's IP address is sort of bundled up in the IP layer. It's the return address on the envelope. And so it says, I'm missing packet number two, can you please resend it.
When the sender receives that information, it doesn't have to send the entire email again. It only needs to send the individual piece that was missing, so it can send just packet number two. And when the receiver gets it, now TCP says, I have all four pieces of data that I need. So, I can assemble them together, and take this entire block of information and pass it along to port 25, where it will be interpreted as an email. And in this way we've now sent an email from sender to receiver using TCP/IP.
So, as I said, if at any point along the way something went wrong, TCP can deal with it. It can request that the missing information be sent to it again. And once it has all the packets, it can organize them, reconstruct the message, and deliver it to the correct service.
So that's TCP in a nutshell. That's how we guarantee delivery of information. Remember that TCP frequently works with IP, so these two protocols really do go hand in hand. We discussed them in several videos here because they do different things, but they're so interrelated that you'll usually use them together. I'm Doug Lloyd. This is CS50.
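The resend-and-reassemble step from the email example can be sketched in Python as a small simulation. It makes one simplifying assumption that the video only hints at: the resent packet always arrives. The function name `deliver` and the stand-in data are hypothetical.

```python
def deliver(chunks: list[bytes], dropped: set[int]) -> tuple[list[int], bytes]:
    """Simulate sending numbered chunks where the packets in `dropped` are lost once.

    Returns the packet numbers the receiver had to ask for again, and the
    message it reassembled at the end.
    """
    total = len(chunks)
    received: dict[int, bytes] = {}

    # First pass: everything arrives except the dropped packets.
    for index, data in enumerate(chunks, start=1):
        if index not in dropped:
            received[index] = data

    # The receiver asks only for what is missing ...
    resend_requests = [i for i in range(1, total + 1) if i not in received]

    # ... and the sender resends just those packets, not the whole message.
    for index in resend_requests:
        received[index] = chunks[index - 1]

    # With every piece in hand, put them back together in order.
    message = b"".join(received[i] for i in range(1, total + 1))
    return resend_requests, message


if __name__ == "__main__":
    requests, message = deliver([b"AAAA", b"BBBB", b"CCCC", b"DDDD"], dropped={2})
    print(requests)  # prints [2]
    print(message)   # prints b'AAAABBBBCCCCDDDD'
```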