BRIAN YU: Let's begin by thinking about how it is that computers and other devices communicate with one another over the internet. Presumably when this happens, one device is sending some sort of message over the internet to another device. And that other device is responding back with some sort of message. And computers and other devices all across the internet are performing this process, sending and receiving messages, whether those messages are emails or web pages or chat messages or something else. But how does this all happen? Well, it turns out that computers and other devices on the internet have standardized on a set of protocols, some basic rules to follow that govern how it is that these devices should be communicating with one another. And one of these protocols is called TCP/IP. TCP stands for Transmission Control Protocol, and IP stands for Internet Protocol. And these are the protocols that determine how it is that computers are able to communicate with one another over the internet. How does this work? Well, you can think of it as analogous to sending a letter in an envelope, for example. We have an envelope here that we want to send from one computer to another. Or you can think of it more physically, as sending from one address to another. What information needs to go on that envelope for us to be able to know who we're sending information to and to make sure that envelope can get to the right place? Well, on a real envelope, you need the name and the address of the recipient. And you'll also put on the envelope the sender and the sender's address. So when you're sending physical mail, you include the address of the person who you want to receive the message, as well as the address of the person sending the message. On the internet it's very similar, except instead of physical addresses, computers and other devices on the internet have IP addresses-- internet protocol addresses. 
And these often take the form of some number, dot some other number, dot some other number, dot some other number-- four numbers separated by dots. And so you might imagine that on this digital envelope that we're sending from one computer on the internet to another computer on the internet, we might have on the face of it 1.2.3.4, the IP address of the computer that we want to be receiving this message. And then we include our own IP address, the IP address of the sender. In this case, 5.6.7.8. Of course, this envelope contains more information than just this. But at minimum, it definitely needs some address that we're sending information to and some address that we're sending information from. These four numbers that make up the IP address, #.#.#.#, all range from 0 to 255, so from 0.0.0.0 to 255.255.255.255. And if that number 255 sounds familiar in binary, the number 255, as you might recall, is really just eight ones one after another, which means each of these numbers inside of an IP address is represented by eight bits of information from all eight zeros to all eight ones. There are four numbers. So four numbers with eight bits each means that in total, we have 32 bits with which to work with IP addresses. And if you recall, 32 bits means we can count as high as about 4 billion, which means there can only ever be 4 billion addresses that are available on the internet. 4 billion devices that are connected to the internet that have addresses to which we can send information or that have addresses that can be the senders of information themselves. And while 4 billion feels like it's a pretty big number-- the number that you can represent using 32 bits worth of information-- in practice now with the internet so ubiquitous and with billions of people, many of whom have computers but not just computers, but also phones and other devices that are all connected to the internet, we're fast running up against this 32-bit limit. 
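The arithmetic behind the four-numbers-of-eight-bits idea can be sketched in a few lines of Python. This is only an illustration: the function name `dotted_quad_to_int` is made up for this example, and the standard library's `ipaddress` module is used purely as a cross-check.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack the four 0-255 numbers of an IPv4 address into one 32-bit integer."""
    parts = [int(piece) for piece in address.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for part in parts:
        # Shift the bits collected so far left by 8, then add the next 8 bits.
        value = (value << 8) | part
    return value


# 255 in binary is eight ones in a row.
print(format(255, "08b"))

# The destination address from the envelope example, as a single number.
print(dotted_quad_to_int("1.2.3.4"))

# The highest address is one less than 2**32, so there are 2**32 addresses,
# a little over 4 billion.
print(dotted_quad_to_int("255.255.255.255") == 2**32 - 1)
print(2**32)

# Cross-check against the standard library's own IPv4 parsing.
print(dotted_quad_to_int("5.6.7.8") == int(ipaddress.IPv4Address("5.6.7.8")))
```

Each pass through the loop makes room for eight more bits, which is why four numbers give exactly 32 bits.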
This notation of #.#.#.#, where each number is 8 bits, is known as IPv4, version 4 of the internet protocol, using 32-bit addresses. And it's because of that limit that we now have a newer protocol called IPv6 that uses 128-bit addresses instead of 32-bit addresses, allowing for far, far more addresses that can be available on the internet. IPv4 is still pretty common. But many devices now are transitioning to IPv6 for the addresses that they use in order to communicate with one another over the internet. But what other information do we need in addition to just the address of the person that we're sending information to? You might imagine that if a computer has some IP address that allows other people to send them messages over the internet, that computer might be receiving a whole bunch of different types of messages. That computer might be receiving emails that are traveling over the internet. That computer might be receiving web pages. It might be receiving file transfers from other devices. And the computer needs some way of knowing how to distinguish between all these different types of packets of information. Is this packet of information an email, or is this packet of information a web page? This is important, relevant information that this device needs to know about. So in order to solve that problem, we assign each of the different services-- things like email or web pages or file transfers-- a number. And that number is called a port number. And so some common port numbers are 21 for FTP, the File Transfer Protocol that allows you to transfer files over the internet, 25 for SMTP, commonly used for email, and 80 for HTTP, which you might be familiar with, which is for sending messages over the internet in the form of web pages, for example. And there are many other ports that are used as well. 
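As a rough sketch of how a port number tells a device which kind of service a message is meant for, the three ports mentioned above can be placed in a small lookup table. The names `WELL_KNOWN_PORTS` and `service_for_port` are made up for this illustration, and the table is deliberately incomplete.

```python
# Only the three ports named in the lecture; many others exist.
WELL_KNOWN_PORTS = {
    21: "FTP",   # file transfers
    25: "SMTP",  # email
    80: "HTTP",  # web pages
}


def service_for_port(port: int) -> str:
    """Return the service conventionally associated with a port number."""
    return WELL_KNOWN_PORTS.get(port, "unknown")


for port in (21, 25, 80, 12345):
    print(port, service_for_port(port))

# For comparison, the size of the IPv4 and IPv6 address spaces.
print(2**32)
print(2**128)
```

The port does not change where the message goes; it only tells the receiving device which service should handle it.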
So what really goes on the envelope, then, is not just an IP address of the destination of who you want to be sending a message over the internet to but also a port number. So it might look something like this-- 1.2.3.4 is the IP address of who it is that you want to send a message to, colon, and then the port number 80, in this case, indicating HTTP, which means we're sending a web page from one computer on the internet to another computer on the internet. But these IP addresses are not what you and I would normally interact with when we're typing in URLs into our web browser in order to try to access a web page. Indeed, we're not really typing IP addresses. You more commonly are probably typing a URL-- something like this-- http://www.example.com that tries to specify which website it is that you want to visit. But if we've just established that all these devices on the internet are identified by their IP address and not by the URL that looks like this, how is it that our web browser knows that when we go to www.example.com, what IP address should the computer be trying to connect to? In order to solve that, we use what's called DNS or the domain name system. And what DNS really is is it's really just a mapping between URLs-- things like google.com or harvard.edu and yale.edu with their corresponding IP address. It would be pretty annoying if every time you wanted to visit a website, you needed to know the IP address of the server on which that website was being served from in order to type in those exact numbers. It's much easier to remember something like harvard.edu or google.com. And so DNS is a bunch of servers-- these DNS servers that exist on the internet that know for any particular URL what IP address does it correspond to. And so when you type something like google.com into your web browser, your web browser can check with the DNS server and say, what is the IP address of google.com? 
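The lookup a browser performs can be sketched with a toy table standing in for a DNS server. Everything here is an assumption made for illustration: the table `TOY_DNS_TABLE` and its single entry are invented (1.2.3.4 is just the example address used on the envelope), and `destination_for` is a hypothetical helper. Note that it is the host name inside the URL, such as www.example.com, that gets looked up. A real program might ask the operating system to do the lookup instead, for example with `socket.gethostbyname`.

```python
from urllib.parse import urlsplit

# Made-up mapping from host names to IP addresses, standing in for DNS.
TOY_DNS_TABLE = {"www.example.com": "1.2.3.4"}

# Port to use when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80}


def destination_for(url: str) -> str:
    """Turn a URL into the 'IP address:port' that goes on the envelope."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host name in URL: {url!r}")
    ip_address = TOY_DNS_TABLE[parts.hostname]       # the "DNS lookup"
    port = parts.port or DEFAULT_PORTS[parts.scheme]  # explicit port wins
    return f"{ip_address}:{port}"


print(destination_for("http://www.example.com"))
```

With the toy table above, the URL from the lecture resolves to the same 1.2.3.4:80 destination described earlier.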
And once it knows the IP address, then your computer is able to communicate and say, let me go to google.com and request google.com's web page from there. So that's DNS: this way of taking these URLs and translating them into the corresponding IP addresses so that we can take a URL, like http://www.example.com, and figure out where on the internet is the server that is going to serve us that web page. But let's now take a closer look at this first part of the URL, http. And HTTP, as it turns out, is yet another protocol that exists on the internet, in this case standing for Hypertext Transfer Protocol. The hypertext here is the same hypertext as in HTML, a markup language which we're going to see shortly. But what HTTP is all about really is what's inside of each of those envelopes. When you send an envelope from one device on the internet to another device, sure, on the outside it's labeled by the IP address of the person you're trying to communicate with and your own IP address as the sender. But what's inside of the envelope? What is the content of my request to a web server when I'm trying to get a web page? And what is the content of that response that comes back when I am trying to get information back from a particular web server? Well, when I make a request, it might look a little something like this. So inside of that envelope, we might see content like this where the first word you see here is GET. This is what we call a request method. And in this case it just means I'm trying to get a web page. Next up is this slash, the second part, which specifies what particular resource on the website I'm connecting to I want to get back. And so many websites, like google.com, for example, have a /search or a /settings or any number of other pages that I might want to try to access. Some may also have images that I might want to get. 
And so this slash in this case is specifying that I just want the root of the website-- whatever I would get to if I went to, in this case, www.example.com. But as you might imagine, I might go to www.example.com/ followed by something else to get a particular page. And that would go in the second parameter here. Following that is HTTP/1.1. This is specifying which version of the HTTP protocol I'm using in order to communicate with this host. In this case, I'm using version 1.1, which is quite common. Nowadays version two is also pretty common. But you'll still see HTTP version 1.1 around quite a bit. And beneath that I'm connecting to a particular host-- something like www.example.com as the host that I want to communicate with. It's possible that the web server that I'm communicating with is actually holding multiple hosts altogether and is balancing between them. And so in my request, I need to specify what host I want to connect to. In this case, www.example.com. And there's more information that comes in the request other than that. But the key idea here is that I am trying to get a page from a particular host. I specify what page it is that I want to access in addition to specifying which version of the HTTP protocol I'm using in order to try and make this request to the website. What then comes back to me when www.example.com receives my request and wants to send something back to me, the person who requested this page in the first place? Well, the response might look something like this. Again, starting with HTTP/1.1, the version of the HTTP protocol that is being used in order to communicate information. Next up is this number 200. This is what we would call a status code. Every time HTTP gives you a response, it's going to come along with some code that specifies how the response was resolved. And 200 just means, as the word immediately following it would describe, everything was OK. We were able to successfully give you back some sort of response. 
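The request described above, with its method, path, protocol version, and host, can be written out as plain text. The sketch below is a minimal illustration only: `build_get_request` and `parse_request_line` are hypothetical helper names, and a real browser sends many more header lines than the single Host line shown here.

```python
def build_get_request(host: str, path: str = "/") -> str:
    """Build the text of a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"  # method, resource, protocol version
        f"Host: {host}\r\n"         # which host on the server we want
        "\r\n"                      # blank line marks the end of the headers
    )


def parse_request_line(request: str) -> tuple[str, str, str]:
    """Split the first line of a request into method, path and version."""
    request_line = request.split("\r\n", 1)[0]
    method, path, version = request_line.split(" ")
    return method, path, version


request = build_get_request("www.example.com")
print(request)
print(parse_request_line(request))
print(parse_request_line(build_get_request("www.example.com", "/search")))
```

Changing the `path` argument is all it takes to ask for a different page on the same host.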
What type of response came back? Well, that comes on the second line of the response. Content-Type: text/html means the response that came back to me is some HTML, some markup that's ultimately going to represent a web page. But more on HTML in just a moment. The response has more information again than just that, likely the actual content of the HTML that's coming back to me. But in short, this response is specifying the version of the HTTP protocol, the status code that came back, 200, meaning OK, and then the content type-- what type of information is coming back to me. In this case, it was HTML. But it very well might have been a text file or an image or any other information that might have been transferred over the internet. This status code 200, meaning OK, maybe isn't something you've seen before. But certainly if you've used the internet, there are other status codes that you're probably more familiar with. For example, 404 might be pretty common, which means Not Found. If you try to request a page that just isn't there, HTTP says that the server is going to respond with a status code of 404, meaning that it wasn't found. And there are other status codes as well, like 301, which means Moved Permanently. In other words, this page has moved somewhere. And so you'll often be redirected from one page to another page when you receive a status code of 301. 403 is a status code meaning Forbidden. For example, if you try to access a page that you don't have permission to see or that you need to log in to see first, you'll very often see a status code of 403, meaning you don't have permission to be able to see this particular resource. And status code 500 generally means an internal server error. Whoever programmed the server that is accepting these requests and responding with information likely has a bug in their code, for example, that might result in an error that occurs. And so that's when a status code of 500 gets sent back as well. 
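To make the shape of a response concrete, here is a small sketch that pulls the version, status code, and content type out of a response written as text. The sample response and the helper name `parse_response_head` are made up for illustration; the names of the status codes come from Python's standard `http.HTTPStatus` enumeration.

```python
from http import HTTPStatus


def parse_response_head(raw: str) -> tuple[str, int, str, dict[str, str]]:
    """Return (version, status code, reason phrase, headers) of a response."""
    head = raw.split("\r\n\r\n", 1)[0]  # everything before the blank line
    lines = head.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


# A hand-written sample response, not captured from a real server.
sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
version, code, reason, headers = parse_response_head(sample)
print(version, code, reason, headers["content-type"])

# The status codes mentioned in the lecture, with their standard names.
for status in (200, 301, 403, 404, 500):
    print(status, HTTPStatus(status).phrase)
```

The loop prints the standard phrase for each code, such as Not Found for 404 and Moved Permanently for 301.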
And there are many other status codes. But these are just some of the more popular ones. And status code 200 is what comes back all the time when you request a page, and you're able to view it successfully. The web browser usually doesn't show you the status code number 200 just because everything went well, so there wasn't a reason to show you anything in addition. But as we'll see in just a moment, we can take a look at what's actually happening in the network-- what messages are being sent-- what responses are coming back. And in those responses, we'll often see a status code of 200 which will indicate to us that everything was indeed OK. So let's take a look at a real example and actually try to open up a web browser and see what happens when we make a web request when we type in a URL and try and visit a web page. I'll go ahead and open up Google Chrome, although this feature is available in other web browsers as well. I'll go up into the View menu. And I'll go to Developer and then Developer Tools. And then over here on the right-hand side, I'm going to go to the Network tab, which is going to allow me to monitor all of the network traffic-- all of the requests and responses that are coming from my web browser and being sent back to my web browser. So we go to the URL bar. And let me just visit a website, like google.com, for example. And we'll take a look at what happens after this web page loads. All right, so once google.com loads, you'll notice here in the Network tab a lot of things have shown up. But I'm going to ignore most of that and just scroll up to the very top where I'll notice that I had an initial request for google.com which is what I typed in. And I'll click on that just to take a look at some more details. Here I see the headers. And if I scroll down, I can take a look at the Request Headers, which was the information that I sent to Google's servers when I tried to request this web page. 
And I can also see above that the Response Headers, the information that came back to me when Google responded with this web page. I'll go ahead and click the View Source button here just to see what was inside of this request. And you'll notice right here we see HTTP/1.1, meaning version 1.1 of the HTTP protocol. And the response code that seems to have come back to me is 301, move permanently-- one we've seen before that means I've been redirected to somewhere else. Well, I typed in google.com. So where have I been redirected to? Well, if we look immediately below it on the second line, we see location: and then http://www.google.com. So it seems that whenever I try to visit google.com, Google is redirecting me to www.google.com instead. And this is a fairly common convention where www stands for worldwide web. And many web applications are hosting their website on www.something, for example, although strictly speaking, that isn't required. So I've been redirected to www.google.com. And we can see that here I have www.google.com. This was the next request that my web browser made because it saw that I'd been redirected somewhere else. Here, 307, Internal Redirect just means I've been redirected again. Where have I been redirected to? Well, it looks like here, I've been redirected to https://www.google.com. So https, the s standing for secure, just means that Google wants me to connect to their website securely. And indeed, more and more websites nowadays are using https, a version of the protocol that allows for information to be encrypted when it's transferred between one computer and another over the internet just to make sure that that connection is more secure. And so now, I'll go to the next request that my computer made. Here, the URL that I was requesting is now https://www.google.com. And now, finally, the status code is 200, meaning everything was OK. So what happened here? I typed google.com into my web browser. 
And I got a 301 response code, meaning that I'd been redirected to somewhere else, www.google.com. When I went there, I made a request to that URL and got another response, which was that I'd been redirected to https://www.google.com. And when I made a request to that URL, now and only now do I see that I get a status code of 200 meaning that everything was OK. And a lot of other resources have come back, as well, so when you load Google's web page, it's likely loading other scripts or other images or other fonts, for example, that all need to be loaded in order to display this web page. But the key here is that anytime you make a request over the internet, it's using this HTTP protocol that allows me to make a request to a particular URL. And when the response comes back, I can actually see what's going on using the Network tab of Google Chrome or similar features for other web browsers that you might be using in order to get an understanding for what exactly is coming back from this web server. And as we'll see later on in this track, we'll start to build web applications of our own where we're writing these web servers and deciding what status code should be returned? What is the content that should be returned back to the user? When they make a request to slash or to slash something else or to any other page that might exist on our web application. So ultimately, there are a lot of protocols at play here in determining how it is that devices on the internet, be that your computer or phones or tablets are able to communicate with other devices on the internet. 
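The chain of redirects seen in the Network tab can be imitated with a toy table of responses. This is a simulation only: no network request is made, `TOY_RESPONSES` is hard-coded from the three steps described in the demonstration, and `follow_redirects` is a hypothetical helper rather than how any particular browser is implemented.

```python
# Each URL maps to (status code, where it redirects to, or None if it doesn't).
TOY_RESPONSES = {
    "http://google.com/": (301, "http://www.google.com/"),
    "http://www.google.com/": (307, "https://www.google.com/"),
    "https://www.google.com/": (200, None),
}


def follow_redirects(url: str, max_hops: int = 5) -> list[tuple[int, str]]:
    """Follow redirects in the toy table and return every (status, url) seen."""
    chain = []
    for _ in range(max_hops):
        status, location = TOY_RESPONSES[url]
        chain.append((status, url))
        if location is None:
            return chain  # a final, non-redirect response
        url = location    # make the next request to the new location
    raise RuntimeError("too many redirects")


for status, url in follow_redirects("http://google.com/"):
    print(status, url)
```

The hop limit is there so that two pages redirecting to each other cannot keep the loop running forever.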
We saw TCP and IP, the Transmission Control Protocol and the Internet Protocol, which determine things like how it is that information is sent from one device to another-- what goes on the outside of that envelope-- things like the IP address, the address of who it is that you're trying to send information to, in addition to the port number that determines what kind of service you're trying to communicate with, whether that be email or a web page or a file that you're trying to send from one person to another person. And then we saw HTTP, the Hypertext Transfer Protocol, which is the protocol that determines what's actually inside of that envelope. What is the information that I am sending to a web server when I'm trying to request a web page? And then what comes back to me in that response-- the version of the protocol, the status code, and then the actual content of what comes back to me. So next, we'll take a look at that actual content and take a look at how to build web pages that can be sent using the HTTP protocol and ultimately viewed inside of your web browser.