[MUSIC PLAYING] DAVID J. MALAN: All right. So this is CS50, and this is week eight, the week of All Hallows' Eve. Indeed, thanks to our friends here in the American Repertory Theater, the stage looks amazing today with some special lighting and some special characters. Of course, speaking of characters, this past week you all explored Fiftyville for the very first time, looking for the rubber duck that had gone missing. And thankfully, the culprits have been found, and allow me to say that a little someone would like to say hello. Yes, even he has rather dressed up for the occasion, but thank you for all the hard work there. 

So this week, of course, we transition to the world of web programming, the motivation being that for the past many weeks, pretty much all of the code we have written has been focused on command line programs, compiling your code, interpreting your code, but generally just interacting with a fairly mundane blinking prompt, textually. But of course, the software that you and I use every day these days is in the form of laptops and desktops in browsers or on mobile devices or apps, and today, we begin to transition to a set of languages and a set of technologies via which you can start to apply all of the past week's knowledge and mental models for procedural programming to a much more familiar, a much more graphical domain. Indeed, over the course of the next couple of weeks, we'll be focused on web programming and the use of languages called HTML and CSS and JavaScript, with which today's websites are made, and increasingly with which today's mobile applications or apps on your phone are made as well. 

But in order to get to that point in the story, we need to consider what the framework is on top of which we're going to run these websites or these web applications. And so that invites the question of the internet. Exactly what is the internet? All of us use it every day, but let's take a couple of volunteers from the audience, just to define for us what we mean by the internet. All of us are literally on the internet right now, but if you take a step back and think about it, what is the internet? How might you define it for someone less technical than you or someone less familiar? Sophia, how would you define it? 

AUDIENCE: The network of all the computers around the world that are taking in information from the network and also giving it information. 

DAVID J. MALAN: Perfect. The internet is this network of networks, so if you have a small network in your home, a small network or a large network at your company or your university and you start to interconnect all of those networks using cables or some kind of wireless technology, you get the internet, so to speak, a network of networks. And this is really the infrastructure, if you will, on top of which all of today's applications are run. So when you use the web, when you use chat, when you use Slack, when you use video conferencing, Zoom or the like, you're using the internet, but think of the internet really as the lower-level plumbing that gets the zeros and ones from you to someone else and back, and the applications on top of that are all implemented, ultimately, in software. 

And so if we consider, then, that we've got all of these computers interconnected somehow, it stands to reason that we need to somehow decide as a global community how to get data from point A to point B and beyond. And so throughout the internet are these computers called routers. They're probably a little bigger than the desktops and laptops with which you and I are familiar, but at the end of the day, they're the same kinds of devices with CPUs, Central Processing Units, the brains inside of the computer that do all the thinking, RAM or memory, where all of the values are stored, and hard disks, where data is persisted. And pictured here, for instance, is an image from MIT, from a few years back, that depicts some of the most significant peering points on the internet throughout the United States. So each of the red dots here represents essentially one router or one very important place into which a lot of cables come in and then go out and interconnect all points of the country. And then this story continues well beyond the United States these days using oceanic cables and other wireless or satellite technologies or the like. 

So suffice it to say, there's this mesh, this interconnection of all of these different computers and in turn networks throughout the world, which is to say that there's many different paths that data can take to go from point A to point B. There isn't necessarily a line between you and Facebook.com or Stanford.edu. Rather, there's a whole bunch of routers-- sometimes a handful, sometimes as many as 30-- that will relay your data from left to right, to up to down, or in some other direction in order to get data from you to the web server that you're trying to contact and then back to you with the server's response. 

So how does all of this work? Well, decades ago, humans essentially had to get together and decide as a group what standards they were going to use, or more specifically, what protocols all of these computers are going to speak. A protocol isn't so much a language as it is a set of conventions, right? Back in healthier times, you and I, if we were meeting each other in person, might extend a hand, and if I did this, you would immediately know that you should probably extend your hand too, and we would have a physical handshake. And that's a human protocol. I initiate a communication with you by extending my hand. You acknowledge that communication by extending your hand. And then that interaction is complete. So we have these human protocols. 

In the world of computers, there's similarly protocols, but obviously it's all zeros and ones. So if the first computer sends this pattern of zeros and ones, the other computer should reply with a different set of zeros and ones. And so these protocols we're about to discuss just standardize what those patterns of zeros and ones are, or really, what all of the messages are going back and forth. And two of the protocols most commonly used to get data on the internet from point A to point B are called TCP/IP. TCP and IP are two separate protocols, but they're so often used together that you typically mention them in one breath, TCP/IP. And these are acronyms you've probably seen maybe on your Mac or PC or somewhere on your phone settings. And it refers to essentially two sets of conventions that computers use to get data from one point to another. 

So what do we mean by data and what do we mean by moving things between point A and point B? We'll just consider it as an old-school envelope, whereby if you wanted to send a letter to someone else in the world, you and I would probably reach for a piece of paper back in the day. We would pick up an envelope and we would write our note on that piece of paper, put the paper in the envelope, and then the most important step after writing the actual message would be to address the envelope. And of course, in the real world, you would put the recipient's address typically in the middle of the envelope. You might put your return address in the top corner of the envelope, and then maybe postage or something like that. But we humans have pretty much standardized through all of the postal systems that kind of convention when using envelopes. 

So the metaphor here is that the envelope and the message therein are generally thought of or referred to as packets, packets of information. And this would be the physical incarnation of what computers ultimately are just going to do using zeros and ones. So let's tease apart the two sets of conventions they use for actually putting data in these envelopes, addressing these envelopes, and sending them out from point A to point B. 

Let's consider first IP. IP stands for Internet Protocol, and pretty much any Mac and PC and iPhone and iPad and Android device these days has been designed by Apple or Google or someone else to understand IP. It's as though those companies have written software running on those devices that make sure that those devices all support IP, just like I was taught, presumably by some human, this human convention of shaking hands back in the day. IP, Internet Protocol, simply standardizes how computers address each other. So in our physical human world, if you wanted to send me an envelope for instance, you might write to Harvard's Computer Science Department at 33 Oxford Street, Cambridge, Massachusetts, 02138, USA. That is presumably a unique postal address that addresses the Computer Science building on campus so that if you drop an envelope in the mail in California or anywhere abroad it should eventually via some number of hops and mail carriers and the like make its way to that particular address. 

Computers, then, have similarly unique addresses known as IP addresses. And so when your computer, Mac, PC, phone, whatever, sends data from itself to another server, the address that it writes on the outside of that virtual envelope is the IP address of the remote server. So for instance, if I were to send a message to you, I would figure out what your IP address is. I would write that IP address virtually on the outside of this envelope. I would probably write my own IP address on the top left-hand corner of this metaphorical envelope. And then I would send it out on the internet. And what does that mean? It would mean I take that envelope and I hand it to the nearest router. 
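To make the envelope metaphor concrete, here is a minimal sketch in Python of a "To" and "From" address wrapped around a message. The `Envelope` class and `address_envelope` function are names invented for this illustration, and both addresses come from ranges reserved for documentation rather than from any real machine. A real IP packet carries more fields than this.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Union

IPAddress = Union[IPv4Address, IPv6Address]


@dataclass(frozen=True)
class Envelope:
    """Toy stand-in for a packet: addresses on the outside, message inside."""
    to_address: IPAddress
    from_address: IPAddress
    message: bytes


def address_envelope(to: str, sender: str, message: bytes) -> Envelope:
    # ip_address() raises ValueError if a string is not a valid IP address.
    return Envelope(ip_address(to), ip_address(sender), message)


# Both addresses are from documentation-only ranges, not real hosts.
envelope = address_envelope("192.0.2.10", "198.51.100.7", b"hello")
print(f"To: {envelope.to_address}  From: {envelope.from_address}")
```

The only thing a router would need from this envelope is `to_address`; the message inside is not its concern.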

So it turns out when you're at home, you actually have a router of your own. It's that device that connects to your cable modem or DSL modem or something like that. If you're on campus, like at a place like Harvard or Yale, Harvard and Yale have their own routers, so your computer, when on campus, just knows to hand data off to that. And if you're elsewhere in the world, like in Starbucks or an airport, similarly, there are routers there. So your computers generally know where the closest router is, and then a router's purpose in life is, again, to figure out, does this packet go left, right, up, down, so to speak, in order to get it closer to its destination. 

But this sort of is a chicken and an egg. If I want to send you a piece of information, I need to know your IP address, but I don't really know your IP address until I know where you are. So there is this other system that you've probably seen an acronym for, too, called DNS, Domain Name System. And this is a technology that's deployed throughout the internet that's supported by Macs, PCs, and phones these days, that just translates what you and I would typically call domain names, or fully qualified domain names, from those English-like or human-readable characters to the corresponding IP addresses, right? There's a reason that companies do not advertise their websites as being a numeric IP address. None of us would ever remember them. They instead advertise them as Microsoft.com and Google.com and NewYorkTimes.com. 

DNS is a technology that your Mac and PC and phone support, so that when a human types in one of those human-readable addresses, a domain name, DNS converts those names to the IP addresses. So literally, if you type Harvard.edu or Yale.edu into your web browser and hit Enter, your Mac or PC quickly looks up the IP address of that web server using the software that came with the Mac or PC and converts it to the corresponding IP address and then writes, virtually, on the outside of the envelope, the IP address of Harvard or Yale's web server before sending it out on the internet. 
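As a rough sketch of that lookup step, Python's standard `socket.gethostbyname` asks the operating system's resolver to translate a name into an IPv4 address. The `look_up` wrapper and its `resolver` parameter are assumptions added here so the function can also be tried without a network connection.

```python
import socket
from typing import Callable


def look_up(domain_name: str,
            resolver: Callable[[str], str] = socket.gethostbyname) -> str:
    """Translate a human-readable domain name into an IPv4 address string."""
    # Domain names are case-insensitive, so normalize before asking.
    return resolver(domain_name.strip().lower())
```

On a machine that is online, `look_up("harvard.edu")` would return whatever address the configured DNS server reports at that moment, which is why no particular result is shown here.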

So these are just services. DNS is a service that your own ISP, Internet Service Provider, provides. When you're on campus, it's Harvard or Yale. When you're at Starbucks, it's probably Starbucks. When you're in an airport, it's the airport. When you're at home, it's your own internet service provider like Verizon or Comcast or the like. So the world just decided to use that technology as well. 

And lastly, one other acronym for now, TCP. TCP, or Transmission Control Protocol, is a solution to a couple of problems, one of which is that it tends to be pretty convenient for individual servers on the internet to be able to do multiple things, right? And you can-- there's lots of things the internet can do. The servers can host email. They can host websites. They can host chat servers, video conferencing. I mean, that's already a growing list of features of software that you can use on the internet, and it would be nice, financially, administratively, if one server could do multiple things at once. And indeed, they can. 

So when a computer receives one of these virtual envelopes and that computer, that server, happens to support multiple services, email, web, chat, video, whatever-- it looks at the envelope for one additional piece of information. And that piece of information is known as a port number, P-O-R-T number, which is just a small integer that the world has decided represents specific services. So for instance, in the world of TCP, the world decided years ago that our computers should virtually write the number 80 on these envelopes after the IP address to signify that this is a request for a web page, or 443 on the outside of the envelope if it's a secure request for a page using something called HTTPS. More on that in a bit. And there's other numbers as well. Email has its own unique numbers. Zoom has its own unique numbers. And all of these other internet services that you and I might use every day have their own unique TCP ports so that companies and people can have one server doing multiple things, but upon receipt of one of these envelopes, the server can look at it and realize, oh, this is a request for email. This is a request for a web page. This is a request for chat, or something else altogether. 
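The way a server might sort incoming envelopes by port number can be sketched with a small lookup table. The table and the `service_for` function are illustrative only: ports 80 and 443 are the ones named above, and port 25 for email (SMTP) is added as one further well-known example.

```python
# A few well-known TCP port numbers and the services they conventionally mean.
SERVICES_BY_TCP_PORT = {
    25: "email (SMTP)",
    80: "web (HTTP)",
    443: "secure web (HTTPS)",
}


def service_for(port: int) -> str:
    """Decide which service an incoming envelope is meant for."""
    return SERVICES_BY_TCP_PORT.get(port, "unknown service")


for port in (80, 443, 25):
    print(port, "->", service_for(port))
```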

Now, notably, too, TCP also handles delivery, and it's the part of the protocol that also ensures that when you send data from point A to point B, if any data gets lost because literally something's wrong with one of those routers or because maybe one of those routers got overwhelmed and just received more packets at once than it can handle-- that could happen, because these computers have, of course, finite memory. If you send too much data through one, the internet might get congested, your video might buffer, and a whole bunch of other symptoms might arise. So TCP also handles the process of retransmitting data as needed. If any of these packets is lost on the internet, literally TCP will also compel your Mac or PC, your phone, to resend that data as well. 
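The idea of resending until the data gets through can be sketched as a simple retry loop. This is a toy model, not how TCP is actually implemented: real TCP relies on acknowledgments and timers, whereas here a hypothetical `FlakyLink` simply reports whether each delivery succeeded, and it drops a fixed number of packets so the behavior is predictable.

```python
from typing import Callable, List


class FlakyLink:
    """Hypothetical link that drops the first `drops` packets it is handed."""

    def __init__(self, drops: int) -> None:
        self.drops_left = drops
        self.delivered: List[bytes] = []

    def deliver(self, packet: bytes) -> bool:
        if self.drops_left > 0:
            self.drops_left -= 1
            return False  # packet lost; no acknowledgment comes back
        self.delivered.append(packet)
        return True  # packet arrived and was acknowledged


def send_reliably(packet: bytes, deliver: Callable[[bytes], bool],
                  max_attempts: int = 5) -> int:
    """Resend until acknowledged; return how many attempts it took."""
    for attempt in range(1, max_attempts + 1):
        if deliver(packet):
            return attempt
    raise TimeoutError(f"no acknowledgment after {max_attempts} attempts")


link = FlakyLink(drops=2)
attempts = send_reliably(b"half a cat", link.deliver)
print(attempts)  # prints 3: two lost copies, then one that arrives
```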

But what's notable about the internet is that data doesn't necessarily follow one specific path. In fact, if you send multiple packets from one person to another, those packets might actually take different routes each time. And this is actually a feature, not a bug, so to speak, because you can imagine servers getting congested or problems needing to be routed around. And so TCP/IP also supports, with other protocols, an adaptive solution to this problem, whereby maybe your data will go this way sometimes. Maybe it'll go this way some other times. But this is why, in part, your internet speeds are sometimes variable, because, again, these routers in between might be different or might be a little bit overloaded. 

So we thought we'd try to tell this story by enlisting the help of some of the CS50 staff. In fact, Brian, let me start with you. Would you mind taking on the role in just a moment of playing a web browser, someone's own Mac or PC or phone, and request of me-- maybe something silly, like asking me for a picture of a cat? 

BRIAN YU: Yeah, sure. So if I want to ask you, some web server, for a picture of a cat, I need to send a message to you in order to send that request to you. So I might write down my request on a sheet of paper. And I'll just put that request inside of an envelope. And then I would have to label that envelope with all the information we talked about, in particular with your IP address, that I might look up with DNS. And then I can send that envelope off. 

DAVID J. MALAN: All right. And I think we need a little bit of help here, because Brian and I are in different places. And so he and I can't just hand the envelope from one to the other. So let's go ahead and enlist the help of CS50 staff here, also, who have chimed in here on Zoom and see if we can't route this request from Brian, who's playing the role of a web browser, to me, who will play the role of a web server, in order to receive this request for a cat. So here we go. Let's see if we can enlist the team here. 

[MUSIC PLAYING] 

All right. Well, thank you to Phyllis for having handed me this envelope. And what we have now is the request that Brian sent me. I'm going to go open it up, and I, indeed, see a message inside requesting a picture of a cat, which is not uncommon on the internet. So now if I'm the web server and I actually have an archive of pictures of cats, I'm going to go ahead and respond to Brian with one of those cats. But to do so, I'm going to go ahead and have to look up on my hard drive or somewhere in the computer that picture of a cat. And here's one here, so I'm going to go and send Brian this very happy cat. I've got some envelopes of my own, and I'm going to go ahead and write Brian's IP address on the middle of this envelope, I'm going to put my IP address on the top-left of this envelope, and then maybe any other identifying information I need, and then I'll go ahead and put the cat into the envelope. 

But, of course, this isn't really going to fit. And this is actually quite commonly the case. Any time a computer is trying to transmit a decent amount of data, whether it's a big image or maybe it's an even bigger video file, for equity's sake, it tends to be good for computers to chop up large packets into multiple smaller packets. In fact, you might have heard of something called Net Neutrality or a more technical topic known as Quality of Service. In a nutshell, Net Neutrality speaks to just what kinds of decisions computers should make when it comes to prioritizing data. And a common convention is, historically, that all of us should chop up our large packets into smaller packets, send them out, so they can get then commingled with other people's packets and we all reach our destinations at the same rate. 

Net Neutrality, as an aside, is all about an interest by some parties in prioritizing maybe the data from certain companies that pay a bit more. And so this really speaks to just use or maybe abuse of these basic primitives here. 

But this is not fair for me to try to cram this one big image into an envelope, so I'm going to literally go ahead and tear the picture in half, essentially chop the packet into two. Let me go ahead now and put this into the envelope, because it'll fit a little more easily. So I've got one packet of information for Brian. I've got now-- let's see-- one more packet of information for Brian that I'll fit the other half of this image into. But I think I'm going to have to do something else. 

Before I drop this out on the internet and hand it back to Phyllis to send out back to Brian, I might need some additional information on these envelopes. I've already got Brian's IP in the To field. I've got my IP address in the From field. I've also jotted down the port number that I should use for Brian and my own return port number. And those are decided typically by my Mac or PC. But I feel like I probably need a little more information. What more should I virtually write on the outside of this envelope to make sure that the data is received as intended? Any intuition? No familiarity with TCP/IP assumed here. But if Brian's about to now get two envelopes, what additional data should I perhaps give him? [? Greg? ?] 

AUDIENCE: Brian may confuse the top of the photo with the bottom, so you need somehow to tell Brian that this is a top and this is a bottom, a link maybe to converge them. 

DAVID J. MALAN: Perfect. And so we need to make sure Brian knows the order in which these packets should be reassembled so that he, indeed, gets the cat the right way and not the wrong way, for instance. So what probably suffices is for me to add what we'll call a sequence number to each of these packets, which is essentially a number which you can think of as one of two, and on the other one, two of two, so that Brian knows when-- what order to reassemble the packets, but also more importantly, in case one of the packets, or both of them, gets lost or somehow dropped by one of the routers along the way, there's enough information on those packets to enable me and him to recover that and resend packet one and/or two as needed. 
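The "one of two, two of two" labeling can be sketched in a few lines. The function names are invented for this illustration, and the numbering is simplified: real TCP sequence numbers count bytes rather than whole envelopes. The sketch shows both uses mentioned above, putting pieces back in order and noticing which piece never arrived.

```python
import random
from typing import List, Tuple

# (sequence number, total number of packets, chunk of data)
Packet = Tuple[int, int, bytes]


def split(data: bytes, chunk_size: int) -> List[Packet]:
    """Chop data into numbered packets: 1 of N, 2 of N, and so on."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    total = len(chunks)
    return [(number, total, chunk)
            for number, chunk in enumerate(chunks, start=1)]


def missing_numbers(received: List[Packet]) -> List[int]:
    """List the sequence numbers that have not arrived yet."""
    if not received:
        return []  # nothing arrived, so the total is unknown
    total = received[0][1]
    seen = {number for number, _, _ in received}
    return [n for n in range(1, total + 1) if n not in seen]


def reassemble(received: List[Packet]) -> bytes:
    """Put the chunks back in order, whatever order they arrived in."""
    ordered = sorted(received, key=lambda packet: packet[0])
    return b"".join(chunk for _, _, chunk in ordered)


packets = split(b"happy cat picture", chunk_size=9)
random.shuffle(packets)  # arrival order is not guaranteed
print(reassemble(packets))
```

If only the second envelope arrives, `missing_numbers` reports `[1]`, which is the piece the receiver would ask the sender to resend.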

So let's go ahead and do this. Let me go ahead and enlist the help of the team, starting with Phyllis here. And Phyllis, if you'd like to go ahead here and-- 

[MUSIC PLAYING] 

All right. Of course, that's only half of the problem. So I'm going to go ahead now and send the second packet, finally. In an ideal world, I would actually send these out in parallel, but there's no reason that they couldn't still follow different paths. In fact, this one I worry might take a little bit more time. Let's see. 

[MUSIC PLAYING] 

[SINGING IN ITALIAN] 

Amazing. Brian, do you want to go ahead and open up your envelopes and reassemble them? 

BRIAN YU: Yes. I have two envelopes. I guess I'll open up the one that says one of two first, and it is the top half of the cat. And then I'll open up the other envelope, which is two of two. And that is the bottom half of the cat. And so together, I think I now have the full cat. 

DAVID J. MALAN: Wonderful. Well, thank you to Brian and to the whole team. And so to recap, IP is this protocol, the set of conventions that standardizes what gets written on these envelopes. It's how computers uniquely address each other with numbers of some sort. TCP governs a few different things, but among them is this numbering of services, like 80 for insecure web traffic or 443 for secure web traffic that ensures that the data gets from one point to another and is handled by the right application running on that particular server. DNS, then, is what we use to begin with. If Brian had his own domain name, my computer would have had to look up his IP address, or conversely, he would have had to look up mine, so that we humans, who are actually using the internet in a human-friendly way, don't have to remember IP addresses, which, again, are just numbers, but instead can remember things like Harvard.edu, Yale.edu, and the like. 

So that, then, is the internet, the fundamental infrastructure, the plumbing on top of which we now have the ability to get data from point A to point B. And so in some sense, if you're comfortable with that, we can now abstract the internet away and just think of it as being a mechanism that gets data from one point to another. And so long as we can now assume that we have this fundamental public service that gets data from one point to another, now we can start to build on top of it in terms of software and other languages and actually use it for interesting things. 

But before we forge ahead to do those things, any questions or confusion we can clear up on TCP or IP or DNS or the internet or routers or any of these other new terms? [? Greg, ?] back to you. 

AUDIENCE: I have a question. So does chopping the information create any problem? Because, I don't know, a piece of information can go there for two seconds and another one for three seconds. Does it create any problem for the user? 

DAVID J. MALAN: Really good question. These packets can take different durations of time, and even though I did stipulate that they should go out to Phyllis's hands roughly at the same time, even if she needs to pass them in two different directions, there can absolutely be delays. And in fact, typically, you and I as humans will start to notice delays if packets take more than 200 milliseconds to get from point A to point B. After that, it looks like there's a bit of delay, and certainly if it's two or three seconds, you'll really notice that. At that point, it's not necessarily a problem-- Brian hopefully would patiently wait for the second half of the cat for some amount of time if he only received one packet. Eventually, he as a human, and in turn, he as a computer, would probably get a little anxious and would ask me to retransmit a packet if it doesn't arrive after 5 seconds, 10 seconds, 30 seconds. These timeouts can typically be specified by the software running on the person's computer, but at that point, you and I would certainly notice the difference. 

All right. So if we now have this ability fundamentally to get data from point A to point B, what is actually inside of the envelope that Brian sent me, and what was inside the envelope I sent him besides just the picture of a cat? Well, for that, we transition to another language or another protocol, rather, called HTTP, HyperText Transfer Protocol. And this is an acronym you've probably seen or typed bunches of times. It's, of course, what appears in the beginning of URLs, Uniform Resource Locators, which are the tools that you and I use to actually figure out what websites or what image we actually want to request of the internet. 

So the web, the world-wide web, is really just one of many services that run on top of the internet. The web gives us web pages. Zoom gives us video conferencing. Other tools give us text chatting, voice chatting, and the like. So the web is really just an application on top of the internet. It's hands-down the most popular application, but it really is just an application. It's a service that's using that underlying plumbing. So HTTP is a different protocol that really governs what goes inside of these envelopes. TCP/IP governs what goes outside the envelopes. HTTP governs what goes inside of the envelopes, assuming we are talking about web browsers and web servers and not video conferencing or something else. 

So with HTTP, it comes with a few different commands, or a pretty limited vocabulary, two of which are the most important terms to know, which is GET and POST. These are literally English verbs, and they are two of the commands, if you will, that HTTP supports. And what Brian probably did inside of that envelope is he probably literally wrote down GET cat or something like that. POST is used for other applications that we'll get to before long, but "GET" is the operative word. And it literally is how a browser will request or get information from a server. So somewhere in the envelope Brian sent me was the English word "GET," probably followed by cat.jpg or something like that. There's probably a bit more information, but the essence of HTTP means that if Brian wants something from me and he's the browser and I'm the server, he should start his request with the standardized verb "GET" followed by the name of the file that he wants to get. 

So let's put this now into the context of one of the more familiar URLs. So here's, for instance, a canonical format of a URL. And let's highlight a few features of it. So first, HTTPS. Increasingly, you're seeing this on the web. Even if you don't type it, it's often automatically appearing in the address bar of your browser, because browsers or web servers are adding it for you. The S just refers to a secure version of HTTP. And we'll come back to this topic of security next week and beyond, too. But in the context of HTTP, this just means that the data between me and Brian and vice versa is encrypted somehow. It's way better than Caesar or other ciphers. It's way more mathematically sophisticated. But it essentially just scrambles the information so that Brian knows he's asking for a cat, I, the web server, knows he's asking for a cat, but if any of you or any of the TFs who were playing the role of routers maliciously or nosily opened the envelope instead of handing it off to the next staff member, they wouldn't understand what's inside the envelope, because it would look like, similar to Caesar and other ciphers, sort of like random zeros and ones. So HTTPS just means that the contents of these packets are encrypted. 

What else is salient about these URLs? Well, here's what we call a domain name. Odds are most everyone knows what a domain name is, and it's typically two phrases, something dot something else. And example.com is, of course, an example here, but Harvard.edu, Yale.edu, and millions of others these days. 

To the end of that, though, is what we would typically call the top-level domain or TLD. This is just the type of website, historically, that you're trying to visit. Dot com meant commercial. Dot edu meant education. Dot net meant some kind of network. Dot org is an organization. That's no longer really the case. In fact, there's hundreds, perhaps even thousands, of top-level domains nowadays that you can buy domains in that try to categorize things sometimes, but there's no hard rules around most of those top-level domains. You have to be an accredited educational institution to use dot edu. You have to be in the US military to use dot mil. There are similar constraints in other countries who have their own two-character country code TLDs, like dot UK for the United Kingdom, dot JP for Japan, and many others. Each country is free to standardize as it sees fit. But you and I can buy a dot com, a dot org, a dot net, a dot us, a dot-- there's many, many, many others. And if you go on Wikipedia you can see a nearly exhaustive list. But this just tends to categorize the type of website that it is. 

Besides that, there's this prefix, generally known as a hostname. And www is just a human convention. Years ago, pretty much any server on the internet that had a human-friendly name like this, www.example.com, this was just meant to connote to the user that, oh, www, this must be the address of a web server and not a mail server, not a chat server, or something else. It's not strictly required. It's just human convention. And odds are, when you visit websites, you probably don't even bother typing this in anymore. But it is a historical feature that gives a visual clue, typically, to the humans as to what type of server it is. 

So besides that, there's this one hidden piece of information as well. If you just want to visit example.com's homepage, you might just type this URL, or even just type example.com and hit Enter and let the browser redirect you, so to speak-- take you to this canonical form of the URL. But very often, you're technically requesting a specific file. And if not mentioned, that file name is typically index.html. It can be other things as well depending on the language or the server technology that someone's using. But implicit at the end of URLs is often the name of a file. Brian might have specifically requested cat.jpg, but if he were requesting not a picture of a cat but a full-fledged web page with text and other information, odds are, there's an implicit file name there, like index.html. And this is now important, because when we look inside this envelope, this is a piece of information that needs to then be in there. 
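The pieces of a URL discussed so far can be pulled apart with Python's standard `urllib.parse.urlsplit`. This is a sketch: the `describe` function is invented for illustration, and the rule that a bare `/` means `index.html` is an assumption about one hypothetical server, since the default file is the server's choice. Note also that `urlsplit` uses the attribute name `hostname` for the full `www.example.com`.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Break a URL into the parts a browser cares about."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"  # no path at all means the root, "/"
    # Assumption: this hypothetical server serves index.html for a directory.
    file_requested = path + "index.html" if path.endswith("/") else path
    return {
        "scheme": parts.scheme,
        "host": host,
        "top_level_domain": host.rsplit(".", 1)[-1],
        "file_requested": file_requested,
    }


print(describe("https://www.example.com/"))
print(describe("https://www.example.com/cat.jpg"))
```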

So let's take a look at some sample HTTP requests and responses, the more technical dive into what Brian and I and the staff acted out a moment ago. Technically speaking, when Brian sent me a request for that cat, he wrote inside this envelope, not only the key word GET and something like cat.jpg-- he also specified a couple of other things. And let's genericize it now away from cats and just propose this. Inside of an HTTP request, that is, any of these virtual envelopes, is literally a request for, like, GET followed by slash, if you don't want a cat, you just want the default homepage, followed by a mention of what version of HTTP the browser and server should speak. 1.1 is pretty common. 2 is increasingly common. 3 is even now out there. But there's just different versions of the protocol. It's like humans have refined what it means to shake hands. These versions of protocols evolve over time. 

But there's also a line like this, Host colon www.example.com, because-- just in case I am a particularly fancy server that supports not only example.com but maybe Harvard.edu and Yale.edu. It's possible, long story short, for companies nowadays to host multiple websites and multiple domains on the same server. This little clue inside the envelope makes sure that it goes to example.com or Harvard.edu or Yale.edu if all of these entities are sharing the same physical server. So more specifically, a request might look-- instead look like this. If you're not just requesting the default homepage but you want a specific file, it might say /index.html instead. What does my response look like? 
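As a rough illustration of the request just described, here is a minimal Python sketch that assembles the text inside the envelope: a request line (GET, a path, the HTTP version), then a Host line, then a blank line marking the end. The function name `build_request` is made up for this example, and a real browser sends many more lines than these two.

```python
def build_request(host: str, path: str = "/", version: str = "1.1") -> str:
    """Return the text of a minimal HTTP GET request (illustrative only)."""
    lines = [
        f"GET {path} HTTP/{version}",  # request line: verb, path, protocol version
        f"Host: {host}",               # tells a shared server which website is wanted
        "",                            # an empty line marks the end of the headers
        "",
    ]
    # HTTP separates lines with a carriage return plus a newline.
    return "\r\n".join(lines)


# Requesting a specific file instead of the default homepage.
request = build_request("www.example.com", "/index.html")
print(request)
```

Calling `build_request("www.example.com")` with no path falls back to `/`, which corresponds to asking for the default homepage.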

So I've gotten Brian's envelope. Now I'm going to go ahead and respond with my own one or two or more envelopes. Inside of mine, yes, is going to go pieces of that cat, but some additional information as well, per the protocol. So my response-- just like in the human world, I might extend my hand, if I see Brian initiating a handshake. I'm going to respond with something like this, HTTP/1.1, which just reminds the browser what version I'm speaking, then a number, which is the status code, followed by a shorthand summary, like OK. 200 OK means I got you. I found the cat. Here it comes, piece by piece, in these envelopes. And I also put in the envelope a mention of the content type. If it's a web page, I'm going to put text/html. If it's a jpg, I might instead say image/jpeg. And there are different content types, otherwise known as MIME types, for all different file formats in the world. 
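To make the response side concrete, here is a small Python sketch of the first lines a server might put in its envelope. It uses the standard library's `mimetypes` module to look up the content type from a file name. The helper names are invented for this example, and the sketch assumes the file was found, so it always answers 200 OK.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a file's content type (MIME type) from its extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type if the extension is unknown.
    return guessed or "application/octet-stream"


def build_response_head(filename: str) -> str:
    """Return the status line and Content-Type header for a found file."""
    return (
        "HTTP/1.1 200 OK\r\n"                                  # version, status code, summary
        f"Content-Type: {content_type_for(filename)}\r\n"      # what kind of data follows
        "\r\n"                                                 # blank line ends the headers
    )


print(build_response_head("index.html"))
print(build_response_head("cat.jpg"))
```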

Well, that's not always going to be the case, that the response is as simple as that, whereby your browser requests information and the server responds with the requested information. Sometimes the users make their way to the wrong place. So for instance, suppose that a browser visits www.Harvard.edu, the response might not necessarily be OK initially. It might not be status code 200. 

And in fact, we can see this. Let me go ahead and open up on my screen here a browser window that's going to take me to, let's say, Harvard.edu. And I'm going to go ahead and type in to the URL bar http://www.Harvard.edu, Enter. Now, all this happened pretty quickly, but if I click on the URL bar, which has been simplified or shortened by Chrome at the moment, notice where I actually ended up. Somehow or other, my browser did not keep me at HTTP. It redirected me, so to speak, to HTTPS. This is probably intentional on Harvard's part. They would rather that I'd be visiting them securely so that if I'm reading articles or other content, that's really nobody's business except mine and Harvard. Certainly no one-- no routers in between should be able to see this. So somehow, Harvard redirected me from HTTP to HTTPS. 

Well, how can I see this? Well, it turns out, embedded in Chrome and Edge and Firefox and Safari-- all of today's browsers-- there are often developer tools that sometimes you have to enable via a certain menu-- but these developer tools are so powerful and they allow you, the user, or now you, the programmer, to actually see and understand what's going on underneath the hood of these browsers and servers. So I'm going to do this in Chrome, specifically. I'm going to go to View, Developer, and then I'm going to go to Developer Tools. And odds are if you're a Chrome User, this menu option has always been there, even if you never noticed it. So feel free to play along at home. 

And then notice this pops up on the top right here. I'm going to go ahead and move it down to the bottom just by clicking the dot, dot, dot menu and move the developer tools to the bottom of my screen, just so we can see things a little wider. And I'm going to go ahead and click on the Network tab up here. And when I click on the Network tab here, I'm going to see a whole bunch of information related to my last request, so I'm going to go ahead and do this request again. Let me go ahead and go back to the URL bar and let me go ahead and-- actually, just for good measure, let me do this in Incognito mode. And even though you perhaps are in the habit of using Incognito mode if you don't want the browser to remember where you've been or what you've logged in as, Incognito mode is also an incredibly powerful developer's tool, because you can reset the browser state to its initial condition without any previous browsing showing up in your history. 

So I'm going to do this again now in Incognito mode after having opened developer tools. http://www.Harvard.edu, Enter. And a whole bunch of stuff just flew by the window, some of which is this chart information, which shows me the performance. So to [? Greg, ?] your question earlier about noticing the amount of time, you can see that some of the requests that were just induced vary between a few milliseconds and over 1,000 milliseconds. But what I care about for now is this fairly arcane listing down here. A whole lot of stuff just flew across the screen, and indeed, if I zoom in on the bottom, simply visiting Harvard.edu induces 70 HTTP requests per this mention in the bottom left-hand corner. It resulted in 6.8 megabytes of information being transferred. And in total, it took a-- rather atrociously, 11.95 seconds. So [? Greg, ?] like, that is slow, relatively speaking. Well, absolutely speaking. 

So what's the takeaway here? Well, any time you visit a web page, there's not just the one web page itself with all of the text in it. There's probably images, maybe videos, maybe music and other things. All of those get downloaded separately. So if Brian had asked me for a full web page, like the course's home website, I might respond not with a single envelope or two envelopes. I might respond with 70 envelopes containing the responses to every piece of media that composes CS50's own website, or in this case, Harvard's. 

But for now, let's focus only on the first of these requests. If I look at the first row here in Chrome, I will see a reminder of where I visited first. But notice the Status column over here is 301. 301 Moved Permanently. It turns out that there's numbers besides 200 that tell browsers what to do. 200 just means OK, here's the data we requested. 301 means, nn-nn, whatever you requested has moved permanently to a different URL. So let me go ahead and click this first row and you'll see that a whole different set of tabs pops up. I'm going to click Headers here. And now let me define a term. When Brian and I are using HTTP inside of these envelopes and I write something like GET slash HTTP/1.1 or host colon www.example.com, each of those lines of text is what we'll call an HTTP header. It's a line of text inside of the envelope. So what we're seeing here is Chrome's summary of all of the headers that were inside of these envelopes. 

Let me go ahead and look at my request headers first. I'm going to click View Source. And I can literally see the raw request that my browser sent to www.Harvard.edu, GET slash HTTP/1.1, Host colon www.Harvard.edu, and then a bunch of other stuff which we'll ignore for now. But those are all HTTP headers. But if I scroll back up here, let's look at the response headers now, what came back in a different envelope from Harvard to my laptop. Notice here that it's HTTP 1.1, but it's not 200 OK, it's 301 Moved Permanently. This is a hint to my browser that, uh-uh, there's nothing at the URL you visited. You need to visit a different location instead. To know where I need to go, I need to scroll down and find this header here. Notice that the third line in the response is Location colon https://www.Harvard.edu. So this is how the envelope that comes back contains a clue to me to say, nn-nn, we have moved permanently to the secure version of the website. 

And if I zoom out now and click this little X to close those tabs, you'll see that the next request that my browser automatically sent on its own was to instead, if I scroll down here, to this request URL, https://www.Harvard.edu, and the response I got this time under this general summary here was now indeed 200. So this is just a simple mechanism that allows a browser and a server to intercommunicate in a way that can send them from one location to another. 
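The redirect mechanism can be sketched in a few lines of Python. The code below does not touch the network. It parses the text of a response envelope and, if the status code is one of the redirects, reports the Location the browser should visit next. The sample response is an abbreviated stand-in for what Chrome showed, and the function names are hypothetical.

```python
def parse_response_head(raw: str):
    """Split raw response text into (status code, summary, headers dict)."""
    head = raw.split("\r\n\r\n", 1)[0]            # ignore any body after the blank line
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")      # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers


def next_url(raw: str):
    """Return the URL to visit next if the response is a redirect, else None."""
    code, _reason, headers = parse_response_head(raw)
    if code in (301, 302, 307):
        return headers.get("location")
    return None


# Abbreviated, illustrative response; a real one carries more headers.
sample = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: https://www.harvard.edu/\r\n"
    "\r\n"
)
print(next_url(sample))
```

A browser does the equivalent automatically: on seeing the 301, it sends a second request to the URL named in the Location header.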

And let me make this a little more familiar. Odds are you have seen not this before, explicitly, because you, as a human, would rarely, if ever, see the number 301 or Moved Permanently until today, now that you're a programmer who's using these developer tools, but odds are you've seen another number. Maybe in the chat if you want to just chime in, if you're thinking about web pages and numbers, has anyone seen-- quite often, probably, a number that maybe now makes a little more sense? Brian, what are you seeing? 

BRIAN YU: A lot of people saying 404. I also saw 500 and a 502. 

DAVID J. MALAN: Yeah, so 404 is the code that humans adopted years ago that just signifies Not Found. So if you visit an incorrect URL or an old URL that no longer exists on a server, for maybe an old cat that's been deleted, the server will respond not with 200 OK but with 404 Not Found, thereby telling your browser to display some kind of error message. Weirdly, browsers years ago weren't especially user friendly, and then browsers just told us humans 404, 404, which frankly, is not very user friendly. But all it boils down to is this little hint inside of the response envelope coming back that indicates that something went wrong, that something was not found. 

And there's a whole list of these status codes, and this is certainly not something you need to memorize, but as we focus more and more on web programming, you'll just get naturally familiar with some of these. There's other ways of redirecting the user from one place to another. 302 and 307 can be used. For efficiency, servers can sometimes respond with 304, which essentially means, you already asked me that question. The cat has not changed on the server. Use your own copy of the cat. So long story short, if Brian's own browser were smart, it would cache-- C-A-C-H-E-- that is, remember the cat that he just downloaded from me so that if Brian hits reload or he comes back to that same website again and wants to see the cat again, it just-- his browser loads the local copy instead of bothering me, the web server, and wasting time, milliseconds, sending another cat. 304 would just say, the cat is the same. Use your own local copy. 

Then there's others. You might have seen 401 or 403 before, which refer to not being logged in correctly or something like that. 500 is actually bad. And in fact, I can pretty much guarantee that over the next couple of weeks, all of you will experience your very first of several HTTP 500 errors. That's going to mean, next week, that you screwed up in your code: you wrote buggy Python code, and the whole server didn't know what to do. And that's an internal server error. Fixable, and we'll help you debug it, but indeed, that's quite common as well. 503 just means the server might be overloaded in some way and so service is unavailable, and there's others, dot, dot, dot, as well. 
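For reference, Python's standard library already knows the shorthand summary for each of these numbers. This short sketch looks up the codes mentioned so far using `http.HTTPStatus`; the helper name `describe` is just for this example.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return a status code together with its standard short summary."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


# The status codes mentioned in this part of the lecture.
for code in (200, 301, 302, 304, 307, 401, 403, 404, 500, 503):
    print(describe(code))
```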

So we can actually have a little bit of fun with this in a couple of different directions. It turns out that if we send this HTTP request, we can take a look at what comes back. And let me go ahead and do this. Instead of using my browser, I'm going to use a command line tool which tends to just be a little cleaner, because I don't have to futz around with all of these buttons. Let me go ahead and use a program called Curl. And Curl's purpose in life is just to connect to a URL. And it's not going to bother showing me the web page or any of the content. It's just going to show me the HTTP headers if I use a command line argument of dash capital I. 

And now I'm going to go ahead and type http://safetyschool.org. And I'm going to go ahead and hit Enter. And this is my Mac now sending one envelope to safetyschool.org containing GET, that verb, requesting the home page. They are presumably going to respond to me with another envelope inside of which is some kind of response. Maybe it's a 200. Maybe it's something else. 

[CHUCKLES] All right. It looks like-- forgive me-- that safetyschool.org has moved permanently, per this 301, to this new location, www.Yale.edu. Sorry. And in fact, we can do this. If I copy this URL-- and let me go into a browser. I'll use Incognito again so that I don't have any past history. I'm going to go ahead and hit Enter. And voila, the visual effect is just as real as the headers would imply. So indeed, the funny thing about this joke is that someone on the internet has been paying for the domain name safetyschool.org for, like, 20 years now for this joke, and the only thing it does is redirect one domain name to another. 

Now, fair is fair. Let me go ahead and transition away from safetyschool.org over to harvardsucks.org, which also exists, and someone on the other side has been hosting this website for some time. And in fact, if you visit that URL-- let's go to harvardsucks.org, Enter. You'll actually see a whole website. So the Yalies really went all-out here. And you can actually see an amazing hack here whereby at harvardsucks.org there's an old YouTube video of an amazing hack or prank that was pulled at one of the Harvard-Yale football games some years ago where Yale, to their credit, tricked us into spelling out, with a bitmap of all things, we suck. So fair's fair. 

So anyhow, it's a bit of a stretch to connect those to underlying HTTP messages, but it all, indeed, relates to these very simple primitives. Let me point out one other thing as well. We might also see in the form of HTTP requests even more sophisticated first lines where you're not requesting just slash, the default homepage. You're not requesting /cat.jpg or /index.html. There might also be question marks and equal signs. And notice, this is an excerpt from an envelope my Mac or PC or phone might send to google.com requesting pictures of cats. 

And in fact, let me go ahead and do this on my browser. Let me go to HTTPS-- I'm not going to bother using the insecure version at all. I'm going to go explicitly to google.com/search?q=cats. So this is the human version of the URL that my Mac will translate into this lower-level message that's going to be shoved inside of the virtual envelope. So I'm going to go ahead and hit Enter. And voila, I now see, indeed, a whole bunch of pictures of cats, including some more horrific photos from a movie that didn't fare well as well. 

So that is to say that it seems that once you understand URL formats, you can begin to pass input to servers. And here's now where we bridge past weeks to future weeks. Thus far, when we visited web pages like Harvard.edu and Yale.edu and the like, we're just visiting static web content. We're not actually providing user input like you would using get_string or input or any kinds of command line programs we've written. But it turns out that URLs do support user input, and they are standardized. If you see a question mark and then the name of a variable like q and then an equal sign and then a word like cats, that's the web-based analog of a command line program having asked you what is the value of q, and the human typing in cats. So this is to say there is a way using URLs that will actually allow us to pass input to a web server. And indeed, that's what's happening when you're visiting google.com, but it just boils down to understanding these URLs. 
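To see the "variable equals value" idea in code, here is a brief Python sketch that pulls the inputs out of a URL with the standard library's `urllib.parse`. The URL is the cats search from a moment ago; the function name `query_inputs` is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def query_inputs(url: str) -> dict:
    """Return the name=value inputs that follow the question mark in a URL."""
    query = urlsplit(url).query   # the part after "?", here "q=cats"
    # parse_qs maps each name to a list, since a name may appear more than once.
    return parse_qs(query)


inputs = query_inputs("https://www.google.com/search?q=cats")
print(inputs)          # {'q': ['cats']}
print(inputs["q"][0])  # cats
```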

And before we begin to build some of our own solutions on top of this infrastructure, any questions now or confusion on HTTP or status codes or anything we've seen thus far? Anything at all? Yeah, over to Santiago. 

AUDIENCE: When you want to, for example, publish a web page, why is it that you have to buy a domain name? Is that because you're using memory in some server? 

DAVID J. MALAN: Yeah, that's a really good question. Why do you have to buy a domain name? It kind of boils down to capitalism, to be honest. There is a non-zero cost to running certain aspects of the internet, certainly, or really all aspects of the internet. There are some nonprofit and volunteers-- non-profit organizations and volunteers that have historically helped govern it. Increasingly, though, there's overhead to operationalizing the internet, running things like the main DNS servers and other features. And so there are what are called internet registrars, much like a university registrar, whose purpose in life is to allow people to essentially rent domain names on an annual basis. 

And indeed, when you buy a domain name, it's not yours permanently. Instead, you're paying a yearly fee once-- a renewal fee every one or two or three years or the like. It might range from a couple of dollars to hundreds or even thousands of dollars. We can go down the rabbit hole talking about domain name squatting, whereby if you think of a really cool word and you buy the domain name and someone else comes along and wants it, there's capitalism at play there, potentially an opportunity for you to sell a domain name to someone else. But in part, it helps just regulate exactly who can sign up for domain names and presumably put some downward pressure on all of them just disappearing if you could just sign up for free for as many as you want. 

Other questions or clarifications on not just HTTP but also TCP, IP, DNS, or anything else from today's alphabet soup? 

BRIAN YU: A question came in the chat. If you have multiple packets that you're trying to send from one place to the other, do they have to be sent out one after the other, or can you send all of the packets out at the same time? 

DAVID J. MALAN: Really good question. We did not think that we humans could do that very well choreographically using Zoom a bit ago, so we sent one packet at a time through the teaching fellows. But yes, a computer would typically dump all of those packets out at the same time. They would be serialized one after the other, but it would happen very quickly. And by chance, they might all follow the same route through the teaching fellows as routers, or they might go in different directions depending on just how congested or how busy the internet is at that moment in time. They might arrive out of order, but indeed, that's why Brian needs to know what the sequence number is on the outside of the envelope so he can rearrange them in the correct order. Anything else on your end, Brian? 
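The role of the sequence number can be sketched in Python. This toy example represents each envelope as a pair of (sequence number, contents) and puts the pieces back in order on arrival. The data and the function name are invented for illustration, and the sketch ignores lost or duplicated packets, which real TCP also has to handle.

```python
def reassemble(packets):
    """Reorder (sequence number, payload) pairs and join the payloads."""
    ordered = sorted(packets, key=lambda packet: packet[0])  # sort by sequence number
    return b"".join(payload for _sequence, payload in ordered)


# Four envelopes that arrived out of order after taking different routes.
arrived = [(2, b"lo, "), (4, b"ld"), (1, b"hel"), (3, b"wor")]
print(reassemble(arrived))  # b'hello, world'
```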

BRIAN YU: How do the routers know which way to send any particular packet of data? 

DAVID J. MALAN: Really good question. How do the routers know? So back in the day-- and in some cases, it's literally hardcoded. You can think of a router as having essentially an Excel spreadsheet in its memory with at least two columns, one of which is an IP address, the other of which is the direction it should go out on, like right, left, up and down. Like, the cables aren't going in four different directions, certainly, but you can think of it in-- metaphorically, in that way. It tells the router that if you receive data for this IP address, send it out on this cable, or if it's for this IP address, send it out on that cable. And all of these cables are connected to other routers in the same city, in different cities, across an ocean, to some other endpoint. 

That would be very painful, though, if humans had to manually configure all of the interconnections we saw on MIT's map just a bit ago. And so it turns out there's other protocols out there that we won't spend time on in this class, but that routers rely on in order to dynamically adapt. So long story short, there are protocols that will figure out if all of a sudden my packets are not getting through to Brian, I'm going to start routing around that, dynamically, and the routers are going to figure out, that does not seem to be a good destination because I'm not getting any response, or it's just taking way too long to hear back. So there are protocols that govern how you can decide whether to start dynamically changing those so-called routing tables, the spreadsheet to which I referred earlier. 
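The two-column spreadsheet can be pictured as a Python dictionary. This is only a sketch of the metaphor: the addresses, cable names, and fallback are hypothetical, and the actual protocols that routers use to update their tables are far more involved than the `reroute` helper here.

```python
# Hypothetical routing table: destination IP address -> outgoing cable.
ROUTING_TABLE = {
    "1.2.3.4": "cable-left",
    "5.6.7.8": "cable-right",
}
DEFAULT_CABLE = "cable-up"  # assumed fallback for addresses not in the table


def outgoing_cable(destination_ip, table=ROUTING_TABLE):
    """Look up which cable a packet for this address should go out on."""
    return table.get(destination_ip, DEFAULT_CABLE)


def reroute(table, destination_ip, new_cable):
    """Return an updated copy of the table, as a dynamic update might produce."""
    updated = dict(table)
    updated[destination_ip] = new_cable
    return updated


print(outgoing_cable("1.2.3.4"))                       # cable-left
detour = reroute(ROUTING_TABLE, "1.2.3.4", "cable-right")
print(outgoing_cable("1.2.3.4", detour))               # cable-right
```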

All right, so we have now, at this point, an infrastructure known as the internet that allows us to send packets of information from point A to point B by writing addresses and port numbers on the outside of those envelopes. We have another protocol called HTTP which is specifically used for web browsers and web servers, separate from videoconferencing and chat, which have their own set of conventions and protocols, but we have a mechanism for get-- requesting information and responding with information. And we know from problem set 4 how you can respond with a cat. It's just a sequence of bits, whether it's a bitmap or a jpg or something else, but we haven't yet seen what an actual page looks like. 

And indeed, if we look a little deeper in the envelope that I'm sending to Brian and he's sending to me and we're getting back from Harvard and we're getting back from Yale, we're going to see another language altogether. It's not a programming language, per se. It's what's known as a markup language, which just means it's more about aesthetics than it is about logic. And there's going to be a couple of other languages tucked in there, CSS, Cascading Style Sheets, JavaScript, which is a proper programming language. But let's go ahead and take a five-minute break here. And when we come back, we'll learn to make web pages themselves. 

All right. So when you visit a website requesting the home page or a specific file on the website, exactly what is inside of the virtual envelope a little deeper down below the HTTP headers that you get back from the server? Well, that language is known as HTML, HyperText Markup Language, which indeed is not a programming language, which means there's no loops. There's no conditions. There's no functions or variables per se. It's just text that tells a browser fairly pedantically, top to bottom, left to right, what to display and how. 

So let's take a look at some examples. An HTML page is going to contain really two different concepts inside of it, what we'll call tags or elements, and also attributes. Well, what are those? Well, here is perhaps the simplest web page we can make. And this is HTML itself. And you'll see that it's structured in kind of a symmetric way. Some things are indented like in a proper programming language, but there is some symmetry to what's going on here. So let's tease apart top to bottom exactly what we're looking at here. 

This very first line is known as a document type declaration. Long story short, whenever making a modern web page, this should just be the very first line of your file, no matter what. It signifies that you and I are using the latest version of HTML, which is version 5. In the future, this line will probably change as HTML itself, the language, evolves as humans add more and more features to it. Below that, notice, is a pair of what we're going to call tags. Tags are things between open brackets that start with a word like HTML or some succinct phrase like that, optionally with something like this word and an equal sign and maybe something in quotes after that. But highlighted in yellow here is the first of our HTML tags. And coincidentally, this tag is the HTML tag. 

And the way it works is as follows. When a browser receives an envelope containing text like this, it first reads that first line and says OK, this file contains HTML version 5. What comes after it? Oh, here is the content of the web page. It says, hey, browser, here comes some HTML. Notice down here is the opposite of that statement. When you get to the end of this file, you'll see a similar-looking tag, but there's a forward slash in front of the same word, HTML. That's what we'll call a close tag if we think of this as an open tag. Or if you think of this as a start tag, this is an end tag. And most tags indeed have that symmetry whereby when you open them once, you should eventually close them, ideally in the appropriate order. Notice that you don't have to repeat other stuff. When you close a tag, you just mention the name of the tag to keep it fairly succinct. And that means, hey, browser, that's it for the HTML. 

All right, what's inside of that? If we look down below this, you'll see that there's this thing here, which is what's going to be called an attribute. Attributes tend to be short, succinct phrases that have some special meaning for that particular tag. This particular attribute, if you read the documentation for the language HTML, will say that if you add lang= quote-unquote "something" to your HTML tag, that's going to be a clue to the browser that says, hey, browser, here comes HTML, and by the way, the contents of this web page are going to be in English, at least in this case, per en. Every language in the world has its own two-character or three-character code that can be placed inside these quotes that will standardize exactly what the browser interprets it as. Useful these days if you have translation enabled in your browser. It knows what language the page is written in so that it can help you translate it to your own spoken language. 

All right, below that, there's two sets-- two pairs of tags, the head tag here and the body tag here. And I've highlighted them both at the same time, because you can think of these as both children of the HTML tag. So if we borrow our metaphor of a family tree and some kind of hierarchy here. If you think of the HTML tag as being like the parent, so to speak, this parent has two children, a head tag and a body tag, each of which is respectively opened and closed. 

Let's consider the first one, the head tag. What's inside of that, so to speak? Inside of that is the title tag which as you might guess by now is going to represent the title of the web page we're writing. Specifically, the title of this web page is going to be literally-- and just goofily-- hello comma title. So that's what you would see in the tab of this page. Let's back up a little bit and look now at the second child of the HTML tag, the so-called body tag. This is going to be the big, rectangular region of the web page, otherwise known as the body or viewport. And here, we see that the contents of that rectangular region of the page is going to be literally hello comma body. 

So that is to say this is the HTML for a fairly simplistic page whose title bar in the tab is hello comma title and whose body in the big rectangular region is quite simply hello comma body. And it's perhaps helpful now to call out explicitly that we can think of this, a la week 5, as really a data structure. Even though it's just text inside of that envelope that gets read top to bottom, left to right, what the browser is actually going to do on your laptop or desktop or phone is actually build a data structure in memory. So Microsoft, who wrote Edge, or Google, who wrote Chrome, or Apple who wrote Safari, wrote code that reads HTML top to bottom, left to right, like a big, long string, parses it-- that is, analyzes it-- and builds up into the computer's memory a tree-like data structure like this, much like for problem set 5, you built up your own hash table in memory for what was otherwise just a big text file of words. 

So you can see the hierarchy here. If you think of the whole file as being a so-called document, we'll draw a node, so to speak, in this tree here. The very first and only child of that is the HTML tag. Indeed, every page has to start with that HTML tag. It has two children, as I proposed, head and body respectively. And then head has a title child, and that has a child itself, which is just text. And just to be a little nit-picky, I've deliberately drawn these nodes in slightly different shapes just to connote that HTML, head, title, and body are indeed all tags, opened and closed. These ovals here are just text. Those are not inside of-- those are not tags themselves. That's just raw text here and here. And then the document node is the one random one. This is the only thing that's going to start with an exclamation point, typically, unless you have what we'll call comments in HTML, which are just notes to self that we saw in C and in Python. There's similar syntax for those. 
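To make the tree concrete, here is a Python sketch that parses the page with the standard library's `html.parser` and builds a nested structure like the one just described. The page text is reconstructed from the description in this lecture, and the dictionary layout and names such as `TreeBuilder` and `outline` are choices made for this example, not how any particular browser stores its tree.

```python
from html.parser import HTMLParser

# The simple page described above, reconstructed for this example.
PAGE = """<!DOCTYPE html>
<html lang="en">
    <head>
        <title>hello, title</title>
    </head>
    <body>
        hello, body
    </body>
</html>
"""


class TreeBuilder(HTMLParser):
    """Build a tree of tags and text while reading HTML top to bottom."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "document", "attrs": {}, "children": []}
        self.stack = [self.root]  # the currently open tags, innermost last

    def handle_starttag(self, tag, attrs):
        node = {"name": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)  # child of the innermost open tag
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1]["name"] == tag:
            self.stack.pop()  # a close tag ends the innermost open tag

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace used only for indentation
            self.stack[-1]["children"].append(text)


def outline(node, depth=0):
    """Return the tree as indented lines, one per tag or piece of text."""
    indent = "  " * depth
    if isinstance(node, str):
        return [f'{indent}"{node}"']
    lines = [f"{indent}{node['name']}"]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed(PAGE)
builder.close()
tree = builder.root
print("\n".join(outline(tree)))
```

The printed outline shows document at the top, html as its only child, head and body as the two children of html, and the two pieces of raw text at the leaves.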

All right. With that said, if this is the simplest web page we can make, where do we make it? How do we make it? So you could certainly just open up your Mac or PC and open up something like Text Edit or Notepad.exe and type this out, copy and paste it, save the file, and open it in your browser. But that's not that interesting, because if you just save an HTML file on your Mac or PC, you are going to be literally the only one in the world who can visit it. So ideally, you want a server on which you can write and save your HTML so that other people, your users, your customers, can visit the file via the internet. 

Now, thankfully, we all have access to a tool already called CS50 IDE, which itself is a web-based tool for writing code, and the code we'll start writing now just happens to be in HTML. So let me go ahead and do that. Let me go ahead and open up a new file. I'll go ahead and call this, say, hello.html, dot html being the conventional file extension, and let me just go ahead and retype that. So !DOCTYPE html says, hey, browser, here comes version 5. html lang="en". And now notice what the IDE is doing for me. For better or for worse, depending on your preferences, it's going to try to complete your thoughts for you so you can just type less. This is increasingly a feature of IDEs, Integrated Development Environments, because now I can type roughly half as much. 

Now, I'm going to go ahead and open the head of the page. Notice it got automatically closed. I'm going to go ahead and open the title of the page. That will automatically close as well. And let me go ahead and just do something like hello, title. And then down here, outside of the head tag, I'll do my body tag and do hello comma body. Now, strictly speaking, this indentation is not necessary. If I wanted to be a little more terse and not use this many lines, this is totally reasonable as well. And it's probably reasonable up to a point. If I had a crazy long title, I probably should move it to a line of its own. But again, these details are not going to matter to the computer, to the browser reading this, but they certainly make it prettier and easier for me, the human, and presumably you to read as well. 

So I've gone ahead and saved this file. And in the past, I would have used like make for C or would have used python for Python, but neither of those is applicable, because we're not writing or running code. I now want to visit this web page. And how do I do that? Well, I need a browser, and I'm all set there. Obviously, I can use Chrome, Safari, whatever on my own Mac. But I also need a server. And it turns out that CS50 IDE, insofar as it is already a web server that we use to write code, we can use it as a web server to serve our HTML as well. So a little bit ago, when I played the role of a web server, I need to essentially implement in the IDE that same notion, of some program that's just going to listen and listen and listen, like I was waiting for Brian, and any time I get an HTTP request from anyone's browser, I'm going to respond with the appropriate file. 

Now, we're not going to implement a web server ourselves. Web servers are kind of commodity these days. Anyone can just download or pay for one and use one. And indeed, the IDE comes with one quite simply called http-server. So this is a program preinstalled in the IDE. It's free and open source. You can use it on Linux or Macs or PCs as well. But it's preinstalled in the IDE. And when I run it, what it's going to do for me is start, curiously, a second web server. Because the IDE itself is already running on CS50's own web server, I need to now run my own server. But in order to distinguish one from the other, I'm just going to use a different port. And by default, the port that the CS50 IDE uses is this one 8080. So again, by default, most web servers in the world use port 80 if insecure and port 443 if secure. But those are, unfortunately, already used by CS50 IDE itself, which is running already on CS50's web server. So if I want to use the same server, the same computer in the cloud, to listen for other requests of my own, I'm just going to start my own second web server in parallel and just have it listen on a different port. And that's just so that you and I can run our own web server even though we don't have control over the IDE itself outside of our own accounts. 

Now, it's a pretty cryptic-looking hostname, if you will. It's this random thing, 0cda3813 and so forth. But at the end of the day, it's just a URL. Notice that it ends in CS50.xyz, which is a domain name that we bought and we use solely for this purpose of running web servers on CS50 IDE. So if I go ahead and click that and click Open, voila, I will now see a fairly arcane textual listing of all of the files in the folder in which I just ran http-server. 

And let me go ahead and zoom in a little bit, and you'll see that there's only one file on there thus far that we've written, hello.html. So let me go ahead and click on that file. And voila, there it is, hello, body, my very first page. I don't see the title because I'm in full-screen mode, but let me go ahead and un-full-screen myself, and sure enough, if I zoom in on the title in the tab of this page, it's hello comma title. 

So what has just happened? I happened to be using CS50 IDE just because it's convenient. You and I already have accounts on it. We're running our own web server, implementing the software version of the role I was humanly playing earlier. I'm using Chrome as my browser, just like Brian was our browser in the story before. And so when I visit this long URL in my browser's bar that the server told me to visit, notice that it ends with /hello.html. So all in one environment, I'm serving web pages and requesting web pages. And this is perfect, because this is what a real-world software developer would do when building their own websites or web applications. They want to actually keep everything local and work on it and work on it until they're ready to release it to the world. 

Well, let me go ahead here and point out one thing in the tab here. And in fact, some of you have very cleverly, actually amazingly, transcribed that URL, because I'm seeing more HTTP requests coming in right now live. Notice that in the terminal window of my IDE where I ran http-server, I'm seeing, row by row, the requests coming in. And so this is kind of a log, because my web server is still running, and if any of you actually want to type out that same URL again, if you rewind in time in the video, you can actually visit my hello.html file right now on the internet, assuming you're watching the lecture live, and you can see new rows appearing in my output here. But that's just to say it's useful for us diagnostically. 

But let me go ahead and do something else here for just a moment. I'm going to go ahead, and in a moment, create another file, this time to demonstrate some other HTML tags. So let's go back here and in my-- and I'll keep my terminal window running, but I don't really care about the output now, so I'm just going to go ahead and minimize it down there. I'm going to go ahead and create another file up here called paragraphs.html. 

And let's see if we can't introduce some other features of HTML. I'll go ahead and type out the same as before, !DOCTYPE HTML, my HTML tag with my lang for English attributes. Sometimes, admittedly, the IDE will get confused if I start a thought, don't finish my thought, then try to finish it again. And that's fine. You might just have to clean up what the IDE is trying to do for you to be helpful. I'm going to go ahead and create the head tag here. I'm going to give myself a title here. I'll call this page paragraphs. I'll keep it all in one line just to keep it succinct. I'll open up my body. And now I'm going to go ahead and type out five paragraphs of Latin text that I'll just go ahead and put right here. And let me go ahead and indent this nicely just to make it nice and readable. This is your lorem ipsum text, which is just Latin-like nonsense. And here I have five paragraphs of text now. So this is different. It's way more than just hello, body. 

So let me go ahead and save this file. Let me go back to my other tab here. Notice that nothing has changed until I click reload, which will reveal the latest contents of my folder. So let me click paragraphs.html, and I should see five paragraphs of Latin-like text. Huh, no, that's just a big mess, one massive, long paragraph. Any instincts for what the bug here might be? Any thoughts on the chat or with a raised hand? Yeah, over to Ryan. 

AUDIENCE: At least from the way it's set up, it doesn't look like HTML has auto line spacing by default, so it's going to collect them all into this one big string unless you somehow create a space in between each paragraph. 

DAVID J. MALAN: Exactly. HTML, like most any computer language, programming or otherwise, is going to take you literally, and if you don't tell it what to do using, in HTML cases, these tags, it's just going to do some default behavior instead. So let me actually go back to CS50 IDE-- and you know? Let me introduce another tag here. It turns out there's a tag called the paragraph tag. And the shorthand notation for that is quite simply open bracket p closed bracket. The IDE is going to try to finish my thought, but because I already have the paragraph, I'm going to need to manually fix this myself. So let me go ahead and open it there. And let me go ahead now and just insert a few of these. So one there, one there, one there, one there. And let me go ahead and copy the close tag. One there, one there, one there, and one there. And now, let me just, for style's sake, indent further. 
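As a rough sketch, paragraphs.html might look something like this once the paragraph tags are in place. The short sentences here are placeholders standing in for the lecture's five longer paragraphs of lorem ipsum text, which aren't reproduced in this transcript:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>paragraphs</title>
    </head>
    <body>
        <!-- Each p element becomes its own block of text with space around it -->
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
        </p>
        <p>
            Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
        </p>
        <p>
            Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.
        </p>
    </body>
</html>
```

Without the p tags, the browser would collapse the line breaks between these sentences and render them as one run of text.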

And I know that pretty much in every past week I've claimed that copy paste is bad-- not really the case with HTML, because if you want multiple paragraphs, there's no notion of a loop. You can't create five paragraphs of Latin-like text with HTML alone, so copy paste, in this case, is the right solution. 

All right. Let me go ahead now and go back to my other tab and now hit reload. Nothing's going to change until you tell it to, so just like you would reload a normal website, let me reload my own. And voila, we fixed the problem that Ryan identified by now explicitly using HTML's paragraph tag. And it's deliberately the p tag because HTML tags tend to be succinct. It's fewer characters to type. And how do I know it's the p tag? You just have to learn it at some point, in a class, in a book, in a website. And indeed, much like with Python, as with C, we're not going to aspire to teach you the laundry list of HTML tags and attributes that are out there, but focus today particularly on concepts and fundamentals so that you can add to your vocabulary quite quickly via any number of online resources that we'll point you to. 

All right, let's go ahead and do this. Rather than do everything from scratch, let me go ahead and copy this and create another file that I'll call headings.html. When writing a paper or when writing or reading a book, it's very common to have chapter headings or section or subsection headings, and indeed, you can do this in HTML as well. So I'm going to go ahead and introduce a couple more tags here, namely the H1 tag, which is like the biggest heading tag. And I'm just going to write the word one here just to keep it simple. Over here, I'm going to do H2, and I'll say two. Down here, I'm going to go ahead and say H3, and I'll say three. And down here, I'll do H4, and then four. And then down here, I'll do H5-- here, five. And then down here, I've run out of paragraphs, but there is a six, so I'm going to go ahead and give myself one duplicated paragraph just for demonstration's sake so that we have all six here and go ahead and save it there. 
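A minimal sketch of what headings.html could look like, assuming one short placeholder paragraph under each heading rather than the lecture's full lorem ipsum text:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>headings</title>
    </head>
    <body>
        <!-- h1 is the largest heading; h6 is the smallest -->
        <h1>One</h1>
        <p>Lorem ipsum dolor sit amet.</p>
        <h2>Two</h2>
        <p>Lorem ipsum dolor sit amet.</p>
        <h3>Three</h3>
        <p>Lorem ipsum dolor sit amet.</p>
        <h4>Four</h4>
        <p>Lorem ipsum dolor sit amet.</p>
        <h5>Five</h5>
        <p>Lorem ipsum dolor sit amet.</p>
        <h6>Six</h6>
        <p>Lorem ipsum dolor sit amet.</p>
    </body>
</html>
```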

All right. So if I go back now to my browser and reload, nothing happens because I'm in the wrong file, but if I go back, I now have a file called headings.html. Let's click that. And it's the same content, but now, it's getting a little prettier, right? It's big and bold headings, one, two, three, four-- notice those headings are getting smaller and smaller, but that's the convention in a book or an academic paper where your sections and subsections and subsubsections get smaller and smaller. And we can customize this if we really want, but out of the box HTML gives us the ability to even format things like headings like that as well. 

Well, what else can we do in HTML? Well, let me go back to my IDE. Let me go ahead and copy paste some of this just to save some time. And let me create another file called, say, list.html. Turns out HTML makes it really easy to write lists. So here, let me change my title to lists. And down here, suppose I wanted to have a list of three things, like foo, bar, and baz, which are generic computer science terms: whenever you just need placeholders like x, y, and z in math, foo, bar, and baz are what people tend to reach for. 

All right, I have a nice clean list there. Let me go back to my other tab, go back to my directory index here, and there's list.html. Let me click on that. And voila, same problem as Ryan identified. Again, if I don't pedantically tell the browser, start a list, continue the list, keep going, end the list, it's just going to assume that I just want one big block of text. In fact, it hasn't preserved my whitespace. It collapsed all of those new lines and tabs into single spaces. But that's not what I want. 

So how can I fix this? I need some kind of additional tag. And it turns out there's a couple of approaches. There's the unordered list tag, so ul, for Unordered List, which means it's not numbered. And then inside of that, you can have child tags called the li, or List Item. Foo, let me give myself another one, bar, and give myself another one, baz. So it's more to type, and definitely there are almost as many red characters, the HTML, which is just being nicely syntax highlighted for me by the IDE, as there is actual content, foo, bar, baz. But if I now go back here and reload, I get a much cleaner, bulleted list. And if you looked at the course's website, we actually make heavy use of bulleted lists for content and indentation and so forth. We're just using a whole bunch of ul tags. 

If, by contrast, you wanted the computer to number things for you, you could certainly do it like this, 1, 2, 3, but you can imagine that getting a little annoying quickly if you want to reorder things or add things in between. So computers are really good at doing tedious things. So let me change this unordered list to an ordered list using ol instead. And if I go back to the other tab, reload, voila, now it's 1, 2, 3, and it's automatically numbered for me. I don't have to worry about it at all. 
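Here is a small sketch of list.html in its ordered form, along the lines described above. Swapping the ol open and close tags for ul would give the bulleted version instead:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>lists</title>
    </head>
    <body>
        <!-- ol numbers its items automatically; ul would show bullets -->
        <ol>
            <li>foo</li>
            <li>bar</li>
            <li>baz</li>
        </ol>
    </body>
</html>
```

Because the browser does the numbering, inserting or reordering li elements never requires renumbering by hand.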

Well, let's do one other that speaks to the structure of the page. Let me go ahead and copy my starting point, hello, and create a file called table.html. If you ever want to lay out tabular data where you have rows and columns because you want to make sense of some financial information or just something akin to a spreadsheet in your own website-- well, how can we do this? Let me go ahead and call this table. And down here in my body, let me introduce the table tag. And the table tag is a little more involved because you have to define what are called table rows. So I can do a tr tag there. And then inside of table rows I can have data, so td for Table Data. Let me just put the number one. And let me go ahead and make-- let me mock up something a little familiar, like a phone keypad, 2, and 3. Then let me go ahead and copy this once more and give myself another row with, say, 4, 5, 6, and let me give myself one more of those with, how about, 7, 8, 9. And then lastly, one more of those just to give myself the equivalent of a keypad and do the asterisk and then zero and then the pound key. 

Let me save this, go to my other browser tab, open up table.html, and voila, you see something akin to an old-school phone keypad there. And there are implicit rows and columns. If I wanted to, I can make it a little prettier with actual lines or borders in between and around these things. But HTML gives me the ability to lay out tabular data using trs for table rows and tds for the columns therein. 
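The keypad described above can be sketched roughly like this, with one tr per row and one td per key:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>table</title>
    </head>
    <body>
        <table>
            <!-- Each tr is a row; each td is one cell within that row -->
            <tr>
                <td>1</td>
                <td>2</td>
                <td>3</td>
            </tr>
            <tr>
                <td>4</td>
                <td>5</td>
                <td>6</td>
            </tr>
            <tr>
                <td>7</td>
                <td>8</td>
                <td>9</td>
            </tr>
            <tr>
                <td>*</td>
                <td>0</td>
                <td>#</td>
            </tr>
        </table>
    </body>
</html>
```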

All right this is all pretty boring and textual, and really not the web that you and I all know, so let me go back here, and let's do something a little more interesting. Let me go ahead and start off a file called maybe image.html. And let me go ahead and start with our boilerplate as before. I'll rename this title image. And down here, let me go ahead and do something like this. Let me go ahead and do image-- how about source equals quote-unquote "harvard.jpg". It turns out I came with an image of Harvard in my IDE. And let me go ahead and describe it as much, too. Let me add this alt attribute here, Harvard University. 

And we'll come back to what this means in a moment, but here we have this second tag thus far that actually shows us how to customize the behavior of a tag. So the lang attribute earlier customized the behavior of the whole web page by telling the browser here comes a web page written in English. And down here, we have two attributes, alt, which has a value after the equal sign, and then src, S-R-C, which itself has a value after the equal sign. You can use single quotes or double quotes, but you should be consistent. But each of these attributes should have an equal sign in between the key and the value there. 
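A minimal sketch of image.html, assuming a file named harvard.jpg sits in the same folder as the page. Note that the tag is spelled img and the attribute src, even though they're read aloud as "image" and "source":

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>image</title>
    </head>
    <body>
        <!-- alt describes the image for screen readers; src says which file to load -->
        <img alt="Harvard University" src="harvard.jpg">
    </body>
</html>
```

The img tag has no close tag, since there is no text content for it to wrap.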

So how might I go about doing this? Well, let me go ahead and open up this file now. Let me go ahead and reload. We should see now image.html and Harvard.jpg, which I just grabbed from my IDE. And voila, image.html is the original painting of what's adorned our backdrop here for the past several weeks. So that's interesting. You can link to a specific image like that. 

And what's the role of the alt attribute? So the alt attribute is all too overlooked by new and experienced programmers alike, but this speaks to accessibility. Not all of us can necessarily see and hear and interact with media in the same way as others, and so those who have difficulty with sight or with sound or the like, the alt attribute is a wonderfully powerful and so simple mechanism to include on your image tags that literally just describes in English or your own spoken language what it is a human would otherwise be looking at, even if they are perhaps blind and cannot actually see what's there. And if they have a screen reader, installed software that actually can vocalize text on the screen, this incredibly usefully helps people hear what it is that you and I might otherwise only be looking at. So be sure to be mindful of those kinds of tags, and you would only know that these tags exist by, again, taking a class, reading a book, looking at our online reference. We're just beginning to add to now our vocabulary. 

And in fact, let's take this one step further and do something a little more powerful and familiar still. Let me go ahead and create a file called link.html. Let me paste my starting point there. I'll retitle this as link. The web, of course, is filled with links, and indeed, HyperText Markup Language is all about hypertext, which is an arcane allusion to links. Hypertext is text with links that link elsewhere. So how might I implement a link in a web page? Well, let me go ahead and, in this page, initially just encourage people to visit Harvard, period. 

Let me go back to my other browser window, open up now link.html. And of course, this does not really do anything. I can't click on visit Harvard or anything else and have it do anything, because it's obviously just text. So how can I actually link the user to some destination? Well, we need another tag. It is called the anchor tag, abbreviated with a single letter a. It has an attribute of href, for Hypertext Reference. And hypertext reference just means, what do you want to link to? Well, let's go ahead and keep it simple. Let's link to a file I already created, image.html. And the word I want to link is literally Harvard. So on the left of the word "Harvard," I have a href="image.html", on the right of Harvard, I have the close tag. And again, notice just because I had an attribute on the tag does not mean you need to redundantly copy paste it in the close tag. It suffices to close only the name of the tag. 

Let me save the file, go back to this page now. Let me zoom in a little bit and reload. And voila, now you see the familiar hyperlink that you might see on many web pages where it's actually underlined. And indeed, if I hover over that and then click, voila, we'll find ourselves at Harvard University back in 1792, because now what I'm looking at is image.html. And in fact, let me go out of full screen mode for just a moment to make clear that the URL at this point in the story where I see just Visit Harvard in the page, is something/link.html. Your URL will differ from mine, but mine happens to be this long, cryptic string, because it's my account slash link.html. When I click on the link, though, notice that I end up at image.html, thereby taking me to a relative URL that is a file in my own account. 

If I don't want to link to that file, though-- maybe I want to link to Harvard itself. It's not sufficient to just do Harvard.edu. That is not a URL. www is not a URL. I need my protocol, so to speak, either HTTP, or, better yet, HTTPS. If I save that file now and reload and go back here, the text looks exactly the same, but notice, if I hover over it, there's a tiny, tiny, tiny, tiny little visual clue at the bottom of the screen that says where I'm going to end up. And indeed, if I click this now, notice that my URL bar is not going to stay as my IDE slash link.html. It's going to whisk me away to the actual Harvard.edu. 
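As a sketch, link.html with an absolute URL might look like this. The exact address typed in the lecture isn't shown in the transcript, so https://www.harvard.edu/ is an assumption; replacing it with image.html would give the earlier relative link:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- href holds the destination; the text between the tags is what the user sees -->
        Visit <a href="https://www.harvard.edu/">Harvard</a>.
    </body>
</html>
```

Nothing ties the visible text to the destination, which is why the text and the href can disagree.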

And here, it's worth noting that Chrome and Safari and other browsers, for better or for worse, are increasingly simplifying the user experience or UX of browsers. I am not literally at Harvard.edu. If you click or double-click on the address bar, you'll see where you actually are. And this is, for developers, for the worse. For regular users, it's probably cleaner just to see the domain name. But all of the information is indeed there if you dig for it just a little bit. 

But there's kind of an exploit possible here. What if I were to do something somewhat malicious, like this? Like, let me change this to Yale.edu but leave the word "Harvard" unchanged? If I go back now to my other tab and reload, it looks different at the moment because it's blue instead of purple. Purple by default means I've been there before, which we were a few minutes ago. Blue means I haven't visited before. But if I don't really notice that subtlety, I might very well think that, oh, this is the university I want to go to. But voila, when I click on that, wrong place. 

All right, silly example, but this can really be exploited for ill purposes. What comes to mind, or what threats come to mind with this very simple mechanism? Right, now that you have the ability to make web pages, you have the ability to say you're going one place but really lead the user elsewhere. Can you see how this might be abused? Santiago? 

AUDIENCE: I think it maybe could be used by so-called hackers who can break in and insert malicious software into your computer? 

DAVID J. MALAN: Yeah. 

AUDIENCE: And they trick you into doing that. 

DAVID J. MALAN: Yeah. And "trick" is the operative word. I mean, most of us are probably not in the habit of, before clicking on a link, hovering over it, like I did a moment ago, and then very paranoically looking down here to see-- am I really going to the right place? And even this can be spoofed. You can trick the user into thinking they're going to the right place but still override this behavior. And so if you've ever been the victim of or the near victim of a phishing attack, P-H-I-S-H-I-N-G-- "phishing" refers to trying to trick humans, as Santiago says, via social engineering into doing something that they didn't actually intend. 

And so you can imagine receiving spam in your email inbox that says, click this link to visit PayPal.com, because you need to verify your password, or click here to tell us your social security number. This is so common these days to get emails which themselves these days are-- if they're not just text, they are HTML itself. When you're looking at any email in Gmail that has clickable links or images, that email contains HTML like we're writing here. It is trivial to trick users into going places that they didn't actually intend. And so among the takeaways for today, beyond the mechanics of how to do these things, should be consideration for your own personal security, as to how distrusting you should really be of websites because of how simple these mechanisms are and how they can lead you, indeed, to the wrong place. 

And recall that a bit ago, we wrote this link.html example, which had, in the correct version, a link to www.Harvard.edu whose text was the word "Harvard". Suppose now that we want to override the browser's default stylization of links-- which, recall, if I now visit in my other tab, link.html, is pretty boring. By default, and this has been true for like 20 years, links tend to be blue and underlined before you visit them, or purple and underlined after you visited them at least once, a visual cue. But most websites today, including CS50's own, use different colors and different aesthetics for links on a web page, with or without underlining, different colors, maybe even different background colors. You can style these things using CSS in bunches of ways. 

So how might we do this? Well, let's go ahead and be fair here and say Visit Harvard, or, for instance, a href="http://www.yale.edu" question mark-- no, nope, close bracket Yale. Let's give myself two links that, of course, if I reload, just looks like this. Both of them are now boring and purple because we've been to both places already. 

So let me go ahead and add a style tag up here. Just to keep us in the same file, I'm going to go back to using a style tag rather than introduce a separate file. So to keep things simple in my style tag, let me go ahead and change these links, for instance, to have a color of-- maybe let's make them all red initially, ff0000. 

And let's go ahead and save that. Let me reload, and you'll see that now both of the links are red and underlined. I don't really like the underline, so let's get rid of that. Let's change the text decoration of my a tags to be none. And again, you would only know these properties exist from some form of reference. But again, just adding to our vocabulary. Let's reload now. And now the underlines are gone. 

But it would be kind of cool, like some websites, if when you hover over the link-- at least on a laptop or desktop-- the link then underlines, drawing your attention all the more to it. How can we do that? Well, it turns out that you can use what are called pseudo selectors, which are-- work like this. If I want to change the behavior of the a tag but only when our user is hovering over it, you literally write the name of the tag colon hover. And then inside of this block, I'm going to go ahead and say text-decoration underline. So this will say, make everything red and not underlined by default, but when the user hovers, go ahead and decorate it with an underline. So let's go back to the file after saving, reload. No visual change yet until I move my cursor up here, and voila, now it's getting a little more like modern websites. 

Now, this isn't quite fair to Yale that both of the links are red, so what if we change the colors of different types of links? Well, let me go down here, and I need to distinguish these links in different ways. And I could use classes for that. But if I've only got one Harvard link and one Yale link, I might as well uniquely identify them. Let me go ahead and add an attribute called id of quote-unquote Harvard, and I'll keep it all lowercase, kind of like a variable. And then here I'm going to say id="yale". I could call these things anything I want, but because there's only one Harvard link and one Yale link, I'm going to add an attribute in HTML that just lets me verbally uniquely identify each of those links. 

But up here, notice what I can do now. Let me go ahead and remove the color from here and let me instead say that for the tag that has an ID of Harvard, go ahead and color it as ff0000, but if it has an ID of Yale, go ahead and have a color of 0000ff, so that should give me my blue in RGB. And notice the new symbol here is the hash symbol. So in the world of CSS, a hash symbol before a word means the unique identifier Harvard or the unique identifier Yale. A dot before a word means the class centered, the class large or medium or small. And if you don't have any symbol before the word, like a hash or a dot, it means literally the tag called a or literally the tag called a when it's being hovered over. So again, a few pieces of syntax. It's not programming code, but it is code of some sort here. 
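Putting those pieces together, a sketch of link.html at this point could look like the following. The Yale URL is the one read aloud above; the Harvard URL is an assumption, since the transcript doesn't spell it out:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <style>

            /* # selects by id: exactly one element each */
            #harvard
            {
                color: #ff0000;
            }

            #yale
            {
                color: #0000ff;
            }

            /* No symbol selects by tag name: every a element */
            a
            {
                text-decoration: none;
            }

            /* :hover applies only while the cursor is over the link */
            a:hover
            {
                text-decoration: underline;
            }

        </style>
        <title>link</title>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/" id="harvard">Harvard</a> or <a href="http://www.yale.edu" id="yale">Yale</a>.
    </body>
</html>
```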

Let me save this. Let me go back to my tab and reload. And voila, now I have the beginnings of a prettier website where I'm distinguishing Harvard with its underline in red and Yale with its underline in blue, but only under those certain conditions. So we have with CSS the ability to much more precisely control the aesthetics of our web pages. 

All right, let's go ahead and clarify just a couple of things here. Let me go up to one final example and see if we can't now come back to that idea of user input. So let me go back to the IDE here and let me grab a little bit of starter code from my hello file as before and create one final example here called search.html that's purely HTML indeed. I'm going to name this thing search, and then I'm going to go, down in my body of the page, use another new tag. Turns out HTML also supports a form tag, and that form tag can take a couple of attributes, one of which is action. And this is where you want to have the form lead the user. I'm going to go ahead and come back to that in just a moment. The other is method, and the method is the HTTP verb to use. For now, I'm going to use get. And here, inconsistently, it should be lowercase, even though we've previously seen it in uppercase. 

Inside of the form tag, I'm going to have a couple of inputs, an input whose name is going to be q and whose type is going to be search. And then down here, I'm going to have another input whose type is going to be submit and whose value is going to be quote-unquote "search" as well. They're a little different. I'm deliberately omitting the name because it's not strictly necessary. But where am I going with this? Well, I haven't actually implemented a search engine. All I'm doing at the moment is implementing a front end to a search engine, a front end to google.com. I'm going to let Google itself do the hard work of actually searching the internet for me. So I'm going to specify an action of https://www.google.com/search. 

So here we have what is about to be a form, text boxes and buttons that the user can interact with, the action of which is going to be to send the user to this URL using get. But that URL is going to have automatically added to it by my browser one HTTP parameter, so to speak, a variable of sorts, called q. And why this? Well, recall earlier that when I visited google.com, I was able to simulate a search by literally going to https://www.google.com/search?q=cats. I claimed that Google is designed by the software engineers there to take user input via the URL. 

Well, you and I do not search for things by typing out long URLs like that with q equals anything. That would be an incredibly poor experience. You and I just type things into search boxes or forms. So indeed, if I now go into my other tab here for search.html, you're not going to be very impressed by the aesthetics of my form right now-- it's just a rectangular text box and a search button. But watch what happens. My URL at the moment ends in search.html. I'm going to go ahead and type in something literally like cats. And now notice, if I hit Enter or manually click on the Search button, my web page, which contains an HTML form-- because it has an action that's Google's URL and a method of get, my browser is going to convert that into the corresponding HTTP request and in turn URL so that the user is automatically sent to, if I double-click on it, this full URL here. And the user's input is automatically appended by the browser to the URL via ?q=cats. And it's not just cats. We now have our own very simple Google search engine where we can search for dogs, too, if we want. And notice that the URL here changes to end in ?q=dogs. 
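A minimal sketch of search.html as described above. The button label "Search" is capitalized here as a small assumption; the rest follows the attributes named in the lecture:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <!-- On submit, the browser requests the action URL with ?q=<input> appended -->
        <form action="https://www.google.com/search" method="get">
            <input name="q" type="search">
            <input type="submit" value="Search">
        </form>
    </body>
</html>
```

The name attribute on the first input is what becomes the key in the URL. The submit button has no name, so it contributes nothing to the query string.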

So, like, this is how the web now works. We talked earlier about how the internet works, how you just get raw data, zeros and ones, packets of info from point A to point B. This is now how the web works. When you visit a website and don't just want a specific picture of a cat or a dog, you want to search for cats or dogs, or you want to log into a website, or you want to check out of amazon.com, providing user input, you are always filling out HTML forms which look essentially just like this. They might have more inputs and they might be a little more complicated, but there are form tags on amazon.com, on Facebook.com, on any website with one or more inputs, and when those are submitted, so to speak, via you hitting Enter or clicking Submit or Search or whatever the button is labeled, that's how the next request gets submitted to the server. 

Whew. So that's a lot, and that's not all the HTML tags out there, but that's all the ideas of HTML, right? It really is like open tags and close tags, some of which can have zero or more attributes, and just understanding that mental model, and now, via forms, we have the mechanism for submitting, so to speak, user input to search to websites, to web servers. And just to call out that other verb here, turns out Google only supports get for its search program. At www.google.com/search, you can only use get, but POST is also very common. And in fact, POST is a different verb that can be tucked inside of that envelope. 

And POST actually changes what happens in the browser so that q=cats and q=dogs does not show up in the URL, because this is actually one other threat to our privacy, right? If your little sibling comes over to your browser or your parents after you've been searching on the web, they can scroll through, often, your entire history of your browser. Why? Because literally everything you search for on Google or Bing or whatever ended up in the URL because of this mechanism. And for user convenience, your browser tends to cache or save all of those URLs. 

Now, that's marginally intrusive to your privacy, certainly, when you have roommates or family members or the like, but it's especially concerning if you are registering for websites or checking out with usernames and passwords and credit card numbers and things that are even more sensitive. Long story short, POST, which is just a different verb that you can use in HTTP, hides the q=cats. That hides the credit card number equals this. That hides the password equals that, essentially, metaphorically, deeper in the envelope. It does not put it into the URL bar, but it still sends it to the server a little more privately. 

All right. That was a lot. That was a lot of tags all at once culminating in this ability to get-- to submit user input. But any questions, then, on the ideas, the syntax, or the implications of making web pages with this language HTML? Nothing from you, Brian? 

BRIAN YU: All good here. 

DAVID J. MALAN: All right, well, I'm kind of embarrassed by all of the work I've done thus far, because all of these pages are incredibly boring and very underwhelming and nowhere close to the sort of user interfaces that you and I are familiar with day-to-day when using the actual web. So let me go ahead and close most of these tabs, and let's transition to not focusing on the structure of web pages alone, but now focusing on the aesthetics of web pages, the stylization of web pages, where now your artistic flair can really come out. And we can begin to recreate user interfaces ultimately more like our world of Scratch, where you actually see colors and shapes and images and sounds, but still using these basic building blocks. 

And for this, we need a language called CSS or Cascading Style Sheets. This, too, not a programming language, but it is an additional language that you can use in conjunction with HTML in order to stylize your pages and make them prettier. CSS boils down to the use of what we'll call properties. Properties are similar in spirit to variables, or more generally, they're just more key value pairs. And again, notice this recurring theme. Like, we introduced dictionaries or dicts in Python a couple of weeks ago. Those are just collections of key value pairs in the form of a hash table, essentially. We saw just a moment ago attributes in HTML which essentially are key value pairs. q=cats is a key equals a value. q=dogs is a key equals a value. CSS has the same idea, but because different people invented this, they call these things properties instead of attributes. But it's the same idea, just a different vocabulary. And there's going to be different ways to apply different properties, like color and font size and positioning, to HTML tags using CSS. 

So how do we get there? Well, it turns out that CSS is a language that you can use in conjunction with HTML, which is to say you can start with an HTML page like this, like we've already created, saying hello, title and hello, body, but you can add some additional attributes and/or tags to it to begin to stylize it. You can actually add a style tag in the head of the web page, pictured here, which is another thing we could put inside the head, though strictly speaking, it can go elsewhere in the body as well. Alternatively, you can link to CSS that lives in a separate file. And so we'll see a couple of different approaches there. 

But before we do that, let's make this more real with some concrete examples. Let me go ahead and whip up a new example here in a file called css.html just to demonstrate this new language. Let me go ahead and start, as always, with just my raw HTML, with my !DOCTYPE html. Let me go ahead and do my html lang=en. Then down here, I'll do the head of my page, the title of my page. I'll just title it simply css. The body of my web page-- and now let's actually do something interesting. Let me introduce a few more HTML tags that are available in the language. One is called header. And here, I'm going to go ahead and say something like John Harvard. I'm going to make a homepage for John Harvard, the founder of Harvard. And let me go ahead here and do a main section of my page. And I'm going to say welcome to my home page. And then down here, I'm going to have a so-called footer. And inside of here, I'm going to say copyright, and I'll write copyright symbol John Harvard. 

So super simple web page, three lines. And just here's three new HTML tags. You would only know this, again, from a class, a book, or an online reference, but there are tags called header, main, and footer that don't do anything special. They don't make things big and bold like the heading tags did, H1 through H6. They are just what are called semantic tags. They are more than generic paragraph tags, but they help you, the programmer, and they help the browser know that, oh, this is the header of your page. This is the main part of your page. This is the footer, which can help, again, with screen readers, distinguishing what's really important on the page. It can help with translation tools, like Google Translate, to know what you actually want translated, like the main part of the page and not less important info, like the header or footer. So just three HTML tags, but you can think of them like paragraph tags but with more specific names. 

All right. So let me go ahead and save this file, open up-- not my dogs here, but go back to the index. And now I have a new file, css.html. Let's click it, and voila. You know, it's pretty underwhelming, but it does what it says. It's got three lines with this content. Let's begin to stylize this and try to make it a little more inviting. So to do this, let me go ahead and add a style attribute with equals quote unquote. Now, here's where things get a little weird. In the world of HTML and CSS, you do actually, or can actually, commingle the two languages. And we kind of saw this already. We saw Python code with SQL inside of it. Today, we're seeing HTML code with CSS inside of it. So inside of these quotes, I'm going to put some of those so-called properties, key value pairs. 

Let me go ahead and change the font size of my header to be large and let me go ahead and align the text as centered. Notice the pattern here. It's new. It's not equal sign. There's not additional quotes. It's because, again, left hand wasn't talking to right hand, and different people invented HTML and CSS, essentially. But font-size is the name of a CSS property, an aesthetic detail. Colon is what separates the key from the value, and the value in this case is large. And I'm choosing large because it's one of the available ones. You've got small, medium, large, extra large, and a bunch of others as well. You can also use specific font sizes or pixel sizes as well. text-align is going to align text colon in the center. 

Now, let me go ahead and do something similar here. On the main tag, let's do font-size medium, and then text-align center. And then down here, let's do style="font-size small, because it's the footer. It's less important. text-align center. All right, saving the file. And I'm sad to say the semicolons are back. They're not strictly necessary in all places, but I'm doing it for consistency. They're definitely needed here to separate keys from other keys. 

Let me open up this page again. Let me hit reload. And voila, it's a little prettier. It's nothing to write home about, really, but it has John Harvard as large, "Welcome to my home page" as medium, and the footer as something small. So that's not bad. And let me clean this up, too. This open parentheses c close parentheses isn't necessarily that pretty. I don't know where the symbol is on my screen, but let me go ahead here and let me type out this, ampersand hash 169 semicolon. This is weird, but this is a feature of HTML called an HTML entity, which is a numeric code that identifies a symbol that is often not on your keyboard but you might want to display anyway. Let me go ahead here and reload. And voila, if I zoom in, now we have a proper copyright symbol. And there's other HTML entities for other symbols as well, especially if you can't see them on your keyboard. 
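As a small illustration of what a numeric entity means, here is a sketch in JavaScript, the language this lecture turns to later. The function name decodeNumericEntities is made up for this example, and it only handles the decimal form shown above, not everything a browser supports.

```javascript
// Replace each decimal numeric entity, like &#169;, with the character
// whose code point is that number.
function decodeNumericEntities(text) {
  return text.replace(/&#(\d+);/g, (match, digits) =>
    String.fromCodePoint(Number(digits))
  );
}

console.log(decodeNumericEntities("Copyright &#169; John Harvard"));
// Copyright © John Harvard
```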

All right. What happens after this? That's not bad, but I feel like there's an inelegance here. Can anyone recognize in the code I've written thus far, even if you've never seen HTML before-- is there an opportunity for better design? I claim it's correct, but feel free to chime in the chat. Can I improve the design? And even though I said earlier that copy paste is bad, that doesn't mean we can't-- or is necessary in HTML, that doesn't mean we can't chip away at it in some places. Brian, anything jumping out? 

BRIAN YU: Some people are saying that your style attributes are starting to get very long. 

DAVID J. MALAN: They are getting long, and I dare say redundant. Even though the font size is changing, so that seems kind of necessary, text-align center, text-align center, text-align center-- that seems unnecessary, right? Like, why am I doing that again and again? But here's where the C in CSS comes in. CSS is Cascading Style Sheets, which literally refers to a waterfall effect of these properties. So what I can actually do is-- let me go ahead and remove text-align center from each of header, main, and footer. And let me be a little smarter. Let me go up to my body tag, which is the so-called parent tag, for those three, add a style attribute there, and put text-align center there, and trust that because body is the parent, any properties it has will cascade down onto, so to speak, its children, thereby allowing me to define text-align in one place instead of three. Let me go ahead and reload the page, and voila, nothing has changed, but I claim now that my page is better designed because I've eliminated that redundancy and made my lines, to Brian's point, a little shorter as well. 

So that's pretty clean, but this seems a little bit of a slippery slope. If I wanted to make a larger home page, which surely I might, with lots more content, having all of these style attributes all over the place very quickly gets messy. And in the real world, it makes it hard for you and I to collaborate with better artists than you or I might be. It might be nice for one of us, if working on a team, maybe for even a CS50 final project-- one of us does the HTML, one of us does the CSS. This would be a mess if you and I were trying to collaborate on the same exact file. So let's see if we can't take a step toward factoring these out and keeping HTML and CSS separate. 

So how might I do this? Well, let me go ahead here and get rid of all of this. Let me get rid of all of these style attributes, which, again, just doesn't feel very maintainable, certainly as my page gets longer. And let me transition to that other format we saw a teaser for before, a style tag. Not an attribute, but a tag that can go in the head of the web page here. And let me go ahead and do something a little different here. Let me go ahead and say my header should be text-align center. My main part should be text-align-- let me do this just like we did in C-- text-align center. And then my footer will be text-align center. And then just for consistency, with before, font-size large. Down here in main, font-size medium. Down here, font-size small. The IDE is not recognizing all of these keywords, but it's OK that some are white, some are blue here. 

So there's still some redundancy here, but notice what I'm doing now. Inside of my style tag, it is allowed to use what's called a CSS selector. And there's different types of selectors, but the one we're seeing now is what's called the type selector. And it's a bit arcanely named, but this just means that if you want to style every instance of a certain tag, every header tag, every main tag, every footer tag, you can literally, inside of a style tag, put the name of that tag, then a pair of curly braces-- apologies, they are back for CSS-- and then inside of those curly braces, you just put those key value pairs, the properties separated by semicolons. And I'm stylizing it nice on separate lines just so that it's a lot more readable. 

The effect is not going to be any different. If I go back to my other tab and reload, nothing has changed, but I've begun to tease out the CSS from the rest of my page. But notice there's still some redundancy here, and I could remove, for instance, text-align center from here, text-align center from here, text-align center from here, and maybe apply it to the body instead. But there's other types of selectors I can use. If you want to define one or more properties in a reusable way such that you can use them on all sorts of tags, turns out CSS supports what are called classes. 

And I'm going to go ahead and do this instead. Instead of just saying my header is going to have a font size of large, I'm going to introduce what's called a class in CSS, the syntax for which is to use a dot and then a keyword of your own choice. And I'm going to call this first set of properties .large, this next set of properties .medium, and this next set of properties .small. And this is a little weird that it's starting with a dot. That's just because humans decided that when you define what's called a class, a reusable set of properties, you start your keyword with a dot. And I'm going to give myself one other one. .centered is going to be text-align center. 

So none of these classes, centered, large, medium, or small are in use yet until I now apply them to my HTML tags. So let me scroll back down to my HTML where I removed earlier all of those style attributes, and I'm going to use a different attribute now called class. I'm going to go ahead and add to this header centered large. I'm going to add class to the main tag as centered medium. And down here I'm going to say class="centered small". So I can have inside of the attribute called class a value that's just a space-separated list of words. And strictly speaking, it can be any number of spaces, but that just looks stupid and is stupid, stylistically, so just one space suffices. But this just means, hey, browser, apply the centered class and the medium class to the following contents. And again, if I go back to the tab and reload, nothing, still, has changed. I'm just changing the design. I'm hopefully improving the design. But here, too, you can probably note that using centered again and again is also, again, dumb. Let me remove that redundancy. Let me add to my body a class of centered. 

And now things are getting a little tight. Like, I've reintroduced some attributes, but now, if I'm collaborating with someone, I can say, can you go ahead and create me CSS classes for large text, medium text, small text? I'll just assume that you're going to do that correctly, and I can just use those terms, those classes, and assume that you have defined code in CSS like this stuff here. But you know what? I can do one step better. I don't need you to even touch this same file. You and I can work pretty independently. 

In fact, let me go ahead and propose this. Let me highlight all of the code I just wrote up here and cut it. Let me get rid of this style tag all together. Let me create a new file called, let's say, styles.css, just by convention, but I could call it anything I want .css, paste it into there, and save it. And let me go ahead and-- I don't need any of the indentation anymore, so let me just shift everything over there using a fancy keyboard shortcut. This could be the file you're working on. You create all of these properties and these classes for me in a separate file. Then in my HTML file, if you and I are collaborating, I can use a link tag-- a hyper reference value of styles.css, the relationship of which is that of stylesheet to your file. 

Now, here's where you just kind of have to throw up your hands. In an ideal world, we would have called links in web pages the link tag, and we wouldn't have used open bracket a for anchor. This is not a link in the clickable sense. This just means link this HTML file to this CSS file with a relationship of stylesheet. What is the stylesheet? It is a sheet of styles. It is a file of properties. And those properties can be inside of types or classes or what we'll soon see are unique IDs as well. And there's other types for that as well. But here we have, now, arguably, the best-designed version in that this is pretty compact. It's pretty succinct. There's only HTML and HTML values and attributes, but all the CSS is in this second file. Now you and I can really collaborate independently. 

Any questions then on CSS and what we can do with it? We have only scratched the surface of CSS thus far, but suffice it to say-- and actually, just to tease just how bad at this I might be, suppose you really want to have a colorful background. Well, let me go into my CSS file here, styles.css, and I don't have to use only classes. I can say something like, well, let's go ahead and make my body tag have a background color of-- how about red semicolon? Let's go back to the browser, reload. OK, it's getting ugly pretty fast. Let me go ahead and change the color of my text maybe to white to make it stand out a little more on the red. Reload. All right, it's back to something there. I can change this to any color, maybe a little Yale blue over here. Reload. Voila, it's now blue. 

Or if you really want to be fancy, per week four, how about we make it 00ff00, speaking hexadecimal, which, of course, is going to make it green for me instead? So this is to say there's so many different CSS properties out there. And again, we're focusing here only on the list-- the ideas. When it comes time to actually using these properties, we'll point you at the appropriate reference, just to flesh out your vocab all the more. 
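To see why 00ff00 comes out green, here is a short sketch in JavaScript, the language this lecture turns to later, that splits a six-digit hexadecimal color into its red, green, and blue amounts. The function name hexToRgb is made up for this example.

```javascript
// Each pair of hexadecimal digits is one byte: red, then green, then blue.
function hexToRgb(hex) {
  return {
    red: parseInt(hex.slice(0, 2), 16),
    green: parseInt(hex.slice(2, 4), 16),
    blue: parseInt(hex.slice(4, 6), 16),
  };
}

console.log(hexToRgb("00ff00")); // { red: 0, green: 255, blue: 0 }
```

No red, the maximum amount of green, and no blue: hence green.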

All right. Any questions, then, on the capabilities of CSS and its relationship with HTML? Anything on your end, Brian? 

BRIAN YU: All good here. 

DAVID J. MALAN: So the one thing we haven't done any of today at all is programming. And for that, we actually need a third and final language, that of JavaScript. And in the past weeks of the course, we've used what we would describe as really server-side programming. You have written C code. You've written Python code on the server, specifically on CS50 IDE, which is, by nature, in the cloud, a server. But it turns out that all this time you haven't really been using your own Mac or your own PC or your own phone, for that matter, other than it's just a very expensive display, a very expensive screen. All of the code is written, all of the code is running, on this backend server. And that's a missed opportunity, because all of your users, all of your friends, all of us are on our own pretty fancy Macs and PCs or phones these days. It's kind of a shame that those devices all have CPUs and RAM and storage space, yet we're not using any of those capabilities. We're really just using the glass screen to see what's elsewhere on a server. 

But with JavaScript, we have another language that we'll see in a bit that will allow us to write code, save it on the server, but run it on users' computers and do what's called client-side programming. So we'll still save the code we write on the server, but we're going to include the code inside of these virtual envelopes that get downloaded by users' browsers. And instead of just reading the code, top to bottom, left to right, and displaying information, as is the case with HTML and CSS, those browsers will additionally read the JavaScript code deeper in the envelope and execute that on users' Macs and PCs and phones. 

And here's where web programming gets really interesting, because you now have a full-fledged computer at your disposal that's not even your own. So let's go ahead and take another five-minute break here, and when we come back, we'll conclude with a look at JavaScript. 

All right. We are back and we're about to introduce a programmatic way of controlling web pages and even adding to web pages and changing the content users see, thereby making our websites no longer just static-- that is, written once and viewable the same way forever, but dynamic, somehow changing in response to user interactions or user input. Let's go ahead and now consider how we can make our pages all the more dynamic by introducing JavaScript. 

So let's go ahead quickly and introduce just some of the syntax of JavaScript. And wonderfully, JavaScript syntactically is pretty similar to C and Python, a little more syntactically similar to C, but there's no memory management. There's no pointers. So it's kind of like somewhere in between C and Python syntax. But it should all come fairly familiar-- fairly easily now. In Scratch, recall when we had a variable called counter initialized to zero, how do we now declare the same in JavaScript? It's going to look like this. So it's not quite Python. It's not quite C. You literally use the keyword let, which means let the following variable exist. The variable's called counter = 0;. Strictly speaking, these semicolons aren't always necessary, but for consistency, I'll keep using them here. 

Suppose you want to change the counter by one, as in Scratch, incremented by one. In JavaScript, you can do it very verbosely like this. You can do it a little more succinctly like this, like in Python and C, or the plus-plus is back. So if you've been missing that in Python, the-- as Nicholas seems to be, the plus-plus is now back as well. How about something like this, a condition? So if you want to say if x less than y, in JavaScript, it's going to say if x less than y. So here, we have the curly braces are back. The parentheses are back. But the ideas are still the same. 
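The slides being pointed at are not reproduced in this transcript, so here is a sketch of what the JavaScript being described presumably looks like. The values given to x and y are arbitrary choices for illustration.

```javascript
// Declare a variable called counter and initialize it to zero.
let counter = 0;

// Three equivalent ways to increment it by one.
counter = counter + 1; // very verbose
counter += 1;          // a little more succinct
counter++;             // the plus-plus from C

console.log(counter); // 3

// A condition: parentheses around the test, curly braces around the body.
let x = 1;
let y = 2;
if (x < y) {
  console.log("x is less than y");
}
```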

If else in Scratch looked like this. In JavaScript, it's going to look like that. In Scratch, if else if else looks like this. In JavaScript, it's going to look like this, so more like C. There's no weird elif abbreviation. It's literally else if again, but with the parentheses and with the curly braces. And honestly, if you get hung up early on in these next few weeks-- like, these are the stupid details of learning some new language. You've got to develop the muscle memory. You've got to start remembering what language is what. But don't stress when you get hung up on curly braces and parentheses. Like, those things have never fundamentally mattered, but they do, of course, matter to the computer. But not to us, certainly intellectually, as we introduce the new compelling features of this language. 
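Again the slides are not in the transcript, so the following is only a sketch of the if, else if, else shape being described. Wrapping it in a function named compare is a choice made here so the three branches can be tried out.

```javascript
// if / else if / else, spelled out in full as in C (no elif).
function compare(x, y) {
  if (x < y) {
    return "x is less than y";
  } else if (x > y) {
    return "x is greater than y";
  } else {
    return "x is equal to y";
  }
}

console.log(compare(1, 2)); // x is less than y
console.log(compare(2, 1)); // x is greater than y
console.log(compare(2, 2)); // x is equal to y
```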

How about something like a for loop or while loop, for that matter? In Scratch, if you wanted to do something forever, you can convert that now in JavaScript, similar to C, while true, or while any expression is true. If you want to repeat something three times, almost the same as C, but I'm using let here instead of int. So JavaScript, like Python, is loosely typed. It has types, but you, the programmer, don't need to stress over specifying them. The computer will figure it out. So let is how you would declare a variable. And with that said, there are other ways to declare variables in JavaScript, including constants, but for now, we'll keep things simple and focus only on this keyword here. 
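A sketch of the two loops being described follows. The counters, and the break that keeps the forever loop from actually running forever, are additions made here so the example terminates.

```javascript
// Repeat something three times: like C, but with let instead of int.
let repetitions = 0;
for (let i = 0; i < 3; i++) {
  repetitions++;
}
console.log(repetitions); // 3

// Scratch's forever block becomes while (true).
let iterations = 0;
while (true) {
  iterations++;
  if (iterations === 5) {
    break; // added only so this demonstration stops
  }
}
console.log(iterations); // 5

// One of the other ways to declare a variable: const, for a value
// that will not be reassigned.
const greeting = "hello";
console.log(greeting); // hello
```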

So here's that page again and the tree representation thereof. And this tree is useful only as a mental model for what the computer is doing inside of its memory after having loaded a page. With JavaScript, now, we have the ability to change this tree in real time. When the user clicks something, drags something, types something in, with JavaScript, we will now have a programming language that allows us to mutate this tree in real time, thereby making our pages no longer static or fixed in one state, but dynamic and changing instead. 

So how might we do this? Well, let's consider exactly where we can go about writing some JavaScript code. We can do that by adding a script tag in the head like this. Strictly speaking, it can also go in the body like this, and there are some logical implications of those choices. Or we can even factor it out to a file like scripts.js, just as we did with CSS. So even though the tag is different, it's script, and the attribute is different. It's src, short for source. And it's stupidly written like this. If you are using an external file, you literally close the tag, even though there's nothing inside of it. That's one of these-- a reality of using this particular tag. 

Well, let me go ahead and propose that we write an actual program in JavaScript. Let me go back over to my IDE. Let's create a new file called-- actually, let's go ahead and open up our old one, hello.html, and let's make it interactive. Rather than say hello, body all the time, let's see if we can't make it say a little something to me and to you. So down here in the body of my page, let me go ahead and change this and do something like this, form, close form, input-- how about let's do ID of quote-unquote "name", and type is going to be text. And then let me go ahead and give myself a submit button, type="submit". So super simple. The only difference now is I want a generic text box. I don't want one that's specific to searching. And I want a submit button, a generic one. I don't care what it says. 

Let me go back to my other tab, click hello.html, and we have a form similar to the Google search example, but now, I am going to use a web form in my own HTML file to interact with the user, because after all, this is how humans interact with web pages, typically, is via these forms. I want to enable the user to type in their name, click Submit, and then I want my web page now, not Google, to say Hello, David, Hello, Brian hello to whoever types this in. 

So I'm not going to use a form in quite the same way as before. There's not going to be an action, because I'm not going to send it to Google. I'm not going to send it anywhere else. I'm going to write code that's entirely client side in the browser itself. So what do I want the form to do? When this form is submitted, I want it to call a function called greet. And at the moment, this is a little messy. We're going to clean this up in a moment. But there is this attribute in HTML called onsubmit, the value of which is not HTML per se. It's not CSS. It's now JavaScript code. I want to call the function called greet. This function doesn't exist yet, but it will soon. 

How do I go about doing that? Well, let me go ahead and go up here and add my script tag. And up here, let me go ahead and define a function in JavaScript. It's a little similar to C. It's a little similar to Python. You literally, though, in JavaScript say function. You write the name of your function and the arguments in parentheses. And then in curly braces, you define the function. I'm going to go ahead and just say alert quote-unquote 'hello, body' for now. I'm going to keep it simple and just goofily output hello, body. And because this form, by default, as with all forms, typically does get submitted to a server, I'm going to add one other curiosity. I'm going to say return false also inside of this onsubmit attribute. 

So I realize things are escalating quickly here because there's a lot of pieces in motion, but focus again on the basics. The attribute is onsubmit. What do you want to do when the user submits? I want to execute these two lines of JavaScript code: call the greet function and then return false. Return false in this context means don't submit the form. Call the function, but don't submit the form anywhere. I just want to use it as a client-side tool for interacting with the user. I don't need to send it to the server or to Google or to anyone else. 

Let me go ahead now and specify this. I deliberately gave this text field an ID of name, but we'll use that in just a moment. Let me go over to my other tab now, reload, because I've made changes. I can type in David, but I'm going to be ignored for the moment. But when I click Submit now, notice, it's not the best user interface, but there is an alert on my screen from my specific URL that says hello, body. But I'd like it to say hello comma David. So how can I do that? 

Well, let me go back into my code here and let me go ahead and define a variable called name, but I could call it anything I want. And let me use this special function, document.querySelector(). And this is a function that comes with JavaScript that allows you to select any HTML element. But you need to be able to identify it using some kind of selector. And here's where, for better or for worse, CSS and HTML and JavaScript start borrowing ideas from each other. If I want to get the value of the user's text box, I bet I could give it a unique ID, like quote-unquote "name", and I can select it by using my CSS syntax of #name. And once I've selected that HTML element, so to speak, that tag, I'm going to go inside of it and get its value. 

So this syntax is a little familiar, a little unfamiliar. We've seen the dot notation before in C when it came to structs. We've seen the dot notation in Python when it came to libraries, like the CSV library. We're using it in a similar way. Document is this special global variable that just comes with JavaScript in the browser, and it gives you access to all things related to your document, your web page. querySelector is a function that comes with that document, that Google and Microsoft and Apple wrote for you, whose purpose in life is to take a CSS selector in quotes that identifies one or more nodes-- that is, nodes from the tree. And this tree actually has a name, though I've not used it before. This tree that we keep referring to is technically called a DOM, a Document Object Model, which is just a fancy way of saying that document, this global variable, gives you access to this DOM, this tree. So when you call document.querySelector, that induces your browser for you to search the whole tree, all of those nodes and parents and children and grandchildren and so forth-- all of those nodes, looking for the one with the unique identifier of name and then looking inside of that node, inside that rectangle, if you will, getting the actual value that the human typed in. 
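As a mental model only, the search just described can be sketched with a plain JavaScript object standing in for the DOM. The tree shape and the function findById are invented for this illustration; a real browser's DOM and its querySelector are far more elaborate.

```javascript
// A toy stand-in for the DOM: each node has a tag, a list of children,
// and optionally an id and a value.
const tree = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title", children: [] }] },
    {
      tag: "body",
      children: [
        {
          tag: "form",
          children: [
            { tag: "input", id: "name", value: "David", children: [] },
            { tag: "input", children: [] },
          ],
        },
      ],
    },
  ],
};

// Search the node, then its children, then their children, and so on,
// returning the first node with the given id, or null if there is none.
function findById(node, id) {
  if (node.id === id) {
    return node;
  }
  for (const child of node.children) {
    const found = findById(child, id);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

console.log(findById(tree, "name").value); // David
console.log(findById(tree, "missing"));    // null
```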

So if I want to now output this value, I'm just going to use some old-school concatenation. Let me add to that string, hello name. And I can use double quotes. I can use single quotes. Does not matter so long as you're consistent in the JavaScript world. For whatever reasons, the convention tends to be single quotes if only because you can type it almost twice as fast because you don't have to hold the stupid Shift key to type a double quote-- or to type a single quote as you would a double quote. 
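Here is a sketch of that concatenation. Splitting the string-building into a separate function called buildGreeting is a choice made here so it can be run on its own; the greet function below it follows the steps described above and assumes a browser page containing an input whose id is name.

```javascript
// Old-school concatenation with +, using single quotes by convention.
function buildGreeting(name) {
  return 'hello, ' + name;
}

console.log(buildGreeting('David')); // hello, David

// Only works in a browser, on a page with an element whose id is name:
// document and alert are provided by the browser, not by the language.
function greet() {
  let name = document.querySelector('#name').value;
  alert(buildGreeting(name));
}
```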

All right. Let me go back to this page, reload, because I've made changes. Let me type in my name again. Autocomplete is popping up nicely. Click Submit. And voila, Hello, David. The user interface is still kind of lame, but it's at least now dynamic. And I can type in anyone's name, as you could, Brian, Submit. Hello, Brian. 

And you can change other things about the web page. And some of this is just implementation details, but recall that you can give a button a value, and you can say something like greet me. Now, if I go back here and reload, my button now says greet me. Turns out that your input for text can have a placeholder, like what's your name, and then saving that file and reloading, you see in grayed-out text a prompt on the box that's not actually there, because once you start clicking, it goes away. I can also disable that somewhat annoying autocomplete by saying autocomplete="off". And then I can go back to my page and reload, and now notice, even when I click in there, it doesn't autocomplete, which is maybe good for privacy's sake. 

And notice, too, I can do one other thing. Let me go ahead and add autofocus. Notice that every time I reload the page, I had to click in the box. But now when I reload the page, the cursor is already there, blinking and ready to go, thereby saving your users one step. So again, another vote in favor of accessibility and usability by just putting the user where they probably want to be anyway. 

All right. Any questions thus far on this introduction of JavaScript by way of this function greet inside of our new script tag and using this new onsubmit attribute that calls that function and then short-circuits the form submission by just saying return false, call the function but don't submit the form to the server, like we did with Google? 

All right, well, let me-- at the risk of making it look more complicated, let me make it more complicated, but in a way that will be familiar over the next couple of weeks when you've seen more and more examples and you've used third-party libraries-- you've seen some building blocks. We would not expect you to write this week, tonight, based on these examples, but just to give you the mental model for how JavaScript is typically written. 

So this with, like with CSS, tends to be a little poorly designed. When you start commingling your languages, like, ugh, that makes it hard to collaborate with someone else. It makes it hard to maintain one language independent of another. So it tends to be a good thing to keep these things separate. And that's not always the case. There's actually now a trend, especially in mobile app development, of putting these things back together. But this tends to keep things clean, at least when we're first starting out. 

So let me go ahead and actually go ahead and get rid of my code up here for now. And let me go ahead and get rid of this onsubmit handler. And let me go ahead now and down here-- actually, up here, give myself a script tag. And let me go ahead and do the same thing a little differently. document.querySelector-- this time, let me select the form. And I'm not going to bother giving it a name. There's only one form, so if I just select form, that'll give me my form. And let me use this fancy function, addEventListener. It turns out that the world of web programming is filled with what we would call events. When you click on a page, that's an event. When you drag on a page, that's an event. On a phone, when you touch a page, that's an event. Turns out, using JavaScript, you now, the programmer, can write code that listens for these events and responds to them by doing something. 

So I'm going to go ahead and add an event listener on my form that's listening for the Submit event. What do I want my code to do when the form is submitted? I want to call the greet function. All right, for the greet function to work, I need to reintroduce it. So let me quickly whip up the greet function. Let me go ahead and do an alert of hello and then plus name. I need to get the name node, so let me say let name = document.querySelector. And now let me go ahead-- and I still have the ID there, so let's say quote-unquote hash name dot value. So I think greet now exists. 

But now this line of code, notice, does this. It says, hey, browser, find me the form, then add an event listener listening for the Submit event, and when that event happens, call this function. This is somewhat similar to our lambda example from two weeks ago, when we passed a function into the sorted function so as to tell it to sort by value instead of by key. Remember that syntax, when we defined a function just to help us sort things. Notice that I do not want to call greet on line 13. I want to tell the browser to call greet when it is ready. Therefore, I pass the function in by name. 
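To make the difference between passing a function by name and calling it concrete, here is a minimal sketch. It assumes a bare EventTarget as a stand-in for the page's form (in a real page that would be document.querySelector('form')), and it counts calls instead of showing an alert so that it can run outside a browser. EventTarget and Event are built into modern browsers and recent versions of Node.js.

```javascript
// Stand-in for the form (an assumption for this sketch); in a page this
// would be document.querySelector('form').
const form = new EventTarget();

let calls = 0;

function greet() {
    // The lecture's version calls alert(); counting keeps this runnable anywhere.
    calls++;
}

// Pass greet by name, with no parentheses, so the target can call it later.
form.addEventListener('submit', greet);
const callsBeforeSubmit = calls; // registering the listener did not call greet

// Simulate a submission; only now is greet called.
form.dispatchEvent(new Event('submit'));
const callsAfterSubmit = calls;
```

Writing greet() with parentheses in the addEventListener line would instead call the function immediately and hand its return value, undefined, to addEventListener.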

Let me go ahead now and save and reload. OK, nothing visually has changed, but when I type in David and greet me-- [GASPS] huh. Nothing happened. Let me say David. Greet. Nothing happened. When in doubt, if something's acting up, let's go back to those developer tools, Developer, Developer Tools. And let's not look at network, but look at console. And here is where your new friend is going to show you all of the mistakes that you've made in JavaScript code. In Python and C, you see the errors in your terminal window. But because JavaScript is being run on the client side, in the user's browser, you can't just look at your terminal window. There won't be any errors there. You need to look in your own browser if you're the one testing things. And sure enough, notice here, cannot read property addEventListener of null. 

So this is a subtle one, but I did it deliberately, because it's so common-- and I promise I did it deliberately. Let me go back here and point out this. Browsers are pretty naive. As fancy and as powerful as they seem to be getting, they still take us literally, just like Python and C did, top to bottom, left to right. And if on line 13 I am saying query for the form tag, but the form tag doesn't exist until line 21, it's not going to work, and therefore the error message I'm seeing makes sense. You cannot read property addEventListener of null. Where is null coming from? Form is currently null as of line 13. It does not exist until line 21. 
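The error can be reproduced without a page at all. The sketch below assumes only that document.querySelector returns null when nothing matches, which is what happens when the script runs before the form tag has been read. The exact wording of the message varies by browser, so the sketch records just the kind of error.

```javascript
// What document.querySelector('form') evaluates to when no form exists yet.
const missingForm = null;

let errorName = '';
try {
    // null has no properties, so looking up addEventListener on it fails.
    missingForm.addEventListener('submit', function() {});
} catch (error) {
    errorName = error.name;
}
```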

So how do we fix this? Well, one way, the quick and dirty way, would be, well, let's just move this down below, inside of the body now but below the form tag. And I think this will work. Let me go ahead and reload now. The error goes away, but I haven't done anything yet. Let me click in the box and say David, greet me. OK, it's back to working. But much like in Python and C, this is kind of a slippery slope. Like, the solution to our problems can't just be, well, move it down. Move it down. There's got to be a better way. Similar in spirit to prototypes, but not quite in the same way. 

The way JavaScript handles this problem is as follows. If I undo that and go back to the top where I have my script now, I can actually do this instead. I can do something like this. I can do this document.addEventListener-- and this is a weird one-- but DOMContentLoaded, call this function listen. And let me go here and give myself a second function, function listen. And inside of this function will be that one line of code. So I've got two functions now, one of which handles the greeting, no changes there, a new one called listen whose purpose in life is just to add that event listener. That's all. But notice here on line 18, now, I'm adding an event listener to the document itself saying when the DOM content is loaded-- that is to say, when this whole tree has been loaded and therefore the form tag has been loaded and everything else, go ahead and call the listen function. The listen function is just going to add another listener to the form tag listening for submissions. 

And so now, if I go ahead and save this, reload, I'll keep my developer tools open, and type in David and greet me, still works, but I haven't had to resort to moving my code down to the bottom and solving the problem that way. I'm just telling the browser, don't do the following until the DOM's content has loaded. 
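Put together, the two named functions might look like the sketch below. It assumes the page has a single form and an input whose id is name, as in the lecture. The call to event.preventDefault() is an addition not shown in the lecture: it is the usual way, inside a listener, to keep the browser from also submitting the form. The typeof check only makes the snippet a harmless no-op outside a browser.

```javascript
function greet(event) {
    let name = document.querySelector('#name').value;
    alert('hello, ' + name);
    // Not shown in the lecture: stop the browser from also submitting the form.
    event.preventDefault();
}

function listen() {
    // By the time this runs, the whole page has been read, so the form exists.
    document.querySelector('form').addEventListener('submit', greet);
}

// Only register the listener when there is a document to listen on.
if (typeof document !== 'undefined') {
    document.addEventListener('DOMContentLoaded', listen);
}
```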

All right, questions on any of this? And this is absolutely the more-- the most sophisticated, I think, of the syntax and the logic we've done today. But again, we're just planting the seeds for understanding this in the weeks to come. 

No? Well, let me go ahead and blow your minds with one other feature of JavaScript that actually has similarities with Python. Recall that any time we define a function in one place and then use it in one place, that's generally been kind of lame. Like, why are you bothering to create a function, adding lines of code to write a function that you're only going to call once? Last week with Python recall that-- two weeks ago, with Python, recall we defined a function f, and then we said, no, no, no, this is stupid, let's just use a lambda function, an anonymous function, basically, because it's only being used in one place. 

So this is going to be the ugliest we see, but here we go. If I know that I want to call a function called listen when the DOM's content is loaded, I don't need to give that function a name. I can actually just put the function right there. I'm going to literally copy and paste this over here. Let me remove the excess white space here. Let me go ahead and now point out-- and I'm going to do this a little stylistically differently, just to be consistent with what other people do. Notice now I've done this. I've literally moved that function as the second argument to addEventListener, and I don't need its name at this point. I'm going to go ahead and just do this. The equivalent in JavaScript of a lambda function is to literally just say function with no name and still have open parentheses, with or without a space in between them. And so what this is now saying, a little more elegantly even though more cryptically, is: on the document object, your global variable, add an event listener listening for DOMContentLoaded. Well, what do you want to do when the DOM's content has loaded? Call this anonymous function, otherwise known as a lambda function. 

But notice what we're going to do. We're going to query for the form and we're going to add an event listener on Submit by calling the greet function. Well, we don't need to do that. Let's go ahead and remove that. Let's go ahead and delete the greet function name and get rid of it. Let's make one more anonymous function. Let me paste this in here. I've got to clean up my formatting real quick, so let me go ahead and remove some whitespace here, remove the function name, put my curly brace over there, get rid of this one here, indent that, indent this, close this. 

Whew, scary looking, to be sure, or at least cryptic looking, or maybe delightful, depending on whether you like this kind of thing. But again, it's just basic building blocks, right? In Python, you can define functions that don't have names. That's great because if you want to pass one function to another function, you can literally just write the code using the supported syntax, which in JavaScript does not use the word lambda but instead uses the word function with no name but still parentheses. And then, making sure it's still a well-formed function with an open curly brace here and a closed curly brace here, you can write out your code. And the only reason I have this other parenthesis over here is because I'm already inside of a function called addEventListener. 
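The nesting of one anonymous function inside another can be sketched as follows. The two EventTarget objects are stand-ins (an assumption of this sketch) for document and for the form, and a log array stands in for the alert, so the example can run outside a browser. It also shows the ordering: a submission that arrives before DOMContentLoaded has nothing listening for it.

```javascript
const page = new EventTarget();     // stand-in for document
const pageForm = new EventTarget(); // stand-in for the form
const log = [];

// Both functions are written right where they are passed in, with no names.
page.addEventListener('DOMContentLoaded', function() {
    pageForm.addEventListener('submit', function() {
        log.push('hello');
    });
});

pageForm.dispatchEvent(new Event('submit'));       // ignored: no listener added yet
page.dispatchEvent(new Event('DOMContentLoaded')); // outer function runs, adds the listener
pageForm.dispatchEvent(new Event('submit'));       // inner function runs
```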

So again, if uncomfortable with that-- not a problem, certainly at this stage. But again, we're just now stacking these different ideas on top, on top, on top of one another. All right, well, let me show you now a premade example that shows you exactly what you can do now with these events. Here's a nonexhaustive list of events that you can listen for in a browser, not just things like submitting, but also clicking and dragging, key pressing, moving your mouse over, moving your mouse down, the button up and down, touching and moving, and other such events. There's this whole list of events that you can do such that you can actually do things like this. Let me go back to my IDE, and let me open up a premade example, this one called hello5.html. And in hello5.html, I've got this example already that-- it's just doing a few things. It's listening for DOMContentLoaded, but it's then listening for key up. 

And what's it going to do with key up? Well, let's see. Let me go over here into my index. I'm going to close the debugging tools. Let me reload my directory index here. And that gives me this other file in source eight called hello5.html. Now, notice here, I'm going to go ahead and type in David-- oh. Notice, the web page itself is immediately interacting with me, and as soon as I delete it, it says, hello, whoever you are. So with JavaScript, you have the ability even to change the contents of the web page, not just by throwing an ugly alert. You can use code like you see here, which we won't get into the details of, but allows you to change the page itself. 

And notice here-- this is kind of ugly-looking syntax. You don't even have to concatenate. Just like Python has f strings, in JavaScript, you can use backticks plus a dollar sign and stupid curly braces and do the same thing. And I'm showing some bias here. Stupid syntax, I think, but same exact idea as it was in Python. And again, these are the kinds of things that will trip you up early on, inevitably, but again, as you get more comfortable with the language, all of the ideas will outshine the particular syntax. 
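As a small sketch of that syntax, the function below builds the greeting both ways. The function names are hypothetical; the fallback text is the one the page displayed when the box was empty.

```javascript
// Builds the greeting with a template literal: backticks, with ${...} marking
// the spot where a value is substituted, much like {...} in a Python f-string.
function greetingFor(name) {
    if (name) {
        return `hello, ${name}`;
    }
    return 'hello, whoever you are';
}

// The same result with ordinary quotes and concatenation.
function greetingWithPlus(name) {
    return 'hello, ' + name;
}
```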

All right. Let's look at a few other premade examples just to give you a sense of the capabilities of JavaScript. Here's a program called background.html. And this web page you'll see is going to have three buttons. It turns out you can implement buttons on a web page in different ways. You can literally use the button tag. Down here, I have a whole bunch of code using querySelector again, but more powerfully, notice what you can also do. If you go ahead in JavaScript and select your body by saying document.querySelector body, you can actually then access a special variable inside of the page's body-- or any tag for that matter-- called style. And then you can change, with JavaScript, the style of a CSS property using code. And unfortunately, left hand was not talking to right hand. In CSS, it would be background dash color, as we saw. Unfortunately, in JavaScript, a proper programming language, this would be background minus color, like literally arithmetic. So the world decided that any CSS properties with hyphens would instead become something like background capital color to distinguish the two. But it's the same exact idea. 

If I go now into this file here, background.html-- well, now I have a very simple page with three buttons, R, G, and B. But notice when I click on them, I have defined in advance, I claim, some event listeners for click. And every time I hear the click, I change the CSS of the page. And if this weren't cool already, let me open up View, Developer, Developer Tools here. Notice the third and final tab today is Elements. No matter how ugly your HTML is, the developer tools' Elements tab will always show it to you in a pretty-printed, colorful fashion. But it will also show you the actual CSS as it's changing in real time. 

So let me reload the page. By default, notice the whole page has a white background. And notice down here, indeed, the body has no style attributes on it. But if I zoom out for a moment and now click R, watch the HTML on the bottom of the page. Chrome just dynamically added background-color red. If I click green, background-color green is now there, and blue-- so using these developer tools, you can interact with your own website and see what's changing in the DOM. We are changing in real time the attributes of this tree, thereby making the page all the more dynamic. 
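The naming rule and the style change can be sketched like this. Both helper names are hypothetical, and the conversion only covers simple hyphenated names such as background-color. In a page, the element passed in would be document.querySelector('body'); any object with a style property works for trying it out.

```javascript
// Converts a hyphenated CSS property name to the name used on element.style.
function toCamelCase(property) {
    return property.replace(/-([a-z])/g, function(match, letter) {
        return letter.toUpperCase();
    });
}

// Sets the background color the way a click listener might.
function setBackground(element, color) {
    element.style.backgroundColor = color;
}
```

A button's click listener could then call setBackground(document.querySelector('body'), 'red'), which is the change the Elements tab shows appearing on the body tag.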

This is powerful, too. Let me go to Harvard.edu, for instance, open up developer tools, and notice here, we can see all of Harvard.edu's HTML in the Elements tab. Notice it's a lot of HTML here, but there's all of these triangles that expand it-- rather, that collapse it, just to make it more succinct. If you want to look at specific things, let's go ahead and say something like this. Let's see. What could be fun to change here? How about this? I'm going to go ahead and right-click or Control-click on About Harvard, because this I'm finding not what I want. It's going to automatically open in the Elements tab the actual HTML that Harvard used to create About Harvard. It's got some other tags here. Span is another thing. It's like a mini paragraph, all on one line. Class we've seen before. I don't know what lg-nav-txt-- it's probably large navigational text that Harvard invented as a class name. 

But notice what I can do here. How about we change this to Yale and then hit Enter? And voila, hacking Harvard.edu. So of course understanding what's going on here is important. I'm only changing my local copy of Harvard.edu that's been downloaded onto my page. If you go to Harvard.edu now, those changes are not persistent, so it's not hacking per se. But this is really, too, how you can learn how to design web pages. If I go to Yale.edu, I can similarly look at all of Yale's HTML and change things here. Let me right-click on About Yale. Click on Inspect. Let me go ahead and change this to Harvard, Enter. And I can be really malicious, too, but only on my own machine. Notice over here at right, in the developer tools, you can see all of the CSS styles that are currently being applied to that particular tag. And notice this. If I change color down here at top-- bottom right, let me change it to ff0000 and hit Enter. Voila. I've changed all of Yale's tags along that row to be red. 

So this isn't, again, about hacking some website, because it has no effect on the actual server, but it is so much easier and faster to fine-tune your own page's aesthetics by just using your browser, tinker with things, try new properties, and so forth, and then when you're ready to save it, then go into your text editor and type out or copy paste those particular attributes. 

What else can we do? Well, let me show you a few final examples here, too. Let me go ahead and go into size.html. Notice here-- here's some sample Latin text in an initial font size, but I'm using a little drop-down menu that you see commonly on web forms. Let me make the text larger. Let me make it extra, extra large. This is just using JavaScript, listening for change events to this drop down menu, and correspondingly changing the style's size of that particular paragraph of text. 

Let me do this other thing. Back in my day when I learned HTML, HTML 1.0-- we're up to five now-- there was literally an HTML tag called blink and that would literally do this. My first home page probably greeted visitors with, like, "Welcome to my web page" in this hideous, hideous blinking aesthetic. But with JavaScript, we can do the same thing. And let me go open and-- oh, let me go ahead and open Inspect here. And let me move this up here and zoom in. Notice what's happening. I wrote some JavaScript code in this file called blink.html that every half a second or second is changing the style of my body to be visible or hidden, visible or hidden. 

And if you want a little teaser of that, if I actually look at that HTML in blink.html, that's because in JavaScript there's this cool function called window.setInterval that lets you call a function every number of milliseconds. So if I were to change this to be even faster-- let's do it every 100 milliseconds and save and reload, you'll see now it's flashing even faster. So you can do things again and again by registering these kinds of intervals in code. 
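A minimal sketch of the blinking idea, assuming hypothetical helper names: one function decides the next visibility value, and another asks setInterval to apply it repeatedly. setInterval returns an identifier that can be handed to clearInterval to stop the repetition.

```javascript
// Flips between the two values of the CSS visibility property.
function nextVisibility(current) {
    return current === 'hidden' ? 'visible' : 'hidden';
}

// Toggles the element's visibility every so many milliseconds.
function startBlinking(element, milliseconds) {
    return setInterval(function() {
        element.style.visibility = nextVisibility(element.style.visibility);
    }, milliseconds);
}

// In a page, once the body exists:
//     let timer = startBlinking(document.querySelector('body'), 500);
// and later, to stop:
//     clearInterval(timer);
```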

Even cooler, let me go ahead and grab another URL here. And just because of my browser's settings, I'm going to go ahead and open this one in Safari instead of Chrome right now. This is called geolocation.html. In this file, we've written some code in advance that's actually going to try to figure out where in the world you are. And notice, for privacy's sake, we can't just presume to figure out where you all are. We're instead going to prompt you, like this-- the browser is going to do that for you. I'm going to go ahead and allow this query. And voila, this file, geolocation.html, just prints out your GPS coordinates. Not particularly interesting, but if I go to, like, Google Maps, I can literally search for those GPS coordinates. And if you're curious as to where I am right now at this moment-- that's not me, but if I go into satellite mode and zoom in, we are indeed roughly in that part of the American Repertory Theater on Brattle Street in Cambridge, Massachusetts, USA. So pretty creepy that using JavaScript, you can even figure out where your users are. 

Now, creepy at first glance, but if you've used Uber Eats or Grubhub or really any website, like, for the weather that asks you for your location, how is it doing that? Well, the programmer of those websites has written some code, as we did in geolocation.html, that has a line of code like this, navigator.geolocation.getCurrentPosition, and then it's a function built into your browser that, with the user's permission, will tell you the user's latitude and longitude, using, again, just some built-in functionality to the browser. 
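A sketch of how that line might be used follows. The function name describePosition is hypothetical, and printing with console.log is a choice made for this sketch. The browser calls the function it is given only after the user grants permission, and the check on navigator keeps the snippet from doing anything where geolocation is unavailable.

```javascript
// Formats the latitude and longitude from the position object the browser supplies.
function describePosition(position) {
    return position.coords.latitude + ', ' + position.coords.longitude;
}

// Ask for the position only where the browser offers geolocation.
if (typeof navigator !== 'undefined' && navigator.geolocation) {
    navigator.geolocation.getCurrentPosition(function(position) {
        console.log(describePosition(position));
    });
}
```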

And then one final example here in JavaScript. It turns out that you can implement autocomplete in an even fancier way. In advance of today, we converted problem set 5, spellchecker, the 140,000-plus words, the text file called large, into a corresponding JavaScript file called large.js, and we wrote autocomplete here. So let me go ahead and type in A, and I will instantaneously see an unordered list of all of the words that start with A. Let me type in A-P. Now it's changing to all the words that start with A-P. A-P-P-L-E-- this is how autocomplete works. Using JavaScript, what am I listening for? 

Well, if I go back to this laundry list of events from a moment ago, I bet I'm listening for one of these key press or key up events, so listening for the user hitting a key, and as soon as they type a key, I'm probably using that .value syntax to get the value of whatever the human typed in, finding the words that start with it, and then dynamically adding or removing li elements from the page. So again, we've not seen hands-on how to do this, but the building blocks are there. You can change the web page's style. You can add HTML to the page. And you can listen for these kinds of events. 
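The core of such an autocomplete can be sketched in a few lines. The short WORDS list is a hypothetical stand-in for the 140,000-plus words in large.js, and the function names are likewise made up for illustration. A key up listener would call these with the input's .value and place the result inside the page's ul element.

```javascript
// Hypothetical stand-in for the much longer word list.
const WORDS = ['a', 'apple', 'apply', 'apt', 'banana'];

// Returns the words that start with what has been typed so far.
function matches(prefix, words) {
    if (!prefix) {
        return []; // an empty box shows no suggestions
    }
    return words.filter(function(word) {
        return word.startsWith(prefix);
    });
}

// Turns a list of words into the HTML for a series of li elements.
function toListItems(words) {
    return words.map(function(word) {
        return `<li>${word}</li>`;
    }).join('');
}
```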

Now, it turns out that it is not only JavaScript that can make use of URLs in this way. And we thought we'd do one final demo here, this one calling back into play Python, whereby I'm going to do a little something with our jack-o'-lantern here. Let me bring him over closer to me. And you'll see that he's got a light bulb tucked inside of him. And let's go ahead and face him forward. It turns out that the light bulb inside of this jack-o'-lantern here is actually one of these fancier modern LED light bulbs that has an internet connection. It's an internet of things device, an IoT device. And it happens to be talking to this little wireless device here that I have on the lectern so that the light bulb is literally communicating wirelessly to that device on the lectern. And that device, in turn, is plugged into Harvard's network. And my laptop, of course, is plugged into Harvard's network, too, and so we have the ability now, it would seem, to write code on my Mac or your PC that somehow talks to this light bulb by using our local internet, our local network, if you will, to connect those two devices. 

And it turns out that devices like this very often have APIs, Application Programming Interfaces, that for simplicity, are actually based on URLs. They're simple URLs so that if I send a certain HTTP request to this light bulb, it will turn itself off or on or do something else. And if I send another request, it will do that thing as well. Now, that's not how all APIs work, but indeed, just because we're transitioning now to web programming, doesn't mean we're leaving Python behind. In fact, next week we'll bring Python back all the more, and SQL, combine all five of these technologies, HTML, CSS, JavaScript, Python, and SQL, and tie them all together into a full-fledged web application. 

But to do this, let me go ahead and create a program here called light.py in Python. And I'm doing it on my own Mac so that I'm not in the cloud. I'm actually on Harvard's local network here. Let me import a couple of libraries, import os and import requests. We've not seen this before, but there's a library that's available with Python called requests, which allows you to make, with Python, HTTP requests, just like a browser. Let me go ahead and declare a variable called username that's going to be the result of just getting what's called an environment variable called username. So for privacy's sake I didn't want to type my own username and password for Harvard's network into this program, so there exist what are called environment variables on Macs and PCs and Linux computers, in which you can store values secretly elsewhere. And using Python's os.getenv function, you can load those into the computer's memory somewhat privately. 

Let me go ahead and get the IP address of the light bulb by doing os.getenv quote-unquote "IP". And then lastly, let me go ahead and construct a URL. By having read the documentation, thankfully a colleague constructed URLs that look like this, http://, the IP address of the light bulb, slash API slash my username, personally, in case different people want to control the light bulb, slash light slash one slash state. So this is a weird-looking URL, but it's essentially going to be http:// whatever the numeric IP address is of this light bulb slash API slash my username slash light slash one slash state. So I could literally copy and paste that into a browser if I knew what those values were offhand, but I'm going to do this programmatically instead. 

I'm going to go ahead and write requests-- and I could say get if I want to send a get request, which is not what I want, because I don't want to get the value of the light bulb. I don't want to post the value of the light bulb. It turns out there's a third HTTP verb that we've not seen before, but it's often used for APIs, called put. Technically it's all caps, P-U-T, but in code, it's written in lowercase, .put. So this is going to send-- it's going to send a message from my Mac to this light bulb. 

So let me go ahead and put to this URL the following Python dictionary, passed in as an argument called json. I'm going to go ahead and map a key of quote-unquote "on" to the Python Boolean value of False. Now, what is JSON? JSON stands for JavaScript Object Notation, which is just a conventional way of sending textual messages across the internet. So we'll see that before long. But for now, this is just sending to the light bulb a dictionary which has a key of on and a value of false. 
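The lecture writes this program in Python with the requests library. Purely to show the text involved, here is the same URL and message body sketched in JavaScript, this section's other language. The helper name stateUrl is hypothetical, the path follows the URL as it was read aloud (so its exact spelling is an assumption), and the IP address and username in the usage note are placeholders.

```javascript
// Hypothetical helper: builds the URL described above from the bulb's
// IP address and a username. The path spelling is assumed from the spoken URL.
function stateUrl(ip, username) {
    return `http://${ip}/api/${username}/light/1/state`;
}

// The JSON text for a dictionary with a key of "on" and a value of false.
// Python's False is written in lowercase, as false, in JSON.
const offMessage = JSON.stringify({ on: false });
```

For example, stateUrl('192.0.2.1', 'example') gives http://192.0.2.1/api/example/light/1/state, and offMessage is the short piece of text that travels in the body of the PUT request.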

And if I didn't do anything wrong, let me go ahead and close this file, run Python of light.py, cross my fingers, as always-- voila. Now, this light bulb doesn't have to be one foot from me. It can be literally elsewhere on the internet so long as I have an internet connection and I have access to that IP address over the internet, which I didn't want to do today, lest we have hundreds of people turning the light bulb on and off for me. But let me go ahead and change the code a little bit now. Let me go ahead and turn it back on, and change on, of course, to true, as you might guess. So almost the same code. Let me go ahead now and run Python of light.py again. On. 

And let's make it a little fancier. Let's go ahead and get a little more logical here, doing things a little more interestingly than we have thus far. And let's see if we can't bring that blink to life as well. Let me go down here, and let's do something infinitely this time. How about while true-- so forever, this demo will go on-- whoops-- while true, go ahead and put to that URL not just on and off. Let's go ahead and change the brightness to a value of 254, so really bright, and let's go ahead and change on to true, as before. But then let's go ahead and sleep for a moment, so time.sleep for one second. And then let's go ahead and send another request after a second to that same URL, sending in a Python dictionary where on is now false. And then let's go ahead and sleep for one second. 

You might have noticed I need another library. I need to import time, which sounds amazing, but it's just the library called time. I've saved the file, and I'm forever going to send one request turning it on, another request turning it off. And now, the climactic finish, Python of light.py. All right, woo! [CHUCKLES] 

OK. That's it for CS50. We will see you next time. 

[MUSIC PLAYING] 

Phyllis, number two. Number two from your left. [LAUGHS] 

[LAUGHTER] 

OK, this one we can fix. That was amazing, Josh. Sophie? [LAUGHS] 

SPEAKER: I really tried to hide it. 

DAVID J. MALAN: [INAUDIBLE]-- oh! 

[LAUGHTER] 

SPEAKER: So close. 

DAVID J. MALAN: OK. We'll try again. Here we go.