>> All right. This is CS50 Halloween week and the end of Week 8. So we do actually have some pretty scary stories today, but first a couple of announcements. So, one, any of you that showed up at office hours last night, my apologies. We actually overcompensated by moving too much of the staff from last night to tonight and tomorrow. So my apologies, quite honestly, if you showed up at my encouraging and we weren't actually there in sufficient numbers to help you. So we will be there tonight and tomorrow. And we will adjust accordingly next week. So some of you tuned into this last night. If you pulled up any of the course's lecture videos, we, of course, have a lot of students in the extension school taking this course remotely. We, of course, have a lot of your classmates taking this course remotely. And so we have these videos online of lectures and sections and walkthroughs. And what we thought we'd try is an experiment, really a social, pedagogical experiment just to try to lower the bar online to you guys talking to one another, ideally about material and the lecture and questions. And so what some of you might have seen is that we've integrated CS50 chat into each of the course's lectures at the moment. This is an interface, after you've logged into your cloud account, in which you can see who else is watching that video or perhaps another one. And then in theory you can discuss the material. You can click any of the sentences in the transcripts once they appear to jump to that part of the lecture. And, so, I think in a decent-size class there might actually be some pedagogical value here in giving you guys the easier ability to chat amongst yourselves. And for now we also put this on the course's home page so you can actually log in and see what kind of chatter is going on. And we'll see how it goes. We'll tweak it as needed. The first several posts that ended up in the chat room last night -- I actually started setting it up last night. 
Didn't have time to add the authentication, because I had to run to dinner. So I ran to dinner and just let people log in with whatever nickname they wanted to. And so the first message I think posted to the chat room was, hi. So that was kind of nice and fitting. Seems to work. The next one was, hello. The third one that I saw when I came back from dinner was this one here. And the fourth chat ever posted to the CS50 chat interface was, why I actually do not know. So, in any case, that is CS50 chat. And you will see this interface on any of the course's videos if you now tune in to the MP4 versions on the course's website. So all the rage over the last couple of days, if you've been talking with some of your technical friends or you read any of the blogs, is this utility called Firesheep. And it's actually rather apropos that just this week we started talking about web programming and security issues, and even as far back as our crypto discussion did we talk about one of the technologies involved in this, or not involved, something called SSL. So a researcher by the name of Eric Butler and a buddy of his named Ian Gallagher just a couple of days ago presented at a hacker conference of sorts a very compelling reason for the world to start being mindful of some of the inherent insecurities in today's internet and more specifically today's web. What's curious is that their presentation really did not point out any new fundamental weaknesses in today's World Wide Web, but it was a wake-up call to people not only in the audience, but also more broadly, because the mass media has picked this up, as have some of you. So Firesheep is a program that automates the process of sitting inside of a room like this or an internet cafe or Starbucks or wherever there's a lot of wireless traffic. And with the click of a button you can then sniff all of the wireless traffic in the vicinity of you. 
And if the people near you are using that wireless connection to connect to popular websites like Facebook or Twitter or Gmail or the like, with a single click, you can log into that person's Facebook or Twitter or other such account. And this is not because some new fatal flaw was discovered in these technologies, but because two fellows highlighted for the whole world's attention just how insecure and how poorly designed some of today's websites or really some of our underlying technologies are. So if you caught wind of this, do realize that you can go to the source. You might have heard some correct things, some incorrect things from friends and the like, but this is actually the blog post where the author of this presentation actually discusses this utility, Firesheep, both before the day he released it and then a day after, after 100,000-plus people actually downloaded it. What happens is if you download this utility, it's a free plug-in for Firefox, for Macs and Windows. You see a Firefox window like this. And on the left-hand side you see a little button that simply says, start capturing. And if you click that start capturing button, what you start to see is anyone nearby who's been logging into some of those popular websites. So in this hacker demonstration he gave a demonstration with himself, named Eric, his buddy and then someone else, two other people who were actually in the audience. And then what you can do is click on, for instance, Ian's name up at top left and bam. Now, Eric is logged into Ian's account, not knowing his password, not having looked over his shoulder, just by sniffing the wireless traffic. And, frankly, just to hammer home this point, since one of the goals of this course is really to educate you guys, not necessarily to be computer scientists, but to be technically savvy politicians or business people or really just ordinary people on the street. 
I decided it would be the most compelling example to tell you that James, Trisha, James, Chris, Emily, Zar [assumed spelling], Mike, Brian, RJ, Jonathan, Waldo, Emily, Clougan [assumed spelling], Rose, Nick, Lou, Mike, Jonathan, Brecka [assumed spelling], Chris, Anna, Brian, Charlie, RJ, Joe, James, Lee, Ryan, CC, all of your accounts have been compromised as of the past five minutes. [applause] Now, frankly, what are you doing on Facebook and all these websites during CS50's lecture? So, I mean, perhaps that's the takeaway here. But just to show how easy this is, I won't go poking around your own personal accounts, but I do recognize one of the names, Rose Cowe [assumed spelling], there at top left, one of our teaching fellows. So, Rose, here is your Facebook account. It is that easy. So I actually got her blessing before doing this. We sanitized her news feed here. But it is literally that easy. I downloaded a program from the web. I read up on how it works, because I was technically curious. I installed this thing, double clicked and bam. And literally the dozen or two names I just rattled off, if someone else in this audience -- and please do not do this -- is running this tool at this moment in time, literally all of your accounts have been compromised, because now anyone in this room, myself included, can start poking around your messages, your pokes, your e-mail, your tweets and the like. And so this was really a wake-up call in this research conference. And don't worry, I won't leave you completely hanging today and just say, good luck. I will actually propose some technological solutions and use this really as a stepping stone this week and next to discuss what these flaws are and what we, as computer scientists, can do to ameliorate them, what you as users can do to avoid them and really what we more generally as a society of increasingly technical people can actually do to protect ourselves. But I grabbed this link off the internet yesterday, too. 
Some folks decided that it would be compelling to say something like this. This is an additional program someone wrote called Idiocy that's a warning shot to people browsing the internet insecurely. And what this utility does is, if you install it and it detects that there are people in your vicinity who are logged into Twitter, it logs itself in as that person to Twitter, posts a tweet saying, hey, idiot, your account has been compromised; here's how to fix it. And then it logs off. So this is what it means by warning shot across the bow. And it's a program called Idiocy. So I bring this up because actually a lot of you e-mailed me right after lecture on Monday when this tool was first announced asking about this. Because everyone in this room and really anyone on this campus, anyone at Starbucks, at an airport, really is vulnerable to these kinds of attacks. And I would be remiss probably if I didn't point out that, though Harvard might not have all the possible technological defenses against this, they do have all of the administrative board possible defenses against something like this. So please do be mindful of such policies and do as I say, not as I do. But how can you actually defend against these kinds of threats? Well, really, what are the threats in the first place? Well, I'm going to go ahead and run a little command on my Mac here. You can do this on almost any Linux system. It's a command called tcpdump, TCP being one of the protocols we talked briefly about Monday that has to do with getting data from point A to point B. And when I hit enter here and then log into my own computer, I've essentially put my wireless ethernet adapter into what's called promiscuous mode, which actually means that it's listening not only to traffic destined for my laptop, but for any traffic that's here in the room. So if we actually look through this, hopefully we won't see anything sketchy. 
Actually, one of you seems to be downloading Firesheep, it appears, or googling for it. But what we see going across my terminal window here is just a raw dump of all of the wireless internet traffic going across the -- going through the air right here at this moment in time. I'm going to go ahead and run a different program that makes this a little more visually interesting for us. This is what's generally called in the security community a packet sniffing utility, which has actually really useful technical value. This, too, is a free tool. This has been around for 10, 15 years, something like that. I'm going to click on my wireless interface. And what you see now scrolling by is every one of the individual data packets flowing across the wire, flowing through the air here wirelessly in Sanders or from unsuspecting freshmen in Annenberg across the way. You can see on the left-hand column the IP addresses, the unique addresses that identify the origin or destination of each of these packets. And if we really spend some time looking, we'd notice some familiar acronyms, TCP, HTTP. So if I actually got nosy and started digging into each of these rows on this table, I could see every website these freshmen are visiting or people in this room, we could see what emails have been sent. If you're not using a secure website like Gmail, we could see all of the instant messages certainly going across the wire. You could see everything. And so really the only thing that's been protecting you up until this past Monday was ignorance, both on campus here and the general ignorance of the population as to all of these fundamental weaknesses. So here, too, I'm quite sure Harvard's administrative board has rules against running packet sniffers for anything other than CS50 lecture purposes. So do be mindful of taking visual things away from this and not practical skills. So what are the other possible solutions? 
Well, one of the first things that comes to mind and has been proposed by some of your classmates in some of the e-mails is, well, why is Harvard's wireless not secure? I mean, my God, like my little apartment has more secure wireless than Harvard University and a lot of companies. Well, what's the reason perhaps that Harvard does not have encrypted or secure Wi-Fi? Yeah? What's that? >> [ Inaudible comment ] >> Compatibility. Okay. So that's one issue, because there are different encryption technologies. One is called WEP, W-E-P. One is called WPA. Another is called WPA2. These are all different versions. And people with slightly older laptops might not actually support the latest and greatest ciphers, the latest and greatest security technologies. And so the implication is Harvard might have to dumb everything down to that lowest common denominator, but the reason that there exists version one of WPA and version two of WPA is because WEP, for instance, sucks. It's a completely vulnerable protocol such that if you sniff enough encrypted traffic, so traffic that's been encrypted to your home router or wherever, if it's using WEP, after a few seconds or a few minutes, if you have gathered enough wireless packets, you can reverse engineer what the secret key or password is for that wireless router and, bam, you're connected. Case in point, the first time I saw this demonstrated was several years ago when I was helping, in a random context, a friend's brother move. And we had finished moving all of his boxes into his apartment. And it was this new apartment in Cambridge. And he or my friend really wanted to check their e-mail. And he didn't have Comcast or a cable modem or anything like that yet. And so my buddy pulled up his computer on which he downloaded yet another free program. And within I think two minutes, maybe five minutes of sniffing all of the neighbors' wireless traffic, all of which was encrypted, bam, he found someone whose key was not sufficiently strong. 
It was something silly like one, two, three, four, five. And he figured it out just by running this program that, again, any kid off the internet can download these days. So thankfully there exist more secure protocols like WPA2. There are even flaws in these, but there's a gotcha like that on campus where, well, then you have to be compatible with everyone. We have to somehow tell all 20,000, 30,000 affiliates what the secret password is to access the wireless access point. It might involve a bit more CPU cycles to actually encrypt all this data, which means Harvard might need more hardware, more money. So there are a lot of gotchas there. It just raises the bar. And so it's not actually uncommon to use unencrypted Wi-Fi here. And even then this only pushes the problem away a little bit. The fundamental problem right now, which we'll actually discuss today, is that when you log into a website like Facebook or Twitter or any of these others, you generally are using a secure connection. HTTPS colon slash slash. And the S stands for secure, for SSL, which means that connection is encrypted. But then once you're finished logging in, you might notice that the URL usually changes back to just HTTP colon slash slash. So insecure. But that's not a big deal, because you've already provided your user name and password or maybe your credit card number if you bought something. And so it's not a big deal. It's not as big a deal if someone sees your profile or sees who your friends are. It's more worrisome if someone sees, let's argue, your password. But unfortunately HTTP, hypertext transfer protocol, this thing that's getting data from points A to B, browser and server, is what's called a stateless protocol. And you can see this visually if you visit a website, you usually see some spinning icon as the web page is downloading and then it stops eventually. And once it stops, that means the connection between browser and server has been severed. 
They're not talking anymore. And yet the next time you click a link on your Facebook page to a friend or to a message or whatever, surely you don't have to log in again and again to visit every page. Somehow the website Facebook has remembered who you are or really that you have logged in. And it does so by planting what's called a cookie on your computer, which is a little file or it's a big random number that gets planted somewhere on your computer. And unbeknownst to the typical user, that big random number, that cookie is sent back and forth, back and forth to Facebook, much like a little red stamp at a carnival that gets you in and out of the carnival if you actually leave for an amount of time. Your browser, unbeknownst to you, is always presenting this random number, this hand stamp to Facebook saying you are already authenticated; you are already authenticated; don't ask me again and again and again for my user name and password. And the problem is this hand stamp, this cookie is generally not being transmitted securely, because it's being transmitted by HTTP. So this program Firesheep, if you see where the story is going, is not eavesdropping on all your encrypted traffic. I actually don't know what any of your guys' passwords are to Facebook or to Twitter. What I have stolen using this program and what anyone can steal using this program must be what? That cookie. That hand stamp. It's as though Firesheep is kind of making a copy of that hand stamp and then presenting it to Facebook or the like as its own. And that's sufficient to get you through that website's security layer. Now, what about encrypting the traffic? So this actually does solve the problem at least of the freshmen in Annenberg and the person sitting next to you. If you actually have an encrypted connection, your laptop is going to connect to Harvard's wireless access points, which, as you know, actually are not in Sanders since we have very poor wireless here. 
They probably are in Annenberg. So between your laptop and those antennas you probably would have a secure encrypted connection if Harvard turned on a protocol like this. It would be a bit of a nuisance for all of us to actually reconfigure our laptops, but it would at least encrypt us between us and those antennas on the wall. But the problem is that encryption ends at that wireless access point. And there's an ethernet cable leading from that wireless access point to one of Harvard's routers, a big computer that moves data around. There's another big wire leading from Harvard's router to the next router, to the next router, to the next router on the internet. So you raise the bar to people you know actually sniffing your data if you're at least encrypted wirelessly, but not across the whole rest of the internet. If there are 20 hops, 20 routers between you and point B, you really only encrypted the first of those. There are 19 other opportunities where anyone on the internet can actually still sniff your data. So what's the solution? Well, what about encrypting your data not between you and the access point, but between you and point B; right? Go the whole nine yards all the way across the country to Facebook's server by using the S in HTTPS in the URL, which, again, denotes secure. This means that, using a technology that we mentioned briefly weeks ago, RSA, and cryptography more generally, your browser can establish a secure connection to facebook.com itself. And the upside of this is that after your hand is stamped, after a cookie is planted on your computer, so long as you stay on the SSL version of Facebook, the one whose URL starts with HTTPS, at least then this cookie, this hand stamp will continue to be transmitted securely again and again and again. But there's a problem here. So case in point, let's go ahead and do this. So I'm going to go to -- I got to log out of Rose's account. I'm going to clear my recent history. 
That clears my cookies and all of my temporary files that have been downloaded. So now I'm going to go to facebook.com. All right. So I'm actually on HTTP, which is, in fact, the default. You know, it turns out Facebook does support SSL. So I can insert the S there and hit enter. So it feels like I might have solved this problem. Notice, too, the world has adopted these visual cues like green here. If you visit a company that's paid some extra money to get what's called a digital certificate, you can signify to the world in green, hey, this is secure. But watch what happens. So I'm going to go ahead and log in here, not as Rose, but with a little fake email account. I'm going to log in with my password. And now notice what -- where is the log in button? I'm going to click log in and watch what happens to the URL. I can't see it all at once. Here we go. Log in. So I have to name my new computer. This is actually a security feature that I have turned on just so I can detect who is logging in to my Facebook account after lecture. But now notice what did the URL go back to? Dammit; right? Even though I used HTTPS for the whole transaction, the very last step where I logged into Facebook, I got a response from the server. It says, okay. You're logged in. Now go to home.php, Facebook's home page. Guess what URL they redirected me to? HTTP. So even though I've been keeping my little hand stamp secret for all previous steps, I went so far as to manually type the S into the URL, then Facebook took it upon itself to say, hey there, here I am, I'm logging in now. And that cookie is now compromised. And those of you who are playing with fire could now log into my little happy account here and see exactly what I see, which is zero friends with this particular account. It's actually fascinating, not having signed up for a Facebook account in like five or six years, to see what the rest of the world is now going through, but not the point of today. 
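The hand-stamp idea from this demonstration can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Facebook actually implements sessions: the in-memory `sessions` table and the names `log_in`, `who_is`, and `set_cookie_header` are hypothetical. It shows that a server of this kind identifies a visitor purely by the random number presented, and that a cookie can carry a `Secure` attribute (a standard cookie feature not covered in the lecture itself) telling the browser to send it only over HTTPS.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side table: session token -> user name.
sessions = {}

def log_in(user):
    """Issue a big random number (the 'hand stamp') after a successful login."""
    token = secrets.token_hex(16)
    sessions[token] = user
    return token

def who_is(token):
    """The server only sees the token; it cannot tell an owner from a thief."""
    return sessions.get(token)

def set_cookie_header(token, secure):
    """Build a Set-Cookie value; Secure asks the browser to use HTTPS only."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["path"] = "/"
    if secure:
        cookie["session"]["secure"] = True
    return cookie["session"].OutputString()

stamp = log_in("rose")
sniffed_copy = stamp  # what an eavesdropper sees on an http:// request
print(who_is(sniffed_copy))                   # same user as the real owner
print(set_cookie_header(stamp, secure=True))  # includes the Secure attribute
```

Anyone who presents a copy of the token gets the same answer from `who_is`, which is why a single request over plain HTTP is enough to give the session away.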
So SSL doesn't seem to work perfectly, because you, the user, have no control over where the web server is going to send you. Now, I wouldn't be surprised, frankly, in the next several days, weeks, months, that Facebook, because they're just so damn big and because this is going to be such a headache for them having so many compromised accounts, so many more spamming accounts from compromised accounts, I wouldn't be surprised if they just move all of their servers to HTTPS and some of this will be taken care of for you. But what can you do in the meantime? Because literally what you saw me do is as easy as running this utility. What are some of the options you actually have? Well, there are also some other plug-ins. So you have to start using a certain browser to use these. But if you use Firefox or are willing to download the free browser called Firefox, there's another plug-in called Force-TLS. That's actually been around for a long time. And this plug-in essentially monitors in a good way all of your web browsing. And any time that it sees that your browser is about to request HTTP colon slash slash for a known site like Facebook or Twitter, one of these sites where they know they support HTTPS, they just don't seem to be using it 100 percent of the time, this plug-in will intercept that request, change it to HTTPS for you and then let the request go along its way. So these slides will be on the lectures page today, but there's an alternative. Pretty much the same kind of plug-in. It's just a competitor plug-in, but they're both free. HTTPS Everywhere, from the Electronic Frontier Foundation, which is an organization that's all about privacy. And they, too, have put out a plug-in for Firefox with which you can mitigate this. Unfortunately it's not the solution. It's not a fundamental solution. All you do is raise the bar. 
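What such a plug-in does to each request can be sketched roughly as follows. This is only an illustration of the idea described above, not the actual code or rule format of either plug-in: the `HTTPS_CAPABLE` list and the function name `upgrade_to_https` are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of hosts assumed to support HTTPS.
HTTPS_CAPABLE = {"facebook.com", "www.facebook.com", "twitter.com"}

def upgrade_to_https(url):
    """Rewrite http:// to https:// for known hosts; leave everything else alone."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in HTTPS_CAPABLE:
        return urlunsplit(parts._replace(scheme="https"))
    return url

print(upgrade_to_https("http://www.facebook.com/home.php"))
print(upgrade_to_https("http://example.org/"))
```

A site missing from the list passes through unchanged, which is the gap the lecture returns to later: traffic can still leak when a plug-in does not know about a website.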
And this is actually a recurring theme in security and certainly in security research, which is between us, the good guys, or most of us, the good guys, and the bad guys out there. All you can really do is raise the bar higher and higher and higher. And it's pretty much going to be a perpetual cat and mouse game, though, between good guys and bad guys, whereby all we can do is push the threat a little farther out, make it a little harder for someone else to compromise our accounts. Case in point, even if you use HTTPS, it doesn't matter. For years there have existed what are generally called man in the middle attacks, whereby if the idea of security behind SSL is that I'm supposed to establish a secure connection between me, point A, and say facebook.com, point B, well, I can use HTTPS. But what if my laptop is tricked into talking to not facebook.com, but a middleman, a man in the middle, some other computer, some other router, some person in this audience who has enough technical savvy to either write or these days just download a program that listens for requests to facebook.com and responds quicker than Facebook and says, I am facebook.com, my little PC here, my little Mac. And so my laptop, because it's requested facebook.com, it then goes essentially to your IP address, really to your ethernet address. And you then reply to me with an HTTPS response. So you know what, I am actually using HTTPS. I have a secure connection, but to who? Right? The bad guy then, you, on your computer can proceed to finish the other half of that connection and establish a secure connection to Facebook on my behalf. And, frankly, it doesn't take all that much savvy these days to write or, again, to download a program whose purpose in life is to sit between points A and B. And if I request home.php, Facebook's home page, well, I look at that and I say, oh, this victim wants home.php. Let me request that of facebook.com. Let me get the response, then forward the response to them. 
So you can have someone sitting in the middle watching every message you send, every poke that you send, every profile that you visit. If you have a man in the middle, you might very well still have a secure connection, but to the wrong person out there. And this one is harder to mitigate. There are certain defenses that require certain software or hardware on one's computer, but at least now there's not a Firefox plug-in that makes it point and click to actually wage this attack. But none of this stuff -- and this is the scary thing. None of this stuff is new. This has existed since I was an undergraduate on this campus, since as long as the web has really been in vogue and SSL has been in use. It's really just a wake-up call. So what's really a better solution? Frankly, as undergraduates, you at least have the luxury of having access to one of Harvard's VPN servers. So you might know that a VPN is a virtual private network. This is just generally a technology with which you on a laptop or desktop can connect securely using encryption to, for instance, one of Harvard's servers called their VPN server. And then any programs you use get routed not wirelessly in the clear, but encrypted to FAS's VPN server, and from there it goes out to the rest of the internet. So if you actually want to use this, Harvard actually made this easier a year or so ago. If you visit VPN.FAS.Harvard.edu, frankly in the short term if you are worried about some nosy friends frankly trying to mess with you and your accounts, this is probably of all the solutions we just rattled off one of the most robust, because it encrypts everything about your traffic, not just special sites that those plug-ins know about. So if you go here and log in with your user name, I think you have to type at FAS or at FAS.Harvard.edu and then your password. Click log in. A little Java program will run that will then create a new encrypted connection between you and Harvard's FAS servers. 
This should be much less vulnerable to the man in the middle attack, because, well, here too it is possible, but the bar will be significantly raised. And you don't have to worry about traffic leaking out because some plug-in doesn't know about that website. And the other thing, too, is most of you, if you visit Facebook, you know, if you don't have it currently open in a tab, what you probably do is something like this -- if I'm at, let's just say, google.com and I decide I want to go to Facebook, most of us probably do not type HTTPS://www.; right? Most of us type facebook.com or something like that. The problem is if I have not typed HTTPS:// and hit enter, my browser, trying to be helpful, assumes I want a website. 90 something percent of the web is probably not using SSL. Made that up, but it's probably true. And so it's going to assume I want http://facebook.com. So even if you have been super diligent about always typing HTTPS and using these tools and whatnot, you just compromised yourself, because the next time you hit enter and that first time go to http://, you have just shown your hand to the whole world. So if you have some person in the room nearby who's been running a tool like that or sniffing your traffic, bam, it doesn't matter if you are safe 99 percent of the time, it just takes one divulgence of your cookie or this hand stamp for your account to actually be compromised. So with that said -- oh, and as an aside, Facebook's password policy, in fact, is not all that secure. I was able to create -- I tried creating an account just to be funny using -- was it President Skroob's password of 12345. It turns out they don't allow five number passwords, but they do allow six number passwords even if it's completely just a string of like 111111. All right. So please don't take control of my Facebook account, because I'd love to use this as an example in the future with the same accounts. All right. So any questions on this scary Halloween story? Yes. Okay. 
You two gentlemen with the laptops. >> Yes. So if you log out every time, does that secure it at all or does it get sent every time you try to activate it? >> Good question. If you logged out, are you thereafter safe? It depends. Sometimes when you log out, the website does not actually scrub the data on the server side. They don't actually remove their copy of that cookie, that unique number. And so even though your browser might forget who you are, that doesn't mean the cookie doesn't still exist. And if someone has been sniffing your traffic, their Firebug -- Firefox plug-in might still work. So, in short, logging out is good. You narrow the window of opportunity by not staying logged in and just letting the website continually contact the server, but it just raises the bar slightly. It's a good habit, but it's not a solution. Yeah? [ Inaudible audience question ] >> It's a good question. What do these guys propose to do? So the development of this program really was meant with the best intentions. And this always sparks religious debates over whether or not you, as a researcher or academic or just good citizen, should actually tell the world when you find vulnerabilities in products or in software programs, because it's a double-edged sword. On the one hand you educate the good guys so they can shore up their defenses and protect against these threats. But the downside of telling the world about a vulnerability is that now all the bad guys know. I mean, I'm generally of the opinion that you might as well share this information, because odds are the bad guys probably know. And you know what they're not doing is telling you. Right. So really you're not defending against -- you're not protecting yourself all that much more by keeping it quiet. So generally getting people mindful of these issues is likely to spur them to action. 
So one of the things that these fellows recommend is really that certainly all websites that exchange any kind of credentials, be it a user name and password or subsequent cookies, run them over HTTPS. In fact, Gmail was known this past summer or so to finally move all Gmail servers by default to SSL. One of the common concerns with SSL is that it takes more CPU cycles. Therefore, it takes more servers. Therefore, it takes more money. But we actually linked on the lectures page today to a write-up by a Google researcher who explains that it only costs Google like two percent more in CPU cycles to enable SSL for all of their e-mail servers. And so that's pretty compelling. And they had to do some tweaking and whatnot, but there are certainly some lessons there that anyone can use. So, frankly, I'd be surprised if Facebook and some of these sites don't just turn on SSL soon. Frankly, there exist special products whose sole purpose in life is to sit on a network, handle all the SSL traffic and then pass the unencrypted requests off to the web servers. So there are plenty of solutions. Yeah? [ Inaudible audience comment ] >> So most likely correct. So Firesheep works if your ethernet card can be put into promiscuous mode, which is usually the case. Certainly you can do it on most modern Macs. You can do it on a lot of PCs depending on the hardware, but it might very well depend on what you have. But it is taking advantage of that, yeah. [ Inaudible audience comment ] >> So, yes, if you have physical access to the machine, even if it's using SSL, you are not -- it's not being encrypted at the very beginning. Your data is going to be unencrypted in some form in your laptop. So that's just generally true. Any time you have physical access to a machine, you are pretty much screwed when it comes to security, because there are so many ways in which the information can be divulged. 
So long story short, yes, this does not solve the problem, but it definitely gives us time to kind of re-engineer a better solution to this. And the problem, too, is that there are known flaws in, like, the wireless protocols the world uses, the SSL protocols that the world uses. It's a very imperfect world. So really this raises a bigger picture question. You know, what really can you do? It might not be a huge deal if your Facebook account is compromised. It might be a little bit embarrassing. It's kind of more problematic if your bank accounts are compromised or super sensitive information is compromised. So what can you really do? Well, just as an aside, I think one of the most compelling technologies that some sites have started to implement is what's generally known as two-factor authentication. This, too, has been around for years. It's just that no one really offers it and really none of us demand it. Two-factor authentication is this technology that E*TRADE and some other brokers use where you get this little electronic device that has a little LED screen with four or five or six digits on it and usually a button. And those numbers change every several seconds or every several minutes. But the device has been designed in such a way that it's synchronized with your E*TRADE account. So what happens now is when you go to E*TRADE.com and log in with your user name and password, that's not enough. Once you have done that, they say, what's the secret number that's currently on your device? And you have to provide that same number. Otherwise, they assume this is just a bad guy who's stolen your user name and password or who's guessing your user name and password, but doesn't have physical possession of the device. So two-factor authentication, frankly, is hugely compelling in that it requires that to authenticate yourself, to prove you are who you are, you need to know something like your password and you have to have something like your key fob.
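To make the key-fob idea concrete, here is a minimal Python sketch of how a code that changes over time can be derived from a shared secret plus the clock, following the publicly documented HOTP/TOTP scheme (RFC 4226 and RFC 6238). Whether E*TRADE's device works this way is an assumption; the lecture doesn't say. The secret, the 30-second step, and the six-digit length are all illustrative choices.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """Derive a short numeric code from a shared secret and a counter (RFC 4226)."""
    message = struct.pack(">Q", counter)              # counter as 8 big-endian bytes
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                        # low 4 bits pick a 4-byte window
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """Time-based variant: the counter is the number of `step`-second windows since 1970."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


def verify(secret, submitted, now=None):
    """Server side: recompute the code from its own copy of the secret and compare."""
    return hmac.compare_digest(totp(secret, now), submitted)


# Hypothetical shared secret, stored both in the fob and on the server.
SECRET = b"12345678901234567890"
print(totp(SECRET))  # the digits the fob would display right now
```

The point of the design is that the server and the device never talk to each other at log-in time. Both hold the same secret and roughly the same clock, so a stolen password alone is not enough to produce the current code.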
But there, too, someone could certainly, you know, beat the password out of me and just steal the key fob. And then they would have both of those factors, but it's a lot harder. And there, too, they need physical access to me, the human, to actually get that device, which raises the bar. I'm not now exposed to billions of humans on the internet. I'm only exposed to the people in this room at the moment with my little secure key fob. But there are other approaches, too. You don't need special hardware. Bank of America has a neat trick whereby if you have a phone that supports text messages, when you try to log into their website, if you have turned this on, you have to provide your user name and your password. And then they send you a text message that is this random number. The idea being that unless someone is somehow intercepting my wireless phone traffic, only I am going to receive that text message. So now I can type that temporary numeric code into the website. But there, too, you know, our wireless networks certainly have vulnerabilities in them. But, again, you raise the bar. And one of the best defenses, frankly, against these security issues is that most people in the world don't care about what you're doing with your Facebook account or don't care what little old you is doing on the internet with your instant messages. It becomes a problem, like it has the last couple of days, on a campus where we all know a lot of people who would get some sort of amusement out of, you know, embarrassing you by logging into your Facebook account or any one of these other accounts. So when you actually have credible threats, even if it's just playful, then you need to worry a bit more. And even on the internet it's not compelling to get my little Gmail account or my little Facebook account, but it is compelling to get mine and 900-some-thousand other users' accounts.
Because once you have access to a million Facebook accounts, you can send a lot of spam, at least for a few minutes or hours before you're detected and shut down. But that's millions more spams than you might have been able to send out otherwise. So it's when your adversary cares about who you are or doesn't care, which unfortunately is the complete set of possibilities, that you are really at risk for something. So with that said, how does this work? How do we defend against it? And how does it relate to P set seven and beyond? So you have a couple of handouts today. These are really meant to be cheat sheets for some of the languages and technologies we're using. They're not exhaustive. And what you will find on the internet is that there are a lot of differences when it comes to syntax and capitalization. So try to see through the messiness and turn to me or the teaching fellows or help.cs50.net if there's ever any ambiguity. But we pulled out some decent PDFs that some nice people made that offer some common tags for HTML, some common properties for CSS, some common functions and commands for something called MySQL, which is a database server we'll start to talk about today, and some common functions for PHP. This is not at all exhaustive. This is meant to be the sort of thing you just keep on your desk as you are coding, and it might be a quick answer or you might have to turn to Google. But for those of you who have never coded HTML before, this is actually a decent little tutorial to at least read or skim through once. So all of these are linked on the course's website, too. And then, finally, I want to draw your attention to this. So the handout from Monday was about the final project. And recall that there were these links on the front page that recommend that you look at some of these resources just to entice you with this.
Notice that, one, any of the campus related apps we talked about in the course, whether it's events or courses or tweets or news or maps, all of them have what we call APIs, application programming interfaces, whereby if you don't like the way we've implemented the course shopping tool, if you don't like the way we're presenting the shuttle data, well, that's okay, because we can give you all of the data in raw format so that you can then re-present it to the world in a website, in a C program, in an iPhone app, in an Android app, in a BlackBerry app, any kind of application you want for your final project. So realize that these are all available. One of the most popular APIs to use for final projects, frankly, is the Harvard food API whereby we screen scrape Harvard Dining Services' website, which is, frankly, a little hard to navigate when you just want to know what's for lunch. And we then give you all of the data in raw format for breakfast, lunch or dinner on any given day. You can search based on dates. We've also put on the course's website and on this sheet from Monday a list of what we call fun APIs. There are a lot of third party websites and companies out there that provide really neat functionality and/or data these days. And this is frankly where programming is really getting fun and exciting, at least for me these days, whereby it's getting so much easier to make cool things relatively easily. Case in point, the Shuttleboy text messaging service that I demoed a few weeks ago. I don't really know much about how cell phones work. I certainly don't have access to, like, Verizon's equipment and all of that. But using a free or ad supported service, I was initially able a couple of years ago to implement that program whereby I made a website, a PHP based website that listens for web requests for an origin and a destination. And then I just spit out the response after I look up in the database for Shuttleboy when the next shuttles are for point A and point B.
And then this third party company, thanks to their API, routes my website's response to your cell phone whether you're on Sprint or T-Mobile or Verizon. So it's really quite cool what you can do. Google Calendar has an API whereby you can get all of the data from a Google calendar. You can put data into a Google calendar. There's Google Checkout and PayPal if you want to create a website that actually charges people money. Frankly, I always thought this was a missed opportunity. The Quincy Grille, bless their heart, which has been popular for years, apparently still takes orders by Gchat, which is actually pretty clever, I think, to actually have a little window open there. But, just think, you can allow people to pay for food with a credit card. You can have them order it. You can maintain logs. You can know exactly how much food is coming in and going out. I think this is, frankly, a really neat final project idea. Foho [assumed spelling], as well, apparently does it in person. You have to visit the desk. And I would like a hot dog. We have computers that can tell the people that you want a hot dog these days. And they can charge you for it, too. Google Finance, Yahoo Finance we'll look at this week. Google Charts, if you want to create interesting visuals. What else here? A few more. Anything related to maps, mobile phones, news, I mean, profanity. We have accounts with a provider if you want to write an application like I Saw You at Harvard and you want to filter out the crazy talk, at least with high probability. You can use profanity type filters and automate that process and not have to spend human hours actually regulating what your classmates are posting to sites. So do follow those links. There's some really juicy stuff there. So let's actually start to give ourselves the framework in which to use these kinds of tools. So HTML is a mark-up language. It is a language that's not really about programming and logic. It's really about structure and aesthetics and display.
We have the open bracket HTML tag, which tells the browser, here comes a web page. We have the head tag and the title tag that say here comes the title of the page in the so-called header. And we have the body tag which says, here comes the body of this page. And we saw briefly on Monday things like the A tag, for anchor, for hyperlinks. We saw the image tag, IMG, for images. And you can see on this cheat sheet that we really just scratched the surface on Monday. But anything you've seen on a website today, you can implement yourself. And I can't emphasize enough the utility, again, of actually learning from existing examples. If you want to know, for instance, how Facebook put this table of first name, last name over there and how they put that picture over there, well, you can certainly go to the view menu and go to page source and you can see all of the raw HTML that implements this website. But generally it's pretty messy. So it looks like very bad style. Nothing is really indented. There are no comments, but that's to be expected. If every byte costs Facebook money, you don't necessarily want to send all of that additional formatting to the browser. But realize that there are some really wonderful tools. And we've linked these on the course's website and we'll mention them in problem set seven. The one I think I showed on Monday is called Firebug whereby you can visit any website like facebook.com, and then if you want to see how they implemented this word sign-up, how they put this text way over there and not all the way on the left, well, you can right click or control click it, choose inspect element and bam, this plug-in called Firebug, which is kind of a debugging tool, shows you exactly where in the HTML they have laid out that particular text or image. It's so nice for actually bootstrapping yourself into understanding how the web currently works. So that was HTML. We talked briefly about CSS.
And it was used when we wanted to change the font of some text or the font size of some text, or when I needed to align things in the center as opposed to being flush on the left. And so we use this language called CSS, cascading style sheets. And thus far we've seen it really in one context. We had a style attribute. You recall that an attribute modifies the behavior of a tag. And we said style equals quote, unquote, and then we had a list of properties: color, colon, white, semicolon, and then font family, colon, sans serif, semicolon and the like. And for more such properties, again, you'll have such resources as the little cheat sheets from today and Google is your friend. And we have innumerable resources on the course's website. But everything we did on Monday was static; right? We wrote it. We saved it. We pulled it up in a browser. And it never changed on me. If I kept reloading that hamster, I saw the same hamster and the same text again and again and again. And this is very much web 1.0. So these are stupid buzzwords, web 1.0, web 2.0. Web 2.0 is just a moniker that the world gave to this idea of websites taking more and more user input. So you have websites like Facebook and YouTube and Flickr being part of web 2.0 where it's not just Mark Zuckerberg and one guy at Flickr and one person at this other company making content for the world to see. It's those people making a website with which you people can make your own websites and profiles and the like which then constitute the content of that page. It's a brilliant business model. Right. You build the infrastructure, the framework. And then you let everyone else do all the hard and the interesting work. And that's very much the phase we are currently in. So we don't, though, have the ability yet, like Mark did when initially making Facebook, to actually code things dynamically. I could make a site that looks like Facebook.
I could hard code a profile for myself with happy cat in the top left corner and my name David Malan in the top middle of the page, but it's never going to change if you reload it and reload it. The only way my profile is going to change is if I SSH to the cloud or if I use SFTP and edit a text file and save it. And this, again, is not what you're all doing on Facebook these days. You're using the web to edit the web, because all of the HTML being spit out by Facebook is coming not from dot HTML files, but from dot PHP files. On other websites it's not necessarily PHP. It might be Python or Ruby or C++ or C# or VB. Any number of languages can generate HTML. So whereas Monday was about creating HTML by hand, today will be about creating HTML dynamically, by programming. And we'll come full circle now with our programming tricks, but first a five-minute Halloween song break. [music] We're not done yet. All right. So we are back. Let's actually go ahead and sniff some of my own Firefox traffic to actually see in what way these accounts are being compromised and then how we can actually use the same mechanism of cookies for actually good and compelling reasons. So I'm going to go ahead and pull up Firefox again here. I've already pulled up Firebug, which is a debugging utility. Same spirit as GDB, but much more web centric. It's got different types of features. Some of them are just aesthetic like letting you see things, but I'm going to go specifically to the net tab, which is going to show me all of my network traffic. Realize that what I showed you before with the thing called Wireshark, where we saw all of the data flowing across the screen in rows, that's a lower level packet sniffer that shows me that this packet is related to HTTP, this one is related to instant messaging, but it doesn't reassemble everything nicely in a nice graphical user interface.
So, again, per my hand gestures on Monday about layering on the internet, Ethernet and IP and TCP, these programs do the same thing. We're going to see a nice user friendly view of what my browser is doing with Facebook's server, but it's at a higher level than we looked at earlier today with the raw packet sniffer. So for debugging purposes and actually for development purposes for P set seven and beyond, it's always a good practice to get in the habit of clearing your cache often when designing websites so you don't get confused by having changed something while the browser doesn't realize it because you forgot to reload. In short, clearing your browser's cache by whatever menu option is appropriate (ask a friend or Google for your browser) is a good thing. It will avoid confusion for yourself. So I now have a fresh cache. No pages have been downloaded already even though I see the remnants of this past one. I'm going to go ahead and click on facebook.com. And now down here on the left, it looks like when you pull up facebook.com for the first time, 15 HTTP requests are induced between my browser and the server. Now, I only visited one web page. What are all those other requests? Well, Facebook's got images. It's got a language called JavaScript. It's got this other language called cascading style sheets. Some pages have videos and sounds. And so there are lots of files embedded in a website. Case in point, on Monday when we embedded the hamster into my web page, I had index -- or rather I had hello.html, but then inside hello.html I mentioned a file, the URL of that hamster file. And so the way a browser works is it actually uses recursion. When it downloads an HTML file or starting today a PHP file, it analyzes that HTML source code top to bottom, left to right.
And if it sees the names of any other files like a GIF or a JPEG or an HTML file or whatever, it then recursively requests those files and grabs them and then integrates them into the web page. And so this is what's happened here in this table. It looks like there are 15 files that compose Facebook's home page at the moment. The one I really care about now is the first one, because that's what started it all. If I click this link, again, I see a lot of juicy information, some of it arcane and some of it useful for us. My request headers are the ones we looked at Monday. Notice when I pull up facebook.com, I'm telling the server what I typed into the address bar, which can be useful if multiple websites are on the same server. There's this mention of gzip and deflate. This just means that my browser knows how to decompress information. So if Facebook wants to save some bytes and send me a compressed response, I can decompress that for efficiency's sake. And then I apparently go and download all these randomly named files that compose the rest of the site's content. So now let's go ahead and do this. And I'm going to go ahead and log in as malan@cs50.net. And my password for the world to see -- I'll change it later -- is 123456. 12345 was too short, but adding the 6 worked. So I'm going to go ahead and go here. And, actually, let me go to -- let me go ahead and click log-in. All right. Good. No one has changed my site yet. This, as an aside, is a security feature that Facebook offers. If you go to like account settings, security, something like that, you can have the site email you any time someone logs into your account from a new computer. This is useful, frankly, if you think a friend knows your password or has been figuring out your password. It does not protect against the attack we looked at today, because I did not know Rose's password. This prompt only protects against someone figuring out your password. So I'm just going to say, this is my MacBook Pro.
And this is just my reminder that it's okay if Facebook sees traffic from this computer again. But the point here is this -- let me go ahead and actually clear this screen so we see it again. malan@cs50.net. 123456. Enter. All right. So what just happened? So I ultimately got my home page or my profile. But notice this, I first clicked on the first page that was visited. And notice the URL, HTTPS. Now, I can still see my own traffic. So in answer to your comment, David, about it still being unencrypted somewhere, I, in my own browser, can certainly see what data my browser is sending. So it's not being hidden from me. And that's what we're seeing, what my browser sent to Facebook. I want to look at request headers here for a moment. And under request headers, it looks like Facebook sends all traffic for log-ins to a server called login.facebook.com. It looks like down here, I have now this thing. Let me go here to post. You might be generally familiar with the idea of post versus get. To post information to a website means to submit information to a website. This, again, is just a tool that's making things nice and pretty for us to see. But what Facebook receives is essentially a hash table containing a bunch of keys on the left-hand side, one of which is called email, one of which is called locale, one of which is called pass. And on the right-hand side are those values. And so what we get here is apparently some evidence that my e-mail address is indeed malan@cs50.net. My password at the moment is 123456. And that's literally the information that was sent to Facebook. This, again, is just Firebug's presentation of this information. Let me actually look at the raw source that was sent. When you submit information to a website, what really happens here is -- can I see it here? Request headers. Let me scroll up here. Post. What really happens is this: It's going to look a little arcane. But what was sent to Facebook's website was this super long string.
But this string is essentially key-value pairs, a key like this called charset_test, an equals sign and then some value. This value looks a little weird, but that's fine. But let's fast-forward to the bottom. It looks like then there's an ampersand. So what's sent to a website is an ampersand in between these key-value pairs and then a key, email, equals malan, some weird characters, cs50.net. This %40 is just a web oriented way of encoding an at sign. So it uses simpler characters just in case weird characters like at signs break things. And there's my password. So in short what is actually sent to facebook.com is what I have highlighted here: a bunch of stuff that we're not going to care about today, and email equals malan@cs50.net and pass equals 123456. The web server on Facebook's end parses that request. It's just a big line of text. It's a big string. It passes that to what's called a PHP program, a program written in PHP that analyzes that string, looks for all those ampersands and splits that big string on all the ampersands and then looks for equals signs and splits each piece on those equals signs so that it can then in RAM, in memory, create a little table, a little hash table like this. And, nicely enough, what PHP is going to do for us is it's just going to hand us all of those key-value pairs in a variable that, in fact, is a hash table. So you'll see that what you're doing at a very low level this week in C, implementing a hash table, is going to be handed to you in ever so nice form in PHP by way of one variable that you yourselves don't have to implement. But now let's look at the server's response. So I'm going to scroll back up to the so-called headers. And notice that Facebook's server has actually responded with a whole bunch of junk, most of which is not too interesting for us today. But the key word of the day certainly is this, Set-Cookie.
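To make the splitting on ampersands and equals signs concrete, here is a small Python sketch (Python rather than PHP, purely for illustration) that turns a posted string into a hash table of keys and values, decoding things like %40 along the way. The body string is a hypothetical, shortened stand-in for what was on screen, and the locale value in particular is an assumption, not something shown in lecture.

```python
from urllib.parse import unquote_plus


def parse_form_body(body):
    """Split a form-encoded string into a dict of decoded keys and values."""
    fields = {}
    for pair in body.split("&"):              # pairs are separated by ampersands
        if not pair:
            continue                          # tolerate stray or trailing ampersands
        key, _, value = pair.partition("=")   # key and value are separated by "="
        fields[unquote_plus(key)] = unquote_plus(value)  # %40 becomes "@", "+" a space
    return fields


# Hypothetical, shortened version of the string the browser posted at log-in.
body = "locale=en_US&email=malan%40cs50.net&pass=123456"
fields = parse_form_body(body)
print(fields["email"])
print(fields["pass"])
```

In practice a web framework or language runtime does this parsing for you, which is exactly the convenience being described: the program just receives the finished table.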
Once I have logged into Facebook, Facebook's server, behind the scenes, replies to my browser with another one of these HTTP headers. We've seen a few of these. We've seen Location, colon. We've seen the word GET. We've seen this stuff on Monday and now again today. This row here that says Set-Cookie is Facebook's way of telling me, kind of nicely but forcibly, remember this information. And it's planting this information essentially in my RAM or in a file on my hard drive in some default Firefox or Safari location. And my browser's purpose in life henceforth, according to HTTP version 1.1, is to grab that information from RAM or grab that information from disk and resend it to Facebook any time in the future that I request a web page. So you can think of this huge junk, this huge mess of text here, as the conceptual hand stamp. It's a lot of key-value pairs. And every time I visit my profile or your profile or send a message or click a link on Facebook, I'm supposed to re-present this data. So it's a little inefficient arguably. It's a whole lot of text to be sending back and forth across the internet. The web is not really known for its space efficiency, but Facebook appears to have consciously decided to set a lot of cookies. But I believe -- don't quote me on this. I believe the one that matters for Fire -- what's it called again -- Firesheep, I believe this is the scary cookie, this one down here, XS equals ZB06617 -- so this is the big random number I alluded to earlier. Even though there are some letters in there, that's fine. We know about hexadecimal notation. There are other base systems. So it's still a number that we're drawing using English letters. It's a big random number. The key, the cookie, is called XS. And I think, just based on having read this fellow's work, that's the one that Firesheep is looking for, stealing and then re-presenting to the website as its own.
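The hand-stamp idea can be sketched in a few lines of Python using the standard library's cookie parser. This is only an illustration of the mechanism: the cookie values, the second cookie, and the domain below are made up, and a real browser also checks domain, path, and expiry before re-sending anything.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header lines as a server might send them after log-in.
# The value "abc123" and the "locale" cookie are placeholders for illustration.
response_headers = [
    "xs=abc123; path=/; domain=.example.com",
    "locale=en_US; path=/",
]

# The browser's job: remember each cookie it was told to set.
jar = SimpleCookie()
for header in response_headers:
    jar.load(header)


def cookie_header(jar):
    """Build the Cookie header the browser re-presents on every later request."""
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


print("Cookie: " + cookie_header(jar))
```

Note that nothing in the re-presented header proves who is sending it. Anyone who overhears that line on an unencrypted connection can send the same line, which is the whole premise of the attack.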
And there is nothing, again, in that tool Firesheep that I couldn't have done in today's lecture manually by just copying and pasting that kind of string or sniffing the traffic with that low level packet sniffer, but it would have taken a lot more work. And what's scary is that there's this term on the internet, script kiddie. This refers to someone who might be a kid, might be an adult. It's generally someone with too much free time and relatively little technical skill, but who is good at downloading things and double clicking things. And so when you start to put tools like this into the hands of the masses, frankly, it's probably overall a big net positive, because it wakes up people who have no clue how this stuff works to the fact that they really need to get their act together, and consumers need to start demanding this kind of protection. So that's the cookie, it seems, that's going back and forth across the wire that's making us all vulnerable. So how about home.php, the page that many of you probably spend way too much time on? How do you actually implement a file like this home.php? Well, it turns out if you have access to a web server like the cloud, it's actually -- there's all the internet traffic in the room still going across the wire. It turns out that it's relatively easy to get started. So I'm going to go ahead and SSH to cloud.cs50.net. I'll make the font bigger in just a moment. All right. I'm going to go into, if you recall, my public_html directory. And in case some of you guys have been very zealously trying to get started, recall I typed this command the other day, chmod a+r for hello.c -- hello.html. This command gave the whole world, all, readability. Plus r is read. So there's another command, though, if you need to make a directory, a folder, accessible on the web.
So just FYI, if you're doing this on your own before P set eight -- P set seven goes out, you need to do chmod a+x, little x, on two things, your public_html directory, and you also need to do this on your home directory denoted by a single tilde. There. All right. So let me go into my public_html directory, but all that will be spelled out in P set seven. Into eight, into source. And I'm going to go ahead -- actually, let me do something even quicker. It turns out I can do this. So we made this file the other day. hello.html. And it looked like this: cloud.cs50.net, tilde, hello.html. Oh, and as an aside, don't let this be confusing. Unfortunately the left hand was not talking to the right hand when the internet was made. And so tilde means your home directory when you have SSH'd to a server like the cloud. Unfortunately in a URL like we used Monday, tilde cs50 is not your home directory. It's instead your public_html directory; right? So it's close, but there are slightly different meanings. So don't get the two confused. So tilde at the command line is your home directory. Tilde in the context of a URL is that user's public_html directory. So I made this the other day. It turns out if I want to make a dynamic website, watch this magic. So I'm going to go to my terminal here. cp for copy. hello.html, hello.php and, voilà, now I know PHP. All right. So it turns out that PHP is an interpreted language. And this means I don't have to run a compiler. I don't run GCC or anything like that. It just kind of works. And PHP is like a lot of languages in this regard. Perl, Python, Ruby, JavaScript and a whole bunch of others are what are called interpreted languages. And they're generally smiled upon, because it eliminates a step. It just makes it a little faster, a little more friendly to develop code, because you don't have to recompile and rerun. It just works after you reload the web page. But there is a slight downside.
If you are not actually compiling a program down into the zeros and ones that a specific CPU understands, what's probably the price you're paying if you're skipping that step? It's probably performance; right? Because I keep saying verbally that a browser reads an HTML file top to bottom, left to right. Well, a web server does the same thing with PHP. My web server is something called Apache, which is free software -- it's probably the most popular web server software out there, and it's what we're running on the cloud. When Apache, the web server, realizes, oh, some random person on the internet wants a dot PHP file, it doesn't just send the PHP file to the user's browser, because otherwise that would divulge all of the intellectual property that I wrote in the form of that program. What it's supposed to do first, the web server, is open this PHP file and read it top to bottom, left to right and look for any PHP code that's in that file. And if there is any PHP code, it should execute it. If that PHP code's purpose in life is essentially to call printf, it should actually send to the browser whatever the PHP program printed. Only then do I get on the user's end, in the browser, the HTML that resulted from that process. So in short just as you guys have used printf or some of the ncurses functions to print information to the screen, that's ultimately what we're going to be using PHP for, to print or to echo or to output raw HTML code that changes, because PHP is an actual programming language with if conditions and loops and functions, so we can output any HTML we want. So how do we actually do this? Well, I really just cheated there. When I loaded hello.php, this is literally the same file from Monday. But that's consistent with my story, because I said that when the web server pulls up this file, it looks for any PHP code. And we had no PHP on Monday. I didn't write any PHP on Monday. So it's kind of an empty task.
It looks at the file, doesn't see any PHP. So it sends everything else raw to the browser. It sends all of that HTML. So what the browser sees, if I view source on this page, is literally what's on the server. But now let's do something that's a little more dynamic. Let me go ahead and not cheat in this way. Let me go ahead and do something like this. Let me delete the hamster from Monday. Let me really simplify the code here so we can focus only on the PHP. And I'm going to tell the web server here comes some PHP code. And the way to do that is with an open bracket, question mark, PHP. And then you do almost the opposite, question mark, close bracket. This tag tells the web server here comes some PHP code. So anything that's in between these two tags is executed, is interpreted on the server, and converted to HTML. And then the result is sent. So in my browser I should never see open bracket question mark unless the web server is broken or is not configured correctly and accidentally divulges all of my source code. Now, as an aside, this is kind of a stupid looking tag. And the world has come up with what are called short tags, on which folks have rather religious feelings. I'm going to save that. Let me do this again. I would say stylistically it's a lot easier. And, frankly, the nitpicky side of me prefers the symmetry of this. So in the course you'll generally see me write PHP code like this and just omit the word PHP, which frankly I've always thought is stupid, that I have to constantly say what language I'm using when it's obvious. So this looks a little cleaner to me. It's certainly saved me three keystrokes. So let's go ahead and do print, hello world, exclamation point, semicolon. Now, let me go back to the browser, reload, voilà. My first dynamic website. I'm now actually calling a function, print, much like printf in C. It's printing hello world. Now it's a PHP website, but it's not really dynamic; right? It's not actually changing.
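The server-side step described here (scan the file top to bottom, run whatever sits between the PHP tags, and send only the output) can be mimicked with a toy Python sketch. To be clear, this is not how Apache or PHP is implemented. It is a deliberately tiny stand-in that understands only two hypothetical statements, a print of a quoted string and a print of the current time, just to show why the browser never sees the source code.

```python
import re
import time

# Matches <?php ... ?> as well as the short form <? ... ?>.
PHP_BLOCK = re.compile(r"<\?(?:php)?\s+(.*?)\s*\?>", re.DOTALL)


def run_statement(code):
    """Toy 'interpreter': handles only  print "text";  and  print time();"""
    code = code.strip().rstrip(";").strip()
    literal = re.fullmatch(r'print\s*\(?\s*"([^"]*)"\s*\)?', code)
    if literal:
        return literal.group(1)               # output of printing a string literal
    if re.fullmatch(r"print\s*\(?\s*time\(\)\s*\)?", code):
        return str(int(time.time()))          # seconds since January 1st, 1970
    raise ValueError("unsupported statement: " + code)


def render(source):
    """Replace each code block with its output; everything else passes through raw."""
    return PHP_BLOCK.sub(lambda match: run_statement(match.group(1)), source)


page = '<html><body><? print "hello, world!"; ?></body></html>'
print(render(page))  # the tags and the code are gone; only the output remains
```

A file with no code blocks at all comes out of `render` unchanged, which matches the copied hello.html behaving exactly as it did before it was renamed.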
So could I do something different? Well, I'm really going to cheat here. I'm going to go ahead and print out -- it turns out PHP has a function called date. And date takes an argument that specifies what you want the date to look like. So I'm going to say something like YMD for year, month, day. So notice I'm just calling a function called date and passing its result to another function called printf. Notice here I don't need to use the format strings, because I'm just kind of printing some static string here. I could use percent signs, but notice in PHP more commonly used than printf is just a function called print. So I'm going to get into that habit just to be consistent with what the world uses. Reload. And, dammit, server is not configured correctly. How to fix this problem? Quickly. Let's not do -- let's do this: I can cheat. Let's print out the current time in seconds since January 1st, 1970, because that will change every second of this demo. Dynamic website, finally. All right. So I'm printing the clock on the server by reloading every second. All right. So overwhelming. Let's actually do something that's more compelling than just printing stupid exercises like this. So let's literally go into code quite like what I did several years ago for frosh IMs. So, again, the silly back story was that years ago there was no freshman intramural website. And so if you wanted to sign up for a sport, you would go to the proctor's dorm in Wigglesworth, slide a piece of paper under the door and, voilà, you were registered. And maybe they'd email you back. So then we introduced, in 1990 or whatever, a website. Right? And it looked a little something like this. And this is just using some simple HTML, which we'll look at in a moment. But what's compelling about this website versus Monday's is that I don't just have hyperlinks and images in a web page. I now have form controls. And the funny thing is we humans have done an amazing amount of interesting work using so few user input mechanisms.
Even though the websites of today kind of make some of these things look a little sexier, a little more colorful and pretty, for the most part the World Wide Web supports only these forms of user input. Text fields like the name field up here; sometimes they can be multiple lines, which is called a text area as opposed to a text field. Websites support checkboxes; that's something like this. Something like radio buttons, which are similar to checkboxes, but these are mutually exclusive: if you check one, you can't check the other and vice versa. And these things called select menus or drop-down menus. And, again, on various websites you see somewhat different interfaces. And that's because people are using a language called JavaScript, and CSS all the more, to simulate in sexier form these basic user input controls. But recall that when I visited Facebook, I had to provide my user name and my password. I clicked submit. And then those key-value pairs were submitted to the server. Let's actually see what happens here. So here is a form. Imagine it's some random website you discovered as a freshman when you want to register. I'm pulling up Firebug and my Net tab just so I can see what's about to go across the wire. And I'm going to go ahead and type David here. I'll nominate myself as a captain, male, Matthews, and now register. And now let's see what happens. All right. It looks like the website says I'm registered, but not really. This is just version one. And so now let's look down here. If I expand this link, notice what was sent to the server was a bunch of key-value pairs. Captain was sent with a value of on. So it turns out when you check a box on the web, the keyword on is sent to the server. The word Matthews was sent as the value of dorm. M was sent for gender. David was sent for name. What was really actually sent across the wire?
Well, again, as I predicted earlier, what's really sent from browser to server is just a really big long string with a bunch of ampersands and a bunch of equal signs. Again, that was just a pretty-printed version. This is what my browser, because it knows how to speak HTTP, sent from browser to server. Name equals David. Ampersand captain equals on. Ampersand gender equals M and so forth. Those are HTTP parameters. So this begs the question: if I now want to be a programmer on the web, I need to be able to access those values on the server side. So let's actually take a look at what froshims1.php looks like. It turns out that even though I called this PHP, just to get me into the habit of writing PHP code, everything in here is static. So I'm kind of cheating. Technically this could have been an HTML file. But if I scroll down, the takeaways for today are a few new tags. I've got a div tag up there. And I'll make the font a little bit bigger. I've got a div tag up there, but that's just doing some structural stuff for me, then an h1 tag that says register for frosh IMs in big bold text, and then this form tag. And it's with the form tag really that the magic starts to happen. So I've got this form tag. Recall we used this for the Google example on Monday. The action value here is register1.php, because I just decided, based on my design of this program, that froshims1.php, when the submit button is clicked, is going to submit to, is going to post to, this file called register1.php. So that's where the interesting stuff must be happening for frosh IMs registration. What method am I going to use? Well, I chose to use post. Long story short, today there are two methods in the world that are most commonly used: get and post. What's the difference, anyone? So get -- [ Inaudible audience comment ] >> Exactly. If you used method equals quote, unquote get, all of these key-value pairs get sent in the URL where the user can see them.
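To make that long string concrete, here is a small sketch using two built-in PHP functions, http_build_query and parse_str. The $fields array simply mirrors the values typed into the demo form; the variable names are illustrative and not taken from the course's files.

```php
<?php
// The fields from the demo form, as key-value pairs.
$fields = [
    "name" => "David",
    "captain" => "on",
    "gender" => "M",
    "dorm" => "Matthews",
];

// http_build_query() joins the pairs with = and &, URL-encoding
// values where needed, much as a browser does when submitting a form.
$query = http_build_query($fields);
print($query . "\n");

// parse_str() goes the other way, turning the string back into an
// array, roughly what PHP does before it hands you $_GET or $_POST.
parse_str($query, $decoded);
print_r($decoded);
```

With method get, a string like this appears after a question mark in the URL; with post, the same string travels in the body of the request instead.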
But that has advantages, because it means that you can copy and paste the URL into an email. You can bookmark it. But if you're submitting something like a password, you probably don't want that in your browser's history in the URL bar. So any time you want to submit information that's a little sensitive, you should generally use post. Or if you want to submit information like photographs or just things that obviously can't fit in a URL, you use post. Now, I have what's called a table here. This is kind of frowned upon by some people these days, but the reality is it makes things really simple. What I have here really on this page, if I change border colon zero to border colon one, what's really going on here, it's kind of a nice way of laying out this page. Come on. Let me cheat temporarily. So what I really have is the table in the middle of this page that looks a little something like this. And if I really change what this looks like -- and, again, we won't focus so much on the HTML aspect today. What's really going on underneath the hood here -- sorry. Oh, dammit. All right. I can't show you the border. Oh, wait. Yes, I can, if I actually save the right answer. Okay. So this is what I have done to lay out this page. And it's a common approach that we use in P set seven for simplicity. But I wanted everything, being a little anal, to line up nicely here. So really I have a table underneath the hood with rows and columns, just lining all this up, but this looks, frankly, very amateur. I didn't want people to see these ugly lines in the rows that compose the grid. So I essentially changed border to zero. As an aside, there are more sophisticated ways of laying stuff out, but at least for P set seven we'll probably take this relatively simple approach. So this then begs the question, how does this data get from froshims1.php to register1.php? Well, just a couple of new tags. There's this tag here. Input name equals name. Type equals text.
Input name equals captain. Type equals checkbox. And we saw some of these things on Monday when we simulated the Google form. And the interesting one at the bottom, if I scroll down, is gender, because it's a radio button. So input name equals gender. Type equals radio. And I'm not just making these up entirely as I go along. I found in a book or a cheat sheet that type equals radio is a legit value, but name I have full discretion over. Just like you can choose your own variable names, you can choose your own form field names. And then down here I have what's called a select menu. Because the select menu is a little meatier, it's got more data in it, you actually need multiple tags to implement it. So I say select name equals dorm, size equals one, which means how many things do you want to show at once? Just a drop-down or do you want to see them all at once? And then I have option value equals Apley Court, and then the words I want to appear in the drop-down, and then, again, the close tag or the end tag, which is open bracket slash option, which says that's the end of the option. And then, lastly, I have this thing, input type equals submit, value equals register with an exclamation point. But none of this really pertains to what happens next. What happens next is all of the information in this web page gets submitted to register1.php. And register1.php looks like this, not much at all. So at the very bottom we see the spoiler for what we already saw. You are registered! Well, not really. Well, what did I mean by that? Well, at the top of this file is a bunch of PHP code, because it turns out this is actually going on here. If I start over with this form and I say, you don't need to know my name, you don't need to know my gender. All right. Fine. I'm in Weld, I'll tell you that much, and click register, it turns out that this page actually doesn't say I'm registered. It's not very user friendly, but an error has happened.
If I try to mess around and just not give the proctor all the information they want, notice it's actually not letting me submit. And that's because atop register1.php -- let me shrink the font so it fits a little more. Okay. So let me shrink the fonts here. Atop register1.php, notice this. Open bracket question mark. Question mark, close bracket. So everything at the top of this file is PHP code. Now, this mess here is just comments. PHP supports inline comments with slash slash and then also slash star, star slash comments like C. So the PHP code is this. And this is what's really neat about PHP: it's so relatively easy to get started. It turns out that any time you submit a form from one page to another, if the second one is PHP, all of those key-value pairs, that so-called hashtable I alluded to before, are handed to you in a special variable called dollar sign underscore P-O-S-T. If by contrast you use that method called get, where the values are still sent, but in the URL, it's dollar sign underscore G-E-T, in all capitals. The underscore is just there because they figure you're never going to choose a variable name that starts with an underscore. So we will. This is a special superglobal variable that just always exists that you get for free. And we're leveraging that. What's inside of this variable is essentially a hashtable, more generally known as an associative array. In C we really were limited. The only thing you could put between square brackets with respect to an array was what? Like a number, zero, one, two, three or a variable containing a number. What's really cool about PHP and languages like it is that arrays are not always numeric. They are associative, whereby you can use anything as the index into an array. You can use a word. You can use a number. You can use a floating point value. You can use anything you want. Associative arrays in PHP are a generalization -- are an incarnation of this idea of a hashtable.
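A short sketch of the difference between a C-style numeric array and an associative one follows. The $post array is only a stand-in for the real $_POST superglobal (an assumption made so the example runs outside a web server), and the other names are illustrative.

```php
<?php
// A numerically indexed array, as in C.
$numbers = [10, 20, 30];
print($numbers[0] . "\n");

// An associative array: words serve as the index.
$student = ["name" => "David", "dorm" => "Matthews"];
print($student["name"] . "\n");

// Stand-in for $_POST, which has the same shape after a form is submitted.
$post = ["name" => "David", "captain" => "on", "gender" => "M", "dorm" => "Matthews"];
foreach ($post as $key => $value) {
    print("$key => $value\n");
}
```

One caveat worth knowing: PHP array keys end up as either integers or strings, so a floating point key is converted to an integer rather than stored as is.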
A hashtable, even though you might be implementing this for P set six and really trying to wrap your mind around how to implement it, at the end of the day is one of the most useful data structures in computer science, because it's just a big container that takes in keys, outputs values. And in theory it runs in constant time, or at least in the best case it's constant time. So it's sort of the Swiss Army knife of data structures, because it's just so useful as a container for stuff. So that's what's inside post. So what I'm saying with my if condition here in PHP: if the name key inside of this hashtable equals equals -- what's this? Just nothing. The empty string. So quote, unquote. Or -- so vertical bars -- the gender field in that superglobal called post equals equals nothing, or dorm equals equals nothing, what do I do? Well, this is kind of a long line, because it's a long URL. I'm going to respond with not HTML, but I'm going to call a function called header. And that header function says, location colon space HTTP colon slash slash. So what this is doing is the web server, PHP, is telling the browser, go there, to this location. Well, what's there? If you look at that URL, it's the same location I just came from. And so this is why I just ended up back where I was. And nothing actually happened, because I then called exit. And so I never hit the bottom of this file. So it's important to realize again PHP is interpreted, in that it's read top to bottom, left to right. And so if you call exit, you're just not going to proceed any further. Now, this is not a very user-friendly version. Let's pull up version two of frosh IMs here. It looks the same, but now let me go ahead and just say, you don't need to know any of this. I'm David, and that's all. Register. So now we finally have a more familiar user interface mechanism. You must provide your name, gender and dorm. Go back. Well, how did I implement this? Well, let's take a look. In register2.php notice I did this.
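Before looking at version two, here is a rough sketch of the version-one check just described: if any required field is blank, redirect back to the form and stop. The helper name is_incomplete and the example.com URL are placeholders invented for illustration, since the actual URL is not spelled out in the lecture, and the REQUEST_METHOD test is an addition so the sketch does nothing when run outside a web server.

```php
<?php
// Returns true if name, gender, or dorm is missing or blank.
// (Hypothetical helper; the lecture's file compares each field inline.)
function is_incomplete(array $post): bool
{
    foreach (["name", "gender", "dorm"] as $field) {
        if (!isset($post[$field]) || $post[$field] === "") {
            return true;
        }
    }
    return false;
}

// Only act on a real form submission.
if (($_SERVER["REQUEST_METHOD"] ?? "") === "POST" && is_incomplete($_POST)) {
    // Placeholder URL: tells the browser to go back to the form.
    header("Location: http://example.com/froshims1.php");
    // Stop here so nothing further down the file is sent.
    exit;
}
```

The isset test matters because an unselected radio button is not sent at all, so its key may simply be absent from the array.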
And this is what's cool, too, about PHP. In that last example I put all my PHP code at the very top of the file, but it turns out that I can use these open bracket question marks really any time I want. I can alternate between HTML mode, which is just going to get spit out raw to the browser, and PHP mode. And then if I need to make a decision here -- wait a minute, I either want to print out X or I want to print out Y -- I can make that decision inline with PHP as follows. Here's my HTML tag. Let me zoom in. Here's my HTML tag. Here's my head tag, my title tag. Close head. Open body. And now I'm at the body part of the page. I'm not sure what I want to do yet. So I'm going to check a condition. I enter PHP mode: open bracket, question mark. If the name key inside the post hashtable equals equals nothing, or gender is nothing, or dorm is nothing, what do I do? Well, notice at the end of this condition slightly new syntax versus C. Notice I have a colon, but then I have close -- then I have question mark, close bracket. So that means if any one of those expressions is true, go ahead, PHP, and just start spitting out the following content. What's the following content? You must provide your name, gender and dorm, exclamation point. Else -- notice I'm back in PHP mode, because I have to ask a question again of the computer. Else, go ahead and echo this instead: you are registered. Well, not really. So I'm dropping in and out of PHP mode to make these decisions. Well, I can do slight enhancements of this still if I pull up this time frosh IMs three. Let me go ahead and do this, frosh IMs three, and open up register three. And notice here if I'm at three and I click register here -- David, one, two, three, register. What's going on here? What I've done in register three is this: if the name is not blank -- so notice I'm using bang equals. If it's not blank, and gender is not blank, and over here dorm is not blank, notice this: I'm going to declare a few variables.
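The dropping in and out of PHP mode described for version two might look roughly like the sketch below. It is an approximation, not the course's register2.php: $post stands in for $_POST so the page can be tried from the command line, the title text is made up, and the full opening tag is used in place of the short one.

```php
<?php
// Stand-in for $_POST (assumption): only the name was provided.
$post = ["name" => "David", "gender" => "", "dorm" => ""];
$incomplete = $post["name"] == "" || $post["gender"] == "" || $post["dorm"] == "";
?>
<!DOCTYPE html>
<html>
  <head>
    <title>Frosh IMs</title>
  </head>
  <body>
    <?php if ($incomplete): ?>
      You must provide your name, gender, and dorm!
    <?php else: ?>
      You are registered! (Well, not really.)
    <?php endif; ?>
  </body>
</html>
```

The colon after the condition, together with the closing endif, is PHP's alternative syntax for control structures; it exists for exactly this kind of interleaving with raw HTML.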
So if you haven't caught on already, variables in PHP by convention always start with dollar signs. A to variable is going to be malan at cs50.net. A subject variable is going to be registration. A body variable is going to be this line here. In PHP you can use dot as the concatenation operator, which means take this string and concatenate it with this string and this one and this one. So this multiline thing here just says make a really long string, but I'm concatenating it line by line for style. But the magic is in this last line. Thankfully, finally -- in C you have to do everything low level. But in PHP: mail, to, this subject, this body and these headers. Well, what just happened? Well, let me go back to my browser, mail.cs50.net, which happens to be hosted by Gmail. Let me go ahead -- and thankfully it's HTTPS. So you're not getting into this one. Enter. And Joseph wants to be friends. Thank you. And wait. I've got to click submit. Wait a minute. Come on, David. Apley Court. Register. This better work. Registration. I have now implemented what I did years ago -- it took me like weeks years ago -- the idea of an email system that just informs the proctor, or me in this case, who registered for frosh IMs. So now we have an online registration system. More on this on Monday. [applause]
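As a recap of that last step, here is a minimal sketch of building the message with the dot operator and handing it to PHP's built-in mail function. The addresses at example.com, the helper name registration_body, and the exact wording of the message are placeholders invented for illustration; actual delivery also depends on the server having a working mail setup.

```php
<?php
// Builds the message body by concatenating strings with the dot operator.
// (Hypothetical helper; the lecture's file builds the string inline.)
function registration_body(array $fields): string
{
    return "This person just registered:\n\n" .
        "Name: " . $fields["name"] . "\n" .
        "Captain: " . (isset($fields["captain"]) ? "yes" : "no") . "\n" .
        "Gender: " . $fields["gender"] . "\n" .
        "Dorm: " . $fields["dorm"];
}

// Only send mail for a real form submission with all required fields present.
if (($_SERVER["REQUEST_METHOD"] ?? "") === "POST"
    && isset($_POST["name"], $_POST["gender"], $_POST["dorm"])) {
    $to = "proctor@example.com";              // placeholder recipient
    $subject = "Registration";
    $headers = "From: froshims@example.com";  // placeholder sender
    // mail() returns true if the message was accepted for delivery.
    mail($to, $subject, registration_body($_POST), $headers);
}
```

The captain line checks isset because an unchecked checkbox sends nothing at all, whereas a checked one sends the value on.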