[MUSIC PLAYING] 

DAVID MALAN: This is CS50, and this is the start of week 10. And you might remember this image from a few weeks back when we talked about the internet and how it's actually implemented physically. And you might recall that there's actually a whole bunch of cables as well as wireless technologies that interconnect all of the nodes or routers and other such technologies on the internet. And a lot of that is underseas. 

Well, it turns out that those underseas cables are a bit of a target. And today's lecture is entirely about security, not only the threats that we all face physically, but also virtually, and also, toward the tail end today, some of the defenses that we as users can actually put into place. 

But first, one of the first and perhaps most physical threat--

[VIDEO PLAYBACK]

-Could Russia be planning an attack on undersea cables that connect global internet? 

-Russian ships and submarines lurking near undersea cables that carry almost all of the world's internet. 

-The entire internet is carried along these cables. 

-First of all, what is the internet doing underwater? Last time I checked, I'm not supposed to get my computer wet. Second, if you ask me how the internet travels from continent to continent, I would've said satellites or lasers, or, honestly, I probably would have just said the internet. 

And what happened to the cloud? I was told there was a cloud. Remember? Hey, let's put that in the cloud. It was like the internet was a vapor of information that circles the Earth, and your computer was like a ladle that scooped out what you needed. 

But it turns out the internet is actually underwater because these cables carry more than 95% of daily internet communications. And US intelligence worries that in times of tension or conflict, Russia might resort to severing them. It would be the biggest disruption to your internet service since your upstairs neighbor put a password on his Wi-Fi. OK? Try his dog's name.

[END PLAYBACK]

DAVID MALAN: Before we turn now to some of the more virtual threats, a couple of announcements. So our friends at CrimsonEMS are currently recruiting for new EMTs, Emergency Medical Technicians. And this is actually something particularly close to my heart. 

A long time ago, I remember being in an Ikea shortly after graduation, actually. And as I was exiting the store, this little boy who was in a stroller started turning literally blue. And he was choking on some piece of food that had presumably gotten stuck in his throat. 

And his mother was panicking. The parents around them were panicking. And even I, who had a bit of familiarity with EMS just by way of friends, completely froze. And it was only thanks to something like a 15-year-old lifeguard who ran over and actually knew what to do instinctively and called for help and actually pulled the boy out of his stroller and actually addressed the situation. 

And for me, that was a turning point. And it was that moment in time where I decided, dammit, I need to have my act together and actually know how to respond to these kinds of situations. And so I myself got licensed years ago as an EMT. And through graduate school did I ride on MIT's ambulance for some period of time as well as have kept up my license since. 

And actually, to this day, all of CS50 staff here in Cambridge are actually certified in CPR, as well, for similar reasons. So if you're at all interested in this, there's never going to be enough time in the day to take on something new. But if you want a New Year's resolution, do join these guys here or consider reaching out to the Red Cross for certification, either here or in New Haven, as well. 

So CS50's last lunch is this Friday. So if you've not yet joined us, or if you have and you want one more time, do go on CS50's website to fill out the form there. Know, too, that our friend at Yale, Professor Scassellati, has been producing an AI, artificial intelligence, series for us that will start to debut this week on video. So especially if you are interested in pursuing a final project somehow related to artificial intelligence, natural language processing, even robotics, realize that these will be a wonderful inspiration for that. 

And just to give you a teaser of this, here is Scaz himself. 

[VIDEO PLAYBACK] 

-One of the really great things about computer science is that with even only a few weeks of study, you're going to be able to understand many of the intelligent artifacts and devices that populate our modern world. In this short video series, we're going to look at things like how Netflix is able to suggest and recommend movies that I might like, how it is that Siri can answer questions that I have, how it is that Facebook can recognize my face and automatically tag me in a photograph, or how Google is able to build a car that drives on its own. 

So I hope you'll join me for this short series of videos, the CS50 AI series. I think you'll find that you know much more than you thought you did.

[END PLAYBACK]

DAVID MALAN: So those will appear on the course's website later this week. Stay tuned. And in the meantime, a few announcements as to what lies ahead. So we are here. This is in our lecture on security. This coming Wednesday, Scaz and Andy, our head teaching fellow in New Haven, will be here to look at artificial intelligence itself for a look at computation for communication-- how to build systems that use language to communicate from ELIZA, if you're familiar with this software from yesteryear, to Siri more recently and to Watson, which you might know from Jeopardy or the like. 

Then next Monday, we're not here in Cambridge. We're in New Haven for a second look at artificial intelligence with Scaz and company-- AI opponents in games. So if you've ever played against the computer in some video game or mobile game or the like, we'll talk about exactly that-- how to build opponents for games, how to represent things underneath the hood using trees from games like tic-tac-toe to chess to actual modern video games, as well. 

Sadly, quiz one is shortly thereafter. More details on that on CS50's website later this week. And our final lecture at Yale will be that Friday after the quiz. And our final lecture at Harvard will be the Monday thereafter, by nature of scheduling. 

And so in terms of milestones, besides pset eight out this week; status report, which will be a quick sanity check between you and your teaching fellow; the hackathon, which will be here in Cambridge for students from New Haven and Cambridge alike. We will take care of all transportation from New Haven. The implementation of the final project will be due. And then for both campuses will there be a CS50 fair that allows us to take a look at and delight in what everyone has accomplished. 

In fact, I thought this would be a good moment to draw attention to this device here, which we've used for some amount of time here, which is a nice touch screen. And actually, last year we had a $0.99 app that we downloaded from the Windows app store in order to draw on the screen. 

But frankly, it was very cluttered. It allowed us to draw on the screen, but there were, like, a lot of icons up here. The user interface was pretty bad. If you wanted to change certain settings, there were just so many damn clicks. And the user interface-- or, more properly, the user experience-- was pretty suboptimal, especially using it in a lecture environment. 

And so we reached out to a friend of ours at Microsoft, Bjorn, who's actually been following along with CS50 online. And as his final project, essentially, did he very graciously take some input from us as to exactly the features and user experience we want. And he then went about building for Windows this application here that allows us to draw-- oops-- and spell on the-- wow. Thank you. To draw and spell on this screen here with very minimal user interface. 

So you've seen me, perhaps, tap up here ever so slightly where now we can underline things in red. We can toggle and now go to white text here. If we want to actually delete the screen, we can do this. And if we actually prefer a white canvas, we can do that. So it does so terribly little by design and does it well. So that I futz, hopefully, far less this year in class. 

And thanks, too, to a protege of his am I wearing today a little ring. This is Benjamin, who was interning with Bjorn this summer. So it's a little ring. It's a little larger than my usual ring. But via a little dial on the side here can I actually move the slides left and right, forward and back, and actually advance things wirelessly so that, one, I don't have to keep going back over to the space bar here. And two, I don't have to have one of those stupid clickers and preoccupy my hand by holding the damn thing all the time in order to simply click. And surely, in time, will the hardware like this get super, super smaller. 

So certainly, don't hesitate to think outside the box and do things and create things that don't even exist yet for final projects. Without further ado, a look at what awaits as you dive into your final projects at the CS50 hackathon. 

[VIDEO PLAYBACK] 

[MUSIC PLAYING] 

[SNORING]

[END PLAYBACK]

DAVID MALAN: All right. So the Stephen Colbert clip that I showed just a moment ago was actually on TV just a few days ago. And in fact, a couple of the other clips we'll show today are incredibly recent. And in fact, that speaks to the reality that so much of technology and, frankly, a lot of the ideas we've been talking about in CS50 really are omnipresent. And one of the goals of the course is certainly to equip you with technical skills so that you can actually solve problems programmatically, but two, so that you can actually make better decisions and make more informed decisions. And, in fact, thematic throughout the press and online videos and articles these days is just a frightening misunderstanding or lack of understanding of how technology works, especially among politicians. 

And so indeed, in just a bit we'll take a look at one of those details, as well. But literally just last night was I sitting in Bertucci's, a local franchise Italian place. And I hopped on their Wi-Fi. And I was very reassured to see that it's secure. And I knew that because it says here "Secure Internet Portal" when the screen came up. So this was the little prompt that comes up in Mac OS or in Windows when you connect to a Wi-Fi network for the first time. And I had to read through their terms and conditions and finally click OK. And then I was allowed to proceed. 

So let's start to rethink what all of this means and to no longer take for granted what people tell us when we encounter it with various technology. So one, what does it mean that this is a secure internet portal? What could Bertucci's be reassuring me of?

AUDIENCE: The packets sent back and forth are encrypted.

DAVID MALAN: Good. The packets being sent back and forth are encrypted. Is that in fact the case? If that were the case, what would I have to do or what would I have to know? Well, you'd see a little padlock icon in Mac OS or in Windows saying that there is indeed some encryption or scrambling going on. But before you can use an encrypted portal or Wi-Fi connection, what do you have to usually type in? A password. I know no such password, nor did I type any such password. I simply clicked OK. So this is utterly meaningless. This is not a secure internet portal. This is a 100% insecure internet portal. There's absolutely no encryption going on, and all that is making it secure is that three-word phrase on the screen there. 

So that means nothing, necessarily, technologically. And a little more worrisome, if you actually read through the terms and conditions, which are surprisingly readable, was this-- "you understand that we reserve the right to log or monitor traffic to ensure these terms are being followed." So that's a little creepy, if Bertucci's is watching my internet traffic. But most any agreement that you've blindly clicked through has surely said that before. 

So what does that actually mean technologically? So if there's some creepy guy or woman in back who's, like, monitoring all the internet traffic, how is he or she accessing that information exactly? What are the technological means via which that person-- or adversary, more generally-- can be looking at our traffic? 

Well, if there's no encryption, what kinds of things could they sniff, so to speak, sort of detect in the air. What would you look at? Yeah? 

AUDIENCE: The packets being sent from your computer to the router? 

DAVID MALAN: Yeah. The packets being sent from the computer to your router. So you might recall when we were in New Haven, we passed those envelopes, physically, throughout the audience to represent data going through the internet. And certainly, if we were throwing them through the audience wirelessly to reach their destination, anyone can sort of grab it and make a copy of it and actually see what's inside of that envelope. 

And, of course, what's inside of these envelopes is any number of things, including the IP address that you're trying to access or the host name, like www.harvard.edu or yale.edu that you're trying to access or something else altogether. Moreover, the path, too-- you know from pset six that inside of HTTP requests are GET /something.html. So if you're visiting a specific page, downloading a specific image or video, all of that information is inside of that packet. And so anyone there in Bertucci's can be looking at that very same data.

Well, what are some other threats along these lines to be mindful of before you just start accepting as fact what someone like Bertucci's simply tells you? Well, this was an article-- a series of articles that came out just a few months back. All the rage these days are these newfangled smart TVs. What's a smart TV, if you've heard of them or have one at home?

AUDIENCE: Internet connectivity?

DAVID MALAN: Yeah, internet connectivity. So generally, a smart TV is a TV with internet connectivity and a really crappy user interface that makes it harder to actually use the web because you have to use, like, up, down, left, and right or something on your remote control just to access things that are so much more easily done on a laptop. 

But more worrisome about a smart TV, and Samsung TVs in this particular case, was that Samsung TVs and others these days come with certain hardware to create what they claim is a better user interface for you. So one, you can talk to some of your TVs these days, not unlike Siri or any of the other equivalents on mobile phones. So you can say commands, like change channel, raise volume, turn off, or the like. But what's the implication of that logically? If you've got the TV in your living room or the TV at the foot of your bed to fall asleep to, what's the implication? Yeah? 

AUDIENCE: There might be something going in through the mechanism to detect your speech.

DAVID MALAN: Yeah.

AUDIENCE: That could be sent via internet. If it's unencrypted, then it's vulnerable.

DAVID MALAN: Indeed. If you have a microphone built into a TV and its purpose in life is, by design, to listen to you and respond to you, it's surely going to be listening to everything you say and then translating that to some embedded instructions. But the catch is that most of these TVs aren't perfectly smart themselves. They're very dependent on that internet connection. 

So much like Siri, when you talk into your phone, quickly sends that data across the internet to Apple servers, then gets back a response, literally is the Samsung TV and equivalents just sending everything you're saying in your living room or bedroom to their servers just to detect did he say, turn on the TV or turn off the TV? And God knows what else might be uttered. Now, there's some ways to mitigate this, right? Like what does Siri and what does Google and others do to at least defend against that risk that they're listening to absolutely everything? It has to be activated by saying something like, hey, Siri, or hi Google or the like or OK, Google or the like. 

But we all know that those expressions kind of suck, right? Like I was just sitting-- actually the last time I was at office hours at Yale, I think, Jason or one of the TFs kept yelling, like, hey, Siri, hey, Siri and was making my phone do things because he was too proximal to my actual phone. But the reverse is true, too. Sometimes those things just kick on because it's imperfect. And indeed, natural language processing-- understanding a human's phrasing and then doing something based on it-- is certainly imperfect. 

Now, worse yet, some of you might have seen or have a TV where you can do stupid or new-age things like this to change channels to the left or this to change channels to the right or lower the volume or raise the volume. But what does that mean the TV has? A camera pointed at you at all possible times. 

And in fact, the brouhaha around Samsung TVs for which they took some flak is that if you read the terms and conditions of the TV-- the thing you certainly never read when unpacking your TV for the first time-- embedded in there was a little disclaimer saying the equivalent of, you might not want to have personal conversations in front of this TV. And that's what it reduces to. 

But you shouldn't even need to be told that. You should be able to infer from the reality that microphone and camera literally pointing at me all the time maybe is more bad than good. And frankly, I say this somewhat hypocritically. I literally have, besides those cameras, I have a tiny little camera here in my laptop. I have another one over here. I have them in my cellphone on both sides. So lest I put it down the wrong way, they can still watch me and listen to me. 

And all this could be happening all the time. So what's stopping my iPhone or Android phone from doing this all the time? How do we know that Apple and some creepy person at Google, aren't listening in to this very conversation through the phone or conversations I have at home or at work? 

AUDIENCE: Because our lives aren't that interesting. 

DAVID MALAN: Because our lives aren't that interesting. That actually is a valid response. If we're not worried about a particular threat, there is a sort of who cares aspect to it. Little old me is not going to really be a target. But they certainly could. 

And so even though you see some cheesy things on TVs and movies, like, oh, let's turn on the grid and-- like Batman does this a lot, actually, and actually can see Gotham, what's going on by way of people's cellphones or the like. Some of that's a little futuristic, but we're pretty much there these days. 

Almost all of us are walking around with GPS transponders that are telling Apple and Google and everyone else that wants to know where we are in the world. We have a microphone. We have a camera. We're telling things like Snapchat and other applications everyone we know, all of their phone numbers, all of their email addresses. And so again, one of the takeaways today, hopefully, is to at least pause a little bit before just blindly saying, OK when you want the convenience of Snapchat knowing who all of your friends are. But conversely, now Snapchat knows everyone you know and any little notes you might have made in your contacts. 

So this was a timely one, too. A few months back, Snapchat itself was not compromised. But there had been some third-party applications that made it easier to save snaps. And the catch was that that third-party service was itself compromised, in part because Snapchat's service supported a feature that they probably shouldn't have, which allowed for this archiving by a third party. 

And the problem was that an archive of, like, 90,000 snaps, I think, was ultimately compromised. And so you might take some comfort in things like Snapchat being ephemeral, right? You have seven seconds to look at that inappropriate message or note, and then it disappears. But one, most of you have probably figured out how to take screenshots by now, which is the easiest way to circumvent that. But two, there's nothing stopping the company or people on the internet from intercepting that data, potentially, as well. 

So this was literally just a day or two ago. This was a nice article headline on a website online. "Epic Fail-- Power Worm Ransomware Accidentally Destroys Victim's Data During Encryption." So another ripped from the headlines kind of thing here. So you might have heard of malware, which is malicious software-- so bad software that people with too much free time write. And sometimes, it just does stupid things like delete files or send spam or the like. 

But sometimes, and increasingly, it's more sophisticated, right? You all know how to dabble in encryption. And Caesar and Vigenere aren't super secure, but there's other ones, certainly, that are more sophisticated. And so what this adversary did was wrote a piece of malware that somehow infected a bunch of people's computers. But he was kind of an idiot and wrote a buggy version of this malware such that when he or she implemented the code-- oh, we're getting a lot of-- sorry. We're getting a lot of hits on the microphone. OK. 

So what the problem was that he or she wrote some bad code. And so they generated pseudorandomly an encryption key with which to encrypt someone's data maliciously, and then accidentally threw away the encryption key. So the effect of this malware was not as intended, to ransom someone's data by encrypting his or her hard drive and then expecting $800 US in return for the encryption key, at which point the victim could decrypt his or her data. Rather, the bad guy simply encrypted all the data on their hard drive, accidentally deleted the encryption key, and got no money out of it. But this also means that the victim is truly a victim because now he or she cannot recover any of the data unless they actually have some old-school backup of it. 

So here too is sort of a reality that you'll read about these days. And how can you defend against this? Well, this is a whole can of worms, no pun intended, about viruses and worms and the like. And there is certainly software with which you can defend yourself. But better than that is just to be smart about it. 

In fact, I haven't-- this is one of these do as I say, not as I do things, perhaps-- I haven't really used antivirus software in years because if you generally know what to look for, you can defend against most everything on your own. And actually, timely here at Harvard-- there was a bug or an issue last week where Harvard is clearly, like, monitoring lots of network traffic. And all of you even visiting CS50's website might have gotten an alert saying that you can't visit this website. It's not secure. But if you tried visiting Google or other sites, too, those, too, were insecure. 

That's because Harvard, too, has some kind of filtration system that is keeping an eye out on potentially malicious websites to help protect us against us. But even those things are clearly imperfect, if not buggy, themselves. 

So here-- if you're curious, I'll leave these slides up online-- is the actual information that the adversary gave. And he or she was asking for in bitcoin-- which is a virtual currency-- $800 US to actually decrypt your data. Unfortunately, this was completely foiled.

So now we'll look at something more political. And again, the goal here is to start to think about how you can make more informed decisions. And this is something happening currently in the UK. And this was a wonderful tagline from an article about this. The UK is introducing, as you'll see, a new surveillance bill whereby the UK is proposing to monitor everything the Brits do for a period of one year. And then the data is thrown out. Quote, unquote, "It would serve a tyranny well." 

So let's take a look with a friend of Mr. Colbert's. 

[VIDEO PLAYBACK] 

-Welcome, welcome, welcome to "Last Week Tonight." Thank you so much for joining us. I'm John Oliver. Just time for a quick recap of the week. And we begin with the UK, Earth's least magic kingdom. 

This week, debate has been raging over there over a controversial new law. 

-The British government is unveiling new surveillance laws that significantly extend its power to monitor people's activities online. 

-Theresa May there calls it a license to operate. Others have called it a snooper's charter, haven't they? 

-Well, hold on because-- snooper's charter is not the right phrase. That sounds like the agreement an eight-year-old is forced to sign promising to knock before he enters his parents' bedroom. Dexter, sign this snooper's charter or we cannot be held responsible for what you might see. 

This bill could potentially write into law a huge invasion of privacy. 

-Under the plans, a list of websites visited by every person in the UK will be recorded for a year and could be made available to police and security services. 

-This communications data wouldn't reveal the exact web page you looked at, but it would show the site it was on.

-OK. So it wouldn't store the exact page, just the website. But that is still a lot of information. For instance, if someone visited orbitz.com, you'd know they were thinking about taking a trip. If they visited yahoo.com, you'd know they just had a stroke and forgot the word "google." And if they visited vigvoovs.com, you'd know they're horny and their B key doesn't work. 

And yet for all the sweeping powers the bill contains, British Home Secretary Theresa May insists that critics have blown it out of proportion. 

-An internet connection record is a record of the communication service that a person has used, not a record of every web page they have accessed. It is simply the modern equivalent of an itemized phone bill. 

-Yeah, but that's not quite as reassuring as she thinks it is. And I'll tell you why. First, I don't want the government looking at my phone calls either. And secondly, an internet browsing history is a little different from an itemized phone bill. No one frantically deletes their phone bill every time they finish a call. 

[END PLAYBACK]

DAVID MALAN: A pattern's emerging as to how I prepare for class. It's just to watch TV for a week and see what comes out, clearly. So that, too, was just from last night on "Last Week Tonight." So let's begin to talk now about some of the defenses. Indeed, for something like this, where the Brits are proposing to keep a log of that kind of data, where might it be coming from? Well, recall from pset six, pset seven, and pset eight now that inside of those virtual envelopes-- at least for HTTP-- are messages that look like this. And so this message, of course, is not only addressed to a specific IP address, which the government here or there could certainly log. But even inside of that envelope is an explicit mention of the domain name that's being visited. And if it's not just slash, it might actually be a specific file name or a specific image or movie or, again, anything of interest to you could be certainly intercepted if all of the network traffic is somehow being proxied through governmental servers, as already happens in some countries, or if there are sort of unknown or undisclosed agreements, as has happened already in this country between certain large players-- ISPs and phone companies and the like-- and the government. 
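As a rough illustration of what sits inside one of those envelopes, here is a minimal Python sketch that reads a plaintext HTTP request the way an on-path observer might. The request text, the `summarize_request` helper, and the use of badplace.com as the host are all made up for this example; nothing here is taken from the slide itself.

```python
def summarize_request(raw: str) -> dict:
    """Extract the fields visible to anyone who can read a plaintext HTTP request."""
    lines = raw.split("\r\n")
    # The request line looks like: GET /something.html HTTP/1.1
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "host": headers.get("host")}


# Hypothetical request, invented for illustration only.
raw_request = (
    "GET /something.html HTTP/1.1\r\n"
    "Host: www.badplace.com\r\n"
    "\r\n"
)

record = summarize_request(raw_request)
print("site-level log entry:", record["host"])
print("page-level log entry:", record["host"] + record["path"])
```

The two print statements mirror the distinction drawn in the clip: a log of only the host says which site was visited, while the host plus the path says which page.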

So funny story-- the last time I chose badplace.com off the top of my head as an example of a sketchy website, I didn't actually vet beforehand whether or not that actually led to a badplace.com. Thankfully, this domain name is just parked, and it doesn't actually lead to a badplace.com. So we'll continue to use that one for now. But I'm told that could've backfired very poorly that particular day. 

So let's begin to now talk about certain defenses and what holes there might even be in those. So passwords is kind of the go-to answer for a lot of defense mechanisms, right? Just password protect it, then that will keep the adversaries out. But what does that actually mean? 

So recall from hacker two, back if you tackled that-- when you had to crack passwords in a file-- or even in problem set seven, when we give you a sample SQL file of some usernames and passwords. These were the usernames you saw, and these were the hashes that we distributed for the hacker edition of problem set two. And if you've been wondering all this time what the actual passwords were, this is what, in fact, they decrypt to, which you could have cracked in pset two, or you could have playfully figured them out in problem set seven. All of them have some hopefully cute meaning here or in New Haven. 

But the takeaway is that all of them, at least here, are pretty short, pretty guessable. I mean, based on the list here, which are perhaps the easiest to crack, to figure out by writing software that just guesses and checks, would you say?

AUDIENCE: Password.

DAVID MALAN: Password's pretty good, right? And it's just-- one, it's a very common password. In fact, every year there's a list of the most common passwords in the world. And quote, unquote "password" is generally atop that list. Two, it's in a dictionary. And you know from problem set five that it's not that hard-- might be a little time consuming-- but it's not that hard to load a big dictionary into memory and then use it to sort of guess and check all possible words in a dictionary. 

What else might be pretty easy to guess and check? Yeah? 

AUDIENCE: The repetition of letters. 

DAVID MALAN: The repetition of symbols and letters. So kind of sort of. So, in fact-- and we won't go into great detail here-- all of these were salted, which you might recall from problem set seven's documentation. Some of them have different salts. So you could actually avoid having repetition of certain characters simply by salting the passwords differently. 
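To make the point about salts concrete, here is a small Python sketch showing that the same password hashed with two different salts produces two different hashes. As an assumption for illustration, it uses SHA-256 over salt plus password rather than whatever scheme the problem set actually used, and the salts "aa" and "zz" are invented.

```python
import hashlib


def hash_password(password: str, salt: str) -> str:
    """Stand-in hash for illustration: SHA-256 over salt + password."""
    return hashlib.sha256((salt + password).encode()).hexdigest()


# Same password, same salt: identical hashes, so repetition is visible.
alice = hash_password("12345", "aa")
bob = hash_password("12345", "aa")

# Same password, different salt: the hashes no longer match.
carol = hash_password("12345", "zz")

print(alice == bob)
print(alice == carol)
```

Someone looking only at the stored hashes could tell that the first two users share a password, but could not tell that the third does.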

But things like 12345, that's a pretty easy thing to guess. And frankly, the problem with all of these passwords is that they're all just using 26 possible characters, or maybe 52 with some uppercase, and then 10 digits. I'm not using any funky characters. I'm not using zeros for O's or ones for I's or L's or-- if any of you think you're being clever, though, by having a zero for an O in your password or-- OK, I saw someone smile. So someone has a zero for an O in his or her password. 
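The character counts in this passage translate directly into how many passwords an adversary would have to try. A minimal Python sketch of that arithmetic follows; the length of 8 is an assumption chosen only for illustration.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of a given length over a given alphabet."""
    return alphabet_size ** length


LENGTH = 8  # assumed password length, for illustration only

for label, size in [
    ("lowercase letters", 26),
    ("lowercase + uppercase", 52),
    ("letters + digits", 62),
]:
    print(f"{label}: {keyspace(size, LENGTH):,} possible passwords")
```

Each added class of characters multiplies the search space, but only if the password is not otherwise guessable from a dictionary.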

You're not actually being as clever as you might think, right? Because if more than one of us is doing this in the room-- and I've been guilty of this as well-- well, if everyone's kind of doing this, what does the adversary have to do? Just add zeros and ones and a couple of other-- maybe fours for H's-- to his or her arsenal and just substitute those letters for the dictionary words. And it's just an additional loop or something like that. 
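That "additional loop" might look something like the Python sketch below, which expands one dictionary word into every combination of the substitutions mentioned here. The substitution table and the `variants` name are assumptions for illustration, not a description of any particular cracking tool.

```python
from itertools import product

# Substitutions mentioned in the lecture: zeros for O's, ones for I's or L's,
# fours for H's. Each letter maps to the characters an adversary would try.
SUBSTITUTIONS = {"o": "o0", "i": "i1", "l": "l1", "h": "h4"}


def variants(word: str) -> list[str]:
    """Return every spelling of word with or without each substitution applied."""
    choices = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    return ["".join(combo) for combo in product(*choices)]


candidates = variants("hello")
print(len(candidates))
print("he110" in candidates)
```

For a five-letter word with four substitutable letters that is only 16 candidates, so the substitution adds very little work for the adversary.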

So really, the best defense for passwords is something much, much more random-seeming than these. Now, of course, threats against passwords sometimes include emails like that. So I literally just got this in my inbox four days ago. This is from Brittany, who apparently works at harvard.edu. And she wrote me as a webmail user. "We just noticed that your email account was logged onto another computer in a different location, and you are to verify your personal identity." 

So thematic in many emails like this, which are examples of phishing attacks-- P-H-I-S-H-I-N-G-- where someone is trying to fish and get some information out of you, generally by an email like this. But what are some of the telltale signs that this is not, in fact, a legitimate email from Harvard University? What's that? 

So bad grammar, the weird capitalization, how some letters are capitalized in certain places. There's some odd indentation in a couple of places. What else? What's that? Well, that certainly helps-- the big yellow box that says this might be spam from Google, which is certainly helpful. 

So there's a lot of telltale signs here. But the reality is these emails must work, right? It's pretty cheap, if not free, to send out hundreds or thousands of emails. And it's not just by sending them out of your own ISP. One of the things that malware does tend to do-- so viruses and worms that accidentally infect our computers because they've been written by adversaries-- one of the things they do is just churn out spam. 

So what does exist in the world, in fact, are things called botnets, which is a fancy way of saying that people with better coding skills than the person who wrote that buggy version of software have actually written software that people like us unsuspectingly install on our computers and that then starts running behind the scenes, unbeknownst to us. And those malware programs intercommunicate. They form a network, a botnet if you will. And generally, the most sophisticated of adversaries has some kind of remote control over thousands, if not tens of thousands, of computers by just sending out a message on the internet that all of those bots, so to speak, are able to hear or occasionally request from some central site and then can be controlled to send out spam. 

And these spam things can be just sold to the highest bidder. If you're a company or sort of a fringe company that doesn't really care about the sort of ethics of spamming your users but you just want to hit out a million people and hope that 1% of them-- which is still a nontrivial number of potential buyers-- you can actually pay these adversaries in the sort of black market of sorts to send out these spams via their botnets for you. 

So suffice it to say, this is not a particularly compelling email. But even Harvard and Yale and the like often make mistakes, in that we know from a few weeks back that you can make a link say www.paypal.com. And it looks like it goes there. But, of course, it doesn't actually do that. 
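As a rough sketch of how that trick could be detected, the code below scans HTML for links whose visible text looks like one hostname while the href points at another. The class and function names, the "looks like a host" heuristic, and the example URLs are all assumptions made for illustration, not something any particular mail client is known to do.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def host_of(url):
    """Return the lowercase hostname of url, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()


def looks_like_host(text):
    """Rough heuristic: a single token containing a dot, e.g. www.paypal.com."""
    return bool(text) and " " not in text and "." in text


class LinkAuditor(HTMLParser):
    """Collect links whose visible text names a different host than the href."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> tag currently open, if any
        self._text = []        # visible text collected inside that tag
        self.suspicious = []   # (visible text, actual destination) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            shown = "".join(self._text).strip()
            if looks_like_host(shown) and host_of(shown) != host_of(self._href):
                self.suspicious.append((shown, self._href))
            self._href = None


def audit(html):
    """Return the list of suspicious (text, href) pairs found in html."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.suspicious
```

For example, `audit('<a href="http://evil.example/login">www.paypal.com</a>')` reports the mismatch, while a link whose text and href agree is left alone.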

And so Harvard and Yale and others have certainly been guilty over the years in sending out emails that are legitimate, but they contain hyperlinks in them. And we, as humans, have been trained by sort of the officials, quite often, to actually just follow links that we receive in an email. But even that isn't the best practice. So if you do ever get an email like this-- and maybe it is from Paypal or Harvard or Yale or Bank of America or the like-- you still should not click the link, even if it looks legitimate. You should manually type out that URL yourself. And frankly, that's what the system administrator should be telling us to do so that we're not tricked into doing this. 

Now, how many of you, perhaps by looking down at your seat, have passwords written down somewhere? Maybe in a drawer in your dorm room or maybe under-- in a backpack somewhere? Wallet? No? 

AUDIENCE: In a fireproof lockbox? 

DAVID MALAN: In a fireproof lockbox? OK. So that's better than a sticky note on your monitor. So certainly, some of you are insisting no. But something tells me that's not necessarily the case. So how about an easier, more likely question-- how many of you are using the same password for multiple sites? Oh, OK. Now we're being honest. 

All right. So that's wonderful news, right? Because it means if just one of those sites you all are using is compromised, now the adversary has access to more data about you or more potential exploits. So that's an easy one to avoid. But how many of you have a pretty guessable password? Maybe not as bad as this, but something? For some stupid site, right? It's not high-risk, doesn't have a credit card? All of us. Like, even I have passwords that are probably just 12345, surely. So now try logging into every website you can think of with malan@harvard.edu and 12345 and see if that works. 

But we do this, too. So why? Why do so many of us have either pretty easy passwords or the same passwords? What's the real-world rationale for this? It's easier, right? If I said instead, academically, you guys should really be choosing pseudorandom passwords that are at least 16 characters long and have a combination of alphabetical letters, numbers, and symbols, who the hell is going to be able to do that or remember those passwords, let alone for each and every possible website? 

So what's a viable solution? Well, one of the biggest takeaways today, too, pragmatically, would be, honestly, to start using some kind of password manager. Now, there are upsides and downsides of these things, too. These are two that we tend to recommend in CS50. One's called 1Password. One's called LastPass. And some of you might use these already. But it's generally a piece of software that does facilitate generating big pseudorandom passwords that you can't possibly remember as a human. It stores those pseudorandom passwords in its own database, hopefully on your local hard drive-- encrypted, better yet. And all you, the human, have to remember, typically, is one master password, which probably is going to be super long. And maybe it's not random characters. Maybe it's, like, a sentence or a short paragraph that you can remember and you can type once a day to unlock your computer. 

So you use an especially large password to protect and to encrypt all of your other passwords. But now you're in the habit of using software like this to generate pseudorandom passwords across all of the websites you visit. And indeed, I can comfortably say now, in 2015, I don't know most of my passwords anymore. I know my master password, and I type that, unknowingly, one or more times a day. But the upside is that now, if any one of my accounts is compromised, there's no way someone is going to use that account to get into another because none of my passwords are the same anymore. 
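The generating step can be sketched in a few lines. This is not how 1Password or LastPass is actually implemented; it is a minimal illustration using Python's standard `secrets` module, with a hypothetical function name and an assumed alphabet of letters, digits, and punctuation.

```python
import secrets
import string

# Assumed alphabet: upper- and lowercase letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=24):
    """Return a password of `length` characters drawn from a secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The `secrets` module is used rather than `random` because it draws from the operating system's cryptographically secure source, which is what you want for anything an adversary might try to predict.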

And certainly, no one, even if he or she writes adversarial software to brute force things and guess all possible passwords-- the odds that they are going to choose my 24-character long passwords is just so, so low I'm just not worried about that threat anymore. 
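To put rough numbers on "so, so low," here is a small sketch of the arithmetic. The guess rate of one billion per second in the usage note is purely an assumption for illustration, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


def years_to_exhaust(alphabet_size, length, guesses_per_second):
    """Years needed to try every password at the given guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second / SECONDS_PER_YEAR
```

A five-digit password like 12345 lives in a space of `keyspace(10, 5)`, or 100,000 possibilities. Eight characters of letters and digits give `keyspace(62, 8)`, or 218,340,105,584,896. For 24 characters drawn from roughly 94 printable symbols, `years_to_exhaust(94, 24, 1e9)` comes out far larger than the age of the universe, even at the assumed billion guesses per second.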

So what's the trade-off here? That seems wonderful. I'm so much more safe. What's the trade-off? Yeah? 

AUDIENCE: Time. 

DAVID MALAN: Time. It's a lot easier to type 12345 and I'm logged in versus something that's 24 characters long or a short paragraph. What else? 

AUDIENCE: If someone breaks your master password. 

DAVID MALAN: Yeah. So you're kind of changing the threat scenario. If someone guesses or figures out or reads the Post-it note in your secure file vault, the master password you have, now everything is compromised whereby previously it was maybe just one account. What else? 

AUDIENCE: If you want to use any of your accounts on another device and you don't have LastPass [INAUDIBLE]. 

DAVID MALAN: Yeah, that's kind of a catch, too. With these tools, too, if you don't have your computer and you're in, like, some cafe or you're at a friend's house or a computer lab or wherever and you want to log into Facebook, you don't even know what your Facebook password is. Now sometimes, you can mitigate this by having a solution that we'll talk about in just a moment called two-factor authentication whereby Facebook will text you or will send a special encrypted message to your phone or some other device that you carry around on your keychain with which you can log in. But that's, perhaps, annoying if you're in the basement of the science center or elsewhere here at New Haven's campus. You might not have signal. And so that's not necessarily a solution. So it really is a trade-off. But what I would encourage you to do-- if you go to CS50's website, we actually arranged for the first of these companies for a site license, so to speak, for all CS50 students so you don't have to pay the $30 or so it normally costs. For Macs and Windows, you can check out 1Password for free on CS50's website, and we'll hook you up with that. 

Realize, too, that some of these tools-- including LastPass in one of its forms-- are cloud-based, as Colbert says, which means your passwords are stored, encrypted, in the cloud. The idea there is that you can go to some random person or friend's computer and log in to your Facebook account or the like because you first go to lastpass.com, access your password, and then type it in. But what's the threat scenario there? If you're storing things in the cloud, and you're accessing that website on some unknown computer, what could your friend be doing to you or to your keystrokes? OK. I'll be manually advancing slides here on out. 

Keylogger, right? Another type of malware is a keylogger, which is just a program that actually logs everything you type. So there, too, it's probably better to have some secondary device like this. 

So what is two-factor authentication? As the name suggests, it's you have not one but two factors with which to authenticate to a website. So rather than use just a password, you have some other second factor. Now, generally, one factor is something you know. So something kind of in your mind's eye, which is your password which you've memorized. But two, not something else that you know or have memorized but something you physically have. The idea here being your threat no longer could be some random person on the internet who can just guess or figure out your password. He or she has to have physical access to something that you have, which is still possible and still, perhaps, all the more physically threatening. But it's at least a different kind of threat. It's not a million nameless people out there trying to get at your data. Now it's a very specific person, perhaps, that if that's an issue, that's another problem altogether, as well. 

So that generally exists for phones or other devices. And, in fact, Yale just rolled this out mid-semester such that this doesn't affect folks in this room. But those of you following along in New Haven know that if you log in with your Yale NetID, in addition to typing your user name and your password, you're then prompted with this. And, for instance, this is a screenshot I took this morning when I logged into my Yale account. And it sends me the equivalent of a text message to my phone. But in reality, I downloaded an app in advance that Yale now distributes, and I have to now just type in the code that they send to my phone. 

But to be clear, the upside of this is that now, even if someone figures out my Yale password, I'm safe. That's not enough. That's only one key, but I need two to unlock my account. But what's the downside, perhaps, of Yale's system? And we'll let Yale know. What's the downside? What's that? If you don't have cell service or if you don't have Wi-Fi access because you're just in a basement or something, you might not be able to get the message. Thankfully, in this particular case, this will use Wi-Fi or something else, which works around it. But a possible scenario. What else? You could lose your phone. You just don't have it. The battery dies. I mean, there's a number of annoying scenarios but possible scenarios that could happen that make you regret this decision. And the worst possible outcome, frankly, then would be for users to disable this altogether. So there's always going to be this tension. And you have to find for yourself as a user sort of a sweet spot. And to do this, take a couple of concrete suggestions. If you use Google Gmail or Google Apps, know that if you go to this URL here, you can enable two-factor authentication. Google calls it 2-step verification. And you click Setup, and then you do exactly that. That's a good thing to do, especially these days because, thanks to cookies, you're logged in almost all day long. So you rarely have to type your password anyway. So you might do it once a week, once a month, once a day, and it's less of a big deal than in the past. 

Facebook, too, has this. If you're a little too loose with typing your Facebook password into friends' computers, at least enable two-factor authentication so that that friend, even if he or she has a keystroke logger, can't get into your account. Well, why is that? Couldn't they just log the code I've typed in on my phone that Facebook has sent to me? 

AUDIENCE: [INAUDIBLE]. 

DAVID MALAN: Yeah. The well-designed software will change those codes that are sent to your phone every few seconds or every time, so that, yeah, even if he or she figures out what your code is, you're still safe because it will have expired. And this is what it looks like on Facebook's website. 
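The lecture doesn't say which scheme Facebook uses, but one widely used standard for expiring codes is the time-based one-time password (TOTP) of RFC 6238. The sketch below is a minimal illustration of that standard, assuming a shared secret and a 30-second window; the function name and defaults are choices made here, not details from the lecture.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Compute an RFC 6238 one-time code (HMAC-SHA1) for the given time."""
    # Both sides derive the same counter from the clock: one tick per `step`.
    counter = int(unix_time) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte selects
    # which 4 bytes of the digest to turn into a number.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Because the code depends on the current time window, a code captured by a keylogger at second 59 is already different from the valid code at second 60. With the RFC's published test secret, `totp(b"12345678901234567890", 59, digits=8)` yields `94287082`.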

But there's another approach altogether. So if those kinds of trade-offs aren't particularly alluring, a general principle in security would be, well, just at least audit things. Don't kind of put your head in the sand and just never know if or when you've been compromised or attacked. At least set up some mechanism that informs you instantly if something anomalous has happened so that you at least narrow the window of time during which someone can do damage. 

And by this, I mean the following-- at Facebook, for instance, you can turn on what they call login alerts. And right now, I've enabled email login alerts but not notifications. And what that means is that if Facebook notices I've logged into a new computer-- like I don't have a cookie, it's a different IP address, it's a different type of computer-- they will, in this scenario, send me an email saying, hey, David. Looks like you logged in from an unfamiliar computer, just FYI. 
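A minimal sketch of that idea follows. It is not Facebook's implementation; the class name, the use of a per-device token standing in for the cookie, and the notification callback are all assumptions made for illustration.

```python
import secrets


class LoginAuditor:
    """Alert a user whenever a login arrives from a device not seen before."""

    def __init__(self, notify):
        self.notify = notify   # hypothetical callback: notify(user, message)
        self.known = {}        # user -> set of device tokens already issued

    def login(self, user, device_token, ip_address):
        """Record a login and return the token the browser should keep."""
        devices = self.known.setdefault(user, set())
        if device_token not in devices:
            # No recognized cookie: treat the computer as unfamiliar.
            self.notify(user, f"Login from an unfamiliar computer at {ip_address}")
            device_token = secrets.token_hex(16)
            devices.add(device_token)
        return device_token
```

The first login from a browser triggers an alert and hands back a token; later logins that present the same token are silent, while a missing or forged token triggers another alert.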

And now my account might be compromised, or my annoying friend might have been logging into my account now posting things on my news feed or the like. But at least the amount of time with which I am ignorant of that is super, super narrow. And I can hopefully respond. So all three of these, I would say, are very good things to do. What are some threats that are a little harder for us end users to protect against? Does anyone know what session hijacking is? It's a more technical threat, but very familiar now that we've done pset six and seven and now eight. So recall that when you send traffic over the internet, a few things happen. Let me go ahead and log into c9 or CS50.io. Give me just one moment to log into my jHarvard account. 

AUDIENCE: What's your password? 

DAVID MALAN: 12345. All right. And in here, know that if I go ahead and request a web page-- and in the meantime, let me do this. Let me open up Chrome's Inspector tab and my network traffic. And let me go to http://facebook.com and clear this. Actually, you know what? Let's go to a more familiar one-- https://finance.cs50.net and click Enter and log the network traffic here. 

So notice here, if I look in my network traffic, response headers-- let's go up here. Response headers-- here. So the very first request that I sent, which was for the default page, it responded with these response headers. And we've talked about things like location. Like, location means redirect to login.php. But one thing we didn't talk a huge amount about was lines like this. So this is inside of the virtual envelope that's sent from CS50 Finance-- the version you guys wrote, too-- to a user's laptop or desktop computer. And this is setting a cookie. But what is a cookie? Think back to our discussion of PHP. Yeah? Yeah, it's a way of telling the website that you're still logged in. But how does that work? Well, upon visiting finance.cs50.net, it looks like that server that we implemented is setting a cookie. And that cookie is conventionally called PHPSESSID, for session ID. And you can think of it like a virtual handstamp at a club or, like, an amusement park, a little piece of red ink that goes on your hand so that the next time you visit the gate, you simply show your hand, and the bouncer at the door will let you pass or not based on that stamp. 

So the subsequent requests that my browser sends-- if I go to the next request and you look at the request headers, you'll notice more stuff. But the most important is this highlighted portion here-- not set cookie but cookie. And if I flip through every one of those subsequent HTTP requests, every time I would see a hand being extended with that exact same PHPSESSID, which is to say this is the mechanism-- this big pseudorandom number-- that a server uses to maintain the illusion of PHP's $_SESSION object, into which you can store things like the user's ID or what's in their shopping cart or any number of other pieces of data. 
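The handstamp mechanism can be sketched as follows. This is not PHP's actual implementation; it is a minimal illustration in which the server keeps a dictionary keyed by a pseudorandom session ID, and the function names are hypothetical.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side storage: session ID -> that visitor's data (user ID, cart, ...).
SESSIONS = {}


def start_session():
    """Create a session; return its ID and the Set-Cookie value to send."""
    session_id = secrets.token_hex(16)   # the big pseudorandom number
    SESSIONS[session_id] = {}
    return session_id, f"PHPSESSID={session_id}; path=/"


def load_session(cookie_header):
    """Return the session named by a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("PHPSESSID")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)
```

The browser never sees the session's contents, only the ID. On every later request it sends `Cookie: PHPSESSID=...`, and the server uses that value to look up the matching dictionary, which is all that presenting the handstamp amounts to.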

So what's the implication? Well, what if that data is not encrypted? And, in fact, we for best practice encrypt pretty much every one of CS50's websites these days. But it's very common these days for websites still not to have HTTPS at the start of the URL. They're just HTTP, colon, slash slash. So what's the implication there? That simply means that all of these headers are inside of that virtual envelope. And anyone who sniffs the air or physically intercepts that packet can look inside and see what that cookie is. 

And so session hijacking is simply a technique that an adversary uses to sniff data in the air or on some wired network, look inside of this envelope, and see, oh. I see that your cookie is 2kleu whatever. Let me go ahead and make a copy of your handstamp and now start visiting Facebook or Gmail or whatever myself and just present the exact same handstamp. And the reality is, browsers and servers really are that naive. If the server sees that same cookie, its purpose in life should be to say, oh, that must be David, who just logged in a little bit ago. Let me show this same user, presumably, David's inbox or Facebook messages or anything else into which you're logged in. 

And the only defense against that is to just encrypt everything inside of the envelope. And thankfully, a lot of sites like Facebook and Google and the like are doing that nowadays. But any that don't leave you perfectly, perfectly vulnerable. And one of the things you can do-- and one of the nice features, frankly, of 1Password, the software I mentioned earlier, is if you install it on your Mac or PC, the software, besides storing your passwords, will also warn you if you ever try logging into a website that's going to send your username and password unencrypted and in the clear, so to speak. All right. So session hijacking boils down to that. But there's this other way that HTTP headers can be used to take advantage of us. And this is still kind of an issue. This is really just an adorable excuse to put up Cookie Monster here. But Verizon and AT&T and others took a lot of flak a few months back for injecting, unbeknownst to users initially, an extra HTTP header. 

So those of you who have had Verizon Wireless or AT&T cell phones, and you've been visiting websites via your phone, unbeknownst to you, after your HTTP requests leave Chrome or Safari or whatever on your phone, go to Verizon or AT&T's router, they presumptuously for some time have been injecting a header that looks like this-- a key-value pair where the key is just X-UIDH for unique identifier header and then some big random value. And they do this so that they can uniquely identify all of your web traffic to people receiving your HTTP request. 

Now, why would Verizon and AT&T and the like want to uniquely identify you to all the websites you're visiting? 

AUDIENCE: Better customer service. 

DAVID MALAN: Better-- no. It's a good thought, but it's not for better customer service. What else? Advertising, right? So they can build up an advertising network, presumably, whereby even if you have turned off cookies, even if you have special software on your phone that keeps you in incognito mode-- ha. There is no incognito mode when the man in the middle-- literally, Verizon or AT&T-- is injecting additional data over which you have no control, thereby revealing who you are to that resulting website again and again. 
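Here is a minimal sketch of what such an injection amounts to. How the carriers actually derived the value isn't covered in the lecture; the hash of a subscriber ID with a carrier secret, the function name, and the example hostnames below are all assumptions for illustration.

```python
import hashlib


def inject_uidh(headers, subscriber_id, carrier_secret):
    """Return a copy of the request headers with a per-subscriber X-UIDH."""
    # Assumed derivation: a stable, opaque value per subscriber.
    token = hashlib.sha256(carrier_secret + subscriber_id.encode()).hexdigest()
    tagged = dict(headers)      # the phone's original headers are not modified
    tagged["X-UIDH"] = token
    return tagged
```

Suppose the same phone requests `news.example` and then `shop.example` with cookies turned off. After passing through `inject_uidh`, both requests carry an identical `X-UIDH` value, so any party that sees both can tell they came from the same subscriber even though the browser itself sent nothing identifying.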

So there are ways to opt out of this. But here, too, is something that frankly, the only way to push back on this is to leave the carrier altogether, disable it if they even allow you to, or, as happened in this case, make quite a bit of fuss online such that the companies actually respond. This, too, is just another adorable opportunity to show this. 

And let's take a look at, let's say, one or two final threats. So we talked about CS50 Finance here. So you'll notice that we have this cute little icon on the login button here. What does it mean if I instead use this icon? So before, after. Before, after. What does after mean? It's secure. That's what I'd like you to think. But ironically, it is secure because we do have HTTPS. 

But that is how easy it is to change something on a website, right? You all know a bit of HTML and CSS now. And in fact, it's pretty easy to-- and if you didn't do it-- to change the icon. But this, too, is what companies have taught us to do. So here's a screenshot from Bank of America's website this morning. And notice, one, they're reassuring me that it's a secure sign in at top left. And they also have a padlock icon on the button, which means what to me, the end user? 

Truly nothing, right? What does matter is the fact that there's the big green URL up top with HTTPS. But if we zoom in on this, it's just like me, knowing a little bit of HTML and a bit of CSS, and saying, hey, my website's secure. Like, anyone can put a padlock and the word secure sign-on onto their website. And it truly means nothing. What does mean something is something like this, where you do see https://. The fact that Bank of America Corporation has this big green bar, whereas CS50 does not, just means they paid several hundred dollars more to have additional verification done of their domain in the US so that browsers who adhere to this standard will also show us a little bit more than that. 
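Since the padlock image proves nothing, a check that does mean something looks at where the login form actually sends its data. The sketch below is a hypothetical helper, not a feature of any named product: it resolves a form's action against the page's URL and tests whether the result uses HTTPS.

```python
from urllib.parse import urljoin, urlparse


def submits_over_https(page_url, form_action):
    """True if a form on page_url with this action attribute posts via HTTPS."""
    # A relative action like "login.php" inherits the page's scheme and host.
    target = urljoin(page_url, form_action)
    return urlparse(target).scheme == "https"
```

So `submits_over_https("https://finance.cs50.net/", "login.php")` is true, while a page served over plain HTTP, or an HTTPS page whose form posts to an `http://` address, fails the check no matter how many padlock icons it displays.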

So we'll leave things at that, frighten you a little more before long. But on Wednesday, we'll be joined by Scaz from Yale for a look at artificial intelligence and what we can do with these machines. We will see you next time.