WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:02.976 --> 00:00:06.448 [MUSIC PLAYING] 00:01:12.307 --> 00:01:13.390 DAVID J. MALAN: All right. 00:01:13.390 --> 00:01:15.130 So this is CS50. 00:01:15.130 --> 00:01:18.520 My name is David Malan, and this is Harvard University's introduction 00:01:18.520 --> 00:01:20.740 to the intellectual enterprises of computer science 00:01:20.740 --> 00:01:22.450 and the art of programming. 00:01:22.450 --> 00:01:25.060 And this, of course, is our special family weekend, 00:01:25.060 --> 00:01:28.180 wherein not only are CS50's own students here in the audience, but also 00:01:28.180 --> 00:01:29.990 some family members as well. 00:01:29.990 --> 00:01:32.560 Now, you're showing up in the semester a little bit late. 00:01:32.560 --> 00:01:36.130 We've just tackled week eight, which is really our ninth week since computer 00:01:36.130 --> 00:01:37.630 scientists start counting from zero. 00:01:37.630 --> 00:01:40.047 So we've done a whole lot of work over the past few weeks, 00:01:40.047 --> 00:01:42.890 as you might have heard via emails or text messages home, 00:01:42.890 --> 00:01:45.858 including a language known here as binary. 00:01:45.858 --> 00:01:48.400 So on the screen here, of course, is a lot of zeros and ones. 00:01:48.400 --> 00:01:52.600 And suffice it to say, let me sum up the past nine weeks with this 00:01:52.600 --> 00:01:54.340 is what's going on underneath the hood. 00:01:54.340 --> 00:01:57.465 But of course, today, we thought we'd make things a little more accessible, 00:01:57.465 --> 00:01:58.870 a little more broadly applicable. 00:01:58.870 --> 00:02:01.480 And indeed, our focus today will not be on what 00:02:01.480 --> 00:02:04.060 these patterns of zeros and ones represent, 00:02:04.060 --> 00:02:07.390 which the astute might notice are replicated visually 00:02:07.390 --> 00:02:10.190 with these light bulbs being in a pattern on and off.
00:02:10.190 --> 00:02:14.140 And as your child might have hinted before class, or perhaps, now, 00:02:14.140 --> 00:02:17.200 this might very well spell a word up to eight characters 00:02:17.200 --> 00:02:20.890 long because you can encode, even in the real world, things digital too. 00:02:20.890 --> 00:02:23.470 But today, we'll focus on things much more high level, 00:02:23.470 --> 00:02:26.170 this notion of cybersecurity, like the security 00:02:26.170 --> 00:02:31.210 of our data, our privacy of our systems, particularly, on the internet 00:02:31.210 --> 00:02:34.330 nowadays because presumably, all of us are carrying technologies around 00:02:34.330 --> 00:02:36.770 in our pocket using laptops and desktops every day. 00:02:36.770 --> 00:02:40.685 And so the goal today is to stipulate that this 00:02:40.685 --> 00:02:42.310 is what's going on underneath the hood. 00:02:42.310 --> 00:02:44.350 But let's solve some problems at a higher level, 00:02:44.350 --> 00:02:47.770 so that your homework, when you go back to wherever you're visiting from, 00:02:47.770 --> 00:02:50.740 can actually be to apply some of today's lessons learned. 00:02:50.740 --> 00:02:55.780 So with that said, perhaps, the most common familiar defense of one's 00:02:55.780 --> 00:02:58.720 systems, and data, phone, and laptops, and desktops would just 00:02:58.720 --> 00:03:00.070 be these simple passwords. 00:03:00.070 --> 00:03:02.440 Unfortunately, you and I are-- frankly, as humans, 00:03:02.440 --> 00:03:05.150 not all that good at choosing passwords. 00:03:05.150 --> 00:03:09.010 And this is in itself a relatively weak form of defense, even though each of us 00:03:09.010 --> 00:03:12.430 has dozens, hundreds, of passwords nowadays 00:03:12.430 --> 00:03:14.920 or at least dozens or hundreds of accounts, 00:03:14.920 --> 00:03:19.653 maybe fives or tens of dozens of passwords. 
00:03:19.653 --> 00:03:21.820 Indeed, if you're in the habit of reusing passwords, 00:03:21.820 --> 00:03:25.160 we'll see today, probably among our first lessons learned. 00:03:25.160 --> 00:03:29.050 So for instance, if we look back at the past year, 2021, thanks 00:03:29.050 --> 00:03:33.310 to security researchers who take a look at data that has been hacked or leaked 00:03:33.310 --> 00:03:37.990 online by way of public databases, we have a sense as computer scientists 00:03:37.990 --> 00:03:40.720 of what the most popular, or equivalently, 00:03:40.720 --> 00:03:43.270 what some of the worst passwords are that you and I are 00:03:43.270 --> 00:03:44.300 choosing for our system. 00:03:44.300 --> 00:03:46.750 So as of this past year, according to one measure, 00:03:46.750 --> 00:03:53.470 the most commonly used password in systems everywhere was 123456. 00:03:53.470 --> 00:03:54.220 All right? 00:03:54.220 --> 00:03:56.770 The number two password in our top 10 here list 00:03:56.770 --> 00:04:00.670 was only slightly longer, 123456789. 00:04:00.670 --> 00:04:05.590 After that, we took a turn in the other direction, 12345 alone. 00:04:05.590 --> 00:04:08.440 After that, it got a little more interesting, qwerty, 00:04:08.440 --> 00:04:10.360 which might sound pretty cryptic, but not 00:04:10.360 --> 00:04:12.100 if you look down at your US keyboard. 00:04:12.100 --> 00:04:16.750 And it's the top left hand row of the keys on an American keyboard, so also, 00:04:16.750 --> 00:04:18.310 not all that hard. 00:04:18.310 --> 00:04:21.790 Perhaps, not surprisingly a little disconcertingly, 00:04:21.790 --> 00:04:25.270 number five was password. 00:04:25.270 --> 00:04:29.200 Meanwhile, number six returns us to digits, 12345678. 00:04:29.200 --> 00:04:33.310 After that, really, less effort, 111111. 00:04:33.310 --> 00:04:37.780 After that, a little more variation, but not all that much 123123. 
00:04:37.780 --> 00:04:41.830 After that, it's getting even less interesting, 1234567890. 00:04:41.830 --> 00:04:45.520 And then lastly topping the list is just 1234567. 00:04:45.520 --> 00:04:48.470 So this is not a good top 10 list to be on. 00:04:48.470 --> 00:04:52.600 So among today's first takeaways is if you see your password on the screen, 00:04:52.600 --> 00:04:54.490 you didn't make the list in a good way. 00:04:54.490 --> 00:04:57.550 This means hundreds, thousands, millions of other people 00:04:57.550 --> 00:04:59.380 probably have that password of yours. 00:04:59.380 --> 00:05:02.590 Now, in and of itself, that's not necessarily worrisome 00:05:02.590 --> 00:05:05.920 because I don't know who has these passwords in a room as large as this. 00:05:05.920 --> 00:05:09.460 But just intuitively, why is this a bad thing? 00:05:09.460 --> 00:05:14.120 Either parent or child is welcome to raise a hand here. 00:05:14.120 --> 00:05:16.210 Why might this be-- 00:05:16.210 --> 00:05:17.275 intuitively, yeah? 00:05:17.275 --> 00:05:18.707 AUDIENCE: Access to it. 00:05:18.707 --> 00:05:20.040 DAVID J. MALAN: So access to it. 00:05:20.040 --> 00:05:22.290 I mean, we literally, as computer scientists, now have 00:05:22.290 --> 00:05:24.720 a database of really common passwords. 00:05:24.720 --> 00:05:25.680 And your thoughts? 00:05:25.680 --> 00:05:28.408 AUDIENCE: [INAUDIBLE] find it out quickly. 00:05:28.408 --> 00:05:30.700 DAVID J. MALAN: Yeah, you can just find it out quickly. 00:05:30.700 --> 00:05:33.910 I mean, you could imagine trying to guess someone's password by just typing 00:05:33.910 --> 00:05:36.280 in random letters, random numbers, random words, 00:05:36.280 --> 00:05:38.290 but not if you have a top 10 list. 00:05:38.290 --> 00:05:41.920 The adversaries in the world might as well just start with this list. 00:05:41.920 --> 00:05:44.727 Now, you'll notice that even absent from this are slight variants.
00:05:44.727 --> 00:05:46.810 Some of you might be thinking, I'm not on the list 00:05:46.810 --> 00:05:50.470 because I do something clever like I use an exclamation point for the number 00:05:50.470 --> 00:05:55.240 one, or a three for an E, or a 5 for an S. 00:05:55.240 --> 00:05:57.160 And based on the smiles in the room right now, 00:05:57.160 --> 00:06:00.460 you're not all that clever, it turns out, because other people are smiling 00:06:00.460 --> 00:06:04.390 too, which is to say that an adversary can take those same heuristics that you 00:06:04.390 --> 00:06:06.700 might think are making things more secure by just 00:06:06.700 --> 00:06:08.752 tweaking some letters to numbers or vice versa. 00:06:08.752 --> 00:06:10.960 But if you're doing it and other people are doing it, 00:06:10.960 --> 00:06:14.150 the bad guys, so to speak, are going to be doing it as well. 00:06:14.150 --> 00:06:16.570 So unfortunately, when it comes to passwords, 00:06:16.570 --> 00:06:20.020 better is longer, and random, and really unguessable. 00:06:20.020 --> 00:06:21.520 But that's not what most of us have. 00:06:21.520 --> 00:06:23.562 In fact, case in point on our phones, whether you 00:06:23.562 --> 00:06:25.540 have an Android device or an iPhone nowadays, 00:06:25.540 --> 00:06:28.900 odds are you have something relatively simplistic protecting it, 00:06:28.900 --> 00:06:30.070 if you have anything at all. 00:06:30.070 --> 00:06:31.778 But at least, Apple and Google are pretty 00:06:31.778 --> 00:06:34.630 good at at least nudging us to choose these kinds of passcodes now. 00:06:34.630 --> 00:06:38.360 And a four-digit passcode is quite common nowadays.
00:06:38.360 --> 00:06:42.190 And so here's where we have an opportunity, thanks to the URL 00:06:42.190 --> 00:06:45.490 that you saw on the screen earlier, to conjecture as a group 00:06:45.490 --> 00:06:49.060 just how long might it take an adversary, someone out there 00:06:49.060 --> 00:06:51.460 who's out to get us or get one of us-- 00:06:51.460 --> 00:06:54.820 how long might it take an adversary to figure out 00:06:54.820 --> 00:06:57.580 your phone's four-digit passcode? 00:06:57.580 --> 00:06:58.930 This is CS50's own Carter. 00:06:58.930 --> 00:07:01.510 Carter, if you could switch over and poll the audience here-- 00:07:01.510 --> 00:07:04.552 if you take out your phone or laptop, whatever device you might have used 00:07:04.552 --> 00:07:10.960 a few minutes ago, to scan that QR code or to visit that same URL, 00:07:10.960 --> 00:07:14.320 you can see these questions on your browser. 00:07:14.320 --> 00:07:15.790 And if you can't, that's fine. 00:07:15.790 --> 00:07:18.217 We'll share some aggregate data, nonetheless. 00:07:18.217 --> 00:07:20.800 But you should have an opportunity to tap one of your answers. 00:07:20.800 --> 00:07:26.920 And we'll give folks a few more seconds if you'd like to play along at home. 00:07:26.920 --> 00:07:30.670 And here in just a moment-- 00:07:30.670 --> 00:07:32.287 probably have many people reporting. 00:07:32.287 --> 00:07:34.870 But why don't we go ahead and take a look at some percentages? 00:07:34.870 --> 00:07:36.910 It looks like most of you-- 00:07:36.910 --> 00:07:39.340 60% to 70%-- are proposing just a few seconds, 00:07:39.340 --> 00:07:42.340 so that's not all that good news if it's a four-digit passcode. 00:07:42.340 --> 00:07:44.620 Some of you are hoping it's a few minutes. 00:07:44.620 --> 00:07:46.510 8% are hoping a few hours. 00:07:46.510 --> 00:07:50.560 More than 4% of you are really hoping, perhaps, it's a few days.
00:07:50.560 --> 00:07:53.192 Well, let's actually consider how we can answer this question 00:07:53.192 --> 00:07:55.900 and make today not just conceptual, but a little quantitative too 00:07:55.900 --> 00:07:58.483 and see if we can't slap some numbers on questions like these, 00:07:58.483 --> 00:08:00.610 so ultimately, you can make more informed decisions 00:08:00.610 --> 00:08:01.850 with your system's security. 00:08:01.850 --> 00:08:05.140 So for instance, when it comes to four-digit passcodes, 00:08:05.140 --> 00:08:08.560 rather than just consider how secure it is, well, 00:08:08.560 --> 00:08:11.630 let's make it a more precise question like, what are the forms of attack? 00:08:11.630 --> 00:08:14.755 Well, the simplest attack might be just someone grabbing your phone, be it 00:08:14.755 --> 00:08:17.320 in your family, or maybe at Starbucks, or the airport, 00:08:17.320 --> 00:08:21.760 or the like and just starting all possible combinations, maybe 0000, 00:08:21.760 --> 00:08:24.490 then 0001, and 0002. 00:08:24.490 --> 00:08:26.660 We could maybe automate this a little bit. 00:08:26.660 --> 00:08:30.220 So for instance, I might potentially be able to do something 00:08:30.220 --> 00:08:32.720 like roboticize this here. 00:08:32.720 --> 00:08:35.140 Let me go ahead and full screen a quick video here 00:08:35.140 --> 00:08:38.679 that's just going to paint a picture in just a moment on the screen of how, 00:08:38.679 --> 00:08:41.470 if we're a really clever adversary and know how to build things, 00:08:41.470 --> 00:08:44.270 well, at least, maybe we could automate some of that process. 00:08:44.270 --> 00:08:46.420 So here's an Android phone sitting on a counter. 00:08:46.420 --> 00:08:50.920 Here's a very simple tripod and a little touch device robotically doing all 00:08:50.920 --> 00:08:56.260 of that hacking for you starting at 0000 probably all the way up to 9999.
00:08:56.260 --> 00:09:00.340 Now, that too wasn't necessarily all that fast, but at least, 00:09:00.340 --> 00:09:02.530 the adversary can step away and doesn't actually 00:09:02.530 --> 00:09:05.140 have to be bothered with the time involved, 00:09:05.140 --> 00:09:08.390 the cost involved, in actually hacking that particular device. 00:09:08.390 --> 00:09:11.800 Well, let's go one level deeper, a little more interestingly, 00:09:11.800 --> 00:09:17.950 and consider here how much time really this so-called brute force 00:09:17.950 --> 00:09:18.700 attack would take. 00:09:18.700 --> 00:09:21.825 And that's actually a term of art, much like in yesteryear when maybe there 00:09:21.825 --> 00:09:24.820 was a battering ram trying to brute force their way into a castle 00:09:24.820 --> 00:09:25.870 or something like that. 00:09:25.870 --> 00:09:29.380 A brute force attack digitally is just someone trying manually 00:09:29.380 --> 00:09:32.950 all possible codes or maybe robotically trying all possible codes, 00:09:32.950 --> 00:09:35.500 but generally automating the process in some way 00:09:35.500 --> 00:09:37.270 to go through all possibilities. 00:09:37.270 --> 00:09:42.820 Well, if you've got, for instance, a four-digit passcode-- 00:09:42.820 --> 00:09:46.150 let's ask maybe a follow-up question here, not how long it will take, 00:09:46.150 --> 00:09:51.190 but how many possible four-digit passcodes are there? 00:09:51.190 --> 00:09:53.420 Because then maybe, we can do some quick math. 00:09:53.420 --> 00:09:57.132 And if every passcode takes me a second, or a few milliseconds, or the like, 00:09:57.132 --> 00:10:00.340 then I think we can try to extrapolate from that whether the first answer was 00:10:00.340 --> 00:10:03.920 seconds, or minutes, or days, or hours, or something else. 00:10:03.920 --> 00:10:06.815 So how many four-digit passcodes are possible? 
00:10:06.815 --> 00:10:09.940 If you take out your same device, it should have just changed automatically. 00:10:09.940 --> 00:10:13.750 If it doesn't seem to have, maybe reload your browser with some menu option. 00:10:13.750 --> 00:10:17.740 And then tap in here, how many four-digit passcodes are possible? 00:10:17.740 --> 00:10:20.860 Four total, 40, 9,999, 10,000. 00:10:20.860 --> 00:10:24.610 Or unsure is OK too. 00:10:24.610 --> 00:10:25.510 So let's see. 00:10:25.510 --> 00:10:27.900 We'll give you a few more moments. 00:10:27.900 --> 00:10:30.600 How many four-digit passcodes are possible? 00:10:30.600 --> 00:10:33.160 And shall we reveal the results? 00:10:33.160 --> 00:10:37.230 So now, it looks like a few of you-- 00:10:37.230 --> 00:10:42.540 2% of you are saying just four passcodes, 40, 9,999. 00:10:42.540 --> 00:10:44.340 There's definitely some contention here. 00:10:44.340 --> 00:10:45.900 And 6% are unsure. 00:10:45.900 --> 00:10:48.850 Well, how do we wrap our minds around this? 00:10:48.850 --> 00:10:50.760 Well, let's just do this real simple here. 00:10:50.760 --> 00:10:54.160 Let me switch back over to doing a bit of math. 00:10:54.160 --> 00:10:57.932 And if we have here 10 possibilities for each digit, if there's four digits, 00:10:57.932 --> 00:11:01.140 each digit can be zero, one, two, three, four, five, six, seven, eight, nine. 00:11:01.140 --> 00:11:02.590 So that's 10 possibilities. 00:11:02.590 --> 00:11:05.490 So if you think about the number of permutations, 00:11:05.490 --> 00:11:09.540 that's 10 possibilities for the first digit times 10 for the next, times 10 00:11:09.540 --> 00:11:11.290 for the next, times 10 for the next. 00:11:11.290 --> 00:11:14.130 And so if we do that out, 10 times 10 times 10 times 10, or 10 00:11:14.130 --> 00:11:20.220 to the fourth, there are, indeed-- and 66% of you found 10,000 possibilities.
00:11:20.220 --> 00:11:22.620 And so now we can kind of work backwards and decide, 00:11:22.620 --> 00:11:25.800 how long is it going to take for an adversary to hack into this phone? 00:11:25.800 --> 00:11:28.800 Because if it's one attack, one guess per second, 00:11:28.800 --> 00:11:31.470 well, that's going to map out to 10,000 seconds, 00:11:31.470 --> 00:11:34.560 but maybe not if the adversary isn't a roboticist or a human. 00:11:34.560 --> 00:11:37.170 What if they're a software programmer or someone who 00:11:37.170 --> 00:11:40.560 has taken even a class introductory, like CS50 and learned 00:11:40.560 --> 00:11:41.760 a little bit of programming? 00:11:41.760 --> 00:11:43.590 Well, a little bit frighteningly, it's not 00:11:43.590 --> 00:11:46.470 all that hard to hack into systems if you just 00:11:46.470 --> 00:11:49.890 know how to code, too, and really have the computer do your work for you. 00:11:49.890 --> 00:11:53.190 So in fact, let me go ahead and change over to another screen on my computer 00:11:53.190 --> 00:11:53.910 here. 00:11:53.910 --> 00:11:56.307 This is different for students in the group from VS Code. 00:11:56.307 --> 00:11:58.140 This is just a black and white version of it 00:11:58.140 --> 00:11:59.670 that we've used briefly in the past. 00:11:59.670 --> 00:12:03.090 And I'm just going to go ahead and create a program called crack.py. 00:12:03.090 --> 00:12:05.160 To crack something just technically means 00:12:05.160 --> 00:12:08.380 to figure out what it is, figure out a password in this case. 00:12:08.380 --> 00:12:11.730 And .py means I'm going to use a programming language that we here 00:12:11.730 --> 00:12:15.390 in CS50 have been dabbling in the past couple of weeks with more to come next 00:12:15.390 --> 00:12:16.300 week as well. 
00:12:16.300 --> 00:12:20.220 So it turns out, and you need not understand each of these lines of code, 00:12:20.220 --> 00:12:25.518 if I want to try, maybe, generating all 10,000 possible codes, 00:12:25.518 --> 00:12:27.060 I'm not going to bother with a robot. 00:12:27.060 --> 00:12:29.250 I've got all these cables coming out of my computer. 00:12:29.250 --> 00:12:32.250 And odds are one of them is a USB cable or a lightning cable. 00:12:32.250 --> 00:12:36.120 Surely, we could figure out how to connect laptop or desktop to phone 00:12:36.120 --> 00:12:39.810 and just automate the process nowadays by just sending all of the numbers 00:12:39.810 --> 00:12:44.220 into the phone until one unlocks the trick just like in the movies or TV. 00:12:44.220 --> 00:12:47.410 Well, in Python, I could write a program that does this as follows. 00:12:47.410 --> 00:12:51.510 I can import, so to speak, all of the decimal digits, zero through nine. 00:12:51.510 --> 00:12:54.990 And this, for students in the room, is just a slightly better version 00:12:54.990 --> 00:12:57.580 of typing out 10 different numbers manually. 00:12:57.580 --> 00:13:00.540 I can also import from a library, so to speak, 00:13:00.540 --> 00:13:02.910 called itertools for iteration tools, which 00:13:02.910 --> 00:13:04.720 means to do something again and again. 00:13:04.720 --> 00:13:09.210 I can import a function called product, which means the cross product. 00:13:09.210 --> 00:13:11.850 Combine this with this some number of times. 00:13:11.850 --> 00:13:13.680 And then it's just two more lines of code. 00:13:13.680 --> 00:13:15.850 I can use what's called a loop in programming. 
00:13:15.850 --> 00:13:19.770 So for every passcode in the cross product of all 10 00:13:19.770 --> 00:13:24.420 of those digits repeated, a total of four times-- 00:13:24.420 --> 00:13:27.480 let me go ahead and-- rather than bother connecting my phone 00:13:27.480 --> 00:13:29.520 and hacking my own phone, let me just print out 00:13:29.520 --> 00:13:31.680 every one of those 10,000 codes on the screen, 00:13:31.680 --> 00:13:33.833 and we'll see how fast the hacker could do this. 00:13:33.833 --> 00:13:34.500 Let me go ahead 00:13:34.500 --> 00:13:37.750 and print. And with an asterisk, which is a little trick to format it nicely, 00:13:37.750 --> 00:13:39.750 I'm going to print out each of those passcodes. 00:13:39.750 --> 00:13:43.350 And that's it, four lines of code, maybe 40 seconds of talking, 00:13:43.350 --> 00:13:45.960 but maybe really four seconds of coding if I actually 00:13:45.960 --> 00:13:47.250 did this without the audience. 00:13:47.250 --> 00:13:49.470 And now let me go ahead and save the file. 00:13:49.470 --> 00:13:52.410 And I'm going to run, as we do every day in class of late, 00:13:52.410 --> 00:13:54.370 Python of crack.py. 00:13:54.370 --> 00:13:57.840 And when I hit Enter, I should see on the screen all 10,000 possibilities 00:13:57.840 --> 00:14:00.180 from 0000 to 9999. 00:14:00.180 --> 00:14:00.810 So let's see. 00:14:00.810 --> 00:14:04.560 Is it a few seconds, minutes, hours, or days? 00:14:04.560 --> 00:14:05.790 Done. 00:14:05.790 --> 00:14:08.518 So barely even seconds plural if that. 00:14:08.518 --> 00:14:11.310 So that should be a little disconcerting because all that adversary 00:14:11.310 --> 00:14:14.393 needs to do is grab your phone off the counter, plug in a cable, and boom. 00:14:14.393 --> 00:14:15.390 They're done. 00:14:15.390 --> 00:14:18.720 There's no ticking clock or worries as in the movies or TV 00:14:18.720 --> 00:14:20.700 that maybe you're going to come into the room.
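NOTE
The four lines described above can be sketched as follows. This is a reconstruction from the spoken description, not the file shown on screen; the `sep=""` argument is an addition here so that each code prints without spaces between its digits.

```python
# crack.py (reconstruction): print every four-digit passcode in order.
from itertools import product
from string import digits  # the string "0123456789"

# product(digits, repeat=4) yields tuples such as ('0', '0', '0', '1').
for passcode in product(digits, repeat=4):
    # The asterisk unpacks the tuple into separate arguments to print;
    # sep="" (an assumption, see above) joins them with no spaces.
    print(*passcode, sep="")
```

Running it with `python crack.py` prints 10,000 lines, one per candidate passcode.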
00:14:20.700 --> 00:14:22.900 You don't need that much of a window of time. 00:14:22.900 --> 00:14:25.000 So what would be better than this? 00:14:25.000 --> 00:14:27.630 Well, let's consider what our options might 00:14:27.630 --> 00:14:30.392 be if we don't want to just use a four-digit passcode. 00:14:30.392 --> 00:14:32.850 Some of you, indeed, might have better passcodes than that. 00:14:32.850 --> 00:14:36.930 And maybe, you use four-letter passcodes instead, so A through Z, 00:14:36.930 --> 00:14:38.820 maybe uppercase and lowercase. 00:14:38.820 --> 00:14:41.110 That starts to make things a little more interesting. 00:14:41.110 --> 00:14:43.080 So should we poll this question too? 00:14:43.080 --> 00:14:46.592 If we upgrade from four digits to just four letters, 00:14:46.592 --> 00:14:49.800 English letters, A through Z, uppercase and lowercase-- why don't we go ahead 00:14:49.800 --> 00:14:55.260 and poll the group here and ask how many four-letter passcodes are there 00:14:55.260 --> 00:14:57.850 instead? 00:14:57.850 --> 00:15:00.510 So this time, the range starts at four. 00:15:00.510 --> 00:15:03.090 Still not the right answer, though, this time. 00:15:03.090 --> 00:15:05.800 How many four-letter passcodes are possible? 00:15:08.870 --> 00:15:11.090 [INAUDIBLE] 00:15:11.090 --> 00:15:12.290 Take a couple more seconds. 00:15:16.590 --> 00:15:18.030 All right. 00:15:18.030 --> 00:15:20.110 Almost a couple hundred responses in already. 00:15:20.110 --> 00:15:21.465 A few more seconds. 00:15:24.920 --> 00:15:31.070 And why don't we go ahead and reveal now the answers, which are-- 00:15:31.070 --> 00:15:35.690 OK, so we solved a couple of problems at least. 00:15:35.690 --> 00:15:37.430 OK, someone's just messing with us now. 00:15:37.430 --> 00:15:37.930 All right. 00:15:37.930 --> 00:15:39.620 So it looks like most of you-- 00:15:39.620 --> 00:15:43.350 76% of you have claimed it's seven million plus possibilities.
00:15:43.350 --> 00:15:46.040 So that's encouraging because that's a whole order of magnitude 00:15:46.040 --> 00:15:46.792 more than before. 00:15:46.792 --> 00:15:49.250 Well, let's figure out how we might do this mathematically. 00:15:49.250 --> 00:15:52.100 So if we've got 26 lowercase, 26 uppercase, 00:15:52.100 --> 00:15:55.020 that's 52 possibilities now for each of those four digits. 00:15:55.020 --> 00:15:57.680 So that's 52 times itself four times, which, 00:15:57.680 --> 00:16:00.530 indeed, either off the top of your head a good guess, 00:16:00.530 --> 00:16:03.500 a calculator on the same device you're using right now, indeed, 00:16:03.500 --> 00:16:05.790 gives us seven million instead. 00:16:05.790 --> 00:16:08.250 Well, what might be slightly better than that? 00:16:08.250 --> 00:16:09.837 Well, maybe four characters. 00:16:09.837 --> 00:16:12.170 And this, indeed, is what your Macs, PCs, and phones are 00:16:12.170 --> 00:16:15.860 urging us to do nowadays, not just numbers, not just letters, but really 00:16:15.860 --> 00:16:19.940 annoying punctuation, so it really looks cryptic not just to the adversary, 00:16:19.940 --> 00:16:22.030 but also to you and me, unfortunately. 00:16:22.030 --> 00:16:23.030 And that's the downside. 00:16:23.030 --> 00:16:27.500 But here now, we have a mental model, and really, a computational framework 00:16:27.500 --> 00:16:29.737 via which we can evaluate the security of these. 00:16:29.737 --> 00:16:31.820 And I'll go ahead and spoil some of the math here. 00:16:31.820 --> 00:16:37.080 If we've got 52 letters of the alphabet, uppercase and lowercase, 10 digits, 00:16:37.080 --> 00:16:40.850 and if I count them out on my keyboard, about 32 punctuation symbols 00:16:40.850 --> 00:16:43.190 in typical English grammar, that actually gives us 00:16:43.190 --> 00:16:49.230 94 possibilities now, which is up from 52, which is up from 10. 00:16:49.230 --> 00:16:50.780 So now, we're really moving. 
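NOTE
The counts above can be checked with the constants in Python's `string` module. This is a small sketch; the variable names are illustrative, and the punctuation count is simply whatever `string.punctuation` contains, which happens to match the figure of 32 quoted above.

```python
from string import ascii_letters, digits, punctuation

# 26 lowercase + 26 uppercase letters, four positions
letters_only = len(ascii_letters) ** 4

# Letters, digits, and punctuation together
alphabet_size = len(ascii_letters) + len(digits) + len(punctuation)

print(f"{letters_only:,} four-letter passcodes")
print(f"{alphabet_size} symbols when digits and punctuation are included")
```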
00:16:50.780 --> 00:16:53.700 And now that would give us 78 million possibilities, 00:16:53.700 --> 00:16:55.280 so another order of magnitude. 00:16:55.280 --> 00:16:58.202 Now, it's still going to be relatively fast because you know what? 00:16:58.202 --> 00:16:59.160 I can actually do this. 00:16:59.160 --> 00:17:01.220 Let me go back into my code here. 00:17:01.220 --> 00:17:03.500 Let me reopen this same program. 00:17:03.500 --> 00:17:06.750 And I can point out just how easy it is to make these changes. 00:17:06.750 --> 00:17:11.630 Instead of importing digits as before, I can import, as your child might know, 00:17:11.630 --> 00:17:14.940 ascii letters, which are A through Z, uppercase, lowercase. 00:17:14.940 --> 00:17:18.319 And I can just change this here, ascii letters. 00:17:18.319 --> 00:17:21.349 And so this was that first version where we just changed to letters. 00:17:21.349 --> 00:17:22.640 Let me now rerun the code. 00:17:22.640 --> 00:17:26.220 And instead of seeing numbers, we'll see letters flying across the screen. 00:17:26.220 --> 00:17:29.060 And if I walk over here to the screen, we'll see that. 00:17:29.060 --> 00:17:33.170 By the time I get here, we're halfway through the entire alphabet lowercase. 00:17:33.170 --> 00:17:35.120 If I now start walking away, I think, yeah, 00:17:35.120 --> 00:17:37.700 we're already done now with uppercase as well. 00:17:37.700 --> 00:17:39.990 If I upgrade this slightly further-- 00:17:39.990 --> 00:17:44.300 let's go ahead and take it one more level and, perhaps, do, let's say, 00:17:44.300 --> 00:17:47.870 ascii letters, and digits, and punctuation. 00:17:47.870 --> 00:17:50.550 And this would be the Pythonic way to say that. 00:17:50.550 --> 00:17:53.600 And I'm going to add to those letters those same digits, 00:17:53.600 --> 00:17:55.940 those same punctuation symbols. 00:17:55.940 --> 00:17:58.700 Let me shrink my font just so the code still fits on the screen. 
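NOTE
The change described above, swapping the digits for letters, digits, and punctuation, might look like the sketch below. The `passcodes` helper and the use of `islice` to stop after five codes are assumptions made for illustration; the program run on stage prints every code.

```python
from itertools import islice, product
from string import ascii_letters, digits, punctuation

# Concatenating the three strings gives one 94-character alphabet.
alphabet = ascii_letters + digits + punctuation

def passcodes(length):
    """Yield every string of the given length over the alphabet (hypothetical helper)."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)

# Show only the first five; the full run has 94**4 candidates.
for code in islice(passcodes(4), 5):
    print(code)
```

Because `ascii_letters` lists lowercase before uppercase, the output starts in the lowercase letters, which is consistent with what scrolls past on screen.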
00:17:58.700 --> 00:18:01.880 And what we now have is, with two seconds of changes, 00:18:01.880 --> 00:18:04.490 a program that if I run this version-- 00:18:04.490 --> 00:18:08.100 whoops-- without the typographical error-- 00:18:08.100 --> 00:18:10.520 this is what we call in CS50 a bug. 00:18:10.520 --> 00:18:12.780 So now, we run the same-- 00:18:12.780 --> 00:18:17.030 this is what we call in CS50 a second bug-- 00:18:17.030 --> 00:18:17.846 punctuation. 00:18:20.490 --> 00:18:21.865 This is where I cross my fingers. 00:18:21.865 --> 00:18:22.365 OK. 00:18:22.365 --> 00:18:25.440 So now it's going to be a little hard to see as it flies across the screen. 00:18:25.440 --> 00:18:28.700 But you probably are seeing glimpses of some weird punctuation characters 00:18:28.700 --> 00:18:29.250 as well. 00:18:29.250 --> 00:18:32.270 And I won't waste our time trying to talk through this because this 00:18:32.270 --> 00:18:33.470 is going to take longer. 00:18:33.470 --> 00:18:34.700 We're still in the lowercase. 00:18:34.700 --> 00:18:35.930 I'm still over here already. 00:18:35.930 --> 00:18:39.290 We've not even gotten to N, now O, then P. 00:18:39.290 --> 00:18:40.790 So this is going to run longer. 00:18:40.790 --> 00:18:44.630 But let's end with one final question on the security of all these systems. 00:18:44.630 --> 00:18:47.540 I'm going to cancel that by hitting Control C on my keyboard. 00:18:47.540 --> 00:18:51.950 And let's ask the question instead, if we use eight-character passwords, 00:18:51.950 --> 00:18:53.360 so twice as many characters. 00:18:53.360 --> 00:18:55.790 But even that is not terribly long. 00:18:55.790 --> 00:18:58.670 This is eight characters alone on the stage, eight characters. 00:18:58.670 --> 00:19:01.100 Using letters, numbers, and punctuation might be better. 00:19:01.100 --> 00:19:03.530 Let's do one final vote here, if we could.
00:19:03.530 --> 00:19:07.700 On your same device, how many eight-character possibilities 00:19:07.700 --> 00:19:12.800 are there now for these passcodes? 00:19:12.800 --> 00:19:17.000 And now four didn't even make the list this time. 00:19:17.000 --> 00:19:21.363 All right, a few more seconds, about 100 responses so far. 00:19:21.363 --> 00:19:23.780 How about we go ahead-- and, Carter, if you wouldn't mind, 00:19:23.780 --> 00:19:28.220 let's reveal the results based on the vote, a pretty decent spread here. 00:19:28.220 --> 00:19:30.317 Although the quadrillions are quickly buzzing in. 00:19:30.317 --> 00:19:32.150 And they're contending with the others here. 00:19:32.150 --> 00:19:34.700 Looks like 44% of you said quintillion. 00:19:34.700 --> 00:19:36.740 34% said quadrillion. 00:19:36.740 --> 00:19:39.800 And this time, for the first time, you overbid. 00:19:39.800 --> 00:19:44.270 So indeed, if we go back to the math here, at least, the majority overbid. 00:19:44.270 --> 00:19:46.160 If we have eight-character passcodes that 00:19:46.160 --> 00:19:50.600 gives us 94 times itself eight times or 94 to the eighth power. 00:19:50.600 --> 00:20:01.430 And in fact, that gives us roughly 6,095,689,385,410,816 00:20:01.430 --> 00:20:02.630 possible passcodes. 00:20:02.630 --> 00:20:04.140 Now, what does that mean? 00:20:04.140 --> 00:20:07.700 Well, the adversary's algorithm, the step-by-step code 00:20:07.700 --> 00:20:10.580 that they write to try to hack into your phone, is no different. 00:20:10.580 --> 00:20:13.670 And honestly, if your passcode is eight characters long, 00:20:13.670 --> 00:20:18.500 but they're all 00000000, you're no more secure fundamentally.
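NOTE
The figure above can be checked directly and turned into a rough worst-case time. The sketch below assumes one guess per second, the same rate used earlier for the four-digit case; an attacker running software could be far faster, so the result is an illustration of scale rather than a prediction.

```python
from string import ascii_letters, digits, punctuation

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignoring leap years

alphabet_size = len(ascii_letters + digits + punctuation)
keyspace = alphabet_size ** 8  # eight positions, each with 94 choices

# Worst case: the correct passcode is the very last one tried.
GUESSES_PER_SECOND = 1  # assumption, not a measured rate
years = keyspace / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{keyspace:,} possible passcodes")
print(f"roughly {years:,.0f} years to try them all")
```

A passcode near the start of the enumeration order, such as one made of a single repeated character, falls long before that worst case, which is the point made above.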
00:20:18.500 --> 00:20:22.010 You really want to be somewhere in the sweet spot of that massive range 00:20:22.010 --> 00:20:25.200 of values, so that if the adversary tries this brute force attack just 00:20:25.200 --> 00:20:28.590 running through all possibilities, they will eventually reach 00:20:28.590 --> 00:20:31.320 your passcode just mathematically. 00:20:31.320 --> 00:20:32.400 It will be there. 00:20:32.400 --> 00:20:34.920 Hopefully, though-- well, maybe not hopefully-- you 00:20:34.920 --> 00:20:37.380 and I and they will be gone from this world 00:20:37.380 --> 00:20:39.840 because that much time will have passed. 00:20:39.840 --> 00:20:44.440 And if we do out the math here, this number of seconds, for instance, 00:20:44.440 --> 00:20:48.940 is long past when we will no longer be here. 00:20:48.940 --> 00:20:50.400 So that's the sort of measures. 00:20:50.400 --> 00:20:53.460 We don't sort of fundamentally change the equation for the adversary. 00:20:53.460 --> 00:20:54.670 It's still the same risk. 00:20:54.670 --> 00:20:55.795 It's still the same attack. 00:20:55.795 --> 00:20:59.370 But you significantly drive down the probability of success on their part. 00:20:59.370 --> 00:21:02.880 Or conceptually, you drive up the cost to the adversary. 00:21:02.880 --> 00:21:05.850 And indeed, even in the physical world, this is true. 00:21:05.850 --> 00:21:08.310 You just want your passcode in the digital world 00:21:08.310 --> 00:21:11.220 really to be better than someone else's because you 00:21:11.220 --> 00:21:12.985 want someone else's passcode to be the one 00:21:12.985 --> 00:21:14.610 that the adversary does something with. 00:21:14.610 --> 00:21:18.000 Just like in the physical world, even though it's a bit uncomfortable 00:21:18.000 --> 00:21:21.990 to consider, your house doesn't need to be 100% secure. 00:21:21.990 --> 00:21:24.060 And indeed, it's difficult to make it such. 
00:21:24.060 --> 00:21:26.100 There's always going to be a point of weakness. 00:21:26.100 --> 00:21:28.810 Maybe it's that window, the door, or something like that. 00:21:28.810 --> 00:21:33.060 But if your home is more secure than the next door home, just probabilistically, 00:21:33.060 --> 00:21:34.710 you are more secure. 00:21:34.710 --> 00:21:35.760 You're not secure. 00:21:35.760 --> 00:21:37.710 And indeed, any website you see down the road 00:21:37.710 --> 00:21:41.400 that says, we are secure because we do X, Y, or Z, that's nonsense. 00:21:41.400 --> 00:21:44.910 Security is really about comparisons and evaluating things 00:21:44.910 --> 00:21:49.870 if quantitatively relative to some other system, relative to some other code. 00:21:49.870 --> 00:21:51.520 So what's the takeaway here? 00:21:51.520 --> 00:21:55.320 Well, hopefully, a non-trivial number of you will go home this weekend on Monday 00:21:55.320 --> 00:21:57.803 and change at least one passcode. 00:21:57.803 --> 00:21:59.470 But there's going to be a tradeoff here. 00:21:59.470 --> 00:22:01.260 We talk about this all the time in CS50. 00:22:01.260 --> 00:22:06.210 Any time we improve something, we pay some price in time, in performance, 00:22:06.210 --> 00:22:07.360 in cost, somewhere else. 00:22:07.360 --> 00:22:10.470 So what's the downside then of this advice that you should use 00:22:10.470 --> 00:22:13.710 minimally eight-character passcodes? 00:22:13.710 --> 00:22:16.350 Why might you want to say nay and not do this? 00:22:16.350 --> 00:22:17.835 AUDIENCE: You have to remember it. 00:22:17.835 --> 00:22:18.390 DAVID J. MALAN: Say again? 00:22:18.390 --> 00:22:19.350 AUDIENCE: You have to remember it. 00:22:19.350 --> 00:22:21.308 DAVID J. MALAN: You have to remember it, right? 00:22:21.308 --> 00:22:23.050 And so here, there is some sociology. 00:22:23.050 --> 00:22:24.570 There's some human behavior. 
00:22:24.570 --> 00:22:25.980 Some of you might have colleagues, if you're 00:22:25.980 --> 00:22:27.688 working in the real world, at least, back 00:22:27.688 --> 00:22:30.510 in healthier times when you had colleagues with desks in cubicles. 00:22:30.510 --> 00:22:33.218 And there's probably one person in the office with a post-it note 00:22:33.218 --> 00:22:35.370 on their monitor with their passcode. 00:22:35.370 --> 00:22:41.160 It's a bit of a cybersecurity offense, but it's also a real world side 00:22:41.160 --> 00:22:44.250 effect, maybe of corporate policies, that aren't really calibrated 00:22:44.250 --> 00:22:45.640 for human behavior. 00:22:45.640 --> 00:22:47.610 So we'll see if there's some other defenses. 00:22:47.610 --> 00:22:49.650 And indeed, let me propose that we talk briefly 00:22:49.650 --> 00:22:52.530 about one that actually tends to kick in automatically. 00:22:52.530 --> 00:22:54.840 Even if your passcode is not as strong as we've just 00:22:54.840 --> 00:22:57.940 seen, one of these six quadrillion possibilities, well, 00:22:57.940 --> 00:22:59.260 what could we do instead? 00:22:59.260 --> 00:23:01.590 Well, has anyone-- and I'll zoom in on this here-- 00:23:01.590 --> 00:23:05.940 accidentally locked themselves out of their own phone before? 00:23:05.940 --> 00:23:07.380 When does that happen? 00:23:07.380 --> 00:23:08.828 Yeah, when you try the password-- 00:23:08.828 --> 00:23:09.870 AUDIENCE: Too many times. 00:23:09.870 --> 00:23:11.020 DAVID J. MALAN: Yeah, so too many times. 00:23:11.020 --> 00:23:12.820 Maybe your finger is slightly off. 00:23:12.820 --> 00:23:17.010 Maybe you're slightly off, and you just don't input the same passcode correctly 00:23:17.010 --> 00:23:18.803 after five times, 10 times. 00:23:18.803 --> 00:23:20.220 There's some reasonable threshold. 00:23:20.220 --> 00:23:21.580 And why does that happen? 
00:23:21.580 --> 00:23:24.270 Well, Apple and Google equivalently figure just 00:23:24.270 --> 00:23:27.060 probabilistically if after 10 guesses, you still 00:23:27.060 --> 00:23:29.932 haven't typed in the right passcode, probably, you're not you. 00:23:29.932 --> 00:23:31.890 You're someone else who's picked up your phone, 00:23:31.890 --> 00:23:33.660 so we're just going to go ahead and lock you out. 00:23:33.660 --> 00:23:35.200 Now, what's the effect of this? 00:23:35.200 --> 00:23:37.860 Well, this means now that each of those possible passcodes 00:23:37.860 --> 00:23:39.870 no longer takes roughly one second. 00:23:39.870 --> 00:23:41.890 Now it takes roughly one minute. 00:23:41.890 --> 00:23:43.360 So the attack is still the same. 00:23:43.360 --> 00:23:46.560 But if it's now one passcode or 10 guesses per minute, 00:23:46.560 --> 00:23:50.680 we have significantly by a factor of 60 in this story slowed things down. 00:23:50.680 --> 00:23:52.680 And unfortunately, does anyone know what happens 00:23:52.680 --> 00:23:54.967 if you screw up again after a minute? 00:23:54.967 --> 00:23:56.160 AUDIENCE: It goes longer. 00:23:56.160 --> 00:23:57.120 DAVID J. MALAN: Yeah, it goes longer. 00:23:57.120 --> 00:23:59.100 It's like five minutes and then 10 minutes. 00:23:59.100 --> 00:24:01.477 And Google is kind of obnoxious about it. 00:24:01.477 --> 00:24:03.060 They don't even give you a time frame. 00:24:03.060 --> 00:24:05.410 They just say, try again later. 00:24:05.410 --> 00:24:09.450 And so that keeps not only the adversary out, but also potentially you. 00:24:09.450 --> 00:24:11.017 So therein lies that tradeoff. 00:24:11.017 --> 00:24:14.100 If you've forgotten your code, if-- nowadays, your finger is slightly wet, 00:24:14.100 --> 00:24:16.230 so the screen isn't responding correctly. 00:24:16.230 --> 00:24:18.540 These could be usability downsides too. 
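The lockout behavior described above can be sketched roughly as follows. The threshold of 10 failures and the delays of one, five, and ten minutes are illustrative numbers borrowed from the discussion; they are assumptions, not the actual policy of Apple, Google, or any particular device.

```python
# A hypothetical escalating-lockout policy, for illustration only.

THRESHOLD = 10                   # failed attempts allowed before any lockout (assumed)
DELAYS_SECONDS = [60, 300, 600]  # 1 min, then 5 min, then 10 min (assumed)


def lockout_delay(failed_attempts: int) -> int:
    """Seconds the user must wait after this many consecutive failures."""
    if failed_attempts < THRESHOLD:
        return 0
    # Each further failure moves one step up the schedule, capped at the last entry.
    step = min(failed_attempts - THRESHOLD, len(DELAYS_SECONDS) - 1)
    return DELAYS_SECONDS[step]


def slowdown_factor(seconds_per_guess_before: float, seconds_per_guess_after: float) -> float:
    """How many times slower a brute-force attack becomes."""
    return seconds_per_guess_after / seconds_per_guess_before


print(lockout_delay(10))       # first lockout
print(slowdown_factor(1, 60))  # one guess per minute instead of one per second
```

The attack itself is unchanged; only the cost per guess goes up, and the same delay applies to the legitimate owner who mistypes.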
00:24:18.540 --> 00:24:21.810 So security is really just about finding the sweet spot 00:24:21.810 --> 00:24:24.180 among these various tradeoffs here. 00:24:24.180 --> 00:24:25.860 But there's other mechanisms too. 00:24:25.860 --> 00:24:30.030 And some of you might recognize this screen from Gmail via which, of course, 00:24:30.030 --> 00:24:30.660 you log in. 00:24:30.660 --> 00:24:36.030 But after you log into Gmail or similar websites, or apps, or systems at work 00:24:36.030 --> 00:24:39.780 nowadays, especially, you might be presented with what's 00:24:39.780 --> 00:24:42.240 called two-factor authentication. 00:24:42.240 --> 00:24:46.110 And what is this in a nutshell in layperson's terms? 00:24:46.110 --> 00:24:48.597 Many of you, if you do anything digitally at work, 00:24:48.597 --> 00:24:49.680 might have to do this now. 00:24:49.680 --> 00:24:50.180 Yeah? 00:24:50.180 --> 00:24:51.860 AUDIENCE: It sends a text to your phone. 00:24:51.860 --> 00:24:52.860 DAVID J. MALAN: Exactly. 00:24:52.860 --> 00:24:56.040 You get texted at your phone, an additional code that's 00:24:56.040 --> 00:24:57.180 not your same password. 00:24:57.180 --> 00:24:59.580 It's typically a numeric code, maybe six digits long. 00:24:59.580 --> 00:25:01.510 It expires after a minute or 10 minutes. 00:25:01.510 --> 00:25:03.030 But why is this a good thing? 00:25:03.030 --> 00:25:06.795 Well, one, it's no longer just a piece of information that you know 00:25:06.795 --> 00:25:08.760 or that you might have written down. 00:25:08.760 --> 00:25:12.180 It's information that changes every time you try to log in. 00:25:12.180 --> 00:25:15.100 But more importantly, it's a fundamentally second factor, which 00:25:15.100 --> 00:25:16.725 means it's not just something you know. 00:25:16.725 --> 00:25:18.180 Now it's something you have. 00:25:18.180 --> 00:25:20.832 So you, for instance, are the only one theoretically 00:25:20.832 --> 00:25:22.290 that should be receiving that code. 
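As a rough sketch of the one-time code idea, the Python below issues a random six-digit code that expires after a minute and checks a submitted code against it. The function names and the 60-second lifetime are assumptions for illustration; real services deliver and verify codes in their own ways, and this is not how any specific provider implements it.

```python
import secrets
import time

CODE_TTL_SECONDS = 60  # assumed lifetime; the lecture mentions a minute or 10 minutes


def issue_code(now=None):
    """Return (six-digit code, expiry timestamp). `now` can be passed in for testing."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # uniformly random, zero-padded to 6 digits
    return code, now + CODE_TTL_SECONDS


def verify(submitted: str, issued_code: str, expires_at: float, now=None) -> bool:
    """Accept only the exact code, and only before it expires.

    Assumes `submitted` is plain ASCII text, as a typed numeric code would be.
    """
    now = time.time() if now is None else now
    return now <= expires_at and secrets.compare_digest(submitted, issued_code)


code, expires_at = issue_code()
print(verify(code, code, expires_at))  # the right code, submitted in time
```

Because the code changes on every login and is delivered to a device you hold, knowing the password alone is no longer enough.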
00:25:22.290 --> 00:25:24.960 And so now the adversary, if they want to get into your account, 00:25:24.960 --> 00:25:28.800 not only have to guess, or brute force, or maybe read off of a post-it 00:25:28.800 --> 00:25:29.880 note your password. 00:25:29.880 --> 00:25:32.850 They also have to physically have access now to that phone. 00:25:32.850 --> 00:25:35.940 So there's still a threat, absolutely, but it's not everyone 00:25:35.940 --> 00:25:38.080 on the internet with an internet connection. 00:25:38.080 --> 00:25:39.900 Now it's only the people in Starbucks. 00:25:39.900 --> 00:25:41.430 Now it's only the people at work. 00:25:41.430 --> 00:25:44.400 Now it's only the people in your home who might have access 00:25:44.400 --> 00:25:45.400 to that second factor. 00:25:45.400 --> 00:25:48.690 So there too, it just raises the bar to the adversary making it harder, 00:25:48.690 --> 00:25:52.390 more time consuming, more geographically impossible for them to attack you. 00:25:52.390 --> 00:25:54.970 But what's the downside of two-factor authentication, 00:25:54.970 --> 00:25:56.850 whether it's a device-- or even nowadays, 00:25:56.850 --> 00:25:59.232 it's in software, whether it's on your keychain 00:25:59.232 --> 00:26:01.440 or on your phone where you're prompted for this code. 00:26:01.440 --> 00:26:05.910 What's a downside that some of us have probably experienced too? 00:26:05.910 --> 00:26:07.243 AUDIENCE: You forget your phone. 00:26:07.243 --> 00:26:09.535 DAVID J. MALAN: You forget your cell phone, absolutely. 00:26:09.535 --> 00:26:11.910 Right, the factor that you have, you don't have with you. 00:26:11.910 --> 00:26:14.040 Or maybe, you're in a basement somewhere, don't have reception. 00:26:14.040 --> 00:26:14.790 You're on a plane. 00:26:14.790 --> 00:26:15.850 You can't get the code. 00:26:15.850 --> 00:26:17.850 And so there too are these tradeoffs. 
00:26:17.850 --> 00:26:19.920 And even IT departments need to keep that in mind 00:26:19.920 --> 00:26:21.462 because what does that mean for them? 00:26:21.462 --> 00:26:23.370 Well, if you don't have your phone with you 00:26:23.370 --> 00:26:25.980 and you are in the habit of calling IT to help you fix this, 00:26:25.980 --> 00:26:29.740 now there's a cost, a human cost, maybe even a financial cost. 00:26:29.740 --> 00:26:34.022 And so IT policy nowadays is really just about finding the right balance 00:26:34.022 --> 00:26:35.730 and where we want to spend our resources, 00:26:35.730 --> 00:26:38.850 but at least raise the bar to the adversary. 00:26:38.850 --> 00:26:41.280 But of course, there's other ways too. 00:26:41.280 --> 00:26:43.680 And this is going to be one of our homework assignments 00:26:43.680 --> 00:26:44.880 if you will after today. 00:26:44.880 --> 00:26:46.920 There's this software called password managers. 00:26:46.920 --> 00:26:50.280 And no need to buzz in on your phone, but maybe with a physical hand. 00:26:50.280 --> 00:26:54.330 How many folks here use a password manager? 00:26:54.330 --> 00:26:57.660 OK, let me ballpark this at 10%, 20%, perhaps. 00:26:57.660 --> 00:27:00.960 So we've got 80% upside here and a lesson learned potentially. 00:27:00.960 --> 00:27:03.450 So a password manager is just a piece of software 00:27:03.450 --> 00:27:07.287 on your Mac, your PC, or your phone nowadays that manages your passwords. 00:27:07.287 --> 00:27:08.370 Well, what does that mean? 00:27:08.370 --> 00:27:10.433 When you go to a website for the first time 00:27:10.433 --> 00:27:13.600 or you download an app for the first time and you have to create an account, 00:27:13.600 --> 00:27:16.918 you can still use your email address, or David as your username, 00:27:16.918 --> 00:27:18.210 or whatever your name might be. 00:27:18.210 --> 00:27:20.250 So you don't have to change that methodology. 
00:27:20.250 --> 00:27:26.400 But instead of typing in 123456 as your same password for that website or app 00:27:26.400 --> 00:27:30.360 as well as for every other, now you use the password manager software 00:27:30.360 --> 00:27:33.570 to generate something difficult to guess for you. 00:27:33.570 --> 00:27:37.350 That is you tell the password manager, give me an eight-character random 00:27:37.350 --> 00:27:41.260 passcode, not 0000, but something with punctuation, with numbers, 00:27:41.260 --> 00:27:41.850 with letters. 00:27:41.850 --> 00:27:44.670 And better yet, the password manager, as the name suggests, 00:27:44.670 --> 00:27:46.857 remembers that password for you. 00:27:46.857 --> 00:27:48.690 And the next time you go to another website, 00:27:48.690 --> 00:27:50.898 you do it again with a completely different password, 00:27:50.898 --> 00:27:54.300 maybe same username, maybe two-factor authentication, but different password, 00:27:54.300 --> 00:27:56.040 different password, different password. 00:27:56.040 --> 00:27:57.373 And it doesn't have to be eight. 00:27:57.373 --> 00:28:01.410 I mean, I'm in the habit of using a dozen, two dozen characters in total. 00:28:01.410 --> 00:28:04.440 And at that point, I can't even pronounce the number of possibilities 00:28:04.440 --> 00:28:07.560 because it goes well beyond the quadrillions. 00:28:07.560 --> 00:28:10.290 So the probability that someone's going to get into one of those 00:28:10.290 --> 00:28:13.170 accounts for me now is very, very, very low. 00:28:13.170 --> 00:28:16.140 And they're going to take less interest in me and maybe more interest 00:28:16.140 --> 00:28:18.660 in someone else that's not using as good of a password. 00:28:18.660 --> 00:28:20.380 Now, what does this mean in real terms? 00:28:20.380 --> 00:28:23.250 Well, when you go to log into that managed site, 00:28:23.250 --> 00:28:26.297 you don't manually type your password anymore. 
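What a password manager does when it generates a password can be approximated in a few lines of Python. This is only a sketch: `generate_password` is a hypothetical name, the default length of 24 is an assumption echoing the "two dozen characters" mentioned above, and real password managers also handle encrypted storage and autofill, which is not shown here.

```python
import secrets
import string

# Letters, digits, and punctuation: the 94 printable characters from the earlier math.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Pick each character independently with a cryptographically secure generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(len(ALPHABET))          # 94
print(generate_password())    # different on every run
print(generate_password(8))   # the eight-character case
```

The `secrets` module is used rather than `random` because it is designed for security-sensitive values.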
00:28:26.297 --> 00:28:28.380 In fact, you don't generally even need to know it. 00:28:28.380 --> 00:28:32.820 Nowadays, I probably don't know 90-plus, 99% of my passwords. 00:28:32.820 --> 00:28:35.190 I entrust them to this password manager. 00:28:35.190 --> 00:28:40.140 Now, of course, you'd like to think that the password manager itself is secure. 00:28:40.140 --> 00:28:41.620 So what might that mean? 00:28:41.620 --> 00:28:43.770 Well, those of you who do use a password manager, 00:28:43.770 --> 00:28:46.650 how do you access that software itself? 00:28:46.650 --> 00:28:49.615 What's protecting your data in your understanding? 00:28:49.615 --> 00:28:50.490 AUDIENCE: Biometrics. 00:28:50.490 --> 00:28:53.370 DAVID J. MALAN: So maybe biometrics, like your face ID, or maybe 00:28:53.370 --> 00:28:55.645 your fingerprint, or maybe more simply, what else? 00:28:55.645 --> 00:28:56.520 AUDIENCE: A password. 00:28:56.520 --> 00:28:58.103 DAVID J. MALAN: Maybe just a password. 00:28:58.103 --> 00:29:02.790 And hopefully, that password, that primary password, that gatekeeper, 00:29:02.790 --> 00:29:05.190 is not itself 123456. 00:29:05.190 --> 00:29:08.520 Otherwise, it doesn't matter how secure all of the others are. 00:29:08.520 --> 00:29:12.240 But if you're willing to put in the effort and pick one pretty long, 00:29:12.240 --> 00:29:15.390 somewhat random, very unguessable password that you just 00:29:15.390 --> 00:29:17.070 promise to commit to memory-- 00:29:17.070 --> 00:29:19.140 and maybe for backup, you literally print it out 00:29:19.140 --> 00:29:21.180 and put it in a safe deposit box or a safe, 00:29:21.180 --> 00:29:23.490 or just hide it somewhere physically that there's 00:29:23.490 --> 00:29:25.350 very low probability someone's going to find 00:29:25.350 --> 00:29:27.630 the backup copy-- that might be enough.
00:29:27.630 --> 00:29:31.980 But of course, the flip side is now if you forget that primary password, 00:29:31.980 --> 00:29:34.590 you've now lost all of the eggs in the basket. 00:29:34.590 --> 00:29:38.830 If someone gets that primary password, now they have access to everything. 00:29:38.830 --> 00:29:40.350 So that's rather the tradeoff. 00:29:40.350 --> 00:29:45.030 But I dare say you're probably less threatened, depending on your family, 00:29:45.030 --> 00:29:48.090 by the people immediately around you than the billions 00:29:48.090 --> 00:29:51.420 of other people on the internet that have access, potentially, 00:29:51.420 --> 00:29:52.648 to those same systems. 00:29:52.648 --> 00:29:53.940 So there, too, it's a tradeoff. 00:29:53.940 --> 00:29:55.732 But it's up to you to decide whether or not 00:29:55.732 --> 00:29:57.880 to manage your passwords in this way. 00:29:57.880 --> 00:30:00.600 But if you were on that top 10 list, or even if you're not, 00:30:00.600 --> 00:30:04.110 but you can think of several accounts that all have the same password, 00:30:04.110 --> 00:30:06.550 you're probably going to benefit from something like this. 00:30:06.550 --> 00:30:11.310 And why is it bad, to be clear, to use the same password on multiple sites 00:30:11.310 --> 00:30:15.780 in case that's never sort of dawned in thought? 00:30:15.780 --> 00:30:18.480 Why is that a bad thing, to reuse a password 00:30:18.480 --> 00:30:20.340 on different websites, different apps? 00:30:20.340 --> 00:30:21.270 Any intuition? 00:30:21.270 --> 00:30:22.050 Yeah, in the back? 00:30:22.050 --> 00:30:24.550 AUDIENCE: Once attacked, it's easy to get to. 00:30:24.550 --> 00:30:25.550 DAVID J. MALAN: Exactly. 
00:30:25.550 --> 00:30:27.170 Once it's attacked, you can-- 00:30:27.170 --> 00:30:29.270 the adversary, presumably, by transitivity, 00:30:29.270 --> 00:30:31.880 can see, oh, well, if this user's username is 00:30:31.880 --> 00:30:35.030 malan@harvard.edu on this website, and their password is foolishly 00:30:35.030 --> 00:30:38.840 123456 or even something way more complicated, 00:30:38.840 --> 00:30:41.060 they can probably just assume with high probability 00:30:41.060 --> 00:30:42.860 that if I'm being a little reckless, let's 00:30:42.860 --> 00:30:47.210 try accessing malan@harvard.edu's other accounts, other apps using 00:30:47.210 --> 00:30:48.358 that exact same password. 00:30:48.358 --> 00:30:50.150 And so by transitivity, essentially, you're 00:30:50.150 --> 00:30:52.530 putting your other accounts at risk. 00:30:52.530 --> 00:30:53.780 So what's maybe a takeaway? 00:30:53.780 --> 00:30:56.780 Minimally here, I would start to reconsider your passcodes 00:30:56.780 --> 00:30:58.020 on your most important data. 00:30:58.020 --> 00:30:58.730 Maybe it's medical. 00:30:58.730 --> 00:30:59.600 Maybe it's financial. 00:30:59.600 --> 00:31:00.308 Maybe it's email. 00:31:00.308 --> 00:31:04.070 Anything remotely personal that you really wouldn't want others to have access to. 00:31:04.070 --> 00:31:06.890 Do you necessarily need the same level of security 00:31:06.890 --> 00:31:10.130 on e-commerce sites or sites that you don't really care about 00:31:10.130 --> 00:31:13.080 or that you signed up for once and after that, that's it? 00:31:13.080 --> 00:31:13.950 Probably not. 00:31:13.950 --> 00:31:15.960 So you can decide for yourself, but again, 00:31:15.960 --> 00:31:17.570 software like a password manager-- 00:31:17.570 --> 00:31:19.570 and these are just some of the possibilities out 00:31:19.570 --> 00:31:21.965 there-- is probably going to be your friend. 00:31:21.965 --> 00:31:23.090 A couple of these are free.
00:31:23.090 --> 00:31:24.550 They come with Windows or Mac OS. 00:31:24.550 --> 00:31:25.550 A couple are commercial. 00:31:25.550 --> 00:31:29.070 Harvard has a site license for students for one of these as well. 00:31:29.070 --> 00:31:30.770 So there are options out there. 00:31:30.770 --> 00:31:32.420 But what else do people use? 00:31:32.420 --> 00:31:35.060 What else can people use to keep their systems secure? 00:31:35.060 --> 00:31:39.590 So most of us nowadays have probably heard of encryption, this technique 00:31:39.590 --> 00:31:41.400 for just scrambling information. 00:31:41.400 --> 00:31:45.200 So when you want to send a message, an email, or upload a photograph, 00:31:45.200 --> 00:31:49.085 or use your credit card, hopefully, it's not just being sent out for all to see, 00:31:49.085 --> 00:31:50.960 but there's some kind of scrambling going on. 00:31:50.960 --> 00:31:53.600 And some fancy mathematics ensures-- that is, encryption 00:31:53.600 --> 00:31:57.200 ensures-- that only you, the sender, and someone else, the receiver, 00:31:57.200 --> 00:32:01.580 can theoretically see what that credit card number is, what that message is, 00:32:01.580 --> 00:32:03.330 what that photograph is instead. 00:32:03.330 --> 00:32:06.200 So encryption is sort of commonplace nowadays, 00:32:06.200 --> 00:32:09.555 both in websites, and apps, and ATMs, and other such devices. 00:32:09.555 --> 00:32:10.430 But how does it work? 00:32:10.430 --> 00:32:13.238 Well, back in week two of CS50, your child 00:32:13.238 --> 00:32:15.530 learned a little something about encryption, otherwise 00:32:15.530 --> 00:32:16.640 known as cryptography. 00:32:16.640 --> 00:32:18.770 And one of the algorithms we talked about 00:32:18.770 --> 00:32:21.160 was quite simply something like this.
00:32:21.160 --> 00:32:25.952 This is what we might call, not only CS50, but plain text, so very plain 00:32:25.952 --> 00:32:28.910 text that, in this case, is English and obviously, everyone in the room 00:32:28.910 --> 00:32:29.670 can read it. 00:32:29.670 --> 00:32:32.600 But what if I wanted to send this message out to someone in this room, 00:32:32.600 --> 00:32:35.720 or out on the internet, or maybe equivalently back in the day, 00:32:35.720 --> 00:32:38.690 maybe write a message down on a scrap of paper in grade school 00:32:38.690 --> 00:32:42.560 and pass a secret note, a secret love note to someone in class with hopes 00:32:42.560 --> 00:32:44.990 that the teacher or any other students in the class 00:32:44.990 --> 00:32:46.400 can't intercept it and read it? 00:32:46.400 --> 00:32:49.880 Well, you probably don't want to say, this is CS50, or I love you, 00:32:49.880 --> 00:32:52.280 or anything remotely sensitive. 00:32:52.280 --> 00:32:54.290 But rather, maybe you want to encrypt it. 00:32:54.290 --> 00:32:56.510 And let's change the T to a U. 00:32:56.510 --> 00:33:01.430 Maybe change the H to an I, the I to a J, the S to a T, the I to a J, 00:33:01.430 --> 00:33:04.580 the S to a T again, the C to a D, the S to a T. 00:33:04.580 --> 00:33:06.440 And we'll just leave the numbers alone, even 00:33:06.440 --> 00:33:09.330 though I worry someone could probably guess what this now does say, 00:33:09.330 --> 00:33:10.070 nonetheless. 00:33:10.070 --> 00:33:13.820 But what was the algorithm as I rattled those changes off, 00:33:13.820 --> 00:33:17.715 whether a student from week two or parents from week now? 00:33:17.715 --> 00:33:18.215 Yeah? 00:33:18.215 --> 00:33:19.730 AUDIENCE: It's a one-letter shift. 00:33:19.730 --> 00:33:21.397 DAVID J. MALAN: Just a one-letter shift. 00:33:21.397 --> 00:33:25.940 And this is more sophisticatedly called a rotational cipher or a Caesar cipher 00:33:25.940 --> 00:33:27.320 after Caesar back in the day.
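The one-letter shift just described can be written as a short Python sketch. The function name `caesar` and its `key` parameter are choices made for this illustration; letters rotate through the alphabet and everything else, digits included, is left alone, matching the example above.

```python
import string


def caesar(text: str, key: int) -> str:
    """Shift each letter by `key` positions, wrapping around from Z back to A."""
    shifted = []
    for ch in text:
        if ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        else:
            shifted.append(ch)  # digits, spaces, punctuation are unchanged
    return "".join(shifted)


ciphertext = caesar("THIS IS CS50", 1)
print(ciphertext)              # UIJT JT DT50
print(caesar(ciphertext, -1))  # THIS IS CS50
```

Decryption is the same function with the key negated, which is why sender and receiver only need to agree on that one secret number.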
00:33:27.320 --> 00:33:28.760 It's relatively simplistic. 00:33:28.760 --> 00:33:31.040 But back in the day, it's not so simplistic 00:33:31.040 --> 00:33:34.430 if you're the first person in the world to ever use it or think of it. 00:33:34.430 --> 00:33:36.500 But nowadays, this is not actually what we use. 00:33:36.500 --> 00:33:38.540 But it's similarly mathematical in nature. 00:33:38.540 --> 00:33:41.660 It's not quite as simple as just adding one or subtracting one 00:33:41.660 --> 00:33:44.810 to go from now what we call ciphertext to plain text. 00:33:44.810 --> 00:33:46.502 But it's similarly math that's involved. 00:33:46.502 --> 00:33:48.710 And let me just stipulate that the way the math works 00:33:48.710 --> 00:33:51.230 is that the sender and the receiver just have 00:33:51.230 --> 00:33:53.390 to have in mind some kind of secret. 00:33:53.390 --> 00:33:56.360 And the secret in this case would very trivially be one, 00:33:56.360 --> 00:33:59.270 but it could be a much bigger, much more unguessable number, 00:33:59.270 --> 00:34:01.820 or maybe some other secret we share, the presumption 00:34:01.820 --> 00:34:05.390 being that my classmates, my teacher in that grade school classroom, if they 00:34:05.390 --> 00:34:07.760 don't know what that secret is that number is, yeah, 00:34:07.760 --> 00:34:11.659 they could try to brute force it and try all possible mathematics, plus one, 00:34:11.659 --> 00:34:12.770 plus two, plus three. 00:34:12.770 --> 00:34:14.120 But that's going to take them some time. 00:34:14.120 --> 00:34:15.630 And they probably don't care enough. 00:34:15.630 --> 00:34:18.750 And so my data might be, therefore, relatively secure. 00:34:18.750 --> 00:34:21.150 But we use encryption all the time nowadays. 00:34:21.150 --> 00:34:24.050 And so for instance, this is at the start of most URLs 00:34:24.050 --> 00:34:26.179 nowadays, even if you don't type it yourself. 
00:34:26.179 --> 00:34:30.530 With that said, Safari and even Chrome now are kind of simplifying, if not, 00:34:30.530 --> 00:34:33.590 dumbing down user interfaces to just hide details 00:34:33.590 --> 00:34:37.370 that you and I, as normal users, don't need to see 24/7. 00:34:37.370 --> 00:34:38.150 But it is there. 00:34:38.150 --> 00:34:41.449 And if in fact, on your phone or laptop, you click on the URL, 00:34:41.449 --> 00:34:43.370 even if it's super short initially, you'll 00:34:43.370 --> 00:34:46.040 probably see the whole thing starting with this. 00:34:46.040 --> 00:34:47.570 And the S means secure. 00:34:47.570 --> 00:34:49.550 The S means that encryption is being used. 00:34:49.550 --> 00:34:52.520 But there's other forms of this, not just when you visit websites. 00:34:52.520 --> 00:34:55.159 There's this, end-to-end encryption, which 00:34:55.159 --> 00:34:59.120 is being talked about more nowadays, especially during COVID times with so 00:34:59.120 --> 00:35:02.900 many more of us on video and talking about more sensitive things, 00:35:02.900 --> 00:35:05.390 telemedicine, talking to doctors, things that you also 00:35:05.390 --> 00:35:09.530 wouldn't want to verbally or visually get out into the wild just like text. 00:35:09.530 --> 00:35:13.700 What's different about end-to-end encryption versus HTTPS 00:35:13.700 --> 00:35:18.410 and the type of encryption that most of us use every day on websites alone? 00:35:18.410 --> 00:35:20.720 End-to-end encryption is sort of a better feature 00:35:20.720 --> 00:35:25.150 that you want to increasingly seek when using services like Zoom, or Microsoft 00:35:25.150 --> 00:35:29.170 Teams, or WhatsApp, or the like. 00:35:29.170 --> 00:35:31.270 Any instincts here? 00:35:31.270 --> 00:35:34.905 Yeah, over on the right. 00:35:34.905 --> 00:35:38.575 AUDIENCE: The encryption happens in the source and destination. 00:35:38.575 --> 00:35:39.450 DAVID J. MALAN: Good. 
00:35:39.450 --> 00:35:41.492 So the encryption, the scrambling of information, 00:35:41.492 --> 00:35:44.790 happens in the source, the sender, and the destination, the receiver, 00:35:44.790 --> 00:35:47.140 without a so-called middleman in between. 00:35:47.140 --> 00:35:49.530 And this is actually very different from most contexts 00:35:49.530 --> 00:35:53.310 nowadays that use just HTTPS because when you're using HTTPS 00:35:53.310 --> 00:35:57.190 to buy something on Amazon securely with your credit card, well, of course, 00:35:57.190 --> 00:36:00.220 Amazon needs to be able to decrypt the message at the end of the day. 00:36:00.220 --> 00:36:01.080 And so that's fine. 00:36:01.080 --> 00:36:05.820 But even when you're using services like video conferencing or maybe text 00:36:05.820 --> 00:36:09.060 messaging nowadays-- well, if you're using WhatsApp, that's owned by Meta. 00:36:09.060 --> 00:36:11.580 And if you're using Instagram, that's owned by Meta. 00:36:11.580 --> 00:36:14.920 There's a lot of middlemen in these apps that we're using. 00:36:14.920 --> 00:36:17.580 And if they were only using encryption, period, 00:36:17.580 --> 00:36:21.660 or only using something like HTTPS, yes, your connection 00:36:21.660 --> 00:36:24.630 from you to WhatsApp, and in turn, to the recipient 00:36:24.630 --> 00:36:27.750 might very well be secure on each end of that channel. 00:36:27.750 --> 00:36:31.680 But Meta in between, the company, and any other company in between 00:36:31.680 --> 00:36:34.988 could theoretically, for better or for worse, be looking at that data, 00:36:34.988 --> 00:36:37.030 whether it's to mine it for advertising purposes, 00:36:37.030 --> 00:36:39.120 whether it's to snoop on data that you're sending. 00:36:39.120 --> 00:36:43.140 That is not end-to-end encryption if the middleman, a company, typically, 00:36:43.140 --> 00:36:45.270 technically has access to that data.
00:36:45.270 --> 00:36:49.080 Now, Zoom, and Microsoft Teams, and WhatsApp, and iMessage, 00:36:49.080 --> 00:36:51.510 and other services with which you're familiar increasingly 00:36:51.510 --> 00:36:53.670 are offering stronger guarantees of encryption, 00:36:53.670 --> 00:36:58.500 whereby, it's indeed between parties A and B and not the one in the middle. 00:36:58.500 --> 00:36:59.820 Now, there's downsides here. 00:36:59.820 --> 00:37:02.580 And you can actually see this kind of functionality manifest 00:37:02.580 --> 00:37:03.750 in certain settings. 00:37:03.750 --> 00:37:07.620 For instance, besides iMessage, which just does this for you 00:37:07.620 --> 00:37:10.890 on iPhones or Macs, besides Zoom, you can actually 00:37:10.890 --> 00:37:13.210 fine tune these settings, indeed, within Zoom itself. 00:37:13.210 --> 00:37:16.140 So here's a screenshot that I took last night of just what the user 00:37:16.140 --> 00:37:19.770 interface looks like today to create a new Zoom meeting with the latest 00:37:19.770 --> 00:37:21.060 version of Zoom software. 00:37:21.060 --> 00:37:24.870 And maybe unbeknownst to you, there is a choice of buttons down here. 00:37:24.870 --> 00:37:28.870 And most likely, yours is, by default, enhanced encryption, 00:37:28.870 --> 00:37:31.590 which is brilliant marketing speak because it's just encryption. 00:37:31.590 --> 00:37:32.490 It's not enhanced. 00:37:32.490 --> 00:37:35.260 It actually ironically means worse than this. 00:37:35.260 --> 00:37:37.873 But they want you using it most likely, why? 00:37:37.873 --> 00:37:39.540 Well, it's a little easier to implement. 00:37:39.540 --> 00:37:41.970 It's a little less expensive for them computationally. 00:37:41.970 --> 00:37:46.770 And to be fair, enhanced encryption does scramble the data, but not in a way 00:37:46.770 --> 00:37:48.420 that Zoom can't see it. 00:37:48.420 --> 00:37:49.650 Zoom can, indeed, see it. 
00:37:49.650 --> 00:37:51.600 But that's actually a plus in some context 00:37:51.600 --> 00:37:54.090 because if you want to do cloud recordings 00:37:54.090 --> 00:37:56.340 and you want a meeting recorded not on your Mac or PC, 00:37:56.340 --> 00:37:57.810 but let Zoom deal with that. 00:37:57.810 --> 00:38:00.085 If you want automatic transcription nowadays, 00:38:00.085 --> 00:38:02.460 so the words to appear, whether it's English or something 00:38:02.460 --> 00:38:04.320 else on the screen, well, you can't really 00:38:04.320 --> 00:38:06.810 lock Zoom or any other middleman out of that 00:38:06.810 --> 00:38:08.910 because someone needs to save it to the cloud. 00:38:08.910 --> 00:38:12.930 Someone needs to translate the voice to those English or some other language 00:38:12.930 --> 00:38:13.480 words. 00:38:13.480 --> 00:38:15.660 So enhanced encryption enables those features, 00:38:15.660 --> 00:38:18.630 but they also allow a bad actor, malicious employees, 00:38:18.630 --> 00:38:21.690 someone who's just nosey at Zoom or the equivalent middle man 00:38:21.690 --> 00:38:25.080 to just poke around your video conference and hear what you've said 00:38:25.080 --> 00:38:29.770 or see what you've typed as well unless you instead check this box as well. 00:38:29.770 --> 00:38:33.033 So increasingly look for mentions of end-to-end encryption. 00:38:33.033 --> 00:38:35.700 Or give that some thought when you choose a technology via which 00:38:35.700 --> 00:38:38.760 to communicate with someone, whether it's within your family 00:38:38.760 --> 00:38:40.870 or without as well. 00:38:40.870 --> 00:38:45.150 Now, last, but not least, there's other applications of encryption too. 00:38:45.150 --> 00:38:49.600 And this, too, might be a lesson learned as well, full disk encryption. 00:38:49.600 --> 00:38:53.640 So a disk is where your data is stored in your Mac, or PC, or even your phone. 
00:38:53.640 --> 00:38:56.940 And full disk encryption just means ideally that all of your data 00:38:56.940 --> 00:38:59.190 is encrypted, that is, somehow scrambled. 00:38:59.190 --> 00:39:02.100 Now, hopefully, your password for your computer or phone 00:39:02.100 --> 00:39:04.952 is good enough so that even though the device is 00:39:04.952 --> 00:39:07.410 encrypted with that password, at least, you'll remember it. 00:39:07.410 --> 00:39:10.458 And your phone or your Mac or PC will automatically decrypt it for you. 00:39:10.458 --> 00:39:13.500 Of course, we can't scramble the information and hide it from ourselves. 00:39:13.500 --> 00:39:16.470 One of us, at least, for these devices needs to have access. 00:39:16.470 --> 00:39:18.990 But full disk encryption typically means that at least when 00:39:18.990 --> 00:39:22.110 you close the laptop lid or power down for the night, 00:39:22.110 --> 00:39:24.690 even if someone else steals that device, 00:39:24.690 --> 00:39:28.620 opens the lid, unless they have your passcode, 00:39:28.620 --> 00:39:31.470 they can't simply plug in fancy cables to the device 00:39:31.470 --> 00:39:34.320 and just rip the zeros and ones off of the device 00:39:34.320 --> 00:39:35.823 and see what's actually there. 00:39:35.823 --> 00:39:37.740 Full disk encryption means they could do that, 00:39:37.740 --> 00:39:40.890 but they would just see seemingly random zeros and ones. 00:39:40.890 --> 00:39:42.270 Now, there's a downside here too. 00:39:42.270 --> 00:39:44.160 This might slow things down potentially. 00:39:44.160 --> 00:39:48.060 But it is a feature that's increasingly offered and is absolutely something you 00:39:48.060 --> 00:39:51.510 should consider enabling in general, especially if your laptop 00:39:51.510 --> 00:39:52.620 or phone travels with you. 00:39:52.620 --> 00:39:53.970 And certainly, your phone does.
00:39:53.970 --> 00:39:58.020 Or if you plan to donate, or sell, or give away a device, 00:39:58.020 --> 00:40:00.180 you don't want to leave all of the zeros and ones, 00:40:00.180 --> 00:40:02.650 the remnants of your own sensitive data, passed on there. 00:40:02.650 --> 00:40:04.770 So Windows has a feature called BitLocker. 00:40:04.770 --> 00:40:08.320 macOS has a feature called FileVault. There's commercial options as well. 00:40:08.320 --> 00:40:12.210 But generally, we're at the point now in 2022 where clicking a button 00:40:12.210 --> 00:40:14.940 is sufficient to enable these features. 00:40:14.940 --> 00:40:17.430 With that said, don't rush into all of these decisions. 00:40:17.430 --> 00:40:19.440 I would make backups of your data. 00:40:19.440 --> 00:40:22.930 And don't maybe email CS50 if something goes wrong with that process. 00:40:22.930 --> 00:40:24.570 But I would do your own due diligence. 00:40:24.570 --> 00:40:27.720 But this, too, would be a menu of possibilities. 00:40:27.720 --> 00:40:32.070 And now, the bad side, the downside, of what seems to be great, 00:40:32.070 --> 00:40:33.990 this notion of full disk encryption. 00:40:33.990 --> 00:40:36.990 Unfortunately, just as we can encrypt our data 00:40:36.990 --> 00:40:39.280 to protect it from the adversaries, so can 00:40:39.280 --> 00:40:44.140 the adversaries, if they get into our devices, encrypt our data and do what? 00:40:44.140 --> 00:40:46.420 Not tell us that secret key. 00:40:46.420 --> 00:40:49.190 And so this is generally applied in the context of ransomware, 00:40:49.190 --> 00:40:51.100 which tragically, you increasingly hear about 00:40:51.100 --> 00:40:53.992 in hospital systems, school systems, municipalities 00:40:53.992 --> 00:40:55.450 where systems are getting attacked. 00:40:55.450 --> 00:40:57.730 And the data is not just getting stolen because what 00:40:57.730 --> 00:41:01.570 does the adversary typically need with local municipal or even hospital data?
00:41:01.570 --> 00:41:05.200 The value to the adversary is encrypting all 00:41:05.200 --> 00:41:07.660 of the hospital's, all of the municipality's data 00:41:07.660 --> 00:41:11.298 preventing them from accessing it if they have no backups or the like. 00:41:11.298 --> 00:41:13.090 And so ransomware is literally about trying 00:41:13.090 --> 00:41:16.550 to convince someone to pay you money or pay you Bitcoin or something like that 00:41:16.550 --> 00:41:18.490 in exchange for that secret key. 00:41:18.490 --> 00:41:21.910 And the key, in this case, is surely more sophisticated than the number one. 00:41:21.910 --> 00:41:23.420 But it's really the same idea. 00:41:23.420 --> 00:41:26.380 So here too, yet again, a tradeoff: just as we sort of invent something 00:41:26.380 --> 00:41:30.350 for good, it can also be used for evil, so to speak, as well. 00:41:30.350 --> 00:41:32.890 But it's really the same underlying principles, 00:41:32.890 --> 00:41:34.600 even though we keep seeing it and hearing 00:41:34.600 --> 00:41:37.490 about it, in these different forms. 00:41:37.490 --> 00:41:40.360 And lastly, if only because folks are generally familiar, 00:41:40.360 --> 00:41:43.750 but don't necessarily know what it is that it's doing for them, 00:41:43.750 --> 00:41:46.810 browsers nowadays have what's often called incognito mode 00:41:46.810 --> 00:41:49.750 or private mode, which has nothing to do with encryption, but does 00:41:49.750 --> 00:41:52.720 have to do with cybersecurity, or really, cyber privacy, 00:41:52.720 --> 00:41:55.450 keeping your data from prying eyes. 00:41:55.450 --> 00:41:57.742 Incognito mode, if you open it in Chrome, for instance, 00:41:57.742 --> 00:41:59.200 looks a little something like this.
00:41:59.200 --> 00:42:02.080 And we use it in CS50 when introducing students, as we did last week, 00:42:02.080 --> 00:42:04.780 to web programming because it, in effect, lets 00:42:04.780 --> 00:42:07.690 you start with a clean slate like a brand new browser that has never 00:42:07.690 --> 00:42:10.720 visited any websites before, which is good for just diagnosing problems. 00:42:10.720 --> 00:42:14.512 But it's commonly used if you want to log into maybe your Gmail account 00:42:14.512 --> 00:42:17.470 on someone else's computer and you don't want your password being saved 00:42:17.470 --> 00:42:19.262 or you want to visit some website where you 00:42:19.262 --> 00:42:22.610 don't want the URL or the search terms ending up in your autocomplete history. 00:42:22.610 --> 00:42:25.090 So there's multiple uses for incognito mode. 00:42:25.090 --> 00:42:26.870 But what does it really do? 00:42:26.870 --> 00:42:30.310 Well, it doesn't stop your company, it doesn't stop your university, 00:42:30.310 --> 00:42:33.040 your internet service provider, be it Comcast, Verizon, 00:42:33.040 --> 00:42:36.730 or the like from knowing what websites you go to because-- 00:42:36.730 --> 00:42:37.510 ask your students. 00:42:37.510 --> 00:42:39.920 A couple of weeks ago we talked about-- actually, a week ago, 00:42:39.920 --> 00:42:41.620 we talked about how the internet works. 00:42:41.620 --> 00:42:44.000 And unfortunately, every computer has an IP address, 00:42:44.000 --> 00:42:46.375 which is a unique identifier, which goes out any time you 00:42:46.375 --> 00:42:48.462 go anywhere, incognito mode or not. 00:42:48.462 --> 00:42:50.170 So this isn't really covering your tracks 00:42:50.170 --> 00:42:54.130 outside of your office, or outside of your home, or outside of your company. 00:42:54.130 --> 00:42:56.950 But it is, at least, throwing away local information.
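A toy model of that point, under stated assumptions: `BrowserWindow` and `Server` are hypothetical classes invented for this sketch, and the IP address and URL are placeholder values. Real browsers and web servers are far more involved. The sketch only shows that closing the window discards what is stored locally, while the record kept on the other end is untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical website that logs every request it receives."""
    access_log: list = field(default_factory=list)

    def handle(self, client_ip: str, url: str, cookies: dict) -> dict:
        # The server (and any network in between) sees the IP address regardless.
        self.access_log.append((client_ip, url))
        if "session" not in cookies:
            # Hand the visitor a "hand stamp" so it is recognized next time.
            return {"session": f"visitor-{len(self.access_log)}"}
        return {}


@dataclass
class BrowserWindow:
    """Hypothetical private-browsing window with its own temporary cookie jar."""
    ip_address: str
    cookies: dict = field(default_factory=dict)

    def visit(self, server: Server, url: str) -> None:
        self.cookies.update(server.handle(self.ip_address, url, self.cookies))

    def close(self) -> None:
        # Local state is thrown away; nothing is sent to "un-visit" the site.
        self.cookies.clear()


server = Server()
window = BrowserWindow(ip_address="203.0.113.7")
window.visit(server, "https://example.com/cart")
window.close()
print(window.cookies)      # local traces are gone
print(server.access_log)   # the remote record remains
```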
00:42:56.950 --> 00:43:00.580 And so we'll talk, in fact, in CS50's week nine this coming Monday 00:43:00.580 --> 00:43:03.692 about cookies, which you might generally know about 00:43:03.692 --> 00:43:04.900 and what are called sessions. 00:43:04.900 --> 00:43:07.900 And so long story short, what incognito mode does 00:43:07.900 --> 00:43:12.225 is it throws away, when you close the window, any locally stored information, 00:43:12.225 --> 00:43:15.100 so these things called cookies, which are sort of virtual hand stamps 00:43:15.100 --> 00:43:16.892 that just remember what you've logged in as 00:43:16.892 --> 00:43:19.030 or what's in your shopping cart or the like. 00:43:19.030 --> 00:43:23.410 But it doesn't hide any information from anyone outside of your own Mac or PC. 00:43:23.410 --> 00:43:25.715 It only prevents those local prying eyes. 00:43:25.715 --> 00:43:28.090 So there, too, even though we have tools that many of you 00:43:28.090 --> 00:43:30.310 are probably in the habit of using or thinking 00:43:30.310 --> 00:43:34.360 you should use to be more private, be more secure on the internet, what we do 00:43:34.360 --> 00:43:37.720 really in CS50, both weeks past and future, 00:43:37.720 --> 00:43:40.220 is talk about how these technologies work, 00:43:40.220 --> 00:43:44.260 so that ultimately, we have all the more of an educated citizenry here 00:43:44.260 --> 00:43:46.800 among undergrads and here as well as online, 00:43:46.800 --> 00:43:49.300 so that you can apply these same lessons learned to problems 00:43:49.300 --> 00:43:50.900 you'll encounter in the future. 00:43:50.900 --> 00:43:52.600 So as promised, the homework. 00:43:52.600 --> 00:43:55.143 One, you should probably use a password manager. 
00:43:55.143 --> 00:43:57.310 It doesn't have to be one of those ones on the list, 00:43:57.310 --> 00:44:00.640 but at least starting that conversation maybe with someone who does, 00:44:00.640 --> 00:44:01.570 maybe the-- 00:44:01.570 --> 00:44:04.748 it's often the students in your family, perhaps, 00:44:04.748 --> 00:44:06.790 who can advise you on some of these technologies. 00:44:06.790 --> 00:44:10.653 Consider using a password manager. Two, using two-factor authentication, 00:44:10.653 --> 00:44:12.820 whether it's your phone or some key fob or the like, 00:44:12.820 --> 00:44:16.030 but at least seeking out that feature at least for accounts that you really 00:44:16.030 --> 00:44:18.760 care about, your email, social media, financial, medical, 00:44:18.760 --> 00:44:21.010 anything where you'd be embarrassed, at best, 00:44:21.010 --> 00:44:24.400 or really violated, at worst, if that kind of information got out 00:44:24.400 --> 00:44:28.360 and then increasingly using not just encryption, which you get automatically 00:44:28.360 --> 00:44:31.930 for most technologies today, but increasingly choosing technologies 00:44:31.930 --> 00:44:35.980 that offer stronger guarantees that keep those middlemen, those companies out 00:44:35.980 --> 00:44:39.040 of the way if only so that you can trust with higher probability 00:44:39.040 --> 00:44:43.900 that only party B knows what party A has said or sent. 00:44:43.900 --> 00:44:45.820 Now, this, of course, was a whirlwind tour. 00:44:45.820 --> 00:44:48.190 There's so much more that you can do online. 00:44:48.190 --> 00:44:52.150 Indeed, this course, CS50, can be taken for free online via platforms like edX 00:44:52.150 --> 00:44:54.430 at edx.org/cs50. 00:44:54.430 --> 00:44:57.400 I thought it might be appropriate to end on this note. 00:44:57.400 --> 00:45:01.300 If anyone would like to conjecture before we 00:45:01.300 --> 00:45:07.050 start playing music and adjourn for lunch, what our final message here is.
00:45:07.050 --> 00:45:12.270 If we reverse the plus one and maybe start minus one here, minus one here-- 00:45:12.270 --> 00:45:14.920 and indeed, thank you so much for coming. 00:45:14.920 --> 00:45:18.240 This was CS50. 00:45:18.240 --> 00:45:21.290 [MUSIC PLAYING]
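The "plus one, minus one" idea in the closing remark can be sketched as a letter-shift cipher. This is a minimal illustration, assuming a hypothetical helper named `shift` and a made-up sample word. It is not the message shown on stage, which the transcript does not record.

```python
def shift(text: str, key: int) -> str:
    """Shift each ASCII letter by key positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)  # leave spaces, digits, and punctuation alone
    return "".join(result)


scrambled = shift("HELLO", 1)     # encrypt with a key of plus one
print(scrambled)                  # IFMMP
print(shift(scrambled, -1))       # HELLO, recovered by reversing with minus one
```

Anyone who knows the key can undo the scrambling, which is why a key as simple as one offers no real protection and why real systems use far larger keys.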