1 00:00:00,000 --> 00:00:02,976 2 00:00:02,976 --> 00:00:06,448 [MUSIC PLAYING] 3 00:00:06,448 --> 00:01:12,307 4 00:01:12,307 --> 00:01:13,390 DAVID J. MALAN: All right. 5 00:01:13,390 --> 00:01:15,130 So this is CS50. 6 00:01:15,130 --> 00:01:18,520 My name is David Malan, and this is Harvard University's introduction 7 00:01:18,520 --> 00:01:20,740 to the intellectual enterprises of computer science 8 00:01:20,740 --> 00:01:22,450 and the art of programming. 9 00:01:22,450 --> 00:01:25,060 And this, of course, is our special family weekend, 10 00:01:25,060 --> 00:01:28,180 wherein not only are CS50's own students here in the audience, but also 11 00:01:28,180 --> 00:01:29,990 some family members as well. 12 00:01:29,990 --> 00:01:32,560 Now, you're showing up in the semester a little bit late. 13 00:01:32,560 --> 00:01:36,130 We've just tackled week eight, which is really our ninth week since computer 14 00:01:36,130 --> 00:01:37,630 scientists start counting from zero. 15 00:01:37,630 --> 00:01:40,047 So we've done a whole lot of work over the past few weeks, 16 00:01:40,047 --> 00:01:42,890 as you might have heard via emails or text messages home, 17 00:01:42,890 --> 00:01:45,858 including a language known here as binary. 18 00:01:45,858 --> 00:01:48,400 So on the screen here, of course, is a lot of zeros and ones. 19 00:01:48,400 --> 00:01:52,600 And suffice it to say, let me sum up the past nine weeks with this 20 00:01:52,600 --> 00:01:54,340 is what's going on underneath the hood. 21 00:01:54,340 --> 00:01:57,465 But of course, today, we thought we'd make things a little more accessible, 22 00:01:57,465 --> 00:01:58,870 a little more broadly applicable. 
23 00:01:58,870 --> 00:02:01,480 And indeed, our focus today will not be on what 24 00:02:01,480 --> 00:02:04,060 these patterns of zeros and ones represent, 25 00:02:04,060 --> 00:02:07,390 which the astute might notice are replicated visually 26 00:02:07,390 --> 00:02:10,190 with these light bulbs being in a pattern on and off. 27 00:02:10,190 --> 00:02:14,140 And as your child might have hinted before class, or perhaps, now, 28 00:02:14,140 --> 00:02:17,200 this might very well spell a word up to eight characters 29 00:02:17,200 --> 00:02:20,890 long because you can encode, even in the real world, things digital too. 30 00:02:20,890 --> 00:02:23,470 But today, we'll focus on things much more high level, 31 00:02:23,470 --> 00:02:26,170 this notion of cybersecurity, like the security 32 00:02:26,170 --> 00:02:31,210 of our data, our privacy, of our systems, particularly, on the internet 33 00:02:31,210 --> 00:02:34,330 nowadays because presumably, all of us are carrying technologies around 34 00:02:34,330 --> 00:02:36,770 in our pocket using laptops and desktops every day. 35 00:02:36,770 --> 00:02:40,685 And so the goal today is to stipulate that this 36 00:02:40,685 --> 00:02:42,310 is what's going on underneath the hood. 37 00:02:42,310 --> 00:02:44,350 But let's solve some problems at a higher level, 38 00:02:44,350 --> 00:02:47,770 so that your homework, when you go back to wherever you're visiting from, 39 00:02:47,770 --> 00:02:50,740 can actually be to apply some of today's lessons learned. 40 00:02:50,740 --> 00:02:55,780 So with that said, perhaps, the most common familiar defense of one's 41 00:02:55,780 --> 00:02:58,720 systems, and data, phone, and laptops, and desktops would just 42 00:02:58,720 --> 00:03:00,070 be these simple passwords. 43 00:03:00,070 --> 00:03:02,440 Unfortunately, you and I are-- frankly, as humans, 44 00:03:02,440 --> 00:03:05,150 not all that good at choosing passwords. 
45 00:03:05,150 --> 00:03:09,010 And this is in itself a relatively weak form of defense, even though each of us 46 00:03:09,010 --> 00:03:12,430 has dozens, hundreds, of passwords nowadays 47 00:03:12,430 --> 00:03:14,920 or at least dozens or hundreds of accounts, 48 00:03:14,920 --> 00:03:19,653 maybe fives or tens of dozens of passwords. 49 00:03:19,653 --> 00:03:21,820 Indeed, if you're in the habit of reusing passwords, 50 00:03:21,820 --> 00:03:25,160 we'll see today, probably among our first lessons learned. 51 00:03:25,160 --> 00:03:29,050 So for instance, if we look back at the past year, 2021, thanks 52 00:03:29,050 --> 00:03:33,310 to security researchers who take a look at data that has been hacked or leaked 53 00:03:33,310 --> 00:03:37,990 online by way of public databases, we have a sense as computer scientists 54 00:03:37,990 --> 00:03:40,720 of what the most popular, or equivalently, 55 00:03:40,720 --> 00:03:43,270 what some of the worst passwords are that you and I are 56 00:03:43,270 --> 00:03:44,300 choosing for our systems. 57 00:03:44,300 --> 00:03:46,750 So as of this past year, according to one measure, 58 00:03:46,750 --> 00:03:53,470 the most commonly used password in systems everywhere was 123456. 59 00:03:53,470 --> 00:03:54,220 All right? 60 00:03:54,220 --> 00:03:56,770 The number two password in our top 10 list here 61 00:03:56,770 --> 00:04:00,670 was only slightly longer, 123456789. 62 00:04:00,670 --> 00:04:05,590 After that, we took a turn in the other direction, 12345 alone. 63 00:04:05,590 --> 00:04:08,440 After that, it got a little more interesting, qwerty, 64 00:04:08,440 --> 00:04:10,360 which might sound pretty cryptic, but not 65 00:04:10,360 --> 00:04:12,100 if you look down at your US keyboard. 66 00:04:12,100 --> 00:04:16,750 And it's the top left hand row of the keys on an American keyboard, so also, 67 00:04:16,750 --> 00:04:18,310 not all that hard. 
68 00:04:18,310 --> 00:04:21,790 Perhaps, not surprisingly a little disconcertingly, 69 00:04:21,790 --> 00:04:25,270 number five was password. 70 00:04:25,270 --> 00:04:29,200 Meanwhile, number six returns us to digits, 12345678. 71 00:04:29,200 --> 00:04:33,310 After that, really, less effort, 111111. 72 00:04:33,310 --> 00:04:37,780 After that, a little more variation, but not all that much 123123. 73 00:04:37,780 --> 00:04:41,830 After that, it's getting even less interesting, 1234567890. 74 00:04:41,830 --> 00:04:45,520 And then lastly topping the list is just 1234567. 75 00:04:45,520 --> 00:04:48,470 So this is not a good top 10 list to be on. 76 00:04:48,470 --> 00:04:52,600 So among today's first takeaways is if you see your password on the screen, 77 00:04:52,600 --> 00:04:54,490 you didn't make the list in a good way. 78 00:04:54,490 --> 00:04:57,550 This means hundreds, thousands, millions of other people 79 00:04:57,550 --> 00:04:59,380 probably have that password of yours. 80 00:04:59,380 --> 00:05:02,590 Now, in and of itself, that's not necessarily worrisome 81 00:05:02,590 --> 00:05:05,920 because I don't know who has these passwords in a room as large as this. 82 00:05:05,920 --> 00:05:09,460 But just intuitively, why is this a bad thing? 83 00:05:09,460 --> 00:05:14,120 Either parent or child is welcome to raise a hand here. 84 00:05:14,120 --> 00:05:16,210 Why might this be-- 85 00:05:16,210 --> 00:05:17,275 intuitively, yeah? 86 00:05:17,275 --> 00:05:18,707 AUDIENCE: Access to it. 87 00:05:18,707 --> 00:05:20,040 DAVID J. MALAN: So access to it. 88 00:05:20,040 --> 00:05:22,290 I mean, we literally, as computer scientists, now have 89 00:05:22,290 --> 00:05:24,720 a database of really common passwords. 90 00:05:24,720 --> 00:05:25,680 And your thoughts? 91 00:05:25,680 --> 00:05:28,408 AUDIENCE: [INAUDIBLE] find it out quickly. 92 00:05:28,408 --> 00:05:30,700 DAVID J. MALAN: Yeah, you can just find it out quickly. 
93 00:05:30,700 --> 00:05:33,910 I mean, you could imagine trying to guess someone's password by just typing 94 00:05:33,910 --> 00:05:36,280 in random letters, random numbers, random words, 95 00:05:36,280 --> 00:05:38,290 but not if you have a top 10 list. 96 00:05:38,290 --> 00:05:41,920 The adversaries in the world might as well just start with this list. 97 00:05:41,920 --> 00:05:44,727 Now, you'll notice that even absent from this are slight variants. 98 00:05:44,727 --> 00:05:46,810 Some of you might be thinking, I'm not on the list 99 00:05:46,810 --> 00:05:50,470 because I do something clever like I use an exclamation point for the number 100 00:05:50,470 --> 00:05:55,240 one, or a three for an E, or a 5 for an S. 101 00:05:55,240 --> 00:05:57,160 And based on the smiles in the room right now, 102 00:05:57,160 --> 00:06:00,460 you're not all that clever, it turns out, because other people are smiling 103 00:06:00,460 --> 00:06:04,390 too, which is to say that an adversary can take those same heuristics that you 104 00:06:04,390 --> 00:06:06,700 might think are making things more secure by just 105 00:06:06,700 --> 00:06:08,752 tweaking some letters to numbers or vice versa. 106 00:06:08,752 --> 00:06:10,960 But if you're doing it and other people are doing it, 107 00:06:10,960 --> 00:06:14,150 the bad guys, so to speak, are going to be doing it as well. 108 00:06:14,150 --> 00:06:16,570 So unfortunately, when it comes to passwords, 109 00:06:16,570 --> 00:06:20,020 better is longer, and random, and really unguessable. 110 00:06:20,020 --> 00:06:21,520 But that's not what most of us have. 111 00:06:21,520 --> 00:06:23,562 In fact, case in point on our phones, whether you 112 00:06:23,562 --> 00:06:25,540 have an Android device or an iPhone nowadays, 113 00:06:25,540 --> 00:06:28,900 odds are you have something relatively simplistic protecting it, 114 00:06:28,900 --> 00:06:30,070 if you have anything at all. 
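The point about character swaps can be made concrete. What follows is a minimal sketch, not anything shown in the lecture: the `SUBSTITUTIONS` table and the `variants` helper are hypothetical names, and the table holds only the three swaps mentioned above (an exclamation point for a one, a 3 for an E, a 5 for an S). It suggests how cheaply the "clever" spellings of a common password can be listed.

```python
from itertools import product

# Hypothetical table: each character maps to the spellings a guesser might try.
SUBSTITUTIONS = {"1": "1!", "e": "e3", "s": "s5"}


def variants(word):
    """Return every spelling of word with any subset of the swaps applied."""
    # One string of alternatives per position; unlisted characters stay as-is.
    choices = [SUBSTITUTIONS.get(ch, ch) for ch in word]
    return ["".join(combo) for combo in product(*choices)]


print(variants("password"))
# ['password', 'pas5word', 'pa5sword', 'pa55word']
```

A word with two swappable letters yields only four candidates, so adding the variants costs the guesser almost nothing.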
115 00:06:30,070 --> 00:06:31,778 But at least, Apple and Google are pretty 116 00:06:31,778 --> 00:06:34,630 good at at least nudging us to choose these kinds of passcodes now. 117 00:06:34,630 --> 00:06:38,360 And a four-digit passcode is quite common nowadays. 118 00:06:38,360 --> 00:06:42,190 And so here's where we have an opportunity, thanks to the URL 119 00:06:42,190 --> 00:06:45,490 that you saw on the screen earlier, to conjecture as a group 120 00:06:45,490 --> 00:06:49,060 just how long might it take an adversary, someone out there 121 00:06:49,060 --> 00:06:51,460 who's out to get us or get one of us-- 122 00:06:51,460 --> 00:06:54,820 how long might it take an adversary to figure out 123 00:06:54,820 --> 00:06:57,580 your phone's four-digit passcode? 124 00:06:57,580 --> 00:06:58,930 This is CS50's own Carter. 125 00:06:58,930 --> 00:07:01,510 Carter, if you could switch over and poll the audience here-- 126 00:07:01,510 --> 00:07:04,552 if you take out your phone or laptop, whatever device you might have used 127 00:07:04,552 --> 00:07:10,960 a few minutes ago, to scan that QR code or to visit that same URL, 128 00:07:10,960 --> 00:07:14,320 you can see these questions on your browser. 129 00:07:14,320 --> 00:07:15,790 And if you can't, that's fine. 130 00:07:15,790 --> 00:07:18,217 We'll share some aggregate data, nonetheless. 131 00:07:18,217 --> 00:07:20,800 But you should have an opportunity to tap one of your answers. 132 00:07:20,800 --> 00:07:26,920 And we'll give folks a few more seconds if you'd like to play along at home. 133 00:07:26,920 --> 00:07:30,670 And here in just a moment-- 134 00:07:30,670 --> 00:07:32,287 probably have many people reporting. 135 00:07:32,287 --> 00:07:34,870 But why don't we go ahead and take a look at some percentages? 
136 00:07:34,870 --> 00:07:36,910 It looks like most of you-- 137 00:07:36,910 --> 00:07:39,340 60% to 70%-- are proposing just a few seconds, 138 00:07:39,340 --> 00:07:42,340 so that's not all that good news if it's a four-digit passcode. 139 00:07:42,340 --> 00:07:44,620 Some of you are hoping it's a few minutes. 140 00:07:44,620 --> 00:07:46,510 8% are hoping a few hours. 141 00:07:46,510 --> 00:07:50,560 More than 4% of you are really hoping, perhaps, it's a few days. 142 00:07:50,560 --> 00:07:53,192 Well, let's actually consider how we can answer this question 143 00:07:53,192 --> 00:07:55,900 and make today not just conceptual, but a little quantitative too 144 00:07:55,900 --> 00:07:58,483 and see if we can't slap some numbers on questions like these, 145 00:07:58,483 --> 00:08:00,610 so ultimately, you can make more informed decisions 146 00:08:00,610 --> 00:08:01,850 with your system's security. 147 00:08:01,850 --> 00:08:05,140 So for instance, when it comes to four-digit passcodes, 148 00:08:05,140 --> 00:08:08,560 rather than just consider how secure it is, well, 149 00:08:08,560 --> 00:08:11,630 let's make it a more precise question like, what are the forms of attack? 150 00:08:11,630 --> 00:08:14,755 Well, the simplest attack might be just someone grabbing your phone, be it, 151 00:08:14,755 --> 00:08:17,320 in your family, or maybe at Starbucks, or the airport, 152 00:08:17,320 --> 00:08:21,760 or the like and just starting all possible combinations, maybe 0000, 153 00:08:21,760 --> 00:08:24,490 then 0001, and 0002. 154 00:08:24,490 --> 00:08:26,660 We could maybe automate this a little bit. 155 00:08:26,660 --> 00:08:30,220 So for instance, I might potentially be able to do something 156 00:08:30,220 --> 00:08:32,720 like roboticize this here. 
157 00:08:32,720 --> 00:08:35,140 Let me go ahead and full screen a quick video here 158 00:08:35,140 --> 00:08:38,679 that's just going to paint a picture in just a moment on the screen of how, 159 00:08:38,679 --> 00:08:41,470 if we're a really clever adversary and know how to build things, 160 00:08:41,470 --> 00:08:44,270 well, at least, maybe we could automate some of that process. 161 00:08:44,270 --> 00:08:46,420 So here's an Android phone sitting on a counter. 162 00:08:46,420 --> 00:08:50,920 Here's a very simple tripod and a little touch device robotically doing all 163 00:08:50,920 --> 00:08:56,260 of that hacking for you starting at 0000 probably all the way up to 9999. 164 00:08:56,260 --> 00:09:00,340 Now, that too wasn't necessarily all that fast, but at least, 165 00:09:00,340 --> 00:09:02,530 the adversary can step away and doesn't actually 166 00:09:02,530 --> 00:09:05,140 have to be bothered with the time involved, 167 00:09:05,140 --> 00:09:08,390 the cost involved, in actually hacking that particular device. 168 00:09:08,390 --> 00:09:11,800 Well, let's go one level deeper, a little more interestingly, 169 00:09:11,800 --> 00:09:17,950 and consider here how much time really this so-called brute force 170 00:09:17,950 --> 00:09:18,700 attack would take. 171 00:09:18,700 --> 00:09:21,825 And that's actually a term of art, much like in yesteryear when maybe there 172 00:09:21,825 --> 00:09:24,820 was a battering ram trying to brute force their way into a castle 173 00:09:24,820 --> 00:09:25,870 or something like that. 174 00:09:25,870 --> 00:09:29,380 A brute force attack digitally is just someone trying manually 175 00:09:29,380 --> 00:09:32,950 all possible codes or maybe robotically trying all possible codes, 176 00:09:32,950 --> 00:09:35,500 but generally automating the process in some way 177 00:09:35,500 --> 00:09:37,270 to go through all possibilities. 
178 00:09:37,270 --> 00:09:42,820 Well, if you've got, for instance, a four-digit passcode-- 179 00:09:42,820 --> 00:09:46,150 let's ask maybe a follow-up question here, not how long it will take, 180 00:09:46,150 --> 00:09:51,190 but how many possible four-digit passcodes are there? 181 00:09:51,190 --> 00:09:53,420 Because then maybe, we can do some quick math. 182 00:09:53,420 --> 00:09:57,132 And if every passcode takes me a second, or a few milliseconds, or the like, 183 00:09:57,132 --> 00:10:00,340 then I think we can try to extrapolate from that whether the first answer was 184 00:10:00,340 --> 00:10:03,920 seconds, or minutes, or days, or hours, or something else. 185 00:10:03,920 --> 00:10:06,815 So how many four-digit passcodes are possible? 186 00:10:06,815 --> 00:10:09,940 If you take out your same device, it should have just changed automatically. 187 00:10:09,940 --> 00:10:13,750 If it doesn't seem to have, maybe reload your browser with some menu option. 188 00:10:13,750 --> 00:10:17,740 And then tap in here, how many four-digit passcodes are possible? 189 00:10:17,740 --> 00:10:20,860 Four total, 40, 9,999, 10,000. 190 00:10:20,860 --> 00:10:24,610 Or unsure is OK too. 191 00:10:24,610 --> 00:10:25,510 So let's see. 192 00:10:25,510 --> 00:10:27,900 We'll give you a few more moments. 193 00:10:27,900 --> 00:10:30,600 How many four-digit passcodes are possible? 194 00:10:30,600 --> 00:10:33,160 And shall we reveal the results? 195 00:10:33,160 --> 00:10:37,230 So now, it looks like a few of you-- 196 00:10:37,230 --> 00:10:42,540 2% of you are saying just four passcodes, 40, 9,999. 197 00:10:42,540 --> 00:10:44,340 There's definitely some contention here. 198 00:10:44,340 --> 00:10:45,900 And 6% are unsure. 199 00:10:45,900 --> 00:10:48,850 Well, how do we wrap our minds around this? 200 00:10:48,850 --> 00:10:50,760 Well, let's just do this real simple here. 201 00:10:50,760 --> 00:10:54,160 Let me switch back over to doing a bit of math. 
202 00:10:54,160 --> 00:10:57,932 And if we have here 10 possibilities for each digit, if there's four digits, 203 00:10:57,932 --> 00:11:01,140 each digit can be zero, one, two, three, four, five, six, seven, eight, nine. 204 00:11:01,140 --> 00:11:02,590 So that's 10 possibilities. 205 00:11:02,590 --> 00:11:05,490 So if you think about the number of permutations, 206 00:11:05,490 --> 00:11:09,540 that's 10 possibilities for the first digit times 10 for the next, times 10 207 00:11:09,540 --> 00:11:11,290 for the next, times 10 for the next. 208 00:11:11,290 --> 00:11:14,130 And so if we do that out, 10 times 10 times 10 times 10, or 10 209 00:11:14,130 --> 00:11:20,220 to the fourth, there are, indeed-- and 66% of you found 10,000 possibilities. 210 00:11:20,220 --> 00:11:22,620 And so now we can kind of work backwards and decide, 211 00:11:22,620 --> 00:11:25,800 how long is it going to take for an adversary to hack into this phone? 212 00:11:25,800 --> 00:11:28,800 Because if it's one attack, one guess per second, 213 00:11:28,800 --> 00:11:31,470 well, that's going to map out to 10,000 seconds, 214 00:11:31,470 --> 00:11:34,560 but maybe not if the adversary isn't a roboticist or a human. 215 00:11:34,560 --> 00:11:37,170 What if they're a software programmer or someone who 216 00:11:37,170 --> 00:11:40,560 has taken even a class introductory, like CS50 and learned 217 00:11:40,560 --> 00:11:41,760 a little bit of programming? 218 00:11:41,760 --> 00:11:43,590 Well, a little bit frighteningly, it's not 219 00:11:43,590 --> 00:11:46,470 all that hard to hack into systems if you just 220 00:11:46,470 --> 00:11:49,890 know how to code, too, and really have the computer do your work for you. 221 00:11:49,890 --> 00:11:53,190 So in fact, let me go ahead and change over to another screen on my computer 222 00:11:53,190 --> 00:11:53,910 here. 223 00:11:53,910 --> 00:11:56,307 This is different for students in the group from VS Code. 
224 00:11:56,307 --> 00:11:58,140 This is just a black and white version of it 225 00:11:58,140 --> 00:11:59,670 that we've used briefly in the past. 226 00:11:59,670 --> 00:12:03,090 And I'm just going to go ahead and create a program called crack.py. 227 00:12:03,090 --> 00:12:05,160 To crack something just technically means 228 00:12:05,160 --> 00:12:08,380 to figure out what it is, figure out a password in this case. 229 00:12:08,380 --> 00:12:11,730 And .py means I'm going to use a programming language that we here 230 00:12:11,730 --> 00:12:15,390 in CS50 have been dabbling in the past couple of weeks with more to come next 231 00:12:15,390 --> 00:12:16,300 week as well. 232 00:12:16,300 --> 00:12:20,220 So it turns out, and you need not understand each of these lines of code, 233 00:12:20,220 --> 00:12:25,518 if I want to try, maybe, generating all 10,000 possible codes, 234 00:12:25,518 --> 00:12:27,060 I'm not going to bother with a robot. 235 00:12:27,060 --> 00:12:29,250 I've got all these cables coming out of my computer. 236 00:12:29,250 --> 00:12:32,250 And odds are one of them is a USB cable or a lightning cable. 237 00:12:32,250 --> 00:12:36,120 Surely, we could figure out how to connect laptop or desktop to phone 238 00:12:36,120 --> 00:12:39,810 and just automate the process nowadays by just sending all of the numbers 239 00:12:39,810 --> 00:12:44,220 into the phone until one unlocks the trick just like in the movies or TV. 240 00:12:44,220 --> 00:12:47,410 Well, in Python, I could write a program that does this as follows. 241 00:12:47,410 --> 00:12:51,510 I can import, so to speak, all of the decimal digits, zero through nine. 242 00:12:51,510 --> 00:12:54,990 And this, for students in the room, is just a slightly better version 243 00:12:54,990 --> 00:12:57,580 of typing out 10 different numbers manually. 
244 00:12:57,580 --> 00:13:00,540 I can also import from a library, so to speak, 245 00:13:00,540 --> 00:13:02,910 called itertools for iteration tools, which 246 00:13:02,910 --> 00:13:04,720 means to do something again and again. 247 00:13:04,720 --> 00:13:09,210 I can import a function called product, which means the cross product. 248 00:13:09,210 --> 00:13:11,850 Combine this with this some number of times. 249 00:13:11,850 --> 00:13:13,680 And then it's just two more lines of code. 250 00:13:13,680 --> 00:13:15,850 I can use what's called a loop in programming. 251 00:13:15,850 --> 00:13:19,770 So for every passcode in the cross product of all 10 252 00:13:19,770 --> 00:13:24,420 of those digits repeated, a total of four times-- 253 00:13:24,420 --> 00:13:27,480 let me go ahead and-- rather than bother connecting my phone 254 00:13:27,480 --> 00:13:29,520 and hacking my own phone, let me just print out 255 00:13:29,520 --> 00:13:31,680 every one of those 10,000 codes on the screen, 256 00:13:31,680 --> 00:13:33,833 and we'll see how fast the hacker could do this. 257 00:13:33,833 --> 00:13:34,500 Let me go ahead. 258 00:13:34,500 --> 00:13:37,750 And print and with an asterisk, which is a little trick to format it nicely, 259 00:13:37,750 --> 00:13:39,750 I'm going to print out each of those passcodes. 260 00:13:39,750 --> 00:13:43,350 And that's it, four lines of code, maybe 40 seconds of talking, 261 00:13:43,350 --> 00:13:45,960 but maybe really four seconds of coding if I actually 262 00:13:45,960 --> 00:13:47,250 did this without the audience. 263 00:13:47,250 --> 00:13:49,470 And now let me go ahead and save the file. 264 00:13:49,470 --> 00:13:52,410 And I'm going to run, as we do every day in class of late. 265 00:13:52,410 --> 00:13:54,370 Python of crack.py. 266 00:13:54,370 --> 00:13:57,840 And when I hit Enter, I should see on the screen all 10,000 possibilities 267 00:13:57,840 --> 00:14:00,180 from 0000 to 9999. 
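The program described above might look roughly like the following. This is a reconstruction from the spoken description, not a copy of what was on screen. As described, the lecture's loop prints each code directly with `print(*passcode)`, which puts a space between the digits. This sketch instead collects the codes so they can be counted, and the `passcodes` helper is an assumed name introduced for that purpose.

```python
from itertools import product
from string import digits


def passcodes(alphabet, length):
    """Yield every passcode of the given length over the alphabet, in order."""
    # product(..., repeat=length) is the cross product of the alphabet with itself.
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


codes = list(passcodes(digits, 4))
print(len(codes))           # 10000
print(codes[0], codes[-1])  # 0000 9999
```

Generating the whole list takes a fraction of a second on an ordinary laptop, which matches the demonstration that follows.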
268 00:14:00,180 --> 00:14:00,810 So let's see. 269 00:14:00,810 --> 00:14:04,560 Is it a few seconds, minutes, hours, or days? 270 00:14:04,560 --> 00:14:05,790 Done. 271 00:14:05,790 --> 00:14:08,518 So barely even seconds plural if that. 272 00:14:08,518 --> 00:14:11,310 So that should be a little disconcerting because all that adversary 273 00:14:11,310 --> 00:14:14,393 needs to do is grab your phone off the counter, plug in a cable, and boom. 274 00:14:14,393 --> 00:14:15,390 They're done. 275 00:14:15,390 --> 00:14:18,720 There's no ticking clock or worries as in the movies or TV 276 00:14:18,720 --> 00:14:20,700 that maybe you're going to come into the room. 277 00:14:20,700 --> 00:14:22,900 You don't need that much of a window of time. 278 00:14:22,900 --> 00:14:25,000 So what would be better than this? 279 00:14:25,000 --> 00:14:27,630 Well, let's consider what our options might 280 00:14:27,630 --> 00:14:30,392 be if we don't want to just use a four-digit passcode. 281 00:14:30,392 --> 00:14:32,850 Some of you, indeed, might have better passcodes than that. 282 00:14:32,850 --> 00:14:36,930 And maybe, you use four-letter passcodes instead, so A through Z, 283 00:14:36,930 --> 00:14:38,820 maybe uppercase and lowercase. 284 00:14:38,820 --> 00:14:41,110 That starts to make things a little more interesting. 285 00:14:41,110 --> 00:14:43,080 So should we poll this question too? 286 00:14:43,080 --> 00:14:46,592 If we upgrade from four digits to just four letters, 287 00:14:46,592 --> 00:14:49,800 English letters, A through Z, uppercase and lowercase-- why don't we go ahead 288 00:14:49,800 --> 00:14:55,260 and poll the group here and ask how many four-letter passcodes are there 289 00:14:55,260 --> 00:14:57,850 instead? 290 00:14:57,850 --> 00:15:00,510 So this time, the range starts at four. 291 00:15:00,510 --> 00:15:03,090 Still not the right answer, though, this time. 292 00:15:03,090 --> 00:15:05,800 How many four-letter passcodes are possible? 
293 00:15:05,800 --> 00:15:08,870 294 00:15:08,870 --> 00:15:11,090 [INAUDIBLE] 295 00:15:11,090 --> 00:15:12,290 Take a couple more seconds. 296 00:15:12,290 --> 00:15:16,590 297 00:15:16,590 --> 00:15:18,030 All right. 298 00:15:18,030 --> 00:15:20,110 Almost a couple hundred responses in already. 299 00:15:20,110 --> 00:15:21,465 A few more seconds. 300 00:15:21,465 --> 00:15:24,920 301 00:15:24,920 --> 00:15:31,070 And why don't we go ahead and reveal now the answers, which are-- 302 00:15:31,070 --> 00:15:35,690 OK, so we solved a couple of problems at least. 303 00:15:35,690 --> 00:15:37,430 OK, someone's just messing with us now. 304 00:15:37,430 --> 00:15:37,930 All right. 305 00:15:37,930 --> 00:15:39,620 So it looks like most of you-- 306 00:15:39,620 --> 00:15:43,350 76% of you have claimed it's seven million plus possibilities. 307 00:15:43,350 --> 00:15:46,040 So that's encouraging because that's a whole order of magnitude 308 00:15:46,040 --> 00:15:46,792 more than before. 309 00:15:46,792 --> 00:15:49,250 Well, let's figure out how we might do this mathematically. 310 00:15:49,250 --> 00:15:52,100 So if we've got 26 lowercase, 26 uppercase, 311 00:15:52,100 --> 00:15:55,020 that's 52 possibilities now for each of those four digits. 312 00:15:55,020 --> 00:15:57,680 So that's 52 times itself four times, which, 313 00:15:57,680 --> 00:16:00,530 indeed, either off the top of your head a good guess, 314 00:16:00,530 --> 00:16:03,500 a calculator on the same device you're using right now, indeed, 315 00:16:03,500 --> 00:16:05,790 gives us seven million instead. 316 00:16:05,790 --> 00:16:08,250 Well, what might be slightly better than that? 317 00:16:08,250 --> 00:16:09,837 Well, maybe four characters. 
318 00:16:09,837 --> 00:16:12,170 And this, indeed, is what your Macs, PCs, and phones are 319 00:16:12,170 --> 00:16:15,860 urging us to do nowadays, not just numbers, not just letters, but really 320 00:16:15,860 --> 00:16:19,940 annoying punctuation, so it really looks cryptic not just to the adversary, 321 00:16:19,940 --> 00:16:22,030 but also to you and me, unfortunately. 322 00:16:22,030 --> 00:16:23,030 And that's the downside. 323 00:16:23,030 --> 00:16:27,500 But here now, we have a mental model, and really, a computational framework 324 00:16:27,500 --> 00:16:29,737 via which we can evaluate the security of these. 325 00:16:29,737 --> 00:16:31,820 And I'll go ahead and spoil some of the math here. 326 00:16:31,820 --> 00:16:37,080 If we've got 52 letters of the alphabet, uppercase and lowercase, 10 digits, 327 00:16:37,080 --> 00:16:40,850 and if I count them out on my keyboard, about 32 punctuation symbols 328 00:16:40,850 --> 00:16:43,190 in typical English grammar, that actually gives us 329 00:16:43,190 --> 00:16:49,230 94 possibilities now, which is up from 52, which is up from 10. 330 00:16:49,230 --> 00:16:50,780 So now, we're really moving. 331 00:16:50,780 --> 00:16:53,700 And now that would give us 78 million possibilities, 332 00:16:53,700 --> 00:16:55,280 so another order of magnitude. 333 00:16:55,280 --> 00:16:58,202 Now, it's still going to be relatively fast because you know what? 334 00:16:58,202 --> 00:16:59,160 I can actually do this. 335 00:16:59,160 --> 00:17:01,220 Let me go back into my code here. 336 00:17:01,220 --> 00:17:03,500 Let me reopen this same program. 337 00:17:03,500 --> 00:17:06,750 And I can point out just how easy it is to make these changes. 338 00:17:06,750 --> 00:17:11,630 Instead of importing digits as before, I can import, as your child might know, 339 00:17:11,630 --> 00:17:14,940 ascii letters, which are A through Z, uppercase, lowercase. 
340 00:17:14,940 --> 00:17:18,319 And I can just change this here, ascii letters. 341 00:17:18,319 --> 00:17:21,349 And so this was that first version where we just changed to letters. 342 00:17:21,349 --> 00:17:22,640 Let me now rerun the code. 343 00:17:22,640 --> 00:17:26,220 And instead of seeing numbers, we'll see letters flying across the screen. 344 00:17:26,220 --> 00:17:29,060 And if I walk over here to the screen, we'll see that. 345 00:17:29,060 --> 00:17:33,170 By the time I get here, we're halfway through the entire alphabet lowercase. 346 00:17:33,170 --> 00:17:35,120 If I now start walking away, I think, yeah, 347 00:17:35,120 --> 00:17:37,700 we're already done now with uppercase as well. 348 00:17:37,700 --> 00:17:39,990 If I upgrade this slightly further-- 349 00:17:39,990 --> 00:17:44,300 let's go ahead and take it one more level and, perhaps, do, let's say, 350 00:17:44,300 --> 00:17:47,870 ascii letters, and digits, and punctuation. 351 00:17:47,870 --> 00:17:50,550 And this would be the Pythonic way to say that. 352 00:17:50,550 --> 00:17:53,600 And I'm going to add to those letters those same digits, 353 00:17:53,600 --> 00:17:55,940 those same punctuation symbols. 354 00:17:55,940 --> 00:17:58,700 Let me shrink my font just so the code still fits on the screen. 355 00:17:58,700 --> 00:18:01,880 And what we now have is, with two seconds of changes, 356 00:18:01,880 --> 00:18:04,490 a program that if I run this version-- 357 00:18:04,490 --> 00:18:08,100 whoops-- without the typographical error-- 358 00:18:08,100 --> 00:18:10,520 this is what we call in CS50 a bug. 359 00:18:10,520 --> 00:18:12,780 So now, we run the same-- 360 00:18:12,780 --> 00:18:17,030 this is what we call in CS50 a second bug-- 361 00:18:17,030 --> 00:18:17,846 punctuation. 362 00:18:17,846 --> 00:18:20,490 363 00:18:20,490 --> 00:18:21,865 This is where I cross my fingers. 364 00:18:21,865 --> 00:18:22,365 OK. 
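The change described above, swapping the digits for letters and then adding digits and punctuation, can be sketched as follows. The constants `ascii_letters`, `digits`, and `punctuation` are real names in Python's `string` module. The `alphabet` variable and the `first_passcodes` helper are assumptions made for this sketch, and `islice` is used only to look at the first few codes rather than print tens of millions of them.

```python
from itertools import islice, product
from string import ascii_letters, digits, punctuation

# 52 letters + 10 digits + 32 punctuation symbols
alphabet = ascii_letters + digits + punctuation


def first_passcodes(length, count):
    """Return the first `count` passcodes of the given length, generating no more."""
    combos = islice(product(alphabet, repeat=length), count)
    return ["".join(combo) for combo in combos]


print(len(alphabet))          # 94
print(first_passcodes(4, 3))  # ['aaaa', 'aaab', 'aaac']
```

Because `ascii_letters` lists the lowercase letters first, the enumeration starts in lowercase, which is why the output on screen works through lowercase before reaching uppercase, digits, and punctuation.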
365 00:18:22,365 --> 00:18:25,440 So now it's going to be a little hard to see as it flies across the screen. 366 00:18:25,440 --> 00:18:28,700 But you probably are seeing glimpses of some weird punctuation characters 367 00:18:28,700 --> 00:18:29,250 as well. 368 00:18:29,250 --> 00:18:32,270 And I won't waste our time trying to talk through this because this 369 00:18:32,270 --> 00:18:33,470 is going to take longer. 370 00:18:33,470 --> 00:18:34,700 We're still in the lowercase. 371 00:18:34,700 --> 00:18:35,930 I'm still over here already. 372 00:18:35,930 --> 00:18:39,290 We've not even gotten to N, now O, then P. 373 00:18:39,290 --> 00:18:40,790 So this is going to run longer. 374 00:18:40,790 --> 00:18:44,630 But let's end with one final question on the security of all these systems. 375 00:18:44,630 --> 00:18:47,540 I'm going to cancel that by hitting Control C on my keyboard. 376 00:18:47,540 --> 00:18:51,950 And let's ask the question instead, if we use eight-character passwords, 377 00:18:51,950 --> 00:18:53,360 so twice as many characters. 378 00:18:53,360 --> 00:18:55,790 But even that is not terribly long. 379 00:18:55,790 --> 00:18:58,670 This is eight characters alone on the stage, eight characters. 380 00:18:58,670 --> 00:19:01,100 Using letters, numbers, and punctuation might be better. 381 00:19:01,100 --> 00:19:03,530 Let's do one final vote here, if we could. 382 00:19:03,530 --> 00:19:07,700 On your same device, how many eight-character possibilities 383 00:19:07,700 --> 00:19:12,800 are there now for these passcodes? 384 00:19:12,800 --> 00:19:17,000 And now four didn't even make the list this time. 385 00:19:17,000 --> 00:19:21,363 All right, a few more seconds, about 100 responses so far. 386 00:19:21,363 --> 00:19:23,780 How about we go ahead-- and, Carter, if you wouldn't mind, 387 00:19:23,780 --> 00:19:28,220 let's reveal the results based on the vote, a pretty decent spread here. 
388 00:19:28,220 --> 00:19:30,317 Although the quadrillions are quickly buzzing in. 389 00:19:30,317 --> 00:19:32,150 And they're contending with the others here. 390 00:19:32,150 --> 00:19:34,700 Looks like 44% of you said quintillion. 391 00:19:34,700 --> 00:19:36,740 34% said quadrillion. 392 00:19:36,740 --> 00:19:39,800 And this time, for the first time, you overbid. 393 00:19:39,800 --> 00:19:44,270 So indeed, if we go back to the math here, at least, the majority overbid. 394 00:19:44,270 --> 00:19:46,160 If we have eight-character passcodes, that 395 00:19:46,160 --> 00:19:50,600 gives us 94 multiplied by itself eight times or 94 to the eighth power. 396 00:19:50,600 --> 00:20:01,430 And in fact, that gives us roughly 6,095,689,385,410,816 397 00:20:01,430 --> 00:20:02,630 possible passcodes. 398 00:20:02,630 --> 00:20:04,140 Now, what does that mean? 399 00:20:04,140 --> 00:20:07,700 Well, the adversary's algorithm, the step-by-step code 400 00:20:07,700 --> 00:20:10,580 that they write to try to hack into your phone, is no different. 401 00:20:10,580 --> 00:20:13,670 And honestly, if your passcode is eight characters long, 402 00:20:13,670 --> 00:20:18,500 but it's 00000000, you're no more secure fundamentally. 403 00:20:18,500 --> 00:20:22,010 You really want to be somewhere in the sweet spot of that massive range 404 00:20:22,010 --> 00:20:25,200 of values, so that if the adversary tries this brute force attack just 405 00:20:25,200 --> 00:20:28,590 running through all possibilities, they will eventually reach 406 00:20:28,590 --> 00:20:31,320 your passcode just mathematically. 407 00:20:31,320 --> 00:20:32,400 It will be there. 408 00:20:32,400 --> 00:20:34,920 Hopefully, though-- well, maybe not hopefully-- you 409 00:20:34,920 --> 00:20:37,380 and I and they will be gone from this world 410 00:20:37,380 --> 00:20:39,840 because that much time will have passed.
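The arithmetic above can be checked in a few lines. This is a rough sketch that assumes the adversary manages one guess per second with no lockouts; the variable names are illustrative.

```python
SYMBOLS = 94  # 52 letters + 10 digits + 32 punctuation marks
LENGTH = 8    # characters in the passcode

possibilities = SYMBOLS ** LENGTH
print(f"{possibilities:,}")  # 6,095,689,385,410,816

# Assumption: one guess per second, every second, with no delays imposed.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
worst_case_years = possibilities / SECONDS_PER_YEAR
print(f"roughly {worst_case_years:,.0f} years to try them all")
```

On average an attacker would find a randomly chosen passcode about halfway through that range, which is still far longer than a human lifetime at this guessing rate.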
411 00:20:39,840 --> 00:20:44,440 And if we do out the math here, this number of seconds, for instance, 412 00:20:44,440 --> 00:20:48,940 is long past when we will no longer be here. 413 00:20:48,940 --> 00:20:50,400 So that's the sort of measures. 414 00:20:50,400 --> 00:20:53,460 We don't sort of fundamentally change the equation for the adversary. 415 00:20:53,460 --> 00:20:54,670 It's still the same risk. 416 00:20:54,670 --> 00:20:55,795 It's still the same attack. 417 00:20:55,795 --> 00:20:59,370 But you significantly drive down the probability of success on their part. 418 00:20:59,370 --> 00:21:02,880 Or conceptually, you drive up the cost to the adversary. 419 00:21:02,880 --> 00:21:05,850 And indeed, even in the physical world, this is true. 420 00:21:05,850 --> 00:21:08,310 You just want your passcode in the digital world 421 00:21:08,310 --> 00:21:11,220 really to be better than someone else's because you 422 00:21:11,220 --> 00:21:12,985 want someone else's passcode to be the one 423 00:21:12,985 --> 00:21:14,610 that the adversary does something with. 424 00:21:14,610 --> 00:21:18,000 Just like in the physical world, even though it's a bit uncomfortable 425 00:21:18,000 --> 00:21:21,990 to consider, your house doesn't need to be 100% secure. 426 00:21:21,990 --> 00:21:24,060 And indeed, it's difficult to make it such. 427 00:21:24,060 --> 00:21:26,100 There's always going to be a point of weakness. 428 00:21:26,100 --> 00:21:28,810 Maybe it's that window, the door, or something like that. 429 00:21:28,810 --> 00:21:33,060 But if your home is more secure than the next door home, just probabilistically, 430 00:21:33,060 --> 00:21:34,710 you are more secure. 431 00:21:34,710 --> 00:21:35,760 You're not secure. 432 00:21:35,760 --> 00:21:37,710 And indeed, any website you see down the road 433 00:21:37,710 --> 00:21:41,400 that says, we are secure because we do X, Y, or Z, that's nonsense. 
434 00:21:41,400 --> 00:21:44,910 Security is really about comparisons and evaluating things 435 00:21:44,910 --> 00:21:49,870 if quantitatively relative to some other system, relative to some other code. 436 00:21:49,870 --> 00:21:51,520 So what's the takeaway here? 437 00:21:51,520 --> 00:21:55,320 Well, hopefully, a non-trivial number of you will go home this weekend on Monday 438 00:21:55,320 --> 00:21:57,803 and change at least one passcode. 439 00:21:57,803 --> 00:21:59,470 But there's going to be a tradeoff here. 440 00:21:59,470 --> 00:22:01,260 We talk about this all the time in CS50. 441 00:22:01,260 --> 00:22:06,210 Any time we improve something, we pay some price in time, in performance, 442 00:22:06,210 --> 00:22:07,360 in cost, somewhere else. 443 00:22:07,360 --> 00:22:10,470 So what's the downside then of this advice that you should use 444 00:22:10,470 --> 00:22:13,710 minimally eight-character passcodes? 445 00:22:13,710 --> 00:22:16,350 Why might you want to say nay and not do this? 446 00:22:16,350 --> 00:22:17,835 AUDIENCE: You have to remember it. 447 00:22:17,835 --> 00:22:18,390 DAVID J. MALAN: Say again? 448 00:22:18,390 --> 00:22:19,350 AUDIENCE: You have to remember it. 449 00:22:19,350 --> 00:22:21,308 DAVID J. MALAN: You have to remember it, right? 450 00:22:21,308 --> 00:22:23,050 And so here, there is some sociology. 451 00:22:23,050 --> 00:22:24,570 There's some human behavior. 452 00:22:24,570 --> 00:22:25,980 Some of you might have colleagues, if you're 453 00:22:25,980 --> 00:22:27,688 working in the real world, at least, back 454 00:22:27,688 --> 00:22:30,510 in healthier times when you had colleagues with desks in cubicles. 455 00:22:30,510 --> 00:22:33,218 And there's probably one person in the office with a post-it note 456 00:22:33,218 --> 00:22:35,370 on their monitor with their passcode. 
457 00:22:35,370 --> 00:22:41,160 It's a bit of a cybersecurity offense, but it's also a real world side 458 00:22:41,160 --> 00:22:44,250 effect, maybe of corporate policies, that aren't really calibrated 459 00:22:44,250 --> 00:22:45,640 for human behavior. 460 00:22:45,640 --> 00:22:47,610 So we'll see if there's some other defenses. 461 00:22:47,610 --> 00:22:49,650 And indeed, let me propose that we talk briefly 462 00:22:49,650 --> 00:22:52,530 about one that actually tends to kick in automatically. 463 00:22:52,530 --> 00:22:54,840 Even if your passcode is not as strong as we've just 464 00:22:54,840 --> 00:22:57,940 seen, one of these six quadrillion possibilities, well, 465 00:22:57,940 --> 00:22:59,260 what could we do instead? 466 00:22:59,260 --> 00:23:01,590 Well, has anyone-- and I'll zoom in on this here-- 467 00:23:01,590 --> 00:23:05,940 accidentally locked themselves out of their own phone before? 468 00:23:05,940 --> 00:23:07,380 When does that happen? 469 00:23:07,380 --> 00:23:08,828 Yeah, when you try the password-- 470 00:23:08,828 --> 00:23:09,870 AUDIENCE: Too many times. 471 00:23:09,870 --> 00:23:11,020 DAVID J. MALAN: Yeah, so too many times. 472 00:23:11,020 --> 00:23:12,820 Maybe your finger is slightly off. 473 00:23:12,820 --> 00:23:17,010 Maybe you're slightly off, and you just don't input the same passcode correctly 474 00:23:17,010 --> 00:23:18,803 after five times, 10 times. 475 00:23:18,803 --> 00:23:20,220 There's some reasonable threshold. 476 00:23:20,220 --> 00:23:21,580 And why does that happen? 477 00:23:21,580 --> 00:23:24,270 Well, Apple and Google equivalently figure just 478 00:23:24,270 --> 00:23:27,060 probabilistically if after 10 guesses, you still 479 00:23:27,060 --> 00:23:29,932 haven't typed in the right passcode, probably, you're not you. 480 00:23:29,932 --> 00:23:31,890 You're someone else who's picked up your phone, 481 00:23:31,890 --> 00:23:33,660 so we're just going to go ahead and lock you out. 
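The lockout behavior just described can be sketched as a small toy model. The class name `PasscodeLock`, the threshold of 10 attempts, and the 60-second delay are assumptions for illustration, not how any particular phone implements it.

```python
class PasscodeLock:
    """Toy model of a device that locks after too many wrong guesses."""

    def __init__(self, passcode, max_attempts=10, lockout_seconds=60):
        self._passcode = passcode
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self.failures = 0
        self.locked_until = 0.0

    def try_unlock(self, guess, now):
        """Return True on success; `now` is a timestamp in seconds."""
        if now < self.locked_until:
            return False  # still locked out, so the guess is not even checked
        if guess == self._passcode:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.max_attempts:
            # Too many misses in a row: refuse all guesses for a while.
            self.locked_until = now + self.lockout_seconds
            self.failures = 0
        return False
```

Passing `now` explicitly keeps the sketch easy to test; a real device would read its own clock and would likely lengthen the delay after repeated lockouts.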
482 00:23:33,660 --> 00:23:35,200 Now, what's the effect of this? 483 00:23:35,200 --> 00:23:37,860 Well, this means now that each of those possible passcodes 484 00:23:37,860 --> 00:23:39,870 no longer takes roughly one second. 485 00:23:39,870 --> 00:23:41,890 Now it takes roughly one minute. 486 00:23:41,890 --> 00:23:43,360 So the attack is still the same. 487 00:23:43,360 --> 00:23:46,560 But if it's now one passcode or 10 guesses per minute, 488 00:23:46,560 --> 00:23:50,680 we have significantly by a factor of 60 in this story slowed things down. 489 00:23:50,680 --> 00:23:52,680 And unfortunately, does anyone know what happens 490 00:23:52,680 --> 00:23:54,967 if you screw up again after a minute? 491 00:23:54,967 --> 00:23:56,160 AUDIENCE: It goes longer. 492 00:23:56,160 --> 00:23:57,120 DAVID J. MALAN: Yeah, it goes longer. 493 00:23:57,120 --> 00:23:59,100 It's like five minutes and then 10 minutes. 494 00:23:59,100 --> 00:24:01,477 And Google is kind of obnoxious about it. 495 00:24:01,477 --> 00:24:03,060 They don't even give you a time frame. 496 00:24:03,060 --> 00:24:05,410 They just say, try again later. 497 00:24:05,410 --> 00:24:09,450 And so that keeps not only the adversary out, but also potentially you. 498 00:24:09,450 --> 00:24:11,017 So therein lies that tradeoff. 499 00:24:11,017 --> 00:24:14,100 If you've forgotten your code, if-- nowadays, your finger is slightly wet, 500 00:24:14,100 --> 00:24:16,230 so the screen isn't responding correctly. 501 00:24:16,230 --> 00:24:18,540 These could be usability downsides too. 502 00:24:18,540 --> 00:24:21,810 So security is really just about finding the sweet spot 503 00:24:21,810 --> 00:24:24,180 among these various tradeoffs here. 504 00:24:24,180 --> 00:24:25,860 But there's other mechanisms too. 505 00:24:25,860 --> 00:24:30,030 And some of you might recognize this screen from Gmail via which, of course, 506 00:24:30,030 --> 00:24:30,660 you log in. 
507 00:24:30,660 --> 00:24:36,030 But after you log into Gmail or similar websites, or apps, or systems at work 508 00:24:36,030 --> 00:24:39,780 nowadays, especially, you might be presented with what's 509 00:24:39,780 --> 00:24:42,240 called two-factor authentication. 510 00:24:42,240 --> 00:24:46,110 And what is this in a nutshell in layperson's terms? 511 00:24:46,110 --> 00:24:48,597 Many of you, if you do anything digitally at work, 512 00:24:48,597 --> 00:24:49,680 might have to do this now. 513 00:24:49,680 --> 00:24:50,180 Yeah? 514 00:24:50,180 --> 00:24:51,860 AUDIENCE: It sends a text to your phone. 515 00:24:51,860 --> 00:24:52,860 DAVID J. MALAN: Exactly. 516 00:24:52,860 --> 00:24:56,040 You get texted at your phone, an additional code that's 517 00:24:56,040 --> 00:24:57,180 not your same password. 518 00:24:57,180 --> 00:24:59,580 It's typically a numeric code, maybe six digits long. 519 00:24:59,580 --> 00:25:01,510 It expires after a minute or 10 minutes. 520 00:25:01,510 --> 00:25:03,030 But why is this a good thing? 521 00:25:03,030 --> 00:25:06,795 Well, one, it's no longer just a piece of information that you know 522 00:25:06,795 --> 00:25:08,760 or that you might have written down. 523 00:25:08,760 --> 00:25:12,180 It's information that changes every time you try to log in. 524 00:25:12,180 --> 00:25:15,100 But more importantly, it's a fundamentally second factor, which 525 00:25:15,100 --> 00:25:16,725 means it's not just something you know. 526 00:25:16,725 --> 00:25:18,180 Now it's something you have. 527 00:25:18,180 --> 00:25:20,832 So you, for instance, are the only one theoretically 528 00:25:20,832 --> 00:25:22,290 that should be receiving that code. 529 00:25:22,290 --> 00:25:24,960 And so now the adversary, if they want to get into your account, 530 00:25:24,960 --> 00:25:28,800 not only have to guess, or brute force, or maybe read off of a post-it 531 00:25:28,800 --> 00:25:29,880 note your password. 
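A minimal sketch of the texted code described above: a short-lived six-digit number generated fresh for each login. The function names and the 60-second lifetime are assumptions; real services may use other schemes entirely.

```python
import hmac
import secrets
import time

CODE_LIFETIME_SECONDS = 60  # assumption; the talk mentions one to ten minutes


def issue_code(now=None):
    """Return a fresh six-digit code and the time at which it expires."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # zero-padded, e.g. "004217"
    return code, now + CODE_LIFETIME_SECONDS


def verify_code(submitted, issued, expires_at, now=None):
    """Accept the code only if it matches and has not yet expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(submitted, issued)
```

The code changes on every login attempt, so a value read off a sticky note or guessed last week is useless; what matters is possession of the device that receives it.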
532 00:25:29,880 --> 00:25:32,850 They also have to physically have access now to that phone. 533 00:25:32,850 --> 00:25:35,940 So there's still a threat, absolutely, but it's not everyone 534 00:25:35,940 --> 00:25:38,080 on the internet with an internet connection. 535 00:25:38,080 --> 00:25:39,900 Now it's only the people in Starbucks. 536 00:25:39,900 --> 00:25:41,430 Now it's only the people at work. 537 00:25:41,430 --> 00:25:44,400 Now it's only the people in your home who might have access 538 00:25:44,400 --> 00:25:45,400 to that second factor. 539 00:25:45,400 --> 00:25:48,690 So there too, it just raises the bar to the adversary making it harder, 540 00:25:48,690 --> 00:25:52,390 more time consuming, more geographically impossible for them to attack you. 541 00:25:52,390 --> 00:25:54,970 But what's the downside of two-factor authentication, 542 00:25:54,970 --> 00:25:56,850 whether it's a device-- or even nowadays, 543 00:25:56,850 --> 00:25:59,232 it's in software, whether it's on your keychain 544 00:25:59,232 --> 00:26:01,440 or on your phone where you're prompted for this code. 545 00:26:01,440 --> 00:26:05,910 What's a downside that some of us have probably experienced too? 546 00:26:05,910 --> 00:26:07,243 AUDIENCE: You forget your phone. 547 00:26:07,243 --> 00:26:09,535 DAVID J. MALAN: You forget your cell phone, absolutely. 548 00:26:09,535 --> 00:26:11,910 Right, the factor that you have, you don't have with you. 549 00:26:11,910 --> 00:26:14,040 Or maybe, you're in a basement somewhere, don't have reception. 550 00:26:14,040 --> 00:26:14,790 You're on a plane. 551 00:26:14,790 --> 00:26:15,850 You can't get the code. 552 00:26:15,850 --> 00:26:17,850 And so there too are these tradeoffs. 553 00:26:17,850 --> 00:26:19,920 And even IT departments need to keep that in mind 554 00:26:19,920 --> 00:26:21,462 because what does that mean for them? 
555 00:26:21,462 --> 00:26:23,370 Well, if you don't have your phone with you 556 00:26:23,370 --> 00:26:25,980 and you are in the habit of calling IT to help you fix this, 557 00:26:25,980 --> 00:26:29,740 now there's a cost, a human cost, maybe even a financial cost. 558 00:26:29,740 --> 00:26:34,022 And so IT policy nowadays is really just about finding the right balance 559 00:26:34,022 --> 00:26:35,730 and where we want to spend our resources, 560 00:26:35,730 --> 00:26:38,850 but at least raise the bar to the adversary. 561 00:26:38,850 --> 00:26:41,280 But of course, there's other ways too. 562 00:26:41,280 --> 00:26:43,680 And this is going to be one of our homework assignments 563 00:26:43,680 --> 00:26:44,880 if you will after today. 564 00:26:44,880 --> 00:26:46,920 There's this software called password managers. 565 00:26:46,920 --> 00:26:50,280 And no need to buzz in on your phone, but maybe with a physical hand. 566 00:26:50,280 --> 00:26:54,330 How many folks here use a password manager? 567 00:26:54,330 --> 00:26:57,660 OK, let me ballpark this at 10%, 20%, perhaps. 568 00:26:57,660 --> 00:27:00,960 So we've got 80% upside here and a lesson learned potentially. 569 00:27:00,960 --> 00:27:03,450 So a password manager is just a piece of software 570 00:27:03,450 --> 00:27:07,287 on your Mac, your PC, or your phone nowadays that manages your passwords. 571 00:27:07,287 --> 00:27:08,370 Well, what does that mean? 572 00:27:08,370 --> 00:27:10,433 When you go to a website for the first time 573 00:27:10,433 --> 00:27:13,600 or you download an app for the first time and you have to create an account, 574 00:27:13,600 --> 00:27:16,918 you can still use your email address, or David as your username, 575 00:27:16,918 --> 00:27:18,210 or whatever your name might be. 576 00:27:18,210 --> 00:27:20,250 So you don't have to change that methodology. 
577 00:27:20,250 --> 00:27:26,400 But instead of typing in 123456 as your same password for that website or app 578 00:27:26,400 --> 00:27:30,360 as well as for every other, now you use the password manager software 579 00:27:30,360 --> 00:27:33,570 to generate something difficult to guess for you. 580 00:27:33,570 --> 00:27:37,350 That is you tell the password manager, give me an eight-character random 581 00:27:37,350 --> 00:27:41,260 passcode, not 0000, but something with punctuation, with numbers, 582 00:27:41,260 --> 00:27:41,850 with letters. 583 00:27:41,850 --> 00:27:44,670 And better yet, the password manager, as the name suggests, 584 00:27:44,670 --> 00:27:46,857 remembers that password for you. 585 00:27:46,857 --> 00:27:48,690 And the next time you go to another website, 586 00:27:48,690 --> 00:27:50,898 you do it again with a completely different password, 587 00:27:50,898 --> 00:27:54,300 maybe same username, maybe two-factor authentication, but different password, 588 00:27:54,300 --> 00:27:56,040 different password, different password. 589 00:27:56,040 --> 00:27:57,373 And it doesn't have to be eight. 590 00:27:57,373 --> 00:28:01,410 I mean, I'm in the habit of using a dozen, two dozen characters in total. 591 00:28:01,410 --> 00:28:04,440 And at that point, I can't even pronounce the number of possibilities 592 00:28:04,440 --> 00:28:07,560 because it goes well beyond the quadrillions. 593 00:28:07,560 --> 00:28:10,290 So the probability that someone's going to get into one of those 594 00:28:10,290 --> 00:28:13,170 accounts for me now is very, very, very low. 595 00:28:13,170 --> 00:28:16,140 And they're going to take less interest in me and maybe more interest 596 00:28:16,140 --> 00:28:18,660 in someone else that's not using as good of a password. 597 00:28:18,660 --> 00:28:20,380 Now, what does this mean in real terms? 
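What a password manager does when asked for a random password can be sketched with Python's `secrets` module. The function name `generate_password`, the 24-character default, and the site names are illustrative assumptions, not any product's actual implementation.

```python
import secrets
from string import ascii_letters, digits, punctuation

ALPHABET = ascii_letters + digits + punctuation  # 94 symbols


def generate_password(length=24):
    """Pick each character independently with a cryptographically strong RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# A different password per site; the site names are placeholders.
vault = {site: generate_password() for site in ("example-bank", "example-mail")}
```

Using `secrets` rather than `random` matters here, since `random` is designed for simulations and is not intended for security purposes.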
598 00:28:20,380 --> 00:28:23,250 Well, when you go to log into that managed site, 599 00:28:23,250 --> 00:28:26,297 you don't manually type your password anymore. 600 00:28:26,297 --> 00:28:28,380 In fact, you don't generally even need to know it. 601 00:28:28,380 --> 00:28:32,820 Nowadays, I probably don't know 90-plus, 99% of my passwords. 602 00:28:32,820 --> 00:28:35,190 I entrust them to this password manager. 603 00:28:35,190 --> 00:28:40,140 Now, of course, you'd like to think that the password manager itself is secure. 604 00:28:40,140 --> 00:28:41,620 So what might that mean? 605 00:28:41,620 --> 00:28:43,770 Well, those of you who do use a password manager, 606 00:28:43,770 --> 00:28:46,650 how do you access that software itself? 607 00:28:46,650 --> 00:28:49,615 What's protecting your data in your understanding? 608 00:28:49,615 --> 00:28:50,490 AUDIENCE: Biometrics. 609 00:28:50,490 --> 00:28:53,370 DAVID J. MALAN: So maybe biometrics, like your face ID, or maybe 610 00:28:53,370 --> 00:28:55,645 your fingerprint, or maybe more simply, what else? 611 00:28:55,645 --> 00:28:56,520 AUDIENCE: A password. 612 00:28:56,520 --> 00:28:58,103 DAVID J. MALAN: Maybe just a password. 613 00:28:58,103 --> 00:29:02,790 And hopefully, that password, that primary password, that gatekeeper, 614 00:29:02,790 --> 00:29:05,190 is not itself 123456. 615 00:29:05,190 --> 00:29:08,520 Otherwise, it doesn't matter how secure all of the others are.
616 00:29:08,520 --> 00:29:12,240 But if you're willing to put in the effort and pick one pretty long 617 00:29:12,240 --> 00:29:15,390 somewhat random very unguessable password that you just 618 00:29:15,390 --> 00:29:17,070 promise to commit to memory-- 619 00:29:17,070 --> 00:29:19,140 and maybe for backup, you literally print it out 620 00:29:19,140 --> 00:29:21,180 and put it in a safe deposit box or a safe, 621 00:29:21,180 --> 00:29:23,490 or just hide it somewhere physically that there's 622 00:29:23,490 --> 00:29:25,350 very low probability someone's going to find 623 00:29:25,350 --> 00:29:27,630 the backup copy, that might be alone. 624 00:29:27,630 --> 00:29:31,980 But of course, the flip side is now if you forget that primary password, 625 00:29:31,980 --> 00:29:34,590 you've now lost all of the eggs in the basket. 626 00:29:34,590 --> 00:29:38,830 If someone gets that primary password, now they have access to everything. 627 00:29:38,830 --> 00:29:40,350 So that's rather the tradeoff. 628 00:29:40,350 --> 00:29:45,030 But I dare say you're probably less threatened, depending on your family, 629 00:29:45,030 --> 00:29:48,090 by the people immediately around you than the billions 630 00:29:48,090 --> 00:29:51,420 of other people on the internet that have access, potentially, 631 00:29:51,420 --> 00:29:52,648 to those same systems. 632 00:29:52,648 --> 00:29:53,940 So there, too, it's a tradeoff. 633 00:29:53,940 --> 00:29:55,732 But it's up to you to decide whether or not 634 00:29:55,732 --> 00:29:57,880 to manage your passwords in this way. 635 00:29:57,880 --> 00:30:00,600 But if you were on that top 10 list, or even if you're not, 636 00:30:00,600 --> 00:30:04,110 but you can think of several accounts that all have the same password, 637 00:30:04,110 --> 00:30:06,550 you're probably going to benefit from something like this. 
638 00:30:06,550 --> 00:30:11,310 And why is it bad, to be clear, to use the same password on multiple sites 639 00:30:11,310 --> 00:30:15,780 in case that's never sort of dawned on you? 640 00:30:15,780 --> 00:30:18,480 Why is that a bad thing, to reuse a password 641 00:30:18,480 --> 00:30:20,340 on different websites, different apps? 642 00:30:20,340 --> 00:30:21,270 Any intuition? 643 00:30:21,270 --> 00:30:22,050 Yeah, in the back? 644 00:30:22,050 --> 00:30:24,550 AUDIENCE: Once attacked, it's easy to get to. 645 00:30:24,550 --> 00:30:25,550 DAVID J. MALAN: Exactly. 646 00:30:25,550 --> 00:30:27,170 Once it's attacked, you can-- 647 00:30:27,170 --> 00:30:29,270 the adversary, presumably, by transitivity, 648 00:30:29,270 --> 00:30:31,880 can see, oh, well, if this user's username is 649 00:30:31,880 --> 00:30:35,030 malan@harvard.edu on this website, and their password is foolishly 650 00:30:35,030 --> 00:30:38,840 123456 or even something way more complicated, 651 00:30:38,840 --> 00:30:41,060 they can probably just assume with high probability 652 00:30:41,060 --> 00:30:42,860 that if I'm being a little reckless, let's 653 00:30:42,860 --> 00:30:47,210 try accessing malan@harvard.edu's other accounts, other apps using 654 00:30:47,210 --> 00:30:48,358 that exact same password. 655 00:30:48,358 --> 00:30:50,150 And so by transitivity, essentially, you're 656 00:30:50,150 --> 00:30:52,530 putting your other accounts at risk. 657 00:30:52,530 --> 00:30:53,780 So what's maybe a takeaway? 658 00:30:53,780 --> 00:30:56,780 Minimally here, I would start to reconsider your passcodes 659 00:30:56,780 --> 00:30:58,020 on your most important data. 660 00:30:58,020 --> 00:30:58,730 Maybe it's medical. 661 00:30:58,730 --> 00:30:59,600 Maybe it's financial. 662 00:30:59,600 --> 00:31:00,308 Maybe it's email. 663 00:31:00,308 --> 00:31:04,070 Anything remotely personal that you really wouldn't want others to have access to.
664 00:31:04,070 --> 00:31:06,890 Do you necessarily need the same level of security 665 00:31:06,890 --> 00:31:10,130 on e-commerce sites or sites that you don't really care about 666 00:31:10,130 --> 00:31:13,080 or that you signed up for once and after that, that's it? 667 00:31:13,080 --> 00:31:13,950 Probably not. 668 00:31:13,950 --> 00:31:15,960 So you can decide for yourself, but again, 669 00:31:15,960 --> 00:31:17,570 software like a password manager-- 670 00:31:17,570 --> 00:31:19,570 and these are just some of the possibilities out 671 00:31:19,570 --> 00:31:21,965 there-- is probably going to be your friend. 672 00:31:21,965 --> 00:31:23,090 A couple of these are free. 673 00:31:23,090 --> 00:31:24,550 They come with Windows or macOS. 674 00:31:24,550 --> 00:31:25,550 A couple are commercial. 675 00:31:25,550 --> 00:31:29,070 Harvard has a site license for students for one of these as well. 676 00:31:29,070 --> 00:31:30,770 So there are options out there. 677 00:31:30,770 --> 00:31:32,420 But what else do people use? 678 00:31:32,420 --> 00:31:35,060 What else can people use to keep their systems secure? 679 00:31:35,060 --> 00:31:39,590 So most of us nowadays have probably heard of encryption, this technique 680 00:31:39,590 --> 00:31:41,400 for just scrambling information. 681 00:31:41,400 --> 00:31:45,200 So when you want to send a message, an email, or upload a photograph, 682 00:31:45,200 --> 00:31:49,085 or use your credit card, hopefully, it's not just being sent out for all to see, 683 00:31:49,085 --> 00:31:50,960 but there's some kind of scrambling going on. 684 00:31:50,960 --> 00:31:53,600 And some fancy mathematics behind that encryption 685 00:31:53,600 --> 00:31:57,200 ensures that only you, the sender, and someone else, the receiver, 686 00:31:57,200 --> 00:32:01,580 can theoretically see what that credit card number is, what that message is, 687 00:32:01,580 --> 00:32:03,330 what that photograph is instead.
688 00:32:03,330 --> 00:32:06,200 So encryption is sort of commonplace nowadays, 689 00:32:06,200 --> 00:32:09,555 both in websites, and apps, and ATMs, and other such devices. 690 00:32:09,555 --> 00:32:10,430 But how does it work? 691 00:32:10,430 --> 00:32:13,238 Well, back in week two of CS50, your child 692 00:32:13,238 --> 00:32:15,530 learned a little something about encryption, otherwise, 693 00:32:15,530 --> 00:32:16,640 known as cryptography. 694 00:32:16,640 --> 00:32:18,770 And one of the algorithms we talked about 695 00:32:18,770 --> 00:32:21,160 was quite simply something like this. 696 00:32:21,160 --> 00:32:25,952 This is what we might call, not only CS50, but plain text, so very plain 697 00:32:25,952 --> 00:32:28,910 text that, in this case, is English and obviously, everyone in the room 698 00:32:28,910 --> 00:32:29,670 can read it. 699 00:32:29,670 --> 00:32:32,600 But what if I wanted to send this message out to someone in this room, 700 00:32:32,600 --> 00:32:35,720 or out on the internet, or maybe equivalently back in the day, 701 00:32:35,720 --> 00:32:38,690 maybe write a message down on a scrap of paper in grade school 702 00:32:38,690 --> 00:32:42,560 and pass a secret note, a secret love note to someone in class with hopes 703 00:32:42,560 --> 00:32:44,990 that the teacher or any other students in the class 704 00:32:44,990 --> 00:32:46,400 can't intercept it and read it? 705 00:32:46,400 --> 00:32:49,880 Well, you probably don't want to say, this is CS50, or I love you, 706 00:32:49,880 --> 00:32:52,280 or anything remotely sensitive. 707 00:32:52,280 --> 00:32:54,290 But rather, maybe you want to encrypt it. 708 00:32:54,290 --> 00:32:56,510 And let's change the T to a U. 709 00:32:56,510 --> 00:33:01,430 Maybe change the H to an I, the I to a J, the S to a T, the I to a J, 710 00:33:01,430 --> 00:33:04,580 the S to a T again, the C to a D, the S to a T. 
711 00:33:04,580 --> 00:33:06,440 And we'll just leave the numbers alone, even 712 00:33:06,440 --> 00:33:09,330 though I worry someone could probably guess what this now does say, 713 00:33:09,330 --> 00:33:10,070 nonetheless. 714 00:33:10,070 --> 00:33:13,820 But what was the algorithm as I rattled those changes off, 715 00:33:13,820 --> 00:33:17,715 whether a student from week two or parents from week now? 716 00:33:17,715 --> 00:33:18,215 Yeah? 717 00:33:18,215 --> 00:33:19,730 AUDIENCE: It's a one-letter shift. 718 00:33:19,730 --> 00:33:21,397 DAVID J. MALAN: Just a one-letter shift. 719 00:33:21,397 --> 00:33:25,940 And this is more sophisticatedly called a rotational cipher or a Caesar cipher 720 00:33:25,940 --> 00:33:27,320 after Caesar back in the day. 721 00:33:27,320 --> 00:33:28,760 It's relatively simplistic. 722 00:33:28,760 --> 00:33:31,040 But back in the day, it's not so simplistic 723 00:33:31,040 --> 00:33:34,430 if you're the first person in the world to ever use it or think of it. 724 00:33:34,430 --> 00:33:36,500 But nowadays, this is not actually what we use. 725 00:33:36,500 --> 00:33:38,540 But it's similarly mathematical in nature. 726 00:33:38,540 --> 00:33:41,660 It's not quite as simple as just adding one or subtracting one 727 00:33:41,660 --> 00:33:44,810 to go from now what we call ciphertext to plain text. 728 00:33:44,810 --> 00:33:46,502 But it's similarly math that's involved. 729 00:33:46,502 --> 00:33:48,710 And let me just stipulate that the way the math works 730 00:33:48,710 --> 00:33:51,230 is that the sender and the receiver just have 731 00:33:51,230 --> 00:33:53,390 to have in mind some kind of secret.
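The one-letter shift described above can be sketched as follows. The function names `shift_text`, `encrypt`, and `decrypt` are illustrative choices; the key of 1 and the rule of leaving digits untouched follow the example as spoken.

```python
from string import ascii_lowercase, ascii_uppercase


def shift_text(text, key):
    """Rotate each letter by `key` positions; leave everything else alone."""
    result = []
    for ch in text:
        if ch in ascii_uppercase:
            result.append(ascii_uppercase[(ascii_uppercase.index(ch) + key) % 26])
        elif ch in ascii_lowercase:
            result.append(ascii_lowercase[(ascii_lowercase.index(ch) + key) % 26])
        else:
            result.append(ch)  # digits, spaces, and punctuation pass through
    return "".join(result)


def encrypt(plaintext, key):
    """Shift forward by the shared secret key."""
    return shift_text(plaintext, key)


def decrypt(ciphertext, key):
    """Shift backward by the same key to recover the plaintext."""
    return shift_text(ciphertext, -key)


print(encrypt("THIS IS CS50", 1))  # UIJT JT DT50
```

The modulo by 26 is what makes the cipher rotational: with a key of 1, the letter Z wraps around to A.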
732 00:33:53,390 --> 00:33:56,360 And the secret in this case would very trivially be one, 733 00:33:56,360 --> 00:33:59,270 but it could be a much bigger, much more unguessable number, 734 00:33:59,270 --> 00:34:01,820 or maybe some other secret we share, the presumption 735 00:34:01,820 --> 00:34:05,390 being that my classmates, my teacher in that grade school classroom, if they 736 00:34:05,390 --> 00:34:07,760 don't know what that secret is that number is, yeah, 737 00:34:07,760 --> 00:34:11,659 they could try to brute force it and try all possible mathematics, plus one, 738 00:34:11,659 --> 00:34:12,770 plus two, plus three. 739 00:34:12,770 --> 00:34:14,120 But that's going to take them some time. 740 00:34:14,120 --> 00:34:15,630 And they probably don't care enough. 741 00:34:15,630 --> 00:34:18,750 And so my data might be, therefore, relatively secure. 742 00:34:18,750 --> 00:34:21,150 But we use encryption all the time nowadays. 743 00:34:21,150 --> 00:34:24,050 And so for instance, this is at the start of most URLs 744 00:34:24,050 --> 00:34:26,179 nowadays, even if you don't type it yourself. 745 00:34:26,179 --> 00:34:30,530 With that said, Safari and even Chrome now are kind of simplifying, if not, 746 00:34:30,530 --> 00:34:33,590 dumbing down user interfaces to just hide details 747 00:34:33,590 --> 00:34:37,370 that you and I, as normal users, don't need to see 24/7. 748 00:34:37,370 --> 00:34:38,150 But it is there. 749 00:34:38,150 --> 00:34:41,449 And if in fact, on your phone or laptop, you click on the URL, 750 00:34:41,449 --> 00:34:43,370 even if it's super short initially, you'll 751 00:34:43,370 --> 00:34:46,040 probably see the whole thing starting with this. 752 00:34:46,040 --> 00:34:47,570 And the S means secure. 753 00:34:47,570 --> 00:34:49,550 The S means that encryption is being used. 754 00:34:49,550 --> 00:34:52,520 But there's other forms of this, not just when you visit websites. 
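The brute-force attack on a shift cipher mentioned above (try plus one, plus two, plus three, and so on) can be sketched like this. The helper names are hypothetical, and the attacker is assumed to recognize the right answer by reading the candidates.

```python
from string import ascii_uppercase


def shift_upper(text, key):
    """Rotate uppercase letters by `key`; leave other characters as they are."""
    return "".join(
        ascii_uppercase[(ascii_uppercase.index(ch) + key) % 26]
        if ch in ascii_uppercase
        else ch
        for ch in text
    )


def brute_force(ciphertext):
    """Undo every possible shift from 1 to 25 and return all the candidates."""
    return [(key, shift_upper(ciphertext, -key)) for key in range(1, 26)]


for key, guess in brute_force("UIJT JT DT50")[:3]:
    print(key, guess)
```

With only 25 possible keys the whole search is instantaneous, which is why the secret needs to come from a far larger range than a single-letter shift offers.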
755 00:34:52,520 --> 00:34:55,159 There's this, end-to-end encryption, which 756 00:34:55,159 --> 00:34:59,120 is being talked about more nowadays, especially during COVID times with so 757 00:34:59,120 --> 00:35:02,900 many more of us on video and talking about more sensitive things, 758 00:35:02,900 --> 00:35:05,390 telemedicine, talking to doctors, things that you also 759 00:35:05,390 --> 00:35:09,530 wouldn't want to verbally or visually get out into the wild just like text. 760 00:35:09,530 --> 00:35:13,700 What's different about end-to-end encryption versus HTTPS 761 00:35:13,700 --> 00:35:18,410 and the type of encryption that most of us use every day on websites alone? 762 00:35:18,410 --> 00:35:20,720 End-to-end encryption is sort of a better feature 763 00:35:20,720 --> 00:35:25,150 that you want to increasingly seek when using services like Zoom, or Microsoft 764 00:35:25,150 --> 00:35:29,170 Teams, or WhatsApp, or the like. 765 00:35:29,170 --> 00:35:31,270 Any instincts here? 766 00:35:31,270 --> 00:35:34,905 Yeah, over on the right. 767 00:35:34,905 --> 00:35:38,575 AUDIENCE: The encryption happens in the source and destination. 768 00:35:38,575 --> 00:35:39,450 DAVID J. MALAN: Good. 769 00:35:39,450 --> 00:35:41,492 So the encryption, the scrambling of information, 770 00:35:41,492 --> 00:35:44,790 happens in the source the sender, and the destination, the receiver, 771 00:35:44,790 --> 00:35:47,140 without a so-called middleman in between. 772 00:35:47,140 --> 00:35:49,530 And this is actually very different from most contexts 773 00:35:49,530 --> 00:35:53,310 nowadays that use just HTTPS because when you're using HTTPS 774 00:35:53,310 --> 00:35:57,190 to buy something on Amazon securely with your credit card, well, of course, 775 00:35:57,190 --> 00:36:00,220 Amazon needs to be able to decrypt the message at the end of the day. 776 00:36:00,220 --> 00:36:01,080 And so that's fine. 
777 00:36:01,080 --> 00:36:05,820 But even when you're using services like video conferencing or maybe text 778 00:36:05,820 --> 00:36:09,060 messaging nowadays-- well, if you're using WhatsApp, that's owned by Meta. 779 00:36:09,060 --> 00:36:11,580 And if you're using Instagram, that's owned by Meta. 780 00:36:11,580 --> 00:36:14,920 There's a lot of middlemen in these apps that we're using. 781 00:36:14,920 --> 00:36:17,580 And if they were only using encryption, period, 782 00:36:17,580 --> 00:36:21,660 or only using something like HTTPS, yes, your connection 783 00:36:21,660 --> 00:36:24,630 from you to WhatsApp, and in turn, to the recipient 784 00:36:24,630 --> 00:36:27,750 might very well be secure on each end of that channel. 785 00:36:27,750 --> 00:36:31,680 But Meta in between, the company, and any other company in between 786 00:36:31,680 --> 00:36:34,988 could theoretically, for better or for worse, be looking at that data, 787 00:36:34,988 --> 00:36:37,030 whether it's to mine it for advertising purposes, 788 00:36:37,030 --> 00:36:39,120 whether it's to snoop on data that you're sending. 789 00:36:39,120 --> 00:36:43,140 That is not end-to-end encryption if the middleman, a company, typically, 790 00:36:43,140 --> 00:36:45,270 technically has access to that data. 791 00:36:45,270 --> 00:36:49,080 Now, Zoom, and Microsoft Teams, and WhatsApp, and iMessage, 792 00:36:49,080 --> 00:36:51,510 and other services with which you're familiar increasingly 793 00:36:51,510 --> 00:36:53,670 are offering stronger guarantees of encryption, 794 00:36:53,670 --> 00:36:58,500 whereby it's indeed between parties A and B and not the one in the middle. 795 00:36:58,500 --> 00:36:59,820 Now, there's downsides here. 796 00:36:59,820 --> 00:37:02,580 And you can actually see this kind of functionality manifest 797 00:37:02,580 --> 00:37:03,750 in certain settings.
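The difference between hop-by-hop encryption and end-to-end encryption described above comes down to who holds the keys. Here is a rough sketch in Python, assuming a toy XOR cipher that stands in for real cryptography; the function names are hypothetical, and no real service works exactly this way.

```python
import secrets


def xor_bytes(data, key):
    """Toy cipher for illustration only: XOR each byte with a key byte.
    The key must be at least as long as the data."""
    return bytes(d ^ k for d, k in zip(data, key))


def relay_transport_only(message, key_sender_server, key_server_recipient):
    """Each hop is encrypted, but the server in the middle holds both keys."""
    on_wire_1 = xor_bytes(message, key_sender_server)
    seen_by_server = xor_bytes(on_wire_1, key_sender_server)  # server decrypts
    on_wire_2 = xor_bytes(seen_by_server, key_server_recipient)
    delivered = xor_bytes(on_wire_2, key_server_recipient)
    return seen_by_server, delivered


def relay_end_to_end(message, key_sender_recipient):
    """Only sender and recipient hold the key; the server just passes bytes along."""
    ciphertext = xor_bytes(message, key_sender_recipient)
    seen_by_server = ciphertext  # server has no key, so this stays scrambled
    delivered = xor_bytes(ciphertext, key_sender_recipient)
    return seen_by_server, delivered


if __name__ == "__main__":
    msg = b"see you at noon"
    k1 = secrets.token_bytes(len(msg))
    k2 = secrets.token_bytes(len(msg))
    print(relay_transport_only(msg, k1, k2))  # server sees the plaintext
    print(relay_end_to_end(msg, k1))          # server sees only scrambled bytes
```

In both cases the recipient gets the original message; what changes is what the party in the middle is able to read.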
798 00:37:03,750 --> 00:37:07,620 For instance, besides iMessage, which just does this for you 799 00:37:07,620 --> 00:37:10,890 on iPhones or Macs, besides Zoom, you can actually 800 00:37:10,890 --> 00:37:13,210 fine tune these settings, indeed, within Zoom itself. 801 00:37:13,210 --> 00:37:16,140 So here's a screenshot that I took last night of just what the user 802 00:37:16,140 --> 00:37:19,770 interface looks like today to create a new Zoom meeting with the latest 803 00:37:19,770 --> 00:37:21,060 version of Zoom software. 804 00:37:21,060 --> 00:37:24,870 And maybe unbeknownst to you, there is a choice of buttons down here. 805 00:37:24,870 --> 00:37:28,870 And most likely, yours is, by default, enhanced encryption, 806 00:37:28,870 --> 00:37:31,590 which is brilliant marketing speak because it's just encryption. 807 00:37:31,590 --> 00:37:32,490 It's not enhanced. 808 00:37:32,490 --> 00:37:35,260 It actually ironically means worse than this. 809 00:37:35,260 --> 00:37:37,873 But they want you using it most likely, why? 810 00:37:37,873 --> 00:37:39,540 Well, it's a little easier to implement. 811 00:37:39,540 --> 00:37:41,970 It's a little less expensive for them computationally. 812 00:37:41,970 --> 00:37:46,770 And to be fair, enhanced encryption does scramble the data, but not in a way 813 00:37:46,770 --> 00:37:48,420 that Zoom can't see it. 814 00:37:48,420 --> 00:37:49,650 Zoom can, indeed, see it. 815 00:37:49,650 --> 00:37:51,600 But that's actually a plus in some context 816 00:37:51,600 --> 00:37:54,090 because if you want to do cloud recordings 817 00:37:54,090 --> 00:37:56,340 and you want a meeting recorded not on your Mac or PC, 818 00:37:56,340 --> 00:37:57,810 but let Zoom deal with that. 
819 00:37:57,810 --> 00:38:00,085 If you want automatic transcription nowadays, 820 00:38:00,085 --> 00:38:02,460 so the words appear, whether it's English or something 821 00:38:02,460 --> 00:38:04,320 else on the screen, well, you can't really 822 00:38:04,320 --> 00:38:06,810 lock Zoom or any other middleman out of that 823 00:38:06,810 --> 00:38:08,910 because someone needs to save it to the cloud. 824 00:38:08,910 --> 00:38:12,930 Someone needs to translate the voice to those English or some other language 825 00:38:12,930 --> 00:38:13,480 words. 826 00:38:13,480 --> 00:38:15,660 So enhanced encryption enables those features, 827 00:38:15,660 --> 00:38:18,630 but it also allows a bad actor, a malicious employee, 828 00:38:18,630 --> 00:38:21,690 someone who's just nosey at Zoom or the equivalent middleman 829 00:38:21,690 --> 00:38:25,080 to just poke around your video conference and hear what you've said 830 00:38:25,080 --> 00:38:29,770 or see what you've typed, unless you instead check this box. 831 00:38:29,770 --> 00:38:33,033 So increasingly look for mentions of end-to-end encryption. 832 00:38:33,033 --> 00:38:35,700 Or give that some thought when you choose a technology via which 833 00:38:35,700 --> 00:38:38,760 to communicate with someone, whether it's within your family 834 00:38:38,760 --> 00:38:40,870 or without as well. 835 00:38:40,870 --> 00:38:45,150 Now, last, but not least, there's other applications of encryption too. 836 00:38:45,150 --> 00:38:49,600 And this, too, might be a lesson learned as well, full disk encryption. 837 00:38:49,600 --> 00:38:53,640 So a disk is where your data is stored in your Mac, or PC, or even your phone. 838 00:38:53,640 --> 00:38:56,940 And full disk encryption just means ideally that all of your data 839 00:38:56,940 --> 00:38:59,190 is encrypted, that is, somehow scrambled.
840 00:38:59,190 --> 00:39:02,100 Now, hopefully, your password for your computer or phone 841 00:39:02,100 --> 00:39:04,952 is good enough so that even though the device is 842 00:39:04,952 --> 00:39:07,410 encrypted with that password, at least, you'll remember it. 843 00:39:07,410 --> 00:39:10,458 And your phone or your Mac or PC will automatically decrypt it for you. 844 00:39:10,458 --> 00:39:13,500 Of course, you can't scramble the information and hide it from ourselves. 845 00:39:13,500 --> 00:39:16,470 One of us, at least, for these devices needs to have access. 846 00:39:16,470 --> 00:39:18,990 But full disk encryption typically means that at least when 847 00:39:18,990 --> 00:39:22,110 you close the laptop lid or power down for the night, 848 00:39:22,110 --> 00:39:24,690 that even if someone else steals that device, 849 00:39:24,690 --> 00:39:28,620 opens the lid, unless they have your passcode, 850 00:39:28,620 --> 00:39:31,470 they can't even plug in fancy cables to the device 851 00:39:31,470 --> 00:39:34,320 and just rip the zeros and ones off of the device 852 00:39:34,320 --> 00:39:35,823 and see what's actually there. 853 00:39:35,823 --> 00:39:37,740 Full disk encryption means they could do that, 854 00:39:37,740 --> 00:39:40,890 but they would just see seemingly random zeros and ones. 855 00:39:40,890 --> 00:39:42,270 Now, there's a downside here too. 856 00:39:42,270 --> 00:39:44,160 This might slow things down potentially. 857 00:39:44,160 --> 00:39:48,060 But it is a feature increasingly that's offered and is absolutely something you 858 00:39:48,060 --> 00:39:51,510 should consider enabling in general, especially if your laptop 859 00:39:51,510 --> 00:39:52,620 or phone travels with you. 860 00:39:52,620 --> 00:39:53,970 And certainly, your phone does. 
861 00:39:53,970 --> 00:39:58,020 Or if you plan to donate, or sell, or give away a device, 862 00:39:58,020 --> 00:40:00,180 you don't want to leave all of the zeros and ones, 863 00:40:00,180 --> 00:40:02,650 the remnants of your own sensitive data, passed on there. 864 00:40:02,650 --> 00:40:04,770 So Windows has a feature called BitLocker. 865 00:40:04,770 --> 00:40:08,320 macOS has a feature called FileVault. There are commercial options as well. 866 00:40:08,320 --> 00:40:12,210 But generally, we're at the point now in 2022 where clicking a button 867 00:40:12,210 --> 00:40:14,940 is sufficient to enable these features. 868 00:40:14,940 --> 00:40:17,430 With that said, don't rush into all of these decisions. 869 00:40:17,430 --> 00:40:19,440 I would make backups of your data. 870 00:40:19,440 --> 00:40:22,930 And maybe don't email CS50 if something goes wrong with that process. 871 00:40:22,930 --> 00:40:24,570 But I would do your own due diligence. 872 00:40:24,570 --> 00:40:27,720 But this, too, would be a menu of possibilities. 873 00:40:27,720 --> 00:40:32,070 And now, the bad side, the downside, of what seems to be great, 874 00:40:32,070 --> 00:40:33,990 this notion of full disk encryption. 875 00:40:33,990 --> 00:40:36,990 Unfortunately, just as we can encrypt our data 876 00:40:36,990 --> 00:40:39,280 to protect it from the adversaries, so can 877 00:40:39,280 --> 00:40:44,140 the adversaries, if they get into our devices, encrypt our data and do what? 878 00:40:44,140 --> 00:40:46,420 Not tell us that secret key. 879 00:40:46,420 --> 00:40:49,190 And so this is generally applied in the context of ransomware, 880 00:40:49,190 --> 00:40:51,100 which tragically, you increasingly hear about 881 00:40:51,100 --> 00:40:53,992 in hospital systems, school systems, municipalities 882 00:40:53,992 --> 00:40:55,450 where systems are getting attacked.
883 00:40:55,450 --> 00:40:57,730 And the data is not just getting stolen because what 884 00:40:57,730 --> 00:41:01,570 does the adversary typically need with local municipal or even hospital data? 885 00:41:01,570 --> 00:41:05,200 The value to the adversary is encrypting all 886 00:41:05,200 --> 00:41:07,660 of the hospital's, all of the municipality's data, 887 00:41:07,660 --> 00:41:11,298 preventing them from accessing it if they have no backups or the like. 888 00:41:11,298 --> 00:41:13,090 And so ransomware is literally about trying 889 00:41:13,090 --> 00:41:16,550 to convince someone to pay you money or pay you Bitcoin or something like that 890 00:41:16,550 --> 00:41:18,490 in exchange for that secret key. 891 00:41:18,490 --> 00:41:21,910 And the key, in this case, is surely more sophisticated than the number one. 892 00:41:21,910 --> 00:41:23,420 But it's really the same idea. 893 00:41:23,420 --> 00:41:26,380 So here too, yet again, a tradeoff: just as we sort of invent something 894 00:41:26,380 --> 00:41:30,350 for good, it can also be used for evil, so to speak, as well. 895 00:41:30,350 --> 00:41:32,890 But it's really the same underlying principles, 896 00:41:32,890 --> 00:41:34,600 even though we keep seeing it and hearing 897 00:41:34,600 --> 00:41:37,490 about it in these different forms. 898 00:41:37,490 --> 00:41:40,360 And lastly, if only because folks are generally familiar, 899 00:41:40,360 --> 00:41:43,750 but don't necessarily know what it is that it's doing for them, 900 00:41:43,750 --> 00:41:46,810 browsers nowadays have what's often called incognito mode 901 00:41:46,810 --> 00:41:49,750 or private mode, which has nothing to do with encryption, but does 902 00:41:49,750 --> 00:41:52,720 have to do with cybersecurity, or really, cyber privacy, 903 00:41:52,720 --> 00:41:55,450 keeping your data from prying eyes.
904 00:41:55,450 --> 00:41:57,742 Incognito mode, if you open it in Chrome, for instance, 905 00:41:57,742 --> 00:41:59,200 looks a little something like this. 906 00:41:59,200 --> 00:42:02,080 And we use it in CS50 when introducing students, as we did last week, 907 00:42:02,080 --> 00:42:04,780 to web programming because it, in effect, lets 908 00:42:04,780 --> 00:42:07,690 you start with a clean slate like a brand new browser that has never 909 00:42:07,690 --> 00:42:10,720 visited any websites before, which is good for just diagnosing problems. 910 00:42:10,720 --> 00:42:14,512 But it's commonly used if you want to log into maybe your Gmail account 911 00:42:14,512 --> 00:42:17,470 on someone else's computer and you don't want your password being saved 912 00:42:17,470 --> 00:42:19,262 or you want to visit some website where you 913 00:42:19,262 --> 00:42:22,610 don't want the URL or the search terms ending up in your autocomplete history. 914 00:42:22,610 --> 00:42:25,090 So there's multiple uses for incognito mode. 915 00:42:25,090 --> 00:42:26,870 But what does it really do? 916 00:42:26,870 --> 00:42:30,310 Well, it doesn't stop your company, it doesn't stop your university, 917 00:42:30,310 --> 00:42:33,040 your internet service provider, be it Comcast, Verizon, 918 00:42:33,040 --> 00:42:36,730 or the like from knowing what websites you go to because-- 919 00:42:36,730 --> 00:42:37,510 ask your students. 920 00:42:37,510 --> 00:42:39,920 A couple of weeks ago we talked about-- actually, a week ago, 921 00:42:39,920 --> 00:42:41,620 we talked about how the internet works. 922 00:42:41,620 --> 00:42:44,000 And unfortunately, every computer has an IP address, 923 00:42:44,000 --> 00:42:46,375 which is a unique identifier, which goes out any time you 924 00:42:46,375 --> 00:42:48,462 go anywhere, incognito mode or not.
925 00:42:48,462 --> 00:42:50,170 So this isn't really covering your tracks 926 00:42:50,170 --> 00:42:54,130 outside of your office, or outside of your home, or outside of your company. 927 00:42:54,130 --> 00:42:56,950 But it is, at least, throwing away local information. 928 00:42:56,950 --> 00:43:00,580 And so we'll talk, in fact, in CS50's week nine this coming Monday 929 00:43:00,580 --> 00:43:03,692 about cookies, which you might generally know about 930 00:43:03,692 --> 00:43:04,900 and what are called sessions. 931 00:43:04,900 --> 00:43:07,900 And so long story short, what incognito mode does 932 00:43:07,900 --> 00:43:12,225 is it throws away, when you close the window, any locally stored information, 933 00:43:12,225 --> 00:43:15,100 so these things called cookies, which are sort of virtual hand stamps 934 00:43:15,100 --> 00:43:16,892 that just remember what you've logged in as 935 00:43:16,892 --> 00:43:19,030 or what's in your shopping cart or the like. 936 00:43:19,030 --> 00:43:23,410 But it doesn't hide any information from anyone outside of your own Mac or PC. 937 00:43:23,410 --> 00:43:25,715 It only prevents those local prying eyes. 938 00:43:25,715 --> 00:43:28,090 So there, too, even though we have tools that many of you 939 00:43:28,090 --> 00:43:30,310 are probably in the habit of using or thinking 940 00:43:30,310 --> 00:43:34,360 you should use to be more private, be more secure on the internet, what we do 941 00:43:34,360 --> 00:43:37,720 really in CS50, both weeks past and future, 942 00:43:37,720 --> 00:43:40,220 is talk about how these technologies work, 943 00:43:40,220 --> 00:43:44,260 so that ultimately, we have all the more of an educated citizenry here 944 00:43:44,260 --> 00:43:46,800 among undergrads and here as well as online, 945 00:43:46,800 --> 00:43:49,300 so that you can apply these same lessons learned to problems 946 00:43:49,300 --> 00:43:50,900 you'll encounter in the future. 
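The point about incognito mode, that it throws away locally stored cookies but hides nothing from anyone outside your own computer, can be modeled with a small Python sketch. This is a toy model under stated assumptions: the Server and Browser classes, the "session" cookie name, and the example IP address are all hypothetical, and real browsers and servers do far more.

```python
class Server:
    """Toy web server: logs every visitor's IP and hands out a session cookie."""

    def __init__(self):
        self.access_log = []

    def handle(self, client_ip, cookies):
        self.access_log.append(client_ip)  # recorded regardless of browser mode
        if "session" not in cookies:
            cookies["session"] = f"visitor-{len(self.access_log)}"
        return cookies["session"]


class Browser:
    """Toy browser: normal windows share saved cookies, incognito ones do not."""

    def __init__(self, ip):
        self.ip = ip
        self.saved_cookies = {}

    def browse(self, server, incognito=False):
        # An incognito window starts with an empty cookie jar that is
        # simply discarded once the window (here, this call) is done.
        jar = {} if incognito else self.saved_cookies
        return server.handle(self.ip, jar)


if __name__ == "__main__":
    server = Server()
    browser = Browser("203.0.113.7")
    browser.browse(server, incognito=True)
    print(browser.saved_cookies)  # nothing kept locally
    print(server.access_log)      # the server still saw the IP address
```

The local cookie jar ends up empty after the incognito visit, yet the server's log still contains the visitor's address, which is the distinction drawn above.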
947 00:43:50,900 --> 00:43:52,600 So as promised, the homework. 948 00:43:52,600 --> 00:43:55,143 One, you should probably use a password manager. 949 00:43:55,143 --> 00:43:57,310 It doesn't have to be one of those ones on the list, 950 00:43:57,310 --> 00:44:00,640 but at least starting that conversation maybe with someone who does, 951 00:44:00,640 --> 00:44:01,570 maybe the-- 952 00:44:01,570 --> 00:44:04,748 it's often the students in your family, perhaps, 953 00:44:04,748 --> 00:44:06,790 who can advise you on some of these technologies. 954 00:44:06,790 --> 00:44:10,653 Consider using a password manager. Two, using two-factor authentication, 955 00:44:10,653 --> 00:44:12,820 whether it's your phone or some key fob or the like, 956 00:44:12,820 --> 00:44:16,030 but at least seeking out that feature at least for accounts that you really 957 00:44:16,030 --> 00:44:18,760 care about, your email, social media, financial, medical, 958 00:44:18,760 --> 00:44:21,010 anything where you'd be embarrassed, at best, 959 00:44:21,010 --> 00:44:24,400 or really violated, at worst, if that kind of information got out, 960 00:44:24,400 --> 00:44:28,360 and then increasingly using not just encryption, which you get automatically 961 00:44:28,360 --> 00:44:31,930 for most technologies today, but increasingly choosing technologies 962 00:44:31,930 --> 00:44:35,980 that offer stronger guarantees that keep those middlemen, those companies, out 963 00:44:35,980 --> 00:44:39,040 of the way if only so that you can trust with higher probability 964 00:44:39,040 --> 00:44:43,900 that only party B knows what party A has said or sent. 965 00:44:43,900 --> 00:44:45,820 Now, this, of course, was a whirlwind tour. 966 00:44:45,820 --> 00:44:48,190 There's so much more that you can do online. 967 00:44:48,190 --> 00:44:52,150 Indeed, this course, CS50, can be taken for free online via platforms like edX 968 00:44:52,150 --> 00:44:54,430 at edx.org/cs50.
969 00:44:54,430 --> 00:44:57,400 I thought it might be appropriate to end on this note. 970 00:44:57,400 --> 00:45:01,300 If anyone would like to conjecture before we 971 00:45:01,300 --> 00:45:07,050 start playing music and adjourn for lunch, what our final message here is. 972 00:45:07,050 --> 00:45:12,270 If we reverse the plus one and maybe start minus one here, minus one here-- 973 00:45:12,270 --> 00:45:14,920 and indeed, thank you so much for coming. 974 00:45:14,920 --> 00:45:18,240 This was CS50. 975 00:45:18,240 --> 00:45:21,290 [MUSIC PLAYING] 976 00:45:21,290 --> 00:45:55,000