WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:00.000 --> 00:00:03.960 [MUSIC PLAYING] 00:00:16.860 --> 00:00:19.770 DAVID MALAN: All right, this is CS50's Introduction 00:00:19.770 --> 00:00:21.960 to Cybersecurity. My name is David Malan. 00:00:21.960 --> 00:00:25.470 And this week, let's focus on securing systems, particularly 00:00:25.470 --> 00:00:29.170 those that are somehow networked or even inter-networked as well. 00:00:29.170 --> 00:00:32.310 Now, recall from last time that we presented encryption 00:00:32.310 --> 00:00:34.450 as really the solution to a lot of our problems. 00:00:34.450 --> 00:00:36.783 And that's going to be a building block that we continue 00:00:36.783 --> 00:00:40.440 to use to solve a lot of our concerns around the security of not 00:00:40.440 --> 00:00:43.793 only our accounts, our data, but now also our systems. 00:00:43.793 --> 00:00:46.710 For instance, let's consider something that you, yourself, are perhaps 00:00:46.710 --> 00:00:51.090 using right now, which is Wi-Fi, somehow connected via wireless technology 00:00:51.090 --> 00:00:53.350 to the internet and beyond. 00:00:53.350 --> 00:00:56.377 So probably by now you've realized that when you're on Wi-Fi, 00:00:56.377 --> 00:00:57.960 you have to first choose your network. 00:00:57.960 --> 00:01:00.002 And you might choose your network from a dropdown 00:01:00.002 --> 00:01:02.910 menu on your computer or the like, or it might auto select it. 00:01:02.910 --> 00:01:04.780 But there's at least two types of networks. 00:01:04.780 --> 00:01:09.840 One that are unsecured, and then two, which are secured in some way. 00:01:09.840 --> 00:01:11.850 And odds are you recognize, and you've already 00:01:11.850 --> 00:01:14.790 been taught through practice to recognize the little padlock 00:01:14.790 --> 00:01:17.010 icon on your phone, or laptop, or desktop 00:01:17.010 --> 00:01:21.340 as signifying that your Wi-Fi connection is, indeed, encrypted.
00:01:21.340 --> 00:01:22.900 Now what does that actually mean? 00:01:22.900 --> 00:01:25.080 Well, in particular, in this context, it would 00:01:25.080 --> 00:01:29.730 mean that any of the traffic, any of the internet packets of information, 00:01:29.730 --> 00:01:33.150 so to speak, like envelopes of information going from your device 00:01:33.150 --> 00:01:35.820 off onto the internet, are somehow encrypted. 00:01:35.820 --> 00:01:39.630 At least encrypted until they reach whatever device 00:01:39.630 --> 00:01:42.870 your device is talking to wirelessly. 00:01:42.870 --> 00:01:47.040 So you have an encrypted connection between that wireless device, 00:01:47.040 --> 00:01:48.730 often called an access point. 00:01:48.730 --> 00:01:50.700 Now what is the actual technology that's being 00:01:50.700 --> 00:01:53.460 used to secure Wi-Fi networks nowadays? 00:01:53.460 --> 00:01:56.370 Well, hopefully you're using among the latest versions of this-- 00:01:56.370 --> 00:01:59.043 Wi-Fi Protected Access, or WPA. 00:01:59.043 --> 00:02:00.960 And this has evolved over the years, and there 00:02:00.960 --> 00:02:02.418 have been a few different versions. 00:02:02.418 --> 00:02:05.010 And so in general, whenever you configure a phone, 00:02:05.010 --> 00:02:07.380 or whenever you configure a laptop or desktop, 00:02:07.380 --> 00:02:10.229 ideally you're connecting to a device nowadays 00:02:10.229 --> 00:02:13.512 that supports this technology and the latest version thereof. 00:02:13.512 --> 00:02:15.720 And in a nutshell, what that ensures is that, indeed, 00:02:15.720 --> 00:02:18.960 your traffic from your phone, laptop, or desktop is somehow 00:02:18.960 --> 00:02:22.200 scrambled between you and that other device.
00:02:22.200 --> 00:02:24.180 And that device, in turn, is probably connected 00:02:24.180 --> 00:02:27.940 to devices called routers, computers that route-- left, right, up, 00:02:27.940 --> 00:02:29.940 and down-- information on the internet, which 00:02:29.940 --> 00:02:32.220 might then connect to other routers, until it finally 00:02:32.220 --> 00:02:33.700 reaches its destination. 00:02:33.700 --> 00:02:36.750 So for our purposes, though, Wi-Fi Protected Access 00:02:36.750 --> 00:02:40.710 and using a secure Wi-Fi network is only technically encrypting 00:02:40.710 --> 00:02:44.520 your traffic between you and whatever device is maybe on the wall, 00:02:44.520 --> 00:02:47.100 on the ceiling, nearby, the little thing with antennas, 00:02:47.100 --> 00:02:50.790 perhaps, that you, yourself, are actually talking to. 00:02:50.790 --> 00:02:53.140 So why do we care about this? 00:02:53.140 --> 00:02:56.290 Why do you want even your Wi-Fi connection, for instance, 00:02:56.290 --> 00:02:57.120 to be encrypted? 00:02:57.120 --> 00:03:01.170 Well, it turns out a lot of what you and I do on the internet isn't necessarily 00:03:01.170 --> 00:03:03.180 encrypted already for us. 00:03:03.180 --> 00:03:06.000 Now fortunately, this is decreasingly the case. 00:03:06.000 --> 00:03:09.810 The world has gotten better about using more and more encryption 00:03:09.810 --> 00:03:13.320 in various products, and software, and applications that you and I use, 00:03:13.320 --> 00:03:14.730 but not necessarily. 00:03:14.730 --> 00:03:17.940 Because odds are, even some of you are probably 00:03:17.940 --> 00:03:24.420 in the habit of typing http://www.example.com 00:03:24.420 --> 00:03:26.940 or whatever domain you're actually trying to visit. 
00:03:26.940 --> 00:03:29.340 Now maybe you don't type even that, but you're certainly 00:03:29.340 --> 00:03:32.880 familiar with this prefix, this acronym, HTTP, 00:03:32.880 --> 00:03:35.910 which stands for Hypertext Transfer Protocol, which 00:03:35.910 --> 00:03:38.610 is a fancy way of saying that this is a protocol, 00:03:38.610 --> 00:03:43.590 a language that computers use when talking on the worldwide web, 00:03:43.590 --> 00:03:44.710 or web for short. 00:03:44.710 --> 00:03:46.390 So what do we mean by that? 00:03:46.390 --> 00:03:49.530 Well, if you're sitting down at a browser on your phone, or laptop, 00:03:49.530 --> 00:03:53.550 or desktop, and you're visiting a URL that starts with HTTP, 00:03:53.550 --> 00:03:57.510 your device is about to communicate with some remote server using 00:03:57.510 --> 00:04:00.990 this protocol, this language, really a set of conventions 00:04:00.990 --> 00:04:02.970 for talking between each other. 00:04:02.970 --> 00:04:05.730 But the catch is, per last week, when we focused 00:04:05.730 --> 00:04:09.360 on the security of our data, the information, or the packets 00:04:09.360 --> 00:04:12.330 of information that we're sending from browser to server 00:04:12.330 --> 00:04:15.420 and back are vulnerable to eavesdropping, potentially, 00:04:15.420 --> 00:04:17.730 if you're only using HTTP. 00:04:17.730 --> 00:04:18.730 Well, why is that? 00:04:18.730 --> 00:04:22.000 Well HTTP by definition is not encrypted. 00:04:22.000 --> 00:04:26.040 It's just text messages, often English like in nature, 00:04:26.040 --> 00:04:27.480 but they're not at all scrambled. 
00:04:27.480 --> 00:04:31.050 Which means if Alice is sitting at her desktop, or laptop, or phone, 00:04:31.050 --> 00:04:34.770 and trying to visit some website, here represented as our friend, Bob, 00:04:34.770 --> 00:04:39.960 there could be some third party, Eve, who's eavesdropping in between-- 00:04:39.960 --> 00:04:42.180 a machine in the middle, so to speak, that 00:04:42.180 --> 00:04:46.230 could be looking at every request Alice is making and every response 00:04:46.230 --> 00:04:49.350 that Bob is sending, if again, Alice is the user in the story 00:04:49.350 --> 00:04:51.780 and Bob is the web server in this story. 00:04:51.780 --> 00:04:56.580 So you're vulnerable if you're only using HTTP and certain other protocols 00:04:56.580 --> 00:04:59.790 or technologies to these machine in the middle attacks. 00:04:59.790 --> 00:05:01.500 And the attack in this case might just be 00:05:01.500 --> 00:05:05.460 someone nosily trying to know what it is you are doing on the internet. 00:05:05.460 --> 00:05:09.990 But worse, they can even manipulate what you're sending or receiving 00:05:09.990 --> 00:05:13.960 if these systems are not using some form of encryption. 00:05:13.960 --> 00:05:15.870 So let's take a specific example. 00:05:15.870 --> 00:05:18.160 When you visit a website on the internet, 00:05:18.160 --> 00:05:20.950 you are downloading effectively a language 00:05:20.950 --> 00:05:23.830 called HTML, Hypertext Markup Language. 00:05:23.830 --> 00:05:27.478 If you take a course like CS50, itself, or an introduction to web development, 00:05:27.478 --> 00:05:30.520 you'll actually learn a language that looks a little something like this. 00:05:30.520 --> 00:05:32.228 And I've shown only the highlights, using 00:05:32.228 --> 00:05:36.190 ellipsis here-- dot, dot, dot-- to wave my hands at details that won't matter. 
00:05:36.190 --> 00:05:38.860 But this is the kind of text, or language, 00:05:38.860 --> 00:05:41.590 that we get back from a server when you visit something, 00:05:41.590 --> 00:05:46.630 like http://www.example.com. 00:05:46.630 --> 00:05:49.122 But notice that this doesn't seem to be scrambled, 00:05:49.122 --> 00:05:50.830 even though it might look cryptic to you, 00:05:50.830 --> 00:05:53.170 if you've never written or seen HTML before. 00:05:53.170 --> 00:05:56.740 It doesn't look like random zeros and ones certainly. 00:05:56.740 --> 00:05:58.240 It looks pretty intelligible. 00:05:58.240 --> 00:06:01.760 And, in fact, it looks somewhat English like, with words like body here. 00:06:01.760 --> 00:06:07.150 Well, the catch is, if this is the response coming back from a web server 00:06:07.150 --> 00:06:08.860 to a web browser-- 00:06:08.860 --> 00:06:10.660 for instance Alice's own-- 00:06:10.660 --> 00:06:13.300 what could happen is that some eavesdropper, 00:06:13.300 --> 00:06:15.340 some machine in the middle, could actually 00:06:15.340 --> 00:06:20.950 inject additional HTML code into the web pages that you, and I, 00:06:20.950 --> 00:06:22.480 and Alice are downloading. 00:06:22.480 --> 00:06:24.160 Now what is this representative of? 00:06:24.160 --> 00:06:26.035 Well, this is another feature you might learn 00:06:26.035 --> 00:06:29.230 in a course on website development, but let me highlight just one key phrase 00:06:29.230 --> 00:06:31.180 here because this is a common use case. 00:06:31.180 --> 00:06:34.390 It is possible, therefore, for a machine in the middle 00:06:34.390 --> 00:06:38.590 to inject something like advertisements, or worse something actually 00:06:38.590 --> 00:06:40.960 malicious that tries to steal your data in some way. 
00:06:40.960 --> 00:06:44.020 But a common scenario for this machine in the middle attack 00:06:44.020 --> 00:06:48.340 is when your internet service provider, or the coffee shop whose Wi-Fi you're 00:06:48.340 --> 00:06:51.370 using, or the hotel whose Wi-Fi you're using 00:06:51.370 --> 00:06:55.390 wants to inject advertisements into maybe each and every web 00:06:55.390 --> 00:06:58.180 page you are visiting, even if those web pages weren't even 00:06:58.180 --> 00:06:59.680 designed to have advertisements. 00:06:59.680 --> 00:07:03.820 These things might be pre-pended to the very top of the page, for instance. 00:07:03.820 --> 00:07:09.170 Now the reason that machines in the middle are able to do this, 00:07:09.170 --> 00:07:11.890 though, is simply because if you're not using encryption, 00:07:11.890 --> 00:07:16.030 and Alice and Bob are communicating insecurely in that sense, well, 00:07:16.030 --> 00:07:19.810 there's no telling what they could add to the responses that 00:07:19.810 --> 00:07:22.150 are coming back from this web server. 00:07:22.150 --> 00:07:25.480 Let me pause here before we move on to yet other threats 00:07:25.480 --> 00:07:30.160 to see if there are now any questions on this particular attack. 00:07:30.160 --> 00:07:33.350 AUDIENCE: How to detect machine in the middle and get rid of it? 00:07:33.350 --> 00:07:34.850 DAVID MALAN: A really good question. 00:07:34.850 --> 00:07:37.870 So how can you detect a machine in the middle and get rid of it? 00:07:37.870 --> 00:07:40.660 That certainly should be a worthy goal. 00:07:40.660 --> 00:07:44.810 Short answer is you cannot necessarily detect it. 00:07:44.810 --> 00:07:48.970 It is possible that a machine in the middle can be doing all of this 00:07:48.970 --> 00:07:51.350 without your own knowledge. 00:07:51.350 --> 00:07:52.718 How do you get rid of it? 00:07:52.718 --> 00:07:54.760 Well, that too is going to be the focus of today. 
00:07:54.760 --> 00:07:59.470 Namely, encryption is going to help us push back on exactly this threat. 00:07:59.470 --> 00:08:02.770 But first, let's consider some additional threats or concerns 00:08:02.770 --> 00:08:03.620 that we might have. 00:08:03.620 --> 00:08:06.080 And one of them technically is called packet sniffing. 00:08:06.080 --> 00:08:09.820 So, again, a packet is a virtual envelope of sorts 00:08:09.820 --> 00:08:12.050 that you might use when sending data on the internet. 00:08:12.050 --> 00:08:14.625 And so, in fact, here is a pretty standard envelope 00:08:14.625 --> 00:08:16.750 in which, in the human world, I might put a letter, 00:08:16.750 --> 00:08:19.240 and then write something on the outside, and send it off 00:08:19.240 --> 00:08:20.740 to someone through the mail system. 00:08:20.740 --> 00:08:24.220 Well, you can think of packets in the context of computer systems 00:08:24.220 --> 00:08:28.750 as being analogous to this, whereby this is an envelope, whose purpose in life 00:08:28.750 --> 00:08:31.720 is to get data from one point, A, to another point, 00:08:31.720 --> 00:08:33.340 B-- so from Alice to Bob. 00:08:33.340 --> 00:08:37.750 And inside of this packet is the actual message that Alice is sending to Bob, 00:08:37.750 --> 00:08:39.640 and Bob is hopefully sending back to Alice. 00:08:39.640 --> 00:08:41.950 So it might not all fit in one packet. 00:08:41.950 --> 00:08:44.800 So indeed, the internet tends to use multiple packets like these. 00:08:44.800 --> 00:08:46.810 But that's an appropriate metaphor to think 00:08:46.810 --> 00:08:49.840 about what it is we're doing here, otherwise, digitally. 00:08:49.840 --> 00:08:52.360 So packet sniffing then is kind of like-- 00:08:52.360 --> 00:08:55.720 [SNIFFS]-- trying to get a sense of what's going on inside of these 00:08:55.720 --> 00:08:56.350 envelopes.
00:08:56.350 --> 00:09:00.310 And, indeed, if the contents of these envelopes are not encrypted, 00:09:00.310 --> 00:09:04.540 scrambled securely in some way, well then any machine in the middle 00:09:04.540 --> 00:09:07.450 can technically take a quick glance inside of these envelopes, 00:09:07.450 --> 00:09:10.210 so to speak, see what's inside of them, even change 00:09:10.210 --> 00:09:13.370 what's inside of them, as we've seen, and then pass it along. 00:09:13.370 --> 00:09:17.230 So what are the implications for what this makes possible, therefore, 00:09:17.230 --> 00:09:21.220 if you are vulnerable to packet sniffing by, again, not 00:09:21.220 --> 00:09:24.940 having your systems use encryption when they are talking to one another? 00:09:24.940 --> 00:09:27.850 Well, here, for instance, is one example of what 00:09:27.850 --> 00:09:31.840 could be inside, metaphorically, an envelope like that. 00:09:31.840 --> 00:09:35.290 This is the kind of message, written mostly in English, 00:09:35.290 --> 00:09:40.540 that represents a browser requesting a web page from a server. 00:09:40.540 --> 00:09:43.690 In particular, when you visit a search engine, for instance, 00:09:43.690 --> 00:09:46.780 and type into the search box what it is you're searching for-- maybe 00:09:46.780 --> 00:09:48.130 you're searching for cats-- 00:09:48.130 --> 00:09:50.380 what happens is your phone, or your laptop, 00:09:50.380 --> 00:09:53.080 or your desktop creates a virtual envelope like that. 00:09:53.080 --> 00:09:56.590 Opens it up, puts inside of it a message that looks like this. 00:09:56.590 --> 00:09:59.410 Closes the envelope, and then sends the envelope off 00:09:59.410 --> 00:10:02.220 onto the internet to a web server that, conversely, 00:10:02.220 --> 00:10:05.470 is going to open the envelope, read this message, and hopefully, send you back 00:10:05.470 --> 00:10:08.380 a whole bunch of search results about cats. 
00:10:08.380 --> 00:10:12.680 Now most of this is arcane detail that we don't particularly care about, 00:10:12.680 --> 00:10:14.650 including where it is we're visiting. 00:10:14.650 --> 00:10:16.657 For this story, I'm using example.com. 00:10:16.657 --> 00:10:18.740 It's just something generic, but you could imagine 00:10:18.740 --> 00:10:20.780 it being Google, or Bing, or the like. 00:10:20.780 --> 00:10:24.710 Here you have a command-- get-- which means literally get me a web page. 00:10:24.710 --> 00:10:27.780 /search is implying that I'm searching for something. 00:10:27.780 --> 00:10:29.460 But here's the interesting part. 00:10:29.460 --> 00:10:32.810 What I'm searching for is a query for cats, 00:10:32.810 --> 00:10:38.430 where a query is a question or a search request of yours, and cats, of course, 00:10:38.430 --> 00:10:39.530 is what I'm searching for. 00:10:39.530 --> 00:10:41.900 So this is to say, inside of this envelope, 00:10:41.900 --> 00:10:43.700 whether you're searching for cats, or dogs, 00:10:43.700 --> 00:10:47.960 or anything else, there is some mention of that search query 00:10:47.960 --> 00:10:51.260 inside of the message inside of that envelope that's 00:10:51.260 --> 00:10:55.532 being sent from Alice to Bob, or from you to Google or Bing. 00:10:55.532 --> 00:10:57.740 Lastly, there's this mention here, which is referring 00:10:57.740 --> 00:11:00.170 to exactly this same protocol, HTTP. 00:11:00.170 --> 00:11:02.420 In this case, perhaps, version 3. 00:11:02.420 --> 00:11:06.050 But ultimately, it's the yellow highlighted text here, 00:11:06.050 --> 00:11:09.230 that mention of cats that's worrisome if some machine 00:11:09.230 --> 00:11:13.040 in the middle, some adversary, can sniff this packet and see what's 00:11:13.040 --> 00:11:14.210 going on inside. 00:11:14.210 --> 00:11:16.760 Now there's other threats similar in spirit.
00:11:16.760 --> 00:11:19.340 When you request pages on the internet, you don't sometimes 00:11:19.340 --> 00:11:21.917 just search for information, like cats, which I don't really 00:11:21.917 --> 00:11:24.500 care about if people know, since everyone else on the internet 00:11:24.500 --> 00:11:25.417 is searching for cats. 00:11:25.417 --> 00:11:28.640 But what if I'm trying to check out on some website, like Amazon, 00:11:28.640 --> 00:11:30.470 and buy something with a credit card? 00:11:30.470 --> 00:11:32.900 Well, then what is inside of this envelope 00:11:32.900 --> 00:11:34.880 is some text that looks fairly similar. 00:11:34.880 --> 00:11:37.610 We still have a host in this story of example.com. 00:11:37.610 --> 00:11:39.200 I'm still using HTTP. 00:11:39.200 --> 00:11:41.280 I'm not getting information, per se. 00:11:41.280 --> 00:11:46.160 I'm posting information, akin to uploading my credit card to the server. 00:11:46.160 --> 00:11:49.820 And in particular, down here is a representative 00:11:49.820 --> 00:11:53.390 of how my credit card might be stored inside of this virtual envelope. 00:11:53.390 --> 00:11:56.210 And if I highlight that, indeed, you'll see a credit card 00:11:56.210 --> 00:11:59.450 number that, hopefully, doesn't actually work, but it is the right length. 00:11:59.450 --> 00:12:02.360 If anyone sniffs this packet, they might actually 00:12:02.360 --> 00:12:05.750 be able to find my credit card, and maybe my name, and my address, 00:12:05.750 --> 00:12:09.260 and the little code that you need, and more with respect to whatever it 00:12:09.260 --> 00:12:10.520 is I'm checking out for. 00:12:10.520 --> 00:12:14.570 So it's that easy if the data is itself not encrypted. 00:12:14.570 --> 00:12:19.670 Let me pause here then too and see if there are any questions. 
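To make the two envelopes just described concrete, here is a minimal sketch in Python of what a packet sniffer could read out of unencrypted requests. The request text is an assumption modelled on the description above: the text-based HTTP/1.1 format is used for readability, the `/checkout` path and the `number` field are hypothetical, and the card number is only a placeholder test value.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical plaintext requests, as they might sit inside a virtual envelope.
GET_REQUEST = (
    "GET /search?q=cats HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)
POST_REQUEST = (
    "POST /checkout HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "number=4111111111111111"  # placeholder test number, not a real card
)


def sniff(raw_request: str) -> dict:
    """Return what any machine in the middle can read from a plaintext request."""
    # Headers and body are separated by one blank line.
    head, _, body = raw_request.partition("\r\n\r\n")
    # The first line holds the method, the target, and the protocol version.
    method, target, _version = head.split("\r\n")[0].split(" ")
    parts = urlsplit(target)
    return {
        "method": method,
        "path": parts.path,
        "query": parse_qs(parts.query),  # e.g. the search terms
        "body": parse_qs(body),          # e.g. submitted form fields
    }


print(sniff(GET_REQUEST))
print(sniff(POST_REQUEST))
```

Nothing here requires breaking any protection: without encryption, the search query and the form fields are just text that can be parsed.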
00:12:19.670 --> 00:12:23.750 AUDIENCE: If someone is performing an attack from a machine in the middle, 00:12:23.750 --> 00:12:26.240 does the person actually have to be connected 00:12:26.240 --> 00:12:28.730 to the same Wi-Fi network as you? 00:12:28.730 --> 00:12:31.020 Or they could be on a whole other network? 00:12:31.020 --> 00:12:32.520 DAVID MALAN: A really good question. 00:12:32.520 --> 00:12:37.520 So in general, they would be connected to the same Wi-Fi network so that they 00:12:37.520 --> 00:12:38.498 are-- 00:12:38.498 --> 00:12:41.040 generally, they would be connected to the same Wi-Fi network, 00:12:41.040 --> 00:12:42.420 but even that is not necessary. 00:12:42.420 --> 00:12:45.740 So long as they are within a reasonable proximity to you, 00:12:45.740 --> 00:12:48.530 and their laptop or their device has an antenna that 00:12:48.530 --> 00:12:51.500 can receive all of the wireless packets that are around you, 00:12:51.500 --> 00:12:54.810 they don't necessarily have to have access to that same network, 00:12:54.810 --> 00:12:56.447 especially if it's unencrypted. 00:12:56.447 --> 00:12:58.280 And, in fact, there exists software that can 00:12:58.280 --> 00:13:01.230 listen to all possible networks that are around you. 00:13:01.230 --> 00:13:05.260 And so that, too, is a potential threat. 00:13:05.260 --> 00:13:07.655 Other questions on this here. 00:13:07.655 --> 00:13:10.030 AUDIENCE: Do people have to know your IP address in order 00:13:10.030 --> 00:13:14.020 to be able to see what you're doing or read what websites you're going to? 00:13:14.020 --> 00:13:15.520 DAVID MALAN: A really good question. 00:13:15.520 --> 00:13:18.970 For those unfamiliar, an IP address is a unique identifier 00:13:18.970 --> 00:13:21.190 that every computer on the internet has, much 00:13:21.190 --> 00:13:23.980 like you have a postal address to which humans can send mail. 00:13:23.980 --> 00:13:25.810 Short answer, [? Mahal, ?] is no. 
00:13:25.810 --> 00:13:28.600 Someone does not need to know your IP address in advance 00:13:28.600 --> 00:13:32.440 for at least these wireless attacks, because they can simply, 00:13:32.440 --> 00:13:37.120 as per my other response, listen to all of the wireless traffic nearby. 00:13:37.120 --> 00:13:41.800 And they can actually see the IP addresses of senders and receivers 00:13:41.800 --> 00:13:45.700 flying by, so to speak, throughout the air. 00:13:45.700 --> 00:13:48.220 So what's another threat we should be mindful of? 00:13:48.220 --> 00:13:51.760 Well, it turns out that most any time you visit a website nowadays, 00:13:51.760 --> 00:13:54.410 one or more cookies are installed in your computer. 00:13:54.410 --> 00:13:55.550 Now what do I mean by that? 00:13:55.550 --> 00:13:58.930 When you actually visit a website for the first time, particularly one 00:13:58.930 --> 00:14:00.640 that you need to log into, and therefore, 00:14:00.640 --> 00:14:04.420 that needs to remember you when you click, click, click on different pages, 00:14:04.420 --> 00:14:06.220 for instance, to access different emails, 00:14:06.220 --> 00:14:09.760 add different things to your shopping cart, what the server actually does 00:14:09.760 --> 00:14:11.300 is a little something like this. 00:14:11.300 --> 00:14:14.020 The server responds to your request, for instance 00:14:14.020 --> 00:14:18.460 after logging in, with an HTTP response that first says, 00:14:18.460 --> 00:14:20.650 200, which is code for OK. 00:14:20.650 --> 00:14:22.840 It's a so-called status code, similar in spirit 00:14:22.840 --> 00:14:25.670 to the 404 you might have seen in the real world. 00:14:25.670 --> 00:14:26.830 But 200 means OK. 00:14:26.830 --> 00:14:29.440 And then it additionally sends this line of text inside 00:14:29.440 --> 00:14:31.180 of a virtual envelope that gets sent back 00:14:31.180 --> 00:14:36.340 to you-- set dash cookie, colon, and then this key value pair, so to speak. 
00:14:36.340 --> 00:14:39.700 A word like session, then an equal sign, then a value. 00:14:39.700 --> 00:14:41.743 Now in practice, the value is actually pretty big 00:14:41.743 --> 00:14:44.410 and random with numbers and letters, but I pick something easier 00:14:44.410 --> 00:14:46.070 to pronounce for today's purposes-- 00:14:46.070 --> 00:14:48.100 1234abcd. 00:14:48.100 --> 00:14:50.590 And what this is similar to, actually, is 00:14:50.590 --> 00:14:54.820 as though when you visit this website, your hand is being stamped. 00:14:54.820 --> 00:14:59.170 Right after you've logged in, the server is now sending you, browser, 00:14:59.170 --> 00:15:02.980 this piece of message to say the equivalent of your hand 00:15:02.980 --> 00:15:04.000 has now been stamped. 00:15:04.000 --> 00:15:06.880 And so the next time you click on a link on that same website, 00:15:06.880 --> 00:15:11.710 it's as though you present this hand stamp again, and again, and again, 00:15:11.710 --> 00:15:15.280 instead of having to input your username and password again, 00:15:15.280 --> 00:15:16.790 and again, and again. 00:15:16.790 --> 00:15:21.190 This is a more seamless way, supported by HTTP, to just remind the server, 00:15:21.190 --> 00:15:23.560 I'm still David, I'm still David, I'm still David, 00:15:23.560 --> 00:15:25.840 by having virtually stamped my hand. 00:15:25.840 --> 00:15:29.450 And that's implemented by way of this cookie, so to speak. 00:15:29.450 --> 00:15:31.630 So the cookie is exactly what's in yellow here. 00:15:31.630 --> 00:15:36.100 And a session is just a concept that refers to the ability of a server 00:15:36.100 --> 00:15:37.900 to remember who you are. 00:15:37.900 --> 00:15:40.655 It's like your shopping session, in this context, 00:15:40.655 --> 00:15:43.780 of a website like amazon.com, where you might have a shopping cart that you 00:15:43.780 --> 00:15:45.100 want to keep adding things to. 
00:15:45.100 --> 00:15:47.452 The website wants to remember what's in your cart. 00:15:47.452 --> 00:15:49.660 Therefore, the website needs to remember who you are. 00:15:49.660 --> 00:15:52.060 Therefore, the website is going to check your hand stamp, 00:15:52.060 --> 00:15:54.580 much like an amusement park, a bar, or club 00:15:54.580 --> 00:15:58.360 might once you've already shown them your ticket or your ID. 00:15:58.360 --> 00:16:02.500 Now, subsequently, when your browser visits the same website 00:16:02.500 --> 00:16:05.470 again and again, it, of course, doesn't have a hand to show like this, 00:16:05.470 --> 00:16:08.830 so rather, it sends its own message via HTTP, 00:16:08.830 --> 00:16:11.320 inside of its own virtual envelopes back to the server 00:16:11.320 --> 00:16:14.230 every time you click another link, add something to your cart, 00:16:14.230 --> 00:16:17.300 open a new email on the site into which you've logged in. 00:16:17.300 --> 00:16:20.380 So here I'm just getting the home page of this server. 00:16:20.380 --> 00:16:23.170 And then I'm sending not set cookie, but cookie. 00:16:23.170 --> 00:16:25.780 This is the textual equivalent in HTTP of my 00:16:25.780 --> 00:16:27.700 presenting my hand with that stamp. 00:16:27.700 --> 00:16:30.160 I'm sending the exact same value as before-- 00:16:30.160 --> 00:16:32.980 session equals 1234abcd. 00:16:32.980 --> 00:16:35.290 That's how the server knows at this moment in time 00:16:35.290 --> 00:16:38.180 that this is me, David, and not you, for instance, 00:16:38.180 --> 00:16:40.690 even if you have logged in separately on your computer. 00:16:40.690 --> 00:16:45.580 Because you on your computer would have a different value for this cookie.
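The hand-stamp exchange can be sketched with Python's standard `http.cookies` module. This is an illustration under assumptions rather than how any particular browser works: the response lines are hypothetical, and the helper name `cookie_header_for_next_request` is made up for this example.

```python
from http.cookies import SimpleCookie

# Hypothetical response lines from the server right after logging in.
RESPONSE_LINES = [
    "HTTP/1.1 200 OK",
    "Set-Cookie: session=1234abcd",
]


def cookie_header_for_next_request(response_lines):
    """Collect cookies the server set and build the header to send back."""
    jar = SimpleCookie()
    for line in response_lines:
        name, _, value = line.partition(": ")
        if name.lower() == "set-cookie":
            jar.load(value)  # the server stamps the hand
    # Present the same stamp again on every later request to this site.
    pairs = "; ".join(f"{key}={morsel.value}" for key, morsel in jar.items())
    return f"Cookie: {pairs}"


print(cookie_header_for_next_request(RESPONSE_LINES))
```

The value is sent back unchanged, which is why, in practice, it needs to be long and random: whoever holds it is treated as the logged-in user.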
00:16:45.580 --> 00:16:48.910 Now this cookie might be generally stored in memory temporarily 00:16:48.910 --> 00:16:52.330 or it might actually be installed longer term on your computer, for a day, 00:16:52.330 --> 00:16:56.180 for a week, a year, depending on how the server's been configured. 00:16:56.180 --> 00:16:58.900 But the catch with these cookies, even though they 00:16:58.900 --> 00:17:02.860 do solve a very useful problem of retaining state, that 00:17:02.860 --> 00:17:06.579 is remembering who you are, they make you vulnerable, potentially, 00:17:06.579 --> 00:17:09.940 to what's called session hijacking, at least if you're not 00:17:09.940 --> 00:17:13.900 using encryption, that is HTTPS. 00:17:13.900 --> 00:17:17.740 Because if you're using HTTP, all of the contents of those envelopes, 00:17:17.740 --> 00:17:20.530 including the set cookie line and the cookie line 00:17:20.530 --> 00:17:23.140 are just being sent back and forth in the clear, 00:17:23.140 --> 00:17:25.390 without any encryption at all. 00:17:25.390 --> 00:17:27.050 Now what's the implication of that? 00:17:27.050 --> 00:17:29.050 Well, if an adversary is somehow listening 00:17:29.050 --> 00:17:32.200 in on your internet traffic, wirelessly or via wires, 00:17:32.200 --> 00:17:36.590 and they see your unencrypted HTTP traffic going back and forth, 00:17:36.590 --> 00:17:39.310 and they see in this traffic, inside this virtual envelope-- 00:17:39.310 --> 00:17:44.530 oh, David's session cookie happens to be 1234abcd, 00:17:44.530 --> 00:17:47.170 there's nothing technically stopping an adversary 00:17:47.170 --> 00:17:52.870 now from sending its own request, like this, to that same server, 00:17:52.870 --> 00:17:57.040 but copying your cookie, and essentially pretending 00:17:57.040 --> 00:18:00.280 that cookie is the adversary's and not just mine. 
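A toy model of why a copied cookie is enough, assuming a server that identifies users only by the session value. The `SESSIONS` table, the function name, and the requests are all hypothetical, and real servers usually layer further checks on top of this.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side table mapping session values to logged-in users.
SESSIONS = {"1234abcd": "David"}


def who_is_this(raw_request: str) -> str:
    """Decide who sent a request by looking only at its session cookie."""
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(": ")
        if name.lower() == "cookie":
            cookie = SimpleCookie(value)
            if "session" in cookie:
                return SESSIONS.get(cookie["session"].value, "anonymous")
    return "anonymous"


# The legitimate request, sent over plain HTTP and therefore readable in transit.
victim_request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Cookie: session=1234abcd\r\n"
    "\r\n"
)

# The adversary builds a fresh request and pastes in the sniffed cookie value.
sniffed_value = "1234abcd"
adversary_request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    f"Cookie: session={sniffed_value}\r\n"
    "\r\n"
)

print(who_is_this(victim_request))
print(who_is_this(adversary_request))
```

Both requests carry the same stamp, so this server has no way to tell them apart.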
00:18:00.280 --> 00:18:04.870 The implication, therefore, logically, is that when I visit that website, 00:18:04.870 --> 00:18:06.160 I might still be logged in. 00:18:06.160 --> 00:18:09.520 But when the adversary visits that website by using this technique, 00:18:09.520 --> 00:18:13.150 they might be logged in too, but as me. 00:18:13.150 --> 00:18:16.720 So here too, the solution really is just to ensure 00:18:16.720 --> 00:18:19.960 you're using encryption, namely HTTPS in this context, 00:18:19.960 --> 00:18:23.992 to ensure that not only are all of the contents of these messages encrypted 00:18:23.992 --> 00:18:25.700 that you care about going back and forth, 00:18:25.700 --> 00:18:29.230 so are these lower-level details that you might not have even known about, 00:18:29.230 --> 00:18:32.540 namely these cookies that are going back and forth. 00:18:32.540 --> 00:18:34.960 Now what is HTTPS doing for us? 00:18:34.960 --> 00:18:37.930 Well, it's ensuring that the connection between Alice and Bob 00:18:37.930 --> 00:18:40.300 is completely encrypted, that is scrambled. 00:18:40.300 --> 00:18:43.810 So that even if there are other machines in between Alice and Bob, 00:18:43.810 --> 00:18:45.970 as there would be on the internet, none of them 00:18:45.970 --> 00:18:49.510 should be able to see what is inside of those virtual envelopes 00:18:49.510 --> 00:18:50.485 going back and forth. 00:18:50.485 --> 00:18:52.360 In fact, that's a good way to think about it. 00:18:52.360 --> 00:18:55.780 Recall that as Alice sends one of these virtual envelopes to Bob, 00:18:55.780 --> 00:18:58.330 and Bob might send a virtual envelope back to Alice, 00:18:58.330 --> 00:19:01.840 there may very well be identifiable information on the outside, 00:19:01.840 --> 00:19:04.990 like the IP address of Bob or the IP address of Alice. 00:19:04.990 --> 00:19:09.880 But HTTPS ensures that what's inside of the envelope is, indeed, encrypted.
00:19:09.880 --> 00:19:11.890 So that even if some other machine in the middle 00:19:11.890 --> 00:19:13.930 intercepts one of these virtual envelopes, 00:19:13.930 --> 00:19:17.430 they can't understand what's inside of it. 00:19:17.430 --> 00:19:19.020 Now how is that done? 00:19:19.020 --> 00:19:21.840 Well, it turns out that there's another protocol 00:19:21.840 --> 00:19:27.180 in the world that handles precisely that process of encrypting HTTP traffic, 00:19:27.180 --> 00:19:27.810 a.k.a. 00:19:27.810 --> 00:19:28.827 HTTPS. 00:19:28.827 --> 00:19:30.660 And the most recent version of this protocol 00:19:30.660 --> 00:19:33.630 is called TLS, which funny enough is perhaps 00:19:33.630 --> 00:19:36.210 an acronym that many people have still not heard of. 00:19:36.210 --> 00:19:39.570 But you might have heard of SSL, which is essentially 00:19:39.570 --> 00:19:40.980 an earlier version of it. 00:19:40.980 --> 00:19:44.490 But TLS is essentially the new and improved version of SSL 00:19:44.490 --> 00:19:47.310 and what modern browsers should now be using. 00:19:47.310 --> 00:19:51.030 What does this do to go about encrypting your traffic? 00:19:51.030 --> 00:19:53.400 Well, it turns out it relies on our focus 00:19:53.400 --> 00:19:56.770 from last time of public key cryptography. 00:19:56.770 --> 00:20:00.120 And this principle that if you give two parties, A and B, 00:20:00.120 --> 00:20:03.993 each their own public and private key, using that, 00:20:03.993 --> 00:20:05.910 you can solve that chicken and the egg problem 00:20:05.910 --> 00:20:09.330 and actually communicate securely, even if in advance you 00:20:09.330 --> 00:20:11.250 don't have a shared secret. 00:20:11.250 --> 00:20:15.030 So recall that asymmetric cryptography allows 00:20:15.030 --> 00:20:18.480 us to establish a secure connection even if you've never visited 00:20:18.480 --> 00:20:21.370 some website or some app before now. 
00:20:21.370 --> 00:20:26.520 So what does it mean for a browser to be using TLS, and thus, 00:20:26.520 --> 00:20:30.220 HTTPS to communicate securely with a web server? 00:20:30.220 --> 00:20:33.750 Well, the web server in this story now has what we'll call a certificate, 00:20:33.750 --> 00:20:35.040 a digital certificate. 00:20:35.040 --> 00:20:38.580 And you can think of this, for now, as really a public key 00:20:38.580 --> 00:20:41.830 that has been signed by someone else. 00:20:41.830 --> 00:20:44.130 So the website has a public key and a private key. 00:20:44.130 --> 00:20:46.560 And the private key, as always, stays private. 00:20:46.560 --> 00:20:50.010 But in this case, the web server has a public key 00:20:50.010 --> 00:20:52.740 that's also been digitally signed by some third party. 00:20:52.740 --> 00:20:56.040 And for now, let's assume that there are some big third parties out there, 00:20:56.040 --> 00:20:59.940 companies really, that we all or, at least the browser manufacturers, 00:20:59.940 --> 00:21:04.320 the Google's, the Microsofts, the Apples of the world, all trust on our behalf. 00:21:04.320 --> 00:21:06.840 And they are the ones signing off on the legitimacy 00:21:06.840 --> 00:21:08.590 of these so-called certificates. 00:21:08.590 --> 00:21:12.240 Well, these certificates technically, are of a type called X.509, 00:21:12.240 --> 00:21:14.820 if you're curious about the type of protocol being used. 00:21:14.820 --> 00:21:18.360 But this is just the standard format in which these certificates live. 
00:21:18.360 --> 00:21:20.910 But you can think of a certificate really as almost a printed 00:21:20.910 --> 00:21:23.580 piece of paper with some interesting information on it, 00:21:23.580 --> 00:21:27.930 like the name of the website, and how long the certificate is valid for, 00:21:27.930 --> 00:21:30.870 and also the public key, which we said last time 00:21:30.870 --> 00:21:34.380 is really just a big number that has a mathematical relationship 00:21:34.380 --> 00:21:36.640 with that private key as well. 00:21:36.640 --> 00:21:40.110 Now who are these big players, these big companies that 00:21:40.110 --> 00:21:43.463 are doing the signing of these website's certificates? 00:21:43.463 --> 00:21:45.630 Well, you might have heard this phrase at some point 00:21:45.630 --> 00:21:47.610 if you set up your own website, perhaps. 00:21:47.610 --> 00:21:51.300 These are called Certificate Authorities, or CAs. 00:21:51.300 --> 00:21:53.940 And these are a-- 00:21:53.940 --> 00:21:57.840 these are a collection of companies and entities whose purpose in life 00:21:57.840 --> 00:21:59.940 is to digitally sign certificates. 00:21:59.940 --> 00:22:02.670 And the various browser manufacturers of the world-- 00:22:02.670 --> 00:22:05.760 Apple, Microsoft, Google, Mozilla, and others 00:22:05.760 --> 00:22:09.810 have gotten together and included in their browsers, 00:22:09.810 --> 00:22:16.800 like Edge, and Firefox, and Safari, and Chrome, a list of certificate 00:22:16.800 --> 00:22:19.170 authorities that they trust. 00:22:19.170 --> 00:22:24.010 And the idea is that if you trust Apple, and you trust Google, and Microsoft, 00:22:24.010 --> 00:22:27.390 and Mozilla, and other browser manufacturers, then by transitivity, 00:22:27.390 --> 00:22:32.050 you should trust any of the certificate authorities that they, in turn, trust. 00:22:32.050 --> 00:22:34.710 So what actually happens when your browser visits a website? 
00:22:34.710 --> 00:22:37.770 Well, it first downloads the certificate from that website, 00:22:37.770 --> 00:22:39.900 assuming you're using HTTPS. 00:22:39.900 --> 00:22:43.380 Your browser then calculates a hash value for that certificate 00:22:43.380 --> 00:22:45.810 by looking at certain fields within it, using 00:22:45.810 --> 00:22:48.990 a special hash function that produces a fixed-size 00:22:48.990 --> 00:22:51.250 representation of the certificate. 00:22:51.250 --> 00:22:53.050 Now why does it bother doing that? 00:22:53.050 --> 00:22:56.220 Well, the next step that your browser does is this. 00:22:56.220 --> 00:22:59.760 It takes a look at the signature on that certificate. 00:22:59.760 --> 00:23:05.550 It looks up the public key of the certificate authority who signed that certificate. 00:23:05.550 --> 00:23:09.480 And then it takes that CA's public key and the signature 00:23:09.480 --> 00:23:10.770 from that server certificate. 00:23:10.770 --> 00:23:15.150 It runs them through this algorithm here, effectively decrypting the signature 00:23:15.150 --> 00:23:16.540 with the public key. 00:23:16.540 --> 00:23:21.150 And that should produce the exact same hash value. 00:23:21.150 --> 00:23:23.940 That is to say, if you visit a server, and it's presenting you 00:23:23.940 --> 00:23:27.210 with a certificate, and it says that that certificate has been digitally 00:23:27.210 --> 00:23:30.120 signed by a certificate authority, your browser 00:23:30.120 --> 00:23:35.620 can use the certificate authority's public key to decrypt that signature. 00:23:35.620 --> 00:23:38.940 And by way of these hashes, confirm or deny 00:23:38.940 --> 00:23:42.750 that, yes, that server's certificate was indeed 00:23:42.750 --> 00:23:44.830 signed by the certificate authority.
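The hash-and-compare check just described can be sketched in a few lines of Python. This is only a toy under loud assumptions: it uses textbook RSA with no padding, a made-up CA key built from two well-known Mersenne primes, and a plain byte string standing in for the certificate's fields. Real browsers parse X.509 certificates, use padded signature schemes, and walk a chain of trust. Every name and value below is hypothetical.

```python
import hashlib

# ASSUMPTION: a toy CA key pair (textbook RSA, far too small and simple for real use).
P = 2**61 - 1                              # a Mersenne prime
Q = 2**89 - 1                              # another Mersenne prime
CA_N = P * Q                               # public modulus
CA_E = 65537                               # public exponent
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)


def cert_hash(cert_fields: bytes) -> int:
    """Hash the certificate's fields down to a fixed-size number."""
    digest = hashlib.sha256(cert_fields).digest()
    return int.from_bytes(digest, "big") % CA_N


def ca_sign(cert_fields: bytes) -> int:
    """What the CA does once: sign the hash with its private key."""
    return pow(cert_hash(cert_fields), CA_D, CA_N)


def browser_verify(cert_fields: bytes, signature: int) -> bool:
    """What the browser does: 'decrypt' the signature with the CA's public
    key and compare the result to the hash it computed itself."""
    recovered = pow(signature, CA_E, CA_N)
    return recovered == cert_hash(cert_fields)


# Hypothetical certificate contents, not a real X.509 encoding.
cert = b"name=example.com;valid_until=2030-01-01;public_key=12345"
signature = ca_sign(cert)

# A certificate with a different name, presented with the same signature.
forged = b"name=examp1e.com;valid_until=2030-01-01;public_key=67890"

print(browser_verify(cert, signature))    # the two hashes match
print(browser_verify(forged, signature))  # the two hashes differ
```

The first check should succeed and the second should fail, because changing any field of the certificate changes its hash, and only the holder of the CA's private key can produce a signature that matches the new hash.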
00:23:44.830 --> 00:23:49.110 And, again, if you trust Google, if you trust Microsoft, Apple, Mozilla-- 00:23:49.110 --> 00:23:51.930 and that's another question all to itself, but if you trust them, 00:23:51.930 --> 00:23:54.540 and they, in turn, trust these certificate authorities, 00:23:54.540 --> 00:23:58.260 the presumption is that you should trust your secure connection 00:23:58.260 --> 00:24:00.960 with this particular website. 00:24:00.960 --> 00:24:07.890 Now with that said, does HTTPS, and in turn TLS, keep you secure? 00:24:07.890 --> 00:24:10.950 Mathematically, yes, but you and I, as the humans, 00:24:10.950 --> 00:24:13.260 are again, the potential weakness here. 00:24:13.260 --> 00:24:13.830 Why? 00:24:13.830 --> 00:24:18.050 There's another attack called SSL stripping, for historical reasons. 00:24:18.050 --> 00:24:20.360 But now it refers also to TLS. 00:24:20.360 --> 00:24:24.110 And what this attack involves is tricking the user 00:24:24.110 --> 00:24:27.290 into thinking they have a secure connection to a website, 00:24:27.290 --> 00:24:31.280 when they might actually not have an encrypted connection to that website. 00:24:31.280 --> 00:24:34.130 And worse yet, they might actually have an encrypted connection 00:24:34.130 --> 00:24:38.090 to a third party, a machine in the middle's own website. 00:24:38.090 --> 00:24:39.480 So how might this work? 00:24:39.480 --> 00:24:43.640 Well, if you and I are in the habit of still only typing URLs 00:24:43.640 --> 00:24:50.900 as http://www.example.com, or maybe you and I are in the habit of just 00:24:50.900 --> 00:24:56.270 typing www.example.com, Enter, into our browsers, or maybe, more likely, 00:24:56.270 --> 00:24:59.750 you and I are in the habit of just typing example.com, Enter, 00:24:59.750 --> 00:25:00.800 in our browsers.
00:25:00.800 --> 00:25:03.988 Well, if you watch the URL bar, the address bar in your browser, 00:25:03.988 --> 00:25:05.780 you've probably noticed over time that even 00:25:05.780 --> 00:25:08.390 if you type the most succinct of those inputs, 00:25:08.390 --> 00:25:13.040 it eventually gets converted into a longer URL with the HTTP, 00:25:13.040 --> 00:25:17.370 maybe with the www, and perhaps even more characters as well. 00:25:17.370 --> 00:25:19.940 And that's because your browser is just trying to be helpful. 00:25:19.940 --> 00:25:23.210 Technically, to visit a website you need to use a URL. 00:25:23.210 --> 00:25:30.050 And a URL, in this case, should start with http:// or maybe https://. 00:25:30.050 --> 00:25:31.800 Your browser is just trying to be helpful. 00:25:31.800 --> 00:25:35.060 So if you don't even type any of those, it might first try HTTP 00:25:35.060 --> 00:25:38.240 and then it might try HTTPS even. 00:25:38.240 --> 00:25:43.400 But the catch is that if you start your interaction with a web server using 00:25:43.400 --> 00:25:48.260 HTTP, that alone might be enough of a window of opportunity 00:25:48.260 --> 00:25:50.900 for an adversary to do something malicious 00:25:50.900 --> 00:25:53.990 with the packet of information you're sending to the server. 00:25:53.990 --> 00:25:57.410 And then maybe send you back a response that tricks you 00:25:57.410 --> 00:26:00.770 into ending up at some other website altogether, or perhaps just 00:26:00.770 --> 00:26:02.720 the adversary's own website. 00:26:02.720 --> 00:26:05.270 And they might be super clever and make you 00:26:05.270 --> 00:26:08.642 think that you have a secure connection to the original website 00:26:08.642 --> 00:26:09.350 that you visited. 00:26:09.350 --> 00:26:13.040 So what do I mean by that?
Well, suppose that inside of your browser's virtual envelope 00:26:13.040 --> 00:26:15.560 is an HTTP message like this, which is just saying, 00:26:15.560 --> 00:26:18.440 get me the home page of example.com. 00:26:18.440 --> 00:26:22.550 And suppose for the sake of discussion that you are not using HTTPS, 00:26:22.550 --> 00:26:25.820 you just typed HTTP, or nothing at all, and you're 00:26:25.820 --> 00:26:28.070 trusting your browser to fill this in for you. 00:26:28.070 --> 00:26:30.320 What might come back from the server? 00:26:30.320 --> 00:26:34.100 Well the server could respond with a message that says this. 00:26:34.100 --> 00:26:38.990 HTTP version 3, 307, which is another status code, which means 00:26:38.990 --> 00:26:41.780 redirect the browser to a different URL. 00:26:41.780 --> 00:26:46.200 This is the server's way of telling the browser, detour, go to this other URL instead. 00:26:46.200 --> 00:26:48.290 Well, what location is that URL at? 00:26:48.290 --> 00:26:50.240 Well, perhaps this one here. 00:26:50.240 --> 00:26:53.870 But the catch is that if you're using HTTP, 00:26:53.870 --> 00:26:56.360 and therefore, your request is unencrypted, 00:26:56.360 --> 00:27:00.050 and suppose for this story, that there is a machine in the middle, 00:27:00.050 --> 00:27:04.100 waiting there, listening, to attack you, it could actually 00:27:04.100 --> 00:27:08.450 be that this response is coming from that machine in the middle and not 00:27:08.450 --> 00:27:10.770 the actual website that you intended to visit. 00:27:10.770 --> 00:27:11.270 Why? 00:27:11.270 --> 00:27:15.440 Because if Alice is trying to reach Bob, but Eve, the eavesdropper, 00:27:15.440 --> 00:27:18.860 is in the middle, it could actually be Eve in the middle that's 00:27:18.860 --> 00:27:20.430 responding to this request. 00:27:20.430 --> 00:27:23.460 So Bob doesn't even know what's going on in this story. 00:27:23.460 --> 00:27:24.780 But the catch is here.
00:27:24.780 --> 00:27:27.920 Notice that this eavesdropper, this machine in the middle 00:27:27.920 --> 00:27:29.480 is particularly clever. 00:27:29.480 --> 00:27:34.430 Because they're suggesting that you be redirected, per this status code, 307, 00:27:34.430 --> 00:27:35.690 to https://. 00:27:35.690 --> 00:27:38.630 And odds are, you and I are probably in the habit 00:27:38.630 --> 00:27:41.330 of at least making sure that our browser says secure, 00:27:41.330 --> 00:27:44.030 or that at least you're at an HTTPS URL. 00:27:44.030 --> 00:27:47.600 So you might think, whoo, good, everything is the way it should be. 00:27:47.600 --> 00:27:50.990 Now this is very, very subtle, but what if I 00:27:50.990 --> 00:27:55.730 draw your attention to the actual URL I'm being sent to here. 00:27:55.730 --> 00:27:59.670 At least on a US English keyboard, using this particular font, 00:27:59.670 --> 00:28:02.632 this is no longer example.com-- 00:28:02.632 --> 00:28:11.450 E-X-A-M-P-L-E-- dot-- C-O-M, This is now, indeed, E-X-A-M-P-1-E-- 00:28:11.450 --> 00:28:12.630 dot-- com. 00:28:12.630 --> 00:28:17.870 So this is a particularly subtle attack, whereby the adversary in this story 00:28:17.870 --> 00:28:22.040 seems to have bought a domain name that looks very similar to example.com, 00:28:22.040 --> 00:28:26.030 with an L, but they instead used the number 1, which in some fonts 00:28:26.030 --> 00:28:29.000 and on some screens look so close to an L that are you 00:28:29.000 --> 00:28:31.340 or I really going to even notice this difference? 00:28:31.340 --> 00:28:33.620 The implication, though, of this subtlety 00:28:33.620 --> 00:28:38.780 is that you might very well have a perfectly encrypted connection using 00:28:38.780 --> 00:28:43.220 HTTPS to a server, but it's not Bob's server in this story, 00:28:43.220 --> 00:28:46.010 it's now Eve's server in the middle. 
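To make the lookalike trick concrete, here is a minimal sketch in Python of how software might flag a domain that differs from a trusted one only by easily confused characters. The substitution table and the function names are assumptions made up for illustration. Real confusable-character detection covers far more cases than a digit 1 standing in for a lowercase L.

```python
# ASSUMPTION: a tiny, hypothetical table of characters that are easy to
# mistake for one another in many fonts (digit one vs. letter l, zero vs. o).
CONFUSABLE = str.maketrans({"1": "l", "0": "o"})


def skeleton(domain: str) -> str:
    """Normalize a domain so that confusable characters compare as equal."""
    return domain.lower().translate(CONFUSABLE)


def is_lookalike(candidate: str, trusted: str) -> bool:
    """True if candidate is a different domain that looks like the trusted one."""
    different = candidate.lower() != trusted.lower()
    return different and skeleton(candidate) == skeleton(trusted)


print(is_lookalike("examp1e.com", "example.com"))  # True: 1 is posing as l
print(is_lookalike("example.com", "example.com"))  # False: it is the same domain
```

Note that HTTPS alone does not catch this: the adversary can hold a perfectly valid certificate for the lookalike domain, which is why the connection still appears secure.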
00:28:46.010 --> 00:28:49.460 So SSL stripping in this case refers to an attack, 00:28:49.460 --> 00:28:53.180 whereby you're sort of stripping out what would be an HTTP 00:28:53.180 --> 00:28:57.530 redirection to the right place, and maybe you never even end up at HTTPS, 00:28:57.530 --> 00:29:01.610 and the eavesdropper in the middle always keeps it as HTTP. 00:29:01.610 --> 00:29:04.700 But an even more malicious adversary might take it one step further 00:29:04.700 --> 00:29:08.390 and actually redirect you to their own HTTPS site. 00:29:08.390 --> 00:29:11.060 At which point, you might be vulnerable to the phishing attacks 00:29:11.060 --> 00:29:13.727 that we've discussed before, where you might provide information 00:29:13.727 --> 00:29:17.740 into a website that is not actually legitimate. 00:29:17.740 --> 00:29:20.890 So how can you mitigate this kind of attack? 00:29:20.890 --> 00:29:23.980 Well, one, if you're a user, a consumer, you 00:29:23.980 --> 00:29:30.460 could just get into the habit of always typing out HTTPS and then 00:29:30.460 --> 00:29:32.260 the domain name that you want to visit. 00:29:32.260 --> 00:29:34.990 I will concede that can get tedious quickly, 00:29:34.990 --> 00:29:40.540 but that is, hands down, the most paranoid solution to implement here. 00:29:40.540 --> 00:29:43.300 Because you know you will end up at HTTPS. 00:29:43.300 --> 00:29:45.670 Hopefully, the website actually supports HTTPS, 00:29:45.670 --> 00:29:48.380 but most websites nowadays certainly do. 00:29:48.380 --> 00:29:51.280 With that said, if you're on the flipside of the story, 00:29:51.280 --> 00:29:55.180 and you're actually the designer of the website, the business owner running 00:29:55.180 --> 00:29:59.200 the website, or you have control over not the browser, but the server, 00:29:59.200 --> 00:30:01.790 there's a few different things that you can do. 
00:30:01.790 --> 00:30:04.210 And, in fact, you can use a protocol called 00:30:04.210 --> 00:30:10.360 HSTS, which is to say you can actually configure your server to provide hints 00:30:10.360 --> 00:30:14.530 to browsers that, you know what, they should always use HTTPS 00:30:14.530 --> 00:30:18.460 when talking to the server no matter what the human has decided. 00:30:18.460 --> 00:30:23.470 And HSTS here, for HTTP Strict Transport Security, 00:30:23.470 --> 00:30:26.410 has you, as the administrator of the server 00:30:26.410 --> 00:30:30.220 just configure your server to send an additional HTTP header. 00:30:30.220 --> 00:30:33.070 That is to say, an additional line of text inside one 00:30:33.070 --> 00:30:37.810 of those virtual envelopes that just informs the browser that we really 00:30:37.810 --> 00:30:40.450 want to be strict about our transport security. 00:30:40.450 --> 00:30:42.550 That is we want to be using TLS. 00:30:42.550 --> 00:30:45.100 We want the user to be using HTTPS. 00:30:45.100 --> 00:30:49.420 And here, the server is telling the browser, assume that this is the case, 00:30:49.420 --> 00:30:52.810 that I want you to use strict security for at least one year. 00:30:52.810 --> 00:30:57.490 This is the number of seconds in a 365-day year, for instance. 00:30:57.490 --> 00:30:59.573 That's just a really long time telling the browser 00:30:59.573 --> 00:31:02.490 that, yes, I'm going to keep my security on for at least a year, which 00:31:02.490 --> 00:31:03.580 should be the case anyway. 00:31:03.580 --> 00:31:07.990 But you can further configure your server to not just output this response 00:31:07.990 --> 00:31:11.140 in every one of those virtual envelopes, going back to browsers. 00:31:11.140 --> 00:31:16.730 You can even more protectively say use strict security for subdomains as well.
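As a rough sketch of what that additional line of text looks like, the Python below builds the header a server might send. The header name `Strict-Transport-Security` and its `max-age` and `includeSubDomains` directives are the standard ones. The helper function and its parameters are hypothetical, and how you actually attach the header depends on your web server or framework.

```python
# Number of seconds in a 365-day year, as mentioned above.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def hsts_header(max_age: int = SECONDS_PER_YEAR,
                include_subdomains: bool = False) -> str:
    """Build the HSTS header line (a hypothetical helper for illustration)."""
    value = f"max-age={max_age}"
    if include_subdomains:
        # Also require HTTPS for subdomains such as www.example.com.
        value += "; includeSubDomains"
    return f"Strict-Transport-Security: {value}"


print(hsts_header())
# Strict-Transport-Security: max-age=31536000
print(hsts_header(include_subdomains=True))
# Strict-Transport-Security: max-age=31536000; includeSubDomains
```

Browsers only honor this header when it arrives over HTTPS, so a server would typically also redirect plain HTTP requests to the HTTPS version of the site.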
00:31:16.730 --> 00:31:20.350 So even if my user is at example.com, also 00:31:20.350 --> 00:31:23.710 make sure that their browser uses HTTPS for something 00:31:23.710 --> 00:31:27.950 like www.example.com, where in that scenario www, 00:31:27.950 --> 00:31:30.640 you can think of as a subdomain, because it's 00:31:30.640 --> 00:31:33.710 part of the example.com domain itself. 00:31:33.710 --> 00:31:36.700 And so what is this telling the browser specifically? 00:31:36.700 --> 00:31:40.990 Even though you might accidentally, conveniently visit a website 00:31:40.990 --> 00:31:45.040 for the very first time using http:// because you typed it, 00:31:45.040 --> 00:31:49.300 or you let your browser automatically fill that for you, 00:31:49.300 --> 00:31:51.940 if the server responds with this message, 00:31:51.940 --> 00:31:56.570 the whole point of HSTS is that the second time, the third time, 00:31:56.570 --> 00:32:01.660 the 300th time, the 3,000th time that your browser visits that exact same 00:32:01.660 --> 00:32:05.020 domain in the future, up until at least a year from now, 00:32:05.020 --> 00:32:09.580 it will automatically switch you to HTTPS. 00:32:09.580 --> 00:32:14.298 And it very protectively won't even let you visit the HTTP. 00:32:14.298 --> 00:32:16.840 Even if that's what's in the URL bar, it's just going to say, 00:32:16.840 --> 00:32:20.420 nope, I've been told to use HTTPS instead. 00:32:20.420 --> 00:32:23.950 And that, therefore, decreases the window of opportunity for adversaries 00:32:23.950 --> 00:32:28.840 to just that very first request that you might make accidentally, conveniently 00:32:28.840 --> 00:32:32.200 that's using HTTP, but every subsequent request 00:32:32.200 --> 00:32:37.550 from your browser, according to this model will now be HTTPS instead. 00:32:37.550 --> 00:32:39.400 And you can go one step further too. 
00:32:39.400 --> 00:32:41.290 If you're the administrator of a server, you 00:32:41.290 --> 00:32:45.700 can also include the keyword preload in this message that's inside 00:32:45.700 --> 00:32:47.620 of all of these virtual envelopes. 00:32:47.620 --> 00:32:50.680 And what that will further tell the world is 00:32:50.680 --> 00:32:53.590 that if the browser manufacturers would like 00:32:53.590 --> 00:32:58.780 to preload this information into Chrome, into other browsers 00:32:58.780 --> 00:33:01.540 that humans download, you can even eliminate 00:33:01.540 --> 00:33:05.410 that first window of opportunity because you can have your domain name 00:33:05.410 --> 00:33:09.220 included, essentially, in the source code for browsers like Chrome. 00:33:09.220 --> 00:33:13.570 So that when people download Chrome itself, visit your website, 00:33:13.570 --> 00:33:16.300 like example.com, your browser will already 00:33:16.300 --> 00:33:22.700 know that they should not use HTTP with this website, they should use HTTPS. 00:33:22.700 --> 00:33:28.910 Questions then on HTTP, on TLS, on HSTS. 00:33:28.910 --> 00:33:32.690 It's a lot of acronyms, but realize that some of these defenses 00:33:32.690 --> 00:33:35.960 are available to you as a user, and some of you, if more technical, 00:33:35.960 --> 00:33:39.240 these defenses are available to you as system administrator. 00:33:39.240 --> 00:33:44.660 AUDIENCE: So you mentioned that there is such websites as exam1e.com, 00:33:44.660 --> 00:33:47.540 they very similarly look to example.com. 00:33:47.540 --> 00:33:52.070 And I know that registrars fight against phishing websites like this. 00:33:52.070 --> 00:33:54.920 And whenever somebody tries to register a domain that's 00:33:54.920 --> 00:33:58.760 looking a little bit suspicious, it usually marks it as fraud 00:33:58.760 --> 00:34:00.320 and suspends it. 00:34:00.320 --> 00:34:04.190 So how do such domains still exist? 00:34:04.190 --> 00:34:06.590 And why is it so common? 
00:34:06.590 --> 00:34:08.550 DAVID MALAN: That's a really good question. 00:34:08.550 --> 00:34:12.380 And that's great that registrars, who are the companies in the world that 00:34:12.380 --> 00:34:15.080 sell you, or rent you, domain names nowadays, 00:34:15.080 --> 00:34:17.900 are being more vigilant when it comes to your buying 00:34:17.900 --> 00:34:20.750 a domain that could be maliciously used in this way 00:34:20.750 --> 00:34:25.190 because it's so similar to a brand name or an existing website. 00:34:25.190 --> 00:34:27.320 However, there's a lot of registrars out there, 00:34:27.320 --> 00:34:29.570 and I would conjecture that not all of them 00:34:29.570 --> 00:34:32.190 are as good as others at doing that detection. 00:34:32.190 --> 00:34:35.090 There are hundreds of top-level domains nowadays, 00:34:35.090 --> 00:34:39.920 TLDs, which means that you could even choose example-dot something else 00:34:39.920 --> 00:34:43.130 potentially, and that too might not be this way. 00:34:43.130 --> 00:34:46.489 Now, eventually, maybe, especially when you use it maliciously, 00:34:46.489 --> 00:34:48.179 you would eventually get shut down. 00:34:48.179 --> 00:34:50.719 But maybe it's enough to attack one person, 00:34:50.719 --> 00:34:53.760 or two, or 10, or 100 before it's actually shut down. 00:34:53.760 --> 00:34:57.080 So these remain theoretical and actual attacks, 00:34:57.080 --> 00:35:00.720 but there are certainly ways to push back on this. 00:35:00.720 --> 00:35:01.560 A good question. 00:35:01.560 --> 00:35:04.500 Others from the group. 00:35:04.500 --> 00:35:07.680 AUDIENCE: I would like to know that is HTTP 00:35:07.680 --> 00:35:13.680 is the best solution for cookies and super cookies to prevent from attack? 00:35:13.680 --> 00:35:18.512 Or should I clear the cookies frequently? 00:35:18.512 --> 00:35:19.720 DAVID MALAN: A good question. 00:35:19.720 --> 00:35:25.590 Unfortunately, super cookies cannot be stopped at the browser level. 
00:35:25.590 --> 00:35:28.200 Super cookies refer to a type of cookie that's 00:35:28.200 --> 00:35:31.770 embedded by your company, your university, your internet service 00:35:31.770 --> 00:35:32.460 provider. 00:35:32.460 --> 00:35:35.140 And you would have to opt out at that level. 00:35:35.140 --> 00:35:37.770 So for context, for Americans in the group, 00:35:37.770 --> 00:35:40.980 AT&T and Verizon started doing this a few years ago, 00:35:40.980 --> 00:35:47.790 where they were injecting cookies into cellphone customers' HTTP requests. 00:35:47.790 --> 00:35:51.360 You literally, and stupidly, and obnoxiously have to log 00:35:51.360 --> 00:35:57.480 into your Verizon.com or your AT&T.com account, via the web or the app, 00:35:57.480 --> 00:35:58.930 and opt out of this. 00:35:58.930 --> 00:36:00.720 So one thing to your question here. 00:36:00.720 --> 00:36:04.740 Super cookies are actually super annoying, super difficult, 00:36:04.740 --> 00:36:06.960 super dangerous in that sense. 00:36:06.960 --> 00:36:09.210 Because they can happen without your knowing. 00:36:09.210 --> 00:36:14.970 However, if you are using HTTPS, that should decrease 00:36:14.970 --> 00:36:16.500 the probability of this happening. 00:36:16.500 --> 00:36:19.132 Because if your data is encrypted, your company, 00:36:19.132 --> 00:36:21.090 your university, your internet service provider 00:36:21.090 --> 00:36:25.980 can't insert the cookies into those encrypted messages unless, 00:36:25.980 --> 00:36:29.670 and we'll talk a bit about this more later, unless you have given 00:36:29.670 --> 00:36:32.340 permission to your company, or university to install 00:36:32.340 --> 00:36:36.680 special software on your Mac or PC. 00:36:36.680 --> 00:36:40.470 So in short, simplest advice is always use HTTPS. 00:36:40.470 --> 00:36:44.030 And if you have a cellphone provider, google around, 00:36:44.030 --> 00:36:46.950 find out if they might be doing this to you. 
00:36:46.950 --> 00:36:49.490 And if so, figure out if you can opt out. 00:36:49.490 --> 00:36:51.800 How about one more question here? 00:36:51.800 --> 00:36:54.770 AUDIENCE: The question will, be from a macro perspective, 00:36:54.770 --> 00:36:56.750 will it be feasible in the same way that there 00:36:56.750 --> 00:36:59.540 are machines in the middle of [INAUDIBLE] 00:36:59.540 --> 00:37:04.550 in the middle verifications, like a request from, let's say, 00:37:04.550 --> 00:37:11.250 my cellphone to the [INAUDIBLE] needs to do so many hops to reach the server, 00:37:11.250 --> 00:37:11.750 right? 00:37:11.750 --> 00:37:17.000 Will it be [INAUDIBLE] between each hop, every packet 00:37:17.000 --> 00:37:22.030 will get a stamp, like a passport, to verify the integrity of the connection? 00:37:22.030 --> 00:37:25.280 DAVID MALAN: Potentially, the catch is-- and if I'm understanding the question 00:37:25.280 --> 00:37:30.020 correctly, the catch is you, I, we don't control all of these machines 00:37:30.020 --> 00:37:31.800 in the middle when it comes to routers. 00:37:31.800 --> 00:37:33.560 So it's possible to do what you're doing, 00:37:33.560 --> 00:37:35.730 but there just isn't coordination at that level. 00:37:35.730 --> 00:37:40.410 And so per our discussion last week of end to end encryption, in general, 00:37:40.410 --> 00:37:44.330 it's best that the sender and receiver worry about doing the encryption. 00:37:44.330 --> 00:37:47.510 Because that way you don't have to trust anyone in between you, 00:37:47.510 --> 00:37:53.960 any machines in the middle, so long as you are using a protocol that supports 00:37:53.960 --> 00:37:56.310 some form of encryption end to end. 00:37:56.310 --> 00:37:59.263 So how else can you keep your system secure, particularly when 00:37:59.263 --> 00:38:00.680 they're communicating with others? 
00:38:00.680 --> 00:38:03.470 Well, another technology with which you might already be familiar 00:38:03.470 --> 00:38:06.630 is this, a VPN, or a Virtual Private Network. 00:38:06.630 --> 00:38:12.530 Now whereas HTTPS only secures your web traffic between browser and server, 00:38:12.530 --> 00:38:15.380 a VPN is, I dare say, a more powerful technology 00:38:15.380 --> 00:38:19.190 because it encrypts all of your internet traffic between you 00:38:19.190 --> 00:38:22.080 and whatever VPN server to which you're connecting. 00:38:22.080 --> 00:38:23.240 So how does this work? 00:38:23.240 --> 00:38:28.910 A VPN allows Alice and Bob to establish an encrypted channel 00:38:28.910 --> 00:38:32.960 so that even if there are machines in the middle, routers or otherwise, 00:38:32.960 --> 00:38:34.010 that shouldn't matter. 00:38:34.010 --> 00:38:36.380 Because Alice and Bob are using cryptography 00:38:36.380 --> 00:38:42.020 to encrypt all of the information going in between points A and B. Now 00:38:42.020 --> 00:38:43.110 how might you use this? 00:38:43.110 --> 00:38:45.110 Well, it's very common if you work for a company 00:38:45.110 --> 00:38:47.480 that itself has servers that you might need access 00:38:47.480 --> 00:38:51.560 to, whether it's email, or files, or anything else, you might, 00:38:51.560 --> 00:38:54.980 from your laptop, have to, by policy at that company, 00:38:54.980 --> 00:38:59.090 connect to your company servers via VPN, Virtual Private Network. 00:38:59.090 --> 00:39:02.120 That is to say your Mac, or PC, or your phone 00:39:02.120 --> 00:39:05.420 have special software that you start up, you probably 00:39:05.420 --> 00:39:08.360 log into using minimally a username and a password, 00:39:08.360 --> 00:39:12.865 maybe using a two-factor code, maybe using a USB device that you 00:39:12.865 --> 00:39:14.240 have to connect to your computer. 00:39:14.240 --> 00:39:16.820 You have to somehow authenticate to that VPN.
00:39:16.820 --> 00:39:19.580 And once that software is up and running and authenticated 00:39:19.580 --> 00:39:23.390 against the VPN server, run in this case by your company. 00:39:23.390 --> 00:39:25.790 All of your internet traffic, thereafter, 00:39:25.790 --> 00:39:30.080 should be encrypted between you, point A, and the company, 00:39:30.080 --> 00:39:33.590 point B. The motivation for that is that this way the company 00:39:33.590 --> 00:39:37.340 can ensure that no matter what services you are accessing inside 00:39:37.340 --> 00:39:42.410 of the corporate network, be it email, or files, or maybe video conferencing, 00:39:42.410 --> 00:39:46.700 or something else all together, no matter what, by nature of that VPN, 00:39:46.700 --> 00:39:49.610 all of that traffic is encrypted. 00:39:49.610 --> 00:39:53.270 But you should realize too that VPNs have a few side 00:39:53.270 --> 00:39:55.100 effects, perhaps good, perhaps bad. 00:39:55.100 --> 00:39:58.580 When Alice connects to Bob, if Bob is the VPN server, 00:39:58.580 --> 00:40:00.740 she has this encrypted tunnel. 00:40:00.740 --> 00:40:03.950 This is what we mean by a private-- a virtual private network. 00:40:03.950 --> 00:40:07.580 She has this encrypted tunnel to Bob, which typically 00:40:07.580 --> 00:40:13.100 makes it appear as though Alice's IP address, her internet address, 00:40:13.100 --> 00:40:16.700 her unique identifier on the internet, is actually that of Bob 00:40:16.700 --> 00:40:17.990 and not her own. 00:40:17.990 --> 00:40:21.800 That is to say, if Alice connects to her company's VPN server, 00:40:21.800 --> 00:40:26.720 she then gets another IP address from her company's own VPN server 00:40:26.720 --> 00:40:27.480 effectively. 00:40:27.480 --> 00:40:32.900 So if Alice then visits gmail.com, or amazon.com, or any other website, 00:40:32.900 --> 00:40:36.110 those websites will actually think that Alice's IP 00:40:36.110 --> 00:40:39.440 address is that of Bob and not Alice. 
00:40:39.440 --> 00:40:42.620 So this is very commonly used if you're in one country, 00:40:42.620 --> 00:40:45.380 and you want to masquerade as though you're in another. 00:40:45.380 --> 00:40:48.830 And this might be because you need to in order to access company resources. 00:40:48.830 --> 00:40:50.622 Perhaps, from a show of smiles, this might 00:40:50.622 --> 00:40:52.580 be because you want to access a streaming media 00:40:52.580 --> 00:40:56.510 service that you don't have access to when you're in one country or another. 00:40:56.510 --> 00:41:00.470 The point, though, is that you have this encrypted connection between points A 00:41:00.470 --> 00:41:06.140 and B. And thereafter, you can visit any website, any service, 00:41:06.140 --> 00:41:10.190 as though you are physically at location B. 00:41:10.190 --> 00:41:12.650 Beyond that, there are other technologies 00:41:12.650 --> 00:41:14.450 that you can use to encrypt communications. 00:41:14.450 --> 00:41:19.010 And this one's a little more technical and used by programmers and system 00:41:19.010 --> 00:41:20.330 administrators alike. 00:41:20.330 --> 00:41:23.690 There's a protocol called SSH, for Secure Shell. 00:41:23.690 --> 00:41:26.570 And this is a technology via which you don't necessarily 00:41:26.570 --> 00:41:29.210 encrypt all of your traffic between point A and B, 00:41:29.210 --> 00:41:32.630 although you can use SSH to create the equivalent 00:41:32.630 --> 00:41:35.600 of a virtual private network, or VPN, but SSH 00:41:35.600 --> 00:41:39.980 is all about connecting to a remote server and executing commands on it. 00:41:39.980 --> 00:41:42.920 So not executing commands on your own machine ultimately, 00:41:42.920 --> 00:41:45.290 but on some other machine that's maybe inside 00:41:45.290 --> 00:41:48.540 of your company, your university, or somewhere else in the world. 
00:41:48.540 --> 00:41:51.050 So, for instance, if curious as to how this works, 00:41:51.050 --> 00:41:55.640 you might recall that in a previous class I wrote some code on my computer. 00:41:55.640 --> 00:41:58.280 And then I opened up a terminal window that started 00:41:58.280 --> 00:41:59.790 with this prompt, a dollar sign. 00:41:59.790 --> 00:42:01.020 It doesn't mean currency. 00:42:01.020 --> 00:42:04.950 It's just a tradition that the dollar sign means type your commands here. 00:42:04.950 --> 00:42:08.490 But at the time, I was typing them on my laptop here on a local server, 00:42:08.490 --> 00:42:09.470 if you will. 00:42:09.470 --> 00:42:13.730 If I, though, want to use a computer to remotely connect to another 00:42:13.730 --> 00:42:16.550 and then remotely run a command, it might work like this. 00:42:16.550 --> 00:42:18.920 Here I am, let's pretend on my own computer, 00:42:18.920 --> 00:42:22.370 and suppose I type out one command like the date command. 00:42:22.370 --> 00:42:25.100 Not surprisingly, this will tell me what the current date is. 00:42:25.100 --> 00:42:28.430 So suppose that where I am, on my own computer, it is 00:42:28.430 --> 00:42:35.400 Thursday, January 1, at midnight Eastern time, in the year 1970, for instance. 00:42:35.400 --> 00:42:39.800 If, though, the next command I run is not date again, but I use SSH, 00:42:39.800 --> 00:42:43.310 and I connect, for instance, to, oh, how about our friends 00:42:43.310 --> 00:42:44.900 at Stanford University. 00:42:44.900 --> 00:42:48.830 So I'm going to SSH into stanford.edu's server. 00:42:48.830 --> 00:42:53.790 As soon as I'm at that server now, I get another prompt, in this case. 
00:42:53.790 --> 00:42:56.660 But if I now run the date command, what you'll see 00:42:56.660 --> 00:43:00.748 is that the date is now apparently slightly in the past, at least 00:43:00.748 --> 00:43:03.290 if I type this quick enough so that the seconds weren't off-- 00:43:03.290 --> 00:43:09.290 Wednesday, December 31, 9:00 PM Pacific time, in the year, still 1969. 00:43:09.290 --> 00:43:13.550 So this is to say SSH is actually a very common technology used 00:43:13.550 --> 00:43:16.730 in the world of software engineering, system administration, when you want 00:43:16.730 --> 00:43:18.530 to control one server from another. 00:43:18.530 --> 00:43:23.180 And what's powerful about it is that everything I just typed after that SSH 00:43:23.180 --> 00:43:27.320 command, even as innocuous as the date command is, on Stanford's server 00:43:27.320 --> 00:43:28.520 would be encrypted. 00:43:28.520 --> 00:43:31.520 So no one between these points A and B would 00:43:31.520 --> 00:43:36.758 be able to know what I'm controlling or what commands I have typed. 00:43:36.758 --> 00:43:39.050 All right, let's go ahead and take a five-minute break. 00:43:39.050 --> 00:43:42.710 And when we resume, we'll look at some other building blocks of systems 00:43:42.710 --> 00:43:46.640 that both solve problems, but also create vulnerabilities for us as well. 00:43:46.640 --> 00:43:48.110 Back in a few. 00:43:48.110 --> 00:43:50.660 All right, let's talk about now what's actually 00:43:50.660 --> 00:43:53.570 been on the outside of these virtual envelopes that's 00:43:53.570 --> 00:43:57.750 helping these envelopes get from their source to their destination. 
00:43:57.750 --> 00:44:00.110 So it turns out that on the outside of these envelopes, 00:44:00.110 --> 00:44:02.480 minimally is what we'll call a port number, which 00:44:02.480 --> 00:44:05.570 is literally just a unique number that the world has decided 00:44:05.570 --> 00:44:08.345 on that uniquely represents the type of service 00:44:08.345 --> 00:44:10.890 that that envelope is destined for. 00:44:10.890 --> 00:44:13.490 So, for instance, if you at your browser were 00:44:13.490 --> 00:44:19.750 going to pull up a website like http://www.example.com, 00:44:19.750 --> 00:44:22.520 on the outside of that envelope would not only 00:44:22.520 --> 00:44:27.260 be some mention of www.example.com, but also a so-called port 00:44:27.260 --> 00:44:31.400 number, namely 80 by convention, which means that this envelope should 00:44:31.400 --> 00:44:35.120 be opened not by the other server's email server or chat server, 00:44:35.120 --> 00:44:37.430 but by its web server specifically. 00:44:37.430 --> 00:44:41.300 Now, if the web server were to respond to us by saying, 00:44:41.300 --> 00:44:45.830 uh-uh, we want you to use HTTPS instead, essentially 00:44:45.830 --> 00:44:49.190 redirecting the browser to a secure version of the website, 00:44:49.190 --> 00:44:51.530 my browser would then have to send a second request 00:44:51.530 --> 00:44:56.010 to the server, this time still mentioning www.example.com, 00:44:56.010 --> 00:44:59.210 but on the outside of that envelope, among other details, 00:44:59.210 --> 00:45:02.743 would be a different port number, namely 443. 00:45:02.743 --> 00:45:05.160 Now these aren't the kinds of things you have to memorize, 00:45:05.160 --> 00:45:07.368 but the computers certainly know what they represent. 
00:45:07.368 --> 00:45:15.650 And, in fact, common numbers include 80 for HTTP, 443 for HTTPS, 22 for SSH, 00:45:15.650 --> 00:45:18.140 and hundreds, if not thousands, of other numbers 00:45:18.140 --> 00:45:21.410 as well, that humans decided on, but the computers actually 00:45:21.410 --> 00:45:24.950 rely on to know what piece of software on a computer 00:45:24.950 --> 00:45:28.520 should actually expect and open up these virtual envelopes, 00:45:28.520 --> 00:45:30.560 these things we've called packets. 00:45:30.560 --> 00:45:33.660 So what's the problem or danger here? 00:45:33.660 --> 00:45:38.300 Well, it turns out that your computer can be listening for internet traffic 00:45:38.300 --> 00:45:41.540 on none of these ports, in which case it's completely 00:45:41.540 --> 00:45:43.820 disconnected from inbound connections. 00:45:43.820 --> 00:45:47.870 But very often, computers, especially servers, are listening, so to speak, 00:45:47.870 --> 00:45:52.340 for envelopes destined for maybe port 22, maybe port 80, maybe 00:45:52.340 --> 00:45:54.208 port 443, maybe others as well. 00:45:54.208 --> 00:45:56.000 So you might think, well, that's not great, 00:45:56.000 --> 00:45:58.070 because if these numbers are standardized, then 00:45:58.070 --> 00:46:03.410 adversaries could maybe try to access my server via those port numbers. 00:46:03.410 --> 00:46:05.420 Because they too know what they are. 00:46:05.420 --> 00:46:09.380 So you might think, all right, well, let me run my web server on a number 00:46:09.380 --> 00:46:12.290 other than 80, or a number other than 443, 00:46:12.290 --> 00:46:16.310 and just choose a random number between 0 and 65,000 or so. 00:46:16.310 --> 00:46:19.400 Because the odds that the adversary is going to guess that are much lower. 00:46:19.400 --> 00:46:21.410 SSH, you might consider doing the same. 
00:46:21.410 --> 00:46:24.050 But unfortunately, it's all too easy for adversaries 00:46:24.050 --> 00:46:27.230 to wage this kind of attack, known as port scanning. 00:46:27.230 --> 00:46:30.560 And recall that we did our own brute force 00:46:30.560 --> 00:46:33.800 attack in classes past on our own passwords, 00:46:33.800 --> 00:46:36.950 for instance, trying to figure out all possible passwords 00:46:36.950 --> 00:46:38.600 that might be locking a phone. 00:46:38.600 --> 00:46:40.820 And it's not that hard to use code very much 00:46:40.820 --> 00:46:45.740 like that with a loop of some sort that just tries every possible port 00:46:45.740 --> 00:46:49.130 number between some range, roughly 0 to 65,000. 00:46:49.130 --> 00:46:51.710 So port scanning refers to the equivalent 00:46:51.710 --> 00:46:58.100 of accessing-- knocking on the door of every possible port number on a server. 00:46:58.100 --> 00:47:01.040 Now most of those doors might be closed and no one might be home. 00:47:01.040 --> 00:47:06.090 That is they might not be expecting any traffic or visitors on those numbers. 00:47:06.090 --> 00:47:09.410 But by writing software that tries all of those port numbers, 00:47:09.410 --> 00:47:13.580 you can essentially discover services that are running on certain computers. 00:47:13.580 --> 00:47:16.100 Now, hopefully, that in and of itself is not a problem. 00:47:16.100 --> 00:47:18.710 Because most likely, the purpose of these services 00:47:18.710 --> 00:47:20.720 is to be on the internet. 00:47:20.720 --> 00:47:24.080 But hopefully, these services too are using encryption. 00:47:24.080 --> 00:47:26.000 Hopefully, they're using authentication. 00:47:26.000 --> 00:47:27.810 But that's not always the case. 
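The loop just described, knocking on each port in turn, can be sketched in a few lines of Python. This is only a minimal sketch, not the code from an earlier class and not a real scanning tool: the function name `scan_ports`, the short timeout, and the choice of host and ports are all assumptions made for illustration. It only attempts ordinary TCP connections, and it should only ever be pointed at machines you are authorized to test.

```python
import socket


def scan_ports(host, ports, timeout=0.2):
    """Return the ports on `host` that accept a TCP connection.

    `scan_ports` is a hypothetical helper for illustration only.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception,
            # so a closed "door" simply yields a nonzero error code.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Knock on a few well-known doors on our own machine only.
    print(scan_ports("127.0.0.1", [22, 80, 443]))
```

Passing `range(0, 65536)` instead of a short list would try every possible port number, which is why moving a service to a non-standard port does little to hide it.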
00:47:27.810 --> 00:47:31.190 So security through obscurity, so to speak, 00:47:31.190 --> 00:47:35.930 running services on random or arbitrary, non-standard port numbers 00:47:35.930 --> 00:47:39.680 is not really a good practice unless you're additionally defending 00:47:39.680 --> 00:47:41.720 against all of these common attacks. 00:47:41.720 --> 00:47:44.900 Now in the world of port scanning, though, it's 00:47:44.900 --> 00:47:48.950 a specific example of what we might call penetration testing more generally. 00:47:48.950 --> 00:47:53.600 Penetration testing, or pen testing, is actually a skill, a technique, 00:47:53.600 --> 00:47:57.410 a job even, whereby you are, hopefully, not an adversary, 00:47:57.410 --> 00:48:00.380 but hopefully, a well paid consultant whose purpose 00:48:00.380 --> 00:48:05.120 in life, or whose vocation in life, is to actually try to penetrate someone's 00:48:05.120 --> 00:48:06.890 network or penetrate someone's system. 00:48:06.890 --> 00:48:08.100 And what do I mean by that? 00:48:08.100 --> 00:48:10.850 Well maybe, quite simply, you try scanning all of the ports 00:48:10.850 --> 00:48:14.600 on their servers just to see if there are ports that are open, 00:48:14.600 --> 00:48:16.490 that is listening that shouldn't be. 00:48:16.490 --> 00:48:19.760 Because no sense in opening a door if no one's meant to go through it. 00:48:19.760 --> 00:48:24.360 Or you might try to penetrate their network or system in some other way. 00:48:24.360 --> 00:48:27.600 Maybe you might try to brute force your way through passwords. 00:48:27.600 --> 00:48:30.500 Maybe you might try to socially engineer the employees 00:48:30.500 --> 00:48:31.800 of that company or the like. 00:48:31.800 --> 00:48:38.430 So penetration testing is all about someone who, for good purposes, 00:48:38.430 --> 00:48:43.120 is trying to find possible weaknesses in your infrastructure. 
00:48:43.120 --> 00:48:46.470 So that, hopefully, you can pay them and thank them, but then 00:48:46.470 --> 00:48:50.310 fix those problems before actual adversaries try to exploit it 00:48:50.310 --> 00:48:52.450 for malicious purposes instead. 00:48:52.450 --> 00:48:55.980 So this might also be referred to as ethical hacking, where 00:48:55.980 --> 00:49:00.450 you get all of the cachet of being a hacker and really good with computers, 00:49:00.450 --> 00:49:02.940 but the upside of doing it ethically, which 00:49:02.940 --> 00:49:06.600 is to say that if you are in the business of trying to find faults 00:49:06.600 --> 00:49:12.960 with systems, you don't have to do it for illegal financial gain, 00:49:12.960 --> 00:49:17.280 but for very much legal financial gain instead, as in someone 00:49:17.280 --> 00:49:20.790 will pay you to find faults in their system so long as you tell them 00:49:20.790 --> 00:49:24.610 and only them first so that they can actually fix the same. 00:49:24.610 --> 00:49:27.390 And, in fact in this world there's a gamification of it 00:49:27.390 --> 00:49:29.400 of sorts in certain companies, where you might 00:49:29.400 --> 00:49:32.163 have a red team whose purpose in life in this game 00:49:32.163 --> 00:49:34.830 is to actually try to penetrate the network or find some faults. 00:49:34.830 --> 00:49:38.140 And then the blue team, so to speak, whose purpose in life in this story 00:49:38.140 --> 00:49:40.800 is to defend the systems against those attacks. 00:49:40.800 --> 00:49:44.190 And so that too has often yielded better results for some folks, 00:49:44.190 --> 00:49:48.240 given that it helps them find weaknesses before adversaries 00:49:48.240 --> 00:49:50.160 who don't work for them do. 00:49:50.160 --> 00:49:52.950 So how might you keep these attacks out? 
00:49:52.950 --> 00:49:56.940 And how might you keep even penetration testing out in a good way 00:49:56.940 --> 00:50:00.057 to demonstrate that, you know what, we are actually pretty secure? 00:50:00.057 --> 00:50:01.890 Well, there's this technology with which you 00:50:01.890 --> 00:50:03.960 might be generally familiar by name, namely 00:50:03.960 --> 00:50:07.050 a firewall, which actually comes from the real world. 00:50:07.050 --> 00:50:09.750 Typically, in buildings that have multiple stores, 00:50:09.750 --> 00:50:14.040 there might literally be a firewall between two of the stores 00:50:14.040 --> 00:50:16.890 so that if there's a fire in one store, it doesn't somehow 00:50:16.890 --> 00:50:19.320 propagate next door to the other store. 00:50:19.320 --> 00:50:22.080 Now in the virtual world, a firewall is essentially 00:50:22.080 --> 00:50:27.630 a piece of software between you and the outside world or between you 00:50:27.630 --> 00:50:30.840 and some other network that keeps data in the network 00:50:30.840 --> 00:50:34.440 that you don't want to leave it and it keeps out from the network data 00:50:34.440 --> 00:50:36.310 that you don't want coming in. 00:50:36.310 --> 00:50:40.200 So, for instance, if you might within your company, or university, or home 00:50:40.200 --> 00:50:44.190 have some sort of local chat service, or intercom system, or the like, 00:50:44.190 --> 00:50:47.740 none of that traffic ideally should end up on the public internet. 00:50:47.740 --> 00:50:52.920 So you might want your firewall to block any intentional or accidental attempts 00:50:52.920 --> 00:50:55.500 to transmit that data outside the network. 
00:50:55.500 --> 00:50:58.410 Conversely, if you have a private network that you only 00:50:58.410 --> 00:51:01.740 use for home computing, and watching streaming media, and the like, 00:51:01.740 --> 00:51:04.380 you are not yourself a server, and you certainly don't 00:51:04.380 --> 00:51:07.500 want random people trying to connect to your laptops, or desktops, 00:51:07.500 --> 00:51:10.530 or servers in your home, you might want your firewall 00:51:10.530 --> 00:51:13.320 to keep all internet traffic out. 00:51:13.320 --> 00:51:15.830 Now with that said, there are some problems 00:51:15.830 --> 00:51:17.580 when you want to use services where you do 00:51:17.580 --> 00:51:20.320 need to talk to someone on the outside world, 00:51:20.320 --> 00:51:22.590 maybe like a Zoom call or the like. 00:51:22.590 --> 00:51:25.090 But there are technologies that help mitigate this. 00:51:25.090 --> 00:51:27.570 So these firewalls are not necessarily absolute. 00:51:27.570 --> 00:51:30.460 You can open them up or poke holes in them, 00:51:30.460 --> 00:51:33.390 so to speak, to allow certain services through. 00:51:33.390 --> 00:51:35.580 So how might these firewalls actually work? 00:51:35.580 --> 00:51:40.530 Well, they might simply block traffic, that is internet packets, based 00:51:40.530 --> 00:51:41.490 on IP address. 00:51:41.490 --> 00:51:43.980 Because recall on the outside of those virtual envelopes 00:51:43.980 --> 00:51:47.070 is not just port numbers, but also-- and not quite 00:51:47.070 --> 00:51:51.270 the domain name, like www.example.com, on the outside of those envelopes 00:51:51.270 --> 00:51:56.010 is actually the unique address of a server to which you're sending a packet 00:51:56.010 --> 00:52:01.710 and the unique address of a client that is expecting some response thereto. 00:52:01.710 --> 00:52:05.730 So an IP address is just a numeric unique address for, let's say, 00:52:05.730 --> 00:52:07.530 every computer on the internet. 
00:52:07.530 --> 00:52:10.950 It's a bit of a simplification, but it's very similar to the postal addresses 00:52:10.950 --> 00:52:14.850 that you and I use to send mail old-school style or postcards 00:52:14.850 --> 00:52:16.990 throughout the postal system as well. 00:52:16.990 --> 00:52:18.348 So you could, quite simply-- 00:52:18.348 --> 00:52:20.640 if you, as a parent, for instance, don't want your kids 00:52:20.640 --> 00:52:23.220 accessing social media within the home, you 00:52:23.220 --> 00:52:25.470 could just configure your home's firewall 00:52:25.470 --> 00:52:30.690 to prevent access to the IP addresses of known social media sites. 00:52:30.690 --> 00:52:35.640 And so if the kids are trying to use the laptops or desktops in the home 00:52:35.640 --> 00:52:38.850 to connect to those IP addresses, it would effectively be blocked 00:52:38.850 --> 00:52:42.720 and not allowed through, so long as the software in question, the firewall, 00:52:42.720 --> 00:52:45.390 knows or can figure out what those IP addresses are. 00:52:45.390 --> 00:52:47.160 Now that said, it's not fail-proof. 00:52:47.160 --> 00:52:50.160 All it takes is for someone in the home to have some out-of-band device, 00:52:50.160 --> 00:52:52.710 like a cellphone, that uses the mobile phone network. 00:52:52.710 --> 00:52:54.870 And then, of course, you circumvent the firewall 00:52:54.870 --> 00:52:57.660 that might be based only on your home network. 00:52:57.660 --> 00:53:01.500 So you have to keep in mind exactly what it is you're firewalling 00:53:01.500 --> 00:53:03.000 and which networks they're in. 00:53:03.000 --> 00:53:05.010 Now you might not want to block access to sites 00:53:05.010 --> 00:53:08.440 based solely on their IP address, but perhaps based on those port numbers. 
00:53:08.440 --> 00:53:12.690 So, for instance, if you wanted to block all internet traffic in or out 00:53:12.690 --> 00:53:14.910 of a network, you could just use your firewall 00:53:14.910 --> 00:53:16.680 to block all of those port numbers. 00:53:16.680 --> 00:53:18.960 But if, wait a minute, you realize that you still 00:53:18.960 --> 00:53:22.920 want to be able to remotely control a computer or server in that network, 00:53:22.920 --> 00:53:26.700 you could open up just one port number, for instance, 22, 00:53:26.700 --> 00:53:29.730 if you want to allow SSH back and forth, or whatever 00:53:29.730 --> 00:53:32.710 port number your preferred VPN software uses instead. 00:53:32.710 --> 00:53:36.780 So you can use firewalls to block traffic based on IP address, 00:53:36.780 --> 00:53:40.710 based on port number, or even, more sophisticatedly, 00:53:40.710 --> 00:53:43.300 via deep packet inspection. 00:53:43.300 --> 00:53:48.000 Which is a big way of saying that even the most sophisticated of firewalls 00:53:48.000 --> 00:53:52.290 can actually open up, theoretically, these virtual envelopes 00:53:52.290 --> 00:53:54.820 and see what's actually inside them. 00:53:54.820 --> 00:53:59.670 And this way, you can even more reliably block certain sites 00:53:59.670 --> 00:54:02.970 by their domain name-- not just their IP address, but by their name. 00:54:02.970 --> 00:54:05.550 You could, for instance, via deep packet inspection 00:54:05.550 --> 00:54:08.627 keep an eye out as to who is emailing whom. 
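The two simpler kinds of filtering mentioned here, by IP address and by port number, can be sketched as a single yes-or-no decision per packet. This is a minimal sketch under stated assumptions, not how any particular firewall product works: the function name `allow`, the blocked address range (a range reserved for documentation), and the policy of permitting only port 22 are all hypothetical choices for illustration. It looks only at the outside of the envelope, so it is not deep packet inspection.

```python
from ipaddress import ip_address, ip_network

# Hypothetical policy: block one documentation-only address range,
# and otherwise let through only SSH (port 22).
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]
ALLOWED_PORTS = {22}


def allow(dst_ip, dst_port):
    """Return True if a packet to dst_ip:dst_port should pass, else False."""
    destination = ip_address(dst_ip)
    # Rule 1: drop anything addressed to a blocked IP range.
    if any(destination in network for network in BLOCKED_NETWORKS):
        return False
    # Rule 2: every port is closed unless a hole has been poked for it.
    return dst_port in ALLOWED_PORTS


if __name__ == "__main__":
    print(allow("198.51.100.5", 22))   # SSH to an unblocked address
    print(allow("198.51.100.5", 80))   # HTTP, no hole for port 80
    print(allow("203.0.113.7", 22))    # SSH, but to a blocked address
```

Checking the blocked addresses before the allowed ports is a design choice in this sketch: it means an open port never overrides an address that has been blocked outright.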
00:54:08.627 --> 00:54:10.710 For instance, corporations that are very concerned 00:54:10.710 --> 00:54:15.360 about their intellectual property ideas and products that they have internally, 00:54:15.360 --> 00:54:17.700 they might use deep packet inspection to make sure 00:54:17.700 --> 00:54:21.720 that you and I are not emailing the press about some new product 00:54:21.720 --> 00:54:23.760 under development, or emailing the competition, 00:54:23.760 --> 00:54:26.160 or anyone in the outside world about some product. 00:54:26.160 --> 00:54:28.740 Because via deep packet inspection, you can pretty much 00:54:28.740 --> 00:54:31.050 look at everything inside of this envelope, 00:54:31.050 --> 00:54:34.290 be it the sender, the receiver, the contents of the message. 00:54:34.290 --> 00:54:38.110 But you can also use this too not just for confidentiality and the like, 00:54:38.110 --> 00:54:42.810 but also, for instance, to check for malware and malicious software. 00:54:42.810 --> 00:54:46.710 That is to say, maybe attachments that you don't want to allow through. 00:54:46.710 --> 00:54:49.980 And we'll consider exactly what those threats might be in just a moment 00:54:49.980 --> 00:54:50.710 as well. 00:54:50.710 --> 00:54:54.000 Now how might a company, a university, or even a home 00:54:54.000 --> 00:54:57.030 implement this kind of firewalling, or more deeply, 00:54:57.030 --> 00:54:58.710 this deep packet inspection? 00:54:58.710 --> 00:55:02.730 Well we're essentially describing a technology that you would call a proxy. 00:55:02.730 --> 00:55:05.388 And a proxy is very often a server. 00:55:05.388 --> 00:55:07.680 Though it doesn't have to be an actual physical server. 00:55:07.680 --> 00:55:10.200 It can be a piece of software running in your network. 00:55:10.200 --> 00:55:14.910 A proxy is a device that essentially implements a potential machine 00:55:14.910 --> 00:55:16.590 in the middle attack. 
00:55:16.590 --> 00:55:19.410 But it's not quite an attack in this way, it's by design. 00:55:19.410 --> 00:55:23.040 A proxy is a device, a server, or a piece of software 00:55:23.040 --> 00:55:25.720 that sits between two other points. 00:55:25.720 --> 00:55:28.890 So in this case, Alice might be someone on the inside of a network. 00:55:28.890 --> 00:55:31.200 Bob might be someone on the outside of a network. 00:55:31.200 --> 00:55:34.530 And Eve, in this case, is eavesdropping, literally, 00:55:34.530 --> 00:55:37.590 because her role in this story is to be that of a proxy. 00:55:37.590 --> 00:55:40.500 And the proxy takes data from one side, and ideally, maybe 00:55:40.500 --> 00:55:41.610 hands it out to the other. 00:55:41.610 --> 00:55:45.210 But this proxy might, indeed, be eavesdropping, might be a little nosey. 00:55:45.210 --> 00:55:48.990 And so as a packet comes in this way from Alice, might look in the packet 00:55:48.990 --> 00:55:51.660 and decide, uh-uh, we're not going to send this to Bob. 00:55:51.660 --> 00:55:55.000 And then it might just be dropped or effectively deleted. 00:55:55.000 --> 00:55:58.200 So companies, universities, even home networks 00:55:58.200 --> 00:56:02.130 can use proxies to decide yes or no whether 00:56:02.130 --> 00:56:04.270 or not to allow certain traffic through. 00:56:04.270 --> 00:56:07.650 So in that sense, they're very similar to a firewall. 00:56:07.650 --> 00:56:11.220 But sometimes, proxies are something that have to be configured, 00:56:11.220 --> 00:56:12.660 even for your own devices. 00:56:12.660 --> 00:56:15.810 Because there might be only one path out of a company network, 00:56:15.810 --> 00:56:17.880 only one path out of a university. 00:56:17.880 --> 00:56:22.270 But the catch here is that, literally, all of that traffic by design 00:56:22.270 --> 00:56:24.690 now is going through this middle point. 
00:56:24.690 --> 00:56:27.280 And here's where things can get troubling, 00:56:27.280 --> 00:56:31.710 especially if you do attend a university or you do work for a company 00:56:31.710 --> 00:56:37.180 where they maybe issued your laptop, or desktop, or phone. 00:56:37.180 --> 00:56:39.870 So if your company or your school has given you 00:56:39.870 --> 00:56:42.900 a laptop, or desktop, or phone, that might be nice 00:56:42.900 --> 00:56:45.190 that this is one of the perks, to have this hardware. 00:56:45.190 --> 00:56:47.760 But if they have preconfigured it with software, 00:56:47.760 --> 00:56:50.850 realize that there are implications of that. 00:56:50.850 --> 00:56:54.360 You might have your own username and password on that laptop, or desktop, 00:56:54.360 --> 00:56:58.950 or phone, but they might have installed administratively, with root access, 00:56:58.950 --> 00:57:02.980 so to speak, some kind of software that could actually be monitoring everything 00:57:02.980 --> 00:57:04.230 you're doing on that computer. 00:57:04.230 --> 00:57:06.780 And this is not uncommon in the corporate workplace. 00:57:06.780 --> 00:57:09.060 But they can also do things more technically. 00:57:09.060 --> 00:57:13.500 Like they could install their own certificate authority 00:57:13.500 --> 00:57:17.400 on your laptop, or desktop, or phone, essentially 00:57:17.400 --> 00:57:20.640 adding to the list of certificate authorities 00:57:20.640 --> 00:57:23.310 that Microsoft, and Google, and Apple, and Mozilla 00:57:23.310 --> 00:57:25.860 have baked into their own browsers. 
00:57:25.860 --> 00:57:28.620 The implication of this is that even if you 00:57:28.620 --> 00:57:33.660 think that you, as Alice, can securely communicate with Bob, 00:57:33.660 --> 00:57:36.630 and maybe Bob in this story is gmail.com, 00:57:36.630 --> 00:57:42.450 or amazon.com, or facebook.com, even if you think that Alice and Bob can-- 00:57:42.450 --> 00:57:44.820 you, Alice, can communicate securely with Bob 00:57:44.820 --> 00:57:51.630 and establish a connection between https://gmail.com, 00:57:51.630 --> 00:57:56.010 if your browser, that is your phone, or laptop, or desktop 00:57:56.010 --> 00:58:00.300 has a certificate installed by your company or university 00:58:00.300 --> 00:58:03.240 that's acting as a CA, a certificate authority, 00:58:03.240 --> 00:58:06.000 essentially your computer could be tricked 00:58:06.000 --> 00:58:10.200 into thinking that you're connecting to the real gmail.com, 00:58:10.200 --> 00:58:13.740 but you're actually connecting to the company's proxy server. 00:58:13.740 --> 00:58:18.210 But because you have this additional certificate on your computer, 00:58:18.210 --> 00:58:20.970 even though the company is masquerading-- 00:58:20.970 --> 00:58:24.510 pretending to be gmail.com, you're actually 00:58:24.510 --> 00:58:25.920 connected to their proxy server. 00:58:25.920 --> 00:58:28.670 And they might actually be forwarding your traffic somewhere else, 00:58:28.670 --> 00:58:33.000 but this is a definition of a machine in the middle attack, 00:58:33.000 --> 00:58:37.410 but it's facilitated by someone else having used our technologies that we 00:58:37.410 --> 00:58:41.640 talked about earlier to trick, really, your local device into thinking 00:58:41.640 --> 00:58:45.180 that this is the real gmail.com, whereas the math might actually 00:58:45.180 --> 00:58:48.840 be based on the company's certificate not on Gmail's own. 00:58:48.840 --> 00:58:50.070 So be mindful of that. 
00:58:50.070 --> 00:58:53.070 Again, whenever you're using a device that has left your control 00:58:53.070 --> 00:58:54.930 or has not always been under your control, 00:58:54.930 --> 00:58:59.320 you do not necessarily know what is installed on it. 00:58:59.320 --> 00:59:02.790 So how else might a company, a university 00:59:02.790 --> 00:59:07.348 be observing or keeping an eye out, either for good or evil purposes? 00:59:07.348 --> 00:59:09.390 Well, they actually might also be doing something 00:59:09.390 --> 00:59:11.430 that we might call URL rewriting. 00:59:11.430 --> 00:59:14.700 So if you have a company or a school email address, 00:59:14.700 --> 00:59:17.340 via which you receive mails from the outside world, 00:59:17.340 --> 00:59:22.650 you might notice that whenever an email contains a link, if you hover over 00:59:22.650 --> 00:59:26.200 the link and look in the bottom corner of your browser, 00:59:26.200 --> 00:59:30.270 it might actually not go to the actual link destination 00:59:30.270 --> 00:59:31.470 that you think it does. 00:59:31.470 --> 00:59:34.005 It might actually go to a URL like this-- maybe 00:59:34.005 --> 00:59:40.860 https://example.com?url= something. 
00:59:40.860 --> 00:59:44.940 And the implication of this here is that what companies and schools might 00:59:44.940 --> 00:59:49.890 do to combat malware, maybe malicious software that could be accidentally 00:59:49.890 --> 00:59:53.340 installed or sent your way via URLs, or to prevent you 00:59:53.340 --> 00:59:56.520 from accessing phishing websites that are trying to steal your, 00:59:56.520 --> 00:59:58.380 or the company, or the school's information, 00:59:58.380 --> 01:00:01.500 they might automatically, through some kind of proxy server, 01:00:01.500 --> 01:00:06.360 acting in this case on email, change every URL in an email 01:00:06.360 --> 01:00:11.700 you receive to be example.com and then embed at the end of that example.com 01:00:11.700 --> 01:00:16.440 URL the actual URL, so that you can still reach your destination. 01:00:16.440 --> 01:00:20.070 But the implication of this is that if you click on this link here, 01:00:20.070 --> 01:00:22.530 you're first going to visit example.com. 01:00:22.530 --> 01:00:27.330 That's going to include some data from that proxy server that has added 01:00:27.330 --> 01:00:31.080 your url= something, where something might be gmail.com, amazon.com, 01:00:31.080 --> 01:00:33.030 whatever the actual URL is. 01:00:33.030 --> 01:00:37.440 But because the company or the school in this story controls example.com, 01:00:37.440 --> 01:00:40.950 they know what URL you're trying to visit. 01:00:40.950 --> 01:00:43.680 Now, one, they could minimally just log that information 01:00:43.680 --> 01:00:46.590 and know that, oh, David seems to be visiting Gmail again. 01:00:46.590 --> 01:00:50.910 Or they could actually check in their database, is gmail.com, 01:00:50.910 --> 01:00:52.980 or whatever this URL is, known to be malicious? 01:00:52.980 --> 01:00:54.522 Is it known to be a phishing website? 01:00:54.522 --> 01:00:57.790 And they can just prevent you from accessing it altogether. 
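The URL rewriting described above has two halves: wrapping each link when the email arrives, and unwrapping it (then logging, allowing, or blocking it) when the link is clicked. Both can be sketched in Python. This is a minimal sketch under stated assumptions, not any real mail gateway's implementation: the names `rewrite` and `resolve`, the blocklist entry `phishing.example`, and the exact form of the rewritten URL (with a slash before the `?`) are hypothetical, and example.com simply stands in for the organization's own domain as it does in the lecture.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Stand-in for the company's or school's own domain.
PROXY = "https://example.com/"

# Hypothetical database of hosts known to be malicious.
BLOCKED_HOSTS = {"phishing.example"}


def rewrite(url):
    """Wrap the original link so that a click visits the proxy first."""
    # urlencode percent-encodes the original URL so it survives as one value.
    return PROXY + "?" + urlencode({"url": url})


def resolve(rewritten):
    """What the proxy might do on a click: recover the link, then decide."""
    original = parse_qs(urlparse(rewritten).query)["url"][0]
    # At this point the proxy could also log `original` against the user.
    if urlparse(original).hostname in BLOCKED_HOSTS:
        return None  # refuse to send the user onward
    return original


if __name__ == "__main__":
    wrapped = rewrite("https://gmail.com/")
    print(wrapped)
    print(resolve(wrapped))
    print(resolve(rewrite("https://phishing.example/login")))
```

Note that `resolve` necessarily sees the full original URL before deciding anything, which is exactly the privacy implication raised here: whoever runs the proxy learns every link that gets clicked.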
01:00:57.790 --> 01:01:01.290 So this is another form of proxying that's actually very explicit. 01:01:01.290 --> 01:01:04.530 Because you're embedding the machine in the middle, 01:01:04.530 --> 01:01:08.070 quite literally, as example.com, or whatever your university 01:01:08.070 --> 01:01:10.630 or your company's domain name is. 01:01:10.630 --> 01:01:15.720 But it's all toward an end of trying to help protect you from potentially 01:01:15.720 --> 01:01:18.210 malicious phishing type websites. 01:01:18.210 --> 01:01:21.540 But the implication too is that now the company, the school, 01:01:21.540 --> 01:01:26.830 the machine in the middle knows every link you're clicking on as well. 01:01:26.830 --> 01:01:28.950 Let me pause here and see if there are now 01:01:28.950 --> 01:01:32.670 any questions about these techniques of proxying. 01:01:32.670 --> 01:01:36.420 AUDIENCE: I just want to know what's the difference between VPN 01:01:36.420 --> 01:01:41.040 and also Tor network in a higher level? 01:01:41.040 --> 01:01:43.470 Just the difference, because what I do know 01:01:43.470 --> 01:01:48.270 is that these two networks are used to encrypt or anonymize 01:01:48.270 --> 01:01:52.020 our activity or the data that we are sending. 01:01:52.020 --> 01:01:55.650 The second question is that let's say that I'm using a network 01:01:55.650 --> 01:02:01.860 at a university or a company, and they also have their own CA, 01:02:01.860 --> 01:02:04.290 which means that they've got their own databases, 01:02:04.290 --> 01:02:10.170 like which websites I can get in and which websites I cannot get into. 01:02:10.170 --> 01:02:17.430 If I'm using VPN, does that mean that the university or the company 01:02:17.430 --> 01:02:19.442 know what I'm doing? 01:02:19.442 --> 01:02:20.650 DAVID MALAN: A good question. 01:02:20.650 --> 01:02:23.880 So Tor is the anonymization software I was alluding to a moment 01:02:23.880 --> 01:02:25.042 ago in my previous answer. 
01:02:25.042 --> 01:02:27.000 And we'll talk about that in our final lecture. 01:02:27.000 --> 01:02:29.650 Tor really is about privacy preserving. 01:02:29.650 --> 01:02:32.040 So it's about covering your tracks so that, 01:02:32.040 --> 01:02:34.530 kind of like the Hollywood movies, when you're here, 01:02:34.530 --> 01:02:38.130 your data is bouncing across all of these different servers in the world, 01:02:38.130 --> 01:02:42.240 and it's difficult to trace it back to its origins, by design. 01:02:42.240 --> 01:02:44.070 So more on that in a couple of weeks' time. 01:02:44.070 --> 01:02:47.925 VPN is an encrypted connection just between one point, A, 01:02:47.925 --> 01:02:52.380 and another point, B. So even though they can't see what you are doing, 01:02:52.380 --> 01:02:55.420 it is very obvious where your data is coming from thereafter. 01:02:55.420 --> 01:02:58.110 So legally, for instance, if the VPN server or company 01:02:58.110 --> 01:03:01.350 were to be subpoenaed, they might have to disclose information about you. 01:03:01.350 --> 01:03:03.660 And so it's not quite as privacy preserving. 01:03:03.660 --> 01:03:07.890 It secures your data, but it doesn't preserve your privacy in the same way. 01:03:07.890 --> 01:03:14.010 On your other question about what you're doing, especially when you're on a VPN, 01:03:14.010 --> 01:03:17.670 if you-- if someone has installed, with administrative privileges, 01:03:17.670 --> 01:03:19.900 software on your computer, all bets are off. 01:03:19.900 --> 01:03:21.910 You should not, cannot trust the device. 01:03:21.910 --> 01:03:25.830 So if they've installed their own certificate on your computer or a CA 01:03:25.830 --> 01:03:28.560 to the list of trusted things, any time you 01:03:28.560 --> 01:03:33.030 use that browser, for instance, you are vulnerable to a machine 01:03:33.030 --> 01:03:39.560 in the middle attack or at least a proxying-type implication. 
01:03:39.560 --> 01:03:41.708 Beyond that-- but honestly, at that point, 01:03:41.708 --> 01:03:44.000 if they've installed special software on your computer, 01:03:44.000 --> 01:03:46.910 they could be monitoring everything you do on the internet anyway. 01:03:46.910 --> 01:03:49.260 So all bets are off. 01:03:49.260 --> 01:03:52.730 So what is it we're trying to keep, ultimately, out of our systems? 01:03:52.730 --> 01:03:55.640 Well, I dare say, malware is perhaps one of the biggest threats. 01:03:55.640 --> 01:03:58.490 So malware, or malicious software, is just software 01:03:58.490 --> 01:04:02.090 that someone has written that can do malicious things. 01:04:02.090 --> 01:04:03.710 And that's the nature of software. 01:04:03.710 --> 01:04:05.930 If you've never programmed before, it turns out 01:04:05.930 --> 01:04:08.300 that it's not that hard to write a program that, if you 01:04:08.300 --> 01:04:11.300 run it, deletes all of the files on your computer, 01:04:11.300 --> 01:04:14.510 or maybe starts sending out spam, or maybe starts mining bitcoin, 01:04:14.510 --> 01:04:15.800 or anything else. 01:04:15.800 --> 01:04:17.480 Software can do anything. 01:04:17.480 --> 01:04:20.840 And whether it's malicious or not is really up to the human who wrote it 01:04:20.840 --> 01:04:22.550 or the person who's using it. 01:04:22.550 --> 01:04:24.890 Now you might have heard of specific types of malware, 01:04:24.890 --> 01:04:26.210 for instance, a virus. 01:04:26.210 --> 01:04:30.200 A virus is a piece of software that attaches itself to a host, 01:04:30.200 --> 01:04:32.630 just like in the human physiological world. 01:04:32.630 --> 01:04:36.290 The catch with a virus in the digital world is that you, the human, 01:04:36.290 --> 01:04:39.320 usually have to do something to get infected.
01:04:39.320 --> 01:04:42.447 You have to open a file that's infected with the virus, 01:04:42.447 --> 01:04:45.530 and start running it on your computer, and loading it into your computer's 01:04:45.530 --> 01:04:47.990 memory and CPU or brain. 01:04:47.990 --> 01:04:52.650 You have to click on an attachment in an email that's perhaps infected. 01:04:52.650 --> 01:04:55.970 So viruses generally require human intervention 01:04:55.970 --> 01:04:57.560 and really, human mistakes. 01:04:57.560 --> 01:05:00.410 You're exposing yourself unintentionally to a piece 01:05:00.410 --> 01:05:02.210 of software that can now do anything. 01:05:02.210 --> 01:05:05.030 And a virus can literally do anything that a piece of software can. 01:05:05.030 --> 01:05:07.130 So it can delete all the files on your hard drive. 01:05:07.130 --> 01:05:07.923 It can send spam. 01:05:07.923 --> 01:05:09.090 It can start bitcoin mining. 01:05:09.090 --> 01:05:11.752 It can email all of your files to an adversary. 01:05:11.752 --> 01:05:14.210 Once you have a piece of software running on your computer, 01:05:14.210 --> 01:05:15.600 all bets are off. 01:05:15.600 --> 01:05:20.480 So you might wonder then, well, what's the line between a virus and Microsoft 01:05:20.480 --> 01:05:25.640 Word, or Spotify, or some other piece of software you install on your computer? 01:05:25.640 --> 01:05:28.500 Really, it's ethics, at the end of the day. 01:05:28.500 --> 01:05:32.930 We are trusting that Microsoft Word, and Spotify, and other software you 01:05:32.930 --> 01:05:35.540 might install on your Mac, your PC, or your phone, 01:05:35.540 --> 01:05:37.742 just isn't doing bad things. 
01:05:37.742 --> 01:05:40.700 Because very often, once you have a piece of software on your computer, 01:05:40.700 --> 01:05:44.930 it technically can do anything that the operating system-- 01:05:44.930 --> 01:05:47.720 Windows, or macOS, or iOS, or Android-- 01:05:47.720 --> 01:05:50.960 makes possible, thanks to those manufacturers. 01:05:50.960 --> 01:05:53.480 And it really is a code of ethics. 01:05:53.480 --> 01:05:56.030 It really is, perhaps, capitalistic pressures 01:05:56.030 --> 01:05:59.780 that ensure that companies aren't necessarily infecting us 01:05:59.780 --> 01:06:01.370 with software that's doing bad things. 01:06:01.370 --> 01:06:01.870 Why? 01:06:01.870 --> 01:06:04.220 Probably bad for business, if nothing else, 01:06:04.220 --> 01:06:06.260 if they're caught doing something malicious. 01:06:06.260 --> 01:06:10.693 But with that said, historically it's been quite possible for software 01:06:10.693 --> 01:06:13.610 to be written, even within the constraints of an operating system, that 01:06:13.610 --> 01:06:14.660 does bad things. 01:06:14.660 --> 01:06:16.670 And heck, maybe it's even accidental. 01:06:16.670 --> 01:06:19.940 It has absolutely been the case that sometimes software deletes things 01:06:19.940 --> 01:06:22.800 that it shouldn't because some human made a mistake. 01:06:22.800 --> 01:06:25.490 Now with that said, gradually the world is getting better 01:06:25.490 --> 01:06:29.930 at designing better and better operating systems that try to sandbox things. 01:06:29.930 --> 01:06:32.960 And iOS, which runs on iPhones, and iPads, and the like, 01:06:32.960 --> 01:06:35.630 is actually particularly good at this, whereby, 01:06:35.630 --> 01:06:38.780 sometimes annoyingly, apps can't do something 01:06:38.780 --> 01:06:40.730 without your explicit permission. 01:06:40.730 --> 01:06:43.700 Now you and I might not give much thought to just saying OK, OK, OK, 01:06:43.700 --> 01:06:45.690 because we want the software to do its thing.
01:06:45.690 --> 01:06:50.030 But more so than operating systems past, you and I are increasingly 01:06:50.030 --> 01:06:53.630 being allowed to weigh in on whether some piece of software 01:06:53.630 --> 01:06:57.260 can use the network, can turn on the camera, can turn on the microphone. 01:06:57.260 --> 01:06:59.930 So thankfully, we're getting more and more building blocks 01:06:59.930 --> 01:07:02.010 to mitigate some of these concerns. 01:07:02.010 --> 01:07:03.830 But there are still viruses in the world. 01:07:03.830 --> 01:07:06.290 And you can be infected by them whether it 01:07:06.290 --> 01:07:09.950 is some file you've downloaded and run or some email attachment you've 01:07:09.950 --> 01:07:10.460 clicked. 01:07:10.460 --> 01:07:14.990 But more worrisome is another type of malware that we might call a worm. 01:07:14.990 --> 01:07:17.418 And a worm is very similar in spirit to a virus, 01:07:17.418 --> 01:07:18.960 in that it's just malicious software. 01:07:18.960 --> 01:07:19.910 It can do anything. 01:07:19.910 --> 01:07:23.930 But these things can travel from computer to computer, even 01:07:23.930 --> 01:07:26.690 without interaction by humans. 01:07:26.690 --> 01:07:27.990 Now how does this work? 01:07:27.990 --> 01:07:31.610 Well, a worm, theoretically, once it's installed on one computer, 01:07:31.610 --> 01:07:34.160 having infected it and running, well, it could 01:07:34.160 --> 01:07:36.770 do that technique called port scanning from earlier. 01:07:36.770 --> 01:07:41.180 And maybe it can use that infected computer's internet connection 01:07:41.180 --> 01:07:44.900 and just start scanning the local network or a broader network 01:07:44.900 --> 01:07:49.290 for IP addresses of other computers and ports of other computers. 
01:07:49.290 --> 01:07:53.270 And if one of those computers' ports happens to be listening, 01:07:53.270 --> 01:07:58.760 and for unfortunate reasons, that computer is not using encryption, 01:07:58.760 --> 01:08:02.750 it's not using authentication, and it's vulnerable somehow, theoretically 01:08:02.750 --> 01:08:06.230 that worm can travel from computer, to computer, to computer 01:08:06.230 --> 01:08:11.130 by making these network connections via these ports at these IP addresses. 01:08:11.130 --> 01:08:13.010 And that is quite often how they have spread. 01:08:13.010 --> 01:08:15.830 It's the result of mistakes in software. 01:08:15.830 --> 01:08:20.390 It's the result of there having been holes in our firewalls, in our systems 01:08:20.390 --> 01:08:23.600 by allowing them through via these techniques. 01:08:23.600 --> 01:08:25.790 So what's the downside really? 01:08:25.790 --> 01:08:28.880 Well, beyond just wreaking havoc on your own computer, 01:08:28.880 --> 01:08:32.960 by deleting all of your files, spamming people, bitcoin mining, and the like, 01:08:32.960 --> 01:08:36.819 actually what adversaries have increasingly been doing over the years 01:08:36.819 --> 01:08:38.950 is creating botnets, so to speak. 01:08:38.950 --> 01:08:42.189 That is, it turns out it's more valuable to an adversary 01:08:42.189 --> 01:08:46.600 not to completely disable your system, because that doesn't really serve them 01:08:46.600 --> 01:08:51.010 long term, but maybe to install software on your computer that's 01:08:51.010 --> 01:08:52.390 just constantly running. 01:08:52.390 --> 01:08:54.700 And it's not actually doing anything bad to you. 01:08:54.700 --> 01:08:58.359 None of your files are deleted, no spam is being sent, but you are infected. 01:08:58.359 --> 01:09:01.569 And there's just some piece of software constantly running on your computer.
01:09:01.569 --> 01:09:03.939 But maybe this adversary has somehow figured out 01:09:03.939 --> 01:09:06.460 how to infect not just your computer, but my computer, 01:09:06.460 --> 01:09:09.480 and your computer, and your computer, and your computer, and hundreds, 01:09:09.480 --> 01:09:11.470 thousands of other computers in the world. 01:09:11.470 --> 01:09:17.500 Now, if this software is smart and it's constantly listening for commands, 01:09:17.500 --> 01:09:21.220 an attacker can send some kind of commands, 01:09:21.220 --> 01:09:25.930 not unlike those virtual envelopes, to this entire botnet of computers 01:09:25.930 --> 01:09:30.700 and say, OK computers, now everyone at the same moment 01:09:30.700 --> 01:09:34.090 start attacking some server, or everyone at this moment 01:09:34.090 --> 01:09:37.420 start sending emails, everyone at this moment start mining bitcoin. 01:09:37.420 --> 01:09:41.649 And so you can leverage the network of hundreds or thousands of computers 01:09:41.649 --> 01:09:46.840 all at once and have a much more powerful attack therefore possible. 01:09:46.840 --> 01:09:48.850 And what form do those attacks take? 01:09:48.850 --> 01:09:53.200 Well, quite often what we'd call a denial of service attack, or DoS. 01:09:53.200 --> 01:09:55.150 And this is exactly as the name suggests. 01:09:55.150 --> 01:10:00.080 Sometimes adversaries' goal in life is just to deny service to everyone else, 01:10:00.080 --> 01:10:03.070 whether it's for political reasons, financial reasons, or the like. 01:10:03.070 --> 01:10:07.150 It's one thing for me on my laptop or my phone to maybe visit-- 01:10:07.150 --> 01:10:10.630 maybe I'm a little annoyed at Google today, so I go to google.com, 01:10:10.630 --> 01:10:14.230 and then I keep hitting reload, reload, reload, reload-- or faster-- reload, 01:10:14.230 --> 01:10:15.490 reload, reload, reload.
01:10:15.490 --> 01:10:18.250 I'm trying to deny service to other people, 01:10:18.250 --> 01:10:21.070 but realistically, Google has way more resources than me, 01:10:21.070 --> 01:10:23.830 so a denial of service attack only really 01:10:23.830 --> 01:10:27.370 works if you have a lot of resources yourself, more resources 01:10:27.370 --> 01:10:29.530 than the site you're attacking. 01:10:29.530 --> 01:10:33.100 But with botnets, when you control multiple computers, 01:10:33.100 --> 01:10:37.120 you can actually wage a distributed denial of service attack, 01:10:37.120 --> 01:10:39.880 whereby again, you send a command to this whole network 01:10:39.880 --> 01:10:43.750 of infected computers, and you say, all right, everyone, go visit google.com 01:10:43.750 --> 01:10:45.670 right now, and reload, reload, reload. 01:10:45.670 --> 01:10:49.840 And when it's hundreds or thousands of computers doing it to Google, or maybe 01:10:49.840 --> 01:10:52.480 not someone as big as Google, but maybe a small business 01:10:52.480 --> 01:10:54.910 that they're annoyed at or they're competing with, 01:10:54.910 --> 01:11:00.220 you can via a distributed network try to deny service to actual customers 01:11:00.220 --> 01:11:01.360 or users of that site. 01:11:01.360 --> 01:11:01.930 Why? 01:11:01.930 --> 01:11:05.740 Well, a computer, long story short, only has so much memory, 01:11:05.740 --> 01:11:08.300 only can do so much per unit of time. 01:11:08.300 --> 01:11:11.140 And if you completely distract a computer or server 01:11:11.140 --> 01:11:15.580 by all of these requests that are bogus, then the good requests from the real 01:11:15.580 --> 01:11:19.240 users, the real customers, can't necessarily squeeze in. 01:11:19.240 --> 01:11:21.820 And so you're denying service to other people. 01:11:21.820 --> 01:11:24.880 And distributed is particularly malicious. 01:11:24.880 --> 01:11:25.420 Why?
01:11:25.420 --> 01:11:28.420 Well, once a company figured out what's happening, like Google, 01:11:28.420 --> 01:11:30.580 they could just block with their firewall 01:11:30.580 --> 01:11:33.910 my IP address, for instance, coming from my phone, or my laptop, 01:11:33.910 --> 01:11:34.780 or other device. 01:11:34.780 --> 01:11:37.600 Once they know where the attack is coming from, they can just deny 01:11:37.600 --> 01:11:39.160 or they can block service there. 01:11:39.160 --> 01:11:43.780 But if it's distributed, if it's coming from all of us or thousands of people, 01:11:43.780 --> 01:11:47.410 thousands of IP addresses, then it gets a little harder. 01:11:47.410 --> 01:11:51.700 Technically, they could just block all of our IP addresses, thousands of IPs. 01:11:51.700 --> 01:11:56.590 But very often, you and I share IP addresses if we're on the same campus, 01:11:56.590 --> 01:11:57.940 on the same corporate network. 01:11:57.940 --> 01:12:01.900 Even though locally we might have unique addresses, to the outside world 01:12:01.900 --> 01:12:06.790 we might all share one public address from our whole company or school. 01:12:06.790 --> 01:12:11.200 And at some point, it's not going to be in Google's best interest in this story 01:12:11.200 --> 01:12:13.210 to block all of those IP addresses. 01:12:13.210 --> 01:12:17.410 Because then we might be denying service to actual good people on this campus, 01:12:17.410 --> 01:12:22.090 at this company, who just happened to be on the same network as this attacker 01:12:22.090 --> 01:12:24.280 or as this infected computer. 01:12:24.280 --> 01:12:28.660 And so that's really the value of attacking computers nowadays. 01:12:28.660 --> 01:12:31.600 It's not just one thing to get at your computer individually. 01:12:31.600 --> 01:12:33.460 It's what your computer represents. 01:12:33.460 --> 01:12:38.410 You are a node, a potential ally on a network of systems. 
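To make the blocking trade-off described above concrete, here is a minimal sketch in Python. It assumes a made-up table of users and the public addresses the outside world sees for them; the user names and the documentation-range addresses are hypothetical and do not come from the lecture. It only illustrates the point that blocking one shared public address also blocks everyone else behind it.

```python
from ipaddress import ip_address

# Hypothetical mapping from each user to the public address a server would see.
# Three campus machines share one public address; the home user has their own.
PUBLIC_ADDRESS = {
    "campus_student": "203.0.113.10",
    "campus_professor": "203.0.113.10",
    "campus_infected_laptop": "203.0.113.10",
    "home_user": "198.51.100.7",
}


def is_blocked(user, blocklist):
    """Return True if a firewall using this blocklist would drop the user's traffic."""
    return ip_address(PUBLIC_ADDRESS[user]) in blocklist


def blocked_users(blocklist):
    """List every user whose public address appears on the blocklist."""
    return sorted(user for user in PUBLIC_ADDRESS if is_blocked(user, blocklist))


# The server notices an attack from the infected laptop and blocks its public address.
blocklist = {ip_address(PUBLIC_ADDRESS["campus_infected_laptop"])}

# Everyone sharing that address is now locked out, not just the infected machine.
print(blocked_users(blocklist))
```

Under these assumptions, the student and the professor end up blocked along with the infected laptop, while the home user is unaffected.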
01:12:38.410 --> 01:12:41.920 So it's just as well that we try to keep this kind of software 01:12:41.920 --> 01:12:45.250 out altogether instead. 01:12:45.250 --> 01:12:45.760 So how? 01:12:45.760 --> 01:12:46.810 How do you keep it out? 01:12:46.810 --> 01:12:48.935 Well, for years you've probably heard about-- maybe 01:12:48.935 --> 01:12:50.740 you've been using-- antivirus software. 01:12:50.740 --> 01:12:52.300 And that's exactly what it does. 01:12:52.300 --> 01:12:55.180 Antivirus software is software you either download for free, 01:12:55.180 --> 01:12:57.460 or maybe pay for, and install on your computer. 01:12:57.460 --> 01:12:59.560 And it's generally constantly running. 01:12:59.560 --> 01:13:03.250 Or maybe it runs on a schedule to check are there any known viruses 01:13:03.250 --> 01:13:06.880 or worms on this computer, and if so, let's delete them, or let's somehow 01:13:06.880 --> 01:13:07.828 remove them safely. 01:13:07.828 --> 01:13:10.120 And maybe you have to reboot to get them out of memory, 01:13:10.120 --> 01:13:12.020 but then maybe you're back in business. 01:13:12.020 --> 01:13:13.635 To be fair, that might be too late. 01:13:13.635 --> 01:13:15.760 Because the emails-- the spam might have been sent, 01:13:15.760 --> 01:13:17.230 the files might have been deleted. 01:13:17.230 --> 01:13:22.700 But it at least gets it out of there for the future. 01:13:22.700 --> 01:13:25.570 But there's a problem with antivirus software 01:13:25.570 --> 01:13:32.050 alone, in that it has to actually be current for the attacks 01:13:32.050 --> 01:13:33.100 that you're facing. 01:13:33.100 --> 01:13:35.800 That is to say, for the antivirus software to work, 01:13:35.800 --> 01:13:38.620 it has to know about the virus, has to know about the worm. 01:13:38.620 --> 01:13:40.750 Now how can you make sure you're always current?
01:13:40.750 --> 01:13:44.530 Well, you can enable automatic updates with your antivirus software or even 01:13:44.530 --> 01:13:46.450 your operating system more generally. 01:13:46.450 --> 01:13:49.030 And in recent years, the world has realized 01:13:49.030 --> 01:13:52.000 that even though there are downsides of automatic updates, 01:13:52.000 --> 01:13:56.080 it's generally proving to be, it seems, a net positive for society. 01:13:56.080 --> 01:13:56.620 Why? 01:13:56.620 --> 01:13:59.260 Because it ensures that you, and I, and everyone else 01:13:59.260 --> 01:14:03.010 are generally running the latest versions of software, which generally 01:14:03.010 --> 01:14:07.480 means we fixed security holes, that is security-related bugs 01:14:07.480 --> 01:14:08.750 in previous versions. 01:14:08.750 --> 01:14:12.233 So that at least we're not vulnerable to yesterday's mistakes. 01:14:12.233 --> 01:14:14.650 We're still vulnerable to today's and tomorrow's mistakes, 01:14:14.650 --> 01:14:16.817 when people continue to write software that's buggy. 01:14:16.817 --> 01:14:18.280 But at least we're staying current. 01:14:18.280 --> 01:14:22.630 And this also ensures too that companies can focus their resources generally 01:14:22.630 --> 01:14:24.652 on the newest versions of software, and they 01:14:24.652 --> 01:14:26.860 don't have to worry about being backwards compatible, 01:14:26.860 --> 01:14:30.190 and spending time, and effort, and distraction on software that 01:14:30.190 --> 01:14:32.240 might be older, and older, and older. 01:14:32.240 --> 01:14:36.160 So enabling automatic updates is generally proving to be a good thing, 01:14:36.160 --> 01:14:37.090 I daresay. 01:14:37.090 --> 01:14:38.470 But there are downsides. 01:14:38.470 --> 01:14:41.440 Google, Microsoft, Apple, and others, they're not perfect. 
01:14:41.440 --> 01:14:44.950 And it has definitely been the case that these companies have released 01:14:44.950 --> 01:14:48.700 an update that actually breaks your computer or mine, in the sense 01:14:48.700 --> 01:14:51.020 that now I can't access it or something went wrong. 01:14:51.020 --> 01:14:53.470 And so generally, automatic updates are not something 01:14:53.470 --> 01:14:57.820 you do to all of your customers all at once, but maybe a few, then a few more, 01:14:57.820 --> 01:15:01.150 just to make sure that we're not going to break or brick 01:15:01.150 --> 01:15:02.980 a whole lot of our users' computers. 01:15:02.980 --> 01:15:05.590 But generally speaking, turning on automatic updates 01:15:05.590 --> 01:15:09.010 will at least ensure that you're not vulnerable to problems 01:15:09.010 --> 01:15:10.730 the world has already solved. 01:15:10.730 --> 01:15:14.560 And this is a good thing because there's nothing worse than realizing, oh, I've 01:15:14.560 --> 01:15:17.230 been attacked and there was something you could do about it. 01:15:17.230 --> 01:15:19.480 You could have updated your software already. 01:15:19.480 --> 01:15:22.540 But the catch is that there are also these attacks known 01:15:22.540 --> 01:15:24.100 as zero-day attacks. 01:15:24.100 --> 01:15:27.640 And the problem with antivirus software in general, and even automatic updates, 01:15:27.640 --> 01:15:31.360 is that there's still humans involved in this process, whereby 01:15:31.360 --> 01:15:34.210 they have to realize, oh, there's a new virus in the world, 01:15:34.210 --> 01:15:37.150 oh, there's a new worm in the world, oh, we made a mistake. 01:15:37.150 --> 01:15:38.200 You have to fix it. 01:15:38.200 --> 01:15:42.280 You have to update the antivirus software to detect those new threats, 01:15:42.280 --> 01:15:45.280 or those new viruses, those new worms.
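As a rough illustration of why antivirus software has to "know about" a threat before it can catch it, here is a minimal sketch in Python of detection against a set of known fingerprints. It assumes, purely for illustration, that a signature is an exact SHA-256 hash of a file's bytes; the sample byte strings and function names are hypothetical, and real products are not claimed to work exactly this way.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes (our stand-in for a signature)."""
    return hashlib.sha256(data).hexdigest()


def scan(data: bytes, known_signatures: set) -> bool:
    """Return True if the data matches a signature the scanner already knows about."""
    return fingerprint(data) in known_signatures


# Harmless placeholder bytes standing in for malicious files.
old_sample = b"pretend this is a worm the vendor already knows about"
new_sample = b"pretend this is a brand-new worm nobody has seen yet"

# Yesterday's signature list only covers the old sample.
signatures = {fingerprint(old_sample)}

detected_old = scan(old_sample, signatures)             # known threat: caught
detected_new_before = scan(new_sample, signatures)      # unknown threat: missed

# Only after a human notices and the vendor ships an update is it caught.
signatures.add(fingerprint(new_sample))
detected_new_after = scan(new_sample, signatures)

print(detected_old, detected_new_before, detected_new_after)
```

The gap between the second and third checks is the window in which a scanner like this one would miss a new threat, no matter how diligently it runs.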
01:15:45.280 --> 01:15:48.610 So a zero-day attack is an example of an attack 01:15:48.610 --> 01:15:52.870 where an adversary maybe writes their own virus, their own worm, 01:15:52.870 --> 01:15:55.930 gets it out into the wild, maybe on enough computers 01:15:55.930 --> 01:15:58.870 that they can do something particularly destructive or valuable 01:15:58.870 --> 01:16:03.320 for them with it, and the world just doesn't have time to catch up. 01:16:03.320 --> 01:16:06.970 So even if you have antivirus software installed, automatic updates, 01:16:06.970 --> 01:16:11.830 it might still take a day, a week for the companies who design that software 01:16:11.830 --> 01:16:14.120 to update those products for you. 01:16:14.120 --> 01:16:16.340 So even then, you're still vulnerable. 01:16:16.340 --> 01:16:18.400 And that's why security really is going to be 01:16:18.400 --> 01:16:21.730 this multipronged approach, especially when it comes to our systems. 01:16:21.730 --> 01:16:24.070 It's not enough to just use antivirus software. 01:16:24.070 --> 01:16:26.110 It's not enough just to have a good password. 01:16:26.110 --> 01:16:30.820 It really is a layered defense so that you create this gauntlet of defenses 01:16:30.820 --> 01:16:33.250 ultimately that adversaries have to get through. 01:16:33.250 --> 01:16:35.380 And if they get through one, hopefully you're fine. 01:16:35.380 --> 01:16:37.450 If they get through two, hopefully you're still fine. 01:16:37.450 --> 01:16:40.117 If they get through three, maybe then you should start worrying. 01:16:40.117 --> 01:16:42.370 But security really isn't this absolute. 01:16:42.370 --> 01:16:45.580 Recall from where we began, we really just want to raise the bar, 01:16:45.580 --> 01:16:49.090 raise the cost, raise the risk to the adversary 01:16:49.090 --> 01:16:52.750 so that, again, they, hopefully, lose interest in little old me. 01:16:52.750 --> 01:16:55.270 So that's it for today's focus on systems.
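The layered-defense idea above can be put into rough numbers. The sketch below, in Python, assumes each layer fails independently of the others and uses made-up bypass chances; neither the independence assumption nor the figures come from the lecture, and real defenses are rarely this tidy. It only shows how stacking layers shrinks the chance that an adversary gets through all of them.

```python
from math import prod


def breach_probability(bypass_chances):
    """Chance that an adversary gets past every layer, assuming layers fail independently."""
    return prod(bypass_chances)


# Hypothetical chance that an adversary slips past any single layer.
one_layer = breach_probability([0.5])
two_layers = breach_probability([0.5, 0.5])
three_layers = breach_probability([0.5, 0.5, 0.5])

print(one_layer, two_layers, three_layers)
```

With these invented numbers, each extra layer halves the adversary's odds, which is one way to read "raise the bar, raise the cost" without ever reaching absolute security.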
01:16:55.270 --> 01:16:58.690 Hereafter, we'll focus on software, specifically on what you can do, 01:16:58.690 --> 01:17:00.910 whether you use or write software. 01:17:00.910 --> 01:17:03.940 And thereafter, we'll take a turn to privacy as well. 01:17:03.940 --> 01:17:06.570 All that next time.