1 00:00:00,000 --> 00:00:11,210 >> [MUSIC PLAYING] 2 00:00:11,210 --> 00:00:14,640 >> SPEAKER 1: All right, welcome back to CS50. 3 00:00:14,640 --> 00:00:18,190 This is the end of week eight, and almost Halloween. 4 00:00:18,190 --> 00:00:22,460 Tomorrow night's office hours will be the scariest ones yet, and not because 5 00:00:22,460 --> 00:00:23,460 of Halloween. 6 00:00:23,460 --> 00:00:28,600 >> But on that note, do realize that problem set six, the spell checking 7 00:00:28,600 --> 00:00:32,340 problem set, is renowned to be, for many students, the most challenging, 8 00:00:32,340 --> 00:00:36,010 certainly among the C problem sets, and really, in general. 9 00:00:36,010 --> 00:00:39,250 And I mention this only because this is the week where a lot of people get 10 00:00:39,250 --> 00:00:42,840 particularly stressed with just trying to get the damn spell checker to work. 11 00:00:42,840 --> 00:00:45,640 And the one thing I would encourage you is that, as you'll see today, and 12 00:00:45,640 --> 00:00:49,670 on Monday, we begin to hit this peak this week where, now, things become a 13 00:00:49,670 --> 00:00:52,370 little more familiar, a little more accessible, as we transition from a 14 00:00:52,370 --> 00:00:56,120 command line environment in C to a web based environment in PHP. 15 00:00:56,120 --> 00:00:59,805 >> And so I'd encourage you, even if you're really at your wit's end in 16 00:00:59,805 --> 00:01:02,785 trying to get the p set to work, if that's indeed the place you're at, or 17 00:01:02,785 --> 00:01:05,770 find yourself at, do try to power through it. 18 00:01:05,770 --> 00:01:08,280 Because I do think you'll be quite pleased, and quite proud of yourself, 19 00:01:08,280 --> 00:01:12,300 if you really end that portion of the course, the C portion, on that high, 20 00:01:12,300 --> 00:01:13,310 if stressful, note. 21 00:01:13,310 --> 00:01:14,120 So that's not to scare. 
22 00:01:14,120 --> 00:01:18,010 That's just meant to encourage you to stay up that extra hour in order to 23 00:01:18,010 --> 00:01:19,820 get the spell checking working. 24 00:01:19,820 --> 00:01:22,730 >> And if you do, realize that this is optional, entirely. 25 00:01:22,730 --> 00:01:25,720 But we have the so-called big board that went live this morning. 26 00:01:25,720 --> 00:01:29,950 As of this morning, I was atop the big board, which is a measurement of how 27 00:01:29,950 --> 00:01:34,450 much RAM and how much running time your program speller requires. 28 00:01:34,450 --> 00:01:35,890 But I've since been displaced. 29 00:01:35,890 --> 00:01:37,910 I'm now the unlucky number 13. 30 00:01:37,910 --> 00:01:41,460 And what you'll see here is, David Kaufman, and Lauren, and Adam, and 31 00:01:41,460 --> 00:01:44,130 Jason, and others are now atop the big board. 32 00:01:44,130 --> 00:01:47,480 >> If you look over there at the right, all of us have really good 33 00:01:47,480 --> 00:01:49,890 implementations of size at least-- 34 00:01:49,890 --> 00:01:51,640 returning the number of words in the dictionary. 35 00:01:51,640 --> 00:01:54,690 And in each of these columns, you'll see how much RAM each of our 36 00:01:54,690 --> 00:01:58,370 implementations is using, how much running time it's taking to execute 37 00:01:58,370 --> 00:02:01,450 load, versus check, versus size and unload, and then, the 38 00:02:01,450 --> 00:02:02,490 total running time. 39 00:02:02,490 --> 00:02:05,990 So just to reassure Elmer, and Patrick, and Linda, and everyone else 40 00:02:05,990 --> 00:02:09,210 who comes after you, there's absolutely no shame in being toward 41 00:02:09,210 --> 00:02:10,590 the bottom of the big board. 42 00:02:10,590 --> 00:02:13,950 If anything, that means you got it working, and it's correct, but it's 43 00:02:13,950 --> 00:02:18,480 not necessarily as efficient, space or time-wise, as it might be. 44 00:02:18,480 --> 00:02:19,430 >> So, totally optional.
45 00:02:19,430 --> 00:02:22,630 But meant to be a carrot of sorts so that when you're working on your p 46 00:02:22,630 --> 00:02:25,960 set, you're so proud of yourself, you got it working, you post to the big 47 00:02:25,960 --> 00:02:28,920 board, you've got a really good number, you go to dinner, you come 48 00:02:28,920 --> 00:02:31,810 back, and your roommate has edged you out on the big board. 49 00:02:31,810 --> 00:02:34,910 Well, it's time, at that point, to go back to the drawing board so as to 50 00:02:34,910 --> 00:02:36,160 re-challenge the big board. 51 00:02:36,160 --> 00:02:39,330 If you look at the spec, the instructions for interfacing with the 52 00:02:39,330 --> 00:02:41,480 big board are now posted. 53 00:02:41,480 --> 00:02:44,870 >> So a couple of heads ups-- 54 00:02:44,870 --> 00:02:48,410 one, the pre-proposal for the final project is due this coming Monday. 55 00:02:48,410 --> 00:02:51,060 See the spec on the course's website for what that means. 56 00:02:51,060 --> 00:02:54,450 It's really just a casual but thought provoking email between you and your 57 00:02:54,450 --> 00:02:58,410 TF, really just to get things started, the conversation started, even though 58 00:02:58,410 --> 00:03:02,110 most of you have never even written a web page before, don't even know what 59 00:03:02,110 --> 00:03:04,850 you might, how you might, implement your final project. 60 00:03:04,850 --> 00:03:07,250 Go on faith that you'll know how to do quite a few more 61 00:03:07,250 --> 00:03:08,410 things in a few weeks. 62 00:03:08,410 --> 00:03:12,900 So just begin this process per the spec of exploring possible ideas. 63 00:03:12,900 --> 00:03:16,030 >> Also, what we'd invite you to do is-- we have a tradition, for many years 64 00:03:16,030 --> 00:03:18,840 now, in the course, of hosting this-- store.cs50.net. 65 00:03:18,840 --> 00:03:20,010 Everything's sold at cost.
66 00:03:20,010 --> 00:03:23,460 And it's really just an opportunity to wear CS50, if you would like to do 67 00:03:23,460 --> 00:03:24,920 that, at course's end. 68 00:03:24,920 --> 00:03:27,990 For instance, there are such things as the t-shirts that you might have seen 69 00:03:27,990 --> 00:03:29,880 going around campus, sweatshirts. 70 00:03:29,880 --> 00:03:33,960 And then, we also invite students to submit designs to be immortalized in 71 00:03:33,960 --> 00:03:35,330 the CS50 store. 72 00:03:35,330 --> 00:03:39,910 >> For instance, one of last year's favorites that will, perhaps, now 73 00:03:39,910 --> 00:03:41,860 resonate with you is this one here. 74 00:03:41,860 --> 00:03:45,390 75 00:03:45,390 --> 00:03:46,820 Very popular item. 76 00:03:46,820 --> 00:03:51,020 So if you would like to participate in this, we'll put up a form soon, at 77 00:03:51,020 --> 00:03:54,240 cs50.net/design, to which you can upload an image that you've made in 78 00:03:54,240 --> 00:03:56,990 Illustrator, or Photoshop, or some similar program. 79 00:03:56,990 --> 00:03:59,850 And if you're familiar with these kinds of specifications, we want it to 80 00:03:59,850 --> 00:04:05,010 be a PNG image, at least 200 dots per inch, and fewer than that many pixels, 81 00:04:05,010 --> 00:04:07,680 and under 10 megabytes. 82 00:04:07,680 --> 00:04:11,260 For more details, just email the course's heads at heads@cs50.net if 83 00:04:11,260 --> 00:04:13,910 you would like to partake in this. 84 00:04:13,910 --> 00:04:20,920 >> All right, so today, no more C. So we begin to pull back the layers of the 85 00:04:20,920 --> 00:04:24,900 internet, the web, and how you can actually start writing software for 86 00:04:24,900 --> 00:04:26,420 this different environment. 87 00:04:26,420 --> 00:04:31,420 So in particular, let's ask, first, the question of-- 88 00:04:31,420 --> 00:04:36,070 let me get us to our familiar drawing app over here.
89 00:04:36,070 --> 00:04:42,702 Let me pose the question of, how does the internet work. 90 00:04:42,702 --> 00:04:43,560 >> [? STUDENT: Magic. ?] 91 00:04:43,560 --> 00:04:44,010 >> SPEAKER 1: Magic. 92 00:04:44,010 --> 00:04:44,940 OK. 93 00:04:44,940 --> 00:04:45,880 Good answer. 94 00:04:45,880 --> 00:04:49,460 So we'll start there today, and see if we can't make it a little less magical 95 00:04:49,460 --> 00:04:50,880 within the hour. 96 00:04:50,880 --> 00:04:53,850 Let's try to tell it in the context of a story. 97 00:04:53,850 --> 00:04:58,480 >> So you're fans of going to facebook.com, or reddit.com, or 98 00:04:58,480 --> 00:04:59,780 whatever these days. 99 00:04:59,780 --> 00:05:02,590 And so what's really happening when you type in something like 100 00:05:02,590 --> 00:05:07,020 facebook.com, and hit Enter, in Chrome, or Firefox, or IE, or Safari, 101 00:05:07,020 --> 00:05:09,050 or whatever browser you're actually doing? 102 00:05:09,050 --> 00:05:11,500 Can we tell this story, maybe sentence by sentence? 103 00:05:11,500 --> 00:05:14,770 What's one of the first things that happens when you hit Enter, after 104 00:05:14,770 --> 00:05:15,876 typing facebook.com? 105 00:05:15,876 --> 00:05:17,780 >> [? STUDENT: Your ?] computer makes an HTTP request. 106 00:05:17,780 --> 00:05:18,260 >> SPEAKER 1: OK. 107 00:05:18,260 --> 00:05:21,900 So your computer makes-- we'll call it-- an HTTP request. 108 00:05:21,900 --> 00:05:22,940 Now what does that mean? 109 00:05:22,940 --> 00:05:27,980 Well, all of us have probably seen or typed, for years now, H-T-T-P often 110 00:05:27,980 --> 00:05:29,186 followed by colon, slash, slash. 111 00:05:29,186 --> 00:05:30,340 So what is that? 112 00:05:30,340 --> 00:05:33,980 >> Well, HTTP is HyperText Transfer Protocol. 
113 00:05:33,980 --> 00:05:37,360 And that's just a fancy way of saying, it's the language that web browsers, 114 00:05:37,360 --> 00:05:42,460 like Chrome and others, and web servers, like facebook.com, speak to 115 00:05:42,460 --> 00:05:43,100 one another. 116 00:05:43,100 --> 00:05:46,730 And it's a fairly simple, English oriented language. 117 00:05:46,730 --> 00:05:48,140 It's almost like pseudo code. 118 00:05:48,140 --> 00:05:51,820 >> And it's a way of a client, as we'll call it-- a browser-- 119 00:05:51,820 --> 00:05:53,150 communicating with the server. 120 00:05:53,150 --> 00:05:56,230 And just like in a restaurant, when you, the client, sit down at a table 121 00:05:56,230 --> 00:05:59,630 and then order something off of the menu of the server, that server's 122 00:05:59,630 --> 00:06:02,720 going to bring you back something, whatever it is you requested. 123 00:06:02,720 --> 00:06:04,270 Same in the computer world. 124 00:06:04,270 --> 00:06:04,970 A browser-- 125 00:06:04,970 --> 00:06:05,610 a client-- 126 00:06:05,610 --> 00:06:07,890 is going to make a request, and then, hopefully get back 127 00:06:07,890 --> 00:06:09,120 something from the server. 128 00:06:09,120 --> 00:06:11,660 And that something is, at a high level, the web page. 129 00:06:11,660 --> 00:06:15,040 At a slightly lower level, it's a file written in another 130 00:06:15,040 --> 00:06:17,160 language called HTML-- 131 00:06:17,160 --> 00:06:18,920 HyperText Markup Language. 132 00:06:18,920 --> 00:06:20,720 But more on that in just a moment. 133 00:06:20,720 --> 00:06:22,470 >> So HyperText Transfer Protocol-- 134 00:06:22,470 --> 00:06:23,450 HTTP-- 135 00:06:23,450 --> 00:06:26,050 that's the protocol that browser and server use. 136 00:06:26,050 --> 00:06:27,830 Well, what is a protocol, exactly? 137 00:06:27,830 --> 00:06:29,280 Well, you can think of it as a language. 
138 00:06:29,280 --> 00:06:32,580 But if I reach out to our audience here, a normal thing for us humans to 139 00:06:32,580 --> 00:06:35,928 do is, when we greet someone, I say, hi, my name is David. 140 00:06:35,928 --> 00:06:37,320 >> [? STUDENT: Hi, ?] my name is Dipty. 141 00:06:37,320 --> 00:06:39,000 >> SPEAKER 1: "Hi, my name is Dipty," she replies. 142 00:06:39,000 --> 00:06:43,530 And so we've had this fairly arbitrary interaction of shaking hands, as is 143 00:06:43,530 --> 00:06:45,730 often the human convention in most countries. 144 00:06:45,730 --> 00:06:47,380 And that's a protocol, right? 145 00:06:47,380 --> 00:06:50,680 I sort of initiated it by extending my hand, rather awkwardly, on the stage 146 00:06:50,680 --> 00:06:51,610 of Sanders here. 147 00:06:51,610 --> 00:06:54,670 She realized, oh, I've gotten a request for a hand apparently. 148 00:06:54,670 --> 00:06:58,170 And so she responded to that request by actually acknowledging it. 149 00:06:58,170 --> 00:07:01,860 And acknowledging, ACK, is actually a phrase very common in the world of 150 00:07:01,860 --> 00:07:04,060 networking, for a server to acknowledge the client. 151 00:07:04,060 --> 00:07:07,720 Then, we sort of completed that transaction, and awkwardness over. 152 00:07:07,720 --> 00:07:10,010 So that's really what's happening underneath the hood as well. 153 00:07:10,010 --> 00:07:13,450 >> Let me do this a little more technically under the hood. 154 00:07:13,450 --> 00:07:16,900 I'm going to go over here to a terminal window. 155 00:07:16,900 --> 00:07:19,950 This terminal window happens to be on my Mac, but you could do the same kind 156 00:07:19,950 --> 00:07:21,760 of thing in CS50 Appliance. 157 00:07:21,760 --> 00:07:24,750 And I'm actually going to use a program that we won't really use for 158 00:07:24,750 --> 00:07:26,300 much at all this semester. 159 00:07:26,300 --> 00:07:27,430 But it's called Telnet.
160 00:07:27,430 --> 00:07:31,880 >> Back in the day, Telnet was the program that you used to connect to a 161 00:07:31,880 --> 00:07:34,910 remote server, to check your mail or to do something like that. 162 00:07:34,910 --> 00:07:38,460 For now, we're going to use this old school program, Telnet, to pretend to 163 00:07:38,460 --> 00:07:39,830 be a browser. 164 00:07:39,830 --> 00:07:41,550 And I'm going to go ahead and do the following-- let me 165 00:07:41,550 --> 00:07:42,800 increase my font size. 166 00:07:42,800 --> 00:07:48,080 >> And I'm going to say, Telnet to the server called www.facebook.com, but 167 00:07:48,080 --> 00:07:50,980 specifically, Telnet to port 80. 168 00:07:50,980 --> 00:07:52,070 We'll come back to this. 169 00:07:52,070 --> 00:07:56,630 But for now, know that most services on the internet are identified 170 00:07:56,630 --> 00:07:58,170 uniquely by some number. 171 00:07:58,170 --> 00:07:59,460 In this case, it's 80. 172 00:07:59,460 --> 00:08:02,910 Now most of you have probably never typed 80 before. 173 00:08:02,910 --> 00:08:08,540 But in reality, if I go to a browser and pull up, for instance, 174 00:08:08,540 --> 00:08:16,500 http://www.facebook.com/-- 175 00:08:16,500 --> 00:08:18,460 that's auto-complete, that's not my history-- 176 00:08:18,460 --> 00:08:23,070 all right, so now, we go to colon 80 slash. 177 00:08:23,070 --> 00:08:26,270 >> So I claim that even though you've probably never typed this before, with 178 00:08:26,270 --> 00:08:30,310 the colon 80 after facebook.com, hopefully, it's still going to work. 179 00:08:30,310 --> 00:08:32,220 And indeed, it goes to facebook.com. 180 00:08:32,220 --> 00:08:34,860 So it turns out that 80 has been implicit. 181 00:08:34,860 --> 00:08:36,690 None of us humans have had to type that for years. 
182 00:08:36,690 --> 00:08:41,350 Because browsers, by default, just assume that the number you want to use 183 00:08:41,350 --> 00:08:44,620 when calling up a server so to speak is, in fact, 80. 184 00:08:44,620 --> 00:08:47,340 Because long story short, servers can do way more than just 185 00:08:47,340 --> 00:08:48,320 serve up web pages. 186 00:08:48,320 --> 00:08:50,030 >> They can respond to instant messages. 187 00:08:50,030 --> 00:08:51,230 They can send emails. 188 00:08:51,230 --> 00:08:54,410 There's lots of services that can run on a single server. 189 00:08:54,410 --> 00:08:57,590 So these numbers-- in this case, 80-- uniquely identifies one of those 190 00:08:57,590 --> 00:09:01,830 services, which is HTTP, the web protocol that a server 191 00:09:01,830 --> 00:09:03,210 might actually support. 192 00:09:03,210 --> 00:09:07,250 But I can simulate this request now, textually, using this old school 193 00:09:07,250 --> 00:09:08,240 Telnet program. 194 00:09:08,240 --> 00:09:12,940 So I'm going to essentially now pretend to be a browser and speak HTTP 195 00:09:12,940 --> 00:09:16,620 by sending, with my keyboard, exactly the commands that Chrome just knew how 196 00:09:16,620 --> 00:09:18,260 to send for me magically. 197 00:09:18,260 --> 00:09:19,910 >> So I'm going to go ahead and hit Enter. 198 00:09:19,910 --> 00:09:22,000 Notice that it's trying 31.13.69.32. 199 00:09:22,000 --> 00:09:26,110 What is that? 200 00:09:26,110 --> 00:09:27,440 So it's an IP address. 201 00:09:27,440 --> 00:09:30,790 Now even if you're not too familiar with the intricacies of those, you 202 00:09:30,790 --> 00:09:33,420 probably have a general sense that these things exist. 203 00:09:33,420 --> 00:09:34,650 And an IP address-- 204 00:09:34,650 --> 00:09:36,620 Internet Protocol address-- 205 00:09:36,620 --> 00:09:40,970 is just a unique identifier for a computer on the internet. 206 00:09:40,970 --> 00:09:43,040 This is a bit of an oversimplification for the moment.
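The point about port 80 being implicit can be sketched in a few lines of Python. This is only an illustration: the helper name `effective_port` and the `DEFAULT_PORTS` table are made up for this sketch, and the 443 entry for HTTPS is the usual convention rather than something stated in the lecture.

```python
from urllib.parse import urlsplit

# Conventional default ports (assumption for this sketch: 80 for http, 443 for https).
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a browser would contact for this URL (hypothetical helper)."""
    parts = urlsplit(url)
    # parts.port is None when the URL carries no explicit ":number".
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


# With or without ":80", the same port is used.
print(effective_port("http://www.facebook.com/"))
print(effective_port("http://www.facebook.com:80/"))
```

Both calls give the same answer, which is why nobody has to type the colon 80.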
207 00:09:43,040 --> 00:09:47,490 >> But every computer on the internet has a unique IP address, much like every 208 00:09:47,490 --> 00:09:53,600 house in, say, the US has a unique postal address, something like 123 209 00:09:53,600 --> 00:09:55,820 Main Street, in Anytown, USA. 210 00:09:55,820 --> 00:09:56,540 So something like that. 211 00:09:56,540 --> 00:09:58,330 And that, too, is an oversimplification. 212 00:09:58,330 --> 00:10:01,470 But these addresses that we have in the postal world and these addresses 213 00:10:01,470 --> 00:10:04,940 that we have in the computer world uniquely identify servers so that when 214 00:10:04,940 --> 00:10:09,030 you send a message to them over the internet, or when you put a letter in 215 00:10:09,030 --> 00:10:10,500 an old school mailbox-- 216 00:10:10,500 --> 00:10:12,100 postal mail-- 217 00:10:12,100 --> 00:10:16,940 the service knows how to get that request, or that letter, to the 218 00:10:16,940 --> 00:10:18,110 intended recipient. 219 00:10:18,110 --> 00:10:21,390 >> Now my computer, somehow, has just figured out that Facebook's unique IP 220 00:10:21,390 --> 00:10:23,820 is 31.13.69.32. 221 00:10:23,820 --> 00:10:25,170 In fact, that can probably change. 222 00:10:25,170 --> 00:10:27,780 Facebook probably has multiple IP addresses, because they absolutely 223 00:10:27,780 --> 00:10:29,150 have more than one server. 224 00:10:29,150 --> 00:10:30,810 But that's happened for us magically. 225 00:10:30,810 --> 00:10:35,070 In fact, the internal secret name of the server I've apparently connected 226 00:10:35,070 --> 00:10:40,270 to is called star.c10r.facebook.com, whatever that is. 227 00:10:40,270 --> 00:10:42,960 It's just whatever the system administrator at Facebook decided to 228 00:10:42,960 --> 00:10:46,510 call this particular server that I was somewhat randomly sent to.
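As a small aside on what an address like 31.13.69.32 is underneath, the dotted form is just a human-friendly way of writing four bytes. A minimal Python sketch using the standard `ipaddress` module follows; the address is the one from the Telnet session, and nothing here contacts the network.

```python
import ipaddress

# The address Telnet reported in the session above.
addr = ipaddress.ip_address("31.13.69.32")

# An IPv4 address is four bytes, written as four decimal numbers.
print(addr.version)        # IP version of this address
print(list(addr.packed))   # the four bytes behind the dotted notation
print(int(addr))           # the same 32 bits read as a single integer
```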
229 00:10:46,510 --> 00:10:48,630 >> So now if my connection hasn't timed out, I'm going to 230 00:10:48,630 --> 00:10:50,210 pretend to be that browser. 231 00:10:50,210 --> 00:10:54,590 I'm going to say get space forward slash space. 232 00:10:54,590 --> 00:10:58,220 And I'm going to pretend to be speaking HTTP version 1.1, which is 233 00:10:58,220 --> 00:10:59,880 the one that most browsers use. 234 00:10:59,880 --> 00:11:03,980 And I'm specifically going to mention to the server, by the way, I want the 235 00:11:03,980 --> 00:11:06,280 website known to the world as facebook.com. 236 00:11:06,280 --> 00:11:09,000 Enter, Enter. 237 00:11:09,000 --> 00:11:11,390 And now, notice what's happened. 238 00:11:11,390 --> 00:11:16,400 >> The server, the waiter, has responded to my order, or my request, with 239 00:11:16,400 --> 00:11:17,720 another textual message. 240 00:11:17,720 --> 00:11:20,720 Now again, in the world of browsers like Chrome and Safari, you wouldn't 241 00:11:20,720 --> 00:11:21,990 see this, as the human. 242 00:11:21,990 --> 00:11:24,770 Microsoft and Google just hide these details from us. 243 00:11:24,770 --> 00:11:29,580 But Facebook has responded with an answer, also in the language HTTP. 244 00:11:29,580 --> 00:11:33,250 Notice there's a code here, 302, which actually has special significance by 245 00:11:33,250 --> 00:11:34,110 convention. 246 00:11:34,110 --> 00:11:36,030 Found, so that's at least promising. 247 00:11:36,030 --> 00:11:39,160 >> But apparently Facebook is telling me, mm-mm, you don't want 248 00:11:39,160 --> 00:11:40,190 what you asked for. 249 00:11:40,190 --> 00:11:42,810 You instead want today's special, which is 250 00:11:42,810 --> 00:11:45,680 facebook.com/unsupportedbrowser. 251 00:11:45,680 --> 00:11:50,350 So at a high level, what does Facebook appear to be doing here? 252 00:11:50,350 --> 00:11:51,410 It's redirecting me. 
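To make the redirect concrete, here is a rough Python sketch that pulls the status code and the new location out of a response like the one described. The `SAMPLE` text is a stand-in written for this sketch, not a capture of what Facebook actually sent, and `parse_response_head` is a hypothetical helper name.

```python
def parse_response_head(raw):
    """Split the head of an HTTP response into (status code, reason, headers)."""
    # The head ends at the first blank line; each line ends in CRLF.
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # Status line looks like: HTTP/1.1 302 Found
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values such as URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers


# Illustrative response text (an assumption, modeled on the session described above).
SAMPLE = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://www.facebook.com/unsupportedbrowser\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

code, reason, headers = parse_response_head(SAMPLE)
print(code, reason, headers["location"])
```

A browser that sees a 3xx code like this simply makes a second request to whatever the Location header says.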
253 00:11:51,410 --> 00:11:53,420 So Facebook doesn't like the fact that I'm pretending to 254 00:11:53,420 --> 00:11:54,770 be this other browser. 255 00:11:54,770 --> 00:11:57,700 And so it's redirecting me to some website. 256 00:11:57,700 --> 00:11:59,820 >> I'm actually curious, now, what this thing looks like. 257 00:11:59,820 --> 00:12:04,420 Let me go over to that in Chrome so we can see what they want me to see. 258 00:12:04,420 --> 00:12:07,060 So now they've actually sent me back to Facebook because they've realized, 259 00:12:07,060 --> 00:12:08,360 oh, you do have a supported browser. 260 00:12:08,360 --> 00:12:10,260 We're not even going to show you that page. 261 00:12:10,260 --> 00:12:12,920 So let's go ahead and see if we can't fix this. 262 00:12:12,920 --> 00:12:14,280 >> I'm going to have to cheat a little bit. 263 00:12:14,280 --> 00:12:16,350 And more on this in the weeks to come. 264 00:12:16,350 --> 00:12:18,120 But I'm going to do one thing here. 265 00:12:18,120 --> 00:12:20,590 And I'll explain this before long. 266 00:12:20,590 --> 00:12:24,320 Give me just a moment to cheat, and wow you. 267 00:12:24,320 --> 00:12:28,190 So let me get this. 268 00:12:28,190 --> 00:12:29,110 OK. 269 00:12:29,110 --> 00:12:30,690 I'll explain what I'm doing in just a moment. 270 00:12:30,690 --> 00:12:32,810 I'm going to go ahead and cancel this connection, and try this again. 271 00:12:32,810 --> 00:12:38,440 >> Get slash HTTP 1.1 host www.facebook.com user-agent. 272 00:12:38,440 --> 00:12:43,880 273 00:12:43,880 --> 00:12:44,560 OK. 274 00:12:44,560 --> 00:12:46,820 Now I have pretended to be Chrome. 275 00:12:46,820 --> 00:12:50,920 So it turns out that when a browser sends a request to a server, it's just 276 00:12:50,920 --> 00:12:51,595 the honor system. 277 00:12:51,595 --> 00:12:54,840 If I say I'm Chrome, Facebook will assume I'm Chrome. 
278 00:12:54,840 --> 00:12:58,560 And the means by which I identified myself as Chrome is by this 279 00:12:58,560 --> 00:13:00,360 atrociously long string. 280 00:13:00,360 --> 00:13:03,240 Essentially, all the browser manufacturers in the world have 281 00:13:03,240 --> 00:13:06,470 decided, well, this version of this browser on this operating system will 282 00:13:06,470 --> 00:13:09,740 have a user-agent string that looks like that crazy mess there. 283 00:13:09,740 --> 00:13:12,110 And Mozilla is in there for historical reasons. 284 00:13:12,110 --> 00:13:15,160 >> But notice how much information I'm leaking to facebook.com without even 285 00:13:15,160 --> 00:13:16,030 logging in. 286 00:13:16,030 --> 00:13:18,910 I'm telling Mark that it's a Mac that I'm using. 287 00:13:18,910 --> 00:13:23,590 I'm telling him that it's an Intel based Mac running Mac OS 10.8.5. 288 00:13:23,590 --> 00:13:27,870 As an aside, this information is going to every website that you visit with 289 00:13:27,870 --> 00:13:28,500 your browser. 290 00:13:28,500 --> 00:13:31,360 Pretty innocuous so far, but it gets a little juicier. 291 00:13:31,360 --> 00:13:33,920 >> Notice that, if we read far enough, I'm using Chrome version 292 00:13:33,920 --> 00:13:38,060 30.0.1599.101. 293 00:13:38,060 --> 00:13:42,410 But now, notice that the response is not as bad as it was before. 294 00:13:42,410 --> 00:13:44,840 Where is Facebook telling me to go now? 295 00:13:44,840 --> 00:13:49,140 It's telling me, again, the website-- 296 00:13:49,140 --> 00:13:50,720 it's telling me it's moved permanently. 297 00:13:50,720 --> 00:13:54,200 Well, where the heck did Facebook go? 298 00:13:54,200 --> 00:13:56,100 >> Yeah, so it's a subtle difference. 299 00:13:56,100 --> 00:14:01,680 But notice, here, that the website has actually relocated to HTTPS. 
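The request typed into Telnet above is plain text, so it can be sketched as a string in Python. The function name `build_request` is hypothetical, and the user-agent value below is an illustrative reconstruction from the details mentioned (Intel Mac, OS X 10.8.5, Chrome 30.0.1599.101), not a copy of what was typed on screen.

```python
def build_request(host, path="/", user_agent=None):
    """Build the text of a minimal HTTP/1.1 GET request (sketch, not a full client)."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if user_agent is not None:
        # The server takes this header on faith; nothing verifies it.
        lines.append(f"User-Agent: {user_agent}")
    # Lines end in CRLF, and an empty line marks the end of the request,
    # which is the "Enter, Enter" in the Telnet session.
    return "\r\n".join(lines) + "\r\n\r\n"


# Illustrative user-agent string (an assumption, see the note above).
CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36"
)

request = build_request("www.facebook.com", user_agent=CHROME_UA)
print(request)
```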
300 00:14:01,680 --> 00:14:05,210 So long story short, this is one way that Facebook is enforcing that I 301 00:14:05,210 --> 00:14:08,890 actually end up at the secure version of their website, the one that's using 302 00:14:08,890 --> 00:14:09,660 encryption-- 303 00:14:09,660 --> 00:14:12,730 more complex than the encryption we talked about for p set two, but 304 00:14:12,730 --> 00:14:14,520 encryption nonetheless. 305 00:14:14,520 --> 00:14:17,110 >> Now at this point it gets hard for me to spoof their web 306 00:14:17,110 --> 00:14:18,230 request using Telnet. 307 00:14:18,230 --> 00:14:20,210 Because if they're telling me to use SSL-- 308 00:14:20,210 --> 00:14:23,050 the HTTPS prefix is what that implies-- 309 00:14:23,050 --> 00:14:25,590 if they're telling me to use cryptography, there's no way I'm going 310 00:14:25,590 --> 00:14:28,610 to manually encrypt my message in front of all of you here, and try to 311 00:14:28,610 --> 00:14:29,770 figure out how to do that. 312 00:14:29,770 --> 00:14:31,150 It's just going to get much more complex. 313 00:14:31,150 --> 00:14:33,150 But that's what the browser is doing for you. 314 00:14:33,150 --> 00:14:36,230 >> Let's see if we can't do this a little more simply, then, with a website 315 00:14:36,230 --> 00:14:38,700 that's not expecting us to be as secure. 316 00:14:38,700 --> 00:14:43,310 Let's go to, say, harvard.edu on port 80. 317 00:14:43,310 --> 00:14:44,550 Enter. 318 00:14:44,550 --> 00:14:48,170 All right, so get slash HTTP 1.1. 319 00:14:48,170 --> 00:14:49,730 And what does this first slash mean? 320 00:14:49,730 --> 00:14:53,120 Just to be clear, why do I keep typing that? 321 00:14:53,120 --> 00:14:54,790 >> Well normally, when you type a URL-- 322 00:14:54,790 --> 00:14:57,610 and unfortunately, browsers usually hide this these days-- 323 00:14:57,610 --> 00:15:00,850 normally, when you go to harvard.edu, that URL officially 324 00:15:00,850 --> 00:15:02,560 does end in a slash. 
325 00:15:02,560 --> 00:15:07,350 Because a single slash denotes what part of the hard drive? 326 00:15:07,350 --> 00:15:08,990 The root of the hard drive. 327 00:15:08,990 --> 00:15:11,260 We in the Appliance haven't really had to think about this, because we're 328 00:15:11,260 --> 00:15:12,930 always in John Harvard's folder. 329 00:15:12,930 --> 00:15:14,690 But his folder's in another folder. 330 00:15:14,690 --> 00:15:17,980 And that folder's in the root of the Appliance's hard drive, so to speak, 331 00:15:17,980 --> 00:15:18,980 even though it's virtual. 332 00:15:18,980 --> 00:15:21,660 So a single slash like this means the root of the hard drive. 333 00:15:21,660 --> 00:15:25,650 It's like C colon backslash, or it's the root of your volume, on Mac OS. 334 00:15:25,650 --> 00:15:28,740 >> But Chrome, and other browsers these days, have gotten user-friendly, and 335 00:15:28,740 --> 00:15:30,300 they hide that slash altogether. 336 00:15:30,300 --> 00:15:32,620 But that's all that means in my textual message-- 337 00:15:32,620 --> 00:15:36,570 give me the root of harvard.edu's homepage, that is, the 338 00:15:36,570 --> 00:15:38,120 default page itself. 339 00:15:38,120 --> 00:15:39,900 So let me go ahead and hit Enter. 340 00:15:39,900 --> 00:15:43,650 Let me remind the host that I want www.harvard.edu, just in case there's 341 00:15:43,650 --> 00:15:45,880 other websites living on the same physical server. 342 00:15:45,880 --> 00:15:46,080 >> OK. 343 00:15:46,080 --> 00:15:47,700 Harvard got a little impatient with me. 344 00:15:47,700 --> 00:15:49,390 So let's do this again, faster. 345 00:15:49,390 --> 00:15:55,560 Get slash HTTP 1.1 host www.harvard.edu user-agent-- 346 00:15:55,560 --> 00:15:58,080 I'm guessing our servers don't care as much about this-- 347 00:15:58,080 --> 00:15:59,566 Enter, Enter. 348 00:15:59,566 --> 00:15:59,962 Whew. 349 00:15:59,962 --> 00:16:01,700 Oh damn it, bad request. 350 00:16:01,700 --> 00:16:02,080 OK. 
351 00:16:02,080 --> 00:16:05,310 So what's going on here-- 352 00:16:05,310 --> 00:16:07,800 hello, harvard.edu. 353 00:16:07,800 --> 00:16:10,280 Why is it doing the-- interesting. 354 00:16:10,280 --> 00:16:11,710 Oh, OK. 355 00:16:11,710 --> 00:16:14,830 >> So what Harvard's now doing-- and we're going to quickly veer off of 356 00:16:14,830 --> 00:16:17,100 this path, because it's going to get tedious quickly-- 357 00:16:17,100 --> 00:16:21,270 notice that Harvard is actually compressing its response to me, which 358 00:16:21,270 --> 00:16:22,140 isn't ideal. 359 00:16:22,140 --> 00:16:25,780 Because I, apparently, as a human, don't know how to decompress bits that 360 00:16:25,780 --> 00:16:27,280 have been sent to me compressed. 361 00:16:27,280 --> 00:16:31,500 And they're being shown as garbage there, because they're zeros and ones, 362 00:16:31,500 --> 00:16:33,190 but they're not ASCII characters. 363 00:16:33,190 --> 00:16:36,090 They're patterns of zeros and ones that have been compressed to take up 364 00:16:36,090 --> 00:16:37,050 less space. 365 00:16:37,050 --> 00:16:39,010 >> So very quickly, let me see if I can recover here. 366 00:16:39,010 --> 00:16:41,590 Let's try, maybe, another campus altogether. 367 00:16:41,590 --> 00:16:50,450 mit.edu get slash HTTP slash 1.1 host www.mit.edu user-agent colon there. 368 00:16:50,450 --> 00:16:51,600 Thank you, MIT. 369 00:16:51,600 --> 00:16:52,630 OK. 370 00:16:52,630 --> 00:16:55,750 >> So here we have a web page. 371 00:16:55,750 --> 00:16:58,840 >> So this is the language known as HTML-- 372 00:16:58,840 --> 00:17:00,400 HyperText Markup Language. 373 00:17:00,400 --> 00:17:03,390 I'm simply scrolling back up in time to get to the very 374 00:17:03,390 --> 00:17:04,810 tip top of this page. 375 00:17:04,810 --> 00:17:07,440 And notice how MIT has responded to my request. 376 00:17:07,440 --> 00:17:08,520 200 is good. 377 00:17:08,520 --> 00:17:10,630 200 means everything is literally OK.
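The status codes seen so far (302, 301, 400, and now 200) each have a conventional short phrase. As a quick sketch, Python's standard library ships a table of them; the helper `describe` is a name invented for this example.

```python
from http import HTTPStatus


def describe(code):
    """Return the conventional reason phrase for an HTTP status code."""
    return HTTPStatus(code).phrase


# The codes that have come up in the Telnet sessions above.
for code in (200, 301, 302, 400):
    print(code, describe(code))
```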
378 00:17:10,630 --> 00:17:13,390 And that's a status code that we humans really never 379 00:17:13,390 --> 00:17:14,670 see, in a good way. 380 00:17:14,670 --> 00:17:16,140 Because it means all is well. 381 00:17:16,140 --> 00:17:19,369 >> Notice that MIT is informing me, hey, the server we're running is called 382 00:17:19,369 --> 00:17:23,849 Apache, which is a very popular open source free web server. 383 00:17:23,849 --> 00:17:25,589 They're running, apparently, UNIX, which is an 384 00:17:25,589 --> 00:17:27,130 operating system like Linux. 385 00:17:27,130 --> 00:17:30,660 Notice that they apparently updated their web page at 4:00 a.m., 386 00:17:30,660 --> 00:17:32,400 Greenwich Mean Time. 387 00:17:32,400 --> 00:17:34,990 >> Notice a couple of other details. 388 00:17:34,990 --> 00:17:37,910 They're returning, to me, text/html. 389 00:17:37,910 --> 00:17:39,800 So we'll see what that means in just a moment. 390 00:17:39,800 --> 00:17:45,460 They've apparently given me 14,717 bytes worth of HTML. 391 00:17:45,460 --> 00:17:48,180 And some other, more esoteric information is in there. 392 00:17:48,180 --> 00:17:49,920 >> But this is where it gets interesting. 393 00:17:49,920 --> 00:17:52,580 This is how you make a web page. 394 00:17:52,580 --> 00:17:57,860 This is how you make a web page whose title in the tab, in your browser, is 395 00:17:57,860 --> 00:18:00,590 MIT hyphen Massachusetts Institute of Technology. 396 00:18:00,590 --> 00:18:06,300 And indeed, if we go back to Chrome and visit www.mit.edu, notice that, 397 00:18:06,300 --> 00:18:09,680 indeed, in the title up here, is MIT dash Massachusetts 398 00:18:09,680 --> 00:18:11,260 Institute dot, dot, dot. 
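To show where the tab's title comes from, here is a rough Python sketch that reads the title out of a tiny page. The `PAGE` string is a made-up stand-in with the same title, not MIT's actual markup, and `TitleParser` and `extract_title` are names invented for this sketch.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears while inside the title element.
        if self.in_title:
            self.title += data


# A minimal stand-in page (an assumption for illustration only).
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>MIT - Massachusetts Institute of Technology</title>
  </head>
  <body>
    hello, world
  </body>
</html>"""


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


print(extract_title(PAGE))
```

The browser does essentially this with the real page: it finds the title element and puts its text in the tab.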
399 00:18:11,260 --> 00:18:16,490 And now notice, too, if I right click or control click on the desktop here, 400 00:18:16,490 --> 00:18:17,960 and go to View Page Source-- 401 00:18:17,960 --> 00:18:20,870 at least in Chrome, though every browser does this via some means-- 402 00:18:20,870 --> 00:18:22,140 here is that same file. 403 00:18:22,140 --> 00:18:25,140 >> It happens to be color coded, or syntax highlighted. 404 00:18:25,140 --> 00:18:28,590 But just like with your C code that was not colorized by you, it was 405 00:18:28,590 --> 00:18:31,810 colorized by gedit, similarly is Chrome just making 406 00:18:31,810 --> 00:18:33,130 this prettier to read. 407 00:18:33,130 --> 00:18:37,110 But this is the stuff that we'll soon be writing. 408 00:18:37,110 --> 00:18:38,840 So that's the endgame. 409 00:18:38,840 --> 00:18:42,020 The server has responded with that information, just like you responded 410 00:18:42,020 --> 00:18:43,660 with your hand for our handshake. 411 00:18:43,660 --> 00:18:47,280 But what else has to be going on in between those steps? 412 00:18:47,280 --> 00:18:53,430 >> Well, when I type in, in this last case, www.mit.edu and hit Enter, we 413 00:18:53,430 --> 00:18:56,390 know it's talking to port 80 automatically, port 414 00:18:56,390 --> 00:18:57,780 just being that number. 415 00:18:57,780 --> 00:19:00,710 But where did the IP address go? 416 00:19:00,710 --> 00:19:05,045 How is my computer figuring out what the IP address of mit.edu is? 417 00:19:05,045 --> 00:19:07,720 418 00:19:07,720 --> 00:19:10,840 >> Well, it turns out, in this world, there are things called DNS servers. 419 00:19:10,840 --> 00:19:14,500 And let me go ahead and draw a quick picture over here. 420 00:19:14,500 --> 00:19:17,680 And this'll just sketch out, in rough terms, what's going on. 421 00:19:17,680 --> 00:19:21,510 So we'll pretend like this is my laptop here, in Sanders. 
422 00:19:21,510 --> 00:19:24,650 And it has Wi-Fi, so it's connected wirelessly to something. 423 00:19:24,650 --> 00:19:26,060 >> What's it actually connected to? 424 00:19:26,060 --> 00:19:27,990 Well, somewhere in here, there's something on the 425 00:19:27,990 --> 00:19:29,240 wall with some antennas. 426 00:19:29,240 --> 00:19:30,725 And that's called an access point-- 427 00:19:30,725 --> 00:19:31,560 AP. 428 00:19:31,560 --> 00:19:34,190 Wireless access point, wireless router-- call it whatever you want. 429 00:19:34,190 --> 00:19:36,230 But they're all over campus, with those little antennas. 430 00:19:36,230 --> 00:19:38,100 Ours are made by Cisco, typically. 431 00:19:38,100 --> 00:19:42,480 And so somehow, my computer is talking to that wireless access point, 432 00:19:42,480 --> 00:19:45,580 somewhere here in Sanders, or downstairs, or outside. 433 00:19:45,580 --> 00:19:50,030 >> Meanwhile, this thing has a lot of physical wires going to, probably, the 434 00:19:50,030 --> 00:19:52,175 Science Center, which we'll draw like this. 435 00:19:52,175 --> 00:19:54,200 It doesn't actually look like that. 436 00:19:54,200 --> 00:19:55,200 That actually looks a lot better. 437 00:19:55,200 --> 00:19:59,170 So the Science Center has a whole bunch of computers inside of it that 438 00:19:59,170 --> 00:20:02,320 are somehow physically connected to all of these access points on campus. 439 00:20:02,320 --> 00:20:06,440 And those physical computers, we'll call routers, or gateways. 440 00:20:06,440 --> 00:20:09,450 >> A router, as its name suggests, its purpose in life is to route 441 00:20:09,450 --> 00:20:10,310 information. 442 00:20:10,310 --> 00:20:14,150 It takes some bits, from a computer, as input, and figures out to where 443 00:20:14,150 --> 00:20:15,640 those bits should be sent. 444 00:20:15,640 --> 00:20:19,910 So in the case of my request for mit.edu, it's actually pretty easy.
445 00:20:19,910 --> 00:20:24,620 My request comes in from my browser, over Wi-Fi, to the access point, then, 446 00:20:24,620 --> 00:20:27,080 via some cable, into a router in the Science Center. 447 00:20:27,080 --> 00:20:29,810 And somehow, the router in the Science Center figures out 448 00:20:29,810 --> 00:20:31,510 that MIT is that way. 449 00:20:31,510 --> 00:20:34,080 And I'm going to move forward those bits, I'm going to route those bits, 450 00:20:34,080 --> 00:20:36,670 down the road, down Mass Ave., to MIT. 451 00:20:36,670 --> 00:20:42,030 But how did my computer know what the IP address even was? 452 00:20:42,030 --> 00:20:45,660 >> Well it turns out that somewhere in here there are servers-- 453 00:20:45,660 --> 00:20:48,330 and I'm going to draw it fairly abstractly-- 454 00:20:48,330 --> 00:20:49,710 as a DNS server-- 455 00:20:49,710 --> 00:20:51,220 Domain Name System. 456 00:20:51,220 --> 00:20:51,960 These are not routers. 457 00:20:51,960 --> 00:20:56,050 These are different types of servers whose purpose in life is to translate 458 00:20:56,050 --> 00:21:04,340 host names, like www.mit.edu, to IP addresses, like 1.2.3.4. So DNS servers 459 00:21:04,340 --> 00:21:05,240 do exactly that. 460 00:21:05,240 --> 00:21:08,320 You can think of them as having a big database, or really, like a big Excel 461 00:21:08,320 --> 00:21:09,750 file with two columns. 462 00:21:09,750 --> 00:21:12,120 One is host names, one is IP addresses. 463 00:21:12,120 --> 00:21:15,020 And they just convert one to the other, in either direction. 464 00:21:15,020 --> 00:21:16,830 >> Now in reality, it's a little more complex than that. 465 00:21:16,830 --> 00:21:22,070 But that's how my computer, my random Mac or PC on this table here, knows 466 00:21:22,070 --> 00:21:27,590 what the unique identifier is for www.mit.edu, or Facebook, or 467 00:21:27,590 --> 00:21:29,680 harvard.edu, for that matter.
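The "Excel file with two columns" picture of DNS can be sketched as a pair of dictionaries. This is a toy model under stated assumptions, not how real DNS servers are built: the addresses are placeholders (1.2.3.4 is the made-up example from lecture, and `www.example.edu` is hypothetical), and the helper names `resolve` and `reverse_lookup` are invented here.

```python
# Hypothetical two-column table: host name -> IP address.
HOST_TO_IP = {
    "www.mit.edu": "1.2.3.4",      # placeholder address used in lecture
    "www.example.edu": "5.6.7.8",  # hypothetical second row
}

# The same table read in the other direction: IP address -> host name.
IP_TO_HOST = {ip: host for host, ip in HOST_TO_IP.items()}


def resolve(host):
    """Translate a host name to an IP address, or None if unknown."""
    return HOST_TO_IP.get(host.lower())


def reverse_lookup(ip):
    """Translate an IP address back to a host name, or None if unknown."""
    return IP_TO_HOST.get(ip)


print(resolve("www.mit.edu"))
print(reverse_lookup("1.2.3.4"))
```

On a real machine, Python's standard library function `socket.gethostbyname` asks the operating system's resolver to do the forward lookup for you.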
468 00:21:29,680 --> 00:21:33,520 But of course, there's the entirety of Mass Ave here. 469 00:21:33,520 --> 00:21:37,390 And then, we get to MIT, which this is actually more compelling. 470 00:21:37,390 --> 00:21:39,230 That'll be MIT. 471 00:21:39,230 --> 00:21:41,580 And so they, too, have some servers. 472 00:21:41,580 --> 00:21:45,770 And they somehow have a wired, or wireless, connection to Harvard. 473 00:21:45,770 --> 00:21:48,830 And of course, we can go much farther down the road than MIT, and talk to 474 00:21:48,830 --> 00:21:50,470 most any computer in the world. 475 00:21:50,470 --> 00:21:52,060 >> But let's see if we can't see that. 476 00:21:52,060 --> 00:21:54,810 Let me go back to my Terminal window for just a moment. 477 00:21:54,810 --> 00:22:00,170 And let's assume that I figured out what the IP address is for mit.edu 478 00:22:00,170 --> 00:22:02,700 like Telnet figured it out before, and my browser can clearly 479 00:22:02,700 --> 00:22:03,960 figure it out for me. 480 00:22:03,960 --> 00:22:06,970 And I'm going to run another program, in this Terminal window, called 481 00:22:06,970 --> 00:22:10,320 traceroute, tracing the route from here-- 482 00:22:10,320 --> 00:22:13,760 literally, this table-- to www.mit.edu. 483 00:22:13,760 --> 00:22:14,750 Let's see what happens. 484 00:22:14,750 --> 00:22:16,690 Let me actually shrink the font size. 485 00:22:16,690 --> 00:22:17,430 Oop. 486 00:22:17,430 --> 00:22:18,790 No, I wanted to surprise you. 487 00:22:18,790 --> 00:22:19,110 >> OK. 488 00:22:19,110 --> 00:22:20,870 So here we go. 489 00:22:20,870 --> 00:22:22,880 Let me go ahead and run this here. 490 00:22:22,880 --> 00:22:26,410 And what I was seeing a moment ago, and we're seeing again now, is this 491 00:22:26,410 --> 00:22:29,980 output-- traceroute www.mit.edu. 492 00:22:29,980 --> 00:22:33,380 Notice, in the first line, this program indeed figured out that MIT's 493 00:22:33,380 --> 00:22:35,730 IP address is this number here. 
494 00:22:35,730 --> 00:22:38,060 And now, what's going on between us and them? 495 00:22:38,060 --> 00:22:44,110 >> So this line here, in row one, and this line here, in row two, and then, 496 00:22:44,110 --> 00:22:46,335 row three-- what do each of these lines probably represent? 497 00:22:46,335 --> 00:22:49,010 498 00:22:49,010 --> 00:22:50,225 Locations, points, sure. 499 00:22:50,225 --> 00:22:53,520 They're called hops, conceptually. 500 00:22:53,520 --> 00:22:56,230 But physically, what are they? 501 00:22:56,230 --> 00:22:57,130 They're routers. 502 00:22:57,130 --> 00:22:59,820 >> We only have, really, one piece of hardware here to talk about thus far. 503 00:22:59,820 --> 00:23:00,560 They're routers. 504 00:23:00,560 --> 00:23:01,800 So this thing here-- 505 00:23:01,800 --> 00:23:02,990 crazy name-- 506 00:23:02,990 --> 00:23:06,700 but this is probably machine room, MR, in the Science Center. 507 00:23:06,700 --> 00:23:08,680 It's a gateway, aka router. 508 00:23:08,680 --> 00:23:11,160 This is just some unique number that someone came up with for it. 509 00:23:11,160 --> 00:23:13,120 And it's within harvard.edu. 510 00:23:13,120 --> 00:23:16,290 And that's the IP address of that router that's, again, probably in the 511 00:23:16,290 --> 00:23:17,860 Science Center, based on its name. 512 00:23:17,860 --> 00:23:21,440 This second row represents another router that doesn't have a nickname 513 00:23:21,440 --> 00:23:23,980 apparently-- a host name-- it just has an IP address. 514 00:23:23,980 --> 00:23:28,070 >> So long story short, to get data from points A to B, there's more than just 515 00:23:28,070 --> 00:23:31,400 Harvard's router, and MIT's router, and Google's router, 516 00:23:31,400 --> 00:23:32,640 and Facebook's router. 517 00:23:32,640 --> 00:23:37,300 There's dozens, hundreds, thousands of routers between any point A and any 518 00:23:37,300 --> 00:23:38,710 point B on the internet. 
519 00:23:38,710 --> 00:23:41,710 But typically, you can get data from one point to another in 520 00:23:41,710 --> 00:23:43,210 fewer than 30 hops. 521 00:23:43,210 --> 00:23:47,930 In other words, you only have to hand the data to 30 or fewer such routers. 522 00:23:47,930 --> 00:23:49,720 And it's typically many fewer than that. 523 00:23:49,720 --> 00:23:50,970 >> Well, let's see what happens here. 524 00:23:50,970 --> 00:23:54,460 In row three, we hit a router called core Science Center gateway 525 00:23:54,460 --> 00:23:56,580 something or other. 526 00:23:56,580 --> 00:23:58,970 In row 4, we have border gateway-- 527 00:23:58,970 --> 00:24:00,670 these are just cryptic acronyms-- 528 00:24:00,670 --> 00:24:02,530 also within harvard.edu. 529 00:24:02,530 --> 00:24:04,160 Here's another border gateway. 530 00:24:04,160 --> 00:24:09,070 And then, all of a sudden, whoa, we seem to be in New York City. 531 00:24:09,070 --> 00:24:12,030 >> So it turns out-- and I'm inferring only from the host name. 532 00:24:12,030 --> 00:24:12,970 This could be misleading. 533 00:24:12,970 --> 00:24:13,830 It could be down the road. 534 00:24:13,830 --> 00:24:15,030 It's tough to say-- 535 00:24:15,030 --> 00:24:21,960 but this can be used as a revelation that the shortest distance between two 536 00:24:21,960 --> 00:24:25,730 points on the internet is not necessarily a straight line. 537 00:24:25,730 --> 00:24:29,380 If we think of shortest as the quickest path, the least congested 538 00:24:29,380 --> 00:24:32,070 path, it is quite possible-- though we can't be sure-- 539 00:24:32,070 --> 00:24:37,090 that the data is traveling a decent distance between rows five and six. 540 00:24:37,090 --> 00:24:42,000 >> Now unfortunately MIT, or someone, got a little self-defensive, and they've 541 00:24:42,000 --> 00:24:43,700 started ignoring our requests.
542 00:24:43,700 --> 00:24:47,380 Those routers have been configured to ignore requests of the form who are 543 00:24:47,380 --> 00:24:48,900 you, who are you, who are you. 544 00:24:48,900 --> 00:24:51,650 So let's see if we can't do this with someone more cooperative. 545 00:24:51,650 --> 00:24:56,260 So Stanford has a nice tradition of having a little more openness. 546 00:24:56,260 --> 00:24:57,820 So let's see what happens here. 547 00:24:57,820 --> 00:24:59,080 >> Again, pretty cryptic. 548 00:24:59,080 --> 00:25:01,040 But we start, again, in the machine room in the Science 549 00:25:01,040 --> 00:25:01,990 Center, in row one. 550 00:25:01,990 --> 00:25:02,660 So that's good. 551 00:25:02,660 --> 00:25:05,240 Most of the servers did reply, including Stanford. 552 00:25:05,240 --> 00:25:07,940 So notice we went from the machine room in the Science Center, to some 553 00:25:07,940 --> 00:25:11,770 anonymous router elsewhere, to another Science Center gateway, to a border 554 00:25:11,770 --> 00:25:13,970 gateway, and then, to something here-- 555 00:25:13,970 --> 00:25:14,620 nox.org. 556 00:25:14,620 --> 00:25:19,330 This is the Northern Crossroads, a very popular peering point where lots 557 00:25:19,330 --> 00:25:21,080 of cables, lots of ISPs-- 558 00:25:21,080 --> 00:25:23,220 internet service providers-- connect into. 559 00:25:23,220 --> 00:25:25,470 Here's another nameless IP here. 560 00:25:25,470 --> 00:25:27,530 Here's another such server. 561 00:25:27,530 --> 00:25:29,910 >> But this is interesting. 562 00:25:29,910 --> 00:25:33,750 Where is the router in row eight, probably? 563 00:25:33,750 --> 00:25:36,030 So it's probably in Washington, DC. 564 00:25:36,030 --> 00:25:40,290 And I can kind of corroborate that hypothesis this time. 565 00:25:40,290 --> 00:25:45,230 Because how long did it take us to go from the Science Center to this router 566 00:25:45,230 --> 00:25:46,370 in row seven? 
567 00:25:46,370 --> 00:25:49,820 Well, these millisecond measurements on the right hand side here are 568 00:25:49,820 --> 00:25:51,960 estimates of that time. 569 00:25:51,960 --> 00:25:54,610 >> There are three of them because the program, traceroute, tries every 570 00:25:54,610 --> 00:25:58,010 router three times, just so you can get a visual average of the numbers. 571 00:25:58,010 --> 00:26:00,230 But it apparently takes six milliseconds to get 572 00:26:00,230 --> 00:26:01,840 to row seven's router. 573 00:26:01,840 --> 00:26:05,470 But how fast can, apparently, you travel, if you are a bit, between 574 00:26:05,470 --> 00:26:09,520 Boston and Washington DC? 575 00:26:09,520 --> 00:26:14,180 14 milliseconds is as long as it takes for that instant message, for that 576 00:26:14,180 --> 00:26:18,870 email, for that web page request to travel between here and Washington DC. 577 00:26:18,870 --> 00:26:23,970 >> If I go further, to router number 10, what city am I apparently in now? 578 00:26:23,970 --> 00:26:24,810 So, Houston. 579 00:26:24,810 --> 00:26:27,350 And this is corroborated by the jump in time. 580 00:26:27,350 --> 00:26:28,730 It's really slow to get to Houston. 581 00:26:28,730 --> 00:26:33,960 It takes 47 milliseconds to get from Boston to Houston in this case. 582 00:26:33,960 --> 00:26:37,120 And if we look further, LAX-- 583 00:26:37,120 --> 00:26:41,430 looks like we're getting to Stanford sort of this way, by going through LA. 584 00:26:41,430 --> 00:26:43,170 But I'm inferring that from LAX. 585 00:26:43,170 --> 00:26:46,390 The geeks tend to use airport codes for router names here. 586 00:26:46,390 --> 00:26:48,600 And this is kind of consistent with that assumption. 587 00:26:48,600 --> 00:26:50,260 82 milliseconds.
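To make the three-measurements-per-hop idea concrete, here is a small Python sketch that reads one row of traceroute-style output and averages its timings. The sample rows are hypothetical (the host name and addresses are invented; only the roughly 47 millisecond figure echoes the lecture), and `parse_hop` is a name made up for this sketch. It also assumes the common "N ms" formatting of each measurement.

```python
import re


def parse_hop(line):
    """Return (hop_number, [times_in_ms], average_or_None) for one row."""
    fields = line.split()
    hop = int(fields[0])  # the row number is the hop count
    # Every measurement appears as a number followed by " ms".
    times = [float(t) for t in re.findall(r"(\d+(?:\.\d+)?) ms", line)]
    average = sum(times) / len(times) if times else None
    return hop, times, average


# Hypothetical row: a router that answered all three probes.
row = "10  hous.example.net (10.0.0.10)  47.1 ms  46.8 ms  47.3 ms"
# Hypothetical row: a router configured to ignore the probes.
silent = "7  * * *"

print(parse_hop(row))
print(parse_hop(silent))
```

A row of asterisks yields no timings at all, which matches what happens when routers ignore these "who are you" requests.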
588 00:26:50,260 --> 00:26:54,720 >> Then, we apparently go to another LAX, another LA router and then, some 589 00:26:54,720 --> 00:26:59,530 nameless one, and then finally, a cryptic name on Stanford's network, or 590 00:26:59,530 --> 00:27:04,670 close thereto, stanford.edu, is 90 milliseconds away, or 6 591 00:27:04,670 --> 00:27:06,170 plus hours by plane. 592 00:27:06,170 --> 00:27:09,360 So this is how fast data travels on the internet. 593 00:27:09,360 --> 00:27:11,410 And it's things we absolutely take for granted these days. 594 00:27:11,410 --> 00:27:13,950 When you're having some Gchat with someone, and the messages are just 595 00:27:13,950 --> 00:27:16,940 appearing, consider just how fast that's happening. 596 00:27:16,940 --> 00:27:21,540 And visually, it's indeed happening at that kind of rate. 597 00:27:21,540 --> 00:27:25,620 >> So between points one and 18, in this case, there are 598 00:27:25,620 --> 00:27:26,890 things besides routers. 599 00:27:26,890 --> 00:27:30,140 What are some machines on the internet that can block traffic 600 00:27:30,140 --> 00:27:31,610 from getting through? 601 00:27:31,610 --> 00:27:31,950 >> STUDENT: Firewalls. 602 00:27:31,950 --> 00:27:32,910 >> SPEAKER 1: So, firewalls. 603 00:27:32,910 --> 00:27:36,260 And we have personal firewalls such that your own Mac or PC can keep 604 00:27:36,260 --> 00:27:37,540 traffic in or out. 605 00:27:37,540 --> 00:27:38,990 Harvard has firewalls. 606 00:27:38,990 --> 00:27:40,820 MIT presumably has firewalls. 607 00:27:40,820 --> 00:27:44,400 And Stanford does, as do all of the internet service providers who own 608 00:27:44,400 --> 00:27:49,260 these routers in between points A and B. But did you ever stop to consider, 609 00:27:49,260 --> 00:27:52,710 or care, how a firewall works? 610 00:27:52,710 --> 00:27:56,380 Well already, we have the basic building blocks with which to engineer 611 00:27:56,380 --> 00:27:57,700 that answer.
612 00:27:57,700 --> 00:27:59,090 >> If you were a firewall-- 613 00:27:59,090 --> 00:28:03,740 and let's suppose that you are somewhere between point A and point B. 614 00:28:03,740 --> 00:28:06,080 A cable is coming into you, and going out of you. 615 00:28:06,080 --> 00:28:11,160 So you have the technological ability to look at all of the envelopes of 616 00:28:11,160 --> 00:28:14,200 information that are flowing between you and the other person. 617 00:28:14,200 --> 00:28:17,280 In other words, those get messages I was manually typing, you can think of 618 00:28:17,280 --> 00:28:21,060 them as writing a quick note to someone, putting the IP address of the 619 00:28:21,060 --> 00:28:24,810 recipient, and the port number of the recipient, on this envelope, then, 620 00:28:24,810 --> 00:28:28,520 writing your own IP address and your own port number in the top left hand 621 00:28:28,520 --> 00:28:30,230 corner like you would a letter. 622 00:28:30,230 --> 00:28:32,520 Then, you send it out wirelessly. 623 00:28:32,520 --> 00:28:37,130 And it somehow travels, through routers, through wires, wirelessly, 624 00:28:37,130 --> 00:28:39,190 down the road to MIT. 625 00:28:39,190 --> 00:28:43,520 >> So if you're a firewall, how do you stop that from happening? 626 00:28:43,520 --> 00:28:49,710 What would you do if your next p set was implement a firewall? 627 00:28:49,710 --> 00:28:53,980 How do I stop all Harvard people from ever talking to MIT people again? 628 00:28:53,980 --> 00:28:55,870 >> [? STUDENT: You ?] reverse the letter. 629 00:28:55,870 --> 00:28:56,450 >> SPEAKER 1: You what? 630 00:28:56,450 --> 00:28:58,140 >> [? STUDENT: Reverse ?] the letter early. 631 00:28:58,140 --> 00:28:59,290 >> SPEAKER 1: Reverse the letter-- what do you mean? 632 00:28:59,290 --> 00:29:01,130 >> [? STUDENT: Send ?] it back to the sender. 633 00:29:01,130 --> 00:29:01,780 >> SPEAKER 1: Send it back. 634 00:29:01,780 --> 00:29:01,990 OK. 
635 00:29:01,990 --> 00:29:05,720 So you could reject the virtual envelope, sort of by doing return to 636 00:29:05,720 --> 00:29:06,660 sender somehow. 637 00:29:06,660 --> 00:29:08,370 So sure, that's what we want to achieve. 638 00:29:08,370 --> 00:29:09,440 But let's dive a little deeper. 639 00:29:09,440 --> 00:29:10,460 How do I do that? 640 00:29:10,460 --> 00:29:13,950 >> If the input to this problem-- if I'm the firewall, and I'm effectively 641 00:29:13,950 --> 00:29:18,020 standing between points A and B, and I am a middle man that gets to look 642 00:29:18,020 --> 00:29:21,240 inside of this envelope, and then decide whether to send it back to 643 00:29:21,240 --> 00:29:25,030 Harvard or to allow it to continue, what is it I, the firewall, am going 644 00:29:25,030 --> 00:29:26,280 to want to look at? 645 00:29:26,280 --> 00:29:29,030 646 00:29:29,030 --> 00:29:29,975 >> I think I heard it here. 647 00:29:29,975 --> 00:29:30,550 >> [? STUDENT: Where it's ?] coming from. 648 00:29:30,550 --> 00:29:32,360 >> SPEAKER 1: Where it's coming from. 649 00:29:32,360 --> 00:29:36,410 So if the source IP address-- the little number up here-- 650 00:29:36,410 --> 00:29:38,430 is an IP address belonging to Harvard-- 651 00:29:38,430 --> 00:29:40,220 and I can actually know that with high probability. 652 00:29:40,220 --> 00:29:45,540 Most of Harvard's IP addresses start with 140.247 dot something dot 653 00:29:45,540 --> 00:29:48,810 something, or 128.103 dot something dot something. 654 00:29:48,810 --> 00:29:51,450 Harvard owns those chunks of IP addresses. 655 00:29:51,450 --> 00:29:55,200 >> Well, if I see that IP address as the sender, I can just send it back. 656 00:29:55,200 --> 00:29:57,380 In reality, the internet doesn't bother wasting time 657 00:29:57,380 --> 00:29:58,460 sending the bits back. 658 00:29:58,460 --> 00:30:02,480 It just literally drops the packet by deleting it, effectively.
659 00:30:02,480 --> 00:30:04,190 So what else could I look at though? 660 00:30:04,190 --> 00:30:10,520 Suppose that I want to let people at Harvard visit mit.edu, and pull up 661 00:30:10,520 --> 00:30:13,230 websites, and watch videos at MIT, and the like. 662 00:30:13,230 --> 00:30:17,970 But I don't want humans at Harvard emailing anyone at MIT. 663 00:30:17,970 --> 00:30:23,810 How could I allow traffic from Harvard to MIT, via the web, but disallow 664 00:30:23,810 --> 00:30:24,700 something like an email? 665 00:30:24,700 --> 00:30:25,840 >> [? STUDENT: The ?] port number. 666 00:30:25,840 --> 00:30:28,650 >> SPEAKER 1: A port number-- that's the only other ingredient we have. 667 00:30:28,650 --> 00:30:31,880 We have IP address, which we just leveraged, or we have port number, 668 00:30:31,880 --> 00:30:34,870 where 80, we said, uniquely identifies web traffic. 669 00:30:34,870 --> 00:30:37,430 Now I wouldn't expect you to know this-- some of you might already know 670 00:30:37,430 --> 00:30:38,210 from familiarity-- 671 00:30:38,210 --> 00:30:41,860 what's a number that's used for email, usually? 672 00:30:41,860 --> 00:30:43,080 It's often 25. 673 00:30:43,080 --> 00:30:48,520 25 refers to SMTP, which is a mail transfer protocol that you might have 674 00:30:48,520 --> 00:30:51,270 had to set up at some point, if you're using Eudora, or Outlook, or 675 00:30:51,270 --> 00:30:52,120 something like that. 676 00:30:52,120 --> 00:30:53,190 It's just another number-- 677 00:30:53,190 --> 00:30:54,100 25. 678 00:30:54,100 --> 00:30:58,934 >> Telnet, which we were using before, uses 23. 679 00:30:58,934 --> 00:30:59,770 FTP-- 680 00:30:59,770 --> 00:31:03,750 file transfer protocol, if you've ever heard of that one-- uses 21. 681 00:31:03,750 --> 00:31:07,430 HTTPS, the secure version of HTTP, which we'll come back to 682 00:31:07,430 --> 00:31:10,130 before long, uses 443. 
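The two ingredients discussed so far, source IP address and destination port, are enough for a toy firewall rule. The Python sketch below allows web traffic from Harvard but drops email; it is an illustration only, the names `is_from_harvard` and `allow` are invented, and treating "140.247 dot something dot something" and "128.103 dot something dot something" as /16 networks is an assumption made for this sketch.

```python
import ipaddress

# Assumption: the prefixes mentioned in lecture, modeled as /16 networks.
HARVARD_NETWORKS = [
    ipaddress.ip_network("140.247.0.0/16"),
    ipaddress.ip_network("128.103.0.0/16"),
]

# Port numbers mentioned in lecture.
PORTS = {"ftp": 21, "telnet": 23, "smtp": 25, "http": 80, "https": 443}

# Policy for this sketch: Harvard may browse the web but may not send mail.
BLOCKED_PORTS = {PORTS["smtp"]}


def is_from_harvard(source_ip):
    """True if the sender's address falls inside one of the Harvard networks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in HARVARD_NETWORKS)


def allow(source_ip, dest_port):
    """Decide yea or nay for one envelope based on sender and port."""
    if is_from_harvard(source_ip) and dest_port in BLOCKED_PORTS:
        return False  # drop the packet
    return True       # forward it along


print(allow("140.247.1.2", 80))  # web request from Harvard
print(allow("140.247.1.2", 25))  # email from Harvard
```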
683 00:31:10,130 --> 00:31:14,240 So the world has a whole bunch of numbers that correlate packets-- 684 00:31:14,240 --> 00:31:17,760 rather, correlate services to those actual numbers. 685 00:31:17,760 --> 00:31:19,400 So that's all a firewall is doing. 686 00:31:19,400 --> 00:31:23,330 It's taking a look inside this virtual envelope, and then deciding yea or nay 687 00:31:23,330 --> 00:31:26,230 to forward along, based on those ingredients. 688 00:31:26,230 --> 00:31:29,720 >> Now what could Harvard clearly do to get past this firewall then? 689 00:31:29,720 --> 00:31:33,620 If you want to be able to send a message to MIT but not be detected, 690 00:31:33,620 --> 00:31:38,050 well, you could spoof your IP address, and just somehow be fancy enough, know 691 00:31:38,050 --> 00:31:41,400 how to write C code, and write your own network program that changes the 692 00:31:41,400 --> 00:31:41,860 from address. 693 00:31:41,860 --> 00:31:45,820 The problem is you can absolutely send data anonymously, but if you want to 694 00:31:45,820 --> 00:31:49,850 get any kind of reply, like see MIT's homepage, obviously, this address 695 00:31:49,850 --> 00:31:50,870 needs to be correct. 696 00:31:50,870 --> 00:31:52,780 Otherwise, you can say anything you want, you're not going to 697 00:31:52,780 --> 00:31:53,930 hear back from them. 698 00:31:53,930 --> 00:31:57,130 But this is just one of the kinds of attacks that we can send. 699 00:31:57,130 --> 00:31:59,240 >> But it turns out when we send these messages-- and let's do 700 00:31:59,240 --> 00:32:00,485 an example of this. 701 00:32:00,485 --> 00:32:04,020 It turns out, if I have a message that I want to send, it's not just sent in 702 00:32:04,020 --> 00:32:04,920 one envelope.
703 00:32:04,920 --> 00:32:08,760 For efficiency's sake, especially when the files you're requesting or the 704 00:32:08,760 --> 00:32:13,570 responses you're getting are particularly large, what TCP/IP-- 705 00:32:13,570 --> 00:32:16,330 Transmission Control Protocol / Internet Protocol-- it's just a fancy 706 00:32:16,330 --> 00:32:19,630 way of saying what the networking software and computers do-- is they 707 00:32:19,630 --> 00:32:23,770 take a message like this, and they cut it up into fragments-- 708 00:32:23,770 --> 00:32:25,540 let's say four fragments. 709 00:32:25,540 --> 00:32:29,740 >> And if I now cut this up into here, cut this up into here, what my 710 00:32:29,740 --> 00:32:34,270 computer is then going to do is it's going to take one fragment and put it 711 00:32:34,270 --> 00:32:35,700 in an envelope. 712 00:32:35,700 --> 00:32:39,130 713 00:32:39,130 --> 00:32:41,100 All right, and let me get a-- 714 00:32:41,100 --> 00:32:41,630 let's see. 715 00:32:41,630 --> 00:32:43,150 It's going to take one. 716 00:32:43,150 --> 00:32:46,490 It's going to take another envelope, and it's going to put the second part 717 00:32:46,490 --> 00:32:49,530 of this message in here. 718 00:32:49,530 --> 00:32:51,370 All right. 719 00:32:51,370 --> 00:32:55,226 It's going to take the third part, put it in here. 720 00:32:55,226 --> 00:32:57,410 Maybe next time we'll just do two parts. 721 00:32:57,410 --> 00:33:00,010 And we'll take the fourth part, and put it in here. 722 00:33:00,010 --> 00:33:02,140 >> And what, now, has to be written on these envelopes-- 723 00:33:02,140 --> 00:33:04,700 which we'll pretend to do, for time's sake, and not actually write out. 724 00:33:04,700 --> 00:33:07,760 What needs to be written on each of these four envelopes, with my message 725 00:33:07,760 --> 00:33:08,320 to someone? 726 00:33:08,320 --> 00:33:09,290 >> [? STUDENT: The ?] order. 727 00:33:09,290 --> 00:33:10,270 >> SPEAKER 1: So, the order. 
728 00:33:10,270 --> 00:33:13,740 I need not only the IP address and the port numbers, as we just discussed, I 729 00:33:13,740 --> 00:33:17,606 now need a sequence number of some sort to say, this is packet one, this 730 00:33:17,606 --> 00:33:19,840 is two, this is three, this is four. 731 00:33:19,840 --> 00:33:20,980 And this is actually useful. 732 00:33:20,980 --> 00:33:23,690 Because the internet, it turns out, is actually pretty unreliable. 733 00:33:23,690 --> 00:33:26,080 Routers can get congested. 734 00:33:26,080 --> 00:33:27,615 Cables can get overwhelmed-- 735 00:33:27,615 --> 00:33:28,860 an oversimplification-- 736 00:33:28,860 --> 00:33:32,650 but, with bits such that what routers have to do is just drop packets. 737 00:33:32,650 --> 00:33:35,540 >> In other words, if the internet is just really congested, you might get 738 00:33:35,540 --> 00:33:37,000 three out of those four packets. 739 00:33:37,000 --> 00:33:40,000 But if you have a unique identifier on each of them, you'll know that you're 740 00:33:40,000 --> 00:33:42,510 missing packet number four of four. 741 00:33:42,510 --> 00:33:45,310 So you can ask the guy at the other end to resend it. 742 00:33:45,310 --> 00:33:47,900 But assuming that doesn't happen, let's see what might happen. 743 00:33:47,900 --> 00:33:50,780 >> So if I want to send a message to-- who would like to receive my message 744 00:33:50,780 --> 00:33:52,235 from the internet? 745 00:33:52,235 --> 00:33:53,630 How about someone closer up front. 746 00:33:53,630 --> 00:33:55,490 Brian, is it? 747 00:33:55,490 --> 00:33:56,430 All right. 748 00:33:56,430 --> 00:33:57,280 You stay there. 749 00:33:57,280 --> 00:33:58,820 I'm going to send it to you. 750 00:33:58,820 --> 00:34:01,100 And the thing about the internet is that they might not even 751 00:34:01,100 --> 00:34:02,020 follow the same path. 752 00:34:02,020 --> 00:34:02,990 >> So here I go. 
753 00:34:02,990 --> 00:34:06,470 I am sending a message, fragment one of four. 754 00:34:06,470 --> 00:34:06,940 Be a router. 755 00:34:06,940 --> 00:34:08,469 Just let other people deal with it. 756 00:34:08,469 --> 00:34:10,310 There you go. 757 00:34:10,310 --> 00:34:12,790 We'll give this to you, and we'll give this to you. 758 00:34:12,790 --> 00:34:14,000 And we'll see how quickly-- 759 00:34:14,000 --> 00:34:16,500 how many milliseconds it takes to get this message to Brian. 760 00:34:16,500 --> 00:34:20,820 761 00:34:20,820 --> 00:34:23,940 Everyone gets to participate today. 762 00:34:23,940 --> 00:34:25,130 All right. 763 00:34:25,130 --> 00:34:27,130 Brian has one, and two. 764 00:34:27,130 --> 00:34:29,279 If someone wants to be-- 765 00:34:29,279 --> 00:34:30,230 >> [? STUDENT: All four. ?] 766 00:34:30,230 --> 00:34:30,980 >> SPEAKER 1: He has all four. 767 00:34:30,980 --> 00:34:32,480 So no one chose to drop a packet. 768 00:34:32,480 --> 00:34:32,900 That's cool. 769 00:34:32,900 --> 00:34:33,330 That's fine. 770 00:34:33,330 --> 00:34:34,380 So Brian now has all four. 771 00:34:34,380 --> 00:34:36,219 If you want to go ahead and reassemble those for us. 772 00:34:36,219 --> 00:34:39,360 773 00:34:39,360 --> 00:34:40,320 I know, we're pretending. 774 00:34:40,320 --> 00:34:45,090 So for time's sake-- 775 00:34:45,090 --> 00:34:45,929 we have four. 776 00:34:45,929 --> 00:34:48,909 So, OK, open one of them. 777 00:34:48,909 --> 00:34:49,360 OK. 778 00:34:49,360 --> 00:34:51,699 That's one fourth of my message to you. 779 00:34:51,699 --> 00:34:52,949 Now, open the second. 780 00:34:52,949 --> 00:34:58,190 781 00:34:58,190 --> 00:35:01,985 This may be funny, in the end, only to me and Brian. 782 00:35:01,985 --> 00:35:04,320 All right, you've got two. 
783 00:35:04,320 --> 00:35:09,110 >> So in the meantime, we physically did this with the scissors, but all it 784 00:35:09,110 --> 00:35:12,360 takes to fragment these things in a computer is just to send some of the 785 00:35:12,360 --> 00:35:15,930 bits in one packet, in one virtual envelope, some of the bits in the 786 00:35:15,930 --> 00:35:19,160 other, some in another, and some in a fourth, and then, let the computer 787 00:35:19,160 --> 00:35:21,570 decide, based on those numbers, in what order you have 788 00:35:21,570 --> 00:35:24,166 to concatenate them. 789 00:35:24,166 --> 00:35:26,270 And Brian's, maybe, the only one that can see this. 790 00:35:26,270 --> 00:35:29,010 The message I sent to Brian-- because of course, the internet is filled with 791 00:35:29,010 --> 00:35:30,260 these, is-- 792 00:35:30,260 --> 00:35:33,080 793 00:35:33,080 --> 00:35:34,500 yes. 794 00:35:34,500 --> 00:35:35,330 >> So that's the message. 795 00:35:35,330 --> 00:35:36,700 And Brian can hang on to that now. 796 00:35:36,700 --> 00:35:38,640 So it took, obviously, a while to do this. 797 00:35:38,640 --> 00:35:41,680 But that's what really happens, like routing data through the 798 00:35:41,680 --> 00:35:43,290 audience in this way. 799 00:35:43,290 --> 00:35:47,320 But there is, again, a number of points, routers, firewalls, and other 800 00:35:47,320 --> 00:35:50,700 such things between points A and B. And rather than just tell the story 801 00:35:50,700 --> 00:35:54,740 verbally, I thought I'd pull up this video that some friends of ours, from 802 00:35:54,740 --> 00:35:59,510 Ericsson, years back, actually put together that explains 803 00:35:59,510 --> 00:36:00,480 how this all works. 804 00:36:00,480 --> 00:36:02,380 And it's about 10 or so minutes long. 805 00:36:02,380 --> 00:36:04,065 So let's give you, now, Warriors of the Net.
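The envelope exercise above, cutting a message into numbered fragments, delivering them in any order, and noticing when one is missing, can be sketched in Python as follows. This is a simplified illustration and not how TCP/IP is actually implemented: the dictionary layout, the function names, and the sample message are all hypothetical, and the message is assumed to be non-empty.

```python
import random


def fragment(message, parts=4):
    """Cut `message` into up to `parts` numbered fragments."""
    size = max(1, -(-len(message) // parts))  # ceiling division
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    total = len(chunks)
    # Each "envelope" carries its sequence number, e.g. 1 of 4.
    return [
        {"seq": number, "total": total, "data": chunk}
        for number, chunk in enumerate(chunks, start=1)
    ]


def missing(packets):
    """Return the sequence numbers that never arrived."""
    total = packets[0]["total"]
    seen = {packet["seq"] for packet in packets}
    return [number for number in range(1, total + 1) if number not in seen]


def reassemble(packets):
    """Put the fragments back in order and concatenate them."""
    lost = missing(packets)
    if lost:
        raise ValueError(f"ask the sender to resend packets {lost}")
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return "".join(packet["data"] for packet in ordered)


packets = fragment("hello from the internet!")  # hypothetical message
random.shuffle(packets)       # packets need not follow the same path
print(reassemble(packets))    # order is restored from the numbers
print(missing(packets[:-1]))  # one envelope dropped along the way
```

Because every fragment is labeled, the receiver does not care about arrival order, and a gap in the numbering tells it exactly which fragment to ask for again.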
806 00:36:04,065 --> 00:36:09,282 807 00:36:09,282 --> 00:37:09,720 >> [MUSIC PLAYING] 808 00:37:09,720 --> 00:37:14,990 >> NARRATOR: For the first time in history, people and machinery are 809 00:37:14,990 --> 00:37:18,600 working together, realizing a dream-- 810 00:37:18,600 --> 00:37:22,550 a uniting force that knows no geographical boundaries, without 811 00:37:22,550 --> 00:37:26,050 regard to race, creed, or color-- 812 00:37:26,050 --> 00:37:31,000 a new era where communication truly brings people together. 813 00:37:31,000 --> 00:37:34,420 This is the dawn of the net. 814 00:37:34,420 --> 00:37:38,240 815 00:37:38,240 --> 00:37:40,070 Want to know how it works? 816 00:37:40,070 --> 00:37:44,605 Click here to begin your journey into the net. 817 00:37:44,605 --> 00:37:47,930 818 00:37:47,930 --> 00:37:51,080 >> Now exactly what happened when you clicked on that link? 819 00:37:51,080 --> 00:37:53,320 You started a flow of information. 820 00:37:53,320 --> 00:37:56,950 This information travels down into your own personal mail room, where Mr. 821 00:37:56,950 --> 00:38:01,805 IP packages it, labels it, and sends it on its way. 822 00:38:01,805 --> 00:38:03,790 >> Each packet is limited in its size. 823 00:38:03,790 --> 00:38:08,010 The mail room must decide how to divide the information, and how to 824 00:38:08,010 --> 00:38:09,170 package it. 825 00:38:09,170 --> 00:38:13,390 Now the package needs a label containing important information such 826 00:38:13,390 --> 00:38:19,492 as sender's address, receiver's address, and the type of packet it is. 827 00:38:19,492 --> 00:38:34,940 828 00:38:34,940 --> 00:38:38,680 >> Because this particular packet is going out onto the internet, it also 829 00:38:38,680 --> 00:38:42,570 gets an address for the proxy server, which has a special function, 830 00:38:42,570 --> 00:38:44,410 as we'll see later. 831 00:38:44,410 --> 00:38:50,070 The packet is now launched onto your local area network, or LAN.
832 00:38:50,070 --> 00:38:53,990 This network is used to connect all the local computers, routers, 833 00:38:53,990 --> 00:38:57,940 printers, et cetera for information exchange within the physical walls of 834 00:38:57,940 --> 00:38:59,160 the building. 835 00:38:59,160 --> 00:39:04,130 The LAN is a pretty uncontrolled place, and unfortunately, accidents 836 00:39:04,130 --> 00:39:05,425 can happen. 837 00:39:05,425 --> 00:39:14,460 838 00:39:14,460 --> 00:39:18,050 >> The highway of the LAN is packed with all types of information. 839 00:39:18,050 --> 00:39:22,070 These are IP packets, Novell packets, AppleTalk packets-- 840 00:39:22,070 --> 00:39:24,500 they're going against traffic, as usual. 841 00:39:24,500 --> 00:39:29,250 The local router reads the address and, if necessary, lifts the packet onto 842 00:39:29,250 --> 00:39:31,710 another network. 843 00:39:31,710 --> 00:39:33,570 Ah, the router-- 844 00:39:33,570 --> 00:39:37,490 a symbol of control in a seemingly disorganized world. 845 00:39:37,490 --> 00:39:38,480 >> ROUTER: Whoops, sorry about that. 846 00:39:38,480 --> 00:39:39,965 Let's put this one here, this one here. 847 00:39:39,965 --> 00:39:40,460 This moves here. 848 00:39:40,460 --> 00:39:40,955 This one moves here. 849 00:39:40,955 --> 00:39:41,945 I don't like this one. 850 00:39:41,945 --> 00:39:42,935 Let's move this one. 851 00:39:42,935 --> 00:39:43,925 This one goes here. 852 00:39:43,925 --> 00:39:45,410 [INAUDIBLE] 853 00:39:45,410 --> 00:39:46,400 Put another jangle here. 854 00:39:46,400 --> 00:39:46,895 Let's put this one here. 855 00:39:46,895 --> 00:39:47,885 Nah, I'll go with that. 856 00:39:47,885 --> 00:39:48,700 Let's put that one here. 857 00:39:48,700 --> 00:39:49,930 >> NARRATOR: There he is-- 858 00:39:49,930 --> 00:39:55,770 systematic, uncaring, methodical, conservative, and sometimes, not quite 859 00:39:55,770 --> 00:39:56,975 up to speed. 860 00:39:56,975 --> 00:40:00,090 But at least he is exact, for the most part.
861 00:40:00,090 --> 00:40:01,243 >> ROUTER: Put that one over there. 862 00:40:01,243 --> 00:40:04,694 That one goes there, that one goes there, and this one goes there. 863 00:40:04,694 --> 00:40:05,680 Well, another one goes there. 864 00:40:05,680 --> 00:40:06,173 That goes here. 865 00:40:06,173 --> 00:40:07,423 [INAUDIBLE] 866 00:40:07,423 --> 00:40:14,570 867 00:40:14,570 --> 00:40:18,670 >> NARRATOR: As the packets leave the router, they make their way into the 868 00:40:18,670 --> 00:40:24,090 corporate intranet and head for the router switch. 869 00:40:24,090 --> 00:40:28,120 A bit more efficient than the router, the router switch plays fast and loose 870 00:40:28,120 --> 00:40:31,970 with IP packets, deftly routing them along their way-- 871 00:40:31,970 --> 00:40:34,720 a digital pinball wizard, if you will. 872 00:40:34,720 --> 00:40:35,290 >> ROUTER SWITCH: Here we go. 873 00:40:35,290 --> 00:40:36,020 Here comes another one. 874 00:40:36,020 --> 00:40:36,950 And it's another. 875 00:40:36,950 --> 00:40:37,406 Watch this, mom. 876 00:40:37,406 --> 00:40:38,320 Here it goes. 877 00:40:38,320 --> 00:40:39,235 Whoop, around the back. 878 00:40:39,235 --> 00:40:40,660 Hey, in there, in there. 879 00:40:40,660 --> 00:40:41,135 Over to the left. 880 00:40:41,135 --> 00:40:42,090 Over to the right. 881 00:40:42,090 --> 00:40:42,480 Over to the left. 882 00:40:42,480 --> 00:40:42,820 Over to the right. 883 00:40:42,820 --> 00:40:43,490 You got it. 884 00:40:43,490 --> 00:40:43,800 Here it comes. 885 00:40:43,800 --> 00:40:45,170 He shoots, he scores. 886 00:40:45,170 --> 00:40:45,860 It's going. 887 00:40:45,860 --> 00:40:48,270 Hey Wayne, watch out, here comes another one. 888 00:40:48,270 --> 00:40:49,520 Oh, here we go. 
889 00:40:49,520 --> 00:40:52,920 890 00:40:52,920 --> 00:40:56,330 >> NARRATOR: As packets arrive at their destination, they're picked up by the 891 00:40:56,330 --> 00:41:01,250 network interface, ready to be sent to the next level-- 892 00:41:01,250 --> 00:41:04,340 in this case, the proxy. 893 00:41:04,340 --> 00:41:08,750 The proxy is used by many companies as sort of a middle man in order to 894 00:41:08,750 --> 00:41:11,570 lessen the load on their internet connection, and for 895 00:41:11,570 --> 00:41:15,350 security reasons as well. 896 00:41:15,350 --> 00:41:19,420 As you can see, the packets are all of various sizes, 897 00:41:19,420 --> 00:41:21,770 depending upon their content. 898 00:41:21,770 --> 00:41:37,960 899 00:41:37,960 --> 00:41:45,110 >> The proxy opens the packet and looks for the web address, or URL. 900 00:41:45,110 --> 00:41:49,500 Depending upon whether the address is acceptable, the packet is sent on to 901 00:41:49,500 --> 00:41:50,750 the internet. 902 00:41:50,750 --> 00:41:56,940 903 00:41:56,940 --> 00:42:01,970 >> There are, however, some addresses which do not meet with the approval of 904 00:42:01,970 --> 00:42:03,090 the proxy-- 905 00:42:03,090 --> 00:42:05,893 that is to say, corporate or management guidelines. 906 00:42:05,893 --> 00:42:09,100 907 00:42:09,100 --> 00:42:13,710 These are summarily dealt with. 908 00:42:13,710 --> 00:42:15,620 We'll have none of that. 909 00:42:15,620 --> 00:42:19,227 For those who make it, it's on the road again. 910 00:42:19,227 --> 00:42:29,950 911 00:42:29,950 --> 00:42:32,313 >> Next up, the firewall. 912 00:42:32,313 --> 00:42:36,500 913 00:42:36,500 --> 00:42:40,225 The corporate firewall serves two purposes. 
914 00:42:40,225 --> 00:42:44,350 It prevents some rather nasty things from the internet from coming into the 915 00:42:44,350 --> 00:42:48,460 intranet, and it can also prevent sensitive corporate information from 916 00:42:48,460 --> 00:42:53,380 being sent out onto the internet. 917 00:42:53,380 --> 00:42:57,340 >> Once through the firewall, a router picks up the packet and places it onto 918 00:42:57,340 --> 00:43:01,216 a much narrower road, or bandwidth, as we say. 919 00:43:01,216 --> 00:43:06,830 Obviously, the road is not broad enough to take them all. 920 00:43:06,830 --> 00:43:10,870 >> Now you might wonder what happens to all those packets which don't make it 921 00:43:10,870 --> 00:43:11,950 along the way. 922 00:43:11,950 --> 00:43:16,540 Well, when Mr. IP doesn't receive an acknowledgement that a packet has been 923 00:43:16,540 --> 00:43:22,940 received in due time, he simply sends a replacement packet. 924 00:43:22,940 --> 00:43:29,360 We are now ready to enter the world of the internet, a spider web of 925 00:43:29,360 --> 00:43:33,670 interconnected networks which span our entire globe. 926 00:43:33,670 --> 00:43:39,360 Here, routers and switches establish links between networks. 927 00:43:39,360 --> 00:43:42,740 >> Now the net is an entirely different environment than you'll find within 928 00:43:42,740 --> 00:43:44,900 the protective walls of your LAN. 929 00:43:44,900 --> 00:43:47,340 Out here, it's the Wild West-- 930 00:43:47,340 --> 00:43:50,540 plenty of space, plenty of opportunities, plenty of things to 931 00:43:50,540 --> 00:43:53,130 explore, and places to go. 932 00:43:53,130 --> 00:43:57,620 Thanks to very little control and regulation, new ideas find fertile 933 00:43:57,620 --> 00:44:01,530 soil to push the envelope of their possibilities. 934 00:44:01,530 --> 00:44:05,240 But because of this freedom, certain dangers also lurk. 
935 00:44:05,240 --> 00:44:10,860 You'll never know when you'll meet the dreaded ping of death, a special 936 00:44:10,860 --> 00:44:15,610 version of a normal request ping which some idiot thought up to mess up 937 00:44:15,610 --> 00:44:18,500 unsuspecting hosts. 938 00:44:18,500 --> 00:44:23,760 >> The path our packets take may be via satellite, telephone lines, wireless, 939 00:44:23,760 --> 00:44:25,650 or even trans-oceanic cable. 940 00:44:25,650 --> 00:44:29,860 They don't always take the fastest, or shortest, routes possible. 941 00:44:29,860 --> 00:44:33,560 But they will get there eventually. 942 00:44:33,560 --> 00:44:38,410 Maybe that's why it's sometimes called the world wide wait. 943 00:44:38,410 --> 00:44:42,710 But when everything is working smoothly, you can circumvent the globe 944 00:44:42,710 --> 00:44:47,110 five times over at the drop of a hat, literally-- 945 00:44:47,110 --> 00:44:51,520 and all for the cost of a local call, or less. 946 00:44:51,520 --> 00:44:55,260 >> Near the end of our destination, we'll find another firewall. 947 00:44:55,260 --> 00:44:58,450 948 00:44:58,450 --> 00:45:02,740 Depending upon your perspective as a data packet, the firewall could be a 949 00:45:02,740 --> 00:45:06,930 bastion of security, or a dreaded adversary. 950 00:45:06,930 --> 00:45:11,710 It all depends on which side you're on and what your intentions are. 951 00:45:11,710 --> 00:45:15,590 >> The firewall is designed to let in only those packets 952 00:45:15,590 --> 00:45:18,060 that meet its criteria. 953 00:45:18,060 --> 00:45:22,450 This firewall is operating on ports 80 and 25. 954 00:45:22,450 --> 00:45:26,880 All attempts to enter through other ports are closed for business. 955 00:45:26,880 --> 00:45:40,500 956 00:45:40,500 --> 00:45:48,470 >> Port 25 is used for mail packets, while port 80 is the entrance for 957 00:45:48,470 --> 00:45:50,755 packets from the internet to the web server. 
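The port rule the narrator describes, admitting only traffic bound for ports 80 and 25, might be sketched as follows. This is a minimal illustration, not how any particular firewall is configured: ALLOWED_PORTS, Packet, and admit are hypothetical names, and a real firewall inspects far more than the destination port.

```python
from dataclasses import dataclass

# Assumption: the only open ports are the two named in the film,
# 25 for mail and 80 for the web server.
ALLOWED_PORTS = {25, 80}


@dataclass(frozen=True)
class Packet:
    dest_port: int  # the port the packet is trying to enter through
    payload: bytes  # contents, ignored by this simple check


def admit(packet: Packet) -> bool:
    """Let a packet in only if its destination port is on the allow list."""
    return packet.dest_port in ALLOWED_PORTS
```

With this rule, a packet addressed to port 80 is admitted, while one addressed to any unlisted port, say 23, is turned away.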
958 00:45:50,755 --> 00:45:54,060 959 00:45:54,060 --> 00:45:58,230 Inside the firewall, packets are screened more thoroughly. 960 00:45:58,230 --> 00:46:02,190 Some packets make it easily through customs, while others 961 00:46:02,190 --> 00:46:04,760 look just a bit dubious. 962 00:46:04,760 --> 00:46:08,390 >> The firewall officer is not easily fooled, such as when this ping of 963 00:46:08,390 --> 00:46:14,430 death packet tries to disguise itself as a normal ping packet. 964 00:46:14,430 --> 00:46:14,740 >> FIREWALL: Next. 965 00:46:14,740 --> 00:46:15,214 OK. 966 00:46:15,214 --> 00:46:15,688 Go on. 967 00:46:15,688 --> 00:46:16,162 That's OK. 968 00:46:16,162 --> 00:46:16,636 No problem. 969 00:46:16,636 --> 00:46:17,584 Have a nice day. 970 00:46:17,584 --> 00:46:18,532 Be out here. 971 00:46:18,532 --> 00:46:20,315 Bye. 972 00:46:20,315 --> 00:46:23,870 >> NARRATOR: For those packets lucky enough to make it this far, the 973 00:46:23,870 --> 00:46:25,920 journey is almost over. 974 00:46:25,920 --> 00:46:28,940 975 00:46:28,940 --> 00:46:35,380 It's just a lineup on the interface to be taken up into the web server. 976 00:46:35,380 --> 00:46:40,700 >> Nowadays a web server can run on many things, from a mainframe, to a webcam, 977 00:46:40,700 --> 00:46:41,910 to the computer on your desk. 978 00:46:41,910 --> 00:46:44,630 Or why not your refrigerator? 979 00:46:44,630 --> 00:46:48,750 With the proper setup, you can find out if you have the makings for 980 00:46:48,750 --> 00:46:51,570 chicken cacciatore, or if you have to go shopping. 981 00:46:51,570 --> 00:46:54,870 Remember, this is the dawn of the net. 982 00:46:54,870 --> 00:46:56,360 Almost anything's possible. 983 00:46:56,360 --> 00:47:00,540 984 00:47:00,540 --> 00:47:05,540 >> One by one, the packets are received, opened, and unpacked. 
985 00:47:05,540 --> 00:47:09,550 986 00:47:09,550 --> 00:47:11,900 The information they contain-- 987 00:47:11,900 --> 00:47:14,370 that is, your request for information-- 988 00:47:14,370 --> 00:47:17,520 is sent on to the web server application. 989 00:47:17,520 --> 00:47:24,650 990 00:47:24,650 --> 00:47:33,750 >> The packet itself is recycled, ready to be used again, and filled with your 991 00:47:33,750 --> 00:47:46,830 requested information, addressed, and sent out, on its way back to you, back 992 00:47:46,830 --> 00:47:56,950 past the firewall, routers, and on through to the internet, back through 993 00:47:56,950 --> 00:48:08,430 your corporate firewall, and on to your interface, ready to supply your 994 00:48:08,430 --> 00:48:11,060 web browser with the information you requested-- 995 00:48:11,060 --> 00:48:14,320 996 00:48:14,320 --> 00:48:17,236 that is, this film. 997 00:48:17,236 --> 00:48:22,870 998 00:48:22,870 --> 00:48:27,590 >> Pleased with their efforts and trusting in a better world, our trusty 999 00:48:27,590 --> 00:48:33,840 data packets ride off blissfully into the sunset of another day, knowing 1000 00:48:33,840 --> 00:48:37,135 fully, they have served their masters well. 1001 00:48:37,135 --> 00:48:40,080 1002 00:48:40,080 --> 00:48:43,695 Now isn't that a happy ending? 1003 00:48:43,695 --> 00:48:47,910 1004 00:48:47,910 --> 00:48:49,890 >> SPEAKER 1: That, then, is how the internet works. 1005 00:48:49,890 --> 00:48:53,360 Through problem set seven will you better understand this and will you 1006 00:48:53,360 --> 00:48:55,830 learn a bit of HTML, PHP, and more. 1007 00:48:55,830 --> 00:48:58,590 More on that in the specification that will go out on Friday. 1008 00:48:58,590 --> 00:49:00,310 And we will see you on Monday. 1009 00:49:00,310 --> 00:49:02,763