1 00:00:00,000 --> 00:00:12,610 2 00:00:12,610 --> 00:00:12,900 >> DAVID J. MALAN: All right. 3 00:00:12,900 --> 00:00:16,790 So welcome to the first ever CS50 postmortem for a quiz. 4 00:00:16,790 --> 00:00:18,340 We thought we'd inaugurate this tradition this year. 5 00:00:18,340 --> 00:00:20,960 And this will be an opportunity to walk through the 6 00:00:20,960 --> 00:00:22,220 solutions to the quiz. 7 00:00:22,220 --> 00:00:26,160 And we'll speed up or slow down based on interest of those here. 8 00:00:26,160 --> 00:00:29,730 >> So you're probably here because you're interested in how you could have or 9 00:00:29,730 --> 00:00:31,170 should have answered some of these problems. 10 00:00:31,170 --> 00:00:33,300 So why don't we take a look at this section first? 11 00:00:33,300 --> 00:00:34,450 So getting strings. 12 00:00:34,450 --> 00:00:37,600 This gave you three different versions of a program that was, ultimately, 13 00:00:37,600 --> 00:00:39,650 meant to get a string from a user. 14 00:00:39,650 --> 00:00:42,530 Whether or not it did that was left to you to determine. 15 00:00:42,530 --> 00:00:45,150 >> And we asked in Question 0, suppose that version 1 is 16 00:00:45,150 --> 00:00:46,400 compiled and executed. 17 00:00:46,400 --> 00:00:48,860 Why might the program segfault? 18 00:00:48,860 --> 00:00:51,150 At first glance, any suggestions as to why? 19 00:00:51,150 --> 00:00:54,012 20 00:00:54,012 --> 00:00:54,489 Yeah. 21 00:00:54,489 --> 00:00:59,260 >> AUDIENCE: So I remember seeing this in a previous example of looking at the 22 00:00:59,260 --> 00:01:05,506 char* s and seeing the scan of the s and seeing because it's a pointer, how 23 00:01:05,506 --> 00:01:07,971 did it affect what you scanned in? 24 00:01:07,971 --> 00:01:10,940 Is it s or the address of s? 25 00:01:10,940 --> 00:01:11,180 >> DAVID J. MALAN: OK. 26 00:01:11,180 --> 00:01:11,480 Good. 27 00:01:11,480 --> 00:01:14,830 So ultimately, the source of any problem is presumably going to reduce 28 00:01:14,830 --> 00:01:16,210 to that variable s. 29 00:01:16,210 --> 00:01:17,280 And it's indeed a variable. 30 00:01:17,280 --> 00:01:19,900 The data type of that variable is char*, which means it's going to 31 00:01:19,900 --> 00:01:22,570 contain the address of a character. 32 00:01:22,570 --> 00:01:23,850 And therein lies the insight. 33 00:01:23,850 --> 00:01:28,330 It's going to contain the address of a character or, more generally, the 34 00:01:28,330 --> 00:01:32,110 address of the first character in a whole block of characters. 35 00:01:32,110 --> 00:01:36,680 >> But the catch is that scan s, purpose in life, is given an address and given 36 00:01:36,680 --> 00:01:40,960 a format code, like %s, read a string into the chunk of 37 00:01:40,960 --> 00:01:42,330 memory at that address. 38 00:01:42,330 --> 00:01:46,040 But because there's no equal sign before that semicolon on the first 39 00:01:46,040 --> 00:01:49,310 line of code, because we don't actually allocate any memory with 40 00:01:49,310 --> 00:01:53,020 malloc, because it didn't actually allocate an array of some size, all 41 00:01:53,020 --> 00:01:57,620 you're doing is reading the user's keyboard input into some complete 42 00:01:57,620 --> 00:02:00,490 garbage value, which is in s by default. 43 00:02:00,490 --> 00:02:04,480 So odds are you're going to segfault if that address doesn't just so happen 44 00:02:04,480 --> 00:02:08,009 to be a value that you can, in fact, write to. 45 00:02:08,009 --> 00:02:10,889 So bad not to allocate your memory there. 46 00:02:10,889 --> 00:02:13,150 >> So in question 1, we asked, suppose that version 2 is 47 00:02:13,150 --> 00:02:14,230 compiled and executed. 48 00:02:14,230 --> 00:02:15,900 Why might this program segfault? 49 00:02:15,900 --> 00:02:17,990 So this one is less buggy. 50 00:02:17,990 --> 00:02:21,470 And there's really only one obvious way where you can 51 00:02:21,470 --> 00:02:22,810 trigger a segfault here. 52 00:02:22,810 --> 00:02:23,730 And this is thematic. 53 00:02:23,730 --> 00:02:28,180 Any time we're using c in memory, what could you do to induce a segfault 54 00:02:28,180 --> 00:02:30,718 with version 2? 55 00:02:30,718 --> 00:02:35,560 >> AUDIENCE: If you use that input in a string that's longer than 49 56 00:02:35,560 --> 00:02:35,975 characters. 57 00:02:35,975 --> 00:02:37,260 >> DAVID J. MALAN: Exactly. 58 00:02:37,260 --> 00:02:41,420 Any time you see something fixed length when it comes to an array, your 59 00:02:41,420 --> 00:02:44,650 radar should go off that this could be problematic if you're not checking the 60 00:02:44,650 --> 00:02:45,810 boundaries of an array. 61 00:02:45,810 --> 00:02:46,650 And that's the problem here. 62 00:02:46,650 --> 00:02:47,910 We're still using scanf. 63 00:02:47,910 --> 00:02:52,200 We're still using %s, which means try to read a string from the user. 64 00:02:52,200 --> 00:02:56,300 That's going to be read into s, which, at this point, is effectively the 65 00:02:56,300 --> 00:02:58,570 address of a chunk of memory or it's equivalent. 66 00:02:58,570 --> 00:03:02,080 It's the name of an array of characters of memory. 67 00:03:02,080 --> 00:03:07,610 >> But exactly that, if you read a string that's longer than 49 characters, 49 68 00:03:07,610 --> 00:03:10,440 because you need room for the backslash 0, you're going to overflow 69 00:03:10,440 --> 00:03:11,390 that buffer. 70 00:03:11,390 --> 00:03:16,410 And you might get lucky and be able to write a 51st character, 52nd, 53rd. 71 00:03:16,410 --> 00:03:18,560 But at some point, the OS is going to say, no. 72 00:03:18,560 --> 00:03:21,270 This definitely is not memory you're allowed to touch. 73 00:03:21,270 --> 00:03:23,380 And the program is going to segfault. 74 00:03:23,380 --> 00:03:26,650 >> So there, the heuristics should be any time you've got fixed length, you have 75 00:03:26,650 --> 00:03:30,150 to make sure you're checking the length of whatever it is you're trying 76 00:03:30,150 --> 00:03:31,090 to read into it. 77 00:03:31,090 --> 00:03:35,110 >> AUDIENCE: So to solve that, you could have had a statement checking actually 78 00:03:35,110 --> 00:03:37,140 is the length greater than or less than? 79 00:03:37,140 --> 00:03:37,730 >> DAVID J. MALAN: Absolutely. 80 00:03:37,730 --> 00:03:41,706 You just have a condition that says, if the-- 81 00:03:41,706 --> 00:03:46,080 or rather you don't necessarily know in advance how many characters the 82 00:03:46,080 --> 00:03:49,060 user is going to type, because you have chicken and the egg. 83 00:03:49,060 --> 00:03:51,860 Not until you've read it in with scanf can you figure out how long it is. 84 00:03:51,860 --> 00:03:54,500 But at that point, it's too late, because you've already read it into 85 00:03:54,500 --> 00:03:55,710 some block of memory. 86 00:03:55,710 --> 00:03:59,590 So as an aside, the CS50 library avoids this issue altogether, recall 87 00:03:59,590 --> 00:04:01,060 by using fgetc. 88 00:04:01,060 --> 00:04:05,390 And it reads one character at a time, tip-toeing along, knowing that you 89 00:04:05,390 --> 00:04:08,060 can't overflow a character if you read one at a time. 90 00:04:08,060 --> 00:04:11,580 >> The catch is with getstring recall is that we have to constantly re-size 91 00:04:11,580 --> 00:04:13,590 that chunk of memory, which is just a pain. 92 00:04:13,590 --> 00:04:15,310 It's a lot of lines of code to do that. 93 00:04:15,310 --> 00:04:18,779 So another approach would be to actually use a cousin, so 94 00:04:18,779 --> 00:04:19,790 to speak, of scanf. 95 00:04:19,790 --> 00:04:22,820 There are variants of a lot of these functions that actually check the 96 00:04:22,820 --> 00:04:25,870 length of how many characters you might read maximally. 97 00:04:25,870 --> 00:04:29,430 And you could specify, do not read more than 50 characters. 98 00:04:29,430 --> 00:04:34,110 So that would be another approach but less accommodating of larger inputs. 99 00:04:34,110 --> 00:04:37,040 >> So question 2 asks, suppose that version 3 is compiled and executed. 100 00:04:37,040 --> 00:04:39,960 Why might that program segfault? 101 00:04:39,960 --> 00:04:42,650 So this one is actually the same answer, even though it 102 00:04:42,650 --> 00:04:43,590 looks a little fancier. 103 00:04:43,590 --> 00:04:46,440 We're using malloc, which feels like we're giving ourselves more options. 104 00:04:46,440 --> 00:04:48,030 And then we're freeing that memory at the end. 105 00:04:48,030 --> 00:04:49,580 It's still just 50 bytes of memory. 106 00:04:49,580 --> 00:04:53,620 So we might still try to read in 51, 52, 1,000 bytes. 107 00:04:53,620 --> 00:04:55,830 It's going to segfault for exactly the same reason. 108 00:04:55,830 --> 00:04:57,530 >> But there is another reason too. 109 00:04:57,530 --> 00:05:03,890 What else could malloc return besides the address of a chunk of memory? 110 00:05:03,890 --> 00:05:04,920 It could return null. 111 00:05:04,920 --> 00:05:07,560 And because we're not checking for that, we might be doing something 112 00:05:07,560 --> 00:05:11,350 stupid for another reason, which is that we might be telling scanf, read 113 00:05:11,350 --> 00:05:16,050 the user's input from the keyboard into 0 location, AKA null. 114 00:05:16,050 --> 00:05:18,890 And that, too, will definitely trigger a segfault. 115 00:05:18,890 --> 00:05:21,590 So for the quiz's purpose, we would have accepted either of those as a 116 00:05:21,590 --> 00:05:22,740 valid reason. 117 00:05:22,740 --> 00:05:23,420 One is identical. 118 00:05:23,420 --> 00:05:25,720 One is a little more nuanced. 119 00:05:25,720 --> 00:05:28,975 >> Lastly, with respect to the program's use of memory, how do version 2 and 120 00:05:28,975 --> 00:05:30,350 version 3 differ? 121 00:05:30,350 --> 00:05:35,070 So for what it's worth, we saw a seemingly endless supply of possible 122 00:05:35,070 --> 00:05:35,770 answers to this. 123 00:05:35,770 --> 00:05:39,300 And among people's answers, what we were hoping for, but we accepted other 124 00:05:39,300 --> 00:05:42,250 things, was some mention of the fact that version 2 is using 125 00:05:42,250 --> 00:05:44,560 the so-called stack. 126 00:05:44,560 --> 00:05:46,710 Version 3 is using the heap. 127 00:05:46,710 --> 00:05:50,060 And functionally, this doesn't really make all that much of a difference. 128 00:05:50,060 --> 00:05:54,040 At the end of the day, we're still just getting 50 bytes of memory. 129 00:05:54,040 --> 00:05:56,640 >> But that was one of the possible answers that we were looking at. 130 00:05:56,640 --> 00:05:59,730 But you'll see, as you get your quizzes back from the TFs, that we did 131 00:05:59,730 --> 00:06:04,330 accept other discussions of their disparate uses of memory as well. 132 00:06:04,330 --> 00:06:08,600 But stack and heap would have been an easy answer to go with. 133 00:06:08,600 --> 00:06:11,150 Any questions? 134 00:06:11,150 --> 00:06:12,400 I give you Rob. 135 00:06:12,400 --> 00:06:18,360 136 00:06:18,360 --> 00:06:20,210 >> ROB BOWDEN: So problem 4. 137 00:06:20,210 --> 00:06:21,985 This is the one where you had to fill in the number of bytes out of all 138 00:06:21,985 --> 00:06:23,460 these different types used. 139 00:06:23,460 --> 00:06:24,830 So first thing we see. 140 00:06:24,830 --> 00:06:27,930 Assume a 32-bit architecture, like this CS50 appliance. 141 00:06:27,930 --> 00:06:33,530 So one of the fundamental things about 32-bit architectures, that tells us 142 00:06:33,530 --> 00:06:37,490 exactly how big a pointer is going to be in the architecture. 143 00:06:37,490 --> 00:06:43,020 >> So immediately, we know that any pointer type is 32-bits or 4 bytes. 144 00:06:43,020 --> 00:06:46,010 So looking at this table, a node* is a pointer type. 145 00:06:46,010 --> 00:06:47,250 That's going to be 4 bytes. 146 00:06:47,250 --> 00:06:51,640 Struct node*, that's literally identical to node star. 147 00:06:51,640 --> 00:06:53,590 And so that's going to be 4 bytes. 148 00:06:53,590 --> 00:06:58,270 String, so it doesn't look like a pointer yet, but the typedef , a 149 00:06:58,270 --> 00:07:01,590 string is just a char*, which is a pointer type. 150 00:07:01,590 --> 00:07:03,550 So that's going to be 4 bytes. 151 00:07:03,550 --> 00:07:06,150 >> So these three are all 4 bytes. 152 00:07:06,150 --> 00:07:09,350 Now, node and student are a bit more complicated. 153 00:07:09,350 --> 00:07:15,160 So looking at node and student, we see node as an integer and a pointer. 154 00:07:15,160 --> 00:07:18,050 And student is two pointers inside of it. 155 00:07:18,050 --> 00:07:23,340 So at least for our case here, the way that we end up calculating the size of 156 00:07:23,340 --> 00:07:27,020 this struct is just add up everything that's inside the struct. 157 00:07:27,020 --> 00:07:30,690 >> So for node, we have an integer, which is 4 bytes. 158 00:07:30,690 --> 00:07:32,830 We have a pointer, which is 4 bytes. 159 00:07:32,830 --> 00:07:35,820 And so one node is going to take up 8 bytes. 160 00:07:35,820 --> 00:07:39,490 And similarly for student, we have a pointer that's 4 bytes and another 161 00:07:39,490 --> 00:07:40,770 pointer that's 4 bytes. 162 00:07:40,770 --> 00:07:43,180 So that's going to end up being 8 bytes. 163 00:07:43,180 --> 00:07:45,480 So node and student are 8 bytes. 164 00:07:45,480 --> 00:07:48,950 And these three are all 4 bytes. 165 00:07:48,950 --> 00:07:50,240 Questions on that? 166 00:07:50,240 --> 00:07:54,640 167 00:07:54,640 --> 00:07:54,990 Yes. 168 00:07:54,990 --> 00:07:58,413 >> AUDIENCE: Is it was a 64-bit architecture, would that 169 00:07:58,413 --> 00:07:59,880 double all of them? 170 00:07:59,880 --> 00:08:01,790 >> ROB BOWDEN: It wouldn't double all of them. 171 00:08:01,790 --> 00:08:05,830 So 64-bit architecture, it, again, changes that fundamental thing that a 172 00:08:05,830 --> 00:08:08,910 pointer is now 64 bits. 173 00:08:08,910 --> 00:08:09,290 Yeah. 174 00:08:09,290 --> 00:08:10,930 So a pointer is 8 bytes. 175 00:08:10,930 --> 00:08:15,420 So these that were 4 bytes are going to be 8 bytes. 176 00:08:15,420 --> 00:08:18,617 A student, which was two pointers, well, now it's going to 177 00:08:18,617 --> 00:08:19,800 be 8 bytes, 8 bytes. 178 00:08:19,800 --> 00:08:21,980 It's going to make 16 bytes. 179 00:08:21,980 --> 00:08:25,710 >> But a node is still 4 bytes. 180 00:08:25,710 --> 00:08:27,800 So this pointer is going to be 8 bytes. 181 00:08:27,800 --> 00:08:28,930 This is 4 bytes. 182 00:08:28,930 --> 00:08:30,870 So a node is only going to be 12 bytes. 183 00:08:30,870 --> 00:08:36,309 184 00:08:36,309 --> 00:08:39,280 Any other questions on that one? 185 00:08:39,280 --> 00:08:44,500 So the next one, these are the HTTP status codes. 186 00:08:44,500 --> 00:08:48,000 And you had to describe circumstances under which these might 187 00:08:48,000 --> 00:08:49,810 be returned to you. 188 00:08:49,810 --> 00:08:56,730 one problem that I heard some students have is that they tried to make the 189 00:08:56,730 --> 00:08:58,950 errors be on the client's end. 190 00:08:58,950 --> 00:09:02,320 So when we try to make the request to the server, something goes 191 00:09:02,320 --> 00:09:03,820 wrong on our end. 192 00:09:03,820 --> 00:09:07,660 But generally, these codes are being returned by the server. 193 00:09:07,660 --> 00:09:11,720 So we want to figure out what's going wrong or right on the server that 194 00:09:11,720 --> 00:09:14,280 causes these things to be returned. 195 00:09:14,280 --> 00:09:18,670 So why might a server returns status code 200? 196 00:09:18,670 --> 00:09:19,920 Any thoughts? 197 00:09:19,920 --> 00:09:23,360 198 00:09:23,360 --> 00:09:23,730 >> Yeah. 199 00:09:23,730 --> 00:09:27,850 So something about successfully the request went through. 200 00:09:27,850 --> 00:09:30,260 And they're able to return whatever you asked for. 201 00:09:30,260 --> 00:09:32,240 So everything was fine. 202 00:09:32,240 --> 00:09:35,662 What about 302 found? 203 00:09:35,662 --> 00:09:36,618 Yeah. 204 00:09:36,618 --> 00:09:39,008 >> AUDIENCE: The server was looking for what you requested. 205 00:09:39,008 --> 00:09:40,442 But it couldn't find it. 206 00:09:40,442 --> 00:09:42,850 So there's an error. 207 00:09:42,850 --> 00:09:47,720 >> ROB BOWDEN: So the server was looking for what you wanted. 208 00:09:47,720 --> 00:09:51,682 So just looking here, 302 found, it was able to find it. 209 00:09:51,682 --> 00:09:53,035 >> AUDIENCE: I'm sorry. 210 00:09:53,035 --> 00:09:54,388 Found means that they did find it. 211 00:09:54,388 --> 00:09:55,638 Sorry. 212 00:09:55,638 --> 00:09:58,120 213 00:09:58,120 --> 00:10:00,160 >> ROB BOWDEN: So 302 found. 214 00:10:00,160 --> 00:10:02,350 The server is able to find what you wanted. 215 00:10:02,350 --> 00:10:04,640 >> AUDIENCE: But it's not displaying it? 216 00:10:04,640 --> 00:10:08,180 >> ROB BOWDEN: The difference between this 302 and 200 is that it 217 00:10:08,180 --> 00:10:09,280 knows what you want. 218 00:10:09,280 --> 00:10:12,000 But it isn't exactly where you wanted to ask. 219 00:10:12,000 --> 00:10:14,580 So 302 is a typical redirect. 220 00:10:14,580 --> 00:10:16,510 So you requested a page. 221 00:10:16,510 --> 00:10:19,590 It knows, oh, I want to return you this. 222 00:10:19,590 --> 00:10:21,070 But this is at a different URL. 223 00:10:21,070 --> 00:10:23,534 So hey, you actually want this. 224 00:10:23,534 --> 00:10:26,950 >> DAVID J. MALAN: It's a piece that said that we gave you guys a redirect 225 00:10:26,950 --> 00:10:30,830 function that used the header function that, in turn, printed out location, 226 00:10:30,830 --> 00:10:34,110 colon, and then the URL to which you want to reject the user. 227 00:10:34,110 --> 00:10:37,480 Even though you didn't see 302 explicitly there, that is what PHP 228 00:10:37,480 --> 00:10:41,550 would magically insert as the header saying exactly what Rob said there-- 229 00:10:41,550 --> 00:10:41,930 found. 230 00:10:41,930 --> 00:10:43,180 But go here instead. 231 00:10:43,180 --> 00:10:45,960 232 00:10:45,960 --> 00:10:46,160 >> ROB BOWDEN: OK. 233 00:10:46,160 --> 00:10:47,630 So what about 403 forbidden? 234 00:10:47,630 --> 00:10:52,240 235 00:10:52,240 --> 00:10:57,120 >> AUDIENCE: I think it's that the server is basically saying that the client 236 00:10:57,120 --> 00:10:59,970 can't access the home page. 237 00:10:59,970 --> 00:11:03,260 >> ROB BOWDEN: So yes. 238 00:11:03,260 --> 00:11:07,670 Well, the typical answer we were expecting is something like, the files 239 00:11:07,670 --> 00:11:08,920 aren't chmodded appropriately. 240 00:11:08,920 --> 00:11:11,590 That's probably under what circumstances you saw them. 241 00:11:11,590 --> 00:11:18,920 But there is a reason that the client could be at fault here. 242 00:11:18,920 --> 00:11:20,440 There's actually another status code-- 243 00:11:20,440 --> 00:11:21,210 401. 244 00:11:21,210 --> 00:11:22,820 So these are very similar. 245 00:11:22,820 --> 00:11:24,590 >> 401 is unauthorized. 246 00:11:24,590 --> 00:11:26,130 And 403 is forbidden. 247 00:11:26,130 --> 00:11:31,890 And so unauthorized you exclusively get if you're not logged in. 248 00:11:31,890 --> 00:11:34,520 But logging in might mean that you are authorized. 249 00:11:34,520 --> 00:11:37,930 But if you're already logged in and you still don't have permission, then 250 00:11:37,930 --> 00:11:40,140 you can also get forbidden. 251 00:11:40,140 --> 00:11:45,320 So if you are logged in and do not have permission, forbidden is also 252 00:11:45,320 --> 00:11:47,164 something you can get. 253 00:11:47,164 --> 00:11:48,900 >> DAVID J. MALAN: And the mechanism by which these problems are usually 254 00:11:48,900 --> 00:11:53,100 solved on the server is via what command? 255 00:11:53,100 --> 00:11:57,700 Chmod, if it's, indeed, a permissions issue on the file or directory. 256 00:11:57,700 --> 00:11:59,220 >> ROB BOWDEN: Then 404 not found. 257 00:11:59,220 --> 00:12:03,100 258 00:12:03,100 --> 00:12:03,470 Yeah. 259 00:12:03,470 --> 00:12:10,150 So unlike 302 where it wasn't exactly where you're asking but it knows what 260 00:12:10,150 --> 00:12:12,710 you want, this, it just has no idea what you want. 261 00:12:12,710 --> 00:12:15,648 And you are not requesting something valid. 262 00:12:15,648 --> 00:12:18,580 263 00:12:18,580 --> 00:12:22,310 418 I'm a teapot and then 500 internal server. 264 00:12:22,310 --> 00:12:24,870 So why might you get that? 265 00:12:24,870 --> 00:12:26,120 >> So segfault-- 266 00:12:26,120 --> 00:12:28,760 267 00:12:28,760 --> 00:12:30,640 I actually don't know the grading standard for this. 268 00:12:30,640 --> 00:12:34,850 But if your PHP code had something wrong in it, in theory, it could 269 00:12:34,850 --> 00:12:39,650 actually segfault, in which case, this 500 internal server error, something 270 00:12:39,650 --> 00:12:41,400 is wrong with your server's configuration. 271 00:12:41,400 --> 00:12:44,320 Or there's a syntax error in your PHP code. 272 00:12:44,320 --> 00:12:46,095 Or something bad is going on. 273 00:12:46,095 --> 00:12:48,320 >> DAVID J. MALAN: We did see segfault among a few people's answers. 274 00:12:48,320 --> 00:12:49,490 And technically, it could happen. 275 00:12:49,490 --> 00:12:53,820 But that would be a PHP, the program written by other people, actually 276 00:12:53,820 --> 00:12:57,790 segfaulted, which only if those people screwed up and wrote buggy code in 277 00:12:57,790 --> 00:13:00,680 their interpreter would PHP itself segfault. 278 00:13:00,680 --> 00:13:06,460 So even though 500 is like a segfault in spirit, it's almost always the 279 00:13:06,460 --> 00:13:10,490 result of a configuration file issue with your web server or, as Rob said, 280 00:13:10,490 --> 00:13:13,200 a syntax error, like you didn't close a quote. 281 00:13:13,200 --> 00:13:16,180 Or you lost a semicolon somewhere. 282 00:13:16,180 --> 00:13:23,677 >> AUDIENCE: So for the Shuttle pset, I think when I did it once I clicked the 283 00:13:23,677 --> 00:13:26,300 browser, but nothing came up, what they called white page. 284 00:13:26,300 --> 00:13:28,056 But it was because of the code. 285 00:13:28,056 --> 00:13:29,440 I think that was JavaScript, right? 286 00:13:29,440 --> 00:13:29,770 >> ROB BOWDEN: Yeah. 287 00:13:29,770 --> 00:13:31,180 >> AUDIENCE: Would that error still come up? 288 00:13:31,180 --> 00:13:34,290 >> ROB BOWDEN: So you wouldn't have gotten this error because everything 289 00:13:34,290 --> 00:13:36,930 from the web server's perspective was completely fine. 290 00:13:36,930 --> 00:13:39,090 But you requested index.html. 291 00:13:39,090 --> 00:13:42,000 You requested shuttle.js and service.js. 292 00:13:42,000 --> 00:13:44,580 And it was able to successfully return to you all of those things-- 293 00:13:44,580 --> 00:13:44,980 200. 294 00:13:44,980 --> 00:13:45,680 OK. 295 00:13:45,680 --> 00:13:49,330 It's only when your browser tried to interpret the JavaScript code that 296 00:13:49,330 --> 00:13:51,370 it's like, wait, this is not valid JavaScript error. 297 00:13:51,370 --> 00:13:55,720 298 00:13:55,720 --> 00:13:58,210 Any other questions? 299 00:13:58,210 --> 00:14:00,750 All right. 300 00:14:00,750 --> 00:14:04,120 >> DAVID J. MALAN: So next up was number 11. 301 00:14:04,120 --> 00:14:07,610 And 11 was the scariest for a lot of people. 302 00:14:07,610 --> 00:14:14,620 303 00:14:14,620 --> 00:14:18,570 So the most important thing to note here was that this was, indeed, about 304 00:14:18,570 --> 00:14:19,840 a doubly linked list. 305 00:14:19,840 --> 00:14:23,160 But this was not the same as last year's doubly linked list problem, 306 00:14:23,160 --> 00:14:27,170 which did not give you the caveat that the list could, in fact, be unsorted. 307 00:14:27,170 --> 00:14:29,640 >> So the fact that the list was unsorted and the fact that that word was 308 00:14:29,640 --> 00:14:32,930 underlined there was meant to convey that this is actually a simplification 309 00:14:32,930 --> 00:14:35,430 of what otherwise would have been a more challenging problem 310 00:14:35,430 --> 00:14:36,600 and a longer one. 311 00:14:36,600 --> 00:14:40,760 So a common mistake here was to have put last year's solution on your one 312 00:14:40,760 --> 00:14:45,580 pager and then just blindly copy that down as the answer, which is the right 313 00:14:45,580 --> 00:14:48,520 answer to a different question similar in spirit. 314 00:14:48,520 --> 00:14:51,340 But the subtleties here were as follows. 315 00:14:51,340 --> 00:14:55,200 >> So one, we have a node declared and defined in the usual way here. 316 00:14:55,200 --> 00:14:59,230 Then we defined list of be a global pointer initialized to null. 317 00:14:59,230 --> 00:15:02,150 Then apparently, there's two functions we have prototypes for here, insert 318 00:15:02,150 --> 00:15:03,240 and remove. 319 00:15:03,240 --> 00:15:06,600 And then we have some sample code here of doing a bunch of insertions. 320 00:15:06,600 --> 00:15:09,930 And then we ask you to complete the implementation of insert below in such 321 00:15:09,930 --> 00:15:14,380 a way that it inserts n into the list in constant time, also underlined, 322 00:15:14,380 --> 00:15:15,730 even if already present. 323 00:15:15,730 --> 00:15:20,600 >> So the beauty of being able to insert in constant time is that it implies 324 00:15:20,600 --> 00:15:23,060 that you have to insert the new node where? 325 00:15:23,060 --> 00:15:23,690 Into the front. 326 00:15:23,690 --> 00:15:27,760 So it eliminates, thankfully, at least one of the cases that used to require 327 00:15:27,760 --> 00:15:30,520 even more lines of code, like it did last year and even in class when we 328 00:15:30,520 --> 00:15:34,040 talked through this kind of thing with humans and with some 329 00:15:34,040 --> 00:15:35,250 verbal pseudo code. 330 00:15:35,250 --> 00:15:39,190 So in the solution here, let's skip over to that just to have a visual on 331 00:15:39,190 --> 00:15:40,480 the screen. 332 00:15:40,480 --> 00:15:42,230 >> Notice that we're doing the following. 333 00:15:42,230 --> 00:15:45,140 And also notice the other simplification was that even if it's 334 00:15:45,140 --> 00:15:48,280 already present, so this means even if the number is already there, you can 335 00:15:48,280 --> 00:15:50,280 just blindly insert another copy of it. 336 00:15:50,280 --> 00:15:52,560 And that, too, was meant to be a simplification, so that you could 337 00:15:52,560 --> 00:15:54,940 focus on, really, some of the more intellectually interesting part and 338 00:15:54,940 --> 00:15:58,090 not just some additional error checking given the limited time. 339 00:15:58,090 --> 00:16:02,880 >> So in this sample solution, we allocate a pointer on the left-hand 340 00:16:02,880 --> 00:16:04,510 side here to a node. 341 00:16:04,510 --> 00:16:07,190 Now, realize that pointer, as Rob said, is only 32 bits. 342 00:16:07,190 --> 00:16:09,060 And it doesn't actually contain an address until you 343 00:16:09,060 --> 00:16:09,970 assign it the address. 344 00:16:09,970 --> 00:16:13,220 And we do that on the right-hand side via malloc. 345 00:16:13,220 --> 00:16:16,550 Like a good citizen, we check that malloc is not, in fact, null, so that 346 00:16:16,550 --> 00:16:18,690 we don't accidentally create a segfault here. 347 00:16:18,690 --> 00:16:22,840 And any time you use malloc in life, you should be checking for null, lest 348 00:16:22,840 --> 00:16:24,090 you have a subtle bug. 349 00:16:24,090 --> 00:16:28,460 >> Then we initialize that null by assigning n and previous and next. 350 00:16:28,460 --> 00:16:32,450 And in this case here, I initialized previous to null, because this new 351 00:16:32,450 --> 00:16:34,780 node is going to be the new beginning of my list. 352 00:16:34,780 --> 00:16:37,050 So there's going to be nothing before it. 353 00:16:37,050 --> 00:16:42,010 And I want to essentially append the existing list to the new node by 354 00:16:42,010 --> 00:16:44,700 setting next equal to list itself. 355 00:16:44,700 --> 00:16:47,120 But I'm not done just yet. 356 00:16:47,120 --> 00:16:51,780 So if the list itself already existed, and there was at least one node 357 00:16:51,780 --> 00:16:57,070 already in place, if this is the list here and I insert a new node here, I 358 00:16:57,070 --> 00:17:01,840 need to make sure that my former node points backwards to my new node, 359 00:17:01,840 --> 00:17:04,260 because this is, again, a doubly linked list. 360 00:17:04,260 --> 00:17:05,460 >> So we do a sanity check. 361 00:17:05,460 --> 00:17:10,109 If list is not null, if there's already one or more nodes there, then 362 00:17:10,109 --> 00:17:12,470 add that back reference so to speak. 363 00:17:12,470 --> 00:17:15,420 And then the very last thing we need to do is actually update the global 364 00:17:15,420 --> 00:17:20,329 variable list itself to point to that new node. 365 00:17:20,329 --> 00:17:21,790 Yeah. 366 00:17:21,790 --> 00:17:26,579 >> AUDIENCE: In the pointer arrow [INAUDIBLE] equals null, does that 367 00:17:26,579 --> 00:17:30,420 deal with the list because the list is null? 368 00:17:30,420 --> 00:17:30,596 >> DAVID J. MALAN: Nope. 369 00:17:30,596 --> 00:17:34,500 That is simply me being proactively careful, in that if this is my 370 00:17:34,500 --> 00:17:38,730 original list with maybe some more nodes over here and I'm inserting my 371 00:17:38,730 --> 00:17:42,380 new node over here, there's going to be nothing over here. 372 00:17:42,380 --> 00:17:44,720 And I want to capture that idea by setting previous to 373 00:17:44,720 --> 00:17:47,740 null on the new node. 374 00:17:47,740 --> 00:17:51,410 And presumably, if my code is correct and there's no other way to insert 375 00:17:51,410 --> 00:17:54,970 nodes other than this function, presumably, even if list already has 376 00:17:54,970 --> 00:18:00,090 one or more nodes in it, presumably the list, the first node, would have a 377 00:18:00,090 --> 00:18:02,750 previous pointer of null itself. 378 00:18:02,750 --> 00:18:03,550 >> AUDIENCE: And just a follow-up. 379 00:18:03,550 --> 00:18:08,139 The reason you put pointer next equals list is you're making the pointer 380 00:18:08,139 --> 00:18:13,579 before list in that it's pointing to the next, I guess-- 381 00:18:13,579 --> 00:18:14,980 I don't-- 382 00:18:14,980 --> 00:18:15,450 just lists? 383 00:18:15,450 --> 00:18:16,400 >> DAVID J. MALAN: Exactly. 384 00:18:16,400 --> 00:18:19,400 And so let's actually consider two cases here really, even though the 385 00:18:19,400 --> 00:18:22,070 order we'll consider them isn't quite the same as the code. 386 00:18:22,070 --> 00:18:26,250 But on a high level, if this represents list and this is a 32-bit 387 00:18:26,250 --> 00:18:29,560 pointer, the simplest scenario is that this is null by default. 388 00:18:29,560 --> 00:18:33,010 And suppose I want to insert the number 50 was the first number. 389 00:18:33,010 --> 00:18:37,640 So I'm going to go ahead and allocate a node, which is going to contain 390 00:18:37,640 --> 00:18:38,770 three fields-- 391 00:18:38,770 --> 00:18:42,070 n, previous, and next. 392 00:18:42,070 --> 00:18:44,580 >> I'm going to put the number 50 here, because this will be n. 393 00:18:44,580 --> 00:18:46,130 This will be next. 394 00:18:46,130 --> 00:18:48,530 And this will be previous. 395 00:18:48,530 --> 00:18:50,910 And so what do I do in this case? 396 00:18:50,910 --> 00:18:53,900 Well, I've just done line 1 here. 397 00:18:53,900 --> 00:18:55,400 Pointer n gets n. 398 00:18:55,400 --> 00:18:57,740 I'm then saying, previous should get null. 399 00:18:57,740 --> 00:18:59,470 So this is going to be null. 400 00:18:59,470 --> 00:19:01,365 Then I'm going to say next is going to get list. 401 00:19:01,365 --> 00:19:05,150 >> And this just works out well. 402 00:19:05,150 --> 00:19:06,500 This is null. 403 00:19:06,500 --> 00:19:10,620 And so I'm saying, the new node's next field should get whatever this is. 404 00:19:10,620 --> 00:19:12,570 So that puts another null there. 405 00:19:12,570 --> 00:19:14,510 And then the last thing I do is check here. 406 00:19:14,510 --> 00:19:17,870 If list is not equal to null, but it is equal to null, so we skip that 407 00:19:17,870 --> 00:19:18,470 altogether. 408 00:19:18,470 --> 00:19:23,520 And so all I do next is list gets pointer, which pictorially results in 409 00:19:23,520 --> 00:19:25,570 a picture like that. 410 00:19:25,570 --> 00:19:26,620 So that's one scenario. 411 00:19:26,620 --> 00:19:30,490 >> And the one that you were asking about specifically is a situation like this, 412 00:19:30,490 --> 00:19:33,190 where we already have a one-node list. 413 00:19:33,190 --> 00:19:36,240 And if I go back up in the original problem statement, the next we'll 414 00:19:36,240 --> 00:19:39,320 insert say is 34, just for the sake of discussion. 415 00:19:39,320 --> 00:19:46,210 So I'm going to just conveniently draw that over here. 416 00:19:46,210 --> 00:19:47,540 I've just malloced. 417 00:19:47,540 --> 00:19:49,310 Let's assume I'm checking for null. 418 00:19:49,310 --> 00:19:51,870 >> Now, I'm going to initialize n to be 34. 419 00:19:51,870 --> 00:19:53,040 And this will be n. 420 00:19:53,040 --> 00:19:54,670 This will be next. 421 00:19:54,670 --> 00:19:57,100 And this will be previous. 422 00:19:57,100 --> 00:19:59,370 Let's make sure I didn't get this backwards. 423 00:19:59,370 --> 00:20:01,110 Previous comes first in the definition. 424 00:20:01,110 --> 00:20:03,070 Let me fix this. 425 00:20:03,070 --> 00:20:04,410 This is previous. 426 00:20:04,410 --> 00:20:05,780 This is next. 427 00:20:05,780 --> 00:20:08,620 Even though these are identical, let's keep it consistent. 428 00:20:08,620 --> 00:20:09,450 >> Previous. 429 00:20:09,450 --> 00:20:11,030 This is next. 430 00:20:11,030 --> 00:20:16,310 So I've just malloced my note, checked for null, assigned 34 into the node. 431 00:20:16,310 --> 00:20:17,570 Previous gets null. 432 00:20:17,570 --> 00:20:19,480 So that gives me that. 433 00:20:19,480 --> 00:20:21,010 Next gets list. 434 00:20:21,010 --> 00:20:22,370 So list is this. 435 00:20:22,370 --> 00:20:26,520 So this is the same now as drawing this arrow, so that they point to one 436 00:20:26,520 --> 00:20:27,940 in the same. 437 00:20:27,940 --> 00:20:30,400 And then I'm checking if list is not equal to null. 438 00:20:30,400 --> 00:20:31,740 And it's not this time. 439 00:20:31,740 --> 00:20:35,580 Then I'm going to do list previous gets pointer. 440 00:20:35,580 --> 00:20:39,700 >> So list previous gets PTR. 441 00:20:39,700 --> 00:20:44,300 So this has the effect of putting a graphical arrow here. 442 00:20:44,300 --> 00:20:46,930 And that's getting a little wavy, the lines. 443 00:20:46,930 --> 00:20:50,780 And then, lastly, I update list to point to pointer. 444 00:20:50,780 --> 00:20:55,560 So now this points to this guy. 445 00:20:55,560 --> 00:20:57,170 And now, let's do a quick sanity check. 446 00:20:57,170 --> 00:20:59,470 >> Here's the list, which is the global variable. 447 00:20:59,470 --> 00:21:02,850 The first node is, indeed, 34, because I'm following that arrow. 448 00:21:02,850 --> 00:21:05,210 And that's correct because I want to insert at the beginning of the list 449 00:21:05,210 --> 00:21:06,070 all new nodes. 450 00:21:06,070 --> 00:21:08,860 His next field leads me to this guy. 451 00:21:08,860 --> 00:21:10,710 If I keep going, I hit next is null. 452 00:21:10,710 --> 00:21:11,760 So there's no more list. 453 00:21:11,760 --> 00:21:14,460 If I hit previous, I get back where I expect. 454 00:21:14,460 --> 00:21:16,435 >> So there are still a few pointers, obviously, to manipulate. 455 00:21:16,435 --> 00:21:19,870 But the fact that you were told to do this in constant time means you only 456 00:21:19,870 --> 00:21:22,910 have a finite number of things you're allowed to do. 457 00:21:22,910 --> 00:21:24,290 And what is that number? 458 00:21:24,290 --> 00:21:25,185 It might be one step. 459 00:21:25,185 --> 00:21:25,700 It might be two. 460 00:21:25,700 --> 00:21:26,820 It might be 1,000 steps. 461 00:21:26,820 --> 00:21:30,500 But it's finite, which means you can't have any kind of looping going on 462 00:21:30,500 --> 00:21:32,010 here, no recursion, no loops. 463 00:21:32,010 --> 00:21:37,390 It's just got to be hard-coded lines of code as we have in this sample. 464 00:21:37,390 --> 00:21:42,330 >> So the next problem 12 asked us to complete the implementation of remove 465 00:21:42,330 --> 00:21:46,740 below in such a way that it removes n from the list in linear time. 466 00:21:46,740 --> 00:21:48,740 So you have a little more wiggle room now. 467 00:21:48,740 --> 00:21:52,380 You may assume that n, if present in the list, will be present 468 00:21:52,380 --> 00:21:53,340 no more than once. 469 00:21:53,340 --> 00:21:56,770 And that too is meant to be a quiz-based simplifying assumption, so 470 00:21:56,770 --> 00:21:59,780 that if you find the number 50 somewhere in the list, you don't also 471 00:21:59,780 --> 00:22:02,890 have to worry about continuing to iterate, looking for every possible 472 00:22:02,890 --> 00:22:06,990 copy of 50, which would just devolve into some minutia in limited time. 473 00:22:06,990 --> 00:22:10,460 >> So with remove, this one was definitely more challenging and more 474 00:22:10,460 --> 00:22:11,640 code to write. 475 00:22:11,640 --> 00:22:14,990 But at first glance, frankly, it might look overwhelming and like something 476 00:22:14,990 --> 00:22:17,060 there's no way you could have come up with on a quiz. 477 00:22:17,060 --> 00:22:22,450 But if we focus on the individual steps, hopefully, it will suddenly 478 00:22:22,450 --> 00:22:26,060 strike you that each of these individual steps makes obvious sense 479 00:22:26,060 --> 00:22:27,080 in retrospect. 480 00:22:27,080 --> 00:22:28,200 So let's take a look. 481 00:22:28,200 --> 00:22:32,570 >> So first, we initialize pointer to be list itself. 482 00:22:32,570 --> 00:22:36,040 Because I want linear time, that means I'm going to have some loop. 483 00:22:36,040 --> 00:22:39,730 And a common way to iterate over the nodes in a list structure or any kind 484 00:22:39,730 --> 00:22:43,860 of structure iteratively is to take a pointer to the front of the data 485 00:22:43,860 --> 00:22:46,990 structure and then just start updating it and walk your way 486 00:22:46,990 --> 00:22:48,650 through the data structure. 487 00:22:48,650 --> 00:22:50,040 So I'm going to do exactly that. 488 00:22:50,040 --> 00:22:54,260 >> While pointer, my temporary variable, is not equal to null, let's 489 00:22:54,260 --> 00:22:55,660 go ahead and check. 490 00:22:55,660 --> 00:22:56,910 Did I get lucky? 491 00:22:56,910 --> 00:23:01,740 Is the n field in the node I'm currently looking at equal to the 492 00:23:01,740 --> 00:23:03,380 number I'm looking for? 493 00:23:03,380 --> 00:23:05,410 And if so, let's do something. 494 00:23:05,410 --> 00:23:10,020 Now, notice this if condition surrounds the entire 495 00:23:10,020 --> 00:23:11,520 following lines of code. 496 00:23:11,520 --> 00:23:14,610 This is the only thing I care about-- finding a number in question. 497 00:23:14,610 --> 00:23:18,010 So there's no else, which simplifies things conceptually a little bit. 498 00:23:18,010 --> 00:23:22,040 >> But now, I realized, and you might have only realized this after thinking 499 00:23:22,040 --> 00:23:24,720 it through a bit, there's actually two cases here. 500 00:23:24,720 --> 00:23:28,060 One is where the node is at the beginning of the list, which is a 501 00:23:28,060 --> 00:23:31,040 little annoying, because that's a special case, because you have to deal 502 00:23:31,040 --> 00:23:33,340 with this thing, which is the only anomaly. 503 00:23:33,340 --> 00:23:35,720 Everywhere else in the list, it's the same thing. 504 00:23:35,720 --> 00:23:38,050 There's a previous node and a next node, previous node, next node. 505 00:23:38,050 --> 00:23:40,940 But this guy is a little special if he's at the beginning. 506 00:23:40,940 --> 00:23:48,710 >> So if the pointer equals the list itself, so if I'm at the beginning of 507 00:23:48,710 --> 00:23:53,960 the list and I have found n, I need to do a couple of things. 508 00:23:53,960 --> 00:23:59,230 One, I need to change list to point to the next field, 50. 509 00:23:59,230 --> 00:24:01,270 So suppose that I'm trying to remove 34. 510 00:24:01,270 --> 00:24:03,560 So this guy's got to go away in just a moment. 511 00:24:03,560 --> 00:24:07,210 >> So I'm going to say, list gets pointer next. 512 00:24:07,210 --> 00:24:08,570 Well, this is pointer. 513 00:24:08,570 --> 00:24:10,360 Next is pointing over here. 514 00:24:10,360 --> 00:24:17,470 So this is changing this arrow right now to point to this guy here. 515 00:24:17,470 --> 00:24:19,580 Now, remember, we have a temporary variable. 516 00:24:19,580 --> 00:24:23,520 So we haven't orphaned any nodes, because I also have this guy in my 517 00:24:23,520 --> 00:24:25,010 implementation of remove. 518 00:24:25,010 --> 00:24:29,600 So now, if list itself is not null, I need to fix a little something. 519 00:24:29,600 --> 00:24:32,690 >> I need to now make sure that this arrow, which is previously pointing 520 00:24:32,690 --> 00:24:36,830 from 50 to 34, this has got to go away, because if I'm trying to get rid 521 00:24:36,830 --> 00:24:41,910 of 34, 50 had better not maintain any kind of back reference to it as the 522 00:24:41,910 --> 00:24:42,820 arrow suggested. 523 00:24:42,820 --> 00:24:44,820 So I just did this line. 524 00:24:44,820 --> 00:24:46,520 So then I'm done. 525 00:24:46,520 --> 00:24:48,040 That case is actually pretty easy. 526 00:24:48,040 --> 00:24:51,010 Chopping off the head of the list is relatively straightforward. 527 00:24:51,010 --> 00:24:52,980 >> Unfortunately, there's this annoying else block. 528 00:24:52,980 --> 00:24:56,170 So now, I have to consider the case where there's something in the middle. 529 00:24:56,170 --> 00:24:59,880 But it's not too terrible, except for syntax like this. 530 00:24:59,880 --> 00:25:03,080 So if I'm not at the beginning of the list, I'm somewhere in the middle. 531 00:25:03,080 --> 00:25:08,160 And this line here is saying, start at whatever node you're at. 532 00:25:08,160 --> 00:25:11,210 533 00:25:11,210 --> 00:25:18,550 Go to the previous node's next field and point that at the pointer. 534 00:25:18,550 --> 00:25:20,390 >> Let's do this pictorially. 535 00:25:20,390 --> 00:25:21,640 That was getting complicated. 536 00:25:21,640 --> 00:25:30,480 537 00:25:30,480 --> 00:25:37,990 So if I have a previous fields here-- let's do this-- next fields here. 538 00:25:37,990 --> 00:25:41,200 I'm going to simplify my pointers rather than draw a whole bunch of 539 00:25:41,200 --> 00:25:45,710 things back and forth crisscrossing each other. 540 00:25:45,710 --> 00:25:50,870 And now, let's just say this is 1, 2, 3 for the sake of discussion, even 541 00:25:50,870 --> 00:25:53,410 though that doesn't line up with the problem in question. 542 00:25:53,410 --> 00:25:55,900 >> So here's my linked list. 543 00:25:55,900 --> 00:25:59,300 I am trying to remove two in this particular version of the story. 544 00:25:59,300 --> 00:26:01,960 So I've updated pointer to be pointing to this guy. 545 00:26:01,960 --> 00:26:03,315 So this is PTR. 546 00:26:03,315 --> 00:26:04,530 He's pointing here. 547 00:26:04,530 --> 00:26:07,170 This is list, which exists globally as before. 548 00:26:07,170 --> 00:26:09,200 And he's pointing here no matter what. 549 00:26:09,200 --> 00:26:10,800 And now, I'm trying to remove two. 550 00:26:10,800 --> 00:26:13,850 >> So if pointer is pointing here, I'm going to follow, apparently, the 551 00:26:13,850 --> 00:26:17,110 previous pointer, which puts me at 1. 552 00:26:17,110 --> 00:26:22,290 I'm then going to say that the next field, which brings me over to this 553 00:26:22,290 --> 00:26:25,410 box here, is going to equal pointer next. 554 00:26:25,410 --> 00:26:28,400 So if this pointer, this is next. 555 00:26:28,400 --> 00:26:31,840 That means that this arrow needs to point to this guy. 556 00:26:31,840 --> 00:26:35,140 >> So what that line of code has just done is a little bit of this. 557 00:26:35,140 --> 00:26:37,500 And now, this is looking like a step in the right direction. 558 00:26:37,500 --> 00:26:41,390 We essentially want to snip 2 out of the middle of 1 and 3. 559 00:26:41,390 --> 00:26:44,400 So it makes sense that we want to route this pointer around it. 560 00:26:44,400 --> 00:26:50,400 So this next line is checking if pointer next is not null, there's 561 00:26:50,400 --> 00:26:54,200 indeed someone to the right of 2, that means we also have to do 562 00:26:54,200 --> 00:26:55,850 a little snip here. 563 00:26:55,850 --> 00:27:00,590 >> So I now need to follow this pointer and update the previous pointer on 564 00:27:00,590 --> 00:27:05,410 this guy to do a little bit of a workaround here the point here. 565 00:27:05,410 --> 00:27:07,100 And now, visually this is nice. 566 00:27:07,100 --> 00:27:11,930 It's a little messy in that there's no one pointing at the 2 anymore. 567 00:27:11,930 --> 00:27:13,600 2 is pointing to the left. 568 00:27:13,600 --> 00:27:14,980 And 2 is pointing to the right. 569 00:27:14,980 --> 00:27:17,480 But he can do whatever he wants, because he's about to get freed. 570 00:27:17,480 --> 00:27:19,480 And it doesn't matter what those values are anymore. 571 00:27:19,480 --> 00:27:23,040 >> What's important is that the remaining guys are routing above 572 00:27:23,040 --> 00:27:24,280 and below him now. 573 00:27:24,280 --> 00:27:25,810 And indeed, that's what we do next. 574 00:27:25,810 --> 00:27:29,360 We free pointer, which means we tell the operating system, you are welcome 575 00:27:29,360 --> 00:27:30,906 to reclaim this. 576 00:27:30,906 --> 00:27:34,900 And then lastly, we return. 577 00:27:34,900 --> 00:27:37,220 Else implicitly, if we haven't returned yet, 578 00:27:37,220 --> 00:27:38,290 we've got to keep looking. 579 00:27:38,290 --> 00:27:41,485 So pointer equals pointer next just means move this guy here. 580 00:27:41,485 --> 00:27:42,600 Move this guy here. 581 00:27:42,600 --> 00:27:45,400 Move this guy here if, in fact, we didn't find the number 582 00:27:45,400 --> 00:27:46,960 we're looking for yet. 583 00:27:46,960 --> 00:27:49,630 >> So frankly, it looks completely overwhelming, I think, at first 584 00:27:49,630 --> 00:27:52,180 glance, especially if you struggled with this during the quiz then see 585 00:27:52,180 --> 00:27:52,850 something like this. 586 00:27:52,850 --> 00:27:55,050 And you pat yourself on the back. 587 00:27:55,050 --> 00:27:57,080 Well, there's no way I could have come up with that on the quiz. 588 00:27:57,080 --> 00:28:00,470 But I would argue, you can if you break it down into these individual 589 00:28:00,470 --> 00:28:04,400 cases and just walk through it carefully, albeit, admittedly, under 590 00:28:04,400 --> 00:28:06,300 stressful circumstances. 591 00:28:06,300 --> 00:28:09,470 >> Thankfully, the picture made everything happier. 592 00:28:09,470 --> 00:28:11,050 You could draw this in any number of ways. 593 00:28:11,050 --> 00:28:12,760 You don't have to do the crisscrossing thing here. 594 00:28:12,760 --> 00:28:14,520 You could do it with straight lines like this. 595 00:28:14,520 --> 00:28:18,790 But the gist of this problem, in general, was to realize that the 596 00:28:18,790 --> 00:28:22,060 picture in the end should look a little something like this, because 597 00:28:22,060 --> 00:28:25,030 constant time implied that you keep jamming and jamming and jamming the 598 00:28:25,030 --> 00:28:29,900 new nodes at the beginning of the list. 599 00:28:29,900 --> 00:28:31,960 Any questions? 600 00:28:31,960 --> 00:28:34,565 Probably the most challenging of certainly the coding questions. 601 00:28:34,565 --> 00:28:37,690 >> AUDIENCE: So is list similar to head in previous examples. 602 00:28:37,690 --> 00:28:39,640 >> DAVID J. MALAN: Exactly, exactly. 603 00:28:39,640 --> 00:28:43,130 Just a different name for a global variable. 604 00:28:43,130 --> 00:28:44,380 World wide what? 605 00:28:44,380 --> 00:28:48,880 606 00:28:48,880 --> 00:28:49,730 >> ROB BOWDEN: OK. 607 00:28:49,730 --> 00:28:52,020 So this is the one where you had to write the paragraph. 608 00:28:52,020 --> 00:28:56,060 Some people wrote essays for this question. 609 00:28:56,060 --> 00:29:00,230 But you just need to use these six terms to describe what happens when 610 00:29:00,230 --> 00:29:02,440 you try to contact facebook.com. 611 00:29:02,440 --> 00:29:07,930 So I'll just talk through the process using all these terms. 612 00:29:07,930 --> 00:29:11,290 So in our browser, we type facebook.com and hit Enter. 613 00:29:11,290 --> 00:29:17,280 So our browser's going to construct an HTTP request that it's going to send 614 00:29:17,280 --> 00:29:22,220 through some process to Facebook for Facebook to respond to us with the 615 00:29:22,220 --> 00:29:24,450 HTML of its page. 616 00:29:24,450 --> 00:29:28,800 >> So what is the process by which the HTTP request 617 00:29:28,800 --> 00:29:30,730 actually gets to Facebook? 618 00:29:30,730 --> 00:29:32,790 So first, we need to translate Facebook.com. 619 00:29:32,790 --> 00:29:38,780 So just given the name Facebook.com, where actually does the HTTP request 620 00:29:38,780 --> 00:29:39,940 need to go? 621 00:29:39,940 --> 00:29:44,120 So we need to translate Facebook.com to an IP address, which uniquely 622 00:29:44,120 --> 00:29:47,620 identifies what machine we actually want to send this request to. 623 00:29:47,620 --> 00:29:49,310 Your laptop has an IP address. 624 00:29:49,310 --> 00:29:52,240 Anything connected to the internet has an IP address. 625 00:29:52,240 --> 00:29:59,030 >> So DNS, Domain Name System, that is what's going to handle the translation 626 00:29:59,030 --> 00:30:03,750 from facebook.com to an IP address that you actually want to contact. 627 00:30:03,750 --> 00:30:08,075 So we contact the DNS servers and say, what is facebook.com? 628 00:30:08,075 --> 00:30:16,560 It says, oh, it's IP address 190.212 something, something, something. 629 00:30:16,560 --> 00:30:16,900 All right. 630 00:30:16,900 --> 00:30:18,850 Now, I know what machine I want to contact. 631 00:30:18,850 --> 00:30:22,360 >> So then you send your HTTP request over to that machine. 632 00:30:22,360 --> 00:30:24,140 So how does it get to that machine? 633 00:30:24,140 --> 00:30:27,200 Well, the request goes from router to router bouncing. 634 00:30:27,200 --> 00:30:32,630 Remember the example in class, where we actually saw the route that the 635 00:30:32,630 --> 00:30:35,340 packets took when we tried to communicate. 636 00:30:35,340 --> 00:30:38,460 We saw it jump over the Atlantic Ocean at one point or whatever. 637 00:30:38,460 --> 00:30:42,820 >> So the last term port. 638 00:30:42,820 --> 00:30:46,520 So this is now on your computer. 639 00:30:46,520 --> 00:30:49,970 You can have multiple things currently communicating with the internet. 640 00:30:49,970 --> 00:30:53,730 So I can be running, say, Skype. 641 00:30:53,730 --> 00:30:55,670 I might have a web browser open. 642 00:30:55,670 --> 00:30:59,010 I might have something that torrenting files. 643 00:30:59,010 --> 00:31:00,880 So all of these things are communicating with the 644 00:31:00,880 --> 00:31:02,600 internet in some way. 645 00:31:02,600 --> 00:31:08,070 >> So when your computer receives some data from the internet, how does it 646 00:31:08,070 --> 00:31:10,130 know what application actually wants the data? 647 00:31:10,130 --> 00:31:12,610 How does it know whether this particular data is meant for the 648 00:31:12,610 --> 00:31:16,070 torrenting application as opposed to the web browser? 649 00:31:16,070 --> 00:31:20,980 So this is the purpose of ports in that all of these applications have 650 00:31:20,980 --> 00:31:22,720 claimed a port on your computer. 651 00:31:22,720 --> 00:31:27,580 So your web browser says, hey, I'm listening on port 1000. 652 00:31:27,580 --> 00:31:32,240 And your torrenting program is saying, I'm listening on port 3000. 653 00:31:32,240 --> 00:31:34,770 And Skype says, I'm using port 4000. 654 00:31:34,770 --> 00:31:41,950 >> So when you get some data that belongs to one of these applications, the data 655 00:31:41,950 --> 00:31:45,510 is marked with which port it actually should be sent along to. 656 00:31:45,510 --> 00:31:47,950 So this says, oh, I belong to port 1000. 657 00:31:47,950 --> 00:31:50,950 I know then I need to forward this along to my web browser. 658 00:31:50,950 --> 00:31:56,440 So the reason it's relevant here is that web servers tend to 659 00:31:56,440 --> 00:31:58,240 listen on port 80. 660 00:31:58,240 --> 00:32:02,420 So when I contact Facebook.com, I'm communicating with some machine. 661 00:32:02,420 --> 00:32:06,390 But I need to say which port of that machine I want to communicate with. 662 00:32:06,390 --> 00:32:09,160 And web servers tend to be listening on port 80. 663 00:32:09,160 --> 00:32:14,010 >> If they wanted, they could set it up so it lists as on port 7000. 664 00:32:14,010 --> 00:32:19,090 And then in a web browser, I could manually type Facebook.com:7000 to 665 00:32:19,090 --> 00:32:24,600 send the request to port 7000 of Facebook's web server. 666 00:32:24,600 --> 00:32:26,820 >> DAVID J. MALAN: And in this case, even though we didn't require that people 667 00:32:26,820 --> 00:32:30,000 mention this, in this case, what port would the request actually go to? 668 00:32:30,000 --> 00:32:36,630 669 00:32:36,630 --> 00:32:37,880 Try again. 670 00:32:37,880 --> 00:32:42,810 671 00:32:42,810 --> 00:32:44,300 Exactly. 672 00:32:44,300 --> 00:32:47,960 Not looking for that, but a subtlety that's there none the last. 673 00:32:47,960 --> 00:32:51,770 >> ROB BOWDEN: So the HTTPS, since it's listening specifically for the 674 00:32:51,770 --> 00:32:55,180 encrypted, it's on port 4430. 675 00:32:55,180 --> 00:32:57,680 >> AUDIENCE: And emails are 25, right? 676 00:32:57,680 --> 00:33:00,670 >> DAVID J. MALAN: Outbound emails, 25, yep. 677 00:33:00,670 --> 00:33:03,760 >> ROB BOWDEN: I don't even know most of the-- all of the lower ones tend to be 678 00:33:03,760 --> 00:33:06,310 reserved for things. 679 00:33:06,310 --> 00:33:09,260 I think everything under 1024 is reserved. 680 00:33:09,260 --> 00:33:13,450 >> AUDIENCE: Why did you say 3 was the wrong number? 681 00:33:13,450 --> 00:33:18,820 >> ROB BOWDEN: Because in an IP address, there's four groupings of digits. 682 00:33:18,820 --> 00:33:21,090 And they're from 0 to 255. 683 00:33:21,090 --> 00:33:28,060 So 192.168.2.1 is a common local network IP address. 684 00:33:28,060 --> 00:33:30,840 Notice all of those are less than 255. 685 00:33:30,840 --> 00:33:33,570 So when I started with 300, that couldn't possibly have 686 00:33:33,570 --> 00:33:35,210 been one of the numbers. 687 00:33:35,210 --> 00:33:38,170 >> DAVID J. MALAN: But that silly clip from-- was it CSI, where they had a 688 00:33:38,170 --> 00:33:39,970 number that was too big for the IP address. 689 00:33:39,970 --> 00:33:42,940 690 00:33:42,940 --> 00:33:46,110 >> ROB BOWDEN: Any questions on this? 691 00:33:46,110 --> 00:33:51,710 The next one, so complete change in topic, but we have this PHP array for 692 00:33:51,710 --> 00:33:53,270 the houses in the quad. 693 00:33:53,270 --> 00:33:56,360 And we have an unordered list. 694 00:33:56,360 --> 00:33:59,550 And we want to print out each list item just containing the house name. 695 00:33:59,550 --> 00:34:09,090 696 00:34:09,090 --> 00:34:11,870 So we have a foreach loop. 697 00:34:11,870 --> 00:34:17,540 So remember, the syntax is foreach array as item in the array. 698 00:34:17,540 --> 00:34:22,360 So through each iteration of the loop, house is going to take on one of the 699 00:34:22,360 --> 00:34:24,060 values inside of the array. 700 00:34:24,060 --> 00:34:26,530 >> On the first iteration, house will be Cabot House. 701 00:34:26,530 --> 00:34:30,370 On a second iteration, house will be Courier House and so on. 702 00:34:30,370 --> 00:34:34,370 So for each quad as house, we're just going to print-- 703 00:34:34,370 --> 00:34:37,250 you also could have echoed-- 704 00:34:37,250 --> 00:34:42,199 the list item and then the house's name and then close the list item. 705 00:34:42,199 --> 00:34:45,210 The curly braces are optional here. 706 00:34:45,210 --> 00:34:49,480 >> And then we also said in the question itself, remember to close the 707 00:34:49,480 --> 00:34:50,770 unordered list tag. 708 00:34:50,770 --> 00:34:53,949 So we need to exit PHP mode in order to do this. 709 00:34:53,949 --> 00:35:00,280 Or we could have echoed the close unordered list tag. 710 00:35:00,280 --> 00:35:02,380 >> DAVID J. MALAN: Also fine here would have been to use an old school for 711 00:35:02,380 --> 00:35:07,340 loop with a $i=0 0 and using counts to figure out the length of the ray. 712 00:35:07,340 --> 00:35:09,240 Totally fine too, just a little wordier. 713 00:35:09,240 --> 00:35:12,170 714 00:35:12,170 --> 00:35:14,742 >> AUDIENCE: So if you were going to [INAUDIBLE], would you do-- 715 00:35:14,742 --> 00:35:16,734 I forget what the loop [INAUDIBLE] is. 716 00:35:16,734 --> 00:35:21,380 Would you $quad bracket i? 717 00:35:21,380 --> 00:35:21,850 >> DAVID J. MALAN: Exactly. 718 00:35:21,850 --> 00:35:23,100 Yeah, exactly. 719 00:35:23,100 --> 00:35:26,650 720 00:35:26,650 --> 00:35:27,900 >> ROB BOWDEN: Anything else? 721 00:35:27,900 --> 00:35:31,350 722 00:35:31,350 --> 00:35:32,010 >> DAVID J. MALAN: All right. 723 00:35:32,010 --> 00:35:32,300 Trade-offs. 724 00:35:32,300 --> 00:35:38,290 So there were bunches of answers possible for each of these. 725 00:35:38,290 --> 00:35:40,510 We were really just looking for something compelling for an upside and 726 00:35:40,510 --> 00:35:41,100 a downside. 727 00:35:41,100 --> 00:35:44,830 And number 16 asked, validating users' input client-side, as with JavaScript, 728 00:35:44,830 --> 00:35:47,280 instead of server-side, as with PHP. 729 00:35:47,280 --> 00:35:49,450 So what's an upside of doing client-side? 730 00:35:49,450 --> 00:35:53,780 >> Well, one of the things we proposed is that you reduce latency, because you 731 00:35:53,780 --> 00:35:56,750 don't have to bother contacting the server, which might take a few 732 00:35:56,750 --> 00:36:00,390 milliseconds or even a couple of seconds by avoiding that and just 733 00:36:00,390 --> 00:36:04,670 validating users' input client-side by triggering an on-submit handler and 734 00:36:04,670 --> 00:36:06,650 just checking, did they type something in for name? 735 00:36:06,650 --> 00:36:08,080 Did they type something in for email address? 736 00:36:08,080 --> 00:36:10,950 Did they choose a dorm from the drop-down menu? 737 00:36:10,950 --> 00:36:14,360 >> You can give them instantaneous feedback using the gigahertz computer 738 00:36:14,360 --> 00:36:16,770 or whatever they have that's actually on their desk. 739 00:36:16,770 --> 00:36:19,310 So it's just a better user experience typically. 740 00:36:19,310 --> 00:36:24,460 But a downside of doing client-side validation, if you do it without also 741 00:36:24,460 --> 00:36:29,860 doing server-side validation is that most anyone coming out of CS50 knows 742 00:36:29,860 --> 00:36:33,980 that you can just send any data you want to a server any number of ways. 743 00:36:33,980 --> 00:36:37,030 Frankly, in most any browser, you can click around in the settings and just 744 00:36:37,030 --> 00:36:40,110 turn off JavaScript, which would, therefore, disable any form of 745 00:36:40,110 --> 00:36:41,080 validation. 746 00:36:41,080 --> 00:36:44,460 >> But you also might recall that even I did some arcane things in class using 747 00:36:44,460 --> 00:36:47,790 telnet and actually pretending to be a browser by sending get 748 00:36:47,790 --> 00:36:49,240 requests to a server. 749 00:36:49,240 --> 00:36:51,030 And that's certainly not using any JavaScript. 750 00:36:51,030 --> 00:36:53,290 That's just me typing commands at a keyboard. 751 00:36:53,290 --> 00:36:57,410 So really, any programmer within enough comfort with the web and HTTP 752 00:36:57,410 --> 00:37:01,690 could send whatever data he or she wants to a server without validation. 753 00:37:01,690 --> 00:37:05,470 And if your server is not also checking, did they give me a name, is 754 00:37:05,470 --> 00:37:08,930 this actually a valid email address, did they choose a dorm, you might end 755 00:37:08,930 --> 00:37:12,800 up inserting bogus or just blank data into your database, which probably 756 00:37:12,800 --> 00:37:15,450 isn't going to be a good thing if you were assuming it was there. 757 00:37:15,450 --> 00:37:16,770 >> So this is an annoying reality. 758 00:37:16,770 --> 00:37:19,890 But in general, client-side validation is great. 759 00:37:19,890 --> 00:37:21,810 But it means twice as much work. 760 00:37:21,810 --> 00:37:25,970 Although there do exist various libraries, JavaScript libraries for 761 00:37:25,970 --> 00:37:28,830 instance, that make this much, much less of a headache. 762 00:37:28,830 --> 00:37:31,940 And you can reuse some of the code server-side, client-side. 763 00:37:31,940 --> 00:37:35,980 But do realize that it is typically additional work. 764 00:37:35,980 --> 00:37:36,415 Yeah. 765 00:37:36,415 --> 00:37:37,792 >> AUDIENCE: So if we just said less secure-- 766 00:37:37,792 --> 00:37:39,205 >> DAVID J. MALAN: [LAUGHS] 767 00:37:39,205 --> 00:37:39,680 Ugh. 768 00:37:39,680 --> 00:37:43,105 Those are always the harder ones to adjudicate. 769 00:37:43,105 --> 00:37:44,480 >> ROB BOWDEN: That would have been accepted. 770 00:37:44,480 --> 00:37:44,810 >> DAVID J. MALAN: What? 771 00:37:44,810 --> 00:37:45,810 >> ROB BOWDEN: I created this problem. 772 00:37:45,810 --> 00:37:46,735 That would have been accepted. 773 00:37:46,735 --> 00:37:47,220 >> DAVID J. MALAN: Yeah. 774 00:37:47,220 --> 00:37:47,830 >> AUDIENCE: Cool. 775 00:37:47,830 --> 00:37:51,770 >> ROB BOWDEN: But we did not accept for the first one-- 776 00:37:51,770 --> 00:37:53,630 well, what we were looking for is something like you don't have to 777 00:37:53,630 --> 00:37:55,270 communicate with the server. 778 00:37:55,270 --> 00:37:58,355 We didn't accept just faster. 779 00:37:58,355 --> 00:38:00,080 >> AUDIENCE: What about do not reload page? 780 00:38:00,080 --> 00:38:00,430 >> ROB BOWDEN: Yes. 781 00:38:00,430 --> 00:38:03,000 That was an accepted answer. 782 00:38:03,000 --> 00:38:06,300 >> DAVID J. MALAN: Anything where we felt it was more likely than not likely 783 00:38:06,300 --> 00:38:09,780 that you knew what you were saying, which is a tough 784 00:38:09,780 --> 00:38:13,500 line to draw sometimes. 785 00:38:13,500 --> 00:38:16,000 Using a linked list instead of an array to maintain a 786 00:38:16,000 --> 00:38:17,590 sorted list of integers. 787 00:38:17,590 --> 00:38:21,000 So an upside we often cite with linked lists that motivated their whole 788 00:38:21,000 --> 00:38:22,370 introduction was you get dynamism. 789 00:38:22,370 --> 00:38:23,030 They can grow. 790 00:38:23,030 --> 00:38:23,950 They can shrink. 791 00:38:23,950 --> 00:38:27,370 So you don't have to jump through hoops to actually create more memory 792 00:38:27,370 --> 00:38:28,140 with an array. 793 00:38:28,140 --> 00:38:30,310 Or you don't have to just say, sorry, user. 794 00:38:30,310 --> 00:38:31,410 The array is filled. 795 00:38:31,410 --> 00:38:35,850 So dynamic growth of the list. 796 00:38:35,850 --> 00:38:37,210 A downside though of linked lists? 797 00:38:37,210 --> 00:38:40,916 798 00:38:40,916 --> 00:38:43,356 >> AUDIENCE: It's linear. 799 00:38:43,356 --> 00:38:45,800 Searching on linked list is linear instead of what you log in. 800 00:38:45,800 --> 00:38:46,360 >> DAVID J. MALAN: Exactly. 801 00:38:46,360 --> 00:38:50,160 Searching on a linked list is linear, even if it's sorted, because you can 802 00:38:50,160 --> 00:38:53,170 only follow these bread crumbs, these pointers, from the start of the list 803 00:38:53,170 --> 00:38:53,570 to the end. 804 00:38:53,570 --> 00:38:57,970 You can't leverage random access and, thus, binary search, even if it's 805 00:38:57,970 --> 00:39:00,740 sorted, that you could do with an array. 806 00:39:00,740 --> 00:39:02,390 And there's also another cost. 807 00:39:02,390 --> 00:39:02,966 Yeah. 808 00:39:02,966 --> 00:39:03,800 >> AUDIENCE: Memory inefficient? 809 00:39:03,800 --> 00:39:04,130 >> DAVID J. MALAN: Yeah. 810 00:39:04,130 --> 00:39:06,940 Well, I wouldn't necessarily say inefficient. 811 00:39:06,940 --> 00:39:10,110 But it does cost you more memory, because you need 32 bits for every 812 00:39:10,110 --> 00:39:13,400 node for the additional pointer, at least for a singly linked list. 813 00:39:13,400 --> 00:39:16,660 Now, if you're only storing integers and you're adding the pointer, that's 814 00:39:16,660 --> 00:39:17,830 actually kind of non-trivial. 815 00:39:17,830 --> 00:39:19,340 It's doubling the amount of memory. 816 00:39:19,340 --> 00:39:22,330 But in reality, if you're storing a linked list of structs that might have 817 00:39:22,330 --> 00:39:25,540 8 bytes, 16 bytes, even more than that, maybe it's less 818 00:39:25,540 --> 00:39:26,500 of a marginal cost. 819 00:39:26,500 --> 00:39:28,320 But it's a cost nonetheless. 820 00:39:28,320 --> 00:39:31,880 So either of those would've been fine as downsides. 821 00:39:31,880 --> 00:39:32,110 >> 18. 822 00:39:32,110 --> 00:39:36,100 Using PHP instead of C to write a command-line program. 823 00:39:36,100 --> 00:39:41,890 So here, it's often faster to use a language like PHP or Ruby or Python. 824 00:39:41,890 --> 00:39:43,700 You just quickly open up a text editor. 825 00:39:43,700 --> 00:39:45,900 You have many more functions available to you. 826 00:39:45,900 --> 00:39:49,325 PHP has the kitchen sink of functions, whereas in C, you 827 00:39:49,325 --> 00:39:50,420 have very, very little. 828 00:39:50,420 --> 00:39:53,820 In fact, guys the know the hard way that you don't have hash tables. 829 00:39:53,820 --> 00:39:55,000 You don't have linked lists. 830 00:39:55,000 --> 00:39:57,470 If you want those, you have to implement them yourself. 831 00:39:57,470 --> 00:40:00,950 >> So one upside of PHP or really any interpreted language is the rapidity 832 00:40:00,950 --> 00:40:02,920 with which you can write code. 833 00:40:02,920 --> 00:40:06,660 But a downside, we saw this when I quickly whipped up a misspeller 834 00:40:06,660 --> 00:40:11,780 implementation in lecture using PHP, is that using an interpreted language 835 00:40:11,780 --> 00:40:13,570 is usually slower. 836 00:40:13,570 --> 00:40:18,420 And we saw that demonstrably with an increase in time from 0.3 seconds to 3 837 00:40:18,420 --> 00:40:24,440 seconds, because of the interpretation that actually happens. 838 00:40:24,440 --> 00:40:27,060 >> Another upside was that you don't have to compile. 839 00:40:27,060 --> 00:40:30,130 So it also speeds up development incidentally, because you don't have 840 00:40:30,130 --> 00:40:31,360 two steps to running a program. 841 00:40:31,360 --> 00:40:32,140 You just have one. 842 00:40:32,140 --> 00:40:35,260 And so that's pretty compelling as well. 843 00:40:35,260 --> 00:40:38,450 Using a SQL database instead of a CSV file to store data. 844 00:40:38,450 --> 00:40:40,230 So SQL database is used for pset7. 845 00:40:40,230 --> 00:40:42,060 CSV files you didn't use much. 846 00:40:42,060 --> 00:40:45,960 But you used it indirectly in pset7 as well by talking to Yahoo Finance. 847 00:40:45,960 --> 00:40:49,330 >> But CSV is just like an Excel file but super simple, where the columns are 848 00:40:49,330 --> 00:40:54,010 just demarked by commas inside of an otherwise text file. 849 00:40:54,010 --> 00:40:56,740 And using a SQL database is a little more compelling. 850 00:40:56,740 --> 00:41:00,060 It's an upside, because you get things like select and insert and delete. 851 00:41:00,060 --> 00:41:03,790 And you get, presumably, indexes that MySQL and other databases, like 852 00:41:03,790 --> 00:41:07,510 Oracle, build for you in memory, which means your select is probably not 853 00:41:07,510 --> 00:41:09,000 going to be linear top to bottom. 854 00:41:09,000 --> 00:41:11,300 It's actually going to be something like binary search or something 855 00:41:11,300 --> 00:41:12,520 similar in spirit. 856 00:41:12,520 --> 00:41:13,930 So they're generally faster. 857 00:41:13,930 --> 00:41:16,040 >> But a downside is that it's just more work. 858 00:41:16,040 --> 00:41:16,730 It's more effort. 859 00:41:16,730 --> 00:41:18,140 You have to understand databases. 860 00:41:18,140 --> 00:41:18,940 You have to set it up. 861 00:41:18,940 --> 00:41:20,840 You need a server to run that database on. 862 00:41:20,840 --> 00:41:22,750 You need to understand how to configure it. 863 00:41:22,750 --> 00:41:24,930 So these are just these kinds of trade-offs. 864 00:41:24,930 --> 00:41:27,860 Whereas a CSV file, you can create it with gedit. 865 00:41:27,860 --> 00:41:28,770 And you're good to go. 866 00:41:28,770 --> 00:41:31,550 There's no complexity beyond that. 867 00:41:31,550 --> 00:41:34,870 >> Using a trie instead of a hash table with separate chaining to store a 868 00:41:34,870 --> 00:41:37,490 dictionary of words reminiscent of pset5. 869 00:41:37,490 --> 00:41:42,480 So a tries upside, in theory at least, is what? 870 00:41:42,480 --> 00:41:46,380 Constant time, at least if you're hashing on each of the individual 871 00:41:46,380 --> 00:41:48,990 letters in a word, like you might have for pset5. 872 00:41:48,990 --> 00:41:52,720 That might be five hashes, six hashes if there's five or six 873 00:41:52,720 --> 00:41:53,900 letters in the word. 874 00:41:53,900 --> 00:41:54,580 And that's pretty good. 875 00:41:54,580 --> 00:41:56,910 And if there's an upper bound on how long your words might be, that's 876 00:41:56,910 --> 00:41:59,320 indeed asymptotically constant time. 877 00:41:59,320 --> 00:42:05,180 >> Whereas a hash table with separate chaining, the problem there with that 878 00:42:05,180 --> 00:42:09,070 kind of data structure is that the performance of your algorithms usually 879 00:42:09,070 --> 00:42:12,700 depends on the number of things already in the data structure. 880 00:42:12,700 --> 00:42:15,660 And that's definitely the case with chains, whereby the more stuff you put 881 00:42:15,660 --> 00:42:18,800 into a hash table, the longer those chains go, which means in the worst 882 00:42:18,800 --> 00:42:21,960 case, the thing you might be looking for is all the way at the end of one 883 00:42:21,960 --> 00:42:26,000 of those chains, which effectively devolves into something linear. 884 00:42:26,000 --> 00:42:29,450 >> Now, in practice, it could absolutely be the case that a hash table with 885 00:42:29,450 --> 00:42:32,820 chains is faster than a corresponding trie implementation. 886 00:42:32,820 --> 00:42:35,570 But that's for various reasons, among which are tries use a whole lot of 887 00:42:35,570 --> 00:42:39,240 memory that can, in fact, slow things down, because you don't get nice 888 00:42:39,240 --> 00:42:42,410 benefits of something called caching, where things that are close together 889 00:42:42,410 --> 00:42:45,420 in memory can be accessed often more quickly. 890 00:42:45,420 --> 00:42:48,180 And sometimes you can come up with a really good hash function. 891 00:42:48,180 --> 00:42:51,060 Even if you have to waste a bit of memory, you might, indeed, be able to 892 00:42:51,060 --> 00:42:54,430 find things fast and not as bad as linearly. 893 00:42:54,430 --> 00:42:58,410 >> So in short, there wasn't necessarily with any of these one or even two 894 00:42:58,410 --> 00:43:00,050 specific things we were looking for. 895 00:43:00,050 --> 00:43:03,080 Really anything persuasive as an upside and downside 896 00:43:03,080 --> 00:43:04,800 generally caught our eye. 897 00:43:04,800 --> 00:43:11,840 >> ROB BOWDEN: So for the upside, we did not accept on its own "faster." You 898 00:43:11,840 --> 00:43:14,540 had to say something about it. 899 00:43:14,540 --> 00:43:17,910 Even if you said theoretically faster, we knew that you kind of understood 900 00:43:17,910 --> 00:43:19,470 that it's 0 of 1. 901 00:43:19,470 --> 00:43:22,820 And hash table, in theory, is not 0 of 1. 902 00:43:22,820 --> 00:43:26,550 Mentioning anything about runtime generally got you the points. 903 00:43:26,550 --> 00:43:32,640 But "faster," most of the solutions on the big board that were tries were 904 00:43:32,640 --> 00:43:34,990 objectively slower than solutions that were hash tables. 905 00:43:34,990 --> 00:43:37,250 So faster in and of itself isn't really true. 906 00:43:37,250 --> 00:43:41,550 907 00:43:41,550 --> 00:43:44,380 >> DAVID J. MALAN: Dom de dom dom. 908 00:43:44,380 --> 00:43:46,686 I'm probably the only one that realizes that's how that's supposed to 909 00:43:46,686 --> 00:43:47,500 be pronounced, right? 910 00:43:47,500 --> 00:43:50,400 >> ROB BOWDEN: I had actually no idea. 911 00:43:50,400 --> 00:43:51,650 >> DAVID J. MALAN: It made sense in my head. 912 00:43:51,650 --> 00:43:53,830 913 00:43:53,830 --> 00:43:57,580 >> ROB BOWDEN: I'm doing this one. 914 00:43:57,580 --> 00:43:58,020 OK. 915 00:43:58,020 --> 00:44:04,243 So this is the one where you had to draw the diagram similar to you might 916 00:44:04,243 --> 00:44:06,040 have seen on past exams. 917 00:44:06,040 --> 00:44:12,200 So let's just look at this. 918 00:44:12,200 --> 00:44:18,170 So from the HTML node, we have two children, the head and the body. 919 00:44:18,170 --> 00:44:20,570 So we branch-- head and body. 920 00:44:20,570 --> 00:44:22,280 The head has a title tag. 921 00:44:22,280 --> 00:44:23,710 So we have a title. 922 00:44:23,710 --> 00:44:28,450 >> Now, the one thing a lot of people forgot is that these text nodes are 923 00:44:28,450 --> 00:44:30,430 elements within this tree. 924 00:44:30,430 --> 00:44:36,260 So here we happen to draw them as ovals to differentiate them from these 925 00:44:36,260 --> 00:44:37,380 types of nodes. 926 00:44:37,380 --> 00:44:41,450 But notice also here we have top, middle, and bottom will end up being 927 00:44:41,450 --> 00:44:42,560 text nodes. 928 00:44:42,560 --> 00:44:46,250 So forgetting those was somewhat of a common mistake. 929 00:44:46,250 --> 00:44:48,770 >> The body has three children-- these three divs. 930 00:44:48,770 --> 00:44:53,340 So div, div, div and then the text node children of those divs. 931 00:44:53,340 --> 00:44:55,900 That's pretty much it for that questions. 932 00:44:55,900 --> 00:44:57,860 >> DAVID J. MALAN: And it's worth noting, even though we don't dwell on these 933 00:44:57,860 --> 00:45:01,040 details in the time we spend on JavaScript, that the order does, in 934 00:45:01,040 --> 00:45:02,290 fact, matter technically. 935 00:45:02,290 --> 00:45:06,330 So if head comes before body in the HTML, then it should appear to the 936 00:45:06,330 --> 00:45:08,860 left of body in the actual DOM. 937 00:45:08,860 --> 00:45:12,265 That his is, in general, just FYI, something called document order, where 938 00:45:12,265 --> 00:45:13,260 it does matter. 939 00:45:13,260 --> 00:45:17,470 And if you were implementing a parser, a program that reads HTML in building 940 00:45:17,470 --> 00:45:20,960 up the tree in memory, to be honest, that's intuitively probably what you 941 00:45:20,960 --> 00:45:24,720 do anyway-- top to bottom, left to right. 942 00:45:24,720 --> 00:45:26,116 >> ROB BOWDEN: Questions on that? 943 00:45:26,116 --> 00:45:29,080 944 00:45:29,080 --> 00:45:30,000 Should I do the next one? 945 00:45:30,000 --> 00:45:32,380 >> DAVID J. MALAN: Sure. 946 00:45:32,380 --> 00:45:33,810 >> ROB BOWDEN: OK. 947 00:45:33,810 --> 00:45:39,320 So this is the buffer overrun attack question. 948 00:45:39,320 --> 00:45:43,740 The main thing to recognize here is, well, how might an adversary trick 949 00:45:43,740 --> 00:45:46,170 this program into executing arbitrary code? 950 00:45:46,170 --> 00:45:51,860 So argv1, the first command line argument to this program, that can be 951 00:45:51,860 --> 00:45:53,920 arbitrarily long. 952 00:45:53,920 --> 00:45:59,160 But here we're using memcpy to copy argv1, which here is bar. 953 00:45:59,160 --> 00:46:00,165 We're passing it as the argument. 954 00:46:00,165 --> 00:46:02,050 And so it's taking on the name bar. 955 00:46:02,050 --> 00:46:08,040 >> So we're memcpying bar into this buffer c. 956 00:46:08,040 --> 00:46:09,400 How many bytes are we copying? 957 00:46:09,400 --> 00:46:14,040 Well however many bytes bar happens to be using, the length of that argument. 958 00:46:14,040 --> 00:46:17,930 But c is only 12 bytes wide. 959 00:46:17,930 --> 00:46:22,280 So if we type a command line argument that's longer than 12 bytes, we're 960 00:46:22,280 --> 00:46:25,470 going to overflow this particular buffer. 961 00:46:25,470 --> 00:46:31,000 Now, how might an adversary trick the program into executing arbitrary code? 962 00:46:31,000 --> 00:46:34,910 >> So remember that here main is calling foo. 963 00:46:34,910 --> 00:46:37,340 And so then main calls foo. 964 00:46:37,340 --> 00:46:40,408 Let's draw this. 965 00:46:40,408 --> 00:46:44,720 966 00:46:44,720 --> 00:46:46,990 So we have our stack. 967 00:46:46,990 --> 00:46:49,090 And main has a stack frame at the bottom. 968 00:46:49,090 --> 00:46:51,860 969 00:46:51,860 --> 00:46:53,250 At some point, main calls foo. 970 00:46:53,250 --> 00:46:55,390 Well, immediately, main calls foo. 971 00:46:55,390 --> 00:46:57,130 And so foo gets its own stack frame. 972 00:46:57,130 --> 00:46:59,650 973 00:46:59,650 --> 00:47:02,220 >> Now, at some point, foo is going to return. 974 00:47:02,220 --> 00:47:06,810 And went foo returns, we need to know at what line of code inside of main we 975 00:47:06,810 --> 00:47:10,610 were in order to know where we should resume in main. 976 00:47:10,610 --> 00:47:13,100 We can call foo from a whole bunch of different places. 977 00:47:13,100 --> 00:47:14,620 How do we know where to return? 978 00:47:14,620 --> 00:47:16,460 Well, we need to store that somewhere. 979 00:47:16,460 --> 00:47:23,010 >> So somewhere right around here, we store where we should return to once 980 00:47:23,010 --> 00:47:24,070 foo returns. 981 00:47:24,070 --> 00:47:26,350 And this is the return address. 982 00:47:26,350 --> 00:47:30,490 So how an adversary might take advantage of this is the fact that 983 00:47:30,490 --> 00:47:37,550 this buffer c is stored, let's say, right here is c. 984 00:47:37,550 --> 00:47:39,690 So we've got 12 bytes for c. 985 00:47:39,690 --> 00:47:40,540 This is c. 986 00:47:40,540 --> 00:47:43,030 And this is foo's stack ring. 987 00:47:43,030 --> 00:47:49,970 So if the malicious user enters more bytes than 12 or they enter a command 988 00:47:49,970 --> 00:47:54,570 line argument that's longer than 12 characters, then we're going to 989 00:47:54,570 --> 00:47:57,540 overflow this buffer. 990 00:47:57,540 --> 00:47:59,910 >> We can keep going. 991 00:47:59,910 --> 00:48:02,220 And at some point, we go far enough that we start 992 00:48:02,220 --> 00:48:05,120 overwriting this return address. 993 00:48:05,120 --> 00:48:08,310 So once we overwrite the return address, this means that when foo 994 00:48:08,310 --> 00:48:14,220 returns, we're returning to wherever the malicious user is telling it to by 995 00:48:14,220 --> 00:48:19,490 whatever value it entered, by whatever characters the user entered. 996 00:48:19,490 --> 00:48:24,320 And so if the malicious user is being particularly clever, he can have this 997 00:48:24,320 --> 00:48:29,255 return to somewhere in the printDef function or somewhere in the malloc 998 00:48:29,255 --> 00:48:31,830 function, just anywhere arbitrary. 999 00:48:31,830 --> 00:48:38,420 >> But even more clever is what if he has the user return to right here. 1000 00:48:38,420 --> 00:48:41,920 And then you start executing these as lines of code. 1001 00:48:41,920 --> 00:48:46,610 So at that point, the user can enter whatever he wants into this region. 1002 00:48:46,610 --> 00:48:52,210 And he has complete control over your program. 1003 00:48:52,210 --> 00:48:53,460 Questions on that? 1004 00:48:53,460 --> 00:48:56,380 1005 00:48:56,380 --> 00:49:00,970 So the next question is complete the reimplementation of foo in such a way 1006 00:49:00,970 --> 00:49:02,620 that it's no longer vulnerable. 1007 00:49:02,620 --> 00:49:03,870 >> So there's a couple of ways you could have done this. 1008 00:49:03,870 --> 00:49:10,900 1009 00:49:10,900 --> 00:49:13,330 We still have c only being of length 12. 1010 00:49:13,330 --> 00:49:16,480 You could have changed this as part of your solution. 1011 00:49:16,480 --> 00:49:18,930 We also added a check to make sure bar was not null. 1012 00:49:18,930 --> 00:49:24,460 Though you didn't need that for full credit. 1013 00:49:24,460 --> 00:49:27,690 So we're checking first the string length of bar. 1014 00:49:27,690 --> 00:49:31,650 If it's greater than 12, then don't actually do the copy. 1015 00:49:31,650 --> 00:49:33,010 So that's one way of fixing it. 1016 00:49:33,010 --> 00:49:36,750 >> Another way of fixing it is instead of having c only be of length 12, have it 1017 00:49:36,750 --> 00:49:39,310 be of length strlen(bar). 1018 00:49:39,310 --> 00:49:43,370 Another way of fixing it is to actually just return. 1019 00:49:43,370 --> 00:49:46,690 So if you had just gotten rid of all of this, if you had just deleted all 1020 00:49:46,690 --> 00:49:51,830 lines of code, you would have gotten full credit, since this function 1021 00:49:51,830 --> 00:49:54,150 doesn't actually accomplish anything. 1022 00:49:54,150 --> 00:49:57,650 It's copying the command line argument into some array in 1023 00:49:57,650 --> 00:49:59,960 its local stack frame. 1024 00:49:59,960 --> 00:50:01,310 And then the thing is returning. 1025 00:50:01,310 --> 00:50:04,020 And whatever it accomplished is gone. 1026 00:50:04,020 --> 00:50:09,740 So return was also a sufficient way of getting full credit. 1027 00:50:09,740 --> 00:50:13,425 >> DAVID J. MALAN: Not quite the spirit of the question but acceptable per the 1028 00:50:13,425 --> 00:50:15,580 spec nonetheless. 1029 00:50:15,580 --> 00:50:18,260 >> ROB BOWDEN: Questions on any of that? 1030 00:50:18,260 --> 00:50:22,270 The one thing that you at least needed to have compiling code. 1031 00:50:22,270 --> 00:50:24,810 So even though technically you aren't vulnerable if your code doesn't 1032 00:50:24,810 --> 00:50:29,130 compile, we didn't accept that. 1033 00:50:29,130 --> 00:50:31,350 No questions? 1034 00:50:31,350 --> 00:50:33,320 OK. 1035 00:50:33,320 --> 00:50:34,580 >> DAVID J. MALAN: Do you want to say this title? 1036 00:50:34,580 --> 00:50:37,230 >> ROB BOWDEN: No. 1037 00:50:37,230 --> 00:50:40,470 >> DAVID J. MALAN: So in this one, this was either good news or bad news. 1038 00:50:40,470 --> 00:50:43,870 This is literally the same problem as the first quiz. 1039 00:50:43,870 --> 00:50:46,140 And it's almost the same problem as pset1. 1040 00:50:46,140 --> 00:50:49,980 But it was deliberately simplified to be a simpler pyramid, one that can be 1041 00:50:49,980 --> 00:50:52,330 solved with a slightly simpler iteration. 1042 00:50:52,330 --> 00:50:55,680 And really, what we were getting at here was not so much the logic, 1043 00:50:55,680 --> 00:50:58,100 because probably, by this point, you're more comfortable than you were 1044 00:50:58,100 --> 00:51:01,850 in week one with for loops or why loops, but really to tease apart that 1045 00:51:01,850 --> 00:51:04,790 you're a little comfortable with the notion that PHP isn't just about what 1046 00:51:04,790 --> 00:51:05,290 programming. 1047 00:51:05,290 --> 00:51:07,820 It can actually be used as a language to write command line programs. 1048 00:51:07,820 --> 00:51:10,060 >> And indeed, that's what we were trying to draw your attention to. 1049 00:51:10,060 --> 00:51:12,060 This is a command line PHP program. 1050 00:51:12,060 --> 00:51:16,690 So C code here, while correct in C, not correct for PHP. 1051 00:51:16,690 --> 00:51:17,940 But the code really is the same. 1052 00:51:17,940 --> 00:51:21,720 If you compare the solutions for Quiz 0 against Quiz 1, you'll find that 1053 00:51:21,720 --> 00:51:25,630 it's almost identical, except for some dollar signs and for the 1054 00:51:25,630 --> 00:51:27,250 absence of a data type. 1055 00:51:27,250 --> 00:51:31,720 In particular, if we take a look here, you'll see that we iterate, in this 1056 00:51:31,720 --> 00:51:33,730 case, from 1 up through 7. 1057 00:51:33,730 --> 00:51:34,910 >> We could have done it 0 index. 1058 00:51:34,910 --> 00:51:37,320 But sometimes, I think it's just mentally easier to think about things 1059 00:51:37,320 --> 00:51:38,200 from 1 to 7. 1060 00:51:38,200 --> 00:51:40,300 If you want one block, then two blocks, then three, then 1061 00:51:40,300 --> 00:51:41,770 dot, dot, dot seven. 1062 00:51:41,770 --> 00:51:45,960 We have j being initialized to 1 and then counting on up to i. 1063 00:51:45,960 --> 00:51:48,150 And everything here is otherwise identical. 1064 00:51:48,150 --> 00:51:49,790 But worthy of note are a couple of things. 1065 00:51:49,790 --> 00:51:53,230 We give you these two lines, this first one, goofily named as a shebang 1066 00:51:53,230 --> 00:51:54,560 for sharp bang. 1067 00:51:54,560 --> 00:51:58,770 And that just specifies the path, the folder, in which a program can be 1068 00:51:58,770 --> 00:52:02,160 found that you want to use to interpret this file. 1069 00:52:02,160 --> 00:52:04,710 >> And then the line after that, of course, means enter PHP mode. 1070 00:52:04,710 --> 00:52:07,740 And the line at the very bottom means exit PHP mode. 1071 00:52:07,740 --> 00:52:09,740 And this works, in general, with interpreted languages. 1072 00:52:09,740 --> 00:52:14,370 It's kind of annoying if you write a program in a file called foo.php. 1073 00:52:14,370 --> 00:52:17,320 And then your users have to just remember, OK, to run this program, I 1074 00:52:17,320 --> 00:52:22,320 have to type "php space foo.php." Kind of annoying if nothing else. 1075 00:52:22,320 --> 00:52:25,270 And it also reveals that your program is written in PHP, which isn't all 1076 00:52:25,270 --> 00:52:27,060 that illuminating for the user. 1077 00:52:27,060 --> 00:52:30,100 >> So you can remove the .php altogether recall from lecture. 1078 00:52:30,100 --> 00:52:35,690 And you can actually do ./foo if you've chmodded it by making it 1079 00:52:35,690 --> 00:52:36,500 executable. 1080 00:52:36,500 --> 00:52:39,630 So chmod a+x foo would have done that. 1081 00:52:39,630 --> 00:52:41,460 And if you also add the shebang here. 1082 00:52:41,460 --> 00:52:45,320 But really, the problem was getting at printing out something like this. 1083 00:52:45,320 --> 00:52:51,100 No HTML, no C-code certainly, just some PHP. 1084 00:52:51,100 --> 00:52:54,100 So Milo then returned in problem 25. 1085 00:52:54,100 --> 00:52:58,050 And in 25, you were given the following skeleton code, which was a 1086 00:52:58,050 --> 00:52:59,730 pretty simple web page. 1087 00:52:59,730 --> 00:53:04,230 And the juicy part HTML-wise was down here, where we have inside of the body 1088 00:53:04,230 --> 00:53:09,160 a form that has unique ID of inputs inside of which was two inputs, one 1089 00:53:09,160 --> 00:53:11,950 with an idea of name, one with an idea of button. 1090 00:53:11,950 --> 00:53:14,240 >> The first was type text, the second of type submit. 1091 00:53:14,240 --> 00:53:16,930 And so we gave you, actually, more ingredients than you needed, just so 1092 00:53:16,930 --> 00:53:19,230 you guys had options with which to solve this problem. 1093 00:53:19,230 --> 00:53:21,130 You don't strictly need all of these IDs. 1094 00:53:21,130 --> 00:53:23,580 But it allows you to solve it in different ways. 1095 00:53:23,580 --> 00:53:27,050 And up at the top, notice that the objective was to trigger 1096 00:53:27,050 --> 00:53:27,960 a window like this-- 1097 00:53:27,960 --> 00:53:28,780 Hello, Milo!-- 1098 00:53:28,780 --> 00:53:31,270 to pop up in the browser using the super simple, if 1099 00:53:31,270 --> 00:53:33,190 not ugly, alert function. 1100 00:53:33,190 --> 00:53:37,480 And so, ultimately, this boils down conceptually to somehow listening for 1101 00:53:37,480 --> 00:53:41,290 submissions of the form client-side , not the server-side, somehow 1102 00:53:41,290 --> 00:53:45,640 responding to that submission by grabbing the value that the user typed 1103 00:53:45,640 --> 00:53:50,120 in to the name field, and then displaying it in the body of an alert. 1104 00:53:50,120 --> 00:53:53,460 >> So one way you can do this is with jQuery, which looks a little 1105 00:53:53,460 --> 00:53:56,880 syntactically perplexing at first. 1106 00:53:56,880 --> 00:54:00,760 You can do this with pure DOM code-- document.getelement by ID. 1107 00:54:00,760 --> 00:54:02,530 But let's take a look at this version. 1108 00:54:02,530 --> 00:54:05,110 I have a couple of important lines first. 1109 00:54:05,110 --> 00:54:09,460 So one, we have this line, which is identical to what you might have seen 1110 00:54:09,460 --> 00:54:13,830 in, I believe, form2.html from class in week 9. 1111 00:54:13,830 --> 00:54:16,960 And this is just saying, execute the following code when 1112 00:54:16,960 --> 00:54:18,430 the document is ready. 1113 00:54:18,430 --> 00:54:21,770 This being important only because HTML pages are read top to 1114 00:54:21,770 --> 00:54:23,280 bottom, left to right. 1115 00:54:23,280 --> 00:54:27,910 >> And therefore, if you try to do something in code up here to some DOM 1116 00:54:27,910 --> 00:54:31,560 element, some HTML tag, that's down here, you're doing it too soon, 1117 00:54:31,560 --> 00:54:34,220 because this has not even been read into memory. 1118 00:54:34,220 --> 00:54:37,740 So by saying this document.ready line, we're saying, 1119 00:54:37,740 --> 00:54:39,040 here's some code, browser. 1120 00:54:39,040 --> 00:54:42,440 But don't execute this until the whole document is ready, that is the DOM 1121 00:54:42,440 --> 00:54:44,320 tree exists in memory. 1122 00:54:44,320 --> 00:54:47,110 This one is a little more straightforward, if syntactically a 1123 00:54:47,110 --> 00:54:51,890 bit different, where I'm saying, grab the HTML element whose unique 1124 00:54:51,890 --> 00:54:53,560 identifier is inputs. 1125 00:54:53,560 --> 00:54:56,220 That's what the hash tag denotes, the unique ID. 1126 00:54:56,220 --> 00:54:58,070 And then I'm calling .submit. 1127 00:54:58,070 --> 00:55:01,660 >> So .submit here is a function, otherwise known as a method, that's 1128 00:55:01,660 --> 00:55:05,850 inside of the object on the left-hand side there that I didn't highlight. 1129 00:55:05,850 --> 00:55:08,990 So if you think of inputs as an object in memory-- and indeed it is. 1130 00:55:08,990 --> 00:55:10,440 It's a node in a tree-- 1131 00:55:10,440 --> 00:55:16,580 .submit means when this form with this ID is submitted, execute 1132 00:55:16,580 --> 00:55:17,700 the following code. 1133 00:55:17,700 --> 00:55:20,290 I don't care what the name of the function is I'm executing. 1134 00:55:20,290 --> 00:55:23,760 So here I'm using, as before, what's called the lambda function or an 1135 00:55:23,760 --> 00:55:24,720 anonymous function. 1136 00:55:24,720 --> 00:55:27,640 It's not at all intellectually interesting other than it has no name, 1137 00:55:27,640 --> 00:55:30,220 which is fine if you're only ever going to call it once. 1138 00:55:30,220 --> 00:55:34,490 And inside there I actually handle the submission of the form. 1139 00:55:34,490 --> 00:55:36,810 I first declare a variable called value. 1140 00:55:36,810 --> 00:55:40,610 And then what is the effect of this highlighted portion here now? 1141 00:55:40,610 --> 00:55:44,755 What does that do at a high level for me? 1142 00:55:44,755 --> 00:55:48,539 >> AUDIENCE: It gets the value that the user didn't in the HTML below. 1143 00:55:48,539 --> 00:55:50,920 It gets that ID and then finds the value of it. 1144 00:55:50,920 --> 00:55:51,590 >> DAVID J. MALAN: Exactly. 1145 00:55:51,590 --> 00:55:54,300 It grabs the node, whose unique identifier is name. 1146 00:55:54,300 --> 00:55:56,900 It gets the value therein, which is, presumably, what the user 1147 00:55:56,900 --> 00:55:58,190 typed him or herself. 1148 00:55:58,190 --> 00:56:01,020 And then it stores that in the variable called value. 1149 00:56:01,020 --> 00:56:03,720 As an aside, you could have also done this a little differently. 1150 00:56:03,720 --> 00:56:09,250 Totally acceptable by doing something lie var value gets 1151 00:56:09,250 --> 00:56:10,500 document.getElementById. 1152 00:56:10,500 --> 00:56:12,860 1153 00:56:12,860 --> 00:56:15,460 And this is why it's a little tedious to not use jQuery. 1154 00:56:15,460 --> 00:56:16,710 "name".value. 1155 00:56:16,710 --> 00:56:18,330 1156 00:56:18,330 --> 00:56:19,620 So totally acceptable. 1157 00:56:19,620 --> 00:56:22,770 Different ways to do this. jQuery just tends to be a little more succinct and 1158 00:56:22,770 --> 00:56:25,230 definitely more popular among programmers. 1159 00:56:25,230 --> 00:56:27,590 >> Now, I'm doing a bit of a sanity check, because in the problem 1160 00:56:27,590 --> 00:56:30,820 statement we explicitly said, if the user has not yet typed his or her 1161 00:56:30,820 --> 00:56:32,580 name, don't show an alerts. 1162 00:56:32,580 --> 00:56:35,390 But you can check for that, by just checking for the empty string for a 1163 00:56:35,390 --> 00:56:37,850 quote-unquote if there's nothing actually there. 1164 00:56:37,850 --> 00:56:40,880 But if it's not equal to quote-unquote, I want to call alerts. 1165 00:56:40,880 --> 00:56:45,610 And the interesting part here is that we're using the plus operator, which 1166 00:56:45,610 --> 00:56:48,130 does what in JavaScript? 1167 00:56:48,130 --> 00:56:48,740 Concatenate. 1168 00:56:48,740 --> 00:56:50,690 So it's like PHPs dot operator. 1169 00:56:50,690 --> 00:56:52,820 Same idea, slightly different syntax. 1170 00:56:52,820 --> 00:56:55,280 And I'm just creating the string that you saw on the screen shot-- 1171 00:56:55,280 --> 00:56:57,750 Hello, so and so. 1172 00:56:57,750 --> 00:56:59,200 >> And then the last detail is this. 1173 00:56:59,200 --> 00:57:04,970 Why do I return false inside of this anonymous function? 1174 00:57:04,970 --> 00:57:07,420 >> AUDIENCE: There's no value. 1175 00:57:07,420 --> 00:57:09,380 You put it in form. 1176 00:57:09,380 --> 00:57:12,320 1177 00:57:12,320 --> 00:57:16,730 It just says, if value is not equal to blank, then do it. 1178 00:57:16,730 --> 00:57:20,040 1179 00:57:20,040 --> 00:57:20,940 There was a blank in that submission. 1180 00:57:20,940 --> 00:57:21,170 >> DAVID J. MALAN: OK. 1181 00:57:21,170 --> 00:57:21,640 Careful though. 1182 00:57:21,640 --> 00:57:22,830 There's no one else here. 1183 00:57:22,830 --> 00:57:25,510 And that return false is outside of the if conditions. 1184 00:57:25,510 --> 00:57:29,470 So this highlighted line, return false, executes no matter what when 1185 00:57:29,470 --> 00:57:32,310 the form is submitted. 1186 00:57:32,310 --> 00:57:36,810 What does returning false inside of this event handler, as it's called, 1187 00:57:36,810 --> 00:57:38,450 the event in question being submission? 1188 00:57:38,450 --> 00:57:42,350 1189 00:57:42,350 --> 00:57:44,470 >> AUDIENCE: Because it only happens once. 1190 00:57:44,470 --> 00:57:45,320 >> DAVID J. MALAN: Only happens once. 1191 00:57:45,320 --> 00:57:46,821 Not quite. 1192 00:57:46,821 --> 00:57:47,292 Yeah? 1193 00:57:47,292 --> 00:57:50,589 >> AUDIENCE: It prevents the form from submitting to the default behavior, 1194 00:57:50,589 --> 00:57:52,480 which would make the page reload. 1195 00:57:52,480 --> 00:57:53,110 >> DAVID J. MALAN: Exactly. 1196 00:57:53,110 --> 00:57:56,490 So I'm overloading the term submit here, because I'm saying, the form is 1197 00:57:56,490 --> 00:57:57,670 being submitted. 1198 00:57:57,670 --> 00:58:02,240 But as you suggest, it's actually not been submitted in the true HTTP way. 1199 00:58:02,240 --> 00:58:06,870 When you click Submit, because of our onSubmit handler, we're intercepting 1200 00:58:06,870 --> 00:58:09,040 that form submission so to speak. 1201 00:58:09,040 --> 00:58:11,290 We're then doing our thing with JavaScript code. 1202 00:58:11,290 --> 00:58:14,070 But I'm deliberately returning false, because what I don't want to happen a 1203 00:58:14,070 --> 00:58:18,430 split second later is for the whole form itself to be submitted to the web 1204 00:58:18,430 --> 00:58:22,800 server with key value pairs by changing the URL to be something like 1205 00:58:22,800 --> 00:58:26,180 q=cats or whatever we did, for instance, in class. 1206 00:58:26,180 --> 00:58:29,640 I don't want that to happen, because there is no server listening for this 1207 00:58:29,640 --> 00:58:30,690 form submission. 1208 00:58:30,690 --> 00:58:32,320 It's purely done in JavaScript code. 1209 00:58:32,320 --> 00:58:35,760 And that's why I didn't even have an action attribute on my form, because I 1210 00:58:35,760 --> 00:58:38,870 don't intend for this to ever go to the server. 1211 00:58:38,870 --> 00:58:40,780 >> So it's being submitted. 1212 00:58:40,780 --> 00:58:44,340 But we're intercepting that form submission and preventing the default 1213 00:58:44,340 --> 00:58:47,477 behavior, which is to actually go all the way to the server. 1214 00:58:47,477 --> 00:58:48,730 >> AUDIENCE: So keeping it client-side. 1215 00:58:48,730 --> 00:58:49,780 >> DAVID J. MALAN: Keeping it client-side. 1216 00:58:49,780 --> 00:58:51,030 Exactly right. 1217 00:58:51,030 --> 00:58:53,240 1218 00:58:53,240 --> 00:58:55,757 Next up was my oh MySQL. 1219 00:58:55,757 --> 00:59:00,000 1220 00:59:00,000 --> 00:59:00,430 >> ROB BOWDEN: OK. 1221 00:59:00,430 --> 00:59:04,990 So this first question was generally rough for people. 1222 00:59:04,990 --> 00:59:07,270 Though the later ones went better. 1223 00:59:07,270 --> 00:59:12,260 So you had to choose the correct data types for both of these columns. 1224 00:59:12,260 --> 00:59:17,750 And both of these have some things about them that 1225 00:59:17,750 --> 00:59:20,620 make the choice difficult. 1226 00:59:20,620 --> 00:59:24,430 So int was not a valid type for number. 1227 00:59:24,430 --> 00:59:29,410 The reason being a 12-digit account number, an int is not big enough to 1228 00:59:29,410 --> 00:59:31,070 store total digits. 1229 00:59:31,070 --> 00:59:36,570 So a valid choice would have been a big int if you happen to know that. 1230 00:59:36,570 --> 00:59:42,090 Another choice could have been a char field of length 12. 1231 00:59:42,090 --> 00:59:44,560 So either of those would have worked. 1232 00:59:44,560 --> 00:59:46,100 Int would not. 1233 00:59:46,100 --> 00:59:50,170 >> Now, balance, think back to pset7. 1234 00:59:50,170 --> 00:59:59,540 So we specifically used decimal to store the value of shares or-- 1235 00:59:59,540 --> 01:00:00,550 >> DAVID J. MALAN: Cash. 1236 01:00:00,550 --> 01:00:01,060 >> ROB BOWDEN: Cash. 1237 01:00:01,060 --> 01:00:05,710 We used decimal to store the amount of cash that the user currently has. 1238 01:00:05,710 --> 01:00:10,950 So the reason we do that is because, remember, floats. 1239 01:00:10,950 --> 01:00:12,480 There's floating point in precision. 1240 01:00:12,480 --> 01:00:18,200 It can't precisely store the cash values like we want here. 1241 01:00:18,200 --> 01:00:23,630 So decimal is able to precisely store something to, say, two decimal places. 1242 01:00:23,630 --> 01:00:27,630 That's why balance, we want it to be decimal and not float. 1243 01:00:27,630 --> 01:00:30,230 >> DAVID J. MALAN: And also, too, though it might have been clever in other 1244 01:00:30,230 --> 01:00:32,760 contexts to think, maybe this is a chance for an int. 1245 01:00:32,760 --> 01:00:34,420 I'll just keep track of things in pennies. 1246 01:00:34,420 --> 01:00:38,670 Because we explicitly showed the default value of being 100.00, that 1247 01:00:38,670 --> 01:00:40,380 means it could just be an int. 1248 01:00:40,380 --> 01:00:45,310 And another subtlety too with number was that it wasn't meant 1249 01:00:45,310 --> 01:00:46,180 to be a trick question. 1250 01:00:46,180 --> 01:00:49,860 But recall that an int in MySQL, like in C, at least in the 1251 01:00:49,860 --> 01:00:51,440 appliance, is 32-bit. 1252 01:00:51,440 --> 01:00:53,960 And even though we don't expect you to know exactly how many digits that 1253 01:00:53,960 --> 01:00:56,910 means, do recall that the largest number you can represent potentially 1254 01:00:56,910 --> 01:01:00,710 with a 32-bit number is roughly what? 1255 01:01:00,710 --> 01:01:02,760 >> What number do we always say? 1256 01:01:02,760 --> 01:01:04,530 2 to the 32, which is what roughly? 1257 01:01:04,530 --> 01:01:07,492 1258 01:01:07,492 --> 01:01:08,780 You don't have to know precisely. 1259 01:01:08,780 --> 01:01:10,580 But roughly is helpful in life. 1260 01:01:10,580 --> 01:01:12,200 It's roughly 4 billion. 1261 01:01:12,200 --> 01:01:14,430 So we've said that a few times. 1262 01:01:14,430 --> 01:01:16,360 I know I have said that a few times. 1263 01:01:16,360 --> 01:01:17,670 And it is roughly 4 billion. 1264 01:01:17,670 --> 01:01:19,710 And that's a good rule of thumb to know. 1265 01:01:19,710 --> 01:01:21,880 If you have 8 bits, 256 is the magic number. 1266 01:01:21,880 --> 01:01:24,160 If you have 32 bits, 4 billion give or take. 1267 01:01:24,160 --> 01:01:27,140 So if you just write down 4 billion, you'll see that it's fewer digits than 1268 01:01:27,140 --> 01:01:30,970 12, which means that's clearly not enough expressiveness to capture a 1269 01:01:30,970 --> 01:01:34,220 12-digit account number. 1270 01:01:34,220 --> 01:01:34,940 >> ROB BOWDEN: OK. 1271 01:01:34,940 --> 01:01:38,520 So the other ones went better. 1272 01:01:38,520 --> 01:01:40,900 So suppose that the bank imposes a $20 monthly 1273 01:01:40,900 --> 01:01:42,400 maintenance fee on all accounts. 1274 01:01:42,400 --> 01:01:45,506 With what SQL query could the bank deduct $20 from every count, even if 1275 01:01:45,506 --> 01:01:47,520 it results in some negative balances? 1276 01:01:47,520 --> 01:01:50,380 So basically, there are four main types of queries-- 1277 01:01:50,380 --> 01:01:52,840 insert, select, update, and delete. 1278 01:01:52,840 --> 01:01:56,080 So what do we think we're going to use here? 1279 01:01:56,080 --> 01:01:57,000 Update. 1280 01:01:57,000 --> 01:01:58,260 >> So let's take a look. 1281 01:01:58,260 --> 01:02:04,290 1282 01:02:04,290 --> 01:02:05,870 So here we're updating. 1283 01:02:05,870 --> 01:02:09,900 What table are we updating accounts? 1284 01:02:09,900 --> 01:02:11,670 So updating accounts. 1285 01:02:11,670 --> 01:02:15,390 And then the syntax says, what in accounts are we updating? 1286 01:02:15,390 --> 01:02:19,520 Well, we're setting balance equal to the current value of balance minus 20. 1287 01:02:19,520 --> 01:02:22,860 So this will update all rows of accounts, subtracting 1288 01:02:22,860 --> 01:02:26,250 $20 from the balance. 1289 01:02:26,250 --> 01:02:29,260 >> DAVID J. MALAN: A common mistake here, even though we sometimes forgave it, 1290 01:02:29,260 --> 01:02:32,990 was to actually have PHP code here calling the query function or putting 1291 01:02:32,990 --> 01:02:35,460 quotes around everything that didn't need to be there. 1292 01:02:35,460 --> 01:02:39,780 >> ROB BOWDEN: Remember that MySQL is a separate language from PHP. 1293 01:02:39,780 --> 01:02:42,410 We happen to be writing MySQL in PHP. 1294 01:02:42,410 --> 01:02:46,180 And PHP is then sending it over to the MySQL server. 1295 01:02:46,180 --> 01:02:51,120 But you don't need PHP in order to communicate with a MySQL server. 1296 01:02:51,120 --> 01:02:51,730 >> DAVID J. MALAN: Exactly. 1297 01:02:51,730 --> 01:02:54,240 So no variables with dollar signs should be in this context. 1298 01:02:54,240 --> 01:02:59,550 It can just do all of the math within the database itself. 1299 01:02:59,550 --> 01:03:00,080 >> ROB BOWDEN: OK. 1300 01:03:00,080 --> 01:03:01,300 So the next one. 1301 01:03:01,300 --> 01:03:02,731 Is this the next one? 1302 01:03:02,731 --> 01:03:03,210 Yeah. 1303 01:03:03,210 --> 01:03:06,570 So with what SQL query could the bank retrieve the account numbers of its 1304 01:03:06,570 --> 01:03:09,300 richest customers, those with balances greater than 1,000? 1305 01:03:09,300 --> 01:03:13,280 So which of the four main types are we going to want here? 1306 01:03:13,280 --> 01:03:14,430 Select. 1307 01:03:14,430 --> 01:03:16,650 So we want to select. 1308 01:03:16,650 --> 01:03:17,610 What do we want to select? 1309 01:03:17,610 --> 01:03:19,380 What column do we want to select? 1310 01:03:19,380 --> 01:03:20,970 We will specifically want to select number. 1311 01:03:20,970 --> 01:03:23,910 But if you said star, we also accepted that. 1312 01:03:23,910 --> 01:03:25,820 >> So select number from what table? 1313 01:03:25,820 --> 01:03:26,640 Accounts. 1314 01:03:26,640 --> 01:03:28,370 And then the condition we want? 1315 01:03:28,370 --> 01:03:30,140 Where balance greater than 1,000. 1316 01:03:30,140 --> 01:03:31,720 We also accepted greater than or equal. 1317 01:03:31,720 --> 01:03:35,230 1318 01:03:35,230 --> 01:03:36,190 Last one. 1319 01:03:36,190 --> 01:03:42,940 With what SQL query could the bank close, i.e., delete every account that 1320 01:03:42,940 --> 01:03:44,480 has a balance of $0? 1321 01:03:44,480 --> 01:03:47,620 So which of the four are we going to want to use? 1322 01:03:47,620 --> 01:03:48,320 Delete. 1323 01:03:48,320 --> 01:03:50,180 So the syntax for that? 1324 01:03:50,180 --> 01:03:51,890 Delete from what table? 1325 01:03:51,890 --> 01:03:53,550 Accounts. 1326 01:03:53,550 --> 01:03:55,790 And then the condition on which we want to delete-- 1327 01:03:55,790 --> 01:03:57,280 where balance equals zero. 1328 01:03:57,280 --> 01:04:03,050 So delete all rows from accounts where the balance is zero. 1329 01:04:03,050 --> 01:04:04,300 Questions on any of these? 1330 01:04:04,300 --> 01:04:08,840 1331 01:04:08,840 --> 01:04:10,260 Want to queue? 1332 01:04:10,260 --> 01:04:11,200 >> DAVID J. MALAN: Queue guide. 1333 01:04:11,200 --> 01:04:17,110 So in this one, we gave you a somewhat familiar structure that we explored a 1334 01:04:17,110 --> 01:04:20,450 bit in class alongside of structs, which was a data 1335 01:04:20,450 --> 01:04:21,910 structure related in spirit. 1336 01:04:21,910 --> 01:04:24,670 The difference though with a queue is that we had to somehow remember who 1337 01:04:24,670 --> 01:04:27,900 was at the front of the queue, in large part so that we could make more 1338 01:04:27,900 --> 01:04:30,530 efficient use of the memory, at least if we were using an array. 1339 01:04:30,530 --> 01:04:35,460 >> Because recall, if we have an array, if, for instance, this is the front of 1340 01:04:35,460 --> 01:04:38,470 the queue, if I get into the queue here, and then someone gets in line 1341 01:04:38,470 --> 01:04:42,710 behind me, behind me, behind me, and one person steps out of line, you 1342 01:04:42,710 --> 01:04:45,930 could, as we saw some of our human volunteers in class, have everyone 1343 01:04:45,930 --> 01:04:47,100 shift this way. 1344 01:04:47,100 --> 01:04:50,880 But in general, having everyone do something is not the best use of time 1345 01:04:50,880 --> 01:04:54,600 in a program, because it means your algorithm is running in what 1346 01:04:54,600 --> 01:04:56,520 asymptotic running time? 1347 01:04:56,520 --> 01:04:57,420 It's linear. 1348 01:04:57,420 --> 01:04:59,600 >> And I feel like that's kind of stupid. 1349 01:04:59,600 --> 01:05:02,890 If the next person in line is the next person who's supposed to go into the 1350 01:05:02,890 --> 01:05:04,660 store, they don't all have to move together. 1351 01:05:04,660 --> 01:05:08,200 Just let that person be plucked off when the time comes, for instance. 1352 01:05:08,200 --> 01:05:09,870 So we can save a bit of time there. 1353 01:05:09,870 --> 01:05:14,840 And so to do that though, that means that the head of the queue or the 1354 01:05:14,840 --> 01:05:18,060 front of the queue is going to progressively move deeper and deeper 1355 01:05:18,060 --> 01:05:23,340 into the array and eventually might actually wrap around if we're using an 1356 01:05:23,340 --> 01:05:25,790 array to store the people in this queue. 1357 01:05:25,790 --> 01:05:28,390 So you can almost think of the array as a circular data 1358 01:05:28,390 --> 01:05:29,880 structure in that sense. 1359 01:05:29,880 --> 01:05:33,970 >> So you somehow have to keep track of the size of it or really the end of it 1360 01:05:33,970 --> 01:05:36,250 and then where the beginning of it is. 1361 01:05:36,250 --> 01:05:39,490 So we propose that you declare one such queue, calling 1362 01:05:39,490 --> 01:05:41,330 it q, just one letter. 1363 01:05:41,330 --> 01:05:44,570 Then we propose that the front be initialized to zero and that the size 1364 01:05:44,570 --> 01:05:45,470 be initialized to zero. 1365 01:05:45,470 --> 01:05:47,770 >> So right now, there's nothing inside of that queue. 1366 01:05:47,770 --> 01:05:50,910 And we ask you to complete the implementation of enqueue below in 1367 01:05:50,910 --> 01:05:55,250 such a way that the function adds n to the end of q and then returns true. 1368 01:05:55,250 --> 01:05:58,690 But if q is full or negative, the function should instead return false. 1369 01:05:58,690 --> 01:06:01,060 And we gave you a couple of assumptions. 1370 01:06:01,060 --> 01:06:04,320 But they're not really functionally relevant, just that bool exists, 1371 01:06:04,320 --> 01:06:06,690 because, technically, bool doesn't exist in C unless you include a 1372 01:06:06,690 --> 01:06:07,310 certain header file. 1373 01:06:07,310 --> 01:06:09,350 So that was just make sure there were no is this a trick 1374 01:06:09,350 --> 01:06:10,940 question kind of thing. 1375 01:06:10,940 --> 01:06:16,280 >> So enqueue, we proposed in the sample solutions to implement as follows. 1376 01:06:16,280 --> 01:06:20,420 One, we first check the ease, the low-hanging fruits. 1377 01:06:20,420 --> 01:06:23,820 If the queue is full or the number that you're trying to insert is less 1378 01:06:23,820 --> 01:06:26,380 than zero, which we said in the specification of the problem should 1379 01:06:26,380 --> 01:06:30,320 not be allowed, because we only want non-negative values, then you should 1380 01:06:30,320 --> 01:06:31,640 just return false immediately. 1381 01:06:31,640 --> 01:06:33,820 So some relatively easy error checking. 1382 01:06:33,820 --> 01:06:38,720 If though you want to add that actual number, you had to do a bit of 1383 01:06:38,720 --> 01:06:39,440 thinking here. 1384 01:06:39,440 --> 01:06:41,330 And this is where it's a little annoying mentally, because you have to 1385 01:06:41,330 --> 01:06:43,000 figure out how to handle wraparound. 1386 01:06:43,000 --> 01:06:46,870 >> But the germ of the idea here that's of interest to us is that wraparound 1387 01:06:46,870 --> 01:06:51,480 often implies modular arithmetic and the mod operator, the percent side, 1388 01:06:51,480 --> 01:06:55,140 where you can go from a larger value back to zero and then one and two and 1389 01:06:55,140 --> 01:06:58,650 three and then back around to zero, one and two and three and so forth 1390 01:06:58,650 --> 01:06:59,380 again and again. 1391 01:06:59,380 --> 01:07:02,880 So the way we propose doing this is that we do want to index into the 1392 01:07:02,880 --> 01:07:05,850 array called numbers where our integers lie. 1393 01:07:05,850 --> 01:07:10,740 But to get there, we first want to do whatever the size of the queue is but 1394 01:07:10,740 --> 01:07:14,080 then add to that whatever the front of the list is. 1395 01:07:14,080 --> 01:07:17,880 And the effect of that is to put us at the right position in the queue and 1396 01:07:17,880 --> 01:07:20,970 not assume that the first person in line is at the beginning, which he or 1397 01:07:20,970 --> 01:07:24,130 she absolutely could be if we were also shifting everyone. 1398 01:07:24,130 --> 01:07:26,710 But we're just creating work for ourselves if we took 1399 01:07:26,710 --> 01:07:27,800 that particular path. 1400 01:07:27,800 --> 01:07:29,330 >> So we can keep it relatively simple. 1401 01:07:29,330 --> 01:07:32,180 We do have to remember that we just added an int to the queue. 1402 01:07:32,180 --> 01:07:35,850 And then we just return true. 1403 01:07:35,850 --> 01:07:38,560 Meanwhile, in dequeue, we asked you to do the following. 1404 01:07:38,560 --> 01:07:42,260 Implement it in such a way that it dequeues, that is removes and returns, 1405 01:07:42,260 --> 01:07:44,190 the int at the front of queue. 1406 01:07:44,190 --> 01:07:46,410 To remove the int, it suffices to forget it. 1407 01:07:46,410 --> 01:07:47,650 You don't need to override its bit. 1408 01:07:47,650 --> 01:07:48,820 So it's still actually there. 1409 01:07:48,820 --> 01:07:51,930 Just like data on a hard drive, we're just ignoring the fact 1410 01:07:51,930 --> 01:07:52,970 that it's now there. 1411 01:07:52,970 --> 01:07:55,520 And if q is empty, we should instead return negative 1. 1412 01:07:55,520 --> 01:07:56,750 So this feels arbitrary. 1413 01:07:56,750 --> 01:08:01,640 Why return negative 1 instead of false? 1414 01:08:01,640 --> 01:08:02,620 Yeah. 1415 01:08:02,620 --> 01:08:05,070 >> AUDIENCE: Q is storing positive values. 1416 01:08:05,070 --> 01:08:10,950 Since you only store positive values in the q, negative is an error. 1417 01:08:10,950 --> 01:08:11,510 >> DAVID J. MALAN: OK, true. 1418 01:08:11,510 --> 01:08:14,850 So because we're only storing positive values or zero, then it's fine to 1419 01:08:14,850 --> 01:08:18,050 return a negative value as a sentinel value, a special symbol. 1420 01:08:18,050 --> 01:08:21,630 But you're rewriting history there, because the reason we're only 1421 01:08:21,630 --> 01:08:25,890 returning non-negative values is because we want to 1422 01:08:25,890 --> 01:08:27,670 have a sentinel value. 1423 01:08:27,670 --> 01:08:32,617 So more specifically, why not just return false in cases of errors? 1424 01:08:32,617 --> 01:08:33,099 Yeah. 1425 01:08:33,099 --> 01:08:35,510 >> AUDIENCE: You've failed to return an integer. 1426 01:08:35,510 --> 01:08:36,630 >> DAVID J. MALAN: Exactly. 1427 01:08:36,630 --> 01:08:38,569 And this is where C gets pretty constraining. 1428 01:08:38,569 --> 01:08:40,590 If you're saying you're going to return an int, you've got 1429 01:08:40,590 --> 01:08:41,279 to return an int. 1430 01:08:41,279 --> 01:08:43,689 You can't get fancy and start returning a bool or a float or a 1431 01:08:43,689 --> 01:08:45,040 string or something like that. 1432 01:08:45,040 --> 01:08:49,370 Now, meanwhile, JavaScript and PHP and some other languages can, in fact, 1433 01:08:49,370 --> 01:08:51,310 have you returning different types of values. 1434 01:08:51,310 --> 01:08:54,819 And that can actually be useful, where you could return positive ints, zeros, 1435 01:08:54,819 --> 01:08:59,439 negative ints, or false or null even to signify error. 1436 01:08:59,439 --> 01:09:01,890 But we don't have that versatility in C. 1437 01:09:01,890 --> 01:09:04,569 >> So with dequeue, what we propose to do is-- 1438 01:09:04,569 --> 01:09:07,350 1439 01:09:07,350 --> 01:09:09,830 >> ROB BOWDEN: You can return false. 1440 01:09:09,830 --> 01:09:13,189 It's just that false is hash define false to zero. 1441 01:09:13,189 --> 01:09:16,000 So if you return false, you're returning zero. 1442 01:09:16,000 --> 01:09:25,470 And zero is a valid thing in our queue, whereas negative 1 is not if 1443 01:09:25,470 --> 01:09:27,000 false happened to be negative 1. 1444 01:09:27,000 --> 01:09:29,972 But you shouldn't even need to know that. 1445 01:09:29,972 --> 01:09:32,399 >> DAVID J. MALAN: That's why I didn't say it. 1446 01:09:32,399 --> 01:09:36,450 >> ROB BOWDEN: But it wasn't true that you can't return false. 1447 01:09:36,450 --> 01:09:37,700 >> DAVID J. MALAN: Sure. 1448 01:09:37,700 --> 01:09:40,920 1449 01:09:40,920 --> 01:09:44,240 So dequeue, notice we accept void as its argument. 1450 01:09:44,240 --> 01:09:45,479 And that's because we're not passing anything in. 1451 01:09:45,479 --> 01:09:48,359 We just want to remove the element at the front of the queue. 1452 01:09:48,359 --> 01:09:49,819 So how might we go about doing this? 1453 01:09:49,819 --> 01:09:51,290 Well, first, let's do this quick sanity check. 1454 01:09:51,290 --> 01:09:53,350 If the queue size is 0, there's no work to be done. 1455 01:09:53,350 --> 01:09:54,210 Return negative 1. 1456 01:09:54,210 --> 01:09:54,800 Done. 1457 01:09:54,800 --> 01:09:56,340 So that's a few lines of my program. 1458 01:09:56,340 --> 01:09:58,180 So only four lines remain. 1459 01:09:58,180 --> 01:10:01,310 >> So here I decide to decrement the size. 1460 01:10:01,310 --> 01:10:04,620 And decrementing the size effectively means that I'm forgetting 1461 01:10:04,620 --> 01:10:06,010 something is in there. 1462 01:10:06,010 --> 01:10:09,910 But I also have to update where the front of the numbers are. 1463 01:10:09,910 --> 01:10:11,620 So to do that, I need to do two things. 1464 01:10:11,620 --> 01:10:16,390 I first need to remember what the number is at the front of the queue, 1465 01:10:16,390 --> 01:10:17,860 because I need to return that thing. 1466 01:10:17,860 --> 01:10:20,910 So I don't want to accidentally forget about it and then overwrite it. 1467 01:10:20,910 --> 01:10:22,840 I'm just going to remember in an int. 1468 01:10:22,840 --> 01:10:27,310 >> And now, I want to update q.front to be q.front+1. 1469 01:10:27,310 --> 01:10:30,070 So if this was the first person in line, now, I want to do plus 1 to 1470 01:10:30,070 --> 01:10:31,930 point at the next person in line. 1471 01:10:31,930 --> 01:10:33,420 But I have to handle that wraparound. 1472 01:10:33,420 --> 01:10:37,270 And if capacity is a global constant, that's going to allow me to make sure 1473 01:10:37,270 --> 01:10:41,140 as I point to the very last person in line, the modulo operation will bring 1474 01:10:41,140 --> 01:10:43,840 me back to zero at the front of the queue. 1475 01:10:43,840 --> 01:10:46,050 And that handles the wraparound here. 1476 01:10:46,050 --> 01:10:48,950 And then I proceed to return n. 1477 01:10:48,950 --> 01:10:51,530 >> Now, strictly speaking, I didn't have to declare n. 1478 01:10:51,530 --> 01:10:53,880 I didn't have to grab it and store it temporarily, because the value is 1479 01:10:53,880 --> 01:10:54,740 still there. 1480 01:10:54,740 --> 01:10:57,490 So I could just do the right arithmetic to return the former head 1481 01:10:57,490 --> 01:10:58,450 of the queue. 1482 01:10:58,450 --> 01:11:01,850 But I just felt that this was more clear to actually grab the int, put it 1483 01:11:01,850 --> 01:11:04,320 in n, and then return that for clarity's sake but 1484 01:11:04,320 --> 01:11:05,735 not strictly necessary. 1485 01:11:05,735 --> 01:11:09,313 1486 01:11:09,313 --> 01:11:12,130 Psst. 1487 01:11:12,130 --> 01:11:13,410 They're all pronounceable in my head. 1488 01:11:13,410 --> 01:11:15,940 1489 01:11:15,940 --> 01:11:19,110 >> ROB BOWDEN: So first question is the binary tree problem. 1490 01:11:19,110 --> 01:11:22,140 So first question is, we're given these numbers. 1491 01:11:22,140 --> 01:11:27,160 And we want to somehow insert them into these nodes such that it is a 1492 01:11:27,160 --> 01:11:30,110 valid binary search tree. 1493 01:11:30,110 --> 01:11:36,260 So the one thing to remember about binary search trees is that it's not 1494 01:11:36,260 --> 01:11:39,800 just that the thing to the left is less and the thing to 1495 01:11:39,800 --> 01:11:41,120 the right is greater. 1496 01:11:41,120 --> 01:11:44,580 It needs to be that the entire tree to the left is less, and the entire tree 1497 01:11:44,580 --> 01:11:45,740 to the right is greater. 1498 01:11:45,740 --> 01:11:55,260 >> So if I put 34 here at the top, and then I put 20 here, so that's valid so 1499 01:11:55,260 --> 01:11:56,970 far, because 34 up here. 1500 01:11:56,970 --> 01:11:57,920 20 is going to the left. 1501 01:11:57,920 --> 01:11:58,950 So that's less. 1502 01:11:58,950 --> 01:12:03,640 But I can't then put 59 here, because even though 59 is on the right of 20, 1503 01:12:03,640 --> 01:12:06,140 it's still on the left of 34. 1504 01:12:06,140 --> 01:12:10,760 So with that constraint in mind, the easiest way of probably solving this 1505 01:12:10,760 --> 01:12:14,330 problem is to just sort of these numbers-- 1506 01:12:14,330 --> 01:12:18,720 so 20, 34, 36, 52, 59, 106. 1507 01:12:18,720 --> 01:12:21,640 And then insert those from left to right. 1508 01:12:21,640 --> 01:12:23,390 >> So 20 goes here. 1509 01:12:23,390 --> 01:12:24,630 34 goes here. 1510 01:12:24,630 --> 01:12:25,830 36 goes here. 1511 01:12:25,830 --> 01:12:29,360 52, 59, 106. 1512 01:12:29,360 --> 01:12:34,730 And you also could have figured out with some plugging in and realizing, 1513 01:12:34,730 --> 01:12:38,830 oh, wait, I don't have enough numbers to fill this in over here. 1514 01:12:38,830 --> 01:12:42,170 So I need to reshift what my route note is going to be. 1515 01:12:42,170 --> 01:12:47,490 But notice that in the final three, if you read from left to right, it is in 1516 01:12:47,490 --> 01:12:48,740 increasing order. 1517 01:12:48,740 --> 01:12:52,150 1518 01:12:52,150 --> 01:12:56,540 >> So now, we want to declare what the struct is going to be for the 1519 01:12:56,540 --> 01:12:58,300 nodes in this tree. 1520 01:12:58,300 --> 01:13:02,720 So what do we need in a binary tree? 1521 01:13:02,720 --> 01:13:05,830 So we have a value of type int, so some int value. 1522 01:13:05,830 --> 01:13:07,220 I don't know what we called it in the solution-- 1523 01:13:07,220 --> 01:13:08,500 int n. 1524 01:13:08,500 --> 01:13:13,570 We need a pointer to the left child and a pointer to the right child. 1525 01:13:13,570 --> 01:13:17,540 So it's going to look like this. 1526 01:13:17,540 --> 01:13:20,510 And it'll actually look before when did the doubly-linked 1527 01:13:20,510 --> 01:13:25,090 list stuff, so notice-- 1528 01:13:25,090 --> 01:13:27,860 I'm going to have to scroll all the way back down to problem 11. 1529 01:13:27,860 --> 01:13:30,980 1530 01:13:30,980 --> 01:13:36,390 >> So notice it looks identical to this, except we just happen to call these 1531 01:13:36,390 --> 01:13:38,590 different names. 1532 01:13:38,590 --> 01:13:41,440 We still have an integer value and two pointers. 1533 01:13:41,440 --> 01:13:44,850 It's just that instead of treating the pointers as pointing to the next thing 1534 01:13:44,850 --> 01:13:47,955 and the previous thing, we're treating the pointers to point to a left child 1535 01:13:47,955 --> 01:13:49,205 and right child. 1536 01:13:49,205 --> 01:13:57,372 1537 01:13:57,372 --> 01:13:57,860 OK. 1538 01:13:57,860 --> 01:13:59,650 So that's our struct node. 1539 01:13:59,650 --> 01:14:03,920 And now, the only function we need to implement for this is traverse, which 1540 01:14:03,920 --> 01:14:08,320 we want to go over the tree, printing out the values of the tree in order. 1541 01:14:08,320 --> 01:14:15,241 >> So looking here, we would want to print out 20, 34, 36, 52, 59, and 106. 1542 01:14:15,241 --> 01:14:17,970 How do we accomplish that? 1543 01:14:17,970 --> 01:14:18,890 So it's pretty similar. 1544 01:14:18,890 --> 01:14:22,910 If you saw in the past exam the problem that you wanted to print out 1545 01:14:22,910 --> 01:14:25,940 the whole tree with commas in between everything, it was actually even 1546 01:14:25,940 --> 01:14:27,320 easier than that. 1547 01:14:27,320 --> 01:14:30,950 So here is the solution. 1548 01:14:30,950 --> 01:14:33,110 This was significantly easier if you did it recursively. 1549 01:14:33,110 --> 01:14:36,650 I don't know if anyone attempted to do it iteratively. 1550 01:14:36,650 --> 01:14:38,340 >> But first, we have our base case. 1551 01:14:38,340 --> 01:14:39,660 What if the root is null? 1552 01:14:39,660 --> 01:14:40,610 Then we're just going to return. 1553 01:14:40,610 --> 01:14:42,300 We don't want to print anything. 1554 01:14:42,300 --> 01:14:45,940 Else we're going to traverse recursively down. 1555 01:14:45,940 --> 01:14:48,140 Print the entire left subtree. 1556 01:14:48,140 --> 01:14:51,440 So print everything less than my current value. 1557 01:14:51,440 --> 01:14:53,930 And then I'm going to print myself. 1558 01:14:53,930 --> 01:14:57,310 And then I'm going to recurse down my entire right subtree, so everything 1559 01:14:57,310 --> 01:14:58,810 greater than my value. 1560 01:14:58,810 --> 01:15:03,870 And this is going to print out everything in order. 1561 01:15:03,870 --> 01:15:05,860 Questions on how this actually accomplishes that? 1562 01:15:05,860 --> 01:15:09,892 1563 01:15:09,892 --> 01:15:12,545 >> AUDIENCE: I have a question on the [INAUDIBLE]. 1564 01:15:12,545 --> 01:15:15,090 1565 01:15:15,090 --> 01:15:23,550 >> ROB BOWDEN: So one way of approaching any recursive problem is to just think 1566 01:15:23,550 --> 01:15:26,275 about it like you have to think about all the corner cases. 1567 01:15:26,275 --> 01:15:32,150 1568 01:15:32,150 --> 01:15:38,110 So consider that we want to print this entire tree. 1569 01:15:38,110 --> 01:15:42,030 So all we are going to focus on is this particular node-- 1570 01:15:42,030 --> 01:15:43,740 36. 1571 01:15:43,740 --> 01:15:47,420 The recursive calls, we pretend those just work. 1572 01:15:47,420 --> 01:15:54,000 So here, this recursive call to traverse, we without even thinking 1573 01:15:54,000 --> 01:15:58,640 about it, just traversing the left three, imagine that already prints 20 1574 01:15:58,640 --> 01:16:00,730 and 34 for us. 1575 01:16:00,730 --> 01:16:03,350 And then when we eventually recursively call traverse on the 1576 01:16:03,350 --> 01:16:07,890 right, that will correctly print 52, 59, and 106 for us. 1577 01:16:07,890 --> 01:16:13,620 >> So given that this can print 20, 34, and the other can print 52, 59, 108, 1578 01:16:13,620 --> 01:16:17,180 all we need to be able to do is print ourself in the middle of that. 1579 01:16:17,180 --> 01:16:21,250 So print out everything before us. 1580 01:16:21,250 --> 01:16:27,710 Print ourself, so the current node print 36, regular printf, and then 1581 01:16:27,710 --> 01:16:31,170 print everything after us. 1582 01:16:31,170 --> 01:16:32,730 >> DAVID J. MALAN: This is where recursion gets really beautiful. 1583 01:16:32,730 --> 01:16:36,270 It's this amazing leap of faith where you do the tiniest bit of work. 1584 01:16:36,270 --> 01:16:38,460 And then you let someone else do the rest. 1585 01:16:38,460 --> 01:16:40,180 And that someone else is, ironically, you. 1586 01:16:40,180 --> 01:16:44,260 1587 01:16:44,260 --> 01:16:48,360 So for serious brownie points, if you scroll up on the questions-- 1588 01:16:48,360 --> 01:16:50,530 >> ROB BOWDEN: On the questions? 1589 01:16:50,530 --> 01:16:53,490 >> DAVID J. MALAN: And down a little to the numbers, does anyone know where 1590 01:16:53,490 --> 01:16:55,190 these numbers come from? 1591 01:16:55,190 --> 01:16:56,610 >> ROB BOWDEN: I have literally no idea. 1592 01:16:56,610 --> 01:16:59,794 >> DAVID J. MALAN: They appear throughout the quiz. 1593 01:16:59,794 --> 01:17:01,150 >> AUDIENCE: Are they the same numbers? 1594 01:17:01,150 --> 01:17:01,910 >> DAVID J. MALAN: Those numbers. 1595 01:17:01,910 --> 01:17:03,260 A little Easter egg. 1596 01:17:03,260 --> 01:17:08,100 So for those of you watching online at home, if you can tell us via email to 1597 01:17:08,100 --> 01:17:12,680 heads@CS50.net what the significance of these recurring six numbers are 1598 01:17:12,680 --> 01:17:18,560 throughout Quiz 1, we will shower you with amazing attention at the final 1599 01:17:18,560 --> 01:17:21,610 lecture and a stress ball. 1600 01:17:21,610 --> 01:17:25,460 1601 01:17:25,460 --> 01:17:27,790 Nice, subtle. 1602 01:17:27,790 --> 01:17:29,570 >> ROB BOWDEN: Any last questions about anything on the quiz? 1603 01:17:29,570 --> 01:17:32,608