1 00:00:10,326 --> 00:00:12,076 >>All right, welcome to CS50. 2 00:00:12,136 --> 00:00:13,816 This is the end of week 5, 3 00:00:13,876 --> 00:00:16,526 so I felt that Monday's lecture fell a little flat, 4 00:00:16,526 --> 00:00:18,846 I wasn't really feeling it so I thought we'd try to spice things 5 00:00:18,846 --> 00:00:21,186 up perhaps today with some props. 6 00:00:21,326 --> 00:00:23,016 So I went shopping there in the square. 7 00:00:23,016 --> 00:00:26,816 But first, some good news and some bad news; 8 00:00:26,816 --> 00:00:28,066 which would you like first? 9 00:00:28,186 --> 00:00:30,556 >>Bad. 10 00:00:30,556 --> 00:00:32,036 >>I heard 1 person say good first, 11 00:00:32,036 --> 00:00:33,526 but then everyone else wants the bad news. 12 00:00:33,526 --> 00:00:35,836 Bad news, Quiz 0 is next Wednesday. 13 00:00:35,836 --> 00:00:37,726 But this is mitigated by the fact 14 00:00:37,726 --> 00:00:40,046 that there's no class Monday and there's no problems 15 00:00:40,046 --> 00:00:41,056 with next week as well. 16 00:00:41,106 --> 00:00:43,216 So realize you have a handout in your hands from outside 17 00:00:43,506 --> 00:00:45,156 that describes all of the particulars 18 00:00:45,206 --> 00:00:46,766 for Wednesday's logistics. 19 00:00:46,866 --> 00:00:49,276 Do not come here because of our size, 20 00:00:49,316 --> 00:00:51,216 we'll split up into 3 different lecture halls, 21 00:00:51,216 --> 00:00:53,716 where you actually have a decent writing surface for the quiz, 22 00:00:53,976 --> 00:00:56,166 so that's all there and it's also on the course's website. 23 00:00:56,166 --> 00:00:57,906 If you lose this piece of paper or tuning 24 00:00:57,906 --> 00:00:59,706 in afar do not have this piece of paper, 25 00:00:59,966 --> 00:01:01,216 it's under the "Quizzes" page. 26 00:01:01,446 --> 00:01:04,636 Also on the "Quizzes" page are 5 previous quizzes, 27 00:01:04,636 --> 00:01:06,056 along with their solutions. 28 00:01:06,376 --> 00:01:08,766 I know, this is probably more interesting 29 00:01:08,766 --> 00:01:10,196 than the quiz, but okay. 30 00:01:10,196 --> 00:01:13,996 And actually the segue way here was this is a wonderful way 31 00:01:13,996 --> 00:01:15,896 to waste a lot of time. 32 00:01:15,946 --> 00:01:19,816 Not only was this implemented years ago, actually in ASCII 33 00:01:19,816 --> 00:01:22,606 or ASCII Art, which is a complete tangent, isn't it? 34 00:01:22,866 --> 00:01:24,576 So this was implemented in ASCII Art. 35 00:01:24,576 --> 00:01:28,286 The relevance is that we used ASCII Art for last problem set, 36 00:01:28,336 --> 00:01:30,916 for the game of 15, but we definitely raised the bar a bit 37 00:01:30,916 --> 00:01:32,936 with Sodoku, where you're actually using a graphics 38 00:01:33,016 --> 00:01:35,106 library, called N Curses, 39 00:01:35,106 --> 00:01:37,376 albeit a very limited graphical library. 40 00:01:37,376 --> 00:01:41,816 So this thing used to live on a server that you could TelNet to. 41 00:01:41,816 --> 00:01:44,246 TelNet is like SSH, but it's unencrypted. 42 00:01:44,476 --> 00:01:47,946 So this guy, years ago made this whole animation of Star Wars. 43 00:01:47,946 --> 00:01:50,266 And now there's variants on the web and in Java applets. 44 00:01:50,396 --> 00:01:51,716 But it's pure ASCII Art 45 00:01:51,716 --> 00:01:53,806 for almost the entirety of that movie. 46 00:01:53,806 --> 00:01:54,566 So we'll link to that 47 00:01:54,566 --> 00:01:56,366 on the course's website, if you're curious. 48 00:01:56,366 --> 00:01:58,446 It's an hour or more long. 49 00:01:58,516 --> 00:01:59,816 Now, back to reality. 50 00:02:00,416 --> 00:02:02,686 Let's see, we were on the bad news. 51 00:02:02,686 --> 00:02:05,326 The bad news is that the Quiz 0 is Wednesday, 52 00:02:05,326 --> 00:02:06,666 but there's innumerable resources 53 00:02:06,666 --> 00:02:10,216 on the course's website under "Quizzes" from past quizzes. 54 00:02:10,216 --> 00:02:12,586 Realize that the material does vary somewhat year to year. 55 00:02:12,816 --> 00:02:14,936 So if you see some topic that's completely over your head, 56 00:02:14,936 --> 00:02:18,126 it may very well be that it is completely over your head, 57 00:02:18,126 --> 00:02:19,536 but we might not have covered it, either. 58 00:02:19,536 --> 00:02:21,856 So do consult the syllabus or me or a teaching fellow. 59 00:02:22,116 --> 00:02:24,766 And then also this week, on Sunday, Monday, 60 00:02:24,766 --> 00:02:27,066 Tuesday the sections will obstinately be review 61 00:02:27,066 --> 00:02:27,576 for the quiz. 62 00:02:27,816 --> 00:02:29,786 And then, this Sunday at 7 pm 63 00:02:29,786 --> 00:02:32,246 in Emerson 108 will be a course-wide review 64 00:02:32,246 --> 00:02:34,356 which is optional, but you're welcome and encouraged to go. 65 00:02:34,356 --> 00:02:36,466 It will be filmed and put online by Tuesday. 66 00:02:36,806 --> 00:02:38,576 And this will supplant the regularly 67 00:02:38,576 --> 00:02:39,506 scheduled walk-through. 68 00:02:39,566 --> 00:02:40,936 So there's no walk-through this Sunday, 69 00:02:40,936 --> 00:02:42,016 instead there's a review for this. 70 00:02:42,106 --> 00:02:44,796 And all of that is on your PDF for your reference. 71 00:02:45,006 --> 00:02:45,836 Okay, so let's see. 72 00:02:45,836 --> 00:02:46,926 We did the good news. 73 00:02:46,926 --> 00:02:49,246 No P Sets [assumed spelling], no lecture on Monday. 74 00:02:49,246 --> 00:02:51,596 Which means this last time we'll see you for a week. 75 00:02:51,926 --> 00:02:56,436 But without further ado, some stuff. 76 00:02:56,626 --> 00:02:58,426 On Monday, we talked about pointers. 77 00:02:58,426 --> 00:02:59,776 And we've been talking about pointers. 78 00:02:59,776 --> 00:03:03,626 And this is the last piece of interesting, or dare I say, 79 00:03:03,626 --> 00:03:05,696 complicated syntax that we're going to introduce 80 00:03:05,696 --> 00:03:08,436 in the concept of C. Because once you have this ability 81 00:03:08,436 --> 00:03:12,296 to manipulate memory at your will, you can do a lot 82 00:03:12,296 --> 00:03:13,466 of interesting things. 83 00:03:13,466 --> 00:03:16,856 And also thus far we've not have useful data structures. 84 00:03:17,076 --> 00:03:18,506 You've had these things called primitives. 85 00:03:18,576 --> 00:03:20,656 And primitive is an int, a char, a float. 86 00:03:20,816 --> 00:03:22,126 Any of those little things that you get 87 00:03:22,126 --> 00:03:23,286 for free in the language. 88 00:03:23,476 --> 00:03:26,556 And then we have this data structure called an array. 89 00:03:26,916 --> 00:03:30,656 But it's really just a nice conceptualization of a chunk 90 00:03:30,656 --> 00:03:33,706 of memory back to back to back to back that you can address via 91 00:03:33,706 --> 00:03:35,276 that square bracket operator. 92 00:03:35,276 --> 00:03:38,006 But today, we introduce a fundamentally more interesting 93 00:03:38,006 --> 00:03:41,396 data structure and it's more of a traditional data structure. 94 00:03:41,396 --> 00:03:43,786 Or, as people call it, an abstract data type. 95 00:03:43,786 --> 00:03:46,486 ADT, if you ever see that acronym somewhere. 96 00:03:46,766 --> 00:03:48,306 And that just means that in general, 97 00:03:48,306 --> 00:03:51,016 these more interesting data structures are things 98 00:03:51,016 --> 00:03:53,466 that have data associated with them. 99 00:03:53,526 --> 00:03:55,676 Properties, fields, call it whatever you want. 100 00:03:55,676 --> 00:03:59,466 Pieces of information about variables, much like the structs 101 00:03:59,546 --> 00:04:03,156 that we've discussed, but they also have operations related 102 00:04:03,156 --> 00:04:03,406 to them. 103 00:04:03,786 --> 00:04:04,846 So the thing about C is 104 00:04:04,846 --> 00:04:06,956 that even though we have these structs that we started using 105 00:04:06,956 --> 00:04:09,396 on Monday, just to encapsulate related information. 106 00:04:09,606 --> 00:04:11,716 Those of you who have tackled P Set 4, know that all 107 00:04:11,716 --> 00:04:15,436 of our global variables, which is tucked inside of a struct, 108 00:04:15,436 --> 00:04:17,996 because it felt a little cleaner to encapsulate them. 109 00:04:17,996 --> 00:04:21,006 And this word, encapsulate, is actually a theme in programming 110 00:04:21,006 --> 00:04:24,646 because it means to gather together like, or related, data. 111 00:04:24,936 --> 00:04:27,446 But there's no fundamental operations 112 00:04:27,656 --> 00:04:29,346 that are supported by that struct. 113 00:04:29,346 --> 00:04:32,116 There's no functions related to Sodoku's board. 114 00:04:32,266 --> 00:04:34,956 Anything you do with G dot whatever has 115 00:04:34,996 --> 00:04:36,216 to be done very manually. 116 00:04:36,316 --> 00:04:40,966 G dot X gets some value, G dot Y gets some value. 117 00:04:41,216 --> 00:04:42,726 You don't have any built-in mechanisms. 118 00:04:42,766 --> 00:04:44,476 But now that we start talking 119 00:04:44,476 --> 00:04:47,416 about these things called linked lists, 120 00:04:47,556 --> 00:04:49,816 which will address some weaknesses in arrays, 121 00:04:50,166 --> 00:04:52,946 can we actually say that there's some formal operations, 122 00:04:53,116 --> 00:04:55,606 some features of these data structures that up until now, 123 00:04:55,606 --> 00:04:56,456 we haven't really had. 124 00:04:56,846 --> 00:05:00,666 But why care at all, why not just draw the line after arrays? 125 00:05:00,666 --> 00:05:02,356 Well, arrays are only so good. 126 00:05:02,426 --> 00:05:06,526 What is a downside, in your mind already perhaps, about arrays? 127 00:05:06,526 --> 00:05:07,916 >>[inaudible audience comment] 128 00:05:07,916 --> 00:05:11,356 >>Since it's not been working very well in past lectures, 129 00:05:11,356 --> 00:05:13,406 I felt there are just some murmurings and then I fill 130 00:05:13,406 --> 00:05:14,616 in the blank with my own answer. 131 00:05:14,786 --> 00:05:16,676 So give me a hand, if you would, it's hard for me to hear you. 132 00:05:16,766 --> 00:05:18,656 Okay, black shirt here. 133 00:05:18,746 --> 00:05:20,866 Yes. You're wearing white. 134 00:05:20,866 --> 00:05:21,536 >>[inaudible audience comment] 135 00:05:21,536 --> 00:05:29,446 >>Good, so you need to know the size of the array 136 00:05:29,446 --> 00:05:30,936 that you want before you create it. 137 00:05:30,936 --> 00:05:33,356 And there's a couple of ways we can create arrays now, right? 138 00:05:33,356 --> 00:05:35,836 One is just to declare it statically 139 00:05:36,036 --> 00:05:38,046 within some function, or maybe globally. 140 00:05:38,046 --> 00:05:40,906 But statically in the sense that you start int, then the name 141 00:05:40,906 --> 00:05:42,956 of the array, open brackets, some number. 142 00:05:42,956 --> 00:05:44,706 Closed brackets, semi-colon; 143 00:05:44,706 --> 00:05:46,476 or maybe it's 2 dimensional or 3 dimensional. 144 00:05:46,476 --> 00:05:49,256 But you have to know in advance what that size is going to be. 145 00:05:49,566 --> 00:05:52,636 Or you can also allocate an array dynamically, 146 00:05:52,706 --> 00:05:56,206 which means on demand using what new function 147 00:05:56,206 --> 00:05:57,516 or what useful feature of C? 148 00:05:58,166 --> 00:05:58,736 >>Malok [phonetic spelling]. 149 00:05:58,736 --> 00:06:01,566 >>Malok. And Malok doesn't give you an array, per say, 150 00:06:01,566 --> 00:06:05,236 it just gives you a pointer to a chunk of memory that is as big 151 00:06:05,236 --> 00:06:07,026 as you ask the operating system for. 152 00:06:07,026 --> 00:06:10,236 But in the event the OS just doesn't have enough memory 153 00:06:10,236 --> 00:06:12,536 to give you; you ask for 5 gigabytes, 154 00:06:12,536 --> 00:06:13,706 and that's just unreasonable. 155 00:06:13,706 --> 00:06:14,916 What will Malok return? 156 00:06:15,046 --> 00:06:15,436 Quick review. 157 00:06:15,436 --> 00:06:18,076 So it's going to return that null value. 158 00:06:18,076 --> 00:06:20,536 So the best lesson you can hammer into your head 159 00:06:20,536 --> 00:06:21,916 when you're playing with problem set 4, 160 00:06:21,916 --> 00:06:25,126 5 and beyond is anytime you manipulate pointers in C, 161 00:06:25,446 --> 00:06:28,666 constantly, very defensively check for null pointers. 162 00:06:28,666 --> 00:06:30,456 Because otherwise these very bad things known 163 00:06:30,456 --> 00:06:32,986 as Segfalls [phonetic spelling] happen a little too often. 164 00:06:33,766 --> 00:06:37,896 So arrays require that you know in advance how big they are. 165 00:06:37,896 --> 00:06:40,506 And that might be fine because we already saw 166 00:06:40,506 --> 00:06:42,896 in that silly little registrar example that I did 167 00:06:42,896 --> 00:06:45,806 where I inputted students' names and houses and ID numbers. 168 00:06:46,136 --> 00:06:49,296 Well, I can just change the constant, called students, 169 00:06:49,466 --> 00:06:51,406 from 3 to some other value. 170 00:06:52,186 --> 00:06:56,086 So why is that not really a compelling argument that okay, 171 00:06:56,086 --> 00:06:58,266 yes, you have to declare an array's size in advance, 172 00:06:58,466 --> 00:07:00,266 but you can always change it. 173 00:07:00,266 --> 00:07:02,116 What is the downside about that approach? 174 00:07:02,756 --> 00:07:07,756 And recall in that example, there was sharp defined students 175 00:07:07,806 --> 00:07:09,196 and then it was hard coded at 3. 176 00:07:09,356 --> 00:07:10,806 But I could very easily change that, right? 177 00:07:10,806 --> 00:07:12,046 That was the whole point of the constant. 178 00:07:12,536 --> 00:07:15,516 So why is that not really a compelling argument 179 00:07:15,516 --> 00:07:17,846 in favor of arrays? 180 00:07:19,476 --> 00:07:19,636 Yes? 181 00:07:19,636 --> 00:07:19,786 >>[inaudible audience comment] 182 00:07:19,786 --> 00:07:22,006 >>Okay, so causes security vulnerably. 183 00:07:22,006 --> 00:07:24,566 So the fact that arrays are fixed to 1 length and view, 184 00:07:24,566 --> 00:07:26,926 you as the programmer have to check the length of them. 185 00:07:26,926 --> 00:07:29,746 That's the security vulnerability, sure. 186 00:07:30,126 --> 00:07:32,036 But specifically with regard to their size. 187 00:07:32,036 --> 00:07:36,686 Why can I just not say array size can be changed, 188 00:07:36,846 --> 00:07:39,506 just change the constant in your code? 189 00:07:39,506 --> 00:07:40,536 >>[inaudible audience comment] 190 00:07:40,536 --> 00:07:42,176 >>Right. So the user can't do that. 191 00:07:42,176 --> 00:07:45,146 You, the programmer, absolutely can go into your dot C file 192 00:07:45,146 --> 00:07:46,956 or dot H file, change that constant. 193 00:07:46,956 --> 00:07:48,486 But then, what do you need to do? 194 00:07:49,816 --> 00:07:50,866 Recompile, right? 195 00:07:50,866 --> 00:07:52,446 And that's not that compelling. 196 00:07:52,446 --> 00:07:56,226 Maybe now in ES50, week 5, it's pretty easy to recompile, 197 00:07:56,226 --> 00:07:58,636 it's pretty fast because we're only writing so much code. 198 00:07:58,846 --> 00:08:00,036 But if you're Microsoft, 199 00:08:00,036 --> 00:08:02,346 if you're a company that's actually written a program 200 00:08:02,346 --> 00:08:04,016 and ships it in a shrink-wrapped box, 201 00:08:04,016 --> 00:08:05,556 or a user has downloaded it. 202 00:08:05,616 --> 00:08:09,426 No, there's no chance thereafter to change anything 203 00:08:09,426 --> 00:08:11,646 in the source code unless you update the whole darn thing. 204 00:08:11,916 --> 00:08:13,246 So we need some more dynamisms. 205 00:08:13,246 --> 00:08:14,726 So Malok does give us that, 206 00:08:14,726 --> 00:08:17,536 because we can ask the user how many students are in the class. 207 00:08:17,536 --> 00:08:19,896 And then we can allocate using Malok a chunk 208 00:08:19,896 --> 00:08:21,766 of memory that's actually big enough. 209 00:08:22,426 --> 00:08:25,576 But what happens if I start up my program 210 00:08:25,576 --> 00:08:28,476 and I say I have 100 students this semester? 211 00:08:28,656 --> 00:08:30,716 So give me enough memory for 100 students. 212 00:08:30,716 --> 00:08:33,216 So 100 times the size of the student struct. 213 00:08:33,696 --> 00:08:36,376 Okay. But what if there are some late registrants 214 00:08:36,566 --> 00:08:37,566 at the university? 215 00:08:37,566 --> 00:08:39,056 And this happens in every course here. 216 00:08:39,056 --> 00:08:40,116 People add, they drop. 217 00:08:40,446 --> 00:08:42,566 But after the program's already running, 218 00:08:42,696 --> 00:08:44,216 after the program's already compiled. 219 00:08:44,486 --> 00:08:47,826 So I now have an array of size 100, what do I then do 220 00:08:48,076 --> 00:08:50,816 if while the program's running, I need to change its size? 221 00:08:50,816 --> 00:08:51,516 What are my options? 222 00:08:51,516 --> 00:08:52,306 >>[inaudible audience question] 223 00:08:52,306 --> 00:08:56,676 >>I mean intuitively, would you like to do? 224 00:08:56,706 --> 00:08:58,646 Right now, I have 100 students in an array, 225 00:08:58,956 --> 00:08:59,796 but now I'm out of space. 226 00:08:59,796 --> 00:09:02,036 So just intuitively, what would you do? 227 00:09:02,036 --> 00:09:02,416 >>[inaudible audience question] 228 00:09:02,416 --> 00:09:03,776 >>Make a new array, right? 229 00:09:03,776 --> 00:09:07,226 So use Malok again, whatever technique you used the first 230 00:09:07,226 --> 00:09:09,236 time, make a new array, maybe err on the side 231 00:09:09,236 --> 00:09:10,546 of safety, make it 200. 232 00:09:10,546 --> 00:09:12,986 Even though you might waste some memory, and then what do I do? 233 00:09:12,986 --> 00:09:15,596 I have an array that's 100, and an array that's 200. 234 00:09:15,596 --> 00:09:17,406 What do I do? 235 00:09:17,406 --> 00:09:17,646 >>[inaudible audience question] 236 00:09:17,646 --> 00:09:19,446 >>So copy one into the other, 237 00:09:19,446 --> 00:09:20,546 that's a simple foreloop [assumed spelling], 238 00:09:20,546 --> 00:09:22,016 wild loop [assumed spelling], whatever you want to use. 239 00:09:22,016 --> 00:09:24,316 Then what should I do? 240 00:09:24,316 --> 00:09:24,556 >>[inaudible audience question] 241 00:09:24,556 --> 00:09:26,466 >>So free the first array, right? 242 00:09:26,466 --> 00:09:29,236 And you can do this in relatively few steps. 243 00:09:29,236 --> 00:09:32,456 We just rattled off maybe 5, 6 maybe max 10 lines 244 00:09:32,456 --> 00:09:34,396 of code to call Malok again. 245 00:09:34,736 --> 00:09:37,306 To use a loop to copy the contents of the old array 246 00:09:37,306 --> 00:09:40,236 into the new; and then to call free on the original array. 247 00:09:40,426 --> 00:09:43,316 But what do I probably need to keep track of now? 248 00:09:43,546 --> 00:09:45,716 Arrays are just a chunk of memory, 249 00:09:45,716 --> 00:09:46,986 what do I need to keep track of? 250 00:09:47,546 --> 00:09:51,296 If I've now added some more students to this bigger array. 251 00:09:52,416 --> 00:09:53,786 There's murmurings, what? 252 00:09:55,476 --> 00:09:56,576 Okay, the size of it, right? 253 00:09:56,576 --> 00:09:58,916 So this is one of the downsides, upsides, 254 00:09:58,916 --> 00:10:00,056 however you want to view it to see. 255 00:10:00,176 --> 00:10:02,956 The burden is on you to remember the size 256 00:10:02,956 --> 00:10:05,936 of some integer called N. All right, but this is annoying 257 00:10:05,936 --> 00:10:08,496 and we saw this ability in get strings. 258 00:10:08,566 --> 00:10:11,646 You might not recall the CS50 source code, but you do have it 259 00:10:11,646 --> 00:10:13,016 and at this point in the semester, 260 00:10:13,016 --> 00:10:14,736 probably pretty accessible, reading it top 261 00:10:14,766 --> 00:10:16,396 to bottom again at home. 262 00:10:16,706 --> 00:10:19,136 But our get strength function is exactly this. 263 00:10:19,136 --> 00:10:19,666 We took a guess. 264 00:10:19,816 --> 00:10:23,356 We allocated, I forget what the constant is, 128 bytes. 265 00:10:23,456 --> 00:10:25,966 Something arbitrary, but relatively big. 266 00:10:26,306 --> 00:10:29,176 And then we have a loop that says if the user types 267 00:10:29,176 --> 00:10:33,026 in 129th character, uh-oh, let's call a new function 268 00:10:33,536 --> 00:10:37,926 and we actually slightly more fancy use reallocation, 269 00:10:38,176 --> 00:10:41,236 which actually does that copying for you, so you don't need 270 00:10:41,236 --> 00:10:42,266 to resort to the foreloop. 271 00:10:42,466 --> 00:10:46,026 But the same idea this is arguably annoying. 272 00:10:46,026 --> 00:10:48,166 Every time you want to add something to the array, 273 00:10:48,166 --> 00:10:49,426 you have to check its length. 274 00:10:49,576 --> 00:10:51,246 Too small reallocate an array. 275 00:10:51,516 --> 00:10:53,556 That's a lot of minutiae just 276 00:10:53,606 --> 00:10:55,836 to add something simple to your array. 277 00:10:56,106 --> 00:10:58,216 So it turns out that there are data structures 278 00:10:58,246 --> 00:10:59,596 that address this. 279 00:10:59,766 --> 00:11:01,916 The downside of an array is that it's fixed eyes. 280 00:11:02,196 --> 00:11:04,766 Well, why don't we craft a data structure that's 281 00:11:04,766 --> 00:11:05,976 of variable size? 282 00:11:05,976 --> 00:11:08,766 That grows when we want it to, and we don't have to jump 283 00:11:08,766 --> 00:11:11,256 through all these stupid hoops of copying big chunks 284 00:11:11,256 --> 00:11:13,446 of memory again and again and again just 285 00:11:13,476 --> 00:11:15,016 because we didn't have the foresight to know 286 00:11:15,016 --> 00:11:17,096 in advance how big this thing needs to be. 287 00:11:17,346 --> 00:11:19,006 So we can express this data type, 288 00:11:19,006 --> 00:11:23,146 frankly using a pretty clear picture. 289 00:11:23,406 --> 00:11:25,866 So if I have an array, it looks like this. 290 00:11:25,866 --> 00:11:29,216 And this is of size N or in this case size 4. 291 00:11:29,216 --> 00:11:31,426 So I put a bunch of new students in here. 292 00:11:31,666 --> 00:11:34,686 We've got Students 0, 1, 2, 3. 293 00:11:34,796 --> 00:11:37,166 But now I'm screwed if I want to add more students here. 294 00:11:37,396 --> 00:11:39,756 But this new data structure that we'll call a linked list 295 00:11:39,856 --> 00:11:44,596 and is much more compelling is something 296 00:11:44,596 --> 00:11:45,466 that might look like this. 297 00:11:45,706 --> 00:11:46,986 Well, when I have one student, 298 00:11:47,226 --> 00:11:50,816 let's allocate a student object a student struct in C speak. 299 00:11:51,046 --> 00:11:52,286 Well, if I have a second student, 300 00:11:52,406 --> 00:11:53,726 let's create another one. 301 00:11:53,776 --> 00:11:55,336 Another student, another one. 302 00:11:55,336 --> 00:11:56,756 Another student, let's create another one. 303 00:11:56,996 --> 00:11:59,076 And then inside each of these structs, 304 00:11:59,156 --> 00:12:00,986 just as in my example last week. 305 00:12:00,986 --> 00:12:03,756 I had an ID field and I had a name field 306 00:12:04,056 --> 00:12:06,416 and I had a house field and repeat 307 00:12:06,416 --> 00:12:07,796 that for each of these guys here. 308 00:12:07,796 --> 00:12:09,146 I can do the same thing here. 309 00:12:09,436 --> 00:12:13,486 I can do ID, I can do name, I can do house. 310 00:12:13,676 --> 00:12:18,296 But, because I've called Malok for each and every one 311 00:12:18,296 --> 00:12:19,876 of these students on demand. 312 00:12:20,336 --> 00:12:23,096 I need to somehow keep track of where they are in memory. 313 00:12:23,096 --> 00:12:24,506 Because you call Malok, the stuff you get back, 314 00:12:24,506 --> 00:12:25,716 the pointer you get back is not guaranteed, 315 00:12:25,716 --> 00:12:27,536 in fact the pointer you get back does not guarantee it to be back 316 00:12:27,536 --> 00:12:28,556 to back to back to back. 317 00:12:28,556 --> 00:12:30,826 Maybe by chance it will be, especially if we call it 318 00:12:30,826 --> 00:12:33,366 in rapid succession because no one else is using that memory, 319 00:12:33,636 --> 00:12:34,946 but it's no guarantee. 320 00:12:35,226 --> 00:12:37,976 So what you get back every time you call Malok is a pointer 321 00:12:37,976 --> 00:12:41,246 to this thing, a pointer to this thing, a pointer to this thing. 322 00:12:41,246 --> 00:12:41,776 And again. 323 00:12:42,086 --> 00:12:43,856 And now we seem to have a downside. 324 00:12:43,856 --> 00:12:47,546 An array is previously we could remember by way of 1 pointer 325 00:12:47,826 --> 00:12:48,956 or by way of their name. 326 00:12:48,956 --> 00:12:50,406 But how many pointers do I need now 327 00:12:50,406 --> 00:12:52,006 to keep track of my 4 students? 328 00:12:53,116 --> 00:12:53,896 So it seems 4, right? 329 00:12:53,896 --> 00:12:55,416 4 unique addresses, they're not necessarily continuous, 330 00:12:55,416 --> 00:12:56,356 which means I can't use pointer arithmetic 331 00:12:56,356 --> 00:12:57,446 because I got back 4 unique addresses, 332 00:12:57,446 --> 00:12:57,956 they're not contiguous, 333 00:12:57,956 --> 00:13:00,256 which means I can't use pointer arithmetic, 334 00:13:00,256 --> 00:13:03,476 I can't use square bracket notation because that assumes 335 00:13:03,706 --> 00:13:06,366 that requires a leap of faith that everything is laid 336 00:13:06,366 --> 00:13:08,206 out contiguously where it's expected. 337 00:13:08,616 --> 00:13:11,386 So I intentionally left this extra field here. 338 00:13:11,836 --> 00:13:13,916 So if we just are a little clever, 339 00:13:13,916 --> 00:13:17,406 and define our student struct as not only having 3 fields 340 00:13:17,406 --> 00:13:19,266 that we really need, but a 4th field, 341 00:13:19,566 --> 00:13:21,736 maybe we can just string these things together. 342 00:13:21,766 --> 00:13:23,736 We can daisy chain them, if you've heard the word. 343 00:13:24,036 --> 00:13:25,936 So we have a pointer that yes, 344 00:13:26,156 --> 00:13:27,596 points to the very first student. 345 00:13:27,986 --> 00:13:28,556 But you know what? 346 00:13:28,556 --> 00:13:30,796 We don't need pointers separate, 347 00:13:31,036 --> 00:13:33,656 why don't we just include them in our struct? 348 00:13:33,656 --> 00:13:37,526 And somehow link these things together in this fashion. 349 00:13:37,526 --> 00:13:39,306 And the only thing I have to be really careful 350 00:13:39,306 --> 00:13:42,046 about right now is what? 351 00:13:42,256 --> 00:13:43,056 What goes here? 352 00:13:43,746 --> 00:13:46,546 So you need some sentinel value 353 00:13:46,546 --> 00:13:48,626 that will probably be null in this case. 354 00:13:48,626 --> 00:13:50,666 Anytime you need the absence of a pointer, 355 00:13:50,666 --> 00:13:51,676 or null, it's important. 356 00:13:51,786 --> 00:13:53,516 If I hadn't done that, what might happen 357 00:13:53,516 --> 00:13:56,256 if I traversed this list as by in the foreloop 358 00:13:56,256 --> 00:13:58,236 or wild loop just following these pointers. 359 00:13:58,236 --> 00:14:00,556 And how you follow the pointers we'll see some tactically today. 360 00:14:01,636 --> 00:14:05,206 What could happen if this is not null, but some unknown value? 361 00:14:05,206 --> 00:14:07,486 Garbage values, as we keep calling them? 362 00:14:07,931 --> 00:14:09,931 >>[inaudible audience question] 363 00:14:10,376 --> 00:14:11,666 >>A little louder? 364 00:14:11,666 --> 00:14:12,506 >>[inaudible audience question] 365 00:14:12,506 --> 00:14:17,396 >>Yes, you could follow the pointer that just so happens 366 00:14:17,426 --> 00:14:20,286 to look like a number but really leads nowhere useful. 367 00:14:20,286 --> 00:14:22,226 Nowhere you own and so you get 368 00:14:22,226 --> 00:14:23,426 that typical segfall [phonetic spelling]. 369 00:14:23,426 --> 00:14:25,546 All right, so let's try to frame this specifically. 370 00:14:25,866 --> 00:14:29,026 All right, so this, I would propose is a new struct 371 00:14:29,026 --> 00:14:29,516 that we use. 372 00:14:29,516 --> 00:14:30,206 And let's simplify. 373 00:14:30,206 --> 00:14:31,656 Let's not use student objects just 374 00:14:31,656 --> 00:14:33,526 yet because there's a whole bunch of additional data 375 00:14:33,526 --> 00:14:35,466 which feels a little complicated to keep track of. 376 00:14:35,726 --> 00:14:38,486 Let's reign it in, assume we're storing numbers, just the IDs. 377 00:14:38,486 --> 00:14:41,796 So I'm going to use that same type of feature before 378 00:14:41,796 --> 00:14:43,796 because I want to call this thing a node. 379 00:14:44,066 --> 00:14:46,286 And so in general, anytime you have these graph-like 380 00:14:46,286 --> 00:14:49,106 structures, these list-like structures in Computer Science 381 00:14:49,106 --> 00:14:50,786 or Math are generally called nodes. 382 00:14:51,166 --> 00:14:53,896 So we'll call this thing a node using the same syntax 383 00:14:53,896 --> 00:14:54,606 as last week. 384 00:14:55,036 --> 00:14:57,386 But, notice 1 difference here. 385 00:14:57,386 --> 00:14:59,326 Last week, when I typed up something, 386 00:14:59,606 --> 00:15:01,996 I didn't type deaf struct. 387 00:15:01,996 --> 00:15:05,046 Name of the thing and then the curly braces the name 388 00:15:05,046 --> 00:15:07,426 of the thing again at the end. 389 00:15:07,476 --> 00:15:10,356 But in structure, notice how 1 of the fields, 390 00:15:10,566 --> 00:15:12,836 even though this is the student version, notice how 1 391 00:15:12,836 --> 00:15:17,666 of the fields needs to point to in object of the same type 392 00:15:17,666 --> 00:15:20,066 as the thing storing this pointer. 393 00:15:20,366 --> 00:15:23,596 In other words, we need a way to be self-referential 394 00:15:23,686 --> 00:15:27,516 and this just means that using the syntax of C if you want 395 00:15:27,516 --> 00:15:29,256 to store inside of a struct, 396 00:15:29,656 --> 00:15:32,816 a pointer to that same kind of struct. 397 00:15:33,226 --> 00:15:34,926 You just have to append its name here, 398 00:15:34,926 --> 00:15:36,336 even though it looks somewhat redundant. 399 00:15:36,446 --> 00:15:39,596 So just take it on faith for now, declare a struct 400 00:15:39,596 --> 00:15:43,176 that is inside familiar syntax and if we want to have a pointer 401 00:15:43,176 --> 00:15:45,296 that leads elsewhere, we simply define it 402 00:15:45,296 --> 00:15:47,366 as struct node star next. 403 00:15:47,696 --> 00:15:49,556 Although, next can be called anything we want. 404 00:15:49,716 --> 00:15:51,536 So what does this mean specifically? 405 00:15:51,756 --> 00:15:53,906 This just means that I declared a structure 406 00:15:54,266 --> 00:15:56,216 that in general is going to look like this, 407 00:15:56,216 --> 00:16:00,116 where this field is called N. This field is called next 408 00:16:00,216 --> 00:16:02,246 and the data type of this thing is what? 409 00:16:02,366 --> 00:16:05,226 [inaudible] That's just an end and this thing, 410 00:16:06,486 --> 00:16:11,946 a pointer to a struct node. 411 00:16:12,146 --> 00:16:15,096 So a pointer to a struct node, or a pointer to a node 412 00:16:15,096 --> 00:16:16,746 if we just want to just simplify. 413 00:16:16,796 --> 00:16:19,576 All right, let's just take a look at what we can do with this 414 00:16:19,576 --> 00:16:20,976 and why we should actually care. 415 00:16:21,046 --> 00:16:22,996 Let me go ahead and make something called List 1, 416 00:16:23,256 --> 00:16:26,236 this is the same printout from Monday, if you have your handout 417 00:16:26,316 --> 00:16:27,576 with you, or there's extras outside. 418 00:16:27,576 --> 00:16:29,686 I'm going to go ahead and run List 1 419 00:16:29,896 --> 00:16:32,366 and to demonstrate this use of linked lists I had 420 00:16:32,366 --> 00:16:34,396 to write a quick and dirty program. 421 00:16:34,636 --> 00:16:38,336 Because just coding up linked lists in no context really isn't 422 00:16:38,336 --> 00:16:43,376 that useful, but if instead run this version here, 423 00:16:43,986 --> 00:16:46,596 what I decided to do was turn it into a little bit 424 00:16:46,596 --> 00:16:48,136 of an interactive interface. 425 00:16:48,316 --> 00:16:50,826 So what's just executed as my main routine. 426 00:16:51,026 --> 00:16:52,726 My main routine has some printout stuff 427 00:16:52,726 --> 00:16:54,406 and it just prints out this quick and dirty menu 428 00:16:54,406 --> 00:16:56,926 when I map numbers to different operations. 429 00:16:57,256 --> 00:17:00,666 So a linked list, as this picture will be generally 430 00:17:00,666 --> 00:17:04,086 called, has a bunch of operations that are inherent 431 00:17:04,166 --> 00:17:05,346 to it, or related to it. 432 00:17:05,596 --> 00:17:07,066 The things people like to do 433 00:17:07,066 --> 00:17:10,796 with linked lists is find things inside of them. 434 00:17:10,976 --> 00:17:13,576 So in this case, I might want to find an integer in this list 435 00:17:13,576 --> 00:17:16,336 to answer questions in the form is this student in the list? 436 00:17:16,406 --> 00:17:17,916 Is this integer in the list? 437 00:17:18,116 --> 00:17:19,986 That's find, otherwise known as search 438 00:17:19,986 --> 00:17:21,096 for any number of other things. 439 00:17:21,456 --> 00:17:24,396 Traverse just means start at the beginning, go to the end 440 00:17:24,396 --> 00:17:27,066 and maybe do something along the way, generally print things 441 00:17:27,066 --> 00:17:29,286 out so that would be the traverse operation. 442 00:17:29,666 --> 00:17:33,286 Insert, as the word implies, means add a number to the list, 443 00:17:33,706 --> 00:17:35,876 but here's where things will get a little interesting 444 00:17:35,876 --> 00:17:39,216 because if you have this string-like structure like this, 445 00:17:39,646 --> 00:17:43,416 inserting can actually be tricky in some situations. 446 00:17:43,456 --> 00:17:45,496 Because if you want to keep this list sorted, 447 00:17:45,726 --> 00:17:46,706 which you probably do 448 00:17:46,706 --> 00:17:48,626 for efficiency reasons, as we've seen. 449 00:17:48,626 --> 00:17:50,556 Can actually search it a little more effectively. 450 00:17:51,006 --> 00:17:53,726 We might want to put the new students here. 451 00:17:53,946 --> 00:17:55,486 That looks like it's pretty easy. 452 00:17:55,486 --> 00:17:58,236 I'll take this arrow and some other arrow in here. 453 00:17:58,556 --> 00:18:00,056 But if it's sandwiched in the middle, 454 00:18:00,216 --> 00:18:01,996 it feels like there's going to be some more code 455 00:18:01,996 --> 00:18:04,256 and some more movement of pointers. 456 00:18:04,606 --> 00:18:06,976 If it's at the end, again, it's a different case. 457 00:18:07,416 --> 00:18:09,926 So this simple idea of inserting a number 458 00:18:10,186 --> 00:18:12,906 into a sorted length list can actually be broken 459 00:18:12,906 --> 00:18:14,296 down into 3 cases. 460 00:18:14,436 --> 00:18:15,616 Beginning, middle, end. 461 00:18:15,856 --> 00:18:17,556 And hopefully we can generalize in that way. 462 00:18:17,826 --> 00:18:19,646 And delete, the same thing. 463 00:18:19,646 --> 00:18:22,096 If I want to delete an arbitrary student or link from the list, 464 00:18:22,536 --> 00:18:25,126 I might have to treat the front guy a little differently 465 00:18:25,126 --> 00:18:27,686 from the end guy because they look different 466 00:18:27,756 --> 00:18:29,096 from everything else in the middle. 467 00:18:29,336 --> 00:18:31,416 So this is consistent with this idea we've been preaching 468 00:18:31,416 --> 00:18:32,016 of design. 469 00:18:32,016 --> 00:18:35,326 How can you break a problem down to its constituent parts? 470 00:18:35,326 --> 00:18:37,996 Well, there's 3 cases, beginning, middle, end. 471 00:18:37,996 --> 00:18:39,726 And that's precisely what we'll do. 472 00:18:39,726 --> 00:18:41,816 So right now my length list, according to this program, 473 00:18:41,816 --> 00:18:42,966 has nothing in so I'm going 474 00:18:42,966 --> 00:18:45,606 to go ahead an insert by typing, "3." 475 00:18:45,606 --> 00:18:50,056 It's going to ask me a number to insert and I'll say 50, enter. 476 00:18:50,446 --> 00:18:51,806 So list is now 50. 477 00:18:51,806 --> 00:18:53,756 So this is a very ugly, quick and dirty program 478 00:18:53,756 --> 00:18:55,816 that I just spit out what this list currently is. 479 00:18:56,106 --> 00:18:58,046 Let me insert another number. 480 00:18:58,046 --> 00:18:59,206 Let me go ahead and insert. 481 00:18:59,626 --> 00:19:01,566 Let's say the number 61, enter. 482 00:19:01,936 --> 00:19:03,346 Program seems to be working. 483 00:19:03,346 --> 00:19:05,836 I've inserted 50, then I inserted 61. 484 00:19:06,056 --> 00:19:07,476 And maybe it was just appended, 485 00:19:07,476 --> 00:19:09,156 but hopefully it's actually been appended 486 00:19:09,156 --> 00:19:11,106 in sorted order, let's try again. 487 00:19:11,156 --> 00:19:14,466 Insert 51, which will hopefully end up in the middle. 488 00:19:14,466 --> 00:19:17,066 And yes, indeed, and we can continue along this way 489 00:19:17,066 --> 00:19:19,096 and we'll insert 1 more for demonstration's sake. 490 00:19:19,416 --> 00:19:23,016 And we seem to have a running list that if it's consistent 491 00:19:23,016 --> 00:19:24,876 with what we're talking about here, is growing. 492 00:19:25,276 --> 00:19:26,746 And is growing, and is growing. 493 00:19:26,976 --> 00:19:28,936 So a very underwhelming program. 494 00:19:28,936 --> 00:19:30,166 We could have implemented this a week, 495 00:19:30,246 --> 00:19:33,636 2 weeks ago just using an array but if I start inputting 496 00:19:33,636 --> 00:19:34,896 in enough numbers, as you pointed out, 497 00:19:34,896 --> 00:19:37,766 eventually we're going to hit a hard coded limit 498 00:19:38,226 --> 00:19:42,046 at which point you have to do this somewhat annoying, 499 00:19:42,046 --> 00:19:45,656 somewhat time-consuming process of creating new array, 500 00:19:45,926 --> 00:19:48,996 copying the old into the new and then repeat. 501 00:19:49,336 --> 00:19:51,546 Now, give the examples we've discussed. 502 00:19:51,606 --> 00:19:54,216 Students and the example last week, 503 00:19:54,466 --> 00:19:55,706 it's really not such a big deal. 504 00:19:55,746 --> 00:19:59,076 Looping from 0 to 3 or 0 to 10 or even 0 505 00:19:59,076 --> 00:20:01,486 to 1,000 is really not all that problematic. 506 00:20:01,946 --> 00:20:07,316 So why do you think that this constant resizing 507 00:20:07,316 --> 00:20:14,126 of arrays might actually be a bad thing? 508 00:20:14,126 --> 00:20:16,886 Why bother introducing what seems to be more complexity, 509 00:20:16,886 --> 00:20:18,976 even though you could have solved this problem last week 510 00:20:19,036 --> 00:20:20,766 using the semantics of last week? 511 00:20:21,416 --> 00:20:23,366 At what point does that approach 512 00:20:23,736 --> 00:20:27,006 to dynamism growth become a problem, do you think? 513 00:20:27,006 --> 00:20:28,376 >>[inaudible audience question] 514 00:20:28,376 --> 00:20:28,506 >>Yes. 515 00:20:28,506 --> 00:20:28,776 >>[inaudible audience question] 516 00:20:28,776 --> 00:20:29,596 >>Exactly. 517 00:20:31,056 --> 00:20:38,176 When the time actually gets noticeable, either to your CPU 518 00:20:38,176 --> 00:20:39,596 or worse to the human. 519 00:20:39,736 --> 00:20:41,536 So a number of you have posted questions 520 00:20:41,536 --> 00:20:44,746 on the course's bulletin board about problem set 4 in Sodoku, 521 00:20:45,016 --> 00:20:48,406 like, is it okay to exhaustively check the whole board this 9 522 00:20:48,406 --> 00:20:50,536 by 9 grid that's 81 separate cells. 523 00:20:50,806 --> 00:20:54,456 It feels like a lot but again you have a billion hertz 524 00:20:54,456 --> 00:20:55,286 at your disposal. 525 00:20:55,286 --> 00:20:57,066 You have a gigahertz, 2 gigahertz. 526 00:20:57,066 --> 00:20:59,996 Checking a 9 by 9 grid that's really meaningless. 527 00:20:59,996 --> 00:21:02,936 And we saw as much when we talked about amitotic behavior 528 00:21:02,936 --> 00:21:05,586 and we did some searches and sorts for small data sets. 529 00:21:05,896 --> 00:21:09,506 The running time is fairly negligible, but these days, 530 00:21:09,506 --> 00:21:11,556 and after a class like this, you're going to start playing 531 00:21:11,556 --> 00:21:13,966 with much more interesting data sets, whether it's 532 00:21:13,966 --> 00:21:16,466 because of computer science, or you go back to your bio labs 533 00:21:16,466 --> 00:21:19,376 or your chem labs, where you actually had interesting sized 534 00:21:19,376 --> 00:21:19,936 data sets. 535 00:21:20,226 --> 00:21:23,766 Again this naïve approaches, these week 2 approaches 536 00:21:23,966 --> 00:21:27,546 of storing things contiguously in memory probably aren't going 537 00:21:27,546 --> 00:21:29,986 to cut it because you're going to start to incur the costs 538 00:21:29,986 --> 00:21:32,656 of time, of space, of CPU cycles. 539 00:21:32,756 --> 00:21:34,406 All of that begins to add up. 540 00:21:34,406 --> 00:21:37,436 So, how much do we go about doing this? 541 00:21:37,436 --> 00:21:38,456 Let's take a look. 542 00:21:38,456 --> 00:21:43,526 In this code is list 1 dot C. And we have the following setup. 543 00:21:43,526 --> 00:21:45,836 Let's just glance at C for the moment. 544 00:21:46,336 --> 00:21:48,616 So main looks like it's pretty simple here. 545 00:21:48,706 --> 00:21:51,146 Print def I have some new lines, and this is a little trick. 546 00:21:51,146 --> 00:21:52,706 If you've never picked up on this before, 547 00:21:52,706 --> 00:21:54,596 or haven't seen it in a TF do it. 548 00:21:54,896 --> 00:21:58,556 If you have a multi-line string, it's not the best thing 549 00:21:58,556 --> 00:22:02,216 to call print def, for instance 6 or 6 times here 550 00:22:02,406 --> 00:22:04,356 because anytime you recall a function, 551 00:22:04,356 --> 00:22:06,426 you actually incur a bit of cost because of all 552 00:22:06,426 --> 00:22:07,956 of the stack frame stuff that we discussed. 553 00:22:07,956 --> 00:22:11,526 So you can actually call print def 1 as a multi-line string 554 00:22:11,526 --> 00:22:14,206 and notice I have not put semicolons at the ends of any 555 00:22:14,206 --> 00:22:16,166 of these lines but I have closed the strings 556 00:22:16,416 --> 00:22:17,656 in double quotes each time. 557 00:22:17,656 --> 00:22:19,186 All right, so scrolling down, 558 00:22:19,186 --> 00:22:21,706 it looks like I'm using CS50's library, telling the user 559 00:22:21,886 --> 00:22:22,826 to give me the command. 560 00:22:23,056 --> 00:22:25,886 I'm using the perhaps now familiar script construct, 561 00:22:25,886 --> 00:22:27,076 though I could have used an if-else. 562 00:22:27,076 --> 00:22:30,256 I have 4 cases, I call a specifically named function 563 00:22:30,256 --> 00:22:33,236 that's in the case that we need to delete, find, 564 00:22:33,236 --> 00:22:34,416 insert or traverse the list 565 00:22:34,416 --> 00:22:36,846 and then again we do this wild construct 566 00:22:36,896 --> 00:22:38,426 because it's a user interface. 567 00:22:38,736 --> 00:22:42,486 Well, let's see a hint of the notation here. 568 00:22:42,486 --> 00:22:44,436 It looks like I'm doing some stuff with pointers 569 00:22:44,436 --> 00:22:46,596 at the very end, just to free the whole list. 570 00:22:46,596 --> 00:22:50,256 And that's consistent with this idea of managing our own memory. 571 00:22:50,576 --> 00:22:53,866 But how do you represent linked list in the first place? 572 00:22:54,336 --> 00:22:56,676 Well, if you have an empty list, 573 00:22:57,976 --> 00:22:59,686 how can you represent an empty list? 574 00:22:59,686 --> 00:23:00,826 What primitive do you need, 575 00:23:00,826 --> 00:23:03,026 I potentially left the answer on the board. 576 00:23:03,446 --> 00:23:06,546 How do you represent an empty list, what do you need? 577 00:23:06,546 --> 00:23:07,476 Do you need a struct? 578 00:23:07,476 --> 00:23:10,416 Do you need to call Malok to represent a list of size 0? 579 00:23:11,546 --> 00:23:12,926 No, you just need what? 580 00:23:12,926 --> 00:23:13,446 >>[inaudible audience question] 581 00:23:13,446 --> 00:23:15,286 >>So a null pointer. 582 00:23:15,336 --> 00:23:16,336 Where can you store it? 583 00:23:16,676 --> 00:23:19,206 Looks like in this program we've decided globally 584 00:23:19,206 --> 00:23:20,946 to declare a pointer called first. 585 00:23:21,186 --> 00:23:22,666 It's a type of node star, 586 00:23:22,716 --> 00:23:25,026 which means that this thing is just going to hold an address 587 00:23:25,026 --> 00:23:29,196 of the thing that it's going to point to, is going to be a node. 588 00:23:29,316 --> 00:23:31,096 So that's all that's saying, and it's consistent 589 00:23:31,096 --> 00:23:33,636 with last week's definition of pointers and we're going 590 00:23:33,636 --> 00:23:34,676 to initial it to null. 591 00:23:34,836 --> 00:23:37,956 Which is to say, in the context of a data structure like this, 592 00:23:37,956 --> 00:23:40,566 we simply want to create a new list 593 00:23:41,646 --> 00:23:45,786 or represent a new list we won't have had occasion 594 00:23:45,786 --> 00:23:48,586 to call Malok yet, so we won't have an valid addresses, 595 00:23:48,916 --> 00:23:51,456 so we simply need to declare 32 bits of memory. 596 00:23:51,456 --> 00:23:53,326 We'll call this apparently first. 597 00:23:54,676 --> 00:23:55,956 And what's going to be inside of it? 598 00:23:56,036 --> 00:23:58,236 Well, unless we assign it a value, 599 00:23:58,286 --> 00:23:59,606 it's going to have junk in it. 600 00:23:59,836 --> 00:24:02,386 Which is why in this case, we've initialized it to null. 601 00:24:02,476 --> 00:24:02,916 So that's it. 602 00:24:03,296 --> 00:24:04,976 So now think just conceptually. 603 00:24:04,976 --> 00:24:07,136 When I call the insert function. 604 00:24:07,246 --> 00:24:09,686 When I type in the number, what was it? 605 00:24:09,816 --> 00:24:12,856 3, the insert function is called and I type 606 00:24:12,856 --> 00:24:14,276 in thereafter the number 50. 607 00:24:14,386 --> 00:24:16,466 So that's what I did in the little demo a moment ago. 608 00:24:16,786 --> 00:24:19,206 What is the insert function probably doing? 609 00:24:19,636 --> 00:24:24,446 Step 1. The insert function, based on the behavior 610 00:24:24,446 --> 00:24:25,756 that we saw, is doing what first? 611 00:24:26,316 --> 00:24:32,946 And just to remind, so if I run this linked list program, 612 00:24:33,436 --> 00:24:37,116 and I type in 3, what's it doing first? 613 00:24:38,726 --> 00:24:41,706 So it's doing a print def, and then I'll only jot 614 00:24:41,706 --> 00:24:43,646 down the more interesting 1, it's doing a get in. 615 00:24:43,966 --> 00:24:46,256 Right, it's probably using the CS50 library also. 616 00:24:46,256 --> 00:24:47,956 And here's where I typed in 50. 617 00:24:48,336 --> 00:24:50,036 So what did it do with this value? 618 00:24:50,356 --> 00:24:52,336 Get it? It probably stored it in, 619 00:24:52,376 --> 00:24:54,206 and I'll write slightly more complete code, 620 00:24:54,206 --> 00:24:58,036 it probably stored it in a local variable called N, or whatever. 621 00:24:58,256 --> 00:25:00,036 So N equals get in. 622 00:25:00,476 --> 00:25:02,556 This is probably the code that was just executed. 623 00:25:02,846 --> 00:25:06,336 What it probably doing is step 2, when I then proceeded 624 00:25:06,336 --> 00:25:11,156 to hit enter, thereby putting a value in N. If the goal, again, 625 00:25:11,156 --> 00:25:12,336 is to construct dynamically, 626 00:25:12,336 --> 00:25:13,926 one of these things called a linked list. 627 00:25:14,326 --> 00:25:21,766 Just conceptually, what did it have to do? 628 00:25:21,766 --> 00:25:21,906 >>[inaudible audience question] 629 00:25:21,906 --> 00:25:22,026 >>Sorry? 630 00:25:22,026 --> 00:25:22,093 >>[inaudible audience question] 631 00:25:22,093 --> 00:25:24,086 >>Okay, so it's got to somehow set this pointer, 632 00:25:24,086 --> 00:25:27,296 which has up until now still null, to point to this value. 633 00:25:27,616 --> 00:25:29,986 So we can't- this is not a pointer to an int, 634 00:25:30,426 --> 00:25:33,446 so we cannot do, for instance, this. 635 00:25:33,916 --> 00:25:36,176 We cannot, if this is the int in memory, 636 00:25:36,426 --> 00:25:39,546 we cannot just have the pointer point to that int 637 00:25:39,546 --> 00:25:42,486 and no longer be null because what is this a pointer to? 638 00:25:42,486 --> 00:25:43,716 Not an int, but- 639 00:25:43,716 --> 00:25:44,846 >>[inaudible audience question] 640 00:25:44,846 --> 00:25:45,766 >> A node, okay. 641 00:25:45,766 --> 00:25:47,786 So how do we make that happen? 642 00:25:47,786 --> 00:25:50,626 If we need to have this pointer point to a node, 643 00:25:50,626 --> 00:25:53,626 and all we've got in memory at the moment is apparently an int, 644 00:25:53,626 --> 00:25:55,746 what do we need to do, or who do we need to call? 645 00:25:55,746 --> 00:25:55,813 >>[inaudible audience question] 646 00:25:55,813 --> 00:25:57,736 >>All right, so we call Malok. 647 00:25:58,006 --> 00:26:00,256 So we call Malok with something like this. 648 00:26:00,256 --> 00:26:03,156 I'm going to do a node star. 649 00:26:03,376 --> 00:26:05,916 So this is pseudo-code at the moment. 650 00:26:06,146 --> 00:26:06,906 So I'm just going to use PTR, 651 00:26:06,906 --> 00:26:10,096 this is a constant variable people use as pointers. 652 00:26:10,386 --> 00:26:11,216 So this gets what? 653 00:26:11,216 --> 00:26:14,806 I need to call Malok, and then what do I ask Malok? 654 00:26:14,806 --> 00:26:20,446 1 node, as I ask in that slightly sarcastic voice, 655 00:26:20,556 --> 00:26:21,106 the answer is no. 656 00:26:22,386 --> 00:26:25,176 So what do I put in here? 657 00:26:25,176 --> 00:26:25,243 >>[inaudible audience question] 658 00:26:25,243 --> 00:26:27,556 >>Yes, so size of the node. 659 00:26:27,786 --> 00:26:31,196 So we need to ask Malok for a specific number of lights. 660 00:26:31,376 --> 00:26:34,526 Now maybe I know how many bytes there are in a struct. 661 00:26:34,696 --> 00:26:38,056 In fact, if I think it through, a struct called node has an int. 662 00:26:38,656 --> 00:26:41,566 If it has an int and a pointer, 663 00:26:42,466 --> 00:26:45,346 I actually can guess how much space it's going to take up. 664 00:26:45,346 --> 00:26:48,406 How many bytes is a node struct can you infer? 665 00:26:48,406 --> 00:26:49,636 >>[inaudible audience question] 666 00:26:49,636 --> 00:26:50,726 >>So it looks like it's 8. 667 00:26:50,726 --> 00:26:53,176 Because I need 4 bytes, 32 bits for the int. 668 00:26:53,386 --> 00:26:56,366 And another 4 bytes, 32 bits, for the pointer, 669 00:26:56,366 --> 00:26:58,516 because all pointers are 32 bits in this environment. 670 00:26:58,736 --> 00:27:01,976 So that means 8 bytes so that means I actually go a bit 671 00:27:02,026 --> 00:27:06,146 riskily type in 8 there and get the number of bytes I need. 672 00:27:06,346 --> 00:27:08,166 This is not a safe assumption to make, 673 00:27:08,236 --> 00:27:11,466 because it's not necessarily the case that the compiler 674 00:27:11,516 --> 00:27:14,266 or the computer is going to give you just 8 bytes. 675 00:27:14,446 --> 00:27:17,076 There are cases in which you actually get back more bytes 676 00:27:17,076 --> 00:27:21,986 and things would so let's instead use the size 677 00:27:21,986 --> 00:27:22,666 of the operator. 678 00:27:22,986 --> 00:27:24,546 I pass in the size of what? 679 00:27:24,546 --> 00:27:28,996 Not ints but node and now, dynamically, 680 00:27:28,996 --> 00:27:31,376 I will hopefully get back either a null pointer, 681 00:27:31,376 --> 00:27:33,906 but that's pretty unlikely given how few things I've asked 682 00:27:34,046 --> 00:27:37,436 for so far, or I'm going to get back the address of a chunk 683 00:27:37,436 --> 00:27:39,646 of memory that is now mine to manipulate. 684 00:27:39,916 --> 00:27:42,706 Now this is just raw memory but the neat thing is 685 00:27:42,706 --> 00:27:46,096 that I can treat it as this is a node. 686 00:27:46,196 --> 00:27:49,676 Because the thing about C is that just handed a chunk 687 00:27:49,676 --> 00:27:52,406 of memory that has no data type associated 688 00:27:52,406 --> 00:27:55,296 with it fundamentally, you if you know how big it is, 689 00:27:55,546 --> 00:27:57,566 can just assume that you can treat it 690 00:27:57,566 --> 00:27:59,076 as the struct in question. 691 00:27:59,326 --> 00:28:02,626 So the fact that we're saying node star pointer means 692 00:28:02,666 --> 00:28:04,396 that with the memory I'm getting back, 693 00:28:04,606 --> 00:28:07,206 even though it's just a generic address, I'm going to proceed 694 00:28:07,206 --> 00:28:09,216 to treat it as though it's the address 695 00:28:09,216 --> 00:28:11,126 of a node struct in memory. 696 00:28:11,586 --> 00:28:13,856 So now what do I do in step 3, most likely? 697 00:28:13,856 --> 00:28:15,546 I've got an N and an int. 698 00:28:15,986 --> 00:28:19,206 I've got the address of a chunk of memory 699 00:28:19,206 --> 00:28:21,136 that I can use for a node now. 700 00:28:21,686 --> 00:28:24,146 What do I need to do as the next step to create the beginning 701 00:28:24,146 --> 00:28:25,306 of this linked list structure? 702 00:28:25,866 --> 00:28:30,376 What needs to happen next? 703 00:28:31,306 --> 00:28:33,686 So here I have, just pictorially, my int, 704 00:28:33,866 --> 00:28:34,766 it's got my number in it. 705 00:28:34,956 --> 00:28:36,086 This thing's a little bigger 706 00:28:36,166 --> 00:28:39,996 because it has 2 fields ready and waiting for me. 707 00:28:40,036 --> 00:28:42,956 So what do I need to do? 708 00:28:42,956 --> 00:28:43,166 >>[inaudible audience question] 709 00:28:43,166 --> 00:28:46,626 >>Yes, so put the int into the node, so how do I do this? 710 00:28:46,626 --> 00:28:48,336 Well, we've seen that we've not used it that much. 711 00:28:48,336 --> 00:28:52,736 In fact, before, if I want to go into that address and follow 712 00:28:52,736 --> 00:28:55,156 that address to the memory location, but I want to go 713 00:28:55,156 --> 00:28:58,346 to a specific field, the notation is this arrow notation 714 00:28:58,346 --> 00:29:01,506 so I'm going to go pointer, arrow and then the field I want 715 00:29:01,506 --> 00:29:03,266 to update is called what? 716 00:29:03,661 --> 00:29:05,661 >>[inaudible audience question] 717 00:29:06,056 --> 00:29:06,936 >>Little bolder. 718 00:29:08,036 --> 00:29:11,766 N, right? So go to that address, go the field. 719 00:29:11,766 --> 00:29:15,756 I have defined as N and assign it a value 720 00:29:16,036 --> 00:29:19,026 of N. Now you might be recalling issues of scope, 721 00:29:19,026 --> 00:29:21,546 you might recall when we had the same variable names before 722 00:29:21,546 --> 00:29:22,426 was problematic. 723 00:29:22,726 --> 00:29:25,546 But this is okay, because the compiler very quickly realizes 724 00:29:25,546 --> 00:29:28,276 that you want the variable called N, the field called N, 725 00:29:28,506 --> 00:29:30,216 that's associated with this pointer, 726 00:29:30,276 --> 00:29:31,236 with this chunk of memory. 727 00:29:31,526 --> 00:29:34,716 Whereas N just in this vacuum clearly refers 728 00:29:34,716 --> 00:29:36,646 to the local variable that we're using here. 729 00:29:36,736 --> 00:29:38,506 And let me reangle this, in case those of you 730 00:29:38,636 --> 00:29:39,826 on the side are having trouble. 731 00:29:40,186 --> 00:29:42,506 So there's 1 last step to clean 732 00:29:42,506 --> 00:29:44,446 up the memory I've just been given. 733 00:29:44,526 --> 00:29:45,476 I have this. 734 00:29:45,746 --> 00:29:50,356 I have the same memory question mark question mark But, 735 00:29:50,356 --> 00:29:52,916 at the top now, I just have the value of N is, 736 00:29:52,916 --> 00:29:54,306 so let's put 50 in here. 737 00:29:54,676 --> 00:29:57,606 50 is now also in here, but what's this? 738 00:29:57,606 --> 00:29:59,106 >>[inaudible audience question] 739 00:29:59,106 --> 00:30:01,206 >>It's just garbage value. 740 00:30:01,236 --> 00:30:03,176 Because it's just a chunk of memory I've been handed. 741 00:30:03,176 --> 00:30:06,236 So step 4 is probably going to be to do what here? 742 00:30:06,236 --> 00:30:07,696 >>[inaudible audience question] 743 00:30:07,696 --> 00:30:08,816 >>So create a null pointer. 744 00:30:09,036 --> 00:30:10,956 And it's probably not the best verb to use, 745 00:30:10,956 --> 00:30:12,866 since you don't actually create pointers. 746 00:30:12,866 --> 00:30:14,996 This is just a hard coded value of zero, 747 00:30:15,276 --> 00:30:16,436 but that's the right idea. 748 00:30:16,436 --> 00:30:20,386 The next field of this struct is just going to be null. 749 00:30:20,386 --> 00:30:25,756 And now what I have is something that looks like this, 50, 750 00:30:26,176 --> 00:30:28,656 and then null field here 751 00:30:28,656 --> 00:30:29,926 and actually I'll draw it graphically, 752 00:30:29,926 --> 00:30:30,586 but what most people do 753 00:30:30,586 --> 00:30:33,246 to represent null is just a backslash, or some kind 754 00:30:33,246 --> 00:30:34,536 of angled slash in the box. 755 00:30:34,856 --> 00:30:36,286 Okay, now that's step 4, 756 00:30:36,286 --> 00:30:38,636 but I haven't actually created a linked list. 757 00:30:38,636 --> 00:30:39,666 At this point in the story, 758 00:30:39,666 --> 00:30:41,676 my linked list appears still to be empty. 759 00:30:41,676 --> 00:30:44,366 So the fifth and final step is going to be what here? 760 00:30:44,966 --> 00:30:48,816 What needs to happen now? 761 00:30:51,396 --> 00:30:52,576 So we're so close. 762 00:30:52,676 --> 00:30:54,866 We have this thing just floating somewhere in memory 763 00:30:54,866 --> 00:30:56,586 and that's fine, because memory does not need 764 00:30:56,586 --> 00:30:57,686 to be contiguous for lists. 765 00:30:58,206 --> 00:31:00,766 This thing here is our global variable called first, 766 00:31:00,766 --> 00:31:02,736 which we initialized per the code we saw, 767 00:31:03,006 --> 00:31:04,326 is initialized to null. 768 00:31:04,736 --> 00:31:07,246 So in order to literally connect these things, 769 00:31:07,246 --> 00:31:10,806 what do I need to do? 770 00:31:10,806 --> 00:31:11,056 >>[inaudible audience question] 771 00:31:11,056 --> 00:31:14,516 >>Yes, so assign first the value of pointer. 772 00:31:14,606 --> 00:31:18,346 So the last step is first get, which is currently null. 773 00:31:18,346 --> 00:31:20,996 Let's overwrite the value of null with the address 774 00:31:21,176 --> 00:31:22,576 of this struct in memory. 775 00:31:22,796 --> 00:31:25,356 So pictorially, what I just did in step 5, 776 00:31:25,676 --> 00:31:29,066 is that I changed the value of first to be from null 777 00:31:29,296 --> 00:31:31,616 to being something like O, X, 1, 2, 3. 778 00:31:31,726 --> 00:31:34,026 Or wherever this thing happens to be in memory, 779 00:31:34,256 --> 00:31:35,916 but that's not really interesting any more. 780 00:31:35,916 --> 00:31:38,216 The specifics anymore, so let's just draw it 781 00:31:38,216 --> 00:31:39,806 with our familiar value notation. 782 00:31:40,106 --> 00:31:41,516 Now what I have is that. 783 00:31:42,716 --> 00:31:45,006 So to do recap in step 1, this old school stuff. 784 00:31:45,006 --> 00:31:45,626 Get the ints. 785 00:31:45,766 --> 00:31:49,246 Step 2, allocate space for the structure that's going 786 00:31:49,246 --> 00:31:50,166 to store that int. 787 00:31:50,166 --> 00:31:53,276 And another pointer, so metadata, if you will. 788 00:31:53,276 --> 00:31:55,186 So that we can string these things together. 789 00:31:55,496 --> 00:31:56,296 Then what do you do? 790 00:31:56,296 --> 00:31:58,246 Put the int into that structure. 791 00:31:59,076 --> 00:31:59,976 What do you do then? 792 00:31:59,976 --> 00:32:02,606 Put null in the next field of that structure 793 00:32:02,606 --> 00:32:04,186 so that you don't reach a garbage value 794 00:32:04,186 --> 00:32:06,476 and assume more things in this list than there are. 795 00:32:06,686 --> 00:32:09,036 And then finally, you just need to link things together. 796 00:32:09,286 --> 00:32:10,946 Your list was previously empty. 797 00:32:11,176 --> 00:32:13,396 You now need to wire things together. 798 00:32:13,606 --> 00:32:15,076 So now what do I have in memory, 799 00:32:15,166 --> 00:32:18,016 even though this thing's a bit messy on the board, 800 00:32:18,436 --> 00:32:22,856 let's spin it around for 1 final clear picture. 801 00:32:23,416 --> 00:32:26,186 Our linked list now is quite simply first 802 00:32:27,016 --> 00:32:30,346 and then we have the number 50 stored in a node here. 803 00:32:30,606 --> 00:32:32,846 This arrow points here, this is a null pointer. 804 00:32:33,066 --> 00:32:34,556 Voila. That's our linked list. 805 00:32:35,006 --> 00:32:37,206 Now, this felt like a lot of work. 806 00:32:37,206 --> 00:32:40,396 It's easy to create an array with the number 50, done. 807 00:32:40,676 --> 00:32:42,956 We did that last week, we just syntax int. 808 00:32:42,956 --> 00:32:48,316 We'll call this array, bracket, 1. 809 00:32:48,826 --> 00:32:50,646 So we've just done a lot more work. 810 00:32:50,646 --> 00:32:54,546 But, if we now factor out these concepts, these operations, 811 00:32:54,546 --> 00:32:57,706 like insert and later delete, spine and traverse. 812 00:32:58,126 --> 00:32:59,436 Then you just have to call it a function. 813 00:32:59,436 --> 00:33:02,486 Hand it the linked list, or have it have access 814 00:33:02,556 --> 00:33:05,766 to this first pointer, and then you can repeat these same 815 00:33:05,766 --> 00:33:07,196 operations again and again and you 816 00:33:07,196 --> 00:33:09,206 as the programmer no longer have to worry 817 00:33:09,436 --> 00:33:11,506 about how big this structure needs to be 818 00:33:11,506 --> 00:33:14,926 in advance unless the user really wants to push the limits 819 00:33:14,926 --> 00:33:18,136 of reality and ask for billions of things, which again tends 820 00:33:18,136 --> 00:33:20,996 to run afoul of some real world limitations. 821 00:33:21,986 --> 00:33:23,866 So, any questions? 822 00:33:24,686 --> 00:33:29,056 So that was either really clear or really confusing, 823 00:33:29,056 --> 00:33:31,286 and I'm not sure which, so I'll ask you again. 824 00:33:31,286 --> 00:33:31,806 Any questions? 825 00:33:32,226 --> 00:33:32,846 Well that's good. 826 00:33:32,996 --> 00:33:33,826 Thank you. 827 00:33:34,096 --> 00:33:35,076 So I'm not sure you can see this, 828 00:33:35,076 --> 00:33:37,196 but thanks to Harvard's financial crisis the temperature 829 00:33:37,196 --> 00:33:39,586 is intentionally being raised in the rooms so I apologize 830 00:33:39,896 --> 00:33:41,446 if you can see the glistening here. 831 00:33:41,816 --> 00:33:45,196 But, let's now put you guys in the spotlight. 832 00:33:45,246 --> 00:33:47,616 So we're not going to use the clay. 833 00:33:47,746 --> 00:33:49,406 I'm not sure I can take myself seriously playing 834 00:33:49,406 --> 00:33:50,496 with Play-Dough in science [phonetic spelling] theater. 835 00:33:50,736 --> 00:33:54,046 But if we could get 4 humans up here, we won't be playing 836 00:33:54,046 --> 00:33:56,426 with this, you'll be a compliment to the clarity. 837 00:33:56,426 --> 00:33:57,596 So come on up, you're number 1. 838 00:33:58,686 --> 00:33:59,636 Someone else, yes in front? 839 00:33:59,916 --> 00:34:01,616 2. And you do have to be comfortable, 840 00:34:01,616 --> 00:34:03,616 yada yada on film here. 841 00:34:03,676 --> 00:34:08,686 3. And 4, you stood up so you nominated yourself. 842 00:34:08,966 --> 00:34:13,156 Okay. So we've done things like this before 843 00:34:13,156 --> 00:34:16,146 and I'm always amazed by how many humans are willing to come 844 00:34:16,146 --> 00:34:17,396 up here and do these silly things. 845 00:34:17,456 --> 00:34:19,416 But what I hope is that what we just did 846 00:34:19,416 --> 00:34:21,266 on the board is very easily forgotten. 847 00:34:21,266 --> 00:34:23,426 It's fairly esoteric kinds of low level details. 848 00:34:23,426 --> 00:34:24,366 But let's see 849 00:34:24,366 --> 00:34:27,486 if the visualization doesn't make things a bit clearer. 850 00:34:27,486 --> 00:34:30,676 So let me go ahead and I'm going to use 3 of you guys, 851 00:34:30,936 --> 00:34:32,346 just as a chunk of memory. 852 00:34:32,396 --> 00:34:35,116 So if you could duck off to the side, for just a moment. 853 00:34:35,216 --> 00:34:37,516 We're going to allocate you in a moment. 854 00:34:38,096 --> 00:34:39,836 And you're name is again? 855 00:34:40,376 --> 00:34:40,826 >>Alex. 856 00:34:40,826 --> 00:34:42,706 >>Alex, you are a pointer. 857 00:34:43,496 --> 00:34:45,276 So you're a null pointer, 858 00:34:45,276 --> 00:34:48,806 it's just say you keep your hands at your side there. 859 00:34:48,936 --> 00:34:50,486 Just don't point at anyone just yet. 860 00:34:50,706 --> 00:34:52,256 So you guys are a chunk of memory. 861 00:34:52,396 --> 00:34:53,276 So you are RAM. 862 00:34:53,636 --> 00:34:56,176 And now, we'll start with small numbers. 863 00:34:56,406 --> 00:35:00,746 I decide here that I want to insert 864 00:35:00,866 --> 00:35:04,186 into this currently empty list called Alex. 865 00:35:04,186 --> 00:35:05,796 A number 2. 866 00:35:05,796 --> 00:35:08,536 So I call get int, I am handed the number 2. 867 00:35:08,816 --> 00:35:10,346 What's my next step? 868 00:35:11,416 --> 00:35:13,876 Malok. So I'm going to allocate memory. 869 00:35:13,936 --> 00:35:15,616 I'm going to ask the operating system 870 00:35:15,616 --> 00:35:19,206 for 8 bytes of RAM please. 871 00:35:19,636 --> 00:35:20,496 Nominate who you will. 872 00:35:21,736 --> 00:35:24,396 Okay. So I get back a pointer to- your name is? 873 00:35:24,666 --> 00:35:24,936 >>Olga. 874 00:35:25,066 --> 00:35:27,326 >>Olga. So I get a pointer to Olga here. 875 00:35:27,326 --> 00:35:29,176 She is 8 bytes of memory. 876 00:35:29,426 --> 00:35:30,866 I'm now going to do what, with respect 877 00:35:30,866 --> 00:35:32,166 to our chunk of memory here? 878 00:35:32,166 --> 00:35:32,756 >>[inaudible audience question] 879 00:35:32,756 --> 00:35:38,676 >>It's not fun if we're the only ones playing this 880 00:35:38,676 --> 00:35:39,416 awkward exercise. 881 00:35:39,416 --> 00:35:42,136 I'm going to store in the field called on, which they're going 882 00:35:42,136 --> 00:35:44,216 to represent with their hands, just as Alex is doing. 883 00:35:44,386 --> 00:35:45,726 And value 2, and why don't we stalk here just 884 00:35:45,726 --> 00:35:47,506 on the side a little bit so we're out of the way. 885 00:35:47,716 --> 00:35:49,806 So now she has a pointer. 886 00:35:49,806 --> 00:35:51,356 And your left hand is your value 887 00:35:51,356 --> 00:35:52,766 and your right hand is the pointer. 888 00:35:53,436 --> 00:35:55,456 Or actually, let's do the opposite. 889 00:35:55,456 --> 00:35:56,736 That way, that'll make sense. 890 00:35:56,966 --> 00:35:59,076 So now we were at wherever we were- 891 00:35:59,076 --> 00:36:01,416 this is, we just did step 3. 892 00:36:01,486 --> 00:36:02,526 So actually, we're almost there. 893 00:36:02,766 --> 00:36:05,186 So that's step 3, but we really don't have a linked list, 894 00:36:05,186 --> 00:36:07,906 we've got a null pointer and we've got a chunk 895 00:36:07,906 --> 00:36:09,896 of memory only half of which has been used 896 00:36:09,896 --> 00:36:11,636 in an interesting way by storing 2 there. 897 00:36:11,636 --> 00:36:14,916 What do we now need to do? 898 00:36:15,086 --> 00:36:16,756 So we now need to store a null pointer here. 899 00:36:16,756 --> 00:36:18,306 So for dramatization's sake, 900 00:36:18,306 --> 00:36:21,116 can you make a garbage value with your left arm? 901 00:36:21,866 --> 00:36:26,456 So that's before, and now we initialize your pointer to null. 902 00:36:26,976 --> 00:36:27,396 Excellent. 903 00:36:27,546 --> 00:36:30,586 First time we've ever done this demonstration in CCS50, 904 00:36:30,586 --> 00:36:31,706 so we're figuring out as we go. 905 00:36:32,086 --> 00:36:34,366 So now what's the fourth step? 906 00:36:34,851 --> 00:36:36,851 >>[inaudible audience question] 907 00:36:37,336 --> 00:36:40,946 >>So actually point the first variable, that's right, 908 00:36:41,196 --> 00:36:43,046 to this new chunk of memory. 909 00:36:43,046 --> 00:36:44,306 So Alex, your time to shine. 910 00:36:44,706 --> 00:36:45,266 Excellent. 911 00:36:45,756 --> 00:36:48,346 Good. So we've now just recreated what we had 912 00:36:48,346 --> 00:36:50,196 on the board, except with a slightly different number. 913 00:36:50,196 --> 00:36:52,196 So now things get a little more interesting, though. 914 00:36:52,346 --> 00:36:54,976 Because now I get a little greedy and I decide that I want 915 00:36:54,976 --> 00:36:57,106 to put in, say, the number 3. 916 00:36:57,446 --> 00:36:58,336 So I call get int. 917 00:36:58,526 --> 00:37:00,086 I am handed by the user the number 3 918 00:37:00,286 --> 00:37:02,246 and I just side step 2 to call Malok again. 919 00:37:02,446 --> 00:37:03,466 So Malok away. 920 00:37:04,306 --> 00:37:05,946 Hello, what's your name? 921 00:37:06,026 --> 00:37:06,486 >>Julie. 922 00:37:06,486 --> 00:37:08,916 >>Julie is returned to me, the address of Julie. 923 00:37:09,096 --> 00:37:12,316 She too has in her right hand space for an int. 924 00:37:12,536 --> 00:37:14,566 And in her left hand, what's in both of- oh, 925 00:37:14,566 --> 00:37:15,176 this will embarrass you. 926 00:37:15,176 --> 00:37:16,916 What are your 2 values right now, 927 00:37:16,986 --> 00:37:18,526 we need you to represent 2 garbage values. 928 00:37:19,456 --> 00:37:20,186 Thank you. 929 00:37:20,376 --> 00:37:23,886 So now we store in Julie's N field 930 00:37:23,886 --> 00:37:26,026 in her right hand is the value 3. 931 00:37:26,426 --> 00:37:29,926 What do we do for step- let's keep track 932 00:37:29,926 --> 00:37:31,186 for those following notes. 933 00:37:31,186 --> 00:37:33,986 We just did step 3, step 4 is to do what? 934 00:37:33,986 --> 00:37:34,756 >>[inaudible audience question] 935 00:37:34,756 --> 00:37:38,126 >> Good. Assign null to her left hand. 936 00:37:38,766 --> 00:37:42,196 Done. And step 5, finally, is to have what? 937 00:37:42,286 --> 00:37:45,386 Alex point at Julie? 938 00:37:45,386 --> 00:37:46,606 >>[inaudible audience question] 939 00:37:46,606 --> 00:37:47,576 >>No, right? 940 00:37:47,576 --> 00:37:49,596 So, what do we need to do if the goal, 941 00:37:49,596 --> 00:37:52,026 to be clear, is to sort the list? 942 00:37:52,026 --> 00:37:53,916 So here we need actually a bit more work. 943 00:37:53,916 --> 00:37:55,956 Here we need a temporary pointer. 944 00:37:55,956 --> 00:37:59,226 So I'll play that role, and I need to initialize myself 945 00:37:59,226 --> 00:38:00,426 to the start of the list. 946 00:38:00,426 --> 00:38:01,266 What does this mean? 947 00:38:01,266 --> 00:38:02,706 It really means that I'm going to point 948 00:38:02,706 --> 00:38:04,616 to the same thing Alex is pointing at. 949 00:38:04,846 --> 00:38:08,356 So now I'm going to check, let's check Olga's N field. 950 00:38:08,356 --> 00:38:10,176 Is 3 less than 2? 951 00:38:10,176 --> 00:38:11,686 No, so I'm going to keep going. 952 00:38:11,876 --> 00:38:14,066 Now, unfortunately, I'm pointing at nothing. 953 00:38:14,066 --> 00:38:15,796 Which means I've reached the end of the list, 954 00:38:15,796 --> 00:38:19,786 because Olga has nowhere else for me to go and so this implies 955 00:38:19,996 --> 00:38:21,336 that this number 3 goes here. 956 00:38:21,616 --> 00:38:23,406 So now, very carefully, and we haven't walked 957 00:38:23,406 --> 00:38:25,356 through these steps, what needs to happen now? 958 00:38:26,436 --> 00:38:29,316 To add Julie to the correct location on the list? 959 00:38:30,046 --> 00:38:30,736 1 step, in fact. 960 00:38:33,396 --> 00:38:34,056 Your time to shine. 961 00:38:34,056 --> 00:38:35,066 >>[inaudible audience question] 962 00:38:35,066 --> 00:38:35,716 >>Very good. 963 00:38:36,006 --> 00:38:39,066 But for intentionally plucking off some of the easier ones. 964 00:38:39,066 --> 00:38:43,016 Now, finally, we're going to go with the number 1. 965 00:38:43,016 --> 00:38:43,546 All right. 966 00:38:43,786 --> 00:38:44,946 So things get a little trickier. 967 00:38:45,376 --> 00:38:47,136 I call int, I store number 1 968 00:38:47,196 --> 00:38:50,486 in my variable called N. I then call Malok. 969 00:38:51,036 --> 00:38:52,246 Hello, what's your name? 970 00:38:52,586 --> 00:38:53,016 >>Robert. 971 00:38:53,066 --> 00:38:54,736 >>Robert, nice to meet you. 972 00:38:55,146 --> 00:38:57,026 So you look like what at this point in the story? 973 00:38:58,256 --> 00:38:59,206 That's great. 974 00:38:59,206 --> 00:39:01,546 So now we assign you this number 2 N, 975 00:39:01,546 --> 00:39:02,736 which goes in this hand here. 976 00:39:02,896 --> 00:39:05,186 Now, again, Robert, if I can drag you in this direction. 977 00:39:05,186 --> 00:39:06,766 You're anywhere in memory. 978 00:39:06,766 --> 00:39:08,836 So you could be here, okay. 979 00:39:09,336 --> 00:39:14,466 Yes. Can't see in the camera. 980 00:39:14,466 --> 00:39:16,926 So now Robert has his N field initialized. 981 00:39:16,926 --> 00:39:18,496 Left hand is still doing whatever. 982 00:39:18,496 --> 00:39:22,596 So step 4 here is nullify your pointer. 983 00:39:22,656 --> 00:39:24,106 Good. So now he's in good shape. 984 00:39:24,266 --> 00:39:27,306 So now it's my turn, the temporary pointer and I have 985 00:39:27,336 --> 00:39:28,586 to have some kind of loop again. 986 00:39:28,586 --> 00:39:30,636 Maybe it's a foreloop, maybe it's a wide loop so I point 987 00:39:30,636 --> 00:39:31,616 where Alex is pointing. 988 00:39:31,856 --> 00:39:34,426 I then check, does Alex belong here- sorry, 989 00:39:34,426 --> 00:39:37,966 does Robert belong here? 990 00:39:38,256 --> 00:39:39,636 Yes, to the left of Olga. 991 00:39:39,636 --> 00:39:41,526 So now there's a bit of trickery. 992 00:39:41,616 --> 00:39:43,526 So now let's see what we need to do? 993 00:39:43,826 --> 00:39:46,496 What would you, the audience propose that we insert Robert 994 00:39:46,496 --> 00:39:47,536 into the right location here? 995 00:39:47,536 --> 00:39:48,906 What's the first step? 996 00:39:50,116 --> 00:39:53,696 By a show of hands, so we can actually pinpoint the voice? 997 00:39:55,056 --> 00:39:56,206 Yes, in back. 998 00:39:56,286 --> 00:39:58,286 >>[inaudible audience question] 999 00:39:58,366 --> 00:39:59,956 >>Alex has to point to Robert. 1000 00:39:59,956 --> 00:40:01,616 Alex, please point to Robert. 1001 00:40:02,496 --> 00:40:06,246 Okay, now what? 1002 00:40:06,246 --> 00:40:06,446 >>[inaudible audience question] 1003 00:40:06,446 --> 00:40:10,736 >>Yes. So now dangling pointer, as this is called. 1004 00:40:10,736 --> 00:40:13,066 Because we've now lost Olga. 1005 00:40:13,066 --> 00:40:16,326 We've lost Julie, which means you've just cost us a lot 1006 00:40:16,326 --> 00:40:16,796 of RAM. 1007 00:40:16,826 --> 00:40:19,156 Now your computer is going to slow to a crawl. 1008 00:40:19,476 --> 00:40:23,066 But thank you for playing, that's good. 1009 00:40:23,176 --> 00:40:25,366 Let's roll back. 1010 00:40:25,666 --> 00:40:27,166 R, Control C back up. 1011 00:40:27,256 --> 00:40:28,376 Rerun the program here. 1012 00:40:28,376 --> 00:40:31,296 And again, Alex is pointing at Olga, we do do this check, 1013 00:40:31,296 --> 00:40:33,106 we realize Robert doesn't belong to the left, 1014 00:40:33,406 --> 00:40:36,606 let's do a new first step. 1015 00:40:36,606 --> 00:40:36,726 >>[inaudible audience question] 1016 00:40:36,726 --> 00:40:38,906 >>Okay, so Robert points to Olga. 1017 00:40:38,906 --> 00:40:42,716 Robert points to Olga, looks a little weird 1018 00:40:42,716 --> 00:40:43,566 but maybe this is okay. 1019 00:40:43,566 --> 00:40:43,956 Then what? 1020 00:40:43,956 --> 00:40:44,996 >>[inaudible audience question] 1021 00:40:44,996 --> 00:40:47,326 >>Now Alex points to Robert. 1022 00:40:47,536 --> 00:40:51,046 Now, this is actually realistic in that the memory can be all 1023 00:40:51,046 --> 00:40:52,756 over the place for visualization's sake, 1024 00:40:52,756 --> 00:40:54,266 let's cheat and have Robert come 1025 00:40:54,266 --> 00:40:57,476 over here even though he really is anywhere in memory. 1026 00:40:58,696 --> 00:41:03,136 But now notice what's- so that was actually a bad part 1027 00:41:03,176 --> 00:41:05,136 of the visualization, because the compelling thing 1028 00:41:05,136 --> 00:41:07,526 about linked lists is that you don't actually have 1029 00:41:07,526 --> 00:41:08,806 to shuffle people all around. 1030 00:41:08,806 --> 00:41:11,696 And this was one of the problems initially with arrays, 1031 00:41:11,696 --> 00:41:13,176 which is if you wanted to insert something 1032 00:41:13,176 --> 00:41:15,196 in the middle, we had a problem. 1033 00:41:15,196 --> 00:41:17,496 We saw this with our searching algorithms 1034 00:41:17,496 --> 00:41:18,416 and our sorting algorithms. 1035 00:41:18,416 --> 00:41:20,196 And if we wanted to put somebody into the middle, 1036 00:41:20,286 --> 00:41:22,346 we had to shift all of those people over. 1037 00:41:22,346 --> 00:41:23,816 But as soon as we started shifting, 1038 00:41:24,086 --> 00:41:25,876 we started increasing our running time, 1039 00:41:25,876 --> 00:41:27,496 and we started devolving, if you recall, 1040 00:41:27,626 --> 00:41:29,826 to things that are quadratic and no longer linear. 1041 00:41:30,096 --> 00:41:33,066 But yes, even though the humans here didn't need to make room 1042 00:41:33,186 --> 00:41:36,076 for the addition of Robert, technically, 1043 00:41:36,076 --> 00:41:37,606 they're just updating pointers. 1044 00:41:37,606 --> 00:41:40,416 So the first pictorial with him way over there speaks 1045 00:41:40,466 --> 00:41:44,186 to the fact that it's much more efficient to insert here 1046 00:41:44,446 --> 00:41:47,396 because we don't need to worry about contiguousness 1047 00:41:47,526 --> 00:41:48,636 of all of these elements. 1048 00:41:49,076 --> 00:41:55,036 But big round of applause if we can for these 4. 1049 00:41:55,036 --> 00:41:55,606 [ Applause ] 1050 00:41:55,606 --> 00:41:57,926 >>We have a little parting gift here for you today, 1051 00:41:58,586 --> 00:42:00,106 I'm not sure how you'll use this. 1052 00:42:00,376 --> 00:42:01,716 But thank you, let's take a 5 minute break. 1053 00:42:03,516 --> 00:42:04,866 And you people, I need to see you in the corner. 1054 00:42:05,786 --> 00:42:07,996 >>I already signed this. 1055 00:42:09,766 --> 00:42:09,866 >>Okay. 1056 00:42:10,946 --> 00:42:12,906 >>All right, we are back. 1057 00:42:13,446 --> 00:42:14,686 Brings the fun to a crashing halt, 1058 00:42:14,736 --> 00:42:16,306 because now we have some code on the board. 1059 00:42:16,366 --> 00:42:20,926 So you have, in your slides, that will be available online, 1060 00:42:20,926 --> 00:42:24,336 there's actually a number of depictions 1061 00:42:24,336 --> 00:42:26,046 of what we just did with humans. 1062 00:42:26,046 --> 00:42:28,866 So realize and refer back to this couple of weeks time, 1063 00:42:29,176 --> 00:42:31,616 because what this is meant to seize your minds for, 1064 00:42:31,616 --> 00:42:35,596 for problems that sticks, which is a little bit aways away 1065 00:42:35,596 --> 00:42:37,336 but it's problems in t6 you'll find 1066 00:42:37,586 --> 00:42:39,486 that your challenge is implementing the fastest 1067 00:42:39,486 --> 00:42:40,736 spellchecker possible. 1068 00:42:40,786 --> 00:42:43,366 And we pretty much take off the training wheels at that point 1069 00:42:43,366 --> 00:42:44,586 and encourage you, and invite you 1070 00:42:44,586 --> 00:42:47,646 to implement your own choice of data structure. 1071 00:42:47,696 --> 00:42:49,766 We haven't seen many yet, we've seen linked lists, 1072 00:42:49,956 --> 00:42:51,526 but you'll find that these linked lists 1073 00:42:51,526 --> 00:42:53,236 from today will become a building block 1074 00:42:53,236 --> 00:42:55,426 for much more sophisticated data structures 1075 00:42:55,656 --> 00:42:56,966 that we do in 2 week's time. 1076 00:42:56,966 --> 00:42:59,826 So these pictures realize walk you through the process 1077 00:43:00,046 --> 00:43:01,856 of what just happened, perhaps a little quickly 1078 00:43:02,136 --> 00:43:04,006 with humans on stage. 1079 00:43:04,046 --> 00:43:07,036 But what I'm going to do instead, instead of walking 1080 00:43:07,036 --> 00:43:08,626 through this again and again, 1081 00:43:08,626 --> 00:43:10,366 which is what we pretty much just did, 1082 00:43:10,586 --> 00:43:13,406 let's focus on the actual implementation of this. 1083 00:43:13,406 --> 00:43:16,626 Because some of the syntax is new, but hopefully the concepts, 1084 00:43:16,626 --> 00:43:19,566 once visualized with some people is now a bit more clear. 1085 00:43:19,566 --> 00:43:21,486 So this is the guts of the insert function. 1086 00:43:21,486 --> 00:43:24,806 So again, you'll have a printout of this, this is list 1.C. 1087 00:43:24,806 --> 00:43:33,066 In lists.H, note actually has the definition list 1.H has the 1088 00:43:33,066 --> 00:43:34,136 definition of the struct. 1089 00:43:34,136 --> 00:43:36,976 Just to be clear, I factored it out to this header file, 1090 00:43:36,976 --> 00:43:38,876 which again is pretty typical and any time you want 1091 00:43:38,876 --> 00:43:41,926 to declare some data structures or you want to create a file 1092 00:43:41,926 --> 00:43:45,886 that other people might also use via sharp include. 1093 00:43:46,216 --> 00:43:47,916 So here's the function insert. 1094 00:43:47,916 --> 00:43:50,236 It takes no argument because we're going to ask the user 1095 00:43:50,236 --> 00:43:51,806 for a number to insert this coin. 1096 00:43:52,096 --> 00:43:53,506 But notice what I'm first doing. 1097 00:43:53,746 --> 00:43:56,336 I've decided at the very start of the function called insert. 1098 00:43:56,566 --> 00:43:58,896 Try to substantiate a node for the number. 1099 00:43:59,216 --> 00:44:01,276 Now I'm doing this because I know that once I get a number 1100 00:44:01,276 --> 00:44:03,756 from the user, I've got to tuck it away inside of things. 1101 00:44:03,906 --> 00:44:05,646 So the order of operations here is a little different 1102 00:44:05,646 --> 00:44:07,866 than we did verbally, but we're going to cover the same steps. 1103 00:44:08,326 --> 00:44:11,966 So the syntax for this is, give me a pointer called New Pointer, 1104 00:44:12,316 --> 00:44:14,706 could come up with most any variable name there, 1105 00:44:14,706 --> 00:44:19,296 and it's of type node star. 1106 00:44:19,296 --> 00:44:20,466 Let me scroll back to that. 1107 00:44:20,776 --> 00:44:23,146 So it's of type node star, which means give me a pointer 1108 00:44:23,206 --> 00:44:25,266 to a struct that a type node. 1109 00:44:25,456 --> 00:44:28,016 Okay, I'm calling Malok, passing in the size of a node. 1110 00:44:28,226 --> 00:44:31,286 And just to be clear, why not just hard code the number 8? 1111 00:44:31,376 --> 00:44:33,526 Why incur the expense of calling this? 1112 00:44:33,706 --> 00:44:33,816 Yes? 1113 00:44:33,816 --> 00:44:34,846 >>[inaudible audience question] 1114 00:44:34,846 --> 00:44:41,626 >>In case you want to enlarge the struct, and again, 1115 00:44:41,626 --> 00:44:44,456 even though this is more of a corner case 1116 00:44:44,566 --> 00:44:47,776 but we can see a taste of it in problem set 5, 1117 00:44:47,776 --> 00:44:51,146 forensics problem said you're not, even though we draw N 1118 00:44:51,256 --> 00:44:53,686 on top of next sort of back to back in our picture, 1119 00:44:53,896 --> 00:44:54,806 that doesn't mean they're going 1120 00:44:54,806 --> 00:44:56,826 to be perfectly back to back in memory. 1121 00:44:57,046 --> 00:44:59,066 Now, in this case, in reality, they probably will. 1122 00:44:59,066 --> 00:45:00,816 But if we start creating data structures 1123 00:45:00,816 --> 00:45:02,886 that don't have these 4 by values, 1124 00:45:02,886 --> 00:45:04,086 an int and then a pointer, 1125 00:45:04,276 --> 00:45:06,676 or maybe a char another char, a float. 1126 00:45:06,676 --> 00:45:09,186 And then you start to comingle lots of different data types 1127 00:45:09,506 --> 00:45:16,216 for efficiency's sake what the computer will do is word align 1128 00:45:16,216 --> 00:45:18,946 these values, which just means that it's going to try, 1129 00:45:18,946 --> 00:45:21,796 for efficiency's sake, to line them up every 4 bytes 1130 00:45:21,796 --> 00:45:23,216 which means you might have gaps. 1131 00:45:23,216 --> 00:45:25,346 Another motivation for actually keeping track 1132 00:45:25,626 --> 00:45:27,156 of this thing here by a size. 1133 00:45:27,156 --> 00:45:28,696 So here's a little sanity check. 1134 00:45:28,786 --> 00:45:30,706 If I get back null, just return. 1135 00:45:30,926 --> 00:45:32,816 Now I don't bother in this particular demo 1136 00:45:32,956 --> 00:45:34,166 to tell the user anything. 1137 00:45:34,166 --> 00:45:35,296 I just ignore the user. 1138 00:45:35,296 --> 00:45:36,766 So it's unfortunate, 1139 00:45:37,076 --> 00:45:39,786 but at least we handled the return of null. 1140 00:45:40,026 --> 00:45:43,176 Because if you take a null pointer and try to follow it, 1141 00:45:43,516 --> 00:45:45,026 you will in fact segfault. 1142 00:45:45,026 --> 00:45:46,776 And that's a feature of the language. 1143 00:45:47,066 --> 00:45:49,666 Null is a constant that represents the number 0. 1144 00:45:49,916 --> 00:45:52,926 The number 0 is a valid address and memory. 1145 00:45:53,076 --> 00:45:58,546 Right? 0 byte of RAM, but the authors of C essentially decided 1146 00:45:58,546 --> 00:46:02,176 that this byte will never be owned by the user. 1147 00:46:02,316 --> 00:46:05,226 So it will always trigger, a segfault, 1148 00:46:05,226 --> 00:46:06,686 so you know there's a problem. 1149 00:46:06,686 --> 00:46:09,946 So that's actually a good thing, initialized to some known value. 1150 00:46:10,296 --> 00:46:12,116 So here's what our first step was. 1151 00:46:12,116 --> 00:46:13,586 Printout, number to insert. 1152 00:46:13,726 --> 00:46:15,936 Let me go into that, and actually saved a step here. 1153 00:46:15,936 --> 00:46:18,426 Didn't bother wasting the space for a local variable. 1154 00:46:18,626 --> 00:46:19,836 I just flopped to the end 1155 00:46:19,836 --> 00:46:22,596 of the user give me directly into the struct. 1156 00:46:22,596 --> 00:46:24,866 Reasonable optimization, but it achieves the same task. 1157 00:46:25,276 --> 00:46:28,176 But then I do very carefully initialize the next field 1158 00:46:28,266 --> 00:46:30,206 of my node pointer to null. 1159 00:46:30,486 --> 00:46:31,706 So that's precisely what we did 1160 00:46:31,706 --> 00:46:33,376 on the board and pictorially here. 1161 00:46:33,656 --> 00:46:36,316 Now I'm just going to check the 3 cases. 1162 00:46:36,576 --> 00:46:40,216 So at this point in the story, we had Olga holding a number 1163 00:46:40,416 --> 00:46:42,516 and then we had Alex pointing at nothing. 1164 00:46:42,756 --> 00:46:44,596 So we now needed to consider the scenarios. 1165 00:46:44,596 --> 00:46:46,636 Is Alex pointing too something? 1166 00:46:46,636 --> 00:46:48,736 Because if so, we're going to have to traverse the list. 1167 00:46:48,736 --> 00:46:51,246 If he's pointing to nothing, that's actually really nice 1168 00:46:51,246 --> 00:46:52,646 because it's really easy to code. 1169 00:46:52,906 --> 00:46:54,806 If Alex is currently null, 1170 00:46:54,806 --> 00:46:57,406 if the first pointer is null, well this is so easy. 1171 00:46:57,546 --> 00:46:59,046 First gets new pointer. 1172 00:46:59,246 --> 00:47:02,726 Just make Alex point at Olga as he did, then we're done. 1173 00:47:02,906 --> 00:47:04,376 But there's a couple of other cases. 1174 00:47:04,376 --> 00:47:07,056 Apparently, I need to check if the number belongs at the head 1175 00:47:07,056 --> 00:47:10,376 of the list, the start of the list, or if it belongs 1176 00:47:10,376 --> 00:47:13,176 in the middle or turns out you can model the tail, 1177 00:47:13,176 --> 00:47:16,106 the end of the list, in pretty much the same way as the middle. 1178 00:47:16,166 --> 00:47:18,906 And the only way you would realize this, frankly, 1179 00:47:18,906 --> 00:47:19,636 is by thinking it through. 1180 00:47:19,636 --> 00:47:20,986 This is not necessarily obvious. 1181 00:47:20,986 --> 00:47:23,026 But let's go with the slightly easier case. 1182 00:47:23,026 --> 00:47:25,396 I like 2 lines of code instead of several there. 1183 00:47:25,706 --> 00:47:29,176 So case 2 is, if the number we just inserted belongs 1184 00:47:29,176 --> 00:47:30,186 to the list's head. 1185 00:47:30,546 --> 00:47:33,136 So now fast-forward to the situation involving Robert. 1186 00:47:33,296 --> 00:47:37,076 Robert was number 1, he belonged to the right of Olga. 1187 00:47:37,416 --> 00:47:39,426 So Alex would have to start pointing to him, 1188 00:47:39,686 --> 00:47:42,036 so the case there, we had them update 2 things. 1189 00:47:42,036 --> 00:47:43,516 And we didn't quite get it right at first, 1190 00:47:43,516 --> 00:47:44,706 but we did subsequently. 1191 00:47:45,046 --> 00:47:50,686 We first told Robert to point at first. 1192 00:47:50,686 --> 00:47:51,766 Wait, I'm a little confused. 1193 00:47:51,766 --> 00:47:53,776 Isn't first Alex? 1194 00:47:54,396 --> 00:47:56,056 Why does this actually work? 1195 00:47:57,116 --> 00:47:57,936 What is first? 1196 00:47:58,256 --> 00:47:59,526 First is a pointer, it's an address. 1197 00:47:59,706 --> 00:48:01,226 What's it the address of, to be clear? 1198 00:48:02,576 --> 00:48:04,606 So it's Olga. 1199 00:48:04,606 --> 00:48:06,646 So even though we talked about Alex 1200 00:48:06,646 --> 00:48:09,556 and representing this first pointer, the value of side 1201 00:48:09,556 --> 00:48:13,006 of him actually represents the address of Olga 1202 00:48:13,006 --> 00:48:14,096 at that point in the story. 1203 00:48:14,146 --> 00:48:15,836 He's just a container storing an address. 1204 00:48:16,156 --> 00:48:20,336 So when I say that the new pointer's next field equals 1205 00:48:20,376 --> 00:48:25,516 first follow Alex's hand, which leads to Olga and store 1206 00:48:25,516 --> 00:48:27,256 that address inside that field. 1207 00:48:27,536 --> 00:48:30,426 So it's the same thing, but just realize 1208 00:48:30,476 --> 00:48:32,376 that Alex himself was not a node, 1209 00:48:32,376 --> 00:48:35,536 Alex did not have 2 fields, he just had an address 1210 00:48:35,536 --> 00:48:38,036 which is why this direct assignment makes sense. 1211 00:48:38,396 --> 00:48:40,866 Now we told Alex, go ahead an update yourself 1212 00:48:41,166 --> 00:48:45,076 to be the address of this new node, that is point to Robert. 1213 00:48:45,486 --> 00:48:49,336 So at this point, I'm not sure whether your mind prefers the 1214 00:48:49,336 --> 00:48:51,356 model of just thinking about things as arrows 1215 00:48:51,356 --> 00:48:53,696 and human's pointing at each other, or if you prefer thinking 1216 00:48:53,696 --> 00:48:54,726 in terms of addresses, 1217 00:48:55,046 --> 00:48:58,376 this approach too is literally a translation of what we did 1218 00:48:59,116 --> 00:49:00,526 in that demonstration. 1219 00:49:00,526 --> 00:49:03,266 So if you're more comfortable thinking of things as addresses, 1220 00:49:03,266 --> 00:49:05,796 if you're more comfortable thinking of them as arrows 1221 00:49:05,796 --> 00:49:07,306 or hands, that too is fine. 1222 00:49:07,366 --> 00:49:09,996 Now let's glance at the slightly more complicated case 1223 00:49:10,296 --> 00:49:12,756 and this is where your brain gets a little squeezed 1224 00:49:12,756 --> 00:49:14,496 because you have to very carefully consider how 1225 00:49:14,496 --> 00:49:14,976 to do this. 1226 00:49:15,306 --> 00:49:16,686 But thankfully, you do it once, 1227 00:49:16,966 --> 00:49:18,426 and then your insert function works. 1228 00:49:18,856 --> 00:49:22,426 So let's consider, ideally without getting lost, 1229 00:49:22,656 --> 00:49:25,086 how we go about inserting in the middle of the list 1230 00:49:25,376 --> 00:49:28,266 or to the end, the tail of the list as we did with Julie 1231 00:49:28,266 --> 00:49:29,356 in that second scenario. 1232 00:49:29,696 --> 00:49:32,266 So what I'm going to first do is this apparently my role. 1233 00:49:32,336 --> 00:49:33,616 When I came over, 1234 00:49:33,926 --> 00:49:36,446 I was pred-pointer, predecessor pointer. 1235 00:49:36,676 --> 00:49:38,536 And I chose these variable names just to be consistent 1236 00:49:38,536 --> 00:49:39,826 with the textbook's figures. 1237 00:49:40,136 --> 00:49:42,336 I pointed at the same thing Alex was pointed at. 1238 00:49:42,576 --> 00:49:45,166 So that highlighted line of code is exactly what I did, 1239 00:49:45,166 --> 00:49:47,166 with that temporary variable called, 1240 00:49:47,166 --> 00:49:48,306 apparently, pred-pointer. 1241 00:49:48,606 --> 00:49:50,976 Now I just need an infinite loop, and it's okay 1242 00:49:50,976 --> 00:49:53,386 to produce an infinite loops using fore or wild, 1243 00:49:53,386 --> 00:49:56,276 so as long as you are sure, logically, that you break 1244 00:49:56,276 --> 00:49:57,996 out of it, if that's your clear goal. 1245 00:49:58,326 --> 00:49:59,406 So I'm first going to check this, 1246 00:49:59,406 --> 00:50:00,546 and we didn't bother doing this 1247 00:50:00,546 --> 00:50:02,466 because I didn't print duplicates on paper, 1248 00:50:02,736 --> 00:50:03,766 but I am going to check. 1249 00:50:03,766 --> 00:50:08,256 If the pred-pointer is end field equals the new nodes end field, 1250 00:50:08,496 --> 00:50:09,376 we'll forget about this. 1251 00:50:09,376 --> 00:50:10,886 I don't want duplicates in this list. 1252 00:50:11,016 --> 00:50:13,876 This is an undocumented feature of the demo that we just did. 1253 00:50:14,326 --> 00:50:16,656 This feels a little foolish, though. 1254 00:50:16,876 --> 00:50:21,126 So some of you might notice, I just wasted some time 1255 00:50:21,276 --> 00:50:23,416 and allocating a node at the top of the file, 1256 00:50:23,416 --> 00:50:24,866 the top of the function using Malok. 1257 00:50:24,956 --> 00:50:26,276 And yet this feels dumb. 1258 00:50:26,276 --> 00:50:28,566 Only then do I check for duplicates 1259 00:50:28,716 --> 00:50:31,236 and deallocate by calling free. 1260 00:50:31,236 --> 00:50:33,696 What would be an alternative to this because this feels 1261 00:50:33,696 --> 00:50:35,446 like I'm wasting time by allocating the node 1262 00:50:35,446 --> 00:50:37,606 and then realizing, damn I don't need this node, 1263 00:50:37,896 --> 00:50:39,686 here it is back by calling free. 1264 00:50:39,716 --> 00:50:45,936 What's an alternative to that approach? 1265 00:50:45,936 --> 00:50:46,056 >>[inaudible audience question] 1266 00:50:46,056 --> 00:50:46,286 >>Yes. 1267 00:50:46,291 --> 00:50:48,291 >>[inaudible audience question] 1268 00:50:48,296 --> 00:50:51,316 >>So just intuitively, I'll just do a pre-check. 1269 00:50:51,316 --> 00:50:53,516 Let me run through the whole list, which is pretty easy to do 1270 00:50:53,516 --> 00:50:56,316 with the foreloop and I just did it by pointing again and again 1271 00:50:56,316 --> 00:50:57,346 to the humans on stage. 1272 00:50:57,346 --> 00:50:59,856 Let me just do a sanity check and if I find a duplicate then, 1273 00:51:00,046 --> 00:51:01,556 maybe then I can actually quit. 1274 00:51:01,906 --> 00:51:04,616 But now I've essentially doubled my running time 1275 00:51:04,616 --> 00:51:07,656 because if I do this pre-check, I have to linearly go 1276 00:51:07,656 --> 00:51:11,326 across the whole list, then I might say there's no duplicates, 1277 00:51:11,526 --> 00:51:13,026 now I have to find where to go. 1278 00:51:13,216 --> 00:51:15,686 So that would work maybe we could do slightly better. 1279 00:51:15,686 --> 00:51:19,986 And maybe better just means let me just allocate the moment I 1280 00:51:19,986 --> 00:51:23,176 need it, this node, and don't do it at the front of the list. 1281 00:51:23,356 --> 00:51:26,316 In other words, let me go ahead and copy these lines 1282 00:51:26,816 --> 00:51:29,626 and down here, when I actually needed the new node, 1283 00:51:29,626 --> 00:51:31,746 I could insert this code here. 1284 00:51:31,746 --> 00:51:34,896 But the reason I didn't is that I have these multiple cases. 1285 00:51:35,216 --> 00:51:37,966 If I had done this approach, allocating this node on demand, 1286 00:51:38,316 --> 00:51:43,336 now what do you notice me doing, which is generally a bad thing? 1287 00:51:43,336 --> 00:51:43,516 >>[inaudible audience question] 1288 00:51:43,516 --> 00:51:44,376 >>Copy paste, right? 1289 00:51:44,446 --> 00:51:47,436 Duplication of code, there's clearly an opportunity to rip 1290 00:51:47,466 --> 00:51:49,706 that code out, so it's just a trade-off 1291 00:51:49,706 --> 00:51:52,176 and it's a design decision that one could argue. 1292 00:51:52,176 --> 00:51:54,756 Maybe either way, but generally when you resort to copy paste, 1293 00:51:54,906 --> 00:51:56,436 there's probably a better way. 1294 00:51:56,436 --> 00:51:59,156 So at this point, we're going the right direction, it seems. 1295 00:51:59,156 --> 00:52:02,086 I have my node and maybe duplicates are a rare thing. 1296 00:52:02,086 --> 00:52:05,056 So maybe we could argue that it happens so rarely, who cares? 1297 00:52:05,196 --> 00:52:07,396 Every once in a while we spend a few extra cycles. 1298 00:52:07,486 --> 00:52:09,176 That's a reasonable decision to make. 1299 00:52:09,566 --> 00:52:10,386 So let's see. 1300 00:52:10,386 --> 00:52:11,886 Check for insertion at tail. 1301 00:52:12,216 --> 00:52:14,586 So there's 2 cases in this bigger case. 1302 00:52:14,676 --> 00:52:16,106 Is it in the middle, or is it the tail? 1303 00:52:16,436 --> 00:52:18,536 After a lot of thought, and maybe some trial and error, 1304 00:52:18,536 --> 00:52:20,286 I realized that it's a little easier for me 1305 00:52:20,286 --> 00:52:21,836 to check insertion at the tail. 1306 00:52:22,216 --> 00:52:27,176 So if the temporary pointer, me, if I'm pointing at nothing, 1307 00:52:27,586 --> 00:52:30,396 as I was in one situation when we were inserting Julie, 1308 00:52:30,636 --> 00:52:32,506 then it's actually a really easy case. 1309 00:52:32,636 --> 00:52:38,156 Go ahead a new structure, go ahead an point to the end 1310 00:52:38,156 --> 00:52:41,956 of the list, whom I'm currently pointing at, to the new node. 1311 00:52:43,076 --> 00:52:44,906 And again, I won't go too long on too much of the code 1312 00:52:44,906 --> 00:52:46,366 because I know it's easy to get caught 1313 00:52:46,366 --> 00:52:49,856 in some stupid little detail and that's fine reasoning through it 1314 00:52:49,856 --> 00:52:51,966 with the picture in hand can certainly help afterwards. 1315 00:52:51,996 --> 00:52:55,066 This is just the translation of what I did to insert Julie 1316 00:52:55,066 --> 00:52:57,226 at the end of the list when only Olga was there. 1317 00:52:57,226 --> 00:52:58,416 The list was size 1. 1318 00:52:58,826 --> 00:53:02,976 So now let's assume, and we didn't do this per say, 1319 00:53:02,976 --> 00:53:04,886 because we inserted Robert at the beginning, 1320 00:53:04,886 --> 00:53:07,416 not at the middle, but there is this last case. 1321 00:53:08,036 --> 00:53:09,486 So this is interesting. 1322 00:53:09,486 --> 00:53:12,416 You can actually chain together this arrow notation. 1323 00:53:12,736 --> 00:53:14,946 If I am the temporary pointer, pred-pointer, 1324 00:53:15,206 --> 00:53:18,326 and I follow the next field and then I follow 1325 00:53:18,326 --> 00:53:23,416 down into the end field, so gone 2 steps. 1326 00:53:23,416 --> 00:53:28,266 If that equals the N in the new node, 1327 00:53:28,616 --> 00:53:31,226 I know that this node belongs somewhere in the middle 1328 00:53:31,226 --> 00:53:34,596 and so I need to update 2 pointers here. 1329 00:53:34,596 --> 00:53:37,716 And again, I don't want to dwell too much on the syntax here, 1330 00:53:37,716 --> 00:53:39,996 because I think it's best done by drawing a picture 1331 00:53:39,996 --> 00:53:42,506 on the sides, thinking this through. 1332 00:53:42,506 --> 00:53:44,836 But the takeaway, that I think is interesting, 1333 00:53:45,056 --> 00:53:47,186 that there just needs to be 2 pointer updates. 1334 00:53:47,556 --> 00:53:49,426 So we've really only seen 2 general cases. 1335 00:53:49,596 --> 00:53:51,716 1 pointer update, which was terribly easy, 1336 00:53:51,716 --> 00:53:53,246 and that's why we did those examples first, 1337 00:53:53,496 --> 00:53:55,726 and that's why I myself coded them up first 1338 00:53:56,046 --> 00:53:58,426 and the only more interesting examples is 1339 00:53:58,426 --> 00:54:00,156 when the thing belongs to the list's head, 1340 00:54:00,486 --> 00:54:04,186 we realized we needed to update 2 pointers in the proper order, 1341 00:54:04,446 --> 00:54:06,276 and similarly if it belongs in the middle, 1342 00:54:06,276 --> 00:54:08,346 just conceptually we have someone here, 1343 00:54:08,506 --> 00:54:11,416 someone here- certainly we need to update 2 pointers 1344 00:54:11,766 --> 00:54:15,556 so they point to me and then I point to them. 1345 00:54:15,786 --> 00:54:19,636 So the reason for this double traversal is 1346 00:54:19,636 --> 00:54:22,356 because once you go left, you can't go right 1347 00:54:22,596 --> 00:54:23,616 in a singly linked list. 1348 00:54:23,776 --> 00:54:25,626 A singly linked list, by definition, 1349 00:54:25,916 --> 00:54:28,976 has only single lists going in 1 direction. 1350 00:54:29,356 --> 00:54:32,296 So as this picture here depicts, this is how it relates 1351 00:54:32,296 --> 00:54:34,906 to deletion, which is the same idea when searching 1352 00:54:34,906 --> 00:54:36,706 through a list and then wanting to delete a node, 1353 00:54:37,126 --> 00:54:42,466 what do we do- let's actually do insert in the middle. 1354 00:54:42,466 --> 00:54:44,006 And this is the one with several steps. 1355 00:54:45,056 --> 00:54:47,766 If we want to traverse this list, notice, 1356 00:54:47,796 --> 00:54:50,116 just as the arrows indicate, we can only go left to right. 1357 00:54:50,586 --> 00:54:54,086 So it turns out that sometimes, for programmer's convenience 1358 00:54:54,176 --> 00:55:00,006 or for algorithm's sake, you can throw more metadata 1359 00:55:00,006 --> 00:55:02,776 into these structures, there's nothing stopping us 1360 00:55:02,776 --> 00:55:04,456 from having not only a next pointer, 1361 00:55:04,746 --> 00:55:06,246 but also a previous pointer. 1362 00:55:06,416 --> 00:55:09,366 Because in fact, it would have simplified some of my code 1363 00:55:09,366 --> 00:55:11,806 if I had the ability to traverse the list this way, 1364 00:55:12,096 --> 00:55:13,336 and then looked backwards 1365 00:55:13,676 --> 00:55:16,056 because then I could have eliminated, if you look closely 1366 00:55:16,056 --> 00:55:19,616 at the code in problem set 6, I could have eliminated one 1367 00:55:19,616 --> 00:55:20,876 of my temporary variables. 1368 00:55:21,396 --> 00:55:23,706 Because, again, in that code I had myself 1369 00:55:23,706 --> 00:55:25,796 and then there's also a temporary variable who's 1370 00:55:25,796 --> 00:55:27,836 essentially following what's going 1371 00:55:27,836 --> 00:55:30,356 on in the story a left hand and a right hand, if you will. 1372 00:55:30,356 --> 00:55:33,076 But if I have the ability from any node to go left or right, 1373 00:55:33,076 --> 00:55:35,806 can I then get rid of one of those temporary variables? 1374 00:55:36,386 --> 00:55:38,256 But this, again, hints at this trade-off. 1375 00:55:38,496 --> 00:55:40,276 I can double my number of pointers, 1376 00:55:40,586 --> 00:55:43,846 save myself some thought, saver myself some logical coding, 1377 00:55:44,346 --> 00:55:46,866 but the downside of course is that I'm spending twice 1378 00:55:46,866 --> 00:55:49,006 as much memory on these pointers. 1379 00:55:49,316 --> 00:55:50,406 But here's another question, 1380 00:55:50,406 --> 00:55:52,606 and this picture spoils the fun answer. 1381 00:55:52,956 --> 00:55:55,006 How do you determine the length of a linked list, 1382 00:55:55,586 --> 00:55:58,496 according to how we've implemented it thus far? 1383 00:55:58,496 --> 00:55:59,586 Which is a picture more like this. 1384 00:55:59,976 --> 00:56:04,596 How do you determine the length of this thing? 1385 00:56:04,596 --> 00:56:04,826 >>[inaudible audience question] 1386 00:56:04,826 --> 00:56:05,706 >>Just traverse it, right? 1387 00:56:05,846 --> 00:56:08,446 Start at the beginning, start where Alex was, 1388 00:56:08,646 --> 00:56:11,536 and go duh-duh-duh-duh, following the arrows and as soon 1389 00:56:11,536 --> 00:56:13,366 as you hit what do you stop counting? 1390 00:56:14,366 --> 00:56:16,186 The null. As soon as you hit the null pointer, 1391 00:56:16,186 --> 00:56:18,346 which was originally at Olga then at Julie, 1392 00:56:18,346 --> 00:56:20,666 their left hand was null, then you know that you're at the end. 1393 00:56:20,666 --> 00:56:21,916 And so long as you are keeping track 1394 00:56:21,916 --> 00:56:23,886 with an N plus plus plus plus, 1395 00:56:24,056 --> 00:56:26,296 you know that the list if size three. 1396 00:56:26,546 --> 00:56:28,136 But that's a little inefficient. 1397 00:56:28,416 --> 00:56:29,856 Here, too, is this trade-off. 1398 00:56:29,926 --> 00:56:33,786 We could probably spend 32 more bits up front 1399 00:56:34,166 --> 00:56:37,146 and actually avoid having to recount the length of this list. 1400 00:56:37,146 --> 00:56:39,946 If I'm writing some programs that constantly requires 1401 00:56:39,946 --> 00:56:41,076 that I know the length of the list, 1402 00:56:41,396 --> 00:56:45,226 why not just compute it once, and keep that answer around. 1403 00:56:45,226 --> 00:56:47,286 So what this picture also hints at is 1404 00:56:47,286 --> 00:56:50,076 that Alex did not need to be so simple. 1405 00:56:50,076 --> 00:56:53,006 He did not just have to be a pointer to a node. 1406 00:56:53,266 --> 00:56:56,556 He himself could have been an entirely separate structure. 1407 00:56:56,826 --> 00:56:58,646 We've seen how to define a struct. 1408 00:56:58,646 --> 00:57:00,286 Type def struct and then some name, 1409 00:57:00,286 --> 00:57:01,556 whatever you want to put inside. 1410 00:57:01,846 --> 00:57:04,286 We could have made Alex a special struct 1411 00:57:04,286 --> 00:57:07,036 that we used 1 copy of, 1 of his fields is 1412 00:57:07,036 --> 00:57:08,376 in fact a pointer to a node. 1413 00:57:08,776 --> 00:57:11,706 But his other field maybe is, as this picture implies, 1414 00:57:12,346 --> 00:57:14,146 an int, the size of the list. 1415 00:57:14,606 --> 00:57:17,406 So what I could go do, is go back through all of my code 1416 00:57:17,406 --> 00:57:19,086 and go back through all my demonstration 1417 00:57:19,286 --> 00:57:21,396 and every time I insert, say, Julie. 1418 00:57:21,396 --> 00:57:24,166 Every time I insert Robert, not only do I update the 1 1419 00:57:24,166 --> 00:57:26,106 or 2 pointers that a requisite for that. 1420 00:57:26,266 --> 00:57:28,416 I also do what to keep track of the current length? 1421 00:57:29,876 --> 00:57:31,476 Then do the plus plus there. 1422 00:57:31,476 --> 00:57:34,686 So again, this hints at the opportunities that you, 1423 00:57:34,686 --> 00:57:36,876 as the designer as a computer scientist has 1424 00:57:36,906 --> 00:57:38,666 to solve problems more efficiently. 1425 00:57:38,666 --> 00:57:41,036 If you don't want to waste time again and again. 1426 00:57:41,206 --> 00:57:44,156 Incur linear costs, linear costs just to compute the size 1427 00:57:44,156 --> 00:57:46,306 of something, the length of something, you can solve 1428 00:57:46,306 --> 00:57:49,526 that a priori by keeping that information around. 1429 00:57:49,876 --> 00:57:52,876 Linked lists are only 1 interesting data structure. 1430 00:57:52,876 --> 00:57:55,476 And again, we will use these before long for a problem set. 1431 00:57:55,996 --> 00:58:00,026 Another canonical example is this 1 here, a stack. 1432 00:58:00,026 --> 00:58:03,176 And the cheesy example most any instructor gives 1433 00:58:03,176 --> 00:58:05,186 when discussing stacks is that it takes some photographs 1434 00:58:05,186 --> 00:58:08,526 of some phrase like this and cafeteria and much 1435 00:58:08,616 --> 00:58:12,626 like you experience as a human, when you put a tray on top. 1436 00:58:12,816 --> 00:58:16,326 When the folks running the dining hall put trays on top 1437 00:58:16,326 --> 00:58:17,926 on top on top of one another, 1438 00:58:18,166 --> 00:58:21,176 the last 1 in is the first 1 out. 1439 00:58:21,566 --> 00:58:24,356 In other, words, the top tray is the first one you, 1440 00:58:24,356 --> 00:58:26,366 the customer actually take off. 1441 00:58:26,366 --> 00:58:28,816 So a stack is another data structure 1442 00:58:29,236 --> 00:58:32,746 that is what's called a LIFO data structure. 1443 00:58:32,946 --> 00:58:35,086 Last in, first out. 1444 00:58:35,386 --> 00:58:38,126 Now this actually has some applications that you'll cover 1445 00:58:38,126 --> 00:58:42,636 in courses like CS121 or even, if you've ever written HTML, 1446 00:58:42,636 --> 00:58:45,036 made web pages, you know there's a structure to them. 1447 00:58:45,036 --> 00:58:46,786 There's a hierarchy where literally, 1448 00:58:46,786 --> 00:58:49,076 if you indent things properly, even though it's not required, 1449 00:58:49,276 --> 00:58:51,426 it gets in deeper and deeper and deeper and deeper. 1450 00:58:51,426 --> 00:58:54,816 You can actually parse, you can actually analyze a webpage 1451 00:58:54,816 --> 00:58:57,496 for correctness and for other purposes, using a stack 1452 00:58:58,086 --> 00:59:01,256 and the idea, just as an aside is that each tag you read, 1453 00:59:01,536 --> 00:59:03,556 you push onto the stack, push onto the stack, 1454 00:59:03,556 --> 00:59:06,726 push onto the stack, so you have the most recent tag latest 1455 00:59:06,776 --> 00:59:09,696 and then you can check your indentation and all 1456 00:59:09,696 --> 00:59:12,116 of your symmetry is right by going backwards. 1457 00:59:12,146 --> 00:59:14,876 But there's more useful purposes than just that. 1458 00:59:14,876 --> 00:59:16,306 But this is a LIFO data structure, 1459 00:59:16,506 --> 00:59:18,696 but all a stack is some chunk of memory 1460 00:59:18,936 --> 00:59:21,236 that you can put something here, then something here. 1461 00:59:21,236 --> 00:59:23,096 Here, here, here, here. 1462 00:59:23,326 --> 00:59:26,066 But this is just a general data type, an abstract data type. 1463 00:59:26,066 --> 00:59:28,696 How could you implement this idea of a stack? 1464 00:59:29,096 --> 00:59:31,246 Using what C primitives or basics? 1465 00:59:31,246 --> 00:59:31,876 >>[inaudible audience question] 1466 00:59:31,876 --> 00:59:32,026 >>Sorry? 1467 00:59:32,026 --> 00:59:33,176 >>[inaudible audience question] 1468 00:59:33,176 --> 00:59:38,156 >>So we have a couple of data structures that are disposable. 1469 00:59:38,206 --> 00:59:40,006 What is the simplest way to dispose of this idea 1470 00:59:40,006 --> 00:59:42,286 of a data structure that looks like you put something on top 1471 00:59:42,286 --> 00:59:45,396 of something on top of something. 1472 00:59:45,526 --> 00:59:46,706 We've done this, right? 1473 00:59:46,806 --> 00:59:48,316 We just always drew the picture like this. 1474 00:59:48,316 --> 00:59:49,346 Something, then something 1475 00:59:49,346 --> 00:59:50,616 and then something and then something. 1476 00:59:50,676 --> 00:59:51,746 Right, it's just an array, 1477 00:59:51,746 --> 00:59:53,936 just flipped conceptually upside down. 1478 00:59:54,206 --> 00:59:55,376 But there are some constraints. 1479 00:59:55,676 --> 00:59:59,546 And here's the distinction array is really not an abstract data 1480 00:59:59,546 --> 01:00:01,506 type, it's not a proper data structure 1481 01:00:01,506 --> 01:00:03,636 in that you can go anywhere you want. 1482 01:00:03,696 --> 01:00:07,786 So in that sense, there's no- it's much simpler 1483 01:00:07,866 --> 01:00:09,046 than the things we're now discussing. 1484 01:00:09,046 --> 01:00:11,706 Whereas a linked list, you can't go to a random element. 1485 01:00:11,906 --> 01:00:14,226 You have to start at the beginning, search through it 1486 01:00:14,486 --> 01:00:16,366 and that now has a more interesting cost 1487 01:00:16,366 --> 01:00:19,366 but with stacks, what is the only element in a stack, 1488 01:00:19,606 --> 01:00:23,046 at least in reality, that you can access in a given time? 1489 01:00:23,816 --> 01:00:25,186 The top tray, right? 1490 01:00:25,186 --> 01:00:26,906 If you assumed 1 stack of trays here, 1491 01:00:26,906 --> 01:00:28,266 the only one realistically you're going 1492 01:00:28,266 --> 01:00:29,526 to get is the top tray. 1493 01:00:29,526 --> 01:00:30,536 Going anywhere in the middle 1494 01:00:30,536 --> 01:00:32,806 or bottom is just not physically feasible, generally, 1495 01:00:33,076 --> 01:00:35,906 so the operations that a stack typically supports is something 1496 01:00:35,906 --> 01:00:38,306 called push and something called pop. 1497 01:00:38,416 --> 01:00:40,866 And these functions would just be implemented, 1498 01:00:40,936 --> 01:00:44,316 whether they're using an array or even a linked list to add 1499 01:00:44,426 --> 01:00:46,296 or remove elements from this stack. 1500 01:00:46,656 --> 01:00:47,536 But there's another one. 1501 01:00:47,536 --> 01:00:49,916 So this is a silly picture of some crazy people lined 1502 01:00:49,916 --> 01:00:51,906 up for the iPhone 3G in New York. 1503 01:00:52,186 --> 01:00:53,156 So this is a queue. 1504 01:00:53,156 --> 01:00:55,136 This is a line, this happens all day long 1505 01:00:55,366 --> 01:00:58,496 but there is dare I say even more common data structure. 1506 01:00:58,746 --> 01:01:00,966 In computer science or in programming in general, 1507 01:01:01,126 --> 01:01:05,236 because this data structure is a FIFO structure. 1508 01:01:05,376 --> 01:01:06,846 First in, first out. 1509 01:01:07,196 --> 01:01:09,156 That is, if you get there first at the front of the line, 1510 01:01:09,156 --> 01:01:11,676 you're the first one to buy the iPhone. 1511 01:01:11,676 --> 01:01:14,566 And this has fundamentally different applications, right? 1512 01:01:14,696 --> 01:01:19,876 Queues, or queues in this real world sense are a much better 1513 01:01:19,876 --> 01:01:23,146 solution to the problem of fairness than a stack. 1514 01:01:23,146 --> 01:01:25,746 Imagine if you were 1 of those crazy people who wanted 1515 01:01:25,746 --> 01:01:28,386 to get an iPhone that first day and Apple, 1516 01:01:28,586 --> 01:01:29,956 because they're a computer-minded people, 1517 01:01:29,956 --> 01:01:32,136 decided to implement this thing as a stack. 1518 01:01:33,666 --> 01:01:36,176 Right? You'd be ticked off, right? 1519 01:01:37,386 --> 01:01:38,546 Right now, that's something 1520 01:01:38,546 --> 01:01:40,626 that a computer scientist might actually do. 1521 01:01:40,626 --> 01:01:44,796 But anyone with a queue similarly has insert operations 1522 01:01:44,796 --> 01:01:46,026 and remove operations. 1523 01:01:46,236 --> 01:01:48,896 But fundamentally those operations manipulate different 1524 01:01:48,896 --> 01:01:49,616 pieces of data. 1525 01:01:49,616 --> 01:01:52,226 So anytime fairness is important, queues are used. 1526 01:01:52,396 --> 01:01:54,056 So in fact, you might generally note 1527 01:01:54,056 --> 01:01:55,996 that the internet has routers. 1528 01:01:56,036 --> 01:01:58,446 Devices that take data from point A to point B 1529 01:01:58,446 --> 01:01:59,816 and they send them every which way. 1530 01:02:00,136 --> 01:02:02,416 Well, routers have RAM they have memory, 1531 01:02:02,416 --> 01:02:03,726 they have hard disks often. 1532 01:02:03,966 --> 01:02:07,266 So when they get overwhelmed by traffic spikes, 1533 01:02:07,426 --> 01:02:09,616 a lot of people downloading something in a given time, 1534 01:02:09,816 --> 01:02:12,396 the routers can't handle everyone's traffic all at once, 1535 01:02:12,626 --> 01:02:14,226 so traffic starts to back up. 1536 01:02:14,396 --> 01:02:17,426 So thankfully routers and many other devices have queues, 1537 01:02:17,426 --> 01:02:19,826 chunks of memory, whereby a packet comes in 1538 01:02:20,006 --> 01:02:22,216 and then it queues up, a whole bunch of packets come in 1539 01:02:22,216 --> 01:02:25,416 and when the router finally has a moment to deal with your data, 1540 01:02:25,416 --> 01:02:27,526 or your download, does it finally move you 1541 01:02:27,526 --> 01:02:28,146 through the line. 1542 01:02:28,426 --> 01:02:30,456 But here, too, there are interesting opportunities. 1543 01:02:30,526 --> 01:02:32,496 You might note that some people 1544 01:02:32,496 --> 01:02:34,246 like to prioritize on the internet. 1545 01:02:34,246 --> 01:02:37,256 Or would like to prioritize data like voice traffic. 1546 01:02:37,256 --> 01:02:40,396 Or they'd like to deprioritze downloads of illegal movies. 1547 01:02:40,556 --> 01:02:41,656 And the ways you can do this is 1548 01:02:41,656 --> 01:02:44,476 by using different data structures at the router level 1549 01:02:45,016 --> 01:02:46,516 to prioritize different data. 1550 01:02:46,516 --> 01:02:50,176 Queues is a pretty fair thing, but if you want some other type 1551 01:02:50,176 --> 01:02:52,126 of traffic to go through that router first, 1552 01:02:52,366 --> 01:02:59,256 a queues is not necessarily the right data structure to deploy. 1553 01:02:59,666 --> 01:03:02,616 So those are just teases of some of the data structures to come. 1554 01:03:03,076 --> 01:03:06,566 What I thought we'd do is end on this note. 1555 01:03:06,566 --> 01:03:07,626 Very common in this point 1556 01:03:07,626 --> 01:03:10,326 in the semester are bugs in your own code. 1557 01:03:10,566 --> 01:03:13,676 We mentioned several weeks ago, that 1 of the first bugs, 1558 01:03:13,676 --> 01:03:16,306 or one 1 of the most defining bugs 1559 01:03:16,726 --> 01:03:19,276 in our history is this one here. 1560 01:03:19,276 --> 01:03:21,636 When the great Grace Hopper or one of her colleagues pulled 1561 01:03:21,636 --> 01:03:24,266 out what was literally a bug from the Mark 2 computer 1562 01:03:24,546 --> 01:03:26,486 and then was entered into the notebook here 1563 01:03:26,486 --> 01:03:28,336 and they had a good laugh, ha-ha, 1564 01:03:28,336 --> 01:03:29,676 computer actually has bug. 1565 01:03:29,926 --> 01:03:31,636 Well, there's been some stupid bugs. 1566 01:03:31,886 --> 01:03:35,256 And we are at the stack, by no means guilt-free of this. 1567 01:03:35,626 --> 01:03:37,866 We make mistakes, you make mistakes 1568 01:03:37,866 --> 01:03:42,186 and so what I did was troll around for some inspiring, 1569 01:03:42,186 --> 01:03:44,256 reassuring, if not humorous errors. 1570 01:03:44,526 --> 01:03:45,746 This one you might have seen. 1571 01:03:46,096 --> 01:03:48,306 How many of you have ever seen this on your own piece of tape. 1572 01:03:48,856 --> 01:03:50,836 So this is called the "Blue Screen of Death." 1573 01:03:51,266 --> 01:03:54,616 Not a terribly helpful message, but frankly at this point 1574 01:03:54,616 --> 01:03:58,186 in the course a fatal exception 0 E. I have no idea what 1575 01:03:58,186 --> 01:03:59,636 that means, but hexadecimal. 1576 01:03:59,636 --> 01:04:01,516 I can now understand a little bit more of this thing. 1577 01:04:01,776 --> 01:04:05,136 Has occurred at 0 0 28 cog pull in some other stuff. 1578 01:04:05,136 --> 01:04:08,556 VXD, that happens to represent virtual device driver. 1579 01:04:08,676 --> 01:04:10,316 You wouldn't know that other than by Googling. 1580 01:04:10,546 --> 01:04:13,036 But this was Microsoft's early on form 1581 01:04:13,156 --> 01:04:15,266 of a really bad error message that you know what? 1582 01:04:15,266 --> 01:04:18,626 Is actually useful, just not to most of us in this room. 1583 01:04:18,706 --> 01:04:21,706 In fact really none of us, to Microsoft's engineers, 1584 01:04:21,956 --> 01:04:23,316 on the rare occasion you might be able 1585 01:04:23,316 --> 01:04:26,616 to tell them what number you're experiencing, this is some kind 1586 01:04:26,616 --> 01:04:28,956 of error code indicating a problem in this driver. 1587 01:04:29,296 --> 01:04:30,156 Well, this one, too. 1588 01:04:30,156 --> 01:04:33,576 Similarly arcane, but perhaps is a little more accessible. 1589 01:04:33,576 --> 01:04:35,296 I have no idea what the problem was 1590 01:04:35,296 --> 01:04:37,086 in this program, explorer dot exe. 1591 01:04:37,306 --> 01:04:39,476 But the instruction at OX77FF. 1592 01:04:39,826 --> 01:04:40,436 So that's an int. 1593 01:04:40,556 --> 01:04:42,816 That's an address in memory, referenced memory 1594 01:04:42,816 --> 01:04:43,706 at some other address. 1595 01:04:43,996 --> 01:04:45,556 The memory, and this is the funny part, I think. 1596 01:04:45,676 --> 01:04:49,036 The memory cannot be read, just want to emphasize that. 1597 01:04:49,276 --> 01:04:51,286 But this was just a bug in the program and frankly, 1598 01:04:51,286 --> 01:04:55,226 it sounds like it was some stupid pointer mistake. 1599 01:04:55,386 --> 01:04:57,046 So if you ever do see this message, 1600 01:04:57,306 --> 01:04:58,846 that's really what they're hinting at. 1601 01:04:58,846 --> 01:05:00,506 This is just a bug in many ways. 1602 01:05:00,896 --> 01:05:03,366 So this is a legitimate error 1603 01:05:03,366 --> 01:05:05,096 that someone took a screenshot of. 1604 01:05:05,096 --> 01:05:09,256 Now things devolve into less compelling ones. 1605 01:05:09,396 --> 01:05:10,356 This is cute. 1606 01:05:10,356 --> 01:05:10,776 [ Laughter ] 1607 01:05:10,776 --> 01:05:20,436 >>This one, you have to think about it, that is true. 1608 01:05:20,706 --> 01:05:23,486 So that is many years old and quite popular. 1609 01:05:23,746 --> 01:05:25,376 This is just a problem. 1610 01:05:25,376 --> 01:05:27,336 This is within a browser using JavaScript, 1611 01:05:27,336 --> 01:05:29,686 bad things can happen in even just the confines here. 1612 01:05:29,686 --> 01:05:30,576 So this is just weird. 1613 01:05:30,936 --> 01:05:33,656 This is what happens when you start using Google images 1614 01:05:33,696 --> 01:05:34,016 too much. 1615 01:05:34,266 --> 01:05:36,396 So this is a tattoo of a blue screen 1616 01:05:36,396 --> 01:05:38,406 of death on this fellow's arm. 1617 01:05:38,726 --> 01:05:41,126 He's still red from it, I don't know why. 1618 01:05:41,126 --> 01:05:42,536 I'm a little uncomfortable with it, 1619 01:05:42,536 --> 01:05:44,036 but we'll leave the slides online. 1620 01:05:44,366 --> 01:05:46,346 So this, too, is bad, And I actually thought 1621 01:05:46,346 --> 01:05:47,116 of this the other day. 1622 01:05:47,116 --> 01:05:49,596 Fidelity Investment bank on the corner of some street, 1623 01:05:49,596 --> 01:05:52,336 maybe in New York, so that's a scrolling blue screen 1624 01:05:52,336 --> 01:05:53,436 of death, essentially. 1625 01:05:54,216 --> 01:05:58,096 An error has occurred and Windows has been shut down, 1626 01:05:58,096 --> 01:05:59,206 or something to that effect. 1627 01:05:59,356 --> 01:06:01,636 And I wish I'd had the foresight to pull 1628 01:06:01,636 --> 01:06:02,426 out my camera at the time. 1629 01:06:02,426 --> 01:06:03,896 I was just going through the subway the other day 1630 01:06:03,896 --> 01:06:06,916 in Harvard Square and the swipe machines that they have 1631 01:06:06,966 --> 01:06:12,626 to the entrance to the T, I was very sad to realize 1632 01:06:12,626 --> 01:06:14,306 that whoever designed them, they used Windows 1633 01:06:14,356 --> 01:06:16,656 to run these things, because the little thing that normally says, 1634 01:06:16,656 --> 01:06:20,766 "Enter Here," said this. 1635 01:06:21,676 --> 01:06:24,506 So that was amusing, if you can catch a photo of that, do. 1636 01:06:24,606 --> 01:06:27,236 So to be fair, this one was actually fun. 1637 01:06:27,236 --> 01:06:29,796 So this was an advertisement for Windows Vista. 1638 01:06:30,226 --> 01:06:33,556 The machine, funny enough, was behind glass. 1639 01:06:34,236 --> 01:06:36,366 Macs are not infallible. 1640 01:06:36,366 --> 01:06:37,916 This is the Mac version of this. 1641 01:06:37,956 --> 01:06:39,886 They simplified it, it's a little less scary 1642 01:06:39,886 --> 01:06:41,406 than the blue cryptic message. 1643 01:06:41,406 --> 01:06:43,486 Apple, in fact, doesn't tell you there's a problem, just, 1644 01:06:43,726 --> 01:06:45,236 "You need to restart your computer," 1645 01:06:45,606 --> 01:06:47,316 cutting corners there. 1646 01:06:47,596 --> 01:06:48,976 So there's 2, if you start Googling. 1647 01:06:48,976 --> 01:06:51,536 There's the sad Mac users get. 1648 01:06:51,756 --> 01:06:54,156 I will not follow the link on the page I found this. 1649 01:06:54,156 --> 01:06:55,876 And some people have done some sketchy things 1650 01:06:55,876 --> 01:06:58,296 to their bodies using error messages from computers. 1651 01:06:58,556 --> 01:06:59,616 I'll link it on the website. 1652 01:06:59,866 --> 01:07:02,126 This is from Scent OS, a version of Linux. 1653 01:07:02,126 --> 01:07:03,566 So this, too, happens sometimes. 1654 01:07:03,796 --> 01:07:08,026 Please insert negative disk 099 to continue. 1655 01:07:08,026 --> 01:07:11,446 But perhaps the best 1 of all time, no- second to best. 1656 01:07:11,736 --> 01:07:12,956 This 1 I snapped myself. 1657 01:07:12,956 --> 01:07:15,656 I was ordering pizza late at night, being a geek, 1658 01:07:15,656 --> 01:07:17,316 I thought a printout error was cool. 1659 01:07:17,736 --> 01:07:20,556 So I was offered this upsell to buy some chicken in addition 1660 01:07:20,556 --> 01:07:22,446 to my pizza, do you see the mistake? 1661 01:07:23,456 --> 01:07:24,776 This is a print-def fail. 1662 01:07:24,776 --> 01:07:27,746 They didn't use the right width for the floating points value. 1663 01:07:27,746 --> 01:07:31,636 So this is $6.30, but they haven't padded it properly 1664 01:07:31,636 --> 01:07:32,506 to half that 0. 1665 01:07:32,796 --> 01:07:34,836 And this 1 is perhaps the most famous 1. 1666 01:07:35,116 --> 01:07:37,266 So some engineer somewhere, years ago, 1667 01:07:37,266 --> 01:07:38,916 maybe at HP to the light decided 1668 01:07:38,916 --> 01:07:41,496 that paper cassette would load a letter, 1669 01:07:41,676 --> 01:07:44,156 or a PC load letter would be sufficient information 1670 01:07:44,156 --> 01:07:46,676 for millions of people over 10 years to understand 1671 01:07:46,676 --> 01:07:48,906 when it's time to load the paper into the machine. 1672 01:07:49,126 --> 01:07:51,796 We are perhaps not the only ones who find this aggravating. 1673 01:07:51,796 --> 01:07:54,226 I thought we'd end on this infamous note. 1674 01:07:55,186 --> 01:07:58,666 Slightly R-rated, but we'll end with this. 1675 01:07:58,866 --> 01:08:03,876 We'll end with this final moment, a wonderful movie, 1676 01:08:03,876 --> 01:08:05,556 if you've never seen it called, Office Space. 1677 01:08:05,556 --> 01:08:07,066 >>What would I do if I had million dollars? 1678 01:08:08,176 --> 01:08:10,976 I would invest half of it in low-risk mutual funds, 1679 01:08:11,416 --> 01:08:13,326 and then take the other half to my friend [inaudible] 1680 01:08:13,326 --> 01:08:14,786 who works in securities. 1681 01:08:14,786 --> 01:08:16,286 >>Samir, you're missing the point. 1682 01:08:16,836 --> 01:08:18,466 Point of the exercise is to figure 1683 01:08:18,466 --> 01:08:22,266 out what you'd want to do with it. 1684 01:08:22,266 --> 01:08:36,016 PC load letter, what the fuck does that mean? 1685 01:08:36,016 --> 01:08:37,958 [ Crash ] 1686 01:08:37,958 --> 01:08:39,900 [ Music ]