1 00:00:10,926 --> 00:00:11,286 >> David: Welcome back. 2 00:00:11,356 --> 00:00:17,346 This is Week 5 of CS50 and this is Sudoku in Problem Set 4. 3 00:00:17,346 --> 00:00:20,056 If you haven't dived in already this week you will be again 4 00:00:20,056 --> 00:00:22,346 handed some so called distribution code. 5 00:00:22,586 --> 00:00:24,536 Lines of code that we have written that provides you 6 00:00:24,536 --> 00:00:27,066 with a framework with which to begin this problem set 7 00:00:27,556 --> 00:00:31,486 and there, from there you then have to proceed to fill 8 00:00:31,486 --> 00:00:34,226 in some blanks namely to implement this game of Sudoku. 9 00:00:34,226 --> 00:00:35,426 Now for those unfamiliar, 10 00:00:35,776 --> 00:00:38,196 this is a game that's still pretty popular 11 00:00:38,196 --> 00:00:40,886 in various newspapers that you get on the subway and the like 12 00:00:40,886 --> 00:00:43,316 and you can fill in some boxes with numbers 13 00:00:43,316 --> 00:00:44,816 and it looks a little something like this. 14 00:00:44,926 --> 00:00:48,636 Now this screen shot here is of the staff's implementation 15 00:00:48,636 --> 00:00:50,066 of the hacker edition actually 16 00:00:50,066 --> 00:00:51,286 and what we've used is a little bit 17 00:00:51,286 --> 00:00:54,496 of so called ASCII using the characters on the keyboard 18 00:00:54,496 --> 00:00:56,716 in order to render a Graphical User Interface 19 00:00:56,946 --> 00:00:59,286 but what's interesting about problem set four is 20 00:00:59,286 --> 00:01:03,166 that with problem set three the game of 15, it too was a game 21 00:01:03,166 --> 00:01:05,216 and it too had a GUI of sorts, 22 00:01:05,476 --> 00:01:07,946 but recall that for problem set three to print that GUI 23 00:01:07,946 --> 00:01:10,886 out all we kept doing was calling print F from the top 24 00:01:10,886 --> 00:01:13,776 of the screen all the way down and when we had a special line 25 00:01:13,776 --> 00:01:15,226 of code that would clear the screen 26 00:01:15,456 --> 00:01:18,336 and then we would redraw the whole board from top to bottom. 27 00:01:18,336 --> 00:01:21,266 Now, this doesn't really lend itself to very good animation 28 00:01:21,266 --> 00:01:23,506 and it's certainly not terribly efficient if just 29 00:01:23,506 --> 00:01:25,376 to update the board you essentially have 30 00:01:25,376 --> 00:01:27,126 to redraw the whole screen. 31 00:01:27,126 --> 00:01:30,446 In fact, if you play with hence forth problem set three your 32 00:01:30,446 --> 00:01:33,066 implementation on a really slow Internet connection, 33 00:01:33,066 --> 00:01:36,076 maybe back home over vacation you might actually see the 34 00:01:36,076 --> 00:01:39,536 screen redrawing if your connection is sufficiently slow 35 00:01:39,536 --> 00:01:41,556 that you literally see each line getting drawn. 36 00:01:41,816 --> 00:01:44,616 So much more efficient at least conceptually would be 37 00:01:44,616 --> 00:01:46,126 if you want to update part 38 00:01:46,126 --> 00:01:48,846 of the screen don't redraw the whole darn board top 39 00:01:48,846 --> 00:01:49,856 to bottom, left to right. 40 00:01:50,136 --> 00:01:53,046 Just update the individual character that you care about. 41 00:01:53,046 --> 00:01:56,666 And so we introduce in problem set four a graphical library. 42 00:01:56,666 --> 00:02:00,766 It's called ncurses, for new curses, and this is a C library 43 00:02:00,766 --> 00:02:03,216 that is code that other people wrote that you can integrate 44 00:02:03,216 --> 00:02:05,066 into your own programs that allows you 45 00:02:05,066 --> 00:02:08,376 to have a much more compelling graphical interface. 46 00:02:08,376 --> 00:02:10,356 Now at the end of the day the graphics are still just 47 00:02:10,586 --> 00:02:13,386 characters from the keyboard, ASCII codes themselves. 48 00:02:13,556 --> 00:02:15,936 Toward the end of the semester we'll introduce actual graphics 49 00:02:15,936 --> 00:02:17,386 and in the context of web programming 50 00:02:17,696 --> 00:02:21,096 but as this board here suggests you can actually 51 00:02:21,096 --> 00:02:24,936 in this game ultimately move a cursor up, down, left, 52 00:02:24,936 --> 00:02:27,926 right as you can see with the actual running code here. 53 00:02:27,926 --> 00:02:31,206 So the game of Sudoku for those unfamiliar gives you a board 54 00:02:31,206 --> 00:02:32,236 like this, a grid. 55 00:02:32,236 --> 00:02:35,126 It's got nine numbers across, nine numbers down, 56 00:02:35,126 --> 00:02:38,546 and then it's divided into nine squares and you're given a bunch 57 00:02:38,546 --> 00:02:41,806 of numbers from the start such as those depicted here in yellow 58 00:02:42,036 --> 00:02:43,876 and the challenge of this game is to figure 59 00:02:43,876 --> 00:02:47,616 out what numbers you should put where all of these blanks are, 60 00:02:47,616 --> 00:02:50,186 where all of these dots are and the constraints are as follows. 61 00:02:50,186 --> 00:02:55,036 You can only put each number from 1 to 9 in a row once. 62 00:02:55,436 --> 00:02:57,866 So, you can have the numbers 1, 2, 3, 4, 5, 6, 7, 8, 63 00:02:57,866 --> 00:02:59,906 9 in any order in that top row, 64 00:03:00,126 --> 00:03:02,286 but you can't repeat any of those digits. 65 00:03:02,346 --> 00:03:04,296 Similarly you have to obey that constraint 66 00:03:04,506 --> 00:03:07,746 with every vertical row in the chart and then finally you have 67 00:03:07,776 --> 00:03:10,176 to make sure that you never reuse the same number 68 00:03:10,456 --> 00:03:12,766 within each individual square. 69 00:03:12,766 --> 00:03:14,526 So those are the three primary constraints 70 00:03:14,526 --> 00:03:16,476 and to be honest even if you've never played this game before 71 00:03:16,706 --> 00:03:20,116 it's actually kind of neat in that especially if you play one 72 00:03:20,116 --> 00:03:23,066 of the n00b levels as we call them; one of the easier levels 73 00:03:23,066 --> 00:03:25,846 where you can pretty much always make forward progress 74 00:03:25,846 --> 00:03:29,086 if you look for logically a hole that you can fill in. 75 00:03:29,346 --> 00:03:31,926 The strategy that you can implement is something 76 00:03:31,926 --> 00:03:32,466 like this. 77 00:03:32,466 --> 00:03:35,166 Let's see if we can't find an easy one to fill out ourselves. 78 00:03:35,166 --> 00:03:38,496 So, for instance down here in the bottom left-hand corner 79 00:03:38,496 --> 00:03:42,426 if I move my cursor, I definitely cannot put 9, 7 or 5. 80 00:03:42,426 --> 00:03:48,736 I could put 6, 7, I could not also put 6, 7 or 3, but I could, 81 00:03:48,736 --> 00:03:52,586 for instance put for now the number 1, but no, I can't. 82 00:03:52,776 --> 00:03:54,446 Now, this is not a feature you have to implement 83 00:03:54,446 --> 00:03:55,456 in the standard edition, 84 00:03:55,456 --> 00:03:58,246 but notice in the hacker edition I changed the colors 85 00:03:58,246 --> 00:03:59,616 of all those dots to red. 86 00:03:59,916 --> 00:04:01,906 That's because I've violated one of the constraints. 87 00:04:01,906 --> 00:04:03,276 Why can I not put 1 in the corner? 88 00:04:04,906 --> 00:04:07,276 So 1 is already in that same square 89 00:04:07,276 --> 00:04:09,946 and so this is the basic idea and what's neat in the end is 90 00:04:09,946 --> 00:04:12,516 that you'll have the ability to actually play this game 91 00:04:12,566 --> 00:04:15,036 and if you choose to elect the hacker edition of this pset, 92 00:04:15,286 --> 00:04:18,006 well, much like God mode in the game of 15 you can just kind 93 00:04:18,006 --> 00:04:21,156 of hang on to the number 8 for hint and the game will be solved 94 00:04:21,246 --> 00:04:23,916 for you, but that's for this hacker edition only 95 00:04:24,116 --> 00:04:27,386 and the standard edition you implement the game itself. 96 00:04:27,386 --> 00:04:29,326 Some more on that in the specification. 97 00:04:29,326 --> 00:04:32,066 For those then who caught on to that definition 98 00:04:32,066 --> 00:04:35,276 of Sudoku pretty quickly, here's a little bit for geek humor. 99 00:04:37,266 --> 00:04:40,156 Right, pretty lame game at this point, 100 00:04:40,156 --> 00:04:42,806 but there is one correct solution to this problem. 101 00:04:42,806 --> 00:04:44,626 All right if you're not sure what that means, 102 00:04:44,706 --> 00:04:46,326 go ahead and ask in sections. 103 00:04:46,326 --> 00:04:47,516 It's a pretty simple game that one. 104 00:04:47,856 --> 00:04:49,666 So quiz zero is coming up. 105 00:04:49,736 --> 00:04:51,966 So this is the first of two milestones in the course. 106 00:04:51,966 --> 00:04:54,816 This is really meant to be not an opportunity just regurgitate 107 00:04:54,816 --> 00:04:56,566 things we've said, this is not an opportunity 108 00:04:56,806 --> 00:04:59,056 to see how well you can code on a piece of paper, 109 00:04:59,346 --> 00:05:01,556 but rather how well you've mastered certain concepts 110 00:05:01,556 --> 00:05:04,106 in the course, how you can spot say problems in code 111 00:05:04,106 --> 00:05:07,856 or how you can draw or how you can pen a few lines for code 112 00:05:07,856 --> 00:05:12,006 in the order of tens of lines of code in the context of a quiz. 113 00:05:12,006 --> 00:05:15,086 You're not going to have, of course, a computer with you, 114 00:05:15,086 --> 00:05:16,716 you're not going to have a compiler with you 115 00:05:16,966 --> 00:05:18,796 and so the quiz will be designed in a way 116 00:05:18,796 --> 00:05:21,586 that understands the constraints of this problem we'll hand 117 00:05:21,586 --> 00:05:24,516 out a PDF, print out of a PDF this Wednesday 118 00:05:24,516 --> 00:05:27,546 that will tell you exactly what to do next Wednesday 119 00:05:27,606 --> 00:05:29,706 so the quiz is in lieu of lecture next Wednesday 120 00:05:29,956 --> 00:05:32,026 so that you guys actually have something to draw on. 121 00:05:32,066 --> 00:05:33,606 We will not be in Sanders. 122 00:05:33,786 --> 00:05:36,056 We've reserved a few rooms elsewhere on campus 123 00:05:36,056 --> 00:05:38,386 with actual writing surfaces, and you'll be dispatched 124 00:05:38,386 --> 00:05:40,656 to different rooms according to this printout based 125 00:05:40,656 --> 00:05:42,086 on the first letter of your last name. 126 00:05:42,086 --> 00:05:43,416 So more on that on Wednesday, 127 00:05:43,416 --> 00:05:45,616 but know that there are innumerable resources coming 128 00:05:45,616 --> 00:05:46,676 up available to you. 129 00:05:46,676 --> 00:05:49,876 So, one, this Sunday during the regularly scheduled walk 130 00:05:49,876 --> 00:05:53,086 through time will be a course-wide review session. 131 00:05:53,146 --> 00:05:56,516 The details of that will be on this handout on Wednesday, 132 00:05:56,516 --> 00:05:58,286 but that will be 7PM this Sunday. 133 00:05:58,516 --> 00:06:00,416 All sections this coming Sunday, Monday, 134 00:06:00,416 --> 00:06:02,476 Tuesday will also involve quiz review. 135 00:06:02,676 --> 00:06:03,976 Because of the quiz there is no pset due next week. 136 00:06:04,126 --> 00:06:04,193 >> Student: Oh. 137 00:06:04,193 --> 00:06:04,260 [Laughter] 138 00:06:04,260 --> 00:06:06,206 >> David: Very sorry. 139 00:06:06,526 --> 00:06:13,356 There is no pset due next week so if there's a silver lining 140 00:06:13,356 --> 00:06:16,216 in having any sort of examination it is 141 00:06:16,216 --> 00:06:19,996 that there's no pset that corresponds to it. 142 00:06:21,176 --> 00:06:23,296 [Laughter] So, let's regroup here. 143 00:06:23,296 --> 00:06:27,336 So, more on that on Wednesday, but also actually in terms 144 00:06:27,336 --> 00:06:30,476 of resources there are three years' worth of past quizzes 145 00:06:30,626 --> 00:06:32,566 and sample solutions on the course's website. 146 00:06:32,566 --> 00:06:35,606 Just click quizzes so those are probably the best guide toward 147 00:06:35,766 --> 00:06:37,646 getting a sense of what format to expect 148 00:06:37,646 --> 00:06:40,176 and if you haven't noticed them already let me draw some 149 00:06:40,176 --> 00:06:43,036 explicit attention to what's on the lecture's page 150 00:06:43,036 --> 00:06:44,936 and has been here for some time. 151 00:06:45,016 --> 00:06:48,016 Everything we use and do in lecture ends up on this page. 152 00:06:48,206 --> 00:06:49,526 Now, if you expand everything at once, 153 00:06:49,526 --> 00:06:51,646 it looks a little overwhelming, but the juiciest stuff, 154 00:06:51,646 --> 00:06:54,836 for instance, are things like the scribe notes that one 155 00:06:54,836 --> 00:06:56,896 of our teaching fellows, Andrew Sellergren, produces. 156 00:06:57,146 --> 00:06:58,106 These are meant to be sort 157 00:06:58,106 --> 00:07:00,966 of an authoritative teaching fellow's guide to what happened 158 00:07:00,966 --> 00:07:04,256 in lectures and they're actually quite fun to read. 159 00:07:04,256 --> 00:07:06,926 Andrew like me is a fan of snarky little foot notes. 160 00:07:06,956 --> 00:07:08,716 In fact, he commented to me recently 161 00:07:10,036 --> 00:07:12,186 when writing the scribe notes why does everyone clap 162 00:07:12,426 --> 00:07:14,076 at the end of a computer science lecture? 163 00:07:14,386 --> 00:07:16,666 It's just a computer science lecture and I actually agree. 164 00:07:16,666 --> 00:07:19,326 You're all very kind sometimes to have this awkward applause 165 00:07:19,326 --> 00:07:21,436 at the end of lecture, but it's just a computer science lecture 166 00:07:21,436 --> 00:07:23,616 so feel free to get right up out of your seats 167 00:07:23,616 --> 00:07:24,406 as soon as we're done. 168 00:07:24,406 --> 00:07:26,046 No need for that. 169 00:07:26,086 --> 00:07:28,206 That will make Andrew happy, hello, Andrew. 170 00:07:28,736 --> 00:07:31,116 Andrew scribes these lectures based on the video 171 00:07:31,116 --> 00:07:33,896 so what we say ends up written down. 172 00:07:34,236 --> 00:07:34,556 All right. 173 00:07:34,886 --> 00:07:39,886 So, for those of you who are thinking of switching either 174 00:07:39,886 --> 00:07:42,696 to graded status or to pass/fail status, today is, in fact, 175 00:07:42,696 --> 00:07:43,966 that fifth Monday of the term. 176 00:07:43,966 --> 00:07:47,926 We will have pink sheets on the sides of the stage here. 177 00:07:48,076 --> 00:07:50,236 Feel free to just chat with me during break or afterward 178 00:07:50,236 --> 00:07:51,966 if you're a bit on the fence as to what to do. 179 00:07:51,966 --> 00:07:56,126 For those of you who are feeling like maybe you want neither 180 00:07:56,126 --> 00:07:58,436 of those options I thought I'd take a quick tour with you 181 00:07:58,806 --> 00:08:01,546 through bugs from years past 182 00:08:01,986 --> 00:08:04,136 that you can find actually in popular culture. 183 00:08:04,696 --> 00:08:06,936 These are all screen shots that random people some 184 00:08:06,936 --> 00:08:08,146 of them staff, some of them people 185 00:08:08,146 --> 00:08:10,056 on the Internet have taken and if you're feeling 186 00:08:10,056 --> 00:08:11,806 like your code is perpetually buggy 187 00:08:11,806 --> 00:08:13,976 and there's always something wrong, well, realize it is, 188 00:08:13,976 --> 00:08:15,326 in fact, not just you. 189 00:08:15,326 --> 00:08:17,216 So, this of course, was the first such bug 190 00:08:17,216 --> 00:08:18,586 with which we'll set the stage here 191 00:08:18,766 --> 00:08:21,576 that was actually found inside of one of the first computers. 192 00:08:21,836 --> 00:08:24,706 This is perhaps a screen that some of you are familiar with. 193 00:08:24,706 --> 00:08:28,366 It's Windows own blue screen of death. 194 00:08:28,366 --> 00:08:31,236 So, if you'll look here, there's not all much information that's 195 00:08:31,236 --> 00:08:34,796 useful to a normal person other than control alt delete 196 00:08:34,946 --> 00:08:37,706 at bottom left here, but there are some details 197 00:08:37,706 --> 00:08:39,326 that might start to feel 198 00:08:39,326 --> 00:08:40,996 at least a little comprehensible even 199 00:08:40,996 --> 00:08:43,296 if you're not quite sure what to make of them. 200 00:08:43,536 --> 00:08:44,976 On the top left, for instance, 201 00:08:44,976 --> 00:08:49,636 there's this cryptic looking sequence of digits, 0010E36. 202 00:08:49,926 --> 00:08:53,396 Well, this is just a hexadecimal number and this is just a way 203 00:08:53,396 --> 00:08:55,946 of encoding a number slightly more efficiently than decimal 204 00:08:56,246 --> 00:08:58,226 and what this likely refer to? 205 00:09:00,346 --> 00:09:02,406 So it might very well be an error code, 206 00:09:02,406 --> 00:09:04,906 it might very well be an address, too, as we've started 207 00:09:04,906 --> 00:09:08,526 to play with gdb recall that we often see these similarly 208 00:09:08,526 --> 00:09:10,056 looking codes. 209 00:09:10,056 --> 00:09:12,246 a little bit of numbers, a little bit of characters, 210 00:09:12,246 --> 00:09:13,656 these hexadecimal notation 211 00:09:13,876 --> 00:09:15,686 and it often refers to memory addresses. 212 00:09:15,686 --> 00:09:18,346 So, it seems that here a fatal exception, 213 00:09:18,376 --> 00:09:21,436 which just means error, zero E so this just happens 214 00:09:21,436 --> 00:09:23,196 to represent the number 14, 215 00:09:23,196 --> 00:09:24,856 but in this system called hexadecimal, 216 00:09:24,856 --> 00:09:26,376 but more on that to come. 217 00:09:26,376 --> 00:09:28,126 At this location, whatever that means, 218 00:09:28,166 --> 00:09:30,086 but it's mostly numeric or hexadecimal there. 219 00:09:30,336 --> 00:09:33,256 VXD happens to stand for Virtual Device Driver, 220 00:09:33,256 --> 00:09:36,256 which is just Window's way of saying some piece of hardware 221 00:09:36,256 --> 00:09:38,896 that came with like a CD or some driver that you installed 222 00:09:38,896 --> 00:09:41,336 into this computer itself probably had the bug 223 00:09:41,596 --> 00:09:44,766 and that triggered the entirety of Windows to crash in this way. 224 00:09:45,226 --> 00:09:48,756 So, a little more light heartedly, do you see things 225 00:09:48,756 --> 00:09:51,736 that are just a little playful? 226 00:09:51,736 --> 00:09:54,576 And here you see that this is clearly an address of some sort. 227 00:09:54,576 --> 00:09:56,166 We've seen things like this in gdb, 228 00:09:56,386 --> 00:09:58,166 but I've always thought it's kind of amusing at the end 229 00:09:58,166 --> 00:09:59,406 that they kind of simplify it 230 00:09:59,406 --> 00:10:01,596 with the memory could not be read with a little bit 231 00:10:01,596 --> 00:10:02,466 of finger quotes there 232 00:10:02,466 --> 00:10:05,926 with these same pop-ups though do you see errors within errors? 233 00:10:06,606 --> 00:10:07,956 [Laughter] So this is literally a screen shot. 234 00:10:08,046 --> 00:10:10,986 A slightly more amusing is literally this one. 235 00:10:11,846 --> 00:10:16,886 [Laughter] And no joke this is a bug from the mid-1990's 236 00:10:16,886 --> 00:10:18,106 that appeared in Windows. 237 00:10:18,926 --> 00:10:24,086 [Laughter] So that is what happens 238 00:10:24,086 --> 00:10:25,786 when you don't consider corner cases. 239 00:10:25,786 --> 00:10:27,336 Things can certainly get out of control. 240 00:10:27,336 --> 00:10:29,766 a few years ago pop-ups were all the rage on the Web. 241 00:10:30,046 --> 00:10:31,356 You get scenarios like this. 242 00:10:31,446 --> 00:10:32,376 a bit out of control. 243 00:10:32,626 --> 00:10:35,736 It turns out that operating systems like Windows, Mac OS, 244 00:10:35,736 --> 00:10:38,576 Linux and others are actually used in the real world including 245 00:10:38,576 --> 00:10:41,506 by brokers like Fidelity where if you run Windows 246 00:10:41,506 --> 00:10:43,696 on your sign you get blue signs of death. 247 00:10:44,456 --> 00:10:46,586 a little ticker outside some city corner there. 248 00:10:46,586 --> 00:10:49,696 This was from let's say, oh, this was one 249 00:10:49,696 --> 00:10:52,136 of my favorites back in the Vista era. 250 00:10:52,596 --> 00:10:54,696 This was from like a Best Buy or something like that 251 00:10:54,696 --> 00:10:55,896 where they were selling the product. 252 00:10:56,296 --> 00:10:59,406 [Laughter] So that's kind of amusing. 253 00:10:59,456 --> 00:11:02,076 Behind glass no less so it can't easily be fixed. 254 00:11:02,526 --> 00:11:04,766 Now with have Mac OS in fairness. 255 00:11:04,766 --> 00:11:06,596 So Apple kind of simplifies things for people. 256 00:11:06,596 --> 00:11:07,076 They don't tell you 257 00:11:07,076 --> 00:11:09,366 that something has gone wrong they just advise you you need 258 00:11:09,366 --> 00:11:12,476 to restart your computer now, but this is not, in fact, 259 00:11:12,476 --> 00:11:13,666 a good thing when this happens. 260 00:11:13,736 --> 00:11:15,626 Cent OS, which is a variant of Linux, 261 00:11:15,626 --> 00:11:17,136 one that we're actually using in the cluster, 262 00:11:17,596 --> 00:11:20,176 this is probably not what you need to do at this point 263 00:11:20,176 --> 00:11:21,446 in an installation process, 264 00:11:21,546 --> 00:11:23,986 inserting the disk number negative 99. 265 00:11:24,666 --> 00:11:26,236 This one is one of my favorites if only 266 00:11:26,236 --> 00:11:29,736 because I was ordering pizza late one night on Dominos.com, 267 00:11:29,736 --> 00:11:31,956 and I was like ah-ha, this is as perfect lecture example. 268 00:11:32,196 --> 00:11:33,116 What's the bug here? 269 00:11:33,616 --> 00:11:36,246 >> Student: [inaudible]. 270 00:11:36,746 --> 00:11:38,016 >> David: So 6.3, right? 271 00:11:38,016 --> 00:11:40,086 So this is printf error of some sort. 272 00:11:40,086 --> 00:11:41,976 Now they probably didn't implement their website 273 00:11:41,976 --> 00:11:43,786 in C. They used some other language, 274 00:11:43,816 --> 00:11:46,236 but this function printf and several variants of it 275 00:11:46,236 --> 00:11:48,846 that we'll see over time are very common in other languages 276 00:11:48,846 --> 00:11:50,866 as well, and it looks like the programmer 277 00:11:50,866 --> 00:11:52,216 that wrote this little advert 278 00:11:52,216 --> 00:11:54,156 for these chicken kickers did not realize 279 00:11:54,156 --> 00:11:57,556 that US currency is typically formatted to point to F 280 00:11:57,666 --> 00:12:01,166 as opposed to just percent F as a format string. 281 00:12:01,466 --> 00:12:03,566 Let's see, one other here that was given to us 282 00:12:03,736 --> 00:12:07,596 by a former student so this is bankofamerica.com, you log in 283 00:12:07,596 --> 00:12:10,356 but this particular day this user logged in 284 00:12:10,616 --> 00:12:11,986 and there is, in fact, a bug. 285 00:12:11,986 --> 00:12:17,036 Allow me to zoom in at top left. 286 00:12:17,936 --> 00:12:21,196 [Laughter] That really should not be, any hypotheses 287 00:12:21,196 --> 00:12:22,856 as to why this actually happened? 288 00:12:23,556 --> 00:12:26,376 It's probably something like a time zone difference 289 00:12:26,376 --> 00:12:28,636 or the time zone setting was wrong, something like this, 290 00:12:28,686 --> 00:12:31,196 but not really inspiring of confidence in a place 291 00:12:31,196 --> 00:12:32,286 where you're storing your money. 292 00:12:32,286 --> 00:12:33,726 So, real screen shot there. 293 00:12:33,936 --> 00:12:37,956 This is not actually a bug, but it's perhaps been written 294 00:12:37,956 --> 00:12:40,836 into infamy by the movie Office Space. 295 00:12:40,836 --> 00:12:43,786 It's something that you can still see in printers on campus. 296 00:12:43,786 --> 00:12:46,476 This is really just an example of an idiotic design decision 297 00:12:46,476 --> 00:12:48,966 when 99% of the time all you need to be told is 298 00:12:48,966 --> 00:12:52,636 to refill the thing, but instead HP decided to introduce this 299 00:12:52,636 --> 00:12:54,006 into the American lexicon. 300 00:12:54,466 --> 00:12:55,926 PC load letter anyone? 301 00:12:57,586 --> 00:13:00,986 Paper, cassette, load letter-sized paper, right? 302 00:13:00,986 --> 00:13:02,476 Instead of just add paper. 303 00:13:02,676 --> 00:13:05,756 So, if you've never seen this movie that I keep referring to, 304 00:13:05,756 --> 00:13:09,206 Office Space, do rent it and you will see three guys go to town 305 00:13:09,206 --> 00:13:11,796 on a printer like this in a very famous scene. 306 00:13:11,856 --> 00:13:15,456 So, without further ado, those are bugs that other people make. 307 00:13:15,456 --> 00:13:18,166 It's quite normal to be making them initially yourself 308 00:13:18,476 --> 00:13:21,176 and now is our transition to exactly the kind 309 00:13:21,176 --> 00:13:23,666 of code we've been writing so many bugs in. 310 00:13:23,666 --> 00:13:27,176 So, we had this picture, this mental model, 311 00:13:27,176 --> 00:13:28,476 for the past couple of days 312 00:13:28,476 --> 00:13:30,866 and this really just represents a computer's memory 313 00:13:30,866 --> 00:13:31,836 and we've started kind 314 00:13:31,836 --> 00:13:36,726 of arbitrarily allocating different areas of our RAM 315 00:13:36,996 --> 00:13:38,086 to different purposes. 316 00:13:38,086 --> 00:13:40,086 So at the very bottom there we have these things called 317 00:13:40,086 --> 00:13:42,116 environment variables, which we don't play with much, 318 00:13:42,116 --> 00:13:44,316 but they have to do with system-wide settings generally. 319 00:13:44,626 --> 00:13:46,826 The stack though has been of particular interest. 320 00:13:46,826 --> 00:13:49,466 It's on the stack that any local variables go. 321 00:13:49,466 --> 00:13:51,466 Any time you call a function it gets a chunk 322 00:13:51,466 --> 00:13:53,536 of memory called a frame on the stack 323 00:13:53,816 --> 00:13:55,466 and you keep layering them again and again 324 00:13:55,466 --> 00:13:58,026 and again every time one function calls another, 325 00:13:58,296 --> 00:14:01,006 but then as soon as that function recall stops executing, 326 00:14:01,186 --> 00:14:03,716 that memory goes away or at least it's no longer valid 327 00:14:03,716 --> 00:14:04,916 so it's not safe to trust, 328 00:14:04,916 --> 00:14:08,546 but we have on occasion seen variables containing 329 00:14:08,546 --> 00:14:09,506 garbage values. 330 00:14:09,506 --> 00:14:12,886 If you just declare int X semicolon, recall that on one 331 00:14:12,886 --> 00:14:15,096 or two occasions we've actually used printf and printed 332 00:14:15,096 --> 00:14:18,016 out its value and once I think it was zero just by luck, 333 00:14:18,246 --> 00:14:21,156 but then some other time it was some completely random value. 334 00:14:21,156 --> 00:14:22,246 Well, where did that come from? 335 00:14:22,506 --> 00:14:25,506 Well, if you're using this stack and reusing this stack 336 00:14:25,766 --> 00:14:27,636 and you're just changing what bits are 337 00:14:27,636 --> 00:14:29,786 where by assigning variables values 338 00:14:29,786 --> 00:14:31,966 and then you're just forgetting where those values were, 339 00:14:32,256 --> 00:14:34,546 certainly if you reuse that memory later 340 00:14:34,546 --> 00:14:38,046 on in your program might you have some bogus value inside a 341 00:14:38,046 --> 00:14:40,006 new variable because it just happens 342 00:14:40,006 --> 00:14:43,036 to have been put inside RAM where something once was. 343 00:14:43,036 --> 00:14:45,446 And since this is why it's terribly important 344 00:14:45,566 --> 00:14:47,066 to always initialize variables 345 00:14:47,066 --> 00:14:51,096 to some knowing state least you accidently use some value that's 346 00:14:51,096 --> 00:14:53,956 not yours or actually follow a pointer, an address, 347 00:14:54,376 --> 00:14:55,826 that's not, in fact, valid. 348 00:14:55,826 --> 00:14:57,666 This is actually a security concern, too. 349 00:14:57,666 --> 00:14:59,646 You can imagine a program whose purpose in life is 350 00:14:59,776 --> 00:15:02,116 to ask someone for their user name and their password 351 00:15:02,326 --> 00:15:04,516 and you check that their user name and password are correct 352 00:15:04,516 --> 00:15:07,286 by calling a function and so while you're in that function, 353 00:15:07,286 --> 00:15:09,686 you need to store what the user typed in for user name 354 00:15:09,756 --> 00:15:12,936 and for password in the frame on the stack. 355 00:15:13,266 --> 00:15:15,786 Well, as soon as you've checked that their user name 356 00:15:15,786 --> 00:15:18,226 and password is correct, this function might return, 357 00:15:18,526 --> 00:15:20,146 the program might go on executing, 358 00:15:20,346 --> 00:15:24,296 but where is the potential security danger here? 359 00:15:24,526 --> 00:15:24,636 Yeah? 360 00:15:25,016 --> 00:15:26,026 >> Student: [inaudible] 361 00:15:26,026 --> 00:15:26,396 >> David: Yeah. 362 00:15:26,396 --> 00:15:29,146 So, your user name and more importantly your password might 363 00:15:29,146 --> 00:15:31,936 actually be lingering somewhere there in memory. 364 00:15:31,936 --> 00:15:35,256 Now, so long as your program is trustworthy, 365 00:15:35,306 --> 00:15:38,006 you yourself have written it, that's maybe not such a big deal 366 00:15:38,006 --> 00:15:40,706 because only you, the programmer of that program, 367 00:15:40,706 --> 00:15:42,616 can later access that same memory, 368 00:15:42,616 --> 00:15:44,076 but we'll talk briefly this week 369 00:15:44,076 --> 00:15:47,426 about something called a buffer overflow exploit that is one 370 00:15:47,426 --> 00:15:50,386 of the most common ways, some random person on the Internet 371 00:15:50,596 --> 00:15:53,776 or some hacker can compromise a program by inserting data 372 00:15:53,776 --> 00:15:55,656 into your program and tricking it 373 00:15:55,656 --> 00:15:58,066 into running some other code that they wrote. 374 00:15:58,066 --> 00:16:00,956 So, if this is the case whereby you can leave secret stuff 375 00:16:00,956 --> 00:16:03,626 like passwords somewhere in your memory but there is a chance 376 00:16:03,656 --> 00:16:05,456 that someone else can access that memory, 377 00:16:05,786 --> 00:16:08,416 therein lies the potential security threat. 378 00:16:08,576 --> 00:16:11,156 So, more on that in more technical detail to come. 379 00:16:11,396 --> 00:16:12,356 This thing called the heap. 380 00:16:12,436 --> 00:16:14,326 What was the difference about the heap that we touched 381 00:16:14,326 --> 00:16:15,646 on at the end of last week? 382 00:16:16,176 --> 00:16:16,966 What's it used for? 383 00:16:16,966 --> 00:16:17,096 Yeah? 384 00:16:17,146 --> 00:16:25,896 >> Student: The memory that you allocate... 385 00:16:25,896 --> 00:16:25,963 [inaudible]. 386 00:16:25,963 --> 00:16:26,376 >> David: Okay, good. 387 00:16:26,376 --> 00:16:28,846 Let me tweak the jargon, but the memory you allocate on the 388 00:16:28,846 --> 00:16:32,146 so called heap does not get reused 389 00:16:32,146 --> 00:16:34,406 until you yourself free it up. 390 00:16:34,406 --> 00:16:38,106 So, if you ask the operating system for memory 391 00:16:38,346 --> 00:16:40,076 and you grab it from this thing called the heap, 392 00:16:40,336 --> 00:16:42,806 that memory is yours until you call a function 393 00:16:42,806 --> 00:16:45,926 that we'll see today called free that explicitly hands 394 00:16:45,926 --> 00:16:47,586 that memory back to the operating system. 395 00:16:47,586 --> 00:16:50,376 The up side of that is that any function you write can ask 396 00:16:50,376 --> 00:16:53,796 for as much memory as it wants and it's not going to disappear, 397 00:16:53,796 --> 00:16:55,886 it's not going to get reused because it's not coming 398 00:16:55,886 --> 00:16:57,546 from that place called the stack. 399 00:16:57,816 --> 00:16:59,986 Now, we introduced this toward the tail end of last week 400 00:17:00,256 --> 00:17:01,896 because we realized this is really stupid. 401 00:17:01,896 --> 00:17:04,206 If I have to write programs that have to know 402 00:17:04,276 --> 00:17:06,986 in advance how much memory the user is going to need, 403 00:17:07,186 --> 00:17:08,946 case in point was the quizzes example. 404 00:17:09,156 --> 00:17:11,376 What if the number of quizzes in some semester changes? 405 00:17:11,376 --> 00:17:14,026 I don't want to go back and change my source code, 406 00:17:14,026 --> 00:17:15,966 recompile my program just to change 407 00:17:15,966 --> 00:17:18,506 that hard-coded constant from 2 to 3. 408 00:17:18,736 --> 00:17:20,446 I'd much rather ask the user 409 00:17:20,446 --> 00:17:22,646 when they run the program how many quizzes were there 410 00:17:22,646 --> 00:17:23,306 this semester? 411 00:17:23,306 --> 00:17:25,906 And then dynamically get data from the user. 412 00:17:26,056 --> 00:17:28,546 Well, we introduced a function last week with which we can ask 413 00:17:28,956 --> 00:17:31,526 for chunks of memory and this thing was called what? 414 00:17:32,616 --> 00:17:34,696 So this was malloc, memory alloc, 415 00:17:34,696 --> 00:17:36,396 and this was just a function that took a number 416 00:17:36,396 --> 00:17:38,986 as an argument how many bytes of memory do you want 417 00:17:38,986 --> 00:17:42,096 and you get it and then we'll see today that you better return 418 00:17:42,096 --> 00:17:44,896 that memory otherwise you induce what's generally called a 419 00:17:44,896 --> 00:17:45,756 memory leak. 420 00:17:45,916 --> 00:17:48,696 So first let's actually play with some of these ideas 421 00:17:48,696 --> 00:17:49,576 in a real environment. 422 00:17:49,576 --> 00:17:52,586 So I'm going to go ahead into a file 423 00:17:52,586 --> 00:17:55,296 from last week called bar.c. So if you have your printouts 424 00:17:55,296 --> 00:17:58,316 from last week this is in bar.c. If not, not to worry. 425 00:17:58,316 --> 00:18:00,166 We're going to step through it deliberately on the screen. 426 00:18:00,166 --> 00:18:01,836 This is a pretty stupid program. 427 00:18:01,836 --> 00:18:05,136 It's been contrived only for the purposes of stepping through it 428 00:18:05,136 --> 00:18:07,846 with gdb, but so that we can see some interesting detail. 429 00:18:07,846 --> 00:18:10,556 So, at the very top of this function we have a function 430 00:18:10,556 --> 00:18:11,136 called main. 431 00:18:11,376 --> 00:18:12,846 It takes no command line arguments 432 00:18:12,846 --> 00:18:14,266 so I've explicitly said void. 433 00:18:14,526 --> 00:18:16,386 It seems to allocate an integer called a 434 00:18:16,386 --> 00:18:19,546 and then a char * called S 435 00:18:19,546 --> 00:18:22,706 and as an aside even though I've gotten into the habit 436 00:18:22,706 --> 00:18:24,776 of always putting the star right next to S, 437 00:18:25,096 --> 00:18:28,836 you may see in textbooks or the TF sections spaces sometimes. 438 00:18:28,836 --> 00:18:31,946 That's okay, but this is perhaps a little more clear. 439 00:18:32,176 --> 00:18:33,456 S is the pointer now. 440 00:18:33,786 --> 00:18:34,076 All right. 441 00:18:34,076 --> 00:18:37,146 So we have a char * and this is synonymous with what? 442 00:18:38,306 --> 00:18:39,136 So string, right? 443 00:18:39,136 --> 00:18:40,426 We started to take off 444 00:18:40,466 --> 00:18:42,376 that training wheel verbally last week. 445 00:18:42,376 --> 00:18:46,136 What we used to call string is now char * and this makes sense 446 00:18:46,186 --> 00:18:48,276 because what's really being stored in memory 447 00:18:48,276 --> 00:18:51,716 for a string is the address of a sequence of characters. 448 00:18:51,746 --> 00:18:54,246 The address of an array of characters. 449 00:18:54,246 --> 00:18:55,546 In this case, hello world. 450 00:18:55,826 --> 00:18:57,036 Now, this is just a line of printf, 451 00:18:57,036 --> 00:18:59,186 but notice I'm doing something a little funky here. 452 00:18:59,456 --> 00:19:01,946 I'm first treating S as though it's an array 453 00:19:01,946 --> 00:19:04,386 and that really is true conceptually. 454 00:19:04,386 --> 00:19:08,956 S bracket S gives me the seventh location, 0 index, 455 00:19:08,986 --> 00:19:10,386 of S. So what should that be? 456 00:19:10,606 --> 00:19:15,256 Well, this is 0 the H, 0, 1, 2, 3, 4, 5, 6, 7. 457 00:19:15,326 --> 00:19:20,206 So I count from 0 and I get W at location 7. 458 00:19:20,746 --> 00:19:25,676 So, S bracket 7 would give me a char namely W. What does 459 00:19:25,676 --> 00:19:27,926 ampersand S bracket 7 give me? 460 00:19:29,186 --> 00:19:31,606 It gives me the address of. 461 00:19:31,606 --> 00:19:32,996 So ampersand is the address 462 00:19:32,996 --> 00:19:34,796 of operator even though we haven't seen it 463 00:19:34,796 --> 00:19:36,456 in this context being prefixed 464 00:19:36,456 --> 00:19:38,226 to an array it's just the same question, 465 00:19:38,226 --> 00:19:41,346 at what address can I find the character W? 466 00:19:41,646 --> 00:19:44,436 And so it turns out that this is really going 467 00:19:44,436 --> 00:19:48,126 to be whatever the address of H is plus 7. 468 00:19:48,366 --> 00:19:50,566 So one of the tricks we'll start to see is that what's neat 469 00:19:50,566 --> 00:19:52,836 about pointers is that you can do arithmetic on them. 470 00:19:52,836 --> 00:19:55,796 You can start at one pointer, add some number like 7 471 00:19:55,796 --> 00:19:58,266 and you can make your way to another part 472 00:19:58,266 --> 00:20:00,226 of the string its location 473 00:20:00,226 --> 00:20:03,036 because the computer does not care where strings begin. 474 00:20:03,226 --> 00:20:05,416 The only thing that's important here when I print 475 00:20:05,416 --> 00:20:08,296 out this here is that the string that starts 476 00:20:08,296 --> 00:20:11,466 at location 7 also still ends somewhere 477 00:20:11,466 --> 00:20:12,666 and what was the special character 478 00:20:12,666 --> 00:20:13,966 that demarks the end of a string? 479 00:20:14,016 --> 00:20:15,056 >> Student: [inaudible] 480 00:20:15,056 --> 00:20:19,046 >> David: Yeah, so '0' was a character, a special character. 481 00:20:19,046 --> 00:20:20,406 It's really just the number 0, 482 00:20:20,406 --> 00:20:22,676 but it's a character representing the number 0 483 00:20:22,966 --> 00:20:24,366 that demarks the end of the string 484 00:20:24,366 --> 00:20:26,976 and conceptually it's right after the D here. 485 00:20:26,976 --> 00:20:28,696 We don't see it, but it is, in fact, 486 00:20:28,696 --> 00:20:30,376 going to be used there underneath the hood. 487 00:20:30,376 --> 00:20:32,206 So, in the end, what should this print? 488 00:20:32,396 --> 00:20:35,506 This line of code if I'm printing out the string starting 489 00:20:35,506 --> 00:20:38,426 at the address of the 7th character? 490 00:20:38,426 --> 00:20:40,276 What should print? 491 00:20:40,426 --> 00:20:41,046 Just world. 492 00:20:41,286 --> 00:20:42,226 It's the same idea. 493 00:20:42,456 --> 00:20:45,476 I'm passing in an address of a string 494 00:20:45,476 --> 00:20:49,066 to printf ampersand S says here comes a string so I am passing 495 00:20:49,066 --> 00:20:51,106 in a string but only part of the string 496 00:20:51,106 --> 00:20:52,636 because now I'm leveraging this fact 497 00:20:52,836 --> 00:20:56,376 that pointers can let me get to any location in memory 498 00:20:56,596 --> 00:20:59,026 so if I want to go to the 7th location in this string 499 00:20:59,126 --> 00:21:00,306 and then get the address of it, 500 00:21:00,466 --> 00:21:03,376 I can absolutely now pretend that's a string because notice 501 00:21:03,416 --> 00:21:07,066 that hello world and also just world will have what 502 00:21:07,066 --> 00:21:09,626 at the end of both of them? 503 00:21:09,626 --> 00:21:11,216 They both have that '0'. 504 00:21:11,216 --> 00:21:12,846 So I'm not changing where the string ends; 505 00:21:12,846 --> 00:21:15,236 I'm essentially just starting in the middle 506 00:21:15,236 --> 00:21:16,906 of the string at location 7. 507 00:21:16,906 --> 00:21:18,926 So we'll see this in just a moment as to what this means. 508 00:21:19,176 --> 00:21:22,256 Now in the next line of code here I just arbitrarily say a 509 00:21:22,256 --> 00:21:25,096 gets 5 so that I actually assign it a value. 510 00:21:25,316 --> 00:21:26,436 Then I call foo. 511 00:21:26,606 --> 00:21:31,286 If I step into that, I now have a parameter called N. It's going 512 00:21:31,286 --> 00:21:34,876 to initially have the value of 5 because I passed in A=5. 513 00:21:35,266 --> 00:21:38,866 I declare a local variable here called B. B arbitrarily gets 514 00:21:38,866 --> 00:21:41,946 assigned to N. I do a little bit of mathematics and now just 515 00:21:41,946 --> 00:21:45,396 because I pass B into this last function called bar. 516 00:21:45,396 --> 00:21:48,326 So this last function called bar simply says, hi, 517 00:21:48,426 --> 00:21:51,276 I'm bar as though to confirm that it was, in fact, called, 518 00:21:51,556 --> 00:21:54,116 but notice I'm not even using that int N and why? 519 00:21:54,356 --> 00:21:55,556 This is completely arbitrary. 520 00:21:55,556 --> 00:21:57,326 I just wanted some interesting constructs 521 00:21:57,326 --> 00:21:58,896 with which we could now play with the code 522 00:21:59,096 --> 00:22:01,376 and actually confirm that what we've been preaching verbally 523 00:22:01,496 --> 00:22:03,806 is, in fact, true in reality. 524 00:22:04,266 --> 00:22:05,556 So let's go ahead and make bar. 525 00:22:05,556 --> 00:22:08,526 I'm going to run gdb on this program called bar, 526 00:22:08,526 --> 00:22:11,526 and I'm going to go ahead and type break main 527 00:22:11,716 --> 00:22:13,296 to get a break point at my function 528 00:22:13,536 --> 00:22:18,666 and then notice this OX80483FD represents what perhaps? 529 00:22:19,346 --> 00:22:24,336 So that's the address where main is in memory. 530 00:22:24,336 --> 00:22:25,046 What does that mean? 531 00:22:25,046 --> 00:22:27,086 Well, recall from last week that that chunk of memory 532 00:22:27,086 --> 00:22:30,726 at the very top of RAM is the so-called .text segment. 533 00:22:30,756 --> 00:22:33,256 Those are the 0's and 1's that represent all 534 00:22:33,256 --> 00:22:35,246 of the functions I wrote and then compiled. 535 00:22:35,576 --> 00:22:42,336 So when gdb says, oh, well, main is at OX080483FD that jus means 536 00:22:42,336 --> 00:22:43,816 that is the numeric address 537 00:22:44,046 --> 00:22:46,876 of the function called main in that text segment. 538 00:22:46,876 --> 00:22:48,346 Generally not useful, but it turns 539 00:22:48,346 --> 00:22:51,286 out in C you can actually pass functions around just 540 00:22:51,286 --> 00:22:54,196 as you can variables and you can pass functions around by way 541 00:22:54,196 --> 00:22:55,756 of their addresses, but that's not something we'll 542 00:22:55,756 --> 00:22:56,446 generally do. 543 00:22:56,746 --> 00:22:58,156 Now just a quick crash course 544 00:22:58,506 --> 00:23:02,426 in something that'll be a little more fun in Visual once we get 545 00:23:02,426 --> 00:23:05,256 to Web programming since everyone likes to change colors 546 00:23:05,256 --> 00:23:08,216 of things it turns out that the same notation hexadecimal 547 00:23:08,216 --> 00:23:10,506 notation is used for color codes and Photoshop 548 00:23:10,506 --> 00:23:12,166 and Web development and so forth, 549 00:23:12,166 --> 00:23:13,966 but for now we just need a simple definition. 550 00:23:14,316 --> 00:23:20,196 So, in the world of binary, we represented the number 0 with 0, 551 00:23:20,406 --> 00:23:22,796 the number 1 with 1, the number 2 with? 552 00:23:23,026 --> 00:23:26,856 10 and then the number 3 with 11 553 00:23:26,856 --> 00:23:29,356 and then 100 an notice the pattern. 554 00:23:29,416 --> 00:23:31,036 So I didn't quite line them up optimally, 555 00:23:31,296 --> 00:23:33,866 but I can right now draw this, can I fix this? 556 00:23:34,076 --> 00:23:34,556 Not really. 557 00:23:34,996 --> 00:23:37,266 Well, let's just leave it like that. 558 00:23:37,586 --> 00:23:40,966 So, 0 goes to 1, 1 plus 1 is technically 2, 559 00:23:40,966 --> 00:23:43,776 but I don't have the digit 2 in the binary system so I have 560 00:23:43,776 --> 00:23:46,676 to carry that 1 and so I get 10. 561 00:23:46,676 --> 00:23:49,466 I then add 1, this is easy, get 11 or 1, 1. 562 00:23:49,756 --> 00:23:50,766 Then I add 1 again. 563 00:23:50,766 --> 00:23:53,886 Well, 1 plus 1 is 2, but 2 doesn't exist so I have 564 00:23:53,886 --> 00:23:55,796 to carry the 1 and so forth. 565 00:23:55,796 --> 00:23:59,006 That's how you end up getting the same number here 1 0 0. 566 00:23:59,236 --> 00:24:01,386 Now, in decimal, these things are pretty straightforward. 567 00:24:01,606 --> 00:24:04,936 So this is just 0 1 2 3 4. 568 00:24:05,186 --> 00:24:07,806 Well, what about in hexadecimal? 569 00:24:07,976 --> 00:24:09,696 Actually it starts off the same. 570 00:24:09,966 --> 00:24:13,526 Zero is 0, 1 is 1, 2 is 2, dot, dot, dot, 571 00:24:13,836 --> 00:24:15,756 but now let me do some dot, dot, dot down here. 572 00:24:15,946 --> 00:24:19,636 When I get to the number 9 and then 10 in decimal, 573 00:24:19,906 --> 00:24:22,026 now I've used up all of those digits, but it turns 574 00:24:22,026 --> 00:24:24,456 out in hexadecimal I can go up to 9. 575 00:24:24,806 --> 00:24:30,676 There is no 10 but there is, in fact, A, B, C, D, E, 576 00:24:30,996 --> 00:24:33,536 F. So in this notation called hexadecimal, 577 00:24:33,536 --> 00:24:37,526 you can effectively count from 0 on up to 1, 2, 3, 4, 5, 6, 7, 8, 578 00:24:37,526 --> 00:24:41,296 9, 10, 11, 12, 13, 14, 15. 579 00:24:41,296 --> 00:24:44,396 You can count to 15 by using 0 though 9 and then 580 00:24:44,396 --> 00:24:45,976 when you run out a through F. 581 00:24:46,326 --> 00:24:47,046 Now who cares? 582 00:24:47,046 --> 00:24:48,116 Why is this interesting? 583 00:24:48,376 --> 00:24:50,116 Well, the bigger picture taken away is 584 00:24:50,116 --> 00:24:52,866 that you know have more digits at your disposal 585 00:24:53,016 --> 00:24:55,916 so for representing really large numbers in the millions 586 00:24:55,916 --> 00:24:57,136 and the billions and higher, 587 00:24:57,386 --> 00:25:00,096 you can actually write them using fewer characters 588 00:25:00,326 --> 00:25:03,666 because you have more expressive capabilities 589 00:25:03,666 --> 00:25:04,986 with each single digit. 590 00:25:05,166 --> 00:25:07,556 That is it only takes one character F 591 00:25:07,656 --> 00:25:12,666 to represent 15 whereas in decimal it would take 1, 5, 592 00:25:12,816 --> 00:25:14,416 2 and how about in binary? 593 00:25:14,416 --> 00:25:15,636 How do you represent 15? 594 00:25:16,216 --> 00:25:19,646 It's 1111. 595 00:25:20,036 --> 00:25:22,096 So one's place, two's place, four's, eight. 596 00:25:22,096 --> 00:25:24,706 So 8 plus 4 plus 2 plus 1 gives me 15. 597 00:25:24,936 --> 00:25:25,726 So this is the idea. 598 00:25:25,726 --> 00:25:27,216 With all these different base systems, 599 00:25:27,216 --> 00:25:29,756 really it's just a convention, a convenience for humans. 600 00:25:30,046 --> 00:25:32,206 Hexadecimal tends to be used when you start talking 601 00:25:32,206 --> 00:25:35,356 about larger numbers because you can fit more information 602 00:25:35,356 --> 00:25:36,816 into fewer characters. 603 00:25:36,816 --> 00:25:39,696 Now as an aside so that this feels a little more real and so 604 00:25:39,696 --> 00:25:40,756 that those of you already familiar 605 00:25:40,756 --> 00:25:42,416 with Photoshop can see the connection, 606 00:25:42,706 --> 00:25:45,346 turns out in the graphics world if you want 607 00:25:45,346 --> 00:25:49,836 to represent the number, the color red, you just express it 608 00:25:50,146 --> 00:25:53,176 with what's called and RGB code, red, green, blue. 609 00:25:53,466 --> 00:25:55,646 And so if you want to represent the number red, 610 00:25:55,806 --> 00:26:00,846 you would generally say FF0000. 611 00:26:00,986 --> 00:26:01,646 Now why this? 612 00:26:01,856 --> 00:26:04,506 Here's how much blue you want, here's how much green you want, 613 00:26:04,606 --> 00:26:05,836 here's how much red you want. 614 00:26:06,266 --> 00:26:13,336 So, because each F is 1111, FF is 11111111 so that's cranking 615 00:26:13,336 --> 00:26:18,266 up all the possible red that's available whereas 0 is no green, 616 00:26:18,436 --> 00:26:21,626 0 is no blue so this is how the world represents red. 617 00:26:21,626 --> 00:26:23,156 How does the world represent green? 618 00:26:23,436 --> 00:26:28,056 Well, 00FF00 and you can infer from that. 619 00:26:28,106 --> 00:26:30,386 a good quiz question maybe what blue is actually 620 00:26:30,386 --> 00:26:31,086 represented with. 621 00:26:31,146 --> 00:26:35,196 So for now the whole point is just this is a different way 622 00:26:35,466 --> 00:26:38,146 of expressing numbers and the human convention is generally 623 00:26:38,146 --> 00:26:41,206 to put this character 0X at the start 624 00:26:41,206 --> 00:26:43,676 of any hexadecimal number just to make clear 625 00:26:43,676 --> 00:26:46,046 to the reader here comes a hexadecimal number. 626 00:26:46,046 --> 00:26:49,166 The 0 and the X are not information unto themselves, 627 00:26:49,436 --> 00:26:51,606 they're not part of the address, it's just a notation, 628 00:26:51,606 --> 00:26:54,116 a hint that says here comes a hexadecimal number. 629 00:26:54,336 --> 00:26:57,316 If you ever see any alphabetical letters though in a number, 630 00:26:57,746 --> 00:26:59,276 odds are it's hexadecimal. 631 00:26:59,276 --> 00:27:03,036 If you see G in a number, odds are it's not hexadecimal 632 00:27:03,036 --> 00:27:06,166 because we can't count that high as G. Okay, so back on track. 633 00:27:06,356 --> 00:27:06,966 Here's main. 634 00:27:07,146 --> 00:27:10,576 I've set a break point in this program called bar.c. Let me go 635 00:27:10,576 --> 00:27:12,786 ahead now and run my program by typing run. 636 00:27:13,156 --> 00:27:14,056 We've done this before. 637 00:27:14,286 --> 00:27:15,676 It hits the break point in main. 638 00:27:15,926 --> 00:27:20,306 That first line of code declares a string S, but wait a minute. 639 00:27:20,306 --> 00:27:23,676 Let me type list just so I can see that code again. 640 00:27:24,046 --> 00:27:26,736 Well, it turns out I have an int A. It didn't bother showing it 641 00:27:26,736 --> 00:27:28,766 to me because that was just allocating space. 642 00:27:28,766 --> 00:27:30,806 There was no statement, there was no assignment 643 00:27:31,026 --> 00:27:32,356 so let's do one sanity check. 644 00:27:32,356 --> 00:27:35,306 Let me go ahead and print out the value of a in gdb. 645 00:27:36,386 --> 00:27:37,136 What is this value? 646 00:27:37,896 --> 00:27:39,086 All right. 647 00:27:39,226 --> 00:27:42,306 Garbage value that was not put there by us necessarily. 648 00:27:42,306 --> 00:27:44,336 There was some start-up process that happened 649 00:27:44,336 --> 00:27:46,716 when I first ran my program whereby that memory 650 00:27:46,716 --> 00:27:49,586 on the stack was used for some purpose unbeknownst to me 651 00:27:49,886 --> 00:27:52,926 and I cannot now trust that a has been initialized 652 00:27:53,166 --> 00:27:55,436 to any fixed value because who knows what's there? 653 00:27:55,436 --> 00:27:58,086 But that's okay because in a few lines from now I'm going 654 00:27:58,086 --> 00:28:01,236 to explicitly initialize a to some value so it's a problem 655 00:28:01,236 --> 00:28:02,906 that it's initialized to some junk value. 656 00:28:03,176 --> 00:28:05,436 I just had better not use that for any purpose yet. 657 00:28:05,436 --> 00:28:07,366 So I'm going to go ahead now and print the value 658 00:28:07,366 --> 00:28:10,486 of S. Notice I haven't started executing the lines 659 00:28:10,486 --> 00:28:11,566 of code in this function. 660 00:28:11,796 --> 00:28:14,566 All I did was list the lines and then I printed a now I'm going 661 00:28:14,566 --> 00:28:16,516 to print S and what do I see? 662 00:28:17,336 --> 00:28:19,966 Well, what's this? 663 00:28:20,176 --> 00:28:22,496 It's definitely hexadecimal so it turns 664 00:28:22,496 --> 00:28:25,406 out that S2 is just a variable. 665 00:28:25,406 --> 00:28:27,436 It happens to be a variable of a different type. 666 00:28:27,696 --> 00:28:31,586 It's a pointer to a char, which means it's the address 667 00:28:32,006 --> 00:28:34,366 of some character in memory but at this point 668 00:28:34,366 --> 00:28:36,506 in the story this line of code, line 20, 669 00:28:36,726 --> 00:28:38,206 has not been executed yet. 670 00:28:38,386 --> 00:28:39,986 So because S is a variable just 671 00:28:39,986 --> 00:28:42,396 like a is S2 can have some garbage value. 672 00:28:42,676 --> 00:28:45,666 So the garbage value that's apparently an S is this thing 673 00:28:45,666 --> 00:28:48,096 here, OX312FF4. 674 00:28:48,616 --> 00:28:50,286 That's just whatever number happens 675 00:28:50,326 --> 00:28:53,136 to be there gdb is showing it in hexadecimal notation 676 00:28:53,306 --> 00:28:54,976 because it knows it should be an address 677 00:28:55,266 --> 00:28:57,016 but that doesn't mean it's not a garbage value. 678 00:28:57,226 --> 00:28:59,546 In fact, what is at that garbage value? 679 00:28:59,736 --> 00:29:02,356 Well, gdb is actually telling me it's checking in advance 680 00:29:02,356 --> 00:29:04,456 for me apparently the string that starts 681 00:29:04,456 --> 00:29:07,306 with a vertical bar then a curly brace then this special 682 00:29:07,656 --> 00:29:08,346 code there. 683 00:29:08,456 --> 00:29:09,646 So it's just some junk value. 684 00:29:09,646 --> 00:29:12,196 It's just a coincidence and it's a coincidence, too, 685 00:29:12,386 --> 00:29:15,866 that this string that's at that address is only a few characters 686 00:29:15,866 --> 00:29:19,266 long because gdb stops showing what's there as soon 687 00:29:19,266 --> 00:29:23,486 as it hits a '0' so just by luck there is, in fact, 688 00:29:23,486 --> 00:29:25,946 a 0 byte in the vicinity of that address 689 00:29:26,076 --> 00:29:28,086 and that's why the string does not look longer. 690 00:29:28,246 --> 00:29:29,846 Now, why did this happen here? 691 00:29:30,306 --> 00:29:30,906 Who knows. 692 00:29:30,946 --> 00:29:33,836 It's just an artifact of having run this program 693 00:29:34,216 --> 00:29:35,016 on this machine. 694 00:29:37,116 --> 00:29:37,206 Yeah? 695 00:29:37,206 --> 00:29:38,076 >> Student: [inaudible]. 696 00:29:38,076 --> 00:29:39,996 >> David: s plus 30? 697 00:29:39,996 --> 00:29:41,256 Okay so we can do this. 698 00:29:41,256 --> 00:29:42,846 So jumping slightly ahead let me go ahead 699 00:29:42,846 --> 00:29:48,336 and print let's do s plus 30 and now I've advanced, 700 00:29:48,336 --> 00:29:50,896 if you do the math, I've advanced 30 bytes in memory 701 00:29:50,926 --> 00:29:53,406 and what's apparently 30 bytes away 702 00:29:53,406 --> 00:29:57,036 from S's current bogus address is this random character there. 703 00:29:57,036 --> 00:29:59,966 So, you can poke around with gdb anywhere you want in memory. 704 00:30:00,206 --> 00:30:04,576 In fact, if you continue on next year with a course called CS61, 705 00:30:04,746 --> 00:30:06,426 which is low-level systems programming, 706 00:30:06,686 --> 00:30:10,316 one of the first problem sets actually involves they hand you 707 00:30:10,316 --> 00:30:13,546 a program that the staff wrote that they compiled whose purpose 708 00:30:13,546 --> 00:30:15,976 in life is to ask for passwords and then check 709 00:30:15,976 --> 00:30:17,296 if you've typed in the right value. 710 00:30:17,546 --> 00:30:19,796 The problem is if you input the wrong password 711 00:30:19,796 --> 00:30:21,386 into this special problem set program, 712 00:30:21,386 --> 00:30:24,606 it emails the staff saying David got this question wrong. 713 00:30:24,956 --> 00:30:27,526 So the goal of that pset is to actually figure 714 00:30:27,526 --> 00:30:32,346 out what the user's passwords are without triggering this bomb 715 00:30:32,346 --> 00:30:35,116 to go off, without triggering the email to get sent out. 716 00:30:35,266 --> 00:30:37,776 So you obviously can't just run the program at the command line 717 00:30:37,776 --> 00:30:39,286 and start inputting your passwords 718 00:30:39,316 --> 00:30:42,166 because clearly the odds of your guessing the password right the 719 00:30:42,226 --> 00:30:44,876 first time is not going to happen and so you instead have 720 00:30:44,906 --> 00:30:47,766 to run, making I'm making next year's problem set too easy 721 00:30:47,766 --> 00:30:50,656 for you, you could certainly use a tool like gdb, right? 722 00:30:50,656 --> 00:30:53,786 Because with gdb can you pause execution at main 723 00:30:54,026 --> 00:30:56,936 and then start stepping through it poking around, poking around 724 00:30:57,056 --> 00:30:59,006 and if they've hard coded the right passwords 725 00:30:59,006 --> 00:31:02,766 into the program, surely by printing random parts and memory 726 00:31:02,766 --> 00:31:05,466 out can you poke around and see what's actually there. 727 00:31:05,466 --> 00:31:06,596 So, if you take CS61 remember 728 00:31:06,596 --> 00:31:08,776 that don't tell them I said this in lecture. 729 00:31:08,776 --> 00:31:09,216 [Laughter] 730 00:31:09,216 --> 00:31:10,506 All right. 731 00:31:11,366 --> 00:31:12,336 So now let's continue. 732 00:31:12,556 --> 00:31:14,426 Here's the program we're working on. 733 00:31:14,536 --> 00:31:15,686 Here's the program we're working on. 734 00:31:15,686 --> 00:31:18,326 Let me go ahead and remind you 735 00:31:18,426 --> 00:31:23,716 in main dot C the main function notice I've just initialized S 736 00:31:23,886 --> 00:31:26,366 or I'm about to initialize S to hello world and then we're going 737 00:31:26,366 --> 00:31:27,816 to start printing things out for real. 738 00:31:28,126 --> 00:31:30,526 So let me go ahead and type next in gdb 739 00:31:30,866 --> 00:31:34,706 so now I've executed line 20 so now if I print out what's 740 00:31:34,706 --> 00:31:38,076 at location S notice that S itself is still an address 741 00:31:38,596 --> 00:31:41,776 but it's now the address of this string in memory. 742 00:31:41,776 --> 00:31:42,966 Now where did that end up? 743 00:31:43,586 --> 00:31:46,626 That ended up specifically to be clear at this address. 744 00:31:46,786 --> 00:31:48,016 Now let's go ahead and call printf. 745 00:31:48,016 --> 00:31:49,366 So I'm going to go ahead and type next 746 00:31:50,116 --> 00:31:52,826 and sure enough it printed out the word world 747 00:31:52,826 --> 00:31:55,666 on the left hand side there and let's see why this is. 748 00:31:55,666 --> 00:31:58,986 Well, let me go ahead and printout S bracket 7. 749 00:31:59,266 --> 00:32:01,606 That should print out the 7th character 750 00:32:01,756 --> 00:32:04,126 in the string called S, and in fact, it does. 751 00:32:04,426 --> 00:32:07,186 It prints out W whose ASCII code if you look it 752 00:32:07,186 --> 00:32:11,746 up in the chart is 119 just as lower case a was 97. 753 00:32:11,866 --> 00:32:13,766 So, W is 119. 754 00:32:14,006 --> 00:32:16,476 Well, let me go ahead now and print the address 755 00:32:16,476 --> 00:32:20,346 of S bracket S, and I shouldn't get W per se. 756 00:32:20,346 --> 00:32:23,196 I should instead get the address at that location, 757 00:32:23,416 --> 00:32:25,996 which is this again and yet it shows me world 758 00:32:26,256 --> 00:32:28,956 because gdb realizes, oh, this is the start of a string, 759 00:32:29,066 --> 00:32:31,256 I'm going to just keep showing you character after character 760 00:32:31,256 --> 00:32:33,366 after character until I hit what special value? 761 00:32:33,366 --> 00:32:37,936 Back slash 0 and so I happen to see the word world again 762 00:32:38,136 --> 00:32:41,176 and that's exactly why printf itself printed 763 00:32:41,176 --> 00:32:42,636 out just the word world. 764 00:32:42,896 --> 00:32:44,036 So, now let's go ahead and advance. 765 00:32:44,216 --> 00:32:50,146 If I hit next now, I'm at line 22 so a is currently Print A, 766 00:32:50,356 --> 00:32:52,806 it's currently some bogus value, but now if I do next 767 00:32:53,426 --> 00:32:58,016 and now I Print A, notice that a is now 5, which is as expected. 768 00:32:58,306 --> 00:33:01,356 Now, if I type next again, foo will get called 769 00:33:01,596 --> 00:33:02,946 but gdb won't step into it. 770 00:33:02,946 --> 00:33:05,616 It will just step right over it, foo will do its thing 771 00:33:05,616 --> 00:33:07,596 and that's kind of useless because then my program will end 772 00:33:07,836 --> 00:33:11,356 so I'm instead going to type not next but step that steps me 773 00:33:11,356 --> 00:33:14,836 into recall the function, gdb is reminding me, all right, 774 00:33:14,836 --> 00:33:18,506 I'm in bar.c line 31 so you can do a little sanity check 775 00:33:18,506 --> 00:33:20,036 in another window or on a printout. 776 00:33:20,216 --> 00:33:23,336 I've called a function called foo with an argument of 5. 777 00:33:23,486 --> 00:33:24,366 What do I do here? 778 00:33:24,736 --> 00:33:27,636 Well, I'm going to go ahead and hit next, I'm going to go ahead 779 00:33:27,636 --> 00:33:30,826 and hit next and now let's see what N is. 780 00:33:31,036 --> 00:33:33,366 Well, N is still 5 because that was the argument 781 00:33:33,536 --> 00:33:34,716 that goes passed into foo. 782 00:33:34,906 --> 00:33:36,416 Well, what should B be at this point? 783 00:33:37,686 --> 00:33:39,216 So hopefully B is 10. 784 00:33:39,416 --> 00:33:41,346 That makes sense because I initialize B to N 785 00:33:41,346 --> 00:33:44,386 and then I multiplied B times 2. 786 00:33:44,606 --> 00:33:48,336 This special shorthand notation just means B equals B times 2 787 00:33:48,736 --> 00:33:51,376 so now it's 10 and now what I'm going to do next let me go ahead 788 00:33:51,376 --> 00:33:56,456 and step into bar and now if I print out M the argument 789 00:33:56,456 --> 00:33:58,106 to bar it's indeed 10. 790 00:33:58,416 --> 00:34:01,806 So, at this point now I'm just doing some useless stuff 791 00:34:01,866 --> 00:34:03,646 because it's just going to print high on bar so I'm going 792 00:34:03,646 --> 00:34:05,626 to type continue because I'm getting bored with the process 793 00:34:05,846 --> 00:34:07,526 so the whole thing finishes executing 794 00:34:07,526 --> 00:34:09,476 without breaking again, program exists normally, 795 00:34:09,476 --> 00:34:12,376 which means it returned zero and voila, but the point 796 00:34:12,376 --> 00:34:15,416 for now is not what this program actually did but that using gdb 797 00:34:15,416 --> 00:34:19,846 where we able to confirm that this whole layout of memory is, 798 00:34:19,846 --> 00:34:23,246 in fact, the case and if we look specifically at the addresses 799 00:34:23,326 --> 00:34:25,356 without just saying hexadecimal, hexadecimal, 800 00:34:25,616 --> 00:34:28,096 if you actually look at the numbers, you'll see that all 801 00:34:28,096 --> 00:34:30,096 of the locations of these values are 802 00:34:30,096 --> 00:34:32,016 at different points in this process. 803 00:34:32,016 --> 00:34:33,436 Let's do one sanity check. 804 00:34:33,436 --> 00:34:36,186 Let me go ahead and rerun gdb on bar. 805 00:34:36,186 --> 00:34:39,436 I'm going to set another break point in main and now I'm going 806 00:34:39,436 --> 00:34:41,566 to type run and now I'm going to do this. 807 00:34:41,746 --> 00:34:45,966 So notice that print the value 808 00:34:46,116 --> 00:34:49,076 of a it is some bogus value again, 809 00:34:49,076 --> 00:34:53,446 but let's print the address of a and it looks like the address 810 00:34:53,446 --> 00:34:58,126 of a is here; OXBFFFFF688. 811 00:34:58,396 --> 00:35:00,106 So we'll leave that on the screen. 812 00:35:00,106 --> 00:35:01,666 Let me now type a couple of commands. 813 00:35:01,666 --> 00:35:03,046 Next. Next. 814 00:35:03,966 --> 00:35:05,436 Next. All right I'm going to step. 815 00:35:05,436 --> 00:35:07,416 I want to go into foo for a moment. 816 00:35:07,416 --> 00:35:09,046 Now I'm going to step into foo. 817 00:35:09,246 --> 00:35:11,376 Now, foo recall has an argument. 818 00:35:11,376 --> 00:35:15,516 It's called N. So on this picture foo should be higher 819 00:35:15,786 --> 00:35:18,756 on the stack than main's own variables, right? 820 00:35:18,756 --> 00:35:20,286 Because main has called foo 821 00:35:20,286 --> 00:35:22,056 and eventually foo is going to call bar. 822 00:35:22,296 --> 00:35:24,326 So the stack is growing upward in this way, 823 00:35:24,326 --> 00:35:25,746 but we're about to see a curiosity. 824 00:35:26,006 --> 00:35:31,706 Let me go ahead and print out N, N is indeed 5, let me go ahead 825 00:35:31,706 --> 00:35:34,156 and print out B. B is some bogus value, 826 00:35:34,156 --> 00:35:37,926 but if I execute B equals N by typing next now I print B 827 00:35:37,926 --> 00:35:38,696 and it should be what? 828 00:35:39,956 --> 00:35:44,926 Should be 5 also because I said B equal to N and now last step 829 00:35:44,926 --> 00:35:47,086 if you're following along let me go ahead and print 830 00:35:47,086 --> 00:35:50,376 out the address of B, which is a local variable 831 00:35:50,376 --> 00:35:52,056 in foo and now notice. 832 00:35:52,786 --> 00:35:54,396 So it's a little cryptic at first. 833 00:35:54,476 --> 00:35:58,216 Here is where a was and a is a local variable in main. 834 00:35:58,616 --> 00:36:03,406 Now down here B is a local variable in foo 835 00:36:03,656 --> 00:36:05,326 and notice their addresses. 836 00:36:05,416 --> 00:36:06,636 They're almost the same. 837 00:36:06,636 --> 00:36:11,746 They both start with BFFFF and then a 6, but then we have, 838 00:36:11,746 --> 00:36:15,596 what, 5C and then we have 88. 839 00:36:15,976 --> 00:36:18,896 So, in which direction do things seem to be growing? 840 00:36:19,506 --> 00:36:23,986 So it looks like even though we think of the address, 841 00:36:23,986 --> 00:36:26,786 even though we think of the stack as growing up, 842 00:36:27,156 --> 00:36:29,256 recall that main is going to be on the bottom here 843 00:36:29,616 --> 00:36:33,236 and then foo is going to be on top of it pictorially here and 844 00:36:33,236 --> 00:36:34,606 yet notice foo's address. 845 00:36:34,606 --> 00:36:36,686 Is it bigger or smaller if you compare the two? 846 00:36:37,586 --> 00:36:40,506 So, it's actually smaller so this is actually a problem 847 00:36:40,506 --> 00:36:41,266 and it lends itself 848 00:36:41,266 --> 00:36:44,646 to a security exploit we'll soon see even though conceptually 849 00:36:44,646 --> 00:36:47,426 frankly we call it the stack, it makes nice sense 850 00:36:47,426 --> 00:36:49,866 to say the frames get stacked up and up and up just 851 00:36:49,866 --> 00:36:53,106 like in a cafeteria, but in reality you'll actually see 852 00:36:53,106 --> 00:36:54,916 in some textbooks the pictures reversed 853 00:36:54,916 --> 00:36:56,936 where the stack grows down, which is fine. 854 00:36:56,936 --> 00:36:58,366 You just have to flip your mental model, 855 00:36:58,586 --> 00:37:00,556 but then the whole cafeteria example doesn't really make 856 00:37:00,556 --> 00:37:01,886 as much sense and it's kind of weird 857 00:37:01,886 --> 00:37:03,606 that they called it a stack, but so be it. 858 00:37:03,936 --> 00:37:07,486 But in this case with gdb we're actually confirming visually 859 00:37:07,726 --> 00:37:09,916 that here's a local variable in main and it's 860 00:37:09,916 --> 00:37:11,616 at a decently high, a big address. 861 00:37:12,236 --> 00:37:14,746 Notice that this is a local variable in foo, 862 00:37:14,976 --> 00:37:16,386 but its address is smaller. 863 00:37:16,746 --> 00:37:20,476 So, that means big addresses in RAM are actually at the bottom; 864 00:37:20,646 --> 00:37:23,036 small addresses in RAM at higher up. 865 00:37:23,176 --> 00:37:25,026 So even though I've kind of been cheating 866 00:37:25,026 --> 00:37:28,036 and saying suppose this is byte 0 and this is 1 and this is 2, 867 00:37:28,296 --> 00:37:30,396 it's actually the reverse on most systems 868 00:37:30,656 --> 00:37:32,886 and this is a problem because we'll see this week 869 00:37:33,096 --> 00:37:36,106 that you can exploit that feature of the order of memory 870 00:37:36,366 --> 00:37:39,466 and insert as big a chunk of data as you want 871 00:37:39,516 --> 00:37:42,026 into someone's function and what's going to happen is 872 00:37:42,026 --> 00:37:44,916 because these addresses are going from small to big, 873 00:37:45,126 --> 00:37:47,966 when you insert big chunk of memory, you're going to end 874 00:37:47,966 --> 00:37:50,666 up overwriting previous frames on the stack 875 00:37:50,906 --> 00:37:53,116 and by doing this can you take over the machine 876 00:37:53,116 --> 00:37:54,326 or take over that program. 877 00:37:54,676 --> 00:37:56,376 So that's very abstract for now. 878 00:37:56,376 --> 00:37:58,226 Why don't we go ahead and take a five-minute break 879 00:37:58,226 --> 00:37:59,936 and we'll resume with more detail. 880 00:38:03,526 --> 00:38:07,226 All right so that was a lot all at once and I recognize 881 00:38:07,226 --> 00:38:09,126 that it's kind of hard to follow through gdb 882 00:38:09,126 --> 00:38:10,626 when the screen starts becoming a mess. 883 00:38:10,626 --> 00:38:14,036 So, let's take a look at some perhaps simpler pictures just 884 00:38:14,036 --> 00:38:15,816 to set the stage here, again, mentally. 885 00:38:16,066 --> 00:38:18,386 The story we're telling is that memory is still laid 886 00:38:18,386 --> 00:38:20,576 out relatively speaking the same way it's been 887 00:38:20,576 --> 00:38:24,026 for the past few weeks, but it turns out that byte 0 888 00:38:24,026 --> 00:38:26,876 in your computer's memory is actually up here 889 00:38:27,096 --> 00:38:31,056 and byte 2 billion if you have 2 gigabytes of RAM is lower 890 00:38:31,056 --> 00:38:34,306 in this picture even though the stack itself grows upward 891 00:38:34,306 --> 00:38:37,786 pictorially the addresses must be going from big address 892 00:38:37,786 --> 00:38:39,686 to smaller, to smaller to smaller. 893 00:38:39,906 --> 00:38:41,346 Now why is this of concern? 894 00:38:41,346 --> 00:38:43,206 Well, consider this chunk of code here. 895 00:38:43,456 --> 00:38:46,686 This is a pretty small C program, but it demonstrates one 896 00:38:46,686 --> 00:38:48,026 of the most common attacks 897 00:38:48,436 --> 00:38:51,156 that still compromises servers and programs today. 898 00:38:51,446 --> 00:38:52,846 So here's main down here. 899 00:38:53,096 --> 00:38:55,276 It does, in fact, take command line arguments. 900 00:38:55,276 --> 00:38:58,286 As an aside, you might see in sections 901 00:38:58,286 --> 00:39:00,386 or in textbooks there's this other notation 902 00:39:00,386 --> 00:39:04,306 for ARG V. We generally say char * and then ARG V 903 00:39:04,306 --> 00:39:06,456 with square brackets to make very clear 904 00:39:06,456 --> 00:39:10,436 that ARG V is an array of char *s aka strings 905 00:39:10,436 --> 00:39:11,416 and array of strings. 906 00:39:11,796 --> 00:39:15,196 Well, because an array can also be treated as we've started 907 00:39:15,196 --> 00:39:16,436 to see last week in this 908 00:39:16,786 --> 00:39:21,226 as itself just an address it is also possible to write ARG V 909 00:39:21,226 --> 00:39:24,696 as char * star so consider this just for now a teaser. 910 00:39:24,926 --> 00:39:27,356 You can have things called pointers, two pointers, 911 00:39:27,356 --> 00:39:31,366 that can be useful but for now will generally use our original 912 00:39:31,366 --> 00:39:32,696 notation with square brackets. 913 00:39:32,696 --> 00:39:33,866 But for now it's the same thing. 914 00:39:34,266 --> 00:39:37,536 So this program here calls foo, which is this function up here 915 00:39:37,746 --> 00:39:40,986 and it just blindly passes in ARG V bracket one. 916 00:39:41,196 --> 00:39:45,146 Now just instinctively what is potentially wrong 917 00:39:45,146 --> 00:39:46,596 with this implementation? 918 00:39:47,206 --> 00:39:52,586 So the real answer is going to be no bounds checking, 919 00:39:52,586 --> 00:39:54,186 but there's another problem in main itself. 920 00:39:54,186 --> 00:39:55,696 What am I not doing in main? 921 00:39:57,096 --> 00:39:58,716 So, I'm not checking ARG C. Right, 922 00:39:58,716 --> 00:40:02,826 even though thus far we've been writing main for you recently 923 00:40:02,826 --> 00:40:06,416 in game of 15 and Sudoku, think back to caesar in vigenere 924 00:40:06,416 --> 00:40:08,886 when you had to accept the command line argument assuming 925 00:40:08,886 --> 00:40:10,296 you implemented the requirements 926 00:40:10,296 --> 00:40:12,126 of the specification you probably 927 00:40:12,126 --> 00:40:16,216 at least checked is ARG C equal equal to the number 2 928 00:40:16,446 --> 00:40:19,816 so that you knew there was going to be a number after caesar 929 00:40:19,936 --> 00:40:23,466 or after vigenere's name at the command line. 930 00:40:23,746 --> 00:40:26,896 Here we're just assuming blindly that ARG C is going 931 00:40:27,036 --> 00:40:32,026 to have a count of 2 or more, and I'm passing in bracket 1 932 00:40:32,026 --> 00:40:35,676 of ARG V. So the first word typed after the program's name, 933 00:40:35,676 --> 00:40:37,406 but what if there is no word there? 934 00:40:37,406 --> 00:40:39,066 Well, it looks like I'm going to be jumping 935 00:40:39,066 --> 00:40:40,876 into memory that's not necessarily my own. 936 00:40:40,876 --> 00:40:42,776 So that's one problem, but now let's look 937 00:40:42,776 --> 00:40:43,916 at what foo actually does. 938 00:40:44,166 --> 00:40:46,976 It looks like foo takes a single argument, a string, 939 00:40:46,976 --> 00:40:52,436 aka char *. It then has a local variable called C, C is the name 940 00:40:52,436 --> 00:40:55,586 of an array with 12 characters available in it 941 00:40:55,966 --> 00:40:57,716 and now I'm using this function. 942 00:40:57,716 --> 00:41:00,396 So there's all sorts of fun memory-related functions 943 00:41:00,396 --> 00:41:03,006 in C. We saw one last week, malloc that will keep using. 944 00:41:03,006 --> 00:41:05,466 There's free as we'll see and there's memcpy, 945 00:41:05,466 --> 00:41:06,946 which does as the name implies. 946 00:41:06,986 --> 00:41:07,746 It's not a typo. 947 00:41:08,326 --> 00:41:09,956 Someone decided that saving 948 00:41:09,956 --> 00:41:12,386 that individual character just made the function easier to type 949 00:41:12,456 --> 00:41:15,126 so memcpy, C-P-Y, what does it do? 950 00:41:15,306 --> 00:41:18,006 Well, its purpose in life if we've read the man page is 951 00:41:18,036 --> 00:41:22,646 to take, use the first argument here as a destination, 952 00:41:22,956 --> 00:41:26,466 use the second argument as the source that is what you want to, 953 00:41:26,466 --> 00:41:29,156 what bytes do you want to copy into the destination 954 00:41:29,426 --> 00:41:30,706 and then it takes a third argument, 955 00:41:30,706 --> 00:41:32,306 which is how many bytes do you want to copy? 956 00:41:32,306 --> 00:41:36,436 So, casually speaking memcpy is supposed to copy this many bytes 957 00:41:36,966 --> 00:41:40,216 from bar into C. Now there's a comment here, 958 00:41:40,216 --> 00:41:43,286 which says no bounds checking, but what does that mean? 959 00:41:43,286 --> 00:41:49,346 What really is the danger here? 960 00:41:50,286 --> 00:41:50,376 Yeah? 961 00:41:50,376 --> 00:41:50,816 >> Student: [inaudible] 962 00:41:50,816 --> 00:41:51,436 >> David: Perfect. 963 00:41:51,436 --> 00:41:54,246 So if I type a really long word at the command line 964 00:41:54,376 --> 00:41:56,806 after this program's name, it's not just foo. 965 00:41:56,806 --> 00:41:57,866 It's not just 13. 966 00:41:57,976 --> 00:42:00,356 It is "a really long sentence". 967 00:42:00,356 --> 00:42:02,696 Which if you count that up, that's going to be more 968 00:42:02,696 --> 00:42:05,436 than a really long, I think that's more 969 00:42:05,436 --> 00:42:06,826 than 12 characters you're going 970 00:42:06,826 --> 00:42:09,346 to overflow it seems this buffer. 971 00:42:09,346 --> 00:42:12,126 In other words, I am not only just assuming there's something 972 00:42:12,156 --> 00:42:16,046 in ARG V bracket 1, I'm also then blindly copying the 973 00:42:16,046 --> 00:42:18,036 entirety of that string 974 00:42:18,316 --> 00:42:21,876 into C even though I've only allocated how many bytes 975 00:42:21,876 --> 00:42:25,256 for this variable C. So 12. 976 00:42:25,256 --> 00:42:27,466 So what does this mean in reality? 977 00:42:27,536 --> 00:42:29,636 Well, this means that it will work. 978 00:42:29,636 --> 00:42:34,146 It will copy that many bytes from bar into C, 979 00:42:34,446 --> 00:42:36,916 but there's a danger here as you've seen in your own code 980 00:42:36,916 --> 00:42:39,036 that if you start touching memory that's not yours, 981 00:42:39,216 --> 00:42:40,896 odds are the program is going to do what? 982 00:42:41,096 --> 00:42:43,886 It's going to seg fault, it's going to crash and that's 983 00:42:43,976 --> 00:42:45,686 because you're touching memory that's not your own, 984 00:42:45,686 --> 00:42:48,696 but if you don't type a crazy long string 985 00:42:48,696 --> 00:42:50,636 but one that's a little longer than 12, 986 00:42:50,866 --> 00:42:52,866 but not something ridiculous like a thousand 987 00:42:52,866 --> 00:42:53,756 or a million just, you know, 988 00:42:53,756 --> 00:42:56,986 something a little long that's still in that sweet spot 989 00:42:56,986 --> 00:42:59,506 where the computer doesn't realize that it should crash, 990 00:42:59,596 --> 00:43:01,906 it doesn't realize that that really isn't your memory, 991 00:43:02,246 --> 00:43:05,096 you can actually overwrite some really important information 992 00:43:05,096 --> 00:43:07,596 that's also on the stack that we've not mentioned thus far. 993 00:43:07,896 --> 00:43:09,056 Thus far we've always said 994 00:43:09,056 --> 00:43:12,146 that on the stack goes a function's local variables 995 00:43:12,146 --> 00:43:14,576 and its parameters and that repeats 996 00:43:14,576 --> 00:43:16,746 for each function that's called, but it turns 997 00:43:16,746 --> 00:43:19,426 out we've been taking for granted the fact 998 00:43:19,666 --> 00:43:22,436 that if main calls foo and foo calls bar, 999 00:43:22,516 --> 00:43:26,146 that the computer just knows how to get back from bar to foo 1000 00:43:26,306 --> 00:43:28,696 and how to get back from foo to main. 1001 00:43:28,976 --> 00:43:31,206 In other words, we've just waved our hands at this detail 1002 00:43:31,286 --> 00:43:33,226 of unwinding the stack, removing frames 1003 00:43:33,436 --> 00:43:34,916 and we've just assumed the computer knows 1004 00:43:34,916 --> 00:43:36,736 where it came from, but that's not the case. 1005 00:43:36,986 --> 00:43:39,536 C in a programming language is very low level. 1006 00:43:39,536 --> 00:43:41,076 You need to tell the computer 1007 00:43:41,346 --> 00:43:44,196 when this function is done executing where to go back 1008 00:43:44,426 --> 00:43:47,126 to so it turns out that one of the other pieces 1009 00:43:47,126 --> 00:43:48,776 of information that's been stored 1010 00:43:48,776 --> 00:43:50,866 on the stack all this time not by you, 1011 00:43:51,166 --> 00:43:54,246 but the compiler has been doing this for you, also what goes 1012 00:43:54,246 --> 00:43:56,756 on the stack is something called a return address. 1013 00:43:57,266 --> 00:44:00,446 So here's a little picture of just the stack. 1014 00:44:00,646 --> 00:44:02,106 Consider this a zoomed in version 1015 00:44:02,356 --> 00:44:04,336 where it says parent routine stack here. 1016 00:44:04,336 --> 00:44:05,606 Suppose this is main. 1017 00:44:05,916 --> 00:44:10,816 Now, main lives somewhere in memory in that text segment, 1018 00:44:11,256 --> 00:44:13,166 but when you call foo, for instance, 1019 00:44:13,166 --> 00:44:14,476 which is this guy up here. 1020 00:44:14,476 --> 00:44:16,836 So this is foo up in the blue at the top here, 1021 00:44:17,056 --> 00:44:22,246 foo takes this argument and has this local array called C, 1022 00:44:22,556 --> 00:44:25,916 when foo gets called, once it's done executing the computer 1023 00:44:25,916 --> 00:44:28,086 needs to know what address to go back to. 1024 00:44:28,086 --> 00:44:30,136 That is it needs to know go back to main. 1025 00:44:30,416 --> 00:44:31,176 So one of the things 1026 00:44:31,176 --> 00:44:33,876 that happens inside memory is what goes here 1027 00:44:33,876 --> 00:44:38,076 in red is the return address, the address of the line in code 1028 00:44:38,396 --> 00:44:42,176 in main that should get executed after foo is done running. 1029 00:44:42,996 --> 00:44:45,276 So, this red bar just represents, again, 1030 00:44:45,276 --> 00:44:47,386 the address of the specific line of code 1031 00:44:47,736 --> 00:44:50,606 that the computer should go back to once foo is done 1032 00:44:50,816 --> 00:44:51,976 after you hit that curly brace. 1033 00:44:52,436 --> 00:44:53,956 Now notice the danger here now. 1034 00:44:54,226 --> 00:44:57,826 If addresses are small up here and big up here, 1035 00:44:58,316 --> 00:45:01,146 notice if you pass in more bytes than is expected, 1036 00:45:01,176 --> 00:45:03,746 where do those extra dangerous bytes end up? 1037 00:45:04,636 --> 00:45:06,006 They end up going this way, right, 1038 00:45:06,006 --> 00:45:08,076 and so as this redness suggests, 1039 00:45:08,076 --> 00:45:10,706 going over the red part is not a good thing 1040 00:45:10,876 --> 00:45:13,106 because it means you could clobber, you could change 1041 00:45:13,106 --> 00:45:15,386 that return address to be something else. 1042 00:45:15,676 --> 00:45:18,526 Now, if you're the bad guy and you are deliberately typing 1043 00:45:18,526 --> 00:45:21,336 in a word at the command line that's not just a word, 1044 00:45:21,336 --> 00:45:24,256 it actually represents code that you want to be executed, 1045 00:45:24,546 --> 00:45:25,946 if through trial and error 1046 00:45:25,946 --> 00:45:29,656 or really smart savvy you know exactly what you should type 1047 00:45:29,656 --> 00:45:33,466 at the command line so that the different number ends up here 1048 00:45:33,466 --> 00:45:36,156 in the return address, what you can do is trick the computer 1049 00:45:36,386 --> 00:45:37,846 by passing in enough bytes 1050 00:45:37,846 --> 00:45:39,636 to overload not just the blue segment 1051 00:45:39,636 --> 00:45:40,876 and this green segment here 1052 00:45:41,076 --> 00:45:44,026 but also this red segment you can trick the computer 1053 00:45:44,026 --> 00:45:47,086 into returning to the address not of main 1054 00:45:47,086 --> 00:45:50,096 but of the code you yourself just typed in. 1055 00:45:50,416 --> 00:45:51,776 So what you can be passing 1056 00:45:51,816 --> 00:45:55,536 in to this blue portion is actual executable code 1057 00:45:55,536 --> 00:45:57,826 if you know how to represent it with ASCII characters 1058 00:45:58,016 --> 00:45:59,206 and you can trick the program 1059 00:45:59,206 --> 00:46:01,966 into running your own code what might the program then do? 1060 00:46:02,186 --> 00:46:03,426 Well, if this is an installation 1061 00:46:03,426 --> 00:46:05,956 of Photoshop you could be tricking it into skipping 1062 00:46:05,956 --> 00:46:09,196 that next line of code that says ask user for their serial number 1063 00:46:09,196 --> 00:46:11,386 or their registration code and just skip right over it 1064 00:46:11,496 --> 00:46:13,326 and execute your line of code instead. 1065 00:46:13,566 --> 00:46:16,196 If it's a server, you might be tricking the server 1066 00:46:16,196 --> 00:46:18,536 into skipping over that very important line of code 1067 00:46:18,536 --> 00:46:21,366 that says check user's password and if you just skip it, 1068 00:46:21,366 --> 00:46:23,686 the user might have now unfettered access to the system. 1069 00:46:23,966 --> 00:46:27,276 So if have the ability to control the flow of a program 1070 00:46:27,276 --> 00:46:31,516 by changing where it's going in memory, you can essentially take 1071 00:46:31,516 --> 00:46:35,046 over that program or server and do anything you want thereafter. 1072 00:46:35,046 --> 00:46:38,926 So here's a picture so here is to be clear the array called C 1073 00:46:38,926 --> 00:46:39,836 and just like I always do 1074 00:46:39,836 --> 00:46:42,916 on the blackboard we're representing each byte in 1075 00:46:42,916 --> 00:46:46,496 or each char in that array is just a square here's the 0th 1076 00:46:46,496 --> 00:46:49,966 one, here's the 11th one, 0 index, so suppose we pass 1077 00:46:50,226 --> 00:46:52,386 in an actual word like hello? 1078 00:46:52,496 --> 00:46:56,166 It gets passed in as H-E-L-L-O back slash zero 1079 00:46:56,396 --> 00:46:58,066 and then this rest of the 12 bytes 1080 00:46:58,126 --> 00:47:00,336 that were allocated they're wasted, they're not being used, 1081 00:47:00,566 --> 00:47:01,856 but nothing bad has happened, 1082 00:47:02,136 --> 00:47:04,776 but now suppose I insert some attack code. 1083 00:47:04,776 --> 00:47:07,716 So, just some letters or numbers that happen 1084 00:47:07,716 --> 00:47:11,016 to represent executable code what can happen is you can 1085 00:47:11,016 --> 00:47:12,976 over run that segment. 1086 00:47:13,106 --> 00:47:16,646 So notice before hello is confined to this blue box. 1087 00:47:16,646 --> 00:47:19,226 If I now insert some attack code, whatever it looks like, 1088 00:47:19,456 --> 00:47:23,086 it might actually overflow everything here and notice 1089 00:47:23,086 --> 00:47:26,376 because I've been really smart I've included some numbers, 1090 00:47:26,376 --> 00:47:29,676 four hexadecimal numbers, at the very end of my attack code, 1091 00:47:29,936 --> 00:47:31,746 which hopefully represent what? 1092 00:47:32,146 --> 00:47:35,976 Hopefully represent the address of the code that I inserted 1093 00:47:36,176 --> 00:47:37,386 where hello used to be. 1094 00:47:37,796 --> 00:47:40,126 So, it's fairly sophisticated and yet a lot 1095 00:47:40,126 --> 00:47:42,086 of these attacks are done by trial and error. 1096 00:47:42,086 --> 00:47:43,726 It's pretty hard to know a priori 1097 00:47:43,976 --> 00:47:47,716 if you didn't write the program what the return address in main 1098 00:47:48,026 --> 00:47:50,026 or where rather the return address is. 1099 00:47:50,116 --> 00:47:52,056 Where exactly this red bar is, 1100 00:47:52,296 --> 00:47:55,396 but what attackers generally do both on the Internet and even 1101 00:47:55,396 --> 00:47:57,456 with installations of Photoshop, Microsoft Word, 1102 00:47:57,456 --> 00:47:58,556 whatever it is they're trying to crack, 1103 00:47:59,026 --> 00:48:01,356 you just try running your attack code again and again 1104 00:48:01,356 --> 00:48:02,386 and again and you know what? 1105 00:48:02,596 --> 00:48:05,156 If you crash that server or crash that program 1106 00:48:05,156 --> 00:48:07,556 or blue screen of death the computer, you know, 1107 00:48:07,556 --> 00:48:09,496 that's not actually a bad thing if you're the attacker. 1108 00:48:09,496 --> 00:48:13,046 It means you've found a bug, you've found an opportunity 1109 00:48:13,046 --> 00:48:16,636 to exploit because if you can run programs and get them 1110 00:48:16,636 --> 00:48:18,936 to crash, there's an opportunity there. 1111 00:48:19,036 --> 00:48:21,456 Odds are you've passed it too much attack code 1112 00:48:21,456 --> 00:48:24,456 or too much input so maybe if you scale back your attack 1113 00:48:24,456 --> 00:48:26,806 and send fewer bytes and different addresses, 1114 00:48:27,026 --> 00:48:28,776 maybe you can hit that sweet spot 1115 00:48:28,776 --> 00:48:31,956 and actually trick the computer into executing your code. 1116 00:48:31,956 --> 00:48:33,926 Now a lot of programs today that we all run 1117 00:48:33,926 --> 00:48:36,106 on our computers are written in languages like C 1118 00:48:36,106 --> 00:48:41,056 and C++ objective C and a whole bunch of other languages 1119 00:48:41,056 --> 00:48:45,646 that do not do, offer a number of safety features 1120 00:48:45,646 --> 00:48:47,226 that other languages do, for instance, 1121 00:48:47,226 --> 00:48:49,816 what we're describing here you can't really do in Java 1122 00:48:49,816 --> 00:48:53,386 and frankly even with modern compilers you can generally 1123 00:48:53,626 --> 00:48:57,106 check a box in your compiler if you're using like Visual Studio 1124 00:48:57,426 --> 00:48:59,496 that will essentially ward off this attack, 1125 00:49:00,076 --> 00:49:02,986 but it's frankly just lack of knowledge or lack of savvy 1126 00:49:03,076 --> 00:49:06,376 that continues to, that developers continue 1127 00:49:06,376 --> 00:49:09,256 to write code that's vulnerable to this and if you read 1128 00:49:09,256 --> 00:49:10,196 about some attack 1129 00:49:10,196 --> 00:49:12,816 on the Internet involving so-called buffer overflow 1130 00:49:12,816 --> 00:49:14,366 or buffer over run exploit, 1131 00:49:14,766 --> 00:49:16,806 what they're describing is something like this. 1132 00:49:16,906 --> 00:49:19,866 Someone had an array and someone did not check the bounds 1133 00:49:19,866 --> 00:49:23,206 of that array and just blindly copy memory data into it 1134 00:49:23,206 --> 00:49:24,676 or just blindly tried to print it 1135 00:49:24,676 --> 00:49:27,756 or just generally did not check the bounds of that array. 1136 00:49:28,326 --> 00:49:28,426 Yeah? 1137 00:49:29,016 --> 00:49:29,136 >> Student: [inaudible]. 1138 00:49:29,136 --> 00:49:30,366 >> David: Good question. 1139 00:49:31,056 --> 00:49:40,406 If you yourself have written the code 1140 00:49:40,896 --> 00:49:46,206 and you know how big the memory is and will be, it might be safe 1141 00:49:46,206 --> 00:49:48,736 to say, okay, you know what you're doing then go ahead 1142 00:49:48,736 --> 00:49:51,496 and not bother checking memory but the reality is 1143 00:49:51,526 --> 00:49:53,946 if you actually pursue this in a professional capacity 1144 00:49:53,946 --> 00:49:55,246 and research capacity, you're not going 1145 00:49:55,246 --> 00:49:57,016 to be the only programmer on some projects. 1146 00:49:57,016 --> 00:49:59,006 So, whereas you might know what you're doing the next person 1147 00:49:59,246 --> 00:50:01,026 might not realize you made some assumption 1148 00:50:01,306 --> 00:50:03,186 so frankly the best rule of thumb especially 1149 00:50:03,186 --> 00:50:05,476 since we have millions of CPU cycles 1150 00:50:05,476 --> 00:50:08,276 at our disposal these days is spin some on the additional 1151 00:50:08,276 --> 00:50:11,816 if condition just to protect you from yourself or some colleague. 1152 00:50:13,236 --> 00:50:16,156 So, with that said any other questions? 1153 00:50:17,596 --> 00:50:27,466 Don't try and run off and do this tonight. 1154 00:50:28,106 --> 00:50:28,173 Yeah? 1155 00:50:28,379 --> 00:50:30,380 >> Student: [inaudible]. 1156 00:50:30,586 --> 00:50:32,966 >> David: So parameter is the input to a function, 1157 00:50:32,966 --> 00:50:34,916 the thing you put inside the parenthesis when you call it. 1158 00:50:35,416 --> 00:50:37,146 That's an argument aka parameter. 1159 00:50:37,146 --> 00:50:38,966 >> Student: [inaudible]. 1160 00:50:38,966 --> 00:50:41,236 >> David: Yep, exactly. 1161 00:50:41,736 --> 00:50:45,806 >> Student: [inaudible]. 1162 00:50:46,306 --> 00:50:48,136 >> David: So, good question, what goes on the stack, 1163 00:50:48,136 --> 00:50:49,376 what goes on the heap to recap. 1164 00:50:49,606 --> 00:50:52,396 What goes on the stack as we've seen are local variables, 1165 00:50:53,036 --> 00:50:54,566 parameters, aka arguments, 1166 00:50:54,696 --> 00:50:57,206 and also it appears today something called return 1167 00:50:57,206 --> 00:50:59,756 addresses and there's some other stuff, too, that I'll gloss 1168 00:50:59,756 --> 00:51:00,826 over for today's purposes, 1169 00:51:00,826 --> 00:51:04,306 but this frame pointer there's other sophisticated numbers 1170 00:51:04,306 --> 00:51:06,156 that end up on the stack for useful purposes 1171 00:51:06,156 --> 00:51:07,526 but they're not as germane right now. 1172 00:51:07,796 --> 00:51:09,846 So those three things; arguments, aka parameters, 1173 00:51:10,306 --> 00:51:12,596 local variables and return addresses. 1174 00:51:12,596 --> 00:51:15,686 What goes on the heap any time you call malloc, 1175 00:51:15,686 --> 00:51:17,546 which we're going to start doing today onward 1176 00:51:17,546 --> 00:51:20,096 with increased frequency, ends up on the heap. 1177 00:51:20,656 --> 00:51:22,876 So, anytime you call malloc, it ends up on the heap. 1178 00:51:22,876 --> 00:51:24,236 So that's actually a good segue. 1179 00:51:24,236 --> 00:51:26,356 So, let me go ahead and pull up some 1180 00:51:26,356 --> 00:51:30,416 of the code we started that's among last week's printouts 1181 00:51:30,646 --> 00:51:34,286 and that's namely this one, structs one dot C. 1182 00:51:34,366 --> 00:51:36,626 And let's introduce a new feature that's actually quite 1183 00:51:36,696 --> 00:51:39,266 useful so that we can get away from stupid little examples 1184 00:51:39,266 --> 00:51:40,976 where all we do is store chars and ints. 1185 00:51:40,976 --> 00:51:43,286 Let's actually start storing some more real-world data. 1186 00:51:43,656 --> 00:51:48,526 So this is structs1.c. It is from, it's the last program 1187 00:51:48,526 --> 00:51:51,916 from last week's printout and then we'll, oh, actually 1188 00:51:51,916 --> 00:51:54,286 and I did just in case you didn't bring that it's also 1189 00:51:54,286 --> 00:51:55,816 in today's printout at the very end. 1190 00:51:55,816 --> 00:51:57,796 It's a duplicate of last weeks. 1191 00:51:58,076 --> 00:52:02,966 So, in structs1.c, I appear to be doing something a big crazy. 1192 00:52:02,966 --> 00:52:05,436 Even though we've only talked about ints and chars and floats 1193 00:52:05,436 --> 00:52:08,366 and doubles and strings and pointers and all of this turns 1194 00:52:08,366 --> 00:52:11,246 out there's a data type called student, but not quite. 1195 00:52:11,646 --> 00:52:13,476 The goal of this program ultimately is 1196 00:52:13,516 --> 00:52:15,686 to store some student's specific information. 1197 00:52:15,686 --> 00:52:18,196 I want to actually implement a slightly more real-world program 1198 00:52:18,486 --> 00:52:21,066 involving students, maybe like the registrar's database, 1199 00:52:21,296 --> 00:52:23,976 and I'm assuming every student has an ID number like an int, 1200 00:52:24,126 --> 00:52:27,996 a name, a string, and a house, a string also, but I don't want 1201 00:52:28,056 --> 00:52:30,776 to have like three different variables for every student 1202 00:52:30,776 --> 00:52:32,826 because if I have two students that's six variables, 1203 00:52:32,896 --> 00:52:34,276 three students that's nine variables, 1204 00:52:34,436 --> 00:52:37,166 a 100 students that's 300 variables feels 1205 00:52:37,166 --> 00:52:39,816 like this is an opportunity not only for an array 1206 00:52:39,816 --> 00:52:42,546 but it would also be nice to bundle up all 1207 00:52:42,546 --> 00:52:46,516 of each student's variables into just one sort of metavariable. 1208 00:52:46,516 --> 00:52:49,476 One container variable inside of which is their ID 1209 00:52:49,726 --> 00:52:51,836 and their name and their house. 1210 00:52:51,946 --> 00:52:53,276 So it turns out we can do that. 1211 00:52:53,466 --> 00:52:56,026 At the very top of this file notice I'm just using this 1212 00:52:56,026 --> 00:52:57,446 convention, which is pretty common. 1213 00:52:57,486 --> 00:52:58,966 If you have some data structures 1214 00:52:59,016 --> 00:53:02,086 that you yourself have created it's very typical to put them 1215 00:53:02,086 --> 00:53:05,566 into a header file, a .h file and now notice 1216 00:53:05,566 --> 00:53:08,856 as you might have seen in recent distribution code any time 1217 00:53:08,856 --> 00:53:10,106 you're using a standard, 1218 00:53:10,106 --> 00:53:13,666 official library whether it's CS50's or the standard libraries 1219 00:53:13,666 --> 00:53:16,156 that come with C itself, you use angled brackets. 1220 00:53:16,506 --> 00:53:18,716 If you want to write your own dot H file, 1221 00:53:18,716 --> 00:53:20,736 which you're given, in fact, for Sudoku. 1222 00:53:20,736 --> 00:53:22,306 There is a sudoku.h file. 1223 00:53:22,606 --> 00:53:25,716 You use double quotes just to make clear this is my file 1224 00:53:25,716 --> 00:53:27,096 in this directory it's not 1225 00:53:27,096 --> 00:53:30,026 in the special directories owned by the system. 1226 00:53:30,346 --> 00:53:33,276 So inside of structs.h must be something interesting. 1227 00:53:33,276 --> 00:53:36,206 Let's go into structs.h, which you should have as well. 1228 00:53:36,266 --> 00:53:37,756 Not much. It's mostly comments, 1229 00:53:38,096 --> 00:53:40,496 but notice one new feature of C here. 1230 00:53:40,936 --> 00:53:42,656 It turns out if you want 1231 00:53:42,656 --> 00:53:45,146 to bundle together multiple variables just 1232 00:53:45,146 --> 00:53:48,176 because conceptually it makes more sense to lump them together 1233 00:53:48,176 --> 00:53:50,676 and call them by one name, for instance, student. 1234 00:53:51,026 --> 00:53:55,506 You can say this, type def for type definition, struct. 1235 00:53:55,506 --> 00:53:58,246 So typedef strict then a pair of curly braces 1236 00:53:58,456 --> 00:54:00,876 and inside the curly braces you simply list one 1237 00:54:00,876 --> 00:54:04,396 by one semicolons at the end of each line what variables, 1238 00:54:04,446 --> 00:54:06,946 what types of variables you want to associate 1239 00:54:06,946 --> 00:54:08,276 with this particular structure. 1240 00:54:08,586 --> 00:54:12,096 Then outside the curly brace you give these type a name. 1241 00:54:12,286 --> 00:54:13,336 I'm going to go with the obvious, 1242 00:54:13,336 --> 00:54:17,306 I'm going to call this student so now hence forth C 1243 00:54:17,306 --> 00:54:20,656 or gcc not only knows about ints and chars and floats and strings 1244 00:54:20,656 --> 00:54:23,476 and all of that it will also know about a new data type 1245 00:54:23,476 --> 00:54:26,476 that you created called student and inside 1246 00:54:26,476 --> 00:54:30,126 of a student struct is going to be three variables 1247 00:54:30,126 --> 00:54:31,976 that you can access by name as well. 1248 00:54:32,456 --> 00:54:33,826 So now let's actually use this. 1249 00:54:33,996 --> 00:54:36,346 There's a new piece of syntax here 1250 00:54:36,346 --> 00:54:38,096 but it's actually pretty straightforward. 1251 00:54:38,096 --> 00:54:41,416 Let's go ahead and open up structs1.c again. 1252 00:54:41,416 --> 00:54:42,306 Let me scroll down. 1253 00:54:42,306 --> 00:54:43,666 First take note that just 1254 00:54:43,666 --> 00:54:46,046 so that I have a fixed sized program I didn't want 1255 00:54:46,046 --> 00:54:48,106 to mix examples so I went back 1256 00:54:48,176 --> 00:54:49,586 to the approach of using a constant. 1257 00:54:49,586 --> 00:54:52,156 I'm not going to use malloc dynamically just yet. 1258 00:54:52,156 --> 00:54:54,646 I'm going to define number of students supported 1259 00:54:54,646 --> 00:54:56,836 by this program to be hard coded as three just 1260 00:54:56,836 --> 00:54:58,846 so we have something slightly interesting to play with. 1261 00:54:59,216 --> 00:55:00,406 Now notice what main does. 1262 00:55:00,636 --> 00:55:04,776 main allocates an array called class and inside 1263 00:55:04,776 --> 00:55:07,196 of that array are not ints, not chars not floats, 1264 00:55:07,196 --> 00:55:08,936 but student structure. 1265 00:55:08,936 --> 00:55:11,556 So the syntax is the same it's declaring an array of ints 1266 00:55:11,666 --> 00:55:14,466 and array of chars or anything else I just now say student. 1267 00:55:14,926 --> 00:55:16,106 Now here's a for loop. 1268 00:55:16,106 --> 00:55:19,126 It's going to iterate from 0 up to 3 because students 1269 00:55:19,126 --> 00:55:22,816 in all caps is a constant called initialize the 3 1270 00:55:22,816 --> 00:55:26,606 and then there's some slightly new syntax mixed with old. 1271 00:55:26,686 --> 00:55:29,236 First I use printf and I say students ID, 1272 00:55:29,546 --> 00:55:33,376 I'm calling get int, but now I don't have an int variable 1273 00:55:33,376 --> 00:55:33,956 per se. 1274 00:55:34,196 --> 00:55:37,556 I have a student variable inside of which is an int 1275 00:55:37,866 --> 00:55:40,896 so the new piece of syntax here is dot notation. 1276 00:55:41,146 --> 00:55:44,996 If you want to go inside of a struct and class[i] is going 1277 00:55:45,206 --> 00:55:47,926 to be the i'th student in this array, 1278 00:55:48,456 --> 00:55:52,086 if I want to get access not just to the student structure itself 1279 00:55:52,386 --> 00:55:55,386 but inside that structure is an ID field. 1280 00:55:55,386 --> 00:55:58,896 I go .id and I assign it the return value of GetInt. 1281 00:55:59,416 --> 00:56:03,376 Now next recall that name and house were declared as char *s. 1282 00:56:03,376 --> 00:56:06,006 So those are strings, which means I can call get string 1283 00:56:06,006 --> 00:56:07,266 and assign them a value, 1284 00:56:07,606 --> 00:56:09,496 technically assignment the address of a string, 1285 00:56:09,496 --> 00:56:12,026 but I can assign them a return value of get string 1286 00:56:12,326 --> 00:56:13,486 so I use the same notation. 1287 00:56:13,486 --> 00:56:14,546 Get the i'th student 1288 00:56:14,546 --> 00:56:17,846 in the array dot name equals get string. 1289 00:56:17,846 --> 00:56:19,416 And now notice this would be wrong 1290 00:56:20,086 --> 00:56:24,126 because what does class bracket I represent? 1291 00:56:24,686 --> 00:56:26,696 It represents a student struct. 1292 00:56:26,696 --> 00:56:30,816 So you can't assign a string on the right hand side to a student 1293 00:56:30,916 --> 00:56:34,006 on the left hand side, but if you dive into that structure 1294 00:56:34,006 --> 00:56:37,786 and say give me the name field inside of that student struct, 1295 00:56:38,136 --> 00:56:39,546 then absolutely can you assign 1296 00:56:39,546 --> 00:56:41,826 that name the return value of get string. 1297 00:56:42,116 --> 00:56:42,236 Yeah? 1298 00:56:42,456 --> 00:56:44,456 >> Student: [inaudible] 1299 00:56:44,676 --> 00:56:48,006 >> David: How is a struct stored and memory? 1300 00:56:48,046 --> 00:56:55,176 So essentially when you declare something simple like an int. 1301 00:56:55,506 --> 00:56:59,286 We always draw the as something like this. 1302 00:56:59,386 --> 00:57:02,426 So this is the int called a and it's represented as a box. 1303 00:57:02,456 --> 00:57:05,036 If I now declare a student with something 1304 00:57:05,036 --> 00:57:10,606 like student s semicolon, this too, gives me a chunk of memory 1305 00:57:10,646 --> 00:57:13,056 but let's see it has enough memory for an int 1306 00:57:13,386 --> 00:57:15,536 and then a char * and then another char *. 1307 00:57:15,536 --> 00:57:20,486 So what actually ends up in memory here is an int further ID 1308 00:57:20,826 --> 00:57:24,006 but then next to that in memory is also going to be a char *, 1309 00:57:24,266 --> 00:57:27,006 which by coincidence is also 32 bits. 1310 00:57:27,186 --> 00:57:29,996 The same as an int we said and then also allocated 1311 00:57:29,996 --> 00:57:32,606 for a student is this field we called house, 1312 00:57:32,946 --> 00:57:36,226 but this is also a char * so a pointer. 1313 00:57:36,476 --> 00:57:42,906 So, in fact, I get three, 32 bit chunks of memory back to back 1314 00:57:42,906 --> 00:57:45,236 to back or 12 bytes total contiguous. 1315 00:57:45,336 --> 00:57:46,756 So it looks like this in memory 1316 00:57:46,756 --> 00:57:50,666 or if you ignore the divisions it's just a bigger chunk 1317 00:57:51,186 --> 00:57:52,396 of memory that you're handed 1318 00:57:52,666 --> 00:57:53,786 because you've clustered it together. 1319 00:57:53,786 --> 00:57:54,836 Now that's a slight white lie. 1320 00:57:54,836 --> 00:57:57,676 It's possible that the compiler might give you more memory 1321 00:57:57,676 --> 00:57:59,596 than you actually asked for just so things line 1322 00:57:59,596 --> 00:58:00,986 up in a nice mathematical way, 1323 00:58:01,336 --> 00:58:09,166 but conceptually that's what happens. 1324 00:58:09,516 --> 00:58:09,816 Yeah? 1325 00:58:09,816 --> 00:58:10,716 >> Student: [inaudible] 1326 00:58:10,716 --> 00:58:10,816 >> David: Okay. 1327 00:58:11,316 --> 00:58:16,236 >> Student: [inaudible] 1328 00:58:16,736 --> 00:58:16,906 >> David: Correct. 1329 00:58:17,806 --> 00:58:25,816 So, what's really happening then 1330 00:58:26,056 --> 00:58:32,276 if I declare student class bracket 3 what happens is 1331 00:58:32,276 --> 00:58:33,626 exactly the same story. 1332 00:58:33,626 --> 00:58:36,016 I'm going to have to zoom out to make it fit better, 1333 00:58:36,326 --> 00:58:38,456 but here is the first student's chunk of memory. 1334 00:58:38,676 --> 00:58:41,116 Because this is an array all of the students will be back 1335 00:58:41,276 --> 00:58:44,076 to back to back so the next student ends up here 1336 00:58:44,326 --> 00:58:48,366 and then the last student ends up not to scale down there. 1337 00:58:48,746 --> 00:58:52,366 So this bracket notation though is actually not saying go 1338 00:58:52,366 --> 00:58:54,876 to byte one, go to byte two, go to byte three. 1339 00:58:55,146 --> 00:59:00,436 When I say S or rather when I say class bracket I, 1340 00:59:00,776 --> 00:59:02,156 it's not going to the i'th byte. 1341 00:59:02,456 --> 00:59:05,346 It's going to the i'th data member in this array 1342 00:59:05,616 --> 00:59:09,076 so the computer checks well how big is a student structure? 1343 00:59:09,266 --> 00:59:12,906 It's 12 bytes so when you say class bracket one that's 1344 00:59:12,906 --> 00:59:15,916 actually going 12 bytes away from the first student. 1345 00:59:15,966 --> 00:59:18,106 It's not going for instance right there. 1346 00:59:19,176 --> 00:59:22,286 So all of this happens for you thanks to something we'll look 1347 00:59:22,286 --> 00:59:24,456 at actually before long called pointer arithmetic 1348 00:59:24,606 --> 00:59:26,456 and it's done sort of magically for you, 1349 00:59:26,736 --> 00:59:29,436 but for now just consider the utility of having this ability. 1350 00:59:29,466 --> 00:59:31,246 We now have a struct, we can bundle 1351 00:59:31,246 --> 00:59:34,886 up together inside one variable or an array of variables and ID 1352 00:59:34,886 --> 00:59:37,496 and name and a house together now what's this chunk 1353 00:59:37,496 --> 00:59:39,316 of code here that I've stripped the comments 1354 00:59:39,316 --> 00:59:40,076 from actually doing? 1355 00:59:41,406 --> 00:59:46,876 This for loop in the middle here? 1356 00:59:47,066 --> 00:59:49,656 Let's first see if we can infer from context. 1357 00:59:49,656 --> 00:59:53,656 So it's doing a comparison. 1358 00:59:53,656 --> 00:59:55,716 strcmp is "string comparison." 1359 00:59:55,716 --> 00:59:57,366 It checks two strings for equality. 1360 00:59:58,506 --> 01:00:01,516 So what is this loop doing? 1361 01:00:01,516 --> 01:00:03,016 >> Student: [inaudible] 1362 01:00:03,016 --> 01:00:06,436 >> David: Yeah, it's just going to say, it's going to iterate 1363 01:00:06,436 --> 01:00:08,356 over all the students, three students in this case, 1364 01:00:08,566 --> 01:00:10,816 and it's just going to proclaim so and so [inaudible] 1365 01:00:11,066 --> 01:00:12,326 if that's indeed the case. 1366 01:00:12,326 --> 01:00:16,376 And so you'll see this in actually you will see this 1367 01:00:16,376 --> 01:00:18,416 in Sudoku source code where we use strcmp. 1368 01:00:18,676 --> 01:00:21,456 String compare takes two strings, left and right, 1369 01:00:21,736 --> 01:00:23,416 and it returns 0 if they're equal 1370 01:00:23,416 --> 01:00:25,946 but what's nice is it also returns negative 1 1371 01:00:25,946 --> 01:00:28,346 if 1 alphabetically should come before the other 1372 01:00:28,346 --> 01:00:30,246 or it returns positive 1 1373 01:00:30,286 --> 01:00:32,326 if the other string should go before the other 1374 01:00:32,556 --> 01:00:33,616 so you can actually use this 1375 01:00:33,616 --> 01:00:36,016 as we'll eventually see the sort strings which can be useful 1376 01:00:36,056 --> 01:00:38,316 for dictionaries and spell checkers and things like that, 1377 01:00:38,586 --> 01:00:40,736 but for now we're just using it for a silly proclamation. 1378 01:00:40,736 --> 01:00:43,776 So and so is in Mather but then down here notice what I'm doing. 1379 01:00:44,176 --> 01:00:46,876 Just as I called malloc last week, 1380 01:00:47,166 --> 01:00:49,726 it turns out you guys have been writing buggy code 1381 01:00:49,976 --> 01:00:52,306 with our blessing for five weeks now. 1382 01:00:52,336 --> 01:00:56,636 Every time you call get string that string has to be stored not 1383 01:00:56,636 --> 01:00:59,056 in the stack because if the words you typed 1384 01:00:59,056 --> 01:00:59,776 in were getting stored 1385 01:00:59,776 --> 01:01:02,486 on the stack you could not trust what get string is returning. 1386 01:01:02,906 --> 01:01:05,866 So, just conceptually based on the definition we've been using 1387 01:01:05,946 --> 01:01:09,316 in recent days of memory when you call get string 1388 01:01:09,406 --> 01:01:10,626 to get a string from the user, 1389 01:01:10,836 --> 01:01:13,656 where in this picture must the words you type 1390 01:01:13,656 --> 01:01:14,656 in be getting stored? 1391 01:01:15,626 --> 01:01:17,026 So in the heap, right? 1392 01:01:17,026 --> 01:01:18,786 But if they're stored in the heap 1393 01:01:18,906 --> 01:01:20,706 and you're just using those strings 1394 01:01:20,706 --> 01:01:22,086 and you're never actually saying 1395 01:01:22,086 --> 01:01:24,446 to the operating system I am done with this string, 1396 01:01:24,736 --> 01:01:27,766 you actually for five weeks now have been leaking memory sort 1397 01:01:27,766 --> 01:01:29,886 of unbeknownst to you or maybe you didn't notice as much 1398 01:01:30,096 --> 01:01:33,336 because what get string actually calls itself underneath the hood 1399 01:01:33,586 --> 01:01:36,626 is this function called malloc and so now hence forth 1400 01:01:36,866 --> 01:01:38,896 as we start to take off some of these training wheels 1401 01:01:39,176 --> 01:01:42,046 because I've called get string which itself calls malloc 1402 01:01:42,386 --> 01:01:46,896 if I want to free the memory that I asked for implicitly 1403 01:01:46,896 --> 01:01:48,006 by calling get string, 1404 01:01:48,296 --> 01:01:50,776 I just have to call this new function it's not hard to use, 1405 01:01:51,236 --> 01:01:54,966 called free, and I pass it, the pointer that's pointing 1406 01:01:54,966 --> 01:01:56,826 at the chunk of memory that was handed to me. 1407 01:01:57,046 --> 01:01:59,626 So notice here I'm using get strings return value 1408 01:01:59,626 --> 01:02:02,676 and remembering it as the name and we said last week that, yes, 1409 01:02:02,676 --> 01:02:03,926 this is a string conceptually, 1410 01:02:03,926 --> 01:02:05,446 but underneath the hood it's the address 1411 01:02:05,576 --> 01:02:06,906 of those characters in memory. 1412 01:02:07,286 --> 01:02:08,306 Same deal for house. 1413 01:02:08,566 --> 01:02:12,366 Well, later it's really proper now to be freeing the memory 1414 01:02:12,366 --> 01:02:14,406 that we've asked the operating system for. 1415 01:02:14,406 --> 01:02:15,946 So, let's go ahead and run this program. 1416 01:02:16,316 --> 01:02:20,486 So let me go ahead and make struct 1, let me go ahead 1417 01:02:20,486 --> 01:02:25,046 and run structs 1 and let's do student ID 1, David, Mather. 1418 01:02:25,046 --> 01:02:27,846 Student ID 2, Cansu, Eliot. 1419 01:02:28,006 --> 01:02:31,586 Student ID 3, Yuhki, lives in Mather. 1420 01:02:31,586 --> 01:02:34,556 Enter. And so we see David is in Mather, Yuhki is in Mather, 1421 01:02:34,556 --> 01:02:36,896 but the point really is that underneath the hood I was able 1422 01:02:36,896 --> 01:02:39,726 to store all three of those variables, data types, 1423 01:02:40,006 --> 01:02:43,526 for each student in its own struct and then bundled them 1424 01:02:43,526 --> 01:02:45,986 up together in print just the ones I cared about. 1425 01:02:45,986 --> 01:02:48,546 Let me go ahead into the second version of this. 1426 01:02:48,586 --> 01:02:52,056 So this is structs2.c. It's also toward the end 1427 01:02:52,056 --> 01:02:53,056 of this week's printout. 1428 01:02:54,126 --> 01:02:56,166 This is kind of a useless program I just ran. 1429 01:02:56,166 --> 01:02:59,376 All I did was ask you for all this input, nine lines of input; 1430 01:02:59,376 --> 01:03:01,986 ID numbers, names, and houses and then I did something stupid 1431 01:03:01,986 --> 01:03:05,916 like check for "Mather" and then I quit which meant all 1432 01:03:05,916 --> 01:03:08,036 that hard work from the user was really for naught. 1433 01:03:08,236 --> 01:03:09,986 So suppose we want to do something reasonable 1434 01:03:09,986 --> 01:03:13,136 like actually start storing information from users to disks. 1435 01:03:13,436 --> 01:03:16,126 Well, thus far you guys have only stored things in variables, 1436 01:03:16,126 --> 01:03:17,476 maybe in arrays, but at the end 1437 01:03:17,476 --> 01:03:20,396 of the day the program runs any work you did while running 1438 01:03:20,396 --> 01:03:23,596 that program, any keystrokes you typed, get lost forever. 1439 01:03:23,866 --> 01:03:26,486 So suppose we now want to start creating own our files 1440 01:03:26,716 --> 01:03:30,006 and actually store the data on disks it's actually pretty easy 1441 01:03:30,006 --> 01:03:31,616 and we can use a function that's very, 1442 01:03:31,616 --> 01:03:33,526 very similar to printf to do this. 1443 01:03:33,896 --> 01:03:36,676 So, this is up top the exact same program. 1444 01:03:36,976 --> 01:03:40,386 I declare an array of students called class. 1445 01:03:40,786 --> 01:03:43,616 I then call this loop just to get some data from the human, 1446 01:03:43,836 --> 01:03:47,266 but now rather than just do the silly little Mather comparison 1447 01:03:47,266 --> 01:03:50,476 in the middle here notice I've added a chunk of code down here. 1448 01:03:50,476 --> 01:03:52,476 Now at first glance it might look a little cryptic, 1449 01:03:52,476 --> 01:03:54,756 but let's tease it apart line by line. 1450 01:03:55,566 --> 01:03:57,406 So here's a new function, open. 1451 01:03:57,656 --> 01:03:58,686 It's called "file open." 1452 01:03:58,866 --> 01:04:00,176 So this opens a file. 1453 01:04:00,446 --> 01:04:02,716 Now that file doesn't have to exist in advance 1454 01:04:03,106 --> 01:04:05,636 because you can first specify the name of a file 1455 01:04:05,926 --> 01:04:09,276 but then the second argument to F open is what's called the mode 1456 01:04:09,276 --> 01:04:12,216 and if you put in "r" that means read this file. 1457 01:04:12,246 --> 01:04:14,626 Load it from disk, but that's not the point of this program. 1458 01:04:14,626 --> 01:04:16,596 The point of this program is to write to the disk, 1459 01:04:16,646 --> 01:04:19,616 write to the hard drive, so you pass "w". 1460 01:04:19,886 --> 01:04:21,216 And what this will do is 1461 01:04:21,216 --> 01:04:23,806 if a file called database does not exist 1462 01:04:23,806 --> 01:04:25,356 in your current working directory 1463 01:04:25,686 --> 01:04:27,966 because I've said W it will create it for me. 1464 01:04:27,966 --> 01:04:30,436 It will be empty initially, but it will now exist 1465 01:04:30,526 --> 01:04:31,636 under the name database. 1466 01:04:32,236 --> 01:04:33,636 Here's a little sanity check. 1467 01:04:33,996 --> 01:04:36,956 F opens returns a pointer to a new data type 1468 01:04:37,006 --> 01:04:39,716 so thus far we've only talked about built-in types like ints 1469 01:04:39,716 --> 01:04:42,356 and chars and floats, we made up our own called student, 1470 01:04:42,476 --> 01:04:44,896 but there is another one that's built in that's quite useful. 1471 01:04:44,966 --> 01:04:47,636 For whatever reason it's in all caps, but it's called FILE 1472 01:04:48,066 --> 01:04:51,156 so this is returning one of these things called a pointer 1473 01:04:51,496 --> 01:04:54,096 but it's really conceptually a pointer to a file. 1474 01:04:54,196 --> 01:04:56,566 Somewhere on disk, I don't know where it is, I don't really care 1475 01:04:56,566 --> 01:04:59,966 where it is, I just know that via this file pointer, fp, 1476 01:04:59,966 --> 01:05:03,646 by convention can I find that file on disk and add stuff 1477 01:05:03,706 --> 01:05:05,406 to it, but first the sanity check. 1478 01:05:05,406 --> 01:05:08,506 Almost always when you use pointers should you make sure 1479 01:05:08,506 --> 01:05:11,116 they're not NULL because if you follow a NULL pointer what tends 1480 01:05:12,116 --> 01:05:12,956 to happen? 1481 01:05:13,126 --> 01:05:15,296 Seg faults and you've not probably done this too often 1482 01:05:15,296 --> 01:05:18,256 if at all yourselves, but very soon will this be a risk. 1483 01:05:18,686 --> 01:05:19,726 So here's a loop. 1484 01:05:19,976 --> 01:05:22,076 I'm just iterating over all of my students 1485 01:05:22,186 --> 01:05:24,946 and now I'm calling this other function that's probably new 1486 01:05:25,146 --> 01:05:27,956 called F printf but F just stands for file. 1487 01:05:27,956 --> 01:05:29,576 So this is not printf to the screen, 1488 01:05:29,576 --> 01:05:31,786 this is file printf, print to a file. 1489 01:05:32,126 --> 01:05:33,596 What file do I want to print to? 1490 01:05:33,826 --> 01:05:35,566 Why, I don't redundantly say the name 1491 01:05:35,566 --> 01:05:36,816 of the file again and again. 1492 01:05:36,816 --> 01:05:40,336 I instead just paste in the name of the pointer that I got back. 1493 01:05:40,686 --> 01:05:42,706 So what this means is I'm going to print 1494 01:05:42,796 --> 01:05:45,756 to this file percent D back slash N. 1495 01:05:45,756 --> 01:05:48,376 So this means print an integer to the file. 1496 01:05:48,376 --> 01:05:50,876 Right integer to the file, what integer? 1497 01:05:51,196 --> 01:05:53,956 Print the i'th students ID then go ahead 1498 01:05:53,956 --> 01:05:57,296 on a new line print the name of that student, the i'th student 1499 01:05:57,726 --> 01:06:00,626 in the classes name and then do the same thing for the house. 1500 01:06:01,006 --> 01:06:02,516 So in the end, and then F close, 1501 01:06:02,516 --> 01:06:04,826 F P once I'm all done just means close the file. 1502 01:06:04,826 --> 01:06:07,626 It's like going to file save in most any program on the computer 1503 01:06:07,886 --> 01:06:10,196 and then at the bottom here I again called free just 1504 01:06:10,196 --> 01:06:11,116 to free up that RAM. 1505 01:06:11,116 --> 01:06:13,126 Now as an aside, even though 1506 01:06:13,126 --> 01:06:15,406 up until now we've been intentionally 1507 01:06:15,406 --> 01:06:18,256 or accidentally writing memory leaks whereby we're asking 1508 01:06:18,256 --> 01:06:20,746 for memory and never giving it back, the reality is 1509 01:06:20,746 --> 01:06:23,706 when you quit a program generally the memory you asked 1510 01:06:23,706 --> 01:06:27,576 for is returned so this isn't necessarily a horrible, 1511 01:06:27,576 --> 01:06:28,986 horrible sin we've been committing, 1512 01:06:29,306 --> 01:06:31,876 but it's certainly not good practice to ask for memory 1513 01:06:31,876 --> 01:06:34,426 and never give it back since there will be situations 1514 01:06:34,426 --> 01:06:36,766 where it doesn't get freed automatically. 1515 01:06:36,766 --> 01:06:38,926 Case in point most of you I think a few weeks ago kind 1516 01:06:38,926 --> 01:06:40,896 of raised your hand or nodded in familiarity 1517 01:06:40,896 --> 01:06:43,706 when I asked have you ever sat in front of your computer 1518 01:06:43,706 --> 01:06:46,306 after using it for a while and the thing is just getting slower 1519 01:06:46,306 --> 01:06:47,486 and slower and slower? 1520 01:06:47,716 --> 01:06:49,666 You try to tab between different programs 1521 01:06:49,666 --> 01:06:51,376 or click a different icon and it takes forever 1522 01:06:51,376 --> 01:06:52,876 for the windows to actually change. 1523 01:06:53,146 --> 01:06:55,956 That's generally indicative of your running out of memory. 1524 01:06:56,156 --> 01:06:57,436 Now that might not be your fault. 1525 01:06:57,436 --> 01:06:59,846 It might be because some of the programs you ran 1526 01:06:59,916 --> 01:07:02,476 since you first booted your computer asked for RAM 1527 01:07:02,796 --> 01:07:04,626 and never actually gave it back 1528 01:07:04,626 --> 01:07:07,246 and so the computer thinks it doesn't have much memory to play 1529 01:07:07,246 --> 01:07:08,666 with so things slow down. 1530 01:07:08,946 --> 01:07:10,456 So, let's see what actually happens here. 1531 01:07:10,456 --> 01:07:14,566 So make structs 2 let me go ahead and run structs 2 1532 01:07:14,826 --> 01:07:16,106 and let's do the same kind of input. 1533 01:07:16,106 --> 01:07:22,126 Student 1, David Mather, student 2, Cansu Eliot, student 3, 1534 01:07:22,556 --> 01:07:24,786 Yuhki, and then Mather, Enter. 1535 01:07:25,106 --> 01:07:28,066 Okay nothing seems to have happened, but if I type LS now 1536 01:07:28,286 --> 01:07:30,486 for list files there's a whole bunch of stuff 1537 01:07:30,486 --> 01:07:33,396 that we'll be able to play with her but the interesting one is, 1538 01:07:33,526 --> 01:07:35,416 ah, this one, database. 1539 01:07:35,606 --> 01:07:37,546 Let me go ahead and open up database. 1540 01:07:37,766 --> 01:07:39,496 This was actually a file that I wrote. 1541 01:07:39,836 --> 01:07:41,726 So now what does it mean to create a file? 1542 01:07:41,726 --> 01:07:43,186 Well, most of you guys write essays 1543 01:07:43,186 --> 01:07:46,956 in things called .doc files or .dock these days 1544 01:07:46,956 --> 01:07:48,466 and Excel files are dot xls. 1545 01:07:48,836 --> 01:07:51,496 And there's all of these file extensions, file formats 1546 01:07:51,546 --> 01:07:52,476 that we're familiar with. 1547 01:07:52,476 --> 01:07:53,436 What does that really mean? 1548 01:07:53,766 --> 01:07:56,126 Well, this is now the CS50 file format. 1549 01:07:56,126 --> 01:07:58,576 We just arbitrarily decided that to store students 1550 01:07:58,576 --> 01:08:00,876 in a database we're just going to write each student 1551 01:08:00,876 --> 01:08:02,946 on a triple of lines, one after the other, 1552 01:08:03,106 --> 01:08:04,376 and then for the next student we're going 1553 01:08:04,376 --> 01:08:07,236 to write the next lines of data for that student 1554 01:08:07,476 --> 01:08:08,376 and then again and again. 1555 01:08:08,616 --> 01:08:11,146 So, if we were to read these same students in for memory, 1556 01:08:11,356 --> 01:08:13,716 we would have to have a loop that reads three lines 1557 01:08:13,716 --> 01:08:15,726 at a time, grabs an int from each 1558 01:08:16,006 --> 01:08:18,916 and then stores them somewhere in memory. 1559 01:08:18,996 --> 01:08:23,036 So this is a file format in only so far as we came 1560 01:08:23,036 --> 01:08:24,446 up with this arbitrary format 1561 01:08:24,796 --> 01:08:26,466 and we've decided how this data is going 1562 01:08:26,466 --> 01:08:27,356 to be laid out in memory. 1563 01:08:27,466 --> 01:08:31,026 Same as Microsoft done with the dot doc format they just put 1564 01:08:31,026 --> 01:08:33,696 things in places they want them to be 1565 01:08:33,696 --> 01:08:35,816 and they call it their own proprietary format. 1566 01:08:36,136 --> 01:08:39,486 So now let's look at what we've actually been using that's put 1567 01:08:39,526 --> 01:08:43,006 these memory, that's created these leaks in the first place. 1568 01:08:43,056 --> 01:08:44,596 So let me go ahead and scroll down here. 1569 01:08:44,856 --> 01:08:48,376 This is CS50 dot C. So this is the first printout from today. 1570 01:08:48,696 --> 01:08:51,196 We won't spend too much time on the library itself 1571 01:08:51,196 --> 01:08:53,576 because it's a lot more fun to just take for granted 1572 01:08:53,576 --> 01:08:56,336 that these things work but let's at least zoom in on one of them. 1573 01:08:56,336 --> 01:08:57,806 So, if you scroll down visually 1574 01:08:58,166 --> 01:09:01,146 to the function called get string, let's just get a taste 1575 01:09:01,146 --> 01:09:03,726 of what get string has been doing for you all this time 1576 01:09:03,946 --> 01:09:07,496 and frankly have perhaps a bit of appreciation for the headache 1577 01:09:07,496 --> 01:09:11,296 that is getting user input on, say, Week 0 of a class 1578 01:09:11,386 --> 01:09:13,756 if you're not handed helper functions like this. 1579 01:09:13,756 --> 01:09:16,326 Even in languages like Java it's a little easier 1580 01:09:16,326 --> 01:09:17,776 but it's not nearly as easy 1581 01:09:17,776 --> 01:09:19,976 as just calling a function like get string. 1582 01:09:20,496 --> 01:09:21,936 So what do I appear to be doing 1583 01:09:21,936 --> 01:09:23,796 in this implementation of get string? 1584 01:09:23,796 --> 01:09:26,376 So this is, again, the code in the CS50 library 1585 01:09:26,516 --> 01:09:28,566 that you've been taking for granted for a couple of weeks. 1586 01:09:29,046 --> 01:09:32,516 So the first line of code here string buffer gets NULL. 1587 01:09:32,516 --> 01:09:35,346 So it turns out get string does not know in advance, of course, 1588 01:09:35,536 --> 01:09:38,326 how long a string, how long a word or sentence you're going 1589 01:09:38,326 --> 01:09:40,076 to type in so we assume 0. 1590 01:09:40,386 --> 01:09:42,436 We assume that you're not going to type anything just 1591 01:09:42,436 --> 01:09:44,296 so we start from an arbitrary starting point. 1592 01:09:44,296 --> 01:09:47,306 Maybe not the most efficient, but at least we can't go wrong 1593 01:09:47,306 --> 01:09:50,176 if we subsequently grow the amount of space we're using. 1594 01:09:50,516 --> 01:09:53,976 So, I declare an int called capacity saying there are no 1595 01:09:53,976 --> 01:09:57,436 bytes available yet and then I declare another int called N, 1596 01:09:57,436 --> 01:10:00,126 which specifies how many characters are in the buffer. 1597 01:10:00,126 --> 01:10:03,026 So in other words conceptually, get string has a buffer. 1598 01:10:03,106 --> 01:10:05,206 It's initially this big and we keep track 1599 01:10:05,206 --> 01:10:07,056 of not only how big it can be 1600 01:10:07,366 --> 01:10:09,036 but also how big it currently is. 1601 01:10:09,126 --> 01:10:11,646 So, at the moment it can fit 0 characters 1602 01:10:11,646 --> 01:10:12,976 and it has 0 characters. 1603 01:10:13,296 --> 01:10:14,666 Now what's the deal with unsigned? 1604 01:10:14,906 --> 01:10:17,906 Turns out with a lot of C data types you can prefix them 1605 01:10:17,906 --> 01:10:20,116 with keywords like signed or unsigned. 1606 01:10:20,496 --> 01:10:24,136 Why might it make sense to call capacity and int unsigned ints? 1607 01:10:26,306 --> 01:10:27,166 What's that? 1608 01:10:27,166 --> 01:10:27,916 >> Student: [inaudible] 1609 01:10:27,916 --> 01:10:28,966 >> David: I haven't initialized it, 1610 01:10:28,966 --> 01:10:32,666 but more importantly it's not going to be negative, right? 1611 01:10:32,666 --> 01:10:34,376 I can't have a negative amount of space. 1612 01:10:34,376 --> 01:10:36,936 Contrary to some of the silly error messages we saw today, 1613 01:10:36,936 --> 01:10:39,176 you can't have a negative amount of space and 1614 01:10:39,176 --> 01:10:42,056 yet if I'm allowing my numbers to just be an int recall 1615 01:10:42,056 --> 01:10:44,416 that an int can be negative and positive 1616 01:10:44,686 --> 01:10:45,756 but that's pretty wasteful 1617 01:10:45,756 --> 01:10:47,616 because if you use just a signed int 1618 01:10:47,726 --> 01:10:51,216 by default integers are signed if you don't specify signed 1619 01:10:51,216 --> 01:10:54,686 or unsigned, by default they can be negative or positive 1620 01:10:54,866 --> 01:10:58,986 which means in 32 bit int you are throwing away 2 billion 1621 01:10:58,986 --> 01:11:01,486 possible values if you know you're never going 1622 01:11:01,486 --> 01:11:05,186 to use the negative ones you're just wasting all of those bits. 1623 01:11:05,186 --> 01:11:07,006 So rather if I say unsigned, 1624 01:11:07,006 --> 01:11:09,616 this means I can now have a string that's between 0 1625 01:11:09,616 --> 01:11:14,156 and not 2 billion but 0 and 4 billion characters long. 1626 01:11:14,156 --> 01:11:15,846 Now it may be excessive in fairness, 1627 01:11:16,106 --> 01:11:19,216 but it's such an easy fix here to not throw away half 1628 01:11:19,216 --> 01:11:20,926 of my number space so I went ahead 1629 01:11:20,926 --> 01:11:22,536 and prefixed it with unsigned. 1630 01:11:22,846 --> 01:11:25,206 So, now let's go ahead and read through what happens next. 1631 01:11:25,506 --> 01:11:27,636 I have an int called C, which is actually going 1632 01:11:27,636 --> 01:11:29,436 to represent a character and it turns 1633 01:11:29,436 --> 01:11:32,386 out we're using a C function called F get C 1634 01:11:32,686 --> 01:11:35,326 to get 1 character after another from the user. 1635 01:11:35,386 --> 01:11:37,716 So I'm not going to dwell too long on these lines because, 1636 01:11:37,716 --> 01:11:39,276 again, we handed this to you for a reason 1637 01:11:39,276 --> 01:11:40,806 because this is more sophistication early 1638 01:11:40,806 --> 01:11:42,566 on than I think is all that enlightening, 1639 01:11:42,896 --> 01:11:44,616 but notice conceptually what we're doing. 1640 01:11:44,956 --> 01:11:48,616 F gets C's purpose in life is to get from a special thing 1641 01:11:48,616 --> 01:11:50,686 from standard in, S-T-D-I-N. 1642 01:11:50,936 --> 01:11:52,806 This essentially represents the keyboard. 1643 01:11:53,126 --> 01:11:56,736 So this function F get C gets one character 1644 01:11:56,736 --> 01:11:58,506 at a time from the keyboard. 1645 01:11:58,856 --> 01:12:02,906 This check here is making sure that the current character C 1646 01:12:03,156 --> 01:12:05,606 that we just got from the keyboard is not a special 1647 01:12:05,606 --> 01:12:06,636 character called EOF. 1648 01:12:06,636 --> 01:12:08,866 So the way a computer signifies, huh-uh, 1649 01:12:09,156 --> 01:12:10,706 there's nothing else that the users typed. 1650 01:12:10,706 --> 01:12:12,486 They hit enter or they hit control D 1651 01:12:12,486 --> 01:12:16,596 or they stopped typing, EOF is a special symbol that signifies 1652 01:12:16,596 --> 01:12:18,356 to the computer there's nothing more here 1653 01:12:18,356 --> 01:12:21,266 and that's why this loop will terminate as soon as we hit EOF. 1654 01:12:21,806 --> 01:12:24,186 Now, let's ignore all of these lines for just a moment 1655 01:12:24,186 --> 01:12:26,676 and fast forward just to the bottom of this loop. 1656 01:12:26,846 --> 01:12:29,796 At the very bottom of this loop is this line of code here. 1657 01:12:30,076 --> 01:12:32,656 This buffer, which initially is this size, 1658 01:12:32,986 --> 01:12:36,166 apparently I just keep storing character after character 1659 01:12:36,166 --> 01:12:39,206 after character inside of this buffer, but I can't do that 1660 01:12:39,206 --> 01:12:42,126 yet because initially the size of this buffer is zero. 1661 01:12:42,436 --> 01:12:45,936 So the code I just glossed over is a whole bunch of lines 1662 01:12:45,936 --> 01:12:47,646 of code whose purpose in life is 1663 01:12:47,676 --> 01:12:51,416 to check all right is the current size, N, 1664 01:12:51,776 --> 01:12:54,216 plus 1 over our capacity? 1665 01:12:54,256 --> 01:12:56,676 In other words, if I allocated let's say 8 bytes 1666 01:12:56,676 --> 01:13:00,806 and I'm already at 8 so 8 plus 1 is 9 and 9 is greater 1667 01:13:00,806 --> 01:13:02,256 than 8, I'm out of space. 1668 01:13:02,346 --> 01:13:04,056 That means the users typed a ninth character, 1669 01:13:04,056 --> 01:13:06,406 my buffer is this big, and I need some more space 1670 01:13:06,406 --> 01:13:08,466 over here so what do I do? 1671 01:13:08,696 --> 01:13:09,906 You have to jump through some hoops. 1672 01:13:10,276 --> 01:13:12,526 I first check and make sure that I'm not going to run 1673 01:13:12,526 --> 01:13:15,546 out of memory completely because of the maximum size of an int 1674 01:13:15,586 --> 01:13:17,756 but I'm going to completely wave my hands at this one 1675 01:13:17,956 --> 01:13:19,156 and let's just focus on the one 1676 01:13:19,156 --> 01:13:23,056 that uses a more familiar function or a cousin thereof. 1677 01:13:23,416 --> 01:13:24,516 Notice what I do here. 1678 01:13:24,786 --> 01:13:28,446 There's another function besides malloc called realloc whose name 1679 01:13:28,446 --> 01:13:30,296 implies reallocate memory. 1680 01:13:30,656 --> 01:13:34,326 So notice what I'm doing here is I'm reallocating the buffer 1681 01:13:34,716 --> 01:13:38,146 to a new size because if we scroll up and this is, again, 1682 01:13:38,146 --> 01:13:39,376 why we hand this function to you, 1683 01:13:39,856 --> 01:13:41,426 the key detail is this one here. 1684 01:13:41,866 --> 01:13:44,176 Every time I run out of space based on that line 1685 01:13:44,176 --> 01:13:46,976 of code alone what do I apparently do to my buffer? 1686 01:13:48,376 --> 01:13:48,976 I double it. 1687 01:13:49,116 --> 01:13:50,846 So, I initially it starts at 0 1688 01:13:50,846 --> 01:13:55,156 and then I think we initialize it to, we add one at some point. 1689 01:13:55,156 --> 01:13:56,056 I'd have to check the code. 1690 01:13:56,236 --> 01:13:59,316 Then it goes to 2 then to 8, then to 4 then to 8 then 1691 01:13:59,316 --> 01:14:02,246 to 16 then to 32 and so forth so any time we run 1692 01:14:02,246 --> 01:14:03,906 out of space we ask for more memory 1693 01:14:04,196 --> 01:14:07,566 but because we're getting things character by character at a time 1694 01:14:07,566 --> 01:14:11,326 with fgetc, there's absolutely no risk in the CS50 library 1695 01:14:11,516 --> 01:14:12,906 of someone exploiting our code 1696 01:14:12,906 --> 01:14:15,136 because you can't possibly pass us a bigger string 1697 01:14:15,136 --> 01:14:17,946 than we're expecting because we're only looking at it one 1698 01:14:17,946 --> 01:14:22,366 at a time but notice the headache that is this code 1699 01:14:22,366 --> 01:14:25,386 and why we bundle it all up in a function called get string 1700 01:14:25,786 --> 01:14:27,636 but notice, too, if you look through this code 1701 01:14:27,636 --> 01:14:30,506 at your leisure you'll see that almost all of the functions 1702 01:14:30,506 --> 01:14:34,356 in the CS50 library use get string to first get the input 1703 01:14:34,356 --> 01:14:36,676 from the user and then we convert those strings 1704 01:14:36,756 --> 01:14:37,446 to integers. 1705 01:14:37,646 --> 01:14:39,646 Now who cares about all of these details? 1706 01:14:39,646 --> 01:14:42,246 Well, there are now ways 1707 01:14:42,246 --> 01:14:44,636 in which we now have the expressive capabilities 1708 01:14:44,636 --> 01:14:46,076 to store things on disk. 1709 01:14:46,236 --> 01:14:48,666 Unfortunately, once you have this ability to store things 1710 01:14:48,666 --> 01:14:51,346 on disks you have to worry about cleaning things up 1711 01:14:51,346 --> 01:14:54,096 and most operating systems today including Windows, 1712 01:14:54,096 --> 01:14:56,716 Mac OS and Linux leave a ridiculous number 1713 01:14:56,716 --> 01:14:59,146 of bread crumbs, leave quite the paper trail 1714 01:14:59,146 --> 01:15:01,996 on your computer even when you so called delete a program 1715 01:15:01,996 --> 01:15:04,696 and even if you're just visiting some website 1716 01:15:04,696 --> 01:15:07,446 and you're not saving any files but those images 1717 01:15:07,446 --> 01:15:09,966 or that text is resident in RAM turns 1718 01:15:09,966 --> 01:15:12,176 out that even those photos you're looking at could end 1719 01:15:12,176 --> 01:15:13,546 up on your computer's hard drive 1720 01:15:13,546 --> 01:15:16,236 in something called virtual memory but more on that 1721 01:15:16,306 --> 01:15:18,256 and the solutions there to on Wednesday. 1722 01:15:18,256 --> 01:15:19,416 [ Applause ] 1723 01:15:19,416 --> 01:15:22,416 Thank you. 1724 01:15:26,636 --> 01:15:31,066 Andrew will be please.