1 00:00:00,000 --> 00:00:10,970 >> [MUSIC PLAYING] 2 00:00:10,970 --> 00:00:12,536 >> DAVID J. MALAN: All right. 3 00:00:12,536 --> 00:00:13,392 >> [LAUGHTER] 4 00:00:13,392 --> 00:00:14,240 >> Welcome back. 5 00:00:14,240 --> 00:00:14,990 This is CS50. 6 00:00:14,990 --> 00:00:16,890 And this is the end of week five. 7 00:00:16,890 --> 00:00:20,020 And up until now, we've pretty much been taking for granted that there 8 00:00:20,020 --> 00:00:23,480 exists this compiler, Clang, that you've been invoking by way of this 9 00:00:23,480 --> 00:00:27,100 other tool called Make that somehow magically converts your source code 10 00:00:27,100 --> 00:00:31,350 into object code, the zeros and ones that your computer's CPU, central 11 00:00:31,350 --> 00:00:33,410 processing unit, actually understands. 12 00:00:33,410 --> 00:00:36,770 But it turns out there's a number of things going on underneath the hood in 13 00:00:36,770 --> 00:00:38,690 between input and output. 14 00:00:38,690 --> 00:00:41,800 >> And I'd like to propose that we flesh that out in a little more detail into 15 00:00:41,800 --> 00:00:45,130 these four steps: we have something called pre-processing, something 16 00:00:45,130 --> 00:00:48,300 called compiling, which we have seen, something called assembling, and 17 00:00:48,300 --> 00:00:49,420 something called linking. 18 00:00:49,420 --> 00:00:53,270 So up until now, in some of our programs, we've had sharp includes. 19 00:00:53,270 --> 00:00:56,650 More recently we've had some sharp defines for constants. 20 00:00:56,650 --> 00:01:00,660 So it turns out that those things that are prefixed with the hash symbol or 21 00:01:00,660 --> 00:01:04,150 the pound symbol are pre-processor directives. 22 00:01:04,150 --> 00:01:07,960 That's just a fancy way of saying it's a line of code that's actually 23 00:01:07,960 --> 00:01:12,280 converted into something else before the computer even tries to convert your 24 00:01:12,280 --> 00:01:13,800 program into zeros and ones. 
25 00:01:13,800 --> 00:01:19,000 >> For instance, sharp include stdio.h pretty much just means go 26 00:01:19,000 --> 00:01:24,010 ahead, grab the contents of the file stdio.h and paste them right there. 27 00:01:24,010 --> 00:01:25,880 So no zeros and ones at that point yet. 28 00:01:25,880 --> 00:01:27,470 It's really just a substitution. 29 00:01:27,470 --> 00:01:30,790 And that's done during the so-called pre-processing stage, when you 30 00:01:30,790 --> 00:01:34,230 actually run Clang or specifically Make in most cases. 31 00:01:34,230 --> 00:01:36,950 So all this has been happening first automatically thus far. 32 00:01:36,950 --> 00:01:38,800 >> Then comes the compilation step. 33 00:01:38,800 --> 00:01:40,920 But we've been oversimplifying compilation. 34 00:01:40,920 --> 00:01:45,060 Compiling a program really means to take it from something like C, the 35 00:01:45,060 --> 00:01:48,430 source code we've been writing, down to something called assembly. 36 00:01:48,430 --> 00:01:52,900 Assembly language is a lower level language that, thankfully, we won't 37 00:01:52,900 --> 00:01:55,480 have much occasion to write this semester. 38 00:01:55,480 --> 00:01:59,100 But it's at the lowest level in the sense that you literally start writing 39 00:01:59,100 --> 00:02:04,270 add and subtract and multiply and load from memory and save to memory, the 40 00:02:04,270 --> 00:02:08,259 very basic instructions that a computer, underneath the hood, 41 00:02:08,259 --> 00:02:09,639 actually understands. 42 00:02:09,639 --> 00:02:14,930 >> Lastly, assembling takes that language to the zeros and ones that we've been 43 00:02:14,930 --> 00:02:16,190 describing thus far. 44 00:02:16,190 --> 00:02:19,270 And truly lastly, there's the so-called linking phase, which we'll 45 00:02:19,270 --> 00:02:22,360 see in just a moment, which combines your zeros and ones with zeros and 46 00:02:22,360 --> 00:02:24,870 ones other people before you have created. 
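As a rough sketch of the four stages just listed, here is a small C fragment with one sharp include and one sharp define. The names `GREETING`, `greeting`, and `say_hello` are made up for illustration and are not from the lecture; the closing comment lists the Clang flags that stop after each stage, assuming the file is saved as hello.c.

```c
#include <stdio.h>   /* the preprocessor pastes in the declarations from stdio.h */

#define GREETING "hello, world\n"   /* the preprocessor substitutes this text below */

/* Returns the text that the preprocessor substituted for GREETING. */
const char *greeting(void)
{
    return GREETING;
}

/* Prints the greeting. The code for printf itself lives in the C library
 * and is joined to this code only at the linking stage. */
void say_hello(void)
{
    printf("%s", greeting());
}

/*
 * Stopping Clang after each stage (hello.c is an assumed file name):
 *   clang -E hello.c   pre-process only and print the expanded source
 *   clang -S hello.c   pre-process and compile, writing hello.s (assembly)
 *   clang -c hello.c   also assemble, writing hello.o (object code)
 * Linking then combines hello.o with the library's object code; a complete
 * program also needs a main function before it can be linked.
 */
```

Running the first command on a file like this is a quick way to see how much text a single sharp include pulls in.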
47 00:02:24,870 --> 00:02:26,660 >> So consider this super simple program. 48 00:02:26,660 --> 00:02:27,560 It was from Week 1. 49 00:02:27,560 --> 00:02:29,610 It just said, Hello World, on the screen. 50 00:02:29,610 --> 00:02:30,920 We ran this through Clang. 51 00:02:30,920 --> 00:02:33,200 Or we ran it through Make which ran Clang. 52 00:02:33,200 --> 00:02:36,170 And outputted at the time were some zeros and ones. 53 00:02:36,170 --> 00:02:38,100 But it turns out there's an intermediate step. 54 00:02:38,100 --> 00:02:40,460 If I go over here-- oops, didn't want to see him yet. 55 00:02:40,460 --> 00:02:44,800 If I go over here to my appliance and I open up hello.c, here 56 00:02:44,800 --> 00:02:46,160 is that same program. 57 00:02:46,160 --> 00:02:48,600 And what I'm going to do in my terminal window here is I'm going to 58 00:02:48,600 --> 00:02:51,430 run Clang rather than Make, which automates all four of 59 00:02:51,430 --> 00:02:52,870 those steps for us. 60 00:02:52,870 --> 00:02:58,620 And I'm going to do clang -S and then hello.c and then enter. 61 00:02:58,620 --> 00:03:00,590 >> And I get a blinking prompt again, which is good. 62 00:03:00,590 --> 00:03:05,280 And now in a slightly bigger window, I'm going to open up gedit in here. 63 00:03:05,280 --> 00:03:09,610 And I'm going to open up a file that, turns out, is called hello.s. This 64 00:03:09,610 --> 00:03:11,870 contains that assembly language I referred to earlier. 65 00:03:11,870 --> 00:03:15,060 And this is what's called assembly language, fairly low level 66 00:03:15,060 --> 00:03:18,470 instructions that your Intel CPU or whatever it is that's inside 67 00:03:18,470 --> 00:03:19,350 understands. 68 00:03:19,350 --> 00:03:24,480 And mov is for move. call is for calling a very low level function. 69 00:03:24,480 --> 00:03:26,380 sub is for subtract. 
70 00:03:26,380 --> 00:03:30,370 >> So when you have a particular CPU inside of your computer, what makes it 71 00:03:30,370 --> 00:03:34,300 distinct, versus other CPUs on the market, is which instructions it 72 00:03:34,300 --> 00:03:39,460 understands and often how efficient it is, how fast it is at executing some 73 00:03:39,460 --> 00:03:40,380 of those instructions. 74 00:03:40,380 --> 00:03:45,150 Now for more on this, you can take next Fall CS61 at the college. 75 00:03:45,150 --> 00:03:48,170 But here we have, for instance, a few identifiers that might look familiar. 76 00:03:48,170 --> 00:03:50,150 hello.c is the name of the program. 77 00:03:50,150 --> 00:03:51,070 >> .text-- 78 00:03:51,070 --> 00:03:54,190 there's not much of interest there just now, but recall that the text 79 00:03:54,190 --> 00:03:59,190 segment, as of Monday, is where in memory your program actually ends up. 80 00:03:59,190 --> 00:04:01,330 So that's at least vaguely familiar there. 81 00:04:01,330 --> 00:04:03,730 Here, of course, is a mention of our main function. 82 00:04:03,730 --> 00:04:07,220 Scrolling down, these refer to things called registers, very small chunks of 83 00:04:07,220 --> 00:04:09,190 memory inside of your actual CPU. 84 00:04:09,190 --> 00:04:12,930 And if I scroll down even further, I see some sort of 85 00:04:12,930 --> 00:04:14,240 indirect mention of ASCII. 86 00:04:14,240 --> 00:04:17,120 And there, indeed, is that string, hello, comma, world. 87 00:04:17,120 --> 00:04:20,079 >> So long story short, this has been happening for you, automatically, 88 00:04:20,079 --> 00:04:22,140 underneath the hood all of this time. 89 00:04:22,140 --> 00:04:26,450 And what's been happening really is once you've run Clang, or by way of 90 00:04:26,450 --> 00:04:29,150 Make, you're getting first, from the source code, the 91 00:04:29,150 --> 00:04:30,700 so-called assembly language. 
92 00:04:30,700 --> 00:04:35,210 Then Clang is converting this assembly language down to zeros and ones. 93 00:04:35,210 --> 00:04:38,340 And this is the slide that we started our discussion in Week 0 on-- 94 00:04:38,340 --> 00:04:39,840 and then Week 1 on. 95 00:04:39,840 --> 00:04:44,030 And then finally, those zeros and ones are combined with the zeros and ones 96 00:04:44,030 --> 00:04:47,190 from those libraries we've been taking for granted like Standard I/O or the 97 00:04:47,190 --> 00:04:50,010 String Library or even the CS50 library. 98 00:04:50,010 --> 00:04:54,200 >> So to paint this picture more visually, we have hello.c. 99 00:04:54,200 --> 00:04:57,220 And it, of course, uses the printf function to say, hello world. 100 00:04:57,220 --> 00:05:01,810 The compilation step takes it down to that file we just saw, hello.s, even 101 00:05:01,810 --> 00:05:04,290 though that's typically deleted automatically for you. 102 00:05:04,290 --> 00:05:06,050 But that's the assembly code in the middle step. 103 00:05:06,050 --> 00:05:09,750 And then when we assemble the assembly language, so to speak, that's when you 104 00:05:09,750 --> 00:05:10,830 get those zeros and ones. 105 00:05:10,830 --> 00:05:13,920 So we've zoomed in effectively today on what we've been taking for granted, 106 00:05:13,920 --> 00:05:16,430 which means going from source code to object code. 107 00:05:16,430 --> 00:05:18,850 >> But lastly, now that same picture-- let's shove it over to 108 00:05:18,850 --> 00:05:20,020 the left hand side. 109 00:05:20,020 --> 00:05:22,880 And note that in the top there I mentioned stdio.h. 110 00:05:22,880 --> 00:05:25,030 That's a file that we've included in almost all of the 111 00:05:25,030 --> 00:05:26,250 programs we've written. 112 00:05:26,250 --> 00:05:28,830 And that's the file whose contents get copy pasted, 113 00:05:28,830 --> 00:05:30,350 effectively atop your code. 
114 00:05:30,350 --> 00:05:34,170 But it turns out that, on a computer system somewhere, there's presumably a 115 00:05:34,170 --> 00:05:39,150 stdio.c file that someone wrote years ago that implements all of the 116 00:05:39,150 --> 00:05:41,870 functions that were declared in stdio.h. 117 00:05:41,870 --> 00:05:45,465 >> Now in reality it's probably not on your Mac or your PC or even in the 118 00:05:45,465 --> 00:05:47,660 CS50 appliance as raw C code. 119 00:05:47,660 --> 00:05:52,710 Someone already compiled it and included a .o file for object code or a .a 120 00:05:52,710 --> 00:05:56,020 file, which refers to a static library that's been pre-installed and 121 00:05:56,020 --> 00:05:57,240 pre-compiled for you. 122 00:05:57,240 --> 00:06:01,950 But suppose that there indeed exists on our computer stdio.c. In parallel 123 00:06:01,950 --> 00:06:02,650 with Clang, 124 00:06:02,650 --> 00:06:04,960 your code's being compiled and assembled. 125 00:06:04,960 --> 00:06:09,200 stdio.c's code is being compiled and assembled, so that in this very last 126 00:06:09,200 --> 00:06:13,730 step, down here, we have to somehow link, so to speak, your zeros and ones 127 00:06:13,730 --> 00:06:18,430 with his or her zeros and ones into one simple program that ultimately is 128 00:06:18,430 --> 00:06:20,540 called just Hello. 129 00:06:20,540 --> 00:06:23,340 >> So that's all of the magic that's been happening thus far. 130 00:06:23,340 --> 00:06:26,430 And we'll continue to take these processes for granted, but realize 131 00:06:26,430 --> 00:06:28,750 there's a lot of juicy details going on underneath there. 132 00:06:28,750 --> 00:06:31,920 And this is what makes your computer with Intel inside 133 00:06:31,920 --> 00:06:33,940 particularly distinct. 134 00:06:33,940 --> 00:06:37,020 >> So on that note, if you would like to join us for lunch this Friday, do go 135 00:06:37,020 --> 00:06:41,570 to the usual place cs50.net/rsvp, 1:15 PM this Friday. 
136 00:06:41,570 --> 00:06:43,400 And now a few announcements. 137 00:06:43,400 --> 00:06:44,670 So we have some good news. 138 00:06:44,670 --> 00:06:45,970 And we have some bad news. 139 00:06:45,970 --> 00:06:47,260 Start with some good news here. 140 00:06:47,260 --> 00:06:52,038 141 00:06:52,038 --> 00:06:54,510 >> [GROANING] 142 00:06:54,510 --> 00:06:54,710 >> All right. 143 00:06:54,710 --> 00:06:56,670 Well, it's technically a holiday, so it's not so much a gift from us. 144 00:06:56,670 --> 00:06:58,030 But then the bad news of course. 145 00:06:58,030 --> 00:07:00,550 146 00:07:00,550 --> 00:07:01,880 >> [GROANING] 147 00:07:01,880 --> 00:07:03,530 >> I spent a lot of time on these animations. 148 00:07:03,530 --> 00:07:04,690 >> [LAUGHTER] 149 00:07:04,690 --> 00:07:07,000 >> There will be a review session this coming Monday. 150 00:07:07,000 --> 00:07:08,340 It's going to be at 5:30 PM. 151 00:07:08,340 --> 00:07:11,210 We will remind you of all these details via email and on the course's 152 00:07:11,210 --> 00:07:13,470 website in just a couple of days' time. 153 00:07:13,470 --> 00:07:16,610 It will be filmed and made available shortly thereafter. 154 00:07:16,610 --> 00:07:19,200 So if you can't make that Monday night slot, don't worry. 155 00:07:19,200 --> 00:07:22,270 Sections this coming week will also focus on review for the quiz. 156 00:07:22,270 --> 00:07:25,670 If your section is on Monday, which is indeed a university holiday, we will 157 00:07:25,670 --> 00:07:26,920 still meet in section. 158 00:07:26,920 --> 00:07:28,890 If you simply can't make that section because you're going 159 00:07:28,890 --> 00:07:29,860 away, that's fine. 160 00:07:29,860 --> 00:07:33,710 Attend a Sunday or Tuesday section or tune in to Jason's section, which is 161 00:07:33,710 --> 00:07:35,110 available online. 162 00:07:35,110 --> 00:07:37,490 >> So, more bad news. 163 00:07:37,490 --> 00:07:41,960 So according to the syllabus, we have lecture next Friday. 
164 00:07:41,960 --> 00:07:43,690 But the good news-- 165 00:07:43,690 --> 00:07:44,860 clearly, I spent too much time on this. 166 00:07:44,860 --> 00:07:45,280 >> [LAUGHTER] 167 00:07:45,280 --> 00:07:47,140 >> We'll cancel next Friday's lectures. 168 00:07:47,140 --> 00:07:50,590 So that will be a gift from us, so you can really have a nice respite in 169 00:07:50,590 --> 00:07:52,990 between this week and two weeks hence. 170 00:07:52,990 --> 00:07:57,460 So no lectures next week, just a tiny little quiz, for which you should be 171 00:07:57,460 --> 00:07:59,030 getting increasingly excited. 172 00:07:59,030 --> 00:08:03,870 >> So let's now turn our attention to something that is indeed more visual 173 00:08:03,870 --> 00:08:06,990 and more exciting and to set the stage for what's going to be on the horizon 174 00:08:06,990 --> 00:08:08,420 in just a couple of weeks' time. 175 00:08:08,420 --> 00:08:12,160 After the first quiz, we'll turn the focus of our problem sets to another 176 00:08:12,160 --> 00:08:16,710 domain specific problem, that of forensics or security more generally. 177 00:08:16,710 --> 00:08:19,550 >> In fact, the tradition with this problem set is for me or one of the 178 00:08:19,550 --> 00:08:24,850 teaching fellows or CAs to walk across campus taking some photographs of 179 00:08:24,850 --> 00:08:29,450 identifiable but non-obvious people, places, or things. Then every year I 180 00:08:29,450 --> 00:08:34,520 somehow manage to accidentally delete or corrupt the digital media card 181 00:08:34,520 --> 00:08:35,720 that's inside of our camera. 182 00:08:35,720 --> 00:08:36,860 But no big deal. 183 00:08:36,860 --> 00:08:39,200 I can go ahead and plug that into my computer. 
184 00:08:39,200 --> 00:08:43,010 I can make a forensic image of it, so to speak, by copying the zeros and 185 00:08:43,010 --> 00:08:46,830 ones off of that memory card, whether it's an SD card or compact flash card or 186 00:08:46,830 --> 00:08:48,100 whatever you're familiar with. 187 00:08:48,100 --> 00:08:49,300 And then we can hand that out. 188 00:08:49,300 --> 00:08:53,190 >> And so the challenge ahead, among other things for you, will be to write 189 00:08:53,190 --> 00:08:58,630 C code that recovers a whole bunch of JPEGs for me and revealed will be 190 00:08:58,630 --> 00:09:00,190 those people, places, or things. 191 00:09:00,190 --> 00:09:03,340 And we'll also talk, in this problem set and in the days to come, about 192 00:09:03,340 --> 00:09:04,440 graphics more generally. 193 00:09:04,440 --> 00:09:06,140 We've used them, of course, for Breakout. 194 00:09:06,140 --> 00:09:09,080 But you've sort of taken for granted there exist these high level notions 195 00:09:09,080 --> 00:09:10,680 of rectangles and ovals. 196 00:09:10,680 --> 00:09:12,450 But underneath the hood there are pixels. 197 00:09:12,450 --> 00:09:14,370 And you have had to start thinking about those. 198 00:09:14,370 --> 00:09:18,800 Or you will for p-set 4 have to think about the gap between your bricks, how 199 00:09:18,800 --> 00:09:21,990 quickly your ball is moving across the screen for Breakout. 200 00:09:21,990 --> 00:09:24,830 So there is this notion of the dots on your screen that's 201 00:09:24,830 --> 00:09:26,290 come into play already. 202 00:09:26,290 --> 00:09:29,430 >> Now what you see, though, is what you get on a computer screen. 203 00:09:29,430 --> 00:09:33,680 If you've ever watched some good or bad TV, odds are they pretty much 204 00:09:33,680 --> 00:09:36,280 treat the audience like technophobes who don't really 205 00:09:36,280 --> 00:09:37,630 know much about computing. 
206 00:09:37,630 --> 00:09:40,840 And so it's very easy for the police detective to say, can you 207 00:09:40,840 --> 00:09:41,710 clean that up for me? 208 00:09:41,710 --> 00:09:42,710 Or enhance, right? 209 00:09:42,710 --> 00:09:45,550 Enhance is like the buzz word in most any crime related show. 210 00:09:45,550 --> 00:09:49,240 And the reality is if you take a very blurry picture of a suspect doing 211 00:09:49,240 --> 00:09:51,620 something bad, you cannot just enhance it. 212 00:09:51,620 --> 00:09:53,080 You cannot zoom in infinitely. 213 00:09:53,080 --> 00:09:56,350 You cannot see in the glint of someone's eye who committed that 214 00:09:56,350 --> 00:09:59,860 particular crime, despite the prevalence of this on TV. 215 00:09:59,860 --> 00:10:04,110 >> And so with that let's motivate that upcoming problem set with a glimpse at 216 00:10:04,110 --> 00:10:05,765 some shows with which you might be familiar. 217 00:10:05,765 --> 00:10:06,500 >> [VIDEO PLAYBACK] 218 00:10:06,500 --> 00:10:07,835 >> -OK. 219 00:10:07,835 --> 00:10:09,956 Now, let's get a good look at you. 220 00:10:09,956 --> 00:10:17,060 221 00:10:17,060 --> 00:10:17,766 >> -Hold it. 222 00:10:17,766 --> 00:10:18,658 Run that back. 223 00:10:18,658 --> 00:10:19,550 >> -Wait a minute. 224 00:10:19,550 --> 00:10:21,580 Go right. 225 00:10:21,580 --> 00:10:21,800 >> -There. 226 00:10:21,800 --> 00:10:22,690 Freeze that. 227 00:10:22,690 --> 00:10:23,692 >> -Full screen. 228 00:10:23,692 --> 00:10:23,846 >> -OK. 229 00:10:23,846 --> 00:10:24,154 Freeze that. 230 00:10:24,154 --> 00:10:25,140 >> -Tighten up on that, will ya? 231 00:10:25,140 --> 00:10:27,090 >> -Vector in on that guy by the back wheel. 232 00:10:27,090 --> 00:10:29,730 >> -Zoom in right here on this spot. 233 00:10:29,730 --> 00:10:33,700 >> -With the right equipment, the image can be enlarged and sharpened. 234 00:10:33,700 --> 00:10:34,490 >> -What's that? 235 00:10:34,490 --> 00:10:35,870 >> -It's an enhancement program. 
236 00:10:35,870 --> 00:10:36,793 >> -Can you clear that up any? 237 00:10:36,793 --> 00:10:38,560 >> -I don't know. 238 00:10:38,560 --> 00:10:39,090 Let's enhance it. 239 00:10:39,090 --> 00:10:41,690 >> -Enhance section A-6. 240 00:10:41,690 --> 00:10:43,510 >> -I enhanced the detail and-- 241 00:10:43,510 --> 00:10:44,456 >> -I think there's enough to enhance. 242 00:10:44,456 --> 00:10:45,402 Release it to my screen. 243 00:10:45,402 --> 00:10:47,300 >> -Enhance the reflection in her eye. 244 00:10:47,300 --> 00:10:49,330 >> -Let's run this through video enhancement. 245 00:10:49,330 --> 00:10:50,340 >> -Edgar, can you enhance this? 246 00:10:50,340 --> 00:10:52,320 >> -Hang on. 247 00:10:52,320 --> 00:10:54,290 >> -I've been working on this reflection. 248 00:10:54,290 --> 00:10:55,560 >> -Someone's reflection. 249 00:10:55,560 --> 00:10:56,440 >> -Reflection. 250 00:10:56,440 --> 00:10:57,940 >> -There's a reflection of the man's face. 251 00:10:57,940 --> 00:10:58,860 >> -The reflection. 252 00:10:58,860 --> 00:10:59,710 >> -There's a reflection. 253 00:10:59,710 --> 00:11:00,900 >> -Zoom in on the mirror. 254 00:11:00,900 --> 00:11:03,500 >> -You can see a reflection. 255 00:11:03,500 --> 00:11:04,700 >> -Can you enhance the image from here? 256 00:11:04,700 --> 00:11:05,700 >> -Can you enhance him right here? 257 00:11:05,700 --> 00:11:06,500 >> -Can you enhance it? 258 00:11:06,500 --> 00:11:07,380 >> -Can you enhance it? 259 00:11:07,380 --> 00:11:08,190 >> -Can we enhance this? 260 00:11:08,190 --> 00:11:08,940 >> -Can you enhance it? 261 00:11:08,940 --> 00:11:10,280 >> -Hold on a second, I'll enhance. 262 00:11:10,280 --> 00:11:11,570 >> -Zoom in on the door. 263 00:11:11,570 --> 00:11:12,180 >> -x10. 264 00:11:12,180 --> 00:11:13,052 >> -Zoom. 265 00:11:13,052 --> 00:11:13,197 >> [LAUGHTER] 266 00:11:13,197 --> 00:11:14,360 >> -Move in. 267 00:11:14,360 --> 00:11:15,100 >> -Wait, stop. 268 00:11:15,100 --> 00:11:15,740 >> -Stop. 
269 00:11:15,740 --> 00:11:16,290 >> -Pause it. 270 00:11:16,290 --> 00:11:19,390 >> -Rotate a 75 degrees around the vertical please. 271 00:11:19,390 --> 00:11:19,886 >> [LAUGHTER] 272 00:11:19,886 --> 00:11:24,350 >> -Stop, and back to the part about the door again. 273 00:11:24,350 --> 00:11:26,330 >> -Got an image enhancer that can bitmap? 274 00:11:26,330 --> 00:11:28,990 >> -Maybe we can use the Pradeep Sen method to see into the windows. 275 00:11:28,990 --> 00:11:30,680 >> -This software is state of the art. 276 00:11:30,680 --> 00:11:31,676 >> -The icon value is off. 277 00:11:31,676 --> 00:11:34,166 >> -With the right combination of algorithms. 278 00:11:34,166 --> 00:11:38,399 >> -He's taken illumination algorithms to the next level and I can use them to 279 00:11:38,399 --> 00:11:38,648 enhance this photograph. 280 00:11:38,648 --> 00:11:42,050 >> -Lock on and enlarge the z-axis. 281 00:11:42,050 --> 00:11:42,760 >> -Enhance. 282 00:11:42,760 --> 00:11:43,060 >> -Enhance. 283 00:11:43,060 --> 00:11:43,760 >> -Enhance. 284 00:11:43,760 --> 00:11:45,010 >> -Freeze and enhance. 285 00:11:45,010 --> 00:11:47,470 286 00:11:47,470 --> 00:11:47,910 >> [END VIDEO PLAYBACK] 287 00:11:47,910 --> 00:11:51,470 >> DAVID J. MALAN: So Problem Set 5 is what lies ahead there. 288 00:11:51,470 --> 00:11:55,260 So we'll soon get a better understanding of when and why you can 289 00:11:55,260 --> 00:11:57,300 and cannot enhance in that way. 290 00:11:57,300 --> 00:12:00,090 But first, let's return our attention to some of the building blocks we'll 291 00:12:00,090 --> 00:12:02,250 need to be able to tell that story. 292 00:12:02,250 --> 00:12:05,580 >> So recall that we drew this picture on Monday and a little bit last week. 293 00:12:05,580 --> 00:12:09,970 And this describes the layout of things in your computer's memory when 294 00:12:09,970 --> 00:12:11,000 running some program. 
295 00:12:11,000 --> 00:12:14,310 The text segment up top, recall, refers to the actual zeros and ones 296 00:12:14,310 --> 00:12:16,000 that compose your program. 297 00:12:16,000 --> 00:12:19,340 There's, below that, some initialized or uninitialized data, which typically 298 00:12:19,340 --> 00:12:22,910 refers to things like constants or strings or global variables that have 299 00:12:22,910 --> 00:12:24,200 been declared in advance. 300 00:12:24,200 --> 00:12:26,500 There's the heap, but we'll come back to that in a bit. 301 00:12:26,500 --> 00:12:27,410 >> And then there's the stack. 302 00:12:27,410 --> 00:12:30,660 Much like a stack of trays in the cafeteria, this is where memory gets 303 00:12:30,660 --> 00:12:33,610 layered and layered whenever you do what in a program? 304 00:12:33,610 --> 00:12:36,380 305 00:12:36,380 --> 00:12:37,730 What is the stack used for? 306 00:12:37,730 --> 00:12:39,320 >> Yeah? 307 00:12:39,320 --> 00:12:40,000 >> Call a function. 308 00:12:40,000 --> 00:12:42,890 Any time you call a function, it's given a sliver of memory for its 309 00:12:42,890 --> 00:12:45,020 local variables or its parameters. 310 00:12:45,020 --> 00:12:48,810 And pictorially, we see that with each successive function called, when A 311 00:12:48,810 --> 00:12:52,520 calls B calls C calls D, they get layered onto the stack. 312 00:12:52,520 --> 00:12:55,630 And within each of those slices of memory is essentially a unique scope 313 00:12:55,630 --> 00:12:58,590 for that function, which, of course, is problematic if you want to hand 314 00:12:58,590 --> 00:13:01,850 from one function to another a piece of data that you want it 315 00:13:01,850 --> 00:13:03,500 to mutate or change. 316 00:13:03,500 --> 00:13:08,060 >> So what was our solution to enabling a function represented by one stack 317 00:13:08,060 --> 00:13:11,390 frame to change the memory inside of another stack frame? 318 00:13:11,390 --> 00:13:14,590 How do those two talk to one another? 
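A minimal sketch of the scope problem just described. The function names here are hypothetical and not from the lecture. The first function works only on its own frame's copy of the value, so the caller never sees the change; the second is handed an address instead, so it can reach into the caller's frame.

```c
/* Receives a copy of the caller's value; only this frame's copy changes. */
void increment_copy(int n)
{
    n = n + 1;
}

/* Receives the address of the caller's variable and changes it in place. */
void increment_in_place(int *n)
{
    *n = *n + 1;
}
```

Calling `increment_copy(x)` leaves the caller's `x` untouched, while `increment_in_place(&x)` changes it, because only the second call tells the function where `x` lives in memory.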
319 00:13:14,590 --> 00:13:18,510 So by way of pointers or addresses, which, again, just describe where in 320 00:13:18,510 --> 00:13:22,280 memory, by way of a specific byte number, the particular 321 00:13:22,280 --> 00:13:23,830 value can be found. 322 00:13:23,830 --> 00:13:26,860 So recall last time too we continued the story and looked at a 323 00:13:26,860 --> 00:13:28,280 fairly buggy program. 324 00:13:28,280 --> 00:13:32,900 And this program is buggy for a few reasons, but the most worrisome one is 325 00:13:32,900 --> 00:13:34,620 because it fails to check what? 326 00:13:34,620 --> 00:13:39,111 327 00:13:39,111 --> 00:13:40,450 >> Yeah, it fails to check the input. 328 00:13:40,450 --> 00:13:41,870 Sorry? 329 00:13:41,870 --> 00:13:43,880 >> If it's more than 12 characters. 330 00:13:43,880 --> 00:13:47,260 So very smartly, when calling memcpy, which, as the name suggests, just 331 00:13:47,260 --> 00:13:50,630 copies memory from its second argument into its first argument. 332 00:13:50,630 --> 00:13:54,730 The third argument, very smartly, is checked to make sure that you don't 333 00:13:54,730 --> 00:13:59,400 copy more than, in this case, the length of bar, number of characters, 334 00:13:59,400 --> 00:14:03,810 into the destination, which is this array C. But the problem is that what 335 00:14:03,810 --> 00:14:07,230 if C itself is not big enough to handle that? 336 00:14:07,230 --> 00:14:09,900 You're going to copy the number of bytes that you've been given. 337 00:14:09,900 --> 00:14:13,040 But what if you actually have more bytes than you have room for? 338 00:14:13,040 --> 00:14:16,770 >> Well, this program very foolishly just blindly proceeds to take whatever it's 339 00:14:16,770 --> 00:14:20,650 given, hello backslash 0 is great if string is short 340 00:14:20,650 --> 00:14:22,040 enough, like five chars. 
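For contrast, here is a minimal sketch of the check that the buggy program skips. It is not the lecture's code: `copy_if_fits` and `BUF_SIZE` are hypothetical names, and the sketch assumes the destination is a 12-byte array like the one described. It compares the input's length against the room available before calling memcpy.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 12   /* assumed to match the 12-byte array in the example */

/*
 * Copies src into dst only if it fits, including the terminating '\0'.
 * Returns 1 on success. Returns 0 if src is too long, in which case dst
 * is left untouched.
 */
int copy_if_fits(char dst[BUF_SIZE], const char *src)
{
    size_t len = strlen(src);
    if (len + 1 > BUF_SIZE)
    {
        return 0;               /* copying would write past the end of dst */
    }
    memcpy(dst, src, len + 1);  /* len chars plus the '\0' */
    return 1;
}
```

Because this version also reserves a byte for the backslash zero, the longest string it accepts is 11 characters. That is a design choice of the sketch, not something the lecture specifies.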
341 00:14:22,040 --> 00:14:26,470 But if it's actually 12 characters or 1,200 characters, we saw last time 342 00:14:26,470 --> 00:14:29,380 that you're just going to completely overwrite memory that 343 00:14:29,380 --> 00:14:30,470 doesn't belong to you. 344 00:14:30,470 --> 00:14:34,390 And worst case, if you overwrite that red portion there that we called the 345 00:14:34,390 --> 00:14:35,380 return address-- 346 00:14:35,380 --> 00:14:38,370 this is just where the computer automatically, for you, behind the 347 00:14:38,370 --> 00:14:43,130 scenes, tucks away a 32-bit value that reminds it to what address it should 348 00:14:43,130 --> 00:14:47,080 return when foo, this other function, is done executing. 349 00:14:47,080 --> 00:14:49,320 It's a bread crumb of sorts to which it returns. 350 00:14:49,320 --> 00:14:52,490 If you overwrite that, potentially, if you're the bad guy, you could 351 00:14:52,490 --> 00:14:54,750 potentially take over someone's computer. 352 00:14:54,750 --> 00:14:58,020 And you'll most certainly crash it in most cases. 353 00:14:58,020 --> 00:15:01,690 >> Now this problem was only exacerbated as we started talking about memory 354 00:15:01,690 --> 00:15:03,010 management more generally. 355 00:15:03,010 --> 00:15:07,150 And malloc, for memory allocation, is a function that we can use to allocate 356 00:15:07,150 --> 00:15:11,260 memory when we don't know in advance that we might need some. 357 00:15:11,260 --> 00:15:13,960 So, for instance, if I go back to the appliance here. 358 00:15:13,960 --> 00:15:21,010 And I open up from last time hello2.c, recall this program here, which looked 359 00:15:21,010 --> 00:15:23,500 a little something like this, just three lines-- 360 00:15:23,500 --> 00:15:27,940 state your name, then string name, on the left, equals getstring. 361 00:15:27,940 --> 00:15:29,690 And then we print it out, the user's name. 362 00:15:29,690 --> 00:15:31,170 >> So this was a super simple program. 
363 00:15:31,170 --> 00:15:34,870 To be clear, let me go ahead and make hello-2. 364 00:15:34,870 --> 00:15:36,680 I'm going to do dot slash hello-2. 365 00:15:36,680 --> 00:15:37,750 State your name-- 366 00:15:37,750 --> 00:15:38,140 David. 367 00:15:38,140 --> 00:15:38,840 Enter. 368 00:15:38,840 --> 00:15:39,540 Hello David. 369 00:15:39,540 --> 00:15:41,060 It seems to work OK. 370 00:15:41,060 --> 00:15:43,140 But what's really going on underneath the hood here? 371 00:15:43,140 --> 00:15:44,670 First let's peel back some layers. 372 00:15:44,670 --> 00:15:48,380 String is just a synonym we've realized for what? 373 00:15:48,380 --> 00:15:49,110 Char star. 374 00:15:49,110 --> 00:15:52,740 So let's make it a little more arcane but more technically correct that this 375 00:15:52,740 --> 00:15:55,570 is a char star, which means that name, yes, is a variable. 376 00:15:55,570 --> 00:15:59,920 But what name stores is the address of a char, which feels a little strange 377 00:15:59,920 --> 00:16:01,050 because I'm getting back a string. 378 00:16:01,050 --> 00:16:03,580 I'm getting back multiple chars not a char. 379 00:16:03,580 --> 00:16:07,400 >> But of course, you only need the first char's address to remember where the 380 00:16:07,400 --> 00:16:08,870 whole string is because why? 381 00:16:08,870 --> 00:16:12,700 How do you figure out where the end of the string is knowing the beginning? 382 00:16:12,700 --> 00:16:13,630 The backslash zero. 383 00:16:13,630 --> 00:16:17,260 So with those two clues you can figure out where the beginning and the end of 384 00:16:17,260 --> 00:16:20,280 any string are, so long as they're properly formed with that null 385 00:16:20,280 --> 00:16:22,110 terminator, that backslash zero. 386 00:16:22,110 --> 00:16:24,520 >> But this is calling getstring. 387 00:16:24,520 --> 00:16:28,020 And it turns out that getstring all this time has been kind of 388 00:16:28,020 --> 00:16:28,820 cheating for us. 
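The point about a string being just the address of its first char, with the backslash zero marking its end, can be sketched as follows. The function name `length_of` is hypothetical; the standard library's strlen does the same job.

```c
#include <stddef.h>

/* Counts the chars from the first one up to, but not including, the '\0'. */
size_t length_of(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
    {
        n++;
    }
    return n;
}
```

Given only the address of the D in D-A-V-I-D backslash zero, the loop walks forward one char at a time and stops at the terminator. If the terminator were missing, the loop would keep reading memory that is not part of the string.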
389 00:16:28,820 --> 00:16:32,460 It's been doing this labor, to be sure, getting a string from the user. 390 00:16:32,460 --> 00:16:34,580 But where's that memory been coming from? 391 00:16:34,580 --> 00:16:38,440 If we go back to the picture here and apply the definition from just a 392 00:16:38,440 --> 00:16:42,610 moment ago, that the stack is where memory goes when functions are called, 393 00:16:42,610 --> 00:16:45,370 by that logic, when you call getstring, and then I type in 394 00:16:45,370 --> 00:16:50,900 D-A-V-I-D Enter, where is D-A-V-I-D backslash zero stored, based on the 395 00:16:50,900 --> 00:16:53,480 story we've told thus far? 396 00:16:53,480 --> 00:16:55,190 >> It would seem to be in the stack, right? 397 00:16:55,190 --> 00:16:58,120 When you call getstring you get a little slice of memory on the stack. 398 00:16:58,120 --> 00:17:01,630 So it stands to reason that D-A-V-I-D backslash zero is stored 399 00:17:01,630 --> 00:17:02,770 there in the stack. 400 00:17:02,770 --> 00:17:07,680 But wait a minute, getstring returns that string, so to speak, which means 401 00:17:07,680 --> 00:17:11,700 its tray from the cafeteria is taken off the stack. 402 00:17:11,700 --> 00:17:14,560 And we said last time that as soon as a function returns, and you take that 403 00:17:14,560 --> 00:17:20,109 tray, so to speak, off the stack, what can you assume about the remnants of 404 00:17:20,109 --> 00:17:21,819 that memory? 405 00:17:21,819 --> 00:17:25,160 I sort of redrew them as question marks because they effectively become 406 00:17:25,160 --> 00:17:26,250 unknown values. 407 00:17:26,250 --> 00:17:29,500 They can be reused when some next function is called. 408 00:17:29,500 --> 00:17:31,870 >> In other words, if we happen to be storing-- 409 00:17:31,870 --> 00:17:34,350 I'll draw a quick picture here of the stack. 
410 00:17:34,350 --> 00:17:38,690 If we happen to be drawing the bottom of my memory segment, and we'll say 411 00:17:38,690 --> 00:17:42,230 that this is the place of memory occupied by main and maybe argc and 412 00:17:42,230 --> 00:17:46,790 argv and anything else in the program, when getstring is called, 413 00:17:46,790 --> 00:17:51,120 presumably getstring gets a chunk of memory here. 414 00:17:51,120 --> 00:17:53,940 And then D-A-V-I-D somehow ends up in this function. 415 00:17:53,940 --> 00:17:55,320 And I'm going to oversimplify. 416 00:17:55,320 --> 00:18:00,050 But let's assume that it's D-A-V-I-D backslash zero. 417 00:18:00,050 --> 00:18:03,500 So this many bytes are used in the frame for getstring. 418 00:18:03,500 --> 00:18:08,270 >> But as soon as getstring returns, we said last time that this memory over 419 00:18:08,270 --> 00:18:11,340 here all becomes --whoops!-- 420 00:18:11,340 --> 00:18:14,270 all becomes effectively erased. 421 00:18:14,270 --> 00:18:17,220 And we can think of this now as question marks because who knows 422 00:18:17,220 --> 00:18:18,720 what's going to become of that memory. 423 00:18:18,720 --> 00:18:22,130 Indeed, I very often call functions other than getstring. 424 00:18:22,130 --> 00:18:24,750 And as soon as I call some function other than getstring, maybe not in 425 00:18:24,750 --> 00:18:28,860 this particular program we just looked at but some other, surely some other 426 00:18:28,860 --> 00:18:34,180 function might end up being given this next spot in the stack. 427 00:18:34,180 --> 00:18:39,410 >> So it can't be that getstring stores D-A-V-I-D on the stack because I would 428 00:18:39,410 --> 00:18:41,040 immediately lose access to it. 429 00:18:41,040 --> 00:18:43,720 But we know that getstring only returns what? 430 00:18:43,720 --> 00:18:47,220 It's not returning to me six characters. 431 00:18:47,220 --> 00:18:51,090 What is it truly returning, did we conclude last time?
432 00:18:51,090 --> 00:18:52,480 The address of the first. 433 00:18:52,480 --> 00:18:56,650 So somehow, when you call getstring, it's allocating a chunk of memory for 434 00:18:56,650 --> 00:18:59,620 the string that the user types and then returning the address of it. 435 00:18:59,620 --> 00:19:02,930 And it turns out that when you want a function to allocate memory in this 436 00:19:02,930 --> 00:19:08,390 way and return, to the person who called that function, the address of 437 00:19:08,390 --> 00:19:11,870 that chunk of memory, you absolutely can't put it in the stack at the 438 00:19:11,870 --> 00:19:14,750 bottom, because functionally it's just going to stop being yours very 439 00:19:14,750 --> 00:19:17,800 quickly, so you can probably guess where we're probably going to toss it 440 00:19:17,800 --> 00:19:20,130 instead, the so-called heap. 441 00:19:20,130 --> 00:19:25,290 >> So between the bottom of your memory's layout and the top of your memory's 442 00:19:25,290 --> 00:19:26,820 layout are a whole bunch of segments. 443 00:19:26,820 --> 00:19:29,270 One is the stack, and right above it is the heap. 444 00:19:29,270 --> 00:19:33,680 And the heap is just a different chunk of memory that's not used for functions 445 00:19:33,680 --> 00:19:34,770 when they're called. 446 00:19:34,770 --> 00:19:38,100 It's used for longer term memory, when you want one function to grab some 447 00:19:38,100 --> 00:19:42,700 memory and be able to hang on to it without losing control over it. 448 00:19:42,700 --> 00:19:45,550 >> Now you could perhaps immediately see that this is not 449 00:19:45,550 --> 00:19:48,060 necessarily a perfect design. 450 00:19:48,060 --> 00:19:51,350 As your program allocates memory on the stack, or as you call more and 451 00:19:51,350 --> 00:19:55,540 more functions, or as you allocate memory on the heap with malloc, as 452 00:19:55,540 --> 00:20:00,690 getstring is doing, what clearly seems to be the inevitable problem?
453 00:20:00,690 --> 00:20:00,860 >> Right. 454 00:20:00,860 --> 00:20:03,150 Like the fact that these arrows are pointing at each other 455 00:20:03,150 --> 00:20:04,380 does not bode well. 456 00:20:04,380 --> 00:20:08,630 And indeed, we could very quickly crash a program in any number of ways. 457 00:20:08,630 --> 00:20:12,050 In fact, I think we might have done this accidentally once. 458 00:20:12,050 --> 00:20:14,020 Or if not, let's do it deliberately now. 459 00:20:14,020 --> 00:20:21,330 Let me go ahead and write super quickly a program called dontdothis.c. 460 00:20:21,330 --> 00:20:26,730 And now I'll go in here and do sharp include stdio.h. 461 00:20:26,730 --> 00:20:32,620 Let's declare a function foo that takes no arguments, which is 462 00:20:32,620 --> 00:20:34,040 denoted as well by void. 463 00:20:34,040 --> 00:20:37,830 >> And the only thing foo is going to do is call foo, which probably isn't the 464 00:20:37,830 --> 00:20:39,100 smartest idea, but so be it. 465 00:20:39,100 --> 00:20:40,490 Int main void. 466 00:20:40,490 --> 00:20:45,270 Now the only thing main is going to do is call foo as well. 467 00:20:45,270 --> 00:20:51,050 And just for kicks, I'm going to go ahead here and say printf "Hello from 468 00:20:51,050 --> 00:20:52,340 foo." 469 00:20:52,340 --> 00:20:52,890 >> OK. 470 00:20:52,890 --> 00:21:00,160 So if I didn't make any mistakes, Make dontdothis, dot slash. 471 00:21:00,160 --> 00:21:01,960 And let's do it in a bigger window-- 472 00:21:01,960 --> 00:21:03,210 dot slash, dontdothis. 473 00:21:03,210 --> 00:21:07,590 474 00:21:07,590 --> 00:21:08,840 Come on. 475 00:21:08,840 --> 00:21:10,940 476 00:21:10,940 --> 00:21:11,890 Uh oh. 477 00:21:11,890 --> 00:21:13,100 Apparently, you can do this. 478 00:21:13,100 --> 00:21:15,190 Damn it. 479 00:21:15,190 --> 00:21:16,190 OK. 480 00:21:16,190 --> 00:21:16,580 Wait. 481 00:21:16,580 --> 00:21:17,370 Stand by.
482 00:21:17,370 --> 00:21:18,270 Did we-- 483 00:21:18,270 --> 00:21:20,110 We did use it with Make. 484 00:21:20,110 --> 00:21:22,050 >> [SIGHS] 485 00:21:22,050 --> 00:21:25,110 >> I know but I think we just deleted that. 486 00:21:25,110 --> 00:21:28,410 Uh, yeah. 487 00:21:28,410 --> 00:21:30,660 Damn it. 488 00:21:30,660 --> 00:21:32,640 Solve this, Rob. 489 00:21:32,640 --> 00:21:34,678 What? 490 00:21:34,678 --> 00:21:35,928 It's very simple. 491 00:21:35,928 --> 00:21:43,820 492 00:21:43,820 --> 00:21:47,360 Yeah, we turned optimization off. 493 00:21:47,360 --> 00:21:48,970 OK, stand by. 494 00:21:48,970 --> 00:21:49,950 Now I feel better. 495 00:21:49,950 --> 00:21:51,390 OK. 496 00:21:51,390 --> 00:21:51,780 All right. 497 00:21:51,780 --> 00:21:53,430 >> So let's recompile this-- 498 00:21:53,430 --> 00:21:55,880 Make dontdothis. 499 00:21:55,880 --> 00:22:00,090 You might have to rename this to dothis.c in just a moment. 500 00:22:00,090 --> 00:22:00,710 There we go. 501 00:22:00,710 --> 00:22:01,240 Thank you. 502 00:22:01,240 --> 00:22:02,050 OK. 503 00:22:02,050 --> 00:22:05,480 So the fact that I was printing something out was actually just 504 00:22:05,480 --> 00:22:08,150 slowing down the process by which we would have reached that point. 505 00:22:08,150 --> 00:22:08,510 OK. 506 00:22:08,510 --> 00:22:08,870 Phew! 507 00:22:08,870 --> 00:22:11,180 >> So what is actually going on? 508 00:22:11,180 --> 00:22:14,440 The reason there, just as an aside, is doing anything in terms of input and 509 00:22:14,440 --> 00:22:17,270 output tends to be slower because you have to write characters to the 510 00:22:17,270 --> 00:22:18,600 screen, it has to scroll. 511 00:22:18,600 --> 00:22:21,720 So long story short, had I actually not been so impatient, we would have 512 00:22:21,720 --> 00:22:23,260 seen this end result as well. 513 00:22:23,260 --> 00:22:26,220 Now that I got rid of the printfs, we see it right away.
514 00:22:26,220 --> 00:22:28,410 So why is this happening? 515 00:22:28,410 --> 00:22:31,300 Well, the simple explanation, of course, is that foo probably shouldn't 516 00:22:31,300 --> 00:22:32,500 be calling itself. 517 00:22:32,500 --> 00:22:34,470 >> Now in general terms, this is recursion. 518 00:22:34,470 --> 00:22:36,970 And we thought a couple weeks ago recursion is good. 519 00:22:36,970 --> 00:22:40,330 Recursion is this magical way of expressing yourself super succinctly. 520 00:22:40,330 --> 00:22:41,400 And it just works. 521 00:22:41,400 --> 00:22:45,060 But there is a key feature of all of the recursive programs we've talked 522 00:22:45,060 --> 00:22:48,260 about and looked at thus far, which was that they had what? 523 00:22:48,260 --> 00:22:52,610 A base case, which was some hard coded case that said in some situations 524 00:22:52,610 --> 00:22:56,210 don't call foo, which is clearly not the case here. 525 00:22:56,210 --> 00:22:58,920 >> So what is really happening in terms of this picture? 526 00:22:58,920 --> 00:23:01,790 Well, when main calls foo, it gets a slice of memory. 527 00:23:01,790 --> 00:23:04,150 When foo calls foo, it gets a slice of memory. 528 00:23:04,150 --> 00:23:06,430 When foo calls foo, it gets a slice. 529 00:23:06,430 --> 00:23:07,080 It gets a slice. 530 00:23:07,080 --> 00:23:08,120 It gets a slice. 531 00:23:08,120 --> 00:23:09,460 Because foo is never returning. 532 00:23:09,460 --> 00:23:12,160 We're never erasing one of those frames from the stack. 533 00:23:12,160 --> 00:23:15,930 So we're blowing through the heap, not to mention who knows what else, and 534 00:23:15,930 --> 00:23:19,600 we're overstepping the bounds of our so-called segment of memory. 535 00:23:19,600 --> 00:23:21,790 Ergo, segmentation fault. 536 00:23:21,790 --> 00:23:24,110 >> So the solution there is clearly don't do this.
537 00:23:24,110 --> 00:23:28,830 But the bigger implication is that, yes, there absolutely is some limit, 538 00:23:28,830 --> 00:23:32,470 even if it's not well defined, as to how many functions you can call in a 539 00:23:32,470 --> 00:23:34,970 program, how many times a function can call itself. 540 00:23:34,970 --> 00:23:38,430 So even though we did preach recursion as this potentially magical thing a 541 00:23:38,430 --> 00:23:41,870 couple of weeks ago for the sigma function, and when we get to data 542 00:23:41,870 --> 00:23:45,270 structures in CS50, you'll see other applications for it, it's not 543 00:23:45,270 --> 00:23:46,500 necessarily the best thing. 544 00:23:46,500 --> 00:23:50,070 Because if a function calls itself, calls itself, even if there's a base 545 00:23:50,070 --> 00:23:54,860 case, if you don't hit that base case for 1,000 calls or 10,000 calls, by 546 00:23:54,860 --> 00:23:58,800 that time you might have run out of room on your so-called stack and hit 547 00:23:58,800 --> 00:24:00,400 some other segments of memory. 548 00:24:00,400 --> 00:24:03,950 So it too is a design trade-off between elegance and the 549 00:24:03,950 --> 00:24:06,920 robustness of your particular implementation. 550 00:24:06,920 --> 00:24:10,780 >> So there's another downside or another gotcha to what we've 551 00:24:10,780 --> 00:24:11,720 been doing thus far. 552 00:24:11,720 --> 00:24:12,980 When I called getstring-- 553 00:24:12,980 --> 00:24:15,120 let me go back into hello-2. 554 00:24:15,120 --> 00:24:18,170 Notice that I'm calling getstring, which is returning an address. 555 00:24:18,170 --> 00:24:20,730 And we claim today that address is from the heap. 556 00:24:20,730 --> 00:24:24,480 And now I am printing out the string at that address. 557 00:24:24,480 --> 00:24:27,000 But we've never called the opposite of getstring.
558 00:24:27,000 --> 00:24:30,850 We've never had to call a function like ungetstring, where you hand back 559 00:24:30,850 --> 00:24:31,610 that memory. 560 00:24:31,610 --> 00:24:33,250 But frankly we probably should have been. 561 00:24:33,250 --> 00:24:37,390 Because if we keep asking the computer for memory, by way of something like 562 00:24:37,390 --> 00:24:40,830 getstring but never give it back, surely that too is bound to lead to 563 00:24:40,830 --> 00:24:42,970 problems whereby we run out of memory. 564 00:24:42,970 --> 00:24:46,140 >> And in fact, we can look for these problems with a new tool whose usage 565 00:24:46,140 --> 00:24:47,640 is a little cryptic to type. 566 00:24:47,640 --> 00:24:50,960 But let me go ahead and splash it up on the screen in just a moment. 567 00:24:50,960 --> 00:24:56,940 I'm going to go ahead and run Valgrind, whose first command 568 00:24:56,940 --> 00:25:00,260 line argument is the name of that program, hello-2. 569 00:25:00,260 --> 00:25:02,650 And unfortunately its output is atrociously 570 00:25:02,650 --> 00:25:04,290 complex for no good reason. 571 00:25:04,290 --> 00:25:06,280 So we see all that mess. 572 00:25:06,280 --> 00:25:07,530 State your name-- David. 573 00:25:07,530 --> 00:25:09,760 So that's the program actually running. 574 00:25:09,760 --> 00:25:11,180 And now we get this output. 575 00:25:11,180 --> 00:25:13,400 >> So Valgrind is similar in spirit to GDB. 576 00:25:13,400 --> 00:25:14,950 It's not a debugger per se. 577 00:25:14,950 --> 00:25:16,270 But it's a memory checker. 578 00:25:16,270 --> 00:25:20,140 It's a program that will run your program and tell you if you asked a 579 00:25:20,140 --> 00:25:23,860 computer for memory and never handed it back, thereby meaning that you have 580 00:25:23,860 --> 00:25:24,570 a memory leak. 581 00:25:24,570 --> 00:25:26,240 And memory leaks tend to be bad.
582 00:25:26,240 --> 00:25:29,120 And you, as users of computers, have probably felt this, whether you have a 583 00:25:29,120 --> 00:25:30,300 Mac or a PC. 584 00:25:30,300 --> 00:25:33,730 Have you ever used your computer for a while and not rebooted in several 585 00:25:33,730 --> 00:25:36,820 days, or you've just got a lot of programs running, and the damn thing 586 00:25:36,820 --> 00:25:42,360 slows to a grinding halt, or at least it's super annoying to use, because 587 00:25:42,360 --> 00:25:44,350 everything just got super slow. 588 00:25:44,350 --> 00:25:46,260 >> Now that can be for any number of reasons. 589 00:25:46,260 --> 00:25:49,600 It could be an infinite loop, a bug in someone's code, or, more simply, it 590 00:25:49,600 --> 00:25:53,250 could mean that you're using more memory, or trying to, than your 591 00:25:53,250 --> 00:25:54,920 computer actually has. 592 00:25:54,920 --> 00:25:57,770 And maybe there's a bug in some program that keeps asking for memory. 593 00:25:57,770 --> 00:26:02,480 Browsers for years were notorious for this, asking for more and more memory 594 00:26:02,480 --> 00:26:03,870 but never handing it back. 595 00:26:03,870 --> 00:26:07,220 Surely, if you only have a finite amount of memory, you can't ask 596 00:26:07,220 --> 00:26:09,990 infinitely many times for some of that memory. 597 00:26:09,990 --> 00:26:13,070 >> And so what you see here, even though again Valgrind's output is 598 00:26:13,070 --> 00:26:17,490 unnecessarily complex to glance at at first, this is the interesting part. 599 00:26:17,490 --> 00:26:18,890 Heap -- 600 00:26:18,890 --> 00:26:20,060 in use at exit. 601 00:26:20,060 --> 00:26:22,810 So here's how much memory was in use in the heap at the 602 00:26:22,810 --> 00:26:24,300 time my program exited-- 603 00:26:24,300 --> 00:26:27,280 apparently six bytes in one block. 604 00:26:27,280 --> 00:26:28,710 So I'm going to wave my hands at what a block is.
605 00:26:28,710 --> 00:26:31,270 Think of it as just a chunk, a more technical word for chunk. 606 00:26:31,270 --> 00:26:33,140 But six bytes-- 607 00:26:33,140 --> 00:26:36,870 what are the six bytes that were still in use? 608 00:26:36,870 --> 00:26:37,390 >> Exactly. 609 00:26:37,390 --> 00:26:41,520 D-A-V-I-D backslash zero, a five-letter name plus the null terminator. 610 00:26:41,520 --> 00:26:46,350 So this program Valgrind noticed that I asked for six bytes, apparently, by 611 00:26:46,350 --> 00:26:48,950 way of getstring, but never gave them back. 612 00:26:48,950 --> 00:26:52,030 And in fact, this might not be so obvious if my program isn't three 613 00:26:52,030 --> 00:26:53,590 lines, but it's 300 lines. 614 00:26:53,590 --> 00:26:56,920 So we can actually give another command line argument to Valgrind to 615 00:26:56,920 --> 00:26:58,290 make it more verbose. 616 00:26:58,290 --> 00:26:59,760 It's a little annoying to remember. 617 00:26:59,760 --> 00:27:01,580 But if I do-- 618 00:27:01,580 --> 00:27:01,930 let's see. 619 00:27:01,930 --> 00:27:03,540 Leak-- 620 00:27:03,540 --> 00:27:05,030 Was it leak-- 621 00:27:05,030 --> 00:27:07,580 even I don't remember what it is offhand. 622 00:27:07,580 --> 00:27:08,550 >> --leak-check equals full. 623 00:27:08,550 --> 00:27:10,180 Yep, thank you. 624 00:27:10,180 --> 00:27:12,520 --leak-check equals full. 625 00:27:12,520 --> 00:27:13,800 Enter. 626 00:27:13,800 --> 00:27:14,940 Same program is running. 627 00:27:14,940 --> 00:27:16,180 Type in David again. 628 00:27:16,180 --> 00:27:17,660 Now I see a little more detail. 629 00:27:17,660 --> 00:27:20,890 But below the heap summary, which is identical to before-- ah, 630 00:27:20,890 --> 00:27:22,120 this is kind of nice. 631 00:27:22,120 --> 00:27:25,460 Now Valgrind is actually looking a little harder in my code. 632 00:27:25,460 --> 00:27:29,580 And it's saying that, apparently, malloc at line-- 633 00:27:29,580 --> 00:27:30,580 we zoom out.
634 00:27:30,580 --> 00:27:31,980 At line-- 635 00:27:31,980 --> 00:27:32,930 we don't see what line it is. 636 00:27:32,930 --> 00:27:35,110 But malloc is the first culprit. 637 00:27:35,110 --> 00:27:38,630 There's a bug in malloc. 638 00:27:38,630 --> 00:27:39,810 >> All right? 639 00:27:39,810 --> 00:27:40,450 OK, no. 640 00:27:40,450 --> 00:27:40,940 Right? 641 00:27:40,940 --> 00:27:42,520 I called getstring. 642 00:27:42,520 --> 00:27:44,460 getstring apparently calls malloc. 643 00:27:44,460 --> 00:27:47,800 So what line of code is apparently at fault for having 644 00:27:47,800 --> 00:27:49,050 allocated this memory? 645 00:27:49,050 --> 00:27:51,560 646 00:27:51,560 --> 00:27:55,540 Let's assume that whoever wrote malloc has been around long enough that it's 647 00:27:55,540 --> 00:27:56,390 not their fault. 648 00:27:56,390 --> 00:27:57,520 So it's probably mine. 649 00:27:57,520 --> 00:28:02,000 getstring in cs50.c --so that's a file somewhere on the computer-- 650 00:28:02,000 --> 00:28:05,210 in line 286 seems to be the culprit. 651 00:28:05,210 --> 00:28:08,140 Now let's assume that cs50 has been around for a decent amount of time, so 652 00:28:08,140 --> 00:28:09,720 we too are infallible. 653 00:28:09,720 --> 00:28:14,080 And so it's probably not in getstring that the bug lies, but rather in 654 00:28:14,080 --> 00:28:17,810 hello-2.c line 18. 655 00:28:17,810 --> 00:28:20,670 >> So let's take a look at what that line 18 was. 656 00:28:20,670 --> 00:28:21,130 Oh. 657 00:28:21,130 --> 00:28:27,130 Somehow this line isn't necessarily buggy, per se, but it is the reason 658 00:28:27,130 --> 00:28:28,630 behind that memory leak. 659 00:28:28,630 --> 00:28:32,140 So super simply, what would intuitively be the solution here?
660 00:28:32,140 --> 00:28:34,710 If we're asking for memory, we're never giving it back, and that seems to be a 661 00:28:34,710 --> 00:28:37,940 problem because over time my computer might run out of memory, might slow 662 00:28:37,940 --> 00:28:42,110 down, bad things might happen, well, what's the simple intuitive solution? 663 00:28:42,110 --> 00:28:43,140 Just give it back. 664 00:28:43,140 --> 00:28:44,770 >> How do you free up that memory? 665 00:28:44,770 --> 00:28:49,970 Well, thankfully it's quite simple to just say free name. 666 00:28:49,970 --> 00:28:51,260 And we've never done this before. 667 00:28:51,260 --> 00:28:55,890 But you can essentially think of free as the opposite of malloc. 668 00:28:55,890 --> 00:28:58,030 free is the opposite of allocating memory. 669 00:28:58,030 --> 00:28:59,540 So now let me recompile this. 670 00:28:59,540 --> 00:29:02,050 Make hello-2. 671 00:29:02,050 --> 00:29:04,620 Let me run it again. hello-2 David. 672 00:29:04,620 --> 00:29:07,290 So it seems to work in exactly the same way. 673 00:29:07,290 --> 00:29:11,180 But if I go back to Valgrind and re-run that same command on my newly 674 00:29:11,180 --> 00:29:14,720 compiled program, typing in my name as before-- 675 00:29:14,720 --> 00:29:15,370 nice. 676 00:29:15,370 --> 00:29:16,760 Heap summary-- 677 00:29:16,760 --> 00:29:17,740 in use at exit-- 678 00:29:17,740 --> 00:29:19,370 zero bytes in zero blocks. 679 00:29:19,370 --> 00:29:21,840 And this is super nice, all heap blocks were freed. 680 00:29:21,840 --> 00:29:23,480 No leaks are possible. 681 00:29:23,480 --> 00:29:27,200 >> So coming up, not with Problem Set 4, but with Problem Set 5, the forensics 682 00:29:27,200 --> 00:29:30,740 and onward, this too will become a measure of the correctness of your 683 00:29:30,740 --> 00:29:33,630 program, whether or not you have or don't have memory leaks.
684 00:29:33,630 --> 00:29:36,900 But thankfully, not only can you reason through them intuitively, which 685 00:29:36,900 --> 00:29:40,430 is, arguably, easy for small programs but harder for larger programs, 686 00:29:40,430 --> 00:29:43,860 Valgrind, for those larger programs, can help you identify 687 00:29:43,860 --> 00:29:45,360 the particular problem. 688 00:29:45,360 --> 00:29:47,500 >> But there's one other problem that might arise. 689 00:29:47,500 --> 00:29:51,245 Let me open up this file here, which is, again, a somewhat simple example. 690 00:29:51,245 --> 00:29:53,760 But let's focus on what this program does. 691 00:29:53,760 --> 00:29:55,190 This is called memory.c. 692 00:29:55,190 --> 00:29:58,380 We'll post this later today in the zip of today's source code. 693 00:29:58,380 --> 00:30:01,610 And notice that I have a function called f that takes no arguments and 694 00:30:01,610 --> 00:30:02,800 returns nothing. 695 00:30:02,800 --> 00:30:07,240 In line 20, I'm apparently declaring a pointer to an int and calling it x. 696 00:30:07,240 --> 00:30:09,570 I'm assigning it the return value of malloc. 697 00:30:09,570 --> 00:30:14,590 And just to be clear, how many bytes am I probably getting back from malloc 698 00:30:14,590 --> 00:30:17,080 in this situation? 699 00:30:17,080 --> 00:30:18,040 >> Probably 40. 700 00:30:18,040 --> 00:30:18,840 Where do you get that from? 701 00:30:18,840 --> 00:30:22,410 Well, if you recall that an int is often 4 bytes, at least it is in the 702 00:30:22,410 --> 00:30:25,110 appliance, 10 times 4 is obviously 40. 703 00:30:25,110 --> 00:30:28,920 So malloc is returning an address of a chunk of memory and storing that 704 00:30:28,920 --> 00:30:30,800 address ultimately in x. 705 00:30:30,800 --> 00:30:32,570 So to be clear, what then is happening? 706 00:30:32,570 --> 00:30:34,990 Well, let me switch back to our picture here.
707 00:30:34,990 --> 00:30:38,150 Let me not just draw the bottom of my computer's memory, let me go ahead and 708 00:30:38,150 --> 00:30:42,990 draw the whole rectangle that represents all of my RAM. 709 00:30:42,990 --> 00:30:44,790 >> We'll say that the stack is on the bottom. 710 00:30:44,790 --> 00:30:47,010 And there's a text segment and the uninitialized data. 711 00:30:47,010 --> 00:30:49,880 But I'm just going to abstract those other things away as dot, dot, dot. 712 00:30:49,880 --> 00:30:53,470 I'm just going to refer to this as the heap at the top. 713 00:30:53,470 --> 00:30:57,070 And then at the bottom of this picture, to represent main, I'm going 714 00:30:57,070 --> 00:30:59,880 to give it a slice of memory on the stack. 715 00:30:59,880 --> 00:31:03,150 For f, I'm going to give it a slice of memory on the stack. 716 00:31:03,150 --> 00:31:05,140 Now, I've got to consult my source code again. 717 00:31:05,140 --> 00:31:07,170 What are the local variables for main? 718 00:31:07,170 --> 00:31:10,710 Apparently nothing, so that slice is effectively empty or not even as big 719 00:31:10,710 --> 00:31:11,600 as I've drawn it. 720 00:31:11,600 --> 00:31:15,730 But in f, I have a local variable, which is called x. 721 00:31:15,730 --> 00:31:20,410 So I'm going to go ahead and give f a chunk of memory, calling it x. 722 00:31:20,410 --> 00:31:24,680 >> And now malloc of 10 times 4, so malloc 40, where's that 723 00:31:24,680 --> 00:31:25,430 memory coming from? 724 00:31:25,430 --> 00:31:27,530 We've not drawn a picture like this before. 725 00:31:27,530 --> 00:31:31,140 But let's suppose that it's effectively coming from here, so one, 726 00:31:31,140 --> 00:31:33,170 two, three, four, five. 727 00:31:33,170 --> 00:31:34,680 And now I need 40 of these. 728 00:31:34,680 --> 00:31:37,540 So I'll just do dot, dot, dot to suggest that there's even more memory 729 00:31:37,540 --> 00:31:39,350 coming back from the heap.
730 00:31:39,350 --> 00:31:40,710 Now what's the address? 731 00:31:40,710 --> 00:31:42,620 Let's choose our arbitrary address as always-- 732 00:31:42,620 --> 00:31:46,310 0x123, even though it's probably going to be something completely different. 733 00:31:46,310 --> 00:31:50,420 That's the address of the first byte in memory that I'm asking malloc for. 734 00:31:50,420 --> 00:31:53,630 >> So in short, once line 20 executes, what is literally 735 00:31:53,630 --> 00:31:57,170 stored inside of x here? 736 00:31:57,170 --> 00:31:58,730 0x123. 737 00:31:58,730 --> 00:32:00,370 0x123. 738 00:32:00,370 --> 00:32:01,550 And the 0x is uninteresting. 739 00:32:01,550 --> 00:32:03,200 It just means here's a hexadecimal number. 740 00:32:03,200 --> 00:32:06,490 But what's key is what I've stored in x, which is a local variable. 741 00:32:06,490 --> 00:32:10,260 But its data type, again, is an address of an int. 742 00:32:10,260 --> 00:32:12,710 Well, I'm going to store 0x123. 743 00:32:12,710 --> 00:32:16,610 But again, if that's a little too complicated unnecessarily, if I scroll 744 00:32:16,610 --> 00:32:21,490 back, we can abstract this away quite reasonably and just say that x is a 745 00:32:21,490 --> 00:32:23,910 pointer to that chunk of memory. 746 00:32:23,910 --> 00:32:24,070 >> OK. 747 00:32:24,070 --> 00:32:26,230 Now the question at hand is the following-- 748 00:32:26,230 --> 00:32:29,910 line 21, it turns out, is buggy. 749 00:32:29,910 --> 00:32:31,160 Why? 750 00:32:31,160 --> 00:32:34,890 751 00:32:34,890 --> 00:32:36,930 >> Sorry? 752 00:32:36,930 --> 00:32:38,640 It doesn't have-- 753 00:32:38,640 --> 00:32:40,390 say that once more. 754 00:32:40,390 --> 00:32:41,240 Well, it doesn't free. 755 00:32:41,240 --> 00:32:42,350 So that's the second bug. 756 00:32:42,350 --> 00:32:45,000 So there's one other bug specifically at line 21. 757 00:32:45,000 --> 00:32:49,480 758 00:32:49,480 --> 00:32:50,040 >> Exactly.
759 00:32:50,040 --> 00:32:54,980 This simple line of code is just a buffer overflow, a buffer overrun. 760 00:32:54,980 --> 00:32:57,050 A buffer just means a chunk of memory. 761 00:32:57,050 --> 00:33:01,520 But that chunk of memory is of size 10, 10 integers, which means if we 762 00:33:01,520 --> 00:33:05,350 index into it using the syntactic sugar of array notation, the square 763 00:33:05,350 --> 00:33:09,220 brackets, you have access to x bracket 0, x bracket 1, x 764 00:33:09,220 --> 00:33:10,390 bracket dot, dot, dot. 765 00:33:10,390 --> 00:33:13,270 x bracket 9 is the biggest one. 766 00:33:13,270 --> 00:33:17,680 So if I do x bracket 10, where am I actually going in memory? 767 00:33:17,680 --> 00:33:19,120 >> Well, if I have 10 ints-- 768 00:33:19,120 --> 00:33:21,070 let's actually draw all of these out here. 769 00:33:21,070 --> 00:33:22,700 So that was the first five. 770 00:33:22,700 --> 00:33:24,660 Here's the other five ints. 771 00:33:24,660 --> 00:33:29,580 So x bracket 0 is here. x bracket 1 is here. x bracket 9 is here. x bracket 772 00:33:29,580 --> 00:33:37,960 10 is here, which means I am telling, in line 21, the computer to put the 773 00:33:37,960 --> 00:33:39,400 number where? 774 00:33:39,400 --> 00:33:42,010 The number 0 where? 775 00:33:42,010 --> 00:33:43,380 Well, it's 0, yes. 776 00:33:43,380 --> 00:33:45,460 But just the fact that it's 0 is kind of a coincidence. 777 00:33:45,460 --> 00:33:47,140 It could be the number 50, for all we care. 778 00:33:47,140 --> 00:33:50,480 But we're trying to put it at x bracket 10, which is where this 779 00:33:50,480 --> 00:33:53,700 question mark is drawn, which is not a good thing. 780 00:33:53,700 --> 00:33:57,070 This program might very well crash as a result. 781 00:33:57,070 --> 00:33:59,400 >> Now, let's go ahead and see if this is, indeed, what happens. 782 00:33:59,400 --> 00:34:02,600 Make memory, since the file is called memory.c.
783 00:34:02,600 --> 00:34:05,950 Let's go ahead and run the program memory. 784 00:34:05,950 --> 00:34:08,239 So we got lucky, actually, it seems. 785 00:34:08,239 --> 00:34:09,340 We got lucky. 786 00:34:09,340 --> 00:34:11,060 But let's see if we now run Valgrind. 787 00:34:11,060 --> 00:34:14,170 At first glance, my program might seem to be perfectly correct. 788 00:34:14,170 --> 00:34:18,010 But let me run Valgrind with the --leak-check equals full on memory. 789 00:34:18,010 --> 00:34:20,110 >> And now when I run this-- 790 00:34:20,110 --> 00:34:21,030 interesting. 791 00:34:21,030 --> 00:34:26,800 Invalid write of size 4 at line 21 of memory.c. 792 00:34:26,800 --> 00:34:29,284 Line 21 of memory.c is which one? 793 00:34:29,284 --> 00:34:30,340 Oh, interesting. 794 00:34:30,340 --> 00:34:31,080 But wait. 795 00:34:31,080 --> 00:34:32,389 Size 4, what is that referring to? 796 00:34:32,389 --> 00:34:34,969 I only did one write, but it's of size 4. 797 00:34:34,969 --> 00:34:36,889 Why is it 4? 798 00:34:36,889 --> 00:34:39,280 It's because it's an int, which is, again, four bytes. 799 00:34:39,280 --> 00:34:42,510 So Valgrind found a bug that I, glancing at my code, didn't. 800 00:34:42,510 --> 00:34:45,040 And maybe your TF would or wouldn't. 801 00:34:45,040 --> 00:34:48,469 But Valgrind for sure found that we've made a mistake there, even 802 00:34:48,469 --> 00:34:52,719 though we got lucky, and the computer decided, eh, I'm not going to crash 803 00:34:52,719 --> 00:34:57,470 just because you touched one byte, one int's worth of memory that you didn't 804 00:34:57,470 --> 00:34:58,550 actually own. 805 00:34:58,550 --> 00:35:00,380 >> Well, what else is buggy here? 806 00:35:00,380 --> 00:35:01,180 Address-- 807 00:35:01,180 --> 00:35:03,190 this is a crazy looking address in hexadecimal. 808 00:35:03,190 --> 00:35:06,890 That just means somewhere in the heap is zero bytes after a block of size 40 809 00:35:06,890 --> 00:35:07,620 is allocated.
810 00:35:07,620 --> 00:35:10,610 Let me zoom out here and see if this is a little more helpful. 811 00:35:10,610 --> 00:35:11,410 Interesting. 812 00:35:11,410 --> 00:35:15,600 40 bytes are definitely lost in loss record 1 of 1. 813 00:35:15,600 --> 00:35:17,840 Again, more words than is useful here. 814 00:35:17,840 --> 00:35:21,350 But based on the highlighted lines, where should I probably focus my 815 00:35:21,350 --> 00:35:24,070 attention for another bug? 816 00:35:24,070 --> 00:35:26,570 Looks like line 20 of memory.c. 817 00:35:26,570 --> 00:35:30,990 >> So if we go back to line 20, that's the one that you identified earlier. 818 00:35:30,990 --> 00:35:33,030 And it's not necessarily buggy. 819 00:35:33,030 --> 00:35:35,160 But we haven't reversed its effects. 820 00:35:35,160 --> 00:35:38,790 So how do I correct at least one of those mistakes? 821 00:35:38,790 --> 00:35:42,240 What could I do after line 21? 822 00:35:42,240 --> 00:35:47,110 I could do free of x, so as to give back that memory. 823 00:35:47,110 --> 00:35:49,230 And how do I fix this bug? 824 00:35:49,230 --> 00:35:52,120 I should definitely go no farther than 0. 825 00:35:52,120 --> 00:35:53,670 So let me try and re-run this. 826 00:35:53,670 --> 00:35:56,080 Sorry, definitely go no farther than 9. 827 00:35:56,080 --> 00:35:57,510 Make memory. 828 00:35:57,510 --> 00:36:00,650 Let me rerun Valgrind in a bigger window. 829 00:36:00,650 --> 00:36:01,580 And now look. 830 00:36:01,580 --> 00:36:02,250 Nice. 831 00:36:02,250 --> 00:36:03,270 All heap blocks were freed. 832 00:36:03,270 --> 00:36:04,270 No leaks are possible. 833 00:36:04,270 --> 00:36:07,520 And up above here, there's no mention anymore of the invalid write. 834 00:36:07,520 --> 00:36:09,820 >> Just to get greedy, and let's see if another demonstration 835 00:36:09,820 --> 00:36:11,050 does not go as intended-- 836 00:36:11,050 --> 00:36:12,560 I did get lucky a moment ago.
837 00:36:12,560 --> 00:36:15,530 And the fact that this is 0 is perhaps unnecessarily misleading. 838 00:36:15,530 --> 00:36:20,650 Let's just do 50, a somewhat arbitrary number, make memory dot slash memory-- 839 00:36:20,650 --> 00:36:21,410 still get lucky. 840 00:36:21,410 --> 00:36:22,510 Nothing's crashing. 841 00:36:22,510 --> 00:36:26,150 Suppose I just do something really foolish, and I do 100. 842 00:36:26,150 --> 00:36:30,360 Let me remake memory, dot slash memory-- 843 00:36:30,360 --> 00:36:31,075 got lucky again. 844 00:36:31,075 --> 00:36:32,800 How about 1,000? 845 00:36:32,800 --> 00:36:35,370 ints beyond, roughly, where I should be? 846 00:36:35,370 --> 00:36:37,410 Make memory-- 847 00:36:37,410 --> 00:36:38,570 damn it. 848 00:36:38,570 --> 00:36:39,920 >> [LAUGHTER] 849 00:36:39,920 --> 00:36:41,270 >> OK. 850 00:36:41,270 --> 00:36:43,920 Let's not mess around anymore. 851 00:36:43,920 --> 00:36:45,120 Rerun memory. 852 00:36:45,120 --> 00:36:45,840 There we go. 853 00:36:45,840 --> 00:36:46,410 All right. 854 00:36:46,410 --> 00:36:52,500 So apparently if you index 100,000 ints beyond where you should have been in 855 00:36:52,500 --> 00:36:54,410 memory, bad things happen. 856 00:36:54,410 --> 00:36:56,430 So this is obviously not a hard-and-fast rule. 857 00:36:56,430 --> 00:36:58,190 I was kind of using trial and error to get there. 858 00:36:58,190 --> 00:37:02,230 But this is because, long story short, your computer's memory is also divided 859 00:37:02,230 --> 00:37:03,580 into these things called segments. 860 00:37:03,580 --> 00:37:07,260 And sometimes, the computer actually has given you a little more memory 861 00:37:07,260 --> 00:37:08,400 than you asked for. 862 00:37:08,400 --> 00:37:12,170 But for efficiency, it's just easier to get more memory but only tell you 863 00:37:12,170 --> 00:37:13,780 that you're getting a portion of it.
864 00:37:13,780 --> 00:37:16,370 >> And if you get lucky sometimes, therefore, you might be able to touch 865 00:37:16,370 --> 00:37:17,795 memory that doesn't belong to you. 866 00:37:17,795 --> 00:37:21,860 You have no guarantee that whatever value you put there will stay there, because 867 00:37:21,860 --> 00:37:25,080 the computer still thinks it's not yours, but it's not necessarily going 868 00:37:25,080 --> 00:37:29,910 to hit another segment of memory in the computer and induce a mistake like 869 00:37:29,910 --> 00:37:31,710 this one here. 870 00:37:31,710 --> 00:37:32,060 All right. 871 00:37:32,060 --> 00:37:37,240 Any questions then on memory? 872 00:37:37,240 --> 00:37:37,590 >> All right. 873 00:37:37,590 --> 00:37:40,610 Let's take a look here, then, at something we've been taking for 874 00:37:40,610 --> 00:37:48,361 granted for quite some time, which is in this file called cs50.h. 875 00:37:48,361 --> 00:37:49,420 So this is a file. 876 00:37:49,420 --> 00:37:51,130 These are just a whole bunch of comments up top. 877 00:37:51,130 --> 00:37:53,900 And you might have looked at this if you poked around on the appliance. 878 00:37:53,900 --> 00:37:57,000 But it turns out that all the time, when we used to use string as a 879 00:37:57,000 --> 00:38:01,130 synonym, the means by which we declared that synonym was with this 880 00:38:01,130 --> 00:38:03,990 keyword typedef, for type definition. 881 00:38:03,990 --> 00:38:07,500 And we're essentially saying, make string a synonym for char star. 882 00:38:07,500 --> 00:38:11,190 That's the means by which the staff created these training wheels known as 883 00:38:11,190 --> 00:38:12,040 the string. 884 00:38:12,040 --> 00:38:14,830 >> Now here's just a prototype for getchar. 885 00:38:14,830 --> 00:38:17,350 We might have seen it before, but that's indeed what it does. getchar 886 00:38:17,350 --> 00:38:19,070 takes no arguments, returns a char.
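[Editor's sketch, not shown in the lecture.] The typedef described above, making string a synonym for char star, can be illustrated as follows. The helper length_of is hypothetical and exists only to show that the alias is interchangeable with char *.

```c
#include <string.h>

// Make "string" a synonym for char *, as the lecture says cs50.h does.
typedef char *string;

// Hypothetical helper: because string is only an alias, a string can be
// handed to any function that expects a char *, such as strlen.
size_t length_of(string s)
{
    return strlen(s);
}
```

Nothing new is created in memory by the typedef; it only gives an existing type a friendlier name.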
887 00:38:19,070 --> 00:38:21,340 getdouble takes no arguments, returns a double. 888 00:38:21,340 --> 00:38:24,440 getfloat takes no arguments, returns a float, and so forth. 889 00:38:24,440 --> 00:38:27,270 getint is in here. getlonglong is in here. 890 00:38:27,270 --> 00:38:28,820 And getstring is in here. 891 00:38:28,820 --> 00:38:29,420 And that's it. 892 00:38:29,420 --> 00:38:33,080 This purple line is another preprocessor directive because of the 893 00:38:33,080 --> 00:38:35,550 hashtag at the beginning of it. 894 00:38:35,550 --> 00:38:35,870 >> All right. 895 00:38:35,870 --> 00:38:38,380 So now let me go into cs50.c. 896 00:38:38,380 --> 00:38:40,400 And we won't talk too long on this. 897 00:38:40,400 --> 00:38:43,280 But to give you a glimpse of what's been going on all this 898 00:38:43,280 --> 00:38:46,434 time, let me go to-- 899 00:38:46,434 --> 00:38:48,250 let's do getchar. 900 00:38:48,250 --> 00:38:51,050 So getchar is mostly comments. 901 00:38:51,050 --> 00:38:52,060 But it looks like this. 902 00:38:52,060 --> 00:38:54,800 So this is the actual function getchar that we've been 903 00:38:54,800 --> 00:38:56,055 taking for granted exists. 904 00:38:56,055 --> 00:38:59,370 And even though we haven't used this one that often, if ever, it's at least 905 00:38:59,370 --> 00:39:00,470 relatively simple. 906 00:39:00,470 --> 00:39:02,580 So it's worth taking a quick look at here. 907 00:39:02,580 --> 00:39:06,540 >> So getchar has an infinite loop, deliberately so apparently. 908 00:39:06,540 --> 00:39:10,050 It then calls-- and this is kind of a nice reuse of code we ourselves wrote. 909 00:39:10,050 --> 00:39:11,220 It calls getstring. 910 00:39:11,220 --> 00:39:12,460 Because what does it mean to get a char? 911 00:39:12,460 --> 00:39:14,730 Well, you might as well try to get a whole line of text from the user and 912 00:39:14,730 --> 00:39:16,940 then just look at one of those characters.
913 00:39:16,940 --> 00:39:19,170 In line 60, here's a little bit of a sanity check. 914 00:39:19,170 --> 00:39:21,610 If getstring returned null, let's not proceed. 915 00:39:21,610 --> 00:39:22,820 Something went wrong. 916 00:39:22,820 --> 00:39:28,120 >> Now this is somewhat annoying but conventional in C. CHAR_MAX probably 917 00:39:28,120 --> 00:39:29,960 represents what just based on its name? 918 00:39:29,960 --> 00:39:31,670 It's a constant. 919 00:39:31,670 --> 00:39:36,040 It's like the numeric value of the biggest char you can represent with 920 00:39:36,040 --> 00:39:40,370 one byte, which is probably the number 255, which is the biggest number you 921 00:39:40,370 --> 00:39:42,720 can represent with eight bits, starting from zero. 922 00:39:42,720 --> 00:39:47,460 So I've used this, in this function, when writing this code, only because 923 00:39:47,460 --> 00:39:51,753 if something goes wrong in getchar but its purpose in life is to return a 924 00:39:51,753 --> 00:39:54,830 char, you need to somehow be able to signal to the user that 925 00:39:54,830 --> 00:39:55,840 something went wrong. 926 00:39:55,840 --> 00:39:56,970 We can't return null. 927 00:39:56,970 --> 00:39:58,480 It turns out that null is a pointer. 928 00:39:58,480 --> 00:40:01,030 And again, getchar has to return a char. 929 00:40:01,030 --> 00:40:04,760 >> So the convention, if something goes wrong, is you, the programmer, or in 930 00:40:04,760 --> 00:40:08,160 this case, me with the library, I had to just decide arbitrarily, if 931 00:40:08,160 --> 00:40:12,230 something goes wrong, I'm going to return the number 255, which truly 932 00:40:12,230 --> 00:40:17,240 means the user cannot type the character represented by the 933 00:40:17,240 --> 00:40:21,410 number 255 because we had to steal it as a so-called sentinel value to 934 00:40:21,410 --> 00:40:23,410 represent a problem.
935 00:40:23,410 --> 00:40:27,010 Now it turns out that the character 255 is not something you can type on 936 00:40:27,010 --> 00:40:28,380 your keyboard, so it's no big deal. 937 00:40:28,380 --> 00:40:30,910 The user doesn't notice that I've stolen this character. 938 00:40:30,910 --> 00:40:34,620 But if you ever see in man pages on a computer system some reference to an 939 00:40:34,620 --> 00:40:38,560 all caps constant like this that says, in cases of error this constant might 940 00:40:38,560 --> 00:40:42,720 be returned, all some human did years ago was arbitrarily decide to 941 00:40:42,720 --> 00:40:45,680 return this special value and call it a constant in case 942 00:40:45,680 --> 00:40:46,840 something goes wrong. 943 00:40:46,840 --> 00:40:48,580 >> Now the magic happens down here. 944 00:40:48,580 --> 00:40:52,600 First, I'm declaring in line 67 two characters, C1 and C2. 945 00:40:52,600 --> 00:40:57,080 And then in line 68, there's actually a line of code that's reminiscent of 946 00:40:57,080 --> 00:41:01,140 our friend printf, given that it does have percent Cs in quotes. 947 00:41:01,140 --> 00:41:06,490 But notice what's happening here. sscanf means string scan-- 948 00:41:06,490 --> 00:41:11,690 means scan a formatted string, ergo sscanf. 949 00:41:11,690 --> 00:41:12,590 What does that mean? 950 00:41:12,590 --> 00:41:16,310 It means you pass to sscanf a string. 951 00:41:16,310 --> 00:41:18,420 And line is whatever the user types in. 952 00:41:18,420 --> 00:41:23,520 You pass to sscanf a format string like this that tells sscanf what 953 00:41:23,520 --> 00:41:25,870 you are hoping the user has typed in. 954 00:41:25,870 --> 00:41:29,730 You then pass in the addresses of two chunks of memory, in this case, 955 00:41:29,730 --> 00:41:31,150 because I have two placeholders. 956 00:41:31,150 --> 00:41:34,610 So I'm going to give it the address of C1 and the address of C2.
957 00:41:34,610 --> 00:41:37,700 >> And recall that if you give a function the address of some variable, what's 958 00:41:37,700 --> 00:41:38,950 the implication? 959 00:41:38,950 --> 00:41:41,400 960 00:41:41,400 --> 00:41:45,050 What can that function do as a result of giving it the address of a 961 00:41:45,050 --> 00:41:48,170 variable, as opposed to the variable itself? 962 00:41:48,170 --> 00:41:49,450 It can change it, right? 963 00:41:49,450 --> 00:41:53,250 If you hand someone a map to a physical address, they can go there and do 964 00:41:53,250 --> 00:41:54,750 whatever they want at that address. 965 00:41:54,750 --> 00:41:55,800 Same idea here. 966 00:41:55,800 --> 00:41:59,950 If we pass to sscanf the address of two chunks of memory, even these tiny 967 00:41:59,950 --> 00:42:03,585 little chunks of memory, C1 and C2, but we tell it the address of them, 968 00:42:03,585 --> 00:42:05,170 sscanf can change it. 969 00:42:05,170 --> 00:42:08,530 >> So sscanf's purpose in life, if we read the man page, is to read what the 970 00:42:08,530 --> 00:42:13,420 user typed in, hope for the user having typed in a character and maybe 971 00:42:13,420 --> 00:42:16,470 another character, and whatever the user typed, the first character goes 972 00:42:16,470 --> 00:42:19,310 here, the second character goes here. 973 00:42:19,310 --> 00:42:22,470 Now, as an aside, this, and you would only know this from the documentation, 974 00:42:22,470 --> 00:42:25,570 the fact that I put a blank space there just means that I don't care if 975 00:42:25,570 --> 00:42:28,440 the user hits the Space bar a few times before he or she types a 976 00:42:28,440 --> 00:42:30,400 character, I'm going to ignore any white space. 977 00:42:30,400 --> 00:42:32,510 So that, I know from the documentation. 978 00:42:32,510 --> 00:42:36,570 >> The fact that there's a second %c followed by white space is actually 979 00:42:36,570 --> 00:42:37,410 deliberate.
980 00:42:37,410 --> 00:42:41,190 I want to be able to detect if the user screwed up or didn't cooperate. 981 00:42:41,190 --> 00:42:45,630 So I'm hoping that the user only typed in one character, therefore I'm hoping 982 00:42:45,630 --> 00:42:50,640 that sscanf is only going to return the value 1 because, again, if I read 983 00:42:50,640 --> 00:42:55,400 the documentation, sscanf's purpose in life is to return the number of 984 00:42:55,400 --> 00:42:59,170 variables that were filled with user input. 985 00:42:59,170 --> 00:43:02,270 >> I passed in two variables' addresses, C1 and C2. 986 00:43:02,270 --> 00:43:06,420 I'm hoping, though, that only one of them gets filled because if sscanf 987 00:43:06,420 --> 00:43:11,130 returns 2, what's presumably the implication logically? 988 00:43:11,130 --> 00:43:14,600 That the user didn't just give me one character like I told him or her. 989 00:43:14,600 --> 00:43:17,860 They probably typed at least two characters. 990 00:43:17,860 --> 00:43:22,430 So if I instead did not have the second %c, I just had one, which 991 00:43:22,430 --> 00:43:25,370 frankly would be a more intuitive approach, I think, at first glance, 992 00:43:25,370 --> 00:43:30,220 you're not going to be able to detect if the user has been giving you more 993 00:43:30,220 --> 00:43:31,780 input than you actually wanted. 994 00:43:31,780 --> 00:43:34,100 So this is an implicit form of error checking. 995 00:43:34,100 --> 00:43:35,640 >> But notice what I do here. 996 00:43:35,640 --> 00:43:39,970 Once I'm sure that the user gave me one character, I free the line, doing 997 00:43:39,970 --> 00:43:44,450 the opposite of getstring, which in turn uses malloc, and then I return 998 00:43:44,450 --> 00:43:51,030 C1, the character that I hoped the user provided and only provided. 999 00:43:51,030 --> 00:43:54,680 So a quick glimpse only, but any questions on getchar?
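[Editor's sketch, not the library's actual code.] The two-placeholder sscanf check described above can be tried in isolation. The function name single_char_or_sentinel is an assumption, and this version takes the line as an argument instead of looping on getstring and freeing the result. Note also that the value of CHAR_MAX depends on the platform: it is 127 where char is signed and 255 where char is unsigned.

```c
#include <limits.h>
#include <stdio.h>

// Hypothetical helper: returns the single character found in line, or
// CHAR_MAX as a sentinel when line holds no character or more than one.
char single_char_or_sentinel(const char *line)
{
    char c1, c2;

    // The leading space skips any whitespace. The second %c exists only
    // to catch extra input: a return value of exactly 1 means that one
    // character was filled in and nothing else followed it.
    if (sscanf(line, " %c %c", &c1, &c2) == 1)
    {
        return c1;
    }
    return CHAR_MAX;
}
```

With this sketch, a line such as "  x" followed by a newline yields 'x', while "xy" or an empty line yields the sentinel, which is the same implicit error check the lecture describes.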
1000 00:43:54,680 --> 00:43:57,450 1001 00:43:57,450 --> 00:43:59,590 We'll come back to some of the others. 1002 00:43:59,590 --> 00:44:03,770 >> Well, let me go ahead and do this-- suppose now, just to motivate our 1003 00:44:03,770 --> 00:44:08,910 discussion in a week-plus time, this is a file called structs.h. 1004 00:44:08,910 --> 00:44:11,440 And again, this is just a taste of something that lies ahead. 1005 00:44:11,440 --> 00:44:13,090 But notice that a lot of this is comments. 1006 00:44:13,090 --> 00:44:17,440 So let me highlight only the interesting part for now. 1007 00:44:17,440 --> 00:44:18,020 typedef-- 1008 00:44:18,020 --> 00:44:19,700 there's that same keyword again. 1009 00:44:19,700 --> 00:44:23,100 typedef we use to declare string as a special data type. 1010 00:44:23,100 --> 00:44:27,490 You can use typedef to create brand new data types that didn't exist when 1011 00:44:27,490 --> 00:44:28,570 C was invented. 1012 00:44:28,570 --> 00:44:32,520 For instance, int comes with C. char comes with C. double comes with C. But 1013 00:44:32,520 --> 00:44:34,000 there's no notion of a student. 1014 00:44:34,000 --> 00:44:37,230 And yet it would be pretty useful to be able to write a program that stores 1015 00:44:37,230 --> 00:44:40,440 in a variable, a student's ID number, their name, and their house. 1016 00:44:40,440 --> 00:44:42,890 In other words, three pieces of data, like an int and a 1017 00:44:42,890 --> 00:44:44,420 string and another string. 1018 00:44:44,420 --> 00:44:48,220 >> With typedef, what's pretty powerful about this and the keyword struct for 1019 00:44:48,220 --> 00:44:53,660 structure, you, the programmer in 2013, can actually define your own 1020 00:44:53,660 --> 00:44:57,530 data types that didn't exist years ago but that suit your purposes. 1021 00:44:57,530 --> 00:45:01,910 And so here, in lines 13 through 19, we're declaring a new data type, like 1022 00:45:01,910 --> 00:45:04,320 an int, but calling it student.
1023 00:45:04,320 --> 00:45:09,310 And inside of this variable is going to be three things-- an int, a string, 1024 00:45:09,310 --> 00:45:09,930 and a string. 1025 00:45:09,930 --> 00:45:13,040 So you can think of what's really happened here, even though this is a 1026 00:45:13,040 --> 00:45:17,160 bit of a simplification for today, a student is essentially going 1027 00:45:17,160 --> 00:45:19,450 to look like this. 1028 00:45:19,450 --> 00:45:22,580 It's going to be a chunk of memory with an ID, a name 1029 00:45:22,580 --> 00:45:25,580 field, and a house field. 1030 00:45:25,580 --> 00:45:30,670 And we'll be able to use those chunks of memory and access them as follows. 1031 00:45:30,670 --> 00:45:38,870 >> If I go into struct0.c, here is a relatively long, but pattern-following, 1032 00:45:38,870 --> 00:45:42,630 piece of code that uses this new trick. 1033 00:45:42,630 --> 00:45:45,790 So first, let me draw your attention to the interesting parts up top. 1034 00:45:45,790 --> 00:45:49,670 Sharp define STUDENTS 3 declares a constant called STUDENTS and assigns 1035 00:45:49,670 --> 00:45:53,450 it arbitrarily the number 3, just so I have three students using 1036 00:45:53,450 --> 00:45:54,830 this program for now. 1037 00:45:54,830 --> 00:45:55,960 Here comes main. 1038 00:45:55,960 --> 00:45:58,860 And notice, how do I declare an array of students? 1039 00:45:58,860 --> 00:46:00,480 Well, I just use the same syntax. 1040 00:46:00,480 --> 00:46:02,110 The word student is obviously new. 1041 00:46:02,110 --> 00:46:04,790 But student class, bracket STUDENTS. 1042 00:46:04,790 --> 00:46:06,720 >> So unfortunately there's a lot of reuse of terms here. 1043 00:46:06,720 --> 00:46:07,660 This is just a number. 1044 00:46:07,660 --> 00:46:09,040 So this is like saying three. 1045 00:46:09,040 --> 00:46:11,430 Class is just what I want to call the variable. 1046 00:46:11,430 --> 00:46:12,840 I could call it students.
1047 00:46:12,840 --> 00:46:15,880 But class, this is not a class in an object oriented Java kind of way. 1048 00:46:15,880 --> 00:46:17,220 It's just a class of students. 1049 00:46:17,220 --> 00:46:20,590 And the data type of every element in that array is student. 1050 00:46:20,590 --> 00:46:23,040 So this is a little different from saying something 1051 00:46:23,040 --> 00:46:25,250 like this, it's just-- 1052 00:46:25,250 --> 00:46:29,500 I'm saying give me three students and call that array class. 1053 00:46:29,500 --> 00:46:29,800 >> All right. 1054 00:46:29,800 --> 00:46:30,680 Now here's a for loop. 1055 00:46:30,680 --> 00:46:33,480 This guy's familiar-- iterate from zero on up to three. 1056 00:46:33,480 --> 00:46:35,160 And here's the new piece of syntax. 1057 00:46:35,160 --> 00:46:37,710 The program's going to prompt me, the human, to give it a student 1058 00:46:37,710 --> 00:46:39,200 ID, which is an int. 1059 00:46:39,200 --> 00:46:44,650 And here's the syntax with which you can store something in the ID field at 1060 00:46:44,650 --> 00:46:48,630 location class bracket i. So this syntax is not new. 1061 00:46:48,630 --> 00:46:51,450 This just means give me the i-th student in the class. 1062 00:46:51,450 --> 00:46:52,940 But this symbol is new. 1063 00:46:52,940 --> 00:46:56,320 Up until now, we've not used dot, at least in code like this. 1064 00:46:56,320 --> 00:47:01,490 This means go to the struct known as a student and put something there. 1065 00:47:01,490 --> 00:47:05,670 Similarly, in this next line, 31, go ahead and put whatever the user types 1066 00:47:05,670 --> 00:47:10,530 for a name here and what they do for a house, the same thing, go ahead and 1067 00:47:10,530 --> 00:47:13,230 put it in .house. 1068 00:47:13,230 --> 00:47:15,955 >> So what does this program ultimately do? 1069 00:47:15,955 --> 00:47:17,220 You can see a little teaser there.
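[Editor's sketch, not the lecture's struct0.c.] The student type and the dot notation described above might be written roughly as follows. The field names id, name, and house follow the lecture's description; the helper enroll is hypothetical and takes its values as arguments instead of prompting the user.

```c
#define STUDENTS 3

// A new data type with three fields: an int and two strings.
typedef struct
{
    int id;
    char *name;
    char *house;
}
student;

// Hypothetical helper: stores values in the i-th student of the array
// using dot notation, as in class[i].id.
void enroll(student class[], int i, int id, char *name, char *house)
{
    class[i].id = id;
    class[i].name = name;
    class[i].house = house;
}
```

A caller would declare the array as student class[STUDENTS]; and then call enroll once per index from 0 up to, but not including, STUDENTS.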
1070 00:47:17,220 --> 00:47:24,780 Let me go ahead and do make struct0, dot slash struct0, student ID 1, 1071 00:47:24,780 --> 00:47:28,250 say David Mather, student ID 2. 1072 00:47:28,250 --> 00:47:32,070 Rob Kirkland, student ID 3. 1073 00:47:32,070 --> 00:47:35,010 Lauren Leverit-- 1074 00:47:35,010 --> 00:47:38,380 and the only thing this program did, which is just completely arbitrary, is 1075 00:47:38,380 --> 00:47:40,980 I wanted to do something with this data, now that I've taught us how to 1076 00:47:40,980 --> 00:47:43,450 use structs, is I just had this extra loop here. 1077 00:47:43,450 --> 00:47:45,260 I iterate over the array of students. 1078 00:47:45,260 --> 00:47:49,170 I used our, perhaps now familiar friend, string compare, strcmp, to 1079 00:47:49,170 --> 00:47:53,780 check is the i-th student's house equal to Mather? 1080 00:47:53,780 --> 00:47:56,760 And if so, just print something arbitrarily like, yes, it is. 1081 00:47:56,760 --> 00:47:59,430 But again, just giving me opportunities to use and reuse and 1082 00:47:59,430 --> 00:48:02,270 reuse this new dot notation. 1083 00:48:02,270 --> 00:48:03,250 >> So who cares, right? 1084 00:48:03,250 --> 00:48:06,270 Coming up with a student program is somewhat arbitrary, but it turns out 1085 00:48:06,270 --> 00:48:09,800 that we can do useful things with this, for instance as follows. 1086 00:48:09,800 --> 00:48:14,600 This is a much more complicated struct in C. It's got a dozen or more fields, 1087 00:48:14,600 --> 00:48:15,880 somewhat cryptically named. 1088 00:48:15,880 --> 00:48:20,110 But if you've ever heard of a graphics file format called bitmap, BMP, it 1089 00:48:20,110 --> 00:48:22,830 turns out that the bitmap file format pretty much looks like this. 1090 00:48:22,830 --> 00:48:24,200 It's a stupid little Smiley face.
1091 00:48:24,200 --> 00:48:27,840 It's a small image that I've zoomed in on pretty big so that I could see each 1092 00:48:27,840 --> 00:48:30,410 of the individual dots or pixels. 1093 00:48:30,410 --> 00:48:33,800 Now, it turns out we can represent a black dot with, say, the number 0. 1094 00:48:33,800 --> 00:48:35,520 And a white dot with the number 1. 1095 00:48:35,520 --> 00:48:39,140 >> So in other words, if you want to draw a Smiley face and save that image in a 1096 00:48:39,140 --> 00:48:42,680 computer, it suffices to store zeros and ones that look like this, where, 1097 00:48:42,680 --> 00:48:45,250 again, ones are white and zeros are black. 1098 00:48:45,250 --> 00:48:48,290 And together, if you effectively have a grid of ones and zeros, you have a 1099 00:48:48,290 --> 00:48:51,030 grid of pixels, and if you lay them out, you have a cute 1100 00:48:51,030 --> 00:48:52,560 little Smiley face. 1101 00:48:52,560 --> 00:48:58,150 Now, bitmap file format, BMP, is effectively that underneath the hood, 1102 00:48:58,150 --> 00:49:00,970 but with more pixels so that you can actually represent colors. 1103 00:49:00,970 --> 00:49:05,170 >> But when you have more sophisticated file formats like BMP and JPEG and GIF 1104 00:49:05,170 --> 00:49:09,360 with which you might be familiar, those files on disk typically not only 1105 00:49:09,360 --> 00:49:13,760 have zeros and ones for the pixels, but they have some metadata as well-- 1106 00:49:13,760 --> 00:49:16,960 meta in the sense that it's not really data but it's useful to have. 1107 00:49:16,960 --> 00:49:21,370 So these fields here are implying, and we'll see this in more detail in P-set 1108 00:49:21,370 --> 00:49:25,810 5, that before the zeros and ones that represent the pixels in an image, 1109 00:49:25,810 --> 00:49:29,110 there's a bunch of metadata like the size of the image and the 1110 00:49:29,110 --> 00:49:30,250 width of the image.
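[Editor's sketch, not shown in the lecture.] The grid of zeros and ones described above, with 0 for black and 1 for white, can be made visible with a few lines of C. The function render_row and the choice of '#' for a black pixel are assumptions made for illustration; this is not the BMP format itself, which also carries the metadata just mentioned.

```c
// Hypothetical helper: turns one row of a 1-bit image into text.
// A 0 (black) becomes '#', a 1 (white) becomes a space.
// out must have room for width + 1 characters.
void render_row(const int row[], int width, char out[])
{
    for (int i = 0; i < width; i++)
    {
        out[i] = (row[i] == 0) ? '#' : ' ';
    }
    out[width] = '\0';   // terminate so the row can be printed as a string
}
```

Rendering each row of a small two-dimensional array this way, one row per printed line, would draw the picture that the zeros and ones encode.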
1111 00:49:30,250 --> 00:49:32,910 And notice I'm plucking off some arbitrary things here-- 1112 00:49:32,910 --> 00:49:34,260 width and height. 1113 00:49:34,260 --> 00:49:36,160 Bit count and some other things. 1114 00:49:36,160 --> 00:49:37,840 So there's some metadata in a file. 1115 00:49:37,840 --> 00:49:41,470 >> But by understanding how files are laid out in this way, you can actually 1116 00:49:41,470 --> 00:49:45,890 then manipulate images, recover images from disk, resize images. 1117 00:49:45,890 --> 00:49:47,560 But you can't necessarily enhance them. 1118 00:49:47,560 --> 00:49:48,480 I needed a photograph. 1119 00:49:48,480 --> 00:49:52,840 So I went back to RJ here, who you saw on the screen quite some time ago. 1120 00:49:52,840 --> 00:49:57,160 And if I open up Keynote here, this is what happens if you try to zoom in and 1121 00:49:57,160 --> 00:49:59,380 enhance RJ. 1122 00:49:59,380 --> 00:50:01,480 He's not getting any better really. 1123 00:50:01,480 --> 00:50:06,240 Now Keynote is kind of blurring it a little bit, just to gloss over the 1124 00:50:06,240 --> 00:50:11,040 fact that RJ does not get particularly enhanced when you zoom in. 1125 00:50:11,040 --> 00:50:13,310 And if I do it this way, see the squares? 1126 00:50:13,310 --> 00:50:15,490 Yeah, you can definitely see the squares on a projector. 1127 00:50:15,490 --> 00:50:17,690 >> That's what you get when you enhance. 1128 00:50:17,690 --> 00:50:22,570 But understanding how RJ or the Smiley face is implemented will let us 1129 00:50:22,570 --> 00:50:24,950 actually write code that manipulates these things. 1130 00:50:24,950 --> 00:50:29,970 And I thought I'd end on this note, with 55 seconds of an enhance that's, 1131 00:50:29,970 --> 00:50:31,230 I dare say, rather misleading. 1132 00:50:31,230 --> 00:50:32,990 >> [VIDEO PLAYBACK] 1133 00:50:32,990 --> 00:50:34,790 >> -He's lying. 1134 00:50:34,790 --> 00:50:38,310 About what, I don't know.
1135 00:50:38,310 --> 00:50:41,200 >> -So what do we know? 1136 00:50:41,200 --> 00:50:45,280 >> -That at 9:15 Ray Santoya was at the ATM. 1137 00:50:45,280 --> 00:50:47,830 >> -So the question is what was he doing at 9:16? 1138 00:50:47,830 --> 00:50:50,750 >> -Shooting the nine millimeter at something. 1139 00:50:50,750 --> 00:50:52,615 Maybe he saw the sniper. 1140 00:50:52,615 --> 00:50:54,760 >> -Or was working with him. 1141 00:50:54,760 --> 00:50:56,120 >> -Wait. 1142 00:50:56,120 --> 00:50:57,450 Go back one. 1143 00:50:57,450 --> 00:50:58,700 >> -What do you see? 1144 00:50:58,700 --> 00:51:05,530 1145 00:51:05,530 --> 00:51:09,490 >> -Bring his face up, full screen. 1146 00:51:09,490 --> 00:51:09,790 >> -His glasses. 1147 00:51:09,790 --> 00:51:11,040 >> -There's a reflection. 1148 00:51:11,040 --> 00:51:21,790 1149 00:51:21,790 --> 00:51:23,520 >> -That's the Neuvitas baseball team. 1150 00:51:23,520 --> 00:51:24,530 That's their logo. 1151 00:51:24,530 --> 00:51:27,040 >> -And he's talking to whoever's wearing that jacket. 1152 00:51:27,040 --> 00:51:27,530 >> [END VIDEO PLAYBACK] 1153 00:51:27,530 --> 00:51:29,180 >> DAVID J. MALAN: This will be Problem Set 5. 1154 00:51:29,180 --> 00:51:30,720 We will see you next week. 1155 00:51:30,720 --> 00:51:32,330 >> MALE SPEAKER: At the next CS50. 1156 00:51:32,330 --> 00:51:39,240 >> [CRICKETS CHIRPING] 1157 00:51:39,240 --> 00:51:41,270 >> [MUSIC PLAYING]