1 00:00:08,996 --> 00:00:12,406 >> All right, welcome -- welcome to CS 50. 2 00:00:12,706 --> 00:00:13,896 This is the end of Week 4, 3 00:00:13,896 --> 00:00:15,866 because you all got here a little bit early we have a 4 00:00:15,866 --> 00:00:17,616 little bit of a treat for you. 5 00:00:18,306 --> 00:00:19,546 This is perhaps going 6 00:00:19,546 --> 00:00:24,726 to be among the most awkward six minutes, though, of your life. 7 00:00:25,516 --> 00:00:27,546 [ Applause ] 8 00:00:28,046 --> 00:00:31,986 >> This -- the following, is not a spoof. 9 00:00:33,516 --> 00:00:36,496 [ Background noise ] 10 00:00:36,996 --> 00:00:39,296 >> Hey, welcome to the party. 11 00:00:39,566 --> 00:00:42,266 The four of us, along with host [Inaudible] and you, 12 00:00:42,396 --> 00:00:44,776 are launching Windows 7 ultimate software. 13 00:00:44,856 --> 00:00:48,156 So you know what, let's take a minute or so to tell you 14 00:00:48,156 --> 00:00:50,716 about how great it is to host a launch party. 15 00:00:50,716 --> 00:00:54,086 You can use house party tools to build your guest list, 16 00:00:54,086 --> 00:00:55,646 up load your pictures which [Inaudible] 17 00:00:55,646 --> 00:00:58,296 and you can even get a party pack. 18 00:00:59,306 --> 00:01:01,606 Though you're in your own home, you'll be able to participate 19 00:01:01,606 --> 00:01:03,946 with others in this exciting event around the world. 20 00:01:04,506 --> 00:01:06,146 >> In a lot of ways, you're just throwing the party 21 00:01:06,146 --> 00:01:08,106 with Windows 7 as an honored guest. 22 00:01:08,106 --> 00:01:08,676 Sounds easy. 23 00:01:08,916 --> 00:01:09,686 And it is. 24 00:01:09,686 --> 00:01:12,656 But we thought you would probably like to know what to do 25 00:01:12,656 --> 00:01:13,196 to get ready, and [Inaudible] 26 00:01:13,196 --> 00:01:18,286 >> Yeah. The four of us got to have our parties a little ahead 27 00:01:18,456 --> 00:01:20,806 of schedule, you know, try everything out. 
28 00:01:21,296 --> 00:01:23,756 So we thought we'd be able to tell you some of the tips 29 00:01:23,916 --> 00:01:25,806 that help make our parties really fun. 30 00:01:26,216 --> 00:01:29,406 >> Of course the first thing you want to do is install Windows 7. 31 00:01:29,716 --> 00:01:30,136 >> Duh! 32 00:01:30,896 --> 00:01:32,236 >> Make sure you do that a couple days 33 00:01:32,236 --> 00:01:33,166 in advance of the party. 34 00:01:33,626 --> 00:01:34,976 >> Call customer service if you have any questions. 35 00:01:35,446 --> 00:01:35,736 >> Exactly. 36 00:01:35,916 --> 00:01:37,906 >> Play with Windows 7 before the party. 37 00:01:38,476 --> 00:01:39,906 >> Second, look at the activities you 38 00:01:39,906 --> 00:01:42,436 and your guests can try at your party, and choose the ones 39 00:01:42,596 --> 00:01:44,196 that seem to you to be the most fun. 40 00:01:44,196 --> 00:01:46,756 >> There's a video of each activity from one of our parties 41 00:01:46,756 --> 00:01:49,296 and we have tried them all, right? 42 00:01:49,896 --> 00:01:54,126 So you can see how the activity runs. 43 00:01:54,346 --> 00:01:57,596 And you know what, there's a handy page of notes to print 44 00:01:57,596 --> 00:01:59,066 out that tells you what you need ahead of time, I mean, 45 00:01:59,096 --> 00:02:00,086 it helps guide you along the party, right? 46 00:02:00,116 --> 00:02:01,316 Print these out, because these are called host notes, 47 00:02:01,346 --> 00:02:01,976 and they're at the web site. 48 00:02:02,536 --> 00:02:04,946 >> Hey, again, you don't have to do the activities listed 49 00:02:04,946 --> 00:02:06,076 under your party favor. 50 00:02:06,476 --> 00:02:08,876 You just look at them all and decide which one seems 51 00:02:08,916 --> 00:02:11,306 to be the most fun for your guests. 52 00:02:11,446 --> 00:02:14,956 >> And some of the host notes, they list bonus activities. 
53 00:02:15,016 --> 00:02:15,986 [ Multiple voices speaking ] 54 00:02:15,986 --> 00:02:19,166 >> You want to try them, but you want to make sure 55 00:02:19,166 --> 00:02:20,696 that you have the right devices on hand. 56 00:02:21,116 --> 00:02:24,296 >> Right. Now how the party flows is totally up to you. 57 00:02:24,396 --> 00:02:25,626 Now here's a sample agenda, though, 58 00:02:25,626 --> 00:02:27,426 that we each more or less followed. 59 00:02:27,886 --> 00:02:29,436 >> Right. For the first half hour or so 60 00:02:29,436 --> 00:02:37,096 at my party the guests just came in, had a drink and a snack, 61 00:02:37,096 --> 00:02:38,236 and mingled like they do at any good party. 62 00:02:38,266 --> 00:02:39,226 But many of the activities suggest 63 00:02:39,256 --> 00:02:40,156 that you have photos from the party. 64 00:02:40,186 --> 00:02:41,596 So we each made a point of shooting 20 or so photos 65 00:02:41,626 --> 00:02:42,976 on a digital camera at the start of the party too. 66 00:02:43,156 --> 00:02:46,106 >> Oh my gosh, when everyone was there and settled, 67 00:02:46,106 --> 00:02:49,926 I ran an overview of some of my favorite Windows 7 features. 68 00:02:50,426 --> 00:02:54,016 I showed my guests things from 7.2, 69 00:02:54,066 --> 00:02:55,916 the Windows orientation video, and it took like ten minutes. 70 00:02:55,946 --> 00:02:57,146 You know it was great, it was totally informal, 71 00:02:57,176 --> 00:02:58,106 like everyone just kind of crowded 72 00:02:58,136 --> 00:02:58,976 around the computer in the kitchen. 73 00:03:00,066 --> 00:03:01,076 >> We also started with some 74 00:03:01,076 --> 00:03:02,806 of the basic Windows 7 features, right? 75 00:03:02,806 --> 00:03:05,626 >> Yeah. I mean, it's a good way to get things going, right? 76 00:03:05,926 --> 00:03:06,956 Whatever your party is. 
77 00:03:06,956 --> 00:03:09,596 We've got four separate videos of each of us doing bits 78 00:03:09,596 --> 00:03:11,976 and pieces of this kind of thing at our own parties. 79 00:03:12,296 --> 00:03:15,406 Now, after my review, I went straight to an activity. 80 00:03:15,406 --> 00:03:17,016 >> Oh, you went straight to the activity? 81 00:03:17,146 --> 00:03:20,066 I let everybody fool around with snap for a minute -- 82 00:03:20,066 --> 00:03:20,133 [ Multiple voices speaking ] 83 00:03:20,133 --> 00:03:23,926 >> And then we started an activity maybe 30 minutes later. 84 00:03:24,046 --> 00:03:26,036 >> Well, either way works, right? 85 00:03:26,406 --> 00:03:28,796 You figure out what your guests want, and just play it by ear. 86 00:03:28,796 --> 00:03:32,646 In any event, we each did an activity or two. 87 00:03:33,346 --> 00:03:35,906 >> I did three activities, and -- 88 00:03:35,906 --> 00:03:36,046 [ Multiple voices speaking ] 89 00:03:36,046 --> 00:03:37,066 >> That's great. 90 00:03:37,376 --> 00:03:43,136 The activities, each have you talk for a minute or so, 91 00:03:43,316 --> 00:03:46,666 and then you set something up so your guests can try. 92 00:03:46,796 --> 00:03:50,886 So for the rest of the party you just leave your computer on 93 00:03:50,916 --> 00:03:51,906 and running, and then you let folks mess 94 00:03:51,936 --> 00:03:52,386 around with it, right? 95 00:03:52,416 --> 00:03:53,796 And the guests may have a question for you or something, 96 00:03:53,826 --> 00:03:54,786 or you may want to show them some things, 97 00:03:54,816 --> 00:03:55,626 but it was all really informal. 
98 00:03:55,656 --> 00:03:56,976 Oh, and then when you're close to the end of the party, 99 00:03:57,466 --> 00:03:58,536 >> wanting everyone to leave -- 100 00:03:58,536 --> 00:03:58,603 [ Multiple voices speaking ] 101 00:03:58,603 --> 00:04:02,916 >> Towards the end I showed the guests windows.com/help, 102 00:04:03,466 --> 00:04:07,986 and it's a great site for people to get more information. 103 00:04:08,016 --> 00:04:08,976 And I found it to be a nice wrap up. 104 00:04:09,346 --> 00:04:11,726 >> I agree, help and how to is a great way 105 00:04:11,726 --> 00:04:12,346 to bring it all together. 106 00:04:12,736 --> 00:04:12,936 >> Yeah. 107 00:04:12,936 --> 00:04:15,336 >> And it's a great resource for you hosts too. 108 00:04:15,336 --> 00:04:17,516 You can find out pretty much anything you want to know 109 00:04:17,546 --> 00:04:18,976 about the features we're using for the activities. 110 00:04:19,756 --> 00:04:23,306 >> That's one way to flow the party, but again, 111 00:04:23,306 --> 00:04:26,256 it's all up to us, you, it's what's right 112 00:04:26,436 --> 00:04:27,316 for you and your guests. 113 00:04:28,036 --> 00:04:29,696 >> Yeah, that's exactly right. 114 00:04:29,696 --> 00:04:31,456 >> You know, the four of us learned quite a few things 115 00:04:31,456 --> 00:04:32,976 to help make our parties a lot of fun. 116 00:04:33,756 --> 00:04:34,386 >> Absolutely. 117 00:04:34,746 --> 00:04:35,196 Here's mine. 118 00:04:35,466 --> 00:04:37,346 Make the thing you're demonstrating personal 119 00:04:37,416 --> 00:04:38,266 to someone at the party. 120 00:04:38,546 --> 00:04:40,586 Like the way I made Chip's files get transferred 121 00:04:40,636 --> 00:04:41,856 by Windows easy transfer. 
122 00:04:42,376 --> 00:04:44,916 >> Or the way I showed my guests Web Slices by talking 123 00:04:44,916 --> 00:04:46,596 about Frank's online auction shop -- 124 00:04:47,016 --> 00:04:48,306 [ Multiple voices speaking ] 125 00:04:48,306 --> 00:04:50,966 >> Or having folks edit photos of themselves to e-mail home. 126 00:04:51,146 --> 00:04:54,926 >> Bottom line, guests love it when the activity is about them. 127 00:04:55,586 --> 00:04:57,576 >> Hey, another thing, I found that it really helped 128 00:04:57,676 --> 00:04:59,536 to name the person to be the first 129 00:04:59,596 --> 00:05:01,156 with the hands-on activity. 130 00:05:01,366 --> 00:05:03,266 And have them pick the next person, and so on and so on. 131 00:05:03,266 --> 00:05:05,926 >> Yeah, I think you'll see that in a lot of our videos. 132 00:05:06,176 --> 00:05:07,976 It really helped get the guests onto the computer. 133 00:05:08,156 --> 00:05:08,896 >> Okay. On a more serious note. 134 00:05:08,896 --> 00:05:15,106 Decide what activities you're going to do at least a day 135 00:05:15,106 --> 00:05:17,546 or two in advance, and watch those videos 136 00:05:17,546 --> 00:05:21,066 and read the handouts, and some activities have modest setup, 137 00:05:21,066 --> 00:05:21,486 you know? 138 00:05:22,526 --> 00:05:24,816 They require certain things for you to have at the house. 139 00:05:24,816 --> 00:05:26,886 >> Sure. Like you have to have two computers 140 00:05:26,886 --> 00:05:28,096 to do the web chatting activity. 141 00:05:28,436 --> 00:05:28,976 >> That's right. 142 00:05:29,126 --> 00:05:32,636 >> In any case, none of the setup is too hard, right? 143 00:05:32,946 --> 00:05:35,846 You need to make sure you're ready to go 144 00:05:35,846 --> 00:05:38,616 when your guests arrive and there are bonus activities 145 00:05:38,696 --> 00:05:42,006 in some cases, if you want to go deeper, perhaps, into it, 146 00:05:42,006 --> 00:05:42,976 and you have to have the equipment to do that. 
147 00:05:43,206 --> 00:05:43,376 >> Right. 148 00:05:44,156 --> 00:05:46,786 >> Hey, it helped me to remember I'm not a salesman 149 00:05:46,786 --> 00:05:47,356 at this party. 150 00:05:48,236 --> 00:05:49,746 I'm not supposed to be a total expert either. 151 00:05:49,826 --> 00:05:51,666 This is a brand new product. 152 00:05:51,666 --> 00:05:52,526 And part of the point 153 00:05:52,526 --> 00:05:54,496 of a launch party is seeing what you already know 154 00:05:54,776 --> 00:05:55,866 and what you can figure out. 155 00:05:56,136 --> 00:05:57,746 It's just so simple anyone can do it. 156 00:05:57,916 --> 00:05:59,576 >> That is one of the great things 157 00:05:59,576 --> 00:06:01,236 about ending with help and how to. 158 00:06:01,236 --> 00:06:02,966 It's nice to be able to say throughout the party 159 00:06:03,136 --> 00:06:05,226 that you'll wrap it up with a great resource 160 00:06:05,256 --> 00:06:05,976 for any unanswered questions -- 161 00:06:06,166 --> 00:06:06,666 >> Exactly. 162 00:06:06,776 --> 00:06:10,816 I think the biggest thing is to be totally creative 163 00:06:10,816 --> 00:06:13,186 with the party and the activities. 164 00:06:13,236 --> 00:06:13,976 I mean, this is your party. 165 00:06:14,046 --> 00:06:14,536 >> Exactly. 166 00:06:15,396 --> 00:06:17,776 >> Can you believe that Microsoft put the launch 167 00:06:17,776 --> 00:06:20,466 of Windows 7 in our hands? 168 00:06:20,536 --> 00:06:21,046 They nuts or what? 169 00:06:21,756 --> 00:06:23,856 >> Maybe by letting you be involved. 170 00:06:23,856 --> 00:06:24,036 >> That was -- 171 00:06:24,036 --> 00:06:24,103 [ Multiple voices speaking ] 172 00:06:24,103 --> 00:06:29,496 >> I mean, I don't know, really, it does make total sense. 
173 00:06:29,496 --> 00:06:31,386 Windows 7 is all about the computer user, 174 00:06:31,706 --> 00:06:33,216 making everyday tasks more simple, 175 00:06:33,606 --> 00:06:35,266 working the way we all really wanted to, 176 00:06:35,266 --> 00:06:36,756 and making new things possible. 177 00:06:37,076 --> 00:06:37,966 This really is our launch. 178 00:06:38,796 --> 00:06:39,786 >> Yeah. You're right. 179 00:06:40,306 --> 00:06:41,506 So it ought to be a party. 180 00:06:41,866 --> 00:06:43,016 Have fun out there. 181 00:06:43,256 --> 00:06:43,376 >> Cheers. 182 00:06:43,376 --> 00:06:46,636 >> Have a good one, guys. 183 00:06:46,636 --> 00:06:48,116 [ Multiple voices speaking ] 184 00:06:48,116 --> 00:06:50,536 >> So I had six minutes to come 185 00:06:50,536 --> 00:06:52,526 up with wise-ass comments, and I've got nothing. 186 00:06:52,866 --> 00:06:56,486 Like -- that is a genuine commercial for the release 187 00:06:56,486 --> 00:06:58,316 of a new operating system. 188 00:06:58,676 --> 00:07:01,646 And I realize this is CS 50 and we do some geeky things 189 00:07:01,646 --> 00:07:03,926 in this class, we encourage you to start your P sets 190 00:07:03,926 --> 00:07:05,386 on Fridays, I did it myself. 191 00:07:05,606 --> 00:07:08,426 My God, if you all start having Windows 7 parties, like, 192 00:07:08,426 --> 00:07:09,506 that is over the line. 193 00:07:09,726 --> 00:07:14,856 So -- let me counterbalance this by saying two things. 194 00:07:14,926 --> 00:07:19,506 One is if you do have a PC and would like to install Windows 7, 195 00:07:19,506 --> 00:07:22,116 CS 50 does have access to stuff like this for free. 196 00:07:22,376 --> 00:07:24,356 So you can actually go to the software page 197 00:07:24,356 --> 00:07:26,556 of CS 50's web site and follow the directions there 198 00:07:26,556 --> 00:07:29,016 for downloading windows in many different flavors. 
199 00:07:29,016 --> 00:07:31,536 If you guys are Mac users, we also have -- 200 00:07:31,536 --> 00:07:33,446 or even Linux, we have access 201 00:07:33,446 --> 00:07:35,876 to what is called virtual machine technology, 202 00:07:36,066 --> 00:07:38,696 VMware Fusion, VMware Workstation, 203 00:07:38,696 --> 00:07:40,846 we'll talk perhaps briefly later in the semester about this. 204 00:07:41,106 --> 00:07:42,426 But for now these are just programs 205 00:07:42,456 --> 00:07:45,906 that you can download via our web site on a Mac, run them, 206 00:07:46,136 --> 00:07:49,366 and inside of them you can then run Windows if you like. 207 00:07:49,516 --> 00:07:52,586 So you may be familiar with this idea of dual booting a computer, 208 00:07:52,586 --> 00:07:54,316 Boot Camp is one incarnation of this. 209 00:07:54,656 --> 00:07:57,036 But that requires that you restart your whole computer 210 00:07:57,036 --> 00:07:58,866 to choose between Mac OS and Windows. 211 00:07:58,866 --> 00:08:01,086 Well, that's not really necessary these days, 212 00:08:01,086 --> 00:08:02,486 thanks to virtual machines. 213 00:08:02,486 --> 00:08:06,966 You can literally run Windows inside of a window on your Mac. 214 00:08:07,196 --> 00:08:09,996 But unfortunately, Apple does not really let the reverse be 215 00:08:09,996 --> 00:08:13,356 possible on PCs, albeit with a few exceptions. 216 00:08:13,626 --> 00:08:16,286 So let me now counterbalance this ad with one other from one 217 00:08:16,286 --> 00:08:24,996 of our own CS 50 CAs' groups, this is a good commercial. 218 00:08:26,516 --> 00:09:06,676 [ Music and singing ] 219 00:09:07,176 --> 00:09:12,956 >> So the cool people will be going to that party. 220 00:09:13,076 --> 00:09:13,766 Another dance? 221 00:09:13,896 --> 00:09:14,956 So there's lots of competition. 222 00:09:15,086 --> 00:09:16,486 But I did forget the best part. 223 00:09:16,486 --> 00:09:17,876 Let me toggle over. 
224 00:09:18,076 --> 00:09:21,556 So this morning I downloaded 22 pages of notes for how 225 00:09:21,556 --> 00:09:23,196 to host my own Windows launch party. 226 00:09:23,536 --> 00:09:24,936 So you can do this too. 227 00:09:24,936 --> 00:09:26,726 We linked to it on the course's lecture page. 228 00:09:26,726 --> 00:09:30,376 They have -- come on, focus, focus -- 229 00:09:30,546 --> 00:09:35,016 so you can download these things here, what do I do at the party, 230 00:09:35,076 --> 00:09:37,666 introduce; we'll be making the computer uniquely our own 231 00:09:37,666 --> 00:09:39,326 by changing the desktop background. 232 00:09:39,326 --> 00:09:43,906 And I only downloaded two of the like five or six PDFs. 233 00:09:44,076 --> 00:09:45,476 So anyhow, bless their hearts. 234 00:09:45,476 --> 00:09:47,686 Microsoft has been very good and generous to the course. 235 00:09:47,756 --> 00:09:51,446 But someone over there is I think out of control. 236 00:09:53,526 --> 00:09:57,026 So -- without further ado, we left off on Monday talking 237 00:09:57,026 --> 00:10:00,266 about pointers, which are just addresses. 238 00:10:00,266 --> 00:10:02,306 And we also talked a bit about GDB. 239 00:10:02,306 --> 00:10:05,416 So in reverse order here, GDB again is a debugger. 240 00:10:05,526 --> 00:10:07,966 This is a tool that allows you to step through your code 241 00:10:07,966 --> 00:10:11,136 and as the name implies, debug your code, find problems 242 00:10:11,136 --> 00:10:14,196 with it a little more easily sometimes than you can 243 00:10:14,396 --> 00:10:16,366 by just reasoning through your own code 244 00:10:16,366 --> 00:10:18,226 or putting printfs all over the place. 
245 00:10:18,476 --> 00:10:20,036 And it's actually been wonderful to see 246 00:10:20,036 --> 00:10:23,126 on help.cs50.net the past couple of days, and the bulletin board, 247 00:10:23,306 --> 00:10:25,566 a bunch of you, even without much encouragement 248 00:10:25,566 --> 00:10:27,426 from us have actually been trying 249 00:10:27,426 --> 00:10:29,776 out GDB before reaching out with a question. 250 00:10:29,776 --> 00:10:32,426 So let me make that plea formally now. 251 00:10:32,726 --> 00:10:35,606 The next time you run up against a bug, before you think you need 252 00:10:35,606 --> 00:10:37,656 to resort to office hours, before you think you need 253 00:10:37,656 --> 00:10:42,416 to turn to the bulletin board or e-mail, do just fire up GDB, 254 00:10:42,416 --> 00:10:45,696 run it on your program as we saw on Monday, gdb, 255 00:10:45,696 --> 00:10:47,576 space, program name, and then a few 256 00:10:47,576 --> 00:10:50,606 of the commands I introduced are probably sufficient for now 257 00:10:50,606 --> 00:10:54,006 to start poking around, set a breakpoint at main, type next, 258 00:10:54,256 --> 00:10:56,756 maybe step, and just walk through your program. 259 00:10:56,756 --> 00:10:58,976 And there is on the resources page 260 00:10:58,976 --> 00:11:01,626 of the course web site a PDF, it's more detail 261 00:11:01,626 --> 00:11:04,436 than you'll want, probably, in this first week of using GDB, 262 00:11:04,676 --> 00:11:06,286 but it's essentially a cheat sheet for all 263 00:11:06,286 --> 00:11:07,056 of the things you can do. 264 00:11:07,056 --> 00:11:09,946 And you'll probably see me and the TFs do even more than that. 265 00:11:09,946 --> 00:11:11,226 So do start there with GDB, 266 00:11:11,226 --> 00:11:13,096 and you'll be surprised how powerful it is. 
267 00:11:13,436 --> 00:11:16,736 So I spent yesterday using my PHP and MySQL skills to whip 268 00:11:16,736 --> 00:11:20,696 up a little grades interface for the TFs to input all grades 269 00:11:20,696 --> 00:11:22,586 into the course's web site, and then for you guys 270 00:11:23,106 --> 00:11:25,736 on an encrypted page to view them. 271 00:11:25,836 --> 00:11:28,026 So the TFs are in the process of uploading scores. 272 00:11:28,026 --> 00:11:31,026 You probably received them via e-mail already, but do take a look 273 00:11:31,026 --> 00:11:32,916 at the new grades link on the course's web site. 274 00:11:33,116 --> 00:11:35,286 And this is meant to be a sanity check for you guys too, 275 00:11:35,286 --> 00:11:38,556 to make sure that our records reflect what you think you 276 00:11:38,556 --> 00:11:40,696 in fact received on some P set or quiz. 277 00:11:40,886 --> 00:11:41,796 No new handouts today. 278 00:11:42,466 --> 00:11:46,666 Okay, so we left off talking about pointers and memory 279 00:11:46,666 --> 00:11:49,726 and we said that this is going to allow us to do a lot more 280 00:11:49,726 --> 00:11:52,906 because it's really giving us pretty low level control 281 00:11:52,906 --> 00:11:57,266 of the computer in terms of memory and ultimately hardware, 282 00:11:57,396 --> 00:11:59,916 but we also promised that we can do some damage. 283 00:11:59,916 --> 00:12:03,026 And we'll see over time how bad things can happen, and frankly, 284 00:12:03,256 --> 00:12:05,826 a number of you have already experienced this most recent 285 00:12:05,826 --> 00:12:10,246 week bad things happening, like weird things appearing 286 00:12:10,286 --> 00:12:12,396 in your Game of Fifteen board all of a sudden, 287 00:12:12,396 --> 00:12:14,546 and very often we've seen this is the result 288 00:12:14,546 --> 00:12:17,726 of your overstepping the bounds of some array. 289 00:12:17,896 --> 00:12:19,016 For instance, your board array. 
290 00:12:19,406 --> 00:12:23,166 So just to quickly recap over here, if you declare something 291 00:12:23,166 --> 00:12:26,726 like int x, a little something like that, 292 00:12:27,386 --> 00:12:28,856 and hopefully you all can see this, 293 00:12:28,886 --> 00:12:30,166 but I'll recite anything I write. 294 00:12:30,166 --> 00:12:32,716 So if I declare int x, pictorially, 295 00:12:32,716 --> 00:12:34,396 we've been drawing this as a little square. 296 00:12:34,646 --> 00:12:36,786 This square is how big in size? 297 00:12:37,516 --> 00:12:39,776 32 bits. AKA, 4 bytes. 298 00:12:39,776 --> 00:12:42,646 And this would be called x. So it has some little label 299 00:12:42,746 --> 00:12:44,206 and it's in fact 4 bytes. 300 00:12:44,206 --> 00:12:47,926 If instead I do something like this, char c, 301 00:12:48,326 --> 00:12:50,376 what does that look like in memory? 302 00:12:50,466 --> 00:12:52,296 It too is a chunk of memory, but how big is it? 303 00:12:53,666 --> 00:12:57,336 So it's just 8 bits or 1 byte in memory. 304 00:12:57,386 --> 00:13:00,696 So this would be c. So the quick recap is chars are 8 bits 305 00:13:00,696 --> 00:13:06,936 or 1 byte, shorts are 16 bits or 2 bytes, ints are 4 bytes, 306 00:13:06,986 --> 00:13:09,576 longs are 4 bytes on these computers these days. 307 00:13:09,836 --> 00:13:14,966 And long long is 8 bytes, or 64 bits. 308 00:13:14,966 --> 00:13:16,336 And we have floats and doubles, 309 00:13:16,336 --> 00:13:20,066 which are similarly 32 and 64 bits in size. 310 00:13:20,286 --> 00:13:21,866 But then we started doing other things. 311 00:13:21,866 --> 00:13:24,066 We started saying things like int -- 312 00:13:24,716 --> 00:13:28,266 ints, let's say arr bracket 4. 313 00:13:28,956 --> 00:13:32,626 So this declares an array, and so an array 314 00:13:32,626 --> 00:13:35,286 in memory looks instead a little bigger. 315 00:13:35,286 --> 00:13:38,026 So this is going to look a little something like this. 
316 00:13:38,126 --> 00:13:41,166 It's going to have a label of arr, and then what's going 317 00:13:41,166 --> 00:13:42,666 to be inside of this chunk of memory 318 00:13:42,666 --> 00:13:44,016 when I write this line of code here? 319 00:13:44,556 --> 00:13:44,706 Ones? 320 00:13:44,741 --> 00:13:46,741 [ Inaudible audience comment ] 321 00:13:46,776 --> 00:13:49,666 >> So garbage values, right? 322 00:13:50,476 --> 00:13:52,026 We don't know, right? 323 00:13:52,026 --> 00:13:54,946 We don't know, and we've seen examples already, 324 00:13:54,946 --> 00:13:58,186 if we just declare a variable or declare an array of ints 325 00:13:58,186 --> 00:14:00,456 or any data type, who knows what's going to be in there. 326 00:14:00,456 --> 00:14:02,806 So one lesson on Monday was to make sure 327 00:14:02,806 --> 00:14:04,566 to always initialize your data, 328 00:14:04,806 --> 00:14:06,906 otherwise unexpected things can happen. 329 00:14:07,136 --> 00:14:09,786 Now arr here is the name of this array. 330 00:14:10,076 --> 00:14:12,966 But now you can start thinking of this arr, 331 00:14:12,966 --> 00:14:15,036 the name of this variable, if it's an array, 332 00:14:15,036 --> 00:14:18,026 as not just a label or a name, but it's kind 333 00:14:18,026 --> 00:14:19,486 of the address in memory. 334 00:14:19,486 --> 00:14:21,626 Because one of the other things we said on Monday is 335 00:14:21,626 --> 00:14:24,876 that when you allocate a whole bunch of things in memory by way 336 00:14:24,876 --> 00:14:27,596 of an array they're contiguous, and this is important, 337 00:14:27,596 --> 00:14:30,746 contiguous as in the first int is here, the second int is here, 338 00:14:30,746 --> 00:14:33,416 and the third and the fourth are all back-to-back in memory. 
339 00:14:33,696 --> 00:14:35,756 And that means you can then index 340 00:14:35,756 --> 00:14:38,026 into them using this square bracket notation 341 00:14:38,316 --> 00:14:39,526 because the computer knows 342 00:14:39,526 --> 00:14:41,866 that the zero element is right at the beginning. 343 00:14:42,146 --> 00:14:45,196 The one element is then 4 bytes over, 344 00:14:45,446 --> 00:14:48,916 the 2 element is 8 bytes over, and so forth. 345 00:14:48,916 --> 00:14:51,526 So arrays give us what's called random access, 346 00:14:51,526 --> 00:14:53,236 because you can jump anywhere in the array you want, 347 00:14:53,286 --> 00:14:55,506 so long as you know the left-hand side, 348 00:14:55,836 --> 00:14:58,276 and the right-hand side or the length, but we'll see 349 00:14:58,276 --> 00:15:00,676 in just a couple of weeks that there are other data structures 350 00:15:00,676 --> 00:15:02,376 that are actually more sophisticated, 351 00:15:02,496 --> 00:15:04,796 but for which we're going to have to give up that feature. 352 00:15:05,066 --> 00:15:07,586 Now what does it take to remember 353 00:15:07,586 --> 00:15:09,096 where an array is in memory? 354 00:15:09,456 --> 00:15:11,756 Well, yes you could certainly remember the address 355 00:15:11,756 --> 00:15:15,216 of these 32 bits, these 32 bits, these, and these. 356 00:15:15,606 --> 00:15:18,096 But again, it suffices, so far as we've seen, 357 00:15:18,326 --> 00:15:20,106 to just remember the first address, 358 00:15:20,506 --> 00:15:23,386 so long as you also remember what? 359 00:15:24,916 --> 00:15:26,326 So the length of the array. 360 00:15:26,536 --> 00:15:29,486 So unlike Java, those of you who have programmed before, 361 00:15:29,486 --> 00:15:34,216 unlike Java, C arrays do not let you ask the question how big is 362 00:15:34,216 --> 00:15:34,666 this array. 363 00:15:34,666 --> 00:15:36,376 At least usually. 364 00:15:36,376 --> 00:15:38,416 There are a couple of exceptions. 
365 00:15:38,656 --> 00:15:40,546 But you can't ask that, so you have to keep it around 366 00:15:40,546 --> 00:15:42,016 in your own variable like n. 367 00:15:42,346 --> 00:15:45,286 So we also talked increasingly about this thing. 368 00:15:45,286 --> 00:15:47,866 So we said -- let's just rotate this around, 369 00:15:47,866 --> 00:15:50,286 we started talking more about strings, 370 00:15:50,586 --> 00:15:52,956 and a string is just a synonym for what data type? 371 00:15:54,756 --> 00:15:58,906 Yeah, so char star or really a string we've said is an array 372 00:15:58,906 --> 00:16:01,446 of characters, but char star implies 373 00:16:01,446 --> 00:16:03,686 that it's actually a pointer or an address. 374 00:16:04,036 --> 00:16:06,616 So these things too are now kind of the same. 375 00:16:06,876 --> 00:16:11,426 If I do something like -- if I declare a string, 376 00:16:13,136 --> 00:16:14,606 let's actually leave room this time, 377 00:16:14,606 --> 00:16:18,366 if I actually declare a string and store in it a short word 378 00:16:18,366 --> 00:16:24,436 like foo, I might do string, s gets quote unquote, foo. 379 00:16:24,616 --> 00:16:28,096 All right, and in memory, what does this look like? 380 00:16:28,096 --> 00:16:30,246 Well, how many bytes do I need to draw on the board? 381 00:16:31,466 --> 00:16:31,946 Okay, good. 382 00:16:31,946 --> 00:16:32,776 So that was good. 383 00:16:33,136 --> 00:16:34,466 Not easy to trip up just yet. 384 00:16:34,886 --> 00:16:37,746 So we need 4 bytes, because we need F-O-O 385 00:16:37,746 --> 00:16:39,576 and then the backslash 0. 386 00:16:39,576 --> 00:16:41,536 And in order to write a backslash 0, 387 00:16:41,806 --> 00:16:44,526 the character that's going here, this is the character F, 388 00:16:44,526 --> 00:16:46,476 and remember with char you use single quotes. 
389 00:16:46,796 --> 00:16:50,306 This is the character or char O, character or char O, 390 00:16:50,306 --> 00:16:52,126 and then this is also single quote, 391 00:16:52,536 --> 00:16:54,736 backslash 0, single quote. 392 00:16:54,846 --> 00:16:56,406 So that's all that means there. 393 00:16:56,666 --> 00:16:59,426 You may see some textbooks just write a literal zero, 394 00:16:59,426 --> 00:17:02,186 but I would say many or most people just use this notation 395 00:17:02,186 --> 00:17:04,336 that this is in fact a char, just happens 396 00:17:04,366 --> 00:17:05,906 to be the actual number 0. 397 00:17:06,266 --> 00:17:08,586 But I can -- this is clearly the same thing as this. 398 00:17:09,126 --> 00:17:14,586 So char star s equals bar actually does the exact same 399 00:17:14,586 --> 00:17:16,066 thing in memory, because string 400 00:17:16,066 --> 00:17:18,816 and char star are just synonymous, as we'll see 401 00:17:18,816 --> 00:17:20,426 in the CS 50 library today. 402 00:17:20,746 --> 00:17:23,756 So now we have B A R backslash 0. 403 00:17:24,036 --> 00:17:25,096 So what is s? 404 00:17:25,096 --> 00:17:30,016 It's the label, yes, of these actual strings, rather, 405 00:17:30,016 --> 00:17:32,696 it's the label for these actual strings. 406 00:17:33,036 --> 00:17:35,776 But it's also as we're starting to see now an address. 407 00:17:36,006 --> 00:17:39,076 An address is something that's numeric and it's something 408 00:17:39,076 --> 00:17:41,836 that we're going to be able to perform tricks 409 00:17:41,836 --> 00:17:44,776 on to actually really start manipulating our memory. 410 00:17:44,776 --> 00:17:47,326 So with that said, let's take a look at this example here. 411 00:17:47,706 --> 00:17:50,926 So this is again from the other day's handouts. 412 00:17:51,436 --> 00:17:55,516 pointer1.c. 
And what I did today was I ran a little script 413 00:17:55,516 --> 00:17:58,336 on my own code that removed all of the comments just 414 00:17:58,336 --> 00:18:00,546 for the sake of discussion, but your printouts have some 415 00:18:00,546 --> 00:18:02,826 of the answers to some of these provocative questions. 416 00:18:03,146 --> 00:18:05,216 This program here is pointer1.c. 417 00:18:05,216 --> 00:18:06,846 And the first interesting line 418 00:18:06,846 --> 00:18:09,986 of code has me calling get string, 419 00:18:10,546 --> 00:18:12,436 so this is going to return what? 420 00:18:12,776 --> 00:18:14,886 So get string prompts the user for a string, 421 00:18:14,886 --> 00:18:16,936 when I finally type some characters and hit enter, 422 00:18:17,276 --> 00:18:19,466 what is it that's being returned, technically? 423 00:18:21,936 --> 00:18:25,436 So conceptually, I'm getting back a string, 424 00:18:25,626 --> 00:18:26,666 a little more concretely, 425 00:18:26,666 --> 00:18:29,336 I'm getting back an array of characters. 426 00:18:29,716 --> 00:18:33,186 But again, return values in a C function can only be one thing, 427 00:18:33,186 --> 00:18:35,346 and you can't just hand me a whole bunch of characters. 428 00:18:35,346 --> 00:18:36,866 So what am I literally being handed 429 00:18:37,106 --> 00:18:38,926 as a return value for get string? 430 00:18:40,076 --> 00:18:40,596 A pointer. 431 00:18:40,746 --> 00:18:42,536 So an address of that string. 432 00:18:42,746 --> 00:18:44,136 So apparently, and you'll see this 433 00:18:44,136 --> 00:18:45,706 when we peel back this layer today, 434 00:18:45,706 --> 00:18:48,466 get string is actually allocating a chunk of memory, 435 00:18:48,466 --> 00:18:51,176 it's saying to Linux I need a bunch of bytes of memory, 436 00:18:51,176 --> 00:18:52,626 I need 4 bytes, I need 8 bytes. 437 00:18:52,626 --> 00:18:56,656 Whatever. 
Tell me at what address I can start writing these 438 00:18:56,686 --> 00:18:57,976 characters into RAM. 439 00:18:58,386 --> 00:19:00,726 Because then once I'm done writing the characters 440 00:19:00,726 --> 00:19:03,986 that the user has typed in, I'm going to return to the caller, 441 00:19:03,986 --> 00:19:07,206 the function that called me, the address of the first byte 442 00:19:07,206 --> 00:19:09,436 that you the operating system handed me. 443 00:19:09,646 --> 00:19:12,146 All right, so why am I checking for null here? 444 00:19:12,146 --> 00:19:14,566 Under what circumstances might something 445 00:19:14,566 --> 00:19:16,966 like get string return this special sentinel 446 00:19:16,966 --> 00:19:17,826 value null? 447 00:19:19,316 --> 00:19:19,466 Sorry? 448 00:19:19,466 --> 00:19:20,236 [ Inaudible audience comment ] 449 00:19:20,236 --> 00:19:25,076 >> I didn't enter anything, so maybe I hit, for instance, 450 00:19:25,076 --> 00:19:28,196 Control-D, which is sort of an esoteric trick. 451 00:19:28,456 --> 00:19:31,076 If you essentially want to tell the computer you're not 452 00:19:31,076 --> 00:19:34,506 providing anything, you can send what's called the end of file 453 00:19:34,506 --> 00:19:37,836 or E O F signal or character, and that's usually done 454 00:19:37,836 --> 00:19:40,376 by hitting Control-D. So we need to be able to handle that. 455 00:19:40,866 --> 00:19:42,276 But under what other sort 456 00:19:42,276 --> 00:19:46,776 of more familiar circumstances might get string not be able 457 00:19:46,776 --> 00:19:52,276 to return to me the address of some string in memory? 458 00:19:52,276 --> 00:19:53,876 Again, start thinking corner cases. 459 00:19:53,876 --> 00:19:54,746 What could go wrong? 460 00:19:54,746 --> 00:19:57,286 What could a really obnoxious user try doing just 461 00:19:57,286 --> 00:19:58,236 to mess with me? 462 00:19:58,446 --> 00:20:02,156 Got to be a little more committal.
463 00:20:02,766 --> 00:20:05,956 Like, what could go wrong. 464 00:20:07,296 --> 00:20:08,646 Okay, so just pressing enter. 465 00:20:08,646 --> 00:20:10,686 So hopefully I handle really short strings. 466 00:20:10,876 --> 00:20:12,826 And in fact this code does. 467 00:20:13,066 --> 00:20:14,776 What's the -- what's the opposite of that, right? 468 00:20:14,776 --> 00:20:16,866 Again, corner cases generally mean, you know, 469 00:20:16,866 --> 00:20:18,856 if you're expecting a number, give it a word. 470 00:20:18,856 --> 00:20:20,216 If you're expecting a positive number, 471 00:20:20,216 --> 00:20:21,786 give it a negative, see what happens. 472 00:20:21,786 --> 00:20:24,046 So in this case, it's expecting a string, 473 00:20:24,046 --> 00:20:24,936 don't give it a string. 474 00:20:24,936 --> 00:20:27,536 Or what's the opposite of that. 475 00:20:27,716 --> 00:20:30,386 Give it the biggest fricking string that you can just by, 476 00:20:30,386 --> 00:20:32,756 you know, holding down your keyboard for a while 477 00:20:32,756 --> 00:20:34,596 and then hitting enter, just to see 478 00:20:34,596 --> 00:20:37,166 if you can overflow the memory in the computer, 479 00:20:37,166 --> 00:20:38,076 because this too would be bad. 480 00:20:38,566 --> 00:20:41,836 Because if the operating system doesn't have a billion bytes 481 00:20:41,906 --> 00:20:44,186 to give you, I really went to town on the keyboard 482 00:20:44,186 --> 00:20:46,056 and tried typing in a billion characters. 483 00:20:46,206 --> 00:20:50,336 Well, if there's not that much RAM in the computer, get string, 484 00:20:50,546 --> 00:20:53,456 you know, maybe it could return part of the string, 485 00:20:53,456 --> 00:20:55,546 just the first several thousand characters. 486 00:20:55,886 --> 00:20:58,076 Or something bad's going to happen because it's going 487 00:20:58,076 --> 00:20:59,886 to start overwriting important memory. 
488 00:20:59,886 --> 00:21:02,796 That's where we would need to check the documentation. 489 00:21:02,796 --> 00:21:07,186 And as you'll see in cs50.h, our own header file, 490 00:21:07,186 --> 00:21:08,656 which you do have a printout of as well, 491 00:21:09,016 --> 00:21:12,876 this is where, in the absence of a man page, you can actually turn 492 00:21:12,876 --> 00:21:14,796 for documentation for functions. 493 00:21:15,066 --> 00:21:17,556 So if you've got some source code that you're using from us 494 00:21:17,826 --> 00:21:20,586 or from something you downloaded for future projects, I mean, 495 00:21:20,586 --> 00:21:22,976 honestly, looking at the source code is a very good place 496 00:21:22,976 --> 00:21:26,196 to start, assuming that that person has exercised some good 497 00:21:26,196 --> 00:21:28,916 design and style and actually documented it. 498 00:21:28,916 --> 00:21:30,386 So I'm curious, let me scroll down. 499 00:21:30,786 --> 00:21:33,646 Okay, get char, I'm not quite interested in that yet. 500 00:21:33,646 --> 00:21:35,116 Get float, get int. 501 00:21:35,366 --> 00:21:37,056 And now notice, this is a header file. 502 00:21:37,056 --> 00:21:42,766 So what do you see and what do you not see in this file? 503 00:21:42,976 --> 00:21:46,356 Sorry? So these are all called what, here? 504 00:21:46,786 --> 00:21:48,236 These things that are not comments? 505 00:21:49,716 --> 00:21:53,076 So these are the function declarations or the prototypes, 506 00:21:53,076 --> 00:21:56,196 they're not -- the functions are not yet implemented or defined. 507 00:21:56,476 --> 00:21:59,056 So declaring a function means telling me what it's going 508 00:21:59,056 --> 00:21:59,576 to look like. 509 00:21:59,646 --> 00:22:02,456 Defining or implementing a function is actually writing the 510 00:22:02,456 --> 00:22:04,966 code between the curly braces that implements that function.
511 00:22:05,276 --> 00:22:08,186 So a header file generally, as we'll start to see today, 512 00:22:08,386 --> 00:22:11,466 you guys have been sharp-including header files 513 00:22:11,466 --> 00:22:15,756 for some time, is generally not much code, but rather a bunch 514 00:22:15,756 --> 00:22:18,406 of comments describing what this library 515 00:22:18,406 --> 00:22:20,006 or what this file can do. 516 00:22:20,276 --> 00:22:23,966 And also declarations for things you, the programmer, might need. 517 00:22:24,206 --> 00:22:28,376 So for instance, when you have been using standard lib dot H 518 00:22:28,416 --> 00:22:32,126 or math dot H or any of the other dot H files 519 00:22:32,126 --> 00:22:34,306 that you may have found useful over the past couple of weeks, 520 00:22:34,636 --> 00:22:37,606 what you're doing is telling GCC to go look 521 00:22:37,606 --> 00:22:40,046 on the local file system, the local Linux hard drive, 522 00:22:40,386 --> 00:22:42,116 find the file called math dot H, 523 00:22:42,166 --> 00:22:45,526 because in that file is a list just like this of all 524 00:22:45,526 --> 00:22:47,866 of the special math functions someone else put a lot of work 525 00:22:47,866 --> 00:22:49,306 into writing for my benefit, 526 00:22:49,616 --> 00:22:52,716 and this way GCC now knows what -- 527 00:22:52,716 --> 00:22:55,106 what functions I can call. 528 00:22:55,316 --> 00:22:56,956 But generally, there's one other step. 529 00:22:57,606 --> 00:22:58,896 Simply including the C -- 530 00:22:58,896 --> 00:23:03,816 the dot H file with sharp include is only telling GCC 531 00:23:03,946 --> 00:23:05,466 that this function exists. 532 00:23:05,556 --> 00:23:09,476 How do you then tell GCC where to get the actual bytes 533 00:23:09,506 --> 00:23:11,086 that comprise this file? 534 00:23:11,796 --> 00:23:13,576 How do you link them in, so to speak? 535 00:23:14,066 --> 00:23:17,306 And that's the keyword. 536 00:23:17,306 --> 00:23:20,376 Link.
At what point do you link in the math library? 537 00:23:21,896 --> 00:23:25,816 Yeah. So at compile time, with the -lm flag. 538 00:23:25,816 --> 00:23:27,536 So it's essentially a two-part process. 539 00:23:27,536 --> 00:23:28,906 This is sort of full disclosure. 540 00:23:28,906 --> 00:23:32,386 Hey, GCC, here comes some functions that I want to use. 541 00:23:32,386 --> 00:23:33,456 Here comes some constants, 542 00:23:33,456 --> 00:23:35,646 here comes some synonyms I want to use. 543 00:23:35,906 --> 00:23:39,936 But to actually tell GCC to compile in the zeros and ones 544 00:23:39,936 --> 00:23:41,326 that implement this stuff, 545 00:23:41,326 --> 00:23:43,216 that live in a different file altogether, 546 00:23:43,496 --> 00:23:47,496 probably a cs50.o file or a math.o file, 547 00:23:47,786 --> 00:23:51,596 you need that linker flag at the command line, the -lm, 548 00:23:51,596 --> 00:23:53,176 -lcs50, and so forth. 549 00:23:53,176 --> 00:23:55,296 So we were looking for get string, here it is. 550 00:23:55,516 --> 00:23:57,646 So apparently get string returns a string, 551 00:23:58,226 --> 00:23:59,476 and here's its documentation. 552 00:23:59,476 --> 00:24:02,066 Reads a line of text from standard input and returns it 553 00:24:02,066 --> 00:24:04,146 as a string, sans newline character. 554 00:24:04,426 --> 00:24:07,166 So in other words, even though I might type foo and hit enter, 555 00:24:07,426 --> 00:24:09,236 apparently this function's going to get rid 556 00:24:09,236 --> 00:24:12,506 of the newline character for me, and just return F-O-O 557 00:24:12,506 --> 00:24:14,986 and then the terminating character, backslash 0. 558 00:24:15,316 --> 00:24:19,106 Ergo, if the user inputs only enter, it returns "", 559 00:24:19,186 --> 00:24:22,616 quote unquote, in answer to your question, not null. 560 00:24:22,686 --> 00:24:24,176 So there's apparently a distinction here.
561 00:24:24,176 --> 00:24:25,766 Null is the special thing 562 00:24:25,766 --> 00:24:28,176 which signifies I've got nothing for you. 563 00:24:28,456 --> 00:24:31,036 But "", quote unquote, in computer science is generally known 564 00:24:31,036 --> 00:24:33,656 as the empty string, which is a string, 565 00:24:34,006 --> 00:24:37,096 but there's just no actual readable characters there. 566 00:24:37,496 --> 00:24:40,406 So there's a difference: null is nothing. 567 00:24:40,596 --> 00:24:43,156 Whereas the empty string represented in code 568 00:24:43,156 --> 00:24:45,866 like this is actually represented how in memory? 569 00:24:46,306 --> 00:24:46,966 With how many bytes? 570 00:24:48,346 --> 00:24:50,696 So there is in fact 1 byte being used 571 00:24:50,696 --> 00:24:52,336 to represent the so-called empty string, 572 00:24:52,336 --> 00:24:55,786 and that byte simply contains backslash 0. 573 00:24:55,786 --> 00:24:56,636 So that's the difference. 574 00:24:56,636 --> 00:25:00,246 Whereas null, this special constant null, this is like -- 575 00:25:00,506 --> 00:25:02,846 there is nothing actually there. 576 00:25:02,846 --> 00:25:05,266 I'm returning just the special sentinel value. 577 00:25:05,606 --> 00:25:07,406 Okay, so what about string? 578 00:25:07,596 --> 00:25:09,076 Well, all this time we've been saying 579 00:25:09,076 --> 00:25:12,816 that string is just a synonym for char star, and this is why. 580 00:25:12,916 --> 00:25:15,766 So here at top left at the top of cs50.h, 581 00:25:16,026 --> 00:25:17,626 there's this feature called typedef, 582 00:25:17,986 --> 00:25:19,966 where you can define your own data types. 583 00:25:19,966 --> 00:25:23,206 And we'll also do this for more sophisticated purposes. 584 00:25:23,526 --> 00:25:25,706 But what typedef here is saying from left to right, 585 00:25:26,036 --> 00:25:31,286 is declare a type called string that is identical to char star.
586 00:25:31,686 --> 00:25:32,706 So it's just a synonym. 587 00:25:32,706 --> 00:25:35,186 And we only do it at the start of the semester just to kind 588 00:25:35,186 --> 00:25:36,786 of simplify things a little bit. 589 00:25:37,156 --> 00:25:39,346 But realize too, you may see things like this. 590 00:25:39,806 --> 00:25:42,666 So you don't need to have the star, FYI, 591 00:25:42,716 --> 00:25:43,996 right next to the variable name. 592 00:25:43,996 --> 00:25:45,466 You'll often see code like this. 593 00:25:45,756 --> 00:25:47,746 It generally tends to be clearer, though, 594 00:25:47,816 --> 00:25:50,206 if any time you define a pointer, moving forward, 595 00:25:50,426 --> 00:25:52,246 it is right next to the variable, 596 00:25:52,246 --> 00:25:54,076 or in this case the data type name. 597 00:25:54,286 --> 00:25:56,126 So just FYI on that. 598 00:25:56,126 --> 00:25:58,266 Okay, so now let's take a look again at this code 599 00:25:58,736 --> 00:26:00,616 that opened this line of discussion. 600 00:26:00,616 --> 00:26:02,596 So here we go with the code. 601 00:26:02,936 --> 00:26:07,356 I call get string, I store it in S, I then do a sanity check. 602 00:26:07,356 --> 00:26:09,476 If it equals null, just return 1. 603 00:26:09,556 --> 00:26:12,476 Bad stuff's going to happen if I try using this thing otherwise. 604 00:26:12,796 --> 00:26:14,636 And then this for loop has what? 605 00:26:14,636 --> 00:26:17,926 So int I gets 0, N gets string length of S, okay, 606 00:26:17,926 --> 00:26:20,036 so that's sort of boring for loop stuff. 607 00:26:20,036 --> 00:26:22,286 But there is something interesting here. 608 00:26:23,666 --> 00:26:26,896 What is going on here, exactly? 609 00:26:30,956 --> 00:26:33,676 So what is S, first in technical terms? 610 00:26:33,676 --> 00:26:34,006 What is S? 611 00:26:35,686 --> 00:26:37,146 So it's a pointer, which is an address, 612 00:26:37,496 --> 00:26:38,836 which is just a number, right?
613 00:26:38,836 --> 00:26:41,036 So every time I draw RAM on the blackboard, 614 00:26:41,036 --> 00:26:44,166 I draw it as a rectangle, and I say byte 0 is here, 615 00:26:44,166 --> 00:26:46,416 then there's byte 1, then byte 2, then dot dot dot, 616 00:26:46,656 --> 00:26:49,696 byte 2 billion, if you've got 2 gigabytes of RAM. 617 00:26:49,956 --> 00:26:51,656 So memory can just be addressed numerically. 618 00:26:51,656 --> 00:26:54,396 I don't know if it's bottom up or top down, really depends 619 00:26:54,396 --> 00:26:57,206 on how, you know, the machine is viewing its chips of RAM. 620 00:26:57,556 --> 00:27:00,846 But in this case, we just care that S is a pointer 621 00:27:00,966 --> 00:27:03,876 or an address or number, all the same here, 622 00:27:04,106 --> 00:27:06,056 and so what am I doing on each iteration? 623 00:27:06,346 --> 00:27:08,876 Well, S is the starting point 624 00:27:09,236 --> 00:27:11,136 of whatever string the user has typed in. 625 00:27:11,316 --> 00:27:16,476 So if the user has typed in F-O-O backslash 0, 626 00:27:17,056 --> 00:27:18,936 first of all what is the value of N going 627 00:27:18,936 --> 00:27:20,406 to be within this loop? 628 00:27:22,656 --> 00:27:26,166 Hmm. What's the value of N going to be, 629 00:27:26,166 --> 00:27:27,906 once called here with string length? 630 00:27:28,016 --> 00:27:30,816 So it is going to be 3. 631 00:27:30,896 --> 00:27:33,336 So when we ask about the length of a string we mean 632 00:27:33,336 --> 00:27:36,356 in human terms, not special computer encoding terms. 633 00:27:36,356 --> 00:27:39,356 So the string length here is in fact 3, 634 00:27:39,356 --> 00:27:41,056 it is not 4, just to be clear. 635 00:27:41,336 --> 00:27:42,986 So what is the length of this string? 636 00:27:43,626 --> 00:27:44,846 So 0. Right. 637 00:27:44,846 --> 00:27:47,226 Because there's no actual alphabetic characters 638 00:27:47,226 --> 00:27:47,816 or otherwise there.
639 00:27:47,816 --> 00:27:49,086 Okay, so what is S? 640 00:27:49,276 --> 00:27:50,866 Well, S is a pointer. 641 00:27:51,086 --> 00:27:54,606 So pointers we know are 32-bit chunks of memory. 642 00:27:54,976 --> 00:27:57,356 Right? So an address is 32 bits. 643 00:27:57,476 --> 00:27:58,776 So it looks like an int, 644 00:27:59,266 --> 00:28:01,826 but it's a special data type called a pointer. 645 00:28:02,106 --> 00:28:03,606 And now what is inside of S? 646 00:28:04,246 --> 00:28:07,236 Well, S is technically going to have the address 647 00:28:07,336 --> 00:28:09,536 of the first character in the string. 648 00:28:09,976 --> 00:28:10,846 See, we're going to push the limits 649 00:28:10,846 --> 00:28:12,916 of my handwriting ability on a tablet here. 650 00:28:13,176 --> 00:28:23,756 But if this is 0x10, this is 0x11, so now we're dealing 651 00:28:24,106 --> 00:28:26,306 with chars today and not ints. 652 00:28:26,306 --> 00:28:27,396 So a char is 1 byte. 653 00:28:27,396 --> 00:28:29,396 So the size of the data type is now important. 654 00:28:29,616 --> 00:28:30,976 This is 0x12. 655 00:28:30,976 --> 00:28:33,906 And then 0x13. 656 00:28:34,076 --> 00:28:37,406 What is actually stored in this 32-bit chunk of memory called S? 657 00:28:38,836 --> 00:28:40,366 0x10. Right? 658 00:28:40,366 --> 00:28:43,616 0x10. The address of the first character, or pictorially, 659 00:28:43,616 --> 00:28:45,386 and it's a lot more user friendly just 660 00:28:45,386 --> 00:28:46,746 to start drawing things with arrows, 661 00:28:46,996 --> 00:28:48,966 what we essentially have is a pointer, 662 00:28:49,306 --> 00:28:50,796 pointing to the first character there. 663 00:28:50,886 --> 00:28:53,446 Okay, so with that said, S is then the address 664 00:28:53,906 --> 00:28:56,056 of this first piece of memory. 665 00:28:56,296 --> 00:29:01,596 So this loop iterates from 0 on up to 3.
666 00:29:01,596 --> 00:29:04,496 So it's going to execute for I equals 0, 1, and 2, 667 00:29:04,556 --> 00:29:06,206 on up to 3 but less than 3. 668 00:29:06,536 --> 00:29:07,376 So what am I printing? 669 00:29:07,376 --> 00:29:09,086 I'm printing a character and a new line 670 00:29:09,086 --> 00:29:10,986 on each iteration, and what am I doing? 671 00:29:10,986 --> 00:29:12,076 I'm printing some math. 672 00:29:12,326 --> 00:29:13,506 So S plus I. 673 00:29:13,846 --> 00:29:17,196 So S plus 0 is what on the first iteration, what number? 674 00:29:18,476 --> 00:29:19,456 0x10, right? 675 00:29:19,456 --> 00:29:20,636 The hexadecimal value 10. 676 00:29:20,636 --> 00:29:22,436 Which is just a number represented in hex. 677 00:29:22,436 --> 00:29:26,376 Okay. So star and then an address tells me to do what? 678 00:29:27,296 --> 00:29:27,996 Go there. Right? 679 00:29:27,996 --> 00:29:30,446 That's all we said on Monday, when you put a star in front 680 00:29:30,446 --> 00:29:33,636 of a variable, if that variable is a pointer or in this case 681 00:29:33,636 --> 00:29:36,226 if you put a star in front of an arithmetic expression 682 00:29:36,226 --> 00:29:39,446 that itself is the result of doing math on a pointer, 683 00:29:39,646 --> 00:29:42,246 the star just means dereference this, go there. 684 00:29:42,466 --> 00:29:46,296 So that means go to this character and what gets printed? 685 00:29:46,636 --> 00:29:50,606 Perhaps needless to say, F. Now we iterate through the loop again. 686 00:29:50,606 --> 00:29:52,826 So I gets plus plus, so now I is 1. 687 00:29:53,036 --> 00:29:57,306 So S plus 1 is now 0x11, so what gets printed? 688 00:29:58,486 --> 00:29:59,896 Right, so now we do a little bit of math. 689 00:30:00,226 --> 00:30:01,846 So 0x11, go there. 690 00:30:01,846 --> 00:30:02,946 So O gets printed.
691 00:30:03,056 --> 00:30:05,666 Plus plus, 0x12, go there, print that, 692 00:30:05,666 --> 00:30:10,256 and then I is now going to equal N so the loop terminates. 693 00:30:10,836 --> 00:30:12,816 Right? So that's all that's going on here. 694 00:30:12,956 --> 00:30:15,736 This is what's generally known as pointer arithmetic. 695 00:30:16,216 --> 00:30:17,956 And as the name implies, it just has to do 696 00:30:17,956 --> 00:30:19,536 with doing arithmetic on pointers. 697 00:30:19,856 --> 00:30:23,676 But GCC or the compiler figures out whether you want 698 00:30:23,886 --> 00:30:27,536 to add 1 byte or 4 bytes or, as we'll see, 699 00:30:27,536 --> 00:30:28,866 it depends on the context. 700 00:30:29,056 --> 00:30:30,476 But for now it's pretty straightforward. 701 00:30:30,706 --> 00:30:34,726 Initially, why did I initialize a second variable here called N 702 00:30:34,726 --> 00:30:39,546 instead of just putting string length here, by doing this term, 703 00:30:39,936 --> 00:30:42,256 it's never going to work on this. 704 00:30:42,256 --> 00:30:45,726 So why did I do the approach I did and not just move it 705 00:30:45,726 --> 00:30:46,596 to the condition part? 706 00:30:48,966 --> 00:30:50,616 Right. Otherwise I'd be calculating the length 707 00:30:50,616 --> 00:30:51,996 of the string S every time. 708 00:30:51,996 --> 00:30:54,566 And odds are the length of the string is not going to change, 709 00:30:54,766 --> 00:30:57,086 if I'm not doing anything destructively to the string, 710 00:30:57,086 --> 00:30:58,096 I'm just letting it be.
711 00:30:58,296 --> 00:30:59,796 So the length is never going to change; 712 00:30:59,796 --> 00:31:03,156 putting it into the condition would actually be fairly stupid 713 00:31:03,156 --> 00:31:04,676 because then this loop is going 714 00:31:04,676 --> 00:31:08,446 to have an increased running time just because I'm not -- 715 00:31:08,446 --> 00:31:10,826 I'm foolishly checking the length 716 00:31:10,826 --> 00:31:12,406 of it again and again and again. 717 00:31:12,726 --> 00:31:14,126 Okay, finally, there's this. 718 00:31:14,126 --> 00:31:15,416 And this is new today. 719 00:31:16,386 --> 00:31:24,446 What does this probably mean, free S? Why is that necessary? 720 00:31:24,516 --> 00:31:25,756 So yeah, so there is a keyword [Inaudible] 721 00:31:25,816 --> 00:31:27,236 that will get you today. 722 00:31:27,416 --> 00:31:32,906 All this time, get string and get -- yeah, so all this time, 723 00:31:32,906 --> 00:31:35,996 get string is actually pretty poorly implemented. 724 00:31:35,996 --> 00:31:37,426 Sort of objectively speaking. 725 00:31:37,426 --> 00:31:38,756 So CS 50's library is all 726 00:31:38,756 --> 00:31:40,436 about making it easier to get user input. 727 00:31:40,706 --> 00:31:43,056 To do this, we need to allocate memory on demand, 728 00:31:43,206 --> 00:31:45,706 but we don't know how much memory you're going to need 729 00:31:45,816 --> 00:31:48,076 at first, because we don't know how many words 730 00:31:48,076 --> 00:31:50,146 or how many characters our user is going to type in. 731 00:31:50,416 --> 00:31:53,546 So we allocate memory, as we'll see today, dynamically. 732 00:31:53,546 --> 00:31:57,006 We allocate enough -- as much as we need to fit your string, 733 00:31:57,296 --> 00:31:59,536 unless we completely exhaust the RAM's capacity, 734 00:31:59,776 --> 00:32:01,436 and then we return you a pointer to it.
735 00:32:01,716 --> 00:32:05,426 But the problem in a language like C and C++ is 736 00:32:05,426 --> 00:32:08,656 that if we hand you memory and you never hand it back, 737 00:32:09,286 --> 00:32:11,756 we will assume you're continuing to use it. 738 00:32:11,756 --> 00:32:13,536 So fast forward to reality, 739 00:32:14,086 --> 00:32:16,636 if you've ever been using your computer, Mac, PC, whatever, 740 00:32:16,686 --> 00:32:19,916 for a long time, many hours or even many days, 741 00:32:19,916 --> 00:32:22,436 without even shutting it down, I mean, 742 00:32:22,436 --> 00:32:25,366 what you probably experience is the machine starts to slow 743 00:32:25,366 --> 00:32:28,156 down over time or you know, hitting alternative tab 744 00:32:28,156 --> 00:32:30,186 or trying to change Windows might start to feel 745 00:32:30,186 --> 00:32:31,896 like things are grinding to a halt. 746 00:32:32,196 --> 00:32:34,096 There could be any number of explanations 747 00:32:34,096 --> 00:32:35,566 for that problematic behavior. 748 00:32:35,756 --> 00:32:38,026 But one of them is that your computer is running 749 00:32:38,026 --> 00:32:38,836 out of memory. 750 00:32:39,276 --> 00:32:42,076 So not physically, but your computer is running this 751 00:32:42,076 --> 00:32:43,896 program, and that program, and this program, 752 00:32:43,896 --> 00:32:48,056 and humans are fallible, and probably wrote some buggy code 753 00:32:48,286 --> 00:32:50,436 that asked the operating system for memory, 754 00:32:50,736 --> 00:32:51,786 but never gave it back. 755 00:32:52,256 --> 00:32:53,426 In fact, in the worst case, 756 00:32:53,426 --> 00:32:55,006 you can imagine a simple application, 757 00:32:55,006 --> 00:32:57,706 like an instant messenger, every time you get an IM, 758 00:32:57,706 --> 00:33:00,466 whether you're using AOL or MSN or Gtalk or whatever, 759 00:33:00,676 --> 00:33:02,676 a string appears on your screen. 
760 00:33:02,936 --> 00:33:05,646 Well, that string has to be stored in memory somewhere. 761 00:33:05,886 --> 00:33:08,406 So you know, even if we just kind of think 762 00:33:08,486 --> 00:33:11,216 through intuitively how an instant messaging client works, 763 00:33:11,476 --> 00:33:14,296 odds are that memory is being allocated dynamically. 764 00:33:14,296 --> 00:33:17,336 Every time you get an IM, maybe, the OS is being asked, oh, 765 00:33:17,636 --> 00:33:18,936 someone just sent me L-O-L. 766 00:33:18,936 --> 00:33:21,666 I need another 4 bytes for this, or a longer sentence, 767 00:33:21,666 --> 00:33:23,996 I need an even bigger chunk of memory for this. 768 00:33:24,366 --> 00:33:27,206 But if that client never says to the operating system, oh, 769 00:33:27,206 --> 00:33:31,246 the user closed the window, here's all of that RAM back, 770 00:33:31,246 --> 00:33:33,886 AOL instant messenger or what not is just going to keep asking 771 00:33:33,886 --> 00:33:35,386 for more and more and more bytes. 772 00:33:35,386 --> 00:33:37,916 And then if you look at your activity monitor on a Mac 773 00:33:38,226 --> 00:33:40,726 or process manager on a PC, you might see 774 00:33:40,726 --> 00:33:43,416 that one stupid little program is using many, many, 775 00:33:43,416 --> 00:33:45,956 many megabytes, if not gigabytes of ram, 776 00:33:46,326 --> 00:33:48,126 because the programmer screwed up, 777 00:33:48,236 --> 00:33:50,096 and this is how easy it is to screw up. 778 00:33:50,096 --> 00:33:53,656 And in fact, any program you all have written thus far allocating 779 00:33:53,656 --> 00:33:57,026 memory has been buggy at least objectively speaking, 780 00:33:57,026 --> 00:33:58,506 because probably none 781 00:33:58,506 --> 00:34:01,106 of you have ever freed the memory you asked 782 00:34:01,376 --> 00:34:02,996 for by way of get string. 783 00:34:03,196 --> 00:34:04,336 But today that all ends. 
784 00:34:04,706 --> 00:34:08,486 So it turns out that get string uses a function called malloc, 785 00:34:08,626 --> 00:34:10,716 for memory allocation, and we'll see that today. 786 00:34:11,086 --> 00:34:12,806 Free is essentially the opposite. 787 00:34:13,456 --> 00:34:16,636 You hand to free a pointer that has been known, 788 00:34:16,826 --> 00:34:19,266 that you know points to a chunk of memory 789 00:34:19,526 --> 00:34:21,446 that has been allocated for you. 790 00:34:21,756 --> 00:34:24,166 So let me go ahead and open now a second variant of this, 791 00:34:24,166 --> 00:34:26,376 just to show something a little more sophisticated, 792 00:34:26,516 --> 00:34:29,576 and actually let me clean up one of the conditions 793 00:34:29,616 --> 00:34:33,646 for a moment just to show something slightly different. 794 00:34:33,646 --> 00:34:36,336 So this 5 here is kind of a magic number right now, 795 00:34:36,336 --> 00:34:38,256 but I just wanted to simplify the code for the sake 796 00:34:38,256 --> 00:34:40,146 of discussion, and actually I'm going to go ahead 797 00:34:40,146 --> 00:34:42,546 and delete these two lines of code just for discussion's sake. 798 00:34:42,836 --> 00:34:44,486 So three lines of interesting code now. 799 00:34:44,486 --> 00:34:47,946 The first allocates an array statically, as we'll say. 800 00:34:47,946 --> 00:34:50,426 It's static in the sense that I give the values 801 00:34:50,426 --> 00:34:52,906 in advance and I'm not letting the user provide them, 802 00:34:53,106 --> 00:34:53,666 for instance. 803 00:34:53,896 --> 00:34:55,796 This is again how you statically initialize 804 00:34:55,796 --> 00:34:56,246 an array. 805 00:34:56,246 --> 00:34:59,366 You can use curly braces like this, and just put your numbers 806 00:34:59,366 --> 00:35:02,346 or your strings or whatever inside separated by commas. 807 00:35:02,626 --> 00:35:04,126 But the loop is essentially the same.
808 00:35:04,126 --> 00:35:07,496 Here's a star, and here is some pointer arithmetic, 809 00:35:07,816 --> 00:35:09,426 but notice the difference here. 810 00:35:09,426 --> 00:35:10,616 And this is kind of neat. 811 00:35:10,836 --> 00:35:14,816 So this time the array is not of type char, it's not a string, 812 00:35:15,346 --> 00:35:17,096 but rather it's of type int. 813 00:35:17,766 --> 00:35:21,576 So an int, we said a moment ago, is 32 bits or 4 bytes, 814 00:35:22,256 --> 00:35:26,796 and yet when I iterate over this program's array, 815 00:35:27,286 --> 00:35:29,216 printing out each of its numbers, 816 00:35:29,466 --> 00:35:31,106 so actually let's do that sanity check. 817 00:35:31,106 --> 00:35:32,476 So make pointers 2. 818 00:35:32,476 --> 00:35:34,406 Let me run pointers 2. 819 00:35:34,526 --> 00:35:35,316 That's all it does. 820 00:35:35,316 --> 00:35:36,306 It prints 1 to 5. 821 00:35:36,486 --> 00:35:38,116 But notice how I'm going to do that. 822 00:35:38,116 --> 00:35:41,736 I'm iterating from 0 on up to 5. 823 00:35:42,076 --> 00:35:44,366 But each time I'm doing the exact same arithmetic. 824 00:35:44,756 --> 00:35:47,446 So I'm taking numbers plus 0. 825 00:35:47,806 --> 00:35:49,556 Numbers plus 1. 826 00:35:49,846 --> 00:35:51,586 Numbers plus 2. 827 00:35:51,966 --> 00:35:54,076 But that feels kind of buggy, right? 828 00:35:54,126 --> 00:35:57,916 If an int takes up 4 bytes of memory that looks kind 829 00:35:57,916 --> 00:35:59,896 of like this, and I print the first int, 830 00:35:59,996 --> 00:36:03,006 well that makes sense, the 32 bits representing the number, 831 00:36:03,316 --> 00:36:05,696 say the number 1, gets printed. 832 00:36:05,856 --> 00:36:08,786 And so the width of this thing now is 32 bits 833 00:36:09,386 --> 00:36:10,116 from left to right. 834 00:36:10,326 --> 00:36:13,136 So this here is my pointer, called numbers, 835 00:36:13,136 --> 00:36:15,306 and it's pointing to the start of this element.
836 00:36:15,546 --> 00:36:17,186 And I print out those 32 bits. 837 00:36:17,416 --> 00:36:20,416 But if I then add 1 to it, that sort of means 838 00:36:20,416 --> 00:36:23,286 that this arrow is not pointing there, but it's kind 839 00:36:23,286 --> 00:36:24,616 of pointing here, right? 840 00:36:24,776 --> 00:36:26,816 Because that would be 1, this would be byte 2 841 00:36:26,816 --> 00:36:29,656 and this would be byte 3, and then we'd have the next byte, 842 00:36:29,736 --> 00:36:31,426 starting at another 4. 843 00:36:31,426 --> 00:36:35,006 So is this buggy or not? 844 00:36:38,966 --> 00:36:41,186 Actually, this is kind of a leading question, 845 00:36:41,186 --> 00:36:42,396 because you wouldn't know the answer. 846 00:36:42,566 --> 00:36:45,006 So no, it's not buggy, because one of the features 847 00:36:45,006 --> 00:36:46,986 of this thing called pointer arithmetic, 848 00:36:46,986 --> 00:36:48,646 and this is just really to hammer this home 849 00:36:48,646 --> 00:36:50,816 so you don't yourself do the wrong math. 850 00:36:51,246 --> 00:36:53,826 GCC is smart enough to realize, oh, 851 00:36:53,826 --> 00:36:57,586 this pointer here is a pointer to an int I know 852 00:36:57,586 --> 00:37:01,106 from the way I was designed that an int is 32 bits or 4 bytes. 853 00:37:01,346 --> 00:37:04,166 Therefore, any time someone tries to perform arithmetic 854 00:37:04,166 --> 00:37:07,606 on me with plus 0, plus 1, I'm really going 855 00:37:07,606 --> 00:37:11,466 to do plus 1 times the size of me. 856 00:37:11,726 --> 00:37:14,196 Plus 2 times the size of me. 857 00:37:14,426 --> 00:37:19,176 So what this means is numbers plus I times 4 is really the 858 00:37:19,176 --> 00:37:20,506 mathematics that are going on. 859 00:37:20,716 --> 00:37:22,976 And that's what let's me go from left to right, 860 00:37:23,336 --> 00:37:26,346 across the array correctly. 861 00:37:26,346 --> 00:37:28,966 And I'll leave that updated version of the code there. 
862 00:37:29,336 --> 00:37:33,836 Okay, so any questions before we start peeling back the layers 863 00:37:33,836 --> 00:37:34,756 of the library here? 864 00:37:36,266 --> 00:37:40,456 No? Okay. So here is a use of the CS 50 library. 865 00:37:40,866 --> 00:37:42,456 It says printf, say some -- 866 00:37:42,456 --> 00:37:44,016 this is an example that uses the library -- 867 00:37:44,016 --> 00:37:47,416 printf, say something, char star S 1, get string. 868 00:37:47,646 --> 00:37:50,046 So really, starting today and starting with P set 4 869 00:37:50,046 --> 00:37:51,756 onward, no more string. 870 00:37:51,756 --> 00:37:52,686 It's char star. 871 00:37:52,686 --> 00:37:53,976 That particular training wheel will come off. 872 00:37:54,716 --> 00:37:57,766 allocate enough memory for copy. 873 00:37:57,766 --> 00:37:58,606 Okay, interesting. 874 00:37:58,886 --> 00:38:02,256 So the context here is that I wanted to write a little program 875 00:38:02,456 --> 00:38:05,236 that lets me copy one string to another 876 00:38:05,506 --> 00:38:08,766 and then actually demonstrate that the copy is correct. 877 00:38:08,996 --> 00:38:13,776 So this is excerpted from this code here, this is copy1.c. 878 00:38:13,776 --> 00:38:14,876 And it's not terribly long, 879 00:38:14,876 --> 00:38:16,396 but it uses the same building blocks. 880 00:38:16,396 --> 00:38:18,716 So up here I say, say something. 881 00:38:19,046 --> 00:38:19,366 All right? 882 00:38:19,366 --> 00:38:20,656 That's not that interesting. 883 00:38:20,926 --> 00:38:23,436 Here I say get string, and then I do a sanity check. 884 00:38:23,436 --> 00:38:26,266 Okay, so at this point, and as your printout suggests, 885 00:38:26,336 --> 00:38:29,096 I can comment those 4 lines of code with one comment like, 886 00:38:29,176 --> 00:38:30,786 get input from user or whatever. 887 00:38:30,816 --> 00:38:32,366 All right, so now this.
888 00:38:33,076 --> 00:38:36,226 This is a little worrisome here, what -- 889 00:38:36,286 --> 00:38:38,916 in English is this line of code doing? 890 00:38:39,766 --> 00:38:42,776 What is it copying? 891 00:38:42,776 --> 00:38:42,876 [ Inaudible audience comment ] 892 00:38:42,876 --> 00:38:44,746 >> So it's just copying the memory address 893 00:38:44,746 --> 00:38:47,476 and it's taking the value in S 1, which is an address, 894 00:38:47,476 --> 00:38:50,956 a pointer, and it's storing it in S 2, so at this point 895 00:38:51,106 --> 00:38:54,986 in the story, S 2 is a copy of S 1. 896 00:38:55,346 --> 00:38:58,026 But conceptually, S 2 is not a copy 897 00:38:58,026 --> 00:39:00,506 of the string pointed at by S 1. 898 00:39:00,506 --> 00:39:01,976 In fact, let's take a quick look. 899 00:39:01,976 --> 00:39:06,946 So I'm going to make copy 1, I'm going to then run copy 1, 900 00:39:07,156 --> 00:39:10,346 and I'm going to say something like hello there. 901 00:39:11,686 --> 00:39:14,196 Hmm, it didn't seem to capitalize the whole thing. 902 00:39:14,196 --> 00:39:16,676 Let's try this again with another word in all lower case. 903 00:39:17,006 --> 00:39:18,806 Foo. Lots of lower case letters. 904 00:39:19,496 --> 00:39:24,556 Okay, oh, okay, no, that's correct. 905 00:39:24,696 --> 00:39:25,466 I had to think about it. 906 00:39:25,996 --> 00:39:28,606 But what's wrong here, this is buggy. 907 00:39:30,326 --> 00:39:34,386 So the goal is to capitalize the original, or to make a -- 908 00:39:34,386 --> 00:39:36,026 actually I should tell you what the point is. 909 00:39:36,246 --> 00:39:39,216 The goal of this program is to take the original string, 910 00:39:39,416 --> 00:39:41,396 make a copy of it, and capitalize the copy. 911 00:39:41,596 --> 00:39:45,186 But clearly what's happened here is both have been capitalized. 
912 00:39:45,186 --> 00:39:46,636 So let's take a quick look at the code 913 00:39:46,636 --> 00:39:49,486 and then see what it actually takes to fix this, 914 00:39:49,706 --> 00:39:51,366 not for the sake of fixing it, but for the sake 915 00:39:51,366 --> 00:39:53,406 of understanding what's going on underneath the hood. 916 00:39:53,406 --> 00:39:56,526 So at this point in the story we have a string called foo 917 00:39:56,526 --> 00:39:58,876 or whatever, in memory, and I'll keep it short 918 00:39:58,976 --> 00:40:02,126 so that my handwriting doesn't completely fail us. 919 00:40:02,166 --> 00:40:05,046 So I have a string called F-O-O back slash 0, 920 00:40:05,046 --> 00:40:10,276 I have a pointer called S 1, and that is effectively pointing 921 00:40:10,276 --> 00:40:14,016 to this byte in memory the moment get string returns. 922 00:40:14,366 --> 00:40:16,416 Okay, it is not null, so I don't return yet. 923 00:40:16,606 --> 00:40:17,786 I proceed to the next line. 924 00:40:18,056 --> 00:40:22,176 This line of code here where it says S 2 similarly declares a 925 00:40:22,176 --> 00:40:26,236 pointer of size 32 bits, even though it's pointing to a char, 926 00:40:26,236 --> 00:40:27,586 the pointer is still 32 bits, 927 00:40:27,586 --> 00:40:30,436 so it's still the same size square, in reality -- 928 00:40:30,436 --> 00:40:32,086 in theory, not in reality. 929 00:40:32,146 --> 00:40:33,016 Per my handwriting here. 930 00:40:33,016 --> 00:40:33,986 So that's S 2. 931 00:40:34,266 --> 00:40:36,186 What is S 2 pointing at? 932 00:40:36,736 --> 00:40:38,646 So it's pointing at the same thing. 933 00:40:38,866 --> 00:40:41,156 So S 2 is a copy of S 1.
934 00:40:41,446 --> 00:40:43,536 But if we're trying to consider char stars 935 00:40:43,536 --> 00:40:46,176 to be conceptually bigger entities than just a number 936 00:40:46,176 --> 00:40:49,166 but an entire string, clearly, I've not copied the string, 937 00:40:49,166 --> 00:40:51,506 because the same 4 bytes are being used 938 00:40:51,506 --> 00:40:53,246 for F-O-O back slash 0. 939 00:40:53,246 --> 00:40:56,266 So now I claim I'm going to capitalize the copy here. 940 00:40:56,516 --> 00:40:57,796 So I do a little sanity check. 941 00:40:57,796 --> 00:41:01,366 If the length of S 2 is greater than 0, like let me make sure 942 00:41:01,646 --> 00:41:05,216 that I have room for this string, what do I do? 943 00:41:05,506 --> 00:41:10,386 I take the 0th location of S 2 and I change it to the result 944 00:41:10,386 --> 00:41:14,346 of calling toupper on the 0th character in S 2. 945 00:41:14,606 --> 00:41:15,996 And I took that function 946 00:41:16,056 --> 00:41:17,866 from the string library that's up here. 947 00:41:17,866 --> 00:41:19,996 So you can see it documented, you can check out the man page, 948 00:41:20,216 --> 00:41:22,546 it's just a little function that does capitalization for me, 949 00:41:22,546 --> 00:41:25,756 by remembering, oh, 65 on up is upper case, 950 00:41:25,756 --> 00:41:27,916 97 on up is lower case, that's the whole deal. 951 00:41:27,916 --> 00:41:31,486 All right, but then I claim here comes the original, S 1. 952 00:41:31,706 --> 00:41:34,176 Here comes the copy, but the problem was 953 00:41:34,176 --> 00:41:37,396 that both the original and the copy were the same. 954 00:41:37,396 --> 00:41:38,616 They were both capitalized. 955 00:41:38,676 --> 00:41:40,246 But pictorially, that should make sense.
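
The bug just walked through can be reproduced in a few lines. This is a sketch under assumptions, not the actual copy 1 source: the function names are invented for illustration, and the string is passed in as a writable buffer rather than read with get string. Note that in standard C, toupper is declared in ctype.h.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

// Capitalize the first character of s in place, if there is one.
// (Hypothetical helper, not part of the lecture code.)
void capitalize_first(char *s)
{
    assert(s != NULL);
    if (strlen(s) > 0)
    {
        s[0] = (char) toupper((unsigned char) s[0]);
    }
}

// Mimic the buggy approach: "copy" by assigning the pointer,
// then capitalize the supposed copy.
// Returns 1 if the original changed as well, 0 otherwise.
int original_changed_after_pointer_copy(char *s1)
{
    assert(s1 != NULL);
    char before = s1[0];
    char *s2 = s1;            // copies the address, not the characters
    capitalize_first(s2);     // writes through the same 4 bytes as s1
    return s1[0] != before;
}
```

Called on a buffer holding foo, the function returns 1 and the buffer now reads Foo, since S 1 and S 2 are two names for the same bytes.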
956 00:41:40,756 --> 00:41:43,906 So how do I fix this problem fundamentally if I want 957 00:41:43,906 --> 00:41:46,506 to maintain the original, and then make a copy, 958 00:41:46,506 --> 00:41:49,156 the latter of which only is capitalized? What needs 959 00:41:49,396 --> 00:41:54,626 to happen on a high level? 960 00:41:54,736 --> 00:41:56,206 Yeah, so we need new memory, right? 961 00:41:56,246 --> 00:41:59,606 So we need to take 4 bytes that have F-O-O back slash 0, 962 00:41:59,606 --> 00:42:02,426 we need another 4 bytes, and then we need to fill that array 963 00:42:02,696 --> 00:42:04,396 with that particular copy. 964 00:42:04,456 --> 00:42:06,006 So let's go ahead and open copy 2. 965 00:42:06,006 --> 00:42:07,516 You each have a printout of this as well. 966 00:42:07,886 --> 00:42:09,226 Let's scroll on down here. 967 00:42:09,346 --> 00:42:10,946 And now this is the new magic. 968 00:42:10,946 --> 00:42:13,456 And this is something that's going to become very useful 969 00:42:13,656 --> 00:42:16,446 because thus far, pretty much any program you all have 970 00:42:16,446 --> 00:42:18,256 written, if it takes any form 971 00:42:18,256 --> 00:42:20,476 of input the only way you've been able to get input 972 00:42:20,476 --> 00:42:22,526 from the user is by way of get string. 973 00:42:22,606 --> 00:42:25,556 But you'll see and you certainly want to write programs 974 00:42:25,586 --> 00:42:27,176 that take far more interesting input 975 00:42:27,176 --> 00:42:29,016 than just a string here or a string there. 976 00:42:29,016 --> 00:42:31,696 And you're not going to be able to use a get string function. 977 00:42:31,696 --> 00:42:34,266 You might want to get a new record in a database, 978 00:42:34,266 --> 00:42:37,676 you might want to read in an entire web page, maybe not in C, 979 00:42:37,676 --> 00:42:38,576 but in another language.
980 00:42:38,576 --> 00:42:41,226 You're going to need dynamism in which you can allocate 981 00:42:41,226 --> 00:42:43,346 as much memory as you want, but on demand. 982 00:42:43,346 --> 00:42:46,546 And the means by which you do this is this little guy here, 983 00:42:46,876 --> 00:42:47,496 malloc. 984 00:42:47,496 --> 00:42:49,106 So what am I doing in this version? 985 00:42:49,316 --> 00:42:50,926 The first few lines of code are identical. 986 00:42:51,176 --> 00:42:54,836 I declare a string of S 1, and I get a string from the user, 987 00:42:54,836 --> 00:42:57,976 so that picture again looks a little something like this, 988 00:42:58,136 --> 00:43:00,556 with a pointer to the first chunk of memory. 989 00:43:00,816 --> 00:43:01,886 Now though, I do this. 990 00:43:02,336 --> 00:43:05,496 I call a function called malloc for memory allocation, 991 00:43:05,496 --> 00:43:08,246 and that takes a single argument here. 992 00:43:08,566 --> 00:43:12,446 The number of bytes you want allocated for your use. 993 00:43:13,006 --> 00:43:14,536 So I had to do a little bit of math. 994 00:43:14,536 --> 00:43:16,636 I could have just hard coded 4 in here, 995 00:43:16,636 --> 00:43:18,786 but that's probably not the point of the exercise. 996 00:43:18,786 --> 00:43:20,186 So let me figure it out dynamically. 997 00:43:20,186 --> 00:43:21,336 How many bytes do I need? 998 00:43:21,646 --> 00:43:22,596 I need the length 999 00:43:22,596 --> 00:43:27,976 of the original string plus 1 times the size of the piece 1000 00:43:27,976 --> 00:43:29,996 of data that I want to store in that location. 1001 00:43:30,456 --> 00:43:33,046 So strlen of S 1 gives that answer, 1002 00:43:33,046 --> 00:43:35,176 what in this specific example from before? 1003 00:43:35,806 --> 00:43:41,666 So 3, so 3 plus 1 times sizeof char, and the sizeof char, 1004 00:43:41,666 --> 00:43:42,486 what does this return?
1005 00:43:44,106 --> 00:43:46,266 So you're right, this returns one byte here, 1006 00:43:46,796 --> 00:43:48,826 so not 8 bits, 1 byte. 1007 00:43:48,906 --> 00:43:49,926 sizeof returns bytes. 1008 00:43:50,196 --> 00:43:52,876 But 4 times 1, 4 bytes comes back. 1009 00:43:52,876 --> 00:43:55,516 And just a little sanity check, why this plus 1 here? 1010 00:43:56,946 --> 00:43:59,416 Because you need the zero character 1011 00:43:59,416 --> 00:44:00,376 for the end of the string. 1012 00:44:00,556 --> 00:44:02,566 Okay, so let's see what happens next. 1013 00:44:02,566 --> 00:44:04,936 Okay, so int N gets string length of S 1. 1014 00:44:05,176 --> 00:44:07,516 Okay, so this is kind of a boring for loop that iterates 1015 00:44:07,516 --> 00:44:11,126 from 0 to N. Oh, this is just using array notation 1016 00:44:11,356 --> 00:44:15,676 to make a copy of S 1's Ith character, 1017 00:44:15,676 --> 00:44:17,756 and put it in S 2's Ith character. 1018 00:44:17,906 --> 00:44:20,526 And then finally at the very end I need to make sure 1019 00:44:20,526 --> 00:44:22,986 to install a back slash 0 here, 1020 00:44:23,126 --> 00:44:26,136 those following along quite carefully might wonder, is this 1021 00:44:26,136 --> 00:44:27,666 last line really necessary? 1022 00:44:28,276 --> 00:44:31,936 If I really wanted to be nit-picky, I could delete this, 1023 00:44:31,936 --> 00:44:33,056 but what would I need to change 1024 00:44:33,056 --> 00:44:35,926 to make sure I have a back slash 0 at the end 1025 00:44:35,926 --> 00:44:37,426 of the copied string, S 2? 1026 00:44:37,426 --> 00:44:38,796 [ Inaudible audience comment ] 1027 00:44:38,796 --> 00:44:40,026 >> Good. So I need to -- 1028 00:44:40,126 --> 00:44:42,396 you know, I don't need to hard code in a back slash 0. 1029 00:44:42,396 --> 00:44:44,196 Let me just steal the one that's already 1030 00:44:44,196 --> 00:44:45,976 in S 1 and make a copy. 1031 00:44:46,006 --> 00:44:46,706 Which way is better?
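
The allocate-then-copy steps can be sketched as a single function. This is an illustration under assumptions, not the copy 2 source itself: the name copy_string is made up, and the NULL check on malloc is added here as a precaution. It uses the hard-coded terminator variant; the alternative is to run the loop up to and including N so that the terminator already in S 1 is copied along.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Return a newly allocated copy of s1, or NULL if allocation fails.
// The caller is responsible for calling free on the result.
// (Hypothetical helper, not part of the lecture code.)
char *copy_string(const char *s1)
{
    assert(s1 != NULL);

    // length of the original, plus 1 for the terminating '\0'
    size_t n = strlen(s1);
    char *s2 = malloc((n + 1) * sizeof(char));
    if (s2 == NULL)
    {
        return NULL;
    }

    // copy the n visible characters one at a time
    for (size_t i = 0; i < n; i++)
    {
        s2[i] = s1[i];
    }

    // terminate the copy explicitly
    s2[n] = '\0';
    return s2;
}
```

With a copy made this way, changing the first character of the copy leaves the original untouched, because the two strings now live in separate chunks of memory.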
1032 00:44:46,706 --> 00:44:48,276 Eh, it's not really clear. 1033 00:44:48,276 --> 00:44:52,806 The only reason that I might want to do this for sure is just 1034 00:44:52,806 --> 00:44:56,216 to ward off the possibility that S 1 is somehow broken 1035 00:44:56,216 --> 00:44:59,176 or corrupted, at least this way I know S 2 is going 1036 00:44:59,176 --> 00:45:00,586 to stop at one point. 1037 00:45:00,586 --> 00:45:02,206 But for the most part, these two are equivalent. 1038 00:45:02,406 --> 00:45:03,246 And now what do I do? 1039 00:45:03,246 --> 00:45:04,336 Capitalizing copy. 1040 00:45:04,616 --> 00:45:06,636 So why am I executing this line here? 1041 00:45:07,286 --> 00:45:10,506 Why am I checking the string length of S 2 in this context 1042 00:45:10,506 --> 00:45:14,306 at the bottom of the program? 1043 00:45:14,396 --> 00:45:17,566 Like, what is bad about not checking the string length 1044 00:45:17,566 --> 00:45:19,376 of S 2? 1045 00:45:20,686 --> 00:45:21,976 What might I do blindly? 1046 00:45:22,516 --> 00:45:27,126 [ Inaudible audience comment ] 1047 00:45:27,626 --> 00:45:30,176 >> So I might try changing a null character 1048 00:45:30,176 --> 00:45:32,106 to an upper case character, or really, 1049 00:45:32,106 --> 00:45:35,526 if this string S 2 has zero length, I'm saying go 1050 00:45:35,526 --> 00:45:37,036 to the first byte in S 2. 1051 00:45:37,166 --> 00:45:38,846 If there's nothing there, you all -- 1052 00:45:38,916 --> 00:45:40,776 some of you have made this mistake, I mean, 1053 00:45:40,856 --> 00:45:42,316 maybe we can do a little confessional, 1054 00:45:42,316 --> 00:45:43,936 even with problems in theory. 1055 00:45:43,936 --> 00:45:44,836 How many of you are willing 1056 00:45:44,836 --> 00:45:48,156 to admit you created a core dump in your directory? 1057 00:45:49,046 --> 00:45:50,616 Okay, so now those of you who weren't willing 1058 00:45:50,616 --> 00:45:52,476 to put your hands up now, now your hands can go up, right?
1059 00:45:52,536 --> 00:45:53,366 So you're in good company. 1060 00:45:53,706 --> 00:45:54,606 So it's a lot of you. 1061 00:45:54,606 --> 00:45:56,546 And odds are those whose hands didn't go up, 1062 00:45:56,806 --> 00:45:57,946 I think it's more likely 1063 00:45:57,946 --> 00:45:59,356 that you haven't started the P set yet, 1064 00:45:59,356 --> 00:46:02,446 or that you're just not fessing up, 1065 00:46:02,806 --> 00:46:04,056 since it happens to the best of us. 1066 00:46:04,056 --> 00:46:05,026 And let's see -- let's see 1067 00:46:05,026 --> 00:46:06,816 if I can't induce this behavior myself. 1068 00:46:07,076 --> 00:46:08,796 So this is my original program. 1069 00:46:09,186 --> 00:46:13,496 So what do I want to do to maybe mess with my own program here? 1070 00:46:13,726 --> 00:46:15,966 Well, let me do something like this. 1071 00:46:16,516 --> 00:46:22,246 Let me go ahead and -- let's say -- say not up to N, 1072 00:46:22,246 --> 00:46:24,606 but you know, I kind of screw up, and I do something -- oops. 1073 00:46:25,176 --> 00:46:28,506 So instead of making a copy from 0 up to N, 1074 00:46:28,506 --> 00:46:32,366 let me be really obnoxious and try copying 100 bytes only a few 1075 00:46:32,366 --> 00:46:34,836 of which -- few of which probably belong to me. 1076 00:46:34,836 --> 00:46:40,046 So I'm going to make copy 2, I'm going to run copy 2, 1077 00:46:40,046 --> 00:46:43,686 I'm going to type in F-O-O, which has a string length of 3. 1078 00:46:43,686 --> 00:46:45,106 Okay, that seemed to work. 1079 00:46:45,106 --> 00:46:47,286 So realize, too, these problems you're running 1080 00:46:47,286 --> 00:46:50,356 into with memory are not always easily chased down. 1081 00:46:50,356 --> 00:46:53,406 And this is again why GDB can be so powerful, because sometimes, 1082 00:46:53,646 --> 00:46:55,966 because of the way memory is managed in the computer, 1083 00:46:56,296 --> 00:46:59,166 sometimes you can touch memory that doesn't belong to you.
1084 00:46:59,676 --> 00:47:01,016 But nothing bad happens. 1085 00:47:01,066 --> 00:47:02,276 But sometimes it does. 1086 00:47:02,276 --> 00:47:04,436 So with memory, you can actually get the sort 1087 00:47:04,596 --> 00:47:06,066 of non-deterministic bugs, 1088 00:47:06,396 --> 00:47:08,576 because things happen differently sometimes 1089 00:47:08,576 --> 00:47:11,696 when you run a program if user input is influencing 1090 00:47:11,696 --> 00:47:12,456 the behavior. 1091 00:47:12,666 --> 00:47:13,566 So let me try this again. 1092 00:47:13,566 --> 00:47:17,126 Let me recompile, this time using 1,000 for copy 2. 1093 00:47:17,366 --> 00:47:20,456 So what I'm doing now is I'm blindly copying 1,000 bytes 1094 00:47:20,456 --> 00:47:23,826 from S 1 to S 2, even though I did not allocate 1095 00:47:23,826 --> 00:47:25,356 that many bytes for the copy. 1096 00:47:25,896 --> 00:47:26,956 Whoa, okay. 1097 00:47:27,466 --> 00:47:28,986 So bad stuff just happened. 1098 00:47:29,176 --> 00:47:30,736 Right? So that's the take away. 1099 00:47:30,956 --> 00:47:33,706 By doing L S, there's my core file. 1100 00:47:33,956 --> 00:47:38,286 If I do an L S dash L, what you'll see is the long listing. 1101 00:47:38,376 --> 00:47:41,426 So this is a lot of dates and times, but if I look 1102 00:47:41,476 --> 00:47:43,456 for core here, notice 1103 00:47:43,456 --> 00:47:51,966 that apparently this dump's size is 442,368 bytes of memory, 1104 00:47:52,156 --> 00:47:53,256 and if you really want to be more 1105 00:47:53,256 --> 00:47:56,506 like a human you can do L S dash L H for human readable. 1106 00:47:56,846 --> 00:48:00,236 So my core dump is apparently 432 kilobytes, 1107 00:48:00,516 --> 00:48:02,646 which is apparently roughly how much memory 1108 00:48:02,856 --> 00:48:05,446 that program was using at the moment I screwed up. 1109 00:48:05,446 --> 00:48:08,956 So the computer dumped the contents of that memory 1110 00:48:09,236 --> 00:48:11,396 to the local hard drive.
1111 00:48:11,396 --> 00:48:13,426 So let me actually do something with this core dump, 1112 00:48:13,426 --> 00:48:15,366 another trick that we can introduce 1113 00:48:15,366 --> 00:48:16,626 with GDB today is this. 1114 00:48:16,776 --> 00:48:20,546 I can run GDB on copy 2 and then I can go ahead and type run, 1115 00:48:21,306 --> 00:48:23,816 and then I can type foo, and now I get this mess. 1116 00:48:24,286 --> 00:48:26,416 But you can also do sort of some forensics with GDB. 1117 00:48:26,416 --> 00:48:28,466 If you already screwed up 1118 00:48:28,466 --> 00:48:31,066 and you therefore have a core dump file in your directory, 1119 00:48:31,376 --> 00:48:33,406 you can tell GDB to use that. 1120 00:48:33,466 --> 00:48:36,116 You say GDB, name of the program, and then name 1121 00:48:36,116 --> 00:48:39,296 of the core dump, which is typically core, hit enter, 1122 00:48:39,606 --> 00:48:44,516 and what you'll see now is -- let's see if it's actually -- 1123 00:48:44,666 --> 00:48:46,896 hmm, it's not terribly useful here. 1124 00:48:47,066 --> 00:48:48,826 But it did tell me that this program, 1125 00:48:48,826 --> 00:48:51,016 when the core file was generated, 1126 00:48:51,226 --> 00:48:53,226 aborted with signal 6. 1127 00:48:53,226 --> 00:48:56,836 And actually, you know, I would also e-mail help at CS 50.net 1128 00:48:56,836 --> 00:48:57,546 if I got this error. 1129 00:48:57,546 --> 00:48:59,056 Because this is not actually that helpful. 1130 00:48:59,056 --> 00:49:00,266 So what's actually going on? 1131 00:49:00,266 --> 00:49:02,436 Well, let's take a quick look by running the program. 1132 00:49:02,436 --> 00:49:08,116 GDB of copy 2, enter, let me go ahead and break in main, 1133 00:49:08,166 --> 00:49:11,686 let me go ahead and run, and now it's going to say something. 1134 00:49:11,936 --> 00:49:13,396 Now I'm being prompted to say something, 1135 00:49:13,396 --> 00:49:14,606 I'm going to type foo.
1136 00:49:14,636 --> 00:49:16,316 Now it's doing a little sanity check. 1137 00:49:16,316 --> 00:49:18,156 Now it's allocating the additional memory, 1138 00:49:18,156 --> 00:49:19,356 and let me do something here first. 1139 00:49:19,556 --> 00:49:22,566 I'm going to print S 2, and look, there's just garbage there. 1140 00:49:22,746 --> 00:49:26,096 If I then type next, let me now print S 2. 1141 00:49:26,656 --> 00:49:28,646 And now there's different garbage. 1142 00:49:28,866 --> 00:49:32,766 Why is there different garbage all of a sudden? 1143 00:49:33,806 --> 00:49:36,226 What's that? 1144 00:49:36,436 --> 00:49:38,326 Well, so I haven't put anything in S 1 yet. 1145 00:49:38,326 --> 00:49:40,346 So S 2 originally by default had 1146 00:49:40,346 --> 00:49:41,496 a garbage value here. 1147 00:49:42,266 --> 00:49:44,516 So now it seems to have some new garbage value, 1148 00:49:44,706 --> 00:49:45,896 but that's actually to be expected. 1149 00:49:45,896 --> 00:49:48,006 Because what does malloc return, quite simply? 1150 00:49:48,596 --> 00:49:51,506 It returns the address of what? 1151 00:49:52,416 --> 00:49:54,816 Just the address of a chunk of memory. 1152 00:49:55,066 --> 00:49:56,546 However many bytes you asked for. 1153 00:49:56,546 --> 00:49:59,356 But it makes no claims as to what's actually at that address, 1154 00:49:59,356 --> 00:50:01,226 and so we have different garbage. 1155 00:50:01,226 --> 00:50:02,656 Because we've been handed a different address. 1156 00:50:02,656 --> 00:50:04,796 All right, let's get to the point where I actually screw up. 1157 00:50:04,796 --> 00:50:06,826 I check the string length, if I print N, 1158 00:50:07,116 --> 00:50:09,726 N is indeed 3, now here's my loop. 1159 00:50:09,796 --> 00:50:13,306 So you know what, let me go ahead and type next here, okay, 1160 00:50:13,306 --> 00:50:14,166 that seems to be fine. 1161 00:50:14,166 --> 00:50:16,856 Next here, let me do a quick sanity check and print S 2.
1162 00:50:17,016 --> 00:50:20,196 And really, I'm just futzing around, oh, but some progress. 1163 00:50:20,196 --> 00:50:22,836 Now I at least have an F at the start of S 2. 1164 00:50:22,836 --> 00:50:26,346 Let me go ahead and type next again, next again, 1165 00:50:26,346 --> 00:50:27,906 now let me print S 2 again. 1166 00:50:28,296 --> 00:50:31,666 Okay, F-O-O -- oh, that's an accented O, 1167 00:50:31,666 --> 00:50:34,606 so it looks coincidentally like garbage and not an accented O. 1168 00:50:34,606 --> 00:50:35,786 So let's do one more time. 1169 00:50:35,786 --> 00:50:38,466 Next, next, now let me print S 2. 1170 00:50:38,466 --> 00:50:40,156 Good. There's still some garbage. 1171 00:50:40,156 --> 00:50:42,096 So let me do one more iteration here, 1172 00:50:42,096 --> 00:50:45,826 and now print S 2 -- ah ha. 1173 00:50:45,826 --> 00:50:48,156 Why does the string all of a sudden look perfect? 1174 00:50:48,526 --> 00:50:50,996 Because there is a null zero. 1175 00:50:50,996 --> 00:50:53,496 And one of the things GDB does for me is realizes oh, 1176 00:50:53,496 --> 00:50:56,336 if I see character, character, character, back slash 0, 1177 00:50:56,336 --> 00:51:00,096 let me just show the user the thing before the back slash 0. 1178 00:51:00,276 --> 00:51:02,056 Now if I go ahead and type continue, 1179 00:51:02,166 --> 00:51:05,006 I'll see that something bad happens here, there's mention 1180 00:51:05,006 --> 00:51:07,946 of heap over here, which is an interesting keyword, 1181 00:51:07,946 --> 00:51:08,806 let me go over here. 1182 00:51:09,026 --> 00:51:10,306 Oh, so this is interesting. 1183 00:51:10,306 --> 00:51:14,386 glibc detected, invalid pointer when I called free, 1184 00:51:14,386 --> 00:51:16,226 some really bad stuff seems to have happened, 1185 00:51:16,556 --> 00:51:19,646 and apparently the result of messing with my own memory.
1186 00:51:19,646 --> 00:51:24,886 So why is this then a good thing? Well, we now have the ability 1187 00:51:25,206 --> 00:51:28,436 to do what we want with memory 1188 00:51:28,436 --> 00:51:29,856 and pretty much anywhere we want, 1189 00:51:29,916 --> 00:51:31,306 albeit with this down side. 1190 00:51:31,306 --> 00:51:34,726 So the quick teaser here, at the bottom of this picture, finally, 1191 00:51:34,726 --> 00:51:36,776 we're now putting back the top of this picture. 1192 00:51:36,776 --> 00:51:39,086 At the bottom of this picture is the so-called stack. 1193 00:51:39,086 --> 00:51:43,666 Quick review, what goes on the stack, what kind of stuff? 1194 00:51:43,876 --> 00:51:46,816 So functions' frames, which contain the local variables 1195 00:51:46,816 --> 00:51:49,556 and also functions' parameters. 1196 00:51:49,556 --> 00:51:52,276 And when we played around with recursion a couple lectures ago, 1197 00:51:52,506 --> 00:51:53,946 and I just kind of foolishly 1198 00:51:53,946 --> 00:51:57,536 or naively implemented a recursive program, 1199 00:51:57,536 --> 00:51:58,946 a function that called itself, 1200 00:51:58,946 --> 00:52:01,246 and I just let it call itself thousands 1201 00:52:01,246 --> 00:52:02,516 of times, millions of times. 1202 00:52:02,786 --> 00:52:05,646 It eventually seg faulted, which again hints at a memory error, 1203 00:52:05,876 --> 00:52:06,976 because what happened to the stack? 1204 00:52:07,496 --> 00:52:09,196 Well, it kept putting frame after frame 1205 00:52:09,286 --> 00:52:11,026 after frame on the stack.
1206 00:52:11,276 --> 00:52:14,506 But clearly your memory is, in reality, limited, finite, 1207 00:52:14,676 --> 00:52:16,286 and in fact most operating systems 1208 00:52:16,286 --> 00:52:19,686 as we've said somewhat arbitrarily say you cannot use 1209 00:52:19,686 --> 00:52:22,436 more than 2 gigabytes per any given program, 1210 00:52:22,586 --> 00:52:24,716 and that generally has to do with the size of the int 1211 00:52:24,716 --> 00:52:25,766 or whatnot that they're using. 1212 00:52:26,076 --> 00:52:29,006 So in that recursive program, bad things eventually happened 1213 00:52:29,206 --> 00:52:31,586 when my stack overran this thing called the heap. 1214 00:52:31,846 --> 00:52:33,166 Well, what is this heap? 1215 00:52:33,496 --> 00:52:35,886 The heap pictorially is generally just drawn at the top, 1216 00:52:35,886 --> 00:52:38,216 and this just means that the heap is 1217 00:52:38,216 --> 00:52:41,386 where dynamically allocated memory is taken from. 1218 00:52:41,656 --> 00:52:43,526 Any time you call malloc henceforth, 1219 00:52:43,756 --> 00:52:46,196 it comes from that portion of memory called the heap, 1220 00:52:46,446 --> 00:52:48,366 and as the arrow suggests, the more 1221 00:52:48,366 --> 00:52:51,376 and more memory you allocate with malloc, the closer 1222 00:52:51,376 --> 00:52:53,036 and closer and closer you get to the stack. 1223 00:52:53,546 --> 00:52:56,916 So you get sort of these competing beasts 1224 00:52:56,916 --> 00:52:59,466 where memory gets allocated by you via malloc. 1225 00:52:59,606 --> 00:53:00,846 But if you start to push the limit 1226 00:53:00,846 --> 00:53:03,856 with malloc you too might end up crashing your program. 1227 00:53:04,176 --> 00:53:05,746 Now what about those things on top? 1228 00:53:05,896 --> 00:53:08,656 Well, there's initialized data and uninitialized data.
1229 00:53:08,866 --> 00:53:11,856 Well, those of you who tackled Problem Set 3 already, 1230 00:53:11,856 --> 00:53:14,516 a little bit, know that there's at least two global variables. 1231 00:53:14,716 --> 00:53:16,846 What are those? 1232 00:53:17,046 --> 00:53:21,006 So board and D for dimension, and in fact, if I go ahead 1233 00:53:21,006 --> 00:53:24,826 and open this so I'm going into CS 50, pub source, P set, 1234 00:53:24,966 --> 00:53:29,446 P set 3, 15, 15.c. If you've not seen this yet, not to worry, 1235 00:53:29,536 --> 00:53:33,126 but think -- thank you. 1236 00:53:34,956 --> 00:53:37,206 The laughter helps, yeah, embarrassing. 1237 00:53:37,546 --> 00:53:39,016 Good lesson. 1238 00:53:39,356 --> 00:53:42,206 So this is P set 3's 15.c file. 1239 00:53:42,486 --> 00:53:44,296 So this is one of the framework files 1240 00:53:44,296 --> 00:53:45,476 that we give you for this Problem Set. 1241 00:53:45,656 --> 00:53:47,246 And there's two global variables. 1242 00:53:47,246 --> 00:53:50,026 One is called board, and it's a two dimensional array. 1243 00:53:50,026 --> 00:53:53,536 So as we said a while ago, you can have a two dimensional array 1244 00:53:53,536 --> 00:53:56,606 and you just declare it by specifying the width 1245 00:53:56,606 --> 00:53:58,646 or the height, or the height and the width. 1246 00:53:58,646 --> 00:54:00,346 Doesn't matter which way you view your world, 1247 00:54:00,346 --> 00:54:02,146 so long as you do it consistently throughout 1248 00:54:02,146 --> 00:54:02,696 your program.
1249 00:54:02,956 --> 00:54:05,806 And actually, just this morning we were corresponding with one 1250 00:54:05,806 --> 00:54:07,916 of your classmates who very recently, not to poke fun 1251 00:54:07,916 --> 00:54:11,316 at all, had some weird bug where every time he ran one 1252 00:54:11,316 --> 00:54:15,236 of his functions, one or more of the elements of his board, 1253 00:54:15,486 --> 00:54:19,696 one of the numbers from 1 to 15 was just changing inexplicably. 1254 00:54:19,696 --> 00:54:21,776 It was becoming negative 1 or something weird. 1255 00:54:22,016 --> 00:54:24,726 And so what the problem actually ended up being was he had 1256 00:54:24,726 --> 00:54:27,536 in some line of code of his program, hope no one tries 1257 00:54:27,536 --> 00:54:28,406 to copy the distribution code, 1258 00:54:28,406 --> 00:54:30,226 because we're now writing on top of it here, 1259 00:54:30,446 --> 00:54:31,366 he had a line like this. 1260 00:54:31,406 --> 00:54:36,566 For int I gets 0, I is less than dim Max, I++, 1261 00:54:36,566 --> 00:54:39,016 and then he probably had a nested loop or something 1262 00:54:39,016 --> 00:54:42,696 like this, I'm hypothesizing, partly, J plus plus, 1263 00:54:42,696 --> 00:54:45,266 but there is a silly mistake like this, 1264 00:54:45,266 --> 00:54:48,366 and what was happening was when he was allocate -- 1265 00:54:48,366 --> 00:54:53,486 let's see, B-O-A-R-D, J, gets let's say -- 1266 00:54:54,716 --> 00:54:58,376 let's say I, just arbitrarily. 1267 00:54:58,666 --> 00:54:59,956 So there's a bug in this loop. 1268 00:55:00,086 --> 00:55:02,556 What is the bug that I introduced intentionally? 1269 00:55:03,166 --> 00:55:09,116 So I'm iterating not from, say, I is 0 to less than dim Max, 1270 00:55:09,826 --> 00:55:12,646 but rather less than or equal to. 1271 00:55:12,646 --> 00:55:14,536 So I'm actually going too big here.
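
The difference between the two loop bounds can be made concrete without actually writing past the end of an array. This is a hedged sketch: the value 9 for DIM_MAX and all of the function names are assumptions made for illustration, not taken from the distribution code.

```c
#include <assert.h>

#define DIM_MAX 9   // assumed value, for illustration only

// Whether index i is a valid position in an array of DIM_MAX elements.
int in_bounds(int i)
{
    return i >= 0 && i < DIM_MAX;
}

// Number of indices visited with the correct bound, i < DIM_MAX.
int visits_exclusive(void)
{
    int count = 0;
    for (int i = 0; i < DIM_MAX; i++)
    {
        count++;
    }
    return count;
}

// Number of indices visited with the buggy bound, i <= DIM_MAX.
// Nothing is written here; the loop only counts.
int visits_inclusive(void)
{
    int count = 0;
    for (int i = 0; i <= DIM_MAX; i++)
    {
        count++;
    }
    return count;
}

// Fill a DIM_MAX-element row, checking each index before writing.
void fill_row(int row[DIM_MAX])
{
    for (int j = 0; j < DIM_MAX; j++)   // "<", not "<="
    {
        assert(in_bounds(j));
        row[j] = j;
    }
}
```

The buggy bound visits exactly one index more than the array has, and that one extra index, DIM_MAX itself, is the write that lands on whatever happens to sit next to the array in memory.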
1272 00:55:14,756 --> 00:55:17,526 But the interesting thing was every time he did this, oh, 1273 00:55:17,526 --> 00:55:20,966 that's what it was, his variable D that was changing. 1274 00:55:21,416 --> 00:55:23,356 So D was somehow changing, 1275 00:55:23,356 --> 00:55:26,156 even though he was never touching the variable D. And 1276 00:55:26,156 --> 00:55:30,016 yet if you look back now at the proximity of this area 1277 00:55:30,016 --> 00:55:33,316 of memory called initialized data and uninitialized data, 1278 00:55:33,666 --> 00:55:36,386 those global variables end up, up there. 1279 00:55:36,386 --> 00:55:39,446 And if you give them a default value they're put in the slice 1280 00:55:39,446 --> 00:55:41,056 of memory called initialized data. 1281 00:55:41,306 --> 00:55:43,406 If you declare as we've done two global variables 1282 00:55:43,406 --> 00:55:45,246 without giving them default values they end 1283 00:55:45,246 --> 00:55:46,696 up in uninitialized data. 1284 00:55:46,976 --> 00:55:49,856 But because I had written these things back-to-back in the file, 1285 00:55:50,076 --> 00:55:52,716 GCC had laid them out in memory back-to-back, 1286 00:55:52,976 --> 00:55:55,546 which means if with your loop you iterate ever 1287 00:55:55,546 --> 00:55:57,906 so slightly too far, guess whose value does 1288 00:55:57,906 --> 00:55:58,946 in fact get clobbered. 1289 00:56:00,386 --> 00:56:03,306 D, because it was literally right next to board in memory. 1290 00:56:03,546 --> 00:56:05,446 And this is the simple explanation. 1291 00:56:05,676 --> 00:56:07,406 And then finally, the text segment up there, 1292 00:56:07,696 --> 00:56:09,936 it's a weird name but it's historically accurate, 1293 00:56:09,936 --> 00:56:11,566 what is the text segment of your program?
1294 00:56:12,736 --> 00:56:16,786 So those are actually the zeros and ones 1295 00:56:16,786 --> 00:56:20,236 that compose your program on disk, when you compile a program 1296 00:56:20,236 --> 00:56:23,726 into a.out or any executable and then run it at the prompt, 1297 00:56:23,966 --> 00:56:26,666 well, the computer has to be able to read those bits back 1298 00:56:26,726 --> 00:56:29,316 through its CPU and it puts them at the very top 1299 00:56:29,646 --> 00:56:31,026 of this chunk of memory. 1300 00:56:31,286 --> 00:56:35,816 So there's an interesting danger now that arises with the fact 1301 00:56:35,816 --> 00:56:38,946 that most stuff in memory is allocated one after the other. 1302 00:56:38,946 --> 00:56:40,476 The stack, the fact that it does this, 1303 00:56:40,476 --> 00:56:43,736 is very nice, conceptually neat, but it's very dangerous 1304 00:56:43,736 --> 00:56:46,616 because adversaries or malicious coders can actually 1305 00:56:46,616 --> 00:56:47,376 exploit this. 1306 00:56:47,376 --> 00:56:50,536 And so one of the most common ways of hacking into a system, 1307 00:56:50,536 --> 00:56:53,196 even to this day, one of the most common ways of breaking 1308 00:56:53,196 --> 00:56:56,336 into a web site or some other piece of software is 1309 00:56:56,336 --> 00:57:00,526 to feed a program more bytes than it's expecting. 1310 00:57:00,526 --> 00:57:03,696 So as this student did accidentally by stepping 1311 00:57:03,766 --> 00:57:06,866 over the bounds of his array into a variable D, 1312 00:57:07,036 --> 00:57:09,806 what people generally do, and sometimes it's by trial 1313 00:57:09,806 --> 00:57:14,006 and error, is they try to input not just some normal text like 1314 00:57:14,006 --> 00:57:16,056 foo, but they try to paste 1315 00:57:16,056 --> 00:57:18,356 into a program essentially executable code.
1316 00:57:18,486 --> 00:57:19,896 Zeros and ones that they wrote, 1317 00:57:20,136 --> 00:57:23,396 that if you can somehow slip them into a program's memory 1318 00:57:23,396 --> 00:57:25,086 in the right place, those zeros 1319 00:57:25,086 --> 00:57:27,436 and ones might actually get executed. 1320 00:57:27,436 --> 00:57:30,826 So if you've ever had a friend who downloads a crack 1321 00:57:30,966 --> 00:57:33,526 for some piece of software to circumvent the serial number 1322 00:57:33,526 --> 00:57:36,186 or whatever, that's often the result of someone figuring 1323 00:57:36,186 --> 00:57:39,416 out exactly where in code, now everyone's attentive, 1324 00:57:39,416 --> 00:57:44,276 where in code that if condition is or where that variable is, 1325 00:57:44,276 --> 00:57:48,236 and somehow trying to clobber its value or tell the computer, 1326 00:57:48,466 --> 00:57:49,516 you know, unbeknownst to it, 1327 00:57:49,716 --> 00:57:51,816 to move to a different function altogether 1328 00:57:51,816 --> 00:57:53,086 and not that particular one. 1329 00:57:53,116 --> 00:57:54,826 Even to this day, there's so much software 1330 00:57:54,826 --> 00:57:58,416 out there that's still written in C++, and similarly dangerous, 1331 00:57:58,556 --> 00:58:01,666 but powerful languages that maybe 50% 1332 00:58:01,666 --> 00:58:03,966 of the time exploits are still 1333 00:58:03,966 --> 00:58:06,116 the result of buffer overflow attacks. 1334 00:58:06,226 --> 00:58:08,266 And in fact, when the iPhone had first come out a couple 1335 00:58:08,266 --> 00:58:10,466 of years ago, those of you -- some of you might be familiar 1336 00:58:10,466 --> 00:58:12,696 with this idea of jail breaking it, and being able 1337 00:58:12,696 --> 00:58:16,106 to put your own software on it, pull it off of AT&T's network 1338 00:58:16,106 --> 00:58:17,296 and put it on to T-Mobile.
1339 00:58:17,406 --> 00:58:20,046 And two years ago in 2007 we actually distributed 1340 00:58:20,096 --> 00:58:22,686 in CS 50 the crack code for the iPhone 1341 00:58:22,686 --> 00:58:25,536 because someone had written it in C, posted it on the Internet, 1342 00:58:25,536 --> 00:58:27,866 and by running this C code on your iPhone, 1343 00:58:28,126 --> 00:58:29,116 you could take advantage 1344 00:58:29,116 --> 00:58:31,926 of a stupid mistake an Apple developer accidentally made -- 1345 00:58:32,106 --> 00:58:34,556 that did not check the bounds of an array. 1346 00:58:34,556 --> 00:58:37,746 And so this very clever or malicious person 1347 00:58:37,746 --> 00:58:40,126 who wrote this code was able to take advantage of that, 1348 00:58:40,126 --> 00:58:43,506 and to insert into memory data that was not expected. 1349 00:58:43,576 --> 00:58:44,816 Let's take a five minute break. 1350 00:58:48,936 --> 00:58:50,916 >> All right, welcome back. 1351 00:58:50,996 --> 00:58:53,396 So this was the article, actually, two years ago, 1352 00:58:53,436 --> 00:58:55,616 that I think I read, like the morning of lecture, 1353 00:58:55,796 --> 00:58:57,756 and it was -- this was posted 1354 00:58:57,756 --> 00:58:59,896 on a popular web site called [Inaudible] gadget, iPhone, 1355 00:58:59,936 --> 00:59:01,936 iPod Touch jail break code posted, 1356 00:59:02,206 --> 00:59:04,736 and I went into our CS 50 archives. 1357 00:59:04,736 --> 00:59:06,216 And here is in fact that code. 1358 00:59:06,406 --> 00:59:07,376 I'll link to this on our web site. 1359 00:59:07,376 --> 00:59:08,986 I mean, this was publicly available, 1360 00:59:08,986 --> 00:59:11,486 we're not really doing anything nasty here.
1361 00:59:11,746 --> 00:59:15,256 But what is perhaps the academic value of glancing at this is 1362 00:59:15,256 --> 00:59:18,206 that although we'll just glance at the code, all of the code 1363 00:59:18,206 --> 00:59:19,616 that this fellow wrote just boils 1364 00:59:19,616 --> 00:59:21,876 down to some basic C constructs 1365 00:59:21,876 --> 00:59:24,046 that we've been teasing apart the past couple of weeks. 1366 00:59:24,316 --> 00:59:27,046 So this is the code, exploit for the iPod, iPhone, 1367 00:59:27,046 --> 00:59:30,276 by talk carta, Drey, and Niacin. 1368 00:59:30,336 --> 00:59:33,086 Credit for the discovery goes to Tavis [Phonetics]. 1369 00:59:33,266 --> 00:59:36,606 All right, so we have some sharp includes up here. 1370 00:59:37,146 --> 00:59:40,466 This is actually a C++ thing, so for those familiar 1371 00:59:40,466 --> 00:59:44,456 or unfamiliar, C++ is kind of like a new and improved version 1372 00:59:44,456 --> 00:59:47,126 of C, a superset, if you will, with additional features. 1373 00:59:47,596 --> 00:59:50,366 They're definitely distinct, but they're very much related. 1374 00:59:50,636 --> 00:59:52,546 Here we have a function, I mean, 1375 00:59:52,546 --> 00:59:54,376 this stuff is all fairly familiar, 1376 00:59:54,376 --> 00:59:58,516 we got some parameters defined there, a for loop, oh, 1377 00:59:58,516 --> 01:00:00,006 and this is interesting, we'll actually glance 1378 01:00:00,006 --> 01:00:01,006 at this briefly today. 1379 01:00:01,366 --> 01:00:03,566 C allows you to declare something called a struct. 1380 01:00:03,566 --> 01:00:06,326 So there will come a point very soon where it's insufficient 1381 01:00:06,326 --> 01:00:07,976 to represent pieces of our program 1382 01:00:07,976 --> 01:00:09,936 with just chars and ints and strings.
1383 01:00:10,216 --> 01:00:13,046 You kind of want to represent a whole student object 1384 01:00:13,046 --> 01:00:16,496 or a whole person object, you want to represent an entity 1385 01:00:16,496 --> 01:00:18,436 that might have multiple components to it. 1386 01:00:18,436 --> 01:00:21,966 And you can do that with this piece of syntax called a struct, 1387 01:00:22,136 --> 01:00:24,266 and define your own data structures. 1388 01:00:24,526 --> 01:00:27,886 If I scroll down here we see some more struct stuff. 1389 01:00:27,886 --> 01:00:29,226 Where's main, where's main. 1390 01:00:29,226 --> 01:00:29,786 There it is. 1391 01:00:29,786 --> 01:00:33,246 So there's the familiar main. 1392 01:00:33,916 --> 01:00:35,356 So this fellow would not get very good points 1393 01:00:35,356 --> 01:00:36,116 for style, right? 1394 01:00:36,116 --> 01:00:37,626 I have yet to see a single comment in here, 1395 01:00:37,926 --> 01:00:39,226 but maybe that's one of his points. 1396 01:00:39,496 --> 01:00:43,626 So we get some hexadecimal, apparently that comes in useful 1397 01:00:43,626 --> 01:00:45,316 and handy when cracking iPhones. 1398 01:00:45,546 --> 01:00:48,096 And then there's some fairly esoteric looking stuff, 1399 01:00:48,096 --> 01:00:51,116 but as I was talking with one of your classmates during break, 1400 01:00:51,116 --> 01:00:52,896 that often is the work flow. 1401 01:00:52,896 --> 01:00:53,866 And I don't know the specifics 1402 01:00:53,866 --> 01:00:56,746 of this particular attack vector, is for people just 1403 01:00:56,746 --> 01:00:59,806 to bang on programs, or bang on web sites, 1404 01:00:59,806 --> 01:01:02,296 and by this I mean give them a billion characters 1405 01:01:02,296 --> 01:01:04,446 and see what happens, give them zero characters 1406 01:01:04,446 --> 01:01:05,386 and see what happens. 
1407 01:01:05,386 --> 01:01:08,296 Generally, bang on someone's code, which we encourage you 1408 01:01:08,296 --> 01:01:10,486 to do to your own, because the teaching fellows will then do 1409 01:01:10,486 --> 01:01:13,956 that themselves because very often if you push hard enough, 1410 01:01:13,956 --> 01:01:16,256 you start to find weak points in programs. 1411 01:01:16,496 --> 01:01:18,786 And in fact, what adversaries generally thrive 1412 01:01:18,786 --> 01:01:23,576 on is finding bugs in programs that cause them to crash. 1413 01:01:24,086 --> 01:01:25,546 Because even though you, the human, 1414 01:01:25,546 --> 01:01:28,556 the user might actually get really annoyed when your program 1415 01:01:28,556 --> 01:01:30,526 or some web site crashes as the result 1416 01:01:30,526 --> 01:01:32,056 of some unintended behavior, 1417 01:01:32,356 --> 01:01:35,966 that really makes an adversary drool, because generally, 1418 01:01:35,966 --> 01:01:37,606 as we've even started to see here, 1419 01:01:37,856 --> 01:01:41,346 when there are mistakes made you can induce unintended behavior 1420 01:01:41,626 --> 01:01:43,896 and sometimes by injecting data 1421 01:01:43,896 --> 01:01:45,886 that shouldn't actually be there. 1422 01:01:45,886 --> 01:01:48,046 And so what happens often is trial and error, 1423 01:01:48,046 --> 01:01:50,606 you bang on a web site, bang on a program, voilà. 1424 01:01:50,606 --> 01:01:51,186 It crashes. 1425 01:01:51,186 --> 01:01:52,206 It core dumps. 1426 01:01:52,356 --> 01:01:53,666 Then the fun starts. 1427 01:01:53,666 --> 01:01:55,326 At least in the eyes of these folks. 1428 01:01:55,586 --> 01:01:57,256 Because then you can finally start to think 1429 01:01:57,256 --> 01:02:00,956 about why did it crash and how can I maybe take advantage 1430 01:02:00,986 --> 01:02:01,676 of that.
1431 01:02:01,886 --> 01:02:03,726 Now in the case of iPhones and things like this, 1432 01:02:03,726 --> 01:02:05,576 this really is a cat and mouse game, right? 1433 01:02:05,576 --> 01:02:07,736 Because Apple quickly fixed whatever exploit, 1434 01:02:08,006 --> 01:02:11,296 whatever bug this fellow took over, but it's a cat 1435 01:02:11,296 --> 01:02:12,506 and mouse game, and that's -- 1436 01:02:12,586 --> 01:02:14,356 this has been done several times since. 1437 01:02:14,446 --> 01:02:16,456 So it's much -- the adversaries, frankly, 1438 01:02:16,456 --> 01:02:19,496 probably have the advantage in all of this because we the -- 1439 01:02:19,496 --> 01:02:24,006 well, we the good guys have to fix every one of our mistakes 1440 01:02:24,156 --> 01:02:26,636 to keep our code and our programs and our data safe. 1441 01:02:26,916 --> 01:02:30,476 The bad guys just have to find one flaw or one mistake. 1442 01:02:30,536 --> 01:02:33,406 So the tables are by nature fairly imbalanced. 1443 01:02:33,666 --> 01:02:34,706 I thought I would show you this too, 1444 01:02:34,706 --> 01:02:35,856 this was shared by a classmate. 1445 01:02:36,386 --> 01:02:38,056 Now you have a bit more comfort, perhaps, 1446 01:02:38,056 --> 01:02:40,146 with crazy-looking syntax, this is linked, too, 1447 01:02:40,146 --> 01:02:41,636 on the courses lecture page. 1448 01:02:41,896 --> 01:02:44,426 And it's kind of a witty, jokey thing someone spent a lot 1449 01:02:44,526 --> 01:02:48,956 of time on, coming up with different snippets of code 1450 01:02:48,956 --> 01:02:51,266 that different personalities or people might write. 1451 01:02:51,586 --> 01:02:53,636 So someone in high school or junior high back 1452 01:02:53,636 --> 01:02:55,236 in the day might have written a program 1453 01:02:55,236 --> 01:02:56,456 in a language called basic, 1454 01:02:56,456 --> 01:02:58,766 that as I said has explicit line numbers, 1455 01:02:58,766 --> 01:03:00,026 looks a little simple like that. 
1456 01:03:00,256 --> 01:03:03,336 Then maybe in college it gets a little more sophisticated. 1457 01:03:03,336 --> 01:03:06,326 After that, maybe in CS 51 you start using a language called 1458 01:03:06,326 --> 01:03:10,086 lisp or scheme, looks a little like that. 1459 01:03:10,086 --> 01:03:11,676 Then you start to become a professional, 1460 01:03:11,676 --> 01:03:13,256 and the code starts to get longer. 1461 01:03:13,366 --> 01:03:15,616 A lot of the examples just get a little crazy. 1462 01:03:15,616 --> 01:03:19,156 But it speaks to this tendency to overengineer a problem, 1463 01:03:19,746 --> 01:03:21,966 this written on a Windows platform, 1464 01:03:21,966 --> 01:03:23,056 just ridiculous amount of code. 1465 01:03:23,056 --> 01:03:25,456 All of these programs implement hello world. 1466 01:03:27,176 --> 01:03:28,476 Then things start to come back 1467 01:03:28,476 --> 01:03:31,096 down to meet reality, apprentice hacker. 1468 01:03:31,316 --> 01:03:33,386 So the geeks in the room might actually like to play with some 1469 01:03:33,386 --> 01:03:34,746 of these, because you can run them on nice, 1470 01:03:34,896 --> 01:03:37,446 most of these programs, if you know how to compile them 1471 01:03:37,446 --> 01:03:38,336 or how to interpret them. 1472 01:03:38,606 --> 01:03:40,526 But I think it gets funny sort of toward the bottom, 1473 01:03:40,526 --> 01:03:43,346 where it becomes a little office like or office space like. 1474 01:03:43,786 --> 01:03:46,556 So now we have the new manager printing something in basic, 1475 01:03:46,556 --> 01:03:48,736 middle manager, if you're familiar 1476 01:03:48,736 --> 01:03:50,786 with a little command like this. 1477 01:03:53,086 --> 01:03:55,256 Senior manager, and then the chief executive, 1478 01:03:55,256 --> 01:03:56,336 I thought was the funniest. 1479 01:03:59,321 --> 01:04:01,321 [ Laughter ] 1480 01:04:01,626 --> 01:04:04,346 >> So, fun with geeky humor. 
1481 01:04:04,626 --> 01:04:10,986 Okay, so we promised to take the hood off of the CS 50 library, 1482 01:04:11,166 --> 01:04:12,466 and this is what you've been taking 1483 01:04:12,506 --> 01:04:14,176 for granted all this time. 1484 01:04:14,406 --> 01:04:17,166 But now we're in Week Four of the course, even though some 1485 01:04:17,166 --> 01:04:17,806 of the techniques you're 1486 01:04:17,806 --> 01:04:20,436 about to see are a little more sophisticated than we'd see 1487 01:04:20,436 --> 01:04:24,656 in current problem sets, you can at least now maybe take on faith 1488 01:04:24,656 --> 01:04:27,596 that certain lines of code do what the comments say they do 1489 01:04:27,866 --> 01:04:30,746 without getting perhaps a little overwhelmed by what -- 1490 01:04:30,746 --> 01:04:33,206 just a couple weeks ago was entirely new stuff. 1491 01:04:33,206 --> 01:04:34,696 So what I'm going to do is go back 1492 01:04:34,696 --> 01:04:36,696 into today's source directory. 1493 01:04:36,696 --> 01:04:38,026 You do have a print out of this. 1494 01:04:38,026 --> 01:04:41,546 And this file again that we've been including via the 1495 01:04:41,546 --> 01:04:44,626 preprocessor directive called sharp include looks a little 1496 01:04:44,626 --> 01:04:45,146 like this. 1497 01:04:45,146 --> 01:04:48,036 It's mostly comments up top, 1498 01:04:48,346 --> 01:04:51,066 but then notice we are using a couple of values 1499 01:04:51,066 --> 01:04:53,756 that are defined in these libraries here, float dot H, 1500 01:04:53,756 --> 01:04:57,056 limits dot H, the boolean data type if you used it. 1501 01:04:57,056 --> 01:04:58,166 We actually ripped that off 1502 01:04:58,166 --> 01:05:00,806 from a standard library called standard bool dot H, 1503 01:05:01,376 --> 01:05:03,356 C did not originally come with a boolean data type. 1504 01:05:03,416 --> 01:05:05,886 People just used ints and used zero and one.
1505 01:05:06,176 --> 01:05:09,316 But eventually people came up with a typedef for that. 1506 01:05:09,316 --> 01:05:11,126 We define string in here, 1507 01:05:11,336 --> 01:05:14,316 and then down here are just the prototypes or declarations 1508 01:05:14,316 --> 01:05:16,936 for all these various functions, and the documentation for them. 1509 01:05:17,176 --> 01:05:21,016 And as an aside, you won't have to do these sets of instructions 1510 01:05:21,016 --> 01:05:23,356 at the top of this file, but just so you know, 1511 01:05:23,436 --> 01:05:25,146 and try to remember this eight weeks hence, 1512 01:05:25,436 --> 01:05:28,896 when you leave CS 50, if you ever try to recompile your code 1513 01:05:28,896 --> 01:05:31,696 on another Linux box or PC or Mac, 1514 01:05:32,026 --> 01:05:34,646 you may have some trouble with the compilation 1515 01:05:34,646 --> 01:05:38,156 if you don't actually have copies of CS 50.H 1516 01:05:38,276 --> 01:05:42,516 and CS 50.C. We install them sort of automatically 1517 01:05:42,556 --> 01:05:46,166 on nice.fas in a way that just makes things work. 1518 01:05:46,166 --> 01:05:48,706 So we included -- just FYI eight months hence, 1519 01:05:49,016 --> 01:05:50,916 directions up here, they're arcane, 1520 01:05:50,916 --> 01:05:53,316 but they'll be a little more familiar in a few weeks time, 1521 01:05:53,636 --> 01:05:56,166 how you can continue using the CS 50 library 1522 01:05:56,236 --> 01:05:59,096 or in spirit any library like it, down the road. 1523 01:05:59,096 --> 01:06:01,826 Essentially, you have to create the equivalent of a dot O file 1524 01:06:02,056 --> 01:06:04,466 and then move it to a special place on your actual Mac, 1525 01:06:04,466 --> 01:06:06,446 Windows or Linux computer. 1526 01:06:06,706 --> 01:06:11,466 So let's now look at the source code for CS 50.C.
So we copied 1527 01:06:11,466 --> 01:06:13,666 and pasted a lot of the same documentation, 1528 01:06:13,666 --> 01:06:15,466 let's scroll down to get int. 1529 01:06:15,886 --> 01:06:17,766 Since that's perhaps one we've been using a lot. 1530 01:06:17,976 --> 01:06:19,256 It's not many lines of code. 1531 01:06:19,256 --> 01:06:20,816 So I stripped out the comments here. 1532 01:06:20,966 --> 01:06:23,336 But let's see if we can't reason through what's going on. 1533 01:06:23,336 --> 01:06:28,066 So while 1 induces what we would generally call insert the blank, 1534 01:06:28,566 --> 01:06:32,236 fill in the blank, infinite loop, right? 1535 01:06:32,236 --> 01:06:35,166 So infinite loops, often bad, not necessarily. 1536 01:06:35,236 --> 01:06:38,026 Sometimes it is actually a very reasonable construct 1537 01:06:38,076 --> 01:06:41,306 to code deliberately, so that you can do something 1538 01:06:41,306 --> 01:06:45,096 in perpetuity until something is true, or false, 1539 01:06:45,286 --> 01:06:47,756 at which points you can break out of that loop, 1540 01:06:47,826 --> 01:06:50,316 as we actually do, by returning from within. 1541 01:06:50,576 --> 01:06:53,036 So there's different ways of doing this, but one approach is 1542 01:06:53,036 --> 01:06:56,046 to embrace the infinite loop and just make sure logically 1543 01:06:56,046 --> 01:06:57,656 that you will at some point break out of it, 1544 01:06:57,756 --> 01:06:58,646 if that's the intention. 1545 01:06:59,046 --> 01:07:02,896 So it turns out that get int actually uses our own version 1546 01:07:02,896 --> 01:07:03,626 of get string. 1547 01:07:03,886 --> 01:07:07,506 So the first time Glen and I sat down to write this code we, 1548 01:07:07,506 --> 01:07:09,406 you know, we too started copying and pasting. 1549 01:07:09,456 --> 01:07:12,316 Then we realized wow, get int is really similar to get float. 1550 01:07:12,316 --> 01:07:14,566 And get float is really similar to get double. 
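The deliberate infinite loop described above, one that is only ever left by returning from inside it, can be sketched roughly as follows. This is an illustration of the pattern only, not the library's code; the function name first_non_negative is made up for the example.

```c
#include <stddef.h>

/*
 * Returns the index of the first non-negative entry in values,
 * or -1 if there is none. The while (1) loop has no exit condition of
 * its own; both ways out are return statements inside the body.
 */
int first_non_negative(const int *values, size_t count)
{
    size_t i = 0;
    while (1)
    {
        if (i == count)
        {
            return -1;          /* ran out of entries: leave the loop */
        }
        if (values[i] >= 0)
        {
            return (int) i;     /* found one: leave the loop */
        }
        i++;
    }
}
```

The loop is safe because every pass either returns or moves i closer to count, so one of the two return statements must eventually run.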
1551 01:07:14,836 --> 01:07:17,096 Maybe we should factor out these lines of code 1552 01:07:17,096 --> 01:07:18,496 that we keep copying and pasting 1553 01:07:18,496 --> 01:07:19,796 into every one of these functions. 1554 01:07:20,066 --> 01:07:22,556 And so we did, and thus was born get string. 1555 01:07:22,816 --> 01:07:25,476 So here we're leveraging a function we wrote elsewhere, 1556 01:07:25,476 --> 01:07:29,716 and this factoring out of common code is perfectly consistent 1557 01:07:29,716 --> 01:07:31,126 with the message we've been trying to send, 1558 01:07:31,126 --> 01:07:33,456 that once your code starts getting a little longer, 1559 01:07:33,456 --> 01:07:36,596 a little unwieldy, then you can start plucking out portions 1560 01:07:36,596 --> 01:07:39,316 of it just like we've done in the game of 15. 1561 01:07:39,316 --> 01:07:41,736 The fact that you're handed a file with a whole bunch 1562 01:07:41,736 --> 01:07:44,226 of functions, each one of which does something conceptually 1563 01:07:44,226 --> 01:07:48,496 distinct, speaks to this idea of good design and modularity. 1564 01:07:48,496 --> 01:07:50,266 So how are we leveraging this? 1565 01:07:50,266 --> 01:07:53,186 We get a string up here, we call it a line, 1566 01:07:53,346 --> 01:07:56,286 the variable called line, then we do a little sanity check, 1567 01:07:56,286 --> 01:07:58,966 if line is null, return int Max. 1568 01:07:59,506 --> 01:08:02,936 So here's where you trip over a little limitation or frustration 1569 01:08:02,936 --> 01:08:06,716 of C. If something goes wrong in a function that's supposed 1570 01:08:06,716 --> 01:08:09,556 to return an int, how do you signify an error? 1571 01:08:10,136 --> 01:08:13,556 Well, if you return zero, that suggests 1572 01:08:13,556 --> 01:08:16,636 that you can never actually get the number zero from the user. 1573 01:08:16,636 --> 01:08:17,696 So let's pick something else. 1574 01:08:17,746 --> 01:08:20,236 If you -- maybe we return 123.
1575 01:08:20,236 --> 01:08:22,126 123, any time there's an error. 1576 01:08:22,446 --> 01:08:26,636 But the problem then is the user can never type in the number 123 1577 01:08:26,806 --> 01:08:29,346 because you can't distinguish that from an error. 1578 01:08:29,526 --> 01:08:32,426 So in short, you have to pick some value, presumably, 1579 01:08:32,426 --> 01:08:34,276 you're not going to pick one smack-dab in the middle 1580 01:08:34,276 --> 01:08:37,216 of the range, something like negative 1 or 0, 1581 01:08:37,496 --> 01:08:40,836 or as we've done, we picked the largest int possible. 1582 01:08:40,836 --> 01:08:42,976 We decided you know what, there's an upper bound 1583 01:08:42,976 --> 01:08:47,136 on the number of ints we can store with just 32 bits anyway, 1584 01:08:47,416 --> 01:08:50,716 let's steal one of those, let's steal the very biggest of them, 1585 01:08:51,006 --> 01:08:55,126 2 to the 31 minus 1, and just say to the user, sorry, 1586 01:08:55,176 --> 01:08:58,746 you cannot type in 2 billion, give or take. 1587 01:08:58,856 --> 01:09:00,846 You can use anything smaller than that. 1588 01:09:00,846 --> 01:09:04,496 So this constant, as suggested by the capitalization of it, 1589 01:09:04,716 --> 01:09:10,966 is actually defined in one of these header files in CS 50.H. 1590 01:09:11,206 --> 01:09:14,266 So in float.H and limits.H, they're just some constants 1591 01:09:14,266 --> 01:09:16,296 that someone else already defined for us 1592 01:09:16,296 --> 01:09:18,536 that represent the largest possible ints 1593 01:09:18,626 --> 01:09:20,426 that you can represent with a C program. 1594 01:09:20,746 --> 01:09:22,856 So we said you know what, let's just return that value. 
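The sentinel idea described above, giving up the largest int so it can stand for "something went wrong", can be sketched like this. The names read_int_or_sentinel and is_error are hypothetical stand-ins invented for the example, not functions from the CS 50 library; the only real piece is INT_MAX, which comes from limits.h.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical reader: returns the int that source points at, or INT_MAX
 * when there is nothing to read (source is NULL). INT_MAX is "stolen"
 * from the range of valid answers to act as the error signal.
 */
int read_int_or_sentinel(const int *source)
{
    if (source == NULL)
    {
        return INT_MAX;
    }
    return *source;
}

/* The check a careful caller would make before trusting the value. */
bool is_error(int value)
{
    return value == INT_MAX;
}
```

The cost of this design is visible in is_error: a genuine input equal to INT_MAX cannot be told apart from a failure, which is exactly the trade-off described in the lecture.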
1595 01:09:22,856 --> 01:09:25,896 So you, all this time, any time you all have been calling get 1596 01:09:25,896 --> 01:09:28,526 int and just using its value, you know, 1597 01:09:28,526 --> 01:09:31,906 you've never actually been doing the right thing, which is check 1598 01:09:31,906 --> 01:09:34,186 and make sure it's a legitimate value. 1599 01:09:34,326 --> 01:09:36,746 Technically any time you've been calling get int, 1600 01:09:37,316 --> 01:09:40,226 just like we've been calling get string and checking for null, 1601 01:09:40,426 --> 01:09:42,826 you really should have been checking for int max. 1602 01:09:43,246 --> 01:09:44,816 But frankly, in the first weeks of the course, 1603 01:09:45,136 --> 01:09:47,116 it really gets a little distracting I think 1604 01:09:47,116 --> 01:09:49,146 if you're constantly checking and getting. 1605 01:09:49,406 --> 01:09:51,096 So we throw -- we ignore those details, 1606 01:09:51,096 --> 01:09:54,116 but now in the documentation you can see force -- 1607 01:09:54,546 --> 01:09:56,286 if line can't be read, 1608 01:09:56,286 --> 01:09:59,276 if any error actually happens, it returns int Max. 1609 01:09:59,576 --> 01:10:01,526 And so this is again one of these artsy [Inaudible] moments. 1610 01:10:01,526 --> 01:10:03,996 You just have to know what the possible return values 1611 01:10:03,996 --> 01:10:06,566 are so you can handle any errors accordingly. 1612 01:10:06,786 --> 01:10:08,296 All right, so let's assume there's no errors, 1613 01:10:08,296 --> 01:10:09,816 because frankly this happens very, 1614 01:10:09,816 --> 01:10:11,186 very rarely in most contexts. 1615 01:10:11,506 --> 01:10:12,956 So what do we do next. 1616 01:10:13,016 --> 01:10:14,106 Well, this is a new function. 1617 01:10:14,196 --> 01:10:20,466 S scan F. String scan F. So this function takes a string as input 1618 01:10:20,666 --> 01:10:23,176 and scans it for special characters.
1619 01:10:23,176 --> 01:10:26,246 And we're using this function to figure out did the user type 1620 01:10:26,246 --> 01:10:28,396 in an int, did they type a word, a char, 1621 01:10:28,686 --> 01:10:31,246 we need to use some format codes and just 1622 01:10:31,246 --> 01:10:32,916 like printf uses format codes, 1623 01:10:33,206 --> 01:10:35,976 so does S scan F use format codes like this. 1624 01:10:36,276 --> 01:10:38,436 So this is a little trick, and we won't spend too much time 1625 01:10:38,436 --> 01:10:40,486 on the details of this, but just so you have a taste 1626 01:10:40,486 --> 01:10:42,626 of how this -- what this library's been doing 1627 01:10:42,626 --> 01:10:44,286 and how it's been saving you trouble, 1628 01:10:44,546 --> 01:10:47,636 notice or giving you trouble, perhaps, S scan F is going 1629 01:10:47,636 --> 01:10:50,386 to scan the line the user types in, and it's going 1630 01:10:50,386 --> 01:10:51,546 to look for this pattern. 1631 01:10:51,886 --> 01:10:55,096 It's going to look for space and actually because of the -- 1632 01:10:55,096 --> 01:10:58,216 because the documentation of S scan F says this, 1633 01:10:58,216 --> 01:11:00,306 one space or multiple spaces. 1634 01:11:00,346 --> 01:11:02,776 The fact that I intentionally included a space right there 1635 01:11:03,046 --> 01:11:04,676 means allow any white space. 1636 01:11:04,676 --> 01:11:07,256 The user can hit space bar as many times as he or she wants. 1637 01:11:07,606 --> 01:11:11,746 Then look for an integer, then look for a space, 1638 01:11:11,746 --> 01:11:14,616 one or more spaces, then look for a character. 1639 01:11:14,746 --> 01:11:17,616 And this is actually kind of a clever trick that we did here. 1640 01:11:17,896 --> 01:11:23,676 So these two variables here are the clever trick 1641 01:11:23,676 --> 01:11:27,526 that lets S scan F essentially return two values.
1642 01:11:27,916 --> 01:11:30,356 So remember, a C function can't return two values, 1643 01:11:30,356 --> 01:11:31,336 it can only return one. 1644 01:11:31,406 --> 01:11:33,336 You can only have one thing on the left-hand side 1645 01:11:33,336 --> 01:11:35,216 of the equal sign, the assignment operator, 1646 01:11:35,466 --> 01:11:37,736 but we've had since Monday this goal, 1647 01:11:37,736 --> 01:11:40,596 this desire to actually change multiple values at once. 1648 01:11:40,926 --> 01:11:46,616 I and J, A and B, X and Y. So what am I doing with N and C 1649 01:11:46,616 --> 01:11:50,746 that are just primitives, ints and chars, what am I doing 1650 01:11:50,746 --> 01:11:52,596 with this ampersand, what am I really passing 1651 01:11:52,816 --> 01:11:55,596 into S scan F. Yeah, pointer. 1652 01:11:55,596 --> 01:11:59,306 So the address of N, the address of C, which means now 1653 01:11:59,306 --> 01:12:02,356 that I've handed him this road map to those variables, 1654 01:12:02,356 --> 01:12:05,656 S scan F can put anything he wants in those variables, 1655 01:12:05,876 --> 01:12:09,046 which means he effectively can return two values, 1656 01:12:09,216 --> 01:12:10,706 but that's an abuse of the terminology. 1657 01:12:10,896 --> 01:12:13,876 He can alter or modify or mutate two values, 1658 01:12:14,136 --> 01:12:15,036 and that's the goal here. 1659 01:12:15,036 --> 01:12:17,076 And now finally, I know from the documentation, 1660 01:12:17,396 --> 01:12:21,366 that S scan F returns the total number 1661 01:12:21,366 --> 01:12:24,136 of variables that were read into. 1662 01:12:24,706 --> 01:12:28,146 So I've handed it two, N and C, but S scan F doesn't have 1663 01:12:28,146 --> 01:12:29,706 to fill those variables, because frankly, 1664 01:12:29,706 --> 01:12:32,486 if the user just hits enter or a lot of spaces and no ints 1665 01:12:32,486 --> 01:12:35,516 and no chars, I mean, he has nothing to put in N or C.
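The pointer trick described above, handing a function the addresses of two variables so it can fill both, can be sketched with a small example. The function split_time is invented for illustration; like S scan F, it writes its results through pointers and returns how many of them it actually filled.

```c
#include <stddef.h>

/*
 * Splits total_seconds into whole minutes and leftover seconds.
 * The results are written through the pointers the caller passes in;
 * the return value is the number of outputs that were filled (0 to 2).
 */
int split_time(int total_seconds, int *minutes, int *seconds)
{
    int filled = 0;
    if (total_seconds < 0)
    {
        return 0;               /* nothing sensible to store */
    }
    if (minutes != NULL)
    {
        *minutes = total_seconds / 60;
        filled++;
    }
    if (seconds != NULL)
    {
        *seconds = total_seconds % 60;
        filled++;
    }
    return filled;
}
```

A caller would write something like split_time(125, &m, &s), using the ampersand to pass the addresses of m and s, and then check the return value before trusting either variable.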
1666 01:12:35,786 --> 01:12:38,416 So S scan F returns the number of variables 1667 01:12:38,416 --> 01:12:40,336 that were actually filled with values. 1668 01:12:41,166 --> 01:12:42,616 So why am I doing this? 1669 01:12:42,706 --> 01:12:46,186 Like, why the percent C. Clearly, I want just one 1670 01:12:46,186 --> 01:12:47,156 of those to be filled. 1671 01:12:47,326 --> 01:12:50,906 Right? Because I'm checking here for equals, equals one. 1672 01:12:51,256 --> 01:12:53,296 I only want one variable filled, the first one, 1673 01:12:53,296 --> 01:13:02,706 N. What does it mean if the return value is 2, do you think. 1674 01:13:03,296 --> 01:13:04,666 So there's a character, right? 1675 01:13:04,666 --> 01:13:08,256 Just logically, if the return value of S scan F is 2, not 1, 1676 01:13:08,486 --> 01:13:11,316 that means both N and C were filled with values. 1677 01:13:11,496 --> 01:13:12,366 But what does that mean? 1678 01:13:12,366 --> 01:13:13,846 Well, according to the format codes, 1679 01:13:14,116 --> 01:13:18,186 the number would have been put first because of the percent D 1680 01:13:18,186 --> 01:13:20,926 and because of the N. So if the user types something else, 1681 01:13:21,186 --> 01:13:23,486 this something else, the first character of it, 1682 01:13:23,696 --> 01:13:26,526 would at least end up in C, and so what this means is 1683 01:13:26,526 --> 01:13:29,136 that if the user is trying to mess with me by typing something 1684 01:13:29,136 --> 01:13:32,776 like 4 and then again my random string of the year, monkey, 1685 01:13:33,066 --> 01:13:36,586 it's going to detect 4 and then M, but that's an error, right? 1686 01:13:36,586 --> 01:13:38,596 Because we only want the number 4. 1687 01:13:38,716 --> 01:13:41,486 So if the return value of S scan F is anything but 1, 1688 01:13:41,486 --> 01:13:44,656 that means the user typed in, yes, an int, but then some garbage.
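The check described above can be sketched as a small helper. This is a minimal illustration under stated assumptions, not the library's actual source: the function name parse_int and its bool return type are made up for the example, while the format string and the comparison against 1 follow the lecture's description of sscanf.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Tries to read exactly one int from line, allowing whitespace before
 * and after it. On success stores the value in *out and returns true.
 * Anything else (no int at all, or an int followed by other characters)
 * returns false and leaves *out untouched.
 */
bool parse_int(const char *line, int *out)
{
    int n;
    char c;
    if (line == NULL || out == NULL)
    {
        return false;
    }
    /* " %d %c": optional whitespace, an int, optional whitespace, then
       any one leftover character. A return value of 1 means only n was
       filled, so nothing but whitespace followed the number. */
    if (sscanf(line, " %d %c", &n, &c) == 1)
    {
        *out = n;
        return true;
    }
    return false;
}
```

Under this sketch, "4" and "  42  " are accepted, while "4 monkey" (two variables filled), "monkey" (none filled), and an empty line are all rejected.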
1689 01:13:45,076 --> 01:13:46,416 Or the user types nothing, 1690 01:13:46,416 --> 01:13:48,476 in which case the return value is 0, 1691 01:13:48,806 --> 01:13:50,366 in which case it's also an error. 1692 01:13:50,366 --> 01:13:52,456 So what do we do in the case of an error? 1693 01:13:52,786 --> 01:13:57,166 We free the line, so we free the memory that we just got by way 1694 01:13:57,166 --> 01:14:00,196 of GetString, and then we tell the user to retry. 1695 01:14:00,196 --> 01:14:03,606 So any time you've seen in your own programs retry colon, 1696 01:14:03,876 --> 01:14:05,606 it's simply because we've hard-coded it 1697 01:14:05,606 --> 01:14:08,266 into this function here, but if sscanf does 1698 01:14:08,266 --> 01:14:09,806 in fact return 1, what do we do? 1699 01:14:09,806 --> 01:14:11,606 Well, we don't need the line of text, 1700 01:14:11,606 --> 01:14:14,486 I don't need everything the user types, I don't need any spaces 1701 01:14:14,486 --> 01:14:18,246 or whatnot, I just need N. So I go ahead and return N. 1702 01:14:18,816 --> 01:14:20,686 And now a quick distinction, to be clear. 1703 01:14:20,936 --> 01:14:23,866 GetString allocates memory dynamically. With regard 1704 01:14:23,866 --> 01:14:26,006 to our little picture from before, 1705 01:14:26,006 --> 01:14:30,776 where does GetString take memory from? 1706 01:14:30,966 --> 01:14:32,876 So the heap, right? 1707 01:14:32,916 --> 01:14:36,936 So GetString or, in turn, malloc takes all of its memory 1708 01:14:36,936 --> 01:14:38,176 from this area called the heap. 1709 01:14:38,456 --> 01:14:42,836 By contrast, where do N and C live in memory? 1710 01:14:43,376 --> 01:14:44,966 So they live on the stack. 1711 01:14:45,126 --> 01:14:46,836 And we knew that from a week or so ago. 1712 01:14:46,836 --> 01:14:49,056 Any local variables live on the stack.
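The check described above, where a line counts as valid only when sscanf fills exactly one variable, can be sketched roughly as follows. This is a minimal illustration and not the course library's actual code: the function name `parse_int` and its signature are assumptions made for the example, and it takes a line that has already been read instead of calling GetString itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper (not part of any real library): succeeds only if
// the line holds one int and nothing else besides whitespace.
bool parse_int(const char *line, int *out)
{
    assert(line != NULL && out != NULL);

    int n;
    char c;

    // sscanf reports how many variables it filled:
    //   1      -> only n was filled: an int and nothing after it
    //   2      -> n and c were filled: an int followed by garbage
    //   0, EOF -> nothing was filled: no int at all, or an empty line
    if (sscanf(line, " %d %c", &n, &c) == 1)
    {
        *out = n;
        return true;
    }
    return false;
}
```

A caller in the spirit of the lecture would free the line and prompt for a retry whenever this returns false, and return the int otherwise.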
1713 01:14:49,356 --> 01:14:51,326 But the problem with the stack is that as soon 1714 01:14:51,326 --> 01:14:53,686 as a function returns, as soon as GetInt returns, 1715 01:14:54,086 --> 01:14:56,406 the stuff from the stack goes away, gets obliterated. 1716 01:14:56,406 --> 01:14:57,906 It's no longer safely there. 1717 01:14:57,906 --> 01:15:01,056 And this is why the heap is so useful and so neat, 1718 01:15:01,206 --> 01:15:04,746 because on the heap, memory stays where it is 1719 01:15:04,746 --> 01:15:07,396 and it doesn't get automatically overwritten 1720 01:15:07,436 --> 01:15:08,466 as it does on the stack. 1721 01:15:08,466 --> 01:15:09,766 Now granted, if you try 1722 01:15:09,766 --> 01:15:12,716 to use too much memory it will accidentally get overwritten, 1723 01:15:12,916 --> 01:15:14,136 but not automatically. 1724 01:15:14,196 --> 01:15:17,396 It's the stack whose memory is reused constantly, 1725 01:15:17,396 --> 01:15:18,646 as soon as a function returns. 1726 01:15:19,036 --> 01:15:22,816 Which is to say, do I need to ever free a local variable 1727 01:15:22,816 --> 01:15:26,656 like N or C? So no, it would be an error 1728 01:15:26,656 --> 01:15:28,036 to free anything on the stack. 1729 01:15:28,196 --> 01:15:29,856 You only call the free function 1730 01:15:29,856 --> 01:15:32,106 on something that's allocated on the heap 1731 01:15:32,246 --> 01:15:34,066 and that's why I only free the line 1732 01:15:34,406 --> 01:15:36,316 and I only free the line here. 1733 01:15:36,606 --> 01:15:38,436 Well, let's now tease apart GetString, 1734 01:15:38,576 --> 01:15:40,146 since that's apparently what's making all 1735 01:15:40,146 --> 01:15:41,656 of this possible in the first place. 1736 01:15:42,026 --> 01:15:45,126 So GetString is a little longer, we won't dwell on all 1737 01:15:45,126 --> 01:15:47,936 of the details, but just to hint at the flexibility here.
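To make the stack-versus-heap distinction above concrete, here is a small sketch. The helper name `copy_to_heap` is hypothetical and chosen only for illustration. Its local variables live on the stack and vanish when it returns, while the bytes it gets from malloc live on the heap and stay valid until the caller frees them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a heap-allocated copy of s, or NULL if
// malloc fails. The caller owns the result and must free it.
char *copy_to_heap(const char *s)
{
    assert(s != NULL);

    // length and copy are local variables: they live on the stack and
    // are gone as soon as this function returns. Never free them.
    size_t length = strlen(s);

    // The bytes that copy points at live on the heap, so they outlive
    // this function call.
    char *copy = malloc(length + 1);
    if (copy == NULL)
    {
        return NULL;
    }
    memcpy(copy, s, length + 1);

    // Only the pointer's value is returned; the heap bytes stay put.
    return copy;
}
```

A caller would write something like `char *p = copy_to_heap("monkey");`, use `p`, and then call `free(p)` exactly once. Calling free on the address of a local int or char would be an error, as the lecture notes.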
1738 01:15:48,266 --> 01:15:51,656 So GetString first declares a string called buffer, 1739 01:15:51,656 --> 01:15:53,636 but it initializes it to null. 1740 01:15:53,636 --> 01:15:54,816 And that's just good practice. 1741 01:15:54,816 --> 01:15:56,606 You will find that even if you don't need 1742 01:15:56,606 --> 01:15:59,166 to use a variable yet, you will save yourself a lot 1743 01:15:59,166 --> 01:16:01,646 of time over, you know, the course of your lifetime 1744 01:16:01,906 --> 01:16:04,096 by intentionally initializing 1745 01:16:04,096 --> 01:16:07,206 variables to values, so you know in fact they are something. 1746 01:16:07,596 --> 01:16:10,936 Then I have a capacity local variable, an N local variable, 1747 01:16:10,936 --> 01:16:13,656 and a character. Now this looks like bad style, but it's just 1748 01:16:13,656 --> 01:16:16,046 because I've ripped out all of the comments from today's code. 1749 01:16:16,366 --> 01:16:18,306 And it looks like there's this function. 1750 01:16:18,726 --> 01:16:23,006 There is a function in C called fgetc, file get character; 1751 01:16:23,446 --> 01:16:25,416 standard in refers to your keyboard. 1752 01:16:25,416 --> 01:16:29,026 So what this is, is a loop that says keep reading character 1753 01:16:29,026 --> 01:16:30,816 after character from the keyboard, 1754 01:16:31,056 --> 01:16:34,456 store it on each iteration in a local variable called C, 1755 01:16:34,456 --> 01:16:36,436 and then, and this is our clever one-liner, 1756 01:16:36,636 --> 01:16:39,096 make sure that C does not equal the new line 1757 01:16:39,306 --> 01:16:43,056 and does not equal the special EOF, end-of-file character 1758 01:16:43,276 --> 01:16:44,676 that happens if the user hits, say, 1759 01:16:44,676 --> 01:16:46,626 Control-D on most computers. 1760 01:16:46,936 --> 01:16:49,176 Now this line of code here, which is best explained 1761 01:16:49,176 --> 01:16:52,176 by the comments, does a really neat thing.
1762 01:16:52,176 --> 01:16:54,416 If the total number of characters in the string 1763 01:16:54,416 --> 01:16:58,916 at the moment plus 1 puts us over the capacity of the string, 1764 01:16:59,176 --> 01:17:01,496 in other words, we have this variable called buffer, 1765 01:17:01,646 --> 01:17:03,126 it's just a bunch of bytes, currently, 1766 01:17:03,316 --> 01:17:04,196 there are no bytes there, 1767 01:17:04,196 --> 01:17:05,806 because I didn't initialize it to anything. 1768 01:17:06,186 --> 01:17:09,506 If the next byte would overflow the so-called buffer 1769 01:17:09,506 --> 01:17:12,206 of memory, just intuitively, what had we best do, 1770 01:17:12,206 --> 01:17:14,026 if we want to be able to grow dynamically? 1771 01:17:14,786 --> 01:17:17,856 The buffer is this big, the user just typed something in, 1772 01:17:17,856 --> 01:17:19,646 I need to go ask for more memory. 1773 01:17:19,646 --> 01:17:22,016 Now let's fast forward a few moments in the story, 1774 01:17:22,016 --> 01:17:23,726 suppose that I now had 10 characters, 1775 01:17:23,726 --> 01:17:24,836 but the user types in 11. 1776 01:17:25,266 --> 01:17:27,606 Problem is I can't just overwrite the 11th space, 1777 01:17:27,606 --> 01:17:29,596 because I'm going to overwrite the backslash 0. 1778 01:17:29,806 --> 01:17:32,636 So what GetString ultimately does is it starts with a buffer 1779 01:17:32,636 --> 01:17:35,216 like this, the more and more characters it finds 1780 01:17:35,216 --> 01:17:39,036 at the prompt by way of this fgetc function, it decides oh, 1781 01:17:39,036 --> 01:17:41,036 you know what, let me double the buffer size. 1782 01:17:41,036 --> 01:17:43,456 Let me double it again and then double it again. 1783 01:17:43,456 --> 01:17:44,526 And then double it again. 1784 01:17:44,766 --> 01:17:46,986 So it's literally growing dynamically.
1785 01:17:47,226 --> 01:17:50,316 It's using a function called realloc, which has the effect 1786 01:17:50,316 --> 01:17:53,026 of reallocating memory if it's possible. 1787 01:17:53,266 --> 01:17:54,256 You hand it a pointer, 1788 01:17:54,356 --> 01:17:57,266 say I don't want this memory any more, I want more. 1789 01:17:57,526 --> 01:17:58,696 Give it back to me. 1790 01:17:58,966 --> 01:18:01,396 So that's what this ultimately does, and then finally, 1791 01:18:01,396 --> 01:18:04,666 at the end of this function we're sort of a good neighbor. 1792 01:18:05,036 --> 01:18:07,776 So we then decide, you know what, if I've just been doubling 1793 01:18:07,776 --> 01:18:09,156 and doubling and doubling this buffer, 1794 01:18:09,156 --> 01:18:12,096 but now I actually didn't need all of that memory, 1795 01:18:12,096 --> 01:18:13,816 I just needed an extra one or two bytes, 1796 01:18:13,816 --> 01:18:15,446 not all of the memory over here. 1797 01:18:15,886 --> 01:18:18,536 These three lines of code here, as you'll see in your printouts 1798 01:18:18,536 --> 01:18:19,486 if you reason through it, 1799 01:18:19,796 --> 01:18:24,066 simply allocate just enough memory using malloc, 1800 01:18:24,066 --> 01:18:26,476 using N plus 1 for the null character. 1801 01:18:26,686 --> 01:18:29,986 We use a function called string N copy, strncpy, 1802 01:18:29,986 --> 01:18:31,236 which copies one to the other, 1803 01:18:31,236 --> 01:18:34,136 and then we free this excessively large buffer 1804 01:18:34,136 --> 01:18:36,586 so that what we hand you, the student, the user, 1805 01:18:36,586 --> 01:18:39,556 is only as many bytes as you actually need. 1806 01:18:39,766 --> 01:18:41,596 And so this little exercise today 1807 01:18:41,596 --> 01:18:43,226 of finally peeling back the layer 1808 01:18:43,226 --> 01:18:45,746 and looking underneath the hood is precisely in the spirit 1809 01:18:45,746 --> 01:18:47,006 of taking these training wheels off.
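The growth strategy described above (read with fgetc, double the buffer with realloc when it fills, then copy into an exactly sized block with malloc and strncpy and free the oversized one) might look roughly like the sketch below. It is not the course library's actual source. The `buffer` struct, the function names, and the starting capacity of 32 bytes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical growable buffer; every name here is illustrative only.
typedef struct
{
    char *data;       // heap memory, NULL until the first append
    size_t length;    // characters stored so far
    size_t capacity;  // bytes currently allocated
} buffer;

// Append one character, doubling the capacity when the buffer is full.
bool buffer_append(buffer *b, char c)
{
    assert(b != NULL);
    if (b->length + 1 > b->capacity)
    {
        // Assumed starting size of 32 bytes; after that, keep doubling.
        size_t bigger = (b->capacity == 0) ? 32 : b->capacity * 2;
        char *temp = realloc(b->data, bigger);
        if (temp == NULL)
        {
            return false;  // the old block is still valid and owned by b
        }
        b->data = temp;
        b->capacity = bigger;
    }
    b->data[b->length++] = c;
    return true;
}

// Copy the characters into an exactly sized string, then free the
// oversized buffer. Returns NULL if malloc fails.
char *buffer_finish(buffer *b)
{
    assert(b != NULL);
    char *minimal = malloc(b->length + 1);  // + 1 for the terminating '\0'
    if (minimal != NULL)
    {
        if (b->length > 0)
        {
            strncpy(minimal, b->data, b->length);
        }
        minimal[b->length] = '\0';
    }
    free(b->data);
    b->data = NULL;
    b->length = 0;
    b->capacity = 0;
    return minimal;
}

// Read one line from stream, stopping at a newline or end of file.
// Returns NULL on failure or if there was no input at all.
char *read_line(FILE *stream)
{
    buffer b = {NULL, 0, 0};
    int c;
    while ((c = fgetc(stream)) != '\n' && c != EOF)
    {
        if (!buffer_append(&b, (char) c))
        {
            free(b.data);
            return NULL;
        }
    }
    if (b.length == 0 && c == EOF)
    {
        free(b.data);
        return NULL;
    }
    return buffer_finish(&b);
}
```

A caller would use it as `char *line = read_line(stdin);` and, as with any heap memory, call `free(line)` when done. Doubling keeps the number of realloc calls small relative to the line length, at the cost of temporarily holding up to about twice the memory needed, which is why the final trim step exists.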
1810 01:18:47,256 --> 01:18:48,786 And next week we dive into 1811 01:18:48,786 --> 01:18:50,796 yet more powerful techniques still. 1812 01:18:50,796 --> 01:18:51,916 We'll see you on Monday.