1 00:00:00,000 --> 00:00:11,320 2 00:00:11,320 --> 00:00:13,260 >> DAVID MALAN: Hello, and welcome back to CS50. 3 00:00:13,260 --> 00:00:14,860 So this is the end of week four. 4 00:00:14,860 --> 00:00:16,680 Just one announcement first. 5 00:00:16,680 --> 00:00:19,600 So the so-called fifth Monday is coming up this coming Monday. 6 00:00:19,600 --> 00:00:22,800 This is the opportunity to change from SAT/UNSAT to a letter grade, or from 7 00:00:22,800 --> 00:00:24,130 letter grade to SAT/UNSAT. 8 00:00:24,130 --> 00:00:27,130 Annoyingly, that process does require a signature, because you have to fill 9 00:00:27,130 --> 00:00:28,770 out one of those pink add/drop forms. 10 00:00:28,770 --> 00:00:31,680 >> Because technically, the SAT/UNSAT version and the letter grade version 11 00:00:31,680 --> 00:00:33,320 have distinct catalog numbers. 12 00:00:33,320 --> 00:00:34,240 But no big deal. 13 00:00:34,240 --> 00:00:36,620 Just come up to me or to Rob or to Lauren at any point. 14 00:00:36,620 --> 00:00:39,550 Or email us if you don't have the kind of paperwork you need today, and we 15 00:00:39,550 --> 00:00:43,410 will be sure to help you take care of that before Monday. 16 00:00:43,410 --> 00:00:45,780 >> All right, so today-- 17 00:00:45,780 --> 00:00:47,630 actually, there's a bit of an echo. 18 00:00:47,630 --> 00:00:51,070 Can we tone me down a bit? 19 00:00:51,070 --> 00:00:51,730 OK. 20 00:00:51,730 --> 00:00:54,850 So today, we introduce a topic known as pointers. 21 00:00:54,850 --> 00:00:57,770 And I'll admit that this is one of the more complex topics that we tend to 22 00:00:57,770 --> 00:01:00,960 cover in this class, or really any introductory course that uses C. 23 00:01:00,960 --> 00:01:05,510 >> But take my word for it, particularly if your mind feels a bit more bent 24 00:01:05,510 --> 00:01:07,100 today and in the weeks to come. 
25 00:01:07,100 --> 00:01:10,340 It's not representative of you getting any worse at this; it just means that 26 00:01:10,340 --> 00:01:13,360 it's a particularly sophisticated topic that I promise, a few weeks 27 00:01:13,360 --> 00:01:17,610 hence, will seem all too strikingly straightforward in retrospect. 28 00:01:17,610 --> 00:01:18,720 >> I still remember to this day. 29 00:01:18,720 --> 00:01:22,190 I was sitting in Elliott Dining Hall, sitting next to my TF Nishat Mehta, 30 00:01:22,190 --> 00:01:24,070 who was a resident of Elliott house. 31 00:01:24,070 --> 00:01:26,340 And for some reason, this topic just clicked. 32 00:01:26,340 --> 00:01:29,430 Which is to say that I too struggled with it for some amount of time, but I 33 00:01:29,430 --> 00:01:33,610 will do my best to help avoid any such struggle with a topic that ultimately 34 00:01:33,610 --> 00:01:34,580 is quite powerful. 35 00:01:34,580 --> 00:01:37,350 >> In fact, one of the topics we'll discuss in the weeks to come is that 36 00:01:37,350 --> 00:01:41,130 of security, and how you can actually exploit machines in ways 37 00:01:41,130 --> 00:01:42,320 that were not intended. 38 00:01:42,320 --> 00:01:45,850 And those exploitations are typically the result of bugs, mistakes that we 39 00:01:45,850 --> 00:01:49,740 people make by not understanding some of the underlying implementation 40 00:01:49,740 --> 00:01:52,250 details via which programs are made. 41 00:01:52,250 --> 00:01:55,410 >> Now to make this seem all the more user friendly, I thought I'd play a 10 42 00:01:55,410 --> 00:01:59,680 second preview of a little claymation figure named Binky who was brought to 43 00:01:59,680 --> 00:02:03,020 life by a friend of ours at Stanford, professor Nick Parlante. 44 00:02:03,020 --> 00:02:06,753 So allow me to give you this teaser of Binky here. 45 00:02:06,753 --> 00:02:09,520 >> [VIDEO PLAYBACK] 46 00:02:09,520 --> 00:02:10,380 >> -Hey, Binky. 47 00:02:10,380 --> 00:02:11,050 Wake up. 
48 00:02:11,050 --> 00:02:13,610 It's time for pointer fun. 49 00:02:13,610 --> 00:02:14,741 >> -What's that? 50 00:02:14,741 --> 00:02:16,440 Learn about pointers? 51 00:02:16,440 --> 00:02:17,928 Oh, goodie. 52 00:02:17,928 --> 00:02:18,920 >> [END VIDEO PLAYBACK] 53 00:02:18,920 --> 00:02:20,670 >> DAVID MALAN: That is Stanford computer science. 54 00:02:20,670 --> 00:02:23,194 So more on that to come. 55 00:02:23,194 --> 00:02:24,930 >> [APPLAUSE] 56 00:02:24,930 --> 00:02:26,660 >> DAVID MALAN: Sorry, Nick. 57 00:02:26,660 --> 00:02:30,680 >> So recall that last time we ended on this really exciting cliffhanger 58 00:02:30,680 --> 00:02:32,960 whereby this function just didn't work. 59 00:02:32,960 --> 00:02:34,960 At least intuitively, it felt like it should work. 60 00:02:34,960 --> 00:02:37,600 Simply swapping the values of two integers. 61 00:02:37,600 --> 00:02:40,915 But recall that when we printed out the original values in main, one and 62 00:02:40,915 --> 00:02:44,210 two, they were still one and two and not two and one. 63 00:02:44,210 --> 00:02:46,070 >> So let me actually switch over to the appliance. 64 00:02:46,070 --> 00:02:50,180 And I wrote up a bit of skeletal code in advance here, where I claim that x 65 00:02:50,180 --> 00:02:52,500 will be 1, y will be 2. 66 00:02:52,500 --> 00:02:54,810 I then print out both of their values with printf. 67 00:02:54,810 --> 00:02:57,540 >> I then claim down here that we're going to swap them. 68 00:02:57,540 --> 00:03:00,800 I left a blank spot here for us to fill in today in just a moment. 69 00:03:00,800 --> 00:03:03,380 Then, I'm going to claim that the two variables have been swapped. 70 00:03:03,380 --> 00:03:04,770 Then I'm going to print them out again. 71 00:03:04,770 --> 00:03:07,090 And so hopefully, I should see 1, 2. 72 00:03:07,090 --> 00:03:07,380 2, 1. 73 00:03:07,380 --> 00:03:09,830 That's the super simple goal right now. 
74 00:03:09,830 --> 00:03:12,430 >> So how do we go about swapping two variables? 75 00:03:12,430 --> 00:03:17,220 Well if I propose here that these cups might represent memory in a computer. 76 00:03:17,220 --> 00:03:19,070 This is a few bytes, this is another few bytes. 77 00:03:19,070 --> 00:03:23,260 Could we have a volunteer come on up and mix us some drinks, if familiar? 78 00:03:23,260 --> 00:03:23,920 Come on up. 79 00:03:23,920 --> 00:03:24,815 What's your name? 80 00:03:24,815 --> 00:03:25,260 >> JESS: Jess. 81 00:03:25,260 --> 00:03:25,690 >> DAVID MALAN: Jess? 82 00:03:25,690 --> 00:03:26,540 Come on up, Jess. 83 00:03:26,540 --> 00:03:29,180 If you don't mind, we have to put the Google Glass on you so we can 84 00:03:29,180 --> 00:03:30,430 immortalize this. 85 00:03:30,430 --> 00:03:32,800 86 00:03:32,800 --> 00:03:34,670 OK, glass. 87 00:03:34,670 --> 00:03:37,250 Record a video. 88 00:03:37,250 --> 00:03:43,103 And OK, we are good to go with Jess here. 89 00:03:43,103 --> 00:03:43,810 All right. 90 00:03:43,810 --> 00:03:45,120 Nice to meet you. 91 00:03:45,120 --> 00:03:47,720 >> So what I'd like you to do here-- if you could, quite quickly-- 92 00:03:47,720 --> 00:03:51,040 just pour us half a glass of orange juice and half a glass of milk, 93 00:03:51,040 --> 00:03:55,710 representing effectively the numbers 1 in one cup and 2 in the other cup. 94 00:03:55,710 --> 00:04:01,380 95 00:04:01,380 --> 00:04:02,630 >> This is going to be good footage. 96 00:04:02,630 --> 00:04:04,910 97 00:04:04,910 --> 00:04:05,860 >> JESS: Sorry. 98 00:04:05,860 --> 00:04:06,330 >> DAVID MALAN: No, no. 99 00:04:06,330 --> 00:04:08,703 It's OK. 100 00:04:08,703 --> 00:04:10,120 Nice. 101 00:04:10,120 --> 00:04:12,950 All right, so we have four bytes worth of orange juice. 102 00:04:12,950 --> 00:04:14,460 We'll call it the value 1. 103 00:04:14,460 --> 00:04:16,579 Now another four bytes worth of milk. 104 00:04:16,579 --> 00:04:18,519 We'll call it the value 2. 
105 00:04:18,519 --> 00:04:20,440 So x and y, respectively. 106 00:04:20,440 --> 00:04:23,450 >> All right, so now if the task at hand-- for you, Jess, in front of all 107 00:04:23,450 --> 00:04:24,270 of your classmates-- 108 00:04:24,270 --> 00:04:28,510 is to swap the values of x and y such that we want the orange juice in the 109 00:04:28,510 --> 00:04:32,070 other cup and the milk in this cup, how might you-- before you actually do 110 00:04:32,070 --> 00:04:34,020 it-- go about doing this? 111 00:04:34,020 --> 00:04:35,220 >> OK, wise decision. 112 00:04:35,220 --> 00:04:36,340 So you need a bit more memory. 113 00:04:36,340 --> 00:04:38,190 So let's allocate a temporary cup, if you will. 114 00:04:38,190 --> 00:04:40,540 And now proceed to swap x and y. 115 00:04:40,540 --> 00:04:52,950 116 00:04:52,950 --> 00:04:53,530 >> Excellent. 117 00:04:53,530 --> 00:04:54,420 So very well done. 118 00:04:54,420 --> 00:04:55,670 Thank you so much, Jess. 119 00:04:55,670 --> 00:04:59,520 120 00:04:59,520 --> 00:05:00,020 Here you are. 121 00:05:00,020 --> 00:05:01,950 A little souvenir. 122 00:05:01,950 --> 00:05:04,350 >> OK, so obviously, super simple idea. 123 00:05:04,350 --> 00:05:07,500 Completely intuitive that we need a bit more storage space-- in this form, 124 00:05:07,500 --> 00:05:09,750 a cup-- if we actually want to swap these two variables. 125 00:05:09,750 --> 00:05:11,110 So let's do exactly that. 126 00:05:11,110 --> 00:05:14,330 Up here in between where I claim I'm going to be doing some swapping, I'll 127 00:05:14,330 --> 00:05:15,720 go ahead and declare temp. 128 00:05:15,720 --> 00:05:17,980 And I'll set it equal to, say, x. 129 00:05:17,980 --> 00:05:21,110 >> Then I'm going to change the value of x just like Jess did here with the 130 00:05:21,110 --> 00:05:23,200 milk and orange juice to be equal to y. 
131 00:05:23,200 --> 00:05:27,460 And I'm going to change y to be equal to not x, because now we would be 132 00:05:27,460 --> 00:05:29,530 stuck in a circle, but rather temp. 133 00:05:29,530 --> 00:05:33,170 Where I temporarily-- or where Jess temporarily put the orange juice 134 00:05:33,170 --> 00:05:35,460 before clobbering that cup with the milk. 135 00:05:35,460 --> 00:05:37,250 >> So let me go ahead now and make this. 136 00:05:37,250 --> 00:05:39,210 It's called noswap.c. 137 00:05:39,210 --> 00:05:41,190 And now let me run no swap. 138 00:05:41,190 --> 00:05:43,910 And indeed I see, if I expand the window a little bit, that 139 00:05:43,910 --> 00:05:45,160 x is 1, y is 2. 140 00:05:45,160 --> 00:05:47,230 And then x is 2, y is 1. 141 00:05:47,230 --> 00:05:51,910 >> But recall that on Monday we did things a little differently whereby I 142 00:05:51,910 --> 00:05:56,760 instead implemented a helper function, if you will, that was actually void. 143 00:05:56,760 --> 00:05:58,010 I called it swap. 144 00:05:58,010 --> 00:06:01,600 I gave it two parameters, and I called them a and I called them b. 145 00:06:01,600 --> 00:06:04,380 >> Frankly, I could call them x and y. 146 00:06:04,380 --> 00:06:06,040 There's nothing stopping me from doing that. 147 00:06:06,040 --> 00:06:08,140 But I would argue it's then a little ambiguous. 148 00:06:08,140 --> 00:06:11,910 Because recall from Monday that we claimed that these parameters were 149 00:06:11,910 --> 00:06:13,650 copies of the values passed in. 150 00:06:13,650 --> 00:06:15,640 So it just messes with your mind, I think, if you use 151 00:06:15,640 --> 00:06:17,370 exactly the same variables. 152 00:06:17,370 --> 00:06:20,150 >> So I'll instead call them a and b, just for clarity. 153 00:06:20,150 --> 00:06:21,840 But we could call them most anything we want. 154 00:06:21,840 --> 00:06:26,280 And I'm going to copy and paste effectively this code from up there 155 00:06:26,280 --> 00:06:27,170 down into here. 
156 00:06:27,170 --> 00:06:29,110 Because I just saw that it works. 157 00:06:29,110 --> 00:06:30,790 So that's in pretty good shape. 158 00:06:30,790 --> 00:06:37,390 And I'll change my x to a, my x to a, my y to b and my y to b. 159 00:06:37,390 --> 00:06:39,130 >> So in other words, exact same logic. 160 00:06:39,130 --> 00:06:40,850 The exact same thing that Jess did. 161 00:06:40,850 --> 00:06:44,350 And then the one thing I have to do up here, of course, is now invoke this 162 00:06:44,350 --> 00:06:45,990 function, or call this function. 163 00:06:45,990 --> 00:06:50,430 So I will call this function with two inputs, x and y, and hit Save. 164 00:06:50,430 --> 00:06:52,300 >> All right, so fundamentally the same thing. 165 00:06:52,300 --> 00:06:55,570 In fact, I've probably made the program unnecessarily complex by 166 00:06:55,570 --> 00:07:00,820 writing a function that's just taking some six lines of code whereas I 167 00:07:00,820 --> 00:07:02,970 previously had implemented this in just three. 168 00:07:02,970 --> 00:07:06,230 >> So let me go ahead now and remake this, make no swap. 169 00:07:06,230 --> 00:07:07,920 All right, I screwed up here. 170 00:07:07,920 --> 00:07:11,290 This should be an error that you might see increasingly commonly as your 171 00:07:11,290 --> 00:07:12,380 programs get more complex. 172 00:07:12,380 --> 00:07:13,470 But there's an easy fix. 173 00:07:13,470 --> 00:07:15,650 Let me scroll back up here. 174 00:07:15,650 --> 00:07:18,190 >> And what's the first error I'm seeing? 175 00:07:18,190 --> 00:07:19,520 Implicit declaration. 176 00:07:19,520 --> 00:07:21,466 What does that typically indicate? 177 00:07:21,466 --> 00:07:22,830 Oh, I forgot the prototype. 178 00:07:22,830 --> 00:07:26,900 I forgot to teach the compiler that swap is going to exist even though he 179 00:07:26,900 --> 00:07:28,920 doesn't exist at the very beginning of the program. 
180 00:07:28,920 --> 00:07:35,780 So I'm just going to say void, swap, int, a int b, semicolon. 181 00:07:35,780 --> 00:07:37,280 >> So I'm not going to reimplement it. 182 00:07:37,280 --> 00:07:39,140 But now it matches what's down here. 183 00:07:39,140 --> 00:07:42,530 And notice, the absence of a semicolon here, which is not necessary when 184 00:07:42,530 --> 00:07:43,200 implementing. 185 00:07:43,200 --> 00:07:46,010 >> So let me remake this, make no swap. 186 00:07:46,010 --> 00:07:46,910 Much better shape. 187 00:07:46,910 --> 00:07:48,130 Run no swap. 188 00:07:48,130 --> 00:07:48,740 And damn it. 189 00:07:48,740 --> 00:07:51,650 Now we're back where we were on Monday, where the thing didn't swap. 190 00:07:51,650 --> 00:07:55,410 >> And what's the intuitive explanation for why this is the case? 191 00:07:55,410 --> 00:07:56,380 Yeah? 192 00:07:56,380 --> 00:07:57,630 >> STUDENT: [INAUDIBLE]. 193 00:07:57,630 --> 00:08:04,140 194 00:08:04,140 --> 00:08:05,230 >> DAVID MALAN: Exactly. 195 00:08:05,230 --> 00:08:07,330 So a and b are copies of x and y. 196 00:08:07,330 --> 00:08:10,680 And in fact, any time you've been calling a function thus far that 197 00:08:10,680 --> 00:08:12,540 passes variables like ints-- 198 00:08:12,540 --> 00:08:14,470 just as swap is expecting here-- 199 00:08:14,470 --> 00:08:16,270 you guys have been passing in copies. 200 00:08:16,270 --> 00:08:19,150 >> Now that means it takes a little bit of time, a split second, for the 201 00:08:19,150 --> 00:08:23,270 computer to copy the bits from one variable into the bits of another. 202 00:08:23,270 --> 00:08:24,610 But that's not such a big deal. 203 00:08:24,610 --> 00:08:25,920 But they're nonetheless a copy. 204 00:08:25,920 --> 00:08:30,020 >> And so now, in the context of swap, I am in fact successfully 205 00:08:30,020 --> 00:08:31,180 changing a and b. 206 00:08:31,180 --> 00:08:33,000 In fact, let's do a quick sanity check. 
207 00:08:33,000 --> 00:08:36,830 Printf, a is %i, new line. 208 00:08:36,830 --> 00:08:38,770 And let's plug in a. 209 00:08:38,770 --> 00:08:41,830 Now let's do the same thing with b. 210 00:08:41,830 --> 00:08:43,640 And let's do the same thing here. 211 00:08:43,640 --> 00:08:47,260 >> And now, let me copy those same lines again at the bottom of the function 212 00:08:47,260 --> 00:08:51,250 after my three lines of interesting code have executed, and 213 00:08:51,250 --> 00:08:53,270 print a and b yet again. 214 00:08:53,270 --> 00:08:56,030 So now let's make this, make no swap. 215 00:08:56,030 --> 00:08:58,430 Let me make the terminal window a bit taller, so that we can see 216 00:08:58,430 --> 00:08:59,520 more of it at once. 217 00:08:59,520 --> 00:09:00,860 >> And run no swap. 218 00:09:00,860 --> 00:09:04,000 x is 1, y is 2. a is 1, b is 2. 219 00:09:04,000 --> 00:09:06,070 And then, a is 2, b is 1. 220 00:09:06,070 --> 00:09:09,390 So it is working, just like Jess did here inside of swap. 221 00:09:09,390 --> 00:09:13,090 But of course, it's having no effect on the variables in main. 222 00:09:13,090 --> 00:09:15,360 >> So we saw a trick whereby we could fix this, right? 223 00:09:15,360 --> 00:09:19,560 When you're faced with this scoping issue, you could just punt and make x 224 00:09:19,560 --> 00:09:22,400 and y what kind of variables instead? 225 00:09:22,400 --> 00:09:23,390 >> You could make them global. 226 00:09:23,390 --> 00:09:27,560 Put them at the very top of the file as we did, even in the game of 15. 227 00:09:27,560 --> 00:09:28,890 We use a global variable. 228 00:09:28,890 --> 00:09:32,420 But in the context of the game of 15, it's reasonable to have a global 229 00:09:32,420 --> 00:09:37,170 variable representing the board, because the entirety of 15.c is all 230 00:09:37,170 --> 00:09:38,650 about implementing that game. 231 00:09:38,650 --> 00:09:41,470 That's what the file exists to do. 
232 00:09:41,470 --> 00:09:44,170 >> But in this case here, I'm calling a function swap. 233 00:09:44,170 --> 00:09:45,380 I want to swap two variables. 234 00:09:45,380 --> 00:09:48,950 And it should start to feel just sloppy if the solution to all of our 235 00:09:48,950 --> 00:09:51,300 problems when we run into scope issues is make it global. 236 00:09:51,300 --> 00:09:54,730 Because very quickly our program is going to become quite a mess. 237 00:09:54,730 --> 00:09:57,760 And we did that very sparingly as a result in 15.c. 238 00:09:57,760 --> 00:10:00,470 >> But it turns out there's a better way altogether. 239 00:10:00,470 --> 00:10:05,600 Let me actually go back and delete the printf's, just to simplify this code. 240 00:10:05,600 --> 00:10:09,160 And let me propose that this, indeed, is bad. 241 00:10:09,160 --> 00:10:15,990 But if I instead add in some asterisks and stars, I can instead turn this 242 00:10:15,990 --> 00:10:18,670 function into one that's actually operational. 243 00:10:18,670 --> 00:10:25,020 >> So let me go back here and admit saying asterisks is always difficult, 244 00:10:25,020 --> 00:10:26,170 so I'll say stars. 245 00:10:26,170 --> 00:10:27,660 I'll just fess up to that one. 246 00:10:27,660 --> 00:10:28,190 All right. 247 00:10:28,190 --> 00:10:30,190 And now, what am I going to do instead? 248 00:10:30,190 --> 00:10:34,130 >> So first of all, I'm going to specify that instead of passing an int into 249 00:10:34,130 --> 00:10:37,980 the swap function, I'm instead going to say int star. 250 00:10:37,980 --> 00:10:39,170 Now, what does the star indicate? 251 00:10:39,170 --> 00:10:41,970 This is that notion of a pointer that Binky, the claymation character, was 252 00:10:41,970 --> 00:10:43,465 referring to a moment ago. 253 00:10:43,465 --> 00:10:47,610 >> So if we say int star, the meaning of this now is that a is not going to be 254 00:10:47,610 --> 00:10:49,110 passed in by its value. 
255 00:10:49,110 --> 00:10:50,350 It's not going to be copied in. 256 00:10:50,350 --> 00:10:54,700 Rather, the address of a is going to be passed in. 257 00:10:54,700 --> 00:10:57,840 >> So recall that inside of your computer is a whole bunch of memory, otherwise 258 00:10:57,840 --> 00:10:58,760 known as RAM. 259 00:10:58,760 --> 00:11:00,520 And that RAM is just a whole bunch of bytes. 260 00:11:00,520 --> 00:11:03,320 So if your Mac or your PC has two gigabytes, you have 2 261 00:11:03,320 --> 00:11:05,760 billion bytes of memory. 262 00:11:05,760 --> 00:11:08,440 >> Now let's just suppose that just to keep things nice and orderly, we 263 00:11:08,440 --> 00:11:09,450 assign an address-- 264 00:11:09,450 --> 00:11:10,170 a number-- 265 00:11:10,170 --> 00:11:12,270 to every byte of RAM in your computer. 266 00:11:12,270 --> 00:11:15,410 The very first byte of those 2 billion is byte number zero. 267 00:11:15,410 --> 00:11:18,572 The next one is byte number one, number two, all the way on up, dot dot 268 00:11:18,572 --> 00:11:20,530 dot, to roughly 2 billion. 269 00:11:20,530 --> 00:11:23,640 >> So you can number the bytes of memory in your computer. 270 00:11:23,640 --> 00:11:26,460 So let's assume that that's what we mean by an address. 271 00:11:26,460 --> 00:11:31,360 So when I see int star a, what's going to be passed into swap now is the 272 00:11:31,360 --> 00:11:32,830 address of a. 273 00:11:32,830 --> 00:11:37,150 Not its value, but whatever its postal address is, so to speak-- 274 00:11:37,150 --> 00:11:38,810 its location in RAM. 275 00:11:38,810 --> 00:11:41,250 >> And similarly for b, I'm going to say the same thing. 276 00:11:41,250 --> 00:11:42,720 Int, star, b. 277 00:11:42,720 --> 00:11:46,350 As an aside, technically the star could go in other locations. 278 00:11:46,350 --> 00:11:50,140 But we'll standardize on the star being right next to the data type. 
279 00:11:50,140 --> 00:11:54,080 >> So swap's signature now means, give me the address of an int, and call 280 00:11:54,080 --> 00:11:55,400 that address a. 281 00:11:55,400 --> 00:11:58,690 And give me another address of an int and call that address b. 282 00:11:58,690 --> 00:12:01,120 >> But now my code here has to change. 283 00:12:01,120 --> 00:12:03,470 Because if I declare int temp-- 284 00:12:03,470 --> 00:12:05,580 which is still of type int-- 285 00:12:05,580 --> 00:12:08,700 but I store in it a, what kind of value? 286 00:12:08,700 --> 00:12:12,870 To be clear, what am I putting in temp with the code as written right now? 287 00:12:12,870 --> 00:12:14,360 >> I'm putting the location in a. 288 00:12:14,360 --> 00:12:16,500 But I don't care about the location now, right? 289 00:12:16,500 --> 00:12:21,940 Temp exists just as Jess' third cup existed, for what purpose? 290 00:12:21,940 --> 00:12:23,090 To store a value. 291 00:12:23,090 --> 00:12:24,830 Milk or orange juice. 292 00:12:24,830 --> 00:12:28,520 Not to actually store the address of either of those things, which feels a 293 00:12:28,520 --> 00:12:31,200 little nonsensical in this real world context anyway. 294 00:12:31,200 --> 00:12:34,990 >> So really, what I want to put in temp is not the address of a, but the 295 00:12:34,990 --> 00:12:36,180 contents of a. 296 00:12:36,180 --> 00:12:41,930 So if a is a number like 123, this is the 123rd byte of memory that a just 297 00:12:41,930 --> 00:12:45,090 happens to be occupying, that the value in a happens to be occupying. 298 00:12:45,090 --> 00:12:49,040 >> If I want to go to that address, I need to say star a. 299 00:12:49,040 --> 00:12:52,610 Similarly, if I were to change what's at the address a, I change 300 00:12:52,610 --> 00:12:53,570 this to star a. 301 00:12:53,570 --> 00:12:58,185 If I want to store in what's at the location a what's at the location 302 00:12:58,185 --> 00:13:02,180 b, star b. 
303 00:13:02,180 --> 00:13:05,340 >> So in short, even if this isn't quite sinking in yet-- and I wouldn't expect 304 00:13:05,340 --> 00:13:06,560 that it would so fast-- 305 00:13:06,560 --> 00:13:11,100 realize that all I'm doing is prefixing these stars to my variables, 306 00:13:11,100 --> 00:13:13,350 saying don't grab the values. 307 00:13:13,350 --> 00:13:14,520 Don't change the values. 308 00:13:14,520 --> 00:13:17,600 But rather, go to those addresses and get the value. 309 00:13:17,600 --> 00:13:21,430 Go to that address and change the value there. 310 00:13:21,430 --> 00:13:25,500 >> So now let me scroll back up to the top, just to fix this line here, to 311 00:13:25,500 --> 00:13:27,690 change the prototype to match. 312 00:13:27,690 --> 00:13:30,280 But I now need to do one other thing. 313 00:13:30,280 --> 00:13:35,500 Intuitively, if I've changed the types of arguments that swap is expecting, 314 00:13:35,500 --> 00:13:37,245 what else do I need to change in my code? 315 00:13:37,245 --> 00:13:39,750 316 00:13:39,750 --> 00:13:40,840 >> When I call swap. 317 00:13:40,840 --> 00:13:43,340 Because right now, what am I passing to swap still? 318 00:13:43,340 --> 00:13:47,450 The value of x and the value of y, or the milk and the orange juice. 319 00:13:47,450 --> 00:13:48,510 But I don't want to do that. 320 00:13:48,510 --> 00:13:51,060 I instead want to pass in what? 321 00:13:51,060 --> 00:13:53,050 The location of x and the location of y. 322 00:13:53,050 --> 00:13:55,300 What are their postal addresses, so to speak. 323 00:13:55,300 --> 00:13:57,600 >> So to do that, there's an ampersand. 324 00:13:57,600 --> 00:13:59,260 Ampersand kind of sounds like address. 325 00:13:59,260 --> 00:14:03,240 So, ampersand: the address of x, and the address of y. 326 00:14:03,240 --> 00:14:06,790 So it's deliberate that we use ampersands when calling the function, 327 00:14:06,790 --> 00:14:10,230 and stars when declaring and when implementing the function. 
328 00:14:10,230 --> 00:14:14,220 >> And just think of ampersand as the address of operator, and star as the 329 00:14:14,220 --> 00:14:15,490 go there operator-- 330 00:14:15,490 --> 00:14:18,640 or, more properly, the dereference operator. 331 00:14:18,640 --> 00:14:23,480 So that's a whole lot of words just to say that now, hopefully, swap is going 332 00:14:23,480 --> 00:14:24,440 to be correct. 333 00:14:24,440 --> 00:14:26,550 >> Let me go ahead and make-- 334 00:14:26,550 --> 00:14:30,940 let's actually rename the file, lest this program still be called no swap. 335 00:14:30,940 --> 00:14:33,240 I claim that we'll call it swap.c now. 336 00:14:33,240 --> 00:14:35,670 So make, swap. 337 00:14:35,670 --> 00:14:37,520 Dot, slash, swap. 338 00:14:37,520 --> 00:14:40,210 >> And now indeed, x is 1, y is 2. 339 00:14:40,210 --> 00:14:44,040 And then, x is 2, y is one. 340 00:14:44,040 --> 00:14:46,500 Well let's see if we can't do this a little bit differently as to what's 341 00:14:46,500 --> 00:14:47,180 going on here. 342 00:14:47,180 --> 00:14:51,250 First, let me zoom in on our drawing screen here. 343 00:14:51,250 --> 00:14:54,160 And let me propose for a moment-- and whatever I draw here will be mirrored 344 00:14:54,160 --> 00:14:58,660 up there now-- let me propose that here's a whole bunch of memory, or 345 00:14:58,660 --> 00:15:00,540 RAM, inside of my computer. 346 00:15:00,540 --> 00:15:04,140 >> And this will be byte number, let's say, 1. 347 00:15:04,140 --> 00:15:05,720 This will be byte number 2. 348 00:15:05,720 --> 00:15:08,220 And I'll do a whole bunch more, and then a bunch of dot dot dots to 349 00:15:08,220 --> 00:15:10,880 indicate that there's 2 billion of these things. 350 00:15:10,880 --> 00:15:13,520 4, 5, and so forth. 351 00:15:13,520 --> 00:15:17,055 >> So there are the first five bytes of my computer's memory. 352 00:15:17,055 --> 00:15:17,560 All right? 353 00:15:17,560 --> 00:15:19,060 Very few out of 2 billion. 
354 00:15:19,060 --> 00:15:21,120 But now I'm going to propose the following. 355 00:15:21,120 --> 00:15:27,490 I'm going to propose that x is going to store the number 1, and y is going 356 00:15:27,490 --> 00:15:29,690 to store the number 2. 357 00:15:29,690 --> 00:15:35,000 And let me go ahead now and represent these values as follows. 358 00:15:35,000 --> 00:15:41,510 >> Let's do this as follows. 359 00:15:41,510 --> 00:15:42,870 Give me just one second. 360 00:15:42,870 --> 00:15:44,150 One second. 361 00:15:44,150 --> 00:15:45,680 OK. 362 00:15:45,680 --> 00:15:47,560 I want to make this a little-- 363 00:15:47,560 --> 00:15:50,440 let's do this again. 364 00:15:50,440 --> 00:15:53,250 Otherwise I'm going to end up using the same numbers, unintentionally, 365 00:15:53,250 --> 00:15:54,230 multiple times. 366 00:15:54,230 --> 00:15:57,320 >> So just so we have different numbers to talk about, let's call this byte 367 00:15:57,320 --> 00:16:03,391 number 123, 124, 125, 126, and dot dot dot. 368 00:16:03,391 --> 00:16:08,400 And let me claim now that I'm going to put the value 1 here, and the value 2 369 00:16:08,400 --> 00:16:11,990 here, otherwise known as x and y. 370 00:16:11,990 --> 00:16:15,300 So it just so happens that this is x, this is y. 371 00:16:15,300 --> 00:16:18,180 >> And just by some random chance, the computer, the operating system, 372 00:16:18,180 --> 00:16:21,890 happened to put x at location number 123. 373 00:16:21,890 --> 00:16:25,590 And y ended up at location 124-- 374 00:16:25,590 --> 00:16:26,330 damn it. 375 00:16:26,330 --> 00:16:28,700 I should have fixed this. 376 00:16:28,700 --> 00:16:34,040 Oh man, do I really want to do this? 377 00:16:34,040 --> 00:16:37,340 Yes, I want to fix this and be proper about this today. 378 00:16:37,340 --> 00:16:39,950 Sorry, new at this. 379 00:16:39,950 --> 00:16:45,020 >> 127, 131, and I didn't want to be this complex, but why did I change the 380 00:16:45,020 --> 00:16:46,340 numbers there? 
381 00:16:46,340 --> 00:16:48,360 Because I want the ints to actually be four bytes. 382 00:16:48,360 --> 00:16:49,810 So let's be super anal about this. 383 00:16:49,810 --> 00:16:53,800 So that if 1 happens to be at address 123, the 2 is going to be at address 384 00:16:53,800 --> 00:16:55,730 127 because it's just 4 bytes away. 385 00:16:55,730 --> 00:16:56,210 That's all. 386 00:16:56,210 --> 00:16:58,640 And we'll forget about all of the other addresses in the world. 387 00:16:58,640 --> 00:17:03,320 >> So x is at location 123, y is at location 127. 388 00:17:03,320 --> 00:17:05,770 And now, what do I actually want to do? 389 00:17:05,770 --> 00:17:10,099 When I call swap now, what's actually going on? 390 00:17:10,099 --> 00:17:14,920 Well, when I call swap, I'm passing in the address of x and the address of y. 391 00:17:14,920 --> 00:17:18,540 So for instance, if these two pieces of paper now represent the two 392 00:17:18,540 --> 00:17:23,510 arguments a and b to swap, what am I going to write on the first of these, 393 00:17:23,510 --> 00:17:27,720 which I'm going to refer to as a? 394 00:17:27,720 --> 00:17:30,610 >> Exactly, 123. 395 00:17:30,610 --> 00:17:31,905 So this I claim is a. 396 00:17:31,905 --> 00:17:32,955 This is the parameter a. 397 00:17:32,955 --> 00:17:35,856 I'm putting the address of x in there. 398 00:17:35,856 --> 00:17:38,152 >> What's that? 399 00:17:38,152 --> 00:17:40,890 >> What's that? 400 00:17:40,890 --> 00:17:41,190 >> No, no. 401 00:17:41,190 --> 00:17:41,720 That's OK. 402 00:17:41,720 --> 00:17:42,570 Still good, still good. 403 00:17:42,570 --> 00:17:43,530 So this is a. 404 00:17:43,530 --> 00:17:46,240 And now on the second piece of paper, this is going to be b, and what am I 405 00:17:46,240 --> 00:17:49,010 going to be writing on this piece of paper? 406 00:17:49,010 --> 00:17:50,080 127. 
407 00:17:50,080 --> 00:17:53,720 >> So the only thing that's changed since our previous telling of this story is, 408 00:17:53,720 --> 00:17:58,590 rather than literally 1 and 2, I'm going to pass in 123 and 127. 409 00:17:58,590 --> 00:18:02,130 And I'm now going to put these inside of this box, all right? 410 00:18:02,130 --> 00:18:04,640 So that black box now represents the swap function. 411 00:18:04,640 --> 00:18:07,230 >> Meanwhile, let's now have someone implement the swap function. 412 00:18:07,230 --> 00:18:09,090 Would someone up here like to volunteer? 413 00:18:09,090 --> 00:18:09,560 Come on up. 414 00:18:09,560 --> 00:18:11,080 What's your name? 415 00:18:11,080 --> 00:18:11,460 Charlie. 416 00:18:11,460 --> 00:18:12,080 All right, Charlie. 417 00:18:12,080 --> 00:18:14,810 Come on up. 418 00:18:14,810 --> 00:18:17,310 >> So Charlie is going to play the role of our black box. 419 00:18:17,310 --> 00:18:21,460 And Charlie, what I'd like you to do now is implement swap in such a way 420 00:18:21,460 --> 00:18:25,320 that, given those two addresses, you were actually going 421 00:18:25,320 --> 00:18:26,330 to change the values. 422 00:18:26,330 --> 00:18:28,290 And I'll whisper in your ear how to run the TV here. 423 00:18:28,290 --> 00:18:29,930 >> So go ahead, and you're the black box. 424 00:18:29,930 --> 00:18:30,920 Reach in there. 425 00:18:30,920 --> 00:18:34,054 What values do you see for a, and what values do you see for b? 426 00:18:34,054 --> 00:18:36,740 >> CHARLIE: a is 123 and b is 127. 427 00:18:36,740 --> 00:18:37,530 >> DAVID MALAN: OK, exactly. 428 00:18:37,530 --> 00:18:38,940 Now pause there for just a moment. 429 00:18:38,940 --> 00:18:41,680 The first thing you're going to do now, according to the code-- which 430 00:18:41,680 --> 00:18:43,220 I'll now pull up on the screen-- 431 00:18:43,220 --> 00:18:46,750 is going to be to allocate a little bit of memory called temp. 
432 00:18:46,750 --> 00:18:48,850 So I'm going to go ahead and give you that memory. 433 00:18:48,850 --> 00:18:52,210 >> So this is going to be a third variable that you have accessible to 434 00:18:52,210 --> 00:18:54,080 you called temp. 435 00:18:54,080 --> 00:18:57,120 And what are you going to write on the temp piece of paper? 436 00:18:57,120 --> 00:19:02,524 437 00:19:02,524 --> 00:19:03,470 >> CHARLIE: Pointers, right? 438 00:19:03,470 --> 00:19:04,790 >> DAVID MALAN: OK, well not necessarily pointers. 439 00:19:04,790 --> 00:19:07,230 So the line of code that I've highlighted on the right hand side, 440 00:19:07,230 --> 00:19:07,900 let's start there. 441 00:19:07,900 --> 00:19:08,890 It says star a. 442 00:19:08,890 --> 00:19:11,670 So a is currently storing the number 123. 443 00:19:11,670 --> 00:19:16,660 And just intuitively, what did star 123 mean? 444 00:19:16,660 --> 00:19:21,630 >> But specifically, if a is 123, star a means what? 445 00:19:21,630 --> 00:19:22,560 The value of a. 446 00:19:22,560 --> 00:19:24,580 Or more casually, go there. 447 00:19:24,580 --> 00:19:28,620 So let me propose that, holding the a in your hand, go ahead and treat that 448 00:19:28,620 --> 00:19:29,430 as though it's a map. 449 00:19:29,430 --> 00:19:32,940 And walk yourself over to the computer's memory, and find us what is 450 00:19:32,940 --> 00:19:36,520 at location 123. 451 00:19:36,520 --> 00:19:37,720 Exactly. 452 00:19:37,720 --> 00:19:41,100 >> So we see at location 123 is what, obviously? 453 00:19:41,100 --> 00:19:44,240 OK, so what value now are you going to put into temp? 454 00:19:44,240 --> 00:19:44,750 Exactly. 455 00:19:44,750 --> 00:19:45,600 So go ahead and do that. 456 00:19:45,600 --> 00:19:51,280 And write the number 1 on the piece of paper that's currently titled temp. 457 00:19:51,280 --> 00:19:53,540 >> And now the next step that you're going to implement 458 00:19:53,540 --> 00:19:54,310 is going to be what. 
459 00:19:54,310 --> 00:19:57,820 Well, on the right hand side of the next line of code is star b. b, of 460 00:19:57,820 --> 00:19:59,260 course, stores an address. 461 00:19:59,260 --> 00:20:02,270 That address is 127. 462 00:20:02,270 --> 00:20:06,620 Star b means what, casually speaking? 463 00:20:06,620 --> 00:20:08,700 >> Go to that location. 464 00:20:08,700 --> 00:20:14,988 So go ahead and find us what's at location 127. 465 00:20:14,988 --> 00:20:15,480 OK. 466 00:20:15,480 --> 00:20:19,170 Of course, at location 127, is still the value 2. 467 00:20:19,170 --> 00:20:24,060 So what are you going to now store at whatever's at the location in a? 468 00:20:24,060 --> 00:20:26,860 So star a means go to the location a. 469 00:20:26,860 --> 00:20:29,770 What is the location a? 470 00:20:29,770 --> 00:20:30,430 >> Exactly. 471 00:20:30,430 --> 00:20:34,190 So now, if you want to change what's at that location-- 472 00:20:34,190 --> 00:20:36,470 I'll go ahead and run the eraser here. 473 00:20:36,470 --> 00:20:37,760 And now put it back on the brush. 474 00:20:37,760 --> 00:20:42,190 What number are you going to write in that blank box now? 475 00:20:42,190 --> 00:20:42,850 >> Exactly. 476 00:20:42,850 --> 00:20:46,470 So this line of code, to be clear-- let me pause what Charlie's doing and 477 00:20:46,470 --> 00:20:51,730 point out here, what he's just done is write into that box at location 123 478 00:20:51,730 --> 00:20:55,150 the value that was previously at b. 479 00:20:55,150 --> 00:20:59,140 And so we've now implemented indeed this second line of code. 480 00:20:59,140 --> 00:21:01,920 >> Now unfortunately, there's still one line remaining. 481 00:21:01,920 --> 00:21:04,900 Now what is in temp, literally? 482 00:21:04,900 --> 00:21:06,200 It's obviously the number one. 483 00:21:06,200 --> 00:21:07,020 That's not an address. 484 00:21:07,020 --> 00:21:09,380 It's just a number, sort of a variable from week one.
485 00:21:09,380 --> 00:21:13,520 >> And now when you say star b, that means go to the address b, which is of 486 00:21:13,520 --> 00:21:15,090 course here. 487 00:21:15,090 --> 00:21:16,020 So once you get there-- 488 00:21:16,020 --> 00:21:18,320 I'll go ahead and erase what's actually there-- and what are you 489 00:21:18,320 --> 00:21:20,820 going to write now at location 127? 490 00:21:20,820 --> 00:21:22,010 >> CHARLIE: Temp, which is one. 491 00:21:22,010 --> 00:21:23,430 >> DAVID MALAN: Temp, which is one. 492 00:21:23,430 --> 00:21:25,670 And what happens to temp in the end? 493 00:21:25,670 --> 00:21:26,600 Well, we don't really know. 494 00:21:26,600 --> 00:21:27,420 We don't really care. 495 00:21:27,420 --> 00:21:31,090 Any time we've implemented a function thus far, any local variables you have 496 00:21:31,090 --> 00:21:31,890 are indeed local. 497 00:21:31,890 --> 00:21:33,060 And they just disappear. 498 00:21:33,060 --> 00:21:35,040 They're reclaimed by the operating system eventually. 499 00:21:35,040 --> 00:21:39,800 >> So the fact that temp still has the value 1 is sort of fundamentally 500 00:21:39,800 --> 00:21:41,150 uninteresting to us. 501 00:21:41,150 --> 00:21:43,100 All right, so a round of applause if we could for Charlie. 502 00:21:43,100 --> 00:21:46,400 Very well done. 503 00:21:46,400 --> 00:21:51,520 >> All right, so what more does this mean we can do? 504 00:21:51,520 --> 00:21:54,400 So it turns out that we've been telling a few white lies 505 00:21:54,400 --> 00:21:55,540 for quite some time. 506 00:21:55,540 --> 00:21:59,990 Indeed, it turns out that a string, all of this time, is not really a 507 00:21:59,990 --> 00:22:02,190 sequence of characters per se. 508 00:22:02,190 --> 00:22:03,980 It kind of is that intuitively. 
509 00:22:03,980 --> 00:22:08,270 >> But technically speaking, string is a data type that we declared inside of 510 00:22:08,270 --> 00:22:12,170 the CS50 library to simplify the world for the first few weeks of class. 511 00:22:12,170 --> 00:22:20,130 What a string really is is the address of a character somewhere in RAM. 512 00:22:20,130 --> 00:22:25,530 A string is really a number, like 123 or 127, that happens to demarcate 513 00:22:25,530 --> 00:22:28,420 where a string begins in your computer's memory. 514 00:22:28,420 --> 00:22:31,870 >> But it doesn't represent the string, per se, itself. 515 00:22:31,870 --> 00:22:33,460 And we can see this as follows. 516 00:22:33,460 --> 00:22:35,980 Let me go ahead and open up some code that's among 517 00:22:35,980 --> 00:22:38,340 today's source code examples. 518 00:22:38,340 --> 00:22:42,225 And I'm going to go ahead and open up, let's say, compare-0.c. 519 00:22:42,225 --> 00:22:44,830 520 00:22:44,830 --> 00:22:48,790 This is a buggy program that is going to be implemented as follows. 521 00:22:48,790 --> 00:22:49,040 >> First, 522 00:22:49,040 --> 00:22:50,420 I'm going to say something. 523 00:22:50,420 --> 00:22:52,660 Then I'm going to go ahead and get a string from the user 524 00:22:52,660 --> 00:22:53,750 in that next line. 525 00:22:53,750 --> 00:22:55,370 Then I'm going to say it again. 526 00:22:55,370 --> 00:22:57,540 Then I'm going to get another string from the user. 527 00:22:57,540 --> 00:23:00,390 >> And notice, I'm storing one of the strings in a variable called s, and 528 00:23:00,390 --> 00:23:03,040 another of these strings in a variable called t. 529 00:23:03,040 --> 00:23:07,480 And now I'm going to claim, very reasonably, that if s equals equals t, 530 00:23:07,480 --> 00:23:08,940 the strings are the same. 531 00:23:08,940 --> 00:23:09,970 You typed the same thing. 532 00:23:09,970 --> 00:23:11,830 Else, the strings are not the same thing.
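The source of compare-0.c is not in the transcript, so here is only a sketch of the comparison it is described as making. It assumes two fixed arrays stand in for the two strings typed by the user; `same_address` and `demo_same_address` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

// Compares two strings the way "s == t" does: by address, not by content.
bool same_address(const char *s, const char *t)
{
    return s == t;
}

// Two separate arrays that hold the same five letters.
void demo_same_address(void)
{
    char first[] = "hello";
    char second[] = "hello";
    assert(!same_address(first, second)); // same text, different locations
    assert(same_address(first, first));   // a string's address equals itself
}
```

The first assertion is the bug in miniature: the letters match, yet the equality test reports a difference because each array lives somewhere else in memory.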
533 00:23:11,830 --> 00:23:15,440 >> After all, if we input two ints, two chars, two floats, two doubles, any of 534 00:23:15,440 --> 00:23:18,400 the data types we've talked about thus far to compare them-- 535 00:23:18,400 --> 00:23:22,070 recall we made very clear a while ago that you don't do this, because a 536 00:23:22,070 --> 00:23:25,840 single equal sign is of course the assignment operator. 537 00:23:25,840 --> 00:23:26,820 So that would be a bug. 538 00:23:26,820 --> 00:23:29,260 >> We use the equal equal sign, which indeed compares 539 00:23:29,260 --> 00:23:31,050 things for true equality. 540 00:23:31,050 --> 00:23:32,275 But I claim this is buggy. 541 00:23:32,275 --> 00:23:37,400 If I go ahead and make compare zero, and then do dot slash compare zero. 542 00:23:37,400 --> 00:23:39,700 And I type in, let's say, hello. 543 00:23:39,700 --> 00:23:41,590 And then let's say hello again. 544 00:23:41,590 --> 00:23:46,040 Literally the same thing, the computer claims I typed different things. 545 00:23:46,040 --> 00:23:47,640 >> Now maybe I just mistyped something. 546 00:23:47,640 --> 00:23:49,910 I'll type my name this time. 547 00:23:49,910 --> 00:23:52,580 I mean, hello. 548 00:23:52,580 --> 00:23:54,770 Hello. 549 00:23:54,770 --> 00:23:57,360 It's different every single time. 550 00:23:57,360 --> 00:23:58,430 >> Well, why is that? 551 00:23:58,430 --> 00:24:00,140 What's really going on underneath the hood? 552 00:24:00,140 --> 00:24:03,270 Well, what's really going on underneath the hood is the string that 553 00:24:03,270 --> 00:24:07,410 I typed in that first time for instance is the word hello, of course. 554 00:24:07,410 --> 00:24:11,660 But if we represent this underneath the hood, recall that a 555 00:24:11,660 --> 00:24:13,470 string is an array. 556 00:24:13,470 --> 00:24:15,040 And we've said as much in the past.
557 00:24:15,040 --> 00:24:20,200 >> So if I draw that array like this, I'm going to represent something quite 558 00:24:20,200 --> 00:24:23,030 similar to what we did a moment ago. 559 00:24:23,030 --> 00:24:25,390 And there's actually something special here, too. 560 00:24:25,390 --> 00:24:28,090 What did we determine was at the end of every string? 561 00:24:28,090 --> 00:24:30,760 Yeah, this backslash zero, which is just the way of representing, 562 00:24:30,760 --> 00:24:33,610 literally, 00000000. 563 00:24:33,610 --> 00:24:35,680 Eight 0 bits in a row. 564 00:24:35,680 --> 00:24:37,610 >> I don't know, frankly, what's after this. 565 00:24:37,610 --> 00:24:40,090 That's just a bunch more RAM inside of my computer. 566 00:24:40,090 --> 00:24:40,970 But this is an array. 567 00:24:40,970 --> 00:24:42,260 We talked about arrays before. 568 00:24:42,260 --> 00:24:45,010 And we typically talk about arrays as being location zero, 569 00:24:45,010 --> 00:24:46,580 then one, then two. 570 00:24:46,580 --> 00:24:47,950 But that's just for convenience. 571 00:24:47,950 --> 00:24:49,380 And that's entirely relative. 572 00:24:49,380 --> 00:24:53,010 >> When you're actually getting memory from the computer, it's of course any 573 00:24:53,010 --> 00:24:55,450 2 billion some odd bytes, potentially. 574 00:24:55,450 --> 00:24:59,100 So really underneath the hood, all this time, yes. 575 00:24:59,100 --> 00:25:01,670 This might very well be bracket zero. 576 00:25:01,670 --> 00:25:04,780 But if you dig even deeper underneath the hood, that's really 577 00:25:04,780 --> 00:25:07,000 address number 123. 578 00:25:07,000 --> 00:25:09,150 This is address 124. 579 00:25:09,150 --> 00:25:11,040 This is address 125. 580 00:25:11,040 --> 00:25:12,540 >> And I didn't screw up this time. 581 00:25:12,540 --> 00:25:15,840 These are now one byte apart for what reason? 582 00:25:15,840 --> 00:25:17,930 How big is a char? 583 00:25:17,930 --> 00:25:19,170 A char is just one byte.
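To connect the bracket notation with the addresses drawn here, a short sketch: indexing with `s[i]` means "start at address s, move i chars forward, and look there". The helper name `char_at` is hypothetical, and the caller is assumed to keep `i` within the string, terminator included.

```c
#include <assert.h>
#include <stddef.h>

// Returns the character i bytes past the address s, which is what s[i] means.
char char_at(const char *s, size_t i)
{
    assert(s != NULL);
    return *(s + i);
}
```

Because `sizeof(char)` is defined to be 1, stepping from one character to the next adds exactly one to the address, matching the 123, 124, 125 of the drawing.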
584 00:25:19,170 --> 00:25:20,570 An int is typically four bytes. 585 00:25:20,570 --> 00:25:24,850 So that's why I made it 123, 127, 131 and so forth. 586 00:25:24,850 --> 00:25:27,560 Now I can keep the math simpler and just do plus 1. 587 00:25:27,560 --> 00:25:30,510 And this is now what's really going on underneath the hood. 588 00:25:30,510 --> 00:25:37,760 >> So when you declare something like this, string s, this is actually-- 589 00:25:37,760 --> 00:25:39,170 it turns out-- 590 00:25:39,170 --> 00:25:41,190 char star. 591 00:25:41,190 --> 00:25:44,640 Star, of course, means address, aka pointer. 592 00:25:44,640 --> 00:25:46,200 So it's the address of something. 593 00:25:46,200 --> 00:25:47,510 What is it the address of? 594 00:25:47,510 --> 00:25:47,760 >> Well-- 595 00:25:47,760 --> 00:25:51,680 I'm the only one who can see the very important point I'm making, or think 596 00:25:51,680 --> 00:25:52,560 I'm making. 597 00:25:52,560 --> 00:25:55,270 So string-- 598 00:25:55,270 --> 00:25:57,180 the sad thing is I have a monitor right there where I 599 00:25:57,180 --> 00:25:58,100 could have seen that. 600 00:25:58,100 --> 00:26:00,990 >> All right, so string s is what I declared previously. 601 00:26:00,990 --> 00:26:04,600 But it turns out, thanks to a little magic in the CS50 library, all this 602 00:26:04,600 --> 00:26:08,780 time string has literally been char star. 603 00:26:08,780 --> 00:26:11,310 The star again means pointer or address. 604 00:26:11,310 --> 00:26:14,180 The fact that it's flanking the word char means it's the 605 00:26:14,180 --> 00:26:15,970 address of a character. 606 00:26:15,970 --> 00:26:23,100 >> So if get string is called, and I type in H-E-L-L-O, propose now what has get 607 00:26:23,100 --> 00:26:27,330 string literally been returning all of this time, even though we've rather 608 00:26:27,330 --> 00:26:29,980 oversimplified the world? 
609 00:26:29,980 --> 00:26:33,310 What does get string actually return as its return value? 610 00:26:33,310 --> 00:26:35,830 611 00:26:35,830 --> 00:26:38,720 >> 123 in this case, for instance. 612 00:26:38,720 --> 00:26:42,630 We've previously said that get string simply returns a string, a sequence of 613 00:26:42,630 --> 00:26:43,300 characters. 614 00:26:43,300 --> 00:26:44,790 But that's a bit of a white lie. 615 00:26:44,790 --> 00:26:48,010 The way get string really works underneath the hood is it gets a 616 00:26:48,010 --> 00:26:48,930 string from the user. 617 00:26:48,930 --> 00:26:51,530 It plops the characters that he or she types in memory. 618 00:26:51,530 --> 00:26:54,680 It puts a backslash zero at the end of that sequence of characters. 619 00:26:54,680 --> 00:26:57,310 >> But then what does get string literally return? 620 00:26:57,310 --> 00:27:02,710 It literally returns the address of the very first byte in the RAM that 621 00:27:02,710 --> 00:27:04,130 it used for that string. 622 00:27:04,130 --> 00:27:07,500 And it turns out that just by returning a single address of the 623 00:27:07,500 --> 00:27:12,120 first character in the string, that is sufficient for finding the entirety of 624 00:27:12,120 --> 00:27:12,630 the string. 625 00:27:12,630 --> 00:27:16,930 >> In other words, get string does not have to return 123 and 124 and 125. 626 00:27:16,930 --> 00:27:19,950 It doesn't have to give me a long list of all of the bytes that 627 00:27:19,950 --> 00:27:20,740 my string is using. 628 00:27:20,740 --> 00:27:22,670 Because one, they're all back to back. 629 00:27:22,670 --> 00:27:28,160 And two, based on the first address, I can figure out where the string ends. 630 00:27:28,160 --> 00:27:29,910 How? 631 00:27:29,910 --> 00:27:33,490 >> The special null character, the backslash zero at the end.
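Finding the end of a string from nothing but its first address can be sketched as a loop that walks forward until it meets the backslash zero. The name `string_length` is hypothetical; the standard library's `strlen` serves the same purpose.

```c
#include <assert.h>
#include <stddef.h>

// Counts the characters before the terminating '\0', starting at address s.
size_t string_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
    {
        n++;
    }
    return n;
}
```

Note that the loop only works because of the terminator: handed an address where no backslash zero follows, it would keep reading past the end of the data.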
632 00:27:33,490 --> 00:27:35,430 So in other words, if you pass around-- 633 00:27:35,430 --> 00:27:36,530 inside of variables-- 634 00:27:36,530 --> 00:27:41,300 the address of a char, and you assume that at the end of any string, any 635 00:27:41,300 --> 00:27:45,040 sequence of characters as we humans think of strings, if you assume that 636 00:27:45,040 --> 00:27:48,600 at the end of any such string there's a backslash zero, you're golden. 637 00:27:48,600 --> 00:27:52,430 Because you can always find the end of a string. 638 00:27:52,430 --> 00:27:54,870 >> Now what's really then going on in this program? 639 00:27:54,870 --> 00:27:59,990 Why is this program, compare-0.c, buggy? 640 00:27:59,990 --> 00:28:01,690 What is actually being compared? 641 00:28:01,690 --> 00:28:02,420 Yeah? 642 00:28:02,420 --> 00:28:05,000 >> STUDENT: [INAUDIBLE]. 643 00:28:05,000 --> 00:28:05,730 >> DAVID MALAN: Exactly. 644 00:28:05,730 --> 00:28:08,350 It's comparing the locations of the strings. 645 00:28:08,350 --> 00:28:12,420 So if the user has typed in hello once, as I did, memory might end up 646 00:28:12,420 --> 00:28:13,430 looking like this. 647 00:28:13,430 --> 00:28:18,210 If the user then types in hello again, but by calling get string again, C is 648 00:28:18,210 --> 00:28:21,800 not particularly clever unless you teach it to be clever by writing code. 649 00:28:21,800 --> 00:28:22,430 >> C-- 650 00:28:22,430 --> 00:28:23,860 and computers more generally-- 651 00:28:23,860 --> 00:28:27,370 if you type in the word hello again, you know what you're going to get. 652 00:28:27,370 --> 00:28:31,480 You're just going to get a second array of memory that, yes, happens to be 653 00:28:31,480 --> 00:28:35,510 storing H-E-L-L-O and so forth. 654 00:28:35,510 --> 00:28:38,240 >> It's going to look the same to us humans, but this address 655 00:28:38,240 --> 00:28:39,460 might not be 123.
656 00:28:39,460 --> 00:28:42,470 It might just so happen that the operating system has some available 657 00:28:42,470 --> 00:28:45,430 space for instance at location-- 658 00:28:45,430 --> 00:28:49,820 let's say something arbitrary, like this is location 200. 659 00:28:49,820 --> 00:28:51,620 And this is location 201. 660 00:28:51,620 --> 00:28:53,060 And this is location 202. 661 00:28:53,060 --> 00:28:55,730 We have no idea where that's going to be in memory. 662 00:28:55,730 --> 00:28:59,110 >> But what this means is that what is going to be stored ultimately in s? 663 00:28:59,110 --> 00:29:00,750 The number 123. 664 00:29:00,750 --> 00:29:04,860 What's going to be stored in t, in this arbitrary example? 665 00:29:04,860 --> 00:29:06,300 The number 200. 666 00:29:06,300 --> 00:29:11,410 And all that means then is obviously, 123 does not equal 200. 667 00:29:11,410 --> 00:29:14,940 And so this if condition never evaluates to true. 668 00:29:14,940 --> 00:29:18,430 Because get string is using different chunks of memory each time. 669 00:29:18,430 --> 00:29:20,360 >> Now we can see this again in another example. 670 00:29:20,360 --> 00:29:23,764 Let me go ahead and open up copy-0.c. 671 00:29:23,764 --> 00:29:28,770 I claim that this example is going to try-- but fail-- to copy two strings 672 00:29:28,770 --> 00:29:29,910 as follows. 673 00:29:29,910 --> 00:29:31,730 >> I'm going to say something to the user. 674 00:29:31,730 --> 00:29:34,490 I'm then going to get a string and call it s. 675 00:29:34,490 --> 00:29:36,400 And now, I'm doing this check here. 676 00:29:36,400 --> 00:29:37,990 We mentioned this a while back. 677 00:29:37,990 --> 00:29:42,490 But when might get string return null, another special character, or special 678 00:29:42,490 --> 00:29:45,050 symbol let's say. 679 00:29:45,050 --> 00:29:45,900 If it's out of memory. 
680 00:29:45,900 --> 00:29:48,970 >> For instance, if the user is really being difficult and types an atrocious 681 00:29:48,970 --> 00:29:51,220 number of characters at the keyboard and hits Enter. 682 00:29:51,220 --> 00:29:54,580 If that number of characters just can't fit in RAM for whatever crazy 683 00:29:54,580 --> 00:29:57,820 reason, well get string might very well return null. 684 00:29:57,820 --> 00:30:01,080 >> Or if your program itself is doing a lot of other things and there's just 685 00:30:01,080 --> 00:30:03,790 not enough memory for get string to succeed, it might end 686 00:30:03,790 --> 00:30:05,240 up returning null. 687 00:30:05,240 --> 00:30:07,160 But let's be more precise as to what this is. 688 00:30:07,160 --> 00:30:10,280 What is s's data type really? 689 00:30:10,280 --> 00:30:11,610 Char star. 690 00:30:11,610 --> 00:30:14,560 >> So it turns out now we can peel back the layer of null. 691 00:30:14,560 --> 00:30:17,500 Turns out, null is-- yes, obviously a special symbol. 692 00:30:17,500 --> 00:30:19,190 But what is it really? 693 00:30:19,190 --> 00:30:25,220 Really, null is just a symbol that we humans use to represent zero as well. 694 00:30:25,220 --> 00:30:29,010 >> So the authors of C, and computers more generally, decided years ago 695 00:30:29,010 --> 00:30:30,010 that, you know what. 696 00:30:30,010 --> 00:30:34,850 Why don't we ensure that no user data is ever, ever, ever 697 00:30:34,850 --> 00:30:36,730 stored at byte zero? 698 00:30:36,730 --> 00:30:39,610 In fact, even in my arbitrary example before, I didn't start numbering the 699 00:30:39,610 --> 00:30:40,390 bytes at zero. 700 00:30:40,390 --> 00:30:41,540 I started at one. 701 00:30:41,540 --> 00:30:44,950 Because I knew that people in the world have decided to reserve the zero 702 00:30:44,950 --> 00:30:47,970 byte in anyone's RAM as something special.
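The habit of checking a pointer for null before using it can be sketched as below. The function `first_char_or_error`, its -1 error value, and `demo_null` are all hypothetical choices for illustration; the lecture's own program is described as returning early from main instead.

```c
#include <assert.h>
#include <stddef.h>

// Returns the first character of s, or -1 if s is NULL (no string to read).
int first_char_or_error(const char *s)
{
    if (s == NULL)
    {
        return -1;
    }
    return (unsigned char) s[0];
}

void demo_null(void)
{
    char *missing = NULL;
    assert(missing == 0);                        // a null pointer compares equal to the constant 0
    assert(first_char_or_error(missing) == -1);  // the error is reported, nothing is dereferenced
    assert(first_char_or_error("hello") == 'h');
}
```

The check comes first on purpose: going to the address held by a null pointer is exactly what the convention exists to prevent.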
703 00:30:47,970 --> 00:30:52,020 >> The reason being, anytime you want to signal that something has gone wrong 704 00:30:52,020 --> 00:30:55,960 with regard to addresses, you return null-- otherwise known as zero-- 705 00:30:55,960 --> 00:30:59,410 and because you know that there's no legit data at address zero, clearly 706 00:30:59,410 --> 00:31:00,400 that means an error. 707 00:31:00,400 --> 00:31:04,080 And that's why we, by convention, check for null and return something 708 00:31:04,080 --> 00:31:06,260 like one in those cases. 709 00:31:06,260 --> 00:31:09,300 >> So if we scroll down now, this is just then some error checking, just in case 710 00:31:09,300 --> 00:31:10,610 something went wrong with [? bail ?] 711 00:31:10,610 --> 00:31:13,470 altogether and quit the program by returning early. 712 00:31:13,470 --> 00:31:19,030 This line now could be rewritten as this, which means what? 713 00:31:19,030 --> 00:31:23,155 On the left hand side, give me another pointer to a character, and call it t. 714 00:31:23,155 --> 00:31:26,935 What am I storing inside of t, based on this one line of code? 715 00:31:26,935 --> 00:31:30,950 716 00:31:30,950 --> 00:31:32,170 >> I'm storing a location. 717 00:31:32,170 --> 00:31:34,742 Specifically the location that was in s. 718 00:31:34,742 --> 00:31:39,000 So if the user has typed in hello, and that first hello happens to end up 719 00:31:39,000 --> 00:31:42,567 here, then the number 123 is going to come back from get 720 00:31:42,567 --> 00:31:43,810 string and be stored-- 721 00:31:43,810 --> 00:31:44,780 as we said earlier-- 722 00:31:44,780 --> 00:31:45,440 in s. 723 00:31:45,440 --> 00:31:50,560 >> When I now declare another pointer to a char and call it t, what number is 724 00:31:50,560 --> 00:31:53,940 literally going to end up in t according to the story? 725 00:31:53,940 --> 00:31:55,420 So 123.
726 00:31:55,420 --> 00:32:00,310 >> So technically now both s and t are pointing to the exact 727 00:32:00,310 --> 00:32:02,410 same chunks of memory. 728 00:32:02,410 --> 00:32:06,140 So notice what I'm going to do now to prove that this program is buggy. 729 00:32:06,140 --> 00:32:08,820 >> First I'm going to claim, with a printf, capitalizing 730 00:32:08,820 --> 00:32:10,080 the copy of the string. 731 00:32:10,080 --> 00:32:11,660 Then I'm going to do a little error checking. 732 00:32:11,660 --> 00:32:12,160 I'm going to make sure. 733 00:32:12,160 --> 00:32:16,710 Let's make sure that the string t is at least greater than zero in length, 734 00:32:16,710 --> 00:32:19,190 so there's some character there to actually capitalize. 735 00:32:19,190 --> 00:32:22,840 >> And then you might recall this from previous examples. 736 00:32:22,840 --> 00:32:25,630 toupper-- which is in the ctype.h file. 737 00:32:25,630 --> 00:32:30,800 t bracket zero gives me the zero character of the string t. 738 00:32:30,800 --> 00:32:34,360 And toupper of that same value, of course, converts it to uppercase. 739 00:32:34,360 --> 00:32:38,230 >> So intuitively, this highlighted line of code is capitalizing the first 740 00:32:38,230 --> 00:32:40,250 letter in t. 741 00:32:40,250 --> 00:32:44,485 But it's not capitalizing, intuitively, the first letter in s. 742 00:32:44,485 --> 00:32:48,130 But if you're thinking ahead, what am I about to see when I run this program 743 00:32:48,130 --> 00:32:54,220 and print out both the original, s, and the so-called copy, t? 744 00:32:54,220 --> 00:32:55,350 >> They're actually going to be the same. 745 00:32:55,350 --> 00:32:56,600 And why are they going to be the same? 746 00:32:56,600 --> 00:32:58,970 747 00:32:58,970 --> 00:33:01,020 They're both pointing to exactly the same thing. 748 00:33:01,020 --> 00:33:01,610 So let's do this. 749 00:33:01,610 --> 00:33:03,160 >> Make copy zero. 750 00:33:03,160 --> 00:33:04,070 It compiles OK.
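The aliasing behind this bug can be sketched without the CS50 library. This is not the text of copy-0.c, which the transcript does not show; `capitalize_alias` is a hypothetical helper, it checks `t[0] != '\0'` where the lecture checks that the length is greater than zero, and the caller is assumed to pass a writable array rather than a string literal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Mimics the buggy "copy": t receives the same address as s, not new memory,
// so capitalizing through t also changes what s sees. Returns t.
char *capitalize_alias(char *s)
{
    assert(s != NULL);
    char *t = s;  // copies the address only, not the characters
    if (t[0] != '\0')
    {
        t[0] = (char) toupper((unsigned char) t[0]);
    }
    return t;
}
```

Given `char word[] = "hello";`, the pointer returned by `capitalize_alias(word)` equals `word` itself, and `word` now begins with a capital H: there was only ever one string.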
751 00:33:04,070 --> 00:33:06,500 Let me run copy zero. 752 00:33:06,500 --> 00:33:10,110 Let me type something like hello in all lowercase then hit Enter. 753 00:33:10,110 --> 00:33:16,520 And it claims that both the original s and the copy are indeed identical. 754 00:33:16,520 --> 00:33:17,920 >> So what really happened here? 755 00:33:17,920 --> 00:33:20,100 Let me redraw this picture just to tell the story in a 756 00:33:20,100 --> 00:33:21,340 slightly different way. 757 00:33:21,340 --> 00:33:26,060 What's really going on underneath the hood when I declare something like 758 00:33:26,060 --> 00:33:30,410 char star s, or string s, I am getting a pointer-- 759 00:33:30,410 --> 00:33:33,090 which happens to be four bytes in the CS50 appliance 760 00:33:33,090 --> 00:33:34,410 and in a lot of computers. 761 00:33:34,410 --> 00:33:36,008 And I'm going to call this s. 762 00:33:36,008 --> 00:33:39,810 And this currently has some unknown value. 763 00:33:39,810 --> 00:33:43,900 >> When you declare a variable, unless you yourself put a value there, who 764 00:33:43,900 --> 00:33:44,570 knows what's there. 765 00:33:44,570 --> 00:33:48,110 It could be some random sequence of bits from the previous execution. 766 00:33:48,110 --> 00:33:52,490 So when I, in my line of code do get string, and then store the return 767 00:33:52,490 --> 00:33:54,800 value in s, get string somehow-- 768 00:33:54,800 --> 00:33:58,520 and we'll eventually peel back how get string works-- somehow allocates an 769 00:33:58,520 --> 00:34:00,480 array that probably looks a bit like this. 770 00:34:00,480 --> 00:34:05,390 H-E-L-L-O, backslash zero. 771 00:34:05,390 --> 00:34:09,510 >> Let's suppose that this is address 123 just for consistency. 772 00:34:09,510 --> 00:34:13,000 So get string returns, in the highlighted line there, it returns the 773 00:34:13,000 --> 00:34:15,000 number we said, 123. 774 00:34:15,000 --> 00:34:17,420 So what really goes inside of s here?
775 00:34:17,420 --> 00:34:26,590 >> Well, what really goes inside of s is 123. 776 00:34:26,590 --> 00:34:29,250 But frankly, I'm getting a little confused by all of these addresses, 777 00:34:29,250 --> 00:34:30,320 all of these arbitrary numbers. 778 00:34:30,320 --> 00:34:32,290 123, 124, 127. 779 00:34:32,290 --> 00:34:34,570 So let's actually simplify the world a little bit. 780 00:34:34,570 --> 00:34:38,800 >> When we talk about pointers, frankly, to us humans, who the heck cares where 781 00:34:38,800 --> 00:34:39,870 things are in memory? 782 00:34:39,870 --> 00:34:41,080 That's completely arbitrary. 783 00:34:41,080 --> 00:34:43,370 It's going to depend on how much RAM the user has. 784 00:34:43,370 --> 00:34:46,590 It's going to depend on when in the day you run the program, perhaps, and 785 00:34:46,590 --> 00:34:48,250 what input the user gives you. 786 00:34:48,250 --> 00:34:50,060 We're dwelling on unimportant details. 787 00:34:50,060 --> 00:34:54,230 >> So let's abstract away and say that, when you run a line of code like this, 788 00:34:54,230 --> 00:34:57,320 char star s gets the return value of get string. 789 00:34:57,320 --> 00:35:02,720 Why don't we instead just draw what we keep calling a pointer as though it's 790 00:35:02,720 --> 00:35:04,140 pointing at something? 791 00:35:04,140 --> 00:35:07,000 So I claim now that s up there is a pointer-- 792 00:35:07,000 --> 00:35:08,480 underneath the hood it's an address. 793 00:35:08,480 --> 00:35:11,330 But it's just pointing to the first byte in the 794 00:35:11,330 --> 00:35:12,780 string that's been returned. 795 00:35:12,780 --> 00:35:16,710 >> If I now return to the code here, what's going on at this line? 796 00:35:16,710 --> 00:35:20,020 Well, in this highlighted line now, I'm declaring apparently another 797 00:35:20,020 --> 00:35:21,070 variable called t. 
798 00:35:21,070 --> 00:35:25,700 But it's also a pointer, so I'm going to draw it as, in theory, the exact 799 00:35:25,700 --> 00:35:26,710 same size box. 800 00:35:26,710 --> 00:35:28,160 And I'm going to call it t. 801 00:35:28,160 --> 00:35:33,500 >> And now if we go back to the code again, when I store s inside of t, 802 00:35:33,500 --> 00:35:36,920 what am I technically putting inside of t? 803 00:35:36,920 --> 00:35:39,350 Well technically, this was the number 123. 804 00:35:39,350 --> 00:35:42,270 So really I should be writing the number 123 there. 805 00:35:42,270 --> 00:35:43,900 But let's take it higher level. 806 00:35:43,900 --> 00:35:48,090 t, if it is just a pointer, intuitively, is just that. 807 00:35:48,090 --> 00:35:49,800 That's all that's being stored in there. 808 00:35:49,800 --> 00:35:54,970 >> So now in the last interesting lines of code, when I actually go about 809 00:35:54,970 --> 00:36:00,680 capitalizing the zero character in t, what is going on? 810 00:36:00,680 --> 00:36:06,310 Well, t bracket zero is now pointing to what character, presumably? 811 00:36:06,310 --> 00:36:07,460 >> It's pointing to h. 812 00:36:07,460 --> 00:36:08,870 Because t bracket zero-- 813 00:36:08,870 --> 00:36:12,490 recall, this is old syntax. t bracket zero just means if t is a string, t 814 00:36:12,490 --> 00:36:15,590 bracket zero means getting the zero character in that string. 815 00:36:15,590 --> 00:36:18,650 So what that really means is go to this array-- 816 00:36:18,650 --> 00:36:21,520 and yes, this might be 123, this might be 124. 817 00:36:21,520 --> 00:36:22,790 But it's all relative, remember. 818 00:36:22,790 --> 00:36:25,640 Whenever talking about an array, we have the advantage of talking about 819 00:36:25,640 --> 00:36:27,000 relative indices. 820 00:36:27,000 --> 00:36:31,120 >> And so now we can just assume that t bracket zero is h.
821 00:36:31,120 --> 00:36:35,090 So if I call toupper on it, what that's really doing is capitalizing 822 00:36:35,090 --> 00:36:38,290 the lowercase h to uppercase H. But of course, what is s? 823 00:36:38,290 --> 00:36:41,010 It's pointing to the same darn string. 824 00:36:41,010 --> 00:36:44,200 >> So this is all that's been happening in this code so far. 825 00:36:44,200 --> 00:36:45,960 So what's then the implication? 826 00:36:45,960 --> 00:36:48,300 How do we fix these two problems? 827 00:36:48,300 --> 00:36:50,870 How do we compare two actual strings? 828 00:36:50,870 --> 00:36:53,720 >> Well intuitively, how would you go about comparing two 829 00:36:53,720 --> 00:36:55,090 strings for true equality? 830 00:36:55,090 --> 00:36:58,920 831 00:36:58,920 --> 00:37:00,750 >> What does it mean if two strings are equal? 832 00:37:00,750 --> 00:37:04,330 Clearly not that their addresses are equal in memory, because that's a low 833 00:37:04,330 --> 00:37:06,590 level implementation detail. 834 00:37:06,590 --> 00:37:08,360 All the characters are the same. 835 00:37:08,360 --> 00:37:12,810 So let me propose, and let me introduce in version one of compare.c 836 00:37:12,810 --> 00:37:14,970 here, so compare-1.c. 837 00:37:14,970 --> 00:37:19,590 >> Let me propose that we still get a pointer called s, and store in it the 838 00:37:19,590 --> 00:37:20,610 return value of get string. 839 00:37:20,610 --> 00:37:21,750 Let's do the same thing with t. 840 00:37:21,750 --> 00:37:23,230 So none of the code is different. 841 00:37:23,230 --> 00:37:25,420 I'm going to add a little more error checking now. 842 00:37:25,420 --> 00:37:29,390 So now that we're sort of peeling back these layers in CS50 of what a string 843 00:37:29,390 --> 00:37:33,520 actually is, we need to be more anal about making sure we don't abuse 844 00:37:33,520 --> 00:37:35,330 invalid values like null. 845 00:37:35,330 --> 00:37:36,440 >> So I'm just going to check.
846 00:37:36,440 --> 00:37:41,490 If s does not equal null and t does not equal null, that means we're OK. 847 00:37:41,490 --> 00:37:44,460 Get string did not screw up getting either of those strings. 848 00:37:44,460 --> 00:37:51,270 And you can perhaps guess now, what does strcmp presumably do? 849 00:37:51,270 --> 00:37:52,000 String compare. 850 00:37:52,000 --> 00:37:55,470 >> So if you've programmed in Java before, this is like the equals method in the 851 00:37:55,470 --> 00:37:56,490 String class. 852 00:37:56,490 --> 00:37:57,890 But for those of you who haven't programmed before, 853 00:37:57,890 --> 00:37:59,320 this is just a C function. 854 00:37:59,320 --> 00:38:02,180 It happens to come in a file called string.h. 855 00:38:02,180 --> 00:38:03,830 That's where it's declared. 856 00:38:03,830 --> 00:38:05,110 >> And string compare-- 857 00:38:05,110 --> 00:38:07,530 I actually forget its usage, but never mind that. 858 00:38:07,530 --> 00:38:10,470 Recall that we can do man strcmp. 859 00:38:10,470 --> 00:38:12,590 And this is going to bring up the Linux Programmer's Manual. 860 00:38:12,590 --> 00:38:14,060 And it's, frankly, a little cryptic. 861 00:38:14,060 --> 00:38:15,270 But I can see here that, yep. 862 00:38:15,270 --> 00:38:17,570 I have to include string.h. 863 00:38:17,570 --> 00:38:20,590 >> And it says here under description, "the string compare function compares 864 00:38:20,590 --> 00:38:24,560 the two strings S1 and S2." And S1 and S2 are apparently the two 865 00:38:24,560 --> 00:38:26,120 arguments passed in. 866 00:38:26,120 --> 00:38:28,650 I don't really remember what const is, but now notice-- 867 00:38:28,650 --> 00:38:31,480 and you may have seen this already when you've used the man pages, if you 868 00:38:31,480 --> 00:38:32,390 have at all-- 869 00:38:32,390 --> 00:38:36,220 that char star is just synonymous with string.
870 00:38:36,220 --> 00:38:40,440 >> So it compares the two strings, S1 and S2, and it returns an integer less 871 00:38:40,440 --> 00:38:44,930 than or equal to or greater than zero if S1 is found, respectively, to be 872 00:38:44,930 --> 00:38:47,450 less than, or match, or be greater than S2. 873 00:38:47,450 --> 00:38:51,220 That's just a very complex way of saying that string compare returns 874 00:38:51,220 --> 00:38:55,760 zero if two strings are intuitively identical, character for 875 00:38:55,760 --> 00:38:57,120 character for character. 876 00:38:57,120 --> 00:38:59,970 >> It returns a negative number if s, alphabetically, is supposed 877 00:38:59,970 --> 00:39:01,010 to come before t. 878 00:39:01,010 --> 00:39:05,300 Or returns a positive number if s is supposed to come after t 879 00:39:05,300 --> 00:39:06,170 alphabetically. 880 00:39:06,170 --> 00:39:08,360 So with this simple function, could you, for instance, sort a 881 00:39:08,360 --> 00:39:09,770 whole bunch of words? 882 00:39:09,770 --> 00:39:13,984 >> So in this new version, I'm going to go ahead and make compare1. 883 00:39:13,984 --> 00:39:15,750 Dot slash compare one. 884 00:39:15,750 --> 00:39:18,030 I'll type in hello in all lower case. 885 00:39:18,030 --> 00:39:20,300 I'm going to type in hello in all lowercase again. 886 00:39:20,300 --> 00:39:23,340 And thankfully now it realizes I typed the same thing. 887 00:39:23,340 --> 00:39:27,520 >> Meanwhile, if I type in hello in lower case and HELLO in upper case and 888 00:39:27,520 --> 00:39:29,710 compare them, I typed different things. 889 00:39:29,710 --> 00:39:32,530 Because not only are the addresses different, but we're comparing 890 00:39:32,530 --> 00:39:35,350 different characters again and again. 891 00:39:35,350 --> 00:39:37,320 >> Well let's go and fix one other problem now. 892 00:39:37,320 --> 00:39:41,590 Let me open up version one of copy, which now addresses 893 00:39:41,590 --> 00:39:42,900 this issue as follows. 
894 00:39:42,900 --> 00:39:45,650 And this one's going to look a little more complex. 895 00:39:45,650 --> 00:39:49,320 But if you think about what problem we need to solve, hopefully this will be 896 00:39:49,320 --> 00:39:51,870 clear in just a moment now. 897 00:39:51,870 --> 00:39:57,280 >> So this first line, char star t, in layman's terms could someone propose 898 00:39:57,280 --> 00:39:59,450 what this line here means? 899 00:39:59,450 --> 00:40:01,050 Char star t, what is that doing? 900 00:40:01,050 --> 00:40:06,660 901 00:40:06,660 --> 00:40:07,210 >> Good. 902 00:40:07,210 --> 00:40:09,500 Create a pointer to some spot in memory. 903 00:40:09,500 --> 00:40:10,930 And let me refine it a little bit. 904 00:40:10,930 --> 00:40:17,180 Declare a variable that will store the address of some char in memory, just 905 00:40:17,180 --> 00:40:18,480 to be a little more proper. 906 00:40:18,480 --> 00:40:21,210 >> OK, so now on the right hand side, I've never seen one of these functions 907 00:40:21,210 --> 00:40:22,660 before, malloc. 908 00:40:22,660 --> 00:40:26,980 But what might that mean? 909 00:40:26,980 --> 00:40:28,050 Allocation of memory. 910 00:40:28,050 --> 00:40:29,410 Memory allocation. 911 00:40:29,410 --> 00:40:33,050 >> So it turns out, up until now, we haven't really had a powerful way of 912 00:40:33,050 --> 00:40:36,210 asking the operating system, give me some memory. 913 00:40:36,210 --> 00:40:39,980 Rather, we now have a function called malloc that does exactly that. 914 00:40:39,980 --> 00:40:42,960 Even though this is a bit of a distraction right now, notice that in 915 00:40:42,960 --> 00:40:46,200 between the two parentheses is just going to be a number. 916 00:40:46,200 --> 00:40:48,510 Where I've typed in question marks can be a number. 917 00:40:48,510 --> 00:40:51,020 >> And that number means, give me 10 bytes. 918 00:40:51,020 --> 00:40:52,320 Give me 20 bytes. 919 00:40:52,320 --> 00:40:53,820 Give me 100 bytes.
920 00:40:53,820 --> 00:40:56,500 And malloc will do its best to ask the operating system-- 921 00:40:56,500 --> 00:40:57,630 Linux, in this case-- 922 00:40:57,630 --> 00:40:59,630 hey, are there 100 bytes of RAM available? 923 00:40:59,630 --> 00:41:04,320 If so, return those bytes to me by returning the address of which of 924 00:41:04,320 --> 00:41:06,610 those bytes, perhaps? 925 00:41:06,610 --> 00:41:07,610 The very first one. 926 00:41:07,610 --> 00:41:10,460 >> So here too-- and this is prevalent in C-- any time you're 927 00:41:10,460 --> 00:41:11,680 dealing with addresses, 928 00:41:11,680 --> 00:41:15,830 you're almost always dealing with the first such address, no matter how big 929 00:41:15,830 --> 00:41:19,490 a chunk of memory you are being handed back, so to speak. 930 00:41:19,490 --> 00:41:20,880 >> So let's dive in here. 931 00:41:20,880 --> 00:41:23,940 I am trying to allocate how many bytes, exactly? 932 00:41:23,940 --> 00:41:24,080 Well. 933 00:41:24,080 --> 00:41:26,090 String length of s-- let's do a concrete example. 934 00:41:26,090 --> 00:41:30,700 If s is hello, H-E-L-L-O, what's the string length of s, obviously? 935 00:41:30,700 --> 00:41:32,010 So it's five. 936 00:41:32,010 --> 00:41:34,590 But I'm doing a plus 1 on that, why? 937 00:41:34,590 --> 00:41:37,700 Why do I want six bytes instead of five? 938 00:41:37,700 --> 00:41:38,790 The null character. 939 00:41:38,790 --> 00:41:41,210 >> I don't want to leave off this special null character. 940 00:41:41,210 --> 00:41:45,160 Because if I make a copy of Hello and just do H-E-L-L-O, but I don't put 941 00:41:45,160 --> 00:41:50,160 that special character, the computer might not have, by chance, a backslash 942 00:41:50,160 --> 00:41:51,730 zero there for me.
943 00:41:51,730 --> 00:41:55,570 And so if I'm trying to figure out the length of the copy, I might think that 944 00:41:55,570 --> 00:41:59,360 it's 20 characters long, or a million characters long if I just never happen 945 00:41:59,360 --> 00:42:01,050 to hit a backslash zero. 946 00:42:01,050 --> 00:42:05,780 >> So we need six bytes to store H-E-L-L-O, backslash zero. 947 00:42:05,780 --> 00:42:07,870 And then this is just to be super anal. 948 00:42:07,870 --> 00:42:10,700 Suppose that I forget what the size of a char is. 949 00:42:10,700 --> 00:42:12,020 We keep saying it's one byte. 950 00:42:12,020 --> 00:42:12,860 And it usually is. 951 00:42:12,860 --> 00:42:15,425 In theory, it could be something different, on a different Mac or a 952 00:42:15,425 --> 00:42:16,250 different PC. 953 00:42:16,250 --> 00:42:19,650 >> So it turns out there's this operator called sizeof that if you pass it the 954 00:42:19,650 --> 00:42:22,680 name of a data type-- like char, or int, or float-- 955 00:42:22,680 --> 00:42:26,930 it will tell you, dynamically, how many bytes a char takes up on this 956 00:42:26,930 --> 00:42:28,090 particular computer. 957 00:42:28,090 --> 00:42:31,360 >> So this is effectively just like saying times 1 or 958 00:42:31,360 --> 00:42:32,440 times nothing at all. 959 00:42:32,440 --> 00:42:36,340 But I'm doing it just to be super anal, that just in case a char differs 960 00:42:36,340 --> 00:42:40,610 on your computer versus mine, this way the math is always going to check out. 961 00:42:40,610 --> 00:42:43,720 >> Lastly, down here I check for null, which is always good practice-- again, 962 00:42:43,720 --> 00:42:44,920 any time we're dealing with pointers. 963 00:42:44,920 --> 00:42:47,520 If malloc wasn't able to give me six bytes-- which is 964 00:42:47,520 --> 00:42:49,210 unlikely, but just in case-- 965 00:42:49,210 --> 00:42:50,730 return one immediately. 966 00:42:50,730 --> 00:42:53,290 And now, go ahead and copy the string as follows.
967 00:42:53,290 --> 00:42:57,240 And this is familiar syntax, albeit in a different role. 968 00:42:57,240 --> 00:43:01,210 >> I'm going to go ahead and get the string length of s and store it in n. 969 00:43:01,210 --> 00:43:06,620 I'm then going to iterate from i equals zero up to and including n, 970 00:43:06,620 --> 00:43:08,410 less than or equal to. 971 00:43:08,410 --> 00:43:13,540 So that on each iteration, I put the ith character of s in the ith 972 00:43:13,540 --> 00:43:15,380 character of t. 973 00:43:15,380 --> 00:43:18,190 >> So what's really going on underneath the hood here? 974 00:43:18,190 --> 00:43:22,140 Well if this, for instance, is s-- 975 00:43:22,140 --> 00:43:26,400 and I have typed in the word H-E-L-L-O and there's a backslash zero. 976 00:43:26,400 --> 00:43:29,020 And again, this is s pointing here. 977 00:43:29,020 --> 00:43:30,830 And here now is t. 978 00:43:30,830 --> 00:43:34,860 >> And this is pointing now to a copy of memory, right? 979 00:43:34,860 --> 00:43:37,340 Malloc has given me a whole chunk of memory. 980 00:43:37,340 --> 00:43:41,440 I don't know initially what's in any of these locations. 981 00:43:41,440 --> 00:43:44,340 So I'm going to think of these as a whole bunch of question marks. 982 00:43:44,340 --> 00:43:50,190 >> But as soon as I start looping from zero on up through the length of s, t 983 00:43:50,190 --> 00:43:52,790 bracket zero and t bracket 1-- 984 00:43:52,790 --> 00:43:55,080 and I'll put this now on the overhead-- 985 00:43:55,080 --> 00:44:04,190 t bracket zero and s bracket zero mean that I'm going to be copying 986 00:44:04,190 --> 00:44:09,875 iteratively h in here, E-L-L-O. Plus, because I did the plus 987 00:44:09,875 --> 00:44:12,370 1, backslash zero. 988 00:44:12,370 --> 00:44:19,060 >> So now in the case of copy-1.c, in the end, if I print out the 989 00:44:19,060 --> 00:44:24,760 capitalization of t, we should see that s is unchanged.
990 00:44:24,760 --> 00:44:26,090 Let me go ahead now and do this. 991 00:44:26,090 --> 00:44:28,630 So make copy1. 992 00:44:28,630 --> 00:44:30,860 Dot slash copy1. 993 00:44:30,860 --> 00:44:33,670 I'm going to type in hello, Enter. 994 00:44:33,670 --> 00:44:37,430 And now notice, only the copy has been capitalized. 995 00:44:37,430 --> 00:44:40,890 Because I truly have two chunks of memory. 996 00:44:40,890 --> 00:44:44,390 >> Unfortunately, you can do some pretty bad and pretty dangerous things here. 997 00:44:44,390 --> 00:44:49,290 Let me pull up an example here now, that gives us an example of a few 998 00:44:49,290 --> 00:44:51,540 different lines. 999 00:44:51,540 --> 00:44:56,040 So just intuitively here, the first line of code, int star x, is declaring 1000 00:44:56,040 --> 00:44:57,340 a variable called x. 1001 00:44:57,340 --> 00:44:58,810 And what's the data type of that variable? 1002 00:44:58,810 --> 00:45:01,820 1003 00:45:01,820 --> 00:45:04,290 What's the data type of that variable? 1004 00:45:04,290 --> 00:45:06,980 That was not the cliffhanger. 1005 00:45:06,980 --> 00:45:08,350 >> The data type is int star. 1006 00:45:08,350 --> 00:45:12,600 So what does that mean? x will store the address of an int. 1007 00:45:12,600 --> 00:45:13,520 Simple as that. 1008 00:45:13,520 --> 00:45:16,220 y is going to store the address of an int. 1009 00:45:16,220 --> 00:45:18,390 What is the third line of code doing there? 1010 00:45:18,390 --> 00:45:21,850 It's allocating how many bytes, most likely? 1011 00:45:21,850 --> 00:45:22,350 Four. 1012 00:45:22,350 --> 00:45:25,460 Because the size of an int is generally four, malloc of four gives 1013 00:45:25,460 --> 00:45:29,950 me back the address of a chunk of memory, the first of whose bytes is 1014 00:45:29,950 --> 00:45:32,110 stored now in x. 1015 00:45:32,110 --> 00:45:34,410 >> Now we're moving a little quickly. 1016 00:45:34,410 --> 00:45:35,760 Star x means what?
1017 00:45:35,760 --> 00:45:38,480 1018 00:45:38,480 --> 00:45:42,590 It means go to that address and put what number there? 1019 00:45:42,590 --> 00:45:43,870 Put the number 42 there. 1020 00:45:43,870 --> 00:45:47,590 Star y means go to what's at y and put the number 13 there. 1021 00:45:47,590 --> 00:45:48,600 >> But wait a minute. 1022 00:45:48,600 --> 00:45:51,640 What is in y at the moment? 1023 00:45:51,640 --> 00:45:54,950 What address is y storing? 1024 00:45:54,950 --> 00:45:55,770 We don't know, right? 1025 00:45:55,770 --> 00:45:59,230 We have never once used the assignment operator involving y. 1026 00:45:59,230 --> 00:46:03,370 So y as declared on the second line of code is just some garbage value, a big 1027 00:46:03,370 --> 00:46:04,760 question mark so to speak. 1028 00:46:04,760 --> 00:46:07,230 It could be pointing randomly to anything in memory, which 1029 00:46:07,230 --> 00:46:08,340 is generally bad. 1030 00:46:08,340 --> 00:46:13,540 >> So as soon as we hit that line there, star y equals 13, something bad, 1031 00:46:13,540 --> 00:46:17,220 something very bad is about to happen to Binky. 1032 00:46:17,220 --> 00:46:25,810 So let's see what's going to end up happening to Binky here in this minute 1033 00:46:25,810 --> 00:46:26,200 or so look. 1034 00:46:26,200 --> 00:46:26,490 >> [VIDEO PLAYBACK] 1035 00:46:26,490 --> 00:46:26,745 >> -Hey, Binky. 1036 00:46:26,745 --> 00:46:27,000 Wake up. 1037 00:46:27,000 --> 00:46:29,296 It's time for pointer fun. 1038 00:46:29,296 --> 00:46:30,680 >> -What's that? 1039 00:46:30,680 --> 00:46:31,980 Learn about pointers? 1040 00:46:31,980 --> 00:46:34,010 Oh, goodie. 1041 00:46:34,010 --> 00:46:37,220 >> -Well, to get started, I guess we're going to need a couple pointers. 1042 00:46:37,220 --> 00:46:37,930 >> -OK. 1043 00:46:37,930 --> 00:46:41,650 This code allocates two pointers which can point to integers. 1044 00:46:41,650 --> 00:46:43,760 >> -OK, well, I see the two pointers.
1045 00:46:43,760 --> 00:46:45,850 But they don't seem to be pointing to anything. 1046 00:46:45,850 --> 00:46:46,490 >> -That's right. 1047 00:46:46,490 --> 00:46:48,630 Initially, pointers don't point to anything. 1048 00:46:48,630 --> 00:46:51,700 The things they point to are called pointees, and setting them up is a 1049 00:46:51,700 --> 00:46:52,850 separate step. 1050 00:46:52,850 --> 00:46:53,740 >> -Oh, right, right. 1051 00:46:53,740 --> 00:46:54,500 I knew that. 1052 00:46:54,500 --> 00:46:56,270 The pointees are separate. 1053 00:46:56,270 --> 00:46:58,553 So how do you allocate a pointee? 1054 00:46:58,553 --> 00:46:59,480 >> -OK. 1055 00:46:59,480 --> 00:47:03,707 Well, this code allocates a new integer pointee, and this part sets x 1056 00:47:03,707 --> 00:47:05,520 to point to it. 1057 00:47:05,520 --> 00:47:06,760 >> -Hey, that looks better. 1058 00:47:06,760 --> 00:47:08,520 So make it do something. 1059 00:47:08,520 --> 00:47:09,530 >> -OK. 1060 00:47:09,530 --> 00:47:14,110 I'll dereference the pointer x to store the number 42 into its pointee. 1061 00:47:14,110 --> 00:47:17,660 For this trick, I'll need my magic wand of dereferencing. 1062 00:47:17,660 --> 00:47:20,695 >> -Your magic wand of dereferencing? 1063 00:47:20,695 --> 00:47:22,632 Uh, that's great. 1064 00:47:22,632 --> 00:47:24,620 >> -This is what the code looks like. 1065 00:47:24,620 --> 00:47:27,526 I'll just set up the number, and-- 1066 00:47:27,526 --> 00:47:28,250 >> -Hey, look. 1067 00:47:28,250 --> 00:47:29,680 There it goes. 1068 00:47:29,680 --> 00:47:34,520 So doing a dereference on x follows the arrow to access its pointee. 1069 00:47:34,520 --> 00:47:36,690 In this case, to store 42 in there. 1070 00:47:36,690 --> 00:47:40,890 Hey, try using it to store the number 13 through the other pointer, y. 1071 00:47:40,890 --> 00:47:42,125 >> -OK. 1072 00:47:42,125 --> 00:47:46,810 I'll just go over here to y and get the number 13 set up.
1073 00:47:46,810 --> 00:47:50,890 And then take the wand of dereferencing and just-- 1074 00:47:50,890 --> 00:47:52,430 whoa! 1075 00:47:52,430 --> 00:47:53,030 >> -Oh, hey. 1076 00:47:53,030 --> 00:47:54,610 That didn't work. 1077 00:47:54,610 --> 00:47:58,200 Say, Binky, I don't think dereferencing y is a good idea, 1078 00:47:58,200 --> 00:48:01,370 because setting up the pointee is a separate step. 1079 00:48:01,370 --> 00:48:03,460 And I don't think we ever did it. 1080 00:48:03,460 --> 00:48:03,810 >> -Hmm. 1081 00:48:03,810 --> 00:48:05,160 Good point. 1082 00:48:05,160 --> 00:48:07,410 >> -Yeah, we allocated the pointer y. 1083 00:48:07,410 --> 00:48:10,045 But we never set it to point to a pointee. 1084 00:48:10,045 --> 00:48:10,490 >> -Hmm. 1085 00:48:10,490 --> 00:48:12,170 Very observant. 1086 00:48:12,170 --> 00:48:13,790 >> -Hey, you're looking good there, Binky. 1087 00:48:13,790 --> 00:48:16,920 Can you fix it so that y points to the same pointee as x? 1088 00:48:16,920 --> 00:48:17,810 >> -Sure. 1089 00:48:17,810 --> 00:48:20,300 I'll use my magic wand of pointer assignment. 1090 00:48:20,300 --> 00:48:22,240 >> -Is that going to be a problem like before? 1091 00:48:22,240 --> 00:48:22,665 >> -No. 1092 00:48:22,665 --> 00:48:24,300 This doesn't touch the pointees. 1093 00:48:24,300 --> 00:48:27,880 It just changes one pointer to point to the same thing as another. 1094 00:48:27,880 --> 00:48:28,970 >> -Oh, I see. 1095 00:48:28,970 --> 00:48:31,730 Now y points to the same place as x. 1096 00:48:31,730 --> 00:48:32,450 So wait. 1097 00:48:32,450 --> 00:48:33,490 Now y is fixed. 1098 00:48:33,490 --> 00:48:34,630 It has a pointee. 1099 00:48:34,630 --> 00:48:36,520 So you can try the wand of dereferencing again 1100 00:48:36,520 --> 00:48:39,200 to send the 13 over. 1101 00:48:39,200 --> 00:48:39,840 >> -OK. 1102 00:48:39,840 --> 00:48:41,570 Here goes. 1103 00:48:41,570 --> 00:48:42,870 >> -Hey, look at that.
1104 00:48:42,870 --> 00:48:44,320 Now dereferencing works on y. 1105 00:48:44,320 --> 00:48:47,020 And because the pointers are sharing that one pointee, they 1106 00:48:47,020 --> 00:48:48,585 both see the 13. 1107 00:48:48,585 --> 00:48:49,040 >> -Yeah. 1108 00:48:49,040 --> 00:48:49,670 Sharing. 1109 00:48:49,670 --> 00:48:50,380 Whatever. 1110 00:48:50,380 --> 00:48:52,290 So are we going to switch places now? 1111 00:48:52,290 --> 00:48:52,970 >> -Oh, look. 1112 00:48:52,970 --> 00:48:54,150 We're out of time. 1113 00:48:54,150 --> 00:48:55,200 >> -But-- 1114 00:48:55,200 --> 00:48:57,060 >> -Just remember the three pointer rules. 1115 00:48:57,060 --> 00:49:00,100 Number one, the basic structure is that you have a pointer. 1116 00:49:00,100 --> 00:49:02,170 And it points over to a pointee. 1117 00:49:02,170 --> 00:49:04,160 But the pointer and pointee are separate. 1118 00:49:04,160 --> 00:49:06,460 And the common error is to set up a pointer, but to 1119 00:49:06,460 --> 00:49:08,540 forget to give it a pointee. 1120 00:49:08,540 --> 00:49:12,460 >> Number two, pointer dereferencing starts at the pointer and follows its 1121 00:49:12,460 --> 00:49:14,570 arrow over to access its pointee. 1122 00:49:14,570 --> 00:49:18,640 As we all know, this only works if there is a pointee, which gets back to 1123 00:49:18,640 --> 00:49:19,790 rule number one. 1124 00:49:19,790 --> 00:49:23,670 >> Number three, pointer assignment takes one pointer and changes it to point to 1125 00:49:23,670 --> 00:49:25,850 the same pointee as another pointer. 1126 00:49:25,850 --> 00:49:27,840 So after the assignment, the two pointers will 1127 00:49:27,840 --> 00:49:29,430 point to the same pointee. 1128 00:49:29,430 --> 00:49:31,600 Sometimes that's called sharing. 1129 00:49:31,600 --> 00:49:33,430 And that's all there is to it, really. 1130 00:49:33,430 --> 00:49:33,840 Bye bye now.
1131 00:49:33,840 --> 00:49:34,300 >> [END VIDEO PLAYBACK] 1132 00:49:34,300 --> 00:49:36,940 >> DAVID MALAN: So more on pointers, more on Binky next week. 1133 00:49:36,940 --> 00:49:38,190 We'll see you on Monday. 1134 00:49:38,190 --> 00:49:42,187