[ Silence ] 

>> Alright! Welcome back to CS50. This is the end of week 4. So, if you haven't already, do pull up or follow along while I do here a simple Google search for this keyword recursion which we discussed last week, and do you get it? 

[ Inaudible Remarks ] 

>> Ah, ha ha ha. Okay, so many new jokes available to us now. So, someone has a sense of humor on the Google humor team, right? 

[ Inaudible Remarks ] 

>> Alright, we're all a bunch of dorks now, alright. So, a couple of recaps and announcements. So last time, recall that we focused on algorithms and actually solving some problems a bit more efficiently and we started to use this so-called asymptotic notation, just big O in omega and there's also feta [phonetic] but we'll focus mostly on big O in--in Omega and just to toss some jargon out here, know that we focused a lot on big O of N linear and so linear then perhaps brings back memories of the straight lines that we do while graphing. We looked at big O of N squared, bubble sort, selection sort, these were in big O of N squared and we found that that was generally slow. If definitely felt or looked slow when we compared it to something faster, namely N log N. Now, frankly most people don't tend to say supralinear but it is an apt word but N log N was kind of a sweet spot between N squared which was really slow and N which was generally unrealistic, at least thus far for doing something like sorting. But know too that anytime you have an algorithm that takes the same amount of time, irrespective of the size of the problem, irrespective of the number of students in the room, irrespective of the number of inputs provided, well that's what we would call constant time. And just as we cross out things like divided by 2 and times 2, anytime you wanna express the notion that an algorithm runs in constant time, 5 seconds, 10 seconds, no matter how big the input is, you would typically say big O of 1. Well, what's an example of a constant algorithm? Well, I mean you can think of something simple like--at the start of lecture, rather than taking something like attendance, the algorithm can be greet students, so I execute that algorithm, "hello", like no matter what that's gonna take constant time. Even if I say three things like "hello class", okay, so that's two steps but again we get rid of this constant multiplier so it's still constant time. We saw binary search with Shawn up at the board on video where log N was the ultimate conclusion there. But know that there are other things. Sometimes, you even can't do N squared, the problem is just too big or too complex, so you might have to have something polynomial, N raised to the something power, where that something, call it C, is bigger than 2, and really bad algorithms are things that are in, say, factorial--N a factorial which we'll typically avoid but there are--there is the potential to write algorithms that really are that slow. And then exponential is yet another one not even depicted here, and that too tends to be quite slow. But today we return now to the practice of actually coding things and understanding what's going on inside of a computer and peel back a layer called pointers that is both empowering as we'll see but also quite dangerous. So if you don't recall, WiFi should now be available here in Sanders. What we've also done, thanks to HUIT, is Quincy House whose wireless we broke last Wednesday at office hours since we had 209 students there in attendance. They have added six more wireless access points for us, so tonight's office hours in Quincy should be much improved, and the other houses have been bolstered as well. Also, too, if you've been trying to catch me for pass/fail forms or add/drop questions and I have been quiet on email, it's just because I've gotten kind of overwhelmed. Please don't hesitate to just re-ping me, and I'll do my best to catch up today and tomorrow, but certainly you engage me, or Matt, or Rob, or your TF in a chat if you have any questions or concerns at this point in the term. So, a little teaser before we move on to some milk and OJ here. So this was a--a little video that should tease us as to today's focus, this topic of pointers. In a nutshell, a pointer is going to be an address, the address of something in memory but we'll see what damage we can actually do if we use these things incorrectly, so a clip from a certain Stanford University Computer Science professor here. 

[ Music ] 

>> Hey Binky, wake up, its time for Pointer Fun. 

>> What's that? Learn about pointers? Oh goody! 

[ Laughter ] 

>> That's all you get right now, we will conclude with the--the follow up there. It's just fun to imagine Nick, this esteemed computer science professor at Stanford kind of playing at home with his clay models and recording himself do this, so more on that later today. So, recall that we ran into some problems early on when we tried to do something so simple like swapping variables and writing our own swap function also failed ultimately. And so consider how we solve the problem like this. I'm gonna get rid of things like instant charge here because we can just generalize with pseudo code, but here might be two variables, A and B, and they've already been initialized to be orange juice and milk, respectively. Well, suppose that the goal here is to swap the contents of A and B, right? And sort of intuitive term, the goal is to put this over here, this over here, but obviously if you take the straightforward approach, that kind of looks at first glance, correct, AB goes into A, A goes into B, of course, you get a bit of a mess if this is A, this is B and you try doing--that is disgusting. This--right, there's no way I'm gonna be able to move just the white stuff back over here on to the left. And so, right, this is buggy, right, broken algorithm, if that weren't clear already. So, how do we solve this? Perhaps, in real world, obvious problem. Right, so you have some kind of temporary variable. So if we introduce for instance just an extra 32 bits or whatever the data size is, then we can actually fix this whereby--let's try this again--whereby we have a fresh cup of A, fresh cup of B, and then we're gonna need--whoops let's not do that. 

[ Laughter ] 

>> Alright, fresh cup of B, so now we need of course a temporary variable. If there's any suspense, really, we should know where this is going, right? So--so we have A with orange juice, we put A into temp, okay, that was productive. Now, we have B going into A, so that's over here, little cross contamination but not nearly as bad. And now, temp goes into B, right, so this is kind of a fun game, right? We could do this all day. 

[ Laughter ] 

>> But thank you. Thank you. 

[ Applause ] 

>> So, not bad. We've solved it, but then we broke this, right? You could implement something like this, albeit in C in a function like main, and it would actually work. So you could swap two variables, A and B, if they're integers or whatnot, and it would work in main but then we kinda got arrogant and introduced our own function, swap, whereby we pass in A and B, and we claim this is good practice, right? You factor out sort of conceptual chunks of code into functions that are aptly named, we slapped the label, hierarchical dot decomposition on it, but then this broke, this simple, simple, simple thing of swapping. And what was the simple answer as to why the inputs to this function are not in fact swapped once this function executes? Yeah, so it's this issue of scope. When you pass in two variables, say X and Y, into a function like swap, what's actually passed in is two copies of those things. So, what's really been happening in the case of an actual swap function is if this is, say X and Y, and this exist in main. Well, when you actually call swap, what happens is--I'm gonna have to do a little bit of a magic here. So if this is X, this is Y, this is going to be A, this is going to be B, what actually goes into this cup? Like it's a copy of whatever X is, so it wouldn't really make sense to pour this in here just yet so I'm just gonna cheat and say this is a copy, which pretty much is, half a glass of orange juice. Y goes into B so we have a perfect copy of this. And then what happens in this swap function? Well, the swap function again is actually correct. We have this local temp variable so we go ahead and move A into temp, then we go ahead and move B into A, and then we go ahead and move temp into B, this is all working really well and then we return, and what happens. Well, the slice of memory that was allocated to the swap function is effectively gone. I mean, it's actually still there but the computer just forgets about it so you kind of fast forward. If we ever call another function, we might end up with orange juice or milk by default in those new variables, just because of those remnants, so there actually is something still here in memory but now we have X and Y, we didn't really do anything. So how do we actually solve this? Like just kind of intuitively, if the swap function isn't so much a function but rather it's just a friend and you want to hand your friend, X and Y, and you want that friend to swap X and Y for you, what should you really be handing your friend? Yeah, so ideally, you just hand him or her the actual cups, the originals, X and Y, and let him or her do their thing with an extra temp variable--variable and actually do the swap. So how do you actually hand to a function the original X and Y? Well, what you can do is this. If you have a little scrap paper, and you say to your friend, alright, well, let's suppose now that X--actually this is an unrealistic, suppose that X and Y are kind of bolted to this table in the same sense that a variable X and Y are in one place in RAM, right, the variables aren't gonna move. They are where the computer allocated them in RAM. So, what I can do is I can't really hand my friend these variables because that would be to move them from memory, but I can say, you know what, dear friend, the OJ is at location 1, 2, 3, 4 in RAM and the location of Y is at 4, 5, 6, 7. 

>> I'm just making these numbers up. So what you hand your friend is not X and Y but rather this little chi-chi, the address of X and Y. So that when your friend receives this, he or she can say alright, the X is apparently at location 1, 2, 3, 4, so I need to swap these cups for David so I'm gonna allocate my temporary variable, TMP, I'm gonna go ahead and bring it with me to location 1, 2, 3, 4 in memory. Okay, I'm gonna go ahead now and pour that location's contents into the temporary variable. Now, what am I gonna do? Alright, I need to now go to location 4, 5, 6, 7, so I go there, I find it and I realize, oh okay, I'm gonna move its contents to that same temporary variable. And then what do I have to do? Right, well now I have to go back to the original cup and actually say, alright, now I have to move the--I got completely lost--what goes next? Orange juice goes next. I have to take those contents, move it over, and if I didn't completely screw up this algorithm, they are indeed swapped but they're swapped at the original locations because I actually followed this map, if you will, to my friend's cups of OJ and milk, change those values, albeit with the help of my extra temporary cup and then the original contents are changed. So on the one hand I'm saving memory and that I'm not making copies of milk and orange juice this time, but I am spending some memory--what am I spending in this scenario? Whatever the costs of these pieces of paper are, so it turns out inside of a computer these little slips of paper will cost you generally 32 bits, a pointer, otherwise known as an address in memory will cost you 4 bytes or 32 bits, at least on many computers including the appliance. Now, if you're familiar with your own personal MAC or PC being 64 bit Intel Inside, well that actually means these slips of paper are twice as long, you can fit bigger addresses in them and so pointers can also be 64 bits. And on hardware that's 10 years old, 20 years old, 30 years old, you might have pointers that are just 16 bits or 8 bits, but that's all it is. A pointer is just the address of something in memory and so this seems to now solve the problem. We can just inform someone where these things actually are. So what's this gonna look like? Well, this is before. This is the broken version of swap. But the after version, if I now spoil this here, is going to be that. So we have to introduce a little bit of more syntax and there's actually one tweak we're gonna have to make to main. But if we want to rewrite our swap function so that it does still swap, but it swaps correctly by passing in the address of X and Y or whatever the inputs are called and actually changes them at their original locations, we essentially have to introduce some of these stars, these asterisks before the variable names. So we'll tease apart exactly why that is but let's now see--let's now just toss out a bit of jargon so that when we start seeing these things on the screen, its not so scary. So I have to teach you how to count again, so we've done this a couple of times now, right? You came in presumably knowing decimal, well, 0 through 9, we spent a little bit of time in week 01 binary where you only have two digits, 0 and 1, but we realized in the first week that if I wanna express the number 0 in binary, it's actually pretty easy. I can just say 0. But then remember that we said, you don't typically just use one bit, you typically work in units of at least what? So at least 8, a byte is 8 bits so even though it's a little tedious to write on a chalkboard, really, if you want to represent the number 0 in binary, you would probably store 8 bits, all of which are 0, and it got pretty easy after this. If you wanna represent the number 1, you have seven zeroes and then one 1, but then we actually had to think for the first time, right, in this scenario where we can't just go 00000002 'cause we don't have that digit, we only have a--a binary number of digits, 0 and 1, so we didn't do this, so what was the number 2 in binary? Yeah, so it's actually 1, 0, so we had to move this here, so it's like carrying the 1, this then becomes a 0. And if you then think back to what this is, this is the 1's columns, the 2's, 4's, 8, 16, 32, 64, 128 and so forth, if you have more bits. So even though it's a little longer, it takes more chalk to write these numbers, we can represent all possible integers with which we already are familiar. So this is binary, we already know decimal, let's introduce a bit of hexadecimal. Hexadecimal is gonna be useful here only in so far as some of the numbers we're gonna start seeing on the screen are little less scary although in reality, we're rarely going to care about what the address of some variable is. It could be 12345, could be 4567, it could be a huge number of digits, we just care that these addresses exist, we're not gonna typically care what they actually are. But what we're gonna start seeing in a moment is that a hexadecimal number--so that everyone can see, let me move to just the window up here. What we're gonna see now is a whole lot of numbers that start with 0X, which is just a human convention that says here comes a hexadecimal number. Hexa, in this case, means 16, so in a hexadecimal system, you don't have 2, you don't have 10, you have 16 digits available to you. So how do you go about counting in hexadecimal? Typically, you would start at 0, then you would go up to 1, and thankfully hexadecimal is actually a little easier, now you can go to 2, then you can go to 3, then you can go to 4, 5, 6, 7, 8, 9, and? 

[ Inaudible Remark ] 

>> Ooh, very good, so it's A, it's not 10. Right, so each of these digits, each of these values here can be any of 16 possible values. Of course, once you start counting at 0, you get all the way up to 7, 8, 9, and you wanna say 10, but expressing 10 takes two digits but we need--we have 16 available so the world decided that after you get to 9, you actually go to A, to B, to C, to D, to--oops, to E--and capital or lower case doesn't really matter, its just a convention. And then after you get to F, which is actually the 16th digit if we count all these up, I've just typed out 16 hexadecimal digits, what do you get to after F? One zero, so then you carry the 1, you go from the biggest value which in the normal world is 9, and you carry the 1 and you get to10. In the hexadecimal world, you get to F, then you carry the 1 and you get to 1-0 and then it's 1-1, 1-2, 1-3, 1-4, 1--dot dot dot, 1A, 1B and you can keep counting up like this. So we're gonna start seeing a lot of things in hexadecimal. What's nice about hexadecimal is that each of these digits, and there's 16 possible ones, takes up an exact number of bits. Every hexadecimal digit is 4 bits, so if you write 4 bits followed by 4 bits, you get a byte, so it's very convenient to express numbers as hexadecimal because with just two digits, you can represent a whole byte, whereas in binary, my god, you have to write out eight separate digits. So this is just a more succinct base system. Now we're gonna see it in a very arcane sense this week and next but fast forward toward term's end and we start doing things like web programming and a language called HTML and a language called CSS, Cascading Style Sheets. We're actually gonna re-see--see these things again because they're also used on the web for colors. So just to fast forward, if you ever want to express the color red on a web page, you can either just say "red" which is actually pretty compelling, or you can actually say FF 00, 00, and that should be consistent today, FF, but it doesn't matter. Now, why is this? Little teaser, if you've ever heard of the expression RG--whoops--RGB, red, green, blue, the way we're gonna implement colors in a few weeks and also in the multimedia section of one of our problem sets, FF is kind of the biggest possible hexadecimal digit you can write, it's the--the--before you have to roll over to a 1 and a 0, well this means a lot of red, no green, no blue, so what's the result if you combine these various paint colors? Only you get just red, so that's red. Alright, so that was a bit of a tangent only to emphasize that we're gonna start seeing these things now but it's really not a big deal, it's just a more succinct way of representing numbers than binary is and even then decimal itself is. So let's go ahead and solve something. I'm gonna go ahead and open up a new window in G edit and we're gonna go ahead and try to compare something very simple, so let me go ahead and start from scratch here, though the code is available online and I'm gonna go ahead and say--let's zoom in--let's go ahead and save this as compare1.c and I'm gonna go ahead and say "int main void" alright, and the goal of this program ultimately is to get two strings from the user and then just compare them and say yes or no, the user typed the same word twice or here she did not. So let's go ahead and say we wanna get a string from the user, so printf--let's just do something like "say something" alright, I put a space for aesthetics and now how do I get a string? Little recap--so string S--let's call it S1 'cause I'm gonna need two, so S1 gets--GetString. Okay, not quite, I need to now kind of remind myself, wait a minute, I've just used two functions that I did not invent. So what do I need to include at the top? Okay, so CS50, and also--yup, standard I/O for printf. And now, it should compile okay at this point in the story. So let's assume that the user doesn't try to be difficult. Here she just types in a word, so now let's go ahead and do the same thing, and we could use--whoops! We could use a loop here but it's only a finite number of things, so it's actually pretty reasonable just to copy paste these two lines of code and get something from the user but this time say "say something again". 

>> So I want the user to type in some word, then another word, and now let's go ahead and compare these things. So we know how to do branches in both C and scratch, so let's say if S1 equals equals S2, well let me go ahead and say to the user printf, you typed the same thing, exclamation point, semicolon, else let's say printf, you typed different things exclamation point semicolon. Any bugs jumped out at anyone? Or any syntax bugs? Typos? Okay, feels pretty reasonable, right? So we get a string S1, get a string S2, we then compare it and I didn't make this mistake, right? This could be the first possible mistake, very commonly happens, but I am using equals equals and that's as comparative things, don't assign one to the other, so I print--you print the type the same thing or different thing, so let's compile this. Let me go down to my terminal window here. Let me go ahead and do make compare 1. Looks like you compiled okay. Let me make this a little bigger and let's go ahead and run, compare 1, enter, say something, let's say hello, okay, hello. Now I did it, right? So, let's try this again and make sure maybe--maybe we got it all backwards, alright, so hello and goodbye. Okay, so I'm half right, right? So that's at least correct and let's just do a sanity check. Hello? Hello. Alright, so it feels like there's something going on here. So what might this be? Right, so there's clearly a lead in today, it somehow relates to this stuff but like what really is probably going on underneath the hood that's breaking this program? Yeah? 

[ Inaudible Remark ] 

>> Yeah, so I mean that's kind of what it reduces. To even if you're not quite sure at all on the Y, feels like the only opportunity for the bug here really reduces to just one line. Every other line in this program is pretty straightforward and pretty uninteresting so the decision point must be where the flaw is and in fact it is the case that you cannot compare two strings in this way. Why can we not compare two strings in this way? 'Cause equals equals has worked for like 4 weeks. We've used it for chars and ints, the--it can break with floats and doubles but that's a different issue. Remember when you compare floats and doubles because of imprecision, you might be comparing something that's ever so slightly off but a string is a string. But what is a string really? What did we say "string" is synonymous? 

[ Inaudible Remarks ] 

>> Yeah, it's an array of characters, so really remember if we get rid of the CS50 Library, we lose some of its synonyms that it's defined and recall that we kind of peeled back this layer recently and then quickly hid it again. What string really is just a synonym for is char star. So star, we'd just seen a moment ago is apparently the solution to all of our problems, at least with regard to swapping things. But char we've seen is just a character, so char star means pointer to character, or if we boil it down to its basic definition, char star means the address of a character, so who cares? What is this really doing for us? Well, let me spin this around to a--a blanks--okay, so it's coming with us. Let me spin this around to--alright. Okay. So, what really is char star? Char star is pointer to char but that's just address of char, so what is a string? When you allocate S1 like this, and then assign it a value, what's really going on? Well, on the left hand side, what you're getting in RAM is--let's call it a square, a chunk of memory, that is how many bytes? How many bits? In the appliance, at least, this is gonna be 32 bits. The mere act of having written a star means you're gonna get a pointer, whatever that is, and we can represent this pictorially with the square, and that square represents 32 bits. Now, we've--the wrong picture is like this before. If I ever said int X on the screen with a semicolon, what that means is give me 32 bits and allow me to put a number in there. Well, that's all a pointers, it's just a number. It just so happens that number has special meaning, it represents the address of something in memory, it doesn't represent an integer that I put in memory. So this is what S1 is, so logically, what is S2? S2 is just another such chunk of RAM, another 32 bits. So when I call GetString, we already know at least conceptually that in a string is an array of characters and we know this, we've seen this, right in P set 2 with Caesar and Visonaire and the hacker edition with crack, you've actually, you know, traversed arrays of characters from bracket 0 all the way up to N minus 1. So, what really is a string? What is GetString giving you? Well, what GetString does, and we'll see this week and next, is that when you type in the word hello as the human, GetString is calling a special function, happens to be called malloc, for memory allocation, and what it's doing is it's allocating a chunk of memory for you and it's then putting in whatever letters you type, H, E, L, L, need a little more, O, need a little more actually. What else is GetString probably doing based on past promises? Yeah, it's putting that so-called null character which we draw as backslash 0. So, what has GetString doing all of these weeks? It's been allocating with a new function called malloc, a chunk of RAM, but what's key about this chunk of memory is that it's back to back to back to back, it's not a little over here, little over here, little over here, it's all contiguous and it's of size N plus 1 so that you can fit the string plus this null terminating character. So now, if a pointer I claim is just 32 bits, it's this small, and it's clearly too small to fit H-E-L-L-O backslash 0, what's really going in here, do you think? 

[ Inaudible Remark ] 

>> Yeah, just the address of this. Right, if this is memory, this is RAM, and RAM is like 2 gigabytes worth on your MAC or PC, well we've claimed that you can say this is byte 0, this is byte 1, this is byte 2 billion, if you withdraw this--this rectangle as we keep doing. So these cells in memory, these chunks of bits all have some address. This might be address 1, 2, 3, 4. This might be 1, 2, 3, 5; 1, 2, 3, 6 and so forth, that the--what the numbers are doesn't matter but that they have numbers does matter. So, which of these numbers is probably going inside of this 32 bits? Which one do you need to know? It's really just the first. What's gonna go in here is the number 1, 2, 3, 4 if 1, 2, 3, 4 is the physical address in memory of the first byte. Now, this is sufficient to remember S1 because if I know that the start of the string is at--address 1, 2, 3, 4, how do I find the end of the string? 

[ Inaudible Remark ] 

>> Exactly! You just very simply, honestly, four loop, Y loop, whatever your preference is, you start at that address and you check here. Is this the null character? Is this the null character? Is this the null character? Is this the null character? Is this the null character? Is this--aha, it is! That's the end of the string. So the function printf does something pretty much like that. It looks at every character back to back to back, just like you did with Caesar and Visionaire, and as soon as printf sees, "wait a minute, this is not a printable character, this is backslash 0," all 0 bits, it stops printing, it returns and voila, you see some text--string of text on the screen. So now, back to the flawed program. This S1 equals equals S2, what is it really doing then if we now understand what's going on underneath the hood? What's it comparing? S1 and S2, so let's finish this picture. I typed in goodbye before so what do I need for goodbye? Well, I need another chunk of memory that's gonna have G-O-O--bye backslash 0, and now let me just break it up pictorially with lines. So now I'm gonna get back--this is I said before maybe address 4, 5, 6, 7, so what's going in S2, 4, 5, 6, 7, again numbers don't matter but that a number is there matters. So what am I comparing? You know, I'm kind of comparing two--I am comparing two distinct addresses. So even if I typed in "hello here, hello here," they're gonna end up at different locations in memory. The computer is not smart enough to realize, "oh you typed in hello twice, I'm gonna optimize this and just reuse that same memory." No, GetString is gonna use a different chunk of memory, it's gonna--thus be at a different place in RAM, so what are you comparing? Two addresses, and so, yup, they are wrong. Those are not correct. So let's see this actually, so S1 equals equals S2, let's actually print this out. So let me just make some temporary space here, printf, and I'm gonna say, "S1 is percent S comma S1 semicolon," and actually I may put a new line in there, and now I'm gonna also say, "S2 is percent S paste in S2," and let me temporary--I'll leave that in there. So S1 is something, S2 is something, let's go ahead and compile this again, so let me go down to my terminal window. We go ahead and do make compare 1, compiled okay, let me run compare. I'm gonna type in hello, I'm gonna type in hello again just as I started with, enter, okay. So at least, they look the same, but I'm saying I typed different things. So let's look underneath the hood. S1 is not percent S, that's actually printed as what it really is, a pointer, percent P. So this is gonna show me the address hopefully, so let's scroll down here, and now let me go ahead and recompile, let me rerun it, hello, hello, and wow. So the numbers are not as simplistic as 1, 2, 3, 4, they're much bigger, they're 32 bits but now you see that they're so close, right? But they're actually a few bytes off. They're 10 bytes off from each other. 

>> Notice the 3 for S1 and then the 4 for the next? So there's a bit of an offset there, and that's just because GetString has put one of the strings over here, the other one over here, and they're pretty close but they're not the same. So how might we go about solving this? What do we have to do in order to compare two strings? Sorry? 

[ Inaudible Remark ] 

>> A to I? Okay, so if we used A to I, ASCII to integer, we could com--we could convert a string to a number. But the problem here is that H-E-L-L-O is not a number, so this is a little distinct from Caesar where you might have typed in 13 and even though that is a string you wanted as a number, hello is a string, it's not a number. So not quite, what could we do? 

[ Inaudible Remark ] 

>> Well, yes. So there is in fact a string compare function. Very, very good--googling? Okay, very good. 

[ Laughter ] 

>> So there's a string compare function. Alright, so let's actually borrow that, so I actually don't remember how to use it, so let me go to cs50.net/resources, let me go to the little [inaudible] up here, the C reference. This has something to do with strings, so I'm gonna click down here on the bottom left corner, standard C-string in character functions, string compare--okay, these are all kind of abbreviated as is the convention. String CMP, that sounds like string compare. Alright, so it looks like in string.h, there's a function called strcmp, S-T-R-C-M-P, that stands for string comparison. It apparently takes in two arguments, string 1 and string 2. They are indeed strings, because again char star is synonymous with string. There is a new keyword here but you can probably guess what this means, const--so it means constant. So C actually has this mechanism and you might have seen this in man pages before where you can say just to be extra paranoid that you don't screw up the user's inputs, you can say, I'm gonna receive str1 and str2 but I'm specifying they're constant. I am not gonna take any liberties with these like a function like swap might and move them around, I'm gonna promise to whoever's calling me that I will take in your parameters as constant, even though you're handing them to me by pointer as suggested by the star. So this is just a convention that's good practice, so no need to be too distracted by it. So here's the summary, the function strcmp compares string 1 and string 2, and then returns. So this is interesting, it doesn't return a Boolean, it actually returns 1 of 3 values. So string comparison will return 0 if what's the case? 

[ Inaudible Remark ] 

>> If str1 is equals to str2. And in this context of the documentation, it means literally equal like the words are the same in the--in a human sense. But it will also, just to help you out, return less than 0, it probably returns negative 1 but the programmer didn't even wanna make that commitment. He or she just said, I will commit to returning a none--a negative number to you. So if it returns less than 0, that means str1 is less than str2 and if the function returns greater than 0, that means str1 is greater than string 2. What does it mean for string though do you think to be less than or greater than? 

[ Inaudible Remark ] 

>> What's that? 

[ Inaudible Remark ] 

>> Oh not--so not about [inaudible], so not correct but it--it could be the sum of the values 'cause these are just ASCII characters, but that's not it, its even more simplistic than that. 

[ Inaudible Remark ] 

>> What's that? 

[ Inaudible Remark ] 

>> Could be the length but even that's not--not on point here. 

[ Inaudible Remark ] 

>> Alphabetical. That's what it's doing. Right, so this is actually very compelling. If you want a function that compares two strings alphabetically, you actually have it in strcmp already, it doesn't just tell you equal or not equal, it tells you which one comes alphabetically first, less than, or which one comes alphabetically later, greater than, so you can compare two strings. So now, already we have a function that could apparently allow us to implement something like binary search, which again we've done on the board here with Shawn. That doesn't just search for numbers. Now we could actually search for words, words like Mike Smith in a phonebook implemented in a computer program. So let's try this. Apparently to use this, I have to include string.h and then I have to know about these return values, so let's go back here. Alright, so rather than do equal equals, I'm gonna go ahead and get rid of my temporary pointer printing and now I'm gonna say, not this, I wanna compare, so I wanna call strcmp of S1, S2, if it what? 

[ Inaudible Remark ] 

>> Equals equals 0, right? And again, that's just from the documentation. If it equals 0, that means they're the same. So you know what, I don't care about alphabetical less than or greater than here, I'm gonna keep it simple and just say yes or no, same or different. So if str compare S1 S2 equals equals 0, hopefully this will fix my bug. So let me scroll down here, let me go--error? 

[ Inaudible Remark ] 

>> Good! So--good--good point. This is gonna break when I try to compile it, what error am I about to see? Yeah, so something undeclared, right? So let's do a quick--quick recompile make--implicit declaration of function strcmp, what does that mean? Well, that just means that I forgot to declare it. I don't need to declare it. I need to include whatever file it's actually declared in, so let's go ahead and type string.h there, scroll back down, let me recompile with make, that seems to be good. Let's compare. Let's say hello. Make sure I didn't break the only part that was working. Okay, different things, that's good. Hello, hello--finally. It's actually working. So there is this magical function, str compare, but you don't need these magical functions, right? Printf is useful, it's kind of non-obvious to me how I would actually make stuff appear on a computer monitor. But comparing two strings, if you understand what's going on underneath the hood, should actually be pretty tenable. So let's actually do this. I'm gonna say, you are not allowed to use strcmp--compare. So for instance, string.h, we accidentally deleted it along with string.c. So just conceptually, how can we take in two inputs like string 1 and string 2, and compare them, and say yes or no, same or different. Yeah? 

[ Inaudible Remark ] 

>> Perfect! So we could just compare each character and what do we do as we compare? 

[ Inaudible Remark ] 

>> What's that? 

[ Inaudible Remark ] 

>> So iterate through and I start comparing the left end against the left end, and then here, and then here, and then here, and at what point can I make a decision or just quit altogether? 

[ Inaudible Remark ] 

>> So if you see null at the end or if--if it's a different in any location. So let's try this. Let me go ahead and say, okay, 4 int, I gets 0. I is let's say less than the string length of S1, I++, let me go in here now and say, okay so if--how do I express if the leftmost character of either is different? Do something, if--? Think back to Caesar, if S1-- 

[ Inaudible Remark ] 

>> I'll say it was good so if--I could just say is F--S1 bracket I does not equal S2 bracket I, so just compare their Ith locations, what can I do? Well I can say, you typed different things, semicolon, and then I can quit, right? So let's say return 1, 'cause something bad happened, that's an error. But not quite, right? I just made a bug. Right, so we need the brackets. Otherwise only one of those lines will actually get executed, so let me go back in here, let me do this. And now, again, I feel like I got some issue still but I'm gonna claim that if I at least get way down here, is it pretty safe to assume that--1, 2, 3, 4, you--so this is kind of my catchall. If I get all the way through the loop and I don't actually return prematurely and yell at the user for having typed different things, I can claim you type the same thing, right? 

[ Inaudible Remark ] 

>> Alright, why not? What's flawed here? 

[ Inaudible Remark ] 

>> I still need what at the top? 

[ Inaudible Remark ] 

>> So I actually do need string.h, but why this time? 

[ Inaudible Remark ] 

>> Damn it! Okay. So let's--let--I'll give you that back. Right, so, I don't really feel like implementing string length just yet though we could. Right, how do you implement string length frankly? 

[ Inaudible Remark ] 

>> Yeah, just start at the beginning and then count, count, count, count, count, so we'll come back to that or maybe assign it on the quiz or something, but it's pretty easy to implement string length if all you have to do is take in a string, start iterating over each of its characters, and the moment you hit backslash 0, stop counting, you're done, you found the end of the string. Alright, so let's--we'll give you str length but not string compare, right? So I'll take away Google from you. How about that? So now, we can't string compare it with the function but we can do it this way but I actually think this is flawed for a couple of reasons still. Yeah? 

[ Inaudible Remark ] 

>> Yeah. 

[ Inaudible Remark ] 

>> Exactly. Let's see what happens, maybe nothing, maybe something. Let's see what happens if I save it as is, I go back here and I recompile and I type in S1 that's pretty long, but an S2 that's pretty short, right? Because how many times am I gonna iterate according to my four loop at the moment? The length of S1, so it feels like there's a bad opportunity here, so let's run this. So say something like I am a very long string hello, okay, S2--okay, it worked. Not broken--correct. Correctness 5. Why is it actually flawed? Can we still break this? I just need to stall for a moment here. Okay. 

[ Inaudible Remark ] 

>> What's that? 

[ Inaudible Remark ] 

>> Oh okay, so cat and cat--oh yeah, you know, I'm kind of an idiot, right? So what are we doing? I'm comparing a really long string against a really short string, but guess where they differ? 

>> At the first location, right? So, not a problem, I'm actually catching that at the beginning. So let's--let's control C, let's start this over, and let's say caterpillar, enter, did I spell that right? 

[ Laughter ] 

>> Really? Hang on, I'm giving us--there we go. I--I had to take to go back. Okay, so--damn it, I forgot what I copy pasted. Caterpillar, alright. Come on, that is not a word I've had to type very often in life, alright. So here we go, caterpillar, okay, cat. Switched them around, cat, caterpillar. [Laughter] That was only a surprise to make. So, what's going on here? Well, now, we have a substring issue, right? So cat is obviously a substring, a prefix of caterpillar, and so they do in fact match but if I'm only iterating from the length of 0 to the length of--rather, from 0 to the length of S1, I'm only gonna compare like the first three characters, C-A-T, and then I'm gonna call it quits. But in the opposite direction, it didn't actually happen this time, but if I [inaudible] around enough it might. What could happen if the differences of these strings is long enough? So I'm gonna start comparing C-A-T, but what am I gonna hit once I get to the fourth location, if the one word is caterpillar and it's first, and then cat is the second word. So it null, but null is not equal to C-A-T-E, it's not equal to E, and so it actually still catches that scenario and we don't overstep the bounds of the array because we at least stopped there. But flip them around and we do confuse cat for caterpillar, or vice versa. So what's a quick fix here? What's one way we could fix this? So we could get the longer length first, so we could say something like if string length of S1 is less than string length of S2, then we could go ahead and say the maximum length is equal to str length of S2, but I'm kinda pushing it now with design issues, let me put max here. Is this correct? And then I can say, let's change this to max. So this kind of helps but I'm still kind of doing something stupid. What's the quickest way to just check if two strings are not the same? 

[ Inaudible Remark ] 

>> Right, if it's not the same length, right? I could actually have a check like if string length of S1 does not equal the string length of S2, you don't need to iterate over anything, you can just immediately yell at the user and say, obviously not the same, right, because they are different lengths, and then so long as I have my curly braces here, I can return one here, and then if I get to this point here, now does it matter which string length I use? So now, I can use either because they're equivalent so there's a couple optimizations that are possible here, but now I can do response to the user in a couple different ways. I can say they are obviously the different because they're not even the same length or same length but different things. Now, hereto, not perfect design, so correct I think at this point but where is an opportunity for better design in that loop? Yeah? 

[ Inaudible Remark ] 

>> Okay, so what if we did less than or equal to string length of S1, what would that have us comparing, in addition to the string itself? 

[ Inaudible Remark ] 

>> So it would have us comparing the null character, and now just to push back, is that necessary? So this is not incorrect but it doesn't really gain as much, and if anything, we're spending an extra bit of time now because if we already know these lengths of the strings are the same, and we already know that in C, every string ends with backslash 0, it's sort of a waste of our time to bother comparing the backslash 0, we can stop one short of that, so not incorrect but we're just wasting an extra iteration. Yeah? 

[ Inaudible Remark ] 

>> Yeah. 

[ Inaudible Remark ] 

>> Exactly. So this is subtle and frankly it's--it's kind of elegant, right? We've kind of preached the idea of making these succinct lines of code that are just very straightforward, but the problem here is that the condition in a four loop is evaluated not once, but on every iteration, right? 'Cause you have to check, is it time to break? Is it time to break? Is it time to break? So, every time you check that condition, you're comparing I against the current length of S1, but what is the current length of S1 on every iteration, what's the same as the very first one. So a common technique would be, don't call functions in your conditions, rather move this over here, introduce something like N and assign it to the string length and then iterate from I to N. Even here, there's still an opportunity for slight improvement 'cause what am I doing that's still slightly foolish? 

[ Inaudible Remark ] 

>> I called it earlier as well. So, you know, not as an expensive of a mistake. I mean frankly the code is kind of elegant here and is the strings are pretty short, it doesn't matter if I'm spending a few extra cycles, but still I should probably compute string length of S1 once up top. Put the result perhaps in a variable and then actually proceed with the comparisons. So, all of these though can go away, of course, if we just use the string compare function. But what's it doing and what's a lot of these functions doing in the documentation? It's just implementing these fundamentals that years ago people got tired of every time they wanna compare strings, writing like 10 lines of code, you collapse it to one line, str compare, and thus, do you hopefully start to see really the motivation for writing things as functions and not just writing everything in main. So I've got some more craziness in a bit, let's take a 5-minute break, and return with the damage we can do with pointers. 

[ Pause ] 

>> Alright. So we're back, first, any questions on pointers and just what they are? Alright. So let's now start pulling off some of these training wheels. So in the CS50 Library, we've been taking these functions for granted and they've been useful in that they handle all of the getting of input from the user from the keyboard. They even do some error checking whereby GetInt if you type in as we've done monkey, it will say retry, retry, retry, until the user actually gives you an integer. But let's focus now on GetString because clearly all this time, GetString has been doing some interesting stuff with us with regard to pointers. This is an excerpt from cs50.h, the actual file you've been including. This file lives in a special directory in the appliance as it would on a typical computer system. For the curious, it happens to live in a directory called slash user slash include, this is where a lot of dot H files are. In fact, if you recall in a Linux environment, if you type LS that will list the contents of a directory and if I do slash users slash include, notice there's a whole bunch of dot H files in my appliance. Among them over there on the left is cs50.h. Not all of them are necessarily in those same directory but we might see a couple of familiar ones if we go to--I don't think we've used any of these just yet. Sorry, oh and crypt, so we've used crypt there for the Hacker Edition and then also-- 

[ Inaudible Remark ] 

>> --the CS50, obviously, dot H the header file. So here it is up at the top, the CS50 header file only contains that synonym for string, char star. Remember we saw this a few days ago. If I scroll all the way to the top, we introduced this key word, typedef, which we won't use terribly often. We will use it for 1 or 2 P sets, but typedef just defines a type that char star is henceforth gonna be known as string, that's where that came from. But if I scroll now down to the documentation for GetString, we pretty much been taking for granted how it works but let's actually read CS50's official documentation. It says, GetString function reads a line of text from standard input. Standard input just generally means the keyboard, the user, and returns it as a string, otherwise known as char star, sends trailing new line character. So even though all this time we humans have been hitting enter to give the computer a--a string, we presumptuously, CS50, we just throw away that backslash N, because we don't want all of your strings to have a carriage--to have a new line in it. So notice now if the user inputs only backslash N, it returns this thing, so double quote, double unquote means the empty string. It is a string, how many bytes are taken up to represent quote, unquote? What do you think? Zero or one? Any zero? One? Okay, so it is actually one. Why? Well, every string, even if it's of some crazy 0 length, still has to have the backslash 0. Why? We have to be able to know how long this string is so that you're not just handed some garbage value. But just in case it turns out that there is and we've seen this word before, NULL, in all caps, that represents, as of today, a pointer--a special pointer, the pointer to memory address zero. So a very useful thing about computers is that you as programmers are never ever, ever allowed to store anything at a location number 0 in a computer's RAM. This is simply because so that there's at least one place in memory that is guaranteed not to contain a legitimate value, so that we can reserve that address, 0, as a special constant. So henceforth, if we ever wanna check if a pointer is actually legitimate or not, that is, is there a value there or is it maybe some garbage value, we're gonna start checking for NULL in all caps. So this function GetString apparently returns NULL if there's an error or if there's no user input whatsoever. 

>> Now this is hard to induce but it turns out for the curious if you hit control D whenever asked by a program for input, sometimes just hitting control D will crash a program because control D typically means I give you no input but stop asking me. And that is different from hitting enter, which does give you some input, it gives you backslash N. Now we have to catch this and we soon will. So, leading and trailing white space is not ignored. So this is to say if the user hit space, space, space H-E-L-L-O, you're gonna get back to the space bar characters and so forth. So ultimately we'll see eventually the implementation of GetString but at least it seems that sometimes this function might return null. So we're gonna have to start checking for this. So let's do the following, I'm gonna go ahead and create a new file, we'll call it copy1.c, and let's now try to oops, let's try to actually copy a string rather than just compare. So let me zoom in. I'm gonna go ahead and do int, main, void and I'm gonna go ahead and let's say first ask the user as before, say something. So we have something to play with. And then I'm gonna do a no more string char star, S1 gets GetString, alright? And now I need to start getting into a new habit. We've not mentioned as much, not expected as much, but now that we know that GetString could return a null pointer where null is bad, we better start checking for this. So if S1 equals equals the special constant null, we should probably do something like printf, an error occurred and then backslash N and then return 1, or whatever nonzero value. Alright, so what--when might a function like GetString ever return an error? When might it return null rather to signify an error, what could go wrong? I can think of at least two things. One I already answered. Control D, so whatever that thing is, control D apparently can do bad things and so null might get return. Now this is not a terribly worrisome case because most humans are not gonna accidentally hit control D. But again, there's another more reasonable approach. How else could GetString fail potentially? If the string is too long, right? If I spend half an hour typing letters at the keyboard such that I type in like 2 billion and 1 characters or 4 billion and 1 characters, more characters than I have bytes of RAM, it's possible that GetString, which happens to use that function called malloc, memory allocation, which we'll see before long, what if they just can't give me two gigabytes worth of memory just to fit some user's crazy long essay? So how does GetString signify that error condition? It's got to return a special value. What's that special value? It's gonna go ahead and return null. Now in the past we might have returned false to signify that something went wrong. But if GetString is defined as returning a string but a string we now know is just a pointer, we can't return false because false is what? It's a Boolean. we have to return a pointer so we need to see that it gives us a special pointer that's reserved for error conditions, and so it commits to returning null so now we can check for this. So this is as simple as this is, this failure to check with an if condition, something as simple as this in a program is a huge source of bugs in people's programs, as well as relatedly a huge source of hacking opportunities if you are not checking the values of pointers and are potentially doing things with pointers that are just garbage values. And we've already talked about the notion of garbage values, they apply to pointers as well. Alright, so now let's go ahead and make a copy of this thing. I wanna make a copy of S2--of S1 and call it S2. So let me say char star S2 gets S1. Alright, on the left hand side I'm allocating a pointer called S2, I'm assigning it the value of S1. And now I'm gonna go ahead and do a printf. S1 is percent S backslash N and paste in S1. And let me do the same for S2. S2 is percent S, S2, okay. And let's try this. Let me scroll down, let me go ahead and do make copy 1. Oh my god, so many errors, what's wrong here? So, a couple of things, right? I skipped this part, so include cs50.h, alright, now I need that for sure for GetString, let's see if that solved all my problems. This does not mean I have six of my problems rather I could just have one that has a cascading effect. Make copy 1, no, printf, damn it! I need that one. Alright, so it's probably just a good idea to get into the habit of almost always using these things. Let's see if I've solved all my problems now. Make copy 1, whew look out. So copy 1, enter, say something, hello. Oh nice, my copy program works. Okay, but not quite. Let's try something to prove that I'm actually not so good at this. So let me try to capitalize--capit--I don't need to Google this. Capitalize just one of the strings. And I'm gonna do this as follows. I'm gonna capitalize just S2, and how do I capitalize it? Well, I'm gonna capitalize its first letter like this by 2 upper, can I do this? S2 bracket 0, you might have used this function before in Caesar or Visionare or if not 2 upper should. Just uppercase this one--a letter. But let's see if I need to include anything. Let's try to run this. Implicit declaration of function 2 upper, so no problem. So two upper is a string function. So let's go back to the same documentation. Two upper, C type, so here's another file. We actually glimpsed that a moment ago in my directory, but I need to do C type. So let's go up here. Include ctype.h. Now let's zoom in on the bottom, try recompiling one last time, it looks good. Now let's run copy 1, hello? Hmm, now kind of interesting but hopefully less interesting having just spent all the time talking about what a pointer is and what a string is. Why is this program flawed? I capitalized S2 but apparently the computer decided to capitalize S1 also, but why is that? 

>> They point to the same place. 

>> They point to the same place, exactly. So what's really happened? Well, I didn't type goodbye, so let me delete this part of that story. I did type in H-E-L-L-O. I did create S1 and S1 is apparently pointing there. But when I do S1 equal sign S2, or rather S2 equal sign S1, what's really happening? I'm not copying all of this. I'm literally copying S1 putting it into S2. So what ends up getting written here? 1, 2, 3, 4, 1, 2, 3, 4. So when I go to S2 bracket 0, bracket 0 is at what memory address? 1, 2, 3, 4. Where is that on the board? Well, it's over here. So what did I change the H to? A capital H, but wait a minute that's the exact same chunk of memory that represents the S1 string, but just so happens that now I do have copies but have copies of the pointer, not copies of the whole string. Now as an aside, I've been making up 1, 2, 3, 4; 4, 5, 6, 7. But the reality is these numbers are almost always uninteresting to us. So generally what the world does as we'll see in clay before long is the world just says if it's a pointer, let's just go with something much more intuitive. If that's a pointer, just point on the board to whatever it is you're pointing to in memory. So we would typically draw pointers as arrows pointing at something but realize that what's really inside of here is a number like 1, 2, 3, 4. But it's kind of uninteresting to keep making up fake numbers. So we'll typically just draw arrows. So just conceptually, what's the solution to this copy problem? How do we actually duplicate the string? What do we need to do in order to make a genuine copy of S1 and call it S2? Sorry? Yeah. 

>> Individually copy each character. 

>> Okay, we need to individually copy each character, but into what? 

>> Into like another-- 

>> Yeah, so really what we need to fix this problem and finish this story is we need the same amount of memory that has as many locations. I don't know what's here by default, 'cause again just as with variables when you allocate memory, you're gonna get a whole bunch of garbage, it's up to you to populate it with actual values. So that begs the question, how do you create an array of a specific size dynamically? How do you dynamically allocate memory? Well, it's actually pretty easy. Let me go up here and I'm still gonna check null. But before I do this copy, what I'm gonna do is this. I'm gonna first say S2 is not S1 but rather char star S2 equals the return value of this fancy new function malloc. And just to oversimplify for a moment, I need 1, 2, 3, 4, 5, technically 6 bytes. So I'm gonna hard code this ever so briefly. This function, malloc, called as follows, will hand me back 6 bytes of memory which pictorially would look like this. They're at some other memory address, but how do I know at what memory address malloc allocated me memory? Via it's return value. So what malloc does if it can is it allocates memory for you. It then returns to you the address of which of the bytes of memory it just allocated, which of the six bytes. The first, and it's gonna be up to me to make sure I put a special sentinel value, special backslash 0 at the very end so that I don't forget how long this chunk of memory is. Now I probably shouldn't hard code something like 6. How can I decide dynamically what the length of S1 actually is? Yeah, so a sterling of S1 like this. Okay, so plus 1. So here it's okay to hard code that plus 1 'cause you always need plus 1 byte for that special backslash 0. So to be clear, when you call string length as you've done for a week or so perhaps in your own code, it returns the actual human length of the string, it does not include the backslash 0. 

>> So when you're allocating memory for this, you actually do need to do so. Now as an aside, the official version of this code that's posted on the course's website actually has a slight modification, whereby I have multiplied by the size of a char. So this is not strictly necessary 'cause on almost every system a string is composed of chars and a char is 1 byte. But just in case my program ends up getting used for 20 years and is eventually used on a computer that actually uses 2 bytes for characters. And this has already started to happen. There's a system--there's an alternative to ASCII called Unicode that actually uses at least 2 bytes. This is particularly helpful for Asian languages. We're having only 8 bits. It's insufficient to represent all possible symbols. So just in case this program is run on the computer where characters are more than 1 byte, there's a special operator called sizeof that if I call it and pass in the name of the data type I'm trying to allocate memory for, it will say 1. Or maybe on fancier systems, it will return 2. So technically, even though this is getting a little scarier versus the just malloc 6 that I hard coded, this is now a bit more rigorous. So now I have memory, now I have this chunk of memory. How do I actually populate it with values? Well, how do I go about copying one string into another? What about for int I get 0, I is less than--oh wait, let's not make that same mistake again. N gets the string length of S1. I is less than N, I++, and what do I wanna do? S2 bracket I gets S1 bracket I, and is this quite right? Slight--ever so subtle bug. Okay, besides saving it, thank you. What else? I wanna click here, I wanna fix this here, I actually--I need to fix--I can fix this in a couple ways. I probably wanna do something like this, 'cause I also want to copy that backslash 0, so it's okay in this case to go one extra because I know it's a backslash 0, so now I've actually copied the string. But something could still go wrong. What happens if what the user is trying to copy is again a huge essay that they've typed in? I need to check the value of some--of some variable here. If S2 equals equals NULL, what I should really do here is something like printf, an error occurred EG out of memory or something like that. And then I wanna go ahead and return something other than 0. So these are the habits you now need to get into. Anytime you're dealing with memory, it's gonna be super easy to have your code, segmentation fault, core dump, all because of misuse of these things called pointers. So now if I capitalize, should this work? If I leave the rest of my code alone? Let's go down here, let's go ahead and rerun--copy, big syntax error, implicit [inaudible] of string length, so I'm back to making those old mistakes. Alright, include string.h 'cause I remember where that came from, and it's telling me I have a syntax error somewhere? 

[ Inaudible Remark ] 

>> Semicolon, right there. Perfect. Alright, so now let me try recompiling, let me rerun copy, hello, enter, nice. So now, one is lower case, the original; one is upper case, the new one. So what's really going on here? So, here's an example that actually has a bit more syntax and this time it uses int. So, here's a function main, and let's say we allocate a variable X but it's this time not an int, it's gonna be a pointer to a value X--sorry, rather a pointer to an integer, so let me erase this and draw a picture consistent with what's going on here, alright. So, what's going on? When I declare int star X, that's essentially like saying, give me a pointer, call it X and I don't know what's there, so it's some garbage value, question mark. Then I do the same thing, give me a pointer , call it Y, it's gonna be the same size but a different chunk of memory, question mark, 'cause I don't know what it is. Now, X gets malloc size of int, so malloc again allocates memory, how much memory is gonna come back? How many bytes based on this call? How big is an int? So an int is 32 bits, otherwise known as 4 bytes, that by coincidence is identical to the size of a pointer in the appliance, so how do I draw this? Well, I keep saying that 4 bytes or 32 bits is a square, so this is what malloc just gave me. And what got stored here in X, according to the third line of code? So malloc allocated this one chunk of memory which is 32 bits--or 4 bytes. Why? Because the size of an int is 32 bits. What does malloc return in general? The memory address, the first byte of whatever chunk of memory it just allocated, so maybe it's 1, 2, 3, 4, so we would draw 1, 2, 3, 4 here. But again, we're kind of past that point. We can now just generalize and say, what is an X? It's a pointer to that chunk of memory. Alright, and how about Y? What's happening now with Y? Well, let's see. Well, it turns out that this star notation can actually be used in two different ways. One, to say that something is a pointer, and another to say go there. So before when I had this piece of paper and I said 1, 2, 3, 4, I handed it to my friend, when my friend went there, you expressed the idea of going to a pointer by saying star pointer name. You don't specify the keyword int again, you just say star X, means go to the address of X that's on that piece of paper, so here's X, when I say go there, you can literally think of it as following the arrow to see where you end up and what do I put here according to the line of code? 42. So that int goes there. But here's a bad thing, now I'm saying star Y gets 13. So here's Y, where do I go? Question mark, you can now think of as being like, you know, who knows where this actually is, it's pointing to some garbage value. So where am I gonna go? I'm gonna follow this pointer and then bam! I'm gonna put 13 here but that is not my memory. What happens when you touch memory that's not yours? Segmentation fault, right, core dump, if you touch memory that's not yours. Now, this is bad, so if I then do though Y equals X, Y equals X, how does the picture change? Y equals X? So I change its garbage value, I have no idea where that was even going. So if I say Y equals X, well, this is literally like saying if this is 1, 2, 3, 4, put 1, 2, 3, 4 in Y. But pictorially, it's just saying point to the same value. So now if I say star Y gets 13, let's bring it home, what changes on the board? Star Y gets 13? 42 becomes 13, so what is the value of star Y? 13. What is the value of star X? Also 13. But this program might not even gotten this far because of this. Lets see what Binky has to say on the same subject. 

[ Pause ] 

[ Music ] 

>> Hey Binky, wake up, its time for Pointer Fun. 

>> What's that? Learn about pointers? Oh goody! 

>> Well, to get started, I guess we're gonna need a couple pointers. 

>> Okay, this code allocates two pointers which can point to integers. 

>> Okay, well, I see the two pointers but they don't seem to be pointing to anything. 

>> That's right. Initially, pointers don't point to anything. The things they point to are called pointees and so [inaudible] is a separate step. 

>> Oh right, right, I knew that. The pointees are separate or--so, how do you allocate a pointee? 

>> Okay, well, this code allocates a new integer pointee and this part sets extra point to it. 

>> Hey that looks better, so make it do something. 

>> Okay, I'll dereference the pointer X to store the number 42 into its pointee. For this trick, I'll need my magic wand of dereferencing. 

>> You're magic wand of dereferencing? That--that's great. 

>> This is what the code looks like. I'll just set up the number and-- 

>> Hey look, there it goes. So doing a dereference on X, follows the own, access its pointee, in this case to store 42 in there. Hey, try using it to store the number 13 through the other pointer, Y. 

>> Okay, I'll just go over here to Y and get the number 13 set up and then take the wand of dereferencing and just--whoa! 

>> Oh hey, that didn't work. Say, Binky, I don't think that dereferencing Y is a good idea 'cause you know setting up the pointee is a separate step and I don't think we ever did it. 

>> Good point. 

>> Yeah, we--we allocated the pointer Y, but we never set it to point to a pointee. 

>> Very observant. 

>> Hey, you're looking good there, Binky. Can you fix it so that Y points to the same pointee as X? 

>> Sure, I'll use my magic wand of pointer assignment. 

>> Is that gonna be a problem like before? 

>> No, this doesn't touch the pointees. It does changes one pointer to point to the same thing as another. 

>> Oh I see, now Y points to the same place as X. So--so wait! Now, Y is fixed, it has a pointee, so you can try the wand of dereferencing again to send the 13 over. 

>> Okay, here goes. 

>> Hey look at that. Now, dereferencing works on Y, and because the pointers are sharing that one pointee, they both see the 13. 

>> Yeah, sharing--whatever. So are we gonna switch places now? 

>> Oh look, we're out of time. 

>> But-- 

>> Just remember the three pointer rules. Number 1, the basic structure is that you have a pointer and it points over to a pointee, but the pointer and pointee are separate, and the common error is to set up a pointer but to forget to give it a pointee. Number 2, pointer dereferencing starts at the pointer and follows its arrow over to access its pointee. As we all know, this only works if there is a pointee, which kind of gets back to rule number 1. Number 3, pointer assignment takes one pointer and changes it to point to the same pointee as another pointer, so after the assignment, the two pointers will point to the same pointee, sometimes that's called sharing, and that's all there is to it really. Bye-bye now. 

>> This is Binky. This is Nick Parlante of Stanford University. This is CS50. We will see you next week. 

[ Applause ]