>> All right, welcome to CS 50. This is week 4. So it's always kind of fun when you pick up the paper in the morning and there's actually something apropos to class that day. And this is an article on the front page of today's Wall Street Journal. And those of you who have a computer know that your computer has a keyboard, but it's called a QWERTY keyboard for some reason. Why is it called the QWERTY keyboard? Okay, can't actually make out any of those comments, but presumably because it spells QWERTY, Q-W-E-R-T-Y. Here's a random keyboard from the Internet, Q-W-E-R-T-Y. If you've never heard this expression, you have a QWERTY keyboard just because that's the most phonetically pronounceable string of characters someone discovered many years ago. But there's this alternative called Dvorak, which some people, perhaps those among the more comfortable, sometimes use. Personally, I never got the hang of this. But this article here in the Wall Street Journal is about this. Quick little excerpt which amused me. "The Dvorak keyboard layout, though around for decades, is as little known among the general typing population as it is passionately embraced by its devotees. It is to the keyboard what Esperanto is to language and what Betamax is to video tape. Fans say it lets them type at blazing fast speeds, with less strain on their hands and wrists than typing on a conventional keyboard." And this was, I thought, the punch line: "Nobody else cares." And if you actually then follow up, let's see, turn to Page A-4, there's more discussion on this. And the context was there's a bunch of people sort of bemoaning the fact that smart phones these days, iPhones and whatnot, apparently don't support Dvorak, they've kind of optimized for the rest of the world. The Wall Street Journal really went to town on people here. "Smart phones seem dumb to Dvorak fans," but let's see -- this was kind of cute. 
So the final quote in this article is, "I think it's really sad," says one Dvorak devotee, "that Dvorak isn't on smart phones, but it's something I'll have to live with. I'm a Dvorak typist in a QWERTY world." So those of you -- some more geek humor, those of you who have travelled internationally or are from abroad know that there are certainly differences in certain countries. Most recently I was in France and sat down in the hotel's kiosk and got so frustrated, because I couldn't find, like, the forward slash symbol, because it was in just a different place. But there's yet other keyboards out there. I just did actually a quick Google image search and found this one, which actually some of my more advanced friends have. It's this crazy looking thing, which frankly I cannot even send an e-mail effectively on. But it looks a little like this. And frankly, the motivation -- laughing at me, huh? There -- oh! So frankly, I mean, I don't mean to poke fun at these things, because even I have had issues with RSI, repetitive strain issues, that actually can be alleviated to some extent by devices like this. But again, I just never wrapped my own mind around this. But the neatest keyboard I think I ever saw was owned by a really geeky MIT friend of mine in graduate school, whereby in his lab, I sat down, went to just send an e-mail or something, looked down at this keyboard, looked like a very standard, black Dell keyboard, keys not like this, but a normal keyboard, but none of them were labelled. So there was no Q W E R T, no ABC, there were no labels on this keyboard. And it was frankly the most effective way I've ever seen of keeping people off your computer. Because -- but it was also this really interesting, like, intellectual challenge, because it reminded me of the end of Star Wars where Luke kind of has to put away his machine and just go on faith alone. I mean, that was the only way I was able to type. I couldn't look down and type my e-mail. 
I had to just literally, like, close my eyes and do it by rote, from memory, and I actually got through it. But it was the neatest thing I think I have ever seen. Better yet, I've also seen online keyboards that are optimized for moving letters around just to mess with people. So even though it's actually a normal keyboard, people don't realize what the labels actually mean. So enough about that. In other news. If you haven't returned your Scratch boards yet, hope you're having a lot of fun with them, but please do return those to any of the teaching fellows in the corner here. We'll pretty much cut that off this week on Wednesday. And let's see -- in other news about grades. If we can pull up the lights for just a second. I mentioned this briefly at the very start of the term, but now that you've started getting official feedback from your TFs, it's perhaps worth a quick recap. I think two of the wheels are locked today. So this is essentially a summary, numerically, of what Page 2 of each problem set conveys qualitatively. So the teaching fellows have all been asked very expressly to focus this term, as in past terms, on as much qualitative feedback as possible and less on the quantitative feedback, for a number of reasons, not the least of which is that it's far more educationally valuable to get prose comments back from a teacher pushing you toward how you might write better code, solve problems more efficiently, than just getting numeric scores, which in and of themselves are not pedagogically that interesting. But I know there is a concern, last semester, this semester, with getting things like threes, because the mathematician in the room immediately does 3 divided by 5, 60% on my first Problem Set, and you get really frustrated or angry or disappointed, or whatever the emotion happens to be. 
But realize, as these labels are meant to imply, the goal of the course in terms of grades at term's start is to start off as many students as is possible, as is appropriate, in this middle sweet spot, to be honest. 3 is good. 3 is not a C, 3 is not a 60%. You can do all the arithmetic you want, but that's not how we view these numbers. In fact, the TFs have all been asked very specifically by me to, you know, keep their students as best as is possible, you know, in this zone at first, so there's actually room to grow. And frankly, even among those more comfortable in this room, odds are there's still plenty of room for improvement within your code, even though relative to, say, a complete newb's code, it might be better. But again, we ultimately consider students on an individual basis at semester's end. And again, what's most important in this course is not so much where you end up relative to that person sitting next to you, but where you are relative to your earlier self. So do expect 3s in the beginning of the semester. If you're getting betters already, even if you got a couple of bests here, outstanding. If you're in the 1, 2 range, take advantage of the innumerable advantages the course offers, office hours and more, and do realize what these axes mean, if not clear from the PDFs themselves. So correctness really boils down to: does your program do what the spec told you it should do? Is it buggy, is it bug-free, does it behave as expected? Design is more of a subjective matter. And this is why we have humans grading your work and not just computers. We do use automated tools to assess, for the most part, correctness, because we can compare your output against some canonical output, but computers do not do a very good job at these other details, and that's why we have a team of some 30 teaching fellows. So design is about how well did you solve this problem, how clever was your design or how clean was your design. 
If you have a perfectly functional program that's 100% correct, that doesn't mean it was designed well. You might have, again, loops and if conditions that are nested, you know, this wide on the screen. There's probably an opportunity to fix that. Your program might be this long, when really it could have been this long, or your program runs in quadratic time when really you could have done the problem -- the same problem linearly. So that's what design speaks to, and this is where the TFs on the PDFs and in e-mails are asked to provide as much qualitative feedback as possible. And style, frankly, is relatively easy. If you're losing points on style by semester's end, frankly, you're just throwing points away. And this is important. Naming variables appropriately, commenting your code, indenting properly, goes a long way long-term, whether or not you're still taking courses after 50, just keeping your code understandable by you, even I. If I sit down a year hence and look at some of my own code that I've written, I have no clue what some of it does if I didn't actually take the time and priority to design it well. So think of style as something you should be doing as you write each and every line of code. Too often, it seems, do students relegate style to the very last minute, I'll comment it all later, I'll fix things later -- that never seems to happen, right? Because 6:59 p.m. rolls around pretty fast, and that's the first thing to go. And it's the easiest thing, frankly, to fix. So don't do yourself a disservice with that. And if you have any concerns, by all means reach out to me or your own teaching fellow or [Inaudible] or Jan [Inaudible] throughout the semester. 
All right, with that said, today's where I think things begin to get particularly interesting because we can start to do more sophisticated things, both mechanically using various tools and also conceptually, because now we've gotten underneath our belts a lot of the basic building blocks that we can now assume for the rest of the semester. Now what do I mean by that? Well, the first thing we're going to introduce today is called a debugger. So many of you have suffered bugs in your programs, and this is certainly to be expected. This is nothing that you ever move away from entirely. Certainly code I've written, whether it's for the bulletin board or for certain Problem Sets, has been erroneous at times. And you learn to get better at this, but thankfully, the act of debugging, finding bugs that you've made and fixing them, does not have to devolve to just thinking through your code or using printf all over the place. There are some powerful tools that help you help yourself. And the one we're going to introduce today is called GDB, the GNU Debugger. This is sort of -- sort of the second nice counterpart to GCC, the compiler that you've been using. And it operates as follows. I'm going to go ahead and open a program called bar.c, it's in today's printout, so you have a copy of this. This program really does nothing interesting. In fact, it does a lot of unnecessary work. So bar.c, but I use this as an opportunity to walk through this program step by step, which is not something we can do just by running it at the command line. I'm going to instead compile this program in a slightly special way, and then run it within the confines of this thing called a debugger, a program that lets you pause execution anywhere you want, a program that lets you look inside of variables, look inside of arrays, see what's there, so you can do a little sanity check for yourself, and then you can let the program proceed. 
In short, with a debugger, you can walk through your own code step by step, or even jump around, and this is hugely valuable, especially now that your problem sets are getting a little larger, because you can now diagnose problems a lot more easily. So this is again, just a stupid program. Starts off in main. Calls a function called foo, which calls a function called bar. Notice at the top here I've gone ahead and provided a prototype, as it's called, of each, so that GCC knows to expect these functions. But now I'm going to go ahead and compile this in a slightly different way. Rather than run gcc bar.c, and actually I'm going to do -o bar, I'm going to add an additional flag that you may have seen flash across the screen sometimes, given how we've automated some tasks on nice.fas. But I'm going to go ahead and add to this a -ggdb. So it's a little oddly named, but we'll be able to automate this before long. And what this tells GCC to do is yes, compile this source code down to zeros and ones, called object code, but also include in a.out, or rather in the binary called bar, some extra information that a normal human, normal user, really doesn't care about. In fact, you're just wasting space by adding the special stuff to the binary. But include debugging symbols, include some hints from my own source code so that I, the programmer, can walk through this code and actually see what's going on. So what I'm now going to do, now that I have a program called bar, as usual here, looks the same, I'm going to run gdb bar and then hit enter. So I get a bunch of copyright stuff, not all that interesting, and then I get a GDB prompt. Now the simplest thing I can do, and we'll refer you online to a really nice cheat sheet for GDB, but there's a couple things that might be worth jotting down for yourself today, just so you don't forget, but run is perhaps the easiest command you can type at the (gdb) parenthetical prompt. Starting program. 
Incidentally, this is a longer path than you might be familiar with, but if you recall from earlier problem sets, nice.fas has some big hard drives with lots of people's data on it. This is the full path, the number of steps it takes to find your directory from the root of the hard drive. And this is just being ever more explicit as to where I am. So not that interesting, except that I am in fact running bar at the very end. And apparently this program just prints out something like world high on bar. So again, uninteresting, and it exited normally. But now let's see if we can step inside this thing. And incidentally, what do you think it means for a program to exit normally? How does GDB know it exited normally? Yeah, return zero. So finally, that detail we've been telling you is a good thing in many contexts to do, it's actually becoming useful, because now when we actually care about the right -- the correctness or the incorrectness of my code, getting back exit codes like that is helpful. So you know what I'm going to do, that happened way too fast, right? I'm not yet actually controlling this program. I'm going to go ahead and, incidentally, when I clear my screen it's usually control-L, another little Linux trick. So control-L cleared my screen. This time I'm going to say this. Break in the function called main. So I'm going to hit enter, and now I get a slightly esoteric comment back, Breakpoint 1 at 0x848505: file bar.c, line 21. All right, so some of that's confusing. But this part's pretty clear. I just set what's called a breakpoint in the file called bar.c at line 21. And as its own name implies, a breakpoint is a moment in time or a line of your code at which the GDB debugger is going to stop executing for a moment so that you can start to poke around. And only once you say okay, I'm done poking around, will GDB continue executing your program. So now when I type run and hit enter, I get this. Breakpoint 1, main at bar.c:21. 
So again, a little arcane, but just useful information. I now know where I am in my program. So this indicates my line number, this indicates what's on that line number, and if I do a little sanity check, you can type list to see the lines of code nearby where you just broke. Ah, okay, so line 16 was the return value for main, line 17 is main's prototype here, line 18 is this curly brace, I seem to have used tabs accidentally in my version of this. Your printed version should be fine. So ignore this white space -- actually, let me fix this so there's no unnecessary confusion. [ Background noise ] >> Okay, oops, GDB, oops, GCC, GDB, break, main, run. Okay, back where we are. Can you compile your code that fast? Okay, so now I type list. Now my code is prettier. Okay, so now there's no stupid distractions. Okay, so we're on line 21. Now why did the program break on apparently line 21 instead of, say, 19 or 16, which really seem to be the first lines? What is it actually breaking on? You can think back to some Scratch terminology. What construct is it actually breaking on when I say break main? Well, what's on 21? So it's like the first statement, right? It's an assignment statement, I've got a left-hand side thing, char *s equals hello world, whereas the stuff up top, these are just variable declarations or array declarations. So you really don't see anything happen when you declare a variable. But you certainly will see some stuff happening, usually, when a statement is executed. So typing break main simply broke execution at line 21. So what does this mean? Well, nothing has yet happened in the world. In fact, what I can now do is type in print, and as the word implies, it's going to print something. What do I want to print? You know what, let me go ahead and print the value of A. I'm just curious, what is inside of A? Okay, that's weird. What is with this complete nonsense inside of A? Sorry? [ Inaudible audience comment ] >> Louder than murmurs. 
What is this weird value, 134514073? [ Inaudible audience comment ] >> Exactly. Good. So we did offer some caution a week or two ago when we said it's fine to declare variables, but you should never start using them until you yourself have assigned them a value. It might be nice if ideally variables were all initialized to some value, like zero, which actually is the case sometimes, and in some languages. But clearly here there's just some garbage value inside the variable called A. And that's because again, the program has a whole bunch of RAM allocated to it, it can do anything it wants with that RAM, or you can, but it doesn't necessarily clean up that RAM, it doesn't initialize it all to zeros, and so yes, I have a 32-bit chunk of memory called A. It's designed to store an int, but I haven't actually initialized it to something. So what's inside of A right now are the bits that previously belonged to some other variable or some other entity in this program. Okay, so that's fine. That's not such a big deal. But I don't want to use A for any reason. And frankly -- just as an aside, the $1 here, this is just a GDB thing so that you can actually reprint values from earlier, that's sort of like a temporary variable that for now you can just turn a blind eye to. But it can be useful. All right, what about B? Let me go ahead and print B. Okay, interesting. So B is an array of type int of size 5, and apparently, I have five bogus values in there. What do those numbers mean? Well, frankly, who knows. But I haven't initialized that array to any specific integers yet, so I just have some junk in there. So what's neat already about GDB is we're literally looking underneath the hood at what's going on inside the computer's memory. Unfortunately, nothing interesting just yet. What about S? Let me go ahead and print S. Okay, that's really weird. So what is that? So there's some kind of an address. 
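The caution above about garbage values can be sketched in C. This is only an illustration, not the code from the printout: the function names `sum` and `sum_of_zeroed_array` are hypothetical, and the point is simply that giving variables and arrays a value at declaration means nothing ever reads leftover bits.

```c
#include <stddef.h>

/* Hypothetical helper (not from bar.c): adds up n ints. */
int sum(const int *values, size_t n)
{
    int total = 0;    // initialized, so we never add onto a garbage value
    for (size_t i = 0; i < n; i++)
    {
        total += values[i];
    }
    return total;
}

/* Hypothetical helper: sums a 5-element array that was zeroed at declaration. */
int sum_of_zeroed_array(void)
{
    int b[5] = {0};   // every element starts at 0 instead of leftover bits
    return sum(b, 5);
}
```

Had `total` or `b` been declared without an initializer, reading them would be undefined behavior in C, which is why GDB showed arbitrary numbers for them.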
Any time you see 0x starting today, that's going to refer to some memory address, some location in RAM. What all of those letters mean we'll determine in a moment. And then in the quoted string, that is definitely not hello world, right? Sort of like the ROT13 of hello world, but even then it's kind of garbage. But why is that, why is there junk inside of S at this moment in time at line 21? All right, so it hasn't actually executed yet. When I broke in at line 21, I broke in before line 21 is executed. So if I actually want to step over it, step over to the next line, the command in GDB is next. So now I've advanced to line 22, this means line 21 has hopefully executed. So if I type the very same thing again, print S, there is in fact my hello world string stored inside of the variable called S. And char *, that's synonymous with what little shorthand in CS50's library? Just a string. Okay, and we'll tease apart what that asterisk is exactly doing for us later today and Wednesday. All right, so what comes next? I do a printf. Well, what is this all about? Well, print -- let's see, s[7]. This is something we'll spend some time on. But if I go ahead and hit next, let's see what gets printed next. Okay, no pun intended there. Line 22 is about to get executed when I type next. Here we go. Okay, so world printed out. Now why is that? Well, let's see. What I've just done is print &s[7]. So some new syntax. But just as a teaser so far, what is 7? Well, let's see. Here's S, from here, left to right, so this is 0, 1, 2, 3, 4, 5, 6, 7. It looks like when you start counting from 0, 7 is the index of the W, and we'll see later today and Wednesday what it's doing for us by adding this ampersand. Apparently, it's somehow tricking printf or compelling printf to print what string, exactly? So world, clearly, because right, even though there's a lot of intermingling here of GDB commands and output to the screen, it printed world. 
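To make the &s[7] teaser concrete, here is a minimal sketch. It assumes the string is "hello, world" with a comma and a space, since that is the spelling under which index 7 lands on the w; the helper name `suffix_at` is hypothetical and not part of bar.c.

```c
#include <string.h>

/* Hypothetical helper: returns the address of the character at index i of s.
   The caller must keep i within the string (i <= strlen(s)). */
const char *suffix_at(const char *s, size_t i)
{
    return &s[i];    // the same address as s + i
}
```

Handing that address to printf with %s, as in `printf("%s\n", suffix_at("hello, world", 7));`, prints from the w up to the terminating null character, so the output is world.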
So somehow or other, this little trick of using &s[7] is going to the seventh location and then printing the substring that starts there. So we'll see why that actually works. Okay, I'm getting a little confused where I am, so I'm going to type list again. Okay, there I am. Here's the nearby lines. I'm on line 23. So if I type print A, A still has its garbage value. Let's go ahead and type next. And now print A. Okay, now A is initialized to 5. Okay, so just baby steps thus far, but you know I'm kind of curious as to what foo is going to do now. So line 24 is the next line to be executed. Here it is in my list of code. If I just type next, unfortunately, it steps over foo, it executes it, but then it goes to the next line after foo. And the next line after foo, of course, is line 25, where I'm returning zero. So if I hit next one more time, next one more time for the closed brace, voilà. So apparently I'm actually in someone else's code. Oops. Now in someone else's code. Typing continue is just going to finish off the program. Just go to the end. I've lost interest in this problem. And it exits normally, because it in fact returns zero. Let's dive into foo again. So let me quickly pull up the code, since I know we're going back and forth a lot here. So this was main, we stepped all the way through main. But when I typed next at this line here, it went over it, it did execute it, but I'm really curious to see what's going on inside of foo. So let me do this. I'm going to rerun GDB on the program called bar. I'm going to hit enter. And you know what, I'm going to again say break main. You know what, I'm tired of seeing line 21 execute. Instead, I'm going to do this. I'm going to type delete to delete all my breakpoints, another little trick, and I'm going to say break in bar.c line 22. I really don't want to see line 21, I want to start at line 22. So you can exercise some precision. 
You can specify not a function's name, but a file's name, colon, line number to start right there. Okay, so now let me type run. So I'm breaking at line 22. Line 21 and prior have all executed. It's just I have not broken execution until this point. I'm going to go ahead and type next. Next, and now I'm going to type step. So next takes you over a line while still executing it, step steps you into the line, which means step into this function. So now notice I'm inside a function called foo. N is referring to what exactly? So that's its parameter. Let me pull up bar.c. Ah, here we go. So here is the prototype, or here is the declaration for foo, returns an int, okay, here it takes a parameter, N. So what I'm seeing in GDB is the parameterization of this function. So again, let me restart this. I'm going to go ahead and type run again, which will restart the whole thing. Enter. I'm at line 22. Next, next, and step. And now okay, so the very first line of foo per your printout says to declare a variable B, and then assign N to B, and then multiply B by 2 and then call bar and then return B. So you know what, if I type next that's going to execute line 33. And then line 34, well, let's do a quick sanity check. If I type print N what should I see? What was passed in as the value of N? Okay, it was 5, right? Because A was 5. We passed in A to foo. N is the name we give to the parameter. So it too says 5. And now if I print B at this point, another little sanity check. So B gets N and then B *= 2 should give me what here? Yeah. So just 10. You know, I'm getting tired of these long commands. p b will also print. So there's a lot of little shortcuts where if you just write the first letter, the first two letters, that's enough to tell GDB what to do. So finally, if I type step I'm now going to step into bar. Bar doesn't do all that much, so in fact as soon as I type next to go to line 42 it prints it. I'm now done with that. 
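For readers without the printout, the two functions just stepped through might look roughly like the sketch below. This is a reconstruction from the spoken description only: the message bar prints is an assumption, main is omitted, and the line numbers will not match the ones GDB reported in lecture.

```c
#include <stdio.h>

// Prototype, so that foo can call bar before bar is defined.
void bar(void);

/* As described in lecture: copy n into b, double it, call bar, return b. */
int foo(int n)
{
    int b;      // uninitialized here, so GDB would show a garbage value
    b = n;
    b *= 2;
    bar();
    return b;
}

/* Assumed behavior: bar just prints a message. */
void bar(void)
{
    printf("hi from bar\n");    // exact wording is an assumption
}
```

With a breakpoint at the call to foo, typing step enters foo, whereas typing next runs foo to completion and stops on the following line. Inside foo, print n shows the value main passed in (5 in lecture) and, after the doubling, print b shows 10.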
And now I'm losing interest, I found my problem, I'm just going to say continue and it finishes along its way. So this in and of itself was just meant as an exercise in stepping through code. We'll soon start using this as a technique to fix problems in our code or understand what's going on, and you -- this is going to be one of the hammers we have to hit all the time -- when you start attending office hours or tackling bugs in your own code, do not underestimate the value of GDB. Eventually, it will be Tuesday, Wednesday, maybe Thursday night, you just need to get past some bug and you're not going to want to sort of fuss with a tool like GDB, because it will cost you initially an extra 15 minutes, maybe an extra half an hour. It will save you hours by semester's end, no joke, leveraging tools like this. So don't dismiss it as something that was kind of interesting, but over my head. It's a quite, quite powerful thing. All right, so with that said, let's now introduce or talk about some of the things we've been taking for value -- for granted. So this was an example from a while ago, buggy3.c. We've included it again in today's printouts, but it was wrong, right? So a couple weeks ago we tried to swap two values using a function and it just didn't work. So what was the fundamental problem with this code? Sorry? Scope. So there was this problem of scope whereby main called swap, passing in, I think the variables were X and Y. So X and Y were passed in, but copies of X and Y were passed in. And so swap has A and B which are identical to X and Y, but they're copies, they're identical in the sense that they're copies. So do those lines of code inside the curly braces actually work? Yeah, so they did work, right? That is a correct logical way of swapping two values, but unfortunately the moment we got to the end of this function, what did we do with the values A and B? So absolutely nothing. So this again was buggy3.c. Here again was the code. 
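The printout itself isn't reproduced in this transcript, so what follows is only a sketch of a by-value swap like the one described, using the names mentioned in lecture (a, b, temp, x, y). The driver `x_after_swap` is hypothetical, added so the effect on the caller can be observed.

```c
/* Swaps a and b, but a and b are only copies of the caller's values. */
void swap(int a, int b)
{
    int temp = a;
    a = b;
    b = temp;
    // Here a and b really are swapped, but both copies vanish on return.
}

/* Hypothetical driver: calls swap and reports what the caller's x is afterward. */
int x_after_swap(int x, int y)
{
    swap(x, y);     // passes copies of x and y
    return x;       // unchanged, because swap never touched the original
}
```

Calling `x_after_swap(1, 2)` returns 1, which is the symptom from lecture: the logic inside swap is correct, yet the caller's variables never change.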
And just to be clear, if you think back to what memory looks like inside of a computer -- oops, what memory looks like inside of a computer, every time you call a function, it gets a frame on the so-called stack, it gets a chunk of RAM devoted just to it. So inside of this chunk here, inside of main, there apparently are two variables called X and Y, so what's going on inside of memory here, if I can borrow this board for a moment and I'll rotate it just a second, so inside of main are two 32-bit chunks of memory. All right, so let's see. If we draw this as our RAM, this sort of big rectangle down here is main's frame. So I'll label this as main. And carved out for main is, say, 32 bits here, called X. Another 32 bits here, called Y. And we've put inside of this RAM two values. The number 1 and the number 2. Okay? So this is main. But then main calls swap in this implementation. Swap itself takes two arguments, two parameters, called A and B, and per our picture here, what you get first on the stack is a frame containing a function's parameters and then on top of that is another frame containing any local variables that it might have. So let's see what this translates to here. So now I've gone ahead and called foo's parameters, so we'll call this foo's params, albeit illegibly. Inside of here, well, I passed in, or rather I defined, A and B, which are two more 32-bit values. And what's going inside of them? Copies of 1 and 2. Now what is this? Well, it's just kind of nice to be able to draw a whole rectangle. But this is unused space for the moment. So you can think of this as either being there or not being there, it's just being unused right now. All right, so what happens next? In the swap function per this, int temp -- where does int temp go? Yes, so it goes on its own frame, right? So above a function's parameters goes the memory being used for the function itself. 
So this is foo's own frame, and so foo now has a 32-bit value called temp, and initially it's got some garbage value like 4, 5, 6, 7, right? Just because. We saw garbage values with GDB. Who knows what's in here at the moment in time when this line of code here is executed. All that's done is that 32 bits are set aside for temp. But as soon as I do temp equals A, what happens is that garbage value gets overwritten with something useful, so now we have that state of the world. The next line of code that gets executed here is this one. So we have A being assigned the value of B. So what happens here? Which of these five boxes changes? So just A. Right, so this one gets clobbered and becomes a 2. So that is now the state of our world. And then finally B gets temp, so that means I'm going to copy temp down to B, so now I have this state of the world. So in fact, this code is really, really correct. It has perfectly swapped these two values. It did cost us something, cost us 32 bits in the form of temp. But the moment we hit this curly brace on the overhead, what happens? Yeah, so this goes away. And now technically I could erase this. But we -- we'll discuss this in the context of forensics. Technically, this 1 does not go away. We just forget that it's called temp. So where do these garbage values come from? Well, henceforth, whenever we call a function, whoever gets allocated these 32 bits is going to have a garbage value of 1, because it's just been left there. But conceptually, this frame is thrown away entirely. And so now the state of my world is this. But then oh, the function is now returning, so this now goes away. And unfortunately, all of that very correct, quote unquote, code was completely for naught, because we did nothing with those values. And the mode -- the reason for it, ultimately, was that variables, when passed from one function to another, are often, or at least by default, passed in by value as copies. So let's do a quick sanity check. 
So that was called buggy3.c. So I'm going to make buggy3. And now notice, as an aside, all this time when you type make, you do get this really long line, but we sort of anticipated this. So notice any time you run make, -ggdb automatically gets inserted for you, just in case you actually want to poke around your own code. Now as an aside, it's not a good thing to ship or sell software that has debugging symbols enabled. Usually something like this can be used by adversaries, by hackers, by bad guys who actually, you know, crack your software, disable serial numbers, because it's more information than a user needs, but for you the developer, it's actually very useful. So let's go ahead and now run GDB on buggy3. Okay, a lot of copyright stuff. Let me go ahead and type run. Oh, right, doesn't work. So I want to poke around. So let's do -- what should I type, how do I poke around? [Inaudible] sure, short program, won't be hard to go through it. So now I type run. All right, so I'm in main, okay. I'm tired of typing next, I'm just going to hit N henceforth. So N, N, N, N, N, the same program, here's where it's interesting. So quick sanity check. So I'm going to display X, I'm going to display Y. So whereas print displays a value once, display should keep showing it for us. So for now if I do step into swap, I'm going to be re -- oops, not in that case. Ignore what I said about display for a moment. Now inside of swap, what was passed in is a copy of X and a copy of Y, currently called A and B. So if I print A and I print B, sanity checks out. If I print temp at this point, what should I see? Zero. Why zero? Just yeah, just kind of because. It's a garbage value. Sometimes some initialization happens, sometimes you do get zero values, but you should never assume it. But if I do type next and now print out temp, ah ha, there's the value 1, which I explicitly put there inside of my swap function. So let me go to the very bottom of this function. 
A gets B, B gets temp, there's my curly brace. So let me check. Print A should hopefully say 2, yup. And print B should say 1. It did in fact work, but then as soon as I continue to the end of my program, X and Y are obviously unchanged because of the pictorial that we had there. All right, so GDB didn't help me solve a problem here, but it did help me see what's going on underneath the hood. Now when I ran -- let's tease one thing apart -- when I ran GDB on bar a moment ago and set a break point in main I got this cryptic thing. So I said this was an address. What does that mean exactly? Like, what is -- an address of what, where? What context? Anyone? So murmuring doesn't seem to help today, so we'll do this approach. I'll just stand here awkwardly. Anyone? Yes? Yeah, so it's a location in memory. What kind of memory? Not hard disk, but RAM. So again if we model our RAM, model our world of memory as this rectangle, right, thus far we've just been drawing nice little boxes that look kind of like squares and we assume that a square is 32 bits. But you need to be able to address memory somehow. So in fact what a computer really does is yes, it takes your RAM, maybe it thinks of it as kind of this conceptual rectangle, but it's got a number for every location in this rectangle. So this chunk of memory here, even though I've labelled it X and I've labelled it Y, those actually have numeric addresses. So technically, this box here is numbered 0x123. And then this box here -- actually, let me pick an easier number. So let's suppose this is 0x12. That just happens to be the address. The byte -- so if you've got 2 gigabytes of RAM, you've got 2 billion bytes of RAM. This happens to be byte 12, albeit in something called hexadecimal. So what's the address of Y going to be? 0x1 something, just take a guess. So not 0x13, because how many bytes away is this guy? It's an int. So an int's 32 bits, 32 bits is 4 bytes. So that means if this is 0x12, this is going to be 0x16.
All right? So all that minutiae from Week Zero, when we talked a little bit about binary and bits, is actually useful because now it applies to the addressing of memory. Now as an aside, and I think we said this once before, there are many operating systems, some of which we're still using, that can only use, say, 2 gigabytes of RAM. Why might that be? Why 2 gigabytes of RAM? I think XP is limited to this, maybe 3 gigabytes of RAM. Yeah? [ Inaudible audience comment ] >> So there's a limit to how long those statements are. Yeah, so, but put another way, when we say a computer is a 32-bit computer or a 32-bit CPU, as we've said, that just means that inside of the CPU is memory, called registers, and 32 bits are used to represent numbers. Well, the biggest number we can represent with 32 bits is roughly 4 billion. But if we use positive and negative values the biggest number we can represent is roughly 2 billion. And so why can some computers, some operating systems, have a maximum amount of RAM at 2 gigabytes? Well, that's 2 billion bytes. Well, if the programmers of that operating system or the designers of that CPU only use 32-bit values to implement numbers, how can you address the 2 billion and first byte up here if you literally cannot count that high using just 32 bits? Now thankfully the world has moved to 64-bit computers, 64-bit CPUs, which means you can have a ridiculous amount of memory. So this was just kind of for historical reasons, short-sightedness, that we have some of these constraints. But that's what it boils down to. You can't use 3 gigabytes or 4 gigabytes or 5 gigabytes of RAM in some computers, even though it's really useful for things like Photoshop these days, because your ints are not big enough. They once were, but no longer. All right, so what -- how do we then start counting in this new scheme? All right, well, we know binary, zeros and ones, and we know decimal.
Let's see how we can get from zeros and ones and zeros through nines to 0x and these other letters here. So in binary, what's the number for zero using 4 bits? Good, all right, how about 1? Okay, and then -- oops -- that, and then 1 -- oops, 1, 1, and this is -- I'll put some little reminders here, if this has been a while. Okay, so now 0100 is 4, 0101 is 5, 0110 is 6, 0111 is 7, 1000 is 8, let's see, haven't embarrassed myself yet, 1001 is 9, okay, fun part coming up. 1010 is 10, but if I'm using the decimal system, the only digits I have for representing numbers are zero through nine. So yes, I can compose 10 by using a 1 and a 0, but if your notion of a digit is a single symbol, so zero through nine, we have no way of expressing the number 10 using an individual decimal digit. But fortunately, there are other base systems. We talked about binary, we talked about decimal, there's also something called hexadecimal. Whereas in binary you have two values, 0 and 1, and in decimal, dec, you have 10 values, 0 through 9, in hexadecimal you have 16, 0 through 9 and then A through F. Right? So the world needed to count a little higher than decimal allowed using individual digits. So 1010 is A. Now 1011 would normally be 11, but I can't do that, because again, 11 is not a digit, it's the composition of two digits. The single digit I'm looking for in hexadecimal would be B, and then finally 1100 is C, 1101 is D, 1110 is E, and then 1111, thankfully, no mistakes, is F. And the context, frankly, in which some of you might have actually seen hexadecimal before is kind of a silly context, making web pages, where color codes are often represented with hexadecimal these days. So any time we need to count a little higher than decimal allows, and certainly higher than binary, I mean, my God, this would be a nightmare if GDB told us where things were in binary. It's not very human-friendly. We can now at least count up, here in Week Four, to the number 15 with a single digit using this new base system.
But this is useful, because many, many, many programs actually use hexadecimal to represent values here. Now it's not going to be that useful to you generally to know where in your computer's RAM your function or your variables ended up. But sometimes it's useful to care about maybe the last couple of digits, because then you kind of know where they are relative to each other in what's otherwise a really big address space. And when we get to our forensics problem set in a couple weeks' time and you actually have to recover JPGs from disk, what you're going to be seeing when you look at the files on disk are actually not numbers like 0, 1 through 9, but rather you're going to see hexadecimal, partly just by convention. Because what is one upside of using hexadecimal instead of, let's say, decimal? What's useful about it, even on first glance, like, what is the value of defining A through F in this way? It's shorter. That's it. Right? You can express more values using just individual digits. So there's an efficiency aspect. And consider, too, the whole reason we use decimal as humans instead of binary is, my God, just to represent the number 15, look how many digits -- well, maybe that's not compelling, you have to use four digits, we can use two to represent the number 15. But if you think a little further out, this actually does have ripple effects with much larger values, though that might not have been compelling. Okay, so that was a lot all at once. Why don't we go ahead and take our five minute break here. [ Background noise ] >> All right, we're back. So it occurs to me that I might have induced some fear by saying you shouldn't be getting many ones and twos. So on the first C Problem Set, you should have gotten some ones and twos, because design and style, as your TFs may have told you, were only out of two possible points. So you probably did get a one or a two. And the motivation -- normally, everything will be out of five points.
So that little chart I drew on the back of the board is applicable. But in the beginning of the semester with P set 1, when we've barely had time to talk about what design is, what style is, we certainly don't want to put you at a disadvantage by expecting more than we possibly could. So ones and twos, very good on Problem Set 1 for those two axes. So the setup here was that swap has not worked for weeks. Like, we've been able to implement functions, we've been able to take in inputs, produce outputs, but swap itself cannot swap two values successfully. So one approach, at least instinctively, might be, well, maybe the problem this whole time is that swap has been a void function, we are not actually returning a value. So could we in fact maybe start returning a value? So if I call swap and pass in two values, maybe I could return the swapped values. But there's a problem here. How many values can you return from a function, as far as we've seen in C? So only one. So we could maybe -- we could pass in -- we could do something like this, I might be able to do -- so int X gets 1, int Y gets 2. You know, maybe I could do something like X gets swap and then I can pass in X comma Y. Okay, so that could in fact swap X. But now I -- odds are this would not work -- well, you know, I could kind of hack this, right? I could do swap Y, X. Right, if you kind of think how I might implement it, you know, you could kind of hack this together. This feels kind of stupid, right? Or if you're still kind of acclimating to what we mean by good design, probably not good design here. So it would be ideal if we could just return two new values. Give me back two values and then I'll assign those to X and Y. And you can do this in some languages, but not in C. So the approach that is almost always taken, that actually opens the doors to some really powerful if dangerous code, is this one here. Thus far we've been passing in arguments by value, by copy.
By contrast, there's something called passing in arguments by pointer. Or passing in by reference. So pointer, reference, synonymous. Value, copy, synonymous. So passing in by pointer does something fundamentally different. Instead of passing to swap the values of X and Y, you can use this asterisk notation and pass in not the values of X and Y, but their physical locations in RAM. Now as soon as you pass in the locations, AKA the addresses, of two variables in RAM, you're essentially handing a function like swap the power to do anything he wants to those variables. Because if you tell swap where X and Y live in memory, he can go to town on them and swap them, change them, delete them, overwrite them, anything that function wants. Now hopefully you're writing the function and you're not going to mess with your own variables. But what this is going to do for us is the following. If in main I have the following scenario. So here we'll say is my RAM again, and in main, we have main's arguments like argc and argv, which aren't interesting today. And then main has this. We have a value for X and we have a value for Y, and these numbers are 1 and 2. Hopefully, you can see this. We just have two little boxes, the values 1 and 2, and these have addresses. So let's just use the same address as before. So the address of this thing, for reference, is 0x12, it's completely arbitrary. I just made that up. And the address of this thing, completely arbitrary, except that it's nearby, is 0x16, which is four bytes away. So I'm being consistent relatively, but the starting point was random. Okay, so that's what I've done here. If I now call swap -- before, I allocated 32 bits, 32 bits, and passed in the values of 1 and 2. But you know what, let me do this. Let me call swap this time, I'm just going to label it over here, so this is swap's parameters, and I'm still going to call these parameters A and B. But you know what? What am I going to pass in as A and B?
Not 1 and 2, but what numbers? Yeah, so 0x12, and I apologize for my handwriting, we'll clean this up in the scribe notes, and 0x16. So now incidentally, 0x, this is just human convention for "the following is a hexadecimal number," that's it. It doesn't mean there's a zero there, it doesn't mean there's a digit X, it just means the following digits are hexadecimal. All right, so now I'm passing in this address and this address. So this means swap can now use that information somehow. So how do you do that? Well, here is the new and improved and finally correct version of swap. What we're doing now with swap is the same thing as before, we are declaring a temporary variable, so that was swap's parameters. Here now is swap's other frame, where he -- where he can actually put local variables. So here's a local variable called temp. Now what I don't want to do is do temp equals A semicolon. I don't want to just copy the value of A into temp, because I don't really care fundamentally about the address in A, I'm just going to use that as a little shortcut to actually getting at X itself. So the notation for following a pointer, the notation for going to an address and getting the value that's at that address, is this asterisk before the A. So this is the so-called dereference operator. It's just a star, just an asterisk. But what this tells the computer is treat A as a pointer, an address; so a pointer is an address, a location in memory, those are synonyms henceforth. Go to that address, get what's there, and what is there, what's that star A? 1. Go to the address A, get what's there, this is the address A, 0x12. Get what's there, which is 1, and put 1 where? Inside of temp. So that's it. So new syntax, but it's relatively simple.
We just have now introduced the idea of describing data not just by its names, its labels, the variable names like X, Y, A and B, but we now have another, sort of advanced way of getting at data by way of their actual addresses in memory. And this is where C is both powerful and dangerous. A lot of modern languages do not let you get this close to the hardware. They don't let you get this close to manipulating the computer's memory. Because you can do a lot of damage relatively easily. And we'll do precisely that in a lecture to come. But for right now, it's really very powerful, because we can really do anything we want underneath the hood. Which in this case is advantageous. So what do I do next? I've now copied the value 1 into temp. So we're good, we're on our way. Now star A gets star B. All right, well, let's tease that apart, let's start with the right-hand side first. Star B. So that means treat B as a pointer, 0x16. Go there. Well, where is 0x16? Well, my dotted line says it's right down here. So go there, and then star B, in the end, is equivalent to what value? 2, right? Just get the value 2. What do I do with the value 2? Well, assign it to the left-hand side. What's the left-hand side? Okay, same idea, treat A as a pointer, 0x12, go to that address, so follow the pointer, and then put what there? 2. That's it. All right, so go there. So it's kind of like there's this intermediate step. We now have to go to an extra location, we now have to do a little more work mentally and also programmatically, but the result is still going to be the same, but in a much better way. So now what do I do? So star B gets temp. Well, temp is easy. This is the name of a variable, bam, I'm there. This is like week one stuff. Now what do I do with temp? Go to the address pointed at by B. All right, so B is this, this is a pointer, 0x16. Star B says go there, where? Here. That is 0x16. Do what with it? Put there what value? 1, AKA temp.
And so now even though I've used the same amount of memory, I've still got two parameters. I've still got a local variable. Notice what I've done has not changed the local variables, which really was a mistake last time, really didn't get us anywhere. What I've actually used those two parameters as is little shortcuts to go to the location in RAM, to go into main's own frame and change the original values. So when I run this version here, swap.c, hopefully, finally, all these weeks later, I will get a correct answer. So it's the same exact program. The only thing I've changed is the implementation of swap. But I've taken care to do two things here. Actually, I did one change in main. Notice what I've done. I did all the stuff we just described, the star notation. And for now this is the dereference operator. It means go to this address, okay? What did I also change in swap, though? Something I didn't yet mention. So what did I also do in swap? Not ampersand. What else has changed, vis-a-vis the buggy version? Well, this part, so not ampersand. This star is here, right? So the declaration of swap has changed ever so slightly. Previously, the definition of swap, remember, in the broken version from just now and also a couple of weeks ago, was this. Int A, int B. And that's great. I took in two ints. But they were copies. So if you instead want to pass by reference, pass by pointer, you instead say, you know what, here comes not an int, but a pointer to an int. Here comes the address of an int. And the way that you express the address of something is to put a star in front of it. So slight -- same symbol, which is a little confusing, slightly different meaning inside of this parenthetical context, versus inside of the function itself. But this just means that swap takes two arguments, one called A, one called B, and both of them happen to be the addresses of integers in memory. And that's precisely consistent with this picture.
So I have changed one other thing, as a couple of people noticed. What has changed in my main function, apparently? So there are these ampersands. Exactly. So the ampersand operator is kind of like the opposite of the dereference operator. Whereas the star inside of swap says go to this address, the ampersand operator says get the address of the following variable. So previously, in the buggy version of my main function, I just did this. I passed in X, I passed in Y, but that was problematic, because I'm saying, here, swap, here's two values, 1 and 2. Do what you want with them, but do only what you want with the copies I am handing you. So in this case when I say ampersand X and ampersand Y, this is saying, hey, swap, here comes the address of an int, let's call it X -- I'll call it X, you can call it anything you want -- here comes the address of another int. I'll call it Y, but you can call it anything you want, say B, do what you want with these now. So you're now handing to swap, again, the power to actually manipulate the data at those locations. And this is hugely powerful. In fact, we've seen this before. Where have we seen the star operator before, and not in the context of multiplication? What data type? Right, we've seen it in argv. So we've had char star argv brackets, we've seen the asterisk there, and it's clearly not multiplication there. We've seen char star, synonymous with string, and that too has nothing to do with multiplication. So we've seen this before. But only now, because things are getting a little more sophisticated, a little more interesting, do we finally start addressing, pun intended, what that symbol actually means. So by way of tutorial, we offer this. So there's a nice tutorial link on the resources page of the course's website, from HowStuffWorks. And it's actually got a nice exposition of what's going on inside. And their pictures are much cleaner than mine.
So these two lines of code, int I comma J, and int star P, give a clearer picture like the following. So what's going on here, and I'm going to go ahead and eliminate this. It will appear in the scribe notes, if needed. And let me go ahead and see if we can't relate this to the previous stuff. So int I comma J. Well, we've done stuff like this before. Int I comma J. And I've even drawn silly little pictures like this. What does this do? Well, this allocates somewhere in the program's RAM 32 bits and it calls it I. And then another 32 bits in memory, calls it J. They're not necessarily next to each other. Odds are they are one after the other, but there's no guarantee of contiguousness, unless you've declared an array. But I've just declared two ints here, I and J. So I get two chunks of RAM. Now if I do something like this, I gets -- let's say 3. What happens in my picture? Well, 3 goes there. And what's here? Well, this is some garbage value. Who knows what's there. We better not do anything with it yet. All right, let me do something else kind of random. So J gets 4. All right, what happens? Well, this unknown value now becomes a 4. Now suppose I want to change the value of I. The easiest, most obvious way of changing I from 3 to, say, 5 would be this. I gets 5. How does the picture change? Well, the 3 becomes a 5. Right? Very much Week One stuff. But let's see if we can't now leverage this idea, albeit in a little sandbox, to use the address of this thing. Suppose I want to change I from 3 to 5, but I want to kind of impress people and do it in a manner that leverages the addresses of this stuff. Well, with this slide here, what this picture and tutorial are explaining is, just as you declare an int by saying int and then the name of the thing, you declare a pointer to a data type by saying int star P. So this now declares, and for those for whom this might be cut off, I've just written int star P semicolon.
All this is doing is saying to GCC, you know what, give me 32 bits, but I'm not going to store in those 32 bits an int, but rather an address of an int, and it just so happens that, with what data type are addresses actually stored, have we been saying? With ints. So in this case, it's actually kind of a coincidence, but a pointer, as this thing is called, the address in a C program, is implemented ultimately by way of an integer. So this is going to be another 32-bit chunk. And that's why I'm drawing them all as roughly the same size squares. Here is its name, P. And if I now say this, let's say P gets I, what would that do for me? P gets I. So nothing. Bad stuff, right? Because I'm saying store in this variable, I. Well, what does that mean? That means put 3 here, but the compiler is probably going to yell at us, because these two things, P and I, are different data types. Right? This is like trying to cram one data type into another that doesn't really belong there. P is supposed to be the address of something, but I is not the address of something, I is a value. So if what my real goal was here is not to store in P the value of I, but the address of I, what did we just say is the trick syntactically for that? Yeah. So P gets ampersand of I, and what that means is that what goes here is what? Well, gosh, now I have to make the picture a little messier. All right, I don't know where this stuff is, so let's say it's at 0x123, somewhere in memory. And this one, who knows, this is at 0x789. Because again, they're not necessarily adjacent to each other. So who knows where they are. But now in this context, P gets ampersand I. What am I doing? Well, ampersand I is the address of I, which I just said, and I'm sort of the omniscient God here who knows where stuff is in memory, so I'm going to put 0x123 here.
Now what you'll see in most textbooks, and frankly, even we, after this week, will start doing, because talking about arbitrarily-picked hexadecimal numbers really is not intellectually enlightening, what we'll typically do is abstract this detail away and say, you know what, frankly, who cares where this thing is in memory. It really doesn't benefit us to know the specifics of that. We're going to exploit where it is, but we don't have to care where it is precisely. So what you'll usually see me and most any teacher or text do is draw a pointer as a pointer, as an arrow. So what's stored inside of P is, yes, the address of I, but let's depict that, let's convey that idea pictorially just by drawing an arrow. So what can I now do with this information? Well, let's see. If I do I gets 5, I immediately change the 3 to a 5. But what if I instead do star P gets 5? Well, what does that mean? Well, P is here. Star P means go to that address. So this is sort of Chutes and Ladders style: when it says star P, go to that actual address and then do what? Then assign it the value of 5. So in short, a few new syntactic tricks. We've got ampersand to say give me the address of something. Star means go here, or, in the context of a function's declaration, star says expect not an int but a pointer to an int. So the meaning of the star is somewhat context sensitive. But at the end of the day, any time you see the star or this ampersand operator now, odds are we're actually talking about these things called pointers, and again a pointer is just a fancy way of saying the address of a piece of memory. Okay, any questions about pointers? No? Okay. So this -- and again, I can't emphasize enough, if this feels a little new, these pictures are actually pretty good. But they just depict what we talked about a moment ago. So it turns out there is a relationship with arrays that we've kind of been waving our hands at.
Clearly, argv has something to do with arrays, but also pointers, because of the brackets and the star, and char star is this -- is the technical definition for a string, but we already know that strings are really just chars, but not chars per se, arrays of chars. So there's actually been this relationship between pointers and arrays, or at least a similarity between them. So just to slap a picture on this too. If in code I say int I, what I get at top right there is a 32-bit chunk of memory. The HowStuffWorks tutorial says question mark, question mark, because who knows what's there. It's some garbage value. But that's how we might depict it with a diagram. Int A bracket 5 close bracket, in layman's terms, or oh, slightly more technical layman's terms, that declares an array of size 5, for 5 ints. Okay, what does that mean? Well, in memory, that means you get five chunks of memory, five 32-bit chunks, back to back to back to back, as in the middle of this picture, at top right. It's called A, that's the name of the variable. And A bracket 0 is the first location, A bracket 1 is the second location, on up to A bracket 4. But it turns out you can actually start to treat arrays in most contexts as though they're actually pointers. So even though in this context A is really just the name of this array, it turns out you can even in C say something like line 3 there, int star P, which says give me a pointer to an int and call it P. What am I assigning it? A. So what do you think is actually getting stored inside of P per this third line of code? It's going to be the address. What is it the address of? Yeah. It's the address of the first element in the array. Right? So there's an advantage of arrays being defined as contiguous blocks of memory. You might think at first glance, wow, if I had just been handed five ints, I need to know the addresses of all five integers, otherwise how am I going to hang on to them. But really, you don't.
If you know that the memory is back to back to back to back, it's not all that enlightening to know what the address of this is, this is, this is. All you need to know is where the starting point is, and then you, the programmer, hopefully kept around a variable like N or just have some recollection of the length. Because if you know the starting address in memory and the length of the thing and how big each one is, you can reverse engineer the actual address of anything in between. So what's happening here with int star P gets A, this is saying store in P the address of the first integer in this array. Which means if I then proceed to do this, let me write one line of code on the fly with respect to this giant bunch of code, if I say printf quote unquote percent D, and then A bracket 0, let's say -- would have helped if they gave us values. Okay, not the best example. What's this going to print? Okay, so that's actually a good -- clever answer. So the value in the first location of the array. Frankly, I realized this was a bad example because this is going to print junk, who knows what it's going to print, because question marks are there. But higher level, this is going to print the first int in the array. But it turns out if I do this too, if I've done int star P gets A, notice what happens if I just put P, this is going to literally print the address, right? So int -- rather, pointers are in fact numbers, they are integers. Percent D means put an integer here. So this would actually print the address of the first location in A, otherwise known as P. This would print the value inside P. But if you actually want to go to P and print what's at that location, if I add this star, it turns out that this is essentially synonymous with A bracket 0 at this point in the story. So again, I have this ability to figure out the address of something in memory. But then with the star operator, I have the ability to go to that address. So I now have a way of expressing the same values in different ways.
And this in and of itself is not all that useful, because we've not given you a trick that you need to solve any new problems in a silly context like this. But we'll see that this is actually increasingly powerful. So let's take a look, actually. In compare1.c, which you do have a printout of today, we have now an explicit use of star. So now, at least in lectures, we're going to begin to take off the training wheels, so to speak, and yes, we're going to talk about strings, but let's really call them what they are. They are char stars. Or they are arrays of characters. All right, so what does this mean? So here's a program called compare1.c, per the comments up top. This is going to try and fail to compare two strings. But the interesting question is going to be why. Well, it's not a long program, so let's see what's going on. So here we have printf. So, say something, this program is going to tell the user. Then I'm using get string, and I'm storing the string that's returned in a variable called S1. But again, training wheels off, I'm not calling it a string, it's a char star. Same thing, but now we can finally ask the question, what does that mean? All right, then line 3 here, say something else, then I store the second thing the user types in S2. That's where the string's getting stored. But now let's push a little harder on my own language here. When I say get string returns a string, how do you return a string? Like, functions can only return one thing. Now you can't just hand the user any number of characters or a whole sentence, so what is get string really returning? So an array of characters, but what does that mean, what literally is being handed back? Because an array of characters still sounds plural. What's getting handed back? A pointer. The address of the string in memory.
So all this time when get string has been getting a string from the user, what that means is get string, whose implementation we'll look at, is actually allocating a bunch of RAM, all contiguously. Any character I type at the keyboard as the human, as part of this function, gets stored here then here then here then here, character by character, and then get string, which again we wrote in order to return that string to you, quote unquote, to return conceptually that whole string, it suffices to return to you what? Just the address of the first character in that string. And why is that sufficient? I don't know in advance how long that string is, but what did we say, if briefly, a few lectures ago, demarcates the end of a string in memory? So a special zero character. So if I type in the word foo at the keyboard, and the get string function, as we'll see, stores a character F, then O, then O, it also stores a special terminating character, which we represent with backslash zero. This is literally all zeros. And this demarcates the end of the string, so I don't have to know or guess what the length of the string is. If I'm given the address of the beginning of the string, I can figure out how long the string is just with a for loop, check here, check here, check here, ah ha, there's backslash zero. So in fact, many of you have probably used the strlen function for some piece or another, string length, to get the length of a string. What is it doing? It is literally implemented with a for or a while loop. It starts at the beginning of the string and just checks, are you backslash zero, are you backslash zero, are you backslash zero. When the answer is finally yes, it returns the number of characters it passed over before that. So that's all strlen has been doing. So what this means is get string is going to return a pointer to a character, that is, a pointer to an array of chars. So why does the following fail? Notice this.
If I get one string, s1, and then get a second string, s2, and then compare the two, this program's actually broken. So let me do this. Make compare1, okay? Now I'm going to run compare1. Okay, I'm going to say foo and bar, and obviously, I typed different things. And the program seems to be correct at least 50% of the time. Well, now let me do this. So foo, foo, hmm. Maybe it was just buggy that time. Bar, bar. All right, so maybe you've even done that too, right? Maybe if I cross my fingers and rerun my program the bug will go away. But it hasn't in this case. So what has happened? Well, let's see. We don't even know how GetString is really implemented. Well, technically you could look at the code you have printed. There's a bunch of it, but we can infer what's really going on. So GetString again is a function we wrote. Its purpose in life is to figure out what the user has typed at the keyboard and then return it to you as an array of characters. So what's happening the first time I call GetString and I type in foo? GetString is allocating, again, the thing I just drew. Not three but four characters, because we need a fourth character, an extra character for the backslash zero. What is it returning? It's returning to me the address of this thing. One, two, three, we'll say. I don't know where it is, who cares. It's just 0x123. But then the next time I call GetString I'm getting a different chunk of memory. So up here I did this: s1 gets, and this is char star s1, gets GetString. So what is inside of s1 at this point? Well, s1 is a local variable, it's of type char star, which just means it's a pointer to a char. What does that mean? After this line of code is executed, what gets put inside of this 32-bit chunk of memory? 0x123. That's all it takes to store a string, or really, that's all it takes to remember where a string is, because I can figure out where the string ends just by, again, iterating over it or calling the already implemented strlen function.
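To see that two separately stored copies of the same word live at different addresses, here is a small sketch. It assumes the memory comes from malloc, which is an assumption about how a function like GetString might allocate; the helpers copy_string and same_address are hypothetical names, not part of any library.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for the allocation step: reserves room for the
// characters plus one extra byte for the terminating '\0', copies the
// text in, and returns the address of the first character.
// Returns NULL if the allocation fails. The caller must free the result.
char *copy_string(const char *text)
{
    char *s = malloc(strlen(text) + 1);
    if (s == NULL)
    {
        return NULL;
    }
    strcpy(s, text);
    return s;
}

// Compares the two pointers themselves, which is all that == does
// when applied to two char * values.
bool same_address(const char *s1, const char *s2)
{
    return s1 == s2;
}
```

Calling copy_string("foo") twice yields two distinct four-byte chunks while both are still allocated, so same_address reports false for them even though the characters stored in each are identical. The actual addresses will vary from run to run; 0x123 in the lecture is only a placeholder.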
But now I do this. Now in my own program, char star s2 gets GetString, and suppose I do type in the same word again. Foo. Well, what happens? Well, GetString, written by us, allocates another chunk of 4 bytes, F-O-O gets stored there, backslash zero. I don't know where this is. Like 0x456. Who knows where it ends up. Now what gets returned and stored in s2? s2 is another pointer to a char, so this is called s2. And what goes in this box? Yeah. So 0x, what, 456. So now, even if you're a little fuzzy today on what we mean by pointers, even this line should make some sense. If I then try to compare two strings by comparing their variables' names, does s1 equal-equal s2, this is going to be false. Why? Even though foo is clearly equal to foo. Yeah, exactly. s1 holds 0x123 and s2 holds 0x456, I mean, clearly not the same. You compare these for equality with equals-equals, you're going to get back false. So that begs the question, how do you compare two strings for equality? Well, right, the Y [Inaudible] would say call strcmp, which is a function that's already been implemented that does that. But how would that be implemented? Try to beat you to it that time. Yeah, so really what we need to do, if we want to compare two strings, is something more like a comparison with this function here, strcmp. How is it implemented? How would you, if you had to implement this function called strcmp, which is string comparison, but it's the concise way of saying it, how do you compare two strings for equality, given the address of one and the address of another? Probably just some kind of for loop. Are they equal, are they equal, are they equal. If you ever answer no, what do you return? False. But if you then hit a null character, what do you do? If you hit a null character in both strings at the same time, rather? Return true.
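The character-by-character check described here might look like the following sketch. It is an illustration of the idea only, not how strcmp is actually written in any C library, and the name strings_equal is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper, assumed for illustration: returns true only if
// both strings contain the same characters in the same order and end
// at the same position.
bool strings_equal(const char *s1, const char *s2)
{
    size_t i = 0;

    // Keep going for as long as the characters at position i match.
    while (s1[i] == s2[i])
    {
        // Both strings hit '\0' at the same time: they are equal.
        if (s1[i] == '\0')
        {
            return true;
        }
        i++;
    }

    // Some position differed, which includes the case where one string
    // ended before the other.
    return false;
}
```

Because the loop compares the terminating characters too, a pair like foo and food is reported as different: at the fourth position one string holds backslash zero and the other holds a D.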
All right, you can do this with a for loop, a while loop, you can just check for that character within the loop. So this new and improved version is actually correct, and I also did a sanity check. So GetString, as we'll see in its documentation, can return this special sentinel value NULL, which means something bad happened: the user didn't actually type something, or the user typed a special sequence of commands that actually in many contexts could crash your program. So GetString does return this so-called sentinel value called NULL, in all caps. So what I actually need to do here, as we'll see as a matter of best practice, is always check the return value of a function like this. And if it does not equal NULL, that is, if there is actually a string there, okay, go ahead and compare the two with -- oops -- go ahead and compare the two with this function called strcmp, which actually returns a negative value, a zero value, or a positive value. And you can determine with that if they typed the same things. What does get returned? You can infer this now, using your deductive skills. What gets returned by strcmp if two strings are equal? Zero. And why might you want to return a negative value versus a positive one? If it's less than or greater than, i.e., if the first string comes alphabetically before or alphabetically after the second. So it seems here we have a little building block for sorting actual strings. More on that Wednesday. See you. ==== Transcribed by Automatic Sync Technologies ====