SPEAKER 1: Hi everyone. We are going to get started. I think people are still going to be filtering in. But in the interest of time, so we can get you guys out of here on time, we're going to start. So welcome to the CS50 Quiz 0 review. For those of you who haven't realized yet, you have a question on Wednesday. Woo-hoo. If you haven't started studying yet or haven't realized that this exists yet, past quizzes and all information about your quiz are on cs50.net/quizzes. There's some pretty good stuff on there, past quizzes from the last 10 years as well as information about this quiz and topics that will be covered. So let's get started. So you guys might remember, the first day of class David had those lamps on. So essentially, everything that goes on under the hood of a computer is done in binary. Binary means what it sounds like, 0's and 1's. It has two values that can be represented. So just like in the first day of section when David turned on a light bulb to represent on, or 1, our computer understands binary as 0's and 1's, on or off. Basics of Binary. Every place is represented in base two. So you add 2 to the 0 to the 1 to the 2 all the way up. To calculate what your binary is to decimal, you just follow this equation type thing. If you have a 1 in any of those places, you multiply it by whatever base it's in, add it up, and you get the decimal. So this is how you count to 5 in binary. Just like what we were doing on the last slide, this is how you would represent 1 through 5. Similarly, just like you can add and subtract in decimal or base 10, or really any base, on can add and subtract in binary. Exactly what you would expect when you add the two up, if it equals greater than 1, you carry a 1, make it a 0, and do the addition that way, just like you would expect with regular decimal or any other base. Cool. So like I said before, everything that goes on under the hood of our computer is done in 0's and 1's, or binary. So how do we express, for example, letters, or numbers, or characters? And the answer to that is ASCII. ASCII is a mapping between characters that we would normally see in the English language like A's, B's, C's, underscore, dashes, and anything like that. And it maps that to an ASCII value. An ASCII value is just a number that can be understood by your computer. And just like you can do addition and subtraction with numbers, you can do them with ASCII values. So in this example, what will this print out? Yeah, so just A space B space C space D. Where did my mouse go? Notice you can define an int at 65. And when you print that out using percent C, it'll interpret that as a character and will print out A. Similarly, you can declare it as a char. And when you print it out using percent C, it'll interpret that as percent D. And just like you can add a number, you can add characters are ASCII values, in this case. So a little pointer for everybody. 5, as a string, does not actually equal 5. So how might we convert the string 5 to the integer 5? Any ideas? Yeah. So if we have 5 as a string, we can subtract 0. And that'll give us 5. And similarly, if we have 5 as an integer, add that to the string 0. And that gives us the string 5. Cool. Now, recall back to lecture one where we talked about algorithms. So how do we actually want a computer to do interesting things? You know, just adding and subtracting numbers and printing things out is not that exciting. Usually, we want our computer to perform some kind of algorithm. Something a little more complex than just simple arithmetic. An algorithm is just a step by step set of instructions for how to perform a certain task-- just like a recipe. You might remember the first day of class where David had us count a room of people and how many people were in the room. You might be used to counting one by one. 1, 2, 3, 4. In that case, a linear time algorithm. But David introduced an algorithm for you to count the people in the room where everyone stands up, you say your number to another person, add that number up, and one person sits down. And you repeat that. That's one type of algorithm. We can analyze how efficient an algorithm is based on it's run time. But we'll talk a little bit more about that later. So all algorithms can also be written in pseudocode. Pseudocode is just an English like syntax used to represent a programming language. For example, if we wanted to ask a user to guess my favorite number, we might have pseudocode as such. Get a users guess. If the guess is correct, tell them they're correct, else tell them they're not correct. And pseudocode is a way of easily representing an idea or an algorithm. So now we might want to actually write this in the language that the computer might understanding. So we could write our pseudocode and interpret that into source code. So far, source code must adhere to a certain syntax of a programming language. And so far, in CS50, we've been using mostly c. So this might be source code for c. Later on in the course, you night come into contact with other programming languages like PHP. Or if you even take other classes, you might do Java, Python, or even OCML. But in our c program language, this is how we might write the source code for the pseudocode algorithm that I just described earlier. So how does your computer actually understand that? Like I said before, it only really understands zeros and ones. So how does it get from the source code to something that can be understood? Well, we have something called a compiler. If you recall back in most of your psets, you had some kind of program written in a dot c file. And then you would type make. So what is make doing? You can type make to compile your program because someone-- whoever wrote your p set; probably David-- created a make file. And that tells make to know to run your compiler, called clang, that will then compile your source code to object code, which is zeros and ones that your computer understands. But a little later on, we will go more in depth about compilers. So recall pset 0, where-- yes, you have a question? AUDIENCE: [INAUDIBLE]? SPEAKER 1: Yes. I think they actually should be online. Yeah. AUDIENCE: Is it like [INAUDIBLE]? SPEAKER 1: It is not. The are on cs50.net/quizzes. AUDIENCE: Slash quizzes, slash 2013, slash 0, and just click through quizzes 2013 and quiz 0, review section slides. SPEAKER 1: Yeah, so if you guys want to pull it up and look at it on your own computer, that's fine too. Say that again. AUDIENCE: [INAUDIBLE]. SPEAKER 1: Yeah, [INAUDIBLE] is the dummy variable. Oh, yes? AUDIENCE: [INAUDIBLE]? SPEAKER 1: No, strikes are not on the exam. Sorry, her question was, was strikes on the exam. And it is not. So pset 0, you guys should have all implemented something using scratch. And we learned some basic programming building blocks using scratch. So let's take a look at some of these building blocks that make up a program. First is Boolean expression. Boolean expressions are ones and 0's or anything that has two possible values. In this case, true or false, on or off, and yes or no. An example of a simple, very simple, program that uses a Boolean expression up here. So in order for Boolean expressions to be useful, we have Boolean operators. These are operators that can be used to compare certain values. So we have and or not equal to, less than or equal to, greater than or equal to, and less than or greater than. But these operators aren't very useful unless we can combine them into conditions. So you guys might remember from scratch and from your p sets that we had conditions. They are, essentially, like forks in the logic of your program that executes depending on whether a condition is met. So one of the conditions that we had used many times in this course is the if, else, if, and else conditions. Here's an example of how you might use that. Does anyone know the difference between just using if statements all the way down verses if, else, if, and else combined? Yes? AUDIENCE: [INAUDIBLE]. SPEAKER 1: Exactly. So if I had if all the way down this way, even if this condition returns true, it will still continue testing the next two. Whereas, with an else-if, an else statement, if the one returns true, the others are not tested. Any questions about that? Cool. So you use an if-else of an else statement if you know that it can only be one of these cases. So we know if x is less than 0, it's definitely not going to be greater than 0. Next, another building block that we learned are loops. We have three types of loops. For loops, while loops, and do while loops. And generally, when you sit down to write something, you have to decide which of the three you want to use. So how do we decide which one? We generally use a for loop if we know how many times we want to iterate through something or how many times we want to perform a task. We use while loops if we need some condition to be true to keep running. And we use do while very similar to while, but we want our code to run at least one time. So do while, whatever is in the do will always run at least one time. Whereas, with the while, it may not run at all if the condition is not satisfied. Any questions with that? So structure of a for loop. You guys have all seen this. You initialize it. You have some kind of condition. So, for example, we might initialize as for i equals 0. i is less than 10. And i++. Very simple one that we've done. For a while loop, similarly, you have to have some kind of initialization, some kind of condition, and some kind of update. So we can implement our for loop also as a while loop using this. And similarly with a do while loop, we might have some initialization, execute something, update it, and then check the condition. So now functions. We put everything together. We might want to write some kind of function. Common function that you might have seen already is main. Main is a function. It has a return type, int. It has a function name, main. And it has arguments, argc and argv. So main is just a function. Other functions you might have used, printf-- printf is a function-- GetInt, toupper. But these happen to have been implemented for us by some kind of library. If you guys remember including this CS50.h library or the standard I/O library. Yes, question? AUDIENCE: Is main just inherent in c? Does it just kind of [INAUDIBLE]? SPEAKER 1: The question is if main is inherent in c. And yes, all functions have a main function. It's kind of necessary for the computer to know where to start running the code. AUDIENCE: So you wouldn't [INAUDIBLE]? SPEAKER 1: No. Any other questions? Cool. So just like you can use a function that's written for you, you can also write your own function. This is a function that someone might have written to calculate the volume of a q, for example. There's a return type here, in this case int, our function name q and our list of parameters. And note that you have to write the data type of the parameter you want to use or else the function doesn't know what kind of parameter should I be accepting. So, in this case, we want an integer as our input. So why might we want to use functions? First of all, great for organization. They help break up your code into more organized chunks and make it easier to read. Simplification. This is good for design. When you're reading a piece of code and the main function is really, really long, it might be harder to reason about what's going on. So if you break it down into functions, it might be easier to read. And reuse-ability. If you have a chunk of code that's being called or run multiple times, instead of rewriting that code 10 times in your main function, you might want to reuse it. And then every time you need to use that piece of code, call the function. So now if we remember back to scratch, we also talked about a few concepts, one of which is threading. Thread is the concept of multiple sequences of code executing at the same time. So think back to day one where David had you guys count off the number of people in the room. Essentially, what was going on is all of you guys were running separate threads. And those threads were coming together to get some kind of answer. Similarly, in Scratch, when you have multiple sprites, you might have a cat and a dog. And they would be simultaneously running their own scripts. That is an example of threading. And the other concept that was introduced in scratch was events. And events are when multiple parts of your code communicate with each other. In Scratch, this was when you used the broadcast control and the When I Receive blocks. And also, in Problem Set 4, we saw a little bit of events as well. You guys might have used the Gevent library. And there was a function waitForClick in which you were waiting for the user to click. And your click, in this case, would be the event and wait for click is your event handler. And also, throughout running your psets and working on your psets, you might have come into contact with some of these commands. This is what you typed into your terminal window or whatever window that shows up on your g edit to, essentially, navigate your computer. So for example, LS lists the contents of a directory. Make directory creates a new folder. CD, change directory. RM, remove, deletes a file or some directory. And then remove directory removes a directory. AUDIENCE: [INAUDIBLE]? SPEAKER 1: Yeah, sure. Sorry, the question was if you would suggest putting this on the cheat sheet. It could help. If you have room, you can put it on. It's also just generally good enough to remember because when you use it you might want to just have it memorized. That'll make your life a lot easier. Did I answer your question? So now, we talked a little bit briefly about libraries. But the two main ones that we've been using so far in the course are standard I/O and cs50. What kind of things are included in the standard I/O library? Yeah, so far we've used printf. In cs50, we've used GetInt and GetString. And the data type string also happens to be declared in this cs50 library. We'll talk a little more in depth about how libraries work and how they interact with the rest of your code. But those are the two main ones that we have come in contact with so far in the course. Types. These are good to remember how much each type is represented by or how many bytes each of type requires-- int, 4 bytes; char, 1 byte. Float is 4 bytes. What is a double? AUDIENCE: [INAUDIBLE]. SPEAKER 1: Yeah, so a float but double the size. What about a long? AUDIENCE: [INAUDIBLE]. SPEAKER 1: OK. What is a long? AUDIENCE: [INAUDIBLE]. SPEAKER 1: Yeah, double an int. Yes. AUDIENCE: [INAUDIBLE]. SPEAKER 1: Long [INAUDIBLE]. And then a long long is double that. AUDIENCE: No, no. A long is just an int. It depends on the architecture before the [INAUDIBLE] and int have the same size. [INAUDIBLE]. SPEAKER 1: So a long and an int are the same. And then a long long is double the int. Cool. And then, what is the last type? AUDIENCE: Pointer. SPEAKER 1: Yeah, so we learned a little bit about pointers. And regardless of what a pointer is pointing to-- it could be a char star or an int star-- it's always 4 bytes for a pointer. Questions about that? Yes? AUDIENCE: [INAUDIBLE]? SPEAKER 1: So a long and an int are the same in this cs50 appliance. AUDIENCE: The appliance are completely interchangeable. SPEAKER 1: Yeah. So then a long long is double an int. AUDIENCE: This is the 32 bit? SPEAKER 1: 32 bit, yeah. AUDIENCE: So [INAUDIBLE]? SPEAKER 1: Yes, if it doesn't explicitly say, you should assume a 32 bit. AUDIENCE: It would say something like assuming an architecture like the appliance. For 64 bit, the only things that change are longs and pointers. They both [INAUDIBLE]. SPEAKER 1: Yes? AUDIENCE: Question. So on one of the practice quizzes, it asks about an unsigned int. So how would that be determined from an int [INAUDIBLE]? SPEAKER 1: An unsigned in is also 4 bytes. But what is different about a signed int and an unsigned int? AUDIENCE: [INAUDIBLE]. SPEAKER 1: Right. One can represent negative values. But how does it do that? AUDIENCE: [INAUDIBLE]. SPEAKER 1: Yeah, it saves 1 bit to represent the sign. The signed has one bit that represents the sign. And unsigned just is all positives. AUDIENCE: OK. So you say that a double is twice the size of a float? SPEAKER 1: Double is twice the size of a float, yes. AUDIENCE: How does a pointer to a long long [INAUDIBLE]? SPEAKER 1: So the question is how does the pointer to a long long-- how is that only four bytes when a long long its 8 bytes. So remember what is a pointer, essentially, at the very base value. AUDIENCE: [INAUDIBLE]. SPEAKER 1: Yeah, so a pointer is just a memory location. So it doesn't matter how much space that pointer is pointing to. It only needs 4 bytes to keep track of that memory location. Any other questions? Cool. So the last thing I have is standard output. You should use them frequently enough that you can remember. But this is when we use printf, for example. And we have these placeholders that were called format codes. So percent c char, percent i for int, and we can also use percent d. It's the same thing. But, generally, in CS50 we try to use percent i. Percent f for float. Percent ld for long long and percent s for string. Similarly, we've been using a few of these escape sequences. For example, backslash n for new line. This is just for when you're formatting your code for print f. Yes? AUDIENCE: What is percent d for? SPEAKER 1: So the question is what is percent d for? Percent d is for ints. Percent d and percent i are the same. AUDIENCE: What's the difference between backslash n and backslash r? SPEAKER 1: So the question is what's the difference between backlash n and backlash r? I think backslash r is-- AUDIENCE: So backslash r just implies returns to the beginning of the line without actually going to a new line. So if you print a backslash r and you go back to the beginning of the line then you print more stuff, you overwrite the stuff that's already on [INAUDIBLE]. Whereas, n actually goes to a new line and goes to [INAUDIBLE]. SPEAKER 1: Well, any other questions? All right. I'm going to hand it off to Dan who will continue. [APPLAUSE] DAN: All righty. So I'll be talking about another wide range of ideas from the class that are roughly representative of week two and the start of week three starting off with casting, which is just a way of treating a value of a certain type as a value of a different type. So we can do this with chars to ints, floats to ints, and long longs to double. All of these things can be used as ways of treating some numeric value minus char as some other numeric value. So there are some issues with this, of course, which comes when you cast things like float to ints. So this is a little weird. We have a float that is 1.31. We multiply it by 10,000. And then we print it as an int. What does this output? 10,000 times 1.31. So 13,000, is that the guess? AUDIENCE: I think it's 10,000. DAN: So I'm multiplying it by 10,000 before I'm casting it. AUDIENCE: Oh. Wouldn't there be one 9 and some 0 numbers? DAN: You might have some weird digits. So right, it's 1.3 times 10,000. So that's 13,000. And this extra weird-- AUDIENCE: 13,100. DAN: 13,100. Thank you, Rob. And this extra weirdness-- this 9,9-- is simply because this casting ended up rounding down where it shouldn't have. Yeah. AUDIENCE: The casting happens after anything else? DAN: So because I have this in print, it does this multiplication before it does this casting. AUDIENCE: [INAUDIBLE]. DAN: I think it would cast first, yeah, which would be 10,000. Anything else? Cool. So this is 13,099. Why does this happen? Imprecision. Floats aren't perfect. They can only represent numbers to a certain number of significant figures. So if we print out 8 sig figs on this float, we get a kind of ugly looking number. And that's because 1.31 can't accurately be represented by simple powers of two in the machine. So it ends up taking the closest guess, which ends up being a little low. Make sense? OK. Now, switched are a different way of doing conditional statements where all we care about is a single variable. So in this particular example, we're getting an integer from the user. And then we're looking at what that integer is. Presumably, it's number between one and four. That's what we're asking for. So you do a switch of the variable name. Then you set up cases of possible values it could be. So case one, say it's low. And then you break to get out of the switch condition so you don't keep going. In the next case-- so case two and case three-- if it's case two it just drops down to the first line of code it sees as with case three until it sees a break. So the reason you get case one to only print low is because I have this break here. If I, say, ignored this break-- if I threw this breakaway-- it would print low, and then it would print middle, and then it would break. So breaks are an important part of switch conditions and they should be there. Any cases that are not stated explicitly are handled by the default case in the switch and should be cast. AUDIENCE: So 1, 2, 3, and 4 would be n? DAN: Values that n can be. Yes. Yeah? AUDIENCE: So when you have that [INAUDIBLE]? DAN: You would print low, and then it would print middle, and then it would break. AUDIENCE: Why would it print middle if [INAUDIBLE]? DAN: So everything under a case before a break falls under. So case one print is underneath case one as is this following print. Yeah? AUDIENCE: [INAUDIBLE]? DAN: So this number is just a particular value that this variable can take, right? Does that make sense? Yeah. AUDIENCE: [INAUDIBLE]? DAN: Yes, case two would print middle and then break. AUDIENCE: [INAUDIBLE]? DAN: I think any? What other data types can you switch over? AUDIENCE: You can switch over any data types. But it only means anything over chars and ints and stuff like that, because if you're switching over a pointer that doesn't really make sense, switching over loads, if it even let's you do that, because of floating point in precision, you wouldn't really want to do that anyway. So pretty much, just ints and chars and stuff like that. DAN: Yeah, it's when you have explicit values that you know, I think, can be that a switch is actually useful. Good? OK. Scope is the range that a declared variable extends. So in this little chunk of code I have, it would be full of errors. And the reason is I declared this int i within the scope of this for loop. And then I'm trying to reference that i outside of that for loop scope. So basically, you can think about scope as anything that you declare with inside a set of curly braces only exists within those curly braces. And if you try and use that variable outside of those curly braces, you'll get an error from the compiler. Yeah? AUDIENCE: So this one doesn't work? DAN: This doesn't work, yes. Strings. String a char*. They're exactly the same. They are just pointers to characters. And any strings that you have should end with backslash zero, which is just a c convention. It is called the NULL terminator. And NULL-- capital N, capital U, capital L, capital L-- is not the same as the NULL terminator. This is a pointer. This is a character. They are very distinct. Remember it. It will be on the quiz, probably. I haven't seen the quiz. Yeah? AUDIENCE: So NULL is, say, the pointer? DAN: Yes. AUDIENCE: What does [INAUDIBLE]? DAN: If, say, malloc is called when you don't have enough memory to get whatever size you're asking for, malloc will return NULL. It's, basically, whenever a function is supposed to return a pointer, you need to check against NULL because NULL is a pretty good-- it's, sort of, the garbage value. It's a zero as far as pointers go. Whenever you call a function, that returns a pointer. You're going to want to check to be sure that that pointer isn't NULL because NULL is very common. It's sort of a garbage return. So if something didn't go right, just return NULL instead. AUDIENCE: [INAUDIBLE]? DAN: Yes, and that's this. AUDIENCE: [INAUDIBLE]? DAN: Spell it as this. It's the NULL terminator. It's lowercase N-U-L-L if you're spelling it. AUDIENCE: And I just went back and tested it. And if you try to put a floating point value into a switch, it'll yell at you saying, statement requires expression of integer type. DAN: There you go. But yeah, what was the question again? AUDIENCE: [INAUDIBLE]? DAN: So capital N, capital U, capital L, capital L is an actual c thing. It is the NULL pointer and will only be treated as such. You won't ever try and spell the NULL character and see any other way than this. Yeah? AUDIENCE: So returning to char max or something in the notes, would it embody the same function as [INAUDIBLE]? AUDIENCE: So are you referring to returning char max from getchar, or whatever it is? AUDIENCE: Yeah. AUDIENCE: Yeah, so the general term for all those things are sentinel values. So like returning int max from GetInt and char max from getchar, it's supposed to be like, all right, if these things are returning to us, something went wrong. For pointers, we just happen to have this sentinel value that everyone agrees upon. And this is the thing you return when things go wrong. So char max is what we're using to represent something like NULL or getchar. AUDIENCE: So if you're testing getchar, could you just put NULL? Would that make a difference? DAN: You couldn't just check NULL. You'd have to check char max because the return value from the function is a character not a pointer. Yeah? AUDIENCE: This question asks for the string length. Does that include the NULL character? DAN: No. And that's actually how string length knows to stop because it goes through your array of characters until it sees a NULL character. And then it's like, all right, I'm done. AUDIENCE: [INAUDIBLE] five? DAN: Hello would be five. Yep. So arrays are continuous blocks of memory. They have instant access by saying the name of the array and then, in curly braces, whatever index you want to go to, they're indexed from zero through the length of the array minus 1. And they're declared by the type of the thing that you're storing in the array, the name of the array, and then whatever the size is of that array. So this is a char array of length six that has these values. Yeah? AUDIENCE: [INAUDIBLE]? DAN: Yeah. AUDIENCE: [INAUDIBLE]? DAN: If you have what is going into the array already made. So you could specify this instead as, say, char, whatever the name of your array is, empty brackets equals curly brace H comma E comma L comma L comma O comma NULL character and curly brace. That would also work as a declaration. AUDIENCE: [INAUDIBLE]? DAN: Then you need to have the size already made. AUDIENCE: [INAUDIBLE]? DAN: Yes. All righty. Command line arguments are a way of getting input from the user as arguments to main. Main takes two arguments. The number of arguments that is being passed along the command line and a string vector or a string array of all of the arguments. So if I, say, called a function such as a dot out 1 space, 2 space, three, argc would be 4. And the argv 0 would be a dot out. Argv1 would be 1. argv2 would be 2. argv3 would be 3, in that particular case. Yeah? AUDIENCE: [INAUDIBLE]? DAN: The last element in the array because the array is length argc plus one of argb, the last element is the NULL pointer. It is argc plus 1. So in the case that I just said, it would be argv 0 is a dot out. argv 1 is 1. argv2 is 2. argv 3 is 3. argv 4, which is one larger than argc would be NULL. And that's the NULL pointer. Yes. And that's because string is a char star is a pointer. So it has to be the same type. Yeah? AUDIENCE: Two questions. So one, what's the difference between this and GetString other than one type in the user engine? And two, is it stored within your recent memory? So like, GetString would be [INAUDIBLE]? DAN: Where is it stored? I don't know where it's stored. AUDIENCE: So, actually, you know how any function you call it's arguments are stored in the stack? So argc and argv are arguments to main and they are on the stack, or really just above what you think as the start of the stack. What was the other part of the question? AUDIENCE: So what's the [INAUDIBLE]? DAN: Yeah, it's just a different way of getting input from the user. This one's slightly more efficient and it's handier for scripts because you can just pass arguments to your main function rather than having to wait for users if you don't have any users. AUDIENCE: And yeah, get strings would be [INAUDIBLE]. It would store the stuff you need. DAN: Yeah? AUDIENCE: [INAUDIBLE]? DAN: Yes, argv 0 always includes the dot slash of the function call. Yeah? AUDIENCE: [INAUDIBLE]? DAN: Yes, each of the arguments are ended in NULL character because they are strings. AUDIENCE: [INAUDIBLE]? DAN: Yes, argv argc is a NULL pointer. AUDIENCE: [INAUDIBLE]? DAN: Oh yeah. Yeah, sorry. AUDIENCE: So [INAUDIBLE]? DAN: So the question is if you had the command line dot slash a dot out 1, 2, would the number of command line arguments be two or would it be three? AUDIENCE: I think it doesn't really matter. I tend to say, oh, you didn't pass any command line arguments when, obviously, you called the function. So I tend to vocally exclude the function from the command line arguments even though it's included in argv. DAN: But if it was on the test-- yeah-- and also if you say something like argc equals 3, you're in safe standing. Yeah? AUDIENCE: [INAUDIBLE]? DAN: I think if instead of calling this in argc and string argv brackets but kept the same types and just called them something different like a and b, would it still work? And it would still work, you would just-- instead of using argc-- you'd use a and b. Yeah? AUDIENCE: [INAUDIBLE]? DAN: So the question is GetString is going to store memory in the heap because GetString is char*. It stores memory in the heap because it calls now malloc within the actual implementation of GetString. OK, moving on. Security. So to be truly secure, you rely on no one and you allow no one access to any of your information, which is why everyone builds their own machines, their own operating systems, all their programs from scratch, and obviously don't connect to any other machines via the internet. So computers are insecure. They really are. We have to trust other people. And the idea of security is that you're trying to limit the amount of trust that you need. And one of the means you do that is through cryptography. Cryptography is, essentially, we have secrets. Sometimes we have to pass our secrets along through, say, the internet or other things. And we don't want people to know these secrets. So we encrypt our secrets into a way that we hope no one can figure out. So we used-- through the course of this class-- things like Caesar cipher and [INAUDIBLE], which are both very, very insecure ways of encrypting things. They're easy to figure out what they are and what your secrets are. The real world uses much more complicated encryption schemes. And we won't get into much more than that. Debugging. GDB is the best. I'm going to stress this again. Use GDB all the time every time you have a problem. Commands that are useful in GDB are break, which you pass either a line number, a function name, essentially where in your code you want to stop, and be able to take control. Print takes a variable and prints out whatever that variable is at that point in your execution. Next moves your execution along one step. And step steps inside a function in your execution. Other things are run, which is how you actually run your code. Continue takes all the steps needed to get to the next break point. And there are many, many others. Look them up. They're great. Yeah? AUDIENCE: [INAUDIBLE]? DAN: Yes, which is a debugger. So a debugger is a program that lets you debug your program. It's not a program that finds bugs for you, though that would be great. And last for me is search. So the types of search that we talked about in this class are linear search, which is just that you look through each element of the search space, one element at a time, until you find what you're looking for or until you reach the end of your search space at which point you say that you couldn't find the element that you were looking for. And this takes at best constant time, which is 0 of 1 and at worst linear time, which is 0 of n. Binary search, which needs sordid elements. You go to the middle of your elements, see if the element you're looking for is larger or smaller than the element that you're at the middle. It it's larger, you say that the bottom of your search space is your current location, the middle, and you restart the process. If it's smaller, you look say that the-- yeah, what's up? AUDIENCE: [INAUDIBLE]? DAN: Yes. Any sort of sort that's been taught in the class is fair game for the test. [LAUGHTER] DAN: And the fact that you haven't had to do it for a problem set, it's fair game for the test. AUDIENCE: Can we go over it how to-- DAN: It will be gone over. SPEAKER 2: The actual code for [INAUDIBLE] is on study.cs50.net. So if you look at the practice problem in the merge sort page of study.cs50.net, there is the code for implementing merge sort. So you don't have to implement it yourself tonight. But make sure you understand it rather than just memorizing it. AUDIENCE: [INAUDIBLE]? SPEAKER 2: The merge sort page on study.cs50.net, there is a practice problem that, if you click through the problem, at the very end there is a solution, which is the merge sort implementation. But make sure you understand it rather than just memorizing it or copying it down. AUDIENCE: And a perfectly valid problem for the exam would be something like here's a list. What does this list look like after one step of selections sort or insertion sort or whatever. One full iteration of the list. So even if you don't end up needing to code for it, you need to understand it enough to know how it's going to be modifying this array. DAN: That's it for me. [APPLAUSE] LUCAS: Hey everyone. My name is Lucas. I'm going to talk about recursion, all the sorts that we have learned, and a little bit of all pointers. OK? So first of all, recursion. What does it mean to say that a function is recursive? AUDIENCE: Calls itself. LUCAS: OK, calls itself, yeah. So like this picture, for example. It's like the picture inside of a picture and so on. So for example, you can have-- as Dan that was talking about binary search. One way in which binary search is recursive is the fact that you're trying to find a number. So you go to the middle. And then you check if the numbers there in the left and in the right. And then if you find out the number is going to be on the left, it's the same thing as doing the search again but just on the left of the list. So that's how it sounds like it's recursive. So that's why you guys have recursive solution for merge sort. OK, so here's an example. So let's say that I want to choose all the numbers from 1 to n. I can realize that the sum of the n number is n plus n minus 1 up to 1. But then, if I look at n minus 1 plus n minus 2 plus 1, that's the same thing as summing numbers up to n minus 1. So I can say the sum of an equal sum equals n plus the sum of n minus 1. Does that make sense? And I also would have something else called the base case, which is that the sum of the numbers up to zero would be zero. So as soon as I get to the number zero, I stop counting. Does that make sense? So here's an example of how I can implement that. So I have this function in some. That takes an integer n. So here I first check if n is less or equals to zero. So if it's less or equal to zero, I return zero, which is our base case. Otherwise, I can just return n plus the sum of the numbers from one to n minus one. Make sense? OK. So here's what it looks like. You have sum of 2 equals 2 plus the sum of 1. And some of 1 is 1 plus the sum of 0, which is 0. Make sense? So if we look at the stack of your program, this is what it looks like. First, we have the main function. And then the main function called sum 2. And then sum 2 is going to say, oh, sum 2 equals 2 plus the sum of one. So I add sum of 1 to the stack. And the sum of 1 is going to call sum of 0, which is also going to be added to the stack. And then each of these ones that are on top of another have to return before the other ones can keep going. So for example, here, sum of 0, first, is going to return 0. And then choose sum of 1. Then sum of 1 is going to return 1 to sum of 2. And finally, sum of 2 is going to return 3 to main. Does that make sense? It's really important to understand how the stack is working and try to see if it makes sense. OK, so sorting. So why is sorting important, first of all? Why should we care? Anyone? Give me an example? Yeah? AUDIENCE: [INAUDIBLE]. LUCAS: Yeah, OK. So you can search more efficiently. That's a good way. So, for example, we have a lot of things, actually, in our lives that are sorted. For example, dictionaries. It's very important to have all the words in some kind of order that we can access easily. So that's what he was saying. You can search more efficiently. Think of how hard it would be to have a dictionary in which the words are in random order. You'll have to look at, pretty much, every single word until you find the word that you're looking for. If you're using Facebook also, when you're looking at your friends, you're going to see that Facebook put your closer friend's on top of the ones that you don't talk to that much. If you go all the way to the bottom of your friend list, you're going to see people that you probably don't even remember that you're friends with. And that's because Facebook sorts your friends based on how close you are to them. So organizing data. Also Pokemon. So you see that all Pokemons have numbers. And that's like an easy way of accessing data. AUDIENCE: Accessing Pokemon. LUCAS: Yeah. AUDIENCE: [INAUDIBLE]. LUCAS: Yep. OK, so selection sort. Selection sort is going to select the smallest unsorted value of a list each time in each iteration. It's kind of like the sort that you do in your head when you're trying to sort a list on hand. Basically, all you do is you look for the smallest number. You put it in the sorted list. And then you look for the next smallest number. And then you keep doing that and so on. So selection sort is basically you select every time the smallest unsorted value. Put at the end of the sorted part of the list. And keep doing that. So let's quickly see what this looks like. So here's the sorted and unsorted list. So for the sorted of list, it's initially empty. And then I'm going to select the smallest number here, which is 2. So I get the number 2 and I put in the front of the list. And then I look for the next smallest element, which is 3. So I put it at the end of the sorted list. And then I keep doing that. I find 4 and put it at the end. Find 5 and put it at the end. And look at how all of those times that I'm saying put it at the end is, basically, swapping two values. OK? And then the last one, you just have one more element. So it's already sorted. OK, so insertion sort. Insertion sort you're going to have also that thing of having a sorted and an unsorted list. The only thing is that every time that you're adding an element to the sorted list, you just pick the element that is in front of the unsorted list. And then you're going to find what position it should be in the sorted part of the list. Let's see what this is so this makes more sense. So initially, for example, I'm trying to insert the number three in the sorted part of the list. So the list doesn't have anything. So I can just put the number 3. Now, I want to add the number 5 to the sorted part of the list. So I look at the number 5. I notice that it's greater than 3. So I know that it has to be after 3. So I put 3 and 5. Then I want to insert the number 2. I notice that the number 2 is actually last then both 3 and 5. So I actually have to put it all the way in the beginning of the list. So I have to, kind of, shift all the elements in the sorted list so I can make room for the number 2. Then I see the number 6. I see that it should be after 5. So I put it there. And finally, I look at the number 4. And I notice it should be between 3 and 5. And then I put it there and shift all the other elements. Make sense? Bubble Sort. So bubble sort is basically what you're going to do-- we call it bubble sort because you go through the list-- it's actually better if I just show you like this-- and you're going to compare adjacent numbers. And you're going to swap their positions if they're not in the right order. So basically, what is going to happen is here, for example, you have 8 and 6. You know that the sorted order will actually be 6 and 5, right? So you're going to swap the orders. Then I see 8 and 4 here. And I do the same thing. I swap again. And finally, 2 and 8. I also swap them. It's called Bubble Sort because after each of these iterations, actually, the largest number in the list gets all the way to the end of the list. Does that make sense? Because it keeps swapping it and moving it to the right. OK, so this is the second iteration. It would be the same thing. I'll do one swap and then the last one. I that there are no swaps and the list is sorted. So in Bubble Sort, we basically keep going through the list and swapping things until I notice that I didn't do any swaps doing that iteration, which means that list is already sorted. Make sense? Let's talk a little bit about running time. So do you guys remember Big O, Omega, and Theta? Yeah? OK, what is Big O, first of all? AUDIENCE: [INAUDIBLE]. LUCAS: Yeah, it's called a worst case runtime, which just means that it's how much you expect the program to take to run. Like, in terms of-- in this case-- n. The number of elements in the list in the worst case. Like, in the worst possible case. So for Bubble Sort, for example, we have Big O of n square. Why do we have that? Why is Bubble Sort Big O n square? AUDIENCE: [INAUDIBLE]. LUCAS: Yeah, so the worst case will be that I'll have to do n iterations. So each of the iterations is going to bring the largest element to the end of the list. So the worst case is that I have to do that thing n times. And for each of those times, I have to do n swaps because I have to compare each two elements. So that's why it's n squared because it's n times n. Then, selection sort is also n square because, for each iteration, I have to look at every single element in the list. And then find the smallest, which means that I have to look through n elements. And I have to do that n times because I have to select all the n elements. An insertion sort is also n square because the worst case scenario will be, one, I have to insert n numbers, right? So I already know that I'm going to have n iterations. But for each of those numbers, if I had to look at all of the numbers in the sorted list and put it all the way in the front, that will be n square because it will be n times n again. Make sense? What about omega? AUDIENCE: [INAUDIBLE]. LUCAS: It's the best case scenario. So it's like, in a lot of times for sorting, the best case scenario is when the list is already sorted. So you don't really have to do anything. Bubble Sort has the best case scenario of n. Do you guys know why? AUDIENCE: [INAUDIBLE]. LUCAS: Yeah, if you keep track of whether data ration had any swaps or not, if you have something like set to true if there was an iteration, if the list is already sorted, basically, what's going to happen is I'm going to try to swap each two adjacent elements. I'm going to see that there are no swaps. And I just return right away. So it means that I just had to go through the list one time. So it's n because I look at n elements. Why selection sort n square? Yeah, even if the list is sorted, for every iteration of selection sort, I have to select the minimum element. So that means that I have out to look at all the elements in the unsorted list and find the minimum for each iteration. Does that make sense? And insertion sword is n because in the case that I'm trying to insert the numbers and all of the numbers, when I try to insert them, I see that they are in the right position. I don't have to go check all the other numbers in the unsorted list. So that's why it will be n. Make sense? And what is theta? AUDIENCE: [INAUDIBLE]. LUCAS: What, sorry? Say it again. AUDIENCE: [INAUDIBLE]. LUCAS: Exactly. So you can see that only selection stored in Merge sort have thetas. And that's because you only have theta if both Big O and Omega are the same. OK. And finally, merge sort is in log n. And then, as Dan was saying, Merge sort is kind of like the same way that you do binary search. So you get the list. And you're going to cut in half. And then you cut them in smaller halves. And then you merge them. You guys remember that, right? OK, as he was saying. OK, pointers. So what is a pointer? AUDIENCE: [INAUDIBLE]. LUCAS: An address. OK. I know that David shows a bunch of videos of binky and things pointing each other. But I like to think of pointers as merely an address. So it's a variable that is going to store an address. So it's just this special variable that is four bytes long. Remember, that pointer to anything is always four bytes long for our 32-bit machine so the case with the appliance. And it just has the location of a variable inside of it. OK, so there's this memory, basically. So each block of memory actually has a label, which is the address of the slotty memory. So that means that I can have a pointer pointing to any of these addresses. So the reason why we'll use pointers is if I have to remember the location that a specific variable is a memory. And you guys remember that one of those cases was if I have a function if I have actually want you to swap for reals, I actually have to send a pointer. Not the variable. Do you guys remember that? The difference between-- what is the name? Calling by value and calling by reference, right? OK, yeah. So call by value. When you just send a variable to function you're just sending a value. So you're actually sending a copy of the variable. And your program couldn't care less about if the same variable actually makes a copy. And calling by reference means that I'm actually sending a copy of the pointer to that variable. So it means that I'm sending the location of that variable. So sense I have the location of the variable, when I call the function with pointers, I'm able to actually change the data that was in main. Make sense? Although, the pointer is a copy, the pointer still has the real address of the variable that I want to change. Make sense? So creating pointers. Remember, the pointer always have the type that it's pointing to and then a star. And then you put the name. So remember that whenever you have whatever star, it's like a pointer to that whatever variable type that you had. So here in star, for example, it's a pointer and an integer. And then char star is a pointer char star and so forth. Yeah? AUDIENCE: What if we have a pointer to n to star x. I know that creates a pointer to x. Does it also declare x an integer? LUCAS: OK, so when you say n star x, you're not creating a pointer to a variable x. You're creating a pointer named x. AUDIENCE: [INAUDIBLE]. LUCAS: So when I say n star x, I'm saying, hey, in memory, I'm going to get one of these three boxes. And I'm going to say that that is going to be x, which is going to be a pointer. And something interesting about pointers is that we say that they have 4 bytes for a 32-bit machine. And the reason for that is because 4 bytes are 32-bits. And machines that are 64 bits actually have pointers addresses that are 64 bits long. So it just means that the size of the addresses in the machine is different. So Referencing and Dereferencing. There are two operators that you guys should remember. The first is ampersand. The second is star. Don't get confused by that star and this star because remember that, in this case, you have n star. It's like a whole thing together. There's no n space star. So it means that it's the type. Remember, that when you have the variable star, you're talking about the type. When you have just star and then the name of the variable, it means that you're dereferencing the pointer, which means that you're looking at the pointer, finding the address it's pointing to, going to that address, and looking at whenever you have there. So I tell my students that when you have star, you should think that it's the abbreviation of content of. So if you have a pointer and you do star pointer, it's the content of the pointer. So you go to whatever it's pointing to and look at the constant content. And the ampersand is the same thing as address of. So if I have a variable a-- like, let's say that I did int a equals 3-- if I want to find the address of that variable a memory, I can just do ampersand a. So it's address of a. Make sense? So here's an example. This is missing int b and int c. So int a equals 3 means that I'm going to go to memory. And I'm going to find a slot and put the number 3 here. And then int b equals 4. I'm going to do the same thing. Go to memory and put a number 4 in one of the boxes. And int equals 5. Find another box and put a number 5. So what is this line doing out? n star pa equals ampersand a. So first of all, n star pa. What is it doing? AUDIENCE: [INAUDIBLE]. LUCAS: Yeah, so n star pa, first, declares a pointer called pa. And then it's assigning the value of that pointer to be the address of a. So ampersand a. Then, if I do star pb, what is a star pb? Oh, sorry. This is also missing. n star pb. I mean star pc. I'm so sorry. It's the same thing. But now I'm good ar creating a pointer to b and then a pointer to c. Yeah? AUDIENCE: [INAUDIBLE]? LUCAS: Yes. So if you go to memory and you go to the box that is designator for pa, you're actually going to see an address of a. OK? Yeah? AUDIENCE: [INAUDIBLE]? LUCAS: Yeah, pointer is an address. Never forget that. It's like the most important part about pointers. There's storing and address to some variable. Anything else? Any other questions? OK. So Pointers and Arrays. Remember that when I do int array 3, basically, what I'm doing is I'm, kind of, declaring in a pointer. So array is kind of like a pointer to a specific place in memory in which I allocated three slots for integers. Does that make sense? So when I do int array 3, what I'm doing, basically, is creating three slots in memory. So I just find three slots in memory. So if I do, then, a star array, it basically means the content of array, which means I erase the pointer, I go to that place that it's pointing to, and I put the number one. And then, if I do star array plus 1, that's the same thing as doing array brackets one, which just means I go to the place that it's pointing at. And then the plus 1 makes me shift one position. So I go to this position, actually, and put the number two. And then, finally, when I do array plus 2, I go to where array's pointing at. And then I move to memory blocks. And then I put the number three here. Yeah? AUDIENCE: So star array is simply saying the very first point. And you can add 1, just because we're only really referencing that first address. LUCAS: Yeah. Why do we, for example, say array 0, array 1, and array 2? I'm saying, why do you do 0, 1, 2, 3 instead of 1, 2, 3? One of the reasons is, one, computer programmers prefer to start counting from 0. Two is because when you do array 0, it's the same thing as doing array plus 0, which means I go to that position, and I don't skip any memory blocks. So I don't move any memory blocks. Yeah? AUDIENCE: [INAUDIBLE]? LUCAS: So she's asking what is the difference between doing this or doing malloc. One of the differences is that int array 3 is creating an array on the stack. And when I do malloc, it creates on the heap. Does that make sense? So how does malloc actually work? So why do we even need to use malloc? Your compiler kind of figures out all the variables that you declared. And he creates space for all of them in the stack. So all of your variables are going to be somewhere in the stack. So here is the environment variables. So basically, space for those variables in memory is allocated at compile time. So it means that your computer has to know all of those variables beforehand. It doesn't need to know what value you're going to put in them. But it needs to know how much memory you need. But now let's say that, for example, you're creating an array or taking a string that you're taking from the user. You don't know how long the string is going to be, for example. So you don't know exactly how many memory blocks you allocate, right? So it doesn't really make sense for you to say put 100 characters. And then what if the user writes 150? You're going to be screwed. So basically, you cannot be sure of how much memory you need to allocate when you compile the program. You just know that on run time. So that's why you have the heap. So the heap is going to have memory that you're allocating during the duration of the program running. So basically, when you do malloc, what you're doing is allocating memory at runtime, which means that you're deciding right at that moment that you should have that memory. So that's when you're allocating it. Does that make sense? So remember, the stack has variables that are created on compile time. And then the heap has variables that are created as you go with malloc, for example. AUDIENCE: [INAUDIBLE]? LUCAS: So GetString is going to call malloc. Let me talk about malloc, and I'll explain GetString. So malloc is the same thing as memory allocation. So it's going to allocate memory on the heap. And it's going to return a pointer to where that memory was allocated at. So when you do-- here for example-- n star pointer. And then pointer equals malloc size of inch times 10. I'm creating a pointer. And then I'm assigning that pointer to the value of the pointer that malloc is giving me. So I'm asking malloc can you allocate space for 10 integers. That's what it's saying. And malloc gives me back a pointer to that place. Make sense? OK. I And GetString is, basically, doing a call to malloc so you can allocate memory during runtime. Always remember to check for null because malloc is going to return null if it cannot allocate memory. Let's say that you ask for a ridiculous amount of memory. Your computer is not going to be able to allocate that much. So malloc is just going to return null. So always remember to check if the pointer that you got from malloc is null or not because, if it is, you might be dereferencing a pointer and causing side faults. And finally, don't forget your free memory. Malloc is creating memory in the heap. And you have to free the memory before the program ends. OK, that's all for me. Sorry, Rob. Thanks. [APPLAUSE] LUCAS: Any last questions before Rob comes? No? Yeah? AUDIENCE: I didn't see this one online. Have you uploaded it yet? LUCAS: I think Dave is uploading it soon. DAVE: It'll be posted. LUCAS: It'll be online. AUDIENCE: It's up. LUCAS: It's up? OK. Yeah? AUDIENCE: [INAUDIBLE]? LUCAS: Yes, you should free all the memory that is put in the heap. AUDIENCE: [INAUDIBLE]? LUCAS: Yes. Any time that you have a culture malloc, you should have a culture free after you stop using that variable. So malloc and free are always together. Their best friends. Yeah. Rob? ROB: I'll go quickly. And also the video will be put up. I have the mic on. OK, so week five stuff. First thing we have is the stack. So remember that there's only one stack frame per active function call. We'll see that in a second. And also remember what actually goes in each stack frame are going to be the local variables of our functions, the arguments that are passed into our functions, along with a couple other things you don't really need to worry about. So here's an example program where, notice, main is printfing the return value of foo 4. foo is just going to return the value of bar 4 comma 6. And bar is going to set some local variable n equal to 4 times 6. And then return n. So let's look at the stack throughout the actual iteration of this program. So there's the bottom of our stack. Remember that the stack grows up. So at the bottom of our stack, we have a stack frame for main. When the program starts, main is always going to be at the bottom of our stack. And what is inside of our stack frame for main? So even though there are no local variables to main, like I said before, we have argc and rgv taking up space inside of main stack frame. So main is now going to call the function foo. And that means foo is going to get its own stack frame. So now we're inside of the function foo. And what needs to go in foo's stack frame? Well, foo has an argument n. And n is equal to 4 since that's what main is passing as foo's argument. So now foo is going to call bar. What is bar going to have inside of its' stack frame? It has x equal to 4 y equal to six. That's not all that we're going to have in the stack frame because bar also has a local variable n. And n we're going to set equal to 24. So now bar is going to return n. So bar is returning 24 to the stack frame foo. And because bar is now returning, that means we're popping the stack frame for bar off of the stack. So all the memory that bar had been using is now off the stack. Now, foo is also going to return 24 to main. So now that foo is returning, the memory that foo was using in its' stack frame is also gone. And now, main is going to call printf. So printf is just another function. When we call printf, it's going to be another stack frame for the printf function call. What are we passing printf? That's what's going to go on its stack frame. At the very least, we're passing that percent i backslash n and the argument 24. It might have more in it's stack frame if printf happens to be using some local variables. We don't know. But all that goes in printf's stack frame. It's going to execute the printf. Then printf's done. It will return. Finally, main is done. Main will return. And then our program is done. Yeah? AUDIENCE: Are you seeing [INAUDIBLE] arguments [INAUDIBLE] parameters? ROB: So there is a subtle difference between arguments and parameters. And really, in common speak, people tend to just mix them up all the time. But parameters are the formal name of the things. So argc and argv are the parameters to main. Arguments are what you actually pass in as those parameters. So there when I call foo of 4, 4 is the argument I'm passing in. And the parameter n, inside of foo, takes on the value 4 since 4 was the argument. AUDIENCE: [INAUDIBLE]? ROB: n is a local variable to bar. n is still local to foo, but it's a parameter to foo. It's not a local variable. Yeah? AUDIENCE: [INAUDIBLE]? ROB: foo is just calling bar and returning whatever bar returns. AUDIENCE: [INAUDIBLE]? ROB: Yeah, just to see multiple stack frames. Yeah? AUDIENCE: Why was foo called before printf? ROB: Why was foo called before printf? So I could have, instead, done something like int x equals foo of 4 and then printed x. But instead, I combined the function call into the printf argument. But notice that we can't actually execute the call to printf until we figure out what foo of 4 is. So we're going to evaluate this. And only once that's done are going to come back and evaluate this. Yeah? AUDIENCE: Since both bar [INAUDIBLE] value, why do we not have [INAUDIBLE]? ROB: They totally should be int. That was not caught over multiple passes. So it should be int bar and int foo since both of those are returning integers. Void is only if they're not going to return actual values. Yeah? AUDIENCE: If you had a line above the return, [INAUDIBLE]? ROB: A line above the return? AUDIENCE: Yeah. Like if you did printf and [INAUDIBLE], would it print twice? ROB: So inside of foo? If we had a printf right here? AUDIENCE: Yeah. ROB: So if we had a printf right here, it would print once. Since we are calling foo once right here, then we'll hit the printf. Then we'll call bar. And then foo will return. And that's it. We only ever encounter the printf once. Yeah? AUDIENCE: [INAUDIBLE] printf calling foo because we're first calling printf and then we're passing the arguments. ROB: So in theory, isn't printf calling foo? So no. Just the order that c is going to execute these things is, before we can call a function, all of the arguments to the function have to be completely evaluated. So is this completely evaluated? Yes, it's just a string. It's just a value. Then we have to completely evaluate this. Once this is done, now all of its arguments are evaluated. And now we can make the call to printf. Yeah? AUDIENCE: One question. If you have a void function, must you have return semicolon? ROB: You do not a return semicolon if you have a void function. OK. So now some heap stuff. So heap is how we're going to deal with dynamic memory management. And this directly contrasts with the stack which we would call automatic memory management. So on the stack, you never really have to deal with how the local variables are being pushed and popped off all these stack frames and all that stuff. You don't have to worry about it. It's automatic. So the heap is manual. And the [INAUDIBLE] comes from these functions malloc and free. So here's another program. All we're doing is mallocing an integer. We're storing it in star x. Of course, we have to check to see if x is null. Then we're going to just set what x is pointing to to 50. Print what x is pointing to, print x, and then free x. So how is this actually going to look if we look at our stack and heap? So we'll start again. The bottom of our stack as before. Remember that thee heap directly opposes the stack? So we're going to have the top of our heap up there. So the bottom of our stack, we have our stack frame for main. It has the space for argc, argv, and we now have a local variable x, which is an int star. So we're going to iterate through this program. First thing we have is a call to malloc. So we're making a call to malloc. Malloc is a function. It's going to get a stack frame. What are we passing to malloc? That's going to go inside of the stack frame. We're passing size of n, which is 4. So that is passed to malloc. What does malloc do? It grabs us some space on the heap. So we're going to go to the heap. And we're going to grab 4 bytes from the heap. So let's just give that an arbitrary address. 0x123 Just pretend that is an address that is on the heap. So what is actually inside of that region of memory at address Ox123? Garbage. So we have not stored anything in it. So as far as we know, it could be anything. You shouldn't assume it's zero. It's most likely not zero. So now malloc returns. And what do we do when malloc returns? We set what it returns. We set x equal to what it is returning. So what is it returning? It's returning 0x123 since that is the address of the block of memory that it just allocated in the heap. So return 0x123 x is now going to be set equal to 0x123 which, pictorially, we frequently draw as x having an actual arrow pointing to that block. But x is just storing that address. So now we have to check if x is null. It's not null. We pretend that that malloc succeeded. So now star x equals 50. So star remembers it means go to that address. So 0x123 We're going to go to that address. So that brings us up there. What are we doing at that address? We're storing 50. So after this line, that is what things are going to look like. So now it's no longer garbage up there. Now we know that 50 is in that particular address because we set it to that. OK? So now we're going to print f. So first we're going to print star x. So what is star x? Again, star x means go to the thing that x is pointing to. So x is storing 0x123 Go to that. We get 50. So print f that. And that means it's going to print 50. And then that returns. And then we have the second printf. We're now percent p. If you haven't seen it, that's just how you print a pointer. So we have percent i, percent f, and all of those already. So percent p, print a pointer. So x is a pointer. So if we're going to print x itself, we're printing what is actually inside x, which is 0x123 So the first print f is going to print 50. The second print f is going to print 0x123 Yeah? AUDIENCE: Do you use percent x to print a pointer? ROB: So do you use percent x to print a pointer? So you can but percent x is just, generally, for like if you have some integer and you want to print it as a hexadecimal. That's just how you do that. Whereas, percent d would print as decimal. That's were we get percent d. i is just integer. percent p is specifically for pointers. So x is a pointer. We want to use percent p. But percent x could work. Yeah? AUDIENCE: [INAUDIBLE]? ROB: Yeah. At least for this call-- so I didn't include it in here. But these two arguments are necessarily inside of this stack frame along with any local variables printf happens to be using. And then the next call to printf now inside of printf stack frame is percent p backslash n and whatever the value of x is, which is 0x123. Yeah? AUDIENCE: [INAUDIBLE]? ROB: It'll print something that looks like this. AUDIENCE: [INAUDIBLE]. ROB: So it prints it in address form. It looks like an address. Yeah? AUDIENCE: [INAUDIBLE]? ROB: Why is what? AUDIENCE: [INAUDIBLE]? ROB: Why is this pointer 4 bytes? So there are a whole bunch of 0's in front of this. So it's really 0x0000000123. On a 64-bit system, there would be a whole bunch of more zeros. Yeah? AUDIENCE: [INAUDIBLE]. ROB: So the first printf is going to print-- AUDIENCE: [INAUDIBLE]. ROB: Yes, it's going to print what x is pointing to. Star says what is this thing pointing to. Grab it. So what is it pointing to? 50. Grab it. That's what we're going to print. Whereas, the next one, we're just printing x itself. What is inside of f? 0x123. OK. And then, finally, we have the free. What are we passing to free? We're passing x. That time I actually displayed it in the stack frame. So we're passing the value 0x123 to free. So now free knows, all right, I have to go up to the heap and free that memory. It's no longer using what is at address 0x123. So free is going to release that from the heap. Now our heap is empty again. We have no memory leaks. Now free will return. Notice that x is still 0x123. But that is now not valid memory. We should no longer dereference x. Yeah? AUDIENCE: Is return 0 redundant? ROB: Is returen 0 redundant? Yes. We just put that there because we have a return one for air. So it's like, yeah, lets include the return 0. Yeah? AUDIENCE: [INAUDIBLE]? ROB: So after free x, what happens if we try to dereference the pointer? It's possible that nothing goes wrong. It's possible that we'll still get 50. It's possible, also, that that memory is now being used for something else. So it's undefined behavior. And undefined means anything can happen. Yeah? AUDIENCE: [INAUDIBLE]? ROB: No, so if you assign x to something else. So if right here we said x equals malloc something else-- malloc size event-- then that original block of memory is not freed. And we have officially lost it. That is a memory leak. We've lost all references to that block of memory. So there's no way we can ever free it. OK, so then return 0 means done. All right, so stack overflow. What's the idea here? So remember, heap is going down. Stack is going up. So this was the example from lecture, I think, where main is just going to call this function foo, which is going to call itself recursively over and over again. So stack frames are going to work exactly the same. So we're going to start with main as the bottom stack frame. Then main is going to call foo, which is going to get a stack frame. Then foo is going to call foo again, which is going to get another stack frame. And then again, and again, and again, and again until, eventually, we run into the heap. So this is how we get a stack overflow. And at this point, you seg fault. Or you'd really seg fault before this point but yeah. AUDIENCE: Is core dump the same as seg fault? ROB: So you'll see segmentation fault core dumped. You get a core dump when you seg fault. And it's like a dump of all of the contents of your current memory so that you can try and identify why you seg faulted. Yeah? AUDIENCE: [INAUDIBLE]? ROB: So a segmentation fault means there's a stack overflow. So not necessarily. A segmentation fault means that you're touching memory in a way you shouldn't be. So one way of that happening is, when you stack overflow, we start touching memory in a way that we shouldn't be. Yeah? AUDIENCE: [INAUDIBLE]? ROB: So inside of an infinite loop. Like, this is like a recursive infinite loop and so we get another stack frame each time. But just inside of a regular infinite while one-- well, let's not even print f-- do something. Whatever. We're not going to be getting another stack frame. We're just going to keep looping over this single instruction. The stack isn't growing. It's the fact that each recursive call is giving us a stack frame. That's why we get a stack overflow. Yeah? AUDIENCE: So if you said to get the while loop and then [INAUDIBLE]? ROB: So if inside of the while loop there was a printf, you still would not seg fault. I just didn't want to confuse things. It would loop. You'd get a single stack frame for the printf. Then printf would return. Then you'd loop again. You'd get a single stack frame for the printf. It would return. Single stack frame. So you're not getting this infinite piling up stack frames. AUDIENCE: [INAUDIBLE]? ROB: Yes. So this stack overflow happens because none of these calls to foo are returning. So if we return, then we would start losing stack frames. And then we would not stack overflow. And that's why you need a base case for your personal functions. Yeah? AUDIENCE: Is the potential size and the stack for the heap the same for all programs? ROB: Roughly. Is the potential size of the stack and the heap the same for all programs? Roughly. There is some randomization to where the stack starts and where the heap starts. If you happen to have a whole lot of global variables and things, you might take away from some space for your heap. On a 64-bit system, you virtually have infinite memory. There's just so much. Between 32 bits and 64 bits, that is a significant difference. You're going to get a whole lot more stack and heap space on a 64-bit system because there's just more addresses that they can use. But on an individual system, it will be roughly the same amount of stack and heap space. All right. So last thing is compilation. So you should know this process. There are four big steps. So the first one should be easy to remember. Pre-processing. It has the prefix pre in it. So it comes before everything else. The thing to remember is the hash. So hash defines and hash includes in all of those. Those are all pre-processor directives. These are the things that the pre-processor takes care of. So what does a pre-processor do? It's a really dumb thing. All it's capable of are all of these copy, and cut, and paste operations. So hash includes standard i0 dot h. What is that doing? It's grabbing the standard i0 dot h file and pasting it into the top wherever it says hash includes standard i0 dot h. And any hash define that we've seen, what is that doing? Its copying the value that the hash defined is defined as and pasting that wherever you are using the value. So the preprocessor just does really simple text based operations. It does nothing smart. So everything else is more complicated. So now that preprocessor is done, we actually compile. So what does compiling mean? We're now going from c code to assembly code. Yeah? AUDIENCE: [INAUDIBLE]? ROB: Yeah, we caught that. So compiling. We're going from c to assembly. So this is an actual language change. Compiling itself means going from a higher level language to a lower level language. And c is a high level language compared to assembly. What is assembly? Its instructions that are, pretty much, made for your CPU. But your computer still does not understand assembly. It only understands ones and zeros. So the next step is assembling, which brings us from these instructions that your CPU understands and actually translates them, to the ones and zeros. So C to assembly to binary. But I don't have an executable yet. So think of the cs50 library. We have provided you with a binary for this cs50 library, which has GetString and GetInt and all that. But the cs50 library-- in and of itself-- is not executable. It does not have a main function. It's just a bunch of binary that you can use. So linking is how we bring together all of these different binary files into an actual executable. One that you can type dot slash a dot out. So this is like the file that you wrote,-- whatever your program is-- Ceaser dot c. But now it's been compiled down to binary. So Ceaser dot o. And this is our cs50 libraries binary. And they're being combined into a single executable. Yeah? AUDIENCE: [INAUDIBLE]? ROB: So first include, remember, the hash include is actually a pre-processor step. But that's separate. If you're not using any functions that are outside of your single file then, no, you don't need to link anything since you have everything. That said, printf is being linked in. If you ever use printf, that's something that needs to be linked in because you didn't write that. And, in fact, printf is automatically linked in. You know how at the command line or when you type make, you see it have dash l cs50, which has link in the cs50 library? Printf, and stuff like that, is going to be linked in automatically. Any other questions on anything? AUDIENCE: [INAUDIBLE]? ROB: Linking? We have a whole bunch of different binary files. This is the canonical example that we use is cs50 library. We have compiled and given to you the binary for this cs50 library. You want to use GetString in your program. So you go and use GetString. But without my binary code for GetString, when you compile your code down, you can't actually run your program because GetString String is not yet completely defined. It's only when you link in my binary that contains GetString that now, all right, I can actually execute GetString. My file is complete. And I can run this. Yeah? AUDIENCE: Does linking convert the binary to executable? So even if you don't have other libraries, wouldn't it still be necessary to translate the [INAUDIBLE]? ROB: So an executable is still in binary. It's just combining a whole bunch of binaries. AUDIENCE: Thank you so much. ROB: No problem. Any other questions? Otherwise, we're all set. All right. Thanks. [APPLAUSE] AUDIENCE: Thank you. ROB: Yeah.