[MUSIC PLAYING] [VIDEO PLAYBACK] - --we know? - That at 9:15, Ray Santoya was at the ATM. - So the question is, what was he doing at 9:16? - Shooting the nine-millimeter at something. Maybe he saw the sniper. - Or he was working with him. - Right. Go back one. - What do you see? - Bring his face up. Full screen. - His glasses. - There's a reflection. - That's Neuvitas baseball team. That's their logo. - And he's talking to whoever is wearing that jacket. - We may have a witness. - To both shootings. [END PLAYBACK] DAVID MALAN: This is CS50, and this is lecture 3, and that is not how computer science works. And indeed, by the end of today, we'll make clear exactly what's right, what's not right about that, and hopefully give you some pause any time you watch TV or movies hereafter and notice these little things that all too many writers seem to take for granted. So recall that last time, we took a lower-level look at what compiling actually is. And recall that it was a few things, these four steps of pre-processing and compiling and assembling and linking, so that when you start with your source code, which might look like this code that we have written in the past, you first have to preprocess it, and the first step in pre-processing was converting all of those preprocessor instructions-- anything starting with a hash at the beginning-- to their equivalents. So opening the files and effectively copying and pasting the contents there so that programs and the compiler know what get_string is and know what printf is. The next step that came after that was actually compiling, whereby compiling technically means taking that source code, once it's been preprocessed, and generating this very cryptic-looking stuff called assembly code. And those assembly codes or assembly instructions are really what the CPU-- the brain of your computer-- actually understands, although technically the computer understands them only in the form of 0's and 1's. 
And so when you assemble-- step three-- that assembly code, you actually get out those 0's and 1's. But even that simplest of programs where we just prompt the user for a string and then print out their name still involved a couple more files. There was not only cs50.h and stdio.h at the top, somewhere in the computer system there's probably files called cs50.c, and in the case of stdio, printf.c, in which actually the code is for those two functions, those two have to get compiled down to 0's and 1's, and then we need to link everything together, merging those 0's and 1's so that the computer has access to your code and to printf's code and to the cs50 library's code, and so forth. But all of that we can just generally wrap up in the descriptor of compiling. And so that's one of the looks we took last week. And we also have introduced, last week and previously, a few tools. And odds are, you're having as many frustrations perhaps already with the p-sets as you are accomplishments and sense of satisfaction. And that's normal, and rest assured that the scales will eventually tip more toward happiness and away from sadness, but we'll give you indeed more tools today than these for actually finding problems or shortcomings in your code. help50, recall, helps you with what process? When should you instinctively consider using help50? When you see error messages on the screen. Something you don't understand that's the result of some mistake you probably made but you don't quite understand what the computer is telling you, run help50, and then that same command, and we, the staff, with our code will try to understand the message for you and provide you with feedback. style50 does exactly that. It helps you see with red and green color coding exactly what spaces should be there, shouldn't be there-- it just helps you pretty up your code so that you can read it better and other humans can as well. 
And then printf, which is kind of like the coarsest tool in your tool box, this is just helping you see not only messages you want to see, but just the values of variables. You can print ints and strings, whatever you want, and then you can delete those lines of printf once you're confident your program's working. But that gets a little tedious, and honestly, as our programs get bigger, we're going to want more powerful tools than manually printing things out, recompiling, rerunning; it very quickly gets tedious. And the goal of programming is not to be tedious, but to be empowering, and that's where we'll step to today via this. So CS50 IDE is sort of a fancier version of what you've been using called CS50 Sandbox, and in turn, CS50 Lab. Now recall that both of those tools, the Sandbox and the Lab, have a terminal window where you can type commands, they have a code editor where you can actually write your code, and then they have a file browser with icons and such where you can actually see your files and folders. So it turns out that CS50 IDE is another tool that at first glance is very, very similar, even though it's laid out a little differently, but it has as many features as the Sandbox and the Lab, and some more. More features that actually help you solve problems in your code and even collaborate come final project time with others if you would like. So this, we'll see, is the CS50 IDE. It comes with the so-called night mode so you can make everything a little darker on your screen, especially if p-setting at night, and let's actually take a look then at what you can do with this kind of tool. When you log into this tool for the very first time in the next problem set, you'll see an interface that's almost the same as before. 
The colors are a little different, the font sizes are a little different, but at the bottom by default, you have your so-called terminal window, though instead of the dollar sign now, you'll see a little more detailed workspace, but more on that in a bit. Up here you just have the code editor window, nothing's really going on there. And then we have the added feature of Ceiling Cat in the top right-hand corner. And we'll also see some other features along the way. So let's actually write a program in CS50 IDE, which, to be clear, is just another web-based programming environment that also gives you access to your own cloud-based server. It, too, is running Ubuntu Linux, which is a popular operating system that is not macOS and it's not Windows. But unlike the sandbox environment where you don't even log in and you lose your files eventually, as you may know from when your cookies are lost or something goes wrong, the IDE saves everything. And you'll log in with your account, and whatever you put there last week is going to be there this week and next week and beyond. So let me go ahead up to File, New File, or I could just click this little plus icon in the top right-hand corner, and let me go ahead and preemptively hit Control-S or Command-S or go to File, Save-- you should find the interface very similar to any Mac or PC program-- and let me go ahead and save this file as follows. I'm going to call this hello.c. And it's important to mention the file extension, otherwise the IDE, like the Sandbox and the Lab, won't know what type of program you're writing. And then let me go ahead and just write my simplest of programs. So let me go ahead and include stdio.h, int main void. Let me go ahead and open my curly braces, printf-- hello, world, backslash n, and a semi-colon. So you'll notice that almost everything is the same. 
The colors are a little different, perhaps, and you might see some different assistive features as you're typing your code, but the end result is the same. And the color coding you just get for free because it's helping draw your attention to different parts of the code. Let me go ahead now and-- oh notice this. There's one difference. The IDE is a more powerful tool, but as such, it's a more manual tool and it's not just going to auto-save your code for you. Nice as that's been with the Sandbox, such that you never actually had to hit Command-S or Control-S-- and if you were, you didn't need to be-- the IDE is only going to save things when you want it to so that nothing will happen magically anymore. So what I'm going to have to do is go back up here, File, Save, or Command-S or Control-S, you'll see a little green dot briefly, and now I'm back at my prompt. I'm going to go ahead now and type my familiar command, make hello, Enter, and you'll see pretty much the same cryptic-looking clang command as before because the IDE is configured quite like the Sandbox. And if I want to go ahead and run this now, how do I run this program? Quick check? ./hello, it's exactly the same as before. ./hello, and there we have it, hello, world. So long story short, the user interface thus far is a little different, but functionally it's the same. We're just going to now start to see some more features. So what are those features? And let's introduce some new capabilities that were actually possible in the Sandbox, we just didn't really introduce them at the time. If I click this folder icon at top left, you'll see all of my files and folders. And today for lecture I have a lot of pre-made examples that are already on the course's website, some of which we'll look at, some of which we'll refer to the website, but these are just familiar files and folders. 
And you can see that everything in my account is apparently in something called Workspace, which is just a folder name, or a directory. Here's my sc3 directory, which again, comes from the website for today's lecture, lecture 3. And then here's the program I just compiled, hello, and the file that I wrote, hello.c. You'll notice too that there's this funky symbol here, tilde, that you might not have occasion to write often in English, but in Spanish and other languages you might use this character. This is actually a shorthand notation for what's called your home directory. In this environment, CS50 IDE, you have your own home directory, which means your folder of files and other folders that you get to create, you own, and that persists every time you log in-- you're not going to lose the contents therein. So this just means that in your home directory, a.k.a. tilde, there is a folder called workspace in which I'm currently working. And that's just one folder in which all of my work is going to be done, because there's so many other files and folders in this cloud environment, just like there are in your Mac and PC, we just generally don't care what they are. But notice what we can do at this terminal window besides compile and run code. There are other commands. For instance, this blue text here, similarly to the file browser up top, indicates now not just that this is my prompt per the dollar sign, but that I'm in my home directory's workspace directory. So that means I can be elsewhere even though I haven't specified where I want to go yet. And in fact, I can do this. ls stands for list, it's just shorthand notation for that. And now I see a textual version of my file tree, so to speak. So you'll see here, sc3 is a folder, and you can tell as much because there's a slash at the end of it. hello.c is of course the file I wrote a moment ago. 
And then hello in green is my program that I compiled, and the star or asterisk there is just-- it's not the name of the file, it's just indicating to me visually that that is executable. That's a program I can run just so I know what's compiled and what maybe is source code. So when you're running ./hello, the reason all this time this has been working is because in dot, your current folder, there is a file called hello, and when you hit Enter, you are running that program there. So if after today you go back onto CS50 Sandbox or CS50 Lab and type ls, you'll see exactly the same thing as you might by the little folder icon in those programs as well. But suppose I want to go into a directory. In macOS or Windows or even the IDE, I could, of course, go to my File icon, and then per the little triangle here, which might seem intuitive, you just click it and you can see what's going on inside, not surprising. But how do you do that textually? At a command prompt, well it's not all that hard. You just need to change your directory. So if I do cd space sc3, Enter, nothing seems to happen quite yet except that my prompt changed. Here's the indication that-- this is my prompt, but to the left of it you see in blue that I'm now in my home directory's workspace folder, in my sc3 folder there. So it's just a text-based version of the GUIs, the Graphical User Interfaces that all of us have certainly come to take for granted in the world of macOS and Windows thus far. Well, suppose that I'm a little done with my hello program and I want to delete it. Well in the IDE, like in the Sandbox, you can actually go up here and you can click on it, and then you can typically right-click or control-click, and you'll get a whole menu of other options, one of which is Delete-- and feel free to tinker like that in your own environment. But what about the command line? 
If I zoom in down here and I want to remove hello, you're not going to type remove because that just feels a little verbose and humans decades ago decided that's too tedious to type, let's just call this command rm-- for remove. If I type rm hello, you're going to see a somewhat cryptic prompt. rm-- remove regular file 'hello'? This is more arcane than it needs to be, but it's just asking, are you sure you want to delete 'hello'? Then it's just waiting for you. And here you can type y or yes or sometimes other commands too, now I've confirmed that my intentions were yes. If I type ls again, I-- whoops, in the wrong folder. If I type ls again after doing hello-- no-- after doing hello and do ls, now I'll see just those two things-- sc3 and hello.c. What if I want to make a folder? Well notice this. If I type at the bottom here, make directory-- mkdir-- test just to make a test folder, I'm about to hit Enter, but watch the top left-hand corner where I currently have those other files and folders, and when I hit Enter, now I have a test folder. So these things are identical. One is graphical, one is command line, and there's even other commands if I decide I don't want that. rmdir is remove directory, and it just goes away because it's empty and thus safe. Any questions then on any of those commands or just the overall layout of what it is we're looking at? All right, so don't get hung up on any of those commands, and the problem set and beyond will always remind you of those kinds of features. The point for now is just that we're in a somewhat new environment, but it's fundamentally still the same, it has the same capabilities. So what are other tools we looked at? So you might have heard rumors about a tool called check50, and indeed, this is a tool that the staff use to evaluate problem set 1 and problem set 2 to evaluate the correctness of them so that we ourselves don't have to type ./mario or ./caesar again and again and again to test students' code. 
But starting this week, you, too, have access to the same program. check50 is a command from the staff that checks the correctness of your code just like style50 checks the style of your code. And in fact, if I go back over to my IDE, let's try to use this for the first time by making the same version of hello that you did perhaps for your first problem set. So if I go ahead and include not just stdio, but cs50.h, and I go ahead and get a string from the user with get_string, prompting them for their name, and then go ahead and print not just hello, world, but hello, percent s comma name, this I believe was the same program you yourselves probably wrote, or some variant thereof. So if I go ahead now and test this myself-- make hello, Enter, seems OK, ./hello. I'm going to go ahead and type in my name, and voila, hello, David. Now suppose you're feeling pretty good, you're pretty confident that your code is correct, and most importantly, you have tested your code yourselves. It's not sufficient to rely on our tool alone to test your code because it, too, might not be exhaustive. So once you've tried a few inputs, not just David, but perhaps Veronica's name as well, seems to work. Brian's name as well, seems to work. No name at all, doesn't seem to work, maybe? But we'll have to look back to the problem set to see if that's actually a problem. Let me go ahead now and run check50. check50 expects a special slug, so to speak. Just a unique identifier for the problem that you want to check. And you would only know this from reading a problem set or a documentation online. I just happened to recall that the command that the staff had been using to grade and evaluate hello is just cs50/2018/fall/hello. And the slash is to just kind of visually distinguish those words, this isn't a folder or files or anything like that in your own account. So I'm going to run check50 cs50/2018/fall/hello in the same directory that hello.c is in. Enter. 
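Testing a few names by hand, as described above, can also be written down in code. What follows is only a sketch in plain C without the CS50 library: make_greeting is a hypothetical helper, not something from the lecture or from check50, and it simply builds the same hello, name line so that it can be checked for several inputs, including no name at all.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper (not part of the lecture or the CS50 library):
// writes "hello, <name>\n" into buf, which can hold at most size bytes.
// Returns the number of characters the full greeting needs, as snprintf does.
int make_greeting(char *buf, size_t size, const char *name)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, size, "hello, %s\n", name);
}
```

To try it, call make_greeting from your own main with a buffer such as char buf[64] and print the result with printf("%s", buf). An empty name still produces a greeting, which is exactly the kind of case worth checking against the problem set's specification.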
It's going to go ahead and connect to GitHub, which is the backend, recall, that we use for storing your code. It's authenticating me now, which means what's your username and password? I'm going to go ahead and use one of my test accounts. And now it's prompting me for my password, and I'm going to go ahead and type that in. You'll notice you're seeing stars like you see bullets in a website just so that someone looking over your shoulder can't see what you're typing. Now I'm going to go ahead and watch the progress. It's preparing, let me go ahead and zoom in. Dot-dot-dot. It's looking at my code, it's getting ready for submission, it's now uploading it to GitHub.com, and once it's on the servers, then it's going to tell CS50 server, here is so-and-so's submission, go ahead and run a few automated tests on it, checking therefore its correctness, and hopefully we're about to see some green, happy smiley faces, and voila, yes, it looks like this check50 command for this problem-- or slug, so to speak-- checked that hello.c exists, because if I forgot to write the file or if I misnamed it, nothing's going to work. We checked that it compiles successfully, so that, too, is a happy green face. Then it apparently checked-- what if we type in Veronica? Do we see hello, Veronica? Apparently yes. What if we typed in another word, Brian? Yes, apparently we say hello, Brian. And so with high probability, we're going to conclude, based on those four tests, that your code is, in fact, correct, at least with respect to those inputs. And there's often some more detail via URL at the bottom where you can actually see more graphically just more feedback on your code. Of course, the first time, second time, third time maybe you run this command, you might not see some green happy faces, you might see some red unhappy faces or some yellow flat faces, which just means we couldn't even run the checks because something else is wrong. 
But over time, this will help you feel more comfortable and more confident that your code's correct before you actually use submit50 and submit. Going into it you'll feel a little better or a little frustrated to know in advance-- wait a minute, I'm about to submit this but nope, it's not yet correct. So realize it's a two-edged sword. Any questions about check50 or any of these commands thus far? Anything at all? No? All right. So let's take a look at the final and most powerful tool now available to you in the IDE environment. Built in to CS50 IDE, which stands for Integrated Development Environment, which isn't a CS50 thing-- this is a common term in industry for tools that make it easier to write code, it turns out that there's some other feature besides the cat over here. Namely, one, you can share your workspace with teaching fellows and course assistants so they can perhaps help you in real time a la Google Docs, even chatting with you in real time. But it also provides you with what's called a debugger. A debugger, as the name suggests, removes bugs-- or rather, helps you remove bugs from your code by allowing you to not just resort to printf-- printing out ints and strings and whatever it is that's going on in your program, it kind of automates that very tedious process for you. And it lets you walk through your code one line at a time at your own comfortable pace and see along the way all of the values of your variables in that program. To activate this debugger, I'm going to go ahead and do the following. I'm going to compile my code as always with make hello. It has to compile, otherwise I might want to use help50 and figure out why it's not compiling, but it does seem to have compiled. And now I'm going to go ahead and run debug50, space, and then the name of the program I wanted to debug. And the name of the program I wanted to debug at the moment is the current directory's file called hello. Let's assume that there's perhaps something wrong with it. 
The first time I run this command, though, debug50 is not going to be happy with me because it's going to say, it looks like you haven't set any breakpoints. Set at least one breakpoint by clicking to the left of a line number and then rerun debug50. Well what is a breakpoint? Well as the name kind of suggests, it allows you to break or pause the running of your code at any of your lines. And all this time for the past few weeks, your code has been automatically line-numbered. And this is useful because the most interesting line in this program, once it really gets going, isn't this stuff at the top, it's not int main void, right? That's all copy-paste from past programs. It's really the sixth line here where I actually have some logic of my own. And so in CS50 IDE, what you can now do is click to the left of one of these line numbers, a little red light like a stop sign is going to appear saying, break or pause my program on this line so that I can poke around my actual code. Sandbox and Lab cannot do this. So now I'm going to go ahead and rerun debug50 in exactly the same way, hit Enter, but now I have one breakpoint. And you'll see on the right-hand side a fancier menu just popped up by the cat that provides me with a bunch of features. And at first glance, frankly, it's a little overwhelming because there's a lot going on here, but you'll notice first, and most importantly, there's some mention of my name variable. I don't quite understand 0x0 or whatnot, but I do understand string. And so what the debug50 program has realized is oh, on this line and below, you have a variable called name. It doesn't seem to have a value yet. 0x0, it turns out, is just going to mean empty or null or 0. But that's good, because now, when I actually execute this line, hopefully it's going to take on the name David or Veronica or Brian. So let's see what happens. Notice that it's highlighted in yellow, line 6, which means it has not yet executed this line of code. 
My code has paused at this point because I set that breakpoint. And then notice kind of like a music player up here, there's a few icons. The Play button is just going to say, ah, play my program, run it all the way through the end, kind of like Scratch with the green flag. But more powerful is this. You can step over this line, therefore executing it just once. If it's a function, you can step into this line and actually look inside of a function that you're using, like get_string, or you can step out of another function, but more on that another time. So what I'm going to do is this. And the button I'm going to click most commonly when trying to understand how my program is working is this-- Step Over. So it's the second icon from the left, right next to the triangle. So once I click this, watch what's going to happen, even though it's a little small, on the right-hand side for my name variable. Notice that I'm being prompted to type in my name because the program is still running in my terminal window, but when I hit Enter now, providing my own name, automatically you see on the right-hand side that this name variable has a value now of, quote-unquote, "David" of type string. There's this 0x1083010-- more on that later, just a little cryptic, but I didn't have to use printf now, I can actually see what's going on. Now you can see that line 7 is highlighted, because I set a breakpoint above it, so now I'm on the second line because I just stepped over the first. Let me go ahead and click Step Over again, and you'll see that in my terminal window, hello, David just got executed. And now if I just keep going, it's going to go ahead and run to the end and close the debugger. 
So not all that useful for this program because frankly, I'm pretty sure this is correct, but the power of debug50 and a debugger more generally is that it lets you, whether you're less comfy or more comfy, walk through your own code at your pace just like a TF or a CA might say, OK, what is this line doing? What is this line doing? You don't have to resort to printf, you can just very methodically walk through your code and find that damn bug that's been bothering you for minutes or even hours. So henceforth, any time you have a bug in your code that is compiling but it's just logically incorrect-- the pyramid in Mario isn't quite right, your encryption of Caesar isn't quite right, or something else, your first instinct now should be, let me compile it, run debug50 on it, and just step through the code, setting a breakpoint wherever I want, so you focus on just a few lines, not the whole thing-- like I just did-- and see if you can figure out logically when a value is not what you expected, then oh-- go ahead and just click Resume, fix the bug, and retry. Such a powerful tool. Any questions? Yeah? What is it? AUDIENCE: What does it look like when there is a bug? DAVID MALAN: What does it look like when there is a bug? So the debugger won't find your bugs and it won't show you your bugs, per se. It's going to let you see what line is executing, it's going to let you see what's outputting, it's going to let you take input, but all it's going to do on that right-hand side is just show you the values of things along the way. It's up to you to infer from that information what it is that's going wrong, just like if you're using printf in past weeks to see what's going on in your program. Other questions? And let me save this too. It is so easy to get into the habit, especially when so many things have been new over the past few weeks of just saying, ah, this is just yet another thing to learn. 
This is hands down the kind of tool that if you spend a few extra minutes this week and next week just using it, get a little more comfortable with it, it will save you potentially hours in the long run, because all the time you've been spending manually trying to fix your bugs or posting questions online trying to understand things, this is a tool that if you invest those minutes upfront will just help you understand everything going on inside of your program, and will absolutely over the next few weeks save you more and more time. All right, any questions? Yeah? AUDIENCE: So you have a for loop that ran [INAUDIBLE] times, [INAUDIBLE] separate break statements so you don't have to [INAUDIBLE]. DAVID MALAN: Ah, good question. If you have something like a for loop or a while loop, something that's happening a lot, can you set a breakpoint in such a way that it only breaks so that you don't have to walk through it 100 times just to see that value? Short answer, yes. And let me defer to section and online resources for just a few of these features, but one, you can actually watch values, and you can have what's called a watch expression. You can say show me this value only when x is greater than 50 or something like that. Or you yourself can just add some lines of code. You could add an if x equals-equals 50, then print out something, and you can set a breakpoint on that new, if temporary, line, so there's a couple of ways to do that. Good question to anticipate. Yeah? Behind. AUDIENCE: If you run debug50, aren't you adding another argument with the [INAUDIBLE] in your main method at line 4? DAVID MALAN: Really good question. If you're running debug50, aren't you adding another argument-- argv-- per our discussion last week of command line arguments? Short answer, no, because debug50 corrects for that, so you don't have to worry about that. It will not shift things over numerically. Really good thought. Other questions? 
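The second suggestion, adding a temporary if so that a breakpoint is reached only on the iteration you care about, might look something like this. It's a sketch under assumptions: the function sum_to and the loop variable i are made up for illustration (the answer above speaks of x), and the message goes to stderr just to keep it apart from normal output.

```c
#include <assert.h>
#include <stdio.h>

// Sums 1 through n. The if statement is a temporary debugging aid:
// its body is a line that is reached only when i is exactly 50.
int sum_to(int n)
{
    assert(n >= 0);
    int sum = 0;
    for (int i = 1; i <= n; i++)
    {
        sum += i;
        if (i == 50)
        {
            // Set the breakpoint on the next line; delete this block when done
            fprintf(stderr, "i is 50, sum so far is %i\n", sum);
        }
    }
    return sum;
}
```

With a breakpoint on the fprintf line, a call like sum_to(100) from main pauses only once, on the 50th iteration, rather than on every pass through the loop.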
All right, so with that said, let's now take some training wheels off. So the only reason I bought these training wheels years ago is to make this very dramatic point of now taking the training wheels off today. OK, so what does this mean? Well worth the trip to Target. So what does this mean? For the past few weeks, we have been using a whole bunch of functions from CS50's library. All of these were meant to just make it pretty easy, relatively speaking, in the first few weeks to get input from the user. Because it turns out, as we'll see today, it's actually kind of a pain in the neck to get input from users in C, and frankly, even in other languages, reliably. Because you'll recall that get_string and get_int and all of these functions take on the burden of like re-prompting the user if they don't actually give you an int or don't give you a float or don't give you a char that you're expecting, they'll re-prompt, they're using a while loop or a do-while loop or the like, so there's just a lot of error detection built into these functions. But most important-- and most misleading-- has been the last one on this list. Recall that we introduced a couple weeks ago now the notion of a string. And a string is in English what? An array of characters, good. It's a sequence of characters, and we learned last week that a sequence can be implemented in an array, which is just a chunk of memory back-to-back-to-back-to-back. So string, though, is not quite like any of those other data types. It turns out that it's not quite like int or char or even bool or float, and we can start to see that now as follows. I'm going to go ahead and go into the IDE today-- and henceforth we're going to just start using the IDE, but you're welcome to keep using the Sandbox for quick and dirty programs, but for anything you want to keep around, your instinct should now be to open your IDE. 
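To get a feel for the re-prompting and error detection just described, here's a rough sketch of what a get_int-style function has to worry about. To be clear, this is not the CS50 library's actual code: parse_int and prompt_int are hypothetical names, and the sketch uses only standard C (fgets and strtol). It also assumes lines shorter than 64 characters, which a real implementation wouldn't.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: converts a line of text to an int.
// Returns true and stores the value in *out only if the whole line is one int.
bool parse_int(const char *line, int *out)
{
    assert(line != NULL && out != NULL);
    char *end;
    errno = 0;
    long n = strtol(line, &end, 10);

    // No digits at all, or a value too big or too small for an int
    if (end == line || errno == ERANGE || n < INT_MIN || n > INT_MAX)
    {
        return false;
    }

    // Allow one trailing newline, as left behind by fgets
    if (*end == '\n')
    {
        end++;
    }

    // Anything else after the number (like the x in 12x) is an error
    if (*end != '\0')
    {
        return false;
    }
    *out = (int) n;
    return true;
}

// Hypothetical stand-in for a get_int-style function: re-prompts with a
// do-while loop until the user types a valid int.
// Returns false if the input ends before that happens.
bool prompt_int(const char *prompt, int *out)
{
    char line[64];
    do
    {
        printf("%s", prompt);
        if (fgets(line, sizeof line, stdin) == NULL)
        {
            return false;
        }
    }
    while (!parse_int(line, out));
    return true;
}
```

From main, something like int i; prompt_int("i: ", &i); would keep asking until the user types a whole number. Even this much code ignores cases a library has to handle, which is the point about it being a pain in the neck.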
I'm going to go ahead and create a new file, and I'm going to call it compare0.c for my first example of comparing things. And I'm going to go ahead and whip up a relatively short program that you would hope would work right out of the box. So I'm going to go ahead and include the familiar cs50.h. I'm going to go include stdio.h. I'm going to go ahead and do int main void. I'm going to go ahead and in here-- let me get a variable called i using get_int from the user, and just prompt them for i. Let me go ahead then and prompt the user for another int with get_int. We'll call it j and get that from them. And then let's just compare these things. So if i equals-equals j, then go ahead and print out with printf same and a new line. Else, go ahead and print out the opposite, which is different. So the only place I think I could have screwed up, perhaps, is if I did this, a single equal sign, which is kind of reasonable if you come in knowing what an equal sign is. But again, in code, we typically need two equal signs because that compares two values. So I didn't make that mistake, I'm feeling pretty good about this. Let me save it with Command-S or Control-S or via File, Save; go to my prompt and run make compare0. Good, everything compiled. And let me go ahead and run ./compare0, Enter, and I'll type in 50, and I'll type in 50, and they do seem to be the same. Let me go ahead and do that again, let's type in 42 and 13, and they are different. And I should probably test a few more, maybe some negative values, maybe some 0's, positive values and the like, but I'm feeling pretty good about the correctness of this code. All right. So let's change this program a bit. Let me go ahead and create another file, which I can do with the little green plus or via File, New File. I'm going to go ahead and save this one as compare1.c. And for the moment I'm going to go ahead and just paste in that code from before, but I'm going to make some changes now. 
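The heart of compare0.c is just the comparison and the two printf calls. Here's a sketch of that logic pulled out into functions so it can be tried with any two values; the function names are made up for illustration, and the get_int prompts are left out so that this compiles without the CS50 library.

```c
#include <stdbool.h>
#include <stdio.h>

// Returns true when the two ints hold the same value.
// Note the two equal signs: a single = would assign j to i, not compare them.
bool same_ints(int i, int j)
{
    return i == j;
}

// Prints "same" or "different", mirroring the program described above.
void print_comparison(int i, int j)
{
    if (same_ints(i, j))
    {
        printf("same\n");
    }
    else
    {
        printf("different\n");
    }
}
```

Calling print_comparison(50, 50) from main prints same, and print_comparison(42, 13) prints different, matching the runs above. Negative values and 0's are easy to add as further checks.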
I'm going to go ahead and rename and retype my data types as strings. So give me a string called s, and we'll prompt the user for that using get_string, then I'm going to go ahead and change this one to string t, and I'm going to go ahead and call get_string. I, of course, need to now compare s and t, not i and j. And s is a common variable name for a string. t just comes after s, so that's pretty reasonable too, but I should of course update that as well. And so I think everything's now the same logically. I just changed my data types and my variable names. So I've saved this. Let me go ahead and run make compare1. Good, everything compiled. Let me go ahead and do ./compare1. Let me go ahead and type in Brian and Veronica. And of course, those are different. Now let me go ahead and type in David, let me type in David again, and those of course are different? Huh. Maybe it's because I just hit the Spacebar or something. So let's try Erin. Her name's a little shorter. Hmm. OK, let's try-- oh, what's her name? TJ. OK, even shorter, perfect. TJ, can't go wrong. Different. I mean, what is going on? Let's just say i, i. Different? So where's the logical bug in this program? What is it that's going on? Yeah, what do you think? AUDIENCE: Is it comparing integer values? DAVID MALAN: Is it comparing integer values? Well maybe. I mean, thus far when we've used equal-equals we've probably used it mostly for comparing integers, so maybe I'm just misusing it, sure. Other thoughts? AUDIENCE: [INAUDIBLE] DAVID MALAN: Oh, that's a big word that we'll get to in just a little bit. But correct, correct-- but for very similar reasons. So something's going on logically involving comparison, because I'm using equal-equals, but maybe I'm using it for the wrong data types? I mean, it's clearly broken for strings. So why might that actually be? Well it turns out that strings don't actually exist. 
So a string, which we know is just a sequence of characters or an array of characters, is not an actual data type. int is, float is, double is, long is, bool is, and even more are actual data types. String is kind of a little white lie we've been telling for a few weeks that's implemented only in the CS50 library. Now the word string is super common in programming. Like every programmer out there will know what you mean when you say string. That is not a CS50 word, but our use of it in C is CS50-specific. Because in that file called cs50.h, in addition to declaring functions like get_string and get_int and get_float and a bunch of other things, we also have a special line that says, create a data type called string. But what does it actually do or what does it actually mean? Well let's go ahead and consider what might be going on underneath the hood here. So if I go ahead and draw the program that we just ran, that program compare1 gets a string s from the user, then gets a string t from the user, and then compares them. So we know from last week what a string is, it's just an array. So when I run that first line of code and get a string from the user-- for instance, Brian, I'm going to go ahead and see a B-R-I-A-N, which we know from last week to actually be an array of memory that might look pictorially like this-- and this, too, is a bit of a white lie, there's something else. AUDIENCE: The null. DAVID MALAN: Yeah, the null character, so to speak, N-U-L, which we typically just write with a backslash 0, which is just all 0 bits. And it turns out, you might recall from the debugger earlier, you saw this-- that's the even more cryptic way of expressing the null character, backslash 0. Just different programs display it in different ways. So when I call get_string and type in Brian, this is what's allocated in memory. And when I type Veronica, I can see a V-E-R-O-N-I-C-A. I'm going to get that right preemptively. Backslash 0. 
That, too, is a chunk of memory, which I'll draw like this. 1, 2, and split these up into individual characters or bytes. And recall from last time that these bytes just come from my memory, and that memory just has a bunch of bytes in it, maybe millions or even billions these days. And so honestly, if you just have that many things, any human or computer can certainly number them. Like this is byte 1, 2, 3, 4. So let's just assume for the sake of discussion that out of context of my computer's hardware, Brian just ended up at location 100, and location 101, and 102, 103, 104, 105. So this is the 100th byte in my computer, this is the 105th byte in my computer, and Brian is using that many characters in total. Veronica, she ended up somewhere else. Maybe she ended up farther away, just because, at location 900, 901, 902, 903, 904, 905, 906-- a lot more memory, 907, and 908-- but you can see even more visually now that the length of Brian's name-- strlen of Brian is what? AUDIENCE: [INAUDIBLE] DAVID MALAN: I hear five and I hear six. The length of Brian's name-- Brian, how long is your name? AUDIENCE: Five. DAVID MALAN: OK, it is definitively five characters, that is the length of Brian's name, but you have to appreciate that in the computer, Brian's five-character name does indeed take up six bytes. So both answers are kind of correct, but the length of the string henceforth is always the number of actual characters. The amount of space it takes up is that plus 1 for the null character. So you can actually see why Brian's name takes up six bytes in this picture rather than just the actual length, which is five. So when you call get_string now, and when you call get_string and get another string-- Brian and Veronica respectively, what is actually being handed back? A couple weeks ago, Erin came up and she kind of like handed me back a string, a student's name from the audience. On that piece of paper, we thought, was the student's name. But it's not. 
It turns out that when a function returns a value, it can pretty much only return 1 byte or maybe 2 or 4 bytes. It can't return an arbitrary number of bytes, like six for Brian or 1, 2, 3, 4, 5, 6, 7, 8, 9-- it cannot return 9 bytes for Veronica. And if you even type a whole paragraph or page of text, it can't return all of that text, it can only return a single value. So to your instinct earlier, what might actually be getting returned by get_string when the human has typed in a name like Brian or Veronica? AUDIENCE: [INAUDIBLE] DAVID MALAN: The memory location. Indeed, an integer, or as you called it, a pointer, which we'll introduce more formally in just a moment. So when get_string returns "Brian," quote-unquote, it's actually not returning B-R-I-A-N backslash 0, it is just returning 100. And when get_string returns Veronica, it's not returning her name, it's returning 900. And so if you realize that now, when you ask, does s equal-equal t, what question more mundanely are you actually asking? Yeah. Memory location and memory location-- does 100 equal 900? And obviously not. And so that is why Brian's name, Veronica's name, my name, TJ's name-- every word I typed in was of course different, because each input was ending up at a different location in memory. And even if I typed the same word like David twice, one David was going here, one David was going somewhere else, they were ending up at different memory locations. Maybe 100, maybe 900, maybe something else, but they were ending up in different locations in memory. So equal-equals does compare values, but dammit if it isn't comparing the wrong values. Yeah? AUDIENCE: Well what if you use some char*s? DAVID MALAN: Ah, so we'll come back to that. Let me come back to that in just a moment. char* is actually intricately related. More on that in a moment. Yeah? AUDIENCE: If you add two integers in memory-- DAVID MALAN: Uh huh? AUDIENCE: Wouldn't they be in different places in memory? 
So you would return-- so you need a different value. DAVID MALAN: OK, really good question. So wait a minute, this same logic that I'm returning the address of something surely applies to integers as well or floating point values as well? Because if I type in the number 50 like I did earlier, that, too, is somewhere in memory-- like a box in memory, and that, too, has an address somewhere in memory, but it turns out, for reasons that you just alluded to, actually, ints are returned as their values. Chars are returned as their values. Bools are returned as their values. Floats are returned as their values. Strings are different. Strings are returned by their address. And those addresses, it turns out, are ultimately going to be called char*'s, which we'll see in just a moment. So how do we go about then fixing this fundamentally? Like even if you have no idea how to code this yet, just intuitively, if I do actually want to delete-- if I do actually want to compare-- sorry. OK. If I do want to go ahead and compare Brian and Veronica for equality, what do I want to do intuitively? I can't just compare their addresses. What do I need to do? Isolate the characters and then do what with them? AUDIENCE: [INAUDIBLE] DAVID MALAN: Good. Yeah, good instincts. Use a for loop, use a while loop-- any kind of looping structure. And intuitively, compare the first characters, and if they're different, well then we know we don't have to go any further. B is not a V, so surely these names are different. But what about in my case? If it was David and David, you would compare the first two. D and D are the same. Compare the second two, A and A are the same. V and V, I and I, D and D, and then what am I going to hit last? Null character. And should I keep going beyond the null character? No. So this is the beauty of that super simple design for a string. 
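To make the address-versus-contents point concrete, here is a minimal sketch in plain C. It does not use the CS50 library; the function names same_address and show_addresses are hypothetical, and the two arrays simply stand in for two separately typed inputs.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: == on two char * values asks only
// "are these the same address?", never "are these the same letters?"
bool same_address(const char *s, const char *t)
{
    return s == t;
}

// Hypothetical demo: two arrays with identical contents
// still live at two different locations in memory
void show_addresses(void)
{
    char s[] = "David";
    char t[] = "David";

    // %p prints an address; the exact numbers vary from run to run
    printf("s is at %p\n", (void *) s);
    printf("t is at %p\n", (void *) t);

    printf("%s\n", same_address(s, t) ? "same" : "different");
}
```

Calling show_addresses from main should print two distinct addresses followed by different, mirroring what compare1 did with David and David.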
Insofar as strings are identified by their starting address, just the byte at which they start, you still need to know how long they are, because otherwise how do you know where one word ends and another word begins? And so the simple decision we made last week-- as did humans decades ago-- to terminate all strings with backslash 0 or all 0's is a super handy trick, so that if I tell you that Brian starts at 100, you can infer that he ends where? At byte number 105 or 104, if you will, however you want to think about it, because all you need to do in linear time, if you will, left to right, is check-- backslash 0, backslash 0-- ah! Backslash 0, now I know how long Brian's name is. So let's consider for a moment this function called string length. How does strlen actually work? When you pass to strlen a variable containing a string, like Brian, what is strlen probably doing? AUDIENCE: [INAUDIBLE] DAVID MALAN: Exactly. It's finding that null character's address, subtracting the start address from it, figuring out what the difference is, and returning that difference as the total count of characters, not counting the backslash 0 itself. And more mechanically, we'll see in a moment, it's probably doing exactly the same thing I did, which is, is this backslash 0? Is this backslash 0? Is this, is this, is this? I asked that question five times before I saw backslash 0. strlen is just a function some human wrote years ago that probably just has a simple for loop and an if condition, and then that's it. Because that person understood before we even did how strings are actually implemented. Any questions then? All right, so let's actually implement this. Let me go ahead into my editor here, and make one other example here that I'm going to call compare2. I'm going to go ahead and do include cs50.h and include stdio.h, and then I'm going to do int main void, and I'm going to quickly now grab my code from before where I got strings and I compared them, but I have to obviously fix that comparison. 
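The "is this backslash 0?" walk described above can be sketched as a short function. This is only an illustration of the idea, not the actual library source; my_strlen is a hypothetical name chosen so it does not clash with the real strlen in string.h.

```c
#include <stddef.h>

// Hypothetical re-implementation of the idea behind strlen:
// start at the first byte and count until the terminating '\0'
size_t my_strlen(const char *s)
{
    size_t n = 0;

    // Ask "is this backslash 0?" once per byte, left to right
    while (s[n] != '\0')
    {
        n++;
    }

    // n counts the actual characters; the '\0' itself is not included
    return n;
}
```

For Brian this loop stops after five characters, while the string itself occupies six bytes once the backslash 0 is counted.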
So here's my code from before. I'm going to do this the right way. I'm going to call a function called compare_strings passing in s and t. Because as you proposed, we need to do some logic. We don't have to pass it to a function, but we could. We could just do a for loop here, but I'm going to go ahead and implement compare_strings as follows. If I want to write a function that returns a yes/no answer, what data type should it return? A bool. So we've not necessarily done this yet, but you can return a bool just like you can int or a char or something else. I'm going to call this function compare_strings. It's going to take in one string called a and another string called b, but I could call those anything I want. And now what's the easiest thing to check? If I pass two strings, a and b, or Brian and Veronica, what's the easiest question you can ask and just immediately say, nope, these are different? String length, right? Like if the B-R-I-A-N is not of the same length as Veronica's name, we don't need to do any logic whatsoever beyond that, we can just quit and say false. So let me just do that. If the strlen of a does not equal the strlen of b, you know what? Let's just go ahead and return false and get out of here. OK, but now, if we get past that gateway, so to speak, that check, that question, that Boolean expression, now I have to compare things character by character by character. So I can do this in a bunch of ways, but I like the suggestion of a for loop. So for int i at 0, n for efficiency-- actually, let's do i is less than the string length-- should I do the string length of a or b? And it doesn't matter, right? So let's go with a. And frankly, had I been smart early on, I could have stored the value in a variable and then reused it, but we'll just keep going ahead for now. Then i plus-plus, but I remember from last time-- this is correct, but this is not good design. Why? 
Yeah, I keep calling strlen again and again, because remember, in a for loop, this condition is checked again and again and again-- you're just wasting your own time. So let me go ahead and actually do this. n or any variable equals the strlen of a, then just compare i against n, because now i is getting incremented, but n is never changing. So now let me go ahead and implement this for loop. So if-- how about the i-th character of a does not equal the i-th character of b, I can immediately conclude-- nope, these strings can't be the same, because some letter, like a B, is not the same as another, like a V, or whatever letter we're actually comparing. And then I think that's it. If I get through these gauntlets of questions-- are yours lengths different? Are your characters different? And I still haven't said false, what should I return by default? Yeah. Like if you make it through all of those questions and all is well, then D-A-V-I-D must indeed equal D-A-V-I-D or whatever the user actually typed in. Now I'm not quite done yet. When I've implemented a function or a helper function like this, because it's helping me do my work, what else do I have to add to the file? Oh? AUDIENCE: I've got a logical question. DAVID MALAN: Sure. AUDIENCE: In a computer, couldn't you just type in David with a capital D and then david with a lowercase d, you're going to run [INAUDIBLE],, they're not going to sync because your first character's not the same character. DAVID MALAN: Correct. So this is a feature, not a bug at the moment. My program at the moment is case-sensitive. If I type in DAVID and all caps, that is a different string I claim for now than david in all lowercase. If you want to tolerate uppercase and lowercase, you're going have to add more logic. But for now that's a design decision that I intend. All right. What else do I need to add to the program? Yeah, the prototype at top. 
You can literally copy and paste-- this is the only time copy and paste is probably a legitimate thing to do-- at the top, and then semi-colon-- don't re-implement it. But I do need one other header file. I'm using a function that's not in cs50.h or in stdio.h. String length? Where was string length? Yeah, string.h. So I just need this, include string.h, save. Now this I think is correct. We'll see if I eat the word in a moment. But realize that if you're writing this code yourself, like this is not a natural thing to be writing a program in office hours or at home in your dorm and just getting it right the first time. This is after like 20 years of doing this, so realize we happen to be-- and I also have a cheat sheet right here-- we happen to be doing this correctly often, but realize that's not going to be the common case. So with that reassurance in mind, let's see if I have to now take all that back. make compare2. OK-- phew. 20 years worked out. So now I'm going to go ahead and ./compare2. Let's type in Brian, let's type in Veronica. Those are indeed still different hopefully. Now let's try myself, David and David. Phew! Those are the same. And to your point, David in capitalized and David in all lowercase, different, but that's what I expect now. Any questions on compare2? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: OK. AUDIENCE: [INAUDIBLE] string in the program and in general. DAVID MALAN: OK. AUDIENCE: Would that still work [INAUDIBLE] DAVID MALAN: If you were to hard code the strings? Short answer, yes, that would still work. If you for whatever reason did not do this and using get_string, but you did David, and here, for instance, David, that would work too. And whatever your error is, if you can recreate it, just let us know. AUDIENCE: It seems to be like a string that would be increased for a set that was [INAUDIBLE] only? And it was having issues in the little [INAUDIBLE].. DAVID MALAN: I'd have to see it to be sure, but happy to chat after. 
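Pulling the pieces of compare2 together, the helper might look roughly like the sketch below. It is an approximation rather than the lecture's exact file: it uses plain C's const char * where the lecture's version says string, so that it compiles without cs50.h, and it counts with size_t rather than int.

```c
#include <stdbool.h>
#include <string.h>

// Prototype, as it would appear at the top of the file
bool compare_strings(const char *a, const char *b);

// Returns true only if a and b contain exactly the same characters
bool compare_strings(const char *a, const char *b)
{
    // Compute the length once instead of on every loop iteration
    size_t n = strlen(a);

    // Different lengths: no need to look at any characters
    if (n != strlen(b))
    {
        return false;
    }

    // Same length: compare character by character
    for (size_t i = 0; i < n; i++)
    {
        if (a[i] != b[i])
        {
            return false;
        }
    }

    // Made it through every check, so the strings match
    return true;
}
```

As in the lecture, this comparison is case-sensitive, so David and david count as different.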
All right, so let's see if we can't now clean this up just a little bit as follows. Let me go ahead here and reveal what it is that's actually going on. So indeed, there is no such thing as a string. And indeed, as you pointed out a moment ago, it actually goes by a different name. String is just a synonym for what's called a char*. Now what does that even mean? So char is the same as it's always been. It's a single character. Star in a program written in C could of course mean multiplication, we have seen that. This is another use of the star. Whenever you see it after a data type like char, this means that the data type in question is not just a char, it's the address of a char. So the star just means the address of whatever the data type is to the left, and this is, as you pointed out earlier, what we're going to start calling a pointer. A pointer is, for all intents and purposes, an address. It's just a buzzword to describe an address. This data type here, char*, means I want a variable that doesn't store a char, it stores the address of a char. The number 100, the number 900. But that address is just going to be called a pointer. A pointer variable is a variable that stores the address of something. A char or even other data types as well. So with that in mind, let me actually quickly create compare3.c, paste this in, and save it as compare3.c, and let me take off, if you will, those training wheels. It turns out that when you get a string with get_string, it doesn't return a string, per se, because again, that word doesn't exist in C, it actually returns a char*. And when I call it again here and return another string, it, too, returns a char*. Now technically the star can have spaces around it. Some people write it like this, but the sort of right way to do it or the default way should just be to put the star next to the variable name for clarity. So I have to make a few other changes. This should change too, because there is no more string as of today. 
I'm going to change this to a char*; and then I also need to change it here, char*; and then here, char*; and that is actually it. And honestly, the only reason we didn't introduce this like two weeks ago is because it just looks cryptic. Like no one wants to program the first time they're ever touching a keyboard and writing code and see char* and need to worry about what that means, it's just a string conceptually. But the only change I technically need to make to take those training wheels off is just change all mentions of string as data types to char*. And that just means that you know what-- a? Yes it's a string, but more technically it's the address of a string. Or more precisely, it is the address of the first byte of the string, like 100 for Brian or 900 for Veronica, and I'm not even going to tell you where the string ends because you, the programmer, can figure that out by calling strlen or just by using a loop and figuring out where that backslash 0 actually is. So that is enough information to pass it around. So if go ahead now and compile this, make compare3, and then I go ahead and do ./compare3, let's go ahead and type in Brian and Veronica, those are indeed still different. Now let me go ahead and type in David and David, those are in fact the same. So the training wheels are off, there is no such thing as string, henceforth it's a char*. Let's go ahead and take a quick break here for five minutes, and we'll come back and dive in more. All right. So we are back, and let's go ahead and simplify this now, as our tendency has been. It's kind of a bunch of code, but I think we can make this a little tighter. But rather than type this one out manually, let me go ahead and just open one of our pre-made examples from today, which is all in the course's website, called compare4. And you'll see in compare4, that's it. I only have a main function this time. I've gotten rid of my compare_strings function because you know what? 
I seem to be using something instead. What function did I apparently deploy? Yeah, S-T-R-C-M-P, or as some would pronounce it, just str compare or strcmp. So this, like strlen, also succinctly named, is just a function that's actually declared in one of our familiar libraries up top, string.h, and it turns out if you look in the man page, so to speak, by typing man strcmp, or if you go to CS50 reference and actually look at the less comfortable description of the function there, this is just a function whose sole purpose in life is to compare strings for you. But it's a little different in behavior because it's a little fancier than the one I just wrote. Let me zoom in on this, and you'll see that line 14 here, I'm not quite treating it in the same way. My logic is ever so slightly different. What am I actually checking for in my Boolean expression this time? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, which is a little weird. I'm checking explicitly-- if strcmp's return value is equal-equal to 0. Before I just said, if compare_strings s comma t, because I was expecting back a bool-- true or false. strcmp, kind of weird, acts the opposite way. It turns out that strcmp doesn't return true and false. If you read its documentation, it returns 0 if the strings are equal, but super conveniently, it returns a negative value if s is supposed to come before t, and it returns a positive value if s is supposed to come after t alphabetically. So it turns out that you can use strcmp not just to compare for equality, but inequality-- less than or equal-- less than or greater than, so to speak, alphabetically, or in ASCII order, so to speak. It will actually compare character by character the ASCII values, and that will make sure that B comes after A, and C comes after B, and so forth. So you can actually use strcmp to like sort a dictionary, or to sort the contacts in your iPhone or your Android phone. 
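A small sketch of strcmp's three-way result may help here. The C standard only promises a negative value, zero, or a positive value, not specific numbers, so the hypothetical helper order below collapses the result to -1, 0, or 1 to make it easy to inspect.

```c
#include <string.h>

// Hypothetical helper: reduce strcmp's result to -1, 0, or 1.
// strcmp itself guarantees only the sign, not the magnitude.
int order(const char *s, const char *t)
{
    int result = strcmp(s, t);

    if (result < 0)
    {
        return -1; // s sorts before t in ASCII order
    }
    if (result > 0)
    {
        return 1; // s sorts after t in ASCII order
    }
    return 0; // same characters throughout
}
```

Because the comparison is by ASCII value, every uppercase letter sorts before every lowercase one, so order("David", "david") is -1 rather than 0.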
So long story short, this is a function we can use, we don't have to reinvent this wheel, and thus, we have no more code even after this. We just have to use it correctly, and there, the documentation is your friend. So if I run this program it's going to work exactly the same way, but let me go ahead and point out some flaws. It turns out all this time, I've been a little lazy with my error checking-- checking for errors. There's a whole bunch of things that can go wrong in week 1 of CS50 that we just kind of turn a blind eye to, because it would just bloat our code, make it longer and sort of less interesting and fun to write and less comprehensible. But today, now that we know what's actually going on, we can begin to ask some additional questions and make our code stronger, more robust so that nothing does, in fact, go wrong. Turns out, if you read the documentation for get_string in the man page or in CS50 reference, turns out get_string does return a string-- uh, not really. It returns the address of a string. Uh, not really. It returns the address of the first byte of a string, technically. But if something goes wrong, it returns a special character called null. Not to be confused with NUL, it returns a special address called null-- left hand wasn't talking to right hand decades ago. So null, N-U-L-L, just means the address 0, which nothing should ever live at. It's just a bogus, invalid address. Insofar as get_string returns the address of a string in memory, like 100 for Brian or 900 for Veronica, if get_string ever runs into a problem and just something goes wrong with the computer, if it ever returns 0, specifically 0, a.k.a. null-- N-U-L-L, then you can detect that something has gone wrong. So to do that, and it's going to get a little tedious, but it's nonetheless the right thing to do, I need to be a little more defensive. 
If s equals-equals null, otherwise known as 0, otherwise known as 0x0, but I'll write it conventionally like this, I'm going to go ahead and return 1 as my exit code. If t equals-equals null, I'm going to go ahead and return 1 as my exit code, or I could return 2 or 3-- I just need to return some value to signal to the computer that something went wrong, but by default we'll just return 1 whenever something goes wrong, but if all went well, I'm going to go ahead and return 0. So recall again from last week, and we didn't spend a huge amount of time on this-- main itself can return values. By default, ever since week 1, if you don't return anything, main is automatically and secretly returning 0 for you because 0 is good. The reason for 0 is because there's only one 0 in the world, obviously, but there is an infinite number to the left and there's an infinite number to the right, negative and positive. That's great, because as you've already experienced in the past few weeks, it feels like there's an infinite number of things that can go wrong when you're writing even the shortest of programs. So that means we have a lot of numbers we can assign to error codes, so to speak. Now I don't really care what the error codes are, so I'm just going to adopt the human convention at the moment-- if anything goes wrong, return anything other than 0. And so I'm going to return 1 up here, but if nothing goes wrong, return 0. The point here is that by adding these three lines here and these three lines here, I'm going to avoid what's called a segmentation fault or segfault. Did any of you encounter this cryptic error? OK. So a decent number of you, and you probably had no idea what that means, but starting today you will a bit more, and in the weeks to come, you'll understand even more. Segmentation fault means you touched memory you should not have. Or something went wrong and you did not detect it. It's kind of a catch-all phrase for memory-related problems. 
This helps ward off those kinds of errors. It's not the only way, but it's one such way. So starting today with problems set programs and anything you write in the course, you always want to be thinking about, even if you go back and add it later, could this go wrong? Could this go wrong? Could this go wrong? And just add some additional ifs and else-ifs and handle those situations so that your program doesn't just crash on you or segfault or surprise someone who's actually using it. All right, let's take a look at one final example, because frankly this is a little tedious. I'm going to go ahead and open up-- and this file can be found in compare5.c. Let me go ahead and save this so that we have it-- compare5.c. I'm going to make one final comparison example. I'm going to save this as compare6.c. Turns out that humans like their succinctness. And null, because it is technically the 0 address, you can actually be a little clever. If not s and if not t is a sufficient way to express those same things. Because what does the bang do? The exclamation point in code if you recall? It inverts something. So like if this is saying, if s is not 0, a.k.a., if s not null, or rather-- if-- now I'm getting confused. Yes. If I had just said, if s, then it's a valid address and I should go on with my business. But if it's not s or if s is null, I want to go ahead and return 1 because there's an error, and down here too. So any time you're checking whether something equals null, you can make it more succinct by just saying if not s; if it's null, return 1. If it's null, return 1. It's just syntactic shorthand. Phew! I had to think about that one. Any questions? AUDIENCE: Why does [INAUDIBLE] will store some [INAUDIBLE] DAVID MALAN: Correct. You are storing an address, but if that address is 0. Saying if it's not 0, 0 is like false, so not false means true, and so it has the effect of inverting the logic. That's all. 
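The two spellings of the null check discussed above can be put side by side. This is a sketch under assumptions: check_inputs is a hypothetical helper standing in for the checks the lecture writes directly in main, and it takes the two pointers as parameters so it does not depend on get_string.

```c
#include <stddef.h>

// Hypothetical helper mirroring main's exit codes:
// 1 if either pointer is NULL, 0 if both are usable
int check_inputs(const char *s, const char *t)
{
    // Explicit form: compare the address against NULL
    if (s == NULL)
    {
        return 1;
    }

    // Shorthand form: NULL counts as false, so !t is true
    // exactly when t is NULL
    if (!t)
    {
        return 1;
    }

    // Both addresses are valid, so all is well
    return 0;
}
```

Both conditions compile to the same test; which one to write is a matter of style.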
Anytime you use a bang or exclamation point, it changes a 0 to non-0-- AUDIENCE: [INAUDIBLE], but even-- I don't understand why [INAUDIBLE] implies that it's [INAUDIBLE].. DAVID MALAN: So you can think about it this way. If s-- previously we had this. If s equals-equals null is like saying if s literally equals 0. And you can kind of think of that informally as if s doesn't have a valid pointer-- 0 is not a valid point or it's not a valid address by definition. 100 is valid, 900 is valid, 0 is not valid just by a human convention. So this is like saying, if s does not have a value, that's valid. So the way to succinctly say that, if not s, and it's just shorthand for that is another way to think about it. All right, so let's take a look at a very different program, but that reveals the same kind of issue as follows. I'm going to go ahead and open up an example called copy0, whose purpose in life hopefully is to copy a string. So notice that in my program here, which I wrote in advance, I'm getting a string from the user on line 11, and I'm storing it in a string called s. I could change this to char* now, but we know what it is. And I'm going to go ahead and copy the string's address from s into t. And then I'm going to say, if the length of t is greater than 0, then go ahead and just capitalize the first character. So it's a little cryptic, but you might have done something kind of like this with Caesar and with recent string manipulation. This is just making sure, do I have at least one character? And if so, first character is t bracket 0, as you recall. toupper is a function in ctype.h from last week that just capitalizes this letter. So this one line of code, 19, just capitalizes the first letter in t, that's it. And then at the very end we just print out what s is and print out what t is. That's all. So this program just copies s into t, capitalizes t, and that's it. So let me go ahead and make copy0. This is in our code from today. 
So I'm going to do cd src3, because I already wrote it in that directory. make copy0. Went well. ./copy0. Let's go ahead and type in tj again in lowercase. Enter. Huh. TJ, TJ-- both are capitalized. All right, maybe it's just a weird thing with initials. So let's just do Veronica, all lowercase. Huh, that's definitely capital. Let's do even more obvious difference, Brian where the B's really going to look different. Yet I'm only capitalizing t. Well let's consider what's actually going on here. In this case, when I'm getting a string from the user, s and t, and I type in, for instance, brian in all lowercase, backslash 0, this, of course, is just an array underneath the hood. This is taking up six bytes here. And when I store in s, s is a string. So you know what? We didn't do this before. Let me actually create a variable, a chunk of memory for s and call it s. And suppose Brian is just where he was before-- 100, 101, 102, 103, 104, and 105. So if I do s equals get_string and get_string returns Brian, what do I write in the box called s? Yeah, just 100, right? This is all that's been going on all this time even though we didn't talk about it at this level. And actually, it turns out-- pointer actually can be used pictorially. If you actually prefer to think about a pointer as being an address or like kind of a map that leads you somewhere, another way a human would typically draw a pointer-- because honestly, who really cares that Brian is at address 100? Like that is way too low level, that's week 0 stuff. He's just pointing there. So s is a pointer to that chunk of memory. It happens to be 100, whatever, the arrow is how you would literally point at the chunk of memory if you were drawing this on some notes. So that, too, is correct. So the problem arises here with that line of code. When I actually try to copy s and store in t, think about what's going on. The right-hand side is just s's value, which happens to be 100. 
The left-hand side is just saying, hey computer, give me another variable of type string, and call it t. So that's like saying, hey, computer, give me another chunk of memory, call it t, and then store s in it. But what does it mean to store s? Well what is s's value at this point in time? It's the pointer to Brian, or it's technically-- I'll write both just for thoroughness-- it's literally the number 100. So if you do t equals s, that is like saying put 100 there too, and pictorially that's like saying this. So at this point in the story, when I copy s into t, the computer took me literally. It did copy s into t, but what is s? It's just the address. It is not B-R-I-A-N backslash 0, it's just the address. So when I then say, t bracket 0 gets toupper-- so let's look at this line of code. The one line of code here that's highlighted, when I say go to the 0th character of t and store the uppercase version of that same character, you just follow the arrows. If you ever played Chutes and Ladders as a kid, you just kind of follow the arrow, see where you end up. t bracket 0 is this location here, because again, if this is a chunk of memory, per last week it's an array, so you can also think of this as being bracket 0, this is bracket 1, this is bracket 2, and so forth. So it's just an array. So t bracket 0 is lowercase b, and toupper of lowercase b, of course, changes this little b to a B. But now both s and t are still pointing at the same chunk of memory, so of course s and t are both going to be Brian capitalized, or TJ too in my first example. Any questions then on what we just did and why that happens? All right, so intuitively what's the fix? Doesn't matter if you've no idea how to code it, like what do we have to do to fundamentally copy a string, not an address? AUDIENCE: [INAUDIBLE] DAVID MALAN: Create a new what? AUDIENCE: Basically create the [INAUDIBLE]. DAVID MALAN: Yeah. Create the same string in a new chunk of memory. 
What I really need to do is allocate or give myself a bunch of more memory that's just as big as Brian, including his backslash 0. And then logically I just need to copy every character into that. So if I go back to my original when it was a lowercase b, I need to make a copy logically by using a for loop or a while loop or whatever you prefer-- B-R-I-A-N backslash 0, so that when I copy the string and then store it in t, It's not actually copying literally s. And let's suppose that he ends up at location 300 just arbitrarily-- just making up easy numbers. t now stores 300, points here. So when I execute this line in this version of the story, t bracket 0 gets toupper, what am I actually doing? I'm following a different arrow this time because I gave myself a different chunk of memory, capitalizing this Brian, thereby hopefully fixing the bug, albeit verbally only. So how do we do this in code? We need to do exactly that. We need to give ourself some more memory, so let's introduce one other feature of C. In copy1.c, we see the solution to this problem. Notice at the top I'm doing things a little lower level-- oop, surprise. Notice in this version of the code, copy1.c, see I've started off almost the same, but just to be super clear, I'm just using char*. I don't want any magic, so there's no string, there's no training wheels here. But this logically is the exact same as before-- plus the error-checking. This line is new. And it looks a little funky, but let's see what's going on. And this line of code here, what am I doing? The left-hand side, that's shorter, let's start with the easier one. Char* t, just in layman's terms, what does that expression do? char*? Hey computer, do what? What's that? AUDIENCE: [INAUDIBLE] DAVID MALAN: Not quite yet. Different formulation. Hey computer, give me-- not quite. Be more precise? AUDIENCE: An array? DAVID MALAN: Not quite an array, just this part. So let me hide all this. 
If the star wasn't there-- I can't really do this very well. So this-- yeah? AUDIENCE: [INAUDIBLE] character? DAVID MALAN: Good, I'll take that. So hey computer, give me a pointer to a character. Or even more low level, hey computer, give me a chunk of memory in which I can store the address of a character. I mean, it is that mundane. Draw a box on the screen, call it s-- or rather, call it t, but just give me space for a pointer, as you said. So that's all that's doing. It's drawing a box on the screen and calling it t, and it's currently empty. Now let's look at the scarier part on the right-hand side. malloc, new function today. Stands for memory allocates. It's very cryptic-sounding, but it just means give me a chunk of memory. It says exactly what you said in functional terms. Then it just needs you to answer one question-- OK, how much memory do you want? How many bytes do you want? And now maybe the math, even though cryptic at first glance, makes sense. Get the string length of s, add 1, and then multiply it by the size of a character. And we've not seen this before. sizeof literally does that. It tells you how many bytes is a char. Happens to be 1, and in fact, that's defined. So if we simplify this in C, the char is always 1 byte, so this is equivalent to just multiplying by 1. And obviously mathematically that's a waste of time, so we can whittle this down to be even simpler. I was just being thorough. So now, hey computer, allocate me this many bytes of memory. Why is it plus 1? AUDIENCE: You need the null character. DAVID MALAN: I need that null character. Brian is 1, 2, 3, 4, 5 as he said, but I need the sixth for his null character, and I just know that's going to be there. So at this point in the story, what has happened? All that malloc does is it gives me this box of memory containing room for as many bytes are in Brian's name. But it doesn't fill them just yet. Now I need to logically fill those bytes with Brian's actual name. 
So if we scroll down to my for loop here, we can actually copy the string into that space. And it's a little long, the expression, but nothing new here. Initialize i to 0, n to the length of s, i is less than or equal to n-- we'll come back to that, i++. So it's just a pretty standard for loop. Then copy the i-th character of s into the i-th character of t. The only thing that's making me a little nervous honestly is this thing here. Like I feel like every time we do less than or equal to, we create a bug like last week. But this is correct, why? Why do I want to go up to and through the length of this? AUDIENCE: Is it the null character that adds-- DAVID MALAN: Exactly. Because of the null character. I actually don't want to stop at the strlen of s, so I could change this. If you're just more comfortable using less than, because you just got your mind wrapped around why we do that in the first place, that's fine, we just need to do this instead. So this is mathematically-- if you go to strlen plus 1, the same thing as not doing that math but just going one step further. However you want to think about it is fine. OK, and then lastly, just a quick check, is the length of t at least one or more characters? Because otherwise there's nothing to capitalize, and if so, go ahead and do it. So if I now run this example, make-- oop, let me save it. make copy1, that compiled. ./copy1, now let's type in tj, tj in lowercase comes back, but now t is capitalized. And let's go ahead and do Brian's name in all lowercase, only one of them is now capitalized. So does that make sense what's now happened? All right. So where can we go with this? Well it turns out-- let me open up one final example here, because honestly, that's incredibly tedious, and no one's ever going to want to copy strings if you have to go through all of that work. Turns out that strcpy exists. So when in doubt, check the man page. 
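The allocate-then-copy steps walked through for copy1 can be sketched as one small function. This is a sketch under assumptions, not the lecture's file: the name copy_string and the choice to return NULL when malloc fails are illustrative.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a freshly allocated copy of s,
// or NULL if malloc cannot provide the memory.
char *copy_string(const char *s)
{
    // Room for every character plus the terminating '\0'
    char *t = malloc((strlen(s) + 1) * sizeof(char));
    if (t == NULL)
    {
        return NULL;
    }

    // i <= n on purpose, so the '\0' sitting at s[n] is copied too
    for (int i = 0, n = strlen(s); i <= n; i++)
    {
        t[i] = s[i];
    }
    return t;
}
```

Because the copy lives in its own chunk of memory, changing t[0] afterward leaves the original string untouched.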
When in doubt, check CS50 reference. Does the function exist somewhere related to some keywords you have in mind? Like string copy, see if something comes back. And indeed, we've had strlen, we've had strcmp, we now have strcpy, and if you read the documentation, this is deliberately reversed like this. The destination is this variable, the source or the origin string is this one, and it copies from one end to the other, and then I don't need that for loop. It just saves me a few lines of code. All right. So let's take off one other detail here. Oh, and you'll notice, actually, let me make one fix, one fix here. It turns out that what I'm doing here is a little lazy. It turns out that malloc does have an opposite. So anytime you allocate memory, technically you should also be freeing that memory. And so C allows you to ask the computer for as much memory as you want, but if you never give it back, have you ever experienced on your own Mac or PC, like after your computer's been running a while or using some new or bloated program like a browser, it gets slower and slower and slower? And in the worse case it just freezes or hangs or something? It is quite possible that that program simply-- was made by humans, of course-- just has a memory leak. So some human wrote one or more lines of code that uses malloc or some equivalent in another language that just kept allocating memory for the user's input. You're visiting one web page, two web pages, that requires memory whatever the program is. And if that human never calls the opposite of allocate-- deallocate, otherwise known as free, you're never giving the memory back to the operating system. So it gets slower and slower because it's running lower and lower and lower on memory, and it might have to move some things around to make room for things, that's what's called a memory leak. And so indeed, in this program, I should actually improve this a little bit. 
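Putting strcpy and free together, a version of the copy might look roughly like this. The helper name capitalized_copy is an assumption made for illustration; strcpy, malloc, and free are the standard library calls the lecture names.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a capitalized copy of s that the caller
// must eventually free, or NULL if malloc fails.
char *capitalized_copy(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t == NULL)
    {
        return NULL;
    }

    // Destination first, source second; strcpy copies the '\0' as well
    strcpy(t, s);

    if (strlen(t) > 0)
    {
        t[0] = toupper((unsigned char) t[0]);
    }
    return t;
}
```

Whoever calls it hands the memory back with free(t) once the copy is no longer needed; free takes only the pointer, not a byte count.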
If I go back into this version here and line 18, recall, I allocated this memory just to make my copy, the very last thing I should actually do in this program is this line here-- free. You don't have to tell the computer how many bytes you want to free, it will remember for you so long as you just pass in the pointer-- the variable that's storing the address of the chunk of memory that you allocated. All right. So let's now see why we've been using get_string, since it's not just to kind of simplify the code, it's also to defend against some very easy problems. Here is a program called scanf0-- scanned formatted text, another arcane-sounding function, but it's pretty straightforward. This program simply gets an int from the user using scanf. Up until now for the past three weeks, you've used get_int. So this is an alternative to get_int that you could have started using a few weeks ago. Give me an int called x, print out x colon whatever-- that's just the prompt to the user. scanf %i, &x, whatever that is, and then print out x's value using %i. So what's going on here? Now today we can actually start to wrap our minds around what get_int actually does. This is effectively get_int. If you actually look at the source code for get_int, it's a little fancier. But in essence, what get_int does is it declares a variable called x, and it doesn't put anything there, because that's supposed to come from you, the human. It then prompts you for whatever string you pass to get_int, so those are the first two lines. And this is the only weird-looking one. Scanf is like the opposite of printf. You still use a formatted string-- %s, %i, %f or whatever, but you're not going to output this, you're going to input this from the human's keyboard. And &x is the opposite of-- is the special symbol in C that says, go ahead and get me the address of x. So don't pass in x, give me the address of x. Now why is that? 
We'll see, but this is the way where you can tell the computer, I've made a variable for you called x, here is where it is. It's a treasure map that leads you to x, go put a value here for me. And so the end result is that we do, in fact, end up getting an int. If I do make scanf0, and then ./scanf0, I'll type in 42, all right? It's not an interesting program, it just spits back out what I got, but that's literally all that get_int, of course, is doing if you then print out the value. So if I stipulate this is correct, this is how you get an int from the user, but honestly, the reason we don't do this in week 1 of the course is like, my God, we just took the fun out of even getting a simple number from the user by using these lines of code and whoever knows what this symbol is-- we don't want you to think about that, we want you to just get an int. But today those training wheels are off, but we're going to run into a problem super fast. Let's try the same thing with a string. If I were to do this, you would think that the result is the same. Or let's just do it as char*. But there's going to be one tweak. If I go ahead and give myself space for the address of a character, I don't need to use the ampersand now, because scanf does need to be told where the chunk of memory is, but it's already an address, so I don't need the ampersand here. Recall earlier, I declared int x, which was just an int. &x gets the address of that int. Here, I'm saying from the get-go, get me the address of a char. I don't need the ampersand because I already have the address of a char by definition of that star symbol. So what's going on here? Let me see now. If I run scanf1, what happens? So make scanf1 and-- oh, let's see. Here's a warning I'm getting. Variable s is uninitialized when used here. All right, that's fine. It wants me to initialize it because this is a very common mistake. 
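Going back to the int case for a moment, the role of the ampersand can be shown in a few lines. To keep the sketch usable without a keyboard it swaps in sscanf, scanf's sibling that reads from a string instead of from the user; the helper name parse_int and its return-0-on-failure behavior are assumptions for illustration.

```c
#include <stdio.h>

// Hypothetical helper: pulls an int out of text, much as scanf would
// pull one from the keyboard. Returns 0 if no number could be read.
int parse_int(const char *text)
{
    int x = 0;

    // &x is the address of x: it tells sscanf where to put the value
    if (sscanf(text, "%i", &x) != 1)
    {
        return 0;
    }
    return x;
}
```

With scanf itself, the equivalent call is scanf("%i", &x), which waits for the human to type the number.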
Those of you who alluded to segmentation faults earlier might have encountered something similar in spirit to this. So that squelched that error. Let me go ahead and run scanf1. All right, here we go, TJ. Hmm. That is not your name, but OK. It didn't crash at least, it's just a little weird. David. Null, OK, that's a little weird. Let's go ahead and do this again. Let's type in a really long name. Enter. Dammit, that didn't work. So let's try an even longer name. I'm hitting paste a lot. OK-- dammit. Too many times. Too many times. Command not found, that's definitely not a command. Wow, OK. Well that's interesting. Oh, there it is. Null, same thing. OK, so what's actually going on? Well null, which is all lowercase here, which is this kind of an aesthetic thing, well it's not working. It's not working. Well what am I actually doing? In that first line of code, when I say give me s to be a char*, otherwise known as a string, all that's doing is allocating this. And it's technically the size of a pointer. A pointer, we never mentioned this before, but now we can. Turns out it is 64 bits or 8 bytes. 8 bits is 1 bytes, so a pointer is by definition on many computers these days-- most of your Macs, most of your PCs, the IDE, the Sandbox, the Lab-- is 64-bit. So that just means there's 64 bits here, but we initialized it to null, so that just means there's 64 0's here, dot-dot-dot. But when I get a string using scanf, what I'm telling the computer to do with this line of code here, notice, is hey computer, go to that address and put a string there. So what's actually happening? It turns out that there's just not enough room to type in TJ. There's not enough room-- that's a bit of a white lie, because we could fit you in 64 bits, but there's not enough room to type in the long sentence or paragraph of text I did, right? What did we not do? We didn't allocate any space over here. All we allocated space for was the address. 
And so every time I use scanf saying, get me a string and put it here, there's nowhere to put it. And so the value just very defensively says, no, like no, cannot store this anywhere for you. So I actually need to be a little smarter about this. I actually need to get myself some space so that I can actually store something in the right place. Let's do that. Let me go ahead and create a new program. I'm going to go ahead and call this scanf2. We need a little secret code to remind me of that. Oh, wrong file name. So I'm going to go ahead and create a file called scanf2. scanf2.c. And I'm going to quickly recreate this stdio.h, int main void, and then down here I'm going to go ahead and-- you know what? Instead of a string s, which I know today to be a char* s, what is this string really? Well you said it earlier. What is this string? It's an array of characters. Let me take you literally. Just give me an array of let's say five characters. The D-A-V-I-D, or one more, that's fine, just enough for my backslash 0. Let me just create a string-- really low level, but this time give myself the chunk of memory. I don't want just the address of a character, I want the actual characters themselves. Let me go ahead and just prompt the human for their string with s, just like before. Then let me call scanf and get a string from the user using %s and then pass in s. And here's a little trick. It turns out that because a string is really just an array, but a string is also just a pointer, you can actually treat an array as though it is a pointer-- an address. And so even though this is a char array, this is OK. This is the equivalent in this context to being just the address of a string. Because strings are arrays, arrays can be treated as pointers as of now. And then let me go ahead and just print out whatever the human typed in. s is actually this. Pass in s, semicolon, save. Yeah? AUDIENCE: So [INAUDIBLE] char*? 
DAVID MALAN: At this point it would be redundant to do char*, because I literally want for this story six characters. I want space, rather, for six characters. So this is kind of week 2 stuff now, there's no pointers involved. But again, just showing the equivalence of these ideas for now. So if I now go into this, and this is in my other directory at the moment, make scanf2, Enter, ./scanf2, s is going to type in-- I'll type in my name, I know I can fit that, we're back in business. Like now it's working because I didn't just create the address for a string, I created the space for the string. But let me get a little dangerous-- David Malan? OK, that kind of worked out OK. David Malan or some really long other name? OK, that worked out too. Let me go ahead and run it again. Let me try that really long string again, see what happens. I know this didn't work very well last time. All right, done. Ooh, OK. So now I'm in the club of those of you who have had segmentation faults. So let's understand what's going on here. Segmentation fault a moment ago I claimed was touching a segment, a chunk of memory that's not your own. So what just happened? Well with this simple program, I told the computer, hey computer, give me room for six characters, give me six bytes. With the scanf line, I'm telling the computer, put the following user input at that location, in that array of characters. D-A-V-I-D backslash 0 fit. David Malan didn't really, but it didn't seem to be a huge deal. David Malan or some really long other name, also didn't crash the computer. But that's because unbeknownst to us, usually when you ask for six bytes, the computer is kind of sort of-- it's giving you a few extras. It's not safe to use them, but it gives you enough that you're not going to necessarily see a problem like a segmentation fault. 
But it only allocates a few extra bytes typically, so if you really keep pasting in long, long, long, long lines of text, eventually you're going to exceed not only those six bytes, but well past the special-- the secret bytes that you got back that you shouldn't be using anyway, and at that point the computer just gives up and says, you are touching memory you shouldn't, a.k.a. segmentation fault. AUDIENCE: [INAUDIBLE] if the computer gives you a few extra bytes, then why isn't it printing any of the other stuff? After you said [INAUDIBLE] it just printed David. DAVID MALAN: Really good question. So even though I'm getting these sort of extra bytes, why am I not seeing them after D-A-V-I-D? I'm probably getting lucky. Long story short, when you first run a program, much of the memory that your program has access to is by default initialized to 0's. 0 is the same thing as backslash 0, and so I'm getting lucky. When I had D-A-V-I-D and then excess space in that array, a lot of them are initialized as 0's already, and the string is getting secretly terminated for me. Or the better answer is, it's undefined behavior. Like you should not touch memory that is not your own. What happens after that is your risk alone. But that's a conjecture as to why that's happening. All right, so what is the fundamental feature that get_string is providing for us? All of this time get_string has actually been dealing with all of this headache for us. I mean honestly, even I'm getting bored like thinking about, talking about how you just get a damn string from the user, because you need to figure out, well how many bytes do you need? And what if the human types in one more byte than you were expecting? Then you need to do a switcheroo and get more memory. get_string is doing all of this headache for us. 
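To make that switcheroo concrete, here is one way a line of unknown length might be read safely. This is a sketch in the spirit of the idea, not the CS50 library's actual source: the name read_line, the starting size of 16 bytes, the doubling strategy, and the use of realloc to get a bigger chunk are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical sketch: reads one line from in, one character at a time,
// growing the buffer whenever it runs out of room. Returns a string the
// caller must free, or NULL if memory runs out or no input remains.
char *read_line(FILE *in)
{
    size_t capacity = 16;
    size_t length = 0;
    char *buffer = malloc(capacity);
    if (buffer == NULL)
    {
        return NULL;
    }

    int c;
    while ((c = fgetc(in)) != EOF && c != '\n')
    {
        // Keep one byte in reserve for the terminating '\0'
        if (length + 1 == capacity)
        {
            char *bigger = realloc(buffer, capacity * 2);
            if (bigger == NULL)
            {
                free(buffer);
                return NULL;
            }
            buffer = bigger;
            capacity *= 2;
        }
        buffer[length++] = (char) c;
    }

    // Nothing at all left to read
    if (c == EOF && length == 0)
    {
        free(buffer);
        return NULL;
    }

    buffer[length] = '\0';
    return buffer;
}
```

Called as read_line(stdin), it never writes past the memory it has asked for, however long the pasted input is.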
And that's not to say you need to use it forever, there are indeed training wheels, but that's just because when you're using C or a lot of programming languages, the computer will only do what you tell it to do. And it turns out that even asking the user for input, if you don't know how many characters he or she is going to type in from the get-go, you have to deal with it. And so underneath the hood-- and you're welcome to take a look at the source code for CS50's library, which I'll post on the home page later today, it turns out that with the way we're doing get_string is taking baby steps. We literally like get one character at a time from the user, kind of building the road as we go. And if we don't have enough space, we ask the computer, give me some more bytes so I can get more bytes, and we just get one character at a time so that we can handle the user maliciously or accidentally typing in way more input than we actually expect. So let's contextualize all of this then. Recall that we've been drawing these pictures the past couple of weeks. Let's just make this super clear as to what's been going on. This is a memory module in a computer. It's just a green board, it's way blown out of scale here, it's easily like yea big inside of your Mac or PC laptop or desktop, though can vary in size. One of these black chips is the actual memory or the bytes to which we've been referring. And if we zoom in on that, recall that I proposed last week that you can just think about this as like a grid, an array. And it doesn't have to be rectangular, this is just an artist's rendition, but each of those squares represents, we claimed, a byte. And each of those bytes can be addressed in some way with a number. And that number is just its location, otherwise known as an address. We can actually see this, it turns out, as follows. Let me go ahead and open up this example here. Or actually, you know, let's just write this one from scratch. 
Let me write a program called addresses.c. And that's going to use our old friends, the CS50 library and stdio.h and int main void. And let me go ahead and just do this. I'm going to go ahead and get a string-- you know what? No more string. char* from the user, get_string, ask the user for s. And we get another string, a.k.a. char*, get_string, call it t from the user. And then, I want to print out not the strings, which I used to do like this, printing out s. I want to print out the pointer that s really is, that is the address. Turns out %p for pointer will print out not the string at that memory location, it will print the actual memory location for you of s. And I can do the same thing here, %p, backslash n, paste in t. And just so I know which is which, let me just prefix it with some text-- s colon and t colon. Let me go ahead now down here and do make addresses. Oh, I messed up, missed a semi-colon. Let me do this again. make addresses. And get rid of this. That compiled OK, ./addresses, and here we go. Let's type in-- let's do Brian and Veronica like before. Enter. And this is a little funky, but it turns out the IDE and your Macs and your PCs have a lot of memory. So this is the address. It's not quite as small as 100, it's not quite as small as 900. It's actually kind of big. It's 2331010 with this weird 0x. Well it turns out, this is just a human convention. In week 0 we talked about decimal and all of us grew up with decimal, 10 digits from 0 to 9. Talked a little bit about binary 0's and 1's. Turns out there's an infinite number of base systems-- decimal/dec, binary/bi are just two of those infinite number of possibilities. Turns out there's another one that's super common called hexadecimal. Hexa meaning 16 in this case. So base-16 actually has 16 letters in its alphabet. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f. So it turns out that base systems that need to count higher than 10 characters just start using letters of the alphabet by convention. 
Humans just decided this. So we're getting just numbers in this case, but if these addresses were even bigger, we might actually see some alphabetical letters between a and f there. And frankly I don't know what address this is, but Google's usually pretty good at this stuff, so let me actually open up another browser window. And let me just paste this in. Come on, Google. Come on. So Google is your friend when it comes to this stuff, or any number of calculators. 0x2331010 in decimal please. And Google has translated that. So Brian, I kind of underbid earlier. He is not at address location 0, he's actually in the 36 millionth byte inside of my computer right now, location 36,900,880. So a little higher address than 100. And then Veronica, if you really want to get into the weeds here, we can say "in decimal," let Google translate that for us. She's at location 36,900,944. Why? Who cares? The computer is managing all of this for us, but when get_string used malloc, these are literally the numbers that were being returned saying, you may use this chunk of memory. And why did humans use hexadecimal? Like it's just slightly more compact to say 0x2331050 than 36900944-- like you just save a few digits, so it's just conventional. That's all, there's no magic there. But, recall earlier. Do you recall that when I had the debugger open earlier, you saw next to my name variable a value that was cryptically 0x0? Then there was another value that I don't recall-- 0x-something? That was just the numeric address of my name in hexadecimal. And 0x0 is just the technical address being used by null. Yeah? AUDIENCE: You said the address printed out was [INAUDIBLE] x of the variable s and-- DAVID MALAN: Sorry, could you say that again? AUDIENCE: You said the address printed out on the screen was an x, but x is [INAUDIBLE] DAVID MALAN: Ah, I should've clarified. 0x, humans years ago decided anytime you see anything with 0x, that means whatever comes next is hexadecimal. 
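The conversion handed to Google above can also be done in C. A minimal sketch, assuming a made-up helper name hex_to_decimal and leaning on the standard strtol function, which accepts an optional leading 0x when asked for base 16:

```c
#include <stdlib.h>

// Hypothetical helper: turns hexadecimal text such as "0x2331010"
// into the number it represents.
long hex_to_decimal(const char *hex)
{
    // The third argument is the base to interpret the digits in
    return strtol(hex, NULL, 16);
}
```

Printing hex_to_decimal("0x2331010") with %ld gives the same 36900880 reported for Brian's address, and "0x2331050" gives Veronica's 36900944.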
Just the convention. It's also common too if it starts with a 0, it's an octal, which is base-8. If you see a lowercase b at the end, it means binary. So humans have just come up with symbology as to kind of communicate this to readers, that's all. Not part of the value. So turns out that we can actually do this math ourselves. And we won't really get into the weeds of this because it's not a particularly useful life skill, to be able to convert to various base systems, but let's just do one example so that we've seen it. Just to make clear that there's no magic here, it's just a different way of thinking about numbers versus grade school. So if back in the day we had three decimal numbers-- 255, 216, and then another 255, if we rewound to week 0, we could go through the math of converting that to binary. And even if it might take you a little while, this is the binary equivalent. And frankly, the first and last are kind of easy. 255 is kind of a special value because with 8 bits, all of which are 1, that's what gives you 255. So the only hard one is actually this. But who cares about the math today. We know from weeks ago that we can do this if we really tried. But notice that bytes are eight bits, and of course, eight is a pair of four, if you will. Well what's really nice about hexadecimal is that it starts at 0 and ends at f. And that's 0, 1, 2, 3, 4, 5, 6, 7, 8, 9-- wait-- yes, that's 10. OK. And then a, b, c, d, e, f. I just held up 16 fingers in total, hence, hexadecimal. What's nice about base-16 is that how many bits do I need to count from 0 up to-- one, two, three, four-- 15? Just 4, right? So if I have all 0 bits, that's 0. And if I have 4 1-bits, that's-- let's see. This is an 8 plus 4 plus 2 plus 1 gives me 15. So long story short, hexadecimal's super convenient because 0 through f maps wonderfully cleanly to 4 bits. So it's just a nice way of thinking about the world not in units of 8 but in 4 instead. 
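The point that one hex digit covers exactly 4 bits can be seen by splitting a byte into its two halves. A small sketch, with byte_to_hex as a hypothetical helper name:

```c
// Hypothetical helper: writes the two hexadecimal digits of one byte
// into out, followed by a terminating '\0' (so out needs 3 chars).
void byte_to_hex(unsigned char byte, char out[3])
{
    const char digits[] = "0123456789abcdef";

    out[0] = digits[byte >> 4];    // the high 4 bits
    out[1] = digits[byte & 0x0f];  // the low 4 bits
    out[2] = '\0';
}
```

Passing in 255 fills out with "ff", since both halves are 1111, and the same two-step lookup works for any value from 0 through 255.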
So all I did here was I took my values and I just added a little bit of whitespace to make clear that 8 bits is like a pair of 4 bits. It turns out now that 1 1 1 1 is f for the reasons I enumerated earlier. All 1's is f, otherwise known as 15. All 1's is again f, otherwise known as 15. If we did the math, 1 1 0 1 is d, 1 0 0 0 is 8, and then all 1's is f and f. So long story short, there is a way to convert from decimal to binary, to hexadecimal, to any number of other base systems. It all just boils down to what digits you care about. And the way you write this, to your question earlier, is by human convention. Not just FFD8FF, but 0xFF 0xD8 0xFF just because. Then it's clear to the user what it is. So a little levity now. I'm sorry to do this to you, but now you will all hopefully understand this famous comic. OK, welcome to that club of people who understand things like this. So let's now stumble upon just one last problem, and we'll take it home by putting this into the context of a very sexy field of forensics where all of these building blocks will come into play. But first let's start with a problem. Suppose I want to implement a function here called swap whose purpose in life is just to swap two values, a and b. I just want to do a switcheroo. Let's first do this with a sort of mid-lecture snack for at least one person. Would anyone be up for-- OK, that was fast. Volunteering, come on up. What's your name? Kelly, all right. Thank you for volunteering so suddenly. Kelly, David, nice to meet you. OK, so very simple task at hand. I have here two empty cups, and we have some orange juice. OK, put this in here. And we've got some milk over here. That should stand out, very different colors. OK, I would just like you, Kelly, if you could, swap those two values. Orange goes into milk, milk goes into orange please. That is cheating, OK? No, I mean literally the cups. 
I put them in the wrong cup, I prefer my milk in the other cup and my orange juice in the other cup, I'm sorry. AUDIENCE: Pour it back in. DAVID MALAN: No, that is not available to you, OK? [LAUGHTER] OK, so you're struggling. Why are you struggling? KELLY: Because I'm going to mix them. And then it won't be the same. DAVID MALAN: Right. So I mean obviously, this is kind of a losing proposition. You can't really do this. What would make this easier for you besides putting them back in the bottles? KELLY: Having another container. DAVID MALAN: Yeah. So you need like a temporary storage space for this. You know, let me-- Tara, can we get some more cups over here? Ah, this will make it easier. OK, so if I get you some temporary space-- here you go-- could you solve the problem now please? Ah, very nice. A little contamination, but that's OK. But I need that temporary cup back for Tara. Yeah, OK. Thank you. All right, a round of applause if we could for Kelly here. [APPLAUSE] Well here we go. I'm guessing you don't want warm milk, but orange juice? OK. Thank you so much. All right, so what's the point here? This is pretty easy. Like once you have some temporary storage space-- a variable, if you will, like it's no problem to swap two values. So let me go ahead and do that as follows. I'm going to go ahead and just implement this swap function and see exactly as Kelly ultimately just implemented it. If the goal is to swap a and b, I can't just do a complete switcheroo, it seems. I need to put one of those values, like the milk, in another container, and then swap and then swap. So it takes three steps, not just one. All right, so I could call this extra variable or cup that Tara gave us anything we want-- tmp. So I'm just going to put a in tmp. Then I'm going to put b in a, because a is now empty. Then I'm going to put tmp in b, and then I don't really care what happens to tmp-- indeed, it's just still sitting there, but the job is now done. 
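Written out, the three steps just described look like this. The body follows the description step by step; one thing worth keeping in mind for the demo that follows is that C passes arguments by value, so a and b here are the function's own copies of whatever the caller hands in.

```c
// The three-step swap as described, using tmp as the spare cup.
// a and b are this function's own copies of the caller's values.
void swap(int a, int b)
{
    int tmp = a;  // put a in the spare cup
    a = b;        // b goes into a
    b = tmp;      // the spare cup goes into b
}
```

A caller would use it as swap(x, y) after setting x and y, which is exactly what the next program does.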
So let's go ahead and see this program in action, because obviously this should be pretty straightforward. So let me go ahead and open up this program in the context of a main function so we can actually run it. In this code here, I'm going to demonstrate it as follows. Here's my main function. I'm going to call variable x, give it 1, call variable y, give it 2, go ahead and just print out just for a quick sanity check-- x is this, y is that. Then I'm going to call this super simple swap function, x, y. Then I'm going to print the exact same thing-- x is this, y is that, just so I can see in those variables-- I could also use debug50, but this is meant to be a complete solution, I want to see it on the screen. Here is swap. I copy-pasted that from before. This feels like a no-brainer, super straightforward, let's go into my directory and compile this program, which, slight spoiler, noswap is the name. ./noswap. Oof. Let's zoom in. Nope, that is not what I intended, right? I really intended milk to become OJ, OJ to become milk, or x become y, y become x, this doesn't seem to work. And again, the only magic is this one call to swap. All right, maybe it just works some of the time. So nope, nope-- OK. Now it's time for the debugger. I don't understand what's going on in my program, printf is not really illuminating here. So let me go ahead and run debug50 ./noswap. The little debugging panels get opened on the side, but wait, I need a breakpoint. I'm going to set a breakpoint at the very top, the first line I care about. I don't really care about all the stuff at the super top. Now I'm going to go ahead and rerun debug50 ./noswap, all right? Now I see over here, the first line, line 9, is highlighted. Notice on the right-hand side, and this perhaps answers by example your question earlier. x and y, conveniently but just by chance, were initialized to 0-- not by me, I shouldn't necessarily trust this in all contexts, but that's why they had values.
They're otherwise known as garbage values, but I got lucky with 0's here. Let me go ahead and step over that line, and if you watch, albeit small, on the right-hand side, x should suddenly take on a value of 1. And if I step over one more line, y should take on a value of 2. OK, so I'm pretty confident the program is thus far correct. I'm going to go ahead and step over printf. And notice the blue terminal window, I see one output. Now things get interesting. If I continue stepping over lines, it's just going to finish running and that's not enough. So notice this time I'm going to hover over this third icon, Step Into. Now I can kind of go down the rabbit hole, so to speak, and go into the swap function, and notice, the debugger jumps into that other function. So here now, the context changed. My local variables are now a, b, and tmp, and this is really weird. A is 1, b is 2, as expected, because I passed an x, y. And in the context of this function I'm just calling them a, b because. But why is tmp 32,767? It's just because it can't be trusted, it's a garbage value. If you just give yourself a temporary value, who knows what's in there? We got lucky and Tara did not have anything in this cup, but it could have had a garbage value, maybe it had some Pepsi, and then we would have had to replace that value somehow. So to be clear, when you declare variables in a program, quite often they have garbage values, just bogus values-- the 0's and 1's that are there underneath the hood in that chip, but that you didn't set yourself. But that's OK, because I'm explicitly in this next line setting tmp equal to a. So it doesn't matter what its original weird value was, so if I click Next, tmp is now 1, a.k.a. a. Now notice a is going to become b if you watch the right-hand side. Now I seem to have a is 2, b is 2, which is a little worrisome but not as bad, because I have that separate variable tmp, so I still have the one around. 
So now b is about to become 1, and I've done the switcheroo. OK, at this point in the story, line 22, my code seems correct. b has become a, a has become b, and the values are swapped-- and the debugger is confirming that for me visually. So now, let's do a step and-- dammit. Lost. What is going on? Intuitively? Even if you've never seen or done this before, like clearly there's a bug. What is that bug? What must be happening? Yeah? AUDIENCE: [INAUDIBLE] a new value [INAUDIBLE] doesn't have the same address for the first one? DAVID MALAN: Yeah. What seems to be happening here is yes, you're passing in x and y and calling it a and b, but a and b would seem to be copies of x and y. And I am very successfully, very correctly swapping a and b, but because they're copies, it has no effect on the original x and y. So our metaphor here of juice isn't quite apt because I didn't pass Kelly copies of the OJ and milk, I handed her the actual OJ and milk and she was able to change the values. But in the context of C and code, when you pass arguments to a function, you're passing copies of those arguments to the function. So intuitively, what is the solution? We clearly cannot pass from one function to another copies of the values if we expect the function swap, or a.k.a. Kelly, to make some useful change for us. What do we have to pass to the function or to Kelly instead? The addresses of those values, right? I told her where the milk and OJ were. I didn't give her copies of them, I told her, here's the milk, here's the OJ, swap those. In this version of the code, I've just said, here's a copy of x, here's a copy of y, you can call them a and b-- um-mmm. We need to now use the ampersand or something like that to pass in a map, if you will. The treasure map to those values so that swap can change the original values. And the way we do this is a little weird-looking, but all we're going to have to do is make a little addition here that looks as follows. 
It's got to look like this instead. So this is the broken version. Or broken in that it doesn't have the effect we intend even though it works. This is what we need to do instead, and it's the last piece of new symbology for today. We've seen star in a couple of different places before, now we're using it in one final context. When you specify a star here and here in the arguments to a function, that is just the way you tell the computer, I'm expecting not an int, but the address of an int. I'm expecting not an int here, but the address of an int. So two pointers, two addresses of integers. Down here, tmp is still just an int. I don't need to over think tmp, that's just an empty cup. Give me an integer called tmp from week 1. But, what do I want to store in tmp? Both a and b in this version are addresses. Do I want to remember the address a and the address b? No, I want to remember the volume of OJ, the volume of milk, I want to remember 1 and 2, I don't care where in memory they are. So star in this context, when there's no mention of a data type, there's just a star and a variable name. That variable is a pointer and it's not multiplication, there's no math going on. That star is the dereference operator that says, go to this address and get the value there. So if this address a is at location, I don't know, 100 like Brian was, and this address b is at location 900 like Veronica was, *a means go to the 100th byte in memory and get me that value, which is 1. This means, down here, go to the address b, get me that value at address 900, which is 2. And go ahead and store 1 in tmp. Go ahead and go to that address and put whatever's at b's address-- so get that address and put it over-- get that address, get the value, and put it over at that address by dereferencing. And then lastly, go to b in memory, like over there, put the tmp value there. So whereas ampersand in our previous example means, tell me what the address is of a variable, star is the opposite. 
When you have an address, it says, go to that address. Follow the treasure map, X marks the spot at that location in memory, and get at its value. So what is the net effect here? If I actually now open up not this example, but swap.c-- spoiler, this one is going to actually work. If I open up swap.c, we're going to see now the following instead. The code is almost the same, except that I pasted in this new green version of the function. And notice here, this had a change. Why am I typing in &x now and &y instead of just x and y? AUDIENCE: [INAUDIBLE] address [INAUDIBLE] functions [INAUDIBLE]. DAVID MALAN: Exactly. The swap function now, the new improved version, is expecting two addresses-- stars, a.k.a. pointers, not just values. So this means I know x and y are actually integers from week 1. Now I need the address of x and the address of y so that swap can follow those treasure maps, so to speak, and go to those addresses. So now, when I run this program, this is more like the metaphor with Kelly where I told her where the milk and OJ were. Now swap can go to those locations as follows. make swap. Let me go ahead and then do ./swap, Enter-- ah! Now it seems to be working. And we can see as much even with the debugger. Even though it doesn't seem to be buggy, I can still use debug50 to see and understand my program, if not obvious-- oh, I still need a breakpoint. Let's set a breakpoint as before. Let's rerun debug50. The right-hand panel will open automatically for me. And let's go ahead and see, if I start stepping over this, now I see that x is 1, y is 2, printf prints as much on the screen. Now I'm going to go ahead and step into swap, and now notice, it's a little weird-looking, because now a is an address and b is an address, but tmp is still an int with a garbage value, but I can fix that. Now tmp is 1, but notice, a and b's values are not changing, but what is clearly changing per the code? So notice, this is weird and cryptic.
a is this 0x value. That's a big hexadecimal address, like that is where in memory a points. But you know what? If I click the little triangle, I can kind of follow that pointer and go to it. The debugger is smart like that. So *a, go to a, is 2; and *b at the moment is 2, but if I keep going, now I've done a switcheroo, and you can see that these values have changed. And again, we don't care what these addresses are, I don't care what the actual addresses are. I do care that it gives me this functionality, because now when I return up here to printf, now the values have indeed changed as I expected this whole time. All right. That was complex, but hopefully clear as to why it now works even though we've made this code look more cryptic. If not, any questions are welcome. Yeah? AUDIENCE: Is that from the spot where [INAUDIBLE] DAVID MALAN: Uh huh. AUDIENCE: [INAUDIBLE] the star [INAUDIBLE] pointers? DAVID MALAN: Good question. Do we really need to have these ampersands here because we already have the stars here? Short answer, yes, for symmetry. This is telling the function what to expect on the way in; this is what's telling the computer actually what to send in. So what are the actual inputs to that function? It has to be symmetric. Yeah? AUDIENCE: [INAUDIBLE] value is swapping addresses. DAVID MALAN: We are swapping what is at the addresses. AUDIENCE: So what if you change the address of [INAUDIBLE] DAVID MALAN: OK. AUDIENCE: And would we swap the addresses saying 2 is at 200 and 1 is at [INAUDIBLE] that could change. DAVID MALAN: Short answer, you cannot for the following reason. So technically, when you do &x and &y, these are converted to the address of x, the address of y. Technically swap is getting copies of something, C has not changed. But C is now getting copies of the address of x, copies of the address of y, calling them a and b. So sure, you could swap the addresses, but for the same reasons as before, it's going to have no fundamental effect.
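The answer to that last question can be checked with a small sketch. The name swap_addresses is hypothetical, not code from the lecture; the function exchanges the two address copies it receives without ever dereferencing them.

```c
// Hypothetical helper: swaps only the local copies of the two addresses.
void swap_addresses(int *a, int *b)
{
    int *tmp = a; // tmp holds a copy of the address stored in a
    a = b;        // a now holds the address that b held
    b = tmp;      // b now holds the address that a held
    // No * is applied, so the ints at those addresses are never touched.
}
```

Calling swap_addresses(&x, &y) with x equal to 1 and y equal to 2 leaves x as 1 and y as 2, because a and b are themselves copies that disappear when the function returns.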
The difference here is that I'm passing in a map, so to speak, to x and y, their addresses. And again, an address is like-- we are at 45 Quincy Street I think right now-- Cambridge, Massachusetts 02138, USA. That uniquely identifies the building. These 0x hexadecimal numbers uniquely identify locations in memory. So this is like saying now, get me the address of x, get me the address of y, and I'm technically passing in copies of those addresses, but it doesn't matter, because now with the star notation, I'm saying go to those addresses and swap who is physically in this building and some other. All right. So let's just put this now into the context of what else your computer actually has, just so that you've seen some nomenclature around this computer's memory. So this is the chip with a grid laid out on top of it just to communicate that there's bytes here, and we could number them. But let's think about this now more abstractly, and let me just reveal that it turns out that the computer treats different bytes, different squares in different ways just by convention. It turns out that in your computer's memory-- and this is all just an artist's representation-- at the top of that chip of memory, so to speak, is the so-called text of your program. This is a fancy and non-obvious way of saying the 0's and 1's that your code has been compiled into. The text of a program is the code you wrote, in binary, and that's where it's loaded into memory. So in macOS and Windows, you double-click an icon, that program is loaded into memory, I said last week. It's literally loaded into the top of your computer's memory conceptually. What else? Well the heap is the fancy name given to the chunk of memory from which memory comes when you call malloc. So when I called malloc earlier to get a bunch of space for some characters, it was just coming from this big open area called the heap. And that's what get_string is using and other functions as well.
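As a minimal sketch of memory coming from the heap, the hypothetical function below (the name heap_copy is an assumption, not from the lecture) asks malloc for enough bytes to hold a copy of a string.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a heap-allocated copy of s, or NULL on failure.
char *heap_copy(const char *s)
{
    // One byte per character, plus one for the terminating '\0'.
    char *t = malloc(strlen(s) + 1);
    if (t == NULL)
    {
        return NULL; // malloc could not find enough memory
    }
    strcpy(t, s); // copy the characters, including '\0', into the new space
    return t;     // the caller should free this when done
}
```

The address that comes back stays valid until free is called on it, so the caller is responsible for handing it back.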
Well it turns out that the reason for the problem we just ran into is that the bottom part of memory is what's called the stack. The stack is the area of memory that functions use when they are called. And this is actually relevant to that very simple noswap example as follows. Let's now assume that anytime you call a function, the memory it uses comes from the bottom of that big block of memory, and we can draw that, for instance, here on the screen, because it turns out that anytime you call a function, that function gets a slice of its own memory. So for instance, main is always the first function a program calls, and so it gets the first slice of memory at the bottom of the screen here. And so if main had two variables x and y, that's like saying, OK, give me a chunk of memory called x and put the value 1 in it; give me another chunk of memory, call it y, put a value in it here. But remember, from the first noswap example, the swap function was called. This is a stack in the literal sense. You go into a dining hall, a cafeteria, one tray for food, goes on another, goes on another, goes on another so that the humans can take it and put food and plates on it. Well similarly in this model, when you call a function, it gets its own slice of memory, but literally above, conceptually, the existing frame on the stack. So this is the swap function's own chunk of memory, and it, too, gets some space. It gets some space for a variable called a. It gets some space for a variable called b. And guess what goes inside those in that first example? A copy of x and a copy of y. And you know what? It had a temp variable, so that's got to have some space here. So I'll call this tmp. And recall that I set tmp equal to a, so that got 1. And then what happened? Well then I did what-- what did I do? Let me get this right. We had a gets b. So what happened there? So in this example here, a gets the value of b, so that changed.
And then what happens here, b got the value of tmp, so that changed. So swap was working in the sense that it was swapping values, but the problem is, when a function returns, this chunk of memory that it was previously using gets reclaimed so that someone else can now use it, another function. So we did all that hard work in noswap, and we did it correctly, we just did it in the wrong place. So by contrast, this next example that we did, which was swap.c, just treated the memory a little bit differently. Main this time still had two variables, one called x, and this was a 1, and then another one called y, and this was a 2. And then when swap was called this time, it again had a variable called a and a variable called b, but what was stored in a and b? Well now they're addresses. And I don't know what it is, but let me just arbitrarily say that this is location 100, this is location-- let's say 104. But it could be anything, we just don't care at this point, it would have 0x technically if the computer were showing us. What's going in a here is 100, what's going in b here is 104. And those are the addresses of x and y, and the code we had using all of those new stars was saying, go to address 100 and store whatever is at address 100 in tmp. Then go to the address that's in b, or 104, and store whatever is there at the location in a. Then it was saying, go get that tmp value, by the way, and go ahead and put that here, so that now we did different work in a different place. So now when swap is done running, it doesn't matter if its memory disappears because it has now mutated or changed the other memory whose addresses it was passed, just like Kelly changed or mutated the cups I actually pointed her at rather than copies thereof. Now as an aside, there's other chunks of memory that are actually used.
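The pointer-based version traced in that second picture can be sketched as follows. The addresses 100 and 104 in the comments are the arbitrary ones used in the drawing, not values a real run would show.

```c
// Sketch of the pointer-based swap; a and b hold addresses, not ints.
void swap(int *a, int *b)
{
    int tmp = *a; // go to the address in a (say, 100) and copy the int found there
    *a = *b;      // go to the address in b (say, 104) and copy that int to a's address
    *b = tmp;     // put the saved value at the address in b
}
```

From main this would be called as swap(&x, &y), so that the function receives the addresses of x and y rather than copies of their values.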
If you have global variables in a program, it turns out that in between the text and the heap memory are your global variables, whether they're initialized with values, as would happen with the equal sign, or not initialized with values, but we don't care too much about that for today's purposes. And if you've ever heard of environment variables, which we will when we get to web programming, they, too, are stored elsewhere in memory. But the most interesting chunks of memory are stack and heap, as in this case here. But unfortunately it's so easy for things to go awry-- I mean, some of you experienced segmentation faults already, and let's consider why that might happen. So here's a contrived example of code that is by design buggy, but let's just talk it through in English what these lines are doing. This line here, int *x, is saying, hey, computer, give me a variable that will store the address of an integer. So give me a pointer to an int is the more casual way of saying it. Hey computer, give me another variable that's going to store the address of an int and call it y. So x and y, that's it. This line is new-ish. Hey computer, allocate enough space that will fit an int. So sizeof int is the new syntax we saw earlier for just figuring out how many bytes is an int. Odds are this is going to come back as 4 bytes, or 32 bits, on most computers. So this just says, hey computer, give me 4 bytes of memory and store that in this location. Or rather, store that address in this variable. So maybe it's going to say, OK, here's four bytes at location 100, or here's four bytes at location 900. Or wherever, we don't care, we're just remembering that address in x. *x says, go to that address-- 100 or 900, whatever it is, put the number 42 there. This next line says, go to the address in y and put the unlucky number-- hint, hint-- 13 there. Well what is the address in y? I haven't allocated it yet. What's the address in x? It's wherever malloc told me to use space.
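Written out, the lines described so far look roughly like the sketch below. Two things are assumptions made only so that the sketch can run safely: the function name pointer_fun, and the fact that the store of 13 through y is commented out and replaced by one possible repair, pointing y at the same int as x.

```c
#include <stdlib.h>

// Hypothetical wrapper around the lines discussed; returns the int x points to.
int pointer_fun(void)
{
    int *x; // a pointer to an int, not yet pointing anywhere useful
    int *y; // another pointer, also holding a garbage address for now

    x = malloc(sizeof(int)); // ask for enough bytes to hold one int
    if (x == NULL)
    {
        return -1; // no memory was available
    }

    *x = 42; // go to the address in x and store 42 there

    // *y = 13; // unsafe as written: y was never given a valid address

    y = x;   // assumed repair: y now holds the same address as x
    *y = 13; // go to that shared address and store 13 there

    int result = *x; // x and y share one int, so this reads 13
    free(x);         // hand the memory back
    return result;
}
```

The only difference between the unsafe store and the safe one is whether y was given a valid address beforehand.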
That's safe, that was like 100, 900, whatever the value was, but did I allocate space for y? So what kind of value does it contain, so to speak? A garbage value. Maybe it's 0, maybe it's like 32,000-- we don't know, because if you don't specify the value, it is not safe to trust it or do anything with it. This is going to give me probably one of those segmentation faults. And indeed, if I run a program like this, I'm quite likely going to see exactly that kind of problem. It's perhaps better, though, to see this in a way that will paint a more memorable picture, and for that, I thought we'd, in our 10 minutes remaining, use a few of these minutes to take a look at something our friends at Stanford put together with a bit of claymation. It's about three minutes long, well worth it to paint a picture of exactly what goes wrong when you don't use memory correctly. If you could dim the lights. [VIDEO PLAYBACK] [MUSIC PLAYING] - Hey, Binky. Wake up! It's time for pointer fun! - What's that? Learn about pointers? Oh goody! - Well to get started, I guess we're going to need a couple of pointers. - OK. This code allocates two pointers which can point to integers. - OK. Well I see the two pointers, but they don't seem to be pointing to anything. - That's right. Initially pointers don't point to anything. The things they point to are called pointees, and setting them up is a separate step. - Oh right, right. I knew that. The pointees are separate. So how do you allocate a pointee? - OK. Well this code allocates a new integer pointee, and this part sets x to point to it. - Hey, that looks better. So make it do something. - OK. I'll dereference the pointer x to store the number 42 into its pointee. For this trick, I'll need my magic wand of dereferencing. - Your magic wand of dereferencing? That-- that's great. - This is what the code looks like. I'll just set up the number and-- [POP] - Hey look! There it goes.
So doing a dereference on x follows the arrow to access its pointee. In this case, to store 42 in there. Hey, try using it to store the number 13 through the other pointer, y. - OK. I'll just go over here to y and get the number 13 set up, and then take the wand of dereferencing and just-- [BUZZING] whoa! - Oh hey, that didn't work. Say, Binky, I don't think dereferencing y is a good idea, cause setting up the pointee is a separate step and I don't think we ever did it. - Mmm, good point. - Yeah. We allocated the pointer y, but we never set it to point to a pointee. - Mmm, very observant. - Hey, you're looking good there, Binky. Can you fix it so that y points to the same pointee as x? - Sure. I'll use my magic wand of pointer assignment. - Is that going to be a problem like before? - No, this doesn't touch the pointees. It just changes one pointer to point to the same thing as another. - Oh, I see. Now y points to the same place as x. So wait, now y is fixed. It has a pointee. So you can try the wand of dereferencing again to send the 13 over. - OK. Here goes. - Hey, look at that. Now dereferencing works on y. And because the pointers are sharing that one pointee, they both see the 13. - Yeah, sharing, whatever. So we going to switch places now? - Oh look, we're out of time. - But-- [END PLAYBACK] DAVID MALAN: All right. So hopefully that puts a little more visual behind some of these ideas, but let's now contextualize this in a domain that's perhaps more familiar in a couple of ways. So one, some of you might already know, especially if you've had prior programming experience, of a very popular website called Stack Overflow where lots of programmers post questions and hopefully answers to common technical problems. If you ever wondered why it's called Stack Overflow, it turns out it reduces to this picture here. This was not a mistake that I drew one arrow from the heap pointing down, and one arrow from the stack growing up. 
As you malloc, malloc, malloc more and more space, it starts up here, so to speak, and you just get more and more space that's going this direction. But the more functions you call-- function after function after function after a function, each of them gets its own slice or frame of memory, and that is growing up. So this feels like a pretty bad design, but honestly, it's not really avoidable because if you have a finite amount of memory, they can't avoid each other forever. And so there's this fundamental risk of overflowing the stack, or even overflowing the heap in the reverse direction. So Stack Overflow is an allusion to, for instance, calling too many-- many, many, many, many, many, many, many, many functions, so many so that it overlaps other chunks or segments of memory, thereby inducing a segmentation fault, and heap overflow is in the reverse direction, and these are more generally known as buffer overflows, and we'll see more of these in the weeks to come. But now that we have the ability to discuss pointers, let's introduce one final feature and then a familiar face. So it turns out that you can actually come up with your own custom data types kind of like we did with string, but even more sophisticated than that. For instance, if I wanted to implement a program that involves multiple students, I might do something like this. Ask the user what is the enrollment in a class, then go ahead and give myself an array of strings, a.k.a. char*s today, of that size, and then I could also have another array of dorms. And I could have two arrays, one for the students' names, one for the students' dorms, and I can keep track of other things. Another array for emails, another array for phone numbers-- but this gets messy quickly, because you can imagine, if I need names and dorms and emails and phones, that starts to become a lot of copy-paste.
And I just have this design where I have lots and lots of arrays where each bracket location-- like bracket 0, bracket 1 presumably refers to the same student across all of these arrays, like mmm! Messy, messy, messy design. So with a wave of my hand, let me actually fix that immediate problem out of the gate by introducing a new feature. I can invent my own data types. Let me just go ahead and declare an array called students with this many students, but of data type student. C comes with float, bool, char, int, not string, and definitely not student. So you can make your own custom data types, and you can put them in your own header files, which we've not done either. But I can look, and you'll see more of this in the next problem set. So not to worry if this feels quite brief, it's just meant to be a teaser here. And struct.h is how you declare or define your own type. The keyword is literally typedef struct for structure, or data structure to be more complete. The name of the data structure comes at the end after some curly braces. And then inside the curly braces you just specify, well what do you want a student to have? I want them to have a name, a dorm, maybe a phone number, maybe an email address, anything I want. I can just add here. So that now in my actual code, I can have an array of actual students, and I can just access them with this new notation like this. You know that you can index into an array with bracket notation. What you didn't know until now, perhaps, is that if at that location is a structure, a.k.a. struct, you can get at the name, the dorm, or the phone, or the email, or anything else there just by using a dot-notation, which is our last piece of new syntax for today. Everything else is the same. I can write a program that says so and so is in such and such a dorm by just saying get the i-th student's name and the i-th student's dorm. 
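A minimal sketch of such a type and of the dot notation is below. The fields name and dorm come from the discussion; using char * for them and the helper name print_roster are assumptions, since the lecture's own header is only shown on screen.

```c
#include <stdio.h>

// A custom type grouping the fields that belong to one student.
typedef struct
{
    char *name;
    char *dorm;
}
student;

// Hypothetical helper: prints one line per student in the array.
void print_roster(student students[], int enrollment)
{
    for (int i = 0; i < enrollment; i++)
    {
        // Bracket notation picks the i-th student; dot notation picks a field.
        printf("%s is in %s.\n", students[i].name, students[i].dorm);
    }
}
```

Compared with parallel arrays, each student's name and dorm now travel together in one array element, so index i can never refer to different students in different arrays.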
And I can be even fancier, and if I don't want to just print those values, I can even, now that I understand pointers-- or I've seen pointers and will soon understand them by way of problem sets and practice-- I can actually do this. This is just a little sneak preview of a line of code that uses a new function called fopen. fopen means file open, and it takes in the name of the file to open. You might know of CSV files, they're like simple spreadsheets, comma separated values. And quote-unquote "w" means write. So this says open the file called students.csv in write mode, so I can write to this file. Because in this example, as you'll see in the days to come, I want to write out to a file. But it turns out to use files, I need to know what a pointer is, and it's a little weird that it's all caps, but there is a data type in C called FILE, and what I get back is a pointer to it. So long story short, what you're going to see in the next problem set as we explore the world of forensics is the ability using pointers and a few new functions to open files and get back the address of that file in memory so that you can go to that address, change the contents of a file, and save it back out. All of us take for granted these days that you can go to File, Open and File, Save, but what's actually happening is that pointers are involved, stuff's getting loaded into memory, and the computer is dereferencing or going to those addresses and changing what's at those locations in memory. Now why might you want to do this? Well here, of course, is Zamila-- you might recall from some of the problem sets and the walkthroughs. Turns out we could try to enhance this picture of her by zooming in, and here's about as much fidelity as there is in her eyes. Like I do not see the glint of any criminal's logo on his or her jacket in the glint of Zamila's eyes. If you zoom in on an image, and an image, recall, from week 0 is just a grid of pixels or dots, that's all you get.
And you can maybe smooth it out a little bit or clean up the colors, but you can't just "enhance," quote-unquote, and see more of the glint in Zamila's eye, because an image at the end of the day is just a bitmap, a map-- top-down, left-right-- of pixels. For instance, here's a smiley face. If you kind of take a step back, you can kind of see a black smiley face against a white backdrop. And if we just decide as humans, let's represent white dots with 1's and black dots with 0's, this might be what's in the file, this is what the human sees. So if we have the ability to open that from a file, store it in memory, and then using pointers go to those locations in memory, we can even change the smiley face to an unhappy face, for instance, or color it or do any number of things to it. Now at quick glance, there's a lot going on in files, because what a file is is a set of conventions that humans decided on. Years ago, for a bitmap file, a BMP file-- so an older but still popular file format for images-- humans just decided that, like, we're going to put a bunch of special values at the first bytes of the file, then some more special values, then the actual RGB pixels in the rest of the file. So this is meant to look cryptic at first glance, and the next homework assignment will walk you through this, but all it is is a convention of what the 0's and 1's mean in these different locations. And indeed, the challenge ahead is going to be to do a number of things. One is to first and foremost figure out-- who done it?
A sort of murder mystery in which there's a clue hidden in an image, but an image that's a little noisy and you're going to have to figure out what secret message is in the image by loading that image in, tweaking it, putting a sort of red filter on top of it and seeing the secret message, but all digitally; two, actually resizing images and taking this many pixels in this big of a smiley face or something else and making it bigger, or if more comfortable, making it even smaller and figuring out how to make that work out; and then lastly, we've been taking some photographs of all CS50 staff in Cambridge and New Haven. Unfortunately we accidentally corrupted or lost the memory card, but we made a forensic image of it, a copy of all of the 0's and 1's with all of the staff photos, and we're going to need you to write code that actually recovers all of the JPEGs or photographs from that digital card by opening a file, reading in those 0's and 1's, understanding what they are and where they are, and just writing them back out to disk using functions we'll introduce you to in the problem set itself. But of course, all of this takes for granted that we can do this, and you can only do so much. And indeed, this week is as much about solving those problems as it is realizing the limitations of computers, and so we thought we'd end with the final few seconds of this very real example from Futurama. [VIDEO PLAYBACK] - Magnify that death sphere. Why is it still blurry? - That's all the resolution we have. Making it bigger doesn't make it clearer. - It does on CSI Miami. - Ugh. [END PLAYBACK] DAVID MALAN: And that's it for CS50, we'll see you next time. [APPLAUSE]