DAVID J. MALAN: This is CS50 and this is the start of week four. And, boy, is Volkswagen in trouble all because of software. Let us take a look. [VIDEO PLAYBACK] -Cars, the smartest characters in the Fast and Furious movies. This week German automaker Volkswagen found itself in the middle of a scandal of potentially criminal proportions. -Volkswagen is bracing for billions in fines, possible criminal charges for its executives, as the company apologizes for rigging 11 million cars to help it beat emissions tests. -Certain diesel models were designed with sophisticated software that used information including the position of the steering and vehicle speed to determine the car was undergoing emissions testing. Under that circumstance, the engine would reduce toxic emissions. But the car was rigged to bypass that when it was being driven. Emissions increased 10 to 40 times above acceptable EPA levels. [END PLAYBACK] DAVID J. MALAN: So let's take a look at this and see exactly how this might be implemented and how this might affect so many cars like this. So in my hand here are the press release that was issued by the EPA-- the Environmental Protection Agency which is the US regulatory agency that handles environmental concerns, and then the actual legal notice that was send to Volkswagen just a few days ago. So the EPA writes, and discloses now publicly, a sophisticated software algorithm on certain Volkswagen vehicles detects when the car is undergoing official emissions testing and turns full emissions controls on only during the test. The effectiveness of these vehicles pollution emissions control devices is greatly reduced during all normal driving situations. This results in cars that meet the standards in the laboratory or testing station, but during normal operation emit nitrogen oxides-- or NOx-- at up to 40 times the standard. The software produced by Volkswagen is a quote unquote, defeat device, as defined by the Clean Air Act in the US. They go on to say that the EPA and another agency uncovered the defeat device software after independent analysis by researchers at West Virginia University. NOx pollution contributes to nitrogen dioxide, ground level ozone, and fine particulate matter. Exposure to these pollutants has been linked with a wide range of serious health effects, including increased asthma attacks and other respiratory illnesses that can be serious enough to send people to the hospital. Exposure to ozone and particulate matter has also been associated with premature death due to respiratory related or cardiovascular related effects. Children, the elderly, people with preexisting respiratory disease are particularly at risk for health effects of these pollutants. Suffice is to say, it's quite serious. And let's go on to read just one more excerpt and then we'll take a look at the underlying implications of this in the context of a car. Specifically, Volkswagen manufactured and installed software in the so-called electronic control module-- or ECM-- of these vehicles that sensed when the vehicle was being tested for compliance with EPA emission standards. Based on various inputs including the position of the steering wheel, vehicle speed, the duration of the engine's operation, and barometric pressure, these inputs precisely tracked the parameters of the federal test procedure used for emission testing for EPA certification purposes. During EPA's emission testing, the vehicles ECM software ran software which produced compliant emissions results. At all other times, the vehicle ECM software ran a separate road calibration which reduced the effectiveness of the overall emission control system, specifically the selective catalytic reduction of the Lean NOx trap-- which we'll see about in a moment. As a result, emissions of NOx increased by a factor of 10 to 40 times above the EPA compliant levels depending on the type of drive cycle. So what this really means, and the source code to the software running on the Volkswagen's has not yet been publicly disclosed, is that, effectively, this equivalent is somewhere there inside of Volkswagen's code. If you are being tested, and if the car detects certain environmental factors like the steering wheel position or the movement or lack thereof of the car or any number of other factors that are currently hypothesized to be part of this formula, they simply turn on full emissions control. In other words, they start emitting less of the pollutants. Else, in every other situation when it's not detected as being in the laboratory, they just don't. And so you can simplify this into more concrete pseudocode with something like this. If the wheels are turning but the steering wheel isn't, suggestive that the car is on some kind of rotating cylinder but in some kind of warehouse being tested, then behave as the EPA would like you to. Otherwise do not. So let's take a look at a short video that takes a look at what the implications are of this actually mechanically. [VIDEO PLAYBACK] -Last Friday the EPA announced that some Volkswagen Audi cars made between 2009 and this year were using a so-called defeat device to get around emissions laws designed to keep the air clean. But what does that mean exactly? Well, modern cars have dozens of computers inside them. And some of those computers help coordinate the functions of the engine for optimum performance while making sure that there isn't too much garbage coming out of the exhaust pipe. They've actually been working this way for several decades now. Basically, every part of a modern car's engine has a sensor or controller on it, and these computers are reading in data thousands of times per second making adjustments like the ratio of fuel to air that's going into the cylinders. These cheating Volkswagen and Audi models are diesels, and diesels have one more really important computer controlled parameters, which is the amount of unburned fuel going into the exhaust. Now that sounds bad. Doesn't sound like you would want unburned fuel going into the exhaust. But in the case of a diesel, you have something called a NOx trap which is a device that absorbs and traps for nitrogen oxides that are pollutants that would otherwise go into the atmosphere. And the effect of that NOx trap is enhanced with unburned fuel. So a defeat device is a special program inside these computers that can make it look like the car meets emission standards even when it doesn't. Volkswagen had a problem on its hands. Its diesel engines were known for getting great fuel economy, but the NOx trap only works well when more fuel is being used. So the car would detect, using this defeat device, when it was getting an emissions test, it would use more fuel, make the NOx trap work well, emissions would be fine. But then you get on the road, the device turns off, you're burning less fuel but you're putting as much as 40 times more pollutants into the atmosphere. But how the heck did the car know that it was being tested for emissions compliance? The EPA says it was a sophisticated system that checked things like steering wheel position, speed, how long the engine was on, and even the atmospheric pressure. In other words, there was no way this was accidental because the software was designed very carefully to detect an official emissions test. That's some pretty serious deception and that's why Volkswagen is in such serious trouble. In fact, their CEO, Martin Winterkorn, just stepped down. So what happens next? Well, if you're one of the half million diesel Jettas, Beatles, Golfs, Passats, or Audi A3s effected, the good news is is that your car is still safe to drive. You don't have to put it away until Volkswagen issues a recall. But at some point they're probably going to have to update the software inside your car. When that happens you might get fewer miles per tank. Lawyers are already gearing up for class action lawsuits so owners might get compensated at some point in the future. But that's not going to happen any time soon. [END PLAYBACK] DAVID J. MALAN: So this actually raises an interesting bigger picture question as to trust. Right? All of us have iPhones or Androids or something in our pockets most likely these days, or laptops on our laps that are running software made by Apple and Microsoft and bunches of other companies. But how do we know that what these software products are doing is actually what these companies say they are doing? For instance, who's to say that every time you make a phone call on your iPhone or Android phone or the like, that that phone number also isn't being uploaded to some company's server because of some program you've written, whether it's the operating system itself like iOS or Android, or because you've downloaded some third party app that somehow is listening to everything you're typing in or everything you're actually saying. How do you know that, when you guys are running Clang or Make to compile your own software in CS50, how do you that CS50's own staff, by way of the CS50 library, hasn't been logging every string you've ever gotten or every inch you've ever gotten? Well, you could certainly look at the source code for something like the CS50 library, you could look at the source code for Linux operating system running on CS50 IDE. But an amazing presentation was given back in 1984 in receipt of the Turing Award by a very famous computer scientist known as-- named Ken Thompson who received the Turing Award which is sort of computer science's Nobel Prize, if you will, for his work on an operating system called Unix, which is very similar in spirit to what we use which is Linux. And the question he asked in his acceptance speech, essentially laying down the framework for years and years of discussion about trust and security, was this. To what extent should one trust a statement that a program-- a piece of software-- is free of Trojan horses? Perhaps it is more important to trust the people who wrote the software. And in fact, we've linked to the talk that he gave when accepting this award in the '80s on CS50's website under the Lectures page for today. Because what you'll see is that he actually gives a fairly simple example of how even a compiler like Clang or whatever compilers others have used in past, what if embedded in the compiler we ourselves are using is a little if condition that essentially says, if you notice that this code is using the GetString function or the GetInt function, go ahead and insert a back door or a Trojan horse such that that program now has some zeros and ones that do something malicious. Logging all of your keystrokes, uploading that data to some server, or really anything. And what Ken Thompson goes on to do in his talk is to demonstrate that even if you have access to the source code of a compiler that maliciously might be doing this, it doesn't matter because there's this chicken and the egg reality of the past many years whereby compilers are used to compile themselves. In other words, way back when someone had to have written the first compiler. And thereafter, any time they've updated a compiler by changing its source code, adding features and recompiling it for people like us to use, well, they're using the old version of the compiler to compile the new version of the compiler. And if you take a look at the talk that he gave, you'll see that because of that circularity, you can actually have bugs or Trojan horses embedded in software we're using. And even if you look at the source code for those programs, it might not even be evident because the trickery is actually in some older version of a compiler that ever since has been injecting the threat into our software. Which is only to say, we really can't and should not trust software running on our laptops or phones or any number of places. And in fact, later in this semester when we start talking about web programming and actually start building web applications ourselves, we'll talk about these threats and others. Now, you might have wondered and noticed that there was a tiny little Darth Vader in the clips that The Verge was showing there about Volkswagen. If you've never seen, I thought we should lighten the mood because this is all very depressing and frightening. I'm going to look back at Super Bowl 2011 when a commercial by Volkswagen-- and this almost makes them likable again-- aired for the first time on TV. It's the 60 second clip that I think you'll enjoy. [VIDEO PLAYBACK] [MUSIC - THEME FROM "STAR WARS"] [DOG BARKS] [CAR STARTS] [END PLAYBACK] DAVID J. MALAN: Yeah. I was just checking. That car is on the list of violations. All right. So we look at some pseudocode a moment ago. And here's a bigger snippet of pseudocode code that we've seen a few times thus far. And let's use this is an opportunity now to introduce a new programming technique that we did see algorithmically last week when we looked at merge sort. But let's formalize it and see how we might use it in actual code, and then we're going to use this technique down the road most likely to solve certain other problems. So this was one of the first programs we ever wrote, albeit in pseudocode code. And what this program allowed us to do course was to find Mike Smith in a phone book. And notice in particular lines eight and 11 which had this Go To statement. And in fact, certain languages, C among them, actually do have a statement that is literally go to that allows you to jump to a specific line. It's generally frowned upon because it can be very easily abused and you can start jumping your program all over the place as opposed to using the kind of logic and the control flow that we've used thus far with just loops and conditions and the like. But we can simplify this algorithm in pseudocode code as follows. Instead of this iterative or looping approach where we keep going back and back and back to line three, why don't we just kind of punt and more generally say in line seven and 10, just replace those two pairs of lines with, else if Smith is earlier in the book we'll search for Mike in the left half of the book. Else if Smith is later in the book, search for Mike in the right half the book. And notice already the circularity. Right? I'm searching for Mike in the phone book and then I eventually hit maybe line seven or maybe line 10 and my instruction to myself is search for Mike in half of the phone book. Well, how do I search for Mike? I'm in the middle of searching for Mike, why are you sort of sending me in a circle? But that's OK because what is happening to the size of the problem, as written in line 7 and 10? We're not just saying search for Mike, search for Mike. We're specifically saying what? Search for him in the left half of the right half which is effectively half the size of the problem. So it's OK that we're kind of engaging in this circularity, this circular argument, because at least we're making the problem smaller and smaller. And eventually we're going to reach that so-called base case where we have just one page left-- as our volunteer last week did-- we had one page left and then we don't have to keep searching for Mike Smith because he's either on that page or he is not. So how can we implement this idea, this sort of circularity in actual code? Well, we can leverage a technique that's generally known as recursion. And we've seen this in the pseudocode for merge sort last week. Recall that this was the pseudocode for merge sort. It's arguably even simpler than bubble or selection or insertion sort just in terms of the simplicity with which you can express it. But that's because we're sort of circularly saying, search for something by searching for it again. But we're searching either on the left half or the right half and then eventually we're merging in this case. But here, too, with those two sort lines, did we again have this idea of recursion. And concretely what this means, in the context of an algorithm, is that a algorithm is recursive if it uses or calls itself. Or in terms of C, a function is recursive-- a function called foo is recursive if foo, somewhere in its source code, calls the function foo itself. And that's bad if all foo ever does is call itself again and again. It's OK if foo eventually stops, as does merge sort, by saying, wait a minute, if this problem is super small, for instance, or I found him whom I'm looking for, just return. Don't recursively, don't cyclically call myself again. And so let's take a look at how this might actually work. So I'm going to go ahead and open up two source code examples here. One of which is called sigma 0. And this is not at all recursive, but let's take a look at what this program does. I've stripped out all comments from it but all of the source code on CS50's website has comments if you want to read through it again later. And let's do a couple of sanity checks here. So at the top of this code, we have include CS50.h. What does this do? Why is it here? In reasonable layman's terms. What does it do? Yeah. AUDIENCE: So that GetInt function works. DAVID J. MALAN: So that the GetInt function works. Because inside of this file, CS50.h, which we'll see before long in terms of its source code, has a bunch of functions declared-- GetInt, GetString, and a bunch of others-- and unless we actually have that Include line, the compiler Clang is not going to know that it exists. And same goes for line two where int is defined printf, which is a function we keep using quite a bit. Now, line four seems a little funky because it's just a one liner. It's got a semicolon, no curly braces, no code inside of it. But what did we call this thing in weeks past? Yeah. So a prototype. And why do we have a prototype which seems to be a little redundant typically because we usually see the function again later in the file, right? So why do we have-- you're just scratching your head but I'll take it. Yeah. AUDIENCE: [INAUDIBLE] function after the main. DAVID J. MALAN: Exactly. So that the compiler knows you will eventually define or implement that function after main, presumably. So Clang and most compilers are kind of dumb and they'll only know what you tell them. And if you want to use a function called sigma, you better teach the compiler that it exists in advance. Now, main itself, even though it's a bunch of lines, is pretty familiar hopefully by now. It's got a do while loop whose purpose in life here apparently is to get a positive integer from the user. And just keep pestering him or her until they cooperate. Then in line 16 I have an interesting call. IntAnswer. Which on the left hand side gives me an Int which can store-- called Answer-- which is going to store, apparently, the return value of sigma. So sigma is just an arbitrary but meaningful name that I've given to a function whose purpose in life is to take one argument-- we'll call it N in this case-- and just to take the sum of that number plus every positive number that's smaller than it. So if I pass in the number 2 to sigma, I want to add 2 plus 1 plus 0-- not 0-- so that gives me 3. If I pass in 3 to sigma, I want to have 3 plus 2 plus 1, which gives me 6. And so forth. So it just adds up all the numbers less than or equal to it. Now, down here I'm just going to print out the answer. So as a quick sanity check, let's make sigma 0-- dot slash sigma 0-- and let me type in 2. And I indeed get 3. Let me type in 3. I indeed get 6. And if anyone can do the math quickly, if I do 50 what am I going get? AUDIENCE: [INAUDIBLE]. DAVID J. MALAN: Well, no. But 1,275 which is pretty close. So this is the result of doing 50 plus 49 plus 48 plus 47 plus 46 all the way down to 1. So that's all sigma does. But let's see how we've implemented it now. So down here is the function itself. And this doesn't seem to have anything to do with recursion yet. In fact, we're using an old school technique. I'm initializing a variable called sum to zero, then I have a foreloop here, and I'm declaring an Int called I, setting it equal to 1-- though I could set it equal to zero, but since I'm doing addition, who cares if it's zero or one. It's going to have no effect. So I'm iterating so long as I is less than or equal to m, which is the argument that was passed in. And then I just keep incrementing I. And insight of the loop all I'm doing is doing sum plus equals I. And that's deliberate. I don't want to do, in this case, like sum plus plus. I want to actually add the current value of I which keeps getting bigger and bigger and bigger to the running tally. And then I return sum. And so answer gets the value sum. And then I print it out. So there's an opportunity here, though, to kind of simplify this code conceptually and the kind of blow one's mind in terms of the simplicity even though it takes a while to sort of appreciate why this is powerful in these small examples. Here is sigma one-- so the second version of this code. Everything up top is identical so that same story applies as before. But now let's look at the implementation of sigma which I've whittled down to just these lines-- four lines of code, really, plus some curly braces and white space. But what am I doing? If m is less than or equal to zero, I need to kind of handle that super simple case. And if you hand me zero or anything negative which is just weird, I'm just going to arbitrarily but consistently return zero. I don't want this thing to get into some weird infinite loop because of a negative value. So I'm just saying, if you give me zero or less, I'm returning zero. But that's good because that's that single page of the phone book that's left. I'm biting off a very specific problem and not calling something recursively. But in line 31, what do I seem to be doing? The parentheses are just keeping things, hopefully, a little clearer. But all I'm doing is I'm returning m-- whatever you hand me-- plus the value of m-- sorry, plus the value of sigma of m minus 1. So what does this mean? If you give me the number 3 as input, the answer I want to get ultimately is 6 because 3 plus 2 plus 1 gives me 6. But how do I think about how this code is running? The first time I call sigma and I pass in the value 3, that's like saying on a piece of paper, here's the value 3 and I've been passed this as sigma. 3 is obviously not less than 0 so the IF condition doesn't apply. The ELSE does. So what do I do? I want to return m, which is 3, plus sigma of m minus 1. So let me keep track of this. I'm going to put this piece of paper down. And what value, to be clear, am I going to pass into sigma at this point in the story? What number? 2, right? 3 minus 1 is 2. So I just need a little scrap of paper here. So now sigma is getting called again. And I've deliberately put this down because it's kind of like pausing that version of the story because now I'm focused on signal of m minus 1. So m was 3, m minus 1 is 2. So here is 2 that I've been passed. 2 is obviously not less than 0 so that case doesn't apply. Else I return m, which is this thing, plus sigma of what value? So if sigma of 1-- because m is right now 2 so 2 minus 1 is 1. So now I have just the value 1. I'm passing just the number 1 to the function sigma-- or myself here-- so 1 is obviously not less than zero, still doesn't apply. Else return 1 plus sigma of what? 0. So let me just remember that. I'll get back to that later. Now I'm going to go ahead and jot down the number 0 because that's my argument or parameter. I'm passed the number 0 and finally this process of just repeating myself ad nauseum does cease because what do I immediately do once I see this 0? I return zero. So now you have to rewind the story. If I now go backwards in time, what was the most recent thing I did if you were literally rewinding a video? I'm going to pick up the most recent 1 and that gives me 1 plus 0 is 1. If I keep rewinding the story, that's going to give me 2 plus this running value, which is 1. So that's 3. And then I'm going to keep rewinding. When I first put down the number 3-- so 3 plus 3 gives me 6. And now, if you've rewound the video up until this point, this was the very first question I asked. When passed 3, what is sigma of 3? It's indeed 6, the sum of all these pieces of paper. So if that takes a little while to wrap your mind around, that's fine. But consider it was a little-- it was very deliberate that I stacked these numbers on top of each other. It's kind of like having a memory-- a record in time, like a scrubber in a video, that I can indeed rewind in. And we're going to come back to that metaphor in just a little bit. But first, it turns out that there's a lot of geeks and funny people, I guess, at Google. Would someone who's very good at Googling mind coming up for just a moment and help me search for something? Very, very low key. Someone who's never come up before, perhaps. OK. Yeah? Come on. Come on down. What's your name? SAM: Sam. DAVID J. MALAN: Sam, come on down. This is Same. Nice to meet you. Hey. Come on over. So all I need you to do, if you could, Sam, here's Google. Can you search for the term recursion? Don't spoil. And now let's-- yeah. OK Click that. Better click that. Ahh, get it. No? OK. So let's do a couple others. Not so much related academically here, but have you ever searched Google for anagram? SAM: No. DAVID J. MALAN: OK. Search for anagram instead of recursion. How about askew. You ever searched for askew? Now, this one's a little hard to see but hopefully everything's-- OK. It's just you and me enjoying this. OK. So finally, this one's-- it's a little askew. Now do a barrel roll. Wonderful. All right. Big thank you to Sam. Here you go. Thanks. So what's going on in all of these silly examples? So really, underneath the hood of Google's millions of lines of code apparently is a few silly IF conditions that are essentially checking if the user has typed in this phrase, do something that probably took a nontrivial amount of time to implement just to be amusing in this way. But that's all it boils down to underneath the hood. But, of course, recursion is more of the geekier example among those special tricks. And surely there's others out there as well that we perhaps haven't even discovered just yet. So take a look, or consider now the following program, and certainly grab any of these on your way out. I'm going to go ahead and open up a program that's going to try to swap two values. But before we go there, let's do this. Could we get one more volunteer, I think? Would you like to volunteer? No? Come on up. Come on up. All right. So your name is what? LAUREN: Lauren. DAVID J. MALAN: Lauren. Come on up, Lauren. So Lauren is being challenged here as follows. Nice to meet you. So Lauren here has in front of her two empty cups. And we have some orange juice and some milk and we're going to go ahead and do the following. We're just going to fill this. A few ounces of milk over here and let's fill a little orange juice over here. And in front of all of these audience members, swap the two values of these cups. Put the orange juice in the milk cup and the milk in the orange juice cup. How would you do this if you were at home and had access to other supplies? LAUREN: Put it in another cup. DAVID J. MALAN: OK. So let's have a temporary variable, if we will. And go ahead now and implement this same swapping procedure. So, good. We've put OJ into the temporary variable, milk into the OJ variable, and now the temporary variable into the milk variable. OK. So very well done so far. So it turns out-- hold that thought for just a moment. Here, to just geek it up a bit, this would be the corresponding C code that we just implemented. We had two inputs, a and b, both of which we'll just say for simplicity are int's. And notice here, if I want to swap the values of two variables, a and b, we indeed need a middleman, a temporary variable, a temporary cup, into which the pour one of the values so that we have a placeholder for it. But then the code is exactly as Lauren here implemented. Now, just to get a little crazier, turns out that you can do this without a temporary variable. To do this properly, though, we're going to have to cheat with some chemistry. We have some extra cups here. So the closest thing that looks like milk and water perhaps-- or milk and OJ-- is we have some water, so we'll fill this one up with a few ounces of clear water. That's probably too much. Yeah. That's definitely too much. Hold on one sec. And now we have oil, which, as I recall from middle school chemistry class, hopefully it doesn't mix with water. But it kind of sort of looks like milk and OJ. So now, without using a temporary variable, can you swap those two values? So oils goes into the water cup, water goes into the oil cup. LAUREN: No other cups? DAVID J. MALAN: No other cups. And I've not actually tested this before this year so I don't know if this will actually work chemically. That wasn't supposed to happen. Is it working? All right. So separating? Good. Now we got to get the water into the other cup. Smarter chemistry concentrators could probably do this better than me. LAUREN: The water's on the bottom. DAVID J. MALAN: The water-- that was what's key the last time we did this. You have to do it in the right order. Yeah. That's-- OK. So now we have two cups of oil. OK. That's OK. But chemically if this worked than I-- LAUREN: This is water. DAVID J. MALAN: That's mostly water. All right. But that's still the same cup as before. So pour it-- try it over there. OK. This is a good use of class time today. OK. So now we-- nice. Sort of. All right. So very good. Thank you to Lauren. Very well done. So just to blow your minds, and this is perhaps something to play with if you like in CS50 ID, you can, in fact, swap two variables without using a temporary integer. And this is the corresponding C code. And if you recall from last Wednesday, we introduced, if briefly, some new operators in C. And does anyone recall what the little carrot symbol is, that little triangular symbol from the keyboard represents? What bitwise operator? AUDIENCE: EXOR. DAVID J. MALAN: EXOR. Exclusive Or. So if you want, just for fun at home, to give a and b two arbitrary values like any eight-- and I would choose an eight bit value. If you do this with 32 bits, you'll very quickly get bored. But just give a an eight bit value that's whatever, one or two, and give b a similar value. And then using the definition of XOR from last Wednesday, apply that bit by bit, each of those eight bits in each of a and b, and then do it exactly per this code. And it's not incorrect what you see here on the screen. It indeed boils down to three XOR operations and somehow magically a and b will exchange positions without losing any information. So the oil and water trick is the closest real world incarnation I could think of to mimic that. But it's surely easier to use a temporary variable, as in this case here. And this too is an opportunity say, too, this kind of micro optimization, as a computer scientist would say, while kind of fun to brag about how you did this without like swapping with an extra variable, it's not all that compelling. Because to save 32 bits, as in the case of an actual int, isn't all that compelling on a system where you might be using tens of megabytes or even more such memory these days. And in fact, when we get to a later problem set and you implement spell checker and you'll be challenged to do so with this as little RAM and as little time as possible on the computer-- you still have a week to implement it-- you'll have-- you'll be challenged to minimize those resources. And that's really the only occasion this semester where you'll be encouraged to shave off even the finest performance costs otherwise. So what-- how can we see this in actual code? Let me go ahead now and open up an example that deliberately is called No Swap because it does not in fact swap the variables as you actually might expect. So let's take a look. Here's a program that has no CS50 library going on, just standard I/O. Now we have a prototype for swap up top which just means it's got to be defined later. And here's main. I arbitrarily assigned x and y, respectively, the values one and two just because they're small and easy to think about. And then I just have a bunch of printfs where I have a sanity check. x is 1 and y is 2 is presumably what those printfs will say. So no magic thus far. Then I'm going to claim with print def, swapping dot dot dot. I'm going to call the swap function, passing in x and y. And let's assume for now that swap is implemented exactly as it was a moment ago with a temporary variable. And so I claim boldly, swapped. x is now this and y is now that. But the file, of course, is called No Swap. So let's actually see what happens. If I compile no swap and then do ./noswap , x is 1, y is 2. Swapping swapped. x is 1, y is 2. So it actually seems to be flawed even though swap-- let's scroll down now-- is implemented exactly per the code I proposed a moment ago. So we're not going to get fancy with the XOR stuff for now. This, too, should work just like with the milk and OJ, but it doesn't seem to be working. So let's do this again. Maybe I just wasn't running it right. So let's run No Swap again. Maybe I-- no. So it's just not working. So let's do a little sanity check. Let me go ahead here in Swap and just add, wait a minute, a is %i/n and let's plug-in the value of a. Because I really want to see what's going on. And indeed, this is a debugging technique that you might be using in office hours or at home already, akin to the first half of Dan Armendariz's video in PSET3 wherein we introduced print def as a recommended technique, at least for simple cases. Let me go ahead and run make no swap again, ./noswap. Interesting. So notice what seems to be true. x is 1, y is 2, but a is 2 when b is 1. So those two somehow got swapped but x and y are not getting swapped. So to be clear, what's happening is, up here I have x and y and those are variables local in the scope of main, I'm passing in x and y to swap. Now, swap, as a separate function, is free to call its arguments or its parameters anything it wants. Foo or bar or x or y or a or b. Just to make clear that they're not identical to x and y per se, I've said a and b. But we could call them anything we want. And so it looks like swap is being passed x-- AKA a-- and it's being passed y-- AKA b. Somehow these three lines are swapping those values exactly as Lauren did with the milk and OJ. But when we print out the values, a and b are indeed swap but x and y have no change to them. Recall that x and y are up here. So we can see this via another technique as well. And this too is a technique embedded in problem set three. Let's go ahead and do this in CS50 ID if you haven't already. On the right hand side we have this Debugger tab. And if you open this up, there's some arcane information that's thrown at you initially. But let's tease this apart real fast. So one, you see local variables. Turns out that build into CS50 IDE, and a lot of programming environments more generally, is a debugger. A tool that allows you to visually see what's going on inside of your program without having to resort to adding printfs and compiling and running and adding printf's and compiling and running, which already, in office hours or home, is probably getting pretty tedious. So here, in just a moment, we're going to to see in real time the values of our local variables. We're also going to be able to set what are called breakpoints which are opportunities in my program to pause execution at a specific line of code that I'm curious about. Right? These programs run in a split second. It's kind of nice for us slower humans to be able to pause, take a moment, see what's going on around a certain line of code without the program plowing through it and finishing entirely. So a breakpoints going to allow us to break and pause at a certain point. Call stack is a fancy way of saying what functions are currently being called at the moment. Main is always called first. But if Main calls a function called Swap, we're actually going to see this tower of functions that have been called in reverse chronological order. So let's see that. I'm going to zoom out. I'm going to go back to my code. And just because I want to be pedantic here, I'm going to go ahead and click just to the left of line five. And that creates a red dot. And notice on the right hand side that the debugger knows, hey, I just said a breakpoint at noswap.c line five, specifically at this line of code. So the debugger knows that I have requested that the next time I run my program it pause execution there rather than just running the whole thing super fast. So now I'm going to click the Debug button at the very top of the IDE and that's going to do the following. It's going to open an initially somewhat scary looking second terminal window-- remote debugging from host such and such-- and we'll come back to what all that means before long. But what's important for now is that that red dot was hit, the debugger has deliberately paused execution-- not on that line per se but on the first line of actual code in that function. And that's why line seven is now highlighted in yellow. And now let's take a look at the right hand side. It looks like, by default, nicely enough, x has what value? 0. And y has what value? Zero. And that's to be expected in the sense that x and y-- that yellow line-- has not executed yet. So x should not have the value 1. It might have any other value, a so-called garbage value. And we got lucky in that it's zero at this point, essentially. So now there's only a few buttons we need to care about when debugging in this way. Notice here, we have a Play button. And if we play or hit resume, that's just going to run through the rest of the program or until it hits another breakpoint. But I've not set any other breakpoints so it's just going to run through the end. That kind of defeats the purpose of poking around. So instead, I care about these icons to the right. And if I hover over them, as you should too, you'll see little tips-- tool tips. This one is step over. Now that doesn't mean skip the following line of code. That just means execute it and move to the next, move to the next, move to the next. In other words, via that button, can I walk through my code one step at a time. Line by line, literally. Now, to the right of that, there's another one that we'll see in just a moment. This is the so-called Step Into icon that's going to allow me dive into another function. But let's see this in just a moment. So I'm going to click step over. And now notice, as I click this button at top right, keep your eyes roughly under Local Variables and see what happens to x. x is now 1 because the yellow line has now executed and we've moved on to line 8. And in just a moment y should hopefully become 2. Now, nothing that interesting happens for a bit. All this is is printf. And notice, in my secondary terminal window, I see the output of print def. And now I have to make a decision as the programmer. I can step over this line of code, executing it but not getting curious about what's inside. Or I can actually step into it and go inside of Swap itself. So let's do the latter. Let me go ahead and click not Step Over but Step Into. Notice, all of a sudden the window changes to highlight the first line of code in Swap. That's line 21. And now, what's kind of funky is that, if you look over here, as expected, a comma b is 1 and 2, respectively. Why is temp 32,767? Recalling that temp, much like the empty cup a moment ago, is declared here on line 21. Why 32,000- I mean, why is it just some weird value? Yeah? AUDIENCE: It's not initialized. DAVID J. MALAN: It's not been initialized. So our computer always has physical memory. It always has physical RAM. And there's always zero's and one's in there, right? Because we're using our computer all day long, you're using the CS50 IDE or the servers all day long. So that RAM either has some zeros or some one's or some zeros and ones. No matter whether or not you're using them. You can't just have blank spaces where you want bits. They're either zeros and ones. So it turns out that temp, because we've not initialized it yet, we have those 32 bits but they've not been initialized to any known values. So whatever they were most recently used for-- those 32 bits-- we're just seeing the artifacts of some previous use of those particular 32 bits. As soon as I click Step Over though, phew, temp is going to get the value 1. And if I do it again, a is going to be given the value 2 and then b is going to be given the value 1. And so what's nice now at this point in the story is that the debugger is showing me, super slowly at my own pace, what the state of Swap is. But notice at the top here, notice that the call stack actually has two layers to it. Now the one that's highlighted as Swap, if I click on Main instead, notice how the local variables change because the developer can just hop around and go into any different scope. So even though we're doing all of this work and correctly swapping a and b, if I go back and forth between Swap where a is 2 and b is 1 and Main, has Main been affected at all? No. So what's the takeaway here? Well, it turns out that any time you call a function like Swap, and you pass it arguments, what you're passing to the Swap function in this case is a copy of those arguments. So if x and y are each respectively 32 bits, what Swap is getting is two new local variables, or arguments, called a and b-- but those are arbitrary names-- but the pattern of zeros and ones inside of a and b are lined up to be identical to x and y but they are not the same thing as x and y. It's as though Main has on its piece of paper the number 1 and 2 for x and y, and then when it hands that piece of paper to Swap, Swap very quickly gets its own pen, writes down 1 and 2 on its own sheet of paper, hands back the original xy to Main and then does its own thing with a and b. And this is now super important because this has nontrivial implications for actually writing correct code because it would seem we cannot swap two variables. I have written a correct Swap function. We've implemented it with Lauren as a correct swap function in reality, but apparently none of that matters if you can't actually swap two values permanently. So we need another way to actually get at this, and we need to be able to actually solve this problem. And it turns out-- and we'll come back to this particular picture before long-- this is one way that you might draw your computer's memory. It's just a rectangle. You could draw it any number of ways but it's convenient to draw it as a rectangle for the following reason. We're going to start today and beyond talking about the so-called stack. And the stack is just a chunk of RAM-- a chunk of memory-- that functions have access to when they're called. And so it turns out that at the very bottom of this stack is where all of Main's local variables and org C and org V and all that stuff are going to go by default. And if Main calls some other function like Swap, well, Swap is going to get another layer of memory up above it. And so just to give you a quick cursory picture of this, if I go over here-- and let me mirror this on the overhead as well-- what really I have, if we care only about the bottom of this picture for now, is that when I run a program and Main gets called, Main is given a chunk of RAM in my computer that is at the bottom of this so-called stack. And I'm going to draw it deliberately as a square. So it's like 32 bits or four bytes. And if this main function has a variable called x with a value of 1 and it has a variable called y with the value of 2, that's like taking this sliver of memory that Main has been given by the operating system and dividing it up so that the first local variable goes here, the second one goes here, and that's it. When Main calls Swap, Swap gets its own slice of memory that we'll draw like this from the operating system, and it's going to have its own local variables based on our implementation earlier with local variables a and b that initially get the values 1 and 2. But then, as soon as the Swap code executes, and Lauren actually swaps the OJ and milk, what's happening? Well, this 2 is becoming a 1, this 1 is becoming a 2, and, by the way, there is a temp variable that's being used that whole time that eventually goes away. But it doesn't matter how much work you do in this line of-- in this memory space, x and y are completely untouched. So we need some way of giving Swap and functions like it secret access, if you will, to functions like-- to memory like x and y. So let's take a look at an example that helps us see exactly what's been going on this whole time. I'm going to go ahead and open up Compare Zero. And I'm going to close our debugger, I'm going to close this scary looking message the just says, wait a minute, you're in the middle debugging. I'm going to hide this tab here just to go back to simplicity. So don't worry if GDB is killed. That just means that the program has been quit, deliberately in this case, by me. And now Compare Zero does this. I'm using the CS50 library in standard I/O. I've got a main function that first says, say something, and gets a string. Then says it again and gets another string. And notice that these two strings are called s and t, respectively. And now this program, Compare Zero, its purpose in life, it's supposed to tell me, did I type the same thing? And so I'm going back to week one. I'm using my equal equal operator which is the quality operator. Not the assignment operator, the equality operator. I'm just comparing s and t. So let's actually go ahead and do this. And I'm going to go ahead and make Compare Zero. I'm going to do ./comparezero. And I'm going to go ahead and say something like, let's do mom in lowercase and how about mom in uppercase. And of course I type different things. All right. That's to be expected. Let's run it again. Both times do lowercase, lowercase. That looks super identical to me. Enter. OK. Maybe it's just weird because it's not liking my grammar. So let's do a capital MOM, capital MOM, identical. Different things. So why is that? Well, what's actually going on underneath the hood here? So let's go back over here for just a moment and consider what GetString is actually doing. When you call GetString, that's a function we ourselves wrote and it somehow gets a sequence of characters from the user. And let's assume that the first time I call GetString, that gives me a chunk of memory that looks like this. And if I typed in all lowercase m-o-m-- and what goes after it? Just a quick sanity check. Backslash zero. We know that. And recall that we played around with Zamila's name and a bunch of other names when Rob was here looking at what's going on inside of memory. So that story's exactly the same. This is what GetString is returning to me. Now, my code a moment ago stored the return value of GetString in a variable called s. And then the second time I called it, it stored it in a variable called t. So if I go over here, I need to draw this local variable-- and I'm generally going to draw a string as just-- we'll call it s-- as a little square here. And now, somehow-- how does mom go inside of this variable s? Well, we need to go back to first principles here. What is GetString actually returning? So it turns out that M-O-M backslash zero, and any number of other strings in memory like Zamila and Rob or Andy or any others, are of course in our computer's RAM or memory. And your RAM has like-- you have a gig of RAM, two gigs of RAM, or a billion or two billion bytes, or maybe even more these days. So let's assume, for today's purposes, that it doesn't matter how we number them, but we can number each of those billion or two billion or four billion bytes. And let's just arbitrarily say that this is the first bite, second bite, third, fourth. I'm deliberately not using zero for today but we'll come back to that. So in other words, if this is the very first time I'm using the program, I'm just getting lucky and the first bite is at location one then two then three than four. And if I kept drawing, box number two billion would be way over here. So what do you think, then, GetString actually returns? It's not returning M-O-M backslash zero per se because that clearly won't fit in the box that I've drawn. So what else might GetString actually be returning all these weeks? The answer is on the board here somewhere. You can't fit M-O-M backslash zero, so what might make sense instead? If you had to be super clever, putting on the so-called engineering hat, what could you return? What's the least amount of information you could return that would still let you find M-O-M in memory? Yeah? AUDIENCE: One. DAVID J. MALAN: One. And why one? AUDIENCE: Because it would tell you where to go [INAUDIBLE]. DAVID J. MALAN: Exactly. I am just going to return the address of the string that I have gotten. The address in this case is location one. So what really is being stored in s-- and every string variable thus far-- has just been the address of that string. Meanwhile, if I call GetString a second time and I type in literally the same thing-- M-O-M with lowercase-- M-O-M and another backslash zero, and now maybe my program's been running for some time so maybe this is 10, this is location 11, this is 12, this is 13. The computers using some other memory for whatever reason. What now goes in my second variable in my program t? 10. Exactly. And so when we look at the source code of this program where I'm simply trying to compare the two values, is s equal equal to t, what's the obvious human answer? Just no because 1 does not equal 10. And so herein lies an opportunity for us really to just go back to, again, first principles and think about, well, what's going on underneath the hood? We've been talking about bits and bytes and memory, but it's actually useful to understand because when you call GetString, even though we think of it is returning M-O-M or string mom or Andy or Zamila or the like, technically it's just returning the address of that chunk of memory. But that's OK. Because how do I know where the string ends? If I'm only given the beginning? Well, the backslash zero, right? Just in linear time I can print out with print def M-O-M. And as soon as I see backslash zero, I don't care where I started, I already know implicitly where I need to end. And so today marks the beginning-- and let me do this dramatically because we went through a lot of trouble to get these here training wheels-- so today the training wheels start to come off and we reveal at least-- [APPLAUSE] That was well worth the trip to Target this morning, yes? So now-- there is, it turns out, no such thing as string. String does not exist. It's a synonym that we've had inside of the CS50 library. Henceforth, we're going to start calling s and t not strings but char stars. And the char star we'll tease apart before long. But this is to say, that even if we continue using GetString for now, technically I should be saying char star and char star. And it turns out what that star is going to denote is something called a pointer or an address. And in fact, a teaser for what lies ahead is this 20 second clip from our friend Nick Parlante at Stanford who, quite some time ago, spend a ridiculous amount of time, as best I can tell in his kitchen or his basement, making claymation introducing to the world a character named Binky with whom we will be introduced next time to pointers. So here is a preview of what's to come. [VIDEO PLAYBACK] -Hey, Binky. Wake up. It's time for pointer fun. -What's that? Learn about pointers? Oh, goody. [END PLAYBACK] DAVID J. MALAN: And on that note, we will see you on Wednesday. All right. Who's dancing? Come on. Who's dancing? You want me to get it started? I'll get it started. Woooo! LAUREN: Sweet fancy Moses.