DAVID J. MALAN: All right. So welcome to the first ever CS50 postmortem for a quiz. We thought we'd inaugurate this tradition this year. And this will be an opportunity to walk through the solutions to the quiz. And we'll speed up or slow down based on interest of those here. So you're probably here because you're interested in how you could have or should have answered some of these problems. So why don't we take a look at this section first? So getting strings. This gave you three different versions of a program that was, ultimately, meant to get a string from a user. Whether or not it did that was left to you to determine. And we asked in Question 0, suppose that version 1 is compiled and executed. Why might the program segfault? At first glance, any suggestions as to why? Yeah. AUDIENCE: So I remember seeing this in a previous example of looking at the char* s and seeing the scan of the s and seeing because it's a pointer, how did it affect what you scanned in? Is it s or the address of s? DAVID J. MALAN: OK. Good. So ultimately, the source of any problem is presumably going to reduce to that variable s. And it's indeed a variable. The data type of that variable is char*, which means it's going to contain the address of a character. And therein lies the insight. It's going to contain the address of a character or, more generally, the address of the first character in a whole block of characters. But the catch is that scan s, purpose in life, is given an address and given a format code, like %s, read a string into the chunk of memory at that address. But because there's no equal sign before that semicolon on the first line of code, because we don't actually allocate any memory with malloc, because it didn't actually allocate an array of some size, all you're doing is reading the user's keyboard input into some complete garbage value, which is in s by default. So odds are you're going to segfault if that address doesn't just so happen to be a value that you can, in fact, write to. So bad not to allocate your memory there. So in question 1, we asked, suppose that version 2 is compiled and executed. Why might this program segfault? So this one is less buggy. And there's really only one obvious way where you can trigger a segfault here. And this is thematic. Any time we're using c in memory, what could you do to induce a segfault with version 2? AUDIENCE: If you use that input in a string that's longer than 49 characters. DAVID J. MALAN: Exactly. Any time you see something fixed length when it comes to an array, your radar should go off that this could be problematic if you're not checking the boundaries of an array. And that's the problem here. We're still using scanf. We're still using %s, which means try to read a string from the user. That's going to be read into s, which, at this point, is effectively the address of a chunk of memory or it's equivalent. It's the name of an array of characters of memory. But exactly that, if you read a string that's longer than 49 characters, 49 because you need room for the backslash 0, you're going to overflow that buffer. And you might get lucky and be able to write a 51st character, 52nd, 53rd. But at some point, the OS is going to say, no. This definitely is not memory you're allowed to touch. And the program is going to segfault. So there, the heuristics should be any time you've got fixed length, you have to make sure you're checking the length of whatever it is you're trying to read into it. AUDIENCE: So to solve that, you could have had a statement checking actually is the length greater than or less than? DAVID J. MALAN: Absolutely. You just have a condition that says, if the-- or rather you don't necessarily know in advance how many characters the user is going to type, because you have chicken and the egg. Not until you've read it in with scanf can you figure out how long it is. But at that point, it's too late, because you've already read it into some block of memory. So as an aside, the CS50 library avoids this issue altogether, recall by using fgetc. And it reads one character at a time, tip-toeing along, knowing that you can't overflow a character if you read one at a time. The catch is with getstring recall is that we have to constantly re-size that chunk of memory, which is just a pain. It's a lot of lines of code to do that. So another approach would be to actually use a cousin, so to speak, of scanf. There are variants of a lot of these functions that actually check the length of how many characters you might read maximally. And you could specify, do not read more than 50 characters. So that would be another approach but less accommodating of larger inputs. So question 2 asks, suppose that version 3 is compiled and executed. Why might that program segfault? So this one is actually the same answer, even though it looks a little fancier. We're using malloc, which feels like we're giving ourselves more options. And then we're freeing that memory at the end. It's still just 50 bytes of memory. So we might still try to read in 51, 52, 1,000 bytes. It's going to segfault for exactly the same reason. But there is another reason too. What else could malloc return besides the address of a chunk of memory? It could return null. And because we're not checking for that, we might be doing something stupid for another reason, which is that we might be telling scanf, read the user's input from the keyboard into 0 location, AKA null. And that, too, will definitely trigger a segfault. So for the quiz's purpose, we would have accepted either of those as a valid reason. One is identical. One is a little more nuanced. Lastly, with respect to the program's use of memory, how do version 2 and version 3 differ? So for what it's worth, we saw a seemingly endless supply of possible answers to this. And among people's answers, what we were hoping for, but we accepted other things, was some mention of the fact that version 2 is using the so-called stack. Version 3 is using the heap. And functionally, this doesn't really make all that much of a difference. At the end of the day, we're still just getting 50 bytes of memory. But that was one of the possible answers that we were looking at. But you'll see, as you get your quizzes back from the TFs, that we did accept other discussions of their disparate uses of memory as well. But stack and heap would have been an easy answer to go with. Any questions? I give you Rob. ROB BOWDEN: So problem 4. This is the one where you had to fill in the number of bytes out of all these different types used. So first thing we see. Assume a 32-bit architecture, like this CS50 appliance. So one of the fundamental things about 32-bit architectures, that tells us exactly how big a pointer is going to be in the architecture. So immediately, we know that any pointer type is 32-bits or 4 bytes. So looking at this table, a node* is a pointer type. That's going to be 4 bytes. Struct node*, that's literally identical to node star. And so that's going to be 4 bytes. String, so it doesn't look like a pointer yet, but the typedef , a string is just a char*, which is a pointer type. So that's going to be 4 bytes. So these three are all 4 bytes. Now, node and student are a bit more complicated. So looking at node and student, we see node as an integer and a pointer. And student is two pointers inside of it. So at least for our case here, the way that we end up calculating the size of this struct is just add up everything that's inside the struct. So for node, we have an integer, which is 4 bytes. We have a pointer, which is 4 bytes. And so one node is going to take up 8 bytes. And similarly for student, we have a pointer that's 4 bytes and another pointer that's 4 bytes. So that's going to end up being 8 bytes. So node and student are 8 bytes. And these three are all 4 bytes. Questions on that? Yes. AUDIENCE: Is it was a 64-bit architecture, would that double all of them? ROB BOWDEN: It wouldn't double all of them. So 64-bit architecture, it, again, changes that fundamental thing that a pointer is now 64 bits. Yeah. So a pointer is 8 bytes. So these that were 4 bytes are going to be 8 bytes. A student, which was two pointers, well, now it's going to be 8 bytes, 8 bytes. It's going to make 16 bytes. But a node is still 4 bytes. So this pointer is going to be 8 bytes. This is 4 bytes. So a node is only going to be 12 bytes. Any other questions on that one? So the next one, these are the HTTP status codes. And you had to describe circumstances under which these might be returned to you. one problem that I heard some students have is that they tried to make the errors be on the client's end. So when we try to make the request to the server, something goes wrong on our end. But generally, these codes are being returned by the server. So we want to figure out what's going wrong or right on the server that causes these things to be returned. So why might a server returns status code 200? Any thoughts? Yeah. So something about successfully the request went through. And they're able to return whatever you asked for. So everything was fine. What about 302 found? Yeah. AUDIENCE: The server was looking for what you requested. But it couldn't find it. So there's an error. ROB BOWDEN: So the server was looking for what you wanted. So just looking here, 302 found, it was able to find it. AUDIENCE: I'm sorry. Found means that they did find it. Sorry. ROB BOWDEN: So 302 found. The server is able to find what you wanted. AUDIENCE: But it's not displaying it? ROB BOWDEN: The difference between this 302 and 200 is that it knows what you want. But it isn't exactly where you wanted to ask. So 302 is a typical redirect. So you requested a page. It knows, oh, I want to return you this. But this is at a different URL. So hey, you actually want this. DAVID J. MALAN: It's a piece that said that we gave you guys a redirect function that used the header function that, in turn, printed out location, colon, and then the URL to which you want to reject the user. Even though you didn't see 302 explicitly there, that is what PHP would magically insert as the header saying exactly what Rob said there-- found. But go here instead. ROB BOWDEN: OK. So what about 403 forbidden? AUDIENCE: I think it's that the server is basically saying that the client can't access the home page. ROB BOWDEN: So yes. Well, the typical answer we were expecting is something like, the files aren't chmodded appropriately. That's probably under what circumstances you saw them. But there is a reason that the client could be at fault here. There's actually another status code-- 401. So these are very similar. 401 is unauthorized. And 403 is forbidden. And so unauthorized you exclusively get if you're not logged in. But logging in might mean that you are authorized. But if you're already logged in and you still don't have permission, then you can also get forbidden. So if you are logged in and do not have permission, forbidden is also something you can get. DAVID J. MALAN: And the mechanism by which these problems are usually solved on the server is via what command? Chmod, if it's, indeed, a permissions issue on the file or directory. ROB BOWDEN: Then 404 not found. Yeah. So unlike 302 where it wasn't exactly where you're asking but it knows what you want, this, it just has no idea what you want. And you are not requesting something valid. 418 I'm a teapot and then 500 internal server. So why might you get that? So segfault-- I actually don't know the grading standard for this. But if your PHP code had something wrong in it, in theory, it could actually segfault, in which case, this 500 internal server error, something is wrong with your server's configuration. Or there's a syntax error in your PHP code. Or something bad is going on. DAVID J. MALAN: We did see segfault among a few people's answers. And technically, it could happen. But that would be a PHP, the program written by other people, actually segfaulted, which only if those people screwed up and wrote buggy code in their interpreter would PHP itself segfault. So even though 500 is like a segfault in spirit, it's almost always the result of a configuration file issue with your web server or, as Rob said, a syntax error, like you didn't close a quote. Or you lost a semicolon somewhere. AUDIENCE: So for the Shuttle pset, I think when I did it once I clicked the browser, but nothing came up, what they called white page. But it was because of the code. I think that was JavaScript, right? ROB BOWDEN: Yeah. AUDIENCE: Would that error still come up? ROB BOWDEN: So you wouldn't have gotten this error because everything from the web server's perspective was completely fine. But you requested index.html. You requested shuttle.js and service.js. And it was able to successfully return to you all of those things-- 200. OK. It's only when your browser tried to interpret the JavaScript code that it's like, wait, this is not valid JavaScript error. Any other questions? All right. DAVID J. MALAN: So next up was number 11. And 11 was the scariest for a lot of people. So the most important thing to note here was that this was, indeed, about a doubly linked list. But this was not the same as last year's doubly linked list problem, which did not give you the caveat that the list could, in fact, be unsorted. So the fact that the list was unsorted and the fact that that word was underlined there was meant to convey that this is actually a simplification of what otherwise would have been a more challenging problem and a longer one. So a common mistake here was to have put last year's solution on your one pager and then just blindly copy that down as the answer, which is the right answer to a different question similar in spirit. But the subtleties here were as follows. So one, we have a node declared and defined in the usual way here. Then we defined list of be a global pointer initialized to null. Then apparently, there's two functions we have prototypes for here, insert and remove. And then we have some sample code here of doing a bunch of insertions. And then we ask you to complete the implementation of insert below in such a way that it inserts n into the list in constant time, also underlined, even if already present. So the beauty of being able to insert in constant time is that it implies that you have to insert the new node where? Into the front. So it eliminates, thankfully, at least one of the cases that used to require even more lines of code, like it did last year and even in class when we talked through this kind of thing with humans and with some verbal pseudo code. So in the solution here, let's skip over to that just to have a visual on the screen. Notice that we're doing the following. And also notice the other simplification was that even if it's already present, so this means even if the number is already there, you can just blindly insert another copy of it. And that, too, was meant to be a simplification, so that you could focus on, really, some of the more intellectually interesting part and not just some additional error checking given the limited time. So in this sample solution, we allocate a pointer on the left-hand side here to a node. Now, realize that pointer, as Rob said, is only 32 bits. And it doesn't actually contain an address until you assign it the address. And we do that on the right-hand side via malloc. Like a good citizen, we check that malloc is not, in fact, null, so that we don't accidentally create a segfault here. And any time you use malloc in life, you should be checking for null, lest you have a subtle bug. Then we initialize that null by assigning n and previous and next. And in this case here, I initialized previous to null, because this new node is going to be the new beginning of my list. So there's going to be nothing before it. And I want to essentially append the existing list to the new node by setting next equal to list itself. But I'm not done just yet. So if the list itself already existed, and there was at least one node already in place, if this is the list here and I insert a new node here, I need to make sure that my former node points backwards to my new node, because this is, again, a doubly linked list. So we do a sanity check. If list is not null, if there's already one or more nodes there, then add that back reference so to speak. And then the very last thing we need to do is actually update the global variable list itself to point to that new node. Yeah. AUDIENCE: In the pointer arrow [INAUDIBLE] equals null, does that deal with the list because the list is null? DAVID J. MALAN: Nope. That is simply me being proactively careful, in that if this is my original list with maybe some more nodes over here and I'm inserting my new node over here, there's going to be nothing over here. And I want to capture that idea by setting previous to null on the new node. And presumably, if my code is correct and there's no other way to insert nodes other than this function, presumably, even if list already has one or more nodes in it, presumably the list, the first node, would have a previous pointer of null itself. AUDIENCE: And just a follow-up. The reason you put pointer next equals list is you're making the pointer before list in that it's pointing to the next, I guess-- I don't-- just lists? DAVID J. MALAN: Exactly. And so let's actually consider two cases here really, even though the order we'll consider them isn't quite the same as the code. But on a high level, if this represents list and this is a 32-bit pointer, the simplest scenario is that this is null by default. And suppose I want to insert the number 50 was the first number. So I'm going to go ahead and allocate a node, which is going to contain three fields-- n, previous, and next. I'm going to put the number 50 here, because this will be n. This will be next. And this will be previous. And so what do I do in this case? Well, I've just done line 1 here. Pointer n gets n. I'm then saying, previous should get null. So this is going to be null. Then I'm going to say next is going to get list. And this just works out well. This is null. And so I'm saying, the new node's next field should get whatever this is. So that puts another null there. And then the last thing I do is check here. If list is not equal to null, but it is equal to null, so we skip that altogether. And so all I do next is list gets pointer, which pictorially results in a picture like that. So that's one scenario. And the one that you were asking about specifically is a situation like this, where we already have a one-node list. And if I go back up in the original problem statement, the next we'll insert say is 34, just for the sake of discussion. So I'm going to just conveniently draw that over here. I've just malloced. Let's assume I'm checking for null. Now, I'm going to initialize n to be 34. And this will be n. This will be next. And this will be previous. Let's make sure I didn't get this backwards. Previous comes first in the definition. Let me fix this. This is previous. This is next. Even though these are identical, let's keep it consistent. Previous. This is next. So I've just malloced my note, checked for null, assigned 34 into the node. Previous gets null. So that gives me that. Next gets list. So list is this. So this is the same now as drawing this arrow, so that they point to one in the same. And then I'm checking if list is not equal to null. And it's not this time. Then I'm going to do list previous gets pointer. So list previous gets PTR. So this has the effect of putting a graphical arrow here. And that's getting a little wavy, the lines. And then, lastly, I update list to point to pointer. So now this points to this guy. And now, let's do a quick sanity check. Here's the list, which is the global variable. The first node is, indeed, 34, because I'm following that arrow. And that's correct because I want to insert at the beginning of the list all new nodes. His next field leads me to this guy. If I keep going, I hit next is null. So there's no more list. If I hit previous, I get back where I expect. So there are still a few pointers, obviously, to manipulate. But the fact that you were told to do this in constant time means you only have a finite number of things you're allowed to do. And what is that number? It might be one step. It might be two. It might be 1,000 steps. But it's finite, which means you can't have any kind of looping going on here, no recursion, no loops. It's just got to be hard-coded lines of code as we have in this sample. So the next problem 12 asked us to complete the implementation of remove below in such a way that it removes n from the list in linear time. So you have a little more wiggle room now. You may assume that n, if present in the list, will be present no more than once. And that too is meant to be a quiz-based simplifying assumption, so that if you find the number 50 somewhere in the list, you don't also have to worry about continuing to iterate, looking for every possible copy of 50, which would just devolve into some minutia in limited time. So with remove, this one was definitely more challenging and more code to write. But at first glance, frankly, it might look overwhelming and like something there's no way you could have come up with on a quiz. But if we focus on the individual steps, hopefully, it will suddenly strike you that each of these individual steps makes obvious sense in retrospect. So let's take a look. So first, we initialize pointer to be list itself. Because I want linear time, that means I'm going to have some loop. And a common way to iterate over the nodes in a list structure or any kind of structure iteratively is to take a pointer to the front of the data structure and then just start updating it and walk your way through the data structure. So I'm going to do exactly that. While pointer, my temporary variable, is not equal to null, let's go ahead and check. Did I get lucky? Is the n field in the node I'm currently looking at equal to the number I'm looking for? And if so, let's do something. Now, notice this if condition surrounds the entire following lines of code. This is the only thing I care about-- finding a number in question. So there's no else, which simplifies things conceptually a little bit. But now, I realized, and you might have only realized this after thinking it through a bit, there's actually two cases here. One is where the node is at the beginning of the list, which is a little annoying, because that's a special case, because you have to deal with this thing, which is the only anomaly. Everywhere else in the list, it's the same thing. There's a previous node and a next node, previous node, next node. But this guy is a little special if he's at the beginning. So if the pointer equals the list itself, so if I'm at the beginning of the list and I have found n, I need to do a couple of things. One, I need to change list to point to the next field, 50. So suppose that I'm trying to remove 34. So this guy's got to go away in just a moment. So I'm going to say, list gets pointer next. Well, this is pointer. Next is pointing over here. So this is changing this arrow right now to point to this guy here. Now, remember, we have a temporary variable. So we haven't orphaned any nodes, because I also have this guy in my implementation of remove. So now, if list itself is not null, I need to fix a little something. I need to now make sure that this arrow, which is previously pointing from 50 to 34, this has got to go away, because if I'm trying to get rid of 34, 50 had better not maintain any kind of back reference to it as the arrow suggested. So I just did this line. So then I'm done. That case is actually pretty easy. Chopping off the head of the list is relatively straightforward. Unfortunately, there's this annoying else block. So now, I have to consider the case where there's something in the middle. But it's not too terrible, except for syntax like this. So if I'm not at the beginning of the list, I'm somewhere in the middle. And this line here is saying, start at whatever node you're at. Go to the previous node's next field and point that at the pointer. Let's do this pictorially. That was getting complicated. So if I have a previous fields here-- let's do this-- next fields here. I'm going to simplify my pointers rather than draw a whole bunch of things back and forth crisscrossing each other. And now, let's just say this is 1, 2, 3 for the sake of discussion, even though that doesn't line up with the problem in question. So here's my linked list. I am trying to remove two in this particular version of the story. So I've updated pointer to be pointing to this guy. So this is PTR. He's pointing here. This is list, which exists globally as before. And he's pointing here no matter what. And now, I'm trying to remove two. So if pointer is pointing here, I'm going to follow, apparently, the previous pointer, which puts me at 1. I'm then going to say that the next field, which brings me over to this box here, is going to equal pointer next. So if this pointer, this is next. That means that this arrow needs to point to this guy. So what that line of code has just done is a little bit of this. And now, this is looking like a step in the right direction. We essentially want to snip 2 out of the middle of 1 and 3. So it makes sense that we want to route this pointer around it. So this next line is checking if pointer next is not null, there's indeed someone to the right of 2, that means we also have to do a little snip here. So I now need to follow this pointer and update the previous pointer on this guy to do a little bit of a workaround here the point here. And now, visually this is nice. It's a little messy in that there's no one pointing at the 2 anymore. 2 is pointing to the left. And 2 is pointing to the right. But he can do whatever he wants, because he's about to get freed. And it doesn't matter what those values are anymore. What's important is that the remaining guys are routing above and below him now. And indeed, that's what we do next. We free pointer, which means we tell the operating system, you are welcome to reclaim this. And then lastly, we return. Else implicitly, if we haven't returned yet, we've got to keep looking. So pointer equals pointer next just means move this guy here. Move this guy here. Move this guy here if, in fact, we didn't find the number we're looking for yet. So frankly, it looks completely overwhelming, I think, at first glance, especially if you struggled with this during the quiz then see something like this. And you pat yourself on the back. Well, there's no way I could have come up with that on the quiz. But I would argue, you can if you break it down into these individual cases and just walk through it carefully, albeit, admittedly, under stressful circumstances. Thankfully, the picture made everything happier. You could draw this in any number of ways. You don't have to do the crisscrossing thing here. You could do it with straight lines like this. But the gist of this problem, in general, was to realize that the picture in the end should look a little something like this, because constant time implied that you keep jamming and jamming and jamming the new nodes at the beginning of the list. Any questions? Probably the most challenging of certainly the coding questions. AUDIENCE: So is list similar to head in previous examples. DAVID J. MALAN: Exactly, exactly. Just a different name for a global variable. World wide what? ROB BOWDEN: OK. So this is the one where you had to write the paragraph. Some people wrote essays for this question. But you just need to use these six terms to describe what happens when you try to contact facebook.com. So I'll just talk through the process using all these terms. So in our browser, we type facebook.com and hit Enter. So our browser's going to construct an HTTP request that it's going to send through some process to Facebook for Facebook to respond to us with the HTML of its page. So what is the process by which the HTTP request actually gets to Facebook? So first, we need to translate Facebook.com. So just given the name Facebook.com, where actually does the HTTP request need to go? So we need to translate Facebook.com to an IP address, which uniquely identifies what machine we actually want to send this request to. Your laptop has an IP address. Anything connected to the internet has an IP address. So DNS, Domain Name System, that is what's going to handle the translation from facebook.com to an IP address that you actually want to contact. So we contact the DNS servers and say, what is facebook.com? It says, oh, it's IP address 190.212 something, something, something. All right. Now, I know what machine I want to contact. So then you send your HTTP request over to that machine. So how does it get to that machine? Well, the request goes from router to router bouncing. Remember the example in class, where we actually saw the route that the packets took when we tried to communicate. We saw it jump over the Atlantic Ocean at one point or whatever. So the last term port. So this is now on your computer. You can have multiple things currently communicating with the internet. So I can be running, say, Skype. I might have a web browser open. I might have something that torrenting files. So all of these things are communicating with the internet in some way. So when your computer receives some data from the internet, how does it know what application actually wants the data? How does it know whether this particular data is meant for the torrenting application as opposed to the web browser? So this is the purpose of ports in that all of these applications have claimed a port on your computer. So your web browser says, hey, I'm listening on port 1000. And your torrenting program is saying, I'm listening on port 3000. And Skype says, I'm using port 4000. So when you get some data that belongs to one of these applications, the data is marked with which port it actually should be sent along to. So this says, oh, I belong to port 1000. I know then I need to forward this along to my web browser. So the reason it's relevant here is that web servers tend to listen on port 80. So when I contact Facebook.com, I'm communicating with some machine. But I need to say which port of that machine I want to communicate with. And web servers tend to be listening on port 80. If they wanted, they could set it up so it lists as on port 7000. And then in a web browser, I could manually type Facebook.com:7000 to send the request to port 7000 of Facebook's web server. DAVID J. MALAN: And in this case, even though we didn't require that people mention this, in this case, what port would the request actually go to? Try again. Exactly. Not looking for that, but a subtlety that's there none the last. ROB BOWDEN: So the HTTPS, since it's listening specifically for the encrypted, it's on port 4430. AUDIENCE: And emails are 25, right? DAVID J. MALAN: Outbound emails, 25, yep. ROB BOWDEN: I don't even know most of the-- all of the lower ones tend to be reserved for things. I think everything under 1024 is reserved. AUDIENCE: Why did you say 3 was the wrong number? ROB BOWDEN: Because in an IP address, there's four groupings of digits. And they're from 0 to 255. So 192.168.2.1 is a common local network IP address. Notice all of those are less than 255. So when I started with 300, that couldn't possibly have been one of the numbers. DAVID J. MALAN: But that silly clip from-- was it CSI, where they had a number that was too big for the IP address. ROB BOWDEN: Any questions on this? The next one, so complete change in topic, but we have this PHP array for the houses in the quad. And we have an unordered list. And we want to print out each list item just containing the house name. So we have a foreach loop. So remember, the syntax is foreach array as item in the array. So through each iteration of the loop, house is going to take on one of the values inside of the array. On the first iteration, house will be Cabot House. On a second iteration, house will be Courier House and so on. So for each quad as house, we're just going to print-- you also could have echoed-- the list item and then the house's name and then close the list item. The curly braces are optional here. And then we also said in the question itself, remember to close the unordered list tag. So we need to exit PHP mode in order to do this. Or we could have echoed the close unordered list tag. DAVID J. MALAN: Also fine here would have been to use an old school for loop with a $i=0 0 and using counts to figure out the length of the ray. Totally fine too, just a little wordier. AUDIENCE: So if you were going to [INAUDIBLE], would you do-- I forget what the loop [INAUDIBLE] is. Would you $quad bracket i? DAVID J. MALAN: Exactly. Yeah, exactly. ROB BOWDEN: Anything else? DAVID J. MALAN: All right. Trade-offs. So there were bunches of answers possible for each of these. We were really just looking for something compelling for an upside and a downside. And number 16 asked, validating users' input client-side, as with JavaScript, instead of server-side, as with PHP. So what's an upside of doing client-side? Well, one of the things we proposed is that you reduce latency, because you don't have to bother contacting the server, which might take a few milliseconds or even a couple of seconds by avoiding that and just validating users' input client-side by triggering an on-submit handler and just checking, did they type something in for name? Did they type something in for email address? Did they choose a dorm from the drop-down menu? You can give them instantaneous feedback using the gigahertz computer or whatever they have that's actually on their desk. So it's just a better user experience typically. But a downside of doing client-side validation, if you do it without also doing server-side validation is that most anyone coming out of CS50 knows that you can just send any data you want to a server any number of ways. Frankly, in most any browser, you can click around in the settings and just turn off JavaScript, which would, therefore, disable any form of validation. But you also might recall that even I did some arcane things in class using telnet and actually pretending to be a browser by sending get requests to a server. And that's certainly not using any JavaScript. That's just me typing commands at a keyboard. So really, any programmer within enough comfort with the web and HTTP could send whatever data he or she wants to a server without validation. And if your server is not also checking, did they give me a name, is this actually a valid email address, did they choose a dorm, you might end up inserting bogus or just blank data into your database, which probably isn't going to be a good thing if you were assuming it was there. So this is an annoying reality. But in general, client-side validation is great. But it means twice as much work. Although there do exist various libraries, JavaScript libraries for instance, that make this much, much less of a headache. And you can reuse some of the code server-side, client-side. But do realize that it is typically additional work. Yeah. AUDIENCE: So if we just said less secure-- DAVID J. MALAN: [LAUGHS] Ugh. Those are always the harder ones to adjudicate. ROB BOWDEN: That would have been accepted. DAVID J. MALAN: What? ROB BOWDEN: I created this problem. That would have been accepted. DAVID J. MALAN: Yeah. AUDIENCE: Cool. ROB BOWDEN: But we did not accept for the first one-- well, what we were looking for is something like you don't have to communicate with the server. We didn't accept just faster. AUDIENCE: What about do not reload page? ROB BOWDEN: Yes. That was an accepted answer. DAVID J. MALAN: Anything where we felt it was more likely than not likely that you knew what you were saying, which is a tough line to draw sometimes. Using a linked list instead of an array to maintain a sorted list of integers. So an upside we often cite with linked lists that motivated their whole introduction was you get dynamism. They can grow. They can shrink. So you don't have to jump through hoops to actually create more memory with an array. Or you don't have to just say, sorry, user. The array is filled. So dynamic growth of the list. A downside though of linked lists? AUDIENCE: It's linear. Searching on linked list is linear instead of what you log in. DAVID J. MALAN: Exactly. Searching on a linked list is linear, even if it's sorted, because you can only follow these bread crumbs, these pointers, from the start of the list to the end. You can't leverage random access and, thus, binary search, even if it's sorted, that you could do with an array. And there's also another cost. Yeah. AUDIENCE: Memory inefficient? DAVID J. MALAN: Yeah. Well, I wouldn't necessarily say inefficient. But it does cost you more memory, because you need 32 bits for every node for the additional pointer, at least for a singly linked list. Now, if you're only storing integers and you're adding the pointer, that's actually kind of non-trivial. It's doubling the amount of memory. But in reality, if you're storing a linked list of structs that might have 8 bytes, 16 bytes, even more than that, maybe it's less of a marginal cost. But it's a cost nonetheless. So either of those would've been fine as downsides. 18. Using PHP instead of C to write a command-line program. So here, it's often faster to use a language like PHP or Ruby or Python. You just quickly open up a text editor. You have many more functions available to you. PHP has the kitchen sink of functions, whereas in C, you have very, very little. In fact, guys the know the hard way that you don't have hash tables. You don't have linked lists. If you want those, you have to implement them yourself. So one upside of PHP or really any interpreted language is the rapidity with which you can write code. But a downside, we saw this when I quickly whipped up a misspeller implementation in lecture using PHP, is that using an interpreted language is usually slower. And we saw that demonstrably with an increase in time from 0.3 seconds to 3 seconds, because of the interpretation that actually happens. Another upside was that you don't have to compile. So it also speeds up development incidentally, because you don't have two steps to running a program. You just have one. And so that's pretty compelling as well. Using a SQL database instead of a CSV file to store data. So SQL database is used for pset7. CSV files you didn't use much. But you used it indirectly in pset7 as well by talking to Yahoo Finance. But CSV is just like an Excel file but super simple, where the columns are just demarked by commas inside of an otherwise text file. And using a SQL database is a little more compelling. It's an upside, because you get things like select and insert and delete. And you get, presumably, indexes that MySQL and other databases, like Oracle, build for you in memory, which means your select is probably not going to be linear top to bottom. It's actually going to be something like binary search or something similar in spirit. So they're generally faster. But a downside is that it's just more work. It's more effort. You have to understand databases. You have to set it up. You need a server to run that database on. You need to understand how to configure it. So these are just these kinds of trade-offs. Whereas a CSV file, you can create it with gedit. And you're good to go. There's no complexity beyond that. Using a trie instead of a hash table with separate chaining to store a dictionary of words reminiscent of pset5. So a tries upside, in theory at least, is what? Constant time, at least if you're hashing on each of the individual letters in a word, like you might have for pset5. That might be five hashes, six hashes if there's five or six letters in the word. And that's pretty good. And if there's an upper bound on how long your words might be, that's indeed asymptotically constant time. Whereas a hash table with separate chaining, the problem there with that kind of data structure is that the performance of your algorithms usually depends on the number of things already in the data structure. And that's definitely the case with chains, whereby the more stuff you put into a hash table, the longer those chains go, which means in the worst case, the thing you might be looking for is all the way at the end of one of those chains, which effectively devolves into something linear. Now, in practice, it could absolutely be the case that a hash table with chains is faster than a corresponding trie implementation. But that's for various reasons, among which are tries use a whole lot of memory that can, in fact, slow things down, because you don't get nice benefits of something called caching, where things that are close together in memory can be accessed often more quickly. And sometimes you can come up with a really good hash function. Even if you have to waste a bit of memory, you might, indeed, be able to find things fast and not as bad as linearly. So in short, there wasn't necessarily with any of these one or even two specific things we were looking for. Really anything persuasive as an upside and downside generally caught our eye. ROB BOWDEN: So for the upside, we did not accept on its own "faster." You had to say something about it. Even if you said theoretically faster, we knew that you kind of understood that it's 0 of 1. And hash table, in theory, is not 0 of 1. Mentioning anything about runtime generally got you the points. But "faster," most of the solutions on the big board that were tries were objectively slower than solutions that were hash tables. So faster in and of itself isn't really true. DAVID J. MALAN: Dom de dom dom. I'm probably the only one that realizes that's how that's supposed to be pronounced, right? ROB BOWDEN: I had actually no idea. DAVID J. MALAN: It made sense in my head. ROB BOWDEN: I'm doing this one. OK. So this is the one where you had to draw the diagram similar to you might have seen on past exams. So let's just look at this. So from the HTML node, we have two children, the head and the body. So we branch-- head and body. The head has a title tag. So we have a title. Now, the one thing a lot of people forgot is that these text nodes are elements within this tree. So here we happen to draw them as ovals to differentiate them from these types of nodes. But notice also here we have top, middle, and bottom will end up being text nodes. So forgetting those was somewhat of a common mistake. The body has three children-- these three divs. So div, div, div and then the text node children of those divs. That's pretty much it for that questions. DAVID J. MALAN: And it's worth noting, even though we don't dwell on these details in the time we spend on JavaScript, that the order does, in fact, matter technically. So if head comes before body in the HTML, then it should appear to the left of body in the actual DOM. That his is, in general, just FYI, something called document order, where it does matter. And if you were implementing a parser, a program that reads HTML in building up the tree in memory, to be honest, that's intuitively probably what you do anyway-- top to bottom, left to right. ROB BOWDEN: Questions on that? Should I do the next one? DAVID J. MALAN: Sure. ROB BOWDEN: OK. So this is the buffer overrun attack question. The main thing to recognize here is, well, how might an adversary trick this program into executing arbitrary code? So argv1, the first command line argument to this program, that can be arbitrarily long. But here we're using memcpy to copy argv1, which here is bar. We're passing it as the argument. And so it's taking on the name bar. So we're memcpying bar into this buffer c. How many bytes are we copying? Well however many bytes bar happens to be using, the length of that argument. But c is only 12 bytes wide. So if we type a command line argument that's longer than 12 bytes, we're going to overflow this particular buffer. Now, how might an adversary trick the program into executing arbitrary code? So remember that here main is calling foo. And so then main calls foo. Let's draw this. So we have our stack. And main has a stack frame at the bottom. At some point, main calls foo. Well, immediately, main calls foo. And so foo gets its own stack frame. Now, at some point, foo is going to return. And went foo returns, we need to know at what line of code inside of main we were in order to know where we should resume in main. We can call foo from a whole bunch of different places. How do we know where to return? Well, we need to store that somewhere. So somewhere right around here, we store where we should return to once foo returns. And this is the return address. So how an adversary might take advantage of this is the fact that this buffer c is stored, let's say, right here is c. So we've got 12 bytes for c. This is c. And this is foo's stack ring. So if the malicious user enters more bytes than 12 or they enter a command line argument that's longer than 12 characters, then we're going to overflow this buffer. We can keep going. And at some point, we go far enough that we start overwriting this return address. So once we overwrite the return address, this means that when foo returns, we're returning to wherever the malicious user is telling it to by whatever value it entered, by whatever characters the user entered. And so if the malicious user is being particularly clever, he can have this return to somewhere in the printDef function or somewhere in the malloc function, just anywhere arbitrary. But even more clever is what if he has the user return to right here. And then you start executing these as lines of code. So at that point, the user can enter whatever he wants into this region. And he has complete control over your program. Questions on that? So the next question is complete the reimplementation of foo in such a way that it's no longer vulnerable. So there's a couple of ways you could have done this. We still have c only being of length 12. You could have changed this as part of your solution. We also added a check to make sure bar was not null. Though you didn't need that for full credit. So we're checking first the string length of bar. If it's greater than 12, then don't actually do the copy. So that's one way of fixing it. Another way of fixing it is instead of having c only be of length 12, have it be of length strlen(bar). Another way of fixing it is to actually just return. So if you had just gotten rid of all of this, if you had just deleted all lines of code, you would have gotten full credit, since this function doesn't actually accomplish anything. It's copying the command line argument into some array in its local stack frame. And then the thing is returning. And whatever it accomplished is gone. So return was also a sufficient way of getting full credit. DAVID J. MALAN: Not quite the spirit of the question but acceptable per the spec nonetheless. ROB BOWDEN: Questions on any of that? The one thing that you at least needed to have compiling code. So even though technically you aren't vulnerable if your code doesn't compile, we didn't accept that. No questions? OK. DAVID J. MALAN: Do you want to say this title? ROB BOWDEN: No. DAVID J. MALAN: So in this one, this was either good news or bad news. This is literally the same problem as the first quiz. And it's almost the same problem as pset1. But it was deliberately simplified to be a simpler pyramid, one that can be solved with a slightly simpler iteration. And really, what we were getting at here was not so much the logic, because probably, by this point, you're more comfortable than you were in week one with for loops or why loops, but really to tease apart that you're a little comfortable with the notion that PHP isn't just about what programming. It can actually be used as a language to write command line programs. And indeed, that's what we were trying to draw your attention to. This is a command line PHP program. So C code here, while correct in C, not correct for PHP. But the code really is the same. If you compare the solutions for Quiz 0 against Quiz 1, you'll find that it's almost identical, except for some dollar signs and for the absence of a data type. In particular, if we take a look here, you'll see that we iterate, in this case, from 1 up through 7. We could have done it 0 index. But sometimes, I think it's just mentally easier to think about things from 1 to 7. If you want one block, then two blocks, then three, then dot, dot, dot seven. We have j being initialized to 1 and then counting on up to i. And everything here is otherwise identical. But worthy of note are a couple of things. We give you these two lines, this first one, goofily named as a shebang for sharp bang. And that just specifies the path, the folder, in which a program can be found that you want to use to interpret this file. And then the line after that, of course, means enter PHP mode. And the line at the very bottom means exit PHP mode. And this works, in general, with interpreted languages. It's kind of annoying if you write a program in a file called foo.php. And then your users have to just remember, OK, to run this program, I have to type "php space foo.php." Kind of annoying if nothing else. And it also reveals that your program is written in PHP, which isn't all that illuminating for the user. So you can remove the .php altogether recall from lecture. And you can actually do ./foo if you've chmodded it by making it executable. So chmod a+x foo would have done that. And if you also add the shebang here. But really, the problem was getting at printing out something like this. No HTML, no C-code certainly, just some PHP. So Milo then returned in problem 25. And in 25, you were given the following skeleton code, which was a pretty simple web page. And the juicy part HTML-wise was down here, where we have inside of the body a form that has unique ID of inputs inside of which was two inputs, one with an idea of name, one with an idea of button. The first was type text, the second of type submit. And so we gave you, actually, more ingredients than you needed, just so you guys had options with which to solve this problem. You don't strictly need all of these IDs. But it allows you to solve it in different ways. And up at the top, notice that the objective was to trigger a window like this-- Hello, Milo!-- to pop up in the browser using the super simple, if not ugly, alert function. And so, ultimately, this boils down conceptually to somehow listening for submissions of the form client-side , not the server-side, somehow responding to that submission by grabbing the value that the user typed in to the name field, and then displaying it in the body of an alert. So one way you can do this is with jQuery, which looks a little syntactically perplexing at first. You can do this with pure DOM code-- document.getelement by ID. But let's take a look at this version. I have a couple of important lines first. So one, we have this line, which is identical to what you might have seen in, I believe, form2.html from class in week 9. And this is just saying, execute the following code when the document is ready. This being important only because HTML pages are read top to bottom, left to right. And therefore, if you try to do something in code up here to some DOM element, some HTML tag, that's down here, you're doing it too soon, because this has not even been read into memory. So by saying this document.ready line, we're saying, here's some code, browser. But don't execute this until the whole document is ready, that is the DOM tree exists in memory. This one is a little more straightforward, if syntactically a bit different, where I'm saying, grab the HTML element whose unique identifier is inputs. That's what the hash tag denotes, the unique ID. And then I'm calling .submit. So .submit here is a function, otherwise known as a method, that's inside of the object on the left-hand side there that I didn't highlight. So if you think of inputs as an object in memory-- and indeed it is. It's a node in a tree-- .submit means when this form with this ID is submitted, execute the following code. I don't care what the name of the function is I'm executing. So here I'm using, as before, what's called the lambda function or an anonymous function. It's not at all intellectually interesting other than it has no name, which is fine if you're only ever going to call it once. And inside there I actually handle the submission of the form. I first declare a variable called value. And then what is the effect of this highlighted portion here now? What does that do at a high level for me? AUDIENCE: It gets the value that the user didn't in the HTML below. It gets that ID and then finds the value of it. DAVID J. MALAN: Exactly. It grabs the node, whose unique identifier is name. It gets the value therein, which is, presumably, what the user typed him or herself. And then it stores that in the variable called value. As an aside, you could have also done this a little differently. Totally acceptable by doing something lie var value gets document.getElementById. And this is why it's a little tedious to not use jQuery. "name".value. So totally acceptable. Different ways to do this. jQuery just tends to be a little more succinct and definitely more popular among programmers. Now, I'm doing a bit of a sanity check, because in the problem statement we explicitly said, if the user has not yet typed his or her name, don't show an alerts. But you can check for that, by just checking for the empty string for a quote-unquote if there's nothing actually there. But if it's not equal to quote-unquote, I want to call alerts. And the interesting part here is that we're using the plus operator, which does what in JavaScript? Concatenate. So it's like PHPs dot operator. Same idea, slightly different syntax. And I'm just creating the string that you saw on the screen shot-- Hello, so and so. And then the last detail is this. Why do I return false inside of this anonymous function? AUDIENCE: There's no value. You put it in form. It just says, if value is not equal to blank, then do it. There was a blank in that submission. DAVID J. MALAN: OK. Careful though. There's no one else here. And that return false is outside of the if conditions. So this highlighted line, return false, executes no matter what when the form is submitted. What does returning false inside of this event handler, as it's called, the event in question being submission? AUDIENCE: Because it only happens once. DAVID J. MALAN: Only happens once. Not quite. Yeah? AUDIENCE: It prevents the form from submitting to the default behavior, which would make the page reload. DAVID J. MALAN: Exactly. So I'm overloading the term submit here, because I'm saying, the form is being submitted. But as you suggest, it's actually not been submitted in the true HTTP way. When you click Submit, because of our onSubmit handler, we're intercepting that form submission so to speak. We're then doing our thing with JavaScript code. But I'm deliberately returning false, because what I don't want to happen a split second later is for the whole form itself to be submitted to the web server with key value pairs by changing the URL to be something like q=cats or whatever we did, for instance, in class. I don't want that to happen, because there is no server listening for this form submission. It's purely done in JavaScript code. And that's why I didn't even have an action attribute on my form, because I don't intend for this to ever go to the server. So it's being submitted. But we're intercepting that form submission and preventing the default behavior, which is to actually go all the way to the server. AUDIENCE: So keeping it client-side. DAVID J. MALAN: Keeping it client-side. Exactly right. Next up was my oh MySQL. ROB BOWDEN: OK. So this first question was generally rough for people. Though the later ones went better. So you had to choose the correct data types for both of these columns. And both of these have some things about them that make the choice difficult. So int was not a valid type for number. The reason being a 12-digit account number, an int is not big enough to store total digits. So a valid choice would have been a big int if you happen to know that. Another choice could have been a char field of length 12. So either of those would have worked. Int would not. Now, balance, think back to pset7. So we specifically used decimal to store the value of shares or-- DAVID J. MALAN: Cash. ROB BOWDEN: Cash. We used decimal to store the amount of cash that the user currently has. So the reason we do that is because, remember, floats. There's floating point in precision. It can't precisely store the cash values like we want here. So decimal is able to precisely store something to, say, two decimal places. That's why balance, we want it to be decimal and not float. DAVID J. MALAN: And also, too, though it might have been clever in other contexts to think, maybe this is a chance for an int. I'll just keep track of things in pennies. Because we explicitly showed the default value of being 100.00, that means it could just be an int. And another subtlety too with number was that it wasn't meant to be a trick question. But recall that an int in MySQL, like in C, at least in the appliance, is 32-bit. And even though we don't expect you to know exactly how many digits that means, do recall that the largest number you can represent potentially with a 32-bit number is roughly what? What number do we always say? 2 to the 32, which is what roughly? You don't have to know precisely. But roughly is helpful in life. It's roughly 4 billion. So we've said that a few times. I know I have said that a few times. And it is roughly 4 billion. And that's a good rule of thumb to know. If you have 8 bits, 256 is the magic number. If you have 32 bits, 4 billion give or take. So if you just write down 4 billion, you'll see that it's fewer digits than 12, which means that's clearly not enough expressiveness to capture a 12-digit account number. ROB BOWDEN: OK. So the other ones went better. So suppose that the bank imposes a $20 monthly maintenance fee on all accounts. With what SQL query could the bank deduct $20 from every count, even if it results in some negative balances? So basically, there are four main types of queries-- insert, select, update, and delete. So what do we think we're going to use here? Update. So let's take a look. So here we're updating. What table are we updating accounts? So updating accounts. And then the syntax says, what in accounts are we updating? Well, we're setting balance equal to the current value of balance minus 20. So this will update all rows of accounts, subtracting $20 from the balance. DAVID J. MALAN: A common mistake here, even though we sometimes forgave it, was to actually have PHP code here calling the query function or putting quotes around everything that didn't need to be there. ROB BOWDEN: Remember that MySQL is a separate language from PHP. We happen to be writing MySQL in PHP. And PHP is then sending it over to the MySQL server. But you don't need PHP in order to communicate with a MySQL server. DAVID J. MALAN: Exactly. So no variables with dollar signs should be in this context. It can just do all of the math within the database itself. ROB BOWDEN: OK. So the next one. Is this the next one? Yeah. So with what SQL query could the bank retrieve the account numbers of its richest customers, those with balances greater than 1,000? So which of the four main types are we going to want here? Select. So we want to select. What do we want to select? What column do we want to select? We will specifically want to select number. But if you said star, we also accepted that. So select number from what table? Accounts. And then the condition we want? Where balance greater than 1,000. We also accepted greater than or equal. Last one. With what SQL query could the bank close, i.e., delete every account that has a balance of $0? So which of the four are we going to want to use? Delete. So the syntax for that? Delete from what table? Accounts. And then the condition on which we want to delete-- where balance equals zero. So delete all rows from accounts where the balance is zero. Questions on any of these? Want to queue? DAVID J. MALAN: Queue guide. So in this one, we gave you a somewhat familiar structure that we explored a bit in class alongside of structs, which was a data structure related in spirit. The difference though with a queue is that we had to somehow remember who was at the front of the queue, in large part so that we could make more efficient use of the memory, at least if we were using an array. Because recall, if we have an array, if, for instance, this is the front of the queue, if I get into the queue here, and then someone gets in line behind me, behind me, behind me, and one person steps out of line, you could, as we saw some of our human volunteers in class, have everyone shift this way. But in general, having everyone do something is not the best use of time in a program, because it means your algorithm is running in what asymptotic running time? It's linear. And I feel like that's kind of stupid. If the next person in line is the next person who's supposed to go into the store, they don't all have to move together. Just let that person be plucked off when the time comes, for instance. So we can save a bit of time there. And so to do that though, that means that the head of the queue or the front of the queue is going to progressively move deeper and deeper into the array and eventually might actually wrap around if we're using an array to store the people in this queue. So you can almost think of the array as a circular data structure in that sense. So you somehow have to keep track of the size of it or really the end of it and then where the beginning of it is. So we propose that you declare one such queue, calling it q, just one letter. Then we propose that the front be initialized to zero and that the size be initialized to zero. So right now, there's nothing inside of that queue. And we ask you to complete the implementation of enqueue below in such a way that the function adds n to the end of q and then returns true. But if q is full or negative, the function should instead return false. And we gave you a couple of assumptions. But they're not really functionally relevant, just that bool exists, because, technically, bool doesn't exist in C unless you include a certain header file. So that was just make sure there were no is this a trick question kind of thing. So enqueue, we proposed in the sample solutions to implement as follows. One, we first check the ease, the low-hanging fruits. If the queue is full or the number that you're trying to insert is less than zero, which we said in the specification of the problem should not be allowed, because we only want non-negative values, then you should just return false immediately. So some relatively easy error checking. If though you want to add that actual number, you had to do a bit of thinking here. And this is where it's a little annoying mentally, because you have to figure out how to handle wraparound. But the germ of the idea here that's of interest to us is that wraparound often implies modular arithmetic and the mod operator, the percent side, where you can go from a larger value back to zero and then one and two and three and then back around to zero, one and two and three and so forth again and again. So the way we propose doing this is that we do want to index into the array called numbers where our integers lie. But to get there, we first want to do whatever the size of the queue is but then add to that whatever the front of the list is. And the effect of that is to put us at the right position in the queue and not assume that the first person in line is at the beginning, which he or she absolutely could be if we were also shifting everyone. But we're just creating work for ourselves if we took that particular path. So we can keep it relatively simple. We do have to remember that we just added an int to the queue. And then we just return true. Meanwhile, in dequeue, we asked you to do the following. Implement it in such a way that it dequeues, that is removes and returns, the int at the front of queue. To remove the int, it suffices to forget it. You don't need to override its bit. So it's still actually there. Just like data on a hard drive, we're just ignoring the fact that it's now there. And if q is empty, we should instead return negative 1. So this feels arbitrary. Why return negative 1 instead of false? Yeah. AUDIENCE: Q is storing positive values. Since you only store positive values in the q, negative is an error. DAVID J. MALAN: OK, true. So because we're only storing positive values or zero, then it's fine to return a negative value as a sentinel value, a special symbol. But you're rewriting history there, because the reason we're only returning non-negative values is because we want to have a sentinel value. So more specifically, why not just return false in cases of errors? Yeah. AUDIENCE: You've failed to return an integer. DAVID J. MALAN: Exactly. And this is where C gets pretty constraining. If you're saying you're going to return an int, you've got to return an int. You can't get fancy and start returning a bool or a float or a string or something like that. Now, meanwhile, JavaScript and PHP and some other languages can, in fact, have you returning different types of values. And that can actually be useful, where you could return positive ints, zeros, negative ints, or false or null even to signify error. But we don't have that versatility in C. So with dequeue, what we propose to do is-- ROB BOWDEN: You can return false. It's just that false is hash define false to zero. So if you return false, you're returning zero. And zero is a valid thing in our queue, whereas negative 1 is not if false happened to be negative 1. But you shouldn't even need to know that. DAVID J. MALAN: That's why I didn't say it. ROB BOWDEN: But it wasn't true that you can't return false. DAVID J. MALAN: Sure. So dequeue, notice we accept void as its argument. And that's because we're not passing anything in. We just want to remove the element at the front of the queue. So how might we go about doing this? Well, first, let's do this quick sanity check. If the queue size is 0, there's no work to be done. Return negative 1. Done. So that's a few lines of my program. So only four lines remain. So here I decide to decrement the size. And decrementing the size effectively means that I'm forgetting something is in there. But I also have to update where the front of the numbers are. So to do that, I need to do two things. I first need to remember what the number is at the front of the queue, because I need to return that thing. So I don't want to accidentally forget about it and then overwrite it. I'm just going to remember in an int. And now, I want to update q.front to be q.front+1. So if this was the first person in line, now, I want to do plus 1 to point at the next person in line. But I have to handle that wraparound. And if capacity is a global constant, that's going to allow me to make sure as I point to the very last person in line, the modulo operation will bring me back to zero at the front of the queue. And that handles the wraparound here. And then I proceed to return n. Now, strictly speaking, I didn't have to declare n. I didn't have to grab it and store it temporarily, because the value is still there. So I could just do the right arithmetic to return the former head of the queue. But I just felt that this was more clear to actually grab the int, put it in n, and then return that for clarity's sake but not strictly necessary. Psst. They're all pronounceable in my head. ROB BOWDEN: So first question is the binary tree problem. So first question is, we're given these numbers. And we want to somehow insert them into these nodes such that it is a valid binary search tree. So the one thing to remember about binary search trees is that it's not just that the thing to the left is less and the thing to the right is greater. It needs to be that the entire tree to the left is less, and the entire tree to the right is greater. So if I put 34 here at the top, and then I put 20 here, so that's valid so far, because 34 up here. 20 is going to the left. So that's less. But I can't then put 59 here, because even though 59 is on the right of 20, it's still on the left of 34. So with that constraint in mind, the easiest way of probably solving this problem is to just sort of these numbers-- so 20, 34, 36, 52, 59, 106. And then insert those from left to right. So 20 goes here. 34 goes here. 36 goes here. 52, 59, 106. And you also could have figured out with some plugging in and realizing, oh, wait, I don't have enough numbers to fill this in over here. So I need to reshift what my route note is going to be. But notice that in the final three, if you read from left to right, it is in increasing order. So now, we want to declare what the struct is going to be for the nodes in this tree. So what do we need in a binary tree? So we have a value of type int, so some int value. I don't know what we called it in the solution-- int n. We need a pointer to the left child and a pointer to the right child. So it's going to look like this. And it'll actually look before when did the doubly-linked list stuff, so notice-- I'm going to have to scroll all the way back down to problem 11. So notice it looks identical to this, except we just happen to call these different names. We still have an integer value and two pointers. It's just that instead of treating the pointers as pointing to the next thing and the previous thing, we're treating the pointers to point to a left child and right child. OK. So that's our struct node. And now, the only function we need to implement for this is traverse, which we want to go over the tree, printing out the values of the tree in order. So looking here, we would want to print out 20, 34, 36, 52, 59, and 106. How do we accomplish that? So it's pretty similar. If you saw in the past exam the problem that you wanted to print out the whole tree with commas in between everything, it was actually even easier than that. So here is the solution. This was significantly easier if you did it recursively. I don't know if anyone attempted to do it iteratively. But first, we have our base case. What if the root is null? Then we're just going to return. We don't want to print anything. Else we're going to traverse recursively down. Print the entire left subtree. So print everything less than my current value. And then I'm going to print myself. And then I'm going to recurse down my entire right subtree, so everything greater than my value. And this is going to print out everything in order. Questions on how this actually accomplishes that? AUDIENCE: I have a question on the [INAUDIBLE]. ROB BOWDEN: So one way of approaching any recursive problem is to just think about it like you have to think about all the corner cases. So consider that we want to print this entire tree. So all we are going to focus on is this particular node-- 36. The recursive calls, we pretend those just work. So here, this recursive call to traverse, we without even thinking about it, just traversing the left three, imagine that already prints 20 and 34 for us. And then when we eventually recursively call traverse on the right, that will correctly print 52, 59, and 106 for us. So given that this can print 20, 34, and the other can print 52, 59, 108, all we need to be able to do is print ourself in the middle of that. So print out everything before us. Print ourself, so the current node print 36, regular printf, and then print everything after us. DAVID J. MALAN: This is where recursion gets really beautiful. It's this amazing leap of faith where you do the tiniest bit of work. And then you let someone else do the rest. And that someone else is, ironically, you. So for serious brownie points, if you scroll up on the questions-- ROB BOWDEN: On the questions? DAVID J. MALAN: And down a little to the numbers, does anyone know where these numbers come from? ROB BOWDEN: I have literally no idea. DAVID J. MALAN: They appear throughout the quiz. AUDIENCE: Are they the same numbers? DAVID J. MALAN: Those numbers. A little Easter egg. So for those of you watching online at home, if you can tell us via email to heads@CS50.net what the significance of these recurring six numbers are throughout Quiz 1, we will shower you with amazing attention at the final lecture and a stress ball. Nice, subtle. ROB BOWDEN: Any last questions about anything on the quiz?