[Section 6: Less Comfortable] [Nate Hardison] [Harvard University] [This is CS50.] [CS50.TV] All right. Welcome to section 6. This week, we're going to be talking about data structures in section, primarily because this week's problem set on spellr does a whole bunch of different data structure exploration. There are a bunch of different ways you can go with the problem set, and the more data structures you know about, the more cool things you can do. So let's get started. First we're going to talk about stacks, the stack and queue data structures that we're going to talk about. Stacks and queues are really helpful when we start talking about graphs, which we're not going to do so much of right now. But they're really good to understand one of the big fundamental data structures of CS. The description in the problem set specification, if you pull it up, talks about stacks as akin to the pile of dining trays that you have in the cafeteria at the dining halls where when the dining staff comes and puts the dining trays out after they've cleaned them, they stack them one on top of the other. And then when kids come in to get food, they pull the trays off, first the top one, then the one below it, then the one below that. So, in effect, the first tray that the dining staff put down is the last one that gets taken off. The last one that the dining staff put on is the first one that gets taken off for dinner. In the problem set's spec, which you can download if you haven't already, we talk about modeling a stack data stucture using this kind of struct. So what we've got here, this is similar to what was presented in lecture, except in lecture we presented this with ints as opposed to char*s. This is going to be a stack that stores what? Daniel? What are we storing in this stack? [Daniel] Strings? >>We are storing strings in this stack, exactly. All you need to have in order to create a stack is an array of a particular capacity, which in this case, capacity is going to be in all caps because it's a constant. And then in addition to the array, all we need to track is the current size of the array. One thing to note here that's kind of cool is that we're creating the stacked data structure on top of another data structure, the array. There are different ways to implement stacks. We won't do it quite yet, but hopefully after doing the linked-list problems, you'll see how you can easily implement a stack on top of a linked list as well. But for now, we'll stick to the arrays. So again, all we need is an array and we just need to track the size of the array. [Sam] Sorry, why is it that you said the stack's on top of the strings? To me it seems like the strings are within the stack. [Hardison] Yeah. We're creating, we're taking our array data structure-- that's a great question. So the question is why, for the people who are watching this online, why are we saying that the stack is on top of the strings, because here it looks like the strings are inside the stack? Which is totally the case. What I was referring to was that we've got an array data structure. We've got an array of char*s, this array of strings, and we are going to add to that in order to create the stacked data structure. So a stack is slightly more complex than an array. We can use an array to build a stack. So that's where we say that the stack is built on top of an array. Likewise, like I said earlier, we can build a stack on top of a linked list. Instead of using an array to hold our elements, we could use a linked list to hold our elements and build the stack around that. Let's walk through a couple of examples, looking at some code, to see what's actually happening here. On the left, I've thrown down what that stack struct would look like in memory if capacity were #defined to be four. We've got our four-element char* array. We've got strings[0], strings[1], strings[2], strings[3], and then that last space for our size integer. Does this make sense? Okay. This is what happens if what I do on the right, which will be my code, is to just declare a struct, a stacked struct called s. This is what we get. It lays down this footprint in memory. The first question here is what are the contents of this stack struct? Right now they're nothing, but they're not totally nothing. They're this kind of garbage. We have no idea what's in them. When we declare stack s, we're just throwing that down on top of memory. It's kind of like declaring int i and not initializing it. You don't know what's in there. You can read what's in there, but it might not be super helpful. One thing you want to always remember to do is initialize whatever needs to be initialized. In this case, we're going to initialize the size to be zero, because that's going to turn out to be very important for us. We could go ahead and initialize all of the pointers, all the char*s, to be some understandable value, probably null. But it's not totally necessary that we do that. Now, the two main operations on stacks are? Anybody remember from lecture what you do with stacks? Yes? [Stella] Pushing and popping? >>Exactly. Pushing and popping are the two main operations on stacks. And what does push do? >>It puts something onto the top of the stack, and then popping takes it off. [Hardison] Exactly. So pushing pushes something on top of the stack. It's like the dining staff putting a dining tray down on the counter. And popping is taking a dining tray off of the stack. Let's walk through a couple of examples of what happens when we push things into the stack. If we were to push the string 'hello' onto our stack, this is what our diagram would look like now. See what happens? We pushed into the first element of our strings array and we upped our size count to be 1. So if we look at the difference between the two slides, here was 0, here's before the push. Here is after the push. Before the push, after the push. And now we have one element in our stack. It's the string "hello", and that's it. Everything else in the array, in our strings array, is still garbage. We haven't initialized it. Let's say we push another string onto our stack. We're going to push "world" on this time. So you can see "world" here goes on top of "hello", and the size count goes up to 2. Now we can push "CS50", and that'll go on top again. If we go back, you can see how we're pushing things on top of the stack. And now we get to pop. When we popped something off of the stack, what happened? Anybody see the difference? It's pretty subtle. [Student] The size. >>Yeah, the size changed. What else would you have expected to change? [Student] The strings, too. >>Right. The strings too. It turns out that when you're doing it this way, because we're not copying the elements into our stack, we actually don't have to do anything; we can just use the size to keep track of the number of things in our array so that when we pop again, again we just decrement our size down to 1. There's no need to actually go in and overwrite anything. Kind of funky. It turns out that we typically just leave things alone because it's less work for us to do. If we don't have to go back and overwrite something, then why do it? So when we pop twice off of the stack, all that does is decrement the size a couple of times. And again, this is only because we're not copying things into our stack. Yes? Go ahead. [Student, unintelligible] >>And then what happens when you push something again? When you push something again, where does it go? Where does it go, Basil? >>Into strings[1]? >>Right. Why doesn't it go into strings[3]? [Basil] Because it forgot that there was anything in strings[1] and [2]? [Hardison] Exactly. Our stack, essentially, "forgot" that it was holding on to anything in strings[1] or strings[2], so when we push "woot", it just puts that into the element at strings[1]. Are there any questions on how this works, at a basic level? [Sam] So this is not dynamic in any way, in terms of amount or in terms of the size of the stack? [Hardison] Exactly. This is--the point was that this was not a dynamically growning stack. This is a stack that can hold, at most, four char*s, at most four things. If we were to try and push a fifth thing, what do you think should happen? [Students, unintelligible] [Hardison] Exactly. There are a number of things that could happen. It could possibly seg fault, depending on what we were-- how exactly we were implementing the back-end. It could overwrite. It could have that buffer overflow that we talked about in class. What would be the most obvious thing that might be overwritten if we tried to push an extra thing on our stack? So you mentioned a buffer overflow. What might be the thing that would get written over or stomped on if we overflowed accidentally by trying to push an extra thing? [Daniel, unintelligible] >>Possible. But initially, what might happen? What if we tried to push a fourth thing? It might overwrite the size, at least with this memory diagram that we've got. In the problem set specification, which is what we're going to be implementing today, what we do want to do is just return false. Our push method is going to return a boolean value, and that boolean value will be true if the push succeeds and false if we can't push anything more because the stack is full. Let's walk through a little bit of that code right now. Here's our push function. Our push function for a stack is going to take in the string to put on the stack. It's going to return true if the string was successfully pushed on the stack and false otherwise. Any suggestions on what might be a good first thing to do here? [Sam] If size equals capacity then return false? [Hardison] Bingo. Nice job. If the size is the capacity, we're going to return false. We can't put anything more in our stack. Otherwise, we want to put something on the top of the stack. What is "the top of the stack," initially? [Daniel] Size 0? >>Size 0. What is the top of the stack after there's one thing in the stack? Missy, do you know? [Missy] One. >>Size is one, exactly. You keep adding to the size, and every time you're putting in the new element at the index size in the array. We can do it with that kind of one-liner, if that makes sense. So we've got our strings array, we're going to access it at the size index, and we're just going to store our char* in there. Notice how there's no string copying going on in here, no dynamic allocation of memory? And then Missy brought up what we now have to do, because we've stored the string in the appropriate place in the array, and she said that we had to increment the size by one so that we're ready for the next push. So we can do that with s.size++. At this point, we've pushed into our array. What's the last thing we have to do? [Student] Return true. >>Return true. So it's pretty simple, a pretty simple code. Not too much. Once you've wrapped your head around how the stack works, this is pretty simple to implement. Now, the next part of this is popping a string off of the stack. I'm going to give you guys some time to work on this a little bit. It's almost essentially the reverse of what we've done here in push. What I've done is actually--oops. I've booted up an appliance over here, and in the appliance, I've pulled up the problem set 5 specification. If we zoom in here, we can see I'm at cdn.cs50.net/2012/fall/psets/pset5.pdf. Have you guys downloaded this code that's located here, section6.zip? All right. If you haven't done that, do that right now, really quickly. I'll do it in my terminal window. I actually did it up here. Yeah. Yes, Sam? >>I have a question about why did you say s.string's brackets of size = str? What is str? Is that defined somewhere before, or--oh, in the char* str? [Hardison] Yes, exactly. That was the argument. >>Oh, okay. Sorry. [Hardison] We're specifying the string to push in. The other question that might come up that we didn't really talk about here was we took for granted that we had this variable called s that was in scope and accessible to us. We took for granted that s was this stack struct. So looking back at this push code, you can see that we're doing stuff with this string that got passed in but then all of a sudden, we're accessing s.size, like, where did s come from? In the code that we're going to look at in the section archive and then the stuff that you'll be doing in your problem sets, we've made our stack struct a global variable so that we can have access to it in all our different functions without having to manually pass it around and pass it by reference, do all that kind of stuff to it. We're just cheating a little bit, if you will, to make things nicer. And that's something we're doing here because it's for fun, it's easier. Often, you'll see people do this if they have one big data structure that's being operated on within their program. Let's go back over to the appliance. Did everybody successfully get the section6.zip? Everybody unzip it using unzip section6.zip? If you go into the section 6 directory-- aah, all over the place-- and you list what's in here, you see that you've got three different .c files. You've got a queue, an sll, which is singly-linked list, and a stack. If you open up stack.c, you can see that we've got this struct defined for us, the exact struct that we just talked about in the slides. We've got our global variable for the stack, we've got our push function, and then we've got our pop function. I'll put the code for push back up on the slide here, but what I'd like you guys to do is, to the best of your ability, go and implement the pop function. Once you've implemented it, you can compile this with make stack, and then run the resultant stack executable, and that will run all of this test code down here that's in main. And main takes care of actually making the push and pop calls and making sure that everything goes through all right. It also initializes the stack size right here so you don't have to worry about initializing that. You can assume that it's been properly initialized by the time that you access it in the pop function. Does that make sense? So here we go. There's the push code. I'll give you guys 5 or 10 minutes. And if you have any questions in the interim while you're coding, please ask them out loud. So if you get to a sticking point, just ask. Let me know, let everybody else know. Work with your neighbor too. [Daniel] We're just implementing pop right now? >>Just pop. Though you can copy the implementation of push if you'd like so that the testing will work. Because it's hard to test things getting into-- or, it's hard to test popping things out of a stack if there isn't anything in the stack to begin with. What is pop supposed to be returning? The element from the top of the stack. It's supposed to get the element off of the top of the stack and then decrement the size of the stack, and now you've lost the element on the top. And then you return the element on the top. [Student, unintelligible] [Hardison] So what happens if you do that? [Student, unintelligible] What ends up happening is you're probably accessing either an element that hasn't been initialized yet, so your calculation of where the last element is is off. So here, if you notice, in push, we're accessing strings at the s.size element because it's a new index. It's the new top of the stack. Whereas in pop, s.size is going to be the next space, the space that's on top of all the elements in your stack. So the top-most element isn't at s.size, but rather, it's beneath it. The other thing to do when you--in pop, is you do have to decrement the size. If you remember back to our little diagram right here, really, the only thing that we saw happening when we called pop was that this size dropped, first to 2, then to 1. Then when we pushed a new element on, it would go on at the proper spot. [Basil] If the s.size is 2, then wouldn't it go to element 2, and then you'd want to pop that element off? So if we went to-- >>So let's look at this again. If this is our stack at this point and we call pop, at which index is the top-most element? [Basil] At 2, but it's going to pop 3. >>Right. So that's where our size is 3, but we want to pop the element at index 2. It's that typical kind of off by one that you have with the zero-indexing of arrays. So you do want to pop the third element, but the third element isn't at index 3. And the reason we don't have to do that minus 1 when we're pushing is because right now, you notice that the top-most element, if we were to push something else onto the stack at this point, we would want to push it at index 3. And it just so happens that the size and the indices line up when you're pushing. Who's got a working stack implementation? You've got a working stack one. Do you have pop working yet? [Daniel] Yes. I think so. >>Program's running and not seg faulting, it's printing out? Does it print out "success" when you run it? Yeah. Make stack, run it, if it prints out "success" and doesn't go boom, then all's good. All right. Let's go over to the appliance really quickly, and we'll walk through this. If we look at what's going on here with pop, Daniel, what was the first thing that you did? [Daniel] If s.size is greater than 0. [Hardison] Okay. And why did you do that? [Daniel] To make sure that there was something inside the stack. [Hardison] Right. You want to test to make sure that s.size is greater than 0; otherwise, what do you want to have happen? [Daniel] Return null? >>Return null, exactly. So if s.size is greater than 0. Then what are we going to do? What do we do if the stack is not empty? [Stella] You decrement the size? >>You decrement the size, okay. So how did you do that? >>s.size-- . [Hardison] Great. And then what did you do? [Stella] And then I said return s.string[s.size]. [Hardison] Great. Otherwise you return null. Yes, Sam? [Sam] Why does it not need to be s.size + 1? [Hardison] Plus 1? >>Yeah. >>Got it. [Sam] I thought because you're taking 1 out, then you're going to be returning not the one that they asked for. [Hardison] And this was just what we were talking about with this whole issue of 0 indices. So if we zoom back over here. If we look at this guy right here, you can see that when we pop, we're popping the element at index 2. So we decrease our size first, then our size matches our index. If we don't decrement the size first, then we have to do size -1 and then decrement. Great. All good? Any questions on this? There are a number of different ways to write this as well. In fact, we can do something even--we can do a one-liner. We can do a one-line return. So we can actually decrement before we return by doing that. So putting the -- before the s.size. That makes the line really dense. Where the difference between the --s.size and s.size-- is that this postfix--they call it postfix because the -- comes after the s.size-- means that s.size is evaluated for the purposes of finding the index as it is currently when this line is executed, and then this -- happens after the line gets executed. After the element at index s.size is accessed. And that's not what we want, because we want the decrement to happen first. Othewise, we're going to be accessing the array, effectively, out of bounds. We're going to be accessing the element above the one that we actually want to access. Yeah, Sam? >>Is it faster or use less RAM to make in one line or not? [Hardison] Honestly, it really depends. [Sam, unintelligible] >>Yeah, it depends. You can do compiler tricks to get the compiler to recognize that, usually, I imagine. So we've mentioned a little bit about this compiler optimization stuff that you can do in compiling, and that's the kind of thing that a compiler might be able to figure out, like oh, hey, maybe I can do this all in one operation, as opposed to loading the size variable in from RAM, decrementing it, storing it back out, and then loading it back in again to process the rest of this operation. But typically, no, this is not the sort of thing that's going to make your program significantly faster. Any more questions on stacks? So pushing and popping. If you guys want to try out the hacker edition, what we've done in the hacker edition is actually gone and made this stack grow dynamically. The challenge there is primarily up here in the push function, to figure out how to make that array grow as you keep pushing more and more elements on to the stack. It's actually not too much additional code. Just a call to--you have to remember to get the calls to malloc in there properly, and then figure out when you're going to call realloc. That's a fun challenge if you're interested. But for the time being, let's move on, and let's talk about queues. Scroll through here. The queue is a close sibling of the stack. So in the stack, things that were put in last were the first things to then be retrieved. We've got this last in, first out, or LIFO, ordering. Whereas in the queue, as you'd expect from when you're standing in line, the first person to get in line, the first thing to get into the queue, is the first thing that gets retrieved from the queue. Queues are also often used when we're dealing with graphs, like we talked about briefly with stacks, and queues are also handy for a bunch of other things. One thing that comes up often is trying to maintain, for example, a sorted list of elements. And you can do this with an array. You can maintain a sorted list of things in an array, but where that gets tricky is then you always have to find the appropriate place to insert the next thing. So if you have an array of numbers, 1 through 10, and then you want to expand that to all the numbers 1 through 100, and you're getting these numbers in random order and trying to keep everything sorted as you go through, you end up having to do a lot of shifting. With certain types of queues and certain types of underlying data structures, you can actually keep it fairly simple. You don't have to add something and then reshuffle the whole thing each time. Nor do you have to do a lot of shifting of the internal elements around. When we look at a queue, you see that--also in queue.c in the section code-- the struct that we've given you is really similar to the struct that we gave you for a stack. There's one exception to this, and that one exception is that we have this additional integer called the head, and the head here is for keeping track of the head of the queue, or the first element in the queue. With a stack, we were able to keep track of the element that we were about to retrieve, or the top of the stack, using just the size, whereas with a queue, we're having to deal with opposite ends. We're trying to tack things on at the end, but then return things from the front. So effectively, with the head, we have the index of the beginning of the queue, and the size gives us the index of the end of the queue so that we can retrieve things from the head and add things on to the tail. Whereas with the stack, we were only ever dealing with the top of the stack. We never had to access the bottom of the stack. We only added things to the top and took things off of the top so we didn't need that extra field inside our struct. Does that generally make sense? All right. Yes, Charlotte? [Charlotte, unintelligible] [Hardison] That's a great question, and that was one that came up in lecture. Maybe walking through a few examples will illustrate why we don't want to use strings[0] as the head of the queue. So imagine that we have our queue, we're going to call it queue. At the beginning, when we've just instantiated it, when we've just declared it, we haven't initialized anything. It's all garbage. So of course we want to make sure that we initialize both the size and the head fields to be 0, something reasonable. We could also go ahead and null out the elements in our queue. And to make this diagram fit, notice that now our queue can only hold three elements; whereas our stack could hold four, our queue can only hold three. And that's just to make the diagram fit. The first thing that happens here is we enqueue the string "hi". And just like we did with the stack, nothing terribly different here, we throw the string on at strings[0] and increment our size by 1. We enqueue "bye", it gets put on. So this looks like a stack for the most part. We started off here, new element, new element, size keeps going up. What happens at this point when we want to dequeue something? When we want to dequeue, which is the element that we want to dequeue? [Basil] Strings[0]. >>Zero. Exactly right, Basil. We want to get rid of the first string, this one, "hi". So what was the other thing that changed? Notice when we popped something off of the stack, we just changed the size, but here, we've got a couple of things that change. Not only does the size change, but the head changes. This is going back to Charlotte's point earlier: why do we have this head as well? Does it make sense now, Charlotte? >>Kind of. [Hardison] Kind of? So what happened when we dequeued? What did the head do that now is interesting? [Charlotte] Oh, because it changed--okay. I see. Because the head--where the head is pointing to changes in terms of the location. It's no longer always the zero index one. >>Yeah, exactly. What happened was if dequeueing the high element was done and we didn't have this head field because we were always calling this string at 0 index the head of our queue, then we'd have to shift the rest of the queue down. We'd have to shift "bye" from from strings[1] to the strings[0]. And strings[2] down to strings[1]. And we'd have to do this for the entire list of elements, the entire array of elements. And when we're doing this with an array, that gets really costly. So here, it's not a big deal. We just have three elements in our array. But if we had a queue of a thousand elements or a million elements, and then all of a sudden, we start making a bunch of dequeue calls all in a loop, things are really going to slow down as it shifts everything down constantly. You know, shift by 1, shift by 1, shift by 1, shift by 1. Instead, we use this head, we call it a "pointer" even though it's not really a pointer in the strict sense; it's not a pointer type. It's not an int* or a char* or anything like that. But it's pointing or indicating the head of our queue. Yeah? [Student] How does dequeue know to just pop off whatever is at the head? [Hardison] How does dequeue know how to pop off whatever's at the head? >>Right, yeah. >>What it's looking at is just whatever the head field is set to. So in this first case, if we look right here, our head is 0, index 0. >>Right. [Hardison] So it just says okay, well, the element at index 0, the string "hi", is the element at the head of our queue. So we're going to dequeue that guy. And that will be the element that gets returned to the caller. Yes, Saad? >>So the head basically sets the--where you're going to index it? That's the start of it? >>Yeah. >>Okay. [Hardison] That's becoming the new start for our array. So when you dequeue something, all you have to do is access the element at index q.head, and that will be the element that you want to dequeue. You also have to decrement the size. We'll see in a bit where things get a little tricky with this. We dequeue, and now, if we enqueue again, where do we enqueue? Where does the next element go in our queue? Say we want to enqueue the string "CS". Into which index will it go? [Students] Strings[2]. >>Two. Why 2 and not 0? [Basil] Because now the head is 1, so that's like the start of the list? [Hardison] Right. And what denotes the end of the list? What were we using to denote the end of our queue? The head is the head of our queue, the beginning of our queue. What is the end of our queue? [Students] Size. >>Size, exactly. So our new elements go in at size, and the elements that we take off come off at head. When we enqueue the next element, we're putting it in at size. [Student] Before you put that in though, size was 1, right? [Hardison] Right. So not quite at size. Size +, not +1, but + head. Because we shifted everything by the head amount. So here, now we've got a queue of size 1 that begins at index 1. The tail is index 2. Yes? [Student] What happens when you dequeue strings[0], and the strings' slots in memory just get emptied, basically, or just forgotten? [Hardison] Yeah. In this sense, we're just forgetting them. If we were storing copies of them for-- many data structures will often store their own copies of the elements so that the person managing the data structure doesn't have to worry about where all those pointers are going. The data structure holds on to everything, holds on to all the copies, to make sure that everything persists appropriately. However, in this case, these data structures just, for simplicity, aren't making copies of anything that we're storing in them. [Student] So is this a continuous array of--? >>Yes. If we look back at what the definition was of this structure, it is. It's just a standard array like you've seen, an array of char*s. Does that--? >>Yeah, I was just wondering if you'll eventually run out of memory, to a certain extent, if you have all these empty spots in your array? [Hardison] Yeah, that's a good point. If we look at what's happened now at this point, we've filled up our queue, it looks like. But we haven't really filled up our queue because we have a queue that's size 2, but it begins at index 1, because that's where our head pointer is. Like you were saying, that element at strings[0], at index 0, isn't really there. It's not in our queue anymore. We just didn't bother to go in and overwrite it when we dequeued it. So even though it looks like we've run out of memory, we really haven't. That spot is available for us to use. The appropriate behavior, if we were to try and first dequeue something like "bye", that would pop bye off. Now our queue begins at index 2 and is of size 1. And now if we try and enqueue something again, say 50, 50 should go in this spot at index 0 because it's still available for us. Yes, Saad? [Saad] Does that happen automatically? [Hardison] It doesn't happen quite automatically. You have to do the math to make it work, but essentially what we've done is we've just wrapped around. [Saad] And is it okay if this has a hole in the middle of it? [Hardison] It is if we can make the math work out properly. And it turns out that that's actually not that hard to do with the mod operator. So just like we did with Caesar and the crypto stuff, using mod, we can get things to wrap around and keep going around and around and around with our queue, keeping that head pointer moving around. Notice that size is always respecting the number of elements actually within the queue. And it's just the head pointer that keeps cycling through. If we look at what happened here, if we go back to the beginning, and you just watch what happens to the head when we enqueue something, nothing happened to the head. When we enqueued something else, nothing happened to the head. Soon as we dequeued something, the head goes up by one. We enqueued something, nothing happens to the head. When we dequeue something, all of a sudden the head gets incremented. When we enqueue something, nothing happens to the head. What would happen at this point if we were to dequeue something again? Any thoughts? What would happen to the head? What should happen to the head if we were to dequeue something else? The head right now is at index 2, which means that the head of the queue is strings[2]. [Student] Which returns 0? >>It should return to 0. It should wrap back around, exactly. So far, every time we called dequeue, we've been adding one to the head, add one to the head, add one to the head, add one to the head. As soon as that head pointer gets to the last index in our array, then we have to wrap it back around to the beginning, go back to 0. [Charlotte] What determines the capacity of the queue in a stack? [Hardison] In this case, we've just been using a #defined constant. >>Okay. [Hardison] In the actual .c file, you can go in and muck with it a little bit and make it as big or as little as you want. [Charlotte] So when you're making it a queue, how do you make the computer know how big you want the stack to be? [Hardison] That's a great question. There are a couple of ways. One is to just define it up front and say this is going to be a queue that has 4 elements or 50 elements or 10,000. The other way is to do what the hacker edition folks are doing and create functions to have your queue grow dynamically as more things get added in. [Charlotte] So to go with the first option, what syntax do you use to tell the program what is the size of the queue? [Hardison] Ah. So let's get out of this. I'm still in stack.c here, so I'm just going to scroll up to the top here. Can you see this right here? This is the #define capacity 10. And this is almost the exact same syntax that we have for queue. Except in queue, we've got that extra struct field in here. [Charlotte] Oh, I thought the capacity meant the capacity for the string. [Hardison] Ah. >>That it's the maximum length of the word. >>Got it. Yeah. The capacity here--that's a great point. And this is something that's tricky because what we've declared here is an array of char*s. An array of pointers. This is an array of chars. This is probably what you've seen when you've been declaring your buffers for file I/O, when you've been creating strings manually on the stack. However, what we've got here is an array of char*s. So it's an array of pointers. Actually, if we zoom back out and we look at what's going on here in the presentation, you see that the actual elements, the character data isn't stored within the array itself. What's stored within our array here are pointers to the character data. Okay. So we've seen how the size of the queue is just like with the stack, the size always respects the number of elements currently in the queue. After making 2 enqueues, the size is 2. After making a dequeue the size is now 1. After making another enqueue the size is back up to 2. So the size definitely respects the number of elements in the queue, and then the head just keeps cycling. It goes from 0-1-2, 0-1-2, 0-1-2. And every time we call dequeue, the head pointer gets incremented to the next index. And if the head is about to go over, it loops back around to 0. So with that, we can write the dequeue function. And we're going to leave the enqueue function for you guys to implement instead. When we dequeue an element out of our queue, what was the first thing that Daniel did when we started writing the pop function for stacks? Let me hear from somebody who hasn't spoken yet. Let's see, Saad, do you remember what Daniel did as the first thing when he wrote pop? [Saad] There was, it was-- >>He tested for something. [Saad] If size is greater than 0. >>Exactly. And what was that testing for? [Saad] That was testing to see if there's anything inside the array. [Hardison] Yeah. Exactly. So you can't pop anything out of the stack if it's empty. Likewise, you can't dequeue anything from a queue if it's empty. What is the first thing we should do in our dequeue function here, do you think? [Saad] If size is greater than 0? >>Yeah. In this case, I've actually just tested to see if it is 0. If it is 0, we can return null. But exact same logic. And let's continue with this. If the size is not 0, where is the element that we want to dequeue? [Saad] At the head? >>Exactly. We can just pull out the first element in our queue by accessing the element at the head. Nothing crazy. After that, what should we do? What has to happen? What was the other thing that we talked about in dequeue? Two things have to happen, because our queue has changed. [Daniel] Reduce the size. >>We have to reduce the size, and increase the head? Exactly. To increase the head, we can't just blindly increase the head, remember. We can't just do queue.head++. We have to also include this mod by the capacity. And why do we mod by the capacity, Stella? [Stella] Because it has to wrap around. >>Exactly. We mod by the capacity because it has to wrap back around to 0. So now, at this point, we can do what Daniel said. We can decrement the size. And then we can just return the element that was at the top of the queue. It looks kind of gnarly at first. You might have a question. Sorry? [Sam] Why's first at the top of the queue? Where does that go? [Hardison] It comes from the fourth line from the bottom. After we test to make sure that our queue is not empty, we pull out char* first, we pull out the element that's sitting at the head index of our array, of our strings array, >>and call that first? [Hardison] And we call it first. Yeah. Just to follow up on that, why do you think we had to do that? [Sam] Each first is just returning q.strings[q.head]? >>Yeah. >>Because we're doing this changing of the q.head with the mod function, and there's no way to do that within return line also. [Hardison] Exactly. You're spot on. Sam's totally spot on. The reason we had to pull out the first element in our queue and store it into a variable is because this line where we had just q.head, there's the mod operator in there isn't something that we can do and have it take effect on the head without--in one line. So we actually have to pull out the first element, then adjust the head, adjust the size, and then return the element that we pulled out. And this is something that we'll see come up later with linked lists, as we play around with them. Often when you're freeing or disposing of linked lists you need to remember the next element, the next pointer of a linked list before disposing of the current one. Because otherwise you throw away the information of what's left in the list. Now, if you go to your appliance, you open up queue.c--x out of this. So if I open up queue.c, let me zoom in here, you'll see that you have a similar-looking file. Similar-looking file to what we had earlier with stack.c. We've got our struct for queue defined just as we saw on the slides. We have our enqueue function which is for you to do. And we have the dequeue function here. The dequeue function in the file is unimplemented, but I'll put it back up on the PowerPoint so that you can type it in, if you'd like. So for the next 5 minutes or so, you guys work on enqueue which is almost just the opposite of dequeue. You don't have to adjust head when you're enqueueing, but what do you have to adjust? Size. So when you enqueue, the head stays untouched, the size gets changed. But it does take a little bit of--you will have to play around with that mod to figure out exactly what index the new element should be inserted at. So I'll give you guys a little bit, put dequeue back up on the slide, and as you guys have questions, shout them out so that we can all talk about them as a group. Also, with the size you don't--when you adjust the size, you can always just-- do you have to mod the size ever? [Daniel] No. >>You don't have to mod the size, right. Because the size will always, if you're--assuming you're managing things appropriately, the size will always be between 0 and 3. Where do you have to mod when you're doing enqueue? [Student] Only for the head. >>Only for the head, exactly. And why do you have to mod at all in enqueue? When is a situation in which you'd have to mod? [Student] If you have stuff at spaces, like at spaces 1 and 2, and then you needed to add something at 0. [Hardison] Yeah, exactly. So if your head pointer is at the very end, or if your size plus your head is bigger, or rather, is going to wrap around the queue. So in this situation that we've got up here on the slide right now, if I want to enqueue something right now, we want to enqueue something at index 0. So if you look at where the 50 goes, and I call enqueue 50, it goes down there at the bottom. It goes in index 0. It replaces the 'hi' that was already dequeued. [Daniel] Don't you take care of that in dequeue already? Why does it do anything with the head in enqueue? [Hardison] Oh, so you're not modifying the head, sorry. But you have to use the mod operator when you're accessing the element that you want to enqueue when you're accessing the next element in your queue. [Basil] I didn't do that, and I got "success" on there. [Daniel] Oh, I understand what you're saying. [Hardison] So you didn't--you just did at q.size? [Basil] Yeah. I just changed sides, I didn't do anything with the head. [Hardison] You don't actually have to reset the head to be anything, but when you index into the strings array, you actually have to go ahead and calculate where the next element is, because withe the stack, the next element in your stack was always at the index corresponding to the size. If we look back up at our stack push function, we could always plunk in our new element right at index size. Whereas with the queue, we can't do that because if we're at this situation, if we enqueued 50 our new string would go right at strings[1] which we don't want to do. We want to have the new string go at index 0. Does anybody--yes? [Student] I have a question but it's not really related. What does it mean when someone just calls something like pred pointer? What is that name short for? I know it's just a name. [Hardison] Pred pointer? Let's see. In what context? [Student] It was for the insert. I can ask you later if you want because it's not really related, but I just-- [Hardison] From David's insert code from lecture? We can pull that up and talk about that. We'll talk about that next, once we get to linked lists. So let's really quickly look at what the enqueue function looks like. What was the first thing that people tried to do in your enqueue line? Into this queue? Similar to what you did for stack pushing. What did you do, Stella? [Stella, unintelligible] [Hardison] Exactly. If (q.size == CAPACITY)-- I need to put my braces in the right place--return false. Zoom in a little bit. Okay. Now what's the next thing that we had to do? Just like with the stack, and inserted at the right place. And so what was the right place to insert that? With the stack it was index size, with this it's not quite that. [Daniel] I have q.head--or-- >>q.strings? >>yeah. q.strings[q.head + q.size mod CAPACITY]? [Hardison] We probably want to put parentheses around this so that we're getting the appropriate precedence and so that's cleart to everybody. And set that equal? >>To str? >>To str. Great. And now what's the last thing that we have to do? Just like we did in the stack. >>Increment the size? >>Increment the size. Boom. And then, since the starter code just returned false by default, we want to change this to true if all goes through and all goes well. All right. That's a lot of information for section. We're not quite over. We want to talk really quickly about singly-linked lists. I'll put this up so we can go back to it later. But let's go back to our presentation for just a few more slides. So enqueue is TODO, now we've done it. Now let's take a look at singly-linked lists. We talked about these a little bit more in lecture. How many of you guys saw the demo where we had people awkwardly pointing to each other and holding numbers? >>I was in that. >>What did you guys think? Did that, hopefully demystify these a little bit? With a list, it turns out that we deal with this type that we're going to call a node. Whereas with the queue and the stack we had structs that we'd call queue in stack, we had these new queue in stack types, here a list is really just made up of a bunch of nodes. In the same way that strings are just a bunch of chars all lined up next to each other. A linked list is just a node and another node and another node and another node. And rather than smashing all the nodes together and storing them contiguously all right next to each other in memory, having this next pointer allows us to store the nodes wherever, at random. And then kind of wire them all together to point from one to the next. And what was the big advantage that this had over an array? Over storing everything contiguously just stuck next to each other? You remember? Yeah? >>Dynamic memory allocation? >>Dynamic memory allocation in what sense? [Student] In that you can keep making it bigger and you don't have to move your entire array? [Hardison] Exactly. So with an array, when you want to put a new element into the middle of it, you have to shift everything to make space. And like we talked about with the queue, that's why we keep that head pointer, so that we're not constantly shifting things. Because that gets expensive if you've got a big array and you're constantly doing these random insertions. Whereas with a list, all you have to do is throw it on a new node, adjust the pointers, and you're done. What sucks about this? Aside from the fact that it's not as easy to work with as an array? Yeah? [Daniel] Well, I guess it's much harder to access a specific element in the linked list? [Hardison] You can't just jump to an arbitrary element in the middle of your linked list. How do you have to do it instead? >>You have to step through the entire thing. [Hardison] Yeah. You have to go through one at a time, one at a time. It's a huge--it's a pain. What's the other--there's another downfall to this. [Basil] You can't go forward and backwards? You have to go one direction? [Hardison] Yeah. So how do we solve that, sometimes? [Basil] Doubly-linked lists? >>Exactly. There are doubly-linked lists. There are also--sorry? [Sam] Is that the same as using the pred thing that-- I just remembered, isn't that what the pred thing is for? Isn't that in between doubly and singly? [Hardison] Let's look at what exactly he was doing. So here we go. Here's the list code. Here we have predptr, in here. Is this what you were talking about? So this was--he's freeing a list and he's trying to store a pointer to it. This is not the doubly, singly linked-lists. We can talk more about this later since this is talking about freeing the list and I want to show some other stuff first. but it's just--it's remembering the value of ptr [Student] Oh, it's preceeding pointer? >>Yeah. So that we can then increment ptr itself before we then free what predptr is. Because we can't free ptr and then call ptr=ptr next, right? That would be bad. So let's see, back to this guy. The other bad thing about lists is that whereas with an array we just have all the elements themselves stacked next to each other, here we also have introduced this pointer. So there's an additional chunk of memory that we're having to use for every element that we're storing in our list. We get flexibility, but it comes at a cost. It comes with this time cost, and it comes with this memory cost too. Time in the sense that we now have to go through every element in the array to find the one at index 10, or that would have been index 10 in an array. Just really quickly, when we diagram out these lists, typically we hold on to the head of the list or the first pointer of the list and note that this is a true pointer. It's just 4 bytes. It's not an actual node itself. So you see that it has no int value in it, no next pointer in it. It's literally just a pointer. It's going to point to something that is an actual node struct. [Sam] A pointer called node? >>This is--no. This is a pointer to something of type node. It is a pointer to a node struct. >>Oh, okay. Diagram on the left, code on the right. We can set it to null, which is a good way to start. When you diagram it, you either write it as null or you put a line through it like that. One of the easiest ways to work with lists, and we ask you do both prepend and append to see the differences between the two, but prepending is definitely easier. When you prepend, this is where you--when you prepend(7), you go and create the node struct and you set first to point to it, because now, since we prepended it, it's going to be at the beginning of the list. If we prepend(3), that creates another node, but now 3 comes before 7. So we're essentially pushing things onto our list. Now, you can see that prepend, sometimes people call it push, because you're pushing a new element onto your list. It's also easy to delete at the front of a list. So people will often call that pop. And in that way, you can emulate a stack using a linked list. Whoops. Sorry, now we're getting into append. So here we prepended(7), now we prepend(3). If we prepended something else onto this list, if we prepended(4), then we'd have 4 and then 3 and then 7. So then we could pop and remove 4, remove 3, remove 7. Often the more intuitive way to think about this is with append. So I've diagrammed out what it would look like with append here. Here, appended(7) doesn't look any different because there's only one element in the list. And appending(3) puts it at the end. Maybe you can see right now the trick with append is that since we only know where the beginning of the list is, to append to a list you have to walk all the way through the list to get to the end, stop, then build your node and plunk everything down. Wire all the stuff up. So with prepend, as we just ripped through this really quickly, when you prepend to a list, it's fairly simple. You make your new node, involve some dynamic memory allocation. So here we're making a node struct using malloc. So malloc we're using because that'll set aside memory for us for later because we don't want this--we want this memory to persist for a long time. And we get a pointer to the space in memory that we just allocated. We use size of node, we don't sum the fields. We don't manually generate the number of bytes, instead we use sizeof so that we know we're getting the appropriate number of bytes. We make sure to test that our malloc call succeeded. This is something you want to do in general. On modern machines, running out of memory isn't something that's easy unless you're allocating a ton of stuff and making a huge list, but if you're building stuff for, say, like an iPhone or an Android, you do have limited memory resources, especially if you're doing something intense. So it's good to get into practice. Notice that I've used a couple different functions here that you've seen that are kind of new. So fprintf is just like printf except its first argument is the stream to which you want to print. In this case, we want to print to the standard error string which is different from the standard outstream. By default it shows up in the same place. It also prints out to the terminal, but you can-- using those commands you learned about, the redirection techniques you learned about in Tommy's video for problem set 4, you can direct it to different areas; then exit, right here, exits your program. It's essentially like returning from main, except we use exit because here return won't do anything. We're not in main, so returning doesn't exit the program like we want. So we use the exit function and give it an error code. Then here we set the new node's value field, its i field to be equal to i, and then we wire it up. We set the new node's next pointer to point to first, and then first will now point to the new node. These first lines of code, we're actually building the new node. Not the last two lines of this function but the first ones. You can actually pull out into a function, into a helper function. That's often what I do is, I pull it out into a function, I call it something like build node, and that keeps the prepend function pretty small, it's just 3 lines then. I make a call to my build node function, and then I wire everything up. The final thing I want to show you, and I'll let you do append and all that on your own, is how to iterate over a list. There are a bunch of different ways to iterate over a list. In this case, we're going to find the length of a list. So we start with length = 0. This is very similar to writing strlen for a string. This is what I want to show you, this for loop right here. It looks kinda funky; it's not the usual int i = 0, i < whatever, i ++. Instead it's initializing our variable n to be the beginning of the list. And then while our iterator variable is not null, we keep going. This is because, by convention, the end of our list will be null. And then to increment, rather than doing ++, the linked list equivalent of ++ is n = n->next. I'll let you fill in the gaps here because we're out of time. But keep this in mind as you work on your spellr psets. Linked lists, if you're implementing a hash table, will definitely come in very handy. And having this idiom for looping over things will make life a lot easier, hopefully. Any questions, quickly? [Sam] Will you send out the completed sll and sc? [Hardison] Yeah. I'll send out completed slides and completed sll stack and queue.cs. [CS50.TV]