1 00:00:00,000 --> 00:00:01,000 [Section 6] [More Comfortable] 2 00:00:01,000 --> 00:00:04,000 [Rob Bowden] [Harvard University] 3 00:00:04,000 --> 00:00:09,000 [This is CS50.] [CS50.TV] 4 00:00:09,000 --> 00:00:11,000 >> We can head to our section of questions. 5 00:00:11,000 --> 00:00:17,000 I sent the URL for the space before. 6 00:00:17,000 --> 00:00:22,000 The beginning of the section of the questions say— 7 00:00:22,000 --> 00:00:26,000 apparently I'm not entirely unsick—is a very easy question 8 00:00:26,000 --> 00:00:28,000 of just what is valgrind? 9 00:00:28,000 --> 00:00:30,000 What does valgrind do? 10 00:00:30,000 --> 00:00:34,000 Anyone want to say what valgrind does? 11 00:00:34,000 --> 00:00:36,000 [Student] Checks memory leaks. 12 00:00:36,000 --> 00:00:41,000 Yeah, valgrind is a general memory checker. 13 00:00:41,000 --> 00:00:44,000 It, in the end, tells you if you have any memory leaks, 14 00:00:44,000 --> 00:00:49,000 which is mostly what we're using it for because if you want to 15 00:00:49,000 --> 00:00:54,000 do well in the problem set or if you want to 16 00:00:54,000 --> 00:00:59,000 get on the big board, you need to have no memory leaks whatsoever, 17 00:00:59,000 --> 00:01:01,000 and in case you have a memory leak that you can't find, 18 00:01:01,000 --> 00:01:04,000 also keep in mind that whenever you open a file 19 00:01:04,000 --> 00:01:07,000 and if you don't close it, that's a memory leak. 20 00:01:07,000 --> 00:01:10,000 >> A lot of people are looking for some node that they're not freeing 21 00:01:10,000 --> 00:01:15,000 when really, they didn't close the dictionary in the very first step. 22 00:01:15,000 --> 00:01:19,000 It also tells you if you have any invalid reads or writes, 23 00:01:19,000 --> 00:01:22,000 which means if you try and set a value 24 00:01:22,000 --> 00:01:26,000 that's beyond the end of the heap and it doesn't happen to seg fault 25 00:01:26,000 --> 00:01:30,000 but valgrind catches it, as you shouldn't actually be writing there, 26 00:01:30,000 --> 00:01:33,000 and so you definitely shouldn't have any of those either. 27 00:01:33,000 --> 00:01:38,000 How do you use valgrind? 28 00:01:38,000 --> 00:01:42,000 How do you use valgrind? 29 00:01:42,000 --> 00:01:45,000 >> It's a general question of 30 00:01:45,000 --> 00:01:49,000 kind of run it and look at the output. 31 00:01:49,000 --> 00:01:51,000 The output is overwhelming a lot of times. 32 00:01:51,000 --> 00:01:54,000 There's also fun errors where if you have some terribly wrong thing 33 00:01:54,000 --> 00:01:59,000 happening in a loop, then it will eventually say, "Way too many errors. 34 00:01:59,000 --> 00:02:03,000 I'm going to stop counting now." 35 00:02:03,000 --> 00:02:08,000 It's basically textual output that you have to parse. 36 00:02:08,000 --> 00:02:13,000 In the end, it will tell you any memory leaks that you have, 37 00:02:13,000 --> 00:02:16,000 how many blocks, which can be useful because 38 00:02:16,000 --> 00:02:20,000 if it's one block unfreed, then it's usually easier to find 39 00:02:20,000 --> 00:02:23,000 than 1,000 blocks unfreed. 40 00:02:23,000 --> 00:02:26,000 1,000 blocks unfreed probably means you're not freeing 41 00:02:26,000 --> 00:02:30,000 your linked lists appropriately or something. 42 00:02:30,000 --> 00:02:32,000 That's valgrind. 43 00:02:32,000 --> 00:02:35,000 >> Now we have our section of questions, 44 00:02:35,000 --> 00:02:38,000 which you don't need to download. 45 00:02:38,000 --> 00:02:41,000 You can click on my name and pull them up in the space. 46 00:02:41,000 --> 00:02:44,000 Now click on me. 47 00:02:44,000 --> 00:02:46,000 Revision 1 will be stack, which we're doing first. 48 00:02:46,000 --> 00:02:55,000 Revision 2 will be queue, and Revision 3 will be the singly linked list. 49 00:02:55,000 --> 00:02:58,000 Starting off with our stack. 50 00:02:58,000 --> 00:03:02,000 As it says here, a stack is one of the most basic, 51 00:03:02,000 --> 00:03:07,000 fundamental data structures of computer science. 52 00:03:07,000 --> 00:03:11,000 The very prototypical example is 53 00:03:11,000 --> 00:03:13,000 the stack of trays in the dining hall. 54 00:03:13,000 --> 00:03:16,000 It's basically whenever you are being introduced to a stack, 55 00:03:16,000 --> 00:03:20,000 someone is going to say, "Oh, like a stack of trays." 56 00:03:20,000 --> 00:03:22,000 You stack the trays up. 57 00:03:22,000 --> 00:03:24,000 Then when you go to pull a tray, 58 00:03:24,000 --> 00:03:31,000 the first tray that's getting pulled is the last one that was put on the stack. 59 00:03:31,000 --> 00:03:34,000 The stack also—like it says here— 60 00:03:34,000 --> 00:03:37,000 we have the segment of memory called the stack. 61 00:03:37,000 --> 00:03:40,000 And why is it called the stack? 62 00:03:40,000 --> 00:03:42,000 >> Because like a stack data structure, 63 00:03:42,000 --> 00:03:46,000 it pushes and pops stack frames on the stack, 64 00:03:46,000 --> 00:03:53,000 where stack frames are like a specific call of a function. 65 00:03:53,000 --> 00:03:57,000 And like a stack, you will always have to return 66 00:03:57,000 --> 00:04:03,000 from a function call before you can get down into lower stack frames again. 67 00:04:03,000 --> 00:04:08,000 You can't have main call foo call bar and bar return to main directly. 68 00:04:08,000 --> 00:04:14,000 It's always got to follow the correct stack pushing and popping. 69 00:04:14,000 --> 00:04:18,000 The two operations, like I said, are push and pop. 70 00:04:18,000 --> 00:04:20,000 Those are universal terms. 71 00:04:20,000 --> 00:04:26,000 You should know push and pop in terms of stacks no matter what. 72 00:04:26,000 --> 00:04:28,000 We'll see queues are kind of different. 73 00:04:28,000 --> 00:04:32,000 It doesn't really have a universal term, but push and pop are universal for stacks. 74 00:04:32,000 --> 00:04:34,000 Push is just put on the stack. 75 00:04:34,000 --> 00:04:37,000 Pop is take off the stack. 76 00:04:37,000 --> 00:04:43,000 And we see here we have our typedef struct stack, 77 00:04:43,000 --> 00:04:46,000 so we have char** strings. 78 00:04:46,000 --> 00:04:51,000 Don't get scared by any **. 79 00:04:51,000 --> 00:04:54,000 This is going to end up being an array of strings 80 00:04:54,000 --> 00:04:58,000 or an array of pointers to characters, where 81 00:04:58,000 --> 00:05:00,000 pointers to characters tend to be strings. 82 00:05:00,000 --> 00:05:05,000 It doesn't have to be strings, but here, they're going to be strings. 83 00:05:05,000 --> 00:05:08,000 >> We have an array of strings. 84 00:05:08,000 --> 00:05:14,000 We have a size, which represents how many elements are currently on the stack, 85 00:05:14,000 --> 00:05:19,000 and then we have the capacity, which is how many elements can be on the stack. 86 00:05:19,000 --> 00:05:22,000 The capacity should start off as something greater than 1, 87 00:05:22,000 --> 00:05:27,000 but the size is going to start off as 0. 88 00:05:27,000 --> 00:05:36,000 Now, there are basically three different ways you can think of a stack. 89 00:05:36,000 --> 00:05:39,000 Well, there are probably more, but the two main ways are 90 00:05:39,000 --> 00:05:43,000 you can implement it using an array, or you can implement it using a linked list. 91 00:05:43,000 --> 00:05:48,000 Linked lists are kind of trivial to make stacks from. 92 00:05:48,000 --> 00:05:51,000 It is very easy to make a stack using linked lists, 93 00:05:51,000 --> 00:05:55,000 so here, we're going to make a stack using arrays, 94 00:05:55,000 --> 00:05:59,000 and then using arrays, there's also two ways you can think about it. 95 00:05:59,000 --> 00:06:01,000 Before, when I said we have a capacity for the stack, 96 00:06:01,000 --> 00:06:04,000 so we can fit an element on the stack. 97 00:06:04,000 --> 00:06:09,000 >> The one way it could happen is as soon as you hit 10 elements, then you're done. 98 00:06:09,000 --> 00:06:13,000 You might know that there is an upper bound of 10 things in the world 99 00:06:13,000 --> 00:06:16,000 that you'll never have more than 10 things on your stack, 100 00:06:16,000 --> 00:06:20,000 in which case you can have an upper bound on the size of your stack. 101 00:06:20,000 --> 00:06:23,000 Or you could have your stack be unbounded, 102 00:06:23,000 --> 00:06:27,000 but if you're doing an array, that means that every single time you hit 10 elements, 103 00:06:27,000 --> 00:06:29,000 then you're going to have to grow to 20 elements, and when you hit 20 elements, 104 00:06:29,000 --> 00:06:33,000 you're going to have to grow your array to 30 elements or 40 elements. 105 00:06:33,000 --> 00:06:37,000 You're going to need to increase the capacity, which is what we're going to do here. 106 00:06:37,000 --> 00:06:40,000 Every single time we reach the maximum size of our stack, 107 00:06:40,000 --> 00:06:46,000 when we push something else on, we're going to need to increase the capacity. 108 00:06:46,000 --> 00:06:50,000 Here, we have push declared as bool push (char* str). 109 00:06:50,000 --> 00:06:54,000 Char* str is the string that we are pushing onto the stack, 110 00:06:54,000 --> 00:06:58,000 and bool just says whether we succeeded or failed. 111 00:06:58,000 --> 00:07:00,000 >> How can we fail? 112 00:07:00,000 --> 00:07:04,000 What is the only circumstance that you can think of 113 00:07:04,000 --> 00:07:07,000 where we would need to return false? 114 00:07:07,000 --> 00:07:09,000 Yeah. 115 00:07:09,000 --> 00:07:12,000 [Student] If it's full and we're using a bounded implementation. 116 00:07:12,000 --> 00:07:17,000 Yeah, so how do we define—he answered 117 00:07:17,000 --> 00:07:23,000 if it's full and we're using a bounded implementation. 118 00:07:23,000 --> 00:07:26,000 Then we will definitely return false. 119 00:07:26,000 --> 00:07:31,000 As soon as we hit 10 things in the array, we can't fit 11, so we return false. 120 00:07:31,000 --> 00:07:32,000 What if it is unbounded? Yeah. 121 00:07:32,000 --> 00:07:38,000 If you can't expand the array for some reason. 122 00:07:38,000 --> 00:07:43,000 Yeah, so memory is a limited resource, 123 00:07:43,000 --> 00:07:51,000 and eventually, if we keep pushing things onto the stack over and over again, 124 00:07:51,000 --> 00:07:54,000 we're going to try and allocate a bigger array to fit 125 00:07:54,000 --> 00:07:59,000 the larger capacity, and malloc or whatever we're using is going to return false. 126 00:07:59,000 --> 00:08:02,000 Well, malloc will return null. 127 00:08:02,000 --> 00:08:05,000 >> Remember, every single time you ever call malloc, you should be checking to see if it 128 00:08:05,000 --> 00:08:12,000 returns null or else that is a correctness deduction. 129 00:08:12,000 --> 00:08:17,000 Since we want to have an unbounded stack, 130 00:08:17,000 --> 00:08:21,000 the only case we're going to be returning false is if we try to 131 00:08:21,000 --> 00:08:26,000 increase the capacity and malloc or whatever returns false. 132 00:08:26,000 --> 00:08:30,000 Then pop takes no arguments, 133 00:08:30,000 --> 00:08:37,000 and it returns the string that is on the top of the stack. 134 00:08:37,000 --> 00:08:41,000 Whatever was most recently pushed on the stack is what pop is returning, 135 00:08:41,000 --> 00:08:44,000 and it also removes it from the stack. 136 00:08:44,000 --> 00:08:50,000 And notice that it returns null if there is nothing on the stack. 137 00:08:50,000 --> 00:08:53,000 It is always possible that the stack is empty. 138 00:08:53,000 --> 00:08:55,000 In Java, if you're used to that, or other languages, 139 00:08:55,000 --> 00:09:01,000 trying to pop from an empty stack might cause an exception or something. 140 00:09:01,000 --> 00:09:09,000 >> But in C, null is kind of a lot of the cases how we handle these problems. 141 00:09:09,000 --> 00:09:13,000 Returning null is how we're going to signify that the stack was empty. 142 00:09:13,000 --> 00:09:16,000 We've provided code that will test your stack's functionality, 143 00:09:16,000 --> 00:09:19,000 implement push and pop. 144 00:09:19,000 --> 00:09:23,000 This will not be a lot of code. 145 00:09:23,000 --> 00:09:40,000 I will—actually, before we do that, hint, hint— 146 00:09:40,000 --> 00:09:44,000 if you have not seen it, malloc is not the only function 147 00:09:44,000 --> 00:09:47,000 that allocates memory on the heap for you. 148 00:09:47,000 --> 00:09:51,000 There are a family of alloc functions. 149 00:09:51,000 --> 00:09:53,000 The first is malloc, which you're used to. 150 00:09:53,000 --> 00:09:56,000 Then there's calloc, which does the same thing as malloc, 151 00:09:56,000 --> 00:09:59,000 but it will zero everything out for you. 152 00:09:59,000 --> 00:10:04,000 If you've ever wanted to set everything to null after mallocing something 153 00:10:04,000 --> 00:10:06,000 you should have just used calloc in the first place instead of writing 154 00:10:06,000 --> 00:10:09,000 a for loop to zero out the entire block of memory. 155 00:10:09,000 --> 00:10:15,000 >> Realloc is like malloc and has a lot of special cases, 156 00:10:15,000 --> 00:10:19,000 but basically what realloc does is 157 00:10:19,000 --> 00:10:24,000 it takes a pointer that had already been allocated. 158 00:10:24,000 --> 00:10:27,000 Realloc is the function you want to be paying attention to here. 159 00:10:27,000 --> 00:10:31,000 It takes a pointer that had already been returned from malloc. 160 00:10:31,000 --> 00:10:35,000 Let's say you request from malloc a pointer of 10 bytes. 161 00:10:35,000 --> 00:10:38,000 Then later you realize you wanted 20 bytes, 162 00:10:38,000 --> 00:10:42,000 so you call realloc on that pointer with 20 bytes, 163 00:10:42,000 --> 00:10:47,000 and realloc will automatically copy over everything for you. 164 00:10:47,000 --> 00:10:51,000 If you just called malloc again, like I have a block of 10 bytes. 165 00:10:51,000 --> 00:10:53,000 Now I need a block of 20 bytes, 166 00:10:53,000 --> 00:10:58,000 so if I malloc 20 bytes, then I have to manually copy over the 10 bytes from the first thing 167 00:10:58,000 --> 00:11:01,000 into the second thing and then free the first thing. 168 00:11:01,000 --> 00:11:04,000 Realloc will handle that for you. 169 00:11:04,000 --> 00:11:11,000 >> Notice the signature is going to be void *, 170 00:11:11,000 --> 00:11:15,000 which is just returning a pointer to the block of memory, 171 00:11:15,000 --> 00:11:17,000 then void* ptr. 172 00:11:17,000 --> 00:11:22,000 You can think of void* as a generic pointer. 173 00:11:22,000 --> 00:11:27,000 Generally, you never deal with void*, 174 00:11:27,000 --> 00:11:30,000 but malloc is returning a void*, and then it's just used like 175 00:11:30,000 --> 00:11:34,000 this is actually going to be a char*. 176 00:11:34,000 --> 00:11:37,000 The previous void* that had been returned by malloc 177 00:11:37,000 --> 00:11:41,000 is now going to be passed to realloc, and then size 178 00:11:41,000 --> 00:11:49,000 is the new number of bytes you want to allocate, so your new capacity. 179 00:11:49,000 --> 00:11:57,000 I'll give you a couple minutes, and do it in our space. 180 00:11:57,000 --> 00:12:02,000 Start with Revision 1. 181 00:12:16,000 --> 00:12:21,000 I'll stop you after hopefully about enough time to implement push, 182 00:12:21,000 --> 00:12:24,000 and then I'll give you another break to do pop. 183 00:12:24,000 --> 00:12:27,000 But it really isn't that much code at all. 184 00:12:27,000 --> 00:12:35,000 The most code is probably the expanding stuff, expanding the capacity. 185 00:12:35,000 --> 00:12:39,000 Okay, no pressure to be completely done, 186 00:12:39,000 --> 00:12:47,000 but as long as you feel like you're on the right path, that's good. 187 00:12:47,000 --> 00:12:53,000 >> Does anyone have any code they feel comfortable with me pulling up? 188 00:12:53,000 --> 00:12:59,000 Yeah, I will, but does anyone have any code I can pull up? 189 00:12:59,000 --> 00:13:05,000 Okay, can you start, save it, whatever it is? 190 00:13:05,000 --> 00:13:09,000 I always forget that step. 191 00:13:09,000 --> 00:13:15,000 Okay, looking at push, 192 00:13:15,000 --> 00:13:18,000 do you want to explain your code? 193 00:13:18,000 --> 00:13:24,000 [Student] First of all, I increased the size. 194 00:13:24,000 --> 00:13:28,000 I guess maybe I should have that—anyway, I increased the size, 195 00:13:28,000 --> 00:13:31,000 and I see if it's less than the capacity. 196 00:13:31,000 --> 00:13:36,000 And if it's less than the capacity, I add to the array that we already have. 197 00:13:36,000 --> 00:13:42,000 And if it's not, I multiply the capacity by 2, 198 00:13:42,000 --> 00:13:50,000 and I reallocate the strings array to something with a bigger capacity size now. 199 00:13:50,000 --> 00:13:55,000 And then if that fails, I tell the user and return false, 200 00:13:55,000 --> 00:14:04,000 and if it's fine, then I put the string in the new spot. 201 00:14:04,000 --> 00:14:07,000 >> [Rob B.] Also notice that we used a nice bitwise operator here 202 00:14:07,000 --> 00:14:09,000 to multiply by 2. 203 00:14:09,000 --> 00:14:11,000 Remember, left shift is always going to be multiplied by 2. 204 00:14:11,000 --> 00:14:15,000 Right shift is divided by 2 as long as you remember that it means 205 00:14:15,000 --> 00:14:18,000 divide by 2 as in an integer divided by 2. 206 00:14:18,000 --> 00:14:20,000 It might truncate a 1 here or there. 207 00:14:20,000 --> 00:14:26,000 But shift left by 1 is always going to be multiplied by 2, 208 00:14:26,000 --> 00:14:32,000 unless you overflow the bounds of the integer, and then it won't be. 209 00:14:32,000 --> 00:14:34,000 A side comment. 210 00:14:34,000 --> 00:14:39,000 I like to do—this isn't going to change the coding any way whatsoever, 211 00:14:39,000 --> 00:14:48,000 but I like to do something like this. 212 00:14:48,000 --> 00:14:51,000 It actually is going to make it slightly longer. 213 00:15:04,000 --> 00:15:08,000 Maybe this isn't the perfect case to show this, 214 00:15:08,000 --> 00:15:14,000 but I like to segment it into these blocks of— 215 00:15:14,000 --> 00:15:17,000 okay, if this if happens, then I'm going to do something, 216 00:15:17,000 --> 00:15:19,000 and then the function is done. 217 00:15:19,000 --> 00:15:22,000 I don't need to then scroll my eyes all the way down the function 218 00:15:22,000 --> 00:15:25,000 to see what happens after the else. 219 00:15:25,000 --> 00:15:27,000 It's if this if happens, then I just return. 220 00:15:27,000 --> 00:15:30,000 It also has the nice added benefit of everything beyond this 221 00:15:30,000 --> 00:15:33,000 is now shifted left once. 222 00:15:33,000 --> 00:15:40,000 I no longer need to—if you ever near ridiculously long lines, 223 00:15:40,000 --> 00:15:45,000 then those 4 bytes can help, and also the more left something is, 224 00:15:45,000 --> 00:15:48,000 the less overwhelmed you feel if like—okay, I have to remember 225 00:15:48,000 --> 00:15:53,000 I'm currently in a while loop inside of an else inside of a for loop. 226 00:15:53,000 --> 00:15:58,000 Anywhere you can do this return immediately, I kind of like. 227 00:15:58,000 --> 00:16:05,000 It's totally optional and not expected in any way. 228 00:16:05,000 --> 00:16:12,000 >> [Student] Should there be a size-- in the fail condition? 229 00:16:12,000 --> 00:16:19,000 The fail condition here is we failed to realloc, so yes. 230 00:16:19,000 --> 00:16:22,000 Notice how in the fail condition, presumably, 231 00:16:22,000 --> 00:16:26,000 unless we free stuff later, we're always going to fail 232 00:16:26,000 --> 00:16:29,000 no matter how many times we try to push something. 233 00:16:29,000 --> 00:16:32,000 If we keep pushing, we keep incrementing size, 234 00:16:32,000 --> 00:16:36,000 even though we are not putting anything onto the stack. 235 00:16:36,000 --> 00:16:39,000 Usually we don't increment the size until 236 00:16:39,000 --> 00:16:43,000 after we have successfully put it on the stack. 237 00:16:43,000 --> 00:16:50,000 We would do it, say, either here and here. 238 00:16:50,000 --> 00:16:56,000 And then instead of saying s.size ≤ capacity, it's less than capacity, 239 00:16:56,000 --> 00:17:01,000 only because we moved where everything was. 240 00:17:01,000 --> 00:17:07,000 >> And remember, the only place that we could possibly return false 241 00:17:07,000 --> 00:17:14,000 is here, where realloc returned null, 242 00:17:14,000 --> 00:17:19,000 and if you happen to remember standard error, 243 00:17:19,000 --> 00:17:22,000 maybe you might consider this a case where you want to print a standard error, 244 00:17:22,000 --> 00:17:26,000 so fprintf stderr instead of just printing directly to standard out. 245 00:17:26,000 --> 00:17:31,000 Again, that's not an expectation, but if it's an error, 246 00:17:31,000 --> 00:17:41,000 type printf, then you might want to make it print to standard error instead of standard out. 247 00:17:41,000 --> 00:17:44,000 >> Anyone have anything else to note? Yes. 248 00:17:44,000 --> 00:17:47,000 [Student] Can you go over the [inaudible]? 249 00:17:47,000 --> 00:17:55,000 [Rob B.] Yes, the actual binariness of it or just what it is? 250 00:17:55,000 --> 00:17:57,000 [Student] So you multiply it by 2? 251 00:17:57,000 --> 00:17:59,000 [Rob B.] Yeah, basically. 252 00:17:59,000 --> 00:18:11,000 In binary land, we always have our set of digits. 253 00:18:11,000 --> 00:18:22,000 Shifting this left by 1 basically inserts it here at the right side. 254 00:18:22,000 --> 00:18:25,000 Back to this, just remembering that everything in binary 255 00:18:25,000 --> 00:18:28,000 is a power of 2, so this represents 2 to the 0, 256 00:18:28,000 --> 00:18:30,000 this 2 to the 1, this 2 to the 2. 257 00:18:30,000 --> 00:18:33,000 By inserting a 0 to the right side now, we just shift everything over. 258 00:18:33,000 --> 00:18:38,000 What used to be 2 to the 0 is now 2 to the 1, is 2 to the 2. 259 00:18:38,000 --> 00:18:41,000 The right side that we inserted 260 00:18:41,000 --> 00:18:44,000 is necessarily going to be 0, 261 00:18:44,000 --> 00:18:46,000 which makes sense. 262 00:18:46,000 --> 00:18:49,000 If you ever multiply a number by 2, it's not going to end up odd, 263 00:18:49,000 --> 00:18:54,000 so the 2 to the 0 place should be 0, 264 00:18:54,000 --> 00:18:59,000 and this is what I half warned about before is if you do happen to shift 265 00:18:59,000 --> 00:19:01,000 beyond the number of bits in an integer, 266 00:19:01,000 --> 00:19:04,000 then this 1 is going to end up going off. 267 00:19:04,000 --> 00:19:10,000 That's the only worry if you happen to be dealing with really large capacities. 268 00:19:10,000 --> 00:19:15,000 But at that point, then you're dealing with an array of billions of things, 269 00:19:15,000 --> 00:19:25,000 which might not fit into memory anyway. 270 00:19:25,000 --> 00:19:31,000 >> Now we can get to pop, which is even easier. 271 00:19:31,000 --> 00:19:36,000 You could do it like if you happen to pop a whole bunch, 272 00:19:36,000 --> 00:19:38,000 and now you're at half capacity again. 273 00:19:38,000 --> 00:19:42,000 You could realloc to shrink the amount of memory you have, 274 00:19:42,000 --> 00:19:47,000 but you don't have to worry about that, so the only realloc case is going to be 275 00:19:47,000 --> 00:19:50,000 growing memory, never shrinking memory, 276 00:19:50,000 --> 00:19:59,000 which is going to make pop super easy. 277 00:19:59,000 --> 00:20:02,000 Now queues, which are going to be like stacks, 278 00:20:02,000 --> 00:20:06,000 but the order that you take things out is reversed. 279 00:20:06,000 --> 00:20:10,000 The prototypical example of a queue is a line, 280 00:20:10,000 --> 00:20:12,000 so I guess if you were English, I would have said 281 00:20:12,000 --> 00:20:17,000 a prototypical example of a queue is a queue. 282 00:20:17,000 --> 00:20:22,000 So like a line, if you're the first person in line, 283 00:20:22,000 --> 00:20:24,000 you expect to be the first person out of the line. 284 00:20:24,000 --> 00:20:31,000 If you're the last person in line, you are going to be the last person serviced. 285 00:20:31,000 --> 00:20:35,000 We call that FIFO pattern, whereas stack was LIFO pattern. 286 00:20:35,000 --> 00:20:40,000 Those words are pretty universal. 287 00:20:40,000 --> 00:20:46,000 >> Like stacks and unlike arrays, queues typically don't allow access to elements in the middle. 288 00:20:46,000 --> 00:20:50,000 Here, a stack, we have push and pop. 289 00:20:50,000 --> 00:20:54,000 Here, we happen to have called them enqueue and dequeue. 290 00:20:54,000 --> 00:20:58,000 I have also heard them called shift and unshift. 291 00:20:58,000 --> 00:21:02,000 I've heard people say push and pop to also apply to queues. 292 00:21:02,000 --> 00:21:05,000 I have heard insert, remove, 293 00:21:05,000 --> 00:21:11,000 so push and pop, if you are talking about stacks, you are pushing and popping. 294 00:21:11,000 --> 00:21:16,000 If you're talking about queues, you could pick the words you want to use 295 00:21:16,000 --> 00:21:23,000 for insertion and removal, and there is no consensus on what it should be called. 296 00:21:23,000 --> 00:21:27,000 But here, we have enqueue and dequeue. 297 00:21:27,000 --> 00:21:37,000 Now, the struct looks almost identical to the stack struct. 298 00:21:37,000 --> 00:21:40,000 But we have to keep track of head. 299 00:21:40,000 --> 00:21:44,000 I guess it says down here, but why do we need the head? 300 00:21:53,000 --> 00:21:57,000 The prototypes are basically identical to push and pop. 301 00:21:57,000 --> 00:21:59,000 You can think of it as push and pop. 302 00:21:59,000 --> 00:22:08,000 The only difference is pop is returning—instead of the last, it's returning the first. 303 00:22:08,000 --> 00:22:12,000 2, 1, 3, 4, or something. 304 00:22:12,000 --> 00:22:14,000 And here's the start. 305 00:22:14,000 --> 00:22:17,000 Our queue is completely full, so there's four elements in it. 306 00:22:17,000 --> 00:22:21,000 The end of our queue is currently 2, 307 00:22:21,000 --> 00:22:24,000 and now we go to insert something else. 308 00:22:24,000 --> 00:22:29,000 >> When we want to insert that something else, what we did for the stack version 309 00:22:29,000 --> 00:22:36,000 is we extended our block of memory. 310 00:22:36,000 --> 00:22:40,000 What is the problem with this? 311 00:22:40,000 --> 00:22:45,000 [Student] You move the 2. 312 00:22:45,000 --> 00:22:51,000 What I said before about the end of the queue, 313 00:22:51,000 --> 00:22:57,000 this doesn't make sense that we start at 1, 314 00:22:57,000 --> 00:23:01,000 then we want to dequeue 1, then dequeue 3, then dequeue 4, 315 00:23:01,000 --> 00:23:05,000 then dequeue 2, then dequeue this one. 316 00:23:05,000 --> 00:23:08,000 We can't use realloc now, 317 00:23:08,000 --> 00:23:11,000 or at the very least, you have to use realloc in a different way. 318 00:23:11,000 --> 00:23:15,000 But you probably shouldn't just use realloc. 319 00:23:15,000 --> 00:23:18,000 You are going to have to manually copy your memory. 320 00:23:18,000 --> 00:23:21,000 >> There are two functions to copy memory. 321 00:23:21,000 --> 00:23:25,000 There's memcopy and memmove. 322 00:23:25,000 --> 00:23:29,000 I'm currently reading the man pages to see which one you're going to want to use. 323 00:23:29,000 --> 00:23:35,000 Okay, memcopy, the difference is 324 00:23:35,000 --> 00:23:38,000 that memcopy and memmove, one handles the case correctly 325 00:23:38,000 --> 00:23:41,000 where you're copying into a region that happens to overlap the region 326 00:23:41,000 --> 00:23:46,000 you're copying from. 327 00:23:46,000 --> 00:23:50,000 Memcopy does not handle it. Memmove does. 328 00:23:50,000 --> 00:23:59,000 You can think of the problem as— 329 00:23:59,000 --> 00:24:09,000 let's say I want to copy this guy, 330 00:24:09,000 --> 00:24:13,000 these four to this guy over. 331 00:24:13,000 --> 00:24:16,000 In the end, what the array should look like 332 00:24:16,000 --> 00:24:26,000 after the copy is 2, 1, 2, 1, 3, 4, and then some stuff at the end. 333 00:24:26,000 --> 00:24:29,000 But this is dependent on the order in which we actually copy, 334 00:24:29,000 --> 00:24:32,000 since if we don't consider the fact that the region we're copying into 335 00:24:32,000 --> 00:24:35,000 overlaps the one we're copying from, 336 00:24:35,000 --> 00:24:46,000 then we might do like start here, copy the 2 into the place we want to go, 337 00:24:46,000 --> 00:24:52,000 then move our pointers forward. 338 00:24:52,000 --> 00:24:56,000 >> Now we're going to be here and here, and now we want to copy 339 00:24:56,000 --> 00:25:04,000 this guy over this guy and move our pointers forward. 340 00:25:04,000 --> 00:25:07,000 What we're going to end up getting is 2, 1, 2, 1, 2, 1 341 00:25:07,000 --> 00:25:10,000 instead of the appropriate 2, 1, 2, 1, 3, 4 because 342 00:25:10,000 --> 00:25:15,000 2, 1 overrode the original 3, 4. 343 00:25:15,000 --> 00:25:19,000 Memmove handles that correctly. 344 00:25:19,000 --> 00:25:23,000 In this case, basically just always use memmove 345 00:25:23,000 --> 00:25:26,000 because it handles it correctly. 346 00:25:26,000 --> 00:25:29,000 It generally does not perform any worse. 347 00:25:29,000 --> 00:25:32,000 The idea is instead of starting from the beginning and copying this way 348 00:25:32,000 --> 00:25:35,000 like we just did here, it starts from the end and copies in, 349 00:25:35,000 --> 00:25:38,000 and in that case, you can never have a problem. 350 00:25:38,000 --> 00:25:40,000 There is no performance lost. 351 00:25:40,000 --> 00:25:47,000 Always use memmove. Never worry about memcopy. 352 00:25:47,000 --> 00:25:51,000 And that's where you're going to have to separately memmove 353 00:25:51,000 --> 00:26:01,000 the wrapped-around portion of your queue. 354 00:26:01,000 --> 00:26:04,000 No worries if not completely done. 355 00:26:04,000 --> 00:26:10,000 This is more difficult than stack, push, and pop. 356 00:26:10,000 --> 00:26:15,000 >> Anyone have any code we could work with? 357 00:26:15,000 --> 00:26:21,000 Even if completely incomplete? 358 00:26:21,000 --> 00:26:23,000 [Student] Yeah, it's completely incomplete, though. 359 00:26:23,000 --> 00:26:27,000 Completely incomplete is fine as long as we—can you save the revision? 360 00:26:27,000 --> 00:26:32,000 I forget that every single time. 361 00:26:32,000 --> 00:26:39,000 Okay, ignoring what happens when we need to resize things. 362 00:26:39,000 --> 00:26:42,000 Completely ignore resize. 363 00:26:42,000 --> 00:26:49,000 Explain this code. 364 00:26:49,000 --> 00:26:54,000 I'm checking first of all if the size is less than the copy first of all 365 00:26:54,000 --> 00:27:01,000 and then after that, I insert—I take head + size, 366 00:27:01,000 --> 00:27:05,000 and I make sure it wraps around the capacity of the array, 367 00:27:05,000 --> 00:27:08,000 and I insert the new string at that position. 368 00:27:08,000 --> 00:27:12,000 Then I increase the size and return true. 369 00:27:12,000 --> 00:27:22,000 >> [Rob B.] This is definitely one of those cases where you're going to want to be using mod. 370 00:27:22,000 --> 00:27:25,000 Any kind of case where you have wrapping around, if you think wrapping around, 371 00:27:25,000 --> 00:27:29,000 the immediate thought should be mod. 372 00:27:29,000 --> 00:27:36,000 As a quick optimization/make your code one line shorter, 373 00:27:36,000 --> 00:27:42,000 you notice that the line immediately following this one 374 00:27:42,000 --> 00:27:53,000 is just size++, so you merge that into this line, size++. 375 00:27:53,000 --> 00:27:58,000 Now down here, we have the case 376 00:27:58,000 --> 00:28:01,000 where we do not have enough memory, 377 00:28:01,000 --> 00:28:05,000 so we are increasing our capacity by 2. 378 00:28:05,000 --> 00:28:09,000 I guess you could have the same problem here, but we can ignore it now, 379 00:28:09,000 --> 00:28:13,000 where if you failed to increase your capacity, 380 00:28:13,000 --> 00:28:18,000 then you're going to want to decrease your capacity by 2 again. 381 00:28:18,000 --> 00:28:24,000 Another short note is just like you can do +=, 382 00:28:24,000 --> 00:28:30,000 you can also do <<=. 383 00:28:30,000 --> 00:28:43,000 Almost anything can go before equals, +=, |=, &=, <<=. 384 00:28:43,000 --> 00:28:52,000 Char* new is our new block of memory. 385 00:28:52,000 --> 00:28:55,000 Oh, over here. 386 00:28:55,000 --> 00:29:02,000 >> What do people think about the type of our new block of memory? 387 00:29:02,000 --> 00:29:06,000 [Student] It should be char**. 388 00:29:06,000 --> 00:29:12,000 Thinking back to our struct up here, 389 00:29:12,000 --> 00:29:14,000 strings is what we are reallocating. 390 00:29:14,000 --> 00:29:21,000 We are making an entire new dynamic storage for the elements in the queue. 391 00:29:21,000 --> 00:29:25,000 What we're going to be assigning to your strings is what we're mallocing right now, 392 00:29:25,000 --> 00:29:30,000 and so new is going to be a char**. 393 00:29:30,000 --> 00:29:34,000 It's going to be an array of strings. 394 00:29:34,000 --> 00:29:38,000 Then what is the case under which we're going to return false? 395 00:29:38,000 --> 00:29:41,000 [Student] Should we be doing the char*? 396 00:29:41,000 --> 00:29:44,000 [Rob B.] Yes, good call. 397 00:29:44,000 --> 00:29:46,000 [Student] What was that? 398 00:29:46,000 --> 00:29:49,000 [Rob B.] We wanted to do size of char* because we are no longer— 399 00:29:49,000 --> 00:29:53,000 this would actually be a very big problem because sizeof(char) would be 1. 400 00:29:53,000 --> 00:29:55,000 Sizeof char* is going to be 4, 401 00:29:55,000 --> 00:29:58,000 so a lot of times when you're dealing with ints, 402 00:29:58,000 --> 00:30:01,000 you tend to get away with it because size of int and size of int* 403 00:30:01,000 --> 00:30:04,000 on a 32-bit system are going to be the same thing. 404 00:30:04,000 --> 00:30:09,000 But here, sizeof(char) and sizeof(char*) are now going to be the same thing. 405 00:30:09,000 --> 00:30:15,000 >> What is the circumstance where we return false? 406 00:30:15,000 --> 00:30:17,000 [Student] New is null. 407 00:30:17,000 --> 00:30:23,000 Yeah, if new is null, we return false, 408 00:30:23,000 --> 00:30:34,000 and I'm going to throw down here— 409 00:30:34,000 --> 00:30:37,000 [Student] [inaudible] 410 00:30:37,000 --> 00:30:39,000 [Rob B.] Yeah, this is fine. 411 00:30:39,000 --> 00:30:46,000 You could either do 2 times capacity or capacity shift 1 and then only set it down here or whatever. 412 00:30:46,000 --> 00:30:52,000 We'll do it as we had it. 413 00:30:52,000 --> 00:30:56,000 Capacity >>= 1. 414 00:30:56,000 --> 00:31:08,000 And you're never going to have to worry about losing the 1's place 415 00:31:08,000 --> 00:31:12,000 because you left shifted by 1, so the 1's place is necessarily a 0, 416 00:31:12,000 --> 00:31:16,000 so right shifting by 1, you're still going to be fine. 417 00:31:16,000 --> 00:31:19,000 [Student] Do you need to do that before return? 418 00:31:19,000 --> 00:31:29,000 [Rob B.] Yes, this makes absolutely no sense. 419 00:31:29,000 --> 00:31:36,000 >> Now assume we're going to end up returning true to the end. 420 00:31:36,000 --> 00:31:39,000 The way we're going to do these memmoves, 421 00:31:39,000 --> 00:31:45,000 we need to be careful with how we do them. 422 00:31:45,000 --> 00:31:50,000 Does anyone have any suggestions for how we do them? 423 00:32:17,000 --> 00:32:21,000 Here's our start. 424 00:32:21,000 --> 00:32:28,000 Inevitably, we want to start at the beginning again 425 00:32:28,000 --> 00:32:35,000 and copy things in from there, 1, 3, 4, 2. 426 00:32:35,000 --> 00:32:41,000 How do you do that? 427 00:32:41,000 --> 00:32:52,000 First, I have to look at the man page for memmove again. 428 00:32:52,000 --> 00:32:57,000 Memmove, order of arguments is always important. 429 00:32:57,000 --> 00:33:01,000 We want our destination first, source second, size third. 430 00:33:01,000 --> 00:33:06,000 There are a lot of functions which reverse source and destination. 431 00:33:06,000 --> 00:33:11,000 Destination, source tends to be consistent somewhat. 432 00:33:17,000 --> 00:33:21,000 Move, what is it returning? 433 00:33:21,000 --> 00:33:27,000 It returns a pointer to destination, for whatever reason you might want that. 434 00:33:27,000 --> 00:33:32,000 I can picture read it, but we want to move into our destination. 435 00:33:32,000 --> 00:33:35,000 >> What is our destination going to be? 436 00:33:35,000 --> 00:33:37,000 [Student] New. 437 00:33:37,000 --> 00:33:39,000 [Rob B.] Yes, and where are we copying from? 438 00:33:39,000 --> 00:33:43,000 The first thing we are copying is this 1, 3, 4. 439 00:33:43,000 --> 00:33:50,000 What is the—this 1, 3, 4. 440 00:33:50,000 --> 00:33:55,000 What is the address of this 1? 441 00:33:55,000 --> 00:33:58,000 What is the address of that 1? 442 00:33:58,000 --> 00:34:01,000 [Student] [inaudible] 443 00:34:01,000 --> 00:34:03,000 [Rob B.] Head + the address of the first element. 444 00:34:03,000 --> 00:34:05,000 How do we get the first element in the array? 445 00:34:05,000 --> 00:34:10,000 [Student] Queue. 446 00:34:10,000 --> 00:34:15,000 [Rob B.] Yes, q.strings. 447 00:34:15,000 --> 00:34:20,000 Remember, here, our head is 1. 448 00:34:20,000 --> 00:34:24,000 Darn it. I just think it's magically— 449 00:34:24,000 --> 00:34:29,000 Here, our head is 1. I'm going to change my color too. 450 00:34:29,000 --> 00:34:36,000 And here is strings. 451 00:34:36,000 --> 00:34:41,000 This, we can either write it as we did over here 452 00:34:41,000 --> 00:34:43,000 with heads + q.strings. 453 00:34:43,000 --> 00:34:51,000 A lot of people also write it &q.strings[head]. 454 00:34:51,000 --> 00:34:55,000 This isn't really any less efficient. 455 00:34:55,000 --> 00:34:58,000 You might think of it as you are dereferencing it and then getting the address of, 456 00:34:58,000 --> 00:35:04,000 but the compiler is going to translate it to what we had before anyway, q.strings + head. 457 00:35:04,000 --> 00:35:06,000 Either way you want to think of it. 458 00:35:06,000 --> 00:35:11,000 >> And how many bytes do we want to copy? 459 00:35:11,000 --> 00:35:15,000 [Student] Capacity - head. 460 00:35:15,000 --> 00:35:18,000 Capacity - head. 461 00:35:18,000 --> 00:35:21,000 And then you could always write out an example 462 00:35:21,000 --> 00:35:23,000 to figure out if that's right. 463 00:35:23,000 --> 00:35:26,000 [Student] It needs to be divided by 2 then. 464 00:35:26,000 --> 00:35:30,000 Yeah, so I guess we could use size. 465 00:35:30,000 --> 00:35:35,000 We still have size being— 466 00:35:35,000 --> 00:35:39,000 using size, we have size equal to 4. 467 00:35:39,000 --> 00:35:42,000 Our size is 4. Our head is 1. 468 00:35:42,000 --> 00:35:46,000 We want to copy these 3 elements. 469 00:35:46,000 --> 00:35:54,000 That's the sanity check that size - head is correctly 3. 470 00:35:54,000 --> 00:35:58,000 And coming back here, like we said before, 471 00:35:58,000 --> 00:36:00,000 if we used capacity, then we'd have to divide by 2 472 00:36:00,000 --> 00:36:04,000 because we've already grown our capacity, so instead, we're going to use size. 473 00:36:11,000 --> 00:36:13,000 That copies that portion. 474 00:36:13,000 --> 00:36:18,000 Now, we need to copy the other portion, the portion that is left of the start. 475 00:36:18,000 --> 00:36:28,000 >> That's going to memmove into what position? 476 00:36:28,000 --> 00:36:32,000 [Student] Plus size - head. 477 00:36:32,000 --> 00:36:38,000 Yes, so we have already copied in size - head bytes, 478 00:36:38,000 --> 00:36:43,000 and so where we want to copy the remaining bytes is new 479 00:36:43,000 --> 00:36:48,000 and then size minus—well, the number of bytes we've already copied in. 480 00:36:48,000 --> 00:36:52,000 And then where are we copying from? 481 00:36:52,000 --> 00:36:54,000 [Student] Q.strings[0]. 482 00:36:54,000 --> 00:36:56,000 [Rob B.] Yes, q.strings. 483 00:36:56,000 --> 00:37:02,000 We could either do &q.strings[0]. 484 00:37:02,000 --> 00:37:05,000 This is significantly less common than this. 485 00:37:05,000 --> 00:37:14,000 If it's just going to be 0, then you'll tend to see q.strings. 486 00:37:14,000 --> 00:37:16,000 That's where we're copying from. 487 00:37:16,000 --> 00:37:18,000 How many bytes do we have left to copy?>>[Student] 10. 488 00:37:18,000 --> 00:37:20,000 Right. 489 00:37:20,000 --> 00:37:25,000 [Student] Do we have to multiply 5 - 10 times the size of the bytes or something? 490 00:37:25,000 --> 00:37:30,000 Yeah, so this is where—what exactly are we copying? 491 00:37:30,000 --> 00:37:32,000 [Student] [inaudible] 492 00:37:32,000 --> 00:37:34,000 What is the type of the thing we're copying? 493 00:37:34,000 --> 00:37:36,000 [Student] [inaudible] 494 00:37:36,000 --> 00:37:41,000 Yeah, so the char*s that we're copying, we don't know where those are coming from. 495 00:37:41,000 --> 00:37:47,000 Well, where they're pointing to, like the strings, we end up pushing it onto the queue 496 00:37:47,000 --> 00:37:49,000 or enqueuing onto the queue. 497 00:37:49,000 --> 00:37:51,000 Where those are coming from, we have no idea. 498 00:37:51,000 --> 00:37:56,000 We just need to keep track of the char*s themselves. 499 00:37:56,000 --> 00:38:00,000 We don't want to copy size - head bytes. 500 00:38:00,000 --> 00:38:03,000 We want to copy size - head char*s, 501 00:38:03,000 --> 00:38:11,000 so we're going to multiply this by sizeof(char*). 502 00:38:11,000 --> 00:38:17,000 Same down here, head * sizeof(char*). 503 00:38:17,000 --> 00:38:24,000 >> [Student] What about [inaudible]? 504 00:38:24,000 --> 00:38:26,000 This right here? 505 00:38:26,000 --> 00:38:28,000 [Student] No, below that, the size - head. 506 00:38:28,000 --> 00:38:30,000 [Rob B.] This right here? 507 00:38:30,000 --> 00:38:32,000 Pointer arithmetic. 508 00:38:32,000 --> 00:38:35,000 How pointer arithmetic is going to work is 509 00:38:35,000 --> 00:38:40,000 it automatically multiplies by the size of the type that we're dealing with. 510 00:38:40,000 --> 00:38:46,000 Just like over here, new + (size - head) 511 00:38:46,000 --> 00:38:56,000 is exactly equivalent to &new[size - head] 512 00:38:56,000 --> 00:39:00,000 until we expect that to work correctly, 513 00:39:00,000 --> 00:39:04,000 since if we're dealing with an int array, then we don't index by int— 514 00:39:04,000 --> 00:39:07,000 or if it's of size of 5 and you want the 4th element, then we index into the 515 00:39:07,000 --> 00:39:10,000 int array[4]. 516 00:39:10,000 --> 00:39:14,000 You don't—[4] * size of int. 517 00:39:14,000 --> 00:39:21,000 That handles it automatically, and this case 518 00:39:21,000 --> 00:39:29,000 is literally equivalent, so the bracket syntax 519 00:39:29,000 --> 00:39:34,000 is just going to be converted to this as soon as you compile. 520 00:39:34,000 --> 00:39:38,000 That's something you need to be careful of that 521 00:39:38,000 --> 00:39:42,000 when you are adding size - head 522 00:39:42,000 --> 00:39:45,000 you are adding not one byte. 523 00:39:45,000 --> 00:39:53,000 You're adding one char*, which can be one bytes or whatever. 524 00:39:53,000 --> 00:39:56,000 >> Other questions? 525 00:39:56,000 --> 00:40:04,000 Okay, dequeue is going to be easier. 526 00:40:04,000 --> 00:40:11,000 I'll give you a minute to implement. 527 00:40:11,000 --> 00:40:18,000 Oh, and I guess this is the same situation where 528 00:40:18,000 --> 00:40:21,000 what the enqueue case, if we're enqueuing null, 529 00:40:21,000 --> 00:40:24,000 maybe we want to handle it, maybe we don't. 530 00:40:24,000 --> 00:40:27,000 We won't do it again here, but same as our stack case. 531 00:40:27,000 --> 00:40:34,000 If we enqueue null, we might want to disregard it. 532 00:40:34,000 --> 00:40:40,000 Anyone have some code I can pull up? 533 00:40:40,000 --> 00:40:45,000 [Student] I just have dequeue. 534 00:40:45,000 --> 00:40:56,000 Version 2 is that—okay. 535 00:40:56,000 --> 00:40:59,000 You want to explain? 536 00:40:59,000 --> 00:41:01,000 [Student] First, you make sure there's something in the queue 537 00:41:01,000 --> 00:41:07,000 and that the size is going down by 1. 538 00:41:07,000 --> 00:41:11,000 You need to do that, and then you return the head 539 00:41:11,000 --> 00:41:13,000 and then move the head up 1. 540 00:41:13,000 --> 00:41:19,000 Okay, so there is a corner case we have to consider. Yeah. 541 00:41:19,000 --> 00:41:24,000 [Student] If your head is at the last element, 542 00:41:24,000 --> 00:41:26,000 then you don't want head to point outside of the array. 543 00:41:26,000 --> 00:41:29,000 >> Yeah, so as soon as head hits the end of our array, 544 00:41:29,000 --> 00:41:35,000 when we dequeue, our head should be modded back to 0. 545 00:41:35,000 --> 00:41:40,000 Unfortunately, we can't do that in one step. 546 00:41:40,000 --> 00:41:44,000 I guess the way I'd probably fix it is 547 00:41:44,000 --> 00:41:52,000 this is going to be a char*, what we're returning, 548 00:41:52,000 --> 00:41:55,000 whatever your variable name wants to be. 549 00:41:55,000 --> 00:42:02,000 Then we want to mod head by our capacity 550 00:42:02,000 --> 00:42:10,000 and then return ret. 551 00:42:10,000 --> 00:42:14,000 A lot of people here they might do— 552 00:42:14,000 --> 00:42:19,000 this is the case of—you'll see people do if head 553 00:42:19,000 --> 00:42:29,000 is greater than capacity, do head - capacity. 554 00:42:29,000 --> 00:42:36,000 And that's just working around what mod is. 555 00:42:36,000 --> 00:42:41,000 Head mod = capacity is much cleaner 556 00:42:41,000 --> 00:42:51,000 of a wrapping around than if head greater than capacity head - capacity. 557 00:42:51,000 --> 00:42:56,000 >> Questions? 558 00:42:56,000 --> 00:43:02,000 Okay, the last thing we have left is our linked list. 559 00:43:02,000 --> 00:43:07,000 You might be used to some of the linked list behavior if you did 560 00:43:07,000 --> 00:43:11,000 linked lists in your hash tables, if you did a hash table. 561 00:43:11,000 --> 00:43:15,000 I strongly recommend doing a hash table. 562 00:43:15,000 --> 00:43:17,000 You might have already done a trie, 563 00:43:17,000 --> 00:43:23,000 but tries are more difficult. 564 00:43:23,000 --> 00:43:27,000 In theory, they're asymptotically better. 565 00:43:27,000 --> 00:43:30,000 But just look at the big board, 566 00:43:30,000 --> 00:43:35,000 and tries never do better, and they take up more memory. 567 00:43:35,000 --> 00:43:43,000 Everything about tries ends up being worse for more work. 568 00:43:43,000 --> 00:43:49,000 It's what David Malan's solution always is 569 00:43:49,000 --> 00:43:56,000 is he always posts his trie solution, and let's see where he currently is. 570 00:43:56,000 --> 00:44:00,000 What was he under, David J? 571 00:44:00,000 --> 00:44:06,000 He's #18, so that's not terribly bad, 572 00:44:06,000 --> 00:44:09,000 and that's going to be one of the best tries you can think of 573 00:44:09,000 --> 00:44:17,000 or one of the best tries of a trie. 574 00:44:17,000 --> 00:44:23,000 Is it not even his original solution? 575 00:44:23,000 --> 00:44:29,000 I feel like trie solutions tend to be more in this range of RAM usage. 576 00:44:29,000 --> 00:44:33,000 >> Go down to the very top, and RAM usage is in the single digits. 577 00:44:33,000 --> 00:44:36,000 Go down towards the bottom, and then you start seeing tries 578 00:44:36,000 --> 00:44:41,000 where you get absolutely massive RAM usage, 579 00:44:41,000 --> 00:44:45,000 and tries are more difficult. 580 00:44:45,000 --> 00:44:53,000 Not entirely worth it but an educational experience if you did one. 581 00:44:53,000 --> 00:44:56,000 The last thing is our linked list, 582 00:44:56,000 --> 00:45:04,000 and these three things, stacks, queues, and linked lists, 583 00:45:04,000 --> 00:45:09,000 any future thing you ever do in computer science 584 00:45:09,000 --> 00:45:12,000 will assume you have familiarity with these things. 585 00:45:12,000 --> 00:45:19,000 They are just so fundamental to everything. 586 00:45:19,000 --> 00:45:25,000 >> Linked lists, and here we have a singly linked list is going to be our implementation. 587 00:45:25,000 --> 00:45:34,000 What does singly linked mean as opposed to doubly linked? Yes. 588 00:45:34,000 --> 00:45:37,000 [Student] It only points to the next pointer rather than to the pointers, 589 00:45:37,000 --> 00:45:39,000 like the one preceding it and the one after it. 590 00:45:39,000 --> 00:45:44,000 Yeah, so in picture format, what did I just do? 591 00:45:44,000 --> 00:45:48,000 I have two things. I have picture and picture. 592 00:45:48,000 --> 00:45:51,000 In picture format, our singly linked lists, 593 00:45:51,000 --> 00:45:57,000 inevitably, we have some kind of pointer to the head of our list, 594 00:45:57,000 --> 00:46:02,000 and then within our list, we just have pointers, 595 00:46:02,000 --> 00:46:05,000 and maybe this points to null. 596 00:46:05,000 --> 00:46:08,000 It's going to be your typical drawing of a singly linked list. 597 00:46:08,000 --> 00:46:14,000 A doubly linked list, you can go backwards. 598 00:46:14,000 --> 00:46:19,000 If I give you any node in the list, then you can necessarily get to 599 00:46:19,000 --> 00:46:23,000 any other node in the list if it is a doubly linked list. 600 00:46:23,000 --> 00:46:27,000 But if I get you the third node in the list and it's a singly linked list, 601 00:46:27,000 --> 00:46:30,000 no way you're ever going to get to the first and second nodes. 602 00:46:30,000 --> 00:46:34,000 And there's benefits and detriments, and one obvious one 603 00:46:34,000 --> 00:46:42,000 is you take up more size, and you have to keep track of where these things are pointing now. 604 00:46:42,000 --> 00:46:49,000 But we only care about singly linked. 605 00:46:49,000 --> 00:46:53,000 >> A few things we're going to have to implement. 606 00:46:53,000 --> 00:47:00,000 Your typedef struct node, int i: struct node* next; node. 607 00:47:00,000 --> 00:47:09,000 That typedef should be burned into your minds. 608 00:47:09,000 --> 00:47:14,000 Quiz 1 should be like give a typedef of a linked list node, 609 00:47:14,000 --> 00:47:18,000 and you should be able to immediately scribble that down 610 00:47:18,000 --> 00:47:22,000 without even thinking about it. 611 00:47:22,000 --> 00:47:27,000 I guess a couple questions, why do we need struct here? 612 00:47:27,000 --> 00:47:32,000 Why can't we say node*? 613 00:47:32,000 --> 00:47:35,000 [Student] [inaudible] 614 00:47:35,000 --> 00:47:38,000 Yeah. 615 00:47:38,000 --> 00:47:44,000 The only thing that defines a node as a thing 616 00:47:44,000 --> 00:47:47,000 is the typedef itself. 617 00:47:47,000 --> 00:47:55,000 But as of this point, when we're kind of parsing through this struct node definition, 618 00:47:55,000 --> 00:48:01,000 we have not finished our typedef yet, so since the typedef has not finished, 619 00:48:01,000 --> 00:48:05,000 node does not exist. 620 00:48:05,000 --> 00:48:12,000 But struct node does, and this node in here, 621 00:48:12,000 --> 00:48:14,000 this could also be called anything else. 622 00:48:14,000 --> 00:48:16,000 This could be called n. 623 00:48:16,000 --> 00:48:19,000 It could be called linked list node. 624 00:48:19,000 --> 00:48:21,000 It could be called anything. 625 00:48:21,000 --> 00:48:26,000 But this struct node needs to be called the same thing as this struct node. 626 00:48:26,000 --> 00:48:29,000 What you call this has to also be here, 627 00:48:29,000 --> 00:48:32,000 and so that also answers the second point of the question 628 00:48:32,000 --> 00:48:37,000 which is why—a lot of times when you see structs and typedefs of structs, 629 00:48:37,000 --> 00:48:42,000 you'll see anonymous structs where you'll just see typedef struct, 630 00:48:42,000 --> 00:48:47,000 implementation of struct, dictionary, or whatever. 631 00:48:47,000 --> 00:48:51,000 >> Why here do we need to say node? 632 00:48:51,000 --> 00:48:54,000 Why can't it be an anonymous struct? 633 00:48:54,000 --> 00:48:56,000 It's almost the same answer. 634 00:48:56,000 --> 00:48:58,000 [Student] You need to refer to it within the struct. 635 00:48:58,000 --> 00:49:04,000 Yeah, within the struct, you need to refer to the struct itself. 636 00:49:04,000 --> 00:49:10,000 If you don't give the struct a name, if it's an anonymous struct, you can't refer to it. 637 00:49:10,000 --> 00:49:17,000 And last but not least—these should all be somewhat straightforward, 638 00:49:17,000 --> 00:49:20,000 and they should help you realize if you're writing this down 639 00:49:20,000 --> 00:49:24,000 that you're doing something wrong if these sorts of things don't make sense. 640 00:49:24,000 --> 00:49:28,000 Last but not least, why does this have to be struct node*? 641 00:49:28,000 --> 00:49:34,000 Why can't it just be struct node next? 642 00:49:34,000 --> 00:49:37,000 [Student] Pointer to the next struct. 643 00:49:37,000 --> 00:49:39,000 That's inevitably what we want. 644 00:49:39,000 --> 00:49:42,000 Why could it never be struct node next? 645 00:49:42,000 --> 00:49:50,000 Why does it have to be struct node* next? Yeah. 646 00:49:50,000 --> 00:49:53,000 [Student] It's like an infinite loop. 647 00:49:53,000 --> 00:49:55,000 Yeah. 648 00:49:55,000 --> 00:49:57,000 [Student] It would all be in one. 649 00:49:57,000 --> 00:50:02,000 Yeah, just think of how we would do size of or something. 650 00:50:02,000 --> 00:50:08,000 Size of a struct is basically + or - some pattern here or there. 651 00:50:08,000 --> 00:50:15,000 It's basically going to be the sum of the sizes of the things in the struct. 652 00:50:15,000 --> 00:50:18,000 This right here, without changing anything, the size is going to be easy. 653 00:50:18,000 --> 00:50:24,000 Size of struct node is going to be size of i + size of next. 654 00:50:24,000 --> 00:50:27,000 Size of i is going to be 4. Size of next is going to be 4. 655 00:50:27,000 --> 00:50:30,000 Size of struct node is going to be 8. 656 00:50:30,000 --> 00:50:34,000 If we don't have the *, thinking of sizeof, 657 00:50:34,000 --> 00:50:37,000 then sizeof(i) is going to be 4. 658 00:50:37,000 --> 00:50:43,000 Size of struct node next is going to be size of i + size of struct node next 659 00:50:43,000 --> 00:50:46,000 + size of i + size of struct node next. 660 00:50:46,000 --> 00:50:55,000 It would be an infinite recursion of nodes. 661 00:50:55,000 --> 00:51:00,000 This is why this is how things have to be. 662 00:51:00,000 --> 00:51:03,000 >> Again, definitely memorize that, 663 00:51:03,000 --> 00:51:06,000 or at least understand it enough that you can be able to 664 00:51:06,000 --> 00:51:12,000 reason through what it should look like. 665 00:51:12,000 --> 00:51:14,000 The things we're going to want to implement. 666 00:51:14,000 --> 00:51:18,000 If length of the list— 667 00:51:18,000 --> 00:51:21,000 you could cheat and keep around a 668 00:51:21,000 --> 00:51:24,000 global length or something, but we're not going to do that. 669 00:51:24,000 --> 00:51:28,000 We're going to count the length of the list. 670 00:51:28,000 --> 00:51:34,000 We have contains, so that's basically like a search, 671 00:51:34,000 --> 00:51:41,000 so we have a linked list of integers to see if this integer is in the linked list. 672 00:51:41,000 --> 00:51:44,000 Prepend is going to insert at the beginning of the list. 673 00:51:44,000 --> 00:51:46,000 Append is going to insert at the end. 674 00:51:46,000 --> 00:51:53,000 Insert_sorted is going to insert into the sorted position in the list. 675 00:51:53,000 --> 00:52:01,000 Insert_sorted kind of assumes that you never used prepend or append in bad ways. 676 00:52:01,000 --> 00:52:09,000 >> Insert_sorted when you're implementing insert_sorted— 677 00:52:09,000 --> 00:52:13,000 let's say we have our linked list. 678 00:52:13,000 --> 00:52:18,000 This is what it currently looks like, 2, 4, 5. 679 00:52:18,000 --> 00:52:24,000 I want to insert 3, so as long as the list itself is already sorted, 680 00:52:24,000 --> 00:52:27,000 it's easy to find where 3 belongs. 681 00:52:27,000 --> 00:52:29,000 I start at 2. 682 00:52:29,000 --> 00:52:32,000 Okay, 3 is greater than 2, so I want to keep going. 683 00:52:32,000 --> 00:52:35,000 Oh, 4 is too big, so I know 3 is going to go in between 2 and 4, 684 00:52:35,000 --> 00:52:39,000 and I have to fix pointers and all that stuff. 685 00:52:39,000 --> 00:52:43,000 But if we did not strictly use insert_sorted, 686 00:52:43,000 --> 00:52:50,000 like let's just say I prepend 6, 687 00:52:50,000 --> 00:52:55,000 then my linked list is going to become this. 688 00:52:55,000 --> 00:53:01,000 It now makes no sense, so for insert_sorted, you can just assume 689 00:53:01,000 --> 00:53:04,000 that the list is sorted, even though operations exist 690 00:53:04,000 --> 00:53:09,000 which can cause it to not be sorted, and that's it. 691 00:53:09,000 --> 00:53:20,000 Find a helpful insert—so those are the main things you're going to have to implement. 692 00:53:20,000 --> 00:53:24,000 >> For now, take a minute to do length and contains, 693 00:53:24,000 --> 00:53:30,000 and those should be relatively quick. 694 00:53:41,000 --> 00:53:48,000 Nearing closing time, so anyone have anything for length or contains? 695 00:53:48,000 --> 00:53:50,000 They're going to be almost identical. 696 00:53:50,000 --> 00:53:57,000 [Student] Length. 697 00:53:57,000 --> 00:54:01,000 Let's see, revision. 698 00:54:01,000 --> 00:54:04,000 Okay. 699 00:54:12,000 --> 00:54:15,000 You want to explain? 700 00:54:15,000 --> 00:54:21,000 [Student] I just create a pointer node and initialize it to first, which is our global variable, 701 00:54:21,000 --> 00:54:27,000 and then I check to see if it's null so I don't get a seg fault and return 0 if that's the case. 702 00:54:27,000 --> 00:54:34,000 Otherwise, I loop through, keeping track of within integer 703 00:54:34,000 --> 00:54:38,000 how many times I've accessed the next element of the list 704 00:54:38,000 --> 00:54:43,000 and in the same increment operation also access that actual element, 705 00:54:43,000 --> 00:54:47,000 and then I continuously make the check to see if it's null, 706 00:54:47,000 --> 00:54:56,000 and if it's null, then it aborts and just returns the number of elements I've accessed. 707 00:54:56,000 --> 00:55:01,000 >> [Rob B.] Does anyone have any comments on anything? 708 00:55:01,000 --> 00:55:06,000 This looks fine correctness wise. 709 00:55:06,000 --> 00:55:10,000 [Student] I don't think you need the node == null. 710 00:55:10,000 --> 00:55:13,000 Yeah, so if node == null return 0. 711 00:55:13,000 --> 00:55:18,000 But if node == null then this—oh, there is a correctness issue. 712 00:55:18,000 --> 00:55:23,000 It was just you're returning i, but it's not in scope right now. 713 00:55:23,000 --> 00:55:30,000 You just need int i, so i = 0. 714 00:55:30,000 --> 00:55:34,000 But if node is null, then i is still going to be 0, 715 00:55:34,000 --> 00:55:39,000 and we're going to return 0, so this case is identical. 716 00:55:39,000 --> 00:55:48,000 Another common thing is to keep the declaration 717 00:55:48,000 --> 00:55:51,000 of node inside of the for loop. 718 00:55:51,000 --> 00:55:54,000 You could say—oh, no. 719 00:55:54,000 --> 00:55:56,000 Let's keep it as this. 720 00:55:56,000 --> 00:55:59,000 I would probably put int i = 0 here, 721 00:55:59,000 --> 00:56:05,000 then node* node = first in here. 722 00:56:05,000 --> 00:56:11,000 And this is probably how—getting rid of this now. 723 00:56:11,000 --> 00:56:14,000 This is probably how I would have written it. 724 00:56:14,000 --> 00:56:21,000 You could also—looking at it like this. 725 00:56:21,000 --> 00:56:25,000 This for loop structure right here 726 00:56:25,000 --> 00:56:30,000 should be almost as natural to you as for int i = 0 727 00:56:30,000 --> 00:56:33,000 i is less than length of array i++. 728 00:56:33,000 --> 00:56:38,000 If that's how you iterate over an array, this is how you iterate over a linked list. 729 00:56:38,000 --> 00:56:45,000 >> This should be second nature at some point. 730 00:56:45,000 --> 00:56:50,000 With that in mind, this is going to be almost the same thing. 731 00:56:50,000 --> 00:56:57,000 You're going to want to iterate over a linked list. 732 00:56:57,000 --> 00:57:02,000 If the node—I have no idea what the value is called. 733 00:57:02,000 --> 00:57:04,000 Node i. 734 00:57:04,000 --> 00:57:15,000 If the value at that node = i return true, and that's it. 735 00:57:15,000 --> 00:57:18,000 Notice that the only way we ever return false 736 00:57:18,000 --> 00:57:23,000 is if we iterate over the entire linked list and never return true, 737 00:57:23,000 --> 00:57:29,000 so that's what this does. 738 00:57:29,000 --> 00:57:36,000 As a side note—we probably won't get to append or prepend. 739 00:57:36,000 --> 00:57:39,000 >> Quick last note. 740 00:57:39,000 --> 00:57:52,000 If you see the static keyword, so let's say static int count = 0, 741 00:57:52,000 --> 00:57:56,000 then we do count++, you can basically think of it as a global variable, 742 00:57:56,000 --> 00:58:00,000 even though I just said this isn't how we're going to implement length. 743 00:58:00,000 --> 00:58:06,000 I'm doing this here, and then count++. 744 00:58:06,000 --> 00:58:11,000 Any way we can enter a node into our linked list we are incrementing our count. 745 00:58:11,000 --> 00:58:15,000 The point of this is what the static keyword means. 746 00:58:15,000 --> 00:58:20,000 If I just had int count = 0 that would be a regular old global variable. 747 00:58:20,000 --> 00:58:25,000 What static int count means is that it is a global variable for this file. 748 00:58:25,000 --> 00:58:28,000 It is impossible for some other file, 749 00:58:28,000 --> 00:58:34,000 like think of pset 5, if you have started. 750 00:58:34,000 --> 00:58:39,000 You have both speller.c, and you have dictionary.c, 751 00:58:39,000 --> 00:58:42,000 and if you just declare a thing global, then anything in speller.c 752 00:58:42,000 --> 00:58:45,000 can be accessed in dictionary.c and vice versa. 753 00:58:45,000 --> 00:58:48,000 Global variables are accessible by any .c file, 754 00:58:48,000 --> 00:58:54,000 but static variables are only accessible from within the file itself, 755 00:58:54,000 --> 00:59:01,000 so inside of spell checker or inside of dictionary.c, 756 00:59:01,000 --> 00:59:06,000 this is kind of how I would declare my variable for the size of my array 757 00:59:06,000 --> 00:59:10,000 or the size of my number of words in the dictionary. 758 00:59:10,000 --> 00:59:15,000 Since I don't want to declare a global variable that anyone has access to, 759 00:59:15,000 --> 00:59:18,000 I really only care about it for my own purposes. 760 00:59:18,000 --> 00:59:21,000 >> The good thing about this is also the whole name collision stuff. 761 00:59:21,000 --> 00:59:27,000 If some other file tries to use a global variable called count, things go very, very wrong, 762 00:59:27,000 --> 00:59:33,000 so this nicely keeps things safe, and only you can access it, 763 00:59:33,000 --> 00:59:38,000 and no one else can, and if someone else declares a global variable called count, 764 00:59:38,000 --> 00:59:43,000 then it won't interfere with your static variable called count. 765 00:59:43,000 --> 00:59:47,000 That's what static is. It is a file global variable. 766 00:59:47,000 --> 00:59:52,000 >> Questions on anything? 767 00:59:52,000 --> 00:59:59,000 All set. Bye. 768 00:59:59,000 --> 01:00:03,000 [CS50.TV]