1 00:00:00,000 --> 00:00:02,862 2 00:00:02,862 --> 00:00:03,910 CARTER ZENKE: OK. 3 00:00:03,910 --> 00:00:04,940 Hello, everyone. 4 00:00:04,940 --> 00:00:07,982 It is so good to see you here for our Week 5 Section. 5 00:00:07,982 --> 00:00:09,940 If you don't remember, my name is Carter Zenke. 6 00:00:09,940 --> 00:00:11,740 I'm the course's preceptor here on campus, 7 00:00:11,740 --> 00:00:13,902 and it's so delightful to see you all over Zoom. 8 00:00:13,902 --> 00:00:15,860 I thought what we'd do just to kick things off, 9 00:00:15,860 --> 00:00:18,880 assuming you can hear me this time, is to on the count of three, 10 00:00:18,880 --> 00:00:22,150 we'll all unmute at the same time and say a big hello. 11 00:00:22,150 --> 00:00:25,440 So on the count of three, we'll do one, two, three, hello. 12 00:00:25,440 --> 00:00:26,830 AUDIENCE: Hello. 13 00:00:26,830 --> 00:00:28,195 AUDIENCE: Hello. 14 00:00:28,195 --> 00:00:29,455 AUDIENCE: Hello. 15 00:00:29,455 --> 00:00:31,640 AUDIENCE: Hello. 16 00:00:31,640 --> 00:00:32,800 CARTER ZENKE: All right. 17 00:00:32,800 --> 00:00:33,220 AUDIENCE: Hello. 18 00:00:33,220 --> 00:00:34,585 CARTER ZENKE: It is so wonderful-- 19 00:00:34,585 --> 00:00:35,252 AUDIENCE: Hello. 20 00:00:35,252 --> 00:00:36,481 AUDIENCE: Hello. 21 00:00:36,481 --> 00:00:37,690 AUDIENCE: Hello. 22 00:00:37,690 --> 00:00:38,710 CARTER ZENKE: Hello. 23 00:00:38,710 --> 00:00:40,533 It is so wonderful to see you all here. 24 00:00:40,533 --> 00:00:41,200 AUDIENCE: Hello. 25 00:00:41,200 --> 00:00:42,968 AUDIENCE: Hello. 26 00:00:42,968 --> 00:00:46,280 CARTER ZENKE: And we'll go ahead and mute. 27 00:00:46,280 --> 00:00:48,760 AUDIENCE: Hello. 28 00:00:48,760 --> 00:00:52,000 CARTER ZENKE: OK. 29 00:00:52,000 --> 00:00:52,500 All right. 30 00:00:52,500 --> 00:00:55,208 So if you'd like to participate, feel free to do so via the chat. 31 00:00:55,208 --> 00:00:58,405 You're welcome to type in the chat to ask questions via the chat. 32 00:00:58,405 --> 00:01:00,780 Again, the goal for these sections is really to make sure 33 00:01:00,780 --> 00:01:03,090 that you're able to ask the questions you want to ask 34 00:01:03,090 --> 00:01:06,250 and for us to dive into this learning together. 35 00:01:06,250 --> 00:01:08,460 So today, focused on data structures. 36 00:01:08,460 --> 00:01:12,540 And this week, you build your own data structure with a speller problem set. 37 00:01:12,540 --> 00:01:16,170 And I want to focus on this big idea behind data structures. 38 00:01:16,170 --> 00:01:17,087 Why would we use them? 39 00:01:17,087 --> 00:01:18,962 What are the trade-offs between them, and how 40 00:01:18,962 --> 00:01:20,800 to actually use them in the real world. 41 00:01:20,800 --> 00:01:23,680 So to that end, we'll have three questions during our section today. 42 00:01:23,680 --> 00:01:27,360 The first will be this idea of the trade-off between data structures. 43 00:01:27,360 --> 00:01:30,180 When should we use one data structure versus another? 44 00:01:30,180 --> 00:01:32,790 The second question will be about these linked lists. 45 00:01:32,790 --> 00:01:36,660 We saw these in lecture, but how can we build our own linked lists? 46 00:01:36,660 --> 00:01:37,920 What can we do with them? 47 00:01:37,920 --> 00:01:42,480 And how can we better figure out how to actually use them in our code? 48 00:01:42,480 --> 00:01:44,670 And then, finally, our third question will 49 00:01:44,670 --> 00:01:48,230 be this idea of how do we build our own hash table where hash tables rely 50 00:01:48,230 --> 00:01:50,730 on a lot of fundamentals that linked lists can give us here, 51 00:01:50,730 --> 00:01:52,688 but can run maybe a little bit faster, as we'll 52 00:01:52,688 --> 00:01:55,900 see in the problem set for this week. 53 00:01:55,900 --> 00:01:59,220 So to really dive into things here and to make things a little more concrete, 54 00:01:59,220 --> 00:02:01,500 we saw all kinds of data structures. 55 00:02:01,500 --> 00:02:04,780 But often, when you're actually programming in the real world, 56 00:02:04,780 --> 00:02:08,220 you might choose one or another, depending on the scenario, what 57 00:02:08,220 --> 00:02:09,730 you're trying to build something. 58 00:02:09,730 --> 00:02:12,480 And in this case, what I'd like to do is show you all one scenario 59 00:02:12,480 --> 00:02:13,690 we can think about here. 60 00:02:13,690 --> 00:02:17,220 So imagine that you are working for some software company 61 00:02:17,220 --> 00:02:21,190 and they are making this digital assistant that runs on a mobile device. 62 00:02:21,190 --> 00:02:23,610 Maybe it's something like, hey, Siri, or hey, Google, 63 00:02:23,610 --> 00:02:25,680 or something that responds to people's voice 64 00:02:25,680 --> 00:02:27,630 over time on their personal device. 65 00:02:27,630 --> 00:02:31,230 You can do something for that person as they activate this assistant. 66 00:02:31,230 --> 00:02:35,430 Now, you've seen, or you've heard, that people who are using this assistant 67 00:02:35,430 --> 00:02:38,640 aren't often able to wake it up at the right moment. 68 00:02:38,640 --> 00:02:42,053 Maybe it sometimes doesn't work if I were to say maybe, hey, Siri, 69 00:02:42,053 --> 00:02:42,720 or, hey, Google. 70 00:02:42,720 --> 00:02:44,830 It doesn't quite work as I want it to. 71 00:02:44,830 --> 00:02:47,953 And so this is sort of a problem that we're running into here. 72 00:02:47,953 --> 00:02:50,370 But what we ultimately want to do is figure out, OK, well, 73 00:02:50,370 --> 00:02:51,690 how could we fix this? 74 00:02:51,690 --> 00:02:54,750 And maybe the first solution is to try to improve this 75 00:02:54,750 --> 00:02:58,200 by adding more words that could wake up this voice assistant. 76 00:02:58,200 --> 00:03:02,280 We want to store more data, more words, that could activate this voice 77 00:03:02,280 --> 00:03:03,330 assistant for ourselves. 78 00:03:03,330 --> 00:03:06,372 And the question that you might run into as a software engineer is, well, 79 00:03:06,372 --> 00:03:08,820 how could we best store all of these phrases 80 00:03:08,820 --> 00:03:11,282 we want our voice assistant to wake up to? 81 00:03:11,282 --> 00:03:12,990 What kind of data structure would you use 82 00:03:12,990 --> 00:03:17,190 to actually store these phrases that might activate our voice assistant? 83 00:03:17,190 --> 00:03:21,090 And we saw a few tools in our toolkit, a few structures in our toolkit. 84 00:03:21,090 --> 00:03:25,590 And one of these was the linked list, one of these was the try, one of these 85 00:03:25,590 --> 00:03:27,370 was the tree, and so on. 86 00:03:27,370 --> 00:03:29,260 So there's lots of ways we can do this. 87 00:03:29,260 --> 00:03:32,550 But let's think first about the different priorities we might 88 00:03:32,550 --> 00:03:34,980 have for this data structure, right? 89 00:03:34,980 --> 00:03:36,822 If we want to actually make this. 90 00:03:36,822 --> 00:03:38,530 Well, what do we want to prioritize here? 91 00:03:38,530 --> 00:03:39,698 So let's think about this. 92 00:03:39,698 --> 00:03:41,490 If we to add lots of words, we could really 93 00:03:41,490 --> 00:03:43,650 think of trying a few different approaches here. 94 00:03:43,650 --> 00:03:46,023 We could think of, well, a few different aspects, like, 95 00:03:46,023 --> 00:03:48,690 if our data structure is going to do a certain number of things, 96 00:03:48,690 --> 00:03:51,300 we might care about different aspects of how data structures are formed. 97 00:03:51,300 --> 00:03:53,400 Like, do we care about deleting things quickly? 98 00:03:53,400 --> 00:03:55,170 Do we care about inserting things quickly? 99 00:03:55,170 --> 00:03:57,128 Do we care about searching for phrases quickly? 100 00:03:57,128 --> 00:03:59,520 Like, for example, if I were to have this voice assistant 101 00:03:59,520 --> 00:04:03,990 and I'm trying to do something with it, do I care about deleting words quickly, 102 00:04:03,990 --> 00:04:07,230 or do I care about inserting new words quickly, 103 00:04:07,230 --> 00:04:10,830 or do I care about maybe if I were to speak something, if I can search 104 00:04:10,830 --> 00:04:12,270 and find that word really quickly? 105 00:04:12,270 --> 00:04:14,220 So often, when we're choosing data structure, 106 00:04:14,220 --> 00:04:15,928 we care about these kind of three things. 107 00:04:15,928 --> 00:04:18,820 How fast does it perform in these three areas? 108 00:04:18,820 --> 00:04:21,328 And so here, we run into our first question, right? 109 00:04:21,328 --> 00:04:23,370 If we want to figure out which way to prioritize, 110 00:04:23,370 --> 00:04:26,320 well, we could maybe say that we care about search the most. 111 00:04:26,320 --> 00:04:28,920 We want to be able to search and find a phrase really quickly. 112 00:04:28,920 --> 00:04:32,190 Maybe if the user speaks some word, we want to be able to figure out, 113 00:04:32,190 --> 00:04:35,640 is that word in the wake word list. 114 00:04:35,640 --> 00:04:37,590 Or maybe we really care about the user being 115 00:04:37,590 --> 00:04:41,010 able to quickly add a new wake word, like, I could say not just, hey, 116 00:04:41,010 --> 00:04:42,450 but hello, or hi. 117 00:04:42,450 --> 00:04:45,450 And I want to be able to program this data structure to sort of respond 118 00:04:45,450 --> 00:04:47,200 to that quickly. 119 00:04:47,200 --> 00:04:51,180 Now, what priorities might you have if you were faced with a scenario? 120 00:04:51,180 --> 00:04:53,130 If you want to chime in in the chat, which 121 00:04:53,130 --> 00:04:56,490 would you prioritize here between a fast insertion, a fast search, 122 00:04:56,490 --> 00:04:59,380 a fast deletion, if you're trying to do this? 123 00:04:59,380 --> 00:05:02,390 Feel free to chime in in the chat. 124 00:05:02,390 --> 00:05:02,890 OK. 125 00:05:02,890 --> 00:05:09,880 I'm seeing one person for search, OK, lots of searching, some insertion, too. 126 00:05:09,880 --> 00:05:10,380 Yeah. 127 00:05:10,380 --> 00:05:13,380 And so these are things we could probably reasonably disagree on, right? 128 00:05:13,380 --> 00:05:16,078 If I want to prioritize searching for some word, 129 00:05:16,078 --> 00:05:19,120 being able to speak it and have my assistant recognize it really quickly, 130 00:05:19,120 --> 00:05:22,037 well, that seems reasonable where we can make search our top priority. 131 00:05:22,037 --> 00:05:24,400 Or maybe it's also reasonable to say, well, 132 00:05:24,400 --> 00:05:26,723 I want people to make their own custom wake words. 133 00:05:26,723 --> 00:05:29,140 And so inserting something is really important to me, too. 134 00:05:29,140 --> 00:05:30,820 So it depends on what you're trying to build here. 135 00:05:30,820 --> 00:05:33,600 And so let's take a look at some of these tools in our tool kit. 136 00:05:33,600 --> 00:05:37,020 So the first one we said before was this idea of the linked list. 137 00:05:37,020 --> 00:05:39,437 And here's the linked list in full here. 138 00:05:39,437 --> 00:05:41,520 And the idea behind the linked list was that we're 139 00:05:41,520 --> 00:05:45,352 able to have these nodes where each node has a certain phrase 140 00:05:45,352 --> 00:05:48,060 inside of it-- at least in this case, trying to store phrases now 141 00:05:48,060 --> 00:05:49,020 in our linked list. 142 00:05:49,020 --> 00:05:51,760 And we have these nodes kind of follow each other in memory. 143 00:05:51,760 --> 00:05:53,590 They're no longer back-to-back, like, in an array, 144 00:05:53,590 --> 00:05:54,780 but they sort of build on top of each other 145 00:05:54,780 --> 00:05:57,360 and sort of follow each other throughout memory, being woven 146 00:05:57,360 --> 00:05:59,940 in to our bits and bytes, thereof. 147 00:05:59,940 --> 00:06:05,850 So if we were to use a linked list here, which of these three priorities 148 00:06:05,850 --> 00:06:08,130 would be kind of well-served by this linked 149 00:06:08,130 --> 00:06:12,983 list, like, what would a link list be good at in this case, do you think? 150 00:06:12,983 --> 00:06:14,275 Feel free to chime in the chat. 151 00:06:14,275 --> 00:06:17,910 152 00:06:17,910 --> 00:06:18,410 OK. 153 00:06:18,410 --> 00:06:19,850 So I'm seeing some insertion. 154 00:06:19,850 --> 00:06:21,330 And, yeah, it's correct. 155 00:06:21,330 --> 00:06:23,720 So linked would be really fast to insert something into. 156 00:06:23,720 --> 00:06:27,500 In fact, we saw in lecture that all it takes to insert into a linked list 157 00:06:27,500 --> 00:06:30,588 is to simply make a new node and put it at the very front. 158 00:06:30,588 --> 00:06:32,630 And that's, basically, a constant time operation. 159 00:06:32,630 --> 00:06:34,100 It takes a fixed number of steps. 160 00:06:34,100 --> 00:06:36,290 So inserting into linked list is very fast 161 00:06:36,290 --> 00:06:39,447 and could be a good priority for our insertion here. 162 00:06:39,447 --> 00:06:41,030 But, again, what's the trade-off here? 163 00:06:41,030 --> 00:06:42,738 If we were to go with a linked list, what 164 00:06:42,738 --> 00:06:46,130 would be slower because we used the linked list now? 165 00:06:46,130 --> 00:06:49,130 Maybe insertion is fast, but what's not fast? 166 00:06:49,130 --> 00:06:51,507 Well, deletion is not very fast, right? 167 00:06:51,507 --> 00:06:53,840 But at least in this scenario, let's say we don't really 168 00:06:53,840 --> 00:06:57,290 want to care about deleting wake words, what's the other ones we can use, 169 00:06:57,290 --> 00:07:00,435 a little bit slow for us? 170 00:07:00,435 --> 00:07:00,935 Yeah. 171 00:07:00,935 --> 00:07:02,102 So I'm seeing search, right? 172 00:07:02,102 --> 00:07:04,100 Search would be pretty slow because, again, we 173 00:07:04,100 --> 00:07:06,410 have to sort of start at the very first node. 174 00:07:06,410 --> 00:07:09,442 Like, hey, and ask, OK, is this what the user said? 175 00:07:09,442 --> 00:07:11,150 No, it's not, let's go onto the next one. 176 00:07:11,150 --> 00:07:12,380 Oh, is this what the user said? 177 00:07:12,380 --> 00:07:14,088 Oh, it's not, let's go onto the next one. 178 00:07:14,088 --> 00:07:16,778 So we can only do linear search, basically, with a linked list. 179 00:07:16,778 --> 00:07:19,070 And this is a bit of simplification for how we actually 180 00:07:19,070 --> 00:07:20,330 do kind of voice recognition. 181 00:07:20,330 --> 00:07:22,205 But you get the idea, nonetheless, that we're 182 00:07:22,205 --> 00:07:25,040 trying to find and go through phrases that the user could have said. 183 00:07:25,040 --> 00:07:27,840 In a linked list, we have to go through every single node that we have, 184 00:07:27,840 --> 00:07:29,632 which, as I see in the chat is indeed going 185 00:07:29,632 --> 00:07:32,220 to be in big O of n going through n nodes 186 00:07:32,220 --> 00:07:35,440 to find a certain element inside of it. 187 00:07:35,440 --> 00:07:36,960 OK so we have linked lists here. 188 00:07:36,960 --> 00:07:39,040 But let's actually think of a new data structure. 189 00:07:39,040 --> 00:07:41,802 What if we use something like the hash table? 190 00:07:41,802 --> 00:07:43,510 And the hash table looks a bit like this. 191 00:07:43,510 --> 00:07:46,020 I'm going to go to the full screen mode here where 192 00:07:46,020 --> 00:07:47,760 it's a bit like a linked list. 193 00:07:47,760 --> 00:07:49,620 But what we've done is sort of categorize 194 00:07:49,620 --> 00:07:52,242 our linked list based on the very first character. 195 00:07:52,242 --> 00:07:54,450 And not all hash tables use the very first character, 196 00:07:54,450 --> 00:07:58,260 but this one, for example, does, and it is helpful in some ways. 197 00:07:58,260 --> 00:08:02,540 Like what would now be faster because we're using the hash table? 198 00:08:02,540 --> 00:08:05,368 What would be faster now? 199 00:08:05,368 --> 00:08:06,660 Feel free to write in the chat. 200 00:08:06,660 --> 00:08:08,940 201 00:08:08,940 --> 00:08:09,440 OK. 202 00:08:09,440 --> 00:08:11,810 So I'm seeing some search, like, search is faster. 203 00:08:11,810 --> 00:08:15,770 And why is search faster, if you want to write in the chat quickly? 204 00:08:15,770 --> 00:08:18,425 Why would search be faster with this hash table now? 205 00:08:18,425 --> 00:08:24,120 206 00:08:24,120 --> 00:08:26,790 So I'm seeing some of these ideas of indexing, right? 207 00:08:26,790 --> 00:08:27,660 And so-- right. 208 00:08:27,660 --> 00:08:31,770 If we look at the side of our hash table in this array on this left-hand side 209 00:08:31,770 --> 00:08:36,659 here, we've kind of split our linked list into several indexes, like, H, I, 210 00:08:36,659 --> 00:08:38,010 J, K and L. 211 00:08:38,010 --> 00:08:41,400 And in the computer's memory, these are represented with, actually, like, 212 00:08:41,400 --> 00:08:44,100 H, I, J, K and L. They're assigned some number. 213 00:08:44,100 --> 00:08:46,770 And we have a hash function that finds the number for each 214 00:08:46,770 --> 00:08:48,400 of these values on the left-hand side. 215 00:08:48,400 --> 00:08:51,430 But the idea here is that we're able to use indexing to figure out, 216 00:08:51,430 --> 00:08:54,180 OK, if I want to find the words beginning with H, 217 00:08:54,180 --> 00:08:57,878 I know right where to go, and I can only search those words. 218 00:08:57,878 --> 00:09:01,170 Or if I want to find the words beginning with L, I can search in that location, 219 00:09:01,170 --> 00:09:03,630 just find the words beginning with L. So indexing here 220 00:09:03,630 --> 00:09:05,422 is helping us breaking down our linked list 221 00:09:05,422 --> 00:09:08,170 into lots of smaller linked lists, too. 222 00:09:08,170 --> 00:09:12,070 So it seems if we were to insert into this hash table, 223 00:09:12,070 --> 00:09:13,380 it's also pretty quick, right? 224 00:09:13,380 --> 00:09:16,770 It's both faster searching and faster insertion. 225 00:09:16,770 --> 00:09:20,590 So why would we just not always use a hash table over a linked list? 226 00:09:20,590 --> 00:09:22,590 Like what's the problem with the hash table now? 227 00:09:22,590 --> 00:09:23,730 What's the trade-off here? 228 00:09:23,730 --> 00:09:26,060 229 00:09:26,060 --> 00:09:26,560 Yeah. 230 00:09:26,560 --> 00:09:28,320 So I'm seeing this idea of memory. 231 00:09:28,320 --> 00:09:31,570 Like this hash table is going to take up much more memory than our linked list 232 00:09:31,570 --> 00:09:35,100 assuming, maybe, we have some buckets, some indexes that aren't really filled 233 00:09:35,100 --> 00:09:36,770 with these phrases. 234 00:09:36,770 --> 00:09:38,520 But it's also going to take us more memory 235 00:09:38,520 --> 00:09:40,582 to have all the different indexes possible here 236 00:09:40,582 --> 00:09:43,540 that we might want to actually have in order to make our lookup really, 237 00:09:43,540 --> 00:09:44,370 really fast. 238 00:09:44,370 --> 00:09:47,640 So it's a trade-off here is memory, as well. 239 00:09:47,640 --> 00:09:49,850 And now, it's got one other structure to consider, 240 00:09:49,850 --> 00:09:52,010 which would be this idea of the trie. 241 00:09:52,010 --> 00:09:54,240 And here's some brief visual of a trie. 242 00:09:54,240 --> 00:09:57,980 Would you like to point out some words you see inside of this trie, 243 00:09:57,980 --> 00:09:59,253 reading from left to right? 244 00:09:59,253 --> 00:10:00,170 What words do you see? 245 00:10:00,170 --> 00:10:02,960 You see hello, you see hey. 246 00:10:02,960 --> 00:10:04,400 Any other ones? 247 00:10:04,400 --> 00:10:07,910 We see help, certainly. 248 00:10:07,910 --> 00:10:09,980 Yeah, there's a few other ones in here, too. 249 00:10:09,980 --> 00:10:15,260 So this trie worked by trying of storing a node for every letter in our words. 250 00:10:15,260 --> 00:10:17,390 And this is very fast for lookup, right? 251 00:10:17,390 --> 00:10:19,910 If you want to figure out if hello is in this trie, 252 00:10:19,910 --> 00:10:24,980 well, all we have to do is see, OK, I see H, I see E, I see L, I see L, 253 00:10:24,980 --> 00:10:25,640 and I see O. 254 00:10:25,640 --> 00:10:27,327 Well, hello is in this structure, right? 255 00:10:27,327 --> 00:10:28,410 We know that it's in Here. 256 00:10:28,410 --> 00:10:30,320 So look up is very fast. 257 00:10:30,320 --> 00:10:32,840 And similarly, insertion is also really quickly. 258 00:10:32,840 --> 00:10:36,860 If I wanted to add Hi to this trie, well, I could simply go to the H node 259 00:10:36,860 --> 00:10:39,410 and say, OK, well, that covers the first letter. 260 00:10:39,410 --> 00:10:42,380 Now, I'll add I, maybe somewhere up above, and that's it. 261 00:10:42,380 --> 00:10:43,280 I've inserted Hi. 262 00:10:43,280 --> 00:10:46,658 So insertion and search are also really fast in our trie. 263 00:10:46,658 --> 00:10:48,200 But again, what's the trade-off here? 264 00:10:48,200 --> 00:10:49,700 Why wouldn't we always use the trie? 265 00:10:49,700 --> 00:10:53,850 266 00:10:53,850 --> 00:10:54,350 Yeah. 267 00:10:54,350 --> 00:10:55,910 So I'm seeing this idea of memory again. 268 00:10:55,910 --> 00:10:57,980 Like, this is going to take up a lot more memory 269 00:10:57,980 --> 00:11:00,770 to store all the combinations of all the letters 270 00:11:00,770 --> 00:11:03,860 that could make up all the words we might want to have as our wake 271 00:11:03,860 --> 00:11:05,810 word for our assistant, right? 272 00:11:05,810 --> 00:11:08,062 And even beyond that, beyond the computer's memory, 273 00:11:08,062 --> 00:11:09,770 if you think about the time it would take 274 00:11:09,770 --> 00:11:12,620 to make a trie, a more complex data structure, 275 00:11:12,620 --> 00:11:15,170 well, that's real human time that you have to work on this, 276 00:11:15,170 --> 00:11:18,120 and might actually delay your shipment of some new feature, right? 277 00:11:18,120 --> 00:11:18,620 And 278 00:11:18,620 --> 00:11:21,770 So it's going to matter, not just how much time it takes for the computer, 279 00:11:21,770 --> 00:11:24,290 but also how much time it takes you to make the structure, 280 00:11:24,290 --> 00:11:28,670 if you wanted to actually make some feature for your own program, 281 00:11:28,670 --> 00:11:31,020 or your own company in the end. 282 00:11:31,020 --> 00:11:33,440 So questions on these structures so far? 283 00:11:33,440 --> 00:11:35,720 Happy to take some in the chat, if you have any. 284 00:11:35,720 --> 00:11:42,530 285 00:11:42,530 --> 00:11:43,835 What questions do you have? 286 00:11:43,835 --> 00:11:48,370 287 00:11:48,370 --> 00:11:50,300 A question about, does AI use data structures? 288 00:11:50,300 --> 00:11:50,800 Yes. 289 00:11:50,800 --> 00:11:53,980 So a lot of how AI can work so quickly is 290 00:11:53,980 --> 00:11:57,410 because we choose to organize data in these really efficient ways. 291 00:11:57,410 --> 00:12:00,310 And so by getting creative with the ways we actually make our data 292 00:12:00,310 --> 00:12:03,190 structures beyond even just linked lists and tries and trees, 293 00:12:03,190 --> 00:12:06,247 we can actually make AI a lot more quickly-- 294 00:12:06,247 --> 00:12:08,080 run a lot more quickly because it has access 295 00:12:08,080 --> 00:12:10,450 to this data structure laid out in such a way 296 00:12:10,450 --> 00:12:12,380 that it can run quickly in the end. 297 00:12:12,380 --> 00:12:16,175 So, yes, it makes use of a lot of different data structures, too. 298 00:12:16,175 --> 00:12:18,550 And this question about memory being so cheap these days. 299 00:12:18,550 --> 00:12:19,010 It is true. 300 00:12:19,010 --> 00:12:20,302 So memory is very cheap, right? 301 00:12:20,302 --> 00:12:23,200 If we wanted to make the trie here, maybe we 302 00:12:23,200 --> 00:12:27,280 don't care how much memory it takes up because memory is so cheap. 303 00:12:27,280 --> 00:12:30,130 But it also takes us time to allocate that memory, 304 00:12:30,130 --> 00:12:31,840 to put things in that memory. 305 00:12:31,840 --> 00:12:34,030 And so, perhaps, it's not always the case 306 00:12:34,030 --> 00:12:37,240 that because we have so much memory, we should just be kind of loose with it. 307 00:12:37,240 --> 00:12:38,440 We really need to take care that we're not 308 00:12:38,440 --> 00:12:41,050 using too much memory when we could, theoretically, maybe 309 00:12:41,050 --> 00:12:43,760 use less, if we wanted to. 310 00:12:43,760 --> 00:12:44,800 Nice. 311 00:12:44,800 --> 00:12:45,680 OK. 312 00:12:45,680 --> 00:12:48,410 So this is some of the ideas of trade-offs here, choosing 313 00:12:48,410 --> 00:12:51,480 one or the other, often going to cost us one thing or another. 314 00:12:51,480 --> 00:12:53,550 But, really, behind all of these structures 315 00:12:53,550 --> 00:12:58,100 is the idea of a node, where, if we look back at our structures here, 316 00:12:58,100 --> 00:13:01,670 we had maybe a trie, we had a link, we had a hash table, 317 00:13:01,670 --> 00:13:02,810 and we had a linked list. 318 00:13:02,810 --> 00:13:07,520 And it was composing all of these was this idea of some kind of node, 319 00:13:07,520 --> 00:13:09,110 some place we store data. 320 00:13:09,110 --> 00:13:12,690 And along with that, a pointer to some other pieces of data. 321 00:13:12,690 --> 00:13:16,100 So we saw last week, in Week 4, that pointers are 322 00:13:16,100 --> 00:13:18,380 helpful for pointing certain locations. 323 00:13:18,380 --> 00:13:20,900 Well, now, we're seeing the application of those pointers. 324 00:13:20,900 --> 00:13:23,025 Why would we even want pointers in the first place? 325 00:13:23,025 --> 00:13:25,860 Because we can build these much more complex data structures. 326 00:13:25,860 --> 00:13:29,797 So let's think about building these very first building block for these data 327 00:13:29,797 --> 00:13:31,130 sets, which is called this node. 328 00:13:31,130 --> 00:13:33,130 And in our code that we'll see in just a minute, 329 00:13:33,130 --> 00:13:36,318 we have this new idea of creating this node in code. 330 00:13:36,318 --> 00:13:38,360 And we've seen this kind of syntax before, right? 331 00:13:38,360 --> 00:13:41,480 Here is the code we would use to make a structure called a node. 332 00:13:41,480 --> 00:13:44,212 And we're seeing that same syntax for structs, 333 00:13:44,212 --> 00:13:45,920 like, we created a struct for candidates, 334 00:13:45,920 --> 00:13:47,630 we created a struct for people earlier. 335 00:13:47,630 --> 00:13:50,660 Now, we're using that same syntax to create a node. 336 00:13:50,660 --> 00:13:52,080 And what do you notice here? 337 00:13:52,080 --> 00:13:53,870 So first, we're going to look at the top. 338 00:13:53,870 --> 00:13:56,160 And C reads this top to bottom. 339 00:13:56,160 --> 00:13:58,260 So as it reads this code, it'll do the following. 340 00:13:58,260 --> 00:14:00,890 It'll say, OK, here I'm seeing we're going to define 341 00:14:00,890 --> 00:14:03,590 a new structure called a node. 342 00:14:03,590 --> 00:14:05,330 That seems OK so far. 343 00:14:05,330 --> 00:14:08,000 And then inside of the structure called node, 344 00:14:08,000 --> 00:14:10,520 we're going to ensure we have the following pieces of data. 345 00:14:10,520 --> 00:14:11,990 We're going to make sure we have a phrase. 346 00:14:11,990 --> 00:14:14,780 If we look at this, we're going to have space for a phrase that 347 00:14:14,780 --> 00:14:16,260 is going to be a string. 348 00:14:16,260 --> 00:14:21,320 And then we also have some space for a pointer to a node called next. 349 00:14:21,320 --> 00:14:23,660 And, again, we can store, like, Hi or bye in here. 350 00:14:23,660 --> 00:14:27,800 But we also have some new space for next, a pointer to a node. 351 00:14:27,800 --> 00:14:31,970 And that could have a real address in it, like, 0x123, or 0x456. 352 00:14:31,970 --> 00:14:35,750 But it has to point to some kind of node, right? 353 00:14:35,750 --> 00:14:39,290 So now inside this node, we have a phrase and a next portion 354 00:14:39,290 --> 00:14:42,560 currently empty, but we could put this data inside of it. 355 00:14:42,560 --> 00:14:45,593 And then once we finish reading this, we're going to say, all of this, 356 00:14:45,593 --> 00:14:48,260 everything we've done so far, the phrase, the next portions that 357 00:14:48,260 --> 00:14:51,350 are-- all of that to be wrapped up into this thing called a node. 358 00:14:51,350 --> 00:14:54,230 So we've created kind of here a template for what a node is. 359 00:14:54,230 --> 00:14:57,650 A node has a phrase in it and some next pointer in it. 360 00:14:57,650 --> 00:14:59,955 And we're going to call all of that a node. 361 00:14:59,955 --> 00:15:01,080 Now, you could change this. 362 00:15:01,080 --> 00:15:02,913 You can make sure it has an integer, you can 363 00:15:02,913 --> 00:15:04,940 make sure it has some character inside of it. 364 00:15:04,940 --> 00:15:08,270 You could even store more attributes, if you'd like, inside a single node. 365 00:15:08,270 --> 00:15:11,570 But for now, I'll focus on phrases, trying to store these activation 366 00:15:11,570 --> 00:15:13,790 words for our voice assistant for now. 367 00:15:13,790 --> 00:15:15,860 Now, questions on this structure so far? 368 00:15:15,860 --> 00:15:19,080 369 00:15:19,080 --> 00:15:23,310 Yes, I'm seeing a question, is the next location inside the computer's memory? 370 00:15:23,310 --> 00:15:25,410 Is it actually in the byte below? 371 00:15:25,410 --> 00:15:27,960 Like here in our picture here, we see that the phrase 372 00:15:27,960 --> 00:15:31,770 portion is kind of first, and then comes the next portion. 373 00:15:31,770 --> 00:15:34,277 Are these two really next to each other in memory? 374 00:15:34,277 --> 00:15:36,360 And the answer for that is, generally, like, yeah. 375 00:15:36,360 --> 00:15:38,170 So these are going to be very close to-- they're in memory. 376 00:15:38,170 --> 00:15:40,337 They're going to be kind of right next to each other 377 00:15:40,337 --> 00:15:43,290 as we create this node throughout. 378 00:15:43,290 --> 00:15:46,240 And a question about, well, could we have a before field, 379 00:15:46,240 --> 00:15:47,580 like, a pointer to a before? 380 00:15:47,580 --> 00:15:48,580 And we absolutely could. 381 00:15:48,580 --> 00:15:50,740 So if we wanted to modify this, we could simply 382 00:15:50,740 --> 00:15:53,620 add a new attribute called before, and we could theoretically 383 00:15:53,620 --> 00:15:55,570 point to linked lists that are-- 384 00:15:55,570 --> 00:15:58,172 to nodes that are behind us, as well, in our current node. 385 00:15:58,172 --> 00:16:00,880 And that allows us to make more even more complex data structures 386 00:16:00,880 --> 00:16:01,750 overall, too. 387 00:16:01,750 --> 00:16:05,790 388 00:16:05,790 --> 00:16:08,860 Now, a question here, we see a struct node if we go back here. 389 00:16:08,860 --> 00:16:10,980 So struct node, star, next. 390 00:16:10,980 --> 00:16:13,972 Is that a pointer to the struct, or just a simple pointer? 391 00:16:13,972 --> 00:16:15,930 And the answer here is a pointer to the struct. 392 00:16:15,930 --> 00:16:18,660 So here, we're seeing that this is a node pointer. 393 00:16:18,660 --> 00:16:20,800 And a node, technically, is a structure. 394 00:16:20,800 --> 00:16:23,010 So we're going to have a pointer to that node that 395 00:16:23,010 --> 00:16:26,190 is itself a structure combined with a phrase and another pointer 396 00:16:26,190 --> 00:16:27,340 to another node. 397 00:16:27,340 --> 00:16:30,860 So good questions there. 398 00:16:30,860 --> 00:16:32,780 All right. 399 00:16:32,780 --> 00:16:34,790 Any other questions on this structure here? 400 00:16:34,790 --> 00:16:41,410 401 00:16:41,410 --> 00:16:42,820 OK. 402 00:16:42,820 --> 00:16:46,750 So that all seems pretty good so far, and kind of a simple template 403 00:16:46,750 --> 00:16:47,470 for a node. 404 00:16:47,470 --> 00:16:50,050 But keep in mind that just this code doesn't actually 405 00:16:50,050 --> 00:16:51,150 create the node for us. 406 00:16:51,150 --> 00:16:53,650 It gives us a template for the node, but we haven't actually 407 00:16:53,650 --> 00:16:55,037 created it in code yet. 408 00:16:55,037 --> 00:16:57,370 So let's actually go ahead and see how we might do that. 409 00:16:57,370 --> 00:17:00,610 If you were to go ahead and create our own linked list, let's say. 410 00:17:00,610 --> 00:17:04,000 Like, let's say, out of all these three structures, we chose a linked list now. 411 00:17:04,000 --> 00:17:06,250 We could start by making a single node and have 412 00:17:06,250 --> 00:17:08,900 that be at the head of our linked list. 413 00:17:08,900 --> 00:17:11,440 So to create a linked list, we'll often do this. 414 00:17:11,440 --> 00:17:14,470 We'll often create a new pointer to a node 415 00:17:14,470 --> 00:17:18,220 and first set it equal to null because it's pointing to nothing. 416 00:17:18,220 --> 00:17:20,560 There are no nodes in our linked list. 417 00:17:20,560 --> 00:17:23,500 But once we've done that, well, we probably want to add at least one 418 00:17:23,500 --> 00:17:24,290 node, right? 419 00:17:24,290 --> 00:17:25,540 So let's go ahead and do that. 420 00:17:25,540 --> 00:17:27,790 Let's make a new pointer, this one called n, 421 00:17:27,790 --> 00:17:30,500 and make sure we have some space in which to store it. 422 00:17:30,500 --> 00:17:32,200 So let's read this from right to left. 423 00:17:32,200 --> 00:17:33,670 Let's call malloc first. 424 00:17:33,670 --> 00:17:37,160 And malloc will give us some space in memory to actually put this node. 425 00:17:37,160 --> 00:17:39,250 So you can see at the visual right below we 426 00:17:39,250 --> 00:17:41,860 have some new space our program could use 427 00:17:41,860 --> 00:17:46,090 to actually store this node inside of and add a phrase, or some next pointer, 428 00:17:46,090 --> 00:17:47,140 right? 429 00:17:47,140 --> 00:17:49,480 But now if we go all the way to the left, 430 00:17:49,480 --> 00:17:53,920 we're going to see this pointer called n will then point to that space. 431 00:17:53,920 --> 00:17:55,870 And notice how n is a type node star. 432 00:17:55,870 --> 00:18:00,260 It's pointing to some space for a node, right? 433 00:18:00,260 --> 00:18:00,860 OK. 434 00:18:00,860 --> 00:18:02,783 So that makes our first node. 435 00:18:02,783 --> 00:18:05,450 But what we probably want to do is actually add some data to it. 436 00:18:05,450 --> 00:18:07,050 Currently, it's empty, right? 437 00:18:07,050 --> 00:18:07,920 So what can we do? 438 00:18:07,920 --> 00:18:10,640 Well, we can actually use this new syntax, this arrow syntax, 439 00:18:10,640 --> 00:18:12,780 to actually add in some data. 440 00:18:12,780 --> 00:18:18,050 So down below, we could say, n arrow phrase gets the value high. 441 00:18:18,050 --> 00:18:19,050 And what does that mean? 442 00:18:19,050 --> 00:18:21,200 So if you look at our visual here, we could see-- 443 00:18:21,200 --> 00:18:23,960 OK, we see n, and we see an arrow. 444 00:18:23,960 --> 00:18:27,470 And now, we're going to go to the phrase portion of this new piece of memory 445 00:18:27,470 --> 00:18:31,290 we've allocated and make sure we put the phrase Hi in there. 446 00:18:31,290 --> 00:18:34,400 So n arrow phrase is saying, start with that pointer 447 00:18:34,400 --> 00:18:36,950 and follow it to the structure that you have, 448 00:18:36,950 --> 00:18:39,650 go to that phrase attribute, that phrase portion, 449 00:18:39,650 --> 00:18:43,160 and make sure that says Hi in the middle, right? 450 00:18:43,160 --> 00:18:45,320 And that's all fine and good, but now, we also 451 00:18:45,320 --> 00:18:47,012 have to include the next portion. 452 00:18:47,012 --> 00:18:49,470 Like, currently it's empty, but we want to have some value. 453 00:18:49,470 --> 00:18:52,430 And if this is our first node, what's the logical value 454 00:18:52,430 --> 00:18:56,360 for this next attribute for our node? 455 00:18:56,360 --> 00:18:58,010 This is our very first node. 456 00:18:58,010 --> 00:18:59,120 What could we set it to? 457 00:18:59,120 --> 00:19:01,680 458 00:19:01,680 --> 00:19:02,180 All right. 459 00:19:02,180 --> 00:19:03,472 So I'm seeing null in the chat. 460 00:19:03,472 --> 00:19:08,240 And the correct version is the N-U-L-L, not N-U-L. N-U-L is nul, 461 00:19:08,240 --> 00:19:10,490 and the concept of character is like a nul character. 462 00:19:10,490 --> 00:19:13,460 But N-U-L-L is null in the context of an address. 463 00:19:13,460 --> 00:19:16,530 So this is an empty address, zero address here. 464 00:19:16,530 --> 00:19:20,630 So if we say n arrow next gets the value null, we'd see something like this. 465 00:19:20,630 --> 00:19:25,020 Here, we have our new node n, it's phrase is Hi, its next pointer is null. 466 00:19:25,020 --> 00:19:28,160 It's the very first node in our list. 467 00:19:28,160 --> 00:19:34,130 So any questions on this so far, actually, making up our first node? 468 00:19:34,130 --> 00:19:37,280 To your question on the arrow syntax, so again, the arrow syntax 469 00:19:37,280 --> 00:19:40,680 is kind of best visualized by actually reading from left to right. 470 00:19:40,680 --> 00:19:42,740 So if we look at the very first thing, we see n. 471 00:19:42,740 --> 00:19:44,430 OK. n is a pointer. 472 00:19:44,430 --> 00:19:48,380 Now, the arrow says, let's follow that pointer to our structure, right, 473 00:19:48,380 --> 00:19:49,980 because n is pointing to a structure. 474 00:19:49,980 --> 00:19:52,550 And let's go to the next field, that next attribute, 475 00:19:52,550 --> 00:19:55,710 and set it equal to some value we put on the right-hand side. 476 00:19:55,710 --> 00:20:00,050 So here we said, start with n, follow n's pointer to our node structure, 477 00:20:00,050 --> 00:20:02,900 and go to that next attribute and set that equal to null, 478 00:20:02,900 --> 00:20:05,950 as we see down below here. 479 00:20:05,950 --> 00:20:06,940 OK. 480 00:20:06,940 --> 00:20:09,400 So there's one more step, though, which is 481 00:20:09,400 --> 00:20:11,757 our list is still looking pretty empty. 482 00:20:11,757 --> 00:20:14,090 We have our node, but our list is still empty over here. 483 00:20:14,090 --> 00:20:17,500 So what could we do to actually add this node to our linked list? 484 00:20:17,500 --> 00:20:21,610 485 00:20:21,610 --> 00:20:28,445 What kind of code do we need here, or where do we want list to point now? 486 00:20:28,445 --> 00:20:32,908 487 00:20:32,908 --> 00:20:35,450 Feel free to type in the chat if you think you might have it. 488 00:20:35,450 --> 00:20:39,280 489 00:20:39,280 --> 00:20:41,620 So I'm seeing two competing opinions here. 490 00:20:41,620 --> 00:20:44,910 So one opinion is that we should say list equals n, 491 00:20:44,910 --> 00:20:47,010 which would be list gets the value n. 492 00:20:47,010 --> 00:20:51,040 And I'm seeing n equals list, or n gets the value list. 493 00:20:51,040 --> 00:20:53,220 Which one do we think might work here? 494 00:20:53,220 --> 00:20:54,120 Let's try one. 495 00:20:54,120 --> 00:20:57,120 So let's say, first, that list would get the value n. 496 00:20:57,120 --> 00:20:59,680 So list equals n, and let's try that. 497 00:20:59,680 --> 00:21:01,840 So here we have that up here and let's execute it. 498 00:21:01,840 --> 00:21:06,720 And now we'll see, OK, we'll list an n point to the same thing. 499 00:21:06,720 --> 00:21:08,940 Because n is a pointer and list is a pointer. 500 00:21:08,940 --> 00:21:12,160 If we set list equal to n, though, both point to the very same thing, 501 00:21:12,160 --> 00:21:13,703 which is our very first node. 502 00:21:13,703 --> 00:21:15,870 What would happen if we did it the other way around? 503 00:21:15,870 --> 00:21:21,510 If we said n equals list, instead of list equals n, where would n point to? 504 00:21:21,510 --> 00:21:25,920 505 00:21:25,920 --> 00:21:27,877 Any ideas in the chat? 506 00:21:27,877 --> 00:21:28,960 So I'm seeing null, right? 507 00:21:28,960 --> 00:21:32,050 If list is currently null, it's pointing to really nothing in general, 508 00:21:32,050 --> 00:21:34,480 well, if we say n equals list, we're going 509 00:21:34,480 --> 00:21:38,680 to make sure that end points to null and we'll have lost this node over here 510 00:21:38,680 --> 00:21:40,060 that we just had, right? 511 00:21:40,060 --> 00:21:41,948 So we probably don't want to do that. 512 00:21:41,948 --> 00:21:43,990 So this will be our last step in adding our node. 513 00:21:43,990 --> 00:21:47,230 And now we have our very first node inside 514 00:21:47,230 --> 00:21:50,070 of our linked list, which is great. 515 00:21:50,070 --> 00:21:52,575 Questions on this here, too, before we keep going? 516 00:21:52,575 --> 00:21:59,170 517 00:21:59,170 --> 00:22:01,660 OK. 518 00:22:01,660 --> 00:22:04,820 Not seeing too many here, so why don't we keep on going. 519 00:22:04,820 --> 00:22:08,410 And if we want to kick off the very first node, this is how we would do it. 520 00:22:08,410 --> 00:22:11,770 But let's say we wanted to actually keep adding more and more nodes. 521 00:22:11,770 --> 00:22:14,500 We could start thinking about inserting nodes to our linked list. 522 00:22:14,500 --> 00:22:16,910 And there are a few ways we could do even that. 523 00:22:16,910 --> 00:22:19,517 So if we think back to our picture as it is right now, 524 00:22:19,517 --> 00:22:21,850 let's say we wanted to create this new space for a node. 525 00:22:21,850 --> 00:22:24,850 We could say, OK, I'm going to have this new node called n, 526 00:22:24,850 --> 00:22:27,370 and I'm going to use malloc to give me some space for it. 527 00:22:27,370 --> 00:22:28,610 So same thing we did before. 528 00:22:28,610 --> 00:22:32,200 We're going to say, get some space, make sure n points to it. 529 00:22:32,200 --> 00:22:37,360 And now, how could we best actually add this node to our list? 530 00:22:37,360 --> 00:22:40,345 Would we add it at the end maybe, or the beginning, which do you think? 531 00:22:40,345 --> 00:22:43,843 532 00:22:43,843 --> 00:22:46,760 Could we add n at the beginning or the end, and which would be faster? 533 00:22:46,760 --> 00:22:50,337 534 00:22:50,337 --> 00:22:51,920 So I'm seeing some competing opinions. 535 00:22:51,920 --> 00:22:54,900 I'm seeing end and I'm seeing beginning. 536 00:22:54,900 --> 00:22:57,270 Now, if you think about a longer linked list, 537 00:22:57,270 --> 00:23:01,922 let's say we had not one node but really like 50 nodes. 538 00:23:01,922 --> 00:23:03,380 Would we want to add it at the end? 539 00:23:03,380 --> 00:23:05,713 And if so, how many steps would it take to add this node 540 00:23:05,713 --> 00:23:07,770 at the end of that linked list? 541 00:23:07,770 --> 00:23:10,470 If we have 50 nodes, how many steps would it 542 00:23:10,470 --> 00:23:12,180 take to add this node at the end? 543 00:23:12,180 --> 00:23:13,185 It would take 50, right? 544 00:23:13,185 --> 00:23:15,810 And that reason is that you have to actually find the last node 545 00:23:15,810 --> 00:23:16,780 and then add it there. 546 00:23:16,780 --> 00:23:20,190 So to find that last node, we have to go through all 50 of our nodes. 547 00:23:20,190 --> 00:23:22,440 So if we want the very quick insertion we talked about 548 00:23:22,440 --> 00:23:24,398 before, we have to add a node at the beginning, 549 00:23:24,398 --> 00:23:26,310 right, make sure that this new node points 550 00:23:26,310 --> 00:23:29,920 to the current first node in our list. 551 00:23:29,920 --> 00:23:31,237 So let's actually do that here. 552 00:23:31,237 --> 00:23:33,820 We have some space for a node, we have a pointer to that node. 553 00:23:33,820 --> 00:23:35,590 But let's see how we might add it in. 554 00:23:35,590 --> 00:23:38,550 So we probably first give this node a phrase, like, Hey. 555 00:23:38,550 --> 00:23:41,610 We'll say, OK, n arrow phrase gets the value Hey. 556 00:23:41,610 --> 00:23:47,940 And now, we want to say, hum, where should we point n's next pointer to? 557 00:23:47,940 --> 00:23:53,590 If we look at this picture here, where should we point n's next pointer? 558 00:23:53,590 --> 00:23:56,270 In the chat, if you will. 559 00:23:56,270 --> 00:23:57,110 Yeah, to list. 560 00:23:57,110 --> 00:23:59,840 So we want to make sure that this new node called 561 00:23:59,840 --> 00:24:03,690 n points to the very first node in our list, that being called, well, just 562 00:24:03,690 --> 00:24:04,190 list. 563 00:24:04,190 --> 00:24:07,950 So here we'll say, OK, n arrow next gets the value list. 564 00:24:07,950 --> 00:24:11,150 So now we're saying that n is a node, it has the phrase, 565 00:24:11,150 --> 00:24:14,630 Hey, and its next arrow is pointing to this list 566 00:24:14,630 --> 00:24:17,760 node, the very head of our list here. 567 00:24:17,760 --> 00:24:18,260 All right. 568 00:24:18,260 --> 00:24:21,770 But now, you'll notice that our linked list is not quite in the state 569 00:24:21,770 --> 00:24:24,660 we want it to be in, like, list is still pointing to our second node. 570 00:24:24,660 --> 00:24:26,368 So what's the next step here, if you want 571 00:24:26,368 --> 00:24:30,550 to make sure our list now includes this very first node? 572 00:24:30,550 --> 00:24:32,000 What should we do? 573 00:24:32,000 --> 00:24:34,400 We should point list to n, as I'm seeing in the chat. 574 00:24:34,400 --> 00:24:37,232 So let's make sure we have list pointing to n now. 575 00:24:37,232 --> 00:24:37,940 So we'll do this. 576 00:24:37,940 --> 00:24:40,450 We'll say, list equals n, and we'll run this and we'll see, 577 00:24:40,450 --> 00:24:45,460 OK, list gets the value n, meaning list now points to this new first node. 578 00:24:45,460 --> 00:24:49,150 And now we have our own linked list of more than one node, right? 579 00:24:49,150 --> 00:24:50,440 We went from one to two. 580 00:24:50,440 --> 00:24:53,148 And now, I could even do the same thing, three, or four, or five, 581 00:24:53,148 --> 00:24:55,900 even 500 times, if we wanted to. 582 00:24:55,900 --> 00:24:58,440 So questions on this, inserting this new node here? 583 00:24:58,440 --> 00:25:06,850 584 00:25:06,850 --> 00:25:08,990 OK. 585 00:25:08,990 --> 00:25:11,090 So not seeing too many here. 586 00:25:11,090 --> 00:25:13,200 Let's actually try to translate this bit to code. 587 00:25:13,200 --> 00:25:15,075 So this is all fine and good with our visual, 588 00:25:15,075 --> 00:25:16,770 but let's actually try it out in code. 589 00:25:16,770 --> 00:25:18,950 I'll go over to my Visual Studio code over here. 590 00:25:18,950 --> 00:25:22,790 And here, I have my terminal, a file already called list.c. 591 00:25:22,790 --> 00:25:25,097 So I'll code up list.c. 592 00:25:25,097 --> 00:25:27,180 And if we take a look at this file, what we'll see 593 00:25:27,180 --> 00:25:29,960 is a very basic kind of linked list file. 594 00:25:29,960 --> 00:25:32,000 Here at the top, I have my headers. 595 00:25:32,000 --> 00:25:34,785 And down below, I have my structure called a node. 596 00:25:34,785 --> 00:25:36,660 And this is the same as we saw before, right? 597 00:25:36,660 --> 00:25:39,950 We have a new structure called a node that's going to hold a phrase, 598 00:25:39,950 --> 00:25:41,510 and it's going to have a pointer to the next node. 599 00:25:41,510 --> 00:25:44,802 And we're going to call all of this node throughout our code using this typedef 600 00:25:44,802 --> 00:25:46,280 syntax up here. 601 00:25:46,280 --> 00:25:49,610 Now, down below, we have some functions to actually implement. 602 00:25:49,610 --> 00:25:51,920 One is called unload, that we'll see in a bit, 603 00:25:51,920 --> 00:25:54,170 to actually make sure we're not using too much memory, 604 00:25:54,170 --> 00:25:55,820 we're actually freeing it as we go. 605 00:25:55,820 --> 00:25:59,690 And the other down here in main is actually add items or add phrases 606 00:25:59,690 --> 00:26:01,400 to our linked list. 607 00:26:01,400 --> 00:26:03,710 And as we do, we have this function down here called 608 00:26:03,710 --> 00:26:06,230 visualize that I wrote myself and should actually 609 00:26:06,230 --> 00:26:08,480 help you see what's happening inside of our link list. 610 00:26:08,480 --> 00:26:10,897 Don't worry about all this kind of scary syntax down here. 611 00:26:10,897 --> 00:26:13,350 We'll see what happens as we add nodes to our linked list. 612 00:26:13,350 --> 00:26:16,450 So currently, if we wanted to add some phrases to our linked list, 613 00:26:16,450 --> 00:26:19,200 well, we could just kind of run this program and see what happens. 614 00:26:19,200 --> 00:26:20,210 Well, first, I could-- 615 00:26:20,210 --> 00:26:21,890 let me bring up my terminal here. 616 00:26:21,890 --> 00:26:27,715 And I could say, I want to make list like this. 617 00:26:27,715 --> 00:26:29,590 And let's go ahead and make this full screen. 618 00:26:29,590 --> 00:26:31,780 I'll say ./list to run it. 619 00:26:31,780 --> 00:26:33,270 And now, what phrase should we add? 620 00:26:33,270 --> 00:26:35,270 Maybe we add Hi, exclamation point. 621 00:26:35,270 --> 00:26:36,390 So I'll hit Enter. 622 00:26:36,390 --> 00:26:39,480 And it doesn't show anything, right? 623 00:26:39,480 --> 00:26:40,740 Nothing's here yet. 624 00:26:40,740 --> 00:26:41,910 What if I did Hello! 625 00:26:41,910 --> 00:26:43,110 I could say Hello! 626 00:26:43,110 --> 00:26:44,800 And nothing's there. 627 00:26:44,800 --> 00:26:45,300 OK. 628 00:26:45,300 --> 00:26:47,550 But why is nothing there yet? 629 00:26:47,550 --> 00:26:49,905 In the chat, why is nothing here? 630 00:26:49,905 --> 00:26:53,540 631 00:26:53,540 --> 00:26:54,750 Yeah, we didn't add anything. 632 00:26:54,750 --> 00:26:58,040 So if you go back to our program here, we still 633 00:26:58,040 --> 00:27:01,820 have a to do, to add phrases to a new node in this list. 634 00:27:01,820 --> 00:27:05,210 So here, if we look back, very familiar syntax, 635 00:27:05,210 --> 00:27:07,190 we're creating our very first linked list. 636 00:27:07,190 --> 00:27:09,200 And initially, this list is null. 637 00:27:09,200 --> 00:27:10,170 There's nothing in it. 638 00:27:10,170 --> 00:27:12,140 So that's why, when we ran this program before, 639 00:27:12,140 --> 00:27:14,990 we didn't see anything inside of our linked list. 640 00:27:14,990 --> 00:27:19,310 But now, what we're going to do is iterate list size times 641 00:27:19,310 --> 00:27:23,450 to ask the user for some phrase and add it to our linked list 642 00:27:23,450 --> 00:27:25,760 and visualize that addition. 643 00:27:25,760 --> 00:27:27,560 So how big is list size? 644 00:27:27,560 --> 00:27:28,940 Well, list size is 2. 645 00:27:28,940 --> 00:27:30,530 We'll iterate two times. 646 00:27:30,530 --> 00:27:34,110 And each time, we'll go ahead and add some new phrase, right? 647 00:27:34,110 --> 00:27:37,340 So let's think first-- how are we going to make this new node? 648 00:27:37,340 --> 00:27:40,020 We first probably want to actually create some space for it. 649 00:27:40,020 --> 00:27:43,300 So how could we do that here? 650 00:27:43,300 --> 00:27:45,460 We're trying to add a new node now. 651 00:27:45,460 --> 00:27:47,230 So we're going to need malloc, OK. 652 00:27:47,230 --> 00:27:49,120 So we're going to need malloc in some form. 653 00:27:49,120 --> 00:27:52,510 And how much space should we ask for, do we know? 654 00:27:52,510 --> 00:27:56,000 655 00:27:56,000 --> 00:27:58,750 So we want to create some space for a node. 656 00:27:58,750 --> 00:28:00,685 And maybe a node is-- 657 00:28:00,685 --> 00:28:03,130 I don't know-- I could guess and say it's five bytes. 658 00:28:03,130 --> 00:28:04,780 But is that right? 659 00:28:04,780 --> 00:28:05,495 Or maybe it's 10. 660 00:28:05,495 --> 00:28:06,370 I mean, I don't know. 661 00:28:06,370 --> 00:28:07,700 It could be anything. 662 00:28:07,700 --> 00:28:10,360 We don't really quite know, and so that's 663 00:28:10,360 --> 00:28:13,390 why we want to use this function called sizeof. 664 00:28:13,390 --> 00:28:16,690 And we can pass sizeof a certain type, like, node, right? 665 00:28:16,690 --> 00:28:19,530 I don't know how big a node is, but sizeof does. 666 00:28:19,530 --> 00:28:25,162 It'll tell malloc how much memory to create to have this node in place. 667 00:28:25,162 --> 00:28:27,370 But now that I've asked for memory, I should probably 668 00:28:27,370 --> 00:28:29,650 keep track where that memory is. 669 00:28:29,650 --> 00:28:34,360 So I need some variable to store that location 670 00:28:34,360 --> 00:28:37,260 memory, which might be called what? 671 00:28:37,260 --> 00:28:41,020 What could we store here? 672 00:28:41,020 --> 00:28:44,470 Maybe we could have a node pointer. 673 00:28:44,470 --> 00:28:45,490 Let's call it n. 674 00:28:45,490 --> 00:28:49,840 So we could say, OK, this is going to point to some variable-- 675 00:28:49,840 --> 00:28:52,630 point to some new piece of memory called n. 676 00:28:52,630 --> 00:28:55,360 And we're going to make sure it points to a node, right? 677 00:28:55,360 --> 00:28:59,060 This is going to be some new chunk of memory where n points to. 678 00:28:59,060 --> 00:28:59,560 All right. 679 00:28:59,560 --> 00:29:02,780 So we do that kind of first bit, and now we have some space for our node. 680 00:29:02,780 --> 00:29:04,480 But what do we need now? 681 00:29:04,480 --> 00:29:07,478 We need to make sure that we add in a phrase. 682 00:29:07,478 --> 00:29:08,395 So how can we do that? 683 00:29:08,395 --> 00:29:12,323 684 00:29:12,323 --> 00:29:14,740 We would probably use our arrow syntax like we saw before. 685 00:29:14,740 --> 00:29:17,700 So we could say, n arrow phrase. 686 00:29:17,700 --> 00:29:20,370 And let's store the phrase that we have right now inside. 687 00:29:20,370 --> 00:29:23,500 And similarly for the next portion, we could do this. 688 00:29:23,500 --> 00:29:26,590 We could say n arrow next gets the value. 689 00:29:26,590 --> 00:29:33,330 But what should it get, if this is our maybe first node? 690 00:29:33,330 --> 00:29:36,760 I'm seeing a few different ideas here. 691 00:29:36,760 --> 00:29:38,520 So I'm seeing one for null. 692 00:29:38,520 --> 00:29:39,190 Let's try that. 693 00:29:39,190 --> 00:29:41,830 So we'll say, OK, n arrow next is null. 694 00:29:41,830 --> 00:29:42,810 It's our first node. 695 00:29:42,810 --> 00:29:45,330 And now what we want to do is make sure that list 696 00:29:45,330 --> 00:29:48,910 points to that very first node we just made up above. 697 00:29:48,910 --> 00:29:49,890 So let's try it. 698 00:29:49,890 --> 00:29:51,275 Let's go ahead and compile this. 699 00:29:51,275 --> 00:29:52,650 We'll open up our terminal again. 700 00:29:52,650 --> 00:29:55,160 We'll say, make list and-- 701 00:29:55,160 --> 00:29:59,590 oops-- we forgot our semicolon, so we'll do that. 702 00:29:59,590 --> 00:30:03,080 And we'll go back here and we'll say, make list. 703 00:30:03,080 --> 00:30:05,030 And then we'll do ./list. 704 00:30:05,030 --> 00:30:07,960 And I'll say, OK, my first phrase is Hi! 705 00:30:07,960 --> 00:30:11,620 And that seems all right, like, I have this new node at this location. 706 00:30:11,620 --> 00:30:13,180 It's phrase is Hi! 707 00:30:13,180 --> 00:30:14,900 And its next pointer is null. 708 00:30:14,900 --> 00:30:16,120 That seems OK. 709 00:30:16,120 --> 00:30:17,060 What if I did this? 710 00:30:17,060 --> 00:30:18,460 What if I said Hello! 711 00:30:18,460 --> 00:30:20,650 now. 712 00:30:20,650 --> 00:30:26,470 And I'm seeming to actually be missing my very first node. 713 00:30:26,470 --> 00:30:27,730 It kind of went away. 714 00:30:27,730 --> 00:30:30,340 What happened here? 715 00:30:30,340 --> 00:30:31,930 Any ideas in the chat? 716 00:30:31,930 --> 00:30:37,490 717 00:30:37,490 --> 00:30:37,990 Yeah. 718 00:30:37,990 --> 00:30:40,390 So I'm seeing we didn't quite add this node. 719 00:30:40,390 --> 00:30:43,840 Because if we take a look at our code, we're 720 00:30:43,840 --> 00:30:47,380 always setting the next pointer for our new node to be null, 721 00:30:47,380 --> 00:30:48,440 and that means nothing. 722 00:30:48,440 --> 00:30:50,410 So whenever we add a new node, well, we're just 723 00:30:50,410 --> 00:30:52,210 going to make sure that the next pointer is null, 724 00:30:52,210 --> 00:30:53,110 like, it doesn't point to anything? 725 00:30:53,110 --> 00:30:54,068 That isn't quite right. 726 00:30:54,068 --> 00:30:58,570 So let's make sure we actually update this to point to the head of our list. 727 00:30:58,570 --> 00:31:02,840 Our first node should point to whatever is the first element in our list. 728 00:31:02,840 --> 00:31:06,850 So let's say this, n arrow next gets the value list. 729 00:31:06,850 --> 00:31:09,130 And how will this work at the beginning? 730 00:31:09,130 --> 00:31:12,490 So notice how list is null at the beginning. 731 00:31:12,490 --> 00:31:14,770 Then we'll go through and add our first node. 732 00:31:14,770 --> 00:31:17,290 We'll say, OK, let's get a phrase, and let's 733 00:31:17,290 --> 00:31:20,320 make sure we have some space for this new node called n. 734 00:31:20,320 --> 00:31:23,860 Let's make sure this n has the phrase that we've been typing in. 735 00:31:23,860 --> 00:31:28,060 And now, n arrow next is equal to list which, in fact, is null, right? 736 00:31:28,060 --> 00:31:30,880 So our very first node will still have null 737 00:31:30,880 --> 00:31:34,480 as its sort of beginning next pointer. 738 00:31:34,480 --> 00:31:36,640 And then we'll go ahead and make sure list 739 00:31:36,640 --> 00:31:40,420 is updated to have the very first node at the head. 740 00:31:40,420 --> 00:31:43,060 But then if we do it again, if we go through again, 741 00:31:43,060 --> 00:31:46,120 we'll say that, OK, this list pointer's next is actually pointing 742 00:31:46,120 --> 00:31:48,163 to the previous node we just created. 743 00:31:48,163 --> 00:31:50,830 So now I can actually have the string together as we go through. 744 00:31:50,830 --> 00:31:52,038 And let's visualize this now. 745 00:31:52,038 --> 00:31:53,600 So let's go back to our terminal. 746 00:31:53,600 --> 00:31:58,110 Let's say, make list, and let's do ./list. 747 00:31:58,110 --> 00:31:59,580 And we could say Hi! 748 00:31:59,580 --> 00:32:01,290 And then we could say Hello! 749 00:32:01,290 --> 00:32:05,730 And now we see our list being built. So notice how our very first node here 750 00:32:05,730 --> 00:32:06,990 is at this location. 751 00:32:06,990 --> 00:32:08,520 It has the phrase Hello! 752 00:32:08,520 --> 00:32:11,860 And the next pointer is this, it's pointing to some new location here. 753 00:32:11,860 --> 00:32:15,990 But notice how this is now in our next node, 56f0. 754 00:32:15,990 --> 00:32:17,580 And the phrase here is Hi! 755 00:32:17,580 --> 00:32:21,170 And the next pointer is null here. 756 00:32:21,170 --> 00:32:22,590 OK. 757 00:32:22,590 --> 00:32:23,840 So that seemed to work for us. 758 00:32:23,840 --> 00:32:25,490 Let's go back to our code. 759 00:32:25,490 --> 00:32:27,785 What questions are there on this? 760 00:32:27,785 --> 00:32:35,113 761 00:32:35,113 --> 00:32:35,780 A good question. 762 00:32:35,780 --> 00:32:37,660 So is this some kind of recursion? 763 00:32:37,660 --> 00:32:40,900 This isn't quite recursion because, again, recursion 764 00:32:40,900 --> 00:32:43,640 involves having a function call itself. 765 00:32:43,640 --> 00:32:45,470 And here, we're really inside of a loop. 766 00:32:45,470 --> 00:32:47,230 So it isn't quite recursion. 767 00:32:47,230 --> 00:32:51,160 But you're onto an idea, which is that every node that we add 768 00:32:51,160 --> 00:32:52,865 is going through that same process. 769 00:32:52,865 --> 00:32:55,240 So this really could potentially be a recursive function, 770 00:32:55,240 --> 00:32:57,050 if you wanted it to be. 771 00:32:57,050 --> 00:33:00,010 But here, you're just noticing that the same thing we're doing for one 772 00:33:00,010 --> 00:33:02,710 node, we're doing for another, as we add it, as we go through. 773 00:33:02,710 --> 00:33:04,977 774 00:33:04,977 --> 00:33:06,310 And that's a good question, too. 775 00:33:06,310 --> 00:33:09,010 Is the last node we add becoming the first? 776 00:33:09,010 --> 00:33:10,100 That is also true. 777 00:33:10,100 --> 00:33:11,980 So here if we were to say, Hi! 778 00:33:11,980 --> 00:33:12,910 And then Hello! 779 00:33:12,910 --> 00:33:15,490 Well, Hello! becomes our new first node, right? 780 00:33:15,490 --> 00:33:18,070 That's because we're trying to insert new nodes at the front 781 00:33:18,070 --> 00:33:21,610 where the first node we add becomes the first-- 782 00:33:21,610 --> 00:33:25,510 sorry-- the last node we add becomes the first node in our list, ultimately. 783 00:33:25,510 --> 00:33:28,370 784 00:33:28,370 --> 00:33:30,340 OK. 785 00:33:30,340 --> 00:33:33,010 So there is one final thing to do here, which 786 00:33:33,010 --> 00:33:35,320 is to make sure that if we're asking for memory, 787 00:33:35,320 --> 00:33:38,630 we do want to make sure that this memory is actually given to us. 788 00:33:38,630 --> 00:33:41,680 And before we go ahead and use this pointer called n, 789 00:33:41,680 --> 00:33:44,540 we do want to make sure that we're not getting back some null value. 790 00:33:44,540 --> 00:33:47,500 So if n is null, what we want to do is simply 791 00:33:47,500 --> 00:33:51,820 print something like couldn't allocate memory for node, 792 00:33:51,820 --> 00:33:54,670 or whatever you want to type here, some descriptive error. 793 00:33:54,670 --> 00:33:59,560 And then we could return 1 to symbolize that something went wrong here. 794 00:33:59,560 --> 00:34:00,060 OK. 795 00:34:00,060 --> 00:34:02,890 796 00:34:02,890 --> 00:34:05,020 A question then, is this a stack? 797 00:34:05,020 --> 00:34:09,400 Well, it has the beginnings of a stack where if you remember, 798 00:34:09,400 --> 00:34:12,790 a stack is this data structure where the first thing we add 799 00:34:12,790 --> 00:34:14,920 becomes the last thing we get out. 800 00:34:14,920 --> 00:34:18,982 Or otherwise, the last thing we add is the first thing we get out, 801 00:34:18,982 --> 00:34:19,940 which is the case here. 802 00:34:19,940 --> 00:34:23,170 So we're saying that the last node, the last phrase we add, 803 00:34:23,170 --> 00:34:26,920 is going to be the first node we see in our structure. 804 00:34:26,920 --> 00:34:29,710 But it also depends on how we access this linked list. 805 00:34:29,710 --> 00:34:32,170 If we always access it from the very first node, 806 00:34:32,170 --> 00:34:34,929 well, that could be kind of similar to a stack. 807 00:34:34,929 --> 00:34:38,230 But if we always want to access it by looking at the very last element first, 808 00:34:38,230 --> 00:34:40,150 well, that could be then a queue, right? 809 00:34:40,150 --> 00:34:42,820 So a queue in a stack kind of depends on how you actually 810 00:34:42,820 --> 00:34:46,075 use them, and not so much what you build kind of underneath the hood. 811 00:34:46,075 --> 00:34:49,875 812 00:34:49,875 --> 00:34:50,375 All right. 813 00:34:50,375 --> 00:34:53,760 814 00:34:53,760 --> 00:34:55,920 So let's keep going here. 815 00:34:55,920 --> 00:34:58,880 And I would say that this code is mostly correct. 816 00:34:58,880 --> 00:35:02,420 But if I go down to my terminal and I now do this, 817 00:35:02,420 --> 00:35:04,760 I do make list again just to compile it. 818 00:35:04,760 --> 00:35:07,045 And then I say, let's try to memory check this 819 00:35:07,045 --> 00:35:08,670 and see if there are any memory errors. 820 00:35:08,670 --> 00:35:11,330 Well, I could do valgrind./list. 821 00:35:11,330 --> 00:35:16,310 And let me type in, let's say, once it loads, I can type in Hi! 822 00:35:16,310 --> 00:35:17,630 Right? 823 00:35:17,630 --> 00:35:19,190 And now that seems OK. 824 00:35:19,190 --> 00:35:20,930 I can type in Hello! 825 00:35:20,930 --> 00:35:25,250 And my program ends, but it seems I've definitely lost some bytes. 826 00:35:25,250 --> 00:35:27,290 I've definitely lost 16 bytes. 827 00:35:27,290 --> 00:35:30,110 And I've indirectly lost 16 bytes. 828 00:35:30,110 --> 00:35:31,680 So what happened here? 829 00:35:31,680 --> 00:35:33,680 What did I fail to do in my code? 830 00:35:33,680 --> 00:35:36,128 831 00:35:36,128 --> 00:35:37,420 Yeah, I didn't free the memory. 832 00:35:37,420 --> 00:35:42,570 So if I look back at my program, if I do this and I say, OK, well, unload. 833 00:35:42,570 --> 00:35:45,240 I trust that unload will work for me, right? 834 00:35:45,240 --> 00:35:47,550 But if I actually look down here, it's still a to do. 835 00:35:47,550 --> 00:35:50,290 I didn't actually free any nodes here. 836 00:35:50,290 --> 00:35:53,400 And if I look above, well, I used malloc several times. 837 00:35:53,400 --> 00:35:54,420 I used it twice. 838 00:35:54,420 --> 00:35:57,430 And I didn't call free any of those times. 839 00:35:57,430 --> 00:36:00,280 So every time we call malloc, we want to make sure we call free, 840 00:36:00,280 --> 00:36:02,613 and that's what unload is going to do for us down below. 841 00:36:02,613 --> 00:36:06,090 We're making our own function that takes in some pointer to a list 842 00:36:06,090 --> 00:36:09,120 and should go through and free every node in that list. 843 00:36:09,120 --> 00:36:10,995 We can give back that memory to the computer. 844 00:36:10,995 --> 00:36:13,370 So let's visualize, if we're actually writing it in code. 845 00:36:13,370 --> 00:36:15,570 If we go back to our slides here, we could 846 00:36:15,570 --> 00:36:19,200 think about actually trying to free this linked list. 847 00:36:19,200 --> 00:36:21,610 And maybe your initial thought is, let's do this. 848 00:36:21,610 --> 00:36:23,027 Let's actually just free the list. 849 00:36:23,027 --> 00:36:23,970 Let's say this. 850 00:36:23,970 --> 00:36:25,740 But why would this be bad? 851 00:36:25,740 --> 00:36:30,300 If we say free the list, what goes wrong? 852 00:36:30,300 --> 00:36:32,370 Here's the original picture. 853 00:36:32,370 --> 00:36:34,110 Here's freeing the list. 854 00:36:34,110 --> 00:36:35,640 What happened here? 855 00:36:35,640 --> 00:36:40,010 856 00:36:40,010 --> 00:36:42,140 Yes, I'm saying we lost the other nodes. 857 00:36:42,140 --> 00:36:47,410 So notice how in this picture, if you go back, well, the very first node 858 00:36:47,410 --> 00:36:49,570 is still pointing to that next node. 859 00:36:49,570 --> 00:36:52,510 And the only reason we know where that next node is located 860 00:36:52,510 --> 00:36:55,690 is because it's in the next portion of our very first node. 861 00:36:55,690 --> 00:37:00,940 But if we free that, if we get rid of it, well, where is that second node? 862 00:37:00,940 --> 00:37:01,890 We don't know. 863 00:37:01,890 --> 00:37:03,807 So if that's going to be a problem for us, 864 00:37:03,807 --> 00:37:06,390 what we don't want to actually just free the head of the list, 865 00:37:06,390 --> 00:37:10,230 we want to make sure we're going through and freeing nodes step by step 866 00:37:10,230 --> 00:37:13,840 and keeping track of where the rest of our nodes are. 867 00:37:13,840 --> 00:37:16,735 So instead, one way we could approach this is as follows. 868 00:37:16,735 --> 00:37:19,110 We don't want to do this, just free the head of the list. 869 00:37:19,110 --> 00:37:22,110 What we could do, instead, is say, let's make a new pointer, 870 00:37:22,110 --> 00:37:23,527 let's add that to the equation. 871 00:37:23,527 --> 00:37:24,110 Let's do this. 872 00:37:24,110 --> 00:37:27,450 So let's say that-- let's make a new pointer, just called maybe ptr, 873 00:37:27,450 --> 00:37:28,570 for example. 874 00:37:28,570 --> 00:37:33,550 And that will point to whatever the head of our list is, the next node. 875 00:37:33,550 --> 00:37:37,290 So now, a pointer is pointing to that second node in our list. 876 00:37:37,290 --> 00:37:38,992 And now, we could safely free list. 877 00:37:38,992 --> 00:37:39,700 We could do this. 878 00:37:39,700 --> 00:37:42,480 We could say, OK, free up list, and that's gone. 879 00:37:42,480 --> 00:37:49,390 But what is still pointing to the rest of our list now, which variable? 880 00:37:49,390 --> 00:37:52,140 Maybe in the chat? 881 00:37:52,140 --> 00:37:52,640 Yeah. 882 00:37:52,640 --> 00:37:55,080 So ptr is pointing to that second node in our list. 883 00:37:55,080 --> 00:37:57,170 So we still have this in our memory. 884 00:37:57,170 --> 00:38:00,250 We know where all the rest of our nodes are. 885 00:38:00,250 --> 00:38:00,750 OK. 886 00:38:00,750 --> 00:38:01,580 So that's all good. 887 00:38:01,580 --> 00:38:04,855 But now, we want to actually get back to the head of our list 888 00:38:04,855 --> 00:38:07,230 and make sure that we're actually kind of moving forward. 889 00:38:07,230 --> 00:38:10,910 So what we'll do is say that list gets the value for ptr. 890 00:38:10,910 --> 00:38:14,030 So we're going to update what our list looks like and say that we're 891 00:38:14,030 --> 00:38:16,410 going to move this over here. 892 00:38:16,410 --> 00:38:19,550 So now, the head of our list points to whatever pointer points to. 893 00:38:19,550 --> 00:38:21,470 And we could, perhaps, do this in a loop. 894 00:38:21,470 --> 00:38:25,940 We could say, OK, in this case, well, pointer will get list arrow next again. 895 00:38:25,940 --> 00:38:29,950 And what would pointer be now, though? 896 00:38:29,950 --> 00:38:33,250 If pointer is list arrow next, what is pointer going to be? 897 00:38:33,250 --> 00:38:36,490 898 00:38:36,490 --> 00:38:38,950 It will be the second node of the node after the first one. 899 00:38:38,950 --> 00:38:40,970 But what's the actual value right now? 900 00:38:40,970 --> 00:38:41,470 Yeah. 901 00:38:41,470 --> 00:38:45,460 It'll be null, or [INAUDIBLE] because notice how right now list 902 00:38:45,460 --> 00:38:48,220 is pointing to this very, now, the first node 903 00:38:48,220 --> 00:38:50,900 in our list, whose next value is null. 904 00:38:50,900 --> 00:38:54,670 So if pointer equals list arrow next, well, pointer will be null. 905 00:38:54,670 --> 00:38:56,870 OK, that seems OK for now. 906 00:38:56,870 --> 00:39:01,840 But if we wanted to kind of keep going, let's say we free list, all right. 907 00:39:01,840 --> 00:39:04,150 And then we say, well, list gets pointer. 908 00:39:04,150 --> 00:39:05,125 What is list now? 909 00:39:05,125 --> 00:39:07,830 910 00:39:07,830 --> 00:39:11,705 There's nothing left, so list would also be null, right? 911 00:39:11,705 --> 00:39:13,830 So what we've done, we've kind of gone full circle. 912 00:39:13,830 --> 00:39:17,820 We've made-- we've gone from the very first node to our very last node, 913 00:39:17,820 --> 00:39:20,880 and we've ended when list equals null, when 914 00:39:20,880 --> 00:39:25,020 there's nothing left to free, back at, for example, the same place we started. 915 00:39:25,020 --> 00:39:28,590 If we go back to our code, we could see that the very place we began 916 00:39:28,590 --> 00:39:32,040 was with list equaling no, no nodes in our list. 917 00:39:32,040 --> 00:39:38,760 And now, if we unload them, we also want to end in that very same place, right? 918 00:39:38,760 --> 00:39:39,703 OK. 919 00:39:39,703 --> 00:39:42,870 So let's try implementing this in code now so we can get rid of those memory 920 00:39:42,870 --> 00:39:43,370 errors. 921 00:39:43,370 --> 00:39:47,070 If we go down below here, we're going to free all of our allocated nodes. 922 00:39:47,070 --> 00:39:49,570 And we know we want some kind of loop to do this. 923 00:39:49,570 --> 00:39:51,598 But let's hold off on that for now. 924 00:39:51,598 --> 00:39:54,390 We know, first, we want to make this new pointer to our next nodes. 925 00:39:54,390 --> 00:39:55,015 We can do this. 926 00:39:55,015 --> 00:40:00,180 Node star pointer equals whatever list is pointing to and getting 927 00:40:00,180 --> 00:40:02,400 that next value there. 928 00:40:02,400 --> 00:40:06,230 Then we could safely free list because pointers pointing the rest of our list. 929 00:40:06,230 --> 00:40:09,800 And, finally, we could say, list updates to 0.2, 930 00:40:09,800 --> 00:40:12,610 what pointer is now pointing to. 931 00:40:12,610 --> 00:40:15,080 But how long do we want to do this? 932 00:40:15,080 --> 00:40:18,170 How do we know when to stop? 933 00:40:18,170 --> 00:40:21,120 Maybe in the chat? 934 00:40:21,120 --> 00:40:25,221 This seems good for one node, but want to keep doing this. 935 00:40:25,221 --> 00:40:28,190 Yeah, we want to keep doing it until our list is null. 936 00:40:28,190 --> 00:40:30,680 And we don't really know how many times we want to iterate, 937 00:40:30,680 --> 00:40:32,630 so a while loop seems like a good choice. 938 00:40:32,630 --> 00:40:35,880 A while loop we can do something until something else is true. 939 00:40:35,880 --> 00:40:40,190 So we could say, while list doesn't equal null. 940 00:40:40,190 --> 00:40:43,340 While we still have nodes, essentially, let's 941 00:40:43,340 --> 00:40:47,960 go ahead and make sure that we have a pointer to the next node. 942 00:40:47,960 --> 00:40:50,240 Let's free the current node, and then let's 943 00:40:50,240 --> 00:40:53,280 update the current node to point to whatever it was previously, 944 00:40:53,280 --> 00:40:54,590 the next one. 945 00:40:54,590 --> 00:40:57,503 So we could do something like this. 946 00:40:57,503 --> 00:41:00,170 And now, if we run our code, if we go back to our terminal here, 947 00:41:00,170 --> 00:41:02,550 we'll make list. 948 00:41:02,550 --> 00:41:06,040 We'll run valgrind on ./list. 949 00:41:06,040 --> 00:41:12,990 And what do we get if we type in, let's say, Hi! here, and then Hello? 950 00:41:12,990 --> 00:41:15,930 We see, successfully, all heap blocks were freed. 951 00:41:15,930 --> 00:41:19,460 No leaks are possible down below here. 952 00:41:19,460 --> 00:41:23,880 So what questions are there on this, if you look at our code 953 00:41:23,880 --> 00:41:25,290 for unloading here? 954 00:41:25,290 --> 00:41:30,420 955 00:41:30,420 --> 00:41:32,640 Yeah, a good question, is nil the same as no? 956 00:41:32,640 --> 00:41:36,210 So if we look at our terminal and we actually run this again, 957 00:41:36,210 --> 00:41:40,800 call it a make list, I'll run ./list, and I'll say Hi! 958 00:41:40,800 --> 00:41:44,280 And notice how this very first node has the next pointer of nil. 959 00:41:44,280 --> 00:41:46,690 Nil is simply the terminal representation of null. 960 00:41:46,690 --> 00:41:48,795 So nil means null in the terminal. 961 00:41:48,795 --> 00:41:51,640 962 00:41:51,640 --> 00:41:53,260 All right. 963 00:41:53,260 --> 00:41:57,140 And now, I'm seeing another question, is this a recursive function? 964 00:41:57,140 --> 00:42:00,370 So this still is probably not a recursive function, 965 00:42:00,370 --> 00:42:04,720 because a recursive function would kind of call itself like this function might 966 00:42:04,720 --> 00:42:07,520 then call unload somewhere in here. 967 00:42:07,520 --> 00:42:10,135 So it's not quite recursive, but you are correct in noticing 968 00:42:10,135 --> 00:42:12,010 that this is a good application for recursion 969 00:42:12,010 --> 00:42:16,220 because we're doing the very same thing for every node as we go. 970 00:42:16,220 --> 00:42:18,670 So you could certainly write this recursively. 971 00:42:18,670 --> 00:42:21,430 Here, we're going to iteratively using a loop. 972 00:42:21,430 --> 00:42:25,500 973 00:42:25,500 --> 00:42:26,520 OK. 974 00:42:26,520 --> 00:42:27,630 Other questions on this? 975 00:42:27,630 --> 00:42:35,430 976 00:42:35,430 --> 00:42:35,930 OK. 977 00:42:35,930 --> 00:42:38,550 So let's keep going on. 978 00:42:38,550 --> 00:42:39,950 And we've seen linked lists now. 979 00:42:39,950 --> 00:42:41,810 We've seen these really two core operations 980 00:42:41,810 --> 00:42:44,960 for linked lists, like, how do we add some data to them, 981 00:42:44,960 --> 00:42:47,360 and how do we unload them? 982 00:42:47,360 --> 00:42:50,318 What you might ask, though, is how can we speed this up? 983 00:42:50,318 --> 00:42:51,860 And that's where hash tables come in. 984 00:42:51,860 --> 00:42:55,037 So hash tables are simply an array of linked lists. 985 00:42:55,037 --> 00:42:57,370 We take linked lists, we break them into smaller pieces. 986 00:42:57,370 --> 00:42:59,120 And so let's get a visual for that and see 987 00:42:59,120 --> 00:43:00,990 if we can make this run even faster. 988 00:43:00,990 --> 00:43:06,260 So if we go back to our slides here and we see this idea of a linked list, 989 00:43:06,260 --> 00:43:07,370 we have maybe, Hey! 990 00:43:07,370 --> 00:43:08,060 Hello! 991 00:43:08,060 --> 00:43:08,840 Lo there! 992 00:43:08,840 --> 00:43:12,560 Different kind of activation words for our voice assistant. 993 00:43:12,560 --> 00:43:15,890 Well, what we could do is try to make sure we can access 994 00:43:15,890 --> 00:43:18,470 certain parts of this linked list faster to make sure 995 00:43:18,470 --> 00:43:20,870 we can actually search this linked list faster. 996 00:43:20,870 --> 00:43:22,580 And we could for that use a hash table. 997 00:43:22,580 --> 00:43:25,890 So recall from earlier that a hash table looks a bit like this. 998 00:43:25,890 --> 00:43:28,310 We have some array on the left-hand side. 999 00:43:28,310 --> 00:43:31,370 And we have different, what we call, buckets in here, 1000 00:43:31,370 --> 00:43:35,180 where a bucket might, in this case, be a letter of the alphabet. 1001 00:43:35,180 --> 00:43:38,750 And that letter of the alphabet is maybe the first character in a word. 1002 00:43:38,750 --> 00:43:43,850 You could have other buckets, other ways of assigning these phrases to buckets. 1003 00:43:43,850 --> 00:43:46,400 But for now, let's focus on this alphabetical hash 1004 00:43:46,400 --> 00:43:48,740 function here and alphabetical hash table. 1005 00:43:48,740 --> 00:43:52,640 So with this, we have different letters for every hash bucket. 1006 00:43:52,640 --> 00:44:00,560 And how does C know where in the array H, I, J, K and L are? 1007 00:44:00,560 --> 00:44:03,090 If we wanted to add some nodes to a certain bucket, well, 1008 00:44:03,090 --> 00:44:05,380 how do we figure out where that bucket is? 1009 00:44:05,380 --> 00:44:07,540 Anyone in the chat want to chime in on that? 1010 00:44:07,540 --> 00:44:11,670 1011 00:44:11,670 --> 00:44:12,170 Yes. 1012 00:44:12,170 --> 00:44:13,400 So it seems to be that they're ordered. 1013 00:44:13,400 --> 00:44:15,483 They're certainly in the right alphabetical order. 1014 00:44:15,483 --> 00:44:17,470 That's helpful. 1015 00:44:17,470 --> 00:44:21,790 Other things, too, for how we know where H is located, or L is located? 1016 00:44:21,790 --> 00:44:24,950 1017 00:44:24,950 --> 00:44:28,350 Yes, we could use maybe ASCII numbers and so on. 1018 00:44:28,350 --> 00:44:30,560 And we're on the right track here. 1019 00:44:30,560 --> 00:44:34,160 But because if you look at the left-hand side, this is just an array, 1020 00:44:34,160 --> 00:44:38,420 and we've kind of ourselves as humans assigned those array buckets, 1021 00:44:38,420 --> 00:44:41,030 those array segments, some letter. 1022 00:44:41,030 --> 00:44:44,000 But we have to assign them some number, really, some index. 1023 00:44:44,000 --> 00:44:48,930 And so you might recall from very early on, all arrays have these indices, 1024 00:44:48,930 --> 00:44:49,430 right? 1025 00:44:49,430 --> 00:44:52,340 So maybe zero to however big your array is. 1026 00:44:52,340 --> 00:44:56,540 And now we have, in this case, maybe H corresponds to the 7 index 1027 00:44:56,540 --> 00:45:00,410 in our table, or I corresponds to the 8 index, and so on. 1028 00:45:00,410 --> 00:45:04,550 And so in those indexes, we can actually add in our linked list. 1029 00:45:04,550 --> 00:45:09,110 And so here, if we notice that H corresponds to 7, 1030 00:45:09,110 --> 00:45:12,500 well, H is the eighth letter of the alphabet. 1031 00:45:12,500 --> 00:45:15,020 But if we were counting from zero, it would be the seventh. 1032 00:45:15,020 --> 00:45:19,100 So in this case, H corresponds to zero as that seventh letter of the alphabet 1033 00:45:19,100 --> 00:45:22,010 counting from zero here. 1034 00:45:22,010 --> 00:45:26,340 That's how we know how we actually add in these linked list nodes. 1035 00:45:26,340 --> 00:45:31,060 Now, if I were to give you some new phrase, not Hey! or Hello! or Lo there! 1036 00:45:31,060 --> 00:45:35,290 How would you figure out where to put it in this case? 1037 00:45:35,290 --> 00:45:37,920 Maybe in the chat? 1038 00:45:37,920 --> 00:45:42,500 If I gave you some new phrase, what could you 1039 00:45:42,500 --> 00:45:44,300 do to figure out where to put it? 1040 00:45:44,300 --> 00:45:49,560 1041 00:45:49,560 --> 00:45:50,060 Yeah. 1042 00:45:50,060 --> 00:45:51,730 So I'm seeing maybe at the start. 1043 00:45:51,730 --> 00:45:54,140 I don't really know until we see the phrase, right? 1044 00:45:54,140 --> 00:45:57,450 And that's where this idea of a hash function comes in. 1045 00:45:57,450 --> 00:45:59,750 So in order to figure out where Hey! and Hello! 1046 00:45:59,750 --> 00:46:04,670 and Lo there! go, we need to have some kind of function that takes our phrase 1047 00:46:04,670 --> 00:46:07,320 and gives us back some number that tells us where to put it. 1048 00:46:07,320 --> 00:46:09,200 So here we have, maybe, the phrase Hey! 1049 00:46:09,200 --> 00:46:11,150 And our hash function should ideally give us 1050 00:46:11,150 --> 00:46:14,910 some number that tells us where to put that given phrase. 1051 00:46:14,910 --> 00:46:17,960 So if I gave you not Hey! or Hi! or Hello there, 1052 00:46:17,960 --> 00:46:20,360 maybe I gave you the name Carter. 1053 00:46:20,360 --> 00:46:23,630 Well, that should give us some new number that then corresponds 1054 00:46:23,630 --> 00:46:27,140 to an index in our array, or we could start this new linked 1055 00:46:27,140 --> 00:46:30,420 list that might have all the C words, for example. 1056 00:46:30,420 --> 00:46:33,140 So let's actually take a look at this kind of in code now. 1057 00:46:33,140 --> 00:46:38,990 If we go back to our Visual Studio code over here, I'll open up this new file, 1058 00:46:38,990 --> 00:46:40,250 this one called table. 1059 00:46:40,250 --> 00:46:42,590 So I'll code table.c. 1060 00:46:42,590 --> 00:46:44,310 And it's already a bit done for me. 1061 00:46:44,310 --> 00:46:46,680 But notice how we have a few different things. 1062 00:46:46,680 --> 00:46:50,540 We have the same node structure from before for making our linked lists. 1063 00:46:50,540 --> 00:46:54,498 And now, we have this new array of pointers to node. 1064 00:46:54,498 --> 00:46:55,790 And we're calling this a table. 1065 00:46:55,790 --> 00:46:59,570 And notice how we have 26 buckets, 26 places 1066 00:46:59,570 --> 00:47:01,370 in which to have a linked list now. 1067 00:47:01,370 --> 00:47:04,070 And we can figure out where these new phrases 1068 00:47:04,070 --> 00:47:09,330 should go by running this function called hash that we'll see down below. 1069 00:47:09,330 --> 00:47:10,650 So let's scroll down. 1070 00:47:10,650 --> 00:47:13,100 We see that, here, we have this function called hash. 1071 00:47:13,100 --> 00:47:15,350 And hash is right now just returning zero. 1072 00:47:15,350 --> 00:47:18,290 Because if I give it a new phrase, I haven't really finished this yet. 1073 00:47:18,290 --> 00:47:20,660 I just want to return zero for every phrase I get. 1074 00:47:20,660 --> 00:47:24,590 And ideally, if I run this program, I should type in a phrase 1075 00:47:24,590 --> 00:47:29,330 and see the index that I should place this phrase inside of my hash table, 1076 00:47:29,330 --> 00:47:33,920 and then print out, OK, this phrase hash is to whatever number I have. 1077 00:47:33,920 --> 00:47:35,010 So let's see an example. 1078 00:47:35,010 --> 00:47:38,820 So if I go to my terminal now, I could make this full screen 1079 00:47:38,820 --> 00:47:42,120 and I could simply type, make a table. 1080 00:47:42,120 --> 00:47:46,200 And now, I'll run ./table, and I'll type in Hi! 1081 00:47:46,200 --> 00:47:48,330 And I get zero, which is maybe OK. 1082 00:47:48,330 --> 00:47:50,400 But let's try now-- 1083 00:47:50,400 --> 00:47:54,830 let's try Lo there! as we saw before, and we get zero. 1084 00:47:54,830 --> 00:47:57,200 So why would this be bad? 1085 00:47:57,200 --> 00:48:01,700 Besides it being unfinished, why would we not want Hi! and Lo there! 1086 00:48:01,700 --> 00:48:03,380 to kind of hash to that same value? 1087 00:48:03,380 --> 00:48:07,830 1088 00:48:07,830 --> 00:48:09,090 Any ideas in the chat? 1089 00:48:09,090 --> 00:48:16,180 1090 00:48:16,180 --> 00:48:16,680 Yeah. 1091 00:48:16,680 --> 00:48:20,683 So I'm seeing that this is kind of similar to a linked list, 1092 00:48:20,683 --> 00:48:21,600 if we were to do this. 1093 00:48:21,600 --> 00:48:23,220 Like, if we have-- 1094 00:48:23,220 --> 00:48:24,840 back to our visual here. 1095 00:48:24,840 --> 00:48:30,527 If we have some array of linked lists, but every word hashes to zero, 1096 00:48:30,527 --> 00:48:32,985 well, we have all this other space to put our linked lists, 1097 00:48:32,985 --> 00:48:35,935 but we're just putting everything in the zero bucket, which it makes-- 1098 00:48:35,935 --> 00:48:39,060 turns us back into this picture, which is just a single linked list, right? 1099 00:48:39,060 --> 00:48:42,098 The goal of this hash function is to have many shorter linked lists, 1100 00:48:42,098 --> 00:48:43,890 and that's the goal of trying to put things 1101 00:48:43,890 --> 00:48:46,210 in different buckets, different indexes here. 1102 00:48:46,210 --> 00:48:49,650 So ideally, we don't want to have everything hash to the same value, 1103 00:48:49,650 --> 00:48:52,620 we want to have different hash values for different phrases 1104 00:48:52,620 --> 00:48:54,630 that we might input. 1105 00:48:54,630 --> 00:48:57,390 So one way to do this, as we're seeing right now, 1106 00:48:57,390 --> 00:49:03,120 is trying to make sure we have different linked lists that depend on the very 1107 00:49:03,120 --> 00:49:05,165 first letter of the phrase we add. 1108 00:49:05,165 --> 00:49:08,290 And let's go back to our hash function and figure out how we could do that. 1109 00:49:08,290 --> 00:49:10,570 So let's exit our program here. 1110 00:49:10,570 --> 00:49:12,310 And let's go ahead and go back here. 1111 00:49:12,310 --> 00:49:18,780 And let's try to implement hash so that we get a value 0-25 for a given phrase. 1112 00:49:18,780 --> 00:49:19,780 So here, let's try this. 1113 00:49:19,780 --> 00:49:24,030 Let's actually try returning not just zero, 1114 00:49:24,030 --> 00:49:29,010 but let's say, depending on the very first letter of our phrase, 1115 00:49:29,010 --> 00:49:31,790 we return that and see how that goes. 1116 00:49:31,790 --> 00:49:34,720 So here we have returning phrase, bracket zero. 1117 00:49:34,720 --> 00:49:36,310 And we'll go back to our terminal now. 1118 00:49:36,310 --> 00:49:44,680 We'll say make table, and then ./table, and let's try Hello! 1119 00:49:44,680 --> 00:49:46,540 And we see 72. 1120 00:49:46,540 --> 00:49:48,430 Is that-- I mean, it's better. 1121 00:49:48,430 --> 00:49:49,970 Right now, I have actual number. 1122 00:49:49,970 --> 00:49:54,510 And if I type in Carter, well, that's a different number, which is good. 1123 00:49:54,510 --> 00:49:57,460 And Lo there! perhaps, is also a different number. 1124 00:49:57,460 --> 00:49:59,610 So we're off to a good start. 1125 00:49:59,610 --> 00:50:01,710 But what's wrong with this? 1126 00:50:01,710 --> 00:50:06,240 If I get 72, and 67, and 76, I go back to my code 1127 00:50:06,240 --> 00:50:09,570 and I look at my table, what's wrong? 1128 00:50:09,570 --> 00:50:13,800 1129 00:50:13,800 --> 00:50:14,300 Yeah. 1130 00:50:14,300 --> 00:50:16,500 So I'm seeing I only have 26 buckets. 1131 00:50:16,500 --> 00:50:21,080 So if I wanted to, let's say, get a phrase and put it in my bucket 1132 00:50:21,080 --> 00:50:25,160 somewhere, well, I can't really do that because maybe Hello! hashes to, 1133 00:50:25,160 --> 00:50:26,540 what was it if we look at it? 1134 00:50:26,540 --> 00:50:27,650 72. 1135 00:50:27,650 --> 00:50:30,840 Well, that's not a valid index in my table. 1136 00:50:30,840 --> 00:50:33,830 So I need to actually make this shorter a little bit. 1137 00:50:33,830 --> 00:50:37,520 And one way to do that is maybe just subtract A from it. 1138 00:50:37,520 --> 00:50:38,970 So I could do something like this. 1139 00:50:38,970 --> 00:50:41,540 And what this will do is, maybe if you recall 1140 00:50:41,540 --> 00:50:47,290 how A maps to 65, well, if I get A as the very first character, well, 1141 00:50:47,290 --> 00:50:49,850 what is 65? 1142 00:50:49,850 --> 00:50:52,100 A minus A is zero. 1143 00:50:52,100 --> 00:50:54,770 If I get B, B is 66 in ASCII. 1144 00:50:54,770 --> 00:50:57,560 So B minus A is 1. 1145 00:50:57,560 --> 00:51:02,000 And if I get all the way down to Z, well, Z minus a is going to be 25. 1146 00:51:02,000 --> 00:51:04,740 And if I wrap back around to A, well, that's just zero again. 1147 00:51:04,740 --> 00:51:07,580 So what I've done now is I've made sure that every index I get 1148 00:51:07,580 --> 00:51:11,940 is going to be inside the kind of length of my table here. 1149 00:51:11,940 --> 00:51:15,500 I won't get any indexes that are out of bounds for this list. 1150 00:51:15,500 --> 00:51:21,330 And now, if I kind of go back here, we run this code, I'll do make table. 1151 00:51:21,330 --> 00:51:26,580 I'll run ./table, and I'll do Hello! now, and I get 7, which is better. 1152 00:51:26,580 --> 00:51:28,470 Now, I'll do maybe Lo there! 1153 00:51:28,470 --> 00:51:29,490 and I get 11. 1154 00:51:29,490 --> 00:51:33,670 And now, I'll type in just Z to see how that works, and I get 25. 1155 00:51:33,670 --> 00:51:35,550 So that seems to be pretty good for myself. 1156 00:51:35,550 --> 00:51:39,520 But what still am I missing here? 1157 00:51:39,520 --> 00:51:42,340 What could go wrong? 1158 00:51:42,340 --> 00:51:43,060 Any ideas? 1159 00:51:43,060 --> 00:51:49,220 1160 00:51:49,220 --> 00:51:49,720 Yes. 1161 00:51:49,720 --> 00:51:52,240 So lowercased letters-- right now, I'm working 1162 00:51:52,240 --> 00:51:53,830 with phrases that are all capital. 1163 00:51:53,830 --> 00:51:56,538 But let's say someone wants to add a phrase that is in lowercase. 1164 00:51:56,538 --> 00:52:00,230 So I could actually open a terminal again, make table. 1165 00:52:00,230 --> 00:52:04,740 I'll do ./table, and I'll do maybe hello! in lowercase. 1166 00:52:04,740 --> 00:52:07,920 And I get 39, which isn't quite what I want. 1167 00:52:07,920 --> 00:52:09,800 So what could I do instead? 1168 00:52:09,800 --> 00:52:13,538 Maybe any ideas in the chat? 1169 00:52:13,538 --> 00:52:14,455 What can I do instead? 1170 00:52:14,455 --> 00:52:19,360 1171 00:52:19,360 --> 00:52:21,920 I could, perhaps, maybe standardize on capital letters. 1172 00:52:21,920 --> 00:52:25,930 So I could say, I want to make whatever this first letter is uppercase, 1173 00:52:25,930 --> 00:52:29,890 and then subtract that capital A. So if I go back to my terminal, I could say, 1174 00:52:29,890 --> 00:52:32,520 make table. 1175 00:52:32,520 --> 00:52:34,050 I could say ./table. 1176 00:52:34,050 --> 00:52:38,310 And now I want for it to be the case that lowercase hello and capital 1177 00:52:38,310 --> 00:52:40,060 Hello! hash that same value. 1178 00:52:40,060 --> 00:52:45,430 So I'll type hello here, and capital H Hello! here, and I get 7 in both cases. 1179 00:52:45,430 --> 00:52:48,550 So that seems to work for me now. 1180 00:52:48,550 --> 00:52:51,270 All right. 1181 00:52:51,270 --> 00:52:56,240 Any other-- if we go back to our terminal here, 1182 00:52:56,240 --> 00:52:59,540 that will tell us where to put individual phrases. 1183 00:52:59,540 --> 00:53:04,190 So questions on this hash function so far? 1184 00:53:04,190 --> 00:53:12,760 1185 00:53:12,760 --> 00:53:13,260 All right. 1186 00:53:13,260 --> 00:53:15,990 1187 00:53:15,990 --> 00:53:18,520 So not seeing too many questions here. 1188 00:53:18,520 --> 00:53:19,820 Why don't we do this? 1189 00:53:19,820 --> 00:53:24,490 So let's go ahead and think about what makes for a good hash function? 1190 00:53:24,490 --> 00:53:29,490 So if we think about this as our very first hash function, 1191 00:53:29,490 --> 00:53:33,418 this idea of putting nodes into individual buckets, 1192 00:53:33,418 --> 00:53:35,710 well, it's not going to be quite what we want it to be. 1193 00:53:35,710 --> 00:53:37,627 And if we actually go through down here and we 1194 00:53:37,627 --> 00:53:39,420 think about how could we try to make sure 1195 00:53:39,420 --> 00:53:42,180 we actually better hash function than this, what we could do 1196 00:53:42,180 --> 00:53:45,640 is actually make sure we have a good distribution of values across it. 1197 00:53:45,640 --> 00:53:48,510 So if you think about our buckets here, we 1198 00:53:48,510 --> 00:53:51,930 want to try to have-- to make sure that every node in our linked list 1199 00:53:51,930 --> 00:53:54,510 is kind of going to go in every bucket here. 1200 00:53:54,510 --> 00:53:58,050 Like, we actually have a lot of I words, a lot of J words, a lot of K words. 1201 00:53:58,050 --> 00:54:01,110 And here, the way we hash this isn't going to quite do that for us. 1202 00:54:01,110 --> 00:54:04,350 We're not going to actually have any words that start with I, or J, or K, 1203 00:54:04,350 --> 00:54:09,130 so we probably think of a new hash function to use in this case. 1204 00:54:09,130 --> 00:54:13,183 And if we go back over here, we can think of some good points 1205 00:54:13,183 --> 00:54:14,100 for our hash function. 1206 00:54:14,100 --> 00:54:18,000 We can think of, OK, we always try to get the same value for a given input. 1207 00:54:18,000 --> 00:54:20,880 We always want to produce an even distribution across the buckets, 1208 00:54:20,880 --> 00:54:23,970 and we always want to make sure that we use all the buckets that we're 1209 00:54:23,970 --> 00:54:26,300 given in this case. 1210 00:54:26,300 --> 00:54:28,488 So these are some ideas for a good hash function. 1211 00:54:28,488 --> 00:54:30,280 Here, what we'll actually do is let you all 1212 00:54:30,280 --> 00:54:34,927 go off and write your own hash functions to go off and build your own overall. 1213 00:54:34,927 --> 00:54:36,760 So if we think of some other ideas here, you 1214 00:54:36,760 --> 00:54:40,150 could use the length of the phrases, you could make sure you maybe 1215 00:54:40,150 --> 00:54:42,170 add up their values, and so on. 1216 00:54:42,170 --> 00:54:44,830 But what we could do is go ahead and have you all 1217 00:54:44,830 --> 00:54:47,920 do that for the rest of this week now. 1218 00:54:47,920 --> 00:54:51,940 And I think that will probably conclude us for this section. 1219 00:54:51,940 --> 00:54:53,690 It is so good to see you all here today. 1220 00:54:53,690 --> 00:54:54,898 Thank you all for joining us. 1221 00:54:54,898 --> 00:54:57,660 We will see you next week. 1222 00:54:57,660 --> 00:54:59,000