WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:02.862 --> 00:00:03.910 CARTER ZENKE: OK. 00:00:03.910 --> 00:00:04.940 Hello, everyone. 00:00:04.940 --> 00:00:07.982 It is so good to see you here for our Week 5 Section. 00:00:07.982 --> 00:00:09.940 If you don't remember, my name is Carter Zenke. 00:00:09.940 --> 00:00:11.740 I'm the course's preceptor here on campus, 00:00:11.740 --> 00:00:13.902 and it's so delightful to see you all over Zoom. 00:00:13.902 --> 00:00:15.860 I thought what we'd do just to kick things off, 00:00:15.860 --> 00:00:18.880 assuming you can hear me this time, is to on the count of three, 00:00:18.880 --> 00:00:22.150 we'll all unmute at the same time and say a big hello. 00:00:22.150 --> 00:00:25.440 So on the count of three, we'll do one, two, three, hello. 00:00:25.440 --> 00:00:26.830 AUDIENCE: Hello. 00:00:26.830 --> 00:00:28.195 AUDIENCE: Hello. 00:00:28.195 --> 00:00:29.455 AUDIENCE: Hello. 00:00:29.455 --> 00:00:31.640 AUDIENCE: Hello. 00:00:31.640 --> 00:00:32.800 CARTER ZENKE: All right. 00:00:32.800 --> 00:00:33.220 AUDIENCE: Hello. 00:00:33.220 --> 00:00:34.585 CARTER ZENKE: It is so wonderful-- 00:00:34.585 --> 00:00:35.252 AUDIENCE: Hello. 00:00:35.252 --> 00:00:36.481 AUDIENCE: Hello. 00:00:36.481 --> 00:00:37.690 AUDIENCE: Hello. 00:00:37.690 --> 00:00:38.710 CARTER ZENKE: Hello. 00:00:38.710 --> 00:00:40.533 It is so wonderful to see you all here. 00:00:40.533 --> 00:00:41.200 AUDIENCE: Hello. 00:00:41.200 --> 00:00:42.968 AUDIENCE: Hello. 00:00:42.968 --> 00:00:46.280 CARTER ZENKE: And we'll go ahead and mute. 00:00:46.280 --> 00:00:48.760 AUDIENCE: Hello. 00:00:48.760 --> 00:00:52.000 CARTER ZENKE: OK. 00:00:52.000 --> 00:00:52.500 All right. 00:00:52.500 --> 00:00:55.208 So if you'd like to participate, feel free to do so via the chat. 00:00:55.208 --> 00:00:58.405 You're welcome to type in the chat to ask questions via the chat. 00:00:58.405 --> 00:01:00.780 Again, the goal for these sections is really to make sure 00:01:00.780 --> 00:01:03.090 that you're able to ask the questions you want to ask 00:01:03.090 --> 00:01:06.250 and for us to dive into this learning together. 00:01:06.250 --> 00:01:08.460 So today, focused on data structures. 00:01:08.460 --> 00:01:12.540 And this week, you build your own data structure with a speller problem set. 00:01:12.540 --> 00:01:16.170 And I want to focus on this big idea behind data structures. 00:01:16.170 --> 00:01:17.087 Why would we use them? 00:01:17.087 --> 00:01:18.962 What are the trade-offs between them, and how 00:01:18.962 --> 00:01:20.800 to actually use them in the real world. 00:01:20.800 --> 00:01:23.680 So to that end, we'll have three questions during our section today. 00:01:23.680 --> 00:01:27.360 The first will be this idea of the trade-off between data structures. 00:01:27.360 --> 00:01:30.180 When should we use one data structure versus another? 00:01:30.180 --> 00:01:32.790 The second question will be about these linked lists. 00:01:32.790 --> 00:01:36.660 We saw these in lecture, but how can we build our own linked lists? 00:01:36.660 --> 00:01:37.920 What can we do with them? 00:01:37.920 --> 00:01:42.480 And how can we better figure out how to actually use them in our code? 00:01:42.480 --> 00:01:44.670 And then, finally, our third question will 00:01:44.670 --> 00:01:48.230 be this idea of how do we build our own hash table where hash tables rely 00:01:48.230 --> 00:01:50.730 on a lot of fundamentals that linked lists can give us here, 00:01:50.730 --> 00:01:52.688 but can run maybe a little bit faster, as we'll 00:01:52.688 --> 00:01:55.900 see in the problem set for this week. 00:01:55.900 --> 00:01:59.220 So to really dive into things here and to make things a little more concrete, 00:01:59.220 --> 00:02:01.500 we saw all kinds of data structures. 00:02:01.500 --> 00:02:04.780 But often, when you're actually programming in the real world, 00:02:04.780 --> 00:02:08.220 you might choose one or another, depending on the scenario, what 00:02:08.220 --> 00:02:09.730 you're trying to build something. 00:02:09.730 --> 00:02:12.480 And in this case, what I'd like to do is show you all one scenario 00:02:12.480 --> 00:02:13.690 we can think about here. 00:02:13.690 --> 00:02:17.220 So imagine that you are working for some software company 00:02:17.220 --> 00:02:21.190 and they are making this digital assistant that runs on a mobile device. 00:02:21.190 --> 00:02:23.610 Maybe it's something like, hey, Siri, or hey, Google, 00:02:23.610 --> 00:02:25.680 or something that responds to people's voice 00:02:25.680 --> 00:02:27.630 over time on their personal device. 00:02:27.630 --> 00:02:31.230 You can do something for that person as they activate this assistant. 00:02:31.230 --> 00:02:35.430 Now, you've seen, or you've heard, that people who are using this assistant 00:02:35.430 --> 00:02:38.640 aren't often able to wake it up at the right moment. 00:02:38.640 --> 00:02:42.053 Maybe it sometimes doesn't work if I were to say maybe, hey, Siri, 00:02:42.053 --> 00:02:42.720 or, hey, Google. 00:02:42.720 --> 00:02:44.830 It doesn't quite work as I want it to. 00:02:44.830 --> 00:02:47.953 And so this is sort of a problem that we're running into here. 00:02:47.953 --> 00:02:50.370 But what we ultimately want to do is figure out, OK, well, 00:02:50.370 --> 00:02:51.690 how could we fix this? 00:02:51.690 --> 00:02:54.750 And maybe the first solution is to try to improve this 00:02:54.750 --> 00:02:58.200 by adding more words that could wake up this voice assistant. 00:02:58.200 --> 00:03:02.280 We want to store more data, more words, that could activate this voice 00:03:02.280 --> 00:03:03.330 assistant for ourselves. 00:03:03.330 --> 00:03:06.372 And the question that you might run into as a software engineer is, well, 00:03:06.372 --> 00:03:08.820 how could we best store all of these phrases 00:03:08.820 --> 00:03:11.282 we want our voice assistant to wake up to? 00:03:11.282 --> 00:03:12.990 What kind of data structure would you use 00:03:12.990 --> 00:03:17.190 to actually store these phrases that might activate our voice assistant? 00:03:17.190 --> 00:03:21.090 And we saw a few tools in our toolkit, a few structures in our toolkit. 00:03:21.090 --> 00:03:25.590 And one of these was the linked list, one of these was the try, one of these 00:03:25.590 --> 00:03:27.370 was the tree, and so on. 00:03:27.370 --> 00:03:29.260 So there's lots of ways we can do this. 00:03:29.260 --> 00:03:32.550 But let's think first about the different priorities we might 00:03:32.550 --> 00:03:34.980 have for this data structure, right? 00:03:34.980 --> 00:03:36.822 If we want to actually make this. 00:03:36.822 --> 00:03:38.530 Well, what do we want to prioritize here? 00:03:38.530 --> 00:03:39.698 So let's think about this. 00:03:39.698 --> 00:03:41.490 If we to add lots of words, we could really 00:03:41.490 --> 00:03:43.650 think of trying a few different approaches here. 00:03:43.650 --> 00:03:46.023 We could think of, well, a few different aspects, like, 00:03:46.023 --> 00:03:48.690 if our data structure is going to do a certain number of things, 00:03:48.690 --> 00:03:51.300 we might care about different aspects of how data structures are formed. 00:03:51.300 --> 00:03:53.400 Like, do we care about deleting things quickly? 00:03:53.400 --> 00:03:55.170 Do we care about inserting things quickly? 00:03:55.170 --> 00:03:57.128 Do we care about searching for phrases quickly? 00:03:57.128 --> 00:03:59.520 Like, for example, if I were to have this voice assistant 00:03:59.520 --> 00:04:03.990 and I'm trying to do something with it, do I care about deleting words quickly, 00:04:03.990 --> 00:04:07.230 or do I care about inserting new words quickly, 00:04:07.230 --> 00:04:10.830 or do I care about maybe if I were to speak something, if I can search 00:04:10.830 --> 00:04:12.270 and find that word really quickly? 00:04:12.270 --> 00:04:14.220 So often, when we're choosing data structure, 00:04:14.220 --> 00:04:15.928 we care about these kind of three things. 00:04:15.928 --> 00:04:18.820 How fast does it perform in these three areas? 00:04:18.820 --> 00:04:21.328 And so here, we run into our first question, right? 00:04:21.328 --> 00:04:23.370 If we want to figure out which way to prioritize, 00:04:23.370 --> 00:04:26.320 well, we could maybe say that we care about search the most. 00:04:26.320 --> 00:04:28.920 We want to be able to search and find a phrase really quickly. 00:04:28.920 --> 00:04:32.190 Maybe if the user speaks some word, we want to be able to figure out, 00:04:32.190 --> 00:04:35.640 is that word in the wake word list. 00:04:35.640 --> 00:04:37.590 Or maybe we really care about the user being 00:04:37.590 --> 00:04:41.010 able to quickly add a new wake word, like, I could say not just, hey, 00:04:41.010 --> 00:04:42.450 but hello, or hi. 00:04:42.450 --> 00:04:45.450 And I want to be able to program this data structure to sort of respond 00:04:45.450 --> 00:04:47.200 to that quickly. 00:04:47.200 --> 00:04:51.180 Now, what priorities might you have if you were faced with a scenario? 00:04:51.180 --> 00:04:53.130 If you want to chime in in the chat, which 00:04:53.130 --> 00:04:56.490 would you prioritize here between a fast insertion, a fast search, 00:04:56.490 --> 00:04:59.380 a fast deletion, if you're trying to do this? 00:04:59.380 --> 00:05:02.390 Feel free to chime in in the chat. 00:05:02.390 --> 00:05:02.890 OK. 00:05:02.890 --> 00:05:09.880 I'm seeing one person for search, OK, lots of searching, some insertion, too. 00:05:09.880 --> 00:05:10.380 Yeah. 00:05:10.380 --> 00:05:13.380 And so these are things we could probably reasonably disagree on, right? 00:05:13.380 --> 00:05:16.078 If I want to prioritize searching for some word, 00:05:16.078 --> 00:05:19.120 being able to speak it and have my assistant recognize it really quickly, 00:05:19.120 --> 00:05:22.037 well, that seems reasonable where we can make search our top priority. 00:05:22.037 --> 00:05:24.400 Or maybe it's also reasonable to say, well, 00:05:24.400 --> 00:05:26.723 I want people to make their own custom wake words. 00:05:26.723 --> 00:05:29.140 And so inserting something is really important to me, too. 00:05:29.140 --> 00:05:30.820 So it depends on what you're trying to build here. 00:05:30.820 --> 00:05:33.600 And so let's take a look at some of these tools in our tool kit. 00:05:33.600 --> 00:05:37.020 So the first one we said before was this idea of the linked list. 00:05:37.020 --> 00:05:39.437 And here's the linked list in full here. 00:05:39.437 --> 00:05:41.520 And the idea behind the linked list was that we're 00:05:41.520 --> 00:05:45.352 able to have these nodes where each node has a certain phrase 00:05:45.352 --> 00:05:48.060 inside of it-- at least in this case, trying to store phrases now 00:05:48.060 --> 00:05:49.020 in our linked list. 00:05:49.020 --> 00:05:51.760 And we have these nodes kind of follow each other in memory. 00:05:51.760 --> 00:05:53.590 They're no longer back-to-back, like, in an array, 00:05:53.590 --> 00:05:54.780 but they sort of build on top of each other 00:05:54.780 --> 00:05:57.360 and sort of follow each other throughout memory, being woven 00:05:57.360 --> 00:05:59.940 in to our bits and bytes, thereof. 00:05:59.940 --> 00:06:05.850 So if we were to use a linked list here, which of these three priorities 00:06:05.850 --> 00:06:08.130 would be kind of well-served by this linked 00:06:08.130 --> 00:06:12.983 list, like, what would a link list be good at in this case, do you think? 00:06:12.983 --> 00:06:14.275 Feel free to chime in the chat. 00:06:17.910 --> 00:06:18.410 OK. 00:06:18.410 --> 00:06:19.850 So I'm seeing some insertion. 00:06:19.850 --> 00:06:21.330 And, yeah, it's correct. 00:06:21.330 --> 00:06:23.720 So linked would be really fast to insert something into. 00:06:23.720 --> 00:06:27.500 In fact, we saw in lecture that all it takes to insert into a linked list 00:06:27.500 --> 00:06:30.588 is to simply make a new node and put it at the very front. 00:06:30.588 --> 00:06:32.630 And that's, basically, a constant time operation. 00:06:32.630 --> 00:06:34.100 It takes a fixed number of steps. 00:06:34.100 --> 00:06:36.290 So inserting into linked list is very fast 00:06:36.290 --> 00:06:39.447 and could be a good priority for our insertion here. 00:06:39.447 --> 00:06:41.030 But, again, what's the trade-off here? 00:06:41.030 --> 00:06:42.738 If we were to go with a linked list, what 00:06:42.738 --> 00:06:46.130 would be slower because we used the linked list now? 00:06:46.130 --> 00:06:49.130 Maybe insertion is fast, but what's not fast? 00:06:49.130 --> 00:06:51.507 Well, deletion is not very fast, right? 00:06:51.507 --> 00:06:53.840 But at least in this scenario, let's say we don't really 00:06:53.840 --> 00:06:57.290 want to care about deleting wake words, what's the other ones we can use, 00:06:57.290 --> 00:07:00.435 a little bit slow for us? 00:07:00.435 --> 00:07:00.935 Yeah. 00:07:00.935 --> 00:07:02.102 So I'm seeing search, right? 00:07:02.102 --> 00:07:04.100 Search would be pretty slow because, again, we 00:07:04.100 --> 00:07:06.410 have to sort of start at the very first node. 00:07:06.410 --> 00:07:09.442 Like, hey, and ask, OK, is this what the user said? 00:07:09.442 --> 00:07:11.150 No, it's not, let's go onto the next one. 00:07:11.150 --> 00:07:12.380 Oh, is this what the user said? 00:07:12.380 --> 00:07:14.088 Oh, it's not, let's go onto the next one. 00:07:14.088 --> 00:07:16.778 So we can only do linear search, basically, with a linked list. 00:07:16.778 --> 00:07:19.070 And this is a bit of simplification for how we actually 00:07:19.070 --> 00:07:20.330 do kind of voice recognition. 00:07:20.330 --> 00:07:22.205 But you get the idea, nonetheless, that we're 00:07:22.205 --> 00:07:25.040 trying to find and go through phrases that the user could have said. 00:07:25.040 --> 00:07:27.840 In a linked list, we have to go through every single node that we have, 00:07:27.840 --> 00:07:29.632 which, as I see in the chat is indeed going 00:07:29.632 --> 00:07:32.220 to be in big O of n going through n nodes 00:07:32.220 --> 00:07:35.440 to find a certain element inside of it. 00:07:35.440 --> 00:07:36.960 OK so we have linked lists here. 00:07:36.960 --> 00:07:39.040 But let's actually think of a new data structure. 00:07:39.040 --> 00:07:41.802 What if we use something like the hash table? 00:07:41.802 --> 00:07:43.510 And the hash table looks a bit like this. 00:07:43.510 --> 00:07:46.020 I'm going to go to the full screen mode here where 00:07:46.020 --> 00:07:47.760 it's a bit like a linked list. 00:07:47.760 --> 00:07:49.620 But what we've done is sort of categorize 00:07:49.620 --> 00:07:52.242 our linked list based on the very first character. 00:07:52.242 --> 00:07:54.450 And not all hash tables use the very first character, 00:07:54.450 --> 00:07:58.260 but this one, for example, does, and it is helpful in some ways. 00:07:58.260 --> 00:08:02.540 Like what would now be faster because we're using the hash table? 00:08:02.540 --> 00:08:05.368 What would be faster now? 00:08:05.368 --> 00:08:06.660 Feel free to write in the chat. 00:08:08.940 --> 00:08:09.440 OK. 00:08:09.440 --> 00:08:11.810 So I'm seeing some search, like, search is faster. 00:08:11.810 --> 00:08:15.770 And why is search faster, if you want to write in the chat quickly? 00:08:15.770 --> 00:08:18.425 Why would search be faster with this hash table now? 00:08:24.120 --> 00:08:26.790 So I'm seeing some of these ideas of indexing, right? 00:08:26.790 --> 00:08:27.660 And so-- right. 00:08:27.660 --> 00:08:31.770 If we look at the side of our hash table in this array on this left-hand side 00:08:31.770 --> 00:08:36.659 here, we've kind of split our linked list into several indexes, like, H, I, 00:08:36.659 --> 00:08:38.010 J, K and L. 00:08:38.010 --> 00:08:41.400 And in the computer's memory, these are represented with, actually, like, 00:08:41.400 --> 00:08:44.100 H, I, J, K and L. They're assigned some number. 00:08:44.100 --> 00:08:46.770 And we have a hash function that finds the number for each 00:08:46.770 --> 00:08:48.400 of these values on the left-hand side. 00:08:48.400 --> 00:08:51.430 But the idea here is that we're able to use indexing to figure out, 00:08:51.430 --> 00:08:54.180 OK, if I want to find the words beginning with H, 00:08:54.180 --> 00:08:57.878 I know right where to go, and I can only search those words. 00:08:57.878 --> 00:09:01.170 Or if I want to find the words beginning with L, I can search in that location, 00:09:01.170 --> 00:09:03.630 just find the words beginning with L. So indexing here 00:09:03.630 --> 00:09:05.422 is helping us breaking down our linked list 00:09:05.422 --> 00:09:08.170 into lots of smaller linked lists, too. 00:09:08.170 --> 00:09:12.070 So it seems if we were to insert into this hash table, 00:09:12.070 --> 00:09:13.380 it's also pretty quick, right? 00:09:13.380 --> 00:09:16.770 It's both faster searching and faster insertion. 00:09:16.770 --> 00:09:20.590 So why would we just not always use a hash table over a linked list? 00:09:20.590 --> 00:09:22.590 Like what's the problem with the hash table now? 00:09:22.590 --> 00:09:23.730 What's the trade-off here? 00:09:26.060 --> 00:09:26.560 Yeah. 00:09:26.560 --> 00:09:28.320 So I'm seeing this idea of memory. 00:09:28.320 --> 00:09:31.570 Like this hash table is going to take up much more memory than our linked list 00:09:31.570 --> 00:09:35.100 assuming, maybe, we have some buckets, some indexes that aren't really filled 00:09:35.100 --> 00:09:36.770 with these phrases. 00:09:36.770 --> 00:09:38.520 But it's also going to take us more memory 00:09:38.520 --> 00:09:40.582 to have all the different indexes possible here 00:09:40.582 --> 00:09:43.540 that we might want to actually have in order to make our lookup really, 00:09:43.540 --> 00:09:44.370 really fast. 00:09:44.370 --> 00:09:47.640 So it's a trade-off here is memory, as well. 00:09:47.640 --> 00:09:49.850 And now, it's got one other structure to consider, 00:09:49.850 --> 00:09:52.010 which would be this idea of the trie. 00:09:52.010 --> 00:09:54.240 And here's some brief visual of a trie. 00:09:54.240 --> 00:09:57.980 Would you like to point out some words you see inside of this trie, 00:09:57.980 --> 00:09:59.253 reading from left to right? 00:09:59.253 --> 00:10:00.170 What words do you see? 00:10:00.170 --> 00:10:02.960 You see hello, you see hey. 00:10:02.960 --> 00:10:04.400 Any other ones? 00:10:04.400 --> 00:10:07.910 We see help, certainly. 00:10:07.910 --> 00:10:09.980 Yeah, there's a few other ones in here, too. 00:10:09.980 --> 00:10:15.260 So this trie worked by trying of storing a node for every letter in our words. 00:10:15.260 --> 00:10:17.390 And this is very fast for lookup, right? 00:10:17.390 --> 00:10:19.910 If you want to figure out if hello is in this trie, 00:10:19.910 --> 00:10:24.980 well, all we have to do is see, OK, I see H, I see E, I see L, I see L, 00:10:24.980 --> 00:10:25.640 and I see O. 00:10:25.640 --> 00:10:27.327 Well, hello is in this structure, right? 00:10:27.327 --> 00:10:28.410 We know that it's in Here. 00:10:28.410 --> 00:10:30.320 So look up is very fast. 00:10:30.320 --> 00:10:32.840 And similarly, insertion is also really quickly. 00:10:32.840 --> 00:10:36.860 If I wanted to add Hi to this trie, well, I could simply go to the H node 00:10:36.860 --> 00:10:39.410 and say, OK, well, that covers the first letter. 00:10:39.410 --> 00:10:42.380 Now, I'll add I, maybe somewhere up above, and that's it. 00:10:42.380 --> 00:10:43.280 I've inserted Hi. 00:10:43.280 --> 00:10:46.658 So insertion and search are also really fast in our trie. 00:10:46.658 --> 00:10:48.200 But again, what's the trade-off here? 00:10:48.200 --> 00:10:49.700 Why wouldn't we always use the trie? 00:10:53.850 --> 00:10:54.350 Yeah. 00:10:54.350 --> 00:10:55.910 So I'm seeing this idea of memory again. 00:10:55.910 --> 00:10:57.980 Like, this is going to take up a lot more memory 00:10:57.980 --> 00:11:00.770 to store all the combinations of all the letters 00:11:00.770 --> 00:11:03.860 that could make up all the words we might want to have as our wake 00:11:03.860 --> 00:11:05.810 word for our assistant, right? 00:11:05.810 --> 00:11:08.062 And even beyond that, beyond the computer's memory, 00:11:08.062 --> 00:11:09.770 if you think about the time it would take 00:11:09.770 --> 00:11:12.620 to make a trie, a more complex data structure, 00:11:12.620 --> 00:11:15.170 well, that's real human time that you have to work on this, 00:11:15.170 --> 00:11:18.120 and might actually delay your shipment of some new feature, right? 00:11:18.120 --> 00:11:18.620 And 00:11:18.620 --> 00:11:21.770 So it's going to matter, not just how much time it takes for the computer, 00:11:21.770 --> 00:11:24.290 but also how much time it takes you to make the structure, 00:11:24.290 --> 00:11:28.670 if you wanted to actually make some feature for your own program, 00:11:28.670 --> 00:11:31.020 or your own company in the end. 00:11:31.020 --> 00:11:33.440 So questions on these structures so far? 00:11:33.440 --> 00:11:35.720 Happy to take some in the chat, if you have any. 00:11:42.530 --> 00:11:43.835 What questions do you have? 00:11:48.370 --> 00:11:50.300 A question about, does AI use data structures? 00:11:50.300 --> 00:11:50.800 Yes. 00:11:50.800 --> 00:11:53.980 So a lot of how AI can work so quickly is 00:11:53.980 --> 00:11:57.410 because we choose to organize data in these really efficient ways. 00:11:57.410 --> 00:12:00.310 And so by getting creative with the ways we actually make our data 00:12:00.310 --> 00:12:03.190 structures beyond even just linked lists and tries and trees, 00:12:03.190 --> 00:12:06.247 we can actually make AI a lot more quickly-- 00:12:06.247 --> 00:12:08.080 run a lot more quickly because it has access 00:12:08.080 --> 00:12:10.450 to this data structure laid out in such a way 00:12:10.450 --> 00:12:12.380 that it can run quickly in the end. 00:12:12.380 --> 00:12:16.175 So, yes, it makes use of a lot of different data structures, too. 00:12:16.175 --> 00:12:18.550 And this question about memory being so cheap these days. 00:12:18.550 --> 00:12:19.010 It is true. 00:12:19.010 --> 00:12:20.302 So memory is very cheap, right? 00:12:20.302 --> 00:12:23.200 If we wanted to make the trie here, maybe we 00:12:23.200 --> 00:12:27.280 don't care how much memory it takes up because memory is so cheap. 00:12:27.280 --> 00:12:30.130 But it also takes us time to allocate that memory, 00:12:30.130 --> 00:12:31.840 to put things in that memory. 00:12:31.840 --> 00:12:34.030 And so, perhaps, it's not always the case 00:12:34.030 --> 00:12:37.240 that because we have so much memory, we should just be kind of loose with it. 00:12:37.240 --> 00:12:38.440 We really need to take care that we're not 00:12:38.440 --> 00:12:41.050 using too much memory when we could, theoretically, maybe 00:12:41.050 --> 00:12:43.760 use less, if we wanted to. 00:12:43.760 --> 00:12:44.800 Nice. 00:12:44.800 --> 00:12:45.680 OK. 00:12:45.680 --> 00:12:48.410 So this is some of the ideas of trade-offs here, choosing 00:12:48.410 --> 00:12:51.480 one or the other, often going to cost us one thing or another. 00:12:51.480 --> 00:12:53.550 But, really, behind all of these structures 00:12:53.550 --> 00:12:58.100 is the idea of a node, where, if we look back at our structures here, 00:12:58.100 --> 00:13:01.670 we had maybe a trie, we had a link, we had a hash table, 00:13:01.670 --> 00:13:02.810 and we had a linked list. 00:13:02.810 --> 00:13:07.520 And it was composing all of these was this idea of some kind of node, 00:13:07.520 --> 00:13:09.110 some place we store data. 00:13:09.110 --> 00:13:12.690 And along with that, a pointer to some other pieces of data. 00:13:12.690 --> 00:13:16.100 So we saw last week, in Week 4, that pointers are 00:13:16.100 --> 00:13:18.380 helpful for pointing certain locations. 00:13:18.380 --> 00:13:20.900 Well, now, we're seeing the application of those pointers. 00:13:20.900 --> 00:13:23.025 Why would we even want pointers in the first place? 00:13:23.025 --> 00:13:25.860 Because we can build these much more complex data structures. 00:13:25.860 --> 00:13:29.797 So let's think about building these very first building block for these data 00:13:29.797 --> 00:13:31.130 sets, which is called this node. 00:13:31.130 --> 00:13:33.130 And in our code that we'll see in just a minute, 00:13:33.130 --> 00:13:36.318 we have this new idea of creating this node in code. 00:13:36.318 --> 00:13:38.360 And we've seen this kind of syntax before, right? 00:13:38.360 --> 00:13:41.480 Here is the code we would use to make a structure called a node. 00:13:41.480 --> 00:13:44.212 And we're seeing that same syntax for structs, 00:13:44.212 --> 00:13:45.920 like, we created a struct for candidates, 00:13:45.920 --> 00:13:47.630 we created a struct for people earlier. 00:13:47.630 --> 00:13:50.660 Now, we're using that same syntax to create a node. 00:13:50.660 --> 00:13:52.080 And what do you notice here? 00:13:52.080 --> 00:13:53.870 So first, we're going to look at the top. 00:13:53.870 --> 00:13:56.160 And C reads this top to bottom. 00:13:56.160 --> 00:13:58.260 So as it reads this code, it'll do the following. 00:13:58.260 --> 00:14:00.890 It'll say, OK, here I'm seeing we're going to define 00:14:00.890 --> 00:14:03.590 a new structure called a node. 00:14:03.590 --> 00:14:05.330 That seems OK so far. 00:14:05.330 --> 00:14:08.000 And then inside of the structure called node, 00:14:08.000 --> 00:14:10.520 we're going to ensure we have the following pieces of data. 00:14:10.520 --> 00:14:11.990 We're going to make sure we have a phrase. 00:14:11.990 --> 00:14:14.780 If we look at this, we're going to have space for a phrase that 00:14:14.780 --> 00:14:16.260 is going to be a string. 00:14:16.260 --> 00:14:21.320 And then we also have some space for a pointer to a node called next. 00:14:21.320 --> 00:14:23.660 And, again, we can store, like, Hi or bye in here. 00:14:23.660 --> 00:14:27.800 But we also have some new space for next, a pointer to a node. 00:14:27.800 --> 00:14:31.970 And that could have a real address in it, like, 0x123, or 0x456. 00:14:31.970 --> 00:14:35.750 But it has to point to some kind of node, right? 00:14:35.750 --> 00:14:39.290 So now inside this node, we have a phrase and a next portion 00:14:39.290 --> 00:14:42.560 currently empty, but we could put this data inside of it. 00:14:42.560 --> 00:14:45.593 And then once we finish reading this, we're going to say, all of this, 00:14:45.593 --> 00:14:48.260 everything we've done so far, the phrase, the next portions that 00:14:48.260 --> 00:14:51.350 are-- all of that to be wrapped up into this thing called a node. 00:14:51.350 --> 00:14:54.230 So we've created kind of here a template for what a node is. 00:14:54.230 --> 00:14:57.650 A node has a phrase in it and some next pointer in it. 00:14:57.650 --> 00:14:59.955 And we're going to call all of that a node. 00:14:59.955 --> 00:15:01.080 Now, you could change this. 00:15:01.080 --> 00:15:02.913 You can make sure it has an integer, you can 00:15:02.913 --> 00:15:04.940 make sure it has some character inside of it. 00:15:04.940 --> 00:15:08.270 You could even store more attributes, if you'd like, inside a single node. 00:15:08.270 --> 00:15:11.570 But for now, I'll focus on phrases, trying to store these activation 00:15:11.570 --> 00:15:13.790 words for our voice assistant for now. 00:15:13.790 --> 00:15:15.860 Now, questions on this structure so far? 00:15:19.080 --> 00:15:23.310 Yes, I'm seeing a question, is the next location inside the computer's memory? 00:15:23.310 --> 00:15:25.410 Is it actually in the byte below? 00:15:25.410 --> 00:15:27.960 Like here in our picture here, we see that the phrase 00:15:27.960 --> 00:15:31.770 portion is kind of first, and then comes the next portion. 00:15:31.770 --> 00:15:34.277 Are these two really next to each other in memory? 00:15:34.277 --> 00:15:36.360 And the answer for that is, generally, like, yeah. 00:15:36.360 --> 00:15:38.170 So these are going to be very close to-- they're in memory. 00:15:38.170 --> 00:15:40.337 They're going to be kind of right next to each other 00:15:40.337 --> 00:15:43.290 as we create this node throughout. 00:15:43.290 --> 00:15:46.240 And a question about, well, could we have a before field, 00:15:46.240 --> 00:15:47.580 like, a pointer to a before? 00:15:47.580 --> 00:15:48.580 And we absolutely could. 00:15:48.580 --> 00:15:50.740 So if we wanted to modify this, we could simply 00:15:50.740 --> 00:15:53.620 add a new attribute called before, and we could theoretically 00:15:53.620 --> 00:15:55.570 point to linked lists that are-- 00:15:55.570 --> 00:15:58.172 to nodes that are behind us, as well, in our current node. 00:15:58.172 --> 00:16:00.880 And that allows us to make more even more complex data structures 00:16:00.880 --> 00:16:01.750 overall, too. 00:16:05.790 --> 00:16:08.860 Now, a question here, we see a struct node if we go back here. 00:16:08.860 --> 00:16:10.980 So struct node, star, next. 00:16:10.980 --> 00:16:13.972 Is that a pointer to the struct, or just a simple pointer? 00:16:13.972 --> 00:16:15.930 And the answer here is a pointer to the struct. 00:16:15.930 --> 00:16:18.660 So here, we're seeing that this is a node pointer. 00:16:18.660 --> 00:16:20.800 And a node, technically, is a structure. 00:16:20.800 --> 00:16:23.010 So we're going to have a pointer to that node that 00:16:23.010 --> 00:16:26.190 is itself a structure combined with a phrase and another pointer 00:16:26.190 --> 00:16:27.340 to another node. 00:16:27.340 --> 00:16:30.860 So good questions there. 00:16:30.860 --> 00:16:32.780 All right. 00:16:32.780 --> 00:16:34.790 Any other questions on this structure here? 00:16:41.410 --> 00:16:42.820 OK. 00:16:42.820 --> 00:16:46.750 So that all seems pretty good so far, and kind of a simple template 00:16:46.750 --> 00:16:47.470 for a node. 00:16:47.470 --> 00:16:50.050 But keep in mind that just this code doesn't actually 00:16:50.050 --> 00:16:51.150 create the node for us. 00:16:51.150 --> 00:16:53.650 It gives us a template for the node, but we haven't actually 00:16:53.650 --> 00:16:55.037 created it in code yet. 00:16:55.037 --> 00:16:57.370 So let's actually go ahead and see how we might do that. 00:16:57.370 --> 00:17:00.610 If you were to go ahead and create our own linked list, let's say. 00:17:00.610 --> 00:17:04.000 Like, let's say, out of all these three structures, we chose a linked list now. 00:17:04.000 --> 00:17:06.250 We could start by making a single node and have 00:17:06.250 --> 00:17:08.900 that be at the head of our linked list. 00:17:08.900 --> 00:17:11.440 So to create a linked list, we'll often do this. 00:17:11.440 --> 00:17:14.470 We'll often create a new pointer to a node 00:17:14.470 --> 00:17:18.220 and first set it equal to null because it's pointing to nothing. 00:17:18.220 --> 00:17:20.560 There are no nodes in our linked list. 00:17:20.560 --> 00:17:23.500 But once we've done that, well, we probably want to add at least one 00:17:23.500 --> 00:17:24.290 node, right? 00:17:24.290 --> 00:17:25.540 So let's go ahead and do that. 00:17:25.540 --> 00:17:27.790 Let's make a new pointer, this one called n, 00:17:27.790 --> 00:17:30.500 and make sure we have some space in which to store it. 00:17:30.500 --> 00:17:32.200 So let's read this from right to left. 00:17:32.200 --> 00:17:33.670 Let's call malloc first. 00:17:33.670 --> 00:17:37.160 And malloc will give us some space in memory to actually put this node. 00:17:37.160 --> 00:17:39.250 So you can see at the visual right below we 00:17:39.250 --> 00:17:41.860 have some new space our program could use 00:17:41.860 --> 00:17:46.090 to actually store this node inside of and add a phrase, or some next pointer, 00:17:46.090 --> 00:17:47.140 right? 00:17:47.140 --> 00:17:49.480 But now if we go all the way to the left, 00:17:49.480 --> 00:17:53.920 we're going to see this pointer called n will then point to that space. 00:17:53.920 --> 00:17:55.870 And notice how n is a type node star. 00:17:55.870 --> 00:18:00.260 It's pointing to some space for a node, right? 00:18:00.260 --> 00:18:00.860 OK. 00:18:00.860 --> 00:18:02.783 So that makes our first node. 00:18:02.783 --> 00:18:05.450 But what we probably want to do is actually add some data to it. 00:18:05.450 --> 00:18:07.050 Currently, it's empty, right? 00:18:07.050 --> 00:18:07.920 So what can we do? 00:18:07.920 --> 00:18:10.640 Well, we can actually use this new syntax, this arrow syntax, 00:18:10.640 --> 00:18:12.780 to actually add in some data. 00:18:12.780 --> 00:18:18.050 So down below, we could say, n arrow phrase gets the value high. 00:18:18.050 --> 00:18:19.050 And what does that mean? 00:18:19.050 --> 00:18:21.200 So if you look at our visual here, we could see-- 00:18:21.200 --> 00:18:23.960 OK, we see n, and we see an arrow. 00:18:23.960 --> 00:18:27.470 And now, we're going to go to the phrase portion of this new piece of memory 00:18:27.470 --> 00:18:31.290 we've allocated and make sure we put the phrase Hi in there. 00:18:31.290 --> 00:18:34.400 So n arrow phrase is saying, start with that pointer 00:18:34.400 --> 00:18:36.950 and follow it to the structure that you have, 00:18:36.950 --> 00:18:39.650 go to that phrase attribute, that phrase portion, 00:18:39.650 --> 00:18:43.160 and make sure that says Hi in the middle, right? 00:18:43.160 --> 00:18:45.320 And that's all fine and good, but now, we also 00:18:45.320 --> 00:18:47.012 have to include the next portion. 00:18:47.012 --> 00:18:49.470 Like, currently it's empty, but we want to have some value. 00:18:49.470 --> 00:18:52.430 And if this is our first node, what's the logical value 00:18:52.430 --> 00:18:56.360 for this next attribute for our node? 00:18:56.360 --> 00:18:58.010 This is our very first node. 00:18:58.010 --> 00:18:59.120 What could we set it to? 00:19:01.680 --> 00:19:02.180 All right. 00:19:02.180 --> 00:19:03.472 So I'm seeing null in the chat. 00:19:03.472 --> 00:19:08.240 And the correct version is the N-U-L-L, not N-U-L. N-U-L is nul, 00:19:08.240 --> 00:19:10.490 and the concept of character is like a nul character. 00:19:10.490 --> 00:19:13.460 But N-U-L-L is null in the context of an address. 00:19:13.460 --> 00:19:16.530 So this is an empty address, zero address here. 00:19:16.530 --> 00:19:20.630 So if we say n arrow next gets the value null, we'd see something like this. 00:19:20.630 --> 00:19:25.020 Here, we have our new node n, it's phrase is Hi, its next pointer is null. 00:19:25.020 --> 00:19:28.160 It's the very first node in our list. 00:19:28.160 --> 00:19:34.130 So any questions on this so far, actually, making up our first node? 00:19:34.130 --> 00:19:37.280 To your question on the arrow syntax, so again, the arrow syntax 00:19:37.280 --> 00:19:40.680 is kind of best visualized by actually reading from left to right. 00:19:40.680 --> 00:19:42.740 So if we look at the very first thing, we see n. 00:19:42.740 --> 00:19:44.430 OK. n is a pointer. 00:19:44.430 --> 00:19:48.380 Now, the arrow says, let's follow that pointer to our structure, right, 00:19:48.380 --> 00:19:49.980 because n is pointing to a structure. 00:19:49.980 --> 00:19:52.550 And let's go to the next field, that next attribute, 00:19:52.550 --> 00:19:55.710 and set it equal to some value we put on the right-hand side. 00:19:55.710 --> 00:20:00.050 So here we said, start with n, follow n's pointer to our node structure, 00:20:00.050 --> 00:20:02.900 and go to that next attribute and set that equal to null, 00:20:02.900 --> 00:20:05.950 as we see down below here. 00:20:05.950 --> 00:20:06.940 OK. 00:20:06.940 --> 00:20:09.400 So there's one more step, though, which is 00:20:09.400 --> 00:20:11.757 our list is still looking pretty empty. 00:20:11.757 --> 00:20:14.090 We have our node, but our list is still empty over here. 00:20:14.090 --> 00:20:17.500 So what could we do to actually add this node to our linked list? 00:20:21.610 --> 00:20:28.445 What kind of code do we need here, or where do we want list to point now? 00:20:32.908 --> 00:20:35.450 Feel free to type in the chat if you think you might have it. 00:20:39.280 --> 00:20:41.620 So I'm seeing two competing opinions here. 00:20:41.620 --> 00:20:44.910 So one opinion is that we should say list equals n, 00:20:44.910 --> 00:20:47.010 which would be list gets the value n. 00:20:47.010 --> 00:20:51.040 And I'm seeing n equals list, or n gets the value list. 00:20:51.040 --> 00:20:53.220 Which one do we think might work here? 00:20:53.220 --> 00:20:54.120 Let's try one. 00:20:54.120 --> 00:20:57.120 So let's say, first, that list would get the value n. 00:20:57.120 --> 00:20:59.680 So list equals n, and let's try that. 00:20:59.680 --> 00:21:01.840 So here we have that up here and let's execute it. 00:21:01.840 --> 00:21:06.720 And now we'll see, OK, we'll list an n point to the same thing. 00:21:06.720 --> 00:21:08.940 Because n is a pointer and list is a pointer. 00:21:08.940 --> 00:21:12.160 If we set list equal to n, though, both point to the very same thing, 00:21:12.160 --> 00:21:13.703 which is our very first node. 00:21:13.703 --> 00:21:15.870 What would happen if we did it the other way around? 00:21:15.870 --> 00:21:21.510 If we said n equals list, instead of list equals n, where would n point to? 00:21:25.920 --> 00:21:27.877 Any ideas in the chat? 00:21:27.877 --> 00:21:28.960 So I'm seeing null, right? 00:21:28.960 --> 00:21:32.050 If list is currently null, it's pointing to really nothing in general, 00:21:32.050 --> 00:21:34.480 well, if we say n equals list, we're going 00:21:34.480 --> 00:21:38.680 to make sure that end points to null and we'll have lost this node over here 00:21:38.680 --> 00:21:40.060 that we just had, right? 00:21:40.060 --> 00:21:41.948 So we probably don't want to do that. 00:21:41.948 --> 00:21:43.990 So this will be our last step in adding our node. 00:21:43.990 --> 00:21:47.230 And now we have our very first node inside 00:21:47.230 --> 00:21:50.070 of our linked list, which is great. 00:21:50.070 --> 00:21:52.575 Questions on this here, too, before we keep going? 00:21:59.170 --> 00:22:01.660 OK. 00:22:01.660 --> 00:22:04.820 Not seeing too many here, so why don't we keep on going. 00:22:04.820 --> 00:22:08.410 And if we want to kick off the very first node, this is how we would do it. 00:22:08.410 --> 00:22:11.770 But let's say we wanted to actually keep adding more and more nodes. 00:22:11.770 --> 00:22:14.500 We could start thinking about inserting nodes to our linked list. 00:22:14.500 --> 00:22:16.910 And there are a few ways we could do even that. 00:22:16.910 --> 00:22:19.517 So if we think back to our picture as it is right now, 00:22:19.517 --> 00:22:21.850 let's say we wanted to create this new space for a node. 00:22:21.850 --> 00:22:24.850 We could say, OK, I'm going to have this new node called n, 00:22:24.850 --> 00:22:27.370 and I'm going to use malloc to give me some space for it. 00:22:27.370 --> 00:22:28.610 So same thing we did before. 00:22:28.610 --> 00:22:32.200 We're going to say, get some space, make sure n points to it. 00:22:32.200 --> 00:22:37.360 And now, how could we best actually add this node to our list? 00:22:37.360 --> 00:22:40.345 Would we add it at the end maybe, or the beginning, which do you think? 00:22:43.843 --> 00:22:46.760 Could we add n at the beginning or the end, and which would be faster? 00:22:50.337 --> 00:22:51.920 So I'm seeing some competing opinions. 00:22:51.920 --> 00:22:54.900 I'm seeing end and I'm seeing beginning. 00:22:54.900 --> 00:22:57.270 Now, if you think about a longer linked list, 00:22:57.270 --> 00:23:01.922 let's say we had not one node but really like 50 nodes. 00:23:01.922 --> 00:23:03.380 Would we want to add it at the end? 00:23:03.380 --> 00:23:05.713 And if so, how many steps would it take to add this node 00:23:05.713 --> 00:23:07.770 at the end of that linked list? 00:23:07.770 --> 00:23:10.470 If we have 50 nodes, how many steps would it 00:23:10.470 --> 00:23:12.180 take to add this node at the end? 00:23:12.180 --> 00:23:13.185 It would take 50, right? 00:23:13.185 --> 00:23:15.810 And that reason is that you have to actually find the last node 00:23:15.810 --> 00:23:16.780 and then add it there. 00:23:16.780 --> 00:23:20.190 So to find that last node, we have to go through all 50 of our nodes. 00:23:20.190 --> 00:23:22.440 So if we want the very quick insertion we talked about 00:23:22.440 --> 00:23:24.398 before, we have to add a node at the beginning, 00:23:24.398 --> 00:23:26.310 right, make sure that this new node points 00:23:26.310 --> 00:23:29.920 to the current first node in our list. 00:23:29.920 --> 00:23:31.237 So let's actually do that here. 00:23:31.237 --> 00:23:33.820 We have some space for a node, we have a pointer to that node. 00:23:33.820 --> 00:23:35.590 But let's see how we might add it in. 00:23:35.590 --> 00:23:38.550 So we probably first give this node a phrase, like, Hey. 00:23:38.550 --> 00:23:41.610 We'll say, OK, n arrow phrase gets the value Hey. 00:23:41.610 --> 00:23:47.940 And now, we want to say, hum, where should we point n's next pointer to? 00:23:47.940 --> 00:23:53.590 If we look at this picture here, where should we point n's next pointer? 00:23:53.590 --> 00:23:56.270 In the chat, if you will. 00:23:56.270 --> 00:23:57.110 Yeah, to list. 00:23:57.110 --> 00:23:59.840 So we want to make sure that this new node called 00:23:59.840 --> 00:24:03.690 n points to the very first node in our list, that being called, well, just 00:24:03.690 --> 00:24:04.190 list. 00:24:04.190 --> 00:24:07.950 So here we'll say, OK, n arrow next gets the value list. 00:24:07.950 --> 00:24:11.150 So now we're saying that n is a node, it has the phrase, 00:24:11.150 --> 00:24:14.630 Hey, and its next arrow is pointing to this list 00:24:14.630 --> 00:24:17.760 node, the very head of our list here. 00:24:17.760 --> 00:24:18.260 All right. 00:24:18.260 --> 00:24:21.770 But now, you'll notice that our linked list is not quite in the state 00:24:21.770 --> 00:24:24.660 we want it to be in, like, list is still pointing to our second node. 00:24:24.660 --> 00:24:26.368 So what's the next step here, if you want 00:24:26.368 --> 00:24:30.550 to make sure our list now includes this very first node? 00:24:30.550 --> 00:24:32.000 What should we do? 00:24:32.000 --> 00:24:34.400 We should point list to n, as I'm seeing in the chat. 00:24:34.400 --> 00:24:37.232 So let's make sure we have list pointing to n now. 00:24:37.232 --> 00:24:37.940 So we'll do this. 00:24:37.940 --> 00:24:40.450 We'll say, list equals n, and we'll run this and we'll see, 00:24:40.450 --> 00:24:45.460 OK, list gets the value n, meaning list now points to this new first node. 00:24:45.460 --> 00:24:49.150 And now we have our own linked list of more than one node, right? 00:24:49.150 --> 00:24:50.440 We went from one to two. 00:24:50.440 --> 00:24:53.148 And now, I could even do the same thing, three, or four, or five, 00:24:53.148 --> 00:24:55.900 even 500 times, if we wanted to. 00:24:55.900 --> 00:24:58.440 So questions on this, inserting this new node here? 00:25:06.850 --> 00:25:08.990 OK. 00:25:08.990 --> 00:25:11.090 So not seeing too many here. 00:25:11.090 --> 00:25:13.200 Let's actually try to translate this bit to code. 00:25:13.200 --> 00:25:15.075 So this is all fine and good with our visual, 00:25:15.075 --> 00:25:16.770 but let's actually try it out in code. 00:25:16.770 --> 00:25:18.950 I'll go over to my Visual Studio code over here. 00:25:18.950 --> 00:25:22.790 And here, I have my terminal, a file already called list.c. 00:25:22.790 --> 00:25:25.097 So I'll code up list.c. 00:25:25.097 --> 00:25:27.180 And if we take a look at this file, what we'll see 00:25:27.180 --> 00:25:29.960 is a very basic kind of linked list file. 00:25:29.960 --> 00:25:32.000 Here at the top, I have my headers. 00:25:32.000 --> 00:25:34.785 And down below, I have my structure called a node. 00:25:34.785 --> 00:25:36.660 And this is the same as we saw before, right? 00:25:36.660 --> 00:25:39.950 We have a new structure called a node that's going to hold a phrase, 00:25:39.950 --> 00:25:41.510 and it's going to have a pointer to the next node. 00:25:41.510 --> 00:25:44.802 And we're going to call all of this node throughout our code using this typedef 00:25:44.802 --> 00:25:46.280 syntax up here. 00:25:46.280 --> 00:25:49.610 Now, down below, we have some functions to actually implement. 00:25:49.610 --> 00:25:51.920 One is called unload, that we'll see in a bit, 00:25:51.920 --> 00:25:54.170 to actually make sure we're not using too much memory, 00:25:54.170 --> 00:25:55.820 we're actually freeing it as we go. 00:25:55.820 --> 00:25:59.690 And the other down here in main is actually add items or add phrases 00:25:59.690 --> 00:26:01.400 to our linked list. 00:26:01.400 --> 00:26:03.710 And as we do, we have this function down here called 00:26:03.710 --> 00:26:06.230 visualize that I wrote myself and should actually 00:26:06.230 --> 00:26:08.480 help you see what's happening inside of our link list. 00:26:08.480 --> 00:26:10.897 Don't worry about all this kind of scary syntax down here. 00:26:10.897 --> 00:26:13.350 We'll see what happens as we add nodes to our linked list. 00:26:13.350 --> 00:26:16.450 So currently, if we wanted to add some phrases to our linked list, 00:26:16.450 --> 00:26:19.200 well, we could just kind of run this program and see what happens. 00:26:19.200 --> 00:26:20.210 Well, first, I could-- 00:26:20.210 --> 00:26:21.890 let me bring up my terminal here. 00:26:21.890 --> 00:26:27.715 And I could say, I want to make list like this. 00:26:27.715 --> 00:26:29.590 And let's go ahead and make this full screen. 00:26:29.590 --> 00:26:31.780 I'll say ./list to run it. 00:26:31.780 --> 00:26:33.270 And now, what phrase should we add? 00:26:33.270 --> 00:26:35.270 Maybe we add Hi, exclamation point. 00:26:35.270 --> 00:26:36.390 So I'll hit Enter. 00:26:36.390 --> 00:26:39.480 And it doesn't show anything, right? 00:26:39.480 --> 00:26:40.740 Nothing's here yet. 00:26:40.740 --> 00:26:41.910 What if I did Hello! 00:26:41.910 --> 00:26:43.110 I could say Hello! 00:26:43.110 --> 00:26:44.800 And nothing's there. 00:26:44.800 --> 00:26:45.300 OK. 00:26:45.300 --> 00:26:47.550 But why is nothing there yet? 00:26:47.550 --> 00:26:49.905 In the chat, why is nothing here? 00:26:53.540 --> 00:26:54.750 Yeah, we didn't add anything. 00:26:54.750 --> 00:26:58.040 So if you go back to our program here, we still 00:26:58.040 --> 00:27:01.820 have a to do, to add phrases to a new node in this list. 00:27:01.820 --> 00:27:05.210 So here, if we look back, very familiar syntax, 00:27:05.210 --> 00:27:07.190 we're creating our very first linked list. 00:27:07.190 --> 00:27:09.200 And initially, this list is null. 00:27:09.200 --> 00:27:10.170 There's nothing in it. 00:27:10.170 --> 00:27:12.140 So that's why, when we ran this program before, 00:27:12.140 --> 00:27:14.990 we didn't see anything inside of our linked list. 00:27:14.990 --> 00:27:19.310 But now, what we're going to do is iterate list size times 00:27:19.310 --> 00:27:23.450 to ask the user for some phrase and add it to our linked list 00:27:23.450 --> 00:27:25.760 and visualize that addition. 00:27:25.760 --> 00:27:27.560 So how big is list size? 00:27:27.560 --> 00:27:28.940 Well, list size is 2. 00:27:28.940 --> 00:27:30.530 We'll iterate two times. 00:27:30.530 --> 00:27:34.110 And each time, we'll go ahead and add some new phrase, right? 00:27:34.110 --> 00:27:37.340 So let's think first-- how are we going to make this new node? 00:27:37.340 --> 00:27:40.020 We first probably want to actually create some space for it. 00:27:40.020 --> 00:27:43.300 So how could we do that here? 00:27:43.300 --> 00:27:45.460 We're trying to add a new node now. 00:27:45.460 --> 00:27:47.230 So we're going to need malloc, OK. 00:27:47.230 --> 00:27:49.120 So we're going to need malloc in some form. 00:27:49.120 --> 00:27:52.510 And how much space should we ask for, do we know? 00:27:56.000 --> 00:27:58.750 So we want to create some space for a node. 00:27:58.750 --> 00:28:00.685 And maybe a node is-- 00:28:00.685 --> 00:28:03.130 I don't know-- I could guess and say it's five bytes. 00:28:03.130 --> 00:28:04.780 But is that right? 00:28:04.780 --> 00:28:05.495 Or maybe it's 10. 00:28:05.495 --> 00:28:06.370 I mean, I don't know. 00:28:06.370 --> 00:28:07.700 It could be anything. 00:28:07.700 --> 00:28:10.360 We don't really quite know, and so that's 00:28:10.360 --> 00:28:13.390 why we want to use this function called sizeof. 00:28:13.390 --> 00:28:16.690 And we can pass sizeof a certain type, like, node, right? 00:28:16.690 --> 00:28:19.530 I don't know how big a node is, but sizeof does. 00:28:19.530 --> 00:28:25.162 It'll tell malloc how much memory to create to have this node in place. 00:28:25.162 --> 00:28:27.370 But now that I've asked for memory, I should probably 00:28:27.370 --> 00:28:29.650 keep track where that memory is. 00:28:29.650 --> 00:28:34.360 So I need some variable to store that location 00:28:34.360 --> 00:28:37.260 memory, which might be called what? 00:28:37.260 --> 00:28:41.020 What could we store here? 00:28:41.020 --> 00:28:44.470 Maybe we could have a node pointer. 00:28:44.470 --> 00:28:45.490 Let's call it n. 00:28:45.490 --> 00:28:49.840 So we could say, OK, this is going to point to some variable-- 00:28:49.840 --> 00:28:52.630 point to some new piece of memory called n. 00:28:52.630 --> 00:28:55.360 And we're going to make sure it points to a node, right? 00:28:55.360 --> 00:28:59.060 This is going to be some new chunk of memory where n points to. 00:28:59.060 --> 00:28:59.560 All right. 00:28:59.560 --> 00:29:02.780 So we do that kind of first bit, and now we have some space for our node. 00:29:02.780 --> 00:29:04.480 But what do we need now? 00:29:04.480 --> 00:29:07.478 We need to make sure that we add in a phrase. 00:29:07.478 --> 00:29:08.395 So how can we do that? 00:29:12.323 --> 00:29:14.740 We would probably use our arrow syntax like we saw before. 00:29:14.740 --> 00:29:17.700 So we could say, n arrow phrase. 00:29:17.700 --> 00:29:20.370 And let's store the phrase that we have right now inside. 00:29:20.370 --> 00:29:23.500 And similarly for the next portion, we could do this. 00:29:23.500 --> 00:29:26.590 We could say n arrow next gets the value. 00:29:26.590 --> 00:29:33.330 But what should it get, if this is our maybe first node? 00:29:33.330 --> 00:29:36.760 I'm seeing a few different ideas here. 00:29:36.760 --> 00:29:38.520 So I'm seeing one for null. 00:29:38.520 --> 00:29:39.190 Let's try that. 00:29:39.190 --> 00:29:41.830 So we'll say, OK, n arrow next is null. 00:29:41.830 --> 00:29:42.810 It's our first node. 00:29:42.810 --> 00:29:45.330 And now what we want to do is make sure that list 00:29:45.330 --> 00:29:48.910 points to that very first node we just made up above. 00:29:48.910 --> 00:29:49.890 So let's try it. 00:29:49.890 --> 00:29:51.275 Let's go ahead and compile this. 00:29:51.275 --> 00:29:52.650 We'll open up our terminal again. 00:29:52.650 --> 00:29:55.160 We'll say, make list and-- 00:29:55.160 --> 00:29:59.590 oops-- we forgot our semicolon, so we'll do that. 00:29:59.590 --> 00:30:03.080 And we'll go back here and we'll say, make list. 00:30:03.080 --> 00:30:05.030 And then we'll do ./list. 00:30:05.030 --> 00:30:07.960 And I'll say, OK, my first phrase is Hi! 00:30:07.960 --> 00:30:11.620 And that seems all right, like, I have this new node at this location. 00:30:11.620 --> 00:30:13.180 It's phrase is Hi! 00:30:13.180 --> 00:30:14.900 And its next pointer is null. 00:30:14.900 --> 00:30:16.120 That seems OK. 00:30:16.120 --> 00:30:17.060 What if I did this? 00:30:17.060 --> 00:30:18.460 What if I said Hello! 00:30:18.460 --> 00:30:20.650 now. 00:30:20.650 --> 00:30:26.470 And I'm seeming to actually be missing my very first node. 00:30:26.470 --> 00:30:27.730 It kind of went away. 00:30:27.730 --> 00:30:30.340 What happened here? 00:30:30.340 --> 00:30:31.930 Any ideas in the chat? 00:30:37.490 --> 00:30:37.990 Yeah. 00:30:37.990 --> 00:30:40.390 So I'm seeing we didn't quite add this node. 00:30:40.390 --> 00:30:43.840 Because if we take a look at our code, we're 00:30:43.840 --> 00:30:47.380 always setting the next pointer for our new node to be null, 00:30:47.380 --> 00:30:48.440 and that means nothing. 00:30:48.440 --> 00:30:50.410 So whenever we add a new node, well, we're just 00:30:50.410 --> 00:30:52.210 going to make sure that the next pointer is null, 00:30:52.210 --> 00:30:53.110 like, it doesn't point to anything? 00:30:53.110 --> 00:30:54.068 That isn't quite right. 00:30:54.068 --> 00:30:58.570 So let's make sure we actually update this to point to the head of our list. 00:30:58.570 --> 00:31:02.840 Our first node should point to whatever is the first element in our list. 00:31:02.840 --> 00:31:06.850 So let's say this, n arrow next gets the value list. 00:31:06.850 --> 00:31:09.130 And how will this work at the beginning? 00:31:09.130 --> 00:31:12.490 So notice how list is null at the beginning. 00:31:12.490 --> 00:31:14.770 Then we'll go through and add our first node. 00:31:14.770 --> 00:31:17.290 We'll say, OK, let's get a phrase, and let's 00:31:17.290 --> 00:31:20.320 make sure we have some space for this new node called n. 00:31:20.320 --> 00:31:23.860 Let's make sure this n has the phrase that we've been typing in. 00:31:23.860 --> 00:31:28.060 And now, n arrow next is equal to list which, in fact, is null, right? 00:31:28.060 --> 00:31:30.880 So our very first node will still have null 00:31:30.880 --> 00:31:34.480 as its sort of beginning next pointer. 00:31:34.480 --> 00:31:36.640 And then we'll go ahead and make sure list 00:31:36.640 --> 00:31:40.420 is updated to have the very first node at the head. 00:31:40.420 --> 00:31:43.060 But then if we do it again, if we go through again, 00:31:43.060 --> 00:31:46.120 we'll say that, OK, this list pointer's next is actually pointing 00:31:46.120 --> 00:31:48.163 to the previous node we just created. 00:31:48.163 --> 00:31:50.830 So now I can actually have the string together as we go through. 00:31:50.830 --> 00:31:52.038 And let's visualize this now. 00:31:52.038 --> 00:31:53.600 So let's go back to our terminal. 00:31:53.600 --> 00:31:58.110 Let's say, make list, and let's do ./list. 00:31:58.110 --> 00:31:59.580 And we could say Hi! 00:31:59.580 --> 00:32:01.290 And then we could say Hello! 00:32:01.290 --> 00:32:05.730 And now we see our list being built. So notice how our very first node here 00:32:05.730 --> 00:32:06.990 is at this location. 00:32:06.990 --> 00:32:08.520 It has the phrase Hello! 00:32:08.520 --> 00:32:11.860 And the next pointer is this, it's pointing to some new location here. 00:32:11.860 --> 00:32:15.990 But notice how this is now in our next node, 56f0. 00:32:15.990 --> 00:32:17.580 And the phrase here is Hi! 00:32:17.580 --> 00:32:21.170 And the next pointer is null here. 00:32:21.170 --> 00:32:22.590 OK. 00:32:22.590 --> 00:32:23.840 So that seemed to work for us. 00:32:23.840 --> 00:32:25.490 Let's go back to our code. 00:32:25.490 --> 00:32:27.785 What questions are there on this? 00:32:35.113 --> 00:32:35.780 A good question. 00:32:35.780 --> 00:32:37.660 So is this some kind of recursion? 00:32:37.660 --> 00:32:40.900 This isn't quite recursion because, again, recursion 00:32:40.900 --> 00:32:43.640 involves having a function call itself. 00:32:43.640 --> 00:32:45.470 And here, we're really inside of a loop. 00:32:45.470 --> 00:32:47.230 So it isn't quite recursion. 00:32:47.230 --> 00:32:51.160 But you're onto an idea, which is that every node that we add 00:32:51.160 --> 00:32:52.865 is going through that same process. 00:32:52.865 --> 00:32:55.240 So this really could potentially be a recursive function, 00:32:55.240 --> 00:32:57.050 if you wanted it to be. 00:32:57.050 --> 00:33:00.010 But here, you're just noticing that the same thing we're doing for one 00:33:00.010 --> 00:33:02.710 node, we're doing for another, as we add it, as we go through. 00:33:04.977 --> 00:33:06.310 And that's a good question, too. 00:33:06.310 --> 00:33:09.010 Is the last node we add becoming the first? 00:33:09.010 --> 00:33:10.100 That is also true. 00:33:10.100 --> 00:33:11.980 So here if we were to say, Hi! 00:33:11.980 --> 00:33:12.910 And then Hello! 00:33:12.910 --> 00:33:15.490 Well, Hello! becomes our new first node, right? 00:33:15.490 --> 00:33:18.070 That's because we're trying to insert new nodes at the front 00:33:18.070 --> 00:33:21.610 where the first node we add becomes the first-- 00:33:21.610 --> 00:33:25.510 sorry-- the last node we add becomes the first node in our list, ultimately. 00:33:28.370 --> 00:33:30.340 OK. 00:33:30.340 --> 00:33:33.010 So there is one final thing to do here, which 00:33:33.010 --> 00:33:35.320 is to make sure that if we're asking for memory, 00:33:35.320 --> 00:33:38.630 we do want to make sure that this memory is actually given to us. 00:33:38.630 --> 00:33:41.680 And before we go ahead and use this pointer called n, 00:33:41.680 --> 00:33:44.540 we do want to make sure that we're not getting back some null value. 00:33:44.540 --> 00:33:47.500 So if n is null, what we want to do is simply 00:33:47.500 --> 00:33:51.820 print something like couldn't allocate memory for node, 00:33:51.820 --> 00:33:54.670 or whatever you want to type here, some descriptive error. 00:33:54.670 --> 00:33:59.560 And then we could return 1 to symbolize that something went wrong here. 00:33:59.560 --> 00:34:00.060 OK. 00:34:02.890 --> 00:34:05.020 A question then, is this a stack? 00:34:05.020 --> 00:34:09.400 Well, it has the beginnings of a stack where if you remember, 00:34:09.400 --> 00:34:12.790 a stack is this data structure where the first thing we add 00:34:12.790 --> 00:34:14.920 becomes the last thing we get out. 00:34:14.920 --> 00:34:18.982 Or otherwise, the last thing we add is the first thing we get out, 00:34:18.982 --> 00:34:19.940 which is the case here. 00:34:19.940 --> 00:34:23.170 So we're saying that the last node, the last phrase we add, 00:34:23.170 --> 00:34:26.920 is going to be the first node we see in our structure. 00:34:26.920 --> 00:34:29.710 But it also depends on how we access this linked list. 00:34:29.710 --> 00:34:32.170 If we always access it from the very first node, 00:34:32.170 --> 00:34:34.929 well, that could be kind of similar to a stack. 00:34:34.929 --> 00:34:38.230 But if we always want to access it by looking at the very last element first, 00:34:38.230 --> 00:34:40.150 well, that could be then a queue, right? 00:34:40.150 --> 00:34:42.820 So a queue in a stack kind of depends on how you actually 00:34:42.820 --> 00:34:46.075 use them, and not so much what you build kind of underneath the hood. 00:34:49.875 --> 00:34:50.375 All right. 00:34:53.760 --> 00:34:55.920 So let's keep going here. 00:34:55.920 --> 00:34:58.880 And I would say that this code is mostly correct. 00:34:58.880 --> 00:35:02.420 But if I go down to my terminal and I now do this, 00:35:02.420 --> 00:35:04.760 I do make list again just to compile it. 00:35:04.760 --> 00:35:07.045 And then I say, let's try to memory check this 00:35:07.045 --> 00:35:08.670 and see if there are any memory errors. 00:35:08.670 --> 00:35:11.330 Well, I could do valgrind./list. 00:35:11.330 --> 00:35:16.310 And let me type in, let's say, once it loads, I can type in Hi! 00:35:16.310 --> 00:35:17.630 Right? 00:35:17.630 --> 00:35:19.190 And now that seems OK. 00:35:19.190 --> 00:35:20.930 I can type in Hello! 00:35:20.930 --> 00:35:25.250 And my program ends, but it seems I've definitely lost some bytes. 00:35:25.250 --> 00:35:27.290 I've definitely lost 16 bytes. 00:35:27.290 --> 00:35:30.110 And I've indirectly lost 16 bytes. 00:35:30.110 --> 00:35:31.680 So what happened here? 00:35:31.680 --> 00:35:33.680 What did I fail to do in my code? 00:35:36.128 --> 00:35:37.420 Yeah, I didn't free the memory. 00:35:37.420 --> 00:35:42.570 So if I look back at my program, if I do this and I say, OK, well, unload. 00:35:42.570 --> 00:35:45.240 I trust that unload will work for me, right? 00:35:45.240 --> 00:35:47.550 But if I actually look down here, it's still a to do. 00:35:47.550 --> 00:35:50.290 I didn't actually free any nodes here. 00:35:50.290 --> 00:35:53.400 And if I look above, well, I used malloc several times. 00:35:53.400 --> 00:35:54.420 I used it twice. 00:35:54.420 --> 00:35:57.430 And I didn't call free any of those times. 00:35:57.430 --> 00:36:00.280 So every time we call malloc, we want to make sure we call free, 00:36:00.280 --> 00:36:02.613 and that's what unload is going to do for us down below. 00:36:02.613 --> 00:36:06.090 We're making our own function that takes in some pointer to a list 00:36:06.090 --> 00:36:09.120 and should go through and free every node in that list. 00:36:09.120 --> 00:36:10.995 We can give back that memory to the computer. 00:36:10.995 --> 00:36:13.370 So let's visualize, if we're actually writing it in code. 00:36:13.370 --> 00:36:15.570 If we go back to our slides here, we could 00:36:15.570 --> 00:36:19.200 think about actually trying to free this linked list. 00:36:19.200 --> 00:36:21.610 And maybe your initial thought is, let's do this. 00:36:21.610 --> 00:36:23.027 Let's actually just free the list. 00:36:23.027 --> 00:36:23.970 Let's say this. 00:36:23.970 --> 00:36:25.740 But why would this be bad? 00:36:25.740 --> 00:36:30.300 If we say free the list, what goes wrong? 00:36:30.300 --> 00:36:32.370 Here's the original picture. 00:36:32.370 --> 00:36:34.110 Here's freeing the list. 00:36:34.110 --> 00:36:35.640 What happened here? 00:36:40.010 --> 00:36:42.140 Yes, I'm saying we lost the other nodes. 00:36:42.140 --> 00:36:47.410 So notice how in this picture, if you go back, well, the very first node 00:36:47.410 --> 00:36:49.570 is still pointing to that next node. 00:36:49.570 --> 00:36:52.510 And the only reason we know where that next node is located 00:36:52.510 --> 00:36:55.690 is because it's in the next portion of our very first node. 00:36:55.690 --> 00:37:00.940 But if we free that, if we get rid of it, well, where is that second node? 00:37:00.940 --> 00:37:01.890 We don't know. 00:37:01.890 --> 00:37:03.807 So if that's going to be a problem for us, 00:37:03.807 --> 00:37:06.390 what we don't want to actually just free the head of the list, 00:37:06.390 --> 00:37:10.230 we want to make sure we're going through and freeing nodes step by step 00:37:10.230 --> 00:37:13.840 and keeping track of where the rest of our nodes are. 00:37:13.840 --> 00:37:16.735 So instead, one way we could approach this is as follows. 00:37:16.735 --> 00:37:19.110 We don't want to do this, just free the head of the list. 00:37:19.110 --> 00:37:22.110 What we could do, instead, is say, let's make a new pointer, 00:37:22.110 --> 00:37:23.527 let's add that to the equation. 00:37:23.527 --> 00:37:24.110 Let's do this. 00:37:24.110 --> 00:37:27.450 So let's say that-- let's make a new pointer, just called maybe ptr, 00:37:27.450 --> 00:37:28.570 for example. 00:37:28.570 --> 00:37:33.550 And that will point to whatever the head of our list is, the next node. 00:37:33.550 --> 00:37:37.290 So now, a pointer is pointing to that second node in our list. 00:37:37.290 --> 00:37:38.992 And now, we could safely free list. 00:37:38.992 --> 00:37:39.700 We could do this. 00:37:39.700 --> 00:37:42.480 We could say, OK, free up list, and that's gone. 00:37:42.480 --> 00:37:49.390 But what is still pointing to the rest of our list now, which variable? 00:37:49.390 --> 00:37:52.140 Maybe in the chat? 00:37:52.140 --> 00:37:52.640 Yeah. 00:37:52.640 --> 00:37:55.080 So ptr is pointing to that second node in our list. 00:37:55.080 --> 00:37:57.170 So we still have this in our memory. 00:37:57.170 --> 00:38:00.250 We know where all the rest of our nodes are. 00:38:00.250 --> 00:38:00.750 OK. 00:38:00.750 --> 00:38:01.580 So that's all good. 00:38:01.580 --> 00:38:04.855 But now, we want to actually get back to the head of our list 00:38:04.855 --> 00:38:07.230 and make sure that we're actually kind of moving forward. 00:38:07.230 --> 00:38:10.910 So what we'll do is say that list gets the value for ptr. 00:38:10.910 --> 00:38:14.030 So we're going to update what our list looks like and say that we're 00:38:14.030 --> 00:38:16.410 going to move this over here. 00:38:16.410 --> 00:38:19.550 So now, the head of our list points to whatever pointer points to. 00:38:19.550 --> 00:38:21.470 And we could, perhaps, do this in a loop. 00:38:21.470 --> 00:38:25.940 We could say, OK, in this case, well, pointer will get list arrow next again. 00:38:25.940 --> 00:38:29.950 And what would pointer be now, though? 00:38:29.950 --> 00:38:33.250 If pointer is list arrow next, what is pointer going to be? 00:38:36.490 --> 00:38:38.950 It will be the second node of the node after the first one. 00:38:38.950 --> 00:38:40.970 But what's the actual value right now? 00:38:40.970 --> 00:38:41.470 Yeah. 00:38:41.470 --> 00:38:45.460 It'll be null, or [INAUDIBLE] because notice how right now list 00:38:45.460 --> 00:38:48.220 is pointing to this very, now, the first node 00:38:48.220 --> 00:38:50.900 in our list, whose next value is null. 00:38:50.900 --> 00:38:54.670 So if pointer equals list arrow next, well, pointer will be null. 00:38:54.670 --> 00:38:56.870 OK, that seems OK for now. 00:38:56.870 --> 00:39:01.840 But if we wanted to kind of keep going, let's say we free list, all right. 00:39:01.840 --> 00:39:04.150 And then we say, well, list gets pointer. 00:39:04.150 --> 00:39:05.125 What is list now? 00:39:07.830 --> 00:39:11.705 There's nothing left, so list would also be null, right? 00:39:11.705 --> 00:39:13.830 So what we've done, we've kind of gone full circle. 00:39:13.830 --> 00:39:17.820 We've made-- we've gone from the very first node to our very last node, 00:39:17.820 --> 00:39:20.880 and we've ended when list equals null, when 00:39:20.880 --> 00:39:25.020 there's nothing left to free, back at, for example, the same place we started. 00:39:25.020 --> 00:39:28.590 If we go back to our code, we could see that the very place we began 00:39:28.590 --> 00:39:32.040 was with list equaling no, no nodes in our list. 00:39:32.040 --> 00:39:38.760 And now, if we unload them, we also want to end in that very same place, right? 00:39:38.760 --> 00:39:39.703 OK. 00:39:39.703 --> 00:39:42.870 So let's try implementing this in code now so we can get rid of those memory 00:39:42.870 --> 00:39:43.370 errors. 00:39:43.370 --> 00:39:47.070 If we go down below here, we're going to free all of our allocated nodes. 00:39:47.070 --> 00:39:49.570 And we know we want some kind of loop to do this. 00:39:49.570 --> 00:39:51.598 But let's hold off on that for now. 00:39:51.598 --> 00:39:54.390 We know, first, we want to make this new pointer to our next nodes. 00:39:54.390 --> 00:39:55.015 We can do this. 00:39:55.015 --> 00:40:00.180 Node star pointer equals whatever list is pointing to and getting 00:40:00.180 --> 00:40:02.400 that next value there. 00:40:02.400 --> 00:40:06.230 Then we could safely free list because pointers pointing the rest of our list. 00:40:06.230 --> 00:40:09.800 And, finally, we could say, list updates to 0.2, 00:40:09.800 --> 00:40:12.610 what pointer is now pointing to. 00:40:12.610 --> 00:40:15.080 But how long do we want to do this? 00:40:15.080 --> 00:40:18.170 How do we know when to stop? 00:40:18.170 --> 00:40:21.120 Maybe in the chat? 00:40:21.120 --> 00:40:25.221 This seems good for one node, but want to keep doing this. 00:40:25.221 --> 00:40:28.190 Yeah, we want to keep doing it until our list is null. 00:40:28.190 --> 00:40:30.680 And we don't really know how many times we want to iterate, 00:40:30.680 --> 00:40:32.630 so a while loop seems like a good choice. 00:40:32.630 --> 00:40:35.880 A while loop we can do something until something else is true. 00:40:35.880 --> 00:40:40.190 So we could say, while list doesn't equal null. 00:40:40.190 --> 00:40:43.340 While we still have nodes, essentially, let's 00:40:43.340 --> 00:40:47.960 go ahead and make sure that we have a pointer to the next node. 00:40:47.960 --> 00:40:50.240 Let's free the current node, and then let's 00:40:50.240 --> 00:40:53.280 update the current node to point to whatever it was previously, 00:40:53.280 --> 00:40:54.590 the next one. 00:40:54.590 --> 00:40:57.503 So we could do something like this. 00:40:57.503 --> 00:41:00.170 And now, if we run our code, if we go back to our terminal here, 00:41:00.170 --> 00:41:02.550 we'll make list. 00:41:02.550 --> 00:41:06.040 We'll run valgrind on ./list. 00:41:06.040 --> 00:41:12.990 And what do we get if we type in, let's say, Hi! here, and then Hello? 00:41:12.990 --> 00:41:15.930 We see, successfully, all heap blocks were freed. 00:41:15.930 --> 00:41:19.460 No leaks are possible down below here. 00:41:19.460 --> 00:41:23.880 So what questions are there on this, if you look at our code 00:41:23.880 --> 00:41:25.290 for unloading here? 00:41:30.420 --> 00:41:32.640 Yeah, a good question, is nil the same as no? 00:41:32.640 --> 00:41:36.210 So if we look at our terminal and we actually run this again, 00:41:36.210 --> 00:41:40.800 call it a make list, I'll run ./list, and I'll say Hi! 00:41:40.800 --> 00:41:44.280 And notice how this very first node has the next pointer of nil. 00:41:44.280 --> 00:41:46.690 Nil is simply the terminal representation of null. 00:41:46.690 --> 00:41:48.795 So nil means null in the terminal. 00:41:51.640 --> 00:41:53.260 All right. 00:41:53.260 --> 00:41:57.140 And now, I'm seeing another question, is this a recursive function? 00:41:57.140 --> 00:42:00.370 So this still is probably not a recursive function, 00:42:00.370 --> 00:42:04.720 because a recursive function would kind of call itself like this function might 00:42:04.720 --> 00:42:07.520 then call unload somewhere in here. 00:42:07.520 --> 00:42:10.135 So it's not quite recursive, but you are correct in noticing 00:42:10.135 --> 00:42:12.010 that this is a good application for recursion 00:42:12.010 --> 00:42:16.220 because we're doing the very same thing for every node as we go. 00:42:16.220 --> 00:42:18.670 So you could certainly write this recursively. 00:42:18.670 --> 00:42:21.430 Here, we're going to iteratively using a loop. 00:42:25.500 --> 00:42:26.520 OK. 00:42:26.520 --> 00:42:27.630 Other questions on this? 00:42:35.430 --> 00:42:35.930 OK. 00:42:35.930 --> 00:42:38.550 So let's keep going on. 00:42:38.550 --> 00:42:39.950 And we've seen linked lists now. 00:42:39.950 --> 00:42:41.810 We've seen these really two core operations 00:42:41.810 --> 00:42:44.960 for linked lists, like, how do we add some data to them, 00:42:44.960 --> 00:42:47.360 and how do we unload them? 00:42:47.360 --> 00:42:50.318 What you might ask, though, is how can we speed this up? 00:42:50.318 --> 00:42:51.860 And that's where hash tables come in. 00:42:51.860 --> 00:42:55.037 So hash tables are simply an array of linked lists. 00:42:55.037 --> 00:42:57.370 We take linked lists, we break them into smaller pieces. 00:42:57.370 --> 00:42:59.120 And so let's get a visual for that and see 00:42:59.120 --> 00:43:00.990 if we can make this run even faster. 00:43:00.990 --> 00:43:06.260 So if we go back to our slides here and we see this idea of a linked list, 00:43:06.260 --> 00:43:07.370 we have maybe, Hey! 00:43:07.370 --> 00:43:08.060 Hello! 00:43:08.060 --> 00:43:08.840 Lo there! 00:43:08.840 --> 00:43:12.560 Different kind of activation words for our voice assistant. 00:43:12.560 --> 00:43:15.890 Well, what we could do is try to make sure we can access 00:43:15.890 --> 00:43:18.470 certain parts of this linked list faster to make sure 00:43:18.470 --> 00:43:20.870 we can actually search this linked list faster. 00:43:20.870 --> 00:43:22.580 And we could for that use a hash table. 00:43:22.580 --> 00:43:25.890 So recall from earlier that a hash table looks a bit like this. 00:43:25.890 --> 00:43:28.310 We have some array on the left-hand side. 00:43:28.310 --> 00:43:31.370 And we have different, what we call, buckets in here, 00:43:31.370 --> 00:43:35.180 where a bucket might, in this case, be a letter of the alphabet. 00:43:35.180 --> 00:43:38.750 And that letter of the alphabet is maybe the first character in a word. 00:43:38.750 --> 00:43:43.850 You could have other buckets, other ways of assigning these phrases to buckets. 00:43:43.850 --> 00:43:46.400 But for now, let's focus on this alphabetical hash 00:43:46.400 --> 00:43:48.740 function here and alphabetical hash table. 00:43:48.740 --> 00:43:52.640 So with this, we have different letters for every hash bucket. 00:43:52.640 --> 00:44:00.560 And how does C know where in the array H, I, J, K and L are? 00:44:00.560 --> 00:44:03.090 If we wanted to add some nodes to a certain bucket, well, 00:44:03.090 --> 00:44:05.380 how do we figure out where that bucket is? 00:44:05.380 --> 00:44:07.540 Anyone in the chat want to chime in on that? 00:44:11.670 --> 00:44:12.170 Yes. 00:44:12.170 --> 00:44:13.400 So it seems to be that they're ordered. 00:44:13.400 --> 00:44:15.483 They're certainly in the right alphabetical order. 00:44:15.483 --> 00:44:17.470 That's helpful. 00:44:17.470 --> 00:44:21.790 Other things, too, for how we know where H is located, or L is located? 00:44:24.950 --> 00:44:28.350 Yes, we could use maybe ASCII numbers and so on. 00:44:28.350 --> 00:44:30.560 And we're on the right track here. 00:44:30.560 --> 00:44:34.160 But because if you look at the left-hand side, this is just an array, 00:44:34.160 --> 00:44:38.420 and we've kind of ourselves as humans assigned those array buckets, 00:44:38.420 --> 00:44:41.030 those array segments, some letter. 00:44:41.030 --> 00:44:44.000 But we have to assign them some number, really, some index. 00:44:44.000 --> 00:44:48.930 And so you might recall from very early on, all arrays have these indices, 00:44:48.930 --> 00:44:49.430 right? 00:44:49.430 --> 00:44:52.340 So maybe zero to however big your array is. 00:44:52.340 --> 00:44:56.540 And now we have, in this case, maybe H corresponds to the 7 index 00:44:56.540 --> 00:45:00.410 in our table, or I corresponds to the 8 index, and so on. 00:45:00.410 --> 00:45:04.550 And so in those indexes, we can actually add in our linked list. 00:45:04.550 --> 00:45:09.110 And so here, if we notice that H corresponds to 7, 00:45:09.110 --> 00:45:12.500 well, H is the eighth letter of the alphabet. 00:45:12.500 --> 00:45:15.020 But if we were counting from zero, it would be the seventh. 00:45:15.020 --> 00:45:19.100 So in this case, H corresponds to zero as that seventh letter of the alphabet 00:45:19.100 --> 00:45:22.010 counting from zero here. 00:45:22.010 --> 00:45:26.340 That's how we know how we actually add in these linked list nodes. 00:45:26.340 --> 00:45:31.060 Now, if I were to give you some new phrase, not Hey! or Hello! or Lo there! 00:45:31.060 --> 00:45:35.290 How would you figure out where to put it in this case? 00:45:35.290 --> 00:45:37.920 Maybe in the chat? 00:45:37.920 --> 00:45:42.500 If I gave you some new phrase, what could you 00:45:42.500 --> 00:45:44.300 do to figure out where to put it? 00:45:49.560 --> 00:45:50.060 Yeah. 00:45:50.060 --> 00:45:51.730 So I'm seeing maybe at the start. 00:45:51.730 --> 00:45:54.140 I don't really know until we see the phrase, right? 00:45:54.140 --> 00:45:57.450 And that's where this idea of a hash function comes in. 00:45:57.450 --> 00:45:59.750 So in order to figure out where Hey! and Hello! 00:45:59.750 --> 00:46:04.670 and Lo there! go, we need to have some kind of function that takes our phrase 00:46:04.670 --> 00:46:07.320 and gives us back some number that tells us where to put it. 00:46:07.320 --> 00:46:09.200 So here we have, maybe, the phrase Hey! 00:46:09.200 --> 00:46:11.150 And our hash function should ideally give us 00:46:11.150 --> 00:46:14.910 some number that tells us where to put that given phrase. 00:46:14.910 --> 00:46:17.960 So if I gave you not Hey! or Hi! or Hello there, 00:46:17.960 --> 00:46:20.360 maybe I gave you the name Carter. 00:46:20.360 --> 00:46:23.630 Well, that should give us some new number that then corresponds 00:46:23.630 --> 00:46:27.140 to an index in our array, or we could start this new linked 00:46:27.140 --> 00:46:30.420 list that might have all the C words, for example. 00:46:30.420 --> 00:46:33.140 So let's actually take a look at this kind of in code now. 00:46:33.140 --> 00:46:38.990 If we go back to our Visual Studio code over here, I'll open up this new file, 00:46:38.990 --> 00:46:40.250 this one called table. 00:46:40.250 --> 00:46:42.590 So I'll code table.c. 00:46:42.590 --> 00:46:44.310 And it's already a bit done for me. 00:46:44.310 --> 00:46:46.680 But notice how we have a few different things. 00:46:46.680 --> 00:46:50.540 We have the same node structure from before for making our linked lists. 00:46:50.540 --> 00:46:54.498 And now, we have this new array of pointers to node. 00:46:54.498 --> 00:46:55.790 And we're calling this a table. 00:46:55.790 --> 00:46:59.570 And notice how we have 26 buckets, 26 places 00:46:59.570 --> 00:47:01.370 in which to have a linked list now. 00:47:01.370 --> 00:47:04.070 And we can figure out where these new phrases 00:47:04.070 --> 00:47:09.330 should go by running this function called hash that we'll see down below. 00:47:09.330 --> 00:47:10.650 So let's scroll down. 00:47:10.650 --> 00:47:13.100 We see that, here, we have this function called hash. 00:47:13.100 --> 00:47:15.350 And hash is right now just returning zero. 00:47:15.350 --> 00:47:18.290 Because if I give it a new phrase, I haven't really finished this yet. 00:47:18.290 --> 00:47:20.660 I just want to return zero for every phrase I get. 00:47:20.660 --> 00:47:24.590 And ideally, if I run this program, I should type in a phrase 00:47:24.590 --> 00:47:29.330 and see the index that I should place this phrase inside of my hash table, 00:47:29.330 --> 00:47:33.920 and then print out, OK, this phrase hash is to whatever number I have. 00:47:33.920 --> 00:47:35.010 So let's see an example. 00:47:35.010 --> 00:47:38.820 So if I go to my terminal now, I could make this full screen 00:47:38.820 --> 00:47:42.120 and I could simply type, make a table. 00:47:42.120 --> 00:47:46.200 And now, I'll run ./table, and I'll type in Hi! 00:47:46.200 --> 00:47:48.330 And I get zero, which is maybe OK. 00:47:48.330 --> 00:47:50.400 But let's try now-- 00:47:50.400 --> 00:47:54.830 let's try Lo there! as we saw before, and we get zero. 00:47:54.830 --> 00:47:57.200 So why would this be bad? 00:47:57.200 --> 00:48:01.700 Besides it being unfinished, why would we not want Hi! and Lo there! 00:48:01.700 --> 00:48:03.380 to kind of hash to that same value? 00:48:07.830 --> 00:48:09.090 Any ideas in the chat? 00:48:16.180 --> 00:48:16.680 Yeah. 00:48:16.680 --> 00:48:20.683 So I'm seeing that this is kind of similar to a linked list, 00:48:20.683 --> 00:48:21.600 if we were to do this. 00:48:21.600 --> 00:48:23.220 Like, if we have-- 00:48:23.220 --> 00:48:24.840 back to our visual here. 00:48:24.840 --> 00:48:30.527 If we have some array of linked lists, but every word hashes to zero, 00:48:30.527 --> 00:48:32.985 well, we have all this other space to put our linked lists, 00:48:32.985 --> 00:48:35.935 but we're just putting everything in the zero bucket, which it makes-- 00:48:35.935 --> 00:48:39.060 turns us back into this picture, which is just a single linked list, right? 00:48:39.060 --> 00:48:42.098 The goal of this hash function is to have many shorter linked lists, 00:48:42.098 --> 00:48:43.890 and that's the goal of trying to put things 00:48:43.890 --> 00:48:46.210 in different buckets, different indexes here. 00:48:46.210 --> 00:48:49.650 So ideally, we don't want to have everything hash to the same value, 00:48:49.650 --> 00:48:52.620 we want to have different hash values for different phrases 00:48:52.620 --> 00:48:54.630 that we might input. 00:48:54.630 --> 00:48:57.390 So one way to do this, as we're seeing right now, 00:48:57.390 --> 00:49:03.120 is trying to make sure we have different linked lists that depend on the very 00:49:03.120 --> 00:49:05.165 first letter of the phrase we add. 00:49:05.165 --> 00:49:08.290 And let's go back to our hash function and figure out how we could do that. 00:49:08.290 --> 00:49:10.570 So let's exit our program here. 00:49:10.570 --> 00:49:12.310 And let's go ahead and go back here. 00:49:12.310 --> 00:49:18.780 And let's try to implement hash so that we get a value 0-25 for a given phrase. 00:49:18.780 --> 00:49:19.780 So here, let's try this. 00:49:19.780 --> 00:49:24.030 Let's actually try returning not just zero, 00:49:24.030 --> 00:49:29.010 but let's say, depending on the very first letter of our phrase, 00:49:29.010 --> 00:49:31.790 we return that and see how that goes. 00:49:31.790 --> 00:49:34.720 So here we have returning phrase, bracket zero. 00:49:34.720 --> 00:49:36.310 And we'll go back to our terminal now. 00:49:36.310 --> 00:49:44.680 We'll say make table, and then ./table, and let's try Hello! 00:49:44.680 --> 00:49:46.540 And we see 72. 00:49:46.540 --> 00:49:48.430 Is that-- I mean, it's better. 00:49:48.430 --> 00:49:49.970 Right now, I have actual number. 00:49:49.970 --> 00:49:54.510 And if I type in Carter, well, that's a different number, which is good. 00:49:54.510 --> 00:49:57.460 And Lo there! perhaps, is also a different number. 00:49:57.460 --> 00:49:59.610 So we're off to a good start. 00:49:59.610 --> 00:50:01.710 But what's wrong with this? 00:50:01.710 --> 00:50:06.240 If I get 72, and 67, and 76, I go back to my code 00:50:06.240 --> 00:50:09.570 and I look at my table, what's wrong? 00:50:13.800 --> 00:50:14.300 Yeah. 00:50:14.300 --> 00:50:16.500 So I'm seeing I only have 26 buckets. 00:50:16.500 --> 00:50:21.080 So if I wanted to, let's say, get a phrase and put it in my bucket 00:50:21.080 --> 00:50:25.160 somewhere, well, I can't really do that because maybe Hello! hashes to, 00:50:25.160 --> 00:50:26.540 what was it if we look at it? 00:50:26.540 --> 00:50:27.650 72. 00:50:27.650 --> 00:50:30.840 Well, that's not a valid index in my table. 00:50:30.840 --> 00:50:33.830 So I need to actually make this shorter a little bit. 00:50:33.830 --> 00:50:37.520 And one way to do that is maybe just subtract A from it. 00:50:37.520 --> 00:50:38.970 So I could do something like this. 00:50:38.970 --> 00:50:41.540 And what this will do is, maybe if you recall 00:50:41.540 --> 00:50:47.290 how A maps to 65, well, if I get A as the very first character, well, 00:50:47.290 --> 00:50:49.850 what is 65? 00:50:49.850 --> 00:50:52.100 A minus A is zero. 00:50:52.100 --> 00:50:54.770 If I get B, B is 66 in ASCII. 00:50:54.770 --> 00:50:57.560 So B minus A is 1. 00:50:57.560 --> 00:51:02.000 And if I get all the way down to Z, well, Z minus a is going to be 25. 00:51:02.000 --> 00:51:04.740 And if I wrap back around to A, well, that's just zero again. 00:51:04.740 --> 00:51:07.580 So what I've done now is I've made sure that every index I get 00:51:07.580 --> 00:51:11.940 is going to be inside the kind of length of my table here. 00:51:11.940 --> 00:51:15.500 I won't get any indexes that are out of bounds for this list. 00:51:15.500 --> 00:51:21.330 And now, if I kind of go back here, we run this code, I'll do make table. 00:51:21.330 --> 00:51:26.580 I'll run ./table, and I'll do Hello! now, and I get 7, which is better. 00:51:26.580 --> 00:51:28.470 Now, I'll do maybe Lo there! 00:51:28.470 --> 00:51:29.490 and I get 11. 00:51:29.490 --> 00:51:33.670 And now, I'll type in just Z to see how that works, and I get 25. 00:51:33.670 --> 00:51:35.550 So that seems to be pretty good for myself. 00:51:35.550 --> 00:51:39.520 But what still am I missing here? 00:51:39.520 --> 00:51:42.340 What could go wrong? 00:51:42.340 --> 00:51:43.060 Any ideas? 00:51:49.220 --> 00:51:49.720 Yes. 00:51:49.720 --> 00:51:52.240 So lowercased letters-- right now, I'm working 00:51:52.240 --> 00:51:53.830 with phrases that are all capital. 00:51:53.830 --> 00:51:56.538 But let's say someone wants to add a phrase that is in lowercase. 00:51:56.538 --> 00:52:00.230 So I could actually open a terminal again, make table. 00:52:00.230 --> 00:52:04.740 I'll do ./table, and I'll do maybe hello! in lowercase. 00:52:04.740 --> 00:52:07.920 And I get 39, which isn't quite what I want. 00:52:07.920 --> 00:52:09.800 So what could I do instead? 00:52:09.800 --> 00:52:13.538 Maybe any ideas in the chat? 00:52:13.538 --> 00:52:14.455 What can I do instead? 00:52:19.360 --> 00:52:21.920 I could, perhaps, maybe standardize on capital letters. 00:52:21.920 --> 00:52:25.930 So I could say, I want to make whatever this first letter is uppercase, 00:52:25.930 --> 00:52:29.890 and then subtract that capital A. So if I go back to my terminal, I could say, 00:52:29.890 --> 00:52:32.520 make table. 00:52:32.520 --> 00:52:34.050 I could say ./table. 00:52:34.050 --> 00:52:38.310 And now I want for it to be the case that lowercase hello and capital 00:52:38.310 --> 00:52:40.060 Hello! hash that same value. 00:52:40.060 --> 00:52:45.430 So I'll type hello here, and capital H Hello! here, and I get 7 in both cases. 00:52:45.430 --> 00:52:48.550 So that seems to work for me now. 00:52:48.550 --> 00:52:51.270 All right. 00:52:51.270 --> 00:52:56.240 Any other-- if we go back to our terminal here, 00:52:56.240 --> 00:52:59.540 that will tell us where to put individual phrases. 00:52:59.540 --> 00:53:04.190 So questions on this hash function so far? 00:53:12.760 --> 00:53:13.260 All right. 00:53:15.990 --> 00:53:18.520 So not seeing too many questions here. 00:53:18.520 --> 00:53:19.820 Why don't we do this? 00:53:19.820 --> 00:53:24.490 So let's go ahead and think about what makes for a good hash function? 00:53:24.490 --> 00:53:29.490 So if we think about this as our very first hash function, 00:53:29.490 --> 00:53:33.418 this idea of putting nodes into individual buckets, 00:53:33.418 --> 00:53:35.710 well, it's not going to be quite what we want it to be. 00:53:35.710 --> 00:53:37.627 And if we actually go through down here and we 00:53:37.627 --> 00:53:39.420 think about how could we try to make sure 00:53:39.420 --> 00:53:42.180 we actually better hash function than this, what we could do 00:53:42.180 --> 00:53:45.640 is actually make sure we have a good distribution of values across it. 00:53:45.640 --> 00:53:48.510 So if you think about our buckets here, we 00:53:48.510 --> 00:53:51.930 want to try to have-- to make sure that every node in our linked list 00:53:51.930 --> 00:53:54.510 is kind of going to go in every bucket here. 00:53:54.510 --> 00:53:58.050 Like, we actually have a lot of I words, a lot of J words, a lot of K words. 00:53:58.050 --> 00:54:01.110 And here, the way we hash this isn't going to quite do that for us. 00:54:01.110 --> 00:54:04.350 We're not going to actually have any words that start with I, or J, or K, 00:54:04.350 --> 00:54:09.130 so we probably think of a new hash function to use in this case. 00:54:09.130 --> 00:54:13.183 And if we go back over here, we can think of some good points 00:54:13.183 --> 00:54:14.100 for our hash function. 00:54:14.100 --> 00:54:18.000 We can think of, OK, we always try to get the same value for a given input. 00:54:18.000 --> 00:54:20.880 We always want to produce an even distribution across the buckets, 00:54:20.880 --> 00:54:23.970 and we always want to make sure that we use all the buckets that we're 00:54:23.970 --> 00:54:26.300 given in this case. 00:54:26.300 --> 00:54:28.488 So these are some ideas for a good hash function. 00:54:28.488 --> 00:54:30.280 Here, what we'll actually do is let you all 00:54:30.280 --> 00:54:34.927 go off and write your own hash functions to go off and build your own overall. 00:54:34.927 --> 00:54:36.760 So if we think of some other ideas here, you 00:54:36.760 --> 00:54:40.150 could use the length of the phrases, you could make sure you maybe 00:54:40.150 --> 00:54:42.170 add up their values, and so on. 00:54:42.170 --> 00:54:44.830 But what we could do is go ahead and have you all 00:54:44.830 --> 00:54:47.920 do that for the rest of this week now. 00:54:47.920 --> 00:54:51.940 And I think that will probably conclude us for this section. 00:54:51.940 --> 00:54:53.690 It is so good to see you all here today. 00:54:53.690 --> 00:54:54.898 Thank you all for joining us. 00:54:54.898 --> 00:54:57.660 We will see you next week.