WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:00.000 --> 00:00:02.832 >> [MUSIC PLAYING] 00:00:05.670 --> 00:00:08.560 >> DOUG LLOYD: OK, so at this point in the course, 00:00:08.560 --> 00:00:15.300 we've covered a lot of the basics of C. We know a lot about variables, arrays, 00:00:15.300 --> 00:00:17.610 pointers, all that good stuff. 00:00:17.610 --> 00:00:21.610 Those are all sort of built in to see as the fundamentals, 00:00:21.610 --> 00:00:23.880 but we can do more, right? 00:00:23.880 --> 00:00:27.930 We can combine things together in interesting ways. 00:00:27.930 --> 00:00:31.010 >> And so let's do that, let's start to branch out of what C gives us, 00:00:31.010 --> 00:00:35.270 and start to create our own data structures using these building 00:00:35.270 --> 00:00:40.590 blocks together to do something really valuable, useful. 00:00:40.590 --> 00:00:43.420 One way we can do this is to talk about collections. 00:00:43.420 --> 00:00:48.360 So so far we've had one kind of data structure for representing collections 00:00:48.360 --> 00:00:51.030 of like values, similar values. 00:00:51.030 --> 00:00:52.350 That would be an array. 00:00:52.350 --> 00:00:57.020 We have collections of integers, or collections of characters and so on. 00:00:57.020 --> 00:01:00.890 >> Structures are also sort of a data structure for collecting information, 00:01:00.890 --> 00:01:03.220 but it's not for collecting like values. 00:01:03.220 --> 00:01:08.090 It usually mixes different data types together inside of a single box. 00:01:08.090 --> 00:01:10.750 But it's not itself used to chain together 00:01:10.750 --> 00:01:16.920 or connect together similar items, like an array. 00:01:16.920 --> 00:01:20.960 Arrays are great for element look up, but recall 00:01:20.960 --> 00:01:24.262 that it's very difficult to insert into an array, 00:01:24.262 --> 00:01:26.470 unless we're inserting at the very end of that array. 00:01:26.470 --> 00:01:29.730 >> And the best example I have for that is insertion sort. 00:01:29.730 --> 00:01:31.650 If you recall our video on insertion sort, 00:01:31.650 --> 00:01:34.110 there was a lot of expense involved in having 00:01:34.110 --> 00:01:37.970 to pick up elements, and shift them out of the way to fit something 00:01:37.970 --> 00:01:41.290 into the middle of your array. 00:01:41.290 --> 00:01:44.690 Arrays also suffer from another problem, which is inflexibility. 00:01:44.690 --> 00:01:47.150 When we declare an array, we get one shot at it. 00:01:47.150 --> 00:01:49.790 We get to say, I want this many elements. 00:01:49.790 --> 00:01:51.940 Might be 100, it might be 1,000, it might 00:01:51.940 --> 00:01:55.930 be x where x is a number that the user gave us at a prompt or at the command 00:01:55.930 --> 00:01:56.630 line. 00:01:56.630 --> 00:01:59.905 >> But we only get one shot at it, we don't get to then say oh, actually I 00:01:59.905 --> 00:02:04.360 needed 101, or I needed x plus 20. 00:02:04.360 --> 00:02:07.910 Too late, we've already declared the array, and if we want to get 101 or x 00:02:07.910 --> 00:02:12.050 plus 20, we have to declare an entirely different array, 00:02:12.050 --> 00:02:15.540 copy all the elements of the array over, and then we have enough. 00:02:15.540 --> 00:02:19.880 And what if we are wrong again, what if we actually need 102, or x plus 40, 00:02:19.880 --> 00:02:21.970 we have to do this again. 00:02:21.970 --> 00:02:26.250 So they're very inflexible for resizing our data, 00:02:26.250 --> 00:02:29.360 but if we combine together some of the basics that we've already 00:02:29.360 --> 00:02:33.230 learned about pointers and structures, in particular using dynamic memory 00:02:33.230 --> 00:02:36.180 allocation with malloc, we can put these pieces together 00:02:36.180 --> 00:02:40.960 to create a new data structure-- a singly linked list we might say-- 00:02:40.960 --> 00:02:45.400 that allows us to grow and shrink a collection of values 00:02:45.400 --> 00:02:48.800 and we won't have any wasted space. 00:02:48.800 --> 00:02:53.320 >> So again, we call this idea, this notion, a linked list. 00:02:53.320 --> 00:02:56.320 In particular, in this video we're talking about singly linked list, 00:02:56.320 --> 00:02:59.185 and then another video we'll talk about doubly linked lists, which 00:02:59.185 --> 00:03:01.560 is just a variation on a theme here. 00:03:01.560 --> 00:03:05.200 But a singly linked list is comprised of nodes, 00:03:05.200 --> 00:03:08.559 nodes just being an abstract term-- it's just something I'm calling 00:03:08.559 --> 00:03:10.350 that's a kind of structure, basically, I'm? 00:03:10.350 --> 00:03:16.190 Just going to call it a node-- and this node has two members, or two fields. 00:03:16.190 --> 00:03:20.300 It has data, usually an integer, a character float, 00:03:20.300 --> 00:03:23.790 or could be some other data type that you've defined with a type def. 00:03:23.790 --> 00:03:29.290 And it contains a pointer to another node of the same type. 00:03:29.290 --> 00:03:34.710 >> So we have two things inside of this node, data and a pointer 00:03:34.710 --> 00:03:36.380 to another node. 00:03:36.380 --> 00:03:39.370 And if you start to visualize this, you can think about it 00:03:39.370 --> 00:03:42.280 like a chain of nodes that are connected together. 00:03:42.280 --> 00:03:45.070 We have the first node, it contains data, and a pointer 00:03:45.070 --> 00:03:49.110 to the second node, which contains data, and a pointer to the third node. 00:03:49.110 --> 00:03:52.940 And so that's why we call it a linked list, they're linked together. 00:03:52.940 --> 00:03:56.070 >> What does this special node structure look like? 00:03:56.070 --> 00:04:01.120 Well, if you recall from our video on defining custom types, with type def, 00:04:01.120 --> 00:04:05.400 we can define a structure-- and type define a structure like this. 00:04:05.400 --> 00:04:11.240 tyepdef struct sllist, and then I'm using the word value here arbitrarily 00:04:11.240 --> 00:04:13.891 to indicate any data type really. 00:04:13.891 --> 00:04:16.890 You could pass on an integer or float, you could have whatever you want. 00:04:16.890 --> 00:04:19.389 It's not restricted to just integers, or anything like that. 00:04:19.389 --> 00:04:22.790 So value is just an arbitrary data type, and then a pointer 00:04:22.790 --> 00:04:26.310 to another node of the same type. 00:04:26.310 --> 00:04:29.690 >> Now, there's a little catch here with defining a structure 00:04:29.690 --> 00:04:33.030 when it's a self referential structure. 00:04:33.030 --> 00:04:35.340 I have to have a temporary name for my structure. 00:04:35.340 --> 00:04:37.640 At the end of the day I clearly want to call it 00:04:37.640 --> 00:04:43.030 sll node, that's ultimately the new name part of my type definition, 00:04:43.030 --> 00:04:47.450 but I can't use sll node in the middle of this. 00:04:47.450 --> 00:04:51.430 The reason being, I haven't created a type called sll node 00:04:51.430 --> 00:04:55.200 until I hit this final point here. 00:04:55.200 --> 00:04:59.720 Up until that point, I have to have another way to refer to this data type. 00:04:59.720 --> 00:05:02.440 >> And this is a self referential data type. 00:05:02.440 --> 00:05:06.314 It;s a data type of a structure that contains a data, 00:05:06.314 --> 00:05:08.480 and a pointer to another structure of the same type. 00:05:08.480 --> 00:05:11.750 So I need to be able to refer to this data type at least temporarily, 00:05:11.750 --> 00:05:14.910 so giving it a temporary name of struct sllist 00:05:14.910 --> 00:05:18.540 allows me to then say I want a pointer to another struct sllist, 00:05:18.540 --> 00:05:24.690 a struct sllist star, and then after I've completed the definition, 00:05:24.690 --> 00:05:27.220 I can now call this type an sll node. 00:05:27.220 --> 00:05:30.520 >> So that's why you see there's a temporary name here, 00:05:30.520 --> 00:05:31.879 but a permanent name here. 00:05:31.879 --> 00:05:33.920 Sometimes you might see definitions of structure, 00:05:33.920 --> 00:05:36.570 for example, that aren't self referential, that 00:05:36.570 --> 00:05:39.390 don't have a specifier name here. 00:05:39.390 --> 00:05:43.040 It would just say typedef struct, open curly brace and then define it. 00:05:43.040 --> 00:05:45.620 But if you're struct is self referential, as this one is, 00:05:45.620 --> 00:05:49.010 you need to specify a temporary type name. 00:05:49.010 --> 00:05:51.310 But ultimately, now that we've done this, 00:05:51.310 --> 00:05:53.620 we can just refer to these nodes, these units, 00:05:53.620 --> 00:05:57.900 as sll nodes for purposes of the rest of this video. 00:05:57.900 --> 00:06:00.900 >> All right, so we know how to create a linked list node. 00:06:00.900 --> 00:06:03.240 We know how to define a linked list node. 00:06:03.240 --> 00:06:06.670 Now, if we're going to start using them to collect information, 00:06:06.670 --> 00:06:10.360 there's a couple of operations we need to understand and work with. 00:06:10.360 --> 00:06:12.860 We need to know how to create a linked list out of thin air. 00:06:12.860 --> 00:06:14.901 If there's no list already, we want to start one. 00:06:14.901 --> 00:06:16.960 So we need to be able to create a linked list, 00:06:16.960 --> 00:06:19.130 we need to probably search through the link list 00:06:19.130 --> 00:06:21.830 to find an element we're looking for. 00:06:21.830 --> 00:06:24.430 We need to be able to insert new things into the list, 00:06:24.430 --> 00:06:25.930 we want our list to be able to grow. 00:06:25.930 --> 00:06:28.638 And similarly, we want to be able to delete things from our list, 00:06:28.638 --> 00:06:30.250 we want our list to be able to shrink. 00:06:30.250 --> 00:06:32.160 And at the end of our programs, especially 00:06:32.160 --> 00:06:34.550 if you recall that we're dynamically allocating memory 00:06:34.550 --> 00:06:38.337 to build these lists typically, we want to free all of that memory 00:06:38.337 --> 00:06:39.670 when we're done working with it. 00:06:39.670 --> 00:06:44.627 And so we need to be able to delete an entire linked list in one fail swoop. 00:06:44.627 --> 00:06:46.460 So let's go through some of these operations 00:06:46.460 --> 00:06:51.192 and how we might visualize them, talking in pseudocode code specifically. 00:06:51.192 --> 00:06:53.150 So we want to create a linked list, so maybe we 00:06:53.150 --> 00:06:56.480 want to define a function with this prototype. 00:06:56.480 --> 00:07:01.690 sll node star, create, and I'm passing in one argument, some arbitrary data 00:07:01.690 --> 00:07:05.530 type again, of some arbitrary data type. 00:07:05.530 --> 00:07:10.482 But I'm returning-- this function should return to me a pointer, to a singly 00:07:10.482 --> 00:07:11.190 linked list node. 00:07:11.190 --> 00:07:14.050 Again, we're trying to create a linked list out of thin air, 00:07:14.050 --> 00:07:17.900 so I need a pointer to that list when I'm done. 00:07:17.900 --> 00:07:19.420 >> So what are the steps involved here? 00:07:19.420 --> 00:07:20.960 Well, first thing I'm going to do is dynamically 00:07:20.960 --> 00:07:22.550 allocate space for a new node. 00:07:22.550 --> 00:07:26.689 Again, we're creating it out of thin air, so we need to malloc space for it. 00:07:26.689 --> 00:07:28.480 And of course, immediately after we malloc, 00:07:28.480 --> 00:07:31.692 we always check to make sure that our pointer-- we did not get back null. 00:07:31.692 --> 00:07:33.650 Because if we try and deference a null pointer, 00:07:33.650 --> 00:07:36.190 we're going to suffer a segfault and we don't want that. 00:07:36.190 --> 00:07:39.510 >> Then we want to fill in the field, we want to initialize the value field 00:07:39.510 --> 00:07:41.690 and initialize the next field. 00:07:41.690 --> 00:07:45.450 And then we want to-- eventually as the function prototype indicates-- we want 00:07:45.450 --> 00:07:49.940 to return a pointer to an sll node. 00:07:49.940 --> 00:07:51.710 So what make this look like visually? 00:07:51.710 --> 00:07:55.230 Well, first we're going to dynamically allocate space for a new sll node, 00:07:55.230 --> 00:07:58.320 so we malloc-- that's a visual representation 00:07:58.320 --> 00:08:00.020 of the node we just created. 00:08:00.020 --> 00:08:02.757 And we check to make sure it's not null-- in this case, 00:08:02.757 --> 00:08:04.840 the picture wouldn't have shown up if it was null, 00:08:04.840 --> 00:08:07.298 we would have run out of memory, so we're good to go there. 00:08:07.298 --> 00:08:10.200 So now we're on to step C, initialize the nodes value field. 00:08:10.200 --> 00:08:12.280 Well, based on this function call I'm using here, 00:08:12.280 --> 00:08:16.700 looks like I want to pass in 6, so I'll 6 in the value field. 00:08:16.700 --> 00:08:18.865 Now, initialize the next field. 00:08:18.865 --> 00:08:21.640 Well, what am I going to do there, there is nothing next, right, 00:08:21.640 --> 00:08:23.600 this is the only thing in the list. 00:08:23.600 --> 00:08:27.206 So what's the next thing in the list? 00:08:27.206 --> 00:08:29.660 >> It shouldn't point to anything, right. 00:08:29.660 --> 00:08:33.600 There's nothing else there, so what is the concept we know of that's nothing-- 00:08:33.600 --> 00:08:35.638 pointers to nothing? 00:08:35.638 --> 00:08:37.929 It should be maybe we want to put a null pointer there, 00:08:37.929 --> 00:08:40.178 and I'll represent the null pointer as just a red box, 00:08:40.178 --> 00:08:41.559 we can't go any further. 00:08:41.559 --> 00:08:44.430 As we'll see a little later on, we will have eventually chains 00:08:44.430 --> 00:08:46.330 of arrows connecting these nodes together, 00:08:46.330 --> 00:08:48.480 but when you hit the red box, that's null, 00:08:48.480 --> 00:08:51.150 we can't go any further, that's the end of the list. 00:08:51.150 --> 00:08:53.960 >> And lastly, we just want to return a pointer to this node. 00:08:53.960 --> 00:08:56.160 So we'll call it new, and will return new 00:08:56.160 --> 00:08:59.370 so it can be used in whatever function created it. 00:08:59.370 --> 00:09:03.100 So there we go, We've created a singly linked list node out of thin air, 00:09:03.100 --> 00:09:05.920 and now we have a list we can work with. 00:09:05.920 --> 00:09:08.260 >> Now, let's say we already have a large chain, 00:09:08.260 --> 00:09:09.800 and we want to find something in it. 00:09:09.800 --> 00:09:12.716 And we want a function that's going to return true or false, depending 00:09:12.716 --> 00:09:15.840 on whether a value exists in that list. 00:09:15.840 --> 00:09:18.160 A function prototype, or declaration for that function, 00:09:18.160 --> 00:09:23.320 might look like this-- bool find, and then we want to pass in two arguments. 00:09:23.320 --> 00:09:26.996 >> The first, is a pointer to the first element of the linked list. 00:09:26.996 --> 00:09:29.620 This is actually something you'll always want to keep track of, 00:09:29.620 --> 00:09:33.110 and actually might be something that you even put in a global variable. 00:09:33.110 --> 00:09:35.360 Once you create a list, you always, always 00:09:35.360 --> 00:09:38.990 want to keep track of the very first element of the list. 00:09:38.990 --> 00:09:43.690 That way you can refer to all the other elements by just following the chain, 00:09:43.690 --> 00:09:47.300 without having to keep pointers intact to every single element. 00:09:47.300 --> 00:09:50.920 You only need to keep track of the first one if they're all chained together. 00:09:50.920 --> 00:09:52.460 >> And then the second thing we're passing in again 00:09:52.460 --> 00:09:54.376 is arbitrarily some-- whatever data type we're 00:09:54.376 --> 00:09:59.640 looking for there is inside of hopefully one of the nodes in the list. 00:09:59.640 --> 00:10:00.980 So what are the steps? 00:10:00.980 --> 00:10:04.250 Well, the first thing we do is we create a transversal pointer 00:10:04.250 --> 00:10:06.015 pointing to the lists head. 00:10:06.015 --> 00:10:08.890 Well, why do we do that, we already have a pointer at the lists head, 00:10:08.890 --> 00:10:10.974 why don't we just move that one around? 00:10:10.974 --> 00:10:13.140 Well, like I just said, it's really important for us 00:10:13.140 --> 00:10:17.580 to always keep track of the very first element in the list. 00:10:17.580 --> 00:10:21.270 And so it's actually better to create a duplicate of that, 00:10:21.270 --> 00:10:25.350 and use that to move around so we never accidentally move away, or we always 00:10:25.350 --> 00:10:30.430 have a pointer at some point that is right on the first element of the list. 00:10:30.430 --> 00:10:33.290 So it's better to create a second one that we use to move. 00:10:33.290 --> 00:10:35.877 >> Then we just compare whether the value field at that node 00:10:35.877 --> 00:10:38.960 is what we're looking for, and if it's not, we just move to the next node. 00:10:38.960 --> 00:10:41.040 And we keep doing that over, and over, and over, 00:10:41.040 --> 00:10:44.811 until we either find the element, or we hit 00:10:44.811 --> 00:10:47.310 null-- we've reached the end of the list and it isn't there. 00:10:47.310 --> 00:10:50.540 This should hopefully ring a bell to you as just linear search, 00:10:50.540 --> 00:10:54.430 we're just replicating it in a singly linked list structure 00:10:54.430 --> 00:10:56.280 instead of using an array to do it. 00:10:56.280 --> 00:10:58.210 >> So here's an example of a singly linked list. 00:10:58.210 --> 00:11:00.043 This one consists of five nodes, and we have 00:11:00.043 --> 00:11:04.330 a pointer to the head of the list, which is called list. 00:11:04.330 --> 00:11:07.385 The first thing we want to do is again, create that traversal pointer. 00:11:07.385 --> 00:11:09.760 So we have now two pointers that point to the same thing. 00:11:09.760 --> 00:11:15.025 >> Now, notice here also, I didn't have to malloc any space for trav. 00:11:15.025 --> 00:11:18.970 I didn't say trav equals malloc something, that node already exists, 00:11:18.970 --> 00:11:21.160 that space in memory already exists. 00:11:21.160 --> 00:11:24.290 So all I'm actually doing is creating another pointer to it. 00:11:24.290 --> 00:11:28.210 I'm not mallocing an additional space, just have now two pointers 00:11:28.210 --> 00:11:31.370 pointing to the same thing. 00:11:31.370 --> 00:11:33.710 >> So is 2 what I'm looking for? 00:11:33.710 --> 00:11:37.220 Well, no, so instead I'm going to move to the next one. 00:11:37.220 --> 00:11:41.740 So basically I would say, trav equals trav next. 00:11:41.740 --> 00:11:43.630 Is 3 what I'm looking for, no. 00:11:43.630 --> 00:11:45.780 So I continue to go through, until eventually 00:11:45.780 --> 00:11:48.690 get to 6 which is what I'm looking for based on the function call 00:11:48.690 --> 00:11:51.600 I have at the top there, and so I'm done. 00:11:51.600 --> 00:11:54.150 >> Now, what if the element I'm looking for isn't in the list, 00:11:54.150 --> 00:11:55.510 is it still going to work? 00:11:55.510 --> 00:11:57.120 Well, notice that the list here is subtly different, 00:11:57.120 --> 00:11:59.410 and this is another thing that's important with linked lists, 00:11:59.410 --> 00:12:01.780 you don't have to preserve them in any particular order. 00:12:01.780 --> 00:12:05.390 You can if you want, but you may have already noticed 00:12:05.390 --> 00:12:09.310 that we're not keeping track of what number element we are at. 00:12:09.310 --> 00:12:13.150 >> And that's sort of one trade that we have with linked list verses arrays, 00:12:13.150 --> 00:12:15.300 is it we don't have random access anymore. 00:12:15.300 --> 00:12:18.150 We can't just say, I want to go to the 0th element, 00:12:18.150 --> 00:12:21.410 or the 6th element of my array, which I can do in an array. 00:12:21.410 --> 00:12:25.080 I can't say I want to go to the 0th element, or the 6th element, 00:12:25.080 --> 00:12:30.360 or the 25th element of my linked list, there's no index associated with them. 00:12:30.360 --> 00:12:33.660 And so it doesn't really matter if we preserve our list in order. 00:12:33.660 --> 00:12:36.080 If you want to you certainly can, but there's 00:12:36.080 --> 00:12:38.567 no reason why they need to be preserved in any order. 00:12:38.567 --> 00:12:40.400 So again, let's try and find 6 in this list. 00:12:40.400 --> 00:12:43.200 Well, we start at the beginning, we don't find 6, 00:12:43.200 --> 00:12:47.690 and then we continue not finding 6, until we eventually get to here. 00:12:47.690 --> 00:12:52.790 So right now trav points to the node containing 8, and six is not in there. 00:12:52.790 --> 00:12:55.250 >> So the next step would be to go to the next pointer, 00:12:55.250 --> 00:12:57.440 so say trav equals trav next. 00:12:57.440 --> 00:13:00.750 Well, trav next, indicated by the red box there, is null. 00:13:00.750 --> 00:13:03.020 So there's nowhere else to go, and so at this point 00:13:03.020 --> 00:13:06.120 we can conclude that we've reached the end of the linked list, 00:13:06.120 --> 00:13:07.190 and 6 isn't in there. 00:13:07.190 --> 00:13:10.980 And it would be returned false in this case. 00:13:10.980 --> 00:13:14.540 >> OK, how do we insert a new node into the linked list? 00:13:14.540 --> 00:13:17.310 So we've been able to create a linked list out of nowhere, 00:13:17.310 --> 00:13:19.370 but we probably want to build a chain and not 00:13:19.370 --> 00:13:22.620 create a bunch of distinct lists. 00:13:22.620 --> 00:13:25.700 We want to have one list that has a bunch of nodes in it, 00:13:25.700 --> 00:13:28.040 not a bunch of lists with a single node. 00:13:28.040 --> 00:13:31.260 So we can't just keep using the Create function we defined earlier, now we 00:13:31.260 --> 00:13:33.860 want to insert into a list that already exists. 00:13:33.860 --> 00:13:36.499 >> So this case, we're going to pass in two arguments, 00:13:36.499 --> 00:13:39.290 the pointer to the head of that linked list that we want to add to. 00:13:39.290 --> 00:13:40.910 Again, that's why it's so important that we always 00:13:40.910 --> 00:13:43.400 keep track of it, because it's the only way we really 00:13:43.400 --> 00:13:46.690 have to refer to the whole list is just by a pointer to the first element. 00:13:46.690 --> 00:13:49.360 So we want to pass in a pointer to that first element, 00:13:49.360 --> 00:13:52.226 and whatever value we want to add to the list. 00:13:52.226 --> 00:13:54.600 And eventually this function is going to return a pointer 00:13:54.600 --> 00:13:57.980 to the new head of a linked list. 00:13:57.980 --> 00:13:59.700 >> What are the steps involved here? 00:13:59.700 --> 00:14:02.249 Well, just like with create, we need to dynamically allocate 00:14:02.249 --> 00:14:05.540 space for a new node, and check to make sure we don't run out of memory, again, 00:14:05.540 --> 00:14:07.150 because we're using malloc. 00:14:07.150 --> 00:14:09.080 Then we want to populate and insert the node, 00:14:09.080 --> 00:14:12.730 so put the number, whatever val is, into the node. 00:14:12.730 --> 00:14:17.310 We want to insert the node at the beginning of the linked list. 00:14:17.310 --> 00:14:19.619 >> There's a reason that I want to do this, and it 00:14:19.619 --> 00:14:21.910 might be worth taking a second to pause the video here, 00:14:21.910 --> 00:14:25.860 and think about why I would want to insert at the beginning of a linked 00:14:25.860 --> 00:14:26.589 list. 00:14:26.589 --> 00:14:28.630 Again, I mentioned earlier that it doesn't really 00:14:28.630 --> 00:14:33.020 matter if we preserve it in any order, so maybe that's a clue. 00:14:33.020 --> 00:14:36.040 And you saw what would happen if we wanted to-- or from just a second 00:14:36.040 --> 00:14:37.360 ago when we were going through search you 00:14:37.360 --> 00:14:39.235 could see what might happen if we were trying 00:14:39.235 --> 00:14:41.330 to insert at the end of the list. 00:14:41.330 --> 00:14:44.750 Because we don't have a pointer to the end of the list. 00:14:44.750 --> 00:14:47.490 >> So the reason that I would want to insert at the beginning, 00:14:47.490 --> 00:14:49.380 is because I can do it immediately. 00:14:49.380 --> 00:14:52.730 I have a pointer at the beginning, and we'll see this in a visual in a second. 00:14:52.730 --> 00:14:55.605 But if I want to insert at the end, I have to start at the beginning, 00:14:55.605 --> 00:14:58.760 traverse all the way to the end, and then tack it on. 00:14:58.760 --> 00:15:01.420 So that would mean that inserting at the end of the list 00:15:01.420 --> 00:15:04.140 would become an o of n operation, going back 00:15:04.140 --> 00:15:06.720 to our discussion of computational complexity. 00:15:06.720 --> 00:15:10.140 It'd become an o of n operation, where as the list got bigger, and bigger, 00:15:10.140 --> 00:15:13.310 and bigger, it'll become more and more difficult to tack something 00:15:13.310 --> 00:15:14.661 on at the end. 00:15:14.661 --> 00:15:17.410 But it's always really easy to tack something on at the beginning, 00:15:17.410 --> 00:15:19.060 you're always at the beginning. 00:15:19.060 --> 00:15:21.620 >> And we'll see a visual of this again. 00:15:21.620 --> 00:15:24.100 And then once we're done, once we've inserted the new node, 00:15:24.100 --> 00:15:26.880 we want to return our pointer to the new head of a linked list, which 00:15:26.880 --> 00:15:29.213 since we're inserting at the beginning, will actually be 00:15:29.213 --> 00:15:31.060 a pointer to the node we just created. 00:15:31.060 --> 00:15:33.280 Let's visualize this, because I think it'll help. 00:15:33.280 --> 00:15:36.661 >> So here's our list, it consists of four elements, a node containing 15, 00:15:36.661 --> 00:15:38.410 which points to a node containing 9, which 00:15:38.410 --> 00:15:41.370 points to a node containing 13, which points to a node containing 00:15:41.370 --> 00:15:44.840 10, which has the null pointer as its next pointer 00:15:44.840 --> 00:15:47.010 so that's the end of the list. 00:15:47.010 --> 00:15:50.200 So we want to insert a new node with the value 12 00:15:50.200 --> 00:15:52.720 at the beginning of this list, what do we do? 00:15:52.720 --> 00:15:58.770 Well, first we malloc space for the node, and then we put 12 in there. 00:15:58.770 --> 00:16:02.211 >> So now we've reached a decision point, right? 00:16:02.211 --> 00:16:03.960 We have a couple of pointers that we could 00:16:03.960 --> 00:16:06.770 move, which one should we move first? 00:16:06.770 --> 00:16:09.250 Should we make 12 point to the new head of the list-- 00:16:09.250 --> 00:16:13.020 or excuse me, should we make 12 point to the old head of the list? 00:16:13.020 --> 00:16:15.319 Or should we say that the list now begins at 12. 00:16:15.319 --> 00:16:17.110 There's a distinction there, and we'll look 00:16:17.110 --> 00:16:19.870 at what happens with both in a second. 00:16:19.870 --> 00:16:23.350 >> But this leads to a great topic for sidebar, 00:16:23.350 --> 00:16:26.280 which is that one of the trickiest things with linked lists 00:16:26.280 --> 00:16:30.980 is to arrange the pointers in the correct order. 00:16:30.980 --> 00:16:34.520 If you move things out of order, you can end up accidentally 00:16:34.520 --> 00:16:36.050 orphaning the rest of the list. 00:16:36.050 --> 00:16:37.300 And here's an example of that. 00:16:37.300 --> 00:16:40.540 So let's go with the idea of-- well, we've just created 12. 00:16:40.540 --> 00:16:43.180 We know 12 is going to be the new head of the list, 00:16:43.180 --> 00:16:47.660 and so why don't we just move the list pointer to point there. 00:16:47.660 --> 00:16:49.070 >> OK, so that's good. 00:16:49.070 --> 00:16:51.560 So now where does 12 next point? 00:16:51.560 --> 00:16:54.580 I mean, visually we can see that it will point to 15, 00:16:54.580 --> 00:16:57.250 as humans it's really obvious to us. 00:16:57.250 --> 00:17:00.300 How does the computer know? 00:17:00.300 --> 00:17:02.720 We don't have anything pointing to 15 anymore, right? 00:17:02.720 --> 00:17:05.869 >> We've lost any ability to refer to 15. 00:17:05.869 --> 00:17:11.460 We can't say new arrow next equals something, there's nothing there. 00:17:11.460 --> 00:17:13.510 In fact, we've orphaned the rest of the list 00:17:13.510 --> 00:17:16.465 by doing so, we've accidentally broken the chain. 00:17:16.465 --> 00:17:18.089 And we certainly don't want to do that. 00:17:18.089 --> 00:17:20.000 >> So let's go back and try this again. 00:17:20.000 --> 00:17:24.060 Maybe the right thing to do is to set 12's next pointer 00:17:24.060 --> 00:17:28.290 to the old head of the list first, then we can move the list over. 00:17:28.290 --> 00:17:30.420 And in fact, that is the correct order that we 00:17:30.420 --> 00:17:32.836 need to follow when we're working with singly linked list. 00:17:32.836 --> 00:17:36.460 We always want to connect the new element into the list, 00:17:36.460 --> 00:17:41.010 before we take that kind of important step of changing 00:17:41.010 --> 00:17:43.360 where the head of the linked list is. 00:17:43.360 --> 00:17:46.740 Again, that's such a fundamental thing, we don't want to lose track of it. 00:17:46.740 --> 00:17:49.310 >> So we want to make sure that everything is chained together, 00:17:49.310 --> 00:17:52.040 before we move that pointer. 00:17:52.040 --> 00:17:55.300 And so this would be the correct order, which is to connect 12 to the list, 00:17:55.300 --> 00:17:57.630 then say that the list starts a 12. 00:17:57.630 --> 00:18:00.860 If we said the list starts at 12 and then tried to connect 12 to the list, 00:18:00.860 --> 00:18:02.193 we've already seen what happens. 00:18:02.193 --> 00:18:04.920 We lose the list by mistake. 00:18:04.920 --> 00:18:06.740 >> OK, so one more thing to talk about. 00:18:06.740 --> 00:18:09.750 What if we want to get rid of an entire linked list at once? 00:18:09.750 --> 00:18:11.750 Again, we're mallocing all this space, and so we 00:18:11.750 --> 00:18:13.351 need to free it when we're done. 00:18:13.351 --> 00:18:15.350 So now we want to delete the entire linked list. 00:18:15.350 --> 00:18:16.850 Well, what do we want to do? 00:18:16.850 --> 00:18:20.460 >> If we've reached the null pointer, we want to stop, otherwise, just delete 00:18:20.460 --> 00:18:23.420 the rest of the list and then free me. 00:18:23.420 --> 00:18:28.890 Delete the rest of the list, and then free the current node. 00:18:28.890 --> 00:18:32.850 What does that sound like, what technique have we talked 00:18:32.850 --> 00:18:35.440 about previously does that sound like? 00:18:35.440 --> 00:18:39.560 Delete everybody else, then come back and delete me. 00:18:39.560 --> 00:18:42.380 >> That's recursion, we've made the problem a little bit smaller, 00:18:42.380 --> 00:18:46.910 we're saying delete everybody else, then you can delete me. 00:18:46.910 --> 00:18:50.940 And further down the road, that node will say, delete everybody else. 00:18:50.940 --> 00:18:53.940 But eventually we'll get to the point where the list is null, 00:18:53.940 --> 00:18:55.310 and that's our base case. 00:18:55.310 --> 00:18:57.010 >> So let's take a look at this, and how this might work. 00:18:57.010 --> 00:18:59.759 So here's our list, it's the same list we were just talking about, 00:18:59.759 --> 00:19:00.980 and there's the steps. 00:19:00.980 --> 00:19:04.200 There's a lot of text here, but hopefully the visualization will help. 00:19:04.200 --> 00:19:08.557 >> So we have-- and I also pulled up our stack frames illustration 00:19:08.557 --> 00:19:10.890 from our video on call stacks, and hopefully all of this 00:19:10.890 --> 00:19:13.260 together will show you what's going on. 00:19:13.260 --> 00:19:14.510 So here's our pseudocode code. 00:19:14.510 --> 00:19:17.830 If we reach a null pointer, stop, otherwise, 00:19:17.830 --> 00:19:21.320 delete the rest of the list, then free the current node. 00:19:21.320 --> 00:19:25.700 So right now, list-- the pointer that we're 00:19:25.700 --> 00:19:28.410 passing in to destroy points to 12. 00:19:28.410 --> 00:19:33.340 12 is not a null pointer, so we're going to delete the rest of the list. 00:19:33.340 --> 00:19:35.450 >> What is deleting the rest of us involved? 00:19:35.450 --> 00:19:37.950 Well, it means making a call to destroy, saying 00:19:37.950 --> 00:19:42.060 that 15 is the beginning of the rest of the list we want to destroy. 00:19:42.060 --> 00:19:47.480 And so the call to destroy 12 is kind of on hold. 00:19:47.480 --> 00:19:52.690 It's frozen there, waiting for the call to destroy 15, to finish its job. 00:19:52.690 --> 00:19:56.280 >> Well, 15 is not a null pointer, and so it's going to say, all right, 00:19:56.280 --> 00:19:58.450 well, delete the rest of the list. 00:19:58.450 --> 00:20:00.760 The rest of the list starts at 9, and so we'll just 00:20:00.760 --> 00:20:04.514 wait until you delete all that stuff, then come back and delete me. 00:20:04.514 --> 00:20:06.680 Well 9's going to say, well, I'm not a null pointer, 00:20:06.680 --> 00:20:09.020 so delete the rest the list from here. 00:20:09.020 --> 00:20:11.805 And so try and destroy 13. 00:20:11.805 --> 00:20:15.550 13 says, I'm not null pointer, same thing, it passes the buck. 00:20:15.550 --> 00:20:17.930 10 is not null pointer, 10 contains a null pointer, 00:20:17.930 --> 00:20:20.200 but 10 is not itself a null pointer right now, 00:20:20.200 --> 00:20:22.470 and so it passes the buck too. 00:20:22.470 --> 00:20:25.560 >> And now list points there, it really would point to some-- 00:20:25.560 --> 00:20:28.710 if I had more space in the image, it would point to some random space 00:20:28.710 --> 00:20:29.960 that we don't know what it is. 00:20:29.960 --> 00:20:34.680 It's the null pointer though, the list is literally now set it's values null. 00:20:34.680 --> 00:20:36.820 It's pointing right inside that red box. 00:20:36.820 --> 00:20:39.960 We reached a null pointer, so we can stop, and we're done. 00:20:39.960 --> 00:20:46.230 >> And so that purple frame is now-- at the top of stack-- that's the active frame, 00:20:46.230 --> 00:20:47.017 but it's done. 00:20:47.017 --> 00:20:48.600 If we've reached a null pointer, stop. 00:20:48.600 --> 00:20:51.290 We don't do anything, we can't free a null pointer, 00:20:51.290 --> 00:20:55.070 we didn't malloc any space, and so we're done. 00:20:55.070 --> 00:20:57.590 So that function frame is destroyed, and we 00:20:57.590 --> 00:21:00.930 resume-- we pick up where we left off with the next highest one, which 00:21:00.930 --> 00:21:02.807 is this dark blue frame here. 00:21:02.807 --> 00:21:04.390 So we pick up right where we left off. 00:21:04.390 --> 00:21:06.598 We deleted the rest of the list already, so now we're 00:21:06.598 --> 00:21:08.000 going to free the current nodes. 00:21:08.000 --> 00:21:12.920 So now we can free this node, and now we've reached the end of the function. 00:21:12.920 --> 00:21:16.810 And so that function frame is destroyed, and we pick up at the light blue one. 00:21:16.810 --> 00:21:20.650 >> So it says-- I've already done-- deleting the rest of the list, so 00:21:20.650 --> 00:21:23.140 free the current node. 00:21:23.140 --> 00:21:26.520 And now the yellow frame is back on top of the stack. 00:21:26.520 --> 00:21:29.655 And so as you see, we're now destroying the list from right to left. 00:21:33.710 --> 00:21:37.280 >> What would have happened, though, if we had done things the wrong way? 00:21:37.280 --> 00:21:39.410 Just like when we tried to add an element. 00:21:39.410 --> 00:21:41.909 If we messed up the chain, if we didn't connect the pointers 00:21:41.909 --> 00:21:44.690 in the correct order, if we just freed the first element, 00:21:44.690 --> 00:21:47.420 if we just freed the head of the list, now we 00:21:47.420 --> 00:21:49.642 have no way to refer to the rest of the list. 00:21:49.642 --> 00:21:51.350 And so we would have orphaned everything, 00:21:51.350 --> 00:21:53.880 we would have had what's called a memory leak. 00:21:53.880 --> 00:21:56.800 If you recall from our video on dynamic memory allocation, 00:21:56.800 --> 00:21:58.650 that's not very good thing. 00:21:58.650 --> 00:22:00.810 >> So as I said, there are several operations 00:22:00.810 --> 00:22:04.010 that we need to use to work with linked list effectively. 00:22:04.010 --> 00:22:08.430 And you may have noticed I omitted one, deleting a single element from a linked 00:22:08.430 --> 00:22:09.064 list. 00:22:09.064 --> 00:22:10.980 The reason I did that is it's actually kind of 00:22:10.980 --> 00:22:14.360 tricky to think about how to delete a single element from a singly 00:22:14.360 --> 00:22:15.600 linked list. 00:22:15.600 --> 00:22:19.950 We need to be able to skip over something in the list, which 00:22:19.950 --> 00:22:22.975 means we get to a point-- we want to delete this node-- 00:22:22.975 --> 00:22:25.350 but in order to make it so we don't lose any information, 00:22:25.350 --> 00:22:30.530 we need to connect this node over here, here. 00:22:30.530 --> 00:22:33.390 >> So I probably did that wrong from a visual perspective. 00:22:33.390 --> 00:22:36.830 So we're at the beginning of our list, we're proceeding through, 00:22:36.830 --> 00:22:40.510 we want to delete this node. 00:22:40.510 --> 00:22:43.440 If we just delete it, we've broken the chain. 00:22:43.440 --> 00:22:45.950 This node right here refers to everything else, 00:22:45.950 --> 00:22:48.260 it contains the chain from here on out. 00:22:48.260 --> 00:22:51.190 >> So what we need to do actually after we get to this point, 00:22:51.190 --> 00:22:56.670 is we need to step back one, and connect this node over to this node, 00:22:56.670 --> 00:22:58.590 so we can then delete the one in the middle. 00:22:58.590 --> 00:23:02.120 But singly linked lists don't provide us a way to go backwards. 00:23:02.120 --> 00:23:05.160 So we need to either keep two pointers, and move them 00:23:05.160 --> 00:23:09.527 sort of off step, one behind the other as we go, or get to a point 00:23:09.527 --> 00:23:11.110 and then send another pointer through. 00:23:11.110 --> 00:23:13.150 And as you can see, it can get a little messy. 00:23:13.150 --> 00:23:15.360 Fortunately, we have another way to resolve that, 00:23:15.360 --> 00:23:17.810 when we talk about doubly linked lists. 00:23:17.810 --> 00:23:20.720 >> I'm Doug Lloyd, this is CS50.