1 00:00:00,000 --> 00:00:03,000 2 00:00:03,000 --> 00:00:06,527 DAVID MALAN: So how can we now translate this into more useful code? 3 00:00:06,527 --> 00:00:08,360 Not just defining what these structures are, 4 00:00:08,360 --> 00:00:10,610 but how do we begin building up linked lists? 5 00:00:10,610 --> 00:00:14,600 Well, let me propose that a linked list really begins with just a pointer. 6 00:00:14,600 --> 00:00:18,980 And in fact, here we have, thanks to the theater's prop shop, just 7 00:00:18,980 --> 00:00:20,610 a null pointer, if you will. 8 00:00:20,610 --> 00:00:22,760 I'm going to call this variable list. 9 00:00:22,760 --> 00:00:24,567 And list is currently pointing to nothing. 10 00:00:24,567 --> 00:00:27,650 The arrow will say it's just pointing at the floor, which means it's null. 11 00:00:27,650 --> 00:00:29,600 It's not actually pointing at anything useful. 12 00:00:29,600 --> 00:00:32,299 Suppose I now want to start to begin to allocate a linked 13 00:00:32,299 --> 00:00:35,267 list with three numbers, 1, 2, and 3. 14 00:00:35,267 --> 00:00:36,600 Well, how am I going to do this? 15 00:00:36,600 --> 00:00:39,142 Well, at the moment, the only thing that exists in my program 16 00:00:39,142 --> 00:00:40,910 is going to be this variable called list. 17 00:00:40,910 --> 00:00:43,400 There's no array in this story. 18 00:00:43,400 --> 00:00:45,200 That was in week two. 19 00:00:45,200 --> 00:00:46,550 Today's all about linked lists. 20 00:00:46,550 --> 00:00:48,710 So how do I get myself a wooden block that 21 00:00:48,710 --> 00:00:51,380 represents one, another wooden block that represents two, 22 00:00:51,380 --> 00:00:53,060 and a third that represents three? 23 00:00:53,060 --> 00:00:55,610 Well, I need to use our new friend from last week, malloc. 24 00:00:55,610 --> 00:00:58,850 Recall that malloc allows you to allocate memory, 25 00:00:58,850 --> 00:01:01,250 as much memory as you might want, so long as you tell it 26 00:01:01,250 --> 00:01:02,520 the size of that thing. 27 00:01:02,520 --> 00:01:04,910 So frankly, I think what we could do ultimately today 28 00:01:04,910 --> 00:01:08,930 is use malloc to allocate dynamically one struct, 29 00:01:08,930 --> 00:01:11,000 and put the number one in it; another struct, 30 00:01:11,000 --> 00:01:14,090 put the number two on it; another struct, put the number three in it. 31 00:01:14,090 --> 00:01:18,200 And then use these playful arrows here to actually stitch them together, 32 00:01:18,200 --> 00:01:19,980 having one point to the other. 33 00:01:19,980 --> 00:01:21,950 So thankfully the prop shop has wonderfully 34 00:01:21,950 --> 00:01:24,240 created a whole bunch of these for us. 35 00:01:24,240 --> 00:01:30,600 Let me go ahead and malloc a very heavy node that has room for two values. 36 00:01:30,600 --> 00:01:33,965 And you'll see it has a room for a number and a next pointer. 37 00:01:33,965 --> 00:01:35,840 So the number I'm going to first install here 38 00:01:35,840 --> 00:01:38,810 is going to be the number one, for instance. 39 00:01:38,810 --> 00:01:41,780 And I'm going to leave its pointer as just pointing at the ground, 40 00:01:41,780 --> 00:01:43,977 indicating that this is a null pointer. 41 00:01:43,977 --> 00:01:45,810 It's not actually pointing at anything else. 42 00:01:45,810 --> 00:01:48,650 But now that I'm starting to instantiate, 43 00:01:48,650 --> 00:01:51,830 that is create this list, now I'm going to do something like this 44 00:01:51,830 --> 00:01:56,000 and say that, all right, my variable called list, whose purpose in life 45 00:01:56,000 --> 00:01:58,670 is to keep track of where this list is in memory, 46 00:01:58,670 --> 00:02:01,130 I'm going to connect one to the other by actually having 47 00:02:01,130 --> 00:02:04,100 this variable point at this node. 48 00:02:04,100 --> 00:02:06,680 When it comes time, then, to allocate another node, 49 00:02:06,680 --> 00:02:08,433 I want to insert into this linked list. 50 00:02:08,433 --> 00:02:11,600 Back in the world of arrays, I might have to allocate a new chunk of memory, 51 00:02:11,600 --> 00:02:14,370 copy this value over into the new values. 52 00:02:14,370 --> 00:02:15,380 I don't have to do that. 53 00:02:15,380 --> 00:02:18,710 In the world of linked lists, I just call malloc for a second time 54 00:02:18,710 --> 00:02:22,040 and say, give me another chunk of memory big enough to fit a node. 55 00:02:22,040 --> 00:02:25,280 Thankfully, from the prop shop, we have another one of these. 56 00:02:25,280 --> 00:02:29,480 So let me go ahead and malloc another node. 57 00:02:29,480 --> 00:02:32,570 Here, at the moment, there's nothing in it, just these placeholders. 58 00:02:32,570 --> 00:02:34,310 So it's garbage values, if you will. 59 00:02:34,310 --> 00:02:36,620 I don't know what's there until I actually say 60 00:02:36,620 --> 00:02:39,110 that the number shall be number two. 61 00:02:39,110 --> 00:02:43,820 And then I go over to my linked list, whose variable name is list. 62 00:02:43,820 --> 00:02:45,300 And I want to insert this thing. 63 00:02:45,300 --> 00:02:46,790 So I follow the arrow. 64 00:02:46,790 --> 00:02:52,200 I then point the next field of this node at this node here. 65 00:02:52,200 --> 00:02:54,740 So now I have a linked list of size two. 66 00:02:54,740 --> 00:02:56,580 There's three things in the picture. 67 00:02:56,580 --> 00:02:58,160 But this is just a simple variable. 68 00:02:58,160 --> 00:03:01,880 This is a pointer that's pointing at the actual node, which in turn is 69 00:03:01,880 --> 00:03:03,350 pointing at an actual other node. 70 00:03:03,350 --> 00:03:07,130 Now, suppose I want to insert the number three into this linked list. 71 00:03:07,130 --> 00:03:09,560 Recall that malloc is powerful and that it 72 00:03:09,560 --> 00:03:11,660 takes memory from wherever it's available, 73 00:03:11,660 --> 00:03:14,030 the so-called heap from your computer. 74 00:03:14,030 --> 00:03:17,450 And that means, by definition, that it might not be contiguous. 75 00:03:17,450 --> 00:03:20,870 The next number might not actually be here metaphorically 76 00:03:20,870 --> 00:03:22,190 in the computer's memory. 77 00:03:22,190 --> 00:03:23,993 It might be way over there. 78 00:03:23,993 --> 00:03:25,160 So that might indeed happen. 79 00:03:25,160 --> 00:03:29,390 The third time I call malloc now and allocate a third node, 80 00:03:29,390 --> 00:03:32,720 it might not be available anywhere in the computer's memory 81 00:03:32,720 --> 00:03:35,090 except for way over here. 82 00:03:35,090 --> 00:03:36,450 And that's fine. 83 00:03:36,450 --> 00:03:41,480 It doesn't have to be contiguous, as it did in the world of our arrays. 84 00:03:41,480 --> 00:03:44,220 I can now put the number three in its place. 85 00:03:44,220 --> 00:03:48,763 But if I want to keep a pointer to that node so that all of these things 86 00:03:48,763 --> 00:03:50,930 are stitched together, I can start at the beginning. 87 00:03:50,930 --> 00:03:52,280 I follow the arrows. 88 00:03:52,280 --> 00:03:53,480 I follow the arrows. 89 00:03:53,480 --> 00:03:55,670 And now if I want to remember where that one is, 90 00:03:55,670 --> 00:03:58,200 I'm going to have to connect these things. 91 00:03:58,200 --> 00:04:04,230 And so now that pointer needs to point at this block here. 92 00:04:04,230 --> 00:04:05,990 And this visual is very much deliberate. 93 00:04:05,990 --> 00:04:09,000 These nodes can be anywhere in the computer's memory. 94 00:04:09,000 --> 00:04:11,390 They're not necessarily going to be contiguous. 95 00:04:11,390 --> 00:04:16,220 The downside of that is that you cannot rely on binary search, our friend from, 96 00:04:16,220 --> 00:04:17,630 like, week zero onward. 97 00:04:17,630 --> 00:04:21,290 Binary search was amazing in that its big O of log n. 98 00:04:21,290 --> 00:04:24,053 You can find stuff way faster than big O of n. 99 00:04:24,053 --> 00:04:25,970 That was the whole point of even the phonebook 100 00:04:25,970 --> 00:04:27,710 example in the very first week. 101 00:04:27,710 --> 00:04:31,220 But the upside of this approach is that you 102 00:04:31,220 --> 00:04:35,480 don't have to actually keep allocating and copying more memory 103 00:04:35,480 --> 00:04:38,188 and all of your values any time you want to resize this thing. 104 00:04:38,188 --> 00:04:40,730 And I'm a little embarrassed to admit that I'm out of breath, 105 00:04:40,730 --> 00:04:42,080 for some reason, just malloc-ing nodes. 106 00:04:42,080 --> 00:04:42,580 Here. 107 00:04:42,580 --> 00:04:44,960 But the point is using malloc and building out 108 00:04:44,960 --> 00:04:46,880 the structure does come at some price. 109 00:04:46,880 --> 00:04:49,025 It's exhausting, frankly. 110 00:04:49,025 --> 00:04:51,150 But it's also going to spread things out in memory. 111 00:04:51,150 --> 00:04:52,160 But you have this dynamism. 112 00:04:52,160 --> 00:04:54,900 And honestly, if you are the Twitters of the world, the Googles of the world, 113 00:04:54,900 --> 00:04:56,400 we have lots and lots of data. 114 00:04:56,400 --> 00:05:00,500 It's going to kill you performance wise to have to copy all of your data 115 00:05:00,500 --> 00:05:03,200 from one location into another, as Santiago originally 116 00:05:03,200 --> 00:05:05,580 proposed as a solution to the array. 117 00:05:05,580 --> 00:05:09,530 So using these dynamic data structures, like a linked list, where 118 00:05:09,530 --> 00:05:11,780 you allocate more space wherever it's available, 119 00:05:11,780 --> 00:05:15,290 and you somehow remember where that is by stitching things together, 120 00:05:15,290 --> 00:05:19,900 as with these physical pointers, is really the state of the art. 121 00:05:19,900 --> 00:05:21,000