1 00:00:00,000 --> 00:00:05,172 [MUSIC PLAYING] 2 00:00:05,172 --> 00:00:06,880 DOUG LLOYD: So in this video, we're going 3 00:00:06,880 --> 00:00:09,310 to take a look at insertion sort, which is another algorithm that you 4 00:00:09,310 --> 00:00:10,750 can use to sort an array. 5 00:00:10,750 --> 00:00:13,600 And in this example, we're going to use an array of integers. 6 00:00:13,600 --> 00:00:15,820 An algorithm, recall, is a step by step set 7 00:00:15,820 --> 00:00:18,220 of instructions for completing a task. 8 00:00:18,220 --> 00:00:20,360 The idea with insertion sort is as follows, 9 00:00:20,360 --> 00:00:24,820 we're going to build the sorted array in place, shifting elements that 10 00:00:24,820 --> 00:00:27,274 were previously considered sorted, if we need to, 11 00:00:27,274 --> 00:00:29,440 in order to fit those elements in the right position 12 00:00:29,440 --> 00:00:31,224 in the sorted portion of the array. 13 00:00:31,224 --> 00:00:33,640 This is fundamentally different than how we've done things 14 00:00:33,640 --> 00:00:35,470 with selection sort or bubble sort. 15 00:00:35,470 --> 00:00:38,110 Recall with those algorithms what we do is, 16 00:00:38,110 --> 00:00:40,300 we usually end up just sorting one element. 17 00:00:40,300 --> 00:00:42,550 We have to go through the entire array to put 18 00:00:42,550 --> 00:00:44,170 one element in the correct position. 19 00:00:44,170 --> 00:00:47,140 For bubble sort, it's usually the largest element that's available. 20 00:00:47,140 --> 00:00:49,650 For selection sort, it's usually the smallest element. 21 00:00:49,650 --> 00:00:52,180 But we're only getting one at a time. 22 00:00:52,180 --> 00:00:55,660 We're not making-- you know, we have to go through this array multiple times. 23 00:00:55,660 --> 00:01:00,227 With insertion sort, we only are going to move forward through the array once. 24 00:01:00,227 --> 00:01:03,310 We might have to look back at things we've already sorted to sort of shift 25 00:01:03,310 --> 00:01:04,780 them around to make room. 26 00:01:04,780 --> 00:01:08,090 But we only have to make one forward pass of the entire array. 27 00:01:08,090 --> 00:01:09,920 So that's sort of a fundamental difference. 28 00:01:09,920 --> 00:01:11,710 And the way we do this is we get started, we just 29 00:01:11,710 --> 00:01:13,450 say the first thing we see is sorted. 30 00:01:13,450 --> 00:01:14,440 It's the only thing we've seen so far. 31 00:01:14,440 --> 00:01:15,910 So it might as well be sorted. 32 00:01:15,910 --> 00:01:19,210 Then we just do the following for every element that remains. 33 00:01:19,210 --> 00:01:22,510 We look at that element and we maybe shift things 34 00:01:22,510 --> 00:01:24,640 that we previously considered sorted, just 35 00:01:24,640 --> 00:01:28,279 to make room for it, until we get through every single element. 36 00:01:28,279 --> 00:01:31,320 This will probably make a little bit more sense when you see it visually. 37 00:01:31,320 --> 00:01:33,700 So let's illustrate that now with this array. 38 00:01:33,700 --> 00:01:36,700 In this array, what I want to do is everything that's red is unsorted, 39 00:01:36,700 --> 00:01:39,530 everything that's blue is sorted. 40 00:01:39,530 --> 00:01:42,320 So keep that in mind as we go through this example. 41 00:01:42,320 --> 00:01:44,380 So recall that the first thing we do is we 42 00:01:44,380 --> 00:01:47,210 declare the first element of the array is sorted. 43 00:01:47,210 --> 00:01:47,836 So that's five. 44 00:01:47,836 --> 00:01:48,460 We just see it. 45 00:01:48,460 --> 00:01:49,300 We're like, got it. 46 00:01:49,300 --> 00:01:51,580 This one is sorted. 47 00:01:51,580 --> 00:01:53,270 So we have the sorted portion in blue. 48 00:01:53,270 --> 00:01:55,952 And we this five element unsorted portion in red. 49 00:01:55,952 --> 00:01:57,910 Now we're going to repeat the following process 50 00:01:57,910 --> 00:01:59,243 until everything else is sorted. 51 00:01:59,243 --> 00:02:01,420 We look at the next unsorted element and we maybe 52 00:02:01,420 --> 00:02:04,180 shift things around in the blue section in order 53 00:02:04,180 --> 00:02:07,270 to put that element in the correct sorted position. 54 00:02:07,270 --> 00:02:09,380 So the first element that we see is two. 55 00:02:09,380 --> 00:02:11,380 We take a look, we see is there anything we need 56 00:02:11,380 --> 00:02:13,690 to do to put two in the right spot. 57 00:02:13,690 --> 00:02:14,440 The answer is yes. 58 00:02:14,440 --> 00:02:17,650 We need to shift five over in order to put two in front of it. 59 00:02:17,650 --> 00:02:19,610 So we slide five over. 60 00:02:19,610 --> 00:02:23,100 And then we put two where five basically was in memory. 61 00:02:23,100 --> 00:02:26,717 And now two, five is sorted and the red portion is unsorted. 62 00:02:26,717 --> 00:02:28,050 Let's repeat this process again. 63 00:02:28,050 --> 00:02:29,932 The next thing we see is a one. 64 00:02:29,932 --> 00:02:31,390 We take a look at the blue portion. 65 00:02:31,390 --> 00:02:32,560 What do we have to move? 66 00:02:32,560 --> 00:02:34,435 Well here, again, we have to move everything, 67 00:02:34,435 --> 00:02:36,730 because one comes before both two and five. 68 00:02:36,730 --> 00:02:38,620 So both of them slide over. 69 00:02:38,620 --> 00:02:42,280 And that leaves one vacancy to put the one. 70 00:02:42,280 --> 00:02:45,065 The next thing we see is three. 71 00:02:45,065 --> 00:02:48,190 What do we have to move this time in order to get everything into position? 72 00:02:48,190 --> 00:02:49,240 Well, it's not everything this time. 73 00:02:49,240 --> 00:02:50,698 We just have to move the five over. 74 00:02:50,698 --> 00:02:52,870 So we move the five to where the three was. 75 00:02:52,870 --> 00:02:56,620 And then we put the three where the five just vacated. 76 00:02:56,620 --> 00:02:58,630 The next element that we see is 6. 77 00:02:58,630 --> 00:03:01,570 And this is sort of a special case of insertion sort, where 78 00:03:01,570 --> 00:03:04,140 it's larger than everything in the sorted portion, which 79 00:03:04,140 --> 00:03:06,640 is great because then we don't have to move anything at all. 80 00:03:06,640 --> 00:03:08,931 We just say, all right, well, six is in the right spot. 81 00:03:08,931 --> 00:03:11,200 We can just declare six to be sorted. 82 00:03:11,200 --> 00:03:13,540 And then the final element we have here is four. 83 00:03:13,540 --> 00:03:14,623 So we take a look at that. 84 00:03:14,623 --> 00:03:17,470 We look back at the blue portion, the sorted portion of the array. 85 00:03:17,470 --> 00:03:19,209 We figure out what we need to slide over. 86 00:03:19,209 --> 00:03:20,500 It's just the five and the six. 87 00:03:20,500 --> 00:03:23,260 They slide over which makes room for the four. 88 00:03:23,260 --> 00:03:27,670 And now we have sorted our entire six element array using the insertion sort 89 00:03:27,670 --> 00:03:28,540 algorithm. 90 00:03:28,540 --> 00:03:32,060 Now again, this feels very different than anything we've seen before. 91 00:03:32,060 --> 00:03:35,420 But unfortunately it's still n squared algorithm. 92 00:03:35,420 --> 00:03:38,240 So for example, imagine if the array is in reverse order. 93 00:03:38,240 --> 00:03:40,870 So it's six, five, four, three, two, one. 94 00:03:40,870 --> 00:03:45,330 In that case, we have to shift each of the elements n positions every time we 95 00:03:45,330 --> 00:03:46,790 want to make an insertion. 96 00:03:46,790 --> 00:03:49,450 So we see this six, five, four, three, two, one array. 97 00:03:49,450 --> 00:03:50,435 The six is good. 98 00:03:50,435 --> 00:03:51,310 Then we see the five. 99 00:03:51,310 --> 00:03:52,955 We have to shift the six over. 100 00:03:52,955 --> 00:03:53,830 Then we see the four. 101 00:03:53,830 --> 00:03:55,704 We have to shift of six and the five over. 102 00:03:55,704 --> 00:03:56,620 Then we see the three. 103 00:03:56,620 --> 00:03:58,600 We have to shift the six and the five and the four. 104 00:03:58,600 --> 00:03:59,620 So we just keep getting worse. 105 00:03:59,620 --> 00:04:01,286 We have to keep making all these shifts. 106 00:04:01,286 --> 00:04:02,830 And each one of those costs us. 107 00:04:02,830 --> 00:04:05,230 So even though this looks and feels very different 108 00:04:05,230 --> 00:04:10,110 than a bubble sort or a selection sort, this is still n squared, unfortunately. 109 00:04:10,110 --> 00:04:13,060 In the best case scenario though, the array is perfectly sorted. 110 00:04:13,060 --> 00:04:15,970 And this is sort of that example we saw a second ago with the five and the six, 111 00:04:15,970 --> 00:04:18,279 where we just said, oh, well that's kind of convenient. 112 00:04:18,279 --> 00:04:20,110 The six is greater than everything in the sorted portion, 113 00:04:20,110 --> 00:04:21,735 so we just kind of moved the line over. 114 00:04:21,735 --> 00:04:24,671 We just changed it from red to blue without changing anything. 115 00:04:24,671 --> 00:04:27,670 In the best case scenario, we just do that for every element-- one, two, 116 00:04:27,670 --> 00:04:28,300 three, four. 117 00:04:28,300 --> 00:04:30,880 We just keep changing them red to blue, no shifts. 118 00:04:30,880 --> 00:04:34,270 And so if we think about this in asymptotic notation, that 119 00:04:34,270 --> 00:04:40,510 means that this algorithm is a big O of n squared, but it's still omega of n. 120 00:04:40,510 --> 00:04:41,520 I'm Doug Lloyd. 121 00:04:41,520 --> 00:04:43,620 This is CS50. 122 00:04:43,620 --> 00:04:44,977