1 00:00:06,762 --> 00:00:09,980 TOMMY: Let's take a look at selection sort, an algorithm 2 00:00:09,980 --> 00:00:12,800 for taking a list of numbers and sorting them. 3 00:00:12,800 --> 00:00:15,750 An algorithm, remember, is simply a step-by-step 4 00:00:15,750 --> 00:00:18,370 procedure for accomplishing a task. 5 00:00:18,370 --> 00:00:21,470 The basic idea behind selection sort is to divide 6 00:00:21,470 --> 00:00:23,390 our list into two portions-- 7 00:00:23,390 --> 00:00:26,810 a sorted portion and an unsorted portion. 8 00:00:26,810 --> 00:00:30,200 At each step of the algorithm, a number is moved from the 9 00:00:30,200 --> 00:00:33,800 unsorted portion to the sorted portion until eventually the 10 00:00:33,800 --> 00:00:35,880 entire list is sorted. 11 00:00:35,880 --> 00:00:38,510 So here's a list of six numbers-- 12 00:00:38,510 --> 00:00:44,010 23, 42, 4, 16, 8, and 15. 13 00:00:44,010 --> 00:00:47,680 Right now the entire list is considered unsorted. 14 00:00:47,680 --> 00:00:51,770 Even though a number like 16 may already be in its correct 15 00:00:51,770 --> 00:00:56,040 location, our algorithm has no way of knowing that until the 16 00:00:56,040 --> 00:00:57,980 entire list is sorted. 17 00:00:57,980 --> 00:01:01,355 So we'll consider every number to be unsorted until we sort 18 00:01:01,355 --> 00:01:03,800 it ourselves. 19 00:01:03,800 --> 00:01:06,890 We know that we want the list to be in ascending order. 20 00:01:06,890 --> 00:01:10,200 So we'll want to build up the sorted portion of our list 21 00:01:10,200 --> 00:01:13,280 from left to right, smallest to largest. 22 00:01:13,280 --> 00:01:17,970 To do that, we'll need to find the minimum unsorted element 23 00:01:17,970 --> 00:01:21,350 and put it at the end of the sorted portion. 24 00:01:21,350 --> 00:01:25,370 Since this list is not sorted, the only way to do that is to 25 00:01:25,370 --> 00:01:29,330 look at each element in the unsorted portion, remembering 26 00:01:29,330 --> 00:01:32,010 which element is the lowest and comparing 27 00:01:32,010 --> 00:01:33,770 each element to that. 28 00:01:33,770 --> 00:01:36,150 So we'll first look at the 23. 29 00:01:36,150 --> 00:01:38,650 This is the first element we've seen, so we'll remember 30 00:01:38,650 --> 00:01:40,050 it as the minimum. 31 00:01:40,050 --> 00:01:42,320 Next we'll look at 42. 32 00:01:42,320 --> 00:01:46,720 42 is larger than 23, so 23 is still the minimum. 33 00:01:46,720 --> 00:01:51,210 Next is the 4 which is less than 23, so we'll remember 4 34 00:01:51,210 --> 00:01:52,880 as the new minimum. 35 00:01:52,880 --> 00:01:56,380 Next is 16 which is larger than 4, so 4 36 00:01:56,380 --> 00:01:57,980 is still the minimum. 37 00:01:57,980 --> 00:02:03,670 8 is larger than 4, and 15 is larger than 4, so 4 must be 38 00:02:03,670 --> 00:02:05,980 the smallest unsorted element. 39 00:02:05,980 --> 00:02:09,350 So even though as humans we can immediately see that 4 is 40 00:02:09,350 --> 00:02:12,300 the minimum element, our algorithm needs to look at 41 00:02:12,300 --> 00:02:15,710 every unsorted element, even after we've found the 4-- the 42 00:02:15,710 --> 00:02:16,860 minimum element. 43 00:02:16,860 --> 00:02:19,900 So now that we've found the minimum element, 4, we'll want 44 00:02:19,900 --> 00:02:23,410 to move it into the sorted portion of the list. 45 00:02:23,410 --> 00:02:27,320 Since this is the first step, this means we want to put 4 at 46 00:02:27,320 --> 00:02:29,680 the beginning of the list. 47 00:02:29,680 --> 00:02:33,040 Right now 23 is at the beginning of the list, so 48 00:02:33,040 --> 00:02:36,080 let's swap the 4 and the 23. 49 00:02:36,080 --> 00:02:38,870 So now our list looks like this. 50 00:02:38,870 --> 00:02:42,710 We know that 4 must be in its final location, because it is 51 00:02:42,710 --> 00:02:45,890 both the smallest element and the element at the beginning 52 00:02:45,890 --> 00:02:46,960 of the list. 53 00:02:46,960 --> 00:02:50,650 So this means that we don't ever need to move it again. 54 00:02:50,650 --> 00:02:53,910 So let's repeat this process to add another element to the 55 00:02:53,910 --> 00:02:55,910 sorted portion of the list. 56 00:02:55,910 --> 00:02:58,950 We know that we don't need to look at the 4, because it's 57 00:02:58,950 --> 00:03:00,000 already sorted. 58 00:03:00,000 --> 00:03:03,540 So we can start at the 42, which we'll remember as the 59 00:03:03,540 --> 00:03:05,290 minimum element. 60 00:03:05,290 --> 00:03:08,700 So next we'll look at the 23 which is less than 42, so we 61 00:03:08,700 --> 00:03:11,620 remember 23 is the new minimum. 62 00:03:11,620 --> 00:03:14,870 Next we see the 16 which is less than 23, so 63 00:03:14,870 --> 00:03:16,800 16 is the new minimum. 64 00:03:16,800 --> 00:03:19,720 Now we look at the 8 which is less than 16, so 65 00:03:19,720 --> 00:03:21,130 8 is the new minimum. 66 00:03:21,130 --> 00:03:25,900 And finally 8 is less than 15, so we know that 8 is a minimum 67 00:03:25,900 --> 00:03:27,780 unsorted element. 68 00:03:27,780 --> 00:03:30,660 So that means we should append 8 to the sorted 69 00:03:30,660 --> 00:03:32,450 portion of the list. 70 00:03:32,450 --> 00:03:35,990 Right now 4 is the only sorted element, so we want to place 71 00:03:35,990 --> 00:03:38,410 the 8 next to the 4. 72 00:03:38,410 --> 00:03:41,920 Since 42 is the first element in the unsorted portion of the 73 00:03:41,920 --> 00:03:47,260 list, we'll want to swap the 42 and the 8. 74 00:03:47,260 --> 00:03:49,680 So now our list looks like this. 75 00:03:49,680 --> 00:03:53,830 4 and 8 represent the sorted portion of the list, and the 76 00:03:53,830 --> 00:03:56,440 remaining numbers represent the unsorted 77 00:03:56,440 --> 00:03:58,260 portion of the list. 78 00:03:58,260 --> 00:04:00,630 So let's continue with another iteration. 79 00:04:00,630 --> 00:04:03,850 We start with 23 this time, since we don't need to look at 80 00:04:03,850 --> 00:04:05,770 the 4 and the 8 anymore because they've 81 00:04:05,770 --> 00:04:07,660 already been sorted. 82 00:04:07,660 --> 00:04:10,270 16 is less than 23, so we'll remember 83 00:04:10,270 --> 00:04:12,070 16 as the new minimum. 84 00:04:12,070 --> 00:04:18,149 16 is less than 42, but 15 is less than 16, so 15 must be 85 00:04:18,149 --> 00:04:20,480 the minimum unsorted element. 86 00:04:20,480 --> 00:04:24,580 So now we want to swap the 15 and the 23 to 87 00:04:24,580 --> 00:04:26,310 give us this list. 88 00:04:26,310 --> 00:04:30,500 The sorted portion of the list consists of 4, 8 and 15, and 89 00:04:30,500 --> 00:04:33,210 these elements are still unsorted. 90 00:04:33,210 --> 00:04:36,900 But it just so happens that the next unsorted element, 16, 91 00:04:36,900 --> 00:04:38,480 is already sorted. 92 00:04:38,480 --> 00:04:42,060 However, there's no way for our algorithm to know that 16 93 00:04:42,060 --> 00:04:45,230 is already in its correct location, so we still need to 94 00:04:45,230 --> 00:04:47,870 repeat exactly the same process. 95 00:04:47,870 --> 00:04:53,750 So we see that 16 is less than 42, and 16 is less than 23, so 96 00:04:53,750 --> 00:04:56,230 16 must be the minimum element. 97 00:04:56,230 --> 00:04:59,010 It's impossible to swap this element with itself, so we can 98 00:04:59,010 --> 00:05:01,780 simply leave it in this location. 99 00:05:01,780 --> 00:05:04,660 So we need one more pass of our algorithm. 100 00:05:04,660 --> 00:05:09,370 42 is greater than 23, so 23 must be the 101 00:05:09,370 --> 00:05:10,970 minimum unsorted element. 102 00:05:10,970 --> 00:05:17,410 Once we swap the 23 and the 42, we end up with our final 103 00:05:17,410 --> 00:05:18,530 sorted list-- 104 00:05:18,530 --> 00:05:23,390 4, 8, 15, 16, 23, 42. 105 00:05:23,390 --> 00:05:26,830 We know 42 must be in the correct place since it's the 106 00:05:26,830 --> 00:05:30,210 only element left, and that's selection sort. 107 00:05:30,210 --> 00:05:32,100 Let's now formalize our algorithm with some 108 00:05:32,100 --> 00:05:34,540 pseudocode. 109 00:05:34,540 --> 00:05:37,760 On line one, we can see that we need to integrate over 110 00:05:37,760 --> 00:05:39,530 every element of the list. 111 00:05:39,530 --> 00:05:42,150 Except the last element, since the 1 element 112 00:05:42,150 --> 00:05:44,230 list is already sorted. 113 00:05:44,230 --> 00:05:48,100 On line two, we consider the first element of the unsorted 114 00:05:48,100 --> 00:05:51,080 portion of the list to be the minimum, as we did with our 115 00:05:51,080 --> 00:05:53,750 example, so we have something to compare to. 116 00:05:53,750 --> 00:05:57,260 Line three begins a second loop in which we iterate over 117 00:05:57,260 --> 00:05:59,170 each unsorted element. 118 00:05:59,170 --> 00:06:02,150 We know that after i iterations, the sorted portion 119 00:06:02,150 --> 00:06:05,330 of our list must have i elements in it since each step 120 00:06:05,330 --> 00:06:06,890 sorts one element. 121 00:06:06,890 --> 00:06:11,770 So the first unsorted element must be in position i plus 1. 122 00:06:11,770 --> 00:06:15,440 On line four, we compare the current element to the minimum 123 00:06:15,440 --> 00:06:17,750 element that we've seen so far. 124 00:06:17,750 --> 00:06:20,560 If the current element is smaller than the minimum 125 00:06:20,560 --> 00:06:23,870 element, then we remember the current element as the new 126 00:06:23,870 --> 00:06:26,250 minimum on line five. 127 00:06:26,250 --> 00:06:29,900 Finally, on lines six and seven, we swap the minimum 128 00:06:29,900 --> 00:06:33,080 element with the first unsorted element, thereby 129 00:06:33,080 --> 00:06:36,990 adding it to the sorted portion of the list. 130 00:06:36,990 --> 00:06:40,030 Once we have an algorithm, an important question to ask 131 00:06:40,030 --> 00:06:43,370 ourselves as programmers is how long did that take? 132 00:06:43,370 --> 00:06:46,970 We'll first ask the question how long does it take for this 133 00:06:46,970 --> 00:06:50,070 algorithm to run in the worst case? 134 00:06:50,070 --> 00:06:51,640 Recall we represent this running 135 00:06:51,640 --> 00:06:55,060 time with big O notation. 136 00:06:55,060 --> 00:06:58,650 In order to determine the minimum unsorted element, we 137 00:06:58,650 --> 00:07:01,880 essentially had to compare each element in the list to 138 00:07:01,880 --> 00:07:04,040 every other element in the list. 139 00:07:04,040 --> 00:07:08,430 Intuitively, this sounds like an O of n squared operation. 140 00:07:08,430 --> 00:07:12,050 Looking at our pseudocode, we also have a loop nested inside 141 00:07:12,050 --> 00:07:14,420 another loop, which indeed sounds like an O 142 00:07:14,420 --> 00:07:16,480 of n squared operation. 143 00:07:16,480 --> 00:07:19,250 However, remember that we didn't need to look over the 144 00:07:19,250 --> 00:07:23,460 entire list when determining the minimum unsorted element? 145 00:07:23,460 --> 00:07:26,600 Once we knew that the 4 was sorted, for example, we didn't 146 00:07:26,600 --> 00:07:28,170 need to look at it again. 147 00:07:28,170 --> 00:07:31,020 So does this lower the running time? 148 00:07:31,020 --> 00:07:34,510 For our list of length 6, we needed to make five 149 00:07:34,510 --> 00:07:37,990 comparisons for the first element, four comparisons for 150 00:07:37,990 --> 00:07:40,750 the second element, and so on. 151 00:07:40,750 --> 00:07:44,690 That means that the total number of steps is the sum of 152 00:07:44,690 --> 00:07:49,160 the integers from 1 to the length of the list minus 1. 153 00:07:49,160 --> 00:07:51,005 We can represent this with a summation. 154 00:07:57,980 --> 00:07:59,910 We won't go into summations here. 155 00:07:59,910 --> 00:08:04,900 But it turns out that this summation is equal to n times 156 00:08:04,900 --> 00:08:07,540 n minus 1 over 2. 157 00:08:07,540 --> 00:08:14,220 Or equivalently, n squared over 2 minus n over 2. 158 00:08:14,220 --> 00:08:18,860 When talking about asymptotic runtime, this n squared term 159 00:08:18,860 --> 00:08:22,070 is going to dominate this n term. 160 00:08:22,070 --> 00:08:27,850 So selection sort is O of n squared. 161 00:08:27,850 --> 00:08:31,460 Recall that in our example, selection sort still needed to 162 00:08:31,460 --> 00:08:33,850 check if a number that was already sorted 163 00:08:33,850 --> 00:08:35,450 needed to be moved. 164 00:08:35,450 --> 00:08:38,929 So that means that if we ran selection sort over an already 165 00:08:38,929 --> 00:08:43,070 sorted list, it would require the same number of steps as it 166 00:08:43,070 --> 00:08:46,340 would when running over a completely unsorted list. 167 00:08:46,340 --> 00:08:51,470 So selection sort has a best case performance of n squared, 168 00:08:51,470 --> 00:08:56,820 which we represent with omega n squared. 169 00:08:56,820 --> 00:08:58,600 And that's it for selection sort. 170 00:08:58,600 --> 00:09:00,630 Just one of the many algorithms we can 171 00:09:00,630 --> 00:09:02,390 use to sort a list. 172 00:09:02,390 --> 00:09:05,910 My name is Tommy, and this is cs50.