[Merge Sort] [Rob Bowden - Harvard University] [This is CS50. - CS50.TV] Let's talk about merge sort. So far you've seen bubble sort, insertion sort, and selection sort. Although I'll kind of wave my hand at what I mean by better, merge sort generally performs better than any of these 3 sorts. But before talking about merge sort, let's talk about merging 2 sorted lists. We'll call the process of taking 2 sorted lists, like these, and making a single sorted list out of them--merging the lists. How can we do this? Well, one idea is to just stick one list onto the end of the other list and then sort the resulting list. While this works, it's a lot of unnecessary work. We can do it faster than just sorting. Notice that one wrong idea is to just take alternating cups from each list. While that may seem like that works at first, doing that we end up with 4, 8, 15, 23, 16--notice that 16 and 23 are out of place. This is because 2 elements that should appear consecutive in the merged list are in the same initial list. Both 15 and 16 are in the list on the left. The trick is to take advantage of the fact that both lists are already sorted. This means that if we just look at the first elements of both lists-- here, 4 and 8--one of them must also be the first element of the merged list. Well, why is that? Both of these lists are already sorted, and so either 4 or 8 must be the smallest element when we combine the 2 lists. In this case, the smallest is 4, so we can take out 4 and make it the first element of our merged list. Now we continue merging the remaining 3 elements of the first list and 4 elements of the second list. Once again, we only need to look at the first element of both lists. The smaller of the 2 must be the second element of our merged list. This time, between 8 and 15 the smallest is 8, and so we insert that as the second element of our sorted list. We can continue comparing the first elements of both lists and removing the smaller of the 2. Comparing 15 and 23, 15 is smaller and so that's our third element. Now comparing 16 and 23, 16 is smaller. So that's the fourth element. Notice that 2 elements came from the same list in a row. This is why the merged list can't just alternate elements from the 2 lists. Comparing 50 and 23, 23 is smaller, so we choose that. Between 50 and 42, 42 is smaller. Between 50 and 108, 50 is smaller. And, finally, we just have 108, so it must go on the end of our list. Notice that we have a nice, sorted list. Every time we compared the first 2 elements of the 2 lists we were able to determine the next element of the merged list. This means that if the final list contains n numbers, where n here is 8, then we need at most n comparisons to get all of those numbers into the right place. Such an algorithm is said to run in linear time, but don't worry about that here. Using our algorithm for merging, we can make a fast merge sort algorithm. So, let's reset our lists. There are 2 big steps in the process of merge sort. First, continuously split the list of cups into halves until we have a bunch of lists with just 1 cup in them. Don't worry if a list contains an odd number and you can't make a perfectly clean cut between them. Just arbitrarily pick which list to include the extra cup in. So, let's split these lists. Now we have 2 lists. Now we have 4 lists. And now we have 8 lists with a single cup in each list. So that's it for step 1. For step 2, we repeatedly merge pairs of lists using the merge algorithm we learned before. Merging 108 and 15, we end up with the list 15, 108. Merging 50 and 4, we end up with 4, 50. Merging 8 and 42, we end up with 8, 42. And merging 23 and 16, we end up with 16, 23. Now all our lists are of size 2. Notice that each of the 4 lists is sorted. So we can start merging pairs of lists again. Merging 15 and 108 and 4 and 50-- first take the 4, then the 15, then the 50, then the 108. Merging 8, 42 and 16, 23, we first take the 8, then the 16, then the 23, then the 42. So now we have just 2 lists of size 4, each of which is sorted. So now we merge these 2 lists. First we take the 4. Then we take the 8. Then we take the 15 and 16, then 23, then 42, then 50, then 108. And we're done. We now have a sorted list. So how fast was this, exactly? In technical terms, merge sort is O(n log n), whereas all of bubble sort, insertion sort, and selection sort are O(n²). In fact, as you'll learn soon, you won't be able to come up with a sort that performs better than O(n log n) in the general case. Again, don't worry about this big O notation if you haven't seen it yet. Just know that this means if we wanted to sort a really big list bubble sort, insertion sort, and selection sort could potentially take significantly longer than merge sort. It does not mean that merge sort will be faster for all lists or even for any single list under a certain size. For example, insertion sort might be the fastest sort for all lists smaller than 5 elements. In practice, merge sort is usually faster for lists as small as 50 elements. But this extra speed doesn't come without a price. Unlike our other sorts, which take a list and modify the list in place until we get a sorted list, merge sort needs some additional space to merge 2 lists together. We can't immediately use the lists that are being merged to store the merged list since we might override elements that still need to be merged. This space is a bit of a price, but it usually isn't unreasonable. And that's it for merge sort. My name is Rob Bowden, and this is CS50. [CS50.TV] --and selection sort. [laughs] Oh, got to take that out too because I switched how I was presenting it. List on the left. That was a typo. [misspoke] I screwed up-- [laughs] I don't know what--