>> Lecturer: All right. Welcome back to CS50. This is the end of Week 3. Nothing stupid to start off with today on the overhead. Instead I need eight of you if you would be so brave? Yes, one, come on; okay, number two, come on up. Don't look up. No one is looking up in this area. Three? Okay, come on up. Again, you must be comfortable on camera. You two, come on. Okay, wait let's see, one, two, three, four, five, you three in front here. Yeah. We'll do some other awkward things in the future. I promise. Okay. So here we go. If you could line up in that order up here on stage?
[ class noise ] Perfect, perfect. Okay, so if you guys could make yourselves look like that, the first moment of awkwardness here is just to remind you where we left off last time and that was the goal, among the goals on Monday was to actually search numbers, and we had a wonderful demonstration that worked perfectly whereby Edward found the numbers in question immediately, but at least the second time around in theory he was able to assume that the bottom most array was actually sorted. So today we ask this question, how do you go about sorting those numbers? And for this now we need the ninth volunteer. Yes, in the pink shirt there; come on down. So, this gentleman here will be in charge of the sorting demonstration. So we have eight numbers, and if you would as this gentleman comes down think to yourself you've got eight humans here each of which is representing an integer, how would you, the human, go about sorting these folks and let's see if yours actually coincides with?
>> Student: Adam.
>> Lecturer: With Adam's approach. So, Adam, your goal here is simply to tell these eight what to do and sort them from smallest on the left to largest on the right.
>> Student: All right. You and you switch. Will you switch with him? Will you go to the end, please? And then will you come in between these two?
>> Lecturer: Round of applause, absolutely.
[ Applause ] That was fantastic. Okay. So, what was Adam's approach? Can someone put into words what he just did? What is the algorithm? How would you specify this? Not so many volunteers now. Okay?
>> Student: Determine the least number and put that in the first index and then repeat [inaudible].
>> Lecturer: Okay, good. So look for the least, look for the smallest number, pull that person out and tell them go to the beginning and then repeat recursively so to speak do the same thing again but with a problem that is ever so slightly smaller and minus one instead of N and, in fact, we got to the end result. What was your actual approach?
>> Student: It was basically what you said except I also tried to look to make sure not to change anyone that was in their correct position already.
>> Lecturer: Okay, good. So, we had a little optimization heuristic perhaps whereby if someone was already in the right position, we didn't move them at all thereby not wasting any cycles. Okay, so let's see if we can't slap both a label on this for the sake of discussion, but then also make some claim as to the efficiency of what Adam just did. So, let's see here could you go back to the order you were just in? Okay. So we pretty much just did what this gentleman said. We selected so to speak the smallest element. Is it this person? Is it this person? Is it this person? This person. Ah-ha, we found it and notice the one difference here. So, the human, Adam, might have immediately gone for obviously you're number one you go here, but remember a computer is not going to be able to sort of [inaudible] be able to know who is the smallest and where they are because if you think about how you would represent these eight humans, what data structure as we'll call it would you use these days and see? So you'd use an array, which you might have had a list in scratch, but an array even though it allows you random access where you can jump randomly to any one location just by indexing into the array numerically by way of the square brackets, that doesn't necessarily mean [inaudible] that you know where to look. So, for instance, your name is?
>> Student: Gabrielle.
>> Lecturer: Gabrielle. So Adam didn't necessarily know that Gabrielle was in index location zero, one, two, three, four; he only knew that presumably because as a human he very quickly scanned the list, but much like a computer he had to start somewhere so start at the left and move to the right and then finally he saw Gabrielle and then just to push a little harder here once you found Gabrielle did you know this was the smallest element?
>> Student: I did.
>> Lecturer: Okay, you did, why?
>> Student: Because I had seen all the other elements.
>> Lecturer: Okay, good. So, there's at least some knowledge in advance here as to what the numbers are so he at least was able to optimize there, but in the worst case if you didn't have this little cheat sheet that's 20 feet high over here, what would you have had to do to find that smallest element?
>> Student: Check the last three.
>> Lecturer: Okay, good. So check the last element. So maximally finding Gabrielle or more generally finding the smallest element would take from the audience how many steps? Okay so N steps, right? You start here, you check four. Okay, four is actually the smallest element I've seen, but let me keep looking. Oh, two is even better; let's forget about, what was your name?
>> Student: Praya [phonetic].
>> Lecturer: Praya. Let's forget about Praya because we've just now found?
>> Student: Juliana.
>> Lecturer: Juliana, who's number is smaller so now think about how you would implement this I just need to keep around it seems a variable, I or N or whatever that keeps track of maybe what the smallest value is or at least where it is. So, index location 1 and then I keep going, keep going, keep going, finally I get to Gabrielle and I update my variable because I found something better than Juliana and now I keep going and then I get to these guys and realize that was a waste of time because I've already found the smallest element. So, I pull Gabrielle out and then what did you tell her to do exactly?
>> Student: Switch with Praya I believe.
>> Lecturer: Excellent. Okay, good. So notice, and if you could implement this in reality, so we've switched them and this was actually pretty smart because what would another approach have been to make room for Gabrielle down here? Yeah?
>> Student: Maybe [inaudible] one extra spot.
>> Lecturer: Okay. So we could make another array, and we'll see in time how you can create arrays in advanced that are as big as you'd like them subject to RAM constraints so we could just kind of make a bunch of blank space then start filling in the holes. What would another approach be as opposed to just having these two swap? So we could shift people, right? If you could go back to your original locations for just a second? Another very reasonable approach would have been to say, okay, come out to the line if you would. You guys make room, she's got to go into the front of the list, but that's another one, two, three, four steps just to make room now for Gabrielle. So, we have a couple of options here, right? Clearly in a program it's going to take us N steps to at least find Gabrielle or more generally the smallest number, but even then after that there's some clever decisions we can make like as Adam did swapping the two instead of foolishly, though correctly, shifting everyone down. So let's now ask the question, if this algorithm is essentially defined as selecting the smallest person, moving them immediately to the front of the list and then repeating as this gentleman said, how many steps maximally is this algorithm going to take to sort everyone? So it's not as bad as N factorial, thankfully, because that would actually get really slow really fast.
>> Student: A little less than N squared.
>> Lecturer: Oh, okay. Say again?
>> Student: A little less than N square.
>> Lecturer: Yeah, it's actually a little less than N squared. Well, why is that? Well, let's see. If I take N steps initially, I take N steps and I get here, but now what do I have to do? I essentially have to repeat, but I can slightly optimize, right? Because I already put Gabrielle in the right place so I don't have to look at her number again so now I have N minus one steps and then the next iteration N minus two steps then N minus three then N minus four dot, dot, dot so that begs the question in the end, what is N plus N minus one plus N minus two plus dot, dot, dot down to one or down to zero? What is that? Anyone remember the little cheat sheet at the back of your high school math book? Yeah, so it's N times N plus one all over two, which is roughly what? N squared. So, let me part the river here for just one moment. Sorry, we'll get you some cough medicine shortly. So, that series N plus N minus one -- I promise not to do too much math while you guys are standing here -- plus, whoops, not N plus one, N minus one plus N minus two plus dot, dot, dot, plus all the way down to the very last step. You may recall that this generalizes N times N plus one over two, what does that equal? Well, that's N squared plus N over two and now is when we apparently have a way of expressing the running time of this algorithm and we talked on Monday about big O and omega and what those actually mean. So, what's the worst case running time of this algorithm? So it feels like it's big O of this. So, big O of this thing here, but one of the rules of thumb when it comes to talking about the running time of an algorithm mathematically or formally like this is when you start talking about really big values of N, adding another little n like this doesn't really matter. You start to care more about the degree, the member in the formula that has the highest exponent, for instance, because if N is not just eight but N is a billon, well, N squared, a billion squared, that's a really big damn number plus one more billion not so material and so when we talk about big O notation, we tend to get rid of the smaller terms and this divided by two, I mean that's really not so interesting either that we actually only did half as many steps and the motivation conceptually for this is essentially this idea. When I started counting everyone linearly in the lecture hall the other day, one, two, three, we immediately improved on that. That was a big O of N algorithm, but I immediately started doing two, four, six, eight, ten. So that's like an N over two algorithm and mathematically N is slower than N over two because I'm doing twice as many steps in the first version and half as many steps in the second, but you know what? If I wait 12 months, 18 months, you know, Intel is going to come out with a faster CPU that, you know, tends historically to be twice as fast and so over the course of a year my algorithm naive though it was automatically gets twice as fast, but that's not interesting. We need to be able to abstract away from the running time of me the human, the running, sorry, the speed of me, the human, or the speed of the machine implementing these algorithms so that we can say something interesting about the performance of algorithms independent of however many gigahertz happens to be in vogue at the moment. So, with that said the big O running time of this algorithm we would say it's just big O of N squared. Now what about in the best case? If you guys could realign as you were a moment ago, this is not the best case, but what would the best case scenario be for Adam upon arriving on stage?
>> Student: Everyone already in the right place.
>> Lecturer: Okay. So everyone is already in the right place. So, if you could quickly sort yourselves from one through eight so now Adam has just arrived on stage, he is handed this array say with no other knowledge in advance, if he applies that same algorithm of looking for the smallest element, putting him or her at the front then repeating for the N minus one remaining elements then repeating recursively, how many steps is the algorithm going to take this time?
>> Student: Eight.
>> Lecturer: So it's actually not eight. At least as we've described it here, it's the same number because what does he not know upon each iteration of this if we implement exactly the same algorithm?
>> Student: I'm still not sure it one is the smallest element so I have to still check all the way down.
>> Lecturer: Exactly. So, again, you can [inaudible] the cheat sheet that's up here if Adam is the computer that's been handed this array of eight numbers, it's wonderful that they're sorted, but his algorithm is not designed as we've just described it to really take that into account because he's still going to start at the left and he's going to realize, oh, wow, this is a really small number, I'm going to remember this one bigger, bigger, bigger, bigger, bigger, bigger, damn, I just wasted seven steps and yet I had the winner at the very beginning, okay, so I'm going to leave her where she is now let me repeat for the remaining elements, all right? Here is the small number, two, I'll remember this. Bigger, bigger, bigger, ah, I've wasted more time, but I'm going to keep wasting time because at least with this algorithm defined as iteratively or recursively selecting the smallest element and plopping them at the beginning, there's no cognizance of whether or not the list is already sorted. Now, you might think well this is just stupid now, right? Just keep track of whether or not you've seen smaller numbers before and that's fine, but that's a different algorithm then, right? We need more memory for that or we need a different set of steps. So let's try one other approach at least here. Could you guys reshuffle in this order? And this is really random. It doesn't matter what they reset to, but at least this gets it done quickly. So what might another approach be? Let's tell Adam this time what to do rather than leaving it to him. How else could we go about sorting these eight people?
>> Student: Bubble sort.
>> Lecturer: Bubble sort. What a wonderful lead in. Okay, well, what does that mean? Well, let's think about another very simple approach. You know, this is a really kind of overwhelming problem eight numbers, I'm the program, this is a lot of work to do, let me keep it simple and let me start at the left. So, these two here are clearly out of order, four is bigger than two, but two should be to the left of four so you know what? Let me just try a very simple algorithm of swapping these two volunteers. So we swap them. That takes one step or maybe three steps if you think back to our swap preparation, but a fixed number of steps, a constant number of steps. And that's okay. We're more worried about recurring costs, all right? So, now I compare four and six. Oh, these two are actually in pretty good shape. I'm going to leave them alone. Six and eight, they're good. Oh, eight and one, let me do a swap, but again, I'm not backtracking. I'm still making forward progressive. Eight and three, got to swap these two; eight and seven, going to swap these two; and then eight and five going to swap these two. Done, right? All right so obviously not, right? The list is not sorted. So I've taken N steps here and done some swaps in the middle, but any swaps were just constant time anyway. It's always the same number of steps to move two humans or to swap two ends. So, what do I do now? Well, let me try repeating. Two and four, no, four and six? Off six and one. And now notice what's happening to Gabrielle this time. Go ahead and swap. So notice, and we can slap the label that's already been applied to it on this algorithm. The smallest number seems to be bubbling her way up to this end of the list and meanwhile eight has already bubbled his way up to the end of the list. So this algorithm is, in fact, generally called bubble sort. It is just as correct as the previous one that Adam proposed, which was generally called selection sort so that begs the question, is one better than the other? Well, how many steps is bubble sort going to take? So it's also N squared. Now, why is that? Let's see. Every time I want to move from left to right I'm looking at eight elements and maybe doing some constant number of swaps. So, some product of N am I doing through each iteration, but in the worst case, who is where in this scenario? So in the worst case, maybe Gabrielle is all the way over here and this gentleman.
>> Student: Ravi [phonetic].
>> Lecturer: Ravi is all the way over there so that then begs the question if on each iteration someone like Ravi or Gabrielle only bubble up one place to the left or one place to the right, how many times am I going to have to repeat this process to pull Ravi all the way that way or Gabrielle all the way this way? Well, another N times or N minus one. So I have to do N things N times and that one conceptually is perhaps even easier to wrap one's mind around. It's big O of N square. Now, I think we've had you up here for an awkwardly long enough moment. A big round of applause if we could for Adam and these guys?
[ Applause ] Little souvenir and if you want, our paperwork awaits. Thank you. So, let's.
>> Student: [Inaudible].
>> Lecturer: No, no. We don't need to do it every time. We have you for life now. All right so let me go ahead and show exactly what this means for something to be taken N squared time because we'll actually see that N squared though very simply defined and that was actually pretty simple to implement if you think about it I've got a four loop then maybe another four loop so I can pretty relatively easily translate what we just did with Adam and team into code. Well, let's see if we can't visualize it as well. So this is a length that we have on the course's website under lectures and it's just a really nice Java applet, a program someone happened to write in Java, works in a web browser, that helps us visualize some of these algorithms. If I go down here to this drop down if you want to play along at home and I go choose bubble sorts, array type random, it's just going to populate it randomly, and this delay is just going to control the speed of this demonstration. I'm going to make it ten milliseconds or whatever this is and now I'm going to go ahead and click start. What we have here is more, a larger input set. So, each of these bars represents a number or a human. The taller the bar the bigger the number; the shorter the bar, the shorter the number, the smaller the number. So Gabrielle would be the small bar up here number one and Ravi would be one of the larger bars at right. That's the analog here. So what's happening in red from left to right is that this program is implementing bubble sort. It's comparing pairs or neighbors of elements and if they happen to be out of order, it swaps those two but then forges ahead to do it again to the next pair of neighbors and what you can see perhaps more easily since this is a longer demonstration that bigger numbers are, in fact, bubbling up albeit slowly and smaller numbers are bubbling their way down. Now this is only what, that looks like 30 or 40 bars total? So N is not very large and granted we fixed the speed of this thing, but it's already getting kind of boring. So, N squared does not scale particularly well. Sorting elements with even more than 40 elements or whatever is up here definitely takes a while and let's see then if selection sort does a little better. So I'm going to go ahead and choose selection sort as it's called. I'm going to go ahead and click start, again, and now notice less seems to be happening, but that's because this algorithm is swapping bars as far to the left as possible on each iteration. So what's getting highlighted in red is the then smallest element that's been discovered say by Adam and then it's being moved all the way to the left. So, what's a little neat about this algorithm is that you can actually see the progress and that definitely happened more quickly, right? If we ran it again or you actually timed it, I mean I'm still talking and yet this thing is done selection sort seems to be faster and that's because what I've crossed off here for formality sake definitely has real-world implications. So N squared is slower than N squared divided by two for instance, but realize once we start talking about really large data sets, the number of Face Book users out there or the number of web pages that Google indexes, this is when things like N squared and these design decisions really start to hurt potentially because they take more time than you might necessarily have patience for. So let's see if we can't express this in a way that is more easily translated to code. So here's bubble sort pseudo code for it. You can write this in any number of ways, but the algorithm essentially boiled down to repeat the following N times. For each of the elements in the array if I as we'll call it iteratively and its neighbor are out of order, swap them. So this is, again, another lesson in pseudo code what it means to communicate your ideas without getting bogged down in uninteresting minutia [phonetic] like syntax specific to C or any other language, but with the expression of pseudo code here you can see perhaps more clearly the actual running time. You can see in the very first line there repeat N times. You have an explicit directive that the following is going the take N steps, but beneath that it says for each element I. Well, how many elements are there? There's N elements in the array so a four each loop is going to induce another N step so you can immediately see in the pseudo code alone that this algorithm is apparently going to take N times N steps to actually get the job done. Now bubble sort actually allows for more optimization though. So, when we had our eight volunteers up here, we for the sake of selection sort told them to get into proper order and then we asked the question how long would it take to sort these elements using selection sort in the best case, but we didn't ask that in bubble sort's case. So, if in bubble sort's case you have eight humans here from one on up to eight already sorted, how many steps is this algorithm going to take as written?
>> Student: [Inaudible].
>> Lecturer: So I'm hearing a few answers among which are N squared N, I heard seven, but it is, in fact, again just look at the code if you will and do the following N times for each of the elements so N more things if I and its neighbor are out of order, swap them. No matter what this implementation of bubble sort is just blindly telling you to do the following N times, N times this algorithm takes N squared steps. So, if Adam had come on stage and realized, had been handed an array of eight people all of whom were sorted already, doesn't matter. He's going to very blindly compare again and again for a total of N squared steps all of the humans in that row and then eventually he's going to stop because what's neat about this algorithm at least is that because you are doing it N times N times no matter what it will have worked by the end, but there's clearly an opportunity for optimization here. So, bubble sort right now let's slap some numbers on these just to have sort of a list and then see if we can't do better. So bubble sort, so bubble sort thus far it seems to be big O of N squared. Selection sort seems to be also N O of N squared but we also said it also is in omega of N squared because, again, you just keep going through and going through selecting the N smallest element but you don't know in advanced that you found the smallest element until you've checked. So, is Omega a bubble sort better in this case? So, no so can we do better? How would you optimize this algorithm? How would you after receiving some feedback from your TF go back to the drawing board and just add a couple lines of code to fix this, yeah?
>> Student: [Inaudible].
>> Lecturer: Yeah. Okay, so keep track of how many switches you've made by way of a variable. So initialize some counter variable in there, initialize it to zero and then as you iterate over all of the elements if you actually execute this if condition and swap elements, what should you do with this variable? Little plus, plus. Right? Or it could be a Boolean. It could just be a Boolean that says true I have swapped something because then when I get to the end of the list and I meet Ravi, who has been here the whole time because he's already sorted and he's number eight, what do I then check before deciding whether or not to repeat this algorithm?
>> Student: [Inaudible].
>> Lecturer: Right. So if my variable equals equal zero, don't waste your time going back to the beginning and repeating this algorithm. Instead do something like if variable equals equal zero, new line, break. So break is a syntax you might have seen in section or some past examples you can forcibly exit a loop simply by executing the statement break semicolon and so that would be the easiest way of translating that little optimization to code here. So, what are the returns? What then is the best case running time of bubble sort? Yeah, it's just N and that's cool, right? So that was easy. Just by adding two lines of code and, again, testament to this idea of putting up a little more effort and thought and intellect into the design of an algorithm, can you significantly chip away at the running time, which as we saw on that big chart the other day, something that's linear is much, much better as N gets really, really large than something that's quadratic or polynomial. So this is already a good thing, but it still feels pretty slow. So is there another approach altogether? Well, I'll wave my hands at some more pseudo code here since once could write this in a number of ways and this just expresses what Adam and our volunteers already put forth to us, but it turns out that this trick from Monday, this capability of a functions calling itself is actually a really powerful thing. So, there exists this other algorithm called merge sort and as the name implies, it boils down to taking an array or taking a list more generally sorting half of it, sorting the other half and then merging the results and voila you then have a perfectly sorted list. Well, how does this work? Well, this is it. So this is the pseudo code for merge sort. It says on input of N elements so in the form of an array first you do a sanity check. If you've got fewer than two elements, there's no work to be done, return. And if you can recall the lingo from Monday, that represents our base case that actually prevents this thing from devolving into a really bad infinite loop after which bad things, segmentation faults and all of that happen. So, else if you do have enough elements to actually do interesting work on, what should you do? So, sort the left half of elements, sort the right half of elements, then merge the sorted halves. Well, what does this mean exactly? Well, if I have a list of eight numbers and then I take the four numbers on the left and the four numbers on the right, how do I go about sorting the left half of those elements? What would you do? Right, I've got eight volunteers here who we have dismissed, but pretend there's four people here, four people there, I have to divide the list into two so here's the four people I need to sort, how can I sort half a list?
>> Student: [Inaudible].
>> Lecturer: How about in back?
>> Student: Cut it into quarters.
>> Lecturer: Cut it into three quarters?
>> Student: [Inaudible].
>> Lecturer: Okay. So I could cut it in half again and again and again. Okay. So finally I get down to Gabrielle or any of our volunteers because I've cut my list of eight into four into two into one, what do I need now with Gabrielle or whoever I've found? Can't go back to the laptop now. You're on the hook.
>> Student: Oh, you just cut it in half, cut in half and [inaudible].
>> Lecturer: Okay, true, but then what do I do once I keep cutting the list in half? Now I just have eight people, what do I do with them?
>> Student: Merge every two.
>> Lecturer: Okay, I merge every two. Okay. So, you're actually going frankly in this direction, but it seems like you're just kind of weaving a circular argument around me, right? I'm asking you how to sort an element and I've already cut them in half. You're telling me, oh, just cut them in half again and then cut them in half again. I feel like we're not actually getting to resolution here. Yeah?
>> Student: Well, a single element [inaudible] and so when you get down to one element sorted and then you would merge it with another element [inaudible].
>> Lecturer: Okay.
>> Student: And then [inaudible] and then those two pairs would be merged [inaudible] so on and so forth.
>> Lecturer: Okay, good. So to summarize maybe the magic here is not in this punting in lines the second to bottom line. When I say sort left half and sort right half, you know, I'm kind of just punting on that and saying go sort these elements then the magic seems to be in the merging of sorted halves because if we kind of think through the logic that's been proposed, if I eventually get down to a list of one, of size one just Gabrielle or just Ravi, there's actually a really neat feature of that situation, which is that Gabrielle as a single number is already sorted by nature similarly as Ravi or any of our eight volunteers already sorted if I try to sort just them, but then I apparently need to take a step back conceptually and say, okay, I've got a list of length one, it's already sorted trivially because that was really an uninteresting exercise, but now I have two lists that sounds of size two. So, I know this is going to happen, but I will say let's take a five minute break think about how we will solve this and then we'll reveal the answer after you've not actually thought about it.
[ Break ]
>> Lecturer: All right so we are back. Here are, we'll do this one without humans, but here are the same numbers that our volunteers represented the motivation here to make sure the context is clear is to do better than N squared and even though the algorithms we demonstrated both with humans and with computers up top, hasn't taken terribly long realize that we only had, again, 30 or 40 bars on the screen that we're getting sorted and even that was relatively slow, but the goal here will be to compare just how quickly or how much better we can do if, again, we inject a little more intelligence and start to deploy some more sophisticated techniques like this thing called recursion on Monday, which at the time very uninteresting to implement factorial or sigma, the summation function with recursion because we already had other ways of doing that but as in this case as we'll see, sometimes recursion lends itself to the expression of an algorithm that would just be much harder to express iteratively or with lots of loops and variables. So here's our algorithm and let's see if we can apply this here. So here's our eight elements. The first step is to on input of N elements, if N is less than two, return. No, N is actually eight so I'm not going to return so else I do the following. Sort the left half of elements. So, somehow I need to do something with these guys here, no. With these guys here so real-world bugs as well, and then I'm going to move on the those, but not just yet. So, I've called a function essentially sort left half of elements. So I need a way of sorting these four elements. What algorithm might I apply to these four elements? So, I've got bubble sorts, I've got selection sort, but feels like if I deploy one of the algorithms we've already dissected and analyzed and used it twice that we're not really going to make a fundamentally important step forward, but you know, we do have a new sorting algorithm supposedly. We don't know if it works, we haven't seen how it works, but it is called merge sort so let's take this leap of faith and sort the four elements on the left half with the same algorithm, merge sort. So you'll have a little bit of stories piling up in one's brain here as we do this recursively, but here we go. So, I've just sorted, I need to sort the left half, that's these four elements, which means I'm invoking merge sort N is now four, is four less than two? No. So I do the else sort left half of elements. Well, what does that mean? That means I now need to sort these elements here, okay, what does that mean? How do I sort two elements? Let's deploy merge sort. So, call merge sort on a list of size two. Is two less than two? No. So, I proceed to do the else construct yet again sort the left half of elements and here I go. I now sort the left half of elements and voila, I seem again, I seem already to be at the point where I have a list that's now size one. So is one less than two? Yes, finally. So I return immediately. Now, at this point in the story I've got a list of size one what was the next step if you roll back in time now? So, sort the right half, the right half in question at this moment in time having called merge sort multiple times already was sort the left half, now I need to sort the right half. So, I'm down at this point here. Sort two is trivial to sort because if N equals one is less than two I return so now I'm at this point in the story what do I do with four and two? I've got two lists of size one and the goal now is to merge these sorted halves. How do I merge these sorted halves?
>> Student: [Inaudible]
>> Lecturer: Well, we can do this fair, I mean we can do it verbally, all right, here's a list, here's a list, I need to merge them somehow. All right, who's going to come first in the merged list? Two obviously. So, I'm going to make some extra space here. We'll call this I need actually in reality extra memory for merge sort so I'm just going to give myself an extra array, some blank space that I can start writing on so here's two lists, I've already plucked off the two, I advance to the next element in list there's nothing there so who comes next? So four. So, now I'm at this point in the story, and I have a list of size two that's been sorted, where did we leave off in the story now?
>> Student: Six and eight.
>> Lecturer: So six and eight, right? Because this was the left half but then this was the left half of the left half, we sorted him, then we sorted the right half of the left half, that was easy, then we merged the left half of the left half with the right half of the left half, and we got two and four. So, I realize this is actually a little confusing but that's why we're actually doing a fairly small example here. Two and four now are sorted so who is their counterpart? Who is their right-hand side? That's these guys. So, how do I sort the right half of this list? All right, I sort the left half of this list first, okay, now I sort, that's already done because it's a list of size one; now I sort the right half of this list, that's already done, what do I do now? What's the third step in this sorting process? So I still need to merge them. We, the [inaudible] humans can see they're already in order, but we as the computer don't know in advanced until we actually look at them so let's look at them. I've got a list of size one, a list of size one, which one is going to come first when I merge the two? Obviously the six and then I advance this finger, the list was only size one there's nothing there so now I merge the remaining element six, remaining element eight, so I've got six and eight. At what point in the story am I? We've rolled back almost to the beginning. Right so I sorted the left half, which was here, I sorted the right half so the third step here and final step on the board is to merge the sorted halves. So now I'm going to do the same algorithm and, again, I tend to use my left and right finger when I need to represent a loop that has maybe two different indices, two different index variables. So how do I sort, how do I merge these two lists? Well, they're already sorted so I don't have to compare all of the elements against every element, I just need to look at the front of them because I already know from my previous steps that I put the smallest elements there. So I now need to merge this list and this list so here's the two smallest elements. I compare them. Is two less than six? Yeah, it is. So, let me put this into my empty space. Now I advance. Okay, this time there's something there so I don't just blindly take the six, I now compare four against six. Obviously the four comes next, now I advance, now there's nothing to compare against so I can just finish my loop over the six and over the eight. So at this point in the story I've just finished which step?
>> Student: Sort.
>> Lecturer: Sort left half. The very first time in this story that I said sort the left half was to sort these four elements. We finally completed that step, but notice what happened. It induced that simple goal of sorting the left half of elements, induced another call to merge sort, another call to merge sort, another call to merge sort, and then we merge the lists, but then we did the same thing. Another call to merge sort, another call to merge sort, another call to merge sort, then we merge the lists and we're not even done yet. So, it actually feels like we're doing a lot of work here, but let's see if it pans out. Well, on the right hand side if that's the next step, I sort the right hand side of the list. How do I do that? Sort the left half. All right so that's these two guys. How do I do that? Sort the left half. All right, that's actually pretty easy, done. It's a list of size one now I sort the right half. Easy, it's a list of size one done. Now the magic happens, but it's seemingly fairly simple, merge them. All right, so how do I do that? I compare and I'm going to plop down one first then three. These guys are now sorted. Where am I at the story? Now I sort the right half of the right half so that means look at these two, chop it into left and right, do this, trivial, it's already sorted do this, trivial, it's already sorted. Now again the magical step of merging the two and obviously we get out five and seven so now I have a list of size two that's sorted, another list of size two that's sorted, the left half and right halves respectively of the right half of the original. What's the last step? To merge the sorted halves. So, one and five I compare. One comes first, now I compare three and five, three comes next. Now I'm out of numbers there so five and seven get plopped down am I done? Not quite. Merge sort boils down essentially to three lines of code; sort left half, sort right half, and I've done that and I've done that and now the third and final step of merge sort is to merge those sorted halves. So what do I do? Well, I'm running out of board space, but I don't think it'll be all that enlightening to know that the numbers are now going to get merged as one, two, three, four, but how? Well, again, remember the two index variables. I've got on the start of this list, one on this one, I compare the two fingers, one comes first so I write down one. Then I advance this finger. Now, I compare two against three. I put down two and I advance this finger. And the point of this little finger demonstration is to make clear that at no point is my left or right finger going backwards on the board. They're always making forward progress. In other words, every time I merge, I'm touching physically N elements or put another way I'm touching every number once and only once because once I've touched it, I advance and move forward again. I never double back. So, in other words from left the right, I seem to be conceptually always taking N steps in this direction. Every time I do merging, there's N steps involved because I'm touching every number once but never going backwards. So that then begs the question on the other axis here, how many times did I go down in this list? How many times did I chop lists in half? Because if I'm doing something N times this way, hopefully I didn't chop the list N times this way. Otherwise we've got an N squared algorithm and this was all a waste of time, but how many times per week zero can we take a list of size N or a list of size eight and divide it in half, in half, in half, and then bottom out with one element?
>> Student: [Inaudible].
>> Lecturer: So log base two of N. So even if you got lost in some of the details, that's certainly fine at first glance here, but just conceptually what was I doing? Dividing the list in half, in half, in half, in half and then I bottomed out with individual elements. How many times can you do that? Log N times, but every time I did that the goal was to then merge all of the elements together as we tried to do visually here so I'm doing something log N times and then each time I'm doing N merging steps. N moves with my fingers so this algorithm feels like it should be big O of, and I'll write it here just for the sake of reference, big O of N times log N, and it's technically base two, but just like we started scratching off constant numbers before in the context of big O, the base also means very little since it's just a constant value. You can convert one base in binary to any other constant value as well. So if this depiction, if I didn't do this depiction justice, let's just try to think about it more abstractly. If the problem at hand is to figure out what T of N is, the running time of an algorithm whose input is of size N how can we actually compute its running time by looking at the pseudo code essentially? So this is the algorithm. The last time we talked about pseudo code we were able to slap big O of N squared on it just by looking at it so let's see if we can't translate this to big O notation. All right so this first line here if N is less than two that's constant steps so there's a one in there and something uninteresting. Clearly the meaty part of this algorithm is down here in the outs condition. So, let's ask ourselves what is the running time of this algorithm? Well, if you had N elements, we need to figure out T of N, which is just a fancy way of describing its running time. So, what does that mean? Well, computing T of N means you first sort the left half of the list so that's T of what, N over two, then you sort the right half, that's T over T of N over two. So I don't know what the function is. I don't know what T is yet, but I know it's going to be the same thing but with an input that's half as big and then finally how many steps does it take to merge sorted halves? Well, per my fingers moving slowly from left to right, that's some number of N steps because every finger moves forward once and then they eventually both stop. So we can express this even without quite knowing where we're going yet with an algorithm like this. T of N, the running time of an algorithm with [inaudible] elements is the same thing as sorting half of it, T of N over two, plus half of it, T of N over two, plus N steps, but you know I'm kind of asking lots of questions so maybe there's a constant factor in there. So, here two, even in this context can we say the merging takes linear time I don't know exactly if this is one step to compare two elements or maybe this is two steps, but again, these are constant values, who cares, let's just generalize it as big O of N, but now we've kind of handed and turned in an answer that itself is recursive. What's the running time of this algorithm? Well, it's the running time of running the algorithm on half of itself and half of itself. Like we haven't really made progress per se, but let's see if the number at least pan out. So how long does it take to sort not eight, but we can do a little more interesting than that, 16 elements, but it's still a power of two just so that the math works out perfectly even though in reality there'd be some rounding. So, let's just plug these numbers into the formula. So, the formula, again is T of N equals T of N over two plus T of N over two plus some linear time. So let's do it. How long does it take to sort 16 elements? Well, T of 16 is T of eight plus T of eight or two times T of eight plus 16 to do the merging. So sort the left half, sort the right half plus the merging. Unfortunately, we've again gotten caught in a little circular argument here because now the question is, what is T of eight then? All right, well T of eight is the process of sorting the left half, which is now size four, plus sorting the right half, which is also size four so T times two of T of four plus eight. All right, what is the answer to T of four? Well, T of four is now this, two times T of two, plus four. What is T of two? Well, T of two is two times T of one and hopefully we're now going to bottom out and not get stuck in this loop. Ah, T of one is just zero because how long does it take to sort one element? Well, no time. It's already sorted. That's when we returned immediately. So now we just have some high school arithmetic, a little substitution here. I now can fill in T of one then I can fill in T of two, and then it bobbles back up so if we do out the math, and I just flip the numbers around so I can do the substitution from top to bottom, it looks like T of 16 is ultimately two times T of eight, but what is that? Well, that is just the answer to T of eight, which is apparently twenty-four, if you go through the math. So long story short T of 16 is 64 steps. Does this jive with the claim that the running time is big O of N times log N?
>> Student: Yes.
>> Lecturer: Yeah. So 16, right, so if the running time here is N times log N. So N is 16, that's 16 times log base two of sixteen that's four so, yes, so sixteen times four, 32, 64. It is, in fact, 64 steps so the claim ultimately is, in fact, correct that merge sort takes 64 steps concretely with 16 elements but more generally N times log N. So what's the take away there? Like who cares? Right. There's no point in memorizing that merge sort takes 64 steps or 64 seconds for inputs of size 16 but what would the N squared algorithm have been? How long would bubble sort have taken? How long would selection sort have taken? Sixteen squared and that's if you do the math 256 and these are small inputs. Now I'm comparing 256 steps against 64 steps. Already the math is starting to work out in our favor, and if you think back to the slide where we had those graphs, when N really gets big do these numbers really start to hurt. Well, it turns out there are other algorithms to be fair out there so we'll see things over time whether in sections or in future CS classes things called heap sorts and insertion sort, quick sort, rate sort, shell sort and as an aside especially for one of the hacker editions you might enjoy reading up on one or more of these, but N log N for now is about as good as we'll try to do but let's just see exactly how good that is. All right so here we have the second demo. This, too, is linked on the course's website. This is just another Java applet that someone very nicely put together and each of these drop downs is a whole bunch of sorting algorithms. So the [inaudible] has had a lot of free time over the years to come up with algorithms like bozo sort and stooge sort and cocktail sort and a whole bunch of other silly ones so apparently if you come up with a clever algorithm, the fun part is you actually get to name it, but now we have from left to right selection sort, bubble sort and merge sort. The question of the day was, can we do better than these initial algorithms, bubble sort and selection sort, and if so, who cares? Why does it really matter? What I'm about to do is click on each of these one right after the other so that these things essentially run in lock step and let's see what it means for something to be in N log in as opposed to N squared. Apparently not much. There we go. Done. And with that we'll see you Monday.
==== Transcribed by Automatic Sync Technologies ====