1 00:00:00,506 --> 00:00:09,546 [ Silence ] 2 00:00:10,046 --> 00:00:12,636 >> Alright, welcome back! 3 00:00:12,636 --> 00:00:15,906 This is the start of week 4 and this is CS50, 4 00:00:15,906 --> 00:00:19,706 and this is an actual receipt that floated around online 5 00:00:19,706 --> 00:00:21,016 after someone photographed it. 6 00:00:21,286 --> 00:00:24,086 It perhaps speaks to what goes wrong when you have the problem 7 00:00:24,086 --> 00:00:26,836 of floating point imprecision, particularly bad 8 00:00:26,836 --> 00:00:27,936 if it's a cash register. 9 00:00:28,416 --> 00:00:29,826 This is-- just to zoom in-- 10 00:00:29,826 --> 00:00:33,206 this is the receipt from a restaurant. 11 00:00:33,416 --> 00:00:38,226 And again, things should probably not cost 1.48999 cents, 12 00:00:38,226 --> 00:00:39,436 probably unintentional. 13 00:00:39,436 --> 00:00:40,436 But computers make mistake 14 00:00:40,436 --> 00:00:42,926 and apparently people buy computers with mistakes. 15 00:00:42,926 --> 00:00:47,736 So, looking ahead to this week, so Problem Set 3 introduces you 16 00:00:47,736 --> 00:00:51,036 to this sort of age old game of The Game of 15, 17 00:00:51,036 --> 00:00:53,656 whereby you've got a board like this and the goal is 18 00:00:53,656 --> 00:00:55,676 to move the puzzle pieces up, down, left to right 19 00:00:55,676 --> 00:00:58,196 so that you've ordered them ultimately from top to bottom, 20 00:00:58,446 --> 00:01:02,896 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 with a blank 21 00:01:02,896 --> 00:01:04,236 in the bottom right hand corner 22 00:01:04,566 --> 00:01:06,646 so the problem set this week introduces you too 23 00:01:06,646 --> 00:01:09,306 for the first time to something we'll call Distribution Code, 24 00:01:09,306 --> 00:01:10,126 Distro Code. 25 00:01:10,186 --> 00:01:11,736 Something that someone else wrote, 26 00:01:11,856 --> 00:01:14,476 in this case it's code written by us but we stop short 27 00:01:14,476 --> 00:01:17,376 of implementing this whole game and rather we left blanks, 28 00:01:17,376 --> 00:01:20,146 little holes for you to fill in throughout the frameworks 29 00:01:20,146 --> 00:01:22,856 so that the goal of this game is really twofold 30 00:01:22,856 --> 00:01:24,036 so far as the P set goes. 31 00:01:24,036 --> 00:01:27,316 One is understanding someone else's code, so not diving 32 00:01:27,316 --> 00:01:29,406 in with a blank slate and figuring out how 33 00:01:29,406 --> 00:01:31,106 to implement the whole thing from scratch, 34 00:01:31,156 --> 00:01:34,716 but rather understanding how a framework works that we provide 35 00:01:34,716 --> 00:01:36,966 and you'll see its all nicely commented and indented, 36 00:01:37,176 --> 00:01:40,486 and we then direct you toward the holes you need to fill 37 00:01:40,486 --> 00:01:42,056 in with code of your own. 38 00:01:42,056 --> 00:01:44,926 Now, those of you enticed with this Hacker Edition will know 39 00:01:45,216 --> 00:01:48,146 that the goal is not to implement the game per se, 40 00:01:48,146 --> 00:01:49,986 but a solver for the game. 41 00:01:49,986 --> 00:01:52,136 So that if you actually play this thing rather 42 00:01:52,136 --> 00:01:55,246 than be the human typing in the number 3 and 4 and 15 43 00:01:55,246 --> 00:01:56,526 to move the puzzle pieces around, 44 00:01:56,816 --> 00:02:01,056 instead you will type G-O-D in all caps, and voila! 45 00:02:01,056 --> 00:02:02,996 The computer will solve this problem for you 46 00:02:02,996 --> 00:02:05,936 and its quite fun to see it happening live animatedly. 47 00:02:05,936 --> 00:02:07,686 So you can play with the Hacker Edition even 48 00:02:07,686 --> 00:02:09,576 if you're doing standard in the appliance 49 00:02:09,636 --> 00:02:12,616 because we've included the solutions to both-- 50 00:02:12,616 --> 00:02:14,076 in CS50's own directory. 51 00:02:14,296 --> 00:02:17,146 So also as part of this problem set is an opportunity to dabble 52 00:02:17,376 --> 00:02:20,146 with the couple of sorts and searches so you'll be asked 53 00:02:20,256 --> 00:02:23,006 to implement essentially binary search 54 00:02:23,006 --> 00:02:24,516 and we talked about binary search. 55 00:02:24,516 --> 00:02:25,936 We looked at Sean's implementation 56 00:02:25,936 --> 00:02:27,786 of binary search on video last week. 57 00:02:28,016 --> 00:02:29,746 And there's a couple of ways you can implement it 58 00:02:29,746 --> 00:02:31,576 if you look back at the walkthrough from yesterday 59 00:02:31,846 --> 00:02:34,446 on video if you've not already, if you need some guidance there. 60 00:02:34,446 --> 00:02:36,426 And then you're asked to implement sort, 61 00:02:36,606 --> 00:02:37,946 because again this notion 62 00:02:37,946 --> 00:02:41,316 of sorting elements is what ultimately empowered someone 63 00:02:41,316 --> 00:02:45,706 like us on stage to find numbers more efficiently, more quickly. 64 00:02:45,936 --> 00:02:47,236 We were able to find Mike Smith 65 00:02:47,236 --> 00:02:50,166 in the phonebook some four weeks ago because the thing was sorted 66 00:02:50,336 --> 00:02:51,986 but that begs the question, how expensive? 67 00:02:51,986 --> 00:02:52,886 How time consuming? 68 00:02:52,886 --> 00:02:53,786 Is it to sort? 69 00:02:54,066 --> 00:02:56,916 And so we looked at Bubble Sort, and Selection Sort, 70 00:02:56,916 --> 00:02:59,436 and unfortunately those weren't the fastest players out there, 71 00:02:59,436 --> 00:03:02,676 they were in what we call big O of N squared 72 00:03:02,996 --> 00:03:04,986 since they might take N times N steps 73 00:03:05,046 --> 00:03:06,346 but they did get the job done. 74 00:03:06,346 --> 00:03:09,236 Now, those of you interested in the Hacker Edition also needs 75 00:03:09,236 --> 00:03:12,626 to implement sort but you need to do it in Big O of N, 76 00:03:12,826 --> 00:03:15,326 linear time, but you're allowed a couple of assumptions 77 00:03:15,326 --> 00:03:16,776 as the spec directs you toward. 78 00:03:17,306 --> 00:03:22,496 So by popular demand on two accounts, CS50 WiFi now exists 79 00:03:22,606 --> 00:03:25,206 so if you're here for your-- with your laptop, 80 00:03:25,376 --> 00:03:29,316 its time to pull up Facebook if you like and use the CS50 SSID, 81 00:03:29,536 --> 00:03:35,616 it is password protected at the moment but the password is-- 82 00:03:37,396 --> 00:03:38,966 >> The combination is-- 83 00:03:39,201 --> 00:03:41,201 [ Laughter ] 84 00:03:41,436 --> 00:03:41,706 >> One. 85 00:03:42,306 --> 00:03:42,866 >> One. 86 00:03:43,326 --> 00:03:43,796 >> One. 87 00:03:44,346 --> 00:03:44,796 >> Two. 88 00:03:44,796 --> 00:03:45,456 >> Two. 89 00:03:45,856 --> 00:03:46,376 >> Two. 90 00:03:46,376 --> 00:03:46,766 >> Three. 91 00:03:46,806 --> 00:03:47,296 >> Three. 92 00:03:47,796 --> 00:03:47,886 >> Three. 93 00:03:47,886 --> 00:03:48,206 >> Four. 94 00:03:48,666 --> 00:03:49,146 >> Four. 95 00:03:49,146 --> 00:03:49,526 >> Four. 96 00:03:49,896 --> 00:03:50,126 >> Five. 97 00:03:51,366 --> 00:03:51,756 >> Five. 98 00:03:51,756 --> 00:03:52,006 >> Five. 99 00:03:53,886 --> 00:03:57,806 >> Six, seven, eight, so the software required 100 00:03:57,806 --> 00:03:59,876 that they assigned an 8 digit password 101 00:03:59,876 --> 00:04:05,556 but 12345678 should work for you, if not just catch one of-- 102 00:04:05,556 --> 00:04:07,906 myself or one of the TFs on the way out and we'll figure 103 00:04:07,906 --> 00:04:09,466 out how you get you started for next week. 104 00:04:09,656 --> 00:04:11,666 Also, lunch, so this Friday we'll be joined 105 00:04:11,666 --> 00:04:13,046 by an alumnus who's also now 106 00:04:13,046 --> 00:04:14,396 with the Venture Capital Firm NEA, 107 00:04:14,876 --> 00:04:17,846 so if you'd like to not only dine with and chat with me 108 00:04:17,846 --> 00:04:20,256 and the TFs and CAs but also have some vision 109 00:04:20,256 --> 00:04:21,856 of a business you would like to pitch to a-- 110 00:04:22,096 --> 00:04:23,996 a friendly VC, do feel free to RSVP 111 00:04:23,996 --> 00:04:27,386 at the usual URL, cs50.net/rsvp. 112 00:04:27,666 --> 00:04:31,506 So, these were the numbers that we played 113 00:04:31,506 --> 00:04:33,736 with quite a bit last week and these were just random numbers 114 00:04:33,766 --> 00:04:34,596 but there were 8 of them 115 00:04:34,846 --> 00:04:37,836 and 8 is almost always convenient 'cause we can divide 116 00:04:37,836 --> 00:04:40,396 by 2, divide by 2, but in reality and in P set 3, 117 00:04:40,396 --> 00:04:41,886 you're not gonna have the luxury of assuming 118 00:04:41,886 --> 00:04:44,156 that you're only gonna have 8 numbers, but this allows us 119 00:04:44,156 --> 00:04:46,876 to at least discuss them and we looked at again, 120 00:04:46,876 --> 00:04:50,876 a number of algorithms: Binary search, Linear search. 121 00:04:50,876 --> 00:04:53,246 Binary being the faster of the two, it was that divide 122 00:04:53,246 --> 00:04:56,016 and conquer strategy, linear search was just left to right, 123 00:04:56,016 --> 00:04:58,146 or right to left, but they were both correct 124 00:04:58,556 --> 00:04:59,996 but one performs, of course, better. 125 00:05:00,216 --> 00:05:03,446 We looked at a couple of searches and in English terms, 126 00:05:03,446 --> 00:05:05,516 how did Bubble Sort work anyone? 127 00:05:06,656 --> 00:05:12,326 If you implement Bubble Sort, how does it work? 128 00:05:12,326 --> 00:05:14,236 [ Inaudible Remark ] 129 00:05:14,236 --> 00:05:14,726 >> Okay, good! 130 00:05:14,726 --> 00:05:17,286 So it compares two values that are adjacent to one another 131 00:05:17,286 --> 00:05:19,786 and in short, if they're out of order, swap them, 132 00:05:19,926 --> 00:05:22,166 and then repeat, and then repeat, and then repeat, 133 00:05:22,166 --> 00:05:24,846 and then repeat, and at what point could KC 134 00:05:24,846 --> 00:05:28,776 or anyone sorting N elements stop doing those comparisons? 135 00:05:28,776 --> 00:05:30,066 Does she-- did she just need to walk 136 00:05:30,066 --> 00:05:32,986 across the stage once comparing these two, these two, these two, 137 00:05:32,986 --> 00:05:35,176 and by the end done with Bubble Sort? 138 00:05:35,936 --> 00:05:38,946 Well, no, 'cause remember in bubble sort in the worst case, 139 00:05:38,946 --> 00:05:40,416 the person who's all the way at the end 140 00:05:40,416 --> 00:05:43,086 if we had the number one completely in the wrong spot, 141 00:05:43,086 --> 00:05:47,496 he or she might only bubble up one place despite KC's pass 142 00:05:47,496 --> 00:05:50,296 through the entire list of people so KC potentially had 143 00:05:50,296 --> 00:05:54,396 to go back and again, and again, and again potentially N times 144 00:05:54,396 --> 00:05:56,276 and that's where we got the N squared from. 145 00:05:56,526 --> 00:05:57,666 So what about Selection Sort? 146 00:05:57,666 --> 00:05:58,676 This was the first version 147 00:05:58,676 --> 00:06:01,246 and it was actually pretty intuitive, pretty consistent 148 00:06:01,246 --> 00:06:02,556 with what a normal human being would do. 149 00:06:02,556 --> 00:06:03,866 How do you define Selection Sort? 150 00:06:04,356 --> 00:06:09,246 Just in English, how do-- someone else? 151 00:06:09,246 --> 00:06:09,616 Yeah? 152 00:06:09,616 --> 00:06:11,906 >> It's where you find the lowest number 153 00:06:11,906 --> 00:06:12,836 and put it all the way to the left. 154 00:06:13,096 --> 00:06:14,656 >> Okay, so you find the lowest number 155 00:06:14,656 --> 00:06:16,346 and put it all the way to the left, perfect! 156 00:06:16,346 --> 00:06:18,596 And then repeat, but you do it with the next smallest number, 157 00:06:18,596 --> 00:06:19,786 and then the next smallest number, 158 00:06:20,056 --> 00:06:23,396 the problem with this though as clean and as intuitive as it is, 159 00:06:23,526 --> 00:06:26,006 it too involve just as much work because recall 160 00:06:26,006 --> 00:06:28,366 if the list was completely backwards, number one is 161 00:06:28,366 --> 00:06:31,036 over there, number 8 is over there, KC would have 162 00:06:31,036 --> 00:06:33,036 to go all the way over here, find "Oh! 163 00:06:33,036 --> 00:06:36,356 Here's number one, let me move him or her over here." 164 00:06:36,356 --> 00:06:38,356 And just evict this person, and just swap them 165 00:06:38,356 --> 00:06:40,126 over there 'cause they're in random at order 166 00:06:40,126 --> 00:06:42,156 in the first place, it doesn't matter we just shove them 167 00:06:42,156 --> 00:06:45,066 to the end anyway, but now she has to repeat and who knows 168 00:06:45,066 --> 00:06:47,196 where number two is so you have to look again, and again, 169 00:06:47,196 --> 00:06:50,836 and again, and finally we get N squared again 'cause she has 170 00:06:50,866 --> 00:06:53,566 to make N traversals back and forth across the stage, 171 00:06:53,686 --> 00:06:57,026 each time looking at potentially some N or N minus one 172 00:06:57,026 --> 00:07:00,896 or fewer humans so to see this in action, we can actually look 173 00:07:00,896 --> 00:07:02,766 at this visualization which is a little-- 174 00:07:03,786 --> 00:07:06,956 this shows us not just the ones we looked at but a few others 175 00:07:06,956 --> 00:07:10,276 as well so this again, each of the bars represent a number, 176 00:07:10,276 --> 00:07:12,876 smaller bar smaller number, big bar big number. 177 00:07:13,056 --> 00:07:15,346 And in a moment I'm gonna click that green arrow at the-- 178 00:07:15,346 --> 00:07:18,506 the top, and you'll see there's a few more sorts in this world 179 00:07:18,506 --> 00:07:22,716 than just Selection Sort, Bubble Sort, and Merge Sort, 180 00:07:22,716 --> 00:07:25,706 which we looked very briefly at the end of-- last time. 181 00:07:25,956 --> 00:07:28,646 There's Insertion Sort, there's Quick Sort, there's Shell Sort, 182 00:07:28,716 --> 00:07:31,316 Heap Sort, couple of which we'll come back to eventually 183 00:07:31,316 --> 00:07:33,126 but there's dozens of these things 184 00:07:33,126 --> 00:07:35,466 and really it's the principles in just a few of them that-- 185 00:07:35,516 --> 00:07:36,736 that we'll focus on for now. 186 00:07:37,006 --> 00:07:38,376 But the takeaway here is this. 187 00:07:38,556 --> 00:07:41,016 I've clicked random order so that we just kind 188 00:07:41,016 --> 00:07:43,386 of got some random numbers, small and big in there. 189 00:07:43,616 --> 00:07:44,746 If I click Go, what's neat 190 00:07:44,746 --> 00:07:47,756 about this visualization is it's gonna show us more than three, 191 00:07:48,076 --> 00:07:51,176 it's gonna show us all 8 of these algorithms at once, 192 00:07:52,166 --> 00:07:53,996 and recall Merge Sort one last time, 193 00:07:53,996 --> 00:07:55,386 he's doing pretty well this time. 194 00:07:56,406 --> 00:07:58,536 Apparently, Insertion Sort's pretty good, Heap Sort, 195 00:07:58,536 --> 00:08:00,046 Quick Sort, Quick Three, Shell-- 196 00:08:00,046 --> 00:08:01,996 I mean, maybe we should've started with these, right, 197 00:08:01,996 --> 00:08:04,936 'cause these are clearly much higher performance, apparently. 198 00:08:05,866 --> 00:08:10,526 Here comes Insertion Sort, and there's N squared for you. 199 00:08:10,966 --> 00:08:14,926 So as simple as Selection Sort was and as intuitive as it was, 200 00:08:14,926 --> 00:08:17,936 and as correct as it was, it's clearly suboptimal, 201 00:08:17,936 --> 00:08:20,286 at least on random inputs and we can do this again and again, 202 00:08:20,536 --> 00:08:22,416 but unless you get lucky with-- 203 00:08:22,416 --> 00:08:25,466 whereby maybe they're all sorted initially, those two algorithms 204 00:08:25,466 --> 00:08:28,616 in particular are gonna perform and feel slower, 205 00:08:28,616 --> 00:08:30,786 and thus is N squared. 206 00:08:30,846 --> 00:08:32,816 So hopefully, we can do better. 207 00:08:32,816 --> 00:08:36,686 In fact, if-- you might not have seen this a few years back, 208 00:08:36,686 --> 00:08:38,466 a certain someone was running for president 209 00:08:38,466 --> 00:08:41,166 and he was actually asked this question of, what-- 210 00:08:41,276 --> 00:08:43,276 which of these various sorts would you use? 211 00:08:43,616 --> 00:08:45,786 It's perhaps worth hammering home this point 212 00:08:45,786 --> 00:08:49,406 that N squared is indeed a little slow with this interview 213 00:08:49,546 --> 00:08:52,256 with former Google CEO Eric Schmidt. 214 00:08:53,516 --> 00:08:59,546 [ Pause ] 215 00:09:00,046 --> 00:09:00,966 >> Oops, start at the beginning. 216 00:09:01,516 --> 00:09:08,626 [ Applause ] 217 00:09:09,126 --> 00:09:12,596 >> Now-- now, Senator, you're here at Google and I 218 00:09:12,596 --> 00:09:15,696 like to think of the presidency as a-- as a job interview. 219 00:09:15,696 --> 00:09:16,776 [ Laughter ] 220 00:09:16,776 --> 00:09:20,746 >> Now, it's hard to get a job as president. 221 00:09:20,746 --> 00:09:20,916 >> Right. 222 00:09:20,916 --> 00:09:22,176 >> And-- and you're going to [inaudible]. 223 00:09:22,176 --> 00:09:23,726 It's also hard to get a job at Google. 224 00:09:23,856 --> 00:09:24,016 >> Right. 225 00:09:24,016 --> 00:09:24,456 [ Laughter ] 226 00:09:24,456 --> 00:09:29,696 >> We-- we have questions and we ask our candidates questions 227 00:09:29,696 --> 00:09:32,416 and this one is from-- Larry Schwimmer. 228 00:09:32,696 --> 00:09:32,866 >> Okay. 229 00:09:32,866 --> 00:09:32,933 [ Laughter ] 230 00:09:32,933 --> 00:09:36,236 >> What-- you guys think I'm kidding? 231 00:09:36,236 --> 00:09:38,516 It's right here. 232 00:09:38,886 --> 00:09:40,266 What is the most efficient way 233 00:09:40,266 --> 00:09:42,476 to sort a million 32-bit integers? 234 00:09:42,976 --> 00:09:45,446 [ Laughter ] 235 00:09:45,946 --> 00:09:47,826 >> Well-- I-- I-- 236 00:09:47,826 --> 00:09:49,546 >> I-- I'm, maybe-- I'm sorry maybe we should 237 00:09:49,546 --> 00:09:50,256 >> No, no, no, no-- 238 00:09:50,256 --> 00:09:50,546 >> That's not-- 239 00:09:50,546 --> 00:09:50,613 [ Simultaneous Talking ] 240 00:09:50,613 --> 00:09:51,306 >> I-- I think-- I think the-- 241 00:09:51,376 --> 00:09:55,426 the Bubble Sort would be the wrong way to go. 242 00:09:55,426 --> 00:09:56,306 [ Laughter ] 243 00:09:56,306 --> 00:09:59,916 >> Come on, who told him this? 244 00:10:00,416 --> 00:10:03,566 [ Applause ] 245 00:10:04,066 --> 00:10:05,766 >> So there you have it. 246 00:10:05,766 --> 00:10:09,686 So today, we'll look at one of these faster algorithms and sort 247 00:10:09,686 --> 00:10:12,466 of the principles that you can leverage in order to do better 248 00:10:12,466 --> 00:10:16,086 than Bubble Sort but first a few empowering techniques. 249 00:10:16,086 --> 00:10:19,026 So thus far with problem sets and even lecture code 250 00:10:19,026 --> 00:10:20,896 if you've been playing along at home, 251 00:10:20,896 --> 00:10:22,806 you will run into bugs, right? 252 00:10:22,806 --> 00:10:24,296 Your code might not compile. 253 00:10:24,296 --> 00:10:26,826 You might get segmentation faults or core dumps. 254 00:10:27,036 --> 00:10:28,776 You-- your program might never stop 255 00:10:29,076 --> 00:10:31,526 and so it just produces output and output and output 256 00:10:31,526 --> 00:10:32,906 or the cursor never comes back to you. 257 00:10:32,906 --> 00:10:35,326 There's so many ways in which you can screw up a program 258 00:10:35,716 --> 00:10:37,686 and it gets easier, trust us, 259 00:10:37,686 --> 00:10:39,666 to actually find the problems there in, 260 00:10:39,666 --> 00:10:41,836 so thus far if you've been debugging programs, 261 00:10:42,176 --> 00:10:43,226 hopefully you've realized 262 00:10:43,226 --> 00:10:46,356 that printf is absolutely your friend, plugging a few printfs 263 00:10:46,356 --> 00:10:49,196 to see what the value of X is, what the value of Y is-- 264 00:10:49,676 --> 00:10:50,926 this point, and this point, 265 00:10:50,926 --> 00:10:53,066 and this point can certainly help you along, 266 00:10:53,066 --> 00:10:55,296 but unfortunately printf can only take you 267 00:10:55,296 --> 00:10:57,686 so far plus it's tedious then you have to rip 268 00:10:57,686 --> 00:11:01,046 out that code once you found your problem, and especially now 269 00:11:01,046 --> 00:11:03,196 that we're in problem set 3 which is a game, 270 00:11:03,196 --> 00:11:04,896 The Game of Fifteen, there's a little bit 271 00:11:04,896 --> 00:11:07,726 of primitive animation there and you're gonna completely screw 272 00:11:07,726 --> 00:11:09,696 up the game board if you're trying to print 273 00:11:09,696 --> 00:11:12,436 out some error messages or the values of variables 274 00:11:12,436 --> 00:11:13,816 on the screen at the same time. 275 00:11:14,106 --> 00:11:16,456 And so we kind of need more sophisticated techniques now 276 00:11:16,456 --> 00:11:17,546 to debug programs. 277 00:11:17,546 --> 00:11:20,566 So enter into the picture a program called a debugger, 278 00:11:20,566 --> 00:11:23,286 so one of the most popular debuggers out there for Linux, 279 00:11:23,286 --> 00:11:25,046 and Mac OS and it's available on Windows 280 00:11:25,046 --> 00:11:27,856 as well is a program called GDB, the new debugger. 281 00:11:28,116 --> 00:11:29,616 And what this program allows you 282 00:11:29,616 --> 00:11:31,826 to do is set what are called breakpoints 283 00:11:31,946 --> 00:11:35,806 in your program whereby you say, don't run my program all the way 284 00:11:35,806 --> 00:11:41,056 from top to bottom rather pause execution on line 13 of main. 285 00:11:41,056 --> 00:11:43,966 Why 13? Well, you are suspicious that you have a bug somewhere 286 00:11:43,966 --> 00:11:47,226 around line 13 but your program's flying by so fast 287 00:11:47,226 --> 00:11:49,746 that you can't really wrap your mind around what the problem is 288 00:11:49,936 --> 00:11:51,976 so you'd really like to slow things down and look 289 00:11:51,976 --> 00:11:54,596 at step 13, then 14, then 15. 290 00:11:54,596 --> 00:11:57,526 Well a breakpoint, as the name suggests, allows you to brake 291 00:11:57,626 --> 00:12:00,446 at a certain point in your program and then at your pace, 292 00:12:00,786 --> 00:12:04,416 your human pace, only advance line by line when you're ready. 293 00:12:04,646 --> 00:12:07,336 Similarly, while you're inside this debugger you're gonna be 294 00:12:07,336 --> 00:12:09,786 able to printout the values of variable, so you don't have 295 00:12:09,786 --> 00:12:12,936 to use printf, you can just ask the debugger what bits are 296 00:12:12,936 --> 00:12:15,136 inside of this variable X, what number is there, 297 00:12:15,136 --> 00:12:16,316 or what string is here? 298 00:12:16,316 --> 00:12:18,266 And so you can actually see the inside 299 00:12:18,266 --> 00:12:21,096 of your program while still interacting with it. 300 00:12:21,136 --> 00:12:24,456 So let me go ahead and open up the appliance here, 301 00:12:24,506 --> 00:12:29,966 and I'm gonna go ahead and open up, let's say a program called-- 302 00:12:29,966 --> 00:12:35,416 let's go with-- about buggy3, so buggy3 was this program 303 00:12:35,416 --> 00:12:39,786 and we'll actually finally start solving this program-- 304 00:12:39,916 --> 00:12:41,216 this problem this week and next. 305 00:12:41,436 --> 00:12:43,566 This was that broken swap function whereby 306 00:12:43,566 --> 00:12:46,976 in main we had an int X and Y, they each had some values. 307 00:12:46,976 --> 00:12:48,716 We have some printfs, then we called this-- 308 00:12:48,716 --> 00:12:52,186 function Swap which then looked like this 309 00:12:52,496 --> 00:12:57,046 and we claimed verbally a week or so ago that Swap was correct 310 00:12:57,046 --> 00:13:00,516 at least in so far as it goes when it comes to swapping A 311 00:13:00,516 --> 00:13:04,196 and B, but it did not swap X and Y. The moment swap returned, 312 00:13:04,196 --> 00:13:06,716 yes, A and B we claimed were swapped but X 313 00:13:06,716 --> 00:13:09,536 and Y were unchanged, they were still 1 and 2, respectively. 314 00:13:09,796 --> 00:13:12,196 So just conceptually, what was the explanation behind the 315 00:13:12,196 --> 00:13:13,536 bugginess of this program? 316 00:13:14,036 --> 00:13:14,103 Yeah? 317 00:13:14,436 --> 00:13:19,846 >> The variables would pass that value not-- 318 00:13:20,316 --> 00:13:22,496 >> Yeah, so the values of X and Y-- 319 00:13:22,496 --> 00:13:27,006 the X and Y were passed by "value" which meant a copy of X 320 00:13:27,006 --> 00:13:30,256 and Y was passed into Swap, Swap just so happened to call them A 321 00:13:30,256 --> 00:13:31,896 and B, but it could call them whatever it wants, 322 00:13:32,056 --> 00:13:36,866 but they're indeed distinct 32 bits worth of integer 323 00:13:36,866 --> 00:13:39,606 so at this point when Swap was called, 324 00:13:39,606 --> 00:13:42,396 we claimed that there exist in this world X and Y, 325 00:13:42,396 --> 00:13:45,026 and then also A and B, and if you think back 326 00:13:45,026 --> 00:13:47,856 to how we drew the RAM inside of our computer, mains-- 327 00:13:47,856 --> 00:13:51,596 variables were down here, this slice of the stack frames 328 00:13:51,666 --> 00:13:54,566 so to speak, and then on top of it somewhere else in RAM was A 329 00:13:54,566 --> 00:13:57,296 and B, so just pictorially too, 330 00:13:57,396 --> 00:13:59,136 they were distinct chunks of memory. 331 00:13:59,406 --> 00:14:01,826 So, now we can finally see things like this 332 00:14:01,826 --> 00:14:03,426 and this program isn't just-- 333 00:14:03,426 --> 00:14:05,576 fixing this program actually takes a bit of ingenuity. 334 00:14:05,576 --> 00:14:07,196 And we'll actually get there later this week. 335 00:14:07,416 --> 00:14:09,726 But for now, we can at least understand what's going on 336 00:14:09,726 --> 00:14:11,566 and not just have to take my own word for, 337 00:14:11,706 --> 00:14:13,596 and along the way we can actually learn some 338 00:14:13,596 --> 00:14:14,476 of these commands. 339 00:14:14,476 --> 00:14:15,536 So I'm gonna go ahead and do this. 340 00:14:15,666 --> 00:14:17,756 I'm gonna open a larger terminal here, 341 00:14:17,756 --> 00:14:19,216 just so we can see more at one time. 342 00:14:19,216 --> 00:14:25,846 I'm gonna go ahead and run make buggy3, okay, it's up to date, 343 00:14:25,846 --> 00:14:28,126 which means I already compiled it last week so that's good. 344 00:14:28,126 --> 00:14:30,506 And I can go ahead and run it just to prove 345 00:14:30,506 --> 00:14:31,786 that it's still broken. 346 00:14:32,016 --> 00:14:34,296 So X is one, Y is two, there's still one 347 00:14:34,296 --> 00:14:35,256 and two at the end of this. 348 00:14:35,526 --> 00:14:38,256 So let's now, instead of running this as Buggy 3, 349 00:14:38,476 --> 00:14:42,756 lets instead run GDB and then pass to GDB 350 00:14:42,756 --> 00:14:44,346 as they're so-called Command Line Argument, 351 00:14:44,486 --> 00:14:46,056 the name of the program I wanna debug. 352 00:14:46,056 --> 00:14:47,886 So I'm gonna go ahead and hit Enter now. 353 00:14:48,426 --> 00:14:50,546 And now I see a bunch of warranty information 354 00:14:50,546 --> 00:14:52,186 and all this and then a new kind of prompt. 355 00:14:52,186 --> 00:14:53,976 At the bottom left there is a prompt that says, 356 00:14:53,976 --> 00:14:57,116 in parenthesis GDB, and that means I'm inside of the debugger 357 00:14:57,286 --> 00:14:58,906 and nothing's gonna happen just yet. 358 00:14:58,906 --> 00:15:01,256 I can hit Control L to clear the screen, 359 00:15:01,256 --> 00:15:02,386 now I'm inside the debugger 360 00:15:02,386 --> 00:15:05,046 but my program is not running yet but I can run it. 361 00:15:05,266 --> 00:15:09,086 I can literally type Run Enter and voila! 362 00:15:09,086 --> 00:15:11,236 I've run the program inside of the debugger. 363 00:15:11,236 --> 00:15:13,486 Now, unfortunately this is not at all useful thus far. 364 00:15:13,486 --> 00:15:16,886 All I've done is now run my program inside 365 00:15:16,886 --> 00:15:19,066 of another program but I haven't specified 366 00:15:19,066 --> 00:15:20,046 that I wanna poke around. 367 00:15:20,046 --> 00:15:21,316 I haven't specified breakpoints. 368 00:15:21,316 --> 00:15:23,016 I haven't specified I wanna see anything. 369 00:15:23,266 --> 00:15:24,076 So let's do that. 370 00:15:24,076 --> 00:15:26,756 This time, I'm gonna hit Control L just to keep the screen clean, 371 00:15:27,086 --> 00:15:28,376 I'm gonna type break, 372 00:15:28,876 --> 00:15:31,846 and I don't actually remember the line number 13 I made 373 00:15:31,846 --> 00:15:33,356 up before, but I know the function name 374 00:15:33,356 --> 00:15:36,396 so I'm gonna start simple, break at the function called main, 375 00:15:36,716 --> 00:15:39,276 and what you see now on the screen is "Breakpoint 1 376 00:15:39,276 --> 00:15:44,676 at 0x804842d file buggy3.c line 21." 377 00:15:44,966 --> 00:15:48,006 So-- on the one hand, GDB figured out where main starts, 378 00:15:48,006 --> 00:15:49,226 it's apparently line 21, 379 00:15:49,226 --> 00:15:56,236 but what in the world is this 0x804842d do you think? 380 00:15:56,236 --> 00:15:56,426 [ Inaudible Remark ] 381 00:15:56,426 --> 00:15:57,886 >> Yeah, so it's a memory address 382 00:15:57,886 --> 00:16:00,166 and we'll start seeing more things that look like this 383 00:16:00,166 --> 00:16:02,916 but almost any time you see a number in the context 384 00:16:02,916 --> 00:16:05,386 of programming that starts with OX, first of all, 385 00:16:05,386 --> 00:16:08,706 that's indication of hexadecimal notation or basics team, 386 00:16:08,706 --> 00:16:11,246 but we'll come back to that but for now it's just another way 387 00:16:11,246 --> 00:16:13,136 like decimal and binary of expressing numbers. 388 00:16:13,416 --> 00:16:16,516 And yes, that represents the address inside 389 00:16:16,516 --> 00:16:20,526 of my computer's memory at which what lives? 390 00:16:21,616 --> 00:16:23,006 Main itself, right? 391 00:16:23,006 --> 00:16:24,666 We said last week that main itself 392 00:16:24,666 --> 00:16:28,256 and really your program itself lives inside of RAM, in fact, 393 00:16:28,256 --> 00:16:31,066 it's one of the things that's up on toward the top of your memory 394 00:16:31,266 --> 00:16:33,506 and one of the things you run the risk of clobbering 395 00:16:33,506 --> 00:16:35,906 if you call functions way too many times, 396 00:16:35,946 --> 00:16:37,356 such that you overlap that memory 397 00:16:37,636 --> 00:16:39,696 but for now GDB has figured out where it is, 398 00:16:39,696 --> 00:16:43,196 it at a breakpoint number 1 and now if I go ahead and type Run 399 00:16:43,266 --> 00:16:47,146 and hit Enter, now it runs but it seems to break quite quickly. 400 00:16:47,186 --> 00:16:51,386 So it says "Breakpoint 1, main at buggy3.c line 21" well, 401 00:16:51,386 --> 00:16:54,116 I forget what that is but GDB is reminding me. 402 00:16:54,356 --> 00:16:56,526 If you'd like to see it in context, you can go back 403 00:16:56,646 --> 00:16:59,486 to G Edit and you can scroll down and find line 21. 404 00:16:59,706 --> 00:17:02,376 If you've never noticed in the bottom right corner of G Edit, 405 00:17:02,376 --> 00:17:04,276 it always tells you what line you're on 406 00:17:04,276 --> 00:17:05,276 and even what column you're 407 00:17:05,276 --> 00:17:07,136 on so you can do a quick sanity check there, 408 00:17:07,556 --> 00:17:09,596 so I'm gonna go back to my terminal now. 409 00:17:09,846 --> 00:17:12,676 And GDB at this point is saying, "You're on line 21, 410 00:17:12,906 --> 00:17:16,946 I'm about to execute line 21, what do you want to do?" 411 00:17:16,946 --> 00:17:19,456 Well, let's go ahead and do this. 412 00:17:19,866 --> 00:17:21,256 Let's move on to the next line. 413 00:17:21,256 --> 00:17:23,276 So next means execute what you see 414 00:17:23,276 --> 00:17:24,406 and move on to the next line. 415 00:17:24,796 --> 00:17:26,846 Alright, now, I'm gonna go ahead and say, "Yep, lets go ahead 416 00:17:26,846 --> 00:17:28,246 and execute this next." 417 00:17:28,656 --> 00:17:31,196 Okay, now I'm gonna get to line 25 and it's gonna print 418 00:17:31,196 --> 00:17:35,706 out it seems the value of Y. Well what is this? 419 00:17:35,966 --> 00:17:38,106 Why did I just do that? 420 00:17:39,186 --> 00:17:39,626 Interesting. 421 00:17:39,956 --> 00:17:41,906 Let me go ahead and do this. 422 00:17:42,026 --> 00:17:46,736 Print Y Enter, so is that consistent with what we expected 423 00:17:46,736 --> 00:17:47,966 if I go back to the code here? 424 00:17:48,786 --> 00:17:50,786 So yeah, Y was initialized to two here, 425 00:17:50,996 --> 00:17:52,446 I'm now down at this line here. 426 00:17:52,446 --> 00:17:53,896 The reason I'm kind of mumbling 427 00:17:53,896 --> 00:17:56,216 to myself is I'm actually not sure why GDB just skipped X 428 00:17:56,216 --> 00:17:58,396 there, so we'll come back to that later, but for now, 429 00:17:58,396 --> 00:18:02,116 we're at this line here where Y is percent D. If I print out Y 430 00:18:02,116 --> 00:18:05,256 within GDB, it is in fact 2, so let's hit Next again. 431 00:18:05,746 --> 00:18:10,266 And now, indeed, it prints out as much, and now let's go ahead 432 00:18:10,266 --> 00:18:14,376 and say, Next, and oh, this is interesting. 433 00:18:14,666 --> 00:18:18,046 This is the line of code that if executed is supposed to swap X 434 00:18:18,046 --> 00:18:20,066 and Y, so quick sanity check, let me go ahead 435 00:18:20,066 --> 00:18:21,866 and first print X, it's still one. 436 00:18:22,186 --> 00:18:23,996 Print Y, it's still 2. 437 00:18:24,176 --> 00:18:26,346 As an aside, the dollar sign notation here, 438 00:18:26,546 --> 00:18:29,676 this is just the GDB thing, it's giving you temporary nicknames 439 00:18:29,676 --> 00:18:31,926 for those values in case you wanna refer back 440 00:18:32,076 --> 00:18:34,686 to those values, but for now you can essentially ignore the 441 00:18:34,686 --> 00:18:36,306 dollar sign 2 and dollar sign 3, 442 00:18:36,506 --> 00:18:38,696 all we're seeing now is the value of X and the value 443 00:18:38,696 --> 00:18:41,546 of Y. Alright, so let's go ahead and hit Next 444 00:18:41,966 --> 00:18:44,356 and let Swap do its thing so I hit Enter, 445 00:18:44,946 --> 00:18:46,246 now it's about to print 446 00:18:46,246 --> 00:18:49,446 out Swapped exclamation point but lets do a check. 447 00:18:49,446 --> 00:18:55,996 Print X, still 1; Print Y still 2, so clearly it's broken here, 448 00:18:55,996 --> 00:18:58,856 so let's start over and actually see where the problem is, 449 00:18:58,856 --> 00:19:02,266 so let me go ahead and type Run again, it's gonna say, 450 00:19:02,266 --> 00:19:03,426 "Wait a minute, you're still debugging, 451 00:19:03,426 --> 00:19:04,606 do you wanna start it from the beginning?" 452 00:19:04,606 --> 00:19:07,046 I'm gonna say yes, and now we're back at the beginning, 453 00:19:07,296 --> 00:19:09,676 so now let me go and do a sanity check, print X-- 454 00:19:09,676 --> 00:19:14,726 seems like I've got multiple bugs, what's going on here? 455 00:19:14,726 --> 00:19:15,356 [ Inaudible Remark ] 456 00:19:15,356 --> 00:19:15,506 >> Sorry? 457 00:19:17,016 --> 00:19:21,046 >> You haven't hit Next yet. 458 00:19:21,046 --> 00:19:22,986 >> Yeah, so I haven't hit Next yet which means what? 459 00:19:22,986 --> 00:19:25,316 Line 21 has not yet executed 460 00:19:25,446 --> 00:19:28,196 which means what should be the value of X? 461 00:19:28,196 --> 00:19:28,896 [ Inaudible Remark ] 462 00:19:28,896 --> 00:19:30,686 >> Right, it's some garbage value, right? 463 00:19:30,686 --> 00:19:31,956 We talked last week about how 464 00:19:31,956 --> 00:19:34,886 because these stack frames are always piling up and going down, 465 00:19:34,886 --> 00:19:37,186 and up and down, you're reusing the same bits 466 00:19:37,186 --> 00:19:39,596 that some other function maybe not even written 467 00:19:39,596 --> 00:19:41,846 by you has actually previously used. 468 00:19:41,846 --> 00:19:42,806 So what we're seeing 469 00:19:42,806 --> 00:19:47,766 at X's location is this number 3,219,444, 470 00:19:47,986 --> 00:19:49,316 those are just random bits left 471 00:19:49,316 --> 00:19:51,366 over from something else's execution 472 00:19:51,556 --> 00:19:53,216 and so we're just seeing the remnants of that, 473 00:19:53,216 --> 00:19:57,986 but if I do click Next, and then print out X-- okay, whew! 474 00:19:58,056 --> 00:19:59,276 It's back to one. 475 00:19:59,496 --> 00:20:02,966 If I print out Y, same thing, another complete garbage value 476 00:20:02,966 --> 00:20:05,386 but if I hit Next now and then go ahead 477 00:20:05,386 --> 00:20:08,976 and print Y, now it's back to two. 478 00:20:09,126 --> 00:20:11,646 >> So let's hit Next, let's hit Next, 479 00:20:11,866 --> 00:20:13,216 and notice it's just doing the printf's 480 00:20:13,216 --> 00:20:14,516 and this is printing correctly, 481 00:20:14,516 --> 00:20:15,866 I must have hit something wrong earlier 482 00:20:16,266 --> 00:20:18,576 so let's hit Next now, and now Swap. 483 00:20:18,816 --> 00:20:22,016 So if you wanna step into a function and not step over it, 484 00:20:22,146 --> 00:20:26,016 what you do instead is literally type Step, so if I type Step, 485 00:20:26,346 --> 00:20:29,716 now notice what GDB is telling me, I'm now inside 486 00:20:29,716 --> 00:20:32,616 of a function called Swap, the arguments that were just passed 487 00:20:32,916 --> 00:20:35,936 to this function are one and two for the variables A 488 00:20:35,936 --> 00:20:40,566 and B respectively, I'm at line 41 in the same file, buggy3.c 489 00:20:40,566 --> 00:20:44,676 and I'm about to execute line 41 so let's go ahead and do this. 490 00:20:44,946 --> 00:20:48,996 So let's do Next, let's do Next, let's do Next, 491 00:20:49,786 --> 00:20:51,926 and now line 44 is the curly brace. 492 00:20:51,926 --> 00:20:54,576 So this is where our story verbally used to stop. 493 00:20:54,856 --> 00:20:58,276 Let's do a quick sanity check, Print A, Print B, 494 00:20:59,056 --> 00:21:03,806 and what should Temp be if I print Temp? 495 00:21:04,306 --> 00:21:09,276 And fifty-fifty percent chance here, and it's indeed one, 496 00:21:09,276 --> 00:21:11,996 'cause we initialized the two to the value of A and then used 497 00:21:11,996 --> 00:21:14,836 that to put that same value in B. So this looks right. 498 00:21:15,006 --> 00:21:16,416 When I printed A, I got 2. 499 00:21:16,606 --> 00:21:19,606 When I printed B, I got one, so let's go ahead 500 00:21:19,606 --> 00:21:22,306 and click Next and now I'm back. 501 00:21:22,536 --> 00:21:25,626 GDB is telling me back in main where I left off, line 28, 502 00:21:25,946 --> 00:21:30,556 so if I now print out X, print out Y, they're still broken 503 00:21:30,556 --> 00:21:33,046 as we expected and if I try printing A, 504 00:21:33,046 --> 00:21:35,766 no symbol A in current context Print B, 505 00:21:35,766 --> 00:21:38,126 same thing because they're no longer in scope. 506 00:21:38,376 --> 00:21:42,456 So to be honest, using GDB even for these simple operations 507 00:21:42,456 --> 00:21:47,856 of using Print, Next, Step, and Break are hugely more powerful 508 00:21:48,026 --> 00:21:51,616 than just using printf alone, so we'll spend time on these 509 00:21:51,616 --> 00:21:55,326 in section and certainly more over time as well here. 510 00:21:55,646 --> 00:21:59,256 But if you go in the meantime to cs50.net/resources, 511 00:21:59,536 --> 00:22:02,886 you'll see that under the GDB tag, there's one a-- 512 00:22:03,166 --> 00:22:04,306 some documentation here, 513 00:22:04,556 --> 00:22:07,366 but there's also a little quick reference cheat sheet 514 00:22:07,576 --> 00:22:09,856 that has way more commands than you'll probably care 515 00:22:09,856 --> 00:22:12,426 about initially, but it's a one-page PDF you can print out 516 00:22:12,426 --> 00:22:14,486 and it's a cheat sheet of sorts and you'll see 517 00:22:14,486 --> 00:22:15,886 that it gives you a quick sense 518 00:22:16,116 --> 00:22:19,376 of the various commands you can type, and also GDB is nice 519 00:22:19,376 --> 00:22:22,496 in that you can be fairly succinct. 520 00:22:22,576 --> 00:22:25,176 You can just say-- P for Y, PX. 521 00:22:25,176 --> 00:22:28,206 You can almost always type just the first one 522 00:22:28,276 --> 00:22:30,096 or two characters of the command. 523 00:22:30,246 --> 00:22:33,186 So you don't have to redundantly type out print all the time. 524 00:22:33,186 --> 00:22:35,156 And you can also scroll up, down, left, right. 525 00:22:35,156 --> 00:22:37,816 And finally when you're done just type Q or quit. 526 00:22:37,816 --> 00:22:40,236 And that will warn you that you're about to quit. 527 00:22:40,236 --> 00:22:42,546 And now you're back at the appliances prompt, 528 00:22:42,546 --> 00:22:43,636 so a hugely useful tool. 529 00:22:43,636 --> 00:22:44,836 And this is one of those things honestly 530 00:22:44,836 --> 00:22:45,806 where I did the same thing. 531 00:22:46,096 --> 00:22:47,286 We sort of introduced this. 532 00:22:47,666 --> 00:22:49,836 We claim that's incredibly useful. 533 00:22:50,056 --> 00:22:51,626 But there's invariably a learning curve, 534 00:22:51,626 --> 00:22:54,426 whether it's ten minutes, an hour or so to acclimate to it. 535 00:22:54,426 --> 00:22:56,296 And so this is absolutely one of these tools 536 00:22:56,296 --> 00:22:59,286 that like almost every student puts off, myself included back 537 00:22:59,286 --> 00:23:02,206 in the day, until later and later and later in the term. 538 00:23:02,206 --> 00:23:06,056 But honestly, spending the ten minutes, the hour for P Set 3 539 00:23:06,206 --> 00:23:09,986 and 4 to play with GDB, I promise will save you hours 540 00:23:09,986 --> 00:23:11,416 of time in the future. 541 00:23:11,576 --> 00:23:14,076 It is so much more useful than just using printf 542 00:23:14,146 --> 00:23:16,526 in your own brain sometimes when looking through code. 543 00:23:16,756 --> 00:23:19,096 It allows you to step through much more reasonably. 544 00:23:20,236 --> 00:23:24,626 Alright, any questions on debugging or GDB? 545 00:23:26,006 --> 00:23:28,366 Yeah? 546 00:23:28,366 --> 00:23:29,496 [ Inaudible Remark ] 547 00:23:29,496 --> 00:23:33,106 >> Good question, if you're dealing with arrays by default, 548 00:23:33,106 --> 00:23:35,996 GDB will usually know how to print something that's 549 00:23:35,996 --> 00:23:39,226 like an array or a string which is an array of characters. 550 00:23:39,226 --> 00:23:41,216 It does sometimes depend on the context. 551 00:23:41,476 --> 00:23:43,866 But for now if you declare an array inside of a function 552 00:23:43,866 --> 00:23:46,556 and then type "print space foo" where "foo" is the name 553 00:23:46,556 --> 00:23:48,106 of that array, you should see all 554 00:23:48,106 --> 00:23:49,556 of the members of that array, yes. 555 00:23:50,066 --> 00:23:52,776 The only problem arises sometimes if you start passing 556 00:23:52,776 --> 00:23:54,846 that array around which we haven't really done, 557 00:23:54,846 --> 00:23:57,466 except in problem set three's distribution code 558 00:23:57,466 --> 00:24:00,006 that sometimes GDB essentially loses track of the length 559 00:24:00,006 --> 00:24:02,746 and so it has trouble printing it for you. 560 00:24:02,746 --> 00:24:06,066 Other questions on GDB? 561 00:24:06,306 --> 00:24:13,336 Alright so comparing values, and now sorting values, 562 00:24:13,336 --> 00:24:14,826 how can we do this better? 563 00:24:14,826 --> 00:24:17,106 So this little picture here suggests that one 564 00:24:17,106 --> 00:24:18,466 of the fundamental operations, 565 00:24:18,466 --> 00:24:20,886 anytime we're actually doing sorting is that we need 566 00:24:20,886 --> 00:24:22,006 to compare stuff, alright? 567 00:24:22,006 --> 00:24:24,306 We need to compare one value against the other. 568 00:24:24,306 --> 00:24:26,276 In Bubble Sort that's literally what we were doing. 569 00:24:26,276 --> 00:24:27,666 And really in Selection Sort, 570 00:24:27,816 --> 00:24:29,186 that's what we were doing as well. 571 00:24:29,186 --> 00:24:31,506 Even though we describe it as go find the smallest value. 572 00:24:31,666 --> 00:24:33,326 Well, how do you find the smallest value? 573 00:24:33,466 --> 00:24:36,386 Well you look at the first person, there number 8, 574 00:24:36,506 --> 00:24:38,576 you just assume they're the smallest by default 575 00:24:38,576 --> 00:24:39,516 because they are thus far. 576 00:24:39,726 --> 00:24:41,166 Then you look for someone smaller 577 00:24:41,166 --> 00:24:44,566 and smaller every time comparing the person you found 578 00:24:44,726 --> 00:24:47,036 against the number you're considering replacing them with. 579 00:24:47,036 --> 00:24:50,056 So comparisons are really fundamental to this process 580 00:24:50,056 --> 00:24:52,556 of sorting and then to some extent, so is swapping, 581 00:24:52,806 --> 00:24:56,336 of course swapping is very easy if you just use X and Y 582 00:24:56,336 --> 00:24:58,646 in a temporary variable in your function but of-- 583 00:24:58,646 --> 00:25:00,096 we've seen that it gets problematic 584 00:25:00,386 --> 00:25:02,806 if you have a Swap function but we'll fix that problem 585 00:25:02,806 --> 00:25:05,746 on Wednesday but realize that for Problem Set 3, 586 00:25:05,746 --> 00:25:08,656 you don't need to implement your own Swap function, right, 587 00:25:08,656 --> 00:25:11,306 with just three lines of code, case in point, 588 00:25:11,726 --> 00:25:16,326 you can implement Swap inside of a for loop or inside of main, 589 00:25:16,326 --> 00:25:19,376 you don't necessarily need this auxiliary function just yet. 590 00:25:19,916 --> 00:25:23,566 So, can we do better and how can we go about doing better? 591 00:25:23,566 --> 00:25:25,066 Well we need a new technique. 592 00:25:25,616 --> 00:25:28,436 So if we want to solve these problems more effectively, 593 00:25:28,436 --> 00:25:30,996 it'd be nice if we could express ourselves a little more 594 00:25:30,996 --> 00:25:33,596 elegantly, and it's not gonna necessarily empower us 595 00:25:33,626 --> 00:25:36,556 to solve problems we couldn't with something called Iteration, 596 00:25:36,556 --> 00:25:39,516 otherwise known as Looping, but besides for loops 597 00:25:39,516 --> 00:25:42,056 and while loops and do-while loops, there's other ways 598 00:25:42,056 --> 00:25:43,636 to do something again and again and again, 599 00:25:43,636 --> 00:25:46,726 and thus far we did it once intentionally, though foolishly, 600 00:25:46,936 --> 00:25:49,126 recall that when I implemented a function foo, 601 00:25:49,226 --> 00:25:51,206 and then I had foo call foo, 602 00:25:51,286 --> 00:25:53,316 I think that was the function's name, what happened 603 00:25:53,356 --> 00:25:54,646 when I had a function call itself? 604 00:25:54,646 --> 00:25:55,316 [ Inaudible Remark ] 605 00:25:55,316 --> 00:25:59,496 >> It kept going, it kept going, 606 00:25:59,496 --> 00:26:00,856 it kept going, until what happened? 607 00:26:01,386 --> 00:26:01,716 >> Segmentation-- 608 00:26:01,756 --> 00:26:03,486 >> Segmentation fault, right, core dumped. 609 00:26:03,586 --> 00:26:04,416 So why was that? 610 00:26:04,416 --> 00:26:07,326 Well, we claimed last time that when you think of your memory 611 00:26:07,326 --> 00:26:09,786 as that big rectangle and you keep calling a function, 612 00:26:09,786 --> 00:26:11,826 each invocation, not the function, 613 00:26:11,866 --> 00:26:15,116 each call of that function gets its own slice of RAM another, 614 00:26:15,116 --> 00:26:15,966 another, another, another, 615 00:26:15,966 --> 00:26:17,976 and even though you might have gigabytes worth of RAM, 616 00:26:18,166 --> 00:26:20,726 eventually you do something infinitely long and you're going 617 00:26:20,726 --> 00:26:23,226 to take up more than you're allowed 618 00:26:23,226 --> 00:26:24,736 and so the operating system's gonna kill you, 619 00:26:24,736 --> 00:26:27,096 you're gonna dump core, and inside of that file 620 00:26:27,096 --> 00:26:28,796 or the contents of memory. 621 00:26:28,796 --> 00:26:32,306 As an aside, before I forget, if you are very good 622 00:26:32,306 --> 00:26:34,816 at creating core dumps in your programs, 623 00:26:35,046 --> 00:26:38,306 realize that GDB here can also save the day. 624 00:26:38,306 --> 00:26:41,736 If you do find that you do have one of these files called Core, 625 00:26:41,846 --> 00:26:44,536 what you can do with GDB is run GDB 626 00:26:44,796 --> 00:26:48,446 on whatever your program name was but then specify 627 00:26:48,446 --> 00:26:51,306 as a third word at the command line the name 628 00:26:51,306 --> 00:26:53,236 of the file, Core, C-O-R-E. 629 00:26:53,466 --> 00:26:56,406 And then hit Enter, and if I actually had had a core dumped 630 00:26:56,406 --> 00:26:58,256 from this invocation of this program, 631 00:26:58,636 --> 00:27:02,546 the GDB is being handed not only the program called Program, 632 00:27:02,766 --> 00:27:04,426 but also the core dump and inside 633 00:27:04,426 --> 00:27:06,236 of the core dump, is what did we say? 634 00:27:06,236 --> 00:27:06,996 [ Inaudible Remark ] 635 00:27:06,996 --> 00:27:12,526 >> It's the contents essentially of RAM 636 00:27:12,596 --> 00:27:14,556 at the point your program crashed 637 00:27:14,556 --> 00:27:16,866 which means this is your opportunity for an autopsy 638 00:27:16,866 --> 00:27:19,456 of sorts, this is the dead program sitting there 639 00:27:19,456 --> 00:27:23,466 in the file called Core and with GDB, you can now step through 640 00:27:23,466 --> 00:27:28,276 and type Print, and other commands to actually look inside 641 00:27:28,276 --> 00:27:29,246 of what happened there. 642 00:27:29,246 --> 00:27:31,196 So there, too, is an opportunity. 643 00:27:31,916 --> 00:27:34,276 Alright, but we'll be having plenty 644 00:27:34,276 --> 00:27:35,936 of segmentation faults Wednesday onward 645 00:27:35,936 --> 00:27:37,776 when we finally introduce something called Pointers 646 00:27:37,776 --> 00:27:39,876 but for now, really an elegant technique, 647 00:27:39,876 --> 00:27:42,716 and this is something we won't use all that often this semester 648 00:27:42,936 --> 00:27:44,566 but in future courses if you decide 649 00:27:44,566 --> 00:27:46,806 to proceed particularly courses like 51, 650 00:27:47,076 --> 00:27:49,726 or in a Data structures, or algorithms class or really 651 00:27:49,726 --> 00:27:51,256 in the real world when you start trying 652 00:27:51,256 --> 00:27:53,206 to solve really complex problems, 653 00:27:53,436 --> 00:27:55,916 this notion of recursion is really powerful. 654 00:27:55,916 --> 00:27:59,126 And recursion boils down to calling yourself, 655 00:27:59,386 --> 00:28:01,336 like you are a function, you are said 656 00:28:01,336 --> 00:28:03,836 to be recursive if you call yourself. 657 00:28:04,086 --> 00:28:07,396 Now, we saw last week that this is apparently a bad thing. 658 00:28:07,396 --> 00:28:10,086 If you call yourself, you get into this infinite loop 659 00:28:10,086 --> 00:28:11,286 and use segmentation fault. 660 00:28:11,686 --> 00:28:14,906 But that's true if you'll just call yourself-- 661 00:28:16,316 --> 00:28:19,566 if you just call yourself again and again 662 00:28:19,696 --> 00:28:21,686 without even considering stopping. 663 00:28:21,976 --> 00:28:25,486 But suppose we introduce a branch or a condition that says, 664 00:28:25,686 --> 00:28:26,836 sometimes you should stop 665 00:28:27,166 --> 00:28:30,456 so that we can short-circuit this otherwise infinite looping. 666 00:28:30,726 --> 00:28:33,416 So here's some generic pseudo-code 667 00:28:33,816 --> 00:28:37,446 for that algorithm Merge Sort, so just as a teaser here, 668 00:28:37,666 --> 00:28:39,036 here's how Merge Sort works. 669 00:28:39,036 --> 00:28:41,436 Merge Sort was that super fast algorithm we saw at the end 670 00:28:41,436 --> 00:28:45,516 of last week and we saw among the 8 candidates today it works 671 00:28:45,516 --> 00:28:46,106 as follows. 672 00:28:46,216 --> 00:28:50,016 If you were handed N elements and N is less than two, 673 00:28:50,156 --> 00:28:51,196 you just return, right? 674 00:28:51,196 --> 00:28:54,426 If you get fewer than two elements, how big is the list? 675 00:28:55,556 --> 00:28:58,566 It's zero or one, in either case the list is sorted 676 00:28:58,736 --> 00:28:59,336 so you're done. 677 00:28:59,786 --> 00:29:02,436 Else, if you actually get an interesting size list 678 00:29:02,496 --> 00:29:05,686 of size two or three or a thousand or anything bigger, 679 00:29:05,996 --> 00:29:07,136 you're gonna instead do this. 680 00:29:07,346 --> 00:29:10,826 Sort the left half of elements, sort the right half of elements, 681 00:29:11,116 --> 00:29:13,436 then merge the sorted halves. 682 00:29:13,526 --> 00:29:14,306 That's it! 683 00:29:14,306 --> 00:29:15,956 Right, you might have thought that Merge Sort, wow, 684 00:29:15,956 --> 00:29:18,836 must be really interesting and complex, it's gonna take, 685 00:29:18,836 --> 00:29:20,456 you know, really complex demonstration but no-- 686 00:29:20,456 --> 00:29:22,686 maybe, probably, actually you didn't think any of that. 687 00:29:22,686 --> 00:29:27,096 So instead Merge Sort is defined literally this simply, 688 00:29:27,376 --> 00:29:29,426 it literally calls itself. 689 00:29:29,606 --> 00:29:31,256 Right, if you're an algorithm and you wanna kind 690 00:29:31,256 --> 00:29:33,646 of be a smart-ass algorithm and you're told here, 691 00:29:33,646 --> 00:29:36,426 sort these N elements, you could just push back and say, 692 00:29:36,476 --> 00:29:38,626 "Okay fine, I'm gonna sort these N elements 693 00:29:38,626 --> 00:29:41,666 by first sorting the left half, then sorting the right half, 694 00:29:41,666 --> 00:29:43,696 and then I'll merge the two sorted halves." 695 00:29:43,856 --> 00:29:48,276 But that then beget-- that then begs what question cyclically, 696 00:29:48,276 --> 00:29:50,386 okay fine, how are you gonna sort the left half? 697 00:29:50,386 --> 00:29:51,616 How are you gonna sort the right half? 698 00:29:51,886 --> 00:29:54,126 Well, I can give you the same wisecrack answer and say, 699 00:29:54,126 --> 00:29:55,866 well I'm gonna sort the left half of the left half 700 00:29:55,866 --> 00:29:58,246 and the right half of the left half and then the other side 701 00:29:58,306 --> 00:29:59,626 and I'm gonna merge the two, right? 702 00:30:00,156 --> 00:30:02,286 >> Now these two feels like you're being difficult 703 00:30:02,286 --> 00:30:05,296 and aren't really answering the question but you kind of are 704 00:30:05,466 --> 00:30:08,466 because what if you get down to the point in this sort of back 705 00:30:08,466 --> 00:30:11,626 and forth argument where all I do is hand you two numbers 706 00:30:11,626 --> 00:30:11,976 like this. 707 00:30:12,826 --> 00:30:14,306 Well, you're gonna say the same thing, 708 00:30:14,306 --> 00:30:16,686 I'm gonna sort the left then I'm gonna sort the right then I'm 709 00:30:16,686 --> 00:30:17,506 gonna emerge the two. 710 00:30:17,806 --> 00:30:19,636 Alright, find I'm gonna push back one last time 711 00:30:19,636 --> 00:30:21,456 and say "Fine, how are you gonna sort the left half? 712 00:30:21,936 --> 00:30:23,856 Well, the left half looks like this. 713 00:30:23,856 --> 00:30:24,766 How you're gonna sort this?" 714 00:30:26,156 --> 00:30:27,116 It's already sorted. 715 00:30:27,356 --> 00:30:29,186 Right, so there is the brilliant insight. 716 00:30:29,376 --> 00:30:32,996 Once this argument kind of bottoms out and the list is 717 00:30:32,996 --> 00:30:36,826 of sufficiently small size the problem is trivial to solve. 718 00:30:36,826 --> 00:30:38,736 All you have to do is hand me back the list. 719 00:30:39,216 --> 00:30:40,866 So in Merge Sort as its name kind 720 00:30:40,866 --> 00:30:43,666 of suggests the real magic is not so much 721 00:30:43,666 --> 00:30:48,226 in the sorting per se but in the last step the merging. 722 00:30:48,226 --> 00:30:52,986 That must be where the sorting in real terms actually happens. 723 00:30:53,386 --> 00:30:56,956 Now in as much as this algorithm Merge Sort effectively calls 724 00:30:56,956 --> 00:31:00,026 itself by cyclically saying, how do you sort N elements? 725 00:31:00,026 --> 00:31:02,396 Well, you sort these elements then you sort these elements. 726 00:31:02,726 --> 00:31:04,326 The fact that you're answering question 727 00:31:04,326 --> 00:31:07,096 with the same question means that it's recursive. 728 00:31:07,096 --> 00:31:09,036 So let's see this notion of recursion, 729 00:31:09,196 --> 00:31:11,106 but first in a simpler context. 730 00:31:11,106 --> 00:31:15,236 Let me go over to this file here and the appliance 731 00:31:16,416 --> 00:31:21,336 and open up let's say sigma1. 732 00:31:21,336 --> 00:31:23,806 So for those following along at home 733 00:31:23,806 --> 00:31:28,126 over the PDF this is sigma1.c. And I'm gonna scroll down here 734 00:31:28,126 --> 00:31:30,256 and let's see what this sigma function does. 735 00:31:30,256 --> 00:31:32,346 So this function gets its name incidentally 736 00:31:32,346 --> 00:31:35,076 from just the Greek symbol for adding a bunch 737 00:31:35,076 --> 00:31:36,986 of numbers together that looks a little something like that. 738 00:31:37,316 --> 00:31:39,476 So this is just an addition function at the end of the day. 739 00:31:39,476 --> 00:31:41,526 So how is this addition actually working? 740 00:31:41,776 --> 00:31:42,656 Well, let's take a look. 741 00:31:42,746 --> 00:31:46,256 So we have at the very top of the file a prototype 742 00:31:46,526 --> 00:31:49,036 for a function called sigma whose argument is an int 743 00:31:49,036 --> 00:31:50,436 and whose return type is an int. 744 00:31:50,806 --> 00:31:53,996 Recall that I mentioned that for prototypes you don't strictly 745 00:31:53,996 --> 00:31:58,156 need to mention the name of the variable just its type. 746 00:31:58,506 --> 00:32:00,696 So that's an example of that shorthand notation 747 00:32:00,696 --> 00:32:01,486 up at the top. 748 00:32:01,726 --> 00:32:03,826 Now here is main, so what's main doing? 749 00:32:03,826 --> 00:32:05,756 It's gonna ask the user for a positive int. 750 00:32:05,756 --> 00:32:07,546 So here's an example of using do-while. 751 00:32:07,726 --> 00:32:09,836 Again, we keep using do-while in programs 752 00:32:09,836 --> 00:32:11,876 and P sets anytime you wanna get something 753 00:32:11,876 --> 00:32:13,476 from the user then yell at them 754 00:32:13,476 --> 00:32:15,086 if they don't oblige the first time. 755 00:32:15,116 --> 00:32:16,356 So it's a good construct for that. 756 00:32:16,666 --> 00:32:17,756 So all of these first five 757 00:32:17,756 --> 00:32:20,026 or six lines are doing are getting an integer 758 00:32:20,276 --> 00:32:21,946 that should be positive from the user. 759 00:32:22,466 --> 00:32:25,966 Then, apparently I'm declaring an int called answer. 760 00:32:26,626 --> 00:32:27,906 I'm storing an answer, 761 00:32:27,906 --> 00:32:30,506 the result of calling a function whose name is sigma 762 00:32:30,766 --> 00:32:35,026 and the input to sigma is N, whatever the user typed N. So, 763 00:32:35,026 --> 00:32:38,666 I claim that sigma's purpose in life is to compute the sum 764 00:32:38,926 --> 00:32:42,166 from 1 up to N. So if the user types 765 00:32:42,166 --> 00:32:45,446 in two sigma should give me 1 plus 2 it should give me 3. 766 00:32:45,726 --> 00:32:48,786 And if the user types in 3 the program should give me 1 plus 2 767 00:32:48,786 --> 00:32:50,076 plus 3, 6. 768 00:32:50,326 --> 00:32:50,976 I should get back. 769 00:32:50,976 --> 00:32:52,706 You should get the summation of those numbers. 770 00:32:52,926 --> 00:32:55,506 And then, all this program is gonna do is report the answer. 771 00:32:55,746 --> 00:32:57,426 So let's go ahead and run this to see. 772 00:32:57,426 --> 00:32:59,286 Let me go on to the source directory, 773 00:32:59,516 --> 00:33:02,566 I'm gonna go ahead and make sigma 1. 774 00:33:03,086 --> 00:33:04,246 Okay, compile is okay. 775 00:33:04,246 --> 00:33:06,286 Let me run sigma 1 on 2. 776 00:33:07,076 --> 00:33:09,166 Oops, no, doesn't take a command line argument. 777 00:33:09,166 --> 00:33:12,076 Let me run sigma 1, then give its number 2 778 00:33:12,076 --> 00:33:13,336 and it does give me 3. 779 00:33:13,336 --> 00:33:17,146 If I give it 3, it gives me 6, if I give it 10,000, 780 00:33:17,146 --> 00:33:18,386 it gives me a really big number. 781 00:33:18,576 --> 00:33:21,036 So it seems to be working but, you know, 782 00:33:21,036 --> 00:33:24,926 I could be the difficult one or the TF, do something like this. 783 00:33:24,926 --> 00:33:29,746 That was probably a bad idea. 784 00:33:29,746 --> 00:33:30,426 [ Laughter ] 785 00:33:30,426 --> 00:33:34,306 >> So what might be happening here? 786 00:33:34,436 --> 00:33:35,346 That's a pretty big number. 787 00:33:35,346 --> 00:33:36,586 It's bigger than 4 billion. 788 00:33:37,586 --> 00:33:40,076 You know it's pro-- feels 789 00:33:40,076 --> 00:33:42,846 like it's looping infinitely even though that's a big number. 790 00:33:43,126 --> 00:33:44,636 So what-- remember what can happen 791 00:33:44,636 --> 00:33:48,776 if you overflow an integer it could be mistaken as negative, 792 00:33:48,856 --> 00:33:51,806 so it's possible I'm actually looping ad nauseam now 793 00:33:51,806 --> 00:33:53,676 because I've never gonna hit my condition. 794 00:33:53,676 --> 00:33:56,276 So in any case that's something where we'd have to assume a way, 795 00:33:56,276 --> 00:33:57,426 we would tell you in a P set, 796 00:33:57,466 --> 00:33:59,076 assume the user is not gonna type something more 797 00:33:59,076 --> 00:34:00,226 than two or four billion. 798 00:34:00,506 --> 00:34:03,046 But at least it seems to be working for small value. 799 00:34:03,046 --> 00:34:04,626 So let's go back and see how. 800 00:34:05,046 --> 00:34:06,666 So how is sigma working? 801 00:34:07,146 --> 00:34:08,366 Well, it's actually pretty simple. 802 00:34:08,366 --> 00:34:11,436 And this is kind of week 1, week 2 stuff now. 803 00:34:11,436 --> 00:34:14,426 So sigma takes an argument we called it M. 804 00:34:14,486 --> 00:34:16,146 But it could be anything so long it's an int. 805 00:34:16,606 --> 00:34:20,466 First we do have a bit of-- oh, so interesting I actually made 806 00:34:20,466 --> 00:34:22,486 that up for just a moment ago 'cause I'm actually checking 807 00:34:22,486 --> 00:34:24,226 for negative value so there must be something else. 808 00:34:24,286 --> 00:34:25,676 And we could use GDB to figure 809 00:34:25,676 --> 00:34:26,946 out what was really going on there. 810 00:34:27,396 --> 00:34:31,096 So it's saying if M is less than one return zero immediately 811 00:34:31,096 --> 00:34:33,376 so that we don't run the rest of an infinite loop. 812 00:34:33,536 --> 00:34:37,866 Now in here, return the some of 1 through M how does this work? 813 00:34:38,166 --> 00:34:40,106 I've got a variable initialize to zero. 814 00:34:40,446 --> 00:34:43,706 I've got a for loop that's gonna go from literally 1 815 00:34:43,916 --> 00:34:48,126 up to an equal to M. So that's a literal translation of the goal 816 00:34:48,246 --> 00:34:51,956 and then sum plus equals I and then return the sum. 817 00:34:52,246 --> 00:34:55,856 So it feels like a very reasonable simple implementation 818 00:34:55,856 --> 00:34:58,516 of summation from I equals 1 up to M 819 00:34:59,006 --> 00:35:00,616 and then returning the value. 820 00:35:00,826 --> 00:35:02,556 And we did get back that value. 821 00:35:03,126 --> 00:35:05,326 But there's kind of an opportunity here, right? 822 00:35:05,636 --> 00:35:09,366 If I wanted to solve this problem in sort 823 00:35:09,366 --> 00:35:11,936 of fundamentally another way, what does it mean 824 00:35:11,936 --> 00:35:14,836 to compute the sum of 1 through N? 825 00:35:16,056 --> 00:35:18,356 Well, it's-- if I wanna be the wise ass like here. 826 00:35:18,356 --> 00:35:24,966 It's really, alright it's N plus the sum of 1 to N minus 1. 827 00:35:25,206 --> 00:35:26,616 Alright, so that bags the question, 828 00:35:26,616 --> 00:35:29,106 what is the sum of 1 to N minus 1? 829 00:35:29,686 --> 00:35:35,176 Well, it's N minus 1 plus the sum of 1 to N minus 2. 830 00:35:35,176 --> 00:35:37,736 And okay fine, what's the sum of 1 to N minus 2? 831 00:35:37,736 --> 00:35:42,206 Well, its N minus 2 plus the sum of 1 to N minus 3, right? 832 00:35:42,206 --> 00:35:43,756 So you keep going and going and going 833 00:35:43,756 --> 00:35:45,256 and this could become problematic 834 00:35:45,296 --> 00:35:46,406 if you do go negative. 835 00:35:46,686 --> 00:35:48,676 But it's if at some point you do have a mental check 836 00:35:48,676 --> 00:35:49,306 that says what? 837 00:35:50,066 --> 00:35:54,796 What's the sum of 1, 2-- 1 to 1? 838 00:35:54,796 --> 00:35:57,506 Well it's just 1. 839 00:35:57,586 --> 00:35:59,736 So you better stop asking me this stupid question. 840 00:35:59,736 --> 00:36:02,246 Right, at some point there should be a very simple case, 841 00:36:02,246 --> 00:36:05,256 let's call it a base case where it just wouldn't make sense 842 00:36:05,256 --> 00:36:09,356 to recursively or to cyclically ask the same question again. 843 00:36:09,626 --> 00:36:12,266 So let's translate that idea now to code. 844 00:36:12,266 --> 00:36:16,016 Let me go ahead and close sigma1 and open up sigma2 instead 845 00:36:16,016 --> 00:36:18,556 and scroll down here and notice this part 846 00:36:18,556 --> 00:36:20,226 of the program is now identical. 847 00:36:20,526 --> 00:36:23,596 I still have a printf and a do-while loop. 848 00:36:23,876 --> 00:36:27,116 I still have a sigma call storing the result and answer 849 00:36:27,296 --> 00:36:29,496 and I still print out what the answer is. 850 00:36:29,496 --> 00:36:32,606 The only that's different now is the implementation of sigma. 851 00:36:32,836 --> 00:36:34,216 So let's scroll down here. 852 00:36:34,216 --> 00:36:37,716 And look how relatively simple this now is. 853 00:36:37,976 --> 00:36:40,796 No extra variables, no for loop, no plus plus, 854 00:36:40,796 --> 00:36:42,426 no syntactic distraction. 855 00:36:42,706 --> 00:36:44,206 There's like real elegance here. 856 00:36:44,426 --> 00:36:46,326 In fact, if I get rid of these comments, 857 00:36:46,906 --> 00:36:49,676 just look how simple it is to express this idea. 858 00:36:50,166 --> 00:36:53,186 If M is less than or equal to zero, you better stop, right, 859 00:36:53,186 --> 00:36:54,926 'cause bad things will happen if we go negative, 860 00:36:55,146 --> 00:36:56,816 so let's just return zero. 861 00:36:57,036 --> 00:36:59,036 Now, this is not necessarily the right answer 862 00:36:59,036 --> 00:37:01,196 because this is not the sum if you actually try 863 00:37:01,196 --> 00:37:02,296 to sum up negative numbers. 864 00:37:02,296 --> 00:37:03,676 But we have to make a design decision 865 00:37:03,856 --> 00:37:05,376 so we're not gonna support negative numbers. 866 00:37:05,596 --> 00:37:09,416 So if M is less than or equal to zero, return zero otherwise, 867 00:37:09,416 --> 00:37:10,746 what's the answer to the question? 868 00:37:10,746 --> 00:37:12,286 What's the sum from one to M? 869 00:37:12,936 --> 00:37:17,546 Well, it is M plus the sum of everything below it. 870 00:37:18,206 --> 00:37:19,966 So, now, think about what's happening here. 871 00:37:19,966 --> 00:37:21,886 Main gets called then-- 872 00:37:22,006 --> 00:37:24,326 and let's suppose the user types in the number three. 873 00:37:24,326 --> 00:37:25,766 We'll keep it small for simplicity. 874 00:37:25,986 --> 00:37:28,736 So the user types in three and Main gets called. 875 00:37:28,956 --> 00:37:31,866 Main calls sigma with an argument of three. 876 00:37:32,126 --> 00:37:33,256 What does sigma do? 877 00:37:33,256 --> 00:37:35,476 Well, sigma says, all right, is three less 878 00:37:35,476 --> 00:37:36,976 than or equal to zero? 879 00:37:36,976 --> 00:37:39,266 No. So we do the second case, the else. 880 00:37:39,766 --> 00:37:41,376 So what's the answer to the question? 881 00:37:41,376 --> 00:37:44,126 What is the sum of one through three? 882 00:37:44,456 --> 00:37:47,066 Well, it's three plus the sum of? 883 00:37:48,666 --> 00:37:51,306 The sum of one to two which is otherwise expressed 884 00:37:51,306 --> 00:37:57,046 as three plus sigma of two, all right, so what's sigma of two? 885 00:37:57,046 --> 00:37:59,736 Well, sigma of two is two plus sigma of one. 886 00:38:00,026 --> 00:38:01,276 What's sigma of one? 887 00:38:02,006 --> 00:38:05,116 Well, it's one plus sigma of zero? 888 00:38:05,116 --> 00:38:06,856 What's sigma of zero? 889 00:38:07,696 --> 00:38:09,396 That's where we hit the base case, right? 890 00:38:09,396 --> 00:38:12,526 So this is the thing I did not have a week or two ago 891 00:38:12,636 --> 00:38:14,206 when I completely broke the program 892 00:38:14,206 --> 00:38:16,236 by just having foo call itself again and again. 893 00:38:16,376 --> 00:38:19,706 I never once said, if something is true, stop doing this. 894 00:38:20,016 --> 00:38:22,686 But, indeed, with a recursive function like this, 895 00:38:22,726 --> 00:38:26,626 if you do have a decision point that says stop the recursion 896 00:38:26,806 --> 00:38:30,446 if something is true, you avoid that process 897 00:38:30,606 --> 00:38:32,996 of overflowing the stack so to speak. 898 00:38:33,466 --> 00:38:35,356 So let's see if this thing now works. 899 00:38:35,356 --> 00:38:37,076 Let me go to sigma2. 900 00:38:37,486 --> 00:38:38,186 Make sigma2. 901 00:38:38,456 --> 00:38:41,866 Seems to compile okay, let me do sigma2, Enter. 902 00:38:42,196 --> 00:38:43,386 And now, let me go ahead 903 00:38:43,386 --> 00:38:46,766 and do positive integer, let's give it two. 904 00:38:46,766 --> 00:38:47,416 So that works. 905 00:38:47,416 --> 00:38:48,276 Let's give it three. 906 00:38:48,766 --> 00:38:49,276 That works. 907 00:38:49,336 --> 00:38:51,816 Let's give ten, seems to be working. 908 00:38:51,816 --> 00:38:54,746 Let's give it a bigger number, seems to be working. 909 00:38:54,746 --> 00:38:56,576 Let's give it an even bigger number. 910 00:38:57,556 --> 00:38:58,366 Okay, interesting. 911 00:38:59,056 --> 00:39:02,906 So just because we have a base case doesn't mean the solution 912 00:39:02,906 --> 00:39:04,036 is perfect, right? 913 00:39:04,036 --> 00:39:06,506 Even apparently calling this is, what, 914 00:39:06,506 --> 00:39:10,676 one million times calling the same function a million times is 915 00:39:10,676 --> 00:39:11,526 apparently bad. 916 00:39:12,296 --> 00:39:14,776 But that's okay 'cause this technique 917 00:39:14,776 --> 00:39:18,066 of recursion is still something we can leverage to solve, say, 918 00:39:18,066 --> 00:39:21,036 the problem of sorting, because especially with sorting, 919 00:39:21,036 --> 00:39:22,546 even if there are a million elements, 920 00:39:22,826 --> 00:39:25,426 the goal of doing divide and concur recall is 921 00:39:25,426 --> 00:39:28,116 to whittle these problems down to something much smaller, 922 00:39:28,116 --> 00:39:33,416 much fewer steps than anything linear with N. So let's go ahead 923 00:39:33,416 --> 00:39:35,686 and take a quick break but when we come back we'll now play this 924 00:39:35,686 --> 00:39:37,766 idea to the solving of sorting. 925 00:39:38,516 --> 00:39:44,406 [ Noise ] 926 00:39:44,906 --> 00:39:50,246 >> All right, so before class I borrowed a few milk cartons 927 00:39:50,246 --> 00:39:52,716 from Annenberg across the way to see 928 00:39:52,716 --> 00:39:56,036 if we can take a human approach 929 00:39:56,036 --> 00:39:59,466 to sorting these elements while still executing these relatively 930 00:39:59,466 --> 00:40:01,426 straightforward instructions and let's see 931 00:40:01,426 --> 00:40:05,686 if we can understand why and how Merge Sort is actually speeding 932 00:40:05,686 --> 00:40:08,976 ahead so much more effectively than Bubble or Selection Sort. 933 00:40:09,116 --> 00:40:12,536 >> For this, we need just one volunteer, someone perhaps 934 00:40:12,536 --> 00:40:14,766 who hasn't been up here before to kind of run the show 935 00:40:14,766 --> 00:40:16,096 with these eight milk cartons. 936 00:40:16,096 --> 00:40:18,426 Come on down. 937 00:40:18,426 --> 00:40:20,706 All right, so, let's see how well this works. 938 00:40:21,496 --> 00:40:22,766 The algorithm itself is correct. 939 00:40:22,816 --> 00:40:24,546 So let's see if we can now do it some justice. 940 00:40:24,546 --> 00:40:25,196 What's your name? 941 00:40:25,516 --> 00:40:25,686 >> Joseph. 942 00:40:26,006 --> 00:40:26,316 >> Joseph? 943 00:40:26,376 --> 00:40:27,546 Okay, good to meet you. 944 00:40:27,546 --> 00:40:28,556 Come on up. 945 00:40:28,776 --> 00:40:32,386 David. All right, so here are the same exact numbers 946 00:40:32,386 --> 00:40:34,816 in the same exact order as last week. 947 00:40:34,816 --> 00:40:37,386 And for logistical purposes, you wanna hop back 948 00:40:37,386 --> 00:40:38,716 down on the ground floor there? 949 00:40:39,006 --> 00:40:41,136 And the goal is gonna be to sort these things, 950 00:40:41,136 --> 00:40:43,396 but thankfully this time we have some written instructions. 951 00:40:43,626 --> 00:40:47,546 So the input here now is of size eight, so first thing, 952 00:40:47,546 --> 00:40:49,056 I ask the question, is eight less than two? 953 00:40:49,056 --> 00:40:50,376 >> No, it's not. 954 00:40:50,506 --> 00:40:52,626 >> Okay, so no it's not, obviously so we proceed 955 00:40:52,626 --> 00:40:55,646 with the else, so else I say sort left half of element. 956 00:40:55,646 --> 00:40:57,656 So left half looks like these four here. 957 00:40:58,126 --> 00:41:01,226 So now if you interpret the same algorithm, how are you going 958 00:41:01,226 --> 00:41:02,666 to sort the left half of these elements? 959 00:41:02,826 --> 00:41:04,536 >> I'm gonna cut them in half and do it again? 960 00:41:04,656 --> 00:41:06,536 >> Okay, so he's gonna cut them in help and do it again. 961 00:41:06,536 --> 00:41:07,586 And just hold it a little closer there. 962 00:41:07,826 --> 00:41:09,716 So, okay, so cut it in half. 963 00:41:09,716 --> 00:41:10,726 Why cut it half? 964 00:41:10,726 --> 00:41:12,846 Well, Joseph needs an algorithm with which 965 00:41:12,846 --> 00:41:13,936 to sort the left half. 966 00:41:13,936 --> 00:41:16,606 The only algorithms we have are Bubble Sort and Selection Sort. 967 00:41:16,606 --> 00:41:17,796 Today, those are disqualified. 968 00:41:17,796 --> 00:41:18,656 It's way too slow. 969 00:41:18,896 --> 00:41:21,586 So we only have Merge Sort, so let's do the same thing. 970 00:41:21,776 --> 00:41:25,526 So here is now list of size four, so is four less than two? 971 00:41:25,526 --> 00:41:25,636 >> No. 972 00:41:25,636 --> 00:41:27,086 >> No. No. 973 00:41:27,086 --> 00:41:29,246 So now we sort the left half of elements. 974 00:41:29,246 --> 00:41:30,026 So you hand me this. 975 00:41:30,756 --> 00:41:31,246 What do you do? 976 00:41:31,966 --> 00:41:33,456 >> N is not less than two? 977 00:41:33,456 --> 00:41:35,316 >> Not less than two 'cause it is two. 978 00:41:35,716 --> 00:41:36,656 >> We cut in half again. 979 00:41:36,656 --> 00:41:37,656 >> So we cut in half again. 980 00:41:38,046 --> 00:41:39,706 So now here is a list of size one. 981 00:41:39,706 --> 00:41:40,446 Is one less than two? 982 00:41:40,776 --> 00:41:41,206 >> Yeah. 983 00:41:41,326 --> 00:41:42,386 >> Okay, obviously, it is. 984 00:41:42,386 --> 00:41:43,796 And so, what do you do? 985 00:41:43,866 --> 00:41:44,276 >> Return. 986 00:41:44,506 --> 00:41:46,126 >> Okay, so this is sorted, right. 987 00:41:46,126 --> 00:41:47,046 This is progress now. 988 00:41:47,296 --> 00:41:50,916 The left half of the left half of the left half is now sorted, 989 00:41:50,916 --> 00:41:53,186 even though Joseph literally didn't need to lift a finger. 990 00:41:53,426 --> 00:41:54,846 So what step comes next? 991 00:41:54,846 --> 00:41:55,706 Where are we in the story? 992 00:41:55,956 --> 00:41:56,716 >> We merge. 993 00:41:56,946 --> 00:41:57,386 >> Careful. 994 00:41:57,496 --> 00:41:58,476 >> We're at the else. 995 00:41:58,876 --> 00:42:01,216 We return and then we call again. 996 00:42:01,516 --> 00:42:02,946 >> Good. So we sorted the left half, 997 00:42:02,946 --> 00:42:04,336 but what comes after sort left half? 998 00:42:04,466 --> 00:42:05,126 >> We sort the right half. 999 00:42:05,126 --> 00:42:06,276 >> So now we sort the right half. 1000 00:42:06,276 --> 00:42:08,016 Here is the right half on input. 1001 00:42:08,016 --> 00:42:09,176 N equals one. 1002 00:42:09,176 --> 00:42:09,946 Is one less than two? 1003 00:42:10,056 --> 00:42:10,196 >> Yeah. 1004 00:42:10,196 --> 00:42:11,446 >> It is. So he returns. 1005 00:42:11,586 --> 00:42:12,166 This is great. 1006 00:42:12,356 --> 00:42:14,716 Now, we have two bites of the problem solved, 1007 00:42:14,916 --> 00:42:15,946 so now what's the next step? 1008 00:42:16,056 --> 00:42:16,666 >> We merge them. 1009 00:42:16,666 --> 00:42:17,496 >> Okay, so we merge them. 1010 00:42:17,496 --> 00:42:18,696 So now you can finally do something. 1011 00:42:21,216 --> 00:42:22,556 Okay, so good. 1012 00:42:22,556 --> 00:42:24,126 This is actually pretty good progress. 1013 00:42:24,126 --> 00:42:25,566 We've sorted this part of the list. 1014 00:42:25,566 --> 00:42:27,146 Now, they're not necessarily in perfect order, 1015 00:42:27,306 --> 00:42:29,276 'cause two probably this shouldn't be out all the way 1016 00:42:29,276 --> 00:42:30,426 at the end 'cause there is a one. 1017 00:42:30,636 --> 00:42:31,616 Four shouldn't be there. 1018 00:42:31,826 --> 00:42:33,006 But at least this is progress. 1019 00:42:33,326 --> 00:42:35,406 So, now, mentally, we kind of have to start rewinding. 1020 00:42:35,406 --> 00:42:38,196 How did we get to the point of merging these two halves? 1021 00:42:38,596 --> 00:42:42,666 Well, we started here because this is the right half 1022 00:42:42,666 --> 00:42:47,006 of the input of size four, right, so you kind of have 1023 00:42:47,006 --> 00:42:51,066 to mentally recourse in backwards to where we started. 1024 00:42:51,256 --> 00:42:52,436 So here is input of four. 1025 00:42:52,436 --> 00:42:53,836 You just sorted the left half. 1026 00:42:53,836 --> 00:42:54,366 What comes next? 1027 00:42:55,136 --> 00:42:55,276 >> Right. 1028 00:42:55,276 --> 00:42:56,456 >> All right, so let's sort the right half. 1029 00:42:56,556 --> 00:42:57,446 How do we sort the right half? 1030 00:42:57,746 --> 00:42:58,556 >> Cut it in half again. 1031 00:42:58,696 --> 00:43:00,066 >> Okay. Here's the left half. 1032 00:43:00,146 --> 00:43:00,826 >> Return six. 1033 00:43:00,946 --> 00:43:01,526 >> Here's the right half. 1034 00:43:01,586 --> 00:43:02,036 >> Return eight. 1035 00:43:02,286 --> 00:43:03,476 >> Here's both of them together? 1036 00:43:03,476 --> 00:43:03,996 [ Inaudible Remark ] 1037 00:43:03,996 --> 00:43:05,096 >> Okay. Merge. 1038 00:43:05,096 --> 00:43:06,016 They're already merged. 1039 00:43:06,186 --> 00:43:07,386 All right, so that's good. 1040 00:43:07,386 --> 00:43:08,766 It's really minimal work so far. 1041 00:43:08,766 --> 00:43:09,896 So where are we in the story? 1042 00:43:10,906 --> 00:43:13,776 >> Now, we have to merge the two halves together again. 1043 00:43:13,776 --> 00:43:16,536 >> Perfect 'cause now, if you again rewind mentally, 1044 00:43:16,716 --> 00:43:19,516 we've sorted the left half, we just sorted the right half. 1045 00:43:19,516 --> 00:43:20,276 So what comes next? 1046 00:43:20,496 --> 00:43:21,456 Merge the sorted halves. 1047 00:43:21,456 --> 00:43:23,776 >> Now, it's a little more interesting, so go ahead 1048 00:43:23,776 --> 00:43:25,276 and implement this idea of merging. 1049 00:43:26,946 --> 00:43:27,136 >> Well. 1050 00:43:27,296 --> 00:43:28,166 >> Good, very well done. 1051 00:43:28,886 --> 00:43:30,556 So 2, 4, 6, 8, so we're done. 1052 00:43:30,556 --> 00:43:32,136 Hopefully, the right side is gonna be more interesting. 1053 00:43:32,416 --> 00:43:34,396 So where are we now in the story? 1054 00:43:34,396 --> 00:43:36,426 We've hit the bottom of that else condition. 1055 00:43:36,696 --> 00:43:39,846 This was the left half of the input of size? 1056 00:43:40,426 --> 00:43:40,786 >> Eight. 1057 00:43:40,786 --> 00:43:42,806 >> All right, so why don't we put you on autopilot now, 1058 00:43:43,216 --> 00:43:44,886 and do-- tell us, though, verbally what you're doing. 1059 00:43:45,156 --> 00:43:48,266 >> Okay, so now we're on the right half of the input 1060 00:43:48,266 --> 00:43:51,716 of size eight, so we cut that in half and look at the left half 1061 00:43:51,856 --> 00:43:55,206 and then we cut that in half and look at left half of that. 1062 00:43:55,206 --> 00:43:57,636 Return one, return three for the right half, 1063 00:43:57,726 --> 00:43:59,906 and those are already merged. 1064 00:44:00,496 --> 00:44:04,186 So then we come over here and we cut this half and half 1065 00:44:04,286 --> 00:44:07,936 and return seven, return five, and then merge these. 1066 00:44:08,606 --> 00:44:08,956 Beautiful! 1067 00:44:08,956 --> 00:44:09,231 [ Laughter ] 1068 00:44:09,231 --> 00:44:09,506 [ Noise ] 1069 00:44:09,506 --> 00:44:12,796 >> Very well done. 1070 00:44:12,896 --> 00:44:16,756 >> And now we have to merge the whole half with the whole half. 1071 00:44:16,846 --> 00:44:18,596 >> Good. Both halves are size four. 1072 00:44:18,596 --> 00:44:19,736 So now how do we do this? 1073 00:44:19,736 --> 00:44:22,466 And this part-- so, clearly, the real work is involved 1074 00:44:22,556 --> 00:44:23,356 in the merging, right? 1075 00:44:23,356 --> 00:44:24,736 Again, hence the name Merge Sort. 1076 00:44:24,976 --> 00:44:27,036 So be a little precise now verbally. 1077 00:44:27,036 --> 00:44:28,436 How are you gonna merge these two? 1078 00:44:29,256 --> 00:44:30,106 >> Okay, well-- 1079 00:44:30,106 --> 00:44:32,356 >> And just to be clear, there is one half, there is the other. 1080 00:44:32,606 --> 00:44:37,446 >> Okay, so I say we look at the left hand of this half, 1081 00:44:38,326 --> 00:44:41,136 this half, and the left hand of that half and compare it over. 1082 00:44:41,176 --> 00:44:43,256 >> Okay, so comparison, another theme thus far. 1083 00:44:43,256 --> 00:44:49,776 >> Right. And so, if that entity is smaller than half entity, 1084 00:44:49,776 --> 00:44:51,986 we then merge those two first. 1085 00:44:51,986 --> 00:44:52,986 >> Okay, so good. 1086 00:44:52,986 --> 00:44:55,566 So it turns out here is one point worth making. 1087 00:44:55,856 --> 00:44:58,446 So we kind of need some extra space now, right, 1088 00:44:58,516 --> 00:45:00,916 because if we kind of take last week's approach 1089 00:45:00,916 --> 00:45:03,526 of just swapping these guys, then we're screwing 1090 00:45:03,526 --> 00:45:04,716 up all the work we just did. 1091 00:45:04,716 --> 00:45:06,866 It is a good thing that two is on the left over here. 1092 00:45:06,866 --> 00:45:10,096 It's a good thing that one is on the left of the right half. 1093 00:45:10,286 --> 00:45:12,856 So we can't just kinda arbitrarily swap two people 1094 00:45:12,856 --> 00:45:15,276 and say we'll fix that later, 'cause if we fix that later, 1095 00:45:15,496 --> 00:45:18,216 the supposedly fast Merge Sort algorithm is gonna start taking 1096 00:45:18,216 --> 00:45:18,996 even more time. 1097 00:45:19,316 --> 00:45:21,846 So it turns out the one that the hidden costs of Merge Sort 1098 00:45:21,846 --> 00:45:24,136 and one of the key ingredients to making it so fast 1099 00:45:24,306 --> 00:45:25,706 in those visualizations is 1100 00:45:25,706 --> 00:45:28,016 that Joseph actually needs extra space. 1101 00:45:28,016 --> 00:45:30,406 So it was actually deliberate that I left a foot or so 1102 00:45:30,406 --> 00:45:32,116 of space here in front of these buckets 1103 00:45:32,156 --> 00:45:34,816 because Merge Sort gains its efficiency. 1104 00:45:34,816 --> 00:45:37,376 It increases-- it decreases its time 1105 00:45:37,736 --> 00:45:39,666 by increasing its space requirement, 1106 00:45:39,666 --> 00:45:40,626 its memory requirement. 1107 00:45:40,626 --> 00:45:43,676 So put more-- technically, for Merge Sort, you need two arrays. 1108 00:45:43,736 --> 00:45:46,756 One empty, one is the original array, and use the empty one 1109 00:45:46,756 --> 00:45:49,176 to put the elements into by the end. 1110 00:45:49,176 --> 00:45:52,296 So now you have as much memory as you want; eight new spots. 1111 00:45:52,296 --> 00:45:54,346 So where are you gonna put one? 1112 00:45:54,876 --> 00:45:56,016 Okay, so one goes there. 1113 00:45:56,016 --> 00:45:58,536 And, actually, there, let's put them in different places 1114 00:45:58,536 --> 00:46:01,216 so separate arrays, but we can still let it peak through. 1115 00:46:01,356 --> 00:46:02,956 Okay, so now what happens next? 1116 00:46:02,956 --> 00:46:03,896 One was less than two, 1117 00:46:04,096 --> 00:46:06,896 so one goes into the left most location of the new array. 1118 00:46:06,896 --> 00:46:08,296 What do you wanna do next? 1119 00:46:08,446 --> 00:46:11,046 >> So then you-- but you still compare the left, 1120 00:46:11,606 --> 00:46:13,566 this array with the next element in the-- 1121 00:46:13,566 --> 00:46:13,946 >> Perfect! 1122 00:46:13,946 --> 00:46:14,716 So you repeat. 1123 00:46:14,716 --> 00:46:16,656 So, you look at the left end of the left half, 1124 00:46:16,866 --> 00:46:18,126 the left end of the right half. 1125 00:46:18,126 --> 00:46:20,226 Compare those two values, turns out two is 1126 00:46:20,226 --> 00:46:21,646 in fact smaller, so where does 2 go? 1127 00:46:22,066 --> 00:46:22,486 >> Right there. 1128 00:46:23,166 --> 00:46:23,506 >> Right there? 1129 00:46:23,506 --> 00:46:25,326 And so let's put it into the new memory space. 1130 00:46:25,386 --> 00:46:26,776 All right, so now what are you gonna compare? 1131 00:46:27,286 --> 00:46:28,856 >> The left end of this and the left end of that. 1132 00:46:28,856 --> 00:46:29,926 >> So four versus three? 1133 00:46:29,926 --> 00:46:30,726 >> And three is smaller. 1134 00:46:31,006 --> 00:46:34,736 >> Okay. So even though this kind of feels 1135 00:46:34,736 --> 00:46:36,246 like there is a lot of work going on here, 1136 00:46:36,426 --> 00:46:37,676 notice one key takeaway. 1137 00:46:37,676 --> 00:46:40,716 Every time Joseph puts a number into the new array, 1138 00:46:40,946 --> 00:46:42,306 he never has to go back, right? 1139 00:46:42,306 --> 00:46:44,566 There is no looping back and forth 1140 00:46:44,816 --> 00:46:46,516 like KC had to do last week. 1141 00:46:46,716 --> 00:46:49,116 Moreover, he's only looking at each of these-- 1142 00:46:49,116 --> 00:46:52,726 he's only moving to the left of these elements once. 1143 00:46:52,726 --> 00:46:54,386 As soon as he considers an element 1144 00:46:54,386 --> 00:46:55,896 and decides it's smaller, that's it. 1145 00:46:55,896 --> 00:46:56,986 It's out of rotation. 1146 00:46:57,206 --> 00:46:59,366 So how many elements in total do we have to merge? 1147 00:47:00,176 --> 00:47:00,276 >> Eight. 1148 00:47:01,476 --> 00:47:01,956 >> Eight, right? 1149 00:47:01,956 --> 00:47:03,876 Or really seven because the last one we get for free, 1150 00:47:03,876 --> 00:47:06,466 so N or N minus 1, so that two is key here. 1151 00:47:06,466 --> 00:47:07,836 The number of elements we need 1152 00:47:07,836 --> 00:47:11,736 to merge is maximally N 'cause we have N empty spots. 1153 00:47:11,736 --> 00:47:12,566 We have N buckets. 1154 00:47:12,836 --> 00:47:14,266 We need to put them into the empty spots, all right. 1155 00:47:14,266 --> 00:47:15,796 So, where are we in the story comparing what? 1156 00:47:15,866 --> 00:47:17,056 >> So now we're comparing four and five. 1157 00:47:17,056 --> 00:47:17,606 >> Four and five. 1158 00:47:18,086 --> 00:47:18,796 >> Four goes-- 1159 00:47:19,446 --> 00:47:21,056 >> Okay. Now, we're comparing? 1160 00:47:21,216 --> 00:47:21,866 >> Six and five. 1161 00:47:21,866 --> 00:47:22,286 >> Okay. 1162 00:47:22,976 --> 00:47:24,786 >> It goes here. 1163 00:47:24,946 --> 00:47:25,686 >> Now, we're comparing? 1164 00:47:25,686 --> 00:47:29,296 >> Six and seven and 6 goes here and then eight and seven. 1165 00:47:30,316 --> 00:47:34,586 Put seven here and then the eight goes right there. 1166 00:47:34,586 --> 00:47:36,406 >> Done! Okay, a big round of applause for Joseph. 1167 00:47:36,406 --> 00:47:37,616 Well done! 1168 00:47:38,116 --> 00:47:41,256 [ Applause ] 1169 00:47:41,756 --> 00:47:43,836 >> So, as is always the case with these human demos, 1170 00:47:43,836 --> 00:47:45,336 it feels like that took a while, right? 1171 00:47:45,336 --> 00:47:49,526 It really did take some time but think about how much time 1172 00:47:49,736 --> 00:47:52,996 if we try to express it in those N terms that we did last week. 1173 00:47:53,306 --> 00:47:57,946 So, every time-- the way we started this algorithm is we 1174 00:47:57,946 --> 00:47:59,256 divided the list in half. 1175 00:47:59,446 --> 00:48:01,186 And then we divided the list in half. 1176 00:48:01,186 --> 00:48:02,706 And then we divided the list in half. 1177 00:48:02,956 --> 00:48:05,046 Well, this is kind of an easy answer now to this part 1178 00:48:05,046 --> 00:48:07,536 of the question, how many times can you divide N things 1179 00:48:08,596 --> 00:48:10,436 in half formulaically? 1180 00:48:11,596 --> 00:48:11,916 >> Log N. 1181 00:48:11,916 --> 00:48:12,926 >> So that's log N, right? 1182 00:48:12,926 --> 00:48:15,906 It's technically log base 2N and even if you're a little hazy 1183 00:48:15,906 --> 00:48:17,676 on what the logarithms are, for now just know 1184 00:48:17,676 --> 00:48:18,966 that anytime we ask the question, 1185 00:48:19,206 --> 00:48:21,036 how many times can you divide something in half? 1186 00:48:21,676 --> 00:48:24,116 Well, it's gonna be log base 2 of the something 1187 00:48:24,386 --> 00:48:27,096 or really we again, just like last week, we said "Divided 1188 00:48:27,096 --> 00:48:28,356 by two plus two, who cares? 1189 00:48:28,356 --> 00:48:29,896 Let's ignore constant factors." 1190 00:48:30,226 --> 00:48:31,376 Same deal with the logarithms. 1191 00:48:31,376 --> 00:48:35,086 So, we'll almost always just say log N. So log N is what gives us 1192 00:48:35,086 --> 00:48:38,846 that nice shaped curve that was not straight but rather curved 1193 00:48:39,176 --> 00:48:41,606 and that's what gave us that added performance boost. 1194 00:48:41,766 --> 00:48:44,616 So, there is definitely something log N involved here. 1195 00:48:44,836 --> 00:48:45,916 So, a quick sanity check. 1196 00:48:46,196 --> 00:48:49,976 Can you sort N element in log N time? 1197 00:48:49,976 --> 00:48:54,666 All right, is that the answer? 1198 00:48:54,706 --> 00:48:56,806 It's actually not but why can-- 1199 00:48:56,806 --> 00:49:02,666 does it take more than the log N steps to sort N elements? 1200 00:49:02,666 --> 00:49:04,656 [ Inaudible Remark ] 1201 00:49:04,656 --> 00:49:08,286 >> Okay, so that's actually the full answer but just logically. 1202 00:49:08,526 --> 00:49:11,396 If I've got N elements and I claim I can sort these 1203 00:49:11,396 --> 00:49:13,406 in log N steps, you can immediately say, 1204 00:49:13,406 --> 00:49:14,716 "Hmm, not possible." 1205 00:49:14,976 --> 00:49:19,076 Even if you have no idea how I'm claiming to do it because what? 1206 00:49:19,076 --> 00:49:20,506 In order to sort N elements, 1207 00:49:20,906 --> 00:49:22,986 you kind of just logically minimally need 1208 00:49:22,986 --> 00:49:24,156 to do what with each of them? 1209 00:49:25,536 --> 00:49:26,846 At least like look at it, right? 1210 00:49:26,846 --> 00:49:29,176 You at least have to check, is it in the right place? 1211 00:49:29,516 --> 00:49:32,006 So just logically, if you need to solve the problem 1212 00:49:32,006 --> 00:49:34,516 that involves sorting N elements, minimally, 1213 00:49:34,516 --> 00:49:38,616 the lower bound is going to be N steps because if it's any less 1214 00:49:38,666 --> 00:49:39,726 than that, you're just guessing. 1215 00:49:39,726 --> 00:49:42,066 You're just claiming sorted when you haven't even looked 1216 00:49:42,066 --> 00:49:42,996 at all of those elements. 1217 00:49:43,176 --> 00:49:45,376 So again, here too is kind of a mental rule of thumb. 1218 00:49:45,376 --> 00:49:47,186 You can decide, "Hmm, that's not realistic 1219 00:49:47,186 --> 00:49:50,356 because you can't possibly sort all N elements in less 1220 00:49:50,906 --> 00:49:53,476 than N steps because again, you need to look at each of them. 1221 00:49:53,726 --> 00:49:54,536 Now, a quick sanity check. 1222 00:49:54,536 --> 00:49:56,696 When we had Sean on video up at the board here, 1223 00:49:56,976 --> 00:49:59,066 he was able to find an element 1224 00:49:59,116 --> 00:50:01,976 in log N steps but that was okay. 1225 00:50:02,046 --> 00:50:04,546 >> When Sean was looking for the number seven, he didn't have 1226 00:50:04,686 --> 00:50:06,676 to look at all of the other elements, right? 1227 00:50:06,716 --> 00:50:09,586 When they were sorted, rather Sean was doing it randomly 1228 00:50:09,586 --> 00:50:11,986 but when we then did it ourselves up on stage, 1229 00:50:12,346 --> 00:50:15,146 we sorted the elements in advance and so you don't need 1230 00:50:15,146 --> 00:50:17,006 to look at all of the elements if you already know. 1231 00:50:17,266 --> 00:50:18,726 "Those are less-- those are smaller 1232 00:50:18,726 --> 00:50:19,546 than what I'm looking for. 1233 00:50:19,546 --> 00:50:20,986 Those are bigger than what I'm looking for." 1234 00:50:21,196 --> 00:50:23,536 You can throw elements away but not so with sorting. 1235 00:50:23,966 --> 00:50:28,506 So, with Merge Sort, we're gonna have the list log N times 1236 00:50:28,806 --> 00:50:32,006 and that's as many times as we can divide something in half. 1237 00:50:32,106 --> 00:50:34,226 But every time we have the list, 1238 00:50:34,836 --> 00:50:36,576 there is this key step of merging. 1239 00:50:37,336 --> 00:50:39,926 Well, suppose I have the list down to size 2, 1240 00:50:39,926 --> 00:50:42,646 how many steps are involved in merging a list of size 2? 1241 00:50:43,216 --> 00:50:46,996 It's like two or you know, it's like two or three, right? 1242 00:50:46,996 --> 00:50:47,966 You have to compare them. 1243 00:50:48,166 --> 00:50:50,116 Then you need to move one, then move the other. 1244 00:50:50,386 --> 00:50:52,616 So it feels like it's two or three, give or take. 1245 00:50:52,616 --> 00:50:56,226 But it's some linear factor of the number of bucket, same here. 1246 00:50:56,446 --> 00:50:58,356 How about when we had 2 halves like this? 1247 00:50:58,776 --> 00:51:00,256 These were sorted and these were sorted, 1248 00:51:00,256 --> 00:51:01,246 then we had to merge them. 1249 00:51:01,506 --> 00:51:04,346 How many steps did it take to merge these things? 1250 00:51:04,766 --> 00:51:06,696 Well, you can actually simplify this logically. 1251 00:51:06,786 --> 00:51:10,846 If these plastic bins were at this location a moment ago, 1252 00:51:11,056 --> 00:51:13,956 and this half was sorted, this half was sorted, now we needed 1253 00:51:13,956 --> 00:51:16,226 to sort-- now we needed to merge the two halves. 1254 00:51:16,516 --> 00:51:18,546 Well, how many steps does this take? 1255 00:51:18,546 --> 00:51:21,356 Well, you can just kind of infer well, this guy ultimately has 1256 00:51:21,356 --> 00:51:22,536 to move into his location. 1257 00:51:22,676 --> 00:51:24,376 This guy has to move into his location. 1258 00:51:24,376 --> 00:51:26,556 This guy has to be merged into his location and this guy 1259 00:51:26,556 --> 00:51:29,426 into his, so it's again four, maybe give or take 1260 00:51:29,426 --> 00:51:30,496 if we have to-- if it's actually-- 1261 00:51:30,496 --> 00:51:31,386 maybe it's eight, right? 1262 00:51:31,386 --> 00:51:33,396 Maybe you need to compare them and then move it around, 1263 00:51:33,606 --> 00:51:36,036 but it's still linear because there are only four things you 1264 00:51:36,036 --> 00:51:37,146 need to merge in total. 1265 00:51:37,536 --> 00:51:40,116 So it feels like every time we have the list, 1266 00:51:40,726 --> 00:51:42,726 we have to do something as many as N times 1267 00:51:42,756 --> 00:51:44,566 for the merging process. 1268 00:51:45,006 --> 00:51:46,956 And so in the end, merge sorts 1269 00:51:47,036 --> 00:51:48,886 and we'll see this in just a moment. 1270 00:51:49,106 --> 00:51:55,146 Merge sort is what we'll say as in big O of N times log N. So, 1271 00:51:55,356 --> 00:51:56,636 this refers to the merging. 1272 00:51:56,996 --> 00:51:58,836 At the end of the day, you've got to merge N elements 1273 00:51:59,006 --> 00:52:00,806 but this refers to the number 1274 00:52:00,806 --> 00:52:03,316 of times you're gonna have the list again and again and again 1275 00:52:03,316 --> 00:52:06,446 so you get N log N. Now again, just quick sanity check 1276 00:52:06,446 --> 00:52:09,046 without dwelling on what a logarithm even is necessarily, 1277 00:52:09,276 --> 00:52:11,746 which of these are smaller, N or log N? 1278 00:52:11,816 --> 00:52:12,206 >> Log N. 1279 00:52:12,206 --> 00:52:13,136 >> So log N is smaller. 1280 00:52:13,136 --> 00:52:16,036 So then, why is Merge Sort so much fast-- 1281 00:52:16,816 --> 00:52:18,776 why is merge sort so much faster 1282 00:52:19,146 --> 00:52:21,016 than Bubble Sort or Selection Sort? 1283 00:52:21,016 --> 00:52:24,386 Well, those things are N squared or N times N. This is the same. 1284 00:52:24,386 --> 00:52:25,376 This is bigger. 1285 00:52:25,376 --> 00:52:28,086 It's because this second factor is bigger than this 1286 00:52:28,086 --> 00:52:31,476 that you actually get that N squared feel, 1287 00:52:31,596 --> 00:52:33,486 that slowness to the algorithm. 1288 00:52:33,756 --> 00:52:35,496 Well, let's just see this another way rather 1289 00:52:35,496 --> 00:52:37,016 than just based on buckets. 1290 00:52:37,526 --> 00:52:40,256 We can express this actually completely formulaically, 1291 00:52:40,256 --> 00:52:41,336 very simple formula. 1292 00:52:41,336 --> 00:52:44,766 So recall our implementation of this was just this algorithm. 1293 00:52:44,996 --> 00:52:46,136 We have a base case at the top, 1294 00:52:46,456 --> 00:52:48,766 then we've got some recursion going on, sort the left, 1295 00:52:48,766 --> 00:52:50,446 sort the right and then merge. 1296 00:52:50,766 --> 00:52:53,006 So let's now slap some running times on this. 1297 00:52:53,166 --> 00:52:54,216 And here is another takeaway. 1298 00:52:54,516 --> 00:52:56,946 You don't even need to write or see code in order 1299 00:52:56,946 --> 00:53:00,136 to analyze the running time of an algorithm. 1300 00:53:00,376 --> 00:53:02,536 We can just kind of infer it from the pseudo code. 1301 00:53:02,846 --> 00:53:06,266 So, I claim that the running time of this algorithm-- 1302 00:53:06,576 --> 00:53:11,276 call it T for time given an input of size N is gonna be zero 1303 00:53:11,466 --> 00:53:12,516 if N is less than two. 1304 00:53:12,756 --> 00:53:15,306 In other words, it takes me no time at all to sort a list 1305 00:53:15,696 --> 00:53:18,656 of size N if N is actually less than two, 1306 00:53:18,656 --> 00:53:20,206 if it's zero or one, right? 1307 00:53:20,206 --> 00:53:20,756 It's just done. 1308 00:53:21,246 --> 00:53:23,176 So, we'll claim the cost of that is zero. 1309 00:53:23,516 --> 00:53:25,816 Well, that's obviously not the running time or merge sort. 1310 00:53:25,816 --> 00:53:26,516 It's not zero. 1311 00:53:26,516 --> 00:53:28,876 There's got to be some approximation 1312 00:53:28,876 --> 00:53:30,396 of N log N, so let's get there. 1313 00:53:31,016 --> 00:53:32,936 So, what's the recursive case? 1314 00:53:33,096 --> 00:53:34,046 That was the base case. 1315 00:53:34,306 --> 00:53:35,856 So, in the recursive case, 1316 00:53:35,856 --> 00:53:37,986 what's the time involved to sort N elements? 1317 00:53:38,256 --> 00:53:40,336 Well, here is where we can kind of cheat with ourselves 1318 00:53:40,336 --> 00:53:42,146 and to say, "Hmm, it's this plus this." 1319 00:53:42,616 --> 00:53:45,626 The running time to sort N elements equals the running time 1320 00:53:45,626 --> 00:53:49,356 to sort N over 2 elements plus the running time to sort N 1321 00:53:49,426 --> 00:53:53,826 over 2 elements, left half plus right half plus the N is 1322 00:53:53,826 --> 00:53:54,256 the merging. 1323 00:53:55,016 --> 00:53:56,646 So again, a quick sanity check here. 1324 00:53:56,646 --> 00:53:59,606 If the very first step of the algorithm is sort the left half, 1325 00:53:59,606 --> 00:54:01,346 sort the right half, here is the left half, 1326 00:54:01,516 --> 00:54:03,216 here is the right half merging. 1327 00:54:03,416 --> 00:54:04,186 How do you merge? 1328 00:54:04,186 --> 00:54:05,916 Well, think about what Joseph was doing. 1329 00:54:05,916 --> 00:54:08,776 If I kind of do it with my fingers here, he was pointing 1330 00:54:08,776 --> 00:54:11,376 to the left end of the left half and the right end 1331 00:54:11,376 --> 00:54:14,156 of the right half, sorry, and the left end of right half. 1332 00:54:14,226 --> 00:54:15,056 And what was he doing? 1333 00:54:15,246 --> 00:54:17,236 He was comparing them, figuring out which is smaller. 1334 00:54:17,536 --> 00:54:20,176 Then he decided one is smaller, so he'll pick 1335 00:54:20,176 --> 00:54:21,336 that one out of the lineup. 1336 00:54:21,646 --> 00:54:22,966 Then what did he do with this finger? 1337 00:54:23,266 --> 00:54:26,456 He advanced it to the next left most elements of the left half. 1338 00:54:27,046 --> 00:54:27,876 What did he do next? 1339 00:54:28,146 --> 00:54:33,646 Well, in this here, here, then he would go here, here, here. 1340 00:54:33,766 --> 00:54:36,266 So think about how many times did my fingers move 1341 00:54:36,266 --> 00:54:37,866 from left to right. 1342 00:54:38,506 --> 00:54:39,986 Well, once per bucket, right? 1343 00:54:39,986 --> 00:54:41,346 I only moved to the left. 1344 00:54:41,346 --> 00:54:42,226 I never did this. 1345 00:54:42,226 --> 00:54:44,456 KC did a lot of this last week, going back 1346 00:54:44,456 --> 00:54:45,466 and forth, back and forth. 1347 00:54:45,466 --> 00:54:46,496 Joseph never did that. 1348 00:54:46,806 --> 00:54:48,296 So, where do we get this N factor? 1349 00:54:48,556 --> 00:54:51,086 You sort the left half, you sort the right half, 1350 00:54:51,086 --> 00:54:53,166 then you just essentially conceptually have to point 1351 00:54:53,166 --> 00:54:54,986 at each bucket once in order to move it 1352 00:54:54,986 --> 00:54:56,576 into place as Joseph did. 1353 00:54:57,266 --> 00:54:59,226 So, unfortunately this isn't the answer. 1354 00:54:59,226 --> 00:55:02,016 I don't even see a log N in here because now we have 1355 00:55:02,076 --> 00:55:04,356 to recursively answer our same question. 1356 00:55:04,356 --> 00:55:06,916 What's the running time involved in sorting N over 2 elements? 1357 00:55:07,426 --> 00:55:08,566 Well, what's it gonna be mentally? 1358 00:55:09,276 --> 00:55:14,306 T of N over 4 plus T of N over 4 plus N over 2, 1359 00:55:14,306 --> 00:55:17,166 but this is just gonna get very messy very quickly. 1360 00:55:17,446 --> 00:55:20,006 So, let's try to simplify this with just an example. 1361 00:55:20,126 --> 00:55:22,106 So that is in fact an algorithm. 1362 00:55:22,106 --> 00:55:24,046 It's a recursive formula rather. 1363 00:55:24,696 --> 00:55:26,326 Let's try to do this with actual numbers. 1364 00:55:26,566 --> 00:55:28,006 Suppose for the sake of discussion, 1365 00:55:28,256 --> 00:55:30,466 we've now got 16 buckets, 16 numbers just 1366 00:55:30,466 --> 00:55:32,326 so the math is a little more compelling than 8. 1367 00:55:32,806 --> 00:55:35,326 What's the running time to sort using this algorithm 1368 00:55:35,496 --> 00:55:36,146 16 elements? 1369 00:55:36,616 --> 00:55:40,036 Well, it's T of 8, plus T of 8, otherwise known 1370 00:55:40,036 --> 00:55:42,216 as 2 times T of 8 plus 16. 1371 00:55:42,796 --> 00:55:45,796 So to be clear, one of those T of 8s is the amount 1372 00:55:45,796 --> 00:55:48,106 of time involved to sorting 8 elements on the left, 1373 00:55:48,336 --> 00:55:49,906 the other T of 8 is 8 elements 1374 00:55:49,906 --> 00:55:51,826 on the right assuming a total of 16. 1375 00:55:52,136 --> 00:55:53,296 And what's the 16 for? 1376 00:55:54,856 --> 00:55:56,236 Merging those, right? 1377 00:55:56,356 --> 00:55:58,886 Taking 8 from both halves, merging them together 1378 00:55:58,886 --> 00:56:01,406 so that you get a collectively merged 16 elements. 1379 00:56:01,606 --> 00:56:03,596 So now we have to dive in and let's do this one 1380 00:56:03,596 --> 00:56:05,596 because it's relatively easy numerically. 1381 00:56:05,726 --> 00:56:06,516 What's T of 8? 1382 00:56:06,516 --> 00:56:09,656 Well, T of 8 is just 2 times T of 4 plus 8. 1383 00:56:10,686 --> 00:56:12,116 What's T of 4? 1384 00:56:12,116 --> 00:56:15,996 T of 4 is 2 times T of 2 plus 4. 1385 00:56:16,466 --> 00:56:18,096 What's T of 2? 1386 00:56:18,386 --> 00:56:21,486 T of 2 is 2 times T of 1 plus 2. 1387 00:56:21,486 --> 00:56:23,436 All right, I'm just halving, halving, halving. 1388 00:56:23,576 --> 00:56:24,756 So hopefully this is gonna bottom 1389 00:56:24,756 --> 00:56:26,596 out ad sure enough, what's T of 1? 1390 00:56:26,796 --> 00:56:29,786 If I hand you a list of size 1 and I say sort this just 1391 00:56:29,786 --> 00:56:31,956 as Joseph did, you hand it right back, you're done. 1392 00:56:32,356 --> 00:56:33,816 So now, what can we do? 1393 00:56:33,816 --> 00:56:35,766 Just algebraically, we can say all right, 1394 00:56:35,766 --> 00:56:40,256 well I've got this T of 1 here. 1395 00:56:40,486 --> 00:56:41,856 Well, I can plug this in here. 1396 00:56:41,856 --> 00:56:43,496 So this zero is gonna go here. 1397 00:56:43,666 --> 00:56:47,796 That's gonna give me two times zero which is plus two is two. 1398 00:56:47,946 --> 00:56:49,376 So now I know what T of 2 is. 1399 00:56:49,376 --> 00:56:51,036 So now I can plug that two in there. 1400 00:56:51,346 --> 00:56:55,426 So two times two is four, four plus four is eight. 1401 00:56:55,426 --> 00:56:57,166 So that means this thing is eight. 1402 00:56:57,166 --> 00:56:58,066 So let's change this. 1403 00:56:58,066 --> 00:57:04,176 So, two times eight, that's 16 plus eight, right, is-- 1404 00:57:04,176 --> 00:57:06,196 come on, I'm the only doing it, 24. 1405 00:57:06,196 --> 00:57:12,406 And if we keep going with this, what should we get in the end? 1406 00:57:12,406 --> 00:57:17,916 Sixty four, all right, so 64, quick sanity check then, so, 1407 00:57:17,916 --> 00:57:22,896 I claim that this algorithm runs in N times log N. So I claim 1408 00:57:22,896 --> 00:57:28,346 that that's gonna be 16 times log base 2 of 16. 1409 00:57:28,346 --> 00:57:29,776 Well, for those who remember, what's this? 1410 00:57:31,076 --> 00:57:32,016 It's four, right? 1411 00:57:32,016 --> 00:57:36,006 So if log base 2 of 16 is four, 16 times four is 64. 1412 00:57:36,266 --> 00:57:39,126 So sure enough, this is frankly proof by example 1413 00:57:39,206 --> 00:57:41,386 which is an illegitimate way of proving any point 1414 00:57:41,386 --> 00:57:42,956 that you ever might need to make academically. 1415 00:57:43,246 --> 00:57:47,056 However, it at least confirms, albeit with one simple example 1416 00:57:47,056 --> 00:57:50,596 that we are getting numbers along the lines of N log N. 1417 00:57:50,856 --> 00:57:53,506 And even though again, this is just one simple example, 1418 00:57:53,646 --> 00:57:56,096 recall just what the implications are 1419 00:57:56,096 --> 00:57:57,436 when you actually run this. 1420 00:57:57,646 --> 00:58:01,646 If I go again to this tier, let's choose reversed. 1421 00:58:01,646 --> 00:58:05,146 So now this is the worst possible scenario for all 1422 00:58:05,146 --> 00:58:06,456 of these algorithms Merge Sort, 1423 00:58:06,456 --> 00:58:08,086 recall is on the bottom left here. 1424 00:58:08,406 --> 00:58:10,086 Let's see just how well it performs 1425 00:58:10,086 --> 00:58:12,316 in this potentially worst case scenario. 1426 00:58:13,556 --> 00:58:15,406 Last time, recall we started with random. 1427 00:58:15,406 --> 00:58:17,766 So those other performers, Heap, Quick, Shell, 1428 00:58:17,766 --> 00:58:19,016 those are still looking pretty good. 1429 00:58:19,186 --> 00:58:20,266 Merge Sort is pretty good. 1430 00:58:20,626 --> 00:58:23,196 So we chose a decent one, Insertion Sort 1431 00:58:23,196 --> 00:58:25,416 which we didn't talk about a little slower but my God, 1432 00:58:25,416 --> 00:58:27,786 Bubble Sort and Selection Sort are gonna be here all day 1433 00:58:27,786 --> 00:58:29,086 with these example. 1434 00:58:29,216 --> 00:58:33,686 So that is in fact N log N. So, it's one thing 1435 00:58:33,686 --> 00:58:35,626 to see these things graphically and to kind 1436 00:58:35,626 --> 00:58:36,876 of reason through them. 1437 00:58:37,036 --> 00:58:39,216 Something one person did on the internet 1438 00:58:39,216 --> 00:58:41,856 which is actually quite fascinating is actually assigned 1439 00:58:42,216 --> 00:58:44,656 audible frequencies to the lengths of these bars 1440 00:58:44,656 --> 00:58:47,116 so that you can actually hear these various 1441 00:58:47,116 --> 00:58:48,086 sorting algorithms. 1442 00:58:48,126 --> 00:58:51,396 This is about a minute of audible animation. 1443 00:58:51,436 --> 00:58:52,976 Let me go ahead and pull this to the forefront. 1444 00:58:53,516 --> 00:58:57,516 [ Pause ] 1445 00:58:58,016 --> 00:58:58,083 [ Background Noise ] 1446 00:58:58,236 --> 00:59:00,976 >> This is Insertion Sort. 1447 00:59:01,516 --> 00:59:08,516 [ Sound ] 1448 00:59:09,016 --> 00:59:09,083 [ Background Noise ] 1449 00:59:09,083 --> 00:59:09,976 >> This is Bubble Sort. 1450 00:59:10,516 --> 00:59:15,516 [ Sound ] 1451 00:59:16,016 --> 00:59:16,083 [ Background Noise ] 1452 00:59:16,136 --> 00:59:18,686 >> Again, sound aside, notice that Bubble Sort true 1453 00:59:18,746 --> 00:59:20,816 to its name, the big values are kind of bubbling 1454 00:59:20,816 --> 00:59:24,976 up to the top slowly but bubbling up to the top. 1455 00:59:25,516 --> 00:59:36,516 [ Sound ] 1456 00:59:37,016 --> 00:59:37,083 [ Background Noise ] 1457 00:59:37,083 --> 00:59:37,976 >> Selection Sort. 1458 00:59:38,516 --> 00:59:41,516 [ Sound ] 1459 00:59:42,016 --> 00:59:42,083 [ Background Noise ] 1460 00:59:42,576 --> 00:59:44,946 >> And again, remember Selection Sort keeps plucking the smallest 1461 00:59:44,946 --> 00:59:46,186 element so it makes sense that it's looking correct 1462 00:59:46,186 --> 00:59:46,976 on the left but yet the right. 1463 00:59:47,516 --> 00:59:53,516 [ Sound ] 1464 00:59:54,016 --> 00:59:54,083 [ Background Noise ] 1465 00:59:54,146 --> 00:59:57,006 >> Merge Sort where you can really see the halves 1466 00:59:57,006 --> 00:59:57,926 that Joseph was merging. 1467 00:59:58,516 --> 01:00:03,766 [ Sound ] 1468 01:00:04,266 --> 01:00:09,516 [ Laughter ] 1469 01:00:10,016 --> 01:00:10,083 [ Background Noise ] 1470 01:00:10,083 --> 01:00:11,976 >> This is something called Gnome Sort. 1471 01:00:12,516 --> 01:00:23,886 [ Sound ] 1472 01:00:24,386 --> 01:00:35,756 [ Applause ] 1473 01:00:36,256 --> 01:00:37,326 >> So, that's it. 1474 01:00:37,476 --> 01:00:39,576 This is N log N, this is CS50. 1475 01:00:39,576 --> 01:00:40,716 We will see you on Wednesday. 1476 01:00:41,516 --> 01:00:43,516 [ Noise ] 1477 01:00:44,016 --> 01:01:00,776 [ Silence ]