1 00:00:00,506 --> 00:00:09,646 [ Silence ] 2 00:00:10,146 --> 00:00:11,706 >> Alright, welcome back to CS50. 3 00:00:11,736 --> 00:00:14,226 This is the end of Week 3. 4 00:00:14,226 --> 00:00:17,156 Couple of announcements, if you still need to make any changes 5 00:00:17,156 --> 00:00:19,496 to your section please do so today. 6 00:00:19,496 --> 00:00:22,396 You can email help@cs50.net otherwise follow the directions 7 00:00:22,586 --> 00:00:25,356 that were in the email that the bot sent you this weekend with-- 8 00:00:25,416 --> 00:00:26,846 regarding your section assignments. 9 00:00:27,156 --> 00:00:31,306 A few invitations, as well, so Zynga makers of Farmville 10 00:00:31,306 --> 00:00:33,036 which may appear a little too often 11 00:00:33,036 --> 00:00:37,166 in your Facebook walls is here today 6:30 p.m. 12 00:00:37,616 --> 00:00:40,486 in a recruiting capacity in Maxwell Dworkin 119. 13 00:00:40,486 --> 00:00:42,746 Maxwell Dworkin for those unfamiliar is the CS building 14 00:00:43,006 --> 00:00:45,416 down the streets on the left, on Oxford Street. 15 00:00:45,616 --> 00:00:49,226 It looks like there will be free food and free giveaways 16 00:00:49,226 --> 00:00:52,276 and bring resumes and win an HP Mini Notebook. 17 00:00:52,376 --> 00:00:53,676 So, that's 6:30 today. 18 00:00:53,956 --> 00:00:56,746 For those of you looking for interns or full-time roles, 19 00:00:57,246 --> 00:01:01,116 Brian Kernighan, is the man who thought me CS50 years ago. 20 00:01:01,116 --> 00:01:02,946 He is a faculty at Princeton and he's here 21 00:01:02,946 --> 00:01:05,586 on sabbatical this fall and he's actually gonna be giving a talk 22 00:01:05,686 --> 00:01:07,186 tomorrow and giving out ice cream. 23 00:01:07,536 --> 00:01:10,876 Tomorrow at 3:30 p.m. will be ice cream in Maxwell Dworkin 24 00:01:10,876 --> 00:01:13,146 which is actually a weekly tradition over at the School 25 00:01:13,146 --> 00:01:15,236 of Engineering followed by his talk 26 00:01:15,236 --> 00:01:17,626 at 4 p.m. I remember this man, 27 00:01:17,936 --> 00:01:21,176 frankly because his first lecture far out-- 28 00:01:21,176 --> 00:01:22,696 has far outshined mine. 29 00:01:22,996 --> 00:01:24,826 He stood up on stage, 30 00:01:24,826 --> 00:01:27,406 had apparently spent the summer growing a nice beard, 31 00:01:27,696 --> 00:01:30,776 and then his algorithm was not putting on socks but was how 32 00:01:30,776 --> 00:01:34,906 to shave using hedge clippers. 33 00:01:35,326 --> 00:01:37,796 It's something that stays with you for some time. 34 00:01:37,796 --> 00:01:40,936 But the talk he's giving on Thursday for those of interest 35 00:01:40,936 --> 00:01:44,876 and realize, too, as if my being in 50 now is any indication, 36 00:01:44,876 --> 00:01:48,376 he's very much adroit when it comes to speaking 37 00:01:48,376 --> 00:01:50,506 to nontechnical or less technical audiences. 38 00:01:50,506 --> 00:01:51,986 Do not fear we'll be over your head. 39 00:01:52,296 --> 00:01:53,256 "The Changing Phase 40 00:01:53,256 --> 00:01:55,286 of Programming" is his talk tomorrow. 41 00:01:55,546 --> 00:01:58,176 The rapid evolution of languages, tools, environments, 42 00:01:58,176 --> 00:02:01,046 and expectations presents major challenges and opportunities 43 00:02:01,046 --> 00:02:03,736 for programmers and for software engineering education. 44 00:02:03,916 --> 00:02:05,846 This is true across all kinds of programming 45 00:02:05,846 --> 00:02:07,846 but is especially true for web systems 46 00:02:07,846 --> 00:02:11,406 which are now routinely written in untyped scripting languages 47 00:02:11,406 --> 00:02:14,756 and include Ajax, Mashups, toolkits, frameworks like Rails 48 00:02:14,756 --> 00:02:15,926 and Django and a profusion 49 00:02:15,926 --> 00:02:18,196 of interfaces all operating asynchronously 50 00:02:18,196 --> 00:02:19,226 on distributed systems. 51 00:02:19,226 --> 00:02:20,646 Few new pieces of jargon there 52 00:02:20,646 --> 00:02:22,416 that we will touch upon towards semesters 53 00:02:22,416 --> 00:02:23,956 and the growing popularity 54 00:02:23,956 --> 00:02:26,176 of phone applications has not made life easier 55 00:02:26,176 --> 00:02:27,766 for programmers or instructors. 56 00:02:27,766 --> 00:02:30,106 For the past 10 years, I've been teaching a course 57 00:02:30,106 --> 00:02:32,296 on advanced programming techniques that is more 58 00:02:32,296 --> 00:02:34,606 and more stretched between important old material 59 00:02:34,756 --> 00:02:37,096 and new unproven material that might be important. 60 00:02:37,296 --> 00:02:39,316 In this talk I will illustrate some of the challenges 61 00:02:39,316 --> 00:02:41,686 and discuss ways in which we might use complexity 62 00:02:41,956 --> 00:02:44,056 and rapid change to advantage. 63 00:02:44,056 --> 00:02:46,366 He's a wonderful man and a wonderful speaker to be honest. 64 00:02:46,366 --> 00:02:48,546 So if you have some time tomorrow for a little ice cream 65 00:02:48,786 --> 00:02:52,466 and talk, please do feel welcome to join. 66 00:02:52,466 --> 00:02:54,346 And finally, Facebook. 67 00:02:54,516 --> 00:02:58,046 Facebook is a social networking website. 68 00:02:59,536 --> 00:03:01,996 [Laughter] They will be here on Monday in the form 69 00:03:01,996 --> 00:03:04,846 of Thomas Carriero, who's one of CS50's former head TFs, 70 00:03:04,846 --> 00:03:06,316 also in a recruiting capacity. 71 00:03:06,606 --> 00:03:08,236 Lunch will be served. 72 00:03:08,526 --> 00:03:12,676 For those unfamiliar or for those a little intrigued 73 00:03:12,676 --> 00:03:15,056 by the opportunities that await in Mountain View, 74 00:03:15,056 --> 00:03:17,386 Facebook is hiring brilliant engineers who want 75 00:03:17,386 --> 00:03:20,726 to make an impact as we grow and shape the way the world connects 76 00:03:20,726 --> 00:03:21,776 and shares with each other. 77 00:03:22,046 --> 00:03:24,546 No one has ever done what we're trying to do at such speed 78 00:03:24,546 --> 00:03:26,556 and scale on top of that we do it 79 00:03:26,556 --> 00:03:28,446 with only about 500 engineers. 80 00:03:28,446 --> 00:03:33,286 Our active user to engineer ratio is over one million users 81 00:03:33,286 --> 00:03:35,666 to one engineer, and growing. 82 00:03:35,956 --> 00:03:38,316 You won't be able to get that personal impact 83 00:03:38,316 --> 00:03:39,836 in any other tech company in the world. 84 00:03:39,836 --> 00:03:41,966 So if you'd like some lunch, if you'd like to reunite 85 00:03:41,996 --> 00:03:43,506 with an alumnus and like to hear a bit more 86 00:03:43,506 --> 00:03:44,886 about Facebook before class, 87 00:03:45,396 --> 00:03:50,386 Monday 12 p.m. Maxwell Dworkin 1 not G119, 88 00:03:50,386 --> 00:03:55,316 119 which is one floor up upstairs. 89 00:03:55,446 --> 00:04:00,446 Alright, so we left off last time with a whole bunch 90 00:04:00,446 --> 00:04:03,686 of pieces of paper on the wall and the challenges of the day, 91 00:04:03,926 --> 00:04:06,746 were here's in a array and unfortunately was unsorted. 92 00:04:06,746 --> 00:04:09,826 These were just semi random numbers in some random order 93 00:04:09,826 --> 00:04:12,506 and the challenge at hand was find the number 50. 94 00:04:12,726 --> 00:04:15,056 Now, a reasonable person as we-- 95 00:04:15,096 --> 00:04:17,406 as a reasonable person we might start at the left 96 00:04:17,406 --> 00:04:19,976 and then move our array right word and you might start 97 00:04:19,976 --> 00:04:21,556 to realize like this is a bit of a game 98 00:04:21,556 --> 00:04:23,016 and jump ahead at some point. 99 00:04:23,246 --> 00:04:26,656 But for the most part, you can't do much better than good luck 100 00:04:26,656 --> 00:04:30,346 when it comes to searching an array of unsorted integers. 101 00:04:30,346 --> 00:04:32,106 You might have some heuristics but again but at the end 102 00:04:32,106 --> 00:04:33,346 of the day, if you don't know anything 103 00:04:33,346 --> 00:04:35,836 about those numbers it really is just guess at work. 104 00:04:35,836 --> 00:04:37,816 Now in fairness, it was deliberate on my part 105 00:04:38,016 --> 00:04:39,756 that I put 50 all the way over here just 106 00:04:39,756 --> 00:04:41,886 for demonstration's sake, but in the real world 107 00:04:41,886 --> 00:04:44,066 that very much could happen and one of the themes today 108 00:04:44,066 --> 00:04:47,376 in moving forward is gonna be this notion of worst cases, 109 00:04:47,376 --> 00:04:48,626 worst possible inputs. 110 00:04:48,626 --> 00:04:49,796 And in the worse possible case 111 00:04:49,796 --> 00:04:52,926 in that small problem the thing I'm looking for is all the way 112 00:04:52,926 --> 00:04:56,326 at the end which means I have to do a huge amount of work just 113 00:04:56,326 --> 00:04:58,116 to find a single value. 114 00:04:58,116 --> 00:04:59,356 Now thankfully there're approaches 115 00:04:59,356 --> 00:05:00,456 that are more intelligent than that. 116 00:05:00,456 --> 00:05:02,346 When we did this in the first day with the phonebook 117 00:05:02,476 --> 00:05:05,856 and we did it last time with this second array of numbers 118 00:05:05,856 --> 00:05:07,936 and the key difference there was that the second array 119 00:05:07,936 --> 00:05:11,346 of numbers was sorted and this simple little detail was a 120 00:05:11,346 --> 00:05:12,716 huge advantage. 121 00:05:12,716 --> 00:05:13,526 Right, in the real world, 122 00:05:13,526 --> 00:05:15,056 you can imagine the nightmare it would be 123 00:05:15,336 --> 00:05:18,396 if Horizon just published a big book with numbers in the order 124 00:05:18,396 --> 00:05:20,406 in which people signed up for telephone numbers, 125 00:05:20,406 --> 00:05:21,336 right, from left to right. 126 00:05:21,376 --> 00:05:23,006 Whether you have to search the whole damn phonebook 127 00:05:23,006 --> 00:05:24,356 or at least half of it on average 128 00:05:24,606 --> 00:05:27,186 to find some random friend whose number you're looking up. 129 00:05:27,336 --> 00:05:28,216 Well same here. 130 00:05:28,216 --> 00:05:29,506 If you don't know anything about the numbers, 131 00:05:29,506 --> 00:05:32,236 you ultimately might have to check them all but not the case 132 00:05:32,466 --> 00:05:34,686 if those numbers are actually sorted. 133 00:05:34,686 --> 00:05:36,566 And so one of the questions we'll look at today is 134 00:05:36,816 --> 00:05:38,956 if you are given a seemingly random sequence 135 00:05:38,956 --> 00:05:40,836 of numbers not even certainly that large, 136 00:05:41,026 --> 00:05:43,046 what are your options for actually sorting this? 137 00:05:43,156 --> 00:05:45,446 Now, we'll use numbers because they're easy to talk about 138 00:05:45,766 --> 00:05:48,426 but in the context of Facebook for instance, 139 00:05:48,426 --> 00:05:50,076 you might wanna sort your friends. 140 00:05:50,076 --> 00:05:51,506 You might wanna find friends of friends. 141 00:05:51,506 --> 00:05:53,166 In the context of Google you might wanna search 142 00:05:53,166 --> 00:05:56,156 or sort web pages so certainly could you apply what we'll look 143 00:05:56,156 --> 00:05:59,336 at in this small context of numbers to really any type 144 00:05:59,336 --> 00:06:01,666 of objects that you can impose some sorting on. 145 00:06:02,026 --> 00:06:05,586 But first, some new tools for your toolkits, 146 00:06:05,586 --> 00:06:09,236 thus far printf has probably been your primary tool 147 00:06:09,526 --> 00:06:10,616 for debugging. 148 00:06:10,616 --> 00:06:13,546 To debug a program means to find the bugs and remove them 149 00:06:13,546 --> 00:06:16,336 or to chase down logical errors that you might have made 150 00:06:16,536 --> 00:06:18,496 or even syntax errors that you might have made. 151 00:06:18,496 --> 00:06:20,506 Now, if you've made a syntax error that is, 152 00:06:20,506 --> 00:06:23,046 the characters you've typed at the keyboard just don't belong 153 00:06:23,046 --> 00:06:25,356 where you put them, you put a semicolon in the wrong place, 154 00:06:25,356 --> 00:06:26,406 you forgot a parenthesis. 155 00:06:26,666 --> 00:06:28,696 Well, GCC nicely enough will just yell 156 00:06:28,696 --> 00:06:33,726 at you before it even finishes compiling telling you I can't 157 00:06:33,726 --> 00:06:35,446 compile this 'cause I don't know what you mean. 158 00:06:35,446 --> 00:06:37,606 You've expressed yourself in an incorrect way. 159 00:06:37,956 --> 00:06:40,686 But very often will you actually get your code to compile 160 00:06:40,956 --> 00:06:42,106 and you'll be able to run it 161 00:06:42,106 --> 00:06:44,116 and then you'll get segmentation fault, 162 00:06:44,116 --> 00:06:46,346 which we've seen already it refers somehow to memory 163 00:06:46,346 --> 00:06:47,826 and more on that next week as well. 164 00:06:48,136 --> 00:06:49,756 You might just get the wrong output, 165 00:06:49,756 --> 00:06:51,586 you might get the answer, you might get no answer 166 00:06:51,586 --> 00:06:54,596 or whatsoever, and at this point your code yes compiles 167 00:06:54,646 --> 00:06:57,356 but it's surely not correct and so you might go back 168 00:06:57,356 --> 00:07:00,156 in as you have been and sorting printf statements here or there, 169 00:07:00,156 --> 00:07:01,466 printing the value of this variable, 170 00:07:01,466 --> 00:07:03,386 printing out what the user just wrote just 171 00:07:03,386 --> 00:07:05,196 to do some sanity checks for yourself 172 00:07:05,196 --> 00:07:07,946 and presumably you delete those printf statements before you 173 00:07:07,946 --> 00:07:10,166 actually submit and call this program complete. 174 00:07:10,456 --> 00:07:11,916 But this is a little bit hackish, right? 175 00:07:11,916 --> 00:07:14,316 And its programs get longer than a few dozen lines, 176 00:07:14,316 --> 00:07:17,106 a couple of hundred lines inserting printf statements all 177 00:07:17,106 --> 00:07:19,266 over the place and then recompiling your code, 178 00:07:19,266 --> 00:07:21,686 rerunning it, going back and changing the printf statements. 179 00:07:21,986 --> 00:07:23,716 This really isn't the happiest workflow. 180 00:07:23,716 --> 00:07:26,986 It doesn't scale very well and it's just not very interactive. 181 00:07:26,986 --> 00:07:30,016 You have to kind of do-- make the change, compile, run. 182 00:07:30,116 --> 00:07:30,866 Make the change, compile. 183 00:07:31,156 --> 00:07:32,246 This is three steps just 184 00:07:32,246 --> 00:07:34,376 to accomplish a relatively simple goal, 185 00:07:34,376 --> 00:07:35,666 which is find my problem. 186 00:07:35,906 --> 00:07:37,386 So thankfully, there exist tools. 187 00:07:37,386 --> 00:07:40,206 And one we'll look at today is called GDB, the new debugger. 188 00:07:40,206 --> 00:07:41,936 This is a program that you run 189 00:07:41,936 --> 00:07:44,176 at the command line just like GCC itself. 190 00:07:44,316 --> 00:07:46,756 But what happen is once you compile your program 191 00:07:46,756 --> 00:07:49,046 with GCC you of course get your executable, 192 00:07:49,046 --> 00:07:52,046 your binary the program like a.out or czar 193 00:07:52,046 --> 00:07:53,326 or whatever the program is called. 194 00:07:53,676 --> 00:07:57,566 What you do with GDB is you run GDB but pass it. 195 00:07:57,566 --> 00:07:59,956 Your program is an argument and what GDB does 196 00:07:59,956 --> 00:08:04,286 for you is essentially runs your program for you line by line 197 00:08:04,286 --> 00:08:08,986 by line but slowly so that you can watch things happen step 198 00:08:09,026 --> 00:08:11,856 by step and you can set what we'll call breakpoints which is 199 00:08:11,856 --> 00:08:13,846 to say, if you only wanna see what's going 200 00:08:13,846 --> 00:08:15,696 on at line 20 'cause you're pretty sure that's 201 00:08:15,696 --> 00:08:17,496 where you're having some issues, you don't have to step 202 00:08:17,536 --> 00:08:18,746 to the first 19 steps. 203 00:08:18,746 --> 00:08:23,016 You can just say break at line 20, enter, then run the program. 204 00:08:23,056 --> 00:08:26,686 It will run terribly fast and then pause at line 20 for you 205 00:08:26,686 --> 00:08:29,116 to then poke around at what's going on inside the computer. 206 00:08:29,116 --> 00:08:31,486 So it's gonna feel a little arcane at first. 207 00:08:31,486 --> 00:08:33,976 In terms of interface, you'll find a nice cheat sheet 208 00:08:33,976 --> 00:08:36,226 on CS50's website on the resources page. 209 00:08:36,226 --> 00:08:38,426 But what I thought I'd do today and you'll see in sections 210 00:08:38,426 --> 00:08:41,186 in the weeks to come are some new tricks of the trade 211 00:08:41,186 --> 00:08:43,206 and to be honest with just a few commands 212 00:08:43,206 --> 00:08:45,636 that I'll use myself here, you can get a lot done. 213 00:08:45,636 --> 00:08:48,506 So I'm gonna go back in time to a program we used last week 214 00:08:48,796 --> 00:08:50,726 which was called Buggy3. 215 00:08:50,726 --> 00:08:53,956 As the name implied it was in fact buggy not 216 00:08:53,956 --> 00:08:56,566 because it didn't compile or because there were syntax errors 217 00:08:56,806 --> 00:08:57,936 but what was the problem 218 00:08:57,936 --> 00:08:59,686 with this swapping program if you recall? 219 00:08:59,746 --> 00:09:01,236 >> It didn't actually swap. 220 00:09:01,336 --> 00:09:02,876 >> It didn't actually swap, right. 221 00:09:02,876 --> 00:09:05,526 I claimed this swapped 2 numbers and yet we dug 222 00:09:05,526 --> 00:09:06,356 in a little deeper, right? 223 00:09:06,356 --> 00:09:08,436 I used a printf ad nauseam here 224 00:09:08,566 --> 00:09:11,596 and it did actually swap those numbers when I added a printf 225 00:09:11,596 --> 00:09:15,686 at the very bottom of my program here but then as soon 226 00:09:15,686 --> 00:09:19,086 as the program-- as soon as swap the function returned-- 227 00:09:20,166 --> 00:09:21,396 just looking for my laser pointer, 228 00:09:21,716 --> 00:09:24,916 as soon as swap returned it seemed all the hard work we have 229 00:09:24,916 --> 00:09:27,886 done was for nothing and the values remained unchanged 230 00:09:27,886 --> 00:09:28,296 in main. 231 00:09:28,296 --> 00:09:30,396 Alright, so this is an approach that you might reasonably take 232 00:09:30,396 --> 00:09:33,926 with printf, printing out every thing you wanna see line by line 233 00:09:34,136 --> 00:09:36,756 and then as I did too, once I was done I deleted that line 234 00:09:36,756 --> 00:09:39,226 from swap 'cause I don't want it printed out all over the place. 235 00:09:39,226 --> 00:09:41,686 I just wanted it to print it out for debugging techniques. 236 00:09:41,686 --> 00:09:42,836 Well, let's do this instead. 237 00:09:43,066 --> 00:09:44,956 Let me go ahead and quit nano. 238 00:09:45,236 --> 00:09:47,596 I'm gonna go ahead and make Buggy3. 239 00:09:47,916 --> 00:09:48,786 And notice this. 240 00:09:49,136 --> 00:09:50,816 There's a whole bunch of stuff that's been flashing 241 00:09:50,816 --> 00:09:53,756 on the screen in the past few couple of weeks and some 242 00:09:53,756 --> 00:09:57,306 of them we've talked a little bit about for instance -LCS50, 243 00:09:57,366 --> 00:09:58,636 it's wrapping from over there. 244 00:09:58,906 --> 00:10:01,636 That of course means link in CS50 library, -lm, 245 00:10:01,676 --> 00:10:03,976 little sanity check is the math library. 246 00:10:04,176 --> 00:10:05,996 >> So you can use functions like round 247 00:10:05,996 --> 00:10:07,286 and a few others if you need them. 248 00:10:07,486 --> 00:10:10,256 For those doing the hacker edition, -lcrypt means-- 249 00:10:13,136 --> 00:10:15,436 so use the crypt function. 250 00:10:15,436 --> 00:10:18,016 So for those tackling the hacker edition, it involves cracking 251 00:10:18,016 --> 00:10:19,286 or decrypting passwords 252 00:10:19,486 --> 00:10:22,516 and there exist a function called crypt for that 253 00:10:22,516 --> 00:10:25,766 and it is among other functions in a library called crypt. 254 00:10:26,006 --> 00:10:27,026 Now, what's this other stuff? 255 00:10:27,186 --> 00:10:30,666 Well, some of these things are flags, as we call them, 256 00:10:30,666 --> 00:10:32,166 that have been configured by us. 257 00:10:32,166 --> 00:10:34,426 We've pre-configured Make on the cloud for you. 258 00:10:34,746 --> 00:10:39,096 And though you might resent us for this -Wall 259 00:10:39,436 --> 00:10:42,016 or rather -Werror means even 260 00:10:42,016 --> 00:10:44,446 if there's the slightest syntactical mistake 261 00:10:44,446 --> 00:10:47,026 in this program or thing you really shouldn't do, 262 00:10:47,026 --> 00:10:49,736 yell at the student and refuse to compile the code. 263 00:10:49,806 --> 00:10:52,336 So, this is deliberate in that it really forces you 264 00:10:52,336 --> 00:10:54,686 to get things perfectly right but there are some side effects. 265 00:10:54,686 --> 00:10:57,176 Some of you might have gotten annoyed at the computer or at us 266 00:10:57,396 --> 00:10:59,206 because you declare a variable X 267 00:10:59,346 --> 00:11:01,246 and you're not quite ready to use it yet. 268 00:11:01,246 --> 00:11:02,696 You're just, you know, you wanna use X 269 00:11:02,756 --> 00:11:04,086 but you wanna write some more code first 270 00:11:04,086 --> 00:11:06,016 and it won't let you even compile 'cause you're not using 271 00:11:06,016 --> 00:11:08,656 X. Well again, this is a pedagogical tool 272 00:11:08,656 --> 00:11:10,536 and it's also frankly a matter of good practice 273 00:11:10,766 --> 00:11:12,346 because there's too much code out there, 274 00:11:12,576 --> 00:11:15,396 too much code that's vulnerable out there that just hasn't been 275 00:11:15,396 --> 00:11:18,056 as rigorously checked as this little configuration options 276 00:11:18,196 --> 00:11:18,856 allow for. 277 00:11:19,026 --> 00:11:20,616 Now, for today, we're taking-- 278 00:11:20,646 --> 00:11:25,176 making use of this guy over here, -ggdb is a flag 279 00:11:25,176 --> 00:11:26,766 that we've been secretly including 280 00:11:26,766 --> 00:11:27,936 and make all this time. 281 00:11:27,936 --> 00:11:29,536 But what it does is I think I said a couple 282 00:11:29,536 --> 00:11:32,036 of weeks ago is it adds to your program. 283 00:11:32,036 --> 00:11:34,606 You're a.out, your czar, some additional bits 284 00:11:35,126 --> 00:11:39,326 that re-help a program like GDB follow along where you are. 285 00:11:39,326 --> 00:11:41,576 It includes things like line numbers essentially. 286 00:11:41,576 --> 00:11:42,476 So that when you're walking 287 00:11:42,476 --> 00:11:45,126 through what is otherwise just a file with 0s and 1s, 288 00:11:45,276 --> 00:11:47,766 there're still some reminders inside your file as to 289 00:11:47,866 --> 00:11:49,486 where those 0s and 1s came from. 290 00:11:49,486 --> 00:11:52,146 What original line, so if you are still in a habit 291 00:11:52,146 --> 00:11:55,456 of running GCC manually at the prompt, that's fine 292 00:11:55,686 --> 00:11:58,036 but henceforth if you wanted to use GDB, you're gonna have 293 00:11:58,066 --> 00:12:00,876 to include this flag as well manually. 294 00:12:00,876 --> 00:12:02,656 So, we're all ready at the point frankly where you might 295 00:12:02,656 --> 00:12:05,046 as well just be using Make 'cause it automates all of this 296 00:12:05,046 --> 00:12:07,086 for you and saves you a whole lot of typing. 297 00:12:07,386 --> 00:12:10,706 So now, I'm gonna go ahead and run the program itself 298 00:12:10,706 --> 00:12:12,066 which is called Buggy3. 299 00:12:12,486 --> 00:12:14,816 And of course, it seems to in fact be buggy. 300 00:12:14,816 --> 00:12:16,416 And here are my printf statements that seemed 301 00:12:16,416 --> 00:12:18,786 to be confirming that it is not in fact working. 302 00:12:19,116 --> 00:12:23,956 And just to be clear, let me go back in nano Buggy3 and recall 303 00:12:23,956 --> 00:12:26,336 that what we did last time was I did a little sanity check. 304 00:12:26,336 --> 00:12:32,386 I said, printf A equals percent D, new line let's say, 305 00:12:32,776 --> 00:12:35,256 comma A. So I wanna print out the value of A 306 00:12:35,256 --> 00:12:39,276 and then I wanna print out the value of B. 307 00:12:39,276 --> 00:12:40,666 So this was my sanity check. 308 00:12:41,156 --> 00:12:44,476 So I'm gonna save, Enter, I'm gonna recompile with Make, 309 00:12:44,796 --> 00:12:46,856 and I'm gonna re-run Buggy3, 310 00:12:46,856 --> 00:12:48,426 and sure enough it seems to be working. 311 00:12:48,426 --> 00:12:50,836 It starts off at 1-2, becomes 2-1, 312 00:12:50,836 --> 00:12:52,626 but then reverses back to 1-2. 313 00:12:52,626 --> 00:12:54,196 And this is how we concluded definitively. 314 00:12:54,486 --> 00:12:56,106 This thing is not working as intended. 315 00:12:56,106 --> 00:12:58,486 So let me remove these superfluous printf statements 316 00:12:58,486 --> 00:13:00,596 'cause again, this isn't the best habit long term 317 00:13:00,596 --> 00:13:01,186 to get into. 318 00:13:01,186 --> 00:13:03,456 Let me recompile, so I have the freshest copy. 319 00:13:03,676 --> 00:13:05,296 But now instead of running Buggy3, 320 00:13:05,296 --> 00:13:09,136 I'm gonna run GDB space Buggy3 and then Enter. 321 00:13:09,376 --> 00:13:11,326 You get a whole lot of stuff that's pretty useless, 322 00:13:11,626 --> 00:13:13,196 just warrant the information and all this 323 00:13:13,196 --> 00:13:14,796 but then you get a GDB prompt. 324 00:13:15,166 --> 00:13:17,666 Alright, at the GDB prompt I get to do a few things, 325 00:13:17,666 --> 00:13:19,096 among them run my program. 326 00:13:19,346 --> 00:13:21,946 So one of the new tricks today, and again all this is documented 327 00:13:21,946 --> 00:13:24,516 in a cheat sheet on CS50.net/resources. 328 00:13:24,516 --> 00:13:25,606 I'm gonna hit Enter. 329 00:13:26,046 --> 00:13:28,876 It apparently runs your program, so not all that useful yet. 330 00:13:28,956 --> 00:13:30,866 I could've done this, you know, a second ago 331 00:13:30,866 --> 00:13:33,916 at the original command line, but notice this as a teaser. 332 00:13:34,156 --> 00:13:37,396 When a GDB tells you program exited normally, 333 00:13:37,646 --> 00:13:40,826 it turns out that means that main returned what? 334 00:13:40,826 --> 00:13:40,996 >> Zero. 335 00:13:41,466 --> 00:13:42,156 >> Zero, right. 336 00:13:42,156 --> 00:13:47,306 So as promised, returning 1 or 2 or 3 or 0 for success cases 337 00:13:47,306 --> 00:13:50,616 in main thus have utility when you're using tools like this 338 00:13:50,616 --> 00:13:52,706 so that you can actually see what's being returned 339 00:13:52,706 --> 00:13:55,176 or be informed that it in fact all was well. 340 00:13:55,396 --> 00:13:57,556 So this was not all that useful though, just running my program, 341 00:13:57,596 --> 00:13:58,666 I wanna know what's wrong. 342 00:13:58,876 --> 00:14:00,286 So I'm gonna go ahead and do this. 343 00:14:00,286 --> 00:14:03,246 I'm gonna type break and then I'm gonna type main. 344 00:14:03,246 --> 00:14:05,066 What' nice and let me move this up to the top, 345 00:14:05,066 --> 00:14:06,706 I'm gonna type break main 346 00:14:06,876 --> 00:14:08,576 and this step is what's called a breakpoint. 347 00:14:08,576 --> 00:14:12,026 A pause point in the program at the function called main, 348 00:14:12,236 --> 00:14:13,966 notice I get a slightly cryptic output 349 00:14:13,966 --> 00:14:16,456 which is breakpoint number 1 is now at OX, 350 00:14:16,456 --> 00:14:17,756 and what does this mean perhaps? 351 00:14:17,756 --> 00:14:19,066 [ Inaudible Remark ] 352 00:14:19,066 --> 00:14:20,636 >> Yes, so this is the address in memory. 353 00:14:20,636 --> 00:14:22,486 So we'll talk more about this overtime. 354 00:14:22,486 --> 00:14:24,516 Frankly, it's used for silly things, like photoshop, 355 00:14:24,516 --> 00:14:25,856 colors, and web colors. 356 00:14:25,896 --> 00:14:29,966 But hexadecimal notation is just another base system like binary 357 00:14:29,966 --> 00:14:32,046 which we've discussed, decimal which we've grown up with, 358 00:14:32,236 --> 00:14:35,616 0x typically denotes hexadecimal which is base 16, 359 00:14:35,866 --> 00:14:37,926 and as we'll see this is just a more compact way 360 00:14:37,926 --> 00:14:39,566 of representing numbers so that they don't get 361 00:14:39,566 --> 00:14:40,536 ridiculously long. 362 00:14:40,826 --> 00:14:44,306 So this just means in RAM, in my program's memory space, 363 00:14:44,356 --> 00:14:49,196 in RAM lives a function called main, the so-called text segment 364 00:14:49,196 --> 00:14:50,916 from last time but more on that to come. 365 00:14:51,266 --> 00:14:53,576 And that is where it happens to live, is this relevant, 366 00:14:53,656 --> 00:14:55,176 not so much for us just yet, 367 00:14:55,346 --> 00:14:57,996 but as things get more advanced then we get more savvy, 368 00:14:58,266 --> 00:14:59,706 certainly details like that can help. 369 00:14:59,756 --> 00:15:02,666 But for us, it looks like we've set a breakpoint in this file 370 00:15:02,666 --> 00:15:05,386 at line 21 'cause that's apparently where main began. 371 00:15:05,846 --> 00:15:09,026 So now if I run and hit Enter, 372 00:15:09,646 --> 00:15:12,986 notice that execution does break at line 21. 373 00:15:13,246 --> 00:15:15,186 So again, the output is a little cryptic at first 374 00:15:15,186 --> 00:15:16,666 but this is just telling me what happened. 375 00:15:17,056 --> 00:15:20,706 This is saying, here's line 21, here's what's on line 21 376 00:15:20,706 --> 00:15:22,916 but it hasn't executed line 21 yet. 377 00:15:23,096 --> 00:15:27,776 In fact, it won't until I actually hit the command Next. 378 00:15:27,776 --> 00:15:30,996 So I'm gonna go ahead and hit Next and now I'm on line 22, 379 00:15:31,296 --> 00:15:34,046 X has already been assigned the value of 1 and which I confirm 380 00:15:34,046 --> 00:15:36,476 by typing the command Print, so it is not printf or anything 381 00:15:36,476 --> 00:15:37,656 like that, there's no parenthesis. 382 00:15:37,656 --> 00:15:39,236 These are just commands in a program, 383 00:15:39,236 --> 00:15:42,036 not functions, print X, Enter. 384 00:15:42,336 --> 00:15:44,386 And in fact, X equals 1. 385 00:15:44,386 --> 00:15:45,806 Now as this thing here, 386 00:15:45,806 --> 00:15:48,716 the dollar-sign notation is just handy if you wanna get fancy 387 00:15:48,716 --> 00:15:51,706 and refer back to the same answer later on, kinda of like 388 00:15:51,706 --> 00:15:54,016 in a fancy graphing calculator allows you to do. 389 00:15:54,276 --> 00:15:56,956 But for now, what we care about is that the answer is 1. 390 00:15:56,956 --> 00:16:00,976 Meanwhile, if I type print Y-- well, that's weird. 391 00:16:02,066 --> 00:16:07,106 How did Y become 3,223,540? 392 00:16:07,666 --> 00:16:12,246 What-- 'Cause we haven't initialized it yet, right? 393 00:16:12,246 --> 00:16:13,536 So this is 1 already. 394 00:16:13,536 --> 00:16:15,976 A visual lesson that if you don't' initialize variables 395 00:16:15,976 --> 00:16:18,736 in a program to something, who knows what they're gonna be. 396 00:16:18,736 --> 00:16:21,336 This computer, the clouds have been on for several days time. 397 00:16:21,546 --> 00:16:23,446 Even this program did some other stuff 398 00:16:23,446 --> 00:16:24,816 when we first started it up, on-- 399 00:16:24,816 --> 00:16:26,896 but known to us or that we didn't see visually. 400 00:16:27,156 --> 00:16:29,596 And surely the program might have used some 401 00:16:29,596 --> 00:16:32,206 of the memory that's allocated to my program 402 00:16:32,206 --> 00:16:34,116 for other purposes if temporarily. 403 00:16:34,346 --> 00:16:37,106 But now Y happens to live at a certain location 404 00:16:37,106 --> 00:16:39,966 and it just has this garbage value, a meaningless value 405 00:16:39,966 --> 00:16:43,196 that I should absolutely not trust is anything legitimate 406 00:16:43,196 --> 00:16:45,306 if I haven't assigned it a value explicitly 407 00:16:45,526 --> 00:16:47,206 as I'm about to in line 22. 408 00:16:47,206 --> 00:16:50,096 So even though I typed a few commands, I have not preceded 409 00:16:50,096 --> 00:16:51,626 to continue executing this program. 410 00:16:51,626 --> 00:16:53,056 We're still at line 22. 411 00:16:53,286 --> 00:16:56,736 So now if type Next or if I'm getting a little lazy 412 00:16:56,736 --> 00:16:59,546 with keystrokes, you can just type N, the first letter or two 413 00:16:59,546 --> 00:17:02,336 of each command, N. Now I'm at line 24. 414 00:17:02,336 --> 00:17:06,186 So now if I do print Y, I should hopefully see 415 00:17:06,186 --> 00:17:07,336 in fact the answer is 2. 416 00:17:07,336 --> 00:17:09,146 And again, ignore the dollar-sign stuff for now. 417 00:17:09,146 --> 00:17:11,066 It's what's on the right hand side that's interesting. 418 00:17:11,346 --> 00:17:13,726 Alright, so if I now type, oops, Next again, 419 00:17:14,026 --> 00:17:15,866 this printf line is going to execute. 420 00:17:15,866 --> 00:17:19,186 So I'm gonna see the programs output commingled with GDBs. 421 00:17:19,496 --> 00:17:20,426 Alright, so there it is. 422 00:17:20,426 --> 00:17:23,866 X is 1 on the left then I'm at line 25, let me go ahead 423 00:17:23,866 --> 00:17:25,426 and I'm gonna scroll things up to the top 424 00:17:25,426 --> 00:17:26,576 so you can see in front. 425 00:17:27,216 --> 00:17:30,736 Let's type Next again, Y is 2, I claim in the moment 426 00:17:30,736 --> 00:17:33,016 that we're swapping dot, dot, dot, so let's do Next. 427 00:17:33,516 --> 00:17:35,556 Alright, so here's an interesting decision point. 428 00:17:35,746 --> 00:17:37,146 This is a function call. 429 00:17:37,146 --> 00:17:39,236 I am about to on line 27, 430 00:17:39,296 --> 00:17:41,876 call the function Swap, which I wrote. 431 00:17:42,116 --> 00:17:45,056 Remember it's lower in the file, called buggy3.c. I'm 432 00:17:45,056 --> 00:17:47,926 about to call this function and so I can. 433 00:17:47,926 --> 00:17:49,216 I'm gonna go ahead and do Next. 434 00:17:49,676 --> 00:17:51,156 That function will execute. 435 00:17:51,436 --> 00:17:54,046 Something will hopefully happen to X and Y, let's check. 436 00:17:54,386 --> 00:17:57,916 Print X-- uh-oh, print Y, still nothing. 437 00:17:57,916 --> 00:18:00,296 So that function seems in fact to have had no effect. 438 00:18:00,846 --> 00:18:02,216 Now, this was not all that useful 439 00:18:02,216 --> 00:18:04,536 in exercise 'cause now my program is pretty much done, 440 00:18:04,536 --> 00:18:04,726 right? 441 00:18:04,726 --> 00:18:06,996 I claimed swap, I then claimed 442 00:18:07,216 --> 00:18:09,176 that X is something, Y is something. 443 00:18:09,386 --> 00:18:10,816 I now hit this curly brace. 444 00:18:10,816 --> 00:18:11,926 I hit Next again. 445 00:18:12,206 --> 00:18:14,396 Now I'm this weird part of my world 446 00:18:14,396 --> 00:18:16,526 because remember I've used other functions, 447 00:18:16,526 --> 00:18:17,746 I've used printf and the like. 448 00:18:18,016 --> 00:18:18,906 So don't get distracted 449 00:18:18,906 --> 00:18:20,546 if you see stuff that's clearly not yours. 450 00:18:20,546 --> 00:18:21,336 I know I'm at the end 451 00:18:21,336 --> 00:18:24,326 of my program 'cause I already saw this curly brace pass me by. 452 00:18:24,706 --> 00:18:26,616 So I'm just gonna go ahead and type continue 453 00:18:26,846 --> 00:18:28,706 which undoes the effect of brake, 454 00:18:28,706 --> 00:18:29,736 it just says, "Keep going. 455 00:18:29,736 --> 00:18:30,826 Forget about this breakpoint. 456 00:18:30,996 --> 00:18:31,826 I'm done caring. 457 00:18:32,116 --> 00:18:34,196 And now, I'm in fact done with the program." 458 00:18:34,196 --> 00:18:37,686 It finished up whatever leg work it needed to do at the end. 459 00:18:38,226 --> 00:18:39,586 Alright, so this was not all that helpful. 460 00:18:39,586 --> 00:18:42,236 All I did was very slowly discover what I'd already 461 00:18:42,236 --> 00:18:44,556 discovered before with printf but here's the power 462 00:18:44,556 --> 00:18:46,326 of something like GDB, let's redo this. 463 00:18:46,326 --> 00:18:47,726 Let me rerun Run. 464 00:18:47,936 --> 00:18:49,556 It's gonna automatically start at the beginning. 465 00:18:49,556 --> 00:18:51,886 It still have that breakpoint, so I'm gonna hit again. 466 00:18:52,126 --> 00:18:53,886 And now let me step through a little more quickly. 467 00:18:53,886 --> 00:18:58,436 Next, with N, next, next, next, next. 468 00:18:58,506 --> 00:19:00,076 Okay, here's an interesting point. 469 00:19:00,336 --> 00:19:03,526 Now I wanna step into this function 'cause what's powerful 470 00:19:03,526 --> 00:19:04,836 about a debugger is you can poke 471 00:19:04,836 --> 00:19:06,616 around anywhere you want in your program. 472 00:19:06,616 --> 00:19:09,356 So I'm gonna type a Step instead of Next. 473 00:19:09,356 --> 00:19:12,596 Step means step into this function whereas Next means step 474 00:19:12,596 --> 00:19:15,696 over this function while still executing it, Enter. 475 00:19:16,006 --> 00:19:17,566 And now notice what I get as output. 476 00:19:17,566 --> 00:19:20,426 I am being informed, okay, I am in the scope 477 00:19:20,426 --> 00:19:22,406 as we've discussed of swap. 478 00:19:22,636 --> 00:19:25,636 It was past these two arguments, A is 1, B is 2, 479 00:19:25,836 --> 00:19:27,746 and the line I'm currently paused 480 00:19:27,746 --> 00:19:29,766 at is line 41 of my program. 481 00:19:29,766 --> 00:19:32,136 So here, too, it's helpful if you have two terminal 482 00:19:32,136 --> 00:19:34,676 or two putty windows open, have your source code in one, 483 00:19:34,676 --> 00:19:36,206 have GDB in the left hand one 484 00:19:36,336 --> 00:19:38,856 and you can watch things progress in the big window. 485 00:19:39,436 --> 00:19:40,136 So, let's see. 486 00:19:40,416 --> 00:19:41,246 This is swap. 487 00:19:41,366 --> 00:19:44,786 Actually I can also type List if you wanna see in the context 488 00:19:44,786 --> 00:19:46,956 of GDB, what code you're currently dealing with, 489 00:19:46,956 --> 00:19:48,736 with the line numbers on the left hand side, 490 00:19:48,986 --> 00:19:51,686 but I'm on line 41 as this thing here says. 491 00:19:52,096 --> 00:19:53,856 So this is uninteresting at the moment. 492 00:19:53,856 --> 00:19:54,826 So let's do a quick check. 493 00:19:54,996 --> 00:19:58,206 Print the value of temp and let me move it to the top. 494 00:19:58,206 --> 00:19:59,936 I'm gonna print the value of temp, alright 0. 495 00:20:00,696 --> 00:20:02,516 >> So it may be the case that just 496 00:20:02,516 --> 00:20:05,216 by happened stance a variable seems to be initialized 497 00:20:05,216 --> 00:20:08,556 to some value very often 0 but that's clearly not a guarantee. 498 00:20:08,556 --> 00:20:09,606 So do not trust that. 499 00:20:09,936 --> 00:20:12,576 So now let's go ahead and type Next, execute that line of code. 500 00:20:12,576 --> 00:20:14,546 So now if I print temp, it should be-- 501 00:20:14,546 --> 00:20:20,546 if you recall I passed in 1 and 2 and I assigned it the value 502 00:20:20,546 --> 00:20:22,846 of A. So in fact, yup, it's 1 here. 503 00:20:22,956 --> 00:20:26,006 So a-- temp is currently 1. 504 00:20:26,326 --> 00:20:28,026 Now I'm gonna clobber the value of B. 505 00:20:28,066 --> 00:20:29,146 So let's do quick sanity check. 506 00:20:29,206 --> 00:20:32,636 Print A, print B, and that's consistent with what I passed 507 00:20:32,866 --> 00:20:35,926 in to swap, but here we go, we're about to execute line 42 508 00:20:35,926 --> 00:20:37,946 and I'm gonna do this just by typing Next. 509 00:20:38,456 --> 00:20:40,756 Next. Alright, so now I'm gonna do it again. 510 00:20:40,826 --> 00:20:43,996 Print A, print B, and there it is. 511 00:20:43,996 --> 00:20:46,396 So we are half way through the swapping process 'cause now A 512 00:20:46,466 --> 00:20:49,176 and B happen to be the value of 2. 513 00:20:49,176 --> 00:20:50,246 Now what can we do here? 514 00:20:50,506 --> 00:20:54,676 Well, I can go ahead and just say, okay next, next, 515 00:20:54,676 --> 00:20:56,666 I'm kinda losing interest in this and so as soon 516 00:20:56,666 --> 00:21:00,106 as that function returns, notice it reminds me, I'm back in main, 517 00:21:00,346 --> 00:21:02,806 I'm at line 28 and you're about to do this. 518 00:21:03,236 --> 00:21:06,056 So in short, you have this power with just a few commands, Run, 519 00:21:06,456 --> 00:21:10,406 Break, Next, Step, and Continue. 520 00:21:10,406 --> 00:21:11,916 You can do a lot of useful things. 521 00:21:11,916 --> 00:21:13,766 And there is yet more that we'll look at in this section 522 00:21:13,766 --> 00:21:15,016 and is documented online. 523 00:21:15,016 --> 00:21:17,536 But let me show you one last trick that's not terribly useful 524 00:21:17,536 --> 00:21:19,056 for this program 'cause it's so short 525 00:21:19,276 --> 00:21:21,596 but when we start writing programs that have multiple, 526 00:21:21,596 --> 00:21:24,126 multiple, multiple functions that you wrote, that get called 527 00:21:24,126 --> 00:21:25,856 in succession, watch what you can do. 528 00:21:25,856 --> 00:21:28,106 I'm gonna hit Run and because it's not finished, 529 00:21:28,106 --> 00:21:30,576 it's gonna yell at me start it from the beginning, that's fine. 530 00:21:31,006 --> 00:21:33,866 I'm done with this iteration. 531 00:21:33,866 --> 00:21:34,786 So, Y for Yes. 532 00:21:35,376 --> 00:21:38,206 I'm at the top of my program, I broke in as usual. 533 00:21:38,476 --> 00:21:42,106 I'm gonna go ahead and type Next, Next, and you can do up-- 534 00:21:42,306 --> 00:21:43,666 use the up arrows as before. 535 00:21:43,666 --> 00:21:45,456 Next, I don't wanna take Next here. 536 00:21:45,456 --> 00:21:47,716 I wanna hit Step to dive in. 537 00:21:48,046 --> 00:21:50,796 And now suppose this is a reasonably sized program, 538 00:21:50,796 --> 00:21:53,106 I'm actually getting disoriented with where I am, 539 00:21:53,106 --> 00:21:55,036 what I've stepped in to and what not, that's fine. 540 00:21:55,036 --> 00:21:56,316 You can do a little sanity check, 541 00:21:56,316 --> 00:21:57,896 you can type the command back trace. 542 00:21:58,306 --> 00:22:00,546 And as the name implies, it kind of looks back in time 543 00:22:00,546 --> 00:22:02,716 at where you came from and what it says is this, 544 00:22:03,266 --> 00:22:05,276 the place I am now is the first line. 545 00:22:05,276 --> 00:22:06,936 I'm in frame 0. 546 00:22:07,246 --> 00:22:10,366 So I'm in swap, I past in this variables, and that happens 547 00:22:10,366 --> 00:22:11,216 to happen at this line. 548 00:22:11,456 --> 00:22:12,556 Where did I come from? 549 00:22:12,556 --> 00:22:14,646 In other words, how did I get to that line of code? 550 00:22:14,926 --> 00:22:18,676 Well previously, in the other frame that is currently 551 00:22:18,676 --> 00:22:21,426 in may RAM, I started in main. 552 00:22:21,666 --> 00:22:22,936 So the reason this prints kind 553 00:22:22,936 --> 00:22:24,786 of in reverse order is actually consistent 554 00:22:24,786 --> 00:22:26,766 with that picture we drew on the board the other day. 555 00:22:26,766 --> 00:22:30,436 When we talked about RAM having stacks and frames that grow 556 00:22:30,436 --> 00:22:32,056 like this, it's the same idea. 557 00:22:32,056 --> 00:22:35,086 Notice that main is on the bottom and swap is on top. 558 00:22:35,266 --> 00:22:37,596 So if swap itself calls a third function, 559 00:22:37,646 --> 00:22:40,536 it is gonna be located in RAM up here. 560 00:22:40,636 --> 00:22:42,816 And so pictorially it's pretty consistent. 561 00:22:43,756 --> 00:22:45,556 Okay, so a bit of world wind tour. 562 00:22:45,556 --> 00:22:47,886 The next couple of problems that will encourage you and walk you 563 00:22:47,886 --> 00:22:50,226 through the process of using this to the extent you need it-- 564 00:22:50,606 --> 00:22:52,776 but any questions on the usage, the concept, anything? 565 00:22:54,086 --> 00:22:54,206 Yeah. 566 00:22:55,516 --> 00:23:02,596 [ Inaudible Remark ] 567 00:23:03,096 --> 00:23:05,876 >> Absolutely, you don't have to set breakpoints only at main 568 00:23:05,876 --> 00:23:07,466 if you have a function called swap, 569 00:23:07,466 --> 00:23:09,306 I could have typed the break swap. 570 00:23:09,476 --> 00:23:12,466 Or if I again know that there's a certain line I care about, 571 00:23:12,626 --> 00:23:15,946 I could even do this, break line, let's say 23, 572 00:23:16,186 --> 00:23:17,716 'cause I care about line 23. 573 00:23:17,866 --> 00:23:20,346 Then when I type Run, it will start running from the top 574 00:23:20,346 --> 00:23:23,006 of the program as usual but stop at that specific line. 575 00:23:23,236 --> 00:23:24,656 So it's actually quite versatile. 576 00:23:25,366 --> 00:23:25,476 Yeah. 577 00:23:25,566 --> 00:23:27,386 >> Can you break continue then break again? 578 00:23:27,836 --> 00:23:29,556 >> Can you break and then continue again, yes. 579 00:23:29,666 --> 00:23:30,656 So the-- yes. 580 00:23:30,656 --> 00:23:34,286 The continue command is useful also in the context of loops. 581 00:23:34,626 --> 00:23:36,486 Because if you set a breakpoint somewhere in the middle 582 00:23:36,486 --> 00:23:39,126 of a loop, you might get bored with typing next, next, next, 583 00:23:39,126 --> 00:23:41,086 next, next throughout all this lines, you really just wanna get 584 00:23:41,086 --> 00:23:43,016 to the next iteration or the next iteration. 585 00:23:43,256 --> 00:23:45,736 So if you type Continue, it's not gonna run to the bottom 586 00:23:45,736 --> 00:23:47,106 of the program, it's gonna run 587 00:23:47,306 --> 00:23:48,866 as the logic of your code suggest. 588 00:23:48,866 --> 00:23:50,806 So it will iterate in this loop and if you hit 589 00:23:50,806 --> 00:23:53,296 that same breakpoint again, you will in fact pause. 590 00:23:53,556 --> 00:23:55,766 Now if you're iterating a hundred times and you only care 591 00:23:55,766 --> 00:23:59,006 about the 99th iteration, this could very quickly get tedious. 592 00:23:59,236 --> 00:24:00,306 But you'll see in the online PDF, 593 00:24:00,306 --> 00:24:04,156 you can say like continue 98 times and then break. 594 00:24:04,276 --> 00:24:06,446 So again, you pick up little tricks along the way. 595 00:24:06,946 --> 00:24:07,656 Any other questions? 596 00:24:09,126 --> 00:24:09,206 Yeah. 597 00:24:10,516 --> 00:24:21,956 [ Inaudible Remark ] 598 00:24:22,456 --> 00:24:24,726 >> So you can't start execution at a certain line. 599 00:24:24,726 --> 00:24:26,476 Your program has start at the beginning 600 00:24:26,906 --> 00:24:28,536 but you can break at any line. 601 00:24:28,536 --> 00:24:31,236 And break just means pause here so I can poke around 602 00:24:31,546 --> 00:24:32,626 and then it's up to you to decide 603 00:24:32,626 --> 00:24:33,886 if you wanna continue execution 604 00:24:33,886 --> 00:24:35,486 by hitting Next or Continue itself. 605 00:24:35,851 --> 00:24:37,851 [ Inaudible Remark ] 606 00:24:38,216 --> 00:24:41,136 >> No, so your-- if you wrote the code here, you're gonna have 607 00:24:41,166 --> 00:24:42,936 to execute it from here to here. 608 00:24:43,066 --> 00:24:45,346 You can't just say start running my program at a different point. 609 00:24:46,326 --> 00:24:49,416 Alright, so that's a little tool for your toolkit, so to speak. 610 00:24:49,416 --> 00:24:52,366 It might very well be useful for this P set if you're not 611 00:24:52,366 --> 00:24:54,366 yet done or not yet started. 612 00:24:54,716 --> 00:24:57,046 But for the next ones and ones moving forward, 613 00:24:57,166 --> 00:25:00,136 it's one of this things honestly where statistically many 614 00:25:00,136 --> 00:25:02,376 of you probably won't take this advice for at least a couple 615 00:25:02,376 --> 00:25:04,566 of P sets but this is one of those things that certainly 616 00:25:04,566 --> 00:25:05,816 in programing where if you spend 617 00:25:05,976 --> 00:25:08,216 that frustrating first 30 minutes, maybe an hour 618 00:25:08,216 --> 00:25:09,376 of learning something like this, 619 00:25:09,616 --> 00:25:12,186 it will absolutely save you 10 times 620 00:25:12,186 --> 00:25:14,056 that much time by semesters end. 621 00:25:14,096 --> 00:25:15,946 But when you get comfortable with this stuff, 622 00:25:16,016 --> 00:25:17,836 printf is not going to be your best friend 623 00:25:17,836 --> 00:25:19,706 for very long 'cause it's so terribly limited 624 00:25:19,706 --> 00:25:20,796 and so terribly tedious. 625 00:25:21,026 --> 00:25:22,456 So invest this time, this week 626 00:25:22,456 --> 00:25:24,436 or next just learning some of the basics of GDB. 627 00:25:24,436 --> 00:25:26,056 And honestly, it's one of those things 628 00:25:26,056 --> 00:25:28,106 that will absolutely help you help yourselves 629 00:25:28,176 --> 00:25:30,826 and in the end save you time for sure. 630 00:25:31,216 --> 00:25:35,536 So here are some numbers and the goal at hand is to get 631 00:25:35,536 --> 00:25:37,896 to the point of actually having these numbers sorted 632 00:25:37,896 --> 00:25:40,656 so that we can actually find pieces of data more efficiently. 633 00:25:40,846 --> 00:25:42,896 Now again, I've chose some fairly simple numbers here 634 00:25:42,896 --> 00:25:45,416 up on the board, you could imagine gaps in them like we had 635 00:25:45,416 --> 00:25:46,986 on the chalk board on Monday 636 00:25:47,236 --> 00:25:50,156 so that they're not just evenly spaced by 1 number apart 637 00:25:50,156 --> 00:25:51,456 and you can certainly imagine each 638 00:25:51,456 --> 00:25:53,936 of this things representing a web page or name 639 00:25:53,936 --> 00:25:56,596 in the phonebook or anything that will lends itself 640 00:25:56,946 --> 00:25:58,306 to actually being sorted. 641 00:25:58,526 --> 00:26:00,546 But how do we go about sorting something like this? 642 00:26:00,546 --> 00:26:02,936 Well, a human probably will just kind of scan the list and say, 643 00:26:02,936 --> 00:26:04,916 alright, there is the number 1, let's move that, there's 2. 644 00:26:05,286 --> 00:26:06,426 And we all have some kind 645 00:26:06,426 --> 00:26:09,216 of intuition behind how you sort this numbers just 646 00:26:09,216 --> 00:26:11,846 as a normal person probably already knows before taking 647 00:26:11,886 --> 00:26:13,266 CS50, how to find someone 648 00:26:13,266 --> 00:26:15,306 in the phonebook pretty intelligently, right. 649 00:26:15,306 --> 00:26:17,566 Before this class, odds are you were not starting at the begging 650 00:26:17,566 --> 00:26:18,706 and moving from left to right. 651 00:26:18,956 --> 00:26:20,916 But how do we begin to now express what we 652 00:26:20,916 --> 00:26:23,806 as humans consider to be intuition if not obvious 653 00:26:24,096 --> 00:26:26,096 in a way that the computer can now understand 654 00:26:26,096 --> 00:26:28,986 so that it can do our bidding using our own life lessons 655 00:26:29,266 --> 00:26:32,416 but far more quickly than we could ourselves. 656 00:26:32,726 --> 00:26:35,116 So if you don't mind, perhaps some fresh faces this time, 657 00:26:35,346 --> 00:26:38,596 I have 8 pieces of paper and need another bite of volunteers. 658 00:26:39,166 --> 00:26:43,926 If you'd hear me, come on up, 2, 3, no one ever wants 659 00:26:43,926 --> 00:26:44,746 to come up from up here. 660 00:26:44,746 --> 00:26:48,016 There you go, 4, 5, 6 and 7. 661 00:26:48,016 --> 00:26:52,446 Let me see if I counted properly, 1, 2, 3, 4, 5, 6, 662 00:26:52,446 --> 00:26:56,876 7 okay, good and 8 in the red. 663 00:26:57,416 --> 00:26:59,636 Alright, so if you could, first part is easy. 664 00:26:59,636 --> 00:27:00,556 And you have to be comfortable 665 00:27:00,556 --> 00:27:01,936 on camera before you step on stage. 666 00:27:02,286 --> 00:27:04,596 First part is easy, line yourselves up and let's say 667 00:27:04,596 --> 00:27:07,836 over here, just so we have you all together in the same order 668 00:27:08,086 --> 00:27:12,656 as the projectors indicating now. 669 00:27:15,676 --> 00:27:16,616 Alright, thank you. 670 00:27:21,436 --> 00:27:23,446 Alright, so line up if you could in this way. 671 00:27:24,516 --> 00:27:29,186 [ Noise ] 672 00:27:29,686 --> 00:27:34,686 >> And so that you are not anonymous, we wanna say, 673 00:27:34,686 --> 00:27:36,316 who you are and one interesting thing 674 00:27:36,316 --> 00:27:39,436 which might mean concentration house, sports, extracurricular, 675 00:27:39,436 --> 00:27:41,296 whatever and just pass it down. 676 00:27:41,296 --> 00:27:43,716 >> My name is Chi [phonetic], I wanna do CS. 677 00:27:43,996 --> 00:27:44,496 >> Very nice. 678 00:27:44,886 --> 00:27:48,896 >> May name is Mar [phonetic] and I'm in the best house 679 00:27:49,166 --> 00:27:52,346 on campus, Eliot [phonetic] House. 680 00:27:52,346 --> 00:27:52,486 [ Cheering ] 681 00:27:52,486 --> 00:27:54,636 >> My name is Luis [phonetic], and I like yoghurt. 682 00:27:54,636 --> 00:27:55,176 [ Laughter ] 683 00:27:55,176 --> 00:27:57,326 >> My name is Daniel [phonetic] I also have the grand fortune 684 00:27:57,326 --> 00:27:59,346 of being in Eliot House. 685 00:27:59,346 --> 00:27:59,976 [ Cheering ] 686 00:27:59,976 --> 00:28:01,836 >> Hi, my name is Aziza [phonetic] 687 00:28:01,996 --> 00:28:03,416 and I'm in Eliot, too. 688 00:28:03,416 --> 00:28:07,746 >> Hi, I'm Carl and I'm from Leverett. 689 00:28:07,746 --> 00:28:08,736 [ Cheering ] 690 00:28:08,736 --> 00:28:11,476 >> I'm Ron, freshman from Canada. 691 00:28:11,476 --> 00:28:11,666 [ Cheering ] 692 00:28:11,666 --> 00:28:16,506 >> I'm Ian, I'm on the Alpine ski team. 693 00:28:16,716 --> 00:28:18,096 It's pretty fun. 694 00:28:18,096 --> 00:28:18,606 >> Alright. 695 00:28:20,816 --> 00:28:23,566 [Laughter] Alright, so, now you know these eight folks 696 00:28:23,566 --> 00:28:27,066 and we have a couple of approaches 697 00:28:27,276 --> 00:28:28,616 that we might take here. 698 00:28:28,616 --> 00:28:30,116 So I'm gonna propose one and then see 699 00:28:30,116 --> 00:28:32,176 if these eight can help me execute this method. 700 00:28:32,436 --> 00:28:35,036 So as a human, I might take the step back and say, alright, 701 00:28:35,036 --> 00:28:36,846 number 1, I'm gonna put you over here, number 2. 702 00:28:37,046 --> 00:28:38,886 But don't do that just yet, right 'cause it's kinda-- 703 00:28:38,886 --> 00:28:40,816 there's this intuition that presumably all of us have 704 00:28:40,816 --> 00:28:42,446 and we can sort this list in a couple of seconds. 705 00:28:42,446 --> 00:28:43,206 Now ironically what we're 706 00:28:43,206 --> 00:28:45,456 about to do is probably gonna take more than a few seconds. 707 00:28:45,586 --> 00:28:48,346 But once this input gets larger than N equals A 708 00:28:48,346 --> 00:28:50,876 and as N equals a million or a billion or the like, 709 00:28:51,086 --> 00:28:53,596 surely some of this tricks will be again a leverage 710 00:28:53,596 --> 00:28:54,666 to our advantage. 711 00:28:54,666 --> 00:28:57,016 So, I'm actually gonna point out first. 712 00:28:57,286 --> 00:28:59,426 If these folks are implemented in a computer, 713 00:28:59,426 --> 00:29:02,376 they're probably implemented not as 8 variables, X1, X2, 714 00:29:02,426 --> 00:29:05,076 X3 'cause we've already introduced arrays, 715 00:29:05,376 --> 00:29:09,036 odds are they're just called number and that-- 716 00:29:09,036 --> 00:29:12,146 the size of this array numbers is probably 8 and each 717 00:29:12,146 --> 00:29:14,906 of them is an int inside of this array. 718 00:29:15,136 --> 00:29:17,736 But the catch is with the computer, I don't have this sort 719 00:29:17,736 --> 00:29:20,026 of god-like ability to just look out on them all and see-- 720 00:29:20,026 --> 00:29:21,716 taking all of this numbers at once. 721 00:29:21,716 --> 00:29:22,966 I have to be much more deliberate. 722 00:29:22,966 --> 00:29:25,506 In an array, you can iterate from left to right, 723 00:29:25,506 --> 00:29:28,646 from right to left, you can jump randomly by a random access 724 00:29:28,856 --> 00:29:32,746 to any value inside the array by of the square bracket notation. 725 00:29:33,016 --> 00:29:34,986 But you certainly can't see them all at once. 726 00:29:34,986 --> 00:29:35,956 So just for visual's sake, 727 00:29:35,956 --> 00:29:37,456 if you guys can just turn your papers 728 00:29:37,456 --> 00:29:39,196 around so they're blank white on this side. 729 00:29:39,196 --> 00:29:41,956 I mean this really is what the computer is handed. 730 00:29:41,956 --> 00:29:43,026 There is 8 numbers, yes, 731 00:29:43,296 --> 00:29:46,806 but I certainly can't take some aggregate view all at once. 732 00:29:46,806 --> 00:29:48,106 I have to look more carefully. 733 00:29:48,406 --> 00:29:49,826 So go ahead and flip back the normal way 734 00:29:49,826 --> 00:29:52,696 and I'm gonna propose a very simple heuristic. 735 00:29:52,916 --> 00:29:55,636 I'm gonna start at the beginning 'cause it's just easy to iterate 736 00:29:55,636 --> 00:29:57,086 from left to right, I'm gonna look 737 00:29:57,086 --> 00:29:59,326 at the first 2 elements, 0 and 1. 738 00:29:59,416 --> 00:30:02,306 Index values, 0 and 1 and I see that the number 4 is 739 00:30:02,306 --> 00:30:04,286 at location 0, and the number 2 is 740 00:30:04,286 --> 00:30:05,976 at location 1, this is wrong, right. 741 00:30:06,126 --> 00:30:08,076 >> The goal at hand is to sort this numbers 742 00:30:08,186 --> 00:30:08,946 so you know what I'm gonna do, 743 00:30:08,946 --> 00:30:10,586 the simplistic heuristic I can come 744 00:30:10,586 --> 00:30:12,256 up with, I'm gonna say swap. 745 00:30:12,436 --> 00:30:13,346 Now, go ahead and swap. 746 00:30:14,646 --> 00:30:16,556 Now this took how many steps? 747 00:30:16,556 --> 00:30:18,286 Well if we implemented this in code, 748 00:30:18,286 --> 00:30:19,996 it's like at least 3 lines of code. 749 00:30:19,996 --> 00:30:21,866 We need a temporary variable then we need 750 00:30:21,866 --> 00:30:24,326 to clobber A then replace B and so forth. 751 00:30:24,516 --> 00:30:26,276 Now in fairness, our swap function is a little 752 00:30:26,276 --> 00:30:27,236 dysfunctional right now. 753 00:30:27,236 --> 00:30:28,806 We have yet to see a working version 754 00:30:28,806 --> 00:30:30,126 of swap but that's coming. 755 00:30:30,126 --> 00:30:32,386 In fact we'll introduce that in something called pointers next 756 00:30:32,386 --> 00:30:34,306 week that allows us to pass by reference 757 00:30:34,306 --> 00:30:36,096 and not by value or by copy. 758 00:30:36,326 --> 00:30:37,596 So let's take for granted today 759 00:30:37,596 --> 00:30:39,976 that surely there's some solution to this problem 760 00:30:39,976 --> 00:30:41,836 with swap, we just haven't done it ourselves yet. 761 00:30:41,836 --> 00:30:44,376 So I swap these two guys and now what do I do? 762 00:30:44,376 --> 00:30:46,856 Well, I don't have to look at them just yet in pair anymore, 763 00:30:46,856 --> 00:30:49,476 I can advance one step, so I plus-plus. 764 00:30:49,676 --> 00:30:52,136 Now 4 compared to 6 checks out. 765 00:30:52,136 --> 00:30:53,196 I don't have to do a swap 766 00:30:53,196 --> 00:30:55,236 so I can take another step, I plus-plus. 767 00:30:55,506 --> 00:30:57,636 6 against 8, okay, plus-plus. 768 00:30:57,786 --> 00:30:59,866 8 against 1, I need to do another swap. 769 00:31:00,236 --> 00:31:02,626 So that'll take a few CPU cycles. 770 00:31:02,626 --> 00:31:04,876 Now, you know, I could start backtracking 771 00:31:04,876 --> 00:31:07,336 because clearly 1 should be going this way 772 00:31:07,556 --> 00:31:10,036 but I'm gonna keep implementing the same approach, right. 773 00:31:10,036 --> 00:31:12,136 You wouldn't change your code in the middle of a for loop, 774 00:31:12,136 --> 00:31:14,696 it's gonna execute again and again in exactly the same way 775 00:31:15,016 --> 00:31:16,086 so let's do the plus-plus. 776 00:31:16,196 --> 00:31:18,806 8 and 3, alright, plus-plus. 777 00:31:18,876 --> 00:31:20,966 8 and 7, plus-plus. 778 00:31:21,526 --> 00:31:25,366 And 8 and 5, and now probably I do plus-plus. 779 00:31:25,366 --> 00:31:28,176 My for loops condition are my wild condition, what it is, 780 00:31:28,176 --> 00:31:30,616 realizes that I've gotten too big and so I reset 781 00:31:31,096 --> 00:31:32,456 to the start of this list. 782 00:31:33,186 --> 00:31:37,086 So it seems like I have a loop that's going this way, right, 783 00:31:37,086 --> 00:31:39,576 so that's I plus-plus, I plus-plus, I plus-plus. 784 00:31:39,796 --> 00:31:41,236 So I've looped through the whole array. 785 00:31:41,556 --> 00:31:43,026 So that would seem to suggest I'm done. 786 00:31:43,026 --> 00:31:46,216 I've looked at every value but obviously, no. 787 00:31:46,326 --> 00:31:49,046 So just intuitively, how many more times am I gonna have 788 00:31:49,226 --> 00:31:50,906 to come back to the end of this list much 789 00:31:50,906 --> 00:31:52,276 like a typewriter going back 790 00:31:52,276 --> 00:31:54,196 to the carriage return and go again. 791 00:31:54,406 --> 00:31:56,076 If I do it once more, will it fix the problem? 792 00:31:56,076 --> 00:31:56,256 >> No. 793 00:31:57,586 --> 00:32:00,496 >> No, so maybe-- I feel like maybe 8 or so, 794 00:32:00,496 --> 00:32:02,556 7 or 8 total times am I gonna have 795 00:32:02,556 --> 00:32:04,156 to do this because why is that? 796 00:32:04,276 --> 00:32:07,006 Well, intuitively, number 1 a moment ago was here. 797 00:32:07,006 --> 00:32:10,526 But how far did she move down the list with just 1 step. 798 00:32:10,836 --> 00:32:13,176 Now 8 made some pretty good progress so it seems 799 00:32:13,176 --> 00:32:15,846 like there is some asymmetry in the benefits of this algorithm. 800 00:32:16,076 --> 00:32:18,456 But again, anyone who is too small to be 801 00:32:18,456 --> 00:32:21,106 on that N is only stepping one at a time. 802 00:32:21,106 --> 00:32:23,566 It's but-- these small numbers like 1, number 3, 803 00:32:23,786 --> 00:32:25,676 or bubbling up, if you will, pretty slowly 804 00:32:25,676 --> 00:32:27,946 to this side whereas 8 is already in place. 805 00:32:28,246 --> 00:32:31,146 So if that suggests in the worse case that 1 started over there, 806 00:32:31,146 --> 00:32:34,956 let's say, well each iteration of this loop, plus, plus, plus, 807 00:32:34,956 --> 00:32:38,106 plus, plus would number 1 just move 1 step which means I've got 808 00:32:38,106 --> 00:32:41,446 to go back here, do it again, do it again, do it again as many 809 00:32:41,506 --> 00:32:43,916 as 7 times or really N times. 810 00:32:43,916 --> 00:32:47,006 N being 8, give or take to this again and again. 811 00:32:47,006 --> 00:32:48,126 So let's do this once more 812 00:32:48,126 --> 00:32:49,896 and this time I won't deliberately go 813 00:32:49,896 --> 00:32:50,706 through all the verbiage, 814 00:32:50,836 --> 00:32:54,566 let's see if you guys can execute now down the line. 815 00:32:54,566 --> 00:32:56,256 So ready, this is iteration number 2. 816 00:32:57,516 --> 00:33:04,766 [ Noise ] 817 00:33:05,266 --> 00:33:07,296 >> Okay, now what do we do? 818 00:33:07,436 --> 00:33:08,226 So I reset. 819 00:33:08,226 --> 00:33:09,906 So it feels like if you're thinking this through, 820 00:33:09,906 --> 00:33:12,176 I'm doing plus, plus, plus, plus, plus, plus, 821 00:33:12,176 --> 00:33:13,686 then I'm going all the way back round, 822 00:33:13,826 --> 00:33:15,266 it feels like we just have a nested loop. 823 00:33:15,266 --> 00:33:16,116 You did this in scratch. 824 00:33:16,116 --> 00:33:18,276 You might have done this in C already, maybe with visionaire 825 00:33:18,276 --> 00:33:20,366 if you've already started that part. 826 00:33:20,366 --> 00:33:22,396 And even visionaire can be done without a nested loop. 827 00:33:22,396 --> 00:33:24,976 But it feels like a nested loop 'cause I'm looping again 828 00:33:24,976 --> 00:33:27,326 and again and again and then I've got this bigger outer loop 829 00:33:27,326 --> 00:33:30,246 that's bringing me back home after 8 or so steps. 830 00:33:30,506 --> 00:33:36,366 Alright, let's go the third round. 831 00:33:37,206 --> 00:33:38,006 Okay. Okay. 832 00:33:38,406 --> 00:33:41,556 Okay and now, what do I do? 833 00:33:41,556 --> 00:33:41,796 [ Inaudible Remark ] 834 00:33:41,796 --> 00:33:43,046 >> Okay, so I go back to the beginning. 835 00:33:43,126 --> 00:33:45,376 Alright, so hopefully we're almost there. 836 00:33:45,376 --> 00:33:48,676 Now, can I stop? 837 00:33:49,486 --> 00:33:50,816 Alright, so maybe intuitively, 838 00:33:50,816 --> 00:33:52,636 you might think, yeah, it's right. 839 00:33:52,666 --> 00:33:54,386 But again, you don't have that aggregate view. 840 00:33:54,386 --> 00:33:56,186 The only way we're gonna know if we're done is 841 00:33:56,226 --> 00:33:59,426 to keep checking pair wise, pair wise, pair wise, pair wise, 842 00:33:59,426 --> 00:34:02,156 pair wise, now am I done? 843 00:34:02,476 --> 00:34:06,376 So, visually yes, but on this iteration, I made a swap 844 00:34:06,376 --> 00:34:10,206 and unless I've added some very sophisticated conditions 845 00:34:10,206 --> 00:34:12,876 or checks all I know is I just made one swap. 846 00:34:12,926 --> 00:34:16,876 So surely, if I made one swap, I might not be done because maybe 847 00:34:16,876 --> 00:34:19,416 that number that I swapped has to bubble down even further. 848 00:34:19,636 --> 00:34:22,266 So really the only way to this for sure is 849 00:34:22,306 --> 00:34:24,436 to now do 1 additional pass and say, uh-hmm, 850 00:34:24,436 --> 00:34:25,656 uh-hmm, uh-hmm, uh-hmm. 851 00:34:25,656 --> 00:34:29,146 And now, maybe I have a counter, called counter. 852 00:34:29,376 --> 00:34:32,356 And every time I do a swap, I increment this by 1, right. 853 00:34:32,356 --> 00:34:35,346 And now, my counter is still just 0. 854 00:34:35,346 --> 00:34:37,746 So now I'd be stupid frankly to go back to the beginning 855 00:34:37,746 --> 00:34:40,216 and do this again, do this again, do this again. 856 00:34:40,476 --> 00:34:42,536 So that then begs the question in total, 857 00:34:42,746 --> 00:34:45,496 how many steps does this algorithm involve? 858 00:34:45,496 --> 00:34:48,086 How much work does this algorithm involve? 859 00:34:48,086 --> 00:34:50,316 Well in the best case, let's start with the easier one. 860 00:34:50,536 --> 00:34:53,566 In the best case, this group of folks did not line up like that. 861 00:34:53,826 --> 00:34:54,936 They lined up right like this. 862 00:34:55,196 --> 00:34:57,716 In which case, how many steps does it take for me, 863 00:34:57,716 --> 00:34:59,936 the computer, to sort N individuals. 864 00:35:00,656 --> 00:35:00,816 >> One. 865 00:35:01,416 --> 00:35:01,626 >> Eight. 866 00:35:01,626 --> 00:35:02,496 [ Inaudible Remark ] 867 00:35:02,496 --> 00:35:03,376 >> Okay, we have disagreement. 868 00:35:03,376 --> 00:35:05,446 Everyone here says 1, and this guy says 8. 869 00:35:05,866 --> 00:35:06,706 So which is it? 870 00:35:06,706 --> 00:35:08,466 Well, it kinda boils down frankly to semantics. 871 00:35:08,466 --> 00:35:09,386 What do we mean by step? 872 00:35:09,446 --> 00:35:11,026 Do we mean one outer loop, maybe, 873 00:35:11,026 --> 00:35:13,546 do we mean one individual, yeah, step maybe. 874 00:35:13,546 --> 00:35:15,286 I'm gonna say-- minimally let's say 8. 875 00:35:15,286 --> 00:35:18,576 Because let's literally count any line of code that I execute 876 00:35:18,576 --> 00:35:20,246 or physically any step that I take. 877 00:35:20,506 --> 00:35:23,076 So to know that all of this folks are sorted minimally, 878 00:35:23,296 --> 00:35:25,566 I'm gonna have to check each pair keeping track 879 00:35:25,566 --> 00:35:26,856 of how many swaps I do. 880 00:35:27,146 --> 00:35:28,916 Still 0, now I can quit. 881 00:35:29,076 --> 00:35:30,226 Even though I was in a loop, 882 00:35:30,536 --> 00:35:32,446 some of you might have used a special instruction 883 00:35:32,446 --> 00:35:37,256 in C called break that will break you out of a loop. 884 00:35:37,326 --> 00:35:39,476 You might not have occasion to use this before but just 885 00:35:39,476 --> 00:35:42,416 because you say a loop needs to iterate 8 times or N times, 886 00:35:42,636 --> 00:35:44,126 you can break out if you want 887 00:35:44,176 --> 00:35:46,376 if just this situation warrants at any point. 888 00:35:46,376 --> 00:35:49,216 You could do this with scratch, too, with the stop all 889 00:35:49,276 --> 00:35:51,536 or stop script block as well. 890 00:35:51,536 --> 00:35:53,636 So now, what about the worse case? 891 00:35:53,636 --> 00:35:55,286 Well, what was the worse case? 892 00:35:55,286 --> 00:35:56,636 If you guys don't mind for the visual, 893 00:35:56,636 --> 00:35:59,426 can you arrange yourselves in the worse possible case? 894 00:36:01,736 --> 00:36:04,486 The worse case in not random, right? 895 00:36:04,526 --> 00:36:07,666 Actually, worse case for our purpose is, is the most amount 896 00:36:07,666 --> 00:36:10,156 of work that I might have to do to fix this problem. 897 00:36:10,156 --> 00:36:12,976 Now in fact, this is the worse possible case because now, 898 00:36:13,186 --> 00:36:14,906 everyone is in the wrong position. 899 00:36:14,906 --> 00:36:16,846 And in fact, number 8 is way the heck over here, 900 00:36:16,846 --> 00:36:17,976 number 1 is way over there. 901 00:36:18,266 --> 00:36:19,816 This is the worse possible case. 902 00:36:19,816 --> 00:36:22,496 That was actually somewhere in between, best and worse, 903 00:36:22,746 --> 00:36:25,746 how many steps now is it gonna take me to sort this folks. 904 00:36:25,746 --> 00:36:26,326 [ Inaudible Remark ] 905 00:36:26,326 --> 00:36:28,596 >> So 64, if you count it up. 906 00:36:28,596 --> 00:36:31,336 And can we generalize if these are actually N individuals, 907 00:36:31,446 --> 00:36:33,216 as oppose to 8 specifically? 908 00:36:33,216 --> 00:36:34,106 [ Inaudible Remark ] 909 00:36:34,106 --> 00:36:35,286 >> And so, N squared. 910 00:36:35,286 --> 00:36:36,406 And where does this number come from? 911 00:36:37,256 --> 00:36:40,746 Y N squared, N times N. Oh, what's that? 912 00:36:40,896 --> 00:36:42,556 >> Inner and outer loop. 913 00:36:42,926 --> 00:36:45,156 >> Yeah. So it's this inner and outer loop. 914 00:36:45,156 --> 00:36:46,206 The inner loop is obvious 915 00:36:46,236 --> 00:36:48,636 to see 'cause I keep taking all these steps across the stage. 916 00:36:48,636 --> 00:36:52,606 Every time I check for swapping opportunities, I'm taking 8 917 00:36:52,606 --> 00:36:55,216 or maybe 7 steps but for simplicity, let's just round 918 00:36:55,216 --> 00:36:57,286 up to 8 so we don't have to do N minus 1 all the time. 919 00:36:57,576 --> 00:36:59,456 Let's say this took 8 steps to get here. 920 00:36:59,776 --> 00:37:01,806 But now in the worse possible case, I'm gonna have 921 00:37:01,806 --> 00:37:05,036 to reset myself like a typewriter all the way back 922 00:37:05,036 --> 00:37:08,036 to the beginning with one more time then another time, 923 00:37:08,036 --> 00:37:08,786 then another time. 924 00:37:08,976 --> 00:37:15,286 And if number 1 needs to move, let's say 1, 2, 3, 4, 5, 6, 7. 925 00:37:15,286 --> 00:37:17,716 If number 1 needs to participate in 7 swaps 926 00:37:17,716 --> 00:37:19,966 which is really almost 8, so let's call it N 927 00:37:19,966 --> 00:37:21,076 or we could really be fanatic 928 00:37:21,076 --> 00:37:23,146 and do N minus 1, that's N swaps. 929 00:37:23,516 --> 00:37:26,966 So in each iteration, she only moves maximally 1 step. 930 00:37:27,146 --> 00:37:29,026 She only bubbles up a little to this N. 931 00:37:29,356 --> 00:37:32,946 So to get here all the way here, I need to take N steps this way 932 00:37:33,136 --> 00:37:35,666 and I need to do that as many as N times round. 933 00:37:35,896 --> 00:37:38,166 So in the worse case to sort N individuals 934 00:37:38,166 --> 00:37:39,726 with this algorithm it would seem to take, 935 00:37:39,726 --> 00:37:41,536 give or take N squared. 936 00:37:41,536 --> 00:37:43,576 N times 2 number of steps. 937 00:37:43,866 --> 00:37:47,186 Now 8 times 8, 64, that's not that bad, right. 938 00:37:47,186 --> 00:37:49,736 It felt like the intuitive human way was a little faster 939 00:37:49,736 --> 00:37:51,776 but N squared isn't that bad but what 940 00:37:51,776 --> 00:37:54,026 about when N is not 8 but 10. 941 00:37:54,026 --> 00:37:57,276 So that's 10 squared, that's a 100 or 100, 10,000, 942 00:37:57,276 --> 00:37:59,586 now we start getting a very and very quickly 943 00:37:59,826 --> 00:38:02,666 to much larger scarier values and this feels 944 00:38:02,666 --> 00:38:05,806 like it could very quickly start consuming a lot of resources. 945 00:38:06,076 --> 00:38:07,006 So can we do better? 946 00:38:07,006 --> 00:38:08,326 Let's try one other approach. 947 00:38:08,326 --> 00:38:10,186 If you guys could reset to this random approach, 948 00:38:10,466 --> 00:38:15,056 so we don't always consider best case worse case. 949 00:38:15,256 --> 00:38:17,986 Let's see if we can't do something a little different. 950 00:38:17,986 --> 00:38:20,856 So that was the simplest most naive algorithm I could come 951 00:38:20,856 --> 00:38:24,776 up with a sort of a newbie computer swapping the things 952 00:38:24,776 --> 00:38:25,856 that I see next to each other. 953 00:38:26,086 --> 00:38:28,566 Let's try to leverage the common sense that most of us have, 954 00:38:29,186 --> 00:38:31,176 that all of us have in this room when it comes 955 00:38:31,236 --> 00:38:33,186 to finding the smallest number. 956 00:38:33,186 --> 00:38:35,136 And let's just grab the smallest number then look 957 00:38:35,136 --> 00:38:37,316 for the next smallest, then look for the next smallest, right, 958 00:38:37,316 --> 00:38:39,636 'cause odds are when you glance at this numbers and you're told 959 00:38:39,636 --> 00:38:42,766 to sort them, you probably latch on to the number 1 first. 960 00:38:42,766 --> 00:38:44,746 Maybe the number 8 if you prefer to work backwards, 961 00:38:44,956 --> 00:38:47,416 but at least the number 1 then 2, so let's see 962 00:38:47,416 --> 00:38:48,426 if we can't leverage that. 963 00:38:48,426 --> 00:38:50,376 So again visually, if you could flip your papers around, 964 00:38:50,756 --> 00:38:52,296 this is what the computer sees in advance. 965 00:38:52,436 --> 00:38:53,656 I just know there is eight elements, 966 00:38:53,716 --> 00:38:54,916 I can't see them all at once. 967 00:38:54,916 --> 00:38:56,986 So now if you could go back to normal, now, 968 00:38:56,986 --> 00:38:58,566 I'm gonna be the computer and I'm gonna look 969 00:38:58,566 --> 00:39:00,206 for the smallest possible value. 970 00:39:00,336 --> 00:39:03,676 Alright, so 4, he is the smallest possible value 971 00:39:03,676 --> 00:39:04,276 at this point in time. 972 00:39:04,276 --> 00:39:05,596 So I'm gonna make a mental note. 973 00:39:05,596 --> 00:39:07,306 Or in computer terms, I'm gonna use a variable 974 00:39:07,306 --> 00:39:10,076 to remember 4 is the smallest and he is at location 0. 975 00:39:10,316 --> 00:39:12,156 Alright, oh, disqualified. 976 00:39:12,156 --> 00:39:14,956 He is no longer the smallest value, I've now found number 2 977 00:39:14,956 --> 00:39:17,536 at location bracket 1, now I'm gonna remember that 978 00:39:17,536 --> 00:39:18,636 and I'm gonna forget about him. 979 00:39:18,636 --> 00:39:20,036 Alright, 'cause it's a lot easier to keep track 980 00:39:20,036 --> 00:39:21,536 of one number than two at once. 981 00:39:21,926 --> 00:39:24,116 So now, I keep looking 'cause maybe your not number 1. 982 00:39:24,376 --> 00:39:28,386 So umm, you don't change my variables, definitely not. 983 00:39:28,596 --> 00:39:31,206 One, we've already ousted number 2. 984 00:39:31,206 --> 00:39:34,756 So here is the smallest possible value at location 0, 1, 2, 3, 985 00:39:34,966 --> 00:39:37,046 at location 4, so I should remember that. 986 00:39:37,866 --> 00:39:39,406 Do I now know I found the smallest? 987 00:39:40,316 --> 00:39:40,386 >> No. 988 00:39:40,726 --> 00:39:40,956 >> Alright. 989 00:39:40,956 --> 00:39:43,096 So not-- technically no, visually obviously. 990 00:39:43,096 --> 00:39:45,456 But technically no unless I was given an assumption, 991 00:39:45,696 --> 00:39:47,786 here our 8 numbers, the smallest number is 1 992 00:39:47,996 --> 00:39:50,206 but I was just given 8 numbers in random order, 993 00:39:50,336 --> 00:39:53,206 the only way I know, number 1 is in fact the smallest is 994 00:39:53,206 --> 00:39:56,726 by continuing a total of 8 or N steps. 995 00:39:56,916 --> 00:39:59,026 But now what do I do, I remembered with some variable 996 00:39:59,226 --> 00:39:59,976 that she's at location 4. 997 00:40:00,096 --> 00:40:01,366 >> So what can I do? 998 00:40:01,636 --> 00:40:03,146 Well unfortunately, this is an array. 999 00:40:03,146 --> 00:40:05,836 An array thus far re-fixed the size. 1000 00:40:05,836 --> 00:40:07,896 When you create an array as we've seen like an array 1001 00:40:07,896 --> 00:40:10,516 for quizzes, we said two quizzes go here. 1002 00:40:10,776 --> 00:40:12,096 And we used the constant at that time. 1003 00:40:12,096 --> 00:40:15,376 And in fact in C, unlike some languages like PHP 1004 00:40:15,376 --> 00:40:16,696 and JavaScript we'll see, 1005 00:40:16,916 --> 00:40:18,936 when you declare an array it's a fixed size. 1006 00:40:19,206 --> 00:40:22,286 Now this is a problem because I might wanna move number 1 1007 00:40:22,286 --> 00:40:25,236 to the start of this list but there's no room. 1008 00:40:25,236 --> 00:40:27,516 Alright, I only have 8 slots I have 8 numbers. 1009 00:40:27,516 --> 00:40:29,616 I can I have temporary variables so I could hold you 1010 00:40:29,616 --> 00:40:33,246 out if you will from the array leaving kind of a duplicator 1011 00:40:33,246 --> 00:40:35,326 or a gap in it so what, what are my options? 1012 00:40:35,826 --> 00:40:38,166 How do I move number 1 to her appropriate place? 1013 00:40:39,456 --> 00:40:40,766 So I could swap, right? 1014 00:40:40,766 --> 00:40:41,116 What could I do? 1015 00:40:41,116 --> 00:40:43,246 You know, he doesn't' need to be there, right? 1016 00:40:43,246 --> 00:40:45,646 I definitely don't need number 4 there or really 1017 00:40:45,646 --> 00:40:47,386 in general any one other than 1 there. 1018 00:40:47,386 --> 00:40:48,356 So why don't I do that? 1019 00:40:48,356 --> 00:40:49,576 And actually before I do that, 1020 00:40:49,576 --> 00:40:51,676 what's an alternative before we correct-- do this right? 1021 00:40:52,316 --> 00:40:54,866 Alright, I could just-- guys could just-- that way, yeah. 1022 00:40:55,296 --> 00:40:59,216 But wait, that looks obvious, looks intuitive but that's 1, 2, 1023 00:40:59,326 --> 00:41:02,506 that was three steps 'cause each of these variables or each 1024 00:41:02,506 --> 00:41:04,086 of ints has to be moved deliberately 1025 00:41:04,086 --> 00:41:05,656 by the computer with a line of code. 1026 00:41:05,916 --> 00:41:08,116 So that's three steps to move these guys over, 1027 00:41:08,116 --> 00:41:09,746 maybe 4 to move him over as well. 1028 00:41:10,076 --> 00:41:11,296 Let's take the simpler approach. 1029 00:41:11,296 --> 00:41:12,886 4 again doesn't need to be there. 1030 00:41:13,136 --> 00:41:14,846 This numbers we given to me in random order. 1031 00:41:14,846 --> 00:41:17,256 Who cares if I mix them up even more if I don't care 1032 00:41:17,256 --> 00:41:18,156 where they are just yet. 1033 00:41:18,426 --> 00:41:21,206 So let's move number 1 over here, number 4 over there. 1034 00:41:21,206 --> 00:41:26,406 And now this is 1 swap not a 3 or 4 total swaps. 1035 00:41:26,406 --> 00:41:27,096 Now what do I do? 1036 00:41:27,506 --> 00:41:29,996 Well at the back at the end of the list, now nicely enough, 1037 00:41:29,996 --> 00:41:31,216 I can have a little optimization. 1038 00:41:31,216 --> 00:41:33,156 I never again need to look at number 1 1039 00:41:33,386 --> 00:41:34,866 because I already have decided-- 1040 00:41:35,086 --> 00:41:38,116 [laughter] because I've already determined she's the smallest 1041 00:41:38,116 --> 00:41:41,286 so it would just be a waste of CPU cycles to start here again. 1042 00:41:41,536 --> 00:41:43,116 So I can maybe have another variable 1043 00:41:43,296 --> 00:41:45,156 that says start this time at number 1. 1044 00:41:45,156 --> 00:41:47,996 Not in number-- location 0, start at location 1. 1045 00:41:48,246 --> 00:41:48,806 So here we go. 1046 00:41:48,806 --> 00:41:50,146 Number 2, I found him. 1047 00:41:50,146 --> 00:41:51,106 He's already right here. 1048 00:41:51,646 --> 00:41:52,916 But I don't know that necessarily. 1049 00:41:52,916 --> 00:41:54,216 So now you might be thinking 1050 00:41:54,216 --> 00:41:56,906 but why didn't you just remember this the last time and I could 1051 00:41:56,906 --> 00:41:58,176 but that's a different algorithm. 1052 00:41:58,176 --> 00:41:58,936 That's more memory. 1053 00:41:58,986 --> 00:42:01,156 Right now I'm just using a couple of variables 1054 00:42:01,156 --> 00:42:04,016 to keep track of who and who's where and what they are. 1055 00:42:04,066 --> 00:42:05,836 So 2, is currently the smallest. 1056 00:42:05,836 --> 00:42:06,526 So I'm gonna remember that. 1057 00:42:07,066 --> 00:42:09,996 6 doesn't beat him, 8, 4, 3, 7, 5, 1058 00:42:09,996 --> 00:42:13,156 so that's another N minus 1 steps 'cause I didn't bother 1059 00:42:13,156 --> 00:42:14,196 looking at number 1 again. 1060 00:42:14,506 --> 00:42:15,806 So that's almost N steps. 1061 00:42:16,216 --> 00:42:17,786 He is in fact in the right place. 1062 00:42:18,066 --> 00:42:19,526 I don't have to do any work at all. 1063 00:42:19,796 --> 00:42:22,666 Now, let's continue I can move my left boundary here. 1064 00:42:23,016 --> 00:42:25,586 6 is currently my smallest number not beaten 1065 00:42:25,586 --> 00:42:27,036 by 8-- oh, beaten by 4. 1066 00:42:27,036 --> 00:42:28,136 4 is now the smallest. 1067 00:42:28,136 --> 00:42:30,026 Oh, 3 is now the smallest number. 1068 00:42:30,326 --> 00:42:33,286 No, nope I made metal note of where 3 is, what do I do? 1069 00:42:33,796 --> 00:42:33,956 >> Swap. 1070 00:42:34,046 --> 00:42:35,326 >> Swap 3 with 6. 1071 00:42:36,106 --> 00:42:39,106 And now I'll stop telling the story verbally if you guys, 1072 00:42:39,106 --> 00:42:41,296 I'll just point, can execute now, so. 1073 00:42:41,856 --> 00:42:45,706 Actually this really isn't very, 1074 00:42:45,706 --> 00:42:47,186 very clear if I just point and say nothing. 1075 00:42:47,476 --> 00:42:51,846 Okay, so [laughter] 4 is currently the smallest value 1076 00:42:51,846 --> 00:42:52,456 that I found. 1077 00:42:52,456 --> 00:42:53,796 So where do we move you? 1078 00:42:54,366 --> 00:42:55,246 So we swap this. 1079 00:42:55,246 --> 00:42:58,386 And again, it might seem that 8 is moving into his position 1080 00:42:58,386 --> 00:43:00,446 and that is true 'cause presumably it is bigger 1081 00:43:00,446 --> 00:43:01,156 and is moving up. 1082 00:43:01,396 --> 00:43:02,436 But for the most part, 1083 00:43:02,436 --> 00:43:04,646 these swaps are just moving random numbers 1084 00:43:04,836 --> 00:43:06,196 to other random location. 1085 00:43:06,196 --> 00:43:08,716 So we're not actually making the problem any worse fundamentally. 1086 00:43:08,946 --> 00:43:10,176 So now 8 is my smallest. 1087 00:43:10,246 --> 00:43:12,896 Oh, now 6 is, now 5 is. 1088 00:43:12,896 --> 00:43:15,766 So 5, alright, it's good. 1089 00:43:15,766 --> 00:43:20,476 And now 6 is my smallest, 7 is my-- 6 is my smallest, 1090 00:43:20,476 --> 00:43:24,706 6 is still smallest, 6 is still smallest and so no swaps. 1091 00:43:24,916 --> 00:43:26,476 Now I do the same thing, the same thing 1092 00:43:26,476 --> 00:43:27,556 so this begs the question. 1093 00:43:27,556 --> 00:43:28,906 This was a little more a little real world, 1094 00:43:28,906 --> 00:43:31,426 a little more human-oriented 'cause I just implemented 1095 00:43:31,426 --> 00:43:32,106 my commonsense. 1096 00:43:32,106 --> 00:43:33,696 I found the smallest number put her in place, 1097 00:43:33,696 --> 00:43:35,256 found the next smallest number, put him in place 1098 00:43:35,626 --> 00:43:36,896 and then repeat, repeat, repeat. 1099 00:43:37,216 --> 00:43:41,246 So how much time, how many steps did this algorithm take? 1100 00:43:41,246 --> 00:43:41,936 [ Inaudible] 1101 00:43:41,936 --> 00:43:45,866 >> So it's not N factorial. 1102 00:43:45,866 --> 00:43:47,526 It kind of feels like one of those expressions 1103 00:43:47,526 --> 00:43:49,006 but that would actually be really bad, right? 1104 00:43:49,006 --> 00:43:51,586 That's 8 times 7 times 6 times 5 times 4. 1105 00:43:51,756 --> 00:43:53,186 That gets really big quickly. 1106 00:43:53,186 --> 00:43:55,756 So it's actually not that bad. 1107 00:43:55,756 --> 00:43:55,823 [ Inaudible Remark ] 1108 00:43:55,823 --> 00:43:59,466 >> And log in, it's not that good. 1109 00:43:59,466 --> 00:44:00,596 [ Inaudible Remark ] 1110 00:44:00,596 --> 00:44:03,896 >> N squared, so you know what, it turns out it almost is. 1111 00:44:03,896 --> 00:44:04,836 It's not as bad as this. 1112 00:44:04,926 --> 00:44:06,366 It's almost N squared. 1113 00:44:06,366 --> 00:44:08,106 It feels like we had some optimizations. 1114 00:44:08,106 --> 00:44:10,316 We stopped looking at 1 then we stopped looking at 2 1115 00:44:10,496 --> 00:44:12,276 and we only took as many steps as we had to do 1116 00:44:12,466 --> 00:44:14,366 and on each time I didn't just blindly swap. 1117 00:44:14,366 --> 00:44:17,486 I actually looked for the optimal position for someone 1118 00:44:17,486 --> 00:44:20,776 by plucking out the smallest and yet I claim now 1119 00:44:20,776 --> 00:44:23,626 that this is still roughly N squared. 1120 00:44:23,846 --> 00:44:26,396 Why? Let's take a 5-minute break, give these guys are round 1121 00:44:26,396 --> 00:44:27,636 of applause and we'll be back in 5. 1122 00:44:27,756 --> 00:44:29,756 [ Applause ] 1123 00:44:29,876 --> 00:44:30,716 >> You guys can keep those. 1124 00:44:31,016 --> 00:44:37,296 If you can just meet the TF over there, that'd be great. 1125 00:44:37,546 --> 00:44:39,366 Okay, so the cliff hanger that we left off 1126 00:44:39,366 --> 00:44:42,366 with was how much time the selection sort take, 1127 00:44:42,606 --> 00:44:43,726 vis-a-vis bubble sort. 1128 00:44:43,726 --> 00:44:46,186 So bubble sort we said pretty much takes N 1129 00:44:46,186 --> 00:44:47,436 squared comparisons. 1130 00:44:47,716 --> 00:44:52,126 So N squared steps rather is something like bubble sorts 1131 00:44:52,176 --> 00:44:55,586 in the worst case-- oh, actually I'm the only one 1132 00:44:55,586 --> 00:44:56,516 who knows this bubble sort. 1133 00:44:56,766 --> 00:44:59,176 So the first algorithm we looked at and I was kind of trying 1134 00:44:59,176 --> 00:45:00,386 to hint it this just by using the-- 1135 00:45:00,536 --> 00:45:02,526 by saying bubbling this way, bubbling that way. 1136 00:45:02,716 --> 00:45:05,446 But in this much as when we compare values, 1137 00:45:05,686 --> 00:45:07,656 one of them might bubble up that way 1138 00:45:07,656 --> 00:45:09,496 and the other one might bubble down this way. 1139 00:45:09,806 --> 00:45:12,106 The world generally calls that first algorithm 1140 00:45:12,106 --> 00:45:15,176 where we just did ParaWise swaps something bubble sort. 1141 00:45:15,176 --> 00:45:16,986 And what's nice about it is you could actually code it 1142 00:45:16,986 --> 00:45:19,536 up in any language very, very straightforwardly. 1143 00:45:19,836 --> 00:45:22,226 But the down side is it doesn't feel 1144 00:45:22,226 --> 00:45:23,416 like the most efficient thing. 1145 00:45:23,416 --> 00:45:24,506 How slow is it? 1146 00:45:24,506 --> 00:45:26,476 Well, we'll see and feel that in just a moment. 1147 00:45:26,676 --> 00:45:29,106 But in the worst case it was in fact, N squared. 1148 00:45:29,356 --> 00:45:32,896 But in the best case it was just N, right? 1149 00:45:32,896 --> 00:45:35,266 In the best case, when I was handed all of the humans 1150 00:45:35,266 --> 00:45:38,226 in the correct order, I just had to do a sanity check once 1151 00:45:38,286 --> 00:45:41,726 down the list of N people to see, I didn't make any swaps, 1152 00:45:41,836 --> 00:45:44,496 therefore I'd be foolish to do that again and again and again 1153 00:45:44,756 --> 00:45:45,536 so I can terminate 1154 00:45:45,536 --> 00:45:48,076 that algorithm early after just N steps. 1155 00:45:48,076 --> 00:45:52,326 Now the second algorithm we did might be called selection sort. 1156 00:45:52,376 --> 00:45:54,766 That's what the world generally calls it. 1157 00:45:54,766 --> 00:45:56,506 In as much as you are selecting 1158 00:45:56,546 --> 00:45:58,366 on each iteration the smallest value. 1159 00:45:58,546 --> 00:46:00,346 And you could completely flip it around and just select 1160 00:46:00,346 --> 00:46:02,066 on each iteration the largest value. 1161 00:46:02,226 --> 00:46:03,926 But the algorithm would be the same in the N. 1162 00:46:04,176 --> 00:46:06,996 So selection sort, it wasn't as clear 1163 00:46:06,996 --> 00:46:09,256 at first what the running time is. 1164 00:46:09,256 --> 00:46:11,886 What the performance of this actual algorithm is. 1165 00:46:12,206 --> 00:46:14,296 But let's see, so with selection sort, 1166 00:46:14,626 --> 00:46:18,016 to find the smallest element it might take me what? 1167 00:46:18,336 --> 00:46:20,336 N steps. So I have to look at N values 1168 00:46:20,366 --> 00:46:23,266 because number 1 might be all the way over here, then I swap 1169 00:46:23,916 --> 00:46:27,476 but then I leave number 1 in her correct place and so now I move 1170 00:46:27,476 --> 00:46:29,986 on to look for the next smallest element which might be 1171 00:46:29,986 --> 00:46:30,896 in the worst case, where? 1172 00:46:32,216 --> 00:46:33,206 Might be over here. 1173 00:46:33,296 --> 00:46:35,346 But this is only seven steps this time. 1174 00:46:35,346 --> 00:46:37,626 So it feels like an optimization is brewing here, right? 1175 00:46:37,626 --> 00:46:39,816 Because one, I don't have to looking at it as I said. 1176 00:46:40,006 --> 00:46:41,366 I'm just now looking for the smallest-- 1177 00:46:41,366 --> 00:46:43,056 the next smallest value which happens to be 2. 1178 00:46:43,056 --> 00:46:44,756 I find him maybe worse case over here. 1179 00:46:45,006 --> 00:46:45,786 I swap him. 1180 00:46:46,056 --> 00:46:49,406 that takes just a single step or maybe three. 1181 00:46:49,406 --> 00:46:51,066 And here's where I'll start to wave my hands 1182 00:46:51,066 --> 00:46:52,676 at some implementation details. 1183 00:46:52,946 --> 00:46:55,886 But visually swapping those two people we can think 1184 00:46:55,886 --> 00:46:56,886 of that as just one step. 1185 00:46:56,886 --> 00:46:58,416 It's one swap, one operation. 1186 00:46:58,716 --> 00:47:01,276 So now I did eight steps to find number 1. 1187 00:47:01,446 --> 00:47:03,966 Took me seven steps, worse case to find number 2. 1188 00:47:04,166 --> 00:47:07,586 Might take me six steps then five then four, 1189 00:47:07,796 --> 00:47:11,286 so it seems that we have this kind of process, 8 plus 7 plus 1190 00:47:11,646 --> 00:47:14,256 and then down to 1 to find the very last person. 1191 00:47:14,256 --> 00:47:16,056 Those of you who would like looking at formulas 1192 00:47:16,056 --> 00:47:18,086 like this might see a little something 1193 00:47:18,086 --> 00:47:20,056 like I equals 1 up to 8. 1194 00:47:20,646 --> 00:47:22,286 Alright, so a little summation like that 1195 00:47:22,576 --> 00:47:24,476 and what has this actually work out to be if you use 1196 00:47:24,476 --> 00:47:26,536 that cheat sheet in the back of a-- like high school math book? 1197 00:47:29,196 --> 00:47:31,206 Alright, so if we actually do this out, 1198 00:47:31,206 --> 00:47:33,816 8 plus 7 plus 6 plus 5, I believe this works 1199 00:47:33,816 --> 00:47:39,326 out to be N times N minus 1 over 2, is that right? 1200 00:47:39,996 --> 00:47:41,136 Right, so how did I get to that? 1201 00:47:41,136 --> 00:47:43,536 Well, frankly you can-- What's that? 1202 00:47:43,976 --> 00:47:44,556 >> N plus 1. 1203 00:47:45,196 --> 00:47:45,946 >> And plus 1? 1204 00:47:45,946 --> 00:47:49,196 Okay, N times N plus 1 over 2, I get the end result 1205 00:47:49,196 --> 00:47:50,596 of a formula that's what? 1206 00:47:50,596 --> 00:47:51,726 If I actually multiply this 1207 00:47:51,726 --> 00:47:56,226 out well this is N squared plus N over 2. 1208 00:47:56,496 --> 00:47:58,766 So you know there's an N squared in there. 1209 00:47:58,766 --> 00:48:00,586 Is it N plus 1 or N minus 1? 1210 00:48:00,976 --> 00:48:01,246 >> Plus. 1211 00:48:01,246 --> 00:48:02,136 >> Okay, plus 1. 1212 00:48:02,396 --> 00:48:08,126 So, N plus 1-- N squared plus N over 2, now that feels better 1213 00:48:08,126 --> 00:48:10,876 than N squared alone was, right? 1214 00:48:10,876 --> 00:48:13,396 Because even though this is N squared plus N then I'm 1215 00:48:13,396 --> 00:48:14,066 having it. 1216 00:48:14,366 --> 00:48:16,826 So this actually feels pretty good, but here is the thing. 1217 00:48:17,126 --> 00:48:19,196 When it comes to actually implementing algorithms 1218 00:48:19,196 --> 00:48:23,426 and analyzing algorithms, it's in the context of computers, 1219 00:48:23,756 --> 00:48:26,296 these kinds of constant factors like the number 2, 1220 00:48:26,656 --> 00:48:28,836 this really isn't all that interesting. 1221 00:48:28,836 --> 00:48:31,746 I might claim selection sort is twice as fast give 1222 00:48:31,746 --> 00:48:33,176 or take than bubble sort. 1223 00:48:33,376 --> 00:48:35,176 Because look, it takes half as many steps. 1224 00:48:35,176 --> 00:48:37,806 Now again, N squared plus N, I'm kind of waving my hand 1225 00:48:37,806 --> 00:48:38,666 at some of the terms here. 1226 00:48:38,946 --> 00:48:41,786 But this, too, kind of means that you have the running time, 1227 00:48:41,786 --> 00:48:43,696 the amount of time it takes for this algorithm to run. 1228 00:48:44,136 --> 00:48:45,516 But you know what, who cares, right? 1229 00:48:45,516 --> 00:48:49,436 Every year CPUs and consumer PCs seems to get faster and faster 1230 00:48:49,436 --> 00:48:51,246 so that this year I might have a gigahertz computer, 1231 00:48:51,386 --> 00:48:53,026 next year I'm gonna have a 2 gigahertz computer 1232 00:48:53,026 --> 00:48:55,266 which means next year selection sort will be twice 1233 00:48:55,266 --> 00:48:56,206 as good as it is today. 1234 00:48:56,706 --> 00:48:59,986 Alright, so it's hard to make claims with algorithms 1235 00:49:00,026 --> 00:49:03,896 that are just these constant factors better than one another. 1236 00:49:03,896 --> 00:49:06,236 Because again, with the hardware you can kind 1237 00:49:06,276 --> 00:49:07,526 of hide those details. 1238 00:49:07,526 --> 00:49:11,066 What we really wanna do is find an algorithm that even 1239 00:49:11,066 --> 00:49:14,286 if computer-- if the computer speeds double next year, well, 1240 00:49:14,286 --> 00:49:17,596 my algorithm itself gets even more than twice as fast. 1241 00:49:17,596 --> 00:49:20,486 We really wanna ride the curve as best as possible. 1242 00:49:20,766 --> 00:49:23,006 But for now, we're gonna wave our hands for a bit 1243 00:49:23,286 --> 00:49:24,526 until we get more precise. 1244 00:49:24,596 --> 00:49:27,746 And this is to say if selection sort takes N squared plus N 1245 00:49:27,796 --> 00:49:30,426 over 2 steps, eh, this isn't that interesting. 1246 00:49:30,426 --> 00:49:33,286 And you know what, even this term here maybe this is relevant 1247 00:49:33,286 --> 00:49:35,796 when it's N squared, so that's 64 plus 8. 1248 00:49:35,796 --> 00:49:40,116 That's an 8 additional units of measure more than 64. 1249 00:49:40,116 --> 00:49:41,186 That's non trivial here. 1250 00:49:41,376 --> 00:49:43,236 But what if now N is a billion? 1251 00:49:43,306 --> 00:49:46,726 A billion squared plus another billion that's actually not all 1252 00:49:46,726 --> 00:49:48,986 that much more, so relative proportion. 1253 00:49:49,296 --> 00:49:50,216 So you know what, forget this. 1254 00:49:50,356 --> 00:49:53,196 This, in the long term as N gets bigger, this is kind 1255 00:49:53,196 --> 00:49:55,916 of a meaningless addition to the formula. 1256 00:49:55,916 --> 00:49:59,106 So we're gonna claim for now that selection sort, too, 1257 00:49:59,106 --> 00:50:02,236 in the worst case is actually pretty much comparable 1258 00:50:02,236 --> 00:50:02,976 to bubble sort. 1259 00:50:03,066 --> 00:50:06,006 >> Not if you line up the numbers precisely 1260 00:50:06,246 --> 00:50:09,016 but at least big picture if you think in terms of large N 1261 00:50:09,016 --> 00:50:11,286 and you compared the graphs, as we'll do in a moment, 1262 00:50:11,286 --> 00:50:12,256 that this formulas make. 1263 00:50:12,256 --> 00:50:14,166 You know, it's pretty much N squared. 1264 00:50:14,166 --> 00:50:16,566 And in fact, there's a problem with selection sort. 1265 00:50:16,566 --> 00:50:20,286 What if in the best case, I'm handed 1, then 2, then 3, 1266 00:50:20,286 --> 00:50:22,716 then 4 and I try to sort those elements with bubble sort. 1267 00:50:22,716 --> 00:50:25,666 How many steps is it gonna take to sort N people 1268 00:50:25,666 --> 00:50:27,726 in the best case with selection sort? 1269 00:50:27,726 --> 00:50:28,576 >> The same. 1270 00:50:28,856 --> 00:50:29,796 >> It's the same, right? 1271 00:50:29,796 --> 00:50:30,856 Because even though it felt 1272 00:50:30,856 --> 00:50:33,366 like this nicely intuitive algorithm just keeps selecting 1273 00:50:33,366 --> 00:50:34,646 the smallest possible value. 1274 00:50:34,906 --> 00:50:36,326 Well, what's gonna happen in the best case. 1275 00:50:36,426 --> 00:50:37,946 They're all lined up from 1 to 8. 1276 00:50:38,176 --> 00:50:39,966 I find number 1 here and I say, okay, 1277 00:50:39,966 --> 00:50:43,186 he's the current smallest then I find 2, uh-hmm, 3, uh-hmm, 1278 00:50:43,186 --> 00:50:45,226 uh-hmm and not until I get to the end do I realize, 1279 00:50:45,276 --> 00:50:47,446 we'll that was stupid I found him in the very first step, 1280 00:50:47,826 --> 00:50:49,236 now let me go back to the beginning. 1281 00:50:49,236 --> 00:50:51,466 I don't have to look at him anymore because now I have 1282 00:50:51,466 --> 00:50:52,836 to look for the next smallest 1283 00:50:52,836 --> 00:50:55,936 and so I'm looking now for 2, right. 1284 00:50:55,936 --> 00:50:57,766 So now where's the next smallest? 1285 00:50:57,896 --> 00:51:00,566 Well 2 is right here, but maybe there's 1.5 1286 00:51:00,566 --> 00:51:01,696 or something like that over there. 1287 00:51:01,696 --> 00:51:03,326 I have to keep looking and looking and looking 1288 00:51:03,586 --> 00:51:05,446 so here's the downside of selection sort. 1289 00:51:05,446 --> 00:51:08,346 In the best case, it too is N squared 1290 00:51:08,436 --> 00:51:10,496 so while it might feel more intuitive, it might feel 1291 00:51:10,496 --> 00:51:11,716 like the more obvious algorithm, 1292 00:51:12,006 --> 00:51:14,826 we've just made the situation worst 'cause I don't know 1293 00:51:14,826 --> 00:51:16,906 in advance what these elements are. 1294 00:51:16,906 --> 00:51:18,606 So let's add some nomenclature here 1295 00:51:18,606 --> 00:51:20,506 that computer scientists are fond of using. 1296 00:51:20,826 --> 00:51:22,666 There's a symbol that the world generally uses 1297 00:51:22,706 --> 00:51:25,206 to describe what's called the worst case 1298 00:51:25,736 --> 00:51:28,386 and this is the capital O, often written in italics 1299 00:51:28,816 --> 00:51:32,956 so this is what the world generally calls big O notation. 1300 00:51:33,276 --> 00:51:37,106 And if you say an algorithm is in big O of N squared 1301 00:51:37,226 --> 00:51:38,796 that just means in the worst possible case 1302 00:51:38,846 --> 00:51:40,976 if you handed everything completely out of order, 1303 00:51:41,206 --> 00:51:43,106 it might take as many as this many steps. 1304 00:51:43,516 --> 00:51:46,826 Now in the best case, we use a different symbol a Greek one 1305 00:51:46,826 --> 00:51:47,656 this time, omega. 1306 00:51:48,046 --> 00:51:49,966 Omega just is the fancy way of saying 1307 00:51:49,966 --> 00:51:53,316 in the best possible case this algorithm might take N steps. 1308 00:51:53,706 --> 00:51:54,826 So these two correspond 1309 00:51:54,826 --> 00:51:57,736 to bubble sorts' worst and best cases. 1310 00:51:57,736 --> 00:52:01,286 Now we can do the same symbology for selection sort 1311 00:52:02,506 --> 00:52:05,486 and so we now have big O of N squared for the worst case, 1312 00:52:05,726 --> 00:52:08,806 omega of N squared for the best case and just to toss 1313 00:52:08,976 --> 00:52:12,086 out one additional piece of jargon if you say theta 1314 00:52:12,456 --> 00:52:15,976 of N squared that just means it's essentially in both big O 1315 00:52:15,976 --> 00:52:17,946 of N squared and in omega. 1316 00:52:18,046 --> 00:52:20,196 So in other words, if these are the same, you can then whip 1317 00:52:20,196 --> 00:52:22,746 out this notation as well, this theta notation. 1318 00:52:22,946 --> 00:52:23,816 Now why do we use this? 1319 00:52:23,886 --> 00:52:25,306 'Cause it's just a nice shorthand way 1320 00:52:25,306 --> 00:52:27,806 of saying best case, worst case and we can now start to talk 1321 00:52:27,806 --> 00:52:29,626 about the efficiency of our algorithms. 1322 00:52:29,626 --> 00:52:32,976 We can talk about the quality of their design in terms 1323 00:52:32,976 --> 00:52:36,036 of their performance in a way that's not CPU specific. 1324 00:52:36,036 --> 00:52:38,526 It doesn't matter if it's Intel Inside or anything else 1325 00:52:38,746 --> 00:52:40,786 because now we're looking at a higher level than we have 1326 00:52:40,786 --> 00:52:42,866 in the weeks past at the underlying operation. 1327 00:52:42,866 --> 00:52:45,496 Swapping two things conceptually or stepping 1328 00:52:45,496 --> 00:52:47,286 down an array conceptually. 1329 00:52:47,286 --> 00:52:49,036 It's step after step after step. 1330 00:52:49,396 --> 00:52:51,116 So let's see if we can't visualize a couple 1331 00:52:51,116 --> 00:52:53,956 of these algorithms in a more rapid way 1332 00:52:54,246 --> 00:52:55,846 than humans here allowed for. 1333 00:52:56,056 --> 00:52:58,636 So this is a little program that's available online. 1334 00:52:58,636 --> 00:53:00,316 We've linked to it on CS50's website. 1335 00:53:00,536 --> 00:53:02,396 I switched over to one of the TF's PCs here 1336 00:53:02,396 --> 00:53:04,946 because it doesn't work properly on a Mac unfortunately. 1337 00:53:05,236 --> 00:53:07,326 The buttons don't appear on the bottom 1338 00:53:07,326 --> 00:53:09,596 but that's fine just realize if you play with it on your Macs, 1339 00:53:09,596 --> 00:53:11,476 you'll be missing some of the demo and notice 1340 00:53:11,476 --> 00:53:13,766 that in the bottom left hand corner of this little applet, 1341 00:53:14,146 --> 00:53:17,186 happens to be written in Java, I can choose a whole bunch 1342 00:53:17,186 --> 00:53:19,596 of sorts and I've preselected bubble sort, 1343 00:53:19,726 --> 00:53:21,476 that's the first thing we started with. 1344 00:53:21,766 --> 00:53:24,286 You can specify what the array is gonna look like at first, 1345 00:53:24,286 --> 00:53:26,086 you can specify a time delay here. 1346 00:53:26,276 --> 00:53:27,786 I'm gonna go ahead and just click start 1347 00:53:27,786 --> 00:53:28,996 and we'll see what happens. 1348 00:53:29,606 --> 00:53:32,546 So what you see on the board now is just a visualization 1349 00:53:32,726 --> 00:53:34,666 of what we did a moment ago with humans. 1350 00:53:34,746 --> 00:53:37,656 Each of these bars represents a number. 1351 00:53:37,656 --> 00:53:40,416 The taller the bar, the bigger the number, the smaller the bar, 1352 00:53:40,496 --> 00:53:42,886 the smaller the number, so small bar might be 1, 1353 00:53:43,176 --> 00:53:44,596 big bar might be like 100. 1354 00:53:44,886 --> 00:53:47,476 And what's happening here in red is that this algorithm much 1355 00:53:47,476 --> 00:53:50,306 like my feet on stage is looping from left to right, 1356 00:53:50,426 --> 00:53:53,856 left to right on an each iteration it's comparing two 1357 00:53:53,856 --> 00:53:56,536 elements making them red and then swapping them, 1358 00:53:56,736 --> 00:53:58,926 if in fact they are out of order. 1359 00:53:59,486 --> 00:54:01,686 And even though this is moving at a pretty good clip, 1360 00:54:01,986 --> 00:54:03,706 you can see a couple of features here. 1361 00:54:03,706 --> 00:54:05,986 One, it's kinda boring and it's kind 1362 00:54:05,986 --> 00:54:08,666 of slow even though those bars are moving in a decent clip left 1363 00:54:08,666 --> 00:54:11,986 to right but clearly visually are the larger elements bubbling 1364 00:54:11,986 --> 00:54:13,556 up to the right hand side 1365 00:54:13,786 --> 00:54:16,266 and gradually are the smaller elements bubbling 1366 00:54:16,266 --> 00:54:18,916 down if you will to the left hand side. 1367 00:54:19,036 --> 00:54:24,266 So it's almost done, almost done, almost done, almost done, 1368 00:54:25,206 --> 00:54:26,796 almost done and it only makes them red 1369 00:54:26,796 --> 00:54:28,706 when it actually swaps them, okay done. 1370 00:54:28,916 --> 00:54:30,996 And down here if you're curious afterwards to the play 1371 00:54:30,996 --> 00:54:32,046 with the demo in more detail, 1372 00:54:32,046 --> 00:54:34,156 it actually counts how many comparisons you're making, 1373 00:54:34,156 --> 00:54:37,456 how many swaps you're making and so forth. 1374 00:54:37,456 --> 00:54:40,056 So now let's do this, let me go ahead and click stop here. 1375 00:54:40,406 --> 00:54:43,546 I'm gonna change the algorithm to selection sort. 1376 00:54:43,936 --> 00:54:47,076 I'm gonna use the same speed to be fair and click start 1377 00:54:47,486 --> 00:54:50,746 and now less seems to be happening from unit of time, 1378 00:54:50,746 --> 00:54:54,086 there is less flashiness but notice now highlighted 1379 00:54:54,086 --> 00:54:57,536 in red is the smallest element just encountered and as soon 1380 00:54:57,536 --> 00:55:00,216 as it's found, by conforming, by going all the way to the right, 1381 00:55:00,516 --> 00:55:01,636 it drops it into place. 1382 00:55:02,006 --> 00:55:04,176 So visually, this algorithm is working kind 1383 00:55:04,176 --> 00:55:05,436 of the opposite direction. 1384 00:55:05,656 --> 00:55:07,476 The end result is of course gonna be the same 1385 00:55:07,766 --> 00:55:09,936 and it actually felt a little faster 1386 00:55:09,936 --> 00:55:12,596 and it absolutely did fewer swaps because this time 1387 00:55:12,596 --> 00:55:14,416 when we're swapping, we're really swapping it 1388 00:55:14,416 --> 00:55:15,896 as far away as we want. 1389 00:55:15,896 --> 00:55:18,766 We're not just doing these adjacent swaps but these too, 1390 00:55:18,766 --> 00:55:20,616 if you actually count out the number of steps 1391 00:55:20,616 --> 00:55:21,726 in the worst possible case, 1392 00:55:21,976 --> 00:55:23,936 you'll see these numbers N squared. 1393 00:55:23,936 --> 00:55:24,896 You've got 100 elements, 1394 00:55:24,896 --> 00:55:27,636 it's gonna be 100 squared possible steps for comparisons 1395 00:55:27,636 --> 00:55:30,866 or swaps and same deal for bubble sort as well. 1396 00:55:31,166 --> 00:55:34,736 So that then begs the question, can we do different 1397 00:55:34,736 --> 00:55:36,076 and why would we wanna do different? 1398 00:55:36,076 --> 00:55:38,736 Well let me toggle back here to these visuals 1399 00:55:39,026 --> 00:55:42,096 and see what the implications are for algorithms that are 1400 00:55:42,096 --> 00:55:46,186 in fact N squared or just in general not so well implemented. 1401 00:55:46,186 --> 00:55:48,526 So this is a little chart that I just I happen 1402 00:55:48,526 --> 00:55:49,516 to make up with Excel. 1403 00:55:49,516 --> 00:55:50,596 There's not all that much going-- 1404 00:55:50,696 --> 00:55:53,146 oh kind of not the best fidelity on the screen here 1405 00:55:53,146 --> 00:55:56,616 but I'll point out the details that are actually important. 1406 00:55:57,066 --> 00:55:58,306 Here-- thanks, got it. 1407 00:55:58,736 --> 00:56:02,276 Okay, so on the X axis here we just had N so this happens 1408 00:56:02,276 --> 00:56:05,186 to be 0, this happens to be 33, 66 and so forth. 1409 00:56:05,286 --> 00:56:07,136 It doesn't matter what they are just on the right-- 1410 00:56:07,226 --> 00:56:10,486 on that X axis, there's a value of N and it's increasing 1411 00:56:10,486 --> 00:56:12,846 so this implies more and more humans or more 1412 00:56:12,846 --> 00:56:13,896 and more numbers to sort. 1413 00:56:14,156 --> 00:56:16,646 Meanwhile on the Y axis here, this is meant to be-- 1414 00:56:16,786 --> 00:56:20,656 depict time so and if you have 33 elements 1415 00:56:20,656 --> 00:56:23,296 as this suggests according to this chart here, 1416 00:56:23,296 --> 00:56:25,086 it's saying this takes 5 units of time 1417 00:56:25,086 --> 00:56:27,276 and this means more units of time and if it's up here, 1418 00:56:27,576 --> 00:56:28,856 that's even more units of time. 1419 00:56:28,856 --> 00:56:30,386 So the numbers frankly are irrelevant, 1420 00:56:30,596 --> 00:56:32,026 it's really the curves of these things 1421 00:56:32,026 --> 00:56:33,986 that are worth considering for just a moment. 1422 00:56:34,186 --> 00:56:36,436 So there's 3 different graphs here that I made with Excel. 1423 00:56:36,726 --> 00:56:39,846 One represents N, so the little diamonds or the ones here 1424 00:56:39,846 --> 00:56:42,406 on the left hand side and even though it's not going 1425 00:56:42,406 --> 00:56:45,306 up at a 45-degree angle based on how I did the axes, 1426 00:56:45,486 --> 00:56:47,146 it's obviously a straight line. 1427 00:56:47,146 --> 00:56:50,216 An algorithm that takes N steps, means if you increase the number 1428 00:56:50,216 --> 00:56:52,306 of humans or the number of numbers by 1, 1429 00:56:52,306 --> 00:56:54,516 it's gonna take you 1 more step to count it. 1430 00:56:54,546 --> 00:56:56,686 So consider for instance bubble sort in the best case. 1431 00:56:56,956 --> 00:56:59,756 If I didn't have 8 volunteers but I had 9 volunteers well 1432 00:56:59,756 --> 00:57:02,186 in the best case, it's gonna take me one additional step 1433 00:57:02,216 --> 00:57:04,676 to count that-- check that one additional human 'cause I just 1434 00:57:04,676 --> 00:57:08,896 have to take one 9th step over here so that we can see 1435 00:57:09,016 --> 00:57:11,666 as pictorially as this relationship here. 1436 00:57:11,926 --> 00:57:14,906 Now suppose I come up with a smarter algorithm as I did 1437 00:57:14,906 --> 00:57:15,916 in the first day of class. 1438 00:57:16,206 --> 00:57:17,896 Recall that the simplest algorithm 1439 00:57:17,896 --> 00:57:21,286 for counting all you guys is 1, 2, 3, 4, 5, 6, 7, 8, 1440 00:57:21,656 --> 00:57:24,806 this takes a while but it's slightly faster actually it's 1441 00:57:24,806 --> 00:57:27,556 twice as fast if I use that little grade school trick of 2, 1442 00:57:27,556 --> 00:57:31,456 4, 6, 8, 10, 12 and so now that algorithm is 1443 00:57:31,456 --> 00:57:32,456 in fact twice as fast. 1444 00:57:32,966 --> 00:57:34,046 It's still a straight line 1445 00:57:34,046 --> 00:57:36,316 and that's what this one what the squares represent. 1446 00:57:36,436 --> 00:57:38,246 So according to the axes, 1447 00:57:38,246 --> 00:57:40,556 I've chosen here it's still a straight line but notice 1448 00:57:40,606 --> 00:57:43,736 that it moves at a lesser gradient so that's good 1449 00:57:43,956 --> 00:57:47,586 because it means as N increases the amount of time taken 1450 00:57:47,776 --> 00:57:50,356 to count-- the amount of time involved in counting people 1451 00:57:50,356 --> 00:57:53,426 in 2s instead of 1 obviously is less 1452 00:57:53,646 --> 00:57:55,186 than if you're just counting them one at a time. 1453 00:57:55,616 --> 00:57:58,656 But here is really the end game, this thing here. 1454 00:57:58,996 --> 00:58:01,436 So recall that the other algorithm we deployed 1455 00:58:01,436 --> 00:58:04,066 in the very first day of class was that self-counting algorithm 1456 00:58:04,066 --> 00:58:06,516 where you guys all stood up and half of you sat down, 1457 00:58:06,586 --> 00:58:09,416 half of you sat down, half of you sat down so that we went 1458 00:58:09,416 --> 00:58:12,866 from like 500 to 250 to 125 and so forth 1459 00:58:12,866 --> 00:58:15,626 and so there were just one in theory person standing. 1460 00:58:15,626 --> 00:58:17,686 Now, it will took a while 'cause there's a lot of social issues 1461 00:58:17,686 --> 00:58:20,166 and coordination of all that going on but in theory, 1462 00:58:20,166 --> 00:58:22,556 on each iteration that problem was getting halved, 1463 00:58:23,096 --> 00:58:23,656 halved, halved. 1464 00:58:23,656 --> 00:58:25,786 Just like the phonebook getting halved, halved, halved. 1465 00:58:26,056 --> 00:58:26,626 Well, what does it look 1466 00:58:26,626 --> 00:58:29,356 like when you halve a problem again and again and again. 1467 00:58:29,356 --> 00:58:31,076 Well it looks like this chart here. 1468 00:58:31,346 --> 00:58:35,156 When we describe that algorithm as logarithmic or log base 2 1469 00:58:35,156 --> 00:58:38,486 of N, this is what it looks like and again it's curves here 1470 00:58:38,486 --> 00:58:39,976 that are interesting because it looks 1471 00:58:39,976 --> 00:58:43,376 like if I'm counting people in Sanders, the naive way one 1472 00:58:43,376 --> 00:58:45,596 at a time, I mean this is gonna take a ridiculous amount 1473 00:58:45,596 --> 00:58:47,856 of time, the chart literally goes up to the roof. 1474 00:58:47,896 --> 00:58:50,996 Now it's slightly better if I count you in 2s but even 1475 00:58:50,996 --> 00:58:52,346 that eventually for large N, 1476 00:58:52,346 --> 00:58:54,106 it's gonna take a ridiculous amount of time 1477 00:58:54,396 --> 00:58:57,906 but if we use something like the divide and conquer strategy 1478 00:58:57,906 --> 00:59:00,256 of the phonebook or as applied to humans having half 1479 00:59:00,256 --> 00:59:01,686 of you sit down, half of you sit down, 1480 00:59:02,126 --> 00:59:05,366 this is when this curve gets amazingly powerful, right. 1481 00:59:05,366 --> 00:59:08,206 Literally, on that first day of class, we could have had 1482 00:59:08,206 --> 00:59:12,146 like justice come back into Sanders, right, doubling-- 1483 00:59:12,146 --> 00:59:14,506 more than doubling the number of students in the room and 1484 00:59:14,506 --> 00:59:16,936 yet the algorithm you guys were applying one more step, right, 1485 00:59:17,056 --> 00:59:18,596 'cause we could've made all of justice sit 1486 00:59:18,596 --> 00:59:22,966 down the moment they enter the room and we just-- right? 1487 00:59:23,406 --> 00:59:26,366 [Laughter] We know that we've taken a bite 1488 00:59:26,366 --> 00:59:29,486 out of the problem that's half as large as N itself. 1489 00:59:29,796 --> 00:59:33,246 So this is really the sort of golden ring that you're 1490 00:59:33,246 --> 00:59:35,956 after when you're trying to come up with the most efficient 1491 00:59:35,956 --> 00:59:38,856 or the most alluring algorithm possible and the payoffs 1492 00:59:38,856 --> 00:59:40,776 as this chart suggests are huge. 1493 00:59:41,036 --> 00:59:44,246 This is one 130 here, this means if we have 260 students that's 1494 00:59:44,246 --> 00:59:48,026 over here and then 500 or so over here, that chart 1495 00:59:48,026 --> 00:59:50,736 or that graph rather is flat lining 1496 00:59:50,736 --> 00:59:53,266 in this way far more so than these things. 1497 00:59:53,266 --> 00:59:59,456 Now of course, we cannot sort N elements in log N time, right. 1498 00:59:59,456 --> 01:00:01,976 Log N seems to be again what we're after. 1499 01:00:02,216 --> 01:00:04,676 >> The holy grail of algorithms 'cause look how much better it 1500 01:00:04,676 --> 01:00:06,166 is than anything we've looked at thus far. 1501 01:00:06,446 --> 01:00:10,156 But I claim we can't sort N individuals in logarithmic time 1502 01:00:10,256 --> 01:00:11,856 so neither of these algorithms were that good 1503 01:00:11,856 --> 01:00:13,446 and I don't think I can come up with one that's good 1504 01:00:13,756 --> 01:00:15,546 and intuitively why might that be the case? 1505 01:00:16,916 --> 01:00:20,096 Why can you not sort N people or N numbers 1506 01:00:20,456 --> 01:00:24,686 in the log base N of time? 1507 01:00:24,936 --> 01:00:25,696 So clearly, yeah? 1508 01:00:25,696 --> 01:00:26,576 >> So this at least gets you run 1509 01:00:26,576 --> 01:00:28,296 through from the beginning to the end. 1510 01:00:28,606 --> 01:00:29,796 >> Yeah. There's this one got to. 1511 01:00:29,796 --> 01:00:32,396 You can't possibly sort any individuals in less 1512 01:00:32,536 --> 01:00:35,236 than say N time because you don't know 1513 01:00:35,236 --> 01:00:37,366 if they're actually sorted unless you actually check 1514 01:00:37,436 --> 01:00:38,456 that list, right. 1515 01:00:38,456 --> 01:00:39,436 Otherwise, it's guesswork. 1516 01:00:39,436 --> 01:00:41,546 Now we-- the humans might glance at the list, oh, sort it. 1517 01:00:41,546 --> 01:00:42,466 But how did you do that? 1518 01:00:42,586 --> 01:00:44,306 Well maybe unbeknownst to you or your brain, 1519 01:00:44,306 --> 01:00:47,546 you actually did scheme that list at least one eye movement 1520 01:00:47,546 --> 01:00:49,906 for each person up here so that's at least 8 steps. 1521 01:00:50,016 --> 01:00:52,726 So this might work for searching and that's what we've-- 1522 01:00:52,726 --> 01:00:53,976 this might work for counting. 1523 01:00:54,256 --> 01:00:56,236 I'm searching the phonebook, counting the students 1524 01:00:56,376 --> 01:00:58,296 but for sorting we need to spend a bit more. 1525 01:00:58,296 --> 01:00:59,946 Well, what would these algorithms feel like. 1526 01:00:59,946 --> 01:01:02,226 We'll here's a different chart, a different axis 1527 01:01:02,226 --> 01:01:03,826 but again it's the shape that we care about. 1528 01:01:04,146 --> 01:01:06,336 So here is the same chart 1529 01:01:06,706 --> 01:01:09,216 but we've given us more of a Y access. 1530 01:01:09,216 --> 01:01:11,836 Here is what I will call N so this is-- 1531 01:01:11,836 --> 01:01:13,336 if you increase the number of students, 1532 01:01:13,336 --> 01:01:14,756 your algorithm increases by 1. 1533 01:01:14,756 --> 01:01:18,656 It's a 1 to N ratio here or rather it's a one to one ratio. 1534 01:01:19,036 --> 01:01:21,346 Here is log base 2 of N so when you-- 1535 01:01:21,476 --> 01:01:23,896 right, this is how you can kind of mislead people 1536 01:01:23,896 --> 01:01:26,546 with statistics and charts and consulting and all of this 1537 01:01:26,546 --> 01:01:27,586 where I change your axis 1538 01:01:27,916 --> 01:01:30,806 and look how N looks you know not all of that much worse 1539 01:01:30,866 --> 01:01:34,356 than log base N-- log base 2 of N, right. 1540 01:01:34,586 --> 01:01:36,446 Your clients might as well buy the N algorithm, 1541 01:01:36,446 --> 01:01:37,536 it's not all that much slower. 1542 01:01:38,016 --> 01:01:41,566 But notice here if we actually are now looking at algorithms 1543 01:01:41,566 --> 01:01:45,856 as we're gonna start to in class that take may be N log N squared 1544 01:01:45,856 --> 01:01:47,856 or actually this is pretty rather-- sorry. 1545 01:01:48,786 --> 01:01:50,926 If we look at algorithms that take N squared time 1546 01:01:51,196 --> 01:01:54,036 like bubble sort and like selection sort, I mean my God, 1547 01:01:54,036 --> 01:01:57,286 if only we could sort N individuals in just N time. 1548 01:01:57,286 --> 01:01:59,646 You really start to feel the difference again even 1549 01:01:59,646 --> 01:02:00,726 irrespective of the axis. 1550 01:02:00,726 --> 01:02:02,366 The shapes of these graphs suggest 1551 01:02:02,366 --> 01:02:05,206 that if you can avoid N squared, you really should. 1552 01:02:05,246 --> 01:02:06,746 In fact if we zoom out even further, 1553 01:02:06,746 --> 01:02:08,606 there are some algorithms in this world 1554 01:02:08,936 --> 01:02:11,496 that are believed to be exponential. 1555 01:02:11,966 --> 01:02:13,666 This isn't even as bad as factorial, 1556 01:02:13,666 --> 01:02:14,976 exponential here at top. 1557 01:02:15,236 --> 01:02:17,586 If you have algorithm that takes 2 to the N steps 1558 01:02:17,586 --> 01:02:21,336 which we may discuss briefly in time, for more on this CS121, 1559 01:02:21,336 --> 01:02:24,616 CS124, I mean my God, this is absolutely an algorithm 1560 01:02:24,616 --> 01:02:28,126 to be avoided because here you are like 200,000 or so steps 1561 01:02:28,456 --> 01:02:30,886 and well, here is N down here at which point all 1562 01:02:30,886 --> 01:02:32,456 of these algorithms look better and better, 1563 01:02:32,716 --> 01:02:33,656 but this is the take away. 1564 01:02:33,656 --> 01:02:35,966 When you start looking at problems like Facebook faces 1565 01:02:35,966 --> 01:02:38,296 and Goggle faces and N is actually big, 1566 01:02:38,496 --> 01:02:39,946 these are the kinds of curves 1567 01:02:40,186 --> 01:02:42,386 that are particularly alluring for someone. 1568 01:02:42,686 --> 01:02:45,496 So can we do better and how can we describe these things. 1569 01:02:45,496 --> 01:02:47,916 Well, again we got big O, we've omega got and theta. 1570 01:02:47,916 --> 01:02:51,056 Let's see if we can't describe yet another running time here 1571 01:02:51,206 --> 01:02:52,286 but we need a tool for this. 1572 01:02:52,766 --> 01:02:56,136 So, we have talked certainly about the phonebook. 1573 01:02:56,256 --> 01:02:59,016 It's sort of ad nauseam at this point and that was an example 1574 01:02:59,016 --> 01:03:00,036 of divide and conquer. 1575 01:03:00,036 --> 01:03:02,696 Divide the phonebook in half and then do it again, do it again, 1576 01:03:02,696 --> 01:03:05,956 do it again and there in lies kind of the insight 1577 01:03:06,156 --> 01:03:07,336 of divide and conquer. 1578 01:03:07,336 --> 01:03:10,596 We did this too on Monday with the second array. 1579 01:03:10,826 --> 01:03:13,096 We dived into the middle of the array, then we dived 1580 01:03:13,096 --> 01:03:15,666 into the middle here, then the middle here and voila, 1581 01:03:15,666 --> 01:03:18,456 there was our element, the number 50. 1582 01:03:18,706 --> 01:03:21,186 So when you do something again and again and again 1583 01:03:21,186 --> 01:03:23,716 and the only thing that's changing is really the size 1584 01:03:23,716 --> 01:03:26,176 of the problem, the steps you are executing are still 1585 01:03:26,176 --> 01:03:26,606 the same. 1586 01:03:26,786 --> 01:03:28,226 I'm still tearing a phonebook in half. 1587 01:03:28,226 --> 01:03:29,196 It's just half as big now. 1588 01:03:29,196 --> 01:03:29,916 I'm going it again. 1589 01:03:29,916 --> 01:03:32,926 It's now a quarter as big but it's still the same algorithm. 1590 01:03:33,136 --> 01:03:35,206 Well, this is in general known as recursion 1591 01:03:35,466 --> 01:03:37,906 and we can write functions and I did this just kind 1592 01:03:37,906 --> 01:03:40,956 of for fun the other day, we can deliberately write functions 1593 01:03:41,036 --> 01:03:43,876 that recurs by calling themselves. 1594 01:03:44,016 --> 01:03:45,056 Now what do we mean by this. 1595 01:03:45,056 --> 01:03:48,646 Well, in the context of C, we can write a function and surely 1596 01:03:48,646 --> 01:03:49,916 that function can call itself. 1597 01:03:49,916 --> 01:03:52,866 If I define a function called, fu, if I want fu to call itself, 1598 01:03:52,866 --> 01:03:55,226 I just have to write the word fu open parenthesis, 1599 01:03:55,226 --> 01:03:57,656 close parenthesis, semicolon, inside fu itself, 1600 01:03:57,686 --> 01:03:59,906 but bad things happened when I did this last time. 1601 01:03:59,906 --> 01:04:01,986 What happened when I wrote literally a function fu 1602 01:04:01,986 --> 01:04:02,706 that called itself? 1603 01:04:02,876 --> 01:04:03,236 >> Seg fault. 1604 01:04:03,486 --> 01:04:04,356 >> So I got a seg fault. 1605 01:04:04,486 --> 01:04:05,426 Now why was that? 1606 01:04:05,426 --> 01:04:06,586 Well again just think back 1607 01:04:06,586 --> 01:04:08,396 to the building blocks we've been looking at, right? 1608 01:04:08,396 --> 01:04:10,216 Where you have memory in our computer, 1609 01:04:10,216 --> 01:04:11,646 we keep drawing it as a rectangle. 1610 01:04:11,916 --> 01:04:15,086 Every time you call a function, you add a frame to that stack. 1611 01:04:15,226 --> 01:04:17,856 In other words, you ask for some memory and conceptually it's 1612 01:04:17,856 --> 01:04:19,586 like putting it on top of the stack 1613 01:04:19,586 --> 01:04:22,416 of memory you've already asked for so you call fu, you call fu, 1614 01:04:22,416 --> 01:04:25,026 you call fu, fu, fu, fu, fu, what eventually happens? 1615 01:04:25,026 --> 01:04:25,256 >> Run out. 1616 01:04:26,376 --> 01:04:28,416 >> Right, you run out or you hit some other memory 1617 01:04:28,416 --> 01:04:29,496 that doesn't belong to you 1618 01:04:29,496 --> 01:04:31,716 and that's something we'll see before long called the heap 1619 01:04:31,896 --> 01:04:34,336 but in short doing this again and again and again is bad 1620 01:04:34,556 --> 01:04:36,486 so recursion does have this risk in it. 1621 01:04:36,486 --> 01:04:38,516 If you do the same thing again and again 1622 01:04:38,786 --> 01:04:40,906 but in each time allocate memory, you might run 1623 01:04:40,906 --> 01:04:43,756 into some trouble but perhaps it's nonetheless a technique we 1624 01:04:43,756 --> 01:04:44,446 can leverage. 1625 01:04:44,706 --> 01:04:47,206 So let's take a look at one piece of code here 1626 01:04:47,206 --> 01:04:49,236 that was among in the printouts from Monday 1627 01:04:49,236 --> 01:04:53,406 and this is something called sigma 1. 1628 01:04:53,406 --> 01:04:56,026 Just out of habit, I'm gonna start using VI sometimes instead 1629 01:04:56,026 --> 01:04:56,476 of nano. 1630 01:04:56,476 --> 01:04:58,906 It's just another text editor with more arcane commands 1631 01:04:58,946 --> 01:04:59,846 but it does the same thing. 1632 01:05:00,206 --> 01:05:01,186 I'm gonna scroll down 1633 01:05:01,576 --> 01:05:04,216 and introduce this relatively tiny program. 1634 01:05:04,906 --> 01:05:07,636 So this program here has a main function 1635 01:05:07,806 --> 01:05:11,286 and the first thing this function does is ask the user 1636 01:05:11,286 --> 01:05:13,256 for int so this is actually a good use 1637 01:05:13,256 --> 01:05:15,206 of the do-while construct because recall 1638 01:05:15,206 --> 01:05:17,456 that do-while tells the user to do something 1639 01:05:17,456 --> 01:05:18,786 and maybe get something from the user 1640 01:05:18,786 --> 01:05:21,446 but if the user is being difficult, I can force it 1641 01:05:21,446 --> 01:05:24,466 to loop again and again after I check that value. 1642 01:05:24,526 --> 01:05:27,546 So this chunk of code here is the comments explained. 1643 01:05:27,796 --> 01:05:30,416 I'm forcing the user to give me a positive integer, 1644 01:05:30,696 --> 01:05:32,946 not a negative nonzero but a positive integer. 1645 01:05:33,226 --> 01:05:34,886 Now at this line in the story, 1646 01:05:35,106 --> 01:05:37,906 I'm apparently declaring a variable called answer it's 1647 01:05:37,906 --> 01:05:41,076 of type int and I'm lining it with value of sigma 1648 01:05:41,316 --> 01:05:44,706 so I chose sigma based on this symbol here just to be clever 1649 01:05:44,706 --> 01:05:46,806 but sigma is apparently a function that I wrote. 1650 01:05:47,216 --> 01:05:48,276 It takes one argument. 1651 01:05:48,666 --> 01:05:50,316 I'm passing it apparently an integer 1652 01:05:50,316 --> 01:05:51,636 so let's see what it does. 1653 01:05:51,966 --> 01:05:54,156 Well actually, this is pretty simple, 1654 01:05:54,156 --> 01:05:56,096 it looks a little cryptic at first but this is 1655 01:05:56,156 --> 01:05:58,216 like day 1 stuff now, right. 1656 01:05:58,306 --> 01:05:59,626 Sigma is gonna return an int. 1657 01:06:00,156 --> 01:06:01,926 It's gonna take an int as it's argument 1658 01:06:01,926 --> 01:06:03,046 and I'm calling it M just 1659 01:06:03,046 --> 01:06:05,596 because on a little sanity check, I don't wanna get 1660 01:06:05,596 --> 01:06:07,466 into an infinite loop here by accident 1661 01:06:07,466 --> 01:06:10,626 so I'm making sure M is not less than 1. 1662 01:06:10,996 --> 01:06:13,846 If it is, I'm returning 0 just so bad things don't happen 1663 01:06:14,146 --> 01:06:15,076 and now what am I doing? 1664 01:06:15,076 --> 01:06:18,056 I'm declaring a variable called sum, initializing it to zero 1665 01:06:18,366 --> 01:06:19,446 and then this loop here. 1666 01:06:19,446 --> 01:06:20,936 Well, actually it's pretty straightforward. 1667 01:06:20,936 --> 01:06:24,216 I starts at 1 goes up to and through M 1668 01:06:24,496 --> 01:06:26,816 and on each iteration plus equals just does what? 1669 01:06:28,186 --> 01:06:30,506 It adds the thing on the right to the left. 1670 01:06:30,506 --> 01:06:32,576 This is equivalent and it's just shorthand notation 1671 01:06:32,576 --> 01:06:35,866 for saying sum equals sum plus I. 1672 01:06:36,146 --> 01:06:37,646 It's just a little more succinct than that. 1673 01:06:37,876 --> 01:06:40,896 So in the end, the sigma function just returns the sum 1674 01:06:40,896 --> 01:06:44,556 of all of the numbers from N down to 1 or down 1675 01:06:44,556 --> 01:06:45,816 to 0 which are equivalent. 1676 01:06:46,136 --> 01:06:47,746 Alright, so this is one way to do it. 1677 01:06:47,986 --> 01:06:50,606 Let me go ahead and go back to the original here 1678 01:06:50,606 --> 01:06:55,876 so this is plus equal 1, let me ago ahead and compile sigma 1. 1679 01:06:56,286 --> 01:06:59,606 I'm gonna run sigma 1 and I'll give it a positive integer 1680 01:06:59,606 --> 01:07:00,736 of 8 people. 1681 01:07:01,406 --> 01:07:05,486 Alright so 36, so that's 8 plus 7 plus 6 plus 5 let's do it 1682 01:07:05,486 --> 01:07:05,996 for 10. 1683 01:07:06,556 --> 01:07:09,296 Alright 55, okay let's do it for 100. 1684 01:07:09,426 --> 01:07:11,326 Alright, so this program seems to work 1685 01:07:11,326 --> 01:07:13,006 and frankly it's doing this a lot faster 1686 01:07:13,006 --> 01:07:14,926 than I could off the top of my head 1687 01:07:14,926 --> 01:07:16,436 so we have a nice useful program here. 1688 01:07:16,706 --> 01:07:18,566 But can we implement it in a different way. 1689 01:07:18,856 --> 01:07:21,596 Well, this is isn't necessarily the best way to implement it 1690 01:07:21,596 --> 01:07:23,716 but it's a nice way of demonstrating this idea 1691 01:07:23,716 --> 01:07:26,276 of recursion in a very simple context. 1692 01:07:26,506 --> 01:07:31,426 So this is version 2, sigma 2 and here we have same code. 1693 01:07:31,676 --> 01:07:34,526 Get the int, put it in N and make sure it's not less than 1 1694 01:07:34,816 --> 01:07:37,836 and then call sigma again and I'm gonna print out the answer. 1695 01:07:38,126 --> 01:07:39,556 So apparently, main is identical. 1696 01:07:39,716 --> 01:07:41,246 So let me scroll down to sigma 1697 01:07:42,286 --> 01:07:46,146 and here is something that's beautiful to my eyes at least. 1698 01:07:46,706 --> 01:07:49,176 So what does this mean, sad though that is. 1699 01:07:49,526 --> 01:07:50,886 So it returns an int. 1700 01:07:51,286 --> 01:07:55,986 It takes an int but now I just have 2, no 4 real lines of code 1701 01:07:55,986 --> 01:07:58,816 and even those are not all that interesting 'cause one is an if 1702 01:07:58,816 --> 01:08:00,216 and an else, what am I doing? 1703 01:08:00,216 --> 01:08:03,336 Well, first this is pretty much the same sanity check as before. 1704 01:08:03,336 --> 01:08:06,636 If M is less than or equal to 0, just return 0 right away 1705 01:08:06,636 --> 01:08:08,336 so that the user doesn't hand me a negative number 1706 01:08:08,336 --> 01:08:09,606 and completely break this program. 1707 01:08:10,076 --> 01:08:13,906 But notice this, all that work I did last time using a four loop, 1708 01:08:13,906 --> 01:08:16,976 an incrementing I and the sum and all this messy syntax. 1709 01:08:17,326 --> 01:08:19,076 it looks like I could just do this. 1710 01:08:19,956 --> 01:08:23,936 This is an example of recursion, right, just as I could look 1711 01:08:23,936 --> 01:08:25,386 for someone in the phonebook by looking 1712 01:08:25,386 --> 01:08:26,876 in the middle then going checking left 1713 01:08:26,876 --> 01:08:28,516 or right then doing it again. 1714 01:08:28,826 --> 01:08:31,096 Well technically you could sum numbers in the same way. 1715 01:08:31,096 --> 01:08:34,676 If you hand me the number 8, I can say the answer to sigma 1716 01:08:34,676 --> 01:08:40,406 of 8 is 8 plus sigma of 7 and I can completely punt on the rest 1717 01:08:40,406 --> 01:08:42,046 of the problem and let you deal with that, right. 1718 01:08:42,076 --> 01:08:45,276 You ask me for sigma 8, I say it's literally 8 plus sigma 7. 1719 01:08:45,566 --> 01:08:47,156 Well now, you come back to me and you say fine. 1720 01:08:47,226 --> 01:08:51,676 What's sigma of 7and I sort of smart ass say, 7 plus sigma 1721 01:08:51,676 --> 01:08:53,656 of 6, right and again and again. 1722 01:08:54,176 --> 01:08:56,306 Now, I'm sort of picking a contrived example here 1723 01:08:56,306 --> 01:08:58,746 but what's nice is just like with the phonebook, 1724 01:08:58,746 --> 01:09:00,776 just like with the counting students, we can take a bite 1725 01:09:00,776 --> 01:09:03,746 out of this problem, maybe it's only size 1, it's one number 1726 01:09:03,846 --> 01:09:05,646 but I can take a bite out of this problem 1727 01:09:05,886 --> 01:09:08,026 and then apply the exact same algorithm 1728 01:09:08,026 --> 01:09:09,516 to this slightly smaller problem. 1729 01:09:09,666 --> 01:09:11,956 And if I code this correctly, notice what happens. 1730 01:09:11,956 --> 01:09:15,276 If I code this correctly, I can take the small bite out 1731 01:09:15,726 --> 01:09:18,836 and then I can defer to who, to myself, right. 1732 01:09:18,836 --> 01:09:20,996 If I am now the sigma function who knows how 1733 01:09:20,996 --> 01:09:23,686 to compute this even though it's in this-- the smart aleck way, 1734 01:09:23,686 --> 01:09:27,516 well I can just say it's M plus sigma of M minus 1. 1735 01:09:27,796 --> 01:09:29,426 Now conceptually, what does that mean? 1736 01:09:29,426 --> 01:09:32,656 It means this is gonna get added to the return value 1737 01:09:32,656 --> 01:09:34,206 of sigma of N minus 1. 1738 01:09:34,496 --> 01:09:35,506 Well how do you compute that? 1739 01:09:35,856 --> 01:09:39,336 We'll here's sigma, you pass it now in the number 7 instead 1740 01:09:39,336 --> 01:09:42,196 of 8, 7 is not less than zero 1741 01:09:42,196 --> 01:09:43,966 so you don't return 0 what do you return? 1742 01:09:44,206 --> 01:09:46,756 You return 7 plus sigma of 6. 1743 01:09:47,086 --> 01:09:49,756 Now at this point, it gets to be a bit of a mind game, right. 1744 01:09:49,756 --> 01:09:52,106 'Cause now you have to keep saying call sigma, call sigma, 1745 01:09:52,106 --> 01:09:54,606 call sigma and you're never actually finishing any 1746 01:09:54,606 --> 01:09:56,416 of these answers until what? 1747 01:09:56,416 --> 01:09:59,136 When you call sigma of 0, 1748 01:09:59,276 --> 01:10:02,956 finally the seeming endless loop stops because at 1749 01:10:02,956 --> 01:10:05,976 that final 8th call you say, well what's sigma of 0? 1750 01:10:06,246 --> 01:10:08,176 >> And I don't give you again the smart aleck answer, 1751 01:10:08,176 --> 01:10:09,406 I instead say 0. 1752 01:10:10,206 --> 01:10:12,686 And that means you can then kind of reverse the process. 1753 01:10:12,686 --> 01:10:13,876 So you can say, okay if the answer 1754 01:10:13,876 --> 01:10:17,106 to that last question was 0 then a 0 plus 1 1755 01:10:17,106 --> 01:10:18,536 and that means 1 then the answer 1756 01:10:18,536 --> 01:10:21,286 to the previous question was 2 plus that and you can kind 1757 01:10:21,286 --> 01:10:23,446 of unwind in your mind and it's a little hard to think 1758 01:10:23,446 --> 01:10:25,606 of it first, but what's the end result. 1759 01:10:25,866 --> 01:10:28,596 You can implement in-- there's this beautiful way, 1760 01:10:28,596 --> 01:10:30,976 this very elegant sort of logical way, 1761 01:10:31,216 --> 01:10:35,366 the notion of summation that we just did but using just one line 1762 01:10:35,366 --> 01:10:38,626 of code and leveraging the code sigma that you already wrote. 1763 01:10:38,766 --> 01:10:39,976 Now what's the power of this? 1764 01:10:39,976 --> 01:10:42,366 Well once you have the ability to do things recursively, 1765 01:10:42,776 --> 01:10:45,356 you don't necessarily or you're not necessarily able 1766 01:10:45,356 --> 01:10:47,916 to solve problems in different in-- 1767 01:10:47,916 --> 01:10:49,386 you're not necessarily to solve problems 1768 01:10:49,386 --> 01:10:51,466 that you couldn't otherwise using iteration and loops 1769 01:10:51,746 --> 01:10:53,986 but you can solve them a lot more intuitively 1770 01:10:53,986 --> 01:10:57,226 and you can use a programming technique recursion in a way 1771 01:10:57,226 --> 01:10:59,806 that really lends itself to the problem, and for that, 1772 01:10:59,976 --> 01:11:01,296 let me introduce this. 1773 01:11:01,996 --> 01:11:04,326 I'm gonna pull up a different demo here. 1774 01:11:04,326 --> 01:11:06,886 This one also happens to be implemented in Java 1775 01:11:06,886 --> 01:11:10,396 but it's just the visualization as in aside, the world has come 1776 01:11:10,396 --> 01:11:11,856 up with all sorts of sorts. 1777 01:11:11,926 --> 01:11:16,316 We won't look in this course at bozo sorts or cocktail sorts, 1778 01:11:16,816 --> 01:11:19,026 shaker sort but there's a whole bunch of them out there. 1779 01:11:19,026 --> 01:11:20,856 Let's pick the ones we did look at. 1780 01:11:20,856 --> 01:11:24,056 Bubble sort there, I'm gonna pick a selection sort here. 1781 01:11:24,226 --> 01:11:26,226 And it turns out one of our favorites 1782 01:11:26,766 --> 01:11:29,736 because of its power is something called merge sort. 1783 01:11:30,166 --> 01:11:31,586 So I've zoomed in on this graph. 1784 01:11:31,586 --> 01:11:34,146 What you see here is same kind of visualization as last time 1785 01:11:34,146 --> 01:11:35,576 but instead of the numbers being vertical, 1786 01:11:35,576 --> 01:11:38,566 this guy just implemented it sideways so in a moment, 1787 01:11:38,566 --> 01:11:39,836 I'm gonna click each of this graphs. 1788 01:11:40,126 --> 01:11:41,296 They're randomly oriented. 1789 01:11:41,296 --> 01:11:43,886 Small bars means small number, big bars means big number 1790 01:11:44,046 --> 01:11:45,876 and what you're gonna see assuming I click them all fast 1791 01:11:46,096 --> 01:11:48,586 enough is each of this 3 algorithms running 1792 01:11:48,836 --> 01:11:52,146 simultaneously and my claim is as we've seen bubble sort 1793 01:11:52,146 --> 01:11:54,706 in the worst case is N squared, selection sort 1794 01:11:54,706 --> 01:11:56,076 in the worst case is N squared 1795 01:11:56,386 --> 01:11:59,506 but this thing merge sort is an example of an algorithm that's 1796 01:11:59,576 --> 01:12:03,926 in N times log of N. So, it's not as good as log N 1797 01:12:04,006 --> 01:12:06,626 which was the case for divide and conquer and for searching, 1798 01:12:06,936 --> 01:12:09,316 but N log N is definitely smaller 1799 01:12:09,316 --> 01:12:13,366 than N times N 'cause log N is smaller than N. So here we go, 1800 01:12:13,726 --> 01:12:15,696 I give you in conclusion bubble sort, 1801 01:12:15,696 --> 01:12:17,926 the selection sort, the merge sort. 1802 01:12:18,516 --> 01:12:24,786 [ Pause ] 1803 01:12:25,286 --> 01:12:28,306 >> So that is what it means the solve problems effectively. 1804 01:12:28,886 --> 01:12:34,326 See you on Monday. 1805 01:12:34,826 --> 01:12:37,670 [ Noise ]