1 00:00:00,000 --> 00:00:00,990 2 00:00:00,990 --> 00:00:02,970 >> [MUSIC PLAYING] 3 00:00:02,970 --> 00:00:10,400 4 00:00:10,400 --> 00:00:12,550 >> DAVID J. MALAN: This is CS50. 5 00:00:12,550 --> 00:00:14,612 And this is the start of week three. 6 00:00:14,612 --> 00:00:16,820 So we've got a lot of exciting things to cover today. 7 00:00:16,820 --> 00:00:20,160 A lot of opportunities for volunteers up on stage. 8 00:00:20,160 --> 00:00:22,780 And ultimately, today is not about code at all. 9 00:00:22,780 --> 00:00:24,820 But it's about ideas, and it's about algorithms, 10 00:00:24,820 --> 00:00:28,420 and actually bringing back some of the lessons learned from week zero, 11 00:00:28,420 --> 00:00:31,910 wherein recall, we introduced this monstrosity. 12 00:00:31,910 --> 00:00:33,880 And borrowing inspiration from that, to start 13 00:00:33,880 --> 00:00:36,879 to solve ever more sophisticated problems algorithmically. 14 00:00:36,879 --> 00:00:38,420 But first, a couple of announcements. 15 00:00:38,420 --> 00:00:42,020 So one, if you would like to join CS50's staff and classmates at lunch 16 00:00:42,020 --> 00:00:44,670 this Friday, both here and in Cambridge, and in New Haven, 17 00:00:44,670 --> 00:00:48,060 please visit the course's website, where a URL can be found. 18 00:00:48,060 --> 00:00:50,390 Lecture this Wednesday will not be here in Sanders. 19 00:00:50,390 --> 00:00:53,610 It will be online only, so tune in at CS50's website, 20 00:00:53,610 --> 00:00:55,850 whether here in Cambridge or New Haven as well. 21 00:00:55,850 --> 00:00:58,110 >> And then problem set two is already in your hands. 22 00:00:58,110 --> 00:01:03,067 If you haven't dived in yet, allow me to offer the strongly worded suggestion 23 00:01:03,067 --> 00:01:05,150 that, especially now, as the problem sets advance, 24 00:01:05,150 --> 00:01:08,669 you really do want to start now, if not dabble a bit on the weekend or prior 25 00:01:08,669 --> 00:01:10,710 when they first go out on Fridays, because you'll 26 00:01:10,710 --> 00:01:14,380 find that they're not necessarily getting longer or more challenging per 27 00:01:14,380 --> 00:01:14,950 se. 28 00:01:14,950 --> 00:01:17,575 I think you'll find that, in general, they tend to take roughly 29 00:01:17,575 --> 00:01:18,892 around same amount of time. 30 00:01:18,892 --> 00:01:20,850 But it certainly depends on the student, and it 31 00:01:20,850 --> 00:01:22,880 depends on the mindset with which you approach it. 32 00:01:22,880 --> 00:01:24,910 But invariably, you're going to run up against some wall, 33 00:01:24,910 --> 00:01:26,350 and you're going to hit some bug, and you're just 34 00:01:26,350 --> 00:01:27,950 not going to be able to get over it at some point. 35 00:01:27,950 --> 00:01:31,380 And it's hugely valuable to be able to step away, come back the next day, 36 00:01:31,380 --> 00:01:35,286 go to office hours, post on CS50 Discuss or the like, to actually get unblocked. 37 00:01:35,286 --> 00:01:36,160 So keep that in mind. 38 00:01:36,160 --> 00:01:40,830 Starting earliest as possible is the best thing you can do. 39 00:01:40,830 --> 00:01:44,160 So here's where we started the class, over in week zero. 40 00:01:44,160 --> 00:01:47,441 And can we get a volunteer here to help me find mics? 41 00:01:47,441 --> 00:01:47,940 OK. 42 00:01:47,940 --> 00:01:48,900 Standing up already. 43 00:01:48,900 --> 00:01:50,080 Come on up. 44 00:01:50,080 --> 00:01:53,707 Guess that's how it's going to work. 45 00:01:53,707 --> 00:01:54,415 What's your name? 46 00:01:54,415 --> 00:01:55,570 ALAN ESTRADA: Alan Estrada. 47 00:01:55,570 --> 00:01:56,778 DAVID J. MALAN: Alan Estrada. 48 00:01:56,778 --> 00:01:57,910 Come on up. 49 00:01:57,910 --> 00:01:58,619 Nice to meet you. 50 00:01:58,619 --> 00:01:59,910 ALAN ESTRADA: Nice to meet you. 51 00:01:59,910 --> 00:02:02,772 DAVID J. MALAN: And you were here with us in week zero, of course. 52 00:02:02,772 --> 00:02:03,028 ALAN ESTRADA: I was. 53 00:02:03,028 --> 00:02:03,160 I was. 54 00:02:03,160 --> 00:02:05,868 >> DAVID J. MALAN: So could you go ahead and find for us Mike Smith, 55 00:02:05,868 --> 00:02:08,639 as fast as you can? 56 00:02:08,639 --> 00:02:10,639 As fast as you can. 57 00:02:10,639 --> 00:02:13,756 Literally tearing the problem in half as you need to. 58 00:02:13,756 --> 00:02:15,130 >> ALAN ESTRADA: Um. 59 00:02:15,130 --> 00:02:17,380 DAVID J. MALAN: Literally tearing the problem in half. 60 00:02:17,380 --> 00:02:20,210 61 00:02:20,210 --> 00:02:22,083 >> ALAN ESTRADA: Oh. 62 00:02:22,083 --> 00:02:22,583 Mm. 63 00:02:22,583 --> 00:02:23,300 Very good. 64 00:02:23,300 --> 00:02:23,700 >> DAVID J. MALAN: OK. 65 00:02:23,700 --> 00:02:24,200 Good. 66 00:02:24,200 --> 00:02:24,701 Thank you. 67 00:02:24,701 --> 00:02:25,700 ALAN ESTRADA: Very good. 68 00:02:25,700 --> 00:02:26,210 OK. 69 00:02:26,210 --> 00:02:27,610 >> DAVID J. MALAN: And so now, you've whittled it down 70 00:02:27,610 --> 00:02:29,320 to half the size of the problem. 71 00:02:29,320 --> 00:02:31,267 Now, we're down to a quarter. 72 00:02:31,267 --> 00:02:33,475 Are you paying attention to which side we're keeping? 73 00:02:33,475 --> 00:02:34,405 >> [LAUGHING] 74 00:02:34,405 --> 00:02:35,970 >> ALAN ESTRADA: Yes, I think-- 75 00:02:35,970 --> 00:02:37,594 >> DAVID J. MALAN: What section are we in? 76 00:02:37,594 --> 00:02:39,150 ALAN ESTRADA: Mufflers, so. 77 00:02:39,150 --> 00:02:39,941 >> DAVID J. MALAN: OK. 78 00:02:39,941 --> 00:02:42,810 But Mike Smith is going to be after Mufflers. 79 00:02:42,810 --> 00:02:44,130 So-- 80 00:02:44,130 --> 00:02:48,180 >> [LAUGHING] 81 00:02:48,180 --> 00:02:48,742 >> All right. 82 00:02:48,742 --> 00:02:50,200 ALAN ESTRADA: Where are we looking? 83 00:02:50,200 --> 00:02:52,049 DAVID J. MALAN: Mike Smith. 84 00:02:52,049 --> 00:02:53,090 ALAN ESTRADA: Mike Smith. 85 00:02:53,090 --> 00:02:54,760 DAVID J. MALAN: Now, we're in surgical. 86 00:02:54,760 --> 00:02:57,840 Now, physicians. 87 00:02:57,840 --> 00:02:58,340 Now-- 88 00:02:58,340 --> 00:02:59,856 >> ALAN ESTRADA: Let's- let's go with real. 89 00:02:59,856 --> 00:03:00,370 Real. 90 00:03:00,370 --> 00:03:00,970 >> DAVID J. MALAN: Real. 91 00:03:00,970 --> 00:03:01,470 OK. 92 00:03:01,470 --> 00:03:03,700 If you need Real. 93 00:03:03,700 --> 00:03:05,250 Now, which way is Mike Smith? 94 00:03:05,250 --> 00:03:06,250 >> ALAN ESTRADA: This way. 95 00:03:06,250 --> 00:03:07,333 >> DAVID J. MALAN: Which way? 96 00:03:07,333 --> 00:03:08,240 ALAN ESTRADA: Wait. 97 00:03:08,240 --> 00:03:08,790 M is-- right? 98 00:03:08,790 --> 00:03:09,110 We started with-- 99 00:03:09,110 --> 00:03:10,090 >> DAVID J. MALAN: Yeah. 100 00:03:10,090 --> 00:03:10,650 They're left. 101 00:03:10,650 --> 00:03:11,430 Your right. 102 00:03:11,430 --> 00:03:11,710 >> ALAN ESTRADA: Yeah. 103 00:03:11,710 --> 00:03:13,126 >> DAVID J. MALAN: So Mike's in here. 104 00:03:13,126 --> 00:03:13,990 ALAN ESTRADA: What? 105 00:03:13,990 --> 00:03:14,665 >> [LAUGHING] 106 00:03:14,665 --> 00:03:17,365 107 00:03:17,365 --> 00:03:18,330 >> Bad example, guys. 108 00:03:18,330 --> 00:03:18,830 Sorry. 109 00:03:18,830 --> 00:03:21,610 DAVID J. MALAN: This will teach you to leap out of your chair. 110 00:03:21,610 --> 00:03:22,318 >> ALAN ESTRADA: Oh. 111 00:03:22,318 --> 00:03:22,890 Oh. 112 00:03:22,890 --> 00:03:23,390 I got you. 113 00:03:23,390 --> 00:03:24,670 I got you. 114 00:03:24,670 --> 00:03:25,170 Oh. 115 00:03:25,170 --> 00:03:25,669 Oh. 116 00:03:25,669 --> 00:03:26,812 This is-- OK, I got you. 117 00:03:26,812 --> 00:03:27,520 Smith right here? 118 00:03:27,520 --> 00:03:28,894 >> DAVID J. MALAN: Smith, thank you. 119 00:03:28,894 --> 00:03:30,535 So I'll keep looking up Smith? 120 00:03:30,535 --> 00:03:30,790 >> ALAN ESTRADA: Oh, yeah. 121 00:03:30,790 --> 00:03:31,340 No, no, no. 122 00:03:31,340 --> 00:03:31,600 Oh, no. 123 00:03:31,600 --> 00:03:31,940 This is mine. 124 00:03:31,940 --> 00:03:32,580 >> DAVID J. MALAN: Oh, you got Smith. 125 00:03:32,580 --> 00:03:33,415 OK. 126 00:03:33,415 --> 00:03:34,040 >> ALAN ESTRADA: Yeah, I got Smith right here. 127 00:03:34,040 --> 00:03:34,700 Sorry, guys. 128 00:03:34,700 --> 00:03:35,860 I thought Michael-- we were looking for Michael. 129 00:03:35,860 --> 00:03:36,550 Sorry. 130 00:03:36,550 --> 00:03:37,550 >> DAVID J. MALAN: It's OK. 131 00:03:37,550 --> 00:03:39,950 All right, now we're into Paccini and Sons. 132 00:03:39,950 --> 00:03:41,242 >> ALAN ESTRADA: Paccini and sons. 133 00:03:41,242 --> 00:03:43,158 DAVID J. MALAN: Only you and I are in on this. 134 00:03:43,158 --> 00:03:44,050 OK. 135 00:03:44,050 --> 00:03:45,130 Find us Mike Smith. 136 00:03:45,130 --> 00:03:45,830 Smith. 137 00:03:45,830 --> 00:03:46,310 >> ALAN ESTRADA: Smith. 138 00:03:46,310 --> 00:03:46,750 >> DAVID J. MALAN: Smith. 139 00:03:46,750 --> 00:03:47,728 We're in R for rubbish. 140 00:03:47,728 --> 00:03:48,644 ALAN ESTRADA: Rubbish. 141 00:03:48,644 --> 00:03:50,096 Oh. 142 00:03:50,096 --> 00:03:52,480 This is going to take a while. 143 00:03:52,480 --> 00:03:54,340 >> [LAUGHING] 144 00:03:54,340 --> 00:03:55,804 145 00:03:55,804 --> 00:03:56,720 DAVID J. MALAN: Shoes. 146 00:03:56,720 --> 00:03:58,080 We're in shoes. 147 00:03:58,080 --> 00:04:00,210 >> ALAN ESTRADA: Now we're gonna-- 148 00:04:00,210 --> 00:04:01,105 >> DAVID J. MALAN: Nice. 149 00:04:01,105 --> 00:04:01,980 ALAN ESTRADA: Which-- 150 00:04:01,980 --> 00:04:03,620 [LAUGHING] 151 00:04:03,620 --> 00:04:05,440 Oh, this is great. 152 00:04:05,440 --> 00:04:06,910 [LAUGHING] 153 00:04:06,910 --> 00:04:08,380 154 00:04:08,380 --> 00:04:09,390 >> DAVID J. MALAN: It's OK. 155 00:04:09,390 --> 00:04:11,365 >> ALAN ESTRADA: Oh, this is good. 156 00:04:11,365 --> 00:04:14,425 I don't think I'm going to have PSAT buddies after this. 157 00:04:14,425 --> 00:04:15,300 DAVID J. MALAN: Good. 158 00:04:15,300 --> 00:04:16,078 Sporting. 159 00:04:16,078 --> 00:04:17,036 ALAN ESTRADA: Sporting. 160 00:04:17,036 --> 00:04:18,668 Um, L, M, N, O, P. 161 00:04:18,668 --> 00:04:19,459 DAVID J. MALAN: OK. 162 00:04:19,459 --> 00:04:21,600 So let's tear this in half. 163 00:04:21,600 --> 00:04:22,270 It's OK. 164 00:04:22,270 --> 00:04:25,606 This ends poorly anyway, because Mike Smith won't be in the yellow pages. 165 00:04:25,606 --> 00:04:26,430 >> ALAN ESTRADA: Aw. 166 00:04:26,430 --> 00:04:27,140 >> DAVID J. MALAN: No, it's OK. 167 00:04:27,140 --> 00:04:28,930 But let's pretend like he's on this page. 168 00:04:28,930 --> 00:04:33,260 So now, you've whittled the problem down to one page, and we found Mike Smith. 169 00:04:33,260 --> 00:04:35,180 >> [CHEERING] 170 00:04:35,180 --> 00:04:35,757 171 00:04:35,757 --> 00:04:36,340 OK, thank you. 172 00:04:36,340 --> 00:04:40,700 173 00:04:40,700 --> 00:04:41,200 OK. 174 00:04:41,200 --> 00:04:43,646 That was extraordinary. 175 00:04:43,646 --> 00:04:45,954 But it was still faster than linear search, 176 00:04:45,954 --> 00:04:47,870 wherein we start at the beginning of the book, 177 00:04:47,870 --> 00:04:51,210 and we move our way from left to right, eventually looking for Mike Smith. 178 00:04:51,210 --> 00:04:53,540 And so, if the phone book had maybe 1,000 pages, 179 00:04:53,540 --> 00:04:56,300 maybe it would have taken us 10 or so page tears. 180 00:04:56,300 --> 00:04:59,380 >> But you may have leveraged touched an assumption 181 00:04:59,380 --> 00:05:03,602 during all of that, which is to say that the phone book in advance was what? 182 00:05:03,602 --> 00:05:04,310 AUDIENCE: Sorted. 183 00:05:04,310 --> 00:05:05,000 DAVID J. MALAN: It's sorted. 184 00:05:05,000 --> 00:05:05,160 Right? 185 00:05:05,160 --> 00:05:07,909 It's sorted alphabetically, so that all of those names and numbers 186 00:05:07,909 --> 00:05:11,230 are sorted from the A's to the Z's, and alphabetically in between. 187 00:05:11,230 --> 00:05:13,100 But today, we now ask the question, well, 188 00:05:13,100 --> 00:05:16,170 how did Verizon or the phone company get it into that state? 189 00:05:16,170 --> 00:05:19,560 >> Because it's one thing to leverage that assumption, and therefore, 190 00:05:19,560 --> 00:05:22,570 solve a problem with an algorithm more efficiently. 191 00:05:22,570 --> 00:05:24,900 But we never really asked in week zero, well, 192 00:05:24,900 --> 00:05:27,790 how much did it cost Verizon or someone else 193 00:05:27,790 --> 00:05:29,620 to get that phone book in sorted order? 194 00:05:29,620 --> 00:05:29,780 >> Right? 195 00:05:29,780 --> 00:05:31,529 It doesn't matter if looking up Mike Smith 196 00:05:31,529 --> 00:05:35,190 is super fast, if it takes you a year to sort the pages initially. 197 00:05:35,190 --> 00:05:35,690 Right? 198 00:05:35,690 --> 00:05:38,620 You might as well just sift through a randomized phone book, 199 00:05:38,620 --> 00:05:40,690 if it's going to be super expensive to sort it. 200 00:05:40,690 --> 00:05:42,350 So if we can have another volunteer. 201 00:05:42,350 --> 00:05:46,350 Let's take a look up here at how we might-- come on up-- how 202 00:05:46,350 --> 00:05:48,100 we might go about sorting these. 203 00:05:48,100 --> 00:05:51,990 >> And if Jordan could actually join us up here on stage. 204 00:05:51,990 --> 00:05:55,100 Come on up for just a moment. 205 00:05:55,100 --> 00:05:56,359 What's your name? 206 00:05:56,359 --> 00:05:57,150 CAROLINE: Caroline. 207 00:05:57,150 --> 00:05:58,691 DAVID J. MALAN: Caroline, come on up. 208 00:05:58,691 --> 00:06:02,070 And you'll be joined by me and Jordan here. 209 00:06:02,070 --> 00:06:03,800 Caroline, thank you. 210 00:06:03,800 --> 00:06:04,300 All right. 211 00:06:04,300 --> 00:06:08,330 So what we have here for Caroline is 26 blue books 212 00:06:08,330 --> 00:06:10,747 that FAS uses to administer certain final exams. 213 00:06:10,747 --> 00:06:13,330 These are getting hard to find, but what we've done in advance 214 00:06:13,330 --> 00:06:15,800 is that we've put someone's name on the front of each of these, 215 00:06:15,800 --> 00:06:18,133 but we've kept it simple by then putting out full names. 216 00:06:18,133 --> 00:06:22,720 So we would put the person with the name L, D, J, B, all the way A through Z, 217 00:06:22,720 --> 00:06:24,090 but they're in random order. 218 00:06:24,090 --> 00:06:26,890 And so if you would, talking your way through the problem as you 219 00:06:26,890 --> 00:06:31,620 do it, can you go ahead and sort these for us, from A to Z. 220 00:06:31,620 --> 00:06:34,070 >> AUDIENCE: OK, so L is like, the middle. 221 00:06:34,070 --> 00:06:35,050 C is beginning. 222 00:06:35,050 --> 00:06:42,410 B. J before L. B, Q. 223 00:06:42,410 --> 00:06:45,140 >> DAVID J. MALAN: Hold that thought for one second. 224 00:06:45,140 --> 00:06:48,910 Because otherwise, this is only interesting to you, me, and Jordan. 225 00:06:48,910 --> 00:06:49,724 There we go. 226 00:06:49,724 --> 00:06:50,640 AUDIENCE: [INAUDIBLE]. 227 00:06:50,640 --> 00:06:57,299 R. 228 00:06:57,299 --> 00:06:58,090 DAVID J. MALAN: OK. 229 00:06:58,090 --> 00:06:59,310 What are you doing? 230 00:06:59,310 --> 00:07:01,730 >> CAROLINE: M comes after O. 231 00:07:01,730 --> 00:07:02,564 >> DAVID J. MALAN: OK. 232 00:07:02,564 --> 00:07:03,064 >> CAROLINE: O. 233 00:07:03,064 --> 00:07:04,120 DAVID J. MALAN: O, Good. 234 00:07:04,120 --> 00:07:04,970 >> CAROLINE: E. 235 00:07:04,970 --> 00:07:06,730 >> DAVID J. MALAN: E, F. Yeah. 236 00:07:06,730 --> 00:07:07,620 >> CAROLINE: T, U, V. 237 00:07:07,620 --> 00:07:10,689 >> DAVID J. MALAN: V, T, U, V. So it looks like you're making-- keep going. 238 00:07:10,689 --> 00:07:12,730 It looks like you're making a big pile over here, 239 00:07:12,730 --> 00:07:13,910 and kind of a big pile over there. 240 00:07:13,910 --> 00:07:16,230 So the first half of the alphabet, second half of the alphabet. 241 00:07:16,230 --> 00:07:16,460 OK. 242 00:07:16,460 --> 00:07:16,960 Good. 243 00:07:16,960 --> 00:07:19,680 Kind of splitting the problem in two. 244 00:07:19,680 --> 00:07:21,771 M, N, X. Yeah. 245 00:07:21,771 --> 00:07:22,270 CAROLINE: K. 246 00:07:22,270 --> 00:07:22,980 DAVID J. MALAN: OK. 247 00:07:22,980 --> 00:07:25,070 K. So you're kind of selecting them one after another, 248 00:07:25,070 --> 00:07:27,620 putting it either left or right, or Z's going on the floor. 249 00:07:27,620 --> 00:07:28,012 OK. 250 00:07:28,012 --> 00:07:29,190 >> CAROLINE: Z's going on the floor. 251 00:07:29,190 --> 00:07:29,360 >> DAVID J. MALAN: OK. 252 00:07:29,360 --> 00:07:30,920 Y is going on the floor. 253 00:07:30,920 --> 00:07:31,735 Now we can put X. 254 00:07:31,735 --> 00:07:32,409 >> CAROLINE: G. 255 00:07:32,409 --> 00:07:33,700 DAVID J. MALAN: G's going left. 256 00:07:33,700 --> 00:07:36,017 S is going right. 257 00:07:36,017 --> 00:07:37,642 All right, A is going all the way left. 258 00:07:37,642 --> 00:07:38,790 >> CAROLINE: A, B, C, D. 259 00:07:38,790 --> 00:07:39,873 >> DAVID J. MALAN: Now, good. 260 00:07:39,873 --> 00:07:43,260 We've got A, B, C. W's going down there. 261 00:07:43,260 --> 00:07:45,566 All right, T. 262 00:07:45,566 --> 00:07:46,611 >> CAROLINE: H, I , J. 263 00:07:46,611 --> 00:07:47,860 DAVID J. MALAN: H, I, J. Good. 264 00:07:47,860 --> 00:07:49,160 CAROLINE: In the center, I'm gonna-- 265 00:07:49,160 --> 00:07:50,000 DAVID J. MALAN: OK. 266 00:07:50,000 --> 00:07:52,375 So now, we're going to kind of merge these various piles. 267 00:07:52,375 --> 00:08:00,730 So A through C, then I see D, and E, and F, and G, and H, and I. Nice. 268 00:08:00,730 --> 00:08:05,540 J, K. And then, this pile is upside down, but that's OK. 269 00:08:05,540 --> 00:08:06,040 Sure. 270 00:08:06,040 --> 00:08:07,240 We can cut some corners. 271 00:08:07,240 --> 00:08:07,950 OK. 272 00:08:07,950 --> 00:08:10,530 And then we need W, X, Y, Z. 273 00:08:10,530 --> 00:08:11,250 >> CAROLINE: Yeah. 274 00:08:11,250 --> 00:08:11,880 >> DAVID J. MALAN: Excellent. 275 00:08:11,880 --> 00:08:14,122 So a big thank you to Caroline for sorting these. 276 00:08:14,122 --> 00:08:15,030 >> [CHEERING] 277 00:08:15,030 --> 00:08:17,287 >> Thank you. 278 00:08:17,287 --> 00:08:18,120 Thank you very much. 279 00:08:18,120 --> 00:08:22,910 So now let's consider for a moment how Caroline went about doing that, 280 00:08:22,910 --> 00:08:26,040 and what exactly we were able to-- how we 281 00:08:26,040 --> 00:08:28,409 were able to solve that problem when we were just 282 00:08:28,409 --> 00:08:29,950 given a whole bunch of random inputs. 283 00:08:29,950 --> 00:08:31,610 >> Well, it looks like there was a bit of a system there? 284 00:08:31,610 --> 00:08:32,110 Right. 285 00:08:32,110 --> 00:08:34,495 So the earlier letters in the alphabet, she 286 00:08:34,495 --> 00:08:37,120 was putting to the left, and the later letters in the alphabet, 287 00:08:37,120 --> 00:08:38,270 she was putting into the right. 288 00:08:38,270 --> 00:08:40,500 And as soon as she found some proximal letters, ones 289 00:08:40,500 --> 00:08:43,124 that go right next to each other, she would put those in order. 290 00:08:43,124 --> 00:08:46,750 And so we kind of had these small piles of sorted inputs occurring. 291 00:08:46,750 --> 00:08:50,540 >> And so that's quite like what most of us humans would do. 292 00:08:50,540 --> 00:08:53,530 We would sort of sift through it, and we'd kind of have a mechanism. 293 00:08:53,530 --> 00:08:56,930 But it might be hard to write it down in a formula per se. 294 00:08:56,930 --> 00:08:59,010 It felt a little more organic than that. 295 00:08:59,010 --> 00:09:02,560 So let's see if we can now bound the problem with fewer inputs. 296 00:09:02,560 --> 00:09:05,170 >> Instead of 26, let's do something far fewer 297 00:09:05,170 --> 00:09:09,890 with just say, seven, behind these doors, so to speak. 298 00:09:09,890 --> 00:09:11,300 Are there just seven numbers? 299 00:09:11,300 --> 00:09:15,240 And if the goal now at hand is to find a value, 300 00:09:15,240 --> 00:09:17,850 let's see how efficiently we can go about doing this. 301 00:09:17,850 --> 00:09:22,460 And let's see if we can now start to apply some numbers, 302 00:09:22,460 --> 00:09:26,310 or some formulas with which to describe the efficiency of our phone book 303 00:09:26,310 --> 00:09:31,060 algorithm, our exam book algorithm, and more generally, finding information. 304 00:09:31,060 --> 00:09:34,770 >> So for this, let me go ahead, and on the touch screen over here, 305 00:09:34,770 --> 00:09:41,100 put up a web browser that has exactly these seven doors. 306 00:09:41,100 --> 00:09:46,670 And if we could get one other volunteer to come on over here, 307 00:09:46,670 --> 00:09:48,480 I've put these same doors over here. 308 00:09:48,480 --> 00:09:49,170 Quick volunteer. 309 00:09:49,170 --> 00:09:51,130 >> This one-- demos are going to a quicker and quicker now. 310 00:09:51,130 --> 00:09:51,600 Come on down. 311 00:09:51,600 --> 00:09:52,308 What's your name? 312 00:09:52,308 --> 00:09:53,040 TREVOR: Trevor. 313 00:09:53,040 --> 00:09:53,998 >> DAVID J. MALAN: Trevor? 314 00:09:53,998 --> 00:09:55,770 All right, Trevor, come on down. 315 00:09:55,770 --> 00:09:59,212 So Trevor has volunteered here to do a similar problem, but one that's 316 00:09:59,212 --> 00:10:02,170 narrower in scope, and that's going to allow us to try to formalize now 317 00:10:02,170 --> 00:10:03,970 the process for sorting these numbers. 318 00:10:03,970 --> 00:10:05,500 >> So Trevor, nice to meet you. 319 00:10:05,500 --> 00:10:08,720 So here is an array, so to speak, a list of seven doors. 320 00:10:08,720 --> 00:10:10,327 Go ahead and find us the number 50. 321 00:10:10,327 --> 00:10:12,410 And then after the fact, tell us how you found it. 322 00:10:12,410 --> 00:10:19,124 323 00:10:19,124 --> 00:10:20,040 Should be-- all right. 324 00:10:20,040 --> 00:10:21,945 Yeah, this is the one here? 325 00:10:21,945 --> 00:10:24,680 Uh-oh. 326 00:10:24,680 --> 00:10:25,560 OK. 327 00:10:25,560 --> 00:10:26,680 You clicked that one. 328 00:10:26,680 --> 00:10:28,690 Good. 329 00:10:28,690 --> 00:10:29,780 >> And good. 330 00:10:29,780 --> 00:10:30,970 Now you clicked that one. 331 00:10:30,970 --> 00:10:34,060 And let me give you the microphone, so that you have it in just a moment. 332 00:10:34,060 --> 00:10:37,000 Go ahead and click the next door that you intend. 333 00:10:37,000 --> 00:10:39,812 Yes, good. 334 00:10:39,812 --> 00:10:41,020 TREVOR: Can I unclick a door? 335 00:10:41,020 --> 00:10:42,620 DAVID J. MALAN: No, you can't unclick. 336 00:10:42,620 --> 00:10:43,119 TREVOR: OK. 337 00:10:43,119 --> 00:10:43,974 This one. 338 00:10:43,974 --> 00:10:45,640 DAVID J. MALAN: Where do you want to go? 339 00:10:45,640 --> 00:10:46,410 Which one? 340 00:10:46,410 --> 00:10:47,230 >> TREVOR: That one. 341 00:10:47,230 --> 00:10:48,042 >> DAVID J. MALAN: No. 342 00:10:48,042 --> 00:10:48,450 >> TREVOR: OK. 343 00:10:48,450 --> 00:10:48,735 This one. 344 00:10:48,735 --> 00:10:49,020 >> DAVID J. MALAN: Yes. 345 00:10:49,020 --> 00:10:49,700 That was good. 346 00:10:49,700 --> 00:10:50,380 All right. 347 00:10:50,380 --> 00:10:53,900 So what was your algorithm or procedure for doing this, Trevor? 348 00:10:53,900 --> 00:10:56,149 >> TREVOR: I just went through doors until I found a 50. 349 00:10:56,149 --> 00:10:56,940 DAVID J. MALAN: OK. 350 00:10:56,940 --> 00:10:58,150 Excellent algorithm. 351 00:10:58,150 --> 00:10:59,540 So that's fine. 352 00:10:59,540 --> 00:11:03,120 Because in fact, if I reveal what's behind these two other doors, what 353 00:11:03,120 --> 00:11:06,954 we'll find here is that we only have random input. 354 00:11:06,954 --> 00:11:08,870 So that was actually as good as you could get. 355 00:11:08,870 --> 00:11:12,509 And in fact, you got better than exhaustively searching the whole array, 356 00:11:12,509 --> 00:11:15,300 because it would have been really unlucky if you had hit the number 357 00:11:15,300 --> 00:11:16,604 50 at the very last door. 358 00:11:16,604 --> 00:11:18,520 But what if we instead gave you an assumption. 359 00:11:18,520 --> 00:11:20,630 Suppose I sort all of these doors around, 360 00:11:20,630 --> 00:11:23,500 so that you have the numbers sorted this time, 361 00:11:23,500 --> 00:11:29,730 but this time it's actually a different-- this time, 362 00:11:29,730 --> 00:11:32,640 it's actually sorted for you. 363 00:11:32,640 --> 00:11:35,380 And now the goal at hand is to hit the number 50. 364 00:11:35,380 --> 00:11:36,090 >> TREVOR: OK. 365 00:11:36,090 --> 00:11:37,670 >> DAVID J. MALAN: What's your algorithm going to be? 366 00:11:37,670 --> 00:11:39,628 >> TREVOR: Well, if it's sorted, it's either going 367 00:11:39,628 --> 00:11:42,710 to be-- if biggest to largest, descending, it'll be the first one, 368 00:11:42,710 --> 00:11:44,751 or if it's the opposite, it will be the last one. 369 00:11:44,751 --> 00:11:48,897 So I'll just tap this door, and then just tap the last door. 370 00:11:48,897 --> 00:11:49,980 DAVID J. MALAN: Excellent. 371 00:11:49,980 --> 00:11:50,270 All right. 372 00:11:50,270 --> 00:11:51,150 So we found the number 50. 373 00:11:51,150 --> 00:11:52,970 So as soon as you knew they were sorted, we 374 00:11:52,970 --> 00:11:55,040 were able to leverage this assumption. 375 00:11:55,040 --> 00:11:57,040 So they're too much like the phone book example. 376 00:11:57,040 --> 00:11:59,540 As soon as you have, even with a small problem like this, 377 00:11:59,540 --> 00:12:02,380 your inputs pre-sorted, we can actually find the value arguably 378 00:12:02,380 --> 00:12:03,130 more efficiently. 379 00:12:03,130 --> 00:12:05,800 >> And I didn't tell you if it was sorted small to big, or big to small, 380 00:12:05,800 --> 00:12:08,080 and so it was very reasonable to start at one end or the other 381 00:12:08,080 --> 00:12:09,750 to actually find that target value. 382 00:12:09,750 --> 00:12:11,400 So thank to Trevor as well. 383 00:12:11,400 --> 00:12:13,260 And I'll propose-- nicely done. 384 00:12:13,260 --> 00:12:16,960 We have a little clip, actually, that is among our favorite moments in CS50, 385 00:12:16,960 --> 00:12:19,700 whereby sometimes these demos don't quite go according to plan. 386 00:12:19,700 --> 00:12:22,050 And indeed right now, I pulled up the wrong interface 387 00:12:22,050 --> 00:12:23,508 with which to use the touch screen. 388 00:12:23,508 --> 00:12:24,660 So that was my fault there. 389 00:12:24,660 --> 00:12:26,600 >> So this will make for next year's clip as 390 00:12:26,600 --> 00:12:28,570 to why I was clicking on my own screen. 391 00:12:28,570 --> 00:12:31,390 But let's take a quick look at what happened last year 392 00:12:31,390 --> 00:12:34,770 with Jay, who came up, much like Trevor here, volunteered, 393 00:12:34,770 --> 00:12:39,380 and in this short clip, you'll see how this same demo didn't quite 394 00:12:39,380 --> 00:12:41,074 reveal the same lessons learned. 395 00:12:41,074 --> 00:12:41,740 [VIDEO PLAYBACK] 396 00:12:41,740 --> 00:12:45,360 -All I want you to do now is to find for me, and for us, 397 00:12:45,360 --> 00:12:51,674 really, the number 50 one step at a time. 398 00:12:51,674 --> 00:12:52,450 >> -The number 50? 399 00:12:52,450 --> 00:12:53,190 >> -The number 50. 400 00:12:53,190 --> 00:12:55,356 And you can reveal what's behind each of these doors 401 00:12:55,356 --> 00:12:58,540 simply by touching it with a finger. 402 00:12:58,540 --> 00:13:00,910 Damn it. 403 00:13:00,910 --> 00:13:02,870 >> [LAUGHING] 404 00:13:02,870 --> 00:13:03,806 >> [END PLAYBACK] 405 00:13:03,806 --> 00:13:05,430 DAVID J. MALAN: So that went very well. 406 00:13:05,430 --> 00:13:06,796 Those were the unsorted doors. 407 00:13:06,796 --> 00:13:08,670 And Jay, of course, found it all too quickly. 408 00:13:08,670 --> 00:13:12,910 Trevor did a much better job in terms of a teachable moment, 409 00:13:12,910 --> 00:13:15,850 so to speak, this year in taking longer to find it. 410 00:13:15,850 --> 00:13:17,950 Of course, then we gave Jay a second chance, 411 00:13:17,950 --> 00:13:20,320 whereby we sorted the doors, just as we did for Trevor, 412 00:13:20,320 --> 00:13:22,300 and Trevor did super well this time. 413 00:13:22,300 --> 00:13:26,124 But Jay did it half as quickly. 414 00:13:26,124 --> 00:13:26,790 [VIDEO PLAYBACK] 415 00:13:26,790 --> 00:13:29,650 -The goal now is to also find us the number 50, 416 00:13:29,650 --> 00:13:33,030 but do it algorithmically, and tell us how you're going about it. 417 00:13:33,030 --> 00:13:33,660 >> -OK. 418 00:13:33,660 --> 00:13:35,604 >> -And if you find it, you keep the movie. 419 00:13:35,604 --> 00:13:37,228 If you don't find it, you give it back. 420 00:13:37,228 --> 00:13:38,044 >> -Man. 421 00:13:38,044 --> 00:13:38,860 >> -Oh! 422 00:13:38,860 --> 00:13:40,800 >> -[INAUDIBLE] OK. 423 00:13:40,800 --> 00:13:46,236 So I'm going to check the ends first to determine if there's-- Oh. 424 00:13:46,236 --> 00:13:48,646 >> [APPLAUSE] 425 00:13:48,646 --> 00:13:53,948 426 00:13:53,948 --> 00:13:55,729 >> [END PLAYBACK] 427 00:13:55,729 --> 00:13:56,520 DAVID J. MALAN: OK. 428 00:13:56,520 --> 00:13:59,760 So sorting doors clearly leads to greater efficiency. 429 00:13:59,760 --> 00:14:01,680 And so, twice as fast is what I meant there. 430 00:14:01,680 --> 00:14:03,270 And so Jay got lucky both times. 431 00:14:03,270 --> 00:14:06,685 And he also got lucky in that last year, I ordered some Blu-ray discs 432 00:14:06,685 --> 00:14:07,560 to actually give out. 433 00:14:07,560 --> 00:14:09,768 I'm sorry this year, we didn't have the same, Trevor. 434 00:14:09,768 --> 00:14:11,540 But better still was a few years back. 435 00:14:11,540 --> 00:14:14,820 And some of you might know this fellow, Sean, who when he was in CS50, 436 00:14:14,820 --> 00:14:17,780 was challenged with the exact same problem, albeit in SD, 437 00:14:17,780 --> 00:14:19,360 as you'll soon see, back in the day. 438 00:14:19,360 --> 00:14:22,622 And you'll find that not only did he take a little longer than Jay, 439 00:14:22,622 --> 00:14:25,580 a little longer than Trevor, it was actually this wonderful opportunity 440 00:14:25,580 --> 00:14:29,820 to engage almost everyone in the crowd a la Price is Right, encouraging 441 00:14:29,820 --> 00:14:31,889 him to find the number we were seeking. 442 00:14:31,889 --> 00:14:32,930 Let's. take a quick look. 443 00:14:32,930 --> 00:14:33,320 >> [VIDEO PLAYBACK] 444 00:14:33,320 --> 00:14:33,820 >> -OK. 445 00:14:33,820 --> 00:14:36,680 So your task here, Sean, is the following. 446 00:14:36,680 --> 00:14:40,860 I have hidden behind these doors the number seven. 447 00:14:40,860 --> 00:14:45,120 But tucked away in some of these doors as well are other negative numbers. 448 00:14:45,120 --> 00:14:47,500 And your goal is to think of this top row of numbers 449 00:14:47,500 --> 00:14:50,390 as just an array, or just sequence of pieces of paper 450 00:14:50,390 --> 00:14:51,510 with numbers behind them. 451 00:14:51,510 --> 00:14:55,540 And your goal is, only using the top array here, find me the number seven. 452 00:14:55,540 --> 00:14:58,570 And we are then going to critique how you go about doing it. 453 00:14:58,570 --> 00:14:59,070 -All right. 454 00:14:59,070 --> 00:15:00,850 -Find us the number seven, please. 455 00:15:00,850 --> 00:15:10,500 456 00:15:10,500 --> 00:15:11,000 No. 457 00:15:11,000 --> 00:15:15,050 458 00:15:15,050 --> 00:15:18,550 Five, 19, 13. 459 00:15:18,550 --> 00:15:22,240 460 00:15:22,240 --> 00:15:24,770 >> [LAUGHING] 461 00:15:24,770 --> 00:15:25,910 >> It's not a trick question. 462 00:15:25,910 --> 00:15:29,410 463 00:15:29,410 --> 00:15:29,910 One. 464 00:15:29,910 --> 00:15:33,218 465 00:15:33,218 --> 00:15:34,695 >> [LAUGHING] 466 00:15:34,695 --> 00:15:37,861 At this point, your score is not very good, so you might as well keep going. 467 00:15:37,861 --> 00:15:40,610 468 00:15:40,610 --> 00:15:41,110 Three. 469 00:15:41,110 --> 00:15:43,890 470 00:15:43,890 --> 00:15:45,378 >> [LAUGHING] 471 00:15:45,378 --> 00:15:46,370 472 00:15:46,370 --> 00:15:47,774 >> Go on. 473 00:15:47,774 --> 00:15:50,690 Frankly, I can't help but wonder what you're even thinking about, so-- 474 00:15:50,690 --> 00:15:51,959 >> [LAUGHING] 475 00:15:51,959 --> 00:15:53,229 476 00:15:53,229 --> 00:15:55,020 Only the top row, so you've got three left. 477 00:15:55,020 --> 00:15:56,200 So find me seven. 478 00:15:56,200 --> 00:15:59,700 479 00:15:59,700 --> 00:16:02,167 >> [LAUGHING] 480 00:16:02,167 --> 00:16:14,870 481 00:16:14,870 --> 00:16:15,370 17. 482 00:16:15,370 --> 00:16:25,675 483 00:16:25,675 --> 00:16:26,946 Seven. 484 00:16:26,946 --> 00:16:28,780 >> [APPLAUSE] 485 00:16:28,780 --> 00:16:29,426 >> All right. 486 00:16:29,426 --> 00:16:30,360 >> [END PLAYBACK] 487 00:16:30,360 --> 00:16:31,840 >> DAVID J. MALAN: So we could watch these all day long. 488 00:16:31,840 --> 00:16:34,090 And of course, some of this year's demos perhaps 489 00:16:34,090 --> 00:16:36,330 will now end up in next year's video as well. 490 00:16:36,330 --> 00:16:39,040 So now let's actually focus on the algorithms 491 00:16:39,040 --> 00:16:42,140 here, and see if we can't now start to formalize 492 00:16:42,140 --> 00:16:46,650 how we can go about getting our data into this state that it's sorted, 493 00:16:46,650 --> 00:16:50,054 so that ultimately, we can actually search it more efficiently. 494 00:16:50,054 --> 00:16:52,470 And even though we're going to use fairly small data sets, 495 00:16:52,470 --> 00:16:54,511 like the eight numbers we have here on the board, 496 00:16:54,511 --> 00:16:58,230 ultimately these same ideas could apply to 1,000 inputs, a million inputs, 497 00:16:58,230 --> 00:17:02,100 4 billion inputs, because the algorithms are going to be fundamentally the same. 498 00:17:02,100 --> 00:17:05,359 >> And so this is our last opportunity for volunteers today, 499 00:17:05,359 --> 00:17:09,790 but perhaps the most involved one, for which we need eight volunteers 500 00:17:09,790 --> 00:17:12,960 to come up and walk us through the process of sorting what will soon 501 00:17:12,960 --> 00:17:15,212 be on these music stands here. 502 00:17:15,212 --> 00:17:16,170 Let me start back here. 503 00:17:16,170 --> 00:17:19,692 >> So one in the turquoise-- green is it? 504 00:17:19,692 --> 00:17:21,130 Are you committing? 505 00:17:21,130 --> 00:17:21,630 Two. 506 00:17:21,630 --> 00:17:23,069 Come on down. 507 00:17:23,069 --> 00:17:23,569 OK. 508 00:17:23,569 --> 00:17:24,420 Three. 509 00:17:24,420 --> 00:17:25,400 Four. 510 00:17:25,400 --> 00:17:27,247 Let me-- OK, five. 511 00:17:27,247 --> 00:17:28,830 You're being nominated by your friend. 512 00:17:28,830 --> 00:17:31,340 Six, seven, and eight. 513 00:17:31,340 --> 00:17:32,130 Come on up. 514 00:17:32,130 --> 00:17:32,630 All right. 515 00:17:32,630 --> 00:17:33,190 Thank you so much. 516 00:17:33,190 --> 00:17:33,689 Come on up. 517 00:17:33,689 --> 00:17:34,790 Come on up. 518 00:17:34,790 --> 00:17:35,330 >> All right. 519 00:17:35,330 --> 00:17:38,890 So what we have here-- and this is among the more awkward ones, 520 00:17:38,890 --> 00:17:42,390 since this will require that you humor me for just a little bit of time. 521 00:17:42,390 --> 00:17:43,442 You shall be number one. 522 00:17:43,442 --> 00:17:44,150 What's your name? 523 00:17:44,150 --> 00:17:44,610 >> ANNAN: Annan. 524 00:17:44,610 --> 00:17:45,526 >> DAVID J. MALAN: Annan. 525 00:17:45,526 --> 00:17:46,092 David. 526 00:17:46,092 --> 00:17:46,800 What's your name? 527 00:17:46,800 --> 00:17:47,140 >> JOSEPH: Joseph. 528 00:17:47,140 --> 00:17:49,190 >> DAVID J. MALAN: Joseph, you are number two. 529 00:17:49,190 --> 00:17:52,260 >> SERENA: Serena, number three. 530 00:17:52,260 --> 00:17:53,722 Stefan, number four. 531 00:17:53,722 --> 00:17:54,430 CYNTHIA: Cynthia. 532 00:17:54,430 --> 00:17:57,548 DAVID J. MALAN: Cynthia, number five. 533 00:17:57,548 --> 00:17:58,452 [INAUDIBLE] 534 00:17:58,452 --> 00:17:59,618 DAVID J. MALAN: [INAUDIBLE]. 535 00:17:59,618 --> 00:18:00,391 David, number six. 536 00:18:00,391 --> 00:18:00,890 MATT: Matt. 537 00:18:00,890 --> 00:18:02,160 DAVID J. MALAN: Matt's number seven. 538 00:18:02,160 --> 00:18:02,850 And? 539 00:18:02,850 --> 00:18:03,210 >> WAVERLY: Waverly. 540 00:18:03,210 --> 00:18:04,470 >> DAVID J. MALAN: Waverly, number eight. 541 00:18:04,470 --> 00:18:04,970 All right. 542 00:18:04,970 --> 00:18:06,510 If you could-- whoops. 543 00:18:06,510 --> 00:18:08,820 If you all, as your first challenge, there 544 00:18:08,820 --> 00:18:10,820 are eight music stands here facing the audience. 545 00:18:10,820 --> 00:18:14,200 If you could put your numbers on these music stands in such a way 546 00:18:14,200 --> 00:18:16,560 that they line up with the same numbers on the board. 547 00:18:16,560 --> 00:18:19,560 So make yourselves look like that by putting your numbers on these music 548 00:18:19,560 --> 00:18:21,960 stands here. 549 00:18:21,960 --> 00:18:25,980 Excellent so far. 550 00:18:25,980 --> 00:18:26,600 >> Excellent. 551 00:18:26,600 --> 00:18:26,890 OK. 552 00:18:26,890 --> 00:18:29,556 So now, we're going to ask the question in a few different ways. 553 00:18:29,556 --> 00:18:31,610 How can we go about sorting these folks up here? 554 00:18:31,610 --> 00:18:34,500 Because we had a few approaches earlier, whereby we were 555 00:18:34,500 --> 00:18:36,360 kind of making two different buckets. 556 00:18:36,360 --> 00:18:38,842 And then we were generally piecing things together. 557 00:18:38,842 --> 00:18:41,050 As soon as we saw two numbers that belonged together, 558 00:18:41,050 --> 00:18:41,975 we put them together. 559 00:18:41,975 --> 00:18:43,350 Two letters that belong together. 560 00:18:43,350 --> 00:18:45,058 >> But let's see if we can't formalize this, 561 00:18:45,058 --> 00:18:48,044 so that we ultimately have some pseudo-code you will, 562 00:18:48,044 --> 00:18:49,710 with which you can solve these problems. 563 00:18:49,710 --> 00:18:51,870 So now, I'm looking out at these numbers here. 564 00:18:51,870 --> 00:18:55,030 And I see a whole bunch of mistakes. 565 00:18:55,030 --> 00:18:57,750 Ultimately, I want one on the left and eight on the right. 566 00:18:57,750 --> 00:19:00,650 >> And so I'm looking at these two, four and two. 567 00:19:00,650 --> 00:19:02,930 And what's the problem, obviously? 568 00:19:02,930 --> 00:19:04,261 Yeah. 569 00:19:04,261 --> 00:19:04,760 So. 570 00:19:04,760 --> 00:19:07,160 Two obviously comes before four, so you know what? 571 00:19:07,160 --> 00:19:10,210 Let me first take a greedy approach, if you will, much like problem 572 00:19:10,210 --> 00:19:13,790 set one-- if you recall from the Standard Edition of Problem Set One, 573 00:19:13,790 --> 00:19:16,820 where I just locally solve the problem that's right here in front of me 574 00:19:16,820 --> 00:19:17,690 and see where it leads me. 575 00:19:17,690 --> 00:19:17,870 >> OK. 576 00:19:17,870 --> 00:19:20,161 So two and four, let me go ahead and just swap you two. 577 00:19:20,161 --> 00:19:22,400 If you can physically move yourselves and your paper, 578 00:19:22,400 --> 00:19:25,040 I seem to have gotten the list in a better state. 579 00:19:25,040 --> 00:19:26,330 >> Now, they're good. 580 00:19:26,330 --> 00:19:28,480 I'm going to move on, four and six, looks good. 581 00:19:28,480 --> 00:19:29,110 Not a problem. 582 00:19:29,110 --> 00:19:30,440 Six and eight, OK. 583 00:19:30,440 --> 00:19:31,860 Eight and one, another problem. 584 00:19:31,860 --> 00:19:34,750 Because what's true about eight and one? 585 00:19:34,750 --> 00:19:36,990 One comes before eight, and so what should we do? 586 00:19:36,990 --> 00:19:38,090 Let's swap these two. 587 00:19:38,090 --> 00:19:39,316 One and eight. 588 00:19:39,316 --> 00:19:40,690 And now, I'm going to keep going. 589 00:19:40,690 --> 00:19:42,030 I'm going to keep looking ahead. 590 00:19:42,030 --> 00:19:42,840 And let's see what happens. 591 00:19:42,840 --> 00:19:44,680 Eight and three, of course, out of order. 592 00:19:44,680 --> 00:19:45,815 Let's swap. 593 00:19:45,815 --> 00:19:46,940 Eight and seven, of course. 594 00:19:46,940 --> 00:19:47,481 Out of order. 595 00:19:47,481 --> 00:19:48,280 Let's swap. 596 00:19:48,280 --> 00:19:49,940 Eight and five, of course, let's swap. 597 00:19:49,940 --> 00:19:50,560 All right. 598 00:19:50,560 --> 00:19:51,880 List is sorted. 599 00:19:51,880 --> 00:19:53,060 yes? 600 00:19:53,060 --> 00:19:54,280 >> OK, obviously not. 601 00:19:54,280 --> 00:19:55,860 But it is a little better, right? 602 00:19:55,860 --> 00:19:57,270 Because notice what happened. 603 00:19:57,270 --> 00:20:01,749 Each time we performed a swap, a smaller number kind of percolated that way, 604 00:20:01,749 --> 00:20:03,790 and a bigger number percolated this way, or we'll 605 00:20:03,790 --> 00:20:06,880 start saying bubbled to the left or bubbled to the right. 606 00:20:06,880 --> 00:20:10,080 >> Now, it's not enough, because at best a number might 607 00:20:10,080 --> 00:20:11,990 have moved one spot forward, or at worst, 608 00:20:11,990 --> 00:20:13,880 a number might have moved one spot further. 609 00:20:13,880 --> 00:20:16,369 So you know what, this kind of worked pretty well so far. 610 00:20:16,369 --> 00:20:17,410 Let me just try it again. 611 00:20:17,410 --> 00:20:18,880 Two and four, they're OK. 612 00:20:18,880 --> 00:20:20,180 Four and six, they're OK. 613 00:20:20,180 --> 00:20:21,790 Six and one, out of order. 614 00:20:21,790 --> 00:20:23,007 So let's swap you two. 615 00:20:23,007 --> 00:20:25,840 And now, notice the problem's starting to get a little better again. 616 00:20:25,840 --> 00:20:27,006 Six and three, out of order. 617 00:20:27,006 --> 00:20:28,100 Let's swap you two. 618 00:20:28,100 --> 00:20:29,730 Six and seven, you're good. 619 00:20:29,730 --> 00:20:32,230 Seven and five, of course, out of order. 620 00:20:32,230 --> 00:20:33,920 Seven and eight, in order. 621 00:20:33,920 --> 00:20:36,470 And now, I might need to do this a few more times. 622 00:20:36,470 --> 00:20:39,830 And in fact, think for yourselves perhaps how many times maximally 623 00:20:39,830 --> 00:20:41,330 might I have to walk back and forth? 624 00:20:41,330 --> 00:20:42,390 >> We'll come back to that. 625 00:20:42,390 --> 00:20:43,700 So two and four are still OK. 626 00:20:43,700 --> 00:20:44,940 Four and one, nope. 627 00:20:44,940 --> 00:20:45,747 So, let's swap. 628 00:20:45,747 --> 00:20:47,830 And again, notice visually one is kind of bubbling 629 00:20:47,830 --> 00:20:49,163 to the left, where it should be. 630 00:20:49,163 --> 00:20:50,010 Four and three swap. 631 00:20:50,010 --> 00:20:51,330 Four and six. 632 00:20:51,330 --> 00:20:53,100 Six and five swap. 633 00:20:53,100 --> 00:20:53,959 Six and seven. 634 00:20:53,959 --> 00:20:55,000 Seven and eight are good. 635 00:20:55,000 --> 00:20:55,500 >> Good. 636 00:20:55,500 --> 00:20:58,460 We're getting even better. 637 00:20:58,460 --> 00:20:59,130 So let's see. 638 00:20:59,130 --> 00:21:00,940 Now, we have two and one. 639 00:21:00,940 --> 00:21:02,520 Of course, swap. 640 00:21:02,520 --> 00:21:07,520 Two and three, three and four, four and five, six and seven, seven and eight. 641 00:21:07,520 --> 00:21:08,020 Good. 642 00:21:08,020 --> 00:21:08,730 And you know what? 643 00:21:08,730 --> 00:21:11,190 Because I made one change there, let me do one sanity check. 644 00:21:11,190 --> 00:21:13,023 Let me go all the way back to the beginning. 645 00:21:13,023 --> 00:21:13,680 OK. 646 00:21:13,680 --> 00:21:14,750 One, two-- yup, see? 647 00:21:14,750 --> 00:21:15,870 Something was wrong. 648 00:21:15,870 --> 00:21:18,420 Three, four, five, six, seven, eight. 649 00:21:18,420 --> 00:21:21,920 And in this last pass, are you comfortable with my now 650 00:21:21,920 --> 00:21:23,830 claiming it is sorted? 651 00:21:23,830 --> 00:21:24,330 OK. 652 00:21:24,330 --> 00:21:25,880 Visually, that's absolutely true. 653 00:21:25,880 --> 00:21:28,410 But functionally, what did also just happen 654 00:21:28,410 --> 00:21:31,870 in that last pass that allows you to confirm that this list is indeed 655 00:21:31,870 --> 00:21:32,660 sorted? 656 00:21:32,660 --> 00:21:34,477 >> What did I do or not do this last pass? 657 00:21:34,477 --> 00:21:35,810 AUDIENCE: There were no changes. 658 00:21:35,810 --> 00:21:36,120 DAVID J. MALAN: Sorry? 659 00:21:36,120 --> 00:21:37,070 AUDIENCE: No changes. 660 00:21:37,070 --> 00:21:38,653 DAVID J. MALAN: There were no changes. 661 00:21:38,653 --> 00:21:41,947 So it would be stupid of me to do that same algorithm again 662 00:21:41,947 --> 00:21:43,780 if I didn't make any changes the first time. 663 00:21:43,780 --> 00:21:45,160 And the state has not changed. 664 00:21:45,160 --> 00:21:47,576 Surely, I'm not going to make any changes the second time. 665 00:21:47,576 --> 00:21:49,820 And so, it's safe now to say, list is sorted. 666 00:21:49,820 --> 00:21:52,069 >> And indeed, this is now something that we'll generally 667 00:21:52,069 --> 00:21:56,900 call bubble sort, whereby pairwise, you correct mistakes again, 668 00:21:56,900 --> 00:22:00,210 and again, and again, and you keep going back and forth, 669 00:22:00,210 --> 00:22:03,370 and back and forth, until you make no such swaps, at which point 670 00:22:03,370 --> 00:22:07,089 you can be confident, yeah, I finished fixing all of the mistakes. 671 00:22:07,089 --> 00:22:08,630 Let's reset and try another approach. 672 00:22:08,630 --> 00:22:11,590 If you guys could go back into the order you were a moment ago, 673 00:22:11,590 --> 00:22:13,780 which looked like this. 674 00:22:13,780 --> 00:22:17,640 Now, let's take an approach a little more like the exam book, 675 00:22:17,640 --> 00:22:21,122 whereby we were constantly selecting the letter of the alphabet 676 00:22:21,122 --> 00:22:22,830 that we kind of wanted to deal with next. 677 00:22:22,830 --> 00:22:26,420 Maybe it was a high letter, like A, or a low letter Z. 678 00:22:26,420 --> 00:22:28,170 >> So everyone's back in this order. 679 00:22:28,170 --> 00:22:29,800 And now let me do this. 680 00:22:29,800 --> 00:22:34,880 Let's see I know I have eight numbers here. 681 00:22:34,880 --> 00:22:37,410 I'm going to go ahead and just deliberately select 682 00:22:37,410 --> 00:22:38,520 the smallest elements. 683 00:22:38,520 --> 00:22:38,760 Right? 684 00:22:38,760 --> 00:22:39,801 This seems intuitive too. 685 00:22:39,801 --> 00:22:42,560 Why don't I find the smallest element, put it where it belongs, 686 00:22:42,560 --> 00:22:45,280 then get the next smallest element, put it where it belongs, and just repeat. 687 00:22:45,280 --> 00:22:46,820 >> Because intuitively, that should work too. 688 00:22:46,820 --> 00:22:48,441 So four, that's a pretty small number. 689 00:22:48,441 --> 00:22:49,940 I'm going to remember where this is. 690 00:22:49,940 --> 00:22:50,523 Wait a minute. 691 00:22:50,523 --> 00:22:51,577 Two is smaller. 692 00:22:51,577 --> 00:22:53,910 Let me now remember where two is, and forget about four. 693 00:22:53,910 --> 00:22:55,050 We'll deal with that later. 694 00:22:55,050 --> 00:22:56,460 Six, I'm not interested. 695 00:22:56,460 --> 00:22:57,810 Eight, I'm not interested in. 696 00:22:57,810 --> 00:22:59,780 One is my new small number. 697 00:22:59,780 --> 00:23:01,470 So I'm going to remember where one is. 698 00:23:01,470 --> 00:23:02,534 Three, not interested. 699 00:23:02,534 --> 00:23:03,450 Seven, not interested. 700 00:23:03,450 --> 00:23:04,530 Five, not interested. 701 00:23:04,530 --> 00:23:07,390 >> So without falling off the stage this year, 702 00:23:07,390 --> 00:23:09,890 I'm going to grab number one-- and what was your name again? 703 00:23:09,890 --> 00:23:10,150 >> ANNAN: Annan. 704 00:23:10,150 --> 00:23:11,220 >> DAVID J. MALAN: Annan. 705 00:23:11,220 --> 00:23:13,540 And if you could join me at the beginning of the list, 706 00:23:13,540 --> 00:23:14,870 let's put you where you belong. 707 00:23:14,870 --> 00:23:16,080 Unfortunately-- what's your name? 708 00:23:16,080 --> 00:23:16,650 >> STEFAN: Stefan. 709 00:23:16,650 --> 00:23:18,191 >> DAVID J. MALAN: Stefan is in the way. 710 00:23:18,191 --> 00:23:23,490 So before Stefan solves this problem, what should we do? 711 00:23:23,490 --> 00:23:25,412 What do we do with Stefan? 712 00:23:25,412 --> 00:23:27,269 >> AUDIENCE: [INAUDIBLE]. 713 00:23:27,269 --> 00:23:28,060 DAVID J. MALAN: OK. 714 00:23:28,060 --> 00:23:28,850 So we could do that. 715 00:23:28,850 --> 00:23:31,730 We could sort of take Stefan and his four, and just put it in a variable 716 00:23:31,730 --> 00:23:33,530 and hold on to it for some amount of time, 717 00:23:33,530 --> 00:23:35,220 thereby making room for number one. 718 00:23:35,220 --> 00:23:36,280 And that's not bad. 719 00:23:36,280 --> 00:23:39,270 I could suggest, why don't we just put Stefan here? 720 00:23:39,270 --> 00:23:41,610 Why might this violate one of the ideas we started 721 00:23:41,610 --> 00:23:44,830 talking about last time, last week? 722 00:23:44,830 --> 00:23:45,330 Yeah? 723 00:23:45,330 --> 00:23:45,740 >> AUDIENCE: [INAUDIBLE]. 724 00:23:45,740 --> 00:23:46,860 >> DAVID J. MALAN: There's no index for it. 725 00:23:46,860 --> 00:23:49,735 If you think of this, indeed, as an array, this is like negative one, 726 00:23:49,735 --> 00:23:52,900 so there's no memory actually here if this is indeed an array, 727 00:23:52,900 --> 00:23:55,090 like we declared last week in lecture. 728 00:23:55,090 --> 00:23:56,250 So we shouldn't do this. 729 00:23:56,250 --> 00:23:57,340 We might store it in a variable. 730 00:23:57,340 --> 00:23:57,820 >> Or you know what? 731 00:23:57,820 --> 00:23:59,153 I heard someone else suggest it. 732 00:23:59,153 --> 00:24:01,020 What else could we do with Stefan? 733 00:24:01,020 --> 00:24:03,770 Why don't we just evict him and put him over where number one was. 734 00:24:03,770 --> 00:24:05,170 So if you want to go over there. 735 00:24:05,170 --> 00:24:07,300 And indeed, this is a pretty good solution. 736 00:24:07,300 --> 00:24:10,480 Now on the one hand, I've kind of made the problem worse. 737 00:24:10,480 --> 00:24:13,650 Four is now farther away from where it should be. 738 00:24:13,650 --> 00:24:14,900 It should be toward this half. 739 00:24:14,900 --> 00:24:16,100 >> But you know what? 740 00:24:16,100 --> 00:24:17,630 That could have been bad luck. 741 00:24:17,630 --> 00:24:18,822 Maybe number eight was here. 742 00:24:18,822 --> 00:24:20,530 And so, maybe we would have gotten lucky, 743 00:24:20,530 --> 00:24:22,460 and pushed eight closer to the end. 744 00:24:22,460 --> 00:24:24,710 So in the end of the day, it kind of all averages out. 745 00:24:24,710 --> 00:24:26,085 We don't need to care about four. 746 00:24:26,085 --> 00:24:29,400 All I care about right now is selecting the smallest element. 747 00:24:29,400 --> 00:24:32,030 >> And now, what I'm going to do is forget about number one 748 00:24:32,030 --> 00:24:35,160 permanently, because I know the list behind me is now sorted. 749 00:24:35,160 --> 00:24:36,720 So my list was previously size eight. 750 00:24:36,720 --> 00:24:37,720 Now, it's of size seven. 751 00:24:37,720 --> 00:24:40,340 So my problem is getting smaller, albeit linearly. 752 00:24:40,340 --> 00:24:43,022 So now, I'm going to select the current smallest element, two. 753 00:24:43,022 --> 00:24:46,520 Six, eight, four, three, seven, five. 754 00:24:46,520 --> 00:24:47,770 That was the smallest element. 755 00:24:47,770 --> 00:24:49,416 So what am I going to do with-- what was your name again? 756 00:24:49,416 --> 00:24:49,760 >> JOSEPH: Joseph. 757 00:24:49,760 --> 00:24:50,085 >> DAVID J. MALAN: Joseph? 758 00:24:50,085 --> 00:24:52,000 We're going to leave Joseph in place. 759 00:24:52,000 --> 00:24:54,842 Now, I'm going to pretend that these guys are-- well, 760 00:24:54,842 --> 00:24:56,550 I know that these two are already sorted. 761 00:24:56,550 --> 00:24:58,424 Let's now focus on the remainder of the list. 762 00:24:58,424 --> 00:25:00,080 Six is the current smallest. 763 00:25:00,080 --> 00:25:01,190 Eight is bigger. 764 00:25:01,190 --> 00:25:02,970 Four is now the current smallest. 765 00:25:02,970 --> 00:25:04,762 Three is now the current smallest. 766 00:25:04,762 --> 00:25:07,720 And so now, I'm going to select three, who is-- what's your name again? 767 00:25:07,720 --> 00:25:08,190 SERENA: Serena. 768 00:25:08,190 --> 00:25:10,620 DAVID J. MALAN: Serena, if you could grab your number and swap with-- 769 00:25:10,620 --> 00:25:11,550 KALSANG: Kalsang. 770 00:25:11,550 --> 00:25:12,940 DAVID J. MALAN: Kalsang. 771 00:25:12,940 --> 00:25:15,220 Come on back, and we're going to swap those two. 772 00:25:15,220 --> 00:25:17,360 And now, let's put this on autopilot. 773 00:25:17,360 --> 00:25:21,589 I'm going to go and leave it to you guys to select the next smallest elements. 774 00:25:21,589 --> 00:25:22,380 Dun, dun, dun, dun. 775 00:25:22,380 --> 00:25:24,560 Number four, what should you do? 776 00:25:24,560 --> 00:25:26,261 Excellent. 777 00:25:26,261 --> 00:25:27,760 Now, I'm going to make another pass. 778 00:25:27,760 --> 00:25:28,590 Dun, dun, dun, dun. 779 00:25:28,590 --> 00:25:31,465 I see five is the next smallest. 780 00:25:31,465 --> 00:25:32,840 Now, I'm going take another pass. 781 00:25:32,840 --> 00:25:33,631 Dun, dun, dun, dun. 782 00:25:33,631 --> 00:25:34,880 Six is the smallest. 783 00:25:34,880 --> 00:25:35,520 Good. 784 00:25:35,520 --> 00:25:36,585 Seven is the smallest. 785 00:25:36,585 --> 00:25:37,085 No change. 786 00:25:37,085 --> 00:25:38,630 Eight is the smallest. 787 00:25:38,630 --> 00:25:39,170 Done. 788 00:25:39,170 --> 00:25:43,900 >> So what we've just done by iteratively selecting one element after the other 789 00:25:43,900 --> 00:25:47,230 is implement something that we're going to formalize as selection sort. 790 00:25:47,230 --> 00:25:49,120 And it's perhaps even simpler to explain, 791 00:25:49,120 --> 00:25:51,310 in that literally all you want to do is just keep 792 00:25:51,310 --> 00:25:54,700 going back and forth through the list selecting, the next smallest element, 793 00:25:54,700 --> 00:25:55,720 until you're done. 794 00:25:55,720 --> 00:25:58,650 >> So it's even simpler, perhaps intuitively, than last. 795 00:25:58,650 --> 00:26:00,020 Let's try one last one. 796 00:26:00,020 --> 00:26:03,060 If you guys could reset yourselves into the following positions 797 00:26:03,060 --> 00:26:08,600 one final time, let's see if we can't now formalize one other approach. 798 00:26:08,600 --> 00:26:12,857 In fact, would someone out there like to propose 799 00:26:12,857 --> 00:26:14,440 how else we might go about doing this? 800 00:26:14,440 --> 00:26:17,439 Without tossing out buzzwords or sort of answers that are already known, 801 00:26:17,439 --> 00:26:19,689 just intuitively, what could we do? 802 00:26:19,689 --> 00:26:21,635 >> AUDIENCE: [INAUDIBLE]. 803 00:26:21,635 --> 00:26:22,510 DAVID J. MALAN: Yeah. 804 00:26:22,510 --> 00:26:24,620 So there's some great intuition there. 805 00:26:24,620 --> 00:26:28,020 Good things seem to happen thus far in computer science when we divide 806 00:26:28,020 --> 00:26:30,832 and conquer the problem of dividing it in half and half and half. 807 00:26:30,832 --> 00:26:32,540 And so indeed, we could start to do that. 808 00:26:32,540 --> 00:26:35,754 And in fact, that's going to be, we'll see, one of our best solutions yet. 809 00:26:35,754 --> 00:26:37,420 But let's come back to that before long. 810 00:26:37,420 --> 00:26:40,500 In fact, we're going to do that a little later this week. 811 00:26:40,500 --> 00:26:42,180 What else might we do to solve this? 812 00:26:42,180 --> 00:26:44,647 So everyone here is in seemingly random order. 813 00:26:44,647 --> 00:26:45,230 You know what? 814 00:26:45,230 --> 00:26:48,320 Rather than go back and forth, back and forth, back and forth 815 00:26:48,320 --> 00:26:50,624 each time, this feels like I'm doing a lot of walking. 816 00:26:50,624 --> 00:26:52,790 Why don't I just start at the beginning of the list, 817 00:26:52,790 --> 00:26:54,960 and just put four where it belongs? 818 00:26:54,960 --> 00:26:59,680 So let me assume for the moment that my list is only this first element. 819 00:26:59,680 --> 00:27:04,937 Is four sorted at this moment in time, if all I care about is everything here? 820 00:27:04,937 --> 00:27:06,520 This is sort of trivially true, right? 821 00:27:06,520 --> 00:27:10,000 Like the list containing one number, and that number four is obviously sorted. 822 00:27:10,000 --> 00:27:13,070 >> So let me just stipulate that this list is sorted. 823 00:27:13,070 --> 00:27:15,090 But now I have the rest of this list. 824 00:27:15,090 --> 00:27:17,240 So now, I encounter two. 825 00:27:17,240 --> 00:27:21,690 Where does two obviously belong with respect to four? 826 00:27:21,690 --> 00:27:22,580 Before four. 827 00:27:22,580 --> 00:27:23,862 So what can I do here? 828 00:27:23,862 --> 00:27:24,820 What's your name again? 829 00:27:24,820 --> 00:27:25,090 >> JOSEPH: Joseph. 830 00:27:25,090 --> 00:27:26,030 >> DAVID J. MALAN: Joseph, if you could step back 831 00:27:26,030 --> 00:27:27,790 for just a moment with your number. 832 00:27:27,790 --> 00:27:31,130 And now what should Stefan do here? 833 00:27:31,130 --> 00:27:33,720 Let's shift Stefan over here. 834 00:27:33,720 --> 00:27:35,520 And now, let Joseph come in here. 835 00:27:35,520 --> 00:27:39,660 And now, let me claim that everything here is sorted. 836 00:27:39,660 --> 00:27:42,474 So, similar result, but a fundamentally different approach. 837 00:27:42,474 --> 00:27:44,140 I haven't even looked what's down there. 838 00:27:44,140 --> 00:27:46,310 I just keep taking the elements as they're handed to me, 839 00:27:46,310 --> 00:27:47,240 and deal with them. 840 00:27:47,240 --> 00:27:48,330 >> So now, I see number six. 841 00:27:48,330 --> 00:27:51,110 Where does number six belong? 842 00:27:51,110 --> 00:27:53,250 We have two, four, six. 843 00:27:53,250 --> 00:27:54,800 Exactly where she is right now. 844 00:27:54,800 --> 00:27:57,750 So let's leave that alone, and now claim that this part of the list 845 00:27:57,750 --> 00:27:58,772 is now sorted. 846 00:27:58,772 --> 00:28:01,230 And so, this feels fundamentally different in that I'm just 847 00:28:01,230 --> 00:28:05,230 moving through the list here linearly, and I'm never doubling back. 848 00:28:05,230 --> 00:28:05,730 Yes. 849 00:28:05,730 --> 00:28:06,230 All right. 850 00:28:06,230 --> 00:28:08,190 So eight, where do you belong? 851 00:28:08,190 --> 00:28:08,730 Right here. 852 00:28:08,730 --> 00:28:09,310 Perfect. 853 00:28:09,310 --> 00:28:10,210 So now, one. 854 00:28:10,210 --> 00:28:10,900 Uh-oh. 855 00:28:10,900 --> 00:28:13,010 This feels like it's going to be expensive. 856 00:28:13,010 --> 00:28:15,690 Now, in the previous algorithm, I just swapped people. 857 00:28:15,690 --> 00:28:18,648 So I might put him all the way at the beginning, but then moved Joseph. 858 00:28:18,648 --> 00:28:21,450 But if I move Joseph, now what's going to be wrong? 859 00:28:21,450 --> 00:28:24,250 >> Now, I've sort of undone-- I've taken one step forward and then 860 00:28:24,250 --> 00:28:26,300 one step back, because now Joseph would be out of order. 861 00:28:26,300 --> 00:28:26,830 So let's do this. 862 00:28:26,830 --> 00:28:29,150 If you could take number one and step back for just a moment. 863 00:28:29,150 --> 00:28:30,490 How can we put-- what was your name again? 864 00:28:30,490 --> 00:28:31,130 >> ANNAN: Annan. 865 00:28:31,130 --> 00:28:32,610 >> DAVID J. MALAN: Annan in place? 866 00:28:32,610 --> 00:28:36,091 What needs to happen with respect to two, four, six, and eight? 867 00:28:36,091 --> 00:28:37,570 They all need to shift. 868 00:28:37,570 --> 00:28:42,590 So if eight would like to shift first, then six, then four, then two. 869 00:28:42,590 --> 00:28:45,380 And then Annan, if you'd like to come in here, good. 870 00:28:45,380 --> 00:28:47,760 But here, we've just kind of paid a price 871 00:28:47,760 --> 00:28:49,510 at a different point in the algorithm. 872 00:28:49,510 --> 00:28:52,550 Whereas last time with selection sort, and even bubble sort, 873 00:28:52,550 --> 00:28:54,700 I'm walking back and forth, back and forth, 874 00:28:54,700 --> 00:28:58,360 which is certainly adding up time-wise, and literally stepwise. 875 00:28:58,360 --> 00:29:00,660 >> Insertion sort, at first glance, looks like it's 876 00:29:00,660 --> 00:29:05,150 super smarter, in that I'm just making slow, incremental progress, 877 00:29:05,150 --> 00:29:07,120 but I'm not going this back and forth. 878 00:29:07,120 --> 00:29:09,410 But if someone is indeed out of order, notice 879 00:29:09,410 --> 00:29:10,840 all of the work I just had to do. 880 00:29:10,840 --> 00:29:14,750 I had to move half of the list just to make room for number one. 881 00:29:14,750 --> 00:29:16,790 So it's the same amount of work thus far it 882 00:29:16,790 --> 00:29:18,690 feels, just a different type of work. 883 00:29:18,690 --> 00:29:19,370 >> Let's continue. 884 00:29:19,370 --> 00:29:22,657 So now we know that everyone between one and eight are sorted. 885 00:29:22,657 --> 00:29:23,740 Here, I have number three. 886 00:29:23,740 --> 00:29:25,864 If you like to pick up number three, step back one. 887 00:29:25,864 --> 00:29:28,260 And what do you guys need to do? 888 00:29:28,260 --> 00:29:28,760 Yep. 889 00:29:28,760 --> 00:29:33,070 So that's another one, two, three steps. 890 00:29:33,070 --> 00:29:36,010 Three units of time that just cost me, so that three can now fit. 891 00:29:36,010 --> 00:29:37,460 Finally, seven. 892 00:29:37,460 --> 00:29:39,730 >> Let's go ahead and have you take a step back. 893 00:29:39,730 --> 00:29:42,780 This is only going to cost us one unit of time, but that's OK. 894 00:29:42,780 --> 00:29:44,170 And now, five's going to be a little more expensive. 895 00:29:44,170 --> 00:29:45,340 If you'd like to step back. 896 00:29:45,340 --> 00:29:48,380 We need to move eight, and seven, and six. 897 00:29:48,380 --> 00:29:50,749 And then everyone is now sorted. 898 00:29:50,749 --> 00:29:52,290 So a big hand to our volunteers here. 899 00:29:52,290 --> 00:29:53,554 Thank you so much. 900 00:29:53,554 --> 00:29:56,220 >> [APPLAUSE] 901 00:29:56,220 --> 00:29:56,860 >> Thank you all. 902 00:29:56,860 --> 00:29:57,520 Thank you all. 903 00:29:57,520 --> 00:30:02,940 So let's see now just how costly all of that was. 904 00:30:02,940 --> 00:30:06,210 Let's consider perhaps the simplest of these, bubble sort. 905 00:30:06,210 --> 00:30:09,950 And I say simplest, only because you can solve it greedily by just 906 00:30:09,950 --> 00:30:11,660 fix the pairwise problem here. 907 00:30:11,660 --> 00:30:13,720 Fix the pairwise problem here, again and again 908 00:30:13,720 --> 00:30:17,680 and again, repeating as many times as you actually need to. 909 00:30:17,680 --> 00:30:21,050 >> So it turns out that with a bubble sort, well, 910 00:30:21,050 --> 00:30:25,820 how many steps do I have to take on the first pass of that algorithm? 911 00:30:25,820 --> 00:30:30,850 I might take-- let's see-- one, two, three, four, five, six, seven. 912 00:30:30,850 --> 00:30:32,190 And there's eight elements here. 913 00:30:32,190 --> 00:30:35,280 So it's like n minus 1 steps to get from the beginning of the list 914 00:30:35,280 --> 00:30:36,380 to the end of the list. 915 00:30:36,380 --> 00:30:41,350 >> But with selection sort, recall that I'm selecting the elements again and again 916 00:30:41,350 --> 00:30:44,590 and again that's smallest, I'm putting it in place, 917 00:30:44,590 --> 00:30:46,616 but then I'm not looking behind me again. 918 00:30:46,616 --> 00:30:49,490 So I think it's a little more clear then that the first time, I might 919 00:30:49,490 --> 00:30:52,680 have to take all n minus 1 steps to find the smallest element. 920 00:30:52,680 --> 00:30:55,920 Then I put them in place, and I evict whoever was here previously. 921 00:30:55,920 --> 00:30:57,500 >> But then I don't have to keep looking at this element, 922 00:30:57,500 --> 00:30:59,040 because I know it's already the smallest. 923 00:30:59,040 --> 00:31:01,581 So now, I can look at just seven elements, then six elements, 924 00:31:01,581 --> 00:31:03,290 then five elements, then four elements. 925 00:31:03,290 --> 00:31:06,900 And so mathematically, if n is the number of elements or numbers 926 00:31:06,900 --> 00:31:11,990 that we started with, you can imagine that this is the same as n minus 1, 927 00:31:11,990 --> 00:31:14,250 plus n minus 2 steps, plus n minus 3 steps, 928 00:31:14,250 --> 00:31:16,780 plus n minus 4 steps, all the way down to just one step. 929 00:31:16,780 --> 00:31:18,160 And I'm on my last person. 930 00:31:18,160 --> 00:31:20,650 >> And if you recall that a lot of stats books or math books 931 00:31:20,650 --> 00:31:24,730 have those formulas on the hardcover back or front of them, 932 00:31:24,730 --> 00:31:27,690 it turns out that this series can be expressed more simply 933 00:31:27,690 --> 00:31:28,857 as n times n minus 1 over 2. 934 00:31:28,857 --> 00:31:31,273 And it's fine if that's not at the forefront of your mind. 935 00:31:31,273 --> 00:31:32,420 But this is indeed true. 936 00:31:32,420 --> 00:31:34,449 That's just a simpler way of writing it. 937 00:31:34,449 --> 00:31:36,240 And then if you think back to grade school, 938 00:31:36,240 --> 00:31:38,698 when you just start multiplying things out, this of course, 939 00:31:38,698 --> 00:31:41,820 is just n squared minus n divided by 2. 940 00:31:41,820 --> 00:31:44,772 All I've done is expand the expressions there. 941 00:31:44,772 --> 00:31:46,730 And so let's rewrite this a little differently. 942 00:31:46,730 --> 00:31:49,780 That's n squared divided by 2 minus n/2. 943 00:31:49,780 --> 00:31:53,270 >> So again, I'm just kind of applying some arithmetic rules there. 944 00:31:53,270 --> 00:31:57,140 But notice now that the biggest term in this expression, so to speak, 945 00:31:57,140 --> 00:31:58,540 is that n squared. 946 00:31:58,540 --> 00:32:02,910 So yes, it's n squared divided by 2, minus n/2. 947 00:32:02,910 --> 00:32:05,080 >> But generally, if n is going to be a big value, 948 00:32:05,080 --> 00:32:08,740 I'm going to claim that n squared is going to be the dominant factor. 949 00:32:08,740 --> 00:32:10,490 It's just going to be a bigger contributor 950 00:32:10,490 --> 00:32:12,877 to the number of steps than n/2. 951 00:32:12,877 --> 00:32:13,960 So what do I mean by this? 952 00:32:13,960 --> 00:32:16,795 Let's try a simple example, even though the math gets a little big. 953 00:32:16,795 --> 00:32:20,210 >> So suppose we had 1 million people on stage, or 1 million things 954 00:32:20,210 --> 00:32:21,320 that we want to sort. 955 00:32:21,320 --> 00:32:23,730 Let's plug a million into exactly that formula 956 00:32:23,730 --> 00:32:27,230 to see how many steps it takes total to sort a million elements using say, 957 00:32:27,230 --> 00:32:28,560 selection sort. 958 00:32:28,560 --> 00:32:30,760 >> So we'd have the same formula as before. 959 00:32:30,760 --> 00:32:34,120 I'd plug a million, so that I get a million squared divided by 2, 960 00:32:34,120 --> 00:32:35,990 minus a million divided by 2. 961 00:32:35,990 --> 00:32:40,180 If I do that math in advance here, we have 500 billion 962 00:32:40,180 --> 00:32:47,460 minus 500,000, which gives us 499,999,500,000, 963 00:32:47,460 --> 00:32:49,270 which is pretty darn big. 964 00:32:49,270 --> 00:32:54,370 >> In fact, if you compare now 499 billion, 999 million, 965 00:32:54,370 --> 00:33:01,210 500,000 against our original value, 500 billion, it's so damn close. 966 00:33:01,210 --> 00:33:06,850 Right? n squared divided by 2 gives us-- or rather, n squared divided by 2 967 00:33:06,850 --> 00:33:08,370 gave us 500 billion. 968 00:33:08,370 --> 00:33:13,510 That's pretty darn close to 499,999,500,000, 969 00:33:13,510 --> 00:33:17,970 which is to say subtracting off 500,000, or more generally, subtracting off 970 00:33:17,970 --> 00:33:20,010 n squared, not really a big deal. 971 00:33:20,010 --> 00:33:22,490 The n squared makes these numbers grow really fast. 972 00:33:22,490 --> 00:33:25,790 >> Now, this is important only insofar as we, as computer scientists, 973 00:33:25,790 --> 00:33:29,350 are generally not going to care so much about the nuances of these formulas 974 00:33:29,350 --> 00:33:31,400 and exactly what the precise answers are. 975 00:33:31,400 --> 00:33:33,390 We care only that, you know what? 976 00:33:33,390 --> 00:33:37,810 At the end of the day, this formula is on the order of n squared. 977 00:33:37,810 --> 00:33:39,350 >> Yes, we're dividing by 2 in there. 978 00:33:39,350 --> 00:33:41,360 Yes, we're subtracting off n minus 2. 979 00:33:41,360 --> 00:33:46,860 But at the end of the day, the term that really hurts us and costs us 980 00:33:46,860 --> 00:33:48,995 a lot of steps is that square term. 981 00:33:48,995 --> 00:33:51,370 And so what a computer scientist is going to generally do 982 00:33:51,370 --> 00:33:54,160 is ignore all of those smaller order terms, 983 00:33:54,160 --> 00:33:56,900 and just look at the one that contributes the most to the cost. 984 00:33:56,900 --> 00:34:00,530 >> And this is nice, because we can now talk in much greater generality 985 00:34:00,530 --> 00:34:02,470 about algorithms, and can compare them. 986 00:34:02,470 --> 00:34:04,550 And the fact that I'm using this O is deliberate. 987 00:34:04,550 --> 00:34:06,680 When I say on the order of, I'm specifically 988 00:34:06,680 --> 00:34:09,560 referring to something called big O. And big O 989 00:34:09,560 --> 00:34:14,090 is a notation that a computer scientist uses to describe 990 00:34:14,090 --> 00:34:16,710 an upper bound on something. 991 00:34:16,710 --> 00:34:21,150 >> So if you say that an algorithm is in big O of n squared, 992 00:34:21,150 --> 00:34:23,380 as I proposed just a moment ago, that means 993 00:34:23,380 --> 00:34:27,710 that in terms of its running time or its efficiency, 994 00:34:27,710 --> 00:34:30,090 it takes on the order of n squared steps. 995 00:34:30,090 --> 00:34:31,420 Maybe more, maybe less. 996 00:34:31,420 --> 00:34:33,435 But it's on the order of n squared. 997 00:34:33,435 --> 00:34:34,560 And that's the upper bound. 998 00:34:34,560 --> 00:34:36,530 It's not going to be more painful than that. 999 00:34:36,530 --> 00:34:40,800 It's not going to be n cubed, or 2 to the n, or something much bigger. 1000 00:34:40,800 --> 00:34:43,800 This is an upper bound on whatever that cost is. 1001 00:34:43,800 --> 00:34:46,150 So given that, let's consider just a few examples. 1002 00:34:46,150 --> 00:34:49,820 And this is just a finite list of very common running times 1003 00:34:49,820 --> 00:34:52,870 for algorithms that's meant to be illustrative of some things we've 1004 00:34:52,870 --> 00:34:53,600 seen already. 1005 00:34:53,600 --> 00:34:58,060 >> So for instance, in the case of selection sort, what I'm claiming here 1006 00:34:58,060 --> 00:35:02,250 is that selection sort's running time is on the order of n squared. 1007 00:35:02,250 --> 00:35:06,260 In the worst case, I'm going to have a whole bunch of random numbers here. 1008 00:35:06,260 --> 00:35:08,600 And as we saw mathematically, if I keep walking 1009 00:35:08,600 --> 00:35:11,310 through the list, through the list, selecting the next smallest 1010 00:35:11,310 --> 00:35:14,410 element again and again, if I actually write down all of the steps 1011 00:35:14,410 --> 00:35:18,750 I'm taking as I proposed formulaically before, it's on the order of n squared 1012 00:35:18,750 --> 00:35:20,370 steps that I'm taking. 1013 00:35:20,370 --> 00:35:24,520 >> And it turns out that bubble sort and insertion sort 1014 00:35:24,520 --> 00:35:27,370 are just as slow in the worst case. 1015 00:35:27,370 --> 00:35:32,040 Consider, for instance, insertion sort, the very last algorithm we dealt with, 1016 00:35:32,040 --> 00:35:35,500 which had us look at the element, and then insert it where it belongs. 1017 00:35:35,500 --> 00:35:38,720 And then we looked at the next element, and inserted it where it belongs. 1018 00:35:38,720 --> 00:35:40,990 >> So consider the best possible scenario. 1019 00:35:40,990 --> 00:35:45,590 Suppose I had my volunteers line up literally like this, one through eight, 1020 00:35:45,590 --> 00:35:47,440 already sorted. 1021 00:35:47,440 --> 00:35:51,300 How many steps is insertion sort going to take to sort eight people, 1022 00:35:51,300 --> 00:35:55,640 if they arrive on stage looking like this? 1023 00:35:55,640 --> 00:35:57,410 >> Eight people already sorted. 1024 00:35:57,410 --> 00:35:58,760 And I use insertion sort. 1025 00:35:58,760 --> 00:36:02,180 That last of the algorithms. 1026 00:36:02,180 --> 00:36:03,640 Well, let's reenact real fast. 1027 00:36:03,640 --> 00:36:05,504 So if I start here, I see one. 1028 00:36:05,504 --> 00:36:06,420 Where does one belong? 1029 00:36:06,420 --> 00:36:07,730 It belongs right here. 1030 00:36:07,730 --> 00:36:08,330 I see two. 1031 00:36:08,330 --> 00:36:09,660 Where does two belong? 1032 00:36:09,660 --> 00:36:10,260 Right here. 1033 00:36:10,260 --> 00:36:10,900 I see three. 1034 00:36:10,900 --> 00:36:11,920 Where does three belong? 1035 00:36:11,920 --> 00:36:12,480 Right here. 1036 00:36:12,480 --> 00:36:13,100 >> I see four. 1037 00:36:13,100 --> 00:36:13,600 Right here. 1038 00:36:13,600 --> 00:36:15,660 Five, six, seven, eight. 1039 00:36:15,660 --> 00:36:17,320 There's no reason to repeat myself. 1040 00:36:17,320 --> 00:36:21,260 And so, how many steps is that in terms of n? 1041 00:36:21,260 --> 00:36:23,870 It's on the order of n steps, right? n minus 1. 1042 00:36:23,870 --> 00:36:27,567 But I took a linear number of steps, and now I'm done. 1043 00:36:27,567 --> 00:36:28,900 So that's the best case, though. 1044 00:36:28,900 --> 00:36:29,983 What about the worst case? 1045 00:36:29,983 --> 00:36:32,730 What eight were over there, and seven were down there, 1046 00:36:32,730 --> 00:36:35,840 and one and two were over here, so that the list were truly reversed? 1047 00:36:35,840 --> 00:36:38,300 >> Well, what happens indeed if this is the number? 1048 00:36:38,300 --> 00:36:41,300 And we'll do just a couple of examples. 1049 00:36:41,300 --> 00:36:49,300 What if, indeed, the number eight is here, and the number-- whoops. 1050 00:36:49,300 --> 00:36:52,660 1051 00:36:52,660 --> 00:36:56,430 So what if, indeed, the number eight is all the way over here, 1052 00:36:56,430 --> 00:36:57,790 and I'm using insertion sort? 1053 00:36:57,790 --> 00:36:58,290 >> OK. 1054 00:36:58,290 --> 00:37:00,280 I claim at the moment it's in place. 1055 00:37:00,280 --> 00:37:03,152 But now, seven-- where does seven go? 1056 00:37:03,152 --> 00:37:04,360 Of course, it goes over here. 1057 00:37:04,360 --> 00:37:06,760 So I have to move eight over one place. 1058 00:37:06,760 --> 00:37:08,554 Now six, where does it go? 1059 00:37:08,554 --> 00:37:09,220 Well, all right. 1060 00:37:09,220 --> 00:37:13,150 Now, I have to move eight over a place, and seven over a place, 1061 00:37:13,150 --> 00:37:14,440 and then I plop down six. 1062 00:37:14,440 --> 00:37:16,870 >> So the first time, it cost me one step to fix things, 1063 00:37:16,870 --> 00:37:18,570 then it cost me two steps to fix things. 1064 00:37:18,570 --> 00:37:20,370 How many steps is it going to take to fix 1065 00:37:20,370 --> 00:37:22,720 things to put five in the right place? 1066 00:37:22,720 --> 00:37:23,340 Three. 1067 00:37:23,340 --> 00:37:29,520 Because now I have to move one, two, three. 1068 00:37:29,520 --> 00:37:32,430 How many steps is it going to take to put four in the right place? 1069 00:37:32,430 --> 00:37:36,040 4 plus 5, plus 6, plus 7. 1070 00:37:36,040 --> 00:37:40,260 >> And so it's mathematically identical to what we described for selection sort. 1071 00:37:40,260 --> 00:37:42,130 We have this series that's just increasing. 1072 00:37:42,130 --> 00:37:45,650 1 plus 2 plus 3 plus 4, or conversely, 7 plus 6 1073 00:37:45,650 --> 00:37:52,610 plus 5 plus 4 adds up for today's purposes to on the order of n squared. 1074 00:37:52,610 --> 00:37:57,640 >> So let me stipulate too that bubble sort is also in n squared. 1075 00:37:57,640 --> 00:38:01,340 Because with bubble sort, each time I go through the list, 1076 00:38:01,340 --> 00:38:03,100 I'm taking roughly how many steps? 1077 00:38:03,100 --> 00:38:06,260 Each time I literally walk from there to there? 1078 00:38:06,260 --> 00:38:07,960 Roughly n steps. 1079 00:38:07,960 --> 00:38:12,650 But how many times might I need to go through the list? 1080 00:38:12,650 --> 00:38:13,920 >> Well, roughly n time. 1081 00:38:13,920 --> 00:38:15,680 Maybe n minus 1, but roughly n times. 1082 00:38:15,680 --> 00:38:16,430 Well, why is that? 1083 00:38:16,430 --> 00:38:19,560 Well, with bubble sort, if we start with bubble sort, 1084 00:38:19,560 --> 00:38:23,570 with the list in the worst possible situation, which again is completely 1085 00:38:23,570 --> 00:38:25,550 backwards, what's going to happen? 1086 00:38:25,550 --> 00:38:28,830 I go through the list, and number one belongs all the way over there. 1087 00:38:28,830 --> 00:38:33,280 >> But with bubble sort, how far does one move on my first pass through the list? 1088 00:38:33,280 --> 00:38:36,620 How many spots does he get closer to the correct place? 1089 00:38:36,620 --> 00:38:37,240 Just one. 1090 00:38:37,240 --> 00:38:40,281 So if you kind of reason through this, every time through this algorithm, 1091 00:38:40,281 --> 00:38:41,880 David's taking roughly n steps. 1092 00:38:41,880 --> 00:38:44,940 But how many passes through the list is it 1093 00:38:44,940 --> 00:38:49,060 going to take for one to bubble to the left where it belongs? 1094 00:38:49,060 --> 00:38:51,840 >> He's got to move like, n spaces this way. 1095 00:38:51,840 --> 00:38:57,960 So just to do the sorting of the list, I have to walk back and forth n times. 1096 00:38:57,960 --> 00:39:01,540 And each time, I'm looking at n elements. 1097 00:39:01,540 --> 00:39:05,410 So do n things n times on the order of n squared. 1098 00:39:05,410 --> 00:39:07,220 >> Now, we'll see in some of the shorts that 1099 00:39:07,220 --> 00:39:10,440 are embedded in CS50's next problem set, another approach at these, 1100 00:39:10,440 --> 00:39:13,490 but for now, let's just consider some other running times, 1101 00:39:13,490 --> 00:39:16,840 especially if the sorting ones take a little bit of time to sink in. 1102 00:39:16,840 --> 00:39:21,790 What's an algorithm we've seen already that takes on the order of n steps? 1103 00:39:21,790 --> 00:39:27,560 >> What should take a linear number of steps that we've seen thus far? 1104 00:39:27,560 --> 00:39:29,350 What's that? 1105 00:39:29,350 --> 00:39:30,480 The phone directory search. 1106 00:39:30,480 --> 00:39:31,390 The first algorithm. 1107 00:39:31,390 --> 00:39:31,560 Right? 1108 00:39:31,560 --> 00:39:33,650 Where we're linearly searching for Mike Smith? 1109 00:39:33,650 --> 00:39:34,150 Indeed. 1110 00:39:34,150 --> 00:39:37,180 From week zero, when I started turning one page at a time, 1111 00:39:37,180 --> 00:39:40,095 and I even said that it was kind of a linear feeling algorithm, 1112 00:39:40,095 --> 00:39:42,720 and we had that picture on the board with the straight red line 1113 00:39:42,720 --> 00:39:44,678 and the straight yellow line, those were indeed 1114 00:39:44,678 --> 00:39:46,810 algorithms that are in big O of n. 1115 00:39:46,810 --> 00:39:50,680 >> Because to find Mike Smith in a phone book of n pages, in the worst case, 1116 00:39:50,680 --> 00:39:52,422 might take me n steps. 1117 00:39:52,422 --> 00:39:53,630 What about taking attendance? 1118 00:39:53,630 --> 00:39:55,790 One, two, three, four, five, six. 1119 00:39:55,790 --> 00:39:59,420 What's the running time of this algorithm for taking attendance? 1120 00:39:59,420 --> 00:40:03,070 Big O of n, because in theory I have to point everyone in the room. 1121 00:40:03,070 --> 00:40:05,861 >> Now as an aside, what about the other optimization from week zero? 1122 00:40:05,861 --> 00:40:08,117 Two, four, six, eight, 10, 12. 1123 00:40:08,117 --> 00:40:10,200 A computer scientist would realize, wait a minute, 1124 00:40:10,200 --> 00:40:12,320 that's on the order of n divided by two steps. 1125 00:40:12,320 --> 00:40:12,820 Right? 1126 00:40:12,820 --> 00:40:14,444 Because I'm doing two people at a time. 1127 00:40:14,444 --> 00:40:17,015 But we're going to ignore those lower order terms, 1128 00:40:17,015 --> 00:40:19,140 and we're just going to throw away the divide by 2, 1129 00:40:19,140 --> 00:40:21,830 and just say, big O of n for that algorithm as well. 1130 00:40:21,830 --> 00:40:22,760 >> What about this one? 1131 00:40:22,760 --> 00:40:26,170 We'll skip over some of these, but what was an algorithm that was log of n? 1132 00:40:26,170 --> 00:40:29,900 That took roughly log n steps? 1133 00:40:29,900 --> 00:40:30,870 The divide and conquer. 1134 00:40:30,870 --> 00:40:31,369 Exactly. 1135 00:40:31,369 --> 00:40:33,900 Like the phone book example in week zero and earlier today, 1136 00:40:33,900 --> 00:40:36,191 where we divided the problem again and again and again. 1137 00:40:36,191 --> 00:40:39,070 We drew it on the board in week zero as a curved green line, 1138 00:40:39,070 --> 00:40:41,460 and we said that day it was a logarithmic algorithm. 1139 00:40:41,460 --> 00:40:44,970 >> And indeed, the number of steps it takes to perform divide and conquer, 1140 00:40:44,970 --> 00:40:48,610 or binary search as we'll start calling it, as in the phone book, 1141 00:40:48,610 --> 00:40:50,680 is on the order of log and steps. 1142 00:40:50,680 --> 00:40:52,470 And this is a bit of a weird one. 1143 00:40:52,470 --> 00:40:54,910 >> What takes one step, or more specifically 1144 00:40:54,910 --> 00:40:56,240 a constant number of steps? 1145 00:40:56,240 --> 00:40:58,865 Maybe it's two, maybe it's three, but a computer scientist just 1146 00:40:58,865 --> 00:41:01,423 simplifies it as big O of 1, some constant number of steps. 1147 00:41:01,423 --> 00:41:04,256 What's something you could do that takes a constant number of steps? 1148 00:41:04,256 --> 00:41:08,030 1149 00:41:08,030 --> 00:41:10,930 >> What's the running time of clapping? 1150 00:41:10,930 --> 00:41:11,920 Constant time. 1151 00:41:11,920 --> 00:41:12,420 Right? 1152 00:41:12,420 --> 00:41:15,490 Like, what's the running time of doing anything that takes just one 1153 00:41:15,490 --> 00:41:18,570 operation, like print F Hello World. 1154 00:41:18,570 --> 00:41:24,110 That might be said to be constant time, unless less corner case with print F, 1155 00:41:24,110 --> 00:41:28,260 what might the running time of print F actually be? 1156 00:41:28,260 --> 00:41:28,790 And why? 1157 00:41:28,790 --> 00:41:30,550 What is n measuring in that case? 1158 00:41:30,550 --> 00:41:32,251 >> AUDIENCE: [INAUDIBLE]. 1159 00:41:32,251 --> 00:41:33,250 DAVID J. MALAN: Exactly. 1160 00:41:33,250 --> 00:41:34,900 The number of characters we want to print. 1161 00:41:34,900 --> 00:41:36,191 So it's very context-sensitive. 1162 00:41:36,191 --> 00:41:39,910 Today, we've been focusing a lot on letters and numbers here on the board. 1163 00:41:39,910 --> 00:41:43,540 But it might also be characters in an actual string. 1164 00:41:43,540 --> 00:41:46,420 So it turns out there's another measure that will start caring about, 1165 00:41:46,420 --> 00:41:48,530 and that's the opposite of big O, so to speak. 1166 00:41:48,530 --> 00:41:50,120 >> That's omega notation. 1167 00:41:50,120 --> 00:41:53,380 Whereas big O means what's, the upper bound on your running time? 1168 00:41:53,380 --> 00:41:55,580 Maximally, how much time might something take? 1169 00:41:55,580 --> 00:41:59,250 Omega-- sorry this keeps coming up-- is the opposite of that, 1170 00:41:59,250 --> 00:42:02,960 whereby it's a lower bound on the amount of time something might take. 1171 00:42:02,960 --> 00:42:10,480 >> So. for instance, what's an algorithm that takes always n squared steps? 1172 00:42:10,480 --> 00:42:15,600 Well, one of the algorithms we've seen today, in fact, might be that as well. 1173 00:42:15,600 --> 00:42:16,720 Selection sort. 1174 00:42:16,720 --> 00:42:18,270 Selection sort's pretty stupid. 1175 00:42:18,270 --> 00:42:21,760 Even if the algorithm-- sorry, even if the array is already sorted, 1176 00:42:21,760 --> 00:42:24,150 selection sort is going to keep walking through the list 1177 00:42:24,150 --> 00:42:28,907 to make sure it has the smallest element again and again and again. 1178 00:42:28,907 --> 00:42:31,740 And even though you humans in the audience know that, wait a minute, 1179 00:42:31,740 --> 00:42:33,948 you already passed the smallest element, the computer 1180 00:42:33,948 --> 00:42:37,300 doesn't know that until it looks all the way through the list. 1181 00:42:37,300 --> 00:42:40,240 Similarly, a lower bound that might also be taken into account 1182 00:42:40,240 --> 00:42:42,000 might be linear time. 1183 00:42:42,000 --> 00:42:48,260 >> How much time does it take to sort n elements in the best 1184 00:42:48,260 --> 00:42:52,420 case using something like bubble sort? 1185 00:42:52,420 --> 00:42:54,280 Suppose your list is already sorted. 1186 00:42:54,280 --> 00:42:56,696 We said bubble sort takes on the order of n squared steps. 1187 00:42:56,696 --> 00:42:59,640 But what if it's already sorted? 1188 00:42:59,640 --> 00:43:02,310 What if you realize after one pass through the array 1189 00:43:02,310 --> 00:43:03,540 that you've made no swaps? 1190 00:43:03,540 --> 00:43:05,970 Do you need to keep making more passes? 1191 00:43:05,970 --> 00:43:06,470 >> No. 1192 00:43:06,470 --> 00:43:10,340 So a lower bound on bubble sort might be said to be linear. 1193 00:43:10,340 --> 00:43:11,830 Omega of n. 1194 00:43:11,830 --> 00:43:14,450 And we can look at others of these as well. 1195 00:43:14,450 --> 00:43:17,990 So let's take a quick look at just a visualization here 1196 00:43:17,990 --> 00:43:20,790 to see how these distinguish themselves. 1197 00:43:20,790 --> 00:43:24,592 I'm going to go down here into this page that's available on C50's website, 1198 00:43:24,592 --> 00:43:27,550 but it will be a pain to get working, since it uses a technology called 1199 00:43:27,550 --> 00:43:30,560 Java applets, which is a largely unsupported these days, 1200 00:43:30,560 --> 00:43:32,730 at least by Chrome and certain others. 1201 00:43:32,730 --> 00:43:37,070 >> And let me go ahead and speed this up and explain what's going on. 1202 00:43:37,070 --> 00:43:40,840 This is a demonstration of bubble sort, the first algorithm we looked at. 1203 00:43:40,840 --> 00:43:43,950 And it's a visualization in that each of these bars represents a number. 1204 00:43:43,950 --> 00:43:45,710 The bigger the bar, the bigger the number. 1205 00:43:45,710 --> 00:43:47,520 The smaller the bar, the smaller the number. 1206 00:43:47,520 --> 00:43:50,353 And what you can see visually, even though this is going super fast, 1207 00:43:50,353 --> 00:43:53,699 is that the red bar is like me, walking back and forth fixing problems. 1208 00:43:53,699 --> 00:43:56,740 You can see that the bigger elements are indeed bubbling up to the right, 1209 00:43:56,740 --> 00:43:59,650 and the smaller elements are bubbling up to the left. 1210 00:43:59,650 --> 00:44:01,870 And down here, if we actually look more closely, 1211 00:44:01,870 --> 00:44:04,330 we can actually count the number of comparisons and swaps 1212 00:44:04,330 --> 00:44:05,350 that were being made. 1213 00:44:05,350 --> 00:44:07,360 >> But instead, let's look at the second algorithm 1214 00:44:07,360 --> 00:44:11,240 we looked at earlier with our volunteers, selection sort. 1215 00:44:11,240 --> 00:44:13,500 Visually, it has a very different effect. 1216 00:44:13,500 --> 00:44:16,820 But it's, again, very intuitive, in that we keep selecting the next smallest 1217 00:44:16,820 --> 00:44:18,660 element, and we got a little lucky. 1218 00:44:18,660 --> 00:44:20,110 That felt fundamentally faster. 1219 00:44:20,110 --> 00:44:22,840 But if we ran this again and again and again with lots of inputs, 1220 00:44:22,840 --> 00:44:26,680 we would see that it's indeed still in big O of n squared. 1221 00:44:26,680 --> 00:44:29,920 >> Let's do one last one here, insertion sort, 1222 00:44:29,920 --> 00:44:33,180 which was the third algorithm we looked at, and recall 1223 00:44:33,180 --> 00:44:36,700 that this one deals with the elements as it encounters them, 1224 00:44:36,700 --> 00:44:39,290 but then it maybe shifts things over to make room, 1225 00:44:39,290 --> 00:44:41,660 inserting elements where they belong. 1226 00:44:41,660 --> 00:44:45,330 >> And this too ends up giving the final result. Now all three of those 1227 00:44:45,330 --> 00:44:46,490 felt pretty fast. 1228 00:44:46,490 --> 00:44:48,740 And indeed, I ran them at a pretty good clip. 1229 00:44:48,740 --> 00:44:52,510 But fundamentally, they're all pretty horrible, to be honest. 1230 00:44:52,510 --> 00:44:56,960 All of these algorithms thus far that run in big O of n squared 1231 00:44:56,960 --> 00:44:59,270 take quite a bit of time to run in the end. 1232 00:44:59,270 --> 00:45:01,920 >> And indeed, we can see and feel this lastly 1233 00:45:01,920 --> 00:45:04,090 if I pull up this third and final demo. 1234 00:45:04,090 --> 00:45:05,840 This is another visualization that's going 1235 00:45:05,840 --> 00:45:08,500 to show bubble sort on the left, selection sort in the middle, 1236 00:45:08,500 --> 00:45:13,410 and something, as one of our hand raises earlier suggested, 1237 00:45:13,410 --> 00:45:15,020 merge sort on the right. 1238 00:45:15,020 --> 00:45:16,937 A divide and conquer strategy on the right. 1239 00:45:16,937 --> 00:45:19,520 And that's, in fact, what we're going to look at on Wednesday. 1240 00:45:19,520 --> 00:45:21,990 But let's time these to run in parallel. 1241 00:45:21,990 --> 00:45:26,765 It's roughly the same number of elements, all running at the same time. 1242 00:45:26,765 --> 00:45:30,940 1243 00:45:30,940 --> 00:45:34,440 Bubble sort vs selection sort vs merge sort. 1244 00:45:34,440 --> 00:45:36,760 >> Now, they're all running in theory at the same time. 1245 00:45:36,760 --> 00:45:39,830 The CPU is running at the same speed, but you 1246 00:45:39,830 --> 00:45:44,014 can feel how boring this is very quickly going to become, 1247 00:45:44,014 --> 00:45:45,930 and just how fast when we inject a bit of week 1248 00:45:45,930 --> 00:45:49,330 zero's algorithms can we speed things up. 1249 00:45:49,330 --> 00:45:51,760 >> And now let's compare these in one last form. 1250 00:45:51,760 --> 00:45:55,710 I'm going to go ahead to CS50's website, where 1251 00:45:55,710 --> 00:45:59,020 we have this final link for today, where someone on the internet 1252 00:45:59,020 --> 00:46:03,960 kindly put together a video that captures what different sorting 1253 00:46:03,960 --> 00:46:07,510 algorithms sound like. 1254 00:46:07,510 --> 00:46:09,577 This is insertion sort. 1255 00:46:09,577 --> 00:46:12,072 >> [BEEPING] 1256 00:46:12,072 --> 00:46:13,070 1257 00:46:13,070 --> 00:46:16,850 >> Whereby you're applying a frequency based on the height of the bar bar. 1258 00:46:16,850 --> 00:46:19,826 This is bubble sort. 1259 00:46:19,826 --> 00:46:21,822 >> [WARPED BEEPING] 1260 00:46:21,822 --> 00:46:33,299 1261 00:46:33,299 --> 00:46:45,774 >> Coming up next is-- coming up next is-- selection sort, 1262 00:46:45,774 --> 00:46:48,820 where again, we're selecting the next smallest element, 1263 00:46:48,820 --> 00:46:51,820 and we can see it growing from left to right. 1264 00:46:51,820 --> 00:47:01,120 1265 00:47:01,120 --> 00:47:04,000 >> Merge sort, our winner thus far today. 1266 00:47:04,000 --> 00:47:09,659 1267 00:47:09,659 --> 00:47:12,450 Notice how it's dividing things into [INAUDIBLE] half and quarters. 1268 00:47:12,450 --> 00:47:17,510 1269 00:47:17,510 --> 00:47:21,660 Gnome sort, which we have not talked about, and creates visually 1270 00:47:21,660 --> 00:47:24,450 and audally a bit of a different shape and sound. 1271 00:47:24,450 --> 00:47:27,060 1272 00:47:27,060 --> 00:47:30,160 Going back and forth, cleaning things up. 1273 00:47:30,160 --> 00:47:32,230 Also check out heapsort on this guy's website. 1274 00:47:32,230 --> 00:47:36,100 1275 00:47:36,100 --> 00:47:36,810 >> And that's it. 1276 00:47:36,810 --> 00:47:38,210 We will see you next time. 1277 00:47:38,210 --> 00:47:42,647 1278 00:47:42,647 --> 00:47:48,334 >> [WHOOSHING AND MUSIC] 1279 00:47:48,334 --> 00:50:24,839