1 00:00:00,000 --> 00:00:03,437 [MUSIC PLAYING] 2 00:00:03,437 --> 00:00:10,330 3 00:00:10,330 --> 00:00:13,530 DAVID J. MALAN: So odds are you use a computer most every day, 4 00:00:13,530 --> 00:00:16,750 but you might not necessarily think of yourself as a computer person. 5 00:00:16,750 --> 00:00:20,090 Something goes wrong, you don't necessarily know how to fix it. 6 00:00:20,090 --> 00:00:23,180 And if you actually want to solve some problem using technology, 7 00:00:23,180 --> 00:00:25,420 the whole world of technology, and computing, 8 00:00:25,420 --> 00:00:28,930 and algorithms and all of that might seem all quite foreign. 9 00:00:28,930 --> 00:00:31,180 So much so that you can't necessarily feel like you're 10 00:00:31,180 --> 00:00:33,140 making an informed decision. 11 00:00:33,140 --> 00:00:35,890 Well, that's just because the technology that we all use around us 12 00:00:35,890 --> 00:00:37,867 is kind of working at this level. 13 00:00:37,867 --> 00:00:39,700 Whereby there's so many people who have come 14 00:00:39,700 --> 00:00:43,810 before us that have invented this level, and this level, and this level, 15 00:00:43,810 --> 00:00:44,510 and this level. 16 00:00:44,510 --> 00:00:48,730 And so what we'll do here is try to start from that ground level up. 17 00:00:48,730 --> 00:00:52,360 And indeed, at the base of all computing, do we really have, 18 00:00:52,360 --> 00:00:54,670 as we'll see, just zeros and ones? 19 00:00:54,670 --> 00:00:57,930 And that's probably something you know, at least, generally speaking. 20 00:00:57,930 --> 00:01:00,070 But it turns out once you go have zeros and ones, 21 00:01:00,070 --> 00:01:01,662 can you very quickly go to text? 22 00:01:01,662 --> 00:01:02,620 Can you go to graphics? 23 00:01:02,620 --> 00:01:03,490 Can you go to videos? 24 00:01:03,490 --> 00:01:04,615 Can you go to spreadsheets? 25 00:01:04,615 --> 00:01:06,370 Can you go to so much more? 26 00:01:06,370 --> 00:01:08,500 But again, that's the world we now inhabit. 27 00:01:08,500 --> 00:01:10,760 But let's now build ourselves up to that point, 28 00:01:10,760 --> 00:01:12,926 so that you actually begin to look around yourselves 29 00:01:12,926 --> 00:01:17,020 and realize, oh, I understand now, what's going on. 30 00:01:17,020 --> 00:01:21,430 And to do that, let's consider this, computational thinking, which really 31 00:01:21,430 --> 00:01:23,650 refers to thinking computationally. 32 00:01:23,650 --> 00:01:27,160 Thinking more methodically, thinking more carefully, 33 00:01:27,160 --> 00:01:32,590 and somehow framing all problems in the world as a sequence of steps. 34 00:01:32,590 --> 00:01:38,474 And those steps are quite simply, input comes in, and output goes out. 35 00:01:38,474 --> 00:01:40,390 And what's in between those inputs and outputs 36 00:01:40,390 --> 00:01:42,765 we'll eventually describe as something called algorithms, 37 00:01:42,765 --> 00:01:45,620 but for now, let's just consider it to be this black box. 38 00:01:45,620 --> 00:01:48,250 We don't know, we don't care what's going on inside there. 39 00:01:48,250 --> 00:01:52,810 All that we care is that we get back a correct output or a solution 40 00:01:52,810 --> 00:01:54,400 to a problem. 41 00:01:54,400 --> 00:01:58,480 But how do we go about representing those inputs and outputs? 42 00:01:58,480 --> 00:02:01,471 If our problem at hand is to analyze a whole bunch of financial data, 43 00:02:01,471 --> 00:02:03,220 a whole bunch of numbers in a spreadsheet, 44 00:02:03,220 --> 00:02:05,053 how can we actually sum all of those numbers 45 00:02:05,053 --> 00:02:08,380 together, or perform some other arithmetic calculation on them? 46 00:02:08,380 --> 00:02:12,020 If we instead have a really big document, a contract of some sort. 47 00:02:12,020 --> 00:02:14,460 And we want to make sure that it is properly spellchecked? 48 00:02:14,460 --> 00:02:16,994 Well, the input there isn't numbers, but is 49 00:02:16,994 --> 00:02:18,910 all of the words and letters in that document. 50 00:02:18,910 --> 00:02:21,610 And the output, we hope, is a document without all 51 00:02:21,610 --> 00:02:23,410 of those little red squigglies. 52 00:02:23,410 --> 00:02:26,080 We want a correctly spelled document. 53 00:02:26,080 --> 00:02:30,040 So those are just some of the problems that you might experience most any day, 54 00:02:30,040 --> 00:02:33,160 but underneath the hood, there's quite a bit of complexity going on. 55 00:02:33,160 --> 00:02:35,260 But that complexity, it turns out, is just 56 00:02:35,260 --> 00:02:40,064 the result of layering, pretty simple ideas, one on top of another. 57 00:02:40,064 --> 00:02:42,230 But in order to get to that point in the discussion, 58 00:02:42,230 --> 00:02:45,640 we need to somehow represent these inputs, these numbers, these letters, 59 00:02:45,640 --> 00:02:47,800 whatever it is that we have at hand. 60 00:02:47,800 --> 00:02:51,820 And to do that, we're going to use binary. 61 00:02:51,820 --> 00:02:55,330 Now odds are you've probably heard that computers indeed only speak 62 00:02:55,330 --> 00:02:57,430 or understand zeros and ones. 63 00:02:57,430 --> 00:03:01,060 But how can that possibly be, when they can do so much today, whether they're 64 00:03:01,060 --> 00:03:05,020 on our desktops, or laptops or even our mobile phones in our pockets, 65 00:03:05,020 --> 00:03:07,420 if all you have at the end of the day is zeros and ones? 66 00:03:07,420 --> 00:03:12,550 How could you possibly count to two, let alone three, or four, or four billion, 67 00:03:12,550 --> 00:03:15,540 let alone representing any number of other types of media 68 00:03:15,540 --> 00:03:17,760 that computers today understand? 69 00:03:17,760 --> 00:03:20,770 Well, odds are you recognize in this word here, 70 00:03:20,770 --> 00:03:24,910 this prefix, bi, bi meaning two, and indeed that hints at the fact 71 00:03:24,910 --> 00:03:28,090 that you only have two letters in your alphabet, so to speak. 72 00:03:28,090 --> 00:03:30,490 Two digits, zero and one. 73 00:03:30,490 --> 00:03:34,960 Now we humans have typically grown up using the decimal system, 74 00:03:34,960 --> 00:03:36,640 dec meaning 10. 75 00:03:36,640 --> 00:03:41,800 And of course we have 0,1, 2,3, 4,5, 6, 7, 8, 9, 10 possible digits 76 00:03:41,800 --> 00:03:45,730 that we can permute and arrange to count even higher than that. 77 00:03:45,730 --> 00:03:49,570 But with just zeros and ones, how do you get to two, how do you get to three? 78 00:03:49,570 --> 00:03:55,090 Well, you're thinking still in base 10, so to speak, base 10 being decimal. 79 00:03:55,090 --> 00:03:59,200 But we want to think now in base 2, so to speak, binary, 80 00:03:59,200 --> 00:04:00,970 where we have just two of these inputs. 81 00:04:00,970 --> 00:04:03,290 And let me propose that if you're like me, 82 00:04:03,290 --> 00:04:07,150 you probably grew up understanding numbers as really just this pattern 83 00:04:07,150 --> 00:04:11,430 of symbols like this, where each symbol was in a column of sorts, 84 00:04:11,430 --> 00:04:13,160 a placeholder, if you will. 85 00:04:13,160 --> 00:04:17,170 And this, if you recall, was generally called the ones place, and then 86 00:04:17,170 --> 00:04:19,750 the tens place, and then with the hundreds place. 87 00:04:19,750 --> 00:04:22,270 And so at the moment, what we really just have on the screen 88 00:04:22,270 --> 00:04:26,230 is a pattern of symbols, a pattern of digits, one, two, three. 89 00:04:26,230 --> 00:04:29,920 But why is this pattern of symbols, one, two, three, 90 00:04:29,920 --> 00:04:34,180 the number you and I think of immediately as 123? 91 00:04:34,180 --> 00:04:38,500 Well, that's because by way of these columns or places do we implicitly, 92 00:04:38,500 --> 00:04:40,870 in our own mind, quickly do a bit of arithmetic? 93 00:04:40,870 --> 00:04:45,790 Of course, if we have one, two, three, that's really like 100 times 1 plus 94 00:04:45,790 --> 00:04:53,840 10 times 2 plus 1 times 3, and of course that gives us 123. 95 00:04:53,840 --> 00:04:58,240 So it turns out that the binary system is actually fundamentally the same. 96 00:04:58,240 --> 00:05:01,550 So if you get decimal, if you've been counting like that since grade school, 97 00:05:01,550 --> 00:05:03,880 you are good to go now with binary, because in fact 98 00:05:03,880 --> 00:05:07,259 and arguably, binary is even simpler, because you don't even 99 00:05:07,259 --> 00:05:08,800 have to keep track of so many digits. 100 00:05:08,800 --> 00:05:11,740 In fact, if we consider this pattern of symbols, 101 00:05:11,740 --> 00:05:14,200 this of course, in our human world, would generally 102 00:05:14,200 --> 00:05:17,467 be immediately recognized by us humans as the number zero. 103 00:05:17,467 --> 00:05:19,300 Of course, it's a little silly to have those 104 00:05:19,300 --> 00:05:22,210 left most zeros, those leading zeros, so to speak, 105 00:05:22,210 --> 00:05:24,640 because they don't really add much mathematically. 106 00:05:24,640 --> 00:05:25,270 But that's OK. 107 00:05:25,270 --> 00:05:28,630 You can put as many zeros to the left of a number in our real world 108 00:05:28,630 --> 00:05:31,120 and it doesn't really affect the value. 109 00:05:31,120 --> 00:05:38,530 Now instead of using one and 10 and 100, let me just use one, two, and four. 110 00:05:38,530 --> 00:05:39,580 Now why those values? 111 00:05:39,580 --> 00:05:43,910 Well before, 1, 10, 100, and again a thousand, 10,000, 100,000, and so 112 00:05:43,910 --> 00:05:44,410 forth. 113 00:05:44,410 --> 00:05:49,450 Those are powers of 10, 10 to the zero is 1, 10 to the one is 10, 114 00:05:49,450 --> 00:05:52,760 10 to the two is 100, and so forth. 115 00:05:52,760 --> 00:05:55,900 So what pattern do you perhaps see here? 116 00:05:55,900 --> 00:06:00,520 These aren't powers of 10 anymore, 1, 2, 4, an 8, 16, 32. 117 00:06:00,520 --> 00:06:02,470 Now you really hear the pattern? 118 00:06:02,470 --> 00:06:03,970 These are instead powers of two. 119 00:06:03,970 --> 00:06:08,740 Two to the zero is one, two to the one is two, two to the two 120 00:06:08,740 --> 00:06:10,730 is four, and so forth. 121 00:06:10,730 --> 00:06:11,950 So that's the only change. 122 00:06:11,950 --> 00:06:16,270 By using base 10, just two digits, zero and one, 123 00:06:16,270 --> 00:06:19,040 can we still count using the same arithmetic system. 124 00:06:19,040 --> 00:06:21,850 So for instance, this of course is still the number zero, 125 00:06:21,850 --> 00:06:27,670 because we have zero fours, and zero twos, and zero ones. 126 00:06:27,670 --> 00:06:30,850 But if we instead change this pattern to 0 0 1, 127 00:06:30,850 --> 00:06:32,620 well what value does this now represent? 128 00:06:32,620 --> 00:06:35,410 In the world of decimal, this would represent the number one. 129 00:06:35,410 --> 00:06:40,660 And that's true in the world of binary, because we have 4 times 0 plus 2 times 130 00:06:40,660 --> 00:06:44,800 0 plus 1 times 1, which of course is just 1. 131 00:06:44,800 --> 00:06:46,360 So how do we get to two? 132 00:06:46,360 --> 00:06:51,040 You might be inclined to just change this zero to a one, 133 00:06:51,040 --> 00:06:52,510 but that would be incorrect. 134 00:06:52,510 --> 00:06:54,280 That's like carrying the one, but then not 135 00:06:54,280 --> 00:06:56,920 wrapping around on the rightmost digit. 136 00:06:56,920 --> 00:06:59,769 Rather, if we want to represent the number two, 137 00:06:59,769 --> 00:07:02,060 we need to come up with a pattern that represents that. 138 00:07:02,060 --> 00:07:06,850 So let me propose instead that we don't just change that zero to a one, 139 00:07:06,850 --> 00:07:10,030 but we also change that one to zero. 140 00:07:10,030 --> 00:07:12,190 Because now we have 4 times 0 plus 2 times 141 00:07:12,190 --> 00:07:16,910 1 plus times 0 which of course is the number we know as 2. 142 00:07:16,910 --> 00:07:19,240 And now you can perhaps grok the pattern at hand. 143 00:07:19,240 --> 00:07:21,790 If we want to count up to the number we know as three, 144 00:07:21,790 --> 00:07:25,330 let's just change that rightmost one to a one. 145 00:07:25,330 --> 00:07:29,290 So that's four times 0 plus 2 times 1 plus 1 times 1. 146 00:07:29,290 --> 00:07:30,480 That of course is three. 147 00:07:30,480 --> 00:07:32,140 Four is perhaps jumping out at you now. 148 00:07:32,140 --> 00:07:36,640 We just change that zero to a one, and then the other two digits to zeros. 149 00:07:36,640 --> 00:07:42,730 And now if we want to count up to five, or to six, or to seven, or to-- 150 00:07:42,730 --> 00:07:45,070 dang it, what's supposed to come next? 151 00:07:45,070 --> 00:07:48,830 It would seem that in the binary system, we can only count as high as-- 152 00:07:48,830 --> 00:07:49,622 what, seven? 153 00:07:49,622 --> 00:07:52,990 And that's a little strange, because of course, we started at zero, 154 00:07:52,990 --> 00:07:56,440 we counted here up to seven, but surely computers can count higher than that. 155 00:07:56,440 --> 00:07:58,930 And that's fine, just need another digit. 156 00:07:58,930 --> 00:08:04,600 Much like if we wanted to count from 999, in our human decimal world, 157 00:08:04,600 --> 00:08:07,480 up to a thousand, which has four digits. 158 00:08:07,480 --> 00:08:09,730 Similarly here, do we need another digit? 159 00:08:09,730 --> 00:08:12,790 We need an eights place, and so we could represent 160 00:08:12,790 --> 00:08:16,636 the number we know as eight with one, zero, zero, zero. 161 00:08:16,636 --> 00:08:20,620 But the key here is that we needed another digit. 162 00:08:20,620 --> 00:08:23,740 To put it more mechanically, we needed more hardware. 163 00:08:23,740 --> 00:08:27,430 We needed another physical place, at least in this depiction, 164 00:08:27,430 --> 00:08:32,360 to actually store that one. 165 00:08:32,360 --> 00:08:37,990 And now we begin already to hint at a connection between this digital world 166 00:08:37,990 --> 00:08:42,309 of zeros and ones, and the hardware world in which it's typically 167 00:08:42,309 --> 00:08:43,390 incarnated. 168 00:08:43,390 --> 00:08:48,250 And by that, I mean, we need to decide, ultimately, 169 00:08:48,250 --> 00:08:51,850 how and where to store patterns like this. 170 00:08:51,850 --> 00:08:55,600 And you know the nice thing about binary only having two values, 171 00:08:55,600 --> 00:08:58,450 is that that effectively, means you have just two states. 172 00:08:58,450 --> 00:09:01,450 And you might think of these two states as something being on, 173 00:09:01,450 --> 00:09:02,710 or something being off. 174 00:09:02,710 --> 00:09:06,010 And maybe we might think of something being on as a one, and something being 175 00:09:06,010 --> 00:09:07,570 off as a zero. 176 00:09:07,570 --> 00:09:11,920 And so just with two states, can we have these two extremes. 177 00:09:11,920 --> 00:09:15,850 And maybe it's not on or off, maybe it's true, or false, or black, or white, 178 00:09:15,850 --> 00:09:16,750 or red, or blue. 179 00:09:16,750 --> 00:09:19,390 It doesn't even matter what we call the two states, 180 00:09:19,390 --> 00:09:21,745 the key is that we have two of them. 181 00:09:21,745 --> 00:09:25,510 And you know In the physical world, the simplest thing 182 00:09:25,510 --> 00:09:31,930 I can think of to turn on and off, is perhaps something like this. 183 00:09:31,930 --> 00:09:32,920 Just a light bulb. 184 00:09:32,920 --> 00:09:35,080 This here's a little desk lamp, and in fact, 185 00:09:35,080 --> 00:09:38,740 if I go ahead and clip this on here, and let 186 00:09:38,740 --> 00:09:41,950 me plug this into an electrical outlet so that I'm 187 00:09:41,950 --> 00:09:46,030 getting some additional input if you will, electricity or electrons, 188 00:09:46,030 --> 00:09:49,450 this lamp right now I claim is representing the number zero. 189 00:09:49,450 --> 00:09:53,430 It's completely off, and I'm just going to agree, to agree with you, that this 190 00:09:53,430 --> 00:09:54,650 shall represent zero. 191 00:09:54,650 --> 00:09:58,400 But you know what, if I want to represent a one in the physical world, 192 00:09:58,400 --> 00:09:59,720 just going to turn it on. 193 00:09:59,720 --> 00:10:04,550 And so now we have something that's on, or true, we'll call that a one. 194 00:10:04,550 --> 00:10:06,170 Zero, one. 195 00:10:06,170 --> 00:10:09,200 Of course, we're not really counting all that high just yet. 196 00:10:09,200 --> 00:10:13,220 If I want to count higher than zero to one, to something higher, 197 00:10:13,220 --> 00:10:15,680 I'm going to need another light bulb. 198 00:10:15,680 --> 00:10:18,567 I'm going to need more hardware, more memory, if you will, 199 00:10:18,567 --> 00:10:19,650 in the world of computers. 200 00:10:19,650 --> 00:10:25,400 So let's actually now give myself a second zero or one. 201 00:10:25,400 --> 00:10:28,890 Plug this thing in here. 202 00:10:28,890 --> 00:10:31,940 Give it some electricity as well, which again, is really 203 00:10:31,940 --> 00:10:34,500 one of the few physical resources we have in a computer, 204 00:10:34,500 --> 00:10:37,040 whether it's coming from a power cord or a battery. 205 00:10:37,040 --> 00:10:40,460 And so now, if I want to represent the number zero, here we are. 206 00:10:40,460 --> 00:10:42,500 But I'm doing it really with zero zero. 207 00:10:42,500 --> 00:10:48,950 This now is going to be one, this now is going to be two, and this, of course, 208 00:10:48,950 --> 00:10:50,180 is going to be three. 209 00:10:50,180 --> 00:10:51,830 And let's not stop there. 210 00:10:51,830 --> 00:10:54,920 Why have two desk lamps when you can have three desk lamps. 211 00:10:54,920 --> 00:10:57,680 If we instead make a little more room for here, and we 212 00:10:57,680 --> 00:11:01,010 want to instead now count not just as high as three. 213 00:11:01,010 --> 00:11:06,310 Let me go ahead and plug-in this third and final desk lamp, 214 00:11:06,310 --> 00:11:08,230 and go ahead and turn this on. 215 00:11:08,230 --> 00:11:12,340 And of course, we instantly go from three to-- 216 00:11:12,340 --> 00:11:14,480 not quite instantly because you have the hands-- 217 00:11:14,480 --> 00:11:16,410 to the number four. 218 00:11:16,410 --> 00:11:17,770 And so forth. 219 00:11:17,770 --> 00:11:20,850 So what's now the connection between the digital 220 00:11:20,850 --> 00:11:23,430 and the sort of analog world of light bulbs, 221 00:11:23,430 --> 00:11:25,710 and the physical world of computers? 222 00:11:25,710 --> 00:11:29,460 Well inside of a computer is a whole bunch of tiny, tiny little bulbs, 223 00:11:29,460 --> 00:11:29,961 so to speak. 224 00:11:29,961 --> 00:11:33,001 In fact, you can think of what's inside of your computer is exactly this. 225 00:11:33,001 --> 00:11:35,520 So many little light bulbs that are either on or off. 226 00:11:35,520 --> 00:11:41,100 And thanks to those little light bulbs, can your computer represent numbers? 227 00:11:41,100 --> 00:11:44,250 Zero, one, two, three, four, as many-- 228 00:11:44,250 --> 00:11:47,850 as high of a number as it wants so long as it has enough little lights. 229 00:11:47,850 --> 00:11:49,590 Now of course I'm oversimplifying. 230 00:11:49,590 --> 00:11:53,280 It's not actually lights that you have inside of your computer, rather, 231 00:11:53,280 --> 00:11:55,250 you have just the switches. 232 00:11:55,250 --> 00:11:58,034 These switches being called in the computing world transistors, 233 00:11:58,034 --> 00:11:58,950 as you may have heard. 234 00:11:58,950 --> 00:12:02,190 Indeed, one of the things that companies like Intel are increasingly doing, 235 00:12:02,190 --> 00:12:06,420 is packing more and more transistors into their CPUs, central processing 236 00:12:06,420 --> 00:12:06,990 units. 237 00:12:06,990 --> 00:12:08,250 The brains of computers. 238 00:12:08,250 --> 00:12:12,270 And with those additional transistors, can they store even more values, 239 00:12:12,270 --> 00:12:15,330 or equivalently can they count even higher? 240 00:12:15,330 --> 00:12:18,690 And so now, even though we began the discussion down here 241 00:12:18,690 --> 00:12:21,400 with just zeros and ones in the abstract, 242 00:12:21,400 --> 00:12:23,580 now we've made it a little more physical, in so far 243 00:12:23,580 --> 00:12:26,820 as we can represent those zeros and ones with something physical, 244 00:12:26,820 --> 00:12:29,400 drawing electricity to power these kinds of light bulbs. 245 00:12:29,400 --> 00:12:31,680 Of course, we can now miniaturize that and consider 246 00:12:31,680 --> 00:12:35,160 these light bulbs to really be, what we'll call in the computer world, 247 00:12:35,160 --> 00:12:37,980 transistors. 248 00:12:37,980 --> 00:12:40,320 But at the end of the day, even though we 249 00:12:40,320 --> 00:12:43,380 have now a way of representing information, 250 00:12:43,380 --> 00:12:47,160 the zeros and ones, even though we have a way now 251 00:12:47,160 --> 00:12:51,300 of representing even bigger numbers by using even more transistors inside 252 00:12:51,300 --> 00:12:55,530 of our computer, we're still talking only about numbers. 253 00:12:55,530 --> 00:12:58,170 And with my computer I want to do more than just use something 254 00:12:58,170 --> 00:12:59,670 like Microsoft Excel. 255 00:12:59,670 --> 00:13:04,170 I want to do more than just do math on my data. 256 00:13:04,170 --> 00:13:07,710 I want to actually store things like letters, and words. 257 00:13:07,710 --> 00:13:15,040 Let alone, colors, and images, and videos, and more. 258 00:13:15,040 --> 00:13:17,410 So we need to abstract higher. 259 00:13:17,410 --> 00:13:20,890 So again zeros and ones, we know how to represent it, now let's 260 00:13:20,890 --> 00:13:24,610 build on top of the zeros and ones and start to represent 261 00:13:24,610 --> 00:13:26,470 something more interesting. 262 00:13:26,470 --> 00:13:31,520 Something more alphabetical, if you will. 263 00:13:31,520 --> 00:13:34,630 And so this gives us ASCII, the American Standard 264 00:13:34,630 --> 00:13:37,570 Code for Information Interchange, or more modernly 265 00:13:37,570 --> 00:13:40,780 referred to as a superset of Unicode. 266 00:13:40,780 --> 00:13:43,420 Turns out, human some time ago, decided on a mapping 267 00:13:43,420 --> 00:13:45,070 between letters and numbers. 268 00:13:45,070 --> 00:13:47,560 Somewhat arbitrarily, but in a way that's 269 00:13:47,560 --> 00:13:51,290 convenient when it comes to actually programming, things like this. 270 00:13:51,290 --> 00:13:53,290 And this is to say that humans time ago decided, 271 00:13:53,290 --> 00:13:58,140 you know what, we are going to represent the letter A using the decimal number 272 00:13:58,140 --> 00:13:58,900 65. 273 00:13:58,900 --> 00:13:59,990 Now what does that mean? 274 00:13:59,990 --> 00:14:03,670 Well, this means that if your computer is to store the letter A, or the letter 275 00:14:03,670 --> 00:14:06,550 B, or the letter C, it simply has to store ultimately 276 00:14:06,550 --> 00:14:09,680 the number 65, or 66, or 67. 277 00:14:09,680 --> 00:14:13,030 What does it mean to store number like 65, 66, or 67? 278 00:14:13,030 --> 00:14:17,040 Well that just means to store some pattern of bits, some pattern of bulbs 279 00:14:17,040 --> 00:14:21,910 as we did a moment ago, and turn them on and off in such a way 280 00:14:21,910 --> 00:14:27,521 that you're representing in binary, the decimal number 65, 66, 67. 281 00:14:27,521 --> 00:14:30,520 So if you think about something like Microsoft Word or Google documents, 282 00:14:30,520 --> 00:14:33,580 or any program in which you type text, what you're really 283 00:14:33,580 --> 00:14:37,480 doing by typing on the keyboard, is sending some pattern of signals 284 00:14:37,480 --> 00:14:42,190 that's telling the computer to store not just the letter ABC per se, 285 00:14:42,190 --> 00:14:45,700 but really to store the number 65 or 66 or 67, 286 00:14:45,700 --> 00:14:50,260 or really, to store the pattern of zeros and ones 287 00:14:50,260 --> 00:14:52,680 that ultimately represent those same values. 288 00:14:52,680 --> 00:14:56,020 So again, this spirit of layering binary now 289 00:14:56,020 --> 00:14:59,380 becomes ASCII, or Unicode, something higher still. 290 00:14:59,380 --> 00:15:01,630 And so with this can we represent messages? 291 00:15:01,630 --> 00:15:07,950 For instance, if I wanted to represent a message like this, a familiar message, 292 00:15:07,950 --> 00:15:10,780 there say, I might store these three values. 293 00:15:10,780 --> 00:15:15,640 72, 73, and 33. 294 00:15:15,640 --> 00:15:16,540 Now what is that? 295 00:15:16,540 --> 00:15:20,305 Well turns out if I look back at this pattern here, where 65 is A, 296 00:15:20,305 --> 00:15:26,910 and in tens 72 is H, and 73 is I, well what are we really representing here? 297 00:15:26,910 --> 00:15:29,301 It would seem we're representing the pattern H I, 298 00:15:29,301 --> 00:15:31,300 and then you wouldn't know this from that chart, 299 00:15:31,300 --> 00:15:33,091 but if you look at another online reference 300 00:15:33,091 --> 00:15:35,390 you'll see a 33 is an exclamation point. 301 00:15:35,390 --> 00:15:38,980 So now we have the word Hi or the exclamation Hi. 302 00:15:38,980 --> 00:15:42,700 But that's in the context of a text editor, or word processor like word. 303 00:15:42,700 --> 00:15:45,460 Suppose, instead, you were using not a text 304 00:15:45,460 --> 00:15:51,220 editor, but something like Photoshop, or MS Paint, 305 00:15:51,220 --> 00:15:53,290 or some other graphical program. 306 00:15:53,290 --> 00:15:57,520 Well instead, you might have that same pattern of numbers, 307 00:15:57,520 --> 00:16:01,300 or really bits at the end of the day, but in this context, 308 00:16:01,300 --> 00:16:04,540 something like Photoshop, these numbers are 309 00:16:04,540 --> 00:16:08,980 meant to be values between 0 and 255. 310 00:16:08,980 --> 00:16:12,937 Turns out, long story short, that if you have eight zeros or ones, 311 00:16:12,937 --> 00:16:14,770 where zero or one, you know what, let's just 312 00:16:14,770 --> 00:16:19,030 start calling them by their formal name bits for binary digits. 313 00:16:19,030 --> 00:16:24,190 If you have eight bits, you can count from 0 all the way up to 255. 314 00:16:24,190 --> 00:16:27,160 And the quick math there is 2 to the 8, means 315 00:16:27,160 --> 00:16:30,220 you have 256 possible permutations of zeros and ones. 316 00:16:30,220 --> 00:16:33,880 So therefore, if you start counting at zero you can count as high up as 255. 317 00:16:33,880 --> 00:16:40,000 And so in the context of a graphics program, you can think of 72 in so far 318 00:16:40,000 --> 00:16:46,210 as it's not quite halfway between zero and 255, it's a medium amount of red 319 00:16:46,210 --> 00:16:47,950 and a medium amount of green. 320 00:16:47,950 --> 00:16:49,810 Not too much blue. 321 00:16:49,810 --> 00:16:58,090 Where zero means none of this color, and 255 means a lot of this color. 322 00:16:58,090 --> 00:17:03,490 So if you had a pattern of bits, and in-turn numbers, stored inside 323 00:17:03,490 --> 00:17:06,525 of your computer for the purposes of a graphics program, 324 00:17:06,525 --> 00:17:09,400 it's really like telling the computer give me a medium amount of red, 325 00:17:09,400 --> 00:17:12,910 give me a minimum amount of green and just a little bit of blue, 326 00:17:12,910 --> 00:17:19,390 and combine those like paint, or like frequencies of light., 327 00:17:19,390 --> 00:17:21,609 until you get the summation of those, really, 328 00:17:21,609 --> 00:17:23,770 are the combination really of those, which 329 00:17:23,770 --> 00:17:26,500 is this murky shade here of yellow. 330 00:17:26,500 --> 00:17:29,200 So again, using the same patterns of bits, 331 00:17:29,200 --> 00:17:32,080 can we represent either letters and words and paragraphs, 332 00:17:32,080 --> 00:17:35,410 or in another context all together, could that same pattern of bits 333 00:17:35,410 --> 00:17:40,180 still represent numbers, but be interpreted not as letters and words 334 00:17:40,180 --> 00:17:42,190 and paragraphs, but as colors. 335 00:17:42,190 --> 00:17:47,290 If you treat each trio of 8 bits as representing some amount of red, 336 00:17:47,290 --> 00:17:49,240 some amount of green, some amount of blue, 337 00:17:49,240 --> 00:17:55,320 otherwise known as, if you've heard the term, RGB, for red, green, blue. 338 00:17:55,320 --> 00:17:59,100 So this principle of starting simple, and gradually making 339 00:17:59,100 --> 00:18:02,270 things more and more and more complicated, 340 00:18:02,270 --> 00:18:04,354 is really a principle of abstraction. 341 00:18:04,354 --> 00:18:06,270 Because at this point in the story, when we're 342 00:18:06,270 --> 00:18:10,020 talking about words and paragraphs and essays and documents, 343 00:18:10,020 --> 00:18:13,230 or in another context we're talking about images, or maybe even 344 00:18:13,230 --> 00:18:17,430 movies and more, we no longer really need to care about, 345 00:18:17,430 --> 00:18:19,470 or even need to know about, or understand, 346 00:18:19,470 --> 00:18:21,570 what's underneath the hood those zeros and ones, 347 00:18:21,570 --> 00:18:24,960 because we've abstracted away from that lower level. 348 00:18:24,960 --> 00:18:28,410 And this principle of abstraction, layering idea on idea 349 00:18:28,410 --> 00:18:30,780 on idea on idea such that you no longer need 350 00:18:30,780 --> 00:18:33,750 to worry about how the lower level ideas are implemented 351 00:18:33,750 --> 00:18:36,540 is nice, because it allows us humans to focus only 352 00:18:36,540 --> 00:18:38,670 on the problems we really care about, which 353 00:18:38,670 --> 00:18:41,220 in theory are those top most problems. 354 00:18:41,220 --> 00:18:44,640 The ones that are immediately at hand, that we're building solutions to, 355 00:18:44,640 --> 00:18:47,820 on the shoulders of, computer scientists, and engineers, 356 00:18:47,820 --> 00:18:50,580 and just colleagues that have come before us. 357 00:18:50,580 --> 00:18:52,510 And we don't have to get lost in the weeds, 358 00:18:52,510 --> 00:18:54,870 so to speak, of the earlier complexity. 359 00:18:54,870 --> 00:18:57,210 And so this is a powerful problem solving technique, 360 00:18:57,210 --> 00:18:59,910 and indeed it's a principle that we'll see applied ultimately 361 00:18:59,910 --> 00:19:04,070 in the world of programming as well. 362 00:19:04,070 --> 00:19:05,500 Now let's try something. 363 00:19:05,500 --> 00:19:08,140 Speaking of abstractions, let me encourage you 364 00:19:08,140 --> 00:19:11,354 at this point to take out a piece of paper if you have one there. 365 00:19:11,354 --> 00:19:13,270 And surely, if you don't have one right there, 366 00:19:13,270 --> 00:19:17,380 surely you could pause this video and actually go dig one up. 367 00:19:17,380 --> 00:19:18,950 Grab a pencil too, or pen. 368 00:19:18,950 --> 00:19:21,640 369 00:19:21,640 --> 00:19:24,620 Yeah, Ill wait. 370 00:19:24,620 --> 00:19:28,460 Now you could just pause this, I can't wait all that long. 371 00:19:28,460 --> 00:19:32,090 Let's assume at this point in the story you've got that piece of paper, 372 00:19:32,090 --> 00:19:34,760 and a pencil or a pen, and let's play a little game. 373 00:19:34,760 --> 00:19:37,446 I of course can't really see what you're doing. 374 00:19:37,446 --> 00:19:40,070 But I'm going to hope that you either do or don't do what I do, 375 00:19:40,070 --> 00:19:42,200 because either way it will be instructive. 376 00:19:42,200 --> 00:19:44,420 And I'm going to leverage some abstractions here, 377 00:19:44,420 --> 00:19:45,740 for better or for worse. 378 00:19:45,740 --> 00:19:49,130 I want you to go ahead, and please don't be one of those people like me 379 00:19:49,130 --> 00:19:50,890 who's just following along pretending like all right, 380 00:19:50,890 --> 00:19:54,056 I have the piece of paper, I have the pencil, this is what I would be doing. 381 00:19:54,056 --> 00:19:55,170 Actually do this. 382 00:19:55,170 --> 00:19:59,000 This will be kind of fun for one of us in the end, if this works out. 383 00:19:59,000 --> 00:20:03,050 Go ahead now, on that piece of paper, with your pen or pencil, and go ahead 384 00:20:03,050 --> 00:20:05,040 and just draw a circle. 385 00:20:05,040 --> 00:20:07,810 386 00:20:07,810 --> 00:20:08,710 OK. 387 00:20:08,710 --> 00:20:09,210 All right. 388 00:20:09,210 --> 00:20:13,890 And below that circle draw a square. 389 00:20:13,890 --> 00:20:15,100 All set. 390 00:20:15,100 --> 00:20:18,025 All right, and below that square draw a triangle. 391 00:20:18,025 --> 00:20:21,360 392 00:20:21,360 --> 00:20:25,460 Now odds are, you know exactly what I meant when I said draw a circle, 393 00:20:25,460 --> 00:20:27,690 and perhaps you did a little something like this. 394 00:20:27,690 --> 00:20:29,700 Or a little more perfect than my circle there. 395 00:20:29,700 --> 00:20:31,499 And then beneath that I said draw a square. 396 00:20:31,499 --> 00:20:33,290 And you just intuitively know what a square 397 00:20:33,290 --> 00:20:36,920 is, so you might have drawn a square like this. 398 00:20:36,920 --> 00:20:39,710 And then the third thing I said was draw a triangle. 399 00:20:39,710 --> 00:20:44,120 And so you might have drawn a triangle that looks like this. 400 00:20:44,120 --> 00:20:47,490 But immediately, ant that looks curiously like part of a jack-o-lantern 401 00:20:47,490 --> 00:20:47,990 now. 402 00:20:47,990 --> 00:20:50,917 But curiously, there's some ambiguity there. 403 00:20:50,917 --> 00:20:53,750 Right, a circle is kind of a circle, but I didn't specify the radius 404 00:20:53,750 --> 00:20:56,270 or diameter, so maybe yours is bigger, maybe yours a smaller, 405 00:20:56,270 --> 00:20:59,270 maybe it's over here on your paper, maybe it's over there on your paper. 406 00:20:59,270 --> 00:21:02,540 So already, these abstractions are useful in that you immediately 407 00:21:02,540 --> 00:21:06,050 knew what to draw, but you didn't really know how to draw it. 408 00:21:06,050 --> 00:21:09,560 Similarly, this square, I don't know why I drew it a little smaller here, 409 00:21:09,560 --> 00:21:11,960 but it's indeed smaller, and it's barely a square. 410 00:21:11,960 --> 00:21:13,250 But it's meant to be a square. 411 00:21:13,250 --> 00:21:16,540 But there's a gap between it and the circle, but I didn't really specify. 412 00:21:16,540 --> 00:21:18,290 So there, too, the abstraction is valuable 413 00:21:18,290 --> 00:21:22,760 in that you immediately knew how to draw a square, but not where to draw it. 414 00:21:22,760 --> 00:21:24,890 Or in what size to draw it. 415 00:21:24,890 --> 00:21:27,140 And lastly, the triangle, perhaps the most, 416 00:21:27,140 --> 00:21:29,510 the juiciest opportunity for ambiguity, I 417 00:21:29,510 --> 00:21:31,350 didn't tell you how to orient that triangle. 418 00:21:31,350 --> 00:21:34,400 Maybe you, instead, did something a little different, 419 00:21:34,400 --> 00:21:37,130 where your triangle wasn't drawn like that, 420 00:21:37,130 --> 00:21:39,140 with the pointy part at the bottom. 421 00:21:39,140 --> 00:21:42,380 Maybe you, instead, did something like this. 422 00:21:42,380 --> 00:21:46,310 Or maybe it was somewhere else on the paper altogether. 423 00:21:46,310 --> 00:21:50,450 So now at this point in the story, you could have followed my instructions 424 00:21:50,450 --> 00:21:54,230 exactly as I described, but we could still somehow come up 425 00:21:54,230 --> 00:21:55,550 with different solutions. 426 00:21:55,550 --> 00:21:59,660 And in fact, if I can now spoil what the actual task at hand was, 427 00:21:59,660 --> 00:22:02,000 this was the picture I was describing. 428 00:22:02,000 --> 00:22:04,220 This picture here has a circle, below which 429 00:22:04,220 --> 00:22:06,400 is a square, below which is a triangle. 430 00:22:06,400 --> 00:22:08,750 But that leaves out some key details. 431 00:22:08,750 --> 00:22:12,500 And the curious thing here, is that even though abstraction is a useful 432 00:22:12,500 --> 00:22:16,730 mechanism, once you start to move away from those implementation details, 433 00:22:16,730 --> 00:22:21,140 if you will, you very quickly realize that I don't really know what 434 00:22:21,140 --> 00:22:23,390 you're telling me to do, necessarily. 435 00:22:23,390 --> 00:22:28,430 And the challenge is, that computers, as complicated or intimidating as they 436 00:22:28,430 --> 00:22:30,960 might seem to you, they're really not all that bright. 437 00:22:30,960 --> 00:22:31,460 Right. 438 00:22:31,460 --> 00:22:34,640 They can only do what they are explicitly told to do. 439 00:22:34,640 --> 00:22:38,000 And so if you, the human, or you eventually perhaps, 440 00:22:38,000 --> 00:22:42,410 the programmer, don't actually specify absolutely 441 00:22:42,410 --> 00:22:45,420 precisely what you want the computer-- 442 00:22:45,420 --> 00:22:48,620 or the human in this case-- to do, you might not 443 00:22:48,620 --> 00:22:53,630 get the output, or the correctness of a solution, that you're hoping for. 444 00:22:53,630 --> 00:22:57,080 In fact, let's try one other, and let's see 445 00:22:57,080 --> 00:22:58,761 what the other extreme might feel like. 446 00:22:58,761 --> 00:23:01,010 So on that same piece of paper, maybe on the flip side 447 00:23:01,010 --> 00:23:05,569 or another sheet of paper, let's play this game just one more time. 448 00:23:05,569 --> 00:23:07,360 And this one's going to be a little harder. 449 00:23:07,360 --> 00:23:11,170 I'm going to try to tell you exactly what to do. 450 00:23:11,170 --> 00:23:13,940 So lesson learned, that was too abstract. 451 00:23:13,940 --> 00:23:16,070 Let's now drill down on the implementation details. 452 00:23:16,070 --> 00:23:16,570 OK. 453 00:23:16,570 --> 00:23:20,650 So let me think like a computer might, or I think a computer might, 454 00:23:20,650 --> 00:23:24,400 go ahead and put your pen or pencil down on the piece of paper 455 00:23:24,400 --> 00:23:27,320 toward the top middle of the page. 456 00:23:27,320 --> 00:23:37,020 Now from that point, move say Southwest to halfway down the piece of paper. 457 00:23:37,020 --> 00:23:40,270 So really at a 45 degree downward angle. 458 00:23:40,270 --> 00:23:47,350 And then go ahead and move south east, or a different 45 degree angle, 459 00:23:47,350 --> 00:23:49,580 toward the bottom of the page. 460 00:23:49,580 --> 00:23:52,950 So you've kind of sort of made the left half of a triangle, 461 00:23:52,950 --> 00:23:55,690 or not really that, 2/3 of a triangle, two sides of a triangle. 462 00:23:55,690 --> 00:23:57,340 But stay where you are. 463 00:23:57,340 --> 00:24:03,580 At this point in the story your pen is probably below your original dot. 464 00:24:03,580 --> 00:24:07,150 But you've drawn two lines that form an angle. 465 00:24:07,150 --> 00:24:13,370 From where your pen now is, draw a line Northeast, 466 00:24:13,370 --> 00:24:15,950 or in an upward and to the right 45 degree 467 00:24:15,950 --> 00:24:19,490 angle, to the same height as your previous line. 468 00:24:19,490 --> 00:24:25,490 And then double back and go say Northwest, or a different 45 degree 469 00:24:25,490 --> 00:24:28,570 angle, still back to the original point. 470 00:24:28,570 --> 00:24:33,340 And at this point in the story, you have really either nothing at all, 471 00:24:33,340 --> 00:24:35,020 or you have a diamond. 472 00:24:35,020 --> 00:24:37,240 And therein lies a curiosity too. 473 00:24:37,240 --> 00:24:40,492 I could have just said diamond, draw a diamond, or draw a kite. 474 00:24:40,492 --> 00:24:41,950 Which would be an equivalent shape. 475 00:24:41,950 --> 00:24:44,950 But that too lends itself to same ambiguity, so I drill down deeper. 476 00:24:44,950 --> 00:24:48,290 But my God, it's taken us just a minute or more just to get to this point. 477 00:24:48,290 --> 00:24:49,417 And we're not done yet. 478 00:24:49,417 --> 00:24:51,000 We're not drawing a diamond or a kite. 479 00:24:51,000 --> 00:24:58,520 Now you have a diamond or a kite, with a top vertex or point, a bottom one, 480 00:24:58,520 --> 00:24:59,540 a left and a right. 481 00:24:59,540 --> 00:25:04,040 Move your pen to the left one, and draw a vertical line down. 482 00:25:04,040 --> 00:25:09,020 Now go back to the diamond or the kite, and on the right vertex or point, 483 00:25:09,020 --> 00:25:13,350 draw similarly a vertical line down. 484 00:25:13,350 --> 00:25:19,170 And now, in that original diameter kite, at the bottom most vertex or point, 485 00:25:19,170 --> 00:25:21,930 draw a vertical line down. 486 00:25:21,930 --> 00:25:25,740 And now, if you followed along with these very precise machine 487 00:25:25,740 --> 00:25:30,212 instructions, you've got three vertical lines that are just kind of dangling. 488 00:25:30,212 --> 00:25:33,420 I don't even want to mislead you with any hand gestures, three vertical lines 489 00:25:33,420 --> 00:25:37,110 that are dangling, go ahead and draw two lines that 490 00:25:37,110 --> 00:25:42,070 connect the ends of those three lines. 491 00:25:42,070 --> 00:25:42,920 Oh my God. 492 00:25:42,920 --> 00:25:46,870 Like it's so complex to do what we just did, and I would put money on the fact 493 00:25:46,870 --> 00:25:50,410 that you did not draw correctly, what I was trying to get you to draw, 494 00:25:50,410 --> 00:25:52,030 which is just a cube. 495 00:25:52,030 --> 00:25:56,170 Right, a cube is a wonderfully simple abstraction. 496 00:25:56,170 --> 00:25:59,190 A shape with which you and I are probably long familiar, 497 00:25:59,190 --> 00:26:01,876 and it's so easy to say draw a cube. 498 00:26:01,876 --> 00:26:04,750 But as we saw before with the circle and the square and the triangle, 499 00:26:04,750 --> 00:26:06,670 just saying draw a cube is ambiguous. 500 00:26:06,670 --> 00:26:08,170 At what angle should it be oriented? 501 00:26:08,170 --> 00:26:09,220 How big should it be? 502 00:26:09,220 --> 00:26:10,934 Where on the piece of paper should it be? 503 00:26:10,934 --> 00:26:12,850 And so I was trying to be so much more precise 504 00:26:12,850 --> 00:26:15,800 this time by having you put your pen down at the top, 505 00:26:15,800 --> 00:26:19,120 go down to the southwest and draw this line, 506 00:26:19,120 --> 00:26:25,810 then go down to the southernmost point here, then another Northeasternly line, 507 00:26:25,810 --> 00:26:28,300 and then a north westerly line. 508 00:26:28,300 --> 00:26:30,580 And that just got us to the kite. 509 00:26:30,580 --> 00:26:33,700 But the takeaway here, is that when it comes 510 00:26:33,700 --> 00:26:36,640 to making a computer do what you want to do, 511 00:26:36,640 --> 00:26:39,130 you can't just speak these abstractions. 512 00:26:39,130 --> 00:26:43,540 You actually have to implement them, or program them, or code them 513 00:26:43,540 --> 00:26:44,830 at least once. 514 00:26:44,830 --> 00:26:48,490 In fact, some of the earliest graphical programs in the world of computing, 515 00:26:48,490 --> 00:26:50,530 were kind of as low level as this. 516 00:26:50,530 --> 00:26:52,890 There was an old programming language called logo, 517 00:26:52,890 --> 00:26:56,625 in fact, that allowed you to program by moving a cursor, 518 00:26:56,625 --> 00:26:59,500 like a turtle of sorts, up and down and left and right on the screen. 519 00:26:59,500 --> 00:27:01,812 And putting either down or up a marker of sorts, 520 00:27:01,812 --> 00:27:05,020 and that you could draw shapes like this by just moving around on the screen. 521 00:27:05,020 --> 00:27:08,560 But to draw things like this clearly, as in our verbal example here, 522 00:27:08,560 --> 00:27:10,930 you have to be so darn precise. 523 00:27:10,930 --> 00:27:13,090 And it just gets so tedious so quickly. 524 00:27:13,090 --> 00:27:16,270 It certainly would take all of the fun out of using a computer, 525 00:27:16,270 --> 00:27:19,540 or all the fun out of using programming a computer, if you 526 00:27:19,540 --> 00:27:21,340 had to do this every time. 527 00:27:21,340 --> 00:27:24,100 But that's where there's an ingredient here to be leveraged. 528 00:27:24,100 --> 00:27:27,370 One of the things that a computer scientist, and a programmer, 529 00:27:27,370 --> 00:27:30,610 and engineers, more generally, very often do, 530 00:27:30,610 --> 00:27:34,270 is they absolutely implement these kinds of low level details once. 531 00:27:34,270 --> 00:27:38,110 They go through those very methodical, if mundane, steps 532 00:27:38,110 --> 00:27:40,330 of getting something just right. 533 00:27:40,330 --> 00:27:43,300 And then they save the instructions they wrote. 534 00:27:43,300 --> 00:27:46,460 They save the programs they wrote, if you will, 535 00:27:46,460 --> 00:27:48,332 so that they can reuse them later. 536 00:27:48,332 --> 00:27:50,665 And the fancy words for these things will eventually see 537 00:27:50,665 --> 00:27:52,330 are called libraries. 538 00:27:52,330 --> 00:27:53,500 Or functions. 539 00:27:53,500 --> 00:27:56,470 Or other names still. 540 00:27:56,470 --> 00:27:59,980 So once one human in the world has implemented a program, if you will, 541 00:27:59,980 --> 00:28:03,280 with which to draw a cube, similarly can we stand on his or her shoulders 542 00:28:03,280 --> 00:28:05,140 and reuse that same routine. 543 00:28:05,140 --> 00:28:09,090 And hopefully, they were clever enough to allow us to parameterize it. 544 00:28:09,090 --> 00:28:13,210 To customize it, by maybe changing the angle and the size and the depth 545 00:28:13,210 --> 00:28:14,090 and so forth. 546 00:28:14,090 --> 00:28:17,260 So it doesn't just have to be that one cube. 547 00:28:17,260 --> 00:28:21,820 And so here we have a wonderfully powerful problem solving technique. 548 00:28:21,820 --> 00:28:22,730 Abstraction. 549 00:28:22,730 --> 00:28:26,560 Which allows us to say what we mean, and the rest of the humans in the room 550 00:28:26,560 --> 00:28:30,250 just immediately understand-- at least after some instruction-- what 551 00:28:30,250 --> 00:28:31,600 it is we're talking about. 552 00:28:31,600 --> 00:28:35,560 But with computers being these very little literal devices, 553 00:28:35,560 --> 00:28:37,930 we can only talk at those levels of abstraction 554 00:28:37,930 --> 00:28:42,970 once we've actually built up software, implemented solutions to get us 555 00:28:42,970 --> 00:28:44,950 to that point in the conversation. 556 00:28:44,950 --> 00:28:48,670 And this is why, at first glance, using a phone in your pocket 557 00:28:48,670 --> 00:28:52,442 or a computer on your desk might actually seem super, super complicated. 558 00:28:52,442 --> 00:28:53,650 There's so many moving parts. 559 00:28:53,650 --> 00:28:55,270 And absolutely there are. 560 00:28:55,270 --> 00:28:58,420 Windows and Mac OS are literally the result of millions 561 00:28:58,420 --> 00:29:02,550 of lines of programming code these days, having been written over the years. 562 00:29:02,550 --> 00:29:05,530 And so of course it's to be expected that you might be a little daunted 563 00:29:05,530 --> 00:29:08,350 or overwhelmed by the apparent complexity. 564 00:29:08,350 --> 00:29:10,750 But one of the goals here for this lesson, 565 00:29:10,750 --> 00:29:14,740 is to really help you appreciate that beneath all that complexity, 566 00:29:14,740 --> 00:29:16,180 is a simpler idea. 567 00:29:16,180 --> 00:29:17,770 And then an even simpler idea. 568 00:29:17,770 --> 00:29:20,290 And then a very simple idea, and so forth. 569 00:29:20,290 --> 00:29:24,100 And so once you sort of bottom out and understand those first principles, 570 00:29:24,100 --> 00:29:28,330 zeros and ones, binary on top of which might be Ascii or Unicode, on top 571 00:29:28,330 --> 00:29:31,030 of which might be some other encoding still, 572 00:29:31,030 --> 00:29:35,370 can you resume the current conversation and understand 573 00:29:35,370 --> 00:29:38,770 that what might have looked completely complex at first glance, 574 00:29:38,770 --> 00:29:41,440 is really just the result of assembling, if you will, 575 00:29:41,440 --> 00:29:45,970 a whole bunch of pretty simple puzzle pieces. 576 00:29:45,970 --> 00:29:51,460 So at this point in the story, we now have a way of representing information. 577 00:29:51,460 --> 00:29:53,080 But now let's just stipulate. 578 00:29:53,080 --> 00:29:54,836 We know how to represent information. 579 00:29:54,836 --> 00:29:57,586 At the end of the day, Yeah it's binary, And at the end of the day 580 00:29:57,586 --> 00:30:01,720 you can think about it as decimal, and maybe you're using Ascii in Unicode, 581 00:30:01,720 --> 00:30:04,030 or maybe you're using graphics, or whatever 582 00:30:04,030 --> 00:30:05,680 is going on underneath the hood. 583 00:30:05,680 --> 00:30:09,310 All we need to know and care about now is that you can do it. 584 00:30:09,310 --> 00:30:12,550 And we don't really have to think too much more at that level. 585 00:30:12,550 --> 00:30:17,200 Now we can resume our look at the overarching model 586 00:30:17,200 --> 00:30:19,360 at hand, which is problem solving. 587 00:30:19,360 --> 00:30:23,780 We now have a way to represent our inputs and outputs. 588 00:30:23,780 --> 00:30:26,290 What then is inside that black box? 589 00:30:26,290 --> 00:30:28,660 Well the buzzword du jour these days is perhaps 590 00:30:28,660 --> 00:30:32,050 algorithms, where even if you don't necessarily know what it is 591 00:30:32,050 --> 00:30:36,040 or how to make one, or how to use one in the context of a computer program, 592 00:30:36,040 --> 00:30:39,620 algorithms seem to be increasingly the solution to all of our problems. 593 00:30:39,620 --> 00:30:42,370 Well an algorithm isn't all that complex, fancy 594 00:30:42,370 --> 00:30:45,100 though the word might sound, it's really just step 595 00:30:45,100 --> 00:30:48,160 by step instructions for solving a problem. 596 00:30:48,160 --> 00:30:51,340 And those steps can be in English, or they can be in something we'll call 597 00:30:51,340 --> 00:30:54,790 pseudo code, sort of code like but it's not really an actual language, 598 00:30:54,790 --> 00:30:58,600 or can be in Java, or C, or c++, or JavaScript, 599 00:30:58,600 --> 00:31:00,640 or any number of other programming languages. 600 00:31:00,640 --> 00:31:05,560 Algorithms are, again, just sets of instructions for solving a problem. 601 00:31:05,560 --> 00:31:08,610 Well what's one such problem we might want to solve? 602 00:31:08,610 --> 00:31:11,380 Well in the real world we might have-- 603 00:31:11,380 --> 00:31:13,390 the real old world shall we say-- 604 00:31:13,390 --> 00:31:16,960 we might have a problem that once upon a time looked like this. 605 00:31:16,960 --> 00:31:20,970 A phone book with a whole lot of pieces of paper inside of it, 606 00:31:20,970 --> 00:31:24,760 on which were a whole bunch of names and a whole bunch of telephone numbers. 607 00:31:24,760 --> 00:31:28,482 And the phone book was alphabetized from A to Z typically. 608 00:31:28,482 --> 00:31:31,690 And maybe there were some other sections like the yellow pages, or apparently 609 00:31:31,690 --> 00:31:33,550 the red pages, whatever this is here. 610 00:31:33,550 --> 00:31:35,758 But we'll just assume that these are the white pages, 611 00:31:35,758 --> 00:31:38,470 so to speak, with just a whole bunch of humans names and numbers. 612 00:31:38,470 --> 00:31:40,540 So suppose I want to find one such human, 613 00:31:40,540 --> 00:31:43,270 a human whom we'll call Mike Smith. 614 00:31:43,270 --> 00:31:45,970 How do you go about finding Mike Smith in this phone book? 615 00:31:45,970 --> 00:31:51,970 Well I could, somewhat stupidly, but arguably quite correctly, 616 00:31:51,970 --> 00:31:54,130 start at the beginning of the book. 617 00:31:54,130 --> 00:31:56,890 And I see here instructions for calling 911. 618 00:31:56,890 --> 00:32:00,220 If I move on to page two, I now see someone's names and numbers. 619 00:32:00,220 --> 00:32:02,320 But these people's names all start with A. 620 00:32:02,320 --> 00:32:04,660 And so I continue going through the A section. 621 00:32:04,660 --> 00:32:05,740 And the A section. 622 00:32:05,740 --> 00:32:08,380 And then eventually I get to the B section. 623 00:32:08,380 --> 00:32:09,640 And the B section. 624 00:32:09,640 --> 00:32:10,780 And the B section. 625 00:32:10,780 --> 00:32:14,410 And then the Cs, and the Ds and the Es and Fs and so forth. 626 00:32:14,410 --> 00:32:20,050 And eventually, tediously but correctly, I'll get to the Ss in the phone book. 627 00:32:20,050 --> 00:32:24,580 And if I see Mike Smith here, I can now pick up the phone and call Mike Smith. 628 00:32:24,580 --> 00:32:28,330 Now no one out there, if you've been still use this technology, 629 00:32:28,330 --> 00:32:30,550 is going to have looked up Mike Smith in that way. 630 00:32:30,550 --> 00:32:33,010 You're going to fly through this phone book far faster than I, right, 631 00:32:33,010 --> 00:32:35,470 you learned probably in grade school why count by ones 632 00:32:35,470 --> 00:32:36,880 when you can count by twos? 633 00:32:36,880 --> 00:32:41,230 So two, four, six, eight, ten, twelve. 634 00:32:41,230 --> 00:32:43,660 Sounds faster, is faster. 635 00:32:43,660 --> 00:32:47,680 It's going to get me to Mike faster, but is it going to get to me to Mike 636 00:32:47,680 --> 00:32:49,650 correctly? 637 00:32:49,650 --> 00:32:52,810 I'm going twice as fast by doing two pages at a time. 638 00:32:52,810 --> 00:32:55,510 So I'm going to have flip half as many times. 639 00:32:55,510 --> 00:33:01,770 But there's actually a bug, so to speak, a potential mistake here. 640 00:33:01,770 --> 00:33:03,700 What is that? 641 00:33:03,700 --> 00:33:06,720 Well in my cleverness to get to Mike's name 642 00:33:06,720 --> 00:33:10,230 twice as fast, what if I go ever so slightly too 643 00:33:10,230 --> 00:33:16,620 far, just because by bad luck, Mike is sandwiched between two of the pages 644 00:33:16,620 --> 00:33:20,220 that I so cleverly was skimming two at a time? 645 00:33:20,220 --> 00:33:22,490 Of course I'm looking at this side, maybe this side, 646 00:33:22,490 --> 00:33:24,370 but maybe Mike is in the middle. 647 00:33:24,370 --> 00:33:28,770 And so it turns out with that algorithm, that twosie approach, 648 00:33:28,770 --> 00:33:32,070 am I going to have to have a little bit of a check at the end? 649 00:33:32,070 --> 00:33:35,769 Such that if I hit like the T section, and see a name starting with T, 650 00:33:35,769 --> 00:33:37,560 I better wait a minute, let me double back, 651 00:33:37,560 --> 00:33:40,890 I might need to go back at most one page to make sure 652 00:33:40,890 --> 00:33:42,810 that I didn't actually miss Mike Smith. 653 00:33:42,810 --> 00:33:45,030 So at the end of the day, it's not correct 654 00:33:45,030 --> 00:33:46,860 if you naively just do two pages at a time, 655 00:33:46,860 --> 00:33:49,410 but it is correct if you do two pages at a time, 656 00:33:49,410 --> 00:33:53,580 with one final reversal by a page, just to make sure 657 00:33:53,580 --> 00:33:57,360 once you go past SMITH in the phone book, 658 00:33:57,360 --> 00:34:00,210 that you didn't accidentally miss Mike just one page prior. 659 00:34:00,210 --> 00:34:03,080 So it's still super fast, you just need that one little check. 660 00:34:03,080 --> 00:34:08,250 But still, no one out there, is going to look up Mike Smith one page at 661 00:34:08,250 --> 00:34:09,530 a time, two pages at a time. 662 00:34:09,530 --> 00:34:12,170 You're going to open in the middle of the damn phone book, 663 00:34:12,170 --> 00:34:17,070 look down and see, oh, not in the S section yet, I'm in the M section. 664 00:34:17,070 --> 00:34:21,270 And so what you intuitively know how to do since growing up perhaps, 665 00:34:21,270 --> 00:34:24,449 is that you know Mike is not in this half of the phone book. 666 00:34:24,449 --> 00:34:26,940 Mike is clearly in this half of the phone book. 667 00:34:26,940 --> 00:34:29,800 And so at this point you can figuratively-- but in our case 668 00:34:29,800 --> 00:34:32,610 literally-- tear the problem in half. 669 00:34:32,610 --> 00:34:36,739 Throw half of the problem away, and be left with, 670 00:34:36,739 --> 00:34:42,489 we single use phone book, and half at that, but half the size of the problem. 671 00:34:42,489 --> 00:34:46,350 So if we had 1,000 pages originally, now maybe I've got only 500 pages. 672 00:34:46,350 --> 00:34:48,030 And so I can repeat this intuition. 673 00:34:48,030 --> 00:34:49,580 Jump roughly to the middle. 674 00:34:49,580 --> 00:34:52,500 Darn, I'm a little too far now, I'm in the T section. 675 00:34:52,500 --> 00:34:57,420 Again, let's tear half the problem away, throw it away as well, 676 00:34:57,420 --> 00:35:02,880 and now be left with a 250 page problem, which can now be 125 page problem. 677 00:35:02,880 --> 00:35:06,520 Which is getting easier and easier and easier, until we repeat, 678 00:35:06,520 --> 00:35:08,160 repeat, repeat, repeat. 679 00:35:08,160 --> 00:35:12,600 Until theoretically, we're left with just one page in the end. 680 00:35:12,600 --> 00:35:15,750 Maybe Mike's on it, maybe Mike's not, but if he's in the phone book, 681 00:35:15,750 --> 00:35:18,160 he will be on this page. 682 00:35:18,160 --> 00:35:19,950 So how efficient was that. 683 00:35:19,950 --> 00:35:25,150 Well if that phone book had 1,000 pages, in my first algorithm 684 00:35:25,150 --> 00:35:28,900 it might have taken me what, like 700 plus steps to get to Mike? 685 00:35:28,900 --> 00:35:31,590 Or in the worst case 1,000 steps, right, it's alphabetical. 686 00:35:31,590 --> 00:35:33,150 But maybe it's not Smith, maybe it's someone 687 00:35:33,150 --> 00:35:36,316 with the last name that starts with Z. In the worst case in that first phone 688 00:35:36,316 --> 00:35:39,840 book, maybe it would take me 1,000 steps maximally to find Mike Smith. 689 00:35:39,840 --> 00:35:40,860 Pretty slow. 690 00:35:40,860 --> 00:35:44,320 What about that second algorithm where I was going two pages at a time? 691 00:35:44,320 --> 00:35:47,790 Well, with that algorithm, it's going to take me like 500 steps 692 00:35:47,790 --> 00:35:49,650 maximally to find Mike Smith. 693 00:35:49,650 --> 00:35:51,750 That's twice as fast, that's pretty good. 694 00:35:51,750 --> 00:35:56,790 It's not nearly as amazing as the algorithm we settled on. 695 00:35:56,790 --> 00:36:00,510 The intuitive one, arguably, where I divided and conquered. 696 00:36:00,510 --> 00:36:03,030 I have the problem again and again and again. 697 00:36:03,030 --> 00:36:09,150 Because if I start at 1,000, and I go to 500, then 250, then 125 and so forth, 698 00:36:09,150 --> 00:36:14,730 rounding as needed, I actually get to one page much faster. 699 00:36:14,730 --> 00:36:19,020 Put another way, how many times can you divide 1,000 in half 700 00:36:19,020 --> 00:36:22,537 before you're left with just the number one? 701 00:36:22,537 --> 00:36:25,120 Well if you do the math, either in one direction or the other, 702 00:36:25,120 --> 00:36:27,810 you'll see that it's roughly ten. 703 00:36:27,810 --> 00:36:31,150 In fact, if you want to pause even and grab a calculator, or a piece of paper, 704 00:36:31,150 --> 00:36:34,233 or pencil, or just think through it in your head, you can start with 1,000 705 00:36:34,233 --> 00:36:36,660 and go to 500, 125, and so forth. 706 00:36:36,660 --> 00:36:39,630 And you'll eventually hit one, after just 10-- 707 00:36:39,630 --> 00:36:41,610 give or take, depending on how you round-- 708 00:36:41,610 --> 00:36:42,900 steps. 709 00:36:42,900 --> 00:36:44,430 That's pretty powerful. 710 00:36:44,430 --> 00:36:45,740 But not that big of a deal. 711 00:36:45,740 --> 00:36:46,240 Right. 712 00:36:46,240 --> 00:36:48,360 Ten still is like a lot of page turns. 713 00:36:48,360 --> 00:36:51,240 But what about an even bigger problem. 714 00:36:51,240 --> 00:36:54,780 The types of problems that Google, and Microsoft, and Facebook, 715 00:36:54,780 --> 00:36:57,570 and Oracle, and really big companies deal with that 716 00:36:57,570 --> 00:36:58,800 have lots and lots of data. 717 00:36:58,800 --> 00:37:02,310 Suppose, for instance, that I'm searching through not a phone 718 00:37:02,310 --> 00:37:04,170 book, but a database. 719 00:37:04,170 --> 00:37:06,300 A big program that stores lots and lots of data. 720 00:37:06,300 --> 00:37:09,780 And suppose the data that's being stored is still names and numbers. 721 00:37:09,780 --> 00:37:13,440 How much time might it take to find someone like Mike Smith, 722 00:37:13,440 --> 00:37:18,510 in a database that's got like 4 billion names and numbers in it somehow? 723 00:37:18,510 --> 00:37:20,400 Well, four billion names and numbers. 724 00:37:20,400 --> 00:37:22,251 Well if we use that first algorithm, might 725 00:37:22,251 --> 00:37:24,000 take as many as four billion steps to find 726 00:37:24,000 --> 00:37:25,560 Mike Smith in a really big database. 727 00:37:25,560 --> 00:37:28,350 In a really big computer program, that's not too smart. 728 00:37:28,350 --> 00:37:30,480 But if I instead use the twosie approach, 729 00:37:30,480 --> 00:37:33,870 flipping through two database records at once, 730 00:37:33,870 --> 00:37:37,730 maybe it's not 4 billion operations, maybe it's just 2 billion. 731 00:37:37,730 --> 00:37:39,000 That's good. 732 00:37:39,000 --> 00:37:40,450 That's half as many operations. 733 00:37:40,450 --> 00:37:43,740 But what if I use this super clever intuition 734 00:37:43,740 --> 00:37:47,160 that I kind of grew up with here, with that divide and conquer algorithm. 735 00:37:47,160 --> 00:37:50,310 Well, I can start with 4 billion database records, go to 2 billion, 736 00:37:50,310 --> 00:37:56,190 then 1 billion, then 500 million, 250 million, 125 million, and so forth. 737 00:37:56,190 --> 00:37:58,690 I'm getting to one much, much faster. 738 00:37:58,690 --> 00:38:03,270 In fact, I can only divide the number 4 billion 739 00:38:03,270 --> 00:38:07,020 in half, roughly 32 times total. 740 00:38:07,020 --> 00:38:11,260 Again, depending kind of sort of on how you round, but 32 times. 741 00:38:11,260 --> 00:38:15,810 32 is so much smaller than 2 billion steps. 742 00:38:15,810 --> 00:38:18,675 And 32 is certainly smaller than the original starting point, 743 00:38:18,675 --> 00:38:20,010 4 billion steps. 744 00:38:20,010 --> 00:38:23,160 So this is a really powerful problem solving technique 745 00:38:23,160 --> 00:38:24,480 to divide and conquer. 746 00:38:24,480 --> 00:38:28,590 And here, too, even though what computers might seem to be doing these 747 00:38:28,590 --> 00:38:31,470 days is super complicated and sophisticated-- 748 00:38:31,470 --> 00:38:33,120 and it is in many ways-- 749 00:38:33,120 --> 00:38:36,180 but some of the ideas that those computers and the programmers who 750 00:38:36,180 --> 00:38:41,460 program them are leveraging, are actually pretty familiar to us already. 751 00:38:41,460 --> 00:38:46,710 Inside of this black box might not be something super fancy, but just 752 00:38:46,710 --> 00:38:49,890 a clever adaptation of some of your grade school 753 00:38:49,890 --> 00:38:56,380 human intuition to the context of a computer program. 754 00:38:56,380 --> 00:39:00,490 Now it's one thing to talk about algorithms, especially 755 00:39:00,490 --> 00:39:03,730 if we're just spit balling it verbally. 756 00:39:03,730 --> 00:39:07,330 But computers, of course, need us to be more precise. 757 00:39:07,330 --> 00:39:10,570 And they need us to state our thoughts more methodically. 758 00:39:10,570 --> 00:39:11,930 So what does this mean? 759 00:39:11,930 --> 00:39:16,330 Well, let me propose that we write some code, or really pseudocode, 760 00:39:16,330 --> 00:39:19,450 for this same algorithm, where we're looking for someone like Mike 761 00:39:19,450 --> 00:39:20,710 Smith in a phone book. 762 00:39:20,710 --> 00:39:23,620 And it's pseudocode because it's not going to be Java, or c++, 763 00:39:23,620 --> 00:39:25,120 or JavaScript, or anything else. 764 00:39:25,120 --> 00:39:28,990 It's going to be English like syntax, that's kind of sort of like code. 765 00:39:28,990 --> 00:39:31,630 And before long, we'll see some actual code as well. 766 00:39:31,630 --> 00:39:33,422 But step zero, and just to be playful here, 767 00:39:33,422 --> 00:39:36,588 I'm not going to start counting at one, I'm going to start counting at zero, 768 00:39:36,588 --> 00:39:39,010 just because with any number of bits, or digits, 769 00:39:39,010 --> 00:39:42,970 the lowest number that I could count with is of course zero. 770 00:39:42,970 --> 00:39:45,550 Pick up phone book is the first thing that I did. 771 00:39:45,550 --> 00:39:47,260 One, open to the middle of the phone book 772 00:39:47,260 --> 00:39:50,200 is the second thing I did in that third and final algorithm. 773 00:39:50,200 --> 00:39:54,100 Look at the names was the next thing I did, looking down at that phone book. 774 00:39:54,100 --> 00:39:56,830 And then if Smith is among names, and notice, 775 00:39:56,830 --> 00:40:00,000 this is semantically different from those first three steps, 776 00:40:00,000 --> 00:40:02,050 because this expression starts with if. 777 00:40:02,050 --> 00:40:04,390 So this is kind of like the proverbial fork in the road, 778 00:40:04,390 --> 00:40:06,940 if Smith is among the names, let's do this. 779 00:40:06,940 --> 00:40:08,200 What do we want to do? 780 00:40:08,200 --> 00:40:10,370 Call Mike, that's great. 781 00:40:10,370 --> 00:40:14,890 Otherwise, or else if Mike is earlier in the book, 782 00:40:14,890 --> 00:40:18,310 let's go a different direction instead. 783 00:40:18,310 --> 00:40:24,340 Let's instead open to the middle of the left half of the book, all right. 784 00:40:24,340 --> 00:40:27,730 So that would be the left half that I threw away earlier, 785 00:40:27,730 --> 00:40:29,710 and then go back to step two. 786 00:40:29,710 --> 00:40:34,000 Because once I have opened to the middle of the left half of the book, 787 00:40:34,000 --> 00:40:35,980 I don't have to actually dramatically tear it. 788 00:40:35,980 --> 00:40:40,390 I now need to look for Mike's name again, as per step two. 789 00:40:40,390 --> 00:40:43,870 Else if Mike is later in the book, I actually 790 00:40:43,870 --> 00:40:47,560 want to open to the middle of the right half of the book, 791 00:40:47,560 --> 00:40:50,880 and then go back to step two as well. 792 00:40:50,880 --> 00:40:54,020 Now you might think that's everything to this program. 793 00:40:54,020 --> 00:40:55,860 But there's actually a remaining step. 794 00:40:55,860 --> 00:40:59,870 Indeed I've got room left on the screen for a remaining step or two, 795 00:40:59,870 --> 00:41:02,540 but there are more than three possibilities. 796 00:41:02,540 --> 00:41:05,510 One of them is that Mike is among the names, one of them 797 00:41:05,510 --> 00:41:09,460 is that Mike is to the left, the other is that Mike is to the right. 798 00:41:09,460 --> 00:41:10,670 What's the fourth? 799 00:41:10,670 --> 00:41:13,610 If I don't consider the fourth, and indeed if in a program 800 00:41:13,610 --> 00:41:16,700 I don't implement the fourth, my program might crash. 801 00:41:16,700 --> 00:41:17,839 My computer might hang. 802 00:41:17,839 --> 00:41:19,880 My computer might behave in an unpredictable way, 803 00:41:19,880 --> 00:41:23,870 because if the programmer wasn't so precise as to anticipate 804 00:41:23,870 --> 00:41:27,590 something that might happen, who knows what the computer might do. 805 00:41:27,590 --> 00:41:30,440 And indeed, that's often why your own computer might 806 00:41:30,440 --> 00:41:32,600 have a little spinning beachball, or icon, 807 00:41:32,600 --> 00:41:34,510 or it might crash outright or freeze. 808 00:41:34,510 --> 00:41:37,100 It's because something unanticipated happened. 809 00:41:37,100 --> 00:41:39,090 So let's be precise. 810 00:41:39,090 --> 00:41:41,510 There's a fourth and final scenario I can think of, 811 00:41:41,510 --> 00:41:46,260 which perhaps on your mind too, just quit. 812 00:41:46,260 --> 00:41:49,024 Because in the fourth scenario, if Mike's not among the names, 813 00:41:49,024 --> 00:41:51,690 and he's not to the left, and he's not to the right in the book, 814 00:41:51,690 --> 00:41:53,110 he must not be there. 815 00:41:53,110 --> 00:41:56,400 And so let's avoid just hanging infinitely somehow or other, 816 00:41:56,400 --> 00:41:58,967 by actually proactively deciding to quit. 817 00:41:58,967 --> 00:42:01,800 But now let's tease apart what some of these terms actually are now. 818 00:42:01,800 --> 00:42:06,030 So in yellow are some things that really just look like verbs, or actions. 819 00:42:06,030 --> 00:42:09,000 And we're going to call those statements, or more specifically 820 00:42:09,000 --> 00:42:12,690 functions, or procedures would be a reasonable synonym too. 821 00:42:12,690 --> 00:42:15,030 And each of these yellow terms is really a call 822 00:42:15,030 --> 00:42:19,350 to action for the computer to just do something unconditionally. 823 00:42:19,350 --> 00:42:22,950 But sometimes, we want that computer to do something conditionally, 824 00:42:22,950 --> 00:42:25,920 as evinced by these yellow terms now. 825 00:42:25,920 --> 00:42:30,630 If, and else if, and else and else, kind of paints the picture of a four-way 826 00:42:30,630 --> 00:42:31,890 fork in the road. 827 00:42:31,890 --> 00:42:34,650 Where each of these branches, or conditions, 828 00:42:34,650 --> 00:42:36,480 leads us to take different action. 829 00:42:36,480 --> 00:42:41,160 And you'll see that I've indented lines four and six and seven and nine and ten 830 00:42:41,160 --> 00:42:45,510 and twelve, because they are meant to happen 831 00:42:45,510 --> 00:42:52,230 only if lines three and five and eight and eleven, or eleven, 832 00:42:52,230 --> 00:42:54,330 actually applies. 833 00:42:54,330 --> 00:42:58,030 So those indentation kind of captures the logic of this program. 834 00:42:58,030 --> 00:43:01,950 Lastly, or second to last there is this. 835 00:43:01,950 --> 00:43:05,680 This is what we call a little more fancily, a Boolean expression. 836 00:43:05,680 --> 00:43:08,810 These yellow phrases here are kind of like questions. 837 00:43:08,810 --> 00:43:12,840 There are either yes no, or true false, or one and zero 838 00:43:12,840 --> 00:43:16,320 or any number of other binary terms. 839 00:43:16,320 --> 00:43:17,880 But these are questions we're asking. 840 00:43:17,880 --> 00:43:19,470 The answer to which is going to be true or false. 841 00:43:19,470 --> 00:43:21,010 Smith is among the names, true or false? 842 00:43:21,010 --> 00:43:22,830 Smith is earlier in the book, true or false? 843 00:43:22,830 --> 00:43:24,580 Smith is later in the book, true or false? 844 00:43:24,580 --> 00:43:27,360 And so these Boolean expressions, named after someone 845 00:43:27,360 --> 00:43:31,860 who the last name of Bool, long ago, is a way 846 00:43:31,860 --> 00:43:35,550 of having conditions take you in a different direction based 847 00:43:35,550 --> 00:43:38,190 on whether something is true or false. 848 00:43:38,190 --> 00:43:40,000 Three such examples there. 849 00:43:40,000 --> 00:43:42,150 And then, lastly, there's these two lines. 850 00:43:42,150 --> 00:43:44,970 On seven and ten there's this expression go back 851 00:43:44,970 --> 00:43:49,750 to step two, which is to induce a bit of cyclicity, right. 852 00:43:49,750 --> 00:43:52,500 You can sort of think about it visually if you go from step seven, 853 00:43:52,500 --> 00:43:54,990 or ten for that matter, back up to step two. 854 00:43:54,990 --> 00:43:58,200 This might happen again, and again, and again, 855 00:43:58,200 --> 00:44:00,720 if Mike is still to the left half, the left half, 856 00:44:00,720 --> 00:44:03,210 the left, as you're whittling down the phone book. 857 00:44:03,210 --> 00:44:05,990 And so we're going to induce, and induce, and induce potentially, 858 00:44:05,990 --> 00:44:08,889 this cycling or looping behavior. 859 00:44:08,889 --> 00:44:10,680 So these lines here now in yellow represent 860 00:44:10,680 --> 00:44:14,020 what a computer program might call a loop. 861 00:44:14,020 --> 00:44:18,060 Now these same English phrases we'll eventually see can be translated 862 00:44:18,060 --> 00:44:21,150 into Java, and c++, and JavaScript, and Python, and Ruby, 863 00:44:21,150 --> 00:44:22,740 and other languages still. 864 00:44:22,740 --> 00:44:24,630 But the key takeaway for us here today is 865 00:44:24,630 --> 00:44:28,740 one, the precision with which we specified these steps, two, 866 00:44:28,740 --> 00:44:31,937 the fact that there's this ordering, what happens after the other. 867 00:44:31,937 --> 00:44:34,020 The fact that there is these conditions, some only 868 00:44:34,020 --> 00:44:36,540 happen if something is true, and the fact that some of them 869 00:44:36,540 --> 00:44:40,680 can happen again, and again, and again, based on some kind of looping. 870 00:44:40,680 --> 00:44:43,300 871 00:44:43,300 --> 00:44:45,040 But is this good? 872 00:44:45,040 --> 00:44:48,020 Like this is correct, and that of course, was one of our goals 873 00:44:48,020 --> 00:44:50,850 with the phone book, initially, was let's get it correct, 874 00:44:50,850 --> 00:44:53,450 and better still, let's get it correct and efficient. 875 00:44:53,450 --> 00:44:55,100 correct and fast. 876 00:44:55,100 --> 00:44:57,890 But how fast now is this? 877 00:44:57,890 --> 00:45:01,010 In fact, how do I put a number, of really a formula, 878 00:45:01,010 --> 00:45:03,170 on the performance of the algorithm so that I 879 00:45:03,170 --> 00:45:07,190 can claim that I am a good programmer, or I am a good problem solver? 880 00:45:07,190 --> 00:45:10,550 I've not only solved your problem correctly, but really, really well. 881 00:45:10,550 --> 00:45:13,820 Well let me propose that we analyze these three functions. 882 00:45:13,820 --> 00:45:14,900 Kind of in the abstract. 883 00:45:14,900 --> 00:45:19,310 We don't need actual arithmetic expressions here, per se. 884 00:45:19,310 --> 00:45:21,930 We'll just do things variably as follows. 885 00:45:21,930 --> 00:45:26,240 So here's a nice little Cartesian plane, with an x-axis of size of problem, 886 00:45:26,240 --> 00:45:29,000 and a y-axis of time to solve. 887 00:45:29,000 --> 00:45:33,020 And the farther out you go on size of problem, from left to right, 888 00:45:33,020 --> 00:45:36,210 means bigger, bigger, bigger, bigger, bigger, bigger, bigger bigger problem. 889 00:45:36,210 --> 00:45:38,570 And by bigger problem I mean more and more pages. 890 00:45:38,570 --> 00:45:41,150 More and more input, whatever the problem actually is. 891 00:45:41,150 --> 00:45:44,930 And then time to solve, is not very much time, lots of time. 892 00:45:44,930 --> 00:45:47,240 So on the y-axis too, the higher you go, the more 893 00:45:47,240 --> 00:45:50,360 time it takes to solve a problem, or really the slower 894 00:45:50,360 --> 00:45:52,380 it is to solve the problem. 895 00:45:52,380 --> 00:45:58,100 So let's now draw this red line here as a depiction of the first algorithms 896 00:45:58,100 --> 00:46:00,080 performance, or running time. 897 00:46:00,080 --> 00:46:04,820 Whereby, there is a one to one relationship between size and time. 898 00:46:04,820 --> 00:46:07,580 A one to one relationship between number of pages, 899 00:46:07,580 --> 00:46:10,520 and maybe number of page turns, or seconds, or whatever 900 00:46:10,520 --> 00:46:12,470 your unit of measure happens to be. 901 00:46:12,470 --> 00:46:18,380 So that if I have a phone book of this size, this dot here on the red line 902 00:46:18,380 --> 00:46:21,390 is how many seconds, or page turns it takes me to find it. 903 00:46:21,390 --> 00:46:24,920 And there's a linear relationship there, as implied by this variable-- 904 00:46:24,920 --> 00:46:26,630 as we'll call it-- as in algebra. 905 00:46:26,630 --> 00:46:29,330 n for number of pages, for instance. 906 00:46:29,330 --> 00:46:33,692 It's linear in so far as Verizon, if the phone company like Verizon 907 00:46:33,692 --> 00:46:36,650 increases the number of pages in the phone book just buy one next year, 908 00:46:36,650 --> 00:46:39,860 a few more people move into town so they add one more page to the phone book. 909 00:46:39,860 --> 00:46:44,360 That might take one additional page turn to find someone like Mike, 910 00:46:44,360 --> 00:46:46,960 or anyone else in that phone book if Verizon just adds a page. 911 00:46:46,960 --> 00:46:48,710 Or if they doubled the number of pages, it 912 00:46:48,710 --> 00:46:51,084 might double the amount of time it takes to find someone. 913 00:46:51,084 --> 00:46:53,781 There's a linear relationship between the two. 914 00:46:53,781 --> 00:46:56,780 Now with the second algorithm, where I was flying through the phone book 915 00:46:56,780 --> 00:47:00,060 two pages at a time, there's still a linear relationship, 916 00:47:00,060 --> 00:47:02,460 but it's not quite as bad so to speak. 917 00:47:02,460 --> 00:47:05,300 For instance, if I have this many pages in my phone book, 918 00:47:05,300 --> 00:47:10,010 my first algorithm might take this many seconds in the first algorithm, 919 00:47:10,010 --> 00:47:13,280 but because I'm flying through the phone book two at a time, 920 00:47:13,280 --> 00:47:15,440 it will take me half as much time. 921 00:47:15,440 --> 00:47:18,060 Half as many page turns with that second algorithm. 922 00:47:18,060 --> 00:47:20,690 So we'll describe it algebraically as n over two. 923 00:47:20,690 --> 00:47:22,980 Where n, again, is just the number of pages. 924 00:47:22,980 --> 00:47:26,970 So there is a relationship between those two lines and those two formulas. 925 00:47:26,970 --> 00:47:28,550 But that third algorithm. 926 00:47:28,550 --> 00:47:31,680 Super, super fancy I'll wager. 927 00:47:31,680 --> 00:47:35,100 And the shape of its line is fundamentally different. 928 00:47:35,100 --> 00:47:37,740 If you remember your logarithms, you'll see here 929 00:47:37,740 --> 00:47:42,170 this green curved line, otherwise depicted here as log of n, 930 00:47:42,170 --> 00:47:44,240 or really log base 2 of n, but we'll just keep 931 00:47:44,240 --> 00:47:46,760 it abstracted away as just log of n. 932 00:47:46,760 --> 00:47:49,100 This is a fundamentally different shape. 933 00:47:49,100 --> 00:47:51,590 And it still goes forever up and up and up and up. 934 00:47:51,590 --> 00:47:53,506 Even though it looks like it's flattening out, 935 00:47:53,506 --> 00:47:58,430 it's not perfectly flattening out, it's just growing and growing very slowly. 936 00:47:58,430 --> 00:48:02,180 And that's key, because what's powerful about that third and final algorithm, 937 00:48:02,180 --> 00:48:06,705 is that Verizon, for instance, doubles the number of pages in the phone book 938 00:48:06,705 --> 00:48:09,830 next year, because a whole lot of people move into town, or maybe two towns 939 00:48:09,830 --> 00:48:11,510 merge or whatever. 940 00:48:11,510 --> 00:48:14,510 Then how many more steps will it take me to find someone 941 00:48:14,510 --> 00:48:17,200 like Mike Smith in that phone book? 942 00:48:17,200 --> 00:48:19,460 Well just one additional page turn. 943 00:48:19,460 --> 00:48:21,950 One additional tear of the phone book. 944 00:48:21,950 --> 00:48:26,060 Because with each tear can I take a bite out of that problem. 945 00:48:26,060 --> 00:48:27,800 That's 50% of it. 946 00:48:27,800 --> 00:48:30,054 I can literally tear it in half. 947 00:48:30,054 --> 00:48:32,720 Meanwhile, Verizon doubled the number of pages in the phone book 948 00:48:32,720 --> 00:48:37,190 with my first algorithm, as per this straight linear relationship, aka n, 949 00:48:37,190 --> 00:48:40,640 it would double the amount of time necessary, or double the number of page 950 00:48:40,640 --> 00:48:43,440 turns necessary, to find Mike Smith in that case. 951 00:48:43,440 --> 00:48:46,640 So again, in the case of the first algorithm, 952 00:48:46,640 --> 00:48:49,040 would double the amount of time taken. 953 00:48:49,040 --> 00:48:53,900 The case of the third algorithm would increase by one step. 954 00:48:53,900 --> 00:48:54,740 One step. 955 00:48:54,740 --> 00:48:59,540 So if that phone book went ridiculously from four billion to eight billion, 956 00:48:59,540 --> 00:49:00,470 no big deal. 957 00:49:00,470 --> 00:49:03,770 But that third algorithm, just need one additional page turn. 958 00:49:03,770 --> 00:49:06,040 One additional tear of the phone book. 959 00:49:06,040 --> 00:49:07,850 Not in the case of the first algorithm. 960 00:49:07,850 --> 00:49:13,790 I would need an additional 4 billion page turns to find who I'm looking for. 961 00:49:13,790 --> 00:49:17,290 962 00:49:17,290 --> 00:49:17,790 OK. 963 00:49:17,790 --> 00:49:20,460 But a phone book is a physical problem. 964 00:49:20,460 --> 00:49:24,360 And searching for phone numbers in a phone book is a physical solution. 965 00:49:24,360 --> 00:49:26,920 But what can computers do electronically? 966 00:49:26,920 --> 00:49:30,150 Like what is the analog in our computer to the problems 967 00:49:30,150 --> 00:49:31,450 we've been solving thus far? 968 00:49:31,450 --> 00:49:34,590 Well what about our own iPhones, or Android phones, or the like, 969 00:49:34,590 --> 00:49:36,480 where you have contacts these days? 970 00:49:36,480 --> 00:49:38,440 Sort of a digital version of a phone book. 971 00:49:38,440 --> 00:49:40,809 And those contacts are typically sorted alphabetically. 972 00:49:40,809 --> 00:49:43,350 But even then, you might know a lot of people, and most of us 973 00:49:43,350 --> 00:49:46,380 probably don't scroll, scroll, scroll, scroll, scroll. 974 00:49:46,380 --> 00:49:48,070 And if you do, you're doing it wrong. 975 00:49:48,070 --> 00:49:51,502 You can instead search by keyword, or by first name, or by last name, 976 00:49:51,502 --> 00:49:53,460 to actually find someone much more efficiently. 977 00:49:53,460 --> 00:49:56,680 And typically their name jumps to the top of the list. 978 00:49:56,680 --> 00:49:58,250 So that's an algorithm. 979 00:49:58,250 --> 00:50:00,660 The Google, and Apple, and other companies 980 00:50:00,660 --> 00:50:04,740 have implemented algorithms for searching, or even sorting 981 00:50:04,740 --> 00:50:07,620 your list of contacts so that they can find people for you quickly, 982 00:50:07,620 --> 00:50:10,830 so that you can just send them a text message or make that call. 983 00:50:10,830 --> 00:50:12,270 But let's simplify further. 984 00:50:12,270 --> 00:50:17,300 Rather than even get into the sorting of names, and alphabetical phrases, 985 00:50:17,300 --> 00:50:19,020 let's keep things simple for a moment. 986 00:50:19,020 --> 00:50:22,710 And just assume that we want to store things like numbers. 987 00:50:22,710 --> 00:50:24,940 A list of numbers for instance. 988 00:50:24,940 --> 00:50:28,800 So if we want to store a list of numbers, how can we do it? 989 00:50:28,800 --> 00:50:32,790 Well let me propose that the screen here, or if you will a piece of paper 990 00:50:32,790 --> 00:50:36,495 in front of you, really just represents your computer's memory. 991 00:50:36,495 --> 00:50:38,370 Now you may know that inside of your computer 992 00:50:38,370 --> 00:50:39,786 there's different types of memory. 993 00:50:39,786 --> 00:50:42,320 Something called the hard disk, or maybe a solid state disk, 994 00:50:42,320 --> 00:50:45,630 or something called ram, or random access memory, something called Rom 995 00:50:45,630 --> 00:50:48,510 or read only memory, something called l-1 cache, l-2 cache, 996 00:50:48,510 --> 00:50:50,130 sometimes l-3 cache and the like. 997 00:50:50,130 --> 00:50:52,110 So there's different types of memory. 998 00:50:52,110 --> 00:50:55,515 But the one we're going to think about right now is ram, random access memory. 999 00:50:55,515 --> 00:50:57,390 And this is the type of memory that you might 1000 00:50:57,390 --> 00:51:02,910 have a gigabyte of, four gigabytes of, 16 gigabytes of, depending on how fancy 1001 00:51:02,910 --> 00:51:07,360 your laptop or desktop computer is, or your phone for that matter. 1002 00:51:07,360 --> 00:51:10,620 So inside of your computer, or phone, is a whole bunch of memory. 1003 00:51:10,620 --> 00:51:11,790 A whole bunch of RAM. 1004 00:51:11,790 --> 00:51:15,240 And if you have, for instance, 4 gigabytes of RAM, 1005 00:51:15,240 --> 00:51:17,940 that is four billion bytes. 1006 00:51:17,940 --> 00:51:20,580 Because what that means is giga means billion, 1007 00:51:20,580 --> 00:51:25,220 so four giga means for billion, four gigabytes means four billion bytes. 1008 00:51:25,220 --> 00:51:28,420 Mega, meanwhile means million, give or take. 1009 00:51:25,840 --> 00:51:28,776 And in fact, earlier when we talked about rgb, 1010 00:51:28,420 --> 00:51:31,310 And four megabytes then would be four million bytes. 1011 00:51:31,310 --> 00:51:32,592 So 4 gigabytes is even bigger. 1012 00:51:32,592 --> 00:51:34,800 And if you want to count higher still, four terabytes 1013 00:51:34,800 --> 00:51:37,540 would be a huge amount of memory. 1014 00:51:37,540 --> 00:51:39,780 So if you have four gigabytes of memory, that's 1015 00:51:39,780 --> 00:51:45,840 four billion bytes of memory, and a byte meanwhile, is just eight bits. 1016 00:51:45,910 --> 00:51:47,460 So why talk in terms of bytes? 1017 00:51:47,460 --> 00:51:49,964 Well it's just easier to talk in terms of eight bits, 1018 00:51:49,964 --> 00:51:51,130 rather than individual bits. 1019 00:51:51,130 --> 00:51:53,463 Because with one bit, you can't really do all that much. 1020 00:51:53,463 --> 00:51:55,220 You can only count from zero to one. 1021 00:51:55,220 --> 00:51:59,390 So with 8 bits at least we can count a bit higher and express a bit more. 1022 00:51:59,390 --> 00:52:04,744 so if we have 4 billion bytes of memory inside of our computer, 1023 00:52:04,744 --> 00:52:07,660 you know it stands to reason that we could number each of those bytes. 1024 00:52:07,660 --> 00:52:11,710 And maybe the top most byte up here represents byte number zero, 1025 00:52:11,710 --> 00:52:18,039 and maybe the bottom most byte over here represents the number four billion, 1026 00:52:18,039 --> 00:52:18,580 give or take. 1027 00:52:18,580 --> 00:52:21,740 In fact it's not exactly 4 billion, it's a little more than that. 1028 00:52:21,740 --> 00:52:24,580 But for our purposes, let's just assume that if this screen here 1029 00:52:24,580 --> 00:52:27,950 represents all of my computer's memory, and you know for that matter, 1030 00:52:27,950 --> 00:52:31,720 let's think of my memory as being divisible into rows and columns, 1031 00:52:31,720 --> 00:52:33,250 just to kind of keep things orderly. 1032 00:52:33,250 --> 00:52:36,070 This is not actually what it looks like underneath the hood. 1033 00:52:36,070 --> 00:52:39,310 And it's definitely not as wavy as I making it out to be here. 1034 00:52:39,310 --> 00:52:42,280 But you can think of it really, as rows and columns, 1035 00:52:42,280 --> 00:52:45,700 that you can address each of the cells in this table. 1036 00:52:45,700 --> 00:52:48,970 So kind of sort of like a spreadsheet with a whole lot of room for values. 1037 00:52:48,970 --> 00:52:53,670 So this might be byte zero one, two, three, dot dot dot if you will. 1038 00:52:53,670 --> 00:52:55,967 Four billion or so in the bottom right hand corner. 1039 00:52:55,967 --> 00:52:59,050 But where these things are laid out doesn't really matter for our purposes 1040 00:52:59,050 --> 00:53:00,290 right now. 1041 00:53:00,290 --> 00:53:02,510 So that's how you might address memory. 1042 00:53:02,510 --> 00:53:04,210 But how might you use memory? 1043 00:53:04,210 --> 00:53:06,730 So suppose I want to store a whole bunch of values 1044 00:53:06,730 --> 00:53:08,620 inside of my computer's memory. 1045 00:53:08,620 --> 00:53:11,630 I don't really care where it is for the moment. 1046 00:53:11,630 --> 00:53:13,840 And therefore, I don't care what those addresses are. 1047 00:53:13,840 --> 00:53:20,290 But let's suppose that I do care that the numbers are close by to each other. 1048 00:53:20,290 --> 00:53:21,640 They are contiguous in memory. 1049 00:53:21,640 --> 00:53:22,870 Back to back to back to back. 1050 00:53:22,870 --> 00:53:26,290 And we'll see that that has some advantages for efficiency. 1051 00:53:26,290 --> 00:53:29,290 So suppose that I want to store one number in my computer's memory, 1052 00:53:29,290 --> 00:53:34,750 and suppose that this memory and this memory and this memory 1053 00:53:34,750 --> 00:53:37,690 here, those are all being used already by other programs, 1054 00:53:37,690 --> 00:53:39,410 or other things going on in my program. 1055 00:53:39,410 --> 00:53:43,119 And so the first available slot, for instance, say is this one here. 1056 00:53:43,119 --> 00:53:45,160 And the number I want to store is the number one. 1057 00:53:45,160 --> 00:53:48,940 And the next number that I want to store is going to be the number 50. 1058 00:53:48,940 --> 00:53:51,670 And the next number I want to store is going to be the number 51. 1059 00:53:51,670 --> 00:53:53,410 And the next one 61. 1060 00:53:53,410 --> 00:53:55,720 And then the next one 109. 1061 00:53:55,720 --> 00:53:58,180 And then the next one 121. 1062 00:53:58,180 --> 00:53:59,690 And so forth. 1063 00:53:59,690 --> 00:54:03,400 Which is to say, if I'm storing not people's phone numbers or names, 1064 00:54:03,400 --> 00:54:05,650 just more simply for the sake of discussion, 1065 00:54:05,650 --> 00:54:09,130 all I care about for whatever reason is storing a list of numbers. 1066 00:54:09,130 --> 00:54:13,090 I might store them in my computer's memory in exactly this way. 1067 00:54:13,090 --> 00:54:16,150 Each of these squares represents a byte, or eight bits. 1068 00:54:16,150 --> 00:54:18,910 We know that we can abstract above bits, and actually 1069 00:54:18,910 --> 00:54:20,530 represent things like decimal numbers. 1070 00:54:20,530 --> 00:54:22,613 So I'm just stipulating that inside of these boxes 1071 00:54:22,613 --> 00:54:25,669 really is eight bits, or some pattern of eight zeros and ones. 1072 00:54:25,669 --> 00:54:26,710 But that's too low level. 1073 00:54:26,710 --> 00:54:30,160 We're going to abstract away from that and just talk in terms, for now, 1074 00:54:30,160 --> 00:54:32,300 in decimal digits. 1075 00:54:32,300 --> 00:54:36,100 So this thing here might be how I store something 1076 00:54:36,100 --> 00:54:38,260 underneath the hood in my computer. 1077 00:54:38,260 --> 00:54:43,910 And what's nice about this, is that it's super, super easy to find values. 1078 00:54:43,910 --> 00:54:47,680 In fact, let me just highlight this range of values, and slap a name on it. 1079 00:54:47,680 --> 00:54:50,620 This boxed in region here in a computer program, 1080 00:54:50,620 --> 00:54:53,620 would very often be called an array. 1081 00:54:53,620 --> 00:54:57,370 It's a list of sorts, of values, but what's key 1082 00:54:57,370 --> 00:55:01,540 here is that this list of values is back to back to back 1083 00:55:01,540 --> 00:55:04,090 to back all contiguous in memory. 1084 00:55:04,090 --> 00:55:08,620 When that is the case, when you have this property inside of a computer, 1085 00:55:08,620 --> 00:55:10,870 you can do things pretty darned fast. 1086 00:55:10,870 --> 00:55:14,500 Because there's a mathematical relationship among the locations of all 1087 00:55:14,500 --> 00:55:15,320 of these values. 1088 00:55:15,320 --> 00:55:17,770 For instance, if you are the computer, and you 1089 00:55:17,770 --> 00:55:20,170 find yourself looking at the number 50, how do you 1090 00:55:20,170 --> 00:55:22,180 find the next number in the list? 1091 00:55:22,180 --> 00:55:25,990 Well, you just add one to your location. 1092 00:55:25,990 --> 00:55:28,390 Not one to your value, one to your location. 1093 00:55:28,390 --> 00:55:31,060 How do you go from the number 51 to 61? 1094 00:55:31,060 --> 00:55:33,130 You just add one to your location. 1095 00:55:33,130 --> 00:55:35,982 Again, because if this is byte zero, one. two, three, and I 1096 00:55:35,982 --> 00:55:37,190 don't know where we are here. 1097 00:55:37,190 --> 00:55:40,750 Maybe this is byte 100, 101, 102, 103. 1098 00:55:40,750 --> 00:55:43,390 Because all of these values are back to back to back to back, 1099 00:55:43,390 --> 00:55:46,960 you can very quickly access them just by looking to the left, 1100 00:55:46,960 --> 00:55:51,260 looking to the right, by a fixed number of bits. 1101 00:55:51,260 --> 00:55:53,560 This is advantageous, because you can jump around. 1102 00:55:53,560 --> 00:55:56,950 In fact, this is where the name random access memory comes from. 1103 00:55:56,950 --> 00:56:01,150 You have random access, not in the sense that you just go anywhere randomly, 1104 00:56:01,150 --> 00:56:06,290 but you can go anywhere you want in the same amount of time. 1105 00:56:06,290 --> 00:56:10,000 Even though this byte four billion looks really far away from everything else, 1106 00:56:10,000 --> 00:56:10,840 doesn't matter. 1107 00:56:10,840 --> 00:56:14,950 Because arithmetically the computer can say, give me byte number four billion. 1108 00:56:14,950 --> 00:56:18,590 And that's a constant time operation, it's not linear. 1109 00:56:18,590 --> 00:56:21,880 Doesn't take more and more steps the farther away it is, because all of this 1110 00:56:21,880 --> 00:56:25,060 is electricity and electronic, not moving parts. 1111 00:56:25,060 --> 00:56:30,220 You can just instantly jump to any of the cells in your computer's memory 1112 00:56:30,220 --> 00:56:32,020 like that. 1113 00:56:32,020 --> 00:56:36,220 The catch is, me the human, can look at these values and just see, 1114 00:56:36,220 --> 00:56:38,290 oh I see a whole bunch of values. 1115 00:56:38,290 --> 00:56:42,610 Six values inside of that array of memory there. 1116 00:56:42,610 --> 00:56:44,360 But a computer can't do that. 1117 00:56:44,360 --> 00:56:48,430 In fact a computer can really only see one value at a time. 1118 00:56:48,430 --> 00:56:52,440 And so when I say the computer can go get value four billion, it can do that. 1119 00:56:52,440 --> 00:56:54,090 But it can only go get that one value. 1120 00:56:54,090 --> 00:56:57,160 If it wants another, it has to go get that value and another value. 1121 00:56:57,160 --> 00:57:01,140 And indeed, inside of a computer, inside of a CPU that a company like Intel 1122 00:57:01,140 --> 00:57:02,850 might make, there's an instruction that's 1123 00:57:02,850 --> 00:57:06,000 often called load, which refers to exactly this process. 1124 00:57:06,000 --> 00:57:09,660 Go load a value, maybe one byte, maybe four bytes, maybe eight bytes, 1125 00:57:09,660 --> 00:57:11,880 from memory, and bring it back to me. 1126 00:57:11,880 --> 00:57:14,460 And put it in something called a register, typically. 1127 00:57:14,460 --> 00:57:18,660 But that load operation is kind of like, if you've ever seen those fancy soda 1128 00:57:18,660 --> 00:57:21,675 machines where it doesn't just drop your soda on the ground, 1129 00:57:21,675 --> 00:57:24,300 instead there's that fancy little robotic arm that for whatever 1130 00:57:24,300 --> 00:57:27,420 reason goes left and then goes right and then goes up and then goes down, 1131 00:57:27,420 --> 00:57:30,450 and then finally decides where your soda is. 1132 00:57:30,450 --> 00:57:33,540 Then it moves it over the dispenser and drops it out. 1133 00:57:33,540 --> 00:57:37,560 Well that kind of robotic arm is kind of like a physical incarnation 1134 00:57:37,560 --> 00:57:41,070 of a computer, in that it can't just take a look back and figure out 1135 00:57:41,070 --> 00:57:46,290 where the coca-cola is, it has to go to that particular location 1136 00:57:46,290 --> 00:57:47,670 and actually drop it for you. 1137 00:57:47,670 --> 00:57:51,420 But computers, when there aren't moving parts, can just jump there instantly. 1138 00:57:51,420 --> 00:57:53,730 You don't have to wait for that robotic arm to move. 1139 00:57:53,730 --> 00:57:57,690 And so in this case, the computer can get one of these values instantly, 1140 00:57:57,690 --> 00:58:02,640 but it doesn't know what value is there until it arrives. 1141 00:58:02,640 --> 00:58:04,110 Right It's kind of like being-- 1142 00:58:04,110 --> 00:58:09,240 another analogy might be a whole bunch of lockers 1143 00:58:09,240 --> 00:58:11,160 in a train station or the like. 1144 00:58:11,160 --> 00:58:14,370 Where the only way you can see what's inside of a locker is by renting it 1145 00:58:14,370 --> 00:58:17,490 or by putting in a key and opening it up and seeing what's inside. 1146 00:58:17,490 --> 00:58:21,150 Similarly, can you think of there being tiny little doors, 1147 00:58:21,150 --> 00:58:24,540 occluding these numbers so that the computer has to open it up 1148 00:58:24,540 --> 00:58:25,920 before it sees what's there. 1149 00:58:25,920 --> 00:58:28,890 1150 00:58:28,890 --> 00:58:32,920 Speaking of doors, let's suppose that the computer's memory actually 1151 00:58:32,920 --> 00:58:34,270 has some doors in front of it. 1152 00:58:34,270 --> 00:58:36,760 Depicted here, is just these simple rectangles. 1153 00:58:36,760 --> 00:58:40,660 And behind each of these doors is some random numbers, 1154 00:58:40,660 --> 00:58:43,600 among which is one number I do know and care about. 1155 00:58:43,600 --> 00:58:46,000 In fact, I want to find myself the number 50, 1156 00:58:46,000 --> 00:58:47,430 but I don't know where it is. 1157 00:58:47,430 --> 00:58:51,280 So I'm going to take a pretty naive, but correct approach, of just starting 1158 00:58:51,280 --> 00:58:51,910 from the left. 1159 00:58:51,910 --> 00:58:55,210 Just as we did with a phone book, starting from the first page. 1160 00:58:55,210 --> 00:58:58,819 And I'm going to go ahead and knock on, and then open the first door. 1161 00:58:58,819 --> 00:59:01,360 And I see the number 15, that of course is not the number 50. 1162 00:59:01,360 --> 00:59:02,693 So I'm going to proceed further. 1163 00:59:02,693 --> 00:59:07,030 To go and knock on and touch the second door, and I see 23. 1164 00:59:07,030 --> 00:59:10,030 Still not the number I'm looking for, so I knock and touch again. 1165 00:59:10,030 --> 00:59:10,690 16. 1166 00:59:10,690 --> 00:59:12,400 Knock and touch again. 1167 00:59:12,400 --> 00:59:14,170 Knock and touch again. 1168 00:59:14,170 --> 00:59:16,600 Still no number, but I have found the meaning of life. 1169 00:59:16,600 --> 00:59:22,150 And let's try one more door here and touch the number 50 behind this door. 1170 00:59:22,150 --> 00:59:25,807 So here too is a very linear algorithm, but these numbers 1171 00:59:25,807 --> 00:59:27,390 are kind of all over the place, right. 1172 00:59:27,390 --> 00:59:30,980 It starts at 15, gets a little bigger, then gets smaller, then smaller still, 1173 00:59:30,980 --> 00:59:32,440 then bigger then bigger again. 1174 00:59:32,440 --> 00:59:36,580 So it doesn't seem that these numbers are sorted. 1175 00:59:36,580 --> 00:59:41,560 And in fact, if the analog then in my phone might be, 1176 00:59:41,560 --> 00:59:44,020 my contacts might be just randomly ordered. 1177 00:59:44,020 --> 00:59:46,099 Maybe ordered in the order in which I inputted 1178 00:59:46,099 --> 00:59:48,640 them, which probably isn't ideal when I want to find someone. 1179 00:59:48,640 --> 00:59:50,770 I don't really care in what order they were put in, 1180 00:59:50,770 --> 00:59:52,930 I care that I find them quickly. 1181 00:59:52,930 --> 00:59:55,540 And so maybe I should do a little work up front, 1182 00:59:55,540 --> 00:59:58,330 or maybe Apple or Google should do a little work upfront, 1183 00:59:58,330 --> 01:00:01,120 and actually maintain my contacts in sorted order. 1184 01:00:01,120 --> 01:00:02,380 Which indeed they do. 1185 01:00:02,380 --> 01:00:05,470 Or in this case here, why don't I at least maintain my numbers 1186 01:00:05,470 --> 01:00:06,800 in sorted order. 1187 01:00:06,800 --> 01:00:10,870 So let me go ahead and reset all of the doors, and start again. 1188 01:00:10,870 --> 01:00:13,180 This time with an assumption. 1189 01:00:13,180 --> 01:00:17,050 This time with a feature, if you will, that the numbers are now 1190 01:00:17,050 --> 01:00:19,240 sorted behind these doors. 1191 01:00:19,240 --> 01:00:22,960 So they go from smallest to biggest, and there might be some gaps in the middle, 1192 01:00:22,960 --> 01:00:25,720 but in so far as they're now sorted, I can 1193 01:00:25,720 --> 01:00:29,470 leverage some of that old school phone book intuition and divide 1194 01:00:29,470 --> 01:00:30,880 and conquer this problem. 1195 01:00:30,880 --> 01:00:33,760 Let me go ahead into the middle of the problem now, not the left, 1196 01:00:33,760 --> 01:00:37,060 and let me go ahead and knock on and open the middle door. 1197 01:00:37,060 --> 01:00:38,040 And there's 16. 1198 01:00:38,040 --> 01:00:40,450 Now 16 is kind of a small number, but I now 1199 01:00:40,450 --> 01:00:42,414 know that none of the numbers to the left 1200 01:00:42,414 --> 01:00:44,080 are going to be the one I'm looking for. 1201 01:00:44,080 --> 01:00:45,670 Because I'm still looking for 50. 1202 01:00:45,670 --> 01:00:48,070 So I can sort of throw that half of the problem 1203 01:00:48,070 --> 01:00:50,350 away, or really just turn a blind eye to it. 1204 01:00:50,350 --> 01:00:52,270 And now go to the right half of the problem, 1205 01:00:52,270 --> 01:00:58,270 identify the middle door here, knock on and open this one, there again is 42. 1206 01:00:58,270 --> 01:01:00,580 But now, again, I can apply some of that intuition 1207 01:01:00,580 --> 01:01:03,820 and know that 50 is bigger than 42, so must be 1208 01:01:03,820 --> 01:01:08,680 that 50 is right here at the last door. 1209 01:01:08,680 --> 01:01:13,360 And in this case, have I gotten away with opening fewer doors? 1210 01:01:13,360 --> 01:01:17,789 It's not a frightening number of fewer doors, it's still relatively small, 1211 01:01:17,789 --> 01:01:20,830 but that's only because we could fit some seven doors or so on the screen 1212 01:01:20,830 --> 01:01:21,340 here. 1213 01:01:21,340 --> 01:01:25,330 But if the number of doors were four billion, or if the number of doors 1214 01:01:25,330 --> 01:01:30,130 were even larger, surely with this divide and conquer approach, 1215 01:01:30,130 --> 01:01:33,800 could I find my number super fast. 1216 01:01:33,800 --> 01:01:35,890 But there's kind of a trade off here, right. 1217 01:01:35,890 --> 01:01:39,520 Like yes, I can now find numbers faster, or equivalently 1218 01:01:39,520 --> 01:01:42,790 I can find people presumably in my address book in my phone 1219 01:01:42,790 --> 01:01:47,890 faster, if they are maintained in sorted order, but what price did I pay? 1220 01:01:47,890 --> 01:01:52,300 In fact, a theme in computer science, and in programming specifically, 1221 01:01:52,300 --> 01:01:54,760 is this theme of trade offs. 1222 01:01:54,760 --> 01:01:59,030 Yes, you can speed up searches, but you're going to have to pay a price. 1223 01:01:59,030 --> 01:02:00,680 What is that price in this context? 1224 01:02:00,680 --> 01:02:04,930 Well it took someone some amount of work to sort those contacts, right. 1225 01:02:04,930 --> 01:02:08,811 Before class I had actually come up with the sorted order of those doors, 1226 01:02:08,811 --> 01:02:11,060 and then put the right numbers behind the right doors. 1227 01:02:11,060 --> 01:02:12,700 There was a non-zero amount of work. 1228 01:02:12,700 --> 01:02:15,130 Not a huge amount of work, but it's non-zero. 1229 01:02:15,130 --> 01:02:19,030 So really, by speeding things up during search, 1230 01:02:19,030 --> 01:02:24,940 I have to first incur an up front, one time cost, of sorting things 1231 01:02:24,940 --> 01:02:26,830 before I even begin to search. 1232 01:02:26,830 --> 01:02:29,907 Or equivalently, someone at Google, or someone at Microsoft, 1233 01:02:29,907 --> 01:02:32,740 or someone at Facebook, or any of these big companies with big data, 1234 01:02:32,740 --> 01:02:36,280 if they want to use searching efficiently, 1235 01:02:36,280 --> 01:02:39,190 they need to pay a price upfront of sorting the data first. 1236 01:02:39,190 --> 01:02:42,100 And indeed, that's what Android and iOS and other devices 1237 01:02:42,100 --> 01:02:43,760 too, must be doing underneath the hood. 1238 01:02:43,760 --> 01:02:47,620 Someone at some point in time, probably long ago at this point, 1239 01:02:47,620 --> 01:02:50,820 implemented a sorting routine. 1240 01:02:50,820 --> 01:02:53,380 But how quickly can we sort numbers? 1241 01:02:53,380 --> 01:02:58,150 How quickly can we actually achieve that result of sorted values 1242 01:02:58,150 --> 01:03:02,020 so that we can then do search more efficiently? 1243 01:03:02,020 --> 01:03:04,110 So for this, let's try something a little visual, 1244 01:03:04,110 --> 01:03:05,549 a little old school, in fact. 1245 01:03:05,549 --> 01:03:07,090 We're here in this beautiful theater. 1246 01:03:07,090 --> 01:03:09,965 And in fact we have a whole bunch of music stands from the musicians. 1247 01:03:09,965 --> 01:03:12,010 Let's go ahead and use a few music stands, 1248 01:03:12,010 --> 01:03:15,705 each of which will represent a location in my computer's memory, 1249 01:03:15,705 --> 01:03:17,830 and then I'm just going to have some numbers that I 1250 01:03:17,830 --> 01:03:19,930 want to put inside of those locations. 1251 01:03:19,930 --> 01:03:25,625 And let's see how expensive it is for me to actually sort those values. 1252 01:03:25,625 --> 01:03:28,610 So I have here in my hand the numbers one through eight, 1253 01:03:28,610 --> 01:03:32,900 and I have here eight memory locations as represented here by music stands. 1254 01:03:32,900 --> 01:03:37,070 And let me go ahead and just put these numbers in pretty much random order. 1255 01:03:37,070 --> 01:03:39,320 I've got my little cheat sheet there up on the screen, 1256 01:03:39,320 --> 01:03:42,380 so that we reset each time to the same location. 1257 01:03:42,380 --> 01:03:45,955 And I'm just kind of putting these in some random order. 1258 01:03:45,955 --> 01:03:48,080 Where four's going to be over here, and the numbers 1259 01:03:48,080 --> 01:03:49,913 are going to sometimes get bigger, sometimes 1260 01:03:49,913 --> 01:03:51,560 get smaller from left to right. 1261 01:03:51,560 --> 01:03:54,590 The key, though, is that they're not actually sorted. 1262 01:03:54,590 --> 01:04:00,200 And so I start now with these values as such. 1263 01:04:00,200 --> 01:04:00,860 All right. 1264 01:04:00,860 --> 01:04:04,850 So suppose I wanted to find now, maybe the number 50. 1265 01:04:04,850 --> 01:04:07,850 Well 50 is not among these numbers, so we better think a little smaller. 1266 01:04:07,850 --> 01:04:10,250 Suppose I want to find myself the number seven. 1267 01:04:10,250 --> 01:04:13,570 Well again, as in the case of a computer, or the doors, 1268 01:04:13,570 --> 01:04:16,340 I can only find seven not just by kind of cheating like a human 1269 01:04:16,340 --> 01:04:18,170 and say, oh there it is over there. 1270 01:04:18,170 --> 01:04:20,180 I have to get to that point more methodically. 1271 01:04:20,180 --> 01:04:24,830 And define the number seven linearly, using linear search, so to speak. 1272 01:04:24,830 --> 01:04:26,690 I have to check is this it, nope. 1273 01:04:26,690 --> 01:04:27,770 Is this it, nope. 1274 01:04:27,770 --> 01:04:28,270 Is this it? 1275 01:04:28,270 --> 01:04:29,270 Nope. 1276 01:04:29,270 --> 01:04:29,770 Is this it? 1277 01:04:29,770 --> 01:04:30,270 Nope. 1278 01:04:30,270 --> 01:04:30,770 Is this it? 1279 01:04:30,770 --> 01:04:31,270 Nope. 1280 01:04:31,270 --> 01:04:31,770 Is this it? 1281 01:04:31,770 --> 01:04:32,270 Nope. 1282 01:04:32,270 --> 01:04:33,050 Is this it? 1283 01:04:33,050 --> 01:04:33,680 Yes. 1284 01:04:33,680 --> 01:04:36,139 And now I found it, coincidentally, some seven steps later. 1285 01:04:36,139 --> 01:04:38,555 But seven could have been anywhere, but in the worst case, 1286 01:04:38,555 --> 01:04:39,930 it's all the way over here. 1287 01:04:39,930 --> 01:04:43,830 Now could I use binary search on this? 1288 01:04:43,830 --> 01:04:47,040 Let's actually slap a label on that algorithm we've leveraged with a phone 1289 01:04:47,040 --> 01:04:47,540 book. 1290 01:04:47,540 --> 01:04:48,920 Dividing and conquering. 1291 01:04:48,920 --> 01:04:52,040 If you do it half and half and half and half, looking for some value, 1292 01:04:52,040 --> 01:04:54,740 it's actually more technically called binary search. 1293 01:04:54,740 --> 01:04:57,450 Bi meaning two, and so it's two because you're splitting 1294 01:04:57,450 --> 01:04:58,700 the problem in half each time. 1295 01:04:58,700 --> 01:05:00,620 Could I use binary search here? 1296 01:05:00,620 --> 01:05:02,592 Well there's one minor problem, which is that I 1297 01:05:02,592 --> 01:05:04,550 don't have an odd number of music stands, which 1298 01:05:04,550 --> 01:05:07,184 means the middle element is like between eight and one. 1299 01:05:07,184 --> 01:05:07,850 But that's fine. 1300 01:05:07,850 --> 01:05:10,940 We can deal with that arithmetically by just rounding up or down, 1301 01:05:10,940 --> 01:05:13,460 depending on how we want to implement it ultimately. 1302 01:05:13,460 --> 01:05:15,810 But even then, that's not the biggest problem. 1303 01:05:15,810 --> 01:05:19,070 If I look at eight here, and I'm looking for seven, well 1304 01:05:19,070 --> 01:05:22,010 based on my previous divide and conquer strategy, 1305 01:05:22,010 --> 01:05:24,890 I should be looking to the left because seven is smaller than eight. 1306 01:05:24,890 --> 01:05:27,223 Of course I'm not going to find it to the left of eight, 1307 01:05:27,223 --> 01:05:29,220 because it's not there, it's over there. 1308 01:05:29,220 --> 01:05:31,530 And that's because the numbers aren't sorted. 1309 01:05:31,530 --> 01:05:34,410 So as in the case with my hypothetical address book, 1310 01:05:34,410 --> 01:05:37,250 if I want to be able to search this efficiently, 1311 01:05:37,250 --> 01:05:41,020 or search a database sufficiently, or search most anything efficiently, 1312 01:05:41,020 --> 01:05:42,770 I need to know something about the data. 1313 01:05:42,770 --> 01:05:44,070 And ideally that it's sorted. 1314 01:05:44,070 --> 01:05:47,280 So how do I get to the point of sorting this data? 1315 01:05:47,280 --> 01:05:50,390 Well, you know what, I could take a fairly localized approach. 1316 01:05:50,390 --> 01:05:52,700 Like four and two, I'm looking at them right here. 1317 01:05:52,700 --> 01:05:54,920 And technically, I'm looking at them one at a time. 1318 01:05:54,920 --> 01:05:58,380 I see four, I see two, and I know they're out of order. 1319 01:05:58,380 --> 01:06:00,180 Well, as the computer, I could do this. 1320 01:06:00,180 --> 01:06:03,530 I could, using my load and store instruction, essentially, 1321 01:06:03,530 --> 01:06:05,300 or a move instruction, do that. 1322 01:06:05,300 --> 01:06:08,370 And now I've improved the sorting of this list. 1323 01:06:08,370 --> 01:06:10,640 Now four and six, those look OK. 1324 01:06:10,640 --> 01:06:12,170 Six and eight, those look OK. 1325 01:06:12,170 --> 01:06:14,390 One and eight, Mm-mm, not OK. 1326 01:06:14,390 --> 01:06:16,370 So let me just make a quick fix. 1327 01:06:16,370 --> 01:06:18,450 Let me just swap one and eight. 1328 01:06:18,450 --> 01:06:20,510 It's not perfect yet, but it's better. 1329 01:06:20,510 --> 01:06:23,300 Eight and three, let me make a quick fix. 1330 01:06:23,300 --> 01:06:26,090 Well, that's better. 1331 01:06:26,090 --> 01:06:28,735 Eight and seven, a quick fix, that's better. 1332 01:06:28,735 --> 01:06:31,280 Eight and five, quick fix, that's better. 1333 01:06:31,280 --> 01:06:32,240 So this is great. 1334 01:06:32,240 --> 01:06:34,460 Eight is all the way to the right as intended, 1335 01:06:34,460 --> 01:06:38,780 and one is not yet all the way to the left. 1336 01:06:38,780 --> 01:06:42,440 So I haven't solved the problem, but I have improved it. 1337 01:06:42,440 --> 01:06:45,650 I've gotten a step closer to the correct solution of sorted numbers, 1338 01:06:45,650 --> 01:06:49,430 because some of the values moved at least one step closer 1339 01:06:49,430 --> 01:06:51,962 to their intended location. 1340 01:06:51,962 --> 01:06:53,670 But of course, I've got to do this again. 1341 01:06:53,670 --> 01:06:55,097 So two and four their good. 1342 01:06:55,097 --> 01:06:56,180 Four and six they're good. 1343 01:06:56,180 --> 01:06:58,470 Oh six and one, now that's out of order. 1344 01:06:58,470 --> 01:07:00,980 And so I can transpose these two. 1345 01:07:00,980 --> 01:07:06,200 So this algorithm has a name, whereby I am transposing pairwise elements that 1346 01:07:06,200 --> 01:07:10,190 are out of order, such that the biggest numbers are bubbling their way up 1347 01:07:10,190 --> 01:07:14,060 to the top. eight already made its way there, seven just made its way there. 1348 01:07:14,060 --> 01:07:16,880 And so if I keep doing this, these bigger values 1349 01:07:16,880 --> 01:07:19,010 are going to continue to bubble their way up. 1350 01:07:19,010 --> 01:07:20,600 Two and four those or fine. 1351 01:07:20,600 --> 01:07:23,750 Four and one, ooh notice, four is starting to bubble to the right. 1352 01:07:23,750 --> 01:07:26,200 Four and three, it's continuing to bubble. 1353 01:07:26,200 --> 01:07:27,440 Four and six is OK. 1354 01:07:27,440 --> 01:07:28,670 Six and five, mm-mm. 1355 01:07:28,670 --> 01:07:32,090 Six bubbled its way up, and now I got a good one there. 1356 01:07:32,090 --> 01:07:34,160 I really fixed a lot of those values. 1357 01:07:34,160 --> 01:07:35,990 I just need to do this once more. 1358 01:07:35,990 --> 01:07:38,180 One and two, fixed that. 1359 01:07:38,180 --> 01:07:42,050 Two and three, three and four, four and five, six and seven, seven and eight. 1360 01:07:42,050 --> 01:07:44,630 Let me do another spot check just to make sure. 1361 01:07:44,630 --> 01:07:48,320 One and two, three and four, five,six, seven, eight. 1362 01:07:48,320 --> 01:07:49,010 I'm good. 1363 01:07:49,010 --> 01:07:51,950 But I did a lot of walking, and I did a lot of transpositions. 1364 01:07:51,950 --> 01:07:54,450 And it turns out this algorithm is correct, 1365 01:07:54,450 --> 01:07:56,960 does have a formal name called bubble sort. 1366 01:07:56,960 --> 01:07:58,529 But it's not very efficient. 1367 01:07:58,529 --> 01:08:02,149 In fact, if you actually do the math out, this takes as many as like n times 1368 01:08:02,149 --> 01:08:03,819 n steps in the worst case. 1369 01:08:03,819 --> 01:08:06,649 So if you have n elements, that's eight in this case. 1370 01:08:06,649 --> 01:08:07,990 N times n, that's 64. 1371 01:08:07,990 --> 01:08:12,500 It's not actually 64, but it starts to approach 64 asymptotically, 1372 01:08:12,500 --> 01:08:13,109 so to speak. 1373 01:08:13,109 --> 01:08:15,479 So if we did a formal analysis, long story short, 1374 01:08:15,479 --> 01:08:18,590 it would start to feel like something quadratic, so to speak. 1375 01:08:18,590 --> 01:08:19,819 Not a fast algorithm. 1376 01:08:19,819 --> 01:08:20,830 But there's other ones. 1377 01:08:20,830 --> 01:08:22,663 So let's actually try out another algorithm. 1378 01:08:22,663 --> 01:08:27,149 Let me go ahead and restore the original, seemingly random order here. 1379 01:08:27,149 --> 01:08:29,580 And indeed it is random in so far as I'm just 1380 01:08:29,580 --> 01:08:35,260 putting it in some non-sorted order that I thought through initially. 1381 01:08:35,260 --> 01:08:39,569 One's going to go here, three is going to go here, seven and five. 1382 01:08:39,569 --> 01:08:42,180 So now we're back to the original unsorted order. 1383 01:08:42,180 --> 01:08:44,670 You know what, I can do another algorithm altogether here. 1384 01:08:44,670 --> 01:08:48,840 Let me not look at them pairwise, let me just go even simpler. 1385 01:08:48,840 --> 01:08:51,534 Let me just find the smallest number, and deal with it. 1386 01:08:51,534 --> 01:08:54,700 All right, four is pretty small indeed it's the smallest I've seen thus far, 1387 01:08:54,700 --> 01:08:55,908 I'm going to hang on to this. 1388 01:08:55,908 --> 01:08:58,050 Oh wait a minute, two a smaller. 1389 01:08:58,050 --> 01:08:59,229 Let me hang on to that. 1390 01:08:59,229 --> 01:09:00,450 Six not smaller. 1391 01:09:00,450 --> 01:09:02,010 One, eight not smaller. 1392 01:09:02,010 --> 01:09:05,500 One is smaller, let me go ahead and hang on to this. 1393 01:09:05,500 --> 01:09:07,810 Three, not smaller. 1394 01:09:07,810 --> 01:09:08,729 Seven, five. 1395 01:09:08,729 --> 01:09:10,200 OK so here's the smallest element. 1396 01:09:10,200 --> 01:09:12,424 I found it I've selected the smallest element. 1397 01:09:12,424 --> 01:09:14,590 You know what, this belongs all the way to the left. 1398 01:09:14,590 --> 01:09:17,380 I'm going to put it there. 1399 01:09:17,380 --> 01:09:18,810 I'm going to do that. 1400 01:09:18,810 --> 01:09:20,220 Of course, this is cheating. 1401 01:09:20,220 --> 01:09:22,890 Like this music stand, even though it's kind of wide, 1402 01:09:22,890 --> 01:09:24,540 should only be storing one value. 1403 01:09:24,540 --> 01:09:27,880 There's only enough room for one byte for instance of information. 1404 01:09:27,880 --> 01:09:29,340 So where do I put four? 1405 01:09:29,340 --> 01:09:31,840 Like four can't stay on the same music stand. 1406 01:09:31,840 --> 01:09:33,479 You know I can't cheat and do that. 1407 01:09:33,479 --> 01:09:37,440 So I could move eight over, move six over, move two over, 1408 01:09:37,440 --> 01:09:38,460 that's a lot of work. 1409 01:09:38,460 --> 01:09:41,921 Or I could just, fine, four was already out of order in the first place. 1410 01:09:41,921 --> 01:09:43,170 Let's just put four over here. 1411 01:09:43,170 --> 01:09:45,254 It's not made the problem any fundamentally worse, 1412 01:09:45,254 --> 01:09:46,795 because they started in random order. 1413 01:09:46,795 --> 01:09:47,970 I randomly put four there. 1414 01:09:47,970 --> 01:09:51,840 The key is, one is in really good shape now. 1415 01:09:51,840 --> 01:09:54,754 Let me repeat this process of selecting the smallest element. 1416 01:09:54,754 --> 01:09:56,170 Let me look for the next smallest. 1417 01:09:56,170 --> 01:09:58,020 Two is pretty small. 1418 01:09:58,020 --> 01:09:59,550 Small, smallest, yep. 1419 01:09:59,550 --> 01:10:00,720 OK I found it. 1420 01:10:00,720 --> 01:10:01,840 Two is the smallest. 1421 01:10:01,840 --> 01:10:04,290 Let's put him co-incidentally right where he goes. 1422 01:10:04,290 --> 01:10:08,670 So that was kind of a waste of time, but verification that it's correct. 1423 01:10:08,670 --> 01:10:09,750 Let's keep looking. 1424 01:10:09,750 --> 01:10:11,370 Six is pretty small now. 1425 01:10:11,370 --> 01:10:13,410 Eight is not, oh four is smaller. 1426 01:10:13,410 --> 01:10:14,700 Let's grab four. 1427 01:10:14,700 --> 01:10:16,224 Oh Three is even better. 1428 01:10:16,224 --> 01:10:16,890 Let's grab that. 1429 01:10:16,890 --> 01:10:17,810 Seven, five. 1430 01:10:17,810 --> 01:10:20,830 OK 3 needs to be put in its place. 1431 01:10:20,830 --> 01:10:23,500 I don't want to touch one and two, because those are good. 1432 01:10:23,500 --> 01:10:25,570 Six, sorry you don't belong there. 1433 01:10:25,570 --> 01:10:28,470 Let's go ahead and evict you and put six over there. 1434 01:10:28,470 --> 01:10:32,910 So if I repeat this process of just continually iterating, or looping, 1435 01:10:32,910 --> 01:10:36,420 if you will, through the list, looking for the smallest element, 1436 01:10:36,420 --> 01:10:41,220 and evicting whatever is in the place that it belongs, 1437 01:10:41,220 --> 01:10:44,370 notice that I've sorted it seemingly faster. 1438 01:10:44,370 --> 01:10:46,170 But I actually got a little lucky there. 1439 01:10:46,170 --> 01:10:50,700 It turns out that this algorithm, which also has a name called selection sort, 1440 01:10:50,700 --> 01:10:52,570 is also quadratic in nature. 1441 01:10:52,570 --> 01:10:55,830 If you've got n elements, it's going to take roughly n times n total 1442 01:10:55,830 --> 01:10:58,020 steps in order to achieve that. 1443 01:10:58,020 --> 01:10:59,100 Not in every case. 1444 01:10:59,100 --> 01:11:01,470 But in the worst case, so to speak. 1445 01:11:01,470 --> 01:11:05,520 So it turns out this is just two ways now to sort elements. 1446 01:11:05,520 --> 01:11:10,511 And there are dozens, if not more, other ways to actually sort elements, 1447 01:11:10,511 --> 01:11:13,260 with each one getting maybe a little better, maybe a little worse, 1448 01:11:13,260 --> 01:11:15,660 maybe a little better depending on the inputs. 1449 01:11:15,660 --> 01:11:17,760 So sometimes it really depends. 1450 01:11:17,760 --> 01:11:21,030 But these are all examples then of algorithms. 1451 01:11:21,030 --> 01:11:24,420 And algorithms have performance associated with them. 1452 01:11:24,420 --> 01:11:27,010 Algorithms have a running time associated with them. 1453 01:11:27,010 --> 01:11:29,730 And one of the things that theoretical computer scientists do, 1454 01:11:29,730 --> 01:11:31,950 is not only design algorithms like these, 1455 01:11:31,950 --> 01:11:35,050 or really even way more sophisticated algorithms than these, 1456 01:11:35,050 --> 01:11:36,180 but also analyze them. 1457 01:11:36,180 --> 01:11:39,240 So that we can make arguments, mathematical arguments, sometimes 1458 01:11:39,240 --> 01:11:42,820 that state as a proof, this algorithm is correct. 1459 01:11:42,820 --> 01:11:46,020 No matter what, this will sort your elements correctly, 1460 01:11:46,020 --> 01:11:48,180 if implemented correctly in a computer, and this 1461 01:11:48,180 --> 01:11:53,070 is how much time it will take in general in order to achieve that. 1462 01:11:53,070 --> 01:11:57,120 But the takeaway here, is that it was a lot of work, right. 1463 01:11:57,120 --> 01:11:59,520 I had the luxury with that phone book, long ago, 1464 01:11:59,520 --> 01:12:02,280 of just opening it up to the middle and looking for Mike Smith, 1465 01:12:02,280 --> 01:12:05,370 and da da da da da, found Mike Smith pretty darn efficiently, 1466 01:12:05,370 --> 01:12:07,950 because Verizon, or whoever made the phone book, 1467 01:12:07,950 --> 01:12:11,380 actually did all of the upfront work for me. 1468 01:12:11,380 --> 01:12:12,690 So here's a question then. 1469 01:12:12,690 --> 01:12:16,200 Suppose that you have a whole bunch of random values, 1470 01:12:16,200 --> 01:12:19,020 and you want to search for something specific among those values. 1471 01:12:19,020 --> 01:12:22,950 Mike Smith, number 50, whatever it is you're looking for, 1472 01:12:22,950 --> 01:12:27,960 should you sort those values first before searching? 1473 01:12:27,960 --> 01:12:31,030 Should you sort the values first? 1474 01:12:31,030 --> 01:12:34,200 Well, I would wager you should sort them if you're 1475 01:12:34,200 --> 01:12:36,730 going to be searching over them pretty darn often. 1476 01:12:36,730 --> 01:12:37,230 Right. 1477 01:12:37,230 --> 01:12:39,240 Because it might be painful, might be boring, 1478 01:12:39,240 --> 01:12:42,480 might be tedious to actually sort those values either manually, 1479 01:12:42,480 --> 01:12:46,170 or algorithmically with a computer, and a computer program. 1480 01:12:46,170 --> 01:12:48,760 But it'll probably pay off over time. 1481 01:12:48,760 --> 01:12:52,080 So there's this notion of amortization of the cost over time, whereby yeah, it 1482 01:12:52,080 --> 01:12:53,746 might take you a decent amount of steps. 1483 01:12:53,746 --> 01:12:56,830 Quadratic, seconds, minutes, whatever you want to measure it in. 1484 01:12:56,830 --> 01:12:59,050 But, it's going to pay off in the long run. 1485 01:12:59,050 --> 01:12:59,100 Of 1486 01:12:59,100 --> 01:13:00,974 Course, if you add data, you're going to need 1487 01:13:00,974 --> 01:13:02,710 to keep that new data sorted as well. 1488 01:13:02,710 --> 01:13:04,202 But again, it's this trade off. 1489 01:13:04,202 --> 01:13:07,160 However, consider a scenario where you just have a whole bunch of data, 1490 01:13:07,160 --> 01:13:09,120 it's in random order for whatever reason, 1491 01:13:09,120 --> 01:13:10,705 and you just want to find Mike Smith. 1492 01:13:10,705 --> 01:13:12,330 Or you just want to find the number 50. 1493 01:13:12,330 --> 01:13:14,589 And after that, you do not care about the data. 1494 01:13:14,589 --> 01:13:16,380 You're never going to look up someone else. 1495 01:13:16,380 --> 01:13:18,213 You're never going to look up another value. 1496 01:13:18,213 --> 01:13:20,940 Maybe then, in that case, you should just blindly 1497 01:13:20,940 --> 01:13:23,190 go through it brute force, so to speak. 1498 01:13:23,190 --> 01:13:26,760 Left or right, in some sense, or linearly, as opposed 1499 01:13:26,760 --> 01:13:29,730 to trying to do anything clever like we did with the phone book. 1500 01:13:29,730 --> 01:13:34,290 Because it's faster to just brute force your way through the solution, 1501 01:13:34,290 --> 01:13:38,550 than doing some upfront sophistication just to improve the subsequent result. 1502 01:13:38,550 --> 01:13:40,290 So again, it's a trade off. 1503 01:13:40,290 --> 01:13:43,950 It's a cost benefit analysis that you need to decide, ultimately, 1504 01:13:43,950 --> 01:13:45,300 on what is most important. 1505 01:13:45,300 --> 01:13:51,420 Which resource, time, space, money, people, is most important to you. 1506 01:13:51,420 --> 01:13:54,150 So in all of these examples thus far have we 1507 01:13:54,150 --> 01:13:57,180 assumed that the data is contiguous. 1508 01:13:57,180 --> 01:13:59,079 Back to back to back with other values. 1509 01:13:59,079 --> 01:14:01,620 And indeed, that's exactly what the music stands represented, 1510 01:14:01,620 --> 01:14:03,930 were these fixed locations in memory. 1511 01:14:03,930 --> 01:14:07,320 Bytes in your computer's ram, so to speak, that could have values. 1512 01:14:07,320 --> 01:14:11,070 But you had to physically move those values from one location to another 1513 01:14:11,070 --> 01:14:13,110 if you wanted to sort the values therein. 1514 01:14:13,110 --> 01:14:15,699 you couldn't just kind of insert a new music stand, 1515 01:14:15,699 --> 01:14:18,240 and push the others aside, because that's simply not allowed. 1516 01:14:18,240 --> 01:14:20,550 That's not the metaphor in question. 1517 01:14:20,550 --> 01:14:23,010 Those music stands were fixed just as the locations 1518 01:14:23,010 --> 01:14:26,030 in memory of your computer are as well. 1519 01:14:26,030 --> 01:14:31,050 And as it turns out, just as we drew a picture before wherein we had some 1520 01:14:31,050 --> 01:14:36,210 sequence of back to back values in memory, turns out that in programming, 1521 01:14:36,210 --> 01:14:40,080 actually using languages that aren't just pseudo code, but actual code, 1522 01:14:40,080 --> 01:14:41,430 Java and c++ and others. 1523 01:14:41,430 --> 01:14:45,900 Still, there is very often something called an array, or a list. 1524 01:14:45,900 --> 01:14:50,340 A data structure that gives you the appearance of having values 1525 01:14:50,340 --> 01:14:52,140 back to back to back to back. 1526 01:14:52,140 --> 01:14:54,810 And in some languages, these values are indeed 1527 01:14:54,810 --> 01:14:59,310 next to each other in terms of their bits inside of your computer's memory. 1528 01:14:59,310 --> 01:15:01,590 But there's a problem with a world like this. 1529 01:15:01,590 --> 01:15:05,460 Because if you put in a whole bunch of values up front, as I did earlier, 1530 01:15:05,460 --> 01:15:08,200 and you want to actually insert some new value, 1531 01:15:08,200 --> 01:15:10,680 where do you actually put that value? 1532 01:15:10,680 --> 01:15:15,540 For instance, earlier we had one and 50 and 51 and 61 and 109 and 121. 1533 01:15:15,540 --> 01:15:20,820 Suppose I want to put the number 55 in this sequence of numbers. 1534 01:15:20,820 --> 01:15:25,320 And better still, I want to keep the whole list sorted. 1535 01:15:25,320 --> 01:15:26,190 Well dammit. 1536 01:15:26,190 --> 01:15:28,800 There's really not room for 55 in there. 1537 01:15:28,800 --> 01:15:32,100 I know where it should go numerically, it should go right here, of course, 1538 01:15:32,100 --> 01:15:34,320 between the 51 and the 61. 1539 01:15:34,320 --> 01:15:35,320 But there's no room. 1540 01:15:35,320 --> 01:15:38,070 Or I could put it arbitrarily over here, but then it's not sorted. 1541 01:15:38,070 --> 01:15:39,778 I could put it, well I can't put it there 1542 01:15:39,778 --> 01:15:42,130 because the dot dot dot remember from earlier, suggested 1543 01:15:42,130 --> 01:15:43,380 that something else was there. 1544 01:15:43,380 --> 01:15:45,300 Something else from your program or computer 1545 01:15:45,300 --> 01:15:46,836 was already using that location. 1546 01:15:46,836 --> 01:15:49,210 Now I could put it maybe below, but that's kind of weird. 1547 01:15:49,210 --> 01:15:52,710 The whole point here is this contiguousness. 1548 01:15:52,710 --> 01:15:55,500 And indeed, with the music stands a moment ago, 1549 01:15:55,500 --> 01:15:58,380 was it advantageous for me to be able to take one step 1550 01:15:58,380 --> 01:15:59,910 and be at another music stand? 1551 01:15:59,910 --> 01:16:01,950 One more step be at another music stand. 1552 01:16:01,950 --> 01:16:04,830 I didn't have to walk all around stage looking for my values, 1553 01:16:04,830 --> 01:16:06,705 because they were indeed back to back to back 1554 01:16:06,705 --> 01:16:10,540 and that made things very efficient and predictably close to one another. 1555 01:16:10,540 --> 01:16:11,520 So what are my options? 1556 01:16:11,520 --> 01:16:13,460 Like surely a computer can do this. 1557 01:16:13,460 --> 01:16:16,630 It's not the case that computers are so dumb that once you write your values 1558 01:16:16,630 --> 01:16:19,860 you can't solve any new problems. 1559 01:16:19,860 --> 01:16:22,470 So what are my options? 1560 01:16:22,470 --> 01:16:25,700 I could put the 55 here at the end, but then I'm 1561 01:16:25,700 --> 01:16:27,860 going to have to reshuffle all of the numbers. 1562 01:16:27,860 --> 01:16:31,009 Or maybe I ask the computer for more memory altogether, 1563 01:16:31,009 --> 01:16:33,050 and you know what, I re do this, and I say let me 1564 01:16:33,050 --> 01:16:40,400 put one here, 50 here, 51 here, 55 here, then 61, 109, and 121. 1565 01:16:40,400 --> 01:16:43,190 In other words, why don't I just use different memory? 1566 01:16:43,190 --> 01:16:46,460 Different bytes in my computer's memory, but that feels kind of lame, 1567 01:16:46,460 --> 01:16:50,100 because now I need twice as much memory, at least temporarily. 1568 01:16:50,100 --> 01:16:52,250 So I can kind of copy the old values into the new, 1569 01:16:52,250 --> 01:16:54,680 but leave room for that new value. 1570 01:16:54,680 --> 01:16:58,070 So that seems a little time consuming, and tedious certainly. 1571 01:16:58,070 --> 01:17:00,900 And indeed, it would be for a computer as well. 1572 01:17:00,900 --> 01:17:03,710 But maybe there's another solution altogether. 1573 01:17:03,710 --> 01:17:06,530 And here too, we're experiencing yet another trade off. 1574 01:17:06,530 --> 01:17:09,830 The contiguousness of my memory has thus far been an advantage. 1575 01:17:09,830 --> 01:17:13,070 Because again, each of my music stands just one step away. 1576 01:17:13,070 --> 01:17:16,490 In the case of actual ram, each number is just one byte away, 1577 01:17:16,490 --> 01:17:20,510 and that lends itself to again, random access, ergo random access memory. 1578 01:17:20,510 --> 01:17:23,150 But it's kind of painting me into a corner, 1579 01:17:23,150 --> 01:17:26,840 because I'm finding that now I can no longer really 1580 01:17:26,840 --> 01:17:30,060 add new values without incurring significant cost. 1581 01:17:30,060 --> 01:17:35,180 And by significant cost I mean having to duplicate the entire array of memory 1582 01:17:35,180 --> 01:17:38,840 into a new location, leaving one additional space for that new value, 1583 01:17:38,840 --> 01:17:43,700 which is going to take as many steps as there are values in the original array. 1584 01:17:43,700 --> 01:17:46,940 So you know what, maybe I take an even older school approach, 1585 01:17:46,940 --> 01:17:49,730 and maybe I do something proactive. 1586 01:17:49,730 --> 01:17:50,960 A little defensive. 1587 01:17:50,960 --> 01:17:57,590 I'll put one there, and 50 here, and 51 here, and 61 and then 109, 1588 01:17:57,590 --> 01:17:59,510 and then 121 and so forth. 1589 01:17:59,510 --> 01:18:01,970 And wow, this was clever of me, because now I 1590 01:18:01,970 --> 01:18:04,620 left myself little gaps in my values. 1591 01:18:04,620 --> 01:18:08,750 So that I still get the predictability of every value is now two steps away, 1592 01:18:08,750 --> 01:18:13,060 but now I can fit 55 in here. 1593 01:18:13,060 --> 01:18:16,490 Well, clever as you might initially think that, now we've 1594 01:18:16,490 --> 01:18:17,900 created kind of a problem. 1595 01:18:17,900 --> 01:18:22,730 Because most values are two bytes away from each other. 1596 01:18:22,730 --> 01:18:25,670 But now there's this weirdness where 55 is only one byte away 1597 01:18:25,670 --> 01:18:26,490 from its neighbors. 1598 01:18:26,490 --> 01:18:29,330 So that's really just complicating things now. 1599 01:18:29,330 --> 01:18:32,570 I can no longer predictably take just one step at a time. 1600 01:18:32,570 --> 01:18:34,490 Sometimes I take two, sometimes I take one, 1601 01:18:34,490 --> 01:18:39,350 and that just feels like it's devolving into a very confusing bug prone 1602 01:18:39,350 --> 01:18:40,760 situation already. 1603 01:18:40,760 --> 01:18:44,679 And in fact, if you ever saw, or maybe grew up with the basic programming 1604 01:18:44,679 --> 01:18:46,970 language, back in the day, you didn't number your lines 1605 01:18:46,970 --> 01:18:48,840 of code zero, one, two, three, four. 1606 01:18:48,840 --> 01:18:51,170 If you numbered of them at all, you instead 1607 01:18:51,170 --> 01:18:55,610 did something like line ten, line 20, line 30, line 40. 1608 01:18:55,610 --> 01:18:59,630 Which wonderfully left your room for like nine more additions later on, 1609 01:18:59,630 --> 01:19:01,670 without having to renumber everything. 1610 01:19:01,670 --> 01:19:04,790 Of course now, programs just number or lines automatically for us. 1611 01:19:04,790 --> 01:19:07,530 And they're not even strictly necessary, those line numbers. 1612 01:19:07,530 --> 01:19:11,480 So we've tried this, and it's just, it's not really a good solution. 1613 01:19:11,480 --> 01:19:15,590 Better would be I'll claim some dynamism. 1614 01:19:15,590 --> 01:19:21,410 What if instead I consider my computer's memory 1615 01:19:21,410 --> 01:19:25,080 still to be this grid of locations. 1616 01:19:25,080 --> 01:19:26,480 But let me do this. 1617 01:19:26,480 --> 01:19:28,530 Let me propose that the first number. 1618 01:19:28,530 --> 01:19:31,712 I'm going to put in, maybe is here, and just to keep things prettier I'm not 1619 01:19:31,712 --> 01:19:33,170 going to draw the whole grid again. 1620 01:19:33,170 --> 01:19:35,810 But let's assume that one byte I'm using is right there, 1621 01:19:35,810 --> 01:19:37,460 and I put the number one in there. 1622 01:19:37,460 --> 01:19:41,660 And then eventually, I decide to add that second number. 1623 01:19:41,660 --> 01:19:45,410 And I go ahead and add the number 50. 1624 01:19:45,410 --> 01:19:51,110 And the number 50 just so happens to end up, oh I don't know, over here. 1625 01:19:51,110 --> 01:19:52,480 And it's farther away in memory. 1626 01:19:52,480 --> 01:19:54,815 Maybe enough time has passed that the bunch 1627 01:19:54,815 --> 01:19:57,940 of memory in between those two values is being used for some other purpose. 1628 01:19:57,940 --> 01:19:59,140 And that's OK. 1629 01:19:59,140 --> 01:20:02,050 But the 50 and the one are no longer next to each 1630 01:20:02,050 --> 01:20:06,820 other, or therefore pictorially, or really related, 1631 01:20:06,820 --> 01:20:09,250 unless I somehow interconnect them. 1632 01:20:09,250 --> 01:20:11,890 So for now, let me go ahead and draw an arrow. 1633 01:20:11,890 --> 01:20:15,070 And let me kind of weave together my values. 1634 01:20:15,070 --> 01:20:19,120 Much like popcorn on a thread, or any number of other metaphors 1635 01:20:19,120 --> 01:20:21,100 where you're linking things together. 1636 01:20:21,100 --> 01:20:24,370 Like with a chain, for instance, can you make a list like this? 1637 01:20:24,370 --> 01:20:27,010 Because my other numbers, suppose that more time passes. 1638 01:20:27,010 --> 01:20:31,150 I insert a third value like 51, just happens to be over there, whatever. 1639 01:20:31,150 --> 01:20:32,900 It's not contiguous, which is unfortunate. 1640 01:20:32,900 --> 01:20:37,750 But that's OK, so long as I somehow link these together like this. 1641 01:20:37,750 --> 01:20:41,560 And then maybe that next number here, 61, get lucky and it's pretty close. 1642 01:20:41,560 --> 01:20:42,610 So it's over here. 1643 01:20:42,610 --> 01:20:45,850 But we still need an arrow to this one here. 1644 01:20:45,850 --> 01:20:53,500 And then maybe we have maybe 121 and 109, each of which 1645 01:20:53,500 --> 01:20:55,180 is in some different locations. 1646 01:20:55,180 --> 01:20:57,520 So we might need this arrow here. 1647 01:20:57,520 --> 01:20:59,300 And this arrow there. 1648 01:20:59,300 --> 01:21:01,490 So it's kind of all over the place. 1649 01:21:01,490 --> 01:21:03,430 And if you ever play the game growing up, 1650 01:21:03,430 --> 01:21:05,138 it's kind of like chutes and ladders now, 1651 01:21:05,138 --> 01:21:08,140 you kind of have to like follow the chutes to get from one number 1652 01:21:08,140 --> 01:21:09,320 to another. 1653 01:21:09,320 --> 01:21:10,090 But that's OK. 1654 01:21:10,090 --> 01:21:13,850 In fact, it's actually quite beautiful, my handwriting aside. 1655 01:21:13,850 --> 01:21:15,610 It's not quite as efficient though. 1656 01:21:15,610 --> 01:21:20,260 Because it turns out that if you're at one of the numbers in this list, 1657 01:21:20,260 --> 01:21:24,240 and you want to jump to another, you can get to the next number pretty fast. 1658 01:21:24,240 --> 01:21:26,340 If I'm at one, I can get to 50 pretty quickly. 1659 01:21:26,340 --> 01:21:27,840 I just follow that arrow somehow. 1660 01:21:27,840 --> 01:21:30,173 I don't know how it works in memory, and we don't really 1661 01:21:30,173 --> 01:21:31,625 need to know how it works. 1662 01:21:31,625 --> 01:21:34,750 We can kind of abstract that away for the sake of discussion at the moment. 1663 01:21:34,750 --> 01:21:38,350 But I can get to 50 based on this picture alone pretty darn directly. 1664 01:21:38,350 --> 01:21:41,410 But I can't really get to like 109 pretty quickly. 1665 01:21:41,410 --> 01:21:44,770 If I'm starting at one, I want to get to 109. 1666 01:21:44,770 --> 01:21:49,180 previously, I could just take one two, three, four steps away, four bytes away 1667 01:21:49,180 --> 01:21:52,420 and I could just take a step that's like four times as big as usual 1668 01:21:52,420 --> 01:21:54,040 and I'm instantly there. 1669 01:21:54,040 --> 01:21:57,234 But these arrows seem to suggest that the memory could be anywhere. 1670 01:21:57,234 --> 01:21:58,900 And so it's almost like following a map. 1671 01:21:58,900 --> 01:22:01,810 If I'm at one, all right let me go find 50. 1672 01:22:01,810 --> 01:22:04,180 Once I found 50, I can pick up another map, 1673 01:22:04,180 --> 01:22:08,260 and now I can follow another arrow to 51, which is maybe over here. 1674 01:22:08,260 --> 01:22:09,380 Now I pick up another map. 1675 01:22:09,380 --> 01:22:10,780 Oh here's 61. 1676 01:22:10,780 --> 01:22:13,810 OK now this leads me to, Ah here is 109. 1677 01:22:13,810 --> 01:22:16,300 It wasn't as simple as just one big step that's 1678 01:22:16,300 --> 01:22:21,580 four times as big as was possible in the case of contiguous memory. 1679 01:22:21,580 --> 01:22:24,506 Now, I kind of have to weave my way through. 1680 01:22:24,506 --> 01:22:25,630 That actually kind of hurt. 1681 01:22:25,630 --> 01:22:29,630 Weave my way through my computer's memory to get where I'm going. 1682 01:22:29,630 --> 01:22:30,940 So it's a trade off, right. 1683 01:22:30,940 --> 01:22:33,820 Because now I have perfect dynamism. 1684 01:22:33,820 --> 01:22:36,820 No matter where there is available memory in my computer, 1685 01:22:36,820 --> 01:22:39,590 over there, over here, over here, over here, I can use it. 1686 01:22:39,590 --> 01:22:42,950 And I can just kind of stitch together this data structure, 1687 01:22:42,950 --> 01:22:44,740 the structure for my data. 1688 01:22:44,740 --> 01:22:50,630 I have eliminated the gotcha that my data structure is a fixed size. 1689 01:22:50,630 --> 01:22:54,130 So now if I want to add another value, suppose I want to add 55. 1690 01:22:54,130 --> 01:22:58,120 OK, suppose that 55 happens to have some available space here. 1691 01:22:58,120 --> 01:23:01,690 You know what I can do here, I can just get rid of this arrow, 1692 01:23:01,690 --> 01:23:07,720 and then I can go ahead and stitch this in like this. 1693 01:23:07,720 --> 01:23:08,391 No big deal. 1694 01:23:08,391 --> 01:23:10,390 I don't have to touch any of the other elements. 1695 01:23:10,390 --> 01:23:12,100 But think back to the array. 1696 01:23:12,100 --> 01:23:13,990 The array being the rectangular example where 1697 01:23:13,990 --> 01:23:15,448 everybody was back to back to back. 1698 01:23:15,448 --> 01:23:19,480 That I had to make room for things, or duplicate all of it. 1699 01:23:19,480 --> 01:23:23,410 Had to do more work here, just to allocate space for your 55, 1700 01:23:23,410 --> 01:23:25,870 and then go ahead and just update two of these arrows, 1701 01:23:25,870 --> 01:23:30,650 or pointers, or references as that might be called in different languages. 1702 01:23:30,650 --> 01:23:32,510 So that's pretty powerful. 1703 01:23:32,510 --> 01:23:34,780 But again I've lost the random access, which 1704 01:23:34,780 --> 01:23:38,870 means I can't do things like binary search anymore. 1705 01:23:38,870 --> 01:23:41,080 Which we're predicated on simple arithmetic. 1706 01:23:41,080 --> 01:23:44,110 Divide by two, divide by two, divide by two, Maybe round, 1707 01:23:44,110 --> 01:23:47,410 but divide by two each time. 1708 01:23:47,410 --> 01:23:48,580 And these arrows. 1709 01:23:48,580 --> 01:23:52,480 Kind of abstracting maybe too generously here. 1710 01:23:52,480 --> 01:23:55,420 These arrows are going to cost me some memory too. 1711 01:23:55,420 --> 01:23:58,900 Now I've drawn them prettily as arrows on the screen, 1712 01:23:58,900 --> 01:24:01,420 but those actually themselves are some kind of value. 1713 01:24:01,420 --> 01:24:06,340 And it turns out that those arrows themselves take up space, take up bits. 1714 01:24:06,340 --> 01:24:09,730 So I've kind of doubled, let's say, the amount of space 1715 01:24:09,730 --> 01:24:11,440 necessary to store this data. 1716 01:24:11,440 --> 01:24:13,881 Because I need to somehow store and memory those arrows. 1717 01:24:13,881 --> 01:24:15,880 Underneath the hood those arrows are effectively 1718 01:24:15,880 --> 01:24:17,350 stored as numbers themselves. 1719 01:24:17,350 --> 01:24:20,080 Which is to say, they take up at least a byte or more. 1720 01:24:20,080 --> 01:24:24,402 In fact, on most systems, they take up four or eight bytes still. 1721 01:24:24,402 --> 01:24:26,110 So it depends on what's important to you. 1722 01:24:26,110 --> 01:24:29,670 Do you want to be able to search the data super efficiently 1723 01:24:29,670 --> 01:24:32,970 and actually have random access, and do something like binary search 1724 01:24:32,970 --> 01:24:36,375 and get that divide and conquer upside? 1725 01:24:36,375 --> 01:24:39,000 Or do you want to have dynamism, and be able to grow and shrink 1726 01:24:39,000 --> 01:24:43,950 the data structure super fast, but pay a price in terms of memory as well. 1727 01:24:43,950 --> 01:24:45,960 And use more of those bits. 1728 01:24:45,960 --> 01:24:48,480 It really quite depends. 1729 01:24:48,480 --> 01:24:51,390 Now these aren't the only data structures at our disposal. 1730 01:24:51,390 --> 01:24:54,630 We can use not just arrays, not just linked lists, 1731 01:24:54,630 --> 01:24:56,880 as we'll call that last one, whereby we have 1732 01:24:56,880 --> 01:25:00,450 links tying these lists items together. 1733 01:25:00,450 --> 01:25:02,320 But we have maybe tree structures too. 1734 01:25:02,320 --> 01:25:05,760 In fact, we think back to like your family tree that you 1735 01:25:05,760 --> 01:25:07,437 might have made in grade school. 1736 01:25:07,437 --> 01:25:09,270 You might have drawn a little something that 1737 01:25:09,270 --> 01:25:12,510 might have grandma or grandpa at the top of the tree, 1738 01:25:12,510 --> 01:25:15,330 and then their children might be down here, 1739 01:25:15,330 --> 01:25:18,610 and then these children might be down here, and so forth. 1740 01:25:18,610 --> 01:25:20,820 Well if we generalize away from a fam-- 1741 01:25:20,820 --> 01:25:21,730 OK ignore that one. 1742 01:25:21,730 --> 01:25:24,060 If we generalize away from a family tree, 1743 01:25:24,060 --> 01:25:27,150 it turns out that this tree data structure is another kind 1744 01:25:27,150 --> 01:25:29,610 of incarnation of that previous idea. 1745 01:25:29,610 --> 01:25:34,080 A linked list doesn't have to just be linked from left to right, so to speak, 1746 01:25:34,080 --> 01:25:36,420 even if it's kind of swooping all over the screen. 1747 01:25:36,420 --> 01:25:38,430 It doesn't have to have a start and an end. 1748 01:25:38,430 --> 01:25:42,870 It can be a tree structure, whereby your arrows actually, much 1749 01:25:42,870 --> 01:25:45,710 like a branch and a program, can go in different directions. 1750 01:25:45,710 --> 01:25:49,920 So each of these edges in this graph actually are directional. 1751 01:25:49,920 --> 01:25:51,790 They take you from one place to another. 1752 01:25:51,790 --> 01:25:56,380 You can imagine using a data structure like this to store values as well. 1753 01:25:56,380 --> 01:26:02,320 For instance, suppose that I wanted to store numbers in this data structure. 1754 01:26:02,320 --> 01:26:04,320 Instead of using squares, I'm just doing circles 1755 01:26:04,320 --> 01:26:05,778 now just because it's conventional. 1756 01:26:05,778 --> 01:26:06,990 Doesn't mean anything else. 1757 01:26:06,990 --> 01:26:07,870 Anything special. 1758 01:26:07,870 --> 01:26:11,290 These are still just bytes underneath the hood, or ultimately bits. 1759 01:26:11,290 --> 01:26:16,320 Now suppose I want to store the number two here, and one here, and three 1760 01:26:16,320 --> 01:26:20,490 here, and five here, and six here, and seven here. 1761 01:26:20,490 --> 01:26:23,250 This is all very deliberate. 1762 01:26:23,250 --> 01:26:25,890 Because notice the pattern that I've kind of adopted. 1763 01:26:25,890 --> 01:26:29,660 Four is at the top, in the so-called root of the tree. 1764 01:26:29,660 --> 01:26:33,960 What do you notice about all of the values to the left of the four? 1765 01:26:33,960 --> 01:26:37,560 All of its left descendants so to speak, to borrow that family tree 1766 01:26:37,560 --> 01:26:39,530 nomenclature. 1767 01:26:39,530 --> 01:26:41,990 One, and two, and three. 1768 01:26:41,990 --> 01:26:46,400 Curiously, all three of those are smaller than the number four. 1769 01:26:46,400 --> 01:26:49,370 What about all of its right descendants, down this branch? 1770 01:26:49,370 --> 01:26:51,710 Six and five and six and seven. 1771 01:26:51,710 --> 01:26:53,510 Those are all bigger than four. 1772 01:26:53,510 --> 01:26:55,400 That's kind of neat. 1773 01:26:55,400 --> 01:26:56,570 How about here? 1774 01:26:56,570 --> 01:26:59,360 Two, this is kind of a mini tree, if you will. 1775 01:26:59,360 --> 01:27:01,730 This is a new route, if you ignore everything above it. 1776 01:27:01,730 --> 01:27:04,059 Two has two children, left and right. 1777 01:27:04,059 --> 01:27:06,350 Notice that the one is less than the two, and the three 1778 01:27:06,350 --> 01:27:07,310 is more than the two. 1779 01:27:07,310 --> 01:27:09,930 So this is not a coincidence that I drew it like this. 1780 01:27:09,930 --> 01:27:13,450 Similarly, if you think of this as a tree, just those three nodes below it, 1781 01:27:13,450 --> 01:27:16,430 six is bigger than five, six is less than seven. 1782 01:27:16,430 --> 01:27:18,050 So there's a pattern. 1783 01:27:18,050 --> 01:27:24,200 So we can actually reclaim some of the capabilities of divide 1784 01:27:24,200 --> 01:27:26,330 and conquer by laying out our data, still 1785 01:27:26,330 --> 01:27:29,690 with these arrows, these pointers or references if you will. 1786 01:27:29,690 --> 01:27:33,110 But not just doing it linearly, or swiveling all over the screen, 1787 01:27:33,110 --> 01:27:37,430 but from one start node to an end node. 1788 01:27:37,430 --> 01:27:42,110 Where node is a fancy word for these squares or these circles. 1789 01:27:42,110 --> 01:27:45,870 What if we instead allow the user to go in two different directions 1790 01:27:45,870 --> 01:27:46,941 when searching for data? 1791 01:27:46,941 --> 01:27:49,190 And suppose we leverage that same phone book principle 1792 01:27:49,190 --> 01:27:52,700 where the smaller values are this way, and the bigger values are that way. 1793 01:27:52,700 --> 01:27:54,470 What does that afford us? 1794 01:27:54,470 --> 01:27:59,120 Well, this binary search tree, if you will, to give it a fancy name, 1795 01:27:59,120 --> 01:28:02,540 actually gives us back a logarithmic running time. 1796 01:28:02,540 --> 01:28:05,180 Much like dividing and conquering a phone book. 1797 01:28:05,180 --> 01:28:09,050 Because indeed, if I start at the four and I'm looking for seven, 1798 01:28:09,050 --> 01:28:13,310 and I immediately realize four is here, which means seven must be to the right 1799 01:28:13,310 --> 01:28:18,800 if it's present at all, I can pretty much chop off the half of the problem 1800 01:28:18,800 --> 01:28:20,240 that I know seven is not in. 1801 01:28:20,240 --> 01:28:22,698 So it's like throwing away the left half of the phone book. 1802 01:28:22,698 --> 01:28:25,850 I can ignore almost half of the nodes in the tree 1803 01:28:25,850 --> 01:28:29,330 by just snipping off that left branch. 1804 01:28:29,330 --> 01:28:31,380 Can't be that easy. 1805 01:28:31,380 --> 01:28:32,800 Got to be a trade off. 1806 01:28:32,800 --> 01:28:35,930 What it that trade off here? 1807 01:28:35,930 --> 01:28:38,240 Well, from the looks of this picture, pretty 1808 01:28:38,240 --> 01:28:43,220 though it is, relatively speaking, it looks like now each of my nodes, 1809 01:28:43,220 --> 01:28:47,150 each of my values has not just one arrow associated with it potentially, 1810 01:28:47,150 --> 01:28:49,250 but as many as two. 1811 01:28:49,250 --> 01:28:52,740 And that means we're spending twice as much space as before. 1812 01:28:52,740 --> 01:28:56,030 So whereas in our array, we use just one byte per value, 1813 01:28:56,030 --> 01:28:59,660 or four bytes per value, or some fixed number of bytes per value, 1814 01:28:59,660 --> 01:29:01,670 in my linked list example I had to double 1815 01:29:01,670 --> 01:29:03,470 that because I had to also store-- 1816 01:29:03,470 --> 01:29:05,060 using bytes in some form-- 1817 01:29:05,060 --> 01:29:05,840 those arrows. 1818 01:29:05,840 --> 01:29:07,920 Those pointers, or references as they're called. 1819 01:29:07,920 --> 01:29:11,960 But now in this darn tree, I need three times as much data, 1820 01:29:11,960 --> 01:29:14,730 because each of my nodes might have a left child so to speak, 1821 01:29:14,730 --> 01:29:16,370 or a right child. 1822 01:29:16,370 --> 01:29:18,230 So again it's a trade off. 1823 01:29:18,230 --> 01:29:23,242 And the prevailing trade off here, thus far, seems to be if you want less time, 1824 01:29:23,242 --> 01:29:24,950 you're going to have to spend more space. 1825 01:29:24,950 --> 01:29:29,510 Or rather, less time, I'm getting the visual metaphor wrong here. 1826 01:29:29,510 --> 01:29:33,290 If you want to spend less time, you're going to have to spend more space. 1827 01:29:33,290 --> 01:29:38,180 And if you want to save space, you might have to tolerate a little more time. 1828 01:29:38,180 --> 01:29:39,890 But can we do better? 1829 01:29:39,890 --> 01:29:44,960 Indeed, can we combine some of the best features of multiple data structures 1830 01:29:44,960 --> 01:29:48,110 in order to achieve something better still? 1831 01:29:48,110 --> 01:29:48,890 Well, sort of. 1832 01:29:48,890 --> 01:29:52,400 It turns out that there exists a very popular data 1833 01:29:52,400 --> 01:29:54,710 structure called a hash table. 1834 01:29:54,710 --> 01:29:57,950 And a hash table is a data structure that a computer scientist 1835 01:29:57,950 --> 01:30:01,940 will use when he or she really wants to store a dynamic amount of data. 1836 01:30:01,940 --> 01:30:03,350 It might grow or might shrink. 1837 01:30:03,350 --> 01:30:06,740 But they also want to search that data pretty efficiently, 1838 01:30:06,740 --> 01:30:10,830 and ideally be able to find data instantly, in constant time, 1839 01:30:10,830 --> 01:30:11,730 so to speak. 1840 01:30:11,730 --> 01:30:15,320 So you'll often see a hash table implemented, really 1841 01:30:15,320 --> 01:30:18,350 with an array something like this, and I've drawn it vertically just 1842 01:30:18,350 --> 01:30:19,790 to make the picture prettier. 1843 01:30:19,790 --> 01:30:22,040 But again, these are just artists depictions. 1844 01:30:22,040 --> 01:30:24,920 Bad artists depictions of what's going on in memory. 1845 01:30:24,920 --> 01:30:27,065 It's still just an array of back to back values. 1846 01:30:27,065 --> 01:30:28,940 But the values now in the array are not going 1847 01:30:28,940 --> 01:30:31,670 to be the numbers I care about, or the words, or whatever. 1848 01:30:31,670 --> 01:30:36,260 It's actually going to be arrows, which again we've stipulated take space. 1849 01:30:36,260 --> 01:30:38,720 And each of these arrows is going to point 1850 01:30:38,720 --> 01:30:42,380 to what we described as a linked list. 1851 01:30:42,380 --> 01:30:46,880 So it turns out that a hash table will often 1852 01:30:46,880 --> 01:30:48,830 look a little something like this. 1853 01:30:48,830 --> 01:30:52,640 A combination of an array, and a linked list 1854 01:30:52,640 --> 01:30:57,410 where those linked lists might be not present at all, might be short, 1855 01:30:57,410 --> 01:31:00,410 might be long, totally depends. 1856 01:31:00,410 --> 01:31:05,670 But we decide where to put our values based on some property of the value. 1857 01:31:05,670 --> 01:31:09,260 So for instance, if this array weren't this big, but maybe 1858 01:31:09,260 --> 01:31:11,810 had 26 different locations in it. 1859 01:31:11,810 --> 01:31:15,350 And suppose that we're no longer storing numbers, we're storing names. 1860 01:31:15,350 --> 01:31:20,960 Well every name in the English alphabet starts with an A or B or C 1861 01:31:20,960 --> 01:31:24,500 or dot dot dot a Z. One of 26 possibilities. 1862 01:31:24,500 --> 01:31:29,720 So if my hash table has an array of size 26, 1863 01:31:29,720 --> 01:31:33,770 then maybe, if I'm storing names that start with A, 1864 01:31:33,770 --> 01:31:37,250 I could just put them in this linked list here, the first one. 1865 01:31:37,250 --> 01:31:39,590 And if they start with B, they'll go in the second one. 1866 01:31:39,590 --> 01:31:41,190 And if they start with C, the third. 1867 01:31:41,190 --> 01:31:44,630 D the fourth, Z the very last. 1868 01:31:44,630 --> 01:31:49,760 And what you'll find in this case is partial optimization. 1869 01:31:49,760 --> 01:31:53,120 It's not instantaneous to find someone with a given name. 1870 01:31:53,120 --> 01:31:55,580 If there's a lot of people who's names start with A, 1871 01:31:55,580 --> 01:31:59,000 you might still have to look through all of the A names linearly 1872 01:31:59,000 --> 01:32:00,350 through that list. 1873 01:32:00,350 --> 01:32:04,280 But at least you only have to look through 126th 1874 01:32:04,280 --> 01:32:06,770 of the possible values in your data structure, 1875 01:32:06,770 --> 01:32:08,240 assuming a uniform distribution. 1876 01:32:08,240 --> 01:32:11,570 Of course, that's not fair, there's probably fewer names that start with Z, 1877 01:32:11,570 --> 01:32:13,880 fewer names that start with Q, maybe a lot 1878 01:32:13,880 --> 01:32:16,610 that start with A or D or other such letters. 1879 01:32:16,610 --> 01:32:19,040 So your chains might be of variable length, 1880 01:32:19,040 --> 01:32:20,690 and maybe that's not the best design. 1881 01:32:20,690 --> 01:32:22,970 So maybe we should think a little harder about this. 1882 01:32:22,970 --> 01:32:27,180 So this principle of bucketizing things, and putting values in their place, 1883 01:32:27,180 --> 01:32:29,750 and using that as a stepping stone to getting 1884 01:32:29,750 --> 01:32:32,710 to a complete solution-- in this case of sorted order-- 1885 01:32:32,710 --> 01:32:36,560 is really useful not just in these increasingly complex examples, 1886 01:32:36,560 --> 01:32:38,190 but even in the real world. 1887 01:32:38,190 --> 01:32:40,880 So for instance, if you're a fan of playing cards, 1888 01:32:40,880 --> 01:32:43,100 you might have a deck of cards at home. 1889 01:32:43,100 --> 01:32:47,000 And those cards of course, have different values or numbers 1890 01:32:47,000 --> 01:32:48,650 on them, and also different suits. 1891 01:32:48,650 --> 01:32:50,990 Clubs and spades and diamonds and hearts. 1892 01:32:50,990 --> 01:32:53,870 And you know, sometimes you might want to either shuffle the deck, 1893 01:32:53,870 --> 01:32:56,280 or really un-shuffle it and actually sort it. 1894 01:32:56,280 --> 01:32:58,640 And if you did want to, sort of obsessively like me, 1895 01:32:58,640 --> 01:33:00,980 want to sort your deck of cards, well you 1896 01:33:00,980 --> 01:33:03,500 could just go through it one card at a time. 1897 01:33:03,500 --> 01:33:07,010 And then try to find or select or maybe bubble up the values, 1898 01:33:07,010 --> 01:33:08,480 like our algorithms before. 1899 01:33:08,480 --> 01:33:11,190 But most of us probably use a bit of intuition. 1900 01:33:11,190 --> 01:33:14,169 So if I see a spade here, I might put it in one pile, 1901 01:33:14,169 --> 01:33:16,460 and a heart, I'm going to put that in a different pile. 1902 01:33:16,460 --> 01:33:18,470 A diamond, different pile still. 1903 01:33:18,470 --> 01:33:19,880 A spade will go up here. 1904 01:33:19,880 --> 01:33:21,700 A club now goes over here. 1905 01:33:21,700 --> 01:33:25,640 And as I continue this pattern, I'm kind of dividing the problem, 1906 01:33:25,640 --> 01:33:27,140 albeit in a different way. 1907 01:33:27,140 --> 01:33:30,020 I now have, instead of a 52 card problem, 1908 01:33:30,020 --> 01:33:33,350 I'm eventually going to have four 13-card problems. 1909 01:33:33,350 --> 01:33:35,480 But how am I getting to that point? 1910 01:33:35,480 --> 01:33:37,040 Well, I'm bucketizing things. 1911 01:33:37,040 --> 01:33:40,940 Not physical buckets, but there's like four piles here of cards. 1912 01:33:40,940 --> 01:33:42,950 But what really am I doing? 1913 01:33:42,950 --> 01:33:45,500 I'm really hashing these values. 1914 01:33:45,500 --> 01:33:48,160 I am looking at each card as input. 1915 01:33:48,160 --> 01:33:52,310 I am passing it through a sort of mental function, a mental black box if you 1916 01:33:52,310 --> 01:33:54,590 will, that outputs a value. 1917 01:33:54,590 --> 01:34:00,140 Either bucket one, or two, or three, or four, or more specifically spade, 1918 01:34:00,140 --> 01:34:02,690 or diamond, or heart, or club. 1919 01:34:02,690 --> 01:34:04,269 That's kind of my hash function. 1920 01:34:04,269 --> 01:34:06,060 So if you've ever done something like that, 1921 01:34:06,060 --> 01:34:09,510 where you're trying to clean up a mess, or you're trying to sort some data, 1922 01:34:09,510 --> 01:34:11,090 or organize your cards. 1923 01:34:11,090 --> 01:34:14,480 Once you start bucketizing in this way, you are hashing values. 1924 01:34:14,480 --> 01:34:17,090 You're taking as input some value, maybe a card, 1925 01:34:17,090 --> 01:34:19,130 and producing as output some identifier. 1926 01:34:19,130 --> 01:34:21,200 Similarly, with our example of names, if I've 1927 01:34:21,200 --> 01:34:24,382 got a whole bunch of names, each of which might start with A through Z, 1928 01:34:24,382 --> 01:34:26,090 each time you hand me a name, I'm looking 1929 01:34:26,090 --> 01:34:28,310 at the very first letter in that person's name, 1930 01:34:28,310 --> 01:34:32,390 and I'm deciding A bucket, Z bucket, D bucket, C bucket or so forth. 1931 01:34:32,390 --> 01:34:33,590 C should be over there. 1932 01:34:33,590 --> 01:34:37,280 And deciding, based on the input, what the output should be. 1933 01:34:37,280 --> 01:34:41,060 So again, here to, this sort of intuitive principle 1934 01:34:41,060 --> 01:34:44,900 can very quickly be applied to something much more sophisticated. 1935 01:34:44,900 --> 01:34:48,740 Again, compare and contrast the sort of sophistication 1936 01:34:48,740 --> 01:34:50,750 of the before and the after. 1937 01:34:50,750 --> 01:34:55,290 This is odds are pretty darn intuitive, if not something very familiar to you. 1938 01:34:55,290 --> 01:34:57,680 This might still look a bit like Greek. 1939 01:34:57,680 --> 01:35:01,430 Certainly pretty arcane versus some simple playing cards. 1940 01:35:01,430 --> 01:35:08,090 But to the overarching point, that while computing and technology all around 1941 01:35:08,090 --> 01:35:11,150 us might seem especially sophisticated, and complicated, 1942 01:35:11,150 --> 01:35:13,250 and well beyond one's understanding, that's 1943 01:35:13,250 --> 01:35:16,520 just because it might be presently beyond your familiarity 1944 01:35:16,520 --> 01:35:18,090 with some of those building blocks. 1945 01:35:18,090 --> 01:35:21,660 Indeed, we started so low level with just zeros and ones, 1946 01:35:21,660 --> 01:35:23,660 and then we got to letters, and then maybe words 1947 01:35:23,660 --> 01:35:26,420 and paragraphs, and graphics and videos, and eventually 1948 01:35:26,420 --> 01:35:28,610 so many more forms of media. 1949 01:35:28,610 --> 01:35:31,310 And at the end of the day, it's useful to understand 1950 01:35:31,310 --> 01:35:34,550 what's going on underneath the hood, but it's also very empowering. 1951 01:35:34,550 --> 01:35:36,830 Because you realize that even once we're at the point 1952 01:35:36,830 --> 01:35:40,640 as we are literally right now, of talking it this degree of complexity 1953 01:35:40,640 --> 01:35:43,730 and sophistication for how you might store data inside of her computer's 1954 01:35:43,730 --> 01:35:46,670 memory, and very efficiently get at it, all of this 1955 01:35:46,670 --> 01:35:49,040 was the result of these stepping stones. 1956 01:35:49,040 --> 01:35:50,360 These abstractions. 1957 01:35:50,360 --> 01:35:53,180 And along the way, in building up these abstractions, 1958 01:35:53,180 --> 01:35:56,390 and in solving problems more effectively by standing 1959 01:35:56,390 --> 01:35:58,830 on the shoulders of problems solved past, 1960 01:35:58,830 --> 01:36:01,610 similarly we have to make some trade offs. 1961 01:36:01,610 --> 01:36:05,580 And so at the end of the day, this is what computational thinking is all 1962 01:36:05,580 --> 01:36:06,080 about. 1963 01:36:06,080 --> 01:36:09,350 This is what computer science and engineering and programming 1964 01:36:09,350 --> 01:36:10,460 is all about. 1965 01:36:10,460 --> 01:36:14,090 It's solving problems using solutions to problems past, 1966 01:36:14,090 --> 01:36:17,930 and along the way making very conscious, calculated, and hopefully 1967 01:36:17,930 --> 01:36:21,890 correct decisions, as to what resources and what 1968 01:36:21,890 --> 01:36:25,810 objectives are most important to you. 1969 01:36:25,810 --> 01:36:26,915