SPEAKER 1: This is a seminar on two of my favorite buzz terms-- machine learning and computer vision. You throw those out there in a party conversation and, besides people saying "nerd," they'll also think it's pretty badass. If they're computer scientists, they might question how much you actually know.

So, machine learning and computer vision. I've put up a cutesy little slide on what they are. I'll usually refer to them by their full names, but if I ever say ML, I mean machine learning, and if I ever say CV, I mean computer vision. Pretty easy.

And these are two of the myths that people have about machine learning. They're usually on one of two extremes: either it's impossible and only for the super technical, geeky wizards who can sit at their computers for hours on end, or it's not that interesting because we're better at doing what it does. And if you fall anywhere between those two extremes, that's cool too.

But basically, the context for why we would need machine learning and computer vision is that we want to be able to solve problems that people solve all day, every day-- but programmatically. So I want to be able to take a computer program, have it look at something and say, that's a cat, that's a dog, that's a person, that's a couch-- and not accidentally look at a person and say, oh, that's a car, where's its license plate? A real-world example you could imagine existing is traffic recognition. If you're just trying to figure out what something is, computer vision is super important, because I want to be able to tell whether the thing that just ran my red light is a brown bear or a car. And that's important. That's really what we're after in general here, but we're going to approach a very specific problem just to give us some context.

As for context on me and my background in these two things: I had never coded before a year ago, a year ago being 2016, in case you're watching this in the future. And basically, for my CS50 final project, I wanted to do something really cool.
I wanted to do it on my own, and I wanted it to be something that was accessible. And a lot of people were like, oh, machine learning and computer vision break two of those three criteria: you can't do that on your own, and it's not accessible. Those things are impossible to do. Why would you ever approach that? And yes, that is maybe true from the theoretical side. I didn't sit down and teach myself all of the math behind machine learning. I didn't sit down and teach myself how to do contouring with computer vision. I sat down and played around with some YouTube videos.

And that's basically the point of the seminar: to teach, or show, or prove that these two concepts-- while they are buzzwords, and they are super cool, and there are whole fields around each of them-- are accessible to you, the CS50 student who just started CS and had never seen a line of code or what it meant before. That's who this is for. And if you're some sort of CS guru who does know all sorts of things about machine learning and computer vision, I'm a little confused why you're watching, but also really appreciative. Hopefully you'll find it at least entertaining and maybe a little bit informative.

So basically, my story was that I came in and did a final project using these two packages, or pieces of software: Keras, which is the machine learning part, and OpenCV (the cv2 module), which is the computer vision part. And I built an algorithm, or a piece of software, that would allow me to do a very specific task: I wanted to convert images of sheet music into machine-readable versions of music. That's software that doesn't exist right now. And I found out why: because it's really freaking hard. But it was a cool process anyway, and I thought it was a worthwhile endeavor. I would encourage everyone to try it if they want to.

So, just as a real-world example-- and this will ease us into the idea of pattern recognition, which is what we're after with machine learning anyway-- we have four people. I have four stock images; they came with the template of the PowerPoint slide, and I just changed their descriptions.
And basically, I'm going to ask you to think in your head about what patterns you can find between these four people. As a human being, you can find-- I'm hesitant to say hundreds; I don't know if there's really enough data for hundreds, but maybe-- a very large number of patterns.

So now I'm going to restrict that, in the sense that machines have a limited amount of data they can pick up. So do we, but we pick up a lot more, kind of instantaneously, than a machine really is going to-- especially the kind of machine we're working with, on the smaller software side. So you get these three categories, and they're basically: how many eyes do you have, are you human, and do you have a ponytail. I was short on time.

And so, from these three categories, if I were to point at someone and say, based on these categories, which category do they belong in, you should be able to do that. And human beings, we're pretty good at that. So if I say, is this a person, you'd say with pretty much 100% confidence, yes it is. Same here. Same with that one. Same with number four.

It's harder if I say, is this a girl? Girl isn't really one of the categories. And looking at the data-- well, maybe you have some extraneous data that says people with ponytails are probably girls, which is a little sexist, but we're going to ignore that issue and just run with the assumption that ponytails probably mean girls. Well, you get this one with, we'll say, 95% confidence. But for all the rest you'd go, well, no ponytail, so I'm 100% sure they're not girls. Slight problem: that one is.

And so this is a very contrived but kind of interesting example: if I limit the amount of data that you're able to look at, you are now restricted in what patterns you can find. And that's a very intuitive thing. It should be almost an instantaneous revelation-- a shower thought, if you will.
But that has some severe manifestations when we're trying to apply machine learning and computer vision. So one of the assumptions that people make is that we can accumulate as much data as we need. However, I'm not Google. You're also not Google. You're not Amazon or Microsoft. We're not one of these big companies that have access to exabytes of data. And so a lot of people go, well, then machine learning is not for me; when I work at Google, I'll do it then. And they move on.

But you don't need exabytes, or necessarily even gigabytes, of data to get machine learning to work. If I were to give you a simple pattern-- we'll talk about the logic gate AND-- it takes two inputs, each 0 or 1, and it returns one output, either 0 or 1. If I say AND of 1 and 1, you return 1. If I say AND of any other input, you give me 0. It's a very small amount of data. In fact, I can represent the inputs as single bits, and I can represent AND in, we'll say, less than 10 bits. Now you have learned an entire pattern-- a complete pattern, if you will-- and it took you less than even a kilobyte. So patterns don't require lots of data. Complicated patterns maybe require more data, as you'll see. And that's kind of intuitive.
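Here's a minimal sketch of that AND example in Keras-- a toy I'm adding for illustration, not the seminar's distribution code; the layer sizes and epoch count are arbitrary choices:

    # Learning the AND gate from all four possible inputs -- the entire
    # "data set" is four rows.
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # every possible input
    y = np.array([0, 0, 0, 1])                      # AND(1, 1) = 1; everything else is 0

    model = Sequential([
        Dense(8, activation="relu", input_shape=(2,)),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=500, verbose=0)

    print(model.predict(X).round().flatten())       # should recover [0, 0, 0, 1]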
But we can kind of get past this problem without just collecting more data. We don't need to sit down for hours and hours labeling things, manually going: all right, that's a cat, that goes here; that's a dog, it goes here; that's a triangle-- why is a triangle in here? You don't need to actually sit there doing that. There are all these other techniques that exist, and I think they're not as well publicized. And if you had known about these sorts of things, maybe you wouldn't have turned away from machine learning in the first place.

So one of the first ones I have listed there is called data augmentation. Another one of those things that sounds like a buzzword: you throw it out there and you're just like, I augmented some data, and then you just kind of move on and hope that nobody actually asks what that means. But all it really is, is taking the data that you have and creating new data from it-- making sure the patterns are preserved, but changing what the data looks like.

And what I mean by that is, if you take a picture of someone's face and you stretch it out, to a point, you can still recognize them. But to a machine, that stretched picture is more data. Because I've now taken this picture and I've stretched it out: that's the same person, but these are two different images. If I were to compare the two images bit by bit, they are not the same. And that's important. So that's one of the techniques: stretching an image. In the example code, or distribution code, that you'll get access to later, you'll see that there's an entire configuration file dedicated to how you augment your data.

What if you rotated something? Is a triangle still a triangle if I turn it sideways? Yeah. And the same thing with a face: if I have your face and I turn it upside down, well, we people will turn ourselves upside down and try to look at your face. But the same thing is applicable to machines and how they learn. If I can take a piece of data, even a small amount of data, I can amplify it according to all of these different ways of shifting it.

What if color doesn't matter? What if I were using emojis and they could be any color? You could still recognize the emoji-- it's still a smiley face emoji, whether it's yellow or black or blue or pink. But in the particular case that I've just given, there's a slight problem: some emojis do use color to convey meaning. The angry emoji is red. If you had an angry emoji that was bright pink, it might look a little different; we might get a different message. So it's important that when you're augmenting your data, you keep in mind which patterns can change and which ones can't-- what information you're actually after.
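As a taste of what a configuration file like that controls, here's a hedged sketch using Keras's ImageDataGenerator; the parameter values and the data/train path are illustrative, not the distribution code's actual settings:

    # Each flag answers the question: "is the pattern preserved if I change this?"
    from keras.preprocessing.image import ImageDataGenerator

    augmenter = ImageDataGenerator(
        rotation_range=20,       # a triangle is still a triangle turned sideways
        width_shift_range=0.1,   # nudge the subject around the frame
        height_shift_range=0.1,
        zoom_range=0.2,          # stretching -- same face, different bits
        horizontal_flip=True,    # fine for faces; bad for letters like "b" vs "d"
        # deliberately NO color shifting: an angry emoji has to stay red
    )

    # Yields endless augmented variants of a small directory of images.
    train_gen = augmenter.flow_from_directory("data/train", target_size=(200, 200))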
And that comes into the next point, clever data gathering, which is basically what you're doing when you're augmenting your data-- you're just not going outside to get more. So if I was collecting data-- say I was picking up images off the internet, which is often what I'll do if I'm trying to build a machine learning model-- I have to make sure that I collect maybe not more data, but the right kinds. So if I gathered 30 pictures of the same cat and then said, all right, machine, tell me if this is a cat or a dog-- not very good data gathering. I've just picked up the same thing 30 times. And even a human would say, well, if I only had that information, I could only tell you whether or not it's that cat.

But that's basically the idea here. If you were teaching a toddler, or a small kid, or even a full-on college student a complicated concept, you have to give them enough of the pattern that they can get it right every time-- that they can extrapolate from the pattern they're given. So if I were to give you a number sequence-- 1, 1, 2-- then some people might go, oh, that's the Fibonacci sequence. Well, no. It's 1, 1, 2, 1, 1, 2. And so that's a very contrived example of a pattern where I just didn't give you enough information.

And this sort of thing, where you're using an image-- that's a lot of patterns. It could be eye color. It could be hair color. What if they're not people? It could be shape. It could be the angle at which things intersect. There's a lot of information there, and we pick it up almost instantly. You look at a single picture and feel like you could name 400 patterns from it-- not that you know exactly 400 patterns, but you could start enumerating: well, there's this attribute, and this attribute, and this one. So you have to make sure that your machine has enough data, and the right type of data, that it can actually pick things up.
And a good benchmark for that: if you kind of narrow your mind-- and this is one of the few cases where I'll say, just be narrow-minded-- and only look at something in the context of what you're given, and you can still figure it out, there's probably a way for the machine to do it too. And if you can't, the machine probably can't do it either.

And so automated data gathering is one of the next solutions to this "I don't have enough data" problem. Because basically, the original, brute-force solution is to take a bunch of pictures and label them. And we're talking specifically about image classification here. There are other kinds of machine learning, but I'm kind of gearing this more toward image classification because it's a little bit easier to understand and intuit. So say I was manually labeling images. And I've done this before. It's awful. If you can avoid it, don't do it. But you can sit there and say, well, this is an A, this is a B, this is a C. But you need enough data to make sure that your patterns are complete, so you might be doing that for seven, eight hours. It's horrible. Find some good music; it'll make it a little bit easier to do.

But you can automate that process. Let's say we're doing some sort of letter recognition. If I could generate all 26 letters-- we'll say maybe 52, if you do lowercase and uppercase-- in 100 different fonts, then all my data gathering is pretty easy. Click a button, it's done. So if there's a way for you to automate your data collection, do it.
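Here's a hedged sketch of that letter-generation idea, using Pillow; the font paths and image sizes are assumptions of mine, since the seminar doesn't actually build this one:

    # Render every letter in every font you have: labels come for free,
    # because each filename records which letter the image contains.
    import os
    import string
    from PIL import Image, ImageDraw, ImageFont

    fonts = ["arial.ttf", "times.ttf"]  # in practice, point at ~100 font files
    os.makedirs("letters", exist_ok=True)

    for font_path in fonts:
        font = ImageFont.truetype(font_path, 48)
        for letter in string.ascii_letters:  # 52 classes: a-z and A-Z
            img = Image.new("L", (64, 64), color=255)                # white canvas
            ImageDraw.Draw(img).text((8, 4), letter, font=font, fill=0)
            name = os.path.splitext(os.path.basename(font_path))[0]
            # ord() avoids a/A filename clashes on case-insensitive systems
            img.save(f"letters/{ord(letter):03d}_{name}.png")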
And now that you've automated it, you might as well generate as much data as you want. And you might find that there are some time restrictions, because as your machine learns on more and more data, it might take more and more time. So now you have a balancing act that you have to perform: do I want to generate more data and get a better machine learning model, or do I want to generate less data and have it be done faster? And is there a point where giving it more data doesn't make it any better, but does make it slower? Because at that point you should stop. If it's not getting any better, then maybe you need to change your model.

And that's the last point there: we sometimes need beefier models and sometimes more clever models. And those are somewhat interchangeable, and sometimes not. Just because you have a bigger, heftier model-- that's what I kind of mean by beefier-- doesn't mean it's better at learning things. People who are just bigger don't inherently learn things faster. But if I have a model that's a little bit more clever about how it learns something, it can pick up on a pattern faster, and it's probably going to do a little bit better, depending on the circumstance.

So one of my favorite myths about machine learning, which I'm also blocking part of on the slide, is that it takes a long time-- that if I want to take a model and get it to work, I have to train it for hours and hours and hours. And that's not true. That's people. People take a long time. You train people for hours and hours. One of the benefits of machine learning is that you don't have to train models that long. That's with the caveat that if you're training on an enormous data set, or you're training a particularly complicated model, it might take a long time. But given that we're doing some sort of CS50 final project, this is not a problem for you. This is not something that pushes this project out of the reach of your grasp. It's one of those things that is actually just a myth-- that machine learning takes too long.

And a kind of parallel myth is that computer vision perfectly captures all the data in an image. Maybe the way that these are parallel is not immediately apparent, but the same idea is present here. This concept that computer vision is a perfect representation of whatever data it sees pushes it outside of our grasp. Because if it is a perfect representation, and we can't learn something from that perfect representation, then it's not doable.
We might as well throw up our hands and give up. But that's not true. Computer vision is a little subjective. I can choose how my machine sees. I can choose how well it picks up on patterns, how well it distinguishes between the foreground and the background. All of those things come into play when you're trying to pick up data. And so we use a very simple example of computer vision in the distribution code, more just to give you a taste of how to interface with OpenCV through cv2. But it does exist, and it is something that I think is particularly important for image classifiers.

But when we're choosing software to do all of these things-- to do machine learning, to do computer vision, even just to program in general-- it becomes very important to see the trade-offs between different pieces of software. So in this project, and in general, I go to Keras and OpenCV. However, underneath the hood, Keras uses TensorFlow. Or at least, I have it use TensorFlow; you could also have it use Theano. I don't use Theano. It's a little bit mathier; it was a little bit above my intellectual level. But TensorFlow, I thought, was pretty accessible, and I like the company behind it, just in general. I swear I don't work for them. They're just really cool.

And so these two things actually have the same benefits, at least from my point of view. I was a college student who was just learning all of these things, just learning computer science in general. And so I was like, well, you know, what projects exist? I asked my TF, and he was like, oh, go look up OpenCV, see what that does. And I asked him, what can I do to have a machine learn something? I want to do AI. And he was like, oh, AI sounds a little scary-- but machine learning is also scary. Well, pick one. And I was like, OK, we'll do machine learning. And you'll notice they're not super different. And part of what I mean, beyond a high-level interface, is open source: it's open to the public, and I didn't have to pay for it, [INAUDIBLE] college student.

But providing a high-level interface is something that I've kind of done here as well. The product, the distribution code that I'll have at the end, is a high-level interface on a high-level interface. It makes things accessible in that you just have to say, build model, and it takes care of everything underneath the hood. It just builds the model, whatever that means. And if you want to go look underneath the hood-- which I advise that you do if you're building this as a project-- you can then see what's going on. But if you don't, and you just want it to work, that's what this does.
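To make that concrete, here's a minimal sketch of what such a wrapper can look like; the class name and the layers are illustrative guesses of mine, not the distribution code itself:

    # One build_model() call hides the Keras plumbing; peek inside when curious.
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    class EmojiClassifier:
        def __init__(self, n_classes=3, input_shape=(200, 200, 3)):
            self.n_classes = n_classes
            self.input_shape = input_shape
            self.model = None

        def build_model(self):
            """Builds and compiles the model -- whatever that means."""
            self.model = Sequential([
                Conv2D(32, (3, 3), activation="relu", input_shape=self.input_shape),
                MaxPooling2D((2, 2)),
                Flatten(),
                Dense(64, activation="relu"),
                Dense(self.n_classes, activation="softmax"),  # one score per class
            ])
            self.model.compile(optimizer="adam",
                               loss="categorical_crossentropy",
                               metrics=["accuracy"])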
I don't have to sit down and say, oh, crap, I carried a 1 wrong in my math here, so my machine now says everything is a square. That would be kind of annoying. And maybe tracing between those two things is very difficult, especially if you're using an algorithm that does something you don't fully understand. You're just sitting there like, that's a lot of math. I think that's a sigma. And there's another letter here, and I don't know why. And then you go look it up on Wikipedia-- which I've done-- and there are just names, and there are no actual numbers in the math anymore. It's all symbols, and it becomes very difficult to read. And from there it becomes inaccessible. And then maybe you give up. Or you just get frustrated and you go eat a piece of cake. That's what I did. And so it's very frustrating.

But the main reason that I chose these two pieces of software was that they were usable. I could figure out how to use them. Somewhat ironically, figuring out how to get them downloaded and working was very difficult. I spent around 20 hours doing that. And admittedly, that was because I didn't really know how to read documentation at the time. I also didn't know how to read through code on a GitHub page or anything. But I know that I'm not the only one. In fact, there are about 728 students in roughly the same place right now in that class called CS50.
And so I think that having everything gathered into one place, with an easy way of installing things, is a much easier introduction. So that's what this distribution code is, too. In case you're looking for the distribution code and you don't want to listen to the rest of my talk, it's toward the end of the lecture; it'll be there after I get there. But if you do, hang around.

So basically, I took all of the packages-- there are a lot of them that need to be installed to get OpenCV to work. That's the really annoying one. Keras is fine: you just do pip3 install Keras and it works fine. OpenCV is awful. Nothing against OpenCV-- I very much appreciate that the project exists, and I'm super psyched that I get to use it. It's just a pain in the ass to install. Or at least it was when I was young and naive. And it was very hard for me to sit there reading through documentation and not knowing what they meant by certain terms. What does it mean to use a virtual environment to install things? Why is that necessary? If I don't do that, did it break my download? Why doesn't my download work? There are all of these terms and things that get thrown around because they're taken for granted. I know that that is very scary, and at the very least, it's incredibly frustrating.

And so what I ended up doing was I said, OK, I'm going to just try all of the solutions, and whichever one works, works. So my computer at one point had something like 40 different versions of OpenCV on it. Every programmer, I think, has had that sort of experience where they've just downloaded hundreds of things. I built from source at one point, and I was like, cool, I don't know what this means, but I did it. And that worked well-- that actually was the one I ended up using for my final project. I would not recommend doing that unless you know what you're doing. I screwed it up horribly, and I didn't even realize I was missing half of OpenCV. I didn't need it, apparently. But bad, bad deal.
So I've finally gotten to the point where we have some code. If you want them, there are the bit.ly links. These are the actual slides, in case you want them; they have these links on them, so you can go to the slides and then click the links. The GitHub link is my personal GitHub. I didn't realize we were supposed to use our actual names when we created GitHub accounts for school. My name is not "powerhouse of the cell"-- it's actually Nick. But I got to keep it. My TF was fine with it, so we kept it. And this is the bit.ly version, slightly shorter.

So I'll leave those up there while I talk a little bit. And eventually I will pull up some code, and we'll get to coding things, or at least giving demonstrations of what the code does. I've found that when you're actually coding things up in front of people, you make about 400 more typos per second. It's really just not a good deal, so I don't particularly like coding in front of other people.

But what you'll find on that GitHub is basically a lot of very cheesy README files. I included as many emojis as I thought were necessary. I don't generally use emojis, and I also don't use them often when I write things on my GitHub-- that's the only repo that has them. But GitHub has a nice interface for including emojis. And the reason for that is the problem I wanted to solve with this machine learning and computer vision for this seminar: classifying emojis. And that's a very broad problem, and I didn't solve all of it. I actually didn't even really solve it, but I started us on that path. And so I said, OK, I want to do something with emojis, because that's kind of hip, kind of cool, also kind of dorky. And that kind of fits me-- the latter one, not the first two. And so what I ended up doing was I said, you know what, we're not going to classify the hundreds of emojis that exist.
We're going to just take about 15 happy-looking ones, 15 kind of neutral-ish ones-- I think I actually got 13 of those-- and then 15 kind of angry or negative-looking ones. We're going to call those three groups classes. We're going to say there's positive, there's neutral, and there's negative. And I want the machine to be able to tell me whether an arbitrary emoji that I'm looking at is positive, neutral, or negative.

And that seems kind of trivial. Human beings do that all the time, and we're very subjective about it, too. We're kind of like, ooh, that person, she's looking at me just angrily. He's got just an aggressive face on him. He's just chilling there, sipping his tea. You messed up her shoe. Something like that can be very difficult for us to perceive. And even in that example, it's totally subjective: what I just said is basically up to whoever is viewing it. And that is where this becomes a difficult problem-- how do you provide enough data to get this to work?

So I actually did a couple of disservices to you, the user of this code. One: the machine that you're provided with-- well, it does work, and it will train and learn, but it doesn't do it very well. By the end, it's basically randomly guessing. It'll say there's about, I don't know, a 33% chance of this, a 33% chance of that, and a 33% chance of that. You'll notice there are three categories: 100% spread across all three categories is about 33% each. So the machine doesn't do a very good job of figuring it out. And sometimes-- which I found when I was testing which one I was going to demo-- the machine that I provided you actually just gets it completely wrong, but it's super sure of its answer. I gave it a very happy-looking emoji, and it was like, I'm 100% sure that is negative. Negative-looking emoji right there.
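To make those percentages concrete, here's a tiny NumPy-only sketch of how a three-class softmax output turns into that 33/33/33 guess; the raw scores are made up:

    import numpy as np

    def softmax(logits):
        """Turn raw scores into probabilities that sum to 1."""
        e = np.exp(logits - logits.max())
        return e / e.sum()

    classes = ["positive", "neutral", "negative"]
    # An untrained model's raw scores are nearly identical, so every class
    # lands near 1/3 -- "basically randomly guessing."
    probs = softmax(np.array([0.02, 0.00, -0.01]))
    for name, p in zip(classes, probs):
        print(f"{name}: {p:.0%}")                  # ~33% / 33% / 33%
    print("best guess:", classes[int(np.argmax(probs))])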
And so that's kind of one of the funny things. I think that a lot of times when you're doing machine learning, you feel like you're training a toddler-- a particularly annoying toddler, one that is not really a danger to itself or anything around it, but that particularly hates you. In fact, it wants to make sure that you never get whatever assignment you're trying to do done. That's how I've kind of learned machine learning works, especially when I was working on this, being like, oh, crap, I have a seminar to teach, and this was not working. It just refused to do what I wanted it to, and that was very frustrating.

But it's something that shares a lot of parallels with this toddler analogy. If you were to take a toddler, and every time they didn't do what you wanted them to do, you just went, all right, well, getting a new one, and went and got a new toddler, that would be weird in a number of ways. It's also kind of weird with this, too, though a little less extreme-- you can trade out machines, no problem. But you'll find that the machine I have handed you actually does have a couple of things that can be modified within it to make it a lot better. And that doesn't mean you just copy and paste the same layers over and over again and make your machine much longer; rather, you can make it a little bit more clever.

So basically, when we were training the machines-- or when I was training; I say we, I mean I, when I was lonely in my room training the machine models-- I was saying, all right, I've got to get this to work somehow. I don't know what I'm going to do. And if you'll recall, I said I had about 15 images each of positive and neutral and sad. So I had roughly 45 images total. That's a very small amount of data. So I actually used some techniques in the code-- and I'll try to point them out when I go over there-- that allow me to augment my data. We did that first step: we augmented our data.

What I didn't do was add very good data. My data collection-- partially a product of my laziness, but now, retrospectively, a teaching moment-- was done kind of arbitrarily, without much thought as to what patterns were being picked up. I kind of ignored my own second rule, if you will. And yes, that is mostly because I was just being lazy.
I just picked a bunch of data, threw it in there, and hoped it worked. And it doesn't work that well. That strategy won't help you very much. I didn't think about it that hard, but you can get there by thinking about it just minimally. If you have a smiley face emoji, for example-- since that's what we're talking about in this case-- it's not too hard to find enough different smiley faces to cover the general case. It's mostly that smiley little half circle on the bottom of the face. And so covering as much of the pattern as you can, while still keeping the data set small, is not that difficult. You just have to be a little bit smarter about it than I was. And I believe the data is included on that GitHub page, so you'll see my crappily collected data there as well. I really hope it's not a copyright problem. We'll find out if someone shows up to arrest me.

So we're going to actually switch over to looking at the actual GitHub page. And this is where the cheesiness comes in: I called it machine feeling. I was feeling a little dorky that day. I'm feeling a little dorky every day. And then I included as many emojis as I could, because I was like, oh, crap, you can include emojis in Markdown. That's cool. So I threw those in there. And you can read this on your own if you want to. Maybe you don't. I don't blame you if you don't.

There is this requirements.txt right here, and that allows you to basically just immediately install all the requirements for this entire project-- pip3 install -r requirements.txt. That's it. No 20 hours of searching through Google or anything. Just that.

And then we have our source folder. Originally, I was going to provide some skeleton code to kind of complement the actual code. I decided against that, because I ran out of time, and I also thought it was a little mean. So instead we have fully working sample code. And I've separated things out a little bit, just to make it easier to comprehend what's going on.
And so right up here, at the top, you basically have the computer vision folder. There's only one thing in it: the file that provides the computer vision properties to our code. And then you have the data, so we'll take a short look in here. It's not very large. There is testing data and there is training data-- it's pretty well segmented out. But you can see that it's basically just a bunch of .pngs. This one happens to be pretty sad. And they're also all cropped so that they're the same height and width: they're all 200 by 200 pixels. You don't have to do that, although I would recommend it as just one of the things you can normalize across. Because what if the size of the data is different? There are techniques for dealing with that. You can shrink it to the right aspect ratio; you can do a variety of things. But if you can, you want to keep your data pretty consistent across things that don't matter.

So whether a face is this size or this size, it's still a sad face. It's still negative. The classification doesn't really depend on what size it is. So I wanted to keep all of my data the same size, so that the machine couldn't decide, oh, images that are 201 pixels, those are sad; images that are 400 pixels, they're happy. That would be really unfortunate, because that's not even close to the actual pattern we're going after.
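Here's a minimal sketch of that normalization step, assuming OpenCV's cv2 module; test.png is the test image that turns up in the repo a bit later:

    # Force every image to the same 200x200 shape so that size can never
    # become one of the "patterns" the model picks up on.
    import cv2

    img = cv2.imread("test.png")               # load any emoji image
    img = cv2.resize(img, (200, 200))          # same height and width for all
    img = img.astype("float32") / 255.0        # scale pixel values into [0, 1]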
And you can understand that even in complicated examples, you'll want to be aware of which patterns matter, which ones don't, and which ones you're actually introducing into your models. That might sound kind of complicated, but the example I just gave there-- making sure the size of the image doesn't actually play into what the machine learns-- is not too difficult. It's very intuitive, and most of these considerations are like that. They're pretty intuitive. This sort of project, even though it sounds kind of complicated, isn't too bad. It's not particularly complex if you make it analogous to human beings, or toddlers. You can think of it as your young niece, nephew, daughter-- if you have a child-- brother, sibling; smaller people, little human beings, and how you teach them. And if you can teach a child that, you can probably teach the machine, with some caveats.

So in the rest of this folder, we have the ML folder-- it was right around here. It's just that: the machine learning portion. There's a file in there that does all the machine learning parts. I built a class for us, to give a very high-level model, but it also gives you a low enough level that you can tinker with the model itself, depending on what you want. And we have a config file, which has a bunch of variables inside of it-- we might take a look inside in a little bit. And then we have our actual run file, which allows us to execute the entire piece of software. And then I have a test.png, which is just a test image that I was using earlier. I left it in; it's kind of cute.

So that's all of the code on the GitHub. You're welcome to clone it, download it, make pull requests-- preferably don't sell it for a profit. If you do, that's really cool; I'm just proud that it worked. Any of those things is awesome.

But we're actually going to show just a little bit of code over here. So I'm already within the directory of the actual code. If you had git cloned this, you'd have ended up somewhere around here. So this is the actual root directory. I know this is kind of a boring terminal screen, but it'll get more interesting shortly. And here you can see there's just a bunch of files: the license, MIT; what the README says; there's some caching here, source files, and requirements. We're going to go into src, because that's the source, and then specifically into the sample directory. And we're back where we started. So if I want to run this, I can basically say ./run.py and-- oh, yep. Like I said, typing in front of people, you make so many more mistakes.
728 00:30:17,530 --> 00:30:20,010 And this will bring up kind of our help screen 729 00:30:20,010 --> 00:30:23,080 which is meant to be as non-obscure as possible. 730 00:30:23,080 --> 00:30:26,704 However, I am no expert coder so it might be a little obscure. 731 00:30:26,704 --> 00:30:28,620 It's intended to be pretty easy to use though. 732 00:30:28,620 --> 00:30:30,477 So there's -o for an output file. 733 00:30:30,477 --> 00:30:32,310 And this is all just kind of software stuff. 734 00:30:32,310 --> 00:30:33,510 Not particularly interesting. 735 00:30:33,510 --> 00:30:36,040 If you're interested afterward, please do go ahead and let me know. 736 00:30:36,040 --> 00:30:38,100 And I'll be happy to talk with you about it. 737 00:30:38,100 --> 00:30:42,080 But if you wanted to just run this, then we can say, OK, well, 738 00:30:42,080 --> 00:30:45,450 I want my output file to be seminar. 739 00:30:45,450 --> 00:30:48,600 And I want it to go through, we're going to say, one round of training, 740 00:30:48,600 --> 00:30:52,859 unless you guys want to sit here for the next 45 minutes. 741 00:30:52,859 --> 00:30:55,650 And I don't want it to load another model, one that already exists. 742 00:30:55,650 --> 00:30:57,780 I want it to just kind of do its own thing. 743 00:30:57,780 --> 00:31:00,560 And that's all I really need to do. 744 00:31:00,560 --> 00:31:02,580 From the command line, that will train it. 745 00:31:02,580 --> 00:31:04,930 And I say that and this'll be the one time that it 746 00:31:04,930 --> 00:31:07,810 breaks which is absolutely fantastic. 747 00:31:07,810 --> 00:31:10,477 But it tells you that it's going to use the TensorFlow back end. 748 00:31:10,477 --> 00:31:12,393 It found all of the data that I had handed it. 749 00:31:12,393 --> 00:31:13,470 And now begins training. 750 00:31:13,470 --> 00:31:16,470 And so the reason I bring this up is because I think it's kind of-- 751 00:31:16,470 --> 00:31:18,746 you're not quite sure what each of these things mean. 752 00:31:18,746 --> 00:31:21,870 And what's kind of funny is you can customize each of these metrics anyway. 753 00:31:21,870 --> 00:31:24,552 754 00:31:24,552 --> 00:31:26,510 So basically, if you're looking at this screen, 755 00:31:26,510 --> 00:31:30,230 you see this kind of cool little animation, if you will. 756 00:31:30,230 --> 00:31:33,590 But this is really just telling you how many steps through the training round 757 00:31:33,590 --> 00:31:34,710 it's gotten. 758 00:31:34,710 --> 00:31:38,750 Epoch is usually going to be the actual training 759 00:31:38,750 --> 00:31:40,070 round that it's on. 760 00:31:40,070 --> 00:31:44,640 And so within a training round, your machine is basically saying, all right, 761 00:31:44,640 --> 00:31:47,310 I'm given some amount of data that you specify. 762 00:31:47,310 --> 00:31:50,200 And I've got to figure out what the hell this means. 763 00:31:50,200 --> 00:31:51,920 I've got to classify it. 764 00:31:51,920 --> 00:31:55,130 And what it does is it sits there and it says, hm, that looks like a cat. 765 00:31:55,130 --> 00:31:55,820 That's a bird. 766 00:31:55,820 --> 00:31:56,810 That's a dog. 767 00:31:56,810 --> 00:32:00,740 Or, in this case, that's positive, that's negative, that's neutral. 768 00:32:00,740 --> 00:32:02,220 And it throws out those answers. 769 00:32:02,220 --> 00:32:03,620 And it encodes them somehow-- 770 00:32:03,620 --> 00:32:06,512 0, 1, 2, totally reasonable.
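One common way to do that encoding, for what it's worth, is Keras's own helper, which turns integer class labels into one-hot vectors (a sketch; the 0/1/2 mapping to positive/negative/neutral is just this project's three classes, and whether the repo uses this exact helper is an assumption):

    from keras.utils import to_categorical

    labels = [0, 1, 2, 2, 0]  # e.g. 0 = positive, 1 = negative, 2 = neutral
    one_hot = to_categorical(labels, num_classes=3)
    # one_hot[0] is [1., 0., 0.] -- the class index turned into a vector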
771 00:32:06,512 --> 00:32:09,470 And what it does is it says, OK, here are my answers to all of the data 772 00:32:09,470 --> 00:32:10,520 that I've been handed. 773 00:32:10,520 --> 00:32:11,922 And then it looks at the answers. 774 00:32:11,922 --> 00:32:14,630 And it says, oh, crap, I missed this one, this one, and this one. 775 00:32:14,630 --> 00:32:16,100 So I've got to do some magic. 776 00:32:16,100 --> 00:32:17,850 I'm going to re-weight some of my numbers. 777 00:32:17,850 --> 00:32:20,180 I'm going to do some hardcore math stuff. 778 00:32:20,180 --> 00:32:21,704 And then I'm going to try again. 779 00:32:21,704 --> 00:32:23,120 And that's the new training round. 780 00:32:23,120 --> 00:32:27,690 Now you'll notice that kind of towards the right side of each bar or each row, 781 00:32:27,690 --> 00:32:30,680 there's this val loss and val acc. 782 00:32:30,680 --> 00:32:33,500 And they correspond to loss and acc over here. 783 00:32:33,500 --> 00:32:37,820 Loss being, well, loss, which is a metric used in the actual algorithm 784 00:32:37,820 --> 00:32:39,230 or the math underneath. 785 00:32:39,230 --> 00:32:43,580 And acc being accuracy, or the accuracy of the model given 786 00:32:43,580 --> 00:32:47,690 that it is categorically trying to tell what sort of image it 787 00:32:47,690 --> 00:32:49,350 is with multiple categories. 788 00:32:49,350 --> 00:32:53,421 And you can specify all this within the file that builds this model. 789 00:32:53,421 --> 00:32:55,670 But for now we're going to just kind of take it as is. 790 00:32:55,670 --> 00:32:58,820 You don't need to use that accuracy metric, for example. 791 00:32:58,820 --> 00:33:01,937 The val versions of each of those are the validation versions. 792 00:33:01,937 --> 00:33:03,770 They're the ones that say, all right, here's 793 00:33:03,770 --> 00:33:05,870 one that you've never seen before. 794 00:33:05,870 --> 00:33:06,887 How do you do on that? 795 00:33:06,887 --> 00:33:08,970 And for that one, it doesn't readjust its weights. 796 00:33:08,970 --> 00:33:10,469 It just evaluates it a little bit. 797 00:33:10,469 --> 00:33:12,400 It just checks that you're not overfitting. 798 00:33:12,400 --> 00:33:14,150 So that was the first term that was thrown 799 00:33:14,150 --> 00:33:16,570 at me when I was starting to learn this: 800 00:33:16,570 --> 00:33:18,590 what does it mean to overfit your data. 801 00:33:18,590 --> 00:33:19,980 And it's kind of intuitive. 802 00:33:19,980 --> 00:33:21,710 You're doing the fitting too much. 803 00:33:21,710 --> 00:33:25,050 And if you think of this training process as fitting, 804 00:33:25,050 --> 00:33:26,780 then you're just training it too much. 805 00:33:26,780 --> 00:33:28,820 And you can think of this as like with a toddler: 806 00:33:28,820 --> 00:33:32,750 if you give it too limited a pattern, maybe 807 00:33:32,750 --> 00:33:38,000 you tell it, the machine, that everything is kind of so and so. 808 00:33:38,000 --> 00:33:40,670 You give it all the data pieces that it gets. 809 00:33:40,670 --> 00:33:44,570 And it just memorizes the data, but not the actual patterns. 810 00:33:44,570 --> 00:33:45,737 People do this all the time. 811 00:33:45,737 --> 00:33:48,570 You give them a bunch of chemistry facts, they memorize those facts. 812 00:33:48,570 --> 00:33:51,380 If you ask them an extrapolation on those facts, they have no idea. 813 00:33:51,380 --> 00:33:54,800 That's a very common problem, especially in public school systems for example. 814 00:33:54,800 --> 00:33:56,540 Small, political jab. 815 00:33:56,540 --> 00:33:58,970 But that is something that happens with machines too. 816 00:33:58,970 --> 00:34:01,460 If they just end up memorizing their data, 817 00:34:01,460 --> 00:34:06,260 yes they get it right, at least on this kind of loss accuracy metric, 818 00:34:06,260 --> 00:34:08,671 but they won't get it right on the validation accuracy 819 00:34:08,671 --> 00:34:10,670 because that is stuff they've never seen before. 820 00:34:10,670 --> 00:34:12,003 They couldn't have memorized it.
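In Keras, holding out that validation data is one argument to fit; a minimal sketch, assuming you already have a model built and numpy arrays x and y of images and labels:

    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    # Keras sets aside the last 20% of x and y, never trains on it,
    # and reports val_loss and val_acc on it after every epoch.
    history = model.fit(x, y, epochs=10, batch_size=32, validation_split=0.2)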
821 00:34:12,003 --> 00:34:14,810 It's basically the same idea behind test taking. 822 00:34:14,810 --> 00:34:15,800 I give you some data. 823 00:34:15,800 --> 00:34:18,300 I expect you to learn the patterns, not the actual data. 824 00:34:18,300 --> 00:34:21,170 And then I give you a test that has data you've never seen before 825 00:34:21,170 --> 00:34:23,920 but has the same patterns, and you should be able to figure it out. 826 00:34:23,920 --> 00:34:27,850 And so you can see-- unfortunately it got kind of locked over to the next 827 00:34:27,850 --> 00:34:31,100 one, but we're going to say that these 0.3s are roughly about the same. 828 00:34:31,100 --> 00:34:36,610 You'll notice they are, it's like 0.32, 0.35, 0.38, 0.38. 829 00:34:36,610 --> 00:34:39,120 And you'll notice they're roughly guessing. 830 00:34:39,120 --> 00:34:43,317 The machine is basically saying, hey, if I say that this one is neutral, 831 00:34:43,317 --> 00:34:45,900 this one is negative, this one's positive-- neutral, negative, 832 00:34:45,900 --> 00:34:46,620 positive-- 833 00:34:46,620 --> 00:34:49,830 I get it roughly a third right, which is not so good. 834 00:34:49,830 --> 00:34:52,170 And it's because the model that I handed you is, well, 835 00:34:52,170 --> 00:34:53,489 not particularly intelligent. 836 00:34:53,489 --> 00:34:55,650 Also the data is not very well collected either. 837 00:34:55,650 --> 00:34:57,480 And you'll notice that even the accuracy-- 838 00:34:57,480 --> 00:35:01,650 it didn't really get the chance to memorize everything, thank god-- 839 00:35:01,650 --> 00:35:02,920 is still pretty low. 840 00:35:02,920 --> 00:35:04,960 It's roughly guessing here too. 841 00:35:04,960 --> 00:35:06,220 So that's not too good. 842 00:35:06,220 --> 00:35:08,560 And it asked me, do I want to save the model. 843 00:35:08,560 --> 00:35:11,280 And you'll notice, as one of my points about the myths of machine 844 00:35:11,280 --> 00:35:13,320 learning, that didn't take that long. 845 00:35:13,320 --> 00:35:15,970 We were talking here for a couple of minutes and it's done. 846 00:35:15,970 --> 00:35:16,786 It's now trained. 847 00:35:16,786 --> 00:35:18,660 So now we have kind of a computer vision part 848 00:35:18,660 --> 00:35:22,800 of this, which is somewhat annoying because I did hack it together. 849 00:35:22,800 --> 00:35:23,880 But that's OK. 850 00:35:23,880 --> 00:35:29,290 So we have our kind of live feed of the screen as it's going right now. 851 00:35:29,290 --> 00:35:32,310 And this allows us to take pictures of things. 852 00:35:32,310 --> 00:35:34,170 So there is actual screenshot software 853 00:35:34,170 --> 00:35:36,003 that you could use that is easier than this, 854 00:35:36,003 --> 00:35:40,320 but it was an easy way for me to introduce the ideas of computer vision 855 00:35:40,320 --> 00:35:41,440 specifically. 856 00:35:41,440 --> 00:35:45,294 So what I can do is I can say, all right, let's pull up an emoji.
857 00:35:45,294 --> 00:35:47,460 Because I want to take a picture of that emoji and I 858 00:35:47,460 --> 00:35:49,376 want my machine to tell me what that emoji is. 859 00:35:49,376 --> 00:35:51,810 Is it positive, negative, or neutral? 860 00:35:51,810 --> 00:35:54,799 And I can say smiley emoji into Google. 861 00:35:54,799 --> 00:35:57,090 And the reason I do this live is to prove to you that I 862 00:35:57,090 --> 00:35:58,950 didn't just hardcode it into the machine. 863 00:35:58,950 --> 00:36:01,290 I'm willing to bet it will not get it correct. 864 00:36:01,290 --> 00:36:03,770 But if it does, kudos. 865 00:36:03,770 --> 00:36:04,770 So we have this emoji. 866 00:36:04,770 --> 00:36:07,579 And you'll see it pops up in our feed right over here. 867 00:36:07,579 --> 00:36:09,870 It actually pops up I think an infinite number of times 868 00:36:09,870 --> 00:36:11,453 if you were to like look close enough. 869 00:36:11,453 --> 00:36:13,320 But the reason I have it pop up in the feed 870 00:36:13,320 --> 00:36:18,820 is that I can drag over the actual feed and have it select that picture. 871 00:36:18,820 --> 00:36:20,445 And so it takes that picture. 872 00:36:20,445 --> 00:36:22,320 Oh, I'm so psyched that I got that one right. 873 00:36:22,320 --> 00:36:24,070 That's lit. 874 00:36:24,070 --> 00:36:27,120 So you'll notice that in the actual terminal output, 875 00:36:27,120 --> 00:36:31,920 it gave me probabilities that the thing was correct in being 876 00:36:31,920 --> 00:36:33,747 positive or negative or neutral. 877 00:36:33,747 --> 00:36:36,830 I'm really just psyched that it got it right with really high probability. 878 00:36:36,830 --> 00:36:39,150 If you're above like 70% probability and you 879 00:36:39,150 --> 00:36:42,440 have a good enough number of labels, you should be pretty psyched. 880 00:36:42,440 --> 00:36:45,690 Even though this one got it with 94% likelihood, 881 00:36:45,690 --> 00:36:47,220 it was probably just guessing. 882 00:36:47,220 --> 00:36:51,300 It's like the toddler or this small child that's like, it's that one. 883 00:36:51,300 --> 00:36:52,960 And they happen to be correct. 884 00:36:52,960 --> 00:36:56,940 And they're like, yes, I'm so smart. 885 00:36:56,940 --> 00:36:59,280 Like this machine, it's really not that great. 886 00:36:59,280 --> 00:37:02,880 But to its credit and to the seminar's credit, 887 00:37:02,880 --> 00:37:06,930 we have a dumb machine that I've handed very little data 888 00:37:06,930 --> 00:37:13,060 and I've trained for a total of like four minutes in front of all of us. 889 00:37:13,060 --> 00:37:13,990 And it got it right. 890 00:37:13,990 --> 00:37:15,400 It was able to figure stuff out. 891 00:37:15,400 --> 00:37:16,540 It figured out a pattern. 892 00:37:16,540 --> 00:37:19,244 And it said that it also could have been a little bit negative. 893 00:37:19,244 --> 00:37:21,910 And you'll notice that there are some attributes that are shared 894 00:37:21,910 --> 00:37:24,040 among smiley faces and frowny faces. 895 00:37:24,040 --> 00:37:26,970 They both have eyes, and in particular emojis are pretty standard. 896 00:37:26,970 --> 00:37:28,330 They have that same rough shape. 897 00:37:28,330 --> 00:37:29,530 They're the same color. 898 00:37:29,530 --> 00:37:33,040 And they do kind of have the same width of smile or frown 899 00:37:33,040 --> 00:37:35,210 even though it's in different orientations. 900 00:37:35,210 --> 00:37:38,530 So the machine didn't do a terrible job. 901 00:37:38,530 --> 00:37:40,300 And that's kind of nuts.
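Those per-class probabilities are just what a softmax classifier's predict call spits out; a rough sketch, assuming a trained Keras model and a screenshot saved to a made-up file name:

    import cv2
    import numpy as np

    img = cv2.imread("capture.png")             # the region grabbed from the feed
    img = cv2.resize(img, (200, 200)) / 255.0   # match the training size and scale
    probs = model.predict(np.expand_dims(img, 0))[0]  # shape (3,): one score each
    for name, p in zip(["positive", "negative", "neutral"], probs):
        print("{}: {:.0%}".format(name, p))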
902 00:37:40,300 --> 00:37:44,440 And hopefully that proves at least in a small way that it is accessible 903 00:37:44,440 --> 00:37:46,000 and it is easy to do. 904 00:37:46,000 --> 00:37:48,106 You might have to sit there and tinker with code. 905 00:37:48,106 --> 00:37:50,230 But if you're not sitting there tinkering with code, 906 00:37:50,230 --> 00:37:51,580 are you really coding? 907 00:37:51,580 --> 00:37:55,440 If you're not sitting there debugging things, why are you here? 908 00:37:55,440 --> 00:37:57,330 So the debugging is a good amount of coding. 909 00:37:57,330 --> 00:37:59,163 And this, just like any other piece of code, 910 00:37:59,163 --> 00:38:03,260 can be done and debugged even by people that just started programming. 911 00:38:03,260 --> 00:38:06,887 You can do this at this level in a couple of hours, maybe 912 00:38:06,887 --> 00:38:07,595 a couple of days. 913 00:38:07,595 --> 00:38:09,594 You might have to research a little bit and say, 914 00:38:09,594 --> 00:38:11,337 oh, crap, what does it mean to overfit? 915 00:38:11,337 --> 00:38:14,420 What did that guy say, the crazy dude that talked about cats and triangles 916 00:38:14,420 --> 00:38:15,740 for a while? 917 00:38:15,740 --> 00:38:17,260 Well, that's OK. 918 00:38:17,260 --> 00:38:18,790 That's how this works. 919 00:38:18,790 --> 00:38:22,510 But it's not any more difficult than if I said, oh, go use an API 920 00:38:22,510 --> 00:38:26,200 and retrieve some information via PUT request for me. 921 00:38:26,200 --> 00:38:27,970 Just as complicated sounding. 922 00:38:27,970 --> 00:38:30,447 But it's all the same idea. 923 00:38:30,447 --> 00:38:32,780 You have to just sit down and learn it for a little bit. 924 00:38:32,780 --> 00:38:35,080 And in this case, you have a pretty decent example. 925 00:38:35,080 --> 00:38:39,490 That was-- I'm going to stress that that was kind of luck that that worked. 926 00:38:39,490 --> 00:38:40,370 I'm very proud of it. 927 00:38:40,370 --> 00:38:42,430 But still kind of ridiculous. 928 00:38:42,430 --> 00:38:46,190 So let's say that we wanted to improve on a model that already exists. 929 00:38:46,190 --> 00:38:49,960 So there are smarter people than me that have written lots and lots of machine 930 00:38:49,960 --> 00:38:50,840 learning algorithms. 931 00:38:50,840 --> 00:38:53,890 There are, I would argue, more people more intelligent than I 932 00:38:53,890 --> 00:38:56,720 am that have done this than not. 933 00:38:56,720 --> 00:39:00,760 So I actually included one of those pre-trained models 934 00:39:00,760 --> 00:39:02,890 because I figured it'd be kind of cool to demo. 935 00:39:02,890 --> 00:39:04,880 And so in the code that you have, you actually 936 00:39:04,880 --> 00:39:06,880 have the ability to pull up a pre-trained model. 937 00:39:06,880 --> 00:39:09,130 It's called Inception V3. 938 00:39:09,130 --> 00:39:12,390 I think it's pretty bad ass that they call it that. 939 00:39:12,390 --> 00:39:15,980 A lot of the other ones are like VGG16 and stuff like that. 940 00:39:15,980 --> 00:39:17,710 But this one is called Inception V3. 941 00:39:17,710 --> 00:39:20,670 I like the sound of that name. 942 00:39:20,670 --> 00:39:24,550 And so you can run this program with that flag, the pre-trained flag. 943 00:39:24,550 --> 00:39:28,060 It still pulls up the TensorFlow back end because it is a TensorFlow model.
944 00:39:28,060 --> 00:39:32,510 TensorFlow being the underlying machine learning software of Keras, 945 00:39:32,510 --> 00:39:34,359 or at least the way that I designed it. 946 00:39:34,359 --> 00:39:37,150 It still loads the data even though it doesn't have to in this case 947 00:39:37,150 --> 00:39:39,460 because we're going to look at a different piece of data. 948 00:39:39,460 --> 00:39:40,730 I don't really want to save the model. 949 00:39:40,730 --> 00:39:41,590 It's a little big. 950 00:39:41,590 --> 00:39:43,423 But it's going to bring up that same feed so 951 00:39:43,423 --> 00:39:46,000 that I can take a picture of my screen. 952 00:39:46,000 --> 00:39:48,010 And basically, what we're going to do is 953 00:39:48,010 --> 00:39:51,930 we're going to pull up a picture of a cat, particularly this cat. 954 00:39:51,930 --> 00:39:52,890 I really like this cat. 955 00:39:52,890 --> 00:39:53,640 It's kind of cute. 956 00:39:53,640 --> 00:39:56,064 So this is an Egyptian cat. 957 00:39:56,064 --> 00:39:58,230 And what I want to do is I'm going to take my mouse, 958 00:39:58,230 --> 00:40:00,180 I'm going to click, drag it over. 959 00:40:00,180 --> 00:40:02,930 I'm going to take a picture of that cat. 960 00:40:02,930 --> 00:40:04,940 And what I can do is then I can say, all right, 961 00:40:04,940 --> 00:40:07,790 let's take a look at what my machine said it was. 962 00:40:07,790 --> 00:40:11,419 And if you'll read carefully, this one returns five labels. 963 00:40:11,419 --> 00:40:13,460 There are actually 1,000 labels it has access to. 964 00:40:13,460 --> 00:40:14,501 It's not just these five. 965 00:40:14,501 --> 00:40:16,070 I just picked the top five. 966 00:40:16,070 --> 00:40:18,050 And you'll notice that while the bottom one is 967 00:40:18,050 --> 00:40:25,370 a Windows screen, which is not wrong, that isn't the most accurate one, 968 00:40:25,370 --> 00:40:26,330 not even close. 969 00:40:26,330 --> 00:40:28,820 Because these are percentages, not fractions. 970 00:40:28,820 --> 00:40:33,170 The closest one by far was at 94%, or roughly the same confidence 971 00:40:33,170 --> 00:40:35,130 that my other model had. 972 00:40:35,130 --> 00:40:36,360 And it's an Egyptian cat. 973 00:40:36,360 --> 00:40:38,818 And so that's one of the powerful parts of machine learning: 974 00:40:38,818 --> 00:40:41,530 this model was even faster than the previous one. 975 00:40:41,530 --> 00:40:44,010 And it got just an arbitrary picture of a cat 976 00:40:44,010 --> 00:40:49,380 that I picked off the internet correct with 94% confidence. 977 00:40:49,380 --> 00:40:50,400 That's nuts. 978 00:40:50,400 --> 00:40:54,930 I just took a random picture and then picked it and it works.
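For reference, pulling up a pre-trained Inception V3 yourself is only a few lines through Keras's stock applications module (this is the generic route, not necessarily exactly how the repo wires it in; cat.png is a made-up file name):

    import numpy as np
    from keras.applications.inception_v3 import (
        InceptionV3, preprocess_input, decode_predictions)
    from keras.preprocessing import image

    model = InceptionV3(weights="imagenet")   # downloads the trained weights
    img = image.load_img("cat.png", target_size=(299, 299))  # V3 expects 299x299
    x = preprocess_input(np.expand_dims(image.img_to_array(img), 0))
    print(decode_predictions(model.predict(x), top=5)[0])  # top 5 of 1,000 labels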
979 00:40:54,930 --> 00:40:57,120 And that's really the point that I want to stress 980 00:40:57,120 --> 00:40:59,490 here is in a couple of minutes, admittedly 981 00:40:59,490 --> 00:41:02,700 I've had the advantage of prepping this for a little while, 982 00:41:02,700 --> 00:41:05,310 you can sit here and build an algorithm that 983 00:41:05,310 --> 00:41:08,020 identifies things pretty accurately. 984 00:41:08,020 --> 00:41:11,790 And so if you wanted to build a facial recognition software algorithm, 985 00:41:11,790 --> 00:41:12,360 it's this. 986 00:41:12,360 --> 00:41:13,410 It's the same idea. 987 00:41:13,410 --> 00:41:14,580 You just change the data. 988 00:41:14,580 --> 00:41:15,780 And you change your model a little bit. 989 00:41:15,780 --> 00:41:16,821 Make it a little smarter. 990 00:41:16,821 --> 00:41:19,237 Make it better suited specifically for faces. 991 00:41:19,237 --> 00:41:19,820 But that's it. 992 00:41:19,820 --> 00:41:22,180 That's really the only big difference here. 993 00:41:22,180 --> 00:41:25,860 This idea, these buzzwords, machine learning, computer vision, 994 00:41:25,860 --> 00:41:29,550 they're just as accessible to you and me as beginners in computer science 995 00:41:29,550 --> 00:41:33,270 as they are to someone who has done a bunch of years of computer science 996 00:41:33,270 --> 00:41:34,950 and is maybe a computer wizard. 997 00:41:34,950 --> 00:41:36,790 Maybe they can do cooler stuff with it. 998 00:41:36,790 --> 00:41:40,680 They can put all sorts of APIs and other acronyms and scary sounding words 999 00:41:40,680 --> 00:41:42,010 behind it. 1000 00:41:42,010 --> 00:41:43,440 It's the same thing underneath. 1001 00:41:43,440 --> 00:41:49,014 It's all just working as a machine should work, deterministically 1002 00:41:49,014 --> 00:41:50,430 and hopefully the way you want it. 1003 00:41:50,430 --> 00:41:53,640 So we're going to look a little bit at the code, because I think 1004 00:41:53,640 --> 00:41:55,569 that that is a worthwhile endeavor. 1005 00:41:55,569 --> 00:41:58,860 This is also like the worst possible way you can check what directory you're in, 1006 00:41:58,860 --> 00:42:02,500 but I'm talking at the same time so I feel like it's justified. 1007 00:42:02,500 --> 00:42:04,050 I use Visual Studio Code. 1008 00:42:04,050 --> 00:42:07,180 And I really hope that I don't have anything like ridiculous open. 1009 00:42:07,180 --> 00:42:11,330 We're going to just expand it a little bit, make it easier to read. 1010 00:42:11,330 --> 00:42:12,907 So we have code. 1011 00:42:12,907 --> 00:42:15,490 And this is usually the part where people are like, all right, 1012 00:42:15,490 --> 00:42:16,930 now I'm out. 1013 00:42:16,930 --> 00:42:19,400 We got there, we're done. 1014 00:42:19,400 --> 00:42:20,740 And if you're not, awesome. 1015 00:42:20,740 --> 00:42:22,840 I would have thought that the math would have scared you away. 1016 00:42:22,840 --> 00:42:24,260 And since I've shown that there's no math, 1017 00:42:24,260 --> 00:42:25,930 I'm hoping that you're still here. 1018 00:42:25,930 --> 00:42:28,420 So we're sitting here looking at a pretty random file, 1019 00:42:28,420 --> 00:42:30,320 but this is actually the ML model file. 1020 00:42:30,320 --> 00:42:34,000 So this is a file that tells you-- or, actually, 1021 00:42:34,000 --> 00:42:37,600 that codes in-- all of the attributes of the actual model. 1022 00:42:37,600 --> 00:42:41,410 This is the class that has the save method of the model. 1023 00:42:41,410 --> 00:42:44,215 It has the part that builds it or predicts on data. 1024 00:42:44,215 --> 00:42:46,090 It has all of the things that you could maybe 1025 00:42:46,090 --> 00:42:50,360 need to get what we just showed you in the example. 1026 00:42:50,360 --> 00:42:54,910 So we're looking at here and what I want to kind of draw our attention to 1027 00:42:54,910 --> 00:42:57,940 is right around here. 1028 00:42:57,940 --> 00:42:59,950 Looking at this part. 1029 00:42:59,950 --> 00:43:03,700 This is pretty much the bulk of the model that you just saw. 1030 00:43:03,700 --> 00:43:04,720 That's it. 1031 00:43:04,720 --> 00:43:10,500 If you don't count the empty lines, it's just five lines of code, 1032 00:43:10,500 --> 00:43:12,880 and my mouse highlighting everything. 1033 00:43:12,880 --> 00:43:15,244 So it's pretty simple.
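In the spirit of those five lines, here's a minimal sketch of a Keras stack with the same ingredients (the filter counts, the 16 neurons, and the relu activation are assumptions for illustration, not copied from the repo):

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

    model = Sequential()
    model.add(Conv2D(16, (3, 3), activation="relu", input_shape=(200, 200, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))  # the 2 by 2 pool discussed below
    model.add(Dropout(0.5))                    # drop half the activations in training
    model.add(Flatten())                       # collapse the image data to one vector
    model.add(Dense(16, activation="relu"))    # 16 fully connected neurons
    model.add(Dense(3, activation="softmax"))  # one score per class: pos/neg/neutral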
1034 00:43:15,244 --> 00:43:16,410 It's pretty straightforward. 1035 00:43:16,410 --> 00:43:19,190 Now, a lot of these terms are maybe a little bit more confusing. 1036 00:43:19,190 --> 00:43:22,610 Max pooling with dropout and then you flatten it like a pancake 1037 00:43:22,610 --> 00:43:24,950 and then you do a dense of something, god knows what. 1038 00:43:24,950 --> 00:43:27,280 You activate that but there's a pool size here, 1039 00:43:27,280 --> 00:43:28,530 there's a random number there. 1040 00:43:28,530 --> 00:43:29,430 I think it's magic. 1041 00:43:29,430 --> 00:43:30,140 I don't know why. 1042 00:43:30,140 --> 00:43:32,970 And then it gets very complicated very quickly. 1043 00:43:32,970 --> 00:43:36,020 But again, like in CS50 and like in any problem, 1044 00:43:36,020 --> 00:43:38,840 really just break it into smaller and smaller pieces. 1045 00:43:38,840 --> 00:43:41,780 Let's start with maybe the easiest piece of code here-- 1046 00:43:41,780 --> 00:43:42,620 flatten. 1047 00:43:42,620 --> 00:43:44,010 It has no arguments. 1048 00:43:44,010 --> 00:43:46,100 So all we had to do was add flatten. 1049 00:43:46,100 --> 00:43:50,090 And maybe even easier is why are we adding things to the model. 1050 00:43:50,090 --> 00:43:51,540 How does this model work? 1051 00:43:51,540 --> 00:43:54,830 You can think of it as like a stack of layers. 1052 00:43:54,830 --> 00:43:57,920 And you take the input and depending on how your stack is oriented, 1053 00:43:57,920 --> 00:44:01,594 not the data structure stack, but like a literal physical stack, 1054 00:44:01,594 --> 00:44:04,010 you're either dropping inputs in or you're putting them up, 1055 00:44:04,010 --> 00:44:05,870 but either way it's going through the stack. 1056 00:44:05,870 --> 00:44:08,300 And then that first layer takes in that input and it says, 1057 00:44:08,300 --> 00:44:10,610 all right, we're going to do some magic with that, and drops 1058 00:44:10,610 --> 00:44:11,390 it into the next layer. 1059 00:44:11,390 --> 00:44:12,850 And then that does the same thing. 1060 00:44:12,850 --> 00:44:16,770 Until it gets to the last one, which is located here. 1061 00:44:16,770 --> 00:44:20,690 And that last one says, I know what it is. 1062 00:44:20,690 --> 00:44:21,650 It's a triangle. 1063 00:44:21,650 --> 00:44:23,840 And it throws out that number to you. 1064 00:44:23,840 --> 00:44:25,430 And that's all this really does. 1065 00:44:25,430 --> 00:44:29,347 It's just a bunch of math that takes in data points and does stuff with them. 1066 00:44:29,347 --> 00:44:30,680 So that's why we're adding them. 1067 00:44:30,680 --> 00:44:33,840 And the order in which we add them changes the order of the stack. 1068 00:44:33,840 --> 00:44:35,490 And that's not too bad. 1069 00:44:35,490 --> 00:44:39,050 But then we have these weird words, like max pooling, dropout, flatten, 1070 00:44:39,050 --> 00:44:40,370 and dense. 1071 00:44:40,370 --> 00:44:43,204 And those aren't as difficult to understand as you may think either. 1072 00:44:43,204 --> 00:44:45,870 We're going to start with flatten because it takes no arguments. 1073 00:44:45,870 --> 00:44:48,270 But it will be pretty easy to move on from there. 1074 00:44:48,270 --> 00:44:53,240 So adding a flattening layer, this might seem a little ridiculous. 1075 00:44:53,240 --> 00:44:55,080 It might seem unnecessary even.
1076 00:44:55,080 --> 00:44:59,680 But if you're looking at a picture, and that picture captured all sorts of data 1077 00:44:59,680 --> 00:45:07,670 points, and maybe it was x long and y wide, and some amount thick, 1078 00:45:07,670 --> 00:45:10,580 we really only have to worry about the width and the height. 1079 00:45:10,580 --> 00:45:13,010 And every other piece of information can probably somehow 1080 00:45:13,010 --> 00:45:15,800 be encoded without having it be stretched out like this. 1081 00:45:15,800 --> 00:45:18,980 Like let's say that that stretching out is color, r, g, and b. 1082 00:45:18,980 --> 00:45:23,520 So even if we have our image kind of laid out in kind of this rectangle, 1083 00:45:23,520 --> 00:45:25,722 there are three layers of depth to it. 1084 00:45:25,722 --> 00:45:28,430 The first layer is how much red is in that pixel, how much green, 1085 00:45:28,430 --> 00:45:29,870 and then how much blue. 1086 00:45:29,870 --> 00:45:33,590 But what if we don't really care, or we can encode that data somehow 1087 00:45:33,590 --> 00:45:34,850 some other way? 1088 00:45:34,850 --> 00:45:38,240 Then we can flatten the picture, so to speak, and hand you 1089 00:45:38,240 --> 00:45:41,110 a two dimensional thing instead of a three dimensional one. 1090 00:45:41,110 --> 00:45:43,610 And if you do the same but take the two dimensional thing 1091 00:45:43,610 --> 00:45:46,640 and collapse it into a line, you've flattened again. 1092 00:45:46,640 --> 00:45:48,990 And so this concept is really not that difficult. 1093 00:45:48,990 --> 00:45:51,170 It's actually something that we do anyway. 1094 00:45:51,170 --> 00:45:54,560 If you wanted to analyze an image and you didn't really care about the color, 1095 00:45:54,560 --> 00:45:57,740 for example, you could flatten it, make it black and white. 1096 00:45:57,740 --> 00:45:59,270 You've now flattened an image. 1097 00:45:59,270 --> 00:46:03,080 And so this, although it might be a little bit strange or weirdly worded, 1098 00:46:03,080 --> 00:46:05,870 it does something that we're actually pretty familiar with. 1099 00:46:05,870 --> 00:46:08,690 The next easiest one is probably dropout. 1100 00:46:08,690 --> 00:46:11,720 And this plays a role in something that we've already seen. 1101 00:46:11,720 --> 00:46:14,990 This plays a role in basically overfitting. 1102 00:46:14,990 --> 00:46:16,760 So we've talked about this term before. 1103 00:46:16,760 --> 00:46:19,110 We've taught a toddler a bunch of facts. 1104 00:46:19,110 --> 00:46:21,020 And that toddler knows those facts. 1105 00:46:21,020 --> 00:46:23,940 It knows what a brachiosaurus is. 1106 00:46:23,940 --> 00:46:24,640 That's it. 1107 00:46:24,640 --> 00:46:27,055 And so now what we want to do in our model 1108 00:46:27,055 --> 00:46:28,930 is make sure that our model isn't doing that. 1109 00:46:28,930 --> 00:46:31,320 It's not just going, the answer is a, the next answer 1110 00:46:31,320 --> 00:46:33,340 is b, the next one is c, and so on. 1111 00:46:33,340 --> 00:46:36,130 We want our model to pick up on patterns and say, well, 1112 00:46:36,130 --> 00:46:39,740 according to how those patterns work, that should be this. 1113 00:46:39,740 --> 00:46:40,980 That's a much better model. 1114 00:46:40,980 --> 00:46:43,720 And so in this case, what we do is we introduce dropout. 1115 00:46:43,720 --> 00:46:46,300 And you could think of that as every once in a 1116 00:46:46,300 --> 00:46:49,840 while we just kind of randomly kick out some data.
1117 00:46:49,840 --> 00:46:52,910 It's with 50% probability, supposedly. 1118 00:46:52,910 --> 00:46:55,470 So the fraction there is to tell it how much data 1119 00:46:55,470 --> 00:46:59,330 to kick out, not the probability. 1120 00:46:59,330 --> 00:47:00,030 My mistake. 1121 00:47:00,030 --> 00:47:01,946 This is actually a fraction of the data that's 1122 00:47:01,946 --> 00:47:06,370 going to be kind of dropped in a given section, in this layer. 1123 00:47:06,370 --> 00:47:08,730 And so what that layer says is, like, all right, 1124 00:47:08,730 --> 00:47:09,990 we're going to just kind of every once in a while 1125 00:47:09,990 --> 00:47:11,520 not pick a piece of data. 1126 00:47:11,520 --> 00:47:14,350 And then we're going to move on and do something else. 1127 00:47:14,350 --> 00:47:18,660 And in that way, we don't give it the same data set every time. 1128 00:47:18,660 --> 00:47:19,927 We give it a little bit less. 1129 00:47:19,927 --> 00:47:22,260 We say, all right, here's the data you get to pick from. 1130 00:47:22,260 --> 00:47:24,260 We're actually only going to hand you this much. 1131 00:47:24,260 --> 00:47:24,972 Here you go. 1132 00:47:24,972 --> 00:47:27,930 And then the next time it comes around, it might be a different subset. 1133 00:47:27,930 --> 00:47:30,690 Maybe I'll hand you this subset instead of the previous one. 1134 00:47:30,690 --> 00:47:34,020 And in that way, we can avoid, to a degree, overfitting. 1135 00:47:34,020 --> 00:47:38,160 Now, 50% of the data being dropped out every time, pretty high. 1136 00:47:38,160 --> 00:47:40,652 And so I've introduced that here to kind of combat 1137 00:47:40,652 --> 00:47:42,360 the overfitting that will occur if we are 1138 00:47:42,360 --> 00:47:45,004 training something hundreds of times in a couple of minutes. 1139 00:47:45,004 --> 00:47:46,920 But you could lower that and see what happens. 1140 00:47:46,920 --> 00:47:50,520 It'll probably get very, very good at the kind of training data. 1141 00:47:50,520 --> 00:47:53,310 But it'll be pretty bad at the actual validation or testing data. 1142 00:47:53,310 --> 00:47:54,750 So maybe not ideal. 1143 00:47:54,750 --> 00:47:57,360 But you could also increase this so much that it never 1144 00:47:57,360 --> 00:48:02,194 gets good at the training data, and, well, the testing data will follow suit. 1145 00:48:02,194 --> 00:48:03,360 And that's not ideal either. 1146 00:48:03,360 --> 00:48:05,150 So there is some kind of give and take here. 1147 00:48:05,150 --> 00:48:07,150 You do have to mess around with it a little bit. 1148 00:48:07,150 --> 00:48:10,030 And you can add more or fewer of these layers as you see fit. 1149 00:48:10,030 --> 00:48:12,340 You'll notice there is some dimensionality that needs to line up. 1150 00:48:12,340 --> 00:48:14,465 For example, if you got rid of the flatten layer, 1151 00:48:14,465 --> 00:48:17,670 Keras will just be like, I don't understand what's going on. 1152 00:48:17,670 --> 00:48:19,290 And it'll kind of freak out on you. 1153 00:48:19,290 --> 00:48:23,740 But other than that, you can play around with these more or less as you see fit. 1154 00:48:23,740 --> 00:48:29,340 The other kind of major one, before we go into max pooling 2D, is dense. 1155 00:48:29,340 --> 00:48:31,140 And dense has some activation.
1156 00:48:31,140 --> 00:48:35,190 Imagine dense as being some distribution of weights, 1157 00:48:35,190 --> 00:48:38,400 as they're called, or numbers that tell the computer what 1158 00:48:38,400 --> 00:48:42,930 the value of the decision it makes is. So if I tell you that you touching 1159 00:48:42,930 --> 00:48:45,480 a stove has a value of 100, and you touching 1160 00:48:45,480 --> 00:48:48,960 the ground has a value of like 40, and you touching your own skin has 1161 00:48:48,960 --> 00:48:51,390 a value of like 0, then you can pretty easily tell 1162 00:48:51,390 --> 00:48:53,040 where my value system is going there. 1163 00:48:53,040 --> 00:48:54,840 Which one is more dangerous. 1164 00:48:54,840 --> 00:48:57,270 And that sort of value system is at play here, 1165 00:48:57,270 --> 00:49:00,330 but the activation tells it how we want to just 1166 00:49:00,330 --> 00:49:02,850 kind of pre-weight things, to a degree. 1167 00:49:02,850 --> 00:49:06,900 And [INAUDIBLE] happens to just be a common one that people use with images. 1168 00:49:06,900 --> 00:49:08,070 There are a bunch of them. 1169 00:49:08,070 --> 00:49:09,400 You're free to look them up. 1170 00:49:09,400 --> 00:49:11,760 Keras comes built in with a ton of them. 1171 00:49:11,760 --> 00:49:12,750 There's like softmax. 1172 00:49:12,750 --> 00:49:13,637 There's tanh. 1173 00:49:13,637 --> 00:49:14,970 There's all sorts of other ones. 1174 00:49:14,970 --> 00:49:16,590 And they all mean varying things. 1175 00:49:16,590 --> 00:49:17,940 That can be very technical. 1176 00:49:17,940 --> 00:49:21,251 Sometimes you can just play around with them and see which ones work better. 1177 00:49:21,251 --> 00:49:23,750 You can just swap them out every once in a while and try it. 1178 00:49:23,750 --> 00:49:24,416 Which one works? 1179 00:49:24,416 --> 00:49:25,260 Which one doesn't? 1180 00:49:25,260 --> 00:49:29,070 And you'll often find that [INAUDIBLE] works particularly well with images 1181 00:49:29,070 --> 00:49:30,940 just because of the math underneath. 1182 00:49:30,940 --> 00:49:33,390 And if you're interested in that math, feel free to talk to me afterward. 1183 00:49:33,390 --> 00:49:35,556 But if you're not, we're going to just kind of leave 1184 00:49:35,556 --> 00:49:38,490 that as an activation that tells it the value of its decisions. 1185 00:49:38,490 --> 00:49:40,650 And that's the starting value of its decisions. 1186 00:49:40,650 --> 00:49:42,022 Later, it re-weights itself. 1187 00:49:42,022 --> 00:49:43,980 It says, oh, yeah, no, that was a bad decision. 1188 00:49:43,980 --> 00:49:45,480 We're rechanging that. 1189 00:49:45,480 --> 00:49:48,750 And then dense is the actual thing being added to this layer. 1190 00:49:48,750 --> 00:49:50,460 That is the name of this layer. 1191 00:49:50,460 --> 00:49:54,220 And dense in and of itself is really just saying, 1192 00:49:54,220 --> 00:49:57,090 hey, we're going to have a bunch of nodes or neurons. 1193 00:49:57,090 --> 00:49:59,400 We're going to have 16 of them specifically. 1194 00:49:59,400 --> 00:50:02,650 And we're going to have them all be able to communicate with each other. 1195 00:50:02,650 --> 00:50:04,776 And so what that just says is if I make a decision, 1196 00:50:04,776 --> 00:50:06,566 I'm going to tell everyone around me that's 1197 00:50:06,566 --> 00:50:08,200 the decision I made and it was bad. 1198 00:50:08,200 --> 00:50:08,960 Don't do that one. 1199 00:50:08,960 --> 00:50:10,230 That was a terrible decision.
1200 00:50:10,230 --> 00:50:14,520 It's like if you get a little bit too drunk on water one night 1201 00:50:14,520 --> 00:50:17,970 and you just go around to everyone the next day like, hey, guys, bad plan. 1202 00:50:17,970 --> 00:50:18,990 Don't do that. 1203 00:50:18,990 --> 00:50:20,370 You do not make that decision. 1204 00:50:20,370 --> 00:50:21,840 Very easy way of dealing with that. 1205 00:50:21,840 --> 00:50:24,840 And in that layer you have a bunch of neurons all talking to each other. 1206 00:50:24,840 --> 00:50:29,430 And some people's immediate solution is, well, I could just add more neurons. 1207 00:50:29,430 --> 00:50:30,330 Sometimes. 1208 00:50:30,330 --> 00:50:31,140 Sometimes not. 1209 00:50:31,140 --> 00:50:33,610 And you'll notice that it makes your computer a lot slower. 1210 00:50:33,610 --> 00:50:36,270 So there is always a trade off. 1211 00:50:36,270 --> 00:50:39,600 And then we have our max pooling 2D, which 1212 00:50:39,600 --> 00:50:42,990 is actually pretty intuitively named if you know what's going on underneath. 1213 00:50:42,990 --> 00:50:45,780 But if you don't, it's just like, what the fudge. 1214 00:50:45,780 --> 00:50:49,170 So what ends up going on here is I gave it a pool size. 1215 00:50:49,170 --> 00:50:50,370 I said 2 by 2. 1216 00:50:50,370 --> 00:50:52,380 And so if you imagine that in your image you 1217 00:50:52,380 --> 00:50:56,430 have 2 by 2 sections of pixels, basically a square that has four 1218 00:50:56,430 --> 00:50:58,890 pixels in it sliding across the image, 1219 00:50:58,890 --> 00:51:02,550 then really what I'm doing here is I'm kind of pooling them all together 1220 00:51:02,550 --> 00:51:03,810 and taking the max. 1221 00:51:03,810 --> 00:51:04,790 That's it. 1222 00:51:04,790 --> 00:51:07,290 And I'm taking that max one and saying that that is probably 1223 00:51:07,290 --> 00:51:09,000 the feature that determines things. 1224 00:51:09,000 --> 00:51:11,294 And in images, that can sometimes be the case. 1225 00:51:11,294 --> 00:51:13,710 Particularly for this kind of image, it works pretty well. 1226 00:51:13,710 --> 00:51:14,935 There's also min pooling. 1227 00:51:14,935 --> 00:51:15,810 You take the minimum. 1228 00:51:15,810 --> 00:51:17,674 That's the one that matters. 1229 00:51:17,674 --> 00:51:20,090 There are cases where that might be particularly relevant. 1230 00:51:20,090 --> 00:51:22,150 What if you're looking at the negative of images? 1231 00:51:22,150 --> 00:51:23,160 Maybe that applies here. 1232 00:51:23,160 --> 00:51:24,180 Maybe it doesn't. 1233 00:51:24,180 --> 00:51:25,900 And so that's something to keep in mind. 1234 00:51:25,900 --> 00:51:27,900 And there's, I believe, also average pooling. 1235 00:51:27,900 --> 00:51:30,360 It might be called mean pooling in Keras. 1236 00:51:30,360 --> 00:51:33,750 But it does the same thing that you would just think of: 1237 00:51:33,750 --> 00:51:36,600 takes that, averages it, and then does that as it goes. 1238 00:51:36,600 --> 00:51:40,200 And the pool size can change if you think that's appropriate. 1239 00:51:40,200 --> 00:51:43,740 2 by 2 is pretty fitting here because we aren't really saying, yeah, 1240 00:51:43,740 --> 00:51:46,602 this whole thing, if you just take the biggest point there, 1241 00:51:46,602 --> 00:51:48,352 that determines whether it's happy or not. 1242 00:51:48,352 --> 00:51:49,590 That's it. 1243 00:51:49,590 --> 00:51:50,640 That's not very accurate.
1244 00:51:50,640 --> 00:51:52,380 We wouldn't be able to get far from that. 1245 00:51:52,380 --> 00:51:54,450 So this helps us condense our data a little bit. 1246 00:51:54,450 --> 00:51:56,280 We kind of just take the information we're looking at 1247 00:51:56,280 --> 00:51:57,730 and throw out some of the fluff. 1248 00:51:57,730 --> 00:51:59,640 And you do this a couple of times. 1249 00:51:59,640 --> 00:52:03,010 And then at the end we spit out our output. 1250 00:52:03,010 --> 00:52:08,400 So that is kind of your very topical overview into machine learning, 1251 00:52:08,400 --> 00:52:10,690 and hopefully an introduction to the idea 1252 00:52:10,690 --> 00:52:14,500 that it is accessible as a final project for CS50, specifically. 1253 00:52:14,500 --> 00:52:18,626 But even in the kind of real world, outside of CS50 and outside of classes, 1254 00:52:18,626 --> 00:52:20,500 if you wanted to tinker around with this, that 1255 00:52:20,500 --> 00:52:22,940 is totally within your capabilities. 1256 00:52:22,940 --> 00:52:26,260 And I mean "your" not as someone who has done a year of CS 1257 00:52:26,260 --> 00:52:28,450 and is now teaching a course, but as someone 1258 00:52:28,450 --> 00:52:31,870 who started where you all started, or worse. 1259 00:52:31,870 --> 00:52:35,480 I started with no experience of this and this was where I went. 1260 00:52:35,480 --> 00:52:37,030 This was the direction I chose. 1261 00:52:37,030 --> 00:52:38,300 And it's totally accessible. 1262 00:52:38,300 --> 00:52:39,200 You can do that. 1263 00:52:39,200 --> 00:52:41,591 That is entirely within your grasp. 1264 00:52:41,591 --> 00:52:44,590 And so if you're at all interested in it, I would recommend pursuing it. 1265 00:52:44,590 --> 00:52:47,715 You'll find that it is difficult. There are points where it is frustrating. 1266 00:52:47,715 --> 00:52:50,864 But that is the case with anything that you are going to do in CS. 1267 00:52:50,864 --> 00:52:52,780 There are points where they will be difficult, 1268 00:52:52,780 --> 00:52:54,200 where they will be frustrating. 1269 00:52:54,200 --> 00:52:56,800 So I would encourage you to not give up but rather think 1270 00:52:56,800 --> 00:52:58,420 that that is basically the right path. 1271 00:52:58,420 --> 00:52:59,830 You're going down the street. 1272 00:52:59,830 --> 00:53:01,150 Just keep going. 1273 00:53:01,150 --> 00:53:04,060 Because you might as well, if you're going to do this on any project, 1274 00:53:04,060 --> 00:53:05,844 do it on one you're interested in. 1275 00:53:05,844 --> 00:53:08,260 And that's more of a piece of advice specifically directed 1276 00:53:08,260 --> 00:53:10,210 at the final project for CS50. 1277 00:53:10,210 --> 00:53:12,400 Don't waste your time for three weeks. 1278 00:53:12,400 --> 00:53:13,660 Build something cool. 1279 00:53:13,660 --> 00:53:17,490 If it's hard and it takes a lot of time and it's very annoying to debug 1280 00:53:17,490 --> 00:53:20,649 and there's things that don't work up until the last possible minute, 1281 00:53:20,649 --> 00:53:21,940 you're probably doing it right. 1282 00:53:21,940 --> 00:53:23,560 That's probably about right. 1283 00:53:23,560 --> 00:53:26,800 A lot of the best work in CS happens at the last possible minute. 1284 00:53:26,800 --> 00:53:30,085 It's that moment where you're like, I got it. 1285 00:53:30,085 --> 00:53:30,960 And then you're good.
1286 00:53:30,960 --> 00:53:34,000 And that sense of relief is why a lot of us are still in CS, 1287 00:53:34,000 --> 00:53:37,650 is we like that feeling of being satisfied with what we produced. 1288 00:53:37,650 --> 00:53:41,580 So if you ever think that CS is not for you because it's too difficult 1289 00:53:41,580 --> 00:53:46,320 or because everyone seems to get it but you, that is 100% not the case. 1290 00:53:46,320 --> 00:53:47,550 Machine learning is hard. 1291 00:53:47,550 --> 00:53:48,660 Computer vision is hard. 1292 00:53:48,660 --> 00:53:50,220 Computer science is hard. 1293 00:53:50,220 --> 00:53:54,780 Learning is difficult. All of this, we can do. 1294 00:53:54,780 --> 00:53:56,890 So I would recommend always pursuing it. 1295 00:53:56,890 --> 00:53:59,330 What are some questions you guys have about machine learning, computer 1296 00:53:59,330 --> 00:53:59,830 vision? 1297 00:53:59,830 --> 00:54:02,610 I figure in my last like seven or so minutes I'll 1298 00:54:02,610 --> 00:54:04,604 open it up to any questions. 1299 00:54:04,604 --> 00:54:07,390 1300 00:54:07,390 --> 00:54:08,015 Sure. 1301 00:54:08,015 --> 00:54:10,098 SPEAKER 2: I was wondering if you could talk maybe 1302 00:54:10,098 --> 00:54:14,720 about the min pooling and max pooling, and when to use a 2 by 2. 1303 00:54:14,720 --> 00:54:18,455 Like what are some circumstances where you'd use like 100 by 100 1304 00:54:18,455 --> 00:54:19,381 or [INAUDIBLE] 1305 00:54:19,381 --> 00:54:22,340 SPEAKER 1: You can imagine-- I'm not really going to pull up a new image. 1306 00:54:22,340 --> 00:54:24,423 But maybe I'll keep the cat one up here; it's cute. 1307 00:54:24,423 --> 00:54:27,250 1308 00:54:27,250 --> 00:54:29,870 So this image has a set amount of pixels in it. 1309 00:54:29,870 --> 00:54:34,454 So the question being, why min pooling versus max pooling versus mean pooling, 1310 00:54:34,454 --> 00:54:36,620 and what does it mean to have a different pool size. 1311 00:54:36,620 --> 00:54:38,632 Why is that relevant really? 1312 00:54:38,632 --> 00:54:41,840 And so we're going to talk about this image in particular, because it's cute. 1313 00:54:41,840 --> 00:54:43,520 And I think it's brown or red. 1314 00:54:43,520 --> 00:54:44,444 I can't really tell. 1315 00:54:44,444 --> 00:54:46,610 But it's an Egyptian cat and they have the beautiful 1316 00:54:46,610 --> 00:54:47,980 like wide eyes and big ears. 1317 00:54:47,980 --> 00:54:48,950 They're awesome. 1318 00:54:48,950 --> 00:54:52,130 And basically this image has a set number of pixels. 1319 00:54:52,130 --> 00:54:54,440 Even though I'm displaying it on some number of pixels, 1320 00:54:54,440 --> 00:54:59,300 the image itself is, let's say, I don't know, 400 by 200. 1321 00:54:59,300 --> 00:55:01,430 Not quite right but close enough. 1322 00:55:01,430 --> 00:55:08,810 So if it's 400 by 200, then in a given, we'll say, 20 by 20 box, 1323 00:55:08,810 --> 00:55:10,940 we can only get so much data. 1324 00:55:10,940 --> 00:55:13,910 Let's say that's just the tip of the ear, 20 by 20. 1325 00:55:13,910 --> 00:55:16,340 Well, if I take just the max of that, then you 1326 00:55:16,340 --> 00:55:21,170 can think of it as the actual 20 by 20 section-- if I do a max of 20 1327 00:55:21,170 --> 00:55:26,840 by 20, well, then the entire tip of this ear becomes one point. 1328 00:55:26,840 --> 00:55:29,060 I have one data point that is the tip of the ear. 1329 00:55:29,060 --> 00:55:32,500 And same thing as we iterate through the entire image.
1330 00:55:32,500 --> 00:55:35,360 So this image gets significantly condensed. 1331 00:55:35,360 --> 00:55:38,300 If it was 400 by 200, well, you can think of it 1332 00:55:38,300 --> 00:55:42,520 as now being reduced by a factor of 20, which might be appropriate. 1333 00:55:42,520 --> 00:55:46,430 Maybe all you really care about is the general shape. 1334 00:55:46,430 --> 00:55:48,950 Is it a cat or is it a doorknob? 1335 00:55:48,950 --> 00:55:50,870 That's a pretty easy classifier to build. 1336 00:55:50,870 --> 00:55:55,760 All I have to really care about is, it's not a circle, not a doorknob, good. 1337 00:55:55,760 --> 00:55:57,830 But maybe your class of doorknobs is different. 1338 00:55:57,830 --> 00:55:59,700 It can get more complicated from there. 1339 00:55:59,700 --> 00:56:02,810 But in this case, to preserve detail, you'd probably 1340 00:56:02,810 --> 00:56:05,012 want to use a pretty small pool. 1341 00:56:05,012 --> 00:56:07,220 We're just trying to condense our image a little bit. 1342 00:56:07,220 --> 00:56:10,010 We're trying to get rid of some of the fluff, some of the noise. 1343 00:56:10,010 --> 00:56:11,940 Like there's some fur here. 1344 00:56:11,940 --> 00:56:14,600 But it doesn't really matter what that fur actually 1345 00:56:14,600 --> 00:56:17,624 does, unless you're looking for a very particular machine 1346 00:56:17,624 --> 00:56:19,790 classifier, in which case you're probably not looking 1347 00:56:19,790 --> 00:56:21,590 at pictures of whole animals. 1348 00:56:21,590 --> 00:56:24,350 So if we're looking at, is it a cat versus a dog, well, 1349 00:56:24,350 --> 00:56:26,600 does it really matter if there's a speck of fur here 1350 00:56:26,600 --> 00:56:30,560 or some extra noise captured by the camera that took the picture? 1351 00:56:30,560 --> 00:56:31,220 Not really. 1352 00:56:31,220 --> 00:56:34,040 So we can just kind of ignore that and average over it. 1353 00:56:34,040 --> 00:56:35,874 Or use max pooling over it and just say, you 1354 00:56:35,874 --> 00:56:38,831 know what, we're just going to pool all of our details, all the biggest 1355 00:56:38,831 --> 00:56:40,190 details, together. 1356 00:56:40,190 --> 00:56:45,320 And while that can be appropriate in this case, what if this picture was 1357 00:56:45,320 --> 00:56:49,470 4 million pixels by 2 million pixels? 1358 00:56:49,470 --> 00:56:51,770 Now your pool size might want to be scaled up a lot. 1359 00:56:51,770 --> 00:56:54,000 We don't need all of that extra information, 1360 00:56:54,000 --> 00:56:56,090 especially if it's the same picture. 1361 00:56:56,090 --> 00:56:58,670 We can just say, you know what, we're going to reduce that 1362 00:56:58,670 --> 00:57:01,600 by a factor of, like, a million. 1363 00:57:01,600 --> 00:57:04,750 And now you have a 4 by 2, which might be a little too much. 1364 00:57:04,750 --> 00:57:07,450 Now you've basically just got four pixels down and two across 1365 00:57:07,450 --> 00:57:08,770 and hopefully it's still a cat. 1366 00:57:08,770 --> 00:57:10,300 But you can play around with that. 1367 00:57:10,300 --> 00:57:12,340 And that might be a case in which you would 1368 00:57:12,340 --> 00:57:15,220 need to change whether you're doing a min or a max or even 1369 00:57:15,220 --> 00:57:17,440 just how you're analyzing this image. 1370 00:57:17,440 --> 00:57:20,830 Is it appropriate to take just one image and do this? 1371 00:57:20,830 --> 00:57:23,980 Or is only one image in your data set extra 1372 00:57:23,980 --> 00:57:27,757 large and then all of the rest of them are like 150 by 150? 1373 00:57:27,757 --> 00:57:29,215 Then you might want to change that.
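To put rough numbers on that, here's a tiny Keras sketch of how the pool size changes the shape coming out (the 400 by 200 size is the hypothetical cat picture from above, with three color channels):

    from keras.models import Sequential
    from keras.layers import MaxPooling2D

    m = Sequential([MaxPooling2D(pool_size=(20, 20), input_shape=(400, 200, 3))])
    print(m.output_shape)  # (None, 20, 10, 3): every 20 by 20 block becomes one max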
1374 00:57:29,215 --> 00:57:32,742 SPEAKER 3: So like [INAUDIBLE] like if you 1375 00:57:32,742 --> 00:57:36,750 had one image that was say like 4 million pixels long, 1376 00:57:36,750 --> 00:57:39,085 it would probably make more sense then to preprocess 1377 00:57:39,085 --> 00:57:42,190 that data before you go in, [INAUDIBLE] size 1378 00:57:42,190 --> 00:57:43,850 to like a certain value [INAUDIBLE]. 1379 00:57:43,850 --> 00:57:44,960 SPEAKER 1: Yes. 1380 00:57:44,960 --> 00:57:47,980 Actually there is a little bit of that in the sample code. 1381 00:57:47,980 --> 00:57:50,480 There is a bit of what was just brought up, 1382 00:57:50,480 --> 00:57:52,294 which is pre-processing of an image. 1383 00:57:52,294 --> 00:57:55,210 It's another kind of fun little word that people will throw out there, 1384 00:57:55,210 --> 00:57:57,910 just like, oh, yeah, preprocess your images before you use them. 1385 00:57:57,910 --> 00:58:00,243 And they like turn around and just ignore that they just 1386 00:58:00,243 --> 00:58:02,080 dropped a whole thing on you. 1387 00:58:02,080 --> 00:58:05,350 Pre-processing is really just basically what was just mentioned: 1388 00:58:05,350 --> 00:58:09,076 you want to take your images and kind of normalize them a little bit. 1389 00:58:09,076 --> 00:58:11,950 You don't want to have this outlier in your data set that's 4 million 1390 00:58:11,950 --> 00:58:14,950 by 2 million when the rest are like 100 across. 1391 00:58:14,950 --> 00:58:17,350 You want to take those and maybe resize them, scale them 1392 00:58:17,350 --> 00:58:18,642 down using appropriate methods. 1393 00:58:18,642 --> 00:58:21,350 And whatever those methods are might change depending on the data 1394 00:58:21,350 --> 00:58:23,800 you're looking at or depending on how you want to do it, 1395 00:58:23,800 --> 00:58:26,196 but being able to normalize across them is 1396 00:58:26,196 --> 00:58:27,820 going to be some sort of preprocessing. 1397 00:58:27,820 --> 00:58:31,449 And it's called preprocessing because, if you think of the processing part as 1398 00:58:31,449 --> 00:58:33,490 throwing it into your machine learning algorithm, 1399 00:58:33,490 --> 00:58:35,861 you do this beforehand-- preprocessing. 1400 00:58:35,861 --> 00:58:38,110 And so that's where that terminology kind of comes in. 1401 00:58:38,110 --> 00:58:40,390 And it comes up a lot with images in particular 1402 00:58:40,390 --> 00:58:43,420 because images can be taken by cameras, which are used by people, 1403 00:58:43,420 --> 00:58:45,710 and people are pretty stochastic. 1404 00:58:45,710 --> 00:58:49,334 I might take the same picture 400 times and it might look different every time. 1405 00:58:49,334 --> 00:58:52,250 And that's kind of a problem with how people take pictures, especially 1406 00:58:52,250 --> 00:58:54,291 for real world scenarios where you're applying it 1407 00:58:54,291 --> 00:58:57,860 to some sort of pictures of living animals or people's faces 1408 00:58:57,860 --> 00:58:58,880 or things like that. 1409 00:58:58,880 --> 00:59:02,046 You'll probably want to find a way to preprocess your images so that they're 1410 00:59:02,046 --> 00:59:06,830 roughly the right size and, give or take, roughly the right thing that you're 1411 00:59:06,830 --> 00:59:08,000 looking for.
1412 00:59:08,000 --> 00:59:11,390 Maybe a picture of someone's face is zoomed all the way out here 1413 00:59:11,390 --> 00:59:14,980 and maybe every other person's picture is like zoomed in super close 1414 00:59:14,980 --> 00:59:16,940 so you just have their face. 1415 00:59:16,940 --> 00:59:19,520 Harvard IT uses that to identify you. 1416 00:59:19,520 --> 00:59:23,060 They preprocess all of the images they take of you so that it's just your face 1417 00:59:23,060 --> 00:59:24,329 and they can identify you. 1418 00:59:24,329 --> 00:59:26,870 And that's something that comes up a lot in machine learning. 1419 00:59:26,870 --> 00:59:29,240 It's part of the project. 1420 00:59:29,240 --> 00:59:32,550 And I think I'm right about out of time, but I'll 1421 00:59:32,550 --> 00:59:34,660 be hanging around afterward for any questions. 1422 00:59:34,660 --> 00:59:37,076 But as far as the livestream goes, thank you for watching. 1423 00:59:37,076 --> 00:59:39,857 I'll be on campus kind of doing my own thing. 1424 00:59:39,857 --> 00:59:41,940 But I really appreciate you hanging in all the way 1425 00:59:41,940 --> 00:59:44,250 through the weird cat picture at the very end. 1426 00:59:44,250 --> 00:59:45,990 So thank you very much. 1427 00:59:45,990 --> 00:59:48,840 Thanks for showing up, you guys. 1428 00:59:48,840 --> 00:59:51,034