1 00:00:00,000 --> 00:00:02,982 2 00:00:02,982 --> 00:00:06,461 [MUSIC PLAYING] 3 00:00:06,461 --> 00:01:12,065 4 00:01:12,065 --> 00:01:13,210 DAVID MALAN: All right. 5 00:01:13,210 --> 00:01:18,700 This is CS50, and this is week six, wherein we finally transition 6 00:01:18,700 --> 00:01:20,935 from Scratch to C to, now, Python. 7 00:01:20,935 --> 00:01:22,735 And, indeed, this is going to be somewhat 8 00:01:22,735 --> 00:01:27,370 of a unique experience in that, just like a few weeks past-- 9 00:01:27,370 --> 00:01:30,605 perhaps, for the first time-- and now, today, you're 10 00:01:30,605 --> 00:01:31,855 going to learn a new language. 11 00:01:31,855 --> 00:01:35,935 But the goal isn't just to throw another fire hose of content and syntax 12 00:01:35,935 --> 00:01:39,568 and whatnot at you, but rather, to really equip you all to actually teach 13 00:01:39,568 --> 00:01:41,110 yourself new languages in the future. 14 00:01:41,110 --> 00:01:43,902 And so, indeed, what we'll do today, what we'll do this coming week 15 00:01:43,902 --> 00:01:46,580 is prepare you to stand on your own. 16 00:01:46,580 --> 00:01:48,527 And once Python is passe and the world has 17 00:01:48,527 --> 00:01:50,860 moved on to some other language in some number of years, 18 00:01:50,860 --> 00:01:52,568 you'll be well equipped to figure out how 19 00:01:52,568 --> 00:01:55,027 to wrap your mind around some new syntax, some new language 20 00:01:55,027 --> 00:01:56,280 and solve problems, as well. 21 00:01:56,280 --> 00:01:59,320 Now, you recall, in week zero, this is where we started-- 22 00:01:59,320 --> 00:02:01,390 just saying hello to the world. 23 00:02:01,390 --> 00:02:03,850 And that quickly escalated just a week later in C 24 00:02:03,850 --> 00:02:06,250 to be something much, much more cryptic. 
25 00:02:06,250 --> 00:02:09,234 And if you've still struggled with some of the syntax, 26 00:02:09,234 --> 00:02:11,723 find yourself checking your notes or your previous code, 27 00:02:11,723 --> 00:02:12,640 that's totally normal. 28 00:02:12,640 --> 00:02:16,675 And that's one of the reasons why there are languages besides C 29 00:02:16,675 --> 00:02:18,970 out there-- among them, this language called Python. 30 00:02:18,970 --> 00:02:21,520 Humans over the decades have realized, gee, 31 00:02:21,520 --> 00:02:25,167 that wasn't necessarily the best design decision, or humans have realized, wow, 32 00:02:25,167 --> 00:02:25,750 you know what? 33 00:02:25,750 --> 00:02:30,160 Now that computers have gotten faster with more memory and faster CPUs, 34 00:02:30,160 --> 00:02:33,070 we can actually do more with our programming languages. 35 00:02:33,070 --> 00:02:36,985 So just as human languages evolve, so do actual programming languages. 36 00:02:36,985 --> 00:02:40,810 And even within a programming language, there's typically different versions. 37 00:02:40,810 --> 00:02:43,870 We, for instance, have been using version C11 38 00:02:43,870 --> 00:02:46,720 of C, which was updated in 2011. 39 00:02:46,720 --> 00:02:50,800 But Python itself continues to evolve, and it's now up to version 3-plus. 40 00:02:50,800 --> 00:02:53,680 And so there, too, these things will evolve in the coming days. 41 00:02:53,680 --> 00:02:56,560 Thankfully, what you're about to see is "Hello, World!" 42 00:02:56,560 --> 00:02:59,440 for the third time, but it's going to be literally this. 43 00:02:59,440 --> 00:03:04,930 None of the crazy syntax above or below, fewer semicolons, if any, fewer 44 00:03:04,930 --> 00:03:05,770 curly braces. 45 00:03:05,770 --> 00:03:08,630 And, really, a lot of the distractions get out of the way. 46 00:03:08,630 --> 00:03:11,200 So to get there, let's consider exactly how 47 00:03:11,200 --> 00:03:13,000 we've been programming up until now.
48 00:03:13,000 --> 00:03:16,300 So you write a program in C and you've got, hopefully, 49 00:03:16,300 --> 00:03:19,135 no syntax error, so you're ready to build it-- that is, compile it. 50 00:03:19,135 --> 00:03:22,135 And so, you've run make, and then, you've run the program, like ./hello. 51 00:03:22,135 --> 00:03:24,850 Or if you think back to week two, where we 52 00:03:24,850 --> 00:03:27,100 took a peek underneath the hood of what make is doing, 53 00:03:27,100 --> 00:03:29,710 it's really running the actual compiler-- 54 00:03:29,710 --> 00:03:32,800 something called clang-- maybe with some command-line arguments creating 55 00:03:32,800 --> 00:03:34,090 a program called hello. 56 00:03:34,090 --> 00:03:36,128 And then, you could do ./hello. 57 00:03:36,128 --> 00:03:38,920 So, today, you're going to start doing something similar in spirit, 58 00:03:38,920 --> 00:03:40,270 but fewer steps. 59 00:03:40,270 --> 00:03:42,270 No longer will you have to compile your code 60 00:03:42,270 --> 00:03:45,520 and then run it, and then, maybe, fix or change it, and then compile your code 61 00:03:45,520 --> 00:03:47,470 and run it, and then repeat, repeat. 62 00:03:47,470 --> 00:03:50,200 The process of running your code is going 63 00:03:50,200 --> 00:03:52,542 to be distilled into just a single step. 64 00:03:52,542 --> 00:03:54,250 And the way to think of this, for now, is 65 00:03:54,250 --> 00:03:58,420 that, whereas C is frequently used as, indeed, a compiled language whereby 66 00:03:58,420 --> 00:04:01,045 you convert it first to 0s and 1s, Python's 67 00:04:01,045 --> 00:04:04,400 going to let you speed things up whereby you, the human programmer, 68 00:04:04,400 --> 00:04:05,740 don't have to compile it. 
69 00:04:05,740 --> 00:04:09,400 You're just going to run what's called an interpreter-- which, by design, 70 00:04:09,400 --> 00:04:12,190 is named the exact same thing as the language itself-- 71 00:04:12,190 --> 00:04:14,860 and by running this program installed in VS Code 72 00:04:14,860 --> 00:04:17,230 or, eventually, on your own Macs or PCs. 73 00:04:17,230 --> 00:04:20,320 This is just going to tell your computer to interpret this code 74 00:04:20,320 --> 00:04:23,800 and figure out how to get down to that lower level of 0s and 1s. 75 00:04:23,800 --> 00:04:26,626 But you don't have to compile the code yourself anymore. 76 00:04:26,626 --> 00:04:31,000 So with that said, let's consider what the code is going to look like, 77 00:04:31,000 --> 00:04:31,690 side by side. 78 00:04:31,690 --> 00:04:33,850 In fact, let's look back at some Scratch blocks, 79 00:04:33,850 --> 00:04:36,582 just like we did with C in week one, and do some side by sides. 80 00:04:36,582 --> 00:04:39,040 Because even though some of the syntax this week and beyond 81 00:04:39,040 --> 00:04:42,705 is going to be different, the ideas are truly going to be the same. 82 00:04:42,705 --> 00:04:45,565 There's not all that much intellectually new just yet. 83 00:04:45,565 --> 00:04:48,190 So whereas, in week zero, we might have said hello to the world 84 00:04:48,190 --> 00:04:51,220 with this purple puzzle piece, today, of course-- 85 00:04:51,220 --> 00:04:56,080 or, rather, in week one, it looked like this in C. But today, moving forward, 86 00:04:56,080 --> 00:04:58,665 it's going to, quite simply, look like this instead. 87 00:04:58,665 --> 00:05:00,610 And if we go back and forth for just a moment, 88 00:05:00,610 --> 00:05:03,580 here, again, is the version in C, noticing 89 00:05:03,580 --> 00:05:05,500 the very C-like characteristics. 90 00:05:05,500 --> 00:05:09,200 And just at a glance here, in Python, I claim it's now this. 
91 00:05:09,200 --> 00:05:13,190 What do you apparently need not worry about anymore? 92 00:05:13,190 --> 00:05:14,940 What's gone? 93 00:05:14,940 --> 00:05:15,990 So semi-colon is gone. 94 00:05:15,990 --> 00:05:19,073 And, indeed, you don't need those to finish most of your thoughts anymore. 95 00:05:19,073 --> 00:05:19,830 Anything else? 96 00:05:19,830 --> 00:05:20,860 AUDIENCE: Backslash n. 97 00:05:20,860 --> 00:05:22,690 DAVID MALAN: So the backslash n is absent. 98 00:05:22,690 --> 00:05:25,140 And that's curious because we're still going to get a new line, 99 00:05:25,140 --> 00:05:26,985 but we'll see that it's become the default. 100 00:05:26,985 --> 00:05:29,402 And this one's a little more subtle, but now, the function 101 00:05:29,402 --> 00:05:31,185 is called print instead of printf. 102 00:05:31,185 --> 00:05:33,610 So it's a little more familiar in that sense. 103 00:05:33,610 --> 00:05:34,110 All right. 104 00:05:34,110 --> 00:05:37,050 So when it comes to using libraries-- that 105 00:05:37,050 --> 00:05:39,300 is, code that other people have written-- in the past, 106 00:05:39,300 --> 00:05:43,350 we've done things like #include cs50.h to use CS50's own header 107 00:05:43,350 --> 00:05:47,730 file or standard I/O or standard lib or string or any number of other header 108 00:05:47,730 --> 00:05:49,440 files you have all used. 109 00:05:49,440 --> 00:05:52,635 Moving forward, we're going to give you, for this first week, a similar CS50 110 00:05:52,635 --> 00:05:53,280 library-- 111 00:05:53,280 --> 00:05:55,920 just very short-term training wheels that we'll quickly 112 00:05:55,920 --> 00:05:59,370 take off because, in reality, it's a lot easier to do things in Python, 113 00:05:59,370 --> 00:06:00,267 as we'll see. 114 00:06:00,267 --> 00:06:02,100 But the syntax for this, now, is going to be 115 00:06:02,100 --> 00:06:05,165 to import the CS50 library in this way. 
116 00:06:05,165 --> 00:06:08,452 And when we have, now, this ability, we can actually 117 00:06:08,452 --> 00:06:09,910 start writing some code right away. 118 00:06:09,910 --> 00:06:12,420 In fact, let me switch over to VS Code here. 119 00:06:12,420 --> 00:06:14,760 And just as in the past, I'll create a new file. 120 00:06:14,760 --> 00:06:17,230 But instead of creating something called .c, 121 00:06:17,230 --> 00:06:19,980 I'm going to go ahead and create my first program called hello.py, 122 00:06:19,980 --> 00:06:22,260 using code space hello dot py. 123 00:06:22,260 --> 00:06:24,000 That, of course, gives me this new tab. 124 00:06:24,000 --> 00:06:28,185 And let me actually, quite simply, do what I proposed-- print, quote unquote, 125 00:06:28,185 --> 00:06:33,780 "Hello, world" without the \n, without the semicolon, without the f in print. 126 00:06:33,780 --> 00:06:36,270 And now, let me go down to my terminal window. 127 00:06:36,270 --> 00:06:37,792 And I don't have to compile it. 128 00:06:37,792 --> 00:06:39,000 I don't have to do dot slash. 129 00:06:39,000 --> 00:06:43,140 I, instead, run a program called python, whose purpose in life 130 00:06:43,140 --> 00:06:46,180 is, now, to interpret my code top to bottom, left to right. 131 00:06:46,180 --> 00:06:50,130 And if I run python of hello.py, crossing my fingers, as always-- 132 00:06:50,130 --> 00:06:51,000 voila. 133 00:06:51,000 --> 00:06:53,190 Now I have printed out "hello, world." 134 00:06:53,190 --> 00:06:56,460 So we seem to have gotten the new line for free, in the sense where 135 00:06:56,460 --> 00:06:57,735 it's automatically happening. 136 00:06:57,735 --> 00:06:59,880 The dollar sign isn't weirdly on the same line, 137 00:06:59,880 --> 00:07:02,220 like it once was in week one. 138 00:07:02,220 --> 00:07:04,493 But that's just a minor detail here.
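The hello.py program described above can be sketched as follows, assuming Python 3. The capture helper is a hypothetical addition that is not part of the lecture; it is only there to make the automatic trailing newline visible.

```python
import io
from contextlib import redirect_stdout


def hello() -> None:
    # No semicolon, no \n, and print instead of printf.
    print("hello, world")


def capture(function) -> str:
    """Run function and return everything it printed, as one string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        function()
    return buffer.getvalue()


hello()

# repr() shows the newline that print appended on its own.
print(repr(capture(hello)))

# The end keyword argument replaces that default newline.
print("hello, ", end="")
print("world")
```

Saved as hello.py, this would be run with python hello.py, with no separate compile step.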
139 00:07:04,493 --> 00:07:06,660 If we switch back to, now, some other capabilities-- 140 00:07:06,660 --> 00:07:09,780 well, indeed, with the CS50 library, you can also not 141 00:07:09,780 --> 00:07:12,795 just import the library itself, but specific functions. 142 00:07:12,795 --> 00:07:14,850 And you'll see that, temporarily, we're going 143 00:07:14,850 --> 00:07:19,080 to give you a helper function called get_string, just like in C, that just 144 00:07:19,080 --> 00:07:20,872 makes it work exactly the same way as in C. 145 00:07:20,872 --> 00:07:22,580 And we'll see a couple of other functions 146 00:07:22,580 --> 00:07:24,660 that will just make life easier, initially. 147 00:07:24,660 --> 00:07:26,910 But, quickly, will we take those training wheels off 148 00:07:26,910 --> 00:07:29,295 so that nothing is, indeed, CS50-specific. 149 00:07:29,295 --> 00:07:29,970 All right. 150 00:07:29,970 --> 00:07:32,640 Well, how about functions, more generally, in Python? 151 00:07:32,640 --> 00:07:34,710 Let's do a whirlwind tour, if you will, much 152 00:07:34,710 --> 00:07:38,940 like we did in that first week of C, comparing one to the other. 153 00:07:38,940 --> 00:07:42,270 So back in our world of Scratch, one of the first programs we wrote 154 00:07:42,270 --> 00:07:45,360 was this one here, whereby we ask the human their name. 155 00:07:45,360 --> 00:07:49,110 We then used the return value that was automatically stored 156 00:07:49,110 --> 00:07:53,130 in this answer variable as a second argument 157 00:07:53,130 --> 00:07:56,265 to join so that we could say "Hello, David" or "Hello, Carter." 158 00:07:56,265 --> 00:07:59,340 So this was back in week zero. 159 00:07:59,340 --> 00:08:01,143 In week one, we converted it to this. 160 00:08:01,143 --> 00:08:03,810 And here is a perfect example of things escalating quickly. 161 00:08:03,810 --> 00:08:05,910 And, again, this is why we start in Scratch.
162 00:08:05,910 --> 00:08:09,060 There's just so much distraction here to achieve the same idea. 163 00:08:09,060 --> 00:08:12,010 But even today, we're going to chip away at some of that syntax. 164 00:08:12,010 --> 00:08:17,940 So, in C, we had to declare the variable as a string, here. 165 00:08:17,940 --> 00:08:19,935 We of course, had the semicolon and more. 166 00:08:19,935 --> 00:08:22,650 Well, in Python, the comparable code, now, 167 00:08:22,650 --> 00:08:26,100 is going to look, more simply, like this. 168 00:08:26,100 --> 00:08:29,250 So semicolon is, again, gone on both lines, for that matter. 169 00:08:29,250 --> 00:08:30,450 So that's good. 170 00:08:30,450 --> 00:08:33,100 What else appears to have changed or disappeared? 171 00:08:33,100 --> 00:08:33,600 Yeah. 172 00:08:33,600 --> 00:08:35,340 AUDIENCE: [? Do you have ?] the same type of variable? 173 00:08:35,340 --> 00:08:36,090 DAVID MALAN: Yeah. 174 00:08:36,090 --> 00:08:39,419 So I didn't have to specifically say that answer is now a string. 175 00:08:39,419 --> 00:08:41,820 And, indeed, Python is dynamically typed. 176 00:08:41,820 --> 00:08:45,270 And, in fact, it will infer from context exactly what 177 00:08:45,270 --> 00:08:48,000 it is you are storing in that variable. 178 00:08:48,000 --> 00:08:50,775 Other details that seem a little bit different? 179 00:08:50,775 --> 00:08:53,640 180 00:08:53,640 --> 00:08:54,607 A little bit different. 181 00:08:54,607 --> 00:08:55,940 What else jumps out at you here? 182 00:08:55,940 --> 00:08:56,482 I'll go back. 183 00:08:56,482 --> 00:08:58,690 This was the C version. 184 00:08:58,690 --> 00:09:01,570 And maybe focus, now, on the second line because we've rather 185 00:09:01,570 --> 00:09:02,740 exhausted the first. 186 00:09:02,740 --> 00:09:04,690 Here's, now, the Python version. 187 00:09:04,690 --> 00:09:05,720 What's different here? 188 00:09:05,720 --> 00:09:06,220 Yeah? 
189 00:09:06,220 --> 00:09:08,845 AUDIENCE: You don't need to worry about %s or percent anything. 190 00:09:08,845 --> 00:09:10,930 You just have the variable after [? them. ?] 191 00:09:10,930 --> 00:09:11,680 DAVID MALAN: Yeah. 192 00:09:11,680 --> 00:09:12,820 There's no %s anymore. 193 00:09:12,820 --> 00:09:16,480 There's no second argument, at the moment, per se, to print. 194 00:09:16,480 --> 00:09:17,818 Now, it is still a little weird. 195 00:09:17,818 --> 00:09:20,485 It's as though I've deployed some addition here, arithmetically. 196 00:09:20,485 --> 00:09:21,860 But that's not the case. 197 00:09:21,860 --> 00:09:23,230 Some of you have programmed before. 198 00:09:23,230 --> 00:09:27,377 And plus, some of you might know, means what in this context? 199 00:09:27,377 --> 00:09:29,960 So to combine or, more technically-- anyone know the buzzword? 200 00:09:29,960 --> 00:09:30,390 Yeah. 201 00:09:30,390 --> 00:09:31,040 AUDIENCE: Concatenate. 202 00:09:31,040 --> 00:09:32,460 DAVID MALAN: To concatenate. 203 00:09:32,460 --> 00:09:35,753 So to concatenate is the fancy way of what Scratch calls joining, 204 00:09:35,753 --> 00:09:38,420 which is to take one string on the left, one string on the right 205 00:09:38,420 --> 00:09:40,100 and to join them together. 206 00:09:40,100 --> 00:09:41,880 To glue them together, if you will. 207 00:09:41,880 --> 00:09:43,080 So this is not addition. 208 00:09:43,080 --> 00:09:45,080 It would be if it were numbers involved instead. 209 00:09:45,080 --> 00:09:46,413 But because we've got a string-- 210 00:09:46,413 --> 00:09:49,430 Hello comma-- and another string implicitly in this variable 211 00:09:49,430 --> 00:09:53,540 based on what the human typed in in response to this get_string function. 212 00:09:53,540 --> 00:09:58,130 That's going to concatenate Hello comma space, and then, David or Carter 213 00:09:58,130 --> 00:09:59,637 or whatever the human has typed in.
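The concatenation just described can be sketched as follows. To keep the sketch self-contained, the name is passed in as an argument rather than read with CS50's get_string; the greet function is a hypothetical name chosen for illustration.

```python
def greet(answer: str) -> str:
    # With two strs, + joins them end to end. The space after the
    # comma is part of the first string, so it survives the join.
    return "hello, " + answer


print(greet("David"))

# The same operator means arithmetic when both sides are numbers.
print(1 + 2)      # int + int adds
print("1" + "2")  # str + str concatenates
```

Mixing the two, as in "hello, " + 1, raises a TypeError rather than converting the number for you.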
214 00:09:59,637 --> 00:10:02,720 But it turns out, there's going to be different ways to do this in Python. 215 00:10:02,720 --> 00:10:04,387 And we'll show you a few different ones. 216 00:10:04,387 --> 00:10:06,380 And here, too, try not to get too hung up 217 00:10:06,380 --> 00:10:09,255 on or frustrated by all of the different ways you can solve problems. 218 00:10:09,255 --> 00:10:12,130 Odds are, you're going to be picking up tips and techniques for years 219 00:10:12,130 --> 00:10:14,100 to come if you continue programming. 220 00:10:14,100 --> 00:10:16,710 So let's just give you a few of the possible ways. 221 00:10:16,710 --> 00:10:20,900 So here's a second way you could print out hello comma David or hello comma 222 00:10:20,900 --> 00:10:21,680 Carter. 223 00:10:21,680 --> 00:10:22,655 But what has changed? 224 00:10:22,655 --> 00:10:26,030 In the previous version, I used concatenation explicitly. 225 00:10:26,030 --> 00:10:28,445 And the space here is important, grammatically, 226 00:10:28,445 --> 00:10:30,485 just so we get that in the final phrase. 227 00:10:30,485 --> 00:10:33,410 Now, I'm proposing to get rid of that space 228 00:10:33,410 --> 00:10:36,985 to add a comma outside of the double quotes, as well. 229 00:10:36,985 --> 00:10:39,020 But if you think back to C, this probably 230 00:10:39,020 --> 00:10:42,620 just means that print, similar in spirit to printf, 231 00:10:42,620 --> 00:10:45,200 can take not just one argument, but even two. 232 00:10:45,200 --> 00:10:47,510 And in fact, because of this comma in the middle that's 233 00:10:47,510 --> 00:10:50,390 outside of the double quotes, it's hello comma, 234 00:10:50,390 --> 00:10:52,655 and then, it will be automatically concatenated 235 00:10:52,655 --> 00:10:56,420 with-- even without using the plus, to whatever the value of answer is. 
236 00:10:56,420 --> 00:10:59,630 And by default, just for grammatical prettiness, 237 00:10:59,630 --> 00:11:01,850 the print function always gives you a space 238 00:11:01,850 --> 00:11:05,120 for free in between each of the multiple arguments you pass in. 239 00:11:05,120 --> 00:11:07,290 We'll see how you can override that down the line. 240 00:11:07,290 --> 00:11:09,248 But, for now, that's just another way to do it. 241 00:11:09,248 --> 00:11:12,680 Now, perhaps the better, if slightly cryptic way to do this-- 242 00:11:12,680 --> 00:11:14,420 or just the increasingly common way-- 243 00:11:14,420 --> 00:11:18,290 is, probably, the third version, which looks a little weird, too. 244 00:11:18,290 --> 00:11:20,555 And, probably, the weirdness jumps out. 245 00:11:20,555 --> 00:11:24,060 We've suddenly introduced these curly braces, 246 00:11:24,060 --> 00:11:25,518 which I promised were mostly gone. 247 00:11:25,518 --> 00:11:26,060 And they are. 248 00:11:26,060 --> 00:11:29,270 But inside of this string here, I've done 249 00:11:29,270 --> 00:11:31,520 a curly brace, which might mean what? 250 00:11:31,520 --> 00:11:32,918 Just intuitively. 251 00:11:32,918 --> 00:11:35,210 And here is an example of how you learn a new language. 252 00:11:35,210 --> 00:11:39,945 Just infer, from context, how Python probably works. 253 00:11:39,945 --> 00:11:40,820 What might this mean? 254 00:11:40,820 --> 00:11:41,320 Yeah? 255 00:11:41,320 --> 00:11:45,160 AUDIENCE: [INAUDIBLE] 256 00:11:45,160 --> 00:11:45,910 DAVID MALAN: Yeah. 257 00:11:45,910 --> 00:11:48,610 So this is an indication, because the curly braces-- 258 00:11:48,610 --> 00:11:50,740 because this was the way Python was designed-- 259 00:11:50,740 --> 00:11:55,340 that we want to plug in the value of answer, not literally A-N-S-W-E-R. 
260 00:11:55,340 --> 00:11:59,688 And the fancy word here is that the answer variable will be interpolated-- 261 00:11:59,688 --> 00:12:01,480 that is, substituted with its actual value. 262 00:12:01,480 --> 00:12:04,435 But, but, but-- and this is actually weird-looking; 263 00:12:04,435 --> 00:12:06,820 this was introduced a few years ago to Python. 264 00:12:06,820 --> 00:12:11,230 What else did I have to change to make these curly braces work, apparently? 265 00:12:11,230 --> 00:12:11,935 Yeah? 266 00:12:11,935 --> 00:12:13,510 AUDIENCE: Drop the f before the-- 267 00:12:13,510 --> 00:12:14,260 DAVID MALAN: Yeah. 268 00:12:14,260 --> 00:12:15,160 There's this weird f. 269 00:12:15,160 --> 00:12:17,245 And so, it's like part of printf. 270 00:12:17,245 --> 00:12:20,950 But now, it's inside the parentheses there. 271 00:12:20,950 --> 00:12:22,945 This is just the way Python designed this. 272 00:12:22,945 --> 00:12:24,820 So a few years ago, when they introduced what 273 00:12:24,820 --> 00:12:30,070 are called format strings or f-strings, you literally prefix your quoted string 274 00:12:30,070 --> 00:12:32,080 with the letter f. 275 00:12:32,080 --> 00:12:34,570 And then, you can use trickery like this, 276 00:12:34,570 --> 00:12:36,640 like putting curly braces so that the value will 277 00:12:36,640 --> 00:12:38,170 be substituted automatically. 278 00:12:38,170 --> 00:12:41,530 If you forget the f, you're going to literally see hello comma curly 279 00:12:41,530 --> 00:12:43,330 brace answer closed curly brace. 280 00:12:43,330 --> 00:12:45,355 If you add the f, it's, indeed, interpolated. 281 00:12:45,355 --> 00:12:47,360 The value is plugged in. 282 00:12:47,360 --> 00:12:47,860 All right. 283 00:12:47,860 --> 00:12:52,510 Questions on how we can just say hello to the world via Python, in this case. 284 00:12:52,510 --> 00:12:53,350 Yeah? 285 00:12:53,350 --> 00:12:55,280 AUDIENCE: If you do this without the f, what would happen?
286 00:12:55,280 --> 00:12:56,300 DAVID MALAN: If you do this without the-- 287 00:12:56,300 --> 00:12:57,260 AUDIENCE: [? The f. ?] 288 00:12:57,260 --> 00:12:58,385 DAVID MALAN: Without the f? 289 00:12:58,385 --> 00:13:02,450 If you omit the f, you will literally see H-E-L-L-O comma curly brace 290 00:13:02,450 --> 00:13:04,730 A-N-S-W-E-R closed curly brace. 291 00:13:04,730 --> 00:13:05,930 So, in fact, let's do this. 292 00:13:05,930 --> 00:13:08,300 Let me go back to VS Code here, quickly. 293 00:13:08,300 --> 00:13:11,540 I've still got my file called hello.py open. 294 00:13:11,540 --> 00:13:14,210 And let me go ahead and change this ever so slightly. 295 00:13:14,210 --> 00:13:16,700 So I'm going to go ahead and-- 296 00:13:16,700 --> 00:13:20,930 let's say from cs50 import get_string. 297 00:13:20,930 --> 00:13:23,615 And that's just the new syntax I propose using to import 298 00:13:23,615 --> 00:13:26,150 a function from someone else's library. 299 00:13:26,150 --> 00:13:30,593 I'm going to now go ahead and ask the question-- 300 00:13:30,593 --> 00:13:33,260 let's go ahead and use get_string, storing the result in answer. 301 00:13:33,260 --> 00:13:37,480 So get_string, quote unquote, "What's your name?" 302 00:13:37,480 --> 00:13:41,090 And then, on this line, I'm going to deliberately make a mistake here, 303 00:13:41,090 --> 00:13:42,450 exactly to your question. 304 00:13:42,450 --> 00:13:46,820 Let me just say hello comma answer, and just this. 305 00:13:46,820 --> 00:13:48,980 Now, even though answer is a variable, Python's 306 00:13:48,980 --> 00:13:53,150 not going to be so presumptuous as to just plug in the value of a variable 307 00:13:53,150 --> 00:13:53,810 called answer. 308 00:13:53,810 --> 00:13:56,000 What it's going to do, of course, is-- 309 00:13:56,000 --> 00:13:56,985 if I type in my name-- 310 00:13:56,985 --> 00:13:57,485 whoops. 311 00:13:57,485 --> 00:13:58,880 I typed too fast. 
312 00:13:58,880 --> 00:14:00,470 Let me go ahead and rerun that again. 313 00:14:00,470 --> 00:14:04,550 If I run python with hello.py, type in my name and hit Enter, 314 00:14:04,550 --> 00:14:06,035 I get hello comma answer. 315 00:14:06,035 --> 00:14:07,160 Well, let me do one better. 316 00:14:07,160 --> 00:14:10,680 Let me apply these curly braces as before. 317 00:14:10,680 --> 00:14:13,340 Let me rerun python of hello.py. 318 00:14:13,340 --> 00:14:14,060 What's your name? 319 00:14:14,060 --> 00:14:14,405 D-A-V-I-D. 320 00:14:14,405 --> 00:14:16,363 And here's, again, the answer to your question. 321 00:14:16,363 --> 00:14:18,780 Now, we get, literally, the curly braces. 322 00:14:18,780 --> 00:14:20,780 So the fix here, ultimately, is just going 323 00:14:20,780 --> 00:14:24,640 to be to add the f there, rerun my program again with David. 324 00:14:24,640 --> 00:14:26,482 And now, hello comma David. 325 00:14:26,482 --> 00:14:28,940 So this is, admittedly, a little more cryptic than the ones 326 00:14:28,940 --> 00:14:31,858 with the plus or the comma, but this is just increasingly common. 327 00:14:31,858 --> 00:14:33,650 Why? Because you can read it left to right. 328 00:14:33,650 --> 00:14:34,720 It's nice and convenient. 329 00:14:34,720 --> 00:14:36,125 It's less cryptic than the %s's. 330 00:14:36,125 --> 00:14:40,130 So it's a new and improved version, if you will, of printf in C, 331 00:14:40,130 --> 00:14:44,780 based on decades of experience of programmers doing things like this. 332 00:14:44,780 --> 00:14:49,540 Questions on printing in this way? 333 00:14:49,540 --> 00:14:52,780 We're now on our way to programming in Python. 334 00:14:52,780 --> 00:14:53,280 Anything? 335 00:14:53,280 --> 00:14:53,780 All right. 336 00:14:53,780 --> 00:14:56,825 Well, what more can we do with this language, here?
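The three ways of printing the greeting, and the mistake of forgetting the f, can be sketched side by side. This assumes a fixed value for answer in place of the get_string prompt, so that it runs without the CS50 library or any typed input.

```python
answer = "David"  # stand-in for what get_string would return

# 1. Explicit concatenation with +.
with_plus = "hello, " + answer

# 2. An f-string: the value of answer is interpolated at the braces.
with_f = f"hello, {answer}"

# 3. The same text without the f prefix is an ordinary string,
#    so the braces and the word answer are printed literally.
without_f = "hello, {answer}"

print(with_plus)
print("hello,", answer)  # separate arguments, joined by one space
print(with_f)
print(without_f)
```

The first three lines of output are identical; only the last one shows the braces.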
337 00:14:56,825 --> 00:15:00,000 Well, let me propose that we consider that we 338 00:15:00,000 --> 00:15:07,200 have, for instance, a few other features that we can add to the mix, as well-- 339 00:15:07,200 --> 00:15:12,640 namely, let's say some data types, as well. 340 00:15:12,640 --> 00:15:15,600 So let me flip over here, back to the slides. 341 00:15:15,600 --> 00:15:18,318 And there's different data types in Python, as we'll soon see. 342 00:15:18,318 --> 00:15:19,485 But they're not as explicit. 343 00:15:19,485 --> 00:15:23,070 As we already saw, by using a string from get_string, 344 00:15:23,070 --> 00:15:25,050 you don't have to explicitly state what it is. 345 00:15:25,050 --> 00:15:29,130 But you saw-- recall, in C-- all of these various data types. 346 00:15:29,130 --> 00:15:33,720 And then, in Python, nicely enough, this list is about to get shorter. 347 00:15:33,720 --> 00:15:37,740 And so, here is our list in C. Here is an abbreviated list in Python. 348 00:15:37,740 --> 00:15:41,220 So we're still going to have strings, but they're going to be more succinctly 349 00:15:41,220 --> 00:15:45,032 called strs now, S-T-R. We're still going to have ints for integers. 350 00:15:45,032 --> 00:15:47,490 We're still going to have floats for floating point values. 351 00:15:47,490 --> 00:15:49,900 We're even going to have bools for true and false. 352 00:15:49,900 --> 00:15:53,550 But what's missing, now, from the list is long and floats. 353 00:15:53,550 --> 00:15:54,420 And why is that? 354 00:15:54,420 --> 00:15:56,220 Or rather, long and double. 355 00:15:56,220 --> 00:15:58,650 We'll recall that, in C, those used more bits. 356 00:15:58,650 --> 00:16:02,550 Well, in Python, the smaller data types, previously-- int and float, 357 00:16:02,550 --> 00:16:04,950 themselves-- just used more bits for you. 358 00:16:04,950 --> 00:16:08,010 And so, you don't need to distinguish between small and large.
359 00:16:08,010 --> 00:16:10,290 You just use one data type, and the language 360 00:16:10,290 --> 00:16:12,345 gives you a bigger range than before. 361 00:16:12,345 --> 00:16:15,510 It turns out, though, there's going to be some other features, as well, 362 00:16:15,510 --> 00:16:17,610 of Python, and these data types-- one of which 363 00:16:17,610 --> 00:16:20,010 will be called range, another of which will be list. 364 00:16:20,010 --> 00:16:21,402 So gone will be arrays. 365 00:16:21,402 --> 00:16:23,610 We'll actually use something literally called a list. 366 00:16:23,610 --> 00:16:28,110 Tuples-- sort of x, y pairs for coordinates and things like that. 367 00:16:28,110 --> 00:16:31,260 Dicts for dictionaries-- so we'll have built-in capabilities 368 00:16:31,260 --> 00:16:34,270 for storing keys and values we'll see, and even a set. 369 00:16:34,270 --> 00:16:36,270 Mathematically, a set is a collection of values, 370 00:16:36,270 --> 00:16:38,790 but it automatically gets rid of duplicates for you. 371 00:16:38,790 --> 00:16:43,470 So all of these things, we could absolutely implement in C if we wanted. 372 00:16:43,470 --> 00:16:47,940 And, indeed, in problem set five, you've been implementing your very own spell 373 00:16:47,940 --> 00:16:50,400 checker using some form of hash table. 374 00:16:50,400 --> 00:16:54,060 Well, it turns out that, in Python, you can solve those same problems, 375 00:16:54,060 --> 00:16:56,070 but perhaps a little more readily. 376 00:16:56,070 --> 00:16:58,980 In fact, let me go back over here to VS Code, 377 00:16:58,980 --> 00:17:01,895 and let me propose that I do the following. 378 00:17:01,895 --> 00:17:06,210 Let me go ahead and create a file called dictionary.py. 
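The built-in types listed above can be sketched in a few lines. The variable names and sample values are made up for illustration; the only assumption is Python 3, where an int grows as large as memory allows.

```python
# An int is not capped at 32 or 64 bits, so no separate long type is needed.
big = 2 ** 100
print(big)
print(type(big).__name__)

# The other types named in the lecture, with hypothetical sample values.
scores = [72, 73, 33]                    # list: ordered and resizable
point = (3, 4)                           # tuple: a fixed x, y pair
lengths = {"cat": 3, "caterpillar": 11}  # dict: keys mapped to values
unique = set(["cat", "dog", "cat"])      # set: duplicates are dropped

print(len(unique))
print(list(range(3)))                    # range: the values 0, 1, 2
print(lengths["caterpillar"])
```

A float, by contrast, still has finite precision; in the standard CPython interpreter it is usually stored as a C double.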
379 00:17:06,210 --> 00:17:09,510 Let me propose that I try to implement, say-- problem set five-- 380 00:17:09,510 --> 00:17:14,220 our spell checker in Python instead of C and achieve, ultimately, 381 00:17:14,220 --> 00:17:17,443 the same kind of behavior whereby I'll be 382 00:17:17,443 --> 00:17:19,235 able to spell check a whole bunch of words. 383 00:17:19,235 --> 00:17:21,480 So this is jumping the gun a little bit because you're 384 00:17:21,480 --> 00:17:23,897 about to see syntax we'll revisit over the course of today. 385 00:17:23,897 --> 00:17:26,580 But, for now, I've got a new file called dictionary.py. 386 00:17:26,580 --> 00:17:30,810 And let me begin to create some placeholders for functions. 387 00:17:30,810 --> 00:17:34,710 We'll see in just a bit that, in Python, you can define a function called check, 388 00:17:34,710 --> 00:17:38,000 and that check function can take a word as its input. 389 00:17:38,000 --> 00:17:40,292 And I'll come back to this in just a moment. 390 00:17:40,292 --> 00:17:42,000 In Python, I can define a second function 391 00:17:42,000 --> 00:17:44,865 like load, which itself will take a whole dictionary, 392 00:17:44,865 --> 00:17:47,010 just like in problem set five. 393 00:17:47,010 --> 00:17:51,010 And I'll go ahead and come back to the implementation of this. 394 00:17:51,010 --> 00:17:53,130 Meanwhile, we might similarly implement a function 395 00:17:53,130 --> 00:17:57,090 called size, which takes no arguments but, ultimately, is going to return 396 00:17:57,090 --> 00:17:59,100 the size of my dictionary of words. 397 00:17:59,100 --> 00:18:02,370 And then, lastly, for consistency with problem set five, 398 00:18:02,370 --> 00:18:05,130 we might define an unload function, whose purpose in life 399 00:18:05,130 --> 00:18:07,770 is to free any memory that you've been using, just 400 00:18:07,770 --> 00:18:09,390 to give it back to the computer.
401 00:18:09,390 --> 00:18:11,790 Now, odds are, whether you're still working on speller 402 00:18:11,790 --> 00:18:15,660 or have finished speller, you wrote a decent amount of lines of code. 403 00:18:15,660 --> 00:18:18,550 And indeed, it's been, by design, a challenge. 404 00:18:18,550 --> 00:18:22,620 But one of the reasons for these higher-level languages like Python 405 00:18:22,620 --> 00:18:25,680 is that you can stand on the shoulders of programmers before you 406 00:18:25,680 --> 00:18:28,703 and solve very common problems much more quickly. 407 00:18:28,703 --> 00:18:31,620 So that you can focus on building your new app or your web application 408 00:18:31,620 --> 00:18:34,690 or your own project to solve problems of interest to you. 409 00:18:34,690 --> 00:18:38,490 So at the risk of crushing some spirits, let 410 00:18:38,490 --> 00:18:42,540 me propose that, in Python if you want a dictionary for something like a spell 411 00:18:42,540 --> 00:18:44,070 checker, well, that's fine. 412 00:18:44,070 --> 00:18:48,030 Go ahead and give yourself a variable, like words, to store all of those words 413 00:18:48,030 --> 00:18:52,410 and just assign it equal to a dictionary-- or dict, for short, 414 00:18:52,410 --> 00:18:53,220 in Python. 415 00:18:53,220 --> 00:18:55,140 That will give you a hash table. 416 00:18:55,140 --> 00:18:57,690 Now, it turns out, in speller recall, you 417 00:18:57,690 --> 00:18:59,720 don't need to worry about words and definitions. 418 00:18:59,720 --> 00:19:01,763 It's just about spell-checking the words. 419 00:19:01,763 --> 00:19:03,930 So strictly speaking, we don't need keys and values. 420 00:19:03,930 --> 00:19:05,610 We just need keys. 421 00:19:05,610 --> 00:19:07,980 So I'm going to save myself a few more keystrokes 422 00:19:07,980 --> 00:19:11,055 by just saying that, technically, in Python, using a set suffices. 423 00:19:11,055 --> 00:19:13,770 Again, a set is just a collection of values with no duplicates. 
424 00:19:13,770 --> 00:19:16,400 But they don't necessarily have keys and values. 425 00:19:16,400 --> 00:19:18,250 It's just one or the other. 426 00:19:18,250 --> 00:19:21,420 But now that I have-- on line one, I claim the equivalent, in Python, 427 00:19:21,420 --> 00:19:25,720 of a hash table, I can actually do something like this. 428 00:19:25,720 --> 00:19:28,890 Here's how I might implement the check function in Python. 429 00:19:28,890 --> 00:19:33,840 If the word passed into this function is in my variable called words, 430 00:19:33,840 --> 00:19:35,390 well, return True. 431 00:19:35,390 --> 00:19:39,360 Else, go ahead and return False. 432 00:19:39,360 --> 00:19:40,030 Done. 433 00:19:40,030 --> 00:19:40,530 No, wait. 434 00:19:40,530 --> 00:19:42,990 You're thinking, if anything at all, maybe 435 00:19:42,990 --> 00:19:46,507 we want to handle uppercase and lowercase instead of just lowercase. 436 00:19:46,507 --> 00:19:47,340 Well, you know what? 437 00:19:47,340 --> 00:19:49,725 In Python, if you want to force a whole word to lowercase, 438 00:19:49,725 --> 00:19:51,360 you don't have to iterate over it with a loop. 439 00:19:51,360 --> 00:19:54,190 You don't have to use any of those ctype functions or anything. 440 00:19:54,190 --> 00:19:56,947 Just say word.lower, and that will convert the whole thing 441 00:19:56,947 --> 00:19:58,780 to lowercase for parity with the dictionary. 442 00:19:58,780 --> 00:19:59,440 All right. 443 00:19:59,440 --> 00:20:02,185 How about something like the load function in Python? 444 00:20:02,185 --> 00:20:06,130 Well, in Python, you can open files just like in C. For instance, in Python, I 445 00:20:06,130 --> 00:20:09,940 might do open, the dictionary argument in read mode, 446 00:20:09,940 --> 00:20:11,798 just like fopen in C. 447 00:20:11,798 --> 00:20:13,090 I might do something like this.
448 00:20:13,090 --> 00:20:20,230 For each line in that file, let me go ahead and add, to my words variable, 449 00:20:20,230 --> 00:20:21,430 that line. 450 00:20:21,430 --> 00:20:24,790 And then, let me go ahead and close that file. 451 00:20:24,790 --> 00:20:26,320 And I think I'm done. 452 00:20:26,320 --> 00:20:28,457 I'm just going to go ahead and return True, 453 00:20:28,457 --> 00:20:30,040 just because I think I'm already done. 454 00:20:30,040 --> 00:20:32,350 Now, here, too, I could nitpick a little bit. 455 00:20:32,350 --> 00:20:35,680 Technically, if I'm reading in every line from the file, 456 00:20:35,680 --> 00:20:38,620 every line in the dictionary ends with, technically, a backslash n. 457 00:20:38,620 --> 00:20:41,140 But there's an easy way to get rid of that, 458 00:20:41,140 --> 00:20:43,360 just like you might see with an alternative syntax. 459 00:20:43,360 --> 00:20:45,060 What I'm actually going to do is this. 460 00:20:45,060 --> 00:20:49,060 Let me grab, from the current line, the current word, 461 00:20:49,060 --> 00:20:51,940 by stripping off with right strip-- 462 00:20:51,940 --> 00:20:53,935 rstrip; a function we'll, again, see-- 463 00:20:53,935 --> 00:20:55,810 that just gets rid of the trailing newline-- 464 00:20:55,810 --> 00:20:58,000 the backslash n at the end of that line. 465 00:20:58,000 --> 00:21:01,900 And what I really want to do, then, is add this word to that dictionary. 466 00:21:01,900 --> 00:21:05,780 Meanwhile, if I want to figure out what the size is of my dictionary, well-- 467 00:21:05,780 --> 00:21:08,890 in C, you're probably writing code to iterate over all of those lines, 468 00:21:08,890 --> 00:21:12,040 and you're just going to count them up using a variable. 469 00:21:12,040 --> 00:21:13,060 Not so in Python. 470 00:21:13,060 --> 00:21:15,460 You can just return the length of those words. 471 00:21:15,460 --> 00:21:19,360 And better still, in Python, you don't have to manage your own memory.
472 00:21:19,360 --> 00:21:20,500 No more malloc. 473 00:21:20,500 --> 00:21:21,700 No more free. 474 00:21:21,700 --> 00:21:24,370 No more manual thinking about memory. 475 00:21:24,370 --> 00:21:27,310 The language just deals with all of that for you. 476 00:21:27,310 --> 00:21:28,030 So you know what? 477 00:21:28,030 --> 00:21:30,760 It suffices for me to just return True and claim 478 00:21:30,760 --> 00:21:33,640 that unloading is done for me. 479 00:21:33,640 --> 00:21:35,170 And that's it. 480 00:21:35,170 --> 00:21:37,840 Again, whether, you're in the middle of or already finished, 481 00:21:37,840 --> 00:21:39,960 this might, perhaps, adjust some frustration, 482 00:21:39,960 --> 00:21:45,700 but also, enlightenment in that this is why higher-level languages exist. 483 00:21:45,700 --> 00:21:47,605 You can build on top of the same principles, 484 00:21:47,605 --> 00:21:50,170 the same ideas, with which you've been dealing, 485 00:21:50,170 --> 00:21:51,820 struggling even this past week. 486 00:21:51,820 --> 00:21:55,090 But you can now express yourself all the more succinctly. 487 00:21:55,090 --> 00:21:59,590 This one line implements a hash table for you, and all of this, now, 488 00:21:59,590 --> 00:22:03,250 just uses that hash table in a simpler way. 489 00:22:03,250 --> 00:22:05,980 Any questions, now, on this, keeping in mind 490 00:22:05,980 --> 00:22:08,830 that the point, nonetheless, of speller in p-set 5 491 00:22:08,830 --> 00:22:12,160 is to understand what's really going on underneath the hood 492 00:22:12,160 --> 00:22:14,860 and, better still, to notice this. 493 00:22:14,860 --> 00:22:18,010 This might seem all rather amazing, but let me go ahead and do this. 494 00:22:18,010 --> 00:22:21,100 I've actually got a couple of versions of speller written here, 495 00:22:21,100 --> 00:22:24,800 and I've got a version written in C that I won't show the source code for. 
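The dictionary.py file narrated above is typed on screen but not captured in the transcript. The following is a reconstruction from the spoken description, so treat it as a sketch and not as the exact file from the lecture; the function and variable names (check, load, size, unload, words) are the ones mentioned aloud.

```python
# dictionary.py: a sketch reconstructed from the spoken walkthrough.

words = set()  # a set suffices: only the words matter, not definitions


def check(word):
    # Lowercase the whole word first, for parity with the dictionary file.
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    # `dictionary` is the path to a text file with one word per line.
    file = open(dictionary, "r")
    for line in file:
        word = line.rstrip()  # removes trailing whitespace, including "\n"
        words.add(word)
    file.close()
    return True


def size():
    # len() reports how many words are in the set; no manual counting.
    return len(words)


def unload():
    # Nothing to free by hand: Python manages the memory.
    return True
```

The body of check could be shortened to `return word.lower() in words`, since the `in` expression already evaluates to True or False.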
496 00:22:24,800 --> 00:22:28,990 But I'm going to go ahead and make that version of speller in C. 497 00:22:28,990 --> 00:22:32,470 And I'm going to go ahead here and, let's say, split 498 00:22:32,470 --> 00:22:34,270 my window here for just a moment. 499 00:22:34,270 --> 00:22:37,030 And I'm going to go into a Python version of speller, 500 00:22:37,030 --> 00:22:38,470 really, that I just wrote. 501 00:22:38,470 --> 00:22:42,820 And on the left-hand side here, let me go ahead and run speller-- 502 00:22:42,820 --> 00:22:44,740 the version I compiled in C-- 503 00:22:44,740 --> 00:22:47,890 using a big text like the Sherlock Holmes text, 504 00:22:47,890 --> 00:22:50,030 which has a whole lot of words in it. 505 00:22:50,030 --> 00:22:52,180 And on the right-hand side, let me run python 506 00:22:52,180 --> 00:22:55,510 of speller.py, which is a separate file I wrote in advance, 507 00:22:55,510 --> 00:22:57,430 just like we give you speller.c. 508 00:22:57,430 --> 00:23:00,790 And I'll, similarly, run this on the Sherlock Holmes text. 509 00:23:00,790 --> 00:23:05,020 And I'm going to do my best to hit Enter on the left and the right of my screen 510 00:23:05,020 --> 00:23:06,100 at the same time. 511 00:23:06,100 --> 00:23:08,770 But we should see, hopefully, the same list of misspelled words 512 00:23:08,770 --> 00:23:10,390 and the timings thereof. 513 00:23:10,390 --> 00:23:12,380 So here we go on the right. 514 00:23:12,380 --> 00:23:15,136 Here we go on the left. 515 00:23:15,136 --> 00:23:16,730 All right. 516 00:23:16,730 --> 00:23:18,680 A race to see which one wins here. 517 00:23:18,680 --> 00:23:19,820 C is on the left. 518 00:23:19,820 --> 00:23:21,680 Python is on the right. 519 00:23:21,680 --> 00:23:23,270 OK. 520 00:23:23,270 --> 00:23:25,530 Interesting. 521 00:23:25,530 --> 00:23:28,200 Hopefully, Python's close behind. 522 00:23:28,200 --> 00:23:30,330 Note that some of this is internet delay. 
523 00:23:30,330 --> 00:23:33,360 And so, it might not necessarily be a crazy number of seconds. 524 00:23:33,360 --> 00:23:37,050 But if we measure, at a low level, 525 00:23:37,050 --> 00:23:39,630 how much time the CPU spent executing my code, 526 00:23:39,630 --> 00:23:41,653 C took a total of 1.64 seconds. 527 00:23:41,653 --> 00:23:44,820 That was pretty fast, even though it took a moment more for all of the bytes 528 00:23:44,820 --> 00:23:46,590 to come over the internet. 529 00:23:46,590 --> 00:23:49,050 The Python version, though, took what? 530 00:23:49,050 --> 00:23:50,605 2.44 seconds. 531 00:23:50,605 --> 00:23:53,100 So what might the inference be? 532 00:23:53,100 --> 00:23:55,590 One, maybe I'm just better at programming in C 533 00:23:55,590 --> 00:23:59,400 than I am in Python, which is probably not true. 534 00:23:59,400 --> 00:24:03,210 But what else might you infer from this example? 535 00:24:03,210 --> 00:24:07,541 536 00:24:07,541 --> 00:24:11,176 Should we, maybe, give up on Python, stick with C? 537 00:24:11,176 --> 00:24:12,070 No? 538 00:24:12,070 --> 00:24:14,410 So what might be going on here? 539 00:24:14,410 --> 00:24:16,870 Why is the Python version, that I claim is correct-- 540 00:24:16,870 --> 00:24:20,620 and I think the numbers all line up, just not the times. 541 00:24:20,620 --> 00:24:21,820 Where is the trade-off here? 542 00:24:21,820 --> 00:24:23,915 Well, here, again, is this design trade-off. 543 00:24:23,915 --> 00:24:24,415 Yeah? 544 00:24:24,415 --> 00:24:29,310 AUDIENCE: In order to save the programmer time, [INAUDIBLE]. 545 00:24:29,310 --> 00:24:30,690 DAVID MALAN: Yeah, exactly.
546 00:24:30,690 --> 00:24:32,910 In order to save the human programmer time, 547 00:24:32,910 --> 00:24:35,700 there's a lot more features built into Python-- more functions, 548 00:24:35,700 --> 00:24:38,920 more automatic management of memory and so forth-- 549 00:24:38,920 --> 00:24:40,530 and you have to pay a price. 550 00:24:40,530 --> 00:24:43,193 Someone else's code is doing all of that work for you. 551 00:24:43,193 --> 00:24:45,360 But if they've written some number of lines of code, 552 00:24:45,360 --> 00:24:47,152 those are just more lines of code that need 553 00:24:47,152 --> 00:24:50,730 to be executed for you, whereas here, the computer is, 554 00:24:50,730 --> 00:24:54,615 at the risk of oversimplifying, only running my lines of code. 555 00:24:54,615 --> 00:24:55,865 So there's just less overhead. 556 00:24:55,865 --> 00:24:57,448 And so, this is a perpetual trade-off. 557 00:24:57,448 --> 00:25:00,990 Typically, when using a more user-friendly and more modern language, 558 00:25:00,990 --> 00:25:02,983 one of the prices you might pay is performance. 559 00:25:02,983 --> 00:25:06,150 Now, there's a lot of smart computer scientists in the world, though, trying 560 00:25:06,150 --> 00:25:08,440 to push back on those same trade-offs. 561 00:25:08,440 --> 00:25:11,220 And so, these interpreters, like the command I wrote, 562 00:25:11,220 --> 00:25:15,390 python, technically can-- especially if you run a program again and again-- 563 00:25:15,390 --> 00:25:19,350 actually, secretly, behind the scenes, compile your code for you, 564 00:25:19,350 --> 00:25:20,610 down to 0s and 1s. 565 00:25:20,610 --> 00:25:23,640 And then, the second, the third, the fourth time you run that program, 566 00:25:23,640 --> 00:25:25,010 it might very well be faster. 567 00:25:25,010 --> 00:25:27,150 So this is a bit of a head fake here, in that 568 00:25:27,150 --> 00:25:29,490 I'm running them once and only once.
569 00:25:29,490 --> 00:25:32,070 But we could get benefit over time if we kept 570 00:25:32,070 --> 00:25:34,183 running the Python version again and again 571 00:25:34,183 --> 00:25:35,850 and, perhaps, fine-tune the performance. 572 00:25:35,850 --> 00:25:38,017 But, in general, there's going to be this trade-off. 573 00:25:38,017 --> 00:25:40,560 Now, would you rather spend the 60 seconds 574 00:25:40,560 --> 00:25:43,620 I wrote implementing a spell checker or this 6 hours, 575 00:25:43,620 --> 00:25:47,910 16 hours you might be or have spent implementing the same in C? 576 00:25:47,910 --> 00:25:48,720 Probably not. 577 00:25:48,720 --> 00:25:52,650 For productivity's sake, this is why we have these additional languages. 578 00:25:52,650 --> 00:25:57,300 Just for fun, let me flip over to another screen here and open up 579 00:25:57,300 --> 00:26:00,540 a version of Python that's actually-- in just a second-- 580 00:26:00,540 --> 00:26:04,230 on my own Mac instead of the cloud so that 581 00:26:04,230 --> 00:26:06,490 I can actually do something with graphics. 582 00:26:06,490 --> 00:26:09,930 So, here, I just have a black and white terminal window on my very own Mac. 583 00:26:09,930 --> 00:26:12,450 And I've pre-installed Python, just like we've done so 584 00:26:12,450 --> 00:26:14,370 for VS Code in the cloud for you. 585 00:26:14,370 --> 00:26:19,320 Notice that I've got this photo of, perhaps, one of your favorite TV 586 00:26:19,320 --> 00:26:21,090 shows here, with the cast of The Office. 587 00:26:21,090 --> 00:26:24,630 Notice all of the faces in this image here. 588 00:26:24,630 --> 00:26:30,210 And let me propose that we try to find one face in the crowd, CSI-style, 589 00:26:30,210 --> 00:26:33,660 whereby we want to find, perhaps, the Scranton Strangler, so to speak. 590 00:26:33,660 --> 00:26:37,080 And so, here is an example of this guy's face. 591 00:26:37,080 --> 00:26:40,385 Now, how do we go about finding this specific face in the crowd? 
592 00:26:40,385 --> 00:26:42,510 Well, our human eyes, obviously, can pluck him out, 593 00:26:42,510 --> 00:26:44,370 especially if you're familiar with the show. 594 00:26:44,370 --> 00:26:46,605 But let me go ahead and do this instead. 595 00:26:46,605 --> 00:26:50,730 Let me go ahead and propose that we run code 596 00:26:50,730 --> 00:26:52,800 that I already wrote in advance here. 597 00:26:52,800 --> 00:26:55,085 This is a Python program with more lines of code 598 00:26:55,085 --> 00:26:56,460 that we won't dwell on for today. 599 00:26:56,460 --> 00:26:58,800 But it's meant to motivate what we can do. 600 00:26:58,800 --> 00:27:03,150 From a Pillow library, implying a Python image library, 601 00:27:03,150 --> 00:27:07,033 I want to import some type of information, 602 00:27:07,033 --> 00:27:09,450 some feature called Image so that I can manipulate images, 603 00:27:09,450 --> 00:27:12,150 not unlike our own problem set four. 604 00:27:12,150 --> 00:27:13,330 And this is powerful 605 00:27:13,330 --> 00:27:13,830 in 606 00:27:13,830 --> 00:27:14,330 Python. 607 00:27:14,330 --> 00:27:18,450 You can just [MIMICS EXPLOSION] import face recognition as a library 608 00:27:18,450 --> 00:27:19,950 that someone else wrote. 609 00:27:19,950 --> 00:27:22,770 From there, I'm going to create a variable called image. 610 00:27:22,770 --> 00:27:25,050 I'm going to use this face recognition library's 611 00:27:25,050 --> 00:27:27,330 load_image_file function. 612 00:27:27,330 --> 00:27:30,030 It's a little verbose, but it's similar in spirit to fopen. 613 00:27:30,030 --> 00:27:32,100 And I'm going to open office.jpeg. 614 00:27:32,100 --> 00:27:36,990 I'm going to, then, declare a second variable called face_locations, plural, 615 00:27:36,990 --> 00:27:40,620 because what I'm expecting to get back, per the documentation for this library, 616 00:27:40,620 --> 00:27:44,650 is a list of all of the faces' locations that are detected.
617 00:27:44,650 --> 00:27:45,150 All right. 618 00:27:45,150 --> 00:27:48,660 Then, I'm going to iterate over each of those faces using a for loop, 619 00:27:48,660 --> 00:27:50,460 that we'll see in more detail. 620 00:27:50,460 --> 00:27:53,475 I'm going to, then, infer what the top, right, bottom, and left corners 621 00:27:53,475 --> 00:27:55,050 are of that face. 622 00:27:55,050 --> 00:28:00,300 And then, what I'm going to do here is show that face alone, 623 00:28:00,300 --> 00:28:03,040 if I've detected the face in question. 624 00:28:03,040 --> 00:28:08,760 So let me go ahead, here, and run detect.py. 625 00:28:08,760 --> 00:28:12,370 And we'll see not just the one face we're looking for. 626 00:28:12,370 --> 00:28:16,430 But if I run Python of detect.py, it's going to do all of the analysis. 627 00:28:16,430 --> 00:28:22,380 I'll see a big opening here, now, of all of the faces that 628 00:28:22,380 --> 00:28:24,870 were detected in this here program. 629 00:28:24,870 --> 00:28:26,870 [CHUCKLES] OK, some better than others, I guess, 630 00:28:26,870 --> 00:28:28,560 if you zoom in on catching someone. 631 00:28:28,560 --> 00:28:29,970 Typical Angela. 632 00:28:29,970 --> 00:28:32,700 If you want to, now, find that one face, I 633 00:28:32,700 --> 00:28:34,920 think we need to train the software a bit more. 634 00:28:34,920 --> 00:28:37,080 So let me actually open up a second program called 635 00:28:37,080 --> 00:28:39,270 recognize that's got more going on. 636 00:28:39,270 --> 00:28:41,370 But let me, with a wave of a hand, point out 637 00:28:41,370 --> 00:28:45,870 that I'm now loading not only the office.jpeg, but also toby.jpeg 638 00:28:45,870 --> 00:28:49,840 to train the algorithm to find that specific face. 
639 00:28:49,840 --> 00:28:53,580 And so, now, if I run this second version-- recognize.py-- 640 00:28:53,580 --> 00:28:56,310 with Python of recognize.py-- 641 00:28:56,310 --> 00:28:59,160 hold my breath for just a moment; it's analyzing, presumably, 642 00:28:59,160 --> 00:29:00,420 all of the faces-- 643 00:29:00,420 --> 00:29:02,070 you see the same, original photo. 644 00:29:02,070 --> 00:29:05,610 But do you see one such face highlighted here? 645 00:29:05,610 --> 00:29:09,420 This version of the code found Toby, highlighted him 646 00:29:09,420 --> 00:29:12,110 with the screen and, voila, we have face recognition. 647 00:29:12,110 --> 00:29:14,318 So for better or for worse, this is what's happening, 648 00:29:14,318 --> 00:29:15,967 increasingly societally, nowadays. 649 00:29:15,967 --> 00:29:18,300 And honestly, even though I didn't write the code live-- 650 00:29:18,300 --> 00:29:21,420 because it's a good dozen or more lines of code-- it's not terribly many. 651 00:29:21,420 --> 00:29:24,450 And literally, all the authorities-- all we have to do 652 00:29:24,450 --> 00:29:27,960 is import face recognition and, voila, you have access. 653 00:29:27,960 --> 00:29:29,890 These technologies are here already. 654 00:29:29,890 --> 00:29:31,690 But let's consider, for just a moment-- 655 00:29:31,690 --> 00:29:33,820 how did we find Toby? 656 00:29:33,820 --> 00:29:35,150 How might that library-- 657 00:29:35,150 --> 00:29:37,900 even though we're not going to look at its implementation details, 658 00:29:37,900 --> 00:29:40,000 how does it find Toby and distinguish him 659 00:29:40,000 --> 00:29:43,960 from all of these other faces in the crowd? 660 00:29:43,960 --> 00:29:47,180 What might it be doing, intuitively. 661 00:29:47,180 --> 00:29:50,570 Think back even to p-set four, what you, yourselves, have access to, data-wise. 662 00:29:50,570 --> 00:29:51,083 Yeah? 663 00:29:51,083 --> 00:29:53,750 AUDIENCE: [? Since ?] 
we gave it an image of Toby's face before, 664 00:29:53,750 --> 00:29:59,010 it probably looks at are the pixels in one area the same as in another area 665 00:29:59,010 --> 00:30:00,720 and allots it to the same-- 666 00:30:00,720 --> 00:30:02,998 from that reference image to this image. 667 00:30:02,998 --> 00:30:06,870 And then, it's going to say, hey, a lot of the similar consult ranges 668 00:30:06,870 --> 00:30:09,292 are here and here, so we can safely guess 669 00:30:09,292 --> 00:30:10,750 that this is the same [? person. ?] 670 00:30:10,750 --> 00:30:11,875 DAVID MALAN: Yeah, exactly. 671 00:30:11,875 --> 00:30:15,610 And to summarize for the camera here, we have trained the software, if you will, 672 00:30:15,610 --> 00:30:17,560 by giving it a photo of Toby's face. 673 00:30:17,560 --> 00:30:20,218 So, by looking for the same or, really, similar pixels-- 674 00:30:20,218 --> 00:30:22,510 especially if it's a slightly different image of Toby-- 675 00:30:22,510 --> 00:30:24,970 we can, perhaps, identify him in the crowd. 676 00:30:24,970 --> 00:30:26,412 And what really is a human face? 677 00:30:26,412 --> 00:30:28,120 Well, at the end of the day, the computer 678 00:30:28,120 --> 00:30:30,340 only knows it as a pattern of bits or, really, 679 00:30:30,340 --> 00:30:32,110 at a higher level, a pattern of pixels. 
680 00:30:32,110 --> 00:30:35,170 So maybe a human face is, perhaps, best defined, in general, 681 00:30:35,170 --> 00:30:39,295 as two eyes and a nose and a mouth that, even though all of us look similar, 682 00:30:39,295 --> 00:30:43,268 structurally, odds are, the measurement between the eyes and the nose 683 00:30:43,268 --> 00:30:45,310 and the width of the mouth, the skin tone and all 684 00:30:45,310 --> 00:30:47,920 of these other physical characteristics are patterns 685 00:30:47,920 --> 00:30:51,280 that software could, perhaps, detect and then look, statistically, 686 00:30:51,280 --> 00:30:53,920 through the image, looking for the closest possible match 687 00:30:53,920 --> 00:30:57,422 to these various measurement shapes, colors and sizes and the like. 688 00:30:57,422 --> 00:30:59,130 And, indeed, that might be the intuition. 689 00:30:59,130 --> 00:31:03,520 But what's powerful here, again, is just how easy and readily available 690 00:31:03,520 --> 00:31:06,280 this technology now is. 691 00:31:06,280 --> 00:31:06,820 All right. 692 00:31:06,820 --> 00:31:10,605 So with that said, let's propose to consider what more we 693 00:31:10,605 --> 00:31:13,480 can do with Python itself, get back to the fundamentals, so that you, 694 00:31:13,480 --> 00:31:16,990 yourselves can start to implement something along those same lines. 695 00:31:16,990 --> 00:31:21,820 So besides having access to things like a get_string function, 696 00:31:21,820 --> 00:31:26,080 the CS50 library provides a few other things, as well-- namely, in C, 697 00:31:26,080 --> 00:31:27,040 we had these. 698 00:31:27,040 --> 00:31:29,052 But in Python, we're going to have fewer. 699 00:31:29,052 --> 00:31:32,260 In Python, our library, short-term, is going to give you not only get_string, 700 00:31:32,260 --> 00:31:33,740 but also get_int and get_float. 701 00:31:33,740 --> 00:31:34,240 Why? 
702 00:31:34,240 --> 00:31:36,310 It's actually just annoying, as we'll soon 703 00:31:36,310 --> 00:31:39,190 see, to get back an integer or a float from a user 704 00:31:39,190 --> 00:31:44,890 and just make sure that it's an int and a float and not a word like cat or dog, 705 00:31:44,890 --> 00:31:47,170 or some string that's not actually a number. 706 00:31:47,170 --> 00:31:50,810 Well, we can import not just the specific function, get_string, 707 00:31:50,810 --> 00:31:53,920 but we can actually import all of these functions one at a time, 708 00:31:53,920 --> 00:31:55,840 like this, as we'll soon see. 709 00:31:55,840 --> 00:31:59,410 Or you can even, in Python, import specific functions from a file. 710 00:31:59,410 --> 00:32:04,300 One of you asked a while back, when you include something like CS50.h 711 00:32:04,300 --> 00:32:08,780 or standard I/O .h, you're actually getting all of the code in that file, 712 00:32:08,780 --> 00:32:12,010 which, potentially, can add bulk to your own program or time. 713 00:32:12,010 --> 00:32:15,040 In this case, when you import specific functions from Python, 714 00:32:15,040 --> 00:32:17,875 you can be a little more narrowly precise 715 00:32:17,875 --> 00:32:21,230 as to what it is you want to have access to. 716 00:32:21,230 --> 00:32:21,730 All right. 717 00:32:21,730 --> 00:32:23,890 So, with that said, let's go ahead and see 718 00:32:23,890 --> 00:32:25,900 what conditionals look like in Python. 719 00:32:25,900 --> 00:32:29,470 So in the left-hand side again, here, we'll see Scratch. 720 00:32:29,470 --> 00:32:33,190 So it's just a contrived example asking if x is less than y, then, 721 00:32:33,190 --> 00:32:35,350 say, x is less than y. 722 00:32:35,350 --> 00:32:37,540 In C, it looked like this. 723 00:32:37,540 --> 00:32:41,050 In Python, now, it's going to look like this instead. 724 00:32:41,050 --> 00:32:44,815 And here's before in C, and here's after. 
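The slides being compared here are not part of the transcript. As a minimal sketch of the Python version of "if x is less than y", wrapped in a function (the name compare is hypothetical) so it can be run with any two values:

```python
def compare(x, y):
    # The condition needs no parentheses, and the body is marked by the
    # colon plus indentation; there are no curly braces or semicolons.
    if x < y:
        print("x is less than y")


compare(1, 2)  # prints: x is less than y
compare(2, 1)  # prints nothing
```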
725 00:32:44,815 --> 00:32:47,320 And just to call out a few of the obvious differences, what 726 00:32:47,320 --> 00:32:49,810 has changed, in Python, for conditionals, it would seem? 727 00:32:49,810 --> 00:32:53,013 728 00:32:53,013 --> 00:32:53,930 What's the difference? 729 00:32:53,930 --> 00:32:54,230 Yeah. 730 00:32:54,230 --> 00:32:55,920 AUDIENCE: There's a lack of curly braces. 731 00:32:55,920 --> 00:32:56,380 DAVID MALAN: Yeah. 732 00:32:56,380 --> 00:32:57,760 So there's no more curly braces. 733 00:32:57,760 --> 00:32:59,170 And, indeed, you don't use those. 734 00:32:59,170 --> 00:33:04,138 What appears to be taking their place, if you might infer? 735 00:33:04,138 --> 00:33:05,680 What seems to have taken their place? 736 00:33:05,680 --> 00:33:05,890 What do you think? 737 00:33:05,890 --> 00:33:06,765 AUDIENCE: [INAUDIBLE] 738 00:33:06,765 --> 00:33:09,560 DAVID MALAN: So the colon at the start of this line, here. 739 00:33:09,560 --> 00:33:13,760 But also even more important, now, is this indentation below it. 740 00:33:13,760 --> 00:33:16,160 So some of you, and we know this from office hours, 741 00:33:16,160 --> 00:33:19,380 have a habit of indenting everything on the left, right? 742 00:33:19,380 --> 00:33:21,200 And it's just this crazy mess to look at. 743 00:33:21,200 --> 00:33:23,000 Frustrating for you, surely. 744 00:33:23,000 --> 00:33:25,670 But C and Clang is pretty tolerant when it 745 00:33:25,670 --> 00:33:27,860 comes to things like white space in a program. 746 00:33:27,860 --> 00:33:29,030 Python, uh-uh. 747 00:33:29,030 --> 00:33:32,240 They realized, years ago, that-- let's help humans help themselves and just 748 00:33:32,240 --> 00:33:34,610 require standard indentation. 749 00:33:34,610 --> 00:33:36,620 So four spaces would be the norm here. 750 00:33:36,620 --> 00:33:38,870 But because it's indented below that colon, that, 751 00:33:38,870 --> 00:33:42,110 indeed, indicates that this, now, is part of that condition. 
752 00:33:42,110 --> 00:33:46,340 Something else has gone missing, versus C, in this conditional. 753 00:33:46,340 --> 00:33:47,855 What else is a little simplified? 754 00:33:47,855 --> 00:33:49,660 AUDIENCE: [INAUDIBLE] 755 00:33:49,660 --> 00:33:50,410 DAVID MALAN: Yeah. 756 00:33:50,410 --> 00:33:51,368 So no more parentheses. 757 00:33:51,368 --> 00:33:53,650 You can still use them, especially when you need to, 758 00:33:53,650 --> 00:33:56,112 logically, to do order of operations, like in math. 759 00:33:56,112 --> 00:33:57,820 But in this case, if you just want to ask 760 00:33:57,820 --> 00:34:01,162 a simple question, like if x is less than y, you can just do it like that. 761 00:34:01,162 --> 00:34:02,620 How about when you have an if else? 762 00:34:02,620 --> 00:34:05,170 Well, this is almost the same, here, with these same changes. 763 00:34:05,170 --> 00:34:06,800 In C, this looked like this. 764 00:34:06,800 --> 00:34:08,800 And it's starting to get a bit bulky-- at least, 765 00:34:08,800 --> 00:34:10,659 if we use our curly braces in this way. 766 00:34:10,659 --> 00:34:13,060 In Python, we can tighten things up further, even though, 767 00:34:13,060 --> 00:34:15,727 strictly speaking, in C, you don't always need the curly braces. 768 00:34:15,727 --> 00:34:18,280 But here, gone are the parentheses, again. 769 00:34:18,280 --> 00:34:20,020 Gone are the curly braces. 770 00:34:20,020 --> 00:34:23,380 Indentation is consistent, and we've just added another keyword, 771 00:34:23,380 --> 00:34:24,580 else, with the colon. 772 00:34:24,580 --> 00:34:26,325 But no more semicolons, as well. 773 00:34:26,325 --> 00:34:30,010 How about something larger, like this, an if, else if, else? 774 00:34:30,010 --> 00:34:31,960 This one's a little curious. 775 00:34:31,960 --> 00:34:35,290 But in C, it looked like this-- if, else if, else. 776 00:34:35,290 --> 00:34:38,143 In Python, it now looks like this.
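Again, the slide itself is not in the transcript. A sketch of the three-way version, assuming the same x and y comparison as before; the function name compare_three_ways and the messages for the second and third branches are illustrative guesses.

```python
def compare_three_ways(x, y):
    if x < y:
        print("x is less than y")
    elif x > y:  # Python's keyword is elif, not "else if"
        print("x is greater than y")
    else:
        print("x is equal to y")


compare_three_ways(1, 2)  # prints: x is less than y
compare_three_ways(2, 1)  # prints: x is greater than y
compare_three_ways(2, 2)  # prints: x is equal to y
```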
777 00:34:38,143 --> 00:34:40,060 And there's, perhaps, one curiosity here that, 778 00:34:40,060 --> 00:34:41,977 honestly, all these years later, I still can't 779 00:34:41,977 --> 00:34:43,630 remember how to spell it half the time. 780 00:34:43,630 --> 00:34:46,900 What's weird about this? 781 00:34:46,900 --> 00:34:50,415 What do you spot as different? 782 00:34:50,415 --> 00:34:51,230 Yeah, over here. 783 00:34:51,230 --> 00:34:53,520 AUDIENCE: [INAUDIBLE] 784 00:34:53,520 --> 00:34:54,270 DAVID MALAN: Yeah. 785 00:34:54,270 --> 00:34:56,260 Instead of else if, it's elif. 786 00:34:56,260 --> 00:34:56,760 Why? 787 00:34:56,760 --> 00:34:59,340 [SIGHS] Apparently, else space if was just too many 788 00:34:59,340 --> 00:35:02,250 keystrokes for humans to type, so they condensed it this way. 789 00:35:02,250 --> 00:35:05,100 Probably means it's a little more distinguishable, too, 790 00:35:05,100 --> 00:35:07,200 for the computer between the if and the else, too. 791 00:35:07,200 --> 00:35:08,700 But just something to remember, now. 792 00:35:08,700 --> 00:35:10,620 It's, indeed, elif and not else if. 793 00:35:10,620 --> 00:35:11,123 All right. 794 00:35:11,123 --> 00:35:12,540 So what about variables in Python? 795 00:35:12,540 --> 00:35:16,590 I've used a couple of them already, but let's 796 00:35:16,590 --> 00:35:19,533 distill exactly how you define and declare these things, as well. 797 00:35:19,533 --> 00:35:22,200 So, in Scratch, if we wanted to create a variable called counter 798 00:35:22,200 --> 00:35:25,185 and set it equal, initially, to 0, we would do something 799 00:35:25,185 --> 00:35:28,680 like this. In C, we'd specify that it's an int, use the assignment operator, 800 00:35:28,680 --> 00:35:30,060 end the thought with a semicolon. 801 00:35:30,060 --> 00:35:32,310 In Python, it's just simpler.
802 00:35:32,310 --> 00:35:34,680 You name the variable, use the assignment operator, 803 00:35:34,680 --> 00:35:37,755 as before, you set it equal to some value, and that's it. 804 00:35:37,755 --> 00:35:38,880 You don't mention the type. 805 00:35:38,880 --> 00:35:41,250 You don't mention the semicolon or anything more. 806 00:35:41,250 --> 00:35:44,250 What if you want to change a variable, like counter, 807 00:35:44,250 --> 00:35:46,320 by 1-- that is, increment it by 1? 808 00:35:46,320 --> 00:35:47,800 You have a few different ways here. 809 00:35:47,800 --> 00:35:51,990 In C, we saw syntax like this, where you can say counter equals counter plus 1, 810 00:35:51,990 --> 00:35:54,900 which, again, feels illogical. 811 00:35:54,900 --> 00:35:56,610 How can counter equal counter plus 1? 812 00:35:56,610 --> 00:36:01,890 But, again, we read this code, really, right to left, updating its value by 1. 813 00:36:01,890 --> 00:36:03,550 In Python, it's almost the same. 814 00:36:03,550 --> 00:36:04,535 You just get rid of the semicolon. 815 00:36:04,535 --> 00:36:05,580 So that logic is still there. 816 00:36:05,580 --> 00:36:08,070 But recall, in C, we could do something slightly different 817 00:36:08,070 --> 00:36:09,840 that we can also do in Python. 818 00:36:09,840 --> 00:36:12,060 In Python, you can also, more succinctly, 819 00:36:12,060 --> 00:36:15,420 do this-- plus equals, and then, whatever number you want to add. 820 00:36:15,420 --> 00:36:17,790 Or you can even change it to subtract, if you prefer. 821 00:36:17,790 --> 00:36:21,495 Sadly, gone is something you've probably typed a whole lot. 822 00:36:21,495 --> 00:36:23,940 What was the other way you could add 1? 823 00:36:23,940 --> 00:36:24,773 AUDIENCE: Plus plus? 824 00:36:24,773 --> 00:36:26,940 DAVID MALAN: Plus plus is no more, sadly, in Python.
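As a rough sketch of the variable syntax described so far (the name counter comes from the lecture; the particular sequence of updates is only an illustration):

```python
counter = 0            # no type keyword and no semicolon
counter = counter + 1  # evaluate the right-hand side, then store the result
counter += 1           # shorthand for the line above
counter -= 1           # the same shorthand works for subtraction

# There is no counter++ in Python; writing it is a syntax error.
print(counter)
```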
825 00:36:26,940 --> 00:36:29,600 Just too many ways to do the same thing, so they got rid of it 826 00:36:29,600 --> 00:36:31,705 in favor of just this syntax, here. 827 00:36:31,705 --> 00:36:33,140 So keep that in mind, as well. 828 00:36:33,140 --> 00:36:36,500 What about loops, when you want to do something in Python again and again. 829 00:36:36,500 --> 00:36:39,380 Well, in Scratch, in week zero, here's how we meowed three times, 830 00:36:39,380 --> 00:36:40,700 specifically. 831 00:36:40,700 --> 00:36:42,650 In C, we had a couple of ways of doing this. 832 00:36:42,650 --> 00:36:46,460 This was the more mechanical approach, where you create a variable called i. 833 00:36:46,460 --> 00:36:47,780 You set it equal to 0. 834 00:36:47,780 --> 00:36:51,230 You then do while i is less than 3, the following. 835 00:36:51,230 --> 00:36:54,530 And then, you, yourself increment i again and again. 836 00:36:54,530 --> 00:36:57,920 Mechanical in the sense that you have to implement all of these gears 837 00:36:57,920 --> 00:37:01,130 and make them turn yourself, but this was a correct way to do that. 838 00:37:01,130 --> 00:37:03,740 In Python, we can still achieve the same idea, 839 00:37:03,740 --> 00:37:05,945 but we don't need the int keyword. 840 00:37:05,945 --> 00:37:07,445 We don't need any of the semicolons. 841 00:37:07,445 --> 00:37:08,695 We don't need the parentheses. 842 00:37:08,695 --> 00:37:10,310 We don't need the curly braces. 843 00:37:10,310 --> 00:37:12,200 We can't use the plus plus, so maybe that's 844 00:37:12,200 --> 00:37:14,300 a minor step backwards if you're a fan. 845 00:37:14,300 --> 00:37:17,930 But otherwise, the code, the logic is exactly the same. 846 00:37:17,930 --> 00:37:20,390 But there's other ways to achieve this same idea. 847 00:37:20,390 --> 00:37:22,950 Recall that, in C, we could also do this. 848 00:37:22,950 --> 00:37:25,880 You could use a for loop, which does exactly the same thing. 
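The Python while loop described above might look like the following sketch. The printed word "meow" follows the Scratch example mentioned in the lecture; the exact slide code is not in this transcript.

```python
i = 0
while i < 3:       # no parentheses and no curly braces
    print("meow")
    i += 1         # i++ is not available, so use += 1
```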
849 00:37:25,880 --> 00:37:26,893 Both are correct. 850 00:37:26,893 --> 00:37:28,310 Both are, arguably, well-designed. 851 00:37:28,310 --> 00:37:32,000 It's to each their own when it comes to choosing between these. 852 00:37:32,000 --> 00:37:35,930 In Python, though, we're going to have to think through how to do this. 853 00:37:35,930 --> 00:37:41,300 So you don't do the same for loop as in C. The closest I could come up with 854 00:37:41,300 --> 00:37:44,270 is this, where you say for i-- 855 00:37:44,270 --> 00:37:47,555 or whatever variable you want to do the counting-- in-- literally 856 00:37:47,555 --> 00:37:50,522 the preposition-- and then, you use square brackets here. 857 00:37:50,522 --> 00:37:52,730 And we've used square brackets before, in the context 858 00:37:52,730 --> 00:37:55,370 of arrays and things like that. 859 00:37:55,370 --> 00:38:00,080 And the 0, 1, 2 looks like an array, in some sense, even though we've also seen 860 00:38:00,080 --> 00:38:01,470 arrays with curly braces. 861 00:38:01,470 --> 00:38:03,950 But these square brackets, for now, denote a list. 862 00:38:03,950 --> 00:38:05,420 Python does not have arrays. 863 00:38:05,420 --> 00:38:08,600 An array is that contiguous chunk of memory, back to back to back, 864 00:38:08,600 --> 00:38:13,160 that you have to resize somehow by moving things around in memory, 865 00:38:13,160 --> 00:38:14,450 as per two weeks ago. 866 00:38:14,450 --> 00:38:19,175 In Python, though, you can just create a list like this using square brackets. 867 00:38:19,175 --> 00:38:22,700 And better still, as we'll see, you can add or even remove things 868 00:38:22,700 --> 00:38:24,920 from that list down the road. 869 00:38:24,920 --> 00:38:27,140 This, though, is not going to be very well-designed. 870 00:38:27,140 --> 00:38:28,610 This will work. 871 00:38:28,610 --> 00:38:32,030 This will iterate in Python three times. 
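A minimal sketch of the list-based loop just described, followed by the list growing and shrinking. The names numbers and count are hypothetical helpers added for illustration.

```python
numbers = [0, 1, 2]    # square brackets create a list

count = 0
for i in numbers:      # i takes the values 0, 1, 2 in turn
    print("meow")
    count += 1

# Unlike a fixed-size C array, a list can change length later on.
numbers.append(3)      # add a value to the end
numbers.remove(0)      # remove the first occurrence of the value 0
print(numbers)
```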
872 00:38:32,030 --> 00:38:34,700 But what might rub you the wrong way about this design, 873 00:38:34,700 --> 00:38:36,860 even if you've never seen Python before? 874 00:38:36,860 --> 00:38:38,460 How does this example not end well? 875 00:38:38,460 --> 00:38:38,960 Yeah? 876 00:38:38,960 --> 00:38:41,810 AUDIENCE: Making a large list [INAUDIBLE]. 877 00:38:41,810 --> 00:38:42,560 DAVID MALAN: Yeah. 878 00:38:42,560 --> 00:38:45,830 If you're making a large list, you have to type out each one of these numbers, 879 00:38:45,830 --> 00:38:50,178 like comma 3, comma 4, comma 5, comma, dot, dot, dot, 50 comma, dot, dot, dot, 880 00:38:50,178 --> 00:38:50,678 500. 881 00:38:50,678 --> 00:38:52,640 Like, surely, that's not the best solution, 882 00:38:52,640 --> 00:38:55,760 to have all of these numbers on the screen, 883 00:38:55,760 --> 00:38:57,140 wrapping endlessly on the screen. 884 00:38:57,140 --> 00:39:01,100 So, in Python, another way to do this would be to use a function 885 00:39:01,100 --> 00:39:04,160 called range, which, technically, is a data type unto itself. 886 00:39:04,160 --> 00:39:08,080 And this returns to you as many values as you ask it for. 887 00:39:08,080 --> 00:39:09,830 range takes some other arguments, as well. 888 00:39:09,830 --> 00:39:14,540 But the simplest use case here is, if you want back the numbers 0, 1, and 2-- 889 00:39:14,540 --> 00:39:15,890 a total of three values-- 890 00:39:15,890 --> 00:39:19,070 you say, hey, Python, please give me a range of three values. 891 00:39:19,070 --> 00:39:21,260 And by default, they start at 0 on up. 892 00:39:21,260 --> 00:39:24,320 But this is more efficient than it would be 893 00:39:24,320 --> 00:39:26,390 to hard code the entire list at once. 894 00:39:26,390 --> 00:39:29,150 And the best metaphor I could come up with is something like this. 895 00:39:29,150 --> 00:39:30,775 Here, for instance, is a deck of cards.
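A short sketch of range as described above. The size comparison at the end uses sys.getsizeof, whose exact numbers are specific to the Python implementation (CPython is assumed here), so treat it as an illustration rather than a guarantee.

```python
import sys

for i in range(3):               # yields 0, 1, 2 one at a time
    print("meow")

first_three = list(range(3))     # materialize the values to look at them
one_to_three = list(range(1, 4)) # range also accepts a start and a stop

# A range object stores only its start, stop, and step, so it stays small
# no matter how many values it will eventually produce.
small = sys.getsizeof(range(1_000_000))
large = sys.getsizeof(list(range(1_000_000)))
print(small < large)
```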
896 00:39:30,775 --> 00:39:34,430 This is normal, human size, and there's presumably 52 cards here. 897 00:39:34,430 --> 00:39:38,728 So writing out 0 through 51 in code would be a little ridiculous 898 00:39:38,728 --> 00:39:39,770 for the reasons you note. 899 00:39:39,770 --> 00:39:44,510 And it would just be very unwieldy and ugly and wrapping and all of that. 900 00:39:44,510 --> 00:39:48,500 It would be the virtual equivalent of me handing you all of these cards at once 901 00:39:48,500 --> 00:39:49,430 to just deal with. 902 00:39:49,430 --> 00:39:52,760 And, right, they're not that big, but it's a lot of cards to hold on to. 903 00:39:52,760 --> 00:39:55,760 It requires a lot of memory or physical storage, if you will. 904 00:39:55,760 --> 00:39:59,840 What range does, metaphorically, is, if you ask me for three cards, 905 00:39:59,840 --> 00:40:04,910 I hand you them one at a time, like this, so that, at any point in time, 906 00:40:04,910 --> 00:40:08,150 you only have one number in the computer's memory 907 00:40:08,150 --> 00:40:09,760 until you're handed the next. 908 00:40:09,760 --> 00:40:11,840 The alternative-- the previous version would 909 00:40:11,840 --> 00:40:15,360 be to hand you all three cards at once, or all 52 cards at once. 910 00:40:15,360 --> 00:40:17,840 But in this case, range is just way more efficient. 911 00:40:17,840 --> 00:40:19,700 You can do range of 1,000. 912 00:40:19,700 --> 00:40:22,940 That's not going to give you a list of 1,000 values all at once. 913 00:40:22,940 --> 00:40:25,910 It's going to give you 1,000 values one at a time, 914 00:40:25,910 --> 00:40:30,800 reducing memory usage significantly in the computer itself. 915 00:40:30,800 --> 00:40:31,310 All right. 916 00:40:31,310 --> 00:40:34,745 So, besides this, what about doing something forever in Scratch? 917 00:40:34,745 --> 00:40:38,060 Well, we could do this, literally, with a forever block, which didn't quite 918 00:40:38,060 --> 00:40:42,590 exist in C.
In C, we had to hack it together by saying while true-- 919 00:40:42,590 --> 00:40:46,000 because true is, by definition, t-r-u-e, always true. 920 00:40:46,000 --> 00:40:50,420 So this just deliberately induces an infinite loop for us. 921 00:40:50,420 --> 00:40:53,375 In Python, the logic's going to be almost the same. 922 00:40:53,375 --> 00:40:55,250 And infinite loops in Python tend to actually 923 00:40:55,250 --> 00:40:58,760 be even more common because you can always break out of them, as you could 924 00:40:58,760 --> 00:41:02,280 in C. In Python, it looks like this. 925 00:41:02,280 --> 00:41:05,960 And this is slightly more subtle, but gone are the curly braces. 926 00:41:05,960 --> 00:41:07,370 Gone are the parentheses. 927 00:41:07,370 --> 00:41:10,400 But the ever so slight difference, too? 928 00:41:10,400 --> 00:41:13,187 A capital T for True and it's going to be a capital F for False. 929 00:41:13,187 --> 00:41:14,270 Stupid little differences. 930 00:41:14,270 --> 00:41:16,440 Eventually, you're going to mistype one or the other. 931 00:41:16,440 --> 00:41:18,607 But these are the kinds of things to keep an eye out for 932 00:41:18,607 --> 00:41:21,770 and to start recognizing in your mind's eye when you read code. 933 00:41:21,770 --> 00:41:25,310 Questions, now, on any of these building blocks? 934 00:41:25,310 --> 00:41:26,075 Yeah? 935 00:41:26,075 --> 00:41:31,360 AUDIENCE: In the for loop, was i set to 0 once for [? every loop? ?] 936 00:41:31,360 --> 00:41:33,970 DAVID MALAN: In the for loop, was i-- 937 00:41:33,970 --> 00:41:37,090 it was set to 0 on the first iteration, then 1 on the next, 938 00:41:37,090 --> 00:41:38,530 then 2 on the third. 939 00:41:38,530 --> 00:41:39,985 And the same thing for range. 940 00:41:39,985 --> 00:41:44,050 It just doesn't use up as much memory all at once. 941 00:41:44,050 --> 00:41:49,860 Other questions, now, on any of these building blocks of Python? 942 00:41:49,860 --> 00:41:50,400 All right.
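The deliberate infinite loop described above could be sketched as follows. The counter and the break condition are assumptions added so that the example terminates; the lecture's slide simply loops forever.

```python
count = 0
while True:            # capital T; lowercase true is not defined in Python
    print("meow")
    count += 1
    if count == 3:     # illustrative exit condition
        break          # leave the otherwise endless loop
```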
943 00:41:50,400 --> 00:41:53,250 Well, let's go ahead and build something a little more than hello. 944 00:41:53,250 --> 00:41:56,415 Let me propose that, over here, we implement, maybe, 945 00:41:56,415 --> 00:41:58,200 the simplest of calculators here. 946 00:41:58,200 --> 00:42:02,145 So let me go back to VS Code here, open my terminal window 947 00:42:02,145 --> 00:42:06,885 and open up, say, a file called calculator.py. 948 00:42:06,885 --> 00:42:09,000 And in calculator.py, we'll have an opportunity 949 00:42:09,000 --> 00:42:11,340 to explore some of these building blocks, 950 00:42:11,340 --> 00:42:13,890 but we'll allow things to escalate pretty quickly 951 00:42:13,890 --> 00:42:17,225 to more interesting examples so that we can do the same thing, ultimately, 952 00:42:17,225 --> 00:42:17,760 as well. 953 00:42:17,760 --> 00:42:19,510 And, in fact, let me go ahead and do this. 954 00:42:19,510 --> 00:42:22,950 Moreover, I've brought some code with me in advance. 955 00:42:22,950 --> 00:42:25,725 For instance, something called calculator0.c, 956 00:42:25,725 --> 00:42:28,860 from the first week of C. And let me go ahead 957 00:42:28,860 --> 00:42:34,420 and split my window here, in fact, so that I can now do something like this. 958 00:42:34,420 --> 00:42:37,170 Let me move this over here, here. 959 00:42:37,170 --> 00:42:38,105 Calculator.py. 960 00:42:38,105 --> 00:42:40,920 So now, I have, on the left of my screen, calculator.c-- 961 00:42:40,920 --> 00:42:43,620 or calculator0.c because that's the first version I 962 00:42:43,620 --> 00:42:45,690 made-- and calculator.py on the right. 963 00:42:45,690 --> 00:42:48,290 Let me go ahead and implement, really, the same idea here. 964 00:42:48,290 --> 00:42:51,675 So on the right-hand side, the analog of including cs50.h 965 00:42:51,675 --> 00:42:56,390 would be from cs50 import get_int if I want to, indeed, use this function. 
966 00:42:56,390 --> 00:42:58,140 Now, I'm going to go ahead and give myself 967 00:42:58,140 --> 00:43:00,453 a variable x without defining its type. 968 00:43:00,453 --> 00:43:02,370 I'm going to use this get_int function and I'm 969 00:43:02,370 --> 00:43:05,302 going to prompt the user for x, just like in C. 970 00:43:05,302 --> 00:43:08,010 I'm, then, going to go ahead and prompt the user for another int, 971 00:43:08,010 --> 00:43:12,300 like y, here, just like in C. And at the very end, I'm going to go ahead 972 00:43:12,300 --> 00:43:14,640 and do print x plus y. 973 00:43:14,640 --> 00:43:15,690 And that's it. 974 00:43:15,690 --> 00:43:19,020 Now, granted, I have some comments in my C version of the code, 975 00:43:19,020 --> 00:43:21,090 just to remind you of what each line is doing. 976 00:43:21,090 --> 00:43:23,878 But I've still distilled this into six lines-- or, really, four 977 00:43:23,878 --> 00:43:25,170 if I get rid of the blank line. 978 00:43:25,170 --> 00:43:29,580 So it's already, perhaps, a bit tighter here. 979 00:43:29,580 --> 00:43:33,600 It's tighter because something really important, historically, is missing. 980 00:43:33,600 --> 00:43:38,240 What did I seem to omit altogether that we haven't really highlighted yet? 981 00:43:38,240 --> 00:43:39,136 Yeah? 982 00:43:39,136 --> 00:43:40,530 AUDIENCE: [INAUDIBLE] 983 00:43:40,530 --> 00:43:41,280 DAVID MALAN: Yeah. 984 00:43:41,280 --> 00:43:42,910 The main function is gone. 985 00:43:42,910 --> 00:43:45,330 And in fact, maybe you took for granted that it just 986 00:43:45,330 --> 00:43:47,580 worked a moment ago when I wrote hello, but I didn't 987 00:43:47,580 --> 00:43:49,273 have a main function in hello, either. 988 00:43:49,273 --> 00:43:52,440 And this, too, is a feature of Python and a lot of other languages, as well. 
989 00:43:52,440 --> 00:43:55,320 Instead of having to adhere to these long-standing traditions, 990 00:43:55,320 --> 00:43:57,400 if you just want to write code and get something done, fine. 991 00:43:57,400 --> 00:43:59,925 Just write code and get something done without, necessarily, 992 00:43:59,925 --> 00:44:01,185 all of this same boilerplate. 993 00:44:01,185 --> 00:44:04,380 So whatever is in your Python file-- 994 00:44:04,380 --> 00:44:06,510 left indented, if you will, by default-- 995 00:44:06,510 --> 00:44:10,180 is just going to be the code that the interpreter runs, top to bottom, 996 00:44:10,180 --> 00:44:10,850 left to right. 997 00:44:10,850 --> 00:44:14,300 Well, let me go ahead, now, and run code like this. 998 00:44:14,300 --> 00:44:17,470 Let me go ahead and open back up my terminal window, 999 00:44:17,470 --> 00:44:19,140 run python of calculator.py. 1000 00:44:19,140 --> 00:44:21,570 And I'll do x is 1, y is 2. 1001 00:44:21,570 --> 00:44:23,460 And as you might expect, it gives me 3. 1002 00:44:23,460 --> 00:44:24,570 Slight aesthetic bug. 1003 00:44:24,570 --> 00:44:26,590 I put my space in the wrong place here. 1004 00:44:26,590 --> 00:44:27,810 So that's a newbie mistake. 1005 00:44:27,810 --> 00:44:29,220 Let me fix that, aesthetically. 1006 00:44:29,220 --> 00:44:31,050 Let me rerun python of calculator.py. 1007 00:44:31,050 --> 00:44:31,680 Type in 1. 1008 00:44:31,680 --> 00:44:32,250 Type in 2. 1009 00:44:32,250 --> 00:44:36,280 And, voila, there is now my same version again. 1010 00:44:36,280 --> 00:44:39,585 But let me propose, now, that we get rid of this training wheel. 1011 00:44:39,585 --> 00:44:41,460 We don't want to keep taking one step forward 1012 00:44:41,460 --> 00:44:43,793 and then two steps back by adding these training wheels, 1013 00:44:43,793 --> 00:44:45,330 so let me instead do this. 
1014 00:44:45,330 --> 00:44:49,590 In my version of calculator.py, suppose that we take away, already, 1015 00:44:49,590 --> 00:44:53,610 the training wheel that is the CS50 library here and let me, 1016 00:44:53,610 --> 00:44:56,910 instead, then, use just Python's built-in function called 1017 00:44:56,910 --> 00:44:59,020 input, which literally does just that. 1018 00:44:59,020 --> 00:45:03,600 It gets input from the user and it stores it, as before, in x and y. 1019 00:45:03,600 --> 00:45:04,950 So this is not CS50-specific. 1020 00:45:04,950 --> 00:45:07,155 This is real-world Python programming. 1021 00:45:07,155 --> 00:45:10,740 Well, let me go ahead and run, again, python of calculator.py. 1022 00:45:10,740 --> 00:45:16,530 And, of course, if x is 1 and y is 2, x plus y should, of course, still be 3. 1023 00:45:16,530 --> 00:45:19,306 1024 00:45:19,306 --> 00:45:24,285 It's apparently 12, according to Python, until CS50's library gets involved. 1025 00:45:24,285 --> 00:45:28,620 But does anyone want to infer what just went wrong? 1026 00:45:28,620 --> 00:45:29,160 Yeah? 1027 00:45:29,160 --> 00:45:32,925 AUDIENCE: We're always [INAUDIBLE]. 1028 00:45:32,925 --> 00:45:33,800 DAVID MALAN: Exactly. 1029 00:45:33,800 --> 00:45:37,660 The input function, by design, always returns a string of text. 1030 00:45:37,660 --> 00:45:39,410 After all, that's what the human typed in. 1031 00:45:39,410 --> 00:45:42,620 And even though, yes, I typed the number keys on the keyboard, 1032 00:45:42,620 --> 00:45:44,600 it's still coming back as all text. 1033 00:45:44,600 --> 00:45:47,090 Now, maybe we should use like a get_int function. 1034 00:45:47,090 --> 00:45:48,575 Well, that doesn't exist in Python. 1035 00:45:48,575 --> 00:45:52,340 All you can do is get textual input-- a string from the user. 1036 00:45:52,340 --> 00:45:54,415 But we can convert one to the other. 
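To see why the answer came out as 12, here is a small sketch that stands in for the two input calls with the strings a user would have typed. The variable names x and y match the lecture; hard-coding the strings is an assumption so the example runs without a keyboard.

```python
x = "1"   # what input("x: ") returns when the user types 1
y = "2"   # what input("y: ") returns when the user types 2

result = x + y   # + on two strings concatenates them
print(result)    # prints 12
```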
1037 00:45:54,415 --> 00:45:58,610 And so, a fix for this so that we don't accidentally concatenate-- 1038 00:45:58,610 --> 00:46:02,760 that is, join x and y together-- would be to do something like this. 1039 00:46:02,760 --> 00:46:04,595 Let me go back to my Python code, here. 1040 00:46:04,595 --> 00:46:08,870 And whereas, in C, we could previously do typecasting-- 1041 00:46:08,870 --> 00:46:11,060 we can convert one type to another-- 1042 00:46:11,060 --> 00:46:14,420 that generally wasn't the case when you were doing something complex, 1043 00:46:14,420 --> 00:46:15,470 like a string to an int. 1044 00:46:15,470 --> 00:46:18,450 You could do a char to an int and vice versa. 1045 00:46:18,450 --> 00:46:22,370 But for a string, recall, there was a special function in C's standard library (stdlib.h) 1046 00:46:22,370 --> 00:46:25,100 called atoi, like ASCII to integer. 1047 00:46:25,100 --> 00:46:27,880 That's the closest analog, here. 1048 00:46:27,880 --> 00:46:29,630 And, in fact, the way to do this in Python 1049 00:46:29,630 --> 00:46:32,740 would be to use a function called int, which, 1050 00:46:32,740 --> 00:46:34,490 indeed, is the name of the data type, too, 1051 00:46:34,490 --> 00:46:36,380 even though I have not yet had to type it. 1052 00:46:36,380 --> 00:46:40,340 And I can convert the output of the input function 1053 00:46:40,340 --> 00:46:44,600 from a string immediately to an int. 1054 00:46:44,600 --> 00:46:48,620 And now, if I go back to my terminal window, rerun python of calculator.py 1055 00:46:48,620 --> 00:46:52,770 with 1 and 2 for x and y, now, I'm back in business. 1056 00:46:52,770 --> 00:46:55,400 So that, then, is, for instance, what the CS50 library 1057 00:46:55,400 --> 00:46:59,420 does, if temporarily this week, is it just deals with the conversion for you.
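A minimal sketch of the conversion just described. The helper add_inputs is hypothetical and takes the typed text as arguments so the example can run without a keyboard; in calculator.py the same idea would be written as x = int(input("x: ")).

```python
def add_inputs(x_text, y_text):
    """Convert two pieces of typed text to ints, then add them."""
    x = int(x_text)   # int() turns the string "1" into the number 1
    y = int(y_text)
    return x + y


print(add_inputs("1", "2"))   # prints 3
```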
1058 00:46:59,420 --> 00:47:03,500 And, in fact, bad things could happen if I type the wrong thing, 1059 00:47:03,500 --> 00:47:05,615 like dog or cat instead of a number. 1060 00:47:05,615 --> 00:47:08,400 But we'll cross that bridge in just a moment, as well. 1061 00:47:08,400 --> 00:47:08,900 All right. 1062 00:47:08,900 --> 00:47:11,990 What if we do something slightly different, now, with our calculator. 1063 00:47:11,990 --> 00:47:16,400 1064 00:47:16,400 --> 00:47:18,790 Instead of addition, let's do division instead. 1065 00:47:18,790 --> 00:47:23,990 So z equals x divided by y, thereby giving me a third variable z. 1066 00:47:23,990 --> 00:47:27,320 Let me go ahead and run python of calculator.py again. 1067 00:47:27,320 --> 00:47:29,120 I'll type in 1. 1068 00:47:29,120 --> 00:47:31,790 I'll type in 3 this time. 1069 00:47:31,790 --> 00:47:37,470 And what problem do you think we're about to see? 1070 00:47:37,470 --> 00:47:38,400 Or is it gone? 1071 00:47:38,400 --> 00:47:41,670 What happened when I did this in C, albeit with some slightly more 1072 00:47:41,670 --> 00:47:47,680 cryptic syntax, when I divided one number, like 1 divided by 3? 1073 00:47:47,680 --> 00:47:48,600 Anyone recall? 1074 00:47:48,600 --> 00:47:49,100 Yeah? 1075 00:47:49,100 --> 00:47:51,310 AUDIENCE: You would round to the nearest integer. 1076 00:47:51,310 --> 00:47:52,060 DAVID MALAN: Yeah. 1077 00:47:52,060 --> 00:47:55,030 So it would round down to the nearest integer, 1078 00:47:55,030 --> 00:47:57,560 whereby you experience truncation. 1079 00:47:57,560 --> 00:48:00,340 So if you take an integer like 1, you divide it 1080 00:48:00,340 --> 00:48:02,530 by another integer like 3, that technically 1081 00:48:02,530 --> 00:48:06,310 should be 0.33333, infinitely long. 1082 00:48:06,310 --> 00:48:10,297 But in C, recall, you truncate the value. 
1083 00:48:10,297 --> 00:48:12,130 If you divide an int by an int, you get back 1084 00:48:12,130 --> 00:48:14,965 an int, which means you get only the integer part, which was the 0. 1085 00:48:14,965 --> 00:48:18,805 Now, Python actually handles this for us and avoids the truncation. 1086 00:48:18,805 --> 00:48:23,650 But it leaves us, still, with one other problem here, which is going to be, 1087 00:48:23,650 --> 00:48:27,453 for instance, not necessarily visible at a glance. 1088 00:48:27,453 --> 00:48:28,245 This looks correct. 1089 00:48:28,245 --> 00:48:31,780 This has solved the problem we saw in C. So truncation does not happen. 1090 00:48:31,780 --> 00:48:36,010 The integers are automatically converted to a float-- a floating point value. 1091 00:48:36,010 --> 00:48:41,970 But what other problem did we trip over, back in week one? 1092 00:48:41,970 --> 00:48:44,480 1093 00:48:44,480 --> 00:48:49,700 What else got a little dicey when dealing with simple arithmetic? 1094 00:48:49,700 --> 00:48:51,238 Anyone recall? 1095 00:48:51,238 --> 00:48:53,280 Well, the syntax in Python is a little different, 1096 00:48:53,280 --> 00:48:54,780 but let me go ahead and do this. 1097 00:48:54,780 --> 00:48:58,700 It turns out, in Python, if you want to see more significant digits than what 1098 00:48:58,700 --> 00:49:02,360 I'm seeing here by default, which is a dozen or so, let me go ahead 1099 00:49:02,360 --> 00:49:03,715 and print out z as follows. 1100 00:49:03,715 --> 00:49:07,310 Let me first print out a format string because I want to format z 1101 00:49:07,310 --> 00:49:08,780 in an interesting way. 1102 00:49:08,780 --> 00:49:11,330 And notice, this would have no effect on the output. 1103 00:49:11,330 --> 00:49:14,630 This is just a format string that, for no compelling reason at the moment, 1104 00:49:14,630 --> 00:49:19,280 is interpolating z in those curly braces using an f-string or format string.
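The division behavior described above can be sketched like this. The // operator is not mentioned in the lecture at this point; it is included only as an assumption-labeled contrast, since it discards the fractional part much as C's int-by-int division did for these values.

```python
x = 1
y = 3

z = x / y        # true division: two ints produce a float, no truncation
print(z)         # prints 0.3333333333333333
print(f"{z}")    # the same value, interpolated via an f-string

q = x // y       # floor division, shown for contrast with C
print(q)         # prints 0
```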
1105 00:49:19,280 --> 00:49:23,390 If I run this again with 1 and 3, we'll see, indeed, the exact same thing. 1106 00:49:23,390 --> 00:49:25,700 But when you use an f-string, you, indeed, 1107 00:49:25,700 --> 00:49:28,460 have the ability to format that string more precisely. 1108 00:49:28,460 --> 00:49:32,930 Just like with %f in C, you could start to fine-tune how many significant 1109 00:49:32,930 --> 00:49:35,720 digits you see. 1110 00:49:35,720 --> 00:49:37,070 1111 00:49:37,070 --> 00:49:40,190 In Python, you can do the same, but the syntax is a little different. 1112 00:49:40,190 --> 00:49:43,925 If you want the computer to interpolate z and show you 1113 00:49:43,925 --> 00:49:47,570 50 digits-- that is, 50 numbers 1114 00:49:47,570 --> 00:49:50,033 after the decimal point-- the syntax is similar to C, 1115 00:49:50,033 --> 00:49:51,200 but it's a little different. 1116 00:49:51,200 --> 00:49:54,110 You literally put a colon after the variable's name. 1117 00:49:54,110 --> 00:49:59,090 dot 50 means show me the decimal point and, then, 50 digits to the right, 1118 00:49:59,090 --> 00:50:02,760 and the f just indicates please treat this as a floating point value. 1119 00:50:02,760 --> 00:50:05,540 So now, if I rerun python of calculator.py, 1120 00:50:05,540 --> 00:50:11,495 divide 1 by 3, unfortunately, Python has not solved all of the world's problems 1121 00:50:11,495 --> 00:50:12,710 for us. 1122 00:50:12,710 --> 00:50:15,545 This, again, was an example of floating point imprecision. 1123 00:50:15,545 --> 00:50:17,692 So that problem is still latent. 1124 00:50:17,692 --> 00:50:20,150 So just because the world has advanced, doesn't necessarily 1125 00:50:20,150 --> 00:50:22,317 mean that all of our problems from C have gone away. 1126 00:50:22,317 --> 00:50:26,418 There are solutions using third-party libraries for scientific calculations 1127 00:50:26,418 --> 00:50:26,960 and the like.
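A small sketch of the format specifier described above, using the lecture's 1 divided by 3. The variable name formatted is an assumption added for illustration.

```python
z = 1 / 3

# :.50f means: fixed-point notation with 50 digits after the decimal point.
formatted = f"{z:.50f}"
print(formatted)

# The first 16 or so digits are 3s; after that the stored binary
# approximation of one third shows through as other digits.
print(formatted == "0." + "3" * 50)   # prints False
```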
1128 00:50:26,960 --> 00:50:31,445 But out of the box, floating point imprecision is still an issue. 1129 00:50:31,445 --> 00:50:35,780 Meanwhile, there was one other problem in C 1130 00:50:35,780 --> 00:50:39,890 that we ran into involving numbers, and that was this-- integer overflow. 1131 00:50:39,890 --> 00:50:41,930 Recall that an integer in C only took up, 1132 00:50:41,930 --> 00:50:45,140 what, 32 bits typically, which meant you could count as high as 4 billion 1133 00:50:45,140 --> 00:50:48,140 or, maybe, if you're doing positive and negatives, as high as 2 billion, 1134 00:50:48,140 --> 00:50:50,030 after which, weird things would happen. 1135 00:50:50,030 --> 00:50:54,798 The number would go to 0 or negative or it would overflow or wrap back around. 1136 00:50:54,798 --> 00:50:56,840 Well, wonderfully, in Python, they did, at least, 1137 00:50:56,840 --> 00:51:00,800 address this, whereby you can count as high as you want. 1138 00:51:00,800 --> 00:51:03,830 And Python will just use more and more and more and more 1139 00:51:03,830 --> 00:51:08,000 bits and bytes to store really big numbers so integer overflow is not 1140 00:51:08,000 --> 00:51:09,020 a thing. 1141 00:51:09,020 --> 00:51:13,820 With that said, Python is limited to how many digits it will show you 1142 00:51:13,820 --> 00:51:15,410 on the screen at once as a string. 1143 00:51:15,410 --> 00:51:18,560 But, mathematically, your math will be correct now. 1144 00:51:18,560 --> 00:51:21,860 So we've taken a couple of steps forward, one step sideways. 1145 00:51:21,860 --> 00:51:25,530 But, indeed, we have solved some of our problems here. 1146 00:51:25,530 --> 00:51:26,030 All right. 1147 00:51:26,030 --> 00:51:32,230 Questions, now, on any of these examples thus far? 1148 00:51:32,230 --> 00:51:34,400 Question? 1149 00:51:34,400 --> 00:51:35,000 All right. 1150 00:51:35,000 --> 00:51:40,250 Well, how about another problem that we encountered in C. 
Let's 1151 00:51:40,250 --> 00:51:41,720 revisit it here in Python, as well. 1152 00:51:41,720 --> 00:51:43,595 So let me go ahead and, on the left-hand side 1153 00:51:43,595 --> 00:51:54,020 here, let me open up a file called, say, compare3.c on the left, 1154 00:51:54,020 --> 00:51:57,640 and let me go ahead and create a new file on the right called compare.py. 1155 00:51:57,640 --> 00:52:00,070 Because recall that bad things happened when 1156 00:52:00,070 --> 00:52:03,580 we needed to compare two values in C. So on the left, 1157 00:52:03,580 --> 00:52:06,550 here, is a reminder of what we once did in C, 1158 00:52:06,550 --> 00:52:11,230 whereby, if we want to compare values, we can get an int in C, store it in x. 1159 00:52:11,230 --> 00:52:13,450 Get another int in C, store it in y. 1160 00:52:13,450 --> 00:52:16,180 We then have our familiar conditional logic here, 1161 00:52:16,180 --> 00:52:19,210 just printing out if x is less than y or not. 1162 00:52:19,210 --> 00:52:23,080 Well, we can certainly do the same thing, ultimately, in Python 1163 00:52:23,080 --> 00:52:25,720 by using some fairly familiar syntax. 1164 00:52:25,720 --> 00:52:27,640 And let's just demonstrate this one quickly. 1165 00:52:27,640 --> 00:52:29,500 Let me go over here, too. 1166 00:52:29,500 --> 00:52:34,690 I'll do from cs50 import get_int, even though I could do this, instead, 1167 00:52:34,690 --> 00:52:36,700 with the input function itself. 1168 00:52:36,700 --> 00:52:39,700 x equals get_int, and I'll prompt the user for that. 1169 00:52:39,700 --> 00:52:42,880 y equals get_int, and I'll prompt the user for that. 1170 00:52:42,880 --> 00:52:45,910 After that, recall that I can say, without parentheses, 1171 00:52:45,910 --> 00:52:52,010 if x is less than y, then print out, without the f, "x is less than y."
1172 00:52:52,010 --> 00:52:58,570 Then, I can go ahead and say else if x is greater than y, I can print out, 1173 00:52:58,570 --> 00:53:01,270 quote unquote, "x is greater than y." 1174 00:53:01,270 --> 00:53:05,320 If you'd like to interject now, what did I screw up? 1175 00:53:05,320 --> 00:53:05,820 Anyone? 1176 00:53:05,820 --> 00:53:06,150 Yeah? 1177 00:53:06,150 --> 00:53:06,915 AUDIENCE: Elif. 1178 00:53:06,915 --> 00:53:07,957 DAVID MALAN: Elif, right? 1179 00:53:07,957 --> 00:53:13,965 So elif x is greater than y, else-- this part's the same-- print 1180 00:53:13,965 --> 00:53:18,000 "x is equal to y." 1181 00:53:18,000 --> 00:53:19,805 There's no new logic going on here. 1182 00:53:19,805 --> 00:53:21,960 But, at least syntactically, it's a little cleaner. 1183 00:53:21,960 --> 00:53:25,500 Indeed, this program is only 11 lines long, albeit without any comments. 1184 00:53:25,500 --> 00:53:27,765 Let me go ahead and run python of compare.py. 1185 00:53:27,765 --> 00:53:28,350 Let's see. 1186 00:53:28,350 --> 00:53:30,235 Is 1 less than 2? 1187 00:53:30,235 --> 00:53:30,735 Indeed. 1188 00:53:30,735 --> 00:53:32,070 Let's run it again. 1189 00:53:32,070 --> 00:53:33,330 Is 2 less than 1? 1190 00:53:33,330 --> 00:53:34,890 No, it's greater than. 1191 00:53:34,890 --> 00:53:37,740 And let's, lastly, type in 1 twice. 1192 00:53:37,740 --> 00:53:38,910 x is equal to y. 1193 00:53:38,910 --> 00:53:42,030 So we've got a pretty side-by-side, one-to-one conversion here. 1194 00:53:42,030 --> 00:53:44,190 Let's do something a little more interesting, then. 1195 00:53:44,190 --> 00:53:48,270 In C, how about I open, instead, something where we actually 1196 00:53:48,270 --> 00:53:49,310 compared for a purpose? 1197 00:53:49,310 --> 00:53:54,150 So if I open up, from earlier in the course-- 1198 00:53:54,150 --> 00:54:00,320 how about agree.c, which prompts the user to agree to something or not?
1199 00:54:00,320 --> 00:54:03,860 And let me code up a new version here, called agree.py. 1200 00:54:03,860 --> 00:54:06,720 And I'll do this on the right-hand side, with agree.py. 1201 00:54:06,720 --> 00:54:08,830 But on agree.c on the left-- 1202 00:54:08,830 --> 00:54:12,210 notice that this is how we did this yes-no thing in C-- 1203 00:54:12,210 --> 00:54:16,590 we compared c, a character, equal to single quotes 'Y' 1204 00:54:16,590 --> 00:54:18,840 or equal to single quotes little 'y.' 1205 00:54:18,840 --> 00:54:20,430 And then, the same thing for n. 1206 00:54:20,430 --> 00:54:22,470 Now, in Python, this one is actually going 1207 00:54:22,470 --> 00:54:23,960 to be a little bit different, here. 1208 00:54:23,960 --> 00:54:27,310 Let me go ahead and, in the Python version of this, 1209 00:54:27,310 --> 00:54:29,640 let me do something like this. 1210 00:54:29,640 --> 00:54:31,258 We'll use get_string. 1211 00:54:31,258 --> 00:54:31,800 Actually, no. 1212 00:54:31,800 --> 00:54:33,217 We'll just use input in this case. 1213 00:54:33,217 --> 00:54:36,780 So let's do s equals input. 1214 00:54:36,780 --> 00:54:38,940 And we'll ask the user the same thing-- 1215 00:54:38,940 --> 00:54:40,875 Do you agree, question mark. 1216 00:54:40,875 --> 00:54:46,110 Then, let's go ahead and say, if s equals equals-- 1217 00:54:46,110 --> 00:54:48,940 how about Y? 1218 00:54:48,940 --> 00:54:49,740 Huh. 1219 00:54:49,740 --> 00:54:50,758 How do I do this? 1220 00:54:50,758 --> 00:54:51,550 Well, a few things. 1221 00:54:51,550 --> 00:54:54,660 Turns out, I'm going to do this-- s equals equals little y. 1222 00:54:54,660 --> 00:54:57,210 Then, I'm going to go ahead and print out "Agreed." 1223 00:54:57,210 --> 00:55:03,390 And elif s equals equals capital N or s equals equals lowercase n, 1224 00:55:03,390 --> 00:55:05,520 I'm going to go ahead and print out "Not agreed." 
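A minimal sketch of this first agree.py, assuming the comparisons are joined with the word "or" as discussed next. The helper name agree is hypothetical, and sample answers stand in for the string the lecture reads with input.

```python
def agree(s):
    """Classify a one-letter answer using == and the keyword or."""
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    return None  # any other answer: the lecture's version prints nothing


# In the lecture, s comes from input("Do you agree? "); these are stand-ins.
for answer in ["Y", "y", "N", "n"]:
    print(agree(answer))
```

Each answer is an ordinary string of length 1, written here with double quotes.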
1225 00:55:05,520 --> 00:55:08,820 And I claim, for the moment, that this is identical, now, 1226 00:55:08,820 --> 00:55:13,760 to the program on the left in C. But what's different? 1227 00:55:13,760 --> 00:55:17,280 So we're still doing the same kind of logic, these equal equals 1228 00:55:17,280 --> 00:55:18,780 for comparing for equality. 1229 00:55:18,780 --> 00:55:21,922 But notice that, nicely enough, Python got rid of the two vertical bars, 1230 00:55:21,922 --> 00:55:23,505 and it's just literally the word "or." 1231 00:55:23,505 --> 00:55:27,933 If you recall seeing ampersand ampersand to express a logical and in C, [GRUNTS] 1232 00:55:27,933 --> 00:55:29,850 you can just write, literally, the word "and." 1233 00:55:29,850 --> 00:55:33,390 And so, here's a hint of why Python tends to be pretty popular. 1234 00:55:33,390 --> 00:55:35,640 People just like that it's a little closer to English. 1235 00:55:35,640 --> 00:55:38,520 There's a little less of the cryptic syntax here. 1236 00:55:38,520 --> 00:55:41,850 Now, this is correct, as this code will now work. 1237 00:55:41,850 --> 00:55:45,750 But I've also used double quotes instead of single quotes, 1238 00:55:45,750 --> 00:55:48,780 and I also omitted, a few minutes ago, from my list of data 1239 00:55:48,780 --> 00:55:51,180 types in Python the word "char." 1240 00:55:51,180 --> 00:55:53,430 In Python, there are no chars. 1241 00:55:53,430 --> 00:55:55,320 There are no individual characters. 1242 00:55:55,320 --> 00:55:58,830 If you want to manipulate an individual character, you use a string-- 1243 00:55:58,830 --> 00:56:00,510 that is to say, a str-- 1244 00:56:00,510 --> 00:56:01,680 of size 1. 1245 00:56:01,680 --> 00:56:04,930 Now, in Python, you can use single quotes or double quotes. 1246 00:56:04,930 --> 00:56:06,930 I'm deliberately using double quotes everywhere, 1247 00:56:06,930 --> 00:56:09,715 just for consistency with how we treat strings in C. 
1248 00:56:09,715 --> 00:56:12,090 It's pretty common, though, to use single quotes instead, 1249 00:56:12,090 --> 00:56:14,190 if only because, on most keyboards, you don't 1250 00:56:14,190 --> 00:56:16,320 have to hold the Shift key anymore. 1251 00:56:16,320 --> 00:56:18,288 Humans have really started to optimize just how 1252 00:56:18,288 --> 00:56:19,830 quickly they want to be able to code. 1253 00:56:19,830 --> 00:56:22,110 So using a single quote tends to be pretty popular 1254 00:56:22,110 --> 00:56:24,270 in Python and other languages, as well. 1255 00:56:24,270 --> 00:56:29,520 They are fundamentally the same, single or double, unlike in C, 1256 00:56:29,520 --> 00:56:30,570 where they have meaning. 1257 00:56:30,570 --> 00:56:33,120 So this is correct, I claim. 1258 00:56:33,120 --> 00:56:34,830 And, in fact, let me run this real quick. 1259 00:56:34,830 --> 00:56:37,090 I'll open up my terminal window here. 1260 00:56:37,090 --> 00:56:40,230 Let me get rid of the version in C and run python of agree.py. 1261 00:56:40,230 --> 00:56:42,420 And I'll type in Y. OK. 1262 00:56:42,420 --> 00:56:44,220 I'll run it again and type in little y. 1263 00:56:44,220 --> 00:56:46,780 And I'll stipulate it's going to work for no, as well. 1264 00:56:46,780 --> 00:56:49,840 But this isn't necessarily the only way we can do this. 1265 00:56:49,840 --> 00:56:52,350 There are other ways to implement the same idea. 1266 00:56:52,350 --> 00:56:57,630 And in fact, I can go about doing this instead. 1267 00:56:57,630 --> 00:56:59,910 Let me go back up to my code here. 1268 00:56:59,910 --> 00:57:03,240 And we saw a hint of this earlier. 1269 00:57:03,240 --> 00:57:06,240 We know that lists exist in Python, and you can create them 1270 00:57:06,240 --> 00:57:08,040 just by using square brackets. 
1271 00:57:08,040 --> 00:57:10,380 So what if I simplify the code a little bit and just 1272 00:57:10,380 --> 00:57:14,940 say if s is in the following list of values-- 1273 00:57:14,940 --> 00:57:17,850 capital Y or lowercase y. 1274 00:57:17,850 --> 00:57:21,090 It's not all that different, logically, but it's a little tighter. 1275 00:57:21,090 --> 00:57:22,440 It's a little more compact. 1276 00:57:22,440 --> 00:57:29,040 So elif s is in capital N or lowercase n, I can express that same idea, too. 1277 00:57:29,040 --> 00:57:32,220 So here, again, it's just getting a little more pleasant to write code. 1278 00:57:32,220 --> 00:57:33,960 There's less hitting of the keyboard. 1279 00:57:33,960 --> 00:57:36,090 You can express yourself a little more succinctly. 1280 00:57:36,090 --> 00:57:40,020 And using the keyword in, Python will figure out 1281 00:57:40,020 --> 00:57:44,370 how to search the entire list for whatever the value of s is. 1282 00:57:44,370 --> 00:57:47,010 And if it finds it, it will return True automatically. 1283 00:57:47,010 --> 00:57:48,230 Else, it will return False. 1284 00:57:48,230 --> 00:57:54,960 So if I run agree.py again and type in capital Y or lowercase y, that still, 1285 00:57:54,960 --> 00:57:55,695 now, works. 1286 00:57:55,695 --> 00:58:00,330 Well, I can tighten this up further if I want to add more features. 1287 00:58:00,330 --> 00:58:04,710 Well, what if I want to support not just big Y and little y, 1288 00:58:04,710 --> 00:58:10,050 but how about "Yes" or "yes" or, in case the user 1289 00:58:10,050 --> 00:58:14,357 is yelling or someone who isn't good with CapsLock types in "YES?" 1290 00:58:14,357 --> 00:58:14,940 Wait a minute. 1291 00:58:14,940 --> 00:58:16,020 But it could be weird. 1292 00:58:16,020 --> 00:58:20,850 Do we want to support this or this? 
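The list-based version with the keyword in, as described above, might look like this. It is a sketch under the same assumptions as before: agree is a hypothetical helper, and sample answers replace the call to input.

```python
def agree(s):
    """Classify an answer by searching short lists with the keyword in."""
    if s in ["Y", "y"]:  # True if s equals any element of the list
        return "Agreed."
    elif s in ["N", "n"]:
        return "Not agreed."
    return None  # unrecognized answers fall through silently


# Stand-ins for what the user might type at the prompt.
for answer in ["Y", "y", "N", "n"]:
    print(agree(answer))
```

Supporting "Yes", "yes", "YES" and so on this way would mean listing every capitalization by hand, which is the tedium the lecture points out.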
1293 00:58:20,850 --> 00:58:23,480 This just gets really tedious, quickly, combinatorially, 1294 00:58:23,480 --> 00:58:25,710 if you consider all of these possible permutations. 1295 00:58:25,710 --> 00:58:27,990 What would be smarter than doing something 1296 00:58:27,990 --> 00:58:30,120 like this, if you want to just be able to tolerate 1297 00:58:30,120 --> 00:58:33,570 "yes" in any form of capitalization? 1298 00:58:33,570 --> 00:58:35,370 Logically, what would be nice? 1299 00:58:35,370 --> 00:58:38,232 AUDIENCE: Maybe, whatever the input is, you just transfer it over 1300 00:58:38,232 --> 00:58:40,357 to all lowercase or all uppercase, and then redo it? 1301 00:58:40,357 --> 00:58:41,125 DAVID MALAN: Exactly. 1302 00:58:41,125 --> 00:58:42,042 Super common paradigm. 1303 00:58:42,042 --> 00:58:46,510 Why don't we just force the user's input to all lowercase or all uppercase-- 1304 00:58:46,510 --> 00:58:49,570 doesn't matter, so long as we're self-consistent-- and just compare 1305 00:58:49,570 --> 00:58:52,030 against all uppercase or all lowercase. 1306 00:58:52,030 --> 00:58:55,760 And that will get rid of all of the possible permutations, otherwise. 1307 00:58:55,760 --> 00:58:58,510 Now, in C, we might have done something like this. 1308 00:58:58,510 --> 00:59:01,820 We might have simplified this whole list and just said-- 1309 00:59:01,820 --> 00:59:04,940 let's say we'll do-- 1310 00:59:04,940 --> 00:59:06,220 how about lowercase? 1311 00:59:06,220 --> 00:59:10,490 So y or yes, and we'll just leave it at that. 1312 00:59:10,490 --> 00:59:12,370 But we need to force, now, s to lowercase. 1313 00:59:12,370 --> 00:59:15,970 Well, in C, we would have used the ctype library. 1314 00:59:15,970 --> 00:59:19,660 We would have done tolower and call that function, passing it in. 1315 00:59:19,660 --> 00:59:22,330 Although, not really because, in ctype, those 1316 00:59:22,330 --> 00:59:25,870 operate on individual characters or chars, not whole strings.
1317 00:59:25,870 --> 00:59:29,920 We actually didn't see a function that could convert a whole string in C 1318 00:59:29,920 --> 00:59:31,030 to lowercase. 1319 00:59:31,030 --> 00:59:34,910 But in Python, we're going to benefit from some other feature, as well. 1320 00:59:34,910 --> 00:59:39,330 It turns out that Python supports what's called object-oriented programming. 1321 00:59:39,330 --> 00:59:41,830 And we're only going to scratch the surface of this in CS50. 1322 00:59:41,830 --> 00:59:44,740 But if you take a higher-level C course in programming or CS, 1323 00:59:44,740 --> 00:59:46,750 you explore this as a different paradigm. 1324 00:59:46,750 --> 00:59:49,930 Up until now, in C, we've been focusing on what's called, really, 1325 00:59:49,930 --> 00:59:51,025 procedural programming. 1326 00:59:51,025 --> 00:59:52,210 You write procedures. 1327 00:59:52,210 --> 00:59:55,250 You write functions, top to bottom, left to right. 1328 00:59:55,250 --> 00:59:57,790 And when you want to change some value, we 1329 00:59:57,790 --> 01:00:00,550 were in the habit of using a procedure-- that is, a function. 1330 01:00:00,550 --> 01:00:03,670 You would pass something, like a variable, into a function, 1331 01:00:03,670 --> 01:00:07,600 like toupper or tolower, and it would do its thing and hand you back a value. 1332 01:00:07,600 --> 01:00:12,610 Well, it turns out that it would be nicer, programming-wise, if some data 1333 01:00:12,610 --> 01:00:15,250 types just had built-in functionality. 1334 01:00:15,250 --> 01:00:18,220 Why do we have our variables over here and all of our helper functions, 1335 01:00:18,220 --> 01:00:21,010 like toupper and tolower over here, such that we constantly 1336 01:00:21,010 --> 01:00:22,660 have to pass one into the other. 
1337 01:00:22,660 --> 01:00:27,590 It would be nice to bake into our data type some built-in functionality 1338 01:00:27,590 --> 01:00:33,267 so that you can change variables using their own, default built-in 1339 01:00:33,267 --> 01:00:33,850 functionality. 1340 01:00:33,850 --> 01:00:37,510 And so, Object-Oriented Programming, otherwise known as OOP, 1341 01:00:37,510 --> 01:00:41,635 is a technique whereby certain types of values, like a string-- 1342 01:00:41,635 --> 01:00:47,230 AKA str-- not only have properties inside of them-- 1343 01:00:47,230 --> 01:00:49,900 attributes, just like a struct in C-- 1344 01:00:49,900 --> 01:00:54,480 your data can also have functions built into them, as well. 1345 01:00:54,480 --> 01:00:57,955 So, whereas in C, which is not object-oriented, you have structs. 1346 01:00:57,955 --> 01:01:01,150 And structs can only store data, like a name and a number 1347 01:01:01,150 --> 01:01:02,620 when implementing a person. 1348 01:01:02,620 --> 01:01:07,210 In Python, you can, for instance, have not just a structure-- 1349 01:01:07,210 --> 01:01:09,010 otherwise known as a class-- 1350 01:01:09,010 --> 01:01:10,930 storing a name and a number. 1351 01:01:10,930 --> 01:01:15,460 You can have a function call that person or email that person 1352 01:01:15,460 --> 01:01:19,510 or actual verbs or actions associated with that piece of data. 1353 01:01:19,510 --> 01:01:21,910 Now, in the context of strings, it turns out 1354 01:01:21,910 --> 01:01:24,565 that strings come with a lot of useful functionality. 1355 01:01:24,565 --> 01:01:28,900 And in fact, at this URL here, which is in docs.python.org, 1356 01:01:28,900 --> 01:01:31,720 which is the official documentation for Python, 1357 01:01:31,720 --> 01:01:34,300 you'll see a whole list of methods-- 1358 01:01:34,300 --> 01:01:37,705 that is, functions-- that come with strings that you can actually 1359 01:01:37,705 --> 01:01:40,150 use to modify their values. 
1360 01:01:40,150 --> 01:01:42,440 And what I mean by this is the following. 1361 01:01:42,440 --> 01:01:44,900 If we go through the documentation, poke around, 1362 01:01:44,900 --> 01:01:48,163 it turns out that strings come with a function called lower. 1363 01:01:48,163 --> 01:01:50,080 And if you want to use that function, you just 1364 01:01:50,080 --> 01:01:54,850 have to use slightly different syntax than in C. You do not do tolower, 1365 01:01:54,850 --> 01:01:59,140 and you do not say, as I just did, lower because this function is 1366 01:01:59,140 --> 01:02:01,150 built into s itself. 1367 01:02:01,150 --> 01:02:05,770 And just like in C, when you want to go inside of a variable, like a structure, 1368 01:02:05,770 --> 01:02:09,790 and access a piece of data inside of it, like name or number, 1369 01:02:09,790 --> 01:02:12,370 when you also have functions built into data types-- 1370 01:02:12,370 --> 01:02:17,530 AKA methods; a method is just a function that is built into a piece of data-- 1371 01:02:17,530 --> 01:02:23,480 you can do s dot lower open paren, closed paren in this case. 1372 01:02:23,480 --> 01:02:25,480 And I can do this down here, as well. 1373 01:02:25,480 --> 01:02:33,280 If s.lower in, quote unquote, "n" or "no", the whole thing, 1374 01:02:33,280 --> 01:02:35,455 I can force this whole thing to lowercase. 1375 01:02:35,455 --> 01:02:38,620 So the only difference here, now, in object-oriented programming, 1376 01:02:38,620 --> 01:02:41,840 instead of constantly passing a value into a function, 1377 01:02:41,840 --> 01:02:45,910 you just access a function that's inside of the value. 1378 01:02:45,910 --> 01:02:48,928 It just works because of how the language itself is defined. 1379 01:02:48,928 --> 01:02:51,220 And the only way you know whether these functions exist 1380 01:02:51,220 --> 01:02:55,495 is the documentation-- a class, a book, a website or the like. 1381 01:02:55,495 --> 01:03:00,490 Questions, now, on this technique?
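Calling the lower method on the string itself, as just described, could be sketched like this. As before, agree is a hypothetical helper and the sample answers stand in for the user's input; note that the parentheses after lower are required to actually call the method.

```python
def agree(s):
    """Classify an answer after forcing it to lowercase with str.lower()."""
    if s.lower() in ["y", "yes"]:  # lower() is a method built into every str
        return "Agreed."
    elif s.lower() in ["n", "no"]:  # note: lower() is called a second time here
        return "Not agreed."
    return None


# Any capitalization of "yes" or "no" is now tolerated.
for answer in ["Y", "yes", "YES", "yEs", "No"]:
    print(agree(answer))
```

This version calls lower once per branch, which works but repeats the same conversion.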
1382 01:03:00,490 --> 01:03:01,070 All right. 1383 01:03:01,070 --> 01:03:02,513 I claim this is correct. 1384 01:03:02,513 --> 01:03:05,180 Now, even though you've never programmed, most of you, in Python 1385 01:03:05,180 --> 01:03:07,655 before, not super well-designed. 1386 01:03:07,655 --> 01:03:12,140 There's a subtle inefficiency, now, on lines 3 and 5 together. 1387 01:03:12,140 --> 01:03:18,150 What's dumb about how I've used lower, might you think? 1388 01:03:18,150 --> 01:03:18,720 Yeah? 1389 01:03:18,720 --> 01:03:21,975 AUDIENCE: I feel like, using it twice, you'd just want another [? variable. ?] 1390 01:03:21,975 --> 01:03:22,440 DAVID MALAN: Yeah. 1391 01:03:22,440 --> 01:03:25,482 If you're going to use the same function twice and ask the same question, 1392 01:03:25,482 --> 01:03:29,248 expecting the same answer, why are you calling the function itself twice? 1393 01:03:29,248 --> 01:03:31,415 Maybe we should just store the result in a variable. 1394 01:03:31,415 --> 01:03:33,030 So we could do this in a couple of different ways. 1395 01:03:33,030 --> 01:03:36,360 We, for instance, could go up here and create another variable called t 1396 01:03:36,360 --> 01:03:38,040 and set that equal to s.lower. 1397 01:03:38,040 --> 01:03:41,330 And then, we could just change this to be t, here. 1398 01:03:41,330 --> 01:03:43,080 But honestly, I don't think we technically 1399 01:03:43,080 --> 01:03:45,480 need another variable altogether, here. 1400 01:03:45,480 --> 01:03:47,410 I could just do something like this. 1401 01:03:47,410 --> 01:03:52,360 Let's change the value of s to be the lowercase version thereof. 1402 01:03:52,360 --> 01:03:55,920 And so, now, I can quite simply refer to s again and again like this, 1403 01:03:55,920 --> 01:03:57,550 reusing that same value. 1404 01:03:57,550 --> 01:04:01,380 Now, to be sure, I have now just lost the user's original input.
1405 01:04:01,380 --> 01:04:05,430 And if I care about that-- if they typed in all caps, I have no idea anymore. 1406 01:04:05,430 --> 01:04:08,070 So maybe I do want to use a separate variable, altogether. 1407 01:04:08,070 --> 01:04:10,830 But a takeaway here, too, is that strings in Python 1408 01:04:10,830 --> 01:04:13,590 are technically what we'll call immutable-- 1409 01:04:13,590 --> 01:04:15,640 that is, they cannot be changed. 1410 01:04:15,640 --> 01:04:19,830 This was not true in C. Once we gave you arrays in week two 1411 01:04:19,830 --> 01:04:22,800 or memory in week four, you could go to town on a string 1412 01:04:22,800 --> 01:04:25,780 and change any of the characters you want-- uppercasing, lowercasing, 1413 01:04:25,780 --> 01:04:27,560 changing it, shortening it and so forth. 1414 01:04:27,560 --> 01:04:33,690 But in this case, this returns a copy of s, forced to lowercase. 1415 01:04:33,690 --> 01:04:35,790 It doesn't change the original string-- 1416 01:04:35,790 --> 01:04:38,700 that is, the bytes in the computer's memory. 1417 01:04:38,700 --> 01:04:41,580 When you assign it back to s, you're essentially 1418 01:04:41,580 --> 01:04:43,703 forgetting about the old version of s. 1419 01:04:43,703 --> 01:04:46,620 But because Python does memory management for you-- there's no malloc, 1420 01:04:46,620 --> 01:04:47,820 there's no free-- 1421 01:04:47,820 --> 01:04:52,200 Python automatically frees up the original bytes, like Y-E-S, 1422 01:04:52,200 --> 01:04:54,750 and hands them back to the operating system for you. 1423 01:04:54,750 --> 01:04:55,340 All right. 1424 01:04:55,340 --> 01:04:59,640 Questions, now, on this technique? 1425 01:04:59,640 --> 01:05:02,310 Questions on this? 1426 01:05:02,310 --> 01:05:05,145 In general, I'll call out-- the Python documentation 1427 01:05:05,145 --> 01:05:07,927 will start to be your friend because, in class, we'll only scratch 1428 01:05:07,927 --> 01:05:09,510 the surface with some of these things. 
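The point about immutability and reassignment can be illustrated with a short sketch. The variable names t and rejected are illustrative only, and the literal "YES" stands in for whatever the user typed.

```python
s = "YES"  # stand-in for the user's original input

# lower() returns a new, lowercased copy; the original string is untouched.
t = s.lower()
print(s)  # YES
print(t)  # yes

# Strings are immutable, so changing a character in place is an error.
rejected = False
try:
    s[0] = "y"
except TypeError:
    rejected = True
print(rejected)  # True

# Reassigning s just makes the name refer to the new string;
# Python reclaims the old one automatically (no malloc, no free).
s = s.lower()
print(s)  # yes
```

Keeping the lowercased copy in a separate variable such as t preserves the original input, while reassigning s discards it.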
1429 01:05:09,510 --> 01:05:12,210 But in docs.python.org, for instance, there's 1430 01:05:12,210 --> 01:05:15,630 a whole reference of all of the built-in functions that come with the language, 1431 01:05:15,630 --> 01:05:18,135 as well as, for instance, those with a string. 1432 01:05:18,135 --> 01:05:19,620 All right. 1433 01:05:19,620 --> 01:05:23,205 Before we take a break, let's go ahead and create something a little familiar 1434 01:05:23,205 --> 01:05:27,030 too based on our weeks here, in C. Let me 1435 01:05:27,030 --> 01:05:30,690 propose that we revisit those examples involving some meows. 1436 01:05:30,690 --> 01:05:34,260 So, for instance, when we had our cat meow back in the first week 1437 01:05:34,260 --> 01:05:37,650 and, then, second in C, we did something that was a little stupid at first 1438 01:05:37,650 --> 01:05:41,960 whereby we created a file, as I'll do here-- this time, called meow.py. 1439 01:05:41,960 --> 01:05:44,550 And if I want a cat to meow three times, I 1440 01:05:44,550 --> 01:05:47,190 could run it once, like this, a little copy-paste. 1441 01:05:47,190 --> 01:05:50,580 And now, python of meow.py, and I'm done. 1442 01:05:50,580 --> 01:05:53,100 Now, we've visited this example two times, at least, 1443 01:05:53,100 --> 01:05:54,690 now in Scratch and in C. 1444 01:05:54,690 --> 01:06:00,080 It's correct, I'll stipulate, but what's, obviously, poorly designed? 1445 01:06:00,080 --> 01:06:01,655 What's the fault here? 1446 01:06:01,655 --> 01:06:02,212 Yeah? 1447 01:06:02,212 --> 01:06:03,670 AUDIENCE: It should just be a loop. 1448 01:06:03,670 --> 01:06:04,990 DAVID MALAN: It should just be a loop, right? 1449 01:06:04,990 --> 01:06:05,990 Why type it three times? 1450 01:06:05,990 --> 01:06:08,560 Literally, copying and pasting is almost always a bad thing-- 1451 01:06:08,560 --> 01:06:11,440 except in C, when you have the function prototypes that you need to borrow. 
1452 01:06:11,440 --> 01:06:13,232 But in this case, this is just inefficient. 1453 01:06:13,232 --> 01:06:15,652 So what could we do better here, in Python? 1454 01:06:15,652 --> 01:06:18,610 Well, in Python, we could probably change this in a few different ways. 1455 01:06:18,610 --> 01:06:21,280 We could borrow some of the syntax we proposed in slide form 1456 01:06:21,280 --> 01:06:23,710 earlier, like give me a variable called i. 1457 01:06:23,710 --> 01:06:26,080 Set it to 0, no semicolon. 1458 01:06:26,080 --> 01:06:29,510 While i is less than 3-- if I want to do this three times-- 1459 01:06:29,510 --> 01:06:31,280 I can go ahead and print out "meow." 1460 01:06:31,280 --> 01:06:33,580 And then, I can do i plus equals 1. 1461 01:06:33,580 --> 01:06:35,080 And I think this would do the trick. 1462 01:06:35,080 --> 01:06:38,650 Python of meow.py, and we're back in business already. 1463 01:06:38,650 --> 01:06:41,463 Well, if I wanted to change this to a for loop, well, in Python, 1464 01:06:41,463 --> 01:06:44,380 it would be a little tighter, but this would not be the best approach. 1465 01:06:44,380 --> 01:06:52,510 So for i in 0, 1, 2, I could just do print "meow", like this. 1466 01:06:52,510 --> 01:06:54,250 And that, too, would get the job done. 1467 01:06:54,250 --> 01:06:58,390 But, to our discussion earlier, this would get stupid pretty quickly 1468 01:06:58,390 --> 01:07:00,970 if you had to keep enumerating all of these values. 1469 01:07:00,970 --> 01:07:03,880 What did we introduce instead? 1470 01:07:03,880 --> 01:07:04,940 The range function. 1471 01:07:04,940 --> 01:07:05,440 Exactly. 1472 01:07:05,440 --> 01:07:09,040 So that hands me back, way more efficiently, just the values I want, 1473 01:07:09,040 --> 01:07:10,635 indeed, one at a time. 1474 01:07:10,635 --> 01:07:14,745 So even this, if I run it a third or fourth time, we've got the same result. 
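The three looping styles just walked through can be placed side by side in one sketch. Each prints "meow" three times.

```python
# 1. A while loop with a manually updated counter.
i = 0
while i < 3:
    print("meow")
    i += 1  # Python has no i++, so += 1 is used instead

# 2. A for loop over an explicit list of values.
for i in [0, 1, 2]:
    print("meow")

# 3. A for loop over range, which produces 0, 1, 2 one at a time.
for i in range(3):
    print("meow")
```

The second form stops scaling as soon as the count grows, which is why range is the usual choice.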
1475 01:07:14,745 --> 01:07:18,220 But now, let's transition to where we went with this back in the day. 1476 01:07:18,220 --> 01:07:20,650 How can we start to modularize this? 1477 01:07:20,650 --> 01:07:24,100 It would be nice, I claimed, if MIT had given us a meow function. 1478 01:07:24,100 --> 01:07:27,370 Wouldn't it be nice if Python had given us a meow function? 1479 01:07:27,370 --> 01:07:30,580 Maybe less compelling in Python, but how can I build my own function? 1480 01:07:30,580 --> 01:07:33,618 Well, I did this briefly with the spell checker earlier, 1481 01:07:33,618 --> 01:07:36,160 but let me go ahead and propose that we could implement, now, 1482 01:07:36,160 --> 01:07:40,280 our own version of this in Python as follows. 1483 01:07:40,280 --> 01:07:44,050 Let me go ahead and start fresh here and use the keyword def. 1484 01:07:44,050 --> 01:07:47,860 So this did not exist in C. You had the return value, the function 1485 01:07:47,860 --> 01:07:48,850 name, the arguments. 1486 01:07:48,850 --> 01:07:52,120 In Python, you literally say def to define a function. 1487 01:07:52,120 --> 01:07:54,757 You give it a name, like meow. 1488 01:07:54,757 --> 01:07:57,840 And now, I'm going to go ahead and, in this function, just print out meow. 1489 01:07:57,840 --> 01:08:01,460 And this lets me change it to anything else I want in the future. 1490 01:08:01,460 --> 01:08:03,400 But for now, it's an abstraction. 1491 01:08:03,400 --> 01:08:07,773 And in fact, I can move it out of sight, out of mind-- 1492 01:08:07,773 --> 01:08:09,940 just going to hit Enter a bunch of times to pretend, 1493 01:08:09,940 --> 01:08:13,382 now, it exists, but I don't care how it is implemented. 1494 01:08:13,382 --> 01:08:15,340 And up here, now, I can do something like this. 1495 01:08:15,340 --> 01:08:20,590 For i in range of 3, let me go ahead and not print "meow" anymore. 1496 01:08:20,590 --> 01:08:25,359 Let me just call meow, tightening up my code further.
1497 01:08:25,359 --> 01:08:25,960 Let's see. 1498 01:08:25,960 --> 01:08:26,859 Python of meow.py. 1499 01:08:26,859 --> 01:08:31,240 This is, I think, going to be the first time it does not work correctly. 1500 01:08:31,240 --> 01:08:32,680 OK. 1501 01:08:32,680 --> 01:08:36,310 So here, we have, sadly, our first Python error. 1502 01:08:36,310 --> 01:08:37,569 And let's see. 1503 01:08:37,569 --> 01:08:40,300 The syntax is going to be different from C or Clang's output. 1504 01:08:40,300 --> 01:08:41,920 Traceback is the term of art here. 1505 01:08:41,920 --> 01:08:44,859 This is like a trace back of all of the lines of code 1506 01:08:44,859 --> 01:08:47,560 that were just executed or, really, functions you've called. 1507 01:08:47,560 --> 01:08:49,090 The file name is uninteresting. 1508 01:08:49,090 --> 01:08:52,149 This is my codespace, specifically, but the file name 1509 01:08:52,149 --> 01:08:53,890 is important here-- meow.py. 1510 01:08:53,890 --> 01:08:55,675 Our line 2 is the issue-- 1511 01:08:55,675 --> 01:08:58,060 OK, I didn't get very far before I screwed up-- 1512 01:08:58,060 --> 01:08:59,470 and then, there's a name error. 1513 01:08:59,470 --> 01:09:03,430 And you'll see, in Python, there's typically these capitalized keywords 1514 01:09:03,430 --> 01:09:05,350 that hint at what the issue is. 1515 01:09:05,350 --> 01:09:09,260 It's something related to names of variables. "meow" is not defined. 1516 01:09:09,260 --> 01:09:09,760 All right. 1517 01:09:09,760 --> 01:09:11,635 You're programming Python for the first time. 1518 01:09:11,635 --> 01:09:12,399 You've screwed up. 1519 01:09:12,399 --> 01:09:14,560 You're following some online tutorial. 1520 01:09:14,560 --> 01:09:16,149 You're seeing this. 1521 01:09:16,149 --> 01:09:18,010 Reason through it. 1522 01:09:18,010 --> 01:09:20,680 Why might "meow" not be defined? 1523 01:09:20,680 --> 01:09:24,779 What can we infer about Python?
1524 01:09:24,779 --> 01:09:27,240 How to troubleshoot, logically? 1525 01:09:27,240 --> 01:09:29,147 AUDIENCE: [INAUDIBLE] 1526 01:09:29,147 --> 01:09:29,939 DAVID MALAN: Maybe. 1527 01:09:29,939 --> 01:09:32,520 Is it because "meow" is defined after? 1528 01:09:32,520 --> 01:09:34,890 As smart as Python seems to be, vis-a-vis C, 1529 01:09:34,890 --> 01:09:37,055 they have some similar design characteristics. 1530 01:09:37,055 --> 01:09:37,920 So let's try that. 1531 01:09:37,920 --> 01:09:41,729 So let me scroll all the way back down to where I moved this earlier. 1532 01:09:41,729 --> 01:09:43,649 Let me get rid of it-- 1533 01:09:43,649 --> 01:09:44,279 way down there. 1534 01:09:44,279 --> 01:09:46,410 I'll copy it to my clipboard. 1535 01:09:46,410 --> 01:09:48,180 And let me just hack something together. 1536 01:09:48,180 --> 01:09:49,963 Let me just put it up here. 1537 01:09:49,963 --> 01:09:51,130 And let's see if this works. 1538 01:09:51,130 --> 01:09:54,120 So now, let me clear my terminal, run python of meow.py. 1539 01:09:54,120 --> 01:09:55,110 OK. 1540 01:09:55,110 --> 01:09:56,198 We're back in business. 1541 01:09:56,198 --> 01:09:57,990 So that was actually really good intuition. 1542 01:09:57,990 --> 01:10:00,180 Good debugging technique, just reason through it. 1543 01:10:00,180 --> 01:10:02,430 Now, this is contradicting what I claimed back 1544 01:10:02,430 --> 01:10:05,325 in week one, which was that the main part of your program, 1545 01:10:05,325 --> 01:10:07,470 ideally, should just be at the top of the file. 1546 01:10:07,470 --> 01:10:08,580 Don't make me look for it. 1547 01:10:08,580 --> 01:10:10,497 It's not a huge deal with a four-line program, 1548 01:10:10,497 --> 01:10:13,290 but if you've got 40 lines or 400 lines, you 1549 01:10:13,290 --> 01:10:15,480 don't want the juicy part of your program 1550 01:10:15,480 --> 01:10:18,455 to be way down here, and all of these functions way up here. 
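The NameError and its fix can be reproduced in a small sketch. The try/except wrapper and the caught flag are additions for illustration only, so that the failing call does not stop the rest of the example; the lecture's file simply crashes with a traceback.

```python
# Calling meow before its def has run raises NameError,
# because Python executes a file top to bottom.
caught = False
try:
    for i in range(3):
        meow()  # meow is not defined yet at this point
except NameError:
    caught = True
print(caught)


def meow():
    print("meow")


# Once the def above has executed, the same loop works.
for i in range(3):
    meow()
```

Moving the definition above the loop fixes the error, at the cost of pushing the main part of the program toward the bottom of the file.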
1551 01:10:18,455 --> 01:10:22,085 So it would be nice, maybe, if we actually have a main function. 1552 01:10:22,085 --> 01:10:25,260 And so, it actually turns out to be a convention in Python 1553 01:10:25,260 --> 01:10:27,460 to define a main function. 1554 01:10:27,460 --> 01:10:30,720 It's not a special function that's automatically called, like in C. 1555 01:10:30,720 --> 01:10:32,340 But humans realized, you know what? 1556 01:10:32,340 --> 01:10:34,120 That was a pretty useful feature. 1557 01:10:34,120 --> 01:10:36,540 Let me define a function called main. 1558 01:10:36,540 --> 01:10:39,000 Let me indent these lines underneath it. 1559 01:10:39,000 --> 01:10:41,070 Let me practice what I'm preaching, which is put 1560 01:10:41,070 --> 01:10:43,290 the main code at the top of the file. 1561 01:10:43,290 --> 01:10:47,730 And, wonderfully, in Python, now, you do not need prototypes. 1562 01:10:47,730 --> 01:10:49,920 There's none of that hackish copying and pasting 1563 01:10:49,920 --> 01:10:52,462 of the return type, the name and the arguments to a function, 1564 01:10:52,462 --> 01:10:58,485 like we needed in C. This is now OK instead, except for one, minor detail. 1565 01:10:58,485 --> 01:11:01,290 Let me go ahead and run python of meow.py. 1566 01:11:01,290 --> 01:11:05,940 Hopefully, now, I've solved this problem by having [GROANS] a main function. 1567 01:11:05,940 --> 01:11:08,170 But now, nothing has happened. 1568 01:11:08,170 --> 01:11:08,670 All right. 1569 01:11:08,670 --> 01:11:12,200 Even if you've never programmed in Python before, 1570 01:11:12,200 --> 01:11:17,855 what might explain this behavior, and how do I fix? 1571 01:11:17,855 --> 01:11:20,730 Again, when you're off in the real world, learning some new language, 1572 01:11:20,730 --> 01:11:23,790 all you have is deductive logic to debug. 1573 01:11:23,790 --> 01:11:24,300 Yeah? 1574 01:11:24,300 --> 01:11:28,656 AUDIENCE: I remember in C, even though we [INAUDIBLE].. 
1575 01:11:28,656 --> 01:11:31,708 1576 01:11:31,708 --> 01:11:32,500 DAVID MALAN: Right. 1577 01:11:32,500 --> 01:11:34,510 So the solution, to be clear, in C was that we 1578 01:11:34,510 --> 01:11:35,650 had to put the prototype up here. 1579 01:11:35,650 --> 01:11:36,790 Otherwise, we'd get an error message. 1580 01:11:36,790 --> 01:11:39,123 In this case, I'm actually not getting an error message. 1581 01:11:39,123 --> 01:11:42,610 And, indeed, I'll claim that you don't need the prototypes in Python. 1582 01:11:42,610 --> 01:11:46,910 Just not necessary because that was annoying, if nothing else. 1583 01:11:46,910 --> 01:11:48,820 But what else might explain? 1584 01:11:48,820 --> 01:11:49,570 Yeah, in the back? 1585 01:11:49,570 --> 01:11:51,030 AUDIENCE: [INAUDIBLE] 1586 01:11:51,030 --> 01:11:51,780 DAVID MALAN: Yeah. 1587 01:11:51,780 --> 01:11:53,880 Maybe you have to call main itself. 1588 01:11:53,880 --> 01:11:58,410 If main is not some special status in Python, maybe just because it exists 1589 01:11:58,410 --> 01:11:59,040 isn't enough. 1590 01:11:59,040 --> 01:12:02,580 And, indeed, if you want to call main, the new convention 1591 01:12:02,580 --> 01:12:05,460 is actually going to be-- as the very last line of your program, 1592 01:12:05,460 --> 01:12:07,350 typically-- to literally call main. 1593 01:12:07,350 --> 01:12:10,950 It's a little stupid-looking, but they made a design decision. 1594 01:12:10,950 --> 01:12:13,200 And this is how, now, we work around it. 1595 01:12:13,200 --> 01:12:14,610 Python of meow.py. 1596 01:12:14,610 --> 01:12:16,890 Now we're back in business. 1597 01:12:16,890 --> 01:12:19,560 But now, logically, why does this work the way it does? 1598 01:12:19,560 --> 01:12:22,320 Well, in this case-- top to bottom-- 1599 01:12:22,320 --> 01:12:25,350 line 1 is telling Python to define a function called main 1600 01:12:25,350 --> 01:12:27,660 and, then, define it as follows, lines 2 and 3. 
1601 01:12:27,660 --> 01:12:29,610 But it's not calling main yet. 1602 01:12:29,610 --> 01:12:33,210 Line 6 is telling Python how to define a function called meow, 1603 01:12:33,210 --> 01:12:35,580 but it's not calling these lines yet. 1604 01:12:35,580 --> 01:12:38,730 Now, on line 10, you're telling Python, call main. 1605 01:12:38,730 --> 01:12:41,310 And at that point, Python has been trained, if you will, 1606 01:12:41,310 --> 01:12:45,390 to know what main is on line 1, to know what meow is on line 6. 1607 01:12:45,390 --> 01:12:49,650 And so, it's now perfectly OK for main to be above meow 1608 01:12:49,650 --> 01:12:51,150 because you never called them yet. 1609 01:12:51,150 --> 01:12:54,340 You defined, defined, and then, you called. 1610 01:12:54,340 --> 01:12:56,380 And that's the logic behind this. 1611 01:12:56,380 --> 01:13:01,250 Any questions, now, on the structure of this technique, here? 1612 01:13:01,250 --> 01:13:03,000 Now, let's do one more, then. 1613 01:13:03,000 --> 01:13:07,740 Recall that the last thing we did in Scratch and in C was to, 1614 01:13:07,740 --> 01:13:10,940 actually, parameterize these same functions. 1615 01:13:10,940 --> 01:13:14,070 So suppose that you don't want main to be responsible for the loop here. 1616 01:13:14,070 --> 01:13:17,580 You instead want to, very simply, do something like "meow" three times 1617 01:13:17,580 --> 01:13:18,660 and be done with it. 1618 01:13:18,660 --> 01:13:21,427 Well, in Python, it's going to be similar in spirit to C. 1619 01:13:21,427 --> 01:13:23,760 But, again, we don't need to keep mentioning data types. 1620 01:13:23,760 --> 01:13:26,310 If you want "meow" to take some argument-- 1621 01:13:26,310 --> 01:13:27,930 like a number n-- 1622 01:13:27,930 --> 01:13:30,792 you can just specify n as the name of that argument. 1623 01:13:30,792 --> 01:13:33,250 Or you can call it anything else, of course, that you want. 
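The define, define, then call ordering described above can be sketched as follows. This is a minimal reconstruction of meow.py, laid out so that main sits on line 1, meow on line 6, and the call on line 10, as mentioned in the walkthrough; the loop count of 3 is assumed from the earlier examples.

```python
def main():  # line 1: defines main, does not call it yet
    for i in range(3):  # assumed count of 3, as in the earlier meow examples
        meow()  # fine even though meow is defined further down


def meow():  # line 6: defines meow, does not call it yet
    print("meow")


main()  # line 10: by now both names are defined, so the call works
```

If the last line were missing, the file would define two functions and then exit without printing anything.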
1624 01:13:33,250 --> 01:13:35,700 You don't have to specify int or anything else. 1625 01:13:35,700 --> 01:13:40,890 In your code, now, inside of meow, you can do something like for i in, 1626 01:13:40,890 --> 01:13:41,670 let's say-- 1627 01:13:41,670 --> 01:13:45,690 I definitely, now, can't do this because that would be weird, to start the list 1628 01:13:45,690 --> 01:13:46,590 and end it with n. 1629 01:13:46,590 --> 01:13:49,360 So, if I can come back over here, what's the solution? 1630 01:13:49,360 --> 01:13:51,270 How can I do something n times? 1631 01:13:51,270 --> 01:13:52,410 AUDIENCE: [INAUDIBLE] 1632 01:13:52,410 --> 01:13:53,160 DAVID MALAN: Yeah. 1633 01:13:53,160 --> 01:13:54,340 Using range. 1634 01:13:54,340 --> 01:13:58,140 So range is nice because I can pass in, now, this variable n. 1635 01:13:58,140 --> 01:13:59,940 And now, I can meow-- whoops. 1636 01:13:59,940 --> 01:14:03,195 Now I can print out, quote unquote, "meow." 1637 01:14:03,195 --> 01:14:05,820 So it's almost the same as in Scratch, almost the same as in C. 1638 01:14:05,820 --> 01:14:06,903 But it's a little simpler. 1639 01:14:06,903 --> 01:14:12,210 And if, now, I run meow.py, I'll have the ability, now, to do this here, 1640 01:14:12,210 --> 01:14:13,110 as well. 1641 01:14:13,110 --> 01:14:13,770 All right. 1642 01:14:13,770 --> 01:14:16,590 Questions on any of this? 1643 01:14:16,590 --> 01:14:19,800 Right now, we're taking this stroll through week one. 1644 01:14:19,800 --> 01:14:22,050 We're going to, momentarily, escalate things 1645 01:14:22,050 --> 01:14:24,840 to look not only at some of these basics, 1646 01:14:24,840 --> 01:14:27,390 but also, other features, like we saw with face recognition 1647 01:14:27,390 --> 01:14:28,920 with the speller or the like. 1648 01:14:28,920 --> 01:14:31,962 Because of how many of us are here, we have a huge amount of candy 1649 01:14:31,962 --> 01:14:32,670 out in the lobby.
1650 01:14:32,670 --> 01:14:34,440 So why don't we go ahead and take a 10-minute break? 1651 01:14:34,440 --> 01:14:37,230 And when we come back, we'll do even fancier, more powerful things 1652 01:14:37,230 --> 01:14:38,595 with Python in 10. 1653 01:14:38,595 --> 01:14:40,020 All right. 1654 01:14:40,020 --> 01:14:41,730 So we are back. 1655 01:14:41,730 --> 01:14:44,280 Among our goals, now, are to introduce a few more building 1656 01:14:44,280 --> 01:14:47,880 blocks so that we can solve more interesting problems at the end, 1657 01:14:47,880 --> 01:14:49,560 much like those that we began with. 1658 01:14:49,560 --> 01:14:52,830 You'll recall, from a few weeks ago, we played with this two-dimensional Super 1659 01:14:52,830 --> 01:14:53,670 Mario world. 1660 01:14:53,670 --> 01:14:57,380 And we tried to print a vertical column of three or more bricks. 1661 01:14:57,380 --> 01:15:00,210 Well, let me propose that we use this as an opportunity to, now, 1662 01:15:00,210 --> 01:15:02,880 tinker with some of Python's more useful, more 1663 01:15:02,880 --> 01:15:04,470 user-friendly functionality, as well. 1664 01:15:04,470 --> 01:15:09,265 So let me code a file called mario.py, and let's just print out 1665 01:15:09,265 --> 01:15:10,890 the equivalent of that vertical column. 1666 01:15:10,890 --> 01:15:12,690 So it's of height 3. 1667 01:15:12,690 --> 01:15:16,740 Each one is a hash, so let's do for i in range of 3 initially, 1668 01:15:16,740 --> 01:15:18,600 and let's just print out a single hash. 1669 01:15:18,600 --> 01:15:21,790 And I think, now, python of mario.py-- 1670 01:15:21,790 --> 01:15:22,290 voila. 1671 01:15:22,290 --> 01:15:27,480 We're in business, printing out just that same column there. 1672 01:15:27,480 --> 01:15:31,110 What if, though, we want to print a column of some variable height 1673 01:15:31,110 --> 01:15:33,510 where the user tells us how tall they want it to be? 
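The fixed-height column just described can be sketched in two lines. This is an illustration of the mario.py starting point, with the height of 3 hard-coded as in the walkthrough.

```python
# mario.py, fixed-height version: one hash per loop iteration
for i in range(3):
    print("#")  # print ends each call with a newline, so the hashes stack vertically
```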
1674 01:15:33,510 --> 01:15:39,600 Well, let me go up here, for instance and, instead, how about-- 1675 01:15:39,600 --> 01:15:40,920 let's do this. 1676 01:15:40,920 --> 01:15:45,210 How about from cs50 import? 1677 01:15:45,210 --> 01:15:47,620 How about the get_int function, as before? 1678 01:15:47,620 --> 01:15:50,430 So it will deal with making sure the user gives us an integer. 1679 01:15:50,430 --> 01:15:54,750 And now, in the past, whenever we wanted to get a number from a user, 1680 01:15:54,750 --> 01:15:56,780 we've actually followed a certain paradigm. 1681 01:15:56,780 --> 01:16:02,895 In fact, if I open up here, for instance, 1682 01:16:02,895 --> 01:16:06,630 how about mario1.c from a while back, you 1683 01:16:06,630 --> 01:16:11,430 might recall that we had code like this. 1684 01:16:11,430 --> 01:16:13,800 And we specifically use the do while loop in C 1685 01:16:13,800 --> 01:16:16,410 whenever we want to get something from the user, 1686 01:16:16,410 --> 01:16:18,858 maybe, again and again and again, until they cooperate. 1687 01:16:18,858 --> 01:16:20,900 At which point, we finally break out of the loop. 1688 01:16:20,900 --> 01:16:22,830 So it turns out, Python does have while loops, 1689 01:16:22,830 --> 01:16:25,698 does have for loops, does not have do while loops. 1690 01:16:25,698 --> 01:16:27,990 And yet, pretty much any time you've gotten user input, 1691 01:16:27,990 --> 01:16:30,100 you've probably used this paradigm. 1692 01:16:30,100 --> 01:16:33,930 So it turns out that the Python equivalent of this is to do, 1693 01:16:33,930 --> 01:16:36,450 similar in spirit, but using only a while loop. 
1694 01:16:36,450 --> 01:16:39,300 And a common paradigm in Python, as I alluded to earlier, 1695 01:16:39,300 --> 01:16:43,440 is to actually deliberately induce an infinite loop while True-- 1696 01:16:43,440 --> 01:16:48,240 capital T-- and then, do what you want to do, like get an int from the user 1697 01:16:48,240 --> 01:16:51,690 and prompt them for the height, for instance, in question. 1698 01:16:51,690 --> 01:16:56,070 And then, if you're sure that the user has given you what you want-- 1699 01:16:56,070 --> 01:16:59,220 like n is greater than 0, which is what I want, in this case, 1700 01:16:59,220 --> 01:17:02,610 because I want a positive integer; otherwise, there's nothing to print-- 1701 01:17:02,610 --> 01:17:04,505 you literally just break out of the loop. 1702 01:17:04,505 --> 01:17:08,070 And so, we could actually use this technique in C. It's just not 1703 01:17:08,070 --> 01:17:10,260 really done in C. You could absolutely, in C, 1704 01:17:10,260 --> 01:17:13,590 have done a while True loop with the parentheses, lowercase true. 1705 01:17:13,590 --> 01:17:15,670 You could break out of it, and so forth. 1706 01:17:15,670 --> 01:17:18,312 But in Python, this is the Python way. 1707 01:17:18,312 --> 01:17:19,770 And this is actually a term of art. 1708 01:17:19,770 --> 01:17:24,017 This way in Python is pythonic. This is "the way everyone does it," 1709 01:17:24,017 --> 01:17:24,600 quote unquote. 1710 01:17:24,600 --> 01:17:28,830 Doesn't mean you have to, but that's the way the cool Python programmers would 1711 01:17:28,830 --> 01:17:31,980 implement an idea like this-- trying to do something again and again 1712 01:17:31,980 --> 01:17:34,607 and again until the user actually cooperates. 1713 01:17:34,607 --> 01:17:36,690 But all we've done is take away the do while loop. 1714 01:17:36,690 --> 01:17:39,790 But still, logically, we can implement the same idea.
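The while True and break paradigm described above can be sketched like this. To keep the sketch runnable without a keyboard, get_int here is a hypothetical stand-in that hands back scripted answers; the real program would instead begin with from cs50 import get_int.

```python
# Hypothetical stand-in for cs50's get_int: it returns scripted answers,
# as if the user typed -1, then 0, then 3.
scripted_answers = iter([-1, 0, 3])


def get_int(prompt):
    return next(scripted_answers)


# Python has no do-while loop, so loop forever on purpose...
while True:
    n = get_int("Height: ")
    if n > 0:
        break  # ...and break out once the user cooperates

# n is now known to be positive
for i in range(n):
    print("#")
```

With the scripted answers above, the loop quietly rejects -1 and 0 and then prints a column of three hashes.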
1715 01:17:39,790 --> 01:17:44,580 Now, below this, let me go ahead and just print out, for i in range of n 1716 01:17:44,580 --> 01:17:47,370 this time-- because I want it to be variable and not 3. 1717 01:17:47,370 --> 01:17:49,920 I can go ahead and print out the hash-- 1718 01:17:49,920 --> 01:17:52,260 let me go ahead and get rid of the C version here-- 1719 01:17:52,260 --> 01:17:55,920 open my terminal window and I'll run, again, Python of mario.py. 1720 01:17:55,920 --> 01:17:58,530 I'll type in 3 and I get back those three hashes. 1721 01:17:58,530 --> 01:18:02,635 But if I, instead, type in 4, I now get four hashes instead. 1722 01:18:02,635 --> 01:18:04,640 So the takeaway here is, quite simply, that this 1723 01:18:04,640 --> 01:18:08,030 would be the way, for instance, to actually get back 1724 01:18:08,030 --> 01:18:11,615 a value in Python that is consistent with some parameter, 1725 01:18:11,615 --> 01:18:13,160 like greater than 0. 1726 01:18:13,160 --> 01:18:13,950 How about this? 1727 01:18:13,950 --> 01:18:17,810 Let's actually practice what we preached a moment ago with our meowing examples 1728 01:18:17,810 --> 01:18:19,830 and factoring all this out. 1729 01:18:19,830 --> 01:18:23,220 Let me go ahead and define a main function, as before. 1730 01:18:23,220 --> 01:18:25,190 Let me go ahead and assume, for the moment, 1731 01:18:25,190 --> 01:18:28,673 that a get_height function exists, which is not a thing in Python. 1732 01:18:28,673 --> 01:18:30,340 I'm going to invent it in just a moment. 1733 01:18:30,340 --> 01:18:33,620 And now, I'm going to go ahead and do something like this. for i 1734 01:18:33,620 --> 01:18:39,470 in the range of that height, well, let's go ahead and print out those hashes. 1735 01:18:39,470 --> 01:18:41,760 So I'm assuming that get_height exists. 1736 01:18:41,760 --> 01:18:44,725 Let me go ahead and implement that abstraction, so define a function, 1737 01:18:44,725 --> 01:18:46,100 now, called get_height. 
1738 01:18:46,100 --> 01:18:48,830 It's not going to take any arguments in this design. 1739 01:18:48,830 --> 01:18:52,820 While True, I can go ahead and do the same thing as before-- 1740 01:18:52,820 --> 01:18:55,880 assign a variable n, the return value of get_int 1741 01:18:55,880 --> 01:18:58,140 prompting the user for that height. 1742 01:18:58,140 --> 01:19:03,980 And then, if n is greater than 0, I can go ahead and break. 1743 01:19:03,980 --> 01:19:08,390 But if I break here, I, logically-- just like in C-- 1744 01:19:08,390 --> 01:19:11,360 end up executing below the loop in question. 1745 01:19:11,360 --> 01:19:12,690 But there's nothing there. 1746 01:19:12,690 --> 01:19:16,820 But if I want get_height to return the height, what should 1747 01:19:16,820 --> 01:19:18,650 I type here on line 14, logically? 1748 01:19:18,650 --> 01:19:21,580 1749 01:19:21,580 --> 01:19:23,380 What do I want to return, to be clear? 1750 01:19:23,380 --> 01:19:23,995 AUDIENCE: [INAUDIBLE] 1751 01:19:23,995 --> 01:19:24,745 DAVID MALAN: Yeah. 1752 01:19:24,745 --> 01:19:26,890 So I actually want to return n. 1753 01:19:26,890 --> 01:19:30,880 And here's another curiosity of Python, vis-a-vis C. 1754 01:19:30,880 --> 01:19:33,670 There doesn't seem to be an issue of scope anymore, right? 1755 01:19:33,670 --> 01:19:37,180 In C, it was super important to not only declare your variables with the data 1756 01:19:37,180 --> 01:19:39,550 types, you also had to be mindful of where they exist-- 1757 01:19:39,550 --> 01:19:41,200 inside of those curly braces. 1758 01:19:41,200 --> 01:19:45,238 In Python, it turns out you can be a little looser with things, for better 1759 01:19:45,238 --> 01:19:45,780 or for worse. 1760 01:19:45,780 --> 01:19:50,020 And so, on line 11, if I create a variable called n, 1761 01:19:50,020 --> 01:19:57,170 it exists on line 11, 12 and even 13, outside of the while loop. 
1762 01:19:57,170 --> 01:19:59,710 So to be clear, in C, with a while loop, we 1763 01:19:59,710 --> 01:20:03,040 would have ordinarily had not a colon. 1764 01:20:03,040 --> 01:20:05,920 We would have had the curly brace, like here and over here. 1765 01:20:05,920 --> 01:20:08,770 And a week ago, I would have claimed that, in C, n 1766 01:20:08,770 --> 01:20:12,130 does not exist outside of the while loop, by nature of those curly braces. 1767 01:20:12,130 --> 01:20:15,250 Even though the curly braces are gone, Python actually 1768 01:20:15,250 --> 01:20:20,685 allows you to use a variable any time after you have assigned it a value. 1769 01:20:20,685 --> 01:20:23,625 So slightly more powerful, as such. 1770 01:20:23,625 --> 01:20:26,830 However, I can tighten this up a little bit, logically. 1771 01:20:26,830 --> 01:20:30,700 And this is true in C. I don't really need to break out of the loop 1772 01:20:30,700 --> 01:20:32,020 by using break. 1773 01:20:32,020 --> 01:20:36,070 Recall that or know that I can actually-- once I'm ready to go, 1774 01:20:36,070 --> 01:20:40,030 I can just return the value I care about, even inside of the loop. 1775 01:20:40,030 --> 01:20:43,000 And that will have the side effect of breaking me out of the loop 1776 01:20:43,000 --> 01:20:46,590 and, also, breaking me out of and returning from the entire function. 1777 01:20:46,590 --> 01:20:50,470 So nothing too new here, in terms of C versus Python, except for this issue 1778 01:20:50,470 --> 01:20:51,490 with scope. 1779 01:20:51,490 --> 01:20:53,770 And I, indeed, returned n at the bottom there, 1780 01:20:53,770 --> 01:20:56,360 just to make clear that n would still exist. 1781 01:20:56,360 --> 01:20:58,170 So either of those are correct. 1782 01:20:58,170 --> 01:21:02,350 Now, I just have a Python program that I think 1783 01:21:02,350 --> 01:21:05,590 is going to allow me to implement this same Mario idea. 1784 01:21:05,590 --> 01:21:07,450 So let's run python of mario.py. 
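The two equivalent shapes of get_height discussed above might look roughly like this. As an assumption for the sake of a runnable sketch, get_int is a hypothetical stand-in that returns scripted answers rather than the cs50 function, and the name get_height_with_break is invented only to show both versions side by side.

```python
# Hypothetical stand-in for cs50's get_int, fed with scripted answers
replies = iter([-2, 4, 0, 5])


def get_int(prompt):
    return next(replies)


def get_height_with_break():
    while True:
        n = get_int("Height: ")
        if n > 0:
            break
    # n was first assigned inside the loop, yet it is still usable here
    return n


def get_height():
    while True:
        n = get_int("Height: ")
        if n > 0:
            return n  # leaves the loop and the function in one step


first = get_height_with_break()  # rejects -2, accepts 4
second = get_height()  # rejects 0, accepts 5
print(first, second)
```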
1785 01:21:07,450 --> 01:21:09,820 And-- OK, so nothing happened. 1786 01:21:09,820 --> 01:21:13,390 Python of mario.py. 1787 01:21:13,390 --> 01:21:14,260 What did I do wrong? 1788 01:21:14,260 --> 01:21:14,965 AUDIENCE: [INAUDIBLE] 1789 01:21:14,965 --> 01:21:16,590 DAVID MALAN: Yeah, I have to call main. 1790 01:21:16,590 --> 01:21:19,720 So, at the bottom of my code, I have to call main here. 1791 01:21:19,720 --> 01:21:22,720 And this is a stylistic detail that's been subtle. 1792 01:21:22,720 --> 01:21:26,050 Generally speaking, when you are writing in Python, 1793 01:21:26,050 --> 01:21:28,360 there's not a CS50 style guide, per se. 1794 01:21:28,360 --> 01:21:33,700 There's actually a Python style guide that most people adhere to. 1795 01:21:33,700 --> 01:21:37,480 And in this case, double blank lines between functions is the norm. 1796 01:21:37,480 --> 01:21:41,890 I'm doing that deliberately, although it might, otherwise, not be obvious. 1797 01:21:41,890 --> 01:21:45,130 But now that I've called main on line 16, let's run mario.py once more. 1798 01:21:45,130 --> 01:21:46,690 Aha. 1799 01:21:46,690 --> 01:21:47,560 Now we see it. 1800 01:21:47,560 --> 01:21:51,730 Type in 3, and I'm back in business, printing out the values there. 1801 01:21:51,730 --> 01:21:52,330 Yeah? 1802 01:21:52,330 --> 01:21:54,146 AUDIENCE: Why do you [INAUDIBLE]? 1803 01:21:54,146 --> 01:21:56,120 Why can't [INAUDIBLE]? 1804 01:21:56,120 --> 01:21:56,870 DAVID MALAN: Sure. 1805 01:21:56,870 --> 01:21:58,453 Why do I need the if condition at all? 1806 01:21:58,453 --> 01:22:02,390 Why can't I just return n here as by doing return n. 1807 01:22:02,390 --> 01:22:06,890 Or if I really want to be succinct, I could technically just do this. 1808 01:22:06,890 --> 01:22:09,512 The only reason I added the if condition is 1809 01:22:09,512 --> 01:22:11,720 because, if the user types in negative 1, negative 2, 1810 01:22:11,720 --> 01:22:13,850 I wanted to prompt them again and again. 
1811 01:22:13,850 --> 01:22:14,390 That's all. 1812 01:22:14,390 --> 01:22:17,660 But that would be totally acceptable, too, if you were OK with that result 1813 01:22:17,660 --> 01:22:18,630 instead. 1814 01:22:18,630 --> 01:22:21,170 Well, let me do one other thing here to point out 1815 01:22:21,170 --> 01:22:23,870 why we are using get_int so frequently. 1816 01:22:23,870 --> 01:22:26,030 This new training wheel, albeit temporarily. 1817 01:22:26,030 --> 01:22:28,490 So let me go back to the way it was a moment ago 1818 01:22:28,490 --> 01:22:32,510 and let me propose, now, to take away get_int. 1819 01:22:32,510 --> 01:22:35,840 I claimed earlier that, if you're not using get_int, 1820 01:22:35,840 --> 01:22:40,400 you can just use the input function itself from Python. 1821 01:22:40,400 --> 01:22:43,250 But that always returns a string, or a str. 1822 01:22:43,250 --> 01:22:48,110 And so, recall that you have to pass the output of the input function to int, 1823 01:22:48,110 --> 01:22:51,930 either on the same line or, if you prefer, on another line, instead. 1824 01:22:51,930 --> 01:22:54,110 But it turns out what I didn't do was show 1825 01:22:54,110 --> 01:22:59,250 you what happens if you don't cooperate with the program. 1826 01:22:59,250 --> 01:23:02,540 So if I run python of mario.py now, works great, even 1827 01:23:02,540 --> 01:23:04,252 without the get_int function. 1828 01:23:04,252 --> 01:23:05,210 And I can do it with 4. 1829 01:23:05,210 --> 01:23:06,575 Still works great. 1830 01:23:06,575 --> 01:23:09,122 But let me clear my terminal and be difficult, now, 1831 01:23:09,122 --> 01:23:11,330 as the user and type in "cat" for the height instead. 1832 01:23:11,330 --> 01:23:12,560 Enter. 1833 01:23:12,560 --> 01:23:14,540 Now, we see one of those tracebacks again. 1834 01:23:14,540 --> 01:23:15,900 This one is different. 1835 01:23:15,900 --> 01:23:18,780 This isn't a name error, but, apparently, a value error.
1836 01:23:18,780 --> 01:23:20,870 And if I ignore the stuff I don't understand, 1837 01:23:20,870 --> 01:23:24,440 I can see "invalid literal for int with base 10-- "cat."" 1838 01:23:24,440 --> 01:23:27,800 That's a super cryptic way of saying that C-A-T is not 1839 01:23:27,800 --> 01:23:29,640 a number in decimal notation. 1840 01:23:29,640 --> 01:23:32,600 And so, I would seem to have to, somehow, handle this case. 1841 01:23:32,600 --> 01:23:34,490 And if you want to be more curious, you'll 1842 01:23:34,490 --> 01:23:36,350 see that this is, indeed, a traceback. 1843 01:23:36,350 --> 01:23:40,100 And C tends to do this, too, or the debugger would do this for you, too. 1844 01:23:40,100 --> 01:23:41,960 You can see all of the functions that have 1845 01:23:41,960 --> 01:23:43,502 been called to get you to this point. 1846 01:23:43,502 --> 01:23:48,170 So apparently, my problem is, initially, in line 14. 1847 01:23:48,170 --> 01:23:50,375 But line 14, if I keep scrolling, is uninteresting. 1848 01:23:50,375 --> 01:23:51,410 It's main. 1849 01:23:51,410 --> 01:23:55,820 But line 14 leads me to execute line 2, which is, indeed, in main. 1850 01:23:55,820 --> 01:23:59,225 That leads me to execute line 9, which is in get_height. 1851 01:23:59,225 --> 01:24:00,880 And so, OK, here is the issue. 1852 01:24:00,880 --> 01:24:02,960 So the closest line number to the error message 1853 01:24:02,960 --> 01:24:05,360 is the one that probably reveals the most. 1854 01:24:05,360 --> 01:24:06,950 Line 9 is where my issue is. 1855 01:24:06,950 --> 01:24:10,940 So I can't just blindly ask the user for input and, then, convert it to an int 1856 01:24:10,940 --> 01:24:12,620 if they're not going to give me an int. 1857 01:24:12,620 --> 01:24:13,870 Now, how do we deal with this? 
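The conversion step behind that traceback can be sketched in isolation. The variable typed is a hypothetical placeholder for whatever input would have returned, so the sketch runs without waiting on a keyboard.

```python
# input() always returns a str, so the text has to be converted with int().
typed = "3"  # placeholder for what input("Height: ") returns when the user types 3
height = int(typed)
print(height + 1)  # arithmetic works now that height is an int

# If typed were "cat" instead, int(typed) would raise:
# ValueError: invalid literal for int() with base 10: 'cat'
```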
1858 01:24:13,870 --> 01:24:16,010 Well, back in problem set two, you might recall 1859 01:24:16,010 --> 01:24:18,380 validating that the user typed in a number 1860 01:24:18,380 --> 01:24:19,862 and using a for loop and the like. 1861 01:24:19,862 --> 01:24:22,445 Well, it turns out, there's a better way to do this in Python, 1862 01:24:22,445 --> 01:24:24,830 and the semantics are there. 1863 01:24:24,830 --> 01:24:29,600 If you want to try to convert something to a number that might not actually 1864 01:24:29,600 --> 01:24:32,780 be a number, turns out, Python and certain other languages 1865 01:24:32,780 --> 01:24:35,060 literally have a keyword called try. 1866 01:24:35,060 --> 01:24:37,820 And if only this existed for the past few weeks, I know. 1867 01:24:37,820 --> 01:24:40,583 But you can try to do the following with your code. 1868 01:24:40,583 --> 01:24:41,750 What do I want to try to do? 1869 01:24:41,750 --> 01:24:46,980 Well, I want to try to execute those few lines, except if there's an error. 1870 01:24:46,980 --> 01:24:50,225 So I can say except if there's a value error-- specifically, 1871 01:24:50,225 --> 01:24:53,065 the one I screwed up and created a moment ago. 1872 01:24:53,065 --> 01:24:56,480 And if there is a value error, I can print out an informative message 1873 01:24:56,480 --> 01:25:00,920 to the user, like "not an integer" or anything else. 1874 01:25:00,920 --> 01:25:05,270 And what's happening here, now, is literally this operative word, try. 1875 01:25:05,270 --> 01:25:09,920 Python is going to try to get input and try to convert it to an int, 1876 01:25:09,920 --> 01:25:12,470 and it's going to try to check if it's greater than 0 1877 01:25:12,470 --> 01:25:14,750 and then try to return it. 1878 01:25:14,750 --> 01:25:15,467 Why? 
1879 01:25:15,467 --> 01:25:17,300 Three of those lines are inside of, indented 1880 01:25:17,300 --> 01:25:20,780 underneath the try block, except if something goes wrong-- 1881 01:25:20,780 --> 01:25:23,540 specifically, a value error happens. 1882 01:25:23,540 --> 01:25:24,560 Then, it prints this. 1883 01:25:24,560 --> 01:25:26,110 But it doesn't return anything. 1884 01:25:26,110 --> 01:25:30,335 And because I'm in a loop, that means it's going to do it again and again 1885 01:25:30,335 --> 01:25:33,980 and again until the human actually cooperates and gives me 1886 01:25:33,980 --> 01:25:35,360 an actual number. 1887 01:25:35,360 --> 01:25:38,210 And so, this, too, is what the world would call pythonic. 1888 01:25:38,210 --> 01:25:41,420 In Python, you don't, necessarily, rigorously try to validate 1889 01:25:41,420 --> 01:25:43,940 the user's input, make sure they haven't screwed up. 1890 01:25:43,940 --> 01:25:46,160 You honestly take a more lackadaisical approach 1891 01:25:46,160 --> 01:25:50,300 and just try to do something, but catch an error if it happens. 1892 01:25:50,300 --> 01:25:53,720 So catch is also a term of art, even though it's not a keyword here. 1893 01:25:53,720 --> 01:25:55,760 Except if something happens, you handle it. 1894 01:25:55,760 --> 01:25:57,470 So you try and you handle it. 1895 01:25:57,470 --> 01:25:59,480 It's best-effort programming, if you will. 1896 01:25:59,480 --> 01:26:04,200 But this is baked into the mindset of the Python programming community. 1897 01:26:04,200 --> 01:26:08,630 So now, if I do python of mario.py and I cooperate, works great as before. 1898 01:26:08,630 --> 01:26:09,830 Try and succeed. 1899 01:26:09,830 --> 01:26:10,670 3 works. 1900 01:26:10,670 --> 01:26:11,345 4 works. 1901 01:26:11,345 --> 01:26:17,243 If, though, I try and fail by typing in "cat," it doesn't crash, per se. 1902 01:26:17,243 --> 01:26:18,410 It doesn't show me an error.
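The try and except loop described above can be sketched as follows. As an assumption to keep it runnable as-is, read_line is a hypothetical replacement for input that returns scripted replies, as if the user typed cat, dog, -1, and then 5.

```python
# Scripted replies stand in for a real user at the keyboard
replies = iter(["cat", "dog", "-1", "5"])


def read_line(prompt):
    # Hypothetical replacement for input(prompt)
    return next(replies)


def get_height():
    while True:
        try:
            n = int(read_line("Height: "))  # may raise ValueError
            if n > 0:
                return n
        except ValueError:
            print("Not an integer")  # nothing is returned, so the loop repeats


height = get_height()
for i in range(height):
    print("#")
```

With these replies, the message appears for cat and for dog, the -1 is rejected silently by the if condition, and 5 is finally returned.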
1903 01:26:18,410 --> 01:26:20,695 It shows me something more user-friendly, like "not an integer." 1904 01:26:20,695 --> 01:26:22,610 And then, I can try again with "dog." 1905 01:26:22,610 --> 01:26:23,390 "Not an integer." 1906 01:26:23,390 --> 01:26:24,980 I can try again with 5. 1907 01:26:24,980 --> 01:26:26,240 And now, it works. 1908 01:26:26,240 --> 01:26:28,160 So we won't, generally, have you write much 1909 01:26:28,160 --> 01:26:30,500 in the way of these try-except blocks, only because they 1910 01:26:30,500 --> 01:26:33,080 get a little sophisticated quickly. 1911 01:26:33,080 --> 01:26:35,777 But that is to reveal what the get_int function is doing. 1912 01:26:35,777 --> 01:26:37,610 This is why we give you the training wheels, 1913 01:26:37,610 --> 01:26:39,420 so that, when you want to get an int, you 1914 01:26:39,420 --> 01:26:41,990 don't have to jump through all these annoying hoops to do so. 1915 01:26:41,990 --> 01:26:45,965 But that's all the library's really doing for you, is just try and except. 1916 01:26:45,965 --> 01:26:48,980 You won't be left with any training wheels, ultimately. 1917 01:26:48,980 --> 01:26:52,760 Questions, now, on getting input and trying in this way? 1918 01:26:52,760 --> 01:26:55,433 1919 01:26:55,433 --> 01:26:56,100 Anything at all? 1920 01:26:56,100 --> 01:26:56,610 Yeah? 1921 01:26:56,610 --> 01:27:03,643 AUDIENCE: I'm still [INAUDIBLE] try block. 1922 01:27:03,643 --> 01:27:06,560 DAVID MALAN: Oh, could you put the condition outside of the try block? 1923 01:27:06,560 --> 01:27:07,310 Short answer, yes. 1924 01:27:07,310 --> 01:27:09,227 And, in fact, I struggled with this last night 1925 01:27:09,227 --> 01:27:11,750 when tweaking this example to show the simplest version. 1926 01:27:11,750 --> 01:27:17,180 I will disclaim that, really, I should only be trying, literally, 1927 01:27:17,180 --> 01:27:18,470 to do the fragile part. 
1928 01:27:18,470 --> 01:27:21,710 And then, down here, I should be really doing 1929 01:27:21,710 --> 01:27:24,380 what you're proposing, which is do the condition out here. 1930 01:27:24,380 --> 01:27:27,380 The problem is, though, that, logically, this gets messy quickly, right? 1931 01:27:27,380 --> 01:27:31,205 Because except if there's a value error, I want to print out "not an integer." 1932 01:27:31,205 --> 01:27:33,920 I can't compare n against 0, then, because n doesn't 1933 01:27:33,920 --> 01:27:35,752 exist because there was an error. 1934 01:27:35,752 --> 01:27:37,460 So it turns out-- and I'll show you this; 1935 01:27:37,460 --> 01:27:39,350 this is now the advanced version of Python-- 1936 01:27:39,350 --> 01:27:42,620 there's actually an else keyword you can use in Python 1937 01:27:42,620 --> 01:27:44,570 that does not accompany if or elif. 1938 01:27:44,570 --> 01:27:48,680 It accompanies try and except, which I think is weirdly confusing. 1939 01:27:48,680 --> 01:27:50,640 A different word would have been better. 1940 01:27:50,640 --> 01:27:53,692 But if you'd really prefer, I could have done this, instead. 1941 01:27:53,692 --> 01:27:56,900 And this is one of these design things where reasonable people will disagree. 1942 01:27:56,900 --> 01:27:58,775 Generally speaking, you should only try to do 1943 01:27:58,775 --> 01:28:00,980 the one line that might very well fail. 1944 01:28:00,980 --> 01:28:02,420 But honestly, this looks stupid. 1945 01:28:02,420 --> 01:28:04,850 No, it's just unnecessarily complicated. 1946 01:28:04,850 --> 01:28:08,560 And so, my own preference was actually the original, which was-- yeah, 1947 01:28:08,560 --> 01:28:10,310 I'm trying a few extra lines that, really, 1948 01:28:10,310 --> 01:28:11,973 aren't going to fail, mathematically. 1949 01:28:11,973 --> 01:28:12,890 But it's just tighter. 1950 01:28:12,890 --> 01:28:14,030 It's cleaner this way. 
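The stricter alternative mentioned above, where only the fragile line sits inside try and the condition moves into an else clause, might look roughly like this. As before, read_line is a hypothetical stand-in for input that returns scripted replies.

```python
# Scripted replies stand in for a real user at the keyboard
replies = iter(["cat", "0", "4"])


def read_line(prompt):
    # Hypothetical replacement for input(prompt)
    return next(replies)


def get_height():
    while True:
        try:
            n = int(read_line("Height: "))  # the only line that can raise ValueError
        except ValueError:
            print("Not an integer")
        else:
            # Runs only when the try block finished without an error, so n exists
            if n > 0:
                return n


height = get_height()
print(height)
```

Both shapes behave the same for the user; the choice between them is the design disagreement described in the discussion.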
1951 01:28:14,030 --> 01:28:16,580 And here's, again, the sort of arguments you'll 1952 01:28:16,580 --> 01:28:18,530 start to make yourself as you get more comfortable with programming. 1953 01:28:18,530 --> 01:28:19,280 You'll have an opinion. 1954 01:28:19,280 --> 01:28:20,488 You'll disagree with someone. 1955 01:28:20,488 --> 01:28:25,200 And so long as you can back your argument up, it's pretty reasonable, probably. 1956 01:28:25,200 --> 01:28:25,700 All right. 1957 01:28:25,700 --> 01:28:30,222 So how about we, now, take away some piece of magic 1958 01:28:30,222 --> 01:28:31,430 that's been here for a while. 1959 01:28:31,430 --> 01:28:33,950 Let me go ahead and delete all of this here. 1960 01:28:33,950 --> 01:28:38,855 And let me propose that we revisit not that vertical column and the exceptions 1961 01:28:38,855 --> 01:28:42,110 that might result from getting input, but these horizontal question marks 1962 01:28:42,110 --> 01:28:43,130 that we saw a while ago. 1963 01:28:43,130 --> 01:28:45,980 So I want all of those question marks on the same line. 1964 01:28:45,980 --> 01:28:48,860 And yet, I worry we're about to see a challenge here because print, 1965 01:28:48,860 --> 01:28:51,830 up until now, has been putting new lines everywhere automatically, 1966 01:28:51,830 --> 01:28:53,570 even without those backslash n's. 1967 01:28:53,570 --> 01:28:56,360 Well, let me propose that we do this. 1968 01:28:56,360 --> 01:28:58,130 for i in the range of 4. 1969 01:28:58,130 --> 01:29:02,165 If I want four question marks, let me just print four question marks. 1970 01:29:02,165 --> 01:29:04,370 Unfortunately, I don't think this is correct yet. 1971 01:29:04,370 --> 01:29:06,530 Let me run python of mario.py. 1972 01:29:06,530 --> 01:29:11,510 And, of course, this gives me a column instead of the row of question marks 1973 01:29:11,510 --> 01:29:12,630 that I want. 1974 01:29:12,630 --> 01:29:13,550 So how do we do this?
1975 01:29:13,550 --> 01:29:17,785 Well, it turns out, if you read the documentation for the print function, 1976 01:29:17,785 --> 01:29:19,910 it turns out that print, not surprisingly, perhaps, 1977 01:29:19,910 --> 01:29:22,000 takes a lot of different arguments, as well. 1978 01:29:22,000 --> 01:29:24,590 And in fact, if you go to the documentation for it, 1979 01:29:24,590 --> 01:29:27,650 you'll see that it takes not just positional 1980 01:29:27,650 --> 01:29:30,685 arguments-- that is, from left to right, separated by commas. 1981 01:29:30,685 --> 01:29:32,810 It turns out, Python supports a fancier feature 1982 01:29:32,810 --> 01:29:36,860 with arguments where you can pass the names of arguments to functions, too. 1983 01:29:36,860 --> 01:29:38,470 So what do I mean by this? 1984 01:29:38,470 --> 01:29:43,430 If I go back to VS Code here and I've read the documentation, 1985 01:29:43,430 --> 01:29:48,995 it turns out that, yes, as before, you can pass multiple arguments to print, 1986 01:29:48,995 --> 01:29:49,700 like this. 1987 01:29:49,700 --> 01:29:53,030 Hello comma David comma Malan, that will just automatically 1988 01:29:53,030 --> 01:29:56,553 concatenate all three of those positional arguments together. 1989 01:29:56,553 --> 01:29:59,720 They're positional in the sense that they literally flow from left to right, 1990 01:29:59,720 --> 01:30:01,238 separated by commas. 1991 01:30:01,238 --> 01:30:03,530 But if you don't want to just pass in values like that, 1992 01:30:03,530 --> 01:30:07,370 you want to actually print out, as I did before, a question mark. 1993 01:30:07,370 --> 01:30:11,240 But you want to override the default behavior of print 1994 01:30:11,240 --> 01:30:14,610 by changing the line ending, you can actually do this. 1995 01:30:14,610 --> 01:30:18,890 You can use the name of an argument that you know exists from the documentation 1996 01:30:18,890 --> 01:30:22,130 and set it equal to some alternative value.
1997 01:30:22,130 --> 01:30:24,770 And in fact, even though this looks cryptic, 1998 01:30:24,770 --> 01:30:30,380 this is how I would override the end of each line, to be quote, unquote. 1999 01:30:30,380 --> 01:30:32,900 That is nothing because, if you read the documentation, 2000 01:30:32,900 --> 01:30:37,190 the default value for this end argument-- does someone want to guess-- 2001 01:30:37,190 --> 01:30:38,750 is-- 2002 01:30:38,750 --> 01:30:39,800 is backslash n. 2003 01:30:39,800 --> 01:30:41,690 So if you read the documentation, you'll see 2004 01:30:41,690 --> 01:30:46,550 that backslash n is the implied default for this end argument. 2005 01:30:46,550 --> 01:30:49,810 And so, if you want to change it, you just say end equals something else. 2006 01:30:49,810 --> 01:30:57,057 And so, here, I can change it to nothing and, now, rerun python of mario.py. 2007 01:30:57,057 --> 01:30:58,640 And now, they're all in the same line. 2008 01:30:58,640 --> 01:31:01,190 Now, it looks a little stupid because I made that week 2009 01:31:01,190 --> 01:31:04,190 one mistake where I still need to move the cursor to the next line. 2010 01:31:04,190 --> 01:31:05,570 That's just a different problem. 2011 01:31:05,570 --> 01:31:07,612 I'm just going to go over here and print nothing. 2012 01:31:07,612 --> 01:31:10,550 I don't even need to print backslash n because, if print automatically 2013 01:31:10,550 --> 01:31:13,970 gives you a backslash n, just call print with nothing, 2014 01:31:13,970 --> 01:31:15,420 and you'll get that for free. 2015 01:31:15,420 --> 01:31:16,940 So let me rerun python of mario.py. 2016 01:31:16,940 --> 01:31:19,895 And now, it looks a little prettier at the prompt. 2017 01:31:19,895 --> 01:31:21,770 And to be super clear as to what's going on-- 2018 01:31:21,770 --> 01:31:24,300 suppose I want to make an exclamation here.
2019 01:31:24,300 --> 01:31:27,320 I could change the backslash n default to an exclamation point, 2020 01:31:27,320 --> 01:31:28,680 just for kicks. 2021 01:31:28,680 --> 01:31:31,550 And if I run python of mario.py again, now, I 2022 01:31:31,550 --> 01:31:36,662 get this exclamation with question marks and exclamation points, as well. 2023 01:31:36,662 --> 01:31:38,120 So that's all that's going on here. 2024 01:31:38,120 --> 01:31:40,670 And this is what's called a named argument. 2025 01:31:40,670 --> 01:31:43,670 It literally has a name that you can specify when calling it in. 2026 01:31:43,670 --> 01:31:47,787 And it's different from positional in that you're literally using the name. 2027 01:31:47,787 --> 01:31:49,370 Let me propose something else, though. 2028 01:31:49,370 --> 01:31:50,828 And this is why people like Python. 2029 01:31:50,828 --> 01:31:52,550 There's just cool ways to do things. 2030 01:31:52,550 --> 01:31:55,724 2031 01:31:55,724 --> 01:32:00,740 That's a three-line, verbose way of printing out four question marks. 2032 01:32:00,740 --> 01:32:04,002 I could certainly take the shortcut and just do this. 2033 01:32:04,002 --> 01:32:06,085 But that's not really that interesting for anyone, 2034 01:32:06,085 --> 01:32:08,720 especially if I want to do it a variable number of times. 2035 01:32:08,720 --> 01:32:10,390 But Python does let you do this. 2036 01:32:10,390 --> 01:32:15,110 If you want to multiply a character some number of times, 2037 01:32:15,110 --> 01:32:18,020 not only can you use plus for concatenation, 2038 01:32:18,020 --> 01:32:23,930 you can use star or an asterisk for multiplication, if you will-- that is, 2039 01:32:23,930 --> 01:32:26,250 concatenation again and again and again. 2040 01:32:26,250 --> 01:32:29,030 So if I just print out, quote unquote, "?"
2041 01:32:29,030 --> 01:32:34,190 times 4, that's actually going to be the tightest way, the most succinct way 2042 01:32:34,190 --> 01:32:36,020 I can print four question marks instead. 2043 01:32:36,020 --> 01:32:39,095 And if I don't use 4, I use n, where I get n from the user. 2044 01:32:39,095 --> 01:32:39,830 Bang. 2045 01:32:39,830 --> 01:32:42,320 Now, I've gotten rid of the for loop entirely, 2046 01:32:42,320 --> 01:32:48,000 and I'm using the star operator to manipulate it instead. 2047 01:32:48,000 --> 01:32:50,120 And, to be super clear here, insofar as Python 2048 01:32:50,120 --> 01:32:54,440 does not have malloc or free or memory management that you have to do, 2049 01:32:54,440 --> 01:32:56,060 guess what Python also doesn't have. 2050 01:32:56,060 --> 01:32:59,760 2051 01:32:59,760 --> 01:33:03,110 Anything on your minds in the past couple of weeks? 2052 01:33:03,110 --> 01:33:03,875 Doesn't have-- 2053 01:33:03,875 --> 01:33:04,853 AUDIENCE: Pointers. 2054 01:33:04,853 --> 01:33:06,020 DAVID MALAN: Pointers, yeah. 2055 01:33:06,020 --> 01:33:09,295 So Python does not have pointers, which just means that all of that 2056 01:33:09,295 --> 01:33:11,420 happens for you automatically, underneath the hood, 2057 01:33:11,420 --> 01:33:14,150 again, by way of code that someone else wrote. 2058 01:33:14,150 --> 01:33:15,950 How about one more throwback with Mario? 2059 01:33:15,950 --> 01:33:20,450 We've talked about, in week one, this two-dimensional structure where 2060 01:33:20,450 --> 01:33:24,302 it's like I claim 3 by 3-- a grid of bricks, if you will. 2061 01:33:24,302 --> 01:33:25,760 Well, how can we do this in Python? 2062 01:33:25,760 --> 01:33:27,590 We can do this in a couple of ways, now.
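As a rough sketch of the row of question marks discussed so far, the snippet below shows the loop with end overridden and the star-operator shortcut side by side. The helper names row_with_loop and row_with_star are made up for illustration and are not from the lecture's mario.py.

```python
# Hypothetical helpers (not from the lecture) that build the same row two ways.

def row_with_loop(n):
    """Build the row one character at a time, the way the for loop does."""
    row = ""
    for _ in range(n):
        row += "?"
    return row


def row_with_star(n):
    """Build the same row by repeating the string with the star operator."""
    return "?" * n


# Overriding print's default end="\n" keeps every "?" on the same line.
for i in range(4):
    print("?", end="")
print()  # a bare print() supplies just the newline

# The one-line version: repeat the string, then print it once.
print("?" * 4)

# Both approaches produce the same text.
print(row_with_loop(4) == row_with_star(4))
```

Swapping the 4 for a variable n works the same way, since "?" * n accepts any non-negative integer.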
2063 01:33:27,590 --> 01:33:32,810 Let me go back to my mario.py, and let me do something like for i in range 2064 01:33:32,810 --> 01:33:36,200 of-- we'll just do 3, even though I know, now, I could use get_int 2065 01:33:36,200 --> 01:33:38,453 or I could use input and int. 2066 01:33:38,453 --> 01:33:41,120 And if I want to do something two-dimensionally, just like in C, 2067 01:33:41,120 --> 01:33:42,590 you can nest your for loops. 2068 01:33:42,590 --> 01:33:45,980 So maybe I could do for j in range of 3. 2069 01:33:45,980 --> 01:33:50,690 And then, in here, I could print out a hash symbol. 2070 01:33:50,690 --> 01:33:53,210 And then, let's see if that gives me 9 total. 2071 01:33:53,210 --> 01:33:56,870 So if I've got a nested loop like this, python of mario.py 2072 01:33:56,870 --> 01:33:58,625 hopefully gives me a grid. 2073 01:33:58,625 --> 01:34:01,710 No, it gave me a column of 9. 2074 01:34:01,710 --> 01:34:09,280 Why, logically, even though I've got my row and my columns? 2075 01:34:09,280 --> 01:34:10,210 Yeah. 2076 01:34:10,210 --> 01:34:11,542 AUDIENCE: [INAUDIBLE] 2077 01:34:11,542 --> 01:34:13,000 DAVID MALAN: Yeah, the line ending. 2078 01:34:13,000 --> 01:34:17,380 So in my row, I can't let print just keep adding new line, adding new line. 2079 01:34:17,380 --> 01:34:20,740 So I just have to override this here and let me not screw up like before. 2080 01:34:20,740 --> 01:34:24,250 Let me print one at the end of the whole row, just to move the cursor down. 2081 01:34:24,250 --> 01:34:28,090 And I think, now, together, we've got our 3 by 3. 2082 01:34:28,090 --> 01:34:29,950 Of course, we could tighten this up further. 2083 01:34:29,950 --> 01:34:33,730 If I don't like the nested loop, I probably could go in here 2084 01:34:33,730 --> 01:34:37,975 and just print out, for instance, a brick times 3. 2085 01:34:37,975 --> 01:34:41,055 Or I could change the 3 to a variable if I've gotten it from the user. 
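The 3 by 3 grid described above might look roughly like this. The grid function is a hypothetical helper added only so the result can be checked; the two printing loops are the part that mirrors the walkthrough.

```python
def grid(n):
    """Return an n-by-n grid of bricks as one string, one row per line.

    Hypothetical helper for illustration; the lecture prints directly.
    """
    rows = ""
    for i in range(n):
        rows += "#" * n + "\n"
    return rows


# Nested-loop version: the inner loop prints one row without newlines,
# and the bare print() after it moves the cursor down.
for i in range(3):
    for j in range(3):
        print("#", end="")
    print()

# Tighter version: one loop, with the star operator building each row.
for i in range(3):
    print("#" * 3)
```

Forgetting end="" in the inner loop is what produces the single column of nine bricks.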
2086 01:34:41,055 --> 01:34:42,582 So I can tighten this up further. 2087 01:34:42,582 --> 01:34:45,790 So, again, just different ways to solve the same problem and, again, evidence 2088 01:34:45,790 --> 01:34:47,575 of why a lot of people like Python. 2089 01:34:47,575 --> 01:34:49,825 There's just some more pleasant ways to solve problems 2090 01:34:49,825 --> 01:34:52,330 without getting into the weeds, constantly, of doing things, 2091 01:34:52,330 --> 01:34:56,845 like with for loops and while loops endlessly. 2092 01:34:56,845 --> 01:34:57,430 All right. 2093 01:34:57,430 --> 01:34:59,222 Well, how about some other building blocks? 2094 01:34:59,222 --> 01:35:02,983 Lists are going to be so incredibly useful in Python, just as arrays 2095 01:35:02,983 --> 01:35:04,900 were in C. But arrays are annoying because you 2096 01:35:04,900 --> 01:35:06,410 have to manage the memory yourself. 2097 01:35:06,410 --> 01:35:08,327 You have to know in advance how big they are or you 2098 01:35:08,327 --> 01:35:11,440 have to use pointers and malloc or realloc to resize them. 2099 01:35:11,440 --> 01:35:12,100 Oh my god. 2100 01:35:12,100 --> 01:35:14,267 The past two weeks have been painful, in that sense. 2101 01:35:14,267 --> 01:35:17,298 But Python does this all for free for you. 2102 01:35:17,298 --> 01:35:19,090 In fact, there's a whole bunch of functions 2103 01:35:19,090 --> 01:35:22,030 that come with Python that involve lists, 2104 01:35:22,030 --> 01:35:29,678 and they'll allow us, ultimately, to do things again and again and again 2105 01:35:29,678 --> 01:35:30,970 within the same data structure. 2106 01:35:30,970 --> 01:35:33,220 And, for instance, we'll be able to get the length of a list. 2107 01:35:33,220 --> 01:35:35,560 You don't have to remember it yourself in a variable. 2108 01:35:35,560 --> 01:35:39,085 You can just ask Python how many elements are in this list. 2109 01:35:39,085 --> 01:35:42,850 And with this, I think we can solve some old problems, too.
2110 01:35:42,850 --> 01:35:45,250 So let me go back here, to VS Code. 2111 01:35:45,250 --> 01:35:50,890 Let me close mario and give us a new program called scores.py. 2112 01:35:50,890 --> 01:35:54,535 And rather than show the C and the Python now, let's just focus on Python. 2113 01:35:54,535 --> 01:35:59,390 And in scores.c way back when, we just averaged three test scores or something 2114 01:35:59,390 --> 01:35:59,890 like that-- 2115 01:35:59,890 --> 01:36:01,900 72, 73, and 33-- 2116 01:36:01,900 --> 01:36:03,230 a few weeks ago. 2117 01:36:03,230 --> 01:36:07,450 So if I want to create a list in this Python version of 72, 73, 33, 2118 01:36:07,450 --> 01:36:09,220 I just use my square bracket notation. 2119 01:36:09,220 --> 01:36:12,640 C let you use curly braces if you know the values in advance, 2120 01:36:12,640 --> 01:36:14,170 but Python's just this. 2121 01:36:14,170 --> 01:36:16,855 And now, if I want to compute the average-- 2122 01:36:16,855 --> 01:36:19,360 in C, recall, I did something with a loop. 2123 01:36:19,360 --> 01:36:21,140 I added all the values together. 2124 01:36:21,140 --> 01:36:23,230 I, then, divide it by the total number of values 2125 01:36:23,230 --> 01:36:26,110 just like you would in grade school, and that gave me the average. 2126 01:36:26,110 --> 01:36:29,085 Well, Python comes with a lot of super handy functions-- 2127 01:36:29,085 --> 01:36:31,395 not just length, but others, as well. 2128 01:36:31,395 --> 01:36:34,150 And so, in fact, if you want to compute the average, 2129 01:36:34,150 --> 01:36:36,970 you can take the sum of all of those scores 2130 01:36:36,970 --> 01:36:40,010 and divide it by the length of all of those scores. 2131 01:36:40,010 --> 01:36:42,490 So Python comes with length, comes with sum. 2132 01:36:42,490 --> 01:36:45,310 You can just pass in a whole list of any size 2133 01:36:45,310 --> 01:36:47,590 and let it deal with that problem for you. 
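A minimal sketch of the averaging idea just described, using the three scores mentioned and the built-in sum and len functions:

```python
# The three test scores, stored in a list with square brackets.
scores = [72, 73, 33]

# sum adds up every element; len counts how many elements there are.
average = sum(scores) / len(scores)

# An f-string interpolates the variable inside the curly braces.
print(f"Average: {average}")
```

The division produces a float, so the printed value carries a long run of decimal digits.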
2134 01:36:47,590 --> 01:36:49,900 So if I want to, now, print out this average, 2135 01:36:49,900 --> 01:36:51,760 I can print out Average colon-- 2136 01:36:51,760 --> 01:36:55,570 and then, I'll plug in my average variable for interpolation. 2137 01:36:55,570 --> 01:36:58,900 Let me make this an fstring so that it gets formatted, 2138 01:36:58,900 --> 01:37:01,530 and let me just run python of scores.py. 2139 01:37:01,530 --> 01:37:02,800 And there is my average. 2140 01:37:02,800 --> 01:37:05,890 It's rounding weird because we're still vulnerable to some floating point 2141 01:37:05,890 --> 01:37:09,340 imprecision, but at least I didn't need loops 2142 01:37:09,340 --> 01:37:11,575 and I didn't have to write all this darn code just 2143 01:37:11,575 --> 01:37:15,130 to do something that Excel and Google Spreadsheets can just do like that. 2144 01:37:15,130 --> 01:37:17,950 Well, Python is closer to those kinds of tools, 2145 01:37:17,950 --> 01:37:21,790 but more powerful in that you can manipulate the data yourself. 2146 01:37:21,790 --> 01:37:25,510 How about, though, if I want to get a bunch of scores manually from the user 2147 01:37:25,510 --> 01:37:27,280 and, then, sum them together. 2148 01:37:27,280 --> 01:37:28,920 Well, let's combine a few ideas here. 2149 01:37:28,920 --> 01:37:29,830 How about this? 2150 01:37:29,830 --> 01:37:36,070 First, let me go ahead and import the get_int function from the CS50 library, 2151 01:37:36,070 --> 01:37:39,340 just so we don't have to deal with try and except or all of that. 2152 01:37:39,340 --> 01:37:42,340 And let me go ahead and give myself an empty list. 2153 01:37:42,340 --> 01:37:44,410 And this is powerful. 2154 01:37:44,410 --> 01:37:48,068 In C, [SIGHS] there's no point to an empty array 2155 01:37:48,068 --> 01:37:50,860 because, if you create an empty array with square bracket notation, 2156 01:37:50,860 --> 01:37:52,600 it's not useful for anything. 
2157 01:37:52,600 --> 01:37:55,780 But in Python, you can create it empty because Python 2158 01:37:55,780 --> 01:37:59,590 will grow and shrink the list for you automatically, as you add things to it. 2159 01:37:59,590 --> 01:38:01,600 So if I want to get three scores from the user, 2160 01:38:01,600 --> 01:38:04,840 I could do something like this-- for i in range of 3. 2161 01:38:04,840 --> 01:38:08,680 And then, I can grab a variable called "score" or anything. 2162 01:38:08,680 --> 01:38:11,467 I could call get_int, prompt the human for the score 2163 01:38:11,467 --> 01:38:12,550 that they want to type in. 2164 01:38:12,550 --> 01:38:15,060 And then, once they do, I can do this. 2165 01:38:15,060 --> 01:38:19,450 Thinking back to our object-oriented programming capability now, 2166 01:38:19,450 --> 01:38:24,358 I could do scores.append, and I can append that score to it. 2167 01:38:24,358 --> 01:38:27,400 And you would only know this from having read the documentation, heard it 2168 01:38:27,400 --> 01:38:30,040 in class, in a book or whatnot, but it turns out 2169 01:38:30,040 --> 01:38:33,880 that, just like strings have functions like lower built into them, 2170 01:38:33,880 --> 01:38:37,735 lists have functions like append built into them that just literally appends 2171 01:38:37,735 --> 01:38:40,165 to the end of the list for you, and Python 2172 01:38:40,165 --> 01:38:42,250 will grow or shrink it as needed. 2173 01:38:42,250 --> 01:38:44,760 No more malloc or realloc or the like. 2174 01:38:44,760 --> 01:38:49,120 So this just appends to the scores list. 2175 01:38:49,120 --> 01:38:51,740 That score, and then again and again and again. 2176 01:38:51,740 --> 01:38:52,990 So the array starts at-- 2177 01:38:52,990 --> 01:38:57,640 sorry, the list starts at size 0, then grows to 1 then 2 then 3 2178 01:38:57,640 --> 01:38:59,320 without you having to do anything else. 
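Here is one way the append loop might be sketched. In the lecture the numbers come from get_int in the CS50 library; so that this runs on its own, a hypothetical stand-in called fake_get_int hands back canned values instead of prompting, and get_scores is likewise an illustrative name.

```python
def get_scores(get_number, count):
    """Collect `count` scores by calling get_number(prompt) once per score."""
    scores = []                      # start with an empty list
    for i in range(count):
        score = get_number("Score: ")
        scores.append(score)         # the list grows: size 0, then 1, then 2, ...
    return scores


# Hypothetical stand-in for get_int so the sketch runs without any typing.
canned = iter([72, 73, 33])


def fake_get_int(prompt):
    """Ignore the prompt and return the next canned value."""
    return next(canned)


scores = get_scores(fake_get_int, 3)
average = sum(scores) / len(scores)
print(f"Average: {average}")
```

With the CS50 library installed, passing get_int in place of fake_get_int should prompt the user three times instead.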
2179 01:38:59,320 --> 01:39:02,845 And so, now, down here, I can compute an average 2180 01:39:02,845 --> 01:39:05,620 with the sum of those scores divided by the length 2181 01:39:05,620 --> 01:39:07,455 of the total number of scores. 2182 01:39:07,455 --> 01:39:11,830 And to be clear, length is the total number of elements in the list. 2183 01:39:11,830 --> 01:39:14,200 Doesn't matter how big the values themselves are. 2184 01:39:14,200 --> 01:39:18,160 Now I can go ahead and print out an fstring with something 2185 01:39:18,160 --> 01:39:22,100 like Average colon average in curly braces. 2186 01:39:22,100 --> 01:39:24,680 And if I run python of scores.py-- 2187 01:39:24,680 --> 01:39:27,505 I'll type in, just for the sake of discussion, the three values, 2188 01:39:27,505 --> 01:39:29,440 I still get the same answer. 2189 01:39:29,440 --> 01:39:31,390 But that would have been painful to do in C 2190 01:39:31,390 --> 01:39:35,770 unless you committed, in advance, to a fixed size array-- which we already 2191 01:39:35,770 --> 01:39:41,830 decided, weeks ago, was annoying-- or you grew it dynamically 2192 01:39:41,830 --> 01:39:44,740 using malloc or realloc or the like. 2193 01:39:44,740 --> 01:39:45,400 All right. 2194 01:39:45,400 --> 01:39:46,240 What else can I do? 2195 01:39:46,240 --> 01:39:49,990 Well, there's some nice things you might as well know exist. 2196 01:39:49,990 --> 01:39:54,340 Instead of scores.append, you can do slight fanciness like this. 2197 01:39:54,340 --> 01:39:57,290 If you want to append something to a list, 2198 01:39:57,290 --> 01:40:00,100 you can actually do plus equals, and then 2199 01:40:00,100 --> 01:40:03,620 put that thing in a temporary list of its own 2200 01:40:03,620 --> 01:40:05,740 and just use what is essentially concatenation-- 2201 01:40:05,740 --> 01:40:09,410 but not concatenation of strings, but concatenation of lists. 
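The plus-equals form can be sketched like this, with fixed values standing in for whatever the user would type:

```python
scores = []

# Fixed values stand in for user input here (an assumption for illustration).
for score in [72, 73, 33]:
    # Wrap the single score in a temporary one-element list,
    # then concatenate that list onto scores.
    scores += [score]   # same effect as scores.append(score)

print(scores)
print(len(scores))
```

The square brackets around score matter: the right-hand side has to be a list, not a bare integer.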
2202 01:40:09,410 --> 01:40:13,480 So this new line 6 appends to the scores list-- 2203 01:40:13,480 --> 01:40:15,640 this tiny, little list I'm temporarily creating 2204 01:40:15,640 --> 01:40:17,670 with just the current new score. 2205 01:40:17,670 --> 01:40:20,260 So just another piece of syntax that's worth seeing that 2206 01:40:20,260 --> 01:40:23,290 allows you to do something like that, as well. 2207 01:40:23,290 --> 01:40:23,890 All right. 2208 01:40:23,890 --> 01:40:26,093 Well, how about we go back to strings for a moment? 2209 01:40:26,093 --> 01:40:29,260 And all of these examples, as always, are on the course's website afterward. 2210 01:40:29,260 --> 01:40:32,860 Suppose we want to do something like converting characters to uppercase. 2211 01:40:32,860 --> 01:40:35,170 Well, to be clear, I could do something like this. 2212 01:40:35,170 --> 01:40:38,080 Let me create a program called uppercase.py. 2213 01:40:38,080 --> 01:40:42,280 Let me prompt the user for a before string as by using the input function 2214 01:40:42,280 --> 01:40:44,510 or get_string, which is almost the same. 2215 01:40:44,510 --> 01:40:47,110 And I'll prompt the user for a string beforehand. 2216 01:40:47,110 --> 01:40:52,750 Then, let me go ahead and print out, how about, the keyword "After," 2217 01:40:52,750 --> 01:40:56,650 and then end the new line with nothing, just so 2218 01:40:56,650 --> 01:41:00,010 that I can see "Before" on one line and "After" on the next line. 2219 01:41:00,010 --> 01:41:01,240 And then, let me do this-- 2220 01:41:01,240 --> 01:41:04,450 and here's where Python gets pleasant, too, with loops-- 2221 01:41:04,450 --> 01:41:07,270 for c in before-- 2222 01:41:07,270 --> 01:41:11,110 print c.upper end equals quote, unquote. 2223 01:41:11,110 --> 01:41:12,580 And then, I'll print this here. 2224 01:41:12,580 --> 01:41:13,120 All right. 2225 01:41:13,120 --> 01:41:15,950 That was fast, but let's try to infer what's going on.
2226 01:41:15,950 --> 01:41:19,600 So line 1 just gets input from the user, stores it in a variable called before. 2227 01:41:19,600 --> 01:41:22,510 Line two literally just prints "After" but doesn't 2228 01:41:22,510 --> 01:41:25,300 move the cursor to the next line. 2229 01:41:25,300 --> 01:41:27,015 What it, then, does is this. 2230 01:41:27,015 --> 01:41:29,875 And, in C, this was a little more annoying. 2231 01:41:29,875 --> 01:41:31,450 You needed a for loop with i. 2232 01:41:31,450 --> 01:41:34,690 You needed array notation with the square brackets. 2233 01:41:34,690 --> 01:41:39,850 But, Python, if you say for variable in string-- 2234 01:41:39,850 --> 01:41:42,670 so for c, for character, in string, Python 2235 01:41:42,670 --> 01:41:46,060 is going to automatically assign c to the first letter 2236 01:41:46,060 --> 01:41:47,110 that the user types in. 2237 01:41:47,110 --> 01:41:49,120 Then, on the next iteration, the second letter, the third letter, 2238 01:41:49,120 --> 01:41:49,745 and the fourth. 2239 01:41:49,745 --> 01:41:52,360 So you don't need any square bracket notation, you just use c, 2240 01:41:52,360 --> 01:41:55,180 and Python will do it for you and just hand you back, 2241 01:41:55,180 --> 01:41:59,000 one at a time, each of the letters that the user has typed in. 2242 01:41:59,000 --> 01:42:04,720 So if I go back over here and I run, for instance, python of uppercase.py 2243 01:42:04,720 --> 01:42:09,760 and I'll type in, how about, "david" in all lowercase and hit Enter, 2244 01:42:09,760 --> 01:42:13,630 you'll now see that it's all uppercase instead by iterating over it, 2245 01:42:13,630 --> 01:42:15,372 indeed, one character at a time. 2246 01:42:15,372 --> 01:42:17,830 But we already know, thanks to object-oriented programming, 2247 01:42:17,830 --> 01:42:20,027 strings themselves have the functionality built 2248 01:42:20,027 --> 01:42:24,100 in to not just uppercase single characters, but the whole string. 
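Both approaches to uppercasing can be sketched as follows. The string "david" is hard-coded here in place of the call to input, purely so the example runs without typing.

```python
before = "david"  # stand-in for input("Before: ")

# Character by character: the for loop hands back one letter at a time,
# with no index variable and no square brackets.
print("After:  ", end="")
for c in before:
    print(c.upper(), end="")
print()

# All at once: strings carry an upper method that handles the whole string.
after = before.upper()
print(f"After:  {after}")
```

Both versions print the same uppercased text; the second simply skips the loop.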
2249 01:42:24,100 --> 01:42:26,530 So, honestly, this was a bit of a silly exercise. 2250 01:42:26,530 --> 01:42:31,360 I don't need to use a loop anymore, like in C. And so, some of the habits 2251 01:42:31,360 --> 01:42:34,720 you've only just developed in recent weeks, it's time to start breaking them 2252 01:42:34,720 --> 01:42:36,130 when they're not necessary. 2253 01:42:36,130 --> 01:42:40,470 I can create a variable called after, set it equal to before.upper-- 2254 01:42:40,470 --> 01:42:43,600 which, indeed, exists, just like dot lower exists. 2255 01:42:43,600 --> 01:42:47,490 And then, what I can go ahead and print out is, for instance-- 2256 01:42:47,490 --> 01:42:49,990 let's get rid of this print line here and do it at the end-- 2257 01:42:49,990 --> 01:42:53,900 "After" and print the value of that variable. 2258 01:42:53,900 --> 01:42:58,005 So now, if I rerun uppercase.py, type in "david" in all lowercase, 2259 01:42:58,005 --> 01:43:03,400 I can just uppercase the whole thing all at once because, again, in Python, 2260 01:43:03,400 --> 01:43:07,000 you don't have to operate on characters individually. 2261 01:43:07,000 --> 01:43:13,310 Questions on any of these tricks up until now? 2262 01:43:13,310 --> 01:43:13,810 No? 2263 01:43:13,810 --> 01:43:14,290 All right. 2264 01:43:14,290 --> 01:43:17,290 How about a few other techniques that we saw in C that we'll bring back, 2265 01:43:17,290 --> 01:43:18,145 now, in Python. 2266 01:43:18,145 --> 01:43:22,860 So it turns out, in Python, there are other libraries you can use, too, 2267 01:43:22,860 --> 01:43:24,360 that unlock even more functionality. 2268 01:43:24,360 --> 01:43:27,040 So, in C, if you wanted command line arguments, 2269 01:43:27,040 --> 01:43:32,410 you just change the signature for main to be, instead of void, 2270 01:43:32,410 --> 01:43:38,515 int argc comma string argv, open brackets for an array or char star, 2271 01:43:38,515 --> 01:43:39,130 eventually. 
2272 01:43:39,130 --> 01:43:41,770 Well, it turns out, in Python, that, if you want to access command line 2273 01:43:41,770 --> 01:43:44,770 arguments, it's a little simpler, but they're tucked away in a library-- 2274 01:43:44,770 --> 01:43:46,990 otherwise known as a module-- 2275 01:43:46,990 --> 01:43:49,552 called sys, the system module. 2276 01:43:49,552 --> 01:43:51,760 Now, this is similar, in spirit, to the CS50 library, 2277 01:43:51,760 --> 01:43:53,802 and that's got a bunch of functionality built in. 2278 01:43:53,802 --> 01:43:55,725 But this one comes with Python itself. 2279 01:43:55,725 --> 01:43:59,710 So if I want to create a program like greet.py, in VS Code, 2280 01:43:59,710 --> 01:44:01,510 here, let me go ahead and do this. 2281 01:44:01,510 --> 01:44:05,785 From the sys library, let's import argv. 2282 01:44:05,785 --> 01:44:07,850 And that's just a thing that exists. 2283 01:44:07,850 --> 01:44:10,660 It's not built into main because there is no main, per se, anymore. 2284 01:44:10,660 --> 01:44:12,590 So it's tucked away in that library. 2285 01:44:12,590 --> 01:44:14,330 And now, I can do something like this. 2286 01:44:14,330 --> 01:44:16,925 If the length of argv equals equals 2, well, 2287 01:44:16,925 --> 01:44:19,090 let's go ahead and print out something friendly, 2288 01:44:19,090 --> 01:44:24,955 like hello comma argv bracket 1, and then, close quotes. 2289 01:44:24,955 --> 01:44:28,360 Else, if the length of argv is not equal to 2, 2290 01:44:28,360 --> 01:44:30,400 let's just go ahead and print out hello, world. 2291 01:44:30,400 --> 01:44:32,525 Now, at a glance, this might look a little cryptic, 2292 01:44:32,525 --> 01:44:35,050 but it's identical to what we did a few weeks ago. 2293 01:44:35,050 --> 01:44:39,570 When I run this, python of greet.py, with no arguments, 2294 01:44:39,570 --> 01:44:40,950 it just says "hello, world." 
2295 01:44:40,950 --> 01:44:46,180 But if I, instead, add a command line argument, like my first name and hit 2296 01:44:46,180 --> 01:44:49,825 Enter, now, the length of argv is no longer 1. 2297 01:44:49,825 --> 01:44:51,700 It's going to be 2. 2298 01:44:51,700 --> 01:44:54,680 And so, it prints out "Hello, David" instead. 2299 01:44:54,680 --> 01:44:57,880 So the takeaway here is that, whereas in C, 2300 01:44:57,880 --> 01:45:03,955 argv technically contained the name of your program, like ./hello or ./greet, 2301 01:45:03,955 --> 01:45:05,455 and then everything the human typed. 2302 01:45:05,455 --> 01:45:08,410 Python's a little different in that, because we're 2303 01:45:08,410 --> 01:45:10,150 using the interpreter in this way-- 2304 01:45:10,150 --> 01:45:16,090 technically, when you run python of greet.py, the length of argv is only 1. 2305 01:45:16,090 --> 01:45:18,760 It contains only greet.py, so the name of the file. 2306 01:45:18,760 --> 01:45:21,670 It does not unnecessarily contain Python itself 2307 01:45:21,670 --> 01:45:24,460 because what's the point of that being there, omnipresently? 2308 01:45:24,460 --> 01:45:28,760 It does contain the number of words that the human typed after Python itself. 2309 01:45:28,760 --> 01:45:32,230 So argv is length 1 here. argv is length 2 here. 2310 01:45:32,230 --> 01:45:35,350 And that's why, when it did equal 2, I saw "Hello, David" instead 2311 01:45:35,350 --> 01:45:37,240 of the default "Hello, world." 2312 01:45:37,240 --> 01:45:41,440 So same ability to access command line arguments, add these kinds of inputs 2313 01:45:41,440 --> 01:45:43,570 to your functions, but you have to unlock it 2314 01:45:43,570 --> 01:45:47,830 by way of using argv instead, in this way. 2315 01:45:47,830 --> 01:45:51,910 If you want to see all of the words, you could do something like this. 
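Before moving on to the loop, the greet.py logic traced so far can be sketched roughly as below. Wrapping the decision in a function named greeting is an assumption made here so the two cases can be tried without the command line; the lecture's version prints directly.

```python
from sys import argv


def greeting(args):
    """Return the greeting for a given argument list (hypothetical helper)."""
    if len(args) == 2:
        # args[0] is the file name, args[1] is the first word typed after it.
        return f"hello, {args[1]}"
    else:
        return "hello, world"


# With the real argument list: python greet.py David
print(greeting(argv))

# With hand-written lists, to show both branches.
print(greeting(["greet.py"]))
print(greeting(["greet.py", "David"]))
```

Note that the word python itself never appears in argv, which is why the length is 1 rather than 2 when no name is given.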
2316 01:45:51,910 --> 01:45:57,760 Just as-- if we combine ideas, here-- for i in range of, how about, length 2317 01:45:57,760 --> 01:45:59,610 of argv. 2318 01:45:59,610 --> 01:46:02,260 Then, I can do this-- print argv bracket i. 2319 01:46:02,260 --> 01:46:02,860 All right. 2320 01:46:02,860 --> 01:46:06,385 A little cryptic, but line 3 is just a for loop iterating 2321 01:46:06,385 --> 01:46:08,410 over the range of length of argv. 2322 01:46:08,410 --> 01:46:12,640 So if the human types in two words, the length of argv will be 2. 2323 01:46:12,640 --> 01:46:16,885 So this is just a way of saying iterate over all of the words in argv, 2324 01:46:16,885 --> 01:46:18,380 printing them one at a time. 2325 01:46:18,380 --> 01:46:22,810 So python of greet.py, Enter just prints out the name of the program. 2326 01:46:22,810 --> 01:46:27,340 python of greet.py with David prints out greet.py and, then, David. 2327 01:46:27,340 --> 01:46:29,470 I can keep running it though with more words, 2328 01:46:29,470 --> 01:46:32,650 and they'll each get printed one at a time. 2329 01:46:32,650 --> 01:46:35,440 But what's nice, too, about Python-- 2330 01:46:35,440 --> 01:46:38,920 and this is the point of this exercise-- honestly, this looks pretty cryptic. 2331 01:46:38,920 --> 01:46:40,720 This is not very pleasant to look at. 2332 01:46:40,720 --> 01:46:46,150 If you just want to iterate over every word in a list, which argv is, 2333 01:46:46,150 --> 01:46:47,680 watch what I can do. 2334 01:46:47,680 --> 01:46:52,090 I can do for arg or any variable name in argv. 2335 01:46:52,090 --> 01:46:54,147 Let me just, now, print out that argument. 2336 01:46:54,147 --> 01:46:56,980 I could keep calling it i, but i seems weird when it's not a number. 2337 01:46:56,980 --> 01:46:59,710 So I'm changing to arg as a word, instead. 2338 01:46:59,710 --> 01:47:03,970 If I now do python of greet.py, it does this. 
2339 01:47:03,970 --> 01:47:06,460 If I do python of greet.py, David, it does that again. 2340 01:47:06,460 --> 01:47:08,690 David Malan, it does that again. 2341 01:47:08,690 --> 01:47:10,898 So this is, again, why Python is just very appealing. 2342 01:47:10,898 --> 01:47:13,482 You want to do something this many times, iterate over a list? 2343 01:47:13,482 --> 01:47:15,820 Just say it, and it reads a little more like English. 2344 01:47:15,820 --> 01:47:18,130 And there's even other fanciness, too, if I may. 2345 01:47:18,130 --> 01:47:21,820 It's a little stupid that I keep seeing the name of the program, greet.py, 2346 01:47:21,820 --> 01:47:24,640 so it'd be nice if I could remove that. 2347 01:47:24,640 --> 01:47:28,960 Python also supports what are called slices of arrays-- 2348 01:47:28,960 --> 01:47:30,340 sorry, slices of lists. 2349 01:47:30,340 --> 01:47:32,050 Even I get the terminology confused. 2350 01:47:32,050 --> 01:47:36,400 If argv is a list, then it's going to print out everything in it. 2351 01:47:36,400 --> 01:47:41,950 But if I want a slice of it that starts at location 1 all the way to the end, 2352 01:47:41,950 --> 01:47:45,500 you can use this funky syntax in between the square brackets, which 2353 01:47:45,500 --> 01:47:48,700 we've not seen yet, that's going to start at item 1 2354 01:47:48,700 --> 01:47:50,220 and go all the way to the end. 2355 01:47:50,220 --> 01:47:53,830 And so, this is a nice, clever way of slicing off, 2356 01:47:53,830 --> 01:47:56,170 if you will, the very first element because now, 2357 01:47:56,170 --> 01:48:01,900 when I run greet.py, David Malan, I should only see David and Malan. 2358 01:48:01,900 --> 01:48:04,940 If I only want one element, I could do 1 to 2. 2359 01:48:04,940 --> 01:48:08,260 If I want all of them, I could do 0 onward. 2360 01:48:08,260 --> 01:48:10,900 I could give myself just one of them in this way. 
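The two loops and the slices just described might look like this. A hand-written list stands in for sys.argv (an assumption for illustration) so that the output does not depend on how the file is run.

```python
# Stand-in for sys.argv after running: python greet.py David Malan
argv = ["greet.py", "David", "Malan"]

# Index-based loop, closer to how it would be written in C.
for i in range(len(argv)):
    print(argv[i])

# Iterating over the list directly reads more like English.
for arg in argv:
    print(arg)

# A slice from location 1 to the end drops the program's name.
for arg in argv[1:]:
    print(arg)

# Slices take a start and an end; the end is not included.
print(argv[1:2])   # just the one element at location 1
print(argv[0:])    # everything, from location 0 onward
```

A slice always gives back a new list, even when it holds only one element.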
2361 01:48:10,900 --> 01:48:14,380 So you can play with the start value and the end value in this way, 2362 01:48:14,380 --> 01:48:17,020 to slice and dice these lists in different ways. 2363 01:48:17,020 --> 01:48:20,620 That would have been a pain in C, just because we didn't really 2364 01:48:20,620 --> 01:48:26,840 have the built-in support for manipulating arrays as cleanly as this. 2365 01:48:26,840 --> 01:48:27,340 All right. 2366 01:48:27,340 --> 01:48:31,440 Just so you've seen it, too-- though, this one is less exciting to see live-- 2367 01:48:31,440 --> 01:48:33,940 if I go ahead and create a quick program here, it turns out, 2368 01:48:33,940 --> 01:48:37,630 there's something else in the sys library, the ability to exit programs-- 2369 01:48:37,630 --> 01:48:41,590 either exiting with status code 1 or 0, as we've been doing any time something 2370 01:48:41,590 --> 01:48:42,673 goes right or wrong. 2371 01:48:42,673 --> 01:48:45,340 So, for instance, let me whip up a quick program that just says, 2372 01:48:45,340 --> 01:48:52,300 if the length of sys.argv does not equal 2, then let's yell at the user 2373 01:48:52,300 --> 01:48:54,970 and say you're missing a command line argument. 2374 01:48:54,970 --> 01:48:57,380 Otherwise, command-line argument. 2375 01:48:57,380 --> 01:49:01,360 And let's, then, return sys.exit(1). 2376 01:49:01,360 --> 01:49:05,590 Else, let's go ahead and, logically, just say print a formatted string that 2377 01:49:05,590 --> 01:49:07,450 says hello-- as before-- 2378 01:49:07,450 --> 01:49:09,640 sys.argv 1. 2379 01:49:09,640 --> 01:49:11,770 Now, things look different all of a sudden, 2380 01:49:11,770 --> 01:49:13,312 but I'm doing something deliberately. 2381 01:49:13,312 --> 01:49:14,870 First, let's see what this does. 2382 01:49:14,870 --> 01:49:18,730 So, on line 1, I'm importing not argv, specifically. 2383 01:49:18,730 --> 01:49:22,150 I'm importing the whole sys library, and we'll see why in a second. 
2384 01:49:22,150 --> 01:49:27,220 Well, it turns out that the sys library has not only the argv list, 2385 01:49:27,220 --> 01:49:30,580 it also has a function called exit, which I'd like to be able to use, 2386 01:49:30,580 --> 01:49:31,370 as well. 2387 01:49:31,370 --> 01:49:35,200 So it turns out that, if you import a whole library in this way, that's fine. 2388 01:49:35,200 --> 01:49:37,840 But you have to refer to the things inside of it 2389 01:49:37,840 --> 01:49:42,980 by using that same library's name and a dot to namespace it, so to speak. 2390 01:49:42,980 --> 01:49:47,002 So here, I'm just saying, if the user does not type in two words, 2391 01:49:47,002 --> 01:49:49,960 yell at them with missing command-line argument, and then, exit with 1. 2392 01:49:49,960 --> 01:49:52,975 Just like in C, when you do exit 1, it just means something went wrong. 2393 01:49:52,975 --> 01:49:54,785 Otherwise, print out hello to this. 2394 01:49:54,785 --> 01:49:57,910 And this is starting to look cryptic, but it's just a combination of ideas. 2395 01:49:57,910 --> 01:50:02,080 The curly braces mean interpolate this value, plug it in here. 2396 01:50:02,080 --> 01:50:05,740 sys.argv is just the verbose way of saying go into the sys library 2397 01:50:05,740 --> 01:50:09,010 and get the argv variable therein. 2398 01:50:09,010 --> 01:50:11,860 And bracket 1, of course, just like arrays in C, 2399 01:50:11,860 --> 01:50:15,440 is just the second element at the prompt. 2400 01:50:15,440 --> 01:50:18,700 So when I run this version, now-- python of exit.py-- 2401 01:50:18,700 --> 01:50:21,340 with no arguments, I get yelled at in this way. 2402 01:50:21,340 --> 01:50:24,640 If, however, I type in two arguments total-- 2403 01:50:24,640 --> 01:50:26,950 the name of the file and my own name-- 2404 01:50:26,950 --> 01:50:29,050 now, I get greeted with hello, David. 2405 01:50:29,050 --> 01:50:30,310 And it's the same idea as before.
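A rough sketch of the exit.py logic described above. To keep the example runnable without a terminal, the check is wrapped in a hypothetical function named greet that takes the argument list as a parameter, and the SystemExit that sys.exit raises is caught so the status can be inspected; the lecture's version works on sys.argv directly at the top level of the file.

```python
import sys


def greet(argv):
    # Expect exactly two items: the program's name and one name to greet.
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # a nonzero status signals that something went wrong
    print(f"hello, {argv[1]}")


# Two items: greets and returns normally.
greet(["exit.py", "David"])

# One item: sys.exit(1) raises SystemExit, caught here only so that
# this sketch can keep running and show the status code.
try:
    greet(["exit.py"])
except SystemExit as e:
    status = e.code

print(status)
```

In a real script the exception is not caught, the interpreter stops, and the status is what echo $? reports in the shell.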
2406 01:50:30,310 --> 01:50:33,160 This was a very low-level technique, but same thing here. 2407 01:50:33,160 --> 01:50:36,310 If you do echo dollar sign question mark Enter, 2408 01:50:36,310 --> 01:50:39,170 you'll see the exit code of your program. 2409 01:50:39,170 --> 01:50:41,270 So if I do this incorrectly again-- 2410 01:50:41,270 --> 01:50:43,953 let me rerun it without my name, Enter-- 2411 01:50:43,953 --> 01:50:44,620 I get yelled at. 2412 01:50:44,620 --> 01:50:47,320 But if I do echo dollar sign question mark, 2413 01:50:47,320 --> 01:50:50,170 there's the secret one that's returned. 2414 01:50:50,170 --> 01:50:54,160 Again, just to show you parity with C, in this case. 2415 01:50:54,160 --> 01:50:56,320 Questions, now, on any of these techniques, here? 2416 01:50:56,320 --> 01:50:58,900 2417 01:50:58,900 --> 01:50:59,400 No. 2418 01:50:59,400 --> 01:51:00,030 All right. 2419 01:51:00,030 --> 01:51:02,580 How about something that's a little more powerful, too? 2420 01:51:02,580 --> 01:51:05,880 We spend so much time in week 0 and 1 doing searching 2421 01:51:05,880 --> 01:51:07,830 and, then, eventually, sorting in week 3. 2422 01:51:07,830 --> 01:51:10,288 Well, it turns out, Python can help with some of this, too. 2423 01:51:10,288 --> 01:51:12,720 Let me go ahead and create a program called names.py 2424 01:51:12,720 --> 01:51:15,053 that's just going to be an opportunity to, maybe, search 2425 01:51:15,053 --> 01:51:16,650 over a whole bunch of names. 2426 01:51:16,650 --> 01:51:21,060 Let me go ahead and import sys, just so I have access to exit. 2427 01:51:21,060 --> 01:51:22,920 And let me go ahead and create a variable 2428 01:51:22,920 --> 01:51:26,756 called names that's going to be a list with a whole bunch of names. 2429 01:51:26,756 --> 01:51:27,660 How about here? 2430 01:51:27,660 --> 01:51:34,740 Charlie and Fred and George and Ginny and Percy and, lastly, Ron. 2431 01:51:34,740 --> 01:51:36,290 So a whole bunch of names here. 
2432 01:51:36,290 --> 01:51:38,040 And it'd be a little annoying to implement 2433 01:51:38,040 --> 01:51:42,540 code that iterates over that, from left to right, in C, searching for one 2434 01:51:42,540 --> 01:51:43,165 of those names. 2435 01:51:43,165 --> 01:51:43,957 In fact, what name? 2436 01:51:43,957 --> 01:51:46,290 Well, let's go ahead and ask the user to input the name 2437 01:51:46,290 --> 01:51:48,498 that they want to search for so that we can tell them 2438 01:51:48,498 --> 01:51:50,460 if the name is there or not. 2439 01:51:50,460 --> 01:51:54,670 And we could do this, similar to C, in Python, doing something like this. 2440 01:51:54,670 --> 01:52:00,600 So for n in names, where n is just a variable to iterate over each name-- 2441 01:52:00,600 --> 01:52:05,595 if the name I'm looking for equals the current name in the list-- 2442 01:52:05,595 --> 01:52:09,060 AKA n-- well, let's print out something friendly, like "Found." 2443 01:52:09,060 --> 01:52:14,250 And then, let's do sys.exit 0 to indicate that we found whoever that is. 2444 01:52:14,250 --> 01:52:17,460 Otherwise, if we get all the way to the bottom here, outside of this loop, 2445 01:52:17,460 --> 01:52:20,340 let's just print "Not found" because we haven't exited yet. 2446 01:52:20,340 --> 01:52:22,800 And then, let's just exit with 1. 2447 01:52:22,800 --> 01:52:25,980 Just to be clear, I can continue importing all of sys, 2448 01:52:25,980 --> 01:52:31,920 or I could do from sys import exit, and then, I could get rid of sys dot 2449 01:52:31,920 --> 01:52:33,240 everywhere else. 2450 01:52:33,240 --> 01:52:36,540 But sometimes, it's helpful to know exactly where functions came from. 2451 01:52:36,540 --> 01:52:39,675 So this, too, is just a matter of style, in this case. 2452 01:52:39,675 --> 01:52:40,230 All right. 2453 01:52:40,230 --> 01:52:41,522 So let's go ahead and run this.
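A sketch of the loop just described, under two assumptions: the search is wrapped in a hypothetical function named search so it can be called with fixed names instead of input, and it returns the status code (0 or 1) that the lecture's names.py would pass to sys.exit.

```python
names = ["Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def search(name):
    # Linear search: check each name in the list, left to right.
    for n in names:
        if name == n:
            print("Found")
            return 0  # names.py would call sys.exit(0) here
    # Reached only if the loop finished without finding a match.
    print("Not found")
    return 1  # names.py would call sys.exit(1) here


search("Ron")
search("Hermione")
```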
2454 01:52:41,522 --> 01:52:46,540 python of names.py, and let's look for Ron, all the way at the end. 2455 01:52:46,540 --> 01:52:47,040 All right. 2456 01:52:47,040 --> 01:52:47,910 He's found. 2457 01:52:47,910 --> 01:52:51,570 And let's search for someone outside of the family here, like Hermione. 2458 01:52:51,570 --> 01:52:52,700 Not found. 2459 01:52:52,700 --> 01:52:53,200 OK. 2460 01:52:53,200 --> 01:52:54,783 So it seems to be working in this way. 2461 01:52:54,783 --> 01:52:58,548 But I've essentially implemented what algorithm? 2462 01:52:58,548 --> 01:53:05,247 What algorithm would this seem to be, per line 7 and 8 to 9 and 10? 2463 01:53:05,247 --> 01:53:05,955 AUDIENCE: Linear. 2464 01:53:05,955 --> 01:53:06,450 DAVID MALAN: Yeah. 2465 01:53:06,450 --> 01:53:07,350 So it's just linear search. 2466 01:53:07,350 --> 01:53:10,185 It's a loop, even though the syntax is a little more succinct today, 2467 01:53:10,185 --> 01:53:12,060 and it's just iterating over the whole thing. 2468 01:53:12,060 --> 01:53:15,240 Well, honestly, we've seen an even more terse way to do this in Python. 2469 01:53:15,240 --> 01:53:19,230 And this, again, is what makes it a more pleasant language, sometimes. 2470 01:53:19,230 --> 01:53:20,630 Why don't I just do this? 2471 01:53:20,630 --> 01:53:24,790 Instead of iterating one at a time, why don't I just say this? 2472 01:53:24,790 --> 01:53:27,840 Let me go ahead and change my condition to just 2473 01:53:27,840 --> 01:53:33,270 be-- how about if the name we're looking for is in the names list, we're done. 2474 01:53:33,270 --> 01:53:33,960 We found it. 2475 01:53:33,960 --> 01:53:36,570 Use the in preposition that we've seen a couple of times, 2476 01:53:36,570 --> 01:53:40,710 now, that itself asks the question, is something in something else? 2477 01:53:40,710 --> 01:53:44,050 And Python will take care of linear search for us.
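The shorter version can be sketched as follows. As an assumption for illustration, the check lives in a hypothetical function named search that returns the exit status rather than calling sys.exit.

```python
names = ["Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def search(name):
    # "in" asks whether name appears anywhere in the list;
    # for a list, Python still scans the elements one by one.
    if name in names:
        print("Found")
        return 0
    print("Not found")
    return 1


search("Ron")
search("Hermione")
```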
2478 01:53:44,050 --> 01:53:46,080 And it's going to work exactly the same if I 2479 01:53:46,080 --> 01:53:48,030 do python of names.py, search for Ron. 2480 01:53:48,030 --> 01:53:50,077 It's still going to find him and it's still 2481 01:53:50,077 --> 01:53:51,660 going to do it linearly, in this case. 2482 01:53:51,660 --> 01:53:58,060 But I don't have to write all of the lower-level code myself, in this case. 2483 01:53:58,060 --> 01:54:02,430 Questions, now, on any of this? 2484 01:54:02,430 --> 01:54:05,380 The code's just getting shorter and shorter. 2485 01:54:05,380 --> 01:54:05,880 No? 2486 01:54:05,880 --> 01:54:07,740 What about-- let's see. 2487 01:54:07,740 --> 01:54:09,250 What else might we have here? 2488 01:54:09,250 --> 01:54:10,770 How about this? 2489 01:54:10,770 --> 01:54:12,780 Let's go ahead and implement that phonebook 2490 01:54:12,780 --> 01:54:15,690 that we started, metaphorically, with in the beginning of the course. 2491 01:54:15,690 --> 01:54:17,940 Let's code up a program called phonebook.py. 2492 01:54:17,940 --> 01:54:22,440 And in this case, let's go ahead and let's create a dictionary this time. 2493 01:54:22,440 --> 01:54:25,470 Recall that a dictionary is a little something that 2494 01:54:25,470 --> 01:54:27,060 implements something like this-- 2495 01:54:27,060 --> 01:54:31,140 a two-column table that's got keys and values, words 2496 01:54:31,140 --> 01:54:33,240 and definitions, names and numbers. 2497 01:54:33,240 --> 01:54:36,367 And let's focus on the last of those, names and numbers, in this case. 2498 01:54:36,367 --> 01:54:38,700 Well, I claimed earlier that Python has built-in support 2499 01:54:38,700 --> 01:54:42,780 for dictionaries-- dict objects-- that you can create with one line. 2500 01:54:42,780 --> 01:54:45,120 I didn't need it for speller because a set is sufficient 2501 01:54:45,120 --> 01:54:47,610 when you only want one of the keys or the values, not both. 
2502 01:54:47,610 --> 01:54:49,680 But now, I want some names and numbers. 2503 01:54:49,680 --> 01:54:53,220 So it turns out, in Python, you can create an empty dictionary 2504 01:54:53,220 --> 01:54:55,680 by saying dict open parenthesis, closed. 2505 01:54:55,680 --> 01:54:58,080 And that just gives you, essentially, a chart that 2506 01:54:58,080 --> 01:54:59,640 looks like this, with nothing in it. 2507 01:54:59,640 --> 01:55:01,725 Or there's more succinct syntax. 2508 01:55:01,725 --> 01:55:06,858 You can, alternatively, do this, with two curly braces, instead. 2509 01:55:06,858 --> 01:55:09,150 And, in fact, I've been using a shortcut all this time. 2510 01:55:09,150 --> 01:55:15,885 When I had a list, earlier, where my variable was called scores, 2511 01:55:15,885 --> 01:55:19,860 and I did this, that was actually the shorthand version of this-- 2512 01:55:19,860 --> 01:55:21,637 hey, Python, give me an empty list. 2513 01:55:21,637 --> 01:55:23,970 So there's different syntax for achieving the same goal. 2514 01:55:23,970 --> 01:55:27,540 In this case, if I want a dictionary for people, 2515 01:55:27,540 --> 01:55:32,530 I can either do this or, more commonly, just two curly braces, like that. 2516 01:55:32,530 --> 01:55:33,030 All right. 2517 01:55:33,030 --> 01:55:34,360 Well, what do I want to put in this? 2518 01:55:34,360 --> 01:55:36,360 Well, let me actually put some things in this. 2519 01:55:36,360 --> 01:55:39,360 And I'm going to just move my closed curly brace to a new line. 2520 01:55:39,360 --> 01:55:42,580 If I want to implement this idea of keys and values, 2521 01:55:42,580 --> 01:55:47,220 the way you do this in Python is key colon value comma. 2522 01:55:47,220 --> 01:55:48,230 Key colon value. 2523 01:55:48,230 --> 01:55:50,410 So you'd implement it more in code. 
2524 01:55:50,410 --> 01:55:54,270 So, for instance, if I want Carter to be the first key in my phone book and I 2525 01:55:54,270 --> 01:56:00,135 want his number to be +1-617-495-1000, I can put that as the corresponding 2526 01:56:00,135 --> 01:56:00,960 value. 2527 01:56:00,960 --> 01:56:02,010 The colon is in between. 2528 01:56:02,010 --> 01:56:05,970 Both are strings, or strs, so I've quoted both deliberately. 2529 01:56:05,970 --> 01:56:07,762 If I want to add myself, I can put a comma. 2530 01:56:07,762 --> 01:56:10,970 And then, just to keep things pretty, I'm moving the cursor to the next line. 2531 01:56:10,970 --> 01:56:12,990 But that's not strictly required, aesthetically. 2532 01:56:12,990 --> 01:56:13,865 It's just good style. 2533 01:56:13,865 --> 01:56:19,500 And here, I might do +1-949-468-2750. 2534 01:56:19,500 --> 01:56:24,270 And now, I have a dictionary that, essentially, has two rows, here-- 2535 01:56:24,270 --> 01:56:27,322 Carter and his number and David and his number, as well. 2536 01:56:27,322 --> 01:56:30,405 And if I kept adding to this, this chart would just get longer and longer. 2537 01:56:30,405 --> 01:56:32,430 Suppose I want to search for one of our numbers. 2538 01:56:32,430 --> 01:56:34,950 Well, let's prompt the user for the name, 2539 01:56:34,950 --> 01:56:37,470 for whose number you want to search by getting string. 2540 01:56:37,470 --> 01:56:38,560 Or you know what? 2541 01:56:38,560 --> 01:56:39,893 We don't need this CS50 library. 2542 01:56:39,893 --> 01:56:43,090 Let's just use input and prompt the user for a name. 2543 01:56:43,090 --> 01:56:49,230 And now, we can use this super terse syntax and just say if name in people, 2544 01:56:49,230 --> 01:56:53,700 print the formatted string number colon and-- 2545 01:56:53,700 --> 01:56:57,160 here, we can do this-- people bracket name. 2546 01:56:57,160 --> 01:56:57,930 OK. 2547 01:56:57,930 --> 01:57:01,800 So this is getting cool quickly, confusingly. 
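A sketch of phonebook.py as described so far, using the two entries from the lecture. The function name lookup is hypothetical, and the fixed name at the bottom stands in for what input would return.

```python
# Keys and values are both strings: name -> number.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    # "in" checks the dictionary's keys; people[name] fetches the value.
    if name in people:
        print(f"Number: {people[name]}")
        return people[name]
    return None


name = "Carter"  # stand-in for input("Name: ")
lookup(name)
```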
2548 01:57:01,800 --> 01:57:02,805 So let me run this. 2549 01:57:02,805 --> 01:57:06,810 python of phonebook.py Let's type in Carter. 2550 01:57:06,810 --> 01:57:08,910 And, indeed, I see his number. 2551 01:57:08,910 --> 01:57:12,910 Let's run it again with David, and I see my number here. 2552 01:57:12,910 --> 01:57:14,590 So what's going on? 2553 01:57:14,590 --> 01:57:19,320 Well, it turns out that a dictionary is very similar, in spirit, to a list. 2554 01:57:19,320 --> 01:57:22,350 It's actually very similar, in spirit, to an array in C. 2555 01:57:22,350 --> 01:57:27,150 But instead of being limited to keys that are numbers, like bracket 0, 2556 01:57:27,150 --> 01:57:30,690 bracket 1, bracket 2, you can actually use words. 2557 01:57:30,690 --> 01:57:33,060 And that's all I'm doing here on line 8. 2558 01:57:33,060 --> 01:57:36,765 If I want to check for the name Carter, which is currently 2559 01:57:36,765 --> 01:57:39,555 in this variable called name, I can index 2560 01:57:39,555 --> 01:57:42,660 into my people dictionary using not a number, 2561 01:57:42,660 --> 01:57:44,830 but using, literally, a string-- 2562 01:57:44,830 --> 01:57:48,000 the name Carter or David or anything else. 2563 01:57:48,000 --> 01:57:50,640 To make this clearer, too, notice that I'm, at the moment, 2564 01:57:50,640 --> 01:57:54,095 using this format string, which is adding some undue complexity. 2565 01:57:54,095 --> 01:57:56,220 But I could clarify this, perhaps, further as this. 2566 01:57:56,220 --> 01:57:58,080 I could give myself another variable called 2567 01:57:58,080 --> 01:58:01,320 number, set it equal to the people dictionary, 2568 01:58:01,320 --> 01:58:03,875 indexing into it using the current name. 2569 01:58:03,875 --> 01:58:07,230 And now, I can shorten this to make it clearer that all I'm doing 2570 01:58:07,230 --> 01:58:09,910 is printing the value of that. 2571 01:58:09,910 --> 01:58:12,930 And, in fact, I can do this even more cryptically. 
2572 01:58:12,930 --> 01:58:16,710 This would be weird to do, but if I only ever want to show David's phone number 2573 01:58:16,710 --> 01:58:21,150 and never Carter's, I can literally, quote unquote, "index into" the people 2574 01:58:21,150 --> 01:58:24,930 dictionary because, now, when I run this, even if I type Carter, 2575 01:58:24,930 --> 01:58:27,020 I'm going to get back my number instead. 2576 01:58:27,020 --> 01:58:31,080 But that's all that's happening if I undo that, because that's now a bug. 2577 01:58:31,080 --> 01:58:35,250 But I index into it using the value of name. 2578 01:58:35,250 --> 01:58:37,230 Dictionaries are just so wonderfully convenient 2579 01:58:37,230 --> 01:58:39,688 because, now, you can associate anything with anything else 2580 01:58:39,688 --> 01:58:43,420 but not using numbers, but entire key words, instead. 2581 01:58:43,420 --> 01:58:46,770 So here's how, if, in speller, we gave you not just words, 2582 01:58:46,770 --> 01:58:50,340 but hundreds of thousands of definitions, as well, 2583 01:58:50,340 --> 01:58:52,385 you could essentially store them as this. 2584 01:58:52,385 --> 01:58:55,680 And then, when the human wants to look up a definition in a proper dictionary, 2585 01:58:55,680 --> 01:58:57,750 not just for spell checking, you could index 2586 01:58:57,750 --> 01:59:00,290 into the dictionary using square brackets 2587 01:59:00,290 --> 01:59:04,240 and get back the definition in English, as well. 2588 01:59:04,240 --> 01:59:06,770 Questions on this? 2589 01:59:06,770 --> 01:59:07,280 Yeah? 2590 01:59:07,280 --> 01:59:09,760 AUDIENCE: Is the way this code does, as presented, 2591 01:59:09,760 --> 01:59:11,744 saying that Python has [INAUDIBLE]? 2592 01:59:11,744 --> 01:59:21,390 2593 01:59:21,390 --> 01:59:22,890 DAVID MALAN: A really good question. 2594 01:59:22,890 --> 01:59:27,330 So, to summarize, how is Python finding that name within that dictionary? 
2595 01:59:27,330 --> 01:59:31,110 This is where, honestly, speller in p-set 5 is what Python's all about. 2596 01:59:31,110 --> 01:59:34,215 So you have struggled, are struggling with implementing your own spell 2597 01:59:34,215 --> 01:59:36,090 checker and implementing your own hash table. 2598 01:59:36,090 --> 01:59:39,210 And recall that, per last week, the goal of a hash table is to, 2599 01:59:39,210 --> 01:59:41,190 ideally, get constant time access. 2600 01:59:41,190 --> 01:59:45,435 Not something linear, which is slow, and even better than something logarithmic, 2601 01:59:45,435 --> 01:59:47,400 like log base 2 of n. 2602 01:59:47,400 --> 01:59:50,130 So Python and the really smart people who invented it, 2603 01:59:50,130 --> 01:59:53,310 they have written the code that does its best to give you 2604 01:59:53,310 --> 01:59:55,853 constant time searches of dictionaries. 2605 01:59:55,853 --> 01:59:58,020 And they're not always going to succeed, just as you 2606 01:59:58,020 --> 01:59:59,430 and your own problem set are probably going 2607 01:59:59,430 --> 02:00:01,805 to have some collisions once in a while and start to have 2608 02:00:01,805 --> 02:00:03,440 chains of linked lists of words. 2609 02:00:03,440 --> 02:00:05,940 But this is where, again, you defer to someone else, someone 2610 02:00:05,940 --> 02:00:07,800 smarter than you, someone with more time than you 2611 02:00:07,800 --> 02:00:09,270 to solve these problems for you. 2612 02:00:09,270 --> 02:00:11,490 And if you read Python's documentation, you'll 2613 02:00:11,490 --> 02:00:13,650 see that it doesn't guarantee constant time, 2614 02:00:13,650 --> 02:00:15,990 but it's going to, ideally, optimize the data structure 2615 02:00:15,990 --> 02:00:19,320 for you to get as fast as possible.
2616 02:00:19,320 --> 02:00:22,690 And of all of the data structures like a dictionary, 2617 02:00:22,690 --> 02:00:25,380 a hash table is, really, like the Swiss army knife of computing 2618 02:00:25,380 --> 02:00:28,260 because it just lets you associate something with something else. 2619 02:00:28,260 --> 02:00:30,510 And even though we keep focusing on names and numbers, 2620 02:00:30,510 --> 02:00:32,400 that's a really powerful thing because it's 2621 02:00:32,400 --> 02:00:34,230 more powerful than lists and arrays, which 2622 02:00:34,230 --> 02:00:35,910 are only numbers and something else. 2623 02:00:35,910 --> 02:00:38,690 Now, you can have any sorts of relationships, instead. 2624 02:00:38,690 --> 02:00:39,270 All right. 2625 02:00:39,270 --> 02:00:41,178 Let me show a few other examples before we 2626 02:00:41,178 --> 02:00:43,470 culminate with some more powerful techniques in Python, 2627 02:00:43,470 --> 02:00:45,000 thanks to libraries. 2628 02:00:45,000 --> 02:00:49,480 How about this problem we encountered in week 4, which was this. 2629 02:00:49,480 --> 02:00:54,120 Let me code up a program called, again, compare.py here but, this time, 2630 02:00:54,120 --> 02:00:56,770 compare two strings and not numbers. 2631 02:00:56,770 --> 02:01:01,230 So let me, for instance, get one string from the user called s. 2632 02:01:01,230 --> 02:01:04,890 Just for the sake of discussion, let me get another string from the user 2633 02:01:04,890 --> 02:01:07,830 called t so that we can actually do some comparison here. 2634 02:01:07,830 --> 02:01:12,780 And if s equals equals t, let's go ahead and print out that they're the same. 2635 02:01:12,780 --> 02:01:15,640 Else, let's go ahead and print out that they're different. 2636 02:01:15,640 --> 02:01:17,910 So this is very similar to what we did in week 4. 2637 02:01:17,910 --> 02:01:20,580 But in week 4, recall we did this specifically 2638 02:01:20,580 --> 02:01:23,800 because we had encountered a problem.
2639 02:01:23,800 --> 02:01:28,680 For instance, if I run-- whoops. 2640 02:01:28,680 --> 02:01:34,970 If I run-- what's going on? 2641 02:01:34,970 --> 02:01:40,396 [INAUDIBLE] Come on. 2642 02:01:40,396 --> 02:01:41,390 Oh. 2643 02:01:41,390 --> 02:01:41,890 OK. 2644 02:01:41,890 --> 02:01:43,240 Wow, OK. 2645 02:01:43,240 --> 02:01:43,840 Long day. 2646 02:01:43,840 --> 02:01:44,380 All right. 2647 02:01:44,380 --> 02:01:48,670 If I run the proper command, python of compare.py, then let's go ahead 2648 02:01:48,670 --> 02:01:53,785 and type in something like "cat" in all lowercase, "cat" in all lowercase. 2649 02:01:53,785 --> 02:01:56,110 And they're the same. 2650 02:01:56,110 --> 02:01:59,565 If, though, I do this again with "dog" and "dog," they're the same. 2651 02:01:59,565 --> 02:02:01,690 And, of course, "cat" and "dog," they're different. 2652 02:02:01,690 --> 02:02:06,430 But does anyone recall, from two weeks ago, when I typed in my name twice, 2653 02:02:06,430 --> 02:02:08,680 both identically capitalized. 2654 02:02:08,680 --> 02:02:10,360 What did it say? 2655 02:02:10,360 --> 02:02:13,390 That they were, in fact, different. 2656 02:02:13,390 --> 02:02:14,110 And why was that? 2657 02:02:14,110 --> 02:02:16,660 Why were two strings in C different, even though I typed literally 2658 02:02:16,660 --> 02:02:17,410 the same thing? 2659 02:02:17,410 --> 02:02:20,040 2660 02:02:20,040 --> 02:02:21,540 Two different places in memory. 2661 02:02:21,540 --> 02:02:24,560 So each string might look the same, aesthetically, but, of course, 2662 02:02:24,560 --> 02:02:25,852 was stored elsewhere in memory. 2663 02:02:25,852 --> 02:02:29,970 And yet, Python appears to be using the equality operator-- 2664 02:02:29,970 --> 02:02:33,510 equals equals-- like you and I would expect, as humans-- actually 2665 02:02:33,510 --> 02:02:38,510 comparing for us char by char in each of those strings for actual equality.
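A small sketch of the comparison described above. The hypothetical compare function takes the two strings as parameters instead of reading them with input, and the third call builds "David" from pieces only to illustrate that == looks at the characters, not at where a string happens to be stored.

```python
def compare(s, t):
    # == compares the contents of the two strings.
    if s == t:
        return "Same"
    return "Different"


print(compare("cat", "cat"))
print(compare("cat", "dog"))

# Built at run time from two pieces, then compared with a literal.
built = "".join(["Da", "vid"])
print(compare("David", built))
```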
2666 02:02:38,510 --> 02:02:41,610 So this is a feature of Python, in that it's just easier to do. 2667 02:02:41,610 --> 02:02:42,210 And why? 2668 02:02:42,210 --> 02:02:44,627 Well, this derives from the reality that, in Python, there 2669 02:02:44,627 --> 02:02:45,630 are no pointers anymore. 2670 02:02:45,630 --> 02:02:47,297 There's no underlying memory management. 2671 02:02:47,297 --> 02:02:50,400 It's not up to you, now, to worry about those lower-level details. 2672 02:02:50,400 --> 02:02:52,960 The language itself takes care of that for you. 2673 02:02:52,960 --> 02:02:55,050 And so, similarly, if I do this and don't 2674 02:02:55,050 --> 02:02:57,510 ask the user for two strings, but just one, 2675 02:02:57,510 --> 02:02:59,370 and then, I do something like this. 2676 02:02:59,370 --> 02:03:05,550 How about give myself a second variable t, set it equal to s.capitalize, which, 2677 02:03:05,550 --> 02:03:08,040 note, is not the same as upper; capitalize, by design, 2678 02:03:08,040 --> 02:03:12,270 per Python's documentation, will only capitalize the first letter for you-- 2679 02:03:12,270 --> 02:03:15,240 I can now print out, say, two fstrings here-- 2680 02:03:15,240 --> 02:03:18,240 what the value of s is and, then, let me print out, 2681 02:03:18,240 --> 02:03:20,340 with another fstring, what the value of t is. 2682 02:03:20,340 --> 02:03:22,995 And recall that, in C, this was a problem 2683 02:03:22,995 --> 02:03:26,820 because if you capitalize s and store it in t, 2684 02:03:26,820 --> 02:03:29,670 we accidentally capitalized both s and t. 2685 02:03:29,670 --> 02:03:33,510 But in this case, in Python, when I actually run this and type in "cat" 2686 02:03:33,510 --> 02:03:37,770 in all lowercase, the original s is unchanged 2687 02:03:37,770 --> 02:03:42,780 because, when I use capitalize on line 3, this is, indeed, capitalizing s. 2688 02:03:42,780 --> 02:03:47,550 But it's returning a copy of the result.
It cannot change s itself 2689 02:03:47,550 --> 02:03:50,385 because, again, for that technical term, s is immutable. 2690 02:03:50,385 --> 02:03:53,265 Strings, once they exist, cannot be changed themselves. 2691 02:03:53,265 --> 02:03:58,590 But you can return copies and modify mutated copies of those same strings. 2692 02:03:58,590 --> 02:04:02,040 So, in short, all of those headaches we encountered in week 4 2693 02:04:02,040 --> 02:04:05,070 are now solved, really, in the way you might expect. 2694 02:04:05,070 --> 02:04:07,500 And here's another one that we dwelled on in week 4, 2695 02:04:07,500 --> 02:04:09,660 with the colored liquid in glasses. 2696 02:04:09,660 --> 02:04:12,150 Let me code up a program called swap.py. 2697 02:04:12,150 --> 02:04:16,690 And in swap.py, let me set x equal to 1, y equal to 2. 2698 02:04:16,690 --> 02:04:18,690 And then, let me just print out an fstring here. 2699 02:04:18,690 --> 02:04:24,360 So how about x is this comma y is that. 2700 02:04:24,360 --> 02:04:27,735 And then, let me do that twice, just for the sake of demonstration. 2701 02:04:27,735 --> 02:04:31,005 And in here, recall that we had to create a swap function. 2702 02:04:31,005 --> 02:04:33,630 But then, we had to pass it in by reference with the ampersand. 2703 02:04:33,630 --> 02:04:38,460 And oh my god, that was peak complexity in C. Well, 2704 02:04:38,460 --> 02:04:41,100 if you want to swap x and y in Python, you 2705 02:04:41,100 --> 02:04:43,830 could do x comma y equals y comma x. 2706 02:04:43,830 --> 02:04:49,020 And now, python of swap.py. 2707 02:04:49,020 --> 02:04:50,130 And there we go. 2708 02:04:50,130 --> 02:04:51,840 All of that's handled for you. 2709 02:04:51,840 --> 02:04:56,350 It's like a shell game without even a temporary variable in mind. 2710 02:04:56,350 --> 02:04:58,290 So what more can we do here? 2711 02:04:58,290 --> 02:05:00,870 How about a few final building blocks? 
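Both ideas above, sketched together. The starting values are hard-coded here (an assumption; the lecture reads s with input).

```python
# capitalize returns a new string; s itself is immutable and unchanged.
s = "cat"
t = s.capitalize()
print(f"s: {s}")
print(f"t: {t}")

# Swapping two variables without a temporary variable.
x = 1
y = 2
print(f"x is {x}, y is {y}")
x, y = y, x
print(f"x is {x}, y is {y}")
```

The right-hand side of x, y = y, x is evaluated in full before either name is reassigned, which is why no temporary variable is needed.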
2712 02:05:00,870 --> 02:05:03,330 And these relate, now, to files from that week 4. 2713 02:05:03,330 --> 02:05:07,710 Suppose that I want to save some names and numbers in a CSV file-- 2714 02:05:07,710 --> 02:05:11,080 Comma Separated Values, which is like a very lightweight spreadsheet. 2715 02:05:11,080 --> 02:05:15,300 Well, first, let me create a phonebook.csv file 2716 02:05:15,300 --> 02:05:19,458 that just has name comma number as the first row there. 2717 02:05:19,458 --> 02:05:21,750 But after that, I'm going to go ahead, now, and code up 2718 02:05:21,750 --> 02:05:25,170 a phonebook.py program that actually allows 2719 02:05:25,170 --> 02:05:27,040 me to add things to this phonebook. 2720 02:05:27,040 --> 02:05:31,020 So let me split my screen here so that we can see the old and the new. 2721 02:05:31,020 --> 02:05:34,050 And down here, in my code for phonebook.py, 2722 02:05:34,050 --> 02:05:36,360 in this new and improved version, I'm going 2723 02:05:36,360 --> 02:05:40,020 to actually import a whole other library, this one called csv. 2724 02:05:40,020 --> 02:05:42,885 And here, too, people in data science and the like, especially, 2725 02:05:42,885 --> 02:05:46,500 really like being able to manipulate files and data that might very well be 2726 02:05:46,500 --> 02:05:48,060 stored in spreadsheets or CSVs-- 2727 02:05:48,060 --> 02:05:51,510 Comma Separated Values, which we saw briefly in week 4. 2728 02:05:51,510 --> 02:05:53,670 In phonebook.py, then, it suffices to just 2729 02:05:53,670 --> 02:05:57,348 import csv after reading the documentation for it 2730 02:05:57,348 --> 02:05:59,265 because this is going to give me functionality 2731 02:05:59,265 --> 02:06:02,150 in code related to CSV files. 2732 02:06:02,150 --> 02:06:04,950 So here's how I might open a file in Python.
2733 02:06:04,950 --> 02:06:08,340 I literally call open-- it's not fopen now; it's just open-- 2734 02:06:08,340 --> 02:06:10,860 and I open this file called phonebook.csv. 2735 02:06:10,860 --> 02:06:13,470 And just as in C, I'm going to open it in append mode-- 2736 02:06:13,470 --> 02:06:15,930 not write, where it would change the whole thing. 2737 02:06:15,930 --> 02:06:18,660 I want to append a new line at a time. 2738 02:06:18,660 --> 02:06:21,750 After this, I want to get, maybe, a name from the user. 2739 02:06:21,750 --> 02:06:25,350 So let's prompt the user for some input for their name. 2740 02:06:25,350 --> 02:06:27,255 And then, let's prompt the user for a number, 2741 02:06:27,255 --> 02:06:31,060 as well, using input prompting for number. 2742 02:06:31,060 --> 02:06:31,560 All right. 2743 02:06:31,560 --> 02:06:33,602 And now, this is a little cryptic, and you'd only 2744 02:06:33,602 --> 02:06:35,050 know this from the documentation. 2745 02:06:35,050 --> 02:06:38,370 But if you want to write rows to a CSV file 2746 02:06:38,370 --> 02:06:41,850 that you can, then, view in Excel or the like, you can do this-- 2747 02:06:41,850 --> 02:06:45,060 give me a variable called writer-- but I could call it anything I want. 2748 02:06:45,060 --> 02:06:50,760 Let me use a csv.writer function that comes with this csv library, 2749 02:06:50,760 --> 02:06:51,885 passing in the file. 2750 02:06:51,885 --> 02:06:56,070 This is like saying, hey, Python, treat this open file as a CSV file 2751 02:06:56,070 --> 02:06:59,340 so that things are separated with commas and nicely formatted 2752 02:06:59,340 --> 02:07:00,515 in rows and columns. 2753 02:07:00,515 --> 02:07:02,100 Now, I'm going to do this-- 2754 02:07:02,100 --> 02:07:04,030 use that writer to write a row. 2755 02:07:04,030 --> 02:07:05,280 Well, what do I want to write?
2756 02:07:05,280 --> 02:07:07,380 I want to write a short list-- 2757 02:07:07,380 --> 02:07:10,200 namely, the current name and the current number-- 2758 02:07:10,200 --> 02:07:14,790 to that file, but I don't want to use fprintf and %s and all of that stuff 2759 02:07:14,790 --> 02:07:16,440 that we might have had in the past. 2760 02:07:16,440 --> 02:07:19,030 And now, I just want to close the file. 2761 02:07:19,030 --> 02:07:20,410 Let me reopen my terminal. 2762 02:07:20,410 --> 02:07:26,102 Let me run python of phonebook.py, and let me type in David and then 2763 02:07:26,102 --> 02:07:30,190 +1-949-468-2750 and, crossing my fingers, 2764 02:07:30,190 --> 02:07:33,430 watching the actual CSV at top-left. 2765 02:07:33,430 --> 02:07:35,737 My code has just added me to the file. 2766 02:07:35,737 --> 02:07:37,570 And if I were to run it again, for instance, 2767 02:07:37,570 --> 02:07:41,770 with Carter and +1-617-495-1000, crossing my fingers again-- 2768 02:07:41,770 --> 02:07:42,820 we've updated the file. 2769 02:07:42,820 --> 02:07:46,150 And it turns out, there's code now, via which I can even read that file. 2770 02:07:46,150 --> 02:07:48,850 But I can, first, tighten this up, just so you've seen it. 2771 02:07:48,850 --> 02:07:52,720 It turns out, in Python, it's so common to open files and close them. 2772 02:07:52,720 --> 02:07:54,610 Humans make mistakes, and they often forget 2773 02:07:54,610 --> 02:07:58,477 to close files, which might, then, end up using more memory than you intend. 2774 02:07:58,477 --> 02:08:00,310 So you can, alternatively, do this in Python 2775 02:08:00,310 --> 02:08:03,310 so that you don't have to worry about closing files. 2776 02:08:03,310 --> 02:08:05,920 You can use this keyword instead. 2777 02:08:05,920 --> 02:08:09,100 You can say with the opening of this file 2778 02:08:09,100 --> 02:08:13,420 as a variable called file do all of the following underneath. 
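A sketch of the appending version of phonebook.py, written with the with keyword just introduced. Several things here are assumptions for the sake of a runnable example: the helper name add_entry, the use of a temporary directory instead of the current folder, the fixed names and numbers in place of input, and the newline="" argument, which the csv module's documentation recommends when opening files but which is not mentioned in the lecture.

```python
import csv
import os
import tempfile


def add_entry(path, name, number):
    # "a" opens the file in append mode; the file is closed
    # automatically when the indented block ends.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "phonebook.csv")

    # Start with just the header row, as in the lecture.
    with open(path, "w", newline="") as file:
        file.write("name,number\n")

    add_entry(path, "David", "+1-949-468-2750")
    add_entry(path, "Carter", "+1-617-495-1000")

    # Read the file back to see what was written.
    with open(path, newline="") as file:
        rows = list(csv.reader(file))

for row in rows:
    print(row)
```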
2779 02:08:13,420 --> 02:08:15,470 So I'm indenting most of my code. 2780 02:08:15,470 --> 02:08:18,430 I'm using this new, Python-specific keyword called with. 2781 02:08:18,430 --> 02:08:22,330 And this is just a matter of saying, with the following opening of the file, 2782 02:08:22,330 --> 02:08:26,120 do those next four lines of code, and then, automatically close it for me 2783 02:08:26,120 --> 02:08:27,370 at the end of the indentation. 2784 02:08:27,370 --> 02:08:31,480 It's a minor optimization, but this, again, is the pythonic way 2785 02:08:31,480 --> 02:08:33,250 to do things, instead. 2786 02:08:33,250 --> 02:08:34,720 How else might I do this, too? 2787 02:08:34,720 --> 02:08:38,860 Well, it turns out that the code I've written here-- on line 9, 2788 02:08:38,860 --> 02:08:40,630 especially-- is a little fragile. 2789 02:08:40,630 --> 02:08:44,350 If any human opens this spreadsheet-- the CSV file in Excel, 2790 02:08:44,350 --> 02:08:46,000 Google Spreadsheets, Apple Numbers-- 2791 02:08:46,000 --> 02:08:49,390 and maybe moves the columns around just because, maybe, they're fussing. 2792 02:08:49,390 --> 02:08:52,790 They saved it, and they don't realize they've, now, changed my assumptions. 2793 02:08:52,790 --> 02:08:55,120 I don't want to, necessarily, write name and number 2794 02:08:55,120 --> 02:08:58,360 always in that order because what if someone screws up and flips those two 2795 02:08:58,360 --> 02:09:01,040 columns by literally dragging and dropping? 2796 02:09:01,040 --> 02:09:03,640 So it turns out that, instead of using a list here, 2797 02:09:03,640 --> 02:09:06,890 we can use another feature of this library, as follows. 2798 02:09:06,890 --> 02:09:09,520 Instead of using a writer, there's something 2799 02:09:09,520 --> 02:09:11,530 called a dictionary writer or dict writer 2800 02:09:11,530 --> 02:09:14,140 that takes the same argument as input-- 2801 02:09:14,140 --> 02:09:15,580 the file that's opened.
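The tightened-up form using the with keyword could be sketched as follows. As before, add_entry is a hypothetical name and newline="" is an assumption taken from the csv documentation rather than from the lecture.

```python
import csv


def add_entry(path, name, number):
    """Append one [name, number] row, letting `with` close the file."""
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
    # The file is closed automatically once the indented block ends,
    # even if an exception was raised inside it.
    return file.closed
```

Returning file.closed is only there to make the automatic close observable; the lecture's version has no return value.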
2802 02:09:15,580 --> 02:09:18,070 But now, the one difference here is that you 2803 02:09:18,070 --> 02:09:25,030 need to tell this dictionary writer that your field names are name and number. 2804 02:09:25,030 --> 02:09:27,370 And let me close the CSV here. 2805 02:09:27,370 --> 02:09:32,140 Name and number are the names of the fields, the columns in this CSV file. 2806 02:09:32,140 --> 02:09:34,450 And when it comes time to write a new row, 2807 02:09:34,450 --> 02:09:37,750 the syntax here is going to be a little uglier, but it's just a dictionary. 2808 02:09:37,750 --> 02:09:40,120 The name I want to write to the dictionary 2809 02:09:40,120 --> 02:09:42,310 is going to be whatever name the human typed in. 2810 02:09:42,310 --> 02:09:45,790 The number that I want to write to the CSV file 2811 02:09:45,790 --> 02:09:48,550 is going to be whatever the number the human typed in. 2812 02:09:48,550 --> 02:09:51,010 But what's different, now, about this code is, 2813 02:09:51,010 --> 02:09:55,960 by simply using a dictionary writer here instead of the generic writer, 2814 02:09:55,960 --> 02:10:00,640 now, the columns can be in this order or this order or any order. 2815 02:10:00,640 --> 02:10:03,010 And the dictionary writer is going to figure out, 2816 02:10:03,010 --> 02:10:06,557 based on the first line of text in that CSV, where to put name, 2817 02:10:06,557 --> 02:10:07,390 where to put number. 2818 02:10:07,390 --> 02:10:08,883 So if you flip them, no big deal. 2819 02:10:08,883 --> 02:10:11,050 It's going to notice, oh, wait, the columns changed. 2820 02:10:11,050 --> 02:10:14,330 And it's going to insert the columns correctly. 2821 02:10:14,330 --> 02:10:18,970 So just, again, another more powerful feature that lets you 2822 02:10:18,970 --> 02:10:22,750 focus on real work, as opposed to actually getting 2823 02:10:22,750 --> 02:10:27,250 tied up in the weeds of writing code like this, otherwise. 
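A minimal sketch of the dictionary-writer version is below. The add_entry name and the fieldnames parameter are hypothetical. One caveat, going by the csv module's documented behavior: csv.DictWriter places each value according to the fieldnames list it is given and does not read the header row already in the file, so the fieldnames order should match the file's current column order.

```python
import csv


def add_entry(path, name, number, fieldnames=("name", "number")):
    """Append one row, keyed by column name instead of by position."""
    with open(path, "a", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=list(fieldnames))
        # Keys are matched to columns by name; the row is written in
        # whatever order fieldnames specifies.
        writer.writerow({"name": name, "number": number})
```

If the columns in the file were flipped, passing fieldnames=("number", "name") would write the values in the flipped order without touching the writerow call.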
2824 02:10:27,250 --> 02:10:30,440 Questions on this one, as well? 2825 02:10:30,440 --> 02:10:33,520 But what we will do, now, is come full circle 2826 02:10:33,520 --> 02:10:37,180 to some of the more sophisticated examples with which we began, 2827 02:10:37,180 --> 02:10:40,855 and I'm going to go back over to my own Mac laptop 2828 02:10:40,855 --> 02:10:43,743 here, where I've got my own terminal window up and running, 2829 02:10:43,743 --> 02:10:46,285 and I was just going to introduce a couple of final libraries 2830 02:10:46,285 --> 02:10:49,788 that really speak to just how powerful Python can be 2831 02:10:49,788 --> 02:10:51,580 and how quickly you can get up and running. 2832 02:10:51,580 --> 02:10:54,330 To be fair, can't necessarily do all of these things in the cloud, 2833 02:10:54,330 --> 02:10:57,337 like in Codespaces, because you need access to your own speakers 2834 02:10:57,337 --> 02:10:58,420 or microphone or the like. 2835 02:10:58,420 --> 02:11:01,090 So that's why I'm doing it on my own Mac, here. 2836 02:11:01,090 --> 02:11:05,680 But let me go ahead and open up a program called speech.py. 2837 02:11:05,680 --> 02:11:07,300 And I'm not using VS Code here. 2838 02:11:07,300 --> 02:11:10,150 I'm using a program called vi that's entirely terminal window based. 2839 02:11:10,150 --> 02:11:13,105 But it's going to allow me, for instance, to import the Python 2840 02:11:13,105 --> 02:11:16,120 text to speech version 3 library. 2841 02:11:16,120 --> 02:11:18,790 I'm going to give myself a variable called engine that's 2842 02:11:18,790 --> 02:11:21,610 going to be set equal to the Python text to speech 2843 02:11:21,610 --> 02:11:26,350 3 library's init method, which is just going to initialize this library that 2844 02:11:26,350 --> 02:11:28,090 relates to text to speech. 2845 02:11:28,090 --> 02:11:32,410 I'm going to, then, use the engine's say function to say something 2846 02:11:32,410 --> 02:11:35,260 like, how about, hello comma world.
2847 02:11:35,260 --> 02:11:39,850 And then, as my last line, I'm going to say engine.runAndWait, capitalized 2848 02:11:39,850 --> 02:11:44,690 as such, to tell my program, now, to run that speech and wait until it's done. 2849 02:11:44,690 --> 02:11:45,190 All right. 2850 02:11:45,190 --> 02:11:46,540 I'm going to save this file. 2851 02:11:46,540 --> 02:11:49,110 I'm going to run python of speech.py. 2852 02:11:49,110 --> 02:11:52,357 And I'm going to cross my fingers, as always, and-- 2853 02:11:52,357 --> 02:11:53,440 INTERPRETER: Hello, world. 2854 02:11:53,440 --> 02:11:54,398 DAVID MALAN: All right. 2855 02:11:54,398 --> 02:11:57,130 So now, I have a program that's actually synthesizing speech 2856 02:11:57,130 --> 02:11:58,570 using a library like this. 2857 02:11:58,570 --> 02:12:01,285 How can I, now, modify this to be a little more interesting? 2858 02:12:01,285 --> 02:12:02,690 Well, how about this? 2859 02:12:02,690 --> 02:12:05,050 Let me go ahead and prompt the user for their name, 2860 02:12:05,050 --> 02:12:08,680 like we've done several times here, using Python's built-in input function. 2861 02:12:08,680 --> 02:12:11,665 And now, let me go ahead and use a format string in conjunction 2862 02:12:11,665 --> 02:12:14,980 with this library, interpolating the value of name there. 2863 02:12:14,980 --> 02:12:18,460 And-- at least, if my name is somewhat phonetically pronounceable-- 2864 02:12:18,460 --> 02:12:23,587 let's go ahead and run python of speech.py, type in my name, and-- 2865 02:12:23,587 --> 02:12:24,670 INTERPRETER: Hello, David. 2866 02:12:24,670 --> 02:12:25,445 DAVID MALAN: OK. 2867 02:12:25,445 --> 02:12:27,640 It's a weird choice of inflection, but we're 2868 02:12:27,640 --> 02:12:30,475 starting to synthesize voice, not unlike Siri or Google Assistant 2869 02:12:30,475 --> 02:12:32,050 or Alexa or the like. 2870 02:12:32,050 --> 02:12:36,130 Now, we can, maybe, do something a little more advanced, too.
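The speech.py steps just described can be sketched as below, assuming the third-party pyttsx3 package is installed and the machine has audio output. The speak function and its optional engine parameter are hypothetical conveniences, not part of the lecture's program.

```python
def speak(text, engine=None):
    """Say text aloud with a text-to-speech engine."""
    if engine is None:
        # Imported lazily: pyttsx3 is third-party and needs local speakers.
        import pyttsx3
        engine = pyttsx3.init()
    engine.say(text)       # queue the phrase
    engine.runAndWait()    # play it and block until speech finishes
```

Calling speak("hello, world") would mirror the first run; the interactive variant would be along the lines of name = input("What's your name? ") followed by speak(f"hello, {name}"), where the prompt text is assumed.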
2871 02:12:36,130 --> 02:12:39,310 In addition to synthesizing speech in this way, 2872 02:12:39,310 --> 02:12:43,270 we could synthesize, for instance, an actual graphic. 2873 02:12:43,270 --> 02:12:45,740 Let me go ahead, now, and do something like this. 2874 02:12:45,740 --> 02:12:48,760 Let me create a program called qr.py. 2875 02:12:48,760 --> 02:12:50,890 I'm going to go ahead and import a library called 2876 02:12:50,890 --> 02:12:54,860 OS, which gives you access to operating system related functionality in Python. 2877 02:12:54,860 --> 02:12:56,860 I'm going to import a library I've pre-installed 2878 02:12:56,860 --> 02:12:59,830 called qrcode, which is a two-dimensional barcode that you 2879 02:12:59,830 --> 02:13:01,300 might have seen in the real world. 2880 02:13:01,300 --> 02:13:03,715 I'm going to go ahead and create an image variable using 2881 02:13:03,715 --> 02:13:08,260 this qrcode library's make function, which, per its documentation, 2882 02:13:08,260 --> 02:13:10,365 takes a URL, like one of CS50's own videos. 2883 02:13:10,365 --> 02:13:23,003 So we'll do this with youtu.be/xvF2joSPgG0. 2884 02:13:23,003 --> 02:13:24,670 So, hopefully, that's the right lecture. 2885 02:13:24,670 --> 02:13:27,160 And now, we've got img.save, which is going to allow 2886 02:13:27,160 --> 02:13:30,130 me to create a file called qr.png. 2887 02:13:30,130 --> 02:13:33,460 Think back, now, on problem set 4 and how painful it was to save files. 2888 02:13:33,460 --> 02:13:36,940 We'll just use the save function, now, in Python and save this as a PNG file-- 2889 02:13:36,940 --> 02:13:38,260 Portable Network Graphic. 2890 02:13:38,260 --> 02:13:43,420 And then, lastly, let's just go ahead and open with the command open qr.png 2891 02:13:43,420 --> 02:13:46,120 on my Mac so that, hopefully, this just automatically opens. 2892 02:13:46,120 --> 02:13:46,660 All right.
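The qr.py steps above might be sketched like this, assuming the third-party qrcode package (with its image support) is installed. The save_qr and show_on_mac names and the make parameter are hypothetical; the https:// prefix on the URL is also an assumption, since the lecture only reads out youtu.be/xvF2joSPgG0.

```python
import os


def save_qr(url, path, make=None):
    """Encode url as a QR code image and save it to path."""
    if make is None:
        # Imported lazily: qrcode is a third-party package.
        import qrcode
        make = qrcode.make
    img = make(url)   # build the QR code image from the URL
    img.save(path)    # write it out, e.g. as a PNG file
    return path


def show_on_mac(path):
    """Open the file with macOS's `open` command, as in the lecture."""
    os.system(f"open {path}")
```

Running save_qr("https://youtu.be/xvF2joSPgG0", "qr.png") and then show_on_mac("qr.png") would correspond to the program described; the open command is specific to macOS.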
2893 02:13:46,660 --> 02:13:49,300 I'm going to go ahead and just double-check my syntax here 2894 02:13:49,300 --> 02:13:51,280 so that I haven't made any mistakes. 2895 02:13:51,280 --> 02:13:54,235 I'm going to go ahead and run python of qr.py. 2896 02:13:54,235 --> 02:13:55,810 Enter. 2897 02:13:55,810 --> 02:13:57,223 That opens up this. 2898 02:13:57,223 --> 02:13:58,390 Let me go ahead and zoom in. 2899 02:13:58,390 --> 02:14:03,750 If you've got a phone handy and you'd like to scan this code here, 2900 02:14:03,750 --> 02:14:07,131 whether in person or online-- 2901 02:14:07,131 --> 02:14:08,095 I apologize. 2902 02:14:08,095 --> 02:14:09,130 You won't appreciate it. 2903 02:14:09,130 --> 02:14:11,640 2904 02:14:11,640 --> 02:14:12,140 Amazing! 2905 02:14:12,140 --> 02:14:13,600 OK. 2906 02:14:13,600 --> 02:14:17,230 And, lastly, let me go back into our speech example 2907 02:14:17,230 --> 02:14:21,400 here, create a final ending here in our final moments. 2908 02:14:21,400 --> 02:14:26,060 And how about we just say something like "This was CS50," like this. 2909 02:14:26,060 --> 02:14:27,087 Let's go ahead, here. 2910 02:14:27,087 --> 02:14:28,795 Fix my capitalization, just for tidiness. 2911 02:14:28,795 --> 02:14:29,878 Let's get rid of the name. 2912 02:14:29,878 --> 02:14:33,840 And now, with our final flourish and your introduction to Python equipped-- 2913 02:14:33,840 --> 02:14:35,230 here we go-- 2914 02:14:35,230 --> 02:14:36,535 INTERPRETER: This was CS50. 2915 02:14:36,535 --> 02:14:37,000 DAVID MALAN: All right. 2916 02:14:37,000 --> 02:14:38,000 We'll see you next time. 2917 02:14:38,000 --> 02:14:39,460 [APPLAUSE] 2918 02:14:39,460 --> 02:14:41,860 2919 02:14:41,860 --> 02:14:45,210 [MUSIC PLAYING] 2920 02:14:45,210 --> 02:15:18,000