1 00:00:00,000 --> 00:00:05,988 [MUSIC PLAYING] 2 00:00:05,988 --> 00:01:17,990 3 00:01:17,990 --> 00:01:21,800 DAVID J. MALAN: All right, this is CS50, and this is already week 6. 4 00:01:21,800 --> 00:01:24,458 And this is the week in which you learn yet another language. 5 00:01:24,458 --> 00:01:26,750 But the goal is not just to teach you another language, 6 00:01:26,750 --> 00:01:29,480 for language's sake, as we transition today 7 00:01:29,480 --> 00:01:32,780 and in the coming weeks from C, where we've spent the past several weeks, now 8 00:01:32,780 --> 00:01:33,440 to Python. 9 00:01:33,440 --> 00:01:37,530 The goal ultimately is to teach you all how to teach yourselves new languages, 10 00:01:37,530 --> 00:01:40,020 so that by the end of this course, it's not in your mind, 11 00:01:40,020 --> 00:01:42,710 the fact that you learned how to program in C 12 00:01:42,710 --> 00:01:44,960 or learned some weeks back how to program in Scratch, 13 00:01:44,960 --> 00:01:48,170 but really how you learned how to program fundamentally, 14 00:01:48,170 --> 00:01:50,630 in a paradigm known as procedural programming, 15 00:01:50,630 --> 00:01:53,450 as well as with some taste today, and in the weeks to come, 16 00:01:53,450 --> 00:01:55,310 of other aspects of programming languages, 17 00:01:55,310 --> 00:01:58,010 like object-oriented programming, and more. 18 00:01:58,010 --> 00:02:00,180 So recall, though, back in week zero, Hello, world 19 00:02:00,180 --> 00:02:01,680 looked a little something like this. 20 00:02:01,680 --> 00:02:03,387 And the world was quite simple. 21 00:02:03,387 --> 00:02:05,720 All you had to do was drag and drop these puzzle pieces. 22 00:02:05,720 --> 00:02:08,960 But there were still functions and conditionals and loops and variables 23 00:02:08,960 --> 00:02:11,030 and all of those kinds of primitives. 
24 00:02:11,030 --> 00:02:14,300 We then transitioned, of course, to a much more arcane language that 25 00:02:14,300 --> 00:02:15,840 looked a little something like this. 26 00:02:15,840 --> 00:02:17,798 And even now, some weeks later, you might still 27 00:02:17,798 --> 00:02:20,470 be struggling with some of the syntax or getting annoying bugs 28 00:02:20,470 --> 00:02:22,970 when you try to compile your code, and it just doesn't work. 29 00:02:22,970 --> 00:02:24,800 But there, too, the past few weeks, we've 30 00:02:24,800 --> 00:02:28,130 been focusing on functions and loops and variables, conditionals, and really 31 00:02:28,130 --> 00:02:29,550 all of those same ideas. 32 00:02:29,550 --> 00:02:33,710 And so what we begin to do today is to, one, simplify the language 33 00:02:33,710 --> 00:02:38,840 we're using, transitioning from C now to Python, this now being the equivalent 34 00:02:38,840 --> 00:02:42,200 program in Python, and look at its relative simplicity, 35 00:02:42,200 --> 00:02:43,940 but also transitioning to look at how you 36 00:02:43,940 --> 00:02:45,800 can implement these same kinds of features, 37 00:02:45,800 --> 00:02:47,430 just using a different language. 38 00:02:47,430 --> 00:02:49,250 So we're going to see a lot of code today. 39 00:02:49,250 --> 00:02:53,150 And you won't have nearly as much practice with Python as you did with C. 40 00:02:53,150 --> 00:02:56,210 But that's because so many of the ideas are still going to be with us. 41 00:02:56,210 --> 00:02:58,580 And, really, it's going to be a process of figuring out, all right, 42 00:02:58,580 --> 00:02:59,413 I want to do a loop. 43 00:02:59,413 --> 00:03:01,760 I know how to do it in C. How do I do this in Python? 44 00:03:01,760 --> 00:03:02,990 How do I do the same with conditionals? 
45 00:03:02,990 --> 00:03:04,710 How do I declare variables, and the like, 46 00:03:04,710 --> 00:03:07,460 and moving forward, not just in CS50, but in life in general, 47 00:03:07,460 --> 00:03:10,760 if you continue programming and learn some other language after the class, 48 00:03:10,760 --> 00:03:14,270 if in 5-10 years, there's a new, more popular language that you pick up, 49 00:03:14,270 --> 00:03:16,520 it's just going to be a matter of googling and looking 50 00:03:16,520 --> 00:03:18,410 at websites like Stack Overflow and the like, 51 00:03:18,410 --> 00:03:21,350 to look at just basic building blocks of programming languages, 52 00:03:21,350 --> 00:03:24,680 because you already speak, after these past 6 plus weeks, 53 00:03:24,680 --> 00:03:27,500 you already speak programming itself fundamentally. 54 00:03:27,500 --> 00:03:31,070 All right, so let's do a few quick comparisons, left and right, of what 55 00:03:31,070 --> 00:03:32,960 something might have looked like in Scratch, 56 00:03:32,960 --> 00:03:34,820 and what it then looked like in C, but now, 57 00:03:34,820 --> 00:03:36,770 as of today, what it's going to look like in Python. 58 00:03:36,770 --> 00:03:38,853 Then we'll turn our attention to the command line, 59 00:03:38,853 --> 00:03:42,510 ultimately, in order to implement some actual programs. 60 00:03:42,510 --> 00:03:45,740 So in Scratch, we had functions like this, say Hello, 61 00:03:45,740 --> 00:03:47,270 world, a verb or an action. 62 00:03:47,270 --> 00:03:49,740 In C it looked a little something like this, 63 00:03:49,740 --> 00:03:53,150 and a bit of a cryptic mess the first week, you had the printf, 64 00:03:53,150 --> 00:03:54,290 you had the double quotes. 65 00:03:54,290 --> 00:03:55,980 You had the semicolon, the parentheses. 66 00:03:55,980 --> 00:03:58,423 So there's a lot more syntax just to do the same thing. 
67 00:03:58,423 --> 00:04:01,340 We're not going to get rid of all of that syntax now, but as of today, 68 00:04:01,340 --> 00:04:05,580 in Python, that same statement is going to look a little something like this. 69 00:04:05,580 --> 00:04:07,640 And just to perhaps call out the obvious, what 70 00:04:07,640 --> 00:04:12,050 is different or, now, simpler in Python versus C, even 71 00:04:12,050 --> 00:04:13,640 in this simple example here? 72 00:04:13,640 --> 00:04:14,545 Yeah. 73 00:04:14,545 --> 00:04:17,420 AUDIENCE: Now print, instead of printf would be, something like that. 74 00:04:17,420 --> 00:04:19,837 DAVID J. MALAN: Good, so it's now print instead of printf. 75 00:04:19,837 --> 00:04:21,110 And there's also no semicolon. 76 00:04:21,110 --> 00:04:23,103 And there's one other subtlety, over here. 77 00:04:23,103 --> 00:04:24,020 AUDIENCE: No new line. 78 00:04:24,020 --> 00:04:25,640 DAVID J. MALAN: Yeah, so no new line, and that 79 00:04:25,640 --> 00:04:27,110 doesn't mean it's not going to be printed. 80 00:04:27,110 --> 00:04:29,402 It just turns out that one of the differences we'll see 81 00:04:29,402 --> 00:04:31,640 is that, with print, you get the new line for free. 82 00:04:31,640 --> 00:04:34,950 It automatically gets outputted by default, being sort of a common case. 83 00:04:34,950 --> 00:04:37,190 But you can override it, we'll see, ultimately, too. 84 00:04:37,190 --> 00:04:38,300 How about in Scratch? 85 00:04:38,300 --> 00:04:42,082 We had multiple functions like this, that not only said something 86 00:04:42,082 --> 00:04:43,790 on the screen, but also asked a question, 87 00:04:43,790 --> 00:04:47,300 thereby being another function that returned a value, called answer. 
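The print behavior described above can be sketched as follows. This is a minimal illustration, not the exact code from the lecture slides: it shows that print supplies the newline on its own and that the built-in end keyword argument overrides that default.

```python
# print adds a trailing newline by default, so no "\n" is needed,
# and no semicolon ends the statement.
print("hello, world")

# The end keyword argument overrides the default newline.
print("hello, ", end="")   # stay on the same line
print("world")             # the default newline applies again here
```

Both halves produce the same line of output; the second simply builds it with two calls.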
88 00:04:47,300 --> 00:04:49,730 In C we saw code that looked a little something 89 00:04:49,730 --> 00:04:53,420 like this, whereby that first line declares a variable called answer, 90 00:04:53,420 --> 00:04:55,790 sets it equal to the return value of get_string, 91 00:04:55,790 --> 00:04:57,740 one of the functions from the CS50 library, 92 00:04:57,740 --> 00:05:00,980 and then the same double quotes and parentheses and semicolon. 93 00:05:00,980 --> 00:05:05,390 Then we had this format code in C that allowed us, with %s, 94 00:05:05,390 --> 00:05:07,760 to actually print out that same value. 95 00:05:07,760 --> 00:05:10,400 In Python, this, too, is going to look a little bit simpler. 96 00:05:10,400 --> 00:05:13,460 Instead, we're going to have answer equals get_string, 97 00:05:13,460 --> 00:05:16,070 quote unquote "What's your name," and then print, 98 00:05:16,070 --> 00:05:18,870 with a plus sign and a little bit of new syntax. 99 00:05:18,870 --> 00:05:21,650 But let's see if we can't just infer from this example what 100 00:05:21,650 --> 00:05:22,860 it is that's going on. 101 00:05:22,860 --> 00:05:25,670 Well, first missing on the left is what? 102 00:05:25,670 --> 00:05:28,620 To the left of the equal sign, there's no what this time? 103 00:05:28,620 --> 00:05:29,870 Feel free to just call it out. 104 00:05:29,870 --> 00:05:30,690 AUDIENCE: Type. 105 00:05:30,690 --> 00:05:31,460 DAVID J. MALAN: So there's no type. 106 00:05:31,460 --> 00:05:33,770 There's no type, like the word string, and 107 00:05:33,770 --> 00:05:38,090 even though that was a type from CS50, with every other variable in C 108 00:05:38,090 --> 00:05:41,437 we did use int or string or float or bool or something else. 109 00:05:41,437 --> 00:05:43,520 In Python, there are still going to be data types, 110 00:05:43,520 --> 00:05:45,980 today onward, but you, the programmer, don't 111 00:05:45,980 --> 00:05:49,042 have to bother telling the computer what types you're using. 
112 00:05:49,042 --> 00:05:50,750 The computer is going to be smart enough, 113 00:05:50,750 --> 00:05:53,240 the language, really, is going to be smart enough, to just figure it out 114 00:05:53,240 --> 00:05:54,260 from context. 115 00:05:54,260 --> 00:05:56,150 Meanwhile, on the right hand side, get_string 116 00:05:56,150 --> 00:05:57,858 is going to be a function we'll use today 117 00:05:57,858 --> 00:06:01,320 and this week, which comes from a Python version of the CS50 library. 118 00:06:01,320 --> 00:06:04,370 But we'll also start to take off those training wheels, so that you'll 119 00:06:04,370 --> 00:06:07,670 see how to do things without any CS50 library moving forward, 120 00:06:07,670 --> 00:06:09,290 using a different function instead. 121 00:06:09,290 --> 00:06:12,920 As before, no semicolon, but the rest of the syntax is pretty much the same 122 00:06:12,920 --> 00:06:13,430 here. 123 00:06:13,430 --> 00:06:16,013 This starts, of course, to get a little bit different, though. 124 00:06:16,013 --> 00:06:17,650 We're using print instead of printf. 125 00:06:17,650 --> 00:06:20,860 But now, even though this looks a little cryptic, 126 00:06:20,860 --> 00:06:23,110 perhaps, if you've never programmed before CS50, 127 00:06:23,110 --> 00:06:27,130 what might that plus be doing, just based on inference here? 128 00:06:27,130 --> 00:06:27,880 What do you think? 129 00:06:27,880 --> 00:06:31,720 AUDIENCE: Adding answer to the string Hello. 130 00:06:31,720 --> 00:06:34,990 DAVID J. MALAN: Yeah, so adding answer to the string Hello, 131 00:06:34,990 --> 00:06:37,030 and adding, so to speak, not mathematically, 132 00:06:37,030 --> 00:06:39,580 but in the form of joining them together, much like we 133 00:06:39,580 --> 00:06:43,040 saw the join block in Scratch, or concatenation was the term of art 134 00:06:43,040 --> 00:06:43,540 there. 
135 00:06:43,540 --> 00:06:46,810 This plus sign appends, if you will, whatever's 136 00:06:46,810 --> 00:06:48,625 in answer to whatever is quoted here. 137 00:06:48,625 --> 00:06:51,250 And I deliberately left a space there, so that grammatically it 138 00:06:51,250 --> 00:06:53,422 looks nice, after the comma as well. 139 00:06:53,422 --> 00:06:54,880 Now there's another way to do this. 140 00:06:54,880 --> 00:06:57,130 And it, too, is going to look cryptic at first glance. 141 00:06:57,130 --> 00:06:59,510 But it just gets easier and more convenient over time. 142 00:06:59,510 --> 00:07:04,580 You can also change this second line to be this, instead. 143 00:07:04,580 --> 00:07:05,770 So what's going on here? 144 00:07:05,770 --> 00:07:08,710 This is actually a relatively new feature of Python in the past couple 145 00:07:08,710 --> 00:07:11,020 of years, where now what you're seeing is, yes, 146 00:07:11,020 --> 00:07:13,580 a string, between these same double quotes, 147 00:07:13,580 --> 00:07:17,075 but this is what Python would call a format string, or f-string. 148 00:07:17,075 --> 00:07:20,200 And it literally starts with the letter f, which admittedly looks, I think, 149 00:07:20,200 --> 00:07:20,980 a little weird. 150 00:07:20,980 --> 00:07:24,700 But that just indicates that Python should 151 00:07:24,700 --> 00:07:29,110 assume that anything inside of curly braces inside of the string 152 00:07:29,110 --> 00:07:32,560 should be interpolated, so to speak, which is a fancy term saying, 153 00:07:32,560 --> 00:07:36,160 substitute the value of any variables therein. 154 00:07:36,160 --> 00:07:38,030 And it can do some other things as well. 155 00:07:38,030 --> 00:07:42,040 So answer is a variable, declared, of course, on this first line. 156 00:07:42,040 --> 00:07:46,300 This f-string, then, says to Python, print out Hello comma space, and then 157 00:07:46,300 --> 00:07:47,950 the value of answer. 
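The two ways of building the greeting described above can be sketched side by side. As an assumption made only to keep the example self-contained, the variable answer is given a fixed value here instead of being read from the user; the names greeting_plus and greeting_f are likewise illustrative.

```python
# Stand-in for the user's input (an assumption for this sketch);
# the lecture obtains this value from the user instead.
answer = "David"

# Concatenation: + joins two strings end to end.
greeting_plus = "hello, " + answer

# f-string: the expression inside {} is replaced by its value.
greeting_f = f"hello, {answer}"

print(greeting_plus)
print(greeting_f)
```

Both variables end up holding the same text, so the choice between them is mostly about readability as the strings get longer.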
158 00:07:47,950 --> 00:07:52,390 If, by contrast, you omitted the curly braces, 159 00:07:52,390 --> 00:07:54,040 just take a guess, what would happen? 160 00:07:54,040 --> 00:07:56,920 What would the symptom of that bug be, if you accidentally 161 00:07:56,920 --> 00:08:00,010 forgot the curly braces, but maybe still had the F there? 162 00:08:00,010 --> 00:08:01,750 AUDIENCE: It would print below it, too. 163 00:08:01,750 --> 00:08:04,300 DAVID J. MALAN: Yeah, it would literally print Hello, comma answer, because it's 164 00:08:04,300 --> 00:08:05,200 going to take you literally. 165 00:08:05,200 --> 00:08:07,690 So the curly braces just kind of allow you to plug things in. 166 00:08:07,690 --> 00:08:09,350 And, again, it looks a little more cryptic, 167 00:08:09,350 --> 00:08:11,267 but it's just going to save us time over time. 168 00:08:11,267 --> 00:08:14,120 And if any of you programmed in Java in high school, for instance, 169 00:08:14,120 --> 00:08:16,630 you saw plus in that context, too, for concatenation. 170 00:08:16,630 --> 00:08:19,755 This just kind of makes your code a little tighter, a little more succinct. 171 00:08:19,755 --> 00:08:21,730 So it's a convenient feature now in Python. 172 00:08:21,730 --> 00:08:24,190 All right, this was an example in Scratch of a variable, 173 00:08:24,190 --> 00:08:26,740 setting a variable like counter equal to 0. 174 00:08:26,740 --> 00:08:30,460 In C it looked like this, where you specify the type, the name, 175 00:08:30,460 --> 00:08:32,230 and then the value, with a semicolon. 176 00:08:32,230 --> 00:08:35,096 In Python, it's going to look like this. 177 00:08:35,096 --> 00:08:36,429 And I'll state the obvious here. 178 00:08:36,429 --> 00:08:39,340 You don't need to mention the type, just like before with string. 179 00:08:39,340 --> 00:08:41,030 And you don't need a semicolon. 180 00:08:41,030 --> 00:08:42,130 So it's a little simpler. 
181 00:08:42,130 --> 00:08:45,005 If you want a variable, just write it and set it equal to some value. 182 00:08:45,005 --> 00:08:48,070 But the single equal sign still behaves the same as in C. 183 00:08:48,070 --> 00:08:50,440 Suppose we wanted to increment counter by one. 184 00:08:50,440 --> 00:08:52,750 In Scratch, we used this puzzle piece here. 185 00:08:52,750 --> 00:08:55,250 In C, we could do this, actually, in a few different ways. 186 00:08:55,250 --> 00:08:57,400 There was this way, if counter already exists, 187 00:08:57,400 --> 00:08:59,980 you just say counter equals counter plus 1. 188 00:08:59,980 --> 00:09:04,840 There was the slightly less verbose way, where you could say, oops, sorry. 189 00:09:04,840 --> 00:09:06,400 Let me do the first sentence first. 190 00:09:06,400 --> 00:09:08,690 In Python, that same thing, as you might guess, 191 00:09:08,690 --> 00:09:12,160 is actually going to be almost the same, you just throw away the semicolon. 192 00:09:12,160 --> 00:09:15,370 And the mathematics are ultimately the same, copying from right to left, 193 00:09:15,370 --> 00:09:17,290 via the assignment operator. 194 00:09:17,290 --> 00:09:19,570 Now, recall, in C, that we had this shorthand 195 00:09:19,570 --> 00:09:22,000 notation, which did the same thing. 196 00:09:22,000 --> 00:09:26,980 In Python, you can similarly do the same thing, just no need for the semicolon. 197 00:09:26,980 --> 00:09:29,290 The only step backwards we're taking, if you 198 00:09:29,290 --> 00:09:33,790 were a big fan of counter plus plus, that doesn't exist in Python, 199 00:09:33,790 --> 00:09:34,625 nor minus minus. 200 00:09:34,625 --> 00:09:35,500 You just can't do it. 201 00:09:35,500 --> 00:09:40,210 You have to do the plus equals 1 or minus equals 1 202 00:09:40,210 --> 00:09:43,720 to achieve that same result. All right, how about in Python, too? 
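As a small sketch of the variable and increment syntax just described (the starting value and the sequence of updates are chosen only for illustration):

```python
counter = 0             # no type and no semicolon

counter = counter + 1   # long form, same as in C minus the semicolon
counter += 1            # shorthand form
counter -= 1            # the matching decrement

# counter++ and counter-- are not valid Python statements;
# += 1 and -= 1 are used instead.
print(counter)
```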
203 00:09:43,720 --> 00:09:46,360 Here in Scratch, recall, was a conditional, 204 00:09:46,360 --> 00:09:49,990 asking a silly question like is x less than y, and if so, just say as much. 205 00:09:49,990 --> 00:09:53,980 In C, that looked a little something like this, printf and if 206 00:09:53,980 --> 00:09:57,310 with the parentheses, the curly braces, the semicolon, and all of that. 207 00:09:57,310 --> 00:10:00,610 In Python, this is going to get a little more pleasant to type, too. 208 00:10:00,610 --> 00:10:03,320 It's going to be just this. 209 00:10:03,320 --> 00:10:06,460 And if someone wants to call out some of the obvious changes here, 210 00:10:06,460 --> 00:10:10,365 what has been simplified now in Python for a conditional, it would seem? 211 00:10:10,365 --> 00:10:11,740 Yeah, what's missing, or changed? 212 00:10:11,740 --> 00:10:12,350 AUDIENCE: Braces. 213 00:10:12,350 --> 00:10:13,405 DAVID J. MALAN: So no curly braces. 214 00:10:13,405 --> 00:10:14,740 AUDIENCE: Colon is back. 215 00:10:14,740 --> 00:10:15,370 DAVID J. MALAN: I'm sorry? 216 00:10:15,370 --> 00:10:16,510 AUDIENCE: Using the colon instead. 217 00:10:16,510 --> 00:10:18,593 DAVID J. MALAN: And we're using the colon instead. 218 00:10:18,593 --> 00:10:20,620 So I got rid of the curly braces in Python. 219 00:10:20,620 --> 00:10:22,193 But I'm using a colon instead. 220 00:10:22,193 --> 00:10:24,110 And even though this is a single line of code, 221 00:10:24,110 --> 00:10:28,450 so long as you indent subsequent lines along with the print, 222 00:10:28,450 --> 00:10:32,830 that's going to imply that everything, if the if condition is true, 223 00:10:32,830 --> 00:10:36,970 should be executed below it, until you start to un-indent and start writing 224 00:10:36,970 --> 00:10:38,470 a different line of code altogether. 225 00:10:38,470 --> 00:10:41,000 So indentation in Python is important. 
226 00:10:41,000 --> 00:10:45,100 So this is among the reasons we've emphasized axes like style, 227 00:10:45,100 --> 00:10:46,840 just how well styled your code is. 228 00:10:46,840 --> 00:10:49,360 And honestly, we've seen, certainly, in office hours, 229 00:10:49,360 --> 00:10:52,000 and you've seen in your own code, sort of a tendency sometimes 230 00:10:52,000 --> 00:10:55,030 to be a little lax when it comes to indentation, right? 231 00:10:55,030 --> 00:10:57,670 If you're one of those folks who likes to indent everything 232 00:10:57,670 --> 00:11:01,210 on the left hand side of the window, yeah, it might compile and run. 233 00:11:01,210 --> 00:11:04,870 But it's not particularly readable by you or anyone else. 234 00:11:04,870 --> 00:11:08,590 Python actually addresses this by just requiring indentation, 235 00:11:08,590 --> 00:11:09,790 when logically needed. 236 00:11:09,790 --> 00:11:14,050 So Python is going to force you to start indenting properly now, if that's been, 237 00:11:14,050 --> 00:11:16,680 perhaps, a tendency otherwise. 238 00:11:16,680 --> 00:11:17,620 What else is missing? 239 00:11:17,620 --> 00:11:19,050 Well, we have no semicolon here. 240 00:11:19,050 --> 00:11:21,150 Of course, it's print instead of printf. 241 00:11:21,150 --> 00:11:23,820 But otherwise, those seem to be the primary differences. 242 00:11:23,820 --> 00:11:25,680 What about something larger in Scratch? 243 00:11:25,680 --> 00:11:28,812 With an if-else block, like this, you can perhaps 244 00:11:28,812 --> 00:11:30,270 guess what it's going to look like. 245 00:11:30,270 --> 00:11:33,540 In C it looks like this, curly braces, semicolons, and so forth. 246 00:11:33,540 --> 00:11:37,530 In Python, it's going to now look like this, almost the same, 247 00:11:37,530 --> 00:11:38,820 but indentation is important. 248 00:11:38,820 --> 00:11:39,960 The colons are important. 
249 00:11:39,960 --> 00:11:42,810 And there's one other difference that's now again visible here, 250 00:11:42,810 --> 00:11:44,670 but we didn't call it out a second ago. 251 00:11:44,670 --> 00:11:47,760 What else is different in Python versus C for these conditionals? 252 00:11:47,760 --> 00:11:48,471 Yeah. 253 00:11:48,471 --> 00:11:51,120 AUDIENCE: You don't have any parentheses around the condition. 254 00:11:51,120 --> 00:11:51,700 DAVID J. MALAN: Perfect. 255 00:11:51,700 --> 00:11:54,090 We don't have any parentheses around the condition, 256 00:11:54,090 --> 00:11:55,710 the Boolean expression itself. 257 00:11:55,710 --> 00:11:56,567 And why not? 258 00:11:56,567 --> 00:11:57,900 Well, it's just simpler to type. 259 00:11:57,900 --> 00:11:58,950 It's less to type. 260 00:11:58,950 --> 00:12:00,450 You can still use parentheses. 261 00:12:00,450 --> 00:12:02,550 And, in fact, you might want to or need to, 262 00:12:02,550 --> 00:12:07,470 if you want to combine thoughts and do this and that, or this or that. 263 00:12:07,470 --> 00:12:10,920 But by default, you no longer need or should have those parentheses. 264 00:12:10,920 --> 00:12:12,150 Just say what you mean. 265 00:12:12,150 --> 00:12:14,440 Lastly, with conditionals, we had something like this, 266 00:12:14,440 --> 00:12:16,770 an if else if else statement. 267 00:12:16,770 --> 00:12:18,840 In C, it looked a little something like this. 268 00:12:18,840 --> 00:12:20,880 In Python, it's going to get really tighter now. 269 00:12:20,880 --> 00:12:25,830 It's just if, and this is the curiosity, elif x greater than y. 270 00:12:25,830 --> 00:12:31,110 So it's not else if, it's literally one keyword, elif, and the colons 271 00:12:31,110 --> 00:12:33,315 remain now on each of the three lines. 272 00:12:33,315 --> 00:12:34,690 But the indentation is important. 
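The if, elif, else structure described above might look like the following sketch. The function name compare and the returned messages are hypothetical, chosen only so the three branches can be exercised; the lecture's version prints directly instead.

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    # No parentheses around the condition, a colon at the end of the
    # line, and indentation (not curly braces) marks each branch.
    if x < y:
        return "x is less than y"
    elif x > y:          # elif is a single keyword, not "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(1, 1))
```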
273 00:12:34,690 --> 00:12:36,480 And if we did want to do multiple things, 274 00:12:36,480 --> 00:12:40,238 we could just indent below each of these conditionals, as well. 275 00:12:40,238 --> 00:12:42,030 All right, let me pause there first, to see 276 00:12:42,030 --> 00:12:44,490 if there's any questions on these syntactic differences. 277 00:12:44,490 --> 00:12:45,247 Yeah. 278 00:12:45,247 --> 00:12:47,532 AUDIENCE: My thought is maybe like, it's good, 279 00:12:47,532 --> 00:12:51,160 though, does it matter if there's this in between thing like that, but 280 00:12:51,160 --> 00:12:52,170 and why. 281 00:12:52,170 --> 00:12:55,050 DAVID J. MALAN: In between, between what and what? 282 00:12:55,050 --> 00:12:58,420 AUDIENCE: So like the left-hand side and like the right side spaces? 283 00:12:58,420 --> 00:13:01,830 DAVID J. MALAN: Ah, good question, is Python sensitive 284 00:13:01,830 --> 00:13:03,750 to spaces and where they go? 285 00:13:03,750 --> 00:13:06,390 Sometimes no, sometimes yes, is the short answer. 286 00:13:06,390 --> 00:13:10,080 Stylistically, though, you should be practicing what we're preaching here, 287 00:13:10,080 --> 00:13:14,265 whereby you do have spaces to the left and right of binary operators, 288 00:13:14,265 --> 00:13:16,140 that they're called, something like less than 289 00:13:16,140 --> 00:13:18,348 or greater than is a binary operator, because there's 290 00:13:18,348 --> 00:13:20,580 two operands to the left and to the right of them. 291 00:13:20,580 --> 00:13:23,640 And in fact, in Python, more so than the world of C, 292 00:13:23,640 --> 00:13:26,340 there's actually formal style conventions. 293 00:13:26,340 --> 00:13:30,687 Not only within CS50 have we had a style guide on the course's website, 294 00:13:30,687 --> 00:13:34,020 for instance, that just dictates how you should write your code so that it looks 295 00:13:34,020 --> 00:13:34,945 like everyone else's. 
296 00:13:34,945 --> 00:13:37,320 In the Python community, they take this one step further, 297 00:13:37,320 --> 00:13:41,260 and there's an actual standard whereby you don't have to adhere to it, 298 00:13:41,260 --> 00:13:44,310 but generally speaking, in the real world, someone would reprimand you, 299 00:13:44,310 --> 00:13:47,100 would reject your code, if you're trying to contribute it to another project, 300 00:13:47,100 --> 00:13:48,730 if you don't adhere to these standards. 301 00:13:48,730 --> 00:13:51,690 So while you could be lax with some of this white space, 302 00:13:51,690 --> 00:13:52,860 do make things readable. 303 00:13:52,860 --> 00:13:56,775 And that's a Python theme, for the code to be as readable as possible. 304 00:13:56,775 --> 00:13:59,400 All right, so let's take a look at a couple of other constructs 305 00:13:59,400 --> 00:14:01,360 before transitioning to some actual code. 306 00:14:01,360 --> 00:14:04,110 This, of course, in Scratch was a loop, meowing forever. 307 00:14:04,110 --> 00:14:08,340 In C, the closest we could get was doing something while true, because true 308 00:14:08,340 --> 00:14:09,100 never changes. 309 00:14:09,100 --> 00:14:12,060 So it's sort of a simple way of just saying do this forever. 310 00:14:12,060 --> 00:14:14,940 In Python, it's pretty much the same thing, 311 00:14:14,940 --> 00:14:16,740 but a couple of small differences here. 312 00:14:16,740 --> 00:14:18,600 The parentheses are gone. 313 00:14:18,600 --> 00:14:19,598 The colon is there. 314 00:14:19,598 --> 00:14:20,640 The indentation is there. 315 00:14:20,640 --> 00:14:24,263 No semicolon, and there's one other subtle difference. 316 00:14:24,263 --> 00:14:24,930 What do you see? 317 00:14:24,930 --> 00:14:25,920 AUDIENCE: True is capitalized? 318 00:14:25,920 --> 00:14:28,003 DAVID J. MALAN: True is capitalized, just because. 319 00:14:28,003 --> 00:14:30,570 Both True and False are Boolean values in Python. 
320 00:14:30,570 --> 00:14:33,150 But you've got to start capitalizing them, just because. 321 00:14:33,150 --> 00:14:35,040 All right, how about a loop like this, where 322 00:14:35,040 --> 00:14:38,460 you repeat something a finite number of times, like meowing three times. 323 00:14:38,460 --> 00:14:41,050 In C, we could do this a few different ways. 324 00:14:41,050 --> 00:14:44,790 There's this very mechanical way, where you initialize a variable like i 325 00:14:44,790 --> 00:14:45,570 to zero. 326 00:14:45,570 --> 00:14:49,350 You then use a while loop and check if i is less than 3, 327 00:14:49,350 --> 00:14:51,187 the total number of times you want to meow. 328 00:14:51,187 --> 00:14:52,770 Then you print what you want to print. 329 00:14:52,770 --> 00:14:56,370 You increment i using this syntax, or the longer, more verbose syntax, 330 00:14:56,370 --> 00:14:57,880 with plus equals or whatnot. 331 00:14:57,880 --> 00:15:00,210 And then you do it again and again and again. 332 00:15:00,210 --> 00:15:04,170 In Python, you can do it functionally the same way, same idea, 333 00:15:04,170 --> 00:15:05,580 slightly different syntax. 334 00:15:05,580 --> 00:15:08,190 You just don't bother saying what type of variable you want. 335 00:15:08,190 --> 00:15:11,038 Python will infer from the fact that there's a 0 right there. 336 00:15:11,038 --> 00:15:12,330 You don't need the parentheses. 337 00:15:12,330 --> 00:15:13,260 You do need the colon. 338 00:15:13,260 --> 00:15:14,760 You do need the indentation. 339 00:15:14,760 --> 00:15:17,910 You can't do the i plus plus, but you can do this other technique, 340 00:15:17,910 --> 00:15:20,100 as we could have done in C, as well. 341 00:15:20,100 --> 00:15:22,320 How else might we do this, though, too? 342 00:15:22,320 --> 00:15:24,540 Well. 
It turns out in C, we could do something 343 00:15:24,540 --> 00:15:28,230 like this, which, again, sort of cryptic at first glance, 344 00:15:28,230 --> 00:15:31,170 became perhaps more familiar, where you have initialization, 345 00:15:31,170 --> 00:15:34,920 a conditional, and then an update that you do after each iteration. 346 00:15:34,920 --> 00:15:37,950 In Python, there isn't really an analog. 347 00:15:37,950 --> 00:15:40,500 There is no analog in Python, where you have 348 00:15:40,500 --> 00:15:43,380 the parentheses and the multiple semicolons in the same line. 349 00:15:43,380 --> 00:15:47,010 Instead, there is a for loop, but it's meant to read a little more 350 00:15:47,010 --> 00:15:50,550 like English, for i in 0, 1, and 2. 351 00:15:50,550 --> 00:15:54,780 So we'll see in a bit, these square brackets represent an array, now 352 00:15:54,780 --> 00:15:57,090 to be called a list in Python. 353 00:15:57,090 --> 00:16:01,290 So lists in Python are more like linked lists than they are arrays. 354 00:16:01,290 --> 00:16:02,380 More on that soon. 355 00:16:02,380 --> 00:16:06,210 So this just means for i in the following list of three values. 356 00:16:06,210 --> 00:16:09,820 And on each iteration of this loop, Python automatically, for you, 357 00:16:09,820 --> 00:16:11,250 first sets i to zero. 358 00:16:11,250 --> 00:16:12,840 Then it sets i to one. 359 00:16:12,840 --> 00:16:17,880 Then it sets i to two, so that you effectively do things three times. 360 00:16:17,880 --> 00:16:21,450 But this doesn't necessarily scale, as I've drawn it on the board. 361 00:16:21,450 --> 00:16:25,140 Suppose you took this at face value as the way 362 00:16:25,140 --> 00:16:28,980 you iterate some number of times in Python, using a for loop. 363 00:16:28,980 --> 00:16:33,482 At what point does this approach perhaps get bad, or become bad design? 364 00:16:33,482 --> 00:16:35,190 Let me give folks just a moment to think. 365 00:16:35,190 --> 00:16:36,415 Yeah, in back. 
366 00:16:36,415 --> 00:16:39,082 AUDIENCE: If you don't know how many times 367 00:16:39,082 --> 00:16:41,083 you want to loop. 368 00:16:41,083 --> 00:16:43,500 DAVID J. MALAN: Sure, if you don't know how many times you 369 00:16:43,500 --> 00:16:47,460 want to loop or iterate, you can't really create a hard-coded list 370 00:16:47,460 --> 00:16:48,750 like that, of 0, 1, 2. 371 00:16:48,750 --> 00:16:50,323 Other thoughts? 372 00:16:50,323 --> 00:16:52,990 AUDIENCE: If you want to iterate a large number of times. 373 00:16:52,990 --> 00:16:55,740 DAVID J. MALAN: Yeah, if you're iterating a large number of times, 374 00:16:55,740 --> 00:16:57,640 this list is going to get longer and longer, 375 00:16:57,640 --> 00:16:59,932 and you're just kind of stupidly going to be typing out 376 00:16:59,932 --> 00:17:03,660 like comma 3, comma 4, comma 5, comma dot dot dot, comma 99, comma 100. 377 00:17:03,660 --> 00:17:06,160 I mean, your code would start to look atrocious, eventually. 378 00:17:06,160 --> 00:17:07,510 So there is a better way. 379 00:17:07,510 --> 00:17:10,359 In Python, there is a function, or technically a type, 380 00:17:10,359 --> 00:17:14,530 called range, that essentially magically gives you back a range of values 381 00:17:14,530 --> 00:17:17,599 from 0 on up to, but not through, a value. 382 00:17:17,599 --> 00:17:21,609 So the effect of this line of code, for i in the following range, 383 00:17:21,609 --> 00:17:24,484 essentially hands you back a list of three values, 384 00:17:24,484 --> 00:17:26,359 thereby letting you do something three times. 385 00:17:26,359 --> 00:17:29,067 And if you want to do something 99 times instead, you, of course, 386 00:17:29,067 --> 00:17:30,575 just change the 3 to a 99. 387 00:17:30,575 --> 00:17:31,075 Question. 
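The three ways of meowing three times discussed so far can be sketched together. This is an illustration only; the variable same_values is a hypothetical name added to show that range(3) produces the same values as the hard-coded list.

```python
# 1. A while loop with a manually updated counter.
i = 0
while i < 3:
    print("meow")
    i += 1

# 2. A for loop over a hard-coded list of three values.
for i in [0, 1, 2]:
    print("meow")

# 3. A for loop over range, which counts from 0 up to, but not through, 3.
for i in range(3):
    print("meow")

# range(3) yields the same values as the list written out by hand.
same_values = list(range(3)) == [0, 1, 2]
print(same_values)
```

The third form is the one that scales: changing 3 to 99 is a one-character-class edit rather than a longer list.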
388 00:17:31,075 --> 00:17:35,090 AUDIENCE: Is there a way to start the beginning point of that range 389 00:17:35,090 --> 00:17:39,410 at a number or an integer that's higher than zero, or is there never a really 390 00:17:39,410 --> 00:17:40,460 any point to do so? 391 00:17:40,460 --> 00:17:41,540 DAVID J. MALAN: A really good question, can 392 00:17:41,540 --> 00:17:43,440 you start counting at a higher number. 393 00:17:43,440 --> 00:17:46,910 So not 0, which is the implied default, but something larger than that. 394 00:17:46,910 --> 00:17:51,560 Yes, so it turns out the range function takes multiple arguments, not just one 395 00:17:51,560 --> 00:17:54,998 but maybe two or even three, that allows you to customize this behavior. 396 00:17:54,998 --> 00:17:56,540 So you can customize where it begins. 397 00:17:56,540 --> 00:17:57,920 You can customize the increment. 398 00:17:57,920 --> 00:17:59,712 By default, it's one, but if you want to do 399 00:17:59,712 --> 00:18:02,582 every two values, for like evens or odds, you could do that as well, 400 00:18:02,582 --> 00:18:03,540 and a few other things. 401 00:18:03,540 --> 00:18:05,930 And before long, we'll take a look at some Python documentation 402 00:18:05,930 --> 00:18:08,810 that will become your authoritative source for answers like that. 403 00:18:08,810 --> 00:18:10,790 Like, what can this function do. 404 00:18:10,790 --> 00:18:15,020 Other questions on this thus far? 405 00:18:15,020 --> 00:18:19,980 Seeing none, so what else might we compare and contrast here. 406 00:18:19,980 --> 00:18:24,320 Well, in the world of C, recall that we had a whole bunch of built-in data 407 00:18:24,320 --> 00:18:28,310 types, like these here, Bool and char and double and float, and so forth, 408 00:18:28,310 --> 00:18:31,670 string, which happened to come from the CS50 library. 
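The answer about customizing where range begins and how it increments can be sketched as follows. The variable names here are illustrative only:

```python
# range(start, stop): begin at 5 instead of the default 0, stop before 8.
from_five = list(range(5, 8))

# range(start, stop, step): count by twos for evens or odds.
evens = list(range(0, 10, 2))
odds = list(range(1, 10, 2))

print(from_five, evens, odds)
```

In every form the stop value itself is excluded, matching the "up to, but not through" behavior of the one-argument version.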
409 00:18:31,670 --> 00:18:35,990 But the language C itself certainly understood the idea of strings, 410 00:18:35,990 --> 00:18:40,700 because the backslash 0, the support for %s in printf, that's all native, 411 00:18:40,700 --> 00:18:43,370 built into C, not a CS50 simplification. 412 00:18:43,370 --> 00:18:45,620 All we did, and revealed, as of a couple of weeks 413 00:18:45,620 --> 00:18:48,050 ago, is that string, this data type, is just 414 00:18:48,050 --> 00:18:52,730 a synonym, or a typedef, for char star, which is part of the language natively. 415 00:18:52,730 --> 00:18:55,610 In Python now, this list actually gets a little shorter, at least 416 00:18:55,610 --> 00:18:57,443 for these common primitive data types. 417 00:18:57,443 --> 00:19:00,110 Still going to have bools, we're going to have floats, and ints, 418 00:19:00,110 --> 00:19:02,600 and we're going to have strings, but we're going to call them strs. 419 00:19:02,600 --> 00:19:04,760 And this is not a CS50 thing from the library, 420 00:19:04,760 --> 00:19:08,300 str, s-t-r, is, in fact, a data type in Python, 421 00:19:08,300 --> 00:19:12,260 that's going to do a lot more than strings did for us automatically in C. 422 00:19:12,260 --> 00:19:17,133 Ints and floats, meanwhile, don't need the corresponding longs and doubles, 423 00:19:17,133 --> 00:19:19,550 because, in fact, among the problems Python solves for us, 424 00:19:19,550 --> 00:19:22,340 too, ints can get as big as you want. 425 00:19:22,340 --> 00:19:25,220 Integer overflow is no longer going to be an issue. 426 00:19:25,220 --> 00:19:27,950 Per week 1, the language solves that for us. 427 00:19:27,950 --> 00:19:29,790 Floating point imprecision, unfortunately, 428 00:19:29,790 --> 00:19:31,190 is still a problem that remains.
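A small sketch of both points, with values chosen only for illustration: Python ints grow as needed, while floats are still imprecise.

```python
# An int far larger than a 64-bit long can hold; no overflow occurs.
big = 2 ** 100
bigger = big + 1
print(bigger)

# Floats remain imprecise: 0.1 + 0.2 is not exactly 0.3.
close = (0.1 + 0.2 == 0.3)
print(close)

# Printing many decimal places of 1/3 reveals digits that are not all 3s.
print(f"{1 / 3:.50f}")
```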
429 00:19:31,190 --> 00:19:34,730 But there are libraries, code that other people have written, as we briefly 430 00:19:34,730 --> 00:19:37,010 discussed in weeks past, that allow you to do 431 00:19:37,010 --> 00:19:40,250 scientific or financial computing, using libraries that build 432 00:19:40,250 --> 00:19:42,625 on top of these data types, as well. 433 00:19:42,625 --> 00:19:45,500 So there's other data types, too, in Python, which we'll see actually 434 00:19:45,500 --> 00:19:48,710 gives us a whole bunch of more power and capability, 435 00:19:48,710 --> 00:19:51,500 things called ranges, like we just saw, lists, 436 00:19:51,500 --> 00:19:54,080 like I called out verbally, with the square brackets, 437 00:19:54,080 --> 00:19:56,900 things called tuples, for things like x comma y, 438 00:19:56,900 --> 00:20:00,305 or latitude, longitude, dictionaries, or Dicts, 439 00:20:00,305 --> 00:20:03,740 which allow you to store keys and values, much like our hash tables 440 00:20:03,740 --> 00:20:06,973 from last time, and then sets in the mathematical sense, where they filter 441 00:20:06,973 --> 00:20:09,890 out duplicates for you, and you can just put a whole bunch of numbers, 442 00:20:09,890 --> 00:20:13,910 a whole bunch of words or whatnot, and the language, via this data type, 443 00:20:13,910 --> 00:20:16,400 will filter out duplicates for you. 444 00:20:16,400 --> 00:20:19,985 Now there's going to be a few functions we give you this week and beyond, 445 00:20:19,985 --> 00:20:22,610 training wheels that we're then going to very quickly take off, 446 00:20:22,610 --> 00:20:26,060 just because, as we'll see today, they just simplify the process of getting 447 00:20:26,060 --> 00:20:29,205 user input correctly, without accidentally writing buggy code, 448 00:20:29,205 --> 00:20:32,330 just when you're trying to get Hello, World, or something similar, to work. 
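The tuple, dict, and set types mentioned above might be sketched like this. All of the data below is made up for illustration:

```python
# tuple: a fixed pair of values, such as a (hypothetical) latitude and longitude.
point = (42.37, -71.11)
lat, lon = point

# dict: keys mapped to values, much like a hash table.
ages = {"alice": 30, "bob": 25}  # hypothetical names and values
ages["carol"] = 41

# set: duplicates are filtered out automatically.
animals = set(["cat", "dog", "cat", "dog", "bird"])

print(lat, lon, ages["alice"], len(animals))
```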
449 00:20:32,330 --> 00:20:36,050 And we'll give you functions, not like, not as long as this list in C, 450 00:20:36,050 --> 00:20:38,630 but a subset of these, get_float, get_int, 451 00:20:38,630 --> 00:20:41,660 and get_string, that'll automate the process of getting 452 00:20:41,660 --> 00:20:45,410 user input in a way that's more resilient against potential bugs. 453 00:20:45,410 --> 00:20:47,270 But we'll see what those bugs might be. 454 00:20:47,270 --> 00:20:50,120 And the way we're going to do this is similar in spirit to C. 455 00:20:50,120 --> 00:20:54,380 Instead of doing include cs50.h, like we did in C, 456 00:20:54,380 --> 00:20:57,290 you're going to now start saying import cs50. 457 00:20:57,290 --> 00:21:00,560 Python supports, similar to C, libraries, 458 00:21:00,560 --> 00:21:02,300 but there aren't header files anymore. 459 00:21:02,300 --> 00:21:05,090 You just use the name of the library in Python. 460 00:21:05,090 --> 00:21:08,450 And if you want to import CS50's functions, you just say import cs50. 461 00:21:08,450 --> 00:21:12,470 Or, if you want to be more precise, and not just import the whole thing, which 462 00:21:12,470 --> 00:21:15,860 could be slow, if you've got a really big library with a lot of functionality 463 00:21:15,860 --> 00:21:19,730 in it, you can be more precise and say from cs50 import get_float. 464 00:21:19,730 --> 00:21:23,480 From cs50 import get_int, from cs50 import get_string, 465 00:21:23,480 --> 00:21:26,270 or you can just separate them by commas and import 3 466 00:21:26,270 --> 00:21:30,550 and only 3 things from a particular library, like ours. 467 00:21:30,550 --> 00:21:32,300 But starting today and onward, we're going 468 00:21:32,300 --> 00:21:35,450 to start making much more heavy use of libraries, code 469 00:21:35,450 --> 00:21:38,570 that other people wrote, so that we're no longer reinventing the wheel.
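The import forms described above can be sketched with the standard library's math module standing in for the CS50 library, on the assumption that the latter may not be installed where this runs:

```python
# Import the whole library; names are then reached through the module.
import math

# Import a single name from a library.
from math import sqrt

# Import several specific names, separated by commas.
from math import floor, ceil

a = math.sqrt(16)
b = sqrt(16)
c = (floor(2.7), ceil(2.1))
print(a, b, c)

# With the CS50 library installed, the analogous line would be:
# from cs50 import get_float, get_int, get_string
```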
470 00:21:38,570 --> 00:21:41,875 We're not making our own linked lists, our own trees, our own dictionaries. 471 00:21:41,875 --> 00:21:44,250 We're going to start standing on the shoulders of others, 472 00:21:44,250 --> 00:21:47,120 so that you can get real work done, so to speak, faster, 473 00:21:47,120 --> 00:21:51,710 by building your software on top of others' code as well. 474 00:21:51,710 --> 00:21:55,110 All right, so that's it for the syntactic tour of the language, 475 00:21:55,110 --> 00:21:56,360 and the sort of core features. 476 00:21:56,360 --> 00:21:58,320 Soon we'll transition to application thereof. 477 00:21:58,320 --> 00:22:04,040 But let me pause here to see if there's any questions on syntax or primitives 478 00:22:04,040 --> 00:22:10,340 or otherwise. 479 00:22:10,340 --> 00:22:12,204 Oh, yes, in back. 480 00:22:12,204 --> 00:22:16,163 AUDIENCE: Why doesn't Python have the increment operators? 481 00:22:16,163 --> 00:22:18,330 DAVID J. MALAN: I'm sorry, say it again, why doesn't 482 00:22:18,330 --> 00:22:19,788 Python have what kind of operators? 483 00:22:19,788 --> 00:22:22,578 AUDIENCE: Why doesn't Python have the increment operator? 484 00:22:22,578 --> 00:22:25,620 DAVID J. MALAN: Sorry, someone coughed when you said something operators. 485 00:22:25,620 --> 00:22:26,948 AUDIENCE: The increment. 486 00:22:26,948 --> 00:22:28,740 DAVID J. MALAN: Oh, the increment operator? 487 00:22:28,740 --> 00:22:30,407 I'd have to check the history, honestly. 488 00:22:30,407 --> 00:22:32,910 Python has tended to be a fairly minimalist language. 489 00:22:32,910 --> 00:22:36,090 And if you can do something one way, the community, arguably, 490 00:22:36,090 --> 00:22:40,145 has tended to not give you multiple ways to do the same thing syntactically. 491 00:22:40,145 --> 00:22:41,520 There's probably a better answer. 492 00:22:41,520 --> 00:22:45,840 And I'll see if I can dig in and post something online, to follow up on that.
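For reference, a brief sketch of how incrementing is written in Python without a ++ operator (the variable name is illustrative):

```python
counter = 0

# The long form works just as it does in C.
counter = counter + 1

# The shorthand += also exists in Python.
counter += 1

# Writing counter++ here would be a syntax error in Python.
print(counter)
```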
493 00:22:45,840 --> 00:22:49,870 All right, so before we transition to now writing some actual code, 494 00:22:49,870 --> 00:22:54,870 let me go ahead and consider exactly how we're going to write code. 495 00:22:54,870 --> 00:22:58,770 In the world of C, recall that it's generally been a 2-step process. 496 00:22:58,770 --> 00:23:04,230 We create a file called like hello.c, and then, step one, make hello, step 2, 497 00:23:04,230 --> 00:23:05,400 ./hello. 498 00:23:05,400 --> 00:23:08,130 Or, if you think back to week two, when we sort of peeled back 499 00:23:08,130 --> 00:23:11,100 the layer of what hello, of what make was doing, 500 00:23:11,100 --> 00:23:14,310 you could more verbosely type out the name of the actual compiler, 501 00:23:14,310 --> 00:23:17,640 clang in our case, command line arguments like -o hello, 502 00:23:17,640 --> 00:23:19,840 to specify what name you want to create. 503 00:23:19,840 --> 00:23:21,660 And then you can specify the file name. 504 00:23:21,660 --> 00:23:25,050 And then you can specify what libraries you want to link in. 505 00:23:25,050 --> 00:23:26,550 So that was a very verbose approach. 506 00:23:26,550 --> 00:23:28,930 But it was always a two-step approach. 507 00:23:28,930 --> 00:23:31,680 And so, even as you've been doing recent problem sets, 508 00:23:31,680 --> 00:23:35,400 odds are you've realized that, any time you want to make a change to your code, 509 00:23:35,400 --> 00:23:39,660 or make a change to your code and try and test your code again, 510 00:23:39,660 --> 00:23:42,360 you're constantly doing those two steps. 511 00:23:42,360 --> 00:23:45,840 Moving forward in Python, it's going to become simpler, 512 00:23:45,840 --> 00:23:47,610 and it's going to be just this. 513 00:23:47,610 --> 00:23:50,460 The file name is going to change, but that might go without saying. 514 00:23:50,460 --> 00:23:55,260 It's going to be something like hello.py, p-y, instead of hello.c.
515 00:23:55,260 --> 00:23:57,990 And that's just a convention, using a different file extension. 516 00:23:57,990 --> 00:24:00,780 But there's no compilation step per se. 517 00:24:00,780 --> 00:24:04,170 You jump right to the execution of your code. 518 00:24:04,170 --> 00:24:07,200 And so Python, it turns out, is the name, not only of the language 519 00:24:07,200 --> 00:24:12,150 we're going to start using, it's also the name of a program on a Mac, a PC, 520 00:24:12,150 --> 00:24:16,020 assuming it's been pre-installed, that interprets the language for you. 521 00:24:16,020 --> 00:24:20,100 This is to say that Python is generally described as being interpreted, 522 00:24:20,100 --> 00:24:21,360 not compiled. 523 00:24:21,360 --> 00:24:25,170 And by that, I mean you get to skip, from the programmer's perspective, 524 00:24:25,170 --> 00:24:26,370 that compilation step. 525 00:24:26,370 --> 00:24:30,870 There is no manual step in the world of Python, typically, of writing your code 526 00:24:30,870 --> 00:24:34,530 and then compiling it to zeros and ones, and then running the zeros and ones. 527 00:24:34,530 --> 00:24:36,870 Instead, these kind of two steps get collapsed 528 00:24:36,870 --> 00:24:42,570 into the illusion of one, whereby you, instead, are able to just run the code, 529 00:24:42,570 --> 00:24:46,200 and let the computer figure out how to actually convert it 530 00:24:46,200 --> 00:24:48,240 to something the computer understands. 531 00:24:48,240 --> 00:24:51,850 And the way we do that is via this old process, input and output. 532 00:24:51,850 --> 00:24:53,910 But now, when you have source code, it's going 533 00:24:53,910 --> 00:24:56,850 to be passed into an interpreter, not a compiler. 
534 00:24:56,850 --> 00:24:59,400 And the best analog of this is just to perhaps point out 535 00:24:59,400 --> 00:25:01,950 that, in the human world, if you speak, or don't speak, 536 00:25:01,950 --> 00:25:05,640 multiple human languages, it can be a pretty slow process going 537 00:25:05,640 --> 00:25:07,270 from one language to another. 538 00:25:07,270 --> 00:25:10,170 For instance, here are step-by-step instructions for finding someone 539 00:25:10,170 --> 00:25:12,540 in a phone book, unfortunately, in Spanish. 540 00:25:12,540 --> 00:25:15,360 Unfortunately, if you don't speak or read Spanish. 541 00:25:15,360 --> 00:25:16,560 You could figure this out. 542 00:25:16,560 --> 00:25:19,380 You could run this algorithm, but you're going to have to do some googling, 543 00:25:19,380 --> 00:25:22,130 or you're going to have to open up a literal dictionary from Spanish 544 00:25:22,130 --> 00:25:23,460 to English and convert this. 545 00:25:23,460 --> 00:25:27,060 And the catch with translating any language, human or computer 546 00:25:27,060 --> 00:25:30,850 or otherwise, is that you're going to pay a price, typically some time. 547 00:25:30,850 --> 00:25:33,840 And so converting this in Spanish to this in English 548 00:25:33,840 --> 00:25:36,360 is just going to take you longer than if this were already 549 00:25:36,360 --> 00:25:38,453 in your native language. 550 00:25:38,453 --> 00:25:41,370 And that's going to be one of the subtleties with the world of Python. 551 00:25:41,370 --> 00:25:45,180 Yes, it's a feature that you can just run the code without having 552 00:25:45,180 --> 00:25:47,880 to bother compiling it manually first. 553 00:25:47,880 --> 00:25:49,050 But we might pay a price. 554 00:25:49,050 --> 00:25:50,815 And things might be a little slower. 555 00:25:50,815 --> 00:25:52,440 Now, there's ways to chip away at that. 556 00:25:52,440 --> 00:25:53,815 But we'll see an example thereof.
557 00:25:53,815 --> 00:25:56,700 In fact, let me transition now to just a couple of examples 558 00:25:56,700 --> 00:26:00,660 that demonstrate how Python is not only easier for many people 559 00:26:00,660 --> 00:26:03,240 to use, perhaps yourselves too, because it throws away 560 00:26:03,240 --> 00:26:06,120 a lot of the annoying syntax, it shortens the number of lines 561 00:26:06,120 --> 00:26:09,810 you have to write, and also it comes with so many darn libraries, 562 00:26:09,810 --> 00:26:14,740 you can just do so much more without having to write the code yourself. 563 00:26:14,740 --> 00:26:17,670 So, as an example of this, let me switch over here 564 00:26:17,670 --> 00:26:24,090 to this image from problem set 4, which is the Weeks Bridge down by the Charles 565 00:26:24,090 --> 00:26:25,290 River here in Cambridge. 566 00:26:25,290 --> 00:26:27,245 And this is the original photo, pretty clear, 567 00:26:27,245 --> 00:26:30,370 and it's even higher res if we looked at the original version of the photo. 568 00:26:30,370 --> 00:26:33,660 But there have been no filters, a la Instagram, applied to this photo. 569 00:26:33,660 --> 00:26:36,750 Recall, for problem set four, you had to implement a few filters. 570 00:26:36,750 --> 00:26:38,460 And among them might have been blur. 571 00:26:38,460 --> 00:26:41,610 And blur was probably among the more challenging of the ones, 572 00:26:41,610 --> 00:26:44,190 because you had to iterate over all of the pixels, 573 00:26:44,190 --> 00:26:47,130 you had to take into account what's above, what's below, to the left, 574 00:26:47,130 --> 00:26:47,490 to the right. 575 00:26:47,490 --> 00:26:49,448 I mean, there was a lot of math and arithmetic. 576 00:26:49,448 --> 00:26:52,620 And if you ultimately got it, it was probably a great sense of satisfaction. 577 00:26:52,620 --> 00:26:54,780 But that was probably several hours later. 
578 00:26:54,780 --> 00:26:57,540 In a language like Python, where there might 579 00:26:57,540 --> 00:27:01,170 be libraries that had been written by others, on whose shoulders 580 00:27:01,170 --> 00:27:03,880 you can stand, we could perhaps do something like this. 581 00:27:03,880 --> 00:27:08,280 Let me go ahead and run a program, or write a program, called Blur.py here. 582 00:27:08,280 --> 00:27:12,130 And in Blur.py, in VS Code, let me just do this. 583 00:27:12,130 --> 00:27:15,370 Let me import from a library, not the CS50 library, 584 00:27:15,370 --> 00:27:19,620 but the Pillow library, so to speak, a keyword called Image 585 00:27:19,620 --> 00:27:23,330 and another one called ImageFilter, then let me go ahead 586 00:27:23,330 --> 00:27:26,420 and say, let me open the current version of this image, which 587 00:27:26,420 --> 00:27:27,740 is called Bridge.bmp. 588 00:27:27,740 --> 00:27:30,260 So the before version of the image will be 589 00:27:30,260 --> 00:27:34,550 the result of calling Image.open quote unquote "Bridge.bmp," 590 00:27:34,550 --> 00:27:37,040 and then, let me create an after version. 591 00:27:37,040 --> 00:27:38,840 So you'll see before and after. 592 00:27:38,840 --> 00:27:45,010 After equals the before version .filter of ImageFilter. 593 00:27:45,010 --> 00:27:46,760 And there is, if I read the documentation, 594 00:27:46,760 --> 00:27:49,052 I'll see that there's something called a BoxBlur, that 595 00:27:49,052 --> 00:27:52,160 allows you to blur in box format, like one pixel above, 596 00:27:52,160 --> 00:27:53,750 below, left, and right. 597 00:27:53,750 --> 00:27:55,367 So I'll do one pixel there. 598 00:27:55,367 --> 00:27:57,950 And then, after that's done, let me go ahead and save the file 599 00:27:57,950 --> 00:28:01,070 as something like Out.bmp. 600 00:28:01,070 --> 00:28:02,180 That's it.
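The program just dictated might look roughly like the sketch below. It assumes the third-party Pillow library is installed, and it wraps the steps in a function named blur (a name chosen here, not from the lecture) so the file names can vary:

```python
# Requires the third-party Pillow library (imported under the name PIL).
from PIL import Image, ImageFilter


def blur(src, dst, radius=1):
    """Open the image at src, apply a box blur, and save the result to dst."""
    before = Image.open(src)
    # BoxBlur(1) averages each pixel with its neighbors one pixel away.
    after = before.filter(ImageFilter.BoxBlur(radius))
    after.save(dst)
```

Calling blur("Bridge.bmp", "Out.bmp") would correspond to the four lines described in the lecture, given a file by that name in the current folder.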
601 00:28:02,180 --> 00:28:04,910 Assuming this library works as described, 602 00:28:04,910 --> 00:28:08,060 I am opening the file in Python, using line 3. 603 00:28:08,060 --> 00:28:09,680 And this is somewhat new syntax. 604 00:28:09,680 --> 00:28:13,250 In the world of Python, we're going to start making use of the dot operator 605 00:28:13,250 --> 00:28:15,320 more, because in the world of Python, you have 606 00:28:15,320 --> 00:28:19,700 what's called object-oriented programming, or OOP, as a term of art. 607 00:28:19,700 --> 00:28:22,470 And what this means is that you still have functions, 608 00:28:22,470 --> 00:28:24,980 you still have variables, but sometimes those functions 609 00:28:24,980 --> 00:28:28,850 are embedded inside of the variables, or, more specifically, 610 00:28:28,850 --> 00:28:30,710 inside of the data types themselves. 611 00:28:30,710 --> 00:28:34,430 Think back to C. When you wanted to convert something to uppercase, 612 00:28:34,430 --> 00:28:38,582 there was a toupper function that takes as input an argument that's a char. 613 00:28:38,582 --> 00:28:41,540 And you can pass in any char you want, and it will uppercase it for you 614 00:28:41,540 --> 00:28:42,890 and give you back a value. 615 00:28:42,890 --> 00:28:46,160 Well, you know what, if that's such a common paradigm, where 616 00:28:46,160 --> 00:28:49,850 upper-casing chars is a useful thing, what the world of Python does 617 00:28:49,850 --> 00:28:54,470 is it embeds into the string data type, or char if you will, 618 00:28:54,470 --> 00:28:59,240 the ability just to uppercase any char by treating the char, or the string, 619 00:28:59,240 --> 00:29:02,150 as though it's a struct in C. Recall that structs 620 00:29:02,150 --> 00:29:04,400 encapsulate multiple types of values. 621 00:29:04,400 --> 00:29:07,610 In object-oriented programming, in a language like Python, 622 00:29:07,610 --> 00:29:11,510 you can encapsulate not just values, but also functionality.
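A short sketch of that idea, using a made-up string: the uppercasing function lives inside the str type and is reached with the dot operator, rather than being a standalone function that takes a char.

```python
name = "david"  # hypothetical value for illustration

# upper is a function built into the str type, called with the dot operator.
shouted = name.upper()

# A single character is just a str of length 1, so the same method applies.
first = name[0].upper()

# The original string is unchanged; upper returns a new string.
print(name, shouted, first)
```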
623 00:29:11,510 --> 00:29:13,818 Functions can now be inside of structs. 624 00:29:13,818 --> 00:29:15,860 But we're not going to call them structs anymore. 625 00:29:15,860 --> 00:29:17,270 We're going to call them objects. 626 00:29:17,270 --> 00:29:19,130 But that's just a different vernacular. 627 00:29:19,130 --> 00:29:20,870 So what am I doing here? 628 00:29:20,870 --> 00:29:23,870 Inside of the Image library, there's a function called open, 629 00:29:23,870 --> 00:29:26,630 and it takes an argument, the name of the file, to open. 630 00:29:26,630 --> 00:29:30,260 Once I have a variable called before, that is a struct, or technically 631 00:29:30,260 --> 00:29:33,290 an object, inside of which is now, because it 632 00:29:33,290 --> 00:29:36,140 was returned from this function, a function 633 00:29:36,140 --> 00:29:38,280 called filter, that takes an argument. 634 00:29:38,280 --> 00:29:41,660 The argument here happens to be ImageFilter.BoxBlur(1), 635 00:29:41,660 --> 00:29:42,830 which itself is a function. 636 00:29:42,830 --> 00:29:44,803 But it just returns the filter to use. 637 00:29:44,803 --> 00:29:46,970 And then, after, dot save does what you might think. 638 00:29:46,970 --> 00:29:48,150 It just saves the file. 639 00:29:48,150 --> 00:29:51,470 So instead of using fopen and fwrite, you just say dot save, 640 00:29:51,470 --> 00:29:54,510 and that does all of that messy work for you. 641 00:29:54,510 --> 00:29:57,230 So it's just, what, four lines of code total? 642 00:29:57,230 --> 00:30:00,240 Let me go ahead and go down to my terminal window. 643 00:30:00,240 --> 00:30:03,533 Let me go ahead and show you with ls that, at the moment, 644 00:30:03,533 --> 00:30:05,450 whoops, sorry, let me not bother showing that, 645 00:30:05,450 --> 00:30:07,160 because I have other examples to come. 646 00:30:07,160 --> 00:30:14,310 I'm going to go ahead and do Python of Blur.py, nope, sorry, wrong place.
647 00:30:14,310 --> 00:30:15,570 I did need to make a command. 648 00:30:15,570 --> 00:30:16,280 There we go. 649 00:30:16,280 --> 00:30:19,340 OK, let me go ahead and type ls inside of my filter directory, which 650 00:30:19,340 --> 00:30:21,560 is among the sample code online today. 651 00:30:21,560 --> 00:30:24,800 There's only one file called Bridge.bmp, dammit, 652 00:30:24,800 --> 00:30:27,630 I'm trying to get these things ready at the same time. 653 00:30:27,630 --> 00:30:28,730 Let me rewind. 654 00:30:28,730 --> 00:30:32,120 Let me move this code into place. 655 00:30:32,120 --> 00:30:34,710 All right, I've gone ahead and moved this file, Blur.py, 656 00:30:34,710 --> 00:30:37,190 into a folder called filter, inside of which 657 00:30:37,190 --> 00:30:42,080 there's another file called Bridge.bmp, which we can confirm with ls. 658 00:30:42,080 --> 00:30:44,390 Let me now go ahead and run Python, which 659 00:30:44,390 --> 00:30:46,700 is my interpreter, and also the name of the language, 660 00:30:46,700 --> 00:30:48,990 and run Python on this file. 661 00:30:48,990 --> 00:30:51,348 So much like running the Spanish algorithm 662 00:30:51,348 --> 00:30:53,390 through Google Translate, or something like that, 663 00:30:53,390 --> 00:30:55,650 as input, to get back the English output, 664 00:30:55,650 --> 00:30:59,540 this is going to translate the Python language to something 665 00:30:59,540 --> 00:31:01,760 this computer, or this cloud-based environment, 666 00:31:01,760 --> 00:31:05,070 understands, and then run the corresponding code, top to bottom, 667 00:31:05,070 --> 00:31:05,707 left to right. 668 00:31:05,707 --> 00:31:07,040 I'm going to go ahead and Enter. 669 00:31:07,040 --> 00:31:08,930 No error message is generally a good thing. 670 00:31:08,930 --> 00:31:11,960 If I type ls you'll now see Out.bmp. 671 00:31:11,960 --> 00:31:13,295 Let me go ahead and open that.
672 00:31:13,295 --> 00:31:15,920 And, you know what, just to make clear what's really happening, 673 00:31:15,920 --> 00:31:17,087 let me blur it even further. 674 00:31:17,087 --> 00:31:20,550 Let's make a box that's not just one pixel around, but 10. 675 00:31:20,550 --> 00:31:21,950 So let's make that change. 676 00:31:21,950 --> 00:31:24,830 And let me just go ahead and rerun it with Python of Blur.py. 677 00:31:24,830 --> 00:31:27,320 I still have Out.bmp. 678 00:31:27,320 --> 00:31:32,100 Let me go ahead and open Out.bmp and show you first the before, 679 00:31:32,100 --> 00:31:33,680 which looks like this. 680 00:31:33,680 --> 00:31:34,550 That's the original. 681 00:31:34,550 --> 00:31:37,820 And now, crossing my fingers, four lines of code later, 682 00:31:37,820 --> 00:31:39,758 the result of blurring it, as well. 683 00:31:39,758 --> 00:31:42,050 So the library is doing all of the same kind of legwork 684 00:31:42,050 --> 00:31:44,120 that you all did for the assignment, but it's 685 00:31:44,120 --> 00:31:48,303 encapsulated it all into a single library, that you can then use instead. 686 00:31:48,303 --> 00:31:50,720 Those of you who might have been feeling more comfortable, 687 00:31:50,720 --> 00:31:52,595 might have done a little something like this. 688 00:31:52,595 --> 00:31:56,900 Let me go ahead and open up one other file, called Edges.py. 689 00:31:56,900 --> 00:32:00,290 And in Edges.py, I'm again going to import from the Pillow library 690 00:32:00,290 --> 00:32:03,010 the image keyword, and the image filter. 
691 00:32:03,010 --> 00:32:05,510 Then I'm going to go ahead and create a before image, that's 692 00:32:05,510 --> 00:32:09,590 a result of calling Image.open of the same thing, Bridge.bmp, 693 00:32:09,590 --> 00:32:16,910 then I'm going to go ahead and run a filter on that, called Image, whoops, 694 00:32:16,910 --> 00:32:21,850 ImageFilter.FIND_EDGES, which is like a constant, if you will, 695 00:32:21,850 --> 00:32:23,708 defined inside of this library for us. 696 00:32:23,708 --> 00:32:25,750 And then I'm going to do after.save quote unquote 697 00:32:25,750 --> 00:32:28,210 "Out.bmp," using the same file name. 698 00:32:28,210 --> 00:32:36,490 I'm now going to run Python of Edges.py, after, sorry, user error. 699 00:32:36,490 --> 00:32:38,930 We'll see what syntax error means soon. 700 00:32:38,930 --> 00:32:41,470 Let me go ahead and run the code now, Edges.py. 701 00:32:41,470 --> 00:32:44,830 Let me now open that new file, Out.bmp. 702 00:32:44,830 --> 00:32:49,510 And before we had this, and now, which will look especially familiar 703 00:32:49,510 --> 00:32:52,210 if you did the more comfortable version of P set 4, 704 00:32:52,210 --> 00:32:55,340 we now get this, after just four lines of code. 705 00:32:55,340 --> 00:32:58,120 So again, suggesting the power of using a language that's better 706 00:32:58,120 --> 00:32:59,560 optimized for the tool at hand. 707 00:32:59,560 --> 00:33:02,950 And at the risk of really making folks sad, let's go ahead 708 00:33:02,950 --> 00:33:06,820 and re-implement, if we could, problem set five, real quickly here. 709 00:33:06,820 --> 00:33:11,080 Let me go ahead and open another version of this code, 710 00:33:11,080 --> 00:33:14,307 wherein I have a C version, just from problem 711 00:33:14,307 --> 00:33:16,390 set five, wherein you implemented a spell checker, 712 00:33:16,390 --> 00:33:18,640 loading 100,000 plus words into memory.
713 00:33:18,640 --> 00:33:22,390 And then you kept track of just how much time and memory it took. 714 00:33:22,390 --> 00:33:24,340 And that probably took a while, implementing 715 00:33:24,340 --> 00:33:26,530 all of those functions in Dictionary.c. 716 00:33:26,530 --> 00:33:32,240 Let me instead now go into a new file, called Dictionary.py. 717 00:33:32,240 --> 00:33:35,200 And let me stipulate, for the sake of discussion, 718 00:33:35,200 --> 00:33:37,660 that we already wrote in advance, Speller.py, 719 00:33:37,660 --> 00:33:39,850 which corresponds to Speller.c. 720 00:33:39,850 --> 00:33:41,380 You didn't write either of those. 721 00:33:41,380 --> 00:33:43,600 Recall for problem set five, we gave you Speller.c. 722 00:33:43,600 --> 00:33:45,558 Assume that we're going to give you Speller.py. 723 00:33:45,558 --> 00:33:52,030 So the onus on us right now is only to implement Speller, Dictionary.py. 724 00:33:52,030 --> 00:33:54,940 All right, so I'm going to go ahead and define a few functions. 725 00:33:54,940 --> 00:33:58,000 And we're going to see now the syntax for defining functions in Python. 726 00:33:58,000 --> 00:34:02,230 I want to go ahead and define first, a hash table, which 727 00:34:02,230 --> 00:34:04,840 was the very first thing you defined in Dictionary.c. 728 00:34:04,840 --> 00:34:09,969 I'm going to go ahead, then, and say words gets this, give me a dictionary, 729 00:34:09,969 --> 00:34:11,683 otherwise known as a hash table. 730 00:34:11,683 --> 00:34:13,600 All right, now let me define a function called 731 00:34:13,600 --> 00:34:16,630 check, which was the first function you might have implemented. 732 00:34:16,630 --> 00:34:19,000 Check is going to take a word, and you'll see in Python, 733 00:34:19,000 --> 00:34:20,375 the syntax is a little different. 734 00:34:20,375 --> 00:34:21,880 You don't specify the return type. 735 00:34:21,880 --> 00:34:24,610 You use the word Def instead to define. 
736 00:34:24,610 --> 00:34:28,540 You still specify the name of the function and any arguments thereto. 737 00:34:28,540 --> 00:34:31,210 But you omit any mention of types. 738 00:34:31,210 --> 00:34:33,280 But you do use a colon and indent. 739 00:34:33,280 --> 00:34:37,780 So how do I check if a word is in my dictionary, or in my hash table? 740 00:34:37,780 --> 00:34:41,440 Well, in Python, I can just say, if word in words, 741 00:34:41,440 --> 00:34:46,570 go ahead and return true, else go ahead and return false, done, 742 00:34:46,570 --> 00:34:47,949 with the check function. 743 00:34:47,949 --> 00:34:49,639 All right, now I want to do like load. 744 00:34:49,639 --> 00:34:52,639 That was the heavy lift, where you had to load the big file into memory. 745 00:34:52,639 --> 00:34:54,306 So let me define a function called load. 746 00:34:54,306 --> 00:34:56,650 It takes a string, the name of a file to load. 747 00:34:56,650 --> 00:34:59,980 So I'll call that Dictionary, just like in C, but no data type. 748 00:34:59,980 --> 00:35:04,180 Let me go ahead and open a file by using an open function in Python, 749 00:35:04,180 --> 00:35:06,740 by opening that Dictionary in read mode. 750 00:35:06,740 --> 00:35:10,360 So this is a little similar to fopen, a function in C you might recall. 751 00:35:10,360 --> 00:35:12,880 Then let me iterate over every line in the file. 752 00:35:12,880 --> 00:35:17,800 In Python, this is pretty pleasant, for line in file colon indent. 753 00:35:17,800 --> 00:35:22,510 How, now, do I get at the current word, and then strip off the new line, 754 00:35:22,510 --> 00:35:25,570 because in this file of words, 140,000 words, 755 00:35:25,570 --> 00:35:28,752 there's word backslash n, word backslash n, all right? 
756 00:35:28,752 --> 00:35:31,210 Well, let me go ahead and get a word from the current line, 757 00:35:31,210 --> 00:35:34,840 but strip off, from the right end of the string, the new line, which 758 00:35:34,840 --> 00:35:37,540 the rstrip function in Python does for me. 759 00:35:37,540 --> 00:35:42,370 Then let me go ahead and add to my dictionary, or hash table, that word, 760 00:35:42,370 --> 00:35:43,030 done. 761 00:35:43,030 --> 00:35:45,535 Let me go ahead and close the file for good measure. 762 00:35:45,535 --> 00:35:48,160 And then let me go ahead and return true, because all was well. 763 00:35:48,160 --> 00:35:50,320 That's it for the load function in Python. 764 00:35:50,320 --> 00:35:51,580 How about the size function? 765 00:35:51,580 --> 00:35:54,820 This did not take any arguments, it just returns the size of the hash table 766 00:35:54,820 --> 00:35:55,990 or dictionary in Python. 767 00:35:55,990 --> 00:35:59,980 I can do that by returning the length of the dictionary in question. 768 00:35:59,980 --> 00:36:04,660 And then lastly, gone from the world of Python is malloc and free. 769 00:36:04,660 --> 00:36:06,090 Memory is managed for you. 770 00:36:06,090 --> 00:36:08,950 So no matter what I do, there's nothing to unload. 771 00:36:08,950 --> 00:36:10,820 The computer will do that for me. 772 00:36:10,820 --> 00:36:14,860 So I give you, in these functions, problem set five in Python. 773 00:36:14,860 --> 00:36:17,020 So, I'm sorry, we made you write it in C first. 774 00:36:17,020 --> 00:36:20,620 But the implication now is that, what are you getting for free, 775 00:36:20,620 --> 00:36:21,850 in a language like Python?
776 00:36:21,850 --> 00:36:24,370 Well, encapsulated in this one line of code 777 00:36:24,370 --> 00:36:28,270 is much of what you wrote for problem set five, implementing 778 00:36:28,270 --> 00:36:31,270 your array for all of your letters of the alphabet or more, 779 00:36:31,270 --> 00:36:34,390 all of the linked lists that you implemented to create chains, 780 00:36:34,390 --> 00:36:35,930 to store all of those words. 781 00:36:35,930 --> 00:36:37,060 All of that is happening. 782 00:36:37,060 --> 00:36:40,090 It's just someone else in the world wrote that code for you. 783 00:36:40,090 --> 00:36:43,060 And you can now use it by way of a dictionary. 784 00:36:43,060 --> 00:36:45,550 And actually, I can change this a little bit, 785 00:36:45,550 --> 00:36:48,670 because add is technically not the right function to use here. 786 00:36:48,670 --> 00:36:51,620 I'm actually treating the dictionary as something simpler, a set. 787 00:36:51,620 --> 00:36:55,420 So I'm going to make one tweak, set recall was another data type in Python. 788 00:36:55,420 --> 00:36:57,700 But set just allows it to handle duplicates, 789 00:36:57,700 --> 00:37:00,430 and it allows me to just throw things into it by literally 790 00:37:00,430 --> 00:37:02,320 using a function as simple as add. 791 00:37:02,320 --> 00:37:05,170 And I'm going to make one other tweak here, 792 00:37:05,170 --> 00:37:09,790 because, when I'm checking a word, it's possible it might be given 793 00:37:09,790 --> 00:37:12,520 to me in uppercase or capitalized. 794 00:37:12,520 --> 00:37:15,880 It's not going to necessarily come in in the same lowercase format 795 00:37:15,880 --> 00:37:17,470 that my dictionary did. 796 00:37:17,470 --> 00:37:22,390 I can force every word to lowercase by using word.lower. 797 00:37:22,390 --> 00:37:24,500 And I don't have to do it character for character, 798 00:37:24,500 --> 00:37:29,800 I can do the whole darn string at once, by just saying word.lower. 
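The walkthrough above can be sketched as a single file. This is a minimal reconstruction from the spoken description, not necessarily the exact file typed in lecture: it assumes the four function names from problem set five (check, load, size, unload) and a module-level set named words.

```python
# dictionary.py: a sketch reconstructed from the spoken walkthrough.
# The variable name `words` and the unload() body are assumptions.

words = set()  # a set stores each word once and supports fast `in` checks


def check(word):
    """Return True if word is in the dictionary, ignoring case."""
    return word.lower() in words


def load(dictionary):
    """Load every line of the file named `dictionary` into the set."""
    file = open(dictionary, "r")
    for line in file:
        # rstrip() removes trailing whitespace, including the newline
        words.add(line.rstrip())
    file.close()
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free: Python manages memory itself."""
    return True
```

The whole hash table from the C version collapses into the one line that creates the set; the rest is bookkeeping around it.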
799 00:37:29,800 --> 00:37:32,860 All right, let me go ahead and open up a terminal window here. 800 00:37:32,860 --> 00:37:36,118 And let me go into, first, my C version, on the left. 801 00:37:36,118 --> 00:37:39,160 And actually I'm going to go ahead and split my terminal window into two. 802 00:37:39,160 --> 00:37:44,007 And on the right, I'm going to go into a version that I essentially just wrote. 803 00:37:44,007 --> 00:37:46,840 But it's also available online, if you want to play along afterward. 804 00:37:46,840 --> 00:37:50,170 I'm going to go ahead and make speller in C on the left, 805 00:37:50,170 --> 00:37:52,270 and note that it takes a moment to compile. 806 00:37:52,270 --> 00:37:56,530 Then I'm going to be ready to run speller of dictionaries, 807 00:37:56,530 --> 00:37:59,330 let's do like the Sherlock Holmes text, which is pretty big. 808 00:37:59,330 --> 00:38:03,970 And then over here, let me get ready to run Python of speller 809 00:38:03,970 --> 00:38:07,733 on texts/holmes.txt. 810 00:38:07,733 --> 00:38:10,150 So the syntax is a little different at the command prompt. 811 00:38:10,150 --> 00:38:12,880 I just, on the left, have to compile the code, with make, 812 00:38:12,880 --> 00:38:14,650 and then run it with ./speller. 813 00:38:14,650 --> 00:38:16,370 On the right, I don't need to compile it. 814 00:38:16,370 --> 00:38:17,860 But I do need to use the interpreter. 815 00:38:17,860 --> 00:38:20,230 So even though the lines are wrapping a little bit here, 816 00:38:20,230 --> 00:38:22,180 let me go ahead and run it on the right. 817 00:38:22,180 --> 00:38:24,305 And I'm going to count how long it takes, verbally, 818 00:38:24,305 --> 00:38:25,570 for demonstration's sake. 819 00:38:25,570 --> 00:38:28,720 One Mississippi, two Mississippi, three Mississippi, OK, 820 00:38:28,720 --> 00:38:31,190 so it's like three seconds, give or take.
821 00:38:31,190 --> 00:38:33,520 Now running it in Python, keeping in mind, 822 00:38:33,520 --> 00:38:37,103 I spent way fewer hours implementing a spell checker in Python 823 00:38:37,103 --> 00:38:38,770 than you might have in problem set five. 824 00:38:38,770 --> 00:38:42,007 But what's the trade-off going to be, and what kinds of design decisions 825 00:38:42,007 --> 00:38:43,840 do we all now need to be making consciously? 826 00:38:43,840 --> 00:38:46,300 Here we go, on the right, in Python. 827 00:38:46,300 --> 00:38:50,020 One Mississippi, two Mississippi, three Mississippi, four Mississippi, 828 00:38:50,020 --> 00:38:54,070 five Mississippi, six Mississippi, seven Mississippi, eight Mississippi, 829 00:38:54,070 --> 00:38:57,100 nine Mississippi, 10 Mississippi, 11 Mississippi, 830 00:38:57,100 --> 00:38:59,990 all right, so 10 or 11 seconds. 831 00:38:59,990 --> 00:39:01,980 So which one is better? 832 00:39:01,980 --> 00:39:06,550 Let's go to the group here, which of these programs is the better one? 833 00:39:06,550 --> 00:39:10,780 How might you answer that question, based on demonstration alone? 834 00:39:10,780 --> 00:39:11,530 What do you think? 835 00:39:11,530 --> 00:39:13,738 AUDIENCE: I think Python's better for the programmer, 836 00:39:13,738 --> 00:39:17,847 more comfortable for the programmer, but C is better for the user. 837 00:39:17,847 --> 00:39:19,680 DAVID J. MALAN: OK, so Python, to summarize, 838 00:39:19,680 --> 00:39:23,460 is better for the programmer, because it was way faster to write, 839 00:39:23,460 --> 00:39:26,460 but C is maybe better for the computer, because it's much faster to run. 840 00:39:26,460 --> 00:39:28,127 I think that's a reasonable formulation. 841 00:39:28,127 --> 00:39:29,430 Other opinions? 842 00:39:29,430 --> 00:39:30,588 Yeah. 843 00:39:30,588 --> 00:39:32,880 AUDIENCE: I think it depends on the size of the project 844 00:39:32,880 --> 00:39:33,910 that you're dealing with. 
845 00:39:33,910 --> 00:39:36,285 So if it's going to be something that's relatively quick, 846 00:39:36,285 --> 00:39:38,710 I might not care that it takes 10 seconds to do it. 847 00:39:38,710 --> 00:39:40,910 And it could be way faster to do it with Python. 848 00:39:40,910 --> 00:39:44,070 Whereas with C, if I'm dealing with something like a massive data 849 00:39:44,070 --> 00:39:48,300 set or something huge, then that time is going to really build up on, 850 00:39:48,300 --> 00:39:52,740 it might be worth it to put in the upfront effort and just load it into C, 851 00:39:52,740 --> 00:39:56,260 so the process continually will run faster over a longer period of time. 852 00:39:56,260 --> 00:39:57,430 DAVID J. MALAN: Absolutely, a really good answer. 853 00:39:57,430 --> 00:40:00,300 And let me summarize, is it depends on the workload, if you will. 854 00:40:00,300 --> 00:40:04,050 If you have a very large data set, you might 855 00:40:04,050 --> 00:40:07,128 want to optimize your code to be as fast and performant as it can be, 856 00:40:07,128 --> 00:40:09,420 especially if you're running that code again and again. 857 00:40:09,420 --> 00:40:10,950 Maybe you're a company like Google. 858 00:40:10,950 --> 00:40:13,110 People are searching a huge database all the time. 859 00:40:13,110 --> 00:40:15,750 You really want to squeeze every bit of performance 860 00:40:15,750 --> 00:40:17,222 as you can out of the computer. 861 00:40:17,222 --> 00:40:19,680 You might want to have someone smart take a language like C 862 00:40:19,680 --> 00:40:21,450 and write it at a very low level. 863 00:40:21,450 --> 00:40:22,500 It's going to be painful. 864 00:40:22,500 --> 00:40:23,400 They're going to have bugs. 865 00:40:23,400 --> 00:40:26,150 They're going to have to deal with memory management and the like. 866 00:40:26,150 --> 00:40:29,490 But if and when it works correctly, it's going to be much faster, it would seem. 
867 00:40:29,490 --> 00:40:32,280 By contrast, if you have a data set that's big, 868 00:40:32,280 --> 00:40:35,820 and 140,000 words is not small, but you don't 869 00:40:35,820 --> 00:40:38,940 want to spend like 5 hours, 10 hours, a week of your time, 870 00:40:38,940 --> 00:40:41,063 building a spell checker or a dictionary, 871 00:40:41,063 --> 00:40:43,980 you can instead leverage a different language with different libraries 872 00:40:43,980 --> 00:40:48,690 and build on top of it, in order to prioritize the human time instead. 873 00:40:48,690 --> 00:40:50,841 Other thoughts? 874 00:40:50,841 --> 00:40:52,789 AUDIENCE: Would you, because with Python, 875 00:40:52,789 --> 00:40:56,928 doesn't it also like convert the words, or like 876 00:40:56,928 --> 00:40:58,539 convert the words, for a lesson? 877 00:40:58,539 --> 00:41:00,581 When we convert that into the same version again, 878 00:41:00,581 --> 00:41:04,148 do we just take that into view? 879 00:41:04,148 --> 00:41:06,940 DAVID J. MALAN: That's a perfect segue to exactly the next point we 880 00:41:06,940 --> 00:41:09,340 wanted to make, which was, is there something in between? 881 00:41:09,340 --> 00:41:10,360 And indeed there is. 882 00:41:10,360 --> 00:41:12,970 I'm oversimplifying what this language is actually doing. 883 00:41:12,970 --> 00:41:15,280 It's not as stark a difference as saying, like, hey, 884 00:41:15,280 --> 00:41:18,340 Python is four times slower than C. Like that's not the right takeaway. 885 00:41:18,340 --> 00:41:21,460 There are absolutely ways that engineers can optimize languages, 886 00:41:21,460 --> 00:41:23,230 as they have already done for Python. 887 00:41:23,230 --> 00:41:25,840 And in fact, I've configured my settings in such a way 888 00:41:25,840 --> 00:41:28,777 that I've kind of dramatized just how big the difference is. 889 00:41:28,777 --> 00:41:30,610 It is going to be slower, Python, typically, 890 00:41:30,610 --> 00:41:31,930 than the equivalent C program. 
891 00:41:31,930 --> 00:41:33,940 But it doesn't have to be as big of a gap 892 00:41:33,940 --> 00:41:37,720 as it is here, because, indeed, among the features you can turn on in Python 893 00:41:37,720 --> 00:41:40,120 is to save some intermediate results. 894 00:41:40,120 --> 00:41:43,360 Technically speaking, yes, Python is interpreting 895 00:41:43,360 --> 00:41:46,690 Dictionary.py and these other files, translating them 896 00:41:46,690 --> 00:41:48,203 from one language to another. 897 00:41:48,203 --> 00:41:51,370 But that doesn't mean it has to do that every darn time you run the program. 898 00:41:51,370 --> 00:41:57,020 As you propose, you can save, or cache, C-A-C-H-E, the results of that process. 899 00:41:57,020 --> 00:42:00,440 So that the second time and the third time are actually notably faster. 900 00:42:00,440 --> 00:42:03,430 And, in fact, Python itself, the interpreter, the most popular version 901 00:42:03,430 --> 00:42:05,980 thereof, itself is actually implemented in C. 902 00:42:05,980 --> 00:42:09,290 So you can make sure that your interpreter is as fast as possible. 903 00:42:09,290 --> 00:42:11,350 And what then is maybe the high level takeaway? 904 00:42:11,350 --> 00:42:14,320 Yes, if you are going to try to squeeze every bit of performance 905 00:42:14,320 --> 00:42:17,710 out of your code, and maybe code is constrained. 906 00:42:17,710 --> 00:42:19,150 Maybe you have very small devices. 907 00:42:19,150 --> 00:42:20,770 Maybe it's like a watch nowadays. 908 00:42:20,770 --> 00:42:26,320 Or maybe it's a sensor that's installed in some small format in an appliance, 909 00:42:26,320 --> 00:42:29,710 or in infrastructure, where you don't have much battery life 910 00:42:29,710 --> 00:42:31,630 and you don't have much size, you might want 911 00:42:31,630 --> 00:42:33,710 to minimize just how much work is being done. 
912 00:42:33,710 --> 00:42:36,743 And so the faster the code runs, and the better it's going to be, 913 00:42:36,743 --> 00:42:38,410 if it's implemented something low level. 914 00:42:38,410 --> 00:42:42,310 So C is still very commonly used for certain types of applications. 915 00:42:42,310 --> 00:42:45,580 But, again, if you just want to solve real world problems, 916 00:42:45,580 --> 00:42:49,840 and get real work done, and your time is just as, if not more, valuable 917 00:42:49,840 --> 00:42:52,000 than the device you're running it on, long term, 918 00:42:52,000 --> 00:42:55,358 you know what, Python is among the most popular languages as well. 919 00:42:55,358 --> 00:42:58,150 And frankly, if I were implementing a spell checker moving forward, 920 00:42:58,150 --> 00:42:59,710 I'm probably starting with Python. 921 00:42:59,710 --> 00:43:01,543 And I'm not going to waste time implementing 922 00:43:01,543 --> 00:43:04,930 all of that low-level stuff, because the whole point of using newer, 923 00:43:04,930 --> 00:43:09,460 modern languages is to use abstractions that other people have created for you. 924 00:43:09,460 --> 00:43:12,910 And by abstraction, I mean something like the dictionary function, 925 00:43:12,910 --> 00:43:15,370 that just gives you a dictionary, or hash table, 926 00:43:15,370 --> 00:43:19,225 or the equivalent version that I used, which in this case was a set. 927 00:43:19,225 --> 00:43:22,720 All right, any questions, then, on Python thus far? 928 00:43:22,720 --> 00:43:25,730 929 00:43:25,730 --> 00:43:26,710 No, all right. 930 00:43:26,710 --> 00:43:27,710 Oh, yeah, in the middle. 
931 00:43:27,710 --> 00:43:29,920 AUDIENCE: Could you compile the Python code, 932 00:43:29,920 --> 00:43:34,610 or is there some, I'd imagine that with the audience that can happen, 933 00:43:34,610 --> 00:43:38,180 but it feels like if you can just come up with a Python compiler, 934 00:43:38,180 --> 00:43:40,093 that would give you the best of both worlds. 935 00:43:40,093 --> 00:43:42,260 DAVID J. MALAN: Really good question or observation, 936 00:43:42,260 --> 00:43:43,718 could you just compile Python code? 937 00:43:43,718 --> 00:43:47,180 Yes, absolutely, this idea of compiling code or interpreting code 938 00:43:47,180 --> 00:43:49,490 is not native to the language itself. 939 00:43:49,490 --> 00:43:52,410 It tends to be native to the conventions that we humans use. 940 00:43:52,410 --> 00:43:54,730 So you could actually write an interpreter for C 941 00:43:54,730 --> 00:43:57,980 that would read it top to bottom, left to right, converting it to, on the fly, 942 00:43:57,980 --> 00:44:01,640 something the computer understands, but historically that's not been the case. 943 00:44:01,640 --> 00:44:03,560 C is generally a compiled language. 944 00:44:03,560 --> 00:44:04,670 But it doesn't have to be. 945 00:44:04,670 --> 00:44:08,010 What Python nowadays is actually doing is what you described earlier. 946 00:44:08,010 --> 00:44:10,220 It technically is, sort of unbeknownst to us, 947 00:44:10,220 --> 00:44:13,970 compiling the code, technically not into 0's and 1's, technically 948 00:44:13,970 --> 00:44:17,510 into something called byte code, which is this intermediate step that 949 00:44:17,510 --> 00:44:21,510 just doesn't take as much time as it would to recompile the whole thing. 950 00:44:21,510 --> 00:44:24,377 And this is an area of research for computer scientists working 951 00:44:24,377 --> 00:44:26,960 in programming languages, to improve these kinds of paradigms. 952 00:44:26,960 --> 00:44:27,500 Why? 
953 00:44:27,500 --> 00:44:30,740 Well, honestly, for you and I, the programmer, it's just much easier to, 954 00:44:30,740 --> 00:44:33,800 one, run the code and not worry about the stupid second step 955 00:44:33,800 --> 00:44:35,100 of compiling it all the time. 956 00:44:35,100 --> 00:44:35,600 Why? 957 00:44:35,600 --> 00:44:38,220 It's literally half as many steps for me, the human. 958 00:44:38,220 --> 00:44:40,500 And that's a nice thing to optimize for. 959 00:44:40,500 --> 00:44:44,330 And ultimately, too, you might want all of the fancy features that 960 00:44:44,330 --> 00:44:45,920 come with these other languages. 961 00:44:45,920 --> 00:44:47,960 So you should really just be fine-tuning how 962 00:44:47,960 --> 00:44:51,800 you can enable these features, as opposed to shying away from them here. 963 00:44:51,800 --> 00:44:54,590 And, in fact, the only time I personally ever use C 964 00:44:54,590 --> 00:44:57,950 is from like September to October of every year, during CS50. 965 00:44:57,950 --> 00:45:00,350 Almost every other month do I reach for Python, 966 00:45:00,350 --> 00:45:03,690 or another language called JavaScript, to actually get real work done, 967 00:45:03,690 --> 00:45:07,640 which is not to impugn C. It's just that those other languages tend to be better 968 00:45:07,640 --> 00:45:11,030 fits for the amount of time I have to allocate, and the types of problems 969 00:45:11,030 --> 00:45:11,905 that I want to solve. 970 00:45:11,905 --> 00:45:14,405 All right, let's go ahead and take a five minute break here. 971 00:45:14,405 --> 00:45:17,390 And when we come back, we'll start writing some programs from scratch. 972 00:45:17,390 --> 00:45:18,300 All right.
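The byte code step mentioned in the answer before the break can be inspected from Python itself. The following is a small sketch using the built-in compile function and the standard dis module; the two-line program in `source` is made up for illustration, and the exact instructions that dis lists vary between Python versions.

```python
import dis

# A made-up two-line program, held as ordinary text
source = "x = 1 + 2\nprint(x)"

# compile() turns source text into a code object that holds byte code
code = compile(source, "<example>", "exec")

# co_code is the raw byte code: a bytes object, not machine instructions
print(type(code.co_code))

# dis prints a human-readable listing of that byte code
dis.dis(code)

# The interpreter then executes the byte code
exec(code)
```

For imported modules, the standard interpreter saves this byte code in a `__pycache__` directory, which is the caching idea raised in the earlier question.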
973 00:45:18,300 --> 00:45:21,740 So let's go ahead and start writing some code from the beginning 974 00:45:21,740 --> 00:45:24,710 here, whereby we start small with some simple examples, 975 00:45:24,710 --> 00:45:28,042 and then we'll build our way up to more sophisticated examples in Python. 976 00:45:28,042 --> 00:45:29,750 But what we'll do along the way is first, 977 00:45:29,750 --> 00:45:31,865 look side by side at what the C code looked 978 00:45:31,865 --> 00:45:34,640 like way back in week 1 or 2 or 3 and so forth, 979 00:45:34,640 --> 00:45:36,890 and then write the corresponding Python code at right. 980 00:45:36,890 --> 00:45:39,530 And then we'll transition just to focusing on Python itself. 981 00:45:39,530 --> 00:45:42,322 What I've done in advance today is I've downloaded some of the code 982 00:45:42,322 --> 00:45:44,930 from the course's website, my source 6 directory, which 983 00:45:44,930 --> 00:45:47,825 contains all of the pre-written C code from weeks past. 984 00:45:47,825 --> 00:45:49,700 But it'll also have copies of the Python code 985 00:45:49,700 --> 00:45:51,660 we'll write here together and look at. 986 00:45:51,660 --> 00:45:55,445 So first, here is Hello.c back from week 0. 987 00:45:55,445 --> 00:45:57,323 And this was version 0 of it. 988 00:45:57,323 --> 00:45:58,740 I'm going to go ahead and do this. 989 00:45:58,740 --> 00:46:02,240 I'm going to go ahead and split my code window up here. 990 00:46:02,240 --> 00:46:05,042 I'm going to go ahead and create a new file called Hello.py. 991 00:46:05,042 --> 00:46:07,250 And this isn't something you'll typically have to do, 992 00:46:07,250 --> 00:46:08,810 laying your code out side by side. 
993 00:46:08,810 --> 00:46:10,880 But I've just clicked the little icon in VS Code 994 00:46:10,880 --> 00:46:14,330 that looks like two columns, that splits my code editor into two places, 995 00:46:14,330 --> 00:46:17,330 so that we can, in fact, see things, for now, side by side, 996 00:46:17,330 --> 00:46:18,788 with my terminal window down below. 997 00:46:18,788 --> 00:46:21,747 All right, now I'm going to go ahead and write the corresponding Python 998 00:46:21,747 --> 00:46:24,560 program on the right, which, recall, was just print, quote 999 00:46:24,560 --> 00:46:27,170 unquote, "Hello, world," and that's it. 1000 00:46:27,170 --> 00:46:29,420 Now down in my terminal window, I'm going 1001 00:46:29,420 --> 00:46:33,080 to go ahead and run Python of Hello.py, Enter, and voila, 1002 00:46:33,080 --> 00:46:34,450 we've got Hello.py working. 1003 00:46:34,450 --> 00:46:36,950 So again, I'm not going to play any further with the C code. 1004 00:46:36,950 --> 00:46:38,930 It's there just to jog your memory left and right. 1005 00:46:38,930 --> 00:46:41,240 So let's now look at a second version of Hello, world 1006 00:46:41,240 --> 00:46:44,452 from that first week, whereby if I go and get Hello1.c, 1007 00:46:44,452 --> 00:46:46,160 I'm going to drag that over to the right. 1008 00:46:46,160 --> 00:46:48,980 Whoops, I'm going to go ahead and drag that over to the left here. 1009 00:46:48,980 --> 00:46:51,950 And now, on the right, let's modify Hello.py 1010 00:46:51,950 --> 00:46:55,700 to look a little more like this second version in C, all right? 1011 00:46:55,700 --> 00:46:59,867 I want to get an answer from the user as a return value, 1012 00:46:59,867 --> 00:47:01,700 but I also want to get some input from them. 1013 00:47:01,700 --> 00:47:05,420 So from CS50, I'm going to import the function called getString for now. 
1014 00:47:05,420 --> 00:47:07,170 We're going to get rid of that eventually, 1015 00:47:07,170 --> 00:47:08,962 but for now, it's a helpful training wheel. 1016 00:47:08,962 --> 00:47:11,180 And then down here, I'm going to say, answer 1017 00:47:11,180 --> 00:47:14,510 equals getString quote unquote, "What's your name"? 1018 00:47:14,510 --> 00:47:15,980 Question mark, space. 1019 00:47:15,980 --> 00:47:17,453 But no semicolon, no data type. 1020 00:47:17,453 --> 00:47:19,370 And then I'm going to go ahead and print, just 1021 00:47:19,370 --> 00:47:25,118 like the first example on the slide, Hello, comma space plus answer. 1022 00:47:25,118 --> 00:47:26,660 And now let me go ahead and run this. 1023 00:47:26,660 --> 00:47:29,660 Python, of Hello.py, all right, it's asking me what's my name. 1024 00:47:29,660 --> 00:47:30,170 David. 1025 00:47:30,170 --> 00:47:31,370 Hello comma David. 1026 00:47:31,370 --> 00:47:36,507 But it's worth calling attention to the fact that I've also simplified further. 1027 00:47:36,507 --> 00:47:38,840 It's not just that the individual functions are simpler. 1028 00:47:38,840 --> 00:47:42,470 What is also now glaringly omitted from my Python code at right, 1029 00:47:42,470 --> 00:47:44,657 both in this version, and the previous version. 1030 00:47:44,657 --> 00:47:46,115 What did I not bother implementing? 1031 00:47:46,115 --> 00:47:47,267 AUDIENCE: The main code. 1032 00:47:47,267 --> 00:47:49,850 DAVID J. MALAN: Yeah, so I didn't even need to implement main. 1033 00:47:49,850 --> 00:47:53,210 We'll revisit the main function, because having a main function 1034 00:47:53,210 --> 00:47:54,860 actually does solve problems sometimes. 1035 00:47:54,860 --> 00:47:56,090 But it's no longer required. 1036 00:47:56,090 --> 00:47:59,750 In C you have to have that to kick-start the entire process of actually running 1037 00:47:59,750 --> 00:48:00,337 your code. 
1038 00:48:00,337 --> 00:48:03,170 And in fact, if you were missing main, as you might have experienced 1039 00:48:03,170 --> 00:48:06,033 if you accidentally compiled Helpers.c instead of the file 1040 00:48:06,033 --> 00:48:08,450 that contained main, you would have seen a compiler error. 1041 00:48:08,450 --> 00:48:09,658 In Python it's not necessary. 1042 00:48:09,658 --> 00:48:12,410 In Python you can just jump right in, start programming, and boom, 1043 00:48:12,410 --> 00:48:13,350 you're good to go. 1044 00:48:13,350 --> 00:48:15,225 Especially if it's a small program like this, 1045 00:48:15,225 --> 00:48:18,210 you don't need the added overhead or complexity of a main function. 1046 00:48:18,210 --> 00:48:19,860 So that's one other difference here. 1047 00:48:19,860 --> 00:48:23,390 All right, there are a few other ways we could say Hello, world. 1048 00:48:23,390 --> 00:48:26,160 Recall that I could use a format string. 1049 00:48:26,160 --> 00:48:30,360 So I could put this whole thing in quotes, I could use this f prefix. 1050 00:48:30,360 --> 00:48:33,250 And then let me go ahead and run Python of Hello.py again. 1051 00:48:33,250 --> 00:48:35,250 You can perhaps see where we're going with this. 1052 00:48:35,250 --> 00:48:37,170 Let me type my name, David, and here we go. 1053 00:48:37,170 --> 00:48:39,570 OK, that's the mistake that someone identified earlier, 1054 00:48:39,570 --> 00:48:41,040 you need the curly braces. 1055 00:48:41,040 --> 00:48:44,940 Otherwise no variables are interpolated, that is substituted, 1056 00:48:44,940 --> 00:48:46,390 with their actual values. 1057 00:48:46,390 --> 00:48:50,160 So if I go back in and add those curly braces to the F string, 1058 00:48:50,160 --> 00:48:54,632 now let me run Python of Hello.py, type in my name, and there we go. 1059 00:48:54,632 --> 00:48:55,590 We're back in business. 1060 00:48:55,590 --> 00:48:56,388 Which one's better? 1061 00:48:56,388 --> 00:48:57,180 I mean, it depends. 
1062 00:48:57,180 --> 00:49:00,540 But generally speaking, making shorter, more concise code 1063 00:49:00,540 --> 00:49:01,870 tends to be a good thing. 1064 00:49:01,870 --> 00:49:06,450 So stylistically, the F string is probably a reasonable instinct to have. 1065 00:49:06,450 --> 00:49:09,280 All right, well, what more can we do besides this? 1066 00:49:09,280 --> 00:49:12,180 Well, let me go ahead here and let's get rid of the training wheel 1067 00:49:12,180 --> 00:49:13,230 altogether, actually. 1068 00:49:13,230 --> 00:49:15,180 So same C code at left. 1069 00:49:15,180 --> 00:49:18,150 Let me get rid of the CS50 library, which we will ultimately, 1070 00:49:18,150 --> 00:49:19,620 in a couple of weeks, anyway. 1071 00:49:19,620 --> 00:49:22,560 I can't use getString, but I can use a function 1072 00:49:22,560 --> 00:49:24,730 that comes with Python called input. 1073 00:49:24,730 --> 00:49:28,050 And, in fact, this is actually a one-for-one substitution, pretty much. 1074 00:49:28,050 --> 00:49:31,380 There's really no downside to using input instead of getString. 1075 00:49:31,380 --> 00:49:33,420 We implement getString just for consistency 1076 00:49:33,420 --> 00:49:37,800 with what you saw in C. Python of Hello.py, what's your name, David. 1077 00:49:37,800 --> 00:49:39,310 Still actually works the same. 1078 00:49:39,310 --> 00:49:41,227 So gone are the CS50 specific training wheels. 1079 00:49:41,227 --> 00:49:43,227 But we're going to bring them back shortly, just 1080 00:49:43,227 --> 00:49:45,240 to deal with integers or floats or other values, 1081 00:49:45,240 --> 00:49:47,490 too, because it's going to make our lives a little simpler, 1082 00:49:47,490 --> 00:49:48,510 with error checking. 1083 00:49:48,510 --> 00:49:52,350 All right, any questions, before we now pivot to revisiting other examples 1084 00:49:52,350 --> 00:49:56,280 from week 1, but now in Python? 1085 00:49:56,280 --> 00:49:58,110 All right, let me go ahead and open up now. 
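The three ways of building the greeting discussed so far can be compared side by side. In this sketch the name is a fixed value so that the code runs without a keyboard; in lecture it comes from the user, as the comment shows. The variable names other than `answer` are assumptions for illustration.

```python
# In lecture the value comes from the user:
#     answer = input("What's your name? ")
# A fixed value is used here so the sketch runs unattended.
answer = "David"

# Concatenation with the + operator
concatenated = "hello, " + answer

# The f prefix without curly braces: nothing is interpolated,
# so the literal word "answer" ends up in the output
missing_braces = f"hello, answer"

# Curly braces tell the f-string to substitute the variable's value
interpolated = f"hello, {answer}"

print(concatenated)
print(missing_braces)
print(interpolated)
```

Only the second line prints the word "answer" instead of the name, which is the mistake made and corrected on screen.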
1086 00:49:58,110 --> 00:50:03,240 Let's say Calculator0.c, which was one of the first examples we did involving 1087 00:50:03,240 --> 00:50:06,870 math and operators like that, as well as functions like getInt, 1088 00:50:06,870 --> 00:50:11,820 let me go ahead and create a new file now called Calculator.py, 1089 00:50:11,820 --> 00:50:15,360 at right, so that I have my C code at left still, 1090 00:50:15,360 --> 00:50:16,950 and my Python code at right. 1091 00:50:16,950 --> 00:50:20,610 All right, let me go dive into a translation of this code into Python. 1092 00:50:20,610 --> 00:50:23,100 I am going to use getInt from the CS50 library. 1093 00:50:23,100 --> 00:50:24,960 So let me import that. 1094 00:50:24,960 --> 00:50:27,340 I'm going to go ahead now and get an Int from the user. 1095 00:50:27,340 --> 00:50:31,000 So x equals getInt, and I'll ask them for an x value, 1096 00:50:31,000 --> 00:50:32,430 just like we did weeks ago. 1097 00:50:32,430 --> 00:50:37,800 No need to specify a semicolon, though, or an Int for the x. 1098 00:50:37,800 --> 00:50:38,940 It will just figure it out. 1099 00:50:38,940 --> 00:50:42,090 Y is going to get another Int via y colon, 1100 00:50:42,090 --> 00:50:46,830 and then down here, I'm going to go ahead and say print of x plus y. 1101 00:50:46,830 --> 00:50:48,720 So this is already a bit new. 1102 00:50:48,720 --> 00:50:53,400 Recall, the C version required that I use this format string, as well 1103 00:50:53,400 --> 00:50:54,428 as printf itself. 1104 00:50:54,428 --> 00:50:56,220 Python is just a little more user-friendly. 1105 00:50:56,220 --> 00:50:59,670 If all you want to do is print out a value, like x plus y, just print it. 1106 00:50:59,670 --> 00:51:02,610 Don't futz with any percent signs or format codes. 1107 00:51:02,610 --> 00:51:05,160 It's not printf, it's indeed just print now. 
1108 00:51:05,160 --> 00:51:08,610 All right, let me go ahead and run Python of Calculator.py, 1109 00:51:08,610 --> 00:51:13,620 Enter, just do a quick sample, 1 plus 2 indeed equals 3. 1110 00:51:13,620 --> 00:51:16,410 As an aside, suppose I had taken a different approach 1111 00:51:16,410 --> 00:51:19,508 to importing the whole CS50 library, functionally, it's the same. 1112 00:51:19,508 --> 00:51:21,550 You're not going to notice any performance impact here. 1113 00:51:21,550 --> 00:51:22,690 It's a small library. 1114 00:51:22,690 --> 00:51:25,680 But notice what does not work now, whereas it did work 1115 00:51:25,680 --> 00:51:31,110 in C. Python of Calculator.py, Enter, we see our first traceback deliberately 1116 00:51:31,110 --> 00:51:31,690 here. 1117 00:51:31,690 --> 00:51:33,570 So a traceback is just a term of art that 1118 00:51:33,570 --> 00:51:37,210 says, here is a trace back through all of the functions 1119 00:51:37,210 --> 00:51:38,250 that just got executed. 1120 00:51:38,250 --> 00:51:40,170 In the world of C, you might call this a stack 1121 00:51:40,170 --> 00:51:42,937 trace, stack being the operative word. 1122 00:51:42,937 --> 00:51:45,270 Recall that when we talked about the stack and the heap, 1123 00:51:45,270 --> 00:51:48,077 the stack, like a stack of trays, was all of the functions that 1124 00:51:48,077 --> 00:51:49,660 might get called, one after the other. 1125 00:51:49,660 --> 00:51:54,330 We had main, we had swap, then swap went away, and then main finished, recall. 1126 00:51:54,330 --> 00:51:58,020 So here's a trace back of all of the functions or code that got executed. 1127 00:51:58,020 --> 00:52:00,880 There's not really any functions other than my file itself. 1128 00:52:00,880 --> 00:52:02,350 Otherwise there'd be more detail.
1129 00:52:02,350 --> 00:52:05,580 But even though it's a little cryptic, we can perhaps infer from the output 1130 00:52:05,580 --> 00:52:09,960 here, name error, so something related to the name of something, name, getInt 1131 00:52:09,960 --> 00:52:10,950 is not defined. 1132 00:52:10,950 --> 00:52:14,190 And this of course, happens on line 3 over there. 1133 00:52:14,190 --> 00:52:15,520 All right, so why is that? 1134 00:52:15,520 --> 00:52:19,170 Well, Python essentially allows us to namespace 1135 00:52:19,170 --> 00:52:21,750 our functions that come from libraries. 1136 00:52:21,750 --> 00:52:25,290 There was a problem in C. If you were using the CS50 library, 1137 00:52:25,290 --> 00:52:27,180 and thus had access to getInt, getString, 1138 00:52:27,180 --> 00:52:29,850 and so forth, you could not use another library 1139 00:52:29,850 --> 00:52:31,590 that had the same function names. 1140 00:52:31,590 --> 00:52:33,510 They would collide, and the compiler would not 1141 00:52:33,510 --> 00:52:36,030 know how to link them together correctly. 1142 00:52:36,030 --> 00:52:41,520 In Python, and other languages like JavaScript, and in Java, 1143 00:52:41,520 --> 00:52:45,270 you have support for effectively what would be called namespaces. 1144 00:52:45,270 --> 00:52:50,370 You can isolate variables and function names to their own namespace, 1145 00:52:50,370 --> 00:52:52,590 like their own container in memory. 1146 00:52:52,590 --> 00:52:55,560 And what this means is, if you import all of CS50, 1147 00:52:55,560 --> 00:52:59,730 you have to say that the getInt you want is inside the CS50 library. 
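The two import styles and the resulting name error can be reproduced without the CS50 library. As a stand-in, this sketch uses the math module, which ships with Python; with CS50's library the same pattern would be `import cs50` followed by `cs50.get_int(...)`, versus `from cs50 import get_int` (the library spells the function `get_int`, which the captions render as getInt).

```python
import math             # imports the module; its names stay inside "math."
from math import floor  # imports one name directly into this file

# With "import math", the function must be reached through the namespace
root = math.sqrt(16)

# With "from math import floor", the bare name works
whole = floor(2.7)

# sqrt on its own was never imported, so the bare name is undefined
try:
    sqrt(16)
except NameError as error:
    message = str(error)

print(root)
print(whole)
print(message)
```

Catching the error with try and except is only there so the sketch keeps running; in lecture the same mistake is left uncaught, which is what produces the traceback.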
1148 00:52:59,730 --> 00:53:03,180 So just like with the image blurring, and the image edges 1149 00:53:03,180 --> 00:53:08,430 before, where I had to specify image dot and image filter dot, similarly here, 1150 00:53:08,430 --> 00:53:11,970 am I specifying with a dot operator, albeit a little differently, that I 1151 00:53:11,970 --> 00:53:14,410 want CS50.getInt in both places. 1152 00:53:14,410 --> 00:53:18,120 And now if I rerun Python of Calculator.py, 1 and 2, 1153 00:53:18,120 --> 00:53:19,860 now we're back in business. 1154 00:53:19,860 --> 00:53:20,790 Which one is better? 1155 00:53:20,790 --> 00:53:24,790 Generally speaking, it depends on just how many functions 1156 00:53:24,790 --> 00:53:26,040 you're using from the library. 1157 00:53:26,040 --> 00:53:29,040 If you're using a whole bunch of functions, just import the whole thing. 1158 00:53:29,040 --> 00:53:33,333 If you're only using maybe one or two, import them line by line. 1159 00:53:33,333 --> 00:53:35,750 All right, so let's go ahead and make a little tweak here. 1160 00:53:35,750 --> 00:53:38,917 Let's get rid of this library and take this training wheel off, 1161 00:53:38,917 --> 00:53:41,750 too, as quickly as we introduced it, though for problem set six 1162 00:53:41,750 --> 00:53:44,310 you'll be able to use all of these same functions. 1163 00:53:44,310 --> 00:53:48,110 Suppose I get rid of this, and I just use the input function, 1164 00:53:48,110 --> 00:53:51,710 just like I did by replacing getString earlier. 1165 00:53:51,710 --> 00:53:54,710 Let me go ahead now and run this version of the code. 1166 00:53:54,710 --> 00:54:00,964 Python of Calculator.py, OK, how about 1 plus 2 equals 3. 1167 00:54:00,964 --> 00:54:02,660 Huh. 1168 00:54:02,660 --> 00:54:05,330 All right, obviously wrong, incorrect. 1169 00:54:05,330 --> 00:54:09,890 Can anyone explain what just happened, based on instincts? 1170 00:54:09,890 --> 00:54:10,890 What just happened here.
1171 00:54:10,890 --> 00:54:11,390 Yeah. 1172 00:54:11,390 --> 00:54:12,620 AUDIENCE: You want an answer? 1173 00:54:12,620 --> 00:54:13,745 DAVID J. MALAN: Sure, yeah. 1174 00:54:13,745 --> 00:54:17,930 AUDIENCE: Say you have a number of strings that don't have Ints, 1175 00:54:17,930 --> 00:54:21,320 so you would part with them and say, printing one, two, better. 1176 00:54:21,320 --> 00:54:24,650 DAVID J. MALAN: Exactly, Python is interpreting, or treating, 1177 00:54:24,650 --> 00:54:26,810 both x and y as strings, which is actually 1178 00:54:26,810 --> 00:54:29,120 what the input function returns by default. 1179 00:54:29,120 --> 00:54:32,150 And so plus is now being interpreted as concatenation, as we defined it 1180 00:54:32,150 --> 00:54:32,660 earlier. 1181 00:54:32,660 --> 00:54:35,780 So x plus y isn't x plus y mathematically, 1182 00:54:35,780 --> 00:54:38,480 but in terms of string joining, just like in Scratch. 1183 00:54:38,480 --> 00:54:41,690 So that's why we're getting 12, or really one two, 1184 00:54:41,690 --> 00:54:43,040 which isn't itself a number. 1185 00:54:43,040 --> 00:54:44,180 It, too, is another string. 1186 00:54:44,180 --> 00:54:45,950 So we somehow need to convert things. 1187 00:54:45,950 --> 00:54:49,040 And we didn't have this ability quite as easily in C. 1188 00:54:49,040 --> 00:54:52,670 We did have, like, the atoi function, ASCII to integer, 1189 00:54:52,670 --> 00:54:54,270 which did allow you to do this. 1190 00:54:54,270 --> 00:54:59,390 The analog in Python is actually just to do a cast, a typecast, using Int. 1191 00:54:59,390 --> 00:55:02,750 So just like in C, you can use the keyword Int, 1192 00:55:02,750 --> 00:55:04,500 but you use it a little differently. 1193 00:55:04,500 --> 00:55:09,300 Notice that I'm not doing parenthesis Int close parenthesis before the value. 1194 00:55:09,300 --> 00:55:11,010 I'm using Int as a function. 1195 00:55:11,010 --> 00:55:13,430 So indeed, in Python, Int is a function.
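A small sketch of the conversion just described. The strings "1" and "2" are assumed stand-ins for what input would hand back, so the snippet runs without waiting on a keyboard.

```python
# Assumed stand-ins for what input() returns: it always hands back a str.
x = "1"
y = "2"

# With two strings, + means concatenation, not arithmetic.
concatenated = x + y

# Passing each string through int() converts it first, so + adds numbers.
total = int(x) + int(y)

# float() works the same way for values with a decimal point.
half = float("0.5")

print(concatenated, total, half)
```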
1196 00:55:13,430 --> 00:55:16,610 Float is a function, that you can pass values into, 1197 00:55:16,610 --> 00:55:18,270 to do this kind of conversion. 1198 00:55:18,270 --> 00:55:22,010 So now, if I run Python of Calculator.py, 1 and 2, 1199 00:55:22,010 --> 00:55:25,430 now we're back in business, and getting the answer of 3. 1200 00:55:25,430 --> 00:55:27,240 But there's kind of a catch here. 1201 00:55:27,240 --> 00:55:28,430 There's always going to be a trade-off. 1202 00:55:28,430 --> 00:55:30,560 Like that sounds amazing that it just works in this way. 1203 00:55:30,560 --> 00:55:32,450 We can throw away the CS50 library already. 1204 00:55:32,450 --> 00:55:37,130 But what if the user accidentally types, or maliciously types in, 1205 00:55:37,130 --> 00:55:39,035 like, cat, instead of a number? 1206 00:55:39,035 --> 00:55:40,910 Damn, well, there's one of these tracebacks. 1207 00:55:40,910 --> 00:55:42,780 Like, now my program has crashed. 1208 00:55:42,780 --> 00:55:45,342 This is similar in spirit to the kinds of segfaults 1209 00:55:45,342 --> 00:55:46,550 that you might have had in C. 1210 00:55:46,550 --> 00:55:47,840 But they're not segfaults per se. 1211 00:55:47,840 --> 00:55:49,507 It doesn't necessarily relate to memory. 1212 00:55:49,507 --> 00:55:55,290 This time it relates to actual runtime values, not being as expected. 1213 00:55:55,290 --> 00:55:58,250 So this time it's not a name error, it's a value error, 1214 00:55:58,250 --> 00:56:02,580 invalid literal for Int with base 10 quote unquote "cat." 1215 00:56:02,580 --> 00:56:06,800 So, again, it's written for sort of a programmer, more than sort 1216 00:56:06,800 --> 00:56:09,650 of a typical person, because it's pretty arcane, the language here. 1217 00:56:09,650 --> 00:56:10,900 But let's try to interpret it. 1218 00:56:10,900 --> 00:56:14,862 Invalid literal, a literal is just something someone typed for Int, which 1219 00:56:14,862 --> 00:56:16,320 is the function name, with base 10.
1220 00:56:16,320 --> 00:56:18,170 It's just defaulting to decimal numbers. 1221 00:56:18,170 --> 00:56:20,415 Cat is apparently not a decimal number. 1222 00:56:20,415 --> 00:56:23,040 It doesn't look like it, therefore it can't be treated like it. 1223 00:56:23,040 --> 00:56:24,930 Therefore, there's a value error. 1224 00:56:24,930 --> 00:56:26,750 So what can we do? 1225 00:56:26,750 --> 00:56:30,200 Unfortunately, you would have to somehow catch this error. 1226 00:56:30,200 --> 00:56:32,450 And the only way to do that in Python really 1227 00:56:32,450 --> 00:56:34,970 is by way of another feature that C did not have, 1228 00:56:34,970 --> 00:56:37,400 namely, what are called exceptions. 1229 00:56:37,400 --> 00:56:42,080 An exception is exactly what just happened, name error, value error. 1230 00:56:42,080 --> 00:56:45,590 They are things that can go wrong when your Python code is running, 1231 00:56:45,590 --> 00:56:50,670 that aren't necessarily going to be detected until you run your code. 1232 00:56:50,670 --> 00:56:56,240 So in Python, and in JavaScript, and in Java, and other more modern languages, 1233 00:56:56,240 --> 00:56:59,240 there's this ability to actually try to do something, 1234 00:56:59,240 --> 00:57:01,015 except if something goes wrong. 1235 00:57:01,015 --> 00:57:03,140 And in fact, I'm going to introduce a bit of syntax 1236 00:57:03,140 --> 00:57:05,557 here, even though we won't have to use this much just yet. 1237 00:57:05,557 --> 00:57:09,980 Instead of just blindly converting x to an Int, let me go ahead 1238 00:57:09,980 --> 00:57:11,970 and try to do that. 1239 00:57:11,970 --> 00:57:15,380 And if there's an exception, go ahead and say something 1240 00:57:15,380 --> 00:57:22,280 like print, that is not an Int. 1241 00:57:22,280 --> 00:57:25,538 And then I'm going to do something like exit, right there. 1242 00:57:25,538 --> 00:57:27,080 And let me go ahead and do this here. 
1243 00:57:27,080 --> 00:57:31,370 Let me try to get y, except if there's an exception. 1244 00:57:31,370 --> 00:57:35,997 Then let me go ahead and say, again, that is not an Int exclamation point. 1245 00:57:35,997 --> 00:57:38,330 And then I'm going to exit from there, too. Otherwise, I'll 1246 00:57:38,330 --> 00:57:39,860 go ahead and print x plus y. 1247 00:57:39,860 --> 00:57:46,460 If I run Python of Calculator.py now, whoops, oh, 1248 00:57:46,460 --> 00:57:48,680 forgot my close quote, sorry. 1249 00:57:48,680 --> 00:57:54,560 All right, so close quote, Python of Calculator.py, 1 and 2 still work. 1250 00:57:54,560 --> 00:57:57,800 But if I try to type in something wrong like cat, now 1251 00:57:57,800 --> 00:57:59,310 it actually detects the error. 1252 00:57:59,310 --> 00:58:01,850 So what is the CS50 library in Python doing? 1253 00:58:01,850 --> 00:58:05,600 It's actually doing that try and except for you, because suffice it to say, 1254 00:58:05,600 --> 00:58:08,540 otherwise your programs for something simple, like a calculator, 1255 00:58:08,540 --> 00:58:09,900 start to get longer and longer. 1256 00:58:09,900 --> 00:58:13,160 So we factored that kind of logic out to the CS50 getInt 1257 00:58:13,160 --> 00:58:14,690 function and get float function. 1258 00:58:14,690 --> 00:58:18,783 But underneath the hood, they're essentially doing this, try except, 1259 00:58:18,783 --> 00:58:20,450 but they're being a little more precise. 1260 00:58:20,450 --> 00:58:24,450 They're detecting a specific error, and they are doing it in a loop, 1261 00:58:24,450 --> 00:58:27,050 so that these functions will get executed again and again. 1262 00:58:27,050 --> 00:58:30,710 In fact, the best way to do this is to say except if there's a value error, 1263 00:58:30,710 --> 00:58:34,078 then print that error message out to the user. 1264 00:58:34,078 --> 00:58:36,870 And again, let's not get too into the weeds here with this feature.
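A rough sketch of the try and except pattern, combined with the loop and the specific ValueError mentioned above. This is not the CS50 library's actual source; get_int_sketch and its read parameter are hypothetical names, and the canned answers exist only so the example runs without a keyboard.

```python
def get_int_sketch(prompt, read=input):
    """Keep asking until the text converts cleanly to an int (hypothetical helper)."""
    while True:
        try:
            # int() raises ValueError for text like "cat".
            return int(read(prompt))
        except ValueError:
            # Catch only that specific error, complain, and loop to ask again.
            print("That is not an int!")


# Canned answers stand in for a user who types cat first and then 2.
answers = iter(["cat", "2"])
y = get_int_sketch("y: ", read=lambda prompt: next(answers))
print(y)
```

Catching ValueError by name, rather than every exception, means an unrelated bug elsewhere still surfaces as a traceback instead of being silently swallowed.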
1265 00:58:36,870 --> 00:58:38,760 We've already put it into the CS50 library. 1266 00:58:38,760 --> 00:58:41,060 But that's why, for instance, we bootstrap things, 1267 00:58:41,060 --> 00:58:44,420 by just using these functions out of the box. 1268 00:58:44,420 --> 00:58:47,610 All right, let's do something more with our calculator here. 1269 00:58:47,610 --> 00:58:49,010 How about this. 1270 00:58:49,010 --> 00:58:51,890 In the world of C, we had another version 1271 00:58:51,890 --> 00:58:56,990 of this code, which actually did some division by way of-- 1272 00:58:56,990 --> 00:59:01,680 which actually did division of numbers, not just the addition herein. 1273 00:59:01,680 --> 00:59:05,990 So let me go ahead and close the C version, and let's focus only on Python 1274 00:59:05,990 --> 00:59:07,942 now, doing some of these same lines of code. 1275 00:59:07,942 --> 00:59:09,650 But I'm going to go ahead and just assume 1276 00:59:09,650 --> 00:59:12,140 that the user is going to cooperate and use proper input. 1277 00:59:12,140 --> 00:59:16,310 So from CS50, import getInt, that will deal with any errors for me. 1278 00:59:16,310 --> 00:59:23,640 X gets getInt, ask the user for an Int x, y equals getInt, 1279 00:59:23,640 --> 00:59:25,170 ask the user for an Int y. 1280 00:59:25,170 --> 00:59:27,010 And then, let's go ahead and do this. 1281 00:59:27,010 --> 00:59:31,110 Let's declare a variable called z, set it equal to x divided by y. 1282 00:59:31,110 --> 00:59:32,850 Then let's go ahead and print z. 1283 00:59:32,850 --> 00:59:37,240 Still no need for a format string, I can just print out the variable's value. 1284 00:59:37,240 --> 00:59:39,240 Let me go ahead and run Python of Calculator.py. 1285 00:59:39,240 --> 00:59:43,650 Let me do 1, 10, and I get 0.1. 1286 00:59:43,650 --> 00:59:49,260 What did I get in C, though, if you think back? 1287 00:59:49,260 --> 00:59:52,076 What would have happened in C? 1288 00:59:52,076 --> 00:59:53,420 AUDIENCE: Zero?
1289 00:59:53,420 --> 00:59:55,640 DAVID J. MALAN: Yeah, we would have gotten zero in C. 1290 00:59:55,640 --> 00:59:57,998 But why, in C, when you divide one Int by another, 1291 00:59:57,998 --> 00:59:59,915 and those Ints are like 1 and 10 respectively? 1292 00:59:59,915 --> 01:00:01,677 AUDIENCE: It'll give you an integer back. 1293 01:00:01,677 --> 01:00:03,260 DAVID J. MALAN: It will give you what? 1294 01:00:03,260 --> 01:00:04,343 AUDIENCE: An integer back. 1295 01:00:04,343 --> 01:00:07,910 DAVID J. MALAN: It will give you an integer back, and, unfortunately, 0.1, 1296 01:00:07,910 --> 01:00:09,860 the integer part of it is indeed zero. 1297 01:00:09,860 --> 01:00:11,970 So this was an example of truncation. 1298 01:00:11,970 --> 01:00:14,540 So truncation was an issue in C. But it would 1299 01:00:14,540 --> 01:00:17,450 seem as though this is no longer a problem in Python, 1300 01:00:17,450 --> 01:00:21,290 insofar as the division operator actually handles that for us. 1301 01:00:21,290 --> 01:00:24,230 As an aside, if you want the old behavior, because it actually 1302 01:00:24,230 --> 01:00:27,020 is sometimes useful for rounding or flooring values, 1303 01:00:27,020 --> 01:00:29,570 you can actually use two slashes. 1304 01:00:29,570 --> 01:00:31,620 And now you get the C behavior. 1305 01:00:31,620 --> 01:00:33,710 So that now 1 divided by 10 is zero. 1306 01:00:33,710 --> 01:00:36,230 So you don't give up that capability, but at least it 1307 01:00:36,230 --> 01:00:37,610 does a more sensible default. 1308 01:00:37,610 --> 01:00:41,030 Most people, especially new programmers, when dividing one value by another, 1309 01:00:41,030 --> 01:00:44,000 would want to get 0.1, not 0, for reasons 1310 01:00:44,000 --> 01:00:46,100 that indeed we had to explain weeks ago. 1311 01:00:46,100 --> 01:00:49,940 But what about another problem we had with the world of floats before, 1312 01:00:49,940 --> 01:00:52,040 whereby there is imprecision? 
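The two division operators side by side, as a minimal sketch. One detail goes slightly beyond the lecture: the double slash floors rather than truncates, so it matches C for positive operands like 1 and 10 but rounds toward negative infinity when the result is negative.

```python
x = 1
y = 10

# A single slash always performs true division and yields a float.
z = x / y

# A double slash performs floor division and yields an int for int operands.
floored = x // y

# For a negative result, flooring rounds down, away from zero,
# whereas integer division in C would truncate toward zero and give -3.
negative_floored = -7 // 2

print(z, floored, negative_floored)
```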
1313 01:00:52,040 --> 01:00:54,980 Let me go ahead and, somewhat cryptically, print out the value of z 1314 01:00:54,980 --> 01:00:55,860 as follows. 1315 01:00:55,860 --> 01:00:58,340 I'm going to format it using an f-string. 1316 01:00:58,340 --> 01:01:02,720 And I'm going to go ahead and format, not just z, because this is essentially 1317 01:01:02,720 --> 01:01:03,450 the same thing. 1318 01:01:03,450 --> 01:01:06,620 Notice this, if I do Python of Calculator.py, 1 and 10, 1319 01:01:06,620 --> 01:01:09,770 I get, by default, just one significant digit. 1320 01:01:09,770 --> 01:01:13,920 But if I use this syntax in Python, which we won't have to use often, 1321 01:01:13,920 --> 01:01:16,550 I can actually do in C like I did before, 1322 01:01:16,550 --> 01:01:19,650 50 significant digits after the decimal point. 1323 01:01:19,650 --> 01:01:24,020 So now let me rerun Python of Calculator.py 1 and 10, 1324 01:01:24,020 --> 01:01:26,990 and let's see if floating point imprecision is still with us. 1325 01:01:26,990 --> 01:01:28,280 Unfortunately, it is. 1326 01:01:28,280 --> 01:01:30,950 And you can see as much here, the f-string, the format string, 1327 01:01:30,950 --> 01:01:33,990 is just showing us now 50 digits instead of the default one. 1328 01:01:33,990 --> 01:01:36,110 So we've not solved all problems. 1329 01:01:36,110 --> 01:01:38,845 But we have solved at least some. 1330 01:01:38,845 --> 01:01:41,720 All right, before we pivot away from a mere calculator, any questions 1331 01:01:41,720 --> 01:01:45,350 now on syntax or concepts or the like? 1332 01:01:45,350 --> 01:01:46,070 Yeah. 1333 01:01:46,070 --> 01:01:49,320 AUDIENCE: Do you think the double slash you get 1334 01:01:49,320 --> 01:01:51,937 has merit, how do you comment on that? 1335 01:01:51,937 --> 01:01:53,270 DAVID J. MALAN: How do you what? 1336 01:01:53,270 --> 01:01:54,228 Oh, how do you comment. 
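The f-string syntax for asking for 50 digits after the decimal point looks roughly like this. It is only a sketch of the formatting step, with the values 1 and 10 hard-coded instead of read from the user.

```python
z = 1 / 10

# With no format specifier, the f-string shows the short default form.
default_text = f"{z}"

# The part after the colon, .50f, asks for 50 digits after the decimal point,
# which exposes the floating-point imprecision hiding behind 0.1.
precise_text = f"{z:.50f}"

print(default_text)
print(precise_text)
```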
1337 01:01:54,228 --> 01:01:57,410 Really good question, if you're using double slash for division 1338 01:01:57,410 --> 01:01:59,870 with flooring or truncation, like I described, 1339 01:01:59,870 --> 01:02:01,850 how do you do a comment in Python. 1340 01:02:01,850 --> 01:02:03,380 This is a comment. 1341 01:02:03,380 --> 01:02:05,930 And the convention is actually to use a complete sentence, 1342 01:02:05,930 --> 01:02:07,473 like with a capital T here. 1343 01:02:07,473 --> 01:02:09,890 You don't need a period unless there's multiple sentences. 1344 01:02:09,890 --> 01:02:12,840 And technically, it should be above the line of code by convention. 1345 01:02:12,840 --> 01:02:15,120 So you would use a hash symbol instead. 1346 01:02:15,120 --> 01:02:16,080 Good question. 1347 01:02:16,080 --> 01:02:17,420 I haven't seen those yet. 1348 01:02:17,420 --> 01:02:20,750 All right, let's go ahead and make something else here, how about. 1349 01:02:20,750 --> 01:02:23,430 Let me go ahead and open up, for instance, 1350 01:02:23,430 --> 01:02:29,090 an example called Points1.c, which we saw a few weeks back. 1351 01:02:29,090 --> 01:02:33,530 And let me go ahead on the other side and create a file called Points.py. 1352 01:02:33,530 --> 01:02:36,890 This was a program, recall, that asked the user how many points they 1353 01:02:36,890 --> 01:02:39,388 lost on the first assignment. 1354 01:02:39,388 --> 01:02:41,180 And then it went ahead and just printed out 1355 01:02:41,180 --> 01:02:43,790 whether they lost fewer points than me, because I lost two, 1356 01:02:43,790 --> 01:02:47,117 if you recall the photo, more points than me, or the same points as me. 1357 01:02:47,117 --> 01:02:49,700 Let me go ahead and zoom out so we can see a bit more of this. 1358 01:02:49,700 --> 01:02:54,208 And let me now, on the top right here, go about implementing this in Python. 1359 01:02:54,208 --> 01:02:56,750 So I want to first prompt the user for some number of points. 
1360 01:02:56,750 --> 01:03:00,540 So from CS50 let's import getInt, so it handles the error-checking. 1361 01:03:00,540 --> 01:03:03,410 Let's then do points equals getInt, and ask 1362 01:03:03,410 --> 01:03:07,430 the user, how many points did you lose, question mark. 1363 01:03:07,430 --> 01:03:11,990 Then let's go ahead and say, if points less than two, which was my value, 1364 01:03:11,990 --> 01:03:15,800 print, you lost fewer points than me. 1365 01:03:15,800 --> 01:03:23,270 Otherwise, if it's else if points greater than 2, go ahead and print, 1366 01:03:23,270 --> 01:03:27,070 you lost more points than me. 1367 01:03:27,070 --> 01:03:30,800 Else let's go ahead and handle the final scenario, which is you 1368 01:03:30,800 --> 01:03:34,600 lost the same number of points as me. 1369 01:03:34,600 --> 01:03:39,230 Before I run this, does anyone want to point out a mistake I've already made? 1370 01:03:39,230 --> 01:03:39,730 Yeah. 1371 01:03:39,730 --> 01:03:41,390 AUDIENCE: Else if has to be elif. 1372 01:03:41,390 --> 01:03:44,690 DAVID J. MALAN: Yeah, so else if in C is actually now elif in Python. 1373 01:03:44,690 --> 01:03:45,780 It's a single word. 1374 01:03:45,780 --> 01:03:49,790 So let me change this to elif, and now cross my fingers, Python of Points.py, 1375 01:03:49,790 --> 01:03:53,330 suppose you lost three points on some assignment. 1376 01:03:53,330 --> 01:03:55,190 You lost more points than my two. 1377 01:03:55,190 --> 01:03:57,808 If you only lost one point, you lost fewer points than me. 1378 01:03:57,808 --> 01:03:58,850 So the logic is the same. 1379 01:03:58,850 --> 01:04:01,040 But notice the code is much tighter. 1380 01:04:01,040 --> 01:04:04,700 In 10 total lines, we did in what was 24 lines, because we've 1381 01:04:04,700 --> 01:04:06,350 thrown away a lot of the syntax. 1382 01:04:06,350 --> 01:04:08,370 The curly braces are no longer necessary. 1383 01:04:08,370 --> 01:04:10,230 The parentheses are gone, the semicolons. 
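The same logic, written out as a sketch. It is wrapped in a hypothetical compare_points function so it can run without prompting anyone; the lecture's version reads points from the user and prints directly.

```python
def compare_points(points, mine=2):
    """Return the message for a given number of lost points (hypothetical helper)."""
    # No parentheses around the conditions, no curly braces, no semicolons:
    # the colon and the indentation mark each branch.
    if points < mine:
        return "You lost fewer points than me."
    elif points > mine:
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


for lost in (1, 3, 2):
    print(compare_points(lost))
```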
1384 01:04:10,230 --> 01:04:13,670 So this is why it just tends to be more pleasant pretty quickly, 1385 01:04:13,670 --> 01:04:16,310 using a language like this. 1386 01:04:16,310 --> 01:04:18,770 All right, let's do one other example here. 1387 01:04:18,770 --> 01:04:23,000 In C, recall that we were able to determine the parity of some number, 1388 01:04:23,000 --> 01:04:24,590 if something is even or odd. 1389 01:04:24,590 --> 01:04:29,000 Well, in Python, let me go ahead and create a file called Parity.py, 1390 01:04:29,000 --> 01:04:32,810 and let's look for a moment at the C version at left. 1391 01:04:32,810 --> 01:04:36,680 Here was the code in C that we used to determine the parity of a number. 1392 01:04:36,680 --> 01:04:39,800 And, really, the key takeaway from all these lines 1393 01:04:39,800 --> 01:04:41,290 was just the remainder operator. 1394 01:04:41,290 --> 01:04:42,540 And that one is still with us. 1395 01:04:42,540 --> 01:04:44,998 So this is a simple demonstration, just to make that point, 1396 01:04:44,998 --> 01:04:48,770 if in Python, I want to determine whether a number is even or odd. 1397 01:04:48,770 --> 01:04:53,150 Well, let's go ahead and from CS50, import getInt, then let's go ahead 1398 01:04:53,150 --> 01:04:58,610 and get a number like n from the user, using getInt, and ask them for n. 1399 01:04:58,610 --> 01:05:04,220 And then let's go ahead and say, if n percent sign 2 equals 0, 1400 01:05:04,220 --> 01:05:08,270 then let's go ahead and print quote unquote "Even." 1401 01:05:08,270 --> 01:05:13,753 Else let's go ahead and print out Odd, but before I run this, 1402 01:05:13,753 --> 01:05:16,670 anyone want to instinctively, even though we've not talked about this, 1403 01:05:16,670 --> 01:05:19,010 point out a mistake here? 1404 01:05:19,010 --> 01:05:19,810 What I did wrong? 1405 01:05:19,810 --> 01:05:20,810 AUDIENCE: Double equals. 1406 01:05:20,810 --> 01:05:22,435 DAVID J. MALAN: Yeah, so double equals. 
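A sketch of the parity check with the double equal sign in place. The parity function is a hypothetical wrapper so the logic can be tried on fixed values instead of user input.

```python
def parity(n):
    """Return "Even" or "Odd" for an integer n (hypothetical helper)."""
    # % is the remainder operator; == compares, while a single = would assign.
    if n % 2 == 0:
        return "Even"
    else:
        return "Odd"


print(parity(50))
print(parity(7))
```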
1407 01:05:22,435 --> 01:05:25,850 Again, so even though some of the stuff is changing, some of the same ideas 1408 01:05:25,850 --> 01:05:26,430 are the same. 1409 01:05:26,430 --> 01:05:28,520 So this, too, should be a double equal sign, 1410 01:05:28,520 --> 01:05:30,620 because I'm comparing for equality here. 1411 01:05:30,620 --> 01:05:32,153 And why is this the right math? 1412 01:05:32,153 --> 01:05:34,070 Well, if you divide a number by 2, it's either 1413 01:05:34,070 --> 01:05:36,290 going to have 0 or 1 as a remainder. 1414 01:05:36,290 --> 01:05:39,030 And that's going to determine if it's even or odd for us. 1415 01:05:39,030 --> 01:05:42,200 So let's run Python of Parity.py, type in a number like 50, 1416 01:05:42,200 --> 01:05:44,660 and hopefully we get, indeed, even. 1417 01:05:44,660 --> 01:05:46,910 So again, same idea, but now we're down to eight lines 1418 01:05:46,910 --> 01:05:48,560 of code instead of the 20. 1419 01:05:48,560 --> 01:05:50,810 Well, let's now do something a little more interactive 1420 01:05:50,810 --> 01:05:54,680 and a little representative of tools that actually ask the user questions. 1421 01:05:54,680 --> 01:06:00,320 In C, recall that we had this agreement program, Agree.c. 1422 01:06:00,320 --> 01:06:04,280 And then let's go ahead and implement a corresponding version in Python, 1423 01:06:04,280 --> 01:06:05,870 in a file called Agree.py. 1424 01:06:05,870 --> 01:06:08,570 And let's look at the C version first. 1425 01:06:08,570 --> 01:06:10,700 On the left, we used get char here. 1426 01:06:10,700 --> 01:06:13,190 And then we used the double vertical bars 1427 01:06:13,190 --> 01:06:16,430 to check if C is equal to capital Y or lowercase y. 1428 01:06:16,430 --> 01:06:18,500 And then we did the same thing for n for no. 1429 01:06:18,500 --> 01:06:24,380 And so let's go over here and let's do from CS50, import get-- 1430 01:06:24,380 --> 01:06:26,570 OK, get char is not a thing. 
1431 01:06:26,570 --> 01:06:29,090 And this here is another difference with Python. 1432 01:06:29,090 --> 01:06:32,510 There is no data type for individual characters. 1433 01:06:32,510 --> 01:06:34,640 You have strings, STRs, and, honestly, those 1434 01:06:34,640 --> 01:06:36,620 are fine, because if you have a STR that's 1435 01:06:36,620 --> 01:06:38,960 just one character, for all intents and purposes, 1436 01:06:38,960 --> 01:06:40,710 it is just a single character. 1437 01:06:40,710 --> 01:06:41,960 So it's just a simplification. 1438 01:06:41,960 --> 01:06:43,200 You don't have to think as much. 1439 01:06:43,200 --> 01:06:45,658 You don't have to worry about double quotes, single quotes. 1440 01:06:45,658 --> 01:06:49,350 In fact, in Python, you can use double quotes or single quotes, 1441 01:06:49,350 --> 01:06:50,930 so long as you're consistent. 1442 01:06:50,930 --> 01:06:52,970 So long as you're consistent, the single quotes 1443 01:06:52,970 --> 01:06:55,670 do not mean something different, like they do in C. 1444 01:06:55,670 --> 01:06:58,340 So I'm going to go ahead and use getString here, 1445 01:06:58,340 --> 01:07:01,220 although, strictly speaking, I could just use the input function, 1446 01:07:01,220 --> 01:07:02,480 as we saw before. 1447 01:07:02,480 --> 01:07:07,250 I'm going to get a string from the user that asks them this, getString, 1448 01:07:07,250 --> 01:07:10,557 quote unquote, "Do you agree," like a little checkbox or interactive prompt, 1449 01:07:10,557 --> 01:07:13,640 where you have to say yes or no, you want to agree to the following terms, 1450 01:07:13,640 --> 01:07:14,580 or whatnot. 1451 01:07:14,580 --> 01:07:18,110 And then let's translate the conditionals to Python, now, too. 
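A quick way to see that there is no separate character type, as a small sketch with made-up values:

```python
# A single character is still just a str of length 1.
c = "y"
kind = type(c).__name__
length = len(c)

# Indexing into a longer string also yields a str, not a separate char type.
first = "yes"[0]

# Single and double quotes produce the same kind of value.
same = ('y' == "y")

print(kind, length, first, same)
```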
1452 01:07:18,110 --> 01:07:25,850 So if S equals equals quote-unquote "Y," or S equals equals lowercase y, 1453 01:07:25,850 --> 01:07:32,180 let's go ahead and print out agreed, just like in C, elif S equals 1454 01:07:32,180 --> 01:07:35,540 equals N or S equals equals little n. 1455 01:07:35,540 --> 01:07:38,058 Let's go ahead, then, and print out not agreed. 1456 01:07:38,058 --> 01:07:40,850 And you can already see, perhaps, one of the differences here, too. 1457 01:07:40,850 --> 01:07:43,700 Python is a little more English-like, in that 1458 01:07:43,700 --> 01:07:47,610 you just literally use the English word or, instead of the two vertical bars. 1459 01:07:47,610 --> 01:07:50,370 But it's ultimately doing the same thing. 1460 01:07:50,370 --> 01:07:53,390 Can we simplify this code a bit, though? 1461 01:07:53,390 --> 01:07:55,340 This would be a little annoying if we wanted 1462 01:07:55,340 --> 01:07:57,800 to add support, not just for big Y and little y, 1463 01:07:57,800 --> 01:08:04,230 but Yes or big Yes or little yes or big Y, lowercase e, capital S, right? 1464 01:08:04,230 --> 01:08:07,130 There's a lot of permutations of Y-E-S or just y, 1465 01:08:07,130 --> 01:08:08,720 that we ideally should tolerate. 1466 01:08:08,720 --> 01:08:11,470 Otherwise, the user is going to have to type exactly what we want, 1467 01:08:11,470 --> 01:08:12,770 which isn't very user-friendly. 1468 01:08:12,770 --> 01:08:15,050 Any intuition for how we could logically, 1469 01:08:15,050 --> 01:08:18,270 even if you don't know how to do it in code, make this better? 1470 01:08:18,270 --> 01:08:18,770 Yeah. 1471 01:08:18,770 --> 01:08:21,535 AUDIENCE: Write way over the list, and then up, 1472 01:08:21,535 --> 01:08:22,910 it's like the things in the list. 1473 01:08:22,910 --> 01:08:27,050 DAVID J. MALAN: Nice, yeah, we saw an example of a list before, just 0, 1, 2. 1474 01:08:27,050 --> 01:08:29,899 Why don't we take that same idea and ask a similar question.
1475 01:08:29,899 --> 01:08:34,819 If S is in the following list of values, Y or little y, 1476 01:08:34,819 --> 01:08:38,600 or heck, let me add to the list now, yes, or maybe all capital YES. 1477 01:08:38,600 --> 01:08:40,779 And it's going to get a little annoying, admittedly, 1478 01:08:40,779 --> 01:08:43,750 but this is still better than the alternative, with all the or's. 1479 01:08:43,750 --> 01:08:45,640 I could do things like this, and so forth. 1480 01:08:45,640 --> 01:08:47,740 There's a whole bunch more permutations. 1481 01:08:47,740 --> 01:08:50,470 But let's leave this alone, and let me just go into here 1482 01:08:50,470 --> 01:08:57,279 and change this to, if S is in the following list of N or little n or no, 1483 01:08:57,279 --> 01:09:00,460 and I won't do as-- let's just not worry about the weird capitalizations 1484 01:09:00,460 --> 01:09:01,600 there, for now. 1485 01:09:01,600 --> 01:09:02,800 Let's go ahead and run this. 1486 01:09:02,800 --> 01:09:05,950 Python of Agree.py, do I agree? 1487 01:09:05,950 --> 01:09:08,740 Y. OK, how about yes? 1488 01:09:08,740 --> 01:09:10,359 All right, how about big Yes. 1489 01:09:10,359 --> 01:09:11,850 OK, that does not seem to work. 1490 01:09:11,850 --> 01:09:14,350 Notice it did not say agreed, and it did not say not agreed. 1491 01:09:14,350 --> 01:09:15,410 It didn't detect it. 1492 01:09:15,410 --> 01:09:17,180 So how can I do this? 1493 01:09:17,180 --> 01:09:20,770 Well, you know what I could do? I don't really 1494 01:09:20,770 --> 01:09:22,240 need the uppercase and lowercase. 1495 01:09:22,240 --> 01:09:24,189 Let me tighten this list up a little bit. 1496 01:09:24,189 --> 01:09:27,640 And why don't I just force S to be lowercase. 1497 01:09:27,640 --> 01:09:31,000 S.lower, recall, whether it's one character or more, 1498 01:09:31,000 --> 01:09:34,180 is a function built into STRs now, strings in Python, 1499 01:09:34,180 --> 01:09:35,950 that forces the whole thing to lowercase.
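Putting the list membership test and the lowercasing together might look like the sketch below. The classify function is a hypothetical wrapper used so the logic can be exercised on fixed strings; it calls lower in both conditions, mirroring the version at this point in the lecture.

```python
def classify(s):
    """Return the response for an answer string, or None if unrecognized."""
    # Lowercasing first means the lists only need lowercase spellings.
    if s.lower() in ["y", "yes"]:
        return "Agreed."
    elif s.lower() in ["n", "no"]:
        return "Not agreed."
    return None


for answer in ("y", "Y", "YES", "YeS", "No", "maybe"):
    print(answer, classify(answer))
```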
1500 01:09:35,950 --> 01:09:37,450 So now, watch what I can do. 1501 01:09:37,450 --> 01:09:42,700 Python of Agree.py, little y, that works, big Y, that works. 1502 01:09:42,700 --> 01:09:47,840 Big Yes, that works, big Y, little e, big S, that also works. 1503 01:09:47,840 --> 01:09:50,910 So we've now handled, in one fell swoop, a whole bunch more logic. 1504 01:09:50,910 --> 01:09:52,910 And you know what, we can tighten this up a bit. 1505 01:09:52,910 --> 01:09:56,350 Here's an opportunity, in Python, for slightly better design. 1506 01:09:56,350 --> 01:10:00,070 What have I done in here that's a little redundant? 1507 01:10:00,070 --> 01:10:04,180 Does anyone see an opportunity to eliminate a redundancy, 1508 01:10:04,180 --> 01:10:06,820 doing something more times than you need? 1509 01:10:06,820 --> 01:10:08,030 It's a stretch here, no? 1510 01:10:08,030 --> 01:10:08,530 Yep. 1511 01:10:08,530 --> 01:10:11,163 AUDIENCE: You can do S dot lower, above. 1512 01:10:11,163 --> 01:10:13,330 DAVID J. MALAN: We could move the S dot lower above. 1513 01:10:13,330 --> 01:10:15,310 Notice that I'm using S dot lower twice. 1514 01:10:15,310 --> 01:10:17,870 But it's going to give me the same answer both times. 1515 01:10:17,870 --> 01:10:20,080 So I could do a couple of things here. 1516 01:10:20,080 --> 01:10:24,700 I could, first of all, get rid of this lower, and get rid of this lower, 1517 01:10:24,700 --> 01:10:28,720 and then above this, maybe I could do something like this, S equal-- 1518 01:10:28,720 --> 01:10:31,600 I can't just do this, because that throws the value away. 1519 01:10:31,600 --> 01:10:34,240 It does the math, but it doesn't convert the string itself. 1520 01:10:34,240 --> 01:10:35,840 It's going to return a value. 1521 01:10:35,840 --> 01:10:38,260 So I have to say S equals s.lower. 1522 01:10:38,260 --> 01:10:39,340 I could do that. 1523 01:10:39,340 --> 01:10:41,840 Or, honestly, I can chain these things together.
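The difference between calling lower and keeping its result, followed by the chained form, as a sketch. Here fake_get_string is a hypothetical stand-in for a function that returns whatever the user typed.

```python
def fake_get_string(prompt):
    """Hypothetical stand-in for a prompt function; always returns the same text."""
    return "YeS"


s = fake_get_string("Do you agree? ")

# Calling lower() on its own computes a new string and then discards it.
s.lower()
unchanged = s

# Assigning the return value back to s is what actually keeps the result.
s = s.lower()

# Because the function returns a str, the method call can be chained directly.
chained = fake_get_string("Do you agree? ").lower()

print(unchanged, s, chained)
```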
1524 01:10:41,840 --> 01:10:46,070 And this is not something we saw in C. If getString returns a string, 1525 01:10:46,070 --> 01:10:49,240 and strings have functions like lower in them, 1526 01:10:49,240 --> 01:10:52,330 you can chain these functions together, like this, and do dot this, 1527 01:10:52,330 --> 01:10:53,788 dot that, dot this other thing. 1528 01:10:53,788 --> 01:10:56,830 And eventually you want to stop, because it's going to become crazy long. 1529 01:10:56,830 --> 01:10:58,810 But this is reasonable, still fits on the screen. 1530 01:10:58,810 --> 01:10:59,560 It's pretty tight. 1531 01:10:59,560 --> 01:11:01,690 It does in one place what I was doing in two. 1532 01:11:01,690 --> 01:11:03,010 So I think that's OK. 1533 01:11:03,010 --> 01:11:05,980 Let me go ahead and do Python of Agree.py one last time. 1534 01:11:05,980 --> 01:11:07,120 Let's try it one last time. 1535 01:11:07,120 --> 01:11:10,360 And it's still working as intended. 1536 01:11:10,360 --> 01:11:12,700 Also if I tried those other inputs as well. 1537 01:11:12,700 --> 01:11:13,435 Yeah, question. 1538 01:11:13,435 --> 01:11:19,290 AUDIENCE: Could you add on like a for uppercase as well, for like upper, 1539 01:11:19,290 --> 01:11:22,700 and then cover all the functions where it's lowercase, for all the functions 1540 01:11:22,700 --> 01:11:25,450 where it's uppercase as well, or could you not just do this again. 1541 01:11:25,450 --> 01:11:29,095 1542 01:11:29,095 --> 01:11:30,470 DAVID J. MALAN: Let me summarize. 1543 01:11:30,470 --> 01:11:33,340 Could we handle uppercase and lowercase together in some form? 1544 01:11:33,340 --> 01:11:35,020 I'm actually doing that already. 1545 01:11:35,020 --> 01:11:36,370 I just have to pick a lane. 
1546 01:11:36,370 --> 01:11:39,307 I have to either be all lowercase in my logic or all uppercase, 1547 01:11:39,307 --> 01:11:41,140 and not worry about what the human types in, 1548 01:11:41,140 --> 01:11:43,240 because no matter what the human types in, I'm 1549 01:11:43,240 --> 01:11:44,950 forcing their input to lowercase. 1550 01:11:44,950 --> 01:11:48,280 And then I am using a lowercase list of values. 1551 01:11:48,280 --> 01:11:49,520 If I want to flip that, fine. 1552 01:11:49,520 --> 01:11:51,040 I just have to be self-consistent. 1553 01:11:51,040 --> 01:11:52,420 But I'm handling that already. 1554 01:11:52,420 --> 01:11:53,223 Yeah. 1555 01:11:53,223 --> 01:11:56,953 AUDIENCE: Are strings no longer an array of characters? 1556 01:11:56,953 --> 01:11:58,870 DAVID J. MALAN: A really good, loaded question: 1557 01:11:58,870 --> 01:12:02,080 are strings no longer an array of characters? 1558 01:12:02,080 --> 01:12:04,120 Conceptually, yes, underneath the hood, no. 1559 01:12:04,120 --> 01:12:06,190 They're a little more sophisticated than that, 1560 01:12:06,190 --> 01:12:08,590 because with strings, you have a few changes. 1561 01:12:08,590 --> 01:12:10,600 Not only do they have functions built into them, 1562 01:12:10,600 --> 01:12:12,580 because strings are now what we call objects, 1563 01:12:12,580 --> 01:12:14,500 in what's called object-oriented programming. 1564 01:12:14,500 --> 01:12:17,042 And we're going to keep seeing examples of this dot operator. 1565 01:12:17,042 --> 01:12:21,550 They are also immutable, so to speak, I-M-M-U-T-A-B-L-E. 1566 01:12:21,550 --> 01:12:25,180 Immutable means they cannot be changed, which means, unlike C, 1567 01:12:25,180 --> 01:12:28,750 you can't go into a string and change its individual characters. 1568 01:12:28,750 --> 01:12:31,480 You can make a copy of the string that makes a change, 1569 01:12:31,480 --> 01:12:33,698 but you can't change the original string itself.
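Immutability can be seen directly in a short sketch like this one, with made-up values. Assigning to one character of a string raises a TypeError, while a method such as capitalize hands back a changed copy.

```python
s = "hi!"

# Trying to overwrite a single character in place is not allowed.
try:
    s[0] = "H"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Methods return a new, modified copy and leave the original alone.
t = s.capitalize()

print(changed_in_place, s, t)
```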
1570 01:12:33,698 --> 01:12:35,740 This is both a little annoying, maybe, sometimes. 1571 01:12:35,740 --> 01:12:38,365 But it's also pretty protective, because you can't do screw-ups 1572 01:12:38,365 --> 01:12:41,680 like I did weeks ago, when I was trying to copy S and call it T. 1573 01:12:41,680 --> 01:12:43,270 And then one affected the other. 1574 01:12:43,270 --> 01:12:47,080 Python, underneath the hood, is handling all of the memory management 1575 01:12:47,080 --> 01:12:48,550 and the pointers and all of that. 1576 01:12:48,550 --> 01:12:51,040 There are no pointers in Python. 1577 01:12:51,040 --> 01:12:55,840 So if that wasn't clear, all of that pain, if you will, all of that power, 1578 01:12:55,840 --> 01:13:00,280 is now handled by the language itself, not by us, the programmers. 1579 01:13:00,280 --> 01:13:02,440 All right, so let's introduce maybe some loops, 1580 01:13:02,440 --> 01:13:04,390 like we've been in the habit of doing. 1581 01:13:04,390 --> 01:13:08,170 Let me open up Meow.c, which was an example in C, just meowing 1582 01:13:08,170 --> 01:13:09,730 a bunch of times textually. 1583 01:13:09,730 --> 01:13:12,800 Let me create a file called Meow.py here on the right. 1584 01:13:12,800 --> 01:13:15,190 And notice on the left, this was correct code in C, 1585 01:13:15,190 --> 01:13:16,670 but it was kind of poorly designed. 1586 01:13:16,670 --> 01:13:17,170 Why? 1587 01:13:17,170 --> 01:13:19,450 Because it was a missed opportunity for a loop. 1588 01:13:19,450 --> 01:13:22,460 Why say something three times when you can say it just once? 1589 01:13:22,460 --> 01:13:25,990 So in Python, let me do it the poorly designed way first. 1590 01:13:25,990 --> 01:13:27,400 Let me print out meow. 1591 01:13:27,400 --> 01:13:31,210 And, like I generally should not, let me copy, paste it three times, 1592 01:13:31,210 --> 01:13:33,670 run Python of Meow.py, and it works. 1593 01:13:33,670 --> 01:13:35,318 OK, but not good practice.
1594 01:13:35,318 --> 01:13:37,360 So let me go ahead and improve this a little bit. 1595 01:13:37,360 --> 01:13:38,990 And there's a few ways to do this. 1596 01:13:38,990 --> 01:13:44,050 If I wanted to do this three times, I could instead do something like this. 1597 01:13:44,050 --> 01:13:48,010 For i in range of 3, recall that that was the better version, 1598 01:13:48,010 --> 01:13:51,370 rather than arbitrarily enumerate numbers yourself, let me go ahead 1599 01:13:51,370 --> 01:13:53,490 and print out quote unquote "Meow." 1600 01:13:53,490 --> 01:13:56,077 Now if I run Python of Meow, still seems to work. 1601 01:13:56,077 --> 01:13:57,910 So it's a little tighter, and, my God, like, 1602 01:13:57,910 --> 01:13:59,952 programs can't really get much shorter than this. 1603 01:13:59,952 --> 01:14:04,300 We're down to two lines of code, no main function, no gratuitous syntax. 1604 01:14:04,300 --> 01:14:06,580 Let's now improve the design further, like we 1605 01:14:06,580 --> 01:14:09,550 did in C, by introducing a function called 1606 01:14:09,550 --> 01:14:11,230 meow, that actually does the meowing. 1607 01:14:11,230 --> 01:14:13,000 So this was our first abstraction, recall, 1608 01:14:13,000 --> 01:14:18,100 both in Scratch and in C. Let me focus now entirely on the Python version 1609 01:14:18,100 --> 01:14:18,760 here. 1610 01:14:18,760 --> 01:14:23,485 Let me go ahead and first define a function. 1611 01:14:23,485 --> 01:14:26,890 1612 01:14:26,890 --> 01:14:30,250 Let me first go ahead and do this, for i in range of 3, 1613 01:14:30,250 --> 01:14:33,430 let's assume for the moment that there's a meow function, 1614 01:14:33,430 --> 01:14:34,720 that I'm just going to call. 1615 01:14:34,720 --> 01:14:38,320 Let's now go ahead and define, using the Def key word, which we saw briefly 1616 01:14:38,320 --> 01:14:41,170 with the speller demonstration, a function 1617 01:14:41,170 --> 01:14:42,880 called meow that takes no arguments. 
1618 01:14:42,880 --> 01:14:45,460 And all it does for now is print meow. 1619 01:14:45,460 --> 01:14:50,620 Let me now go ahead and run Python of Meow.py, Enter, huh, one 1620 01:14:50,620 --> 01:14:51,950 of those tracebacks. 1621 01:14:51,950 --> 01:14:54,080 So this is another name error. 1622 01:14:54,080 --> 01:14:57,080 And, again, name meow is not defined. 1623 01:14:57,080 --> 01:14:59,080 What's your instinct here, even though we've not 1624 01:14:59,080 --> 01:15:00,760 tripped over this yet in Python? 1625 01:15:00,760 --> 01:15:03,130 Where does your mind go here? 1626 01:15:03,130 --> 01:15:03,670 Yeah. 1627 01:15:03,670 --> 01:15:06,080 AUDIENCE: Does it read top to bottom, left to right? 1628 01:15:06,080 --> 01:15:09,600 I'm guessing we could find a new case. 1629 01:15:09,600 --> 01:15:13,020 DAVID J. MALAN: Perfect, as smart, as smarter as Python seems to be, 1630 01:15:13,020 --> 01:15:14,770 it still makes certain assumptions. 1631 01:15:14,770 --> 01:15:18,010 And if it hasn't seen a name yet, it just doesn't exist. 1632 01:15:18,010 --> 01:15:21,000 So if you want it to exist, we have to be a little clever here. 1633 01:15:21,000 --> 01:15:24,090 I could just put it, flip it around, like this. 1634 01:15:24,090 --> 01:15:26,470 But this honestly isn't particularly good design. 1635 01:15:26,470 --> 01:15:26,970 Why? 1636 01:15:26,970 --> 01:15:30,390 Because now, if you, the reader of your code, whether you 1637 01:15:30,390 --> 01:15:32,970 wrote it or someone else, you kind of have to go fishing now. 1638 01:15:32,970 --> 01:15:34,560 Like where does this program begin? 1639 01:15:34,560 --> 01:15:38,130 And even though, yes, it's obvious that it begins on line four, logically, 1640 01:15:38,130 --> 01:15:40,710 like, if the file were longer, you're going to be annoyed 1641 01:15:40,710 --> 01:15:43,180 and fishing visually for the right lines of code. 1642 01:15:43,180 --> 01:15:44,397 So let's reintroduce main.
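A small sketch of the name error described above. In the lecture the program simply crashes with a traceback; here, as an assumption for illustration, the call is wrapped in try/except so the script can keep going and show that the same call works once the def has run. The variable error_message is hypothetical.

```python
# Python runs a file top to bottom, so a function's name does not
# exist until its def statement has been executed.
try:
    for i in range(3):
        meow()           # meow is not defined yet at this point
except NameError as e:
    error_message = str(e)
    print("NameError:", error_message)


def meow():
    print("meow")


# Now that the def above has run, the same call succeeds.
meow()
```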
1643 01:15:44,397 --> 01:15:46,230 And indeed, this would be a common paradigm. 1644 01:15:46,230 --> 01:15:49,380 When you want to start having abstractions in your own functions, 1645 01:15:49,380 --> 01:15:53,460 just put your own code in main, so that, one, you can leave it up top, and two, 1646 01:15:53,460 --> 01:15:55,650 you can solve the problem we just encountered. 1647 01:15:55,650 --> 01:15:58,860 So let me define a function called main that has that same loop, 1648 01:15:58,860 --> 01:16:00,240 meowing three times. 1649 01:16:00,240 --> 01:16:02,040 But now watch what happens. 1650 01:16:02,040 --> 01:16:07,350 Let me go into my terminal and run Python of Meow.py, Enter. 1651 01:16:07,350 --> 01:16:07,850 Nothing. 1652 01:16:07,850 --> 01:16:10,500 1653 01:16:10,500 --> 01:16:14,050 All right, investigate this. 1654 01:16:14,050 --> 01:16:16,290 What could explain this symptom. 1655 01:16:16,290 --> 01:16:18,020 I have not told you the answer yet. 1656 01:16:18,020 --> 01:16:19,770 So all you have is your instinct, assuming 1657 01:16:19,770 --> 01:16:21,720 you've never touched Python before. 1658 01:16:21,720 --> 01:16:26,800 What might explain this symptom, where nothing is meowing? 1659 01:16:26,800 --> 01:16:27,300 Yeah? 1660 01:16:27,300 --> 01:16:28,970 AUDIENCE: Didn't run the main function. 1661 01:16:28,970 --> 01:16:31,178 DAVID J. MALAN: Yeah, I didn't run the main function. 1662 01:16:31,178 --> 01:16:33,390 So in C, this is functionality you get for free. 1663 01:16:33,390 --> 01:16:34,765 You have to have a main function. 1664 01:16:34,765 --> 01:16:37,580 But, heck, so long as you make it, it will be called for you. 1665 01:16:37,580 --> 01:16:41,390 In Python, this is just a convention, to create a main function, 1666 01:16:41,390 --> 01:16:43,200 borrowing a very common name for it. 1667 01:16:43,200 --> 01:16:46,320 But if you want to call that main function, you have to do it. 
1668 01:16:46,320 --> 01:16:48,110 So this looks a little weird, admittedly, 1669 01:16:48,110 --> 01:16:50,030 that you have to call your own main function now, 1670 01:16:50,030 --> 01:16:51,860 and it has to be at the bottom of the file, 1671 01:16:51,860 --> 01:16:55,040 because only once the interpreter gets to the bottom of the file, 1672 01:16:55,040 --> 01:16:58,460 have all of your functions been defined, higher up. 1673 01:16:58,460 --> 01:16:59,990 But this solves both problems. 1674 01:16:59,990 --> 01:17:02,450 It keeps your code, that's the main part of your code, 1675 01:17:02,450 --> 01:17:03,660 at the very top of the file. 1676 01:17:03,660 --> 01:17:06,980 So it's just obvious to you, and a TF, or any reader in the future, 1677 01:17:06,980 --> 01:17:09,140 where the program logically starts. 1678 01:17:09,140 --> 01:17:13,310 But it also ensures that main is not called until everything else, main 1679 01:17:13,310 --> 01:17:15,660 included, has been defined. 1680 01:17:15,660 --> 01:17:17,648 So this is another perfect example of we're 1681 01:17:17,648 --> 01:17:19,440 learning a new language for the first time. 1682 01:17:19,440 --> 01:17:21,020 You're not going to have heard all of the answers before. 1683 01:17:21,020 --> 01:17:24,830 Just apply some logic, as to, like, all right, what could explain this symptom. 1684 01:17:24,830 --> 01:17:28,190 Start to infer how the language does or doesn't work. 1685 01:17:28,190 --> 01:17:32,450 If I now go and run this, Python of Meow.py, now we're back in business. 1686 01:17:32,450 --> 01:17:35,360 And just so you have seen it, there is a quote 1687 01:17:35,360 --> 01:17:38,840 unquote "better" way of doing this, that solves different problems that we 1688 01:17:38,840 --> 01:17:42,050 are not going to encounter, certainly in these initial days. 
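The arrangement just described might look like the sketch below: main sits at the top of the file, the helper is defined after it, and the call to main comes last. This is a reconstruction from the spoken description, not a copy of the code on screen.

```python
def main():
    # The main part of the program stays at the top of the file.
    for i in range(3):
        meow()


def meow():
    print("meow")


# By this line, both main and meow have been defined,
# so calling main here is safe.
main()
```

Removing the final line reproduces the "nothing happens" symptom: the functions get defined, but nothing ever calls them.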
1689 01:17:42,050 --> 01:17:45,440 Typically, you would see in online tutorials or books, 1690 01:17:45,440 --> 01:17:49,400 something that looks like this, where you actually have a weird conditional 1691 01:17:49,400 --> 01:17:50,810 with multiple underscores. 1692 01:17:50,810 --> 01:17:54,470 That's functionally the same thing, but it solves problems with libraries, 1693 01:17:54,470 --> 01:17:57,840 if we ourselves were implementing a library or something similar in spirit. 1694 01:17:57,840 --> 01:18:00,882 But we're going to keep things simpler and just write main at the bottom, 1695 01:18:00,882 --> 01:18:03,355 because we're not going to encounter that problem just yet. 1696 01:18:03,355 --> 01:18:06,230 All right, let's make one change to this, just to show how it's done. 1697 01:18:06,230 --> 01:18:11,420 In C, the last version of meow also took command line argument, sorry, also 1698 01:18:11,420 --> 01:18:13,910 took arguments to the function meow. 1699 01:18:13,910 --> 01:18:16,490 So suppose that I want to factor this out. 1700 01:18:16,490 --> 01:18:19,250 And I want to just call meow as a better abstraction, where I just 1701 01:18:19,250 --> 01:18:21,080 say meow this number of times. 1702 01:18:21,080 --> 01:18:24,290 And I figure out how many times by just, like, putting in number 3 1703 01:18:24,290 --> 01:18:26,990 or using getInt or something like that, to figure out 1704 01:18:26,990 --> 01:18:28,550 how many times to say meow. 1705 01:18:28,550 --> 01:18:31,820 Well, now, I have to define inside my meow function, an input, 1706 01:18:31,820 --> 01:18:38,330 let's call it n, and then use that, as by doing this, for i in range of n, 1707 01:18:38,330 --> 01:18:41,640 let me go ahead and print out meow that many times.
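A sketch combining the two ideas above: meow taking a parameter n, and the "weird conditional with multiple underscores" seen in tutorials. The lecture itself keeps a plain call to main at the bottom; the guard is shown here only so you have seen it.

```python
def main():
    meow(3)


def meow(n):
    # n is how many times to meow; no parameter or return types declared
    for i in range(n):
        print("meow")


# Call main only when this file is run directly,
# not when it is imported by another file as a library.
if __name__ == "__main__":
    main()
```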
1708 01:18:41,640 --> 01:18:43,820 So again, the only thing that's different from C 1709 01:18:43,820 --> 01:18:47,630 is we don't bother specifying return types for any of these functions, 1710 01:18:47,630 --> 01:18:52,230 and we don't bother specifying the type of our arguments or our variables. 1711 01:18:52,230 --> 01:18:54,930 So same ideas, simpler in some sense. 1712 01:18:54,930 --> 01:18:56,660 We're just throwing away keystrokes. 1713 01:18:56,660 --> 01:18:59,450 All right, let me run this one final time, Python of Meow.py, 1714 01:18:59,450 --> 01:19:02,390 and we still have the same program. 1715 01:19:02,390 --> 01:19:04,110 All right, let me pause here. 1716 01:19:04,110 --> 01:19:04,780 Any questions? 1717 01:19:04,780 --> 01:19:06,030 And I know this is going fast. 1718 01:19:06,030 --> 01:19:11,355 But hopefully, the C code is still somewhat familiar. 1719 01:19:11,355 --> 01:19:11,855 Yeah. 1720 01:19:11,855 --> 01:19:17,530 AUDIENCE: Is there any difference between global and local variables. 1721 01:19:17,530 --> 01:19:18,780 DAVID J. MALAN: Good question. 1722 01:19:18,780 --> 01:19:21,238 Is there any difference between global and local variables? 1723 01:19:21,238 --> 01:19:23,850 Short answer, yes, and we would run into that same problem, 1724 01:19:23,850 --> 01:19:25,320 if we declare a variable in one function, 1725 01:19:25,320 --> 01:19:27,445 another function is not going to have access to it. 1726 01:19:27,445 --> 01:19:30,660 We can solve that by putting variables globally. 1727 01:19:30,660 --> 01:19:32,760 But we don't have all of the features we had in C, 1728 01:19:32,760 --> 01:19:35,160 like there's no such thing as a constant in Python. 1729 01:19:35,160 --> 01:19:36,900 The mentality in the Python community is, 1730 01:19:36,900 --> 01:19:39,480 if you don't want some value to change, don't touch it. 1731 01:19:39,480 --> 01:19:40,630 Like just don't screw up.
1732 01:19:40,630 --> 01:19:42,240 So there's trade-offs here, too. 1733 01:19:42,240 --> 01:19:45,000 Some languages are stronger or more defensive than that. 1734 01:19:45,000 --> 01:19:48,990 But that, too, is part of the mindset with this particular language. 1735 01:19:48,990 --> 01:19:49,770 [SIREN] 1736 01:19:49,770 --> 01:19:50,645 DAVID J. MALAN: Yeah. 1737 01:19:50,645 --> 01:19:52,937 AUDIENCE: There is really only one green line, in the-- 1738 01:19:52,937 --> 01:19:54,437 DAVID J. MALAN: Oh, sorry, where's-- 1739 01:19:54,437 --> 01:19:55,080 say it louder. 1740 01:19:55,080 --> 01:19:58,342 AUDIENCE: There has only been one green line printed at a time. 1741 01:19:58,342 --> 01:20:00,050 DAVID J. MALAN: That is an amazing segue. 1742 01:20:00,050 --> 01:20:01,370 Let's come to that in just a moment, because we're 1743 01:20:01,370 --> 01:20:03,620 going to recreate also that Mario example, where 1744 01:20:03,620 --> 01:20:06,925 we had like the question marks for the coins and the vertical bars. 1745 01:20:06,925 --> 01:20:08,550 So let's come back to that in a second. 1746 01:20:08,550 --> 01:20:09,656 And your question? 1747 01:20:09,656 --> 01:20:13,362 AUDIENCE: If strings are immutable, and every time you like make a copy. 1748 01:20:13,362 --> 01:20:15,320 DAVID J. MALAN: Correct, strings are immutable. 1749 01:20:15,320 --> 01:20:19,220 Any time you seem to be modifying it, as with the lower function, 1750 01:20:19,220 --> 01:20:20,480 you're getting back a copy. 1751 01:20:20,480 --> 01:20:22,940 So it's taking a little more memory somewhere. 1752 01:20:22,940 --> 01:20:26,145 But you don't have to deal with it. Python's doing that for you. 1753 01:20:26,145 --> 01:20:28,892 AUDIENCE: So you don't free anything. 1754 01:20:28,892 --> 01:20:30,100 DAVID J. MALAN: Say it again? 1755 01:20:30,100 --> 01:20:31,226 You don't need what? 1756 01:20:31,226 --> 01:20:34,663 AUDIENCE: You don't free like taking leave on stuff.
1757 01:20:34,663 --> 01:20:36,330 DAVID J. MALAN: You don't free anything. 1758 01:20:36,330 --> 01:20:38,870 So if you weren't a big fan, over the past couple of weeks, 1759 01:20:38,870 --> 01:20:42,860 of malloc or free or memory or addresses, or all 1760 01:20:42,860 --> 01:20:44,990 of those low level implementation details, 1761 01:20:44,990 --> 01:20:47,390 Python is the language for you, because all of that 1762 01:20:47,390 --> 01:20:49,340 is handled for you automatically. 1763 01:20:49,340 --> 01:20:50,780 Java does the same. 1764 01:20:50,780 --> 01:20:51,960 JavaScript does the same. 1765 01:20:51,960 --> 01:20:52,460 Yeah. 1766 01:20:52,460 --> 01:20:58,244 AUDIENCE: Each up for the variable, you put it before the name, use of the body 1767 01:20:58,244 --> 01:20:59,700 before the name, correct? 1768 01:20:59,700 --> 01:21:03,785 Well, if there isn't a main function in Python, how do you define those words? 1769 01:21:03,785 --> 01:21:05,910 DAVID J. MALAN: How do you define a global variable 1770 01:21:05,910 --> 01:21:07,493 if there's no main function in Python? 1771 01:21:07,493 --> 01:21:11,480 Global variables, by definition, always need to be outside of main, as well. 1772 01:21:11,480 --> 01:21:12,480 So that's not a problem. 1773 01:21:12,480 --> 01:21:15,300 If I wanted to have a function that's outside of, 1774 01:21:15,300 --> 01:21:19,703 and, therefore, global to all of these, like global-- 1775 01:21:19,703 --> 01:21:22,620 actually, don't use the word global, that's a special word in Python-- 1776 01:21:22,620 --> 01:21:27,450 variable equals Foo, F-O-O, just as an arbitrary string 1777 01:21:27,450 --> 01:21:31,410 value that a computer scientist would typically use, that is now global. 1778 01:21:31,410 --> 01:21:34,000 There are some caveats, though, as to how you access that. 1779 01:21:34,000 --> 01:21:36,010 But let's come back to that another time. 1780 01:21:36,010 --> 01:21:38,030 But that problem is solvable, too. 
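The lecture defers the caveats about accessing globals to another time. As a hedged preview, the sketch below shows one such caveat: reading a global inside a function works, while assigning to the same name creates a local unless the global keyword is used. The function names show, shadow, and update are hypothetical.

```python
variable = "foo"     # defined outside any function, so it is global


def show():
    # Reading a global from inside a function just works.
    return variable


def shadow():
    # Assigning creates a new local variable; the global is untouched.
    variable = "bar"
    return variable


def update():
    # The global keyword makes the assignment target the global.
    global variable
    variable = "baz"


print(show())        # foo
print(shadow())      # bar
print(variable)      # foo
update()
print(variable)      # baz
```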
1781 01:21:38,030 --> 01:21:38,530 All right. 1782 01:21:38,530 --> 01:21:39,780 So let's go ahead and do this. 1783 01:21:39,780 --> 01:21:43,050 To come back to the question about the print command, let me go ahead 1784 01:21:43,050 --> 01:21:45,300 and create a file now called Mario.py. 1785 01:21:45,300 --> 01:21:47,700 Won't bother showing the C code anymore. 1786 01:21:47,700 --> 01:21:49,590 We'll focus just on the new language here. 1787 01:21:49,590 --> 01:21:54,540 But recall that, in Python, in Mario, we wanted to first do something like this. 1788 01:21:54,540 --> 01:21:57,600 This was a random screen from the side scroller version 1 1789 01:21:57,600 --> 01:21:58,800 of Super Mario Brothers. 1790 01:21:58,800 --> 01:22:02,820 And we just want to print like three hashes to represent those three blocks. 1791 01:22:02,820 --> 01:22:04,950 Well, in Python, we could do something like this, 1792 01:22:04,950 --> 01:22:11,280 print, oh, sorry, for i in the range of 3, go ahead and print out quote unquote 1793 01:22:11,280 --> 01:22:11,828 "hash." 1794 01:22:11,828 --> 01:22:13,620 And I think this is pretty straightforward. 1795 01:22:13,620 --> 01:22:16,260 Python of Mario.py, we get our three hashes. 1796 01:22:16,260 --> 01:22:18,850 You could imagine parameterizing this now, though, 1797 01:22:18,850 --> 01:22:20,350 and getting actual user input. 1798 01:22:20,350 --> 01:22:21,730 So let's do that. 1799 01:22:21,730 --> 01:22:27,420 Let me go up here and let me go and say from CS50, import getInt, 1800 01:22:27,420 --> 01:22:31,090 and then let's get the input from the user. 1801 01:22:31,090 --> 01:22:33,210 So it actually is a value n, like, all right, 1802 01:22:33,210 --> 01:22:38,190 getInt the height of the column of bricks that you want to do. 1803 01:22:38,190 --> 01:22:42,270 And then, let's go ahead and print out n hashes instead of three. 1804 01:22:42,270 --> 01:22:43,560 So let me run this. 
1805 01:22:43,560 --> 01:22:45,385 Let's print out like five hashes. 1806 01:22:45,385 --> 01:22:47,760 OK, one, two, three, four, five, that seems to work, too. 1807 01:22:47,760 --> 01:22:49,677 And it's going to work for any positive value. 1808 01:22:49,677 --> 01:22:53,400 But it's not going to work for, how about negative 1? 1809 01:22:53,400 --> 01:22:54,660 That just doesn't do anything. 1810 01:22:54,660 --> 01:22:55,747 But that seems OK. 1811 01:22:55,747 --> 01:22:58,830 But also recall that it's not going to work if the user types in something 1812 01:22:58,830 --> 01:23:03,990 weird, like, oh, sorry, it is going to work if the user types in something 1813 01:23:03,990 --> 01:23:05,790 weird like cat, why? 1814 01:23:05,790 --> 01:23:08,820 We're using CS50's getInt function, which is 1815 01:23:08,820 --> 01:23:11,710 handling all of those headaches for us. 1816 01:23:11,710 --> 01:23:15,180 But, what if the user indeed types a negative number? 1817 01:23:15,180 --> 01:23:16,110 We're tolerating that. 1818 01:23:16,110 --> 01:23:17,860 So that was the bug I wanted to highlight. 1819 01:23:17,860 --> 01:23:20,250 It would be nice to re-prompt them and re-prompt them. 1820 01:23:20,250 --> 01:23:22,560 And in C, what was the programming construct we 1821 01:23:22,560 --> 01:23:25,020 used when we wanted to ask the user a question. 1822 01:23:25,020 --> 01:23:29,280 And then, if they didn't cooperate, prompt them again, prompt them again. 1823 01:23:29,280 --> 01:23:29,890 What was that? 1824 01:23:29,890 --> 01:23:30,390 Yeah. 1825 01:23:30,390 --> 01:23:30,750 AUDIENCE: Do while loop. 1826 01:23:30,750 --> 01:23:32,100 DAVID J. MALAN: Yeah, do while loop, right? 1827 01:23:32,100 --> 01:23:34,830 That was useful, because it's almost the same as a while loop. 
1828 01:23:34,830 --> 01:23:38,100 But instead of checking a condition, and then doing something, 1829 01:23:38,100 --> 01:23:39,948 you do something and then check a condition, 1830 01:23:39,948 --> 01:23:42,240 which makes sense with user input, because what are you 1831 01:23:42,240 --> 01:23:44,615 even going to check if the user hasn't done anything yet? 1832 01:23:44,615 --> 01:23:46,200 You need that inverted logic. 1833 01:23:46,200 --> 01:23:50,010 Unfortunately in Python, there is no do while loop. 1834 01:23:50,010 --> 01:23:51,300 There is a for loop. 1835 01:23:51,300 --> 01:23:52,740 There is a while loop. 1836 01:23:52,740 --> 01:23:55,590 And frankly, those are enough to recreate this idea. 1837 01:23:55,590 --> 01:23:59,160 And the way to do this in Python, the Pythonic way, which 1838 01:23:59,160 --> 01:24:02,160 is another term of art in the community, is to say this. 1839 01:24:02,160 --> 01:24:06,300 Deliberately induce an infinite loop, while True, with capital T for true. 1840 01:24:06,300 --> 01:24:09,930 And then do what you got to do, like get an Int from a user, 1841 01:24:09,930 --> 01:24:12,060 asking them for the height of this thing. 1842 01:24:12,060 --> 01:24:18,270 And then, if that is what you want, like a number greater than zero, go ahead 1843 01:24:18,270 --> 01:24:20,020 and break out of the loop. 1844 01:24:20,020 --> 01:24:25,440 So this is how, in Python, you could recreate the idea of a do while loop. 1845 01:24:25,440 --> 01:24:27,315 You deliberately induce an infinite loop. 1846 01:24:27,315 --> 01:24:29,190 So something's going to happen at least once. 1847 01:24:29,190 --> 01:24:32,280 Then, if you get the answer you want, you break out of it, 1848 01:24:32,280 --> 01:24:34,330 effectively achieving the same logic. 1849 01:24:34,330 --> 01:24:37,080 So this is the Pythonic way of doing a do while loop. 1850 01:24:37,080 --> 01:24:41,760 Let me go ahead and run Python of Mario.py, type in 3 this time. 
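A minimal sketch of the while True and break pattern. To keep it runnable without a keyboard, a fixed list of stand-in answers replaces the real prompt; that substitution, and the name answers, are assumptions for illustration.

```python
# Stand-ins for what a user might type, one value per prompt.
answers = iter([-1, 0, 3])

while True:
    n = next(answers)    # do the work at least once
    if n > 0:            # then check the condition
        break            # leave the loop only on a good answer

print(n)   # 3
```

The body always runs before the condition is checked, which is the same ordering a do while loop gives you in C.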
1851 01:24:41,760 --> 01:24:44,670 And now I get back just the 3 hashes as well. 1852 01:24:44,670 --> 01:24:50,310 What if, though, I wanted to get rid of, how about ultimately 1853 01:24:50,310 --> 01:24:55,058 that CS50 library function, and also encapsulate this in a function. 1854 01:24:55,058 --> 01:24:57,100 Well, let's go ahead and tweak this a little bit. 1855 01:24:57,100 --> 01:24:59,070 Let me go ahead and remove this temporarily. 1856 01:24:59,070 --> 01:25:01,680 Give myself a main function, so I don't make the same mistake 1857 01:25:01,680 --> 01:25:03,360 as I did initially earlier. 1858 01:25:03,360 --> 01:25:07,110 And let me give myself a function called get height that takes no arguments. 1859 01:25:07,110 --> 01:25:10,620 And inside of that function is going to be that same code. 1860 01:25:10,620 --> 01:25:14,280 But I don't want to break in this case, I want to return n. 1861 01:25:14,280 --> 01:25:17,293 So, recall, that if you return from a function, you're done, 1862 01:25:17,293 --> 01:25:19,210 you're going to exit from right at that point. 1863 01:25:19,210 --> 01:25:20,320 So this would be fine. 1864 01:25:20,320 --> 01:25:22,680 You can just say return n inside of the loop, 1865 01:25:22,680 --> 01:25:25,320 or, if you would prefer to break out, you 1866 01:25:25,320 --> 01:25:26,940 could do something like this instead. 1867 01:25:26,940 --> 01:25:32,700 Break, and then down here, you could return, down here, 1868 01:25:32,700 --> 01:25:34,630 you could return n as well. 1869 01:25:34,630 --> 01:25:37,290 And let me make one point here before we go back up to main. 1870 01:25:37,290 --> 01:25:41,490 This is a little different from C. And this one's subtle. 1871 01:25:41,490 --> 01:25:47,250 What have I done here that in C would have been a bug, but is apparently not, 1872 01:25:47,250 --> 01:25:48,315 I claim, in Python. 1873 01:25:48,315 --> 01:25:50,860 1874 01:25:50,860 --> 01:25:52,220 It's super subtle, this one. 
1875 01:25:52,220 --> 01:25:52,720 Yeah. 1876 01:25:52,720 --> 01:25:55,911 AUDIENCE: So aren't we like defining mostly object, 1877 01:25:55,911 --> 01:25:59,470 like we're using it first, defining an object? 1878 01:25:59,470 --> 01:26:04,275 [INAUDIBLE] 1879 01:26:04,275 --> 01:26:07,150 DAVID J. MALAN: So similar, it's not quite that we're using it first. 1880 01:26:07,150 --> 01:26:10,980 So it's OK not to declare a variable with like the data type. 1881 01:26:10,980 --> 01:26:15,420 We've addressed that before, but on line 9, we're assigning n a value, it seems. 1882 01:26:15,420 --> 01:26:18,600 And then we return n on line 12. 1883 01:26:18,600 --> 01:26:20,190 But notice the indentation. 1884 01:26:20,190 --> 01:26:25,410 In the world of C, if we had declared a variable inside of a loop, on line 9, 1885 01:26:25,410 --> 01:26:28,200 it would have been scoped to that loop, which 1886 01:26:28,200 --> 01:26:31,530 means as soon as you get out of that loop, like further down in the program, 1887 01:26:31,530 --> 01:26:33,340 n would not exist. 1888 01:26:33,340 --> 01:26:36,090 It would be local to the curly braces therein. 1889 01:26:36,090 --> 01:26:39,720 Here, logically, curly braces are gone, but the indentation 1890 01:26:39,720 --> 01:26:44,250 makes clear that n is still inside of this loop, between lines 8 through 11. 1891 01:26:44,250 --> 01:26:47,280 But n is actually still in scope in Python. 1892 01:26:47,280 --> 01:26:50,380 The moment you create a variable in Python, for better or for worse, 1893 01:26:50,380 --> 01:26:53,760 It is available everywhere within that function, even outside 1894 01:26:53,760 --> 01:26:55,690 of the loop in which you defined it. 1895 01:26:55,690 --> 01:26:59,070 So this logic is actually OK in Python. 
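A short sketch of that scoping rule, with a hypothetical function demo. A variable first assigned inside a loop, or inside a conditional nested in a loop, is still usable afterward in the same function, provided the assignment actually ran.

```python
def demo():
    for i in range(3):
        if i == 2:
            last = i * 10    # first assigned two levels deep
    # Both names are still in scope here, after the loop has ended.
    return i, last


print(demo())   # (2, 20)
```

In C, the equivalent variables would have gone out of scope at the closing curly brace of the loop.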
1896 01:26:59,070 --> 01:27:02,138 In C, recall, to solve this same problem, 1897 01:27:02,138 --> 01:27:04,680 we would have had to do something a little hackish like this, 1898 01:27:04,680 --> 01:27:09,600 like define n up here on line 8, so that it exists, now, on line 10, 1899 01:27:09,600 --> 01:27:12,000 and so that it exists on line 13. 1900 01:27:12,000 --> 01:27:15,700 That is no longer an issue or need, in Python. 1901 01:27:15,700 --> 01:27:17,700 Once you create a variable, even if it's nested, 1902 01:27:17,700 --> 01:27:19,867 nested, nested inside of some loops or conditionals, 1903 01:27:19,867 --> 01:27:23,520 it still exists within the function itself. 1904 01:27:23,520 --> 01:27:27,870 All right, any questions then on this, before we now run this and then get 1905 01:27:27,870 --> 01:27:31,680 rid of the CS50 library again? 1906 01:27:31,680 --> 01:27:34,300 OK, so let me go ahead and get the height from the user. 1907 01:27:34,300 --> 01:27:36,758 Let's go ahead and create a variable in main called height. 1908 01:27:36,758 --> 01:27:38,460 Let's call this get height function. 1909 01:27:38,460 --> 01:27:43,380 And then let's use that height value, instead of something hardcoded there. 1910 01:27:43,380 --> 01:27:45,000 And let me see if this all works now. 1911 01:27:45,000 --> 01:27:46,410 Python of Mario.py. 1912 01:27:46,410 --> 01:27:49,110 Hopefully, I haven't messed up, but I did. 1913 01:27:49,110 --> 01:27:51,460 But this is an easy fix now. 1914 01:27:51,460 --> 01:27:51,960 Yeah. 1915 01:27:51,960 --> 01:27:53,085 AUDIENCE: Got to call main. 1916 01:27:53,085 --> 01:27:54,543 DAVID J. MALAN: I got to call main. 1917 01:27:54,543 --> 01:27:55,980 So again, I deleted that earlier. 1918 01:27:55,980 --> 01:27:56,920 But let me bring it back. 1919 01:27:56,920 --> 01:27:58,128 So I'm actually calling main. 1920 01:27:58,128 --> 01:28:02,190 Let me rerun Python of Mario.py, there we go, height 3. 
1921 01:28:02,190 --> 01:28:03,880 Now it seems to be working. 1922 01:28:03,880 --> 01:28:05,880 So let's do one last thing with Mario, just 1923 01:28:05,880 --> 01:28:08,980 to tie together that idea now of exceptions from before. 1924 01:28:08,980 --> 01:28:11,070 Again, exceptions are a feature of Python, 1925 01:28:11,070 --> 01:28:13,060 whereby you can try to do something. 1926 01:28:13,060 --> 01:28:16,710 And if there's a problem, you can handle it in any way you see fit. 1927 01:28:16,710 --> 01:28:20,070 Previously, I handled it by just yelling at the user that that's not an Int. 1928 01:28:20,070 --> 01:28:23,460 But let's actually use this to re-implement CS50's own getInt 1929 01:28:23,460 --> 01:28:24,240 function. 1930 01:28:24,240 --> 01:28:27,130 Let me throw away CS50's getInt function. 1931 01:28:27,130 --> 01:28:32,880 And now let me go ahead and replace getInt with input. 1932 01:28:32,880 --> 01:28:35,670 But it's not sufficient to just use input. 1933 01:28:35,670 --> 01:28:39,480 What do I have to add to this line of code on line 8? 1934 01:28:39,480 --> 01:28:40,740 If I want to get back an Int? 1935 01:28:40,740 --> 01:28:41,790 AUDIENCE: The Int function. 1936 01:28:41,790 --> 01:28:43,832 DAVID J. MALAN: Yeah, I have to cast it to an Int 1937 01:28:43,832 --> 01:28:46,500 by calling the Int function around that value, 1938 01:28:46,500 --> 01:28:48,750 or I could do it on a separate line, just to be clear. 1939 01:28:48,750 --> 01:28:52,110 I could also do n equals Int of n. 1940 01:28:52,110 --> 01:28:55,020 That would work too, but it's sort of an unnecessary extra line. 1941 01:28:55,020 --> 01:28:57,990 This is not sufficient, because that does not change the value. 1942 01:28:57,990 --> 01:28:58,935 It creates the value. 1943 01:28:58,935 --> 01:29:00,060 But then it throws it away. 1944 01:29:00,060 --> 01:29:01,192 We need to assign it. 
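A small sketch of why the result has to be assigned. The string "3" stands in for what input would return, since input always gives back a str; the names text and n are illustrative.

```python
text = "3"           # what input() would hand back: always a str

int(text)            # creates the int 3, but the result is thrown away
print(type(text))    # <class 'str'>

n = int(text)        # assign the result to keep it
print(n + 1)         # 4
```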
1945 01:29:01,192 --> 01:29:03,900 So the conventional way to do this would probably be in one line, 1946 01:29:03,900 --> 01:29:05,358 just to keep things nice and tight. 1947 01:29:05,358 --> 01:29:06,780 So that works fine now. 1948 01:29:06,780 --> 01:29:11,470 If I run Python of Mario.py, I can still type in 3, and all is well. 1949 01:29:11,470 --> 01:29:15,720 I can still type in negative 1, because that is an Int that I am handling. 1950 01:29:15,720 --> 01:29:18,750 What I'm not yet handling is weird input like cat 1951 01:29:18,750 --> 01:29:21,760 or some string that is not a base 10 number. 1952 01:29:21,760 --> 01:29:23,880 So here, again, is my traceback. 1953 01:29:23,880 --> 01:29:27,000 And notice that here, let me scroll up a little bit, 1954 01:29:27,000 --> 01:29:31,620 here we can actually see more detail in the traceback. 1955 01:29:31,620 --> 01:29:36,900 Notice that, just like in C, or just like in the debugger in VS Code, 1956 01:29:36,900 --> 01:29:38,100 you can see a few things. 1957 01:29:38,100 --> 01:29:41,490 You can see mention of module, that just means your file, main, which 1958 01:29:41,490 --> 01:29:43,013 is my main function, and get height. 1959 01:29:43,013 --> 01:29:44,430 So notice, it's kind of backwards. 1960 01:29:44,430 --> 01:29:46,720 It's top to bottom instead of bottom up, as we drew it 1961 01:29:46,720 --> 01:29:48,720 on the board the other day, and as we envisioned 1962 01:29:48,720 --> 01:29:50,520 stacks of trays in the cafeteria. 1963 01:29:50,520 --> 01:29:52,680 But this is your stack, of functions that 1964 01:29:52,680 --> 01:29:54,330 have been called, from top to bottom. 1965 01:29:54,330 --> 01:29:57,360 Get height is the most recent, main is the very first, 1966 01:29:57,360 --> 01:29:59,200 value error is the problem. 1967 01:29:59,200 --> 01:30:03,740 So let's try to do, let's try to do this literally, except if there's an error. 1968 01:30:03,740 --> 01:30:04,740 So what do I want to do?
1969 01:30:04,740 --> 01:30:09,720 I'm going to go in here, and I'm going to say, try to do the following. 1970 01:30:09,720 --> 01:30:17,070 Whoops, try to do the following, except if there's a value error, value error, 1971 01:30:17,070 --> 01:30:20,640 then go ahead and say something, well, like before, print, 1972 01:30:20,640 --> 01:30:23,830 that's not an integer exclamation point. 1973 01:30:23,830 --> 01:30:26,760 But the difference this time is because I'm in a loop, the user 1974 01:30:26,760 --> 01:30:29,200 is going to have a chance to recover from this issue. 1975 01:30:29,200 --> 01:30:32,340 So if I run Mario.py, 3 still works as before. 1976 01:30:32,340 --> 01:30:35,880 If I run Mario.py and type in cat, I detect it now, 1977 01:30:35,880 --> 01:30:39,240 and because I'm still in that loop, and because the program hasn't crashed, 1978 01:30:39,240 --> 01:30:43,050 because I've caught, so to speak, the value error, using this line of code 1979 01:30:43,050 --> 01:30:46,950 here, that's the way in Python to detect these kinds of errors, 1980 01:30:46,950 --> 01:30:49,680 that would otherwise end up being on the user's own screen. 1981 01:30:49,680 --> 01:30:51,540 If I type in cat, dog, that doesn't work. 1982 01:30:51,540 --> 01:30:56,820 If I type in, though, 2, I get my two hashes, because that's, indeed, an Int. 1983 01:30:56,820 --> 01:30:58,740 Are any questions on this, and we're not going 1984 01:30:58,740 --> 01:31:00,750 to spend too much time on exceptions, but just wanted 1985 01:31:00,750 --> 01:31:03,680 to show you what's involved with getting rid of those training wheels. 1986 01:31:03,680 --> 01:31:04,180 Yeah. 1987 01:31:04,180 --> 01:31:05,763 AUDIENCE: Then the hash marks in line. 1988 01:31:05,763 --> 01:31:07,305 DAVID J. MALAN: OK, so let's do this. 
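The try/except loop described above might look roughly like the following sketch. The function name `get_height`, the positive-height check, and the `read` parameter are assumptions: `read` defaults to the built-in `input` and exists only so the function can be exercised without typing at a prompt.

```python
def get_height(read=input):
    """Keep prompting until the user supplies a positive integer."""
    while True:
        try:
            # int() raises ValueError for text such as "cat"
            n = int(read("Height: "))
        except ValueError:
            # The error is caught, so the program does not crash
            # and the loop simply prompts again.
            print("That's not an integer!")
        else:
            # Assumed rule from earlier in the example: reject 0 and negatives.
            if n > 0:
                return n
```

Calling `get_height()` with no arguments prompts at the keyboard; typing cat, then dog, then 2 would print the complaint twice and then return 2.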
1989 01:31:07,305 --> 01:31:09,140 That actually comes to the earlier question 1990 01:31:09,140 --> 01:31:11,060 about printing the hashes on the same line, 1991 01:31:11,060 --> 01:31:13,808 or maybe something like this, where we have the little bricks 1992 01:31:13,808 --> 01:31:15,350 in the sky, or little question marks. 1993 01:31:15,350 --> 01:31:17,725 Let's recreate this idea, because the problem with print, 1994 01:31:17,725 --> 01:31:20,930 as was noted earlier, is you're automatically printing out new lines. 1995 01:31:20,930 --> 01:31:22,460 But what if we don't want that. 1996 01:31:22,460 --> 01:31:24,740 Well, let's change this program entirely. 1997 01:31:24,740 --> 01:31:26,310 Let me throw away all the functions. 1998 01:31:26,310 --> 01:31:29,220 Let's just go to a simpler world, where we're just doing this. 1999 01:31:29,220 --> 01:31:30,912 So let me start fresh in Mario.py. 2000 01:31:30,912 --> 01:31:33,120 I'm not going to bother with exceptions or functions. 2001 01:31:33,120 --> 01:31:39,410 Let's just do a very simple program, to create this idea, for i in range of 4 2002 01:31:39,410 --> 01:31:42,860 this time, because there are four of these things in the sky. 2003 01:31:42,860 --> 01:31:45,230 Let's go ahead and just print out a question mark 2004 01:31:45,230 --> 01:31:47,450 to represent each of those bricks. 2005 01:31:47,450 --> 01:31:51,140 Odds are you know this not going to end well, because these are unfortunately, 2006 01:31:51,140 --> 01:31:54,450 as you've predicted, on separate lines. 2007 01:31:54,450 --> 01:31:57,380 So it turns out that the print function actually 2008 01:31:57,380 --> 01:32:00,320 takes in multiple arguments, not just the thing you want to print, 2009 01:32:00,320 --> 01:32:03,650 but also some additional arguments, that allow you to specify 2010 01:32:03,650 --> 01:32:06,170 what the default line ending should be. 
2011 01:32:06,170 --> 01:32:09,110 But what's interesting about this is that, if you 2012 01:32:09,110 --> 01:32:12,630 want to change the line ending to be something like, 2013 01:32:12,630 --> 01:32:16,790 quote unquote, "that is nothing," instead of backslash n, 2014 01:32:16,790 --> 01:32:19,310 this is not sufficient, because in Python, you 2015 01:32:19,310 --> 01:32:21,770 can have two types of arguments, or parameters. 2016 01:32:21,770 --> 01:32:25,160 Some arguments are positional, which is the fancy way of saying it's 2017 01:32:25,160 --> 01:32:26,690 a comma separated list of arguments. 2018 01:32:26,690 --> 01:32:29,540 And that's what we did all the time in C. Something comma, something 2019 01:32:29,540 --> 01:32:31,665 comma, something, we did it in printf all the time, 2020 01:32:31,665 --> 01:32:33,980 and in other functions that took multiple arguments. 2021 01:32:33,980 --> 01:32:37,880 In Python, you have, not only positional arguments, 2022 01:32:37,880 --> 01:32:41,660 where you just separate them by commas, to give one or two or three or more 2023 01:32:41,660 --> 01:32:42,650 arguments. 2024 01:32:42,650 --> 01:32:46,220 There are also named arguments, which looks weird but is 2025 01:32:46,220 --> 01:32:48,140 helpful for reasons like this. 2026 01:32:48,140 --> 01:32:50,900 If you read the documentation, you will see 2027 01:32:50,900 --> 01:32:54,740 that there is a named argument that Python accepts, called end. 2028 01:32:54,740 --> 01:32:57,680 And if you set that equal to something, that 2029 01:32:57,680 --> 01:33:00,200 will be used as the end of every line, instead 2030 01:33:00,200 --> 01:33:02,750 of the default, which the documentation will also say 2031 01:33:02,750 --> 01:33:04,700 is quote unquote backslash n. 2032 01:33:04,700 --> 01:33:09,000 So this line here has no effect on my logic at the moment. 
2033 01:33:09,000 --> 01:33:13,280 But if I change it to just quote unquote, essentially overriding 2034 01:33:13,280 --> 01:33:18,470 the default new line character, and now run Mario again, now I get all four 2035 01:33:18,470 --> 01:33:19,278 on the same line. 2036 01:33:19,278 --> 01:33:20,570 There's a bit of a bug, though. 2037 01:33:20,570 --> 01:33:23,610 My prompt is not meant to be on the same line. 2038 01:33:23,610 --> 01:33:25,640 So I can fix that by just printing nothing. 2039 01:33:25,640 --> 01:33:28,640 But, really, it's not nothing, because you get the new line for free. 2040 01:33:28,640 --> 01:33:32,930 So let me run Python of Mario.py again, and now we 2041 01:33:32,930 --> 01:33:36,140 have what I intended in the first place, which was a little something that 2042 01:33:36,140 --> 01:33:37,170 looked like this. 2043 01:33:37,170 --> 01:33:40,910 And this is just one example of an argument that has a name. 2044 01:33:40,910 --> 01:33:43,280 But this is a common paradigm in Python, too, 2045 01:33:43,280 --> 01:33:46,250 to not just separate things by commas, but to be very specific, 2046 01:33:46,250 --> 01:33:50,810 because the print function might take 5, 10, even 20 different arguments. 2047 01:33:50,810 --> 01:33:54,628 And my God, if you had to enumerate like 10 or 20 commas, 2048 01:33:54,628 --> 01:33:55,670 you're going to screw up. 2049 01:33:55,670 --> 01:33:57,587 You're going to get things in the wrong order. 2050 01:33:57,587 --> 01:34:00,600 Named arguments allow you to be resilient against that. 2051 01:34:00,600 --> 01:34:02,690 So you only specify arguments by name, and it 2052 01:34:02,690 --> 01:34:06,004 doesn't matter what order they are in. 2053 01:34:06,004 --> 01:34:10,160 All right, any questions, then, on this, and the overriding of new line.
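As a sketch of the named-argument fix just described: the wrapper function `bricks` and its `count` parameter are hypothetical, added only so the loop can be reused; the lecture's version is a bare three-line script.

```python
def bricks(count=4):
    """Print `count` question marks on a single line."""
    for i in range(count):
        # end="" overrides print's default line ending of "\n"
        print("?", end="")
    # A bare print() outputs only the newline, which moves the
    # shell prompt back onto its own line.
    print()


bricks()
```

Because `end` is passed by name, it can be supplied without knowing where it falls among print's other parameters.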
2054 01:34:10,160 --> 01:34:14,270 And to be clear, you can do something like, very weird, 2055 01:34:14,270 --> 01:34:19,910 but logically expected, like this, by just changing the line ending, too. 2056 01:34:19,910 --> 01:34:21,830 But the right way to solve the Mario problem 2057 01:34:21,830 --> 01:34:25,652 would be just to override it to be nothing like this. 2058 01:34:25,652 --> 01:34:27,110 All right, how about this for cool. 2059 01:34:27,110 --> 01:34:29,000 And this is why a lot of people like Python. 2060 01:34:29,000 --> 01:34:30,440 Suppose you don't really like loops. 2061 01:34:30,440 --> 01:34:31,970 You don't really like three-line programs, 2062 01:34:31,970 --> 01:34:34,637 because that was kind of three times longer than it needs to be. 2063 01:34:34,637 --> 01:34:39,200 What if you just printed out a question mark four times? 2064 01:34:39,200 --> 01:34:43,380 Python, whoops, Python of Mario.py, that also works. 2065 01:34:43,380 --> 01:34:46,550 So it turns out that, just like the plus operator in Python 2066 01:34:46,550 --> 01:34:50,570 can join things together, the multiply operator is not 2067 01:34:50,570 --> 01:34:51,840 arithmetic in this case. 2068 01:34:51,840 --> 01:34:56,070 It actually means, take this and concatenate it four times over. 2069 01:34:56,070 --> 01:34:59,000 So that's a way of just distilling into one line what 2070 01:34:59,000 --> 01:35:02,750 would have otherwise taken multiple lines in C, fewer, but still multiple 2071 01:35:02,750 --> 01:35:07,130 lines in Python, but is really now rather succinct in Python, 2072 01:35:07,130 --> 01:35:08,385 by doing that instead. 2073 01:35:08,385 --> 01:35:11,510 Let's do one last Mario example, which looked a little something like this. 2074 01:35:11,510 --> 01:35:14,090 If this is another part of the Mario interface, 2075 01:35:14,090 --> 01:35:16,800 this is like a grid of like 3 by 3 bricks, for instance. 
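The one-line version mentioned above relies on the multiply operator repeating a string. A minimal sketch, with the variable names `row` and `joined` chosen only for illustration:

```python
# Multiplying a str by an int concatenates that many copies of it.
row = "?" * 4
print(row)

# For comparison, the plus operator joins two strings end to end.
joined = "?" + "?"
print(joined)
```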
2076 01:35:16,800 --> 01:35:20,690 So two dimensions now, just not just vertical, not horizontal, but now both. 2077 01:35:20,690 --> 01:35:23,130 Let's print out something like that, using hashes. 2078 01:35:23,130 --> 01:35:26,070 Well, how about, how do I do this. 2079 01:35:26,070 --> 01:35:29,210 So how about for i in range of 3. 2080 01:35:29,210 --> 01:35:34,280 Then I could do for j in range of 3, just because j comes after I 2081 01:35:34,280 --> 01:35:35,810 and that's reasonable for counting. 2082 01:35:35,810 --> 01:35:41,000 I could now print out a hash symbol, well, let's see what this does. 2083 01:35:41,000 --> 01:35:47,660 Python of Mario.py, OK, that's just one crazy long column. 2084 01:35:47,660 --> 01:35:51,240 What do I need to fix and where here, to make this look like this? 2085 01:35:51,240 --> 01:35:55,850 So 3 by 3 bricks, instead of one long column. 2086 01:35:55,850 --> 01:35:56,450 Any instincts? 2087 01:35:56,450 --> 01:36:00,500 AUDIENCE: Why don't we create a line and then we'll skip it. 2088 01:36:00,500 --> 01:36:03,450 DAVID J. MALAN: OK, so after printing 3, we want to skip a line. 2089 01:36:03,450 --> 01:36:05,750 So maybe like print out a blank line here. 2090 01:36:05,750 --> 01:36:06,740 OK, let's try that. 2091 01:36:06,740 --> 01:36:09,920 I like that instinct, right, print 3, new line, print 3, new line. 2092 01:36:09,920 --> 01:36:12,260 Let's go ahead and run Python of Mario.py. 2093 01:36:12,260 --> 01:36:16,580 OK, it's more visible, what I'm doing, but still wrong. 2094 01:36:16,580 --> 01:36:19,110 What can I, what's the remaining fix, though? 2095 01:36:19,110 --> 01:36:19,610 Yeah. 2096 01:36:19,610 --> 01:36:22,790 AUDIENCE: So right behind the two. 2097 01:36:22,790 --> 01:36:25,680 DAVID J. MALAN: Yeah, I'm getting an extra new line here, 2098 01:36:25,680 --> 01:36:27,870 which I don't want while I'm on this row. 
2099 01:36:27,870 --> 01:36:31,850 So let me do end equals quote unquote, and now, together, your solutions might 2100 01:36:31,850 --> 01:36:33,950 take us the whole way there. 2101 01:36:33,950 --> 01:36:37,345 Python of Mario.py, voila, now we've got it, in two dimensions. 2102 01:36:37,345 --> 01:36:38,720 And even this, we can tighten up. 2103 01:36:38,720 --> 01:36:41,220 Like, we could just use the little trick we learned. 2104 01:36:41,220 --> 01:36:45,230 So we could just say, print a hash times 3 times, 2105 01:36:45,230 --> 01:36:47,810 and we can get rid of one of those loops altogether. 2106 01:36:47,810 --> 01:36:50,930 All it's doing is, whoops, all it's doing is automating that process. 2107 01:36:50,930 --> 01:36:53,060 But, no, I don't want to do that. 2108 01:36:53,060 --> 01:36:54,832 What do I, how do I fix this here. 2109 01:36:54,832 --> 01:36:56,540 I don't think I want this anymore, right? 2110 01:36:56,540 --> 01:36:58,350 Because that's giving me an extra new line. 2111 01:36:58,350 --> 01:37:01,260 So now this program is really tightened up. 2112 01:37:01,260 --> 01:37:03,050 Same thing, two lines of code. 2113 01:37:03,050 --> 01:37:07,220 But we're now implementing this same two dimensional structure here. 2114 01:37:07,220 --> 01:37:10,440 All right, any questions here on these? 2115 01:37:10,440 --> 01:37:10,940 Yeah. 2116 01:37:10,940 --> 01:37:16,790 AUDIENCE: Is there any practical reason why when we write end, end is, I mean, 2117 01:37:16,790 --> 01:37:19,850 the print function, you don't put any spaces in it. 2118 01:37:19,850 --> 01:37:22,430 DAVID J. MALAN: If I print end, any spaces. 2119 01:37:22,430 --> 01:37:23,300 Say that once more.
2120 01:37:23,300 --> 01:37:25,440 AUDIENCE: Whenever we write end, for example, 2121 01:37:25,440 --> 01:37:28,850 the print function is, you know, in order 2122 01:37:28,850 --> 01:37:33,820 to stop it from going to a new line, it seems like any spaces, 2123 01:37:33,820 --> 01:37:37,800 we did like end equals and then too close. 2124 01:37:37,800 --> 01:37:38,820 There were no spaces. 2125 01:37:38,820 --> 01:37:40,300 Did you do that on purpose? 2126 01:37:40,300 --> 01:37:42,300 DAVID J. MALAN: Oh, 2127 01:37:42,300 --> 01:37:43,200 yes, good question. 2128 01:37:43,200 --> 01:37:44,242 I see what you're saying. 2129 01:37:44,242 --> 01:37:48,030 So in a previous version, let me rewind in time, when we had this, 2130 01:37:48,030 --> 01:37:49,170 I did not put spaces. 2131 01:37:49,170 --> 01:37:51,720 The convention in Python is not to do that. 2132 01:37:51,720 --> 01:37:52,350 Why? 2133 01:37:52,350 --> 01:37:54,263 It just starts to add too much space. 2134 01:37:54,263 --> 01:37:56,430 And this is a little inconsistent, because, earlier, 2135 01:37:56,430 --> 01:37:58,470 when we talked about like pluses or spaces 2136 01:37:58,470 --> 01:38:00,750 around the less than or equal signs, I did say add it. 2137 01:38:00,750 --> 01:38:03,010 Here it's actually clearer and recommended 2138 01:38:03,010 --> 01:38:04,260 to keep them tighter together. 2139 01:38:04,260 --> 01:38:07,560 Otherwise it just becomes harder to read where the gaps are. 2140 01:38:07,560 --> 01:38:08,820 Good observation. 2141 01:38:08,820 --> 01:38:14,357 All right, let's do, how about, another five minute break. 2142 01:38:14,357 --> 01:38:14,940 Let's do that. 2143 01:38:14,940 --> 01:38:17,732 And then we're going to dive into some more sophisticated problems, 2144 01:38:17,732 --> 01:38:21,160 and then ultimately build with some audio and visual examples, as well. 2145 01:38:21,160 --> 01:38:23,130 See you in five.
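For reference, the two versions of the 3 by 3 grid discussed before the break can be sketched side by side. The function names and the `size` parameter are hypothetical; the lecture's code is a plain script with 3 written directly in the loops.

```python
def grid_with_loops(size=3):
    """Nested loops: the inner loop builds one row, the outer repeats it."""
    for i in range(size):
        for j in range(size):
            print("#", end="")   # stay on the current row
        print()                  # end the row after the inner loop finishes


def grid_with_repetition(size=3):
    """Same output, with string repetition replacing the inner loop."""
    for i in range(size):
        print("#" * size)        # default end="\n" finishes each row


grid_with_loops()
grid_with_repetition()
```

Note the keyword argument is written `end=""` with no spaces around the equals sign, which matches the spacing convention raised in the question above.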
2146 01:38:23,130 --> 01:38:28,260 All right, so almost all of the examples we just did 2147 01:38:28,260 --> 01:38:30,540 were recreations of what we did in week 1. 2148 01:38:30,540 --> 01:38:33,120 And recall that week 1 was like our most syntax-heavy week. 2149 01:38:33,120 --> 01:38:36,930 It was when we were first learning how to program in C. But after week 1, 2150 01:38:36,930 --> 01:38:39,900 we began to focus a bit more on ideas, like arrays, 2151 01:38:39,900 --> 01:38:41,640 and other higher-level constructs. 2152 01:38:41,640 --> 01:38:44,880 And we'll do that again here, condensing some of those first early weeks 2153 01:38:44,880 --> 01:38:47,250 into a fewer set of examples in Python. 2154 01:38:47,250 --> 01:38:50,020 And we'll culminate by actually taking Python out for a spin, 2155 01:38:50,020 --> 01:38:52,300 and doing things that would be way harder to do, 2156 01:38:52,300 --> 01:38:56,830 and way more time-consuming to do in C, even more so than the speller example. 2157 01:38:56,830 --> 01:38:59,790 But how do you go about figuring out what functions exist, 2158 01:38:59,790 --> 01:39:02,970 if you didn't hear it in class, you don't see it online, 2159 01:39:02,970 --> 01:39:06,480 but you want to see it officially, you can go to the Python documentation, 2160 01:39:06,480 --> 01:39:08,220 docs.python.org here. 2161 01:39:08,220 --> 01:39:11,340 And I will disclaim that, honestly, the Python documentation is not 2162 01:39:11,340 --> 01:39:12,750 terribly user-friendly. 2163 01:39:12,750 --> 01:39:15,240 Google will often be your friend, so googling something 2164 01:39:15,240 --> 01:39:19,350 you're interested in, to find your way to the appropriate page on Python.org, 2165 01:39:19,350 --> 01:39:22,410 or StackOverflow.com is another popular website. 2166 01:39:22,410 --> 01:39:24,780 As always, though, the line should be googling 2167 01:39:24,780 --> 01:39:27,600 things like, how do I convert a string to lowercase. 
2168 01:39:27,600 --> 01:39:29,070 Like that's reasonable to Google. 2169 01:39:29,070 --> 01:39:33,160 Or how to convert to uppercase or how to implement a function in Python. 2170 01:39:33,160 --> 01:39:37,950 But googling, of course, things like how to implement problem set 6 in CS50, 2171 01:39:37,950 --> 01:39:39,120 of course, crosses the line. 2172 01:39:39,120 --> 01:39:42,078 But moving forward, and really with programming in general, like Google 2173 01:39:42,078 --> 01:39:44,220 and Stack Overflow are your friends, but the line 2174 01:39:44,220 --> 01:39:46,540 is between the reasonable and the unreasonable. 2175 01:39:46,540 --> 01:39:49,890 So let me officially use the Python documentation search, just 2176 01:39:49,890 --> 01:39:52,530 to search for something like the lowercase function. 2177 01:39:52,530 --> 01:39:54,540 Like, I know I can lowercase things in Python. 2178 01:39:54,540 --> 01:39:55,980 I don't quite remember how. 2179 01:39:55,980 --> 01:39:57,870 So let me just search for the word lower. 2180 01:39:57,870 --> 01:40:00,810 You're going to get, often, an overwhelming number of results, 2181 01:40:00,810 --> 01:40:03,678 because Python is a pretty big language, with lots of functionality. 2182 01:40:03,678 --> 01:40:05,970 And you're going to want to look for familiar patterns. 2183 01:40:05,970 --> 01:40:09,060 For whatever reason, string.lower, which is probably 2184 01:40:09,060 --> 01:40:12,420 more popular or more commonly used than these other ones, is third on the list. 2185 01:40:12,420 --> 01:40:15,460 But it's purple, because I clicked it a moment ago, when looking for it. 2186 01:40:15,460 --> 01:40:18,450 So str.lower is probably what I want, because I 2187 01:40:18,450 --> 01:40:21,060 am interested at the moment in lower casing strings. 2188 01:40:21,060 --> 01:40:25,258 When I click on that, this is an example of what Python's documentation tends 2189 01:40:25,258 --> 01:40:25,800 to look like.
2190 01:40:25,800 --> 01:40:27,340 It's in this general format. 2191 01:40:27,340 --> 01:40:29,340 Here's my str.lower function. 2192 01:40:29,340 --> 01:40:31,540 This returns a copy of the string, with all 2193 01:40:31,540 --> 01:40:33,750 of the cased characters converted to lowercase, 2194 01:40:33,750 --> 01:40:35,670 and the lower-casing algorithm, dot dot dot. 2195 01:40:35,670 --> 01:40:37,168 So that doesn't give me much. 2196 01:40:37,168 --> 01:40:38,460 It doesn't give me sample code. 2197 01:40:38,460 --> 01:40:40,210 But it does say what the function does. 2198 01:40:40,210 --> 01:40:43,890 And if we keep looking, you'll see mention of Lstrip, which is left strip. 2199 01:40:43,890 --> 01:40:48,120 I used its analog, Rstrip before, right strip, which allows you to remove, 2200 01:40:48,120 --> 01:40:51,000 that is strip, from the end of a string, something like white space, 2201 01:40:51,000 --> 01:40:52,930 like a new line, or even something else. 2202 01:40:52,930 --> 01:40:56,410 And if you scroll through string, this web page here. 2203 01:40:56,410 --> 01:40:58,110 And we're halfway down the page already. 2204 01:40:58,110 --> 01:41:00,180 If you see my scroll bar, tiny on the right, 2205 01:41:00,180 --> 01:41:05,250 there's a huge amount of functionality built into string objects, here. 2206 01:41:05,250 --> 01:41:08,460 And this is just testament to just how rich the language itself is. 2207 01:41:08,460 --> 01:41:12,620 But it's also reason to reassure that the goal, when 2208 01:41:12,620 --> 01:41:14,870 playing around with some new language and learning it, 2209 01:41:14,870 --> 01:41:16,598 is not to learn it exhaustively. 2210 01:41:16,598 --> 01:41:18,390 Just like in English or any human language, 2211 01:41:18,390 --> 01:41:20,640 there's always going to be vocab words you don't know, 2212 01:41:20,640 --> 01:41:23,563 ways of presenting the same information in some language. 
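Since the documentation entry comes without sample code, here is a small sketch of the three string methods mentioned above. The sample strings and variable names are made up for illustration.

```python
original = "Hello, World"

# str.lower returns a lowercased copy; the original string is unchanged.
lowered = original.lower()
print(lowered)
print(original)

# str.rstrip removes trailing whitespace (such as a newline) by default.
line = "cat\n"
stripped_right = line.rstrip()

# str.lstrip does the same from the left-hand side.
padded = "   dog"
stripped_left = padded.lstrip()

# Either method also accepts the specific characters to strip.
trimmed = "path///".rstrip("/")

print(stripped_right, stripped_left, trimmed)
```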
2213 01:41:23,563 --> 01:41:25,230 That's going to be the case with Python. 2214 01:41:25,230 --> 01:41:28,620 And what we'll do today and this week in problem set 6 is really 2215 01:41:28,620 --> 01:41:30,120 get your footing with this language. 2216 01:41:30,120 --> 01:41:33,300 But you won't know all of Python, just like you won't know all of C. 2217 01:41:33,300 --> 01:41:36,300 And, honestly, you won't know all of any of these languages on your own, 2218 01:41:36,300 --> 01:41:38,800 unless you're, perhaps, using them full time professionally, 2219 01:41:38,800 --> 01:41:42,370 and even then, there's more libraries than one might even retain themselves. 2220 01:41:42,370 --> 01:41:45,420 So let's actually now pivot to a few other ideas, 2221 01:41:45,420 --> 01:41:47,560 that we'll implement in Python, in a moment. 2222 01:41:47,560 --> 01:41:50,010 Let me switch back over to VS Code here. 2223 01:41:50,010 --> 01:41:55,260 And let me whip up, say, a recreation of our scores example from week two, 2224 01:41:55,260 --> 01:41:57,883 where we averaged like three scores together. 2225 01:41:57,883 --> 01:42:00,300 And that was an opportunity in week 2 to play with arrays, 2226 01:42:00,300 --> 01:42:02,430 to realize how constrained arrays are. 2227 01:42:02,430 --> 01:42:03,720 They can't grow or shrink. 2228 01:42:03,720 --> 01:42:05,040 You have to decide in advance. 2229 01:42:05,040 --> 01:42:07,110 But let's see what's different here in Python. 2230 01:42:07,110 --> 01:42:11,580 So let me do Scores.py, and let me give myself an array in Python 2231 01:42:11,580 --> 01:42:15,780 called scores, sorry, let me give myself a variable in Python called scores. 2232 01:42:15,780 --> 01:42:17,940 Set it equal to a list of three scores, which 2233 01:42:17,940 --> 01:42:22,560 are the same ones we've used before, 72, 73, 33, in this context 2234 01:42:22,560 --> 01:42:24,630 meant to be scores, not ASCII values. 
2235 01:42:24,630 --> 01:42:26,520 And then let's just do the average of these. 2236 01:42:26,520 --> 01:42:28,630 So average will be another variable. 2237 01:42:28,630 --> 01:42:32,910 And it turns out I can do, well, how did I sum these before? 2238 01:42:32,910 --> 01:42:36,580 I probably had a for loop to add one, then I knew how long they were. 2239 01:42:36,580 --> 01:42:39,580 Turns out in Python, you can just say sum of scores 2240 01:42:39,580 --> 01:42:41,530 divided by the length of scores. 2241 01:42:41,530 --> 01:42:43,130 That's going to give me my average. 2242 01:42:43,130 --> 01:42:46,210 So sum is a function that takes a list, in this case, as input, 2243 01:42:46,210 --> 01:42:49,000 and it just does the sum for you, with a for loop or whatever 2244 01:42:49,000 --> 01:42:49,930 underneath the hood. 2245 01:42:49,930 --> 01:42:53,480 Len gives you the length of the list, how many things are in it. 2246 01:42:53,480 --> 01:42:55,240 So I can dynamically figure that out. 2247 01:42:55,240 --> 01:43:00,340 Now let me go ahead and print out, using print, the word average, and then, 2248 01:43:00,340 --> 01:43:03,628 in curly braces, the actual average, close quote. 2249 01:43:03,628 --> 01:43:05,920 All right, so let's run this code, Python of Scores.py. 2250 01:43:05,920 --> 01:43:11,050 And there is my average, in this case, 59.33333 and so forth, 2251 01:43:11,050 --> 01:43:12,310 based on the math. 2252 01:43:12,310 --> 01:43:14,500 Well, let's actually, now, change this a little bit 2253 01:43:14,500 --> 01:43:17,625 and make it a little more interesting, and actually get input from the user 2254 01:43:17,625 --> 01:43:19,190 rather than hard coding this. 2255 01:43:19,190 --> 01:43:22,568 Let me go back up here and use from CS50 import getInt, 2256 01:43:22,568 --> 01:43:25,360 because I don't want to deal with all the exceptions and the loops. 2257 01:43:25,360 --> 01:43:27,820 Like, I just want to use someone else's function here. 
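The hard-coded version of the scores program described above could be sketched as follows, assuming the output is formatted with an f-string, as the mention of curly braces suggests.

```python
# A list of three scores, hard-coded for now.
scores = [72, 73, 33]

# sum adds up the list's values; len reports how many there are.
average = sum(scores) / len(scores)

# The curly braces interpolate the variable's value into the f-string.
print(f"Average: {average}")
```

Unlike in C, dividing two ints with `/` yields a float here, which is why the result comes out as 59.33 and so on rather than being truncated to 59.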
2258 01:43:27,820 --> 01:43:31,600 Let me give myself an empty list called scores. 2259 01:43:31,600 --> 01:43:34,480 And this is not something we were able to do in C, right? 2260 01:43:34,480 --> 01:43:36,610 Because in C, if you tried to make an empty array, 2261 01:43:36,610 --> 01:43:39,590 well, that's pretty stupid, because you can't add things to it. 2262 01:43:39,590 --> 01:43:40,910 It's a fixed size. 2263 01:43:40,910 --> 01:43:42,650 So it wouldn't even let you do that. 2264 01:43:42,650 --> 01:43:45,640 But I can just create an empty list in Python, 2265 01:43:45,640 --> 01:43:48,340 because lists, unlike arrays, are really lengthless. 2266 01:43:48,340 --> 01:43:49,750 They'll grow and shrink. 2267 01:43:49,750 --> 01:43:52,870 But you and I are not dealing with all the pointers underneath the hood. 2268 01:43:52,870 --> 01:43:54,770 Python's doing that for us. 2269 01:43:54,770 --> 01:43:58,435 So now, let's go ahead and get a whole bunch of scores from the user. 2270 01:43:58,435 --> 01:43:59,810 How about three of them in total. 2271 01:43:59,810 --> 01:44:05,350 So for i in range of 3, let's go ahead and grab a score from the user, 2272 01:44:05,350 --> 01:44:07,810 using getInt, asking them for score. 2273 01:44:07,810 --> 01:44:14,840 And then let's go ahead and append, to the scores list, that particular score. 2274 01:44:14,840 --> 01:44:17,200 So it turns out that a list, and I could read the Python 2275 01:44:17,200 --> 01:44:21,280 documentation to confirm as much, lists have a function built into them, 2276 01:44:21,280 --> 01:44:25,155 and functions built into objects are generally known as methods, 2277 01:44:25,155 --> 01:44:26,530 if you've heard that term before. 2278 01:44:26,530 --> 01:44:29,320 Same idea, but whereas a function kind of stands on its own, 2279 01:44:29,320 --> 01:44:33,430 a method is a function built into an object, like a list here. 2280 01:44:33,430 --> 01:44:35,917 That's going to achieve the same result. 
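A sketch of the append-based loop follows. It swaps in the built-in `int(input(...))` for CS50's library function, so it lacks that function's retry-on-bad-input behavior, and the function name `collect_scores` plus the `count` and `read` parameters are hypothetical additions that let the loop run without a keyboard.

```python
def collect_scores(count=3, read=input):
    """Ask for `count` scores and return them as a list."""
    scores = []                       # an empty list, free to grow
    for i in range(count):
        score = int(read("Score: "))  # stand-in for CS50's helper
        scores.append(score)          # append is a method of list objects
    return scores


def average(scores):
    """Mean of a non-empty list of numbers."""
    return sum(scores) / len(scores)
```

Calling `average(collect_scores())` would prompt three times and then compute the same result as the hard-coded version.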
Strictly speaking, 2281 01:44:35,917 --> 01:44:37,000 I don't need the variable. 2282 01:44:37,000 --> 01:44:40,603 Just like in C, I could tighten this up and do something like this as well. 2283 01:44:40,603 --> 01:44:42,520 But, I don't know, I kind of like it this way. 2284 01:44:42,520 --> 01:44:45,970 It's more clear, to me, at least, that what I'm doing here, getting the score 2285 01:44:45,970 --> 01:44:47,838 and then appending it to the list. 2286 01:44:47,838 --> 01:44:49,630 Now the rest of the code can stay the same. 2287 01:44:49,630 --> 01:44:54,700 Python of Scores.py, score will be 72, 73, 33. 2288 01:44:54,700 --> 01:44:55,820 And I get back the math. 2289 01:44:55,820 --> 01:44:58,840 But now the program's a little more dynamic, which is nice. 2290 01:44:58,840 --> 01:45:00,940 But there's other syntax I could use here. 2291 01:45:00,940 --> 01:45:04,330 Just so you've seen it, Python does have some neat syntactic tricks, 2292 01:45:04,330 --> 01:45:06,850 whereby, if you don't want to do scores.append, 2293 01:45:06,850 --> 01:45:11,290 you can actually say scores plus equals this score. 2294 01:45:11,290 --> 01:45:15,730 So you can actually concatenate lists together in Python, too. 2295 01:45:15,730 --> 01:45:18,340 Just as we used plus to join two strings together, 2296 01:45:18,340 --> 01:45:21,400 you can use plus to join two lists together. 2297 01:45:21,400 --> 01:45:24,040 The catch is, you need to put the one score I'm 2298 01:45:24,040 --> 01:45:26,770 adding here in a list of its own, which is kind of silly. 2299 01:45:26,770 --> 01:45:31,330 But it's necessary, so that this thing and this thing are both lists. 2300 01:45:31,330 --> 01:45:33,970 To do this more verbosely, which most programmers wouldn't 2301 01:45:33,970 --> 01:45:36,310 do, but just for clarity, this is the same thing 2302 01:45:36,310 --> 01:45:38,950 as saying scores plus this score.
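The two concatenation spellings can be compared directly in a short sketch. The values are hard-coded here rather than read from the user, and the names `short_form` and `long_form` are made up for illustration.

```python
short_form = []
long_form = []

for score in [72, 73, 33]:
    # The single score is wrapped in brackets so that both sides are lists.
    short_form += [score]

    # The more verbose spelling of the same concatenation.
    long_form = long_form + [score]

print(short_form)
print(long_form)
```

Writing `short_form += score` without the brackets would raise a TypeError, because an int is not something a list can be extended with.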
2303 01:45:38,950 --> 01:45:42,910 So now maybe it's a little more clear that scores and brackets score 2304 01:45:42,910 --> 01:45:47,680 plural, sorry, singular, are both lists themselves, being concatenated 2305 01:45:47,680 --> 01:45:48,860 or joined together. 2306 01:45:48,860 --> 01:45:51,740 So two different ways, not sure one is better than the other. 2307 01:45:51,740 --> 01:45:57,640 This way is pretty common, but .append is also quite reasonable as well. 2308 01:45:57,640 --> 01:46:00,340 All right, how about another example from week two. 2309 01:46:00,340 --> 01:46:03,070 This one was called uppercase. 2310 01:46:03,070 --> 01:46:06,320 So let me do this in Uppercase.py, though, this time. 2311 01:46:06,320 --> 01:46:10,180 And let me import from CS50, get string again. 2312 01:46:10,180 --> 01:46:14,020 And let me go ahead and say, before will be my first variable. 2313 01:46:14,020 --> 01:46:17,500 Let me get a string from the user, asking them for a before string. 2314 01:46:17,500 --> 01:46:22,660 And then let me go ahead and say, after, just to demonstrate some changes, 2315 01:46:22,660 --> 01:46:25,190 upper-casing to this string. 2316 01:46:25,190 --> 01:46:27,850 Let me change my line ending to be that, using our new trick. 2317 01:46:27,850 --> 01:46:31,490 And this is where things get cool in Python, relatively speaking. 2318 01:46:31,490 --> 01:46:35,050 If I want to iterate over all of the characters in a string, 2319 01:46:35,050 --> 01:46:38,140 and print them out in uppercase, one way to do that would be this. 2320 01:46:38,140 --> 01:46:46,032 For c in the before string, go ahead and print out C.uppercase, sorry, C.upper, 2321 01:46:46,032 --> 01:46:49,240 but don't end the line yet, because I want to keep these all on the same line 2322 01:46:49,240 --> 01:46:50,440 until I'm all done. 2323 01:46:50,440 --> 01:46:51,490 So what am I doing? 
2324 01:46:51,490 --> 01:46:54,970 Python of Uppercase.py, let me type in Hello in all lowercase. 2325 01:46:54,970 --> 01:46:57,010 I've just upper-cased the whole string. 2326 01:46:57,010 --> 01:46:57,700 How? 2327 01:46:57,700 --> 01:47:00,130 I first get string, calling it before. 2328 01:47:00,130 --> 01:47:02,680 I then just print out some fluffy text that says after colon, 2329 01:47:02,680 --> 01:47:04,840 and I get rid of the line ending, just so I can kind of line these up. 2330 01:47:04,840 --> 01:47:06,632 Notice I hit the spacebar a couple of times 2331 01:47:06,632 --> 01:47:08,620 just so letters line up to be pretty. 2332 01:47:08,620 --> 01:47:10,780 For c in before, this is new. 2333 01:47:10,780 --> 01:47:14,500 This is powerful in C, sorry, in Python, whereby 2334 01:47:14,500 --> 01:47:17,590 you don't have to do like Int i equals 0 and i less than this, 2335 01:47:17,590 --> 01:47:22,310 you could just say, for c in the string in question, for c in before. 2336 01:47:22,310 --> 01:47:25,510 And then here is just upper-casing that specific character, 2337 01:47:25,510 --> 01:47:27,700 and making sure we don't output a new line too soon. 2338 01:47:27,700 --> 01:47:29,920 But this is actually more work than I need to do. 2339 01:47:29,920 --> 01:47:34,000 Based on what we've seen thus far, like from our agreement example, 2340 01:47:34,000 --> 01:47:35,620 can I tighten this up further? 2341 01:47:35,620 --> 01:47:40,340 Can I collapse lines 5 and 6, maybe even 7, all together? 2342 01:47:40,340 --> 01:47:46,550 If the goal of this program is just to uppercase the before string, 2343 01:47:46,550 --> 01:47:49,640 how might I do this? 2344 01:47:49,640 --> 01:47:50,480 Yeah, in back. 2345 01:47:50,480 --> 01:47:52,287 AUDIENCE: Would it be str.upper? 2346 01:47:52,287 --> 01:47:54,620 DAVID J. MALAN: Str.upper, yeah, so I could do something 2347 01:47:54,620 --> 01:47:57,500 like this, after gets before.upper.
2348 01:47:57,500 --> 01:47:59,750 So it's not str literally dot upper, str 2349 01:47:59,750 --> 01:48:01,500 just represents the string in question. 2350 01:48:01,500 --> 01:48:04,620 So it would be before.upper, but right idea otherwise. 2351 01:48:04,620 --> 01:48:08,130 And so let me go ahead and just tweak my print statement a little bit. 2352 01:48:08,130 --> 01:48:12,810 Let me just go ahead and print out the after variable here, after creating it. 2353 01:48:12,810 --> 01:48:15,440 So this line is the same, I'm getting a string called before. 2354 01:48:15,440 --> 01:48:18,530 I'm creating another variable called after, and, as you propose, 2355 01:48:18,530 --> 01:48:21,960 I'm calling upper on the whole string, not one character at a time. 2356 01:48:21,960 --> 01:48:22,460 Why? 2357 01:48:22,460 --> 01:48:23,360 Because it's allowed. 2358 01:48:23,360 --> 01:48:27,350 And, again, in Python, there aren't technically characters individually. 2359 01:48:27,350 --> 01:48:28,760 There's only strings, anyway. 2360 01:48:28,760 --> 01:48:30,600 So I might as well do them all at once. 2361 01:48:30,600 --> 01:48:34,220 So if I rerun the code now, Python of Uppercase.py. 2362 01:48:34,220 --> 01:48:39,080 Now I'll type in Hello in all lowercase, and, oh, so close, 2363 01:48:39,080 --> 01:48:42,110 I think I can get rid of this override, because I'm 2364 01:48:42,110 --> 01:48:45,510 printing the whole thing out at once, not character by character. 2365 01:48:45,510 --> 01:48:49,880 So now if I type in Hello before, now I have an even tighter version 2366 01:48:49,880 --> 01:48:52,080 of the program here. 2367 01:48:52,080 --> 01:48:55,910 All right, any questions, then, on lists or on strings, 2368 01:48:55,910 --> 01:49:01,240 and what this kind of function, upper, represents, with its docs. 2369 01:49:01,240 --> 01:49:01,740 No? 2370 01:49:01,740 --> 01:49:04,760 All right, so a couple other building blocks before we start. 
2371 01:49:04,760 --> 01:49:05,855 Oh. 2372 01:49:05,855 --> 01:49:06,480 Where was that? 2373 01:49:06,480 --> 01:49:08,010 AUDIENCE: To the right. 2374 01:49:08,010 --> 01:49:10,050 DAVID J. MALAN: To the right, right. 2375 01:49:10,050 --> 01:49:11,040 Yes, thank you. 2376 01:49:11,040 --> 01:49:17,202 AUDIENCE: Could you write, very close to variable string, and then print upper, 2377 01:49:17,202 --> 01:49:19,257 you start creating a variable upper. 2378 01:49:19,257 --> 01:49:21,840 DAVID J. MALAN: Yes, do I have to create this variable, upper? 2379 01:49:21,840 --> 01:49:22,590 No, I don't. 2380 01:49:22,590 --> 01:49:24,870 I could actually tighten this up, and, if you really 2381 01:49:24,870 --> 01:49:28,170 want to see something neat, inside of the curly braces, 2382 01:49:28,170 --> 01:49:31,050 you don't have to just put the names of variables. 2383 01:49:31,050 --> 01:49:33,600 You can put a small amount of logic, so long 2384 01:49:33,600 --> 01:49:36,780 as it doesn't start to look stupid and kind of overwhelmingly complex, such 2385 01:49:36,780 --> 01:49:38,940 that it's sort of bad design at that point. 2386 01:49:38,940 --> 01:49:40,540 I can tighten this up like this. 2387 01:49:40,540 --> 01:49:44,610 And now we're in Python of Uppercase.py, writing Hello again. 2388 01:49:44,610 --> 01:49:45,730 And that, too, works. 2389 01:49:45,730 --> 01:49:47,280 But I would be careful about this. 2390 01:49:47,280 --> 01:49:50,483 You want to resist the temptation of having like a long line of code that's 2391 01:49:50,483 --> 01:49:53,400 inside the curly braces, because it's just going to be harder to read. 2392 01:49:53,400 --> 01:49:55,890 But, absolutely, you could indeed do that, too. 
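The three versions of the uppercase program discussed above can be sketched roughly as follows. This is an illustration, not the exact code typed in lecture: the function names `shout_loop` and `shout` are made up here, and a hard-coded string stands in for the call to CS50's `get_string` so the sketch runs without that library.

```python
def shout_loop(before):
    """Uppercase one character at a time, like the first version."""
    after = ""
    for c in before:  # each c is a one-character str, not a char
        after += c.upper()
    return after


def shout(before):
    """Uppercase the whole string with one method call."""
    return before.upper()


before = "hello"  # stand-in for get_string("Before: ")

print(f"After:  {shout_loop(before)}")
print(f"After:  {shout(before)}")
# A small expression can also go directly inside the curly braces.
print(f"After:  {before.upper()}")
```

All three lines print the same upper-cased text. The last form is the tightest, though, as noted above, it only stays readable while the expression inside the braces is short.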
2393 01:49:55,890 --> 01:49:58,950 All right, how about command line arguments, which was one thing 2394 01:49:58,950 --> 01:50:03,030 we introduced in week two also, so that we could actually have the ability 2395 01:50:03,030 --> 01:50:06,750 to take input from the user, whoops. 2396 01:50:06,750 --> 01:50:10,270 So we could actually take input from the user at the command line, 2397 01:50:10,270 --> 01:50:13,210 so as to take literally command line arguments. 2398 01:50:13,210 --> 01:50:16,020 These are a little different, but it follows the same paradigm. 2399 01:50:16,020 --> 01:50:19,860 There's no main by default. And there's no Def main int 2400 01:50:19,860 --> 01:50:26,050 argc char, or we called it string, argv by default. There's none of this. 2401 01:50:26,050 --> 01:50:30,510 So if you want access to the argument vector, argv, you import it. 2402 01:50:30,510 --> 01:50:35,100 And it turns out, there's another module in Python, or library in Python 2403 01:50:35,100 --> 01:50:39,180 called sys, and you can import from the system this thing called argv. 2404 01:50:39,180 --> 01:50:41,357 So same idea, different place. 2405 01:50:41,357 --> 01:50:42,940 Now I'm going to go ahead and do this. 2406 01:50:42,940 --> 01:50:47,820 Let's write a program that just requires that the user types in two, a word, 2407 01:50:47,820 --> 01:50:50,050 after the program's name, or none at all. 2408 01:50:50,050 --> 01:50:56,670 So if the length of argv equals 2, let's go ahead and print out, how about, 2409 01:50:56,670 --> 01:51:05,088 Hello comma argv bracket 1 close quote, else if they don't type two words 2410 01:51:05,088 --> 01:51:08,130 total at the prompt, let's just say the default's, like we did weeks ago, 2411 01:51:08,130 --> 01:51:09,160 Hello, world. 
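The program just described might look something like this. It is a sketch: wrapping the logic in a function named `greeting` is a choice made here so the behavior can be checked without a terminal, and the exact capitalization of the output is a guess.

```python
from sys import argv


def greeting(args):
    """Return a greeting based on a list shaped like sys.argv."""
    # args[0] is the program's name, so one extra word means length 2.
    if len(args) == 2:
        return f"hello, {args[1]}"
    return "hello, world"


print(greeting(argv))
```

Run as `python argv.py David`, the list would be `["argv.py", "David"]`, so the first branch applies.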
2412 01:51:09,160 --> 01:51:12,180 So the only thing that's new here is we're importing argv from sys, 2413 01:51:12,180 --> 01:51:15,450 and we're using this fancy f-string format, which kind of to your point, 2414 01:51:15,450 --> 01:51:18,510 too, it's putting more complex logic in the curly braces. 2415 01:51:18,510 --> 01:51:19,270 But that's OK. 2416 01:51:19,270 --> 01:51:23,890 In this case, it's a list called argv, and we're getting bracket 1 from it. 2417 01:51:23,890 --> 01:51:27,780 Let's do Python of Argv.py, Enter, Hello, world. 2418 01:51:27,780 --> 01:51:31,480 What if I do Argv.py David at the command line. 2419 01:51:31,480 --> 01:51:32,730 Now I get Hello, David. 2420 01:51:32,730 --> 01:51:34,680 So there's one curiosity here. 2421 01:51:34,680 --> 01:51:39,375 Python is not included in argv, whereas in C, dot 2422 01:51:39,375 --> 01:51:41,940 slash whatever was the first thing. 2423 01:51:41,940 --> 01:51:45,510 If the analog in Python is that the name of your Python program 2424 01:51:45,510 --> 01:51:49,800 is the first thing, in bracket 0, which is why David is in bracket 1, 2425 01:51:49,800 --> 01:51:55,740 the word Python does not appear in the argv list, just to be clear. 2426 01:51:55,740 --> 01:51:57,990 But otherwise, the idea of these arguments 2427 01:51:57,990 --> 01:52:00,383 is exactly the same as before. 2428 01:52:00,383 --> 01:52:02,550 And in fact, what you can do, which is kind of cool, 2429 01:52:02,550 --> 01:52:05,730 is, because argv is a list, you can do things like this. 2430 01:52:05,730 --> 01:52:10,890 For arg in argv, go ahead and print out each argument. 2431 01:52:10,890 --> 01:52:12,990 So instead of using a for loop and i and all 2432 01:52:12,990 --> 01:52:17,220 of this, if I do Python of argv Enter, it just writes the program's name. 2433 01:52:17,220 --> 01:52:21,960 If I do Python of argv Foo, it puts Argv.py and Foo. 
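To make the indexing concrete, here is a small sketch that labels each argument with its position. The helper `describe` is hypothetical and uses `enumerate`, which the lecture has not relied on here; the plain `for arg in argv` loop from above appears at the bottom.

```python
import sys


def describe(args):
    """Label each argument with its index in the list."""
    return [f"argv[{i}] = {arg}" for i, arg in enumerate(args)]


# The loop as described above: no counter variable needed.
for arg in sys.argv:
    print(arg)
```

For `python argv.py foo`, `describe(sys.argv)` would give `argv[0] = argv.py` and `argv[1] = foo`. The word `python` itself is not in the list.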
2434 01:52:21,960 --> 01:52:26,520 If I do, sorry, if I do Foo and bar, those words all print out. 2435 01:52:26,520 --> 01:52:28,770 If I do Foo bar baz, those print out too. 2436 01:52:28,770 --> 01:52:31,830 And Foo and bar or baz are like a mathematician's x and y and z 2437 01:52:31,830 --> 01:52:35,200 for computer scientists, when you just need some placeholder words. 2438 01:52:35,200 --> 01:52:36,420 So this is just nice. 2439 01:52:36,420 --> 01:52:40,020 It reads a little more like English, and a for loop is just much more concise, 2440 01:52:40,020 --> 01:52:43,530 allows you to iterate very quickly when you want something like that. 2441 01:52:43,530 --> 01:52:46,170 Suppose I only wanted the real words that the human typed 2442 01:52:46,170 --> 01:52:47,250 after the program's name. 2443 01:52:47,250 --> 01:52:50,460 Like, suppose I want to ignore Argv.py. 2444 01:52:50,460 --> 01:52:53,640 I mean I could do something hackish like this. 2445 01:52:53,640 --> 01:52:59,105 If arg equals Argv.py, I could just ignore, 2446 01:52:59,105 --> 01:53:00,480 you know, let's invert the logic. 2447 01:53:00,480 --> 01:53:02,530 I could do this, for instance. 2448 01:53:02,530 --> 01:53:05,100 So if the arg does not equal the program name, 2449 01:53:05,100 --> 01:53:07,890 then go ahead and print out the word. 2450 01:53:07,890 --> 01:53:09,840 So I get Foo, bar, and baz only. 2451 01:53:09,840 --> 01:53:14,400 Or, this is what's kind of neat about Python, too, let me undo that. 2452 01:53:14,400 --> 01:53:18,400 And let me just take a slice of the array of the list instead. 2453 01:53:18,400 --> 01:53:22,810 So it turns out, if argv is a list, I can actually say, 2454 01:53:22,810 --> 01:53:27,060 you know what, go into that list, start at element 1, instead of 0, 2455 01:53:27,060 --> 01:53:29,200 and then go all the way to the end. 2456 01:53:29,200 --> 01:53:31,800 And we have not seen this syntax in C. 
But this 2457 01:53:31,800 --> 01:53:34,410 is a way of slicing a list in Python. 2458 01:53:34,410 --> 01:53:35,820 So now watch what happens. 2459 01:53:35,820 --> 01:53:40,860 If I run Python of Argv.py, Foo bar baz Enter, 2460 01:53:40,860 --> 01:53:44,730 I get only a subset of the list, starting at position 1, 2461 01:53:44,730 --> 01:53:46,892 going all of the way to the end. 2462 01:53:46,892 --> 01:53:48,600 And you can even do kind of the opposite. 2463 01:53:48,600 --> 01:53:51,330 If, for whatever reason, you want to ignore the last element, 2464 01:53:51,330 --> 01:53:57,030 you can say colon, we could say colon negative 1, 2465 01:53:57,030 --> 01:53:59,560 and use a negative number, which we've not seen before, 2466 01:53:59,560 --> 01:54:02,470 which slices off the end of the list, as well. 2467 01:54:02,470 --> 01:54:06,000 So there's some syntactic tricks that tend to be powerful in Python, too, 2468 01:54:06,000 --> 01:54:10,140 even if at first glance, you might not need them for typical things. 2469 01:54:10,140 --> 01:54:12,798 All right, let's do one other example with exit, 2470 01:54:12,798 --> 01:54:15,090 and then we'll start actually applying some algorithms, 2471 01:54:15,090 --> 01:54:16,215 to make things interesting. 2472 01:54:16,215 --> 01:54:20,470 So in one last program here, let's do Exit.py, just to do one more mechanic, 2473 01:54:20,470 --> 01:54:22,210 before we introduce some algorithms. 2474 01:54:22,210 --> 01:54:24,220 And let's do this. 2475 01:54:24,220 --> 01:54:28,900 Let's, from sys, import argv. 2476 01:54:28,900 --> 01:54:30,490 Let's now do this. 2477 01:54:30,490 --> 01:54:33,200 Let's make sure the user gives me one command line argument. 2478 01:54:33,200 --> 01:54:39,580 So if the length of argv does not equal 2 in total, then let's go ahead 2479 01:54:39,580 --> 01:54:42,790 and print out something like missing command line argument, 2480 01:54:42,790 --> 01:54:44,590 just to explain what the problem is. 
2481 01:54:44,590 --> 01:54:47,380 And then let's do this. 2482 01:54:47,380 --> 01:54:48,580 We can exit. 2483 01:54:48,580 --> 01:54:50,710 But I'm going to use a better version of exit here. 2484 01:54:50,710 --> 01:54:52,900 Let me import two functions from sys. 2485 01:54:52,900 --> 01:54:57,040 Turns out the better way to do this is with sys.exit, because I can then exit 2486 01:54:57,040 --> 01:54:59,993 specifically, too, with this exit code. 2487 01:54:59,993 --> 01:55:02,410 Otherwise, down here, I'm going to go ahead and print out, 2488 01:55:02,410 --> 01:55:06,818 something like Hello, comma argv bracket 1, same as before. 2489 01:55:06,818 --> 01:55:08,360 And then I'm going to exit with zero. 2490 01:55:08,360 --> 01:55:10,410 So, again, this was a subtle thing we introduced 2491 01:55:10,410 --> 01:55:12,910 in week two, where you can actually have your programs exit, 2492 01:55:12,910 --> 01:55:15,430 with some number, where 0 signifies success, 2493 01:55:15,430 --> 01:55:17,350 and anything else signifies error. 2494 01:55:17,350 --> 01:55:19,240 This is just the same idea in Python. 2495 01:55:19,240 --> 01:55:23,920 So if I, for instance, just run the program like this, oops, I screwed up. 2496 01:55:23,920 --> 01:55:26,620 I meant to say exit here and exit here. 2497 01:55:26,620 --> 01:55:27,710 Let me do that again. 2498 01:55:27,710 --> 01:55:30,500 If I run this like this, I'm missing a command line argument. 2499 01:55:30,500 --> 01:55:33,200 So let me rerun it with like my name at the prompt. 2500 01:55:33,200 --> 01:55:37,030 So I have exactly two command line arguments, the file name and my name, 2501 01:55:37,030 --> 01:55:38,050 Hello comma David. 2502 01:55:38,050 --> 01:55:40,342 And if I do David Malan, it's not going to work either, 2503 01:55:40,342 --> 01:55:42,160 because now argv does not equal 2. 
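The exit-code idea above can be sketched like this. The split into `check` and `run` is an assumption made for this illustration so the exit status can be inspected; the lecture's version calls exit directly at the top level of the file.

```python
import sys


def check(args):
    """Print a message and return the exit status for these arguments."""
    if len(args) != 2:
        print("Missing command-line argument")
        return 1  # any nonzero status signals an error
    print(f"hello, {args[1]}")
    return 0  # zero signals success


def run(args):
    """Exit the program with the status computed by check."""
    sys.exit(check(args))


# In a real script, the last line would be: run(sys.argv)
```

`sys.exit` works by raising `SystemExit`, which is how the status reaches the shell once the interpreter shuts down.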
2504 01:55:42,160 --> 01:55:44,860 But the difference here is that we're exiting with 1, 2505 01:55:44,860 --> 01:55:49,900 so that special programs can detect an error, or 0 in the event of success. 2506 01:55:49,900 --> 01:55:52,180 And now there's one other way to do this, too. 2507 01:55:52,180 --> 01:55:54,460 Suppose that you're importing a lot of functions, 2508 01:55:54,460 --> 01:55:56,943 and you don't really want to make a mess of things 2509 01:55:56,943 --> 01:55:59,110 and just have all of these function names available, 2510 01:55:59,110 --> 01:56:01,630 without it being clear where they came from. 2511 01:56:01,630 --> 01:56:03,460 Let's just import all of sys. 2512 01:56:03,460 --> 01:56:07,180 And let's just change our syntax, kind of like I proposed for CS50, 2513 01:56:07,180 --> 01:56:09,970 where we just prepend to all of these library functions, 2514 01:56:09,970 --> 01:56:13,420 sys, just to be super-explicit where they came from, 2515 01:56:13,420 --> 01:56:18,837 and if there's another exit or argv value 2516 01:56:18,837 --> 01:56:21,920 that we want to import from a library, this is one way to avoid collision. 2517 01:56:21,920 --> 01:56:25,150 So if I do it one last time here, missing command line argument. 2518 01:56:25,150 --> 01:56:27,190 But David still actually worked. 2519 01:56:27,190 --> 01:56:30,250 All right, only to demonstrate how we can implement that same idea. 2520 01:56:30,250 --> 01:56:33,130 Let's now do something more powerful, like a search algorithm, 2521 01:56:33,130 --> 01:56:34,032 like binary search. 2522 01:56:34,032 --> 01:56:36,490 I'm going to go ahead and open up a file called Numbers.py, 2523 01:56:36,490 --> 01:56:40,420 and let's just do some searching or linear search, rather, 2524 01:56:40,420 --> 01:56:42,440 on a list of numbers. 2525 01:56:42,440 --> 01:56:44,060 Let's go ahead and do this. 2526 01:56:44,060 --> 01:56:47,050 How about import sys as before. 
2527 01:56:47,050 --> 01:56:52,840 Let me give myself a list of numbers, like 4, 6, 8, 2, 7, 5, 0, 2528 01:56:52,840 --> 01:56:54,670 so just a bunch of integers. 2529 01:56:54,670 --> 01:56:56,170 And then let's do this. 2530 01:56:56,170 --> 01:56:59,590 If you recall from week three, we searched for the number 0 2531 01:56:59,590 --> 01:57:01,880 at the end of the lockers on stage. 2532 01:57:01,880 --> 01:57:04,120 So let's just ask that question in Python. 2533 01:57:04,120 --> 01:57:05,860 No need for a loop or anything like that. 2534 01:57:05,860 --> 01:57:09,550 If 0 is in the numbers, go ahead and print out found. 2535 01:57:09,550 --> 01:57:13,420 And then let's just exit successfully, with 0, else, if we get down here, 2536 01:57:13,420 --> 01:57:15,670 let's just say print not found. 2537 01:57:15,670 --> 01:57:19,210 And then we'll sys.exit with 1. 2538 01:57:19,210 --> 01:57:21,820 So this is where Python starts to get powerful again. 2539 01:57:21,820 --> 01:57:23,050 Here's your list. 2540 01:57:23,050 --> 01:57:25,733 Here is your loop, that's doing all of the checking for you. 2541 01:57:25,733 --> 01:57:28,150 Underneath the hood, Python is going to use linear search. 2542 01:57:28,150 --> 01:57:29,817 You don't have to implement it yourself. 2543 01:57:29,817 --> 01:57:32,320 No while loop, no for loop, you just ask a question. 2544 01:57:32,320 --> 01:57:36,230 If 0 is in numbers, then do the following. 2545 01:57:36,230 --> 01:57:38,350 So that's one feature we now get with Python, 2546 01:57:38,350 --> 01:57:40,340 and get to throw away a lot of that code. 2547 01:57:40,340 --> 01:57:41,830 We can do it with strings, too. 2548 01:57:41,830 --> 01:57:44,840 Let me open a file called Names.py instead, 2549 01:57:44,840 --> 01:57:46,990 and do something that was even more involved in C, 2550 01:57:46,990 --> 01:57:50,020 because we needed strcmp and the for loop, and so forth. 
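As a rough sketch of the numbers search just described: the `search` helper below is invented for this illustration and returns the message instead of printing it and exiting, so both outcomes can be tried in a single run.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]


def search(haystack, needle):
    """Report whether needle occurs anywhere in haystack."""
    # For a list, `in` checks elements one by one: a linear search.
    if needle in haystack:
        return "Found"
    return "Not found"


print(search(numbers, 0))
print(search(numbers, 9))
```

In the lecture's version, the found branch would be followed by an exit with status 0 and the not-found branch by an exit with status 1.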
2551 01:57:50,020 --> 01:57:52,000 Let me import sys for this file. 2552 01:57:52,000 --> 01:57:54,460 Let's give myself a bunch of names like we did in C. 2553 01:57:54,460 --> 01:58:01,630 And those were Bill and Charlie and Fred and George and Ginny, 2554 01:58:01,630 --> 01:58:05,440 and two more, Percy, and lastly Ron. 2555 01:58:05,440 --> 01:58:07,390 And recall, at the time, we looked for Ron. 2556 01:58:07,390 --> 01:58:09,432 And so we had to iterate through the whole thing, 2557 01:58:09,432 --> 01:58:11,810 doing strcmp and i plus plus and all of that. 2558 01:58:11,810 --> 01:58:18,760 Now just ask the question, if Ron is in names, then let's go ahead 2559 01:58:18,760 --> 01:58:20,440 and, whoops, let me hide that. 2560 01:58:20,440 --> 01:58:22,250 I hit the command too soon. 2561 01:58:22,250 --> 01:58:26,180 Let me go ahead and say print, found, as before. 2562 01:58:26,180 --> 01:58:29,710 sys.exit 0, just to indicate success, and then down here, 2563 01:58:29,710 --> 01:58:32,840 if we get to this point, we can say not found. 2564 01:58:32,840 --> 01:58:36,170 And then we'll just sys.exit 1 instead. 2565 01:58:36,170 --> 01:58:40,960 So, again, this just does linear search for us by default, Python of Names.py, 2566 01:58:40,960 --> 01:58:44,410 we found Ron, because, indeed, he's there, and at the end of the list. 2567 01:58:44,410 --> 01:58:48,190 But we don't need to deal with all of the mechanics of it. 2568 01:58:48,190 --> 01:58:50,530 All right, let's take things one step further. 2569 01:58:50,530 --> 01:58:52,840 In week three, we also implemented the idea 2570 01:58:52,840 --> 01:58:56,980 of a phone book, that actually associated keys with values. 2571 01:58:56,980 --> 01:59:00,010 But remember, the phone book in C, was kind of a hack, right? 2572 01:59:00,010 --> 01:59:03,520 Because we first had two arrays, one with names, one with numbers. 
2573 01:59:03,520 --> 01:59:07,330 Then we introduced structs, and so we gave you a person structure. 2574 01:59:07,330 --> 01:59:10,900 And then we had an array of persons. 2575 01:59:10,900 --> 01:59:15,040 You can do this in Python, using objects and things called classes. 2576 01:59:15,040 --> 01:59:17,670 But we can also just use a general purpose dictionary, 2577 01:59:17,670 --> 01:59:21,420 because just like in P set 5, you can associate keys with values, using 2578 01:59:21,420 --> 01:59:23,100 a hash table, using a trie. 2579 01:59:23,100 --> 01:59:26,400 Well, similarly, can Python just do this for us. 2580 01:59:26,400 --> 01:59:29,250 From CS50, let's import get string. 2581 01:59:29,250 --> 01:59:32,760 And now let's give myself a dictionary of people, 2582 01:59:32,760 --> 01:59:36,540 D-I-C-T () open paren closed paren gives you a dictionary. 2583 01:59:36,540 --> 01:59:39,300 Or you can simplify the syntax, actually, 2584 01:59:39,300 --> 01:59:42,360 and a dictionary again is just keys and values, words and definitions. 2585 01:59:42,360 --> 01:59:45,060 You can also just use curly braces instead. 2586 01:59:45,060 --> 01:59:47,020 That gives me an empty dictionary. 2587 01:59:47,020 --> 01:59:50,400 But if I know what I want to put in it by default, let's put Carter in there, 2588 01:59:50,400 --> 01:59:57,790 with a number of plus 1-617-495-1000, just like last time, and put myself, 2589 01:59:57,790 --> 02:00:03,777 David, with plus 1-949-468-2750. 2590 02:00:03,777 --> 02:00:06,360 And it came to my attention, tragically, after class that day, 2591 02:00:06,360 --> 02:00:08,152 that we had a bug in our little Easter egg. 2592 02:00:08,152 --> 02:00:11,190 If today, you would like to call me or text me, at that number, 2593 02:00:11,190 --> 02:00:14,130 we have fixed the code that underlies that little Easter egg. 2594 02:00:14,130 --> 02:00:15,090 Spoiler ahead. 
2595 02:00:15,090 --> 02:00:17,040 All right, so this now gives me a variable 2596 02:00:17,040 --> 02:00:21,120 called people, that's associating keys with values. 2597 02:00:21,120 --> 02:00:25,230 There is some new syntax here in Python, not just the curly braces, 2598 02:00:25,230 --> 02:00:28,290 but the colons, and the quotes on the left and the right. 2599 02:00:28,290 --> 02:00:31,380 This is a way, in Python, of associating keys 2600 02:00:31,380 --> 02:00:35,350 with values, words with definitions, anything with anything else. 2601 02:00:35,350 --> 02:00:38,550 And it's going to be a super-common paradigm, including in week seven, 2602 02:00:38,550 --> 02:00:42,450 when we look at CSS and HTML and web programming, keys and values 2603 02:00:42,450 --> 02:00:45,840 are like this omnipresent idea in computer science and programming, 2604 02:00:45,840 --> 02:00:49,300 because it's just a really useful way of associating one thing with another. 2605 02:00:49,300 --> 02:00:52,690 So, at this point in the story, we have a dictionary, a hash table, 2606 02:00:52,690 --> 02:00:56,190 if you will, of people, associating names with phone numbers, 2607 02:00:56,190 --> 02:00:57,675 just like a real world phone book. 2608 02:00:57,675 --> 02:01:01,200 So let's write a program that gets a string from the user and asks them 2609 02:01:01,200 --> 02:01:03,390 whose number they would like to look up. 2610 02:01:03,390 --> 02:01:09,510 Then, let's go ahead and say, if that name is in the people dictionary, 2611 02:01:09,510 --> 02:01:12,090 go ahead and print out that person's number, 2612 02:01:12,090 --> 02:01:14,730 by going into the people dictionary and going 2613 02:01:14,730 --> 02:01:19,480 to that specific name, within there, using an f-string for the whole thing. 2614 02:01:19,480 --> 02:01:21,960 So this is similar in spirit to before. 
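The phone book dictionary and lookup described above might be sketched as follows. The `lookup` function and the `Number:` label are illustrative choices made here. The lecture's version prompts for the name with CS50's `get_string`, which is left out so the sketch runs without that library.

```python
# Keys are names, values are phone numbers.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    """Return a line with the person's number, or None if absent."""
    if name in people:  # checks the keys of the dictionary
        # Square brackets index into the dictionary by key.
        return f"Number: {people[name]}"
    return None


print(lookup("David"))
```

Note that `in` on a dictionary tests its keys only, so looking up a phone number this way would find nothing.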
2615 02:00:21,960 --> 02:00:26,130 Linear search and dictionary lookups will just happen automatically for you 2616 02:00:26,130 --> 02:00:29,280 in Python, by just asking the question, if name in people. 2617 02:00:29,280 --> 02:00:31,170 And this line is just going to print out, 2618 02:00:31,170 --> 02:00:35,710 whoever is in the people dictionary, at that name. 2619 02:00:35,710 --> 02:00:40,200 So I'm using square brackets, because here's the interesting thing in Python, 2620 02:00:40,200 --> 02:00:43,320 just like you can index into an array, or a list in Python, 2621 02:00:43,320 --> 02:00:48,150 using numbers, 0, 1, 2, you can very conveniently index 2622 02:00:48,150 --> 02:00:53,080 into a dictionary in Python, using square brackets, as well. 2623 02:00:53,080 --> 02:00:56,070 And just to make clear what's going on here, let me go 2624 02:00:56,070 --> 02:01:00,480 and create a temporary variable, person equals people bracket name. 2625 02:01:00,480 --> 02:01:05,010 And then let's just, or, sorry, let's say, number equals people bracket name. 2626 02:01:05,010 --> 02:01:07,890 And that will just print out the number in question. 2627 02:01:07,890 --> 02:01:11,850 In C, and previously in Python, anything with square brackets like this 2628 02:01:11,850 --> 02:01:16,950 would have been go to a location in a list or an array, using a number. 2629 02:01:16,950 --> 02:01:20,790 But that can actually be a string, like a word the human has typed. 2630 02:01:20,790 --> 02:01:22,830 And this is what's amazing about dictionaries, 2631 02:01:22,830 --> 02:01:25,890 it's not like a big line, a big linear thing. 2632 02:01:25,890 --> 02:01:28,740 It's this table, that you can look up in one column the name, 2633 02:01:28,740 --> 02:01:31,060 and get back in the other column the number. 2634 02:01:31,060 --> 02:01:33,120 So let's go ahead and run Python of Phonebook.py, 2635 02:01:33,120 --> 02:01:38,100 found, not that, oh, wait. 
2636 02:02:38,100 --> 02:02:41,880 That's not what's supposed to happen at all. 2637 02:02:41,880 --> 02:02:43,440 I think I'm in the wrong place. 2638 02:02:43,440 --> 02:02:44,290 Phonebook.py. 2639 02:02:44,290 --> 02:02:47,130 2640 02:02:47,130 --> 02:02:49,260 What's going on? 2641 02:02:49,260 --> 02:02:51,720 Print found. 2642 02:02:51,720 --> 02:02:53,580 I am confused. 2643 02:02:53,580 --> 02:02:55,830 OK, let's run this again. 2644 02:02:55,830 --> 02:02:59,970 Python of Phonebook.py, what the-- 2645 02:02:59,970 --> 02:03:01,050 OK, stand by. 2646 02:03:01,050 --> 02:03:07,026 2647 02:03:07,026 --> 02:03:17,902 [KEYS CLICKING] 2648 02:03:17,902 --> 02:03:19,140 What the heck? 2649 02:03:19,140 --> 02:03:21,255 What am I not understanding here? 2650 02:03:21,255 --> 02:03:24,180 2651 02:03:24,180 --> 02:03:27,348 OK, Roxanne, Carter, do you see what I'm doing wrong? 2652 02:03:27,348 --> 02:03:29,220 AUDIENCE: I don't. 2653 02:03:29,220 --> 02:03:31,484 DAVID J. MALAN: What the-- 2654 02:03:31,484 --> 02:03:33,720 [LAUGHTER] 2655 02:03:33,720 --> 02:03:34,230 Say again? 2656 02:03:34,230 --> 02:03:38,110 SPEAKER 47: When you found the test results, it was doing both commands. 2657 02:03:38,110 --> 02:03:43,390 DAVID J. MALAN: Oh, yeah, found, OK, we're going to do this. 2658 02:03:43,390 --> 02:03:45,622 One sec. 2659 02:03:45,622 --> 02:03:52,270 [KEYS CLICKING] 2660 02:03:52,270 --> 02:03:55,360 Whoa, OK. 2661 02:03:55,360 --> 02:03:57,270 All this is coming out of the video. 2662 02:03:57,270 --> 02:03:58,228 So. 2663 02:03:58,228 --> 02:03:59,164 [LAUGHTER] 2664 02:03:59,164 --> 02:04:01,310 [APPLAUSE] 2665 02:04:01,310 --> 02:04:01,810 Thanks. 2666 02:04:01,810 --> 02:04:05,400 2667 02:04:05,400 --> 02:04:06,283 All right. 2668 02:04:06,283 --> 02:04:08,200 I will try to figure out what was going wrong. 2669 02:04:08,200 --> 02:04:10,800 The best I can tell, it was running the wrong program. 
2670 02:04:10,800 --> 02:04:12,820 I don't quite understand why. 2671 02:04:12,820 --> 02:04:14,170 So we will diagnose this later. 2672 02:04:14,170 --> 02:04:16,962 I just put the file into a temporary directory, for now, to run it. 2673 02:04:16,962 --> 02:04:22,710 So let me go ahead and just run this, Python of Phonebook.py, 2674 02:04:22,710 --> 02:04:24,240 type in, for instance, my name. 2675 02:04:24,240 --> 02:04:26,418 And there's my corresponding number. 2676 02:04:26,418 --> 02:04:27,960 Have no idea what was just happening. 2677 02:04:27,960 --> 02:04:30,060 But I will get to the bottom of it and update you, 2678 02:04:30,060 --> 02:04:31,360 if we can put our finger on it. 2679 02:04:31,360 --> 02:04:34,890 So this was just an example, now, of implementing a phone book. 2680 02:04:34,890 --> 02:04:37,590 Let's now consider what we can do that's a little more 2681 02:04:37,590 --> 02:04:40,410 powerful, in these examples, like a phone book that 2682 02:04:40,410 --> 02:04:42,150 actually keeps this information around. 2683 02:04:42,150 --> 02:04:45,510 Thus far, these simple phone book examples throw the information away. 2684 02:04:45,510 --> 02:04:48,780 But using CSV files, comma separated values, 2685 02:04:48,780 --> 02:04:51,555 maybe we could actually keep around the names and numbers, 2686 02:04:51,555 --> 02:04:53,430 so that, like on your phone, you can actually 2687 02:04:53,430 --> 02:04:55,780 keep your contacts around long-term. 2688 02:04:55,780 --> 02:04:59,060 So I'm going to go ahead now and do a slightly different example. 2689 02:04:59,060 --> 02:05:03,240 And let me just hide this detail, so it's not confusing. 2690 02:05:03,240 --> 02:05:06,630 Whoops, I'm going to change my prompt temporarily. 2691 02:05:06,630 --> 02:05:10,540 So let me go ahead now and refine this example as follows. 
2692 02:05:10,540 --> 02:05:13,830 I'm going to go into Phonebook.py, and I'm 2693 02:05:13,830 --> 02:05:16,290 going to import a whole library called CSV. 2694 02:05:16,290 --> 02:05:18,150 And this is a powerful one, because Python 2695 02:05:18,150 --> 02:05:21,870 comes with a library that just handles CSV files for you. 2696 02:05:21,870 --> 02:05:25,600 A CSV file is just a file with comma separated values. 2697 02:05:25,600 --> 02:05:29,580 And, in fact, to demonstrate this, let me check on one thing 2698 02:05:29,580 --> 02:05:32,460 here, just to make this a little more real. 2699 02:05:32,460 --> 02:05:39,010 To demonstrate this, let's go ahead and do this. 2700 02:05:39,010 --> 02:05:41,970 Let me import the CSV library. From CS50, 2701 02:05:41,970 --> 02:05:43,830 let me import get_string. 2702 02:05:43,830 --> 02:05:47,550 Let me then open a file, using the open function, 2703 02:05:47,550 --> 02:05:52,410 open a file called Phonebook.csv, in append format, 2704 02:05:52,410 --> 02:05:54,900 in contrast with read format and write format. 2705 02:05:54,900 --> 02:05:58,450 Write just blows it away if it exists, append adds to the bottom of it. 2706 02:05:58,450 --> 02:06:00,930 So I keep this phone book around, just like you might 2707 02:06:00,930 --> 02:06:02,868 keep adding contacts to your phone. 2708 02:06:02,868 --> 02:06:05,410 Now let me go ahead and get a couple of values from the user. 2709 02:06:05,410 --> 02:06:08,820 Let me say get_string and ask the user for a name. 2710 02:06:08,820 --> 02:06:14,160 Then let me get_string again, and ask the user for their number. 2711 02:06:14,160 --> 02:06:16,185 And now, let me go ahead and do this. 2712 02:06:16,185 --> 02:06:18,060 And this is new, and this is Python-specific. 2713 02:06:18,060 --> 02:06:20,820 And you would only know this by following a tutorial, 2714 02:06:20,820 --> 02:06:22,480 or reading the documentation. 
2715 02:06:22,480 --> 02:06:24,870 Let me give myself a variable called writer, 2716 02:06:24,870 --> 02:06:29,950 and ask the CSV library for a writer to that file. 2717 02:06:29,950 --> 02:06:33,390 Then, let me go ahead and use that writer variable, 2718 02:06:33,390 --> 02:06:36,720 use a function or a method inside of it, called write row, 2719 02:06:36,720 --> 02:06:41,200 to write out a list containing that person's name and number. 2720 02:06:41,200 --> 02:06:44,310 Notice the square brackets inside the parentheses, 2721 02:06:44,310 --> 02:06:49,350 because I'm just printing a list to that particular row in the file. 2722 02:06:49,350 --> 02:06:51,100 And then I'm just going to close the file. 2723 02:06:51,100 --> 02:06:52,742 So what is the effect of all of this? 2724 02:06:52,742 --> 02:06:55,200 Well, let me go ahead and run this version of Phonebook.py, 2725 02:06:55,200 --> 02:06:56,680 and I'm prompted for a name. 2726 02:06:56,680 --> 02:07:05,130 Let's do Carter's first, plus 1-617-495-1000, and then, 2727 02:07:05,130 --> 02:07:07,770 let's go ahead and LS. 2728 02:07:07,770 --> 02:07:10,960 Notice in my current directory, there's two files now, Phonebook.py, 2729 02:07:10,960 --> 02:07:14,430 which I wrote, and apparently Phonebook.csv. 2730 02:07:14,430 --> 02:07:16,830 CSV just stands for comma separated values. 2731 02:07:16,830 --> 02:07:20,380 And it's like a very simple way of storing data in a spreadsheet, 2732 02:07:20,380 --> 02:07:23,670 if you will, where the comma represents the separation between your columns. 2733 02:07:23,670 --> 02:07:26,370 There's only two columns here, name and number. 2734 02:07:26,370 --> 02:07:29,580 But, because I'm writing to this file in append mode, 2735 02:07:29,580 --> 02:07:33,220 let me run it one more time, Python of Phonebook.py, 2736 02:07:33,220 --> 02:07:41,490 and let me go ahead and do David and plus 1-949-468-2750, Enter. 
2737 02:07:41,490 --> 02:07:43,350 And notice what happened in the CSV file. 2738 02:07:43,350 --> 02:07:46,380 It automatically updated, because I'm now persisting 2739 02:07:46,380 --> 02:07:49,000 this data to the file in question. 2740 02:07:49,000 --> 02:07:51,360 So if I wanted to now read this file in, I 2741 02:07:51,360 --> 02:07:55,680 could actually go ahead and do linear search on the data, 2742 02:07:55,680 --> 02:07:58,650 using a read function to actually read from the CSV. 2743 02:07:58,650 --> 02:08:01,350 But, for now, we'll just leave it a little simply as write. 2744 02:08:01,350 --> 02:08:03,270 And let me make one refinement here. 2745 02:08:03,270 --> 02:08:07,020 It turns out that, if you're in the habit of re-opening a file, 2746 02:08:07,020 --> 02:08:09,330 you don't have to even close it explicitly. 2747 02:08:09,330 --> 02:08:10,920 You can instead do this. 2748 02:08:10,920 --> 02:08:16,050 You can instead say, with the opening of a file called Phonebook.csv 2749 02:08:16,050 --> 02:08:21,300 in append mode, calling the thing file, go ahead and do all of these lines 2750 02:08:21,300 --> 02:08:22,350 here. 2751 02:08:22,350 --> 02:08:24,377 So the with keyword is a new thing in Python. 2752 02:08:24,377 --> 02:08:27,210 And it's used in a few different ways, but one of the ways it's used 2753 02:08:27,210 --> 02:08:28,335 is to tighten up code here. 2754 02:08:28,335 --> 02:08:30,418 And I'm going to move my variables to the outside, 2755 02:08:30,418 --> 02:08:32,910 because they don't need to be inside of the with statement, 2756 02:08:32,910 --> 02:08:33,868 where the file is open. 2757 02:08:33,868 --> 02:08:36,452 This just has the effect of ensuring that you, the programmer, 2758 02:08:36,452 --> 02:08:38,790 don't screw up, and accidentally don't close your file. 
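The with refinement just described might look like the sketch below. As before, the function name add_contact is an assumption made for illustration, and the values are passed in rather than read with get_string.

```python
import csv


def add_contact(path, name, number):
    """Append one name/number row, letting with close the file."""
    # The file is closed automatically when the with block ends,
    # even if writerow raises an exception, so there is no
    # explicit call to file.close().
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
```

Only the lines that need the open file sit inside the with block; gathering the name and number can happen before it.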
2759 02:08:38,790 --> 02:08:40,680 In fact, you might recall, from C, Valgrind 2760 02:08:40,680 --> 02:08:45,237 might have complained at you, if you had a file that, you didn't close a file, 2761 02:08:45,237 --> 02:08:47,820 you might have had a memory leak as a result. The with keyword 2762 02:08:47,820 --> 02:08:51,840 takes care of all of that for you, as well. 2763 02:08:51,840 --> 02:08:54,670 How about let's do, want to do this. 2764 02:08:54,670 --> 02:08:57,960 How about, let's do one other thing. 2765 02:08:57,960 --> 02:08:59,230 Let's do this. 2766 02:08:59,230 --> 02:09:02,280 Let me go ahead and propose, that on your phone or laptop 2767 02:09:02,280 --> 02:09:07,470 here, or online, go to this URL here, where you'll find a Google form. 2768 02:09:07,470 --> 02:09:10,290 And just to show that these CSVs are actually kind of omnipresent, 2769 02:09:10,290 --> 02:09:11,850 and if you've ever like used a Google Form 2770 02:09:11,850 --> 02:09:13,560 or managed a student group, or something where you've 2771 02:09:13,560 --> 02:09:15,750 collected data via Google Forms, you can actually 2772 02:09:15,750 --> 02:09:18,640 export all of that data via CSV files. 2773 02:09:18,640 --> 02:09:21,150 So go ahead to this URL here. 2774 02:09:21,150 --> 02:09:22,950 And those of you watching on demand later, 2775 02:09:22,950 --> 02:09:24,540 will find that the form is no longer working, 2776 02:09:24,540 --> 02:09:26,030 since we're only doing this live. 2777 02:09:26,030 --> 02:09:27,780 But that will lead to a Google Form that's 2778 02:09:27,780 --> 02:09:30,750 going to let everyone input their answer to a question, 2779 02:09:30,750 --> 02:09:33,660 like what house do you want to end up into, 2780 02:09:33,660 --> 02:09:36,630 sort of an approximation of the sorting hat in Harry Potter. 2781 02:09:36,630 --> 02:09:40,680 And via this form, will we then have the ability to export, 2782 02:09:40,680 --> 02:09:43,780 we'll see, a CSV file. 
2783 02:09:43,780 --> 02:09:47,610 So let's give you a moment to do that. 2784 02:09:47,610 --> 02:09:50,460 In just a moment, I'll share my version of the screen, which 2785 02:09:50,460 --> 02:09:54,330 is going to let me actually open the file, the form itself. 2786 02:09:54,330 --> 02:09:59,070 And in just a moment, I'll switch over. 2787 02:09:59,070 --> 02:10:01,020 OK, so this is now my version of the form 2788 02:10:01,020 --> 02:10:04,290 here, where we have 200 plus responses to a simple question of the form, what 2789 02:10:04,290 --> 02:10:08,010 house do you belong in, Gryffindor, Hufflepuff, Ravenclaw, or Slytherin. 2790 02:10:08,010 --> 02:10:12,800 If I go over to responses, I'll see all of the responses in the GUI form here. 2791 02:10:12,800 --> 02:10:15,300 So graphical user interface, and we could flip through this. 2792 02:10:15,300 --> 02:10:20,010 And it looks like, interestingly, 40% of Harvard students 2793 02:10:20,010 --> 02:10:24,223 want to be in Gryffindor, 22% in Slytherin, and everyone else 2794 02:10:24,223 --> 02:10:25,140 in between the others. 2795 02:10:25,140 --> 02:10:27,270 But you might have noticed, if ever using a Google Form, 2796 02:10:27,270 --> 02:10:28,720 this Google Spreadsheets link. 2797 02:10:28,720 --> 02:10:30,010 So I'm going to go ahead and click that. 2798 02:10:30,010 --> 02:10:32,460 And that's going to automatically open, in this case, Google Spreadsheets. 2799 02:10:32,460 --> 02:10:35,290 But you can do the same thing with Office 365 as well. 2800 02:10:35,290 --> 02:10:38,040 And now you see the raw data as a spreadsheet. 2801 02:10:38,040 --> 02:10:42,900 But in Google Spreadsheets, if I go to File and then I go to Download, 2802 02:10:42,900 --> 02:10:46,800 notice I can download this as an Excel file, a PDF, and also 2803 02:10:46,800 --> 02:10:48,910 a CSV, comma separated values. 2804 02:10:48,910 --> 02:10:50,620 So let me go ahead and do that. 
2805 02:10:50,620 --> 02:10:53,920 That gives me a file in my Downloads folder on my computer. 2806 02:10:53,920 --> 02:10:57,970 I'm going to now go back to my code editor here. 2807 02:10:57,970 --> 02:11:00,180 And what I'm going to go ahead and do is upload 2808 02:11:00,180 --> 02:11:04,320 this file, from my Downloads folder to VS Code, 2809 02:11:04,320 --> 02:11:06,610 so that we can actually see it within here. 2810 02:11:06,610 --> 02:11:08,220 And now you can see this open file. 2811 02:11:08,220 --> 02:11:11,220 And I'm going to shorten its name, just so it's a little easier to read. 2812 02:11:11,220 --> 02:11:15,990 I'm going to rename this using the MV command, to just Hogwarts.csv. 2813 02:11:15,990 --> 02:11:19,367 And then we can see, in the file, that there's two columns, timestamp column 2814 02:11:19,367 --> 02:11:21,450 house, where you have a whole bunch of time stamps 2815 02:11:21,450 --> 02:11:24,270 when people filled out the form, with someone very early in class. 2816 02:11:24,270 --> 02:11:25,980 And then everyone else just a moment ago. 2817 02:11:25,980 --> 02:11:29,310 And the second value, after each comma, is the name of the house. 2818 02:11:29,310 --> 02:11:32,040 Well, let me go ahead here and implement a program 2819 02:11:32,040 --> 02:11:36,100 in a file called Hogwarts.py, that processes this data. 2820 02:11:36,100 --> 02:11:38,280 So in Hogwarts.py, let's just write a program 2821 02:11:38,280 --> 02:11:41,440 that now reads a CSV, in this case not a phone book, 2822 02:11:41,440 --> 02:11:43,410 but everyone's sorting hat information. 2823 02:11:43,410 --> 02:11:45,450 And I'm going to go ahead and Import CSV. 2824 02:11:45,450 --> 02:11:48,660 And suppose I want to answer a reasonable question, ignoring 2825 02:11:48,660 --> 02:11:52,470 the fact that Google's GUI or graphical user interface, can do this for me. 2826 02:11:52,470 --> 02:11:55,320 I just want to count up who's going to be in which house. 
2827 02:11:55,320 --> 02:11:59,640 So let me give myself a dictionary called houses, that's initially empty, 2828 02:11:59,640 --> 02:12:00,780 with curly braces. 2829 02:12:00,780 --> 02:12:02,790 And let me pre-create a few keys. 2830 02:12:02,790 --> 02:12:07,500 Let me say Gryffindor is going to be initialized to 0, 2831 02:12:07,500 --> 02:12:11,820 Hufflepuff will be initialized to 0 as well, Ravenclaw 2832 02:12:11,820 --> 02:12:13,200 will be initialized to 0. 2833 02:12:13,200 --> 02:12:16,770 And finally, Slytherin will be initialized to 0. 2834 02:12:16,770 --> 02:12:19,950 So here's another example of a dictionary, or a hash table, 2835 02:12:19,950 --> 02:12:22,140 just being a very general-purpose piece of data. 2836 02:12:22,140 --> 02:12:23,760 You can have keys and values. 2837 02:12:23,760 --> 02:12:25,470 The keys, in this case, are the houses. 2838 02:12:25,470 --> 02:12:28,500 The values are initially zero, but I'm going to use this, 2839 02:12:28,500 --> 02:12:33,600 instead of like four separate variables, to keep track of everyone's answer 2840 02:12:33,600 --> 02:12:34,730 to this form. 2841 02:12:34,730 --> 02:12:35,730 So I'm going to do this. 2842 02:12:35,730 --> 02:12:43,180 With opening Hogwarts.csv, in read mode, not append, I don't want to change it. 2843 02:12:43,180 --> 02:12:46,440 I just want to read it, as file as my variable name. 2844 02:12:46,440 --> 02:12:49,530 Let's go ahead and create a reader this time, 2845 02:12:49,530 --> 02:12:54,710 that is using the reader function in the CSV library, by opening that file. 2846 02:12:54,710 --> 02:12:57,210 I'm going to go ahead and ignore the first line of the file, 2847 02:12:57,210 --> 02:13:00,270 because, recall, that the first line is just timestamp and house. 2848 02:13:00,270 --> 02:13:01,450 I want to get the real data. 
2849 02:13:01,450 --> 02:13:03,540 So this next function is just a little trick 2850 02:13:03,540 --> 02:13:06,730 for ignoring the first line of the file. 2851 02:13:06,730 --> 02:13:07,800 Then let's do this. 2852 02:13:07,800 --> 02:13:12,180 For every other row in the reader, that is line by line, 2853 02:13:12,180 --> 02:13:15,420 get the current person's house, which is in row bracket 1. 2854 02:13:15,420 --> 02:13:18,213 This is what the CSV reader library is doing for us. 2855 02:13:18,213 --> 02:13:20,130 It's handling all of the reading of this file. 2856 02:13:20,130 --> 02:13:23,760 It figures out where the comma is, and, for every row in the file, 2857 02:13:23,760 --> 02:13:26,250 it hands you back a list of size 2. 2858 02:13:26,250 --> 02:13:31,090 In bracket 0 is the time stamp, in bracket 1 is the house name. 2859 02:13:31,090 --> 02:13:34,830 So, in my code, I can say house equals row bracket 1. 2860 02:13:34,830 --> 02:13:36,970 I don't care about the time stamp for this program. 2861 02:13:36,970 --> 02:13:41,070 And then let's go into my dictionary called houses, plural, index 2862 02:13:41,070 --> 02:13:47,370 into it at the house location, by its name, and increment that 0 to 1. 2863 02:13:47,370 --> 02:13:50,280 And now, at the end of this block of code, 2864 02:13:50,280 --> 02:13:53,040 that has the effect of iterating over every line of the file, 2865 02:13:53,040 --> 02:13:55,470 updating my dictionary in four different places, 2866 02:13:55,470 --> 02:13:59,190 based on whether someone typed Gryffindor or Slytherin or anything 2867 02:13:59,190 --> 02:13:59,700 else. 
2868 02:13:59,700 --> 02:14:03,810 And notice that I'm using the name of the house to index into my dictionary, 2869 02:14:03,810 --> 02:14:07,500 to essentially go up to this little cheat sheet and change the 0 to a 1, 2870 02:14:07,500 --> 02:14:10,020 the 1 to a 2, the 2 to a 3, instead of having 2871 02:14:10,020 --> 02:14:12,000 like four separate variables, which would just 2872 02:14:12,000 --> 02:14:14,070 be much more annoying to maintain. 2873 02:14:14,070 --> 02:14:16,290 Down at the bottom, let's just print out the results. 2874 02:14:16,290 --> 02:14:19,620 For each house in houses, iterating over 2875 02:14:19,620 --> 02:14:21,750 the keys therein by default in Python, 2876 02:14:21,750 --> 02:14:24,630 let's go ahead and print out an f-string that says, 2877 02:14:24,630 --> 02:14:29,460 the current house has the current count. 2878 02:14:29,460 --> 02:14:35,070 And count will be the result of indexing into houses, for that given house. 2879 02:14:35,070 --> 02:14:36,810 And let me close my quote. 2880 02:14:36,810 --> 02:14:41,940 So let's run this to summarize the data, Hogwarts.py, 140 of you 2881 02:14:41,940 --> 02:14:46,200 answered Gryffindor, 54 Hufflepuff, 72 Ravenclaw, and 80 of you Slytherin. 2882 02:14:46,200 --> 02:14:48,570 And that's just now by way of code, and this is, oh, 2883 02:14:48,570 --> 02:14:52,227 my God, so much easier than C, to actually analyze data in this way. 2884 02:14:52,227 --> 02:14:55,560 And one of the reasons that Python is so popular for data science and analytics, 2885 02:14:55,560 --> 02:14:59,910 more generally, is that it's actually really easy to manipulate data, and run 2886 02:14:59,910 --> 02:15:00,940 analytics like this. 2887 02:15:00,940 --> 02:15:02,370 And let me clean this up slightly. 2888 02:15:02,370 --> 02:15:05,160 It's a little annoying that I just have to know and trust 2889 02:15:05,160 --> 02:15:10,410 that the house name is in bracket 1 and timestamp is in bracket 0.
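The counting program walked through above can be sketched roughly as follows. The function names count_houses and print_counts are hypothetical, and the sketch assumes a two-column file whose first row is a header and whose second column holds one of the four house names.

```python
import csv


def count_houses(path):
    """Count how many rows name each house in a two-column CSV."""
    # One dictionary instead of four separate counter variables.
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        reader = csv.reader(file)
        # Skip the header row (timestamp and house column names).
        next(reader)
        for row in reader:
            # Each row is a list: row[0] is the timestamp, row[1] the house.
            house = row[1]
            houses[house] += 1
    return houses


def print_counts(houses):
    """Print one line per house; iterating a dict yields its keys."""
    for house in houses:
        count = houses[house]
        print(f"{house}: {count}")
```

One design note: because the keys are created up front, a row containing any other house name would raise a KeyError. That is acceptable for a multiple-choice form, but free-text input would need a check first.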
2890 02:15:10,410 --> 02:15:11,440 Let's clean this up. 2891 02:15:11,440 --> 02:15:16,530 There's something called a DictReader in the CSV library 2892 02:15:16,530 --> 02:15:17,880 that I can use instead. 2893 02:15:17,880 --> 02:15:22,470 Capital D, capital R. This means I can throw away this next call, 2894 02:15:22,470 --> 02:15:24,900 because what a dictionary reader does is it 2895 02:15:24,900 --> 02:15:28,890 still returns to me every row from the file, one after the other, 2896 02:15:28,890 --> 02:15:32,560 but it doesn't just give me a list of size 2 representing each row. 2897 02:15:32,560 --> 02:15:33,960 It gives me a dictionary. 2898 02:15:33,960 --> 02:15:39,000 And it uses, as the keys in that dictionary, timestamp and house, 2899 02:15:39,000 --> 02:15:41,460 for every row in the file, which is just to say 2900 02:15:41,460 --> 02:15:43,950 it makes my code a little more readable, because instead 2901 02:15:43,950 --> 02:15:46,590 of doing this little trickery, bracket 1, 2902 02:15:46,590 --> 02:15:49,500 I can say bracket, quote unquote, "House" with a capital H, 2903 02:15:49,500 --> 02:15:52,360 because it's capitalized in the Google Form itself. 2904 02:15:52,360 --> 02:15:54,798 So the code now is just slightly different, 2905 02:15:54,798 --> 02:15:57,840 but it's way more resilient, especially if I'm using Google Spreadsheets, 2906 02:15:57,840 --> 02:16:00,390 and I'm moving the columns around or doing something like that, 2907 02:16:00,390 --> 02:16:01,973 where the numbers might get messed up. 2908 02:16:01,973 --> 02:16:05,260 Now I can run Hogwarts.py again, and I get the same answers. 2909 02:16:05,260 --> 02:16:09,960 But I now don't have to worry about where those individual columns are. 2910 02:16:09,960 --> 02:16:14,880 All right, any questions on those capabilities there?
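A sketch of the DictReader version described above. The function name count_houses is hypothetical, and the column header "House" (capital H) is taken from the lecture's description of the form.

```python
import csv


def count_houses(path):
    """Count houses by column name instead of column position."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    with open(path, "r", newline="") as file:
        # DictReader consumes the header row itself and uses it as the
        # keys of each row's dictionary, so no next(reader) is needed.
        reader = csv.DictReader(file)
        for row in reader:
            house = row["House"]
            houses[house] += 1
    return houses
```

Because rows are looked up by header name, this keeps working if the columns are reordered in the spreadsheet, which is the resilience the lecture points to.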
2911 02:16:14,880 --> 02:16:17,400 And that's a teaser of sorts, for some of the manipulation 2912 02:16:17,400 --> 02:16:19,620 we'll do in problem set 6. 2913 02:16:19,620 --> 02:16:23,555 All right, so some final examples and flair, to intrigue you 2914 02:16:23,555 --> 02:16:24,930 with what you can do with Python. 2915 02:16:24,930 --> 02:16:28,710 I'm going to actually switch over to a terminal window on my own Mac, 2916 02:16:28,710 --> 02:16:31,900 so that I can actually use audio a little more effectively. 2917 02:16:31,900 --> 02:16:33,930 So here's just a terminal window on macOS. 2918 02:16:33,930 --> 02:16:37,950 Before class, I preinstalled some additional Python libraries, 2919 02:16:37,950 --> 02:16:40,379 that won't really work in VS Code in the cloud, 2920 02:16:40,379 --> 02:16:43,535 because they require audio that the browser won't necessarily support. 2921 02:16:43,535 --> 02:16:45,660 But I'm going to go ahead and write an example here 2922 02:16:45,660 --> 02:16:49,559 that involves writing a speech-based program, that actually does something 2923 02:16:49,559 --> 02:16:50,212 with speech. 2924 02:16:50,212 --> 02:16:52,170 And I'm going to go ahead and import a library, 2925 02:16:52,170 --> 02:16:55,709 that, again, I preinstalled, called Python text-to-speech, 2926 02:16:55,709 --> 02:16:58,260 and I'm going to go ahead and, per its documentation, 2927 02:16:58,260 --> 02:17:02,879 give myself a speech engine, by using that library's init function, 2928 02:17:02,879 --> 02:17:04,080 for initialize. 2929 02:17:04,080 --> 02:17:06,930 I'm then going to use this engine's say function 2930 02:17:06,930 --> 02:17:09,180 to say something fun, like Hello, world. 2931 02:17:09,180 --> 02:17:12,480 And then I'm going to go ahead and tell this engine to run and wait, 2932 02:17:12,480 --> 02:17:13,855 while it says those words. 2933 02:17:13,855 --> 02:17:15,480 All right, I'm going to save this file.
2934 02:17:15,480 --> 02:17:16,980 I'm not using VS Code at the moment. 2935 02:17:16,980 --> 02:17:20,070 I'm using another popular program that we used in CS50 back in my day, 2936 02:17:20,070 --> 02:17:22,830 called Vim, which is a command line program that's 2937 02:17:22,830 --> 02:17:24,790 just in this black and white window. 2938 02:17:24,790 --> 02:17:28,849 Let me go ahead now and run Python of Speech.py, and-- 2939 02:17:28,849 --> 02:17:30,745 COMPUTER: Hello, world. 2940 02:17:30,745 --> 02:17:33,120 DAVID J. MALAN: All right, so it's a little computerized, 2941 02:17:33,120 --> 02:17:36,113 but it is speech that has been synthesized from this example. 2942 02:17:36,113 --> 02:17:38,280 Let's change it a little bit to be more interesting. 2943 02:17:38,280 --> 02:17:39,488 Let's do something like this. 2944 02:17:39,488 --> 02:17:43,950 Let's ask the user for their name, like what's your name question mark. 2945 02:17:43,950 --> 02:17:47,850 And then, let's use the little F string, and say, not Hello, world, 2946 02:17:47,850 --> 02:17:50,010 but Hello to that person's name. 2947 02:17:50,010 --> 02:17:54,270 Let me save my file, run Python of Speech.py, Enter. 2948 02:17:54,270 --> 02:17:55,260 David. 2949 02:17:55,260 --> 02:17:57,360 COMPUTER: Hello, David. 2950 02:17:57,360 --> 02:17:59,639 DAVID J. MALAN: All right, so we pronounce my name OK, 2951 02:17:59,639 --> 02:18:02,306 might struggle with different names, depending on the phonetics. 2952 02:18:02,306 --> 02:18:03,570 But that one seemed to be OK. 2953 02:18:03,570 --> 02:18:05,850 Let's do something else with Python, using similarly, 2954 02:18:05,850 --> 02:18:07,780 just a few lines of code. 2955 02:18:07,780 --> 02:18:12,540 Let me go into today's examples. 2956 02:18:12,540 --> 02:18:18,330 And I'm going to go into a folder called Detect, whoops, a folder called 2957 02:18:18,330 --> 02:18:19,680 Faces.py. 2958 02:18:19,680 --> 02:18:20,790 Sorry, Faces. 
2959 02:18:20,790 --> 02:18:23,370 And in this folder, that I've prepared in advance, 2960 02:18:23,370 --> 02:18:25,879 are a few files, Detect.py, Recognize.py, 2961 02:18:25,879 --> 02:18:30,330 and two photos, Office.jpeg and Toby.jpeg. 2962 02:18:30,330 --> 02:18:32,799 If you're familiar with the show, here, for instance, 2963 02:18:32,799 --> 02:18:34,809 is the cast photo from The Office. 2964 02:18:34,809 --> 02:18:36,299 So here's a photo as input. 2965 02:18:36,299 --> 02:18:38,639 Suppose I want to do something very Facebook-style, 2966 02:18:38,639 --> 02:18:40,860 where I want to analyze all of the faces, 2967 02:18:40,860 --> 02:18:42,870 or detect all of the faces in there. 2968 02:18:42,870 --> 02:18:44,940 Well, let me go ahead and show you a program 2969 02:18:44,940 --> 02:18:47,879 I wrote in advance, that's not terribly long. 2970 02:18:47,879 --> 02:18:49,379 Much of it is actually comments. 2971 02:18:49,379 --> 02:18:50,639 But let's see what I'm doing. 2972 02:18:50,639 --> 02:18:54,000 I'm importing the Pillow library, again, to get access to images. 2973 02:18:54,000 --> 02:18:57,480 I'm importing a library called face_recognition, which I downloaded 2974 02:18:57,480 --> 02:18:58,590 and installed in advance. 2975 02:18:58,590 --> 02:19:00,129 But it does what it says. 2976 02:19:00,129 --> 02:19:02,959 According to its documentation, you go into that library 2977 02:19:02,959 --> 02:19:04,760 and you call a function called load_image_file, 2978 02:19:04,760 --> 02:19:07,370 to load something like Office.jpeg, 2979 02:19:07,370 --> 02:19:10,040 and then you can use a line of code like this. 2980 02:19:10,040 --> 02:19:14,120 Call a function called face_locations, passing the image as input, 2981 02:19:14,120 --> 02:19:17,120 and you get back a list of the locations of all of the faces in the image.
2982 02:19:17,120 --> 02:19:20,750 And then down here, a for loop, that iterates over all of those 2983 02:19:20,750 --> 02:19:22,040 face locations. 2984 02:19:22,040 --> 02:19:24,799 And inside of this loop, I just do a bit of trickery. 2985 02:19:24,799 --> 02:19:29,580 I figure out the top, right, bottom, and left corners of those locations. 2986 02:19:29,580 --> 02:19:31,940 And then, using these lines of code here, 2987 02:19:31,940 --> 02:19:34,834 I'm using that image library, to just draw a box, essentially. 2988 02:19:34,834 --> 02:19:35,959 And the code looks cryptic. 2989 02:19:35,959 --> 02:19:38,150 Honestly, I would have to look this up to write it again. 2990 02:19:38,150 --> 02:19:40,650 But per the documentation, this just draws a nice little box 2991 02:19:40,650 --> 02:19:41,610 around the image. 2992 02:19:41,610 --> 02:19:48,200 So let me go ahead and zoom out here, and run this now on Office.jpeg. 2993 02:19:48,200 --> 02:19:53,390 All right, it's analyzing, analyzing, and you can see in the sidebar here, 2994 02:19:53,390 --> 02:19:54,380 here's the original. 2995 02:19:54,380 --> 02:19:59,180 And here is every face that my, what, 10 lines of Python code 2996 02:19:59,180 --> 02:20:00,740 found, within that file. 2997 02:20:00,740 --> 02:20:01,410 What's a face? 2998 02:20:01,410 --> 02:20:04,190 Presumably the library is looking for something, 2999 02:20:04,190 --> 02:20:07,100 maybe without a mask, that has two eyes, a nose, and a mouth, 3000 02:20:07,100 --> 02:20:09,420 in some kind of arrangement, some kind of pattern. 3001 02:20:09,420 --> 02:20:12,440 So it would seem pretty reliable, at least on these fairly easy-to-read 3002 02:20:12,440 --> 02:20:13,370 faces here. 3003 02:20:13,370 --> 02:20:15,660 What if we want to look for someone specific, 3004 02:20:15,660 --> 02:20:17,180 for instance, someone that's always getting picked on. 3005 02:20:17,180 --> 02:20:18,763 Well, we could do something like this. 
3006 02:20:18,763 --> 02:20:23,060 Recognize.py, which is taking two files as input, that image and the image 3007 02:20:23,060 --> 02:20:24,620 of one person in particular. 3008 02:20:24,620 --> 02:20:26,900 And if you're trying to find Toby in a crowd, 3009 02:20:26,900 --> 02:20:29,570 here I conflated the program, sorry, this is the version that 3010 02:20:29,570 --> 02:20:31,550 draws a box around the given face. 3011 02:20:31,550 --> 02:20:33,680 Here we have Toby as identified. 3012 02:20:33,680 --> 02:20:34,220 Why? 3013 02:20:34,220 --> 02:20:38,450 Because that program, Recognize.py, has a few more lines of code, 3014 02:20:38,450 --> 02:20:42,800 but long story short, it additionally loads as input Toby.jpeg, 3015 02:20:42,800 --> 02:20:45,410 in order to recognize that specific face. 3016 02:20:45,410 --> 02:20:48,350 And that specific face is a completely different photo, 3017 02:20:48,350 --> 02:20:52,970 but it looks similar enough to the person, that it all worked out OK. 3018 02:20:52,970 --> 02:20:55,820 Let's do one other that's a little sensitive to microphones. 3019 02:20:55,820 --> 02:21:00,650 Let me go into, how about my listen folder here, which is available 3020 02:21:00,650 --> 02:21:01,610 online, too. 3021 02:21:01,610 --> 02:21:04,380 And let's just run Python of Listen0.py. 3022 02:21:04,380 --> 02:21:07,430 I'm going to type in like David. 3023 02:21:07,430 --> 02:21:10,520 Oh, sorry, no, I'm going to-- 3024 02:21:10,520 --> 02:21:11,150 Hello, world. 3025 02:21:11,150 --> 02:21:16,045 3026 02:21:16,045 --> 02:21:17,420 Oh, no, that's the wrong version. 3027 02:21:17,420 --> 02:21:19,250 [CHUCKLES] OK, I looked like an idiot. 3028 02:21:19,250 --> 02:21:21,500 OK, hello, there we go. 3029 02:21:21,500 --> 02:21:22,310 Hello to you, too. 3030 02:21:22,310 --> 02:21:26,300 And if I say goodbye, I'm talking to my laptop like an idiot, OK. 3031 02:21:26,300 --> 02:21:28,590 Now it's detecting what I'm saying here. 
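A rough sketch of what a first version like the one just demonstrated might look like: it forces the input to lowercase and checks whether a few phrases appear in it. This is not the lecture's actual code; the function name respond and every reply other than "Hello to you, too" are assumptions made for illustration.

```python
def respond(words):
    """Return a canned reply based on phrases found in the user's words."""
    # Lowercase first so that "Hello" and "hello" are treated the same.
    words = words.lower()
    if "hello" in words:
        return "Hello to you, too!"
    elif "how are you" in words:
        return "I am well, thanks!"  # hypothetical reply
    elif "goodbye" in words:
        return "Goodbye to you, too!"  # hypothetical reply
    else:
        return "Huh?"  # hypothetical fallback
```

An interactive version would pass input("Say something: ") to respond and print the result. The in operator here is a plain substring check, so no speech or language library is involved.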
3032 02:21:28,590 --> 02:21:32,130 So this first version of the program is just using some relatively simple, if 3033 02:21:32,130 --> 02:21:36,472 elif elif, and it's just asking for input, forcing it to lowercase. 3034 02:21:36,472 --> 02:21:38,430 And that was my mistake with the first example. 3035 02:21:38,430 --> 02:21:41,360 And then, I'm just checking, is Hello in the user's words? 3036 02:21:41,360 --> 02:21:42,818 Is how are you in the user's words? 3037 02:21:42,818 --> 02:21:44,152 Didn't see that, but it's there. 3038 02:21:44,152 --> 02:21:45,470 Is goodbye in the user's words? 3039 02:21:45,470 --> 02:21:49,280 Now let's do a cooler version, using a library, just by looking at the effect. 3040 02:21:49,280 --> 02:21:51,140 Python of Listen1.py. 3041 02:21:51,140 --> 02:21:55,685 Hello, world. 3042 02:21:55,685 --> 02:21:56,720 Huh. 3043 02:21:56,720 --> 02:22:04,170 Let's do version 2 of this, that uses an audio speech-to-text library. 3044 02:22:04,170 --> 02:22:07,160 Hello, world. 3045 02:22:07,160 --> 02:22:09,710 OK, so now it's artificial intelligence. 3046 02:22:09,710 --> 02:22:11,810 Now let's do something a little more interesting. 3047 02:22:11,810 --> 02:22:15,230 The third version of this program that actually analyzes the words that are 3048 02:22:15,230 --> 02:22:16,880 said. 3049 02:22:16,880 --> 02:22:18,800 Hello, world, my name is David. 3050 02:22:18,800 --> 02:22:19,700 How are you? 3051 02:22:19,700 --> 02:22:22,760 3052 02:22:22,760 --> 02:22:26,000 OK, so that time, it not only analyzed what I said, 3053 02:22:26,000 --> 02:22:27,930 but it plucked my name out of it. 3054 02:22:27,930 --> 02:22:30,480 Let's do two final examples. 3055 02:22:30,480 --> 02:22:33,150 This one will generate a QR code. 3056 02:22:33,150 --> 02:22:35,120 Let me go ahead and write a program called 3057 02:22:35,120 --> 02:22:39,030 QR.py, that very simply does this. 3058 02:22:39,030 --> 02:22:40,820 Let me import a library called OS. 
3059 02:22:40,820 --> 02:22:43,230 Let me import a library called qrcode. 3060 02:22:43,230 --> 02:22:48,000 Let me grab an image here, that's qrcode.make. 3061 02:22:48,000 --> 02:22:51,440 And let me give it the URL of like a lecture video on YouTube, or something 3062 02:22:51,440 --> 02:22:55,040 like that, with this ID. 3063 02:22:55,040 --> 02:22:59,840 Let me just type this, so I don't get it wrong. 3064 02:22:59,840 --> 02:23:05,300 OK, so if I now use this URL here, of a video on YouTube, making 3065 02:23:05,300 --> 02:23:07,812 sure I haven't made any typos, I'm now going 3066 02:23:07,812 --> 02:23:09,770 to go ahead and do two lines of code in Python. 3067 02:23:09,770 --> 02:23:13,460 I'm going to first save that as a file called QR.png, which is 3068 02:23:13,460 --> 02:23:15,490 a two-dimensional barcode, a QR code. 3069 02:23:15,490 --> 02:23:17,240 And, indeed, I'm going to use this format. 3070 02:23:17,240 --> 02:23:23,790 And I'm going to use the os.system function to open QR.png automatically. 3071 02:23:23,790 --> 02:23:26,090 And if you'd like to take out your phone at this point, 3072 02:23:26,090 --> 02:23:32,270 you can see the result of my barcode, that's just been dynamically generated. 3073 02:23:32,270 --> 02:23:33,785 Hopefully from afar that will scan. 3074 02:23:33,785 --> 02:23:37,355 3075 02:23:37,355 --> 02:23:40,150 [UPROAR] 3076 02:23:40,150 --> 02:23:42,460 And I think that's an appropriate line to end on. 3077 02:23:42,460 --> 02:23:43,860 So that's it for CS50. 3078 02:23:43,860 --> 02:23:46,020 We will see you next time. 3079 02:23:46,020 --> 02:23:47,820 [APPLAUSE] 3080 02:23:47,820 --> 02:23:51,470 [MUSIC PLAYING] 3081 02:23:51,470 --> 02:24:25,000