1 00:00:00,000 --> 00:00:02,465 [MUSIC PLAYING] 2 00:00:02,465 --> 00:00:24,287 3 00:00:24,287 --> 00:00:25,370 DAVID J. MALAN: All right. 4 00:00:25,370 --> 00:00:28,730 This is CS50's Introduction to Programming with Python. 5 00:00:28,730 --> 00:00:30,050 My name is David Malan. 6 00:00:30,050 --> 00:00:32,210 And over these past many weeks we have focused 7 00:00:32,210 --> 00:00:36,020 on functions and variables early on, then conditionals, and loops, 8 00:00:36,020 --> 00:00:39,030 and exceptions, a bit of libraries, unit tests, file 9 00:00:39,030 --> 00:00:42,050 I/O, regular expressions, object-oriented programming, 10 00:00:42,050 --> 00:00:43,520 and really, et cetera. 11 00:00:43,520 --> 00:00:45,770 And indeed, that's where we focus today, is 12 00:00:45,770 --> 00:00:48,710 on all the more that you can do with Python and programming 13 00:00:48,710 --> 00:00:52,410 more generally beyond some of those fundamental concepts as well. 14 00:00:52,410 --> 00:00:55,340 In fact, if you start to flip through the documentation for Python 15 00:00:55,340 --> 00:00:59,570 and all of its forms, all of which is as always accessible at docs.python.org, 16 00:00:59,570 --> 00:01:03,770 you'll see additional documentation on Python's own tutorial and library, 17 00:01:03,770 --> 00:01:05,510 its reference, its how-to. 18 00:01:05,510 --> 00:01:09,540 And among all of those various documents as well as others more online, 19 00:01:09,540 --> 00:01:12,530 you'll see that there's some tidbits that we didn't quite touch on. 
20 00:01:12,530 --> 00:01:15,560 And indeed, even though we themed these past several weeks 21 00:01:15,560 --> 00:01:19,430 around fairly broad topics that are rather essential for doing 22 00:01:19,430 --> 00:01:22,160 typical types of problems in Python, it turns out 23 00:01:22,160 --> 00:01:25,460 there's quite a number of other features as well, that we didn't necessarily 24 00:01:25,460 --> 00:01:27,950 touch on, that didn't necessarily fit within any 25 00:01:27,950 --> 00:01:30,560 of those overarching concepts, or might have 26 00:01:30,560 --> 00:01:33,770 been a little too much too soon if we did them too early on in the course. 27 00:01:33,770 --> 00:01:36,320 And so, in today, our final lecture, well, we 28 00:01:36,320 --> 00:01:38,990 focus really on all the more that you can do with Python 29 00:01:38,990 --> 00:01:42,740 and hopefully whet your appetite for teaching yourself all the more too. 30 00:01:42,740 --> 00:01:45,680 For instance, among Python's various data types, 31 00:01:45,680 --> 00:01:49,130 there's this other one that we haven't had occasion to yet use, namely, a set. 32 00:01:49,130 --> 00:01:52,040 In mathematics, a set is typically a collection of values 33 00:01:52,040 --> 00:01:53,820 wherein there are no duplicates. 34 00:01:53,820 --> 00:01:55,170 So it's not quite a list. 35 00:01:55,170 --> 00:01:58,790 It's a bit more special than that in that somehow any duplicates are 36 00:01:58,790 --> 00:01:59,990 eliminated for you. 37 00:01:59,990 --> 00:02:03,200 Well, it turns out within Python, this is an actual data type 38 00:02:03,200 --> 00:02:05,690 that you yourself can use in your code. 39 00:02:05,690 --> 00:02:08,330 And via the documentation here, might you 40 00:02:08,330 --> 00:02:10,580 be able to glean that it's useful for problems 41 00:02:10,580 --> 00:02:13,430 in which you want to somehow automatically filter out duplicates. 42 00:02:13,430 --> 00:02:16,250 So let me go ahead and go over to VS Code here. 
43 00:02:16,250 --> 00:02:20,210 And let me go ahead and show you a file that I created a bit of in advance, 44 00:02:20,210 --> 00:02:23,300 whereby we have a file here called houses.py. 45 00:02:23,300 --> 00:02:26,420 And in houses.py, I already went ahead and whipped up 46 00:02:26,420 --> 00:02:29,990 a big list of students inside of which is 47 00:02:29,990 --> 00:02:33,470 a number of dictionaries, each of which represents a student's name 48 00:02:33,470 --> 00:02:35,310 and house respectively. 49 00:02:35,310 --> 00:02:37,080 Now, this is a pretty sizable list. 50 00:02:37,080 --> 00:02:39,560 And so, it lends itself to iteration over the same. 51 00:02:39,560 --> 00:02:42,810 And suppose that the goal here was quite simply to figure out, 52 00:02:42,810 --> 00:02:46,735 well, what are the unique houses at Hogwarts in the world of Harry Potter? 53 00:02:46,735 --> 00:02:49,610 It would be nice, perhaps, to not have to know these kinds of details 54 00:02:49,610 --> 00:02:50,570 or look them up online. 55 00:02:50,570 --> 00:02:54,500 Here we have a set of students, albeit not exhaustive, with all of the houses. 56 00:02:54,500 --> 00:02:58,550 But among these students here, what are the unique houses in which they live? 57 00:02:58,550 --> 00:03:00,800 Well, I could certainly, as a human, just eyeball this 58 00:03:00,800 --> 00:03:03,650 and tell you that it's, well, Gryffindor, Slytherin, and Ravenclaw. 59 00:03:03,650 --> 00:03:06,620 But how can we go about doing it programmatically for these students 60 00:03:06,620 --> 00:03:07,520 as well? 61 00:03:07,520 --> 00:03:09,360 Well, let's take one approach first here. 62 00:03:09,360 --> 00:03:11,240 Let me go into houses.py. 63 00:03:11,240 --> 00:03:15,170 And let me propose that we first how about create an empty list 64 00:03:15,170 --> 00:03:20,390 called houses in which I'm going to accumulate each of the houses uniquely. 
65 00:03:20,390 --> 00:03:24,920 So every time I iterate through this list of dictionaries, 66 00:03:24,920 --> 00:03:28,830 I'm only going to add a house to this list if I haven't seen it before. 67 00:03:28,830 --> 00:03:30,020 So how do I express that? 68 00:03:30,020 --> 00:03:34,130 Well, let me iterate over all of the students with for student in students, 69 00:03:34,130 --> 00:03:35,550 as we've done in the past. 70 00:03:35,550 --> 00:03:37,200 And let me ask you a question now. 71 00:03:37,200 --> 00:03:40,400 So if the current student's house-- 72 00:03:40,400 --> 00:03:43,550 and notice that I'm indexing into the current student 73 00:03:43,550 --> 00:03:46,580 because I know they are a dictionary or dict object, 74 00:03:46,580 --> 00:03:51,830 and if that student's house is not in my houses list, 75 00:03:51,830 --> 00:03:56,480 then, indented, am I going to say houses.append, 76 00:03:56,480 --> 00:03:58,190 because again, houses is a list. 77 00:03:58,190 --> 00:04:02,120 And I'm going to append that particular house to the list. 78 00:04:02,120 --> 00:04:04,130 Then at the very bottom here, let me go ahead 79 00:04:04,130 --> 00:04:07,880 and do something somewhat interesting here and say, for each of the houses 80 00:04:07,880 --> 00:04:11,360 that I've accumulated in, I could just say houses. 81 00:04:11,360 --> 00:04:14,917 But if I just say houses, what was the point of accumulating them all at once? 82 00:04:14,917 --> 00:04:16,709 I could just do this whole thing in a loop. 83 00:04:16,709 --> 00:04:19,190 Let's at least go about and sort those houses 84 00:04:19,190 --> 00:04:22,550 with sorted, which is going to sort the strings alphabetically. 85 00:04:22,550 --> 00:04:25,520 And let's go ahead therein and print each of the houses. 86 00:04:25,520 --> 00:04:27,260 Let me go ahead now in my terminal window 87 00:04:27,260 --> 00:04:29,715 and run Python of houses.py and hit Enter. 88 00:04:29,715 --> 00:04:30,590 And there we have it. 
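The list-based approach just described can be sketched roughly as follows. The five students below are an assumed stand-in for the lecture's longer houses.py list, chosen only so that the houses appear in the order Gryffindor, Gryffindor, Gryffindor, Slytherin, Ravenclaw.

```python
# Assumed sample data: the lecture's actual houses.py holds a longer list.
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},
    {"name": "Padma", "house": "Ravenclaw"},
]

# Accumulate each house only the first time it is seen.
houses = []
for student in students:
    if student["house"] not in houses:
        houses.append(student["house"])

# sorted() returns a new, alphabetized list; houses itself is unchanged.
for house in sorted(houses):
    print(house)
```

Run as a script, this should print Gryffindor, Ravenclaw, and Slytherin, one per line.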
89 00:04:30,590 --> 00:04:34,220 Gryffindor, Ravenclaw, Slytherin in alphabetical order, 90 00:04:34,220 --> 00:04:37,280 even though in the list of dictionaries up here, 91 00:04:37,280 --> 00:04:40,960 technically the order in which we saw these was Gryffindor, Gryffindor, 92 00:04:40,960 --> 00:04:43,190 Gryffindor, Slytherin, Ravenclaw. 93 00:04:43,190 --> 00:04:46,620 So indeed, my code seems to have sorted them properly. 94 00:04:46,620 --> 00:04:48,110 So this is perfectly fine. 95 00:04:48,110 --> 00:04:50,600 And it's one way of solving this problem. 96 00:04:50,600 --> 00:04:55,070 But it turns out we could use more that's built into the language Python 97 00:04:55,070 --> 00:04:56,690 to solve this problem ourself. 98 00:04:56,690 --> 00:05:00,470 Here I'm rather reinventing a wheel, really the notion of a set 99 00:05:00,470 --> 00:05:02,610 wherein duplicates are eliminated for me. 100 00:05:02,610 --> 00:05:04,580 So let me go ahead and clear my terminal window 101 00:05:04,580 --> 00:05:07,580 and perhaps change the type of object I'm using here. 102 00:05:07,580 --> 00:05:09,650 Instead of a list, which could also be written 103 00:05:09,650 --> 00:05:11,960 like this to create an empty list, let me go ahead 104 00:05:11,960 --> 00:05:15,740 and create an empty set, whereby I call a function called 105 00:05:15,740 --> 00:05:18,500 set that's going to return to me some object in Python 106 00:05:18,500 --> 00:05:21,950 that represents this notion of a set wherein duplicates are automatically 107 00:05:21,950 --> 00:05:22,730 eliminated. 108 00:05:22,730 --> 00:05:24,650 And now, I can tighten up my code. 109 00:05:24,650 --> 00:05:27,370 Because I don't have to use this if condition myself. 110 00:05:27,370 --> 00:05:29,590 I think I can just do something like this. 111 00:05:29,590 --> 00:05:32,650 Inside of my loop, let me do houses.add. 112 00:05:32,650 --> 00:05:35,800 So it's not append for a set, it's append for a list. 
113 00:05:35,800 --> 00:05:39,220 But it's add to a set per the documentation. 114 00:05:39,220 --> 00:05:42,490 Then let me go ahead and add this current student's house. 115 00:05:42,490 --> 00:05:45,110 And now, I think the rest of my code can be the same. 116 00:05:45,110 --> 00:05:48,670 I'm just now trusting per the documentation for set in Python 117 00:05:48,670 --> 00:05:50,860 that it's going to filter out duplicates for me. 118 00:05:50,860 --> 00:05:55,240 And I can just blindly add, add, add, add all of these houses to the set 119 00:05:55,240 --> 00:05:57,850 and any duplicates already there will be gone. 120 00:05:57,850 --> 00:06:00,550 Python of houses.py and Enter. 121 00:06:00,550 --> 00:06:04,480 And voila, we're back in business with just those three there as well. 122 00:06:04,480 --> 00:06:08,890 Let me pause here to see if there's any questions now on this use of set, which 123 00:06:08,890 --> 00:06:11,380 is just another data type that's available to you, 124 00:06:11,380 --> 00:06:14,620 another class in the world of Python that you can reach for when 125 00:06:14,620 --> 00:06:16,750 solving some problem like this. 126 00:06:16,750 --> 00:06:19,330 STUDENT: How can we locate an item in a set, 127 00:06:19,330 --> 00:06:22,148 for example, find Gryffindor in that set? 128 00:06:22,148 --> 00:06:24,190 DAVID J. MALAN: How do you find an item in a set? 129 00:06:24,190 --> 00:06:28,030 You can use very similar syntax as we've done for a list before. 130 00:06:28,030 --> 00:06:34,630 You can use syntax like if Gryffindor in houses then, 131 00:06:34,630 --> 00:06:36,980 and you can answer a question along those lines. 132 00:06:36,980 --> 00:06:40,790 So you can use in and not in and similar functions as well. 133 00:06:40,790 --> 00:06:42,250 Other questions on set? 134 00:06:42,250 --> 00:06:45,730 STUDENT: Look what happens if you have a similar house name? 
135 00:06:45,730 --> 00:06:48,520 Let's say instead of Slytherin, it is maybe 136 00:06:48,520 --> 00:06:52,000 an O instead of an I. Will the for loop loop 137 00:06:52,000 --> 00:06:56,800 throughout each of those letters in the house name? 138 00:06:56,800 --> 00:06:59,270 DAVID J. MALAN: It would compare the strings. 139 00:06:59,270 --> 00:07:01,810 So if Slytherin appears more than once but is 140 00:07:01,810 --> 00:07:04,720 slightly misspelled or capitalized, if I heard you right, 141 00:07:04,720 --> 00:07:08,230 those would appear to be distinct strings. 142 00:07:08,230 --> 00:07:11,420 So you would get both versions of Slytherin in the result. 143 00:07:11,420 --> 00:07:14,500 However, we've seen in the past how we can clean up users' data 144 00:07:14,500 --> 00:07:15,910 if indeed it might be messy. 145 00:07:15,910 --> 00:07:19,030 We could force everything to uppercase, or everything to lowercase, 146 00:07:19,030 --> 00:07:22,270 or we could use capitalize the function built into strs, 147 00:07:22,270 --> 00:07:25,300 or title case that would handle some of the cleanup for us. 148 00:07:25,300 --> 00:07:28,930 In this case, because the data is not coming from humans using the input 149 00:07:28,930 --> 00:07:31,600 function, I wrote the code in advance, it's safer 150 00:07:31,600 --> 00:07:33,760 to assume that I got the houses right. 151 00:07:33,760 --> 00:07:37,000 But that's absolutely a risk if it's coming from users. 152 00:07:37,000 --> 00:07:39,850 Allow me to turn our attention back to some of the other features 153 00:07:39,850 --> 00:07:43,390 here that we can leverage in Python if we dig further into the documentation 154 00:07:43,390 --> 00:07:45,130 and read up more on its features. 
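The set-based version, along with the two student questions above, might look something like this sketch. The student data and the messy list of spellings are assumptions for illustration and do not come from the lecture's file.

```python
# Assumed sample data, standing in for the lecture's longer list.
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},
    {"name": "Padma", "house": "Ravenclaw"},
]

# A set silently ignores values it already contains, so no if is needed.
houses = set()
for student in students:
    houses.add(student["house"])

for house in sorted(houses):
    print(house)

# Membership tests use the same in / not in syntax as lists.
if "Gryffindor" in houses:
    print("Found Gryffindor")

# Strings that differ in case, whitespace, or spelling are distinct values.
# Normalizing removes case and whitespace differences, but not misspellings.
messy = ["Slytherin", "slytherin", "SLYTHERIN ", "Slytheron"]  # hypothetical input
cleaned = set()
for name in messy:
    cleaned.add(name.strip().title())
print(sorted(cleaned))
```

Here the first three spellings collapse into one entry after strip and title, while the misspelled "Slytheron" survives as its own entry, which is the risk described in the answer above.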
155 00:07:45,130 --> 00:07:47,380 Well, in some languages, there's this notion 156 00:07:47,380 --> 00:07:51,520 of global variables, whereby you can define a variable that's either 157 00:07:51,520 --> 00:07:54,280 local to a function, as we've seen many times, 158 00:07:54,280 --> 00:07:58,450 or if you put a variable outside of all of your functions, 159 00:07:58,450 --> 00:08:01,060 perhaps near the top of your file, that would generally 160 00:08:01,060 --> 00:08:03,340 be considered a global variable. 161 00:08:03,340 --> 00:08:06,350 Or in the world of Python, it might be specific to the module. 162 00:08:06,350 --> 00:08:09,610 But for all intents and purposes, it's going to behave for a given program 163 00:08:09,610 --> 00:08:11,080 as though it is global. 164 00:08:11,080 --> 00:08:13,120 However, it turns out that if you do this 165 00:08:13,120 --> 00:08:16,750 when solving some problem down the line, whereby you have multiple functions 166 00:08:16,750 --> 00:08:20,590 and you do have one or more variables that are outside of those functions, 167 00:08:20,590 --> 00:08:26,470 you might not be able to change those variables as easily as you might think. 168 00:08:26,470 --> 00:08:28,930 So indeed, let me go back to VS Code here. 169 00:08:28,930 --> 00:08:32,289 And in just a moment, I'm going to go ahead and create a new file, how about 170 00:08:32,289 --> 00:08:34,419 called bank.py. 171 00:08:34,419 --> 00:08:36,730 Let's go ahead and implement the notion of a bank 172 00:08:36,730 --> 00:08:40,960 wherein we can store things like money in various forms. 173 00:08:40,960 --> 00:08:42,710 And let me go ahead and do this. 174 00:08:42,710 --> 00:08:44,890 Let me go ahead and implement a very simple bank 175 00:08:44,890 --> 00:08:48,550 that simply keeps track of my total balance, the number of dollars or cents 176 00:08:48,550 --> 00:08:50,410 or whatever I might be storing in this bank. 
177 00:08:50,410 --> 00:08:53,560 And I'm going to give myself a variable called balance at the top, which 178 00:08:53,560 --> 00:08:55,970 is an integer, set to zero. 179 00:08:55,970 --> 00:08:58,902 Now let me go ahead and define a main function as we often do. 180 00:08:58,902 --> 00:09:02,110 And inside of my main function, let me go ahead and print out, quote unquote, 181 00:09:02,110 --> 00:09:05,290 balance, and then print out the value of balance itself. 182 00:09:05,290 --> 00:09:10,570 Passing to print, as we've often done, more than one argument so that they get 183 00:09:10,570 --> 00:09:12,520 separated by a single white space. 184 00:09:12,520 --> 00:09:15,340 And now, since I have a main function, really setting the stage 185 00:09:15,340 --> 00:09:17,470 for doing more interesting things soon, let 186 00:09:17,470 --> 00:09:20,860 me go ahead and do our usual if the name of this file 187 00:09:20,860 --> 00:09:25,060 equals equals underscore underscore main, then go ahead and call main. 188 00:09:25,060 --> 00:09:28,450 So this is a terribly short program, but it's perhaps 189 00:09:28,450 --> 00:09:31,630 representative of how you might solve some future problem in Python. 190 00:09:31,630 --> 00:09:34,390 Whereby you have a main function that's going to eventually do 191 00:09:34,390 --> 00:09:35,780 some interesting stuff. 192 00:09:35,780 --> 00:09:38,590 And at the top of your file, you have one or more variables 193 00:09:38,590 --> 00:09:41,590 that are just useful to keep there because then you know where they are. 194 00:09:41,590 --> 00:09:45,560 And perhaps not just main but other functions can access them as well. 195 00:09:45,560 --> 00:09:46,420 So let's see. 196 00:09:46,420 --> 00:09:49,720 When I run this program, Python of bank.py, 197 00:09:49,720 --> 00:09:52,870 I would hope based on my own intuition thus far that I'm going 198 00:09:52,870 --> 00:09:54,880 to see that my current balance is zero. 
199 00:09:54,880 --> 00:09:59,800 That is to say, even though the balance variable is defined on line one, 200 00:09:59,800 --> 00:10:03,550 hopefully I can still print it on line five inside of main, 201 00:10:03,550 --> 00:10:07,120 even though balance was not defined in my main function. 202 00:10:07,120 --> 00:10:07,820 Here we go. 203 00:10:07,820 --> 00:10:08,500 Hitting Enter. 204 00:10:08,500 --> 00:10:10,540 And voila, balance zero. 205 00:10:10,540 --> 00:10:12,010 So it does seem to work. 206 00:10:12,010 --> 00:10:15,730 Even if you declare a variable in Python outside of your functions, 207 00:10:15,730 --> 00:10:17,500 it appears that you can access it. 208 00:10:17,500 --> 00:10:22,870 You can read the value of that variable even inside of a function like main. 209 00:10:22,870 --> 00:10:24,880 Well, let's get a little more adventurous now. 210 00:10:24,880 --> 00:10:27,380 Because this program really isn't solving anyone's problems. 211 00:10:27,380 --> 00:10:29,390 Let's go ahead and implement more of a bank, 212 00:10:29,390 --> 00:10:31,880 like the ability to deposit money into the bank 213 00:10:31,880 --> 00:10:33,573 and to withdraw money from the bank. 214 00:10:33,573 --> 00:10:35,990 Thereby giving me some more functions that might very well 215 00:10:35,990 --> 00:10:37,880 need to access that same variable. 216 00:10:37,880 --> 00:10:39,710 Let me clear my terminal window here. 217 00:10:39,710 --> 00:10:42,440 And let me go ahead and pretend for the moment 218 00:10:42,440 --> 00:10:47,300 that I have the ability to deposit, say, $100 or 100 coins, 219 00:10:47,300 --> 00:10:49,190 whatever the unit of currency is here. 220 00:10:49,190 --> 00:10:51,680 And then, maybe I want to withdraw straight 221 00:10:51,680 --> 00:10:54,620 away 50 of those same dollars or coins. 
222 00:10:54,620 --> 00:10:57,770 And now, let me go ahead and just print out at the bottom of main 223 00:10:57,770 --> 00:11:01,340 what my new balance should be so that in an ideal world, 224 00:11:01,340 --> 00:11:06,470 once I have deposited 100 then withdrawn 50, after starting at 0, 225 00:11:06,470 --> 00:11:11,430 I'd like to think that my new balance on line eight should indeed be 50. 226 00:11:11,430 --> 00:11:11,930 All right. 227 00:11:11,930 --> 00:11:13,700 But I haven't implemented these functions yet. 228 00:11:13,700 --> 00:11:15,620 So let's do that as we've done in the past. 229 00:11:15,620 --> 00:11:18,680 Down here, I'm going to go ahead and define another function deposit. 230 00:11:18,680 --> 00:11:22,640 I'm going to say that it takes an argument called n for a number of coins 231 00:11:22,640 --> 00:11:23,910 or dollars or the like. 232 00:11:23,910 --> 00:11:25,220 And I'm just going to do this. 233 00:11:25,220 --> 00:11:28,280 I'm going to go ahead and say, balance plus equals n, 234 00:11:28,280 --> 00:11:30,230 thereby changing the value of balance. 235 00:11:30,230 --> 00:11:33,680 I could do it more verbosely, balance equals balance plus n. 236 00:11:33,680 --> 00:11:36,980 But I'm going to use the shorter hand notation here instead. 237 00:11:36,980 --> 00:11:38,510 And now, let's implement withdraw. 238 00:11:38,510 --> 00:11:40,730 So define a function called withdraw. 239 00:11:40,730 --> 00:11:43,160 It too is going to take a variable-- an argument 240 00:11:43,160 --> 00:11:45,410 n for number of dollars or coins. 241 00:11:45,410 --> 00:11:48,560 And now, I'm going to go ahead and subtract from balance 242 00:11:48,560 --> 00:11:51,690 using minus equals n as well. 243 00:11:51,690 --> 00:11:55,010 And I'm still going to call main if the name of this file is main. 244 00:11:55,010 --> 00:11:56,370 So what have I done? 
245 00:11:56,370 --> 00:12:00,860 I've just added not just one but three functions total, all of which 246 00:12:00,860 --> 00:12:06,050 apparently need to access balance by printing it, incrementing it, 247 00:12:06,050 --> 00:12:08,670 or decrementing it, as we've seen here. 248 00:12:08,670 --> 00:12:09,170 All right. 249 00:12:09,170 --> 00:12:11,690 Let me go ahead and focus on these three functions here. 250 00:12:11,690 --> 00:12:16,490 Let me go back to my terminal window and run Python of bank.py and hit Enter. 251 00:12:16,490 --> 00:12:17,420 And wow. 252 00:12:17,420 --> 00:12:20,970 Seems like we've introduced some number of problems here. 253 00:12:20,970 --> 00:12:23,010 And what are these problems? 254 00:12:23,010 --> 00:12:28,670 Well, unbound local error is perhaps the first time we've seen this one here. 255 00:12:28,670 --> 00:12:32,720 Local variable balance referenced before assignment. 256 00:12:32,720 --> 00:12:35,720 And that's a bit misleading, definitely confusing. 257 00:12:35,720 --> 00:12:40,700 Because I absolutely assigned balance a value on the top of my code. 258 00:12:40,700 --> 00:12:44,355 And indeed, if I scroll back up, nothing has changed or been lost up there. 259 00:12:44,355 --> 00:12:45,980 It's definitely been assigned a value. 260 00:12:45,980 --> 00:12:49,970 And now on line 12, it would seem, that when deposit is called 261 00:12:49,970 --> 00:12:53,160 I'm just trying to access that variable again. 262 00:12:53,160 --> 00:12:59,610 So intuitively, what might explain this error message, unbound local error? 263 00:12:59,610 --> 00:13:02,630 What is Python telling us there that Python can or can't 264 00:13:02,630 --> 00:13:06,410 do when it comes to these so-called global variables that 265 00:13:06,410 --> 00:13:08,430 are at the top of my file? 266 00:13:08,430 --> 00:13:12,110 STUDENT: So if you want to change this variable, 267 00:13:12,110 --> 00:13:17,090 you should write an inside left function main. 
268 00:13:17,090 --> 00:13:19,740 And the global variable unchangeable. 269 00:13:19,740 --> 00:13:20,615 DAVID J. MALAN: Yeah. 270 00:13:20,615 --> 00:13:21,448 STUDENT: [INAUDIBLE] 271 00:13:21,448 --> 00:13:23,990 DAVID J. MALAN: So if you want to change the value, 272 00:13:23,990 --> 00:13:26,240 it might need to be local to the function. 273 00:13:26,240 --> 00:13:29,660 If you are trying to change a global variable though in a function, 274 00:13:29,660 --> 00:13:31,620 it clearly does not work. 275 00:13:31,620 --> 00:13:33,800 So it's OK to read a global variable. 276 00:13:33,800 --> 00:13:36,270 Read meaning access it and print it and so forth. 277 00:13:36,270 --> 00:13:40,010 But apparently, you can't write to a global variable in the same way 278 00:13:40,010 --> 00:13:41,640 from within one of these functions. 279 00:13:41,640 --> 00:13:42,140 All right. 280 00:13:42,140 --> 00:13:43,700 Well, maybe the fix is to do this. 281 00:13:43,700 --> 00:13:45,740 Let me clear my terminal window and that error. 282 00:13:45,740 --> 00:13:47,210 And maybe I could just do this. 283 00:13:47,210 --> 00:13:48,800 Let's get rid of the global variable. 284 00:13:48,800 --> 00:13:52,460 And let's go ahead and put it, for instance, inside of main. 285 00:13:52,460 --> 00:13:54,140 Might this now work? 286 00:13:54,140 --> 00:13:55,430 Well, let me try this now. 287 00:13:55,430 --> 00:13:59,150 Python of bank.py Enter. 288 00:13:59,150 --> 00:14:00,980 That alone did not solve it. 289 00:14:00,980 --> 00:14:03,320 I still have an unbound local error. 290 00:14:03,320 --> 00:14:06,860 This time though, it's for a different reason. 291 00:14:06,860 --> 00:14:13,350 It turns out now that balance on line two is by definition a local variable. 292 00:14:13,350 --> 00:14:16,670 A local variable is one that exists in the context of a function, at least 293 00:14:16,670 --> 00:14:17,390 in this case. 
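The first failure described above, with balance at the top of the file, can be reproduced with a sketch along these lines. The try and except are additions for illustration so that the script reports the error instead of stopping on a traceback; the lecture's file simply crashes. The exact wording of the error message varies across Python versions, so none is shown here.

```python
balance = 0


def main():
    # Reading the global from inside a function works.
    print("Balance:", balance)
    try:
        deposit(100)
    except UnboundLocalError as error:
        # Assumption: caught here only to show the error without crashing.
        print("UnboundLocalError:", error)
    print("Balance:", balance)


def deposit(n):
    # Because this function assigns to balance, Python treats balance as
    # local to deposit, and the local has no value yet when += reads it.
    balance += n


if __name__ == "__main__":
    main()
```

The global balance is never modified by the failed call, so the second print should still show 0.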
294 00:14:17,390 --> 00:14:20,120 A global variable is the opposite, one that does not, 295 00:14:20,120 --> 00:14:21,900 for instance at the top of my file. 296 00:14:21,900 --> 00:14:24,500 So here is another distinction in Python. 297 00:14:24,500 --> 00:14:27,650 If you declare a variable in a function, like main, 298 00:14:27,650 --> 00:14:29,720 just as I've done on line two with balance, 299 00:14:29,720 --> 00:14:31,760 it is indeed local to that function. 300 00:14:31,760 --> 00:14:36,030 Deposit and withdraw do not have access to that same variable. 301 00:14:36,030 --> 00:14:36,530 Why? 302 00:14:36,530 --> 00:14:38,030 Because it's local to main. 303 00:14:38,030 --> 00:14:41,150 And so, you would think now we're kind of stuck in this vicious cycle. 304 00:14:41,150 --> 00:14:45,120 Well, maybe the solution then is to move balance globally 305 00:14:45,120 --> 00:14:47,030 so all three functions can access it. 306 00:14:47,030 --> 00:14:50,840 But clearly, where we began, as Elena noted, we can't therefore change it. 307 00:14:50,840 --> 00:14:53,930 So it turns out the solution to this problem in Python 308 00:14:53,930 --> 00:14:56,487 is ironically exactly this keyword here. 309 00:14:56,487 --> 00:14:58,820 It's a little different as you might have seen if you've 310 00:14:58,820 --> 00:15:00,380 programmed before in other languages. 311 00:15:00,380 --> 00:15:02,600 But there's indeed a keyword in Python called 312 00:15:02,600 --> 00:15:06,170 global that allows you to tell a function that, hey, this 313 00:15:06,170 --> 00:15:08,030 is not a variable that's local to you. 314 00:15:08,030 --> 00:15:11,460 I mean it to be a global variable that I want you to edit. 315 00:15:11,460 --> 00:15:14,180 So if I go back to VS Code here, clearing my terminal 316 00:15:14,180 --> 00:15:15,810 window to get rid of that error. 
317 00:15:15,810 --> 00:15:17,870 Let me go ahead and undo the change I just made 318 00:15:17,870 --> 00:15:20,450 and put balance back at the top of my file. 319 00:15:20,450 --> 00:15:22,920 But this time, what I'm going to do is I'm 320 00:15:22,920 --> 00:15:28,890 going to inform my two functions that need to change the value of balance, 321 00:15:28,890 --> 00:15:34,420 that it is indeed global, by typing global balance again here as well as 322 00:15:34,420 --> 00:15:34,920 here. 323 00:15:34,920 --> 00:15:36,480 Global balance. 324 00:15:36,480 --> 00:15:40,980 I still leave the same lines of code now on lines 13 and 18, that increment 325 00:15:40,980 --> 00:15:42,090 and decrement balance. 326 00:15:42,090 --> 00:15:46,860 But this use now of the keyword global is a little bit of a clue to Python that, 327 00:15:46,860 --> 00:15:47,610 oh, OK. 328 00:15:47,610 --> 00:15:48,782 It's not a local variable. 329 00:15:48,782 --> 00:15:50,490 This is not a bug that you've introduced. 330 00:15:50,490 --> 00:15:53,860 You mean for me to edit this variable up above. 331 00:15:53,860 --> 00:15:57,750 So now, let me go ahead in my terminal window and run Python of bank.py. 332 00:15:57,750 --> 00:16:03,540 I'm hoping to see that my balance is zero plus 100 minus 50 is 50. 333 00:16:03,540 --> 00:16:04,740 And indeed, it now is. 334 00:16:04,740 --> 00:16:08,370 It starts off at zero per my first print statement on line five. 335 00:16:08,370 --> 00:16:13,140 But it ends up at 50 total below that on line eight. 336 00:16:13,140 --> 00:16:15,480 Let me pause here to see if now there's any questions 337 00:16:15,480 --> 00:16:18,090 on these global or local variables. 338 00:16:18,090 --> 00:16:23,250 STUDENT: What happens when you declare a variable globally, and as 339 00:16:23,250 --> 00:16:26,037 in the same variable globally and in a function? 340 00:16:26,037 --> 00:16:27,370 DAVID J. MALAN: A good question. 
341 00:16:27,370 --> 00:16:29,703 You're always thinking about the so-called corner cases. 342 00:16:29,703 --> 00:16:33,510 So if you declare a variable both globally, like at the top of your file, 343 00:16:33,510 --> 00:16:38,700 and then an identically named variable inside of a function, same name, 344 00:16:38,700 --> 00:16:41,940 the latter will shadow, so to speak, the former. 345 00:16:41,940 --> 00:16:46,110 That is, you'll be able to use the latter, that is the local variable. 346 00:16:46,110 --> 00:16:49,020 But it will have no effect on the global variable. 347 00:16:49,020 --> 00:16:53,260 Temporarily, Python will only know that the local variable exists. 348 00:16:53,260 --> 00:16:56,430 So in general, the rule of thumb is, just don't do that. 349 00:16:56,430 --> 00:16:58,500 Not only might it create bugs in your code 350 00:16:58,500 --> 00:17:01,120 because you don't quite change what you intend to change. 351 00:17:01,120 --> 00:17:05,140 It's also perhaps non-obvious to other readers as well. 352 00:17:05,140 --> 00:17:07,680 Other questions on globals or locals? 353 00:17:07,680 --> 00:17:09,720 STUDENT: OK, what if we decide to add balance 354 00:17:09,720 --> 00:17:11,516 as an argument inside the main function? 355 00:17:11,516 --> 00:17:13,349 DAVID J. MALAN: Yeah, another good instinct. 356 00:17:13,349 --> 00:17:16,530 But in this case, that also is not going to solve the problem. 357 00:17:16,530 --> 00:17:21,510 Because if you pass in a variable like balance to each of the functions 358 00:17:21,510 --> 00:17:24,089 and then change it within that function, it's 359 00:17:24,089 --> 00:17:27,030 only going to be changing in effect a local copy thereof. 360 00:17:27,030 --> 00:17:30,490 It's not going to be changing what's outside of those functions. 361 00:17:30,490 --> 00:17:33,330 So I think we actually need a better way altogether. 
362 00:17:33,330 --> 00:17:36,720 And in fact, allow me to transition to perhaps a modification 363 00:17:36,720 --> 00:17:38,010 of this same program. 364 00:17:38,010 --> 00:17:40,710 Recall that we looked most recently at this notion 365 00:17:40,710 --> 00:17:42,600 of object-oriented programming. 366 00:17:42,600 --> 00:17:47,010 Whereby you can model real world entities, for instance a bank, 367 00:17:47,010 --> 00:17:50,490 and you can model and encapsulate information 368 00:17:50,490 --> 00:17:52,680 about that real world entity, for instance, 369 00:17:52,680 --> 00:17:54,580 like someone's account balance. 370 00:17:54,580 --> 00:17:56,700 So let me propose that we actually do this. 371 00:17:56,700 --> 00:17:58,950 Let me start from scratch with bank.py. 372 00:17:58,950 --> 00:18:01,140 Get rid of the global variable altogether. 373 00:18:01,140 --> 00:18:03,960 And actually use some object-oriented code. 374 00:18:03,960 --> 00:18:08,550 Let me define a class called account to represent someone's bank account. 375 00:18:08,550 --> 00:18:13,740 And then, let me go ahead and initialize with my init method, which 376 00:18:13,740 --> 00:18:17,010 again, takes by convention at least one argument called self. 377 00:18:17,010 --> 00:18:20,790 Let me go ahead and initialize every person's bank account 378 00:18:20,790 --> 00:18:22,650 to some value like zero. 379 00:18:22,650 --> 00:18:23,890 Now, how can I do that? 380 00:18:23,890 --> 00:18:27,540 Well, I'm going to go ahead and do self.balance equals zero. 381 00:18:27,540 --> 00:18:30,210 Thereby giving me an instance variable called 382 00:18:30,210 --> 00:18:33,030 balance initialized for this account to zero. 383 00:18:33,030 --> 00:18:35,490 But I'm going to proactively remember how we also 384 00:18:35,490 --> 00:18:38,250 introduced this notion of properties which might otherwise 385 00:18:38,250 --> 00:18:40,720 collide with the names of my instance variables. 
386 00:18:40,720 --> 00:18:42,690 So just by convention I'm going to do this. 387 00:18:42,690 --> 00:18:45,630 I'm going to rename this instance variable proactively 388 00:18:45,630 --> 00:18:49,740 to underscore balance to effectively indicate that it's private, even 389 00:18:49,740 --> 00:18:51,570 though that's not enforced by Python. 390 00:18:51,570 --> 00:18:53,940 It's just a visual clue to myself that this 391 00:18:53,940 --> 00:18:57,900 is something that really I should not-- or other code should not touch, 392 00:18:57,900 --> 00:18:59,970 just functions in this class. 393 00:18:59,970 --> 00:19:01,540 Now, let me go ahead and do this. 394 00:19:01,540 --> 00:19:03,540 Let me go ahead and define an actual function 395 00:19:03,540 --> 00:19:07,620 called balance that really is going to be a property whose purpose in life 396 00:19:07,620 --> 00:19:10,380 is just to return self._balance. 397 00:19:10,380 --> 00:19:13,200 And I'm going to go explicitly and say this is indeed 398 00:19:13,200 --> 00:19:15,360 a property of this class. 399 00:19:15,360 --> 00:19:18,270 Now, let me go ahead and re-implement those other two functions, 400 00:19:18,270 --> 00:19:21,850 deposit and withdraw, but in the confines of this class. 401 00:19:21,850 --> 00:19:24,900 So I'm going to say, define deposit. 402 00:19:24,900 --> 00:19:27,420 It's going to take in an argument self as always, 403 00:19:27,420 --> 00:19:31,410 but an additional one n, a number of dollars or coins to deposit. 404 00:19:31,410 --> 00:19:33,120 And how do I now manipulate this? 405 00:19:33,120 --> 00:19:37,560 Well, I'm going to do self._balance plus equals n. 406 00:19:37,560 --> 00:19:43,260 And now down here, I'm going to do def withdraw self n, just like for deposit. 407 00:19:43,260 --> 00:19:47,250 But here, I'm going to do self._balance minus equals n.
408 00:19:47,250 --> 00:19:50,430 And now, if I go down below this class, I'm 409 00:19:50,430 --> 00:19:52,620 going to go ahead and define myself a main function 410 00:19:52,620 --> 00:19:54,390 just so I can try this now out. 411 00:19:54,390 --> 00:19:58,530 I'm going to go ahead and create an account object by calling the account 412 00:19:58,530 --> 00:20:02,130 constructor, that is the name of the class with two parentheses 413 00:20:02,130 --> 00:20:04,350 if I'm not passing in any arguments to init. 414 00:20:04,350 --> 00:20:07,830 I'm going to go ahead now and print out as before the balance of my account. 415 00:20:07,830 --> 00:20:12,420 But to do that, I'm going to access the property of that account like this. 416 00:20:12,420 --> 00:20:16,470 And I'm going to go ahead now and say, deposit another $100 or coins 417 00:20:16,470 --> 00:20:18,300 with deposit 100. 418 00:20:18,300 --> 00:20:22,620 And I'm going to go ahead, like before, and also now immediately withdraw 419 00:20:22,620 --> 00:20:25,140 for whatever reason 50 of the same. 420 00:20:25,140 --> 00:20:27,120 And now, I'm going to print one last time 421 00:20:27,120 --> 00:20:31,383 balance followed by account.balance, again, accessing that property. 422 00:20:31,383 --> 00:20:34,050 And for this whole thing to work, of course, I need one of these 423 00:20:34,050 --> 00:20:39,030 if name equals equals underscore main, then go ahead and call main. 424 00:20:39,030 --> 00:20:42,270 Now, before I run this, you'll see that it rather escalated quickly. 425 00:20:42,270 --> 00:20:45,390 I had a very simple goal at hand to implement the notion of a bank. 426 00:20:45,390 --> 00:20:49,530 And I was able to implement that perfectly fine ultimately 427 00:20:49,530 --> 00:20:53,550 by declaring balance to be global but then to tell each of my functions 428 00:20:53,550 --> 00:20:54,990 that it is indeed global. 
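The object-oriented version of bank.py described above can be sketched roughly as follows. The exact text printed in front of each balance is an assumption; the class, property, and method names follow the description.

```python
class Account:
    def __init__(self):
        # Leading underscore: private by convention only, not enforced.
        self._balance = 0

    @property
    def balance(self):
        # Getter: lets callers read account.balance without the underscore.
        return self._balance

    def deposit(self, n):
        self._balance += n

    def withdraw(self, n):
        self._balance -= n


def main():
    account = Account()
    print("Balance:", account.balance)
    account.deposit(100)
    account.withdraw(50)
    print("Balance:", account.balance)


if __name__ == "__main__":
    main()
```

Every method reaches the same `_balance` through `self`, so no `global` keyword is needed.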
429 00:20:54,990 --> 00:20:59,130 But that's not really the best form of encapsulation we have at our disposal 430 00:20:59,130 --> 00:20:59,670 now. 431 00:20:59,670 --> 00:21:02,100 Per our focus on object-oriented programming, 432 00:21:02,100 --> 00:21:05,760 if we're trying to implement some real world entity like an account at a bank, 433 00:21:05,760 --> 00:21:07,770 that's what classes allow us to do. 434 00:21:07,770 --> 00:21:10,380 And it allows us to solve that same problem perhaps a little 435 00:21:10,380 --> 00:21:14,010 more cleanly, certainly if we're going to accumulate more and more functions 436 00:21:14,010 --> 00:21:15,610 or methods over time. 437 00:21:15,610 --> 00:21:19,800 So if I didn't make any mistakes here, if I run Python of bank.py and hit 438 00:21:19,800 --> 00:21:23,550 Enter now, you'll see that it just works just fine. 439 00:21:23,550 --> 00:21:26,490 Because in the world of classes in Python, 440 00:21:26,490 --> 00:21:29,550 these so-called instance variables are by definition 441 00:21:29,550 --> 00:21:33,180 accessible to all of the methods in that class 442 00:21:33,180 --> 00:21:39,240 because we're accessing them all by way of that special parameter self. 443 00:21:39,240 --> 00:21:40,440 So which way to do it? 444 00:21:40,440 --> 00:21:43,260 For a reasonably small script wherein you are simply 445 00:21:43,260 --> 00:21:47,728 trying to implement a script that has some global information, 446 00:21:47,728 --> 00:21:50,520 like an account balance that you then need to manipulate elsewhere, 447 00:21:50,520 --> 00:21:52,930 the global keyword is a solution to that problem. 448 00:21:52,930 --> 00:21:55,890 But generally speaking, in many languages, 449 00:21:55,890 --> 00:21:58,800 Python to some extent among them, using global variables 450 00:21:58,800 --> 00:22:03,030 tends to be frowned upon only because things can get messy quickly. 
451 00:22:03,030 --> 00:22:07,470 And it can become less obvious quickly exactly where your information is 452 00:22:07,470 --> 00:22:10,150 stored, if some of it's up here, some of it's in your function. 453 00:22:10,150 --> 00:22:15,180 So generally, the rule of thumb is to use global variables sparingly. 454 00:22:15,180 --> 00:22:18,390 Though technically speaking, in Python these global variables 455 00:22:18,390 --> 00:22:21,870 are technically local to our module if we were indeed implementing 456 00:22:21,870 --> 00:22:23,830 a library and not just a program. 457 00:22:23,830 --> 00:22:27,010 So in short, try to use global variables sparingly. 458 00:22:27,010 --> 00:22:30,120 But when you do, there is a solution to these same problems. 459 00:22:30,120 --> 00:22:35,280 Questions now on globals or our reimplementation of the same idea 460 00:22:35,280 --> 00:22:40,030 but using full-fledged object-oriented programming? 461 00:22:40,030 --> 00:22:44,163 STUDENT: I just would like to ask, what this property does? 462 00:22:44,163 --> 00:22:45,830 DAVID J. MALAN: What this property does. 463 00:22:45,830 --> 00:22:47,740 So if I go back to VS Code here, you'll see 464 00:22:47,740 --> 00:22:50,530 that this was a technique we looked at in our lecture 465 00:22:50,530 --> 00:22:52,150 on object-oriented programming. 466 00:22:52,150 --> 00:22:57,400 Whereby a property is an instance variable that's somehow protected. 467 00:22:57,400 --> 00:23:01,490 It allows me to control how it can be read and written. 468 00:23:01,490 --> 00:23:04,870 So in this case, I only have what's called generally a setter. 469 00:23:04,870 --> 00:23:05,710 And or sorry. 470 00:23:05,710 --> 00:23:08,372 In this case, I only have what's generally called a getter. 471 00:23:08,372 --> 00:23:10,330 And there's no mention of the word getter here. 472 00:23:10,330 --> 00:23:12,250 This is just what @property means.
473 00:23:12,250 --> 00:23:15,550 That function balance will allow me, recall, 474 00:23:15,550 --> 00:23:18,160 to use syntax like this, where I can pretend 475 00:23:18,160 --> 00:23:22,360 as though balance is indeed with no underscore an instance variable. 476 00:23:22,360 --> 00:23:27,220 But I can now prevent code like mine in main from trying to change balance. 477 00:23:27,220 --> 00:23:29,650 Because I do not have a setter, I would not 478 00:23:29,650 --> 00:23:32,500 be able to do something like account.balance equals 1,000 479 00:23:32,500 --> 00:23:36,370 to just give myself $1,000 or coins because I have not defined a setter. 480 00:23:36,370 --> 00:23:39,460 So again, per our focus on object-oriented programming, 481 00:23:39,460 --> 00:23:42,910 these properties just allow me some finer-grained control. 482 00:23:42,910 --> 00:23:46,750 Some languages allow you to define variables that are, so to speak, 483 00:23:46,750 --> 00:23:47,380 constant. 484 00:23:47,380 --> 00:23:49,870 That is, once you have set a value to them, 485 00:23:49,870 --> 00:23:52,297 you cannot change the value of that variable. 486 00:23:52,297 --> 00:23:54,130 And that tends to be a good thing because it 487 00:23:54,130 --> 00:23:55,810 allows you to program defensively. 488 00:23:55,810 --> 00:23:58,570 Just in case you accidentally, or someone else, 489 00:23:58,570 --> 00:24:01,640 accidentally tries to modify the value of that variable, 490 00:24:01,640 --> 00:24:06,430 if you have declared it in some language as a constant, it cannot be changed, 491 00:24:06,430 --> 00:24:09,490 or usually cannot be changed without great effort. 492 00:24:09,490 --> 00:24:13,150 Unfortunately, in Python, we're again on the sort of honor system here. 493 00:24:13,150 --> 00:24:15,430 Where we have conventions to indicate that something 494 00:24:15,430 --> 00:24:17,260 should be treated as though it's constant.
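To make the getter-only property from the student's question concrete, here is a small sketch, trimmed to just the parts needed. The message printed in the except branch is an assumption for illustration. It also shows the honor-system point: the underscore-prefixed variable can still be changed directly.

```python
class Account:
    def __init__(self):
        self._balance = 0

    @property
    def balance(self):
        # Only a getter is defined, so balance is read-only from outside.
        return self._balance


account = Account()

try:
    # No setter exists, so this assignment raises AttributeError.
    account.balance = 1000
except AttributeError:
    print("cannot assign to balance: no setter defined")

# The underscore is only a convention; Python does not block this.
account._balance = 1000
print(account.balance)
```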
495 00:24:17,260 --> 00:24:19,760 But that's not actually enforced by the language. 496 00:24:19,760 --> 00:24:22,000 So for instance, let me go back here to VS Code. 497 00:24:22,000 --> 00:24:25,240 And let me create a new file, for instance, called meows.py. 498 00:24:25,240 --> 00:24:28,690 And let's see if we can't implement the notion of a cat meowing on the screen. 499 00:24:28,690 --> 00:24:31,510 So I'll do code of meows.py. 500 00:24:31,510 --> 00:24:34,360 And in meows.py, let me go ahead for instance 501 00:24:34,360 --> 00:24:38,120 and implement a very simple program that just has a cat meowing three times. 502 00:24:38,120 --> 00:24:39,280 So how about this. 503 00:24:39,280 --> 00:24:45,260 For i in the range of three, go ahead and print out, quote unquote, meow. 504 00:24:45,260 --> 00:24:45,760 All right. 505 00:24:45,760 --> 00:24:48,580 Well, we've seen in the past how we can clean this up a little bit. 506 00:24:48,580 --> 00:24:50,455 For instance, if I'm not actually using i, 507 00:24:50,455 --> 00:24:52,720 I might as well Pythonically just change the name 508 00:24:52,720 --> 00:24:55,900 of that variable to underscore even though that has no functional 509 00:24:55,900 --> 00:24:56,710 effect here. 510 00:24:56,710 --> 00:25:01,210 But here we have this three randomly hardcoded, that is, 511 00:25:01,210 --> 00:25:02,950 typed explicitly into my code. 512 00:25:02,950 --> 00:25:06,670 And it's totally not a big deal when your code is only two lines. 513 00:25:06,670 --> 00:25:09,610 But imagine that this is a much bigger program with dozens 514 00:25:09,610 --> 00:25:10,930 or even hundreds of lines. 515 00:25:10,930 --> 00:25:14,890 And imagine that one of those lines just has a three in there somewhere. 516 00:25:14,890 --> 00:25:17,050 You're never going to find that three very easily. 
517 00:25:17,050 --> 00:25:18,640 And it's going to be very easily overlooked 518 00:25:18,640 --> 00:25:20,440 by you or colleagues or others that you've 519 00:25:20,440 --> 00:25:25,460 hardcoded some magic value like a three right there in your code. 520 00:25:25,460 --> 00:25:28,540 So it tends to be best practice, not just in Python but other languages 521 00:25:28,540 --> 00:25:31,660 as well, any time you have what is essentially a constant, 522 00:25:31,660 --> 00:25:34,030 like a number three that shouldn't ever change, 523 00:25:34,030 --> 00:25:37,760 is to at least let it bubble up, surface it to the top of your code, 524 00:25:37,760 --> 00:25:41,380 so that it's just obvious what your code's constant values are. 525 00:25:41,380 --> 00:25:42,910 And so, by that I mean this. 526 00:25:42,910 --> 00:25:44,860 At the top of this file, it would probably 527 00:25:44,860 --> 00:25:48,370 be a little clearer to colleagues, and frankly, me tomorrow 528 00:25:48,370 --> 00:25:52,150 after I've forgotten what I did today, to define a variable like meows 529 00:25:52,150 --> 00:25:53,650 and set it equal to three. 530 00:25:53,650 --> 00:25:57,340 And then, instead of hardcoding three here or even lower 531 00:25:57,340 --> 00:26:01,750 in a much bigger program, let me just go ahead and pass in that variable's value 532 00:26:01,750 --> 00:26:02,570 to my loop. 533 00:26:02,570 --> 00:26:04,630 So that now it's just kind of obvious to me 534 00:26:04,630 --> 00:26:06,940 that meows is apparently the number of times to meow. 535 00:26:06,940 --> 00:26:10,300 And if I ever want to change it, the only code I have to change 536 00:26:10,300 --> 00:26:11,643 is at the very top of my file. 537 00:26:11,643 --> 00:26:14,560 I don't need to go fishing around or figure out what's going to break, 538 00:26:14,560 --> 00:26:15,610 what do I need to change. 539 00:26:15,610 --> 00:26:19,330 I just know that I can change these constants up at the top. 
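As a sketch of the idea above, the hardcoded three can be lifted into a name at the top of meows.py. The name is written in capitals here, which is the common convention for signaling a constant; Python does not enforce it.

```python
MEOWS = 3  # number of times to meow; change it here and nowhere else

for _ in range(MEOWS):
    print("meow")
```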
540 00:26:19,330 --> 00:26:23,050 The problem though with Python is that Python doesn't actually 541 00:26:23,050 --> 00:26:24,640 make variables constant. 542 00:26:24,640 --> 00:26:27,640 It's indeed a convention in Python and some other languages 543 00:26:27,640 --> 00:26:31,390 to at least capitalize your variables when you want to indicate to the world 544 00:26:31,390 --> 00:26:33,370 that you should not touch this. 545 00:26:33,370 --> 00:26:34,720 It is constant. 546 00:26:34,720 --> 00:26:38,067 But there is literally nothing in my code preventing me from saying, 547 00:26:38,067 --> 00:26:38,650 you know what? 548 00:26:38,650 --> 00:26:41,440 Today I feel like four meows instead. 549 00:26:41,440 --> 00:26:42,460 That would work. 550 00:26:42,460 --> 00:26:44,260 In other languages though there's typically 551 00:26:44,260 --> 00:26:47,200 a keyword or some other mechanism syntactically that 552 00:26:47,200 --> 00:26:51,940 would allow you to prevent line three currently from executing. 553 00:26:51,940 --> 00:26:55,510 So that when you try to run your code, you would actually get an error message 554 00:26:55,510 --> 00:26:57,770 explicitly saying, you cannot do that. 555 00:26:57,770 --> 00:27:00,370 So Python, again, is a bit more on the honor system 556 00:27:00,370 --> 00:27:02,680 when it comes to these conventions instead. 557 00:27:02,680 --> 00:27:06,190 Now, it turns out there's other types of constants, quote unquote, 558 00:27:06,190 --> 00:27:07,932 that Python typically manifests. 559 00:27:07,932 --> 00:27:10,640 And in fact, let me go ahead and change this around a little bit. 560 00:27:10,640 --> 00:27:12,140 Let me delete this version of meows. 561 00:27:12,140 --> 00:27:15,430 And let me introduce, again, a class from our discussion 562 00:27:15,430 --> 00:27:18,070 of object-oriented programming, like a class representing 563 00:27:18,070 --> 00:27:20,710 a cat, another real-world entity. 
564 00:27:20,710 --> 00:27:23,650 Recall that within classes, you can have not just 565 00:27:23,650 --> 00:27:26,170 instance variables but class variables. 566 00:27:26,170 --> 00:27:30,430 That is variables inside of the class that aren't inside of self, per se, 567 00:27:30,430 --> 00:27:33,490 but they're accessible to all of the methods inside of that class. 568 00:27:33,490 --> 00:27:36,340 Here too, there's a convention but not enforced 569 00:27:36,340 --> 00:27:40,840 by Python of having class constants, whereby inside of the class, 570 00:27:40,840 --> 00:27:45,220 you might want to have a variable that should, should, should not be changed. 571 00:27:45,220 --> 00:27:49,040 But you just want to indicate that visually by capitalizing its name. 572 00:27:49,040 --> 00:27:51,820 So for instance, if the default number of meows for a cat 573 00:27:51,820 --> 00:27:55,480 is meant to be three, I can literally inside of my class 574 00:27:55,480 --> 00:27:58,360 but outside of any of my defined methods just 575 00:27:58,360 --> 00:28:02,390 create a class variable all capitalized with that same value. 576 00:28:02,390 --> 00:28:05,500 And then, if I want to create a method, like meow, 577 00:28:05,500 --> 00:28:09,710 for instance, which as an instance method might take in self as we know. 578 00:28:09,710 --> 00:28:13,130 And then, I might have my loop here for underscore in the range of-- 579 00:28:13,130 --> 00:28:14,380 and now I need to access this. 580 00:28:14,380 --> 00:28:17,980 The convention would be to say cat.meows to make clear 581 00:28:17,980 --> 00:28:22,570 that I want the meows variable that's associated with the class called cat. 582 00:28:22,570 --> 00:28:25,780 Then I'm going to go ahead and print out one of these meows. 583 00:28:25,780 --> 00:28:29,230 And now, at the bottom of my code, outside of the class, let me go ahead 584 00:28:29,230 --> 00:28:30,380 and do something like this. 
585 00:28:30,380 --> 00:28:33,340 Let me instantiate a cat using the cat constructor. 586 00:28:33,340 --> 00:28:34,690 Notice this is important. 587 00:28:34,690 --> 00:28:38,710 Per our discussion of OOP, the class is capitalized by convention. 588 00:28:38,710 --> 00:28:41,360 But the variable over here is lowercase. 589 00:28:41,360 --> 00:28:44,030 And I could call it just C or anything else. 590 00:28:44,030 --> 00:28:47,440 But I kind of like the symmetry of calling it little cat here and big cat, 591 00:28:47,440 --> 00:28:48,730 so to speak, over here. 592 00:28:48,730 --> 00:28:51,550 And now, if I want this particular cat to meow 593 00:28:51,550 --> 00:28:55,660 that default number of three times, I can just do cat.meow like this. 594 00:28:55,660 --> 00:29:01,990 And that method meow is going to, per line five, access that class constant. 595 00:29:01,990 --> 00:29:04,390 But again, it's constant only in the fact-- 596 00:29:04,390 --> 00:29:08,770 only in the sense that you should not touch that, not that it's actually 597 00:29:08,770 --> 00:29:11,500 going to be enforced by the language. 598 00:29:11,500 --> 00:29:15,040 Let me go ahead then and run this with Python of meows.py. 599 00:29:15,040 --> 00:29:18,250 And there it is, three of our meows, meow, meow. 600 00:29:18,250 --> 00:29:22,480 It turns out that Python is a dynamically typed language. 601 00:29:22,480 --> 00:29:24,520 That is to say, it's not strongly typed. 602 00:29:24,520 --> 00:29:27,520 Whereby when you want an int, you have to tell the program 603 00:29:27,520 --> 00:29:28,870 that you are using an int. 604 00:29:28,870 --> 00:29:32,350 You don't have to tell the program that you are using a str, or a float, 605 00:29:32,350 --> 00:29:34,150 or a set, or anything else. 606 00:29:34,150 --> 00:29:37,450 Generally speaking, to date, you and I, when we're creating variables, 607 00:29:37,450 --> 00:29:39,190 we just give a variable a name. 
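The Cat class with its class constant, as described a little earlier, might look roughly like this. It is a sketch that follows the spoken description; the constant is constant by convention only.

```python
class Cat:
    # Class variable, capitalized to signal that it should not be changed.
    MEOWS = 3

    def meow(self):
        # Access the constant through the class name to make its origin clear.
        for _ in range(Cat.MEOWS):
            print("meow")


cat = Cat()  # lowercase variable, capitalized class
cat.meow()
```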
608 00:29:39,190 --> 00:29:42,850 We frequently assign it using in the equal sign some other value. 609 00:29:42,850 --> 00:29:44,950 And honestly, Python just kind of dynamically 610 00:29:44,950 --> 00:29:47,590 figures out what type of variable it is. 611 00:29:47,590 --> 00:29:51,790 If it's, quote unquote Hello, world, the variable is going to be a str. 612 00:29:51,790 --> 00:29:55,820 If it's 50, the integer the variable is going to be an int. 613 00:29:55,820 --> 00:30:00,310 Now, in other languages, including C, and C++, and Java, and others, 614 00:30:00,310 --> 00:30:05,050 it's sometimes necessary for the programmer to specify what types 615 00:30:05,050 --> 00:30:07,810 of variables you want something to be. 616 00:30:07,810 --> 00:30:11,710 The upside of that is that it helps you detect bugs more readily. 617 00:30:11,710 --> 00:30:15,760 Because if you intend for a variable to store a string or an integer, 618 00:30:15,760 --> 00:30:19,900 but you accidentally store an integer or a string, the opposite, or something 619 00:30:19,900 --> 00:30:24,583 else altogether, your language can detect that kind of mistake for you. 620 00:30:24,583 --> 00:30:26,500 When you go, for instance, to run the program, 621 00:30:26,500 --> 00:30:28,420 it can say, no, you've made a mistake. 622 00:30:28,420 --> 00:30:31,840 And you can fix that before your actual users detect as much. 623 00:30:31,840 --> 00:30:35,770 In Python too here, it's again, more of a friendly environment 624 00:30:35,770 --> 00:30:38,680 where you can provide hints to Python itself 625 00:30:38,680 --> 00:30:41,230 as to what type a variable should be. 626 00:30:41,230 --> 00:30:44,620 But the language itself does not strongly enforce these. 627 00:30:44,620 --> 00:30:48,820 Rather, you can use a tool that will tell you whether or not 628 00:30:48,820 --> 00:30:50,620 you're using a variable correctly. 
629 00:30:50,620 --> 00:30:53,440 But it's typically a tool you would run as the programmer 630 00:30:53,440 --> 00:30:56,140 before you actually release your code to the world. 631 00:30:56,140 --> 00:30:59,770 Or if you have some kind of automated process, you can run this kind of tool 632 00:30:59,770 --> 00:31:03,730 just like you could reformat or lint your code with some other program 633 00:31:03,730 --> 00:31:06,470 before you actually release it to the world. 634 00:31:06,470 --> 00:31:09,220 So how might we go about using these so-called type hints? 635 00:31:09,220 --> 00:31:12,700 Well, they're documented in the usual place in Python's own documentation. 636 00:31:12,700 --> 00:31:14,500 And it turns out there's a program that's 637 00:31:14,500 --> 00:31:17,950 pretty popular for checking whether or not your code is 638 00:31:17,950 --> 00:31:19,990 adhering to your own type hints. 639 00:31:19,990 --> 00:31:22,052 And that program here is called mypy. 640 00:31:22,052 --> 00:31:23,260 And it's just one of several. 641 00:31:23,260 --> 00:31:26,020 But this one is particularly popular and can be easily installed 642 00:31:26,020 --> 00:31:29,380 in the usual way with pip install mypy. 643 00:31:29,380 --> 00:31:31,900 And its own documentation is at this URL here. 644 00:31:31,900 --> 00:31:36,160 But we'll use it quite simply to check whether or not our variables are indeed 645 00:31:36,160 --> 00:31:37,660 using the right types. 646 00:31:37,660 --> 00:31:39,550 So how can we go about doing this? 647 00:31:39,550 --> 00:31:43,150 All right, let me go back here to VS Code, clear my terminal window, 648 00:31:43,150 --> 00:31:46,180 and in fact erase meows.py as it currently was. 649 00:31:46,180 --> 00:31:49,240 And let's implement a different version of meows that quite simply 650 00:31:49,240 --> 00:31:52,303 has a function called meow that does the actual meowing on the screen.
651 00:31:52,303 --> 00:31:54,970 And then, I'm just going to go ahead and call that function down 652 00:31:54,970 --> 00:31:55,690 toward the bottom. 653 00:31:55,690 --> 00:31:58,357 I'm not going to bother with a main function just for simplicity 654 00:31:58,357 --> 00:32:01,090 so that we can focus as always only on what's new. 655 00:32:01,090 --> 00:32:03,940 So here we are defining a function called meow. 656 00:32:03,940 --> 00:32:08,920 It's going to take a number of times to meow, for instance n for number. 657 00:32:08,920 --> 00:32:11,170 And inside of this function, I'm going to do 658 00:32:11,170 --> 00:32:16,430 my usual for underscore in the range of n go ahead and print, quote unquote, 659 00:32:16,430 --> 00:32:16,930 meow. 660 00:32:16,930 --> 00:32:19,900 So based on our earlier code, I think this is correct. 661 00:32:19,900 --> 00:32:22,158 I've not bothered defining the variable as i. 662 00:32:22,158 --> 00:32:24,950 I'm instead using the underscore because I'm not using it anywhere. 663 00:32:24,950 --> 00:32:27,580 But I think I now have a working function whose purpose in life 664 00:32:27,580 --> 00:32:31,400 is to meow zero, or one, or two, or three, or more times. 665 00:32:31,400 --> 00:32:34,073 Well, let's use this function, again, not bothering with main. 666 00:32:34,073 --> 00:32:37,240 I'm just going to keep my function at the very top because there's only one. 667 00:32:37,240 --> 00:32:40,292 And I'm going to write my code here on line six. 668 00:32:40,292 --> 00:32:41,500 So I'm going to give myself-- 669 00:32:41,500 --> 00:32:43,692 I'm going to ask the user for a number. 670 00:32:43,692 --> 00:32:45,400 And I'm going to go ahead and prompt them 671 00:32:45,400 --> 00:32:48,730 in the usual way for that number of times to meow. 672 00:32:48,730 --> 00:32:52,870 And now, I'm going to go ahead and call meow on that number. 673 00:32:52,870 --> 00:32:56,080 Now, some of you might see what I've already done wrong. 
674 00:32:56,080 --> 00:32:57,460 But perhaps I myself don't. 675 00:32:57,460 --> 00:33:01,330 So let me go into my terminal window and run Python of meows.py, 676 00:33:01,330 --> 00:33:03,220 the goal being to prompt me. 677 00:33:03,220 --> 00:33:04,460 This seems to be working. 678 00:33:04,460 --> 00:33:05,860 I'm going to type in three. 679 00:33:05,860 --> 00:33:10,690 And I would expect now the meow function to print out meow three times. 680 00:33:10,690 --> 00:33:11,800 Enter. 681 00:33:11,800 --> 00:33:12,850 But no. 682 00:33:12,850 --> 00:33:17,200 There's some kind of type error here. str object cannot be interpreted 683 00:33:17,200 --> 00:33:18,760 as an integer. 684 00:33:18,760 --> 00:33:21,590 Why might that be? 685 00:33:21,590 --> 00:33:23,530 Why might that be? 686 00:33:23,530 --> 00:33:28,230 STUDENT: Because the input function returns a string instead of an integer. 687 00:33:28,230 --> 00:33:29,230 DAVID J. MALAN: Exactly. 688 00:33:29,230 --> 00:33:32,530 The input function returns a string or a str, not an int. 689 00:33:32,530 --> 00:33:35,530 So in the past, of course, our solution to this problem 690 00:33:35,530 --> 00:33:39,250 has just been to convert the string to an int by using the int function. 691 00:33:39,250 --> 00:33:42,010 But now, let me start programming more defensively 692 00:33:42,010 --> 00:33:46,660 so that honestly, I don't even find myself in this situation at all. 693 00:33:46,660 --> 00:33:48,260 Let me go ahead and do this. 694 00:33:48,260 --> 00:33:51,760 Let me add what's called a type hint to my function that 695 00:33:51,760 --> 00:33:56,890 explicitly specifies for meow what type of variable should be passed in. 
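The bug just diagnosed can be reproduced in a small sketch. To keep it runnable without a keyboard, the string "3" stands in for what `input` would return; that substitution is an assumption made here, not part of the lecture's code.

```python
def meow(n):
    for _ in range(n):
        print("meow")


# input() always returns a str, so typing 3 yields the string "3".
number = "3"

try:
    meow(number)  # range() cannot accept a str
except TypeError as error:
    print("TypeError:", error)

# The usual fix: convert the str to an int before using it.
meow(int(number))
```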
696 00:33:56,890 --> 00:33:59,830 I'm going to go ahead now and change the very first line of my code 697 00:33:59,830 --> 00:34:05,020 and my function to specify that n colon should be an int 698 00:34:05,020 --> 00:34:07,420 and this is a type hint, the fact that I've 699 00:34:07,420 --> 00:34:10,929 added a colon, a space, and the word int is not creating 700 00:34:10,929 --> 00:34:12,920 another int or anything like that. 701 00:34:12,920 --> 00:34:17,170 It's just a hint, an annotation, so to speak to Python, 702 00:34:17,170 --> 00:34:22,190 that this variable on the left called n should be an int. 703 00:34:22,190 --> 00:34:25,330 Now, unfortunately, Python itself doesn't care. 704 00:34:25,330 --> 00:34:28,150 Because again, these type hints are not enforced by the language. 705 00:34:28,150 --> 00:34:29,650 And that's by design. 706 00:34:29,650 --> 00:34:31,540 The language itself and the community prefers 707 00:34:31,540 --> 00:34:34,719 that Python be dynamically typed, not so strongly typed 708 00:34:34,719 --> 00:34:36,760 as to require these things to be true. 709 00:34:36,760 --> 00:34:41,360 But if I run meows.py, type in three again, the same error is there. 710 00:34:41,360 --> 00:34:46,989 But let me go about trying this mypy program, an example of a program that 711 00:34:46,989 --> 00:34:48,530 understands type hints. 712 00:34:48,530 --> 00:34:53,500 And if I run it proactively myself can find bugs like this in my code 713 00:34:53,500 --> 00:34:58,030 before I, or worse, a user, actually runs and encounters something cryptic 714 00:34:58,030 --> 00:34:59,770 like this type error here. 715 00:34:59,770 --> 00:35:05,260 Let me clear my terminal window and this time run mypy space meows.py. 716 00:35:05,260 --> 00:35:10,300 So I'm going to run mypy on my program, but I'm not running Python itself. 717 00:35:10,300 --> 00:35:13,280 When I hit Enter, we'll see this. 718 00:35:13,280 --> 00:35:13,780 All right. 
719 00:35:13,780 --> 00:35:17,770 We see now that mypy found apparently an error on line seven. 720 00:35:17,770 --> 00:35:23,080 Error, argument one to meow has incompatible type str expected int. 721 00:35:23,080 --> 00:35:24,670 So it's still an error message. 722 00:35:24,670 --> 00:35:27,880 But mypy is not a program that my users would use. 723 00:35:27,880 --> 00:35:31,240 This is a program that you and I as programmers would use. 724 00:35:31,240 --> 00:35:35,200 And because we have run this code now before we, for instance, released 725 00:35:35,200 --> 00:35:39,400 this program to the world, I can now see even before the code is called or run, 726 00:35:39,400 --> 00:35:43,730 oh, I seem to be using my argument to meow wrong. 727 00:35:43,730 --> 00:35:45,880 I had better fix this somehow. 728 00:35:45,880 --> 00:35:49,030 Well, I can actually go about adding 729 00:35:49,030 --> 00:35:53,650 type hints even to my own variables here so as to catch this another way too. 730 00:35:53,650 --> 00:35:57,640 If I know on line six that I'm creating already a variable called number, 731 00:35:57,640 --> 00:36:02,560 and I know already that I'm assigning it equal to the return value of input, 732 00:36:02,560 --> 00:36:07,120 I could give mypy and tools like it another hint and say, you know what? 733 00:36:07,120 --> 00:36:10,510 This variable called number should also be an int. 734 00:36:10,510 --> 00:36:14,650 That is to say, if I now start getting into the habit of annotating 735 00:36:14,650 --> 00:36:17,740 all of my variables and arguments to functions, 736 00:36:17,740 --> 00:36:20,530 maybe mypy can actually help me find things 737 00:36:20,530 --> 00:36:24,430 quite quickly as well before I get to the point of running Python itself. 738 00:36:24,430 --> 00:36:26,050 Let's go ahead and try this again. 739 00:36:26,050 --> 00:36:28,930 Mypy of meows.py and hit Enter.
740 00:36:28,930 --> 00:36:31,870 And this time, notice that mypy actually found 741 00:36:31,870 --> 00:36:34,120 the mistake a little more quickly. 742 00:36:34,120 --> 00:36:38,860 Notice this time it found on line six that, error, incompatible types 743 00:36:38,860 --> 00:36:42,820 in assignment expression has type str, variable has type int. 744 00:36:42,820 --> 00:36:45,940 So before I even got to the point of calling meow, 745 00:36:45,940 --> 00:36:50,350 line six, via this type hint, when used and analyzed by mypy 746 00:36:50,350 --> 00:36:52,060 has helped me find, oh, wait a minute. 747 00:36:52,060 --> 00:36:55,060 I shouldn't be assigning the return value of input 748 00:36:55,060 --> 00:36:57,920 to my variable called number in the first place. 749 00:36:57,920 --> 00:36:58,420 Why? 750 00:36:58,420 --> 00:37:01,460 Mypy has just pointed out to me that one returns a str. 751 00:37:01,460 --> 00:37:02,890 I'm expecting an int. 752 00:37:02,890 --> 00:37:05,002 Let me fix this now instead. 753 00:37:05,002 --> 00:37:06,460 So let me clear my terminal window. 754 00:37:06,460 --> 00:37:08,380 And now, let me do what most of you were probably 755 00:37:08,380 --> 00:37:11,380 thinking I should have done in the first place after all of these weeks. 756 00:37:11,380 --> 00:37:15,580 But now, let me go ahead and convert the return value of input to an integer. 757 00:37:15,580 --> 00:37:17,950 For today's purposes, I'm not going to try 758 00:37:17,950 --> 00:37:19,733 to catch any exceptions or the like. 759 00:37:19,733 --> 00:37:22,400 We're just going to assume that the user types this in properly. 760 00:37:22,400 --> 00:37:26,830 And now, let me go ahead and run mypy of meows.py, 761 00:37:26,830 --> 00:37:31,450 having not only added two type hints to my argument, to my function, 762 00:37:31,450 --> 00:37:33,940 to my variable down here on line six. 763 00:37:33,940 --> 00:37:36,620 And I've also now fixed the problem itself.
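Putting the two type hints and the fix together, meows.py might end up looking something like this. As an assumption made so the sketch runs unattended, the string "3" again stands in for the user's input; the lecture's version calls `input` with a prompt instead.

```python
def meow(n: int):
    # n: int is only an annotation; it tells tools like mypy what to expect.
    for _ in range(n):
        print("meow")


# In the lecture this is int(input(...)); "3" stands in for the typed text.
number: int = int("3")
meow(number)
```

A checker can then be run on the file from the terminal with `mypy meows.py`, separately from running the program with Python.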
764 00:37:36,620 --> 00:37:38,090 Let me go ahead and run mypy. 765 00:37:38,090 --> 00:37:41,480 And success, no issues found in one source file. 766 00:37:41,480 --> 00:37:44,900 Now, it's more reasonable for me to go and run something 767 00:37:44,900 --> 00:37:49,280 like Python of meows and just trust that when I type in three, at least 768 00:37:49,280 --> 00:37:51,560 I'm not going to get a type error. 769 00:37:51,560 --> 00:37:54,680 That is, I didn't mess up as a programmer with respect 770 00:37:54,680 --> 00:37:56,100 to the types of my variables. 771 00:37:56,100 --> 00:37:56,600 Why? 772 00:37:56,600 --> 00:37:58,642 Because when I wrote the code in the first place, 773 00:37:58,642 --> 00:38:00,620 I provided these annotations, these hints 774 00:38:00,620 --> 00:38:04,820 that inform tools like mypy that my intention had better 775 00:38:04,820 --> 00:38:07,730 line up with what the actual code does. 776 00:38:07,730 --> 00:38:13,160 Let me pause here and see if there's now any questions on type hints or mypy. 777 00:38:13,160 --> 00:38:17,030 STUDENT: Is it common or how common is it for those to be used? 778 00:38:17,030 --> 00:38:20,960 Or is it just that it's more used in more complex code 779 00:38:20,960 --> 00:38:26,150 where it's more difficult to ensure that you're actually 780 00:38:26,150 --> 00:38:28,900 using the correct type in the way that you're using the variables? 781 00:38:28,900 --> 00:38:30,442 DAVID J. MALAN: It's a good question. 782 00:38:30,442 --> 00:38:32,000 And it's rather a matter of opinion. 
783 00:38:32,000 --> 00:38:36,230 Python was designed to be a little more versatile and flexible when 784 00:38:36,230 --> 00:38:39,980 it comes to some of these details, partly for writeability, to make 785 00:38:39,980 --> 00:38:42,407 it easier and faster to write code, partly for performance 786 00:38:42,407 --> 00:38:44,240 so that the program like Python doesn't have 787 00:38:44,240 --> 00:38:45,710 to bother checking these kinds of details, 788 00:38:45,710 --> 00:38:47,420 we can just get right into the code. 789 00:38:47,420 --> 00:38:50,990 The reality, though, is that strong type checks 790 00:38:50,990 --> 00:38:54,120 do tend to be a good thing for the correctness of your code. 791 00:38:54,120 --> 00:38:54,620 Why? 792 00:38:54,620 --> 00:38:58,970 Because programs like mypy can find, before your code is even run, 793 00:38:58,970 --> 00:39:01,430 if there's already known to be an error. 794 00:39:01,430 --> 00:39:04,280 And it tends to be good for defensive programming. 795 00:39:04,280 --> 00:39:09,560 So the situation essentially is that within the Python ecosystem, 796 00:39:09,560 --> 00:39:12,260 you can annotate your types in this way. 797 00:39:12,260 --> 00:39:14,900 You can use tools to use those type hints. 798 00:39:14,900 --> 00:39:17,630 But to date, Python itself does not enforce 799 00:39:17,630 --> 00:39:20,270 or expect to enforce these conventions. 800 00:39:20,270 --> 00:39:23,420 In larger code bases, in professional code bases, 801 00:39:23,420 --> 00:39:27,548 commercial code bases, probably depending on the project manager 802 00:39:27,548 --> 00:39:29,840 or depending on the engineering team they may very well 803 00:39:29,840 --> 00:39:31,620 want themselves to be using type hints. 804 00:39:31,620 --> 00:39:32,120 Why? 805 00:39:32,120 --> 00:39:34,760 If it just decreases the probability of bugs. 
806 00:39:34,760 --> 00:39:39,260 In fact, let me propose now that I-- imagine a situation where 807 00:39:39,260 --> 00:39:45,020 instead of expecting that meow prints meow, meow, meow some number of times, 808 00:39:45,020 --> 00:39:48,800 suppose that I accidentally assume that the meow function 809 00:39:48,800 --> 00:39:51,650 just returns meow some number of times. 810 00:39:51,650 --> 00:39:53,990 We saw, for instance when focusing on unit tests 811 00:39:53,990 --> 00:39:57,260 that it tends to be a good thing to have functions that return values, 812 00:39:57,260 --> 00:40:00,770 be it an int or a string, rather than just having some side effect 813 00:40:00,770 --> 00:40:02,460 like printing things out themselves. 814 00:40:02,460 --> 00:40:04,370 So perhaps I'm still in that mindset. 815 00:40:04,370 --> 00:40:07,160 And I've just assumed mistakenly for the moment 816 00:40:07,160 --> 00:40:11,810 that meow returns a value, like meow, or meow meow, or meow meow meow, 817 00:40:11,810 --> 00:40:15,620 a big string of some number of meows, rather than just printing it itself, 818 00:40:15,620 --> 00:40:18,080 as it clearly does at the moment on line three. 819 00:40:18,080 --> 00:40:21,770 And therefore, suppose that I accidentally did something like this. 820 00:40:21,770 --> 00:40:27,230 Rather than just getting the number and passing it to meow, suppose I did this. 821 00:40:27,230 --> 00:40:31,850 Suppose I declared a number of-- a new variable called meows, 822 00:40:31,850 --> 00:40:34,730 the type of which I think should be str. 823 00:40:34,730 --> 00:40:37,490 And suppose, again, I assume accidentally 824 00:40:37,490 --> 00:40:42,170 that meow returns to me a string of those meows so that I myself 825 00:40:42,170 --> 00:40:43,430 can then print them later. 826 00:40:43,430 --> 00:40:47,070 This would be a little more conducive, arguably, to testing my meow function. 827 00:40:47,070 --> 00:40:47,570 Why? 
828 00:40:47,570 --> 00:40:51,380 Because I could expect that it's returning meow, or meow meow, 829 00:40:51,380 --> 00:40:53,600 or meow meow meow, separated by new lines, 830 00:40:53,600 --> 00:40:56,990 returning a str that I could then assert equals 831 00:40:56,990 --> 00:40:59,875 what I expect it to be in something like a unit test. 832 00:40:59,875 --> 00:41:02,000 I'm not going to bother writing any unit tests now. 833 00:41:02,000 --> 00:41:04,880 But let's just suppose that's the mindset I'm now in. 834 00:41:04,880 --> 00:41:08,060 And so, on line seven I'm assuming that I 835 00:41:08,060 --> 00:41:12,860 want to assign the return value of meow to a new variable called meows which 836 00:41:12,860 --> 00:41:16,340 I've annotated with this type hint as being a str, 837 00:41:16,340 --> 00:41:18,210 just so we can see another variable. 838 00:41:18,210 --> 00:41:20,990 This one is not an int but a str instead. 839 00:41:20,990 --> 00:41:25,340 Well, let me go ahead and run this code now, Python of meows.py, 840 00:41:25,340 --> 00:41:27,170 Enter, typing in three. 841 00:41:27,170 --> 00:41:29,300 And you'll see a curious bug. 842 00:41:29,300 --> 00:41:31,490 Meow meow meow none. 843 00:41:31,490 --> 00:41:33,000 Well, why is that? 844 00:41:33,000 --> 00:41:37,310 Well, it turns out at the moment, my meow function only has a side effect. 845 00:41:37,310 --> 00:41:39,650 It just prints out meow some number of times. 846 00:41:39,650 --> 00:41:42,080 It doesn't explicitly return a value as it 847 00:41:42,080 --> 00:41:44,810 would if there were literally the return keyword there. 848 00:41:44,810 --> 00:41:47,270 By default then, when a function in Python 849 00:41:47,270 --> 00:41:50,870 does not explicitly return a value, its implicit return value 850 00:41:50,870 --> 00:41:52,340 is in effect none. 
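The bug just demonstrated can be reproduced with a sketch like this one. The prompt text and the main() wrapper are assumptions made for illustration, so line numbers differ from those mentioned in the lecture.

```python
def meow(n: int):
    for _ in range(n):
        print("meow")  # side effect only; there is no return statement


def main() -> None:
    number: int = int(input("Number: "))
    # Mistaken assumption: that meow() hands back a string of meows.
    # Because meow() never returns anything explicitly, meows is None.
    meows: str = meow(number)
    print(meows)  # prints the word None after the meows


# Call main() to try it interactively.
```

Typing 3 at the prompt prints meow three times from inside the function and then None from the final print.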
851 00:41:52,340 --> 00:41:55,850 And so, what we're seeing here is this-- on line eight, 852 00:41:55,850 --> 00:42:01,970 because I'm assigning the return value of meow, which is none, 853 00:42:01,970 --> 00:42:08,660 to my meows variable, line three is what's still printing meow meow meow. 854 00:42:08,660 --> 00:42:12,290 And line eight is what's now incorrectly printing none. 855 00:42:12,290 --> 00:42:16,430 Because I accidentally thought that meow returns a value, but it doesn't. 856 00:42:16,430 --> 00:42:18,770 So its return value is effectively none. 857 00:42:18,770 --> 00:42:22,920 So I'm printing very weirdly the word none at the bottom. 858 00:42:22,920 --> 00:42:26,180 So how could I go about catching this kind of mistake too? 859 00:42:26,180 --> 00:42:27,510 I might make this mistake. 860 00:42:27,510 --> 00:42:29,690 But maybe with less frequency if I'm in the habit 861 00:42:29,690 --> 00:42:34,700 of annotating my code with this new feature called type hints. 862 00:42:34,700 --> 00:42:36,520 What you can do here is this. 863 00:42:36,520 --> 00:42:39,060 Let me clear my terminal window to get rid of that artifact. 864 00:42:39,060 --> 00:42:43,860 And up here, let me additionally specify with some funny looking syntax 865 00:42:43,860 --> 00:42:48,840 that my meow function actually by design returns none. 866 00:42:48,840 --> 00:42:51,300 So you literally use this arrow notation. 867 00:42:51,300 --> 00:42:56,400 In Python when hinting what the return value of a function is, 868 00:42:56,400 --> 00:42:57,750 you would do this. 869 00:42:57,750 --> 00:43:02,760 After the parentheses, a space, a hyphen, a greater than symbol, 870 00:43:02,760 --> 00:43:06,270 like an arrow, and then another space, and then the type of the return value. 871 00:43:06,270 --> 00:43:07,800 For now, it's indeed going to-- 872 00:43:07,800 --> 00:43:10,080 [SWALLOWS] excuse me, return none. 873 00:43:10,080 --> 00:43:12,990 But now, at least I can catch it like this. 
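As a small illustration of the arrow notation just described, here is one way the annotated function might be written. The variable name result is hypothetical and exists only to show what the call evaluates to.

```python
def meow(n: int) -> None:
    # "-> None" declares that this function returns nothing useful.
    for _ in range(n):
        print("meow")


# A type checker can now flag code that uses the result of meow(),
# because the annotation says there is no value to use.
result = meow(2)       # still runs: prints meow twice as a side effect
print(result is None)  # the call itself evaluates to None
```

The annotation changes nothing about how Python runs the code; it only gives tools like mypy enough information to notice the mistaken assignment.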
874 00:43:12,990 --> 00:43:17,010 If I now run not Python but mypy on my code, which would be a habit 875 00:43:17,010 --> 00:43:18,960 I'm now getting into if using type hints. 876 00:43:18,960 --> 00:43:22,770 Check that I'm using all of my types correctly before I even run my program. 877 00:43:22,770 --> 00:43:27,930 We'll see that now mypy has found on line seven that meow, quote unquote, 878 00:43:27,930 --> 00:43:29,770 does not return a value. 879 00:43:29,770 --> 00:43:34,530 And mypy knows that because I have proactively annotated my meow function 880 00:43:34,530 --> 00:43:37,840 as having none as its return value. 881 00:43:37,840 --> 00:43:39,720 So now, mypy can detect that. 882 00:43:39,720 --> 00:43:41,610 I should now realize, oh, wait a minute. 883 00:43:41,610 --> 00:43:44,220 I'm being foolish here. 884 00:43:44,220 --> 00:43:46,230 Meow clearly does not return a value. 885 00:43:46,230 --> 00:43:49,350 I should not be treating it like it does on line seven. 886 00:43:49,350 --> 00:43:52,200 Let me go about actually fixing this now. 887 00:43:52,200 --> 00:43:53,970 So how do I go about fixing this? 888 00:43:53,970 --> 00:43:56,700 Well, let's practice what we preached in our focus on unit tests, 889 00:43:56,700 --> 00:44:00,240 having a function like meow not have side effects like printing itself. 890 00:44:00,240 --> 00:44:02,860 But let's have it return the actual string. 891 00:44:02,860 --> 00:44:04,890 And I can actually do this kind of cleanly. 892 00:44:04,890 --> 00:44:08,560 Let me clear my error message in my terminal window here. 893 00:44:08,560 --> 00:44:10,110 Let me get rid of the loop here. 894 00:44:10,110 --> 00:44:14,010 Let me say this time that OK, fine, meow is going to return 895 00:44:14,010 --> 00:44:16,770 a value, an actual str or string. 896 00:44:16,770 --> 00:44:18,640 So I've changed none to str.
897 00:44:18,640 --> 00:44:20,980 And now, I can implement this in any number of ways, 898 00:44:20,980 --> 00:44:21,990 maybe even using a loop. 899 00:44:21,990 --> 00:44:25,260 But recall that we have this syntax in Python, which will, I think, 900 00:44:25,260 --> 00:44:27,010 solve this problem for us. 901 00:44:27,010 --> 00:44:32,640 If I want to return a string of n meows, what I can actually do, 902 00:44:32,640 --> 00:44:33,540 recall, is this. 903 00:44:33,540 --> 00:44:39,540 Return quote unquote meow, backslash n, times that number n. 904 00:44:39,540 --> 00:44:41,970 So it's kind of a clever one-liner, avoids 905 00:44:41,970 --> 00:44:44,640 the need for a for loop or something more involved than that, 906 00:44:44,640 --> 00:44:50,730 to just say, multiply meow backslash n against itself three times, or n times, 907 00:44:50,730 --> 00:44:55,710 in this case, in general, so that I get back a big string of zero meows, one, 908 00:44:55,710 --> 00:44:58,530 two, three, or many more meows instead. 909 00:44:58,530 --> 00:45:01,810 I think now my code on line six is actually correct. 910 00:45:01,810 --> 00:45:04,410 Now I've changed meow to behave the way I was 911 00:45:04,410 --> 00:45:06,610 pretending to assume it always worked. 912 00:45:06,610 --> 00:45:11,910 So I'm storing in meows plural a variable that's of type str. 913 00:45:11,910 --> 00:45:16,830 Because now, meow does have a return value of type str itself 914 00:45:16,830 --> 00:45:20,440 per this type hint as well. 915 00:45:20,440 --> 00:45:20,940 All right. 916 00:45:20,940 --> 00:45:22,830 Let me go ahead now and print meows. 917 00:45:22,830 --> 00:45:27,840 But because each of my meows comes with a trailing new line, the backslash n, 918 00:45:27,840 --> 00:45:30,750 I'm going to proactively fix what would be a minor aesthetic bug. 919 00:45:30,750 --> 00:45:33,630 And I'm just going to avoid outputting an extra new line 920 00:45:33,630 --> 00:45:35,250 at the end of those three. 
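Putting the pieces above together, the revised program might look roughly like this. The prompt text and the main() wrapper are assumptions; the one-line return and the suppressed trailing newline follow the narration.

```python
def meow(n: int) -> str:
    # Multiplying a str by an int repeats the string that many times,
    # so this yields n lines of "meow" (or an empty string when n is 0).
    return "meow\n" * n


def main() -> None:
    number: int = int(input("Number: "))
    meows: str = meow(number)
    # Each meow already ends in "\n", so suppress print's own newline.
    print(meows, end="")


# Call main() to try it interactively.
```

Because meow now returns a value instead of printing, it can be checked directly, for instance by comparing meow(3) against the expected three-line string in a unit test.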
921 00:45:35,250 --> 00:45:41,190 So if I run Python of meows.py now, type in three, there's my meow meow meow. 922 00:45:41,190 --> 00:45:45,450 And now, no mention of none. 923 00:45:45,450 --> 00:45:50,430 Questions now on type hints, and these annotations in mypy, 924 00:45:50,430 --> 00:45:55,170 and using them to defensively write code that just decreases hopefully 925 00:45:55,170 --> 00:45:57,450 the probability of your own bugs? 926 00:45:57,450 --> 00:46:04,230 STUDENT: Is the return three set have double quotes that 927 00:46:04,230 --> 00:46:10,020 have meow slash n, why the program don't take it as a [? string? ?] 928 00:46:10,020 --> 00:46:13,230 DAVID J. MALAN: Why does the program not take it as a-- as strange? 929 00:46:13,230 --> 00:46:14,370 STUDENT: [INAUDIBLE], yeah. 930 00:46:14,370 --> 00:46:17,160 DAVID J. MALAN: So recall that early on in the class 931 00:46:17,160 --> 00:46:19,800 we looked at plus as a concatenation operator that 932 00:46:19,800 --> 00:46:22,050 allows you to join a string on the left and the right. 933 00:46:22,050 --> 00:46:26,820 Multiplication is also an overloaded operator for strings. 934 00:46:26,820 --> 00:46:30,870 Whereby if you have a string on the left and an int on the right, 935 00:46:30,870 --> 00:46:35,250 it will multiply the string, so to speak, by concatenating or joining 936 00:46:35,250 --> 00:46:38,350 that many meows all together. 937 00:46:38,350 --> 00:46:40,770 So this is a feature of object-oriented programming 938 00:46:40,770 --> 00:46:45,540 and an operator overloading as we saw it in the past. 939 00:46:45,540 --> 00:46:49,290 Other questions on type hints or mypy. 940 00:46:49,290 --> 00:46:54,468 STUDENT: Can we not typecast this data type of this variable number? 941 00:46:54,468 --> 00:46:55,260 DAVID J. MALAN: No. 942 00:46:55,260 --> 00:46:57,300 You still-- and let me correct the terminology. 
943 00:46:57,300 --> 00:47:00,870 It wouldn't be called typecasting in this context because it's not like C 944 00:47:00,870 --> 00:47:03,450 or C++ where there's an equivalence between these types. 945 00:47:03,450 --> 00:47:07,980 You're technically converting on line five a str to an int. 946 00:47:07,980 --> 00:47:09,900 You do still have to do this. 947 00:47:09,900 --> 00:47:12,060 Because mypy, for instance, would yell at you 948 00:47:12,060 --> 00:47:15,850 if you were trying to assign a str on the right to an int on the left. 949 00:47:15,850 --> 00:47:18,420 You must still use the int function. 950 00:47:18,420 --> 00:47:20,290 Int itself is still a function. 951 00:47:20,290 --> 00:47:21,330 It's not a type hint. 952 00:47:21,330 --> 00:47:26,340 But the word int is being used in another way now in these type hints. 953 00:47:26,340 --> 00:47:30,300 So this int is still a function call as it always has been. 954 00:47:30,300 --> 00:47:33,690 This syntax on the left is another use of the name int 955 00:47:33,690 --> 00:47:35,680 but in the form of these type hints. 956 00:47:35,680 --> 00:47:39,170 So you still have to do the conversion yourself. 957 00:47:39,170 --> 00:47:39,670 All right. 958 00:47:39,670 --> 00:47:44,050 Let me propose that we transition to another feature of Python that's worth 959 00:47:44,050 --> 00:47:47,200 knowing, especially since it's one that you'll see in the wild 960 00:47:47,200 --> 00:47:49,810 when you see code or libraries that other folks have written, 961 00:47:49,810 --> 00:47:53,500 namely something known as a doc string or documentation string. 962 00:47:53,500 --> 00:47:55,450 It turns out in the world of Python there 963 00:47:55,450 --> 00:47:59,680 is a standardized way per another PEP, Python Enhancement Proposal, this one 964 00:47:59,680 --> 00:48:05,830 257, that essentially standardizes how you should document your functions 965 00:48:05,830 --> 00:48:07,940 among other aspects of your code.
966 00:48:07,940 --> 00:48:12,790 And so, for instance, let me go back to my meows.py file here. 967 00:48:12,790 --> 00:48:16,000 And let me propose that we now start documenting this code 968 00:48:16,000 --> 00:48:18,550 too so that I know what the meow function does. 969 00:48:18,550 --> 00:48:23,170 And in fact, the standard way of doing this using doc string notation 970 00:48:23,170 --> 00:48:24,490 would be as follows. 971 00:48:24,490 --> 00:48:27,460 To comment this function, not above it, as you 972 00:48:27,460 --> 00:48:31,520 might be in the habit of doing with code in general, but actually inside of it. 973 00:48:31,520 --> 00:48:35,260 But instead of commenting it like this with the usual hash comment sign, 974 00:48:35,260 --> 00:48:40,840 like meow n times, it turns out that when you're formally docking-- 975 00:48:40,840 --> 00:48:45,020 when you're formally documenting a function like meow in this case, 976 00:48:45,020 --> 00:48:48,310 you don't use regular inline comments, so to speak. 977 00:48:48,310 --> 00:48:50,440 You use this syntax instead. 978 00:48:50,440 --> 00:48:54,760 You use triple quotation marks, either double or single. 979 00:48:54,760 --> 00:48:58,270 Then you write out your comment, meow n times. 980 00:48:58,270 --> 00:49:01,040 And then you write the same again at the end. 981 00:49:01,040 --> 00:49:03,370 So either three double quotes at the start and the end 982 00:49:03,370 --> 00:49:05,710 or three single quotes at the start and the end. 983 00:49:05,710 --> 00:49:08,380 And Python has built into it certain tools 984 00:49:08,380 --> 00:49:13,420 and certain assumptions that if it detects that there is a comment using 985 00:49:13,420 --> 00:49:16,780 this doc string format triple quotes on the left and the right, 986 00:49:16,780 --> 00:49:20,210 it will assume that that's indeed the documentation for that function. 
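A minimal sketch of the doc string form just described follows. The final print line is an addition for illustration: it relies on the standard __doc__ attribute, which is where Python stores a function's doc string.

```python
def meow(n: int) -> str:
    """Meow n times."""
    return "meow\n" * n


# The string literal on the first line of the function body is kept
# by Python as the function's documentation.
print(meow.__doc__)
```

The built-in help function reads the same attribute, so help(meow) would show this text alongside the function's signature.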
987 00:49:20,210 --> 00:49:23,770 And it turns out in the Python ecosystem, there's a lot of tools 988 00:49:23,770 --> 00:49:28,300 that you can then use to analyze your code automatically, extract 989 00:49:28,300 --> 00:49:31,750 all of these doc strings for you, and even generate 990 00:49:31,750 --> 00:49:36,530 web pages or PDFs of documentation for your own functions. 991 00:49:36,530 --> 00:49:39,970 So there's these conventions via which if you adhere to them 992 00:49:39,970 --> 00:49:43,270 you can start documenting your code for other people 993 00:49:43,270 --> 00:49:46,900 by generating automatically the documentation from your own code 994 00:49:46,900 --> 00:49:49,480 without writing something up from scratch manually. 995 00:49:49,480 --> 00:49:53,650 Now, it turns out if your function does take arguments 996 00:49:53,650 --> 00:49:56,830 and perhaps does a bit more, there are multiple conventions for how 997 00:49:56,830 --> 00:49:59,365 you can document for the human programmers that 998 00:49:59,365 --> 00:50:01,990 might be using your function, whether it's you, or a colleague, 999 00:50:01,990 --> 00:50:05,320 or someone else on the internet, to actually use these doc 1000 00:50:05,320 --> 00:50:08,270 strings to standardize the information therein. 1001 00:50:08,270 --> 00:50:10,180 So you might see this instead. 1002 00:50:10,180 --> 00:50:13,960 Using these same triple quotes above and below now, 1003 00:50:13,960 --> 00:50:19,420 you might see your one-sentence explanation of the function, meows-- 1004 00:50:19,420 --> 00:50:21,520 meow n times. 1005 00:50:21,520 --> 00:50:23,710 Sometimes depending on the style and use, 1006 00:50:23,710 --> 00:50:27,220 it might actually still be on the first line but with a blank line below it. 1007 00:50:27,220 --> 00:50:29,560 But I'll keep everything uniformly indented.
1008 00:50:29,560 --> 00:50:34,000 And this is a convention used by some popular Python documentation tools 1009 00:50:34,000 --> 00:50:34,660 as well. 1010 00:50:34,660 --> 00:50:39,850 You would say syntax like this-- param n colon, and then a description of what n 1011 00:50:39,850 --> 00:50:42,520 is, number of times to meow. 1012 00:50:42,520 --> 00:50:47,890 Then colon type n colon int, which just indicates that the type of n 1013 00:50:47,890 --> 00:50:49,010 is an integer. 1014 00:50:49,010 --> 00:50:53,050 Then, if this function could actually raise an exception, 1015 00:50:53,050 --> 00:50:54,400 you can document that too. 1016 00:50:54,400 --> 00:50:56,770 And actually, it's not really-- 1017 00:50:56,770 --> 00:50:58,660 well, it's arguably my mistake here. 1018 00:50:58,660 --> 00:51:01,990 If n comes in as an argument and is not, in fact, an int, 1019 00:51:01,990 --> 00:51:04,810 maybe it's a float, or a string, or something else, 1020 00:51:04,810 --> 00:51:08,060 the multiplication sign here is not going to work. 1021 00:51:08,060 --> 00:51:09,760 It's not going to multiply the string. 1022 00:51:09,760 --> 00:51:13,040 It's going to trigger what I know from experience to be a type error. 1023 00:51:13,040 --> 00:51:16,240 So I'm going to go ahead and proactively say in my own documentation 1024 00:51:16,240 --> 00:51:20,830 that this function, technically, if you use it wrong could raise a type error, 1025 00:51:20,830 --> 00:51:23,740 even though I'm hinting up here with this annotation 1026 00:51:23,740 --> 00:51:25,660 that you should pass in an int. 1027 00:51:25,660 --> 00:51:27,440 Again, Python doesn't enforce that. 1028 00:51:27,440 --> 00:51:29,500 So if you pass in a float, this function might, in fact, 1029 00:51:29,500 --> 00:51:31,520 raise a type error. 1030 00:51:31,520 --> 00:51:35,740 And so, that might happen if n is not an int.
1031 00:51:35,740 --> 00:51:39,730 And then, lastly, I might say for clarity's sake for other programmers, 1032 00:51:39,730 --> 00:51:44,630 this function returns a string of n meows, one per line. 1033 00:51:44,630 --> 00:51:49,150 And the return type of that value, rtype, is going to be str. 1034 00:51:49,150 --> 00:51:54,410 Now, all of this syntax here as I've used it is not Python per se. 1035 00:51:54,410 --> 00:51:56,740 This is a convention known as reStructuredText, which 1036 00:51:56,740 --> 00:51:58,930 is a markdown-like language that's 1037 00:51:58,930 --> 00:52:02,600 used for documentation, for websites, for blogs, and even more. 1038 00:52:02,600 --> 00:52:06,400 But it's one of the popular conventions within the world of Python 1039 00:52:06,400 --> 00:52:08,570 to document your own functions. 1040 00:52:08,570 --> 00:52:12,460 So this does not have anything to do fundamentally with type hints. 1041 00:52:12,460 --> 00:52:15,280 Type hints are a feature of Python. 1042 00:52:15,280 --> 00:52:19,840 What I'm doing here is just adhering to a third party convention 1043 00:52:19,840 --> 00:52:24,580 of putting in between a Python doc string from the start 1044 00:52:24,580 --> 00:52:29,470 to the end a certain standard format so that these third party tools can 1045 00:52:29,470 --> 00:52:31,960 analyze my code for me top to bottom, left to right, 1046 00:52:31,960 --> 00:52:35,260 and ideally generate documentation for me. 1047 00:52:35,260 --> 00:52:38,590 It can generate a PDF, a web page, or something else, 1048 00:52:38,590 --> 00:52:41,730 so that I or my colleagues don't need to not only write 1049 00:52:41,730 --> 00:52:45,240 code but also manually create documentation for our code. 1050 00:52:45,240 --> 00:52:50,880 We can keep everything together and use tools to generate the same for us.
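The fields dictated above could be assembled into a doc string along these lines. The exact wording of each description is reconstructed from the narration and should be read as approximate; the field names follow the reStructuredText style used by documentation tools such as Sphinx.

```python
def meow(n: int) -> str:
    """
    Meow n times.

    :param n: Number of times to meow
    :type n: int
    :raise TypeError: If n is not an int
    :return: A string of n meows, one per line
    :rtype: str
    """
    # Multiplying a str by a float or by another str raises TypeError,
    # which is the failure the doc string warns about.
    return "meow\n" * n
```

None of the text inside the triple quotes is checked by Python; it is there for human readers and for documentation generators, while the annotations in the signature are what a tool like mypy inspects.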
1051 00:52:50,880 --> 00:52:55,050 Any questions now on these doc strings, which 1052 00:52:55,050 --> 00:52:59,010 again are a convention of documenting your own code often 1053 00:52:59,010 --> 00:53:01,487 following some standard syntax? 1054 00:53:01,487 --> 00:53:02,070 STUDENT: Yeah. 1055 00:53:02,070 --> 00:53:06,150 So when you say you would document it and put it in a PDF, 1056 00:53:06,150 --> 00:53:10,300 is the purpose of doing this to publish it and share your function 1057 00:53:10,300 --> 00:53:12,105 so other users can use it? 1058 00:53:12,105 --> 00:53:13,230 DAVID J. MALAN: Absolutely. 1059 00:53:13,230 --> 00:53:16,620 In the past, when we have installed some third party libraries, for instance, 1060 00:53:16,620 --> 00:53:18,240 cowsay a few weeks back. 1061 00:53:18,240 --> 00:53:20,952 Recall that I showed you what functions it had. 1062 00:53:20,952 --> 00:53:23,160 But if you read the documentation, you might actually 1063 00:53:23,160 --> 00:53:27,350 see that it was documented for us by the author of that program. 1064 00:53:27,350 --> 00:53:29,850 Now, I don't believe they were using this particular syntax. 1065 00:53:29,850 --> 00:53:31,920 But it was definitely useful for you and me 1066 00:53:31,920 --> 00:53:35,850 to be able to read some web page or PDF telling us how to use the library 1067 00:53:35,850 --> 00:53:38,850 rather than wasting time reading through someone else's code 1068 00:53:38,850 --> 00:53:42,040 and trying to infer what functions exist and how to use them. 1069 00:53:42,040 --> 00:53:45,090 It just tends to be much more developer-friendly to have 1070 00:53:45,090 --> 00:53:49,360 proper documentation for our own code or libraries as well. 1071 00:53:49,360 --> 00:53:50,430 Other questions? 1072 00:53:50,430 --> 00:53:51,450 STUDENT: Yeah. 1073 00:53:51,450 --> 00:53:55,350 When with doc strings, when it's used to generate a PDF or whatever, 1074 00:53:55,350 --> 00:53:57,970 does it include any of the code? 
1075 00:53:57,970 --> 00:54:01,740 So if you're referencing in your comment, 1076 00:54:01,740 --> 00:54:04,740 if you're referencing the code in the comment itself and 1077 00:54:04,740 --> 00:54:07,110 might not make sense without seeing the code. 1078 00:54:07,110 --> 00:54:09,843 Does it-- do these include it? 1079 00:54:09,843 --> 00:54:11,760 DAVID J. MALAN: Short answer, you can do that. 1080 00:54:11,760 --> 00:54:13,420 Not in the convention I'm using here. 1081 00:54:13,420 --> 00:54:18,330 But there's actually a clever way to write in your doc strings 1082 00:54:18,330 --> 00:54:23,320 sample inputs to your functions and sample outputs for your functions. 1083 00:54:23,320 --> 00:54:25,950 And if you use a different tool that we've not discussed, 1084 00:54:25,950 --> 00:54:29,670 that tool will run your code using those sample inputs. 1085 00:54:29,670 --> 00:54:32,890 It will check that your outputs match your sample outputs. 1086 00:54:32,890 --> 00:54:36,090 And if not, the program will yell at you saying, you've got a bug somewhere. 1087 00:54:36,090 --> 00:54:38,550 So this is just another way where you can 1088 00:54:38,550 --> 00:54:44,040 use doc strings to not only document but even catch errors in your code. 1089 00:54:44,040 --> 00:54:44,910 This has been a lot. 1090 00:54:44,910 --> 00:54:46,170 And there's a bit more to go. 1091 00:54:46,170 --> 00:54:48,503 Why don't we go ahead here and take a five-minute break. 1092 00:54:48,503 --> 00:54:51,660 And when we resume, we'll take a look at yet another feature of Python, yet 1093 00:54:51,660 --> 00:54:53,950 another library to write code faster. 1094 00:54:53,950 --> 00:54:54,450 All right. 1095 00:54:54,450 --> 00:54:58,770 Suppose we want to modify this meows program to actually take its input not 1096 00:54:58,770 --> 00:55:02,070 from the input function in the blinking prompt but from the command line. 
1097 00:55:02,070 --> 00:55:04,862 Recall in our discussion of libraries, that you could use something 1098 00:55:04,862 --> 00:55:07,752 like sys.argv to get at command line arguments 1099 00:55:07,752 --> 00:55:10,210 that a human has provided when you're running your program. 1100 00:55:10,210 --> 00:55:12,060 So why don't we whip up a version of meow 1101 00:55:12,060 --> 00:55:15,280 that uses command line arguments instead of, again, input. 1102 00:55:15,280 --> 00:55:18,030 So I'm going to go ahead and delete what we've done here thus far. 1103 00:55:18,030 --> 00:55:21,540 And let me propose that we import sys as we've done in the past. 1104 00:55:21,540 --> 00:55:22,870 And let's do this. 1105 00:55:22,870 --> 00:55:27,720 How about if the user does not type any command line arguments. 1106 00:55:27,720 --> 00:55:29,910 Then my program will just meow once, just so that it 1107 00:55:29,910 --> 00:55:32,070 does something visually interesting. 1108 00:55:32,070 --> 00:55:35,310 Otherwise, let's also give the user an option 1109 00:55:35,310 --> 00:55:39,000 to specify how many times I want the cat to meow. 1110 00:55:39,000 --> 00:55:40,230 So let's start simple. 1111 00:55:40,230 --> 00:55:42,480 Let's first of all go ahead and do this. 1112 00:55:42,480 --> 00:55:47,790 If the length of sys.argv equals equals one, 1113 00:55:47,790 --> 00:55:52,050 that is the user only typed the name of the program and nothing else after 1114 00:55:52,050 --> 00:55:52,950 the-- 1115 00:55:52,950 --> 00:55:58,350 in their command, then let's go ahead and just print out one meow like this. 1116 00:55:58,350 --> 00:56:02,200 Else for now, let's go ahead and print out something like this. 1117 00:56:02,200 --> 00:56:04,980 Else go ahead and print out, let's say, usage 1118 00:56:04,980 --> 00:56:08,220 for the program, which will be usage of meows.py, 1119 00:56:08,220 --> 00:56:12,360 just so that the user knows that the program itself is called meows.py. 
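A rough sketch of this first command-line version is shown below. Wrapping the logic in main() is an assumption made so the snippet can be loaded without side effects; the checks themselves follow the narration.

```python
import sys


def main() -> None:
    if len(sys.argv) == 1:
        # Only the script name was given, so meow once by default.
        print("meow")
    else:
        # Anything else is unsupported for now: remind the user of the usage.
        print("usage: meows.py")


# Call main() when running this as a script, e.g. python meows.py
```

Here sys.argv[0] holds the script name, so a length of one means no extra arguments were typed.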
1120 00:56:12,360 --> 00:56:16,410 Now let me go down to my terminal window and start to type Python of meows.py. 1121 00:56:16,410 --> 00:56:20,070 And at this point, notice that the length of sys.argv 1122 00:56:20,070 --> 00:56:21,300 should indeed be one. 1123 00:56:21,300 --> 00:56:21,900 Why? 1124 00:56:21,900 --> 00:56:25,500 Well, Python the name doesn't end up in sys.argv at all ever. 1125 00:56:25,500 --> 00:56:28,470 But meows.py, the name of the file does. 1126 00:56:28,470 --> 00:56:31,320 And it's going to go in sys.argv zero. 1127 00:56:31,320 --> 00:56:32,592 But that's only one element. 1128 00:56:32,592 --> 00:56:34,050 So the length of this thing is one. 1129 00:56:34,050 --> 00:56:35,470 There's nothing more to the right. 1130 00:56:35,470 --> 00:56:38,610 So when I hit Enter now we should see, indeed, one meow. 1131 00:56:38,610 --> 00:56:43,530 If I don't cooperate, suppose I do something like meows three Enter. 1132 00:56:43,530 --> 00:56:46,800 Then I'm going to see a reminder that this is how you use the program. 1133 00:56:46,800 --> 00:56:49,050 And this is a common convention to literally print out 1134 00:56:49,050 --> 00:56:52,140 the word usage, a colon, then the name of the program, 1135 00:56:52,140 --> 00:56:54,240 and maybe some explanation of how to use it. 1136 00:56:54,240 --> 00:56:56,070 So I'm keeping it very simple. 1137 00:56:56,070 --> 00:56:57,600 But let's be a little fancier. 1138 00:56:57,600 --> 00:57:01,770 What if I really wanted the user to type in maybe not three, but something 1139 00:57:01,770 --> 00:57:03,060 more sophisticated. 
1140 00:57:03,060 --> 00:57:07,300 And in fact, when controlling programs from the command line, 1141 00:57:07,300 --> 00:57:11,550 it's very common to provide what are often called switches or flags, 1142 00:57:11,550 --> 00:57:16,830 whereby you pass in something like dash n, which semantically means 1143 00:57:16,830 --> 00:57:19,620 this number of times, then often a space, 1144 00:57:19,620 --> 00:57:21,420 and then something like the number three. 1145 00:57:21,420 --> 00:57:24,940 This still allows me to do other things at the command line if I want. 1146 00:57:24,940 --> 00:57:28,200 But the fact that I've standardized on how I'm providing command line 1147 00:57:28,200 --> 00:57:30,690 arguments to this program with dash n three 1148 00:57:30,690 --> 00:57:34,440 is just a more reliable way now of my program 1149 00:57:34,440 --> 00:57:36,240 knowing what does the three mean. 1150 00:57:36,240 --> 00:57:39,838 It's a little less obvious if I just do meows.py space three. 1151 00:57:39,838 --> 00:57:41,130 Well, what does the three mean? 1152 00:57:41,130 --> 00:57:44,087 At least with syntax like dash n three, especially 1153 00:57:44,087 --> 00:57:46,170 if you've read the documentation for this program, 1154 00:57:46,170 --> 00:57:48,780 ultimately, oh, dash n means number of times. 1155 00:57:48,780 --> 00:57:49,290 Got it. 1156 00:57:49,290 --> 00:57:51,990 It's a way of passing in two additional arguments 1157 00:57:51,990 --> 00:57:54,070 but that have some relationship between them. 1158 00:57:54,070 --> 00:57:58,830 So how do I modify my program to understand dash n three? 1159 00:57:58,830 --> 00:58:01,650 Well, if I'm using sys like this, I could do this. 1160 00:58:01,650 --> 00:58:07,830 elif the length of sys.argv equals this time three. 1161 00:58:07,830 --> 00:58:13,680 Because notice, there's one, two, three things at my prompt. 1162 00:58:13,680 --> 00:58:17,910 So sys.argv is zero, one, and two, three things total separated by spaces. 
1163 00:58:17,910 --> 00:58:19,380 If it equals three-- 1164 00:58:19,380 --> 00:58:27,000 and let's be safe, and sys.argv bracket one equals equals dash n, 1165 00:58:27,000 --> 00:58:29,320 then let's go ahead and do this. 1166 00:58:29,320 --> 00:58:37,680 Let's go ahead and convert sys.argv of two to an integer 1167 00:58:37,680 --> 00:58:40,560 and assign it to a variable, for instance, called n. 1168 00:58:40,560 --> 00:58:42,240 And then, let's go ahead and do this. 1169 00:58:42,240 --> 00:58:46,470 For underscore in the range of n, let's go ahead 1170 00:58:46,470 --> 00:58:48,395 and print out some of these meows. 1171 00:58:48,395 --> 00:58:51,270 Now, there's still an opportunity maybe to consolidate my print lines 1172 00:58:51,270 --> 00:58:51,730 with meow. 1173 00:58:51,730 --> 00:58:53,897 But for now, I'm going to keep these ideas separate. 1174 00:58:53,897 --> 00:58:58,120 So I'm going to handle the default case with no arguments up here as before. 1175 00:58:58,120 --> 00:59:01,530 And now, more interestingly, I'm going to do this to be clear. 1176 00:59:01,530 --> 00:59:05,070 I'm going to check if the user gave me three command line arguments, the name 1177 00:59:05,070 --> 00:59:07,710 of the program, dash n, and a number. 1178 00:59:07,710 --> 00:59:15,030 If indeed the second thing they gave me in sys.argv of 1 equals equals dash n, 1179 00:59:15,030 --> 00:59:19,410 then I'm going to assume that the next thing, sys.argv of two 1180 00:59:19,410 --> 00:59:21,030 is going to be an integer. 1181 00:59:21,030 --> 00:59:24,420 And I'll convert it to such and store it in this variable n. 1182 00:59:24,420 --> 00:59:28,960 And now, just using a loop, I'm going to print out meow that many times. 
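The manual handling of dash n described above might look roughly like this. As before, the `meows` function and its `argv` parameter are assumptions made for illustration; the lecture version works on `sys.argv` at the top level and prints directly.

```python
import sys


def meows(argv):
    """Return the lines meows.py would print for a given argv list."""
    if len(argv) == 1:
        # Default case: no command line arguments, one meow
        return ["meow"]
    elif len(argv) == 3 and argv[1] == "-n":
        # Three things at the prompt: program name, -n, and a number
        n = int(argv[2])
        return ["meow" for _ in range(n)]
    else:
        return ["usage: meows.py"]


# Stand-in for running: python meows.py -n 3
for line in meows(["meows.py", "-n", "3"]):
    print(line)
```

Note that `int(argv[2])` would still raise a `ValueError` for input such as `-n dog`; this sketch does not handle that case.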
1183 00:59:28,960 --> 00:59:32,190 All right, so it's kind of a combination of our earlier focus on loops, 1184 00:59:32,190 --> 00:59:34,290 our earlier focus on command line arguments, 1185 00:59:34,290 --> 00:59:36,570 just creating a program that, allow me to claim, 1186 00:59:36,570 --> 00:59:39,750 is representative of how a lot of command line programs work, 1187 00:59:39,750 --> 00:59:42,130 even though we've typically not used many like this. 1188 00:59:42,130 --> 00:59:45,090 But it's very common to configure a program, when 1189 00:59:45,090 --> 00:59:47,700 you're about to run it at the command line, with something 1190 00:59:47,700 --> 00:59:51,120 like these command line arguments like dash n or dash something else. 1191 00:59:51,120 --> 00:59:53,430 Now, I'm going to go ahead and hit Enter. 1192 00:59:53,430 --> 00:59:57,360 And I think I should see, indeed, three meows. 1193 00:59:57,360 --> 01:00:02,250 By contrast, if I do two at the end, I should see two meows. 1194 01:00:02,250 --> 01:00:04,980 If I do one, I should see one meow. 1195 01:00:04,980 --> 01:00:07,860 And frankly, if I just omit this altogether, 1196 01:00:07,860 --> 01:00:12,420 I should see one meow as well, because that was my default case earlier. 1197 01:00:12,420 --> 01:00:17,880 And now, let me allow us to assume that this program eventually 1198 01:00:17,880 --> 01:00:19,020 gets more complicated. 1199 01:00:19,020 --> 01:00:22,620 Let's imagine a world where I don't want to support just dash n. 1200 01:00:22,620 --> 01:00:25,980 Maybe I want to support dash a, and dash b, and dash c, 1201 01:00:25,980 --> 01:00:27,870 and dash d, and a whole lot of others. 1202 01:00:27,870 --> 01:00:30,400 Or heck, at that point, I should maybe give them words. 1203 01:00:30,400 --> 01:00:32,220 So maybe it's a dash, dash, number.
1204 01:00:32,220 --> 01:00:35,700 It's indeed a convention in computing typically 1205 01:00:35,700 --> 01:00:40,620 to use single dashes with a single letter, like n, but use double dashes 1206 01:00:40,620 --> 01:00:42,930 if you're actually using a whole word like number. 1207 01:00:42,930 --> 01:00:45,900 So the command line argument might be dash n, or maybe 1208 01:00:45,900 --> 01:00:47,490 it's dash, dash number. 1209 01:00:47,490 --> 01:00:50,490 But you can imagine just how complicated the code gets 1210 01:00:50,490 --> 01:00:54,448 if now you want to support dash n, dash a, dash b, dash c, and so forth. 1211 01:00:54,448 --> 01:00:56,740 You're going to have to be checking all over the place. 1212 01:00:56,740 --> 01:00:58,532 And what if they come in a different order? 1213 01:00:58,532 --> 01:01:00,570 You're going to have to check is dash n first, 1214 01:01:00,570 --> 01:01:02,910 or is it second, or is it third, or is it fourth? 1215 01:01:02,910 --> 01:01:06,240 I mean, this just becomes very painful very quickly 1216 01:01:06,240 --> 01:01:09,840 just to do something relatively simple like allow the user to pass command 1217 01:01:09,840 --> 01:01:11,700 line arguments into your program. 1218 01:01:11,700 --> 01:01:15,390 Well, this is why, as always, there exist libraries. 1219 01:01:15,390 --> 01:01:18,240 And another library that comes with Python 1220 01:01:18,240 --> 01:01:22,350 that's probably worth knowing something about is this one here called argparse. 1221 01:01:22,350 --> 01:01:26,640 In fact, with a lot of the tools I myself or CS50's team write in Python, 1222 01:01:26,640 --> 01:01:29,490 we very frequently use argparse whenever they 1223 01:01:29,490 --> 01:01:33,390 are more complicated than a lot of our class demos and a little more similar 1224 01:01:33,390 --> 01:01:37,890 to this one where we want to allow the user to pass in configuration options 1225 01:01:37,890 --> 01:01:39,420 at the command line.
1226 01:01:39,420 --> 01:01:43,500 And by supporting things like dash n, or dash a, or dash b, or dash c, 1227 01:01:43,500 --> 01:01:46,770 argparse is a library that per its documentation 1228 01:01:46,770 --> 01:01:51,240 just handles all of this parsing so to speak, this analysis of command line 1229 01:01:51,240 --> 01:01:54,150 arguments for you automatically so you can focus 1230 01:01:54,150 --> 01:01:57,420 on writing the interesting parts of your program, not the command line 1231 01:01:57,420 --> 01:01:58,780 arguments part. 1232 01:01:58,780 --> 01:02:00,160 So how might we use this? 1233 01:02:00,160 --> 01:02:02,040 Well, let me go back to VS Code here. 1234 01:02:02,040 --> 01:02:03,630 Let me clear my terminal window. 1235 01:02:03,630 --> 01:02:06,690 And let me propose that I rewrite this using not sys 1236 01:02:06,690 --> 01:02:08,037 but actually using argparse. 1237 01:02:08,037 --> 01:02:10,620 And I'm going to start a little simple and then build back up. 1238 01:02:10,620 --> 01:02:15,960 So let me throw all of this away for now and instead import argparse. 1239 01:02:15,960 --> 01:02:17,910 Argparse stands for argument parser. 1240 01:02:17,910 --> 01:02:22,000 To parse something means to read it, kind of pick it apart to analyze it. 1241 01:02:22,000 --> 01:02:24,300 So this is indeed going to do just that for me. 1242 01:02:24,300 --> 01:02:25,950 Now, let me go ahead and do this. 1243 01:02:25,950 --> 01:02:27,780 And for this library, it's helpful to know 1244 01:02:27,780 --> 01:02:30,600 a little object-oriented programming like we all now do. 1245 01:02:30,600 --> 01:02:32,630 I'm going to create a variable called parser. 1246 01:02:32,630 --> 01:02:34,255 Though I could call it anything I want. 1247 01:02:34,255 --> 01:02:40,270 I'm going to set it equal to the return value of argparse.ArgumentParser, 1248 01:02:40,270 --> 01:02:43,450 with a capital A and a capital P. 
A constructor 1249 01:02:43,450 --> 01:02:47,650 for a class called argument parser that comes with Python itself 1250 01:02:47,650 --> 01:02:49,300 within this library here. 1251 01:02:49,300 --> 01:02:52,690 Now, I'm going to configure this argument parser 1252 01:02:52,690 --> 01:02:55,330 to know about the specific command line arguments 1253 01:02:55,330 --> 01:02:58,010 that I myself want to support in my program. 1254 01:02:58,010 --> 01:03:01,760 So I'm going to do this, parser.add_argument. 1255 01:03:01,760 --> 01:03:04,990 So that's apparently a method in the parser object. 1256 01:03:04,990 --> 01:03:07,600 I'm going to add an argument of dash n. 1257 01:03:07,600 --> 01:03:08,470 Easy enough. 1258 01:03:08,470 --> 01:03:11,772 Now I'm going to go ahead and actually parse the command line arguments. 1259 01:03:11,772 --> 01:03:14,230 I'm going to do args, or I could call the variable anything 1260 01:03:14,230 --> 01:03:17,890 I want, parser.parse_args. 1261 01:03:17,890 --> 01:03:23,410 And by default, parse_args is going to automatically look at sys.argv for me. 1262 01:03:23,410 --> 01:03:25,450 I don't need to import sys myself. 1263 01:03:25,450 --> 01:03:30,670 I can leave it to the argument parser, its code, to import sys, look at sys.argv, 1264 01:03:30,670 --> 01:03:34,000 and figure out where dash n or anything else actually is. 1265 01:03:34,000 --> 01:03:36,970 And what's nice now, because this line of code 1266 01:03:36,970 --> 01:03:42,250 here results in the parser having parsed all of the command line arguments, 1267 01:03:42,250 --> 01:03:46,360 I now have this object in this variable called args inside 1268 01:03:46,360 --> 01:03:48,790 of which are all of the values of those command 1269 01:03:48,790 --> 01:03:51,630 line arguments, no matter what order they appeared in. 1270 01:03:51,630 --> 01:03:54,130 Not such a big deal when I've only got one because it's only 1271 01:03:54,130 --> 01:03:55,840 going to go in one place at the end.
1272 01:03:55,840 --> 01:03:58,342 But if I've got dash n, dash a, dash b, dash c, 1273 01:03:58,342 --> 01:04:00,550 you could imagine them being in all different orders. 1274 01:04:00,550 --> 01:04:02,467 They definitely don't have to be alphabetical. 1275 01:04:02,467 --> 01:04:05,260 The user should be able to type them in any order they want. 1276 01:04:05,260 --> 01:04:06,790 That's better for usability. 1277 01:04:06,790 --> 01:04:09,430 argparse is going to figure all of that out for me. 1278 01:04:09,430 --> 01:04:11,200 And all I have to do now is this. 1279 01:04:11,200 --> 01:04:16,300 If I want to iterate over that many numbers of arguments 1280 01:04:16,300 --> 01:04:18,310 and that many meows rather, I can do this. 1281 01:04:18,310 --> 01:04:24,280 For underscore in the range of the int conversion of args.n. 1282 01:04:24,280 --> 01:04:27,340 So dot is the syntax we kept using to access things 1283 01:04:27,340 --> 01:04:29,860 like properties inside of an object. 1284 01:04:29,860 --> 01:04:31,150 And that's what args is. 1285 01:04:31,150 --> 01:04:34,360 It's the object returned by the parse_args method for me. 1286 01:04:34,360 --> 01:04:39,040 I'm going to go ahead now and print out, quote unquote, meow this many times. 1287 01:04:39,040 --> 01:04:41,080 So it's not super simple. 1288 01:04:41,080 --> 01:04:45,520 These are three new lines of code I need to write and rather understand. 1289 01:04:45,520 --> 01:04:48,010 But it's already a little simpler and more compact 1290 01:04:48,010 --> 01:04:52,540 than my if, and my elif, and my ors, and my ands, and all of that Boolean logic. 1291 01:04:52,540 --> 01:04:54,560 It's handling a lot of this for me. 1292 01:04:54,560 --> 01:04:58,780 So if I didn't make any mistakes, let me run Python now of meows.py Enter. 1293 01:04:58,780 --> 01:05:01,630 And I did make a mistake here. 1294 01:05:01,630 --> 01:05:03,170 I did make a mistake. 1295 01:05:03,170 --> 01:05:05,620 What's wrong here now?
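The argparse version at this point in the lecture can be sketched as follows. One assumption for illustration: an explicit list is passed to `parse_args` so the sketch does not depend on how it is launched, whereas the lecture calls `parser.parse_args()` with no arguments so that it reads `sys.argv`.

```python
import argparse

# Create the parser and register the one flag this program supports
parser = argparse.ArgumentParser()
parser.add_argument("-n")

# Stand-in for running: python meows.py -n 3
args = parser.parse_args(["-n", "3"])

# args.n is still a str at this stage, so it is converted manually
for _ in range(int(args.n)):
    print("meow")
```

When dash n is omitted, `args.n` is `None`, and `int(None)` raises a `TypeError`, which is the mistake encountered when the program is run with no arguments.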
1296 01:05:05,620 --> 01:05:06,280 What's wrong? 1297 01:05:06,280 --> 01:05:08,860 Well, I definitely didn't run it the way I intend. 1298 01:05:08,860 --> 01:05:11,110 So dash n three Enter. 1299 01:05:11,110 --> 01:05:12,490 So it does work. 1300 01:05:12,490 --> 01:05:16,990 But if I don't cooperate, this actually seems to be a worse version. 1301 01:05:16,990 --> 01:05:20,620 If I don't pass in dash n and a number, it just errors with a type error. 1302 01:05:20,620 --> 01:05:22,000 Int must be a string. 1303 01:05:22,000 --> 01:05:23,260 None is what came back. 1304 01:05:23,260 --> 01:05:25,000 So there's clearly an error here. 1305 01:05:25,000 --> 01:05:27,160 But the library is more flexible. 1306 01:05:27,160 --> 01:05:31,220 I can actually provide some documentation on how to use this thing. 1307 01:05:31,220 --> 01:05:32,860 So how do I know how to use this? 1308 01:05:32,860 --> 01:05:35,500 Well, typically it's conventional in Python 1309 01:05:35,500 --> 01:05:39,160 and in a lot of programming environments to run a program 1310 01:05:39,160 --> 01:05:42,970 with a special argument, dash h or dash dash help. 1311 01:05:42,970 --> 01:05:46,270 And almost always I will claim you'll then 1312 01:05:46,270 --> 01:05:47,898 see some kind of usage information. 1313 01:05:47,898 --> 01:05:49,690 And indeed, that's what I'm looking at now. 1314 01:05:49,690 --> 01:05:53,332 I just ran Python of meows.py space dash h. 1315 01:05:53,332 --> 01:05:54,040 I'll do it again. 1316 01:05:54,040 --> 01:05:57,820 Let me clear my screen and this time do dash dash help in English, Enter. 1317 01:05:57,820 --> 01:05:59,320 And I see the same thing. 1318 01:05:59,320 --> 01:06:01,510 It's not very useful at the moment. 1319 01:06:01,510 --> 01:06:04,540 It just shows me what the usage is up here. 1320 01:06:04,540 --> 01:06:06,200 And this is kind of interesting. 1321 01:06:06,200 --> 01:06:08,470 This is a standard syntax in computing. 
1322 01:06:08,470 --> 01:06:11,200 And we've kind of seen it in Python's documentation before. 1323 01:06:11,200 --> 01:06:14,530 This just means that the program's name is, of course, meows.py. 1324 01:06:14,530 --> 01:06:20,060 Square brackets as almost always in documentation means it's optional. 1325 01:06:20,060 --> 01:06:22,862 So I don't have to type dash h, but I can. 1326 01:06:22,862 --> 01:06:27,220 I don't have to type dash n and another value, but I can. 1327 01:06:27,220 --> 01:06:30,400 And then, down here is some explanation of these options, 1328 01:06:30,400 --> 01:06:32,530 and more verbosely showing me that I can also 1329 01:06:32,530 --> 01:06:34,900 do dash dash help and not just dash h. 1330 01:06:34,900 --> 01:06:36,075 But this is so generic. 1331 01:06:36,075 --> 01:06:37,700 This has nothing to do with my program. 1332 01:06:37,700 --> 01:06:39,783 This is not going to help my users when I actually 1333 01:06:39,783 --> 01:06:41,390 release this software for the world. 1334 01:06:41,390 --> 01:06:43,000 So let me go ahead and improve it. 1335 01:06:43,000 --> 01:06:46,960 Let me add a description to my argument parser that the humans will see. 1336 01:06:46,960 --> 01:06:51,220 Meow like a cat, quote unquote, is going to be the value of this named 1337 01:06:51,220 --> 01:06:52,690 parameter called description. 1338 01:06:52,690 --> 01:06:56,830 And let me also add a help parameter to my dash n 1339 01:06:56,830 --> 01:07:01,630 argument that just explains what dash n means, number of times to meow, 1340 01:07:01,630 --> 01:07:02,657 quote unquote. 1341 01:07:02,657 --> 01:07:04,240 I'm not going to change anything else. 1342 01:07:04,240 --> 01:07:08,680 But I am going to go back to my terminal window and run Python of meow. 1343 01:07:08,680 --> 01:07:13,750 I'm going to run Python of meows.py dash h, or equivalently dash dash help. 1344 01:07:13,750 --> 01:07:17,570 And now notice that this is a little more user friendly. 
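Adding the description and help text might look like the sketch below. Two assumptions for illustration: `prog="meows.py"` is set explicitly so the usage line is the same however the sketch is run, and `format_help()` is used to capture the text that dash h would display.

```python
import argparse

# description appears near the top of the help output
parser = argparse.ArgumentParser(prog="meows.py", description="Meow like a cat")

# help explains what this one option means
parser.add_argument("-n", help="number of times to meow")

# The same text that running with -h or --help would show
help_text = parser.format_help()
print(help_text)
```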
1345 01:07:17,570 --> 01:07:19,900 If I scroll up, we still see the same usage. 1346 01:07:19,900 --> 01:07:22,660 But there's a quick sentence in English of explanation 1347 01:07:22,660 --> 01:07:24,310 that this program meows like a cat. 1348 01:07:24,310 --> 01:07:28,540 And if I look at the options now, oh, that's what n means. 1349 01:07:28,540 --> 01:07:30,940 It's the number of times to meow. 1350 01:07:30,940 --> 01:07:34,280 And this capital N, a metavariable, if you will, 1351 01:07:34,280 --> 01:07:37,930 is just indicating to me that I need to type a number by convention 1352 01:07:37,930 --> 01:07:40,780 after the lower case dash n. 1353 01:07:40,780 --> 01:07:43,930 So it would be nice though, all that said, 1354 01:07:43,930 --> 01:07:47,800 if my program still didn't just break when I run it without any command line 1355 01:07:47,800 --> 01:07:48,490 arguments. 1356 01:07:48,490 --> 01:07:52,090 Ideally, my program would handle this just like my manual version 1357 01:07:52,090 --> 01:07:54,430 did when I used sys.argv myself. 1358 01:07:54,430 --> 01:07:58,540 So we just need to add a little more functionality to this library. 1359 01:07:58,540 --> 01:08:02,230 And if I read the documentation, I'll see that add_argument 1360 01:08:02,230 --> 01:08:04,450 takes yet another named argument. 1361 01:08:04,450 --> 01:08:09,070 If you want, you can specify a default value for dash n, for instance, one. 1362 01:08:09,070 --> 01:08:10,300 And I'll do that there. 1363 01:08:10,300 --> 01:08:13,930 And you can further specify that it's got to be an int. 1364 01:08:13,930 --> 01:08:15,910 And what this will additionally allow me to do 1365 01:08:15,910 --> 01:08:20,890 is if I tell argparse to make sure that the value of dash n is an int, 1366 01:08:20,890 --> 01:08:22,810 I don't need to do the conversion manually.
1367 01:08:22,810 --> 01:08:25,689 I can just trust down on line seven that when 1368 01:08:25,689 --> 01:08:29,800 I access the property called n inside of my args object, 1369 01:08:29,800 --> 01:08:32,080 it's going to be automatically an int for me. 1370 01:08:32,080 --> 01:08:34,689 And again, this is the value of a library. 1371 01:08:34,689 --> 01:08:37,450 Let it do all of the work for you so you can get back 1372 01:08:37,450 --> 01:08:40,090 to focusing on the interesting project at hand, 1373 01:08:40,090 --> 01:08:42,520 whatever problem it is you're trying to solve. 1374 01:08:42,520 --> 01:08:45,939 In this case, granted, not that interesting, but meowing like a cat. 1375 01:08:45,939 --> 01:08:49,779 Let me go ahead now and run Python of meows.py and hit Enter. 1376 01:08:49,779 --> 01:08:52,250 This time, no arguments. 1377 01:08:52,250 --> 01:08:53,410 And now it meows. 1378 01:08:53,410 --> 01:08:53,979 Why? 1379 01:08:53,979 --> 01:08:58,930 Because I specified that if I don't, as a user, specify dash n, 1380 01:08:58,930 --> 01:09:01,149 it's going to have a default value of one apparently. 1381 01:09:01,149 --> 01:09:04,569 And I don't have to convert that value from a str to an int 1382 01:09:04,569 --> 01:09:09,760 because I told arg parser please just make this an int for me. 1383 01:09:09,760 --> 01:09:14,939 Any questions now on argparse, or really this principle 1384 01:09:14,939 --> 01:09:19,290 of just outsourcing the commodity stuff, the stuff that everyone's program 1385 01:09:19,290 --> 01:09:22,950 eventually needs to do so that you can focus on the juicy part yourself. 1386 01:09:22,950 --> 01:09:25,319 STUDENT: What does args.n contain? 1387 01:09:25,319 --> 01:09:27,359 DAVID J. MALAN: What does args.n contain? 1388 01:09:27,359 --> 01:09:35,609 It contains the integer that the human typed after a space after dash n. 1389 01:09:35,609 --> 01:09:36,220 Good question. 1390 01:09:36,220 --> 01:09:38,729 Other questions? 
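With a default value and a type, the finished program can be sketched like this. An empty list is passed to `parse_args` as an assumption for illustration, standing in for running the program with no arguments; the lecture version calls `parse_args()` with no arguments.

```python
import argparse

parser = argparse.ArgumentParser(description="Meow like a cat")

# default=1 covers the case where -n is omitted; type=int converts the value
parser.add_argument("-n", default=1, help="number of times to meow", type=int)

# Stand-in for running: python meows.py
args = parser.parse_args([])

# No int() call is needed because args.n is already an int
for _ in range(args.n):
    print("meow")
```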
1391 01:09:38,729 --> 01:09:40,473 STUDENT: Yeah. 1392 01:09:40,473 --> 01:09:45,200 When you specify the type for the argument, what 1393 01:09:45,200 --> 01:09:49,250 happens if-- does that basically handle the exception if the user inputs 1394 01:09:49,250 --> 01:09:50,450 a string in this case? 1395 01:09:50,450 --> 01:09:51,290 DAVID J. MALAN: A really good question. 1396 01:09:51,290 --> 01:09:54,502 Suppose that the human does not type a number and therefore not an int. 1397 01:09:54,502 --> 01:09:55,710 Well, let's see what happens. 1398 01:09:55,710 --> 01:10:01,730 So Python of meows.py dash n dog, where dog is obviously not a number. 1399 01:10:01,730 --> 01:10:02,450 Enter. 1400 01:10:02,450 --> 01:10:06,140 And voila, we see an automatically generated error message. 1401 01:10:06,140 --> 01:10:07,730 A little cryptic, admittedly. 1402 01:10:07,730 --> 01:10:10,160 But I'm seeing a reminder of what the usage is 1403 01:10:10,160 --> 01:10:13,820 and a minor explanation of what is invalid about this. 1404 01:10:13,820 --> 01:10:15,830 And again, this is what allows you, this is 1405 01:10:15,830 --> 01:10:19,040 what allows me to focus on writing the actual code we care about 1406 01:10:19,040 --> 01:10:22,280 and just letting the library automate some of this stuff for us. 1407 01:10:22,280 --> 01:10:23,150 All right. 1408 01:10:23,150 --> 01:10:26,270 Well, allow me to propose now that we take a look at one other feature 1409 01:10:26,270 --> 01:10:28,190 of Python that we've seen before. 1410 01:10:28,190 --> 01:10:32,630 But it turns out we can use it even more powerfully as our programs become 1411 01:10:32,630 --> 01:10:35,990 more sophisticated and the problems we're trying to solve themselves become 1412 01:10:35,990 --> 01:10:36,830 more involved. 1413 01:10:36,830 --> 01:10:42,020 Let me go ahead and turn to VS Code, closing out meows.py, 1414 01:10:42,020 --> 01:10:45,050 and creating a new file for instance, called unpack.py.
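For the dash n dog case from the student's question, a small sketch of what argparse does under the hood: it prints the usage and an error message to standard error and then exits the program. The `try`/`except` and the `exit_code` variable are assumptions added only to observe that exit; the lecture program does not catch anything.

```python
import argparse

parser = argparse.ArgumentParser(prog="meows.py", description="Meow like a cat")
parser.add_argument("-n", default=1, help="number of times to meow", type=int)

try:
    # Stand-in for running: python meows.py -n dog
    parser.parse_args(["-n", "dog"])
except SystemExit as e:
    # argparse reports the invalid int value on stderr, then exits
    exit_code = e.code
    print("argparse exited with status", exit_code)
```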
1415 01:10:45,050 --> 01:10:47,120 So code of unpack.py. 1416 01:10:47,120 --> 01:10:49,357 And let me just remind us what we mean by unpacking. 1417 01:10:49,357 --> 01:10:52,190 Because this is actually a feature of Python that we've seen before. 1418 01:10:52,190 --> 01:10:54,770 For instance, suppose that I write a program that 1419 01:10:54,770 --> 01:10:58,040 prompts the user for their name, like David space Malan. 1420 01:10:58,040 --> 01:11:01,160 Wouldn't it be nice if we could split the user's name 1421 01:11:01,160 --> 01:11:02,593 into two separate variables? 1422 01:11:02,593 --> 01:11:05,760 And when we've done this in the past, we've done it in a few different ways. 1423 01:11:05,760 --> 01:11:10,910 But one of them involved unpacking a single value that comes back from that, 1424 01:11:10,910 --> 01:11:13,340 like a list or some other data structure, 1425 01:11:13,340 --> 01:11:15,750 and putting it immediately into two variables. 1426 01:11:15,750 --> 01:11:16,890 So let's do this here. 1427 01:11:16,890 --> 01:11:20,360 Let me go ahead and call the input function, 1428 01:11:20,360 --> 01:11:24,020 asking someone what's your name, question mark. 1429 01:11:24,020 --> 01:11:27,770 Then, let me go ahead and just split a little naively on a single space. 1430 01:11:27,770 --> 01:11:29,960 So I'm assuming that the only users at the moment 1431 01:11:29,960 --> 01:11:32,570 are people like me, David space Malan. 1432 01:11:32,570 --> 01:11:34,310 No middle names, no multiple names. 1433 01:11:34,310 --> 01:11:37,610 It's just one and two, which itself could be buggy for other users. 1434 01:11:37,610 --> 01:11:39,950 But for now, I'm keeping it simple, just to remind us 1435 01:11:39,950 --> 01:11:43,572 that I can now unpack that return value with something like first 1436 01:11:43,572 --> 01:11:47,287 comma last equals the return value of input.
1437 01:11:47,287 --> 01:11:49,370 And now I can go ahead and do something like this, 1438 01:11:49,370 --> 01:11:53,340 like printing out with an f string, Hello, comma, and then in curly braces, 1439 01:11:53,340 --> 01:11:53,840 first. 1440 01:11:53,840 --> 01:11:57,500 If I just want to greet myself or any other user as, Hello, David, 1441 01:11:57,500 --> 01:11:58,820 without the last name. 1442 01:11:58,820 --> 01:12:00,770 And frankly, if I'm not using the last name, 1443 01:12:00,770 --> 01:12:04,580 recall that a Python convention is just to name it underscore to make clear 1444 01:12:04,580 --> 01:12:06,650 that you know you're not using that value. 1445 01:12:06,650 --> 01:12:10,590 But it does need to be there because you're unpacking two values at once. 1446 01:12:10,590 --> 01:12:13,310 So if I run this, it won't be all that unfamiliar. 1447 01:12:13,310 --> 01:12:15,470 I'm just going to run now Python of unpack.py. 1448 01:12:15,470 --> 01:12:17,660 I'll type in David Malan, which has a single space. 1449 01:12:17,660 --> 01:12:20,510 And there we have it, Hello, comma, David. 1450 01:12:20,510 --> 01:12:25,130 Well, it turns out that there's other ways to unpack values. 1451 01:12:25,130 --> 01:12:27,980 And there's other features that Python offers, especially 1452 01:12:27,980 --> 01:12:31,220 when it comes to defining and using functions. 1453 01:12:31,220 --> 01:12:33,658 And this is slightly more intermediate functionality, 1454 01:12:33,658 --> 01:12:35,450 if you will, that's useful, because you can 1455 01:12:35,450 --> 01:12:38,150 start to write even more elegant and powerful code once you 1456 01:12:38,150 --> 01:12:40,650 get comfortable with syntax like this. 1457 01:12:40,650 --> 01:12:43,190 So let me go ahead and propose that we not just play with, 1458 01:12:43,190 --> 01:12:48,140 Hello names anymore, but instead do something maybe 1459 01:12:48,140 --> 01:12:50,230 involving some coinage again. 
1460 01:12:50,230 --> 01:12:51,980 So maybe not dollars and cents, but maybe, 1461 01:12:51,980 --> 01:12:54,530 again, as in the past, some galleons, and sickles, 1462 01:12:54,530 --> 01:12:57,590 and knuts, among which there's a mathematical relationship as 1463 01:12:57,590 --> 01:13:00,680 to how many of those in the wizarding world equal each other. 1464 01:13:00,680 --> 01:13:02,160 And let me go ahead and do this. 1465 01:13:02,160 --> 01:13:05,750 Let me define a simple function called total that just tells me 1466 01:13:05,750 --> 01:13:10,280 the total value of someone's vault in Gringotts, the wizarding bank, based 1467 01:13:10,280 --> 01:13:13,050 on how many galleons, sickles, and knuts that they have, 1468 01:13:13,050 --> 01:13:15,920 which again, are currencies from the wizarding world as opposed 1469 01:13:15,920 --> 01:13:17,460 to our actual human world. 1470 01:13:17,460 --> 01:13:21,140 So this total function might take a variable like galleons and sickles 1471 01:13:21,140 --> 01:13:22,457 and knuts like this. 1472 01:13:22,457 --> 01:13:25,040 And then, it's going to return the formula, which I admittedly 1473 01:13:25,040 --> 01:13:26,120 had to look up myself. 1474 01:13:26,120 --> 01:13:30,800 And it turns out that the formula for converting galleons and sickles 1475 01:13:30,800 --> 01:13:32,510 to knuts would be this. 1476 01:13:32,510 --> 01:13:39,710 Galleons times 17 plus sickles, then times all of that by 29 and then 1477 01:13:39,710 --> 01:13:41,540 add in the individual knuts. 1478 01:13:41,540 --> 01:13:45,230 Not sure in what detail this came up in the books or movies. 1479 01:13:45,230 --> 01:13:47,370 But here we have it, the official formula. 1480 01:13:47,370 --> 01:13:47,870 All right. 1481 01:13:47,870 --> 01:13:49,770 Now let's go ahead and do this. 1482 01:13:49,770 --> 01:13:53,390 Let me go ahead and call the total function with just some sample inputs.
1483 01:13:53,390 --> 01:13:58,880 Suppose that someone like Harry has 100 galleons, 50 sickles, and 25 knuts. 1484 01:13:58,880 --> 01:14:01,310 Let me go ahead and print that out on the screen. 1485 01:14:01,310 --> 01:14:05,210 Well, if total returns an integer, which I think this arithmetic expression 1486 01:14:05,210 --> 01:14:10,280 will do, let me go ahead and store, rather, pass the return value of total 1487 01:14:10,280 --> 01:14:11,300 to print. 1488 01:14:11,300 --> 01:14:13,940 And then just for clarity, let me write knuts at the end so 1489 01:14:13,940 --> 01:14:18,050 I know that the unit of measure here is indeed knuts in total. 1490 01:14:18,050 --> 01:14:19,880 Now, let me go ahead in my terminal window 1491 01:14:19,880 --> 01:14:22,310 and run Python of unpack.py and hit Enter. 1492 01:14:22,310 --> 01:14:27,350 And it turns out mathematically that if I got my math correct, 1493 01:14:27,350 --> 01:14:36,320 100 galleons plus 50 sickles plus 25 knuts equals in total 50,775 knuts. 1494 01:14:36,320 --> 01:14:39,170 Just avoiding having to use our own human currency here. 1495 01:14:39,170 --> 01:14:43,740 But I'm not doing anything along the lines of unpacking at least just yet. 1496 01:14:43,740 --> 01:14:45,710 Let me propose now that I do this. 1497 01:14:45,710 --> 01:14:48,110 Just for the sake of discussion, let me propose 1498 01:14:48,110 --> 01:14:50,480 that I leave the total function as is. 1499 01:14:50,480 --> 01:14:54,270 But let me go ahead and just store all of my coins in a list. 1500 01:14:54,270 --> 01:14:59,300 So coins in order from left to right, 100, 50, 25, 1501 01:14:59,300 --> 01:15:01,640 just because for whatever purposes in this story 1502 01:15:01,640 --> 01:15:05,570 I have all of my coinage in a list in this order. 1503 01:15:05,570 --> 01:15:08,250 Kind of a purse or wallet of sorts. 1504 01:15:08,250 --> 01:15:09,830 Well, how can I pass this in?
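The total function and its first call, as described, can be sketched as follows. The arithmetic works out to (100 × 17 + 50) × 29 + 25 = 1,750 × 29 + 25 = 50,775.

```python
def total(galleons, sickles, knuts):
    # 17 sickles per galleon, 29 knuts per sickle
    return (galleons * 17 + sickles) * 29 + knuts


# 100 galleons, 50 sickles, and 25 knuts, expressed in knuts
print(total(100, 50, 25), "Knuts")
```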
1505 01:15:09,830 --> 01:15:12,170 Well, I'm not going to hard-code the same values twice. 1506 01:15:12,170 --> 01:15:14,150 Just for the sake of discussion, how could I 1507 01:15:14,150 --> 01:15:18,440 pass in the individual elements of a list to my total function? 1508 01:15:18,440 --> 01:15:21,110 Well, of course, I could treat this list as I always 1509 01:15:21,110 --> 01:15:24,890 do using numeric indices by doing coins bracket zero, 1510 01:15:24,890 --> 01:15:27,620 coins bracket one, coins bracket two. 1511 01:15:27,620 --> 01:15:29,660 So this is old-school stuff with lists. 1512 01:15:29,660 --> 01:15:32,540 If I've got a list called coins and there's three elements, 1513 01:15:32,540 --> 01:15:37,070 the indices or indexes of those elements are zero, one, and two respectively 1514 01:15:37,070 --> 01:15:38,070 from left to right. 1515 01:15:38,070 --> 01:15:42,740 So all I'm doing here now is passing in the first element 1516 01:15:42,740 --> 01:15:47,210 from that list as galleons, the second element of that list as sickles, 1517 01:15:47,210 --> 01:15:50,810 and the third element of this list as my knuts. 1518 01:15:50,810 --> 01:15:53,270 And that lines up with, of course, the signature 1519 01:15:53,270 --> 01:15:57,110 of this function, which as total expects that I've passed in those three 1520 01:15:57,110 --> 01:15:59,152 things in that order left to right. 1521 01:15:59,152 --> 01:16:01,610 Let me go ahead and run, just to make sure I haven't broken 1522 01:16:01,610 --> 01:16:03,650 anything, unpack.py and hit Enter. 1523 01:16:03,650 --> 01:16:06,380 And the math still checks out. 1524 01:16:06,380 --> 01:16:08,810 But this is getting a little verbose-- 1525 01:16:08,810 --> 01:16:10,040 a little verbose. 1526 01:16:10,040 --> 01:16:13,970 And wouldn't it be nice if I could just pass the list of coins 1527 01:16:13,970 --> 01:16:15,500 to this total function? 
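Passing the list's elements one index at a time looks roughly like this:

```python
def total(galleons, sickles, knuts):
    # 17 sickles per galleon, 29 knuts per sickle
    return (galleons * 17 + sickles) * 29 + knuts


# All of the coinage in one list: galleons, sickles, knuts
coins = [100, 50, 25]

# Each element is passed individually by its numeric index
print(total(coins[0], coins[1], coins[2]), "Knuts")
```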
1528 01:16:15,500 --> 01:16:20,540 Wouldn't it be nice if I could just say something like this, coins. 1529 01:16:20,540 --> 01:16:28,380 But let me pause and ask the group, why would this not actually work as is? 1530 01:16:28,380 --> 01:16:30,900 It technically is passing in all three. 1531 01:16:30,900 --> 01:16:34,140 But why would I get some kind of error when I run this? 1532 01:16:34,140 --> 01:16:35,340 Eric? 1533 01:16:35,340 --> 01:16:38,455 STUDENT: Because you are passing a list to galleons. 1534 01:16:38,455 --> 01:16:39,330 DAVID J. MALAN: Yeah. 1535 01:16:39,330 --> 01:16:42,900 I'm passing a list to galleons and nothing for sickles and knuts. 1536 01:16:42,900 --> 01:16:44,830 And notice, those don't have default values. 1537 01:16:44,830 --> 01:16:46,770 There's no equal signs on that first line 1538 01:16:46,770 --> 01:16:49,020 of code, which means Python is not going to know 1539 01:16:49,020 --> 01:16:51,340 what value should be assumed there. 1540 01:16:51,340 --> 01:16:53,790 So it just seems like it's not going to work. 1541 01:16:53,790 --> 01:16:55,710 Plus, it's the wrong type, as Eric notes. 1542 01:16:55,710 --> 01:16:59,470 It's a list and it's not an integer as it was before. 1543 01:16:59,470 --> 01:17:01,500 So let's actually run this incorrect version, 1544 01:17:01,500 --> 01:17:04,920 Python of unpack.py Enter, type error. 1545 01:17:04,920 --> 01:17:06,880 And that is probably what you might expect, 1546 01:17:06,880 --> 01:17:08,580 like I'm messing up with the types here. 1547 01:17:08,580 --> 01:17:12,120 And I am required to pass in two positional arguments, sickles 1548 01:17:12,120 --> 01:17:13,920 and knuts, that were not even passed. 1549 01:17:13,920 --> 01:17:15,750 So I've definitely erred here. 
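The failing call can be reproduced with a short sketch. The try/except is added here only so the script keeps running and the message can be inspected; the lecture simply runs the bad call. Conversion rates in total are assumed as before.

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


coins = [100, 50, 25]

try:
    # The whole list is bound to galleons; sickles and knuts get nothing.
    total(coins)
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```

The message names the two parameters that never received a value, sickles and knuts.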
1550 01:17:15,750 --> 01:17:18,780 But it certainly seems unfortunate if the only solution to this 1551 01:17:18,780 --> 01:17:22,893 is to do what I previously did, which is index into the first element, index 1552 01:17:22,893 --> 01:17:24,810 into the second element, index into the third. 1553 01:17:24,810 --> 01:17:27,360 You can imagine, with bigger fancier functions that 1554 01:17:27,360 --> 01:17:30,810 take even more arguments, this is going to get very verbose and honestly 1555 01:17:30,810 --> 01:17:35,070 very vulnerable, potentially, to just mistakes, typos on my part. 1556 01:17:35,070 --> 01:17:38,400 But here too is where you can do what's known, again, 1557 01:17:38,400 --> 01:17:41,250 as unpacking a value in Python. 1558 01:17:41,250 --> 01:17:44,970 Right now, a list is packed with multiple values. 1559 01:17:44,970 --> 01:17:49,538 My current list has these three values, 100, 50, and 25 respectively. 1560 01:17:49,538 --> 01:17:51,330 But they're all packed up in this one list. 1561 01:17:51,330 --> 01:17:54,450 Wouldn't it be nice if I could unpack that list, 1562 01:17:54,450 --> 01:17:59,820 just like I previously unpacked the return value of the str class's split 1563 01:17:59,820 --> 01:18:02,220 function into multiple things too. 1564 01:18:02,220 --> 01:18:04,350 And indeed, I can do just that. 1565 01:18:04,350 --> 01:18:10,480 Python actually allows me to pass in not coins but star coins. 1566 01:18:10,480 --> 01:18:13,890 So if you use a single asterisk at the beginning of your variable, 1567 01:18:13,890 --> 01:18:15,540 that will unpack it. 1568 01:18:15,540 --> 01:18:20,340 And it will take one sequence, in this case coins of size three, 1569 01:18:20,340 --> 01:18:25,200 and explode it, if you will, unpack it into three individual arguments. 1570 01:18:25,200 --> 01:18:26,430 No commas are needed. 1571 01:18:26,430 --> 01:18:28,170 Python just handles this for you. 
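A minimal sketch of the star syntax just described, with the same assumed conversion rates in total:

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


coins = [100, 50, 25]

# *coins unpacks the list into three separate positional arguments,
# so this call behaves like total(100, 50, 25).
result = total(*coins)
print(result, "knuts")
```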
1572 01:18:28,170 --> 01:18:32,040 But the effect of passing in star coins is 1573 01:18:32,040 --> 01:18:36,400 to pass in the individual members of that list. 1574 01:18:36,400 --> 01:18:39,540 Which in this case are going to be 100, 50, and 25 respectively. 1575 01:18:39,540 --> 01:18:42,480 Which is perfect, because now it's going to line up with galleons, 1576 01:18:42,480 --> 01:18:44,320 sickles, knuts respectively. 1577 01:18:44,320 --> 01:18:49,500 So now when I run Python of unpack.py, we're back in business 1578 01:18:49,500 --> 01:18:50,670 and the math checks out. 1579 01:18:50,670 --> 01:18:54,390 But I've cleaned up my code by just introducing this new symbol, which 1580 01:18:54,390 --> 01:18:58,030 we've used, of course, in other contexts for multiplication and the like. 1581 01:18:58,030 --> 01:19:02,130 But now, it's also used for unpacking in this way. 1582 01:19:02,130 --> 01:19:04,740 Questions on what we've just done? 1583 01:19:04,740 --> 01:19:09,090 It's a single operator, but it's already quite powerful. 1584 01:19:09,090 --> 01:19:12,180 Because it allows us to take a data structure and unpack it 1585 01:19:12,180 --> 01:19:14,340 and pass it in individually. 1586 01:19:14,340 --> 01:19:22,890 STUDENT: Does that work for tuples, sets, dicts, dictionaries as well? 1587 01:19:22,890 --> 01:19:25,430 DAVID J. MALAN: Tuples, yes. 1588 01:19:25,430 --> 01:19:27,530 Sets I don't know. 1589 01:19:27,530 --> 01:19:32,000 [? Ranshin? ?] I don't know if order is preserved. 1590 01:19:32,000 --> 01:19:33,110 No. 1591 01:19:33,110 --> 01:19:37,103 Oh, is that no it does not, or you're checking? 1592 01:19:37,103 --> 01:19:38,020 Order's not preserved. 1593 01:19:38,020 --> 01:19:40,900 So it wouldn't work with set? 1594 01:19:40,900 --> 01:19:42,090 It does not work with set. 1595 01:19:42,090 --> 01:19:43,710 Does not work with set. 1596 01:19:43,710 --> 01:19:46,770 Sorry, I'm verbally googling here just to save us some keystrokes. 
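One nuance worth checking for yourself: as far as the language itself goes, the star accepts any iterable, so a tuple works exactly like a list, and a set is accepted by the syntax too. The catch with a set is the one raised above: it has no defined order, so there is no guarantee which value lands in which parameter, which makes it unsuitable for a function like total. In the sketch below, collect is a hypothetical helper used only to show what arrives.

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


# A tuple preserves order, so it unpacks just like the list did.
purse = (100, 50, 25)
print(total(*purse), "knuts")


# Hypothetical helper: returns whatever positional arguments it received.
def collect(*values):
    return values


# A set can be unpacked, but the order of the values is not guaranteed,
# so we sort before printing to get a stable display.
unordered = collect(*{100, 50, 25})
print(sorted(unordered))
```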
1597 01:19:46,770 --> 01:19:51,062 So it would work for enumerations where order is indeed preserved. 1598 01:19:51,062 --> 01:19:52,770 And we'll see another example in a moment 1599 01:19:52,770 --> 01:19:55,520 where it actually can be used in a different way for dictionaries, 1600 01:19:55,520 --> 01:19:57,420 which nowadays do preserve order. 1601 01:19:57,420 --> 01:20:00,720 Other questions on unpacking in this way? 1602 01:20:00,720 --> 01:20:01,350 STUDENT: Yes. 1603 01:20:01,350 --> 01:20:02,070 Hi. 1604 01:20:02,070 --> 01:20:02,987 DAVID J. MALAN: Hello. 1605 01:20:02,987 --> 01:20:07,290 STUDENT: Can you use unpacking to get the value, for example 10 plus 50 1606 01:20:07,290 --> 01:20:13,205 plus 25 instead of a for loop and then result plus? 1607 01:20:13,205 --> 01:20:14,580 DAVID J. MALAN: Short answer, no. 1608 01:20:14,580 --> 01:20:18,030 If you want the individual values, you should be just indexing, in this case, 1609 01:20:18,030 --> 01:20:20,710 into those specific locations. 1610 01:20:20,710 --> 01:20:25,767 This is returning multiple values, the equivalent of a comma separated list. 1611 01:20:25,767 --> 01:20:27,600 So you would use the earlier approach if you 1612 01:20:27,600 --> 01:20:30,060 cared about the individual locations. 1613 01:20:30,060 --> 01:20:32,670 How about one other question on unpacking? 1614 01:20:32,670 --> 01:20:35,190 STUDENT: What if we have declared-- 1615 01:20:35,190 --> 01:20:37,500 we declare some default values. 1616 01:20:37,500 --> 01:20:44,690 And if you use this as two points, will it go out right, or will it skip it? 1617 01:20:44,690 --> 01:20:45,940 DAVID J. MALAN: Good question. 1618 01:20:45,940 --> 01:20:47,773 If I heard you right, what if, for instance, 1619 01:20:47,773 --> 01:20:51,900 the list has four values like this here and you're still unpacking it 1620 01:20:51,900 --> 01:20:53,730 when it's only three that's expected. 1621 01:20:53,730 --> 01:20:54,570 Well, let's try it.
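The experiment can be sketched like this. The fourth value (5) is a hypothetical stand-in, since the transcript does not say which number was added, and the try/except is there only to capture the message.

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


# Four values, but total only accepts three parameters.
coins = [100, 50, 25, 5]

try:
    total(*coins)
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```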
1622 01:20:54,570 --> 01:20:58,140 Python of unpack.py Enter. 1623 01:20:58,140 --> 01:20:59,280 Another type error. 1624 01:20:59,280 --> 01:21:02,560 This time it takes three positional arguments but four were given. 1625 01:21:02,560 --> 01:21:06,390 So the onus is on us as the programmer not to do that in this case. 1626 01:21:06,390 --> 01:21:09,090 So potentially fragile, but avoidable if I'm 1627 01:21:09,090 --> 01:21:11,470 controlling the contents of this list. 1628 01:21:11,470 --> 01:21:14,730 In fact, let me propose now that we take a look at another variant of this. 1629 01:21:14,730 --> 01:21:18,480 Whereby we use not just positional arguments, 1630 01:21:18,480 --> 01:21:21,720 whereby we trust that the first is galleons, the second is sickles, 1631 01:21:21,720 --> 01:21:22,680 the third is knuts. 1632 01:21:22,680 --> 01:21:25,950 Suppose that we actually passed in the names as we're allowed to do in Python. 1633 01:21:25,950 --> 01:21:28,560 And then, technically, we could pass them in in any order 1634 01:21:28,560 --> 01:21:31,830 and Python would figure it out using named parameters instead. 1635 01:21:31,830 --> 01:21:32,940 Well, how might I do this? 1636 01:21:32,940 --> 01:21:35,190 Well, it's going to be a bit of a regression at first. 1637 01:21:35,190 --> 01:21:37,560 So let me get rid of this list here. 1638 01:21:37,560 --> 01:21:41,520 Let me change this now to just manually passing the values I care about. 1639 01:21:41,520 --> 01:21:44,010 Galleons I want to still equal 100. 1640 01:21:44,010 --> 01:21:46,380 Sickles I want to equal 50. 1641 01:21:46,380 --> 01:21:49,540 And knuts I want to equal 25. 1642 01:21:49,540 --> 01:21:52,140 So this is old-school parameter passing. 1643 01:21:52,140 --> 01:21:53,310 It's no longer positional. 1644 01:21:53,310 --> 01:21:56,040 I'm explicitly specifying the names of these arguments. 
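Passing the values by name can be sketched as follows, again with assumed conversion rates. The second call is an addition here to show that order stops mattering once names are given.

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


# Named (keyword) arguments: each value is matched to a parameter by name.
print(total(galleons=100, sickles=50, knuts=25), "knuts")

# Because matching is by name, the order at the call site is free.
print(total(knuts=25, galleons=100, sickles=50), "knuts")
```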
1645 01:21:56,040 --> 01:21:58,830 But that's just going to work because that's 1646 01:21:58,830 --> 01:22:02,520 exactly what the names of these parameters are in my total function 1647 01:22:02,520 --> 01:22:03,240 as before. 1648 01:22:03,240 --> 01:22:05,790 Let's make sure I, nonetheless, did not break anything. 1649 01:22:05,790 --> 01:22:10,020 Let's run Python of unpack.py Enter. 1650 01:22:10,020 --> 01:22:13,800 And there we have it, still 50,775 knuts. 1651 01:22:13,800 --> 01:22:21,150 Well, once you start giving things names and values, names and values, 1652 01:22:21,150 --> 01:22:25,290 that probably should bring to mind one of our most versatile data 1653 01:22:25,290 --> 01:22:29,670 structures in Python and even other languages, that of a dictionary. 1654 01:22:29,670 --> 01:22:33,600 Remember that a dictionary is just a collection of key value pairs, names 1655 01:22:33,600 --> 01:22:35,200 and their respective values. 1656 01:22:35,200 --> 01:22:37,330 So this kind of opens up an opportunity. 1657 01:22:37,330 --> 01:22:38,470 What if I did this. 1658 01:22:38,470 --> 01:22:42,810 What if I actually had for some reason in my program a variable 1659 01:22:42,810 --> 01:22:44,190 as before called coins. 1660 01:22:44,190 --> 01:22:47,460 But instead of making it a list of three values like before, 1661 01:22:47,460 --> 01:22:49,240 what if it's a proper dictionary? 1662 01:22:49,240 --> 01:22:53,910 So what if it's galleons, quote unquote, colon 100 for 100 of those, 1663 01:22:53,910 --> 01:22:59,415 sickles quote unquote and 50 of those, and knuts quote unquote 25 of those, 1664 01:22:59,415 --> 01:23:02,790 each of those separated by colons. 1665 01:23:02,790 --> 01:23:06,120 And let me fix my square brackets to this time 1666 01:23:06,120 --> 01:23:10,200 be curly braces, which, recall, is the symbol we use for dictionaries 1667 01:23:10,200 --> 01:23:12,240 or dict objects in Python.
1668 01:23:12,240 --> 01:23:14,880 So now, I have a dictionary called coins. 1669 01:23:14,880 --> 01:23:15,790 Not a list. 1670 01:23:15,790 --> 01:23:20,430 It's a collection of keys and values, three keys, galleons, sickles, knuts, 1671 01:23:20,430 --> 01:23:24,180 and three values, 100, 50, and 25 respectively. 1672 01:23:24,180 --> 01:23:29,910 If I were to now pass these individual values into my total function, 1673 01:23:29,910 --> 01:23:32,800 I could do it as always with my dictionary. 1674 01:23:32,800 --> 01:23:34,140 So I'm doing it old-school now. 1675 01:23:34,140 --> 01:23:36,060 Coins is the name of my dictionary. 1676 01:23:36,060 --> 01:23:39,960 I index into it not with numbers like with lists, but with words. 1677 01:23:39,960 --> 01:23:44,250 So galleons, strings like this, coins, quote unquote, 1678 01:23:44,250 --> 01:23:46,210 sickles in square brackets there. 1679 01:23:46,210 --> 01:23:49,500 And then lastly, coins, square brackets, quote unquote, knuts. 1680 01:23:49,500 --> 01:23:51,540 So it's getting-- it's verbose again. 1681 01:23:51,540 --> 01:23:53,340 This is not maybe the best road to go down. 1682 01:23:53,340 --> 01:23:54,940 But we'll backpedal in a moment. 1683 01:23:54,940 --> 01:23:58,800 This is just how, if you happen to have all of your coins stored 1684 01:23:58,800 --> 01:24:02,580 in a dictionary, you could pass the galleons, sickles, and knuts 1685 01:24:02,580 --> 01:24:04,543 into your function respectively. 1686 01:24:04,543 --> 01:24:06,210 Let's make sure I didn't break anything. 1687 01:24:06,210 --> 01:24:10,290 Let's rerun Python of unpack.py, and we're still good. 1688 01:24:10,290 --> 01:24:12,390 Now, how could we get to a situation like this? 1689 01:24:12,390 --> 01:24:16,740 Well, as always, imagine this program is a little longer than this one here. 
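The dictionary version of the purse, and the key-by-key call just described, can be sketched as:

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


# A dict instead of a list: keys are coin names, values are counts.
coins = {"galleons": 100, "sickles": 50, "knuts": 25}

# Index into the dict with strings rather than numeric positions.
result = total(coins["galleons"], coins["sickles"], coins["knuts"])
print(result, "knuts")
```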
1690 01:24:16,740 --> 01:24:19,230 And somehow you're using a dictionary maybe 1691 01:24:19,230 --> 01:24:21,630 just to keep track of someone's purse or wallet, 1692 01:24:21,630 --> 01:24:24,450 like how many coins of each type that they have. 1693 01:24:24,450 --> 01:24:27,120 And as such, it's perfectly reasonable to use a dictionary. 1694 01:24:27,120 --> 01:24:29,010 But then you want to print out the total. 1695 01:24:29,010 --> 01:24:29,700 And darn it. 1696 01:24:29,700 --> 01:24:34,650 That total function does not expect a dictionary, so you cannot just do 1697 01:24:34,650 --> 01:24:37,380 something nice and simple like pass in coins. 1698 01:24:37,380 --> 01:24:39,690 For reasons we saw earlier, that would be a type error. 1699 01:24:39,690 --> 01:24:42,480 Total expects three arguments, three integers. 1700 01:24:42,480 --> 01:24:44,890 You can't just pass in a dictionary. 1701 01:24:44,890 --> 01:24:48,330 But if that's the data structure you're using to store the person's purse 1702 01:24:48,330 --> 01:24:51,090 or wallet, well, it's kind of unfortunate 1703 01:24:51,090 --> 01:24:54,180 that we have this clash between these data types. 1704 01:24:54,180 --> 01:24:55,990 Well, here's what we can do. 1705 01:24:55,990 --> 01:24:57,420 We can't pass in coins. 1706 01:24:57,420 --> 01:25:00,750 Because watch, if I try doing that and run Python of unpack.py, 1707 01:25:00,750 --> 01:25:02,850 we're getting another type error. 1708 01:25:02,850 --> 01:25:05,160 Missing two required positional arguments. 1709 01:25:05,160 --> 01:25:08,700 Sickles and knuts, I have to pass in three things. 1710 01:25:08,700 --> 01:25:14,310 But, wonderfully, Python allows you to unpack dictionaries as well. 1711 01:25:14,310 --> 01:25:18,450 For a dictionary, you don't use a single asterisk, you use two. 1712 01:25:18,450 --> 01:25:21,960 And what this syntax has the effect of doing 1713 01:25:21,960 --> 01:25:26,070 is passing in three values with names.
1714 01:25:26,070 --> 01:25:30,870 It has the effect of passing in galleons equals 100 comma, 1715 01:25:30,870 --> 01:25:35,880 sickles equals 50 comma, knuts equals 25. 1716 01:25:35,880 --> 01:25:39,570 And so, it has the similar effect to the list unpacking. 1717 01:25:39,570 --> 01:25:44,550 But that just passed in the values, 100, 50, 25 separated by commas in effect. 1718 01:25:44,550 --> 01:25:48,270 When unpacking a dictionary, it passes in the keys 1719 01:25:48,270 --> 01:25:52,590 and the values separated conceptually with equal signs 1720 01:25:52,590 --> 01:25:55,290 just like our function expects. 1721 01:25:55,290 --> 01:25:59,310 So if I now run Python of unpack.py again, we're still good, 1722 01:25:59,310 --> 01:26:01,230 but we've tightened our code up again. 1723 01:26:01,230 --> 01:26:03,240 And now, I'm giving myself yet another option. 1724 01:26:03,240 --> 01:26:08,910 I can either store a wizard's purse or wallets in their-- 1725 01:26:08,910 --> 01:26:10,440 in a list as we did earlier. 1726 01:26:10,440 --> 01:26:13,410 Or I can store it in a little more versatility-- with even more 1727 01:26:13,410 --> 01:26:16,270 specificity using a dictionary instead. 1728 01:26:16,270 --> 01:26:18,270 And so, to be clear, let me rewind. 1729 01:26:18,270 --> 01:26:22,230 Star star coins is the same thing if I rewind 1730 01:26:22,230 --> 01:26:26,430 a little bit to our first example of named arguments 1731 01:26:26,430 --> 01:26:28,680 is equivalent to what I've highlighted here. 1732 01:26:28,680 --> 01:26:32,940 When you unpack a dictionary, it passes in all of the keys 1733 01:26:32,940 --> 01:26:36,840 and all of the values much like the syntax here. 1734 01:26:36,840 --> 01:26:39,780 But let me tighten it up and go to where we left off. 1735 01:26:39,780 --> 01:26:41,790 Questions now on unpacking? 
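Both halves of that story fit in one short sketch: passing the dictionary as is fails, while two stars unpack it into named arguments. The try/except is added only to keep the script running past the error.

```python
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons, sickles, knuts):
    return (galleons * 17 + sickles) * 29 + knuts


coins = {"galleons": 100, "sickles": 50, "knuts": 25}

try:
    # The dict itself is bound to galleons; sickles and knuts are missing.
    total(coins)
except TypeError as error:
    print("TypeError:", error)

# **coins behaves like total(galleons=100, sickles=50, knuts=25).
result = total(**coins)
print(result, "knuts")
```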
1736 01:26:41,790 --> 01:26:44,280 STUDENT: Can we have a-- in this dictionary, 1737 01:26:44,280 --> 01:26:49,500 can we have instead of having a constant name value pair, can we have 1738 01:26:49,500 --> 01:26:52,913 a variable number of name value pairs? 1739 01:26:52,913 --> 01:26:54,330 DAVID J. MALAN: Short answer, yes. 1740 01:26:54,330 --> 01:26:57,630 You can have more than three key value pairs as I have here. 1741 01:26:57,630 --> 01:27:01,920 But it's not going to work unpacking it if the total function is 1742 01:27:01,920 --> 01:27:03,420 expecting only three. 1743 01:27:03,420 --> 01:27:06,750 So if I were to add something here, like let me introduce pennies 1744 01:27:06,750 --> 01:27:07,950 to the wizarding world. 1745 01:27:07,950 --> 01:27:11,220 And suppose I have one penny, for instance. 1746 01:27:11,220 --> 01:27:17,440 And now I run this same code, Python of unpack.py, we're back to a type error 1747 01:27:17,440 --> 01:27:17,940 again. 1748 01:27:17,940 --> 01:27:20,940 Whereby I got an unexpected keyword argument pennies, 1749 01:27:20,940 --> 01:27:24,090 because that is not expected by the total function. 1750 01:27:24,090 --> 01:27:27,790 We will see in just a moment, wonderfully, a solution though to that. 1751 01:27:27,790 --> 01:27:29,790 But for now, it does not work. 1752 01:27:29,790 --> 01:27:33,150 Other questions on unpacking with dictionaries or lists? 1753 01:27:33,150 --> 01:27:37,230 STUDENT: In list-- in list values, we gave the same number of arguments 1754 01:27:37,230 --> 01:27:40,470 and we declared a default value in the function. 1755 01:27:40,470 --> 01:27:44,070 Now, if you use this asterisk, will it overwrite that value 1756 01:27:44,070 --> 01:27:46,427 or will it skip that default value? 1757 01:27:46,427 --> 01:27:47,760 DAVID J. MALAN: A good question. 
1758 01:27:47,760 --> 01:27:51,660 If we did have default values up here, for instance, 1759 01:27:51,660 --> 01:27:56,670 equals zero, equals zero, equals zero, the upside of that, 1760 01:27:56,670 --> 01:27:59,730 recall, from our discussion of arguments to functions a while back, 1761 01:27:59,730 --> 01:28:02,760 is that now you don't have to pass in all of those values. 1762 01:28:02,760 --> 01:28:05,070 They will default to those zeros. 1763 01:28:05,070 --> 01:28:08,220 Therefore, you could pass in fewer than three values, 1764 01:28:08,220 --> 01:28:11,910 either using a list or a dictionary that's unpacked in this scenario. 1765 01:28:11,910 --> 01:28:15,360 I deliberately did not do that because I wanted us to encounter 1766 01:28:15,360 --> 01:28:16,980 this specific error in this case. 1767 01:28:16,980 --> 01:28:21,270 But you could absolutely go back and add those defaults. 1768 01:28:21,270 --> 01:28:24,840 So it turns out that this single asterisk or this double asterisk 1769 01:28:24,840 --> 01:28:27,060 is not only used in the context of unpacking. 1770 01:28:27,060 --> 01:28:30,990 That same syntax is actually used as a visual indicator in Python 1771 01:28:30,990 --> 01:28:33,900 when a function itself might very well 1772 01:28:33,900 --> 01:28:36,960 take a variable number of arguments. 1773 01:28:36,960 --> 01:28:39,210 That is to say, a function can be variadic. 1774 01:28:39,210 --> 01:28:41,880 Which means that it doesn't necessarily have to take, say, 1775 01:28:41,880 --> 01:28:43,920 three arguments specifically. 1776 01:28:43,920 --> 01:28:46,020 Even if they do or don't have default values, 1777 01:28:46,020 --> 01:28:49,990 it can take maybe zero, or one, or two, or three. 1778 01:28:49,990 --> 01:28:52,770 And it turns out the syntax for implementing the same idea is 1779 01:28:52,770 --> 01:28:54,490 quite similar in spirit. 1780 01:28:54,490 --> 01:28:56,820 In fact, let me go back to VS Code here.
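Two points from the questions above can be sketched together: with defaults of zero, a shorter list or dictionary unpacks fine, while an unexpected key such as pennies still raises a TypeError. The default-valued signature is the hypothetical variant discussed in the answer, not the version used in the lecture's file.

```python
# Hypothetical variant of total with defaults of zero.
# Assumed conversion rates: 17 sickles per galleon, 29 knuts per sickle.
def total(galleons=0, sickles=0, knuts=0):
    return (galleons * 17 + sickles) * 29 + knuts


# Fewer values than parameters: the rest fall back to their defaults.
from_list = total(*[100, 50])        # knuts defaults to 0
from_dict = total(**{"knuts": 25})   # galleons and sickles default to 0
print(from_list, from_dict)

try:
    # A key with no matching parameter is still rejected.
    total(**{"galleons": 100, "sickles": 50, "knuts": 25, "pennies": 1})
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```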
1781 01:28:56,820 --> 01:28:59,130 And let me propose that we start over with this code 1782 01:28:59,130 --> 01:29:01,800 and get rid of our notion of galleons, and sickles, and knuts, 1783 01:29:01,800 --> 01:29:04,383 and do something just a little more generic just so that we've 1784 01:29:04,383 --> 01:29:05,800 seen the syntax for this. 1785 01:29:05,800 --> 01:29:09,240 Suppose that I define a function as follows. 1786 01:29:09,240 --> 01:29:11,490 Define a function, let's call it f. 1787 01:29:11,490 --> 01:29:15,210 And that function is not going to take a specific number of arguments 1788 01:29:15,210 --> 01:29:16,440 but a variable one. 1789 01:29:16,440 --> 01:29:18,930 And so, I'm going to go ahead and use this syntax here, 1790 01:29:18,930 --> 01:29:23,440 star args, which indicates that this function is indeed variadic. 1791 01:29:23,440 --> 01:29:26,850 It takes some variable number of positional arguments. 1792 01:29:26,850 --> 01:29:29,820 Positional in the sense that they go typically from left to right. 1793 01:29:29,820 --> 01:29:32,550 But I don't know how many just yet I want to support. 1794 01:29:32,550 --> 01:29:34,560 Suppose that I additionally want to support 1795 01:29:34,560 --> 01:29:36,720 some number of keyword arguments, that is, 1796 01:29:36,720 --> 01:29:40,140 named parameters that can be called optionally 1797 01:29:40,140 --> 01:29:42,160 and individually by their own name. 1798 01:29:42,160 --> 01:29:44,200 Well, the convention syntactically here would 1799 01:29:44,200 --> 01:29:46,900 be to use two stars and then kwargs. 1800 01:29:46,900 --> 01:29:50,290 I could call args, or kwargs, anything else that I want. 
1801 01:29:50,290 --> 01:29:53,860 But a convention you'll frequently see in Python's own documentation 1802 01:29:53,860 --> 01:29:57,280 is that when you have placeholders like this for some number of arguments 1803 01:29:57,280 --> 01:30:03,220 and some number of keyword arguments, the world tends to use args and kwargs. 1804 01:30:03,220 --> 01:30:06,760 Well, inside of this function, let's do something super simple just for now. 1805 01:30:06,760 --> 01:30:10,600 Let me go ahead and print out literally quote unquote positional, 1806 01:30:10,600 --> 01:30:13,870 just to indicate to myself while wrapping my mind around what's 1807 01:30:13,870 --> 01:30:16,810 going on here what the positional arguments are. 1808 01:30:16,810 --> 01:30:19,255 And let me quite simply print out those args. 1809 01:30:19,255 --> 01:30:21,130 This is not something you would typically do. 1810 01:30:21,130 --> 01:30:23,050 You don't typically just take in these arguments 1811 01:30:23,050 --> 01:30:24,925 and print them, no matter how many there are. 1812 01:30:24,925 --> 01:30:28,930 I'm just doing this diagnostically for now to show you how the syntax works. 1813 01:30:28,930 --> 01:30:31,120 Now, let me go ahead at the bottom of my file-- 1814 01:30:31,120 --> 01:30:34,090 and I won't bother with a main function this time so we can focus only 1815 01:30:34,090 --> 01:30:35,560 on this function f. 1816 01:30:35,560 --> 01:30:38,260 Let me go ahead and just call f with three arguments. 1817 01:30:38,260 --> 01:30:39,850 I'll use the same arguments as before. 1818 01:30:39,850 --> 01:30:42,670 But I didn't bother giving them names just yet, like galleons, 1819 01:30:42,670 --> 01:30:44,690 and sickles, and knuts, and the like. 1820 01:30:44,690 --> 01:30:46,420 So what do I have? 1821 01:30:46,420 --> 01:30:49,180 A program that no matter what calls this function f, 1822 01:30:49,180 --> 01:30:51,760 but it first defines f at the top of the file. 
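The diagnostic function described so far can be sketched as follows. The exact label text is an assumption based on the spoken description.

```python
# f accepts any number of positional and named arguments.
def f(*args, **kwargs):
    # Diagnostic only: show what arrived positionally.
    print("Positional:", args)


# No main function here, just a direct call with three values.
f(100, 50, 25)
```

Inside f, args is a tuple holding the three values that were passed.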
1823 01:30:51,760 --> 01:30:54,100 It's taking some number of positional arguments, 1824 01:30:54,100 --> 01:30:56,110 some number of named arguments. 1825 01:30:56,110 --> 01:30:59,590 And for the moment, I'm just printing out the positional ones. 1826 01:30:59,590 --> 01:31:01,610 Let me go ahead and in my terminal window 1827 01:31:01,610 --> 01:31:04,510 run Python of unpack.py and hit Enter. 1828 01:31:04,510 --> 01:31:07,930 And you'll see that the positional arguments passed in 1829 01:31:07,930 --> 01:31:09,520 are apparently this-- 1830 01:31:09,520 --> 01:31:12,790 a sequence, 100, 50, 25. 1831 01:31:12,790 --> 01:31:13,900 But notice this. 1832 01:31:13,900 --> 01:31:16,630 If I clear my terminal window there and pass in something else, 1833 01:31:16,630 --> 01:31:18,550 like five, a fourth argument. 1834 01:31:18,550 --> 01:31:22,450 Previously, if I tried to change the number of arguments 1835 01:31:22,450 --> 01:31:27,250 I'm passing in to my total function, which was only defined as taking three, 1836 01:31:27,250 --> 01:31:29,920 I would have gotten a type error, some visual indication 1837 01:31:29,920 --> 01:31:32,950 that, no, you can't pass in more or fewer arguments 1838 01:31:32,950 --> 01:31:35,410 than is actually in the function's definition. 1839 01:31:35,410 --> 01:31:36,340 But now watch. 1840 01:31:36,340 --> 01:31:38,890 If I run Python of unpack.py, this time passing 1841 01:31:38,890 --> 01:31:44,500 in 100, 50, 25, and 5, a fourth argument, all four of those 1842 01:31:44,500 --> 01:31:46,100 went through just fine. 1843 01:31:46,100 --> 01:31:50,830 I can get rid of all of those but one, for instance now, rerun my program 1844 01:31:50,830 --> 01:31:51,920 after clearing my screen. 1845 01:31:51,920 --> 01:31:54,250 And now, I'll see just one argument here. 
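A sketch of the same function called with four arguments and then with one, to show that the count really is variable:

```python
def f(*args, **kwargs):
    # Diagnostic only: show what arrived positionally.
    print("Positional:", args)


f(100, 50, 25, 5)  # four arguments are accepted without complaint
f(100)             # so is a single argument
```

With one argument the tuple prints with a trailing comma, which is how Python writes a one-element tuple.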
1846 01:31:54,250 --> 01:31:56,590 And even though there's a comma and nothing after it, 1847 01:31:56,590 --> 01:32:00,842 this is actually the syntax when seeing a tuple, in effect, whereby 1848 01:32:00,842 --> 01:32:02,800 the comma just indicates this is indeed a list, 1849 01:32:02,800 --> 01:32:05,440 but there's only one element therein. 1850 01:32:05,440 --> 01:32:07,480 Well, let's get a little more curious too. 1851 01:32:07,480 --> 01:32:10,210 Let me go ahead and rewind here to where we 1852 01:32:10,210 --> 01:32:11,890 started with just those three values. 1853 01:32:11,890 --> 01:32:15,640 And this time, let me go ahead and print out my named argument, so to speak, 1854 01:32:15,640 --> 01:32:18,430 which isn't args but kwargs. 1855 01:32:18,430 --> 01:32:21,280 Again, the positional args in this syntax come first. 1856 01:32:21,280 --> 01:32:24,440 The named arguments, kwargs come second. 1857 01:32:24,440 --> 01:32:26,510 That's what Python prescribes. 1858 01:32:26,510 --> 01:32:29,800 So now, let me go ahead and not pass in just these numbers. 1859 01:32:29,800 --> 01:32:32,740 Let me go ahead and pass in actually named arguments. 1860 01:32:32,740 --> 01:32:36,850 So let me do something now more specifically, like galleons equals 100, 1861 01:32:36,850 --> 01:32:40,257 and sickles equals 50, and knuts equals 25. 1862 01:32:40,257 --> 01:32:42,340 I'm not going to bother doing any math with total. 1863 01:32:42,340 --> 01:32:46,060 I just want to poke around right now at this functionality of having 1864 01:32:46,060 --> 01:32:47,830 a variable number of arguments. 1865 01:32:47,830 --> 01:32:53,920 And what's neat now is if I run Python of unpack.py and hit Enter, no problem. 1866 01:32:53,920 --> 01:32:58,870 What kwargs is, is automatically a dictionary that 1867 01:32:58,870 --> 01:33:03,260 contains all of the named arguments that were passed to my function. 
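Printing kwargs as well can be sketched like this. The "Named:" label is an assumption; the transcript only says the named arguments are printed.

```python
def f(*args, **kwargs):
    print("Positional:", args)  # a tuple of the positional arguments
    print("Named:", kwargs)     # a dict of the named arguments


# Only named arguments this time, so args is empty.
f(galleons=100, sickles=50, knuts=25)
```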
1868 01:33:03,260 --> 01:33:05,860 Which is to say, when designing your own functions, 1869 01:33:05,860 --> 01:33:09,250 if you want to support more than one argument, 1870 01:33:09,250 --> 01:33:12,910 maybe more than two, or three, or four, maybe a variable number of arguments, 1871 01:33:12,910 --> 01:33:17,080 indeed, you can support both a variable number of positional arguments 1872 01:33:17,080 --> 01:33:21,490 that are just value comma value comma value, or any number of named 1873 01:33:21,490 --> 01:33:24,040 arguments, where you actually put the name of the parameter 1874 01:33:24,040 --> 01:33:28,760 equals the value and then maybe a comma and some more of the same. 1875 01:33:28,760 --> 01:33:34,720 So now, it turns out we have seen this before in some of the functions 1876 01:33:34,720 --> 01:33:36,250 we've used to date. 1877 01:33:36,250 --> 01:33:41,440 We didn't necessarily see it called args or necessarily see it called kwargs. 1878 01:33:41,440 --> 01:33:44,920 But we have seen at least one example of this in the wild. 1879 01:33:44,920 --> 01:33:48,580 Recall our old friend print, which we've been using now for weeks. 1880 01:33:48,580 --> 01:33:51,640 And when we first looked at the documentation for print way 1881 01:33:51,640 --> 01:33:54,590 back when, it looked a little something like this. 1882 01:33:54,590 --> 01:33:57,460 The first argument to print was objects. 1883 01:33:57,460 --> 01:34:00,070 And I waved my hand at the time at the asterisk that 1884 01:34:00,070 --> 01:34:02,080 was at the start of that variable name. 1885 01:34:02,080 --> 01:34:05,860 But then we had sep for separator, the default value of which was a space. 1886 01:34:05,860 --> 01:34:08,270 We had end, the default value of which was a new line. 1887 01:34:08,270 --> 01:34:11,380 And then some other named arguments that we waved our hands at then 1888 01:34:11,380 --> 01:34:13,210 and I'll again do now.
1889 01:34:13,210 --> 01:34:16,900 But what you can now perhaps infer from our emphasis 1890 01:34:16,900 --> 01:34:20,560 on these asterisks today, the single stars or the double stars, 1891 01:34:20,560 --> 01:34:21,520 is that you know what? 1892 01:34:21,520 --> 01:34:24,640 This is the convention in Python's documentation 1893 01:34:24,640 --> 01:34:30,440 to indicate that print takes a variable number of arguments. 1894 01:34:30,440 --> 01:34:33,610 So if we were to look at the actual implementation of the print 1895 01:34:33,610 --> 01:34:36,070 function implemented by Python's own authors, 1896 01:34:36,070 --> 01:34:38,290 it might very well look something like this. 1897 01:34:38,290 --> 01:34:40,690 Def print, and then the first argument 1898 01:34:40,690 --> 01:34:43,240 would be star objects, thereby indicating that print 1899 01:34:43,240 --> 01:34:45,440 takes a variable number of arguments. 1900 01:34:45,440 --> 01:34:48,070 The next one of which might be sep equals quote unquote 1901 01:34:48,070 --> 01:34:51,730 either using double quotes or as in the documentation single quotes too. 1902 01:34:51,730 --> 01:34:55,057 The next one of which might be end, the default value of which is a new line. 1903 01:34:55,057 --> 01:34:56,890 And then some of those other named arguments 1904 01:34:56,890 --> 01:34:58,400 that we've not looked at as well. 1905 01:34:58,400 --> 01:35:00,880 And then, maybe inside of the print function 1906 01:35:00,880 --> 01:35:03,250 implemented by the authors of Python, maybe 1907 01:35:03,250 --> 01:35:06,130 there's a for loop like for object in objects 1908 01:35:06,130 --> 01:35:10,960 that allows them to iterate over each of those variable number of objects 1909 01:35:10,960 --> 01:35:12,400 and print each of them.
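To make that idea concrete, here is a toy stand-in written for illustration. It is not how Python's real print is implemented; my_print is a hypothetical name, and only the objects, sep, and end parameters are modeled.

```python
import sys


# Hypothetical imitation of print, for illustration only.
def my_print(*objects, sep=" ", end="\n"):
    pieces = []
    # Iterate over however many objects were passed in.
    for obj in objects:
        pieces.append(str(obj))
    # Join with the separator and finish with the end string.
    sys.stdout.write(sep.join(pieces) + end)


my_print()                  # zero arguments: just the line ending
my_print("Hello, world")    # one argument
my_print("Hello", "world")  # two arguments, joined by the default space
```

Because objects is declared with a star, all three calls are valid even though the number of arguments differs.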
1910 01:35:12,400 --> 01:35:15,220 And this is why in programs past, you and I 1911 01:35:15,220 --> 01:35:18,790 have been able to do just print open parentheses close parenthesis 1912 01:35:18,790 --> 01:35:19,930 with nothing inside. 1913 01:35:19,930 --> 01:35:22,030 Or you and I have been able to print out something 1914 01:35:22,030 --> 01:35:25,240 like, Hello, world, a single string inside of those parentheses. 1915 01:35:25,240 --> 01:35:28,420 Or you and I have been able to do a single string, Hello, 1916 01:35:28,420 --> 01:35:30,880 and then another string quote unquote world, 1917 01:35:30,880 --> 01:35:33,760 thereby passing in two arguments or even more. 1918 01:35:33,760 --> 01:35:38,080 So we've long had this ability to use variadic functions, 1919 01:35:38,080 --> 01:35:41,050 whereby you can pass in a variable number of arguments. 1920 01:35:41,050 --> 01:35:45,730 What you now have via this args and kwargs syntax-- 1921 01:35:45,730 --> 01:35:48,250 but again, they do not need to be called that-- is 1922 01:35:48,250 --> 01:35:53,290 the ability using that star or two stars to implement those kinds of functions 1923 01:35:53,290 --> 01:35:55,040 yourself. 1924 01:35:55,040 --> 01:35:58,880 My own f function a moment ago did not do anything all that interesting. 1925 01:35:58,880 --> 01:36:01,250 But it hints at how you could, if in the future 1926 01:36:01,250 --> 01:36:06,230 you have a use case, for taking zero or one or more of either type of argument. 1927 01:36:06,230 --> 01:36:12,280 Any questions now on these types of arguments? 1928 01:36:12,280 --> 01:36:17,830 STUDENT: What will happen if you print kwargs and the argument is like a list? 1929 01:36:17,830 --> 01:36:18,890 DAVID J. MALAN: Ah. 1930 01:36:18,890 --> 01:36:20,830 So what would happen if you print the argument like it's a list? 1931 01:36:20,830 --> 01:36:21,910 So I think we saw that. 
1932 01:36:21,910 --> 01:36:26,650 If I roll back in my history here to when I had that f function. 1933 01:36:26,650 --> 01:36:28,810 Which I called f just to be very generic just so 1934 01:36:28,810 --> 01:36:30,760 we could play around with the syntax. 1935 01:36:30,760 --> 01:36:32,510 This is what I had here. 1936 01:36:32,510 --> 01:36:35,050 So this is a-- 1937 01:36:35,050 --> 01:36:38,500 I passed in 100 comma 50 comma 25. 1938 01:36:38,500 --> 01:36:41,500 That gets automatically stored in args. 1939 01:36:41,500 --> 01:36:44,800 And when I run it, you can actually see that sequence of values 1940 01:36:44,800 --> 01:36:46,630 by running Python of unpack.py. 1941 01:36:46,630 --> 01:36:50,560 There is that sequence all in the form of one single variable. 1942 01:36:50,560 --> 01:36:52,600 I'm printing it just for diagnostic purposes. 1943 01:36:52,600 --> 01:36:54,940 This is not really a useful or pretty program. 1944 01:36:54,940 --> 01:36:59,740 But it hints at how we can access that whole sequence of values. 1945 01:36:59,740 --> 01:37:03,640 Other questions on this approach here? 1946 01:37:03,640 --> 01:37:07,735 STUDENT: Can we pass the kwargs from one function to another function? 1947 01:37:07,735 --> 01:37:08,860 DAVID J. MALAN: Absolutely. 1948 01:37:08,860 --> 01:37:11,260 You can pass either of those to another function, which 1949 01:37:11,260 --> 01:37:14,560 you might want to do if you want to wrap another function, 1950 01:37:14,560 --> 01:37:17,260 provide some additional functionality, but still 1951 01:37:17,260 --> 01:37:23,290 pass in all of the supported arguments to the underlying function as well. 1952 01:37:23,290 --> 01:37:23,920 All right. 1953 01:37:23,920 --> 01:37:25,780 How about this next. 1954 01:37:25,780 --> 01:37:29,860 It turns out that a few other tools we can add to your tool kit 1955 01:37:29,860 --> 01:37:33,670 relate to the types of programming models that Python supports. 
1956 01:37:33,670 --> 01:37:36,250 We started out quite some time ago focusing really 1957 01:37:36,250 --> 01:37:38,440 on procedural programming in Python. 1958 01:37:38,440 --> 01:37:41,980 Whereby we wrote code top to bottom, left to right, defining some functions, 1959 01:37:41,980 --> 01:37:44,740 or if you will, procedures along the way, 1960 01:37:44,740 --> 01:37:47,110 defining variables, and having side effects, 1961 01:37:47,110 --> 01:37:48,890 and assigning values as needed. 1962 01:37:48,890 --> 01:37:51,700 But we then eventually introduced or really revealed 1963 01:37:51,700 --> 01:37:54,340 that Python is also very much object-oriented. 1964 01:37:54,340 --> 01:37:56,920 And a lot of those variables, a lot of those types 1965 01:37:56,920 --> 01:38:00,070 that we were using all that time were in fact objects, 1966 01:38:00,070 --> 01:38:02,950 objects that came from certain classes. 1967 01:38:02,950 --> 01:38:05,680 And those classes were templates of sorts, blueprints, 1968 01:38:05,680 --> 01:38:09,370 via which you could encapsulate both data and functionality therein. 1969 01:38:09,370 --> 01:38:11,380 Well, we also saw along the way some hints 1970 01:38:11,380 --> 01:38:14,350 of a third paradigm of programming that Python also, 1971 01:38:14,350 --> 01:38:18,310 to some extent, supports, which is known as functional programming. 1972 01:38:18,310 --> 01:38:21,250 Whereby functions are ever more powerful in that 1973 01:38:21,250 --> 01:38:25,630 they tend not to have side effects, no printing or changing of state globally. 1974 01:38:25,630 --> 01:38:28,120 But rather, they're completely self-contained 1975 01:38:28,120 --> 01:38:31,240 and might take as inputs and return values. 
1976 01:38:31,240 --> 01:38:34,510 And that's generally a paradigm we saw when we started sorting things, 1977 01:38:34,510 --> 01:38:37,660 particularly with functions like our sort function 1978 01:38:37,660 --> 01:38:40,390 or Lambda function when we passed in the function 1979 01:38:40,390 --> 01:38:43,240 we wanted to use to sort a list way back when. 1980 01:38:43,240 --> 01:38:46,690 Well, it turns out Python has other functionality that 1981 01:38:46,690 --> 01:38:50,590 is reminiscent of functional programming and indeed is a powerful way 1982 01:38:50,590 --> 01:38:53,260 to solve problems a little more differently still. 1983 01:38:53,260 --> 01:38:54,460 Let me propose this. 1984 01:38:54,460 --> 01:38:57,940 Let me propose that I whip up a new program here in VS Code 1985 01:38:57,940 --> 01:39:02,500 by closing our unpack.py and this time creating another program called yell. 1986 01:39:02,500 --> 01:39:05,800 Suppose the goal at hand is to implement some program that allows the user 1987 01:39:05,800 --> 01:39:08,470 to pass an input, and then it yells the response 1988 01:39:08,470 --> 01:39:10,450 by forcing everything to uppercase. 1989 01:39:10,450 --> 01:39:12,790 My apologies to those with headphones there. 1990 01:39:12,790 --> 01:39:13,960 I'll modulate. 1991 01:39:13,960 --> 01:39:16,690 So let me go ahead and run code of yell.py. 1992 01:39:16,690 --> 01:39:19,360 And within yell.py, let's go ahead and implement 1993 01:39:19,360 --> 01:39:22,190 a program that really does just that. 1994 01:39:22,190 --> 01:39:24,730 Let's go ahead and define a main function up here. 1995 01:39:24,730 --> 01:39:28,060 And let's assume for the moment that this yell function already exists 1996 01:39:28,060 --> 01:39:32,860 and yell something like, This is CS50, properly capitalized, not in all caps. 1997 01:39:32,860 --> 01:39:37,090 Now, let's go ahead and implement this yell function with def yell. 
1998 01:39:37,090 --> 01:39:41,590 It's going to take, for now, a single word or phrase. 1999 01:39:41,590 --> 01:39:42,760 And let's go ahead. 2000 01:39:42,760 --> 01:39:44,530 And I'll call it phrase here. 2001 01:39:44,530 --> 01:39:48,820 And I'm going to go ahead and just print out phrase.upper. 2002 01:39:48,820 --> 01:39:51,910 So phrase.upper is going to force the whole thing to uppercase. 2003 01:39:51,910 --> 01:39:54,760 And as usual, down here if the name of this file equals 2004 01:39:54,760 --> 01:39:59,620 equals quote unquote __main__, then let's go ahead, as always, and call main. 2005 01:39:59,620 --> 01:40:00,642 So let's just run this. 2006 01:40:00,642 --> 01:40:03,100 But for the most part, it should be fairly straightforward. 2007 01:40:03,100 --> 01:40:08,710 When I run Python of yell.py, THIS IS CS50 is yelled on the screen. 2008 01:40:08,710 --> 01:40:09,730 All right, that's nice. 2009 01:40:09,730 --> 01:40:16,120 But it's not great that yell only expects a single phrase. 2010 01:40:16,120 --> 01:40:18,190 Wouldn't it be nice, like print, if I could 2011 01:40:18,190 --> 01:40:22,870 pass in one phrase, or two, or three, or really multiple words more generally 2012 01:40:22,870 --> 01:40:25,130 but as individual words themselves? 2013 01:40:25,130 --> 01:40:29,380 So let me retool this a little bit and change yell to take in not a phrase 2014 01:40:29,380 --> 01:40:32,330 but how about something like a list of words. 2015 01:40:32,330 --> 01:40:35,650 So that ultimately, I can call yell like this. 2016 01:40:35,650 --> 01:40:41,380 Quote unquote, "This" inside of a list, quote unquote, "is" inside of a list, 2017 01:40:41,380 --> 01:40:43,975 and, quote unquote, "CS50" inside of a list. 2018 01:40:43,975 --> 01:40:46,600 I'm not going to bother with type hints or annotations for now.
2019 01:40:46,600 --> 01:40:50,980 But I'll just assume that yell has been defined now as taking a list of words 2020 01:40:50,980 --> 01:40:52,400 as defined here. 2021 01:40:52,400 --> 01:40:54,670 But now I want to force them all to uppercase. 2022 01:40:54,670 --> 01:40:57,160 So I don't quite want to do something as simple as this. 2023 01:40:57,160 --> 01:41:02,890 Like for word in words, I could, for instance, print that given word 2024 01:41:02,890 --> 01:41:05,500 and maybe end the line with nothing right now. 2025 01:41:05,500 --> 01:41:10,090 But I think if I do this, Python of yell.py, no, that's not right. 2026 01:41:10,090 --> 01:41:12,250 I haven't forced anything to uppercase. 2027 01:41:12,250 --> 01:41:13,700 So let's fix this. 2028 01:41:13,700 --> 01:41:15,620 Well, let's go ahead and do the following. 2029 01:41:15,620 --> 01:41:19,940 Let me go ahead and accumulate the uppercase words as follows. 2030 01:41:19,940 --> 01:41:22,120 Let me create a variable called uppercased 2031 01:41:22,120 --> 01:41:25,420 and initialize it to an empty list using square brackets or our more 2032 01:41:25,420 --> 01:41:27,410 verbose list syntax. 2033 01:41:27,410 --> 01:41:31,780 And now, let me go ahead and iterate over each of those words in words. 2034 01:41:31,780 --> 01:41:37,840 And for each of them, let's go into our uppercased list, append to it 2035 01:41:37,840 --> 01:41:41,020 the current word's uppercase version. 2036 01:41:41,020 --> 01:41:44,740 So this is a way of creating a new list called uppercased 2037 01:41:44,740 --> 01:41:48,070 that is just appending, appending, appending to that list 2038 01:41:48,070 --> 01:41:51,520 each of the current words in the loop but uppercased instead. 2039 01:41:51,520 --> 01:41:55,600 And now, just let me go ahead and print out the uppercased list. 2040 01:41:55,600 --> 01:41:56,870 This isn't quite right. 2041 01:41:56,870 --> 01:41:58,180 Let's see what happens here.
2042 01:41:58,180 --> 01:42:01,540 Python of yell.py, OK. 2043 01:42:01,540 --> 01:42:04,090 It's not quite right, because I don't think I want 2044 01:42:04,090 --> 01:42:06,100 those quotes or those square brackets. 2045 01:42:06,100 --> 01:42:07,100 What am I seeing? 2046 01:42:07,100 --> 01:42:09,070 I'm actually printing a list. 2047 01:42:09,070 --> 01:42:13,155 But, but, but, here's where some of our unpacking syntax now can be useful. 2048 01:42:13,155 --> 01:42:15,280 I don't have to change my approach to this problem. 2049 01:42:15,280 --> 01:42:18,850 I can just unpack uppercase by adding a single star. 2050 01:42:18,850 --> 01:42:22,000 And now, let me go ahead and rerun Python of yell.py. 2051 01:42:22,000 --> 01:42:24,370 And now, it's actually just English. 2052 01:42:24,370 --> 01:42:26,980 There's no remnants of Python syntax like the quotes, 2053 01:42:26,980 --> 01:42:29,290 and the commas, and the square brackets. 2054 01:42:29,290 --> 01:42:34,690 I've now unpacked, this is CS50 as three separate arguments to print. 2055 01:42:34,690 --> 01:42:38,510 So already now, this unpacking technique would seem to be useful. 2056 01:42:38,510 --> 01:42:41,260 Well, it's a little unfortunate that I now 2057 01:42:41,260 --> 01:42:45,640 need to call yell though with a list of values in this way. 2058 01:42:45,640 --> 01:42:47,097 This is just not the norm. 2059 01:42:47,097 --> 01:42:49,180 Or it's at least, it's not nearly as user-friendly 2060 01:42:49,180 --> 01:42:51,250 as something like the print function where 2061 01:42:51,250 --> 01:42:54,580 I can pass in zero, or one, or two, or three, or any number of arguments. 2062 01:42:54,580 --> 01:42:58,570 Why are you making me for your yell function pass in only a list? 2063 01:42:58,570 --> 01:42:59,530 Well, we can do better. 2064 01:42:59,530 --> 01:43:01,780 Let's adopt some of the new conventions we've learned. 
2065 01:43:01,780 --> 01:43:05,500 And let's go ahead and get rid of the list by removing the square brackets. 2066 01:43:05,500 --> 01:43:08,380 And let's just pass yell three arguments. 2067 01:43:08,380 --> 01:43:11,830 Now, I don't want to do something like change the definition of words 2068 01:43:11,830 --> 01:43:14,707 to take in word one, word two. 2069 01:43:14,707 --> 01:43:16,540 That's not going to scale and it's not going 2070 01:43:16,540 --> 01:43:18,190 to handle different number of words. 2071 01:43:18,190 --> 01:43:19,840 But we have a technique now. 2072 01:43:19,840 --> 01:43:23,830 We can say star args, which will allow the yell function 2073 01:43:23,830 --> 01:43:25,870 to accept any number of arguments. 2074 01:43:25,870 --> 01:43:28,900 And just for specificity, let's not call it generically args. 2075 01:43:28,900 --> 01:43:33,250 Let's name it something a little more self explanatory like star words. 2076 01:43:33,250 --> 01:43:37,310 This just means I have a variable number of words being passed in. 2077 01:43:37,310 --> 01:43:39,790 Now, I think, I've made a marginal improvement. 2078 01:43:39,790 --> 01:43:42,220 Let me run this again, Python of yell.py. 2079 01:43:42,220 --> 01:43:45,050 This is CS50 is in all caps. 2080 01:43:45,050 --> 01:43:46,600 But it's just a little better. 2081 01:43:46,600 --> 01:43:50,590 Because now I can treat yell just like I've long treated print, 2082 01:43:50,590 --> 01:43:53,980 pass in as many things as you want, and print will deal with it. 2083 01:43:53,980 --> 01:43:56,860 Now, my yell function is just as powerful it would seem. 2084 01:43:56,860 --> 01:44:00,820 And better still, it also forces everything to uppercase. 
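The version of yell.py reached at this point might look roughly like the sketch below. The helper name shout is an assumption added here so the uppercased list can be inspected on its own; in the lecture the loop lives directly inside yell.

```python
def shout(*words):
    """Return the uppercased words as a list (hypothetical helper)."""
    uppercased = []
    for word in words:  # words is a tuple of all positional arguments
        uppercased.append(word.upper())
    return uppercased


def yell(*words):
    # The star in the call unpacks the list into separate arguments to print,
    # so no quotes, commas, or square brackets appear in the output.
    print(*shout(*words))


def main():
    yell("This", "is", "CS50")


if __name__ == "__main__":
    main()
```

Note the two different roles of the star: in the def line it packs arguments into a tuple, and in the call to print it unpacks a sequence into individual arguments.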
2085 01:44:00,820 --> 01:44:03,250 Well, it turns out Python comes with this function called 2086 01:44:03,250 --> 01:44:07,870 map, whose purpose in life is to allow you to map, that is, apply 2087 01:44:07,870 --> 01:44:12,590 some function to every element of some sequence like a list. 2088 01:44:12,590 --> 01:44:17,470 So for instance, if we want to force to uppercase each of the words, 2089 01:44:17,470 --> 01:44:22,780 "This", "is", "CS50", in the list of words that's been passed in, well, 2090 01:44:22,780 --> 01:44:27,740 we essentially want to map the upper function to each of those values. 2091 01:44:27,740 --> 01:44:30,130 So using map in Python, I can do just that. 2092 01:44:30,130 --> 01:44:31,900 Let me go back here to VS Code. 2093 01:44:31,900 --> 01:44:35,800 And let me propose now that I re-implement this as follows. 2094 01:44:35,800 --> 01:44:40,960 I get rid of all three of these lines here, getting rid of that loop 2095 01:44:40,960 --> 01:44:41,980 in particular. 2096 01:44:41,980 --> 01:44:45,100 Let me still declare a variable called uppercased. 2097 01:44:45,100 --> 01:44:50,260 But let me set it equal to the return value of this new function called map. 2098 01:44:50,260 --> 01:44:52,450 Map takes two arguments here. 2099 01:44:52,450 --> 01:44:56,140 In this case, the name of a function that I want to 2100 01:44:56,140 --> 01:44:59,170 map on to a sequence of values. 2101 01:44:59,170 --> 01:45:03,580 Well, what function do I want to apply to every word that's been passed in? 2102 01:45:03,580 --> 01:45:05,470 Well, it turns out, thanks to my knowledge 2103 01:45:05,470 --> 01:45:09,310 now of object-oriented programming, I know that in the str class 2104 01:45:09,310 --> 01:45:11,320 there is a function called upper. 2105 01:45:11,320 --> 01:45:14,440 We've usually called it by using the name of a string 2106 01:45:14,440 --> 01:45:18,400 variable.upper open paren close paren.
2107 01:45:18,400 --> 01:45:21,760 But if you read the documentation for the str class, 2108 01:45:21,760 --> 01:45:25,540 you'll see that the function is described indeed as str.upper. 2109 01:45:25,540 --> 01:45:30,250 I'm not using parentheses, open and close, at the end of str.upper. 2110 01:45:30,250 --> 01:45:31,960 Because I don't want to call it now. 2111 01:45:31,960 --> 01:45:36,050 I want to pass this function to the map function, 2112 01:45:36,050 --> 01:45:40,390 so that map can somehow add those parentheses, so to speak, and call it 2113 01:45:40,390 --> 01:45:43,040 on every one of these words. 2114 01:45:43,040 --> 01:45:45,670 And this is what map does quite powerfully, 2115 01:45:45,670 --> 01:45:48,040 and is an instance, indeed, of functional programming. 2116 01:45:48,040 --> 01:45:51,700 Whereby I'm passing to this map function another function. 2117 01:45:51,700 --> 01:45:52,480 Not calling it. 2118 01:45:52,480 --> 01:45:55,570 I'm just passing it in by a reference of sorts. 2119 01:45:55,570 --> 01:46:00,700 And what map is going to do for me is iterate over each of those words, 2120 01:46:00,700 --> 01:46:04,210 call str.upper on each of those words, and return 2121 01:46:04,210 --> 01:46:10,690 to me a brand new list containing all of those results together in one list. 2122 01:46:10,690 --> 01:46:14,320 It completely obviates the need for me to do this more manually using 2123 01:46:14,320 --> 01:46:15,190 that list. 2124 01:46:15,190 --> 01:46:17,980 I'm still going to print the whole thing using star uppercase. 2125 01:46:17,980 --> 01:46:21,490 So that if I get back a list of three uppercase words, 2126 01:46:21,490 --> 01:46:24,500 I'm going to unpack them and print them all out. 2127 01:46:24,500 --> 01:46:25,750 So let's run this again. 2128 01:46:25,750 --> 01:46:29,740 Python of yell.py Enter. 2129 01:46:29,740 --> 01:46:31,780 And voila, it's still working. 
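A sketch of the map-based version, with one detail worth stating precisely: in Python 3, map returns a lazy iterator rather than an actual list, although unpacking it with a star works just the same. The helper name uppercase_all is an assumption introduced here only to make the result easy to check.

```python
def uppercase_all(*words):
    """Return a list of uppercased words (hypothetical helper)."""
    # str.upper is passed without parentheses: map calls it on each word.
    uppercased = map(str.upper, words)
    return list(uppercased)  # list() consumes the iterator map returned


def yell(*words):
    uppercased = map(str.upper, words)
    print(*uppercased)  # unpacking consumes the iterator as well


def main():
    yell("This", "is", "CS50")


if __name__ == "__main__":
    main()
```

Because the iterator can be consumed only once, code that needs the results more than once would typically wrap the call in list(), as uppercase_all does.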
2130 01:46:31,780 --> 01:46:34,480 But the code now is even more tight-- 2131 01:46:34,480 --> 01:46:36,690 even tighter than before. 2132 01:46:36,690 --> 01:46:38,440 So it turns out there's another way we can 2133 01:46:38,440 --> 01:46:40,690 solve this problem in a way that's even more 2134 01:46:40,690 --> 01:46:42,700 Pythonic, or at least quite common. 2135 01:46:42,700 --> 01:46:46,360 And that's using a feature known as a list comprehension. 2136 01:46:46,360 --> 01:46:48,380 And it's a big phrase, if you will. 2137 01:46:48,380 --> 01:46:50,830 But it refers to the ability in Python for you 2138 01:46:50,830 --> 01:46:54,550 to very easily construct a list on the fly without using a loop, 2139 01:46:54,550 --> 01:46:57,250 without calling append and append, but to do everything 2140 01:46:57,250 --> 01:47:00,370 in one, daresay, elegant one-liner. 2141 01:47:00,370 --> 01:47:06,003 So how can I go about using this notion of a list comprehension? 2142 01:47:06,003 --> 01:47:07,420 Well, let me go ahead and do this. 2143 01:47:07,420 --> 01:47:10,330 In yell.py, in VS Code here, let me go ahead 2144 01:47:10,330 --> 01:47:12,430 and change my approach as follows. 2145 01:47:12,430 --> 01:47:16,070 Instead of using map, which is perfectly fine and correct in this way, 2146 01:47:16,070 --> 01:47:18,220 let me just show you this other way as well. 2147 01:47:18,220 --> 01:47:22,710 A list comprehension is the opportunity to create a list like this, 2148 01:47:22,710 --> 01:47:24,550 using square brackets like this. 2149 01:47:24,550 --> 01:47:28,260 But inside of those square brackets to write a Python expression, 2150 01:47:28,260 --> 01:47:33,000 that in effect is going to dynamically generate a brand new list for you 2151 01:47:33,000 --> 01:47:35,610 using some logic you've written. 2152 01:47:35,610 --> 01:47:38,130 And the approach I might take here is this. 
2153 01:47:38,130 --> 01:47:43,800 If I want to store in this list the uppercase version of every word 2154 01:47:43,800 --> 01:47:47,130 in that words list, I can do this-- 2155 01:47:47,130 --> 01:47:53,010 word.upper for word in words. 2156 01:47:53,010 --> 01:47:54,720 Now, this is a mouthful. 2157 01:47:54,720 --> 01:47:58,170 But I dare say Python programmers love this capability 2158 01:47:58,170 --> 01:48:02,340 of being able to define on the fly a list inside of which 2159 01:48:02,340 --> 01:48:06,300 is any number of values that you would ordinarily, at least as we've done it, 2160 01:48:06,300 --> 01:48:09,630 construct with a loop and again calling append, and append, and append. 2161 01:48:09,630 --> 01:48:12,810 But that usually takes two, three, four or more lines. 2162 01:48:12,810 --> 01:48:16,350 This list comprehension that I've highlighted here 2163 01:48:16,350 --> 01:48:20,190 is now an alternative way to create the exact same thing-- 2164 01:48:20,190 --> 01:48:24,180 a list inside of which are a whole bunch of uppercased words. 2165 01:48:24,180 --> 01:48:24,990 Which words? 2166 01:48:24,990 --> 01:48:30,180 For each word in the words list that was passed into yell 2167 01:48:30,180 --> 01:48:33,360 is what ends up in this list. 2168 01:48:33,360 --> 01:48:36,060 Questions on this syntax here? 2169 01:48:36,060 --> 01:48:38,640 It definitely takes a little bit of getting used to. 2170 01:48:38,640 --> 01:48:42,300 Because you've got this value on the left, this function call here. 2171 01:48:42,300 --> 01:48:45,270 You've got this loop inside of the square brackets. 2172 01:48:45,270 --> 01:48:48,810 But if you become accustomed to reading the code in this way from left 2173 01:48:48,810 --> 01:48:51,510 to right, this means give me the uppercase version 2174 01:48:51,510 --> 01:48:55,080 of the word for each word in my words list. 2175 01:48:55,080 --> 01:48:57,790 Questions here on list comprehensions? 
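The list comprehension described above can be sketched as follows. As before, uppercase_all is a hypothetical helper added for illustration; the comprehension itself is the expression from the lecture.

```python
def uppercase_all(*words):
    """Return a list of uppercased words (hypothetical helper)."""
    # Read left to right: the uppercase version of word, for each word in words.
    return [word.upper() for word in words]


def yell(*words):
    uppercased = [word.upper() for word in words]
    print(*uppercased)


def main():
    yell("This", "is", "CS50")


if __name__ == "__main__":
    main()
```

Unlike map, the comprehension produces a real list immediately, so it can be indexed, measured with len, or iterated over repeatedly.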
2176 01:48:57,790 --> 01:48:58,320 STUDENT: Hi. 2177 01:48:58,320 --> 01:49:05,237 Can you do conditionals also, like if else, or combine if, elif, else? 2178 01:49:05,237 --> 01:49:06,570 DAVID J. MALAN: Indeed, you can. 2179 01:49:06,570 --> 01:49:08,760 And let me come back to that, where we'll see an opportunity 2180 01:49:08,760 --> 01:49:09,885 to do things conditionally. 2181 01:49:09,885 --> 01:49:13,350 But for now, I'm just uppercasing every word in the list. 2182 01:49:13,350 --> 01:49:14,110 Good question. 2183 01:49:14,110 --> 01:49:16,277 Other questions? 2184 01:49:16,277 --> 01:49:16,860 STUDENT: Yeah. 2185 01:49:16,860 --> 01:49:18,630 Is this functional programming? 2186 01:49:18,630 --> 01:49:24,257 Or I mean, this particular thing, we are using words.upper for a word in words? 2187 01:49:24,257 --> 01:49:25,590 DAVID J. MALAN: Not necessarily. 2188 01:49:25,590 --> 01:49:28,060 This is more of a feature of Python, I would say. 2189 01:49:28,060 --> 01:49:28,560 STUDENT: OK. 2190 01:49:28,560 --> 01:49:29,435 DAVID J. MALAN: Yeah. 2191 01:49:29,435 --> 01:49:33,690 Map was one very specific incarnation thereof. Our use of lambda 2192 01:49:33,690 --> 01:49:36,690 and passing it in as the key argument to the sort function, or rather the 2193 01:49:36,690 --> 01:49:38,730 sorted function, a while back was an example. 2194 01:49:38,730 --> 01:49:41,310 And we're about to see one other. 2195 01:49:41,310 --> 01:49:43,680 So we can even use these list comprehensions 2196 01:49:43,680 --> 01:49:46,840 to filter values in or out of our resulting list. 2197 01:49:46,840 --> 01:49:50,250 So in fact, in VS Code here, let me close yell.py and close my terminal 2198 01:49:50,250 --> 01:49:50,760 window.
2199 01:49:50,760 --> 01:49:53,490 And let me create a new program here whose purpose in life 2200 01:49:53,490 --> 01:49:58,110 maybe is to take a same list of students as before with a shorter version 2201 01:49:58,110 --> 01:50:02,370 thereof, and just filter out all of the students in Gryffindor. 2202 01:50:02,370 --> 01:50:06,090 So let me go ahead and create a file called Gryffindors.py. 2203 01:50:06,090 --> 01:50:10,170 I'm going to go ahead and copy paste from before really my list of students, 2204 01:50:10,170 --> 01:50:13,230 at least Hermione, Harry, Ron, and Draco from the start 2205 01:50:13,230 --> 01:50:17,430 here, just so that I can focus on one student who 2206 01:50:17,430 --> 01:50:19,197 happens not to be from Slytherin. 2207 01:50:19,197 --> 01:50:21,030 And what I'm going to do here now, if I want 2208 01:50:21,030 --> 01:50:24,970 to filter out only the Gryffindor students, let me go ahead and do this. 2209 01:50:24,970 --> 01:50:27,720 Let me create another variable called Gryffindors, which 2210 01:50:27,720 --> 01:50:30,480 is going to equal the following list. 2211 01:50:30,480 --> 01:50:32,680 And this is going to be a bit of a longer line. 2212 01:50:32,680 --> 01:50:34,830 So I'm going to proactively move my square brackets 2213 01:50:34,830 --> 01:50:36,480 onto two separate lines. 2214 01:50:36,480 --> 01:50:39,240 And I'm going to create now a list comprehension. 2215 01:50:39,240 --> 01:50:40,560 I want to do this. 2216 01:50:40,560 --> 01:50:45,450 I want this new list called Gryffindors to contain every student's name 2217 01:50:45,450 --> 01:50:49,560 for each student in the student's list. 2218 01:50:49,560 --> 01:50:57,150 But, but, but, if the student's house equals equals quote unquote Gryffindor. 
2219 01:50:57,150 --> 01:51:01,290 So this is nearly identical in spirit to what I just did earlier 2220 01:51:01,290 --> 01:51:04,260 to create a list comprehension out of each of the words passed 2221 01:51:04,260 --> 01:51:05,290 to my yell function. 2222 01:51:05,290 --> 01:51:07,620 But here, I'm doing so conditionally. 2223 01:51:07,620 --> 01:51:10,350 And so, I'm borrowing inspiration from our focus on loops, 2224 01:51:10,350 --> 01:51:14,670 borrowing some inspiration from our focus on conditionals, 2225 01:51:14,670 --> 01:51:18,310 combining that into this same square bracket notation. 2226 01:51:18,310 --> 01:51:23,340 So that what Gryffindors ultimately is, is zero or more students' names. 2227 01:51:23,340 --> 01:51:26,340 And the names that are included are the result 2228 01:51:26,340 --> 01:51:32,100 of iterating over each of those students and only including in the final result 2229 01:51:32,100 --> 01:51:35,550 the students whose house happens to be Gryffindor. 2230 01:51:35,550 --> 01:51:38,730 So when I go ahead and run this with Python of Gryffindors.py 2231 01:51:38,730 --> 01:51:42,150 and hit Enter, you'll see, huh, nothing actually happened here. 2232 01:51:42,150 --> 01:51:44,320 Well, that's because I didn't finish the program. 2233 01:51:44,320 --> 01:51:46,737 Let me go ahead and actually finish the program with this. 2234 01:51:46,737 --> 01:51:50,760 How about for each Gryffindor in Gryffindors plural-- 2235 01:51:50,760 --> 01:51:54,480 and better yet, so that it's sensible that I did all of this work in advance, 2236 01:51:54,480 --> 01:51:56,490 let me go ahead and sort all of those names 2237 01:51:56,490 --> 01:51:58,800 with our familiar sorted function. 2238 01:51:58,800 --> 01:52:02,970 Let's go ahead now and print out each of these Gryffindors. 
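The conditional list comprehension described above can be sketched as follows. The shape of the students list, a list of dictionaries with "name" and "house" keys, is assumed from the keys mentioned in the lecture; the lowercase variable names are likewise an assumption about how the spoken names are spelled in code.

```python
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},
]

# Keep each student's name, but only if that student's house is Gryffindor.
gryffindors = [
    student["name"] for student in students if student["house"] == "Gryffindor"
]

# sorted returns a new, alphabetized list without modifying gryffindors.
for gryffindor in sorted(gryffindors):
    print(gryffindor)
```

The trailing if acts as a filter: a student for whom the condition is false contributes nothing to the resulting list.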
2239 01:52:02,970 --> 01:52:06,107 So now, notice, if familiar with the books and the movies, 2240 01:52:06,107 --> 01:52:08,190 you'll know that only three of these four students 2241 01:52:08,190 --> 01:52:09,360 are actually in Gryffindor. 2242 01:52:09,360 --> 01:52:14,070 And if I run Python of Gryffindors.py, there we see Harry, Hermione, and Ron, 2243 01:52:14,070 --> 01:52:17,080 but now in sorted order as well. 2244 01:52:17,080 --> 01:52:20,760 So that's just one way we can solve this same problem using not just a list 2245 01:52:20,760 --> 01:52:24,810 comprehension but a list comprehension that has this conditional therein. 2246 01:52:24,810 --> 01:52:27,690 But there's yet other ways to solve this same problem too. 2247 01:52:27,690 --> 01:52:30,570 And we come back to some functional features of Python. 2248 01:52:30,570 --> 01:52:32,640 In addition to functions like map, there's 2249 01:52:32,640 --> 01:52:36,100 also this one called filter that can be used to achieve the same effect, 2250 01:52:36,100 --> 01:52:38,760 but with a more functional approach, if you will. 2251 01:52:38,760 --> 01:52:40,530 Let me go back to VS Code here. 2252 01:52:40,530 --> 01:52:43,440 And with the same example, let me do this. 2253 01:52:43,440 --> 01:52:47,550 Let me leave the original list above as before, including Draco, 2254 01:52:47,550 --> 01:52:49,530 who's not in fact from Gryffindor. 2255 01:52:49,530 --> 01:52:51,960 And let me temporarily define a function called 2256 01:52:51,960 --> 01:52:57,420 is_gryffindor that takes in as a value something like a student s. 2257 01:52:57,420 --> 01:52:59,200 And then, let's do this. 2258 01:52:59,200 --> 01:53:04,920 Let's go ahead and say if s quote unquote house equals equals Gryffindor, 2259 01:53:04,920 --> 01:53:08,530 then go ahead and return True. 2260 01:53:08,530 --> 01:53:12,525 Otherwise, go ahead and return False.
2261 01:53:12,525 --> 01:53:14,400 Now, we've seen before conditionals like this 2262 01:53:14,400 --> 01:53:16,530 that are a bit unnecessarily verbose. 2263 01:53:16,530 --> 01:53:18,930 I don't need to have a conditional if I'm already 2264 01:53:18,930 --> 01:53:21,400 asking a Boolean question up here. 2265 01:53:21,400 --> 01:53:24,000 So I can actually tighten this up as we've done in the past 2266 01:53:24,000 --> 01:53:28,170 and just return does the student's house equal equal Gryffindor? 2267 01:53:28,170 --> 01:53:30,930 Either it does and it's true, or it doesn't and it's false. 2268 01:53:30,930 --> 01:53:33,150 I don't need to explicitly return True or False. 2269 01:53:33,150 --> 01:53:36,570 I can just return the value of that Boolean. 2270 01:53:36,570 --> 01:53:38,350 Let's go ahead now and do this. 2271 01:53:38,350 --> 01:53:40,140 I'm going to create, as before, a variable 2272 01:53:40,140 --> 01:53:43,620 called Gryffindors, a list for all of my Gryffindor students 2273 01:53:43,620 --> 01:53:47,430 that equals, this time, the result of calling filter. 2274 01:53:47,430 --> 01:53:50,050 Filter takes at least two arguments here, 2275 01:53:50,050 --> 01:53:54,830 one of which is the name of a function to call, is_gryffindor. 2276 01:53:54,830 --> 01:53:59,090 And I'm going to apply that function to each of the elements of this sequence 2277 01:53:59,090 --> 01:53:59,820 here. 2278 01:53:59,820 --> 01:54:03,920 So similar in spirit to map, I'm passing in a function 2279 01:54:03,920 --> 01:54:07,370 that's going to be applied to each of the elements in the sequence. 2280 01:54:07,370 --> 01:54:11,070 But map returns one value for each element in the sequence. 2281 01:54:11,070 --> 01:54:13,520 That's how we forced all of the words to uppercase. 2282 01:54:13,520 --> 01:54:19,520 But if I want to conditionally include a student in my resulting Gryffindors 2283 01:54:19,520 --> 01:54:21,920 list, I can use filter instead.
2284 01:54:21,920 --> 01:54:25,970 Filter expects its first argument to be not something like str.upper, 2285 01:54:25,970 --> 01:54:28,340 but a function that returns True or False. 2286 01:54:28,340 --> 01:54:33,800 Tell me whether or not I should include the current student 2287 01:54:33,800 --> 01:54:35,150 in the final list. 2288 01:54:35,150 --> 01:54:37,970 And the question being asked is, do they live in Gryffindor? 2289 01:54:37,970 --> 01:54:41,550 We're checking the dictionary's house key for that answer. 2290 01:54:41,550 --> 01:54:45,410 And so, ultimately, I think we'll be left with something quite similar. 2291 01:54:45,410 --> 01:54:52,040 For Gryffindor in the sorted version-- let's do for Gryffindor in Gryffindors, 2292 01:54:52,040 --> 01:54:56,150 let's go ahead then and print out the current student's, Gryffindor's, name. 2293 01:54:56,150 --> 01:54:58,010 It's not going to be sorted just yet. 2294 01:54:58,010 --> 01:55:01,910 But when I run this version here Python of Gryffindors.py and hit Enter, 2295 01:55:01,910 --> 01:55:03,120 we're back in business. 2296 01:55:03,120 --> 01:55:03,800 It's unsorted. 2297 01:55:03,800 --> 01:55:07,100 But we have Hermione, Harry, and Ron, but not Draco. 2298 01:55:07,100 --> 01:55:09,380 And if you recall from a few weeks back, if we 2299 01:55:09,380 --> 01:55:13,190 want to sort even a list of dictionaries, we can still do that too. 2300 01:55:13,190 --> 01:55:16,790 I can call sorted on Gryffindors plural. 2301 01:55:16,790 --> 01:55:18,500 And I can pass in a key. 2302 01:55:18,500 --> 01:55:21,710 And that key can be an anonymous function, a.k.a. 2303 01:55:21,710 --> 01:55:24,350 a lambda function, that takes in a student as input, 2304 01:55:24,350 --> 01:55:29,750 call it s, and then returns the value s quote unquote name, if my goal is 2305 01:55:29,750 --> 01:55:33,410 to sort by, indeed, students' own names.
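The filter-based version, together with the lambda passed as the sort key, can be sketched as follows. The dictionary layout and the intermediate variable named ordered are assumptions made for this illustration; as with map, filter in Python 3 returns a lazy iterator rather than a list.

```python
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},
]


def is_gryffindor(s):
    # The comparison already evaluates to True or False, so return it directly.
    return s["house"] == "Gryffindor"


# Pass the function by name, without parentheses, so filter can call it.
gryffindors = filter(is_gryffindor, students)

# Dictionaries cannot be compared with <, so the key tells sorted to order
# the students by the value stored under "name".
ordered = sorted(gryffindors, key=lambda s: s["name"])

for gryffindor in ordered:
    print(gryffindor["name"])
```

Here filter keeps whole dictionaries rather than just names, which is why the loop indexes into each one with "name" when printing.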
2306 01:55:33,410 --> 01:55:36,080 If I go ahead now and run Python of Gryffindors.py, 2307 01:55:36,080 --> 01:55:37,850 I see the same list of students. 2308 01:55:37,850 --> 01:55:39,860 But this time, it's sorted. 2309 01:55:39,860 --> 01:55:42,890 So here we've seen two approaches to this particular problem 2310 01:55:42,890 --> 01:55:43,850 of Gryffindor students. 2311 01:55:43,850 --> 01:55:47,090 Whereby we can either use something like a list comprehension, 2312 01:55:47,090 --> 01:55:50,360 and inside of that list comprehension do a bit of filtration, 2313 01:55:50,360 --> 01:55:53,480 including an if conditional as I did. 2314 01:55:53,480 --> 01:55:55,940 Or we can take a more functional approach 2315 01:55:55,940 --> 01:56:00,410 by just using this filter function, passing into it the function 2316 01:56:00,410 --> 01:56:02,700 that I want to make these decisions for me, 2317 01:56:02,700 --> 01:56:07,160 and then include only those for whom true is returned. 2318 01:56:07,160 --> 01:56:10,297 Any questions on either of these two approaches? 2319 01:56:10,297 --> 01:56:10,880 STUDENT: Yeah. 2320 01:56:10,880 --> 01:56:14,090 I just had a question, that if we write a code 2321 01:56:14,090 --> 01:56:17,090 like in the previous version, where everything is stuffed into one line, 2322 01:56:17,090 --> 01:56:18,050 won't the-- 2323 01:56:18,050 --> 01:56:24,770 if we check for the style of the code, then won't it have a problem with it 2324 01:56:24,770 --> 01:56:26,210 because it's less readable? 2325 01:56:26,210 --> 01:56:28,400 DAVID J. MALAN: So would a formatter like black 2326 01:56:28,400 --> 01:56:30,680 have a problem with the style of some of this code? 2327 01:56:30,680 --> 01:56:33,800 STUDENT: The previous one, where the everything was tucked into one line. 2328 01:56:33,800 --> 01:56:34,190 [INTERPOSING VOICES] 2329 01:56:34,190 --> 01:56:35,690 DAVID J. MALAN: Oh, a good question. 
2330 01:56:35,690 --> 01:56:38,610 Would something like Black have a problem with this code? 2331 01:56:38,610 --> 01:56:40,460 Well, let me rewind to that version, which 2332 01:56:40,460 --> 01:56:45,170 was using the somewhat longer list comprehension, which 2333 01:56:45,170 --> 01:56:50,160 looked like, if we go far enough back, give me a few more undos, 2334 01:56:50,160 --> 01:56:52,350 which looked like this ultimately. 2335 01:56:52,350 --> 01:56:55,500 Let me go ahead and run Black on Gryffindors.py. 2336 01:56:55,500 --> 01:56:59,170 And you'll see that I actually-- it reformatted ever so slightly. 2337 01:56:59,170 --> 01:57:01,230 But I proactively fixed this myself. 2338 01:57:01,230 --> 01:57:03,720 Had I done this and done it on just one line, 2339 01:57:03,720 --> 01:57:07,140 but I knew that Black might not like that, it would have fixed it for me. 2340 01:57:07,140 --> 01:57:10,740 So I just proactively fixed it before writing the code myself. 2341 01:57:10,740 --> 01:57:14,730 How about time for one other question on Gryffindors.py 2342 01:57:14,730 --> 01:57:17,640 and this approach of using a list comprehension or filter? 2343 01:57:17,640 --> 01:57:19,440 STUDENT: Yeah. 2344 01:57:19,440 --> 01:57:24,360 When using filter, instead of calling the function is_gryffindor, 2345 01:57:24,360 --> 01:57:29,328 can you use it right there inside filter? 2346 01:57:29,328 --> 01:57:31,620 DAVID J. MALAN: Can you use the function is_gryffindor? 2347 01:57:31,620 --> 01:57:33,685 So you don't want to call it like this. 2348 01:57:33,685 --> 01:57:35,310 Because you don't want to call it then. 2349 01:57:35,310 --> 01:57:39,180 You want filter to call the function for you, if that's what you mean. 2350 01:57:39,180 --> 01:57:43,740 So I pass it in only by its name instead. 2351 01:57:43,740 --> 01:57:50,040 STUDENT: No, I mean, if you can write the return as house equals 2352 01:57:50,040 --> 01:57:51,900 equals Gryffindor inside [INAUDIBLE].
2353 01:57:51,900 --> 01:57:53,520 DAVID J. MALAN: Yes, indeed. 2354 01:57:53,520 --> 01:57:57,330 In fact, so recall that we indeed used these lambda functions way back 2355 01:57:57,330 --> 01:58:01,020 when we wanted to pass in a quick and dirty function anonymously 2356 01:58:01,020 --> 01:58:05,040 to allow sorted to sort by a different key of a dictionary. 2357 01:58:05,040 --> 01:58:06,490 We can do that here. 2358 01:58:06,490 --> 01:58:10,140 I can actually take the essence of this is_gryffindor function. 2359 01:58:10,140 --> 01:58:13,050 I can change the name of this function in my filter call 2360 01:58:13,050 --> 01:58:17,190 to be another lambda function, passing in an argument like s, 2361 01:58:17,190 --> 01:58:19,560 and returning exactly that. 2362 01:58:19,560 --> 01:58:23,040 I can now delete my is_gryffindor function altogether. 2363 01:58:23,040 --> 01:58:26,400 And now, when I run Python of Gryffindors.py, 2364 01:58:26,400 --> 01:58:28,030 I still get the same answer. 2365 01:58:28,030 --> 01:58:31,440 And I've not bothered defining a function only to then use it 2366 01:58:31,440 --> 01:58:34,560 in one and only one place. 2367 01:58:34,560 --> 01:58:36,840 Well, let me propose too that we equip you 2368 01:58:36,840 --> 01:58:40,290 with one other tool for your toolkit, namely dictionary 2369 01:58:40,290 --> 01:58:41,880 comprehensions as well. 2370 01:58:41,880 --> 01:58:45,210 And admittedly, the syntax is starting to get even weirder. 2371 01:58:45,210 --> 01:58:48,330 But as you get more comfortable with all of these primitives and others, 2372 01:58:48,330 --> 01:58:51,900 these are just tools that you can optionally but perhaps powerfully use 2373 01:58:51,900 --> 01:58:53,980 to solve future problems down the road.
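The lambda variant answered in the exchange above might look something like this. It is a sketch under the same assumptions as before: the names are guesses, and Draco's house is assumed.

```python
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},  # assumed house
]

# The one-line test is written inline, so no separate named function is needed
gryffindors = filter(lambda s: s["house"] == "Gryffindor", students)

for gryffindor in sorted(gryffindors, key=lambda s: s["name"]):
    print(gryffindor["name"])
```

A named function is arguably still the better choice when the test is reused elsewhere or is more than a single expression.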
2374 01:58:53,980 --> 01:58:58,230 And with a dictionary comprehension, we have the ability to create on the fly 2375 01:58:58,230 --> 01:59:01,200 a dictionary with keys and some values without having 2376 01:59:01,200 --> 01:59:05,580 to do it old-school by creating an empty dictionary, and creating a for loop, 2377 01:59:05,580 --> 01:59:08,730 and iterating over that loop, and inserting more and more keys 2378 01:59:08,730 --> 01:59:10,170 and values into the dictionary. 2379 01:59:10,170 --> 01:59:12,400 We can rather do it all at once. 2380 01:59:12,400 --> 01:59:14,640 So in fact, let me go back to VS Code here. 2381 01:59:14,640 --> 01:59:17,770 And let me propose now that I do this. 2382 01:59:17,770 --> 01:59:23,070 Let me go ahead and initially do it the old-fashioned way here as follows. 2383 01:59:23,070 --> 01:59:25,320 Let me go ahead and simplify and get rid of the houses 2384 01:59:25,320 --> 01:59:30,060 altogether so that we can focus for now just on a list of students' names. 2385 01:59:30,060 --> 01:59:32,620 I'm going to go ahead and run students. 2386 01:59:32,620 --> 01:59:37,200 I'm going to go ahead and write students equals quote unquote Hermione, quote 2387 01:59:37,200 --> 01:59:39,600 unquote Harry, and we'll keep it even shorter this time, 2388 01:59:39,600 --> 01:59:42,480 quote unquote Ron, only those three students in Gryffindor. 2389 01:59:42,480 --> 01:59:44,790 I'm going to now proactively as we've done in the past 2390 01:59:44,790 --> 01:59:47,910 give myself an empty list, so that I have something to accumulate 2391 01:59:47,910 --> 01:59:49,905 some answers to this problem in. 2392 01:59:49,905 --> 01:59:51,780 And now, I'm going to do something like this. 2393 01:59:51,780 --> 01:59:55,450 For student in students, so I can iterate over each of them, 2394 01:59:55,450 --> 01:59:57,960 let's go ahead and with the Gryffindors list 2395 01:59:57,960 --> 02:00:01,950 append to it the name of the student.
2396 02:00:01,950 --> 02:00:04,260 So quote unquote name and then student, which 2397 02:00:04,260 --> 02:00:05,820 is indeed their name from that list. 2398 02:00:05,820 --> 02:00:09,003 And now, let's go ahead and just put these students all in Gryffindor. 2399 02:00:09,003 --> 02:00:10,920 I know these three students are in Gryffindor. 2400 02:00:10,920 --> 02:00:13,260 So suppose that the problem at hand is that I 2401 02:00:13,260 --> 02:00:18,190 want to build up a list of dictionaries that only contains the Gryffindor 2402 02:00:18,190 --> 02:00:18,690 students. 2403 02:00:18,690 --> 02:00:20,970 So it's sort of a step back from the previous version 2404 02:00:20,970 --> 02:00:23,430 where I already had the names and the houses. 2405 02:00:23,430 --> 02:00:26,640 For now, just assume that the problem is I have all of their names, 2406 02:00:26,640 --> 02:00:29,550 but I don't yet have the student dictionaries themselves. 2407 02:00:29,550 --> 02:00:34,380 So I'm rebuilding that same structure that I previously took for granted. 2408 02:00:34,380 --> 02:00:37,200 Now, let's go ahead and just for the sake of discussion just 2409 02:00:37,200 --> 02:00:40,300 print out these Gryffindors so we can see what we've built. 2410 02:00:40,300 --> 02:00:43,080 If I run Python of Gryffindors.py in my prompt, 2411 02:00:43,080 --> 02:00:45,270 I see a bit of a cryptic syntax. 2412 02:00:45,270 --> 02:00:47,640 But again, look for our little hints. 2413 02:00:47,640 --> 02:00:51,120 I've got a square bracket at the end and a square bracket at the beginning. 2414 02:00:51,120 --> 02:00:53,760 And that indicates, as always, this is a list. 2415 02:00:53,760 --> 02:00:56,520 I then have a whole bunch of curly braces 2416 02:00:56,520 --> 02:00:58,740 with a whole bunch of quoted keys. 2417 02:00:58,740 --> 02:01:00,630 They happen to be single quotes by convention 2418 02:01:00,630 --> 02:01:02,910 when using print on a dictionary. 
2419 02:01:02,910 --> 02:01:06,000 But that's just a visual indicator that that is my key. 2420 02:01:06,000 --> 02:01:08,190 And the first value thereof is Hermione. 2421 02:01:08,190 --> 02:01:09,660 Second key is a house. 2422 02:01:09,660 --> 02:01:11,460 This value thereof is Gryffindor. 2423 02:01:11,460 --> 02:01:15,420 Then there's a comma, which separates one dict object from the next. 2424 02:01:15,420 --> 02:01:17,370 And if we look past Harry and Gryffindor, 2425 02:01:17,370 --> 02:01:21,450 there's a second comma which separates Harry and Gryffindor from Ron 2426 02:01:21,450 --> 02:01:22,540 and Gryffindor as well. 2427 02:01:22,540 --> 02:01:26,280 So in short, here is some code whereby I fairly manually 2428 02:01:26,280 --> 02:01:30,330 built up with a for loop in an otherwise initially empty list 2429 02:01:30,330 --> 02:01:35,910 the same data structure as before minus Draco just for Gryffindor students. 2430 02:01:35,910 --> 02:01:39,060 But here's where, again, with dictionary comprehensions 2431 02:01:39,060 --> 02:01:43,290 or really list comprehensions first, can we do this a little more succinctly? 2432 02:01:43,290 --> 02:01:45,270 Let me clear my terminal window. 2433 02:01:45,270 --> 02:01:49,720 Let's get rid of this initially empty list and this for loop 2434 02:01:49,720 --> 02:01:51,460 that appends, appends, appends to it. 2435 02:01:51,460 --> 02:01:52,960 And let's just do this. 2436 02:01:52,960 --> 02:01:58,120 A Gryffindors variable will equal the following list comprehension. 2437 02:01:58,120 --> 02:02:01,660 Inside of that list, I want a dictionary structured 2438 02:02:01,660 --> 02:02:04,180 with someone's name and their name. 2439 02:02:04,180 --> 02:02:07,420 Someone's house and only for now Gryffindor. 2440 02:02:07,420 --> 02:02:08,380 And that's it. 2441 02:02:08,380 --> 02:02:13,870 But I want one of these dict objects here in these curly braces 2442 02:02:13,870 --> 02:02:18,110 for each student in students. 
2443 02:02:18,110 --> 02:02:23,230 So here too, inside of my list comprehension with my square brackets, 2444 02:02:23,230 --> 02:02:25,960 I want an object as indicate-- 2445 02:02:25,960 --> 02:02:29,290 I want a dictionary as indicated by the curly braces. 2446 02:02:29,290 --> 02:02:32,440 I want each of those dictionaries to have two keys-- 2447 02:02:32,440 --> 02:02:35,560 name and house respectively, the values thereof 2448 02:02:35,560 --> 02:02:40,060 are the student's name from earlier here and Gryffindor only. 2449 02:02:40,060 --> 02:02:43,300 Which students do I want to create those dict objects from? 2450 02:02:43,300 --> 02:02:45,230 Well, for student in students. 2451 02:02:45,230 --> 02:02:48,010 So again, on the left I have what I want in the final list. 2452 02:02:48,010 --> 02:02:50,740 And on the right, I have a loop, and this time, no conditional. 2453 02:02:50,740 --> 02:02:54,140 I want all of these students in Gryffindor as their house. 2454 02:02:54,140 --> 02:02:58,330 Now, let's print this again, Python of Gryffindors.py and hit Enter. 2455 02:02:58,330 --> 02:03:00,950 And now, we have the exact same output. 2456 02:03:00,950 --> 02:03:03,520 So instead of three lines it's just one. 2457 02:03:03,520 --> 02:03:06,520 It's a little more cryptic to read at first glance. 2458 02:03:06,520 --> 02:03:09,520 But once familiar with list comprehensions and this sort of syntax, 2459 02:03:09,520 --> 02:03:12,560 it's just another way of solving that same problem. 2460 02:03:12,560 --> 02:03:14,560 What if I want to change this and simplify? 2461 02:03:14,560 --> 02:03:18,100 What if I don't want a list of dictionaries, which I now have. 2462 02:03:18,100 --> 02:03:22,480 Again, per the square brackets I have a list of three dict objects here. 
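Side by side, the loop version and the list comprehension just described could be sketched like this. The name gryffindors_loop is introduced here only so that both versions can coexist in one file; the lecture reuses a single variable.

```python
students = ["Hermione", "Harry", "Ron"]

# Old-fashioned way: start with an empty list and append one dict per student
gryffindors_loop = []
for student in students:
    gryffindors_loop.append({"name": student, "house": "Gryffindor"})

# List comprehension: what goes in the list on the left, the loop on the right
gryffindors = [{"name": student, "house": "Gryffindor"} for student in students]

print(gryffindors)
```

Both produce the same list of three dictionaries, one per student.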
2463 02:03:22,480 --> 02:03:25,660 What if I just want one bigger dictionary 2464 02:03:25,660 --> 02:03:31,330 inside of which is a key like Hermione colon Gryffindor, Harry colon 2465 02:03:31,330 --> 02:03:33,670 Gryffindor, Ron colon Gryffindor. 2466 02:03:33,670 --> 02:03:34,600 I don't need a list. 2467 02:03:34,600 --> 02:03:36,430 I don't need separate objects per student. 2468 02:03:36,430 --> 02:03:40,060 I just want instead one big dictionary where 2469 02:03:40,060 --> 02:03:42,850 the keys are the students' names and the values are their houses. 2470 02:03:42,850 --> 02:03:45,010 And I'm assuming for now no one's going to have 2471 02:03:45,010 --> 02:03:47,590 the same first name in this world. 2472 02:03:47,590 --> 02:03:49,280 Well, I can do this. 2473 02:03:49,280 --> 02:03:53,680 Let me get rid of this here and not create a list comprehension, but again, 2474 02:03:53,680 --> 02:03:56,230 this thing known as a dictionary comprehension. 2475 02:03:56,230 --> 02:03:58,450 And the visual indicator or difference here 2476 02:03:58,450 --> 02:04:02,650 is that instead of being square brackets on the very outside, this time 2477 02:04:02,650 --> 02:04:04,820 it's going to be curly braces instead. 2478 02:04:04,820 --> 02:04:08,770 So inside of these curly braces, what do I want every key to be? 2479 02:04:08,770 --> 02:04:11,170 I want every key to be the student's name. 2480 02:04:11,170 --> 02:04:14,170 I want every value for now to be Gryffindor. 2481 02:04:14,170 --> 02:04:17,980 And I want to do this for each student in students. 2482 02:04:17,980 --> 02:04:20,470 And now, things are getting really interesting. 2483 02:04:20,470 --> 02:04:23,890 And this is another manifestation of Python in some views being 2484 02:04:23,890 --> 02:04:25,750 very readable from left to right. 2485 02:04:25,750 --> 02:04:28,240 Absolutely takes practice and comfort.
2486 02:04:28,240 --> 02:04:31,180 But this is creating a variable called Gryffindors 2487 02:04:31,180 --> 02:04:34,660 which is going to be a dictionary per these curly braces. 2488 02:04:34,660 --> 02:04:37,570 Every key is going to be the name of some student. 2489 02:04:37,570 --> 02:04:39,400 Every value is going to be Gryffindor. 2490 02:04:39,400 --> 02:04:40,930 What names of what students? 2491 02:04:40,930 --> 02:04:43,930 Well, this dictionary comprehension will be 2492 02:04:43,930 --> 02:04:48,740 constructed from the list of students one at a time. 2493 02:04:48,740 --> 02:04:51,490 So when I print this now, the syntax will look a little different 2494 02:04:51,490 --> 02:04:53,770 because it's not a list of dictionary objects. 2495 02:04:53,770 --> 02:04:56,630 It's just one bigger dictionary object itself. 2496 02:04:56,630 --> 02:05:01,030 But now printing Gryffindors gives me Hermione colon Gryffindor, Harry colon 2497 02:05:01,030 --> 02:05:05,560 Gryffindor, and Ron colon Gryffindor as well. 2498 02:05:05,560 --> 02:05:10,755 Any questions now on what we've called dictionary comprehensions as well? 2499 02:05:10,755 --> 02:05:14,690 2500 02:05:14,690 --> 02:05:15,955 Any questions on here? 2501 02:05:15,955 --> 02:05:18,670 2502 02:05:18,670 --> 02:05:19,170 No? 2503 02:05:19,170 --> 02:05:23,070 Well, let's introduce one other function from Python's toolkit followed 2504 02:05:23,070 --> 02:05:25,770 by one final feature and flourish. 2505 02:05:25,770 --> 02:05:27,630 And then you're off on your way. 2506 02:05:27,630 --> 02:05:30,720 Well, let's go ahead and think back to this. 2507 02:05:30,720 --> 02:05:34,860 Recall some time ago that we had just a simple list of students 2508 02:05:34,860 --> 02:05:37,290 as we have here, Hermione, Harry, and Ron. 2509 02:05:37,290 --> 02:05:40,770 And for instance, way back when, we wanted to print out, 2510 02:05:40,770 --> 02:05:44,160 for instance, their ranking from one, to two, to three.
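The dictionary comprehension described above can be sketched as follows, assuming the same three-name students list.

```python
students = ["Hermione", "Harry", "Ron"]

# Curly braces on the outside plus key: value make this a dict comprehension
gryffindors = {student: "Gryffindor" for student in students}

print(gryffindors)
```

Because dictionary keys are unique, a repeated name in students would simply overwrite the earlier entry, which is why the lecture assumes that no two students share a first name.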
2511 02:05:44,160 --> 02:05:48,000 Unfortunately, when you do something like this for student in students, 2512 02:05:48,000 --> 02:05:50,700 you can print out the student's name quite easily. 2513 02:05:50,700 --> 02:05:53,220 Of course, if I do Python of Gryffindors.py, 2514 02:05:53,220 --> 02:05:56,010 I get Hermione, Harry, Ron in that same order. 2515 02:05:56,010 --> 02:05:58,020 But I don't see any numerical rank. 2516 02:05:58,020 --> 02:06:00,300 I see no number one, two, or three. 2517 02:06:00,300 --> 02:06:04,980 So I could maybe do this with maybe a different type of for loop. 2518 02:06:04,980 --> 02:06:06,870 Instead of this, why don't I try this? 2519 02:06:06,870 --> 02:06:13,770 So maybe I could do for i in the range of the length of the students list-- 2520 02:06:13,770 --> 02:06:15,670 and we've done something like this before. 2521 02:06:15,670 --> 02:06:19,320 And then I could print out i, and I could print out the student's name 2522 02:06:19,320 --> 02:06:22,528 by indexing into that list at location i. 2523 02:06:22,528 --> 02:06:23,820 Well, what does this look like? 2524 02:06:23,820 --> 02:06:27,180 If I run Python of Gryffindors.py it's close. 2525 02:06:27,180 --> 02:06:29,130 But these aren't programmers. 2526 02:06:29,130 --> 02:06:31,950 They don't necessarily think of themselves as zero-indexed. 2527 02:06:31,950 --> 02:06:34,320 Hermione probably wants to be first, not zero. 2528 02:06:34,320 --> 02:06:35,290 So how can we fix this? 2529 02:06:35,290 --> 02:06:35,790 Well, 2530 02:06:35,790 --> 02:06:37,260 just a little bit of arithmetic. 2531 02:06:37,260 --> 02:06:40,897 I could print out i plus one, of course, and then the student's name. 2532 02:06:40,897 --> 02:06:43,980 So if I clear my terminal window and run Python of Gryffindors.py 2533 02:06:43,980 --> 02:06:47,760 once more, now we have this enumeration, one, two, 2534 02:06:47,760 --> 02:06:49,470 three of each of these students.
2535 02:06:49,470 --> 02:06:51,540 But it turns out that Python actually has 2536 02:06:51,540 --> 02:06:53,610 had all this time another built-in function 2537 02:06:53,610 --> 02:06:55,230 that you might now find useful. 2538 02:06:55,230 --> 02:06:57,210 That is namely enumerate. 2539 02:06:57,210 --> 02:07:00,210 And enumerate allows you to solve this kind of problem 2540 02:07:00,210 --> 02:07:03,660 much more simply by iterating over some sequence 2541 02:07:03,660 --> 02:07:07,810 but finding out not each value one at a time, 2542 02:07:07,810 --> 02:07:12,630 but both the value one at a time and the index thereof. 2543 02:07:12,630 --> 02:07:15,010 It gives you back two answers at once. 2544 02:07:15,010 --> 02:07:18,240 So if I go back to VS Code here now and take this approach, 2545 02:07:18,240 --> 02:07:21,990 I don't need to do this complicated range, and length, 2546 02:07:21,990 --> 02:07:23,730 and then i all over the place. 2547 02:07:23,730 --> 02:07:25,620 I can more succinctly do this. 2548 02:07:25,620 --> 02:07:31,560 I can say for i comma student in the enumerate return 2549 02:07:31,560 --> 02:07:33,550 value passing in students. 2550 02:07:33,550 --> 02:07:36,150 So this gives me back an enumeration, if you will. 2551 02:07:36,150 --> 02:07:39,690 And now, I can go about printing i plus 1 as before. 2552 02:07:39,690 --> 02:07:41,190 And I can print out the student. 2553 02:07:41,190 --> 02:07:44,625 So I don't need to index into the list with bracket i notation. 2554 02:07:44,625 --> 02:07:45,750 I don't need to call range. 2555 02:07:45,750 --> 02:07:47,040 I don't need to call length. 2556 02:07:47,040 --> 02:07:50,340 Again, enumerate takes a sequence of values like these students, 2557 02:07:50,340 --> 02:07:54,390 and it allows me to get back the current index zero, one, two, 2558 02:07:54,390 --> 02:07:58,630 and the current value Hermione, Harry, Ron respectively. 
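For comparison, here is a rough sketch of the two loops discussed above, first with range and len and then with enumerate. Both print the same ranking.

```python
students = ["Hermione", "Harry", "Ron"]

# Index-based version: range(len(...)) plus indexing back into the list
for i in range(len(students)):
    print(i + 1, students[i])

# enumerate yields (index, value) pairs, so no indexing is needed
for i, student in enumerate(students):
    print(i + 1, student)
```

Although the lecture does not use it, enumerate also accepts an optional start argument, so enumerate(students, start=1) would remove the need for i + 1.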
2559 02:07:58,630 --> 02:08:00,330 So now, just tighten things up further. 2560 02:08:00,330 --> 02:08:01,955 And indeed, that's been our theme here. 2561 02:08:01,955 --> 02:08:04,770 Can we solve the same problems as we've been solving for weeks 2562 02:08:04,770 --> 02:08:09,320 but tighten things up using just more of this toolkit? 2563 02:08:09,320 --> 02:08:12,260 Allow us to equip you with one final tool for your tool kit, 2564 02:08:12,260 --> 02:08:16,323 namely this ability to generate values in Python from functions. 2565 02:08:16,323 --> 02:08:18,990 This is not a problem that we've necessarily encountered before. 2566 02:08:18,990 --> 02:08:22,340 But it turns out, if you're writing a function that reads or generates 2567 02:08:22,340 --> 02:08:24,500 lots of data, your function, your program, 2568 02:08:24,500 --> 02:08:26,990 your computer might very well run out of memory. 2569 02:08:26,990 --> 02:08:30,410 And your program might not be able to run any further. 2570 02:08:30,410 --> 02:08:33,065 But it turns out there's a solution to this problem that's 2571 02:08:33,065 --> 02:08:34,940 something you might have in your back pocket, 2572 02:08:34,940 --> 02:08:38,600 particularly if after this course you start crunching quite a few numbers 2573 02:08:38,600 --> 02:08:40,460 and analyzing all the more data. 2574 02:08:40,460 --> 02:08:42,233 In fact, let's go back to VS Code here. 2575 02:08:42,233 --> 02:08:44,150 And let's go ahead and create a program that's 2576 02:08:44,150 --> 02:08:47,113 perhaps timely at this time of day, particularly 2577 02:08:47,113 --> 02:08:50,030 depending on your time zone, you might be feeling all the more sleepy. 2578 02:08:50,030 --> 02:08:52,100 But here in the US, it's quite common to be 2579 02:08:52,100 --> 02:08:54,590 lulled to sleep when you're struggling otherwise 2580 02:08:54,590 --> 02:08:56,240 by counting sheep in your head. 
2581 02:08:56,240 --> 02:08:58,400 And typically, as depicted in cartoons, you 2582 02:08:58,400 --> 02:09:01,580 might see in your mind's eye one sheep jumping over a fence, and then two, 2583 02:09:01,580 --> 02:09:03,138 and then three sheep, and then four. 2584 02:09:03,138 --> 02:09:05,180 And then, eventually, you presumably get so bored 2585 02:09:05,180 --> 02:09:07,710 counting these sheep you actually do fall asleep. 2586 02:09:07,710 --> 02:09:11,570 So in VS Code here, let's create a program called sleep.py 2587 02:09:11,570 --> 02:09:14,810 that allows me to print out some number of sheep 2588 02:09:14,810 --> 02:09:17,750 as though I'm counting them in my mind's eye. 2589 02:09:17,750 --> 02:09:19,610 And via this program, let's do this. 2590 02:09:19,610 --> 02:09:22,250 Let's prompt the user for a variable n, setting 2591 02:09:22,250 --> 02:09:25,610 it equal to the integer conversion of the return value of input, 2592 02:09:25,610 --> 02:09:27,530 asking the user what's n? 2593 02:09:27,530 --> 02:09:29,720 For how many sheep do they want to try counting? 2594 02:09:29,720 --> 02:09:31,760 And then, let's do a familiar for loop here. 2595 02:09:31,760 --> 02:09:33,750 And we'll start counting from zero as always. 2596 02:09:33,750 --> 02:09:36,667 So we'll first have zero sheep, then one sheep, then two sheep, and so 2597 02:09:36,667 --> 02:09:39,620 on for i in the range of that value n. 2598 02:09:39,620 --> 02:09:40,880 Go ahead and print out. 2599 02:09:40,880 --> 02:09:44,090 And I'll paste here an emoji representing a sheep times i. 2600 02:09:44,090 --> 02:09:46,220 So the first iteration I'll see zero sheep. 2601 02:09:46,220 --> 02:09:49,880 The second iteration you'll see one, and then two, and then however many 2602 02:09:49,880 --> 02:09:53,330 specified by n ultimately minus one. 2603 02:09:53,330 --> 02:09:56,690 Let's go down into my terminal window here and run Python of sleep.py. 
2604 02:09:56,690 --> 02:09:59,450 And I should see, indeed, after typing in, say, three 2605 02:09:59,450 --> 02:10:03,667 for my value of n, zero sheep, then one sheep, then two sheep, and so forth. 2606 02:10:03,667 --> 02:10:05,750 And if I make my terminal window even bigger here, 2607 02:10:05,750 --> 02:10:08,990 we can, of course, do many more than this, typing in, for instance, 10. 2608 02:10:08,990 --> 02:10:11,870 And you'll see that we get more and more sheep as time passes, 2609 02:10:11,870 --> 02:10:15,740 presumably becoming all the more tedious to envision in my mind's eye. 2610 02:10:15,740 --> 02:10:18,753 So let's now go ahead and practice what we've 2611 02:10:18,753 --> 02:10:21,170 been preaching when it comes to the design of this program 2612 02:10:21,170 --> 02:10:24,690 and see if and when we actually run into a problem. 2613 02:10:24,690 --> 02:10:27,830 Let me go ahead here now and put all of this in a main function 2614 02:10:27,830 --> 02:10:30,410 by defining main up here as always. 2615 02:10:30,410 --> 02:10:32,640 Let me go ahead and indent all of this code here. 2616 02:10:32,640 --> 02:10:34,890 And then, let me just do this conditionally as always, 2617 02:10:34,890 --> 02:10:39,020 if the name of this file equals equals quote unquote main, let's go ahead 2618 02:10:39,020 --> 02:10:39,890 and call main. 2619 02:10:39,890 --> 02:10:43,520 Let's make sure I didn't break anything just yet, even though functionally this 2620 02:10:43,520 --> 02:10:44,880 should be nearly the same. 2621 02:10:44,880 --> 02:10:48,080 And if I type in three, I still have zero, then one, 2622 02:10:48,080 --> 02:10:50,180 then two sheep on the screen. 2623 02:10:50,180 --> 02:10:53,690 But we've been in the habit of course of creating helper functions 2624 02:10:53,690 --> 02:10:54,380 for ourselves. 
2625 02:10:54,380 --> 02:10:56,510 That is, factoring our code in a way that 2626 02:10:56,510 --> 02:11:00,530 allows us to abstract away certain functionality, like generating 2627 02:11:00,530 --> 02:11:03,093 some number of sheep into separate functions. 2628 02:11:03,093 --> 02:11:04,760 So that, one, they're indeed abstracted. 2629 02:11:04,760 --> 02:11:07,302 And we no longer have to think about how they're implemented. 2630 02:11:07,302 --> 02:11:09,860 And we can even reuse them in projects as in libraries. 2631 02:11:09,860 --> 02:11:13,460 But we've also been in the habit too of now testing those functions 2632 02:11:13,460 --> 02:11:14,670 as with unit tests. 2633 02:11:14,670 --> 02:11:17,660 So I probably shouldn't keep all of my logic anyway in main. 2634 02:11:17,660 --> 02:11:19,280 And let's factor some of this out. 2635 02:11:19,280 --> 02:11:24,980 Wouldn't it be nice if I could, for instance, just call a sheep function 2636 02:11:24,980 --> 02:11:27,990 as by taking this line of code here. 2637 02:11:27,990 --> 02:11:30,140 And instead of just printing it here, let's print 2638 02:11:30,140 --> 02:11:32,270 out the return value of a new function called 2639 02:11:32,270 --> 02:11:37,310 sheep that tells the function how many sheep to print, i in this case. 2640 02:11:37,310 --> 02:11:40,610 Let's go down as always and create another function here called sheep. 2641 02:11:40,610 --> 02:11:43,910 The sheep function now will take a parameter n that specifies 2642 02:11:43,910 --> 02:11:45,960 how many sheep do you want to return. 2643 02:11:45,960 --> 02:11:48,440 And so, that we can test this as with a unit test, 2644 02:11:48,440 --> 02:11:52,430 though we won't do that here, let me go ahead and not print the number of sheep 2645 02:11:52,430 --> 02:11:53,720 as via a side effect. 
2646 02:11:53,720 --> 02:11:57,830 But let me go ahead and return one of those sheep times n 2647 02:11:57,830 --> 02:12:00,980 so that the user gets back a whole string of sheep that's 2648 02:12:00,980 --> 02:12:03,320 the appropriate number to print. 2649 02:12:03,320 --> 02:12:05,270 So here too functionally, I don't think we've 2650 02:12:05,270 --> 02:12:06,890 changed anything too fundamentally. 2651 02:12:06,890 --> 02:12:09,860 Python of sleep.py typing three still gives us 2652 02:12:09,860 --> 02:12:11,870 zero, then one, and then two sheep. 2653 02:12:11,870 --> 02:12:18,170 But now, we at least have a framework for focusing on the implementation 2654 02:12:18,170 --> 02:12:19,640 of this sheep function. 2655 02:12:19,640 --> 02:12:24,890 But it's a little inelegant now that it's still up to the main function 2656 02:12:24,890 --> 02:12:26,330 to do this iteration. 2657 02:12:26,330 --> 02:12:28,610 We've seen in the past way back in week zero, 2658 02:12:28,610 --> 02:12:31,730 wouldn't it be nice to define a function that actually handles 2659 02:12:31,730 --> 02:12:34,040 the process of returning the entire string 2660 02:12:34,040 --> 02:12:38,180 that we want rather than just one row of sheep at a time. 2661 02:12:38,180 --> 02:12:39,840 Well, I think we can do this. 2662 02:12:39,840 --> 02:12:42,630 Why don't I go ahead and change sheep as follows. 2663 02:12:42,630 --> 02:12:48,110 Let me go ahead here and first create a flock of sheep that's initially empty 2664 02:12:48,110 --> 02:12:49,410 using an empty list. 2665 02:12:49,410 --> 02:12:55,640 Then for i in the range of n, let's go ahead and append to that flock, 2666 02:12:55,640 --> 02:12:58,760 for instance, one sheep times i, so that I 2667 02:12:58,760 --> 02:13:02,570 keep adding to this list zero sheep, then one sheep, then two sheep, then 2668 02:13:02,570 --> 02:13:03,660 three, and so forth. 
2669 02:13:03,660 --> 02:13:06,500 And then, ultimately, I'm going to return the whole flock of sheep 2670 02:13:06,500 --> 02:13:07,130 at once. 2671 02:13:07,130 --> 02:13:09,510 So this is going to return the equivalent of all 2672 02:13:09,510 --> 02:13:14,170 of those strings of sheep so that, ah, main can handle the printing thereof. 2673 02:13:14,170 --> 02:13:16,110 So back up here in main, let's do this. 2674 02:13:16,110 --> 02:13:19,110 How about for each sheep, I'll call it s since sheep 2675 02:13:19,110 --> 02:13:24,210 is both singular and plural, for s in sheep of n, which again returns 2676 02:13:24,210 --> 02:13:27,960 to me a list of all of the sheep, the whole flock, let's just print 2677 02:13:27,960 --> 02:13:31,260 out each sheep, s, one at a time. 2678 02:13:31,260 --> 02:13:33,300 So, so far so good here, I think. 2679 02:13:33,300 --> 02:13:36,060 Let me go ahead and run Python of sleep.py and hit Enter. 2680 02:13:36,060 --> 02:13:37,380 What's n? Three. 2681 02:13:37,380 --> 02:13:40,560 And that still seems to work just fine. 2682 02:13:40,560 --> 02:13:45,300 But let me get a little creative here and see 2683 02:13:45,300 --> 02:13:48,790 not just three sheep on my screen but maybe 10 rows of sheep. 2684 02:13:48,790 --> 02:13:50,280 And that too seems to work fine. 2685 02:13:50,280 --> 02:13:53,970 Let me get a little more adventurous and type in maybe 100 sheep. 2686 02:13:53,970 --> 02:13:56,100 And it's starting to look ugly, to be fair, 2687 02:13:56,100 --> 02:13:58,290 but they're all printing out pretty fast. 2688 02:13:58,290 --> 02:14:02,610 Let me go ahead and try again with maybe 1,000 sheep on the screen. 2689 02:14:02,610 --> 02:14:04,210 And they flew by pretty fast. 2690 02:14:04,210 --> 02:14:05,310 It's still pretty messy. 2691 02:14:05,310 --> 02:14:06,420 But they're all there. 2692 02:14:06,420 --> 02:14:07,740 We could count them all up.
2693 02:14:07,740 --> 02:14:11,610 How about not just 1,000 but 10,000 sheep? 2694 02:14:11,610 --> 02:14:13,080 Well, that too seems OK. 2695 02:14:13,080 --> 02:14:14,712 It's taking like 10 times as long. 2696 02:14:14,712 --> 02:14:16,920 And that's why you see this flickering on the screen. 2697 02:14:16,920 --> 02:14:18,810 All of the sheep are still printing. 2698 02:14:18,810 --> 02:14:22,420 But, but, but, it's a lot of data being printed. 2699 02:14:22,420 --> 02:14:27,240 If I hang in there a little longer, hopefully we'll 2700 02:14:27,240 --> 02:14:32,490 see all 10,000 sheep coming to pass. 2701 02:14:32,490 --> 02:14:38,160 This is here in the video where we will speed up time, a real online. 2702 02:14:38,160 --> 02:14:38,670 Oh my God. 2703 02:14:38,670 --> 02:14:40,410 This is a lot of sheep. 2704 02:14:40,410 --> 02:14:40,990 There we go. 2705 02:14:40,990 --> 02:14:41,490 OK. 2706 02:14:41,490 --> 02:14:44,220 And now all of my sheep have been printed. 2707 02:14:44,220 --> 02:14:46,320 So it seems to be working just fine. 2708 02:14:46,320 --> 02:14:48,420 Well, let me just be even more adventurous 2709 02:14:48,420 --> 02:14:50,220 and, OK, let me try my luck. 2710 02:14:50,220 --> 02:14:56,930 Let me try, how about one million sheep this time and hit Enter? 2711 02:14:56,930 --> 02:14:58,250 Ha. 2712 02:14:58,250 --> 02:15:03,070 Something's no longer working. 2713 02:15:03,070 --> 02:15:06,670 While we wait for a spoiler here, does anyone 2714 02:15:06,670 --> 02:15:12,400 have any intuition for why my program suddenly stopped printing sheep? 2715 02:15:12,400 --> 02:15:17,320 What is going wrong in this version, wherein I'm generating 2716 02:15:17,320 --> 02:15:19,360 this really big flock of sheep? 2717 02:15:19,360 --> 02:15:22,745 STUDENT: We might have run out of memory or computation power. 2718 02:15:22,745 --> 02:15:23,620 DAVID J. MALAN: Yeah. 
2719 02:15:23,620 --> 02:15:26,020 So maybe we're actually pushing the limits of my Mac, 2720 02:15:26,020 --> 02:15:30,760 my PCs, my cloud server's memory or CPU, the brains of the computer's 2721 02:15:30,760 --> 02:15:34,570 capabilities because it's just trying to generate 2722 02:15:34,570 --> 02:15:38,920 massive, massive, massive lists of sheep, one million 2723 02:15:38,920 --> 02:15:41,950 of those rows of sheep, each of which has a huge number of sheep. 2724 02:15:41,950 --> 02:15:46,330 And it seems that my computer here is honestly just really struggling. 2725 02:15:46,330 --> 02:15:48,340 And this is really unfortunate now. 2726 02:15:48,340 --> 02:15:50,920 Because it would seem that even though this program clearly 2727 02:15:50,920 --> 02:15:54,760 works pretty well for 1,000 sheep, 10,000 sheep, once 2728 02:15:54,760 --> 02:15:58,520 you cross some threshold, it just stops working altogether. 2729 02:15:58,520 --> 02:16:02,770 Or it just takes way too long for the program to be useful anymore. 2730 02:16:02,770 --> 02:16:07,630 But this seems a little silly, because theoretically, I should absolutely 2731 02:16:07,630 --> 02:16:10,450 be able to print all of these same sheep if I just printed one 2732 02:16:10,450 --> 02:16:15,430 right away, then print two right away, then print three, then four, then five. 2733 02:16:15,430 --> 02:16:19,180 It seems that the essence of this problem, if I go back to my code, 2734 02:16:19,180 --> 02:16:24,610 is that per my best practices that I'm trying to practice what I'm preaching, 2735 02:16:24,610 --> 02:16:26,920 it seems that the fundamental problem is that I've 2736 02:16:26,920 --> 02:16:31,000 modularized my code by creating this helper function called sheep, 2737 02:16:31,000 --> 02:16:34,570 whose purpose in life is to do all of the generation of sheep 2738 02:16:34,570 --> 02:16:37,639 and then return all of them at once. 
2739 02:16:37,639 --> 02:16:39,080 Wouldn't it be better-- 2740 02:16:39,080 --> 02:16:41,629 and I can actually hear my fan turning on now even just 2741 02:16:41,629 --> 02:16:43,160 trying to generate these sheep-- 2742 02:16:43,160 --> 02:16:46,940 wouldn't it be better then to just print the sheep one, two, three, 2743 02:16:46,940 --> 02:16:47,959 four at a time? 2744 02:16:47,959 --> 02:16:49,219 Well, we could do that. 2745 02:16:49,219 --> 02:16:51,260 But that's really a step backwards. 2746 02:16:51,260 --> 02:16:54,469 That rather contradicts all of the lessons learned of the past few weeks. 2747 02:16:54,469 --> 02:16:57,709 Where generally not putting everything in main is a good thing. 2748 02:16:57,709 --> 02:17:01,280 Generally having an additional function that you can then test separately 2749 02:17:01,280 --> 02:17:02,790 with unit tests is a good thing. 2750 02:17:02,790 --> 02:17:06,320 Do we really need to give up all of those best practices just 2751 02:17:06,320 --> 02:17:09,469 to print out some sheep and here fall asleep? 2752 02:17:09,469 --> 02:17:12,770 Well, it turns out there's a solution to this problem, 2753 02:17:12,770 --> 02:17:16,129 and namely in the form of these generators in Python. 2754 02:17:16,129 --> 02:17:18,410 You can define a function as a generator, 2755 02:17:18,410 --> 02:17:24,059 whereby it can still generate a massive amount of data for your users. 2756 02:17:24,059 --> 02:17:28,400 But you can have it return just a little bit of that data at a time. 2757 02:17:28,400 --> 02:17:31,940 And you yourself can implement the code in almost the same way. 2758 02:17:31,940 --> 02:17:36,000 But you don't have to worry about too much getting returned all at once. 2759 02:17:36,000 --> 02:17:38,690 These too, like all features of Python, are documented 2760 02:17:38,690 --> 02:17:40,670 in the official documentation therein.
2761 02:17:40,670 --> 02:17:45,980 But what you'll find, ultimately, is that it all boils down to this keyword here, 2762 02:17:45,980 --> 02:17:46,940 yield. 2763 02:17:46,940 --> 02:17:49,549 Up until now, when we've been making functions, 2764 02:17:49,549 --> 02:17:54,260 we have been defining functions that return values, if at all, 2765 02:17:54,260 --> 02:17:56,059 using the keyword return. 2766 02:17:56,059 --> 02:17:58,040 And indeed, if we go back to our code here, 2767 02:17:58,040 --> 02:17:59,969 that's exactly what I've been waiting for. 2768 02:17:59,969 --> 02:18:02,809 I've been waiting to return the whole flock at once. 2769 02:18:02,809 --> 02:18:05,690 Unfortunately, if you wait too long, and here we have it, 2770 02:18:05,690 --> 02:18:08,330 my program was quote unquote killed. 2771 02:18:08,330 --> 02:18:13,370 That is to say, my computer got so fed up with how much memory and CPU 2772 02:18:13,370 --> 02:18:16,549 it was trying to use it just said, nope, you're not going to run at all. 2773 02:18:16,549 --> 02:18:17,660 And that's unfortunate. 2774 02:18:17,660 --> 02:18:20,420 Now my program no longer works for large numbers 2775 02:18:20,420 --> 02:18:24,110 of sleeps-- sheeps, which is not good if I'm really having trouble 2776 02:18:24,110 --> 02:18:26,000 falling asleep some night. 2777 02:18:26,000 --> 02:18:29,780 So how can I use yield to solve this problem instead? 2778 02:18:29,780 --> 02:18:31,070 Well, let me do this. 2779 02:18:31,070 --> 02:18:36,740 Instead of building up this massive list of sheep in this big list called flock, 2780 02:18:36,740 --> 02:18:38,660 let's just do this instead. 2781 02:18:38,660 --> 02:18:41,660 Let me go ahead and simplify this whole function 2782 02:18:41,660 --> 02:18:46,670 as follows, whereby I iterate for i in the range of n.
2783 02:18:46,670 --> 02:18:49,700 And then on each iteration, in the past I 2784 02:18:49,700 --> 02:18:52,700 might have been inclined to use return and return 2785 02:18:52,700 --> 02:18:55,070 something like one sheep times i. 2786 02:18:55,070 --> 02:18:56,719 But this won't work here. 2787 02:18:56,719 --> 02:18:59,330 Because if you want a million sheep and you 2788 02:18:59,330 --> 02:19:03,290 start a for loop saying for i in the range of a million, 2789 02:19:03,290 --> 02:19:06,469 you're going to return accidentally zero sheep right away. 2790 02:19:06,469 --> 02:19:08,480 And then this function is essentially useless. 2791 02:19:08,480 --> 02:19:11,785 You shouldn't return a value in the middle of a loop like this 2792 02:19:11,785 --> 02:19:14,660 because you're not going to get to any of these subsequent iterations 2793 02:19:14,660 --> 02:19:15,160 of the loop. 2794 02:19:15,160 --> 02:19:17,809 It's going to iterate once, and boom, you return. 2795 02:19:17,809 --> 02:19:20,870 But thanks to this other keyword in Python 2796 02:19:20,870 --> 02:19:26,540 called yield, you can tell Python to effectively return just one 2797 02:19:26,540 --> 02:19:28,740 value at a time from this loop. 2798 02:19:28,740 --> 02:19:31,309 So if I go back to this version of my code here. 2799 02:19:31,309 --> 02:19:38,719 And I say not return but yield, this is like saying return one value at a time, 2800 02:19:38,719 --> 02:19:42,770 return one value at a time, return one value at a time. 2801 02:19:42,770 --> 02:19:45,500 The for loop will keep working and I will 2802 02:19:45,500 --> 02:19:49,880 keep counting from zero, to one, to two, all the way up toward one million. 2803 02:19:49,880 --> 02:19:52,970 But each time, the function is just going to hand you 2804 02:19:52,970 --> 02:19:54,260 back a little piece of data. 
2805 02:19:54,260 --> 02:19:57,890 It's going to generate, so to speak, just a little bit of that data, 2806 02:19:57,890 --> 02:19:59,570 not all of the data at once. 2807 02:19:59,570 --> 02:20:00,440 And that's good. 2808 02:20:00,440 --> 02:20:04,010 Because my computer has a decent amount of RAM, certainly enough to fit 2809 02:20:04,010 --> 02:20:05,120 one row of sheep. 2810 02:20:05,120 --> 02:20:07,400 It just doesn't have enough memory to fit, apparently, 2811 02:20:07,400 --> 02:20:10,430 one million rows of so many sheep. 2812 02:20:10,430 --> 02:20:15,410 So now, if I go to my terminal window and run Python of sleep.py and hit 2813 02:20:15,410 --> 02:20:16,910 Enter, what's n? 2814 02:20:16,910 --> 02:20:20,210 Three would still work-- zero, then one, and then two. 2815 02:20:20,210 --> 02:20:23,930 Let me go ahead and increase the size of this here and run Python of sleep.py. 2816 02:20:23,930 --> 02:20:27,980 Let's try one million as before and hit Enter. 2817 02:20:27,980 --> 02:20:31,002 And now I immediately see results. 2818 02:20:31,002 --> 02:20:32,960 I don't think we'll wait for all of these sheep 2819 02:20:32,960 --> 02:20:35,870 to be printed, because then we will literally all be asleep. 2820 02:20:35,870 --> 02:20:40,310 But what you'll notice happening now is the program is not hanging, 2821 02:20:40,310 --> 02:20:40,810 so to speak. 2822 02:20:40,810 --> 02:20:43,227 It's not waiting, and waiting, and thinking, and thinking, 2823 02:20:43,227 --> 02:20:45,260 and trying to generate the entire flock at once. 2824 02:20:45,260 --> 02:20:48,187 It's just generating one row of sheep at a time. 2825 02:20:48,187 --> 02:20:51,020 And it's flickering on the screen because there are so many of them. 2826 02:20:51,020 --> 02:20:52,700 And that's all thanks to yield. 2827 02:20:52,700 --> 02:20:57,530 It's generating a little bit of data at a time, not all at once. 
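The yield-based version described above can be sketched as follows. As before, the emoji constant SHEEP is an assumption (the transcript only says "one sheep times i"), and a fixed n stands in for the user's input.

```python
SHEEP = "🐑"  # assumed symbol; the transcript only says "one sheep"


def sheep(n):
    """Yield n strings one at a time, where row i contains i sheep."""
    for i in range(n):
        # yield hands back one row and pauses here until the caller asks for
        # the next one; no list of all the rows is ever built
        yield SHEEP * i


# The loop in main stays the same as with the list version
for s in sheep(3):
    print(s)
```

Calling sheep(1_000_000) returns immediately, since no rows are computed until the loop asks for them, and only the row currently being printed needs to be in memory.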
2828 02:20:57,530 --> 02:21:03,030 Any questions now on this feature called generators? 2829 02:21:03,030 --> 02:21:06,260 2830 02:21:06,260 --> 02:21:07,760 Any questions at all? 2831 02:21:07,760 --> 02:21:10,910 To add one more piece of terminology to the mix, just so you've heard it. 2832 02:21:10,910 --> 02:21:17,840 This same feature here is returning what we'll technically now call an iterator. 2833 02:21:17,840 --> 02:21:20,000 Yield is returning an iterator that allows 2834 02:21:20,000 --> 02:21:25,550 your own code, your own for loop in main to iterate over these generated values 2835 02:21:25,550 --> 02:21:27,450 one at a time. 2836 02:21:27,450 --> 02:21:34,010 STUDENT: How does this yield actually work under the hood? 2837 02:21:34,010 --> 02:21:38,545 I mean, is it using multithreading? 2838 02:21:38,545 --> 02:21:40,670 DAVID J. MALAN: You can think of the implementation 2839 02:21:40,670 --> 02:21:42,860 as being asynchronous in this sense. 2840 02:21:42,860 --> 02:21:48,050 Whereby the function is returning a value immediately and then 2841 02:21:48,050 --> 02:21:50,900 subsequently giving you back another one as well. 2842 02:21:50,900 --> 02:21:52,910 Underneath the hood, what's really happening 2843 02:21:52,910 --> 02:21:55,740 is the generator is just retaining state for you. 2844 02:21:55,740 --> 02:21:58,790 It's not going to run the entire loop from top to bottom 2845 02:21:58,790 --> 02:22:00,140 and then return a value. 2846 02:22:00,140 --> 02:22:02,660 It's going to do one iteration and yield a result. 2847 02:22:02,660 --> 02:22:07,460 And then Python for you is going to suspend the function, if you will, 2848 02:22:07,460 --> 02:22:09,650 but remember on what iteration it was. 2849 02:22:09,650 --> 02:22:12,170 So the next time you iterate over it, as is 2850 02:22:12,170 --> 02:22:14,570 going to happen again and again in this for loop in main, 2851 02:22:14,570 --> 02:22:17,570 you get back another value again and again.
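To make the "retaining state" point concrete, here is a small sketch that steps through the generator by hand with the built-in next() instead of a for loop. Stepping manually is an illustration added here, not something shown in the lecture, and the emoji constant SHEEP is again an assumption.

```python
SHEEP = "🐑"  # assumed symbol; the transcript only says "one sheep"


def sheep(n):
    for i in range(n):
        yield SHEEP * i  # execution is suspended here between requests


rows = sheep(3)      # creates a generator object; the loop body has not run yet
first = next(rows)   # runs until the first yield (i is 0), then suspends
second = next(rows)  # resumes where it left off and yields the row for i = 1

# A generator is its own iterator, which is why a for loop can consume it
same_object = iter(rows) is rows

# The for loop simply picks up from the remembered position
remaining = [s for s in rows]
```

Nothing here involves threads: the single function is paused at yield and resumed on the next request, with the value of i preserved in between.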
2852 02:22:17,570 --> 02:22:20,300 So yield returns, indeed, this thing called an iterator. 2853 02:22:20,300 --> 02:22:24,690 And that iterator can be stepped over as in a loop one element at a time. 2854 02:22:24,690 --> 02:22:27,410 But the language, Python, handles all of that for you 2855 02:22:27,410 --> 02:22:32,120 so that you don't need to do all of the underlying plumbing yourself. 2856 02:22:32,120 --> 02:22:36,510 How about time for one other question on these generators and iterators 2857 02:22:36,510 --> 02:22:38,990 as our sheep continue to fly by? 2858 02:22:38,990 --> 02:22:41,960 STUDENT: So if every iteration, the program 2859 02:22:41,960 --> 02:22:46,270 will return the memory to the system so the program will not crash? 2860 02:22:46,270 --> 02:22:47,270 DAVID J. MALAN: Correct. 2861 02:22:47,270 --> 02:22:50,840 On each iteration, it's only returning the one string of sheep that's 2862 02:22:50,840 --> 02:22:53,030 appropriate for the current value of i. 2863 02:22:53,030 --> 02:22:57,080 It is not trying to return all million rows of the same. 2864 02:22:57,080 --> 02:23:00,470 And therefore, it uses really one millionth the amount of memory, 2865 02:23:00,470 --> 02:23:03,980 although that's a bit of an oversimplification. 2866 02:23:03,980 --> 02:23:04,850 All right. 2867 02:23:04,850 --> 02:23:07,040 As these sheep continue to fly across the 2868 02:23:07,040 --> 02:23:10,130 screen, let me now go ahead and interrupt this, 2869 02:23:10,130 --> 02:23:13,460 as you might have had to in the past with infinite loops in your own code. 2870 02:23:13,460 --> 02:23:16,250 Even though this isn't infinite, it's just really long-- control 2871 02:23:16,250 --> 02:23:19,400 C will interrupt with your keyboard that program, giving me 2872 02:23:19,400 --> 02:23:21,950 back control of my computer. 2873 02:23:21,950 --> 02:23:25,760 Well, here we are at the end of CS50's Introduction 2874 02:23:25,760 --> 02:23:27,200 to Programming with Python.
2875 02:23:27,200 --> 02:23:31,700 And if today in particular of all days felt like a real escalation 2876 02:23:31,700 --> 02:23:34,040 real quickly, realize that these are really-- 2877 02:23:34,040 --> 02:23:37,700 these are just additional perhaps optional tools in your toolkit 2878 02:23:37,700 --> 02:23:40,520 that you can add to all of the past lessons learned. 2879 02:23:40,520 --> 02:23:43,723 So that as you exit from this course and tackle other courses or projects 2880 02:23:43,723 --> 02:23:45,890 of your own, you have all the more of a mental model 2881 02:23:45,890 --> 02:23:49,220 and all the more of a toolbox with which to solve those same problems. 2882 02:23:49,220 --> 02:23:52,010 If we think back now just a few weeks ago, 2883 02:23:52,010 --> 02:23:55,193 it was probably in our focus on functions and variables 2884 02:23:55,193 --> 02:23:56,610 that you first started struggling. 2885 02:23:56,610 --> 02:23:59,120 But now in retrospect, if you look back at those problems 2886 02:23:59,120 --> 02:24:01,670 and those same problem sets, odds are those same problems 2887 02:24:01,670 --> 02:24:03,560 would come all too easily to you now. 2888 02:24:03,560 --> 02:24:05,870 Conditionals was the next step in the class, wherein 2889 02:24:05,870 --> 02:24:08,810 we gave you the ability to ask questions and get answers 2890 02:24:08,810 --> 02:24:10,940 and therefore do things conditionally in your code. 2891 02:24:10,940 --> 02:24:12,355 We came full circle today. 2892 02:24:12,355 --> 02:24:15,230 And you can see that you can now use those same kinds of conditionals 2893 02:24:15,230 --> 02:24:18,680 now to do fancier things with list comprehensions and dictionary 2894 02:24:18,680 --> 02:24:19,970 comprehensions and the like. 2895 02:24:19,970 --> 02:24:23,450 Loops, of course, have been omnipresent now for weeks, including today 2896 02:24:23,450 --> 02:24:25,070 as we built up those same structures.
2897 02:24:25,070 --> 02:24:26,750 And of course, something can go wrong. 2898 02:24:26,750 --> 02:24:29,600 And exceptions and exception handling was our mechanism 2899 02:24:29,600 --> 02:24:34,022 for not only catching errors in code but also raising your own exception. 2900 02:24:34,022 --> 02:24:36,980 So that if you're laying the foundation to write code for other people, 2901 02:24:36,980 --> 02:24:39,590 as in the form of libraries, you can do that too. 2902 02:24:39,590 --> 02:24:43,190 Libraries, of course, are things you can not only use but now write on your own, 2903 02:24:43,190 --> 02:24:45,713 be it a small module or whole package of code 2904 02:24:45,713 --> 02:24:47,880 that you want to share with others around the world. 2905 02:24:47,880 --> 02:24:52,250 And even better, can you write tests for your own code, for your libraries, 2906 02:24:52,250 --> 02:24:53,402 for others' code as well. 2907 02:24:53,402 --> 02:24:55,610 So that ultimately, you can be all the more confident 2908 02:24:55,610 --> 02:24:57,470 that not only your code is correct today. 2909 02:24:57,470 --> 02:24:59,780 But if you make a change to your code tomorrow, 2910 02:24:59,780 --> 02:25:02,960 you haven't broken anything, at least according to your tests 2911 02:25:02,960 --> 02:25:04,580 if they continue to pass. 2912 02:25:04,580 --> 02:25:07,400 File I/O though, meanwhile, was a way of now storing 2913 02:25:07,400 --> 02:25:10,160 data not just in the computer's memory like all of these sheep, 2914 02:25:10,160 --> 02:25:13,700 but actually storing things persistently, longer term to disk, 2915 02:25:13,700 --> 02:25:17,510 be it in a CSV or something more like a binary file like an image. 2916 02:25:17,510 --> 02:25:20,240 With regular expressions, you then had the ability 2917 02:25:20,240 --> 02:25:23,870 to express patterns and actually validate data or extract data 2918 02:25:23,870 --> 02:25:24,620 from information.
2919 02:25:24,620 --> 02:25:28,280 All the more of a useful technique nowadays when so much of the world 2920 02:25:28,280 --> 02:25:31,370 is trying to analyze and process data at scale, some of which 2921 02:25:31,370 --> 02:25:35,300 might in fact be quite messy from the get-go. 2922 02:25:35,300 --> 02:25:38,090 And then, of course, most recently, object-oriented programming, 2923 02:25:38,090 --> 02:25:40,610 an opportunity to solve the same kinds of problems 2924 02:25:40,610 --> 02:25:42,770 but with a slightly different perspective, a way 2925 02:25:42,770 --> 02:25:47,600 to encapsulate and to represent real world entities, this time in code. 2926 02:25:47,600 --> 02:25:50,720 And today, of course, et cetera with so many other tools 2927 02:25:50,720 --> 02:25:54,110 that you can add that didn't necessarily fall under any of those earlier 2928 02:25:54,110 --> 02:25:57,410 umbrellas but are useful functions, and data types, 2929 02:25:57,410 --> 02:26:00,260 and techniques just to have, again, in your back pocket 2930 02:26:00,260 --> 02:26:03,830 as yet other mechanisms for solving problems as well. 2931 02:26:03,830 --> 02:26:07,170 Not just putting everyone to sleep, but I thought another way to end 2932 02:26:07,170 --> 02:26:11,670 might be a little more vocally to try writing one final program together, 2933 02:26:11,670 --> 02:26:15,030 this one using a library we've seen in the past, as well as one other. 2934 02:26:15,030 --> 02:26:19,800 I've taken the liberty of installing a text to speech library on my computer 2935 02:26:19,800 --> 02:26:20,550 here. 2936 02:26:20,550 --> 02:26:25,530 And I'm going to go ahead, perhaps, and open a new file here called say.py 2937 02:26:25,530 --> 02:26:26,610 in VS Code. 2938 02:26:26,610 --> 02:26:32,430 And I'm going to go ahead here and first import our own friend, import cowsay. 
2939 02:26:32,430 --> 02:26:37,500 And I'm going to import this new library here, import pyttsx3, 2940 02:26:37,500 --> 02:26:39,540 the Python text to speech library. 2941 02:26:39,540 --> 02:26:42,900 And now, per its documentation, which I read in advance, 2942 02:26:42,900 --> 02:26:45,540 I'm going to go ahead and create a variable for myself here, 2943 02:26:45,540 --> 02:26:52,530 engine equals pyttsx3 init to initialize that library for text to speech. 2944 02:26:52,530 --> 02:26:56,160 I'm going to then ask the user, well, what do I want to hear spoken? 2945 02:26:56,160 --> 02:26:57,660 And I might do something like this. 2946 02:26:57,660 --> 02:27:00,840 A variable called this equals the return value of input. 2947 02:27:00,840 --> 02:27:03,000 What's this shall be my simple question. 2948 02:27:03,000 --> 02:27:05,310 And I'm going to keep it this time as a string. 2949 02:27:05,310 --> 02:27:06,870 We've seen how to use cowsay. 2950 02:27:06,870 --> 02:27:09,420 We can do cowsay.cow of this. 2951 02:27:09,420 --> 02:27:12,600 Turns out this new library can allow me to use its own engine 2952 02:27:12,600 --> 02:27:14,800 to say this as well. 2953 02:27:14,800 --> 02:27:19,020 But then, ultimately, I'm going to have to run the engine.runAndWait, 2954 02:27:19,020 --> 02:27:21,750 just in case it's a long phrase or sentence to be said. 2955 02:27:21,750 --> 02:27:22,830 But that's it. 2956 02:27:22,830 --> 02:27:27,000 In just eight lines of code, not only am I apparently going to have a cow 2957 02:27:27,000 --> 02:27:32,010 appear on the screen to close us out now but also some synthesized text. 2958 02:27:32,010 --> 02:27:35,732 Ultimately then we hope with this course that you've not only learned Python, 2959 02:27:35,732 --> 02:27:37,440 that you've not only learned programming, 2960 02:27:37,440 --> 02:27:39,440 but you've really learned how to solve problems, 2961 02:27:39,440 --> 02:27:42,180 and ultimately how to teach yourself new languages. 
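The say.py program described above might look like the following sketch. It relies on the third-party packages cowsay and pyttsx3 named in the lecture and uses only the calls mentioned there (cowsay.cow, pyttsx3.init, the engine's say, and runAndWait). Wrapping the code in main() with the imports inside it, and the exact wording of the prompt, are assumptions made here so that the file can be loaded without the packages installed; the lecture's version imports both at the top of the file.

```python
def main():
    # Assumption: imported here rather than at the top of the file, so that
    # loading this sketch does not require the third-party packages
    import cowsay   # third-party: pip install cowsay
    import pyttsx3  # third-party: pip install pyttsx3

    engine = pyttsx3.init()        # initialize the text-to-speech engine
    this = input("What's this? ")  # keep the user's answer as a str
    cowsay.cow(this)               # draw the cow with the text on screen
    engine.say(this)               # queue the same text to be spoken
    engine.runAndWait()            # block until the speech has finished
```

Adding a call to main() at the end of the file and running it reproduces the demo, provided both packages are installed and the machine has a working speech backend.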
2962 02:27:42,180 --> 02:27:45,597 Funny enough, I myself only learned Python just a few years ago. 2963 02:27:45,597 --> 02:27:48,930 And even though I certainly went through some formal documentation and resources 2964 02:27:48,930 --> 02:27:52,530 online, I mostly learned what I know now and even what 2965 02:27:52,530 --> 02:27:55,740 I had to learn again for today by just asking lots of questions, 2966 02:27:55,740 --> 02:27:59,600 be it of Google, or friends, who are more versed in this language than I. 2967 02:27:59,600 --> 02:28:02,470 And so, having that instinct, having that vocabulary [INAUDIBLE] 2968 02:28:02,470 --> 02:28:05,550 which to ask questions of others to search for answers to questions, 2969 02:28:05,550 --> 02:28:08,790 you absolutely now have enough of a foundation in Python 2970 02:28:08,790 --> 02:28:11,217 and programming to go off and stand on your own. 2971 02:28:11,217 --> 02:28:13,800 So you can certainly-- and you're welcome and encouraged to go 2972 02:28:13,800 --> 02:28:16,830 on and take other courses in Python and programming specifically. 2973 02:28:16,830 --> 02:28:19,350 But better still, as quickly as you can, is 2974 02:28:19,350 --> 02:28:22,200 to find some project that's personally of interest that 2975 02:28:22,200 --> 02:28:24,030 uses Python or some other language. 2976 02:28:24,030 --> 02:28:26,730 Because at least from my own experience, I tend to learn best 2977 02:28:26,730 --> 02:28:29,880 and I hope you might too by actually applying these skills. 2978 02:28:29,880 --> 02:28:32,700 Not to problems in the classroom, but really truly 2979 02:28:32,700 --> 02:28:34,980 to problems in the real world. 2980 02:28:34,980 --> 02:28:39,180 Allow me with all that said to look at my full screen terminal window here. 2981 02:28:39,180 --> 02:28:44,010 Run Python of say.py, crossing my fingers one final time in 2982 02:28:44,010 --> 02:28:46,860 hopes that I've not made any mistakes or bugs. 
2983 02:28:46,860 --> 02:28:47,790 And here we go. 2984 02:28:47,790 --> 02:28:50,620 Python of say.py prompting me, what's this? 2985 02:28:50,620 --> 02:28:52,740 How about we end on this note here. 2986 02:28:52,740 --> 02:28:55,420 2987 02:28:55,420 --> 02:28:57,480 COMPUTER: This was CS50. 2988 02:28:57,480 --> 02:29:04,000