1 00:00:00,000 --> 00:00:02,988 [MUSIC PLAYING] 2 00:00:02,988 --> 00:01:01,320 3 00:01:01,320 --> 00:01:02,670 DAVID MALAN: All right. 4 00:01:02,670 --> 00:01:06,240 This is CS50, and this is finally week 6. 5 00:01:06,240 --> 00:01:08,460 And this is that week we promised, wherein we finally 6 00:01:08,460 --> 00:01:13,705 transition from C, this lower-level older language via which we explored 7 00:01:13,705 --> 00:01:16,080 memory and how really computers work underneath the hood, 8 00:01:16,080 --> 00:01:19,590 to what's now called Python, which is a more modern, higher-level language, 9 00:01:19,590 --> 00:01:22,632 whereby we're still going to be able to solve the same types of problems. 10 00:01:22,632 --> 00:01:25,560 But it's going to suddenly start to get much, much easier because what 11 00:01:25,560 --> 00:01:29,190 Python offers, as do higher-level languages more generally, 12 00:01:29,190 --> 00:01:31,680 are what we might describe as abstractions 13 00:01:31,680 --> 00:01:35,850 over the very low-level ideas that you've been implementing in sections 14 00:01:35,850 --> 00:01:37,780 and problem sets and so much more. 15 00:01:37,780 --> 00:01:39,838 But recall from week 0, where we began. 16 00:01:39,838 --> 00:01:42,630 This was our simplest of programs that just printed "hello, world." 17 00:01:42,630 --> 00:01:44,820 Things escalated quickly thereafter in week 1, 18 00:01:44,820 --> 00:01:46,860 where, suddenly, we had all of this new syntax. 19 00:01:46,860 --> 00:01:50,790 But the idea was still the same of just printing out "hello, world." 20 00:01:50,790 --> 00:01:53,670 Well, as of today, a lot of that distraction, 21 00:01:53,670 --> 00:01:56,490 a lot of the visual distraction, goes away entirely 22 00:01:56,490 --> 00:02:02,410 such that what used to be this in C will now be quite simply this in Python. 
23 00:02:02,410 --> 00:02:04,480 And that's a bit of a head fake in that we're 24 00:02:04,480 --> 00:02:06,850 going to see some other fancier features of Python. 25 00:02:06,850 --> 00:02:09,910 But you'll find that Python's popularity in large part 26 00:02:09,910 --> 00:02:13,300 derives from just how relatively readable it is 27 00:02:13,300 --> 00:02:16,750 and also, as we'll ultimately see, just how exciting 28 00:02:16,750 --> 00:02:19,930 and filled the ecosystem among Python programmers. 29 00:02:19,930 --> 00:02:22,068 That is to say there's a lot more libraries. 30 00:02:22,068 --> 00:02:24,610 There's a lot more problems that people have solved in Python 31 00:02:24,610 --> 00:02:26,980 that you can now incorporate into your own programs 32 00:02:26,980 --> 00:02:30,640 in order to stand on their shoulders and get real work done faster. 33 00:02:30,640 --> 00:02:33,430 But recall, though, from C that we had a few steps via which 34 00:02:33,430 --> 00:02:35,360 to actually compile that kind of code. 35 00:02:35,360 --> 00:02:38,470 So we got into the habit of make to make our program called hello. 36 00:02:38,470 --> 00:02:41,560 And then we've been in the habit of running it with ./hello, 37 00:02:41,560 --> 00:02:45,070 the effect of which, of course, is to feed all of the zeros and ones that 38 00:02:45,070 --> 00:02:48,610 compose the hello program into the computer's memory and, in turn, 39 00:02:48,610 --> 00:02:49,510 the CPU. 40 00:02:49,510 --> 00:02:52,270 We revealed that what make is really doing 41 00:02:52,270 --> 00:02:56,410 is something a little more specific, namely running clang, the C language 42 00:02:56,410 --> 00:03:00,610 compiler specifically, with some automatic command line arguments so as 43 00:03:00,610 --> 00:03:04,250 to output the name that you want, link in the library that you want, 44 00:03:04,250 --> 00:03:05,030 and so forth. 
45 00:03:05,030 --> 00:03:08,590 But with Python, wonderfully, we're going to get rid of those steps, 46 00:03:08,590 --> 00:03:11,650 too, and quite simply run it as follows. 47 00:03:11,650 --> 00:03:15,100 Henceforth, our programs will no longer be in files ending in .c, 48 00:03:15,100 --> 00:03:16,090 suffice it to say. 49 00:03:16,090 --> 00:03:18,970 Our files starting today are going to start ending with .py, 50 00:03:18,970 --> 00:03:22,280 which is an indication to the computer-- macOS, Windows, 51 00:03:22,280 --> 00:03:25,450 or Linux or anything else-- that this is a Python program. 52 00:03:25,450 --> 00:03:30,310 But unlike C, wherein we've been in the habit of compiling our code 53 00:03:30,310 --> 00:03:33,580 and running it, compiling our code and running it, any time you make 54 00:03:33,580 --> 00:03:37,590 a change, with Python, those two steps get reduced into one, such 55 00:03:37,590 --> 00:03:40,090 that any time you make a change and want to rerun your code, 56 00:03:40,090 --> 00:03:42,310 you don't explicitly compile it anymore. 57 00:03:42,310 --> 00:03:46,930 You instead just run a program called python, similar in spirit to clang. 58 00:03:46,930 --> 00:03:49,780 But whereas clang is a compiler, python, we'll 59 00:03:49,780 --> 00:03:53,050 see, is not only the name of the language, but the name of a program. 60 00:03:53,050 --> 00:03:56,110 And the type of that program is that of an interpreter. 61 00:03:56,110 --> 00:03:59,710 An interpreter is a program that reads your code top to bottom, left to right, 62 00:03:59,710 --> 00:04:04,300 and really does what it says without having this intermediate step of first 63 00:04:04,300 --> 00:04:06,860 having to compile it into zeros and ones. 64 00:04:06,860 --> 00:04:08,570 So with that said, let me do this. 65 00:04:08,570 --> 00:04:10,690 Let me flip over here to VS Code. 66 00:04:10,690 --> 00:04:13,600 And within VS Code, let me write my first Python program. 
67 00:04:13,600 --> 00:04:17,200 And as always, I can create a new file with the code command within VS Code. 68 00:04:17,200 --> 00:04:20,500 I'm going to create this file called hello.py, for instance. 69 00:04:20,500 --> 00:04:25,345 And quite, quite simply, I'm going to go ahead and simply do print("hello, 70 00:04:25,345 --> 00:04:27,560 world"). 71 00:04:27,560 --> 00:04:30,430 And if I go down to my terminal window, instead of compiling this, 72 00:04:30,430 --> 00:04:34,600 I'm instead going to interpret this program by running python, space, 73 00:04:34,600 --> 00:04:37,600 and the name of the file I want Python to interpret, hitting Enter. 74 00:04:37,600 --> 00:04:38,650 And voila. 75 00:04:38,650 --> 00:04:40,970 Now you see "hello, world." 76 00:04:40,970 --> 00:04:43,300 But let me go ahead and compare this at left. 77 00:04:43,300 --> 00:04:47,740 Let me also go ahead and bring back briefly a file called hello.c. 78 00:04:47,740 --> 00:04:51,250 And I'm going to do this as we did in the very first day of C, 79 00:04:51,250 --> 00:04:53,740 where I included stdio.h. 80 00:04:53,740 --> 00:04:55,540 I did int main(void). 81 00:04:55,540 --> 00:04:59,770 I did inside of there printf(), quote unquote, "hello, world," backslash n, 82 00:04:59,770 --> 00:05:01,508 close quote, semicolon. 83 00:05:01,508 --> 00:05:02,800 And let me go ahead in VS Code. 84 00:05:02,800 --> 00:05:05,390 And if you drag your file over to the right or the left, 85 00:05:05,390 --> 00:05:08,072 you can actually split-screen things if of help. 86 00:05:08,072 --> 00:05:10,780 And what I've done here is-- and let me hide my terminal window-- 87 00:05:10,780 --> 00:05:12,940 I've now compared these two files left and right. 88 00:05:12,940 --> 00:05:15,280 So here's hello.c from, say, week 1. 89 00:05:15,280 --> 00:05:18,100 Here's hello.py from week 6 now. 90 00:05:18,100 --> 00:05:20,680 And the differences are perhaps obvious. 
91 00:05:20,680 --> 00:05:23,890 But there's still some-- there's a subtlety, at least one subtlety. 92 00:05:23,890 --> 00:05:26,530 Beyond getting rid of lots of syntax, what 93 00:05:26,530 --> 00:05:29,020 did I apparently omit from my Python version, 94 00:05:29,020 --> 00:05:32,680 even though it didn't appear to behave in any buggy way? 95 00:05:32,680 --> 00:05:34,180 Yeah? 96 00:05:34,180 --> 00:05:35,380 Sorry? 97 00:05:35,380 --> 00:05:36,180 Say one more time? 98 00:05:36,180 --> 00:05:36,620 AUDIENCE: The library. 99 00:05:36,620 --> 00:05:37,700 DAVID MALAN: The library. 100 00:05:37,700 --> 00:05:41,570 So I didn't have to include any kind of library like the standard I/O library. 101 00:05:41,570 --> 00:05:44,123 print(), apparently, in Python, just works. 102 00:05:44,123 --> 00:05:45,290 AUDIENCE: main() [INAUDIBLE] 103 00:05:45,290 --> 00:05:47,790 DAVID MALAN: So I don't need to use main() anymore. 104 00:05:47,790 --> 00:05:51,350 So this main() function, to be clear, was required in C because that's what 105 00:05:51,350 --> 00:05:54,050 told the compiler what the main part of your program is. 106 00:05:54,050 --> 00:05:56,060 And you can't just start writing code otherwise. 107 00:05:56,060 --> 00:05:56,935 What else do you see? 108 00:05:56,935 --> 00:05:57,893 AUDIENCE: No semicolon. 109 00:05:57,893 --> 00:06:00,560 DAVID MALAN: So there's no more semicolon, wonderfully enough, 110 00:06:00,560 --> 00:06:03,090 at the end of this line, even though there was here. 111 00:06:03,090 --> 00:06:05,090 And things are getting a little more subtle now. 112 00:06:05,090 --> 00:06:06,300 What else? 113 00:06:06,300 --> 00:06:07,460 So the new line. 114 00:06:07,460 --> 00:06:10,520 So recall that in printf(), if you wanted to move the cursor to the next 115 00:06:10,520 --> 00:06:13,470 line when you're done printing, you had to do it yourself. 
116 00:06:13,470 --> 00:06:15,500 So it seems as though Python-- 117 00:06:15,500 --> 00:06:17,810 because when I interpreted this program a moment ago, 118 00:06:17,810 --> 00:06:21,150 the cursor did move to the next line on its own. 119 00:06:21,150 --> 00:06:23,340 They sort of reversed the default behavior. 120 00:06:23,340 --> 00:06:25,680 So those are just some of the salient differences here. 121 00:06:25,680 --> 00:06:29,540 One, you don't have to explicitly include standard library, so to speak, 122 00:06:29,540 --> 00:06:33,728 like standard I/O. You don't need to define a main() function anymore. 123 00:06:33,728 --> 00:06:35,270 You can just start writing your code. 124 00:06:35,270 --> 00:06:37,790 You don't need these parentheses, these curly braces. 125 00:06:37,790 --> 00:06:40,280 printf() is now called print(), it would seem. 126 00:06:40,280 --> 00:06:43,100 And you don't need the backslash n. 127 00:06:43,100 --> 00:06:46,250 Now, there is one thing that's also a little looser, 128 00:06:46,250 --> 00:06:47,700 even though I didn't do it here. 129 00:06:47,700 --> 00:06:51,710 Even though in C, it was required to use double quotes any times you-- 130 00:06:51,710 --> 00:06:56,240 any time you want to use a string, a.k.a., char*, in Python, 131 00:06:56,240 --> 00:07:00,210 as with a lot of languages nowadays, you can actually get away with just using 132 00:07:00,210 --> 00:07:03,780 single quotes so long as you are consistent. 133 00:07:03,780 --> 00:07:05,730 Generally speaking, some people like this 134 00:07:05,730 --> 00:07:07,650 because you don't have to hold Shift, and therefore, you just 135 00:07:07,650 --> 00:07:08,970 hit one key instead of two. 136 00:07:08,970 --> 00:07:11,290 So there's an argument in terms of efficiency. 137 00:07:11,290 --> 00:07:14,220 However, if you want to use an apostrophe in your string, 138 00:07:14,220 --> 00:07:15,580 then you have to escape it. 
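The differences listed above (no include, no main(), no semicolon, no backslash n, either kind of quote) can be tried out in a few lines. This is a minimal sketch rather than the lecture's own file: the variable names are made up for illustration, and the `end` keyword argument is not mentioned in the passage above, although it is a standard parameter of Python's built-in print().

```python
# hello.py needs no #include, no main(), no braces and no semicolon.
print("hello, world")

# Single and double quotes create the same kind of string.
double = "hello, world"
single = 'hello, world'
print(double == single)

# An apostrophe inside single quotes has to be escaped with a backslash...
escaped = 'it\'s week 6'
# ...whereas inside double quotes it can be written as-is.
plain = "it's week 6"
print(escaped == plain)

# print() moves the cursor to the next line by default.
# Passing end="" (an addition here, not from the lecture passage) suppresses
# that, which mimics a printf() call without a trailing \n.
print("hello, ", end="")
print("world")
```

Saved as a .py file, this runs with `python` followed by the file name, with no compile step.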
139 00:07:15,580 --> 00:07:19,020 And so in general, stylistically, I'll use double quotes in this way. 140 00:07:19,020 --> 00:07:21,570 But things are getting a little looser now with Python, 141 00:07:21,570 --> 00:07:24,390 whereby that's not actually a requirement. 142 00:07:24,390 --> 00:07:27,240 But what's especially exciting with Python, 143 00:07:27,240 --> 00:07:30,960 and, really, a lot of higher-level languages, is just how much real work 144 00:07:30,960 --> 00:07:32,642 you can get done relatively quickly. 145 00:07:32,642 --> 00:07:34,350 So you've just spent quite a bit of time, 146 00:07:34,350 --> 00:07:37,050 daresay, implementing your spell checker and implementing 147 00:07:37,050 --> 00:07:39,360 your own dictionary of sorts. 148 00:07:39,360 --> 00:07:42,420 Well, let me propose that maybe we should have asked you 149 00:07:42,420 --> 00:07:45,513 to do that in Python instead of C. Why? 150 00:07:45,513 --> 00:07:46,930 Well, let me go ahead and do this. 151 00:07:46,930 --> 00:07:50,170 Let me close these two tabs and reopen my terminal window. 152 00:07:50,170 --> 00:07:52,440 Let me go into a directory called speller 153 00:07:52,440 --> 00:07:55,020 that I downloaded in advance for class. 154 00:07:55,020 --> 00:07:57,150 And if I type ls in here, you'll notice that it's 155 00:07:57,150 --> 00:08:00,660 very similar to what you spent time on with problem set 5. 156 00:08:00,660 --> 00:08:02,670 But the file extensions are different. 157 00:08:02,670 --> 00:08:05,040 There's a dictionary.py instead of dictionary.c. 158 00:08:05,040 --> 00:08:07,620 There's a speller.py instead of a speller.c. 159 00:08:07,620 --> 00:08:10,980 And there's the exact same directories, dictionaries, and texts 160 00:08:10,980 --> 00:08:13,110 that we gave you for problem set 5. 161 00:08:13,110 --> 00:08:18,660 So let me just stipulate that I spent time implementing speller.c in Python. 162 00:08:18,660 --> 00:08:20,580 And so I gave it a name of speller.py. 
163 00:08:20,580 --> 00:08:25,090 But I didn't go about really implementing dictionary.py yet. 164 00:08:25,090 --> 00:08:29,250 And so why don't we go ahead and actually implement dictionary.py 165 00:08:29,250 --> 00:08:30,900 together by doing this? 166 00:08:30,900 --> 00:08:34,530 Let me clear my terminal, do code dictionary.py. 167 00:08:34,530 --> 00:08:38,309 And let me propose that we implement, ultimately, four functions. 168 00:08:38,309 --> 00:08:40,120 And what are those functions going to be? 169 00:08:40,120 --> 00:08:42,960 Well, they're going to be the check() function, the load() function, 170 00:08:42,960 --> 00:08:45,420 the size() function, and the unload() function. 171 00:08:45,420 --> 00:08:50,140 But recall that in problem set 5, you implemented your own hash table. 172 00:08:50,140 --> 00:08:54,333 And so while there isn't a hash table data type in Python, 173 00:08:54,333 --> 00:08:55,750 I'm going to go ahead and do this. 174 00:08:55,750 --> 00:08:58,860 I'm going to create a variable, a global variable in dictionary.py, 175 00:08:58,860 --> 00:09:01,620 called words, and I'm going to make it a set. 176 00:09:01,620 --> 00:09:03,960 In the mathematical sense, a set is a collection 177 00:09:03,960 --> 00:09:05,970 of things that won't contain duplicates. 178 00:09:05,970 --> 00:09:07,588 Any duplicates will be filtered out. 179 00:09:07,588 --> 00:09:10,380 So I'm going to now, after that, creating that one global variable, 180 00:09:10,380 --> 00:09:13,410 I'm going to create a function called check(), just as you did. 181 00:09:13,410 --> 00:09:15,730 And check() takes as input a word. 182 00:09:15,730 --> 00:09:19,200 And if I want to check if a word is in that set of words, 183 00:09:19,200 --> 00:09:25,080 I can simply do word.lower() in words. 184 00:09:25,080 --> 00:09:26,070 And that's it. 
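To make the set idea concrete, here is a small sketch of the two behaviors described above: duplicates are filtered out, and `in` checks membership. The sample words are invented for illustration and are not from the course's dictionaries.

```python
# A set stores each value at most once, so the second "cat" is ignored.
words = set()
words.update(["cat", "dog", "cat"])
print(len(words))  # 2

# `in` asks whether a value is a member of the set.
# lower() is a method call, so it needs its parentheses.
print("Dog".lower() in words)  # True
print("bird" in words)         # False
```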
185 00:09:26,070 --> 00:09:28,740 Let me now define another function called load(), which, recall, 186 00:09:28,740 --> 00:09:32,310 took an argument, which was the name of the dictionary you want to load 187 00:09:32,310 --> 00:09:33,060 into memory. 188 00:09:33,060 --> 00:09:35,352 Inside of my load() function, I'm now going to do this. 189 00:09:35,352 --> 00:09:39,525 I'm going to say with open(dictionary) as a variable called file. 190 00:09:39,525 --> 00:09:42,150 And in there, I'm going to go ahead and update the set of words 191 00:09:42,150 --> 00:09:47,670 to be the updated version of whatever's in this file as a result of reading 192 00:09:47,670 --> 00:09:50,580 it and then splitting its lines, whereby this file has 193 00:09:50,580 --> 00:09:54,090 a big, long column of words, each of which is separated by a new line, 194 00:09:54,090 --> 00:09:57,930 splitlines is going to split all of those into one big collection. 195 00:09:57,930 --> 00:10:00,270 And then I'm just going to go ahead and return True. 196 00:10:00,270 --> 00:10:03,643 I'm now going to go ahead and define a size function, just as you did. 197 00:10:03,643 --> 00:10:06,810 But in Python, I'm going to go ahead and just go ahead and return the length 198 00:10:06,810 --> 00:10:12,120 of that set of words, where length, or len(), is a function itself in Python. 199 00:10:12,120 --> 00:10:14,170 And I'm going to do one last function. 200 00:10:14,170 --> 00:10:17,560 It turns out that in Python, even though, for this program, 201 00:10:17,560 --> 00:10:20,040 I'm going to go and implement a function called unload, 202 00:10:20,040 --> 00:10:22,500 there's not actually anything to unload in Python, 203 00:10:22,500 --> 00:10:25,410 because Python will manage your memory for you. 204 00:10:25,410 --> 00:10:27,750 malloc() is gone. free() is gone. 205 00:10:27,750 --> 00:10:29,340 Pointers are gone. 206 00:10:29,340 --> 00:10:33,280 It handles all of that, seemingly magically for now, for you. 
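Put together, the four functions dictated above come out roughly as follows. The function bodies follow the spoken description; the docstrings and comments are additions, and the return value of unload() is an assumption, since the passage only says there is nothing to unload.

```python
# dictionary.py, assembled from the steps described above

# Global set of known words; a set never holds duplicates.
words = set()


def check(word):
    """Return True if word is in the dictionary, ignoring case."""
    return word.lower() in words


def load(dictionary):
    """Read the file at path `dictionary`, one word per line, into the set."""
    with open(dictionary) as file:
        # read() returns the whole file; splitlines() breaks it at newlines.
        words.update(file.read().splitlines())
    return True


def size():
    """Return how many words have been loaded."""
    return len(words)


def unload():
    """Nothing to free by hand: Python manages the memory itself."""
    return True  # assumed return value, mirroring load()
```

The `with` statement also closes the file automatically when its block ends, so there is no counterpart to C's fclose() here.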
207 00:10:33,280 --> 00:10:37,080 So here then is, I claim, what you could have done with problem set 5 208 00:10:37,080 --> 00:10:39,090 if implementing it in Python instead. 209 00:10:39,090 --> 00:10:41,470 Let me go ahead and open my terminal window. 210 00:10:41,470 --> 00:10:42,690 Let me increase its size. 211 00:10:42,690 --> 00:10:46,380 Let me run Python of speller.py, which is the name of the actual program, not 212 00:10:46,380 --> 00:10:48,360 the dictionary per se that I implemented. 213 00:10:48,360 --> 00:10:51,300 Let's run it on a file called holmes.txt because that 214 00:10:51,300 --> 00:10:53,140 was a particularly big file. 215 00:10:53,140 --> 00:10:55,680 And if I hit Enter now, we'll see, hopefully, 216 00:10:55,680 --> 00:10:59,630 the same output that you saw in C flying across the screen. 217 00:10:59,630 --> 00:11:02,870 And eventually, we should see that same summary at the bottom 218 00:11:02,870 --> 00:11:04,820 as to how many words seem to be misspelled, 219 00:11:04,820 --> 00:11:09,170 how many words were in the dictionary, and, ultimately, how 220 00:11:09,170 --> 00:11:11,510 fast this whole process was. 221 00:11:11,510 --> 00:11:15,500 Now, the total amount of time required was 1.93 seconds, which was actually 222 00:11:15,500 --> 00:11:16,940 longer than it seemed to take. 223 00:11:16,940 --> 00:11:18,590 That's because we're doing this in the cloud, 224 00:11:18,590 --> 00:11:21,715 and it was taking some amount of time to send all of the text to my screen. 225 00:11:21,715 --> 00:11:26,120 But the code was only taking 1.93 seconds total on the actual server. 
226 00:11:26,120 --> 00:11:29,300 And hopefully, these same kinds of numbers line up with your own, 227 00:11:29,300 --> 00:11:33,500 the difference being what I did not have to implement for this spell checker is 228 00:11:33,500 --> 00:11:36,500 your own hash table, is your own dictionary, literally, 229 00:11:36,500 --> 00:11:41,780 beyond what I've done using Python here with some of these built-in features. 230 00:11:41,780 --> 00:11:45,440 So why use C? Why not always use Python, 231 00:11:45,440 --> 00:11:48,140 assuming that you prefer the idea of being 232 00:11:48,140 --> 00:11:53,540 able to whip up within seconds the entirety of problem set 5? 233 00:11:53,540 --> 00:11:57,940 How might you choose now between languages? 234 00:11:57,940 --> 00:11:59,940 And I apologize if you're harboring resentment 235 00:11:59,940 --> 00:12:02,790 that this wasn't a week earlier. 236 00:12:02,790 --> 00:12:05,850 Why Python or why C? 237 00:12:05,850 --> 00:12:08,790 Any instincts? 238 00:12:08,790 --> 00:12:09,615 Any thoughts? 239 00:12:09,615 --> 00:12:11,190 There's hopefully a reason? 240 00:12:11,190 --> 00:12:13,390 Yeah, over here? 241 00:12:13,390 --> 00:12:15,286 Yeah? 242 00:12:15,286 --> 00:12:20,813 AUDIENCE: I always thought that Python was a little slower than C [INAUDIBLE] 243 00:12:20,813 --> 00:12:22,480 DAVID MALAN: Ah, really good conjecture. 244 00:12:22,480 --> 00:12:24,730 So you always thought that Python was slower than C 245 00:12:24,730 --> 00:12:27,950 and takes up more space than C. Odds are that's, in fact, correct. 246 00:12:27,950 --> 00:12:32,050 So even though, ultimately, this 1.93 seconds is still pretty darn fast, 247 00:12:32,050 --> 00:12:35,230 odds are it's a little slower than the C version would have been. 248 00:12:35,230 --> 00:12:38,110 It's possible, too, that my version in Python 249 00:12:38,110 --> 00:12:41,050 actually does take up more RAM or memory underneath the hood. 250 00:12:41,050 --> 00:12:41,560 Why? 
251 00:12:41,560 --> 00:12:44,950 Well, because Python itself is managing memory for you. 252 00:12:44,950 --> 00:12:49,420 And it doesn't necessarily know a priori how much memory you're going to need. 253 00:12:49,420 --> 00:12:52,840 You, the programmer might, and you, the programmer writing in C, 254 00:12:52,840 --> 00:12:55,540 allocated presumably exactly as much memory 255 00:12:55,540 --> 00:12:58,660 as you might have needed last week with problem set 5. 256 00:12:58,660 --> 00:13:01,630 But Python's got to maybe do its best effort for you 257 00:13:01,630 --> 00:13:05,260 and try to manage memory for you, and there's going to be some overhead. 258 00:13:05,260 --> 00:13:08,170 The fact that I have so many fewer lines of code, 259 00:13:08,170 --> 00:13:11,560 the fact that these lines of code solve problem set 5 for me, 260 00:13:11,560 --> 00:13:17,050 means that Python, or whoever invented Python, they wrote lines of code 261 00:13:17,050 --> 00:13:19,190 to sort of give me this functionality. 262 00:13:19,190 --> 00:13:21,440 And so if you think of Python as a middleman of sorts, 263 00:13:21,440 --> 00:13:23,420 it's doing more work for me. 264 00:13:23,420 --> 00:13:24,960 It's doing more of the heavy lift. 265 00:13:24,960 --> 00:13:26,540 So it might take me a bit more time. 266 00:13:26,540 --> 00:13:30,050 But, my gosh, look how much time it has saved in terms 267 00:13:30,050 --> 00:13:31,820 of writing this code more quickly. 268 00:13:31,820 --> 00:13:34,220 And arguably, this code is even more readable, 269 00:13:34,220 --> 00:13:38,660 or at least will be after today, week 6, once you have an eye for the syntax 270 00:13:38,660 --> 00:13:41,430 and features of Python itself. 271 00:13:41,430 --> 00:13:45,030 So beyond that, it turns out you can do other things pretty easily as well. 272 00:13:45,030 --> 00:13:47,490 Let me go back into my terminal window. 273 00:13:47,490 --> 00:13:49,400 Let me close this dictionary.py. 
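On the memory overhead point raised above, one rough way to see it is to ask Python how large its own objects are. This is only a sketch and not part of the lecture: it assumes the standard CPython interpreter, where `sys.getsizeof` is available, and the exact numbers vary by version and platform.

```python
import sys

# In C an int typically takes 4 bytes. A CPython int is a full object that
# also carries bookkeeping such as a reference count and a type pointer.
int_size = sys.getsizeof(1)
print(f"one Python int: {int_size} bytes")

# A set is a hash table that keeps spare slots so it can grow cheaply.
# getsizeof() reports only the container, not the strings stored in it.
words = {"cat", "dog", "fish"}
set_size = sys.getsizeof(words)
raw_chars = sum(len(word) for word in words)
print(f"set container: {set_size} bytes for {raw_chars} characters of text")
```

The convenience of not calling malloc() and free() is paid for with this kind of per-object bookkeeping.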
274 00:13:49,400 --> 00:13:52,220 Let me go into a folder called filter, in which 275 00:13:52,220 --> 00:13:56,520 I have this same bridge that we've seen in the past across the river there. 276 00:13:56,520 --> 00:13:57,352 So here's a bridge. 277 00:13:57,352 --> 00:13:59,810 This is the original version of this particular photograph. 278 00:13:59,810 --> 00:14:02,640 Suppose I actually want to write a program that blurs this. 279 00:14:02,640 --> 00:14:06,590 Well, you might recall from problem set 4 you could write that same code in C 280 00:14:06,590 --> 00:14:10,760 by manipulating all of the red, the green, the blue pixels that 281 00:14:10,760 --> 00:14:12,770 are ultimately composing that file. 282 00:14:12,770 --> 00:14:14,940 But let me go ahead and propose this instead. 283 00:14:14,940 --> 00:14:17,540 Let me create a file called blur.py. 284 00:14:17,540 --> 00:14:23,840 And in this file, let me go ahead and just go ahead and import a library. 285 00:14:23,840 --> 00:14:27,950 So from the Python image library, PIL, let me go ahead 286 00:14:27,950 --> 00:14:31,790 and import something called Image, capital I, and ImageFilter, capital 287 00:14:31,790 --> 00:14:32,810 I, capital F. 288 00:14:32,810 --> 00:14:35,570 So I'm going to do before = Image.open("bridge.bmp"). 289 00:14:35,570 --> 00:14:38,480 290 00:14:38,480 --> 00:14:41,300 Then let me go ahead and create another variable called after 291 00:14:41,300 --> 00:14:46,220 and set that equal to before.filter, and then, in parentheses, 292 00:14:46,220 --> 00:14:50,328 ImageFilter, spelled as before, dot BoxBlur, 293 00:14:50,328 --> 00:14:51,870 and then we'll give it a value of 10. 294 00:14:51,870 --> 00:14:53,900 How much do I want to blur it, for instance? 295 00:14:53,900 --> 00:14:57,710 After that, I'm going to literally call after.save, and let's 296 00:14:57,710 --> 00:15:00,320 save it as a file called out.bmp. 297 00:15:00,320 --> 00:15:01,520 And that's it. 
298 00:15:01,520 --> 00:15:05,640 I propose that this is how you can now write code in Python to blur an image, 299 00:15:05,640 --> 00:15:07,640 much like you might have for problem set 4. 300 00:15:07,640 --> 00:15:12,200 Now let me go ahead in my terminal window and run python of blur.py. 301 00:15:12,200 --> 00:15:14,403 When I hit Enter, those four lines of code will run. 302 00:15:14,403 --> 00:15:16,070 It seems to have happened quite quickly. 303 00:15:16,070 --> 00:15:19,460 Let me go ahead and open now out.bmp. 304 00:15:19,460 --> 00:15:24,020 And whereas the previous image looked like this a moment ago, let me go ahead 305 00:15:24,020 --> 00:15:26,030 and open out.bmp. 306 00:15:26,030 --> 00:15:28,430 And hopefully, you can indeed see that it blurred it 307 00:15:28,430 --> 00:15:30,597 for me using that same code. 308 00:15:30,597 --> 00:15:32,930 And if we want things to escalate a little more quickly, 309 00:15:32,930 --> 00:15:35,220 let me go ahead and do this instead. 310 00:15:35,220 --> 00:15:36,650 Let me close blur.bmp. 311 00:15:36,650 --> 00:15:39,440 Let me go ahead and open a file called edges.py. 312 00:15:39,440 --> 00:15:41,990 And maybe, in edges.py, we can use this same library. 313 00:15:41,990 --> 00:15:47,913 So from the Python Image Library, import Image and import ImageFilter. 314 00:15:47,913 --> 00:15:50,330 Let me go ahead and create another variable called before, 315 00:15:50,330 --> 00:15:53,900 set it equal to Image.open("bridge.bmp"), 316 00:15:53,900 --> 00:15:54,980 just like before. 317 00:15:54,980 --> 00:15:57,560 Let me create another variable called after, 318 00:15:57,560 --> 00:16:03,200 set that equal to before.filter(ImageFilter.FIND_EDGES), 319 00:16:03,200 --> 00:16:06,500 which comes with this library automatically, and lastly, 320 00:16:06,500 --> 00:16:10,370 the same thing-- save this as a file called out.bmp. 
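Written out, the two scripts dictated above (blur.py and edges.py) amount to the calls below. Wrapping them in functions with `source` and `destination` parameters is a choice made here so that both fit in one block; the lecture types the same calls at the top level of each file. The sketch assumes the Pillow package is installed, which is what provides the `PIL` module.

```python
from PIL import Image, ImageFilter


def blur(source, destination, radius=10):
    """Save a box-blurred copy of the image at `source`."""
    before = Image.open(source)
    after = before.filter(ImageFilter.BoxBlur(radius))
    after.save(destination)


def edges(source, destination):
    """Save an edge-detected copy of the image at `source`."""
    before = Image.open(source)
    after = before.filter(ImageFilter.FIND_EDGES)
    after.save(destination)
```

With the lecture's bridge.bmp in the current folder, `blur("bridge.bmp", "out.bmp")` reproduces blur.py and `edges("bridge.bmp", "out.bmp")` reproduces edges.py.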
321 00:16:10,370 --> 00:16:12,470 So if you struggled perhaps with this one 322 00:16:12,470 --> 00:16:16,310 previously, whereby you wrote for the more comfortable version of problem 323 00:16:16,310 --> 00:16:19,580 set 4, edge detection, so to speak, well, you 324 00:16:19,580 --> 00:16:23,270 might have then created a file that given an input like this, 325 00:16:23,270 --> 00:16:28,340 the original bridge.bmp, this new version, out.bmp, with just four 326 00:16:28,340 --> 00:16:31,050 lines of code, now looks like this. 327 00:16:31,050 --> 00:16:32,840 So, again, if this is a little frustrating 328 00:16:32,840 --> 00:16:36,110 that we had to do all of this in C, that was exactly 329 00:16:36,110 --> 00:16:39,140 the point to motivate that you now understand nonetheless 330 00:16:39,140 --> 00:16:40,850 what's going on underneath the hood. 331 00:16:40,850 --> 00:16:43,580 But with Python, you can express the solutions 332 00:16:43,580 --> 00:16:46,880 to problems all the more efficiently, all the more readily. 333 00:16:46,880 --> 00:16:50,060 And just one last one, too-- it's very common nowadays 334 00:16:50,060 --> 00:16:52,910 in the world of photography and social media and the like to do face 335 00:16:52,910 --> 00:16:54,830 detection, for better or for worse. 336 00:16:54,830 --> 00:16:56,990 And it turns out that face detection, even 337 00:16:56,990 --> 00:16:59,240 if you want to integrate it into your own application, 338 00:16:59,240 --> 00:17:02,660 is something that lots of other people have integrated into their applications 339 00:17:02,660 --> 00:17:03,360 as well. 
340 00:17:03,360 --> 00:17:07,880 So Python, to my point earlier of having this very rich ecosystem of libraries 341 00:17:07,880 --> 00:17:12,349 that other people wrote, you can literally run a command like pip 342 00:17:12,349 --> 00:17:19,680 install face_recognition if you want to add support to your codespace, 343 00:17:19,680 --> 00:17:21,930 or to your programming environment more generally, 344 00:17:21,930 --> 00:17:24,532 for the notion of face recognition. 345 00:17:24,532 --> 00:17:26,490 In fact, this is going to automatically install 346 00:17:26,490 --> 00:17:29,340 from some server elsewhere a library that someone else wrote 347 00:17:29,340 --> 00:17:30,870 called face_recognition. 348 00:17:30,870 --> 00:17:34,000 And with this library, you can do something like this. 349 00:17:34,000 --> 00:17:37,290 Let me go into a directory that I came with in advance. 350 00:17:37,290 --> 00:17:40,350 Let me go ahead and ls in there, and you'll see four files-- 351 00:17:40,350 --> 00:17:43,710 detect.py and recognize.py, which are going to detect 352 00:17:43,710 --> 00:17:48,150 faces and then recognize specific faces, respectively, and then two files 353 00:17:48,150 --> 00:17:50,260 I brought from a popular TV show, for instance. 354 00:17:50,260 --> 00:17:55,350 So if I open office.jpg, here is one of the early cast photos from the hit TV 355 00:17:55,350 --> 00:17:56,580 series The Office. 356 00:17:56,580 --> 00:18:03,060 And here is a photograph of someone specific from the show, Toby. 357 00:18:03,060 --> 00:18:05,970 Now, this is, of course, Toby's face. 358 00:18:05,970 --> 00:18:10,830 But what is it that makes Toby's face a face? 359 00:18:10,830 --> 00:18:15,450 More generally, if I open up office.jpg, and I asked you, the human, 360 00:18:15,450 --> 00:18:17,917 to identify all of the faces in this picture, 361 00:18:17,917 --> 00:18:21,000 it wouldn't be that hard with a marker to sort of circle all of the faces. 
362 00:18:21,000 --> 00:18:21,510 But how? 363 00:18:21,510 --> 00:18:22,410 Why? 364 00:18:22,410 --> 00:18:26,460 How do you as humans detect faces, might you think? 365 00:18:26,460 --> 00:18:27,030 Yeah? 366 00:18:27,030 --> 00:18:28,380 AUDIENCE: You have eyes, nose. 367 00:18:28,380 --> 00:18:30,360 DAVID MALAN: Features, yeah, like eyes, nose, 368 00:18:30,360 --> 00:18:32,610 generally in a similar orientation, even though we all 369 00:18:32,610 --> 00:18:34,140 have different faces, ultimately. 370 00:18:34,140 --> 00:18:38,140 But there's a pattern to the shapes that you're seeing on the screen. 371 00:18:38,140 --> 00:18:40,140 Well, it turns out this face_recognition library 372 00:18:40,140 --> 00:18:43,590 has been trained, perhaps via artificial intelligence over time, 373 00:18:43,590 --> 00:18:46,830 to recognize faces, but any number of different faces, 374 00:18:46,830 --> 00:18:48,430 perhaps among these folks here. 375 00:18:48,430 --> 00:18:51,250 So if I go back into my terminal window here, 376 00:18:51,250 --> 00:18:56,970 let me go ahead and run, say, python of detect.py, which I wrote 377 00:18:56,970 --> 00:18:59,190 in advance, which uses that library. 378 00:18:59,190 --> 00:19:03,240 And what that program is going to do-- it's going to think, do some thinking. 379 00:19:03,240 --> 00:19:04,830 It's just found some face. 380 00:19:04,830 --> 00:19:08,100 And let me go ahead now and open a file it just created 381 00:19:08,100 --> 00:19:12,370 called detected.jpg, which I didn't have in my folder a moment ago. 382 00:19:12,370 --> 00:19:16,650 But when I open this here file, you'll now see all of the faces 383 00:19:16,650 --> 00:19:19,290 based on this library's detection thereof. 384 00:19:19,290 --> 00:19:22,500 But suppose that we're looking for a very specific face among them, 385 00:19:22,500 --> 00:19:23,435 maybe Toby's. 
386 00:19:23,435 --> 00:19:25,560 Well, maybe if we write a program that doesn't just 387 00:19:25,560 --> 00:19:30,030 take as input the office.jpg, but a second input, toby.jpg, 388 00:19:30,030 --> 00:19:32,910 maybe this library, and code more generally, 389 00:19:32,910 --> 00:19:37,320 can distinguish Toby's face from Jim's, from Pam's, from everyone else 390 00:19:37,320 --> 00:19:41,880 in the show, just based on this one piece of training data, so to speak. 391 00:19:41,880 --> 00:19:47,580 Well let me instead run python of recognize.py and hit Enter. 392 00:19:47,580 --> 00:19:50,370 It's going to do some thinking, some thinking, some thinking. 393 00:19:50,370 --> 00:19:52,800 And it is going to output now a file called 394 00:19:52,800 --> 00:19:59,740 recognized.jpg, which should show me his face, ideally, specifically. 395 00:19:59,740 --> 00:20:01,200 And so what has it done? 396 00:20:01,200 --> 00:20:05,670 Well, with sort of a green marker, there is Toby among all of these faces. 397 00:20:05,670 --> 00:20:07,800 That's maybe a dozen or so lines of code, 398 00:20:07,800 --> 00:20:10,860 but it's built on top of this ecosystem of libraries. 399 00:20:10,860 --> 00:20:14,295 And this is, again, just one of the reasons why Python is so popular. 400 00:20:14,295 --> 00:20:16,920 Undoubtedly, some number of years from now, Python will be out, 401 00:20:16,920 --> 00:20:18,378 and something else will be back in. 402 00:20:18,378 --> 00:20:21,600 But that's indeed among the goals of CS50, too, is not to teach you C, 403 00:20:21,600 --> 00:20:23,820 not to teach you Python, not in a couple of weeks 404 00:20:23,820 --> 00:20:26,280 to teach you JavaScript and other languages, too, 405 00:20:26,280 --> 00:20:28,200 but to teach you how to program. 406 00:20:28,200 --> 00:20:30,900 And indeed, all of the ideas we have explored and will now 407 00:20:30,900 --> 00:20:36,000 explore more today, you'll see recurring for languages in the years to come. 
408 00:20:36,000 --> 00:20:40,530 Any questions before we now dive into how it is this code is working 409 00:20:40,530 --> 00:20:45,150 and why I type the things that I did before we forge ahead? 410 00:20:45,150 --> 00:20:48,890 Any questions along these lines? 411 00:20:48,890 --> 00:20:50,720 Anything at all? 412 00:20:50,720 --> 00:20:51,270 No? 413 00:20:51,270 --> 00:20:51,770 All right. 414 00:20:51,770 --> 00:20:54,887 So how does Python itself work? 415 00:20:54,887 --> 00:20:56,720 Well, let's do a quick review as we did when 416 00:20:56,720 --> 00:20:58,730 we transitioned from Scratch to C, this time, 417 00:20:58,730 --> 00:21:00,470 though, from Scratch, say, to Python. 418 00:21:00,470 --> 00:21:02,507 So in Python, as with many languages, there 419 00:21:02,507 --> 00:21:05,090 are these things called functions-- the actions and verbs that 420 00:21:05,090 --> 00:21:06,330 actually get things done. 421 00:21:06,330 --> 00:21:09,590 So here on the left, recall from week 0, was the simplest of functions. 422 00:21:09,590 --> 00:21:12,320 We played with, first, the say block, which just literally has 423 00:21:12,320 --> 00:21:13,880 the cat say something on the screen. 424 00:21:13,880 --> 00:21:18,980 We've seen in C, for instance, the equivalent line of code is arguably 425 00:21:18,980 --> 00:21:22,130 this here, with printf(), with the parentheses, the quotation marks, 426 00:21:22,130 --> 00:21:23,810 the backslash n, the semicolon. 427 00:21:23,810 --> 00:21:27,330 In Python now, it's going to indeed be a little simpler than that. 428 00:21:27,330 --> 00:21:30,590 But the idea is the same as it was back in week 0. 429 00:21:30,590 --> 00:21:33,800 Libraries-- so we've seen already in C, and now we've 430 00:21:33,800 --> 00:21:36,170 already seen in Python that these things exist, too. 
431 00:21:36,170 --> 00:21:40,460 In the world of C, recall that besides the standard ones, like standard io.h, 432 00:21:40,460 --> 00:21:43,978 that header file, we could very quickly introduce cs50.h, 433 00:21:43,978 --> 00:21:45,770 which was like your entry point, the header 434 00:21:45,770 --> 00:21:49,342 file for the CS50 library, which gave you a bunch of functions as well. 435 00:21:49,342 --> 00:21:51,300 Well, we're going to give you a similar library 436 00:21:51,300 --> 00:21:54,145 for at least the next week or two, training wheels for Python 437 00:21:54,145 --> 00:21:57,270 specifically, that, again, will take off so that you can stand on your own, 438 00:21:57,270 --> 00:21:59,160 even with CS50 behind you. 439 00:21:59,160 --> 00:22:03,130 But the syntax for using a library in Python is a little different. 440 00:22:03,130 --> 00:22:05,250 You don't include a .h file. 441 00:22:05,250 --> 00:22:09,550 You just import, instead, the name of the library. 442 00:22:09,550 --> 00:22:10,050 All right. 443 00:22:10,050 --> 00:22:11,258 What does that actually mean? 444 00:22:11,258 --> 00:22:14,280 Well, if there are specific functions in that library you want to use, 445 00:22:14,280 --> 00:22:16,050 in Python, you can be more precise. 446 00:22:16,050 --> 00:22:18,750 You don't just have to say, give me the whole library. 447 00:22:18,750 --> 00:22:23,580 For efficiency purposes, you can say, let me import the get_string() function 448 00:22:23,580 --> 00:22:25,800 from the CS50 library. 449 00:22:25,800 --> 00:22:29,610 So you have finer-grained control in Python, which can actually speed things 450 00:22:29,610 --> 00:22:31,590 up if you're not loading things unnecessarily 451 00:22:31,590 --> 00:22:35,040 into memory, if all you want is, say, one feature therein. 
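The two import styles just described can be sketched with a module that ships with Python. The cs50 library has to be installed separately, so math and its sqrt() function stand in for cs50 and get_string() here; that substitution is an assumption made only so the snippet runs anywhere.

```python
# Style 1: import the whole module, then reach into it by name.
import math

print(math.sqrt(16))

# Style 2: import one specific function from the module.
# (The analogous line for the CS50 library is: from cs50 import get_string)
from math import sqrt

print(sqrt(16))
```

Both calls run the same function; the second style just lets you write the shorter name.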
452 00:22:35,040 --> 00:22:40,230 So here, for instance, in Scratch, was an example of how we might use not only 453 00:22:40,230 --> 00:22:45,120 a built-in function, like the say block, or, in C, in the printf(), 454 00:22:45,120 --> 00:22:50,250 but how we might similarly now do the same but achieve this in Python. 455 00:22:50,250 --> 00:22:51,610 So how might we do this? 456 00:22:51,610 --> 00:22:54,672 Well, in Python, or rather, in C, this code 457 00:22:54,672 --> 00:22:56,130 looks a little something like this. 458 00:22:56,130 --> 00:22:59,070 Back in week 1, we declared a variable of type string, 459 00:22:59,070 --> 00:23:01,530 even though later we revealed that to be char*. 460 00:23:01,530 --> 00:23:04,980 I gave this a variable name of answer for parity with Scratch. 461 00:23:04,980 --> 00:23:08,140 Then we use CS50's own get_string() function and asked, for instance, 462 00:23:08,140 --> 00:23:10,290 the same question as in the white oval here. 463 00:23:10,290 --> 00:23:14,430 And then, using this placeholder syntax, these format codes, 464 00:23:14,430 --> 00:23:18,870 which was printf()-specific, we could plug in that answer to this premade 465 00:23:18,870 --> 00:23:21,120 string where the %s is. 466 00:23:21,120 --> 00:23:24,370 And we saw %i and %f and a bunch of others as well. 467 00:23:24,370 --> 00:23:27,450 So this is sort of how, in C, you approximate 468 00:23:27,450 --> 00:23:31,080 the idea of concatenating two things together, joining two things, 469 00:23:31,080 --> 00:23:33,330 just as we did here in Scratch. 470 00:23:33,330 --> 00:23:36,360 So in Python, it turns out it's not only going to be a little easier, 471 00:23:36,360 --> 00:23:39,070 but there's going to be even more ways to do this. 
472 00:23:39,070 --> 00:23:42,690 And so even what might seem today like a lot of different syntax, 473 00:23:42,690 --> 00:23:45,300 it really is just different ways, stylistically, 474 00:23:45,300 --> 00:23:46,392 to achieve the same goals. 475 00:23:46,392 --> 00:23:48,600 And over time, as you get more comfortable with this, 476 00:23:48,600 --> 00:23:51,900 you too will develop your own style, or, if working for a company 477 00:23:51,900 --> 00:23:54,120 or working with a team, you might collectively 478 00:23:54,120 --> 00:23:56,950 decide which conventions you want to use. 479 00:23:56,950 --> 00:24:01,290 But here, for instance, is one way you could implement this same idea 480 00:24:01,290 --> 00:24:04,120 in Scratch but in Python instead. 481 00:24:04,120 --> 00:24:06,690 So notice I'm going to still use a variable called answer. 482 00:24:06,690 --> 00:24:08,910 I'm going to use CS50's function called get_string(). 483 00:24:08,910 --> 00:24:11,370 I'm still going to use, quote unquote, "What's your name?" 484 00:24:11,370 --> 00:24:14,408 But down here is where we see the most difference. 485 00:24:14,408 --> 00:24:16,200 It's, again, not called printf() in Python. 486 00:24:16,200 --> 00:24:18,060 It's now called just print(). 487 00:24:18,060 --> 00:24:22,350 And what might you infer the plus operator is doing here? 488 00:24:22,350 --> 00:24:26,063 It's not addition, obviously, in a mathematical sense. 489 00:24:26,063 --> 00:24:28,230 But those of you who have perhaps programmed before, 490 00:24:28,230 --> 00:24:30,195 what does the plus represent in this context? 491 00:24:30,195 --> 00:24:32,340 AUDIENCE: It's joining the two strings together. 492 00:24:32,340 --> 00:24:34,757 DAVID MALAN: It's indeed joining the two strings together. 493 00:24:34,757 --> 00:24:36,630 So this is indeed concatenating the thing 494 00:24:36,630 --> 00:24:38,830 on the left with the thing on the right. 
495 00:24:38,830 --> 00:24:42,160 So you don't use the placeholder in this particular scenario. 496 00:24:42,160 --> 00:24:44,490 You can instead, a little more simply, just use plus. 497 00:24:44,490 --> 00:24:46,120 But you want your grammar to line up. 498 00:24:46,120 --> 00:24:50,430 So I still have "hello," and then close quote 499 00:24:50,430 --> 00:24:52,950 because I want to form a full phrase. 500 00:24:52,950 --> 00:24:55,410 Notice, too, there's also one other slightly more 501 00:24:55,410 --> 00:24:58,200 subtle difference on the first line. 502 00:24:58,200 --> 00:25:01,482 Besides the fact that we don't have a semicolon, what else is different? 503 00:25:01,482 --> 00:25:03,690 AUDIENCE: You don't declare the type of the variable. 504 00:25:03,690 --> 00:25:05,982 DAVID MALAN: I didn't declare the type of the variable. 505 00:25:05,982 --> 00:25:08,340 So Python still has strings, as we'll see. 506 00:25:08,340 --> 00:25:12,290 But you don't have to tell the interpreter what type of variable 507 00:25:12,290 --> 00:25:12,790 it is. 508 00:25:12,790 --> 00:25:14,490 And this is going to save us some keystrokes, 509 00:25:14,490 --> 00:25:17,230 and it's just going to be a little more user-friendly over time. 510 00:25:17,230 --> 00:25:21,130 Meanwhile, you can do this also a little bit differently if you prefer. 511 00:25:21,130 --> 00:25:25,620 You can instead trust that the print() function in Python can actually do even 512 00:25:25,620 --> 00:25:27,210 more for you automatically. 513 00:25:27,210 --> 00:25:32,160 The print() function in Python can take multiple arguments separated by commas 514 00:25:32,160 --> 00:25:33,240 in the usual way. 515 00:25:33,240 --> 00:25:35,340 And by default, Python is going to insert 516 00:25:35,340 --> 00:25:39,767 for you a single space between its first argument and its second argument. 
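As a minimal sketch of those two approaches, here is the same greeting built with plus and then with two arguments to print(). The name is hard-coded rather than read from the user, which is an assumption made so the snippet runs without a keyboard.

```python
answer = "David"  # hard-coded stand-in for whatever the user would type

# Way 1: join two strings with +, supplying the space yourself.
greeting = "hello, " + answer
print(greeting)

# Way 2: pass two arguments; print() puts a single space between them,
# so the first string ends right after the comma.
print("hello,", answer)
```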
517 00:25:39,767 --> 00:25:41,850 So notice what I've done here is my first argument 518 00:25:41,850 --> 00:25:45,090 is, quote unquote, "hello," with a comma but no space. 519 00:25:45,090 --> 00:25:48,150 Then, outside of the quotes, I'm putting a comma because that just means, 520 00:25:48,150 --> 00:25:49,550 here comes my second argument. 521 00:25:49,550 --> 00:25:52,098 And then I put the same variable as before. 522 00:25:52,098 --> 00:25:53,890 And I'm just going to let Python figure out 523 00:25:53,890 --> 00:25:56,860 that it should, by default, per its documentation, 524 00:25:56,860 --> 00:26:02,050 can join these two variables, putting a single space in between them. 525 00:26:02,050 --> 00:26:03,610 You can do this yet another way. 526 00:26:03,610 --> 00:26:06,850 And this way looks a little weirder, but this is actually 527 00:26:06,850 --> 00:26:09,910 probably the most common way nowadays in Python 528 00:26:09,910 --> 00:26:14,450 is to use what's called a format string, or f string, for short. 529 00:26:14,450 --> 00:26:16,390 And this looks weird to me still. 530 00:26:16,390 --> 00:26:17,170 It looks weird. 531 00:26:17,170 --> 00:26:21,010 But if you prefix a string in Python with an f, 532 00:26:21,010 --> 00:26:26,020 literally, you can then use curly braces inside of that string in Python. 533 00:26:26,020 --> 00:26:29,930 And Python will not print out literally a curly brace and a closed curly brace. 534 00:26:29,930 --> 00:26:34,330 It will instead interpolate whatever is inside of those curly braces. 
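A small sketch of that interpolation, again with a hard-coded name as a stand-in for user input, shows what the f prefix changes.

```python
answer = "David"  # stand-in for whatever the user typed

# With the f prefix, {answer} is replaced by the variable's value.
formatted = f"hello, {answer}"
print(formatted)

# Without the f prefix, the braces are just ordinary characters.
literal = "hello, {answer}"
print(literal)
```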
535 00:26:34,330 --> 00:26:36,400 That is to say if answer is a variable that 536 00:26:36,400 --> 00:26:39,190 has some value, like "David" or something like that, 537 00:26:39,190 --> 00:26:42,400 saying f before the first quotation mark, 538 00:26:42,400 --> 00:26:44,530 and then using these curly braces therein, 539 00:26:44,530 --> 00:26:48,430 is going to do the exact same thing of creating a string that says "Hello," 540 00:26:48,430 --> 00:26:50,440 comma, space, "David." 541 00:26:50,440 --> 00:26:52,390 So it's going to plug in the value for you. 542 00:26:52,390 --> 00:26:56,350 So you can think of this as %s but without that second step of having 543 00:26:56,350 --> 00:26:59,635 to keep track of what you want to plug back in for %s. 544 00:26:59,635 --> 00:27:02,330 Instead of %s, you literally put in curly braces, 545 00:27:02,330 --> 00:27:04,060 what do you want to put right there? 546 00:27:04,060 --> 00:27:06,610 You format the string yourself. 547 00:27:06,610 --> 00:27:10,570 So given all of those ways, how might we actually 548 00:27:10,570 --> 00:27:14,410 go about implementing this or using this ourselves? 549 00:27:14,410 --> 00:27:18,040 Well, let me propose that we do this here. 550 00:27:18,040 --> 00:27:21,070 Let me propose that I go back to VS Code. 551 00:27:21,070 --> 00:27:24,640 Let me go ahead and open up hello.py again. 552 00:27:24,640 --> 00:27:28,730 And as before, instead of just printing out something like, 553 00:27:28,730 --> 00:27:31,330 quote unquote, "hello, world," let me actually print out 554 00:27:31,330 --> 00:27:33,260 something a little more interesting. 555 00:27:33,260 --> 00:27:36,850 So let me go ahead and, from the CS50 library, 556 00:27:36,850 --> 00:27:39,310 import the function called get_string(). 557 00:27:39,310 --> 00:27:42,010 Then let me go ahead and create a variable called answer. 
558 00:27:42,010 --> 00:27:45,700 Let me set that equal to the return value of get_string() with, 559 00:27:45,700 --> 00:27:50,590 as an argument, quote unquote, "What's your name?" 560 00:27:50,590 --> 00:27:54,610 And then no semicolon at the end of that line, but on the next line, frankly 561 00:27:54,610 --> 00:27:57,650 here, I can pick any one of those potential solutions. 562 00:27:57,650 --> 00:27:59,090 So let me start with the first. 563 00:27:59,090 --> 00:28:00,415 So "hello, " + answer. 564 00:28:00,415 --> 00:28:03,640 565 00:28:03,640 --> 00:28:07,840 And now, if I go down to my terminal window and run python of hello.py, 566 00:28:07,840 --> 00:28:08,950 I'm prompted for my name. 567 00:28:08,950 --> 00:28:12,280 I can type in D-A-V-I-D, and voila, that there then works. 568 00:28:12,280 --> 00:28:13,870 Or I can tweak this a little bit. 569 00:28:13,870 --> 00:28:19,120 I can trust that Python will concatenate its first and second argument for me. 570 00:28:19,120 --> 00:28:21,010 But this isn't quite right. 571 00:28:21,010 --> 00:28:24,820 Let me go ahead and rerun python of hello.py, hit Enter, and type 572 00:28:24,820 --> 00:28:25,420 in "David." 573 00:28:25,420 --> 00:28:28,840 It's going to be ever-so-slightly buggy, sort of grammatically 574 00:28:28,840 --> 00:28:29,980 or visually, if you will. 575 00:28:29,980 --> 00:28:31,880 What did I do wrong here? 576 00:28:31,880 --> 00:28:32,380 Yeah. 577 00:28:32,380 --> 00:28:35,900 So I left the space in there, even though I'm getting one for free from 578 00:28:35,900 --> 00:28:36,400 print(). 579 00:28:36,400 --> 00:28:38,260 So that's an easy solution here. 580 00:28:38,260 --> 00:28:41,560 But let's do it one other way after running this to be sure-- 581 00:28:41,560 --> 00:28:42,250 D-A-V-I-D. 582 00:28:42,250 --> 00:28:44,620 And OK, now it looks like I intended. 583 00:28:44,620 --> 00:28:47,780 Well, let's go ahead and use that placeholder syntax. 
584 00:28:47,780 --> 00:28:52,240 So let's just pass in one bigger string as our argument, do "hello," 585 00:28:52,240 --> 00:28:55,930 and then, in curly braces, answer, like this. 586 00:28:55,930 --> 00:28:58,390 Well, let me go down to my terminal window and clear it. 587 00:28:58,390 --> 00:29:03,700 Let me run python of hello.py and enter, type in D-A-V-I-D, and voila. 588 00:29:03,700 --> 00:29:05,650 OK, I made a mistake. 589 00:29:05,650 --> 00:29:08,150 What did I do wrong here, minor though it seems to be? 590 00:29:08,150 --> 00:29:08,650 Yeah? 591 00:29:08,650 --> 00:29:09,525 AUDIENCE: [INAUDIBLE] 592 00:29:09,525 --> 00:29:11,410 DAVID MALAN: So the stupid little f that you 593 00:29:11,410 --> 00:29:13,690 have to put before the string to tell Python 594 00:29:13,690 --> 00:29:16,570 that this is a special string-- it's a format string, or f 595 00:29:16,570 --> 00:29:20,450 string-- that it should additionally format for you. 596 00:29:20,450 --> 00:29:24,272 So if I rerun this after adding that f, I can do python of hello.py. 597 00:29:24,272 --> 00:29:24,980 What's your name? 598 00:29:24,980 --> 00:29:25,480 David. 599 00:29:25,480 --> 00:29:28,690 And now it looks the way I might intend. 600 00:29:28,690 --> 00:29:33,160 But it turns out in Python, you don't actually need to use get_string(). 601 00:29:33,160 --> 00:29:35,770 In C, recall that we introduced that because it's actually 602 00:29:35,770 --> 00:29:40,840 pretty annoying in C to get strings, in particular to get strings safely. 603 00:29:40,840 --> 00:29:44,950 Recall those short examples we did with scanf not too long ago. 604 00:29:44,950 --> 00:29:47,530 And scanf kind of scans what the user types at the keyboard 605 00:29:47,530 --> 00:29:49,150 and loads it into memory. 606 00:29:49,150 --> 00:29:53,890 But the fundamental danger with scanf when it comes to strings was what? 607 00:29:53,890 --> 00:30:00,580 Why was it dangerous to use scanf to get strings from a user?
608 00:30:00,580 --> 00:30:01,080 Why? 609 00:30:01,080 --> 00:30:02,058 Yeah? 610 00:30:02,058 --> 00:30:04,690 AUDIENCE: What if they give you a really long string you don't have space for? 611 00:30:04,690 --> 00:30:05,260 DAVID MALAN: Exactly. 612 00:30:05,260 --> 00:30:07,010 What if they give you a really long string 613 00:30:07,010 --> 00:30:08,598 that you didn't allocate space for? 614 00:30:08,598 --> 00:30:11,140 Because you're not going to know as the programmer in advance 615 00:30:11,140 --> 00:30:13,750 how long of a string the human is going to type in. 616 00:30:13,750 --> 00:30:15,130 So you might under-- 617 00:30:15,130 --> 00:30:19,895 you might undercut it and therefore have too much memory, or too many 618 00:30:19,895 --> 00:30:22,270 characters being put into that memory, thereby giving you 619 00:30:22,270 --> 00:30:26,260 some kind of buffer overflow, which might crash the computer or, minimally, 620 00:30:26,260 --> 00:30:27,130 your program. 621 00:30:27,130 --> 00:30:31,570 So it turns out in C, get_string() was especially useful. 622 00:30:31,570 --> 00:30:33,700 In Python, it's not really that useful. 623 00:30:33,700 --> 00:30:39,640 All it does is use a function that does come with Python called input(). 624 00:30:39,640 --> 00:30:43,370 And, in fact, the input() function in Python, for all intents and purposes, 625 00:30:43,370 --> 00:30:47,270 is the same as the get_string() function that we give to you. 626 00:30:47,270 --> 00:30:50,140 But just to ease the transition from C to Python, 627 00:30:50,140 --> 00:30:52,780 we implemented a Python version of get_string() nonetheless. 628 00:30:52,780 --> 00:30:55,120 But this is to say if I go to VS Code here, 629 00:30:55,120 --> 00:30:59,030 and I just change get_string() to input(), and, in fact, 630 00:30:59,030 --> 00:31:03,680 I even get rid of the CS50 library at the top, this too should work fine. 
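Here is one way that swap to input() might look. The greet() function, its read parameter, and the canned_reply() helper are all hypothetical names introduced for this sketch; the canned reply simply stands in for a person at the keyboard so the example can run unattended.

```python
def greet(read=input):
    # read is any function that takes a prompt and returns a str.
    # By default it is Python's built-in input(), no CS50 library needed.
    answer = read("What's your name? ")
    return f"hello, {answer}"


def canned_reply(prompt):
    # Hypothetical stand-in for a user typing "David" and pressing Enter.
    return "David"


print(greet(canned_reply))
```

Calling greet() with no argument would instead prompt at the terminal and wait for real input.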
631 00:31:03,680 --> 00:31:07,940 If I rerun python of hello.py, type in my name, David, and voila, 632 00:31:07,940 --> 00:31:10,710 I have that now working as well. 633 00:31:10,710 --> 00:31:11,210 All right. 634 00:31:11,210 --> 00:31:19,020 Questions about this use of get_string() or input() or any of our syntax thus 635 00:31:19,020 --> 00:31:19,520 far? 636 00:31:19,520 --> 00:31:21,990 637 00:31:21,990 --> 00:31:22,490 All right. 638 00:31:22,490 --> 00:31:24,470 Well, what about variables? 639 00:31:24,470 --> 00:31:26,480 We've used variables already, and we already 640 00:31:26,480 --> 00:31:29,960 identified the fact that you don't have to specify the type of your variables 641 00:31:29,960 --> 00:31:33,710 proactively, even though, clearly, Python supports strings thus far. 642 00:31:33,710 --> 00:31:37,010 Well, in Python, here's how you might declare 643 00:31:37,010 --> 00:31:40,130 a variable that not necessarily is assigned like a string, 644 00:31:40,130 --> 00:31:41,440 but maybe an integer instead. 645 00:31:41,440 --> 00:31:43,190 So in Scratch, here's how you could create 646 00:31:43,190 --> 00:31:47,030 a variable called counter if you want to count things and set it equal to 0. 647 00:31:47,030 --> 00:31:50,180 In C, what we would have done is this-- 648 00:31:50,180 --> 00:31:55,040 int counter = 0; that's the exact same thing as in Scratch. 649 00:31:55,040 --> 00:31:59,570 But in Python, as you might imagine, we can chip away at this and type 650 00:31:59,570 --> 00:32:01,760 out this same idea a little more easily. 651 00:32:01,760 --> 00:32:03,920 One, we don't need to say int anymore. 652 00:32:03,920 --> 00:32:06,240 Two, we don't need the semicolon anymore. 653 00:32:06,240 --> 00:32:07,850 And so you just do what you intend. 654 00:32:07,850 --> 00:32:09,770 If you want a variable, just write it out. 655 00:32:09,770 --> 00:32:12,352 If you want to assign it a value, you use the equals sign.
656 00:32:12,352 --> 00:32:14,810 If you want to specify that value, you put it on the right. 657 00:32:14,810 --> 00:32:17,570 And just as in C, this is not the equality operator. 658 00:32:17,570 --> 00:32:20,420 It's the assignment operator from right to left. 659 00:32:20,420 --> 00:32:25,460 Recall that in Scratch, if you wanted to increment a variable by 1 or any value, 660 00:32:25,460 --> 00:32:27,260 you could use this puzzle piece here. 661 00:32:27,260 --> 00:32:32,310 Well, in C, you could do syntax like this, which, again, is not equality. 662 00:32:32,310 --> 00:32:38,600 It's saying add 1 to counter and then assign it back to the counter variable. 663 00:32:38,600 --> 00:32:42,750 In Python, you can do exactly the same thing minus the semicolon. 664 00:32:42,750 --> 00:32:44,690 So you don't need to use the semicolon here. 665 00:32:44,690 --> 00:32:48,380 But you might recall that in C, there was some syntactic sugar for this idea 666 00:32:48,380 --> 00:32:49,640 because it was pretty popular. 667 00:32:49,640 --> 00:32:53,810 And so you could shorten this in C, as you can in Python, 668 00:32:53,810 --> 00:32:59,270 to actually just this. += 1 will add to the counter variable whatever that 669 00:32:59,270 --> 00:33:00,560 value is. 670 00:33:00,560 --> 00:33:03,740 But it's not all steps forward. 671 00:33:03,740 --> 00:33:07,250 You might be in the habit of using ++ or --. 672 00:33:07,250 --> 00:33:09,870 Sorry, those are not available in Python. 673 00:33:09,870 --> 00:33:10,370 Why? 674 00:33:10,370 --> 00:33:12,590 It's because the designers of Python decided that you 675 00:33:12,590 --> 00:33:14,360 don't need them because this is-- 676 00:33:14,360 --> 00:33:16,230 gets the job done anyway. 677 00:33:16,230 --> 00:33:19,290 But there's a question down here in front, unless it was about the same. 678 00:33:19,290 --> 00:33:19,790 All right. 679 00:33:19,790 --> 00:33:22,070 So that's one feature we're taking away. 
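The counter walkthrough can be written out as a short sketch, assuming nothing beyond core Python.

```python
counter = 0            # no type, no semicolon

counter = counter + 1  # the long way: add 1, assign the result back
counter += 1           # the shorthand for the same idea

print(counter)

# Writing counter++ here would be a syntax error in Python.
```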
680 00:33:22,070 --> 00:33:25,130 But it's not such a big deal to do += in this case. 681 00:33:25,130 --> 00:33:28,520 Well, what about the actual types involved here 682 00:33:28,520 --> 00:33:31,970 beyond actually being able to define variables? 683 00:33:31,970 --> 00:33:36,620 Well, recall that in the world of C, we had at least these data 684 00:33:36,620 --> 00:33:39,020 types, those that came with the language in particular. 685 00:33:39,020 --> 00:33:41,570 And we played with quite a few of these over time. 686 00:33:41,570 --> 00:33:45,110 In Python, we're going to take a bunch of those away. 687 00:33:45,110 --> 00:33:51,650 In Python, you're only going to have access to a bool, true or false, 688 00:33:51,650 --> 00:33:54,110 a float, which is a real number with a decimal point, 689 00:33:54,110 --> 00:33:58,430 typically, an int, or an integer, and a string, now known as str. 690 00:33:58,430 --> 00:34:00,533 So Python here sort of cuts some corners, 691 00:34:00,533 --> 00:34:02,450 feels like it's too long to write out strings. 692 00:34:02,450 --> 00:34:08,750 So a string in Python is called str, S-T-R, but it's the exact same idea. 693 00:34:08,750 --> 00:34:11,870 Notice, though, that missing from this now, in particular, 694 00:34:11,870 --> 00:34:16,639 are double and long, which, recall, actually used more bits in order 695 00:34:16,639 --> 00:34:17,929 to store information. 696 00:34:17,929 --> 00:34:20,781 We'll see that that might not necessarily be a bad thing. 697 00:34:20,781 --> 00:34:22,489 In fact, Python just simplifies the world 698 00:34:22,489 --> 00:34:24,590 into two different types of variables but gets 699 00:34:24,590 --> 00:34:26,460 out of the business of you having to decide, 700 00:34:26,460 --> 00:34:30,923 do you want a small int or a large int or something along those lines? 701 00:34:30,923 --> 00:34:32,340 Well, let me go ahead and do this. 702 00:34:32,340 --> 00:34:34,670 Let me switch back over to VS Code here. 
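To see those four type names for yourself, a sketch like the following asks Python what type each value has. The particular values are arbitrary examples chosen for illustration.

```python
values = [True, 3.14, 42, "hello"]

# type(value).__name__ is the short name Python uses for each type.
for value in values:
    print(value, type(value).__name__)

type_names = [type(value).__name__ for value in values]
```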
703 00:34:34,670 --> 00:34:37,940 And why don't we actually try to play around with some calculations using 704 00:34:37,940 --> 00:34:39,590 these data types and more? 705 00:34:39,590 --> 00:34:42,500 Let me go ahead and propose that we implement, 706 00:34:42,500 --> 00:34:48,690 like we did way back in week 1, a simple calculator. 707 00:34:48,690 --> 00:34:51,889 So let me do this-- code of calculator.c. 708 00:34:51,889 --> 00:34:55,670 So I'm indeed going to do this in C first, just so 709 00:34:55,670 --> 00:34:58,470 that we have a similar example at hand. 710 00:34:58,470 --> 00:35:02,940 So I'm going to include standard io.h here at the top. 711 00:35:02,940 --> 00:35:06,370 I'm going to go ahead and do int main(void). 712 00:35:06,370 --> 00:35:10,270 Inside of main(), I'm going to go ahead and declare a variable called x and set 713 00:35:10,270 --> 00:35:15,190 that equal to get_int(), and I'm going to prompt the user for that value x. 714 00:35:15,190 --> 00:35:18,550 But if I'm using get_int(), recall that actually is from the CS50 library. 715 00:35:18,550 --> 00:35:22,000 So in C, I'm going to need cs50.h, still, for this example. 716 00:35:22,000 --> 00:35:24,290 But back in week 1, I then did something else. 717 00:35:24,290 --> 00:35:28,240 I then said, give me another variable called y, set that equal to get_int(), 718 00:35:28,240 --> 00:35:31,960 and set that equal to that-- pass in that prompt there. 719 00:35:31,960 --> 00:35:34,210 And then, lastly, let's just do something super simple 720 00:35:34,210 --> 00:35:35,900 like add two numbers together. 721 00:35:35,900 --> 00:35:37,480 So in C, I'll use printf(). 722 00:35:37,480 --> 00:35:41,860 I'm going to go ahead and do %i backslash n as a placeholder. 723 00:35:41,860 --> 00:35:44,380 And then I'm just going to plug in x + y. 724 00:35:44,380 --> 00:35:48,460 So all of that was in C. 
So it was a decent number of lines of code 725 00:35:48,460 --> 00:35:50,650 to accomplish that task, only three of which 726 00:35:50,650 --> 00:35:53,200 are really the logical part of my program. 727 00:35:53,200 --> 00:35:55,640 These are the three that we're really interested in. 728 00:35:55,640 --> 00:35:59,980 So let me instead now do this, code of calculator.py, 729 00:35:59,980 --> 00:36:01,510 which is going to give me a new tab. 730 00:36:01,510 --> 00:36:05,540 Let me just drag it over to the right so I can view these side by side. 731 00:36:05,540 --> 00:36:07,790 And in calculator.py, let's do this. 732 00:36:07,790 --> 00:36:11,910 From the CS50 library, import the get_int() function, 733 00:36:11,910 --> 00:36:13,430 which is also available. 734 00:36:13,430 --> 00:36:16,670 Then let's go ahead and create a variable called x and set it equal 735 00:36:16,670 --> 00:36:20,180 to the return value of get_int(), passing in the same prompt-- 736 00:36:20,180 --> 00:36:22,220 no semicolon, no mention of int. 737 00:36:22,220 --> 00:36:25,040 Let's then create a second variable y, set it equal to get_int(), 738 00:36:25,040 --> 00:36:29,970 prompt the user for y, as before, no int, explicitly, no semicolon. 739 00:36:29,970 --> 00:36:33,620 And now let's just go ahead and print out x + y. 740 00:36:33,620 --> 00:36:37,070 So it turns out that the print() function in Python is further flexible, 741 00:36:37,070 --> 00:36:39,050 that you don't need these format strings. 742 00:36:39,050 --> 00:36:42,470 If you want to print out an integer, just pass it an integer, 743 00:36:42,470 --> 00:36:45,750 even if that integer is the sum of two other integers. 744 00:36:45,750 --> 00:36:48,180 So it just sort of works as you might expect. 745 00:36:48,180 --> 00:36:50,060 So let me go down into my terminal here. 746 00:36:50,060 --> 00:36:52,580 Let me run python of calculator.py. 747 00:36:52,580 --> 00:36:54,830 And when I hit Enter, I'm prompted for x. 
748 00:36:54,830 --> 00:36:55,730 Let's do 1. 749 00:36:55,730 --> 00:36:56,630 I'm prompted for y. 750 00:36:56,630 --> 00:36:57,530 Let's do 2. 751 00:36:57,530 --> 00:37:02,070 And voila, I should see 3 as the result-- 752 00:37:02,070 --> 00:37:04,800 so no actual surprises there. 753 00:37:04,800 --> 00:37:07,948 But let me go ahead and, you know what? 754 00:37:07,948 --> 00:37:09,740 Let's take away this training wheel, right? 755 00:37:09,740 --> 00:37:12,210 We don't want to keep introducing CS50-specific things. 756 00:37:12,210 --> 00:37:14,960 So suppose we didn't give you get_int(). 757 00:37:14,960 --> 00:37:18,505 Well, it turns out that get_int() is still doing a bit of help for you, 758 00:37:18,505 --> 00:37:21,380 even though get_string() was kind of a throwaway and we could replace 759 00:37:21,380 --> 00:37:22,590 get_string() with input(). 760 00:37:22,590 --> 00:37:24,530 So let's try this same idea. 761 00:37:24,530 --> 00:37:28,880 Let's go ahead and prompt the user for input for both x and y using 762 00:37:28,880 --> 00:37:33,410 the input() function in Python instead of get_int() from CS50. 763 00:37:33,410 --> 00:37:37,280 Let me go ahead and rerun Python of calculator.py and hit Enter. 764 00:37:37,280 --> 00:37:38,330 So far, so good. 765 00:37:38,330 --> 00:37:39,470 Let me type in 1. 766 00:37:39,470 --> 00:37:40,490 Let me type in 2. 767 00:37:40,490 --> 00:37:43,230 And what answer should we see? 768 00:37:43,230 --> 00:37:46,260 Hopefully still 3, but nope. 769 00:37:46,260 --> 00:37:51,570 Now the answer is 12, or is it? 770 00:37:51,570 --> 00:37:54,233 Why am I seeing 12 and not 3? 771 00:37:54,233 --> 00:37:55,650 AUDIENCE: [INAUDIBLE] two strings. 772 00:37:55,650 --> 00:37:56,400 DAVID MALAN: Yeah. 773 00:37:56,400 --> 00:37:59,230 So it's actually concatenating what seem to be two strings. 
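The surprise can be reproduced without a keyboard by starting from the strings that input() would have handed back. Hard-coding "1" and "2" is an assumption standing in for what the user typed.

```python
x = "1"  # what input() would return if the user typed 1
y = "2"  # what input() would return if the user typed 2

result = x + y  # plus between two strs concatenates them
print(result)
print(type(result).__name__)
```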
774 00:37:59,230 --> 00:38:01,980 So if we actually read the documentation for the input() function, 775 00:38:01,980 --> 00:38:04,530 it's behaving exactly as it's supposed to. 776 00:38:04,530 --> 00:38:06,810 It is getting input from the user from their keyboard. 777 00:38:06,810 --> 00:38:09,840 But anything you type at the keyboard is effectively a string. 778 00:38:09,840 --> 00:38:12,390 Even if some of the symbols happen to look like or actually 779 00:38:12,390 --> 00:38:15,780 be decimal numbers, they're still going to come to you as strings. 780 00:38:15,780 --> 00:38:20,130 And so x is a string, a.k.a., str, y is a str, 781 00:38:20,130 --> 00:38:24,660 and we've already seen that if you use plus in between two strings, or strs, 782 00:38:24,660 --> 00:38:27,390 you're going to get concatenation, not addition. 783 00:38:27,390 --> 00:38:32,590 So you're not seeing 12 as much as you're seeing 1 2, not 12. 784 00:38:32,590 --> 00:38:33,820 So how can we fix this? 785 00:38:33,820 --> 00:38:38,040 Well, in C, we had this technique where we could cast one thing to another 786 00:38:38,040 --> 00:38:40,710 by just putting int in parentheses, for instance. 787 00:38:40,710 --> 00:38:42,880 In Python, things are a little higher-level such 788 00:38:42,880 --> 00:38:46,060 that you can't quite get away with just casting 789 00:38:46,060 --> 00:38:51,280 one thing to another because a string, recall, is not 790 00:38:51,280 --> 00:38:53,080 the same thing as a char. 791 00:38:53,080 --> 00:38:55,450 A string has zero or more characters. 792 00:38:55,450 --> 00:38:56,890 A char always has one. 793 00:38:56,890 --> 00:39:00,490 And in C, there was a perfect mapping between single characters 794 00:39:00,490 --> 00:39:05,230 and single numbers in decimal, like 65 for capital A. 
795 00:39:05,230 --> 00:39:09,430 But in Python, we can do something somewhat similar and not so much cast 796 00:39:09,430 --> 00:39:15,510 but convert this input() to an int and convert this input() to an int. 797 00:39:15,510 --> 00:39:17,260 So just like in C, you can nest functions. 798 00:39:17,260 --> 00:39:19,750 You can call one function and pass its output 799 00:39:19,750 --> 00:39:21,700 as the input to another function. 800 00:39:21,700 --> 00:39:25,390 And this now will convert x and y to integers. 801 00:39:25,390 --> 00:39:28,780 And so now plus is going to behave as you should-- as you would expect. 802 00:39:28,780 --> 00:39:33,010 Let me rerun python of calculator.py, type in 1, type in 2, and now 803 00:39:33,010 --> 00:39:37,150 we're back to seeing 3 as the result. If this is a little unclear, 804 00:39:37,150 --> 00:39:39,190 this nesting, let me do this one other way. 805 00:39:39,190 --> 00:39:43,360 Instead of just passing input() output into int, 806 00:39:43,360 --> 00:39:45,760 I could also more pedantically do this. 807 00:39:45,760 --> 00:39:52,000 x should actually equal int(x), y should actually equal int(y). 808 00:39:52,000 --> 00:39:54,200 This would be the exact same effect. 809 00:39:54,200 --> 00:39:57,340 It's just two extra lines where it's not really necessary. 810 00:39:57,340 --> 00:39:58,540 But that would work fine. 811 00:39:58,540 --> 00:40:01,270 If you don't like that approach, we could even do it inline. 812 00:40:01,270 --> 00:40:05,420 We could actually convert x to an int and y to an int. 813 00:40:05,420 --> 00:40:05,920 Why? 814 00:40:05,920 --> 00:40:10,180 Well, int, I-N-T, in the context of Python itself, is a function. 815 00:40:10,180 --> 00:40:13,570 And it takes as input here a string, or str, 816 00:40:13,570 --> 00:40:19,550 and returns to you the numeric, the integral equivalent-- so similar idea, 817 00:40:19,550 --> 00:40:20,870 but it's actually a function. 
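The two conversion styles just described can be sketched as follows. The names raw_x and raw_y are hypothetical stand-ins for the strs that input() would return; in calculator.py itself the nested form would read x = int(input("x: ")).

```python
# Stand-ins for what input() would return if the user typed 1 and 2.
raw_x = "1"
raw_y = "2"

# Convert up front: int() takes a str and returns the equivalent integer.
x = int(raw_x)
y = int(raw_y)
print(x + y)  # prints 3

# Or convert inline, at the point of use.
print(int(raw_x) + int(raw_y))  # prints 3
```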
818 00:40:20,870 --> 00:40:23,320 So all of the syntax that I've been tinkering with here 819 00:40:23,320 --> 00:40:27,800 is sort of fundamentally the same as it would be in C. But in this case, 820 00:40:27,800 --> 00:40:31,120 we're not casting but converting more specifically. 821 00:40:31,120 --> 00:40:33,830 Well, let me go back to these data types. 822 00:40:33,830 --> 00:40:37,423 These are some of the data types that are available to us in Python. 823 00:40:37,423 --> 00:40:39,340 It turns out there's a bunch of others as well 824 00:40:39,340 --> 00:40:40,923 that we'll start to dabble with today. 825 00:40:40,923 --> 00:40:43,490 You can get a range of values, a list of values, 826 00:40:43,490 --> 00:40:46,960 which is going to be like an array, but better, tuples, which 827 00:40:46,960 --> 00:40:50,920 are kind of like x, comma, y, often, combinations of values that 828 00:40:50,920 --> 00:40:51,940 don't change. 829 00:40:51,940 --> 00:40:57,580 dict for dictionary-- it turns out that in Python, you get dictionaries. 830 00:40:57,580 --> 00:40:58,917 You get hash tables for free. 831 00:40:58,917 --> 00:41:00,250 They're built into the language. 832 00:41:00,250 --> 00:41:02,125 And we already saw that Python also gives you 833 00:41:02,125 --> 00:41:04,060 a data type known as a set, which is just 834 00:41:04,060 --> 00:41:06,400 a collection of values that gives you-- 835 00:41:06,400 --> 00:41:08,090 gets rid of any duplicates for you. 836 00:41:08,090 --> 00:41:11,830 And as we saw briefly in speller-- and we'll play more with these ideas soon-- 837 00:41:11,830 --> 00:41:15,340 it's going to actually be pretty darn easy to get values or check 838 00:41:15,340 --> 00:41:17,960 for values in those there data types. 839 00:41:17,960 --> 00:41:22,510 So that in C, we were able to get input easily, we had all of these functions. 840 00:41:22,510 --> 00:41:26,253 In the CS50 library for Python, we're only going to give you these instead. 
841 00:41:26,253 --> 00:41:27,670 They're going to be the same name. 842 00:41:27,670 --> 00:41:30,670 So it's still get_string(), not get_str, because we wanted the functions 843 00:41:30,670 --> 00:41:31,840 to remain named the same. 844 00:41:31,840 --> 00:41:34,690 But get_float(), get_int(), get_string() all exist. 845 00:41:34,690 --> 00:41:37,280 But, again, get_string() is not all that useful. 846 00:41:37,280 --> 00:41:41,980 But get_int() and get_float() actually are. 847 00:41:41,980 --> 00:41:42,580 Why? 848 00:41:42,580 --> 00:41:44,830 Well, let me go back to VS Code here. 849 00:41:44,830 --> 00:41:47,920 And let me go back to the second version of this program, 850 00:41:47,920 --> 00:41:54,980 whereby I proactively converted each of these return values to integers. 851 00:41:54,980 --> 00:41:59,020 So recall that this is the solution to the 1 2 problem. 852 00:41:59,020 --> 00:42:03,550 And to be clear, if I run python of calculator.py and input 1 and 2, 853 00:42:03,550 --> 00:42:06,100 I get back now 3 as expected. 854 00:42:06,100 --> 00:42:09,850 But what I'm not showing you is that there's still potentially a bug here. 855 00:42:09,850 --> 00:42:13,480 Let me run python of calculator.py, and let me just not cooperate. 856 00:42:13,480 --> 00:42:15,400 Instead of typing what looks like a number, 857 00:42:15,400 --> 00:42:19,180 let me actually type something that's clearly a string, like cat. 858 00:42:19,180 --> 00:42:22,570 And unfortunately, we're going to see the first of our errors, 859 00:42:22,570 --> 00:42:23,938 the first of our runtime errors. 860 00:42:23,938 --> 00:42:26,230 And this, like in C, is going to look cryptic at first. 861 00:42:26,230 --> 00:42:28,330 But this is generally known as a traceback, where 862 00:42:28,330 --> 00:42:31,523 it's going to trace back for you everything your program just did, 863 00:42:31,523 --> 00:42:33,190 even though this one's relatively short. 
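For reference, the failure from typing cat can be reproduced without a keyboard. The sketch below is only an illustration under stated assumptions: to_int is a hypothetical helper, not part of Python or of CS50's library, and it is not a claim about how get_int() is implemented.

```python
def to_int(text):
    """Hypothetical helper: return int(text), or None if text is not an integer."""
    try:
        return int(text)
    except ValueError:
        # int("cat") raises ValueError; report failure instead of crashing.
        return None


print(to_int("1"))    # prints 1
print(to_int("cat"))  # prints None
```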
864 00:42:33,190 --> 00:42:36,915 And you'll see that calculator.py, line 1-- 865 00:42:36,915 --> 00:42:39,040 I didn't even get very far before there's an error. 866 00:42:39,040 --> 00:42:42,300 And then, with all of these caret symbols here, this is a problem. 867 00:42:42,300 --> 00:42:42,810 Why? 868 00:42:42,810 --> 00:42:47,370 invalid literal for int() function with base 10, quote unquote, 'cat.' 869 00:42:47,370 --> 00:42:49,147 Again, just like in C, it's very arcane. 870 00:42:49,147 --> 00:42:51,480 It's hard to understand this the first time you read it. 871 00:42:51,480 --> 00:42:55,920 But what it's trying to tell me is that cat is not an integer. 872 00:42:55,920 --> 00:42:59,850 And therefore, the int() function cannot convert it to an integer for you. 873 00:42:59,850 --> 00:43:01,920 We're going to leave this problem alone for now. 874 00:43:01,920 --> 00:43:04,950 But this is why, again, get_int()'s looking kind of good, 875 00:43:04,950 --> 00:43:07,980 and get_float()'s looking kind of good because those functions from 876 00:43:07,980 --> 00:43:12,970 CS50's library will deal with these kinds of problems for you. 877 00:43:12,970 --> 00:43:14,970 Now, just so you've seen it, there's another way 878 00:43:14,970 --> 00:43:17,370 to import functions from these things. 879 00:43:17,370 --> 00:43:20,430 If you were to use, for instance, in a program, get_float(), get_int(), 880 00:43:20,430 --> 00:43:24,180 and get_string(), you don't need to do three separate lines like this. 881 00:43:24,180 --> 00:43:27,420 You can actually separate them a little more cleanly with commas. 882 00:43:27,420 --> 00:43:33,240 And, in fact, if I go back to a version of this program here in VS Code whereby 883 00:43:33,240 --> 00:43:35,710 I actually do use the get_int() function-- 884 00:43:35,710 --> 00:43:39,810 so let me actually get rid of all this and use get_int() as before.
885 00:43:39,810 --> 00:43:43,000 Let me get rid of all this and use get_int() as before. 886 00:43:43,000 --> 00:43:48,640 Previously, the way I did this was by saying from cs50 import get_int() 887 00:43:48,640 --> 00:43:51,200 if you know in advance what function you want to use. 888 00:43:51,200 --> 00:43:54,400 But suppose, for whatever reason, you already have your own function named 889 00:43:54,400 --> 00:43:57,940 get_int(), and therefore, it would collide with CS50's own, 890 00:43:57,940 --> 00:44:01,400 you can avoid that issue, too, by just using that first statement we saw 891 00:44:01,400 --> 00:44:01,900 earlier. 892 00:44:01,900 --> 00:44:03,520 Just import the library itself. 893 00:44:03,520 --> 00:44:06,520 Don't specify explicitly which functions you're going to use. 894 00:44:06,520 --> 00:44:09,850 But thereafter-- and you could not do this in C-- 895 00:44:09,850 --> 00:44:14,950 you could specify cs50.get_int(), cs50.get_int(), 896 00:44:14,950 --> 00:44:19,450 in order to go into the library, access its get_int() function, and therefore, 897 00:44:19,450 --> 00:44:22,630 it doesn't matter if you or any number of other people wrote 898 00:44:22,630 --> 00:44:25,360 an identically-named function called get_int(). 899 00:44:25,360 --> 00:44:28,660 You're using here, clearly, CS50's own. 900 00:44:28,660 --> 00:44:34,180 So this is, again, just more ways to achieve the same solution 901 00:44:34,180 --> 00:44:36,410 but with different syntax. 902 00:44:36,410 --> 00:44:36,910 All right. 903 00:44:36,910 --> 00:44:44,030 Any questions about any of this syntax or features thus far? 904 00:44:44,030 --> 00:44:44,530 No? 905 00:44:44,530 --> 00:44:45,030 All right. 906 00:44:45,030 --> 00:44:48,010 Well, how about maybe another example here, 907 00:44:48,010 --> 00:44:52,450 whereby we revisit conditionals, which was the way of implementing 908 00:44:52,450 --> 00:44:55,510 do this thing or this thing, sort of proverbial forks in the road. 
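Returning briefly to the import styles described above: here is a small sketch of the same idea, using Python's standard math module as a stand-in because the cs50 package may not be installed everywhere. The locally defined sqrt is a hypothetical function added only to show a name collision.

```python
# Several names from one module, separated by commas.
from math import ceil, floor
print(floor(2.5), ceil(2.5))  # prints 2 3

# Import the module itself, then reach into it with a dot.
import math


def sqrt(n):
    """Our own function that happens to share a name with the library's."""
    return "not the library's sqrt"


print(sqrt(16))       # prints not the library's sqrt
print(math.sqrt(16))  # prints 4.0
```

Because the library's function is reached as math.sqrt, it does not matter that this file also defines a sqrt of its own.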
909 00:44:55,510 --> 00:44:58,420 In Scratch, recall, we might use building blocks like these 910 00:44:58,420 --> 00:45:01,990 to just check, is x less than y, and if so, say so. 911 00:45:01,990 --> 00:45:04,400 In C, this code looked like this. 912 00:45:04,400 --> 00:45:07,210 And notice that we had parentheses around the x and the y. 913 00:45:07,210 --> 00:45:11,380 We had curly braces, even though I did disclaim that for single lines of code, 914 00:45:11,380 --> 00:45:13,160 you can actually omit the curly braces. 915 00:45:13,160 --> 00:45:15,790 But stylistically, we always include them in CS50's code. 916 00:45:15,790 --> 00:45:18,250 But you have the backslash n and the semicolon. 917 00:45:18,250 --> 00:45:21,770 In a moment, you're about to see the Python equivalent of this, 918 00:45:21,770 --> 00:45:23,270 which is almost the same. 919 00:45:23,270 --> 00:45:24,760 It's just a little nicer. 920 00:45:24,760 --> 00:45:28,160 This, then, is the Python equivalent thereof. 921 00:45:28,160 --> 00:45:32,840 So what's different at a glance here, just to be clear? 922 00:45:32,840 --> 00:45:33,620 What's different? 923 00:45:33,620 --> 00:45:34,475 Yeah? 924 00:45:34,475 --> 00:45:35,350 AUDIENCE: [INAUDIBLE] 925 00:45:35,350 --> 00:45:38,440 DAVID MALAN: So the conditional is not in parentheses. 926 00:45:38,440 --> 00:45:42,230 You can use parentheses, especially if, logically, you need to group things. 927 00:45:42,230 --> 00:45:45,160 But if you don't need them, don't use them is Python's mindset. 928 00:45:45,160 --> 00:45:46,780 What else has changed here? 929 00:45:46,780 --> 00:45:47,500 Yeah? 930 00:45:47,500 --> 00:45:48,700 AUDIENCE: No curly brackets. 931 00:45:48,700 --> 00:45:51,620 DAVID MALAN: No curly braces, yeah, so no curly braces around this. 932 00:45:51,620 --> 00:45:55,250 And even though it's one line of code, you just don't use curly braces at all. 933 00:45:55,250 --> 00:45:55,750 Why? 
934 00:45:55,750 --> 00:46:00,230 Because in Python, indentation is actually really, really important. 935 00:46:00,230 --> 00:46:02,410 And we know from office hours and problem sets 936 00:46:02,410 --> 00:46:05,200 occasionally that if you forgot to run style50 937 00:46:05,200 --> 00:46:07,870 or you didn't manually format your code beautifully, 938 00:46:07,870 --> 00:46:10,960 C is not actually going to care if everything is aligned on the left. 939 00:46:10,960 --> 00:46:14,830 If you never once hit the Tab character or the space bar, 940 00:46:14,830 --> 00:46:17,770 C, or specifically, clang, isn't really going to care. 941 00:46:17,770 --> 00:46:19,722 But your teaching fellow, your TA, is going 942 00:46:19,722 --> 00:46:22,930 to care, or your colleague in the real world, because your code's just a mess 943 00:46:22,930 --> 00:46:23,920 and hard to read. 944 00:46:23,920 --> 00:46:29,380 Python, though-- because you are not the only ones in the world that might have 945 00:46:29,380 --> 00:46:31,450 bad habits when it comes to style-- 946 00:46:31,450 --> 00:46:34,490 Python as a language decided, that's it. 947 00:46:34,490 --> 00:46:37,890 Everyone has to indent in order for their code to even work. 948 00:46:37,890 --> 00:46:40,470 So the convention in Python is to use four spaces-- 949 00:46:40,470 --> 00:46:43,850 so 1, 2, 3, 4, or hit Tab and let it automatically convert to the same, 950 00:46:43,850 --> 00:46:47,960 and use a colon instead of the curly braces, 951 00:46:47,960 --> 00:46:50,480 for instance, to make clear what is associated 952 00:46:50,480 --> 00:46:53,000 with this particular conditional. 953 00:46:53,000 --> 00:46:55,310 We can omit, though, the backslash n per before. 954 00:46:55,310 --> 00:46:56,630 We can omit the semicolon. 955 00:46:56,630 --> 00:46:59,750 But this is essentially the Python version thereof.
956 00:46:59,750 --> 00:47:03,560 Here in C-- in Scratch, if you wanted to do an if-else, 957 00:47:03,560 --> 00:47:08,000 like we did back in week 0, in C, it's very similar to the if, except you add 958 00:47:08,000 --> 00:47:11,810 the else clause and write out an additional printf() like this. 959 00:47:11,810 --> 00:47:13,760 In Python, we can tighten this up. 960 00:47:13,760 --> 00:47:16,730 if x less than y, colon, that's exactly the same. 961 00:47:16,730 --> 00:47:17,810 First line's the same. 962 00:47:17,810 --> 00:47:21,680 All we're doing now is adding an else and the second print line here. 963 00:47:21,680 --> 00:47:22,640 How about in Scratch? 964 00:47:22,640 --> 00:47:26,060 If we had a three-way fork in the road-- if, else, if, else. 965 00:47:26,060 --> 00:47:29,750 In C, it looked pretty much like that-- if, else, if, else. 966 00:47:29,750 --> 00:47:31,820 In Python, we can tighten this up. 967 00:47:31,820 --> 00:47:34,170 And this is not a typo. 968 00:47:34,170 --> 00:47:38,300 What jumps out at you as weird but you got to just get used to it? 969 00:47:38,300 --> 00:47:39,020 Yeah? 970 00:47:39,020 --> 00:47:39,820 AUDIENCE: elif. 971 00:47:39,820 --> 00:47:40,570 DAVID MALAN: elif. 972 00:47:40,570 --> 00:47:44,750 And honestly, years later, I still can't remember if it's elif or elsif 973 00:47:44,750 --> 00:47:49,332 because other languages actually do E-L-S-I-F. 974 00:47:49,332 --> 00:47:52,040 And now I've probably biased all of you into questioning this. 975 00:47:52,040 --> 00:47:53,390 But it's elif in Python. 976 00:47:53,390 --> 00:47:55,200 E-L-I-F is not a typo. 977 00:47:55,200 --> 00:47:58,520 It's in the spirit of let's just save ourselves some keystrokes. 978 00:47:58,520 --> 00:48:03,950 So elif is identical to elsif, but it's a little tighter to type it this way. 979 00:48:03,950 --> 00:48:04,670 All right.
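The three-way fork just described can be sketched as a small function. The name compare is a hypothetical wrapper chosen so the logic can be exercised without prompting a user; the branches themselves follow the if, elif, else shape from the lecture.

```python
def compare(x, y):
    """Return a sentence describing how x relates to y."""
    # No parentheses around the condition, a colon instead of curly braces,
    # and four spaces of indentation to mark what belongs to each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))  # prints x is less than y
print(compare(2, 1))  # prints x is greater than y
print(compare(1, 1))  # prints x is equal to y
```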
980 00:48:04,670 --> 00:48:09,080 So if we now have this ability to express conditionals, 981 00:48:09,080 --> 00:48:11,250 what can we actually do with them? 982 00:48:11,250 --> 00:48:13,530 Well, let me go over to VS Code here. 983 00:48:13,530 --> 00:48:18,320 And let me propose that we revisit maybe another program from before, 984 00:48:18,320 --> 00:48:21,600 where we just compare two integers in particular. 985 00:48:21,600 --> 00:48:22,580 So I'm in VS Code. 986 00:48:22,580 --> 00:48:25,670 Let me open up a file called, say, compare.py. 987 00:48:25,670 --> 00:48:28,460 And in compare.py, we'll use the CS50 library just 988 00:48:28,460 --> 00:48:32,220 so we don't risk any errors, like if the human doesn't type an integer. 989 00:48:32,220 --> 00:48:35,450 So we're going to go ahead and say from cs50 import get_int(). 990 00:48:35,450 --> 00:48:39,350 And in compare.py, let's get two variables-- x = get_int(), 991 00:48:39,350 --> 00:48:41,390 and prompt the user for x. 992 00:48:41,390 --> 00:48:42,830 So "What's x?" 993 00:48:42,830 --> 00:48:47,840 To be a bit more verbose, y = get_int("What's y? ") 994 00:48:47,840 --> 00:48:50,490 And then let's go ahead and just compare these two values. 995 00:48:50,490 --> 00:48:55,196 So if x is less than y, then go ahead and print out with print("x is less 996 00:48:55,196 --> 00:49:01,160 than y"), close quote, elif x is greater than y, 997 00:49:01,160 --> 00:49:07,220 go ahead and print out "x is greater than y," close quote, else, 998 00:49:07,220 --> 00:49:12,830 go ahead and print out "x is equal to y"-- so the exact same program, 999 00:49:12,830 --> 00:49:16,400 but I've added to the mix getting a value of x and y. 1000 00:49:16,400 --> 00:49:18,650 Let me run python of compare.py. 1001 00:49:18,650 --> 00:49:19,430 Enter. 1002 00:49:19,430 --> 00:49:23,060 Let's type in 1 for x, 2 for y. x is less than y. 1003 00:49:23,060 --> 00:49:24,500 Let's run it once more.
1004 00:49:24,500 --> 00:49:26,240 x is 2. y is 1. 1005 00:49:26,240 --> 00:49:27,440 x is greater than y. 1006 00:49:27,440 --> 00:49:30,230 And just for good measure, let's run it a third time. x is 1. 1007 00:49:30,230 --> 00:49:31,400 y is 1. 1008 00:49:31,400 --> 00:49:32,690 x is equal to y. 1009 00:49:32,690 --> 00:49:37,430 So the code, daresay, works exactly as you would expect, as you would hope. 1010 00:49:37,430 --> 00:49:40,430 But it turns out that in the world of Python, 1011 00:49:40,430 --> 00:49:44,210 we're actually going to get some other behavior that might actually 1012 00:49:44,210 --> 00:49:48,620 have been what you expected weeks ago, even though C did not behave this way. 1013 00:49:48,620 --> 00:49:52,610 In the world of Python and in the world of strings, a.k.a. 1014 00:49:52,610 --> 00:49:57,090 strs, strings actually behave more like you would expect. 1015 00:49:57,090 --> 00:49:58,070 So by that I mean this. 1016 00:49:58,070 --> 00:49:59,780 Let me actually go back to this code. 1017 00:49:59,780 --> 00:50:05,420 And instead of using integers, let me go ahead and get rid of-- 1018 00:50:05,420 --> 00:50:08,670 I could do get_string(), but we said that that's not really necessary. 1019 00:50:08,670 --> 00:50:11,030 So let's just go ahead and change this to input(). 1020 00:50:11,030 --> 00:50:11,720 And actually, you know what? 1021 00:50:11,720 --> 00:50:12,678 Let's just start fresh. 1022 00:50:12,678 --> 00:50:16,370 Let's give myself a string called s and use the input() function and ask 1023 00:50:16,370 --> 00:50:17,920 the user for s. 1024 00:50:17,920 --> 00:50:21,800 Let's use another variable called t just because it comes after s and use 1025 00:50:21,800 --> 00:50:23,870 the input() function to get t. 1026 00:50:23,870 --> 00:50:26,698 Then let's compare if s and t are the same. 1027 00:50:26,698 --> 00:50:28,490 Now, a couple of weeks ago, this backfired. 
1028 00:50:28,490 --> 00:50:32,670 And if I tried to compare two strings for equality, it did not work. 1029 00:50:32,670 --> 00:50:38,100 But if I do if s == t, print("Same"), else, 1030 00:50:38,100 --> 00:50:40,590 let's go ahead and print("Different"). 1031 00:50:40,590 --> 00:50:44,410 I daresay, in Python, I think this is going to work as you would expect. 1032 00:50:44,410 --> 00:50:48,480 So python of compare.py, let's type in cat and cat. 1033 00:50:48,480 --> 00:50:49,920 And indeed those are the same. 1034 00:50:49,920 --> 00:50:53,190 Let me run it again and type in cat and dog, respectively. 1035 00:50:53,190 --> 00:50:55,140 And those are now different. 1036 00:50:55,140 --> 00:51:00,240 But in C, we always got "Different," "Different," "Different," 1037 00:51:00,240 --> 00:51:04,500 even if I typed the exact same word, be it cat or dog or hi or anything else. 1038 00:51:04,500 --> 00:51:08,820 Why, in C, were s and t always different a couple of weeks ago? 1039 00:51:08,820 --> 00:51:09,904 Yeah? 1040 00:51:09,904 --> 00:51:12,946 AUDIENCE: Because it was comparing the value of the char* with the memory 1041 00:51:12,946 --> 00:51:13,515 address. 1042 00:51:13,515 --> 00:51:14,390 DAVID MALAN: Exactly. 1043 00:51:14,390 --> 00:51:18,570 In C, string is the same thing as char*, which is a memory address. 1044 00:51:18,570 --> 00:51:20,780 And because we had called get_string() twice, 1045 00:51:20,780 --> 00:51:24,290 even if the human typed the same things, that was two different chunks of memory 1046 00:51:24,290 --> 00:51:25,790 at two different addresses. 1047 00:51:25,790 --> 00:51:29,090 So those two char*s were just naturally always different, 1048 00:51:29,090 --> 00:51:31,880 even if the characters at those addresses were the same. 1049 00:51:31,880 --> 00:51:33,802 Python is meant to be higher-level. 1050 00:51:33,802 --> 00:51:35,510 It's meant to be a little more intuitive.
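A short sketch of the comparison above, assuming two strs that are built separately rather than typed in. The function name same_or_different is hypothetical.

```python
def same_or_different(s, t):
    """Compare two strs by their characters, which is what == does in Python."""
    if s == t:
        return "Same"
    else:
        return "Different"


# Two strs constructed separately but holding the same characters.
first = "cat"
second = "".join(["c", "a", "t"])

print(same_or_different(first, second))  # prints Same
print(same_or_different("cat", "dog"))   # prints Different
```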
1051 00:51:35,510 --> 00:51:38,720 It's meant to be more accessible to folks who might not necessarily 1052 00:51:38,720 --> 00:51:41,310 know or want to understand those lower-level details. 1053 00:51:41,310 --> 00:51:48,080 So in Python, ==, even for strings just works the way that you might expect. 1054 00:51:48,080 --> 00:51:50,450 But in Python, we can do some other things, 1055 00:51:50,450 --> 00:51:55,230 too, even more easily than we could in C. Let me go back to VS Code here. 1056 00:51:55,230 --> 00:51:56,660 Let me close compare.py. 1057 00:51:56,660 --> 00:51:59,720 And let's reimplement a program from C called agree, 1058 00:51:59,720 --> 00:52:02,980 which allowed us to prompt the user for a yes/no question, like, 1059 00:52:02,980 --> 00:52:05,730 do you agree to these terms and conditions or something like that. 1060 00:52:05,730 --> 00:52:08,660 So let's do code of agree.py. 1061 00:52:08,660 --> 00:52:12,410 And with agree.py, let me go ahead and-- 1062 00:52:12,410 --> 00:52:14,680 actually, let's go ahead and do this. 1063 00:52:14,680 --> 00:52:18,220 Let me also open up a file that I came with in advance. 1064 00:52:18,220 --> 00:52:20,310 And this is called agree.c. 1065 00:52:20,310 --> 00:52:23,340 And this is what we did some weeks ago when 1066 00:52:23,340 --> 00:52:26,860 we wanted to check whether or not the user had agreed to something or not. 1067 00:52:26,860 --> 00:52:29,310 So we used the CS50 library, the standard I/O library, 1068 00:52:29,310 --> 00:52:31,290 we had a main() function, we used get_char(). 1069 00:52:31,290 --> 00:52:35,460 And then we used == a lot, and we used the two vertical bars, 1070 00:52:35,460 --> 00:52:36,990 which meant logical or. 1071 00:52:36,990 --> 00:52:39,100 Is this thing true or is this thing true? 1072 00:52:39,100 --> 00:52:42,120 And if so, printf() "Agreed" or "Not agreed." 1073 00:52:42,120 --> 00:52:42,900 So this worked. 
1074 00:52:42,900 --> 00:52:44,160 And this is relatively simple. 1075 00:52:44,160 --> 00:52:47,670 That's the right way to do it in C. But notice 1076 00:52:47,670 --> 00:52:50,400 it was a little verbose because we wanted 1077 00:52:50,400 --> 00:52:53,820 to handle uppercase and lowercase, uppercase and lowercase. 1078 00:52:53,820 --> 00:52:56,640 So that did start to bloat the code, admittedly. 1079 00:52:56,640 --> 00:52:58,770 So let's try to do the same thing in Python 1080 00:52:58,770 --> 00:53:02,530 and see what we can do the same or different-- no pun intended. 1081 00:53:02,530 --> 00:53:03,330 So let me do this. 1082 00:53:03,330 --> 00:53:08,160 In agree.py, why don't we try to get input from the user as before? 1083 00:53:08,160 --> 00:53:09,450 And I will use-- 1084 00:53:09,450 --> 00:53:12,060 I could use get_string(), but I'll go ahead and use input(). 1085 00:53:12,060 --> 00:53:17,940 So s = input("Do you agree? ") in double quotes. 1086 00:53:17,940 --> 00:53:21,090 And then let's go ahead and check if s == "Y"-- 1087 00:53:21,090 --> 00:53:24,240 1088 00:53:24,240 --> 00:53:26,980 and it's not vertical bar now, it's actually more readable, 1089 00:53:26,980 --> 00:53:34,830 more English-like-- or s == "y," then go ahead and print out "Agreed" as before, 1090 00:53:34,830 --> 00:53:36,270 elsif-- 1091 00:53:36,270 --> 00:53:43,710 see, I did it there-- elif s == "N" or s == "n," 1092 00:53:43,710 --> 00:53:47,040 go ahead and print out "Not agreed." 1093 00:53:47,040 --> 00:53:51,720 So it's almost the same as the C version, except that I'm using, 1094 00:53:51,720 --> 00:53:53,980 literally, O-R instead of two vertical bars. 1095 00:53:53,980 --> 00:53:56,430 So let's run this-- so python of agree.py. 1096 00:53:56,430 --> 00:53:57,030 Enter. 1097 00:53:57,030 --> 00:53:58,110 Do I agree? 1098 00:53:58,110 --> 00:53:59,610 Yes, for little y. 1099 00:53:59,610 --> 00:54:02,970 Let's do it again. python of agree.py, capital Y. Yes. 
1100 00:54:02,970 --> 00:54:03,910 That works there. 1101 00:54:03,910 --> 00:54:06,630 And if I do it again with lowercase n, and if I do it 1102 00:54:06,630 --> 00:54:10,240 with capital N, this program, too, seems to work. 1103 00:54:10,240 --> 00:54:12,160 But what if I do this? 1104 00:54:12,160 --> 00:54:13,980 Let me rerun python of agree.py. 1105 00:54:13,980 --> 00:54:16,140 Let me type in Yes. 1106 00:54:16,140 --> 00:54:17,227 OK, it just ignores me. 1107 00:54:17,227 --> 00:54:18,060 Let me run it again. 1108 00:54:18,060 --> 00:54:18,915 Let me type in no. 1109 00:54:18,915 --> 00:54:20,130 It just ignores me. 1110 00:54:20,130 --> 00:54:22,230 Let me try it very emphatically, YES in all caps. 1111 00:54:22,230 --> 00:54:23,370 It just ignores me. 1112 00:54:23,370 --> 00:54:26,100 So there's some explosion of possibilities 1113 00:54:26,100 --> 00:54:27,930 that ideally we should handle, right? 1114 00:54:27,930 --> 00:54:32,790 This is bad user interface design if I have-- the user has to type Y or N, 1115 00:54:32,790 --> 00:54:37,260 even if yes and no in English are perfectly reasonable and logical, too. 1116 00:54:37,260 --> 00:54:39,250 So how could we handle that? 1117 00:54:39,250 --> 00:54:43,950 Well, it turns out in Python, we can use something like an array, 1118 00:54:43,950 --> 00:54:47,740 technically called a list, to maybe check a bunch of things at once. 1119 00:54:47,740 --> 00:54:49,120 So let me do this. 1120 00:54:49,120 --> 00:54:54,330 Let me instead say not equality, but let me use the in keyword in Python 1121 00:54:54,330 --> 00:54:56,880 and check if it's in a collection of possible values. 
1122 00:54:56,880 --> 00:54:59,940 Let me say if s is in-- 1123 00:54:59,940 --> 00:55:03,930 and here comes, in square brackets, just like-- 1124 00:55:03,930 --> 00:55:09,630 in square brackets, quote unquote, "y", quote unquote, "yes," 1125 00:55:09,630 --> 00:55:13,770 then we can go ahead and print out "Agreed," 1126 00:55:13,770 --> 00:55:20,100 elif s in this list of values, lowercase "n" or lowercase "no," 1127 00:55:20,100 --> 00:55:23,190 then we can print out, for instance, "Not agreed." 1128 00:55:23,190 --> 00:55:25,890 But this is a bit of a step backwards because now 1129 00:55:25,890 --> 00:55:29,010 I'm only handling lowercase. 1130 00:55:29,010 --> 00:55:32,520 So let me go into the mix and maybe add capital "Y"-- 1131 00:55:32,520 --> 00:55:36,780 wait a minute, then maybe capital "YES," then maybe "YeS," also-- 1132 00:55:36,780 --> 00:55:41,310 I mean, weird, but we should probably support this and "YEs." 1133 00:55:41,310 --> 00:55:43,480 I mean, there's a lot of combinations. 1134 00:55:43,480 --> 00:55:45,360 So this is not going to end well. 1135 00:55:45,360 --> 00:55:47,760 Or it's just going to bloat my code unnecessarily. 1136 00:55:47,760 --> 00:55:51,840 And eventually, for longer words, I'm surely going to miss capitalization. 1137 00:55:51,840 --> 00:55:54,990 So logically, whether it's in Python or C or any language, 1138 00:55:54,990 --> 00:56:00,900 what might be a better design for this problem of handling Y and Yes, 1139 00:56:00,900 --> 00:56:03,090 but who cares about the capitalization? 1140 00:56:03,090 --> 00:56:07,500 AUDIENCE: Don't use capitals or [INAUDIBLE] 1141 00:56:07,500 --> 00:56:10,260 DAVID MALAN: So OK, so don't use capitals. 1142 00:56:10,260 --> 00:56:12,280 You could only support lowercase. 1143 00:56:12,280 --> 00:56:12,780 That's fine. 1144 00:56:12,780 --> 00:56:14,072 That's kind of a copout, right? 1145 00:56:14,072 --> 00:56:16,225 Because now the program's usability is worse. 
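The membership check described above might look like the sketch below. It wraps the logic in a hypothetical function, agree, so it can be tried without typing input, and it keeps the lowercase-only limitation that was just pointed out.

```python
def agree(s):
    """Return "Agreed" or "Not agreed", or None if s is not recognized."""
    # The in keyword checks whether s is one of the values in the list.
    if s in ["y", "yes"]:
        return "Agreed"
    elif s in ["n", "no"]:
        return "Not agreed"
    return None


print(agree("yes"))  # prints Agreed
print(agree("n"))    # prints Not agreed
print(agree("YES"))  # prints None, because only lowercase is listed
```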
1146 00:56:16,225 --> 00:56:17,100 AUDIENCE: Convert it. 1147 00:56:17,100 --> 00:56:19,590 DAVID MALAN: Oh, we could convert it to lowercase, yeah. 1148 00:56:19,590 --> 00:56:22,440 Though I did hear you say we could just check the first letter, 1149 00:56:22,440 --> 00:56:24,460 I bet that's going to get us into trouble. 1150 00:56:24,460 --> 00:56:26,700 And we probably don't want to allow any word starting 1151 00:56:26,700 --> 00:56:31,320 with Y, any word starting with N, just because it logically-- especially you 1152 00:56:31,320 --> 00:56:33,150 want the lawyers happy, presumably. 1153 00:56:33,150 --> 00:56:36,960 You should probably get an explicit semantically correct word like Y or N 1154 00:56:36,960 --> 00:56:37,890 or yes or no. 1155 00:56:37,890 --> 00:56:41,580 But, yeah, we can actually go about converting this to something 1156 00:56:41,580 --> 00:56:42,295 maybe smaller. 1157 00:56:42,295 --> 00:56:43,920 But how do we go about converting this? 1158 00:56:43,920 --> 00:56:48,450 In C, that alone was going to be pretty darn annoying because we'd have to use 1159 00:56:48,450 --> 00:56:53,770 the tolower() function on every character and compare it for equality. 1160 00:56:53,770 --> 00:56:55,440 It just feels like that's a bit of work. 1161 00:56:55,440 --> 00:56:58,840 But in Python, you're going to get more functionality for free. 1162 00:56:58,840 --> 00:57:01,680 So there might very well be a function, like in C, 1163 00:57:01,680 --> 00:57:03,600 called tolower() or toupper(). 1164 00:57:03,600 --> 00:57:06,300 But the weird thing about C, perhaps in retrospect, 1165 00:57:06,300 --> 00:57:10,090 is that those functions just kind of worked on the honor system. 1166 00:57:10,090 --> 00:57:15,010 tolower() and toupper() just trusted that you would pass them an input, 1167 00:57:15,010 --> 00:57:18,400 an argument, that is, in fact, a char. 
1168 00:57:18,400 --> 00:57:22,300 In Python, and in a lot of other higher-level languages, 1169 00:57:22,300 --> 00:57:26,120 they introduced this notion of Object-Oriented Programming, 1170 00:57:26,120 --> 00:57:28,270 which is commonly described as OOP. 1171 00:57:28,270 --> 00:57:31,120 And in the world of Object-Oriented Programming, 1172 00:57:31,120 --> 00:57:34,510 your values can not only-- your variables, 1173 00:57:34,510 --> 00:57:37,450 for instance, and your data types can not only have values. 1174 00:57:37,450 --> 00:57:40,940 They can also have functionality built into them. 1175 00:57:40,940 --> 00:57:43,270 So if you have a data type like a string, 1176 00:57:43,270 --> 00:57:45,370 frankly, it just makes good sense that strings 1177 00:57:45,370 --> 00:57:48,490 should be uppercaseable, lowercaseable, capitalizable, 1178 00:57:48,490 --> 00:57:50,990 and any number of other operations on strings. 1179 00:57:50,990 --> 00:57:54,040 So in the world of object-oriented programming, functions 1180 00:57:54,040 --> 00:57:58,150 like toupper() and tolower() and isupper() and islower() are not just 1181 00:57:58,150 --> 00:57:59,980 in some random library that you can use. 1182 00:57:59,980 --> 00:58:02,560 They're built into the strings themselves. 1183 00:58:02,560 --> 00:58:06,400 And what this means is that in the world of strings in Python, 1184 00:58:06,400 --> 00:58:10,360 here, for instance, is the URL of the documentation for all of the functions, 1185 00:58:10,360 --> 00:58:13,810 otherwise known as methods, that come with strings. 1186 00:58:13,810 --> 00:58:16,150 So you don't go check for a ctype library 1187 00:58:16,150 --> 00:58:18,790 like we did in C. You check the actual data 1188 00:58:18,790 --> 00:58:20,770 type, the documentation, therefore, and you 1189 00:58:20,770 --> 00:58:24,580 will see in Python's own documentation what functions, a.k.a. 1190 00:58:24,580 --> 00:58:26,680 methods, come with strings.
1191 00:58:26,680 --> 00:58:28,420 So a method is just a function. 1192 00:58:28,420 --> 00:58:32,540 But it's a function that comes with some data type, like a string. 1193 00:58:32,540 --> 00:58:35,830 So let me propose that we do this. 1194 00:58:35,830 --> 00:58:38,980 In the world of object-oriented programming, 1195 00:58:38,980 --> 00:58:41,720 we can come back to agree.py. 1196 00:58:41,720 --> 00:58:44,620 And we can actually improve the program by getting 1197 00:58:44,620 --> 00:58:47,440 rid of this crazy long list, which I wasn't even done with, 1198 00:58:47,440 --> 00:58:50,260 and just canonicalize everything as lowercase. 1199 00:58:50,260 --> 00:58:53,470 So let's just check for lowercase y and lowercase yes, lowercase 1200 00:58:53,470 --> 00:58:55,630 n, lowercase no, and that's it. 1201 00:58:55,630 --> 00:58:58,000 But to your suggestion, let's force everything 1202 00:58:58,000 --> 00:59:01,600 that the user types into lowercase, not because we want 1203 00:59:01,600 --> 00:59:03,160 to permanently change their input-- 1204 00:59:03,160 --> 00:59:05,980 we can throw the value away thereafter-- but 1205 00:59:05,980 --> 00:59:09,520 because we want to more easily logically compare it 1206 00:59:09,520 --> 00:59:12,350 for membership in this list of values. 1207 00:59:12,350 --> 00:59:18,200 So one way to do this would be to literally do s = s.lower(). 1208 00:59:18,200 --> 00:59:19,480 So here's the difference. 1209 00:59:19,480 --> 00:59:22,360 In the world of C, we would have done this-- 1210 00:59:22,360 --> 00:59:25,990 tolower and pass in the value s. 1211 00:59:25,990 --> 00:59:30,010 But in the world of Python, and, in general, object-oriented programming-- 1212 00:59:30,010 --> 00:59:32,530 Java is another language that does this-- 1213 00:59:32,530 --> 00:59:36,460 if s is a string, a.k.a. str, therefore, s is actually 1214 00:59:36,460 --> 00:59:38,290 what's known in Python as an object. 
1215 00:59:38,290 --> 00:59:41,560 An object can not only have values or attributes inside of it, 1216 00:59:41,560 --> 00:59:43,300 but also functionality built in. 1217 00:59:43,300 --> 00:59:47,260 And just like in C, with a struct, if you want to go inside of something, 1218 00:59:47,260 --> 00:59:49,030 you use the dot operator. 1219 00:59:49,030 --> 00:59:54,250 And inside of this string, I claim, is a function, a.k.a. method, 1220 00:59:54,250 --> 00:59:55,330 called lower(). 1221 00:59:55,330 --> 00:59:58,510 Long story short, the only takeaway, if this is a bit abstract, 1222 00:59:58,510 --> 01:00:01,185 is that instead of doing lower and then, in parentheses, 1223 01:00:01,185 --> 01:00:04,060 s, in the world of object-oriented programming, you kind of flip that 1224 01:00:04,060 --> 01:00:08,950 and you do s dot name of the method, and then open paren and close paren if you 1225 01:00:08,950 --> 01:00:10,940 don't need to pass in any arguments. 1226 01:00:10,940 --> 01:00:12,650 So this actually achieves the same. 1227 01:00:12,650 --> 01:00:17,680 So let me go ahead and rerun agree.py, and let me type in lowercase y. 1228 01:00:17,680 --> 01:00:18,460 That works. 1229 01:00:18,460 --> 01:00:20,680 Let me run it again, type in lowercase yes. 1230 01:00:20,680 --> 01:00:24,820 That works. Let me run it again, type in capital Y. That works. 1231 01:00:24,820 --> 01:00:27,070 Let me type in capital YES, all capital-- 1232 01:00:27,070 --> 01:00:28,360 all uppercase YES. 1233 01:00:28,360 --> 01:00:29,410 That too works. 1234 01:00:29,410 --> 01:00:31,450 Let me try no. 1235 01:00:31,450 --> 01:00:33,640 Let me try no in lowercase. 1236 01:00:33,640 --> 01:00:37,180 And all of these permutations now actually work 1237 01:00:37,180 --> 01:00:38,830 because I'm forcing it to lowercase.
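A minimal sketch of the lowercase canonicalization described above. The helper name `agree` and the list of sample replies are assumptions for illustration; the lecture's agree.py reads s from input() and prints its verdict directly.

```python
# Sketch of agree.py after forcing the input to lowercase. agree() is a
# hypothetical helper so the logic can be run against several replies.
def agree(s):
    # str.lower() is a method: it is called on the string itself and
    # returns a new, lowercased string. Reassigning s keeps only that copy.
    s = s.lower()
    if s in ["y", "yes"]:
        return "Agreed"
    elif s in ["n", "no"]:
        return "Not agreed"
    return None


# Every capitalization of the same word now lands in the same branch.
for reply in ["y", "yes", "Y", "YES", "YeS", "no", "NO"]:
    print(reply, "->", agree(reply))
```

The two short lists never need to grow, because the comparison happens only after the reply has been lowercased.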
1238 01:00:38,830 --> 01:00:42,460 But even more interestingly, in Python, if you're sort of becoming a languages 1239 01:00:42,460 --> 01:00:48,850 person, if you have a variable s that is being set the return value of input() 1240 01:00:48,850 --> 01:00:52,420 function, and then you're immediately going about changing it to lowercase, 1241 01:00:52,420 --> 01:00:57,800 you can also chain method calls together in something like Python by doing this. 1242 01:00:57,800 --> 01:01:02,720 We can get rid of this line altogether, and then I can just do this, .lower. 1243 01:01:02,720 --> 01:01:05,950 And so whatever the return value of input() is, it's going to be a str. 1244 01:01:05,950 --> 01:01:08,350 Whatever the human types in, you can then immediately 1245 01:01:08,350 --> 01:01:12,490 force it to lowercase and then assign the whole value to this variable 1246 01:01:12,490 --> 01:01:13,120 called s. 1247 01:01:13,120 --> 01:01:17,260 You don't actually have to wait around and do it on a separate line 1248 01:01:17,260 --> 01:01:20,220 altogether. 1249 01:01:20,220 --> 01:01:23,235 Questions, then, on any of this? 1250 01:01:23,235 --> 01:01:26,250 1251 01:01:26,250 --> 01:01:26,750 No? 1252 01:01:26,750 --> 01:01:27,250 All right. 1253 01:01:27,250 --> 01:01:30,260 Let me do one other that's reminiscent of something we did in the past. 1254 01:01:30,260 --> 01:01:32,900 Let me go into VS Code here, clear my terminal. 1255 01:01:32,900 --> 01:01:35,960 Let's close both the C and the Python version of agree. 1256 01:01:35,960 --> 01:01:39,500 And let's create a program called uppercase.py, whose purpose in life 1257 01:01:39,500 --> 01:01:41,390 is to actually uppercase a whole string. 1258 01:01:41,390 --> 01:01:45,440 In the world of C, we had to do this character by character by character. 1259 01:01:45,440 --> 01:01:46,200 And that's fine. 
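The chained call described a moment ago can be sketched as follows, under stated assumptions: the function name `ask`, its `read` parameter, and the prompt text are hypothetical, added only so the example can be run with a canned answer instead of a keyboard. agree.py itself would simply call input() directly.

```python
# Sketch of chaining a method call onto a function's return value.
# ask() and its read parameter are hypothetical; read defaults to the
# built-in input(), which returns whatever the human types as a str.
def ask(read=input):
    # .lower() is applied immediately to the str that read() returns,
    # so s only ever holds the lowercased text.
    s = read("Do you agree? ").lower()
    return s


# Substitute a canned reply for the keyboard to show the effect.
print(ask(lambda prompt: "YES"))
```

Chaining saves the separate s = s.lower() line without changing what ends up stored in s.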
1260 01:01:46,200 --> 01:01:48,380 I'm going to go ahead and do it similarly here 1261 01:01:48,380 --> 01:01:53,540 in Python, whereby I want to convert it character by character. 1262 01:01:53,540 --> 01:01:56,510 But unfortunately, before I can do that, I actually 1263 01:01:56,510 --> 01:02:00,320 need some way of looping in Python, which we actually haven't seen yet. 1264 01:02:00,320 --> 01:02:02,420 So we need one more set of building blocks. 1265 01:02:02,420 --> 01:02:05,120 And, in fact, if we were to consult the Python documentation, 1266 01:02:05,120 --> 01:02:06,530 we'd see this and much more. 1267 01:02:06,530 --> 01:02:10,240 So, in fact, here's a list of all of the functions that come with Python. 1268 01:02:10,240 --> 01:02:11,990 And it's actually not that long of a list, 1269 01:02:11,990 --> 01:02:15,680 because so much of the functionality of Python is built into data types, 1270 01:02:15,680 --> 01:02:18,950 like strings and integers and floats and more. 1271 01:02:18,950 --> 01:02:22,790 Here is the canonical source of truth for Python documentation. 1272 01:02:22,790 --> 01:02:25,950 So as opposed to using the CS50 manual for C, 1273 01:02:25,950 --> 01:02:29,700 which is meant to be a simplified version of publicly 1274 01:02:29,700 --> 01:02:32,520 available documentation, we'll generally, for Python, 1275 01:02:32,520 --> 01:02:33,870 point you to the official docs. 1276 01:02:33,870 --> 01:02:39,000 I will disclaim they're not really written for introductory students. 1277 01:02:39,000 --> 01:02:42,150 And they'll generally leave some detail off and use arcane language. 1278 01:02:42,150 --> 01:02:43,947 But at this point in the term, even if it 1279 01:02:43,947 --> 01:02:45,780 might be a little frustrating at first, it's 1280 01:02:45,780 --> 01:02:47,873 good to see documentation in the real world 1281 01:02:47,873 --> 01:02:50,290 because that's what you're going to have after the course. 
1282 01:02:50,290 --> 01:02:52,860 And so you'll get used to it through practice over time. 1283 01:02:52,860 --> 01:02:55,408 But with loops, let's introduce one other feature 1284 01:02:55,408 --> 01:02:56,700 that we can compare to Scratch. 1285 01:02:56,700 --> 01:03:00,270 Here, for instance, in Scratch, is how we might have repeated something three 1286 01:03:00,270 --> 01:03:01,980 times, like meowing on the screen. 1287 01:03:01,980 --> 01:03:04,330 In C, there were a bunch of ways to do this. 1288 01:03:04,330 --> 01:03:07,080 And the clunkiest was maybe to do it with a while loop 1289 01:03:07,080 --> 01:03:10,080 where we declare a variable called i, set it equal to 0, 1290 01:03:10,080 --> 01:03:14,580 and then, iteratively, increment i again and again until it exceeds-- 1291 01:03:14,580 --> 01:03:17,970 until it equals 3, each time printing out "meow." 1292 01:03:17,970 --> 01:03:22,620 In Python, we can do this in a few different ways as well. 1293 01:03:22,620 --> 01:03:26,850 The nearest translation of C into Python is perhaps this. 1294 01:03:26,850 --> 01:03:29,850 It's almost the same, and logically, it really is the same, 1295 01:03:29,850 --> 01:03:32,730 but you don't specify int, and you don't have a semicolon. 1296 01:03:32,730 --> 01:03:34,270 You don't have curly braces. 1297 01:03:34,270 --> 01:03:35,520 But you do have a colon. 1298 01:03:35,520 --> 01:03:36,690 You don't use printf(). 1299 01:03:36,690 --> 01:03:37,800 You use print(). 1300 01:03:37,800 --> 01:03:42,240 And you can't use i++, but you still can use i += 1. 1301 01:03:42,240 --> 01:03:45,360 So logically, exactly the same idea as in C-- 1302 01:03:45,360 --> 01:03:46,660 It's just a little tighter. 1303 01:03:46,660 --> 01:03:49,920 I mean, it's a little easier to read, even though it's very mechanical, 1304 01:03:49,920 --> 01:03:50,490 if you will. 1305 01:03:50,490 --> 01:03:51,900 You're defining all of these. 
1306 01:03:51,900 --> 01:03:54,600 You're defining this variable and changing it incrementally. 1307 01:03:54,600 --> 01:03:58,317 Well, recall that in C, we could also use a for loop, which at first glance 1308 01:03:58,317 --> 01:04:00,150 was probably more cryptic than a while loop. 1309 01:04:00,150 --> 01:04:02,670 But odds are by now, you're more comfortable or more 1310 01:04:02,670 --> 01:04:04,950 in the habit of using loops-- same exact idea. 1311 01:04:04,950 --> 01:04:08,160 In Python, though, we might do it like this. 1312 01:04:08,160 --> 01:04:13,530 We've seen how, in square brackets, you can have lists of values, like y, yes, 1313 01:04:13,530 --> 01:04:14,690 and so forth. 1314 01:04:14,690 --> 01:04:16,690 Well, let's just do the same thing with numbers. 1315 01:04:16,690 --> 01:04:19,380 So if you want Python to do something three times, give it 1316 01:04:19,380 --> 01:04:24,390 a list of three values, like 0, 1, 2, and then print out "hello, world" 1317 01:04:24,390 --> 01:04:26,100 that many times. 1318 01:04:26,100 --> 01:04:29,550 Now, this is correct, but it's bad design. 1319 01:04:29,550 --> 01:04:33,720 Even if you've never seen Python before, extrapolate mentally from this. 1320 01:04:33,720 --> 01:04:38,204 Why is this probably not the right way or the best way to do this looping? 1321 01:04:38,204 --> 01:04:40,329 AUDIENCE: Because if you wanted to do it more than, 1322 01:04:40,329 --> 01:04:42,360 like, three times, you have to [INAUDIBLE].. 1323 01:04:42,360 --> 01:04:43,110 DAVID MALAN: Yeah. 1324 01:04:43,110 --> 01:04:46,830 If you want to do it four times, five times, 50 times, 100 times, 1325 01:04:46,830 --> 01:04:50,250 I mean, surely, there's a better way than enumerating all of these values. 1326 01:04:50,250 --> 01:04:51,060 And there is. 
1327 01:04:51,060 --> 01:04:55,620 In fact, in Python, there's a function called range() that actually returns 1328 01:04:55,620 --> 01:04:58,170 to you very efficiently a range of values. 1329 01:04:58,170 --> 01:05:02,040 And by default, it hands you the number 0 and then 1 and then 2. 1330 01:05:02,040 --> 01:05:05,430 And if you want more than that, you just change the argument to range() to be 1331 01:05:05,430 --> 01:05:07,060 how many values do you want. 1332 01:05:07,060 --> 01:05:10,560 So if you passed in range of 50, you would get back 0 1333 01:05:10,560 --> 01:05:15,430 through 49, which effectively allows you to do something 50 times in total. 1334 01:05:15,430 --> 01:05:18,430 So this is perhaps the most Pythonic way, so to speak. 1335 01:05:18,430 --> 01:05:20,070 And this is actually a term of art. 1336 01:05:20,070 --> 01:05:23,340 Pythonic isn't necessarily the only way to do something. 1337 01:05:23,340 --> 01:05:28,230 But it's the way to do something based on consensus in the Python community. 1338 01:05:28,230 --> 01:05:30,090 So it's pretty common to do this. 1339 01:05:30,090 --> 01:05:32,100 But there's some curiosity here. 1340 01:05:32,100 --> 01:05:36,720 Notice I'm declaring a variable i, but I'm never actually using it. 1341 01:05:36,720 --> 01:05:39,137 In fact, I don't even increment it because that's 1342 01:05:39,137 --> 01:05:40,470 sort of happening automatically. 1343 01:05:40,470 --> 01:05:43,690 Well, what's really happening here is automatically in Python, 1344 01:05:43,690 --> 01:05:50,000 on every iteration of this loop, Python is assigning i to the next value. 1345 01:05:50,000 --> 01:05:51,393 So initially, i is 0. 1346 01:05:51,393 --> 01:05:52,810 Then it goes through an iteration. 1347 01:05:52,810 --> 01:05:53,920 Then i is 1. 1348 01:05:53,920 --> 01:05:55,210 Then i is 2. 1349 01:05:55,210 --> 01:05:58,210 And then that's it if you only asked for three values. 
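The three looping styles walked through above, side by side, as a rough sketch rather than the exact slides. Each one prints "meow" three times; the variable name `values` at the end is an assumption added to show what range() hands back.

```python
# 1. The nearest translation of the C while loop: no int, no semicolons,
#    no curly braces, and i += 1 in place of i++.
i = 0
while i < 3:
    print("meow")
    i += 1

# 2. A for loop over an explicit list: correct, but it does not scale.
for i in [0, 1, 2]:
    print("meow")

# 3. A for loop over range(3): Python assigns i the values 0, 1, 2 in turn.
for i in range(3):
    print("meow")

# range(50) yields 0 through 49, which is 50 values in total.
values = list(range(50))
print(values[0], values[-1], len(values))
```

Only the argument to range() has to change to repeat something 50 times, whereas the explicit list would need 50 entries.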
1350 01:05:58,210 --> 01:06:00,940 But there's this other technique in Python, just so you know, 1351 01:06:00,940 --> 01:06:03,850 whereby if you're the programmer, and you know you don't actually 1352 01:06:03,850 --> 01:06:06,100 care about the name of this variable, you 1353 01:06:06,100 --> 01:06:10,600 can actually change it to an underscore, which has no functional effect per se. 1354 01:06:10,600 --> 01:06:14,020 It just signals to the reader, your colleague, your teaching fellow, 1355 01:06:14,020 --> 01:06:17,738 that it's a variable, and you need it in order to achieve a for loop. 1356 01:06:17,738 --> 01:06:19,780 But you don't care about the name of the variable 1357 01:06:19,780 --> 01:06:22,150 because you're not going to use it explicitly anywhere. 1358 01:06:22,150 --> 01:06:25,310 So that might be an even more Pythonic way of doing things. 1359 01:06:25,310 --> 01:06:27,580 But if you're more comfortable seeing the i 1360 01:06:27,580 --> 01:06:30,010 and using the variable more explicitly, that's fine. 1361 01:06:30,010 --> 01:06:32,480 Underscore does not mean anything special. 1362 01:06:32,480 --> 01:06:35,720 It's just a valid character for a variable name. 1363 01:06:35,720 --> 01:06:38,630 So this is convention, nothing more technical than that. 1364 01:06:38,630 --> 01:06:42,430 What about a forever loop in Scratch, like literally meow forever. 1365 01:06:42,430 --> 01:06:46,840 Well, over here, we can just use in C, while(true) printf() "meow," 1366 01:06:46,840 --> 01:06:49,210 again and again and again. 1367 01:06:49,210 --> 01:06:51,977 In Python, it's almost the same. 1368 01:06:51,977 --> 01:06:53,560 You still get rid of the curly braces. 1369 01:06:53,560 --> 01:06:54,430 You add the colon. 1370 01:06:54,430 --> 01:06:55,638 You get rid of the semicolon. 1371 01:06:55,638 --> 01:06:57,940 But there's a subtlety. 1372 01:06:57,940 --> 01:07:00,040 What else is different here? 1373 01:07:00,040 --> 01:07:01,090 Yeah? 
1374 01:07:01,090 --> 01:07:02,620 So True is uppercase. 1375 01:07:02,620 --> 01:07:03,310 Why? 1376 01:07:03,310 --> 01:07:04,210 Who knows? 1377 01:07:04,210 --> 01:07:07,210 The world decided that in Python, True is capitalized 1378 01:07:07,210 --> 01:07:08,320 and False is capitalized. 1379 01:07:08,320 --> 01:07:11,030 In many other languages, daresay most, they are not. 1380 01:07:11,030 --> 01:07:15,430 It's just a difference that you have to keep in mind or remember. 1381 01:07:15,430 --> 01:07:15,940 All right. 1382 01:07:15,940 --> 01:07:20,480 So now that we have looping constructs, let me go back to my code here. 1383 01:07:20,480 --> 01:07:23,680 And recall that I proposed that we re-implement a program like uppercase, 1384 01:07:23,680 --> 01:07:25,420 force an entire string to uppercase. 1385 01:07:25,420 --> 01:07:29,470 And in C, we would have done this with a for loop, iterating from left to right. 1386 01:07:29,470 --> 01:07:32,410 But what's nice in Python frankly, is that it's a lot easier 1387 01:07:32,410 --> 01:07:37,510 to loop in Python than it is in C because you can loop over 1388 01:07:37,510 --> 01:07:39,460 anything that is iterable. 1389 01:07:39,460 --> 01:07:43,210 A string is iterable in the sense that you can iterate over it 1390 01:07:43,210 --> 01:07:44,630 from left to right. 1391 01:07:44,630 --> 01:07:45,740 So what do I mean by this? 1392 01:07:45,740 --> 01:07:48,010 Well, let me go ahead and, in uppercase.py, 1393 01:07:48,010 --> 01:07:51,790 let's first prompt the user for a variable called before and set that 1394 01:07:51,790 --> 01:07:56,650 equal to the return value of input(), giving them a prompt of "Before," 1395 01:07:56,650 --> 01:07:57,820 colon. 1396 01:07:57,820 --> 01:08:01,720 Then let's go ahead, as we did weeks ago, and print out just the word 1397 01:08:01,720 --> 01:08:08,110 "After," just to make clear to the user what is actually going to be printed. 
1398 01:08:08,110 --> 01:08:12,700 Then let me go ahead and specify the following loop-- 1399 01:08:12,700 --> 01:08:16,005 for-- and previously you saw me use i, but because I'm dealing with 1400 01:08:16,005 --> 01:08:18,130 characters, I'm actually going to do this instead-- 1401 01:08:18,130 --> 01:08:24,910 for c in before, colon, print out c.upper. 1402 01:08:24,910 --> 01:08:26,290 And that's it. 1403 01:08:26,290 --> 01:08:28,359 Now, this is a little flawed, I will concede. 1404 01:08:28,359 --> 01:08:31,540 But let me run this-- python of uppercase.py. 1405 01:08:31,540 --> 01:08:35,229 Let's type in something like cat, C-A-T in all lowercase. 1406 01:08:35,229 --> 01:08:35,990 Enter. 1407 01:08:35,990 --> 01:08:36,490 All right. 1408 01:08:36,490 --> 01:08:39,040 Well, you see "After," and I did get it right in the sense 1409 01:08:39,040 --> 01:08:43,359 that it is capital C, capital A, capital T, but it looks a little stupid. 1410 01:08:43,359 --> 01:08:45,189 And in order to fix this, we actually need 1411 01:08:45,189 --> 01:08:49,479 to introduce something that's called named parameters. 1412 01:08:49,479 --> 01:08:55,510 So let me actually go ahead and propose that we can fix this problem 1413 01:08:55,510 --> 01:08:59,140 by actually passing in another argument to the print() function. 1414 01:08:59,140 --> 01:09:01,540 And this is a little different syntactically from C. 1415 01:09:01,540 --> 01:09:04,479 But if I go back to VS Code here, it turns out 1416 01:09:04,479 --> 01:09:06,319 that there's two aesthetic problems here. 1417 01:09:06,319 --> 01:09:10,130 One, I did not want the new line automatically inserted after "After." 1418 01:09:10,130 --> 01:09:10,630 Why? 1419 01:09:10,630 --> 01:09:13,569 Because, just like in week 1, I want them to line up nicely-- 1420 01:09:13,569 --> 01:09:15,310 or in week 2. 1421 01:09:15,310 --> 01:09:18,367 And I don't want a new line after C-A-T. 
So even 1422 01:09:18,367 --> 01:09:20,200 though at first glance a moment-- a bit ago, 1423 01:09:20,200 --> 01:09:23,620 it might have seemed nice that Python just does the backslash n for you, 1424 01:09:23,620 --> 01:09:27,649 it can backfire if you don't actually want a new line every time. 1425 01:09:27,649 --> 01:09:29,660 So the syntax is going to look a little weird. 1426 01:09:29,660 --> 01:09:32,529 But in Python, with the print() function, 1427 01:09:32,529 --> 01:09:36,819 if you want to change the character that's automatically used at the end 1428 01:09:36,819 --> 01:09:43,130 of every line, you can literally pass in a second argument called end and set it 1429 01:09:43,130 --> 01:09:45,660 equal to something else. 1430 01:09:45,660 --> 01:09:48,350 So if you want to set it equal to something else, 1431 01:09:48,350 --> 01:09:52,620 and that something else is nothing, "", then that's fine. 1432 01:09:52,620 --> 01:09:57,050 You can actually specify end="". 1433 01:09:57,050 --> 01:10:00,980 Down here, too, if you want to specify that at the end of every one of these 1434 01:10:00,980 --> 01:10:05,330 characters should be nothing, I can specify end="". 1435 01:10:05,330 --> 01:10:08,390 What this implies is that by default in Python, 1436 01:10:08,390 --> 01:10:13,280 the default value of this end parameter is actually always backslash n. 1437 01:10:13,280 --> 01:10:15,800 So if you want to override it and take that away, 1438 01:10:15,800 --> 01:10:19,380 you just literally change it to "" instead. 1439 01:10:19,380 --> 01:10:23,960 And now if I clear my-- if I rerun this program, uppercase.py, 1440 01:10:23,960 --> 01:10:27,820 type in cat in all lowercase, now you'll see-- 1441 01:10:27,820 --> 01:10:29,307 oh, two minor bugs here. 1442 01:10:29,307 --> 01:10:30,140 One was just stupid. 1443 01:10:30,140 --> 01:10:31,940 I had one too many spaces here. 
1444 01:10:31,940 --> 01:10:35,150 But you'll notice that I didn't move the cursor to the next line 1445 01:10:35,150 --> 01:10:38,090 after CAT was printed in all uppercase. 1446 01:10:38,090 --> 01:10:40,070 And that we can fix by just printing nothing. 1447 01:10:40,070 --> 01:10:43,140 It turns out when you don't pass print() an argument at all, 1448 01:10:43,140 --> 01:10:46,690 it automatically gives you just the line ending, nothing else. 1449 01:10:46,690 --> 01:10:49,210 So I think this will move the cursor as expected. 1450 01:10:49,210 --> 01:10:52,200 So let me clear it now, run python of uppercase.py 1451 01:10:52,200 --> 01:10:55,290 and hit Enter, type in cat in all lowercase, cross my fingers this time, 1452 01:10:55,290 --> 01:10:59,910 and now I have indeed capitalized this, character by character 1453 01:10:59,910 --> 01:11:03,360 by character, just like we did in C. 1454 01:11:03,360 --> 01:11:06,060 But honestly, this, too, is not really necessary-- 1455 01:11:06,060 --> 01:11:08,610 it turns out I don't need to loop over a whole string, 1456 01:11:08,610 --> 01:11:10,510 because strings themselves come with methods. 1457 01:11:10,510 --> 01:11:12,930 And if you were to visit the documentation for strings, 1458 01:11:12,930 --> 01:11:17,370 you would see that indeed, upper is a method that comes with every string, 1459 01:11:17,370 --> 01:11:20,880 and you don't need to call it on every character individually. 1460 01:11:20,880 --> 01:11:25,650 I could instead get rid of all of this and just print out-- 1461 01:11:25,650 --> 01:11:31,860 for instance, I can just print out before.upper(). 1462 01:11:31,860 --> 01:11:35,400 And the upper() function that comes with strings will automatically apply it 1463 01:11:35,400 --> 01:11:39,370 to every character therein and, I think, achieve the same result.
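A sketch of both versions of uppercase.py discussed above, with one assumption: `before` is hard-coded to "cat" here so the example runs without a keyboard, whereas the lecture's program sets it from input(). The exact spacing after "After:" is also a guess.

```python
# Stand-in for: before = input("Before: ")
before = "cat"

# Version 1: character by character. end="" overrides print()'s default
# line ending of "\n", so everything stays on one line.
print("After: ", end="")
for c in before:
    print(c.upper(), end="")
# print() with no arguments outputs just the line ending, which moves
# the cursor to the next line.
print()

# Version 2: upper() applies to the whole string at once, so no loop
# is needed.
print("After: ", end="")
print(before.upper())
```

Both versions print the same line; the second simply lets the string's own method do the iteration.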
1464 01:11:39,370 --> 01:11:42,990 So let me go ahead and try this again-- python of uppercase.py, type in cat, 1465 01:11:42,990 --> 01:11:46,330 enter, and indeed, it works exactly the same way. 1466 01:11:46,330 --> 01:11:48,090 Let me take this one step further. 1467 01:11:48,090 --> 01:11:51,510 Let me go ahead and combine a couple of ideas now here. 1468 01:11:51,510 --> 01:11:56,220 Let me go ahead and, for instance, let me get rid of this last print() line. 1469 01:11:56,220 --> 01:12:00,090 Let me change my logic to be after equals the return value of this. 1470 01:12:00,090 --> 01:12:04,770 And now I can use one of those f strings and plug this in maybe here, After. 1471 01:12:04,770 --> 01:12:06,750 And I can get rid of the new line ending. 1472 01:12:06,750 --> 01:12:08,385 I can specify this is an f string. 1473 01:12:08,385 --> 01:12:10,260 So I'm just changing this around a little bit 1474 01:12:10,260 --> 01:12:13,680 logically so that now I have a variable called after that 1475 01:12:13,680 --> 01:12:15,940 is the uppercase version of before. 1476 01:12:15,940 --> 01:12:21,910 And now, if I do python of uppercase.py, type in cat, that too now works. 1477 01:12:21,910 --> 01:12:23,980 And if I-- actually let me add a space there, 1478 01:12:23,980 --> 01:12:28,350 if I run python of uppercase.py, type in cat, that too now works. 1479 01:12:28,350 --> 01:12:31,440 And lastly here, if you don't want to bother 1480 01:12:31,440 --> 01:12:33,960 creating another variable like this, you can even 1481 01:12:33,960 --> 01:12:37,830 put short bits of code inside of these format strings. 1482 01:12:37,830 --> 01:12:40,800 So I, for instance, could go in here into these curly braces 1483 01:12:40,800 --> 01:12:42,510 and not just put a variable name. 1484 01:12:42,510 --> 01:12:48,120 I can actually put Python code inside of the curly braces, inside of my string. 
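The two f-string variants just described, as a small sketch. As before, `before` is hard-coded to "cat" as an assumption so the example runs on its own; in uppercase.py it would come from input().

```python
# Stand-in for: before = input("Before: ")
before = "cat"

# Variant 1: store the uppercased copy in a variable, then interpolate
# that variable inside the curly braces of an f-string.
after = before.upper()
print(f"After: {after}")

# Variant 2: skip the extra variable and put the short expression
# itself inside the curly braces.
print(f"After: {before.upper()}")
```

Either way the curly braces are replaced by the value of whatever is inside them; the leading f is what turns that behavior on.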
1485 01:12:48,120 --> 01:12:51,600 And so now if I run Python of uppercase.py, type in cat, 1486 01:12:51,600 --> 01:12:54,360 even that too now works. 1487 01:12:54,360 --> 01:12:55,890 Now, which one is the best? 1488 01:12:55,890 --> 01:12:59,590 This is kind of reasonable to put the bit of code inside of the string. 1489 01:12:59,590 --> 01:13:02,793 I would not start writing long lines of code inside of curly braces 1490 01:13:02,793 --> 01:13:04,710 that start to wrap, no less, because then it's 1491 01:13:04,710 --> 01:13:06,420 just going to be a matter of bad style. 1492 01:13:06,420 --> 01:13:09,780 But this, again, is to say that there's a bunch of different ways 1493 01:13:09,780 --> 01:13:11,590 to solve each of these problems. 1494 01:13:11,590 --> 01:13:15,240 And so up until now, we've generally seen not named parameters. 1495 01:13:15,240 --> 01:13:20,100 end is the first parameter we've ever seen that has a name, literally, end. 1496 01:13:20,100 --> 01:13:23,790 Up until now in C and up until a moment ago in Python, 1497 01:13:23,790 --> 01:13:27,840 we've always been assuming that our parameters are positional. 1498 01:13:27,840 --> 01:13:33,600 What matters is the order in which you specify them, not necessarily something 1499 01:13:33,600 --> 01:13:35,280 else. 1500 01:13:35,280 --> 01:13:35,880 Whew. 1501 01:13:35,880 --> 01:13:37,600 OK, that was a lot. 1502 01:13:37,600 --> 01:13:42,660 Any questions about any of this here? 1503 01:13:42,660 --> 01:13:43,230 No? 1504 01:13:43,230 --> 01:13:43,620 All right. 1505 01:13:43,620 --> 01:13:44,290 It feels like a lot. 1506 01:13:44,290 --> 01:13:45,450 Let's take our 10-minute break here. 1507 01:13:45,450 --> 01:13:46,700 Fruit roll-ups are now served. 1508 01:13:46,700 --> 01:13:50,310 We'll be back in 10. 1509 01:13:50,310 --> 01:13:51,690 All right. 1510 01:13:51,690 --> 01:13:52,830 We are back. 1511 01:13:52,830 --> 01:13:58,440 And recall that as we left off, we had just introduced loops. 
1512 01:13:58,440 --> 01:14:01,620 And we'd seen a bunch of different ways by which 1513 01:14:01,620 --> 01:14:03,140 we could get, say, a cat to meow. 1514 01:14:03,140 --> 01:14:04,890 Let's actually translate that to some code 1515 01:14:04,890 --> 01:14:08,490 and start to make sense of some of the programs with which we began, 1516 01:14:08,490 --> 01:14:11,452 like creating our own functions, as we did for the speller example 1517 01:14:11,452 --> 01:14:14,410 at the very beginning, and actually do this a little more methodically. 1518 01:14:14,410 --> 01:14:16,270 So let me go over to VS Code here. 1519 01:14:16,270 --> 01:14:20,790 Let me go ahead and create a program called meow.py, instead of meow.c 1520 01:14:20,790 --> 01:14:22,140 as in the past. 1521 01:14:22,140 --> 01:14:25,890 And suffice it to say if you want to implement the idea of a cat, 1522 01:14:25,890 --> 01:14:30,270 we can do better than just saying print("meow"), print("meow"), 1523 01:14:30,270 --> 01:14:31,260 print("meow"). 1524 01:14:31,260 --> 01:14:32,457 This, of course, would work. 1525 01:14:32,457 --> 01:14:35,290 This is correct if the goal is to get the thing to meow three times. 1526 01:14:35,290 --> 01:14:40,710 But when I run python of meow.py, it's going to work as expected, 1527 01:14:40,710 --> 01:14:42,650 but this is just not good design, right? 1528 01:14:42,650 --> 01:14:44,310 We should minimally be using a loop. 1529 01:14:44,310 --> 01:14:47,870 So let me propose that we improve this per the building blocks we've seen. 1530 01:14:47,870 --> 01:14:51,470 And I could say something like, for i in range(3), 1531 01:14:51,470 --> 01:14:54,170 go ahead and print out now, quote unquote, "meow." 1532 01:14:54,170 --> 01:14:58,340 So this is better in the sense that it still prints meow, meow, meow. 
1533 01:14:58,340 --> 01:15:01,490 But if I want to change this to a dog and change the meow to a woof 1534 01:15:01,490 --> 01:15:04,370 or something like that, I can change it in one place and not three 1535 01:15:04,370 --> 01:15:07,290 different places-- so just, in general, better design. 1536 01:15:07,290 --> 01:15:10,460 But what if now, much like in Scratch and in C, 1537 01:15:10,460 --> 01:15:14,270 I wanted to create my own meow() function which did not come with either 1538 01:15:14,270 --> 01:15:15,770 of those languages as well. 1539 01:15:15,770 --> 01:15:18,170 Well, as a teaser at the start of class, we 1540 01:15:18,170 --> 01:15:20,600 saw that you can define your own functions 1541 01:15:20,600 --> 01:15:24,410 with this keyword def, which is a little bit different from how C does it. 1542 01:15:24,410 --> 01:15:29,060 But let me go ahead and do this indeed in Python and define my own function 1543 01:15:29,060 --> 01:15:29,690 meow(). 1544 01:15:29,690 --> 01:15:36,950 So let me go ahead and do def meow(), and then, inside of that function, 1545 01:15:36,950 --> 01:15:41,370 I'm just going to literally do for now, quote unquote, "meow" with print(). 1546 01:15:41,370 --> 01:15:46,910 And now down here, notice, I can actually go ahead and just call meow(). 1547 01:15:46,910 --> 01:15:49,880 And I can go ahead and call meow(), and I can call meow(). 1548 01:15:49,880 --> 01:15:52,370 And this is not the best design at the moment. 1549 01:15:52,370 --> 01:15:56,900 But Python does not constrain me to have to implement a main() function, 1550 01:15:56,900 --> 01:15:58,200 as we've seen thus far. 1551 01:15:58,200 --> 01:16:01,850 But I can define my own helper functions, if you will, 1552 01:16:01,850 --> 01:16:03,590 like a helper function called meow(). 1553 01:16:03,590 --> 01:16:06,350 So let me go ahead and just run this for demonstration's sake 1554 01:16:06,350 --> 01:16:08,120 and run python of meow.py. 
1555 01:16:08,120 --> 01:16:09,380 That does seem to work. 1556 01:16:09,380 --> 01:16:10,610 But this is not good design. 1557 01:16:10,610 --> 01:16:15,800 And let me go ahead and actually do this-- for i in range(3), 1558 01:16:15,800 --> 01:16:18,140 now let me call the meow() function. 1559 01:16:18,140 --> 01:16:19,430 And this, too, should work. 1560 01:16:19,430 --> 01:16:23,480 If I do python of meow.py, there we have meow, meow, meow. 1561 01:16:23,480 --> 01:16:26,840 But I very deliberately did something clever here. 1562 01:16:26,840 --> 01:16:29,060 I defined meow at the top of my file. 1563 01:16:29,060 --> 01:16:31,600 But that's not the best practice because as in C, 1564 01:16:31,600 --> 01:16:34,100 when someone opens the file for the first time, whether you, 1565 01:16:34,100 --> 01:16:38,510 a TF, a TA, a colleague, you'd like to see the main part of the program 1566 01:16:38,510 --> 01:16:42,050 at the top of the file, just because it's easier mentally to dive right in 1567 01:16:42,050 --> 01:16:43,610 and know what this file is doing. 1568 01:16:43,610 --> 01:16:47,420 So let me go ahead and practice what I'm preaching and put the main part 1569 01:16:47,420 --> 01:16:49,670 of my code, even if there's no main() function per se, 1570 01:16:49,670 --> 01:16:51,480 at the top of this file. 1571 01:16:51,480 --> 01:16:53,600 So now I have the loop at the top. 1572 01:16:53,600 --> 01:16:57,710 I'm calling meow() on line 2, and I'm defining meow() on lines 5 and 6. 1573 01:16:57,710 --> 01:17:00,420 Well, instinctively, you can perhaps see where this is going. 1574 01:17:00,420 --> 01:17:02,750 If I run Python of meow.py and hit Enter, 1575 01:17:02,750 --> 01:17:06,570 there's one of those tracebacks that's tracing my error. 1576 01:17:06,570 --> 01:17:11,060 And here, my error is apparently on line 2 in meow.py. 1577 01:17:11,060 --> 01:17:15,120 And you'll notice that, huh, the name 'meow' is not defined. 
1578 01:17:15,120 --> 01:17:18,440 And so previously, we saw a different type of error, a value error. 1579 01:17:18,440 --> 01:17:20,870 Here we're seeing a name error in the sense 1580 01:17:20,870 --> 01:17:23,690 that Python does not recognize the name of this function. 1581 01:17:23,690 --> 01:17:27,630 And intuitively, why might that be, even if the error is a little cryptic? 1582 01:17:27,630 --> 01:17:28,130 Yeah? 1583 01:17:28,130 --> 01:17:29,660 AUDIENCE: [INAUDIBLE] top to bottom. 1584 01:17:29,660 --> 01:17:33,680 DAVID MALAN: Yeah, Python, too-- as fancier as it seems to be than C, 1585 01:17:33,680 --> 01:17:36,810 it still takes things pretty literally, top to bottom, left to right. 1586 01:17:36,810 --> 01:17:40,820 So if you define meow() on line 5, you can't use it on line 2. 1587 01:17:40,820 --> 01:17:43,352 OK, so I could undo this, and I could flip the order. 1588 01:17:43,352 --> 01:17:46,310 But let me just stipulate that as soon as we have a bunch of functions, 1589 01:17:46,310 --> 01:17:49,880 it's probably naive to assume I can just keep putting my functions above, above, 1590 01:17:49,880 --> 01:17:50,510 above, above. 1591 01:17:50,510 --> 01:17:53,810 And honestly, that's going to move all of my main code, so to speak, 1592 01:17:53,810 --> 01:17:57,360 to the bottom of the file, which is sort of counterproductive or less obvious. 1593 01:17:57,360 --> 01:18:01,400 So it turns out in Python, even though you don't need a main() function, 1594 01:18:01,400 --> 01:18:05,160 it's actually quite common to define one nonetheless. 1595 01:18:05,160 --> 01:18:08,850 So what I could do to solve this problem is this. 1596 01:18:08,850 --> 01:18:12,980 Let me go ahead and define a function called main() that takes no arguments, 1597 01:18:12,980 --> 01:18:14,030 in this case. 1598 01:18:14,030 --> 01:18:17,310 Let me indent that same code beneath it. 1599 01:18:17,310 --> 01:18:20,550 And now let me keep meow() defined at the bottom of my file. 
1600 01:18:20,550 --> 01:18:24,170 So if we read this literally, on line 1, I'm defining a function called main(). 1601 01:18:24,170 --> 01:18:27,110 And it will do what is prescribed on lines 2 and 3. 1602 01:18:27,110 --> 01:18:30,050 On line 6, I'm defining a function called meow(), 1603 01:18:30,050 --> 01:18:33,690 and it will do what's prescribed on line 7-- so fairly straightforward, 1604 01:18:33,690 --> 01:18:36,260 even though the keyword def is, of course, new today. 1605 01:18:36,260 --> 01:18:38,870 If I run, though, python of meow.py, you'd 1606 01:18:38,870 --> 01:18:40,370 like to think I'll see three meows. 1607 01:18:40,370 --> 01:18:43,330 But I see nothing. 1608 01:18:43,330 --> 01:18:45,140 I don't see an error, but I see nothing. 1609 01:18:45,140 --> 01:18:45,640 Why? 1610 01:18:45,640 --> 01:18:50,240 Intuitively, what explains the lack of behavior? 1611 01:18:50,240 --> 01:18:51,310 I didn't call main(). 1612 01:18:51,310 --> 01:18:55,300 So this is the thing: even though it's not required in Python to have a main() 1613 01:18:55,300 --> 01:18:59,840 function, but it is conventional in Python to have a main() function, 1614 01:18:59,840 --> 01:19:02,320 you have to call the function yourself. 1615 01:19:02,320 --> 01:19:04,840 It doesn't get magically called as it does in C. 1616 01:19:04,840 --> 01:19:06,730 So this might seem a little stupid-- 1617 01:19:06,730 --> 01:19:09,970 and that's fine-- but it is the convention in Python. 1618 01:19:09,970 --> 01:19:14,050 Generally, the very last line of your file might just be to literally this, 1619 01:19:14,050 --> 01:19:18,310 call main(), because this satisfies the constraint that main() is defined 1620 01:19:18,310 --> 01:19:24,020 on line 1, meow() is defined on line 6, but we don't call anything until line 1621 01:19:24,020 --> 01:19:24,520 10. 1622 01:19:24,520 --> 01:19:26,590 So line 10 says call main(). 1623 01:19:26,590 --> 01:19:28,420 So that means execute this code.
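The layout just described, with main() at the top, meow() below it, and the call to main() as the last line, might look like this. The line positions match the narration (main() on line 1, meow() on line 6, the call on line 10) only if the blank lines are kept as shown, which is an assumption about the on-screen file.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


main()
```

By the time the last line runs, both def statements have already executed, so main() can call meow() even though meow() appears lower in the file.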
1624 01:19:28,420 --> 01:19:32,060 Line 3 says call meow(), which means execute this code. 1625 01:19:32,060 --> 01:19:36,640 So now it all works because the last thing I'm doing is call main(). 1626 01:19:36,640 --> 01:19:38,920 You can think of C as just kind of secretly having 1627 01:19:38,920 --> 01:19:41,380 this line there for you the whole time. 1628 01:19:41,380 --> 01:19:45,210 But now that we have our own functions, notice that we can enhance this 1629 01:19:45,210 --> 01:19:48,900 implementation of meow() to maybe be parameterized and take actually 1630 01:19:48,900 --> 01:19:50,080 an argument itself. 1631 01:19:50,080 --> 01:19:51,510 So let me make a tweak here. 1632 01:19:51,510 --> 01:19:54,270 Just like in C, and just like in Scratch, 1633 01:19:54,270 --> 01:19:58,170 I can actually let meow() meow a specific number of times. 1634 01:19:58,170 --> 01:19:58,980 So let me do this. 1635 01:19:58,980 --> 01:20:01,950 Wouldn't it be nice, instead of having my loop in main(), 1636 01:20:01,950 --> 01:20:05,790 to instead just distill main() into a single line of code and just pass 1637 01:20:05,790 --> 01:20:08,250 in the number of times you want the thing to meow? 1638 01:20:08,250 --> 01:20:11,640 What I could do in meow() here is I have to give it a parameter. 1639 01:20:11,640 --> 01:20:13,140 And I could call it anything I want. 1640 01:20:13,140 --> 01:20:16,000 I'm going to call it n for number, which seems fine. 1641 01:20:16,000 --> 01:20:18,270 And then, in the meow() function, I could do this-- 1642 01:20:18,270 --> 01:20:25,290 for i in range of, not 3, but n now, I can tell range() to give me a range 1643 01:20:25,290 --> 01:20:27,930 that is of variable length based on what n is. 1644 01:20:27,930 --> 01:20:31,380 And then I indent the print() below the loop now. 1645 01:20:31,380 --> 01:20:33,960 And this should now do what I expect, too. 1646 01:20:33,960 --> 01:20:36,750 Let me run python of meow.py. 
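The parameterized version described here can be sketched like this, with the loop moved out of main() and into meow(). The parameter name n comes from the narration; the rest of the layout is assumed.

```python
def main():
    # main() is now a single line: ask for three meows
    meow(3)


def meow(n):
    # range(n) yields n values, so the body runs n times
    for i in range(n):
        print("meow")


main()
```

Changing the 3 in main() to a 5 is the only edit needed to get five meows.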
1647 01:20:36,750 --> 01:20:37,530 Enter. 1648 01:20:37,530 --> 01:20:38,730 And there's 3. 1649 01:20:38,730 --> 01:20:43,000 But if I change the 3 to a 5 and rerun this, python of meow.py, 1650 01:20:43,000 --> 01:20:44,560 now I'm getting five meows. 1651 01:20:44,560 --> 01:20:48,040 So we've just seen a third way how, in Python, now we 1652 01:20:48,040 --> 01:20:52,780 can implement the idea of meowing as its own abstracted function. 1653 01:20:52,780 --> 01:20:54,730 And I can assume now that meow() exists. 1654 01:20:54,730 --> 01:20:57,767 I can now treat it as out of sight, out of mind. 1655 01:20:57,767 --> 01:20:58,600 It's an abstraction. 1656 01:20:58,600 --> 01:21:02,530 And frankly, I could even put it into a library, import it from a file, 1657 01:21:02,530 --> 01:21:07,880 like we've done with CS50, and make it usable by other people as well. 1658 01:21:07,880 --> 01:21:10,720 So the takeaway here, really, though, is that in Python, you 1659 01:21:10,720 --> 01:21:13,750 can, similarly to C, define your own functions. 1660 01:21:13,750 --> 01:21:15,790 But you should understand the slight differences 1661 01:21:15,790 --> 01:21:19,150 as to what gets called automatically for you. 1662 01:21:19,150 --> 01:21:19,900 All right. 1663 01:21:19,900 --> 01:21:22,360 Other differences or similarities with C? 1664 01:21:22,360 --> 01:21:25,930 Well, recall that in C, truncation was an issue. 1665 01:21:25,930 --> 01:21:30,910 Truncation is whereby if you, for instance, divide an int by an int, 1666 01:21:30,910 --> 01:21:34,510 and it's a fractional answer, everything after the decimal point 1667 01:21:34,510 --> 01:21:38,440 gets truncated by default because an int divided by an int in C 1668 01:21:38,440 --> 01:21:39,560 gives you an int. 1669 01:21:39,560 --> 01:21:43,780 And if you can't fit the remainder in that integer, everything at the decimal 1670 01:21:43,780 --> 01:21:44,810 gets cut off. 
1671 01:21:44,810 --> 01:21:45,800 So what does this mean? 1672 01:21:45,800 --> 01:21:48,170 Well, let me actually go back to VS Code here. 1673 01:21:48,170 --> 01:21:52,540 Let me go ahead and open, say, calculator.py again, 1674 01:21:52,540 --> 01:21:54,760 and let's change up what the calculator now does. 1675 01:21:54,760 --> 01:21:55,610 Let me do this. 1676 01:21:55,610 --> 01:22:00,010 Let me define a variable called x, set it equal to the input() function, 1677 01:22:00,010 --> 01:22:01,510 prompting the user for x. 1678 01:22:01,510 --> 01:22:05,710 Let me ask the user for y, let me not repeat past mistakes, 1679 01:22:05,710 --> 01:22:09,160 and let me proactively convert both of these to ints. 1680 01:22:09,160 --> 01:22:13,720 And I'll do it in one pretty one-liner here so that I definitely get x and y. 1681 01:22:13,720 --> 01:22:15,850 And on the honor system, I just won't type cat. 1682 01:22:15,850 --> 01:22:17,980 I won't type dog, even though this program is not 1683 01:22:17,980 --> 01:22:19,917 really complete without error checking. 1684 01:22:19,917 --> 01:22:22,000 Now, let me go ahead and declare a third variable, 1685 01:22:22,000 --> 01:22:26,230 z = x / y, and now let's just go ahead and print out z. 1686 01:22:26,230 --> 01:22:27,700 I don't need a format code. 1687 01:22:27,700 --> 01:22:28,810 I don't need an f string. 1688 01:22:28,810 --> 01:22:32,420 If all you want to do is print a variable, print() is very flexible. 1689 01:22:32,420 --> 01:22:35,200 You can just say print(z), in parentheses. 1690 01:22:35,200 --> 01:22:38,260 Let me run python of calculator.py, hit Enter. 1691 01:22:38,260 --> 01:22:42,040 Let's type in 1 for x, 3 for y. 1692 01:22:42,040 --> 01:22:43,570 I left out a space there. 1693 01:22:43,570 --> 01:22:46,240 And oh, interesting. 1694 01:22:46,240 --> 01:22:48,170 What seems to have happened here? 
1695 01:22:48,170 --> 01:22:52,900 Let me fix my spacing and rerun this again-- python of calculator.py-- so 1, 1696 01:22:52,900 --> 01:22:53,740 3. 1697 01:22:53,740 --> 01:22:55,510 What did not happen? 1698 01:22:55,510 --> 01:22:56,830 AUDIENCE: It doesn't truncate. 1699 01:22:56,830 --> 01:22:57,370 DAVID MALAN: Yeah. 1700 01:22:57,370 --> 01:22:58,400 So it didn't truncate. 1701 01:22:58,400 --> 01:23:00,430 So Python is a little smarter when it comes 1702 01:23:00,430 --> 01:23:02,660 to converting one value to another. 1703 01:23:02,660 --> 01:23:05,050 So an integer divided by an integer, if it ends up 1704 01:23:05,050 --> 01:23:07,780 giving you this fractional component, not to worry now, 1705 01:23:07,780 --> 01:23:11,860 you'll get back what is effectively a float in Python here. 1706 01:23:11,860 --> 01:23:17,050 Well, what else do we want to be mindful of in, say, Python? 1707 01:23:17,050 --> 01:23:20,920 Well, recall that in C, we had this issue of floating point and precision 1708 01:23:20,920 --> 01:23:24,760 whereby if you want to represent a number, like 1/3, and on a piece 1709 01:23:24,760 --> 01:23:27,640 of paper, it's, like, 0.3 with a line over it 1710 01:23:27,640 --> 01:23:29,860 because the 3 infinitely repeats-- 1711 01:23:29,860 --> 01:23:33,040 but we saw a problem in C last time when we actually 1712 01:23:33,040 --> 01:23:34,550 played around with some value. 1713 01:23:34,550 --> 01:23:37,000 So, for instance, let me go back to VS Code here. 1714 01:23:37,000 --> 01:23:40,300 And this is going to be the ugliest syntax I do think we see today. 1715 01:23:40,300 --> 01:23:45,700 But there was a way in C, using %f, to show more than the default number 1716 01:23:45,700 --> 01:23:49,030 of digits after the decimal point, to see more significant digits. 1717 01:23:49,030 --> 01:23:50,830 In Python, there's something similar. 1718 01:23:50,830 --> 01:23:51,970 It just looks very weird. 
1719 01:23:51,970 --> 01:23:53,860 And the way you do it in Python is this. 1720 01:23:53,860 --> 01:23:56,950 You specify that you want an f string, a format string. 1721 01:23:56,950 --> 01:23:59,440 And I'm just going to start and finish my thought first-- 1722 01:23:59,440 --> 01:24:01,270 f before "". 1723 01:24:01,270 --> 01:24:04,910 If you want to print out z, you could literally just do this. 1724 01:24:04,910 --> 01:24:08,620 And so this is just an f string, but you're interpolating z. 1725 01:24:08,620 --> 01:24:12,040 So it doesn't do anything more than it did a moment ago when I literally just 1726 01:24:12,040 --> 01:24:13,090 passed in z. 1727 01:24:13,090 --> 01:24:15,880 But as soon as you have an f string, you can 1728 01:24:15,880 --> 01:24:19,700 configure the variable to print out to a specific number of digits. 1729 01:24:19,700 --> 01:24:24,910 So if you actually want to print out z to, say, 50 decimal points, 1730 01:24:24,910 --> 01:24:28,210 just to see a lot, you can use crazy syntax like this. 1731 01:24:28,210 --> 01:24:31,270 So it's just using the curly braces, as I introduced before. 1732 01:24:31,270 --> 01:24:34,000 But you then use a dot after a colon, and then 1733 01:24:34,000 --> 01:24:37,270 you specify the number of digits that you want and then an f to make clear 1734 01:24:37,270 --> 01:24:37,960 it's a float. 1735 01:24:37,960 --> 01:24:40,877 Honestly, I google this all the time when I don't remember the syntax. 1736 01:24:40,877 --> 01:24:43,470 But the point is the functionality exists. 1737 01:24:43,470 --> 01:24:43,970 All right. 1738 01:24:43,970 --> 01:24:48,320 Let me go down here and rerun python of calculator.py. 1739 01:24:48,320 --> 01:24:52,640 And unfortunately, if I divide 1 by 3, not all of my problems are solved. 1740 01:24:52,640 --> 01:24:56,090 Floating point precision is still a thing. 
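A small sketch of the two calculator experiments above, with x and y fixed at 1 and 3 instead of read from input() so that it runs without typing anything. That substitution is an assumption made for illustration; the lecture version prompts the user.

```python
x = 1
y = 3

# In Python, dividing an int by an int with / gives a float, so nothing is truncated
z = x / y
print(z)

# Inside an f-string, :.50f asks for 50 digits after the decimal point
print(f"{z:.50f}")
```

The second line of output stops being all 3s partway through, which is the floating-point imprecision being described.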
1741 01:24:56,090 --> 01:24:59,150 So be mindful of the fact that there are these limitations 1742 01:24:59,150 --> 01:25:00,860 in the world of Python. 1743 01:25:00,860 --> 01:25:02,240 Floating point precision remains. 1744 01:25:02,240 --> 01:25:04,490 If you want to do even better than that, though, there 1745 01:25:04,490 --> 01:25:07,670 exist a lot more libraries, third-party libraries, 1746 01:25:07,670 --> 01:25:11,630 that can give you much greater precision for scientific purposes, 1747 01:25:11,630 --> 01:25:13,830 financial purposes, or the like. 1748 01:25:13,830 --> 01:25:16,580 But what about another problem from C, integer overflow? 1749 01:25:16,580 --> 01:25:19,370 If you just count too high, recall that you might accidentally 1750 01:25:19,370 --> 01:25:22,880 overflow the capacity of an integer and end up going back to 0, 1751 01:25:22,880 --> 01:25:25,190 or worse, going negative altogether. 1752 01:25:25,190 --> 01:25:28,430 In Python, this problem does not exist. 1753 01:25:28,430 --> 01:25:31,610 In Python, when you have an integer, a.k.a. 1754 01:25:31,610 --> 01:25:34,460 int, even though we haven't needed to use the keyword int, 1755 01:25:34,460 --> 01:25:37,490 it will grow and grow and grow. 1756 01:25:37,490 --> 01:25:41,590 And Python will reserve more and more memory for that integer to fit it. 1757 01:25:41,590 --> 01:25:43,770 So it is not a fixed number of bits. 1758 01:25:43,770 --> 01:25:46,900 So floating point imprecision is still a problem. 1759 01:25:46,900 --> 01:25:51,120 Integer overflow-- not a problem in the latest versions of Python, 1760 01:25:51,120 --> 01:25:53,250 so a difference worth knowing. 1761 01:25:53,250 --> 01:25:56,778 But what about other features of Python that we didn't have in C? 1762 01:25:56,778 --> 01:25:59,820 Well, let's actually revisit one of those tracebacks, one of those errors 1763 01:25:59,820 --> 01:26:03,130 I ran into earlier, to see how we might actually solve it.
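On the integer overflow point, a quick sketch shows a Python int growing past the size a C int would allow. The 32-bit limit used here is an assumption about the C environment, not something stated in this part of the lecture.

```python
# Largest value a 32-bit signed int can hold in C (assumed size)
n = 2 ** 31 - 1

# Adding 1 would overflow a 32-bit int in C; in Python the int just gets bigger
n = n + 1
print(n)

# Far larger values work the same way
big = 2 ** 100
print(big)
```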
1764 01:26:03,130 --> 01:26:05,250 So let me go back to VS Code here. 1765 01:26:05,250 --> 01:26:07,870 And just for fun, let me go ahead and do this. 1766 01:26:07,870 --> 01:26:09,070 Let me clear my terminal. 1767 01:26:09,070 --> 01:26:12,248 And let me change my calculator to actually have a get_int() function. 1768 01:26:12,248 --> 01:26:14,040 We've seen how to define our own functions. 1769 01:26:14,040 --> 01:26:15,930 Let me not bother with the CS50 library. 1770 01:26:15,930 --> 01:26:18,790 Let me just invent my own get_int() function as follows. 1771 01:26:18,790 --> 01:26:22,590 So def get_int(), and just like the CS50 function, 1772 01:26:22,590 --> 01:26:26,250 I'm going to have get_int() take a prompt, a string to show the user to ask them 1773 01:26:26,250 --> 01:26:27,150 for an integer. 1774 01:26:27,150 --> 01:26:31,410 And now I'm going to go ahead and return the return value of input(), 1775 01:26:31,410 --> 01:26:33,780 passing that same prompt to input()-- because input(), 1776 01:26:33,780 --> 01:26:37,330 just like get_string(), shows the user a string of text. 1777 01:26:37,330 --> 01:26:40,930 But I do want to convert this thing here to an int. 1778 01:26:40,930 --> 01:26:44,730 So this is just a one-liner, really, of an implementation of get_int(). 1779 01:26:44,730 --> 01:26:49,050 So this is kind of like what CS50 did in its Python library, but not quite. 1780 01:26:49,050 --> 01:26:49,590 Why? 1781 01:26:49,590 --> 01:26:51,010 Because there's a problem with it. 1782 01:26:51,010 --> 01:26:51,760 So let me do this. 1783 01:26:51,760 --> 01:26:54,030 Let me define a main() function just by convention. 1784 01:26:54,030 --> 01:26:57,780 Let me use this implementation of get_int() to ask the user for x. 1785 01:26:57,780 --> 01:27:01,170 Let me use this get_int() function to prompt the user for y. 1786 01:27:01,170 --> 01:27:05,190 And then let me do something simple like print out x + y.
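The two functions described so far might be sketched like this. The prompt strings "x: " and "y: " are assumptions, and the order of the two functions in the file is a guess; only the function bodies come from the narration.

```python
def get_int(prompt):
    # One-liner: show the prompt, read a line of text, convert it to an int.
    # If the text is not a number, int() raises ValueError.
    return int(input(prompt))


def main():
    x = get_int("x: ")
    y = get_int("y: ")
    print(x + y)
```

As written, this only defines the functions. Nothing runs until main() is called.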
1787 01:27:05,190 --> 01:27:08,340 And then, very last thing, I have to call main(). 1788 01:27:08,340 --> 01:27:10,470 And this is a minor point, but I'm deliberately 1789 01:27:10,470 --> 01:27:13,770 putting multiple blank lines between my functions. 1790 01:27:13,770 --> 01:27:14,970 This too is Pythonic. 1791 01:27:14,970 --> 01:27:17,590 It's a matter of style. style50 will help you with this. 1792 01:27:17,590 --> 01:27:21,630 It's just meant for larger files to really make your functions stand out 1793 01:27:21,630 --> 01:27:24,400 and be a little more separated visually from others. 1794 01:27:24,400 --> 01:27:24,900 All right. 1795 01:27:24,900 --> 01:27:27,720 Let me go ahead and run Python of calculator.py. 1796 01:27:27,720 --> 01:27:28,710 Enter. 1797 01:27:28,710 --> 01:27:29,790 Let me type in 1. 1798 01:27:29,790 --> 01:27:31,020 Let me type in 3. 1799 01:27:31,020 --> 01:27:32,400 And that actually works. 1800 01:27:32,400 --> 01:27:33,840 1 plus 3 is 4. 1801 01:27:33,840 --> 01:27:35,130 Let me do the more obvious. 1802 01:27:35,130 --> 01:27:37,200 1 plus 2 gives me 3. 1803 01:27:37,200 --> 01:27:41,310 So the calculator is in fact working until such time as I, the human, 1804 01:27:41,310 --> 01:27:44,410 don't cooperate and type in something like cat for x. 1805 01:27:44,410 --> 01:27:47,490 Then we get that same traceback as before, 1806 01:27:47,490 --> 01:27:49,390 but I'm seeing it now in this file. 1807 01:27:49,390 --> 01:27:51,790 And let me zoom in on my terminal just to make clear. 1808 01:27:51,790 --> 01:27:55,920 We don't need to see the old history there. 1809 01:27:55,920 --> 01:28:00,390 Let me type in cat, Enter, and you'll see the same traceback. 1810 01:28:00,390 --> 01:28:03,150 And you'll see that, OK, here's where now there's 1811 01:28:03,150 --> 01:28:04,450 multiple functions involved. 1812 01:28:04,450 --> 01:28:05,430 So what's going on? 1813 01:28:05,430 --> 01:28:08,550 The first problem is at line 12 in main(). 
1814 01:28:08,550 --> 01:28:12,410 But that's not actually the problem because main() calls my get_int() 1815 01:28:12,410 --> 01:28:12,910 function. 1816 01:28:12,910 --> 01:28:17,410 So on line 6 of calculator.py, this is really the issue-- 1817 01:28:17,410 --> 01:28:21,330 so, again, it's tracing everything that just happened from top to bottom here-- 1818 01:28:21,330 --> 01:28:25,440 and value error-- invalid literal for int() with base 10, 1819 01:28:25,440 --> 01:28:30,780 'cat,' which is to say, like before, cat is not an integer in base 10 or any 1820 01:28:30,780 --> 01:28:31,350 other base. 1821 01:28:31,350 --> 01:28:34,090 It just cannot be converted to an integer. 1822 01:28:34,090 --> 01:28:38,080 So how do you fix this, or, really, how does the CS50 library fix this? 1823 01:28:38,080 --> 01:28:40,180 You won't have to write much code like this. 1824 01:28:40,180 --> 01:28:44,220 But it turns out that Python supports what are called exceptions. 1825 01:28:44,220 --> 01:28:47,310 And generally, an exception is a better way 1826 01:28:47,310 --> 01:28:50,610 of handling certain types of errors because in C, recall 1827 01:28:50,610 --> 01:28:53,160 that the only way we could really handle errors 1828 01:28:53,160 --> 01:28:56,310 is by having functions return special values. 1829 01:28:56,310 --> 01:29:00,330 malloc() could return null, which means it ran out of memory. 1830 01:29:00,330 --> 01:29:01,560 Something went wrong. 1831 01:29:01,560 --> 01:29:06,270 Some functions we wrote in C could return 1, could return 2, 1832 01:29:06,270 --> 01:29:07,380 could return negative 1. 1833 01:29:07,380 --> 01:29:10,260 Recall that we could write our own functions that return values 1834 01:29:10,260 --> 01:29:12,180 to indicate something went wrong. 
1835 01:29:12,180 --> 01:29:15,690 But the problem in C is that if you're stealing certain values, 1836 01:29:15,690 --> 01:29:23,280 be it null or 1 or 2 or 3, your function can never return null or 1 or 2 or 3 1837 01:29:23,280 --> 01:29:24,640 as actual values. 1838 01:29:24,640 --> 01:29:25,140 Why? 1839 01:29:25,140 --> 01:29:27,598 Because other people are going to interpret them as errors. 1840 01:29:27,598 --> 01:29:30,960 So you kind of have to use up some of your possible return values 1841 01:29:30,960 --> 01:29:35,310 in a language like C and treat them specially as errors. 1842 01:29:35,310 --> 01:29:37,380 In Python and other languages-- 1843 01:29:37,380 --> 01:29:39,280 Java and others-- you don't have to do that. 1844 01:29:39,280 --> 01:29:43,260 You can instead have more out of band error handling, known as exceptions. 1845 01:29:43,260 --> 01:29:44,760 And that's what's happening here. 1846 01:29:44,760 --> 01:29:49,880 When I run calculator.py and I type in cat, what I'm seeing here 1847 01:29:49,880 --> 01:29:51,810 is actually an exception. 1848 01:29:51,810 --> 01:29:54,650 It's something exceptional, but not in a good way. 1849 01:29:54,650 --> 01:29:57,500 This exception means this was not supposed to happen. 1850 01:29:57,500 --> 01:30:00,663 The type of exception happens to be called a value error. 1851 01:30:00,663 --> 01:30:03,830 And within the world of Python, there's this whole taxonomy, that is to say, 1852 01:30:03,830 --> 01:30:05,570 a whole list of possible exceptions. 1853 01:30:05,570 --> 01:30:07,370 ValueError is one of the most common. 1854 01:30:07,370 --> 01:30:09,980 We saw another one before, name error, when I said 1855 01:30:09,980 --> 01:30:12,560 meow when Python didn't know what meow meant. 1856 01:30:12,560 --> 01:30:14,630 So this is just an example of an exception. 
1857 01:30:14,630 --> 01:30:18,320 But what this means is that there is a way for me to try to handle this 1858 01:30:18,320 --> 01:30:19,130 myself. 1859 01:30:19,130 --> 01:30:21,170 So I'm actually going to go ahead and do this. 1860 01:30:21,170 --> 01:30:26,420 Instead of get_int() simply blindly returning the integer conversion 1861 01:30:26,420 --> 01:30:31,430 of whatever input the user gives me, I'm going to instead literally try to do 1862 01:30:31,430 --> 01:30:32,880 this instead. 1863 01:30:32,880 --> 01:30:34,600 So it's kind of an aptly named phrase. 1864 01:30:34,600 --> 01:30:35,600 It literally means that. 1865 01:30:35,600 --> 01:30:39,710 Please try to do this, except if something goes wrong, 1866 01:30:39,710 --> 01:30:45,030 except if there is a ValueError, in which case 1867 01:30:45,030 --> 01:30:47,340 I want Python to do something else, for instance, 1868 01:30:47,340 --> 01:30:50,380 quote unquote, "Not an integer." 1869 01:30:50,380 --> 01:30:51,550 So what does this mean? 1870 01:30:51,550 --> 01:30:53,130 It's a little weird, the syntax. 1871 01:30:53,130 --> 01:30:57,790 But in the get_int() function, Python will first try to do the following. 1872 01:30:57,790 --> 01:31:00,090 It will try to get an input from the user. 1873 01:31:00,090 --> 01:31:01,780 It will try to convert it to an integer. 1874 01:31:01,780 --> 01:31:03,240 And it will try to return it. 1875 01:31:03,240 --> 01:31:07,590 But if one of those operations fails, namely the integer step in this case, 1876 01:31:07,590 --> 01:31:09,840 then an exception could happen. 1877 01:31:09,840 --> 01:31:11,770 And you might get what's called a ValueError. 1878 01:31:11,770 --> 01:31:11,940 Why? 1879 01:31:11,940 --> 01:31:14,190 Because the documentation tells you that might happen. 1880 01:31:14,190 --> 01:31:16,170 Or, in my case, I experienced it firsthand, 1881 01:31:16,170 --> 01:31:20,140 and now I want to catch this kind of exception in my own code.
1882 01:31:20,140 --> 01:31:22,200 So if there is a ValueError, I'm not going 1883 01:31:22,200 --> 01:31:24,030 to see that crazy traceback anymore. 1884 01:31:24,030 --> 01:31:28,080 I'm instead going to see, quote unquote, "Not an integer." 1885 01:31:28,080 --> 01:31:31,080 But what the CS50 library does for you technically is it 1886 01:31:31,080 --> 01:31:33,510 lets you try again and again and again. 1887 01:31:33,510 --> 01:31:36,178 Recall in the past, if I type in cat and dog and bird, 1888 01:31:36,178 --> 01:31:38,220 it's just going to keep asking me again and again 1889 01:31:38,220 --> 01:31:39,810 until I actually give it an int. 1890 01:31:39,810 --> 01:31:43,620 So that kind of implies that we really need a loop inside of this function. 1891 01:31:43,620 --> 01:31:45,810 And the easiest way to do something forever 1892 01:31:45,810 --> 01:31:50,820 is to loop while true, just like in C, but a capital T in Python. 1893 01:31:50,820 --> 01:31:54,810 And what I'm going to do now is implement a better version of get_int() 1894 01:31:54,810 --> 01:31:57,030 here because what's it going to do? 1895 01:31:57,030 --> 01:31:59,910 It is going to try-- it's going to do this forever. 1896 01:31:59,910 --> 01:32:03,840 It's going to try to get an input, convert it to an int, and return it. 1897 01:32:03,840 --> 01:32:07,590 And just like break breaks you out of a loop, 1898 01:32:07,590 --> 01:32:10,650 return also breaks you out of a loop as well, right? 1899 01:32:10,650 --> 01:32:14,010 Because once you've returned, there's no more need for this function to execute. 1900 01:32:14,010 --> 01:32:17,370 So long story short, you won't have to write much code like this yourself. 1901 01:32:17,370 --> 01:32:23,100 But this is essentially what the CS50 library is doing when it implements 1902 01:32:23,100 --> 01:32:24,960 the Python version of get_int(). 1903 01:32:24,960 --> 01:32:26,310 So what happens now? 
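The looping version of get_int() being described might look like the sketch below. It is reconstructed from the narration and is not the CS50 library's actual source.

```python
def get_int(prompt):
    # Loop forever; the return statement is the only way out
    while True:
        try:
            # Try to read text and convert it; on success, return ends the loop
            return int(input(prompt))
        except ValueError:
            # int() could not convert the text, so complain and loop again
            print("Not an integer")
```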
1904 01:32:26,310 --> 01:32:31,028 If I run python of calculator.py, and I type in cat, I get yelled at, 1905 01:32:31,028 --> 01:32:32,820 but I'm prompted again because of the loop. 1906 01:32:32,820 --> 01:32:33,697 I type in dog. 1907 01:32:33,697 --> 01:32:35,280 I'm yelled at, but I'm prompted again. 1908 01:32:35,280 --> 01:32:38,190 I type in bird, yelled at, but I'm prompted again. 1909 01:32:38,190 --> 01:32:41,970 If I type in 1, then I type in 2, now it proceeds 1910 01:32:41,970 --> 01:32:46,470 because it tried and succeeded this time as opposed to trying and failing 1911 01:32:46,470 --> 01:32:47,130 last time. 1912 01:32:47,130 --> 01:32:49,770 And technically, the CS50 library doesn't actually 1913 01:32:49,770 --> 01:32:51,580 yell at you with "Not an integer." 1914 01:32:51,580 --> 01:32:54,750 So technically, if you want to handle the error, that is to say, 1915 01:32:54,750 --> 01:32:58,440 catch the exception, you can actually just say, oh, pass, 1916 01:32:58,440 --> 01:33:02,290 and it will just silently try again and again. 1917 01:33:02,290 --> 01:33:06,670 So let me go ahead and run this. python of calculator.py works almost the same. 1918 01:33:06,670 --> 01:33:09,390 But notice now it works just like the C version. 1919 01:33:09,390 --> 01:33:13,290 It doesn't yell at you, but it does prompt you again and again and again. 1920 01:33:13,290 --> 01:33:16,120 But I'll do 1 and 2, and that now is satisfied. 1921 01:33:16,120 --> 01:33:18,450 So that then is exceptions which you'll encounter, 1922 01:33:18,450 --> 01:33:21,930 but you yourself won't have to write much code along those lines. 1923 01:33:21,930 --> 01:33:23,650 Well, what else can we now do? 1924 01:33:23,650 --> 01:33:25,860 Well, let's revisit something like this for Mario, 1925 01:33:25,860 --> 01:33:29,400 recall, whereby we had this two-dimensional world with things 1926 01:33:29,400 --> 01:33:33,030 in the way for Mario, like this column of three bricks. 
1927 01:33:33,030 --> 01:33:35,610 Let me actually play around now for a moment with some loops 1928 01:33:35,610 --> 01:33:38,040 just to see how there's different ways that might actually 1929 01:33:38,040 --> 01:33:41,410 resonate with you just in terms of the simplicity of some of these things. 1930 01:33:41,410 --> 01:33:44,310 Let me go ahead and create a program called mario.py. 1931 01:33:44,310 --> 01:33:47,970 And suppose that I want to print a column of three bricks. 1932 01:33:47,970 --> 01:33:50,580 It kind of doesn't get any easier than this in Python. 1933 01:33:50,580 --> 01:33:55,800 So for i in range(3), just go ahead and print out a single hash-- 1934 01:33:55,800 --> 01:33:57,150 done. 1935 01:33:57,150 --> 01:34:00,600 That then is what took us more lines of code in the past. 1936 01:34:00,600 --> 01:34:03,755 But if I run mario.py, that there gets the job done. 1937 01:34:03,755 --> 01:34:05,880 I could change the i to an underscore, but it's not 1938 01:34:05,880 --> 01:34:09,660 bad to remind myself that i is what's really doing my counting. 1939 01:34:09,660 --> 01:34:11,970 Well, what else could we do beyond this? 1940 01:34:11,970 --> 01:34:17,310 Well, recall that in the world of Mario, we prompted the user, actually, 1941 01:34:17,310 --> 01:34:18,660 for a specific height. 1942 01:34:18,660 --> 01:34:20,640 We didn't just always hardcode 3. 1943 01:34:20,640 --> 01:34:23,680 So I could actually do something like this. 1944 01:34:23,680 --> 01:34:28,860 Let me actually open up from today's code that I came with in advance 1945 01:34:28,860 --> 01:34:31,830 and pull up this C version of Mario. 1946 01:34:31,830 --> 01:34:34,260 So this was from some time ago, in week 1.
1947 01:34:34,260 --> 01:34:38,250 And this is how we implemented a loop that 1948 01:34:38,250 --> 01:34:43,400 ensures that we get a positive integer from the user by just doing while 1949 01:34:43,400 --> 01:34:45,890 n is not positive, and then we use this for loop 1950 01:34:45,890 --> 01:34:47,850 to actually print out that many hashes. 1951 01:34:47,850 --> 01:34:50,250 Now, in Python, it's actually going to be pretty similar, 1952 01:34:50,250 --> 01:34:53,900 except for the fact that in Python, there is no do while loop. 1953 01:34:53,900 --> 01:34:55,670 But recall that a do while loop was useful 1954 01:34:55,670 --> 01:34:59,215 because it means you can get the user to try something and then maybe try again, 1955 01:34:59,215 --> 01:35:00,590 maybe try again, maybe try again. 1956 01:35:00,590 --> 01:35:02,970 So it's really good for user input. 1957 01:35:02,970 --> 01:35:04,220 So let's actually do this. 1958 01:35:04,220 --> 01:35:07,580 Let me borrow the CS50 library's get_int() function, 1959 01:35:07,580 --> 01:35:10,940 just so we don't have to re-implement that ourselves again and again. 1960 01:35:10,940 --> 01:35:14,300 Let me, in Python, do this the Pythonic way. 1961 01:35:14,300 --> 01:35:17,630 In Python, if you want to prompt the user to do something again and again 1962 01:35:17,630 --> 01:35:20,780 and again, potentially, you deliberately, by convention, 1963 01:35:20,780 --> 01:35:22,200 induce an infinite loop. 1964 01:35:22,200 --> 01:35:24,200 You just get yourself into an infinite loop. 1965 01:35:24,200 --> 01:35:27,260 But the goal is going to be try something, try something, try 1966 01:35:27,260 --> 01:35:30,230 something, and as soon as you have what you want, break out of the loop 1967 01:35:30,230 --> 01:35:31,050 instead. 1968 01:35:31,050 --> 01:35:34,370 So we're implementing the idea of a do while loop ourselves. 1969 01:35:34,370 --> 01:35:37,940 So I'm going to do this.
n, for number, equals get_int(), 1970 01:35:37,940 --> 01:35:40,220 and let's ask the user for a height. 1971 01:35:40,220 --> 01:35:42,020 Then let's just check. 1972 01:35:42,020 --> 01:35:44,430 If n is greater than 0, you know what? 1973 01:35:44,430 --> 01:35:44,930 Break. 1974 01:35:44,930 --> 01:35:46,400 We've got the value we need. 1975 01:35:46,400 --> 01:35:50,420 And if not, it's just going to implicitly keep looping again and again 1976 01:35:50,420 --> 01:35:51,020 and again. 1977 01:35:51,020 --> 01:35:53,690 So in Python, this is to say-- super common-- 1978 01:35:53,690 --> 01:35:57,770 to deliberately induce an infinite loop and break out of it when you have 1979 01:35:57,770 --> 01:35:58,740 what you want. 1980 01:35:58,740 --> 01:35:59,240 All right? 1981 01:35:59,240 --> 01:36:03,770 Now I can just do the same kind of code as before. for i in range not of-- 1982 01:36:03,770 --> 01:36:07,580 rage sometimes-- for i in range, not 3, but n, 1983 01:36:07,580 --> 01:36:10,640 now I can go ahead and print out-- 1984 01:36:10,640 --> 01:36:12,432 oops-- a hash like this. 1985 01:36:12,432 --> 01:36:15,140 If I open my terminal window, it's going to work almost the same, 1986 01:36:15,140 --> 01:36:17,400 but now mario is going to prompt me for the height. 1987 01:36:17,400 --> 01:36:20,690 So I could type in 3, or I could type in 4, 1988 01:36:20,690 --> 01:36:25,910 or I could be uncooperative and type in 0 or negative 1 or even cat. 1989 01:36:25,910 --> 01:36:29,510 And because I'm using the CS50 library, cat is ignored. 1990 01:36:29,510 --> 01:36:32,420 Because I'm using my while loop and breaking out 1991 01:36:32,420 --> 01:36:37,520 of it only when n is positive, I'm also ignoring the 0 and the negative 1. 1992 01:36:37,520 --> 01:36:43,060 So, again, this would be a Pythonic way of implementing this particular idea. 
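The infinite-loop-and-break idiom just described might look roughly like this. It is a sketch under assumptions: the `get_int` argument stands in for the CS50 library's function of the same name, and it is passed in as a parameter here (a hypothetical arrangement) so the loop can be tried without the library installed.

```python
def get_height(get_int):
    """Ask for a height until the answer is a positive integer."""
    while True:
        n = get_int("Height: ")
        if n > 0:
            # We have the value we need, so leave the loop
            break
    return n


def print_column(n):
    """Print a column of n bricks, one hash per line."""
    for i in range(n):
        print("#")


# With the CS50 library available, this could be used as:
#   from cs50 import get_int
#   print_column(get_height(get_int))
```

There is no condition at the top of the loop; the `if` and `break` at the bottom play the role that `while (n < 1)` played at the end of the do while loop in C.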
1993 01:36:43,060 --> 01:36:49,630 If I want to maybe enhance this a bit further, let me propose that, 1994 01:36:49,630 --> 01:36:55,450 for instance, we consider something like the two-dimensional version-- 1995 01:36:55,450 --> 01:36:58,490 or the horizontal version of this instead. 1996 01:36:58,490 --> 01:37:00,610 So recall that some time ago, we printed out, 1997 01:37:00,610 --> 01:37:03,340 like, four question marks in the sky that might have 1998 01:37:03,340 --> 01:37:05,090 looked a little something like this. 1999 01:37:05,090 --> 01:37:09,740 Now, the very mechanical way to do this would be as follows. 2000 01:37:09,740 --> 01:37:11,200 Let me close my C code. 2001 01:37:11,200 --> 01:37:12,770 Let me clear my terminal. 2002 01:37:12,770 --> 01:37:16,060 And let me just delete my old mario version here. 2003 01:37:16,060 --> 01:37:20,800 And let's just do this-- for i in range(4), let's go ahead 2004 01:37:20,800 --> 01:37:23,620 and print out a question mark, all right? 2005 01:37:23,620 --> 01:37:26,140 I'm going to run python of mario.py, enter, 2006 01:37:26,140 --> 01:37:30,430 and, ugh, it's still a column instead of a row. 2007 01:37:30,430 --> 01:37:34,245 But what's the fix here, perhaps? 2008 01:37:34,245 --> 01:37:34,870 What's the fix? 2009 01:37:34,870 --> 01:37:35,770 Yeah? 2010 01:37:35,770 --> 01:37:37,270 AUDIENCE: The end equals [INAUDIBLE] 2011 01:37:37,270 --> 01:37:38,020 DAVID MALAN: Yeah. 2012 01:37:38,020 --> 01:37:41,350 We can use that named parameter and say end="" 2013 01:37:41,350 --> 01:37:43,720 to just suppress the default backslash n. 2014 01:37:43,720 --> 01:37:46,660 But let's give ourselves one at the very end of the loop 2015 01:37:46,660 --> 01:37:48,500 just to move the cursor correctly. 2016 01:37:48,500 --> 01:37:50,890 So now if I run python of mario.py, now it 2017 01:37:50,890 --> 01:37:53,780 looks like what it might have in the sky here. 
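The fix just described, suppressing the default newline with the `end` parameter and printing one newline after the loop, can be sketched as below. The function name `print_row` and its `symbol` parameter are made up for this illustration.

```python
def print_row(width, symbol="?"):
    """Print `width` copies of `symbol` on a single line."""
    for i in range(width):
        # end="" replaces the default "\n" that print() adds
        print(symbol, end="")
    # One bare print() moves the cursor to the next line
    print()


print_row(4)
```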
2018 01:37:53,780 --> 01:37:57,400 But it turns out Python has some neat features, too, more syntactic sugar, 2019 01:37:57,400 --> 01:37:59,590 if you will, for doing things a little more easily. 2020 01:37:59,590 --> 01:38:01,850 It turns out in Python, you could also do this. 2021 01:38:01,850 --> 01:38:03,430 You could just say print("?" * 4). 2022 01:38:03,430 --> 01:38:06,490 2023 01:38:06,490 --> 01:38:11,230 And just like + means concatenation, * here means, 2024 01:38:11,230 --> 01:38:14,170 really, multiply the string by itself that many times, so sort 2025 01:38:14,170 --> 01:38:16,610 of automatically concatenate it with itself. 2026 01:38:16,610 --> 01:38:19,327 So if I run python of mario.py, this too works-- so, 2027 01:38:19,327 --> 01:38:21,910 again, just some features of Python that make it a little more 2028 01:38:21,910 --> 01:38:25,660 pleasant to use so you don't always have to slog through implementing 2029 01:38:25,660 --> 01:38:27,710 a loop or something along those lines. 2030 01:38:27,710 --> 01:38:29,710 Well, what about something more two-dimensional, 2031 01:38:29,710 --> 01:38:32,800 like in the world of this brick here? 2032 01:38:32,800 --> 01:38:36,040 Well, in the context of this sort of grid of bricks, 2033 01:38:36,040 --> 01:38:38,450 we might do something like this in VS Code. 2034 01:38:38,450 --> 01:38:43,700 Let me go back to mario.py, and let me do a 3-by-3 grid for that block, 2035 01:38:43,700 --> 01:38:44,840 like we did in week 1. 2036 01:38:44,840 --> 01:38:47,870 So for i in range(3)-- 2037 01:38:47,870 --> 01:38:52,340 I can nest loops, just like in C-- for j in range(3), 2038 01:38:52,340 --> 01:38:55,190 I can then print out a hash here. 2039 01:38:55,190 --> 01:38:58,370 And then let's leave this alone even though it's not quite right yet. 2040 01:38:58,370 --> 01:39:00,620 Let's do python of mario.py. 
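The string-multiplication shortcut mentioned above can be checked with a small sketch like this one. The helper name `row` is hypothetical and exists only to make the idea easy to reuse.

```python
def row(width, symbol="?"):
    """Build a row by repeating `symbol` `width` times."""
    # A str times an int concatenates the string with itself
    return symbol * width


print(row(4))
```

Because `*` builds the whole string first, a single `print()` call supplies the one trailing newline and no loop is needed.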
2041 01:39:00,620 --> 01:39:04,700 OK, it's, like, nine bricks all in a column, so your mind might 2042 01:39:04,700 --> 01:39:06,770 wander to the end parameter again. 2043 01:39:06,770 --> 01:39:10,310 So, yeah, let's fix this-- end="", but at the end of that loop, 2044 01:39:10,310 --> 01:39:11,910 let's just print out a new line. 2045 01:39:11,910 --> 01:39:16,020 So this logically is the same as it was in C. But in this case, 2046 01:39:16,020 --> 01:39:19,730 I'm now doing it in Python, just a little more easily, without i++, 2047 01:39:19,730 --> 01:39:26,120 without a conditional, I'm just relying on this for i in syntax using range(). 2048 01:39:26,120 --> 01:39:28,080 I can tighten this up further, frankly. 2049 01:39:28,080 --> 01:39:30,840 If I already have the outer loop, I could do something like this. 2050 01:39:30,840 --> 01:39:34,280 I could print out a single hash times 3. 2051 01:39:34,280 --> 01:39:37,610 And now if I run python of mario.py, that works, too. 2052 01:39:37,610 --> 01:39:40,710 So I can combine these ideas in interesting ways as well. 2053 01:39:40,710 --> 01:39:44,550 The goal is simply to seed you with some of these building blocks. 2054 01:39:44,550 --> 01:39:45,050 All right. 2055 01:39:45,050 --> 01:39:48,540 How about code that was maybe a little more logical in nature? 2056 01:39:48,540 --> 01:39:52,640 Well, in Python, we indeed have some other features as well, namely lists. 2057 01:39:52,640 --> 01:39:54,950 And lists are denoted by those square brackets, 2058 01:39:54,950 --> 01:39:56,570 reminiscent of the world of arrays. 2059 01:39:56,570 --> 01:39:59,270 But in Python, what's really nice about lists 2060 01:39:59,270 --> 01:40:02,510 is that their memory is automatically handled for you. 2061 01:40:02,510 --> 01:40:05,930 An array is about having values contiguously in memory. 2062 01:40:05,930 --> 01:40:09,440 In Python, a list is more like a linked list.
2063 01:40:09,440 --> 01:40:12,770 It will allocate memory for you and grow and shrink these things. 2064 01:40:12,770 --> 01:40:14,990 And you do not have to know about pointers. 2065 01:40:14,990 --> 01:40:16,520 You do not have to know about nodes. 2066 01:40:16,520 --> 01:40:18,830 You do not have to implement linked lists yourself. 2067 01:40:18,830 --> 01:40:22,250 You just get list as a data type in Python itself. 2068 01:40:22,250 --> 01:40:25,490 Here, for instance, is some of the documentation for lists specifically. 2069 01:40:25,490 --> 01:40:29,210 And in particular, lists also, like strings, or strs, 2070 01:40:29,210 --> 01:40:31,970 have methods, functions that come with them, 2071 01:40:31,970 --> 01:40:34,710 that just make it easy to do certain things. 2072 01:40:34,710 --> 01:40:40,100 So, for instance, if I wanted to maybe do something like taking averages 2073 01:40:40,100 --> 01:40:43,850 of scores, like we did some time ago, we can do that using a combination 2074 01:40:43,850 --> 01:40:47,550 of lists and the function called len(), which I alluded to earlier, 2075 01:40:47,550 --> 01:40:49,550 which will tell you the length of a list. 2076 01:40:49,550 --> 01:40:50,705 Now, how might we do this? 2077 01:40:50,705 --> 01:40:52,580 Well, if we read the documentation for len(), 2078 01:40:52,580 --> 01:40:55,455 it turns out there's other functions there too that might be helpful. 2079 01:40:55,455 --> 01:40:57,240 So let me go back to VS Code here. 2080 01:40:57,240 --> 01:40:59,030 Let me close mario.py. 2081 01:40:59,030 --> 01:41:02,360 And let me open a file called scores.py, reminiscent of something 2082 01:41:02,360 --> 01:41:03,800 we did weeks ago, too. 2083 01:41:03,800 --> 01:41:06,090 Let me go ahead and, just for demonstration's sake, 2084 01:41:06,090 --> 01:41:09,920 give myself a variable called scores that has my three test scores 2085 01:41:09,920 --> 01:41:11,480 or whatnot from weeks ago. 
2086 01:41:11,480 --> 01:41:15,890 So I'm using square brackets, not curly braces, as in C. This is a linked list, 2087 01:41:15,890 --> 01:41:17,780 or a list in Python. 2088 01:41:17,780 --> 01:41:20,460 And let me get the average of these values. 2089 01:41:20,460 --> 01:41:24,380 Well, I could do this-- average =, and it turns out in Python, 2090 01:41:24,380 --> 01:41:26,300 you just get a lot of functionality for free. 2091 01:41:26,300 --> 01:41:29,600 And those functions sometimes take not single arguments, 2092 01:41:29,600 --> 01:41:31,740 but lists as their arguments. 2093 01:41:31,740 --> 01:41:35,330 So, for instance, I can use Python's built-in sum() function and pass 2094 01:41:35,330 --> 01:41:36,230 in those scores. 2095 01:41:36,230 --> 01:41:41,130 I can then divide that sum by the length of the scores list as well. 2096 01:41:41,130 --> 01:41:44,670 So length of a list just tells you how many things are in it. 2097 01:41:44,670 --> 01:41:51,620 So this is like doing magically 72 plus 73 plus 33, all divided by 3 in total. 2098 01:41:51,620 --> 01:41:54,840 If I want to now do the math out, I can print the result. 2099 01:41:54,840 --> 01:41:59,630 So I can print out, using an f string and maybe some prefix text here. 2100 01:41:59,630 --> 01:42:02,520 Let's print out that average here. 2101 01:42:02,520 --> 01:42:06,170 So let me do python of scores.py, enter, and there 2102 01:42:06,170 --> 01:42:08,280 is the average, slightly imprecisely. 2103 01:42:08,280 --> 01:42:10,280 But at that point, I'm not doing so well anyway. 2104 01:42:10,280 --> 01:42:11,100 So that's fine. 2105 01:42:11,100 --> 01:42:17,840 So at this point, we've seen that we have sort of more functionality than C. 2106 01:42:17,840 --> 01:42:20,490 In C, how would we have computed the average weeks ago? 2107 01:42:20,490 --> 01:42:22,170 I mean, we literally created a variable. 2108 01:42:22,170 --> 01:42:22,970 We then had a loop. 
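The scores.py program described above, as best it can be reconstructed from the narration, might look like this. The exact prefix text in the f-string is an assumption, and wrapping the arithmetic in an `average` function is a choice made here for reuse rather than something shown on screen.

```python
def average(scores):
    """Return the mean of a list of numbers using built-in sum() and len()."""
    return sum(scores) / len(scores)


scores = [72, 73, 33]
print(f"Average: {average(scores)}")
```

In Python 3, `/` always performs floating-point division, which is why the result carries a long fractional part.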
2109 01:42:22,970 --> 01:42:24,278 We iterated over the array. 2110 01:42:24,278 --> 01:42:25,320 We added things together. 2111 01:42:25,320 --> 01:42:26,930 It was just so much more work. 2112 01:42:26,930 --> 01:42:30,920 It's nice when you have a language that comes with functions, among them len(), 2113 01:42:30,920 --> 01:42:34,670 among them sum(), that just does more of this for you. 2114 01:42:34,670 --> 01:42:37,690 But suppose you actually want to get the scores from the user. 2115 01:42:37,690 --> 01:42:41,020 In C, we used an array, and in C, we used get_int(). 2116 01:42:41,020 --> 01:42:42,850 We can do something a little similar here. 2117 01:42:42,850 --> 01:42:46,300 Let me propose that instead of hardcoding those three values, 2118 01:42:46,300 --> 01:42:49,770 let me do this. from cs50 import get_int(). 2119 01:42:49,770 --> 01:42:53,710 Now let me give myself an empty list by just saying scores 2120 01:42:53,710 --> 01:42:55,570 equals open bracket, closed bracket. 2121 01:42:55,570 --> 01:42:59,890 And unlike C, where you just can't do this-- you can't say give me an array 2122 01:42:59,890 --> 01:43:03,100 and I'll figure out the length later, unless you resort 2123 01:43:03,100 --> 01:43:06,040 to pointers and memory management or the like, in Python 2124 01:43:06,040 --> 01:43:09,680 you can absolutely give yourself an initially empty list. 2125 01:43:09,680 --> 01:43:12,970 Now let's do this. for i in range(3), let's 2126 01:43:12,970 --> 01:43:15,320 prompt the human for three test scores. 2127 01:43:15,320 --> 01:43:18,400 So the first score will be the return value of get_int(), 2128 01:43:18,400 --> 01:43:20,680 prompting the user for their score. 2129 01:43:20,680 --> 01:43:25,000 And now, if I want to add this score to that otherwise empty list, 2130 01:43:25,000 --> 01:43:27,130 here's where methods come into play, functions 2131 01:43:27,130 --> 01:43:30,130 that come with objects, like lists. 
2132 01:43:30,130 --> 01:43:31,780 I can do scores, plural-- 2133 01:43:31,780 --> 01:43:35,710 because that's the name of my variable from line 3-- .append, 2134 01:43:35,710 --> 01:43:37,650 and I can append that score. 2135 01:43:37,650 --> 01:43:40,070 So if we read the documentation for lists in Python, 2136 01:43:40,070 --> 01:43:43,940 you will see that lists come with a function, a method called append(), 2137 01:43:43,940 --> 01:43:47,630 which literally just tacks a value onto the end, tacks a value onto the end, 2138 01:43:47,630 --> 01:43:52,040 like all of that annoying code we would have written in C to iterate with 2139 01:43:52,040 --> 01:43:55,040 pointer and pointer and pointer to the end of the list, append it, 2140 01:43:55,040 --> 01:43:56,180 malloc() a new node. 2141 01:43:56,180 --> 01:43:58,262 Python does all of that for us. 2142 01:43:58,262 --> 01:44:01,220 And so once you've done that, now I can do something similar to before. 2143 01:44:01,220 --> 01:44:04,460 The average equals the sum of those scores divided 2144 01:44:04,460 --> 01:44:06,740 by the length of that list of scores. 2145 01:44:06,740 --> 01:44:11,660 And I can again print out, with an f string, the average value 2146 01:44:11,660 --> 01:44:13,667 in that variable like this. 2147 01:44:13,667 --> 01:44:16,250 So, again, you just have more building blocks at your disposal 2148 01:44:16,250 --> 01:44:19,770 when it comes to something like this. 2149 01:44:19,770 --> 01:44:22,790 You can also do this, just so you've seen other syntax. 2150 01:44:22,790 --> 01:44:28,610 It turns out that instead of doing scores.append, you could also do this. 2151 01:44:28,610 --> 01:44:36,530 You could concatenate scores with itself by adding two lists together like this. 2152 01:44:36,530 --> 01:44:38,000 This looks a little weird. 2153 01:44:38,000 --> 01:44:40,910 But on the left is my variable scores. 
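The version that collects scores from the user with `append()` could be sketched as follows. As an assumption for this illustration, `get_int` is passed in as a parameter (standing in for the CS50 library's function) so the loop can run with canned answers.

```python
def collect_scores(count, get_int):
    """Prompt `count` times and return the answers as a list."""
    scores = []  # start with an empty list and let it grow
    for i in range(count):
        score = get_int("Score: ")
        # append() tacks the value onto the end of the same list;
        # scores = scores + [score] would build a new list instead
        scores.append(score)
    return scores


# With the CS50 library available, this could be used as:
#   from cs50 import get_int
#   scores = collect_scores(3, get_int)
#   print(f"Average: {sum(scores) / len(scores)}")
```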
2154 01:44:40,910 --> 01:44:44,900 On the right here, I am taking whatever is in that list, 2155 01:44:44,900 --> 01:44:49,190 and I'm adding the current score by adding it to its own list. 2156 01:44:49,190 --> 01:44:51,900 And this will update the value as we go. 2157 01:44:51,900 --> 01:44:54,110 But it does, in fact, change the value of score 2158 01:44:54,110 --> 01:44:58,070 as opposed to appending to the initial list. 2159 01:44:58,070 --> 01:44:58,610 All right. 2160 01:44:58,610 --> 01:45:02,120 How about some other building blocks here? 2161 01:45:02,120 --> 01:45:04,010 Let me propose this. 2162 01:45:04,010 --> 01:45:06,470 Let me close out scores.py. 2163 01:45:06,470 --> 01:45:09,800 Let me open up a file called phonebook.py, 2164 01:45:09,800 --> 01:45:12,080 reminiscent of what we did weeks ago in C. 2165 01:45:12,080 --> 01:45:13,747 And let me give myself a list of names. 2166 01:45:13,747 --> 01:45:15,330 We won't bother with numbers just yet. 2167 01:45:15,330 --> 01:45:17,247 Let's just play with lists for another moment. 2168 01:45:17,247 --> 01:45:18,830 So here is a variable called names. 2169 01:45:18,830 --> 01:45:22,670 It has maybe three names in it-- maybe Carter and David 2170 01:45:22,670 --> 01:45:25,220 and John Harvard, as in past weeks. 2171 01:45:25,220 --> 01:45:29,303 And now let me go ahead and ask the user to input a name-- 2172 01:45:29,303 --> 01:45:31,220 because this is going to be like a phone book. 2173 01:45:31,220 --> 01:45:33,110 I want to ask the user for a name and then look up 2174 01:45:33,110 --> 01:45:35,943 that person's name in the phone book, even though I'm not bothering 2175 01:45:35,943 --> 01:45:38,330 by having any phone numbers just yet. 2176 01:45:38,330 --> 01:45:41,780 How could I search for, a la linear search, someone's name? 2177 01:45:41,780 --> 01:45:45,410 Well, in Python I could do this.
for name-- 2178 01:45:45,410 --> 01:45:53,210 rather, for n in names, if the current name equals what the human typed in, 2179 01:45:53,210 --> 01:45:58,470 then go ahead and print out "Found," then break out of this loop. 2180 01:45:58,470 --> 01:46:02,250 Otherwise, we'll print out "Not found" at the bottom. 2181 01:46:02,250 --> 01:46:02,750 All right. 2182 01:46:02,750 --> 01:46:05,360 So let's try this-- python of phonebook.py. 2183 01:46:05,360 --> 01:46:06,960 Let's search for maybe Carter. 2184 01:46:06,960 --> 01:46:07,460 That's easy. 2185 01:46:07,460 --> 01:46:08,570 He's at the beginning. 2186 01:46:08,570 --> 01:46:09,475 Oh, hmm. 2187 01:46:09,475 --> 01:46:11,600 Well, he was found, but then I printed "Not found." 2188 01:46:11,600 --> 01:46:13,250 So that's not quite what I want. 2189 01:46:13,250 --> 01:46:14,240 How about David? 2190 01:46:14,240 --> 01:46:16,820 D-A-V-I-D. "Found," "Not found"-- 2191 01:46:16,820 --> 01:46:18,260 all right, not very correct. 2192 01:46:18,260 --> 01:46:20,090 How about this? 2193 01:46:20,090 --> 01:46:21,900 Let's search for Eli, not in the list. 2194 01:46:21,900 --> 01:46:22,400 OK. 2195 01:46:22,400 --> 01:46:25,070 So at least someone not being in the list is working. 2196 01:46:25,070 --> 01:46:28,310 But logically, for Carter, for David, and even John, 2197 01:46:28,310 --> 01:46:31,355 why are we seeing "Found" and then "Not found?" 2198 01:46:31,355 --> 01:46:37,030 2199 01:46:37,030 --> 01:46:38,650 Why is it not found? 2200 01:46:38,650 --> 01:46:39,170 Yeah? 2201 01:46:39,170 --> 01:46:41,063 AUDIENCE: You need to indent the print(). 2202 01:46:41,063 --> 01:46:41,730 DAVID MALAN: OK. 2203 01:46:41,730 --> 01:46:45,610 I don't seem to have indented the print(), but let me try this.
2204 01:46:45,610 --> 01:46:48,180 If I just go with the else here-- 2205 01:46:48,180 --> 01:46:50,910 let me go up here and indent this and say else-- 2206 01:46:50,910 --> 01:46:53,790 I'm not sure logically this is what we want, 2207 01:46:53,790 --> 01:46:57,810 because what I think this is going to do if I search for maybe Carter-- 2208 01:46:57,810 --> 01:46:58,660 OK, that worked. 2209 01:46:58,660 --> 01:47:00,480 So it's partially fixed the problem. 2210 01:47:00,480 --> 01:47:03,150 But let me try searching for maybe David. 2211 01:47:03,150 --> 01:47:05,280 Oh, now we're sort of the opposite problem-- 2212 01:47:05,280 --> 01:47:06,420 "Not found," "Found." 2213 01:47:06,420 --> 01:47:07,020 Why? 2214 01:47:07,020 --> 01:47:09,450 Well, I don't think we want to immediately conclude 2215 01:47:09,450 --> 01:47:12,000 that someone's not found just because they don't 2216 01:47:12,000 --> 01:47:15,220 equal the current name in the list. 2217 01:47:15,220 --> 01:47:19,950 So it turns out we could fix this in a couple of different ways. 2218 01:47:19,950 --> 01:47:22,290 But there's kind of a neat feature of Python. 2219 01:47:22,290 --> 01:47:25,980 In Python, even for loops can have an else clause. 2220 01:47:25,980 --> 01:47:27,150 And this is weird. 2221 01:47:27,150 --> 01:47:29,320 But the way this works is as follows. 2222 01:47:29,320 --> 01:47:34,560 In Python, if you break out of a loop, that's it for the for loop. 2223 01:47:34,560 --> 01:47:38,640 If, though, you get all the way through the list that you're looping over, 2224 01:47:38,640 --> 01:47:42,520 and you never once call line 8-- you never break out of the loop-- 2225 01:47:42,520 --> 01:47:44,950 Python is smart enough to realize, OK, you just 2226 01:47:44,950 --> 01:47:46,510 went through lines 5 through 8. 2227 01:47:46,510 --> 01:47:49,120 You never actually logically called break. 2228 01:47:49,120 --> 01:47:51,490 Here's an else clause to be associated with it.
2229 01:47:51,490 --> 01:47:52,730 Semantically, this is weird. 2230 01:47:52,730 --> 01:47:55,610 We've only ever seen if and else associated with each other. 2231 01:47:55,610 --> 01:47:59,330 But for loops in Python actually can have else as well. 2232 01:47:59,330 --> 01:48:03,190 And in this case now, if I do python of phonebook.py, type in Carter, 2233 01:48:03,190 --> 01:48:05,005 now we get only one answer. 2234 01:48:05,005 --> 01:48:07,630 If I do it again and type in David, now we get only one answer. 2235 01:48:07,630 --> 01:48:08,547 Do it again with John. 2236 01:48:08,547 --> 01:48:09,700 Now we get only one answer. 2237 01:48:09,700 --> 01:48:10,480 Do it with Eli. 2238 01:48:10,480 --> 01:48:12,530 Now we get only one answer. 2239 01:48:12,530 --> 01:48:14,998 So, again, you just get a few more tools in your toolkit 2240 01:48:14,998 --> 01:48:18,040 when it comes to a language like Python that might very well make solving 2241 01:48:18,040 --> 01:48:20,890 problems a little more pleasant. 2242 01:48:20,890 --> 01:48:22,870 But this is kind of stupid in Python. 2243 01:48:22,870 --> 01:48:25,060 This is correct, but it's not well designed, 2244 01:48:25,060 --> 01:48:28,810 because I don't need to iterate over lists like this so pedantically 2245 01:48:28,810 --> 01:48:30,700 like we've been doing for weeks in C. 2246 01:48:30,700 --> 01:48:33,880 I can actually tighten this up, and I can just do this. 2247 01:48:33,880 --> 01:48:38,440 I can get rid of the loop, and I can say if name in names, then print out, 2248 01:48:38,440 --> 01:48:39,430 quote unquote, "Found." 2249 01:48:39,430 --> 01:48:40,750 That's it in Python. 2250 01:48:40,750 --> 01:48:44,020 If you want Python to search a whole list of values for you, 2251 01:48:44,020 --> 01:48:45,700 just let Python do the work. 
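Both versions of the search described above can be sketched side by side. Returning a string from a function, rather than printing directly as in the lecture, is a choice made here so the two versions are easy to compare; the function names are hypothetical.

```python
names = ["Carter", "David", "John"]


def search_with_loop(name):
    """Linear search written out by hand, using a for loop's else clause."""
    for n in names:
        if n == name:
            result = "Found"
            break
    else:
        # Runs only if the loop finished without hitting break
        result = "Not found"
    return result


def search_with_in(name):
    """The same search, letting the `in` operator do the work."""
    if name in names:
        return "Found"
    else:
        return "Not found"


print(search_with_loop("David"))
print(search_with_in("Eli"))
```

Note that the `else` in the first function lines up with `for`, not with `if`; that indentation is what ties it to the loop.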
2252 01:48:45,700 --> 01:48:50,590 And you can literally just say if the name that the human inputted is 2253 01:48:50,590 --> 01:48:55,180 in names, which is this list here, Python will use linear search for you, 2254 01:48:55,180 --> 01:48:59,240 search automatically from left to right, presumably, looking for the value. 2255 01:48:59,240 --> 01:49:02,260 And if it doesn't find it then and only then will 2256 01:49:02,260 --> 01:49:04,490 this else clause execute instead. 2257 01:49:04,490 --> 01:49:06,970 So, again, Python's just starting to save us 2258 01:49:06,970 --> 01:49:10,970 some time because this, too, will find Carter, but it will not find, 2259 01:49:10,970 --> 01:49:12,800 for instance, Eli. 2260 01:49:12,800 --> 01:49:13,300 All right? 2261 01:49:13,300 --> 01:49:15,350 So we get that functionality for free. 2262 01:49:15,350 --> 01:49:18,470 But what more can we perhaps do here? 2263 01:49:18,470 --> 01:49:22,360 Well, it turns out that Python has yet other features 2264 01:49:22,360 --> 01:49:25,930 we might want to explore, namely dictionaries, shortened as dict. 2265 01:49:25,930 --> 01:49:29,708 And a dictionary in Python is just like it was in C and, really, 2266 01:49:29,708 --> 01:49:31,000 in computer science in general. 2267 01:49:31,000 --> 01:49:32,625 A dictionary was an abstract data type. 2268 01:49:32,625 --> 01:49:36,460 And it's a collection of key value pairs. It looks a little something like this. 2269 01:49:36,460 --> 01:49:38,980 If in C, if in Python, if, in any language, 2270 01:49:38,980 --> 01:49:43,210 you want to associate something with something, like a name with a number, 2271 01:49:43,210 --> 01:49:46,990 you had to, in problem set 5, implement the darn thing yourself 2272 01:49:46,990 --> 01:49:50,950 by implementing an entire spell checker with an array and linked list 2273 01:49:50,950 --> 01:49:53,890 to store all of those words in your dictionary.
2274 01:49:53,890 --> 01:49:57,220 In Python, as we saw earlier, you can use a set, or you can use, 2275 01:49:57,220 --> 01:50:02,080 more simply, a dictionary that implements for you all of problem 2276 01:50:02,080 --> 01:50:03,730 set 5's ideas. 2277 01:50:03,730 --> 01:50:06,370 But Python does the heavy lifting for you. 2278 01:50:06,370 --> 01:50:11,600 A dict in Python is essentially a hash table, a collection of key value pairs. 2279 01:50:11,600 --> 01:50:13,720 So what does this mean for me in Python? 2280 01:50:13,720 --> 01:50:18,050 It means that I can do some pretty handy things pretty easily. 2281 01:50:18,050 --> 01:50:21,070 So, for instance, let me go back here to VS Code, 2282 01:50:21,070 --> 01:50:25,210 and let me change my phone book altogether to be this. 2283 01:50:25,210 --> 01:50:29,570 Let me give myself a list of dictionaries. 2284 01:50:29,570 --> 01:50:32,830 So people is now going to be a global list. 2285 01:50:32,830 --> 01:50:35,977 And I'm going to demarcate it here with open square bracket 2286 01:50:35,977 --> 01:50:37,060 and closed square bracket. 2287 01:50:37,060 --> 01:50:39,430 And just to be nice and neat and tidy, I'm 2288 01:50:39,430 --> 01:50:45,160 going to have these people no longer just be Carter and David and John, 2289 01:50:45,160 --> 01:50:46,600 as in the previous example. 2290 01:50:46,600 --> 01:50:52,060 But I want each of the elements of this list to be a key value 2291 01:50:52,060 --> 01:50:54,800 pair, like a name and a number. 2292 01:50:54,800 --> 01:50:56,140 So how can I do this? 2293 01:50:56,140 --> 01:50:58,940 In Python, you can use this syntax. 2294 01:50:58,940 --> 01:51:02,290 And this is, I think, the last of the weird looking syntax today. 2295 01:51:02,290 --> 01:51:06,910 You can define a dictionary that is something like this 2296 01:51:06,910 --> 01:51:10,070 by using two curly braces like this. 
2297 01:51:10,070 --> 01:51:12,820 And inside of your curly braces, you get to invent 2298 01:51:12,820 --> 01:51:15,140 the name, the keys, and the values. 2299 01:51:15,140 --> 01:51:18,107 So if you want one key to be the person's name, you can do, 2300 01:51:18,107 --> 01:51:20,440 quote unquote, "name" and then, quote unquote, "Carter." 2301 01:51:20,440 --> 01:51:24,850 If you want another key to be "number," you can do, quote unquote, "number," 2302 01:51:24,850 --> 01:51:29,560 and then, quote unquote, something like last time, "1-617-495-1000," 2303 01:51:29,560 --> 01:51:31,420 for instance, for Carter's number there. 2304 01:51:31,420 --> 01:51:35,530 And collectively, everything here on line 2 represents a dictionary. 2305 01:51:35,530 --> 01:51:40,680 It's as though, on a chalkboard, I wrote down "name, Carter, number, 2306 01:51:40,680 --> 01:51:45,790 +1-617-495-1000," row by row by row in this table. 2307 01:51:45,790 --> 01:51:47,700 This is simply the code equivalent thereof. 2308 01:51:47,700 --> 01:51:50,340 If you want to be really nitpicky or tidy, 2309 01:51:50,340 --> 01:51:52,840 you could style your code to look like this, 2310 01:51:52,840 --> 01:51:56,360 which makes it a little more clear, perhaps, as to what's going on. 2311 01:51:56,360 --> 01:51:58,860 It's just starting to add a lot of whitespace to the screen. 2312 01:51:58,860 --> 01:52:01,140 But it's just a collection of key value pairs, 2313 01:52:01,140 --> 01:52:04,573 again, akin to a two-column table like this. 2314 01:52:04,573 --> 01:52:07,740 I'm going to undo the whitespace just to kind of tighten things up because I 2315 01:52:07,740 --> 01:52:09,760 want to cram two other people in here. 2316 01:52:09,760 --> 01:52:13,740 So I'm going to go ahead and do another set of curly braces with, 2317 01:52:13,740 --> 01:52:17,370 quote unquote, "name" and "David," quote unquote, "number"-- 2318 01:52:17,370 --> 01:52:21,300 and we'll have the same number, so "+1-617-495-1000." 
2319 01:52:21,300 --> 01:52:25,560 And then, lastly, let's do another set of curly braces for a name of say 2320 01:52:25,560 --> 01:52:34,140 "John," and John Harvard's number, quote unquote, "number" will be "+1"-- 2321 01:52:34,140 --> 01:52:40,120 let's see-- "949-468-2750" is always John Harvard's number. 2322 01:52:40,120 --> 01:52:44,090 And then, by convention, you typically end even this element with a comma. 2323 01:52:44,090 --> 01:52:46,150 But it's not strictly necessary syntactically. 2324 01:52:46,150 --> 01:52:48,410 But stylistically, that's often added for you. 2325 01:52:48,410 --> 01:52:49,450 So what is people? 2326 01:52:49,450 --> 01:52:54,470 people is now a list of dictionaries, a list of dictionaries. 2327 01:52:54,470 --> 01:52:55,700 So what does that mean? 2328 01:52:55,700 --> 01:52:57,700 It means I can now do code like this. 2329 01:52:57,700 --> 01:53:02,200 I can prompt the user with the input() function for someone's name if the goal 2330 01:53:02,200 --> 01:53:04,300 now is to look up that person's number. 2331 01:53:04,300 --> 01:53:05,680 How can I look up that number? 2332 01:53:05,680 --> 01:53:10,720 Well, for each person in the list of people, let's go ahead and do this. 2333 01:53:10,720 --> 01:53:16,930 If the current person's name equals equals whatever name the human 2334 01:53:16,930 --> 01:53:22,930 typed in, then get that person's number by going into that person and doing, 2335 01:53:22,930 --> 01:53:25,870 quote unquote, "number," and then go ahead 2336 01:53:25,870 --> 01:53:31,270 and print out something like this f string "Found" that person's number. 2337 01:53:31,270 --> 01:53:34,310 And then, since we found them, let's just break out all together. 2338 01:53:34,310 --> 01:53:37,600 And if we get through that whole thing, let's just, at the very end, print 2339 01:53:37,600 --> 01:53:39,310 out "Not found." 2340 01:53:39,310 --> 01:53:40,860 So what's weird here? 
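Put together, the phone book dictated above might look roughly like this. The numbers are the ones read out in the lecture; the `lookup` function and its return strings are assumptions made here so the search can be reused, whereas the lecture version prints inside the loop and reads the name with `input()`.

```python
# A list of dictionaries, each with a "name" key and a "number" key
people = [
    {"name": "Carter", "number": "+1-617-495-1000"},
    {"name": "David", "number": "+1-617-495-1000"},
    {"name": "John", "number": "+1-949-468-2750"},
]


def lookup(name):
    """Return a message with the person's number, or "Not found"."""
    for person in people:
        # Index into each dictionary with a string key
        if person["name"] == name:
            number = person["number"]
            return f"Found {number}"
    return "Not found"


print(lookup("Carter"))
print(lookup("Eli"))
```

Using `return` inside the loop plays the same role as `break` followed by a for loop's `else` clause: the "Not found" line is reached only when no dictionary matched.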
2341 01:53:40,860 --> 01:53:44,370 If I focus on this code here, this syntax obviously is new. 2342 01:53:44,370 --> 01:53:47,760 The square brackets, though, just means, hey, Python, here comes a list. 2343 01:53:47,760 --> 01:53:49,590 Hey, Python, that's it for the list. 2344 01:53:49,590 --> 01:53:52,320 Inside of this list are three dictionaries. 2345 01:53:52,320 --> 01:53:55,470 The curly braces mean, hey, Python, here comes a dictionary. 2346 01:53:55,470 --> 01:53:57,300 Hey, Python, that's it for the dictionary. 2347 01:53:57,300 --> 01:53:59,970 Each of these dictionaries has two key value pairs-- 2348 01:53:59,970 --> 01:54:03,580 "name" and its value, "number" and its value. 2349 01:54:03,580 --> 01:54:07,980 So you can think of each of these lines as being like a C struct, 2350 01:54:07,980 --> 01:54:09,310 like with typedef and struct. 2351 01:54:09,310 --> 01:54:12,060 But I don't have to decide in advance what the keys and the values 2352 01:54:12,060 --> 01:54:12,727 are going to be. 2353 01:54:12,727 --> 01:54:15,360 I can just, on the fly, create a dictionary like this, 2354 01:54:15,360 --> 01:54:18,550 again, reminiscent of this kind of chalkboard design. 2355 01:54:18,550 --> 01:54:19,050 All right. 2356 01:54:19,050 --> 01:54:21,210 So what am I actually doing in code? 2357 01:54:21,210 --> 01:54:26,490 A dictionary in Python lets you index into it, 2358 01:54:26,490 --> 01:54:31,287 similar to an array with numbers in C. So in C, this 2359 01:54:31,287 --> 01:54:32,370 is a little bit different. 2360 01:54:32,370 --> 01:54:35,550 In C, you might have been in the habit of doing person.name. 2361 01:54:35,550 --> 01:54:38,220 But because it's a dictionary, the syntax in Python 2362 01:54:38,220 --> 01:54:41,590 is you actually use square brackets with strings 2363 01:54:41,590 --> 01:54:45,620 as being inside the square brackets rather than numbers. 
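A minimal sketch of the list-of-dictionaries phonebook described here. It assumes the search is wrapped in a helper named `find_number` (a hypothetical name, not from the lecture) instead of prompting with input(), so it can be run without typing anything.

```python
# A list (square brackets) holding three dictionaries (curly braces).
people = [
    {"name": "Carter", "number": "+1-617-495-1000"},
    {"name": "David", "number": "+1-617-495-1000"},
    {"name": "John", "number": "+1-949-468-2750"},
]


def find_number(people, name):
    """Return the number stored for name, or None if nobody matches."""
    for person in people:
        # Dictionaries are indexed with string keys in square brackets.
        if person["name"] == name:
            return person["number"]
    return None


number = find_number(people, "John")
print(f"Found {number}" if number else "Not found")
```

Returning from inside the loop plays the same role as break in the version typed in lecture: the search stops as soon as a match is found.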
2364 01:54:45,620 --> 01:54:49,750 But all this is now doing is it's creating a variable on line 11, 2365 01:54:49,750 --> 01:54:53,060 setting that number equal to that same person's number. 2366 01:54:53,060 --> 01:54:53,560 Why? 2367 01:54:53,560 --> 01:54:56,140 Because we're inside of this loop, I'm iterating 2368 01:54:56,140 --> 01:54:58,340 over each person one at a time. 2369 01:54:58,340 --> 01:54:59,260 And that's what for-- 2370 01:54:59,260 --> 01:55:00,430 that's what in does. 2371 01:55:00,430 --> 01:55:06,280 It assigns the person variable to this dictionary, then this dictionary, 2372 01:55:06,280 --> 01:55:08,860 then this dictionary automatically for me-- 2373 01:55:08,860 --> 01:55:11,480 no need for i and i++ and all of that. 2374 01:55:11,480 --> 01:55:14,020 So this is just saying, if the current person's name 2375 01:55:14,020 --> 01:55:17,560 equals the name we're looking for, get a variable called number and assign it 2376 01:55:17,560 --> 01:55:20,708 that person's number, and then print out that person's number. 2377 01:55:20,708 --> 01:55:23,500 So whereas last time we were just printing "Found" and "Not found," 2378 01:55:23,500 --> 01:55:25,310 now I'm going to print an actual number. 2379 01:55:25,310 --> 01:55:28,270 So if I run python of phonebook.py and I search for Carter, 2380 01:55:28,270 --> 01:55:29,680 there then is his number. 2381 01:55:29,680 --> 01:55:34,550 If I run python of phonebook.py, type in John, there then is John's number. 2382 01:55:34,550 --> 01:55:38,840 And if I search for someone who's not there, I instead just get "Not found." 2383 01:55:38,840 --> 01:55:42,010 So what's interesting and compelling about dictionaries 2384 01:55:42,010 --> 01:55:45,580 is they're kind of known as the Swiss Army knives of data structures 2385 01:55:45,580 --> 01:55:48,010 in programming because you can just use them 2386 01:55:48,010 --> 01:55:49,930 in so many interesting, clever ways.
2387 01:55:49,930 --> 01:55:52,750 If you ever want to associate something with something else, 2388 01:55:52,750 --> 01:55:54,910 a dictionary is your friend. 2389 01:55:54,910 --> 01:55:58,390 And you no longer have to write dozens of lines of code as in P set 5. 2390 01:55:58,390 --> 01:56:01,850 You can write single lines of code to achieve this same idea. 2391 01:56:01,850 --> 01:56:04,990 So, for instance, if I, too, want to tighten this up, 2392 01:56:04,990 --> 01:56:07,180 I actually don't need this loop altogether. 2393 01:56:07,180 --> 01:56:10,240 An even better version of this code would be this. 2394 01:56:10,240 --> 01:56:12,737 I don't need this variable, technically, even 2395 01:56:12,737 --> 01:56:14,320 though this will look a little uglier. 2396 01:56:14,320 --> 01:56:16,870 Notice that I'm only creating a variable called number 2397 01:56:16,870 --> 01:56:19,180 because I want to set it equal to this person's number. 2398 01:56:19,180 --> 01:56:21,940 But strictly speaking, any time you've declared a variable 2399 01:56:21,940 --> 01:56:25,220 and then used it in the next line, eh, you don't really need it. 2400 01:56:25,220 --> 01:56:26,450 So I could do this. 2401 01:56:26,450 --> 01:56:27,950 I could get rid of that line. 2402 01:56:27,950 --> 01:56:30,430 And instead of printing "number" in my curly braces, 2403 01:56:30,430 --> 01:56:33,640 I could actually do person, square brackets, 2404 01:56:33,640 --> 01:56:35,740 and you might be inclined to do this. 2405 01:56:35,740 --> 01:56:38,740 But this is going to confuse Python because you're mixing double quotes 2406 01:56:38,740 --> 01:56:40,410 on the inside and the outside. 2407 01:56:40,410 --> 01:56:43,680 But you can use single quotes here, compellingly. 2408 01:56:43,680 --> 01:56:45,390 So you don't have to do it this way. 
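A small sketch of the quoting point made here, assuming a single hard-coded dictionary for illustration. The f-string is delimited by double quotes, so the key inside the curly braces uses single quotes. In Python versions before 3.12, reusing double quotes inside the braces is a syntax error.

```python
person = {"name": "Carter", "number": "+1-617-495-1000"}

# Double quotes outside, single quotes inside the curly braces.
message = f"Found {person['number']}"
print(message)
```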
2409 01:56:45,390 --> 01:56:47,850 But this is just to show you, syntactically, 2410 01:56:47,850 --> 01:56:51,010 you can put most anything you want in these curly braces 2411 01:56:51,010 --> 01:56:55,350 so long as you don't confuse Python by using the same syntax. 2412 01:56:55,350 --> 01:56:58,240 But let me do one other thing here. 2413 01:56:58,240 --> 01:56:59,910 This is even more powerful. 2414 01:56:59,910 --> 01:57:02,400 Let me propose that if all you're storing 2415 01:57:02,400 --> 01:57:05,760 is names and numbers, names and numbers, I can actually 2416 01:57:05,760 --> 01:57:09,090 simplify this dictionary significantly. 2417 01:57:09,090 --> 01:57:14,370 Let me actually redeclare this people data structure 2418 01:57:14,370 --> 01:57:19,770 to be not a list of dictionaries, but how about just one big dictionary? 2419 01:57:19,770 --> 01:57:22,230 Because if I'm only associating names with numbers, 2420 01:57:22,230 --> 01:57:24,780 I don't technically need to create special keys 2421 01:57:24,780 --> 01:57:26,040 called "name" and "number." 2422 01:57:26,040 --> 01:57:32,730 Why don't I just associate Carter with his number, +1-617-495-1000? 2423 01:57:32,730 --> 01:57:37,240 Why don't I just associate, quote unquote, "David" with his number, 2424 01:57:37,240 --> 01:57:41,500 +1-617-495-1000? 2425 01:57:41,500 --> 01:57:51,850 And then, lastly, let's just associate John with his number, +1-949-468-2750? 2426 01:57:51,850 --> 01:57:54,280 And that too would work. 2427 01:57:54,280 --> 01:57:57,520 But notice that I'm going to get rid of my list of people 2428 01:57:57,520 --> 01:58:01,120 and instead just have one dictionary of people, the downside of which 2429 01:58:01,120 --> 01:58:04,660 is that you can only have one key, one value, one key, one value. 
2430 01:58:04,660 --> 01:58:09,013 You can't have a name key and a number key and an email key and an address key 2431 01:58:09,013 --> 01:58:11,680 and any number of other pieces of data that might be compelling. 2432 01:58:11,680 --> 01:58:14,290 But if you've only got key value pairs like this, 2433 01:58:14,290 --> 01:58:17,860 we can tighten up this code significantly so that now, down here, 2434 01:58:17,860 --> 01:58:19,360 I can actually do this. 2435 01:58:19,360 --> 01:58:24,580 If the name I'm looking for is somewhere in that people dictionary, 2436 01:58:24,580 --> 01:58:27,640 then go ahead and get the person's number 2437 01:58:27,640 --> 01:58:32,740 by going into the people dictionary, indexing into it at that person's name, 2438 01:58:32,740 --> 01:58:39,100 and then printing out "Found," for instance, that here number, 2439 01:58:39,100 --> 01:58:42,160 making this an f string, else you can go ahead 2440 01:58:42,160 --> 01:58:44,827 and print out "Not found" in this case here. 2441 01:58:44,827 --> 01:58:47,410 So, again, the difference is that the previous version created 2442 01:58:47,410 --> 01:58:49,930 a list of dictionaries, and I very manually, 2443 01:58:49,930 --> 01:58:52,420 methodically, iterated over it, looking for the person. 2444 01:58:52,420 --> 01:58:55,030 But what's nice again about dictionaries is 2445 01:58:55,030 --> 01:58:58,840 that Python gives you a lot of support for just looking into them easily. 2446 01:58:58,840 --> 01:59:01,630 And this syntax, just like you can use it for lists, 2447 01:59:01,630 --> 01:59:03,520 you can use it for dictionaries as well. 2448 01:59:03,520 --> 01:59:08,210 And Python will look for that name among the keys in the dictionary. 2449 01:59:08,210 --> 01:59:15,610 And if it finds it, you use this syntax to get at that person's number. 2450 01:59:15,610 --> 01:59:16,990 Whew, OK. 2451 01:59:16,990 --> 01:59:21,250 A lot all at once, but are there any questions on this here syntax? 
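A minimal sketch of the single-dictionary version just described, where each name maps directly to a number. The helper name `lookup` is an assumption made for illustration, and it returns the message rather than printing it so that it can be checked without user input.

```python
# One dictionary: each key is a name, each value is that person's number.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-617-495-1000",
    "John": "+1-949-468-2750",
}


def lookup(name):
    # "in" searches the dictionary's keys, so no explicit loop is needed.
    if name in people:
        return f"Found {people[name]}"
    return "Not found"


print(lookup("Carter"))
print(lookup("Eli"))
```

The trade-off is the one mentioned above: this shape stores exactly one value per name, so adding an email or address would mean going back to a dictionary per person.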
2452 01:59:21,250 --> 01:59:25,220 We'll then introduce a couple of final features with a final flourish. 2453 01:59:25,220 --> 01:59:25,910 Yes? 2454 01:59:25,910 --> 01:59:28,722 AUDIENCE: This way [INAUDIBLE] break [INAUDIBLE]. 2455 01:59:28,722 --> 01:59:30,930 DAVID MALAN: In this case, I do not need to use break 2456 01:59:30,930 --> 01:59:33,030 because I don't have any loop involved. 2457 01:59:33,030 --> 01:59:37,050 So break is only used, as we've seen it, in the context of looping 2458 01:59:37,050 --> 01:59:39,600 over something when you want to terminate the loop early. 2459 01:59:39,600 --> 01:59:42,360 But here Python is doing the searching for you. 2460 01:59:42,360 --> 01:59:46,500 So Python is taking care of that automatically. 2461 01:59:46,500 --> 01:59:47,000 All right. 2462 01:59:47,000 --> 01:59:50,300 Just a couple of final features so that you have a couple of more building 2463 01:59:50,300 --> 01:59:53,540 blocks-- here is the documentation for dictionaries themselves 2464 01:59:53,540 --> 01:59:56,580 in case you want to poke around as to what more you can do with them. 2465 01:59:56,580 --> 01:59:59,300 But it turns out that there are other libraries that 2466 01:59:59,300 --> 02:00:01,880 come with Python, not even third-party, and one of them 2467 02:00:01,880 --> 02:00:07,610 is the sys library, whereby you have system-related functionality. 2468 02:00:07,610 --> 02:00:10,230 And here's its official documentation, for instance. 2469 02:00:10,230 --> 02:00:12,980 But what this means is that certain functionality that was just 2470 02:00:12,980 --> 02:00:17,480 immediately available in C is sometimes tucked away now into libraries 2471 02:00:17,480 --> 02:00:18,080 in Python.
2472 02:00:18,080 --> 02:00:19,910 So, for instance, let me go over to VS Code 2473 02:00:19,910 --> 02:00:23,150 here, and let me just create a program called greet.py, which 2474 02:00:23,150 --> 02:00:25,490 is reminiscent of an old C program that just greets 2475 02:00:25,490 --> 02:00:27,710 the user using command-line arguments. 2476 02:00:27,710 --> 02:00:31,580 But in C, recall that we got access to command-line arguments with main() 2477 02:00:31,580 --> 02:00:34,070 and argc and argv. 2478 02:00:34,070 --> 02:00:36,870 But none of those have we seen at all today. 2479 02:00:36,870 --> 02:00:39,540 And, in fact, main() itself is no longer required. 2480 02:00:39,540 --> 02:00:43,010 So if you want to do command-line arguments in Python, 2481 02:00:43,010 --> 02:00:44,270 you actually do this. 2482 02:00:44,270 --> 02:00:48,720 From the sys library, you can import something called argv. 2483 02:00:48,720 --> 02:00:50,260 So argv still exists. 2484 02:00:50,260 --> 02:00:54,040 It's just tucked away inside of this library, otherwise known as a module. 2485 02:00:54,040 --> 02:00:55,630 And I can then do this. 2486 02:00:55,630 --> 02:01:00,450 If the length of argv, for instance, does not equal 2, 2487 02:01:00,450 --> 02:01:02,918 well, then, we're going to go ahead and do what we did-- 2488 02:01:02,918 --> 02:01:03,960 or rather, let's do this. 2489 02:01:03,960 --> 02:01:06,330 If the length of argv does equal 2, we're 2490 02:01:06,330 --> 02:01:08,580 going to go ahead and do what we did a couple of weeks 2491 02:01:08,580 --> 02:01:11,670 ago, whereby I'm going to print out "hello," 2492 02:01:11,670 --> 02:01:15,490 comma, and then argv bracket 1, for instance, 2493 02:01:15,490 --> 02:01:18,540 so whatever is in location 1 of that list. 
2494 02:01:18,540 --> 02:01:21,990 Else, if the length of argv is not equal to 2-- that is, 2495 02:01:21,990 --> 02:01:24,120 the human did not type two words at the prompt-- 2496 02:01:24,120 --> 02:01:27,730 let's go ahead and print out "hello, world" by default. 2497 02:01:27,730 --> 02:01:31,860 So we did the exact same thing in C. The only difference here is that this now 2498 02:01:31,860 --> 02:01:33,660 is how you get access to argv. 2499 02:01:33,660 --> 02:01:38,020 So let me run this-- python of greet.py and hit Enter. "hello, world" is all I 2500 02:01:38,020 --> 02:01:38,520 get. 2501 02:01:38,520 --> 02:01:40,680 And actually, I got an extra line break because out of habit, 2502 02:01:40,680 --> 02:01:43,320 I included backslash n, but I don't need that in Python. 2503 02:01:43,320 --> 02:01:46,680 So let me fix that. python of greet.py-- "hello, world." 2504 02:01:46,680 --> 02:01:50,160 But if I do python of greet.py, D-A-V-I-D, 2505 02:01:50,160 --> 02:01:53,010 now notice that the length of argv equals 2. 2506 02:01:53,010 --> 02:01:56,490 If I instead do something like Carter, the length of argv again equals 2. 2507 02:01:56,490 --> 02:01:57,780 But there is a difference. 2508 02:01:57,780 --> 02:02:02,280 Technically, I'm typing three words at the prompt, three words at the prompt, 2509 02:02:02,280 --> 02:02:07,380 but the length of argv is still only 2 because the command python is omitted from argv. 2510 02:02:07,380 --> 02:02:11,440 It's only the name of your file and the thing you type after it. 2511 02:02:11,440 --> 02:02:16,980 So that's then how we might print out arguments in Python using argv. 2512 02:02:16,980 --> 02:02:20,490 Well, what else might we do using some of these here features? 2513 02:02:20,490 --> 02:02:25,180 Well, it turns out that you can exit from programs using this same sys 2514 02:02:25,180 --> 02:02:25,680 library. 2515 02:02:25,680 --> 02:02:27,270 So let me close greet.py.
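A minimal sketch of greet.py as described above. It assumes the decision is factored into a helper named `greeting` (a hypothetical name) that takes the argument list as a parameter, so the logic can be exercised without actually running the file from a prompt.

```python
from sys import argv


def greeting(args):
    # args[0] is the file name; the word python itself is not included.
    if len(args) == 2:
        return f"hello, {args[1]}"
    return "hello, world"


# print() adds its own newline, so no backslash n is needed.
print(greeting(argv))
```

Running python greet.py David would pass a two-element list, and running python greet.py alone would pass a one-element list.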
2516 02:02:27,270 --> 02:02:30,545 Let me open up exit.py just for demonstration's sake. 2517 02:02:30,545 --> 02:02:31,920 And let's do something like this. 2518 02:02:31,920 --> 02:02:33,210 Let's import sys. 2519 02:02:33,210 --> 02:02:37,920 And if the length of sys.argv-- 2520 02:02:37,920 --> 02:02:40,050 so here's just another way of doing this. 2521 02:02:40,050 --> 02:02:44,130 And actually, I'll do it the same first-- from sys import argv. 2522 02:02:44,130 --> 02:02:49,483 If the length of argv does not equal 2-- 2523 02:02:49,483 --> 02:02:51,650 well, let's actually yell at the user with something 2524 02:02:51,650 --> 02:02:54,800 like "Missing command-line argument." 2525 02:02:54,800 --> 02:03:01,220 And then what we can do is exit out of the program entirely using sys.exit(), 2526 02:03:01,220 --> 02:03:02,750 which is a function therein. 2527 02:03:02,750 --> 02:03:05,282 But notice that exit() is a function in sys. 2528 02:03:05,282 --> 02:03:05,990 So you know what? 2529 02:03:05,990 --> 02:03:07,980 It's actually more convenient in this case. 2530 02:03:07,980 --> 02:03:09,590 Let's just import all of sys. 2531 02:03:09,590 --> 02:03:12,800 But because that has not given me direct access to argv, 2532 02:03:12,800 --> 02:03:16,363 let me do sys.argv here and sys.exit() here. 2533 02:03:16,363 --> 02:03:19,280 Otherwise, if all is well, let's just go ahead and print out something 2534 02:03:19,280 --> 02:03:27,140 like "hello, sys.argv," bracket 1, close quote, and that will print out "hello, 2535 02:03:27,140 --> 02:03:27,920 so-and-so." 2536 02:03:27,920 --> 02:03:31,580 And when I'm ready to exit with a non-0-- 2537 02:03:31,580 --> 02:03:35,580 with a 0 exit status, I can actually start to specify these things here. 2538 02:03:35,580 --> 02:03:38,900 So just like in C, if you want to exit from a program with 1 or 2 2539 02:03:38,900 --> 02:03:41,330 or anything else, you can use sys.exit. 
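A minimal sketch of exit.py along the lines just described. It assumes the logic sits in a function named `run` (a hypothetical name) that receives the argument list, and the call that would hand it sys.argv is left as a comment so that loading the snippet does not terminate the interpreter.

```python
import sys


def run(args):
    if len(args) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # non-zero exit status signals an error
    print(f"hello, {args[1]}")
    sys.exit(0)  # zero exit status signals success


# In a real script, the last line would be:
# run(sys.argv)
```

sys.exit() works by raising the SystemExit exception, which is why a caller can intercept it and read the status code.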
2540 02:03:41,330 --> 02:03:45,570 And if you want to exit with a 0, you can do this here instead. 2541 02:03:45,570 --> 02:03:48,090 So we have the same capabilities as in C, 2542 02:03:48,090 --> 02:03:51,720 just accessed a little bit differently. 2543 02:03:51,720 --> 02:03:53,700 Let me propose that-- 2544 02:03:53,700 --> 02:03:55,530 let's see. 2545 02:03:55,530 --> 02:03:58,200 Let me propose that-- 2546 02:03:58,200 --> 02:03:59,530 how about this? 2547 02:03:59,530 --> 02:04:03,480 2548 02:04:03,480 --> 02:04:04,590 How about this? 2549 02:04:04,590 --> 02:04:09,120 If we want to go ahead and create something a little more interactive, 2550 02:04:09,120 --> 02:04:12,540 recall that there was that command a while back, namely 2551 02:04:12,540 --> 02:04:15,862 pip, whereby I ran pip install face_recognition. 2552 02:04:15,862 --> 02:04:17,820 That's one of the examples with which we began. 2553 02:04:17,820 --> 02:04:21,150 And that allows me to install more functionality from a third party 2554 02:04:21,150 --> 02:04:24,207 into my own code space or my programming environment more generally. 2555 02:04:24,207 --> 02:04:26,290 Well, we can have a little fun with this, in fact. 2556 02:04:26,290 --> 02:04:27,780 Let me go back to VS Code here. 2557 02:04:27,780 --> 02:04:30,750 And just like there's a command in Linux called cowsay, 2558 02:04:30,750 --> 02:04:32,730 whereby you can get the cow to say something, 2559 02:04:32,730 --> 02:04:35,100 you can also use this kind of thing in Python. 2560 02:04:35,100 --> 02:04:39,330 So if I do pip install cowsay, this, if it's not installed already, 2561 02:04:39,330 --> 02:04:41,250 will install a library called cowsay. 
2562 02:04:41,250 --> 02:04:44,160 And what this means is that if I actually want to code up a program 2563 02:04:44,160 --> 02:04:48,540 called, like, moo.py, I can import the cowsay library, 2564 02:04:48,540 --> 02:04:52,080 and I can do something simple like cowsay.cow, 2565 02:04:52,080 --> 02:04:55,620 because there's a function in this library called cow(), 2566 02:04:55,620 --> 02:04:59,970 and I can say something like "This is CS50," quote unquote. 2567 02:04:59,970 --> 02:05:01,330 How do I run this program? 2568 02:05:01,330 --> 02:05:05,700 I can run python of moo.py, and-- oh, underwhelming. 2569 02:05:05,700 --> 02:05:09,210 If I increase the size of my terminal window, run python of moo.py, 2570 02:05:09,210 --> 02:05:11,460 we have that same adorable cow as before, 2571 02:05:11,460 --> 02:05:15,425 but I now have programmatic capabilities with which to manipulate it. 2572 02:05:15,425 --> 02:05:18,300 And so, in fact, I could make this program a little more interesting. 2573 02:05:18,300 --> 02:05:22,170 I could do something like name = quote-- 2574 02:05:22,170 --> 02:05:25,650 or rather, name = input("What's your name?") and combine some 2575 02:05:25,650 --> 02:05:26,580 of today's ideas. 2576 02:05:26,580 --> 02:05:29,320 And now I can say not "This is CS50," but something like, 2577 02:05:29,320 --> 02:05:32,520 quote unquote, "Hello," comma, person's name. 2578 02:05:32,520 --> 02:05:36,720 And now, if I increase the size of my terminal, rerun python of moo.py, 2579 02:05:36,720 --> 02:05:39,570 it's not going to actually moo or say "This is CS50." 2580 02:05:39,570 --> 02:05:42,780 It's going to say something like "hello, David," and so forth. 2581 02:05:42,780 --> 02:05:45,250 And suffice it to say through other functions, 2582 02:05:45,250 --> 02:05:48,730 you can do not only cows but dragons and other fancy things, too. 
2583 02:05:48,730 --> 02:05:52,000 But even in Python, too, can you generate not just ASCII art, 2584 02:05:52,000 --> 02:05:54,310 but actual art and actual images. 2585 02:05:54,310 --> 02:05:57,845 And the note I thought we'd end on is doing one other library. 2586 02:05:57,845 --> 02:05:59,470 I'm going to go back into VS Code here. 2587 02:05:59,470 --> 02:06:01,240 I'm going to close moo.py. 2588 02:06:01,240 --> 02:06:06,220 I'm going to do pip install qrcode, which is the name of a library 2589 02:06:06,220 --> 02:06:10,570 that I might want to install to generate QR codes automatically. 2590 02:06:10,570 --> 02:06:12,790 And QR codes are these two-dimensional bar codes. 2591 02:06:12,790 --> 02:06:14,840 If you want to generate these things yourself, 2592 02:06:14,840 --> 02:06:17,080 you don't have to go to a website and type in a URL. 2593 02:06:17,080 --> 02:06:19,880 You can actually write this kind of code yourself. 2594 02:06:19,880 --> 02:06:21,100 So how might I do this? 2595 02:06:21,100 --> 02:06:26,890 Well, let me go into a new file called, say, qr.py. 2596 02:06:26,890 --> 02:06:28,370 And let me do this. 2597 02:06:28,370 --> 02:06:34,030 Let me go ahead and import this library called qrcode. 2598 02:06:34,030 --> 02:06:37,360 Let me go ahead and create a variable called image, or anything else. 2599 02:06:37,360 --> 02:06:40,720 Let me set it equal to this library's, qrcode's, function called 2600 02:06:40,720 --> 02:06:43,090 make-- no relationship to C. It's just called make 2601 02:06:43,090 --> 02:06:44,650 because you want to make a QR code. 2602 02:06:44,650 --> 02:06:48,370 Let me type in, maybe, the URL of a lecture video here on YouTube-- 2603 02:06:48,370 --> 02:07:01,990 so, like, youtu.be/xvFZjo5PgG0, quote unquote.
2604 02:07:01,990 --> 02:07:07,490 And then I can go ahead and do img.save because inside of this image variable, 2605 02:07:07,490 --> 02:07:10,540 which is a different data type that this library gave me-- 2606 02:07:10,540 --> 02:07:12,130 it doesn't come with Python per se-- 2607 02:07:12,130 --> 02:07:18,820 I can save a file like qr.png, and I can save it in the PNG format, the Portable 2608 02:07:18,820 --> 02:07:19,665 Network Graphics. 2609 02:07:19,665 --> 02:07:21,790 And so just to be clear, what this should hopefully 2610 02:07:21,790 --> 02:07:26,980 do for me is create a QR code containing that particular URL, 2611 02:07:26,980 --> 02:07:31,540 but not as text, but rather as an actual image that I can send, 2612 02:07:31,540 --> 02:07:34,810 I can post online, or, in our case, generate into my code space, 2613 02:07:34,810 --> 02:07:35,830 and then open. 2614 02:07:35,830 --> 02:07:38,558 And so, with all that said, we've seen a bunch of new syntax 2615 02:07:38,558 --> 02:07:39,850 today, a bunch of new features. 2616 02:07:39,850 --> 02:07:45,160 But the ideas underlying Python are exactly the same as they've been in C. 2617 02:07:45,160 --> 02:07:49,000 It's just that you don't have to do nearly as much heavy lifting yourself. 2618 02:07:49,000 --> 02:07:51,850 And here, for instance, in just three lines of code, 2619 02:07:51,850 --> 02:07:54,910 can you generate a massive QR code that people can scan, 2620 02:07:54,910 --> 02:07:57,130 as you can in a moment with your phones, and actually 2621 02:07:57,130 --> 02:07:59,870 link to something like a CS50 class. 2622 02:07:59,870 --> 02:08:03,070 So let me go ahead and run python of qr.py. 2623 02:08:03,070 --> 02:08:04,990 It seems to have run. 2624 02:08:04,990 --> 02:08:09,760 Let me run code of qr.png, which is the file I created. 2625 02:08:09,760 --> 02:08:12,760 I'll close my terminal window, allow you an opportunity 2626 02:08:12,760 --> 02:08:17,930 to scan this here very CS50 lecture.
2627 02:08:17,930 --> 02:08:25,183 And-- and-- is someone's volume up? 2628 02:08:25,183 --> 02:08:26,850 [RICK ASTLEY, "NEVER GONNA GIVE YOU UP"] 2629 02:08:26,850 --> 02:08:27,660 There we go. 2630 02:08:27,660 --> 02:08:28,680 What a perfect ending. 2631 02:08:28,680 --> 02:08:29,180 All right. 2632 02:08:29,180 --> 02:08:30,120 That was CS50. 2633 02:08:30,120 --> 02:08:32,670 We'll see you next time. 2634 02:08:32,670 --> 02:08:35,720 [MUSIC PLAYING] 2635 02:08:35,720 --> 02:09:02,000