1 00:00:00,000 --> 00:00:03,992 [MUSIC PLAYING] 2 00:00:03,992 --> 00:00:50,490 3 00:00:50,490 --> 00:00:53,820 DAVID MALAN: This is CS50 and this is lecture 6. 4 00:00:53,820 --> 00:00:56,490 And you'll recall that last week we introduced web programming 5 00:00:56,490 --> 00:00:58,874 by way of HTML and CSS, or at least the building blocks 6 00:00:58,874 --> 00:01:01,290 because we don't actually have the ability to program yet. 7 00:01:01,290 --> 00:01:04,900 It's just markup, HTML and CSS with stylization thereof. 8 00:01:04,900 --> 00:01:08,490 But we introduced this metaphor last week of a protocol called TCP/IP. 9 00:01:08,490 --> 00:01:10,780 And we related it to, of course, an envelope. 10 00:01:10,780 --> 00:01:12,750 And on this envelope, virtually, on the front 11 00:01:12,750 --> 00:01:14,790 was at least two pieces of information. 12 00:01:14,790 --> 00:01:17,040 And if anyone remembers what were those two 13 00:01:17,040 --> 00:01:19,500 pieces of information in the to field? 14 00:01:19,500 --> 00:01:22,010 Someone else who we didn't hear from recently? 15 00:01:22,010 --> 00:01:22,510 Yeah? 16 00:01:22,510 --> 00:01:23,400 AUDIENCE: An IP address. 17 00:01:23,400 --> 00:01:23,590 DAVID MALAN: Yeah. 18 00:01:23,590 --> 00:01:26,730 An IP address, a numeric address that uniquely identifies your computer 19 00:01:26,730 --> 00:01:27,964 and someone else's computer. 20 00:01:27,964 --> 00:01:29,505 And one other thing, if you remember. 21 00:01:29,505 --> 00:01:32,120 22 00:01:32,120 --> 00:01:32,620 Oh, come on. 23 00:01:32,620 --> 00:01:33,860 It was like two minutes ago. 24 00:01:33,860 --> 00:01:34,100 OK. 25 00:01:34,100 --> 00:01:34,530 Yeah. 26 00:01:34,530 --> 00:01:35,390 AUDIENCE: A port number. 27 00:01:35,390 --> 00:01:36,515 DAVID MALAN: A port number. 
28 00:01:36,515 --> 00:01:40,420 So another number, shorter number, that's just a number like 80 or 443 29 00:01:40,420 --> 00:01:44,090 referring to HTTP or HTTPS, or other numbers, 30 00:01:44,090 --> 00:01:45,890 like 25 for email and the like. 31 00:01:45,890 --> 00:01:49,090 And so together these unique addresses allow you to send information 32 00:01:49,090 --> 00:01:51,940 to not only a specific computer, but a specific service 33 00:01:51,940 --> 00:01:53,860 running on that computer. 34 00:01:53,860 --> 00:01:57,580 And in order to actually request information from that server, 35 00:01:57,580 --> 00:02:01,480 there's this other protocol called HTTP, Hypertext Transfer Protocol. 36 00:02:01,480 --> 00:02:03,600 This is what's inside of the envelope. 37 00:02:03,600 --> 00:02:06,130 So when the server opens it up, metaphorically, 38 00:02:06,130 --> 00:02:10,509 looks inside, this is the command that that server reads in order to decide 39 00:02:10,509 --> 00:02:12,180 what it should actually respond with. 40 00:02:12,180 --> 00:02:15,260 And so this request here is telling the server-- 41 00:02:15,260 --> 00:02:19,330 otherwise known as www.example.com in this particular example-- 42 00:02:19,330 --> 00:02:23,560 to send back what exactly in its own envelope to me 43 00:02:23,560 --> 00:02:27,990 and my laptop if I were to request this? 44 00:02:27,990 --> 00:02:29,646 AUDIENCE: A specific web page. 45 00:02:29,646 --> 00:02:31,020 DAVID MALAN: A specific web page. 46 00:02:31,020 --> 00:02:33,459 And someone else, which web page specifically, presumably? 47 00:02:33,459 --> 00:02:34,125 AUDIENCE: Index. 48 00:02:34,125 --> 00:02:36,700 DAVID MALAN: Yeah, so index.html, which we said last week 49 00:02:36,700 --> 00:02:40,602 just tends to be the default file name on a server for a web page 50 00:02:40,602 --> 00:02:43,560 that's just selected by default. And it doesn't have to be called this, 51 00:02:43,560 --> 00:02:44,820 but it's a human convention. 
52 00:02:44,820 --> 00:02:48,070 And the rest of this is just a verb saying, literally, get me that file. 53 00:02:48,070 --> 00:02:50,400 This is just telling the server what version of HTTP 54 00:02:50,400 --> 00:02:54,600 I speak so that humans can improve it and upgrade it over time. 55 00:02:54,600 --> 00:02:57,660 But this would tell the server to return index.html. 56 00:02:57,660 --> 00:03:00,480 Meanwhile, we saw more sophisticated GET queries 57 00:03:00,480 --> 00:03:03,960 when we started talking about Google, and any website that 58 00:03:03,960 --> 00:03:07,830 has not just a front end, like HTML and CSS, but also a back end. 59 00:03:07,830 --> 00:03:10,320 And a back end is where the logic is, where the server is, 60 00:03:10,320 --> 00:03:12,850 and the interesting work, ultimately. 61 00:03:12,850 --> 00:03:15,780 And so this slash search indicates some kind 62 00:03:15,780 --> 00:03:18,820 of software running on Google servers as of last week 63 00:03:18,820 --> 00:03:20,460 that simply responds to requests. 64 00:03:20,460 --> 00:03:27,740 And what did question mark q equals cats do or represent in that demonstration? 65 00:03:27,740 --> 00:03:28,720 AUDIENCE: User input. 66 00:03:28,720 --> 00:03:29,780 DAVID MALAN: Yeah, user input. 67 00:03:29,780 --> 00:03:32,790 So the question mark just says, that's it for the file name or the URL. 68 00:03:32,790 --> 00:03:34,500 Here comes the user's input. 69 00:03:34,500 --> 00:03:38,361 q is just literally the HTTP parameter or input 70 00:03:38,361 --> 00:03:40,110 that Larry and Sergey, founders of Google, 71 00:03:40,110 --> 00:03:44,340 20 years ago decided would represent the user's input, q for query. 72 00:03:44,340 --> 00:03:47,231 Equals just means that the query that the human typed in was cats. 73 00:03:47,231 --> 00:03:49,230 But the human doesn't even have to type this in. 
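To make the pieces of that request concrete, here is a minimal Python sketch that uses only the standard library to pull apart the search URL discussed here. The `HTTP/1.1` version string and the `Host` header are assumptions for illustration; the slide itself is not reproduced in this transcript.

```python
from urllib.parse import urlsplit, parse_qs

# /search is the path; everything after the question mark is the user's input.
url = "https://www.google.com/search?q=cats"
parts = urlsplit(url)

print(parts.netloc)  # the server's name
print(parts.path)    # the "file name" part of the request

# parse_qs maps each parameter name to the list of values supplied for it.
params = parse_qs(parts.query)
print(params)

# Roughly what a browser would put inside the envelope.
# HTTP/1.1 and the Host header are assumed here, not taken from the lecture.
request = (
    f"GET {parts.path}?{parts.query} HTTP/1.1\r\n"
    f"Host: {parts.netloc}\r\n"
    "\r\n"
)
print(request)
```

A web form does the same job in reverse: it takes what the human typed and assembles the `?q=...` portion automatically.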
74 00:03:49,230 --> 00:03:53,250 Once you understand HTTP, if you really wanted to be kind of a nerd, 75 00:03:53,250 --> 00:03:58,387 you could go to www.google.com/search?q=cats and it 76 00:03:58,387 --> 00:04:00,970 would induce the search for you because at the end of the day, 77 00:04:00,970 --> 00:04:02,724 that's all the browser is doing. 78 00:04:02,724 --> 00:04:05,640 When you have these web forms that you now have the ability to create, 79 00:04:05,640 --> 00:04:09,810 it's just automating the process of generating these HTTP messages. 80 00:04:09,810 --> 00:04:13,820 Now, the server hopefully responds with a message you never, ever actually see, 81 00:04:13,820 --> 00:04:17,010 HTTP 200, which literally means OK. 82 00:04:17,010 --> 00:04:22,200 Of course, many of us have seen numbers other than 200 appear, like what? 83 00:04:22,200 --> 00:04:24,450 404, which means? 84 00:04:24,450 --> 00:04:25,410 File not found. 85 00:04:25,410 --> 00:04:28,980 Now, why the humans decided years ago to tell 86 00:04:28,980 --> 00:04:31,440 other humans what that numeric code is, I mean, 87 00:04:31,440 --> 00:04:33,000 that is an uninteresting detail. 88 00:04:33,000 --> 00:04:36,660 But the world, for whatever reason, has revealed in many web sites 404. 89 00:04:36,660 --> 00:04:38,070 But it just means the same thing. 90 00:04:38,070 --> 00:04:39,450 Everything is not OK. 91 00:04:39,450 --> 00:04:40,615 A file was not found. 92 00:04:40,615 --> 00:04:42,240 You might see something else like this. 93 00:04:42,240 --> 00:04:44,730 We saw this with Harvard, in fact, curiously, 94 00:04:44,730 --> 00:04:46,780 that Harvard had moved permanently. 95 00:04:46,780 --> 00:04:51,780 Now, Harvard was responding to certain queries with HTTP 301s 96 00:04:51,780 --> 00:04:55,610 in order to achieve what feature or effect? 97 00:04:55,610 --> 00:04:56,110 Why? 98 00:04:56,110 --> 00:04:56,672 Yeah. 99 00:04:56,672 --> 00:04:57,630 AUDIENCE: Redirections. 
100 00:04:57,630 --> 00:04:58,400 DAVID MALAN: Redirections. 101 00:04:58,400 --> 00:05:00,690 So this is kind of a low-level way of describing it. 102 00:05:00,690 --> 00:05:03,150 But 301, even though it says moved permanently, 103 00:05:03,150 --> 00:05:05,330 that's a more technical hint to the browser saying, 104 00:05:05,330 --> 00:05:08,420 Harvard moved not to whatever URL you just came from, 105 00:05:08,420 --> 00:05:10,440 but to this URL specifically. 106 00:05:10,440 --> 00:05:14,000 And now Harvard was probably, if you recall, redirecting me from what URL? 107 00:05:14,000 --> 00:05:19,010 If I wasn't already at that URL, where might I have been? 108 00:05:19,010 --> 00:05:22,450 Maybe dot com, if they actually own multiple domains and were redirecting. 109 00:05:22,450 --> 00:05:23,410 That could work. 110 00:05:23,410 --> 00:05:23,930 What else? 111 00:05:23,930 --> 00:05:25,070 Yeah. 112 00:05:25,070 --> 00:05:26,020 AUDIENCE: Just HTTP. 113 00:05:26,020 --> 00:05:26,770 DAVID MALAN: Yeah. 114 00:05:26,770 --> 00:05:30,250 Maybe I just typed in HTTP, and Harvard, in the interest of security, 115 00:05:30,250 --> 00:05:36,610 wants to force my browser to request this page again via HTTPS. 116 00:05:36,610 --> 00:05:40,490 Sometimes a website might prepend the www if you haven't typed it in, 117 00:05:40,490 --> 00:05:42,640 or you can be redirected most anywhere. 118 00:05:42,640 --> 00:05:47,020 In fact, if you go to CS50's own website by just typing CS50.harvard.edu, 119 00:05:47,020 --> 00:05:47,890 watch the URL. 120 00:05:47,890 --> 00:05:51,990 You'll be redirected to a more specific page, depending on the time of year. 121 00:05:51,990 --> 00:05:53,800 So we use these tricks, as well. 122 00:05:53,800 --> 00:05:56,725 404 not found might look like this, but inside deeper 123 00:05:56,725 --> 00:06:00,630 of that metaphorical envelope is the actual contents of the web page. 
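As a rough illustration of those status codes, the sketch below looks up 200, 301, and 404 in Python's standard `http` module and then reads the top of a made-up redirect response. The helper `parse_head` and the `Location` URL are hypothetical, chosen only for the example.

```python
from http import HTTPStatus

# The three codes mentioned above, with their standard reason phrases.
for status_code in (200, 301, 404):
    print(status_code, HTTPStatus(status_code).phrase)


def parse_head(head):
    """Split the top of an HTTP response into (code, phrase, headers)."""
    lines = head.split("\r\n")
    _version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the headers
            break
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return int(code), phrase, headers


# A hypothetical redirect; the Location value is an assumption.
example = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: https://www.harvard.edu/\r\n"
    "\r\n"
)
code, phrase, headers = parse_head(example)
print(code, phrase, "->", headers["Location"])
```

On a 301, the browser reads the `Location` header and requests that URL instead, which is how the HTTP-to-HTTPS redirect described above takes effect.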
124 00:06:00,630 --> 00:06:03,190 So you get back not only these HTTP headers, 125 00:06:03,190 --> 00:06:06,160 as they're called, in the top of the response, so to speak, 126 00:06:06,160 --> 00:06:10,750 but you also get back HTML, yet another language we looked at, 127 00:06:10,750 --> 00:06:13,480 this one actually a language, but not a programming language. 128 00:06:13,480 --> 00:06:17,440 These tags tell the browser exactly what to do and to render. 129 00:06:17,440 --> 00:06:19,480 We introduced this style tag, though. 130 00:06:19,480 --> 00:06:22,180 What did that allow us to do that HTML alone did not? 131 00:06:22,180 --> 00:06:27,770 132 00:06:27,770 --> 00:06:29,030 Yeah. 133 00:06:29,030 --> 00:06:31,620 Use CSS to beautify the site and just make it nicer. 134 00:06:31,620 --> 00:06:34,040 HTML, for the most part, is about structure 135 00:06:34,040 --> 00:06:37,400 and about tagging the contents of your web page in a way 136 00:06:37,400 --> 00:06:38,930 that the browser finds helpful. 137 00:06:38,930 --> 00:06:41,810 But CSS is really for the user's benefit, at the end of the day, 138 00:06:41,810 --> 00:06:43,640 and his or her eyes, because it really lets 139 00:06:43,640 --> 00:06:46,114 you control font size and positioning and lower-level stuff 140 00:06:46,114 --> 00:06:49,280 that you might have started tinkering with with the most recent problem set. 141 00:06:49,280 --> 00:06:51,590 Now, we'd proposed that you probably shouldn't just 142 00:06:51,590 --> 00:06:54,961 start typing CSS inside of your HTML page 143 00:06:54,961 --> 00:06:57,710 because it's just a little harder to maintain as your examples get 144 00:06:57,710 --> 00:06:58,730 more sophisticated. 145 00:06:58,730 --> 00:06:59,990 So you might factor it out. 
146 00:06:59,990 --> 00:07:01,698 And odds are you did this for the problem 147 00:07:01,698 --> 00:07:04,430 set because when making a home page, if you have the same CSS 148 00:07:04,430 --> 00:07:08,450 styles across multiple files, it would be pretty silly and inefficient to copy 149 00:07:08,450 --> 00:07:11,730 and paste them again and again when you can factor them out like this. 150 00:07:11,730 --> 00:07:14,090 Lastly, we looked at JavaScript, last time, 151 00:07:14,090 --> 00:07:16,460 another programming language that's super similar 152 00:07:16,460 --> 00:07:18,254 to C, at least at first glance. 153 00:07:18,254 --> 00:07:20,420 But it actually gets rid of a lot of the lower level 154 00:07:20,420 --> 00:07:22,990 headaches like pointers and memory addresses and the like 155 00:07:22,990 --> 00:07:24,740 that we've struggled with in recent weeks. 156 00:07:24,740 --> 00:07:27,230 But most important was how we used it. 157 00:07:27,230 --> 00:07:31,460 So you can consider a web page like this, once it's loaded by your browser, 158 00:07:31,460 --> 00:07:32,887 as just being a tree structure. 159 00:07:32,887 --> 00:07:35,720 Thinking back a couple of weeks to our discussion of data structures, 160 00:07:35,720 --> 00:07:40,580 each of these nodes in the tree, we saw in JavaScript, can be manipulated. 161 00:07:40,580 --> 00:07:43,040 And via that very simple principle, writing 162 00:07:43,040 --> 00:07:47,990 code that modifies this existing tree in the browser's memory 163 00:07:47,990 --> 00:07:51,380 means you can make much more dynamic things like Gmail and Facebook 164 00:07:51,380 --> 00:07:53,880 and any number of websites that are constantly changing. 165 00:07:53,880 --> 00:07:55,754 You did not do this yet for the problem set. 166 00:07:55,754 --> 00:07:59,330 You made static web pages just by hard coding HTML and CSS. 
167 00:07:59,330 --> 00:08:02,420 But starting next week, once we have, thanks to this week, the vocabulary 168 00:08:02,420 --> 00:08:04,910 of Python will you start to make things more dynamic 169 00:08:04,910 --> 00:08:07,670 and then even bring back into play JavaScript, 170 00:08:07,670 --> 00:08:11,152 bringing all of these various threads together. 171 00:08:11,152 --> 00:08:14,360 And to include the JavaScript, recall, we used either a script tag at the top 172 00:08:14,360 --> 00:08:16,010 or refactored it out to a file. 173 00:08:16,010 --> 00:08:18,620 Or in some cases, it's necessary or beneficial 174 00:08:18,620 --> 00:08:22,550 to move it down to the bottom of the file or factor it out like that, 175 00:08:22,550 --> 00:08:24,680 but more on that down the road. 176 00:08:24,680 --> 00:08:32,600 So any questions on last week or on HTTP, HTML, CSS, or TCP/IP? 177 00:08:32,600 --> 00:08:33,330 No? 178 00:08:33,330 --> 00:08:35,530 Anything at all? 179 00:08:35,530 --> 00:08:36,146 Oh, yeah? 180 00:08:36,146 --> 00:08:38,229 AUDIENCE: So in what case would you put the script 181 00:08:38,229 --> 00:08:41,225 tag up at the top [INAUDIBLE] 182 00:08:41,225 --> 00:08:42,350 DAVID MALAN: Good question. 183 00:08:42,350 --> 00:08:44,808 So in what cases would you put the script tag up at the top 184 00:08:44,808 --> 00:08:45,970 versus at the bottom? 185 00:08:45,970 --> 00:08:49,090 If the code you're writing in JavaScript manipulates 186 00:08:49,090 --> 00:08:52,510 the DOM, the tree that I had on the screen just a moment ago, 187 00:08:52,510 --> 00:08:56,860 the catch is that that tree needs to exist when your code is executed. 
188 00:08:56,860 --> 00:09:01,330 So if you, for instance, have JavaScript code up here in the head of your page, 189 00:09:01,330 --> 00:09:04,570 but the nodes in the tree, the tags that you 190 00:09:04,570 --> 00:09:07,360 want to manipulate in changing things to red to green to blue 191 00:09:07,360 --> 00:09:11,020 like we did last week, or making things blank, are down here in the page, 192 00:09:11,020 --> 00:09:14,590 you can't write your code up here and have it change things in the page 193 00:09:14,590 --> 00:09:16,780 down here because it's happening out of order. 194 00:09:16,780 --> 00:09:20,110 So similar in spirit to C where things have to happen in the right order, 195 00:09:20,110 --> 00:09:22,120 if you want to change something down here, 196 00:09:22,120 --> 00:09:25,480 your code needs to at least be down here, 197 00:09:25,480 --> 00:09:28,430 or you need to use some fancier techniques to say, 198 00:09:28,430 --> 00:09:31,120 I'm going to write my code up here but wait a few seconds 199 00:09:31,120 --> 00:09:34,124 before executing it until the whole webpage is loaded. 200 00:09:34,124 --> 00:09:36,790 So for most of the examples we looked at, this was not an issue. 201 00:09:36,790 --> 00:09:39,460 But we'll come back to this perhaps before long. 202 00:09:39,460 --> 00:09:42,100 All right, so let's now take the same approach 203 00:09:42,100 --> 00:09:45,419 that we did last time of introducing one language by way of another. 204 00:09:45,419 --> 00:09:48,460 You'll recall, of course, that we started the whole semester with Scratch 205 00:09:48,460 --> 00:09:51,010 and then we transitioned a few weeks back now to C. Last week 206 00:09:51,010 --> 00:09:52,780 we made some comparisons with JavaScript. 
207 00:09:52,780 --> 00:09:54,820 Let's do the same thing briefly with Python 208 00:09:54,820 --> 00:09:57,790 but then spend more time at the keyboard comparing the two to see 209 00:09:57,790 --> 00:10:00,430 what actually is different about these. 210 00:10:00,430 --> 00:10:02,350 So why in another language, though, first? 211 00:10:02,350 --> 00:10:07,900 We have Scratch, C, JavaScript, Python, not to mention HTML and CSS 212 00:10:07,900 --> 00:10:09,010 for different purposes. 213 00:10:09,010 --> 00:10:11,890 Like, why do we have all of these darn languages already? 214 00:10:11,890 --> 00:10:16,030 Why didn't humans just decide, that's it, we're all using Scratch? 215 00:10:16,030 --> 00:10:19,720 We're all using C or JavaScript or Python? 216 00:10:19,720 --> 00:10:23,930 What's, perhaps, the intuition behind that? 217 00:10:23,930 --> 00:10:27,450 Why are there so many damn languages, not to mention in this one course? 218 00:10:27,450 --> 00:10:28,123 Yeah? 219 00:10:28,123 --> 00:10:29,925 AUDIENCE: [INAUDIBLE] 220 00:10:29,925 --> 00:10:31,050 DAVID MALAN: Say once more? 221 00:10:31,050 --> 00:10:32,770 AUDIENCE: Different ones are good for different things. 222 00:10:32,770 --> 00:10:34,540 DAVID MALAN: Yeah, different ones are good for different things. 223 00:10:34,540 --> 00:10:36,850 And this probably goes without saying for something like Scratch, right? 224 00:10:36,850 --> 00:10:37,570 It's so visual. 225 00:10:37,570 --> 00:10:39,340 It's so graphical and animated. 226 00:10:39,340 --> 00:10:41,122 It makes sense that the puzzle pieces-- 227 00:10:41,122 --> 00:10:43,330 or that the language itself is based on puzzle pieces 228 00:10:43,330 --> 00:10:44,720 and dragging and dropping. 229 00:10:44,720 --> 00:10:47,410 So maybe languages are tailored to certain applications. 230 00:10:47,410 --> 00:10:51,550 But is that true for C, Python, and JavaScript, which 231 00:10:51,550 --> 00:10:54,490 are all text-based languages we'll see? 
232 00:10:54,490 --> 00:10:57,310 AUDIENCE: [INAUDIBLE] for example, they're 233 00:10:57,310 --> 00:10:58,730 different levels of abstraction. 234 00:10:58,730 --> 00:10:59,396 DAVID MALAN: OK. 235 00:10:59,396 --> 00:11:01,007 Different levels of abstraction. 236 00:11:01,007 --> 00:11:06,474 AUDIENCE: C is very [INAUDIBLE] actually dealing with a lot of things that you 237 00:11:06,474 --> 00:11:08,960 don't have to think about in Python-- 238 00:11:08,960 --> 00:11:10,166 DAVID MALAN: Good. 239 00:11:10,166 --> 00:11:14,150 AUDIENCE: --where these sort of things are taken care of for you, 240 00:11:14,150 --> 00:11:18,134 such as memory allocations and so on. 241 00:11:18,134 --> 00:11:22,510 And so depending on what level of abstraction you want to work on 242 00:11:22,510 --> 00:11:24,554 and what parts you want to manipulate. 243 00:11:24,554 --> 00:11:25,470 DAVID MALAN: OK, good. 244 00:11:25,470 --> 00:11:27,469 Bringing it back to abstraction does make sense. 245 00:11:27,469 --> 00:11:31,470 C is, indeed, very low level, literally having the ability to manipulate memory 246 00:11:31,470 --> 00:11:32,860 and via pointers and so forth. 247 00:11:32,860 --> 00:11:35,460 And that's great because you can do anything you want with the computer. 248 00:11:35,460 --> 00:11:37,290 But it comes at great risk and great cost. 249 00:11:37,290 --> 00:11:39,060 One, the cost is human time. 250 00:11:39,060 --> 00:11:41,940 It's just painful to write that kind of code sometimes. 251 00:11:41,940 --> 00:11:48,210 Two, it's also very risky because if you make a mistake, even a simple mistake, 252 00:11:48,210 --> 00:11:49,680 the whole computer can crash. 
253 00:11:49,680 --> 00:11:51,471 And we didn't see examples of this, but you 254 00:11:51,471 --> 00:11:53,340 can make your code vulnerable to a hacker 255 00:11:53,340 --> 00:11:56,274 if he or she is able to somehow exploit a memory-related bug 256 00:11:56,274 --> 00:11:59,190 and read all of the passwords in your program, or something like that. 257 00:11:59,190 --> 00:12:01,500 So with great power comes great responsibility 258 00:12:01,500 --> 00:12:03,630 is kind of the mantra of C down here. 259 00:12:03,630 --> 00:12:07,142 But JavaScript we saw allows us to do things a little more high-level. 260 00:12:07,142 --> 00:12:08,100 There were no pointers. 261 00:12:08,100 --> 00:12:08,910 There was no memory. 262 00:12:08,910 --> 00:12:10,650 We didn't talk about things at that level. 263 00:12:10,650 --> 00:12:12,566 We talked about things at the level of a tree, 264 00:12:12,566 --> 00:12:16,902 a DOM in memory and changing colors and positioning of things on the screen. 265 00:12:16,902 --> 00:12:18,360 And that's, indeed, a higher level. 266 00:12:18,360 --> 00:12:21,990 Now, Python is not necessarily even web-centric. 267 00:12:21,990 --> 00:12:23,580 It's more of a multi-purpose language. 268 00:12:23,580 --> 00:12:26,310 People use Python to write command-line programs, 269 00:12:26,310 --> 00:12:29,480 like we will soon, at the keyboard, like we've been doing with C. 270 00:12:29,480 --> 00:12:31,230 You can also, though, use it, as we'll see 271 00:12:31,230 --> 00:12:33,400 next week, to generate other languages. 272 00:12:33,400 --> 00:12:35,820 So next week we will write code in Python, 273 00:12:35,820 --> 00:12:39,990 the language we're about to see, to generate another language, HTML 274 00:12:39,990 --> 00:12:40,630 and CSS. 275 00:12:40,630 --> 00:12:44,430 Some of you probably noticed in your homepages that you had some redundancy. 
276 00:12:44,430 --> 00:12:46,770 You probably had similar tags or similar structure, 277 00:12:46,770 --> 00:12:48,690 maybe a similar menu across pages. 278 00:12:48,690 --> 00:12:51,120 Python and other languages will let us factor that 279 00:12:51,120 --> 00:12:53,760 out and generate those commonalities a lot more 280 00:12:53,760 --> 00:12:55,440 easily, among many other things. 281 00:12:55,440 --> 00:12:58,560 And it's also arguably easier and faster to write 282 00:12:58,560 --> 00:13:02,050 because it comes with so many more features, as we will soon see. 283 00:13:02,050 --> 00:13:03,180 So in fact-- you know what? 284 00:13:03,180 --> 00:13:03,940 Let me do this. 285 00:13:03,940 --> 00:13:07,570 Let me go ahead and open up CS50 IDE. 286 00:13:07,570 --> 00:13:09,390 Let me go ahead and create a new file. 287 00:13:09,390 --> 00:13:11,330 And out of curiosity, of our recent problem 288 00:13:11,330 --> 00:13:16,482 sets, what was maybe among the most challenging programs you've written? 289 00:13:16,482 --> 00:13:17,197 AUDIENCE: Crack. 290 00:13:17,197 --> 00:13:18,780 DAVID MALAN: OK, crack was a good one. 291 00:13:18,780 --> 00:13:19,680 What else? 292 00:13:19,680 --> 00:13:20,460 AUDIENCE: Resize. 293 00:13:20,460 --> 00:13:21,570 DAVID MALAN: Resize, recover. 294 00:13:21,570 --> 00:13:22,980 Yeah, definitely the forensics ones. 295 00:13:22,980 --> 00:13:24,980 And more people probably did recover and resize. 296 00:13:24,980 --> 00:13:26,640 So let's take resize, for example. 297 00:13:26,640 --> 00:13:30,750 So let me go ahead and write a program in a file called resize.py for Python, 298 00:13:30,750 --> 00:13:35,790 instead of .c, and see if we can't spend, what, few hours, couple days, 299 00:13:35,790 --> 00:13:38,220 as you probably did in C, implementing resize. 300 00:13:38,220 --> 00:13:40,270 Well, let me go ahead and do this. 301 00:13:40,270 --> 00:13:42,360 I'm going to go ahead and-- 302 00:13:42,360 --> 00:13:42,960 let's see. 
303 00:13:42,960 --> 00:13:46,570 First I'm going to import some features that just come with Python. 304 00:13:46,570 --> 00:13:50,910 And I'm going to go ahead and say from sys import argv. 305 00:13:50,910 --> 00:13:54,730 And I'm going to go ahead and also do from PIL import Image. 306 00:13:54,730 --> 00:13:55,980 Don't know yet what these are. 307 00:13:55,980 --> 00:13:57,450 We'll tease this apart in a moment. 308 00:13:57,450 --> 00:13:58,783 But then let me just do a check. 309 00:13:58,783 --> 00:14:00,270 If the length of-- 310 00:14:00,270 --> 00:14:04,770 rather, if the length of argv does not equal 4, 311 00:14:04,770 --> 00:14:08,190 I'm going to go ahead and exit for the user and say the usage of this program 312 00:14:08,190 --> 00:14:12,840 is python resize.py n infile outfile. 313 00:14:12,840 --> 00:14:15,440 So even though some of this should look cryptic at the moment, 314 00:14:15,440 --> 00:14:18,110 there are some commonalities-- argv, you recall, from C, 315 00:14:18,110 --> 00:14:21,450 and this usage string that we printed out whenever anything went wrong. 316 00:14:21,450 --> 00:14:23,640 That looks very similar in spirit to C. 317 00:14:23,640 --> 00:14:25,110 And what did we do in resize? 318 00:14:25,110 --> 00:14:27,570 If you implemented resize, like the less comfy version, 319 00:14:27,570 --> 00:14:31,590 to increase the size of things, you probably declared a variable like n 320 00:14:31,590 --> 00:14:33,240 and got sys-- 321 00:14:33,240 --> 00:14:36,940 or rather, argv bracket one to get access to it. 322 00:14:36,940 --> 00:14:39,580 I'm going to go ahead and convert that or cast that to an int. 323 00:14:39,580 --> 00:14:43,669 You probably had an infile variable that gave you access to argv two. 324 00:14:43,669 --> 00:14:46,710 You probably had an outfile variable that gave you access to argv three, 325 00:14:46,710 --> 00:14:47,820 and so forth. 
326 00:14:47,820 --> 00:14:49,740 And it turns out in Python, you know what? 327 00:14:49,740 --> 00:14:53,280 I can actually use a library, code that other people have written. 328 00:14:53,280 --> 00:14:56,580 Let me come up with a variable called inimage, like infile. 329 00:14:56,580 --> 00:14:58,290 This is my input image. 330 00:14:58,290 --> 00:15:01,200 And that's going to equal Image.open because I 331 00:15:01,200 --> 00:15:03,040 want to open this thing called infile. 332 00:15:03,040 --> 00:15:04,650 And then the width-- 333 00:15:04,650 --> 00:15:07,230 let me get the width and the height of the existing image 334 00:15:07,230 --> 00:15:09,840 by doing inimage.size. 335 00:15:09,840 --> 00:15:12,960 And then let me go ahead and make a new image-- outimage, I'll call it-- 336 00:15:12,960 --> 00:15:17,520 which is going to equal the input image calling a resize function 337 00:15:17,520 --> 00:15:22,210 and doing the width times n, which is the number the human probably typed in, 338 00:15:22,210 --> 00:15:26,220 and height times n, which is the number the human typed in. 339 00:15:26,220 --> 00:15:29,970 Then let me go ahead and just save the outfile as follows. 340 00:15:29,970 --> 00:15:32,520 Outfile, OK. 341 00:15:32,520 --> 00:15:33,840 Done. 342 00:15:33,840 --> 00:15:37,460 Problem set three. 343 00:15:37,460 --> 00:15:39,660 Tada. 344 00:15:39,660 --> 00:15:43,280 OK, either really exciting or really, really disheartening perhaps. 345 00:15:43,280 --> 00:15:45,630 So with the right language, as you say, you can 346 00:15:45,630 --> 00:15:48,660 solve problems so much more easily. 347 00:15:48,660 --> 00:15:51,210 Now, I'm being a little disingenuous because I'm also 348 00:15:51,210 --> 00:15:52,680 leveraging what's called a library. 349 00:15:52,680 --> 00:15:54,809 And we had access to these in C. 
And undoubtedly 350 00:15:54,809 --> 00:15:56,850 we could have dug a little deeper on the internet 351 00:15:56,850 --> 00:16:02,010 into other people's available code and found maybe a library for bitmap files. 352 00:16:02,010 --> 00:16:05,840 But notice that there is no dealing with padding now. 353 00:16:05,840 --> 00:16:07,330 There's no dealing with arrays. 354 00:16:07,330 --> 00:16:11,560 There's no dealing with memory because I'm using the right tool for the job. 355 00:16:11,560 --> 00:16:13,740 And if I wrote this code correctly-- and let 356 00:16:13,740 --> 00:16:16,440 me cross my fingers that I didn't make any typos. 357 00:16:16,440 --> 00:16:20,280 Let me go ahead here and get myself a copy 358 00:16:20,280 --> 00:16:23,200 of smiley, which I brought with me. 359 00:16:23,200 --> 00:16:25,420 So that was the tiny little image from last week. 360 00:16:25,420 --> 00:16:27,128 Let me go ahead and open this in the IDE. 361 00:16:27,128 --> 00:16:28,770 Smiley, super small. 362 00:16:28,770 --> 00:16:30,330 Just a few pixels there. 363 00:16:30,330 --> 00:16:33,840 And let me go ahead now and run python, which we'll see why in a moment, 364 00:16:33,840 --> 00:16:34,860 resize.py. 365 00:16:34,860 --> 00:16:40,680 Let's increase this by a factor of 10, increasing Smiley, and call it out.bmp. 366 00:16:40,680 --> 00:16:45,960 Now let me go ahead and open out.bmp and voila, it indeed seems to work. 367 00:16:45,960 --> 00:16:47,250 Right, no funky colors. 368 00:16:47,250 --> 00:16:49,080 No weird sizes. 369 00:16:49,080 --> 00:16:49,810 No padding. 370 00:16:49,810 --> 00:16:51,330 No padding of all things. 371 00:16:51,330 --> 00:16:53,070 It's just now Python. 372 00:16:53,070 --> 00:16:56,010 So you can probably glean some of the logic that's going on here. 373 00:16:56,010 --> 00:16:59,250 But some of it certainly should and probably does look magical. 
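Pieced together from the steps dictated above, resize.py might look roughly like the sketch below. The variable names follow what was said aloud, but splitting the logic into the functions `parse_args` and `resize`, and passing `Image.NEAREST` so that enlarged pixels keep hard edges, are assumptions added for this example. Pillow, which provides `PIL`, is a third-party library that has to be installed separately.

```python
import sys


def parse_args(argv):
    """Return (n, infile, outfile), or exit with the usage message."""
    if len(argv) != 4:
        sys.exit("Usage: python resize.py n infile outfile")
    return int(argv[1]), argv[2], argv[3]


def resize(n, infile, outfile):
    """Scale infile by a factor of n and save the result as outfile."""
    # Imported here so the argument check above works without Pillow installed.
    from PIL import Image

    inimage = Image.open(infile)
    width, height = inimage.size
    # Image.NEAREST is an addition for this sketch, not something dictated
    # in the lecture; it copies pixels rather than blending them.
    outimage = inimage.resize((width * n, height * n), Image.NEAREST)
    outimage.save(outfile)


# From the command line, the two pieces would be combined as:
#     resize(*parse_args(sys.argv))
```

Everything that took pages of C, such as reading headers, handling padding, and writing scanlines, is delegated to the library's `open`, `resize`, and `save` calls.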
374 00:16:59,250 --> 00:17:02,130 So let's use today to tease this apart and appreciate not only 375 00:17:02,130 --> 00:17:04,650 what you can do with another language like Python, 376 00:17:04,650 --> 00:17:07,079 but how it's similar and different and how it actually 377 00:17:07,079 --> 00:17:11,120 is built upon something like C. So let's do some comparisons first 378 00:17:11,120 --> 00:17:13,619 so that we can see that it's not a huge stretch to introduce 379 00:17:13,619 --> 00:17:15,552 yet another language so quickly. 380 00:17:15,552 --> 00:17:18,510 So recall that in Scratch if we wanted to set a variable, like counter, 381 00:17:18,510 --> 00:17:20,940 to zero, you might simply do something like this, 382 00:17:20,940 --> 00:17:22,470 setting it equal to zero at left. 383 00:17:22,470 --> 00:17:24,810 In C, we would do the same thing here at the right. 384 00:17:24,810 --> 00:17:27,609 In JavaScript, this instead looked a little different. 385 00:17:27,609 --> 00:17:29,670 What did we do in JavaScript? 386 00:17:29,670 --> 00:17:33,600 Yeah, we used let instead because we don't specify explicitly the type. 387 00:17:33,600 --> 00:17:37,560 But we do need to tell the computer, let me have this variable called counter. 388 00:17:37,560 --> 00:17:41,310 In Python, it's going to be that. 389 00:17:41,310 --> 00:17:43,500 So we've gotten rid of the type still. 390 00:17:43,500 --> 00:17:46,850 We've gotten rid of any mention of let or another keyword. 391 00:17:46,850 --> 00:17:50,880 And we've gotten rid of-- perhaps most gratifyingly-- 392 00:17:50,880 --> 00:17:52,440 semi-colons are gone. 393 00:17:52,440 --> 00:17:53,580 No more semi-colons. 394 00:17:53,580 --> 00:17:56,890 And no more curly braces in the way you've seen them thus far. 395 00:17:56,890 --> 00:17:59,285 So that was C, JavaScript, and now Python. 396 00:17:59,285 --> 00:18:00,660 So how about something like this? 
397 00:18:00,660 --> 00:18:03,407 In Scratch, if you wanted to increment a counter by one, 398 00:18:03,407 --> 00:18:04,740 you would use a block like this. 399 00:18:04,740 --> 00:18:07,200 In C, we would do the same on the right here in code. 400 00:18:07,200 --> 00:18:09,450 In JavaScript, did it look any different on the right? 401 00:18:09,450 --> 00:18:13,290 402 00:18:13,290 --> 00:18:14,670 No. 403 00:18:14,670 --> 00:18:16,410 You haven't had occasion to use this yet. 404 00:18:16,410 --> 00:18:20,640 But one of the sort of revelations of JavaScript was that's also JavaScript. 405 00:18:20,640 --> 00:18:21,960 It was identical. 406 00:18:21,960 --> 00:18:24,930 Something like this, though, is Python. 407 00:18:24,930 --> 00:18:26,100 So it's almost the same. 408 00:18:26,100 --> 00:18:27,683 But I've gotten rid of the semi-colon. 409 00:18:27,683 --> 00:18:29,370 But the logic is exactly the same-- 410 00:18:29,370 --> 00:18:32,760 set counter on the left equal to whatever it is on the right plus one 411 00:18:32,760 --> 00:18:33,720 additional value. 412 00:18:33,720 --> 00:18:34,660 What about this? 413 00:18:34,660 --> 00:18:38,040 This in C had what effect? 414 00:18:38,040 --> 00:18:39,160 Incrementing the variable. 415 00:18:39,160 --> 00:18:40,540 So this is exactly the same. 416 00:18:40,540 --> 00:18:43,796 It's sort of a nice shorthand notation for doing counter equals 417 00:18:43,796 --> 00:18:46,170 counter plus 1, which just gets a little tedious to type. 418 00:18:46,170 --> 00:18:48,300 We had that same syntax in JavaScript. 419 00:18:48,300 --> 00:18:51,284 And you can probably guess in Python, what's it going to look like? 420 00:18:51,284 --> 00:18:52,700 AUDIENCE: Same thing without the-- 421 00:18:52,700 --> 00:18:54,850 DAVID MALAN: Same thing minus the semi-colon. 422 00:18:54,850 --> 00:18:56,320 So pretty nice pattern so far. 423 00:18:56,320 --> 00:18:58,880 Languages just keep getting trimmer and trimmer, if you will. 
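Since the slides are not visible in the transcript, here is a short sketch of the Python forms just described: declaring a counter and incrementing it both the long way and with the shorthand.

```python
# No type, no let, and no semicolon.
counter = 0

# Increment the long way...
counter = counter + 1

# ...and with the shorthand, same as C and JavaScript minus the semicolon.
counter += 1

print(counter)  # prints 2
```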
424 00:18:58,880 --> 00:19:01,160 In C, recall that we could just do plus plus, 425 00:19:01,160 --> 00:19:04,310 which was another trick for automating that same process. 426 00:19:04,310 --> 00:19:06,440 JavaScript allows for the same. 427 00:19:06,440 --> 00:19:09,740 And if you really like this syntax, I can't show you a slide for Python. 428 00:19:09,740 --> 00:19:10,430 Doesn't exist. 429 00:19:10,430 --> 00:19:11,750 Can no longer do plus plus. 430 00:19:11,750 --> 00:19:13,160 So we're paying a price. 431 00:19:13,160 --> 00:19:15,620 The author of Python did not include this in the language. 432 00:19:15,620 --> 00:19:16,250 But that's OK. 433 00:19:16,250 --> 00:19:18,830 We at least have this one, which is not too horrible. 434 00:19:18,830 --> 00:19:20,680 So what else did we look at last time? 435 00:19:20,680 --> 00:19:23,780 An if condition like this, comparing if x is less than y, 436 00:19:23,780 --> 00:19:25,310 in C it looks like this. 437 00:19:25,310 --> 00:19:27,920 In JavaScript it looks like this, the same thing. 438 00:19:27,920 --> 00:19:31,740 In Python, it looks like this. 439 00:19:31,740 --> 00:19:34,460 So gone are the curly braces. 440 00:19:34,460 --> 00:19:36,050 Added is a colon. 441 00:19:36,050 --> 00:19:40,380 And what you don't see yet is that indentation is going to be important. 442 00:19:40,380 --> 00:19:43,370 So if any of you have been a little fast and loose with style50 443 00:19:43,370 --> 00:19:46,080 and, like we've seen at office hours, all of your code, 444 00:19:46,080 --> 00:19:48,288 however many lines you've written, for whatever reason 445 00:19:48,288 --> 00:19:50,970 is all aligned on the left and nothing is actually indented, 446 00:19:50,970 --> 00:19:52,678 now Python is not going to tolerate that. 447 00:19:52,678 --> 00:19:55,180 Python requires indentation for logic. 448 00:19:55,180 --> 00:19:57,680 And so this is actually a stylistic feature of the language.
449 00:19:57,680 --> 00:20:02,000 It forces you to adopt good visual stylistic habits because the code just 450 00:20:02,000 --> 00:20:04,520 won't run if you haven't indented it properly. 451 00:20:04,520 --> 00:20:07,070 So anything that's going to happen if x is less than y 452 00:20:07,070 --> 00:20:11,100 needs to be indented, say, four spaces underneath that colon. 453 00:20:11,100 --> 00:20:12,080 What else have we seen? 454 00:20:12,080 --> 00:20:14,630 In Scratch we had this block for ifs and elses. 455 00:20:14,630 --> 00:20:16,460 In C it looks like this. 456 00:20:16,460 --> 00:20:18,170 In JavaScript it looks like this. 457 00:20:18,170 --> 00:20:21,890 In Python it's going to look like this, albeit with indentation 458 00:20:21,890 --> 00:20:23,126 below each of those colons. 459 00:20:23,126 --> 00:20:23,750 How about this? 460 00:20:23,750 --> 00:20:27,260 When we had a three-way fork in the road-- if, else if, else-- 461 00:20:27,260 --> 00:20:28,970 in C it looks like this. 462 00:20:28,970 --> 00:20:30,440 JavaScript looked the same. 463 00:20:30,440 --> 00:20:32,880 In Python, it looks a little funky. 464 00:20:32,880 --> 00:20:34,130 It's going to look like this-- 465 00:20:34,130 --> 00:20:36,670 elif, but three colons this time, too. 466 00:20:36,670 --> 00:20:37,170 What else? 467 00:20:37,170 --> 00:20:41,480 We also looked at forever loops in Scratch, in C, and in JavaScript. 468 00:20:41,480 --> 00:20:45,950 You could use exactly the same syntax in Python, almost the same. 469 00:20:45,950 --> 00:20:48,320 Gone are the curly braces, added is the colon. 470 00:20:48,320 --> 00:20:52,580 And the slight subtlety, if you noticed, True and False 471 00:20:52,580 --> 00:20:54,440 are now proper nouns, if you will. 472 00:20:54,440 --> 00:20:56,960 Capital T, capital F is necessary to write. 473 00:20:56,960 --> 00:20:57,930 How about a for loop? 474 00:20:57,930 --> 00:21:00,870 So in Scratch, we could very easily say, repeat this 50 times.
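Since the slides are not part of the transcript, here is a rough sketch of the conditional and forever-loop syntax just described. The function name compare, the messages, and the break that lets the loop terminate are illustrative additions, not from the lecture.

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))

# A "forever" loop: note the capital T in True and the colon.
# The break exists only so that this sketch terminates.
iterations = 0
while True:
    iterations += 1
    if iterations == 3:
        break
```

Each indented body sits four spaces under the line ending in a colon; removing that indentation makes Python refuse to run the file.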
475 00:21:00,870 --> 00:21:03,410 C and JavaScript is a little pedantic in that you have 476 00:21:03,410 --> 00:21:05,810 to initialize and increment and check. 477 00:21:05,810 --> 00:21:08,060 Both C and JavaScript take that same approach, 478 00:21:08,060 --> 00:21:11,450 although in JavaScript we of course use let instead of int. 479 00:21:11,450 --> 00:21:15,470 Python is a little more succinct although a little less explicit 480 00:21:15,470 --> 00:21:16,880 step by step. 481 00:21:16,880 --> 00:21:17,960 You just do this. 482 00:21:17,960 --> 00:21:23,270 For i in range of 50 is the way of saying start iterating at 0, 483 00:21:23,270 --> 00:21:25,970 count all the way up to but not including 50, 484 00:21:25,970 --> 00:21:28,260 thereby giving you a range of values. 485 00:21:28,260 --> 00:21:30,770 So this is the one that's perhaps the most weird 486 00:21:30,770 --> 00:21:33,620 thus far, but still a little more succinct to write. 487 00:21:33,620 --> 00:21:37,250 So in C, we had so many data types-- bool, char, double, float, int, long, 488 00:21:37,250 --> 00:21:37,965 string-- 489 00:21:37,965 --> 00:21:40,340 the last of which, of course, came from the CS50 library. 490 00:21:40,340 --> 00:21:42,048 And there's others that you can use in C, 491 00:21:42,048 --> 00:21:44,810 as you might recall, from problem set 3, perhaps. 492 00:21:44,810 --> 00:21:47,790 In Python, we're going to shorten this list, at least initially, 493 00:21:47,790 --> 00:21:48,850 to just these data types. 494 00:21:48,850 --> 00:21:52,730 In Python, we're going to have bools for true-false, floats for real numbers, 495 00:21:52,730 --> 00:21:56,410 ints for integers, and then strs for strings. 496 00:21:56,410 --> 00:21:59,864 Just a little more succinct, but it does actually exist. str in Python 497 00:21:59,864 --> 00:22:00,530 is a real thing. 498 00:22:00,530 --> 00:22:02,569 It is not a CS50 addition. 
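A small sketch of the for loop and the four basic types named above. The sample values are made up for illustration.

```python
# range(50) yields 0, 1, ..., 49: it starts at 0 and stops before 50.
count = 0
for i in range(50):
    count += 1

first = range(50)[0]
last = range(50)[-1]

# The four basic types mentioned: bool, float, int, str.
type_names = [type(value).__name__ for value in (True, 3.14, 42, "hello")]

print(count, first, last, type_names)
```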
499 00:22:02,569 --> 00:22:04,610 There are other data types that come with Python. 500 00:22:04,610 --> 00:22:07,400 In fact, this is where the language gets powerful. 501 00:22:07,400 --> 00:22:11,196 And those of you who came from a Java background or C++, 502 00:22:11,196 --> 00:22:13,070 the subset of you who have programmed before, 503 00:22:13,070 --> 00:22:16,310 you have more features in Python just like you do in those other languages 504 00:22:16,310 --> 00:22:21,770 that we did not have in C. In Python, you have dictionaries, or hash tables. 505 00:22:21,770 --> 00:22:25,640 You have lists, which are like arrays, but that can automatically resize. 506 00:22:25,640 --> 00:22:28,330 You don't have to decide in advance how big or small they are. 507 00:22:28,330 --> 00:22:31,520 Range we just saw; it's a range of values, like 50 of them. 508 00:22:31,520 --> 00:22:33,050 Set, in the mathematical sense: 509 00:22:33,050 --> 00:22:35,600 it's a collection of things that ensures you don't 510 00:22:35,600 --> 00:22:37,790 have duplicates in that collection. 511 00:22:37,790 --> 00:22:40,820 And then tuple is a combination of things, kind of like in math 512 00:22:40,820 --> 00:22:43,760 when you have x comma y or latitude comma longitude. 513 00:22:43,760 --> 00:22:46,490 Any time you have pairs or triples or more of things, 514 00:22:46,490 --> 00:22:47,870 those are called tuples. 515 00:22:47,870 --> 00:22:51,410 And those are common in math courses and higher-level CS theory classes, 516 00:22:51,410 --> 00:22:52,320 as well. 517 00:22:52,320 --> 00:22:54,500 But we do give you, at least in this first week 518 00:22:54,500 --> 00:22:57,410 of our look at Python, a few functions from CS50, 519 00:22:57,410 --> 00:23:01,610 among them getFloat, getInt, and getString, which behave exactly 520 00:23:01,610 --> 00:23:02,729 like their C counterparts.
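To make the built-in collection types above concrete, here is a brief sketch. All of the names and sample data (phonebook, numbers, point, and so on) are hypothetical and not from the lecture.

```python
# dict: key-value pairs, comparable to a hash table.
phonebook = {"alice": "555-0100", "bob": "555-0199"}

# list: like an array, but it grows as needed.
numbers = [1, 2, 3]
numbers.append(4)

# set: a collection that discards duplicates.
unique = set([1, 1, 2, 3, 3])

# tuple: a fixed grouping, such as a latitude and longitude pair.
point = (42.37, -71.11)
latitude, longitude = point

print(phonebook["alice"], numbers, sorted(unique), latitude, longitude)
```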
521 00:23:02,729 --> 00:23:04,520 And this is just going to allow us to start 522 00:23:04,520 --> 00:23:08,180 writing code very reminiscent of what we did the last few weeks. 523 00:23:08,180 --> 00:23:10,370 But let's consider what's going to change 524 00:23:10,370 --> 00:23:12,740 as we're about to start writing our own programs. 525 00:23:12,740 --> 00:23:16,130 In C, when you wanted to use the CS50 library, you of course 526 00:23:16,130 --> 00:23:17,690 included its header file. 527 00:23:17,690 --> 00:23:20,990 That syntax is going to change in Python so that for this first week when 528 00:23:20,990 --> 00:23:24,410 you want to use the CS50 library, you're going to instead say 529 00:23:24,410 --> 00:23:29,060 from CS50 import and then a comma-separated list of the functions 530 00:23:29,060 --> 00:23:31,560 that you want to import or use in your code. 531 00:23:31,560 --> 00:23:32,810 So it's a little more precise. 532 00:23:32,810 --> 00:23:34,940 This syntax is not saying give me everything. 533 00:23:34,940 --> 00:23:36,650 It's saying give me this, this, and this other thing. 534 00:23:36,650 --> 00:23:39,762 And if you want to use more than one, you can just separate them by commas. 535 00:23:39,762 --> 00:23:42,470 As an aside, especially for those of you who have seen Python before, 536 00:23:42,470 --> 00:23:44,270 there's other ways to do this. 537 00:23:44,270 --> 00:23:45,740 There are several approaches. 538 00:23:45,740 --> 00:23:48,299 This is, perhaps, the most comparable for our purposes today. 539 00:23:48,299 --> 00:23:50,090 What else are you going to have to know? 540 00:23:50,090 --> 00:23:52,880 In C you had to compile your code. 541 00:23:52,880 --> 00:23:54,690 And you did so with clang, like this. 542 00:23:54,690 --> 00:23:57,320 And then you ran your program with dot slash hello.
543 00:23:57,320 --> 00:23:59,270 Or more simply, you did make hello and then 544 00:23:59,270 --> 00:24:02,630 we'd figure out the command for you in the IDE or the sandbox or lab. 545 00:24:02,630 --> 00:24:05,602 In Python, you're going to skip the compilation step. 546 00:24:05,602 --> 00:24:07,310 When you want to run a program in Python, 547 00:24:07,310 --> 00:24:09,600 you're going to do just what I did quickly before. 548 00:24:09,600 --> 00:24:13,400 You're just going to run the command Python and then the name of the file 549 00:24:13,400 --> 00:24:14,660 that you want to run. 550 00:24:14,660 --> 00:24:16,580 And the reason for this is as follows. 551 00:24:16,580 --> 00:24:20,750 In the world of C, recall that we had this sort of pipeline process 552 00:24:20,750 --> 00:24:24,530 where we have our source code as our input. 553 00:24:24,530 --> 00:24:29,750 And then we wanted to get to the point of machine code, the zeros and ones. 554 00:24:29,750 --> 00:24:32,990 And what was standing in between source code and machine code, 555 00:24:32,990 --> 00:24:34,430 just to be clear? 556 00:24:34,430 --> 00:24:36,420 What process? 557 00:24:36,420 --> 00:24:37,640 Yeah, so compiling. 558 00:24:37,640 --> 00:24:40,730 So we had a compiler in the middle whose purpose in life 559 00:24:40,730 --> 00:24:44,180 is by definition to translate one language to another. 560 00:24:44,180 --> 00:24:47,510 It happens to be an English-like language to a computer-like language, 561 00:24:47,510 --> 00:24:50,790 but a compiler is a general term that just converts one thing to another. 562 00:24:50,790 --> 00:24:53,750 And so this pipeline for C looked like this. 563 00:24:53,750 --> 00:24:56,570 And that's why you had to run Clang explicitly, or make. 564 00:24:56,570 --> 00:24:58,730 You had to induce that middle man operation 565 00:24:58,730 --> 00:25:01,370 to convert the language to something the computer understands. 
566 00:25:01,370 --> 00:25:06,980 Python and other languages are not typically compiled in the same way. 567 00:25:06,980 --> 00:25:09,020 They're generally said to be interpreted, 568 00:25:09,020 --> 00:25:11,600 whereby you don't compile them into zeros and ones 569 00:25:11,600 --> 00:25:12,860 and then run the program. 570 00:25:12,860 --> 00:25:17,330 You instead run a program that someone else wrote called Python. 571 00:25:17,330 --> 00:25:20,006 And that program is, by definition, an interpreter. 572 00:25:20,006 --> 00:25:22,130 And that interpreter's purpose in life, as the word 573 00:25:22,130 --> 00:25:25,580 implies, is to read your code top to bottom, left to right, 574 00:25:25,580 --> 00:25:27,380 and just do exactly what you tell it to do, 575 00:25:27,380 --> 00:25:32,420 step by step by step, without doing the upfront work of converting things 576 00:25:32,420 --> 00:25:33,260 to zeros and ones. 577 00:25:33,260 --> 00:25:35,840 So in the human world, if I speak English and someone there 578 00:25:35,840 --> 00:25:38,450 speaks Spanish and we don't speak each other's language, 579 00:25:38,450 --> 00:25:41,720 we might put a third human in between us, obviously a human interpreter. 580 00:25:41,720 --> 00:25:43,070 The role is very similar. 581 00:25:43,070 --> 00:25:45,112 The interpreter listens to me and then translates 582 00:25:45,112 --> 00:25:46,903 that to something the computer understands. 583 00:25:46,903 --> 00:25:48,760 But it doesn't get into zeros and ones. 584 00:25:48,760 --> 00:25:51,010 It just goes from one directly to the other. 585 00:25:51,010 --> 00:25:53,380 So the difference here in Python is that you still 586 00:25:53,380 --> 00:25:56,560 are going to write source code, like I quickly did for resize. 587 00:25:56,560 --> 00:25:59,380 And ultimately, we want to actually get it 588 00:25:59,380 --> 00:26:04,260 into a program called an interpreter. 
589 00:26:04,260 --> 00:26:07,320 And so the step ideally just looks like this. 590 00:26:07,320 --> 00:26:11,400 But as an aside, Python is a pretty sophisticated language. 591 00:26:11,400 --> 00:26:14,220 And even though we have the pleasure of running it just 592 00:26:14,220 --> 00:26:18,570 with one step instead of these two steps, there actually is, as an aside, 593 00:26:18,570 --> 00:26:21,470 some magic going on underneath the hood. 594 00:26:21,470 --> 00:26:25,500 And for the curious, there actually is, for performance reasons, 595 00:26:25,500 --> 00:26:28,920 a compiler built into Python that actually converts it to something 596 00:26:28,920 --> 00:26:31,500 intermediary called bytecode. 597 00:26:31,500 --> 00:26:34,120 And bytecode is what's actually interpreted. 598 00:26:34,120 --> 00:26:38,370 And so this is why Python, while potentially slower than C 599 00:26:38,370 --> 00:26:42,090 at certain tasks because you're not going to the low level zeros and ones, 600 00:26:42,090 --> 00:26:45,260 can actually be used in business applications and popular websites 601 00:26:45,260 --> 00:26:45,760 and such. 602 00:26:45,760 --> 00:26:47,590 And that didn't really work very well. 603 00:26:47,590 --> 00:26:51,000 And so it can be highly performing, as well. 604 00:26:51,000 --> 00:26:52,990 But more on that in a little bit. 605 00:26:52,990 --> 00:26:55,860 So with that said, if these are the differences not only 606 00:26:55,860 --> 00:26:58,410 syntactically but also mechanically, let's go ahead 607 00:26:58,410 --> 00:27:00,420 and actually write a program. 608 00:27:00,420 --> 00:27:03,127 So let me go ahead and go into the IDE. 609 00:27:03,127 --> 00:27:04,710 Let me close our examples from before. 610 00:27:04,710 --> 00:27:07,680 And let's start more simply because resize was a mouthful all at once. 611 00:27:07,680 --> 00:27:10,680 Let me go ahead and create a file called hello.py. 
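For the curious, the bytecode step mentioned above can be observed directly. This is a sketch that assumes CPython, the standard Python implementation, and uses its built-in compile function and the standard-library dis module. The exact listing printed differs from one Python version to the next, so none is shown here.

```python
import dis

source = 'print("hello, world")'

# Compile the source text to a code object holding bytecode.
# "hello.py" is only a label for error messages; no file is read.
code = compile(source, "hello.py", "exec")

# Print a human-readable listing of the bytecode instructions.
dis.dis(code)

# Hand the bytecode to the interpreter to run it.
exec(code)
```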
612 00:27:10,680 --> 00:27:13,140 And instead of writing this program in C, 613 00:27:13,140 --> 00:27:15,420 let me go ahead and just write hello world. 614 00:27:15,420 --> 00:27:17,190 So let's go ahead and do this. 615 00:27:17,190 --> 00:27:19,200 Print hello world. 616 00:27:19,200 --> 00:27:20,220 Done. 617 00:27:20,220 --> 00:27:23,830 That's my first program in Python, and truly my first program in Python, 618 00:27:23,830 --> 00:27:26,380 not sort of coming out swinging with resize. 619 00:27:26,380 --> 00:27:32,955 So what is not present in this file that was in something like hello.c? 620 00:27:32,955 --> 00:27:35,490 There is no main function necessary here. 621 00:27:35,490 --> 00:27:37,150 What else is missing? 622 00:27:37,150 --> 00:27:38,100 AUDIENCE: Printf. 623 00:27:38,100 --> 00:27:39,530 DAVID MALAN: There is no mention of printf. 624 00:27:39,530 --> 00:27:41,946 It's instead print, which is a little more human friendly. 625 00:27:41,946 --> 00:27:42,780 AUDIENCE: Libraries. 626 00:27:42,780 --> 00:27:45,320 DAVID MALAN: There is no mention of header files or libraries 627 00:27:45,320 --> 00:27:46,350 at the top of the file. 628 00:27:46,350 --> 00:27:48,760 I just dived right in and got to it. 629 00:27:48,760 --> 00:27:49,260 Yeah? 630 00:27:49,260 --> 00:27:50,160 AUDIENCE: No semi-colons. 631 00:27:50,160 --> 00:27:51,360 DAVID MALAN: No semi-colons. 632 00:27:51,360 --> 00:27:51,859 What else? 633 00:27:51,859 --> 00:27:55,340 634 00:27:55,340 --> 00:27:55,840 What else? 635 00:27:55,840 --> 00:27:56,340 Yeah? 636 00:27:56,340 --> 00:27:57,584 AUDIENCE: No backslash n. 637 00:27:57,584 --> 00:27:58,750 DAVID MALAN: No backslash n. 638 00:27:58,750 --> 00:28:00,940 I probably-- I haven't run it yet, but I think 639 00:28:00,940 --> 00:28:02,980 I will get that for free this time with Python. 640 00:28:02,980 --> 00:28:04,271 I don't have to be so explicit. 641 00:28:04,271 --> 00:28:06,612 Was there another hand here? 
642 00:28:06,612 --> 00:28:08,470 AUDIENCE: There's no f in printf. 643 00:28:08,470 --> 00:28:10,696 DAVID MALAN: There's no f in printf, yep. 644 00:28:10,696 --> 00:28:11,320 Something else? 645 00:28:11,320 --> 00:28:14,140 646 00:28:14,140 --> 00:28:15,222 There's no indentation. 647 00:28:15,222 --> 00:28:16,930 Though to be fair, there's only one line. 648 00:28:16,930 --> 00:28:18,054 But there's no indentation. 649 00:28:18,054 --> 00:28:18,730 That's fair. 650 00:28:18,730 --> 00:28:19,430 That's fair. 651 00:28:19,430 --> 00:28:21,070 There's no curly braces, as well. 652 00:28:21,070 --> 00:28:22,180 There's no mention of int. 653 00:28:22,180 --> 00:28:23,350 There's no mention of void. 654 00:28:23,350 --> 00:28:24,640 I mean, my God. 655 00:28:24,640 --> 00:28:26,890 Why didn't we just do this last time? 656 00:28:26,890 --> 00:28:29,290 And so this is why languages evolve. 657 00:28:29,290 --> 00:28:32,560 People realized years ago, gee, C is serving us well. 658 00:28:32,560 --> 00:28:35,380 Once I understand pointers and the syntax, OK, I got it. 659 00:28:35,380 --> 00:28:38,590 But my God, it's just so tedious to write even the simplest of programs 660 00:28:38,590 --> 00:28:42,910 because I have to do hash includes, standard io.h, int main void, I mean, 661 00:28:42,910 --> 00:28:46,060 all of this syntactic overhead that's getting in the way of you just 662 00:28:46,060 --> 00:28:48,520 doing the work you care about, which in simplest form 663 00:28:48,520 --> 00:28:50,650 here is just printing hello world. 664 00:28:50,650 --> 00:28:53,890 So Python and a lot of more modern languages-- among them, 665 00:28:53,890 --> 00:28:56,110 Ruby and PHP and others-- 666 00:28:56,110 --> 00:28:58,960 just get rid of a lot of that overhead so that you can just get down 667 00:28:58,960 --> 00:29:01,280 to work more quickly right away. 668 00:29:01,280 --> 00:29:02,790 So how do I go ahead and run this? 
669 00:29:02,790 --> 00:29:06,570 In C, recall, I would have done dot slash hello. 670 00:29:06,570 --> 00:29:09,070 But we just said a moment ago that's not the right approach. 671 00:29:09,070 --> 00:29:11,780 How do I go and run this program? 672 00:29:11,780 --> 00:29:16,450 Yeah, so I run literally a program that is coincidentally called Python itself. 673 00:29:16,450 --> 00:29:17,450 That is the interpreter. 674 00:29:17,450 --> 00:29:20,780 That's the man in the middle between me and my Spanish-speaking friend that 675 00:29:20,780 --> 00:29:25,280 just has to convert hello.py into whatever the computer itself 676 00:29:25,280 --> 00:29:26,150 understands. 677 00:29:26,150 --> 00:29:28,070 And so there, indeed, we have hello world. 678 00:29:28,070 --> 00:29:30,830 And as you notice, there's no backslash n in my code. 679 00:29:30,830 --> 00:29:33,210 But I am moving the cursor to a new line. 680 00:29:33,210 --> 00:29:34,880 So Python just decided, you know what? 681 00:29:34,880 --> 00:29:37,884 It's so damn common to have new lines, let's just add those by default. 682 00:29:37,884 --> 00:29:39,800 You know, the price we're going to pay is it's 683 00:29:39,800 --> 00:29:41,480 a little annoying to get rid of them. 684 00:29:41,480 --> 00:29:43,569 But we'll see that in a little bit, too. 685 00:29:43,569 --> 00:29:44,360 So just a tradeoff. 686 00:29:44,360 --> 00:29:45,693 All right, let's do another one. 687 00:29:45,693 --> 00:29:48,110 That's just the simplest of possible programs. 688 00:29:48,110 --> 00:29:51,350 Let's go ahead and do, say, something a little fancier 689 00:29:51,350 --> 00:29:54,720 that allows us to do something more than that. 690 00:29:54,720 --> 00:29:58,310 So let's go ahead, say, and compare not just 691 00:29:58,310 --> 00:30:00,960 that, but let's actually go get some user input. 692 00:30:00,960 --> 00:30:03,020 So for user input, there's a few ways to do this.
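A sketch of the hello.py program described above, followed by one way of getting rid of the automatic newline, using the end parameter of Python's built-in print. The lecture has not shown end at this point, so treat the last two lines as a preview.

```python
# hello.py as described: one line, no main, no header file, no semicolon.
print("hello, world")

# print appends "\n" by default; passing end="" suppresses it,
# so these two calls together produce a single line of output.
print("hello, ", end="")
print("world")
```

Saved as hello.py, this would be run with python hello.py, with no compilation step beforehand.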
693 00:30:03,020 --> 00:30:06,144 We'll do it the CS50 way initially, but these are training wheels this week 694 00:30:06,144 --> 00:30:08,510 that we'll use for just a week before we take them off, 695 00:30:08,510 --> 00:30:11,600 just bridging us from C to Python. 696 00:30:11,600 --> 00:30:13,670 Let me go ahead and call this string0.py 697 00:30:13,670 --> 00:30:15,200 because I'm dealing with strings. 698 00:30:15,200 --> 00:30:18,860 And let me go ahead and do s to give me a variable. 699 00:30:18,860 --> 00:30:19,770 Get string. 700 00:30:19,770 --> 00:30:23,060 Let me prompt the human for his or her name like this and then let me go ahead 701 00:30:23,060 --> 00:30:24,860 and say hello. 702 00:30:24,860 --> 00:30:28,070 And so now I just have to consider how to print out their name. 703 00:30:28,070 --> 00:30:30,590 And in Python, I can actually just do this. 704 00:30:30,590 --> 00:30:32,240 I don't need to do percent s. 705 00:30:32,240 --> 00:30:36,710 I don't need to put a second-- or, I do need to put a second comma here. 706 00:30:36,710 --> 00:30:39,170 But I can just do this, which is a little simpler. 707 00:30:39,170 --> 00:30:40,880 And this is not correct. 708 00:30:40,880 --> 00:30:43,902 I'm not practicing what I preached. 709 00:30:43,902 --> 00:30:44,610 Get rid of the f. 710 00:30:44,610 --> 00:30:46,530 Just print what you want to print, indeed. 711 00:30:46,530 --> 00:30:49,590 So s, notice, is apparently a variable because I'm assigning 712 00:30:49,590 --> 00:30:51,330 it a value from right to left. 713 00:30:51,330 --> 00:30:53,760 But notice that I'm not specifying the type. 714 00:30:53,760 --> 00:30:57,216 So Python does have types. str, we said, is the string equivalent. 715 00:30:57,216 --> 00:30:58,590 But you don't have to mention it. 716 00:30:58,590 --> 00:31:01,589 Python, like JavaScript, will just figure it out, even without a keyword 717 00:31:01,589 --> 00:31:02,160 like let.
718 00:31:02,160 --> 00:31:05,330 But I do need to add one thing. 719 00:31:05,330 --> 00:31:05,845 What's that? 720 00:31:05,845 --> 00:31:07,636 AUDIENCE: You need to import the getString? 721 00:31:07,636 --> 00:31:09,630 DAVID MALAN: Yeah, getString is a CS50 thing. 722 00:31:09,630 --> 00:31:12,570 And we're only going to use it for a week, but I do need to import it. 723 00:31:12,570 --> 00:31:15,990 And the syntax with which to do this is to say, from the CS50 library, 724 00:31:15,990 --> 00:31:17,940 import a function called get string. 725 00:31:17,940 --> 00:31:20,160 I don't need to import any more with commas. 726 00:31:20,160 --> 00:31:21,800 That one suffices for this program. 727 00:31:21,800 --> 00:31:23,060 Yeah. 728 00:31:23,060 --> 00:31:25,510 AUDIENCE: Would you want to-- 729 00:31:25,510 --> 00:31:29,920 instead of saying hello your name, would you want to first getName that says 730 00:31:29,920 --> 00:31:32,370 [INAUDIBLE]? 731 00:31:32,370 --> 00:31:35,524 You're not indicating where the error is [INAUDIBLE].. 732 00:31:35,524 --> 00:31:37,940 DAVID MALAN: Sure, let me come back to this in one second. 733 00:31:37,940 --> 00:31:40,520 Let's run this program first to demonstrate that it indeed 734 00:31:40,520 --> 00:31:42,810 does what we saw it do last week. 735 00:31:42,810 --> 00:31:50,987 And let me go ahead here and do this time Python of string 0. 736 00:31:50,987 --> 00:31:53,070 Let me go ahead and it's just waiting for my name. 737 00:31:53,070 --> 00:31:54,090 So I'll type in David. 738 00:31:54,090 --> 00:31:54,740 Hello, David. 739 00:31:54,740 --> 00:31:57,800 But as you propose, what if you wanted to flip this around? 740 00:31:57,800 --> 00:32:01,310 Well, suppose I wanted to say the person's name and then 741 00:32:01,310 --> 00:32:06,629 something like hello because I'm just excited to see them, instead. 742 00:32:06,629 --> 00:32:07,670 Let's see what this does. 
743 00:32:07,670 --> 00:32:10,430 Let me go ahead now and run Python of string 0. 744 00:32:10,430 --> 00:32:12,050 Type in my name. 745 00:32:12,050 --> 00:32:14,630 And it's almost what I think you intended. 746 00:32:14,630 --> 00:32:16,040 But there is a bug-- 747 00:32:16,040 --> 00:32:17,330 an aesthetic bug, at least. 748 00:32:17,330 --> 00:32:19,850 So it seems with Python's print function you don't need 749 00:32:19,850 --> 00:32:21,970 to use the placeholder like percent s. 750 00:32:21,970 --> 00:32:27,680 But it would seem to presumptuously add a space for you after everything you're 751 00:32:27,680 --> 00:32:31,190 passing in as an input to print itself. 752 00:32:31,190 --> 00:32:33,380 So notice print is taking how many arguments 753 00:32:33,380 --> 00:32:37,070 according to this highlighted portion? 754 00:32:37,070 --> 00:32:39,680 How many arguments might you infer? 755 00:32:39,680 --> 00:32:42,672 AUDIENCE: S space and then the thing. 756 00:32:42,672 --> 00:32:43,380 DAVID MALAN: Two? 757 00:32:43,380 --> 00:32:44,370 Yeah, so two. 758 00:32:44,370 --> 00:32:48,292 One is s, comma, and then the rest is what's highlighted in green here. 759 00:32:48,292 --> 00:32:51,000 Yes, there's a second comma there, but it's inside of the string. 760 00:32:51,000 --> 00:32:53,487 So just like in C, that's sort of a red herring. 761 00:32:53,487 --> 00:32:54,820 There's only two arguments here. 762 00:32:54,820 --> 00:32:56,730 But it seems that the print function-- and you would know this 763 00:32:56,730 --> 00:33:00,180 by reading that documentation-- if you pass in two or three or more arguments, 764 00:33:00,180 --> 00:33:01,230 it prints all of them. 765 00:33:01,230 --> 00:33:02,856 But separates them with a single space. 766 00:33:02,856 --> 00:33:03,938 So this isn't quite right. 767 00:33:03,938 --> 00:33:06,450 So this is actually a great motivation for cleaning this up. 
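The spacing behavior just described can be reproduced without the CS50 library. In this sketch a hard-coded name stands in for the value that get_string would return. The sep parameter in the last line is a real option of Python's print, but it is not the fix the lecture goes on to use.

```python
s = "David"  # stand-in for the name the user would type

# Two arguments: print joins them with a single space.
print("hello,", s)           # hello, David

# Flipped around, that same space lands before the comma.
print(s, ", hello")          # David , hello

# One possible fix: tell print to use an empty separator.
print(s, ", hello", sep="")  # David, hello
```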
768 00:33:06,450 --> 00:33:10,030 If I want to actually improve this program and tidy it up a little bit, 769 00:33:10,030 --> 00:33:11,940 let me do that in version one here. 770 00:33:11,940 --> 00:33:15,660 Let me create another file called, say, string1.py. 771 00:33:15,660 --> 00:33:17,850 Let me start where we started a moment ago. 772 00:33:17,850 --> 00:33:21,430 And let me actually use a placeholder akin to C. So if I want to do, 773 00:33:21,430 --> 00:33:27,150 for instance, hello so-and-so, it turns out you can actually say, hey Python, 774 00:33:27,150 --> 00:33:30,540 put a variable called s right here. 775 00:33:30,540 --> 00:33:34,920 However, if I run this as is, there's still going to be a bug. 776 00:33:34,920 --> 00:33:36,460 It's not quite solved yet. 777 00:33:36,460 --> 00:33:38,770 But when I hit Enter now and type in my name-- 778 00:33:38,770 --> 00:33:41,040 all right, this is obviously stupid looking. 779 00:33:41,040 --> 00:33:45,660 So it seems that I need to tell Python that this string that I'm passing in, 780 00:33:45,660 --> 00:33:48,480 hello comma so and so, is a formatted string. 781 00:33:48,480 --> 00:33:52,080 It's a placeholder string that it should make some changes to. 782 00:33:52,080 --> 00:33:55,230 And this is a little weird, cryptic syntactically in Python. 783 00:33:55,230 --> 00:34:01,050 But the way you do this in Python is you put an f before the string itself. 784 00:34:01,050 --> 00:34:04,020 So I'm sorry, we got rid of the f a moment ago. 785 00:34:04,020 --> 00:34:05,190 So we just called it print. 786 00:34:05,190 --> 00:34:07,230 Now we're reusing a different f here. 787 00:34:07,230 --> 00:34:09,270 And it's stupid-looking syntax, admittedly. 788 00:34:09,270 --> 00:34:12,270 But this just means hey, Python, the following double quotes 789 00:34:12,270 --> 00:34:14,280 or single quotes that you're about to see should 790 00:34:14,280 --> 00:34:16,199 be formatted by you in a special way. 
791 00:34:16,199 --> 00:34:18,780 And it literally goes at the beginning of the string 792 00:34:18,780 --> 00:34:21,010 even though that does admittedly look weird. 793 00:34:21,010 --> 00:34:24,409 But if I now rerun this Python string one and type in my name now, 794 00:34:24,409 --> 00:34:26,110 now it does the substitution. 795 00:34:26,110 --> 00:34:29,460 So I can flip it around logically much more flexibly now 796 00:34:29,460 --> 00:34:33,150 and do something like hello because now I'm passing in one argument 797 00:34:33,150 --> 00:34:35,230 that print will format for me. 798 00:34:35,230 --> 00:34:38,699 So when I type in my name now, I'm not going to get that superfluous space. 799 00:34:38,699 --> 00:34:42,090 And now I have complete control over the formatting of the string. 800 00:34:42,090 --> 00:34:46,230 So you know, sort of two steps forward, one step back, perhaps, syntactically. 801 00:34:46,230 --> 00:34:48,480 But it does allow us to do what we want this to do. 802 00:34:48,480 --> 00:34:50,230 We could write the same program using ints 803 00:34:50,230 --> 00:34:52,469 and floats using getInt and getFloat. 804 00:34:52,469 --> 00:34:53,860 Would look exactly the same. 805 00:34:53,860 --> 00:34:57,330 You don't need to worry about percent s versus percent i versus percent f. 806 00:34:57,330 --> 00:35:01,440 You just type in the variable name inside of those curly braces. 807 00:35:01,440 --> 00:35:04,834 All right, let me go ahead and do some quick math. 808 00:35:04,834 --> 00:35:06,000 Let me go ahead and do this. 809 00:35:06,000 --> 00:35:07,680 Let me go ahead and create a new file. 810 00:35:07,680 --> 00:35:10,170 We'll call this ints.py for integers. 811 00:35:10,170 --> 00:35:13,680 And let me go ahead and get this access to-- 812 00:35:13,680 --> 00:35:18,249 how about the CS50 library's get int method or function which exists. 
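A sketch of the formatted-string behavior described above, again with a hard-coded name standing in for get_string so that it runs without the CS50 library. The variable names plain, formatted, and flipped are illustrative.

```python
s = "David"  # stand-in for the name the user would type

# Without the f prefix, the curly braces are printed literally.
plain = "hello, {s}"
print(plain)      # hello, {s}

# With the f prefix, {s} is replaced by the value of s.
formatted = f"hello, {s}"
print(formatted)  # hello, David

# A single formatted argument gives full control over spacing.
flipped = f"{s}, hello"
print(flipped)    # David, hello
```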
813 00:35:18,249 --> 00:35:20,040 Then let me go ahead and declare a variable 814 00:35:20,040 --> 00:35:23,541 called x and get an int from the user and just prompt him or her for x. 815 00:35:23,541 --> 00:35:25,290 Then let me go ahead and do the same thing 816 00:35:25,290 --> 00:35:27,594 and just get y from them, as well. 817 00:35:27,594 --> 00:35:29,760 And then down here, let me just do some simple math. 818 00:35:29,760 --> 00:35:34,270 And we did this way back in week one by printing as follows. 819 00:35:34,270 --> 00:35:38,220 Let me go ahead and just print out x plus y equals-- 820 00:35:38,220 --> 00:35:42,090 and this is what's cool now about this curly brace feature. 821 00:35:42,090 --> 00:35:46,050 You can actually do not just variable's names, 822 00:35:46,050 --> 00:35:48,180 but you can do simple operations in there, too. 823 00:35:48,180 --> 00:35:52,434 I can literally do math inside of those curly braces and print out that value. 824 00:35:52,434 --> 00:35:55,600 But of course, this alone is just going to literally print the curly braces. 825 00:35:55,600 --> 00:35:56,960 What do I have to add? 826 00:35:56,960 --> 00:35:58,560 Yeah, so it looks a little weird. 827 00:35:58,560 --> 00:36:00,420 But this now will solve that problem. 828 00:36:00,420 --> 00:36:05,924 It will print literally x plus y equals whatever the actual sum is. 829 00:36:05,924 --> 00:36:07,840 AUDIENCE: Just following up, what does f mean? 830 00:36:07,840 --> 00:36:08,673 DAVID MALAN: Format. 831 00:36:08,673 --> 00:36:10,180 Format the following string for me. 832 00:36:10,180 --> 00:36:11,060 Good question. 833 00:36:11,060 --> 00:36:14,900 Let's do just a few copy/paste but change the operator here. 834 00:36:14,900 --> 00:36:19,520 So x minus y, I want to see what this looks like. 835 00:36:19,520 --> 00:36:21,070 X, say-- what did we do last time? 836 00:36:21,070 --> 00:36:22,634 Multiplying by y. 837 00:36:22,634 --> 00:36:23,800 I want to do that math, too. 
838 00:36:23,800 --> 00:36:26,420 I can divide as well. 839 00:36:26,420 --> 00:36:29,140 And then we had one more, which was modulo, 840 00:36:29,140 --> 00:36:31,900 or modular arithmetic, which, recall, was the percent sign. 841 00:36:31,900 --> 00:36:33,922 So syntactically, it's identical to C. 842 00:36:33,922 --> 00:36:36,880 We're just adding this curly brace notation just for the print function 843 00:36:36,880 --> 00:36:37,630 right now. 844 00:36:37,630 --> 00:36:38,838 Let me go ahead and run this. 845 00:36:38,838 --> 00:36:40,390 Python of ints.py. 846 00:36:40,390 --> 00:36:44,710 And let me go ahead and do one and say two. 847 00:36:44,710 --> 00:36:46,840 So 1 plus 2 is 3. 848 00:36:46,840 --> 00:36:48,610 1 minus 2 is negative 1. 849 00:36:48,610 --> 00:36:50,430 1 times 2 is 2. 850 00:36:50,430 --> 00:36:53,280 1 divided by 2 is 0.5. 851 00:36:53,280 --> 00:36:57,700 And 1 divided by 2, taking the remainder, is 1. 852 00:36:57,700 --> 00:37:00,550 So I think this checks out mathematically. 853 00:37:00,550 --> 00:37:03,235 But you should be a little surprised by one of these outcomes. 854 00:37:03,235 --> 00:37:06,340 855 00:37:06,340 --> 00:37:07,090 Say again? 856 00:37:07,090 --> 00:37:08,560 AUDIENCE: You're getting a float. 857 00:37:08,560 --> 00:37:10,184 DAVID MALAN: Yeah, I'm getting a float. 858 00:37:10,184 --> 00:37:14,560 Like, Python itself seems to have fixed a bug in C itself. 859 00:37:14,560 --> 00:37:20,420 What happened in C when you divided 1, an integer, by 2, an integer, in C? 860 00:37:20,420 --> 00:37:21,670 You would get another integer. 861 00:37:21,670 --> 00:37:23,670 And what's the closest integer you can represent 862 00:37:23,670 --> 00:37:26,174 that doesn't have a decimal point? 863 00:37:26,174 --> 00:37:29,530 0, because C would truncate everything after the decimal point. 864 00:37:29,530 --> 00:37:32,220 And yet, Python seems to have fixed this problem.
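The arithmetic just run can be sketched as follows. This is a minimal version, not the lecture's exact ints.py: the two values are hard-coded as an assumption, standing in for the integers the lecture reads from the user with the CS50 library, and the wording of each printed line is illustrative.

```python
# Stand-ins for the two integers typed in during the demo (an assumption;
# the lecture prompts the user for them instead).
x = 1
y = 2

# The leading f turns each string into a format string, so the expression
# inside the curly braces is evaluated and substituted.
lines = [
    f"{x} plus {y} is {x + y}",
    f"{x} minus {y} is {x - y}",
    f"{x} times {y} is {x * y}",
    f"{x} divided by {y} is {x / y}",
    f"remainder of {x} divided by {y} is {x % y}",
]

for line in lines:
    print(line)
```

Dividing the two ints with a single slash yields the float 0.5 rather than a truncated 0, which is the surprising outcome discussed above.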
865 00:37:32,220 --> 00:37:34,340 And this is actually a somewhat recent phenomenon. 866 00:37:34,340 --> 00:37:36,670 And this is a huge religious debate as to whether or not 867 00:37:36,670 --> 00:37:40,900 you should just keep the historical definition of division, which 868 00:37:40,900 --> 00:37:44,170 is floor division, so to speak, or we should make it truly division, 869 00:37:44,170 --> 00:37:45,910 like we all grew up learning in school. 870 00:37:45,910 --> 00:37:50,230 Python took the latter approach and made division mean division, true division, 871 00:37:50,230 --> 00:37:52,459 where if you divide two ints you get back a float. 872 00:37:52,459 --> 00:37:54,250 Of course, this is a problem if people want 873 00:37:54,250 --> 00:37:56,740 to write code that assumes that it's going to be truncated. 874 00:37:56,740 --> 00:37:59,350 That can actually be a powerful feature. 875 00:37:59,350 --> 00:38:02,740 So it turns out, and you won't have terribly many occasions to use this, 876 00:38:02,740 --> 00:38:05,650 but the compromise in the world was, all right, if you really 877 00:38:05,650 --> 00:38:11,020 want the old behavior of the division in Python, we will give it back to you. 878 00:38:11,020 --> 00:38:12,384 You have to use two slashes. 879 00:38:12,384 --> 00:38:15,050 So again, another one of these two steps forward, one step back. 880 00:38:15,050 --> 00:38:18,040 But it's there, so problems can still be solved in the same way. 881 00:38:18,040 --> 00:38:22,080 And this, if I save it and rerun that same code, 1 and 2, 882 00:38:22,080 --> 00:38:27,160 now I get back 0, just as I would in C, which does have some applicability. 883 00:38:27,160 --> 00:38:29,410 Let's do one other example now involving some numbers. 884 00:38:29,410 --> 00:38:32,520 And let me go ahead and call this floats.py. 885 00:38:32,520 --> 00:38:36,942 And let me do the same thing, from CS50 import getFloat this time.
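The two division operators just described can be compared side by side. This is a small sketch with hard-coded values (an assumption; the lecture reads them from the user). The last line goes slightly beyond the lecture: for negative operands, Python's two-slash operator rounds down toward negative infinity, whereas C's integer division truncates toward zero, so the two only agree when the result is non-negative.

```python
x = 1
y = 2

true_quotient = x / y    # true division: two ints give back a float
floor_quotient = x // y  # floor division: the "old" integer behavior

print(f"{x} / {y} = {true_quotient}")
print(f"{x} // {y} = {floor_quotient}")

# Floor division rounds down, so a negative quotient differs from C,
# where -1 / 2 would truncate to 0.
negative_floor = -1 // 2
print(f"-1 // 2 = {negative_floor}")
```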
886 00:38:36,942 --> 00:38:38,650 So I can deal with floating point values. 887 00:38:38,650 --> 00:38:40,570 Let me declare a variable x and get a float 888 00:38:40,570 --> 00:38:42,580 and we'll ask the user for a variable x. 889 00:38:42,580 --> 00:38:45,842 Then let's go ahead and get another float, and just as before, call it y. 890 00:38:45,842 --> 00:38:47,800 But this time both of them are, indeed, floats. 891 00:38:47,800 --> 00:38:51,070 Then let me go ahead and do some math, x plus y equals z. 892 00:38:51,070 --> 00:38:52,660 Let's give myself a third variable. 893 00:38:52,660 --> 00:38:55,480 And then let me just go ahead and print out a similar message-- 894 00:38:55,480 --> 00:39:00,200 x divided by y equals z. 895 00:39:00,200 --> 00:39:03,130 All right, and let me go ahead and save this, clear my terminal, 896 00:39:03,130 --> 00:39:05,620 and do Python of floats.py. 897 00:39:05,620 --> 00:39:08,270 1 divided by 10 this time. 898 00:39:08,270 --> 00:39:09,700 And I get-- dammit, bug. 899 00:39:09,700 --> 00:39:11,270 How do I fix this? 900 00:39:11,270 --> 00:39:12,520 All right, so just a simple f. 901 00:39:12,520 --> 00:39:13,570 Make it a format string. 902 00:39:13,570 --> 00:39:14,330 No big deal. 903 00:39:14,330 --> 00:39:16,610 So let's rerun this, 1, 10. 904 00:39:16,610 --> 00:39:19,000 OK, hoo, hoo. 905 00:39:19,000 --> 00:39:21,120 That's a new one. 906 00:39:21,120 --> 00:39:22,430 What is going on there? 907 00:39:22,430 --> 00:39:27,180 908 00:39:27,180 --> 00:39:29,474 AUDIENCE: [INAUDIBLE] 909 00:39:29,474 --> 00:39:32,640 DAVID MALAN: I did define z in the line above it, and what was your comment? 910 00:39:32,640 --> 00:39:33,806 AUDIENCE: You used x plus y. 911 00:39:33,806 --> 00:39:36,220 DAVID MALAN: I did use x plus y, but I think I-- 912 00:39:36,220 --> 00:39:37,330 oh, wait, OK. 913 00:39:37,330 --> 00:39:37,870 I'm sorry. 914 00:39:37,870 --> 00:39:41,760 Let's-- OK, so we can fix that. 
915 00:39:41,760 --> 00:39:44,590 Let's-- sorry. 916 00:39:44,590 --> 00:39:45,900 There. 917 00:39:45,900 --> 00:39:48,450 OK, so 1, 10. 918 00:39:48,450 --> 00:39:51,900 Hmm, still wrong. 919 00:39:51,900 --> 00:39:54,150 Good catch, thank you, though. 920 00:39:54,150 --> 00:39:57,840 Why is 1 plus 2 11-- 921 00:39:57,840 --> 00:40:01,210 or 1 plus 10, 11? 922 00:40:01,210 --> 00:40:01,961 Yeah? 923 00:40:01,961 --> 00:40:03,700 AUDIENCE: [INAUDIBLE]. 924 00:40:03,700 --> 00:40:04,950 DAVID MALAN: Wait, wait, wait. 925 00:40:04,950 --> 00:40:06,260 Sorry. 926 00:40:06,260 --> 00:40:08,100 AUDIENCE: [INAUDIBLE] 927 00:40:08,100 --> 00:40:10,410 [LAUGHTER] 928 00:40:10,410 --> 00:40:13,690 DAVID MALAN: This brings me back to my earlier point as to how tired I am. 929 00:40:13,690 --> 00:40:14,590 So this is correct. 930 00:40:14,590 --> 00:40:19,460 So Python does math correctly. 931 00:40:19,460 --> 00:40:21,760 But-- OK, horrifying. 932 00:40:21,760 --> 00:40:24,850 All right, so now let's do division and try 933 00:40:24,850 --> 00:40:28,690 to make the point I think I meant to make late last night where if I do 1 934 00:40:28,690 --> 00:40:35,260 divided by 10, OK, 1 divided by 10, as expected, does actually work here. 935 00:40:35,260 --> 00:40:36,707 So 0.1, that's correct. 936 00:40:36,707 --> 00:40:39,040 But remember in C-- let me dig myself out of this hole-- 937 00:40:39,040 --> 00:40:42,430 remember in C what happened if we dug a little deeper 938 00:40:42,430 --> 00:40:44,750 and we looked a little past the first decimal point. 939 00:40:44,750 --> 00:40:46,080 So how do I do this in Python? 940 00:40:46,080 --> 00:40:47,510 It's actually pretty similar. 941 00:40:47,510 --> 00:40:50,860 Let me go ahead and not just show myself z but go ahead 942 00:40:50,860 --> 00:40:54,970 and print out to, let's say, two decimal places that same value. 943 00:40:54,970 --> 00:40:56,170 The syntax here is weird.
944 00:40:56,170 --> 00:40:59,470 It's different from C. But you literally take the variable that you want 945 00:40:59,470 --> 00:41:02,380 to format, you put a colon and then a dot-- 946 00:41:02,380 --> 00:41:03,880 because you want to adjust the dot-- 947 00:41:03,880 --> 00:41:06,220 and then you want to say something like 2f. 948 00:41:06,220 --> 00:41:09,670 So this is saying, hey, Python, format the variable 949 00:41:09,670 --> 00:41:13,657 that's to the left of the colon using two decimal points. 950 00:41:13,657 --> 00:41:15,490 And by the way, it's a floating point value. 951 00:41:15,490 --> 00:41:16,906 So this f has a different meaning. 952 00:41:16,906 --> 00:41:18,070 This is f as in float. 953 00:41:18,070 --> 00:41:20,570 The f to the left is in format. 954 00:41:20,570 --> 00:41:22,020 So let me go ahead and run this. 955 00:41:22,020 --> 00:41:23,590 1 divided by 10. 956 00:41:23,590 --> 00:41:25,220 And OK, still looking pretty good. 957 00:41:25,220 --> 00:41:28,870 Let's do maybe three decimal places, save that, rerun it. 958 00:41:28,870 --> 00:41:30,430 1 divided by 10. 959 00:41:30,430 --> 00:41:31,490 Still pretty good. 960 00:41:31,490 --> 00:41:33,040 Let's get a little ambitious. 961 00:41:33,040 --> 00:41:37,870 Let's do it 50 decimal places out, 1 divided by 10, and damn it. 962 00:41:37,870 --> 00:41:40,690 Python has not fixed this fundamental problem. 963 00:41:40,690 --> 00:41:42,400 So we describe this problem as what? 964 00:41:42,400 --> 00:41:45,590 965 00:41:45,590 --> 00:41:50,092 What's the sort of buzzword here to sort of explain or forgive this issue? 966 00:41:50,092 --> 00:41:51,191 AUDIENCE: [INAUDIBLE] 967 00:41:51,191 --> 00:41:53,690 DAVID MALAN: This is an integer overflow, related in spirit. 
968 00:41:53,690 --> 00:41:55,610 Integer overflow literally happens when you're 969 00:41:55,610 --> 00:41:58,580 doing lots of addition and something's rolling over from a big value 970 00:41:58,580 --> 00:42:01,160 to a small or even a negative. 971 00:42:01,160 --> 00:42:01,984 Similar in spirit. 972 00:42:01,984 --> 00:42:02,484 Yeah? 973 00:42:02,484 --> 00:42:08,175 AUDIENCE: [INAUDIBLE] 974 00:42:08,175 --> 00:42:08,925 DAVID MALAN: Yeah. 975 00:42:08,925 --> 00:42:12,320 If you want to have an infinite amount of precision all the way out, 976 00:42:12,320 --> 00:42:13,910 you need an infinite amount of memory. 977 00:42:13,910 --> 00:42:16,730 And no Mac or PC or phone has an infinite amount of memory. 978 00:42:16,730 --> 00:42:20,510 At some point, a line is drawn in the sand and you can only be so precise. 979 00:42:20,510 --> 00:42:24,230 And so imprecision was the analog in the floating point world 980 00:42:24,230 --> 00:42:27,290 to overflow, recall, where if you only have a finite number of bits 981 00:42:27,290 --> 00:42:29,060 you can do really well up to a point. 982 00:42:29,060 --> 00:42:32,510 But eventually, the computer's got to estimate that value for you 983 00:42:32,510 --> 00:42:35,370 because you can't represent an infinite number of values. 984 00:42:35,370 --> 00:42:38,960 So this is to say Python is just as limited, fundamentally, 985 00:42:38,960 --> 00:42:40,910 as some other languages like C. So we've not 986 00:42:40,910 --> 00:42:42,410 gotten rid of all of those problems. 987 00:42:42,410 --> 00:42:45,690 But frankly, in the world of data science and analytics, 988 00:42:45,690 --> 00:42:47,690 it's certainly important precise mathematics. 989 00:42:47,690 --> 00:42:49,710 So there are solutions to this problem. 
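The formatting experiment above can be reproduced with a short sketch. The values are hard-coded floats (an assumption; the lecture reads them from the user), and the variable names two_places and fifty_places are illustrative only.

```python
x = 1.0
y = 10.0
z = x / y

# Inside the braces: the value, a colon, then a format specification.
# ".2f" means two digits after the decimal point, formatted as a float.
two_places = f"{z:.2f}"
fifty_places = f"{z:.50f}"

print(f"{x} divided by {y} equals {two_places}")
print(f"{x} divided by {y} equals {fifty_places}")
```

At two decimal places the result looks exact. At fifty, the digits far to the right are no longer all zeros, because the float stores only the nearest value it can represent in a finite number of bits.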
990 00:42:49,710 --> 00:42:52,070 But it requires special libraries, typically, 991 00:42:52,070 --> 00:42:55,070 importing something that allows you to use as much memory 992 00:42:55,070 --> 00:42:58,560 as you want more than just the default amount of memory. 993 00:42:58,560 --> 00:43:00,350 So that problem there still exists. 994 00:43:00,350 --> 00:43:03,470 Let me go ahead and open up one other example here. 995 00:43:03,470 --> 00:43:07,730 And in fact, in C, you'll recall that we had this example here. 996 00:43:07,730 --> 00:43:13,240 In C we had a program called overflow.c. 997 00:43:13,240 --> 00:43:16,330 And notice that this code in C from a few weeks 998 00:43:16,330 --> 00:43:19,915 back just multiplied i by 2, by 2, by 2. 999 00:43:19,915 --> 00:43:21,790 So it was doing exponentiation, so to speak-- 1000 00:43:21,790 --> 00:43:25,000 1 to 2 to 4 to 8, 16, 32, 64, and so forth. 1001 00:43:25,000 --> 00:43:27,670 What happened if we waited long enough and watched 1002 00:43:27,670 --> 00:43:30,913 this program a few weeks back? 1003 00:43:30,913 --> 00:43:32,680 AUDIENCE: You go to 5 billion instead of-- 1004 00:43:32,680 --> 00:43:36,846 DAVID MALAN: Yeah, we hit roughly 5 billion or 4 billion-- 1005 00:43:36,846 --> 00:43:39,970 or rather, we technically hit, I think, 2 billion, and then it rolled over. 1006 00:43:39,970 --> 00:43:41,410 And it actually created a problem. 1007 00:43:41,410 --> 00:43:42,230 So let me actually do this. 1008 00:43:42,230 --> 00:43:44,627 Let me go ahead and make overflow so we can demonstrate 1009 00:43:44,627 --> 00:43:47,710 the points that you made earlier about integer overflow, which is, indeed, 1010 00:43:47,710 --> 00:43:48,430 this one. 1011 00:43:48,430 --> 00:43:50,860 Let me go ahead now and run overflow. 1012 00:43:50,860 --> 00:43:54,290 I'll expand my window just so we can fit a little more in the screen. 
1013 00:43:54,290 --> 00:43:55,860 And as this runs-- 1014 00:43:55,860 --> 00:43:57,490 whoops, let me fix this. 1015 00:43:57,490 --> 00:43:59,280 Here we go. 1016 00:43:59,280 --> 00:44:01,000 Let me go ahead and make overflow. 1017 00:44:01,000 --> 00:44:06,290 And now 1, 2, 4, 8, 16, 32, and so forth. 1018 00:44:06,290 --> 00:44:08,590 It's a little slow to start, but doubling and doubling 1019 00:44:08,590 --> 00:44:11,036 is going to get us up to a big value pretty quickly. 1020 00:44:11,036 --> 00:44:13,660 This is indeed going to overflow once we hit roughly 2 billion. 1021 00:44:13,660 --> 00:44:14,410 Why? 1022 00:44:14,410 --> 00:44:16,970 Why two billion, give or take? 1023 00:44:16,970 --> 00:44:18,700 Why that value in C? 1024 00:44:18,700 --> 00:44:19,876 Yeah? 1025 00:44:19,876 --> 00:44:21,956 AUDIENCE: [INAUDIBLE] 1026 00:44:21,956 --> 00:44:23,830 DAVID MALAN: Yeah, that's how much an integer 1027 00:44:23,830 --> 00:44:27,700 can store because, recall, in C an int is typically 32 bits or 4 bytes. 1028 00:44:27,700 --> 00:44:31,120 And with 32 bits, you can represent four billion possible values. 1029 00:44:31,120 --> 00:44:34,090 And if half of those values are positive and half of them are negative, 1030 00:44:34,090 --> 00:44:37,720 it stands to reason that the highest you can count is roughly 2 billion. 1031 00:44:37,720 --> 00:44:41,630 And indeed, once we try to count up just doubling one billion, we overflow. 1032 00:44:41,630 --> 00:44:44,830 So to your point earlier, overflow is still an issue, 1033 00:44:44,830 --> 00:44:46,540 but in the context of integers. 1034 00:44:46,540 --> 00:44:49,000 But now let's try a Python version of this. 1035 00:44:49,000 --> 00:44:52,887 Let me go ahead now and open up overflow.py, 1036 00:44:52,887 --> 00:44:54,470 which is a program I wrote in advance.
1037 00:44:54,470 --> 00:44:56,428 It's on the course's website, as always, if you 1038 00:44:56,428 --> 00:44:58,300 want to take a look more closely. 1039 00:44:58,300 --> 00:45:04,240 And if I go into this file in weeks one, overflow.py, we see this code. 1040 00:45:04,240 --> 00:45:05,500 So it's almost the same. 1041 00:45:05,500 --> 00:45:07,624 But notice I'm using another library that we've not 1042 00:45:07,624 --> 00:45:09,590 seen before, from time import sleep. 1043 00:45:09,590 --> 00:45:10,340 It's kind of cute. 1044 00:45:10,340 --> 00:45:12,007 So this allows me to sleep for a second. 1045 00:45:12,007 --> 00:45:14,131 That's going to get tedious quickly, but that's OK. 1046 00:45:14,131 --> 00:45:15,160 Let's do this real fast. 1047 00:45:15,160 --> 00:45:18,580 If I go into the source six directory, weeks one, 1048 00:45:18,580 --> 00:45:23,530 and run Python of overflow.py, it's the same function-- or same program, 1049 00:45:23,530 --> 00:45:24,542 functionally. 1050 00:45:24,542 --> 00:45:26,500 But honestly, this is getting a little tedious. 1051 00:45:26,500 --> 00:45:31,090 Let's go ahead and not sleep for a second every time, save and reload. 1052 00:45:31,090 --> 00:45:32,610 Let's just run the thing. 1053 00:45:32,610 --> 00:45:35,800 Whew, look at it go. 1054 00:45:35,800 --> 00:45:36,520 Only up there. 1055 00:45:36,520 --> 00:45:38,440 Look up there. 1056 00:45:38,440 --> 00:45:42,070 What's it doing differently? 1057 00:45:42,070 --> 00:45:44,380 It's counting a lot higher than 2 billion. 1058 00:45:44,380 --> 00:45:47,387 So what might you infer about integers in Python? 1059 00:45:47,387 --> 00:45:48,670 AUDIENCE: [INAUDIBLE] 1060 00:45:48,670 --> 00:45:50,146 DAVID MALAN: Say again? 1061 00:45:50,146 --> 00:45:54,257 AUDIENCE: An integer is defined to be quite a number of bits. 1062 00:45:54,257 --> 00:45:57,090 DAVID MALAN: OK, an integer is defined to be quite a number of bits. 
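The doubling behavior can be sketched without the sleep call. This is not the course's overflow.py, only a bounded approximation of it: the loop stops after 40 doublings (an arbitrary choice) and collects the values in a list so they can be inspected. In Python 3 an int is not a fixed-width type; it grows to use as much memory as the value needs, which is why the count passes the 32-bit limit without rolling over.

```python
# Largest value a signed 32-bit int can hold, the limit the C version hit.
INT32_MAX = 2**31 - 1

i = 1
values = []
for _ in range(40):
    values.append(i)
    i *= 2  # keep doubling, as the lecture's program does

# Print only the values that would not have fit in a 32-bit int.
for value in values:
    if value > INT32_MAX:
        print(value)
```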
1063 00:45:57,090 --> 00:45:58,656 And indeed, that's the case. 1064 00:45:58,656 --> 00:46:00,030 Python is not actually this slow. 1065 00:46:00,030 --> 00:46:02,940 It's because we're running a web based IDE and the internet itself 1066 00:46:02,940 --> 00:46:03,900 is a little slow. 1067 00:46:03,900 --> 00:46:06,899 And so what's happening here is just the internet is getting in the way. 1068 00:46:06,899 --> 00:46:10,629 But suffice it to say that Python is counting up way, way higher than C was. 1069 00:46:10,629 --> 00:46:13,170 And that's the power you get by just using larger data types. 1070 00:46:13,170 --> 00:46:16,470 We could have done this in C. We could have used longs, for instance. 1071 00:46:16,470 --> 00:46:20,269 But notice that with Python you just get more by default out of the box. 1072 00:46:20,269 --> 00:46:22,310 Let's go ahead and take a five minute break here. 1073 00:46:22,310 --> 00:46:24,476 And when we resume, we'll introduce some more syntax 1074 00:46:24,476 --> 00:46:25,950 and solve some more problems. 1075 00:46:25,950 --> 00:46:28,950 All right, so let's take a look at a few other examples 1076 00:46:28,950 --> 00:46:31,920 that are comparable to what we did back in week one and look at a few 1077 00:46:31,920 --> 00:46:34,440 from week two and three and really take a look 1078 00:46:34,440 --> 00:46:37,770 not just at the syntax, ultimately, but some of the features of Python. 1079 00:46:37,770 --> 00:46:40,710 And of course, we need the ability to express ourselves conditionally 1080 00:46:40,710 --> 00:46:42,480 or logically with control flow. 1081 00:46:42,480 --> 00:46:44,340 And so let me propose a quick program here 1082 00:46:44,340 --> 00:46:47,760 that we'll just call conditions.py, reminiscent of conditions.c 1083 00:46:47,760 --> 00:46:48,850 some time ago. 
1084 00:46:48,850 --> 00:46:52,560 Let me go ahead and import from CS50 getInt this time 1085 00:46:52,560 --> 00:46:56,530 and get myself another x with getInt x from the user. 1086 00:46:56,530 --> 00:47:00,262 Then let me go ahead and ask them for getInt y from the user. 1087 00:47:00,262 --> 00:47:02,220 And then let me go ahead and just compare them. 1088 00:47:02,220 --> 00:47:04,830 And so per our comparison with Scratch a bit ago, 1089 00:47:04,830 --> 00:47:08,220 I can simply say if x is less than y, then go ahead 1090 00:47:08,220 --> 00:47:14,090 and print out, for instance, print x is less than y, just as we did weeks ago. 1091 00:47:14,090 --> 00:47:16,980 Elif if x is greater than y, we can go ahead 1092 00:47:16,980 --> 00:47:20,610 and print out x is greater than y. 1093 00:47:20,610 --> 00:47:23,160 And then we can still have a third condition, else, just 1094 00:47:23,160 --> 00:47:26,310 like in C, where we print out, for instance, the logical conclusion. 1095 00:47:26,310 --> 00:47:28,440 x is equal to y. 1096 00:47:28,440 --> 00:47:30,330 So just to point out some of the differences, 1097 00:47:30,330 --> 00:47:32,400 indentation is ever so important now. 1098 00:47:32,400 --> 00:47:33,859 And it's got to be consistent. 1099 00:47:33,859 --> 00:47:35,400 You can't have four spaces and three. 1100 00:47:35,400 --> 00:47:37,540 You've got to have, for instance, four all the way. 1101 00:47:37,540 --> 00:47:39,690 Notice that I've got the colons consistently there. 1102 00:47:39,690 --> 00:47:44,260 But notice that I don't need the parentheses, either, anymore. 1103 00:47:44,260 --> 00:47:46,900 And with Python, there's sort of a buzzword, Pythonic. 1104 00:47:46,900 --> 00:47:48,930 There is a Pythonic way of doing things. 1105 00:47:48,930 --> 00:47:53,230 You can have parentheses around x, less than y, or x greater than y, 1106 00:47:53,230 --> 00:47:56,602 just like in C. But it doesn't add anything logically, arguably. 
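The comparison just dictated can be sketched as below. To keep the sketch runnable without user input, the logic is wrapped in a hypothetical helper named compare, which is not part of the lecture's conditions.py; the lecture instead reads x and y from the user and prints directly.

```python
def compare(x, y):
    """Return a sentence describing how x relates to y."""
    # No parentheses around the conditions and no curly braces:
    # the colon and consistent indentation delimit each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(1, 1))
```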
1107 00:47:56,602 --> 00:47:58,560 And if it doesn't make your code more readable, 1108 00:47:58,560 --> 00:48:00,990 don't clutter your code with additional characters. 1109 00:48:00,990 --> 00:48:02,820 And so that's a general rule of thumb now. 1110 00:48:02,820 --> 00:48:06,099 Python is much more trim when it comes to syntax, only 1111 00:48:06,099 --> 00:48:08,890 introducing it when it really solves a problem, which in this case, 1112 00:48:08,890 --> 00:48:09,710 it doesn't really. 1113 00:48:09,710 --> 00:48:10,719 Yeah? 1114 00:48:10,719 --> 00:48:12,760 AUDIENCE: Quick question, the lines [INAUDIBLE], 1115 00:48:12,760 --> 00:48:15,551 those are grouped right together, one to the next, one to the next, 1116 00:48:15,551 --> 00:48:16,515 and one to the next. 1117 00:48:16,515 --> 00:48:18,640 If you were to put an additional line between them, 1118 00:48:18,640 --> 00:48:19,960 would that break the code? 1119 00:48:19,960 --> 00:48:20,770 DAVID MALAN: No, not at all. 1120 00:48:20,770 --> 00:48:23,080 I can have as much whitespace vertically as I want. 1121 00:48:23,080 --> 00:48:25,210 If I want to add some comments, indeed, I can do that. 1122 00:48:25,210 --> 00:48:27,760 And why don't we do that, in fact, because the commenting syntax 1123 00:48:27,760 --> 00:48:29,134 for Python is a little different. 1124 00:48:29,134 --> 00:48:31,315 In C, we were in the habit of doing slash slash. 1125 00:48:31,315 --> 00:48:33,190 Python, it's actually a little more succinct. 1126 00:48:33,190 --> 00:48:34,510 You can just use a single hash. 1127 00:48:34,510 --> 00:48:37,270 And you can say get x from user here. 1128 00:48:37,270 --> 00:48:39,860 I can say get y from user here. 1129 00:48:39,860 --> 00:48:42,557 And then I can say something like compare x and y. 1130 00:48:42,557 --> 00:48:44,890 And if I really wanted to, I could put comments in here. 1131 00:48:44,890 --> 00:48:46,150 That is perfectly fine.
1132 00:48:46,150 --> 00:48:49,420 But I'll just keep it more compact with this particular example. 1133 00:48:49,420 --> 00:48:54,740 So any questions on the conditional syntax or what we've just done here? 1134 00:48:54,740 --> 00:48:56,980 All right, let me whip up another example, 1135 00:48:56,980 --> 00:48:59,340 this time doing some comparisons. 1136 00:48:59,340 --> 00:49:02,190 This time, let me create a file called answer.py, 1137 00:49:02,190 --> 00:49:05,700 which is reminiscent of a quick example we did weeks ago called answer.c. 1138 00:49:05,700 --> 00:49:09,630 Let me go ahead and from CS50 import getString. 1139 00:49:09,630 --> 00:49:12,210 And this time, let me go ahead and declare 1140 00:49:12,210 --> 00:49:15,090 a variable, c. And let me go ahead and get a string from the user-- 1141 00:49:15,090 --> 00:49:18,030 whoops-- get a string from the user for their answer 1142 00:49:18,030 --> 00:49:19,770 to whatever question it is we care about. 1143 00:49:19,770 --> 00:49:22,890 And then if it's meant to be a yes/no answer, let's check for that. 1144 00:49:22,890 --> 00:49:28,840 If c equals equals capital Y or c equals equals little y, 1145 00:49:28,840 --> 00:49:32,280 then go ahead and say, just for the sake of demonstration, 1146 00:49:32,280 --> 00:49:34,680 yes, because the human presumably meant that. 1147 00:49:34,680 --> 00:49:40,140 Elif c equals equals capital n or c equals equals little n, 1148 00:49:40,140 --> 00:49:43,320 then go ahead and print out, for instance, no. 1149 00:49:43,320 --> 00:49:47,320 So a short program, but what are some of the takeaways? 1150 00:49:47,320 --> 00:49:51,180 Well, what's different clearly among these lines, 5 through 8, versus C, 1151 00:49:51,180 --> 00:49:53,090 weeks ago? 1152 00:49:53,090 --> 00:49:53,666 Yeah.
1153 00:49:53,666 --> 00:49:55,330 AUDIENCE: For or you have to do-- 1154 00:49:55,330 --> 00:49:58,310 DAVID MALAN: Yeah, none of those stupid vertical bars or the ampersand 1155 00:49:58,310 --> 00:49:58,830 ampersand. 1156 00:49:58,830 --> 00:50:02,430 If you want to do something or or and it together, just say and and 1157 00:50:02,430 --> 00:50:05,580 or, much like Scratch, actually, some weeks ago. 1158 00:50:05,580 --> 00:50:08,970 Notice, too-- how are we comparing strings? 1159 00:50:08,970 --> 00:50:12,930 Turns out Python does not have chars, per se. 1160 00:50:12,930 --> 00:50:14,960 C did have chars, single characters. 1161 00:50:14,960 --> 00:50:16,380 Python only has strings. 1162 00:50:16,380 --> 00:50:19,107 It has strings, ints, floats, and then some fancier things, 1163 00:50:19,107 --> 00:50:20,190 but it doesn't have chars. 1164 00:50:20,190 --> 00:50:22,530 So that's why I am deliberately using string. 1165 00:50:22,530 --> 00:50:28,130 But when we use strings in C, how did we compare two strings? 1166 00:50:28,130 --> 00:50:31,967 strcmp, right, because of the whole annoying pointer comparison thing. 1167 00:50:31,967 --> 00:50:33,800 Well, it turns out now in Python if you want 1168 00:50:33,800 --> 00:50:37,280 to compare two strings character by character by character, 1169 00:50:37,280 --> 00:50:38,570 equal equals is back. 1170 00:50:38,570 --> 00:50:43,050 And it does exactly what you expect it to do, even if it's a full word. 1171 00:50:43,050 --> 00:50:47,780 So if you're actually checking for, for instance, yes or yes from the human, 1172 00:50:47,780 --> 00:50:50,630 you can still use equal equals, as well, even though it's 1173 00:50:50,630 --> 00:50:52,080 more than now one character. 1174 00:50:52,080 --> 00:50:53,538 So that's a wonderful feature, too. 1175 00:50:53,538 --> 00:50:56,000 And it just makes the code more readable and a lot easier 1176 00:50:56,000 --> 00:50:58,430 to write right out of the gate.
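The yes/no check can be sketched as follows. The helper name interpret is hypothetical and exists only so the sketch runs without user input; the lecture's answer.py reads the string from the user and prints directly. Returning None when neither branch matches mirrors the lecture's program, which prints nothing in that case.

```python
def interpret(c):
    """Map a one-letter answer to "yes" or "no"; None if it is neither."""
    # Plain words replace C's || and &&, and == compares the
    # characters of two strings rather than their addresses.
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    return None


print(interpret("y"))
print(interpret("N"))

# == also works on whole words, not just single characters.
print("yes" == "yes")
```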
1177 00:50:58,430 --> 00:51:02,000 All right, so now recall that in C we spent a little while, 1178 00:51:02,000 --> 00:51:05,120 as well as in Scratch, taking a look at a few examples about coughing, 1179 00:51:05,120 --> 00:51:06,050 of all things. 1180 00:51:06,050 --> 00:51:08,000 And in fact, in Python and C-- 1181 00:51:08,000 --> 00:51:09,530 rather, in Scratch and in C-- 1182 00:51:09,530 --> 00:51:12,130 we did a zero example that looked a little like this. 1183 00:51:12,130 --> 00:51:14,990 If you want to simulate the notion of Scratch the cat coughing, 1184 00:51:14,990 --> 00:51:16,446 you might, of course, do this. 1185 00:51:16,446 --> 00:51:19,070 And then if he's going to cough three times, you might do this. 1186 00:51:19,070 --> 00:51:22,029 And we ran this and it just did cough, cough, cough on the screen. 1187 00:51:22,029 --> 00:51:24,320 I won't bother running it because it will just do that. 1188 00:51:24,320 --> 00:51:26,810 But this was bad design we claimed weeks ago. 1189 00:51:26,810 --> 00:51:28,700 What was the gist of why this is bad design? 1190 00:51:28,700 --> 00:51:31,267 1191 00:51:31,267 --> 00:51:32,850 I mean, I literally copied and pasted. 1192 00:51:32,850 --> 00:51:35,876 And the odds are if you're ever doing that in CS50 or in programming 1193 00:51:35,876 --> 00:51:38,000 more generally, you're probably being a little lazy 1194 00:51:38,000 --> 00:51:39,290 and there's a better way to do it. 1195 00:51:39,290 --> 00:51:41,039 And it's a more maintainable way to do it. 1196 00:51:41,039 --> 00:51:45,260 So of course, we introduced weeks ago, both in Scratch and in C, 1197 00:51:45,260 --> 00:51:49,610 the ability to in cough one, this time, do a loop. 1198 00:51:49,610 --> 00:51:53,240 And I can do a loop slightly differently in Python and in C. But for i 1199 00:51:53,240 --> 00:51:57,002 in the range of 3, go ahead and print out cough. 
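A minimal sketch of the loop version, in place of three copied-and-pasted print calls. The printed text "cough" follows the lecture's description; the list named iterations is an addition here, only to show which values the loop variable takes.

```python
# range(3) produces the values 0, 1, and 2, so the body runs three times.
for i in range(3):
    print("cough")

# The loop variable takes each value in turn.
iterations = list(range(3))
print(iterations)
```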
1200 00:51:57,002 --> 00:51:59,210 So the syntax for the for loop is a little different. 1201 00:51:59,210 --> 00:52:01,084 But it's pretty straightforward, nonetheless, 1202 00:52:01,084 --> 00:52:04,340 once you remember that you use for, variable name, then 1203 00:52:04,340 --> 00:52:08,569 the preposition in, and then the word range with a parenthesis and its-- 1204 00:52:08,569 --> 00:52:10,610 parentheses and the value you want to care about. 1205 00:52:10,610 --> 00:52:16,240 But then we saw an opportunity, recall, to actually abstract coughing away. 1206 00:52:16,240 --> 00:52:19,490 Coughing, at least in our textual form, is just the act of printing something. 1207 00:52:19,490 --> 00:52:22,310 So we introduced in version two some time ago, 1208 00:52:22,310 --> 00:52:25,340 the following approach in cough two. 1209 00:52:25,340 --> 00:52:28,857 I instead defined a function called cough that did the coughing for me. 1210 00:52:28,857 --> 00:52:30,440 And we've not seen this yet in Python. 1211 00:52:30,440 --> 00:52:33,350 So how do you define a function in Python called cough? 1212 00:52:33,350 --> 00:52:36,680 Put another way, how do you make your own custom puzzle piece, 1213 00:52:36,680 --> 00:52:38,300 just as we did in Scratch? 1214 00:52:38,300 --> 00:52:40,250 Well, you define it with def. 1215 00:52:40,250 --> 00:52:42,390 And then you have it do exactly what you want 1216 00:52:42,390 --> 00:52:45,690 it to do by just indenting the lines of code that belong to that function. 1217 00:52:45,690 --> 00:52:47,270 So there's no return value. 1218 00:52:47,270 --> 00:52:49,200 There's no need for an input at the moment. 1219 00:52:49,200 --> 00:52:50,510 But we do have the colon. 1220 00:52:50,510 --> 00:52:51,680 And we have the indentation. 1221 00:52:51,680 --> 00:52:53,510 No curly braces, nothing else. 1222 00:52:53,510 --> 00:52:55,230 How do I now use this function? 
1223 00:52:55,230 --> 00:52:59,480 Well, here's where we have a few options stylistically in the program. 1224 00:52:59,480 --> 00:53:03,890 The simplest way to call this function would be quite simply like this. 1225 00:53:03,890 --> 00:53:09,710 Go ahead and for i in range 3, go ahead now and cough. 1226 00:53:09,710 --> 00:53:11,240 And this should look a little weird. 1227 00:53:11,240 --> 00:53:12,777 It looks, indeed, a little sloppy. 1228 00:53:12,777 --> 00:53:13,860 But let's see if it works. 1229 00:53:13,860 --> 00:53:17,780 So if I go ahead and run Python of cough2.py, 1230 00:53:17,780 --> 00:53:19,790 it seems to cough, cough, cough. 1231 00:53:19,790 --> 00:53:24,160 But I say this is a little weird because what am I 1232 00:53:24,160 --> 00:53:28,590 doing that's very different now from C? 1233 00:53:28,590 --> 00:53:29,980 There's no what? 1234 00:53:29,980 --> 00:53:31,690 There's no main function. 1235 00:53:31,690 --> 00:53:34,742 I just have some code right here on the left of the screen. 1236 00:53:34,742 --> 00:53:36,200 And yet, I do have a function here. 1237 00:53:36,200 --> 00:53:37,667 And in Python, this is OK. 1238 00:53:37,667 --> 00:53:40,000 Because you're using an interpreter and reading the file 1239 00:53:40,000 --> 00:53:43,390 top to bottom, left to right, you don't strictly need a function called main. 1240 00:53:43,390 --> 00:53:45,610 It's just going to interpret all of your code. 1241 00:53:45,610 --> 00:53:47,811 And when it's seen the definition of a function, OK. 1242 00:53:47,811 --> 00:53:49,060 It's going to say, OK, got it. 1243 00:53:49,060 --> 00:53:50,830 I now know what the verb cough means. 1244 00:53:50,830 --> 00:53:53,890 I will do this anytime I see it down here. 1245 00:53:53,890 --> 00:53:56,120 But we're going to run into a problem.
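The version that works, with the function defined first and the loop below it, can be sketched like this. It is an approximation of the file on screen, not a copy of it.

```python
# def introduces a new function; the indented lines are its body.
def cough():
    print("cough")


# No main function is required: the interpreter reads top to bottom,
# and by this point it already knows what cough means.
for i in range(3):
    cough()
```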
1246 00:53:56,120 --> 00:53:58,870 And if, indeed, I did what my first instinct was, 1247 00:53:58,870 --> 00:54:02,620 which was to put the logic, the main part of my program at the top 1248 00:54:02,620 --> 00:54:05,240 and to define cough down here, let's see what happens. 1249 00:54:05,240 --> 00:54:06,160 Let me zoom out. 1250 00:54:06,160 --> 00:54:08,800 Let me go ahead and rerun cough2.py. 1251 00:54:08,800 --> 00:54:11,410 And now we start to see the first of our error messages. 1252 00:54:11,410 --> 00:54:14,890 And they're going to look just as cryptic at first glance as clang's 1253 00:54:14,890 --> 00:54:15,730 and make's were. 1254 00:54:15,730 --> 00:54:19,640 Rest assured that help50 can help with Python error messages, as well. 1255 00:54:19,640 --> 00:54:23,860 But let's just try to parse what I do understand. cough2.py, line two 1256 00:54:23,860 --> 00:54:26,440 in module, whatever that is, NameError. 1257 00:54:26,440 --> 00:54:28,550 Name cough is not defined. 1258 00:54:28,550 --> 00:54:29,830 So what's your gut here? 1259 00:54:29,830 --> 00:54:31,180 What is that really-- 1260 00:54:31,180 --> 00:54:32,890 what's the explanation for that error? 1261 00:54:32,890 --> 00:54:34,600 Because cough is clearly defined-- 1262 00:54:34,600 --> 00:54:37,660 literally with the define def verb-- 1263 00:54:37,660 --> 00:54:40,551 right there on line four now. 1264 00:54:40,551 --> 00:54:41,050 What-- 1265 00:54:41,050 --> 00:54:41,982 AUDIENCE: You're calling cough before it's defined. 1266 00:54:41,982 --> 00:54:44,523 DAVID MALAN: Yeah, I'm trying to call it before it's defined. 1267 00:54:44,523 --> 00:54:46,707 Python is trying to take me very literally. 1268 00:54:46,707 --> 00:54:48,790 And it's going to do top to bottom, left to right. 1269 00:54:48,790 --> 00:54:50,790 And if it doesn't see until the bottom something 1270 00:54:50,790 --> 00:54:53,681 it's supposed to be doing at the top, it's just not going to work.
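The failure described above can be reproduced in a few lines. This is a sketch rather than the file from the lecture: the try/except wrapper and the variable named caught are additions made here so that the script keeps running and the error text can be inspected instead of stopping the program.

```python
caught = None

# Top-level code runs top to bottom, so this call happens before the def below is read.
try:
    cough()
except NameError as error:
    caught = str(error)
    print("NameError:", caught)


def cough():
    print("cough")


# By this point the definition has been read, so the same call now succeeds.
cough()
```

Without the try/except, the first call would end the program with the traceback read out in the lecture.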
1271 00:54:53,681 --> 00:54:56,430 So there is a solution to this and it starts to get a little ugly. 1272 00:54:56,430 --> 00:54:58,140 But it's a more generalized solution. 1273 00:54:58,140 --> 00:55:01,890 It turns out that even though main is not required in a Python program, 1274 00:55:01,890 --> 00:55:04,590 many programmers just create one nonetheless 1275 00:55:04,590 --> 00:55:06,780 to address this particular problem. 1276 00:55:06,780 --> 00:55:09,150 And they specifically do something like this-- 1277 00:55:09,150 --> 00:55:13,290 def main-- and then below it they indent everything there. 1278 00:55:13,290 --> 00:55:18,390 And then you need one specific feature to solve this problem now. 1279 00:55:18,390 --> 00:55:21,960 I've now defined main and I've defined cough, which theoretically 1280 00:55:21,960 --> 00:55:24,540 solves this problem just as it did in C. There 1281 00:55:24,540 --> 00:55:26,280 is no notion of a prototype in Python. 1282 00:55:26,280 --> 00:55:30,280 That is not the solution to copy paste the name of the function up above. 1283 00:55:30,280 --> 00:55:33,820 But when I do this now, literally nothing happens. 1284 00:55:33,820 --> 00:55:35,920 But I did get rid of the error. 1285 00:55:35,920 --> 00:55:38,264 So just reason through this, perhaps. 1286 00:55:38,264 --> 00:55:40,430 Especially if you've never programmed Python before, 1287 00:55:40,430 --> 00:55:44,675 why might nothing now be happening? 1288 00:55:44,675 --> 00:55:45,799 AUDIENCE: Not calling main? 1289 00:55:45,799 --> 00:55:47,675 DAVID MALAN: I'm not calling main, yeah. 1290 00:55:47,675 --> 00:55:49,010 So whereas in C-- 1291 00:55:49,010 --> 00:55:53,630 and frankly, in Java, C++, and a few other languages-- main is special. 1292 00:55:53,630 --> 00:55:57,590 It just gets called by default. In Python, main is not special. 
1293 00:55:57,590 --> 00:56:00,960 I've chosen this name main just because so many other languages use it, 1294 00:56:00,960 --> 00:56:02,840 but it has no special significance. 1295 00:56:02,840 --> 00:56:05,970 If you want to call main, you have to do it yourself. 1296 00:56:05,970 --> 00:56:08,090 And so this is a little weird, admittedly. 1297 00:56:08,090 --> 00:56:12,410 But you can literally do this down here because your code will be executed top 1298 00:56:12,410 --> 00:56:13,610 to bottom, left to right. 1299 00:56:13,610 --> 00:56:16,640 By the time line 10 is reached, both main has been defined 1300 00:56:16,640 --> 00:56:19,100 and cough has been defined, which means you're good to go. 1301 00:56:19,100 --> 00:56:23,630 So if I now go down here and run Python of cough2, now it actually works. 1302 00:56:23,630 --> 00:56:27,200 Now, as an aside, this is not Pythonic, if you will. 1303 00:56:27,200 --> 00:56:33,020 Most people would actually do this if the name equals equals main, 1304 00:56:33,020 --> 00:56:34,910 then do this. 1305 00:56:34,910 --> 00:56:38,180 This is for lower level reasons that let me wave my hand out for today. 1306 00:56:38,180 --> 00:56:40,932 But long story short, the addition of this cryptic-looking line 1307 00:56:40,932 --> 00:56:42,890 solves other problems that we're just not going 1308 00:56:42,890 --> 00:56:44,790 to trip over this week and probably next. 1309 00:56:44,790 --> 00:56:46,710 So this is the common way to do it. 1310 00:56:46,710 --> 00:56:49,730 But if you just ignore that, the effect of this cryptic-looking code 1311 00:56:49,730 --> 00:56:52,460 is just to call main yourself at the very bottom of your file. 1312 00:56:52,460 --> 00:56:54,350 So when we start writing more interesting programs, 1313 00:56:54,350 --> 00:56:55,920 this is just going to become conventional. 
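Putting the pieces above together, the conventional layout might look like the sketch below. The ordering of main and cough is chosen to match the discussion; the body of cough is assumed to be a single print, as before.

```python
def main():
    for i in range(3):
        cough()


def cough():
    print("cough")


# Nothing has been called yet. By the time this line is reached, both main and
# cough have been defined, so the order of the two definitions does not matter.
if __name__ == "__main__":
    main()
```

Replacing the last two lines with a bare call to main() has the same effect when the file is run directly, which is the simpler version shown first in the lecture.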
1314 00:56:55,920 --> 00:56:58,086 If you want to start writing functions and so forth, 1315 00:56:58,086 --> 00:57:00,560 odds are you'll benefit by writing a main function 1316 00:57:00,560 --> 00:57:02,120 and putting more code in there. 1317 00:57:02,120 --> 00:57:07,400 So let's do one final example with cough that actually now parameterizes 1318 00:57:07,400 --> 00:57:12,200 the code, just as we did weeks ago in Scratch and C. This will be cough3.py. 1319 00:57:12,200 --> 00:57:14,330 Let me start as I did just a little bit ago. 1320 00:57:14,330 --> 00:57:16,550 But suppose I want to achieve this effect. 1321 00:57:16,550 --> 00:57:20,780 I want the computer to cough three times by passing in an input. 1322 00:57:20,780 --> 00:57:23,900 I now do need to modify cough to take an input. 1323 00:57:23,900 --> 00:57:26,420 And in C, I would have said something like int n. 1324 00:57:26,420 --> 00:57:29,300 But you don't have to specify data types in Python, 1325 00:57:29,300 --> 00:57:32,150 you just have to specify the parameter name or the argument name. 1326 00:57:32,150 --> 00:57:33,510 So that's nice and simple. 1327 00:57:33,510 --> 00:57:36,680 And now down in here, in cough is where I should probably 1328 00:57:36,680 --> 00:57:41,270 say for i in the range of 3, do this. 1329 00:57:41,270 --> 00:57:42,440 But this isn't quite right. 1330 00:57:42,440 --> 00:57:44,701 What fix do I want to make here? 1331 00:57:44,701 --> 00:57:45,200 Yeah. 1332 00:57:45,200 --> 00:57:46,420 Now I can just pass in n. 1333 00:57:46,420 --> 00:57:48,920 So range is just a function that takes an argument that I've 1334 00:57:48,920 --> 00:57:51,150 been hard coding as three just because. 1335 00:57:51,150 --> 00:57:53,420 But you can generalize it with n, as well. 
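A sketch of the parameterized version just described, assuming the same one-line body for the loop. The parameter n carries no declared type; range(n) simply expects it to be an integer.

```python
def main():
    cough(3)


def cough(n):
    # n replaces the hard-coded 3, so the caller decides how many times to cough.
    for i in range(n):
        print("cough")


if __name__ == "__main__":
    main()
```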
1336 00:57:53,420 --> 00:57:56,450 So now again, per our discussion of abstraction weeks and weeks 1337 00:57:56,450 --> 00:57:59,900 ago, do we have a sort of beautiful version of coughing, 1338 00:57:59,900 --> 00:58:01,970 even though it's looking way more cryptic. 1339 00:58:01,970 --> 00:58:04,220 But step by step by step did we get to the point 1340 00:58:04,220 --> 00:58:07,100 of having a main function that takes an abstraction, cough. 1341 00:58:07,100 --> 00:58:08,420 Do it this many times. 1342 00:58:08,420 --> 00:58:11,370 Now the implementation details are hidden in this custom puzzle piece, 1343 00:58:11,370 --> 00:58:11,990 if you will. 1344 00:58:11,990 --> 00:58:14,600 And the two lines at the bottom just kick off 1345 00:58:14,600 --> 00:58:16,280 the whole execution of the program. 1346 00:58:16,280 --> 00:58:20,742 But that's the only stuff that's really Python-specific now. 1347 00:58:20,742 --> 00:58:21,714 Yeah? 1348 00:58:21,714 --> 00:58:27,550 AUDIENCE: Can we use the cough function on line 11 [INAUDIBLE]? 1349 00:58:27,550 --> 00:58:31,151 DAVID MALAN: Could you use the cough function on line 11? 1350 00:58:31,151 --> 00:58:31,650 Yes. 1351 00:58:31,650 --> 00:58:37,310 You could absolutely just do this, for instance, and get rid of main again. 1352 00:58:37,310 --> 00:58:38,490 It's just a convention. 1353 00:58:38,490 --> 00:58:41,490 Once you start writing more sophisticated programs with functions, 1354 00:58:41,490 --> 00:58:45,542 you should probably introduce main just to keep it tidy. 1355 00:58:45,542 --> 00:58:49,330 AUDIENCE: With the [INAUDIBLE]. 1356 00:58:49,330 --> 00:58:50,710 DAVID MALAN: You could do that. 1357 00:58:50,710 --> 00:58:53,410 Then you're starting to be non-Pythonic. 1358 00:58:53,410 --> 00:58:59,320 Like, yes, you could do cough3 but people would look askew at you 1359 00:58:59,320 --> 00:59:01,450 because it's just not done that way. 1360 00:59:01,450 --> 00:59:03,160 That's what Pythonic means.
1361 00:59:03,160 --> 00:59:04,996 Yeah, other questions? 1362 00:59:04,996 --> 00:59:08,940 AUDIENCE: You need to have the [INAUDIBLE] come after the for i 1363 00:59:08,940 --> 00:59:15,840 in range n so that it knows what the cough is? 1364 00:59:15,840 --> 00:59:17,090 DAVID MALAN: Not in this case. 1365 00:59:17,090 --> 00:59:22,462 So the order now is OK because first Python is seeing here's 1366 00:59:22,462 --> 00:59:23,420 the definition of main. 1367 00:59:23,420 --> 00:59:24,500 OK, I got it. 1368 00:59:24,500 --> 00:59:27,380 And then it's saying, here is the definition of cough, OK, I got it. 1369 00:59:27,380 --> 00:59:30,020 But it's not actually calling those functions yet. 1370 00:59:30,020 --> 00:59:33,170 The Python errors are thrown only at what's called runtime, 1371 00:59:33,170 --> 00:59:37,490 the running of the program's time, which means only when main is called 1372 00:59:37,490 --> 00:59:40,997 does Python actually execute line 4 and then see, 1373 00:59:40,997 --> 00:59:42,830 ooh, I need to call a function called cough. 1374 00:59:42,830 --> 00:59:45,350 But that's OK because it saw it earlier when it first 1375 00:59:45,350 --> 00:59:47,340 read the file top to bottom. 1376 00:59:47,340 --> 00:59:49,940 So it matters when the functions are called, 1377 00:59:49,940 --> 00:59:54,000 not where they appear, per se, in the file, the order in which they're 1378 00:59:54,000 --> 00:59:54,660 called. 1379 00:59:54,660 --> 00:59:57,300 Other questions? 1380 00:59:57,300 --> 01:00:00,725 All right, yes? 1381 01:00:00,725 --> 01:00:04,190 AUDIENCE: I don't know where you [INAUDIBLE] from. 1382 01:00:04,190 --> 01:00:07,094 How do you define n as an integer? 1383 01:00:07,094 --> 01:00:09,010 DAVID MALAN: How did I define n as an integer? 1384 01:00:09,010 --> 01:00:10,240 This is what's nice about Python. 
1385 01:00:10,240 --> 01:00:12,031 If you want a variable or a parameter, just 1386 01:00:12,031 --> 01:00:14,950 start using it without mentioning its data type. 1387 01:00:14,950 --> 01:00:18,360 So the fact that I put n in parentheses in this function 1388 01:00:18,360 --> 01:00:21,960 means, hey, Python, let this function take an input called n. 1389 01:00:21,960 --> 01:00:24,840 And it can actually be any data type-- int, float, string, 1390 01:00:24,840 --> 01:00:26,010 or even something else. 1391 01:00:26,010 --> 01:00:28,950 It's up to me to use it responsibly as a number 1392 01:00:28,950 --> 01:00:32,560 and to call it responsibly with a number. 1393 01:00:32,560 --> 01:00:33,750 Good question. 1394 01:00:33,750 --> 01:00:34,388 Yeah? 1395 01:00:34,388 --> 01:00:36,925 AUDIENCE: So it's possible for a variable to change type? 1396 01:00:36,925 --> 01:00:39,050 DAVID MALAN: It is, indeed, possible for a variable 1397 01:00:39,050 --> 01:00:40,910 to change type, a good observation. 1398 01:00:40,910 --> 01:00:45,560 So yes, Python is not a strongly-typed language, so to speak. 1399 01:00:45,560 --> 01:00:48,050 C is strongly-typed in that if you make something an int, 1400 01:00:48,050 --> 01:00:49,850 it is staying an int forever. 1401 01:00:49,850 --> 01:00:53,120 Python is loosely typed, whereby x can be an int initially. 1402 01:00:53,120 --> 01:00:55,580 But if you really want to turn it into a string, you can. 1403 01:00:55,580 --> 01:01:00,140 But the convention there would be, yes, you can do that, but don't do that. 1404 01:01:00,140 --> 01:01:02,780 So Python has the, frankly, the sort of arrogance 1405 01:01:02,780 --> 01:01:04,460 of being sort of an adult language. 1406 01:01:04,460 --> 01:01:06,560 Yes, you could do that, but just don't. 1407 01:01:06,560 --> 01:01:08,477 Why do we have to protect you from yourselves? 1408 01:01:08,477 --> 01:01:11,476 And so in that sense, you need to be a little more responsible about it.
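The two points above, that a name can be rebound to a value of another type and that nothing checks the argument to cough ahead of time, can be sketched as follows. What the lecture calls loose typing is more commonly described as dynamic typing: the check still happens, but only at runtime. The variable names x and message are illustrative choices made here.

```python
def cough(n):
    for i in range(n):
        print("cough")


x = 3
print(type(x).__name__)   # int
x = "three"               # same name, now bound to a str; allowed, but discouraged
print(type(x).__name__)   # str

# Nothing stops a caller from passing a string; the mistake only surfaces
# when range() is actually asked to count with it.
message = None
try:
    cough("3")
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```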
1409 01:01:11,476 --> 01:01:13,310 But again, there are arguments both ways. 1410 01:01:13,310 --> 01:01:16,820 That induces potential bugs that C would catch for you. 1411 01:01:16,820 --> 01:01:19,820 And this is where humans start to disagree about the upsides 1412 01:01:19,820 --> 01:01:23,390 and downsides of languages, whether a language should be strongly or loosely 1413 01:01:23,390 --> 01:01:25,290 or not even typed at all. 1414 01:01:25,290 --> 01:01:26,670 A good observation. 1415 01:01:26,670 --> 01:01:29,330 So let's look at a paradigm that was super common in C 1416 01:01:29,330 --> 01:01:31,280 when we wanted to do something again and again 1417 01:01:31,280 --> 01:01:34,370 to see how it actually is a little differently done in Python now. 1418 01:01:34,370 --> 01:01:37,880 Let me go ahead and create a file called positive.py 1419 01:01:37,880 --> 01:01:40,950 and go ahead and write a program a little quickly here. 1420 01:01:40,950 --> 01:01:44,250 So from cs50, let me go ahead and import get_int, 1421 01:01:44,250 --> 01:01:46,030 so we can get integers from the user. 1422 01:01:46,030 --> 01:01:47,780 Let me go ahead and define a main function 1423 01:01:47,780 --> 01:01:53,940 that simply does i, which will be my variable, gets a positive int, 1424 01:01:53,940 --> 01:01:56,180 and asks the user, just as we did weeks ago, 1425 01:01:56,180 --> 01:01:58,880 if you'll recall, for a positive integer. 1426 01:01:58,880 --> 01:02:02,240 And then just goes ahead and very boringly prints it out. 1427 01:02:02,240 --> 01:02:03,860 So that's all this program does. 1428 01:02:03,860 --> 01:02:06,110 And let me go ahead and just from recollection-- 1429 01:02:06,110 --> 01:02:08,930 though it's totally fine to copy/paste this cryptic-looking string, 1430 01:02:08,930 --> 01:02:13,590 we would just be remiss in not showing you how most people do this.
1431 01:02:13,590 --> 01:02:15,770 So if I do this, this is a complete program, 1432 01:02:15,770 --> 01:02:21,120 except for the fact that what does not exist yet? 1433 01:02:21,120 --> 01:02:24,265 Get positive int probably does not exist, just as it didn't in week one, 1434 01:02:24,265 --> 01:02:25,890 because we have to invent it ourselves. 1435 01:02:25,890 --> 01:02:28,030 Get int exists, but get positive int does not. 1436 01:02:28,030 --> 01:02:30,113 And just for demonstration's sake, let's try this. 1437 01:02:30,113 --> 01:02:33,360 Python of positive.py, notice we have name error get 1438 01:02:33,360 --> 01:02:34,790 positive int not defined. 1439 01:02:34,790 --> 01:02:36,280 OK, so we can fix that. 1440 01:02:36,280 --> 01:02:38,460 We can literally define, or def, it. 1441 01:02:38,460 --> 01:02:40,920 So get positive int. 1442 01:02:40,920 --> 01:02:42,670 It's going to take a prompt from the user, 1443 01:02:42,670 --> 01:02:45,870 just as it did weeks ago, the string that you want to show to him or her. 1444 01:02:45,870 --> 01:02:49,350 And now let me go ahead and get a positive integer. 1445 01:02:49,350 --> 01:02:51,930 What type of programming construct did we 1446 01:02:51,930 --> 01:02:55,063 use in C to do something again and again and again? 1447 01:02:55,063 --> 01:02:55,822 AUDIENCE: Loop. 1448 01:02:55,822 --> 01:02:58,030 DAVID MALAN: A loop, for sure, but more specifically, 1449 01:02:58,030 --> 01:03:00,700 to do something at least once and then maybe again 1450 01:03:00,700 --> 01:03:02,614 and again and again if they don't cooperate? 1451 01:03:02,614 --> 01:03:03,280 AUDIENCE: While. 1452 01:03:03,280 --> 01:03:04,550 DAVID MALAN: Do while. 1453 01:03:04,550 --> 01:03:06,730 No do while in Python. 1454 01:03:06,730 --> 01:03:09,750 So that handy feature for user input does not exist. 1455 01:03:09,750 --> 01:03:10,745 So that's fine. 1456 01:03:10,745 --> 01:03:12,370 We need to solve this just differently. 
1457 01:03:12,370 --> 01:03:15,190 And honestly, in C, you could have solved that problem differently. 1458 01:03:15,190 --> 01:03:16,239 You don't need do while. 1459 01:03:16,239 --> 01:03:17,780 We could have taken it away from you. 1460 01:03:17,780 --> 01:03:19,071 C could take it away. 1461 01:03:19,071 --> 01:03:21,820 You could still solve every problem that we have in the past weeks 1462 01:03:21,820 --> 01:03:23,980 using a for loop or a while loop. 1463 01:03:23,980 --> 01:03:26,140 Do while just is a nice handy feature. 1464 01:03:26,140 --> 01:03:27,520 But we can simulate it. 1465 01:03:27,520 --> 01:03:30,070 And the Pythonic way of doing this is as follows. 1466 01:03:30,070 --> 01:03:32,860 Deliberately induce an infinite loop, because you 1467 01:03:32,860 --> 01:03:34,690 do want to loop potentially. 1468 01:03:34,690 --> 01:03:37,210 But the logic is going to be, give me an infinite loop 1469 01:03:37,210 --> 01:03:40,390 and I will break out of it when I'm ready to break out of it. 1470 01:03:40,390 --> 01:03:41,740 This would be the convention. 1471 01:03:41,740 --> 01:03:43,820 So while the following is true do this. 1472 01:03:43,820 --> 01:03:46,190 Go ahead and declare a variable called n. 1473 01:03:46,190 --> 01:03:48,440 Get an int from the user and pass in that same prompt. 1474 01:03:48,440 --> 01:03:50,440 So get int, we wrote-- the staff-- 1475 01:03:50,440 --> 01:03:52,750 prompt is whatever I typed in up here. 1476 01:03:52,750 --> 01:03:55,480 So just copy/paste from the C version. 1477 01:03:55,480 --> 01:03:59,110 And then under what circumstances do I want to break out of this infinite loop 1478 01:03:59,110 --> 01:04:01,902 if the function is to be called to get positive int? 1479 01:04:01,902 --> 01:04:02,776 AUDIENCE: [INAUDIBLE] 1480 01:04:02,776 --> 01:04:04,780 DAVID MALAN: Yeah, so if n is greater than 0, 1481 01:04:04,780 --> 01:04:08,110 then I do have the keyword break still, just as I did in C. 
1482 01:04:08,110 --> 01:04:09,790 I can break out of this loop. 1483 01:04:09,790 --> 01:04:13,480 And then once I do that, I can go ahead and just return n. 1484 01:04:13,480 --> 01:04:16,060 Or for that matter, I could condense this a little bit. 1485 01:04:16,060 --> 01:04:19,570 I could just return n immediately and tighten it just a little bit. 1486 01:04:19,570 --> 01:04:21,100 So multiple ways to do this. 1487 01:04:21,100 --> 01:04:23,980 Otherwise it's just going to loop and loop forever. 1488 01:04:23,980 --> 01:04:26,260 So let me go ahead now and run positive.py 1489 01:04:26,260 --> 01:04:32,920 through Python, positive integer like negative 1, maybe negative 2, 0, OK, 1. 1490 01:04:32,920 --> 01:04:34,450 And now it, indeed, co-operates. 1491 01:04:34,450 --> 01:04:36,000 So this is just a common paradigm. 1492 01:04:36,000 --> 01:04:38,980 This is the kind of thing when learning a new language that honestly 1493 01:04:38,980 --> 01:04:40,480 tends to hang people up initially. 1494 01:04:40,480 --> 01:04:42,760 You need to learn the JavaScript way of doing things. 1495 01:04:42,760 --> 01:04:44,840 You need to learn the Python way of doing things. 1496 01:04:44,840 --> 01:04:47,310 But then you start to notice these so-called design patterns. 1497 01:04:47,310 --> 01:04:49,240 Anytime in Python you want to do something again and again, 1498 01:04:49,240 --> 01:04:50,390 yes, you want to loop. 1499 01:04:50,390 --> 01:04:54,130 But if you want to do something definitely once and maybe again? 1500 01:04:54,130 --> 01:04:56,440 You still just use a loop, but you deliberately 1501 01:04:56,440 --> 01:05:00,170 induce, typically, an infinite loop, and just break out of it when you're ready. 1502 01:05:00,170 --> 01:05:01,630 So a very common approach. 1503 01:05:01,630 --> 01:05:06,670 So not everything translates literally from C back and forth. 1504 01:05:06,670 --> 01:05:10,630 Any questions then on that? 
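The do-while substitute described above can be sketched like this. To keep the example runnable without a keyboard or the course library, the function takes a second parameter, read_int, which is a hypothetical stand-in introduced here; in the lecture that role is played by get_int from the cs50 library. The scripted answers mirror the inputs typed in the demonstration.

```python
def get_positive_int(prompt, read_int):
    # Deliberate infinite loop: ask at least once, keep asking until n is positive.
    while True:
        n = read_int(prompt)
        if n > 0:
            break
    # n was first assigned inside the loop but is still usable here.
    return n


# Scripted answers standing in for a user who types -1, -2, 0, and then 1.
answers = iter([-1, -2, 0, 1])
i = get_positive_int("Positive integer: ", lambda prompt: next(answers))
print(i)
```

The break followed by return n could be condensed into a single return n inside the if, as mentioned above; the longer form is kept here because it also shows n being used after the loop ends.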
1505 01:05:10,630 --> 01:05:11,563 Yeah, in the back? 1506 01:05:11,563 --> 01:05:15,427 AUDIENCE: Is that something you just did with the while for loop, 1507 01:05:15,427 --> 01:05:19,291 is that [INAUDIBLE] initializing a variable called [INAUDIBLE] 1508 01:05:19,291 --> 01:05:23,354 to a negative number and then do while n is less than 0-- 1509 01:05:23,354 --> 01:05:24,770 DAVID MALAN: Really good question. 1510 01:05:24,770 --> 01:05:27,380 Is this approach preferable to instead declaring, maybe 1511 01:05:27,380 --> 01:05:32,220 in here, a variable that is equal to some known value, like zero or whatnot, 1512 01:05:32,220 --> 01:05:33,770 and then updating it? 1513 01:05:33,770 --> 01:05:36,830 Short answer, yes, because your approach, while correct, 1514 01:05:36,830 --> 01:05:40,170 is not as well-designed, arguably because it's just not necessary. 1515 01:05:40,170 --> 01:05:43,259 And the Pythonic way, and really the well-designed way 1516 01:05:43,259 --> 01:05:45,050 to do most things would be use as few lines 1517 01:05:45,050 --> 01:05:47,780 as you can so long as it's still readable and understandable, 1518 01:05:47,780 --> 01:05:50,780 which I would argue this is once you're comfortable with the syntax. 1519 01:05:50,780 --> 01:05:56,030 But this does bring up an interesting point about one other topic in C. Scope 1520 01:05:56,030 --> 01:05:59,480 has now gone out the window, at least as we previously saw it. 1521 01:05:59,480 --> 01:06:02,570 Scope referred to where a variable lives. 1522 01:06:02,570 --> 01:06:05,577 And we defined it essentially casually between two curly braces, 1523 01:06:05,577 --> 01:06:07,160 the most recently opened curly braces. 1524 01:06:07,160 --> 01:06:10,790 Well, no curly braces anymore so it turns out that variables by default 1525 01:06:10,790 --> 01:06:12,530 have function scope here. 1526 01:06:12,530 --> 01:06:17,600 So when you declare n on line 9, you can use it in Python on line 10. 
1527 01:06:17,600 --> 01:06:18,350 And you know what? 1528 01:06:18,350 --> 01:06:22,880 You can even use it on line 12, even though it was declared inside 1529 01:06:22,880 --> 01:06:24,870 of this loop higher up. 1530 01:06:24,870 --> 01:06:27,170 So once you declare a variable on this line, 1531 01:06:27,170 --> 01:06:30,654 you can use it anywhere on a subsequent line within that same function. 1532 01:06:30,654 --> 01:06:33,570 So in some sense, it's a little sloppy that you're allowed to do this. 1533 01:06:33,570 --> 01:06:35,570 But on the other hand, it's very convenient 1534 01:06:35,570 --> 01:06:37,570 because you don't have to deal with those things 1535 01:06:37,570 --> 01:06:40,130 like declaring the variable up here just to use it down here. 1536 01:06:40,130 --> 01:06:42,980 So it's one less thing to think about. 1537 01:06:42,980 --> 01:06:46,250 All right, let's take a look at just a few examples from week two 1538 01:06:46,250 --> 01:06:49,550 wherein we introduced arrays and strings more generally 1539 01:06:49,550 --> 01:06:51,620 to see what has changed now, as well. 1540 01:06:51,620 --> 01:06:56,330 You'll recall that in week two, perhaps, we had an example about capitalization. 1541 01:06:56,330 --> 01:06:59,540 And let me go ahead and look at the third version of that, 1542 01:06:59,540 --> 01:07:01,580 capitalize2, but convert it to Python. 1543 01:07:01,580 --> 01:07:04,190 The purpose in life was to take input from the user 1544 01:07:04,190 --> 01:07:06,800 and just capitalize every character therein. 1545 01:07:06,800 --> 01:07:08,660 So if I type in my name in all lowercase, 1546 01:07:08,660 --> 01:07:10,830 it should come back as all uppercase. 1547 01:07:10,830 --> 01:07:12,650 So from the CS50 library, let me go ahead 1548 01:07:12,650 --> 01:07:16,310 and import get_string so that I have some input from the user. 1549 01:07:16,310 --> 01:07:20,660 Then let me go ahead and just get a string from the user, like their name.
1550 01:07:20,660 --> 01:07:24,530 And then I want to go ahead and capitalize everything. 1551 01:07:24,530 --> 01:07:27,470 So let me go ahead and do this. 1552 01:07:27,470 --> 01:07:29,090 And this is a fancy feature. 1553 01:07:29,090 --> 01:07:33,680 In C I would have done a for int i is zero i less than strlen. 1554 01:07:33,680 --> 01:07:36,850 I mean, you perhaps remember the paradigm for iterating over a string. 1555 01:07:36,850 --> 01:07:38,660 Python is just so much more pleasant. 1556 01:07:38,660 --> 01:07:40,820 For c in s-- 1557 01:07:40,820 --> 01:07:46,280 that will induce a loop over the string s, giving you access to every character 1558 01:07:46,280 --> 01:07:49,040 at a time, calling that variable c. 1559 01:07:49,040 --> 01:07:52,940 And so what is it I want to do, just as a preliminary step, 1560 01:07:52,940 --> 01:07:56,820 a baby step, if you will, let's just print out c, just to see what happens. 1561 01:07:56,820 --> 01:08:01,280 Let me go ahead down here and do Python of capitalize two. 1562 01:08:01,280 --> 01:08:03,590 Let me go ahead and type in my name, all lowercase. 1563 01:08:03,590 --> 01:08:06,260 All right, and why is it showing up vertically 1564 01:08:06,260 --> 01:08:08,880 like that, one character per line? 1565 01:08:08,880 --> 01:08:10,440 Yeah, you get the free line-- 1566 01:08:10,440 --> 01:08:11,997 free new line this time. 1567 01:08:11,997 --> 01:08:13,580 So let's see how you can disable that. 1568 01:08:13,580 --> 01:08:15,300 It's stupid looking, honestly. 1569 01:08:15,300 --> 01:08:20,069 But you say end equals quote unquote, thereby revealing a new feature 1570 01:08:20,069 --> 01:08:21,779 of Python that C does not have. 1571 01:08:21,779 --> 01:08:26,417 It turns out that Python has not only positional arguments, as it's called, 1572 01:08:26,417 --> 01:08:28,500 whereby you just pass in arguments between commas. 1573 01:08:28,500 --> 01:08:30,120 That's what we've been doing in C. 
1574 01:08:30,120 --> 01:08:33,630 But Python also has named arguments, whereby 1575 01:08:33,630 --> 01:08:35,939 you can specify the name of the argument, 1576 01:08:35,939 --> 01:08:38,250 then an equals sign, then the value. 1577 01:08:38,250 --> 01:08:42,569 And the power of named arguments, even though this is a tiny example, 1578 01:08:42,569 --> 01:08:46,154 means that you can sometimes pass in your arguments in any order. 1579 01:08:46,154 --> 01:08:47,279 You don't have to remember. 1580 01:08:47,279 --> 01:08:49,529 You don't have to pull up CS50 manual or the man pages 1581 01:08:49,529 --> 01:08:52,649 to remember what is the order of all these darn arguments. 1582 01:08:52,649 --> 01:08:55,410 You can pass them in in any order, but by specifying 1583 01:08:55,410 --> 01:08:58,920 the name of the argument, an equals sign, and its value. 1584 01:08:58,920 --> 01:09:01,089 And in Python, too, you can have optional arguments. 1585 01:09:01,089 --> 01:09:02,880 Obviously, in all of the examples thus far, 1586 01:09:02,880 --> 01:09:06,180 I have never typed the word end and an equals sign yet. 1587 01:09:06,180 --> 01:09:09,899 But what Python does support is default values for arguments. 1588 01:09:09,899 --> 01:09:14,550 And so if you look in the documentation for Python, this is equivalent-- 1589 01:09:14,550 --> 01:09:18,580 this cryptic looking sequence-- this is equivalent to the default behavior, 1590 01:09:18,580 --> 01:09:20,979 which is to type none of that at all. 1591 01:09:20,979 --> 01:09:25,080 End implies, for the print function, that you should end every line 1592 01:09:25,080 --> 01:09:26,609 with that default character. 1593 01:09:26,609 --> 01:09:28,680 Therefore, if you want to override it, you 1594 01:09:28,680 --> 01:09:31,800 can just change it to the empty string, quote unquote.
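Named arguments and default values, as described above, can be illustrated with a short sketch. The helper function spell is hypothetical and exists only for this example; it gives its own end parameter a default value and forwards it to print, whose end and sep parameters are part of the built-in function.

```python
def spell(s, end="\n"):
    # end has a default value, so callers may leave it out entirely.
    for c in s:
        print(c, end=end)


spell("david")           # default line ending: one character per line
spell("david", end="")   # empty-string ending: all characters on one line
print()                  # move the cursor to the next line

# Named arguments can be given in either order.
print("a", "b", sep="-", end="!\n")
print("a", "b", end="!\n", sep="-")
```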
1595 01:09:31,800 --> 01:09:36,660 So if I now run this again and run it through with my name, 1596 01:09:36,660 --> 01:09:39,000 now I get it like that, one character at a time. 1597 01:09:39,000 --> 01:09:42,149 But you can do weird things, like ha ha ha ha ha-- 1598 01:09:42,149 --> 01:09:43,644 not that you would. 1599 01:09:43,644 --> 01:09:45,060 I don't know why I went with that. 1600 01:09:45,060 --> 01:09:48,984 But I mean, that does the exact same thing 1601 01:09:48,984 --> 01:09:50,859 because you're just changing the line ending. 1602 01:09:50,859 --> 01:09:54,090 So don't do that, but do something else like this with it, instead. 1603 01:09:54,090 --> 01:09:57,330 So suppose I want to now capitalize the first character. 1604 01:09:57,330 --> 01:10:02,910 It turns out that strings in Python are more powerful than strings 1605 01:10:02,910 --> 01:10:05,190 in C. In C, there is no string. 1606 01:10:05,190 --> 01:10:06,160 That was a lie. 1607 01:10:06,160 --> 01:10:09,750 It's just a sequence of characters as referenced by an address in memory. 1608 01:10:09,750 --> 01:10:12,270 In Python, a string is an actual object. 1609 01:10:12,270 --> 01:10:13,410 It's a data structure. 1610 01:10:13,410 --> 01:10:16,470 And if you think about C, we had structs toward the very end of our look 1611 01:10:16,470 --> 01:10:19,230 at C, nodes and structs and student structures and the like. 1612 01:10:19,230 --> 01:10:22,027 A string in Python is like this container inside of which 1613 01:10:22,027 --> 01:10:23,610 somewhere are all of those characters. 1614 01:10:23,610 --> 01:10:27,540 But in that container or structure is also built-in functions, 1615 01:10:27,540 --> 01:10:29,850 features of a string that you can just call. 1616 01:10:29,850 --> 01:10:33,120 So in C, we would have said something like toupper 1617 01:10:33,120 --> 01:10:36,180 and then passed as input to a function called toupper 1618 01:10:36,180 --> 01:10:37,770 the character that we care about.
1619 01:10:37,770 --> 01:10:40,080 Python kind of flips the logic around. 1620 01:10:40,080 --> 01:10:43,170 Strings come with built-in functionality that 1621 01:10:43,170 --> 01:10:47,260 allow you to operate on the given character automatically. 1622 01:10:47,260 --> 01:10:50,160 So in Python, the syntax is actually the character itself. 1623 01:10:50,160 --> 01:10:52,530 Use the dot notation because it's a structure. 1624 01:10:52,530 --> 01:10:54,480 And then you can literally do-- 1625 01:10:54,480 --> 01:10:55,080 oops. 1626 01:10:55,080 --> 01:10:58,330 You can literally do upper. 1627 01:10:58,330 --> 01:11:04,290 So this is to say, built into the string type in Python 1628 01:11:04,290 --> 01:11:07,942 is a bunch of features, one of which is a function called upper. 1629 01:11:07,942 --> 01:11:10,650 And the syntax with which you call it is the name of the variable 1630 01:11:10,650 --> 01:11:14,865 or the name of the string dot name of the function open paren, close paren. 1631 01:11:14,865 --> 01:11:16,240 And that's just now the paradigm. 1632 01:11:16,240 --> 01:11:17,580 There's no ctype library. 1633 01:11:17,580 --> 01:11:19,470 There's no toupper or tolower. 1634 01:11:19,470 --> 01:11:22,020 Those features are now built into the strings themselves. 1635 01:11:22,020 --> 01:11:24,630 And this is an example of encapsulation, or more 1636 01:11:24,630 --> 01:11:26,700 generally, object oriented programming, something 1637 01:11:26,700 --> 01:11:29,460 you'll explore if you take a class like CS51 that 1638 01:11:29,460 --> 01:11:33,930 bakes into the data types itself all of the relevant functionality. 1639 01:11:33,930 --> 01:11:37,420 It does not relegate them to another library. 1640 01:11:37,420 --> 01:11:41,020 So if I clean this up by just moving the cursor to the next line, 1641 01:11:41,020 --> 01:11:45,750 now hopefully you'll indeed see David typed out in all caps, the same idea 1642 01:11:45,750 --> 01:11:46,620 as before.
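The character-by-character version just described might look like the sketch below. The literal "david" stands in for what the user would type; the lecture reads it with the course library's string-input function instead.

```python
s = "david"  # stand-in for user input

for c in s:
    # upper is called on the character itself with dot notation.
    print(c.upper(), end="")
print()  # move the cursor to the next line
```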
1643 01:11:46,620 --> 01:11:48,220 What about this length of a string? 1644 01:11:48,220 --> 01:11:50,370 This one is pretty trivial, but if I go in here, 1645 01:11:50,370 --> 01:11:54,106 let me go ahead and create a file called strlen.py. 1646 01:11:54,106 --> 01:11:57,660 If I want to see the length of a string, from CS50 import getString, 1647 01:11:57,660 --> 01:11:59,010 just as we did before. 1648 01:11:59,010 --> 01:12:01,802 Let me go ahead and get a string for myself, like my name again. 1649 01:12:01,802 --> 01:12:04,760 And then here, if I want to print the length of the string, in Python-- 1650 01:12:04,760 --> 01:12:06,732 in C, you would say strlen. 1651 01:12:06,732 --> 01:12:08,190 In Python, it's a little different. 1652 01:12:08,190 --> 01:12:10,780 You actually just say len for length. 1653 01:12:10,780 --> 01:12:13,830 So if I go ahead and run this through strlen-- 1654 01:12:13,830 --> 01:12:15,960 strlen-- type in my name. 1655 01:12:15,960 --> 01:12:17,340 Hopefully I, indeed, see five. 1656 01:12:17,340 --> 01:12:20,640 And there's no notion that you need to care about the backslash zero 1657 01:12:20,640 --> 01:12:23,410 in order to terminate the string. 1658 01:12:23,410 --> 01:12:25,119 Yeah? 1659 01:12:25,119 --> 01:12:31,400 AUDIENCE: So this upper [INAUDIBLE] 1660 01:12:31,400 --> 01:12:32,900 DAVID MALAN: No, in fact. 1661 01:12:32,900 --> 01:12:34,890 So that's a really good observation. 1662 01:12:34,890 --> 01:12:37,880 Let's rewind and actually improve upon this 1663 01:12:37,880 --> 01:12:42,380 rather than just translate it from what was our comparable example in C. Let 1664 01:12:42,380 --> 01:12:45,980 me go ahead here and actually say, you know what? 1665 01:12:45,980 --> 01:12:48,680 S gets s upper. 1666 01:12:48,680 --> 01:12:50,480 And then let me just print s, perhaps. 1667 01:12:50,480 --> 01:12:51,660 Let's see what happens. 1668 01:12:51,660 --> 01:12:55,100 Let me go back here and run Python of capitalize 2.
1669 01:12:55,100 --> 01:12:57,230 Enter David. 1670 01:12:57,230 --> 01:12:58,730 And it operates on the whole string. 1671 01:12:58,730 --> 01:12:59,355 Good intuition. 1672 01:12:59,355 --> 01:13:01,020 And honestly, I don't need to do this. 1673 01:13:01,020 --> 01:13:07,400 I could just say upper here and really trim this down and do 1674 01:13:07,400 --> 01:13:11,360 Python of capitalize, type in my name. 1675 01:13:11,360 --> 01:13:12,080 That still works. 1676 01:13:12,080 --> 01:13:15,650 And if I really want to be fancy, I don't even need s at all. 1677 01:13:15,650 --> 01:13:20,060 I can take this, get rid of that, put this here, immediately call 1678 01:13:20,060 --> 01:13:24,200 upper on the user's input and whittle this down to one line, type in David, 1679 01:13:24,200 --> 01:13:25,200 and that, too, works. 1680 01:13:25,200 --> 01:13:28,190 So you just get lots and lots and lots of more expressiveness. 1681 01:13:28,190 --> 01:13:29,030 Good question. 1682 01:13:29,030 --> 01:13:31,490 So how do you even know that things like this exist? 1683 01:13:31,490 --> 01:13:32,870 Well, quick aside. 1684 01:13:32,870 --> 01:13:35,344 Google will truly be your friend in cases like this. 1685 01:13:35,344 --> 01:13:38,510 And you'll want to know at this point, there's different versions of Python. 1686 01:13:38,510 --> 01:13:40,460 The world is kind of holding out and is still 1687 01:13:40,460 --> 01:13:43,930 using, a lot of people, version 2 of Python, which is older by many years 1688 01:13:43,930 --> 01:13:44,460 now. 1689 01:13:44,460 --> 01:13:45,489 We are using version 3. 1690 01:13:45,489 --> 01:13:47,030 And this is where the world is going. 1691 01:13:47,030 --> 01:13:50,450 And indeed, Python 2 will be officially deprecated or phased out 1692 01:13:50,450 --> 01:13:52,040 in a couple of years, theoretically. 
1693 01:13:52,040 --> 01:13:54,150 So when you Google, you just want to be mindful of this 1694 01:13:54,150 --> 01:13:57,230 so that you don't accidentally make your way to old tutorials, old documentation 1695 01:13:57,230 --> 01:13:57,920 and the like. 1696 01:13:57,920 --> 01:14:03,380 So let me go ahead and Google Python 3 string, or str, and upper, 1697 01:14:03,380 --> 01:14:05,510 just to see if I can get to the documentation. 1698 01:14:05,510 --> 01:14:07,790 Here you have a number of tutorials. 1699 01:14:07,790 --> 01:14:11,300 But if we focus down here, what you're generally going to want to look for, 1700 01:14:11,300 --> 01:14:15,110 at least for the official documentation, is docs.python.org. 1701 01:14:15,110 --> 01:14:18,420 You see in the URL it's version 3, and that's where we want to go. 1702 01:14:18,420 --> 01:14:21,120 So let me go ahead and click on this, common string operators. 1703 01:14:21,120 --> 01:14:22,550 And I will disclaim this-- 1704 01:14:22,550 --> 01:14:24,770 I think, personally, Python's documentation 1705 01:14:24,770 --> 01:14:26,407 is not terribly newbie-friendly. 1706 01:14:26,407 --> 01:14:28,490 Like, it's written fairly arcanely and you kind of 1707 01:14:28,490 --> 01:14:30,800 have to really dig to understand certain things. 1708 01:14:30,800 --> 01:14:31,372 That's fine. 1709 01:14:31,372 --> 01:14:33,080 You'll get comfortable with it over time. 1710 01:14:33,080 --> 01:14:34,996 But if you're feeling a little overwhelmed by, 1711 01:14:34,996 --> 01:14:39,050 oh my God, I just want to know about upper, everyone feels this way too. 1712 01:14:39,050 --> 01:14:42,350 So control F or Command F is your friend, upper. 1713 01:14:42,350 --> 01:14:44,390 Let me go ahead and search for this. 1714 01:14:44,390 --> 01:14:47,400 And it's not actually on this page, is it? 1715 01:14:47,400 --> 01:14:50,430 String-- string methods. 1716 01:14:50,430 --> 01:14:50,930 Here we go. 
1717 01:14:50,930 --> 01:14:52,190 String methods. 1718 01:14:52,190 --> 01:14:56,710 OK, so under string methods, let me go ahead and search for upper. 1719 01:14:56,710 --> 01:14:59,020 And down here, indeed, is the documentation. 1720 01:14:59,020 --> 01:15:02,490 So the convention will be the name of the data type in question-- 1721 01:15:02,490 --> 01:15:03,590 str for string-- 1722 01:15:03,590 --> 01:15:04,840 the name of the function here. 1723 01:15:04,840 --> 01:15:08,330 It would tell you in parentheses if it takes any arguments, but it doesn't. 1724 01:15:08,330 --> 01:15:11,457 And so it returns a copy of the string with all of the cased characters 1725 01:15:11,457 --> 01:15:14,290 converted to uppercase-- that just means the letters of the alphabet 1726 01:15:14,290 --> 01:15:15,620 essentially-- 1727 01:15:15,620 --> 01:15:17,870 and then some additional documentation, and so forth. 1728 01:15:17,870 --> 01:15:19,330 It gets pretty low-level pretty quickly. 1729 01:15:19,330 --> 01:15:21,160 These are the equivalent of the man pages. 1730 01:15:21,160 --> 01:15:23,380 And there is no CS50 reference for Python. 1731 01:15:23,380 --> 01:15:25,630 That was just for C. So just realize that there's 1732 01:15:25,630 --> 01:15:27,052 this documentation available. 1733 01:15:27,052 --> 01:15:29,010 And you'll notice there's bunches of functions. 1734 01:15:29,010 --> 01:15:33,197 Strip is actually kind of a popular one, or L strip or R strip. 1735 01:15:33,197 --> 01:15:35,530 If you have whitespace at the beginning or end of a line 1736 01:15:35,530 --> 01:15:39,160 because your human got a little sloppy or there's new lines in a file, 1737 01:15:39,160 --> 01:15:42,250 you can call strip on a string and get rid of whitespace to the left 1738 01:15:42,250 --> 01:15:43,715 and right to kind of clean it up. 
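As a small sketch of the whitespace-stripping methods just mentioned: the messy sample value below is made up for illustration, and repr is used only so the remaining whitespace is visible when printed.

```python
# Made-up messy input: leading spaces and a trailing newline.
raw = "   David\n"

both = raw.strip()    # removes whitespace from both ends
left = raw.lstrip()   # removes leading whitespace only
right = raw.rstrip()  # removes trailing whitespace only

# repr() shows exactly which whitespace characters survive.
print(repr(both))
print(repr(left))
print(repr(right))
```

Like upper, each of these returns a new string and leaves raw untouched.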
1739 01:15:43,715 --> 01:15:46,090 Terribly useful for things like data science applications 1740 01:15:46,090 --> 01:15:48,756 and analysis of data where you just kind of clean up messy data. 1741 01:15:48,756 --> 01:15:51,710 So many functions like that are built in for you. 1742 01:15:51,710 --> 01:15:55,060 All right, so let's take a look at a few other examples reminiscent of features 1743 01:15:55,060 --> 01:15:58,040 we did have in C, such as this one here. 1744 01:15:58,040 --> 01:15:59,950 Suppose I want to write a program that takes 1745 01:15:59,950 --> 01:16:02,545 command line arguments, much like resize, 1746 01:16:02,545 --> 01:16:04,045 with which we started today's story. 1747 01:16:04,045 --> 01:16:06,700 1748 01:16:06,700 --> 01:16:08,360 Let's not even use the CS50 library. 1749 01:16:08,360 --> 01:16:09,340 Let's do this. 1750 01:16:09,340 --> 01:16:13,540 If you want access to argv, recall in C it looked like this-- int, 1751 01:16:13,540 --> 01:16:19,150 argc, string, argv. 1752 01:16:19,150 --> 01:16:20,830 It looked like this in C. 1753 01:16:20,830 --> 01:16:22,789 Well, unfortunately, if you're not using main, 1754 01:16:22,789 --> 01:16:25,330 it would be nice if you can still use command line arguments. 1755 01:16:25,330 --> 01:16:27,070 And you can, but you have to import them. 1756 01:16:27,070 --> 01:16:29,230 It's a library that provides you with access. 1757 01:16:29,230 --> 01:16:33,730 From the sys or system library, you can import argv in Python. 1758 01:16:33,730 --> 01:16:36,850 And that gives you access to command line arguments as a feature. 1759 01:16:36,850 --> 01:16:38,540 Then you can say something like this. 1760 01:16:38,540 --> 01:16:40,750 If the length of argv-- 1761 01:16:40,750 --> 01:16:43,060 which is just an array, recall, in C-- 1762 01:16:43,060 --> 01:16:47,260 equals equals 2, then go ahead and say hello. 1763 01:16:47,260 --> 01:16:52,030 And let's go ahead and print out whatever the user typed in, argv 1. 
1764 01:16:52,030 --> 01:16:56,030 Else, let's just by default say hello world. 1765 01:16:56,030 --> 01:16:57,670 So in English, what's happening? 1766 01:16:57,670 --> 01:17:01,990 If the user typed in a command line argument-- say, hello so-and-so. 1767 01:17:01,990 --> 01:17:04,870 Else if the human did not type in exactly one command line argument, 1768 01:17:04,870 --> 01:17:07,210 just say, by default, hello world. 1769 01:17:07,210 --> 01:17:08,170 So let me save this. 1770 01:17:08,170 --> 01:17:11,440 Do Python of argv1, or rather zero. 1771 01:17:11,440 --> 01:17:12,200 Enter. 1772 01:17:12,200 --> 01:17:14,330 OK, I didn't type in a word after the command. 1773 01:17:14,330 --> 01:17:18,440 So now let's do it again and I'll type in Brian's name. 1774 01:17:18,440 --> 01:17:19,720 Enter, hello Brian. 1775 01:17:19,720 --> 01:17:21,220 Let's do it again. 1776 01:17:21,220 --> 01:17:23,590 Veronica, enter. 1777 01:17:23,590 --> 01:17:27,700 Now, there's something that's not quite the same as C. How many words did I 1778 01:17:27,700 --> 01:17:30,770 just type at the prompt? 1779 01:17:30,770 --> 01:17:31,810 3. 1780 01:17:31,810 --> 01:17:37,830 So that would suggest that this is argv 0, argv 1, and argv 2. 1781 01:17:37,830 --> 01:17:41,190 And yet, I'm printing argv 1, not argv 2. 1782 01:17:41,190 --> 01:17:43,470 So how do I think about this? 1783 01:17:43,470 --> 01:17:47,430 The code is correct, but it's different from C. 1784 01:17:47,430 --> 01:17:50,950 What does argv technically store when you run a command like these? 1785 01:17:50,950 --> 01:17:57,852 1786 01:17:57,852 --> 01:17:58,810 Remember, let's rewind. 1787 01:17:58,810 --> 01:18:02,055 In C, argv 0 stored what? 1788 01:18:02,055 --> 01:18:03,180 AUDIENCE: Name of the file. 1789 01:18:03,180 --> 01:18:06,480 DAVID MALAN: The name of the file or the name of the program you just ran. 1790 01:18:06,480 --> 01:18:09,780 Notice, though, the program I just ran is called Python. 
1791 01:18:09,780 --> 01:18:13,080 And so you would think that argv 0 would have Python in it, 1792 01:18:13,080 --> 01:18:16,080 but it doesn't because notice if I'm printing argv 1, 1793 01:18:16,080 --> 01:18:17,730 you would think that's 0, 1. 1794 01:18:17,730 --> 01:18:20,830 You would think I just said hello argv0.py, but I didn't. 1795 01:18:20,830 --> 01:18:24,720 argv 1 clearly prints Veronica or Brian. 1796 01:18:24,720 --> 01:18:27,379 So it stands to reason argv 0 is this, which 1797 01:18:27,379 --> 01:18:28,920 means this is, like, argv negative 1. 1798 01:18:28,920 --> 01:18:32,840 Python is excluded from the argument vector, as it's called. 1799 01:18:32,840 --> 01:18:35,800 The command line arguments do not include the name of the interpreter. 1800 01:18:35,800 --> 01:18:39,720 But otherwise, it works exactly the same as it did once upon a time. 1801 01:18:39,720 --> 01:18:43,020 And notice, too, with this new for construct, 1802 01:18:43,020 --> 01:18:46,380 notice what you can do whenever you have access to an array of things. 1803 01:18:46,380 --> 01:18:52,569 If I go into argv1.py and import argv again, let me go ahead now 1804 01:18:52,569 --> 01:18:53,610 and just-- you know what? 1805 01:18:53,610 --> 01:18:57,990 For s in argv, go ahead and print out s. 1806 01:18:57,990 --> 01:18:59,220 It's really succinct. 1807 01:18:59,220 --> 01:19:00,450 What is this going to do? 1808 01:19:00,450 --> 01:19:04,500 Let me go ahead and do Python of argv1, enter. 1809 01:19:04,500 --> 01:19:06,490 And it just prints out the name of the file. 1810 01:19:06,490 --> 01:19:09,780 If I go ahead and say foo, bar, baz, three random words, 1811 01:19:09,780 --> 01:19:11,590 it prints out all of those words. 1812 01:19:11,590 --> 01:19:14,310 And so what's powerful about Python is honestly this for loop. 1813 01:19:14,310 --> 01:19:17,190 There's no int i, less than, plus plus, any of that.
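The two argv programs described above might look roughly like this. It is a sketch under assumptions: the helper name greeting and the exact wording of the output are illustrative, not taken from the lecture's files.

```python
from sys import argv


def greeting(args):
    """Return a greeting for an argument list shaped like sys.argv."""
    # args[0] is the script's file name (for example argv0.py);
    # the interpreter's own name, python, is not part of the list.
    if len(args) == 2:
        return f"hello, {args[1]}"
    return "hello, world"


print(greeting(argv))

# Iterate over every argument directly: no counter, no bounds check.
for s in argv:
    print(s)
```

Run as python argv0.py Brian, the list would hold the script name followed by Brian, which is why index 1 is the first word the user typed after the file name.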
1814 01:19:17,190 --> 01:19:19,380 You just say, give me a variable called s 1815 01:19:19,380 --> 01:19:22,950 and iterate over the entirety of the thing on the right, which is presumed, 1816 01:19:22,950 --> 01:19:25,020 in this case, to be an array. 1817 01:19:25,020 --> 01:19:26,790 You can be even more powerful than that. 1818 01:19:26,790 --> 01:19:29,080 If I-- just like in C weeks ago-- 1819 01:19:29,080 --> 01:19:32,550 look at characters in these strings-- let me do argv2.py-- 1820 01:19:32,550 --> 01:19:38,026 suppose that this iterate over each string in argv, 1821 01:19:38,026 --> 01:19:46,530 and then here iterate over each character in s, I can do for c in s 1822 01:19:46,530 --> 01:19:49,330 and now print out the character. 1823 01:19:49,330 --> 01:19:53,820 So now when I run this same command but on argv2.py, 1824 01:19:53,820 --> 01:19:55,090 notice what's going to happen. 1825 01:19:55,090 --> 01:19:57,410 Let me raise this a little bit. 1826 01:19:57,410 --> 01:19:59,350 Enter. 1827 01:19:59,350 --> 01:20:03,010 It prints every character from every word one at a time. 1828 01:20:03,010 --> 01:20:06,320 But it did so this time based on using these two for loops. 1829 01:20:06,320 --> 01:20:07,330 So what does this mean? 1830 01:20:07,330 --> 01:20:10,540 When you have an array, as we've called it, 1831 01:20:10,540 --> 01:20:12,490 you can iterate over everything in the array. 1832 01:20:12,490 --> 01:20:15,800 When you have a string, you can iterate over every character in the string. 1833 01:20:15,800 --> 01:20:17,716 And this is where Python just gets wonderfully 1834 01:20:17,716 --> 01:20:20,620 flexible to do this again and again. 1835 01:20:20,620 --> 01:20:23,170 All right, let's take a look at-- 1836 01:20:23,170 --> 01:20:25,140 let's see-- compared strings already. 1837 01:20:25,140 --> 01:20:26,510 We copied strings. 1838 01:20:26,510 --> 01:20:29,110 Let's go ahead and do this in Python. 
1839 01:20:29,110 --> 01:20:32,290 Recall that we ran into a fundamental limitation of C, 1840 01:20:32,290 --> 01:20:35,560 and it would seem programming, when we had examples called swap 1841 01:20:35,560 --> 01:20:38,080 and no swap back in the day where I was just 1842 01:20:38,080 --> 01:20:40,240 trying to swap two values, x and y. 1843 01:20:40,240 --> 01:20:44,170 And recall that I hardcoded something like x is 1 and y is 2. 1844 01:20:44,170 --> 01:20:48,370 And the whole goal was simply to first say, x is such and such, 1845 01:20:48,370 --> 01:20:50,920 y is such and such. 1846 01:20:50,920 --> 01:20:53,350 Let me go ahead and make that a format string. 1847 01:20:53,350 --> 01:20:55,360 Then I wanted to print this again. 1848 01:20:55,360 --> 01:20:58,570 But somewhere in here, I wanted to swap x and y. 1849 01:20:58,570 --> 01:21:01,900 So to punctuate our sort of exploration of just what Python can do, 1850 01:21:01,900 --> 01:21:07,020 if you want to swap two variables, x and y, that's fine, just do it. 1851 01:21:07,020 --> 01:21:10,410 And it's this magical shell game that just works in Python. 1852 01:21:10,410 --> 01:21:13,260 Now, technically these are what are called tuples on the left. 1853 01:21:13,260 --> 01:21:15,010 It's an x comma y pair. 1854 01:21:15,010 --> 01:21:16,320 It's latitude comma longitude. 1855 01:21:16,320 --> 01:21:20,640 So there's an actual underlying mental model for what's going on here. 1856 01:21:20,640 --> 01:21:22,590 But in effect, you're literally switching them 1857 01:21:22,590 --> 01:21:24,340 and you don't need the temporary variable. 1858 01:21:24,340 --> 01:21:28,560 Python the language takes care of that for you. 1859 01:21:28,560 --> 01:21:30,670 All right, let's look at a more powerful feature 1860 01:21:30,670 --> 01:21:33,930 still, this time using what's actually called a list. 1861 01:21:33,930 --> 01:21:38,010 So a moment ago I was using argv 0, 1, 2, as our examples.
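For reference, the swap described a moment ago can be sketched in a few lines. The values 1 and 2 come from the lecture; the exact wording of the printed strings is an assumption.

```python
x = 1
y = 2
print(f"x is {x}, y is {y}")

# The right-hand side is packed into a tuple (y, x),
# which is then unpacked into x and y. No temporary variable is needed.
x, y = y, x

print(f"x is {x}, y is {y}")
```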
1862 01:21:38,010 --> 01:21:40,096 And I was calling them arrays. 1863 01:21:40,096 --> 01:21:41,220 They're not arrays anymore. 1864 01:21:41,220 --> 01:21:43,080 Python does not have arrays. 1865 01:21:43,080 --> 01:21:44,640 Python has lists. 1866 01:21:44,640 --> 01:21:46,595 And lists sounds reminiscent of linked lists. 1867 01:21:46,595 --> 01:21:47,470 And indeed, they are. 1868 01:21:47,470 --> 01:21:50,786 In Python, you have lists that are resizable. 1869 01:21:50,786 --> 01:21:53,910 You don't have to decide in advance how big they are or how small they are. 1870 01:21:53,910 --> 01:21:57,330 They will just grow and shrink for you just like a linked list will, 1871 01:21:57,330 --> 01:21:59,760 but you don't have to write the linked list yourself. 1872 01:21:59,760 --> 01:22:00,456 Yeah? 1873 01:22:00,456 --> 01:22:04,854 AUDIENCE: [INAUDIBLE] 1874 01:22:04,854 --> 01:22:05,603 DAVID MALAN: Sure. 1875 01:22:05,603 --> 01:22:10,434 AUDIENCE: [INAUDIBLE] 1876 01:22:10,434 --> 01:22:11,350 DAVID MALAN: Oh, sure. 1877 01:22:11,350 --> 01:22:15,130 Let me open that file up in argv1. 1878 01:22:15,130 --> 01:22:16,263 This one here? 1879 01:22:16,263 --> 01:22:19,404 AUDIENCE: No, it was, like, [INAUDIBLE]. 1880 01:22:19,404 --> 01:22:20,695 DAVID MALAN: Oh, this one here. 1881 01:22:20,695 --> 01:22:21,319 AUDIENCE: Yeah. 1882 01:22:21,319 --> 01:22:23,430 [INAUDIBLE] bracket notation [INAUDIBLE].. 1883 01:22:23,430 --> 01:22:26,257 DAVID MALAN: Yes, you can still-- so argv, I called it an array, 1884 01:22:26,257 --> 01:22:27,840 but that was a white lie a moment ago. 1885 01:22:27,840 --> 01:22:29,770 It's actually a list, a linked list. 
1886 01:22:29,770 --> 01:22:32,850 But whereas a linked list in C does not allow you to use square brackets, 1887 01:22:32,850 --> 01:22:34,600 you have to use a for loop or a while loop 1888 01:22:34,600 --> 01:22:38,230 to iterate over the whole thing to find what you're looking for, in Python, 1889 01:22:38,230 --> 01:22:41,200 if something is in a list, you can just use, yes, the square brackets 1890 01:22:41,200 --> 01:22:42,700 to get at that specific element. 1891 01:22:42,700 --> 01:22:45,930 AUDIENCE: Or I'm saying you could use the f right before-- 1892 01:22:45,930 --> 01:22:47,570 DAVID MALAN: Oh, I could have, yes. 1893 01:22:47,570 --> 01:22:52,280 I didn't use the F, just because frankly it just gets ugly eventually. 1894 01:22:52,280 --> 01:22:56,210 But yes, I could have also done this to achieve the exact same effect. 1895 01:22:56,210 --> 01:22:58,760 It just starts to look cryptic. 1896 01:22:58,760 --> 01:23:03,970 OK, so let's actually introduce a list, which itself is a data type in Python, 1897 01:23:03,970 --> 01:23:07,746 as well as in languages like C++ and Java, 1898 01:23:07,746 --> 01:23:09,620 if some of you have that background, as well. 1899 01:23:09,620 --> 01:23:13,100 So here, in list.py, let me go ahead and do the following. 1900 01:23:13,100 --> 01:23:15,589 Let me first import from the CS50 library getInt 1901 01:23:15,589 --> 01:23:17,380 so that we can get some ints from the user. 1902 01:23:17,380 --> 01:23:19,510 Let me give myself an array, a.k.a. 1903 01:23:19,510 --> 01:23:22,810 now a list in Python. 1904 01:23:22,810 --> 01:23:25,930 So in C you can't really express quite this idea. 1905 01:23:25,930 --> 01:23:29,260 In Python, if you want a variable called numbers 1906 01:23:29,260 --> 01:23:31,390 and you want to initialize it to an empty list, 1907 01:23:31,390 --> 01:23:33,490 you just literally do open bracket, close bracket. 1908 01:23:33,490 --> 01:23:35,140 No number in between them. 
1909 01:23:35,140 --> 01:23:37,090 And as before, no semi-colon. 1910 01:23:37,090 --> 01:23:40,660 Let's now do the following forever until I break out of this. 1911 01:23:40,660 --> 01:23:43,150 Let me go ahead and get a number from the user, 1912 01:23:43,150 --> 01:23:45,130 just by asking them for some number. 1913 01:23:45,130 --> 01:23:49,605 Then let me say, if not number, go ahead and break out of this. 1914 01:23:49,605 --> 01:23:51,730 This is going to, as an aside, just let me quit out 1915 01:23:51,730 --> 01:23:55,400 of this by hitting Control D as we discussed ever so briefly a while back. 1916 01:23:55,400 --> 01:23:57,010 But that's just a UI feature. 1917 01:23:57,010 --> 01:23:58,750 So this is what's kind of cool. 1918 01:23:58,750 --> 01:24:02,920 Suppose I want to implement the notion of checking 1919 01:24:02,920 --> 01:24:06,850 if the number the user's typed in is in the list already, and if so, 1920 01:24:06,850 --> 01:24:07,394 not add it. 1921 01:24:07,394 --> 01:24:08,810 I'm going to go ahead and do that. 1922 01:24:08,810 --> 01:24:10,101 But first, let's just do this-- 1923 01:24:10,101 --> 01:24:13,360 numbers.append number. 1924 01:24:13,360 --> 01:24:14,860 And this is a new feature. 1925 01:24:14,860 --> 01:24:16,120 So what do I want to do here? 1926 01:24:16,120 --> 01:24:17,770 For number in numbers-- 1927 01:24:17,770 --> 01:24:20,146 I'll explain this in a second-- 1928 01:24:20,146 --> 01:24:21,520 let me go ahead and print number. 1929 01:24:21,520 --> 01:24:23,910 So what is this program aspiring to do? 1930 01:24:23,910 --> 01:24:26,025 At the very top, I'm importing getInt. 1931 01:24:26,025 --> 01:24:29,170 At the very top below that, I'm just giving myself an empty array, 1932 01:24:29,170 --> 01:24:31,290 now called a list, called numbers. 1933 01:24:31,290 --> 01:24:33,360 Then I do the following forever. 1934 01:24:33,360 --> 01:24:35,280 Go ahead and get the number from the user. 
1935 01:24:35,280 --> 01:24:38,238 If he or she did not actually type in a number, just break out of this. 1936 01:24:38,238 --> 01:24:39,300 The program is done. 1937 01:24:39,300 --> 01:24:40,770 But here's the new feature. 1938 01:24:40,770 --> 01:24:44,170 Just as with strings, they are objects, so to speak. 1939 01:24:44,170 --> 01:24:46,830 They are data structures that have functions built in. 1940 01:24:46,830 --> 01:24:49,500 So do lists have functions built in. 1941 01:24:49,500 --> 01:24:52,470 There is literally a function inside of every Python list 1942 01:24:52,470 --> 01:24:54,540 called append that literally does that. 1943 01:24:54,540 --> 01:24:56,970 You call append and it appends whatever its input 1944 01:24:56,970 --> 01:24:59,670 is to whatever the list itself is. 1945 01:24:59,670 --> 01:25:03,720 So in C, you might have had to use realloc. 1946 01:25:03,720 --> 01:25:06,630 You might have had to add something to the end of the list. 1947 01:25:06,630 --> 01:25:07,980 None of that happens anymore. 1948 01:25:07,980 --> 01:25:10,410 Just at a high level, you say append this to the list 1949 01:25:10,410 --> 01:25:12,690 and let the language take care of it for you. 1950 01:25:12,690 --> 01:25:15,690 Then down here, left-aligned all the way at the end, 1951 01:25:15,690 --> 01:25:17,580 is just saying, for number in numbers. 1952 01:25:17,580 --> 01:25:21,880 Like, iterate over all of the numbers in the list and print out one at a time. 1953 01:25:21,880 --> 01:25:22,720 So let's try this. 1954 01:25:22,720 --> 01:25:25,530 Let me go down here and do Python of-- 1955 01:25:25,530 --> 01:25:31,920 this is list.py-- and let me go ahead and type in a number like 13, 42, 50. 1956 01:25:31,920 --> 01:25:34,950 And I'm going to hit Control D, which means that's it, I'm done. 1957 01:25:34,950 --> 01:25:36,450 And there we see the three numbers. 1958 01:25:36,450 --> 01:25:38,100 It looks a little stupid because you know what? 
1959 01:25:38,100 --> 01:25:39,480 I think I need a print here. 1960 01:25:39,480 --> 01:25:40,800 Let's fix this. 1961 01:25:40,800 --> 01:25:42,200 Let me rerun this. 1962 01:25:42,200 --> 01:25:45,510 13, 42, 50, Control D, there we go. 1963 01:25:45,510 --> 01:25:46,710 One per line. 1964 01:25:46,710 --> 01:25:50,400 But what this program has is honestly kind of a bug, potentially. 1965 01:25:50,400 --> 01:25:53,622 Suppose I want unique numbers, now I have three 13s. 1966 01:25:53,622 --> 01:25:56,580 But I'd ideally just want one copy of every number for whatever reason. 1967 01:25:56,580 --> 01:25:57,730 I want uniqueness. 1968 01:25:57,730 --> 01:26:00,570 Well, notice how easily you can express that. 1969 01:26:00,570 --> 01:26:05,710 If my goal is to only conditionally add a number to the numbers list 1970 01:26:05,710 --> 01:26:08,940 if it's not already there, how would you do this in C? 1971 01:26:08,940 --> 01:26:11,730 You have an array called numbers and you want to first check 1972 01:26:11,730 --> 01:26:13,390 is a number in that array. 1973 01:26:13,390 --> 01:26:15,623 What would you do in English? 1974 01:26:15,623 --> 01:26:16,570 AUDIENCE: A for loop. 1975 01:26:16,570 --> 01:26:17,240 DAVID MALAN: A for loop, right? 1976 01:26:17,240 --> 01:26:18,880 You'd probably start at the left, iterate over 1977 01:26:18,880 --> 01:26:21,850 the whole array looking for the number and then conclude true or false, 1978 01:26:21,850 --> 01:26:22,632 it's in there. 1979 01:26:22,632 --> 01:26:24,340 It's not hard but it's a little annoying. 1980 01:26:24,340 --> 01:26:27,423 You have to write more code, a couple of lines, four lines for a for loop. 1981 01:26:27,423 --> 01:26:30,130 In Python, just say what you mean. 1982 01:26:30,130 --> 01:26:35,140 If number not in numbers, append it. 1983 01:26:35,140 --> 01:26:37,360 And it reads much more like English. 
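The list-building loop with the uniqueness check can be sketched as below. To keep it self-contained, the numbers are passed in as a plain list rather than read interactively, and the function name unique_in_order is an assumption made for illustration.

```python
def unique_in_order(values):
    """Collect values into a list, skipping any that are already present."""
    numbers = []  # an empty list; it grows as items are appended
    for number in values:
        # "not in" scans the list for us; no hand-written search loop.
        if number not in numbers:
            numbers.append(number)
    return numbers


# Stand-in for the user typing 13, 13, 50 and then Control D.
for number in unique_in_order([13, 13, 50]):
    print(number)
```

The membership test still takes time proportional to the length of the list, so the convenience is in the syntax, not in the running time.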
1984 01:26:37,360 --> 01:26:41,990 At the end of the day, some human wrote the for loop that does that operation. 1985 01:26:41,990 --> 01:26:46,110 But we, the more modern programmers, can just now say, if number not in numbers, 1986 01:26:46,110 --> 01:26:46,930 append it. 1987 01:26:46,930 --> 01:26:48,870 And so it is meant to read more English-like. 1988 01:26:48,870 --> 01:26:50,540 So let's try this now. 1989 01:26:50,540 --> 01:26:53,260 13, 13, 50, done. 1990 01:26:53,260 --> 01:26:56,980 Now I just get one copy of the 13 because it's checking that for me. 1991 01:26:56,980 --> 01:26:58,750 Now, running time is still an issue. 1992 01:26:58,750 --> 01:27:01,420 Consider this, theoretically, you're still 1993 01:27:01,420 --> 01:27:04,510 wasting some time looking for a number because someone wrote 1994 01:27:04,510 --> 01:27:05,990 code that's probably linear search. 1995 01:27:05,990 --> 01:27:07,660 Maybe it's binary search if it's sorted. 1996 01:27:07,660 --> 01:27:08,830 But someone wrote that code. 1997 01:27:08,830 --> 01:27:10,996 But the point is, with these higher level languages, 1998 01:27:10,996 --> 01:27:14,937 these more modern languages like Python, that is not our problem, necessarily. 1999 01:27:14,937 --> 01:27:17,020 It only becomes our problem if the program is just 2000 01:27:17,020 --> 01:27:22,720 too slow for some reason and we really need to get into the weeds of why. 2001 01:27:22,720 --> 01:27:25,210 All right, let's look at a final feature syntactically 2002 01:27:25,210 --> 01:27:27,580 before we try this to a more generalized problem. 2003 01:27:27,580 --> 01:27:30,550 Let me go ahead and save a file called struct0.py, 2004 01:27:30,550 --> 01:27:33,780 which is reminiscent of struct0.c a few weeks back. 2005 01:27:33,780 --> 01:27:37,820 And let me go ahead and from the CS50 library import getString. 
2006 01:27:37,820 --> 01:27:41,770 Let me go ahead and give myself an array this time called students that's empty, 2007 01:27:41,770 --> 01:27:43,330 or a list called students. 2008 01:27:43,330 --> 01:27:46,340 And then let me just get three students for the sake of discussion. 2009 01:27:46,340 --> 01:27:50,170 So for i in range 3, that just iterates three times, 2010 01:27:50,170 --> 01:27:52,870 let me go ahead and ask the user for their name. 2011 01:27:52,870 --> 01:27:55,370 So getString, ask them for their name. 2012 01:27:55,370 --> 01:27:57,370 Then let me go ahead and ask them for their dorm 2013 01:27:57,370 --> 01:28:00,040 and go ahead and get string for dorm. 2014 01:28:00,040 --> 01:28:01,540 And then that's enough. 2015 01:28:01,540 --> 01:28:04,390 Let me now go ahead and append the student to my list. 2016 01:28:04,390 --> 01:28:07,330 So students dot append. 2017 01:28:07,330 --> 01:28:09,489 But I don't really have a student structure yet. 2018 01:28:09,489 --> 01:28:11,530 Now, there's many ways we can solve this, but let 2019 01:28:11,530 --> 01:28:13,330 me propose the simplest one. 2020 01:28:13,330 --> 01:28:19,120 It turns out in Python you can declare hash tables so wonderfully simply. 2021 01:28:19,120 --> 01:28:21,680 A hash table is just a collection of key value pairs. 2022 01:28:21,680 --> 01:28:25,600 And I would argue at this point in my example I have keys and values. 2023 01:28:25,600 --> 01:28:29,330 I have a name which is a key and the value, like David or whatever, 2024 01:28:29,330 --> 01:28:33,430 another key called dorm, and then a value which is like Matthews 2025 01:28:33,430 --> 01:28:34,210 or wherever. 2026 01:28:34,210 --> 01:28:35,500 And so keys and values. 2027 01:28:35,500 --> 01:28:38,710 So it would be kind of nice if I could create for myself a hash table-- 2028 01:28:38,710 --> 01:28:41,660 or even a trie, for that matter-- that allows me to store this data.
2029 01:28:41,660 --> 01:28:44,260 Well, it turns out in Python, I can do just that. 2030 01:28:44,260 --> 01:28:47,110 I can go ahead and create an object called student 2031 01:28:47,110 --> 01:28:49,600 using curly bracket notation. 2032 01:28:49,600 --> 01:28:51,370 And you can literally do this. 2033 01:28:51,370 --> 01:28:53,530 The name shall be one key. 2034 01:28:53,530 --> 01:28:55,600 And now it's going to take on that value. 2035 01:28:55,600 --> 01:28:59,530 Dorm shall be another key and it's going to take on that value. 2036 01:28:59,530 --> 01:29:02,500 So I could call this anything I want-- x and y 2037 01:29:02,500 --> 01:29:05,920 and have the values David and Matthews or whatever it is I'm going to type in. 2038 01:29:05,920 --> 01:29:09,310 But if you want a very generalized data structure 2039 01:29:09,310 --> 01:29:13,600 that isn't just a list of values from left to right, but has metadata-- 2040 01:29:13,600 --> 01:29:16,330 a key, or if you think of a spreadsheet, a column name 2041 01:29:16,330 --> 01:29:20,260 called name and a column name called dorm, each of which has values-- 2042 01:29:20,260 --> 01:29:21,520 you just use curly braces. 2043 01:29:21,520 --> 01:29:24,502 And you put the keys in quotes and then a colon. 2044 01:29:24,502 --> 01:29:26,960 And then if you've got multiple keys, you just put a comma. 2045 01:29:26,960 --> 01:29:31,060 So it's a little cryptic, but this is just like a container, a hash table, 2046 01:29:31,060 --> 01:29:33,590 that contains words and values. 2047 01:29:33,590 --> 01:29:36,640 Now, in p set 4, when you implemented speller, 2048 01:29:36,640 --> 01:29:40,060 you actually just said yes or no, is the word in the dictionary? 2049 01:29:40,060 --> 01:29:42,460 But you certainly could have stored more information 2050 01:29:42,460 --> 01:29:43,960 instead of just Boolean values. 2051 01:29:43,960 --> 01:29:47,150 You just tended to not need to do that. 
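The curly-brace notation described above can be sketched like this. The keys name and dorm and the values David and Matthews follow the lecture; the second record is made-up sample data, and everything is hard-coded here rather than typed in by a user.

```python
students = []

# Hard-coded sample data standing in for what a user would type.
# The second pair is invented purely for illustration.
for name, dorm in [("David", "Matthews"), ("Brian", "Pennypacker")]:
    # Keys go in quotes, followed by a colon and the value;
    # multiple key-value pairs are separated by commas.
    student = {"name": name, "dorm": dorm}
    students.append(student)

# Values are looked up by key instead of by numeric position.
print(students[0]["name"])
print(students[0]["dorm"])
```

Each student is a small key-value container, and the surrounding list keeps the students in the order they were added.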
2052 01:29:47,150 --> 01:29:48,610 So what does this mean for me? 2053 01:29:48,610 --> 01:29:50,920 At this point in the story, I have an object, 2054 01:29:50,920 --> 01:29:54,984 as it's called in Python, that stores these keys and these values. 2055 01:29:54,984 --> 01:29:57,400 So if later on I want to iterate over them, I can do this. 2056 01:29:57,400 --> 01:30:01,330 For student in-- oh, you have to append it-- 2057 01:30:01,330 --> 01:30:04,480 so students.append student. 2058 01:30:04,480 --> 01:30:06,320 Let's add the student to the list. 2059 01:30:06,320 --> 01:30:08,200 So for student in students, which is just how 2060 01:30:08,200 --> 01:30:10,449 you iterate over every one of the things in that list. 2061 01:30:10,449 --> 01:30:16,270 Let me just go ahead and say a sentence like, I want to say so and so 2062 01:30:16,270 --> 01:30:18,340 is in this dorm. 2063 01:30:18,340 --> 01:30:19,810 So how do I express that? 2064 01:30:19,810 --> 01:30:22,540 Well, so and so, I need to get access to the student's name. 2065 01:30:22,540 --> 01:30:25,390 And the way I can do this is as follows. 2066 01:30:25,390 --> 01:30:30,850 I could say, let's go ahead and say curly brace student bracket 2067 01:30:30,850 --> 01:30:33,760 name close bracket. 2068 01:30:33,760 --> 01:30:36,170 And then here, I can go ahead and say-- 2069 01:30:36,170 --> 01:30:38,710 oops, let me put quotes in here-- 2070 01:30:38,710 --> 01:30:42,520 and then here I can say student bracket quote unquote dorm. 2071 01:30:42,520 --> 01:30:45,400 So this is admittedly the most cryptic example we've done thus far. 2072 01:30:45,400 --> 01:30:47,870 But let's tease it apart as a format string. 2073 01:30:47,870 --> 01:30:50,260 So if I zoom in on this, what am I doing? 2074 01:30:50,260 --> 01:30:52,720 The curly braces and the f just means format this string. 2075 01:30:52,720 --> 01:30:56,220 So you can ignore the curly braces as part of our story from earlier.
2076 01:30:56,220 --> 01:30:58,390 Student is the name of the variable in the for loop. 2077 01:30:58,390 --> 01:30:59,890 So it's the current student. 2078 01:30:59,890 --> 01:31:01,820 The square brackets are new. 2079 01:31:01,820 --> 01:31:05,350 In C, the only time we used square brackets was in what context? 2080 01:31:05,350 --> 01:31:06,317 AUDIENCE: Arrays. 2081 01:31:06,317 --> 01:31:07,150 DAVID MALAN: Arrays. 2082 01:31:07,150 --> 01:31:10,720 And what did we always put in those square brackets? 2083 01:31:10,720 --> 01:31:11,610 A number. 2084 01:31:11,610 --> 01:31:12,990 Yeah, so 0, 1, 2. 2085 01:31:12,990 --> 01:31:14,780 You can index into an array. 2086 01:31:14,780 --> 01:31:16,990 What's cool about an object-- 2087 01:31:16,990 --> 01:31:20,160 or a hash table more generally, as we're now defining it-- 2088 01:31:20,160 --> 01:31:25,660 is you can index into the variable using not numbers, but words. 2089 01:31:25,660 --> 01:31:29,110 So you could think of student as being like a list or an array 2090 01:31:29,110 --> 01:31:30,987 with two values-- name and dorm. 2091 01:31:30,987 --> 01:31:33,570 But it's nice to be able to refer to those not as zero and one 2092 01:31:33,570 --> 01:31:36,720 or some stupid arbitrary number, but rather by keys-- 2093 01:31:36,720 --> 01:31:38,340 name and dorm. 2094 01:31:38,340 --> 01:31:41,370 So this syntax here, though cryptic, says go inside the student 2095 01:31:41,370 --> 01:31:44,970 object and get me the value of the key called name. 2096 01:31:44,970 --> 01:31:47,830 And this says the same thing about dorm. 2097 01:31:47,830 --> 01:31:50,550 So an object in Python-- 2098 01:31:50,550 --> 01:31:54,120 or more generally a hash table-- allows you to associate keys with values. 2099 01:31:54,120 --> 01:31:56,717 And this is quite simply the syntax you use for that. 2100 01:31:56,717 --> 01:31:58,050 So let me go ahead and run this. 2101 01:31:58,050 --> 01:32:01,410 Struct0.py, type in my name. 
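The list-of-students program described above might look roughly like this. It is a sketch, not the lecture's struct0.py: the hard-coded pairs stand in for the get_string prompts, and the helper name describe and the exact wording of the sentence are assumptions.

```python
def describe(students):
    """Return one sentence per student, built with a format string."""
    # student["name"] indexes into the dict with a word, not a number.
    return [f"{student['name']} is in {student['dorm']}." for student in students]


# Sample data standing in for what the user would type at the prompts.
sample_input = [("David", "Matthews"), ("Veronica", "Weld"), ("Brian", "Pennypacker")]

students = []
for name, dorm in sample_input:
    # Build one dict per student and append it to the list.
    students.append({"name": name, "dorm": dorm})

for sentence in describe(students):
    print(sentence)
```

Inside the f-string the keys use single quotes so that they do not clash with the double quotes around the whole string.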
2102 01:32:01,410 --> 01:32:03,660 Let's say Matthews. 2103 01:32:03,660 --> 01:32:07,080 Let's do, like, Veronica, Weld. 2104 01:32:07,080 --> 01:32:08,310 Let's do Brian. 2105 01:32:08,310 --> 01:32:09,424 Brian, where did you live? 2106 01:32:09,424 --> 01:32:10,445 AUDIENCE: Which year? 2107 01:32:10,445 --> 01:32:11,570 DAVID MALAN: Freshman year. 2108 01:32:11,570 --> 01:32:12,486 AUDIENCE: Pennypacker. 2109 01:32:12,486 --> 01:32:14,690 DAVID MALAN: Pennypacker, enter. 2110 01:32:14,690 --> 01:32:17,420 Not that these specifics really matter, but now we 2111 01:32:17,420 --> 01:32:19,050 have expressed all of these sentences. 2112 01:32:19,050 --> 01:32:21,950 So the short of it now is we didn't quite see this in C, 2113 01:32:21,950 --> 01:32:25,340 but we did see a hint of this when we implemented our own hash 2114 01:32:25,340 --> 01:32:30,590 table in C so that we can actually access keys and values arbitrarily. 2115 01:32:30,590 --> 01:32:35,080 So let's do a-- actually, let me pause here for any questions 2116 01:32:35,080 --> 01:32:39,140 before we bring back Mario. 2117 01:32:39,140 --> 01:32:39,780 All right. 2118 01:32:39,780 --> 01:32:43,230 So let's now not just do examples for the sake of demonstration, 2119 01:32:43,230 --> 01:32:47,092 but rewind to an old friend that we've seen a few times 2120 01:32:47,092 --> 01:32:48,800 and just look at a few different screens. 2121 01:32:48,800 --> 01:32:50,954 So in Super Mario Bros, running left to right 2122 01:32:50,954 --> 01:32:53,870 you might recall or have seen that there's stuff like this in the sky. 2123 01:32:53,870 --> 01:32:55,870 And Mario's supposed to run under it and jump up 2124 01:32:55,870 --> 01:32:59,310 and he gets coins or whatever by jumping up and hitting these question marks. 
2125 01:32:59,310 --> 01:33:01,550 So this is mostly a very contrived way of saying, 2126 01:33:01,550 --> 01:33:03,300 suppose we want to print out four question 2127 01:33:03,300 --> 01:33:06,291 marks on the screen just like Super Mario Bros, how could we do it? 2128 01:33:06,291 --> 01:33:08,790 It's going to be a little black and white, a little textual, 2129 01:33:08,790 --> 01:33:10,600 but how do I print out four question marks? 2130 01:33:10,600 --> 01:33:14,790 Well, let me go over here and let me create a file called, 2131 01:33:14,790 --> 01:33:17,370 let's say, Mario0.py. 2132 01:33:17,370 --> 01:33:18,330 And how do I do this? 2133 01:33:18,330 --> 01:33:22,180 What's the simplest way to do this, print four question marks? 2134 01:33:22,180 --> 01:33:24,500 OK, I heard print. 2135 01:33:24,500 --> 01:33:25,650 OK, four question marks. 2136 01:33:25,650 --> 01:33:26,290 Very good. 2137 01:33:26,290 --> 01:33:28,740 So let's go ahead and run Mario0. 2138 01:33:28,740 --> 01:33:29,850 Correct, that's right. 2139 01:33:29,850 --> 01:33:31,200 So this is not bad. 2140 01:33:31,200 --> 01:33:32,770 It's one string, not a huge deal. 2141 01:33:32,770 --> 01:33:35,487 Let's do it at least with a loop, as we've been often doing, 2142 01:33:35,487 --> 01:33:37,320 just to improve the design, even though this 2143 01:33:37,320 --> 01:33:39,450 is a very tiny, tiny, tiny example. 2144 01:33:39,450 --> 01:33:44,440 So Mario1.py, let's go ahead and print this out with a loop, for instance. 2145 01:33:44,440 --> 01:33:45,670 So how do I do this? 2146 01:33:45,670 --> 01:33:49,900 How do I print four question marks, but one at a time? 2147 01:33:49,900 --> 01:33:56,570 For i in range four, print, question mark. 2148 01:33:56,570 --> 01:33:57,470 Save, all right. 2149 01:33:57,470 --> 01:33:58,490 So Python, Mario. 2150 01:33:58,490 --> 01:34:01,616 Does anyone want to yell out, no, don't do that? 2151 01:34:01,616 --> 01:34:02,390 OK, thanks. 
2152 01:34:02,390 --> 01:34:02,960 That's great. 2153 01:34:02,960 --> 01:34:05,082 All right, so why did you not want me to do that? 2154 01:34:05,082 --> 01:34:06,290 Because they're all vertical. 2155 01:34:06,290 --> 01:34:08,210 So we did have a fix for this. How 2156 01:34:08,210 --> 01:34:12,310 do I tell print, don't end your lines with the default new line? 2157 01:34:12,310 --> 01:34:17,850 So end equals just quote unquote to override the default backslash n value. 2158 01:34:17,850 --> 01:34:18,949 So now I can rerun this. 2159 01:34:18,949 --> 01:34:20,240 All right, it's a little buggy. 2160 01:34:20,240 --> 01:34:24,790 So how can I fix this and only put a newline after the last one? 2161 01:34:24,790 --> 01:34:25,760 AUDIENCE: [INAUDIBLE] 2162 01:34:25,760 --> 01:34:27,910 DAVID MALAN: Yeah, honestly, just do print nothing. 2163 01:34:27,910 --> 01:34:30,740 And that will have the effect of printing a new line for free. 2164 01:34:30,740 --> 01:34:31,600 So let's do this. 2165 01:34:31,600 --> 01:34:32,140 OK. 2166 01:34:32,140 --> 01:34:33,990 Now we've got a good example there. 2167 01:34:33,990 --> 01:34:36,490 All right, so it turns out we actually printed along the way 2168 01:34:36,490 --> 01:34:40,840 a separate example, which looked like this, albeit with four blocks. 2169 01:34:40,840 --> 01:34:43,380 So we won't-- let's go ahead and do this now vertically, 2170 01:34:43,380 --> 01:34:45,734 not with question marks, but with hashes like bricks. 2171 01:34:45,734 --> 01:34:47,650 So if we want to print out those three hashes, 2172 01:34:47,650 --> 01:34:53,140 allow me to draw some inspiration from this and let's say in Mario2.py, 2173 01:34:53,140 --> 01:34:58,780 let me go ahead and just say for i in range of three, 2174 01:34:58,780 --> 01:35:01,000 go ahead and print out just one block.
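The fix for the row of question marks can be sketched as follows. Wrapping the loop in a function named print_row is an assumption made here for reuse; the lecture types the loop directly into the file.

```python
def print_row(n):
    """Print n question marks on one line, then a single newline."""
    for i in range(n):
        # end="" overrides print's default trailing "\n".
        print("?", end="")
    # Printing nothing still emits the newline that finishes the row.
    print()


print_row(4)
```

Without end="" each question mark would land on its own line, and without the final print() the shell prompt would end up on the same line as the output.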
2175 01:35:01,000 --> 01:35:03,580 And as you've been advising, just do this-- 2176 01:35:03,580 --> 01:35:06,220 or rather, no, let's use the default to print out 2177 01:35:06,220 --> 01:35:08,470 a vertical bar of three blocks. 2178 01:35:08,470 --> 01:35:10,240 So this is Mario2.py. 2179 01:35:10,240 --> 01:35:12,790 And now we've done something reminiscent of that. 2180 01:35:12,790 --> 01:35:16,090 But now things get a little interesting if we go underground. 2181 01:35:16,090 --> 01:35:17,800 And let's focus on this square. 2182 01:35:17,800 --> 01:35:20,560 So three by three, for instance, because we've not quite 2183 01:35:20,560 --> 01:35:22,160 seen something like this. 2184 01:35:22,160 --> 01:35:24,610 So in our last example here, let's see. 2185 01:35:24,610 --> 01:35:28,330 Could we get maybe a brave volunteer to come on up, tie some of these ideas 2186 01:35:28,330 --> 01:35:30,100 together? 2187 01:35:30,100 --> 01:35:31,350 Is that a hand back there? 2188 01:35:31,350 --> 01:35:33,510 Come on down. 2189 01:35:33,510 --> 01:35:39,390 So this will be Mario3.py, the goal of which is to print a brick, 2190 01:35:39,390 --> 01:35:40,740 a bigger brick-- 2191 01:35:40,740 --> 01:35:43,304 it's like 3 by 3-- hello again. 2192 01:35:43,304 --> 01:35:44,180 ANDREA: Hello. 2193 01:35:44,180 --> 01:35:45,120 DAVID MALAN: For the audience, what's your name? 2194 01:35:45,120 --> 01:35:46,050 ANDREA: Andrea. 2195 01:35:46,050 --> 01:35:47,040 DAVID MALAN: Andrea, nice to see you. 2196 01:35:47,040 --> 01:35:47,850 ANDREA: Nice to see you. 2197 01:35:47,850 --> 01:35:49,641 DAVID MALAN: All right, so the goal at hand 2198 01:35:49,641 --> 01:35:52,320 is to print a three by three grid of just 2199 01:35:52,320 --> 01:35:54,990 hashes reminiscent of those bricks. 2200 01:35:54,990 --> 01:35:56,214 All right, you're in charge. 2201 01:35:56,214 --> 01:35:57,636 ANDREA: All right. 2202 01:35:57,636 --> 01:35:59,762 Should I do, like, a loop or something? 
2203 01:35:59,762 --> 01:36:01,428 DAVID MALAN: Whatever gets the job done. 2204 01:36:01,428 --> 01:36:04,750 2205 01:36:04,750 --> 01:36:07,691 All right, for. 2206 01:36:07,691 --> 01:36:08,190 OK, good. 2207 01:36:08,190 --> 01:36:15,360 2208 01:36:15,360 --> 01:36:17,008 OK, interesting. 2209 01:36:17,008 --> 01:36:23,710 2210 01:36:23,710 --> 01:36:27,933 OK, print, quote unquote, print, yeah, OK. 2211 01:36:27,933 --> 01:36:28,432 ANDREA: OK. 2212 01:36:28,432 --> 01:36:29,060 Oh, right. 2213 01:36:29,060 --> 01:36:30,060 DAVID MALAN: Key detail. 2214 01:36:30,060 --> 01:36:31,227 ANDREA: What was it, a hash? 2215 01:36:31,227 --> 01:36:32,643 DAVID MALAN: A hash is fine, yeah. 2216 01:36:32,643 --> 01:36:33,510 ANDREA: OK. 2217 01:36:33,510 --> 01:36:34,600 DAVID MALAN: All right. 2218 01:36:34,600 --> 01:36:40,320 And before we do this, does everyone want her to run this program 2219 01:36:40,320 --> 01:36:42,064 and be correct? 2220 01:36:42,064 --> 01:36:42,980 AUDIENCE: Don't do it. 2221 01:36:42,980 --> 01:36:45,350 DAVID MALAN: No, why? 2222 01:36:45,350 --> 01:36:46,610 Someone who claims no, what? 2223 01:36:46,610 --> 01:36:47,710 What's your concern? 2224 01:36:47,710 --> 01:36:50,894 AUDIENCE: N equals-- it'll do it [INAUDIBLE] 2225 01:36:50,894 --> 01:36:51,810 DAVID MALAN: Good, OK. 2226 01:36:51,810 --> 01:36:52,590 So you fixed that. 2227 01:36:52,590 --> 01:36:53,160 Good. 2228 01:36:53,160 --> 01:36:55,759 Any other concerns? 2229 01:36:55,759 --> 01:36:57,002 Yeah? 2230 01:36:57,002 --> 01:37:01,704 AUDIENCE: [INAUDIBLE] 2231 01:37:01,704 --> 01:37:02,370 DAVID MALAN: OK. 2232 01:37:02,370 --> 01:37:03,510 Is it going to go up and down? 2233 01:37:03,510 --> 01:37:03,710 Well, let's see. 2234 01:37:03,710 --> 01:37:05,690 Can you walk us through verbally-- do we have-- 2235 01:37:05,690 --> 01:37:09,460 2236 01:37:09,460 --> 01:37:12,315 can you walk us through what the program does? 
2237 01:37:12,315 --> 01:37:15,430 [LAUGHTER] 2238 01:37:15,430 --> 01:37:19,255 ANDREA: For i in range 3, so this will happen three times, then j 2239 01:37:19,255 --> 01:37:22,050 in range three, the next thing will also happen three times. 2240 01:37:22,050 --> 01:37:23,400 So we print a hash. 2241 01:37:23,400 --> 01:37:25,370 And then we print another hash and another hash 2242 01:37:25,370 --> 01:37:28,930 because the end is the quotation marks. 2243 01:37:28,930 --> 01:37:29,740 DAVID MALAN: OK. 2244 01:37:29,740 --> 01:37:35,480 ANDREA: And then that happens and then we print a new line. 2245 01:37:35,480 --> 01:37:38,352 And then it should execute that three times. 2246 01:37:38,352 --> 01:37:39,310 DAVID MALAN: All right. 2247 01:37:39,310 --> 01:37:40,060 What do you think? 2248 01:37:40,060 --> 01:37:42,214 Do you-- the duck is convinced. 2249 01:37:42,214 --> 01:37:44,380 All right, why don't you go ahead and save the file. 2250 01:37:44,380 --> 01:37:45,160 Let's try. 2251 01:37:45,160 --> 01:37:48,130 No harm in trying, so right or wrong, let's see. 2252 01:37:48,130 --> 01:37:53,740 This is called Mario3.py, and I think we have round of applause if we could. 2253 01:37:53,740 --> 01:37:55,770 Very nicely done. 2254 01:37:55,770 --> 01:37:56,560 All right. 2255 01:37:56,560 --> 01:37:58,460 So let's-- and if you'd like one more. 2256 01:37:58,460 --> 01:38:00,670 So let's take a look at one final example, 2257 01:38:00,670 --> 01:38:02,890 coming full circle from where we began. 2258 01:38:02,890 --> 01:38:04,330 We of course looked at resize. 2259 01:38:04,330 --> 01:38:09,340 And let's open that up, just to see how I got away with writing so little code 2260 01:38:09,340 --> 01:38:11,350 and actually getting that job done. 2261 01:38:11,350 --> 01:38:14,110 So in resize.py, which is where we began, 2262 01:38:14,110 --> 01:38:17,200 notice that I had a few lines that hopefully look a little more familiar 2263 01:38:17,200 --> 01:38:17,700 now.
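For reference, the three-by-three grid Andrea walked through can be sketched as a pair of nested loops. The function name print_grid and its size parameter are assumptions added here; the walkthrough hard-codes 3 in both loops.

```python
def print_grid(size):
    """Print a size-by-size square of hashes."""
    for i in range(size):        # one pass per row
        for j in range(size):    # one hash per column
            print("#", end="")   # stay on the same line
        print()                  # end the row with a newline


print_grid(3)
```

The inner loop builds one row left to right, and the bare print() after it moves to the next row.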
2264 01:38:17,700 --> 01:38:21,520 But we didn't exactly introduce all of these features ourselves. 2265 01:38:21,520 --> 01:38:24,850 So it turns out in line one and line two we have 2266 01:38:24,850 --> 01:38:26,504 one unfamiliar and one familiar line. 2267 01:38:26,504 --> 01:38:29,170 Line two just gives us access to command line arguments, which 2268 01:38:29,170 --> 01:38:30,910 we needed for resizing the bitmap. 2269 01:38:30,910 --> 01:38:34,070 Line one is where a lot of the power is coming from. 2270 01:38:34,070 --> 01:38:36,720 It turns out there's a library in Python called Pillow 2271 01:38:36,720 --> 01:38:39,470 that you can install by typing a certain command at your terminal. 2272 01:38:39,470 --> 01:38:41,430 It doesn't necessarily come with your Mac or PC. 2273 01:38:41,430 --> 01:38:43,679 You have to download it and install it with a command. 2274 01:38:43,679 --> 01:38:45,640 And then if you read its documentation, it 2275 01:38:45,640 --> 01:38:48,640 will say, from PIL, for Pillow, import Image. 2276 01:38:48,640 --> 01:38:50,200 Now, that's not a specific image. 2277 01:38:50,200 --> 01:38:52,870 That's the name of a library called the image 2278 01:38:52,870 --> 01:38:56,740 library that comes with that software that someone freely made available. 2279 01:38:56,740 --> 01:39:00,420 So that's just saying, give me access to an image-related library. 2280 01:39:00,420 --> 01:39:03,970 And undoubtedly, there could exist similar things in C. But we of course 2281 01:39:03,970 --> 01:39:06,190 did things very hands-on low-level. 2282 01:39:06,190 --> 01:39:10,150 All right, if the length of argv is not 4, yell at the user with the usage. 2283 01:39:10,150 --> 01:39:13,480 And that's just if they don't cooperate by typing in as they should, this. 2284 01:39:13,480 --> 01:39:15,969 It's a little more verbose now because we have Python 2285 01:39:15,969 --> 01:39:17,260 and we have the file extension.
2286 01:39:17,260 --> 01:39:19,930 But we could technically clean that up if we really wanted. 2287 01:39:19,930 --> 01:39:23,290 Lines 7, 8, and 9, there's nothing really new there. 2288 01:39:23,290 --> 01:39:26,230 I'm just declaring three variables implicitly typed. 2289 01:39:26,230 --> 01:39:28,510 I don't have to bother saying int or string. 2290 01:39:28,510 --> 01:39:33,320 I'm accessing argv 1, 2, and 3, which is 1, 2, and 3. 2291 01:39:33,320 --> 01:39:35,920 And then I'm doing one thing line 7. 2292 01:39:35,920 --> 01:39:39,718 What is line 7 doing that's important? 2293 01:39:39,718 --> 01:39:41,145 AUDIENCE: [INAUDIBLE] 2294 01:39:41,145 --> 01:39:43,770 DAVID MALAN: I'm changing the argument from what is technically 2295 01:39:43,770 --> 01:39:46,520 a string by default-- because indeed, it came from the human hands 2296 01:39:46,520 --> 01:39:49,110 at a keyboard-- and converting it into a number. 2297 01:39:49,110 --> 01:39:53,250 Now, as an aside, if the user does not provide a number like 2 or 10, 2298 01:39:53,250 --> 01:39:54,300 this code could break. 2299 01:39:54,300 --> 01:39:56,466 To be fair, I should really have some error checking 2300 01:39:56,466 --> 01:40:00,630 to make sure if the user typed in hello and not 2 or 10, 2301 01:40:00,630 --> 01:40:01,780 I need to catch that error. 2302 01:40:01,780 --> 01:40:02,988 So I'm being a little sloppy. 2303 01:40:02,988 --> 01:40:06,130 But it was really meant to demonstrate succinct code. 2304 01:40:06,130 --> 01:40:09,730 So now we have infile and outfile defined exactly as before. 2305 01:40:09,730 --> 01:40:13,040 So we have just three lines left that actually implement most of the magic. 2306 01:40:13,040 --> 01:40:14,715 Yeah. 2307 01:40:14,715 --> 01:40:22,650 AUDIENCE: [INAUDIBLE] 2308 01:40:22,650 --> 01:40:24,441 DAVID MALAN: Wait, say the last part again. 2309 01:40:24,441 --> 01:40:26,600 AUDIENCE: [INAUDIBLE] 2310 01:40:26,600 --> 01:40:28,550 DAVID MALAN: Yes. 
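The error checking that the lecture admits to skipping could be sketched like this. The helper name parse_factor and the choice to return None on bad input are assumptions; resize.py as shown in the lecture calls int directly on the argument.

```python
def parse_factor(text):
    """Convert a command-line string to an int, or return None if it is not a number."""
    try:
        return int(text)
    except ValueError:
        # int raises ValueError for input such as "hello".
        return None


print(parse_factor("2"))      # 2
print(parse_factor("hello"))  # None
```

A caller could then print the usage message and exit whenever the result is None, rather than letting the program crash.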
2311 01:40:28,550 --> 01:40:33,430 AUDIENCE: There was almost [INAUDIBLE] 2312 01:40:33,430 --> 01:40:34,680 DAVID MALAN: Good observation. 2313 01:40:34,680 --> 01:40:38,700 So this is not just converting the user's input to the equivalent ASCII 2314 01:40:38,700 --> 01:40:40,800 value because that's not what we want. 2315 01:40:40,800 --> 01:40:44,400 This int used here is actually converting it 2316 01:40:44,400 --> 01:40:48,420 as via atoi, a function that you've probably used a couple of weeks ago, 2317 01:40:48,420 --> 01:40:50,460 it's just named a little more succinctly. 2318 01:40:50,460 --> 01:40:53,940 There is a function via which you could convert a character or a string 2319 01:40:53,940 --> 01:40:55,201 to its ASCII equivalent. 2320 01:40:55,201 --> 01:40:56,700 But that's not what's going on here. 2321 01:40:56,700 --> 01:40:59,190 It does the more intuitive turn this into an integer 2322 01:40:59,190 --> 01:41:01,811 without using a cryptically named function like atoi. 2323 01:41:01,811 --> 01:41:04,560 So let's scroll down just a little further to these last few lines 2324 01:41:04,560 --> 01:41:05,770 and see what's going on. 2325 01:41:05,770 --> 01:41:08,220 Some of them you would only know how to do from having 2326 01:41:08,220 --> 01:41:09,870 read the documentation just as we did. 2327 01:41:09,870 --> 01:41:11,761 This says give me a variable called in image. 2328 01:41:11,761 --> 01:41:13,010 Could have called it anything. 2329 01:41:13,010 --> 01:41:14,940 I'm just trying to be consistent with in file. 2330 01:41:14,940 --> 01:41:17,100 This says, use the image library. 2331 01:41:17,100 --> 01:41:19,140 Use its open function that comes with it. 2332 01:41:19,140 --> 01:41:21,780 So image is some kind of structure, inside of which 2333 01:41:21,780 --> 01:41:23,730 is some useful image-related functionality.
2334 01:41:23,730 --> 01:41:26,860 So call its open function on the name of the file, 2335 01:41:26,860 --> 01:41:29,250 then go ahead and extract its height and width. 2336 01:41:29,250 --> 01:41:32,100 So turns out this is another tuple, if you will. 2337 01:41:32,100 --> 01:41:35,110 Tuples, again, are like x comma y, latitude comma longitude. 2338 01:41:35,110 --> 01:41:37,560 You'd only know that it is a tuple from the documentation. 2339 01:41:37,560 --> 01:41:42,150 So when I say width comma height, this is taking what's technically a list 2340 01:41:42,150 --> 01:41:43,675 of size two-- or really, a tuple-- 2341 01:41:43,675 --> 01:41:46,050 and it's just extracting for me the width and the height. 2342 01:41:46,050 --> 01:41:48,362 But let me wave my hands at that particular syntax. 2343 01:41:48,362 --> 01:41:50,070 The rest of this just says the following. 2344 01:41:50,070 --> 01:41:52,450 Give me a new variable called out image. 2345 01:41:52,450 --> 01:41:56,250 Call the input image's resize function, another piece of functionality 2346 01:41:56,250 --> 01:41:59,610 built into it, just like open, and change it 2347 01:41:59,610 --> 01:42:03,750 by this width and this height-- the original width times n, 2348 01:42:03,750 --> 01:42:05,520 the original height times n. 2349 01:42:05,520 --> 01:42:08,850 No padding manipulation, that's all the responsibility of the library. 2350 01:42:08,850 --> 01:42:11,160 Some other human dealt with all of that for us. 2351 01:42:11,160 --> 01:42:13,290 And this last line, perhaps not surprisingly, 2352 01:42:13,290 --> 01:42:16,660 saves the output image to that file name. 2353 01:42:16,660 --> 01:42:18,862 So in just, what, 15 lines of code and fewer 2354 01:42:18,862 --> 01:42:20,820 if we get rid of some of the whitespace can you 2355 01:42:20,820 --> 01:42:22,410 implement the entirety of resize. 
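The width-comma-height unpacking and the scaling arithmetic can be tried without installing any image library. In this sketch the tuple (3, 3) is a stand-in for the size the image library would report, and the helper name scaled_size is an assumption.

```python
def scaled_size(size, n):
    """Unpack a (width, height) tuple and scale both dimensions by n."""
    width, height = size          # tuple unpacking: two names, two values
    return (width * n, height * n)


print(scaled_size((3, 3), 2))  # (6, 6)
```

In the lecture's program the resulting width and height are what get handed to the image's resize function; the library takes care of the pixels and padding.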
2356 01:42:22,410 --> 01:42:24,490 But really focusing on the logic of the problem, 2357 01:42:24,490 --> 01:42:26,130 I want to take an input from the user. 2358 01:42:26,130 --> 01:42:27,600 I want to scale it up by a factor of n. 2359 01:42:27,600 --> 01:42:28,740 And I want to save out the file. 2360 01:42:28,740 --> 01:42:30,040 That's what you care about. 2361 01:42:30,040 --> 01:42:33,270 You don't necessarily care about getting into the weeds of exactly what it 2362 01:42:33,270 --> 01:42:36,870 was you had to do when you did it in C. 2363 01:42:36,870 --> 01:42:39,300 So let's do one final example here. 2364 01:42:39,300 --> 01:42:42,780 You'll recall from problem set four you implemented your own spell checker. 2365 01:42:42,780 --> 01:42:45,790 And odds are you did a trie or a hash table or the like. 2366 01:42:45,790 --> 01:42:48,120 And it turns out that is non-trivial, certainly in C. 2367 01:42:48,120 --> 01:42:51,100 And it's non-trivial certainly for the first time in any language. 2368 01:42:51,100 --> 01:42:53,880 But let me take a stab at doing this now in Python. 2369 01:42:53,880 --> 01:42:57,330 Let me go into source 6 where I have a speller example. 2370 01:42:57,330 --> 01:43:00,840 And notice that in this folder today I've brought a few files with me. 2371 01:43:00,840 --> 01:43:03,030 So I've brought a copy of the dictionaries 2372 01:43:03,030 --> 01:43:06,750 from p set four, a copy of the text files, like La La Land and the like 2373 01:43:06,750 --> 01:43:07,690 in text. 2374 01:43:07,690 --> 01:43:11,280 And then I brought two files-- dictionary.py and speller.py-- 2375 01:43:11,280 --> 01:43:14,940 the latter of which is an implementation of speller.c in Python. 2376 01:43:14,940 --> 01:43:17,940 And I'm not going to pull that one up because we wrote that one entirely 2377 01:43:17,940 --> 01:43:18,600 for you. 2378 01:43:18,600 --> 01:43:22,770 But let me go ahead and write, for instance, just my own dictionary.
2379 01:43:22,770 --> 01:43:28,680 So dictionary.py is the analog of dictionary.c. 2380 01:43:28,680 --> 01:43:31,106 And let's go ahead and set this up. 2381 01:43:31,106 --> 01:43:33,480 Let me go ahead and create this file in a separate folder 2382 01:43:33,480 --> 01:43:36,150 for now, so dictionary.py. 2383 01:43:36,150 --> 01:43:38,184 And there's a few functions in dictionary.c 2384 01:43:38,184 --> 01:43:40,350 which we should probably get around to implementing. 2385 01:43:40,350 --> 01:43:41,772 What are those functions? 2386 01:43:41,772 --> 01:43:43,380 AUDIENCE: Load. 2387 01:43:43,380 --> 01:43:45,090 DAVID MALAN: Load was one, and load takes 2388 01:43:45,090 --> 01:43:46,990 the name of a file or a dictionary. 2389 01:43:46,990 --> 01:43:47,850 So let's do this. 2390 01:43:47,850 --> 01:43:48,930 And I'll just say to do. 2391 01:43:48,930 --> 01:43:49,680 Come back to that. 2392 01:43:49,680 --> 01:43:52,020 What other functions were in dictionary.c? 2393 01:43:52,020 --> 01:43:53,580 Check, so def check. 2394 01:43:53,580 --> 01:43:56,280 And what did check take as an input? 2395 01:43:56,280 --> 01:43:56,820 A word, yep. 2396 01:43:56,820 --> 01:43:59,320 So we'll come back to this and just come back to that to do. 2397 01:43:59,320 --> 01:44:00,523 What other functions? 2398 01:44:00,523 --> 01:44:01,370 AUDIENCE: Size. 2399 01:44:01,370 --> 01:44:04,201 DAVID MALAN: Size was one, so def size. 2400 01:44:04,201 --> 01:44:07,200 This did not take input, but it just returned the size of the structure. 2401 01:44:07,200 --> 01:44:07,950 So we'll come back to that. 2402 01:44:07,950 --> 01:44:08,712 And lastly? 2403 01:44:08,712 --> 01:44:09,420 AUDIENCE: Unload. 2404 01:44:09,420 --> 01:44:10,545 DAVID MALAN: OK, so unload. 2405 01:44:10,545 --> 01:44:13,310 All right, so this is the Python version of the distribution code 2406 01:44:13,310 --> 01:44:15,170 for speller for your dictionary file. 
2407 01:44:15,170 --> 01:44:17,670 So unload also didn't take an argument. 2408 01:44:17,670 --> 01:44:19,820 So that's something for us to do, too. 2409 01:44:19,820 --> 01:44:22,400 So what's the gist of making a spell checker? 2410 01:44:22,400 --> 01:44:25,730 You are loading words in your load function from a dictionary file. 2411 01:44:25,730 --> 01:44:27,980 And the goal is to load those somehow into memory. 2412 01:44:27,980 --> 01:44:30,950 You had a design decision for the p set in C, 2413 01:44:30,950 --> 01:44:32,840 where you could make a hash table or a trie 2414 01:44:32,840 --> 01:44:34,790 or even a linked list or even an array. 2415 01:44:34,790 --> 01:44:37,580 But odds are the first of those two were probably more efficient. 2416 01:44:37,580 --> 01:44:40,550 So it turns out that in Python, you have the ability 2417 01:44:40,550 --> 01:44:43,640 to store words pretty readily in any number of data structures. 2418 01:44:43,640 --> 01:44:46,640 You have not just ints and floats and strings, 2419 01:44:46,640 --> 01:44:49,250 but you clearly have lists, as we've seen. 2420 01:44:49,250 --> 01:44:52,511 We call them objects or hashes, hash tables. 2421 01:44:52,511 --> 01:44:54,260 And there's other things, too, even called 2422 01:44:54,260 --> 01:44:57,650 sets, where a set is kind of just a collection of words 2423 01:44:57,650 --> 01:45:00,350 which would be very nicely searchable. 2424 01:45:00,350 --> 01:45:01,247 And so you know what? 2425 01:45:01,247 --> 01:45:03,080 If I want to ultimately load some words, let 2426 01:45:03,080 --> 01:45:05,600 me give myself a global variable called words 2427 01:45:05,600 --> 01:45:08,250 and just initialize it to an empty set. 2428 01:45:08,250 --> 01:45:11,330 So I have a global variable called words and nothing is in it just yet. 2429 01:45:11,330 --> 01:45:13,160 But it's a set of words. 2430 01:45:13,160 --> 01:45:15,660 How do I go about loading words into that dictionary?
2431 01:45:15,660 --> 01:45:17,540 Well, let's go ahead and implement load here. 2432 01:45:17,540 --> 01:45:20,240 So let me go ahead and declare a variable called file and open 2433 01:45:20,240 --> 01:45:24,230 this dictionary in read mode, just as in C. 2434 01:45:24,230 --> 01:45:26,540 And then how do I iterate over the lines in a file? 2435 01:45:26,540 --> 01:45:27,510 We've not seen that. 2436 01:45:27,510 --> 01:45:30,530 But I do know how to iterate over the strings in an array 2437 01:45:30,530 --> 01:45:31,840 and the characters in a string. 2438 01:45:31,840 --> 01:45:35,720 So let me go with my instinct for line in file. 2439 01:45:35,720 --> 01:45:38,240 Indeed, this will do exactly what you want it to do. 2440 01:45:38,240 --> 01:45:44,750 Then let me go ahead and add to my words data structure the following line. 2441 01:45:44,750 --> 01:45:46,965 And then let me close the file. 2442 01:45:46,965 --> 01:45:49,340 And then let me just say return true because all is well. 2443 01:45:49,340 --> 01:45:50,780 Done. 2444 01:45:50,780 --> 01:45:53,690 All right, so I'm cutting a few corners, technically. 2445 01:45:53,690 --> 01:45:55,670 Let me use that function I alluded to earlier. 2446 01:45:55,670 --> 01:45:58,584 Let me go ahead and call rstrip and strip off 2447 01:45:58,584 --> 01:46:00,500 the new line because in the file, technically, 2448 01:46:00,500 --> 01:46:03,460 when you're reading in those words, every line ends with a backslash n. 2449 01:46:03,460 --> 01:46:04,620 That's now part of the word. 2450 01:46:04,620 --> 01:46:07,120 So a minor correction there that I'm stripping off the line. 2451 01:46:07,120 --> 01:46:08,210 But that's it for load. 2452 01:46:08,210 --> 01:46:13,040 How do I now check if a given word is in that set? 2453 01:46:13,040 --> 01:46:18,510 Well, I can just say, if word in words return true. 2454 01:46:18,510 --> 01:46:21,120 Else, return false. 2455 01:46:21,120 --> 01:46:22,950 Done with check.
2456 01:46:22,950 --> 01:46:26,590 How do I return the size of this data structure? 2457 01:46:26,590 --> 01:46:30,450 How about I just return the length of that structure, words, and then 2458 01:46:30,450 --> 01:46:30,990 unload-- 2459 01:46:30,990 --> 01:46:33,082 heck, Python's doing this all for me-- 2460 01:46:33,082 --> 01:46:35,818 done. 2461 01:46:35,818 --> 01:46:37,120 Let me shrink this. 2462 01:46:37,120 --> 01:46:37,870 And you know what? 2463 01:46:37,870 --> 01:46:39,010 This is a little verbose. 2464 01:46:39,010 --> 01:46:40,720 I don't actually need to do this if else. 2465 01:46:40,720 --> 01:46:44,691 I could just return word in words and that will return a Boolean for me. 2466 01:46:44,691 --> 01:46:46,940 And honestly, if I want to lower case it, that's easy. 2467 01:46:46,940 --> 01:46:48,648 I can just do this and take care of that. 2468 01:46:48,648 --> 01:46:49,984 Now it's even better. 2469 01:46:49,984 --> 01:46:51,415 That's p set 4. 2470 01:46:51,415 --> 01:46:54,760 2471 01:46:54,760 --> 01:46:56,440 Excited? 2472 01:46:56,440 --> 01:46:57,790 Wish we had done this in C? 2473 01:46:57,790 --> 01:46:59,830 So what is the whole point of all of this, 2474 01:46:59,830 --> 01:47:03,760 because the goal wasn't to create sort of great angst and wonder now. 2475 01:47:03,760 --> 01:47:07,210 But the whole point of having introduced C over these past few weeks is to, 2476 01:47:07,210 --> 01:47:09,339 one, none of this now do you take for granted. 2477 01:47:09,339 --> 01:47:12,130 I mean, you might be longing for having implemented this in Python. 2478 01:47:12,130 --> 01:47:13,840 And you might have had to read some documentation 2479 01:47:13,840 --> 01:47:15,310 and figure out the various syntax. 2480 01:47:15,310 --> 01:47:16,660 But my God. 2481 01:47:16,660 --> 01:47:20,082 We whittled down what probably took most of you hours into just seconds 2482 01:47:20,082 --> 01:47:22,040 once you're more comfortable with the language. 
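Pieced together from the narration, dictionary.py might look roughly like the sketch below. Two details are assumptions rather than anything stated in the lecture: unload returning True, and the small temporary word list at the bottom, which stands in for the real dictionary file so the sketch runs on its own.

```python
import os
import tempfile

# Global set that will hold every word in the dictionary.
words = set()


def load(dictionary):
    """Read one word per line from the file into the set."""
    file = open(dictionary, "r")
    for line in file:
        # Strip the trailing "\n" so it does not become part of the word.
        words.add(line.rstrip("\n"))
    file.close()
    return True


def check(word):
    """Report whether the lowercased word is in the dictionary."""
    return word.lower() in words


def size():
    """Return how many words were loaded."""
    return len(words)


def unload():
    """Nothing to free by hand; Python manages the memory (assumed return value)."""
    return True


# Demonstration with a tiny throwaway dictionary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("cat\ndog\n")
load(tmp.name)
os.remove(tmp.name)

print(size(), check("CAT"), check("bird"))  # 2 True False
```

Membership tests on a set (word in words) take the place of the hand-built hash table or trie from the C version.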
2483 01:47:22,040 --> 01:47:23,950 But also, to our very earliest point today, 2484 01:47:23,950 --> 01:47:27,350 once you have the right language and the right tool for the job. 2485 01:47:27,350 --> 01:47:30,200 Now, it's not to say that this is perfect, because in fact, 2486 01:47:30,200 --> 01:47:31,690 let's go ahead and do some tests. 2487 01:47:31,690 --> 01:47:34,360 Let me go into my terminal window here. 2488 01:47:34,360 --> 01:47:38,320 And I actually brought my own solution in my C folder here. 2489 01:47:38,320 --> 01:47:38,980 Let's see. 2490 01:47:38,980 --> 01:47:43,852 I have my own code to speller implemented in C here. 2491 01:47:43,852 --> 01:47:45,310 And let me go ahead and run a test. 2492 01:47:45,310 --> 01:47:49,120 Let me go ahead and run speller on, say, the text Shakespeare. 2493 01:47:49,120 --> 01:47:50,230 That's a pretty big input. 2494 01:47:50,230 --> 01:47:51,670 Let's go ahead and hit Enter. 2495 01:47:51,670 --> 01:47:53,230 And this is my spell checker running. 2496 01:47:53,230 --> 01:47:54,604 And all the words are outputting. 2497 01:47:54,604 --> 01:47:58,780 And the time total to run speller in C was, say, 0.9 seconds. 2498 01:47:58,780 --> 01:48:00,210 So that's actually pretty good. 2499 01:48:00,210 --> 01:48:03,910 In a second window, let me go up here in another terminal window. 2500 01:48:03,910 --> 01:48:08,890 And let me go into today's code and into the speller folder where I have 2501 01:48:08,890 --> 01:48:12,640 a Python version that I'm going to run as follows-- speller.py-- 2502 01:48:12,640 --> 01:48:15,250 let me go ahead and run it on Shakespeare. 2503 01:48:15,250 --> 01:48:16,970 So we've not looked at speller.py. 2504 01:48:16,970 --> 01:48:20,671 But it is essentially line for line a port, a translation, from C to Python. 2505 01:48:20,671 --> 01:48:22,420 But you're welcome to look at that online. 2506 01:48:22,420 --> 01:48:25,670 And it's using my dictionary.py file. 
2507 01:48:25,670 --> 01:48:27,040 Let me go ahead and run that. 2508 01:48:27,040 --> 01:48:28,780 It's running through all the words. 2509 01:48:28,780 --> 01:48:33,230 Top is Python, bottom is C. Here we go. 2510 01:48:33,230 --> 01:48:36,130 Here we go. 2511 01:48:36,130 --> 01:48:38,705 Here we go. 2512 01:48:38,705 --> 01:48:41,580 Now, this is a bit misleading because again, the internet is in the way. 2513 01:48:41,580 --> 01:48:46,020 We're using a web-based IDE, and so it's funny that that appears so many times. 2514 01:48:46,020 --> 01:48:48,727 And you'll see it's not 10, 20 seconds, however long that was. 2515 01:48:48,727 --> 01:48:50,310 That was just the internet being slow. 2516 01:48:50,310 --> 01:48:53,190 And all we're timing is your functions in both C and Python. 2517 01:48:53,190 --> 01:48:55,950 But what's the takeaway between Python and C? 2518 01:48:55,950 --> 01:48:58,880 2519 01:48:58,880 --> 01:49:01,824 Same inputs. 2520 01:49:01,824 --> 01:49:02,490 What do you see? 2521 01:49:02,490 --> 01:49:03,690 Yeah? 2522 01:49:03,690 --> 01:49:05,491 AUDIENCE: Be more concise [INAUDIBLE]. 2523 01:49:05,491 --> 01:49:07,240 DAVID MALAN: Yeah, I wouldn't say concise. 2524 01:49:07,240 --> 01:49:08,200 That's more aesthetic. 2525 01:49:08,200 --> 01:49:09,284 It's more-- 2526 01:49:09,284 --> 01:49:10,850 AUDIENCE: Specific [INAUDIBLE]. 2527 01:49:10,850 --> 01:49:12,600 DAVID MALAN: Well, not even that, I think. 2528 01:49:12,600 --> 01:49:13,479 These are correct. 2529 01:49:13,479 --> 01:49:14,520 Both of them are correct. 2530 01:49:14,520 --> 01:49:18,030 All the important numbers at the top are identical. 2531 01:49:18,030 --> 01:49:21,680 But what is clearly different, though? 2532 01:49:21,680 --> 01:49:22,760 It's slower. 2533 01:49:22,760 --> 01:49:24,710 So Python seems to be slower, right?
2534 01:49:24,710 --> 01:49:27,150 It takes in total-- if we just look at two numbers-- 2535 01:49:27,150 --> 01:49:30,230 1.55 seconds in Python, if you ignore the internet speed 2536 01:49:30,230 --> 01:49:32,820 and just look at the code performance, versus 0.9. 2537 01:49:32,820 --> 01:49:37,970 So it's almost twice as slow as C. So what's the takeaway there? 2538 01:49:37,970 --> 01:49:42,180 Well, yes, it took me, what, 10, 20, 30 seconds to write the code. 2539 01:49:42,180 --> 01:49:44,080 But it's taking me twice as long to run it. 2540 01:49:44,080 --> 01:49:45,320 Now, not a big deal, of course, when we're 2541 01:49:45,320 --> 01:49:46,904 talking a few seconds here and there. 2542 01:49:46,904 --> 01:49:49,820 But if this were a big data set that you're analyzing for some project 2543 01:49:49,820 --> 01:49:54,587 or for work or for any kind of analysis project and the data is much larger 2544 01:49:54,587 --> 01:49:57,170 than even this-- especially in the medical field or the like-- 2545 01:49:57,170 --> 01:49:58,628 maybe you don't want to use Python. 2546 01:49:58,628 --> 01:50:01,980 Sure, you can bang out the code in just a few minutes, maybe a few hours. 2547 01:50:01,980 --> 01:50:05,900 But once you run it, damn, it's slower than using something like C. 2548 01:50:05,900 --> 01:50:08,214 Whereas in C, might take you more time upfront. 2549 01:50:08,214 --> 01:50:10,130 And you might not even have the comfort with C 2550 01:50:10,130 --> 01:50:12,620 anymore so it's going to take even longer because you have to go relearn 2551 01:50:12,620 --> 01:50:13,400 the language. 2552 01:50:13,400 --> 01:50:16,250 But when you run it, wow, it runs twice as fast. 2553 01:50:16,250 --> 01:50:18,110 You therefore need less RAM, potentially, 2554 01:50:18,110 --> 01:50:21,530 less hardware or less expensive hardware because you can get away with more.
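To make "all we're timing is your functions" concrete, here is a minimal sketch of how a function call might be timed in Python. It is not the course's speller.py: the `timed` helper and the tiny `words` set are hypothetical names introduced only for illustration, and `time.perf_counter` is one reasonable clock choice among several.

```python
import time


def timed(function, *args):
    """Call function(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = function(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical stand-in for a loaded dictionary.
words = {"cat", "dog", "fish"}


def check(word):
    return word.lower() in words


found, seconds = timed(check, "Cat")
print(found, f"{seconds:.6f} seconds")
```

Because only the time spent inside the function call is measured, a slow terminal or network connection does not show up in the number, which is why the two totals quoted here can be compared fairly.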
2555 01:50:21,530 --> 01:50:24,560 So again, this theme we keep seeing in data structures and algorithms 2556 01:50:24,560 --> 01:50:25,340 is trade-offs. 2557 01:50:25,340 --> 01:50:28,760 Like, developer time is a resource and it is wonderful that I 2558 01:50:28,760 --> 01:50:31,520 and now you would be able to write code so much more quickly. 2559 01:50:31,520 --> 01:50:33,424 But you do have to pay a price somewhere. 2560 01:50:33,424 --> 01:50:35,090 And there's clearly a price with Python. 2561 01:50:35,090 --> 01:50:37,610 And it's not because Python is poorly implemented. 2562 01:50:37,610 --> 01:50:40,490 But what is the fundamental difference between the paradigm 2563 01:50:40,490 --> 01:50:44,862 of programming in C versus in Python as we've seen it today? 2564 01:50:44,862 --> 01:50:45,570 What's different? 2565 01:50:45,570 --> 01:50:46,329 Yeah? 2566 01:50:46,329 --> 01:50:52,816 AUDIENCE: [INAUDIBLE] line by line, whereas C, it essentially-- 2567 01:50:52,816 --> 01:50:57,037 [INAUDIBLE] optimize running it, it will run [INAUDIBLE]. 2568 01:50:57,037 --> 01:50:57,870 DAVID MALAN: Indeed. 2569 01:50:57,870 --> 01:50:58,550 And let me flip it around. 2570 01:50:58,550 --> 01:51:00,760 So with C, you're compiling down to zeros and ones. 2571 01:51:00,760 --> 01:51:02,270 And that compiler is super smart. 2572 01:51:02,270 --> 01:51:03,710 And it's going to move things around in memory. 2573 01:51:03,710 --> 01:51:06,543 It's going to talk the computer's native language of zeros and ones. 2574 01:51:06,543 --> 01:51:10,477 Python is, indeed, reading your code, by contrast, line by line, top to bottom, 2575 01:51:10,477 --> 01:51:11,060 left to right. 2576 01:51:11,060 --> 01:51:14,226 And even though technically underneath the hood there is a compilation step, 2577 01:51:14,226 --> 01:51:16,390 there is nonetheless some overhead involved.
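The compilation step "underneath the hood" can be observed with the standard library's `dis` module. The sketch below is illustrative only: the two-argument `check` is a hypothetical variant written so the example is self-contained, and the exact instructions printed differ between Python versions.

```python
import dis


def check(word, words):
    """Hypothetical stand-alone variant of check for illustration."""
    return word.lower() in words


# Print the bytecode that the interpreter compiled this function into.
dis.dis(check)

# The same instructions as objects, in case you want to count or inspect them.
instructions = list(dis.get_instructions(check))
print(len(instructions), "bytecode instructions")
```

What this shows is that the source is compiled to bytecode for Python's interpreter rather than to the machine's own zeros and ones. The interpreter still has to execute those instructions one at a time, which is one source of the overhead described here.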
2578 01:51:16,390 --> 01:51:18,860 The mere fact that we're no longer running clang and then 2579 01:51:18,860 --> 01:51:22,520 getting zeros and ones or running make and getting zeros and ones, that's great. 2580 01:51:22,520 --> 01:51:24,187 But we have to pay the price somewhere. 2581 01:51:24,187 --> 01:51:25,520 So this is going to be thematic. 2582 01:51:25,520 --> 01:51:28,520 Like, there is no holy grail among languages or tools or techniques. 2583 01:51:28,520 --> 01:51:31,340 There's going to be trade-offs among your comfort, your familiarity 2584 01:51:31,340 --> 01:51:33,990 or recollection of a language, how easy it is to use, 2585 01:51:33,990 --> 01:51:37,730 how succinctly you can type it, and then how efficiently you can actually 2586 01:51:37,730 --> 01:51:39,050 run it on the screen. 2587 01:51:39,050 --> 01:51:42,560 And with C, hopefully now-- we will not write any more C code-- 2588 01:51:42,560 --> 01:51:45,950 you have an appreciation in Python of when you create a hash-- 2589 01:51:45,950 --> 01:51:47,450 or a list, rather-- 2590 01:51:47,450 --> 01:51:50,840 or if you create a set or a hash table or the like, what you're really 2591 01:51:50,840 --> 01:51:53,650 getting access to is someone else's implementation of p 2592 01:51:53,650 --> 01:51:57,450 set four and p set three and p set two and p set one, in some form, 2593 01:51:57,450 --> 01:52:01,310 but now exposed to you in a more powerful and more modern language. 2594 01:52:01,310 --> 01:52:02,810 So let's end there officially today. 2595 01:52:02,810 --> 01:52:07,296 And next week, we'll do the same thing, but in the context of web programming. 2596 01:52:07,296 --> 01:52:07,795