[MUSIC PLAYING] DAVID MALAN: All right. This is CS50. Welcome back to all. And this is one of those rare days, where, in just a couple of hours, you'll be able to say that you've learned a new language. Or if you have a little bit of Python background already, you'll be able to say hopefully that you know it all the more, because even though we've spent the past several weeks focusing on C, one of the overarching goals of the class is not to teach you C-- and indeed, C is officially now behind us-- but really to teach you how to program. But realize, too, that even as we dive into a new language today, the goal is not to take a course on one language or another. Indeed, I, myself, back in the day took CS50 and just one other follow-on class, where I learned how to program. And every language since then have I pretty much taught myself, learned from others, learned by reading other code, and really bootstrapping myself from that. So after just this term, hopefully will you have the power to teach yourselves new languages. And today, we start that together. All right. So where do we begin? Back in week 0-- this is, recall, where we began, just making a little cat on the screen say "Hello world." And very quickly, things escalated a week later and started looking like this. Now, hopefully, over the past several weeks, you've begun to see through the syntax and see the underlying concepts and ideas that actually matter. But even so, there's a lot of cognitive overhead. There's a lot of syntactic overhead just to getting something simple done in this language called C. So starting today, we're going to introduce you to another programming language called Python that has been gaining steam in recent years and is wonderfully applicable, not only for the sort of command line programs that we've been writing in our terminal windows, but also in data science applications, analytics of large data sets, web programming, and the like. 
So this is the type of language that can actually solve many problems. And wonderfully, if we want to say "Hello, world" starting today in this new language, Python, all we need type is this-- typing the commands that you actually ultimately care about. So how do we get to that point ultimately? Well, recall that in C, we had this process of compiling our code and then running it, as with make or more specifically, as with clang, and then running it with the file ./hello, representing a file in your current working directory. Today, even that process gets a little easier in that it's no longer a two-step process to write and run code. It's now just one. But it's a little bit different from the past, whereas in the past, we've, indeed, compiled our code from source code into machine code and then done ./ in order to run it. Just as in a Mac or PC, you would double click an icon, Python is used a little differently. And other languages are used in the same way, too. You don't run the programs directly per se. You instead, literally, starting today, run a program that itself is called Python. And you pass as input to it the name of the file containing your source code. So Python itself is the program. It supports command line arguments. And one of those arguments can be the name of your very program, which means we don't have to very annoyingly keep compiling and recompiling our code every time we make a change. If you want to make a change to your code, all you need do is save your file and rerun this command. So let's put this into context. Let me go over to CS50 IDE, which for Python, you can continue using, as well. Let me go ahead and create a new file called, for instance, hello.py. So instead of hello.c, I'll use hello.py-- py being the convention for Python-based programs. And you know what? If I want to print "hello world," I'm just going to go ahead and say print("hello, world"). I'm going to go ahead and save my file. 
And then, in my terminal window, there's no need to compile. I can now run the program called Python, which is identically named to the language itself. And I'm going to go ahead and run the file called hello.py as input into that program. And voila, my very first program in Python. No curly braces, no int, no main, no void, no include-- you can just start to get real work done. But to get more interesting real work done, let's start to bootstrap things from where we left off when there are comparisons between Scratch and C, doing the same thing, again, this time between Scratch and C, but now Python, as well. So in the world of Scratch, if you wanted to say "hello, world," you would use this purple block, a function, as it was called at the time. And we translated that a few weeks back now to the corresponding C code-- printf("hello,world"). And there were a few nuances and things to trip over. It's printf. It's not print. You've got the backslash n and the semicolon. Today, in Python, if you want to achieve that same goal, as I just did in the IDE, you can simplify this to just that. So just to be super clear, what has changed from C to Python? What do you no longer need to worry about in Python-- some observations? Yeah. AUDIENCE: Semicolons. DAVID MALAN: No more semicolons-- those are officially gone. Other comments? AUDIENCE: No more new lines. DAVID MALAN: No more new lines-- print will actually give you one if you simply call print. Let me go over here. AUDIENCE: Print instead of printf. DAVID MALAN: And it's print instead of printf and-- this is going to end poorly today, because my arm will eventually fail. Are there any other differences that jump out? Maybe? AUDIENCE: No more standard I/O. DAVID MALAN: No more standard I/O-- so there's none of the overhead that we need. I'm not going to give you a stress ball, though, from that one just because it wasn't in the previous slide for C. 
But indeed, there's no overhead needed, the includes and so forth, just to get real work done. AUDIENCE: No backslash [INAUDIBLE]. DAVID MALAN: Oh, that was taken already. So I'm sorry. The stress ball's again given out. Yeah. AUDIENCE: No %s. DAVID MALAN: No %s, but not germane here, because I'm not yet plugging anything in. So, in fact, let me just move on, because I'm pretty sure there are no other differences or stress balls for this one. So let's take a look, though, at a variant of this, where we wanted to do something more interesting than just print statically-- that is, hardcoded-- the same thing again and again-- hello, world-- something like this. And now, I'll come back to you in just a moment. If you want to get users' input, in Scratch, we use this Ask block. That gave us access to a special return value or variable called answer. And then, we could use "join" and creatively use the Say block to concatenate, or join, those two values together. In C, this ended up being this, where you declare a variable on the left. You assign it the return value on the right, as with the first line there. And then, you go ahead and print out not just hello, but hello, %s, which then plugged in that value. In Python, you can achieve the same goal. But it's going to be a little simpler. We can now do it with just this. So what has disappeared clearly from the screen? What do we no longer need to worry about in Python? Yeah. AUDIENCE: Well, you could just do plus answer instead of, like, having to do it with a comma and the %s answer. DAVID MALAN: Exactly. So there's no %s. We're just using this plus operator, which is new in Python. This is actually now called the concatenation operator. And if you've studied Java or a few other languages, you know that this will join the string on the left with the string on the right. So we can sort of construct this phrase that we want. And because you called out the %s earlier-- AUDIENCE: Oh. DAVID MALAN: --let me be fair there.
Yeah. AUDIENCE: We didn't have to identify answer as a string. DAVID MALAN: Good. We don't have to identify answer, which is, indeed, our variable, as a string, because even though Python, like C, has data types-- and it does know what type of value you're storing-- you don't have to, pedantically as the programmer, tell the computer. The computer can figure it out from context. Any other distinctions? AUDIENCE: No semicolons. DAVID MALAN: No semicolons, as well, and I was hoping no one would raise their hands from farther away. But here we go. Oh. [LAUGHTER] OK. My bad. Good. Good. Good. OK. So there's a few differences, but the short of it is that it's, indeed, simpler this time. Indeed, I don't need the %-- the backslash n either, because I'm going to get that for free. So let's fly through a few other comparisons, as well, not just on the string here or here, but now using a different approach. It turns out that you can use print in a few different ways. You can, indeed, just concatenate one string with another by using that plus operator. Or if you read the documentation, it turns out that print takes multiple arguments. So the first one might be the first word you want to say. The second argument might be the second thing you want to say. And by default, what print will do, per its documentation, is automatically join, or concatenate, those two strings by adding a space. So it's not a typo that I removed the space after the comma. I'm going to get that for free, so to speak, because print is going to do that for me. Now, this one's about to be a little ugly. But it's an increasingly common approach in Python to do the same thing. And it's a little more reminiscent of C. But it turns out we'll see over time it's a little more powerful. You can also achieve the same result like this. All right. So it's a little weird looking. But once you start to recognize the pattern, it's pretty straightforward. So it's still the function print.
There's still a double quoted string, though it turns out you can use single quotes, as well, in Python. Answer is the variable we want to print. So what's new now is these curly braces, which say interpolate the value in between those curly braces-- that is, substitute it in just like %s works. But there's one more oddity, definitely worthy of a stress ball here, that's not a typo, but does distinguish this from C. Yeah. AUDIENCE: The f. DAVID MALAN: The f-- and this is one that-- here you go-- the weirdest features of-- oh, my bad. [LAUGHS] This is one of the weirdest things about recent versions of Python in recent years. This is what's called a format string, or f string. If you don't have this weird f in the beginning of the string immediately to the left of the double quotes, you will literally print on the screen H-E-L-L-O comma space curly brace ANSWER curly brace. And that's it. So f in front of this turns the string into an f string or format string, which tells Python, don't print this literally. Plug the value in that I've placed between the curly braces. So it's pretty powerful once you pick up the convention like that. All right. Let's look at a few other examples. This on the left was an example of what type of programming feature? What do we call this, the counter? Yeah. AUDIENCE: The variable. DAVID MALAN: So this is just a variable. So a variable here and let me not-- well, this is getting a little easier for the stress balls. This is a variable. And in C, it corresponded to a line like this. So in Python, this, too, gets a little simpler. Instead of saying int counter equals zero semicolon, now, you want a variable called counter? Just make it so. Use the equals sign as the assignment operator. Set it equal to some value on the right-hand side, but no semicolon anymore. This, on the left, for instance, was an example of Scratch updating the value of a variable by one, incrementing it, so to speak.
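The slides behind this part of the lecture are not in the transcript, so here is a minimal sketch of the three ways of printing a greeting described above. The value assigned to answer is a stand-in; in the lecture it comes from the user.

```python
# Stand-in for user input (assumption: the lecture obtains this from the user).
answer = "David"

# 1. Concatenate two strings with the + operator.
print("hello, " + answer)

# 2. Pass two arguments; print joins them with a single space by default,
#    so there is no space after the comma inside the first string.
print("hello,", answer)

# 3. Format string (f-string): the braces are replaced by the variable's value.
print(f"hello, {answer}")

# Without the leading f, the braces and the name are printed literally.
print("hello, {answer}")
```

The first three calls print the same line; the last one prints the braces and the word answer as-is, which is the mistake the f prefix avoids.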
In C, we achieve that same result by just saying counter equals counter plus 1 semicolon, assuming the variable already existed. We could also do this in another way. But in Python, we can do this like this. It's identical, but no semicolon. But in C, we could also do it like this-- counter plus equals 1 semicolon. That was just a little shorter than having to type the whole thing out. In Python, you can do the exact same thing. But it's going to look different how? AUDIENCE: No semicolon. DAVID MALAN: No semicolon for this one, as well. But there's one thing you cannot do, for better or for worse. In C, you have an even more succinct trick. What could you do in C to increment a variable? Yeah. AUDIENCE: Type in plus plus. DAVID MALAN: You could do the plus plus operator after the variable's name. That does not exist in Python. Here we go. That does not exist-- sorry. It exists in C. It's simply not in the Python language. So you have to start using this approach to be the most succinct. Well, what else do we have in Python? Here is, in Scratch, an example of a condition that only if x is less than y, does it say something on the screen like this. In C, a little ugly at first, but you've probably gotten used to this after multiple weeks of coding in C. Now, in Python, this is going to get simpler, too. The semicolon's definitely going away. The backslash n is definitely going away. Printf is about to become print, but also going away is most everything else. So there's no curly braces anymore. There is now a colon after the condition, or the Boolean expression there. There is necessary indentation. So those of you, who've been a little loose with style50 and favoring instead, just writing all of your code over on the left-hand side of the terminal, that has to stop now, even if style50 hasn't broken you of that habit already.
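As a small illustration of the variable, increment, and condition syntax just described, the following sketch may help. The values of x and y are made up for the example.

```python
# A variable needs no type and no semicolon.
counter = 0

# Incrementing works as in C, minus the semicolon.
counter = counter + 1

# The shorthand also exists; Python has no ++ operator.
counter += 1

# Hypothetical values for the comparison.
x = 1
y = 2

# A colon follows the Boolean expression, and the body is indented
# four spaces instead of being wrapped in curly braces.
if x < y:
    print("x is less than y")
```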
Python is sensitive to whitespace, which means that if you want to use a condition and execute code inside of that condition, it must be indented consistently, by convention, four spaces. And it should always be four spaces or four more spaces and so forth. The curly braces, though, are now gone. How about something like this? If we have an if else statement, just like we did in week 0, in week 1, we translated that to C as such, introducing if and else this time. That, too, gets simpler. Now, it can be distilled as this. The curly braces are gone. The backslash n's are gone. But we've, again, added some colons and some explicit indentation that now matters all the more. How about an if else if else-- so a three-way fork in the road, if you will? In C, you just continue that same logic, asking if else if else. Python's not only going to get more succinct. It's also going to get a little weird, but not a typo. What jumps out at you here with Python that seems a little misleading? Yeah. AUDIENCE: Else if becomes elif. DAVID MALAN: Yeah, so else if was apparently too laborious for humans to type. And so now, in Python, that's just elif-- E-L-I-F-- but it means exactly the same thing. All right. How about this? This is a loop in Scratch. It does something forever. This wasn't super straightforward to convert to C, because in C, you don't really have a forever block. But we did decide that you can use while and just say true, true being a Boolean value that evaluates always to true by definition. So this would print out hello world forever. In Python, it's almost the same. But in Python, it's going to look like this. So the curly braces are gone. The semicolon is gone. The hand is already up. What's different here? AUDIENCE: I have a question about if. DAVID MALAN: Sure. What's the question about if? AUDIENCE: We didn't use curly brackets to solve the if. So like, we just indent back to [INAUDIBLE]. DAVID MALAN: Correct.
But because we don't have curly braces, it's not necessarily obvious at first glance where the code you want to execute conditionally begins and ends, unless you rely on the indentation. So if you wanted to do something outside of the condition, you just un-indent and move on your way. So it's identical to how you should have been writing C code. There's no curly braces. But now, the indentation matters. So back to the loop here-- this will loop infinitely in C. In Python, I claim it looks like this. And the only new difference here that's worth noting is-- is what? AUDIENCE: True is capitalized. DAVID MALAN: True is capitalized. Why? Just because, but in Python, the two Boolean values, true and false, are, indeed, capitalized as here. All right. So let's finish out with a few more blocks. Recall that we implemented a coughing cat early on. And this is how you might do that three times specifically. In C, you can do this in a couple of ways. And the first way we proposed in week 1 was that you give yourself the counting variable like i, but you could call it anything. And then, you do something while i is greater than some target value, like 0. And then, you go ahead and cough again and again and again on each iteration decrementing-- that is, decreasing the value of i-- and then, keep checking that condition. So in Python, we can do pretty much the same thing. This converts pretty tightly to just this, which is pretty equivalent, except for the semicolons, the curly braces, and so forth, noting this time that we have the colon after the word while. But you can do this in another way. And indeed, we implemented it using a for loop, which is probably something you've gotten pretty familiar with and hopefully pretty comfortable with by now. These don't map directly to Python. You can do the same thing. But it's actually a little easier at least once you get used to it. So here, we had a variable called i initialized to 0.
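As an aside, the three-way condition and the counting while loop described above might look like the sketch below. The function name compare and its messages are illustrative choices, not taken from the lecture's slides.

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    if x < y:
        return "x is less than y"
    elif x > y:  # elif is Python's spelling of "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))

# An infinite loop would be written "while True:" with a capital T.
# Here the loop counts down instead, coughing three times.
i = 3
while i > 0:
    print("cough")
    i -= 1
```

Un-indenting is what ends a block, so the countdown loop stops at the first line that returns to the left margin.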
It kept getting incremented by 1 up to but not including the value 3. And on each iteration, we printed cough, thereby achieving three coughs on the screen. In Python, we can change this to the following. You still have the keyword for. But there's no parentheses. There are no semicolons. And you a little more casually say for i in the following list of values. So in Python, square brackets represent what we're going to start calling a list. It's pretty much the same thing as an array, but with many more features. You can grow and shrink these lists in Python. You could not do that in C. And so in this case, this is, on the first iteration, going to set i equal to 0. And it's going to cough. It's then going to automatically set i equal to 1 and then cough. It's then going to set i equal to 2 and then cough. And even though you're not doing anything with the value of i, because there are three values in this list-- 0, 1, 2-- it's going to cough three times. But there's a way to do this even more succinctly, because how would you implement this same idea if you wanted to cough 10 times or 50 times? I mean, that would get pretty atrocious if you just had to make a really big list with 0 through 49. You don't have to. There's a special function in Python called range that does that work for you. If you want to iterate three times, you literally say range open paren 3 close paren. And what that's going to do for your code is, essentially, hand you back three values from 0 to 1 to 2 automatically without you having to hard code them or write them explicitly. So now, if you want to cough 50 times, you just change the 3 to a 50. You don't have to, of course, declare everything with square brackets. So this is a very common paradigm then in Python for loops. Well, what about types? Even this world gets a little simpler. These were the data types we focused on in C. But a bunch of them now go away in Python. We still have bool, like the capital true and false.
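The two for loops described above can be sketched as follows. Both cough three times; the second scales to 50 by changing one number.

```python
# Iterate over an explicit list of three values.
for i in [0, 1, 2]:
    print("cough")

# Let range produce the values 0, 1, 2 instead of writing them out.
for i in range(3):
    print("cough")

# Converting a range to a list shows the values it hands back.
print(list(range(3)))
```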
We still have ints and floats, it turns out. But we also have strs, which is just a shorter version of the word string. And whereas in C, we definitely had the notion, the concept of strings, but we pretended that the word string existed, thanks to the CS50 library-- in Python, there actually is a data type called str-- you can just call it string-- that gives us even more functionality than the CS50 library did. So that was just a stepping stone to what exists here. And there's other data types in Python, too. In fact, a few of them are just here. And we'll play today with a few of these data types, because if you think about what we did the past two or three weeks introducing not only arrays, but then linked lists and hash tables and trees and tries and stacks and queues, this whole toolkit of data structures did we start talking about-- in Python, wonderfully, if you want a hash table, it comes with the language. If you want a linked list, it comes with the language-- no more pointers, no more creation of those low-level data structures yourself. You can just use them out of the box. So here's a list, then, to summarize some of the more powerful data types we get in Python that we did not have in C, unless we wrote them ourselves. You can have a range, like we just saw, which is just a sequence of numbers, like 0, 1, 2, or anything else. We can have a list, which is a sequence of mutable values, which is a fancy way of saying, they are values that can be changed. Mutable, like mutation, just means you can change those values. So you can add to, remove, and replace the values in the initial list. A list, then, in Python is like an array in C, but that can be automatically increased in size or decreased in size. So you don't have to do all of that malloc or realloc stuff anymore. A tuple is a sequence of immutable values, which is a fancy way of saying a sequence of values that once you put them there, you can't change them.
So this is sometimes useful for, like, coordinates, x comma y, for GPS coordinates or the like. But when you know you're not going to change the values, you can use a tuple instead. Dict, or dictionary, is a collection of key value pairs. And this is the abstract data type, to borrow a word from a couple weeks ago, that underneath the hood is implemented with the thing we called-- and you built for Pset5-- a hash table. So Python comes with hash tables. They're called dictionaries, abbreviated dict in the language. And this simply will allow you to-- if you want a hash table, just declare it, just like you would an int or a float. There's no more implementing that yourself. And then, lastly, at least among the ones we'll look at today, a set is a collection of unique values. You might recall this term from a math class. So this is just a collection of values. But even if you put multiple copies of the same value in there, it's going to throw the duplicates away for you, which is just sometimes convenient. And there's other data types, too. But that's more than enough to get us started today. Indeed, everything we're going to look at today ultimately is derivative of the documentation. And Python's documentation is very thorough. But I will disclaim it's not super user friendly. And so starting this week and beyond, in really any language, Google is going to be your friend. And sometimes Stack Overflow is going to be your friend. And your teaching fellows and course assistants will certainly be your friends, not in the sense that you should start googling, how to implement problem set 6, but rather, how do you iterate over values in Python? Or how do you convert a string to lower case? Those kinds of building blocks that, frankly, are not intellectually interesting to memorize from our class-- you can just grab them off the shelf or off Google when you need-- is exactly how folks like Brian and I and [INAUDIBLE] and Rodrigo program every day.
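To make the data types summarized above a little more concrete, here is a minimal sketch of a list, a tuple, a dict, and a set. All of the values (the numbers, the coordinates, the word counts) are invented for illustration.

```python
# list: a mutable sequence, so it can grow, shrink, and be edited in place.
numbers = [1, 2, 3]
numbers.append(4)   # add a value to the end
numbers.remove(1)   # remove the value 1
numbers[0] = 20     # replace the first value
print(numbers)

# tuple: an immutable sequence, handy for fixed pairs like coordinates.
point = (42.37, -71.11)
try:
    point[0] = 0.0  # not allowed: tuples cannot be changed
except TypeError:
    print("tuples are immutable")

# dict: key-value pairs, backed by a hash table.
counts = {"hello": 1}
counts["world"] = 2
print(counts["world"])

# set: a collection of unique values; duplicates are discarded.
words = {"cat", "dog", "cat"}
print(len(words))
```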
You don't necessarily memorize everything in the documentation. But you know how to find it. And indeed, among the goals for this class is to take off the last of those training wheels and actually have you teach yourself new things on your own, having done it with the support structure of the class itself. So with that said, let's go ahead and do a couple of demonstrations of just what we can do with this language and why it's not only so powerful, but also so popular right now. I'm going to go ahead, for instance, and open up a file called-- let's call it blur.py. And blur.py might be reminiscent of what we did a few weeks back in Pset4, where in C, you implemented a set of filters. And blurring an image was one of them. And let me go ahead and open up the image here, for instance. I have in the source 6 directory today a whole bunch of examples, such as-- the image I want is going to be in Filter. This was the one we looked at some weeks ago. So we had this nice picture of [INAUDIBLE] bridge down by the river. And it's super pristine, nice and clear, because it's a very high-quality photo. But let's try to blur this in, this time, using Python. So I'm going to go over to blur.py. And I'm going to go ahead and do the equivalent in Python of including some library or some header files. But you don't say include in Python. You, instead, say import. And I'm going to say from PIL, which is like the pillow library-- I'm going to go ahead and import something called an image and an image filter. I only know these exist by having read the documentation for them and knowing that I can include or import those special features. And let's go ahead and do this. I'm going to go ahead and open up the image as it stands now. And I'll call that before. So I'm going to go ahead and open an image called bridge.bmp. And then, I'm going to go ahead and after that, say, you know what? 
Go ahead and run the before image through a filter called ImageFilter, specifically ImageFilter.BLUR. And then, after that, I'm going to go ahead and say after.save("out.bmp"). And I'm going to save my file. So once this has been read here-- there we go-- once this has been saved here, now I'm going to go ahead and do the following. Let me go into my file directory here. Let me open my terminal window here. Let me go ahead and grab a copy of this from my src6 directory here, which is in my filter subdirectory today-- bridge.bmp. And let me go ahead now and run python blur.py. So I'm going to go ahead and hit Enter now. Notice that another file was just created in my directory here. Let's go ahead and look at the nice pretty bridge, which is where we started. Let me shrink my terminal window here. Let me open now out.bmp. And voila-- blurred-- before, after, before, after. But what's more important-- three lines of code-- so that's how you would implement the same thing as Pset4's blur feature in Python. But wait. There's more. What about Pset5? Pset5, recall, you implemented a hash table. And indeed, you decided how to implement the underlying linked list and the array and so forth. Well, you know what? Let me go ahead and create another file, this time, in Python-- wasn't allowed two weeks ago, but is allowed now. And I'm going to go ahead and implement this how? Well, I had a few different data structures to choose from in Python-- dict for dictionary and list and range and so forth and then also set. And I could use dict or dictionary. But I'm actually going to use set, because what really is a dictionary? It's a set of unique words. So I'm going to use something called sets. So I'm going to go ahead and give myself a variable called words. And I'm going to initialize it to an empty set, if you will, just a container that can grow to fit values. But just in case I screw up and put duplicates in there, that's OK. The set is going to get rid of them for me.
And then, recall for-- or sorry-- for this program, not speller.py, but rather dictionary.py to correspond with dictionary.c, we had a few functions. Now, in Python, the way you implement a function is not by saying int main void or something like that. You, instead, more simply say def for define and then the name of the function you want, like check, and then the inputs to that function, like word. And I'll come back to this. And I'm just going to say TODO for a moment, because I'm going to go ahead and predefine my other functions, like load, which took a dictionary file name as input. So I'm going to go ahead and come back and do that. I, then, had a size function-- took no inputs. I'm going to go ahead and do that. And then, down here, I had an unload function. So I'm going to go ahead and come back and do that. So how do I now implement each of these functions? Well, let's start with load. After all, if I'm handed the dictionary, first thing I wanted to do in Pset4-- or Pset5-- was load it into memory. Well, it turns out in Python, you can do something like this-- file = open(dictionary), which is so close to C. But it's open instead of fopen. And I'm going to open it in read mode. So so far, this actually looks quite like the C version. But now, if I want to iterate over every word in the file, it turns out I can use a for loop, because a for loop in Python is way more powerful than a for loop in C. I can literally say for line in file. And then, here, I can go ahead and add to my set of words, which is in this variable called words, literally using a function called add, that particular line-- that is, the word from the file. And then, you know, after that, file.close is how I'm going to close it. And then, all seems well. I'm going to go ahead and return True. Now, there's one bug here at the moment. Every line in the dictionary actually ended with what character technically, even though you don't see it, per se? AUDIENCE: A new line. DAVID MALAN: A new line, right?
Every word in the file ended with a backslash n, even though when you open the file, we humans don't see it. But it is there. So that's OK. If you want to go ahead and strip off the trailing new line, so to speak, at the end of every line, you can just go to the line of the current file-- say rstrip, where rstrip means right strip. So remove from the end of the string what character? Backslash n-- and that's going to now look at the line, chop off the backslash n, and pass as input to this add function the word from the dictionary. All right. What remains? Well, up here, how do I check the dictionary? Well, it turns out in Python, you can use conditions even more powerfully than in C. And if you want to know if a word is in a variable, like a word is in a set called words, we'll just ask the question, if word in words, you know what? Go ahead and return true. Else, go ahead and return false, although slight bug-- we also had to deal with capitalization in Pset5, right? The user's input from the file, the text, might be uppercase or lowercase. No big deal-- you want to lowercase a word? You don't have to do it character by character. Just call word, which is the word you're looking for, dot, which means go inside of it, just like a struct in C. And here, call a function that's built into that string called lower. All right. Well, I'm getting a little bored with implementing this. So let's finish this up. Let me go ahead. And how do I check how many words are in my dictionary? Well, just ask what the length is of that set. And how do you go about freeing all of the memory used by your program in Python? How do you go about undoing the effects? Well, you don't. It's done for you. So we'll just return true. So this, then, is-- I'm sad to say-- I mean, excited to say-- is the entirety of Pset5 implemented in Python. So why did we do what we did? Well, let's actually run an example here.
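Pieced together from the narration above, dictionary.py might look roughly like this. It is a reconstruction under assumptions rather than the file shown in lecture: the docstrings are added, and the exact layout of the original is not recoverable from the transcript.

```python
# Set of words loaded from the dictionary; a set discards duplicates.
words = set()


def check(word):
    """Return True if word is in the dictionary, ignoring case."""
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    """Load every line of the dictionary file into the set."""
    file = open(dictionary, "r")
    for line in file:
        # Each line ends with a newline; strip it from the right-hand end.
        words.add(line.rstrip("\n"))
    file.close()
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free: Python manages memory for us."""
    return True
```

Calling load with the path to a file of one word per line fills the set, after which check("CAT") and check("cat") give the same answer because of the call to lower.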
So I've got two windows open now-- two terminal windows-- on the left and on the right. On the left is my implementation of speller in C from a couple of weeks ago. Let me go ahead and run speller on one of the bigger files, like Shakespeare was one of the bigger files. So let's go ahead and see all of the misspelled words in Shakespeare, and using a hash table two weeks ago, looks like it took me 0.51 seconds to look for misspellings in Shakespeare.text. How about in Python? Well, over here, I have a copy of what we just wrote. This is also using a program called speller.py, which I didn't pull up, but I wrote in advance. And this is not the code that's timed. Only dictionary.c and dictionary.py are timed. So I'm going to go ahead and run my Python version of speller, which is going to use dictionary.py that I just wrote on Shakespeare.text-- same file, right-hand side. You'll see the same words quickly flying by on the screen, but you might notice something already. So there's always a tradeoff in computer science and certainly in programming. There's always a price paid. Wowed as you were by how fast this is, relatively speaking, and more compellingly how many seconds it took me to implement Pset5 in Python and presumably how many hours it took you to implement Pset5 in C, that, too, developer time is a resource, a human resource. But we are paying a price. And based on the output of C on the left and Python on the right, what apparently is at least one of the prices paid? AUDIENCE: It's slow. DAVID MALAN: Say it again. AUDIENCE: Slower. DAVID MALAN: It's slower, right? Whereas this took 0.51 seconds in C, the same problem solved in Python took 1.45 seconds. Now, frankly, thinking back two weeks and the many hours you probably spent on Pset5, who cares? Like, oh, my God. Sure. It's three times slower. But my God, the number of hours it took to implement that solution-- but it really depends on what your goals are, right?
If you're optimizing for spending as little time as possible on a P set, odds are you're going to want to go with Python. But if you're implementing a spell checker used every day by thousands or millions of people, for instance, on Google or Facebook or even in Google Docs and the like, you know what? You probably don't want to spend three times as many seconds or fractions of seconds just because it's easier to write it in Python, because that three times increase might cost your users more time. It might cost you three times as much hardware. It might cost you three times as much money to buy three times as many servers to do the exact same work. So again, this is going to be representative of the types of tradeoffs in programming, but my apologies for not mentioning this two weeks ago. All right. So let's now see if we can't tease apart some of the differences in this language by way of examples by walking through a number of the examples we've done in weeks past. And to make it easier to see before and after, let me go ahead and use this feature of the IDE-- turns out if you click this little white icon here, you can split your screen like this. So I'm going to adopt the habit for a little bit now of opening one file on the left in C and one file in the right in Python instead. So lets go into, for instance, this directory called One, which has all of my programs from week 1 written in C, as well as some new ones for today that we'll write mostly in real time. So here is a program in week 1 that simply did this. It gets the user's name. How do we go about implementing this in Python? Well, let me go ahead and create a file called string.py. And as before, I'm going to go ahead now and convert this from before to after. However, this get string function is, for the moment, something that we give you in CS50. There is a CS50 library for Python. But we're only going to use it for a week or two's time. And we'll take that training wheel off. 
To use it, you can either say quite simply import cs50, which is similar to include cs50.h. Or you can more explicitly say from cs50, import the actual function you want, like get_string. So I'm going to go ahead and do it the more explicit way for now so that I can then do s gets get string. What's your name question mark? And I will put a backslash n in here, because get_string is not print. It doesn't presumptuously give you a new line. And then, I'm going to go ahead and print out the user's name-- hello comma plus s. I'm going to save my file, go down to my terminal window, and run Python on string.py. I'm going to go ahead then and when prompted, type my name David. And hopefully, it's going to say hello comma David. Just to warm up here, too, we don't need to use the plus operator. I can, instead, change this to a second argument, getting rid of the space inside of hello and now rerun this program. And I'm hopefully going to see the exact same effect-- for instance, if Brian types his name, hello, Brian. And if I really want to get fancy, recall there's one other way I can do this. If I want to plug in the user's name here, as in Scratch, I can put what in between curly braces? AUDIENCE: S. DAVID MALAN: S, which is the name of the variable I've chosen, but notice this. If I get a little sloppy and I just use the curly braces and then I run Python of string.py, and type in, for instance, Emma's name-- that is not Emma's name. It's taking me literally. I have to turn it into an f string or format string, even though that syntax looks weird. Now, if I rerun it and type Emma, we'll hopefully be greeting, indeed, Emma-- so just some warm-ups to map one to the other. But let's see what else we can do here in Python. Well, recall in Python-- in C, we had this example, int.c. And this was a relatively simple example whose purpose in life was just to get an integer and then actually do some math by multiplying age by 365 to figure out roughly how many days old you are. 
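The three ways of greeting the user described above can be put side by side. In this sketch the name is passed in as an ordinary argument instead of coming from get_string, purely so the example runs without the CS50 library; the function names greet and greet_literal are made up for illustration.

```python
def greet(s):
    # The f prefix makes this a format string, so {s} is replaced by the value of s
    return f"hello, {s}"


def greet_literal(s):
    # Without the f prefix, the curly braces are taken literally
    return "hello, {s}"


# Concatenation with the plus operator
print("hello, " + "David")

# A second argument to print; print inserts the space between arguments itself
print("hello,", "David")

# Format string versus the sloppy version without the f
print(greet("Emma"))
print(greet_literal("Emma"))
```

In string.py itself, s would come from a call like get_string("What's your name?\n") and the result would be printed directly.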
Well, in Python, we can do this pretty similarly. Let me go ahead and open up a file that I will call int.py. And on the top of this file, I'm going to do from cs50 import get_int, because that's the function I want to use this time. I'm going to go ahead and get the user's age with get_int and say, what's your age backslash n. And then, I'm going to go ahead and print out-- not printf-- but print out the same thing as last time-- you are at least-- let me go ahead and make it this a little more room-- you are at least-- I'll come back to this-- something days period. So how do I now do this? Well, it turns out that you can plug in not just values, but expressions. I can actually say age times 365 inside the curly braces. So I don't need to, therefore, give myself another variable or use any commas. But of course, I'm missing one thing still. AUDIENCE: F. DAVID MALAN: The f to make this a format string, and you'll notice the IDE is smart. As soon as it notices, oh, that's a format string, it highlights in different colors the values that will be interpolated, the code inside your string that will be executed. So now, if I do Python of int.py and type in my age, for instance, 50, looks like I'm at least 18,000 days old, in this case. All right. So let's see what more we have in Python. Well, it turns out we had conditions in C. Let me go ahead and open up, for instance, conditions.c from last time. And we had this program here, where we prompted the user for a couple of integers, x and y. And then, we just compared the two and said x is less than y, or x is greater than y. Or x is equal to y. Well, this one I can type up pretty succinctly, too-- conditions.py-- let me go ahead and say from cs50 import get_int. Then, let me go ahead and get an int from the user. And I'm going to call it x. Let me go ahead and get another int from the user. And I'll call it-- oops-- get_int-- get_int. Let me go ahead and call that y. And then, let's just ask the question. 
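The point about plugging an expression, not just a variable, into a format string can be sketched as follows. The function name days_message is hypothetical, and the age is passed in as an argument here so the example does not depend on get_int.

```python
def days_message(age):
    # The expression age * 365 is evaluated inside the curly braces
    return f"You are at least {age * 365} days old."


print(days_message(50))
```

No extra variable and no comma-separated arguments are needed; the multiplication happens inside the string.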
If x is less than y-- oops-- [LAUGHS] --if x is less than y, go ahead and print x is less than y. Else if or-- AUDIENCE: [INAUDIBLE] DAVID MALAN: --elif-- slightly more succinct-- so you'll have to get used to it. x is greater than y. Let's go ahead and print out x is greater than y else-- I'm going to go ahead and say by deduction, that x must be equal to y. I'll save that file. I'll go ahead and run Python on conditions.py. I'll give myself two numbers just to do a quick cursory test. And indeed, x is less than y. And I trust if I keep running it, hopefully it should bear out that the rest of it is correct, as well. All right. So pretty one-to-one mapping here-- let's now start to do something that's a little more interesting. You might recall from week 1, we had this simple agreement program, where we prompted the user for a char. And then, we asked did the user type in y or-- Y or y or N or n. And we said agreed or not agreed, accordingly , just like a program that prompts you to agree to some terms and conditions, for instance. Well, let's go ahead and create another file over here called agree.py and do this in one or more ways. Let me go ahead and do from cs50 import get_char. This is subtle. But what is there not in Python recall? AUDIENCE: Chars. DAVID MALAN: Chars-- so what do you think the best approximation of a char is in a language that does not have chars, per se? AUDIENCE: A string. DAVID MALAN: A string-- and we'll just have to enforce on ourselves that the strings we're using are only going to be one character. So I'm going to go ahead and keep using get_string for this case. And I'm going to go ahead now and prompt the user for a string. And I'm going to ask them, do you agree question mark? And then, I'm going to ask the question if s equals equals Y-- that would be one possibility. I'm going to go ahead and say print("Agreed.") elif s equals equals N-- I'm going to go ahead and print("Not agreed.") just as in the C version. 
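As a sketch of the conditions.py logic described above, here are the three branches wrapped in a function. The wrapper and its name compare are assumptions made so the example can run without prompting for input; in the lecture version, x and y come from get_int and each branch calls print.

```python
def compare(x, y):
    # No parentheses around the condition, but the colon is required
    if x < y:
        return "x is less than y"
    elif x > y:  # elif, not else if
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(1, 1))
```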
So is this identical? Or what feature is missing still? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, the lower case, right? So obviously, the lower case-- so you might be inclined to do, well, or s equals equals y. But no, in Python, if you want to say something or something else, you can literally just say or now. And in C-- Python here, we can say or s equals equals n. We can do the same here. Now, if I go ahead and run Python on agree.py and I type something like Y-- I seem to have agreed. If I type something like y-- oops-- let's do this again. If I do it again and type y, it should work, as well. And then, just for good measure, let's say no with a N-- Not agreed. So I'm checking in a couple of ways. But there's other ways you can do this, right? We've seen a hint of other features here. This gets a little verbose. I could actually say something like this. If s is in the following list of possible values, I could ask the question like this instead, and I could do the same down here. If s is n-- if s in N and n, I could similarly now determine that the user has not agreed. But now, things get more powerful without getting super long and verbose. Suppose I wanted to support not just Y or y, but Yes or yes in uppercase and lowercase. Well, I could actually enumerate other possibilities, like this. But you know what? Design-wise, I bet I can do better than this. I bet I can shrink this. And heck, I can keep going-- nope. And nope. How could I improve the design of this, even if you've never seen Python before today? How could I avoid explicitly typing so many values, a few of them quite similar? Yeah. AUDIENCE: By using, like, something similar to tolower. DAVID MALAN: Yeah, something similar to tolower-- recall that in C, you were able to lower case individual characters. But just a few moments ago when we re-implemented speller for Pset5, we could lowercase a whole word. So you know what? I could just say if s.lower. 
This treats s as the string that it is. But just like in C, there are these things called structs, so are the data types in Python like strings also structures themselves. And inside of those structures are not only values, like the individual characters that compose them, but also built-in functions, otherwise known as methods. And so you can say s.lower and just lowercase the whole string automatically. So now, I can get rid of this. I can get rid of this, although can I? AUDIENCE: No. DAVID MALAN: No, I probably-- if I'm forcing everything to lowercase, I have to let things match up. So I'm going to go ahead and do the same thing down here-- s.lower. And I'm going to check, in this case, if it's equal to n or no like this. So now, if I go ahead and save that, rerun the program, and type in not just y, but maybe something like Yes, I'm agreed. And even if I do something weird like this-- Y, S, but e for whatever accidental reason, that, too, is tolerated, as well. So you can make your programs more user friendly in this way. All right. Before we forge ahead, any questions on what we've done thus far or syntax we've seen? Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Yes, can-- so to restate the question, can we alternatively still simply check if the first letter of the user's input is y? We absolutely could. And I think there's arguments for and against. You don't want to necessarily tolerate any word that starts with y or any word that starts with n. But let me come back to that in a little bit of time-- turns out in Python, there's a feature known as regular expressions, where you can actually define a pattern of characters that you're looking for. And I think that will let us solve that even more elegantly. So we'll come back to that before long. All right. Well, let's-- yeah, over in front. AUDIENCE: Is the difference between Python and C just C [INAUDIBLE] programming, or is there anything you can do in one language that you can't in the other? 
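The final version of the agreement check, lowercasing the input and testing membership in a short list, might be sketched like this. The function name agreement and the choice to return the message (or None for any other input) are assumptions for the sake of a runnable example; in agree.py the string comes from get_string and the branches call print.

```python
def agreement(s):
    # Lowercase the whole string once, then compare against lowercase options
    if s.lower() in ["y", "yes"]:
        return "Agreed."
    elif s.lower() in ["n", "no"]:
        return "Not agreed."
    # Anything else matches neither branch
    return None


print(agreement("Y"))
print(agreement("YeS"))
print(agreement("no"))
```

Because the input is forced to lowercase, the lists only need lowercase entries; an uppercase entry such as "Y" could never match.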
DAVID MALAN: Really good question-- is there anything you can do in Python that you can't do in C or vice versa? Short answer-- no. The languages we're looking at in this course can all effectively be used to solve the same problems. However, some languages are designed for or better suited for certain domains. Honestly, even the few examples we've done now were so much more pleasant to write in Python than they ever were in C, not to mention the filter example and the speller example and a bunch more that we're going to see before long. Similarly, with C, it would be a nightmare to implement a web-based application in C, because you have to implement so much of the plumbing, so to speak, the underlying code yourself. However, using something like Python or Ruby or PHP or Java these days gives you a lot more features out of the box. But you do pay a price. And that, in this case of C, for instance, is performance. You give up some bit of time. But you gain other features, as well. And the fact truly that Python does not have pointers is a feature not just because pointers were hard, but because it's so easy with pointers to make mistakes, as you probably experienced yourself. Segfaults are gone. And null pointers are gone, because the language protects you from yourself. And the reason why humans have dozens, hundreds of programming languages in the wild today is because a lot of people keep trying to improve upon languages from yesteryear. So we'll see other features distinguishing the two in a bit. All right. Let me go ahead and create another file called cough.py just to show how we can also bootstrap ourselves from something very simple and naive to a better designed version in Python. Recall from week 0, we wanted the cat to cough three times. And in week 1, we re-implemented that same idea with a little bit of copy/paste, but in a way that works. So notice this is a Python program. And it's going to cough three times. 
And I'm not going to keep running every program, because let me just stipulate that it will. But in this case here, even though I claim this is a program that will cough three times, let's be super clear. With this in all prior examples, what have I not put in the file, as well? Like, what is missing vis a vis C programs? AUDIENCE: [INAUDIBLE] DAVID MALAN: No what? AUDIENCE: Int main void. DAVID MALAN: There's no int main void. And there's no main whatsoever. So another feature of Python is that if you want to just write a program, you just start writing the program. You don't need a main function. Now, I'm going to walk that back a little bit, that claim, because there are some situations in which you do want a main function. But unlike in C, it's not necessary. Now, back in week 0 and 1, a bunch of people commented that surely, we can implement this better, not using three prints. But let's use a loop instead. So in Python, you could say for i in [0, 1, 2], go ahead and print out "cough," but of course, this is going to get annoying, because if you want to print four times or-- sorry-- four times or five times or six times or seven times zero index, you have to keep enumerating the stupid values. So that's why we use what function? AUDIENCE: Range. DAVID MALAN: Range-- so that is the same thing now that's going to print cough three times. But what if we wanted to now start to define our own coughing function, right? The goal of weeks 1 and 2 and onward was start to abstract away and build our own reusable puzzle pieces, albeit in a different language. How could I go about doing this in Python? Well, suppose that I want to do the following. For i in range 3, I want to just cough. And I want cough to be an abstraction, a custom function or a Scratch puzzle piece, that someone else or maybe I wrote that does this notion of coughing. Well, in Python, what's the keyword we can use to give ourselves a new function? AUDIENCE: Def. 
DAVID MALAN: Def for define-- and I can just say the name of the function is cough. And it takes no arguments. So unlike C, I don't specify a return type. And I don't specify the types of the inputs, but in this case, that's moot, because there are no inputs to cough. This function is super simple. It just wants to say print("cough"). And so here, I now have a function that's going to quite simply do this. And it's an abstraction in the sense that it can be all the way down here out of sight, out of mind. I don't care anymore how it's implemented. Maybe even a friend implemented it. And I've imported their code. But the problem arises now as follows. Let me go ahead and save this without all the whitespace. I seem to be practicing what I'm preaching-- no main function. Just start writing the code, but use def. But let me go ahead and run now Python of cough.py. I think-- yeah, I'm going to see the first of our errors. Python errors look a little different. You're going to see this word traceback a lot, which is like trace back in time of everything that just happened. But you do see some clues. Cough.py is the file. Line 2 is the problem. Name cough is not defined. But wait a minute. It is. Cough is defined literally with the word def right here on line 4. But there's a problem on line 2, which is here. So even if you've never programmed in Python before, what's the intuition for this bug? Why is this broken? Yeah. AUDIENCE: You didn't define your function before using it. DAVID MALAN: Yeah, I didn't define my function before using it, which was exactly a problem we ran into in C. Unfortunately, in Python, there's no notion of prototypes. So we have one or two solutions. I can just move the function up here. But there's arguments against this. Right now, as with main, in general, it's a little bit annoying to put, like, all of your functions on top, because then, the reader or you have to go fishing through bigger files if you've written more lines. 
Where is the main part of this program? So in general, it's better to put the main code up top and the helper code down below. So the way to solve this conventionally is actually going to be to define a main function. Technically, it doesn't have to be called main. It does not have a special significance like in C. But humans adopt this paradigm and just define themselves a function called main. And they put it up top by convention, too. But now, I've introduced a new problem. Python of cough.py enter doesn't do anything. Well, why is that? Python is going to take you literally. You've defined a function called main. You've defined a function called cough. What have I not apparently done explicitly? AUDIENCE: You haven't called main. DAVID MALAN: I haven't called main. Now, in C, you get this feature for free. If you write main, it will be called. Python-- those training wheels are off, too. You have to call main explicitly. So this looks a little stupid. But this is the solution conventionally to this problem, where you literally call main at the bottom of your file, but you define main at the top. And this ensures that by the time line 8 is read by the computer, by the Python program, the interpreter, it's going to realize, oh, that's OK. You've defined main earlier. I know now what it is. So now, if I run it again, I see cough, cough, cough. All right. Let's make one final tweak here now so that I can factor out my loop here and instead change my cough function just as we did in week 0 and 1 to cough some number of times. How do I define a Python function that takes an input? It's actually relatively straightforward. Recall that you don't have to specify types. But you do have to specify names. And what might be a good name for the input to cough for a number? n, right, barring something else-- you could call it anything you want. But n is kind of a go-to for an integer. So if you're going to cough n times, what do I want to do? 
For i in range of n, I can go ahead and cough n times. So this program is functionally the same. But now, notice my custom function, just like in week 0 and 1, is more powerful. It takes input and produces output. So now, I can abstract away the notion of coughing to just say cough 3. So again, same exact ideas as we encountered a while back, but now, we have the ability to do this now in Python. Any questions, then, on those examples thus far? This is too fast. By all means, push back. And ask now. Yeah. AUDIENCE: I [INAUDIBLE] for Python, and I remember it saying like, if [INAUDIBLE] cough times [INAUDIBLE]. DAVID MALAN: Yes, OK. Would you like your mind to really be blown here then? Yes, you can also in Python do this. If you want to cough three times, you can just multiply the string by three. So now-- and if you're impressed by this, now you're really geeks, but here we go-- [LAUGHTER] --cough, cough, cough-- in a good way. This is very Pythonic, right? So all right. So now, we can let you into the club. So there's this expression in the world of Python. And there's a lot of programming communities, where things are considered Pythonic if-- which means this is the way to do it. It's not the only way. And it's arguably not even the best way. But it's the way everyone does it, sort of in double quotes. People are very religious when it comes, though, to their languages. And so a Pythonic way of doing this-- and the reason why there's memes making fun of this is that this is the Pythonic way. Like, boom-- no loops whatsoever, just multiply the thing you want. Now, to be fair, it's a little buggy. Like, I actually have an extra new line. So I probably have to try a little harder to get that right. But yes, there are hidden tricks in Python, a few of which we'll encounter today that let you do very fancy one-liners to save time, too. AUDIENCE: Why in some scenarios you said that we don't need backslashes, but like, for this one, we do? 
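Putting the pieces of the cough example together, a version with main defined at the top, cough taking an input n, and main called explicitly at the bottom might look like the sketch below. The variable name repeated is made up to show the string-multiplication trick without printing it.

```python
def main():
    # cough is defined further down, which is fine: main is not
    # called until the last line of the file, after both defs have run
    cough(3)


def cough(n):
    # No parameter type and no return type are declared
    for i in range(n):
        print("cough")


# Multiplying a string repeats it; note the newline after the last copy too
repeated = "cough\n" * 3

# Python does not call main automatically, so call it explicitly
main()
```

Moving the call to main above the definition of cough's def would bring back the "name is not defined" error described earlier, because the call would run before cough exists.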
DAVID MALAN: Oh, really good question-- why do you sometimes not need backslash n, but sometimes you do? Print is going to give us a new line at the end of what it's printing. So let me go ahead now and rerun this without the explicit backslash n. You might be able to intuitively guess cough, cough, cough. You're not wrong, per se, but not what I intended. So that's why I need to put it back manually. AUDIENCE: OK. DAVID MALAN: Good question-- other questions on this here? All right. A few more examples from week 1 before we'll take things up to the more interesting problems from week 2 onward. Let me go ahead and split my screen once more. Let me go ahead and on the left, open up positive.c, which was a program recall that allowed us to define a function getting a positive integer. And we used a special-- a type of loop in week 1 when implementing this, that of a do while loop. Unfortunately, in Python, just as you don't have the plus plus operator, you also don't have a do while loop, which would seem problematic for very simple ideas like this, where you want the human to do something at least once and then maybe again and again and again. But that's OK, right? You have more than enough tools in the toolkit, both in C and Python, to do this without the more familiar, more comfortable structure. So let me write a program called positive.py. Let me go ahead and from CS50 import get_int. Let me go ahead and define a main function, just as I did before just so I can demonstrate how you can get a positive int from the user and then print it out-- so super simple example that's equivalent, for the moment, to what I'm doing over here back from week 1. So nothing on the left is new. It's all back from week 1, even if it's a bit far back now. Let me go ahead now and define also on the right-hand side def get_positive_int. It's not going to take any arguments. But I need to implement this notion of doing something while it's still true. 
And the most Pythonic or conventional way of doing this in Python is actually like this. Deliberately induce an infinite loop for yourself, because you can break out of it anytime you want. So this is a common Python paradigm. Go ahead, and at least once, get an int from the user asking them for positive integer. And then, after that, under what circumstances do I probably want to break out of this infinite loop if the goal is to get positive_int? What questions should I ask myself? Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, quite simply, if n is greater than 0-- no need for parentheses, but I do need the colon. I can, just as in C, use the break command, which breaks me out of the loop at which point now I can go ahead and return n. So it's different from what you see on the left. But it's logically the same. And honestly you could go back in week 1 and implement this logic in C, because we had while loops. We had the word true, albeit in lowercase. And we had all of this same code, too, even though we had curly braces and semicolons and a few other things. This, though, is the equivalent Python way of doing it here. But there is, it seems, a bug. Or rather, there is what you would think is a bug. This is OK, not a problem there. That'll go away eventually hopefully. Go. [LAUGHS] Pay no attention to that. The code is right, I believe. So there seems to be a bug. And this one is super subtle. But in weeks 1 through 5 when we were writing in C-- oh, see? It went away. Just ignore the problem sometimes. It will go away. [LAUGHTER] There is a seemingly subtle bug here. But it's not actually a bug in Python. But it would have been in C. What am I doing wrong, at least in C, even though I claim this is going to work? And if you compare left and right, it might become more obvious. What am I doing? Is that a-- yeah, in back. AUDIENCE: You're breaking before returning. DAVID MALAN: I'm breaking before returning. 
That's OK, because this break statement if n is greater than 0 is going to break me out of the indentation, out of the loop. So that's OK. But I think your concern is related if we can put our finger on it a little more precisely. Yeah. AUDIENCE: Like, you're not-- you're returning n, but n is [INAUDIBLE]. DAVID MALAN: Yes, so this is maybe the second part of your claim. The n is being returned on line 12. And I claim this is actually fine. But n was declared albeit implicitly-- that is, without any data type in Python-- on line 9. If we had done that in C over here, it would not have worked, because recall in C, there's this notion of scope, where when you define a variable, it only exists inside of the curly braces that encapsulate it. Now, Python doesn't have curly braces. But there's still indentation, which implies the same. But in Python, your variables, even if they're declared under, under, under, under conditions or variables-- or loops, they will be accessible to you outside of those conditions and loops. So it's a nice feature. And it allows me, then, to run this program, Python of positive.py. Let me go ahead and provide-- oops-- hmm, turns out there is a bug. Yeah. AUDIENCE: [INAUDIBLE] main. DAVID MALAN: Yeah, so I have to call main at the bottom even though that looks a little silly. But now, let me go ahead and run the program now. Oh, now, it's prompting me for a positive integer. Let's not cooperate-- negative 1, 0, 1. Now, it, in fact, works. So again, sometimes you might have to think a little harder when it comes to implementing something in Python as opposed to C. But indeed, it is very much possible. Yeah. AUDIENCE: Are variables identically accessible across functions? DAVID MALAN: Good question-- are variables accessible across functions? No, they will be isolated to the function, but not to the indentation level in which they were defined. Well, let's go back for just a moment to a place we saw some weeks ago, which was this here. 
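The while True pattern and the scoping point can be sketched together. One deliberate departure from positive.py: so that this example runs without a keyboard, get_positive_int takes a hypothetical read_int argument, a function that returns the next integer, where the lecture version simply calls get_int("Positive integer: ") inside the loop. The canned answers mirror the negative 1, 0, 1 typed in the demo.

```python
def main():
    # Canned answers standing in for what a user would type
    answers = iter([-1, 0, 1])
    i = get_positive_int(lambda: next(answers))
    print(i)


def get_positive_int(read_int):
    # Python has no do-while loop, so loop forever and break when satisfied
    while True:
        n = read_int()
        if n > 0:
            break
    # n was first assigned inside the loop, yet it is still in scope here
    return n


main()
```

The variable n stays local to get_positive_int; main cannot see it, which is why the value has to be returned.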
You'll recall that in Mario, we did a few examples early on, where we wanted to replicate the idea, printing out, like, four question marks in a row here. And we wanted to print out something like three squares in a column. And then, we also had this two-dimensional structure printing bricks. Let's see how we can implement those same ideas now using Python a bit more simply than before. So let me go ahead here. And I'll create a program called mario.py In which to whip these up, as well. So Mario.py-- the first goal is to do something like this. So I want to go ahead and print out four question marks in the sky or just in simple ASCII terms, just four question marks on the screen. So I can obviously just do 1, 2, 3, 4. But this is not particularly well designed. I can make it a little more reusable, a little more dynamic by saying for i in range (4). And then, I can go ahead and print out, for instance, a single question mark instead. But something's going to backfire now. If I run this, what am I going to see that I don't want to see? Yeah. AUDIENCE: It will be a question mark [INAUDIBLE]. DAVID MALAN: Exactly. It's going to be question marks in a vertical row. Why? Well, finally, we were so happy to get rid of the backslash n's. Now, it's come back to bite us, because sometimes you don't want the backslash n's. So here's where Python's functions are parameterizable in a little different way from C. Most every function we've seen in C might have taken zero or more arguments inside the parentheses, and you just separate them with commas. Python's a little fancier in that it has what are called named arguments, where you don't just specify comma something, comma, something, comma, something. You can, instead, specify the name of an argument or a parameter, an equals sign, and then its value. So you would only know this from Python's documentation. 
But it turns out that the print function takes an argument called end-- E-N-D-- whose value can equal whatever you want it to. By default, it literally equals backslash n. It sort of happens automatically, but you can override this. You can actually, say you know what? I don't want anything at the end of each thing I'm printing. So let me just do quote unquote. Let me rerun mario.py now. And now, I almost have what I want. But it's a little sloppy. I still want to move the cursor to the end. But that's OK. I can just print nothing, because I'm going to get a new line for free at the bottom of the program. So now is how I can implement this same idea. But you can put anything here. It might be a little weird. But I could put commas in between. And then, I could rerun mario.py and now get question mark comma question mark comma question mark comma, because I'm printing a comma after each one. But for our purposes, it suffices just to override that, in this case. Well, how can I go about doing this a little fancier? Well, you proposed-- or the meme you saw proposed that we can instead do this instead. We can just print, for instance, print question mark times 4. Now, we can rerun the program now. And voila-- even more Pythonic-- not necessarily as obvious or reusable, but certainly more succinct. Let's do one more this time for-- how about this? Recall that we wanted to print a column of three bricks. So how might we do this? Well, let me go ahead and do it the simplistic way. For i in range of 3, let me go ahead and print out a brick like that. Let me run the program now, mario.py. And voila, that one's pretty easy. But I can actually do this a little more cleverly if I do do this-- print one of these-- backslash n times 3. But let's fix that bug that came up earlier, as well. That's almost right. But I claim that this was a little messy. So what is the solution for fixing this bug, where I'm just being a little nit picky? 
I don't want this extra blank line at the end, which I'm getting for free from print itself. The blank lines-- the new lines in the middle are coming from the quoted string here. What's the fix to get rid of that extra new line at the very end? Yeah. AUDIENCE: You could change end to nothing. DAVID MALAN: Yeah, just say end equals quote unquote. So the syntax is starting to get a little funky, right? Like, it's a little harder to parse visually. But this is, indeed, just the paradigm we've seen before. Here is one argument on the left. Here is another argument in the right. The only thing that's different in Python is that now, some arguments can have explicit names that you only know from the documentation. So now, if I rerun this after saving, now, I've got the effect that I actually want. Well, let's do one more with Mario here, this time to do something a little two dimensional and print out a brick that's like a 3 by 3 brick of hashes instead. Well, let's go back to my code here. And let me go ahead and do a first example in Python of a nested loop. So let me go ahead and do for i in range of 3. That gives me my rows. And then, I can just do for j in range 3 also. And then, in here, I can go ahead and print out just a hash mark. But I don't want to print out new lines every time. Otherwise, it's going to be a super tall column of hashes. But after I print a row, I do want to print a blank line. So I think this suffices. I'm going a little quickly here. But again, this-- the logic is from week 1. The syntax is now from week 6. Let me run this again-- mario.py. Nope. I screwed up. What did I do wrong? I didn't actually override what I intended. What's-- yeah, over there on the left. AUDIENCE: You included the backslash n. DAVID MALAN: Yeah, and the whole point of using the end parameter was to override it. So let me change it to that, and let's see what happens now. Voila. Now I've implemented that same idea. Whoo, I think Rice Krispie Treats await us in the lobby. 
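The Mario variations above can be collected in one sketch. The function names (row, row_short, column, grid) are made up for illustration; mario.py in lecture is written as top-level code, one variation at a time.

```python
def row(n):
    # Override end so print does not add a newline after each question mark
    for i in range(n):
        print("?", end="")
    # A bare print() then moves the cursor to the next line
    print()


def row_short(n):
    # The more succinct version: multiply the string instead of looping
    print("?" * n)


def column(n):
    # Each copy carries its own newline, so suppress the extra one from print
    print("#\n" * n, end="")


def grid(n):
    # Nested loop: the outer loop handles rows, the inner loop handles columns
    for i in range(n):
        for j in range(n):
            print("#", end="")
        print()


row(4)
row_short(4)
column(3)
grid(3)
```

The named argument end defaults to a newline, which is why only the calls that pass end="" stay on the same line.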
We'll see you in five minutes. All right. We are back. And let's now look back at where we started this conversation of comparing C against Python. And recall that one of the earliest examples we did today involved strings and using the CS50 library. But the CS50 library-- we're going to very quickly take away, indeed, just after a few problems that you implement in problem set 6. But we'll see now just how easily that can be done. It turns out in Python, you don't need to use get_string or the CS50 library itself, because there actually exists a function quite simply called input. And indeed, I can get rid of get_string, replace it with this function called input, and actually store the return value in s. And for the most part, that will behave identically to get_string. If I go ahead and run Python on string.py, I can go ahead and type my name in. And it still works as expected. But I need to be mindful now that input, by definition, in Python's documentation, always returns a string, which means that if I'm going to get rid of get_int and maybe get_float, another function you might want to use for problem set 6, and use input instead, it's no longer sufficient to just call input and store the answer in a variable called age. Why? Even though I've not specified the type of age on line 1, what apparently will its type be as I've just defined? AUDIENCE: It's going to be a string. DAVID MALAN: It's going to be a string. Input, by definition in Python, returns a string. So if you want to convert it to an integer, you need to know how. And the simplest way to do it is quite simply to convert it with a function called int. So this is actually very similar to casting in C. But it's a little backwards. In C, you would say parentheses int close parentheses. In Python, you say int open paren, whatever it is you want to convert, and then close parentheses. You call it as an actual function. But this is going to be a little fragile. 
It turns out that if you just blindly pass the user's input to this int function, if it doesn't look like an int, bad things are going to happen. You're going to see some kind of trace back or error message on the screen. That's why, for this first week, we used the CS50 library and get_int and get_string and get_float just because it's a little harder using the library to accidentally mistreat input. But you don't need to use this. And you needn't-- you won't use it after just a week or so more time. All right. A few other examples, and we'll build ultimately to some of the more powerful examples we can do even after just two hours of Python programming. Let me go ahead and open up, first of all, overflow.c, which you might recall from a few weeks back was a problem, because as soon as I kept doubling and doubling and doubling an integer in C and printing it out, what eventually happened? AUDIENCE: [INAUDIBLE] DAVID MALAN: Slight spoiler in the file name. AUDIENCE: It overflowed. DAVID MALAN: It overflowed, right? And it rolled around, so to speak, to 0, because all of the bits eventually rolled-- you carried too many ones. And voila, you were left with all zeros. Python is actually kind of cool. Let me go ahead and open up a file here called overflow.py and implement this same idea this time in Python. Let me go ahead and save this as overflow.py, which now might actually be a bit of a misnomer. I'm going to go ahead and do this. i equals 1 initially. While True, do the following forever. Go ahead and print out i. And then, you know what? Let me go ahead and sleep for one second and then, go ahead and multiply i times 2, which I can also more succinctly write as i star equals 2-- so almost identical to C, except no semicolon here. However, sleep you don't just get automatically. It turns out sleep is in a library called time. So I'm going to have to import sleep, so to speak, by using this one-liner up top. Let me go ahead and run this as Python of overflow.py. 
Let me go ahead and increase the size of this window here and run this. OK. I'm a little impatient. That seems a little slow. In Python, you can actually sleep for fractions of seconds. So let me do this faster. AUDIENCE: [INAUDIBLE] DAVID MALAN: OK. Now, I'm not counting. But I'm pretty sure that's more than 4 billion, which you'll recall was the upper bound the last time around. And in fact, even though the internet is a little slow here-- so that's why it's not churning it out at a super fast rate-- these are really big numbers. And amazingly in Python, indeed, it's great for data science and analytics and such. Ints have no upper bounds. You cannot overflow an int. It will just grow and grow and grow until, frankly, it takes over your computer. But there is no fixed limit, as there was in C, which is wonderful. Downside, though: Python floats are still imprecise. There are libraries, though-- code that other people have written-- to mitigate that problem in Python, as well. All right. Let's move now to where we left off in week 2, where we started introducing arrays that we're now going to start calling lists. Let me go ahead and split my window again. Let me go ahead and open from week 2 an example like scores2.c, which looked a little something like this. So it's been a while. But we did see this example a while back, which just initializes an array with three values-- 72, 73, 33-- and then computes the average using a bit of arithmetic down below. So a while back, but all it did was quite simply that. Let me go ahead and create a file called scores.py on the right-hand side now in Python. And let me go ahead and just give myself an array now called a list. And it's a list in the sense, like a linked list, that it can grow and shrink automatically-- so no more malloc or realloc. 
So in fact, if I want to add something to this list, I can literally say scores, which is the name of the variable, go inside of it just like a struct in C, and use a function, otherwise known now as a method that's inside of a structure, and just append a value like 72. I can then do this again and append 73. And I can then do this again and append 33. And now, I can go ahead and print out an average. Let's go ahead and say average, just like before. And it turns out Python has some fancy functions that are useful here. I can take the sum of all of those scores and divide by the length of that list, thereby giving me, hopefully, the total sum of the scores divided by the total count of scores and getting an average-- so python scores.py. Oh, no, I forgot what? AUDIENCE: f. DAVID MALAN: Just the f for an f-string. All right. So let me go ahead now and rerun that. And voila-- it looks like with those three values, they average out actually to, for instance, 59.33333. And if I actually started poking around, we would really see the imprecision. And we're starting to see it on the screen here already. Well, let me go ahead and make this more succinct. I don't need to use append, append, append. In Python, I can just say scores equals, in square brackets, 72, 73, 33, not unlike the curly brace notation you might recall seeing at some points in C. But it's a little more commonly used here in Python. So this, too, is going to work exactly the same, the point being lists can grow and shrink. If you want a list, just use it. You don't have to think as hard anymore about using that type of structure. All right. Let me open up one of the first problems, though, we encountered in week 2. And that was, for instance, in string2.c. In string2.c, recall that I simply wanted to iterate over all of the characters in a string. And this problem we were able to solve pretty straightforwardly in C by using the square bracket notation-- turns out in Python, we can do this a little more succinctly. 
Let me go ahead and call this string.py. I'm going to go ahead and now import from CS50 the get_string function just to make user input a little easier today. I'm going to go ahead and get a string from the user, asking them for their input. And then, I'm just going to go ahead and print out output. And then, I'm going to suppress the new line, just to keep things all in the same line. And then, I want to iterate now over the user's input and print it character for character. Well, in C, I did this with square bracket notation and a very verbose for loop. In Python, I can do something pretty similar-- for i in range length of s, because the length of the string is the total number of characters. If I pass that as input to range, that lets me iterate once for every character. And I can use the same notation. I can print s bracket i in Python. And let me get rid of the new lines so that I only have one at the very end. So again, I'm typing quickly. But range just counts some number of times. How many times? However many characters there are, as per the length of the string, and on each iteration, print the i'th character of s. Let me go ahead and run this-- python of string.py. Let me type in, for instance-- oops. Do that again. After I see the prompt for input, let me type Emma's name. And there's the output, right? It looks the same, even though I'm technically printing it character for character. But Python is kind of fancy. And you don't need all of this mechanical stuff, like counting numbers and square bracket notation. If you want to iterate over a string character by character, you can just say for c in s, print c. And it will figure out how to get the character that you want. Technically, let me override the new line. But this is much more pleasant now. Now, if I want to type in the same thing, voila, works the same, less code, getting more work done, getting back to other things I really want to do instead. 
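The two ways of walking a string described above can be sketched side by side. The string is hard-coded here as a stand-in for what get_string or input would return, so that the sketch runs without prompting anyone.

```python
s = "Emma"  # stand-in for the user's input

# Index-based version: range(len(s)) counts once per character.
print("Output: ", end="")
for i in range(len(s)):
    print(s[i], end="")  # the i'th character, with no new line after it
print()

# Direct version: Python hands back each character in turn.
print("Output: ", end="")
for c in s:
    print(c, end="")
print()
```

Both loops print the same line; the second simply drops the counter and the square brackets.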
Let's look at another case from week 2, where we had this upper case code. The goal here, recall, was to take a string from the user s, and then go ahead and capitalize all of the letters therein. So how might I do this in Python? Well, we've seen hints of this already. Let me go ahead and in a file called uppercase.py, I'm going to go ahead and from cs50 import get_string as before. Then, I'm going to go ahead and get a string from the user, asking them for the before version. And then, here, I'm going to go ahead and print out after. And then, I'm going to go ahead and print out no new line. And you know what? If I want to print the string, I'm just going to go ahead and print s.upper and be done with it today. So now, if I do Python of uppercase, and let's type in Emma's name this time in all lowercase-- voila-- done. And you don't have to worry about getting into the weeds of each individual character. Variables of type string, like s in this case, have functions built in, like upper. And we saw lower, as well, earlier. All right. Someone asked during the break about command line arguments, the things you can type after the word at the prompt. Well, it's a little weird with Python, because you're running a program called Python whose command line argument is the name of your program. But you can still provide command line arguments to your own program after the name of the file. So it's kind of offset by one. But you can, nonetheless, do this. So let me go ahead and open up from week 2, say, argv1.c. And this is from a few weeks back. And the purpose of this program in C was just to print each command line argument one at a time. In Python, today, I'm going to call this argv.py. And this is a little different. If you want to access command line arguments, you can't just use argv and argc because there is no int main void, or specifically, int main argc, string argv, as there was in C. 
That's gone. But argv and command line arguments more generally are exposed to you in another library. It happens to be called sys for system. And you can literally just import argv if you want. So it's a little different, but same exact idea. And if I want to print each of those, I can say for i in range-- now I want to say argc. My goal at hand, again, per the left, is just to print each command line argument and be done with it. But I don't have argc. And you might like to do this, but that doesn't exist. But that's OK. How do you think I could get the number of arguments in argv? The number of strings in argv? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, go with your instincts. We've only seen a few building blocks today. But if argv is a list of all command line arguments, it stands to reason that the length of that list is the same thing as argc. In C, the length of something and the something were kept separate in separate variables. In Python, you only need the thing itself because you can just ask it, what is your length? So if I go ahead and do this, I can now go ahead and print out argv bracket i. And let's see. Python of argv.py. Enter. Nothing printed except the program's name. But what if I type in foo? What if I type in bar? What if I type in baz? These are just weird go-to words that computer scientists use when they need a placeholder like xyz. It's indeed printing all of the words after my program's name. Of course, I don't need to get into the weeds. As before, if you want to iterate over all of the words in a list, for i in, or, let's say, for arg in argv, just go ahead and print it. Voila. Python. Much faster to do the same thing. So it reads a lot more like English even though it's a little terse, but the end result is going to be the same thing here. A couple more quick examples just of building blocks that you might assume exist, and indeed do. 
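As a minimal sketch of the argv example just described, here are both loops in one file. It assumes the file is saved as argv.py and run as python argv.py foo bar baz; the output depends on whatever arguments are actually typed.

```python
from sys import argv

# Index-based version: len(argv) plays the role that argc played in C.
for i in range(len(argv)):
    print(argv[i])

# Direct version: iterate over the list of arguments itself.
for arg in argv:
    print(arg)
```

The first item printed is the name of the program, since the word python itself is not included in argv.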
In exit.c, a few weeks back, we just introduced the notion of returning 0 or returning 1 or some other value just to signify that something worked or did not work. This was success or failure. Python offers the same feature but the syntax is a little different. Let me create a file called exit.py. And I can get access to both argv and exit like this. Let me go ahead and from sys import argv and a function called exit. So in Python, you don't just magically have access to functions. Sometimes you do need, as in C, to import them. And you only know what exists from the documentation. And I'm going to do the same thing. So I wanted to say in C, if argc does not equal 2, the equivalent in Python is if length of argv does not equal 2. What do I want to do? I want to go ahead and print missing command line argument. And then I'm going to go ahead and exit 1. So whereas in C we said return 1 because we had a special main function, in Python, for now, we're just going to say exit 1. Same idea, slightly different name. Otherwise I'm going to go ahead and print out hello, placeholder, argv 1. With an f-string. So this one's a little faster. But just to be super clear, all I'm doing is converting from left to right. And we'll have all of these examples on the course's website if you want to look at them more slowly, left and right. The only new detail here is instead of returning 1 on error, I'm going to start calling exit 1. And I have to access that function after importing it from the sys library. That's all that's different here. Returning 0, then, is the same thing as exiting 0 as well. All right. What more building blocks might we like? How about-- oh, this is interesting to me. Here, let's go ahead and open up names.py, or rather-- let's see. Actually, let's go ahead and do this one from scratch. I'm going to go ahead and do a quick linear search style algorithm, this one called names.py. 
Let me go ahead and import from sys import exit just so I can return 0 or 1 as needed. Let me give myself a list of names just like we did a few weeks ago. Emma, and Rodrigo, and Brian, and my own. All in caps just because, just for consistency with a few weeks back. Suppose I want to search for just one of us. And suppose this program is only searching for Emma to see if she's in a list, just as we did a few weeks back. Well, in the past, you would do a for loop. You would iterate over every darn element in the list, checking if it equals equals Emma or strcmp-ing against Emma. Oh my god, no. We don't need to do that anymore. If you want to know if something is in a list, just say if Emma in names, print, found. And then I'm going to go ahead and exit 0 for success. And down here, I'm going to assume if I get this far, Not found. And I'll exit 1. So if I run Python of names.py. Enter. Emma is found. Suppose I change her name to Humphrey up here. Now it's not going to be found because Emma is not technically in the list. Emma Humphrey is in the list. So now if I rerun it she's not found. But I have distilled into a succinct one liner all of the logic for which, for weeks, we've been using things like for loops and the like. All right. Any questions before now we introduce some new Python-specific capabilities? Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Really good question. What would be the big O notation for doing this here? This is well-documented. So if you actually read Python's documentation, for each of its data structures, something like a list will give you big O of n. That is well-defined. A dictionary, too, is well-defined, with high probability, and we'll come to that in a little bit. You would read the documentation to know exactly those things. So having familiarity with that big O notation can actually help you answer those things from docs as well. All right. 
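The names.py search just described might look like the sketch below. The function name search is a hypothetical wrapper added so that the status code can be inspected without ending the program; the lecture calls exit directly instead.

```python
names = ["EMMA", "RODRIGO", "BRIAN", "DAVID"]


def search(target):
    """Return 0 if target is in names, else 1 (hypothetical helper)."""
    if target in names:  # linear search, big O of n, done by the language
        print("Found")
        return 0
    print("Not found")
    return 1


status = search("EMMA")
# In a standalone script, sys.exit(status) would pass this code to the shell,
# just as exit 0 and exit 1 do in the lecture's version.
```

The in keyword compares whole elements, so searching for a name that only partly matches an entry reports Not found.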
Let's go ahead and open up a fancier example, or write one, called phonebook.py, the goal of which is to represent the notion of a phone book. Let me go ahead now and still from sys import exit just so I can terminate if we fail. Let me go ahead and define a bunch of people. But instead of putting people in a list like before, now I want to use something like a hash table. A hash table, recall, has inputs and outputs like keys and values. Or more generally, this is now what we're going to start calling a dictionary. A dictionary, just like in the human world, has a lot of words with a lot of definitions. A phone book is essentially a dictionary. It's got a lot of names and a lot of numbers. Those are keys and values respectively. So a dict in Python takes as input keys and produces as output values. And it happens to be implemented typically by the people who invented Python using a hash table. So the hash table you all wrote is now a building block to these data structures or abstract data structures that we'll now call, for instance, a dictionary more generally. So curly braces are back only in the context here of defining what's a dict or dictionary. I'm going to go ahead and define a key called Emma and I'm going to give her the same phone number we gave her a while back of this. Notice the colon. Notice the double quotes around each value. Let me go ahead and put Rodrigo in the phone book. And his number is going to be 617-555-0101 as before. Let me go ahead and put Brian in there, also separated with a colon. 555-0102. And I'll put myself in there with 617-555-0103. So this is a little different-looking. The curly braces say, hey, Python. Here comes a dictionary. A dictionary has keys and values, just like a dictionary in the human world has keys which are words and values which are definitions. Phone is the same idea. Names and numbers are our keys and values. I'm separating each key and value with a colon and I'm separating those pairs with a comma. 
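A minimal sketch of the dictionary just described follows. Emma's number is not stated at this point in the lecture, so the value used for her below is a placeholder assumption; the other three numbers are the ones read out.

```python
# Curly braces start a dict; each pair is key: value, and pairs are
# separated by commas.
people = {
    "EMMA": "617-555-0100",     # placeholder, not given in the lecture
    "RODRIGO": "617-555-0101",
    "BRIAN": "617-555-0102",
    "DAVID": "617-555-0103",
}

# Walk the keys (names) and values (numbers) together.
for name, number in people.items():
    print(name, number)
```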
All right. So why is this useful? This is now the simplest way to represent a phone book or even a dictionary with words and definitions in Python. I can now ask a question like if Emma in people. Well, let me go ahead and get her number. Let me go ahead and say Found, people, bracket, Emma, using some newer syntax. But I'll come back to this in a moment. And let's just start with this. So this is not going to work until I make it an f-string, but let's see why this works. Python phonebook.py. Am I going to find Emma? Indeed. I found her number. If I change this to myself, David, and save and rerun it-- oh. You have to change this here, too. David. Sorry. Now I get my number as well. So what's going on here? So this is the Pythonic way of just asking, is a value in a data structure? You don't have to use for loops. You don't have to traverse chains or linked lists or the like. You can just ask the question as on line 10 here. This is somewhat new syntax. But what's cool about dictionaries in Python is that if the dictionary's called people-- and you know it's a dictionary only from these curly braces. If the dictionary is called people, you can treat it like an array but whose indices are not numbers 0, 1, 2, 3, but whose indices are words. So another name for a dictionary in programming is an associative array, which is almost a better name, because it makes it sound like an array. But it's associative in the sense that you can associate words with values, not just numbers with values. So a dictionary, to be clear-- key value pairs. The keys, though, are strings. And the values are anything you want. In this case, they're phone numbers. But they could be definitions of actual English words in a dictionary. All right. And I can go ahead and clean this up, too. I can change this back to Emma. And if I find her, I can go ahead and say exit 0. And if I don't find her, I could just say print not found and exit 1. But the exits aren't strictly necessary. 
The program will still quit. Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Really good question and that's a subtlety that I didn't mention explicitly. The single quotes are necessary here because Python would get confused if I've got outer quotes here and outer quotes here on the beginning and end of line 11. So I'm deliberately using single quotes, which are OK in Python. You can use double or single. Unlike in C where double was strings and single was chars, there are no chars in Python. So you get to use both for either purpose. Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Really good question. So in pset 5, you implemented a hash table, which is the lower-level notion of a dictionary. What I mean by that is that you stored words in the dictionary. But sometimes you had collisions, and so you used linked lists. That's fine. But your check function, recall, in pset 5 only returns true or false. Is the word in the dictionary or not? The check function did not reveal any information about how long it took to find that word or how far down the chain it actually was. A dictionary is similarly an abstraction similar in spirit to your check function. Yes. Technically, underneath the hood, Emma and Rodrigo for whatever reason might hash to the same bucket, like the buckets on stage. But all you care about is the value. The dictionary's purpose in life is to go find Emma's value for you or Rodrigo's value for you and return it as quickly as possible. The fact that it happens to lead to a linked list, maybe, is an implementation detail that is not exposed to me, the programmer who just wants to store keys and values. And that's the difference between an abstract data type like a dictionary and an actual data structure like a hash table. You use the latter to implement the former. All right. Few final examples before we now make things more real world. 
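The lookup and the quoting subtlety discussed above can be sketched as follows. The dictionary is cut down to two entries, and Emma's number is again a placeholder assumption rather than a value given in the lecture.

```python
people = {
    "EMMA": "617-555-0100",   # placeholder, not given in the lecture
    "DAVID": "617-555-0103",
}

name = "EMMA"
if name in people:                      # asks whether the key exists
    message = f"Found {people[name]}"   # index the dict by a word, not a number
else:
    message = "Not found"
print(message)

# Single quotes around the key inside a double-quoted f-string, as in the lecture.
print(f"Found {people['DAVID']}")
```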
You'll recall from week 4, the last past week that we'll look at, we had a few problems that we encountered, for instance, with comparing strings. This is a couple of weeks back now. But recall that this example was initially problematic because you could not compare s equals equals t. You had to use strcmp. Why could you not just say if s equals equals t to compare two strings and see? Yeah. AUDIENCE: We could [INAUDIBLE]. DAVID MALAN: Exactly. They were pointers to chars or addresses of strings. And you would be comparing the addresses of those strings that might look the same but they are stored in different locations. In Python, that nuance is now gone. If in Python you want to compare two strings, by god, just compare those two strings like this. Let me call this compare.py. Let me go ahead and from the cs50 library import get_string. Let me go ahead and get two strings from the user. For instance, s and t, arbitrarily as before. get_string. Here we go. Quote, unquote t. And then if you want to check if s equals equals t, just ask the question and say Same if so. Else, go ahead and say Different. Now if I run this program as compare.py, Python of compare.py, let me go ahead and type in, say, my name here and then my name again. Technically in C, s and t were stored in different locations. And in Python, they technically are, too. Doesn't matter. The equals equals operator in Python is going to compare literally what you intended. All right. What about this? This one was painful and sparked the whole exploration down the rabbit hole of pointers and addresses and the like. Suppose you just want to swap two values, x and y initialized a couple weeks ago to 1 and 2. My god, the hoops we had to jump through in C just to swap two values. Hopefully by the end, you understood why there was this fundamental issue. And that, again, had to do with memory and moving things around and copying. But in Python, guess what? 
Let me go ahead in Python and call a program swap.py. And let me go ahead and give myself two variables. That alone is already faster because you don't have to worry about data types or semicolons. Let me go ahead and just print out that x is x, y is y, just so we can see what these values are. However, I could just use debug50. You can also debug Python programs in the IDE as well. I'm going to do this twice, recall, the goal now being to swap two values. So if I want to swap x and y, guess what? In Python, no big deal. Swap. All right. Python swap.py. Oh, my god. You get it for free with the language. So now let's actually start to take things in the direction we did in week 4 with file IO. Let me open up phonebook.c. This was another example of phone book manipulation where, recall, we opened a file called phonebook.csv which is like a lightweight Excel file. Comma-separated values. Simple text file. We opened it with fopen. We then got a name and a number from the human. And then we used this new function fprintf-- file printf-- to just print something percent s comma something else. The name comma number to the file. And this is how I was able to add the heads' names and numbers to that CSV. Well, we can actually do the same thing in Python but a little more simply as well. Although the syntax is going to look a little cryptic at first glance. Let me go ahead and save this file also as phonebook.py, although a fancier version now. Let me go ahead and open up here phonebook.csv which I've already populated with name comma number, just so that if we were to open it in Excel we would have column headings. And I'm going to go ahead and do this. In Python, if you want to deal with CSV files, there's actually a package called csv. Package is a Python word for a library. And in that package is a lot of CSV-related functionality. And I'm also going to import from cs50 again get_string. All right. What do I want to do? First line is going to be pretty similar to C. 
I'm going to open the file using open instead of fopen. And I'm going to call the file phonebook.csv. And I'm going to open it in quote, unquote, a mode. What was a again? Append. If I used w, it writes the file and will just keep overwriting it again and again. Append will keep adding to the file. So we can keep adding more TFs to the file. All right. Now let me go ahead and just get a name from someone. So get_string Name. Let me go ahead and get their number via get_string as well. Whoops. Number equals get_string Number. And get that from the human. And now this part's a little new. But again, this is the kind of thing that you just Google when you forget the syntax for something like this. I'm going to declare a variable called writer, though I could call it anything I want. Its purpose in life is going to be to write stuff to the file. I'm going to go inside of the csv package, again, the library that I imported up top. And I'm going to pass to a writer function the file. So you would only know this from the documentation. But what I've highlighted here means hey, Python. Pass the open file to this library that's going to make it easier for me to treat it as a CSV file. Rows and columns. That's all. Now let me go ahead and do this. writer-- oops. writer.writerow. So writerow is a function that's built in to the csv library's functionality that quite simply lets me write a name and a number to that file. It will take care of the commas. It will take care of quoting anything. As an aside, if one of us were to have a comma in our name like Brian Yu, comma, Junior, that comma could be problematic because it could break the CSV's implicit assumption that commas separate values. But you could put quotes around Brian's full name, even if he had a comma, Junior or whatever in his name. This library takes care of all of that headache for you. But there is a subtlety. I mentioned something called a tuple before. 
For low-level, uninteresting reasons now, you actually need double parentheses now. So you're technically passing in one thing in parens. But more on that another time. Now let me go ahead and close the file. file.close. So let me go ahead and run this. Python phonebook.py. Whoops. Invalid syntax. I forgot an equal sign. And just as in C, you'll see that the red things appear sometimes when it knows what you've done wrong, but it takes a little while for them to disappear sometimes. Name. Let's go ahead and add Emma, all caps just for consistency. 617-555-0101 was her number. All right. Hopefully, hopefully. Come on. Come on. Oh wait. That's the wrong file. [LAUGHTER] Here we go. Because I created a new one. So, cheating. Name, number. I ran my program in a different directory which meant it created a new file. So I'm not actually cheating there. I was just in the wrong place. User error. Let's run it once more. Rodrigo. 617-555-0101. Enter. There we go. Let's run it again, this time with Brian. Brian, 617-555-0102, and so forth. So this code admittedly is not super straightforward. And honestly, this is exactly the kind of stuff that I Google when I forget actually how to manipulate the CSV. But that's what the documentation indeed is there for you. And in fact, let me clean this up a little bit. It turns out you can write this code a little differently. And online, you'll see slightly different approaches. You'll see a keyword in Python called with which this makes it a little tighter to write your code. If you use this keyword with as you'll see in documentation and some of the staff sample code, you don't have to close the file. It will automatically be closed for you, thereby just saving you one line of code. All right. Any questions on that? All right. And now if we can, enough with the sort of syntactic details. Like, that's Python. 
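As a hedged sketch of the CSV code just walked through, here is the version using the with keyword. It writes to a temporary path rather than the lecture's phonebook.csv so that it leaves nothing behind, and the name and number are hard-coded stand-ins for what get_string would return. The newline="" argument is what Python's csv documentation recommends when opening files for the csv module; it is not mentioned in the lecture.

```python
import csv
import os
import tempfile

# Temporary stand-in for phonebook.csv (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")

name = "BRIAN"            # stand-in for get_string("Name: ")
number = "617-555-0102"   # stand-in for get_string("Number: ")

# "a" appends; with closes the file automatically at the end of the block.
with open(path, "a", newline="") as file:
    writer = csv.writer(file)
    writer.writerow((name, number))  # one tuple, hence the double parentheses

# Read the file back to see what was written.
with open(path, newline="") as file:
    rows = list(csv.reader(file))
print(rows)
```

Running the append step again would add a second row rather than replace the first, which is the difference between the a and w modes.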
That's going to get you like 80%, 90% of the way through learning Python, even though you'll invariably have to lean on the slides and the notes and Google and Stack Overflow for little syntactic details as you translate your C programs in problem set 6 to Python programs in problem set 6. But now, regular expressions. Let's introduce some new powerful features of this language that C did not have but other languages do have, too. Regular expressions I alluded to earlier as representative of a feature where you can define patterns when you're trying to detect patterns in users' input. And it turns out in regular expressions, there's a few pieces of syntax that are useful to know. Dot in the examples we're about to do represents any character. So if you don't know what character you're expecting, you can just say dot to represent any character. Dot star is going to mean zero or more characters. Dot plus is going to mean one or more characters. Question mark is going to mean something optional. And there's some other syntax as well. But let's make this more real first. If I go back from before into the very simple agreement example that we did a while back, you may recall that we had this code here where I enumerated explicitly yes and y and no and n. But as someone noted, these already kind of follow a pattern. And it turns out it might be sufficient just to check for a word starting with y or maybe I could check a little more succinctly for multiple values at once. So let me go ahead and do this. It turns out Python has a library called regular expressions, or re. In this library is a bunch of fancier functionality. I can change this if condition to be this instead. I can go ahead and use re.search which is a function whose purpose in life is going to be to search a string for a pattern that you care about, like something starting with y. And the way I'm going to do this is search for initially yes. And the string I'm going to search is s. 
And that is going to return effectively true or false. So I'm going to change my code to just quite simply be this. This says hey, Python. Search the string s for this word here. All right. Let's test this out. So Python of agree-- whoops, now in this version. Whoops. I forgot my own-- let's see. I forgot my colons. So Python of agree. Enter. Do I agree? I'm going to go ahead and type in yes, agreed. But at the moment, y by itself does not work. So let's make it work. Well, I could do this in a couple of ways. In regular expressions, you can say yes or some other value. So a vertical bar just means or. So it's not the word or and it's not double bars in this context of patterns. It's just a single vertical bar. But now I can type y or yes. But there's some cleverness here, right? Like, yes already starts with y. So I could actually say this. Let me arbitrarily put parentheses around es initially. But then put a question mark at the end. This is funky syntax. And again, what we're talking about now is not Python per se. These are regular expressions, patterns of text. This just means look for a y and maybe an es but maybe not an es. So the question mark means 0 or 1 instance of the thing to the left. It's optional. So now I can run this again and say yes. And that seems to work. Or I can say y and that seems to work. But this does not work. So how could I fix this and make it case-insensitive? I could actually just say lower and just force everything to lowercase. Or it turns out, if you read the documentation-- this looks a little weird-- you can also pass in a third argument, which weirdly is all caps like you're yelling. But this is re.IGNORECASE. And this will just force everything to be treated as lowercase or uppercase. It doesn't matter. But we'll see here this is actually going to make it a lot easier to search for certain patterns. We can say no similarly here by just starting to construct patterns.
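To make the progression above concrete, here is a small sketch of the patterns as they were built up. The helper name matches and the n(o)? pattern for no are assumptions added for illustration; re.search and re.IGNORECASE are the standard library names mentioned in the demo.

```python
import re


def matches(pattern, s, flags=0):
    """Return True if pattern is found anywhere in the string s."""
    # re.search returns a match object or None, which is what
    # "effectively true or false" refers to.
    return re.search(pattern, s, flags) is not None


# Step 1: the literal word yes. A lone y is not found.
print(matches("yes", "yes"))                     # True
print(matches("yes", "y"))                       # False

# Step 2: a vertical bar means "or".
print(matches("yes|y", "y"))                     # True

# Step 3: y followed by an optional es.
print(matches("y(es)?", "y"))                    # True
print(matches("y(es)?", "YES"))                  # False, case matters so far

# Step 4: a third argument makes the search case-insensitive.
print(matches("y(es)?", "YES", re.IGNORECASE))   # True

# An assumed counterpart for no, built the same way.
print(matches("n(o)?", "No", re.IGNORECASE))     # True
```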
And again, you don't sit down generally and write regular expressions that just work like this. You build them up piece by piece as I already am. So let me fix this real quick. What did I just do wrong? Here we go. Let me do one last thing. Suppose I agree. Yes. OK. That's OK. Because I'm searching the whole string s. But if I want to search for literally the beginning of the string, I can use a caret symbol here. And to search all the way to the end of the string, you can use a dollar sign. Why these are the way they are I don't know. It's hideous. But caret means start of string. Dollar sign means end of string. And if it's not crazy enough now, yes with anything after it is not going to work. No agreement. But yes literally will. Because this means the human must type literally at the beginning of their input a y followed optionally by an es. And then per the dollar sign, that's got to be it for their input. You can make it really tight around the user's input to control what they are typing in, especially for something like an agreement. All right. So now let's do something more fun. So now that we have Python, it turns out we can do some more interesting things. And it turns out you can do these even on your own Mac or PC. I've been using the IDE all this time. But Python is even easier than C to get working on your own Mac or PC. And so indeed, before class, I literally downloaded a program called Python, installed it on my Mac-- and you could do it on a PC as well-- which allows me on my own Mac to use something like this terminal window in order to run Python programs on my own Mac without the IDE in the way. What this means in particular is that I can use hardware on my own Mac or PC. For instance, the microphone built in. So let me go ahead and make a program here that's going to be called, for instance, voice. Let me go ahead and open voice.py. I'm going to use a different text editing program. It's not the IDE, but it's going to let me write code. And let me go ahead and do this.
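A brief aside on the caret and dollar sign from the agreement example: the sketch below assumes the same y(es)? pattern and the re.IGNORECASE flag from that demo, and the function name agrees is hypothetical.

```python
import re


def agrees(s):
    """Return True only if the whole input is y or yes, in any case."""
    # ^ pins the match to the start of the string and $ to the end,
    # so extra words before or after the answer are rejected.
    return re.search("^y(es)?$", s, re.IGNORECASE) is not None


print(agrees("y"))            # True
print(agrees("YES"))          # True
print(agrees("yes, agreed"))  # False, text follows the yes
print(agrees("oh yes"))       # False, text precedes the yes
```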
Let me go ahead and get input from the user not even using the CS50 library. But I'm just going to ask the human to say something backslash n. And then I'm going to force the user's input to lowercase just to make my life a little easier. And now I'm going to ask a few questions. If the word hello is in the user's words, well, let me go ahead and say hello to you, too. That's nice. elif, for instance, how are you in words. Let me go ahead and say something like, print, for instance, I am well. Thanks. elif, how about goodbye in words. Let me go ahead and print goodbye to you, too. Though I could certainly say most anything I want here. else, I don't know what's going on, so I'm just going to say huh. So what is the essence of this program? What have I done? Like, this is kind of, sort of, definitely a stretch, but the beginnings of artificial intelligence, if you will. It's a program that's interacting with me. And way back when, some of the earliest programs in AI were just text-based like this. Artificial intelligence is essentially like creating a human that's sentient and actually can respond to and react to a human as though they too are human themselves. So let me go ahead and run this. Python voice.py as though I'm talking to it and say, hello there. That's grammatically wrong, but we won't care. Hello to you, too. How are you? I am well, thanks. That's kind of cool. Goodbye. Goodbye to you, too. Now why did that work? I'm just using Python's in operator, searching the user's words, which are just strings that have been typed in via the input function. And again, the input function is almost the same as get_string, but it's the one that comes with Python. And I'm just doing if else, if else, if else, if else, printing out things. But it turns out with Python-- and honestly, other languages, but Python especially-- it's easy to do even fancier things, too.
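The text-based program just described might be sketched roughly as follows. The prompt and replies follow the demo; splitting the logic into a reply function that returns a string (rather than printing inside each branch) is an assumption made here so the behavior is easy to check.

```python
def reply(words):
    """Pick a response based on phrases found in the user's words."""
    # Force lowercase so that Hello and hello are treated alike.
    words = words.lower()
    # The in operator checks whether one string appears inside another.
    if "hello" in words:
        return "Hello to you too!"
    elif "how are you" in words:
        return "I am well, thanks!"
    elif "goodbye" in words:
        return "Goodbye to you too!"
    else:
        return "Huh?"


def main():
    # input is Python's built-in counterpart to CS50's get_string.
    print(reply(input("Say something!\n")))
```

Calling main() runs the program interactively; reply("Hello there") on its own returns "Hello to you too!".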
Let me go ahead and not get the human's words from the keyboard but let me import speech_recognition, which is a library that I've installed on my computer in advance. And let me go ahead and change this a little bit. Let me go ahead and say something like this. recognizer gets speech_recognition.Recognizer. And I literally did not know what I was doing. I was simply following the directions when I downloaded the library initially. But I learned that I can say with speech_recognition.Microphone as source. Print. Now let's go ahead and say something to the human so they provide input. Then let me get some audio from the user. recognizer.listen to that source being the microphone. And then down here I'm going to say, Google Speech Recognition thinks you said. And then print recognizer.recognize_google of audio. So it's OK if we don't understand each and every line. I didn't last night when I was sort of experimenting with this example. The key, though, is that I've imported a very powerful library that's open source and freely available. It happens to talk to Google's back-end infrastructure, where they implement a number of artificial intelligence features. And if I didn't screw up, let's see how this one works. Python of voices.py. Hello, world. How are you? Goodbye, world. OK. Pretty, pretty amazing. [APPLAUSE] Thank you. Let me go in, and for time's sake, let me open up a variant of this that I wrote in advance. This one now is exactly the same. But now notice insofar as Google is handing me back a bunch of words, I can certainly just use some Python syntax and say, is hello in the user's words? Is how are you in the user's words? Is goodbye in the user's words? So let me run this version. Python voices2, which is available-- I can't talk while I'm doing this demo. Hello world. How are you today? Goodbye, world. OK. [LAUGHTER] Now let me take it up a notch and introduce, in this case, an example using regular expressions. So notice this.
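For reference, the microphone version narrated above could look something like this sketch. It assumes the third-party SpeechRecognition package (imported as speech_recognition) and a working microphone backend such as PyAudio are installed; the function name listen_once is hypothetical, and error handling for unintelligible audio or network failures is left out.

```python
def listen_once():
    """Record one utterance from the microphone and return it as text."""
    # Imported here rather than at the top so this file can still be
    # loaded on machines without the third-party package.
    import speech_recognition

    recognizer = speech_recognition.Recognizer()
    # The with block opens the microphone and releases it afterward.
    with speech_recognition.Microphone() as source:
        print("Say something!")
        audio = recognizer.listen(source)

    # Sends the recorded audio to Google's speech recognition service
    # and returns its best guess at the words.
    return recognizer.recognize_google(audio)


def main():
    print("Google Speech Recognition thinks you said:")
    print(listen_once())
```

Because listen_once returns an ordinary string, the same in checks used for typed input apply to spoken input unchanged.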
At quick glance, it uses re.search. And it's searching for the words my name is, which is to say that hopefully this will detect if I have said my name is such and such. And it's then going to say hey to whatever matches. You can use regular expressions to extract information from input. So I'm extracting with parentheses here whatever comes after the word is. So here we go again. Python, this time of voices3. Hello, there. My name is David. Ho, ho, ho! Now your computer is indeed sentient. Let's do something else more powerful. And I hope you'll forgive if we go, like, two minutes over today. I hope it's going to be worth it. Let me go ahead, and in today's examples, too, for week 6, let me open up something like faces. In this case here, we have, for instance, a whole bunch of our Yale staff some weeks ago. So you'll see here a whole bunch of faces at Yale. And now I'm going to go ahead and, in advance, I wrote a program here called detect to detect faces. I'm going to go ahead and run this program called detect.py. It's written in Python but we'll let you see the code online. It's going to open that Yale JPEG file. It's going to analyze it looking for things that look like faces. Eyes, and nose, and mouth, and so forth. And if it finds them, it's going to open and extract each and every one of them, for better or for worse. Better still, suppose we have this photo, which is a photo of most of CS50's staff here at Harvard this year. And if you see, I am among them somewhere. Well, I wrote another program thanks to a nice tutorial online, this one called recognize.py, that's going to analyze harvard.jpg this time and actually find, hopefully, me. Because I also have fed this program as input one photo of myself from CS50's website. And in just a moment, hopefully this will open up a file containing an analyzed version. And indeed, if we look for Waldo, there I am in the back. And the program in Python drew that green box. Let's do one final example.
This one is going to be called qr.py. And it turns out, if you're familiar with QR codes, those two-dimensional barcodes you sometimes see online and in the real world, you can import a library called qrcode. I can then generate an image using qrcode's built-in function make. And let me go ahead and make a QR code containing, like, a link to one of the course's videos. https://youtu.be/OHG5SJYRHA0. Let me just double check that there's no typos. OHG5SJYRHA0. So that's going to embed in a two-dimensional barcode that URL. I'm then going to do image.save qr.png, which is a graphics format-- indeed, a PNG format. And that's it. Two lines of code. I'm going to go ahead now and run for my final example homemade in Python, two lines of code, qr.py. That was super quick. And if I now go into my directory, you will see qr.png. And if you'd like to take out your iPhone or Android, open your camera, point it at the code. You might need to zoom in. Hopefully this will work. [MUSIC - RICK ASTLEY, "NEVER GONNA GIVE YOU UP"] That's it for CS50. We'll see you next time.
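For anyone following along at home, the qr.py demo above could be sketched like this. It assumes the third-party qrcode package (with its image support) is installed; the URL and the qr.png file name come from the demo, while wrapping the two lines in a function named save_qr is only an illustrative choice.

```python
URL = "https://youtu.be/OHG5SJYRHA0"


def save_qr(url, path):
    """Generate a QR code image for url and save it to path."""
    # Imported inside the function so this file can be loaded even
    # where the third-party package has not been installed.
    import qrcode

    # make builds the two-dimensional barcode; save writes it as an image.
    image = qrcode.make(url)
    image.save(path)
```

Calling save_qr(URL, "qr.png") writes the image into the current directory, equivalent to the two lines typed in the demo.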