CARTER ZENKE: OK, hello, everyone. It is so good to see you here. My name is Carter Zenke. I'm the course preceptor here on campus. This is our week six section for CS50, where we'll dive into Python. In lecture, we saw a few different topics, and we have those same topics on the board today. They include strings, this new dot notation for Python, loops, dictionaries, libraries, and how we read from and write to files. We'll touch a bit on each of these topics during this section, and we'll spend a bit more time on things like loops, dictionaries, and file reading and writing to help prepare you for this week's problem set. So let's start off with this idea of strings. We saw strings in C, and strings are still here in Python. There are all kinds of interesting things you can do with strings. One of them might be taking in some text from a book. Maybe you've read Goodnight Moon. If you have, you know it's a children's book with lots of simple text. Maybe the book starts off like this: in the great green room. And maybe we're interested in seeing what a computer does with this text. One thing we can now do with Python is get access to all kinds of libraries and tools, including AI. So maybe we give this piece of text to an AI called DALL-E, from OpenAI, and we get back an image like this one, which DALL-E generated just from seeing the text "in the great green room." We could also do it with the next phrase. We could say, there was a telephone and a red balloon, and we'll get back some images a bit like these: this telephone here, this red balloon, again generated by the AI.
Now, all of this, this fancy AI, this image generation, comes down to just giving text to a computer. So let's think about how we do that in Python as compared to C. In C, we had this top line of code: char *text gets the return value of get_string, which prompts the user for input. In Python, we had text gets the value of input. Take a moment here, maybe pause the video, and think: what differences do you see between the top line, in C, and the bottom line, in Python? Maybe you notice that, in Python, we no longer have to state the type of the variable we're working with. Down below, we're still working with a string, as we are in the top piece of code. But notice how, in Python, we just say the variable name, text. Not string text, not char *text, just text. C is what we call a statically typed language: we have to declare a variable's type before we use it, so text has to be declared as a string or a char *. Python is a dynamically typed language, so we don't have to do that. We can ask Python to infer the data type for us, saying text gets whatever input gives us. And because the input function always gives us a string, text is, of course, going to be a string. So that's one difference between C and Python in declaring these variables and getting these strings. But how would we compare them? In C, we had this top code. In Python, we now have this bottom code. Take a moment to yourself, maybe pause the video, and think: what differences do you notice between the top and the bottom? Maybe you've seen that, in the top, we're comparing text with hello using a different kind of function.
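The slides being compared are not reproduced in this transcript. As a rough sketch of what the Python side might look like, with the C lines shown only as comments: the prompt string "Text: " is an assumption, and a literal string stands in for the value input() would return so the snippet runs without typing anything.

```python
# C version, reconstructed from the spoken description (an assumption):
#   char *text = get_string("Text: ");
#   if (strcmp(text, "hello") == 0) { ... }

# In a real program this line would be: text = input("Text: ")
# A literal stands in here so the sketch runs without a prompt.
text = "hello"

# No type was declared; Python works out that text is a str.
print(type(text).__name__)

# Strings are compared directly with ==, and the indented lines
# form the body of the if statement (no curly braces).
if text == "hello":
    matched = True
    print("text is hello")
else:
    matched = False
```

Swapping the literal for a call to input() should behave the same way, since input() always returns a str.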
We're saying, let's use strcmp to look at text and look at hello, check the return value, and see if it's 0, where 0 indicates that the two are the same. In Python, we don't have to do any of that. No special functions are involved. All we say is, if text is equal to hello, do something that's indented. Maybe you notice as well that, in the top, our if statement has to have these curly braces, and the code that runs if the condition is true is inside those curly braces. Whereas, in Python down below, we have if text equals hello, and then we run the code that's indented. So we no longer need curly braces. We rely only on indentation. And so while you can throw out the braces and the semicolons, you do have to make sure your code is indented. In Python, the indentation matters in the way the curly braces did up top. So comparing strings works a little differently in Python, but let's take a look at how we get access to individual characters of a string. Remember that strings are just collections of characters, or strings of characters. To see individual characters in Python, we can do the same thing we did in C. We have the same bracket syntax, like text[i], to get access to a particular character inside our piece of text. This also works for lists. Let's say we had a list in Python, similar to an array in C. We could use these same square brackets to get access to an individual element inside our list. So there are some differences in strings here, but mostly we can use familiar syntax for accessing individual characters. Now, strings are a little more powerful than just comparing them, getting them into your program, or even getting individual characters.
In Python, we have access to functions, or more precisely methods, that belong to this type called str, or, in longer form, a string. So here we have this dot notation that we can introduce. Let's say we're trying to give this string to our program. We could have some code like we saw before: text equals input. And maybe our text looks like this on the right-hand side. Pause the video and ask yourself, what looks a little odd about this text on the right? What is a little messy about it? Maybe you've noticed that we have some spaces before the text and some spaces after the text. Ideally, those shouldn't be there. They're a remnant of the user typing in some information and typing it in wrong, and we want to get rid of them. Luckily, in Python, we have methods we can use to strip that whitespace off either end. This method is called strip, so we can say text.strip(). If we run this line of code, we'll see that the whitespace on either end goes away. Here is what we had before: a piece of messy text with whitespace on either side. After running text.strip(), we've got rid of that. Now we just have in the great green room, the actual characters inside our string. OK, that's pretty good. But what if we had some other kinds of input? Let's say we had miscapitalized letters, like IN in all caps, thE with the E capitalized, and ROom with the RO capitalized. How could we make sure this is all standardized? We could use another method that belongs to strings, this one called lower. We could say text.lower(), and that gives us this same string, but in lowercase. And similarly, maybe we want to capitalize this because it's a sentence. We could do that in very much the same way.
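A small sketch of these string methods in action. The exact messy strings are assumptions standing in for what a user might type; strip, lower, and capitalize are the standard str methods named in the section.

```python
# Hypothetical messy input with extra spaces on both ends.
text = "   in the great green room   "

stripped = text.strip()            # whitespace removed from both ends
print(stripped)                    # in the great green room

# Hypothetical input with odd capitalization.
shouted = "IN thE great green ROom"
lowered = shouted.lower()          # every letter lowercased
print(lowered)                     # in the great green room

sentence = lowered.capitalize()    # first character upper, the rest lower
print(sentence)                    # In the great green room

# Each method returns a new string; the original is left unchanged.
print(repr(text))
```

Because these methods return new strings rather than modifying the original, the result has to be assigned to a variable (or passed along) to be kept.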
We could say text.capitalize() to make sure that the I in this string is capitalized and the rest is lowercased. All of these are what we call methods, and they belong to this data type called the string in Python. Some Python developers long ago decided that functions like lower, capitalize, and strip are so integral to what it means to be a string that they should be included inside the string data type itself, so to speak. If we keep going, we can think of other ways to use these methods. But first, let's dive into how they work, why they exist, and where to find them, so you can decide what kinds of methods you want to use in your own code. So again, these strings, more succinctly called strs in Python, S-T-R for short, have a variety of methods you can look up in the Python documentation. If you go to docs.python.org, you'll see something like this. And if you scroll down to the str section, you'll see string methods inside, where you can find all the methods you could call on your strings in Python. We have capitalize down here, and some others if we scroll down. But notice how we use this dot notation. It's not capitalize with the string given as input to capitalize. It's str.capitalize() or str.lower(). So why does this come up in Python? If we bring it back to what we saw in C with our structs, we might get some intuition. In C, we had this idea of a struct called a candidate, for example, in an earlier problem set. A candidate looks a bit like this, just a single person over here. Remember how this candidate had different attributes, like a name, or maybe a number of votes. This candidate was a data type we had constructed to have these attributes. It had a name and some number of votes.
And to get access to those, we did this very same thing: candidate.votes to get access to votes, and candidate.name to get access to name. In Python, this str data type is somewhat similar, but you can think of it as more of a toolkit. It's a data type that has some tools inside that you can use on the string you're working with. So we could say str.capitalize(). That's a tool inside this str type, a function, or more precisely a method, that we can use on a str. Similarly, we have str.lower(), another tool inside our toolbox for strs that we can use to make the str do what we want it to do. So this is very similar in spirit to attributes for our structs in C, where structs are often defined as our own data types. In Python, we have this data type called a str that has not just attributes but also functions that belong to it and can operate on the string itself. So that's all for dot syntax. Take a look at the Python documentation for all the methods you can use here. But let's now take a look at loops in Python and how strings come into play with loops. If we remember from lecture, we have the same kinds of loops in Python. We have the while loop and the for loop, but they look a little different. One big example of this difference is Python's for blank in blank syntax. We'll see this very often in Python because it's so convenient to use. Let's say, for example, I have some piece of text, again, in the great green room, on the right-hand side, and I want to loop through this piece of text. I could do that much more simply than I could in C. All I have to do is say for c in text. Maybe we want to print out every character. So what we'll do is make a new variable called c.
It will loop through all the characters inside this string called text, and on each iteration c updates as we go through. For example, let's say I run this code. I would see that c gets the very first character in text, the i. On the next iteration, c will get the next character, n, and then the space, and then the t, and then the h, and then the e. This keeps going all the way through our string. Now, we could call c anything we want. We could call it z. We could call it s. We could call it character, a longer variable name. We could even call it zebra if we wanted to. The main thing here is that when Python sees this loop, for blank in whatever string you have, it takes every individual character, assigns it that name as you loop through, and makes sure that each time the name refers to a particular element of your string, where an element is, in this case, a character. Now, this works beyond strings. We also have this kind of syntax for lists. Let's say we want to turn this piece of text, which is all one string, into a list of different strings where each string is a word. We could use a different string method, this one called split. So here we have, again, our text on the right-hand side. Let's make a new variable called words that will get our text, but split up into individual words. When you call split on a piece of text, it will by default look for whitespace in that string and give you back the substrings, the smaller strings, that sit between those spaces. So for example, if we run text.split() here, what we get is a list of individual words. See how this changes from one long string into multiple individual words that are part of a Python list.
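The two ideas just described, looping over a string character by character and splitting it into a list of words, can be sketched as follows. The characters list is a hypothetical helper added only so the loop's results can be inspected afterwards.

```python
text = "in the great green room"

# Looping over a string visits one character at a time, spaces included.
characters = []
for c in text:
    characters.append(c)
print(characters[:6])   # ['i', 'n', ' ', 't', 'h', 'e']

# split() with no arguments breaks the string apart on whitespace
# and returns a list of the pieces in between.
words = text.split()
print(words)            # ['in', 'the', 'great', 'green', 'room']
```

The printed list shows the square brackets and commas that mark it as a Python list of five separate strings.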
And as a thought question here, how do we know this is a list? Take a look at the syntax on the right-hand side. Let me show it to you again. How do we know this is a list? You might have thought that it's because of the brackets on either side. We see these square brackets, and in Python they denote a list. But we also see commas between our words, the individual strings. So we have in, which is a string, comma, the, which is a string, comma, and so on, all inside these square brackets. That denotes a Python list, which is similar in spirit to a C array but much more flexible overall. OK, so now that we've split our text into individual words, let's see how we can loop through these words. If we use our for blank in blank syntax with a list as that second blank, we could say, for word in words, print out the word. What this will do is say, on the first iteration, word will get the value in. On the second iteration, word will get the value the. On the third, it will get the value great, and so on. So you might be able to guess: on the fourth iteration, what will word get? It will get green. And on the fifth iteration? It will get room, at the very end. So we see that word is going through and getting the individual words inside our list. Well, why is that? Why do we get the words inside our list now, as opposed to the characters inside our string? What matters here is the data type you are asking Python to iterate over. This for/in syntax is helpful if you want a loop that goes through every element of what we call an iterable, where an iterable is something you can iterate over. It has elements you can go over individually. If you have a list, what Python will do is go over every element inside that list.
So in this case, our strings are the elements of our list. But if we have just a single string, like in or the or great, or the string as a whole, in the great green room, Python will go through every individual character, because the characters are the subelements of that longer string. I encourage you to go and check what Python does in these different cases to see how the loop changes as you change what you iterate over. Our first exercise together will be to take a look at a piece of code that has a variety of different loops and figure out what each one is going to print. So let's look at our code. I'll go over to my codespace and sign in over here. I'll let you get your codespace up too. So we have our codespace loaded. Let's pull up the piece of code that we'll look at together. We'll open up text.py, and I'll zoom in a bit for you. The goal for this exercise is to take a look at each of these loops in Python and figure out what it's going to do for us. Notice that at the very top, we have the same text from before, in the great green room, and we split it up by word. We're saying, let's split up our text and make this list called words. Then we have different loops down here that loop through those words and print them out in a variety of ways. Let's look at round one. Feel free to pause the video and write it out for yourself: what might we see in round one? We might see, as we saw before, every individual word being printed. So we might see first in, and then the, and then great, and then green, and then room. These are going to be on separate lines because, as we know, print by default adds a newline at the very end for us automatically.
If we wanted to change that, say to print these all on the same line, like in the great green room, we could change the ending character of this print statement. We could set end equal to something else, in this case a space. What will happen here is Python will print out in, that very first word, add a space at the end, and then print out the next word, going on and on through the list. That's up to you, but we can just leave this as printing out the individual words on separate lines. Let's take a look at round two now. Feel free to pause the video and ask yourself, what might get printed out here? If we look through this, we have for word in words. It's the very same loop we had before. We're getting access to every individual word in our list of words: first in, then the, then great, then green, then room. But now, we have for c in word. Recall that when we loop over a list, like in this first loop, we get access to the elements of that list. But word now is simply an individual string. So what are we printing out here? Probably the individual characters inside that particular word. If we loop through this, we might see first i and then n. And then, would we see the space or would we not? Take a guess. In this case, we wouldn't see the space, and that's because we've split our text into a list of words. I can print this out for you to see. I'll copy this and open up a new file called list.py. I'll paste this here, and I'll print words. Let's run this with python list.py. We now see in, the, great, green, room. Notice how there aren't any spaces inside the individual strings that are in our list. So if we go back here and loop through the individual characters, first in this word, which is i and n, well, there are no spaces there.
And if we loop through the next word, t-h-e, there are no spaces there, either. So we'll see t, h, e, and so on and so forth. You can keep going like this. We're printing out every individual character except for the spaces. OK, let's look now at our third round, where we have for word in words, the same thing we saw before. But now, if g, this character g, is in the word, what will we do? Print the word. Take a guess as to what you might see in this loop, keeping in mind that this is our text, in the great green room. You might see only the words that have g in them. In Python, it's just this easy to ask if some string is part of another string. Here we're asking whether g, this one-character string, is inside this other string, the word. So let's go through: in doesn't have any g's in it. The, no g's. Great has a g at the beginning, so we print that. Green does too, but room does not. So it would be safe to say that we would see great and green printed to the screen on separate lines. And now, we have perhaps some new syntax: for word in words[2:]. Maybe you're not familiar with that, but feel free to take a guess as to what will happen here. Maybe pause the video. To look at this, let's go back to our list.py file and try to get a grasp on what this is really doing for us. We see words[2:], and this is somewhat familiar to us. We've seen words[2] before. If I print that and run python list.py, that will give us just great. It gives us the element at index 2, where we count 0, 1, 2 in our list. But what if we wanted not just that element, but that one and all the rest after it? Python comes with this fancy feature. We can add a colon here to say, get me the word at index 2 and all the rest. So I could do this.
I could run python list.py and see great, green, and room. Now I've sliced my list into a smaller piece that only includes great, green, and room. That's the technical term for it: slicing. We're slicing a list using this colon syntax. Now, I could even add an end index. Let's say I don't want room. I only want great and green. I could modify my slice like this. I could say start at index 2, which is great, since we're 0-indexed: 0, 1, 2. And then go up to, but not including, an ending index. Here that index is 4, the last one in our list: 0, 1, 2, 3, 4. So we'll write 2:4 and run python list.py. Now I see great and green. This worked because the first number we put in is inclusive. We're going to get that index back, so we get this value here, this string here. We're also going to get index 3. But we won't get index 4, because the end is exclusive. The first number is inclusive, the last number is exclusive. So to get just, for example, great, we could also do this. But that's not really necessary. We only need 2:4 to get great and green, or, if we wanted to, just 2: to get all the rest of the list, OK? So now that we've seen that, what do you think will happen here? It will print out every word in this list from index 2 onward. So we'll see great, green, and room. OK, last one here. We have for word in words, print Goodnight Moon. What do you think you will see here? Maybe pause the video. Maybe you've noticed that a Python for loop goes through every element we have in our list. A lot of Python for loops are really built on lists. So if we have for word in words, our list, again, is the same one in list.py, if I print out just words now: in, the, great, green, room. This will iterate over every element in that list.
We'll say, OK, first in, then the, then great, then green, then room. And if we're not doing anything with word, we're just looping that many times. So we're going to loop, in this case, five times and print out Goodnight Moon. But it seems a little odd to do it that way. And actually, I think I might have a syntax error here. I don't think we need these parentheses in for word in words. If we write it this way, why do we even call it word? We're not really using word at all. So we could just say underscore, meaning that this name doesn't matter. It could be anything. It could be z, it could be f, it could be zebra, whatever we want it to be. But we're not going to use this variable, so let's call it underscore to signify that the variable name doesn't matter. All we're interested in doing is looping. It doesn't change the output of this for loop. It doesn't change anything about how it runs. It just changes the variable name, signaling that we don't care what the name is. So I'll change it back to for word in words. Let's run this piece of code. We'll run python text.py. And now we'll see, in round five, five Goodnight Moons. Let's scroll back up. In round one, we did see in, the, great, green, room. In round two, we see all the individual characters of in the great green room. In round three, we just see great and green. In round four, we see great, green, room. And of course, in round five, as we saw before, we see five instances of Goodnight Moon. OK, so that covers a lot of Python loops. In general, I think you'll find this kind of Python syntax really handy. It just takes some practice to get to know what each of these loops is doing and how they work with different data types. So let's keep going, and let's take a look at a new Python data type, this one called a dictionary.
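Before turning to dictionaries, here is a rough reconstruction of the five rounds from text.py as they were described aloud. The actual file is not shown in this transcript, so the layout, the "Round" labels, and the helper names round_two, round_three, and count are assumptions added to make the results easy to check.

```python
text = "in the great green room"
words = text.split()

print("Round 1")            # each word on its own line
for word in words:
    print(word)

print("Round 2")            # each character of each word, no spaces
round_two = []
for word in words:
    for c in word:
        round_two.append(c)
        print(c)

print("Round 3")            # only the words that contain the letter g
round_three = []
for word in words:
    if "g" in word:
        round_three.append(word)
        print(word)

print("Round 4")            # slice: index 2 through the end of the list
for word in words[2:]:
    print(word)

print("Round 5")            # the loop variable is unused, so call it _
count = 0
for _ in words:
    count += 1
    print("Goodnight Moon")

# Slices include the start index and exclude the end index.
print(words[2:4])           # ['great', 'green']
```

To print round one on a single line instead, print(word, end=" ") replaces the default newline with a space.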
Dictionaries are really important in Python, very easy to use, and often very useful, as you'll see in the problem set this week. So, a dictionary. If we think of a dictionary on paper, so to speak, we have this idea of holding some keys and some values. Maybe words and their definitions, like a real dictionary has. Here we have, for example, a dictionary of authors. Maybe Goodnight Moon is one of our keys, and the value associated with that key is Margaret Wise Brown. Or Corduroy, the book title, is our key, now associated with the value Don Freeman, that book's author. And Curious George is associated with H.A. Rey, where Curious George is the key and H.A. Rey is the value. What this gives us is the ability to look up, like in a dictionary, the author of a book given its title. So again, Goodnight Moon is an example of a key here and Margaret Wise Brown an example of a value. In this dictionary, authors, I could ask for a book title. I could say Goodnight Moon, and the dictionary will give back the value associated with that key, Margaret Wise Brown. Now, this example is a collection of multiple objects in the same dictionary. We have multiple books here all in that one dictionary: Goodnight Moon, Corduroy, Curious George. But we can also have a single dictionary for a single object. We could have, for example, a single book dictionary. This dictionary has a title key and an author key, the two things that make it a book. The title of this book is Goodnight Moon and the author is Margaret Wise Brown. So I could ask this dictionary for the title of the book by asking for the value associated with the key title, and it gives me Goodnight Moon.
I could also ask for the author of this book by asking for the value associated with the key author, and I get back Margaret Wise Brown. OK, so let's see an example of this in actual syntax. That was a pretty good theoretical overview, but let's think about it in syntax form. Here I have a new dictionary: book equals dict(). What this is saying is, give me an empty dictionary, nothing in it yet, and call it book. So on the right-hand side, I'll see this dictionary, a blank slate called book. Now, let's say I want to add a key and a value that's associated with it. I can say, add this key called title and give it the value Corduroy. So this bracket notation is back, but now we're using it to add a key to the dictionary. We say book["title"] to insert a new key called title and give it the value Corduroy. We can also do it for the author. We could say book["author"] is Don Freeman, so this book's author is Don Freeman. Now, if I want to get some value back from my dictionary, I could say book["title"] and print it out. What would I get in this case? I would see Corduroy printed out to the screen down below. That's because book["title"] says, look for the value associated with this key called title inside this dictionary that we're calling book. OK, now though, what if I asked for the key Corduroy? What do you think would happen? If we look at our dictionary, do we have a key called Corduroy? It doesn't look like we do. We only have a key for title and a key for author. Corduroy is a value, but it's not a key. It's associated with the key title, but it isn't a key itself.
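A minimal sketch of the dictionary steps just described. The final membership checks with the in operator are an addition that the section does not cover; they are included only as a safe way to see that Corduroy is a value rather than a key.

```python
book = dict()                      # an empty dictionary called book
book["title"] = "Corduroy"         # add the key "title" with its value
book["author"] = "Don Freeman"     # add the key "author" with its value

print(book["title"])               # Corduroy
print(book)                        # {'title': 'Corduroy', 'author': 'Don Freeman'}

# Not covered in the section: `in` on a dictionary checks its keys.
print("title" in book)             # True
print("Corduroy" in book)          # False, since Corduroy is only a value
```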
So if we did this, we'd get what Python calls a KeyError, which simply means we're trying to access some key in our dictionary that doesn't exist. Python will tell us that this key is not part of the dictionary. You can't look up the value for a key that doesn't exist. OK, so that gives us access to these dictionaries, and we can make them in code. But what if we wanted to do something a little more advanced? This is a single book, but what if we had multiple books? Well, first we could shorten our syntax a little bit. If I want a new book, I can define it all in one breath. Here we have a new dictionary, denoted now by curly braces. Not square brackets, but curly braces. We have the key title and its value, Goodnight Moon, and the key author and its value, Margaret Wise Brown. The full picture looks a bit like this. But again, this is only one book. So how could I represent multiple books in our code? We could keep this same style of dictionary, where we have a dictionary for every book with the two keys, title and author, and make a list of them, a list of dictionaries. So let's take a look. Here I have this list. How do you know it's a list if you take a look at this piece of code? We see those square brackets on either end, one at the beginning and one at the end. And we also see some commas in the middle, just to highlight that here. This is a list. But inside this list, instead of individual strings, as we saw earlier, we now have full dictionaries. We have this dictionary for Goodnight Moon, this dictionary for Corduroy, and this dictionary for Curious George.
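The list of dictionaries described here might look like the sketch below. The slide itself is not reproduced in the transcript, so the exact layout is an assumption, and the try/except block is an addition used only to show the KeyError without stopping the program.

```python
# A list (square brackets) of dictionaries (curly braces), separated by commas.
books = [
    {"title": "Goodnight Moon", "author": "Margaret Wise Brown"},
    {"title": "Corduroy", "author": "Don Freeman"},
    {"title": "Curious George", "author": "H.A. Rey"},
]

# Index into the list first, then look up a key in that dictionary.
print(books[0]["author"])          # Margaret Wise Brown

# Asking for a key that does not exist raises KeyError.
try:
    books[1]["Corduroy"]
    missing_key_raised = False
except KeyError as error:
    missing_key_raised = True
    print("KeyError for key:", error)
```

Without the try/except, the failed lookup would end the program with a traceback naming the missing key.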
This is helpful for us because we can represent all kinds of different things using dictionaries, and have multiple of them by keeping a list of those dictionaries. So let's get some practice here. Let's go to books.py. Inside books.py, we'll prompt the user for the title of a book and an author, and we'll add that to our bookshelf, which is a list of books. So let's go back over here, close out some old files, and code up books.py. Notice how we have part of our code pre-written for us. We have this list of books that is empty. It's a list, again, because it's simply two empty square brackets. Nothing is inside this list, but there could be eventually. And now, we have this for loop, for i in range(3). We saw range in lecture, which gives us the numbers from 0 up to 2, in this case. So we go 0, 1, 2, and loop three times. Inside this loop, we'll make our new dictionary and add it to our list of books. And finally, at the very end, we'll print out our bookshelf, our list of books. To start here, I can go back to the syntax we saw before. If I want to make a new dictionary, I just need to ask for a blank dict. So I'll go back over here and say, give me a new dictionary called book. And maybe I want to add a key to this dictionary. Think to yourself, what is the syntax for that? It looks a bit like this. We could say, OK, I want to have a title here. The key in this dictionary will be title, and I'll set the title to whatever the user inputs. I'll ask them for a title. Similarly, I could say, let's add an author to this book, and the value for that will come from asking the user for an author. So now, we have this blank dictionary.
We've asked the user to give us a new value for the key title, and similarly, a new value for the key author. We've made these keys and given them some value from the user. All right, we have our book. We could even print it out if we wanted to. You can print book down here. Let's run Python of books.py. We'll say let's get, in this case, Goodnight Moon by Margaret Wise Brown. And we see that, down below here, we do have that dictionary being printed out. And it's a dictionary because we see it has these curly braces on either end and it has these keys associated with these values. So let's actually end our program here. What we can do is try to add this book to our list. And if you are not familiar with this yet, that's OK. We can actually go ahead and say books.append to add to our list. So currently, our list is empty. But we could use this method associated with a list called append to actually insert this book into our list. books.append an individual book. So let's try this. Let's actually go ahead and run books.py. We'll do Python of books.py. Let's go ahead and add Goodnight Moon. This one written by, let's say, CS50. We could also add Corduroy. And maybe, [INAUDIBLE] CS52. And maybe we could add-- oh, I don't know, we can add Curious George, and that one is also written by CS50. So now we see down below-- if I go full screen-- we have this list of dictionaries. So what we saw before. We have the brackets starting and ending our list on either end. The commas separate our dictionaries. And on the inside, we have this dictionary, this dictionary, and this dictionary down below here. So now, we have some individual books on our bookshelf. And now, what can we do? We could maybe decide to print out just the titles. I can go back through my books. I could say for book in books, looping through my list of books, print out the book's title like this. So now, instead of printing the entire list, I could print out just the titles. I could do this. 
I could say Python of books.py. I'll say Goodnight Moon by CS50. I'll say Corduroy by CS50. And I will say, in this case, Curious George by CS50. And now, I see Goodnight Moon, Corduroy, and Curious George. So helpful because we're able to structure our data inside of a dictionary and put that inside of our list here. OK, so what if we wanted to make sure the user couldn't type in a really awkward title? Like let's say, if I do this again, I might type in space, space, Goodnight Moon. And that isn't quite what I want, right? That isn't really good on me as a user to give this kind of data, but it's also not good on the programmer to assume that users are going to give me exactly what I want here. So instead of just accepting user input, I actually could go through and try to sanitize it a little bit. Make sure to clean it up, so I actually have it in the format I want it to be in. I could instead say, OK, whenever I get this input, I want to afterwards strip the white space. And I also want to capitalize it. So what this is doing here is stringing this dot notation together. So here I have input, which gives me back a string. And remember, strings have access to these methods, like dot strip and dot capitalize. So first, we're going to get some input from the user, some string. We're going to strip it using this. And then, we're going to run capitalize on it like this. So let's try that again. Let's do, in this case, print out just the title right after we get it. We'll print out book title. And let's go ahead and do this. We'd say Python of books.py. We'll say Goodnight Moon. And it's a little better, right? We capitalized the G in Goodnight Moon and we made sure that everything else is in lowercase. And there's no white space in front of anything we added here. OK, so just some handy syntax for cleaning up user input and making sure your data is formatted correctly in your own programs here. 
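The bookshelf program walked through above can be sketched roughly as follows. The helper names clean and build_shelf are invented for this example, and the titles and authors are passed in as a list instead of coming from input(), so the sketch runs without anyone typing.

```python
def clean(raw):
    """Strip surrounding whitespace, then capitalize the first letter."""
    return raw.strip().capitalize()


def build_shelf(answers):
    """Build a list of book dictionaries from (title, author) pairs.

    In the section's books.py these values come from input(); here they are
    passed in so the example runs on its own.
    """
    books = []
    for raw_title, raw_author in answers:
        book = dict()
        book["title"] = clean(raw_title)
        book["author"] = raw_author.strip()
        books.append(book)
    return books


shelf = build_shelf([
    ("  goodnight moon", "Margaret Wise Brown"),
    ("Corduroy", "Don Freeman"),
])

# Print just the titles, one per line.
for book in shelf:
    print(book["title"])
```

Note that capitalize uppercases only the first character and lowercases the rest, so the first title comes out as "Goodnight moon". If every word should start with a capital letter, the string method title is one alternative worth trying.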
OK, so now that we have some dictionaries and this ability to represent data in this way, we can think of getting a little more advanced with our programs. If I go back to our slides, we might think of not just getting this shelf of books that the user types in, but really using some data in our programs. And we'll see this in action during the problem set this week. How do we get data and use it inside of our programs, especially using Python? Well, you can think of these libraries and these modules, where a library is some code somebody else has written. And in Python, more specifically, we often call this an individual module. And so, in this example, we'll actually see a CSV module to work with data that's inside of a CSV file. But what is a CSV file? So on your computer, maybe you have Excel or Google Sheets or something like that. And you could store data in different rows and different columns. So notice how here I have a title column and an author column and individual rows for every book. So I see Goodnight Moon with Margaret Wise Brown, Corduroy with Don Freeman, all the way down for these 15 or so books that I have in my data set. So this is what it looks like in Google Sheets or Excel. But under the hood, in the actual computer's file, you'll see something that looks a bit like this, with title,author, then Goodnight Moon,Margaret Wise Brown, and Corduroy,Don Freeman. So CSV stands for Comma Separated Values, where notice how every individual row is a single book except for that first one, which is the row of column titles. And for every row we have, we have multiple columns separated by these commas. So Goodnight Moon is the title of this book and Margaret Wise Brown is the author of this book that's on the same row right here. And similarly, Winnie the Pooh is the title of this book and A.A. Milne is the author of this book on that same row. 
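To see what "comma separated" means under the hood, here is a small sketch that treats a CSV file's contents as plain text. The two-book sample is an abbreviated stand-in for the section's books.csv, and splitting on commas by hand is only for illustration.

```python
# The raw contents of a small CSV file: a header row, then one row per book.
raw = "title,author\nGoodnight Moon,Margaret Wise Brown\nCorduroy,Don Freeman\n"

lines = raw.splitlines()

# The first line holds the column titles.
header = lines[0].split(",")

# Every later line is one book, with its columns separated by commas.
rows = [line.split(",") for line in lines[1:]]

print(header)
for row in rows:
    print(row)
```

Splitting by hand like this breaks down as soon as a value itself contains a comma, which is one reason to lean on a module that already understands the format.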
OK, so to read in these kinds of files, we might want to use a specialized system that understands how this data is formatted, right? It would be a lot of work for you to go through and parse every comma. To figure out, OK, if there's a comma here, I need to put this piece of data in that dictionary or this dictionary. Let's not do that. Let's actually rely on somebody else who's done that work for us here. So in Python, there is this CSV library, or CSV module, that has various methods or functions given inside of it that can help us read CSV files. So here, if you go to the Python documentation, docs.python.org, and you look at this CSV module, you'll be able to see all kinds of information on what is defined inside the CSV module and what you get as part of that module. Now, how would I use this in my code? We saw this in lecture. I could simply write import csv, similar to #include <stdio.h> or #include <cs50.h> in C. Here, I'm simply including, or importing, the CSV library that contains all this functionality I saw in the documentation. So we can think visually of this. It's a bit like getting a big box of stuff. We have this big box of code we can use in our program now called CSV. This is giving us a big box of code. And inside of that are some individual functions we could use. We could use maybe DictReader, DictWriter, reader, or writer. All this is defined inside of the CSV library. But how do we know from this big box of stuff what we actually want to use in our program? So if we just import the entire module, this entire big box of stuff, well, to be more specific about what we want to use, we have to use that dot syntax. We could say something like this. csv.DictReader, for example, to read our CSV as this collection of dictionaries. We could say csv.DictReader, saying, go to that big box of stuff in the CSV module and give me the DictReader part of it-- the DictReader function, right? We could also do csv.reader to get the reader aspect and so on. 
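The "big box of stuff" picture can be checked directly. This is just a small sketch that imports the module and confirms the four names mentioned above are reachable through the dot syntax.

```python
import csv

# Everything in the module is reached through the module name and a dot.
print(csv.DictReader)
print(csv.reader)

# The "big box" holds the four names mentioned in the section.
names = ["DictReader", "DictWriter", "reader", "writer"]
found = {name: hasattr(csv, name) for name in names}
print(found)
```

Strictly speaking, DictReader and DictWriter are classes rather than functions, but they are called with parentheses in the same way, so the distinction rarely matters at this stage.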
So this dot syntax is coming back, but now, it's enabling us to access individual parts of our module. But let's say we don't want the entirety of this big box of CSV module. We don't want everything in here. Well, we could do this also. We could say, instead of import csv as a whole, well, we could just say, give me from the CSV library the DictReader portion. I could do this. I could simply use now from here on out just DictReader. So from csv, from this big box of stuff, import just DictReader. And then, I can simply use DictReader without qualifying where it comes from or what module it's part of, just using DictReader now. Now, in general, we might prefer to actually do this the other way, to use it this way. And why do you think that might be? Think to yourself for a moment. Well, it's often handy to do it this way. Because if we do it this way, we actually are able to make sure our names don't collide. So maybe my own program has this function called reader, by chance, right? Here, if I say csv.reader, that differentiates this reader function from my own reader function. So it's helpful if you're actually defining your own function that might collide names with the functions you get from other modules in Python. But you can, of course, do it this way, if you'd like and you're very certain you don't have any function called DictReader here. OK, so let's see some of the differences between using these various functions inside of this CSV library. So we saw here before we had DictReader, DictWriter, reader, and writer. But what are the differences between these, and why would we even use one versus another? So to do that, let's actually dive into reading files and maybe writing to them. So let's actually go back to our code space now and think about this CSV. We have books.csv, and it's the same thing we saw before. We have title, author as our column names, and we have the title and the author on individual rows separated by commas. 
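The two import styles discussed a moment ago, and the name collision the csv. prefix helps avoid, can be sketched as follows. The local reader function is hypothetical, and io.StringIO stands in for an opened file so the example runs without books.csv on disk.

```python
import csv
import io
from csv import DictReader


def reader(text):
    """A hypothetical function of our own that happens to be named reader."""
    return text.upper()


data = io.StringIO("title,author\nCorduroy,Don Freeman\n")

# Our own reader and the module's reader coexist thanks to the csv. prefix.
shouted = reader("corduroy")
rows = list(csv.reader(data))
print(shouted)
print(rows)

# DictReader was imported by name, so it needs no prefix.
data.seek(0)  # rewind the stand-in file before reading it a second time
books = list(DictReader(data))
print(books[0]["title"])
```

Had the example used from csv import reader instead, that import and the local definition would compete for the same name, and whichever came later in the file would win.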
So if I want to read these, I'll say, code reads.py. I now have this list of books and I've imported the CSV library so I can read books from this CSV and add them to my shelf, so to speak. So to open a file in Python, well, there's a few different ways to do it. I could simply just call open. I could say open books.csv. But if I do that, I later have to do something with the file here. I'll say it like, file equals open. And then, I have to close it. I could say close file like this. Or is it fclose? Pretty sure it's close. But you can double check me on that. But we actually don't have to worry about this at all if we just say let's not just open it like this. Let's actually open it within a certain context. Only open it for a little bit. And once we're done with that file, go ahead and close it afterwards. We can say with open this file and let's call it something, like file, in this case. So we're going to open books.csv and call it file, with open this file as file. Let's do the following code indented. So while we're indented here, our file will be open. We can do all kinds of things with it. But once we unindent, we go back out, our file will be closed. We can't do anything more with it here. So this takes care of running close on our file or figuring out when to open it versus close it. Python handles all of that now for us using this with syntax here. OK, so let's see this in a little more depth. We saw with open, whatever file name we have, as file. You can change file here. We can also call this maybe even books_file. Doesn't have to be just file. But here, we'll call it file. We can then read it in a few different ways. And one way doesn't even use the CSV module. We could do this. Text equals file.read, where .read is some method associated with a file that lets us read in all the data that's part of it and store it inside some variable. So here, let's do the same. Let's say, text equals file.read. 
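Both ways of opening a file described above can be sketched side by side. The example first writes a tiny stand-in for books.csv into a throwaway temporary folder, which is an assumption made only so the code runs anywhere.

```python
import os
import tempfile

# Create a small stand-in for books.csv in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "books.csv")
with open(path, "w") as file:
    file.write("title,author\nGoodnight Moon,Margaret Wise Brown\n")

# Manual style: open, use, then remember to close.
file = open(path)
text_manual = file.read()
file.close()

# Context style: the file is closed as soon as the indented block ends.
with open(path) as file:
    text = file.read()

print(text)
print(file.closed)
```

In Python the closing call is the method file.close(); fclose is the C function. With the with syntax, neither needs to be written out.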
And this is not using the CSV library. We don't even need this right now. We could simply say, Python of reads.py. And we've maybe read our file. It's hard to tell, so let's go ahead and maybe print out the text. And run Python of reads.py. And now we see, in our terminal, well, all the same text we had before. Title, author. Goodnight Moon, Margaret Wise Brown. All of that is in our terminal now. But this isn't very useful, right? If we wanted to actually read in some data, store it in our bookshelf, well, that isn't going to help us add to our list of books, right? All of these books are still jumbled together. We don't really want that. We want to actually differentiate them and have dictionaries for every book in our CSV. So that's where this function-- this method called DictReader comes in. We can actually use the CSV module to define a special way to read our file. And we can then use it like this. We could say, give us a new file reader, this one is special, from the csv.DictReader function. And let's actually use that to go through every individual row in our file and do something for those rows. So it's best shown by example here. If we go back to our code, let's not just read all the text. Let's go ahead and get a special reader for our file. Let's say I want a file reader. And I'm going to make sure this is the DictReader inside of the CSV module. Well, if I read the documentation, I know that DictReader needs access to a certain file to read. So I'll give it my file here that I've opened, books.csv. And DictReader will give us back some special data structure that we'll call file reader. And we can use this in our code as follows. I have to loop over it similar to a list in Python. And I can do that with for syntax. I could say for whatever in file reader. And let's maybe call this-- because we know we're looking at books-- let's call this book. For book in file reader. And now let's just print out book to see what we get here. 
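The DictReader loop just described looks roughly like this. As an assumption for the sake of a runnable example, io.StringIO holds a shortened copy of books.csv in place of the opened file.

```python
import csv
import io

# Stand-in for the opened books.csv file.
file = io.StringIO(
    "title,author\n"
    "Goodnight Moon,Margaret Wise Brown\n"
    "Corduroy,Don Freeman\n"
)

# DictReader reads the first row as column names, then yields one
# dictionary per remaining row.
file_reader = csv.DictReader(file)

seen = []
for book in file_reader:
    print(book)
    seen.append(book)
```

With a real file, the loop would sit inside the with open block, since the reader pulls rows from the file only while it is still open.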
So I'll just say Python of reads.py. And now, I see individual books as dictionaries. So DictReader has done a lot of stuff for us. It said, well, I know that from your CSV file, you have these columns called title and author. And I also know that every individual row is going to be some particular element that you're interested in, where maybe this column corresponds to title and this column corresponds to author. So what I'll do is I'll give you each of those rows as a dictionary with the keys that are your column names and the values that are whatever is inside every individual row right here. Notice how we have, in this case, title as the key. Goodnight Moon is the value. Author is the key and Margaret Wise Brown is the value. And it's the very same thing all the way through our CSV. Now, it's printed out in our terminal as individual dictionaries. So if we've printed these out, adding them to our list is pretty simple. We can just say books.append an individual book. And now, if we clear our terminal and run Python of reads.py, well, we don't see anything because I didn't print it out. So do print books down below here. And now, we see all of our books inside of our list. And this is helpful because, again, we could just print out individual books or book titles. I could say for book in books. Let me go ahead and print out the book title. And I should see-- instead of all of my information on all my books, like we saw earlier with just our .read method, I can say Python of reads.py. Now, I see just the titles formatted very nicely for myself here. So if we go back, this is how we're going to actually read CSV files. There's other ways too, like csv.reader. But this isn't going to quite be as useful for us because let's take a look at this. If we say csv.reader, let's go ahead and say for row in file reader, print row. Let's see what we get instead. Python reads.py. Well, we get just a list. 
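Here is a minimal sketch of both steps above: collecting the DictReader rows into the books list, then reading the same data with csv.reader for comparison. The CSV_TEXT constant is a shortened stand-in for books.csv.

```python
import csv
import io

CSV_TEXT = "title,author\nGoodnight Moon,Margaret Wise Brown\nCorduroy,Don Freeman\n"

# DictReader: each row is a dictionary keyed by the column names.
books = []
for book in csv.DictReader(io.StringIO(CSV_TEXT)):
    books.append(book)

titles = [book["title"] for book in books]
print(titles)

# reader: each row is a plain list of strings.
rows = []
for row in csv.reader(io.StringIO(CSV_TEXT)):
    rows.append(row)

print(rows[0])
print([row[0] for row in rows])
```

When opening a real CSV file for the csv module, the Python documentation recommends passing newline="" to open, which is a small detail worth knowing even though the section's example works without it.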
And notice how it's actually included the column names in a single list. So reader gives us back every row of our file as a list. But this isn't quite as handy because I have to know that, for example, if I want to print out the book title, well, that's going to be a row where my titles are in the very first index. So I'll say row[0] to get the titles. Python reads.py. And I get title, Goodnight Moon, Corduroy. That works, but it's not quite as clean as being able to name the actual attributes of my book that I want. And that's where DictReader comes in. By reading our rows as dictionaries, we get access to those individual keys we can use throughout our code here. So for book in file reader, I can instead print, in this case, the book's title. Oops, the book title. Python reads.py. And now, I see all of this. And DictReader also knows, if you're curious, not to print out the actual first row because it assumes that these are going to be the key names, unlike reader, which does not make that same assumption, OK? So having done this tour of these libraries and these modules and how to read in different pieces of data, this is really going to give you a lot of tools to use on this week's problem set. You'll be actually working with these very similar files, CSVs, or even just text files. And as you go through, it's good to keep in mind all that we learned right here. How to open a file, how to read it in using DictReader, and so on. And as you go off into this week, feel free to use all this stuff from the section. Thank you all for coming today. Wonderful to see you. We'll see you next week.