DAVID MALAN: This is CS50's Introduction to Programming with Python. My name is David Malan. And this is our week on unit tests. Up until now, we've been writing a lot of code, and you might have been testing your code by running your program, and passing in some sample inputs, and running it again, and passing in some sample inputs, or you might have been waiting for us to test your code instead. 

But it's actually much better practice to get into the habit sooner rather than later of testing your own code using code of your own. In fact, whether you're writing a personal project or working in industry, it's very common nowadays to not only write code to solve the problems that you want to solve, but also to write a little extra code to test the code that you wrote. And that's what we're going to focus on today, writing our own tests so as to be all the more confident, all the more certain, that the problems we have been trying to solve are, in fact, solved correctly. 

So let's rewind a few weeks now to a program we wrote a while back, namely to calculate numbers. And specifically, we left off with this calculator on trying to compute the power of a number, like x squared, where x might be 2 or 3 or some other number as well. Let me go ahead and resurrect that file by going into my terminal window here and running again code of calculator.py. 

And let me go ahead and pick up where we left off way back when by defining a main function here. And then in my main function, I did something like this. I said, x equals int of input. And I ask the user, what's x, question mark? And then I immediately went ahead and printed out something like, x squared is, and then I passed in as a second argument to print the result of calling a function called square passing in that value, x. 

Now of course, I haven't yet implemented the square function. So let's define that as well. Let me go down a couple of lines and define square. And it takes an argument, recall, a parameter that at the time I called n, for number, so I'll do that again, though I could technically choose any name for this variable. And I, recall, did this. I returned n times n. 

And there were multiple ways to do this. Squaring a number is multiplying it by itself. So I could also use other syntax here, but this is what we ultimately settled on, and then recall that I ultimately called main in order to kick off the process of running this program. So just as a test manually, let me go ahead and run Python of calculator.py and hit Enter. 

What's x? Let's start with 2. x squared is 4. I think that's correct. So let's run it again, just for good measure. Python of calculator.py, let's type in 3 for x this time. X squared is 9. And I think that's correct. And I might be feeling pretty good at this point, and I go off and submit my code to a course, or I post it on the internet for others to use. 

But I haven't really methodically tested this code. And it's not necessarily the case that it works entirely. In fact, I haven't really considered a number of corner cases. I went with some pretty obvious numbers like 2 and 3, but what about 0? What about negative numbers? What about any number of other infinite numbers? We're not going to test an infinite number of inputs to this, because the program would never halt, but we should test some representative inputs ultimately. 

But before we do that, let's get into the habit of making sure that main isn't always called. Let's adopt this habit, again, of doing if __name__ == "__main__", only then should we execute main. And I'm doing this now proactively, because I want to make sure that when I import my square function, perhaps from another library, from another file, treating it as though it's a library, I want to make sure that main is not just automatically called itself. 
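To make the effect of that conditional concrete, here is a minimal sketch. It assumes calculator.py looks like the version described so far, writes it to a temporary folder, and imports it the way a test file would. The importlib, pathlib, and tempfile calls are only scaffolding so the example runs on its own; they are not part of the lecture's code, and the name CALCULATOR_SOURCE is made up for illustration.

```python
import importlib.util
import pathlib
import tempfile

# Assumed contents of calculator.py, reconstructed from the description above.
CALCULATOR_SOURCE = '''
def main():
    x = int(input("What's x? "))
    print("x squared is", square(x))


def square(n):
    return n * n


if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as folder:
    path = pathlib.Path(folder) / "calculator.py"
    path.write_text(CALCULATOR_SOURCE)

    # Load the file as a module named "calculator", as an import would.
    spec = importlib.util.spec_from_file_location("calculator", path)
    calculator = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(calculator)  # __name__ is "calculator", so main is skipped

# No prompt appeared during the import, yet square is available.
print(calculator.square(3))
```

Without the conditional at the bottom of the file, the import step itself would stop and ask "What's x?" before any test could run.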

Now what do I want to do from here now that I've modified this program as follows? Let's go ahead and write a completely different program whose sole purpose in life is to now test this program. So I've got my actual calculator in calculator.py. I've readied myself to call main conditionally so that I can safely import one or more things from this file in another file. 

What should that other file be? By convention, I'm going to create a file that's called test_, and then because the thing I'm testing is this calculator itself, let's call this file test_calculator.py. That's going to give me a new tab, in which I can write a brand new program whose purpose in life is now specifically to test that program, but really that program's specific functionality. Built into that program is the square function. Let's focus on testing that function. 

So how do I access that function in this program? Recall that I can import a function from another file as though it's a library of my own, a so-called module. So I'm going to do this: from calculator import square. I could go ahead and just import calculator itself. But then I would have to prefix my use of square, recall, by saying calculator dot everywhere, and it's just a little cleaner to just import the one function. 

And now let me go ahead and do this. Let me go ahead and define a function called test_square. This too is a convention. If you want to test a function called square, your function for testing should be called test_square. Or alternatively, you could do square_test. I'll adopt this convention here. 

Now what kind of tests can we do? I don't dislike the tests I ran earlier, testing x equals 2 and x equals 3. But every time I want to test my program previously, I would have to do that manually. And that's going to get tedious. It's not going to be easy for someone else to test it. And if I'm actually working in the real world, it would be nice if I could automatically have my program tested again and again by having some automated process run my own code. So let's do that and take the human ultimately out of the equation. 

So how might I go about testing the square function that I've now imported per line one? In my test square function, why don't I do this? If the result of calling square of 2 does not equal 4, why don't we go ahead and print an error message, because I know that in the real world, 2 squared should equal 4, so if square of 2 does not equal 4, there's a bug in my program. There's a bug in my function. I've made a mistake. 

So let me go ahead and print something like that so I or someone else knows 2 squared was not 4, for instance. So I could print out anything here. What should I maybe next test? Let's do more than one test. Let's say if the square of 3 does not equal 9, that is, 3 squared, then let's go ahead and print out that 3 squared was not 9. 

So I haven't done any more testing than I did earlier. But I've baked those two tests, x equals 2 and x equals 3, into my own code here, so I can now run those tests automatically, if you will. Now, it's not enough to just define a function called test square. I actually, if I want to run this function, need to call it somehow. And our convention for doing that is the same as always. 

In this file too, let me define main. And main's sole purpose in life is going to be to test square. And now at the bottom of this file, as before, let me go ahead and adopt my convention of if __name__ == "__main__", then go ahead and call main. 

So a lot of this is just boilerplate. We've seen this before, defining a main function and calling a function to kick off some process, now adding the conditional at the bottom of the file to make sure I'm only conditionally calling main, just in case I import anything from this file elsewhere. So let's see. Let's go ahead and test my code now. Let me go ahead and run python of test_calculator.py and hit Enter, and nothing outputs. Nothing outputs. 
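Put together, the test file described up to this point might look like the sketch below. One assumption to flag: in the lecture, square comes from calculator.py via from calculator import square, but here it is defined inline so that the example runs on its own.

```python
def square(n):
    # Stand-in for "from calculator import square" (an assumption made
    # only to keep this sketch self-contained).
    return n * n


def main():
    test_square()


def test_square():
    # Print a message only when a result is wrong; stay silent otherwise.
    if square(2) != 4:
        print("2 squared was not 4")
    if square(3) != 9:
        print("3 squared was not 9")


if __name__ == "__main__":
    main()
```

With a correct square, running this prints nothing, which matches the silent run described above.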

But I think it's OK. I think no output is good, because look at my test square function. I'm not printing anything if all seems well. So let's demonstrate as much by going back to my calculator, and let me break it. Let me introduce a bug. Maybe I didn't even get it right the first time. Maybe my code originally looked like this. I wasn't thinking. I forgot my squares. And so I thought that the square of a number is n plus n, instead of n times n, so a reasonable mistake to make, perhaps arithmetically. 

Let me now go back to my test calculator, which I'm not going to change, but I am going to rerun it, python of test_calculator.py. I'm going to cross my fingers here, but for naught, I'm going to see immediately that 3 squared was not 9. Now what is it? Let's see, when your tests fail, how can we put our finger on what's wrong? It's a little interesting that I completely broke my square function, and yet only one of these tests is failing. 

It looks like this test, lines 9 and 10, is fine, because I'm not seeing that output. But of course these two lines, this test, is failing, because 3 squared is not 9 when I'm using plus. So just to be clear here, why is my function only partially broken? Why am I seeing only one error instead of two, even though the square function is now mathematically broken? 

SPEAKER 1: Because 2 plus 2 is 4. 

DAVID MALAN: Yeah, it's as simple as that. I just got lucky that 2 plus 2 is the same thing as 2 times 2. So this is one of those corner cases, and this is why it's good to be in the habit of not just testing one thing, but test several and make sure you're covering your bases, so to speak. 

So I got lucky here. And that explains why I'm seeing only one error, even though the function itself is flawed. But let me propose that there's another way we could do this, because honestly, if I extrapolate from this simple example, running not just two tests but 3, or 4, or 10, or 20 tests, you can imagine that, my God, the code is going to get so much more complicated than the function itself. 

Already, look, in calculator.py, the function in question is two lines long. And yet in test_calculator, the code in question is five lines long. I've written more code to test my code than I actually wrote original code. So the fewer lines of code we can write when testing code, I think the more likely you and I are to do it, because it's going to be literally a little less work and just fewer opportunities for mistakes. 

So what's another approach I can take here? It turns out in Python, there is another keyword that we haven't yet used, which is this here, assert. Assert is a keyword in Python and some other languages as well that allows you to do exactly that, as in English, to assert that something is true, to sort of boldly claim that something is true. And if it is, nothing's going to happen. No errors are going to appear on the screen. 

But if you assert something in Python, and it is not true, that is, the thing you're asserting, a Boolean expression, is false, you're actually going to see some kind of error on the screen. So let's go ahead and try this new keyword as follows. Let me go back to my code here. And just to make it a little simpler, let me propose that I use this new keyword as follows. 

Let me simply assert that the square of 2 should equal 4. So I've changed my logic. Instead of checking for not equals, I'm now asserting very loudly that it should equal 4. And then on one additional line, let me do the other test, assert that the square of 3 equals equals 9. And that's it, no indented print. I'm just going to assert more simply these two things that I want to be true. 

Let me go ahead now, with calculator.py still broken. I'm still using plus accidentally, instead of multiplication. Let me go ahead now and run Python of test_calculator.py, crossing my fingers as always, but it's not going to go well this time. A whole lot of errors seem to appear on the screen. And if I scroll up here for this traceback, we'll see that the thing that failed was this line here, assert square(3) == 9. 

Now unfortunately, when you're using the assert keyword, it's not terribly user friendly. It shows you the files and the line numbers involved, but it does show you the specific line of code that failed, the assertion that failed, so to speak. It's now kind of up to you and me to infer from this, wait a minute, why is the square of 3 not equal to 9? So it's not super user friendly, but honestly, it was half as much code for me to write. It's just two lines, instead of those previous four. 
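As a minimal sketch of the assert-based version, assuming the same main and conditional as before, the test function shrinks to two lines. Again, square is defined inline here only so the example runs by itself; in the lecture it is imported from calculator.py.

```python
def square(n):
    # Correct version. With the bug (n + n), the second assertion below
    # would raise AssertionError and Python would print a traceback.
    return n * n


def main():
    test_square()


def test_square():
    # Each assert does nothing when its expression is True.
    assert square(2) == 4
    assert square(3) == 9


if __name__ == "__main__":
    main()
```

Because both expressions are true with the correct function, this version also runs silently.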

But notice this little remnant down here. This was an assertion error. And we have seen errors before. We've seen errors before when we've made other mistakes in our code. And in the past, what was our solution for catching those errors? How do we catch errors that seem to resemble this, even though we've not seen this one before? 

SPEAKER 2: Try and except. 

DAVID MALAN: Yeah, in Python, we can use the try and except keywords to try to do something, optimistically, except if something goes wrong, do something else instead. So this is a step forward, in that I can at least catch this error. But it's going to be perhaps a step backward, in that I'm going to end up writing, I'll admit in advance, a little more code instead. 

So let me go ahead and try this. Let me go back into my code here. And instead of just asserting, blindly, let me go ahead, as Tola proposed, and try to do this first assertion, except if there is an assertion error, like we saw a moment ago, then go ahead and print out something more user friendly that explains what actually failed. 2 squared was not 4. And let me go ahead similarly and try to assert that the square of 3 equals 9, except if there's an assertion error there, in which case, I'm going to print out, more user friendly, 3 squared was not 9. 

So I've taken a step forward, but also a step back, because now I have more code. But I have at least introduced assertions and exceptions in a manner consistent with how we've seen in the past. When something goes wrong, you actually see an exception raised. Let me go ahead and run this version of the program now instead. Python of test_calculator.py, crossing my fingers, it still failed, because I'm seeing output. But we're back to at least user friendly output. 
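A sketch of this try/except version follows, with the bug deliberately left in so the friendlier message actually appears. As before, defining square inline rather than importing it from calculator.py is an assumption made for self-containment.

```python
def square(n):
    return n + n  # deliberate bug from the lecture; the fix is n * n


def main():
    test_square()


def test_square():
    # Catch each failed assertion and report it in plain words.
    try:
        assert square(2) == 4
    except AssertionError:
        print("2 squared was not 4")
    try:
        assert square(3) == 9
    except AssertionError:
        print("3 squared was not 9")


if __name__ == "__main__":
    main()
```

Only the second message is printed, since 2 + 2 happens to equal 2 * 2.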

So that's at least progress in some way here. But it's, again, more code than might have been ideal. And in fact, if we continue this further, what if we actually want to add additional test cases here as well? It seems like we might end up writing way more code than would be ideal. For instance, I'm testing 2 and 3 now. I should probably test some negative numbers as well. 

So why don't I go ahead and add in, for instance-- let me go ahead and copy and paste this. Let me try to assert that the square of negative 2 equals equals 4, which should be the case mathematically. And if not, let me go ahead and change this to say negative 2 squared was not 4. And let me go ahead and copy paste this again, test another negative number, just for good measure. Let's test the square of negative 3, which should equal 9. But if it doesn't, let's go ahead and say that negative 3 squared was not 9. 

And just to think aloud here, what might be another good value to test? I've tried 2. I've tried 3. I've tried negative 2. I've tried negative 3. I can't try infinite numbers. But there's at least something that's a little different in between those values. Let's try 0. 0 is an interesting case too, just in case something might be wrong. And why 0? I'm just going with instincts here. 

Odds are positive numbers are generally going to behave the same. Negative numbers might generally behave the same. 0 might be a little anomalous. There's no science to it necessarily, but rather considering for yourself based on your own experience, what are the potential corner cases based on the function you're trying to test? I'm trying to test something mathematical, so I want to test representative values. 

So let me go ahead and paste in one more try except block. Let's assert that the square of 0 should equal 0. And if not, I'll say something explanatory, like 0 squared was not 0. Now if I go ahead and run this, Python of test_calculator.py, and hit Enter, now I see multiple errors. And this is interesting. It's a bit of a clue, because notice that some, but not all, of these assertions are failing. 

The one for 2 squared is apparently OK, as we noted earlier. Recall that 2 squared happens to be 2 plus 2. So that bug doesn't really throw off our test, but it's a good thing we tested for 3. It's a good thing we tested for negative 2 and negative 3, because all of those tests caught this error. The 0 test did not notice, because 0 squared is, of course, 0, and 0 plus 0 is also 0. So we're getting lucky or unlucky there, depending on how you view the glass as half full or half empty here. We at least by way of having multiple tests caught this mistake somehow. 
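The arithmetic behind which inputs expose the bug can be checked directly. In this small sketch, buggy_square and caught are hypothetical names introduced only for illustration; the five inputs are the ones tested above.

```python
def buggy_square(n):
    return n + n  # the mistaken version: addition instead of multiplication


inputs = [2, 3, -2, -3, 0]

# Keep only the inputs where the buggy result differs from the true square.
caught = [n for n in inputs if buggy_square(n) != n * n]

print(caught)
```

The inputs 2 and 0 are absent from the result because n + n and n * n coincide there, which is why those two tests pass despite the bug.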

So it would be nice, though, if we weren't writing so much darn code here, because notice what I've done. I have try, except, try, except. I have all of these assertions. I have a main function. I have this if conditional at the bottom of my file. Honestly, who's going to want to write 31 lines of code now just to test a two line function? No one's going to write test code like this if we're all writing so much more code to do the actual testing. 

So people have solved this problem. If you are in the habit of testing your code a lot, or wanting to, if I'm in the habit of wanting to test my code a lot, if everyone else in the real world is in this habit of wanting to test their code, why don't we create tools that make it a little easier to do so? And in fact, there is a mechanism for doing this, whereby we can use a tool that's popularly called pytest. 

So pytest is a third party program that you can download and install that will automate the testing of your code, so long as you write the tests. But what's nice about this library and others like it is that it adopts some conventions so that you don't have to write as many lines of code yourself manually. They do some of that automatically for you. 

Now this is a third party library. There's other libraries for unit tests, so to speak, that is testing units of your code. Some of them come with Python itself. We're proposing that we look at pytest today because it's actually a little simpler than the unit testing frameworks that come with Python itself. And what do we mean by unit testing? Unit testing is just a formal way of describing testing individual units of your program. What are those individual units? They're typically functions. So unit tests are typically tests for functions that you have written. 

Now what does this mean in practice here? Let me go back to my VS Code here and let me propose that we simplify my test calculator significantly. I'm going to go ahead and delete all of these tests, which were accumulating to like 31 lines of code. And let's see if we can distill the tests to their essence, using pytest. 

From my same calculator program, let me still import square. So I do still need that line of code so that I can test that specific function. Now I'm going to go ahead and define a function, just like I did before, as follows. I'm going to define a function called test_square, again by convention, test underscore and the name of the function you want to test, though it doesn't have to be that way. And now I'm going to go ahead and make a few assertions. 

I'm going to assert that the square of 2 should equal 4. I'm going to assert that the square of 3 should equal 9. I'm going to assert that the square of negative 2 should equal 4. And I'm going to assert that the square of negative 3 should equal 9. And lastly for now I'm going to assert that the square of 0 should equal 0. 

So I'm still using the assert keyword, as I introduced earlier. And even though it was a little tedious to type those, it's only eight lines of code now. And they're so easy to type. It's not try and except and all of this. Wouldn't it be nice if something else, someone else, handled the try, the except, the printing, all of the standardization of actually running these tests? And that's where, indeed, pytest comes into play. 
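The slimmed-down test file might look like the sketch below. In the lecture its first line is from calculator import square; here square is defined inline instead, an assumption made so the example can run without a second file.

```python
def square(n):
    # Stand-in for "from calculator import square".
    return n * n


def test_square():
    assert square(2) == 4
    assert square(3) == 9
    assert square(-2) == 4
    assert square(-3) == 9
    assert square(0) == 0
```

There is no main and no conditional here: saved as test_calculator.py, the file is meant to be handed to pytest, which finds and calls test_square itself.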

Per the documentation for pytest, which can itself be installed with pip install pytest, which we've used to install other libraries in the past, you can look at the documentation here for all of its formal usage. But fortunately, pytest is pretty user friendly, as testing frameworks go, and it actually allows us to dive right in by just running pytest on the code that we've written. 

So if I go back to VS Code here and look at my test_calculator.py, which, notice, has no main function anymore-- it has no conditional. It has no tries. It has no excepts. It has no prints. It just has my few assertions-- pytest and other libraries like it are going to automate the process of running these tests for me and informing me on the screen whether or not any of those tests failed. 

So let me go ahead and do this. I'm going to go ahead and increase the size of my terminal window for a moment, just so we can see more on the screen. And I'm going to run not python, as I've been doing. I'm going to run pytest, which, again, is this third party tool for running tests in your code. I'm going to run pytest of test_calculator, so that same file. I'm going to cross my fingers as always and hit Enter, and we'll see that something has failed. 

Now admittedly, even though I do think you'll find that pytest is relatively simple to use, its output, at least at first glance, is not necessarily super user friendly. So what are we seeing here? Notice at the very top of my window is the command that I ran after my prompt. Right below that is a single F in red, which means fail, so not very encouraging. I tried really hard here, but fail is my grade on this program. 

But let's see exactly what happened. If I look at this excerpt here under failures, you'll see that test_square is the function that failed. That makes sense, because that's the only one I wrote. And you'll see here somewhat arcane output describing what the error was. So what you're seeing here is the first line, assert that square of 2 equals equals 4, which is fine. There's no red error message below that, so that one's OK. 

But this line of code here, assert that square of 3 equals equals 9, pytest did not like that assertion, because it didn't end up being true. In fact, per the red E at the start of this line, you'll see that I'm effectively trying to assert that 6 equals equals 9. Now, where did the 6 come from? Wait a minute, if my test involves this, notice that where 6 equals square of 3, this is saying that because I've called square, passing in a value of 3, it turns out its return value is 6. And of course, mathematically, 6 does not equal 9. So that's why this is failing. 

Now, pytest is not as user friendly as telling you exactly why the bug is there or how to fix it. This is really just a clue to you what must be wrong. What you're seeing here is a clue that the first test passed, because there's no red error below that line of code, but this test failed. Somehow or other, your square function is returning 6 when passed in 3 instead of 9. 

So at this point, you sort of put your detective hat on, you go back to your actual code, and you think about in calculator.py, how in the world is line 7 of my square function returning 6 instead of 9. And at this point, odds are the light bulb would go off above your head proverbially, and you would see, I'm using addition, instead of multiplication. 

But what pytest has done for us is automate the process of at least pointing out that error for us. And if I now go in and fix this-- let me go ahead, and the light bulb has gone off. I change the plus to a multiply. Now I'm going to go ahead, and after clearing my screen, I'm going to run not Python, but pytest of test_calculator.py, crossing my fingers again. And now it's green. And I see just a dot, which indicates that my one and only test passed. I'm good, 100% success with my test now after fixing that bug. Let me pause here and see if there's any questions. 

SPEAKER 3: So my question is, what if a user, instead of, because we are taking input from the user, what if the user is somewhat malicious and types in a string instead of an integer, or maybe he types in a float or some other data type? 

DAVID MALAN: Yeah, so what if the user, like we've seen in past examples, types in cat, instead of a number, when we're expecting an integer? How do we test for something like that? At the moment, I'm admittedly not testing user input. If I go back to my code here, notice that my calculator program, of course, has the square function that we keep testing and retesting. 

But notice that all of the user input is currently relegated to my main function. And admittedly, as of now, I am not testing my main function. So there could be one of those bugs. And in fact, there would be, because if the user types in a string, like cat, instead of an integer, like 2 or 3, then line two recall would actually raise a value error exception. So we've seen that before. 

So when it comes to testing your code, this is actually a good reason for having multiple functions in your program. Rather than putting all of your logic in just the file itself, rather than putting all of the logic in just main, it's actually really good, really helpful practice to break your ideas up into smaller bite-sized functions that themselves are testable. 

And what do I mean here? Square is perfectly testable. Why? Because it takes as input a parameter called n, and it returns as output an integer, which is going to be the square thereof, hopefully. It has a well-defined input and a well-defined output. It is therefore completely within your control in your test program to pass in those values. 

Now I will say, if you want to test whether square behaves properly when passed something like a string, like, quote, unquote, "cat," we could absolutely do something like this, assert that the square of quote, unquote, "cat," it's not going to equal something. You can actually, using different syntax, assert that a specific exception will be raised. 

So if we were actually going to go back into our square function, improve it, and deliberately raise an exception, we could test for that too. But for now, I'm deliberately only testing the square function. I'm not testing for specific user input. But that's another problem to be solved. Other questions now on unit tests? 

SPEAKER 4: Do you use unit tests to test code for the CS50 check? 

DAVID MALAN: So check50 is similar in spirit. check50 is a tool that we, CS50, wrote that is essentially doing something like pytest for the evaluation of students' code. It is similar in spirit, but think of check50 as being an alternative to pytest, if you will. But it works a little bit differently. But same idea, pytest and unit testing more generally is a technique that is independent of CS50 and is something that you can and should be doing on your own code, both in or outside of this class. How about one other question here on our unit tests? 

SPEAKER 5: My question is, instead of writing it out four times, like the square of 2 equals 4, can we write the numbers we want in square brackets, instead of writing four lines? 

DAVID MALAN: A really good question, absolutely. Right now if I go back to test_calculator.py, it's indeed pretty manual. It took me a while to say and to type out those several lines, and you could imagine writing some kind of loop to just assert in a loop that this equals that, that this equals that, and so forth, using maybe a list or a dictionary or some structure like that. 
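One way the loop idea from this exchange could look is sketched below. The dictionary name cases and its layout, mapping each input to its expected square, are illustrative choices rather than anything shown in the lecture, and square is again defined inline instead of being imported from calculator.py.

```python
def square(n):
    return n * n


def test_square():
    # Hypothetical table of inputs and the squares we expect back.
    cases = {2: 4, 3: 9, -2: 4, -3: 9, 0: 0}
    for n, expected in cases.items():
        assert square(n) == expected
```

The trade-off is that the loop stops at the first pair that fails, just as a run of separate assert lines in one function would.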

So yes, you can absolutely automate some of these tests by not just doing the same thing again and again. You can still use all of the syntax of Python to do loops. But generally speaking, your tests should be pretty simple. And in fact, let me propose that we improve upon even this design further, because at the moment what's not really ideal, when I run all of these tests when my function is buggy, is notice the output that I got. 

Let me reintroduce that same bug by changing my multiplication back to addition. Let me increase the size of my terminal window again. And let me run pytest again of test_calculator.py. So this is the version of my code now that has the bug again. So I'm going to see that big massive failure where this failure has been displayed to me. 

But this is not as helpful as it could be, because I have all of those other tests in my code. Recall that I had, what, one, two, three, four, five separate tests, and I'm only seeing the output of the first. Now, why is that? If we go back to my code here, you'll see that the first assertion that's failing, namely this one here, that assert of square of 3 equals equals 9, the other tests aren't even getting run. 

And that's not a big deal in the sense that my code is buggy, so one or more of them are probably going to fail anyway, but wouldn't it be nice to know which of them are going to fail? And in fact, it's ideal to run as many tests all at once as possible to give you as many clues as possible to finding your bug. So let me propose that we improve the design of my testing code now, still using pytest as follows. 

Instead of having one big function called test_square that tests the entire function itself with so many different inputs, let's break down my tests into different categories. And here, too, there's no one right way to do this. But my mind is thinking that I should maybe test positive numbers separately, test negative numbers separately, and test 0 separately. 

I could think of other ways. I could test even numbers. I could test odd numbers or maybe some other pattern altogether, but separating this big test into multiple tests is probably going to yield more clues for me when something goes wrong. So let me do this. Let me go ahead and rename this function to test_positive initially, and let me include in that function only those first two tests. Let me then create another function here called test_negative. And in this function, let me test only negative 2 and negative 3. 

Then down here, let me do one more def of test_zero, and I'll just run one test in there. So I have the same assertions, the same five, but I've now divided them up among three separate functions. What's nice about pytest and other unit testing frameworks is that all three of these test functions will be run automatically. Even if one of them fails, the others will be attempted. That means that if one or two or three of them fail, I'll have one or two or three clues now for helping me find that mistake. 
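Split up this way, the test file might read as in the sketch below. The same caveat applies: square stands in for from calculator import square so that the example is self-contained.

```python
def square(n):
    # Stand-in for "from calculator import square".
    return n * n


def test_positive():
    assert square(2) == 4
    assert square(3) == 9


def test_negative():
    assert square(-2) == 4
    assert square(-3) == 9


def test_zero():
    assert square(0) == 0
```

The assertions are the same five as before; only their grouping has changed, so that each category can pass or fail on its own.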

So let me go ahead and again increase the size of my terminal window, just so we can see more on the screen. My calculator still has the bug, using addition, instead of multiplication. Let me go ahead and run not Python, but again, pytest of test_calculator.py, crossing my fingers as always, and now, oh my God, there's even more errors on the screen. But this in itself is more helpful. Let's work through them from top to bottom. 

So under FAILURES here, in all caps, which I know is not very encouraging to see failure when you're just trying to solve a problem, but that's what these frameworks do, under FAILURES, the first function that failed is test_positive. But here, too, we see the same clue as before. The first one, 2, the square of 2 equals equals 4, that one is fine. It's not erring with any red errors. But the next one is failing. So I know that square is broken when I pass in 3. 

What about down here? It looks like, unfortunately, my test negative function is failing too. Why? When I pass in-- oh, this is interesting-- here now, negative 2 doesn't even work. So I got lucky with positive 2. But negative 2 isn't working. So that's a bit of a clue. But in total, only two tests failed. 

So notice at the very bottom, this summary, two failed and one passed. What's the other one? What was the third one? Test zero. So test zero is passing. These two are failing. And so that kind of leads me logically, mathematically, if you will, to the source of the bug. And just to be clear too, if you have a lot of tests, this little one-line output is helpful, even though also a bit discouraging: F, F, and dot, where F means fail and dot means pass. So there are the three tests just depicted graphically a little bit differently. 
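
To see why this pattern of results points at addition, here is a small sketch that compares the buggy behavior with the expected squares for the same five inputs. The name buggy_square is hypothetical; it just reproduces the bug described above, addition in place of multiplication.

```python
def buggy_square(n):
    # Deliberately wrong: addition instead of multiplication.
    return n + n


for n in (2, 3, -2, -3, 0):
    expected = n * n
    actual = buggy_square(n)
    status = "ok" if actual == expected else "MISMATCH"
    print(f"square({n}): expected {expected}, got {actual} -> {status}")
```

Only 2 and 0 happen to give the same answer under addition and multiplication, which is consistent with the first assertion in test_positive passing, test_zero passing, and everything else failing.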

Let me rewind now and go back in to calculator.py. Let's fix that bug, because let's suppose that I've deduced I'm using addition. I should have been using multiplication all this time. Let me now after fixing the bug yet again, let me go back to my big terminal. Let me run pytest of test_calculator.py, hitting Enter, crossing my fingers now, and dot dot dot means all is well. 100% of my tests passed, all three of them. So now I'm good. 

It doesn't necessarily mean that my code is 100% correct. But it does mean that it has passed 100% of my current tests. And so it would probably behoove us to think a little harder about maybe we should test bigger numbers. Maybe we should test even smaller numbers. Maybe we should test strings or something else. The onus is ultimately on you to decide what you're going to test. 

But in the real world, you're going to be very unhappy with yourself or someone else-- maybe your boss is going to be very unhappy with you-- if you did not catch a bug in your code, which you could have caught had you just written a test to try that kind of input. Let me pause again and see if there's any questions now on unit testing with pytest. 

SPEAKER 6: So if you wanted to test, like someone suggested before, user input as well as testing your function, do you do that within the same file? Or do you make separate files for different types of tests? 

DAVID MALAN: Really good question. You could absolutely make separate files to test different types of things. Or if you don't have that many, you can keep them all in the same file. At the moment, I've been storing all of my tests in one file for convenience, and there's not terribly many of them. But we'll take a look in a bit at an example that allows me to put them into a folder and even run pytest on the whole folder of tests as well. So that's possible. Other questions on unit testing. 

SPEAKER 7: So I've got two questions. So a little while ago, you just used an exception called-- I'm not sure what it was-- oh yeah, assertion error. What exactly does that particular error catch? And my second question is, does the assert keyword stand out to the compiler, exactly tell them to insert this particular line of code? 

DAVID MALAN: Indeed. The assert keyword we're seeing and the assertion error we saw earlier are intertwined. So when you use assert and the assertion fails, because whatever Boolean expression you're using is not true, it's false, an assertion error, by definition of Python, will be raised. So those two work in conjunction. 

Those errors, those assertion errors, are still being raised by my code here when any of these lines of code fail. However, pytest, this third party library, is handling the process of catching those exceptions automatically for me, so as to give me this standard output. 
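
A small sketch of the relationship between assert and AssertionError, catching the exception by hand the way a framework would on our behalf. The buggy square below is an assumption made for illustration only.

```python
def square(n):
    return n + n  # deliberately buggy, so the assertion below fails


try:
    assert square(3) == 9
except AssertionError:
    # A failed assert raises AssertionError, which can be caught like any exception.
    caught = True
else:
    caught = False

print("AssertionError raised:", caught)
```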

So we started today's story by really implementing unit testing myself. I wrote all of the code myself. I wrote main. I did my conditional. I did try and except. Honestly, it's going to get incredibly painful to write tests long term if you and I have to write that much code every time, especially when our function is this small. So pytest and unit testing frameworks like it just automate so much of that. Essentially, pytest adds the try, the except, the if, the prints for you, so you can just focus on the essence of the test, which really are these inputs and outputs. How about time for one other question here on unit testing as well? 

SPEAKER 8: So when we enter minus x or minus 5 squared, square root of that number comes up. But when we put 6.6 or 5.6, something like that integer, then line shows error. So what's happening there? 

DAVID MALAN: So I'm deliberately testing integers right now, in large part because I only want square to operate on integers. And that might be conveyed in Python's documentation or my own documentation for that function. If you were to pass in something else, like a float, it turns out that floating point values in Python and other languages are actually very hard, if not impossible, to represent 100% precisely. 

And so if you are trying to compare it against some other value, there might be slight rounding errors as a result. I'm just inferring from what you've described, but I'm very deliberately now testing this function with only the inputs that I would expect. It might indeed throw other errors if other inputs are passed. 
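
A brief sketch of the rounding issue, assuming one did want to test square with a float. It uses math.isclose from the standard library to compare within a tolerance instead of using equals equals.

```python
import math


def square(n):
    return n * n


result = square(0.1)

# An exact comparison can be thrown off by floating-point rounding.
print("exactly equal:", result == 0.01)

# Comparing within a tolerance is the usual workaround.
print("close enough:", math.isclose(result, 0.01))
```

pytest also offers pytest.approx for the same purpose inside assert statements.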

Allow me to propose that we consider what should happen if square isn't actually passed a number. For instance, if I go back to calculator.py, and suppose that I, or perhaps someone else using my square function, simply forgets to convert the return value of input from a str to an int, as by modifying line 2 here. 

Now, something's definitely going to go wrong if I type in a str instead of what appears to be an int. For instance, if I clear my terminal here, run Python of calculator.py and hit Enter-- let's type in cat as our value for x-- and of course, this raises now a type error. Why? Can't multiply sequence by non-int of type 'str'. What does that mean? You can't do cat times cat, because indeed, square is expecting that n will be some number. 
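
A minimal sketch of that failure without the input prompt: the string is passed straight to square, and the resulting TypeError is caught only so its message can be printed.

```python
def square(n):
    return n * n


try:
    square("cat")  # a str times a str is not defined
except TypeError as error:
    raised = True
    print("TypeError:", error)
else:
    raised = False
```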

But that doesn't necessarily mean that square itself is buggy. But this does mean that if I expect a type error to be raised, let's test for that too, so that I know the behavior indeed works as expected. So let me go back to test_calculator.py, and let me go and add a fourth test down here. How about define test underscore, and I'll call this test_str, because I'm going to specifically and deliberately pass in a str for testing. 

And I want to in spirit assert that passing in something like cat to square will raise a type error. But we don't use the assert keyword for that. Rather, we need this. Let me go to the top of this file, and let me additionally import the pytest library itself, because it turns out there's a function in that library called raises that allows me to express that I expect an exception to be raised. 

And I can express that as follows with pytest.raises, and then in parentheses I can pass in the type of exception I expect, which is going to be a type error in this case. And now when do I expect that type error to be raised? Whenever I do something like calling square and passing in not a number, but something like cat. So now if I go back to my terminal window, run pytest of test_calculator.py, this time having four tests, I should see that all four now are successful. 
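
A sketch of what this fourth test might look like, using pytest.raises as a context manager. As before, square is inlined here rather than imported from calculator.py, purely so the example stands alone.

```python
import pytest


# Stand-in for `from calculator import square`.
def square(n):
    return n * n


def test_str():
    # The test passes only if the indented code raises TypeError.
    with pytest.raises(TypeError):
        square("cat")
```

If square("cat") were to return normally, pytest.raises would itself fail the test.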

Let's now consider how we could test code that doesn't just expect numbers as input, but actually strings. And let me rewind us in time here in VS Code to that very first program we wrote a few different versions of in hello.py that ultimately looked a little something like this. I had a main function that prompted the user for the value of a variable by asking them, "what's your name?" question mark. 

And then we went ahead and did something like hello, open paren, name, passing that user's name into a function called hello. Now that function hello recall ultimately looked like this. We defined hello as taking a parameter called to, the default value of which was world, and that function very simply printed hello, followed by a comma, and then whatever the name that had been passed in. 

And then we ultimately called main, but from now onward, I'm going to always add this if conditional, if __name__ equals equals "__main__", then and only then do I want to call main. So that's essentially what this program looked like in its last incarnation. How do we go about testing it? Here again too, I'm not going to test the user's input per se in main. I'm going to focus really on the module of code here that's of interest, which is the hello function itself. How can I go about testing the hello function? 

Unfortunately, even if I start by doing something like code of test_hello.py-- let me go about and start writing a test program-- I could import from my hello program a function called hello. So a bit strange to see from hello import hello, but notice that on this line here, I'm importing from the module-- that is the file called hello.py-- the function called hello. 

And how do I go about testing this? If I have a function like define test_argument like this-- let me do this. So if I were to define a function like define test_hello, what could I do? I could call hello with quote, unquote, say, "David," and then check if it equals, what, "hello, David." 

So would this work, this approach here? If I've written a test, called test_hello, that calls hello with an argument of David and then tests its return value, just like we've done for our calculator, would this work as written? And let me go back to in just a moment the version of hello that we're testing. So you can see that function hello. Here's the test. Here is the actual code. Would this test now work? Any thoughts? 

SPEAKER 9: I think the problem is that in the first version in hello.py, you're using the to argument that you first declared, when you declared the function instead of using the name. 

DAVID MALAN: That is actually not a bug here. So let me stipulate that in hello.py, this code actually does work as intended. And let me go ahead and test it manually, just to demonstrate as much. Let me run Python of hello.py, typing in, as my name, D-A-V-I-D, and I see, in fact, that it says, "hello, David." 

If, though, I were to change this program, and get rid of the name argument, get rid of the name variable, and just call hello, again, running Python of hello.py, this time I'm not even prompted, because I got rid of my input call, but it does behave as I expect. It does say "hello, world." So let me stipulate that this code in its current form is actually correct, but my test is not going to work as I'd hoped. And there's a subtle difference between my hello function and my square function that explains. Why might this test not work as intended? 

SPEAKER 10: Because it's not returning a value. 

DAVID MALAN: Yeah, exactly. Recall our discussion early on about functions. Functions can either return a value, like my square function hands you back the square of some value, or they can have side effects, sort of visual artifacts that might happen on the screen, like printing something out on the screen. And by definition, that's how print works. Notice that hello, it is short, but it's implemented ultimately using the print function, which does not return a value as I'm using it here. It instead has this side effect of printing something onto the screen. 

So it is not correct in my test function to check if the return value of hello equals equals hello David, because again, hello is not returning anything. It's printing something, that side effect, but notice, literally, it has no return keyword, unlike my square function, which did. 
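
A short sketch of why that comparison cannot succeed. The hello below mirrors the printing version described above; since it has no return statement, calling it evaluates to None.

```python
def hello(to="world"):
    print("hello,", to)  # side effect only; nothing is returned


result = hello("David")

print("return value is None:", result is None)
print("equals the greeting:", result == "hello, David")
```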

So here's an opportunity to perhaps change how I go about implementing my actual functions. It turns out that as your programs get more and more sophisticated, more and more complicated, it tends to be best practice not to have side effects if you can avoid it, especially if you want your code to be testable. And in fact, I'm going to propose that we change my hello program to now work as follows. 

Let me go ahead and change this function to not print hello and then that name. Let me go ahead and literally return maybe an f-string, which will clean this up a little bit: hello, comma, then to in curly braces, with close quotes at the end. So my syntax here is just the familiar f-string or format string. It's going to return hello, world or hello, David or hello, whomever's name is passed in as that argument, but I'm returning it now. I'm not printing it out. 

So what needs to change up here? I could do something like this. I could say something like output equals hello and then print output in my main function. Or I can simplify that, because I don't really need that variable. I could instead just do this. I could still call hello, but I could immediately print out the result. 
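
A sketch of how hello.py might look after this change, pieced together from the description above. The exact prompt text is an assumption, and the guard that calls main is left as a comment so the sketch does not wait for input when run.

```python
def main():
    name = input("What's your name? ")
    # hello now returns the greeting, so main is responsible for printing it.
    print(hello(name))


def hello(to="world"):
    return f"hello, {to}"


# In hello.py itself, the file would end with the conditional described earlier:
# if __name__ == "__main__":
#     main()
```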

And this version of my hello program now is actually more testable. Why? Because these assert statements that we're using, and we've seen thus far for our tests, are really designed to test arguments into functions and return values therefrom, not testing side effects. 

So if you're doing equals equals, you're looking for a return value, something that's handed back from the function. So that's fine. If I modify the design of my program now not to just print hello, but to return the string, the sentence, the phrase that I want to construct, I can leave it to the caller-- that is the function who's using this hello function-- to handle the actual printing. 

Now what does this mean in my code? It means now if my hello.py looks like this, and hello is indeed returning a value, in my test_hello function, I can test it exactly like this. So let me go ahead and run pytest of test_hello.py, crossing my fingers as always, and voila, one passed. So I passed this test, because apparently the return value of hello does indeed equal "hello, David." 

Let's test the other scenario. What if I call hello without any arguments? Let's assert that calling hello with nothing in those parentheses similarly equals hello comma, but world, the default value. Let me now go ahead and run pytest of test_hello.py. And that too passes entirely. But there too, suppose that I had made some mistakes. Suppose that there were a bug in my code. It might not be best practice to combine multiple tests in this one function, so let's make it more clear what might pass or fail. 

Let's call the first function test_default. And let's only include this first line of code. And then let's go ahead and define another function, like test_argument, to test this other line of code here. So now I have two different tests, each of which is testing something a little fundamentally different. 
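
A minimal sketch of the two tests as they might appear in test_hello.py. The hello function is inlined in place of from hello import hello, an assumption made only to keep the example self-contained.

```python
# Stand-in for `from hello import hello`.
def hello(to="world"):
    return f"hello, {to}"


def test_default():
    assert hello() == "hello, world"


def test_argument():
    assert hello("David") == "hello, David"
```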

So now when I run my code, it's still not broken. If I run pytest of test_hello.py, Enter, I've now passed two tests. And that's just as good as before. But if I did have a bug, having two tests instead of one would indeed give me, perhaps, a bit more of a hint as to what's wrong. Questions now on this testing of return values, when these return values are now strings instead of integers and why we've done this? 

SPEAKER 11: So my question is about function inside the function. Can we test that too or recursion we haven't seen? 

DAVID MALAN: If you have a recursive function, which we've not discussed in this class, yes, you can absolutely test those too by simply calling them exactly in this way. Recursion does not affect this process. How about one more question here on unit tests before we look at one final example? 

SPEAKER 12: When testing our arguments, can we use something like loops or inside of asserts or for the values? 

DAVID MALAN: Absolutely. You can absolutely use a loop to test multiple values. In this case, for instance, I could do something like this. I could say for name in the following list of Hermione, say, Harry, and Ron, I could then within this loop assert that hello of that given name equals equals, say, the format string of hello, comma name, and then run all of these here at once by running, again, pytest of test_hello.py. 
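
A sketch of the loop just described. The test name test_names is hypothetical, and hello is again inlined rather than imported.

```python
# Stand-in for `from hello import hello`.
def hello(to="world"):
    return f"hello, {to}"


def test_names():
    for name in ["Hermione", "Harry", "Ron"]:
        assert hello(name) == f"hello, {name}"
```

Because all three assertions live in one function, pytest counts this as a single test, and the loop stops at the first name that fails.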

It's still going to be just one test within that function, but if there's something interesting about those several strings that makes it compelling to test all of them, you can absolutely automate the test in that way. With that said, each of your tests should ideally be pretty simple and pretty small. Why? Because you don't want to write so much code, so much complicated code that your tests might be flawed. 

What we don't want to have to do is write tests for our tests, and tests for our tests for our tests, because it would never end. So keeping tests nice and simple is really the goal, so that a reasonable human, yourself included, can eyeball them and just claim, yeah, that is correct. We don't need tests for our tests. 

How about one other feature? Suppose that we don't have just one test, but many different tests instead, and we want to start to organize those tests into multiple files and even a folder. Pytest and other frameworks support that paradigm as well. In fact, let me go ahead and test hello.py using a folder of tests, with technically just one test, but it would be representative of having even more in that folder. 

I'm going to go ahead and create a new folder called test using mkdir at my command line. And then within that folder, I'm going to go ahead and create a file called test_hello.py. Within this file, meanwhile, I'm going to test the same thing. So I'm going to go ahead, and from hello, import hello. 

And I'm going to go ahead and define a function like test default that simply tests the scenario where hello with no arguments returns hello, comma world. And I'm going to have that other function where I test that an argument is passed. And in this case, I'll choose an argument like asserting that hello, quote, unquote, David, equals, indeed, hello, comma, not world, but David. 

So in this case, I've just recreated the same test as earlier, but they're in a file now in a folder called test. Pytest allows me to run these here too. But to do so, I actually need to create one other file. Within my test directory, I need to create a file called __init__.py, which has the effect, even if this file is empty, of telling Python to treat that folder as not just a module, but a package, so to speak. 

A package is a Python module or multiple modules that are organized inside of a folder. And this file, __init__.py, is just a visual indicator to Python that indeed it should treat that folder as a package. If I had more code in this folder, I could do even more things with this file. But for now, it's just a clue that it's indeed meant to be a package and not just a module or file alone. 

What I can now do in closing is run pytest, not even on that specific file, but on a whole folder of tests. So if I run pytest of test, where test is the name of that folder, pytest will automatically search through that folder looking for all possible tests, granted there's just those two in this one file, but when I run it now with Enter, I'll still pass those tests. I'll still get 100%. And I now have a mechanism, ultimately, for testing my own code. 
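
To make the folder layout concrete, here is a sketch that recreates it inside a temporary directory and lists the resulting files. The helper make_project is hypothetical; in the lecture the folder and files are created by hand with mkdir and the editor.

```python
import tempfile
from pathlib import Path

HELLO = 'def hello(to="world"):\n    return f"hello, {to}"\n'
TESTS = (
    "from hello import hello\n\n\n"
    "def test_default():\n"
    '    assert hello() == "hello, world"\n\n\n'
    "def test_argument():\n"
    '    assert hello("David") == "hello, David"\n'
)


def make_project(root):
    """Create hello.py plus a test/ package under root; return the relative file paths."""
    root = Path(root)
    (root / "hello.py").write_text(HELLO)
    (root / "test").mkdir()
    (root / "test" / "__init__.py").write_text("")  # empty file marking the package
    (root / "test" / "test_hello.py").write_text(TESTS)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))


with tempfile.TemporaryDirectory() as tmp:
    layout = make_project(tmp)
    print(layout)
```

From the directory that contains hello.py, running pytest test would then collect both functions in test/test_hello.py.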

So whether you're writing functions that return integers or something else, functions that have side effects that could be rewritten as functions that return values, you now have a mechanism to not just wait for, one, someone like us to test your code and not just test your code manually again and again, which might get tedious, and you might make mistakes by not including some possible inputs, we now have an automated mechanism for testing one's own code that's going to be even more powerful when you start collaborating with others so that you can write tests that ensure that if they make a change to the same code, they haven't broken the code that you've written. That's it for this week. We'll see you next time.