WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:25.250 --> 00:00:28.400 DAVID MALAN: This is CS50's Introduction to Programming with Python. 00:00:28.400 --> 00:00:29.480 My name is David Malan. 00:00:29.480 --> 00:00:31.940 And this is our week on unit tests. 00:00:31.940 --> 00:00:34.070 Up until now, we've been writing a lot of code, 00:00:34.070 --> 00:00:37.207 and you might have been testing your code by running your program, 00:00:37.207 --> 00:00:40.040 and passing in some sample inputs, and running it again, and passing 00:00:40.040 --> 00:00:42.530 in some sample inputs, or you might have been waiting 00:00:42.530 --> 00:00:44.180 for us to test your code instead. 00:00:44.180 --> 00:00:47.270 But it's actually much better practice to get into the habit sooner 00:00:47.270 --> 00:00:51.877 rather than later of testing your own code using code of your own. 00:00:51.877 --> 00:00:53.960 In fact, whether you're writing a personal project 00:00:53.960 --> 00:00:57.065 or working in industry, it's very common nowadays to not only write code 00:00:57.065 --> 00:00:58.940 to solve the problems that you want to solve, 00:00:58.940 --> 00:01:03.260 but also to write a little extra code to test the code that you wrote. 00:01:03.260 --> 00:01:06.590 And that's what we're going to focus on today, writing our own test so as 00:01:06.590 --> 00:01:08.900 to be all the more confident, all the more certain, 00:01:08.900 --> 00:01:12.930 that the problems we have been trying to solve are, in fact, solved correctly. 00:01:12.930 --> 00:01:17.720 So let's rewind a few weeks now to a program we wrote a while back, 00:01:17.720 --> 00:01:20.540 namely to calculate numbers. 00:01:20.540 --> 00:01:23.450 And specifically, we left off with this calculator 00:01:23.450 --> 00:01:27.050 on trying to compute the power of a number, like x squared 00:01:27.050 --> 00:01:31.250 or where x might be two or three or some other number as well. 
00:01:31.250 --> 00:01:34.850 Let me go ahead and resurrect that file by going into my terminal window 00:01:34.850 --> 00:01:38.750 here and running again code of calculator.py. 00:01:38.750 --> 00:01:42.410 And let me go ahead and pick up where we left off way back when 00:01:42.410 --> 00:01:44.367 by defining a main function here. 00:01:44.367 --> 00:01:46.700 And then in my main function, I did something like this. 00:01:46.700 --> 00:01:49.550 I said, x equals int of input. 00:01:49.550 --> 00:01:52.730 And I ask the user, what's x, question mark? 00:01:52.730 --> 00:01:55.880 And then I immediately went ahead and printed out something like, 00:01:55.880 --> 00:02:00.020 x squared is, and then I passed in as a second argument 00:02:00.020 --> 00:02:02.450 to print the result of calling a function called 00:02:02.450 --> 00:02:04.790 square passing in that value, x. 00:02:04.790 --> 00:02:08.639 Now of course, I haven't yet implemented the square function. 00:02:08.639 --> 00:02:10.110 So let's define that as well. 00:02:10.110 --> 00:02:12.560 Let me go down a couple of lines and define square. 00:02:12.560 --> 00:02:15.938 And it takes an argument recall, a parameter that at the time 00:02:15.938 --> 00:02:18.980 I called n, for number, so I'll do that again, though I could technically 00:02:18.980 --> 00:02:21.140 choose any name for this variable. 00:02:21.140 --> 00:02:22.730 And I recall, did this. 00:02:22.730 --> 00:02:25.190 I returned n times n. 00:02:25.190 --> 00:02:27.560 And there were multiple ways to do this. 00:02:27.560 --> 00:02:30.120 Squaring a number is multiplying it by itself. 00:02:30.120 --> 00:02:32.150 So I could also use other syntax here, but this 00:02:32.150 --> 00:02:34.130 is what we ultimately settled on, and then 00:02:34.130 --> 00:02:37.430 recall that I ultimately called main in order to kick off 00:02:37.430 --> 00:02:39.090 the process of running this program. 
00:02:39.090 --> 00:02:41.900 So just as a test manually, let me go ahead and run 00:02:41.900 --> 00:02:44.750 Python of calculator.py and hit Enter. 00:02:44.750 --> 00:02:45.530 What's x? 00:02:45.530 --> 00:02:47.180 Let's start with 2. 00:02:47.180 --> 00:02:48.530 x squared is 4. 00:02:48.530 --> 00:02:49.595 I think that's correct. 00:02:49.595 --> 00:02:51.470 So let's run it again, just for good measure. 00:02:51.470 --> 00:02:55.880 Python of calculator.py, let's type in 3 for x this time. 00:02:55.880 --> 00:02:57.110 x squared is 9. 00:02:57.110 --> 00:02:58.370 And I think that's correct. 00:02:58.370 --> 00:03:00.412 And I might be feeling pretty good at this point, 00:03:00.412 --> 00:03:02.450 and I go off and submit my code to a course, 00:03:02.450 --> 00:03:04.580 or I post it on the internet for others to use. 00:03:04.580 --> 00:03:07.550 But I haven't really methodically tested this code. 00:03:07.550 --> 00:03:10.280 And it's not necessarily the case that it works entirely. 00:03:10.280 --> 00:03:13.100 In fact, I haven't really considered a number of corner cases. 00:03:13.100 --> 00:03:17.570 I went with some pretty obvious numbers like 2 and 3, but what about 0? 00:03:17.570 --> 00:03:18.930 What about negative numbers? 00:03:18.930 --> 00:03:20.990 What about any of the infinitely many other numbers? 00:03:20.990 --> 00:03:23.090 We're not going to test an infinite number of inputs 00:03:23.090 --> 00:03:25.007 to this, because the program would never halt, 00:03:25.007 --> 00:03:28.610 but we should test some representative inputs ultimately. 00:03:28.610 --> 00:03:31.970 But before we do that, let's get into the habit of making sure 00:03:31.970 --> 00:03:33.770 that main isn't always called. 00:03:33.770 --> 00:03:43.910 Let's adopt this habit, again, of doing if __name__ == "__main__", 00:03:43.910 --> 00:03:45.920 only then should we execute main.
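The calculator described so far can be sketched roughly as follows. This is a minimal sketch, not the exact file from the lecture: so that it runs without anyone typing at a prompt, main squares a fixed value (an assumption made for illustration), where the lecture's version reads x with int(input(...)).

```python
# calculator.py (sketch)

def main():
    # The lecture's main asks the user: x = int(input("What's x? "))
    # A fixed value stands in here so the sketch runs unattended (assumption).
    x = 2
    print("x squared is", square(x))


def square(n):
    # Squaring a number is multiplying it by itself
    return n * n


# Call main only when this file is run directly,
# not when another file imports square from it
if __name__ == "__main__":
    main()
```

With the conditional in place, importing square from another file should not trigger main as a side effect.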
00:03:45.920 --> 00:03:48.320 And I'm doing this now proactively, because I 00:03:48.320 --> 00:03:53.210 want to make sure that when I import my square function, perhaps 00:03:53.210 --> 00:03:56.930 from another library, from another file, treating it as though it's a library, 00:03:56.930 --> 00:04:00.840 I want to make sure that main is not just automatically called itself. 00:04:00.840 --> 00:04:03.920 Now what do I want to do from here now that I've 00:04:03.920 --> 00:04:06.320 modified this program as follows? 00:04:06.320 --> 00:04:09.560 Let's go ahead and write a completely different program whose sole purpose 00:04:09.560 --> 00:04:12.210 in life is to now test this program. 00:04:12.210 --> 00:04:15.410 So I've got my actual calculator and calculator.py. 00:04:15.410 --> 00:04:18.200 I've readied myself to call main conditionally 00:04:18.200 --> 00:04:23.568 so that I can safely import one or more things from this file in another file. 00:04:23.568 --> 00:04:24.860 What should that other file be? 00:04:24.860 --> 00:04:28.340 By convention, I'm going to create a file that's called test_, 00:04:28.340 --> 00:04:31.580 and then because the thing I'm testing is this calculator itself, 00:04:31.580 --> 00:04:34.910 let's call this file test_calculator.py. 00:04:34.910 --> 00:04:36.920 That's going to give me a new tab, in which I 00:04:36.920 --> 00:04:39.080 can write a brand new program whose purpose in life 00:04:39.080 --> 00:04:41.330 is now specifically to test that program, 00:04:41.330 --> 00:04:43.850 but really that program's specific functionality. 00:04:43.850 --> 00:04:46.700 Built into that program is the square function. 00:04:46.700 --> 00:04:49.730 Let's focus on testing that function. 00:04:49.730 --> 00:04:52.760 So how do I access that function in this program? 00:04:52.760 --> 00:04:55.610 Recall that I can import a function from another file 00:04:55.610 --> 00:04:58.480 as though it's a library of my own, a so-called module. 
00:04:58.480 --> 00:04:59.480 So I'm going to do this. 00:04:59.480 --> 00:05:03.170 From calculator, import square. 00:05:03.170 --> 00:05:05.600 I could go ahead and just import calculator itself. 00:05:05.600 --> 00:05:09.440 But then I would have to prefix my use of square, recall, 00:05:09.440 --> 00:05:12.440 by saying calculator dot everywhere, and it's just a little cleaner 00:05:12.440 --> 00:05:14.060 to just import the one function. 00:05:14.060 --> 00:05:16.440 And now let me go ahead and do this. 00:05:16.440 --> 00:05:19.910 Let me go ahead and define a function called test_square. 00:05:19.910 --> 00:05:21.270 This too is a convention. 00:05:21.270 --> 00:05:25.390 If you want to test a function called square, your function for testing 00:05:25.390 --> 00:05:27.700 should be called test_square. 00:05:27.700 --> 00:05:30.430 Or alternatively, you could do square_test, 00:05:30.430 --> 00:05:31.930 but I'll adopt this convention here. 00:05:31.930 --> 00:05:34.270 Now what kind of tests can we do? 00:05:34.270 --> 00:05:38.920 I don't dislike the tests I ran earlier, testing x equals 2 and x equals 3. 00:05:38.920 --> 00:05:41.745 But every time I wanted to test my program previously, 00:05:41.745 --> 00:05:43.120 I would have to do that manually. 00:05:43.120 --> 00:05:44.050 And that's going to get tedious. 00:05:44.050 --> 00:05:45.880 It's not going to be easy for someone else to test it. 00:05:45.880 --> 00:05:47.990 And if I'm actually working in the real world, 00:05:47.990 --> 00:05:49.720 it would be nice if I could automatically 00:05:49.720 --> 00:05:54.040 have my program tested again and again by having some automated process run 00:05:54.040 --> 00:05:54.910 my own code. 00:05:54.910 --> 00:05:58.130 So let's do that and take the human ultimately out of the equation. 00:05:58.130 --> 00:06:00.850 So how might I go about testing the square function 00:06:00.850 --> 00:06:03.670 that I've now imported per line one?
00:06:03.670 --> 00:06:05.800 In my test square function, why don't I do this? 00:06:05.800 --> 00:06:11.198 If the result of calling square of 2 does not equal 4, 00:06:11.198 --> 00:06:13.240 why don't we go ahead and print an error message, 00:06:13.240 --> 00:06:17.080 because I know that in the real world, 2 squared should equal 4, 00:06:17.080 --> 00:06:21.572 so if square of 2 does not equal 4, there's a bug in my program. 00:06:21.572 --> 00:06:22.780 There's a bug in my function. 00:06:22.780 --> 00:06:23.780 I've made a mistake. 00:06:23.780 --> 00:06:26.655 So let me go ahead and print something like that so I or someone else 00:06:26.655 --> 00:06:30.050 knows 2 squared was not 4, for instance. 00:06:30.050 --> 00:06:31.640 So I could print out anything here. 00:06:31.640 --> 00:06:33.163 What should I maybe next test? 00:06:33.163 --> 00:06:34.330 Let's do more than one test. 00:06:34.330 --> 00:06:40.120 Let's say if the square of 3 does not equal 3 squared 9, then let's go ahead 00:06:40.120 --> 00:06:43.390 and print out that 3 squared was not 9. 00:06:43.390 --> 00:06:46.240 So I haven't done any more testing than I did earlier. 00:06:46.240 --> 00:06:52.390 But I've baked those two tests, x equals 2 and x equals 3, into my own code 00:06:52.390 --> 00:06:55.910 here, so I can now run those tests automatically, if you will. 00:06:55.910 --> 00:06:59.890 Now, it's not enough to just define a function called test square. 00:06:59.890 --> 00:07:02.830 I actually, if I want to run this function, need to call it somehow. 00:07:02.830 --> 00:07:06.250 And our convention for doing that is the same as always. 00:07:06.250 --> 00:07:08.530 In this file too, let me define main. 00:07:08.530 --> 00:07:12.670 And main's sole purpose in life is going to be to test square. 
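The hand-written test described here might look roughly like the sketch below. One assumption to flag: square is defined inline as a stand-in for the lecture's "from calculator import square", only so that the block runs by itself.

```python
# test_calculator.py (sketch of the manual if/print version)

# Stand-in for "from calculator import square" (assumption, see above)
def square(n):
    return n * n


def main():
    test_square()


def test_square():
    # Print a message only when a result is wrong
    if square(2) != 4:
        print("2 squared was not 4")
    if square(3) != 9:
        print("3 squared was not 9")


if __name__ == "__main__":
    main()
```

Because nothing is printed when both checks pass, a silent run is the sign of success in this style of test.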
00:07:12.670 --> 00:07:15.310 And now at the bottom of this file, as before, 00:07:15.310 --> 00:07:23.860 let me go ahead and adopt my convention of if __name__ == "__main__", 00:07:23.860 --> 00:07:26.170 then go ahead and call main. 00:07:26.170 --> 00:07:27.890 So a lot of this is just boilerplate. 00:07:27.890 --> 00:07:29.890 We've seen this before, defining a main function 00:07:29.890 --> 00:07:32.073 and calling a function to kick off some process, 00:07:32.073 --> 00:07:34.240 now adding the conditional at the bottom of the file 00:07:34.240 --> 00:07:37.900 to make sure I'm only conditionally calling main, just in case I import 00:07:37.900 --> 00:07:40.220 anything from this file elsewhere. 00:07:40.220 --> 00:07:41.050 So let's see. 00:07:41.050 --> 00:07:43.030 Let's go ahead and test my code now. 00:07:43.030 --> 00:07:47.800 Let me go ahead and run Python of test_calculator.py and hit Enter, 00:07:47.800 --> 00:07:49.780 and nothing outputs. 00:07:49.780 --> 00:07:50.920 Nothing outputs. 00:07:50.920 --> 00:07:53.110 But I think it's OK. 00:07:53.110 --> 00:07:56.800 I think no output is good, because look at my test_square function. 00:07:56.800 --> 00:08:00.430 I'm not printing anything if all seems well. 00:08:00.430 --> 00:08:03.197 So let's demonstrate as much by going back to my calculator, 00:08:03.197 --> 00:08:04.030 and let me break it. 00:08:04.030 --> 00:08:05.180 Let me introduce a bug. 00:08:05.180 --> 00:08:07.180 Maybe I didn't even get it right the first time. 00:08:07.180 --> 00:08:09.010 Maybe my code originally looked like this. 00:08:09.010 --> 00:08:10.000 I wasn't thinking. 00:08:10.000 --> 00:08:11.200 I forgot my squares. 00:08:11.200 --> 00:08:15.160 And so I thought that the square of a number is n plus n, 00:08:15.160 --> 00:08:18.130 instead of n times n, so a reasonable mistake to make, 00:08:18.130 --> 00:08:19.390 perhaps arithmetically.
00:08:19.390 --> 00:08:21.340 Let me now go back to my test calculator, 00:08:21.340 --> 00:08:24.010 which I'm not going to change, but I am going to rerun it, 00:08:24.010 --> 00:08:26.410 python of test_calculator.py. 00:08:26.410 --> 00:08:29.170 I'm going to cross my fingers here, but for naught, I'm 00:08:29.170 --> 00:08:33.309 going to see immediately that 3 squared was not 9. 00:08:33.309 --> 00:08:35.020 Now what is it? 00:08:35.020 --> 00:08:39.400 Let's see, when your tests fail, how can we put our finger on what's wrong? 00:08:39.400 --> 00:08:42.520 It's a little interesting that I completely broke my square function, 00:08:42.520 --> 00:08:45.460 and yet only one of these tests is failing. 00:08:45.460 --> 00:08:49.672 It looks like this test, lines 9 and 10, is fine, 00:08:49.672 --> 00:08:51.130 because I'm not seeing that output. 00:08:51.130 --> 00:08:54.310 But of course these two lines, this test, 00:08:54.310 --> 00:08:57.640 is failing, because 3 squared is not 9 when I'm using plus. 00:08:57.640 --> 00:09:03.880 So just to be clear here, why is my function only partially broken, 00:09:03.880 --> 00:09:04.930 just to be clear? 00:09:04.930 --> 00:09:07.990 Why am I seeing only one error instead of two, 00:09:07.990 --> 00:09:11.785 even though the square function is now mathematically broken? 00:09:11.785 --> 00:09:13.200 SPEAKER 1: Because 2 plus 2 is 4. 00:09:13.200 --> 00:09:14.950 DAVID MALAN: Yeah, it's as simple as that. 00:09:14.950 --> 00:09:18.313 I just got lucky that 2 plus 2 is the same thing as 2 times 2. 00:09:18.313 --> 00:09:20.230 So this is one of those corner cases, and this 00:09:20.230 --> 00:09:22.480 is why it's good to be in the habit of not just testing one thing, 00:09:22.480 --> 00:09:25.610 but test several and make sure you're covering your bases, so to speak. 00:09:25.610 --> 00:09:27.010 So I got lucky here.
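The coincidence is easy to see by putting the buggy result next to the correct one for the two inputs tested. A small sketch, with the deliberately broken square defined inline:

```python
def square(n):
    return n + n  # deliberate bug: should be n * n


# Compare what the buggy function returns with the true square
for n in (2, 3):
    print(n, "->", square(n), "expected", n * n)
```

For 2 the two values agree, so that test cannot notice the bug; for 3 they differ, which is why only that message appears.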
00:09:27.010 --> 00:09:29.470 And that explains why I'm seeing only one error, 00:09:29.470 --> 00:09:32.800 even though the function itself is flawed, but let me propose that there's 00:09:32.800 --> 00:09:35.050 another way we could do this, because honestly, 00:09:35.050 --> 00:09:39.280 if I extrapolate from this simple example, running not just two tests 00:09:39.280 --> 00:09:45.010 but 3, or 4, or 10, or 20 tests, you can imagine that, my God, 00:09:45.010 --> 00:09:48.460 the code is going to get so much more complicated than the function itself. 00:09:48.460 --> 00:09:53.290 Already, look, in calculator.py, the function in question is two lines long. 00:09:53.290 --> 00:09:58.330 And yet in test_calculator, the code in question is five lines long. 00:09:58.330 --> 00:10:01.900 I've written more code to test my code than I actually wrote original code. 00:10:01.900 --> 00:10:05.890 So the fewer lines of code we can write when testing code, 00:10:05.890 --> 00:10:08.118 I think the more likely you and I are to do it, 00:10:08.118 --> 00:10:09.910 because it's going to be literally a little 00:10:09.910 --> 00:10:12.740 less work and just fewer opportunities for mistakes. 00:10:12.740 --> 00:10:15.400 So what's another approach I can take here? 00:10:15.400 --> 00:10:19.700 It turns out in Python, there is another keyword that we haven't yet used, 00:10:19.700 --> 00:10:21.640 which is this here, assert. 00:10:21.640 --> 00:10:25.330 Assert is a keyword in Python and some other languages as well 00:10:25.330 --> 00:10:28.300 that allows you to do exactly that, as in English, to assert 00:10:28.300 --> 00:10:31.910 that something is true, to sort of boldly claim that something is true. 00:10:31.910 --> 00:10:34.420 And if it is, nothing's going to happen. 00:10:34.420 --> 00:10:36.290 No errors are going to appear on the screen.
00:10:36.290 --> 00:10:40.240 But if you assert something in Python, and it is not true, that is, 00:10:40.240 --> 00:10:44.230 the thing you're asserting, a Boolean expression, is false, 00:10:44.230 --> 00:10:47.930 you're actually going to see some kind of error on the screen. 00:10:47.930 --> 00:10:50.870 So let's go ahead and try this new keyword as follows. 00:10:50.870 --> 00:10:52.550 Let me go back to my code here. 00:10:52.550 --> 00:10:54.880 And just to make it a little simpler, let 00:10:54.880 --> 00:10:58.130 me propose that I use this new keyword as follows. 00:10:58.130 --> 00:11:04.120 Let me simply assert that the square of 2 should equal 4. 00:11:04.120 --> 00:11:05.440 So I've changed my logic. 00:11:05.440 --> 00:11:07.480 Instead of checking for not equals, I'm now 00:11:07.480 --> 00:11:11.260 asserting very loudly that it should equal 4. 00:11:11.260 --> 00:11:13.810 And then on one additional line, let me do the other test, 00:11:13.810 --> 00:11:17.590 assert that the square of 3 equals equals 9. 00:11:17.590 --> 00:11:21.040 And that's it, no indented print. 00:11:21.040 --> 00:11:24.010 I'm just going to assert more simply these two 00:11:24.010 --> 00:11:26.080 things that I want to be true. 00:11:26.080 --> 00:11:29.740 Let me go ahead now, with calculator.py still broken. 00:11:29.740 --> 00:11:33.670 I'm still using plus accidentally, instead of multiplication. 00:11:33.670 --> 00:11:37.900 Let me go ahead now and run Python of test_calculator.py, 00:11:37.900 --> 00:11:41.320 crossing my fingers as always, but it's not going to go well this time. 00:11:41.320 --> 00:11:44.240 A whole lot of errors seem to appear on the screen. 00:11:44.240 --> 00:11:46.600 And if I scroll up here for this traceback, 00:11:46.600 --> 00:11:53.020 we'll see that the thing that failed was this line here, assert square(3) == 9.
00:11:53.020 --> 00:11:55.450 Now unfortunately, when you're using the assert keyword, 00:11:55.450 --> 00:11:57.460 it's not terribly user friendly. 00:11:57.460 --> 00:11:59.982 It shows you the files and the line numbers involved, 00:11:59.982 --> 00:12:02.440 but it does show you the specific line of code that failed, 00:12:02.440 --> 00:12:04.600 the assertion that failed, so to speak. 00:12:04.600 --> 00:12:08.560 It's now kind of up to you and me to infer from this, wait a minute, 00:12:08.560 --> 00:12:11.020 why is the square of 3 not equal to 9? 00:12:11.020 --> 00:12:13.187 So it's not super user friendly, but honestly, it 00:12:13.187 --> 00:12:14.770 was half as much code for me to write. 00:12:14.770 --> 00:12:16.990 It's just two lines, instead of those previous four. 00:12:16.990 --> 00:12:19.450 But notice this little remnant down here. 00:12:19.450 --> 00:12:21.070 This was an assertion error. 00:12:21.070 --> 00:12:23.350 And we have seen errors before. 00:12:23.350 --> 00:12:27.460 We've seen errors before when we've made other mistakes in our code. 00:12:27.460 --> 00:12:34.010 And in the past, what was our solution for catching those errors? 00:12:34.010 --> 00:12:37.610 How do we catch errors that seem to resemble this, 00:12:37.610 --> 00:12:39.977 even though we've not seen this one before? 00:12:39.977 --> 00:12:41.060 SPEAKER 2: Try and except. 00:12:41.060 --> 00:12:44.060 DAVID MALAN: Yeah, in Python, we can use the try and except keywords 00:12:44.060 --> 00:12:48.180 to try to do something, optimistically, except if something goes wrong, 00:12:48.180 --> 00:12:49.680 do something else instead. 00:12:49.680 --> 00:12:53.235 So this is a step forward, in that I can at least catch this error. 00:12:53.235 --> 00:12:55.610 But it's going to be perhaps a step backward, in that I'm 00:12:55.610 --> 00:12:59.700 going to end up writing, I'll admit in advance, a little more code instead.
00:12:59.700 --> 00:13:01.050 So let me go ahead and try this. 00:13:01.050 --> 00:13:02.850 Let me go back into my code here. 00:13:02.850 --> 00:13:05.960 And instead of just asserting, blindly, let 00:13:05.960 --> 00:13:10.400 me go ahead, as Tola proposed, and try to do this first assertion, 00:13:10.400 --> 00:13:16.280 except if there is an assertion error, like we saw a moment ago, then go ahead 00:13:16.280 --> 00:13:18.320 and print out something more user friendly 00:13:18.320 --> 00:13:20.210 that explains what actually failed. 00:13:20.210 --> 00:13:23.420 2 squared was not 4. 00:13:23.420 --> 00:13:28.100 And let me go ahead similarly and try to assert that the square of 3 00:13:28.100 --> 00:13:33.000 equals 9, except if there's an assertion error there, in which case, 00:13:33.000 --> 00:13:37.370 I'm going to print out, more user friendly, 3 squared was not 9. 00:13:37.370 --> 00:13:39.920 So I've taken a step forward, but also a step back, 00:13:39.920 --> 00:13:41.180 because now I have more code. 00:13:41.180 --> 00:13:44.288 But I have at least introduced assertions and exceptions 00:13:44.288 --> 00:13:46.580 in a manner consistent with how we've seen in the past. 00:13:46.580 --> 00:13:50.450 When something goes wrong, you actually see an exception raised. 00:13:50.450 --> 00:13:53.330 Let me go ahead and run this version of the program now instead. 00:13:53.330 --> 00:13:57.800 Python of test_calculator.py, crossing my fingers, 00:13:57.800 --> 00:14:00.170 it still failed, because I'm seeing output. 00:14:00.170 --> 00:14:02.850 But we're back to at least user friendly output. 00:14:02.850 --> 00:14:05.690 So that's at least progress in some way here. 00:14:05.690 --> 00:14:09.088 But it's, again, more code than might have been ideal. 00:14:09.088 --> 00:14:11.630 And in fact, if we continue this further, what if we actually 00:14:11.630 --> 00:14:14.900 want to add additional test cases here as well?
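Wrapped in try and except, the same two assertions might be sketched like this. As before, the broken square is defined inline as an assumption so the block runs standalone.

```python
def square(n):
    return n + n  # deliberate bug: should be n * n


def test_square():
    # Each assertion gets its own try/except so that one failure
    # does not prevent the next check from running
    try:
        assert square(2) == 4
    except AssertionError:
        print("2 squared was not 4")
    try:
        assert square(3) == 9
    except AssertionError:
        print("3 squared was not 9")


test_square()  # prints: 3 squared was not 9
```

The traceback is gone and the friendlier message is back, at the cost of four lines per check.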
00:14:14.900 --> 00:14:18.980 It seems like we might end up writing way more code than would be ideal. 00:14:18.980 --> 00:14:21.260 For instance, I'm testing 2 and 3 now. 00:14:21.260 --> 00:14:24.000 I should probably test some negative numbers as well. 00:14:24.000 --> 00:14:26.870 So why don't I go ahead and add in, for instance-- let me go ahead 00:14:26.870 --> 00:14:28.710 and copy and paste this. 00:14:28.710 --> 00:14:32.780 Let me try to assert that the square of negative 2 equals 00:14:32.780 --> 00:14:34.978 equals 4, which should be the case mathematically. 00:14:34.978 --> 00:14:36.770 And if not, let me go ahead and change this 00:14:36.770 --> 00:14:39.410 to say negative 2 squared was not 4. 00:14:39.410 --> 00:14:41.930 And let me go ahead and copy paste this again, 00:14:41.930 --> 00:14:44.280 test another negative number, just for good measure. 00:14:44.280 --> 00:14:47.900 Let's test the square of negative 3, which should equal 9. 00:14:47.900 --> 00:14:53.330 But if it doesn't, let's go ahead and say that negative 3 squared was not 9. 00:14:53.330 --> 00:14:56.900 And just to think aloud here, what might be another good value to test? 00:14:56.900 --> 00:14:57.890 I've tried 2. 00:14:57.890 --> 00:14:58.550 I've tried 3. 00:14:58.550 --> 00:14:59.330 I've tried negative 2. 00:14:59.330 --> 00:15:00.247 I've tried negative 3. 00:15:00.247 --> 00:15:01.820 I can't try infinitely many numbers. 00:15:01.820 --> 00:15:03.320 But there's at least something that's a little 00:15:03.320 --> 00:15:04.737 different in between those values. 00:15:04.737 --> 00:15:05.240 Let's try 0. 00:15:05.240 --> 00:15:08.540 0 is an interesting case too, just in case something might be wrong. 00:15:08.540 --> 00:15:09.740 And why 0? 00:15:09.740 --> 00:15:11.720 I'm just going with instincts here. 00:15:11.720 --> 00:15:14.928 Odds are positive numbers are generally going to behave the same.
00:15:14.928 --> 00:15:16.970 Negative numbers might generally behave the same. 00:15:16.970 --> 00:15:18.650 0 might be a little anomalous. 00:15:18.650 --> 00:15:23.060 There's no science to it necessarily, but rather considering for yourself 00:15:23.060 --> 00:15:26.457 based on your own experience, what are the potential corner cases based 00:15:26.457 --> 00:15:28.040 on the function you're trying to test? 00:15:28.040 --> 00:15:29.790 I'm trying to test something mathematical, 00:15:29.790 --> 00:15:31.560 so I want to test representative values. 00:15:31.560 --> 00:15:34.520 So let me go ahead and paste in one more try except block. 00:15:34.520 --> 00:15:38.120 Let's assert that the square of 0 should equal 0. 00:15:38.120 --> 00:15:43.460 And if not, I'll say something explanatory, like 0 squared was not 0. 00:15:43.460 --> 00:15:48.950 Now if I go ahead and run this, Python of test_calculator.py, and hit Enter, 00:15:48.950 --> 00:15:50.780 now I see multiple errors. 00:15:50.780 --> 00:15:51.890 And this is interesting. 00:15:51.890 --> 00:15:55.580 It's a bit of a clue, because notice that some, but not all, 00:15:55.580 --> 00:15:57.320 of these assertions are failing. 00:15:57.320 --> 00:16:02.210 The one for 2 squared is apparently OK, as we noted earlier. 00:16:02.210 --> 00:16:05.450 Recall that 2 squared happens to be 2 plus 2. 00:16:05.450 --> 00:16:07.707 So that bug doesn't really throw off our test, 00:16:07.707 --> 00:16:09.290 but it's a good thing we tested for 3. 00:16:09.290 --> 00:16:11.450 It's a good thing we tested for negative 2 and negative 3, 00:16:11.450 --> 00:16:13.370 because all of those tests caught this error. 00:16:13.370 --> 00:16:18.740 The 0 test did not notice, because 0 squared is, of course, 0, and 0 plus 0 00:16:18.740 --> 00:16:19.370 is also 0. 00:16:19.370 --> 00:16:22.130 So we're getting lucky or unlucky there, depending 00:16:22.130 --> 00:16:25.280 on how you view the glass as half full or half empty here.
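Which of the five inputs expose the bug can be checked mechanically. The sketch below is not how the lecture's test file is written (that one repeats try and except five times); it is a compact, table-driven illustration, and the names cases, caught, and missed are made up for this example.

```python
def square(n):
    return n + n  # deliberate bug: should be n * n


# (input, expected square) pairs matching the five tests in the lecture
cases = [(2, 4), (3, 9), (-2, 4), (-3, 9), (0, 0)]

# Inputs whose test notices the bug, and inputs whose test does not
caught = [n for n, expected in cases if square(n) != expected]
missed = [n for n, expected in cases if square(n) == expected]

print("caught the bug:", caught)
print("missed the bug:", missed)
```

The two inputs that miss the bug, 2 and 0, are exactly the numbers for which n plus n equals n times n, so no choice of expected value for them could have revealed this particular mistake.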
00:16:25.280 --> 00:16:30.500 We at least by way of having multiple tests caught this mistake somehow. 00:16:30.500 --> 00:16:35.870 So it would be nice, though, if we weren't writing so much darn code here, 00:16:35.870 --> 00:16:37.170 because notice what I've done. 00:16:37.170 --> 00:16:39.710 I have try, except, try, except. 00:16:39.710 --> 00:16:41.150 I have all of these assertions. 00:16:41.150 --> 00:16:42.600 I have a main function. 00:16:42.600 --> 00:16:45.470 I have this if conditional at the bottom of my file. 00:16:45.470 --> 00:16:49.020 Honestly, who's going to want to write 31 lines of code 00:16:49.020 --> 00:16:51.830 now just to test a two line function? 00:16:51.830 --> 00:16:53.780 No one's going to write test code like this 00:16:53.780 --> 00:16:57.620 if we're all writing so much more code to do the actual testing. 00:16:57.620 --> 00:16:59.970 So people have solved this problem. 00:16:59.970 --> 00:17:02.720 If you are in the habit of testing your code a lot, or wanting to, 00:17:02.720 --> 00:17:04.700 if I'm in the habit of wanting to test my code a lot, 00:17:04.700 --> 00:17:07.283 if everyone else in the real world is in this habit of wanting 00:17:07.283 --> 00:17:09.319 to test their code, why don't we create tools 00:17:09.319 --> 00:17:11.690 that make it a little easier to do so? 00:17:11.690 --> 00:17:14.000 And in fact, there is a mechanism for doing 00:17:14.000 --> 00:17:17.780 this, whereby we can use a tool that's popularly called pytest. 00:17:17.780 --> 00:17:21.920 So pytest is a third party program that you can download and install 00:17:21.920 --> 00:17:26.450 that will automate the testing of your code, so long as you write the tests. 00:17:26.450 --> 00:17:29.150 But what's nice about this library and others 00:17:29.150 --> 00:17:31.580 like it is that it adopts some conventions so 00:17:31.580 --> 00:17:35.320 that you don't have to write as many lines of code yourself manually. 
00:17:35.320 --> 00:17:38.090 They do some of that automatically for you. 00:17:38.090 --> 00:17:39.520 Now this is a third party library. 00:17:39.520 --> 00:17:42.520 There's other libraries for unit tests, so to speak, 00:17:42.520 --> 00:17:44.440 that is testing units of your code. 00:17:44.440 --> 00:17:46.240 Some of them come with Python itself. 00:17:46.240 --> 00:17:48.813 We're proposing that we look at pytest today 00:17:48.813 --> 00:17:50.980 because it's actually a little simpler than the unit 00:17:50.980 --> 00:17:53.170 testing frameworks that come with Python itself. 00:17:53.170 --> 00:17:54.760 And what do we mean by unit testing? 00:17:54.760 --> 00:17:57.910 Unit testing is just a formal way of describing testing 00:17:57.910 --> 00:18:00.297 individual units of your program. 00:18:00.297 --> 00:18:01.630 What are those individual units? 00:18:01.630 --> 00:18:02.960 They're typically functions. 00:18:02.960 --> 00:18:07.360 So unit tests are typically tests for functions that you have written. 00:18:07.360 --> 00:18:09.610 Now what does this mean in practice here? 00:18:09.610 --> 00:18:12.910 Let me go back to my VS Code here and let 00:18:12.910 --> 00:18:17.260 me propose that we simplify my test calculator significantly. 00:18:17.260 --> 00:18:22.930 I'm going to go ahead and delete all of these tests, which were accumulating 00:18:22.930 --> 00:18:24.400 to like 31 lines of code. 00:18:24.400 --> 00:18:28.750 And let's see if we can distill the tests to their essence, using pytest. 00:18:28.750 --> 00:18:32.350 From my same calculator program, let me still import square. 00:18:32.350 --> 00:18:34.270 So I do still need that line of code so that I 00:18:34.270 --> 00:18:35.980 can test that specific function. 00:18:35.980 --> 00:18:39.010 Now I'm going to go ahead and define a function, just like I did before, 00:18:39.010 --> 00:18:39.800 as follows.
00:18:39.800 --> 00:18:42.430 I'm going to define a function called test square, again 00:18:42.430 --> 00:18:46.360 by convention, test underscore and the name of the function you want to test, 00:18:46.360 --> 00:18:47.957 though it doesn't have to be that way. 00:18:47.957 --> 00:18:50.290 And now I'm going to go ahead and make a few assertions. 00:18:50.290 --> 00:18:53.350 I'm going to assert that the square of 2 should equal 4. 00:18:53.350 --> 00:18:57.310 I'm going to assert that the square of 3 should equal 9. 00:18:57.310 --> 00:19:01.750 I'm going to assert that the square of negative 2 should equal 4. 00:19:01.750 --> 00:19:06.250 And I'm going to assert that the square of negative 3 should equal 9. 00:19:06.250 --> 00:19:10.990 And lastly for now I'm going to assert that the square of 0 should equal 0. 00:19:10.990 --> 00:19:14.860 So I'm still using the assert keyword, as I introduced earlier. 00:19:14.860 --> 00:19:17.290 And even though it was a little tedious to type those, 00:19:17.290 --> 00:19:18.910 it's only eight lines of code now. 00:19:18.910 --> 00:19:20.440 And they're so easy to type. 00:19:20.440 --> 00:19:22.750 It's not try and except and all of this. 00:19:22.750 --> 00:19:26.410 Wouldn't it be nice if something else, someone else, 00:19:26.410 --> 00:19:31.930 handled the try, the except, the printing, all of the standardization 00:19:31.930 --> 00:19:33.400 of actually running these tests? 00:19:33.400 --> 00:19:36.370 And that's where, indeed, pytest comes into play. 00:19:36.370 --> 00:19:40.180 Per the documentation for pytest, which can itself be installed with pip 00:19:40.180 --> 00:19:44.170 install pytest, which we've used to install other libraries in the past, 00:19:44.170 --> 00:19:47.980 you can look at the documentation here for all of its formal usage. 
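The slimmed-down test file described here would look roughly like the following. In the lecture the first line is "from calculator import square"; a correct square is defined inline below purely as a stand-in so that the sketch runs by itself.

```python
# test_calculator.py (sketch of the pytest version)

# Stand-in for "from calculator import square" (assumption, see above)
def square(n):
    return n * n


def test_square():
    assert square(2) == 4
    assert square(3) == 9
    assert square(-2) == 4
    assert square(-3) == 9
    assert square(0) == 0
```

There is no main, no conditional, no try, and no print. Assuming pytest has been installed with pip install pytest, running pytest test_calculator.py in the terminal is what finds test_square and executes it.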
00:19:47.980 --> 00:19:51.760 But fortunately, pytest is pretty user friendly, as testing frameworks go, 00:19:51.760 --> 00:19:55.660 and it actually allows us to dive right in by just running pytest on the code 00:19:55.660 --> 00:19:56.510 that we've written. 00:19:56.510 --> 00:20:00.310 So if I go back to VS Code here and look at my test_calculator.py, which, 00:20:00.310 --> 00:20:04.210 notice, has no main function anymore-- it has no conditional. 00:20:04.210 --> 00:20:05.440 It has no tries. 00:20:05.440 --> 00:20:06.460 It has no excepts. 00:20:06.460 --> 00:20:07.450 It has no prints. 00:20:07.450 --> 00:20:11.320 It just has my few assertions-- pytest and other libraries 00:20:11.320 --> 00:20:14.890 like it are going to automate the process of running these tests for me 00:20:14.890 --> 00:20:20.320 and informing me on the screen whether or not any of those tests failed. 00:20:20.320 --> 00:20:21.692 So let me go ahead and do this. 00:20:21.692 --> 00:20:24.400 I'm going to go ahead and increase the size of my terminal window 00:20:24.400 --> 00:20:26.567 for a moment, just so we can see more on the screen. 00:20:26.567 --> 00:20:29.170 And I'm going to run not python, as I've been doing. 00:20:29.170 --> 00:20:32.620 I'm going to run pytest, which, again, is this third party 00:20:32.620 --> 00:20:35.110 tool for running tests in your code. 00:20:35.110 --> 00:20:39.880 I'm going to run pytest of test_calculator.py, so that same file. 00:20:39.880 --> 00:20:42.490 I'm going to cross my fingers as always and hit Enter, 00:20:42.490 --> 00:20:46.040 and we'll see that something has failed. 00:20:46.040 --> 00:20:48.340 Now admittedly, even though I do think you'll 00:20:48.340 --> 00:20:51.290 find that pytest is relatively simple to use, 00:20:51.290 --> 00:20:55.490 its output, at least at first glance, is not necessarily super user friendly. 00:20:55.490 --> 00:20:56.950 So what are we seeing here?
00:20:56.950 --> 00:21:01.660 Notice at the very top of my window is the command that I ran after my prompt. 00:21:01.660 --> 00:21:05.470 Right below that is a single F in red, which means fail, 00:21:05.470 --> 00:21:07.180 so not very encouraging. 00:21:07.180 --> 00:21:10.600 I tried really hard here, but fail is my grade on this program. 00:21:10.600 --> 00:21:12.460 But let's see exactly what happened. 00:21:12.460 --> 00:21:15.530 If I look at this excerpt here under failures, 00:21:15.530 --> 00:21:18.400 you'll see that test square is the function that failed. 00:21:18.400 --> 00:21:20.650 That makes sense, because that's the only one I wrote. 00:21:20.650 --> 00:21:25.390 And you'll see here somewhat arcane output describing what the error was. 00:21:25.390 --> 00:21:28.930 So what you're seeing here is the first line of output equals equals 4, 00:21:28.930 --> 00:21:29.530 which is fine. 00:21:29.530 --> 00:21:32.230 There's no red error message below that, so that one's OK. 00:21:32.230 --> 00:21:36.640 But this line of code here assert that square of 3 equals equals 9, 00:21:36.640 --> 00:21:40.880 pytest did not like that assertion, because it didn't end up being true. 00:21:40.880 --> 00:21:44.480 In fact, per the red E at the start of this line, 00:21:44.480 --> 00:21:50.350 you'll see that I'm effectively trying to assert that 6 equals equals 9. 00:21:50.350 --> 00:21:52.280 Now, where did the 6 come from? 00:21:52.280 --> 00:21:56.920 Wait a minute, if my test involves this, notice that where 6 equals square of 3, 00:21:56.920 --> 00:22:01.360 this is saying that because I've called square, passing in a value of 3, 00:22:01.360 --> 00:22:03.790 it turns out its return value is 6. 00:22:03.790 --> 00:22:07.690 And of course, mathematically, 6 does not equal equal 9. 00:22:07.690 --> 00:22:09.760 So that's why this is failing.
00:22:09.760 --> 00:22:13.240 Now, pytest is not as user friendly as telling you 00:22:13.240 --> 00:22:16.750 exactly why the bug is there or how to fix it. 00:22:16.750 --> 00:22:19.840 This is really just a clue to you what must be wrong. 00:22:19.840 --> 00:22:23.020 What you're seeing here is a clue that the first test passed, 00:22:23.020 --> 00:22:26.860 because there's no red error below that line of code, but this test failed. 00:22:26.860 --> 00:22:32.290 Somehow or other, your square function is returning 6 00:22:32.290 --> 00:22:34.840 when passed in 3 instead of 9. 00:22:34.840 --> 00:22:37.180 So at this point, you sort of put your detective hat on, 00:22:37.180 --> 00:22:39.490 you go back to your actual code, and you think 00:22:39.490 --> 00:22:42.340 about in calculator.py, how in the world is 00:22:42.340 --> 00:22:47.380 line 7 of my square function returning 6 instead of 9. 00:22:47.380 --> 00:22:49.380 And at this point, odds are the light bulb 00:22:49.380 --> 00:22:51.130 would go off above your head proverbially, 00:22:51.130 --> 00:22:55.390 and you would see, I'm using addition, instead of multiplication. 00:22:55.390 --> 00:22:57.640 But what pytest has done for us is automate 00:22:57.640 --> 00:23:00.610 the process of at least pointing out that error for us. 00:23:00.610 --> 00:23:03.147 And if I now go in and fix this-- let me go ahead, 00:23:03.147 --> 00:23:04.480 and the light bulb has gone off. 00:23:04.480 --> 00:23:08.320 I change the plus to a multiply. 00:23:08.320 --> 00:23:10.660 Now I'm going to go ahead, and after clearing my screen, 00:23:10.660 --> 00:23:15.130 I'm going to run not Python, but pytest of test_calculator.py, 00:23:15.130 --> 00:23:16.390 crossing my fingers again. 00:23:16.390 --> 00:23:17.800 And now it's green. 00:23:17.800 --> 00:23:21.760 And I see just a dot, which indicates that my one and only test passed. 00:23:21.760 --> 00:23:27.080 I'm good, 100% success with my test now after fixing that bug. 
00:23:27.080 --> 00:23:30.550 Let me pause here and see if there's any questions. 00:23:30.550 --> 00:23:33.520 SPEAKER 3: So my question is, what if a user, 00:23:33.520 --> 00:23:36.160 instead of, because we are taking input from the user, 00:23:36.160 --> 00:23:41.590 what if the user is somewhat malicious and types in a string instead 00:23:41.590 --> 00:23:46.420 of an integer, or maybe he types in a float or some other data type? 00:23:46.420 --> 00:23:48.730 DAVID MALAN: Yeah, so what if the user, like we've 00:23:48.730 --> 00:23:51.655 seen in past examples, types in cat, instead of a number, when 00:23:51.655 --> 00:23:52.780 we're expecting an integer? 00:23:52.780 --> 00:23:54.940 How do we test for something like that? 00:23:54.940 --> 00:23:57.970 At the moment, I'm admittedly not testing user input. 00:23:57.970 --> 00:24:02.750 If I go back to my code here, notice that my calculator function, of course, 00:24:02.750 --> 00:24:05.290 has the square function that we keep testing and retesting. 00:24:05.290 --> 00:24:08.470 But notice that all of the user input is currently 00:24:08.470 --> 00:24:10.300 relegated to my main function. 00:24:10.300 --> 00:24:14.120 And admittedly, as of now, I am not testing my main function. 00:24:14.120 --> 00:24:15.760 So there could be one of those bugs. 00:24:15.760 --> 00:24:19.600 And in fact, there would be, because if the user types in a string, like cat, 00:24:19.600 --> 00:24:24.850 instead of an integer, like 2 or 3, then line two recall would actually 00:24:24.850 --> 00:24:26.990 raise a value error exception. 00:24:26.990 --> 00:24:28.160 So we've seen that before. 00:24:28.160 --> 00:24:30.280 So when it comes to testing your code, this 00:24:30.280 --> 00:24:35.080 is actually a good reason for having multiple functions in your program. 
00:24:35.080 --> 00:24:37.900 Rather than putting all of your logic in just the file itself, 00:24:37.900 --> 00:24:40.240 rather than putting all of the logic in just main, 00:24:40.240 --> 00:24:43.030 it's actually really good, really helpful practice 00:24:43.030 --> 00:24:46.750 to break your ideas up into smaller bite-sized functions 00:24:46.750 --> 00:24:48.370 that themselves are testable. 00:24:48.370 --> 00:24:49.630 And what do I mean here? 00:24:49.630 --> 00:24:52.270 Square is perfectly testable. 00:24:52.270 --> 00:24:52.810 Why? 00:24:52.810 --> 00:24:56.020 Because it takes as input a parameter called n, 00:24:56.020 --> 00:24:59.680 and it returns as output an integer, which is going 00:24:59.680 --> 00:25:01.390 to be the square thereof, hopefully. 00:25:01.390 --> 00:25:04.120 It has a well-defined input and a well-defined output. 00:25:04.120 --> 00:25:08.080 It is therefore completely within your control in your test program 00:25:08.080 --> 00:25:09.730 to pass in those values. 00:25:09.730 --> 00:25:15.730 Now I will say, if you want to test whether square behaves properly 00:25:15.730 --> 00:25:18.280 when passed something like a string, like, quote, unquote, 00:25:18.280 --> 00:25:20.920 "cat," we could absolutely do something like this, 00:25:20.920 --> 00:25:24.520 assert that the square of quote, unquote, "cat," 00:25:24.520 --> 00:25:26.050 it's not going to equal something. 00:25:26.050 --> 00:25:28.210 You can actually, using different syntax, 00:25:28.210 --> 00:25:31.000 assert that a specific exception will be raised. 00:25:31.000 --> 00:25:34.060 So if we were actually going to go back into our square function, 00:25:34.060 --> 00:25:37.880 improve it, and deliberately raise an exception, we could test for that too. 00:25:37.880 --> 00:25:41.200 But for now, I'm deliberately only testing the square function. 00:25:41.200 --> 00:25:43.990 I'm not testing for specific user input.
00:25:43.990 --> 00:25:45.670 But that's another problem to be solved. 00:25:45.670 --> 00:25:49.890 Other questions now on unit tests? 00:25:49.890 --> 00:25:56.670 SPEAKER 4: Do use the unit test to test code for the CS50 check? 00:25:56.670 --> 00:25:58.830 DAVID MALAN: So Check 50 is similar in spirit. 00:25:58.830 --> 00:26:03.000 Check 50 is a tool that we, CS50, wrote that is essentially doing something 00:26:03.000 --> 00:26:06.750 like pytest for the evaluation of students' code. 00:26:06.750 --> 00:26:10.080 It is similar in spirit, but think of Check 50 00:26:10.080 --> 00:26:12.697 as being an alternative to pytest, if you will. 00:26:12.697 --> 00:26:14.280 But it works a little bit differently. 00:26:14.280 --> 00:26:17.370 But same idea, pytest and unit testing more 00:26:17.370 --> 00:26:19.890 generally is a technique that is independent of CS50 00:26:19.890 --> 00:26:23.430 and is something that you can and should be doing on your own code, both in 00:26:23.430 --> 00:26:25.240 or outside of this class. 00:26:25.240 --> 00:26:31.322 How about one other question here on our unit tests? 00:26:31.322 --> 00:26:33.030 SPEAKER 5: My question is that is instead 00:26:33.030 --> 00:26:37.140 of writing four times, like as a square of, 2 squared 4, 00:26:37.140 --> 00:26:43.450 instead of that, can we write equals to in square brackets the numbers we want, 00:26:43.450 --> 00:26:44.980 instead of writing four lines? 00:26:44.980 --> 00:26:46.980 DAVID MALAN: A really good question, absolutely. 00:26:46.980 --> 00:26:49.620 Right now if I go back to test_calculator.py, 00:26:49.620 --> 00:26:51.420 it's indeed pretty manual. 
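One way to read the suggestion in this question is a loop over a list of input and expected-output pairs. The sketch below assumes a correct square, defined inline so the snippet runs on its own; the name cases is made up for illustration.

```python
# Stand-in for: from calculator import square
def square(n):
    return n * n


def test_square():
    # Each pair is (input, expected result); the list name is illustrative.
    cases = [(2, 4), (3, 9), (-2, 4), (-3, 9), (0, 0)]
    for n, expected in cases:
        assert square(n) == expected
```

The trade-off is that the loop stops at the first pair that fails, just as a sequence of plain assert lines in one function would.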
00:26:51.420 --> 00:26:54.570 It took me a while to say and to type out those several lines, 00:26:54.570 --> 00:26:58.710 and you could imagine writing some kind of loop to just assert in a loop 00:26:58.710 --> 00:27:02.220 that this equals that, that this equals that, and so forth, using a list 00:27:02.220 --> 00:27:05.530 or using maybe a list or a dictionary or some structure like that. 00:27:05.530 --> 00:27:08.008 So yes, you can absolutely automate some of these tests 00:27:08.008 --> 00:27:10.050 by not just doing the same thing again and again. 00:27:10.050 --> 00:27:12.660 You can still use all of the syntax of Python to do loops. 00:27:12.660 --> 00:27:16.200 But generally speaking, your tests should be pretty simple. 00:27:16.200 --> 00:27:21.510 And in fact, let me propose that we improve upon even this design further, 00:27:21.510 --> 00:27:28.320 because at the moment what's not really ideal, when I run all of these tests 00:27:28.320 --> 00:27:32.010 when my function is buggy, is notice the output that I got. 00:27:32.010 --> 00:27:35.550 Let me reintroduce that same bug by changing my multiplication back 00:27:35.550 --> 00:27:36.520 to addition. 00:27:36.520 --> 00:27:39.150 Let me increase the size of my terminal window again. 00:27:39.150 --> 00:27:42.600 And let me run pytest again of test_calculator.py. 00:27:42.600 --> 00:27:46.260 So this is the version of my code now that has the bug again. 00:27:46.260 --> 00:27:49.290 So I'm going to see that big massive failure where 00:27:49.290 --> 00:27:52.530 this failure has been displayed to me. 00:27:52.530 --> 00:27:55.260 But this is not as helpful as it could be, 00:27:55.260 --> 00:27:58.180 because I have all of those other tests in my code. 00:27:58.180 --> 00:28:01.350 Recall that I had, what, one, two, three, four, five separate tests, 00:28:01.350 --> 00:28:03.270 and I'm only seeing the output of the first. 00:28:03.270 --> 00:28:04.410 Now, why is that? 
00:28:04.410 --> 00:28:06.690 If we go back to my code here, you'll see 00:28:06.690 --> 00:28:11.370 that the first assertion that's failing, namely this one here, that assert 00:28:11.370 --> 00:28:15.750 of square of 3 equals equals 9, the other tests aren't even getting run. 00:28:15.750 --> 00:28:19.830 And that's not a big deal in the sense that my code is buggy, so one or more 00:28:19.830 --> 00:28:21.630 of them are probably going to fail anyway, 00:28:21.630 --> 00:28:24.870 but wouldn't it be nice to know which of them are going to fail? 00:28:24.870 --> 00:28:27.900 And in fact, it's ideal to run as many tests all at once as possible 00:28:27.900 --> 00:28:31.020 to give you as many clues as possible to finding your bug. 00:28:31.020 --> 00:28:35.010 So let me propose that we improve the design of my testing code 00:28:35.010 --> 00:28:38.040 now, still using pytest as follows. 00:28:38.040 --> 00:28:41.280 Instead of having one big function called test_square 00:28:41.280 --> 00:28:45.090 that tests the entire function itself with so many different inputs, 00:28:45.090 --> 00:28:48.270 let's break down my tests into different categories. 00:28:48.270 --> 00:28:51.100 And here, too, there's no one right way to do this. 00:28:51.100 --> 00:28:53.430 But my mind is thinking that I should maybe 00:28:53.430 --> 00:28:57.840 test positive numbers separately, test negative numbers separately, and test 0 00:28:57.840 --> 00:28:58.470 separately. 00:28:58.470 --> 00:28:59.637 I could think of other ways. 00:28:59.637 --> 00:29:00.810 I could test even numbers. 00:29:00.810 --> 00:29:03.930 I could test odd numbers or maybe some other pattern altogether, 00:29:03.930 --> 00:29:07.140 but separating this big test into multiple tests 00:29:07.140 --> 00:29:10.150 is probably going to yield more clues for me when something goes wrong. 00:29:10.150 --> 00:29:11.620 So let me do this. 
00:29:11.620 --> 00:29:15.570 Let me go ahead and rename this function to test positive initially, 00:29:15.570 --> 00:29:19.170 and let me include in that function only those first two tests. 00:29:19.170 --> 00:29:23.560 Let me then create another function here called test negative. 00:29:23.560 --> 00:29:27.780 And in this function, let me test only negative 2 and negative 3. 00:29:27.780 --> 00:29:31.500 Then down here, let me do one more def of test_zero, 00:29:31.500 --> 00:29:33.660 and I'll just run one test in there. 00:29:33.660 --> 00:29:36.690 So I have the same assertions, the same five, 00:29:36.690 --> 00:29:39.960 but I've now divided them up among three separate functions. 00:29:39.960 --> 00:29:43.470 What's nice about pytest and other unit testing frameworks 00:29:43.470 --> 00:29:47.340 is that all three of these test functions will be run automatically. 00:29:47.340 --> 00:29:50.400 Even if one of them fails, the others will be attempted. 00:29:50.400 --> 00:29:53.790 That means that if one or two or three of them fail, 00:29:53.790 --> 00:29:58.235 I'll have one or two or three clues now for helping me find that mistake. 00:29:58.235 --> 00:30:01.110 So let me go ahead and again increase the size of my terminal window, 00:30:01.110 --> 00:30:02.693 just so we can see more on the screen. 00:30:02.693 --> 00:30:07.350 My calculator still has the bug, using addition, instead of multiplication. 00:30:07.350 --> 00:30:12.060 Let me go ahead and run not Python, but again, pytest of test_calculator.py, 00:30:12.060 --> 00:30:14.550 crossing my fingers as always, and now, oh my God, 00:30:14.550 --> 00:30:16.500 there's even more errors on the screen. 00:30:16.500 --> 00:30:19.038 But this in itself is more helpful. 00:30:19.038 --> 00:30:20.830 Let's work through them from top to bottom. 
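The three test functions just described might be sketched like this. The function names come from the lecture; square is defined inline in its corrected form (an assumption) so that the snippet is self-contained, whereas the lecture imports it from calculator.py.

```python
# Stand-in for: from calculator import square
def square(n):
    return n * n


def test_positive():
    assert square(2) == 4
    assert square(3) == 9


def test_negative():
    assert square(-2) == 4
    assert square(-3) == 9


def test_zero():
    assert square(0) == 0
```

Because each function is collected and run as a separate test, a failure in one does not prevent the others from being attempted.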
00:30:20.830 --> 00:30:22.677 So under FAILURES here, in all caps, which 00:30:22.677 --> 00:30:25.260 I know is not very encouraging to see failure when you're just 00:30:25.260 --> 00:30:28.260 trying to solve a problem, but that's what these frameworks do, 00:30:28.260 --> 00:30:31.530 under FAILURES, the first function that failed is test_positive. 00:30:31.530 --> 00:30:34.230 But here, too, we see the same clue as before. 00:30:34.230 --> 00:30:38.230 The first one, 2, the square of 2 equals equals 4, that one is fine. 00:30:38.230 --> 00:30:40.210 It's not erring with any red errors. 00:30:40.210 --> 00:30:41.650 But the next one is failing. 00:30:41.650 --> 00:30:45.150 So I know that square is broken when I pass in 3. 00:30:45.150 --> 00:30:46.122 What about down here? 00:30:46.122 --> 00:30:49.080 It looks like, unfortunately, my test negative function is failing too. 00:30:49.080 --> 00:30:49.950 Why? 00:30:49.950 --> 00:30:53.880 When I pass in-- oh, this is interesting-- here now, negative 2 00:30:53.880 --> 00:30:55.270 doesn't even work. 00:30:55.270 --> 00:30:56.640 So I got lucky with positive 2. 00:30:56.640 --> 00:30:58.050 But negative 2 isn't working. 00:30:58.050 --> 00:30:59.400 So that's a bit of a clue. 00:30:59.400 --> 00:31:02.850 But in total, only two tests failed. 00:31:02.850 --> 00:31:07.500 So notice at the very bottom, this summary, two failed and one passed. 00:31:07.500 --> 00:31:08.400 What's the other one? 00:31:08.400 --> 00:31:09.358 What was the third one? 00:31:09.358 --> 00:31:10.020 Test zero. 00:31:10.020 --> 00:31:11.970 So test zero is passing. 00:31:11.970 --> 00:31:13.590 These two are failing. 00:31:13.590 --> 00:31:17.170 And so that kind of leads me logically, mathematically, if you will, 00:31:17.170 --> 00:31:18.282 to the source of the bug. 
00:31:18.282 --> 00:31:20.490 And just to be clear too, if you have a lot of tests, 00:31:20.490 --> 00:31:23.910 this little one line output is helpful, even though also a bit discouraging, 00:31:23.910 --> 00:31:27.180 fail, fail, and dot means pass. 00:31:27.180 --> 00:31:28.930 So there are the three tests just depicted 00:31:28.930 --> 00:31:31.480 graphically a little bit differently. 00:31:31.480 --> 00:31:35.590 Let me rewind now and go back in to calculator.py. 00:31:35.590 --> 00:31:38.050 Let's fix that bug, because let's suppose 00:31:38.050 --> 00:31:40.150 that I've deduced I'm using addition. 00:31:40.150 --> 00:31:42.460 I should have been using multiplication all this time. 00:31:42.460 --> 00:31:44.710 Let me now after fixing the bug yet again, 00:31:44.710 --> 00:31:46.550 let me go back to my big terminal. 00:31:46.550 --> 00:31:51.160 Let me run pytest of test_calculator.py, hitting Enter, crossing my fingers now, 00:31:51.160 --> 00:31:53.620 and dot dot dot means all is well. 00:31:53.620 --> 00:31:56.360 100% of my tests passed, all three of them. 00:31:56.360 --> 00:31:57.620 So now I'm good. 00:31:57.620 --> 00:32:02.110 It doesn't necessarily mean that my code is 100% correct. 00:32:02.110 --> 00:32:05.950 But it does mean that it has passed 100% of my current tests. 00:32:05.950 --> 00:32:10.150 And so it would probably behoove us to think a little harder about maybe 00:32:10.150 --> 00:32:11.710 we should test bigger numbers. 00:32:11.710 --> 00:32:13.600 Maybe we should test even smaller numbers. 00:32:13.600 --> 00:32:15.610 Maybe we should test strings or something else. 00:32:15.610 --> 00:32:19.090 The onus is ultimately on you to decide what you're going to test. 
00:32:19.090 --> 00:32:22.360 But in the real world, you're going to be very unhappy with yourself 00:32:22.360 --> 00:32:25.900 or someone else-- maybe your boss is going to be very unhappy with you-- 00:32:25.900 --> 00:32:29.320 if you did not catch a bug in your code, which you could have caught 00:32:29.320 --> 00:32:33.070 had you just written a test to try that kind of input. 00:32:33.070 --> 00:32:35.500 Let me pause again and see if there's any questions now 00:32:35.500 --> 00:32:38.740 on unit testing with pytest. 00:32:38.740 --> 00:32:41.950 SPEAKER 6: So if you wanted to test, like someone suggested before, 00:32:41.950 --> 00:32:45.430 user input as well as testing your function, 00:32:45.430 --> 00:32:47.500 do you do that within the same file? 00:32:47.500 --> 00:32:50.260 Or do you make separate files for different types of tests? 00:32:50.260 --> 00:32:51.677 DAVID MALAN: Really good question. 00:32:51.677 --> 00:32:55.120 You could absolutely make separate files to test different types of things. 00:32:55.120 --> 00:32:58.300 Or if you don't have that many, you can keep them all in the same file. 00:32:58.300 --> 00:33:01.827 At the moment, I've been storing all of my tests in one file for convenience, 00:33:01.827 --> 00:33:03.410 and there's not terribly many of them. 00:33:03.410 --> 00:33:05.380 But we'll take a look in a bit at an example 00:33:05.380 --> 00:33:08.470 that allows me to put them into a folder and even run pytest 00:33:08.470 --> 00:33:11.090 on the whole folder of tests as well. 00:33:11.090 --> 00:33:12.010 So that's possible. 00:33:12.010 --> 00:33:14.120 Other questions on unit testing. 00:33:14.120 --> 00:33:16.810 SPEAKER 7: So I've got two questions. 00:33:16.810 --> 00:33:22.960 So a couple of while ago, you just used an exception called-- 00:33:22.960 --> 00:33:26.110 I'm not sure what it was-- oh yeah, assertion error. 00:33:26.110 --> 00:33:30.160 What exactly does that particular error catch? 
00:33:30.160 --> 00:33:36.500 And my second question is, does the assert keyword 00:33:36.500 --> 00:33:39.320 stand out to the compiler, exactly tell them 00:33:39.320 --> 00:33:42.987 to insert this particular line of code? 00:33:42.987 --> 00:33:43.820 DAVID MALAN: Indeed. 00:33:43.820 --> 00:33:48.320 The assert keyword we're seeing and the assertion error we saw earlier 00:33:48.320 --> 00:33:49.530 are intertwined. 00:33:49.530 --> 00:33:52.460 So when you use assert and the assertion fails, 00:33:52.460 --> 00:33:56.570 because whatever Boolean expression you're using is not true, it's false, 00:33:56.570 --> 00:34:00.170 an assertion error, by definition of Python, will be raised. 00:34:00.170 --> 00:34:02.180 So those two work in conjunction. 00:34:02.180 --> 00:34:06.920 Those errors, those assertion errors, are still being raised by my code 00:34:06.920 --> 00:34:09.320 here when any of these lines of code fail. 00:34:09.320 --> 00:34:12.139 However, pytest, this third party library, 00:34:12.139 --> 00:34:16.639 is handling the process of catching those exceptions automatically for me, 00:34:16.639 --> 00:34:18.810 so as to give me this standard output. 00:34:18.810 --> 00:34:22.488 So we started today's story by really implementing unit testing myself. 00:34:22.488 --> 00:34:23.780 I wrote all of the code myself. 00:34:23.780 --> 00:34:24.440 I wrote main. 00:34:24.440 --> 00:34:25.400 I did my conditional. 00:34:25.400 --> 00:34:26.540 I did try and except. 00:34:26.540 --> 00:34:29.277 Honestly, it's going to get incredibly painful to write tests 00:34:29.277 --> 00:34:32.360 long term if you and I have to write that much code every time, especially 00:34:32.360 --> 00:34:34.010 when our function is this small. 00:34:34.010 --> 00:34:38.239 So pytest and unit testing frameworks like it just automate so much of that. 
00:34:38.239 --> 00:34:43.460 Essentially, pytest adds the try, the except, the if, the prints for you, 00:34:43.460 --> 00:34:46.580 so you can just focus on the essence of the test, which 00:34:46.580 --> 00:34:49.130 really are these inputs and outputs. 00:34:49.130 --> 00:34:52.980 How about time for one other question here on unit testing as well? 00:34:52.980 --> 00:35:00.320 SPEAKER 8: So when we enter minus x or minus 5 squared, 00:35:00.320 --> 00:35:03.270 square root of that number comes up. 00:35:03.270 --> 00:35:07.460 But when we put 6.6 or 5.6, something like that integer, 00:35:07.460 --> 00:35:11.370 then line shows error. 00:35:11.370 --> 00:35:13.590 So what's happening there? 00:35:13.590 --> 00:35:16.850 DAVID MALAN: So I'm deliberately testing integers right now, 00:35:16.850 --> 00:35:19.953 in large part because I only want pow to operate on integers. 00:35:19.953 --> 00:35:23.120 And that might be conveyed in Python's documentation or my own documentation 00:35:23.120 --> 00:35:24.050 for that function. 00:35:24.050 --> 00:35:26.850 If you were to pass in something else, like a float, 00:35:26.850 --> 00:35:30.920 it turns out that floating point values in Python and other languages 00:35:30.920 --> 00:35:33.150 are actually very hard, if not impossible, 00:35:33.150 --> 00:35:35.420 to represent 100% precisely. 00:35:35.420 --> 00:35:39.020 And so if you are trying to compare it against some other value, 00:35:39.020 --> 00:35:41.900 there might be slight rounding errors as a result. 00:35:41.900 --> 00:35:43.940 I'm just inferring from what you've described, 00:35:43.940 --> 00:35:47.480 but I'm very deliberately now testing this function with only the inputs 00:35:47.480 --> 00:35:48.680 that I would expect. 00:35:48.680 --> 00:35:53.300 It might indeed throw other errors if other inputs are passed. 
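The rounding issue mentioned here can be seen with a short sketch. It assumes the same square function and uses math.isclose from the standard library as one possible way to compare floats; pytest.approx is another option that pytest provides for the same purpose.

```python
import math


def square(n):
    return n * n


# 0.1 cannot be represented exactly in binary floating point,
# so the product is very slightly off from 0.01.
print(square(0.1) == 0.01)              # False
print(math.isclose(square(0.1), 0.01))  # True
```

This is one reason a test that compares floats with == can fail even when the function under test is logically correct.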
00:35:53.300 --> 00:35:56.240 Allow me to propose that we consider what should happen if square 00:35:56.240 --> 00:35:58.100 isn't actually passed a number. 00:35:58.100 --> 00:36:01.100 For instance, if I go back to calculator.py, 00:36:01.100 --> 00:36:04.730 and suppose that I, or perhaps someone else using my square function, 00:36:04.730 --> 00:36:09.020 simply forgets to convert the return value of input from a str to an int, 00:36:09.020 --> 00:36:11.270 as by modifying line 2 here. 00:36:11.270 --> 00:36:14.870 Now, something's definitely going to go wrong if I type in a str 00:36:14.870 --> 00:36:16.910 instead of what appears to be an int. 00:36:16.910 --> 00:36:18.980 For instance, if I clear my terminal here, 00:36:18.980 --> 00:36:22.250 run Python of calculator.py and hit Enter-- 00:36:22.250 --> 00:36:26.220 let's type in cat as our value for x-- and of course, 00:36:26.220 --> 00:36:27.570 this raises now a type error. 00:36:27.570 --> 00:36:28.070 Why? 00:36:28.070 --> 00:36:30.620 Can't multiply sequence by non-int of type 'str'. 00:36:30.620 --> 00:36:31.700 What does that mean? 00:36:31.700 --> 00:36:35.000 You can't do cat times cat, because indeed, square is 00:36:35.000 --> 00:36:36.860 expecting that n will be some number. 00:36:36.860 --> 00:36:39.650 But that doesn't necessarily mean that square itself is buggy. 00:36:39.650 --> 00:36:43.070 But this does mean that if I expect a type error to be raised, 00:36:43.070 --> 00:36:47.790 let's test for that too, so that I know the behavior indeed works as expected. 00:36:47.790 --> 00:36:53.070 So let me go back to test_calculator.py, and let me go and add a fourth test down 00:36:53.070 --> 00:36:53.570 here. 00:36:53.570 --> 00:36:56.240 How about define test underscore, and I'll 00:36:56.240 --> 00:36:59.510 call this test_str, because I'm going to specifically and deliberately pass 00:36:59.510 --> 00:37:01.080 in a str for testing.
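A test like test_str cannot be written with a plain equality assertion, because the call raises instead of returning. One way to express "this call should raise TypeError" is pytest.raises, used as a context manager. In the sketch below, square is defined inline (an assumption) so the snippet runs on its own.

```python
import pytest


# Stand-in for: from calculator import square
def square(n):
    return n * n


def test_str():
    # The test passes only if the body raises TypeError.
    with pytest.raises(TypeError):
        square("cat")
```

If square("cat") returned normally or raised a different exception, the test would fail.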
00:37:01.080 --> 00:37:06.290 And I want to in spirit assert that passing in something like cat to square 00:37:06.290 --> 00:37:08.030 will raise a type error. 00:37:08.030 --> 00:37:10.550 But we don't use the assert keyword for that. 00:37:10.550 --> 00:37:11.630 Rather, we need this. 00:37:11.630 --> 00:37:14.570 Let me go to the top of this file, and let me additionally 00:37:14.570 --> 00:37:18.020 import the pytest library itself, because it turns out 00:37:18.020 --> 00:37:20.180 there's a function in that library called 00:37:20.180 --> 00:37:25.280 raises that allows me to express that I expect an exception to be raised. 00:37:25.280 --> 00:37:29.330 And I can express that as follows with pytest.raises, 00:37:29.330 --> 00:37:33.180 and then in parentheses I can pass in the type of exception I expect, 00:37:33.180 --> 00:37:35.720 which is going to be a type error in this case. 00:37:35.720 --> 00:37:38.720 And now when do I expect that type error to be raised? 00:37:38.720 --> 00:37:42.320 Whenever I do something like calling square and passing in not a number, 00:37:42.320 --> 00:37:44.150 but something like cat. 00:37:44.150 --> 00:37:46.380 So now if I go back to my terminal window, 00:37:46.380 --> 00:37:51.050 run pytest of test calculator.py, this time having four tests, 00:37:51.050 --> 00:37:55.880 I should see that all four now are successful. 00:37:55.880 --> 00:37:59.870 Let's now consider how we could test code that doesn't just expect numbers 00:37:59.870 --> 00:38:02.000 as input, but actually strings. 00:38:02.000 --> 00:38:04.640 And let me rewind us in time here in VS Code 00:38:04.640 --> 00:38:09.380 to that very first program we wrote a few different versions of in hello.py 00:38:09.380 --> 00:38:11.610 that ultimately looked a little something like this. 00:38:11.610 --> 00:38:14.480 I had a main function that prompted the user 00:38:14.480 --> 00:38:18.020 for the value of a variable by asking them, "what's your name?" 
00:38:18.020 --> 00:38:19.050 question mark. 00:38:19.050 --> 00:38:21.650 And then we went ahead and did something like hello, 00:38:21.650 --> 00:38:26.570 open paren, name, passing that user's name into a function called hello. 00:38:26.570 --> 00:38:30.210 Now that function hello recall ultimately looked like this. 00:38:30.210 --> 00:38:33.530 We defined hello as taking a parameter called to, 00:38:33.530 --> 00:38:37.850 the default value of which was world, and that function very simply 00:38:37.850 --> 00:38:41.780 printed hello, followed by a comma, and then whatever 00:38:41.780 --> 00:38:43.310 the name that had been passed in. 00:38:43.310 --> 00:38:46.520 And then we ultimately called main, but for now onward, 00:38:46.520 --> 00:38:48.650 I'm going to always add this if conditional, 00:38:48.650 --> 00:38:53.360 if name equals equals underscore underscore main, then and only then 00:38:53.360 --> 00:38:54.380 do I want to call main. 00:38:54.380 --> 00:38:58.580 So that's essentially what this program looked like in its last incarnation. 00:38:58.580 --> 00:39:00.560 How do we go about testing it? 00:39:00.560 --> 00:39:03.800 Here again too, I'm not going to test the user's input per se in main. 00:39:03.800 --> 00:39:07.580 I'm going to focus really on the module of code 00:39:07.580 --> 00:39:10.220 here that's of interest, which is the hello function itself. 00:39:10.220 --> 00:39:14.420 How can I go about testing the hello function? 00:39:14.420 --> 00:39:19.550 Unfortunately, even if I start by doing something like code of test hello.py-- 00:39:19.550 --> 00:39:22.340 let me go about and start writing a test program-- 00:39:22.340 --> 00:39:26.210 I could import from my hello program a function called hello. 
00:39:26.210 --> 00:39:28.700 So a bit strange to see from hello import 00:39:28.700 --> 00:39:32.900 hello, but notice that on this line here, I'm importing from the module-- 00:39:32.900 --> 00:39:36.680 that is the file called hello.py-- the function called hello. 00:39:36.680 --> 00:39:40.400 And how do I go about testing this? 00:39:40.400 --> 00:39:46.610 If I have a function like define test_argument like this-- 00:39:46.610 --> 00:39:48.090 let me do this. 00:39:48.090 --> 00:39:53.510 So if I were to define a function like define test_hello, what could I do? 00:39:53.510 --> 00:39:59.840 I could call hello with quote, unquote, say, "David," 00:39:59.840 --> 00:40:04.760 and then check if it equals, what, "hello, David." 00:40:04.760 --> 00:40:07.400 So would this work, this approach here? 00:40:07.400 --> 00:40:10.730 If I've written a test, called test_hello, that 00:40:10.730 --> 00:40:14.240 calls hello with an argument of David and then tests its return value, 00:40:14.240 --> 00:40:19.820 just like we've done for our calculator, would this work as written? 00:40:19.820 --> 00:40:22.370 And let me go back to in just a moment the version 00:40:22.370 --> 00:40:23.730 of hello that we're testing. 00:40:23.730 --> 00:40:25.550 So you can see that function hello. 00:40:25.550 --> 00:40:27.380 Here's the test. 00:40:27.380 --> 00:40:29.900 Here is the actual code. 00:40:29.900 --> 00:40:32.900 Would this test now work? 00:40:32.900 --> 00:40:34.010 Any thoughts? 00:40:34.010 --> 00:40:38.060 SPEAKER 9: I think the problem is that in the first version in hello.py, 00:40:38.060 --> 00:40:42.860 you're using the to argument that you first declared, when you declared 00:40:42.860 --> 00:40:47.070 the function instead of using the name. 00:40:47.070 --> 00:40:50.580 DAVID MALAN: That is actually not a bug here. 00:40:50.580 --> 00:40:53.730 So let me stipulate that in hello.py, this code actually 00:40:53.730 --> 00:40:54.863 does work as intended. 
00:40:54.863 --> 00:40:57.780 And let me go ahead and test it manually, just to demonstrate as much. 00:40:57.780 --> 00:41:03.610 Let me run Python of hello.py, typing in, as my name, D-A-V-I-D, and I see, 00:41:03.610 --> 00:41:05.280 in fact, that it says, "hello, David." 00:41:05.280 --> 00:41:07.560 If, though, I were to change this program, 00:41:07.560 --> 00:41:11.460 and get rid of the name argument, get rid of the name variable, 00:41:11.460 --> 00:41:14.790 and just call hello, again, running Python of hello.py, 00:41:14.790 --> 00:41:17.700 this time I'm not even prompted, because I got rid of my input call, 00:41:17.700 --> 00:41:19.740 but it does behave as I expect. 00:41:19.740 --> 00:41:21.660 It does say "hello, world." 00:41:21.660 --> 00:41:26.820 So let me stipulate that this code in its current form is actually correct, 00:41:26.820 --> 00:41:30.310 but my test is not going to work as I'd hoped. 00:41:30.310 --> 00:41:38.310 And there's a subtle difference between my hello function 00:41:38.310 --> 00:41:41.490 and my square function that explains. 00:41:41.490 --> 00:41:45.420 Why might this test not work as intended? 00:41:45.420 --> 00:41:47.445 SPEAKER 10: Because it's not returning a value. 00:41:47.445 --> 00:41:48.570 DAVID MALAN: Yeah, exactly. 00:41:48.570 --> 00:41:50.850 Recall our discussion early on about functions. 00:41:50.850 --> 00:41:54.420 Functions can either return a value, like my square function hands 00:41:54.420 --> 00:41:56.580 you back the square of some value, or they 00:41:56.580 --> 00:41:59.425 can have side effects, sort of visual artifacts 00:41:59.425 --> 00:42:02.550 that might happen on the screen, like printing something out on the screen. 00:42:02.550 --> 00:42:05.290 And by definition, that's how print works. 
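To make that distinction concrete, here is a minimal sketch, assuming the print-based version of hello described above. Calling it shows the greeting on the screen as a side effect, but the value handed back is None, so an equals-equals comparison against a string cannot succeed.

```python
def hello(to="world"):
    # Side effect only: print writes to the screen and nothing is returned.
    print("hello,", to)


result = hello("David")  # the greeting appears on the screen here
print(result)            # the value handed back is None

# This is effectively what the naive test compares, and it is False.
print(result == "hello, David")
```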
00:42:05.290 --> 00:42:08.550 Notice that hello, it is short, but it's implemented ultimately 00:42:08.550 --> 00:42:12.240 using the print function, which does not return a value as I'm using it here. 00:42:12.240 --> 00:42:15.510 It instead has this side effect of printing something onto the screen. 00:42:15.510 --> 00:42:19.110 So it is not correct in my test function to check 00:42:19.110 --> 00:42:23.820 if the return value of hello equals equals hello David, 00:42:23.820 --> 00:42:26.280 because again, hello is not returning anything. 00:42:26.280 --> 00:42:28.050 It's printing something, that side effect, 00:42:28.050 --> 00:42:31.350 but notice, literally, it has no return keyword, 00:42:31.350 --> 00:42:34.290 unlike my square function, which did. 00:42:34.290 --> 00:42:37.440 So here's an opportunity to perhaps change 00:42:37.440 --> 00:42:41.040 how I go about implementing my actual functions. 00:42:41.040 --> 00:42:44.790 It turns out that as your programs get more and more sophisticated, more 00:42:44.790 --> 00:42:47.730 and more complicated, it tends to be best practice not 00:42:47.730 --> 00:42:50.250 to have side effects if you can avoid it, 00:42:50.250 --> 00:42:52.650 especially if you want your code to be testable. 00:42:52.650 --> 00:42:56.910 And in fact, I'm going to propose that we change my hello program to now work 00:42:56.910 --> 00:42:57.850 as follows. 00:42:57.850 --> 00:43:03.180 Let me go ahead and change this function to not print hello and then that name. 00:43:03.180 --> 00:43:05.760 Let me go ahead and literally return maybe 00:43:05.760 --> 00:43:09.210 an F string, which will clean this up a little bit, hello comma 00:43:09.210 --> 00:43:11.970 to close quotes at the end. 00:43:11.970 --> 00:43:15.540 So my syntax here is just the familiar f string or format string. 
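The change just described might look like the following sketch. It assumes the same parameter name, to, and the same default value, world, as before; the two print calls at the bottom are only illustrative callers.

```python
def hello(to="world"):
    # Return the greeting instead of printing it, so that callers
    # (and tests) can inspect the value.
    return f"hello, {to}"


# The caller now decides what to do with the string, for example print it.
print(hello("David"))
print(hello())
```

Because the function now hands back a string, an assert with equals equals has something to compare against.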
00:43:15.540 --> 00:43:19.800 It's going to return hello, world or hello, David or hello, whomever's name 00:43:19.800 --> 00:43:23.010 is passed in as that argument, but I'm returning it now. 00:43:23.010 --> 00:43:24.810 I'm not printing it out. 00:43:24.810 --> 00:43:27.480 So what needs to change up here? 00:43:27.480 --> 00:43:29.560 I could do something like this. 00:43:29.560 --> 00:43:33.000 I could say something like output equals hello 00:43:33.000 --> 00:43:35.940 and then print output in my main function. 00:43:35.940 --> 00:43:38.880 Or I can simplify that, because I don't really need that variable. 00:43:38.880 --> 00:43:40.560 I could instead just do this. 00:43:40.560 --> 00:43:44.700 I could still call hello, but I could immediately print out the result. 00:43:44.700 --> 00:43:49.500 And this version of my hello program now is actually more testable. 00:43:49.500 --> 00:43:50.040 Why? 00:43:50.040 --> 00:43:52.440 Because these assert statements that we're using, 00:43:52.440 --> 00:43:54.930 and we've seen thus far for our tests, are really 00:43:54.930 --> 00:44:00.150 designed to test arguments into functions and return values 00:44:00.150 --> 00:44:02.450 therefrom, not testing side effects. 00:44:02.450 --> 00:44:05.700 So if you're doing equals equals, you're looking for a return value, something 00:44:05.700 --> 00:44:07.390 that's handed back from the function. 00:44:07.390 --> 00:44:08.340 So that's fine. 00:44:08.340 --> 00:44:11.970 If I modify the design of my program now not to just print hello, 00:44:11.970 --> 00:44:17.400 but to return the string, the sentence, the phrase that I want to construct, 00:44:17.400 --> 00:44:19.290 I can leave it to the caller-- 00:44:19.290 --> 00:44:22.170 that is the function who's using this hello function-- 00:44:22.170 --> 00:44:24.090 to handle the actual printing. 00:44:24.090 --> 00:44:25.830 Now what does this mean in my code?
00:44:25.830 --> 00:44:28.800 It means now if my hello.py looks like this, 00:44:28.800 --> 00:44:33.000 and hello is indeed returning a value, in my test_hello function, 00:44:33.000 --> 00:44:35.320 I can test it exactly like this. 00:44:35.320 --> 00:44:38.970 So let me go ahead and run pytest of test_hello.py, 00:44:38.970 --> 00:44:42.300 crossing my fingers as always, and voila, one passed. 00:44:42.300 --> 00:44:45.240 So I passed this test, because apparently the return value of hello 00:44:45.240 --> 00:44:48.300 does indeed equal "hello, David." 00:44:48.300 --> 00:44:49.920 Let's test the other scenario. 00:44:49.920 --> 00:44:53.190 What if I call hello without any arguments? 00:44:53.190 --> 00:44:56.730 Let's assert that calling hello with nothing in those parentheses 00:44:56.730 --> 00:45:00.690 similarly equals hello comma, but world, the default value. 00:45:00.690 --> 00:45:04.560 Let me now go ahead and run pytest of test_hello.py. 00:45:04.560 --> 00:45:07.020 And that too passes entirely. 00:45:07.020 --> 00:45:09.780 But there too, suppose that I had made some mistakes. 00:45:09.780 --> 00:45:12.030 Suppose that there were a bug in my code. 00:45:12.030 --> 00:45:15.610 It might not be best practice to combine multiple tests in this one function, 00:45:15.610 --> 00:45:18.330 so let's make it more clear what might pass or fail. 00:45:18.330 --> 00:45:22.350 Let's call the first function test_default. 00:45:22.350 --> 00:45:24.660 And let's only include this first line of code. 00:45:24.660 --> 00:45:28.140 And then let's go ahead and define another function, like test_argument, 00:45:28.140 --> 00:45:30.730 to test this other line of code here. 00:45:30.730 --> 00:45:32.820 So now I have two different tests, each of which 00:45:32.820 --> 00:45:35.620 is testing something a little fundamentally different. 00:45:35.620 --> 00:45:38.430 So now when I run my code, it's still not broken.
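Splitting the checks into two functions, as just described, could be sketched like this. One assumption to note: in the lecture, hello lives in hello.py and the test file begins with from hello import hello; it is defined inline here only so that the sketch runs on its own.

```python
# Assumption: inlined for self-containment. In test_hello.py this would be
#     from hello import hello
def hello(to="world"):
    return f"hello, {to}"


def test_default():
    # No argument: the default value, world, should be used.
    assert hello() == "hello, world"


def test_argument():
    # An explicit argument should replace the default.
    assert hello("David") == "hello, David"
```

With the checks separated, a failure report names the specific function that failed, which narrows down where a bug might be.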
00:45:38.430 --> 00:45:43.740 If I run pytest of test_hello.py, Enter, I've now passed two tests. 00:45:43.740 --> 00:45:45.800 And that's just as good as before. 00:45:45.800 --> 00:45:49.220 But if I did have a bug, having two tests instead of one 00:45:49.220 --> 00:45:54.000 would indeed give me, perhaps, a bit more of a hint as to what's wrong. 00:45:54.000 --> 00:45:57.570 Questions now on this testing of return values, 00:45:57.570 --> 00:46:00.720 when these return values are now strings instead of integers 00:46:00.720 --> 00:46:02.310 and why we've done this? 00:46:02.310 --> 00:46:07.050 SPEAKER 11: So my question is about function inside the function. 00:46:07.050 --> 00:46:14.020 Can we test that too or recursion we haven't seen? 00:46:14.020 --> 00:46:17.200 DAVID MALAN: If you have a recursive function, which we've not 00:46:17.200 --> 00:46:19.330 discussed in this class, yes, you can absolutely 00:46:19.330 --> 00:46:23.530 test those too by simply calling them exactly in this way. 00:46:23.530 --> 00:46:25.603 Recursion does not affect this process. 00:46:25.603 --> 00:46:27.520 How about one more question here on unit tests 00:46:27.520 --> 00:46:29.890 before we look at one final example? 00:46:29.890 --> 00:46:34.780 SPEAKER 12: When testing our arguments, can we 00:46:34.780 --> 00:46:41.220 use something like loops or inside of asserts or for the values? 00:46:41.220 --> 00:46:42.220 DAVID MALAN: Absolutely. 00:46:42.220 --> 00:46:45.100 You can absolutely use a loop to test multiple values. 00:46:45.100 --> 00:46:48.200 In this case, for instance, I could do something like this.
00:46:48.200 --> 00:46:57.730 I could say for name in the following list of Hermione, say, Harry, and Ron, 00:46:57.730 --> 00:47:02.830 I could then within this loop assert that hello of that given name equals 00:47:02.830 --> 00:47:08.680 equals, say, the format string of hello, comma name, 00:47:08.680 --> 00:47:13.390 and then run all of these here at once by running, again, 00:47:13.390 --> 00:47:15.160 pytest of test_hello.py. 00:47:15.160 --> 00:47:17.865 It's still going to be just one test within that function, 00:47:17.865 --> 00:47:20.740 but if there's something interesting about those several strings that 00:47:20.740 --> 00:47:23.440 makes it compelling to test all of them, you can absolutely 00:47:23.440 --> 00:47:24.860 automate the test in that way. 00:47:24.860 --> 00:47:27.520 With that said, each of your tests should ideally 00:47:27.520 --> 00:47:30.040 be pretty simple and pretty small. 00:47:30.040 --> 00:47:30.580 Why? 00:47:30.580 --> 00:47:32.630 Because you don't want to write so much code, 00:47:32.630 --> 00:47:35.860 so much complicated code that your tests might be flawed. 00:47:35.860 --> 00:47:38.890 What we don't want to have to do is write tests for our tests and tests 00:47:38.890 --> 00:47:41.240 for our tests for our tests, because it would never end. 00:47:41.240 --> 00:47:44.023 So keeping tests nice and simple is really the goal, 00:47:44.023 --> 00:47:45.940 so that a reasonable human, yourself included, 00:47:45.940 --> 00:47:49.630 can eyeball them and just claim, yeah, that is correct. 00:47:49.630 --> 00:47:51.973 We don't need tests for our tests. 00:47:51.973 --> 00:47:53.140 How about one other feature? 00:47:53.140 --> 00:47:56.560 Suppose that we don't have just one test, but many different tests instead, 00:47:56.560 --> 00:47:59.920 and we want to start to organize those tests into multiple files and even 00:47:59.920 --> 00:48:00.580 a folder.
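The loop mentioned in answer to that question might be sketched as follows. The test function name, test_argument, is an assumption, and hello is again defined inline rather than imported from hello.py so that the example stands alone.

```python
# Assumption: inlined for self-containment. In test_hello.py this would be
#     from hello import hello
def hello(to="world"):
    return f"hello, {to}"


def test_argument():
    # One test function that checks several names in turn.
    for name in ["Hermione", "Harry", "Ron"]:
        assert hello(name) == f"hello, {name}"
```

pytest still counts this as a single test, and the loop stops at the first name whose assertion fails.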
00:48:00.580 --> 00:48:03.740 Pytest and other frameworks support that paradigm as well. 00:48:03.740 --> 00:48:08.127 In fact, let me go ahead and test hello.py using a folder of tests, 00:48:08.127 --> 00:48:09.960 with technically just one test, but it would 00:48:09.960 --> 00:48:12.520 be representative of having even more in that folder. 00:48:12.520 --> 00:48:15.700 I'm going to go ahead and create a new folder called test 00:48:15.700 --> 00:48:18.040 using mkdir at my command line. 00:48:20.315 --> 00:48:23.440 And then within that folder, I'm going to go ahead and create a file called 00:48:23.440 --> 00:48:25.510 test_hello.py. 00:48:25.510 --> 00:48:28.490 Within this file, meanwhile, I'm going to test the same thing. 00:48:28.490 --> 00:48:31.970 So I'm going to go ahead, and from hello, import hello. 00:48:31.970 --> 00:48:36.260 And I'm going to go ahead and define a function like test default that 00:48:36.260 --> 00:48:39.710 simply tests the scenario where hello with no arguments 00:48:39.710 --> 00:48:41.600 returns hello, comma world. 00:48:41.600 --> 00:48:43.610 And I'm going to have that other function where 00:48:43.610 --> 00:48:45.802 I test that an argument is passed. 00:48:45.802 --> 00:48:47.510 And in this case, I'll choose an argument 00:48:47.510 --> 00:48:50.360 like asserting that hello, quote, unquote, David, 00:48:50.360 --> 00:48:54.740 equals, indeed, hello, comma, not world, but David. 00:48:54.740 --> 00:48:57.440 So in this case, I've just recreated the same test as earlier, 00:48:57.440 --> 00:49:01.220 but they're in a file now in a folder called test. 00:49:01.220 --> 00:49:03.260 Pytest allows me to run these here too. 00:49:03.260 --> 00:49:06.500 But to do so, I actually need to create one other file. 
00:49:06.500 --> 00:49:14.780 Within my test directory, I need to create a file called __init__.py, 00:49:14.780 --> 00:49:18.200 which has the effect, even if this file is empty, 00:49:18.200 --> 00:49:24.230 of telling Python to treat that folder as not just a module, but a package, 00:49:24.230 --> 00:49:25.190 so to speak. 00:49:25.190 --> 00:49:28.340 A package is a Python module or multiple modules 00:49:28.340 --> 00:49:30.560 that are organized inside of a folder. 00:49:30.560 --> 00:49:36.860 And this file, __init__.py, is just a visual indicator to Python that indeed 00:49:36.860 --> 00:49:39.170 it should treat that folder as a package. 00:49:39.170 --> 00:49:41.960 If I had more code in this folder, I could do even more things 00:49:41.960 --> 00:49:42.683 with this file. 00:49:42.683 --> 00:49:44.600 But for now, it's just a clue that it's indeed 00:49:44.600 --> 00:49:48.530 meant to be a package and not just a module or file alone. 00:49:48.530 --> 00:49:53.360 What I can now do in closing is run pytest, not even on that specific file, 00:49:53.360 --> 00:49:55.460 but on a whole folder of tests. 00:49:55.460 --> 00:50:00.920 So if I run pytest of test, where the test is the name of that folder, 00:50:00.920 --> 00:50:03.530 pytest will automatically search through that folder looking 00:50:03.530 --> 00:50:07.650 for all possible tests, granted there's just those two in this one file, 00:50:07.650 --> 00:50:11.480 but when I run it now with Enter, I'll still pass those tests. 00:50:11.480 --> 00:50:12.960 I'll still get 100%. 00:50:12.960 --> 00:50:16.280 And I now have a mechanism, ultimately, for testing my own code. 
00:50:16.280 --> 00:50:19.580 So whether you're writing functions that return integers or something else, 00:50:19.580 --> 00:50:22.490 functions that have side effects that could be rewritten as functions 00:50:22.490 --> 00:50:24.350 that return values, you now have a mechanism 00:50:24.350 --> 00:50:27.740 to not just wait for, one, someone like us to test your code 00:50:27.740 --> 00:50:30.380 and not just test your code manually again and again, which 00:50:30.380 --> 00:50:32.360 might get tedious, and you might make mistakes 00:50:32.360 --> 00:50:34.670 by not including some possible inputs, we now 00:50:34.670 --> 00:50:37.970 have an automated mechanism for testing one's own code that's 00:50:37.970 --> 00:50:41.100 going to be even more powerful when you start collaborating with others 00:50:41.100 --> 00:50:44.030 so that you can write tests that ensure that if they 00:50:44.030 --> 00:50:47.870 make a change to the same code, they haven't broken the code that you've 00:50:47.870 --> 00:50:48.990 written. 00:50:48.990 --> 00:50:50.040 That's it for this week. 00:50:50.040 --> 00:50:52.240 We'll see you next time.