1 00:00:00,000 --> 00:00:15,616 2 00:00:15,616 --> 00:00:16,600 DAVID MALAN: All right. 3 00:00:16,600 --> 00:00:20,320 This is CS50, and this is week 8. 4 00:00:20,320 --> 00:00:23,950 So for the past several weeks have we been focusing on first Scratch 5 00:00:23,950 --> 00:00:27,890 and then C. And now today do we introduce another language altogether, 6 00:00:27,890 --> 00:00:29,060 that of Python. 7 00:00:29,060 --> 00:00:31,990 Indeed, even though we've spent all this time talking about C-- 8 00:00:31,990 --> 00:00:34,110 and hopefully understanding from the ground floor 9 00:00:34,110 --> 00:00:37,330 up what's going on inside of a computer and how things work-- 10 00:00:37,330 --> 00:00:40,010 the reality is that C is not the best language with which 11 00:00:40,010 --> 00:00:41,900 to solve a whole lot of problems. 12 00:00:41,900 --> 00:00:44,290 Indeed, as you yourselves might have realized by now, 13 00:00:44,290 --> 00:00:48,120 the fact that you have to manipulate sometimes memory at its lowest level-- 14 00:00:48,120 --> 00:00:50,780 the fact that any time you want to get something real done, 15 00:00:50,780 --> 00:00:55,260 like add capacity to a data structure or grow a string, 16 00:00:55,260 --> 00:00:57,600 you have to do all of that work yourself-- 17 00:00:57,600 --> 00:01:01,080 means that C really creates a whole lot of work for the programmer. 18 00:01:01,080 --> 00:01:03,360 But ever since C's invention many years ago 19 00:01:03,360 --> 00:01:05,840 has the world developed any number of new languages-- 20 00:01:05,840 --> 00:01:07,850 higher level languages, if you will-- that 21 00:01:07,850 --> 00:01:12,050 add on features, that fill in gaps, and generally 22 00:01:12,050 --> 00:01:14,460 solve problems more effectively. 23 00:01:14,460 --> 00:01:17,590 And so today, we start to do exactly that transition, 24 00:01:17,590 --> 00:01:21,880 having motivated this just a week ago with our look at machine learning.
25 00:01:21,880 --> 00:01:25,090 Indeed, one of the tools that we use to have that conversation 26 00:01:25,090 --> 00:01:27,760 was to introduce snippets of this language, Python, 27 00:01:27,760 --> 00:01:33,250 because indeed it is much more of a well-suited tool than something like C. 28 00:01:33,250 --> 00:01:34,940 But let's begin this transition now. 29 00:01:34,940 --> 00:01:37,380 We, of course, started this conversation many weeks ago 30 00:01:37,380 --> 00:01:38,610 when we looked at Scratch. 31 00:01:38,610 --> 00:01:41,950 And yet even though you probably found it pretty fun, pretty friendly, 32 00:01:41,950 --> 00:01:45,300 and pretty accessible, the reality was that built into Scratch 33 00:01:45,300 --> 00:01:50,010 was quite a lot of features, loops, and conditions, 34 00:01:50,010 --> 00:01:53,430 and customized functions, and variables, and any number of other features 35 00:01:53,430 --> 00:01:57,400 that we then saw the week after in C-- albeit a little more 36 00:01:57,400 --> 00:01:59,720 arcanely with more cryptic syntax. 37 00:01:59,720 --> 00:02:02,845 But the expressiveness of Scratch remained within C. 38 00:02:02,845 --> 00:02:06,760 And indeed, even today, as we transition to another language altogether, 39 00:02:06,760 --> 00:02:10,810 you will find that the ideas remain consistent. 40 00:02:10,810 --> 00:02:14,350 And indeed, things just get easier in many ways to do. 41 00:02:14,350 --> 00:02:17,000 So we transitioned to C. And today we transition to Python. 42 00:02:17,000 --> 00:02:20,010 And so let's, just as we did with Scratch, 43 00:02:20,010 --> 00:02:22,150 try to convert one language to another, just 44 00:02:22,150 --> 00:02:26,010 to emphasize that fundamentally the ideas today aren't changing, simply 45 00:02:26,010 --> 00:02:27,300 the way of expressing it.
46 00:02:27,300 --> 00:02:30,080 So this perhaps was the very first program 47 00:02:30,080 --> 00:02:32,354 we looked at in C-- arguably the simplest, 48 00:02:32,354 --> 00:02:34,520 and yet even then there was quite a bit of overhead. 49 00:02:34,520 --> 00:02:38,530 Well, starting today, if you wanted to write a program that does exactly that, 50 00:02:38,530 --> 00:02:39,170 voila! 51 00:02:39,170 --> 00:02:42,080 In Python, you simply say what you mean. 52 00:02:42,080 --> 00:02:45,070 If you want to print "hello world," you literally in a Python program 53 00:02:45,070 --> 00:02:48,300 are going to write print open parenthesis quote unquote, 54 00:02:48,300 --> 00:02:49,070 "hello world." 55 00:02:49,070 --> 00:02:51,740 And you can even omit the semi-colon that might 56 00:02:51,740 --> 00:02:53,570 have hung you up so many times since. 57 00:02:53,570 --> 00:02:56,350 Now in reality, you'll often see a slightly different paradigm 58 00:02:56,350 --> 00:02:57,940 when writing the simplest of programs. 59 00:02:57,940 --> 00:03:00,260 You might actually see some mention of main. 60 00:03:00,260 --> 00:03:04,550 But it turns out that a main function is not actually required in Python 61 00:03:04,550 --> 00:03:09,870 as it is in C. Rather, you can simply write code and just get going with it. 62 00:03:09,870 --> 00:03:11,970 And we'll do this hands-on in just a bit. 63 00:03:11,970 --> 00:03:13,860 But you'll find that a very common paradigm 64 00:03:13,860 --> 00:03:15,901 is to actually have code like this, where you do, 65 00:03:15,901 --> 00:03:17,820 in fact, define a function called main. 66 00:03:17,820 --> 00:03:20,220 And as we'll soon see through quite a few examples, 67 00:03:20,220 --> 00:03:22,350 this is how now in Python, you define a function. 
68 00:03:22,350 --> 00:03:24,704 You literally say "def" for define, "main" 69 00:03:24,704 --> 00:03:27,620 if that's the name of the function, open paren, close paren, and maybe 70 00:03:27,620 --> 00:03:31,270 zero or more parameters therein, and then a colon, 71 00:03:31,270 --> 00:03:34,580 and an absence of the curly braces-- with which you might now 72 00:03:34,580 --> 00:03:35,810 have gotten so familiar. 73 00:03:35,810 --> 00:03:39,039 But then indented beneath that, generally four spaces here, 74 00:03:39,039 --> 00:03:40,830 would be the code that you want to execute. 75 00:03:40,830 --> 00:03:42,496 And we'll come back to this before long. 76 00:03:42,496 --> 00:03:46,270 But this is just a common paradigm to ensure 77 00:03:46,270 --> 00:03:49,300 that at least one function in a Python program is called by default 78 00:03:49,300 --> 00:03:51,500 and by convention, we'll see-- it's called main. 79 00:03:51,500 --> 00:03:55,340 But the reality is that the program can now be as simple as this. 80 00:03:55,340 --> 00:03:57,360 So let's distill some of the fundamentals 81 00:03:57,360 --> 00:04:01,900 that we first saw in Scratch, then saw in C, and now see in Python as well. 82 00:04:01,900 --> 00:04:05,530 So Python has functions and it also has something called methods-- but more 83 00:04:05,530 --> 00:04:07,830 on that when we talk about object-oriented programming. 84 00:04:07,830 --> 00:04:11,470 But a function in C for printing "hello world" might have looked like this. 85 00:04:11,470 --> 00:04:13,720 Notice the printf for printing a formatted string. 86 00:04:13,720 --> 00:04:15,730 Notice the backslash n that's inside there. 87 00:04:15,730 --> 00:04:16,793 Notice the semi-colon. 88 00:04:16,793 --> 00:04:18,959 In Python, it's indeed going to be a little simpler. 89 00:04:18,959 --> 00:04:20,729 We can distill that to just this. 90 00:04:20,729 --> 00:04:23,770 So we're not going to use printf, we're just going to use print. 
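The two shapes of program described so far can be sketched as follows. This is a minimal illustration rather than the exact code shown on the lecture slides: the bare one-line program, and the same call wrapped in a function named main with its body indented four spaces.

```python
# The whole program can be a single call to print: no #include,
# no semicolon, and no "\n" at the end of the string.
print("hello world")


# The same idea wrapped in a function: "def", the function's name,
# parentheses for zero or more parameters, a colon, and then an
# indented body instead of curly braces.
def main():
    print("hello world")


# Defining a function does not run it, so call it explicitly here.
main()
```

Run as a script, this prints "hello world" twice: once from the bare call and once from main.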
91 00:04:23,770 --> 00:04:27,140 We don't, it turns out, have to have the backslash n in this example. 92 00:04:27,140 --> 00:04:28,610 You're going to get that for free. 93 00:04:28,610 --> 00:04:31,910 Just by calling print are you going to get a trailing newline printed. 94 00:04:31,910 --> 00:04:34,750 And we don't, again, need the semi-colon at the end. 95 00:04:34,750 --> 00:04:35,820 Well, what about loops? 96 00:04:35,820 --> 00:04:38,910 Well, in Scratch, we had the repeat block. 97 00:04:38,910 --> 00:04:41,590 We had the forever block and some other constructs still. 98 00:04:41,590 --> 00:04:46,346 In C, we had things like for loops and while loops and do while loops. 99 00:04:46,346 --> 00:04:47,970 Well, let's do a couple of conversions. 100 00:04:47,970 --> 00:04:52,110 In C, if you wanted do something forever, like print "hello world" again 101 00:04:52,110 --> 00:04:54,670 and again and again, never stopping, one per line, 102 00:04:54,670 --> 00:04:57,420 you might use a while loop like this. 103 00:04:57,420 --> 00:05:00,284 In Python, you're going to do something pretty similar in spirit, 104 00:05:00,284 --> 00:05:02,450 but it's going to be formatted a little differently. 105 00:05:02,450 --> 00:05:04,630 We still have access to the while keyword. 106 00:05:04,630 --> 00:05:08,760 The boolean value true now has to be capitalized with a capital T. 107 00:05:08,760 --> 00:05:11,330 And again, instead of using curly braces, 108 00:05:11,330 --> 00:05:13,940 you're going to use a colon at the end of this statement 109 00:05:13,940 --> 00:05:17,374 and then indent all of the code beneath it that you want to happen cyclically. 110 00:05:17,374 --> 00:05:19,790 And again, we've borrowed print "hello world" from before, 111 00:05:19,790 --> 00:05:21,320 so no semi-colon necessary there. 112 00:05:21,320 --> 00:05:24,610 No f and no backslash n is required. 
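A small sketch of the two points just made: print supplies the trailing newline on its own, and a while True loop repeats its indented body. The lecture's loop never stops. The hypothetical helper say_hello_repeatedly below adds a limit, which is an assumption made only so that the example terminates.

```python
import io

# print adds the trailing newline itself; capture the output to see it.
buffer = io.StringIO()
print("hello world", file=buffer)
print(repr(buffer.getvalue()))  # prints 'hello world\n'


def say_hello_repeatedly(limit):
    """Print "hello world" once per line using while True.

    The limit parameter is not part of the lecture's example; it exists
    only so this sketch stops instead of looping forever.
    """
    count = 0
    while True:  # True must be capitalized in Python
        print("hello world")
        count += 1
        if count >= limit:
            break
    return count


say_hello_repeatedly(3)
```

Removing the count and the break gives the never-ending version described in the lecture.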
113 00:05:24,610 --> 00:05:27,250 Meanwhile, if we had a for loop in C that we 114 00:05:27,250 --> 00:05:29,760 wanted to say print "hello world" 50 times, 115 00:05:29,760 --> 00:05:32,321 we might use a fairly common paradigm like this. 116 00:05:32,321 --> 00:05:34,570 Well, in Python you can do this in any number of ways. 117 00:05:34,570 --> 00:05:37,540 But perhaps one of the most common is to do something like this, 118 00:05:37,540 --> 00:05:42,430 to literally say for i in range 50-- more on that in just a moment-- 119 00:05:42,430 --> 00:05:44,710 and then print "hello world." 120 00:05:44,710 --> 00:05:46,700 So this is shorter hand notation. 121 00:05:46,700 --> 00:05:50,810 And this is perhaps the first instance where you really see just how pedantic, 122 00:05:50,810 --> 00:05:53,750 how much C belabors the point, whereas in Python you 123 00:05:53,750 --> 00:05:57,220 just probably with higher frequency just say what you mean. 124 00:05:57,220 --> 00:06:00,940 So for implies a looping construct here. i 125 00:06:00,940 --> 00:06:03,640 is declaring implicitly a variable that we're about to use. 126 00:06:03,640 --> 00:06:05,440 And then what do you want i to be? 127 00:06:05,440 --> 00:06:10,080 Well, you want it to be in a range of values from 0 up to but excluding 50. 128 00:06:10,080 --> 00:06:12,340 So you want to go from 0 to 49, effectively. 129 00:06:12,340 --> 00:06:14,890 And the way you can express that here is as follows. 130 00:06:14,890 --> 00:06:18,070 You call this range function, which gives you 131 00:06:18,070 --> 00:06:24,040 essentially a sequence of numbers starting at 0, and then 1, and then 2, 132 00:06:24,040 --> 00:06:25,910 and then 3-- all the way up to 49. 133 00:06:25,910 --> 00:06:30,280 And on each iteration of this loop does i get assigned that value. 
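The loop just described might look like this in full. The second half is an added check, not something from the lecture, that range(50) really runs from 0 up to but excluding 50.

```python
# Print "hello world" 50 times; on each iteration i is assigned
# the next value from range(50): 0, 1, 2, ..., 49.
for i in range(50):
    print("hello world")

# Materialize the range as a list to inspect its endpoints and length.
values = list(range(50))
print(values[0], values[-1], len(values))  # prints: 0 49 50
```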
134 00:06:30,280 --> 00:06:34,520 So functionally, what we've just done is equivalent to what 135 00:06:34,520 --> 00:06:39,100 we've just done here, but it does it in a more Pythonic way, if you will. 136 00:06:39,100 --> 00:06:41,141 We don't have access to that same for construct 137 00:06:41,141 --> 00:06:42,890 as we did in C. We actually have something 138 00:06:42,890 --> 00:06:45,850 that's a little easier, once you get used to it, to use. 139 00:06:45,850 --> 00:06:47,176 Now how about variables? 140 00:06:47,176 --> 00:06:50,459 Well, recall that in Scratch, we had variables, those little orange blocks. 141 00:06:50,459 --> 00:06:52,250 And we didn't have to worry about the type. 142 00:06:52,250 --> 00:06:55,130 We could just put in numbers or other such things into them. 143 00:06:55,130 --> 00:06:57,350 And then in C, we had to start caring about this. 144 00:06:57,350 --> 00:06:59,630 But we had booleans, and we had floats, and we 145 00:06:59,630 --> 00:07:06,600 had doubles, and chars, and strings, and longs, and a few others still. 146 00:07:06,600 --> 00:07:09,470 Well, in Python, we're still going to have a number of data types. 147 00:07:09,470 --> 00:07:13,210 But Python is not nearly as strongly-typed, so to speak, 148 00:07:13,210 --> 00:07:16,960 whereas in C-- and languages like C and a few others-- 149 00:07:16,960 --> 00:07:22,360 you have to know and care about and tell the compiler what type of value 150 00:07:22,360 --> 00:07:23,640 some variable is. 151 00:07:23,640 --> 00:07:25,610 In Python, those types exist. 152 00:07:25,610 --> 00:07:30,380 But the language is more loosely-typed, as we say, whereby they have types, 153 00:07:30,380 --> 00:07:33,920 but you as the programmer don't have to worry about specifying them, 154 00:07:33,920 --> 00:07:36,220 a bit more like our world from Scratch. 
155 00:07:36,220 --> 00:07:39,780 So whereas in C, we might have declared an integer called i and assigned 156 00:07:39,780 --> 00:07:42,990 it an initial value of 0-- we might have used syntax like this. 157 00:07:42,990 --> 00:07:46,340 In Python, it's going to be similar in spirit, but a little more succinct. 158 00:07:46,340 --> 00:07:48,580 Again, just say what you mean. i gets zero 159 00:07:48,580 --> 00:07:51,150 with no semi-colon, no mention of the type. 160 00:07:51,150 --> 00:07:54,450 But insofar as Python supports numbers, it's 161 00:07:54,450 --> 00:07:58,540 going to realize-- oh, that zero looks like an integer, is an integer. 162 00:07:58,540 --> 00:08:02,890 I'm going to define, ultimately, i as of being of type int. 163 00:08:02,890 --> 00:08:05,410 Meanwhile we have boolean expressions in Python as well. 164 00:08:05,410 --> 00:08:07,570 And these actually translate perfectly. 165 00:08:07,570 --> 00:08:12,220 If you have an expression in C testing whether i is less than 50, 166 00:08:12,220 --> 00:08:14,460 this is the same thing in Python as well. 167 00:08:14,460 --> 00:08:16,300 You literally use the same syntax. 168 00:08:16,300 --> 00:08:18,947 If, instead, you want to generally compare two variables, just 169 00:08:18,947 --> 00:08:20,780 like we did a few weeks back in C, you might 170 00:08:20,780 --> 00:08:26,844 do x less than y-- same exact code in Python as well as in C. 171 00:08:26,844 --> 00:08:27,885 Now how about conditions? 172 00:08:27,885 --> 00:08:30,590 So conditions are these branching constructs 173 00:08:30,590 --> 00:08:33,549 where we can either go this way or maybe this way or another way. 174 00:08:33,549 --> 00:08:35,530 So it's the proverbial fork in the road. 175 00:08:35,530 --> 00:08:40,179 Well, in C, if you wanted to have an if statement that 176 00:08:40,179 --> 00:08:43,490 has three different branches, you might do something like this. 
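A short sketch of the assignment and comparisons described above. The sample values for x and y are hypothetical, chosen only so the comparison has something to work with.

```python
i = 0           # "i gets zero": no type keyword and no semicolon
print(type(i))  # prints <class 'int'>, inferred from the value 0

x = 1           # hypothetical sample values, not from the lecture
y = 2

# Boolean expressions use the same syntax as in C.
print(i < 50)   # prints True
print(x < y)    # prints True
```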
177 00:08:43,490 --> 00:08:45,790 And as you may recall, these curly braces 178 00:08:45,790 --> 00:08:49,030 are not strictly necessary, simply because we 179 00:08:49,030 --> 00:08:52,820 have one line of code nested beneath this if, and one line of code 180 00:08:52,820 --> 00:08:55,620 beneath this else if, and one line of code beneath this else. 181 00:08:55,620 --> 00:08:58,700 Technically, and you might have seen this in section or other resources, 182 00:08:58,700 --> 00:09:01,690 you can actually omit all of these curly braces, which to be fair, 183 00:09:01,690 --> 00:09:04,380 makes the code look a little more compact. 184 00:09:04,380 --> 00:09:06,460 But the logic is pretty straightforward. 185 00:09:06,460 --> 00:09:09,610 And we saw similar yellowish blocks in Scratch. 186 00:09:09,610 --> 00:09:12,610 Now in Python, the idea is going to be exactly the same, 187 00:09:12,610 --> 00:09:15,160 but some of the syntax is going to be a bit different. 188 00:09:15,160 --> 00:09:18,860 So if we want to say, is x less than y, we still say it, 189 00:09:18,860 --> 00:09:20,524 but we don't need the parentheses. 190 00:09:20,524 --> 00:09:22,440 In fact, if they don't add anything logically, 191 00:09:22,440 --> 00:09:25,300 we're just going to start omitting them altogether as unnecessary. 192 00:09:25,300 --> 00:09:28,090 We do have the colon, which is necessary at the end of the line. 193 00:09:28,090 --> 00:09:30,490 We do have consistent indentation. 194 00:09:30,490 --> 00:09:32,723 And those of you who have not necessarily 195 00:09:32,723 --> 00:09:38,640 had five for fives for style, realize that in Python the language by design 196 00:09:38,640 --> 00:09:41,640 is going to enforce the need for indentation. 197 00:09:41,640 --> 00:09:46,860 So in fact, I see myself being a little hypocritical here, as I inconsistently 198 00:09:46,860 --> 00:09:48,480 indent this actual code. 
199 00:09:48,480 --> 00:09:50,450 So this would not actually work properly, 200 00:09:50,450 --> 00:09:52,481 because I've used a variable amount of spacing. 201 00:09:52,481 --> 00:09:53,980 So Python is not going to like that. 202 00:09:53,980 --> 00:09:57,090 And in fact, that's why I made that mistake to make this point here, 203 00:09:57,090 --> 00:10:01,430 so that you actually have to conform to using four spaces or some other, 204 00:10:01,430 --> 00:10:03,960 but being consistent ultimately. 205 00:10:03,960 --> 00:10:04,730 So notice this. 206 00:10:04,730 --> 00:10:05,230 This? 207 00:10:05,230 --> 00:10:06,690 Not a typo. 208 00:10:06,690 --> 00:10:09,000 I didn't make that many mistakes here. "elif" 209 00:10:09,000 --> 00:10:12,894 is actually the keyword that we use to express "else if." 210 00:10:12,894 --> 00:10:15,060 So it's simply a new keyword that we have in Python, 211 00:10:15,060 --> 00:10:16,860 again, ending the same line with the colon. 212 00:10:16,860 --> 00:10:19,110 And then here, logically, is the third and final case. 213 00:10:19,110 --> 00:10:21,610 else, if it's not less than and it's not greater than, 214 00:10:21,610 --> 00:10:23,920 it must in fact be equal to. 215 00:10:23,920 --> 00:10:29,520 So we've used print as before to express these three possible outputs. 216 00:10:29,520 --> 00:10:31,410 What about things like arrays? 217 00:10:31,410 --> 00:10:33,970 Well, Scratch had things called lists that we essentially 218 00:10:33,970 --> 00:10:36,000 equated with arrays, even though that was a bit 219 00:10:36,000 --> 00:10:38,840 of an oversimplification at the time. 220 00:10:38,840 --> 00:10:42,950 Python also has effectively what we've been using and taking 221 00:10:42,950 --> 00:10:45,930 for granted now in C, that of arrays. 222 00:10:45,930 --> 00:10:48,780 But it turns out, in Python we're going to start calling them lists. 223 00:10:48,780 --> 00:10:51,470 And they're so much easier to use.
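The three-way branch with if, elif, and else can be sketched as below, consistently indented with four spaces. Wrapping it in a function named compare, and the exact wording of the messages, are assumptions made for illustration. The lecture's slide prints directly.

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    if x < y:        # no parentheses needed; the colon is required
        return "x is less than y"
    elif x > y:      # elif is Python's spelling of "else if"
        return "x is greater than y"
    else:            # neither less nor greater, so equal
        return "x is equal to y"


print(compare(1, 2))  # prints: x is less than y
print(compare(2, 1))  # prints: x is greater than y
print(compare(2, 2))  # prints: x is equal to y
```

If one of the indented lines used a different amount of spacing within the same block, Python would reject the file with an IndentationError rather than run it.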
224 00:10:51,470 --> 00:10:53,550 In fact, all of this low-level memory management 225 00:10:53,550 --> 00:10:56,149 of having to allocate and reallocate and resize 226 00:10:56,149 --> 00:10:58,940 arrays potentially if you want to grow or shrink them-- all of that 227 00:10:58,940 --> 00:10:59,773 goes out the window. 228 00:10:59,773 --> 00:11:01,730 And indeed, this is a feature you commonly get 229 00:11:01,730 --> 00:11:03,850 in a higher-level language like Python. 230 00:11:03,850 --> 00:11:06,270 It's a lot of this functionality built into the language, 231 00:11:06,270 --> 00:11:09,240 as opposed to you, the programmer, having to implement those 232 00:11:09,240 --> 00:11:10,370 low-level details. 233 00:11:10,370 --> 00:11:14,900 So, for instance, whereas in C, particularly in a main function, 234 00:11:14,900 --> 00:11:18,910 we've been using for some time argv, which is an argument vector or an array 235 00:11:18,910 --> 00:11:22,130 of arguments at the command line-- you might access the first of those with 236 00:11:22,130 --> 00:11:25,660 argv[0]-- we're actually going to have that same syntactic capability. 237 00:11:25,660 --> 00:11:29,720 We're going to access, in particular, argv a little differently via an object 238 00:11:29,720 --> 00:11:30,350 called sys. 239 00:11:30,350 --> 00:11:33,110 So sys.argv, as we'll see, is going to be the syntax. 240 00:11:33,110 --> 00:11:35,460 But those square brackets are going to remain 241 00:11:35,460 --> 00:11:40,610 and the ideas of arrays, now called lists, are going to remain as well.
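A brief sketch of both points: a list grows without any manual memory management, and command-line arguments are reached through sys.argv with the same square-bracket indexing as in C. The list contents are hypothetical sample values.

```python
import sys

# A list grows on demand; there is no malloc or realloc to manage.
numbers = [1, 2, 3]    # hypothetical sample values
numbers.append(4)
print(numbers[0], len(numbers))  # prints: 1 4

# sys.argv is a list of the command-line arguments. When a script is
# run as "python script.py", sys.argv[0] is the script's name.
if sys.argv:
    print(sys.argv[0])
```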
246 00:11:50,530 --> 00:11:53,630 But what's the mental model that you need to have for Python? 247 00:11:53,630 --> 00:11:58,450 Well, all this time, C, we've described as being compiled. 248 00:11:58,450 --> 00:12:03,160 In order to write and use a program in C, you have to write the source code. 249 00:12:03,160 --> 00:12:05,150 And you have to save the file in something.c. 250 00:12:05,150 --> 00:12:09,970 And then you have to run something like clang something.c in order to output 251 00:12:09,970 --> 00:12:12,390 from source code your machine code. 252 00:12:12,390 --> 00:12:16,190 And then that machine code, the zeros and ones that the-- Intel, usually-- 253 00:12:16,190 --> 00:12:22,070 CPU inside understands, can actually be run by double-clicking or doing ./a.out 254 00:12:22,070 --> 00:12:24,330 or whatever the program's name actually is. 255 00:12:24,330 --> 00:12:27,840 So as you may have realized already, this gets fairly tedious over time. 256 00:12:27,840 --> 00:12:30,110 Every time you make a darn change to your code, 257 00:12:30,110 --> 00:12:32,800 you have to recompile it with clang-- or with make, 258 00:12:32,800 --> 00:12:34,890 more generally-- and then run it. 259 00:12:34,890 --> 00:12:36,920 To make a change, compile, run it. 260 00:12:36,920 --> 00:12:38,310 Make a change, compile, run it. 261 00:12:38,310 --> 00:12:41,430 Wouldn't it be nice if we could reduce those numbers of steps 262 00:12:41,430 --> 00:12:44,280 somehow by just eliminating the compilation step? 263 00:12:44,280 --> 00:12:47,410 And indeed, a feature you get with a lot of higher-level languages 264 00:12:47,410 --> 00:12:54,370 like Python and JavaScript and PHP and Ruby is that they can be interpreted, 265 00:12:54,370 --> 00:12:55,510 so to speak. 266 00:12:55,510 --> 00:12:58,870 You don't have to worry so much about compiling them yourself 267 00:12:58,870 --> 00:13:00,790 and then running resulting machine code. 
268 00:13:00,790 --> 00:13:05,550 You can just run one command in order to actually run your program. 269 00:13:05,550 --> 00:13:08,940 And there's a lot more going on underneath the hood, as we'll see. 270 00:13:08,940 --> 00:13:13,440 But ultimately if we had a program that looks like this-- 271 00:13:13,440 --> 00:13:16,072 simply a function called main as we saw earlier, 272 00:13:16,072 --> 00:13:18,030 and we'll see some more examples of this soon-- 273 00:13:18,030 --> 00:13:21,040 that simply prints out "hello world," it turns out 274 00:13:21,040 --> 00:13:24,590 that you can run this program in a couple of different ways. 275 00:13:24,590 --> 00:13:29,750 We can either, in the spirit of clang-- whereby in C, 276 00:13:29,750 --> 00:13:35,330 we ran clang hello.c and then ./a.out-- in Python, 277 00:13:35,330 --> 00:13:38,430 if this program is stored in a file called hello.py-- 278 00:13:38,430 --> 00:13:41,850 where .py is the common file extension for any programs written in Python-- 279 00:13:41,850 --> 00:13:46,110 we can distill those two steps, as we'll soon see, into just one. 280 00:13:46,110 --> 00:13:51,590 You run a program called Python, which is called the Python interpreter. 281 00:13:51,590 --> 00:13:53,820 And what that does underneath the hood for you 282 00:13:53,820 --> 00:13:57,800 is it compiles your Python source code into something called byte code, 283 00:13:57,800 --> 00:14:02,244 and then proceeds to interpret that byte code top to bottom, left to right. 284 00:14:02,244 --> 00:14:04,160 So this is a lower-level implementation detail 285 00:14:04,160 --> 00:14:05,650 that we're not going to have to worry about, 286 00:14:05,650 --> 00:14:07,890 because indeed one of the features of this kind of language 287 00:14:07,890 --> 00:14:09,681 is that you don't need to worry about that. 288 00:14:09,681 --> 00:14:13,180 And you don't need that middle step of having to compile your code. 
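For the curious, the compile-to-bytecode step just described can be observed with the standard library's dis module. This is one way to look at it, not necessarily the command used in the lecture, and the exact instructions printed vary from one Python version to the next.

```python
import dis


def main():
    print("hello world")


# Print a human-readable listing of the bytecode compiled for main.
dis.dis(main)

# The same instructions are available programmatically; collect just
# their names. Which names appear depends on the Python version.
opnames = [instruction.opname for instruction in dis.get_instructions(main)]
print(opnames)
```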
289 00:14:13,180 --> 00:14:17,130 But for the curious, what's going to happen underneath the hood is this. 290 00:14:17,130 --> 00:14:20,360 If we have a function like main that's simply going to print "hello world" 291 00:14:20,360 --> 00:14:22,612 and we do run it through that Python command, what 292 00:14:22,612 --> 00:14:25,070 happens underneath the hood is that it gets converted first 293 00:14:25,070 --> 00:14:27,778 into something called byte code-- which fairly esoterically looks 294 00:14:27,778 --> 00:14:31,280 a little something like this, which you can actually see yourself if you run 295 00:14:31,280 --> 00:14:33,290 Python with the appropriate commands. 296 00:14:33,290 --> 00:14:36,340 And then what Python the interpreter does 297 00:14:36,340 --> 00:14:38,527 is it reads this kind of code-- top to bottom, 298 00:14:38,527 --> 00:14:41,610 left to right-- that we the programmers don't have to worry about in order 299 00:14:41,610 --> 00:14:44,390 to actually make your program do work. 300 00:14:44,390 --> 00:14:47,130 So you'll often hear that Python is an interpreted language, 301 00:14:47,130 --> 00:14:48,840 and that kind of is indeed the case. 302 00:14:48,840 --> 00:14:51,070 But there can indeed be this compilation step, 303 00:14:51,070 --> 00:14:53,666 and it actually depends on the implementation of Python 304 00:14:53,666 --> 00:14:56,040 that you're using or even the computer that you're using. 305 00:14:56,040 --> 00:14:58,230 And indeed, what we're now starting to see 306 00:14:58,230 --> 00:15:01,230 is the dichotomy between what it means to be a language 307 00:15:01,230 --> 00:15:04,820 and what it means to be a program, like this thing Python. 308 00:15:04,820 --> 00:15:06,630 Python is a language. 309 00:15:06,630 --> 00:15:07,790 C is a language. 310 00:15:07,790 --> 00:15:09,400 Clang is a compiler. 
311 00:15:09,400 --> 00:15:13,230 Python is also not just a language, but a program 312 00:15:13,230 --> 00:15:17,380 that understands that language, otherwise known as an interpreter. 313 00:15:17,380 --> 00:15:20,520 And so anytime you see me starting to run the command "python," as you 314 00:15:20,520 --> 00:15:24,970 will too for future problem sets, will you be interpreting the language, 315 00:15:24,970 --> 00:15:27,190 the source code that you've written. 316 00:15:27,190 --> 00:15:28,030 All right. 317 00:15:28,030 --> 00:15:32,960 So let's go ahead now and make a transition in code from the world of C 318 00:15:32,960 --> 00:15:34,080 to the world of Python. 319 00:15:34,080 --> 00:15:37,230 And to help get us there, let's put back on just temporarily 320 00:15:37,230 --> 00:15:40,310 some training wheels of sorts-- a reimplementation 321 00:15:40,310 --> 00:15:43,715 of the CS50 library from C to Python, which we've done for you. 322 00:15:43,715 --> 00:15:45,840 And we won't look at the lower-level implementation 323 00:15:45,840 --> 00:15:47,480 details of how that works. 324 00:15:47,480 --> 00:15:50,480 But let me propose that at least for part of today's story, 325 00:15:50,480 --> 00:15:53,560 we're going to have access to at least a few functions. 326 00:15:53,560 --> 00:15:56,330 These functions are going to be called GetChar, GetFloat, GetInt, 327 00:15:56,330 --> 00:15:59,460 and GetString, just like those with which you're already familiar. 328 00:15:59,460 --> 00:16:01,420 The syntax with which we access them is going 329 00:16:01,420 --> 00:16:03,003 to be a little different in this case. 330 00:16:03,003 --> 00:16:07,140 By convention, we're going to say cs50.GetChar cs50.GetFloat 331 00:16:07,140 --> 00:16:11,336 and so forth, to make clear that these aren't globally available 332 00:16:11,336 --> 00:16:14,460 functions that might have even come with the language, because they're not.
333 00:16:14,460 --> 00:16:17,490 Rather, these are inside of a module, so to speak, 334 00:16:17,490 --> 00:16:21,790 that CS50 wrote that implements exactly that functionality. 335 00:16:21,790 --> 00:16:25,570 We'll soon see that Python has at least these data types of bools, 336 00:16:25,570 --> 00:16:28,045 true or false, whereby the T, and in turn the F, 337 00:16:28,045 --> 00:16:30,496 have to be capitalized in Python, unlike in C; 338 00:16:30,496 --> 00:16:33,120 floats, which are going to give us real numbers, floating point 339 00:16:33,120 --> 00:16:37,360 values with decimal points; int, which is going to give us an integer; 340 00:16:37,360 --> 00:16:40,920 and str or string, which is going to give us the string that we've now 341 00:16:40,920 --> 00:16:42,310 come to know and love. 342 00:16:42,310 --> 00:16:44,660 But nicely enough, you can start to think again 343 00:16:44,660 --> 00:16:47,080 of string as an abstraction, because it's actually 344 00:16:47,080 --> 00:16:50,490 what's called a class that has a whole lot of functionality built-in. 345 00:16:50,490 --> 00:16:53,570 No longer are we going to have to worry about managing 346 00:16:53,570 --> 00:16:56,540 the memory for our strings underneath the hood. 347 00:16:56,540 --> 00:16:59,820 Now Python, realize, also comes with a bunch of other features, some of which 348 00:16:59,820 --> 00:17:00,980 we'll see today too. 349 00:17:00,980 --> 00:17:03,720 You can actually represent complex or imaginary numbers 350 00:17:03,720 --> 00:17:06,390 in Python natively in the language itself. 351 00:17:06,390 --> 00:17:08,660 You have the notion of lists, as we mentioned before, 352 00:17:08,660 --> 00:17:10,930 an analog to C's arrays. 
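A quick tour of the types named so far, as a sketch. The variable names and values are hypothetical, and the loop simply asks Python what type each value has.

```python
flag = True            # bool: True and False are capitalized
ratio = 3.14           # float: a real number with a decimal point
count = 50             # int
name = "hello world"   # str
z = 2 + 3j             # complex: j marks the imaginary part
items = [1, 2, 3]      # list: the analog of a C array

# Print the name of each value's type, one per line:
# bool, float, int, str, complex, list
for value in (flag, ratio, count, name, z, items):
    print(type(value).__name__)

# str is a class with functionality built in, such as upper().
print(name.upper())    # prints: HELLO WORLD
```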
353 00:17:10,930 --> 00:17:13,170 We have things called tuples, so if you've ever 354 00:17:13,170 --> 00:17:17,150 seen like xy coordinates or any kind of groups of values in the real world, 355 00:17:17,150 --> 00:17:20,660 we can implement those too in Python; ranges, which we saw briefly, 356 00:17:20,660 --> 00:17:23,680 whereby you can define a range that starts at some value and ends 357 00:17:23,680 --> 00:17:27,470 at some value, which is often helpful when counting from, say 0 to 50; 358 00:17:27,470 --> 00:17:31,730 a set, which like in mathematics, allows you to have a collection of objects-- 359 00:17:31,730 --> 00:17:35,430 and you're not going to have duplicates, but it's 360 00:17:35,430 --> 00:17:38,780 going to be very easy to check whether or not something is in that set; 361 00:17:38,780 --> 00:17:41,340 and then a dict or dictionary, which is actually 362 00:17:41,340 --> 00:17:43,400 going to be really just a hash table. 363 00:17:43,400 --> 00:17:44,980 But more on that in just a bit. 364 00:17:44,980 --> 00:17:47,980 And these are just some of them that we'll soon see. 365 00:17:47,980 --> 00:17:49,920 So let's now rewind in time and take a look 366 00:17:49,920 --> 00:17:54,670 back at week one and perhaps this first and simplest example that we ever did, 367 00:17:54,670 --> 00:17:58,340 which is this one here called hello.c. 368 00:17:58,340 --> 00:18:01,140 And meanwhile, let me go ahead here on the right-hand side 369 00:18:01,140 --> 00:18:05,140 and create a new file that I'm going to go ahead and call hello.py. 370 00:18:05,140 --> 00:18:08,650 And in here, I'm going to go ahead and write the equivalent Python 371 00:18:08,650 --> 00:18:12,260 program to the C program on the left. 372 00:18:12,260 --> 00:18:15,280 print "hello world" 373 00:18:15,280 --> 00:18:15,900 Done. 374 00:18:15,900 --> 00:18:18,200 That is the first of our Python programs. 375 00:18:18,200 --> 00:18:19,700 Now how do I run it?
376 00:18:19,700 --> 00:18:21,510 There's no clang step. 377 00:18:21,510 --> 00:18:25,060 And it's not correct to do just ./hello.py, 378 00:18:25,060 --> 00:18:27,250 because inside of this file is just text. 379 00:18:27,250 --> 00:18:28,670 It's just my source code. 380 00:18:28,670 --> 00:18:31,010 I need to interpret that code somehow. 381 00:18:31,010 --> 00:18:33,460 And that's where that program Python comes in. 382 00:18:33,460 --> 00:18:36,287 I'm going to simply do python space hello.py-- 383 00:18:36,287 --> 00:18:38,120 and I don't need the dot slash in this case, 384 00:18:38,120 --> 00:18:41,530 because hello.py is assumed to be in the current directory. 385 00:18:41,530 --> 00:18:43,700 Hit enter and voila! 386 00:18:43,700 --> 00:18:45,620 There's my first Python program. 387 00:18:45,620 --> 00:18:49,180 So what I haven't put in here is any mention of main. 388 00:18:49,180 --> 00:18:50,820 And just to be clear, we could. 389 00:18:50,820 --> 00:18:52,920 Again, a common convention in Python, especially 390 00:18:52,920 --> 00:18:54,890 as programs get a little more complicated, 391 00:18:54,890 --> 00:18:58,620 is to actually do something like this-- to define a function called 392 00:18:58,620 --> 00:19:02,090 main that takes, in this case, no arguments, and then below it, 393 00:19:02,090 --> 00:19:04,900 to have this line pretty much copied and pasted. 394 00:19:04,900 --> 00:19:12,150 If name equals, equals, underscore, underscore, main, underscore, colon, 395 00:19:12,150 --> 00:19:13,930 then call main. 396 00:19:13,930 --> 00:19:15,930 So what's actually going on here? 397 00:19:15,930 --> 00:19:24,690 Long story short, this line 4 and line 5 is just a quick way of checking, 398 00:19:24,690 --> 00:19:28,240 is this file's default name quote unquote "main" with the underscores 399 00:19:28,240 --> 00:19:28,740 there? 400 00:19:28,740 --> 00:19:30,859 If so, go ahead and just call this function. 
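The two lines read aloud above are easier to follow when written out. A minimal sketch, assuming the file is the hello.py from the lecture and that main only prints the greeting:

```python
# hello.py: define main, then call it only when the file is run directly.

def main():
    print("hello world")


# Python sets __name__ to "__main__" when the file is run as
# `python hello.py`, and to the module's own name when it is imported.
if __name__ == "__main__":
    main()
```

Without the last two lines, running the file would define main but never call it, so nothing would be printed.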
401 00:19:30,859 --> 00:19:33,400 Now generally, we won't bother writing our programs like this 402 00:19:33,400 --> 00:19:35,370 when it is not in fact necessary. 403 00:19:35,370 --> 00:19:37,720 But realize, all these two lines of code do 404 00:19:37,720 --> 00:19:41,090 is ensure that if you do have a function called main in your program, 405 00:19:41,090 --> 00:19:45,220 it's going to be called by default. That does not happen automatically. 406 00:19:45,220 --> 00:19:50,790 And indeed, if I just wrote hello.py like this, and gave it a main function, 407 00:19:50,790 --> 00:19:53,070 gave it some code, like print "hello world," 408 00:19:53,070 --> 00:19:58,050 but did not tell Python to actually call main, 409 00:19:58,050 --> 00:20:01,640 I could run the program like this, but nothing's actually going to happen. 410 00:20:01,640 --> 00:20:03,640 So keep that in mind as a potential gotcha 411 00:20:03,640 --> 00:20:05,830 as you start to write these things yourself. 412 00:20:05,830 --> 00:20:09,820 Well, now let's take a look back at another program we had in week 1. 413 00:20:09,820 --> 00:20:14,090 This one might have had me doing this in string.c. 414 00:20:14,090 --> 00:20:17,820 So in string.c did we introduce the CS50 library in C. 415 00:20:17,820 --> 00:20:20,690 And we also introduced from it the GetString function. 416 00:20:20,690 --> 00:20:25,210 And to use it, we had to declare a variable, like s, of type string 417 00:20:25,210 --> 00:20:27,450 and then assign it the return value of GetString. 418 00:20:27,450 --> 00:20:30,200 Well, let's go ahead and do this same program 419 00:20:30,200 --> 00:20:33,730 in Python, this time calling it string.py. 420 00:20:33,730 --> 00:20:36,220 And I'm going to go ahead now and include the CS50 library. 421 00:20:36,220 --> 00:20:38,553 But the syntax for this is a little different in Python. 422 00:20:38,553 --> 00:20:42,630 Instead of pound including, you do import cs50.
423 00:20:42,630 --> 00:20:46,320 And that's it, no angle brackets, no quotes, no .h, or anything like that. 424 00:20:46,320 --> 00:20:51,280 We have pre-installed in CS50 IDE the CS50 library for Python. 425 00:20:51,280 --> 00:20:53,390 And that's going to allow me now to do this. 426 00:20:53,390 --> 00:21:00,044 s gets cs50.get_string print "hello world" 427 00:21:00,044 --> 00:21:01,960 And we'll fill in this blank in just a moment, 428 00:21:01,960 --> 00:21:03,459 but let's first see what's going on. 429 00:21:03,459 --> 00:21:07,670 On line 3 here, I'm declaring a variable called s on the left. 430 00:21:07,670 --> 00:21:11,590 I'm not explicitly mentioning its type, because Python will figure out that it 431 00:21:11,590 --> 00:21:14,590 is in fact a string, because the function on the right hand side of this 432 00:21:14,590 --> 00:21:22,780 equal sign, cs50.get_string, is going to return to s a value of type string. 433 00:21:22,780 --> 00:21:26,576 Now as an aside, in C, we kept calling these things functions. 434 00:21:26,576 --> 00:21:27,700 And indeed, they still are. 435 00:21:27,700 --> 00:21:29,840 But technically, if you have a function-- 436 00:21:29,840 --> 00:21:34,210 like get_string in this case-- that's inside of an object, that's 437 00:21:34,210 --> 00:21:37,550 inside of what's called a module in Python, like the cs50 module 438 00:21:37,550 --> 00:21:40,560 here, now we can start calling get_string a method, which just 439 00:21:40,560 --> 00:21:44,670 means it's a function associated with some kind of container-- in this case, 440 00:21:44,670 --> 00:21:46,096 this thing called cs50. 441 00:21:46,096 --> 00:21:48,720 Now unfortunately, this program, of course, is not yet correct. 442 00:21:48,720 --> 00:21:53,300 If I do python space string.py and then type in my name "David," 443 00:21:53,300 --> 00:21:55,170 it still just says "hello world." 444 00:21:55,170 --> 00:21:57,426 So I need a way of substituting in my name here.
445 00:21:57,426 --> 00:21:59,550 And it turns out there's a couple of different ways 446 00:21:59,550 --> 00:22:03,050 to do this in Python, some of which are more outdated than others. 447 00:22:03,050 --> 00:22:06,610 So long story short, there are at least two major versions 448 00:22:06,610 --> 00:22:08,350 of this language called Python now. 449 00:22:08,350 --> 00:22:10,615 There's Python 2 and there's Python 3. 450 00:22:10,615 --> 00:22:13,740 Now it turns out-- and we didn't really talk about this in the world of C-- 451 00:22:13,740 --> 00:22:17,000 there's actually different versions of C. We in CS50 have generally 452 00:22:17,000 --> 00:22:23,630 been using version C11, which was the 2011 version of C, 453 00:22:23,630 --> 00:22:26,730 which just means it's the most recent version that we happen to be using. 454 00:22:26,730 --> 00:22:30,410 For the most part, that hadn't mattered in C. But in Python, it actually does. 455 00:22:30,410 --> 00:22:33,880 It turns out that the inventor of Python and the community around Python 456 00:22:33,880 --> 00:22:38,450 decided over the past several years to change the language in enough ways 457 00:22:38,450 --> 00:22:40,990 that they are breaking changes. 458 00:22:40,990 --> 00:22:42,890 They're not backwards compatible, which means 459 00:22:42,890 --> 00:22:47,670 if you wrote code in version 2 of Python, it might not work in version 3. 460 00:22:47,670 --> 00:22:49,640 And unfortunately both versions of the language 461 00:22:49,640 --> 00:22:53,660 have been coexisting for some time, such that there's a huge community that 462 00:22:53,660 --> 00:22:54,970 still uses Python 2. 463 00:22:54,970 --> 00:22:57,270 There's a growing community that uses Python 3. 464 00:22:57,270 --> 00:23:01,130 So that we stay at least as current as possible, we for the class' purposes 465 00:23:01,130 --> 00:23:02,267 will use Python 3. 
466 00:23:02,267 --> 00:23:05,100 And for the most part, if you're learning Python for the first time, 467 00:23:05,100 --> 00:23:06,290 it's not going to matter. 468 00:23:06,290 --> 00:23:08,400 But realize, unfortunately, that when you look up 469 00:23:08,400 --> 00:23:10,340 resources on the internet or Google things, 470 00:23:10,340 --> 00:23:13,930 you'll very often find older examples that might not necessarily 471 00:23:13,930 --> 00:23:15,270 work as intended. 472 00:23:15,270 --> 00:23:18,941 So just compare them against what we've done here in class and in section. 473 00:23:18,941 --> 00:23:19,440 All right. 474 00:23:19,440 --> 00:23:21,840 So with that said, let's go ahead and substitute 475 00:23:21,840 --> 00:23:26,900 in my name, which I'm going to do fairly oddly with two curly braces here. 476 00:23:26,900 --> 00:23:28,460 And then I'm going to do this. 477 00:23:28,460 --> 00:23:33,780 .format open paren, s, close paren. 478 00:23:33,780 --> 00:23:35,390 So what's going on here? 479 00:23:35,390 --> 00:23:40,690 Well, it turns out that in Python, quote unquote "something" 480 00:23:40,690 --> 00:23:44,300 is indeed a string, or technically an object of type str. 481 00:23:44,300 --> 00:23:46,850 And it turns out that in Python and in a lot of higher level 482 00:23:46,850 --> 00:23:51,980 languages, objects-- as I keep calling them-- have built in functionality. 483 00:23:51,980 --> 00:23:54,890 So a string is no longer just a sequence of characters. 484 00:23:54,890 --> 00:24:00,550 It's no longer just the address of a byte of memory terminated eventually 485 00:24:00,550 --> 00:24:01,627 with backslash 0. 486 00:24:01,627 --> 00:24:03,960 There's actually a lot more going on underneath the hood 487 00:24:03,960 --> 00:24:05,740 that we don't really have to care about. 488 00:24:05,740 --> 00:24:07,520 Because indeed, this is a good thing. 
489 00:24:07,520 --> 00:24:10,040 We can truly now think of a string in Python 490 00:24:10,040 --> 00:24:12,950 as being an abstraction for a sequence of characters. 491 00:24:12,950 --> 00:24:15,730 But baked into it, if you will, is a whole bunch 492 00:24:15,730 --> 00:24:16,970 of additional functionality. 493 00:24:16,970 --> 00:24:20,770 For instance, there is a function that is a method called 494 00:24:20,770 --> 00:24:23,650 format that comes with strings now. 495 00:24:23,650 --> 00:24:25,800 And it's a little weird to call them in this way. 496 00:24:25,800 --> 00:24:27,370 But notice the similarity. 497 00:24:27,370 --> 00:24:33,660 Just like the CS50 library, or module, or really object, has inside of it 498 00:24:33,660 --> 00:24:38,390 a get_string method or function, so does a string, 499 00:24:38,390 --> 00:24:42,370 like quote unquote "whatever" have built inside of it 500 00:24:42,370 --> 00:24:43,927 a method or function called format. 501 00:24:43,927 --> 00:24:46,760 And as you might have guessed, its purpose in life is just to format 502 00:24:46,760 --> 00:24:47,970 the thing to the left. 503 00:24:47,970 --> 00:24:51,647 So you get used to this format-- and there's no pun intended-- 504 00:24:51,647 --> 00:24:53,730 and there's other ways to do this still, but we'll 505 00:24:53,730 --> 00:24:55,770 see why this is useful in just a moment. 506 00:24:55,770 --> 00:24:59,650 For now, it just looks like a ridiculously unnecessarily complex way 507 00:24:59,650 --> 00:25:02,380 of plugging in a name to simply do this. 508 00:25:02,380 --> 00:25:06,120 If I type in my name David, and hit enter, now I get "hello David." 509 00:25:06,120 --> 00:25:09,190 But trust for now that this is going to be useful as we 510 00:25:09,190 --> 00:25:11,810 start to use other file formats still. 
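The format method described above can be tried on its own. A minimal sketch, assuming a hard-coded name in place of the value that get_string would return:

```python
# str.format replaces each {} placeholder with the matching argument.
name = "David"  # stand-in for user input (an assumption of this sketch)
greeting = "hello {}".format(name)
print(greeting)

# Placeholders can also be numbered or named, which helps when a
# string has several values to fill in.
numbered = "{0} {1}".format("hello", name)
named = "hello {who}".format(who=name)
print(numbered)
print(named)
```

All three calls build the same string; the numbered and named forms only make it clearer which argument lands where.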
511 00:25:11,810 --> 00:25:16,730 Now as an aside, so that we've not just removed training wheels and now 512 00:25:16,730 --> 00:25:19,250 putting them back on you just for the sake of Python, 513 00:25:19,250 --> 00:25:23,080 let me emphasize that we can actually implement this program exactly 514 00:25:23,080 --> 00:25:27,590 the same way without using anything CS50 specific using built-in functionality, 515 00:25:27,590 --> 00:25:30,140 like the input function in Python version 3. 516 00:25:30,140 --> 00:25:33,870 The input function here optionally takes a prompt inside of its parentheses. 517 00:25:33,870 --> 00:25:36,370 But if I exclude that, it's just going to ask for some text. 518 00:25:36,370 --> 00:25:38,190 And here I can do this now. 519 00:25:38,190 --> 00:25:42,760 If I run Python string.py and type in my name, it still works. 520 00:25:42,760 --> 00:25:47,800 And if I actually do something like this, name colon space, save the file, 521 00:25:47,800 --> 00:25:50,720 and rerun it, now I get a prompt for free. 522 00:25:50,720 --> 00:25:51,360 So here, too. 523 00:25:51,360 --> 00:25:52,470 Super simple example. 524 00:25:52,470 --> 00:25:54,880 But whereas in C, typically we would have 525 00:25:54,880 --> 00:25:58,870 had to add that prompt using printf and loop again and again as needed, 526 00:25:58,870 --> 00:26:02,490 here we can simply prompt once via the input function 527 00:26:02,490 --> 00:26:07,720 and get back a value all at the same time, such as say, Zamyla's name here. 528 00:26:07,720 --> 00:26:10,860 So we're only using the CS50 library for today's purposes 529 00:26:10,860 --> 00:26:13,687 to show you the equivalence of some of our C examples 530 00:26:13,687 --> 00:26:15,020 vis-a-vis these Python examples. 531 00:26:15,020 --> 00:26:17,710 But it is by no means necessary, just gives us 532 00:26:17,710 --> 00:26:19,930 a bit more functionality that's useful. 
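The built-in input function mentioned above can be sketched as follows. The greet function and its read parameter are hypothetical names introduced only so the example can be exercised without a keyboard; the lecture simply calls input directly.

```python
def greet(read=input):
    """Prompt for a name and return a greeting.

    `read` defaults to the built-in input function. It is a parameter
    here only so that a stand-in can be supplied when no terminal is
    attached (an assumption of this sketch, not part of the lecture).
    """
    name = read("name: ")  # the prompt is shown, then the typed line is returned
    return "hello {}".format(name)
```

Calling print(greet()) at a terminal shows the prompt "name: " and then prints the greeting once a name is typed.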
533 00:26:19,930 --> 00:26:24,860 For instance, if I were to write a program very similar to this one-- 534 00:26:24,860 --> 00:26:28,620 recall way back when we had this program in C, which simply got an int 535 00:26:28,620 --> 00:26:33,970 from the user and printed it out-- let me this time 536 00:26:33,970 --> 00:26:36,730 create a new file called int.py. 537 00:26:36,730 --> 00:26:39,860 And inside of it, import the CS50 library, which 538 00:26:39,860 --> 00:26:44,340 also has a function called cs50.get_int. 539 00:26:44,340 --> 00:26:49,490 And then use this function to simply say, print, quote, unquote, "hello." 540 00:26:49,490 --> 00:26:53,640 Open curly brace, closed curly brace, .format i. 541 00:26:53,640 --> 00:26:54,950 Save this file. 542 00:26:54,950 --> 00:26:57,300 Run Python int.py. 543 00:26:57,300 --> 00:26:59,110 I can type in a number like 42. 544 00:26:59,110 --> 00:26:59,900 And voila. 545 00:26:59,900 --> 00:27:01,650 Now we've used get_int. 546 00:27:01,650 --> 00:27:03,530 But now let's actually format something. 547 00:27:03,530 --> 00:27:08,530 You'll recall that in the world of C, we had some issues of imprecision. 548 00:27:08,530 --> 00:27:15,010 So recall that this program, whereby I printed the value of 1/10 to 55 decimal 549 00:27:15,010 --> 00:27:21,750 places, actually did not yield 0.100000 to infinity, 550 00:27:21,750 --> 00:27:23,760 as I was taught in grade school. 551 00:27:23,760 --> 00:27:29,920 Rather, we saw some rearing of the head of imprecision, 552 00:27:29,920 --> 00:27:35,380 whereby floating point values in C were not represented infinitely precisely. 553 00:27:35,380 --> 00:27:36,770 In fact, let's do this too. 554 00:27:36,770 --> 00:27:39,070 imprecision.py shall be the name of this file. 555 00:27:39,070 --> 00:27:39,820 And you know what? 556 00:27:39,820 --> 00:27:41,570 I don't even need to write much code here.
557 00:27:41,570 --> 00:27:45,210 I'm just going to go ahead and print out, somehow or other, 558 00:27:45,210 --> 00:27:49,480 a value like, say, 1 divided by 10. 559 00:27:49,480 --> 00:27:51,260 Let me go ahead and save that. 560 00:27:51,260 --> 00:27:54,790 Run Python of imprecision.py. 561 00:27:54,790 --> 00:27:56,780 And I do get 0.1. 562 00:27:56,780 --> 00:27:58,230 So this is kind of interesting. 563 00:27:58,230 --> 00:28:01,220 And in fact, it's revealing a feature of Python. 564 00:28:01,220 --> 00:28:03,200 But I don't want to see just one decimal point. 565 00:28:03,200 --> 00:28:11,260 I want to do the equivalent of %.55f, as we saw in C. 566 00:28:11,260 --> 00:28:12,700 It's almost the same in Python. 567 00:28:12,700 --> 00:28:16,150 But instead of using the percent sign, I'm going to use a colon instead. 568 00:28:16,150 --> 00:28:22,710 And now notice inside of all of this is just 0.55f preceded by that colon. 569 00:28:22,710 --> 00:28:24,810 So it's almost exactly what we did earlier, 570 00:28:24,810 --> 00:28:26,450 but with a bit more specificity. 571 00:28:26,450 --> 00:28:32,510 And now I see again that ridiculously disappointing imprecision eventually, 572 00:28:32,510 --> 00:28:34,050 which we also saw in C. 573 00:28:34,050 --> 00:28:37,130 So it turns out in Python, too, only a finite number of bits 574 00:28:37,130 --> 00:28:40,380 are used typically to represent a floating point value. 575 00:28:40,380 --> 00:28:44,380 So we still have, unfortunately, that issue of imprecision. 576 00:28:44,380 --> 00:28:48,560 But what we don't seem to have is something 577 00:28:48,560 --> 00:28:50,610 that we stumbled over some weeks ago. 578 00:28:50,610 --> 00:28:58,160 And in fact, the reason in the C version I did 1.0 divided by 10.0 was what? 579 00:28:58,160 --> 00:29:02,900 Why didn't I just do 1 divided by 10 in the C version? 580 00:29:02,900 --> 00:29:04,670 What happened? 
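The format specification described above, a colon followed by .55f inside the curly braces, can be sketched like this. The variable names are chosen for illustration only.

```python
# imprecision.py: print 1/10 with the default formatting, then to
# 55 decimal places using a format specification.
tenth = 1 / 10

short = "{}".format(tenth)        # default formatting
precise = "{:.55f}".format(tenth)  # the Python analog of C's %.55f

print(short)
print(precise)
```

The second line of output is no longer all zeros after the leading 1, because only a finite number of bits is used to store the value.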
581 00:29:04,670 --> 00:29:09,210 So as I recall, if you take an int in C and then divide it by an int in C, 582 00:29:09,210 --> 00:29:14,000 you get back an int in C. Unfortunately, 1 divided by 10 583 00:29:14,000 --> 00:29:15,334 should be 0.1. 584 00:29:15,334 --> 00:29:16,250 But that's not an int. 585 00:29:16,250 --> 00:29:17,500 That's a floating point value. 586 00:29:17,500 --> 00:29:20,780 So we saw this issue of truncation with integers 587 00:29:20,780 --> 00:29:24,787 whereby, if you have a value 1 divided by a value 10, both of which are ints, 588 00:29:24,787 --> 00:29:26,120 you're going to get back an int. 589 00:29:26,120 --> 00:29:29,240 The closest int after throwing away everything 590 00:29:29,240 --> 00:29:31,190 after the decimal point, which unfortunately 591 00:29:31,190 --> 00:29:35,840 would have been 0 if I didn't define them instead as being floats. 592 00:29:35,840 --> 00:29:38,910 But it seems that Python has actually fixed this. 593 00:29:38,910 --> 00:29:42,760 In fact, one of the features of Python 3 is to redress exactly this. 594 00:29:42,760 --> 00:29:45,340 For many years, we've all had to deal with the fact 595 00:29:45,340 --> 00:29:49,360 that an integer divided by an integer is, in fact, an integer and therefore 596 00:29:49,360 --> 00:29:51,560 mathematically incorrect, potentially. 597 00:29:51,560 --> 00:29:55,630 Well, turns out that's been fixed such that now 1 divided by 10 598 00:29:55,630 --> 00:30:00,976 gives you the value that you actually expect-- not, in fact, 0. 599 00:30:00,976 --> 00:30:02,350 But what does this actually mean? 600 00:30:02,350 --> 00:30:06,000 Let me go ahead and open up an example that I wrote in advance, this one 601 00:30:06,000 --> 00:30:11,370 being a translation of what we did in C some time ago, like this.
602 00:30:11,370 --> 00:30:14,540 You'll recall that in the version we wrote weeks back, 603 00:30:14,540 --> 00:30:18,700 we just tested out the plus operator in C, the subtraction operator, 604 00:30:18,700 --> 00:30:22,260 multiplication, division, and modulo for remainder. 605 00:30:22,260 --> 00:30:25,960 Well, it turns out we can do something almost identically in Python 606 00:30:25,960 --> 00:30:28,162 here if we look at ints.py. 607 00:30:28,162 --> 00:30:33,510 But notice that just as I've changed the program slightly 608 00:30:33,510 --> 00:30:36,900 to use this CS50 library for Python to get a value 609 00:30:36,900 --> 00:30:39,440 x here, to get a value y here. 610 00:30:39,440 --> 00:30:43,510 Notice that there is one additional example down here. 611 00:30:43,510 --> 00:30:45,380 I'm still demonstrating plus. 612 00:30:45,380 --> 00:30:48,520 I'm still demonstrating minus, multiplication, division. 613 00:30:48,520 --> 00:30:51,660 And what is this? 614 00:30:51,660 --> 00:30:54,790 So it turns out that in Python 3, if you want the old behavior 615 00:30:54,790 --> 00:30:59,120 and you actually want to do integer division such that you not only divide 616 00:30:59,120 --> 00:31:02,659 but effectively floor the value to the nearest int below it, 617 00:31:02,659 --> 00:31:05,200 you can actually use this syntax, which somewhat confusingly, 618 00:31:05,200 --> 00:31:07,930 perhaps looks like a comment in C. It is not a comment in Python. 619 00:31:07,930 --> 00:31:10,830 In fact, in Python, as you may have gleaned already, 620 00:31:10,830 --> 00:31:14,950 comments typically will start with just a single hash symbol. 621 00:31:14,950 --> 00:31:16,890 But there's other ways to do comments as well. 622 00:31:16,890 --> 00:31:20,000 But notice one other curiosity, too. 623 00:31:20,000 --> 00:31:24,190 This program does not print out new lines when prompting the user.
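The difference between the two division operators discussed above can be sketched with fixed values. The numbers and variable names here are assumptions standing in for the x and y that the lecture's program reads from the user.

```python
# In Python 3, / always performs true division and // floors the result.
x, y = 1, 10

true_quotient = x / y    # a float, even though both operands are ints
floor_quotient = x // y  # an int, rounded down
remainder = x % y        # modulo, as in C

# // floors toward negative infinity, whereas C truncates toward zero,
# so the two languages disagree for negative operands.
negative_floor = -7 // 2

print(true_quotient, floor_quotient, remainder, negative_floor)
```

The negative case is worth knowing about when porting C code, since integer division in C truncates toward zero rather than flooring.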
624 00:31:24,190 --> 00:31:27,020 In fact, if I run this program, let me go ahead 625 00:31:27,020 --> 00:31:32,130 and run this example-- which, again, is called ints.py. 626 00:31:32,130 --> 00:31:35,430 Notice that it prompts me for an int x and an int y. 627 00:31:35,430 --> 00:31:37,100 And I supply the new lines. 628 00:31:37,100 --> 00:31:38,670 They don't get printed for me. 629 00:31:38,670 --> 00:31:41,880 And then we get back the answers that we hopefully expect here. 630 00:31:41,880 --> 00:31:45,890 But what is this going on here? 631 00:31:45,890 --> 00:31:49,782 Well, in the previous examples, I got away with not using \n anymore. 632 00:31:49,782 --> 00:31:50,990 On the one hand, that's nice. 633 00:31:50,990 --> 00:31:53,060 I don't have to remember this annoying thing that often you 634 00:31:53,060 --> 00:31:54,284 might omit accidentally. 635 00:31:54,284 --> 00:31:56,450 And therefore, your prompt ends up on the same line. 636 00:31:56,450 --> 00:31:58,330 And just things look incorrect. 637 00:31:58,330 --> 00:32:02,950 Unfortunately, the price we pay by no longer having to call a \n in order 638 00:32:02,950 --> 00:32:07,930 to get a new line from Python's print function is if you don't want that 639 00:32:07,930 --> 00:32:11,190 freebie, if you don't want that \n, unfortunately, 640 00:32:11,190 --> 00:32:15,550 you're going to have to pass a second argument to the print function 641 00:32:15,550 --> 00:32:19,980 in Python that overrides what the default line ending is. 642 00:32:19,980 --> 00:32:23,600 So whereas you would be getting by default \n for free, 643 00:32:23,600 --> 00:32:28,400 if I instead say comma end equals, quote, unquote, nothing, 644 00:32:28,400 --> 00:32:31,030 that means Python, don't use the default \n. 645 00:32:31,030 --> 00:32:34,090 Instead, output nothing whatsoever. 646 00:32:34,090 --> 00:32:35,550 So it's a tradeoff.
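The end argument described above can be sketched as follows. The demo function and the text it prints are hypothetical; only print and its end keyword come from Python itself.

```python
def demo():
    # end="" overrides the default line ending, so the next print
    # continues on the same line.
    print("x is ", end="")
    print(1)
    # With no end argument, print appends a newline on its own.
    print("done")


demo()
```

The first two calls together produce a single line of output, and the third call starts a new one.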
647 00:32:35,550 --> 00:32:38,300 And again, much like you might have gleaned from the recent test, 648 00:32:38,300 --> 00:32:39,440 there's this theme of tradeoffs. 649 00:32:39,440 --> 00:32:42,780 So even in terms of the usability of a language, might there be this tradeoff? 650 00:32:42,780 --> 00:32:47,100 If you want one feature, you might have to give up some other altogether. 651 00:32:47,100 --> 00:32:50,740 So let's just tie this all together and implement a program together 652 00:32:50,740 --> 00:32:52,270 for temperature as follows. 653 00:32:52,270 --> 00:32:56,920 Let me go ahead and create a file called temperature.py. 654 00:32:56,920 --> 00:32:59,500 And this simply I want to use to convert, 655 00:32:59,500 --> 00:33:03,060 say, Fahrenheit to Celsius, to convert two temperatures. 656 00:33:03,060 --> 00:33:05,720 I'm going to go ahead for convenience and use the CS50 library. 657 00:33:05,720 --> 00:33:09,880 I'm going to declare a variable called f that's going to become, as we'll see, 658 00:33:09,880 --> 00:33:11,740 of type float by using cs50.get_float. 659 00:33:11,740 --> 00:33:14,510 660 00:33:14,510 --> 00:33:18,500 And now I'm going to declare another variable, c, for Celsius, that's 661 00:33:18,500 --> 00:33:25,030 going to equal 5 divided by 9 times f minus 32, 662 00:33:25,030 --> 00:33:28,574 which I'm pretty sure is the formula for converting Fahrenheit to Celsius. 663 00:33:28,574 --> 00:33:30,490 And then I'm going to go ahead and print this, 664 00:33:30,490 --> 00:33:33,500 not with printf but with print, as follows. 665 00:33:33,500 --> 00:33:39,170 I'm going to have some placeholder there formatting this variable c. 666 00:33:39,170 --> 00:33:41,600 And what do I actually want to put inside of here? 667 00:33:41,600 --> 00:33:45,350 Well, if I want to go ahead and format it to just one decimal place, 668 00:33:45,350 --> 00:33:47,340 I'll use .1f.
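The conversion described above can be sketched without the CS50 library. The function name fahrenheit_to_celsius and the fixed inputs are assumptions of this sketch, and the parentheses around f - 32 are implied by the spoken formula rather than stated.

```python
def fahrenheit_to_celsius(f):
    # 5 / 9 is true division in Python 3, so no truncation occurs.
    return 5 / 9 * (f - 32)


# Format each result to one decimal place with the .1f specification.
for f in (212, 32):
    print("{:.1f}".format(fahrenheit_to_celsius(f)))
```

In C the same expression written with integer literals would have produced 0 for every input, because 5 / 9 truncates to 0 there.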
669 00:33:47,340 --> 00:33:51,310 Let's go ahead and run Python on temperature.py. 670 00:33:51,310 --> 00:33:52,000 Enter. 671 00:33:52,000 --> 00:33:56,850 Let's type in a temperature like 212, 100 in Celsius. 672 00:33:56,850 --> 00:34:01,322 Let's type in the only other temperature I really know, 32, zero in Celsius. 673 00:34:01,322 --> 00:34:02,530 So we've done the conversion. 674 00:34:02,530 --> 00:34:04,470 And we've not had to worry nearly as much 675 00:34:04,470 --> 00:34:12,230 as we did a few weeks ago about all of the issues of integers 676 00:34:12,230 --> 00:34:14,909 being truncated when you divide. 677 00:34:14,909 --> 00:34:15,770 All right. 678 00:34:15,770 --> 00:34:18,310 So let's not focus so much on math and operators. 679 00:34:18,310 --> 00:34:22,290 Let's actually do a little bit of logic by way of this example from a while 680 00:34:22,290 --> 00:34:22,790 back. 681 00:34:22,790 --> 00:34:27,170 We had an example in C called logical.c, which simply did this. 682 00:34:27,170 --> 00:34:28,900 It asked me for a char. 683 00:34:28,900 --> 00:34:31,920 And it stored it inside of-- and actually, this 684 00:34:31,920 --> 00:34:35,429 could have been this-- char c gets get char. 685 00:34:35,429 --> 00:34:42,520 And then I compared that char c against Y in capital letter or y lowercase. 686 00:34:42,520 --> 00:34:44,940 And if they matched, I printed yes. 687 00:34:44,940 --> 00:34:47,929 Otherwise, if it was capital N or lowercase n, I printed no. 688 00:34:47,929 --> 00:34:49,050 Else, I just said error. 689 00:34:49,050 --> 00:34:51,620 So it's just an arbitrary program that's meant to assess, 690 00:34:51,620 --> 00:34:57,450 did I type yes or no effectively by its first letter, capitalized or otherwise? 691 00:34:57,450 --> 00:35:01,630 Let's go ahead and port this, translate this to Python as follows. 692 00:35:01,630 --> 00:35:03,680 Let me go ahead and create a new file over here. 
693 00:35:03,680 --> 00:35:06,054 We'll call this logical.py. 694 00:35:06,054 --> 00:35:08,720 And I'm going to go ahead as before and import the CS50 library. 695 00:35:08,720 --> 00:35:12,490 But again, you could just use Python's built-in input function to do this. 696 00:35:12,490 --> 00:35:16,990 But at least this way, I'm guaranteed to get exactly the data type I want. 697 00:35:16,990 --> 00:35:18,469 cs50.get_char. 698 00:35:18,469 --> 00:35:20,635 And then over here, I'm going to now say conditions. 699 00:35:20,635 --> 00:35:23,280 So remember some of the syntax from before. 700 00:35:23,280 --> 00:35:26,700 You might be inclined to start saying, if open paren. 701 00:35:26,700 --> 00:35:27,870 But we don't need that here. 702 00:35:27,870 --> 00:35:36,520 We can instead just say if c equals equals capital Y, or c equals equals lowercase y, 703 00:35:36,520 --> 00:35:39,980 then go ahead and print yes. 704 00:35:39,980 --> 00:35:42,300 Now, this just seems ridiculous. 705 00:35:42,300 --> 00:35:45,750 All these weeks later, finally, you can truly just say what you mean? 706 00:35:45,750 --> 00:35:49,230 And indeed, in Python, there's not going to be the same double vertical bar 707 00:35:49,230 --> 00:35:53,570 or double ampersand that we've used now for some time to express or or and. 708 00:35:53,570 --> 00:35:57,520 Rather, we can really type this a bit more like an English sentence. 709 00:35:57,520 --> 00:36:00,680 It's still somewhat cryptic, to be sure, but at least there's less clutter. 710 00:36:00,680 --> 00:36:02,830 There's no required parentheses anymore. 711 00:36:02,830 --> 00:36:04,420 We don't need the curly braces even. 712 00:36:04,420 --> 00:36:06,460 We don't need vertical bars or ampersands. 713 00:36:06,460 --> 00:36:11,080 We can just use the word with which we're more familiar in the real world. 714 00:36:11,080 --> 00:36:15,670 But notice, too, I've done something subtly different from C.
715 00:36:15,670 --> 00:36:20,560 In the C version, to compare this variable c against y 716 00:36:20,560 --> 00:36:25,010 in capital letters or lowercase, I use single quotes. 717 00:36:25,010 --> 00:36:27,970 Why was that? 718 00:36:27,970 --> 00:36:30,920 In C, you actually have a data type called char. 719 00:36:30,920 --> 00:36:33,800 And it's fundamentally distinct from a string. 720 00:36:33,800 --> 00:36:38,140 So if I'm checking a char in C against some hard coded value, 721 00:36:38,140 --> 00:36:41,340 I have to use single quotes to make clear that this is just a single ASCII 722 00:36:41,340 --> 00:36:43,690 byte, capital Y or lowercase y. 723 00:36:43,690 --> 00:36:46,680 It's not capital Y \0. 724 00:36:46,680 --> 00:36:48,720 It's not lowercase y \0. 725 00:36:48,720 --> 00:36:51,710 It's just a single byte that I'm trying to compare. 726 00:36:51,710 --> 00:36:55,650 But it turns out in Python, there really is no such thing as a single char. 727 00:36:55,650 --> 00:37:01,670 If you want a character like capital Y or lowercase y, that's fine. 728 00:37:01,670 --> 00:37:04,490 But you're going to get an entire string-- a string with just one 729 00:37:04,490 --> 00:37:07,420 character in it plus whatever else is hidden inside 730 00:37:07,420 --> 00:37:09,430 of a Python string object. 731 00:37:09,430 --> 00:37:12,430 But what that means for us is that we don't have to worry as much about, 732 00:37:12,430 --> 00:37:13,060 is this a char? 733 00:37:13,060 --> 00:37:14,030 Is this a string? 734 00:37:14,030 --> 00:37:16,830 Just compare it in the more intuitive way. 735 00:37:16,830 --> 00:37:19,830 In fact, notice moreover what I am not using. 736 00:37:19,830 --> 00:37:22,370 In C, when we started to compare strings, 737 00:37:22,370 --> 00:37:25,580 we used things like strcmp or string compare. 738 00:37:25,580 --> 00:37:26,240 No more. 739 00:37:26,240 --> 00:37:28,390 You want to test two strings for equality.
740 00:37:28,390 --> 00:37:33,910 Does c from the user actually equal y, capitalized or lowercase? 741 00:37:33,910 --> 00:37:35,520 We can just double quote it like this. 742 00:37:35,520 --> 00:37:39,600 And in fact, it turns out that it doesn't matter in this context 743 00:37:39,600 --> 00:37:41,620 whether I use double quotes or single quotes. 744 00:37:41,620 --> 00:37:44,740 Generally in Python, you can actually use either. 745 00:37:44,740 --> 00:37:48,170 I'll simply adopt the habit here, and throughout these examples, 746 00:37:48,170 --> 00:37:50,290 of using double quotes, if only because they're 747 00:37:50,290 --> 00:37:54,620 identical to what we've done in CS50 for C. But realize that both of these 748 00:37:54,620 --> 00:37:55,360 are correct. 749 00:37:55,360 --> 00:38:00,450 Stylistically, generally just be consistent with respect to yourself. 750 00:38:00,450 --> 00:38:00,950 All right. 751 00:38:00,950 --> 00:38:04,650 So let's do another example and start to build on the sophistication. 752 00:38:04,650 --> 00:38:06,330 Because this isn't all that impressive. 753 00:38:06,330 --> 00:38:09,100 And actually, this of course is not yet done. 754 00:38:09,100 --> 00:38:15,810 Else if c equals equals N or c equals equals lowercase n, 755 00:38:15,810 --> 00:38:17,810 then I'm going to go ahead and print out-- oops. 756 00:38:17,810 --> 00:38:21,410 Not with printf but with no. 757 00:38:21,410 --> 00:38:26,232 Else, colon, I'm going to print out error. 758 00:38:26,232 --> 00:38:27,690 Almost forgot to finish my thought. 759 00:38:27,690 --> 00:38:28,950 So that's why the program was so short. 760 00:38:28,950 --> 00:38:32,286 Now it's almost as long although, again, if you ignore the curly braces, 761 00:38:32,286 --> 00:38:33,660 it's pretty much the same length. 762 00:38:33,660 --> 00:38:36,110 Just a little syntactically simpler. 763 00:38:36,110 --> 00:38:36,610 All right. 
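The finished logical.py described above can be sketched as a function so that it runs without user input. The name classify and the use of return values instead of print are assumptions of this sketch; the lecture's version reads a character with the CS50 library and prints directly.

```python
def classify(c):
    # Strings are compared with ==, and conditions are joined with the
    # word `or` rather than C's double vertical bar.
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    else:
        return "error"


for answer in ("Y", "n", "q"):
    print(classify(answer))
```

Single quotes would behave identically here, since Python has no separate char type and a one-character value is simply a short string.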
764 00:38:36,610 --> 00:38:38,410 So let's build up something a little more 765 00:38:38,410 --> 00:38:41,300 interesting in the interest of design. 766 00:38:41,300 --> 00:38:45,650 So some weeks ago, we introduced this example in C, the purpose of which, 767 00:38:45,650 --> 00:38:48,709 in positive.c, was to implement a program that doesn't just 768 00:38:48,709 --> 00:38:49,750 get an int from the user. 769 00:38:49,750 --> 00:38:51,330 It gets a positive integer. 770 00:38:51,330 --> 00:38:53,730 And this was a useful opportunity way back 771 00:38:53,730 --> 00:38:56,860 when to implement a custom function of our own, 772 00:38:56,860 --> 00:38:58,950 a feature that we had in Scratch. 773 00:38:58,950 --> 00:39:01,350 But it also was a nice way of abstracting away 774 00:39:01,350 --> 00:39:03,770 what it means to be get positive int, because we 775 00:39:03,770 --> 00:39:07,840 could use get int underneath the hood, but not necessarily care about it 776 00:39:07,840 --> 00:39:08,800 thereafter. 777 00:39:08,800 --> 00:39:10,520 So in C, recall a few details. 778 00:39:10,520 --> 00:39:13,020 We needed, one, not only our header files up top. 779 00:39:13,020 --> 00:39:15,226 But we also need this forward declaration. 780 00:39:15,226 --> 00:39:17,100 We need this prototype at the top of the file 781 00:39:17,100 --> 00:39:19,850 because C is going to read things top to bottom, left to right. 782 00:39:19,850 --> 00:39:23,600 So we'd better tell Clang or whatever compiler we're using about the function 783 00:39:23,600 --> 00:39:26,240 before we use it in the code itself. 784 00:39:26,240 --> 00:39:29,170 I now have an int i getting a positive int. 785 00:39:29,170 --> 00:39:31,100 And then I just go ahead and print this out. 786 00:39:31,100 --> 00:39:33,560 So the real magic seems to be below the break 787 00:39:33,560 --> 00:39:36,800 here whereby we implemented get positive int. 788 00:39:36,800 --> 00:39:39,880 And to do this in C, notice a few features. 
789 00:39:39,880 --> 00:39:44,750 One, we declared it as a function, get positive int, that takes no arguments 790 00:39:44,750 --> 00:39:46,600 and returns an integer. 791 00:39:46,600 --> 00:39:50,270 Inside of that, we declared a variable n outside the scope of the 792 00:39:50,270 --> 00:39:54,770 do while loop because we want n to exist both here and here, 793 00:39:54,770 --> 00:39:57,420 as well as when we actually finally return it. 794 00:39:57,420 --> 00:39:59,420 And then in this do while loop, we just kept 795 00:39:59,420 --> 00:40:03,080 pestering the user so long as he or she gave us a value that's less than one, 796 00:40:03,080 --> 00:40:05,000 so non-positive. 797 00:40:05,000 --> 00:40:07,090 And then we returned it and printed it. 798 00:40:07,090 --> 00:40:10,030 Let's try to now port this to Python. 799 00:40:10,030 --> 00:40:13,710 In Python, let me go ahead now and do the following. 800 00:40:13,710 --> 00:40:18,330 I'm going to create a new file called positive.py. 801 00:40:18,330 --> 00:40:21,770 I'm going to go ahead and import the CS50 library as before. 802 00:40:21,770 --> 00:40:25,570 And I'm going to go ahead and define a main function that takes no arguments. 803 00:40:25,570 --> 00:40:27,120 We're not going to worry about command line arguments. 804 00:40:27,120 --> 00:40:29,411 And indeed, even when we are going to worry about them, 805 00:40:29,411 --> 00:40:32,140 we're not going to declare them inside those parentheses anymore. 806 00:40:32,140 --> 00:40:37,390 Now I'm going to go ahead and do i gets get positive int. 807 00:40:37,390 --> 00:40:41,410 And now I'm going to go ahead and print out, with print, 808 00:40:41,410 --> 00:40:46,940 the placeholder is a positive integer, closed quotes. 809 00:40:46,940 --> 00:40:51,930 And then I'm going to do format i, plugging in that value. 
810 00:40:51,930 --> 00:40:54,830 So let me shrink the screen here a little bit so that things 811 00:40:54,830 --> 00:40:57,300 fit a little better on the Python side. 812 00:40:57,300 --> 00:40:58,880 And now that's it for main. 813 00:40:58,880 --> 00:40:59,680 No curly braces. 814 00:40:59,680 --> 00:41:03,290 I just unindent in order to now start my next thought, which 815 00:41:03,290 --> 00:41:04,450 is going to be this. 816 00:41:04,450 --> 00:41:09,160 I'm going to go ahead and define another function called get positive int. 817 00:41:09,160 --> 00:41:10,670 I don't use void in Python. 818 00:41:10,670 --> 00:41:14,570 I simply leave the parentheses empty and add a colon at the end to say, 819 00:41:14,570 --> 00:41:16,660 here comes the function's implementation. 820 00:41:16,660 --> 00:41:20,500 And it turns out in Python, there isn't this do while construct. 821 00:41:20,500 --> 00:41:25,650 So the closest match to do while we did see earlier is just while. 822 00:41:25,650 --> 00:41:28,890 And a very common paradigm in Python is to deliberately induce, 823 00:41:28,890 --> 00:41:32,150 as you might have in C, an infinite loop, capitalizing True 824 00:41:32,150 --> 00:41:36,810 because in Python, a bool that's true or false is going to be capitalized. 825 00:41:36,810 --> 00:41:40,280 And then inside of this loop, let's go ahead and do the following. 826 00:41:40,280 --> 00:41:46,620 Let's go ahead and say, print n is. 827 00:41:46,620 --> 00:41:50,680 And now below this, I'm going to say n gets get int. 828 00:41:50,680 --> 00:41:52,870 But this is inside the CS50 module. 829 00:41:52,870 --> 00:41:54,570 So I need to do that there. 830 00:41:54,570 --> 00:41:57,920 And then I'm already in an infinite loop. 831 00:41:57,920 --> 00:41:58,790 So you know what? 832 00:41:58,790 --> 00:42:05,460 If n is greater than or equal to 1, I'm going to go ahead and break. 833 00:42:05,460 --> 00:42:08,910 So the logic is a little bit different this time. 
834 00:42:08,910 --> 00:42:13,910 But I'm breaking out of the loop once I have what I intend. 835 00:42:13,910 --> 00:42:16,200 So I need to do one last thing. 836 00:42:16,200 --> 00:42:18,610 Once I've broken out of this loop, what do I 837 00:42:18,610 --> 00:42:22,870 need to do to complete the implementation of get positive int? 838 00:42:22,870 --> 00:42:23,710 I've gotten it. 839 00:42:23,710 --> 00:42:25,580 But I need to hand it back to the user. 840 00:42:25,580 --> 00:42:31,650 So let me go ahead on this last line and return that value as n. 841 00:42:31,650 --> 00:42:36,370 So notice a few distinctions here versus C. Whereas in C a few weeks ago, 842 00:42:36,370 --> 00:42:39,144 we had to give some hard thought to the issue of scope. 843 00:42:39,144 --> 00:42:41,310 Turns out we don't have to worry about that as much. 844 00:42:41,310 --> 00:42:45,400 As soon as I declare n here, it's going to be within scope within this function 845 00:42:45,400 --> 00:42:48,670 such that I can return it down here, even though that return statement 846 00:42:48,670 --> 00:42:53,500 is not indented and not inside, so to speak, that actual looping construct. 847 00:42:53,500 --> 00:42:56,700 Notice too, because we don't have a do while construct, 848 00:42:56,700 --> 00:43:00,610 I had to re-implement it using while alone. 849 00:43:00,610 --> 00:43:02,590 And I actually could have done that in C. 850 00:43:02,590 --> 00:43:05,410 Do while does not give us any fundamental capabilities that we 851 00:43:05,410 --> 00:43:08,450 couldn't implement for ourselves if we just implemented it 852 00:43:08,450 --> 00:43:11,010 logically a little more like this. 853 00:43:11,010 --> 00:43:13,070 We're still printing out n is first. 854 00:43:13,070 --> 00:43:14,240 We're then getting an int. 855 00:43:14,240 --> 00:43:15,781 We're then checking if it's positive. 856 00:43:15,781 --> 00:43:18,370 And if so, we're breaking out and returning. 
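The loop just walked through can be sketched roughly as follows. This is an illustration under stated assumptions, not the lecture's exact file: the read_int parameter is hypothetical and stands in for the CS50 library's get int, so that the sketch runs without that library or a keyboard.

```python
def get_positive_int(read_int):
    """Keep calling read_int() until it yields a value of at least 1."""
    while True:            # deliberate infinite loop; True is capitalized
        n = read_int()     # n first appears here, inside the loop...
        if n >= 1:
            break          # stop pestering once the value is positive
    return n               # ...yet it is still in scope after the loop


# Simulated user input (an assumption for this sketch):
# 0 and -1 are rejected, 50 is accepted.
answers = iter([0, -1, 50])
print(get_positive_int(lambda: next(answers)))
```

The while True plus break pairing plays the role of C's do while: the body always runs at least once, and the exit condition is checked at the bottom.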
857 00:43:18,370 --> 00:43:20,680 There is one or two bugs in here. 858 00:43:20,680 --> 00:43:22,920 And we'll trip over these in just a moment. 859 00:43:22,920 --> 00:43:32,080 Let me go ahead now and save this file and then run Python positive.py, Enter. 860 00:43:32,080 --> 00:43:35,240 Nothing seemed to happen. 861 00:43:35,240 --> 00:43:36,020 Hm. 862 00:43:36,020 --> 00:43:37,120 It's not running anymore. 863 00:43:37,120 --> 00:43:38,850 I'm back at my $prompt. 864 00:43:38,850 --> 00:43:40,094 Let me try running it again. 865 00:43:40,094 --> 00:43:40,885 Python positive.py. 866 00:43:40,885 --> 00:43:43,785 867 00:43:43,785 --> 00:43:45,160 I mean, there's no error message. 868 00:43:45,160 --> 00:43:49,060 And in the world of C, no error message usually meant something's right. 869 00:43:49,060 --> 00:43:50,120 And it's right. 870 00:43:50,120 --> 00:43:53,760 I've just kind of forgotten a key detail. 871 00:43:53,760 --> 00:43:55,260 I've imported CS50 library. 872 00:43:55,260 --> 00:43:56,800 I've defined main. 873 00:43:56,800 --> 00:43:58,290 I've defined get positive int. 874 00:43:58,290 --> 00:44:01,090 But what is different in this world now with Python? 875 00:44:01,090 --> 00:44:06,180 Main is not called by default. So if I want to actually call main, 876 00:44:06,180 --> 00:44:09,450 I'd better adopt a convention of, for instance, this paradigm. 877 00:44:09,450 --> 00:44:15,620 So if name equals equals main, then, with a colon, 878 00:44:15,620 --> 00:44:17,870 actually call the main function. 879 00:44:17,870 --> 00:44:21,990 And technically, as an aside, this would still work even without this. 880 00:44:21,990 --> 00:44:23,690 We could simply put main down here. 
881 00:44:23,690 --> 00:44:25,620 But let me wave my hand at that detail for now 882 00:44:25,620 --> 00:44:28,280 and just emphasize that anytime you want to proactively call 883 00:44:28,280 --> 00:44:33,230 main, if you've set up your code in this way, we should indeed do it like this. 884 00:44:33,230 --> 00:44:36,340 Let me go ahead now and rerun Python positive.py. 885 00:44:36,340 --> 00:44:38,885 n is 42. 886 00:44:38,885 --> 00:44:40,420 n is a positive integer. 887 00:44:40,420 --> 00:44:43,250 Let me go ahead and run n is, and then 0. 888 00:44:43,250 --> 00:44:43,760 Nope. 889 00:44:43,760 --> 00:44:44,476 Negative 1. 890 00:44:44,476 --> 00:44:45,270 Nope. 891 00:44:45,270 --> 00:44:46,000 Foo. 892 00:44:46,000 --> 00:44:46,570 Retry. 893 00:44:46,570 --> 00:44:49,700 That's the CS50 library kicking in noticing that's a string. 894 00:44:49,700 --> 00:44:51,240 Let's try 50. 895 00:44:51,240 --> 00:44:51,820 And OK. 896 00:44:51,820 --> 00:44:52,332 That worked. 897 00:44:52,332 --> 00:44:55,040 Now, the bug I alluded to earlier is just that this looks stupid, 898 00:44:55,040 --> 00:44:57,520 having the cursor now on the next line. 899 00:44:57,520 --> 00:45:01,260 I can fix this, recall, by adding the second argument whereby the line ending 900 00:45:01,260 --> 00:45:03,540 for print is just quote unquote. 901 00:45:03,540 --> 00:45:06,650 Let me go ahead and rerun it. n is 42. 902 00:45:06,650 --> 00:45:09,460 And now things look a little bit cleaner. 903 00:45:09,460 --> 00:45:13,650 Now, at the risk of complicating, let me just point out one other detail. 904 00:45:13,650 --> 00:45:16,320 Technically, I could also do this. 905 00:45:16,320 --> 00:45:21,550 If you don't need a main function, then why do I have it at all? 906 00:45:21,550 --> 00:45:25,610 It stands to reason that I could just write my program like this. 907 00:45:25,610 --> 00:45:29,490 Yes, I'm defining an additional function, get positive int. 
908 00:45:29,490 --> 00:45:31,290 And that's going to work as expected. 909 00:45:31,290 --> 00:45:33,291 But technically, if I don't need a main method-- 910 00:45:33,291 --> 00:45:35,373 and all of the simple examples we've done thus far 911 00:45:35,373 --> 00:45:37,470 just have me writing code right in the file itself 912 00:45:37,470 --> 00:45:42,060 and then interpreting it at the command line-- I should be able to do this, 913 00:45:42,060 --> 00:45:42,690 I would think. 914 00:45:42,690 --> 00:45:44,220 So let me try this. 915 00:45:44,220 --> 00:45:49,280 Let me go ahead and run again Python positive.py but on this new version. 916 00:45:49,280 --> 00:45:50,520 Enter. 917 00:45:50,520 --> 00:45:53,650 And now we get the first scary looking error message. 918 00:45:53,650 --> 00:45:56,600 So Traceback, most recent call last. 919 00:45:56,600 --> 00:46:01,400 File positive.py, line 3, in module, i gets get positive int. 920 00:46:01,400 --> 00:46:04,670 NameError: name get positive int is not defined. 921 00:46:04,670 --> 00:46:07,240 So the first of our Clang-like error messages-- 922 00:46:07,240 --> 00:46:11,870 this one coming, of course, not from Clang, but from the Python interpreter. 923 00:46:11,870 --> 00:46:15,170 And even if the first few lines are indeed pretty cryptic-- 924 00:46:15,170 --> 00:46:20,320 NameError: name get positive int is not defined. 925 00:46:20,320 --> 00:46:21,100 But yes it is. 926 00:46:21,100 --> 00:46:23,900 It's right there at the moment on line 6. 927 00:46:23,900 --> 00:46:27,672 So it turns out Python is not all that much smarter than Clang 928 00:46:27,672 --> 00:46:29,130 when it comes to reading your code. 929 00:46:29,130 --> 00:46:32,030 It too is going to read it top to bottom, left to right. 930 00:46:32,030 --> 00:46:35,790 And insofar as I'm trying to call get positive int on line 3, 931 00:46:35,790 --> 00:46:39,950 but I'm not defining it until line 6, unacceptable. 
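To see that top-to-bottom behavior in isolation, here is a small sketch. It is not the lecture's positive.py: the function body simply returns a placeholder value, and the try/except is added here only so the sketch keeps running after the error instead of stopping with a traceback.

```python
# Python runs the file top to bottom, so calling a function before its
# def statement has executed fails with NameError.
try:
    i = get_positive_int()
except NameError as error:
    message = str(error)
    print("NameError:", message)


def get_positive_int():
    return 42  # placeholder value, for illustration only


# Once the definition above has run, the very same call succeeds.
i = get_positive_int()
print(i)
```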
932 00:46:39,950 --> 00:46:43,300 Now, you might be inclined to fix this like we did in C, whereby 933 00:46:43,300 --> 00:46:48,470 you say, all right, well, let me just do get positive int up here 934 00:46:48,470 --> 00:46:50,230 maybe, and just put a prototype. 935 00:46:50,230 --> 00:46:53,100 But this now looks especially weird. 936 00:46:53,100 --> 00:46:55,570 This now looks like a function call, not a prototype, 937 00:46:55,570 --> 00:46:58,940 because we're omitting now the return type because there is none. 938 00:46:58,940 --> 00:47:01,060 And there's no semicolon here by convention. 939 00:47:01,060 --> 00:47:04,240 And indeed, if I do this again, it's the same error. 940 00:47:04,240 --> 00:47:07,020 Now the problem is I'm calling it in the wrong place 941 00:47:07,020 --> 00:47:10,450 even earlier-- on this line, still line 3, in addition 942 00:47:10,450 --> 00:47:13,050 to line 5, which is now there. 943 00:47:13,050 --> 00:47:14,270 So how do we fix this? 944 00:47:14,270 --> 00:47:18,160 Well, back in C, we didn't technically need prototypes in most cases. 945 00:47:18,160 --> 00:47:21,450 We could instead just kind of work around it 946 00:47:21,450 --> 00:47:25,260 by moving the code to, say, the top of the file 947 00:47:25,260 --> 00:47:26,860 and ignore the problem, really. 948 00:47:26,860 --> 00:47:28,060 And now run the program. 949 00:47:28,060 --> 00:47:29,740 And now it's back to working. 950 00:47:29,740 --> 00:47:30,549 Why is that? 951 00:47:30,549 --> 00:47:33,840 Well, the Python interpreter is reading this file top to bottom, left to right. 952 00:47:33,840 --> 00:47:35,540 It imports the CS50 library. 953 00:47:35,540 --> 00:47:37,690 It defines a new function called get positive int. 954 00:47:37,690 --> 00:47:42,210 And then, on lines 11 and 12 now, it uses that function 955 00:47:42,210 --> 00:47:45,010 and actually then prints out the return value. 
956 00:47:45,010 --> 00:47:47,620 But again, this very quickly gets a little messy. 957 00:47:47,620 --> 00:47:49,480 Now to find what this program does, I have 958 00:47:49,480 --> 00:47:52,335 to look all the way at the bottom of the file just to see my code. 959 00:47:52,335 --> 00:47:55,770 It would be nice if the actual logic of the program 960 00:47:55,770 --> 00:47:59,800 were at the top of the file, as has been our norm with C, putting main up top. 961 00:47:59,800 --> 00:48:03,140 So another good reason for having a main method 962 00:48:03,140 --> 00:48:05,280 is just to avoid these kinds of issues. 963 00:48:05,280 --> 00:48:07,740 If I rewind all of these changes that we just 964 00:48:07,740 --> 00:48:16,060 made and go back to this last version, this avoids all of these issues. 965 00:48:16,060 --> 00:48:20,936 Because if you're not calling main until literally the last line in your file, 966 00:48:20,936 --> 00:48:22,560 it's going to be defined at that point. 967 00:48:22,560 --> 00:48:24,700 So is any functions that it defines. 968 00:48:24,700 --> 00:48:26,680 And all of that will be implemented for you. 969 00:48:26,680 --> 00:48:28,790 And so now we're good to go. 970 00:48:28,790 --> 00:48:31,060 So again, we're complicating the program deliberately, 971 00:48:31,060 --> 00:48:36,150 but to proactively address those kinds of issues. 972 00:48:36,150 --> 00:48:39,030 Let's introduce one other topic now. 973 00:48:39,030 --> 00:48:41,700 Abstraction has been a theme, not only recently in the test, 974 00:48:41,700 --> 00:48:43,670 but also in the earliest weeks of the course. 975 00:48:43,670 --> 00:48:45,910 Well, you might recall from those early weeks, 976 00:48:45,910 --> 00:48:49,910 we had examples like this, where we had an example called cough0.c, whose 977 00:48:49,910 --> 00:48:52,690 purpose in life was to do [COUGHING]. 978 00:48:52,690 --> 00:48:54,540 So three coughs in a row. 
979 00:48:54,540 --> 00:48:57,680 Now, this was clearly copy paste because all three of these lines 980 00:48:57,680 --> 00:48:58,500 are equivalent. 981 00:48:58,500 --> 00:48:59,500 But that's fine for now. 982 00:48:59,500 --> 00:49:04,780 Let me go ahead and verbatim convert this to Python as closely as I can. 983 00:49:04,780 --> 00:49:08,260 And cough0.py turns out it's pretty easy. 984 00:49:08,260 --> 00:49:11,170 Print quote unquote cough. 985 00:49:11,170 --> 00:49:13,840 And then I can really demonstrate how poorly 986 00:49:13,840 --> 00:49:16,770 designed this is by literally copying and pasting those three lines. 987 00:49:16,770 --> 00:49:18,170 I don't need standard IO.h. 988 00:49:18,170 --> 00:49:19,420 I don't need the CS50 library. 989 00:49:19,420 --> 00:49:20,220 I don't need main. 990 00:49:20,220 --> 00:49:25,570 We know-- because now, if I just do Python cough0.py, Enter, cough, cough, 991 00:49:25,570 --> 00:49:26,270 cough. 992 00:49:26,270 --> 00:49:26,770 All right. 993 00:49:26,770 --> 00:49:30,120 But we improved upon this example in C. Recall 994 00:49:30,120 --> 00:49:35,630 that in C, we then looked at cough1, which at least used a loop. 995 00:49:35,630 --> 00:49:38,220 So how do I do this in Python? 996 00:49:38,220 --> 00:49:41,420 Let me go ahead and save this now as cough1.py. 997 00:49:41,420 --> 00:49:43,720 And let me try to borrow some logic from earlier. 998 00:49:43,720 --> 00:49:46,810 Let me do for i in. 999 00:49:46,810 --> 00:49:47,560 And you know what? 1000 00:49:47,560 --> 00:49:49,020 I'm going to do range 3. 1001 00:49:49,020 --> 00:49:50,190 We had 50 before. 1002 00:49:50,190 --> 00:49:52,560 But I don't need it to iterate that many times. 1003 00:49:52,560 --> 00:49:56,240 Now let me just go ahead and print cough three times. 1004 00:49:56,240 --> 00:49:59,590 And now run Python cough1.py, Enter. 1005 00:49:59,590 --> 00:50:01,021 Cough, cough, cough. 1006 00:50:01,021 --> 00:50:01,520 All right. 
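The loop version just described can be sketched as below. It is a minimal illustration of cough1.py as narrated, relying only on the built-in range and print.

```python
# range(3) yields 0, 1, 2, so the indented body runs three times.
for i in range(3):
    print("cough")
```

The variable i is never used inside the body here; it exists only because the for ... in ... syntax needs a name for each value.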
1007 00:50:01,520 --> 00:50:07,000 But recall in the world of C, we improved further in cough2.c 1008 00:50:07,000 --> 00:50:07,840 as follows. 1009 00:50:07,840 --> 00:50:10,410 We abstracted away, so to speak, what it means 1010 00:50:10,410 --> 00:50:15,080 to be coughing by wrapping it in its own function called cough. 1011 00:50:15,080 --> 00:50:18,030 Because we don't really care that cough is implemented with printf. 1012 00:50:18,030 --> 00:50:20,740 We just like the idea, the semantics, if you will, 1013 00:50:20,740 --> 00:50:23,052 of having a new custom function called cough. 1014 00:50:23,052 --> 00:50:25,010 So let's go ahead and try to do that in Python. 1015 00:50:25,010 --> 00:50:29,310 Let me go over here and create a new file called cough2.py. 1016 00:50:29,310 --> 00:50:33,980 And in here, let me go ahead and define main as before. 1017 00:50:33,980 --> 00:50:37,700 Inside of this, let me do for i in range 3. 1018 00:50:37,700 --> 00:50:40,590 And let me go ahead here and call proactively cough, 1019 00:50:40,590 --> 00:50:42,780 even though it doesn't yet exist. 1020 00:50:42,780 --> 00:50:46,560 Let me go down here now and implement cough in such a way 1021 00:50:46,560 --> 00:50:49,350 that it simply prints cough. 1022 00:50:49,350 --> 00:50:52,960 Let me go ahead now and do Python cough2.py. 1023 00:50:52,960 --> 00:50:54,110 Wait. 1024 00:50:54,110 --> 00:50:55,590 Something's wrong. 1025 00:50:55,590 --> 00:50:58,850 What's going to happen? 1026 00:50:58,850 --> 00:50:59,474 Nothing. 1027 00:50:59,474 --> 00:51:01,015 I need to actually call the function. 1028 00:51:01,015 --> 00:51:03,860 And again, the paradigm that we'll adopt is this. 1029 00:51:03,860 --> 00:51:10,030 The name of the file is the default name of quote, unquote, __main__. 1030 00:51:10,030 --> 00:51:12,120 Then let me go ahead and call main. 1031 00:51:12,120 --> 00:51:15,180 So now if I run this again, voila. 1032 00:51:15,180 --> 00:51:16,510 Cough, cough, cough. 
1033 00:51:16,510 --> 00:51:18,220 Notice again no prototype. 1034 00:51:18,220 --> 00:51:20,780 No imports from CS50 because we don't need it. 1035 00:51:20,780 --> 00:51:22,460 But let's improve upon this further. 1036 00:51:22,460 --> 00:51:27,680 In C, we took this one step further and then parameterized cough 1037 00:51:27,680 --> 00:51:31,010 so that we could cough three times but not have to implement the loop 1038 00:51:31,010 --> 00:51:32,180 ourselves in main. 1039 00:51:32,180 --> 00:51:34,480 We just want to punt, so to speak, or defer 1040 00:51:34,480 --> 00:51:38,450 to the actual implementation of cough to cough as many times as we want. 1041 00:51:38,450 --> 00:51:42,970 So if I want to do that here, let me go ahead and save a file called cough3.py. 1042 00:51:42,970 --> 00:51:48,020 And let me go ahead and again define main to just do a cough, but this time 1043 00:51:48,020 --> 00:51:51,370 three times, actually giving it an argument. 1044 00:51:51,370 --> 00:51:54,280 And then we go ahead and define cough again, 1045 00:51:54,280 --> 00:51:58,980 but not with open paren, closed paren, but with an actual variable called n. 1046 00:51:58,980 --> 00:52:00,710 Here too, I don't need its data type. 1047 00:52:00,710 --> 00:52:02,390 Python will figure that out for me. 1048 00:52:02,390 --> 00:52:06,760 And then here, I can do for i in range of not 3 1049 00:52:06,760 --> 00:52:11,820 anymore, but n, because that's a local argument that's been passed in. 1050 00:52:11,820 --> 00:52:15,940 And now let me go ahead and print cough that many times. 1051 00:52:15,940 --> 00:52:17,970 Down here, let me go ahead and do my if. 1052 00:52:17,970 --> 00:52:21,900 The name of this file is the default name of __main__. 1053 00:52:21,900 --> 00:52:24,650 Then go ahead and call main. 1054 00:52:24,650 --> 00:52:28,770 So now let me run this, cough3.py. 1055 00:52:28,770 --> 00:52:31,190 And I get cough, cough, cough. 
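Put together, the parameterized version with the call to main at the bottom might look like this sketch, reconstructed from the narration rather than copied from the lecture's file.

```python
def main():
    cough(3)


def cough(n):
    # No data type is declared for n; range(n) yields 0 up to n - 1.
    for i in range(n):
        print("cough")


# By the time this line runs, main and cough have both been defined,
# so the order in which they appear above does not matter.
if __name__ == "__main__":
    main()
```

The comparison against "__main__" is true when the file is run directly with the interpreter, which is why main ends up being called.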
1056 00:52:31,190 --> 00:52:34,219 And you recall we kind of took this to an extreme a few weeks ago. 1057 00:52:34,219 --> 00:52:36,510 Suppose I now want to implement the notion of sneezing. 1058 00:52:36,510 --> 00:52:40,150 Well, sneezing was deliberately introduced, 1059 00:52:40,150 --> 00:52:46,730 not so much because it's all that useful, per se, as a function, 1060 00:52:46,730 --> 00:52:49,970 but because it allowed me to factor out some common code. 1061 00:52:49,970 --> 00:52:53,570 It would be a little lazy of me if, to implement sneeze, 1062 00:52:53,570 --> 00:52:56,530 I went ahead and did something like this, whereby I literally 1063 00:52:56,530 --> 00:52:58,950 copy and paste the code, call this sneeze, 1064 00:52:58,950 --> 00:53:01,340 and then say "achoo" here instead. 1065 00:53:01,340 --> 00:53:04,220 Because look how similar these two functions are. 1066 00:53:04,220 --> 00:53:07,070 I mean, they're literally identical except for the words 1067 00:53:07,070 --> 00:53:08,430 being used therein. 1068 00:53:08,430 --> 00:53:10,420 The lines of code logically are the same. 1069 00:53:10,420 --> 00:53:17,250 So instead of that, let me go ahead and port this as I did in C as follows. 1070 00:53:17,250 --> 00:53:23,450 Let me go ahead and save this as cough4.py and in here go ahead 1071 00:53:23,450 --> 00:53:24,830 and define main. 1072 00:53:24,830 --> 00:53:27,160 And main now is going to call cough three times. 1073 00:53:27,160 --> 00:53:29,320 And it's going to call sneeze three times, which 1074 00:53:29,320 --> 00:53:30,920 just means I need to implement them. 1075 00:53:30,920 --> 00:53:35,596 So let me go ahead and define cough as before, taking in an integer n, 1076 00:53:35,596 --> 00:53:36,220 we can call it. 1077 00:53:36,220 --> 00:53:37,930 But we could call it anything we want. 1078 00:53:37,930 --> 00:53:39,030 But now you know what? 
1079 00:53:39,030 --> 00:53:43,980 Let me generalize this and just have it call a say function 1080 00:53:43,980 --> 00:53:47,200 with the word we want it to say, and how many times. 1081 00:53:47,200 --> 00:53:49,840 Meanwhile, let me go ahead and define sneeze 1082 00:53:49,840 --> 00:53:56,770 as taking a similar int that simply says achoo, n that many times. 1083 00:53:56,770 --> 00:53:59,320 And now I just have to define say. 1084 00:53:59,320 --> 00:54:02,360 And before in C, on the left hand side here, took two arguments. 1085 00:54:02,360 --> 00:54:04,060 We can do that as well in Python. 1086 00:54:04,060 --> 00:54:07,870 We can simply say a word and n without worrying about their data type 1087 00:54:07,870 --> 00:54:09,170 and declaring them. 1088 00:54:09,170 --> 00:54:14,680 And now in here, I need to do this for i in range of n. 1089 00:54:14,680 --> 00:54:18,350 Let me go ahead and print word. 1090 00:54:18,350 --> 00:54:21,430 Now technically, if I really wanted to be consistent, 1091 00:54:21,430 --> 00:54:26,970 I could do print quote, unquote, curly braces, format word. 1092 00:54:26,970 --> 00:54:29,690 But I literally gain nothing in this case from doing that. 1093 00:54:29,690 --> 00:54:32,010 So it's a lot cleaner and a lot more readable 1094 00:54:32,010 --> 00:54:33,770 just to literally print the word. 1095 00:54:33,770 --> 00:54:35,630 You don't strictly need that placeholder. 1096 00:54:35,630 --> 00:54:40,550 Then down here, let's do if the name of the file equals equals, main as before. 1097 00:54:40,550 --> 00:54:41,240 Call main. 1098 00:54:41,240 --> 00:54:42,570 Voila. 1099 00:54:42,570 --> 00:54:46,410 Let's go ahead now and do Python of cough4.py. 1100 00:54:46,410 --> 00:54:47,460 Enter. 1101 00:54:47,460 --> 00:54:48,620 Cough, cough, cough. 1102 00:54:48,620 --> 00:54:50,230 Achoo, achoo, achoo. 
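The factored-out version can be sketched as follows, again reconstructed from the narration and not guaranteed to match the lecture's cough4.py character for character.

```python
def main():
    cough(3)
    sneeze(3)


def cough(n):
    say("cough", n)


def sneeze(n):
    say("achoo", n)


def say(word, n):
    # The shared loop lives in one place; cough and sneeze only
    # choose which word gets printed.
    for i in range(n):
        print(word)


if __name__ == "__main__":
    main()
```

Adding a third sound would now take a two-line function that calls say, with no loop to copy and paste.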
1103 00:54:50,230 --> 00:54:52,990 So it's kind of an exercise in futility in the end 1104 00:54:52,990 --> 00:54:55,960 because the program still doesn't do anything that's 1105 00:54:55,960 --> 00:54:57,580 all that fundamentally interesting. 1106 00:54:57,580 --> 00:55:00,370 But notice how quickly we've moved from just printing 1107 00:55:00,370 --> 00:55:03,280 something like hello world just a little bit ago to defining 1108 00:55:03,280 --> 00:55:06,260 our own main function that calls two functions that are parameterized, 1109 00:55:06,260 --> 00:55:08,426 each of which in turn calls some other function that 1110 00:55:08,426 --> 00:55:09,550 takes multiple parameters. 1111 00:55:09,550 --> 00:55:11,360 So we're already very quickly building up 1112 00:55:11,360 --> 00:55:14,250 these building blocks, even faster than we might have done 1113 00:55:14,250 --> 00:55:17,200 in the earliest weeks of the class. 1114 00:55:17,200 --> 00:55:17,750 All right. 1115 00:55:17,750 --> 00:55:21,030 So that's essentially week one that we've now converted to Python. 1116 00:55:21,030 --> 00:55:24,420 Recall now in week two of CS50, we started to look at strings. 1117 00:55:24,420 --> 00:55:25,950 We looked at command line arguments. 1118 00:55:25,950 --> 00:55:29,240 So let's now, with relatively fewer examples, 1119 00:55:29,240 --> 00:55:32,320 compare and contrast what we did then to what we'll do now 1120 00:55:32,320 --> 00:55:34,180 and see what new features we have. 1121 00:55:34,180 --> 00:55:38,367 Recall indeed that in week two, we implemented strlen ourselves. 1122 00:55:38,367 --> 00:55:40,200 Before we even started taking it for granted 1123 00:55:40,200 --> 00:55:43,370 that there is a strlen function that returns the length of a string, 1124 00:55:43,370 --> 00:55:45,570 recall that we implemented it as follows. 1125 00:55:45,570 --> 00:55:47,160 We got a string from the user. 
1126 00:55:47,160 --> 00:55:50,200 We initialized some counting variable, like n to 0. 1127 00:55:50,200 --> 00:55:53,720 And then while that location in the string, using 1128 00:55:53,720 --> 00:55:57,240 our square bracket notation, was not equal to the special sentinel value, 1129 00:55:57,240 --> 00:56:00,630 \0, do n plus plus, thereby incrementing n, 1130 00:56:00,630 --> 00:56:03,630 and then eventually print out what the value of n is. 1131 00:56:03,630 --> 00:56:06,950 So this, though, assumed in week two an understanding 1132 00:56:06,950 --> 00:56:09,220 of what's going on underneath the hood. 1133 00:56:09,220 --> 00:56:12,482 In Python, we're not going to want to worry about what's 1134 00:56:12,482 --> 00:56:13,690 going on underneath the hood. 1135 00:56:13,690 --> 00:56:18,520 Indeed, this whole principle of abstraction-- and more 1136 00:56:18,520 --> 00:56:22,400 specifically, encapsulation-- whereby these implementation details 1137 00:56:22,400 --> 00:56:27,480 are deliberately hidden from us, is now something we can embrace as a feature. 1138 00:56:27,480 --> 00:56:30,960 No longer do we need to worry as much about how things are implemented, 1139 00:56:30,960 --> 00:56:33,750 but just that they are implemented. 1140 00:56:33,750 --> 00:56:37,650 So increasingly will we start to rely on publicly available documentation 1141 00:56:37,650 --> 00:56:40,780 and on examples online that use features of code, 1142 00:56:40,780 --> 00:56:42,900 as opposed to worrying as much about how they're 1143 00:56:42,900 --> 00:56:45,030 implemented underneath the hood. 1144 00:56:45,030 --> 00:56:49,530 So toward that end, let me go ahead and implement the equivalent 1145 00:56:49,530 --> 00:56:54,910 of this program in Python in a manner that would be appropriate here 1146 00:56:54,910 --> 00:56:56,850 with strlen.py. 
1147 00:56:56,850 --> 00:56:59,740 I'm going to go ahead and import the CS50 library so that I can 1148 00:56:59,740 --> 00:57:03,310 get a string like this with get string. 1149 00:57:03,310 --> 00:57:06,650 And then I'm going to print the length of s. 1150 00:57:06,650 --> 00:57:09,890 So recall, of course, in C, we could have done this with strlen. 1151 00:57:09,890 --> 00:57:13,655 In the world of Python, we're not going to use strlen, but rather len, 1152 00:57:13,655 --> 00:57:16,220 or L-E-N for length, which it turns out can 1153 00:57:16,220 --> 00:57:19,120 be used on any number of different variables and objects. 1154 00:57:19,120 --> 00:57:20,610 It can be used on strings. 1155 00:57:20,610 --> 00:57:23,360 It can be used on lists and other data structures still. 1156 00:57:23,360 --> 00:57:26,759 So for now, know that this is how we might print the length of a string. 1157 00:57:26,759 --> 00:57:28,050 So let's go ahead and try this. 1158 00:57:28,050 --> 00:57:30,200 Python of strlen.py. 1159 00:57:30,200 --> 00:57:33,060 Type in something like foo, which is three letters. 1160 00:57:33,060 --> 00:57:35,150 And indeed, that's what we get back. 1161 00:57:35,150 --> 00:57:38,820 Well, now let's actually take a look at the fact that we do still, 1162 00:57:38,820 --> 00:57:42,390 nonetheless, have this notion of ASCII underneath the hood going 1163 00:57:42,390 --> 00:57:44,920 on, although not necessarily ASCII but Unicode, 1164 00:57:44,920 --> 00:57:49,340 which is a far more powerful encoding of symbols 1165 00:57:49,340 --> 00:57:54,540 so that we can have far more characters than just, say, 128, or even 256. 1166 00:57:54,540 --> 00:57:58,130 Let me go ahead and create the following example. 1167 00:57:58,130 --> 00:58:03,100 We'll call this Ascii0.py so that it lines up to the example 1168 00:58:03,100 --> 00:58:05,412 we did called Ascii0.c a few weeks back. 1169 00:58:05,412 --> 00:58:07,120 And let me go ahead and do the following. 
1170 00:58:07,120 --> 00:58:12,950 For i in the range of 65, 65 plus 26. 1171 00:58:12,950 --> 00:58:16,260 So if I want to start iterating at 65, and then 1172 00:58:16,260 --> 00:58:19,620 iterate ultimately over 26 characters like we did a few weeks ago, 1173 00:58:19,620 --> 00:58:21,090 I can actually do this. 1174 00:58:21,090 --> 00:58:24,660 I can say something like, something is something, 1175 00:58:24,660 --> 00:58:28,090 specifically if I format two values. 1176 00:58:28,090 --> 00:58:31,100 I essentially want to format i and i again. 1177 00:58:31,100 --> 00:58:34,420 But the first of these I want to actually print as a character. 1178 00:58:34,420 --> 00:58:37,620 So it turns out that if you have in a variable, like i, 1179 00:58:37,620 --> 00:58:41,300 a decimal value, an integer, that corresponds underneath the hood 1180 00:58:41,300 --> 00:58:44,750 to an ASCII value, or really Unicode value, which is a superset, 1181 00:58:44,750 --> 00:58:48,120 you can call the chr function, which is going to convert it 1182 00:58:48,120 --> 00:58:49,740 to its character equivalent. 1183 00:58:49,740 --> 00:58:57,750 If I go ahead now and run Python of ascii0.py, I've made a mistake. 1184 00:58:57,750 --> 00:59:00,110 And you'll notice even CS50 IDE noticed this. 1185 00:59:00,110 --> 00:59:01,590 And I didn't notice CS50 IDE. 1186 00:59:01,590 --> 00:59:04,860 If I hover over that little x, it's yelling at me, invalid syntax. 1187 00:59:04,860 --> 00:59:08,910 Because CS50 IDE actually understands Python even more than it does C. So 1188 00:59:08,910 --> 00:59:12,900 I can actually fix this with that additional in keyword, which I forgot. 1189 00:59:12,900 --> 00:59:16,150 And now I can see the exact same tabular output which, 1190 00:59:16,150 --> 00:59:19,120 again, prints out capital A as 65. 1191 00:59:19,120 --> 00:59:22,810 So not necessarily a useful program other than to show us this equivalence. 
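For reference, here is a minimal sketch of what a program like the one just described might look like. The exact output wording ("A is 65") and the helper name `ascii_table` are assumptions made for illustration; only `range`, `chr`, and `str.format` come from the lecture.

```python
def ascii_table(start=65, count=26):
    """Return lines such as 'A is 65' for `count` code points from `start`."""
    lines = []
    for i in range(start, start + count):
        # chr() converts an integer code point into a one-character string
        lines.append("{} is {}".format(chr(i), i))
    return lines


if __name__ == "__main__":
    for line in ascii_table():
        print(line)
```

Wrapping the loop in a function is only a convenience for reuse; the loop body is the part that mirrors the C version.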
1192 00:59:22,810 --> 00:59:26,090 Well, what about arguments at the command line? 1193 00:59:26,090 --> 00:59:30,560 Let me go ahead and implement a program similar in spirit to argv0.c a while 1194 00:59:30,560 --> 00:59:32,820 back, this time calling it .py. 1195 00:59:32,820 --> 00:59:36,010 And in here, let me go ahead and do this. 1196 00:59:36,010 --> 00:59:39,770 If-- and actually, let me go ahead and import sys first. 1197 00:59:39,770 --> 00:59:43,470 So sys is a system module that has a lot of lower level functionality, 1198 00:59:43,470 --> 00:59:46,570 among them command line arguments-- which, again, we do not 1199 00:59:46,570 --> 00:59:48,490 declare as being part of main. 1200 00:59:48,490 --> 00:59:50,980 They're globally accessible, if you will. 1201 00:59:50,980 --> 00:59:52,450 I'm going to go ahead and do this. 1202 00:59:52,450 --> 01:00:00,810 If the number of command line arguments in that list there equals equals 2, 1203 01:00:00,810 --> 01:00:05,220 then I'm going to go ahead and print out hello placeholder. 1204 01:00:05,220 --> 01:00:10,839 And then format inside of that sys.argv bracket 1. 1205 01:00:10,839 --> 01:00:13,130 So if there are two command line arguments-- something, 1206 01:00:13,130 --> 01:00:14,890 something-- I'm going to print the second of those 1207 01:00:14,890 --> 01:00:18,300 because the first of them is going to be the program's name or the file's name. 1208 01:00:18,300 --> 01:00:22,010 Else, I'm going to go ahead and just print out generically hello world. 1209 01:00:22,010 --> 01:00:23,290 Let me go ahead and save that. 1210 01:00:23,290 --> 01:00:25,470 Run Python argv0.py. 1211 01:00:25,470 --> 01:00:26,410 Enter. 1212 01:00:26,410 --> 01:00:27,240 And voila. 1213 01:00:27,240 --> 01:00:28,551 We have hello world. 
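A sketch of the program described above, under a few assumptions: the exact greeting text ("hello, world" with a comma) and the helper name `greeting` are illustrative and not confirmed by the lecture. `sys.argv` and `len` are used as described.

```python
import sys


def greeting(argv):
    """Build a greeting from an argument list; argv[0] is the script's name."""
    if len(argv) == 2:
        # exactly one word was typed after the script's name
        return "hello, {}".format(argv[1])
    return "hello, world"


if __name__ == "__main__":
    print(greeting(sys.argv))
```

Passing the list in as a parameter, rather than reading `sys.argv` inside the function, makes the logic easy to try with any list of strings.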
1214 01:00:28,551 --> 01:00:30,300 Now, as an aside-- and just so that you've 1215 01:00:30,300 --> 01:00:33,190 seen it-- there are other ways of outputting strings because frankly, 1216 01:00:33,190 --> 01:00:35,564 this very quickly gets tedious if all you're trying to do 1217 01:00:35,564 --> 01:00:36,600 is plug in some value. 1218 01:00:36,600 --> 01:00:39,050 Generally, for consistency, I'll still do it this way. 1219 01:00:39,050 --> 01:00:41,010 But we could have done something like this. 1220 01:00:41,010 --> 01:00:44,240 And those of you who took, for instance, AP Computer Science A 1221 01:00:44,240 --> 01:00:46,880 in high school, or a Java class more generally, 1222 01:00:46,880 --> 01:00:49,190 might know that the plus operator is sometimes used 1223 01:00:49,190 --> 01:00:52,580 as the concatenation operator to take one string and another 1224 01:00:52,580 --> 01:00:53,840 and jam them together. 1225 01:00:53,840 --> 01:00:56,340 And indeed, we could do this as follows. 1226 01:00:56,340 --> 01:01:01,400 I could now do Python of argv0.py and get the same result. 1227 01:01:01,400 --> 01:01:06,400 But you'll find generally that using the format approach, as I originally did, 1228 01:01:06,400 --> 01:01:10,530 tends to be a little more sustainable once your code gets more complex. 1229 01:01:10,530 --> 01:01:12,020 Let's do something else. 1230 01:01:12,020 --> 01:01:14,960 Let's go ahead and print out a whole bunch of command line arguments, 1231 01:01:14,960 --> 01:01:19,130 just as we did a few weeks ago, this time in argv1.py, which 1232 01:01:19,130 --> 01:01:21,430 again corresponds to our earlier code. 1233 01:01:21,430 --> 01:01:24,610 And here, I'm going to go ahead and import the sys module again and do 1234 01:01:24,610 --> 01:01:26,550 for i in range. 
1235 01:01:26,550 --> 01:01:31,430 And now this time, I'm going to do the length of sys.argv which, to be clear, 1236 01:01:31,430 --> 01:01:37,160 is going to give me the number of arguments in the argument vector. 1237 01:01:37,160 --> 01:01:40,510 And that list, called argv, which sounds awfully 1238 01:01:40,510 --> 01:01:46,030 equivalent to what special variable that we kept using in C? 1239 01:01:46,030 --> 01:01:48,960 If you recall, not just argv, but argc? 1240 01:01:48,960 --> 01:01:50,820 The latter doesn't exist in Python. 1241 01:01:50,820 --> 01:01:54,140 But we can query for it by just asking Python, what 1242 01:01:54,140 --> 01:01:56,200 is the length of the argument vector? 1243 01:01:56,200 --> 01:01:58,060 That means what is argc? 1244 01:01:58,060 --> 01:02:02,325 So I'm going to go ahead now and just print out sys.argv bracket i. 1245 01:02:02,325 --> 01:02:04,200 And if you think through these lines of code, 1246 01:02:04,200 --> 01:02:08,350 it would seem that this is going to iterate from 0 on up to the number 1247 01:02:08,350 --> 01:02:13,270 of arguments in that argv vector, or list, and then print out each of them 1248 01:02:13,270 --> 01:02:14,030 in turn. 1249 01:02:14,030 --> 01:02:17,090 So let me go ahead and run Python of argv1.py. 1250 01:02:17,090 --> 01:02:18,160 Enter. 1251 01:02:18,160 --> 01:02:21,760 And indeed, it just printed out one thing, the name of the program itself. 1252 01:02:21,760 --> 01:02:26,390 What if I did foo, bar, [INAUDIBLE], some arbitrary words, and hit Enter? 1253 01:02:26,390 --> 01:02:28,460 Now it's going to print all of those as well. 1254 01:02:28,460 --> 01:02:30,793 So this is just printing out, as we did a few weeks ago, 1255 01:02:30,793 --> 01:02:34,170 all of the words in argv. 1256 01:02:34,170 --> 01:02:37,100 But we can do something a little neater now as follows. 
1257 01:02:37,100 --> 01:02:42,010 Suppose that in argv2.py, just like a few weeks ago in argv2.c, 1258 01:02:42,010 --> 01:02:45,340 I wanted to print out all of the characters 1259 01:02:45,340 --> 01:02:50,000 in all of the words of the command line arguments. 1260 01:02:50,000 --> 01:02:52,080 I'm going to go ahead and import sys again. 1261 01:02:52,080 --> 01:02:55,770 And now I'm going to do for s in sys.argv. 1262 01:02:55,770 --> 01:02:58,530 So here's a new approach altogether. 1263 01:02:58,530 --> 01:03:01,662 And then do for c in s. 1264 01:03:01,662 --> 01:03:04,650 And then in here, I'm going to do print c, 1265 01:03:04,650 --> 01:03:07,110 and then eventually, just print a new line. 1266 01:03:07,110 --> 01:03:09,680 So now things are getting a little magical, or frankly, 1267 01:03:09,680 --> 01:03:11,160 just a little convenient. 1268 01:03:11,160 --> 01:03:13,070 I'm still importing the sys module so that I 1269 01:03:13,070 --> 01:03:15,060 have access to argv in the first place. 1270 01:03:15,060 --> 01:03:20,070 And it turns out that insofar as sys.argv is just a list-- like in C, 1271 01:03:20,070 --> 01:03:22,260 it's similar in spirit to an array-- I don't 1272 01:03:22,260 --> 01:03:26,520 have to do the for loop with the int i and index into the array using bracket 1273 01:03:26,520 --> 01:03:27,306 i. 1274 01:03:27,306 --> 01:03:32,880 I can get from Python's for keyword this beautiful feature, whereby 1275 01:03:32,880 --> 01:03:38,110 if I just say, much like the ranges I've been using it with thus far, 1276 01:03:38,110 --> 01:03:44,630 for s in sys.argv, this is going to assign s to the first string in argv. 1277 01:03:44,630 --> 01:03:47,140 Then on the next iteration, to the next string in argv. 1278 01:03:47,140 --> 01:03:52,480 Then on the next iteration, the next string in argv, each time updating s. 
1279 01:03:52,480 --> 01:03:55,070 Meanwhile, on line 4 here, which is indented 1280 01:03:55,070 --> 01:03:59,815 as part of being inside this outermost loop, for c in s. 1281 01:03:59,815 --> 01:04:04,710 Well, it turns out that Python treats strings similar in spirit to C, 1282 01:04:04,710 --> 01:04:06,700 as sequences of characters. 1283 01:04:06,700 --> 01:04:09,980 But rather than put the burden on you to declare an int called i or j 1284 01:04:09,980 --> 01:04:13,290 or whatever, and then iterate over bracket i or bracket 1285 01:04:13,290 --> 01:04:16,290 j in each of these variables, you can just tell Python, 1286 01:04:16,290 --> 01:04:20,330 for each character in the string, for c-- and this could 1287 01:04:20,330 --> 01:04:26,690 have been any variable name altogether in the current argument from argv-- 1288 01:04:26,690 --> 01:04:28,810 go ahead and just print out c. 1289 01:04:28,810 --> 01:04:33,140 So again, here we see another hint of the ease with which you can write code 1290 01:04:33,140 --> 01:04:36,580 in a language like Python without having to worry nearly as much 1291 01:04:36,580 --> 01:04:40,790 about low level implementation details about random access and square bracket 1292 01:04:40,790 --> 01:04:44,350 notation and indexing into these arrays effectively. 1293 01:04:44,350 --> 01:04:48,120 You can just allow the language to hand you more of the data 1294 01:04:48,120 --> 01:04:49,200 that you care about. 1295 01:04:49,200 --> 01:04:52,640 So let's run Python of argv2.py. 1296 01:04:52,640 --> 01:04:53,370 Enter. 1297 01:04:53,370 --> 01:04:54,670 And it looks a little weird. 1298 01:04:54,670 --> 01:04:59,490 But if I increase the screen, you'll see that it printed one character per line, 1299 01:04:59,490 --> 01:05:01,330 exactly those command line arguments. 1300 01:05:01,330 --> 01:05:06,180 And if I do foo, you'll see argv2.py space 1301 01:05:06,180 --> 01:05:09,380 F-O-O. It's doing the exact same thing. 
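The nested loops described above might be sketched like this. The helper name `lines_for` is hypothetical, and collecting the output in a list instead of printing directly is a choice made here so the behavior is easy to inspect; the two `for ... in ...` loops are the part taken from the lecture.

```python
import sys


def lines_for(argv):
    """Return one output line per character, plus a blank line after each word."""
    lines = []
    for s in argv:        # s is each string in the list, in order
        for c in s:       # c is each character in that string, in order
            lines.append(c)
        lines.append("")  # the blank line printed after each argument
    return lines


if __name__ == "__main__":
    for line in lines_for(sys.argv):
        print(line)
```

Note that no index variable and no square brackets appear anywhere: the `for` loop hands over each element directly.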
1302 01:05:09,380 --> 01:05:10,610 So not a useful program. 1303 01:05:10,610 --> 01:05:17,590 But it indeed is allowing us to actually access those characters and strings 1304 01:05:17,590 --> 01:05:18,310 still. 1305 01:05:18,310 --> 01:05:22,120 So let's just open up an example I wrote in advance to demonstrate 1306 01:05:22,120 --> 01:05:23,880 one other point altogether. 1307 01:05:23,880 --> 01:05:29,350 If I go into week two's folder here from this week and go into exit.py, 1308 01:05:29,350 --> 01:05:30,730 you'll see this example. 1309 01:05:30,730 --> 01:05:32,950 It doesn't do all that much, this program. 1310 01:05:32,950 --> 01:05:34,520 But it does seem to check this. 1311 01:05:34,520 --> 01:05:37,250 On line 4, it checks the length of sys.argv. 1312 01:05:37,250 --> 01:05:39,410 And if it doesn't equal 2, it yells at the user. 1313 01:05:39,410 --> 01:05:40,660 Missing command line argument. 1314 01:05:40,660 --> 01:05:41,890 And then it just exits. 1315 01:05:41,890 --> 01:05:46,950 So just like in C, we have the ability to return an exit code to the shell, 1316 01:05:46,950 --> 01:05:50,780 to your prompt, not using return, as we did in C. 1317 01:05:50,780 --> 01:05:55,470 You still use return in Python, but to return from methods or functions. 1318 01:05:55,470 --> 01:05:58,280 In Python, when you want to exit the program altogether, 1319 01:05:58,280 --> 01:06:02,380 because there is not necessarily a main function, 1320 01:06:02,380 --> 01:06:07,594 you just call exit and then pass inside of its parentheses the number 1321 01:06:07,594 --> 01:06:09,760 that you want to return-- the convention, as always, 1322 01:06:09,760 --> 01:06:13,090 being 0 for success and anything nonzero for failure. 1323 01:06:13,090 --> 01:06:15,380 And so that's why I'm arbitrarily, but conventionally, 1324 01:06:15,380 --> 01:06:17,840 returning 1 here to the prompt. 
1325 01:06:17,840 --> 01:06:21,140 I'm exiting with an exit status code or exit code of 1 1326 01:06:21,140 --> 01:06:22,510 to indicate as much here. 1327 01:06:22,510 --> 01:06:24,760 Otherwise, I'm just printing out whatever the word is. 1328 01:06:24,760 --> 01:06:29,360 So if I run this program, and I go into today's second directory, 1329 01:06:29,360 --> 01:06:34,890 and I run Python of exit.py, missing command line argument. 1330 01:06:34,890 --> 01:06:37,220 And you might recall this trick from a few weeks back. 1331 01:06:37,220 --> 01:06:44,820 If you, at your prompt, run echo $?, it will show you the exit code of the most 1332 01:06:44,820 --> 01:06:46,540 recently run program. 1333 01:06:46,540 --> 01:06:50,530 So if I run this correctly this time with, for instance, my name, 1334 01:06:50,530 --> 01:06:51,570 and it says hello David. 1335 01:06:51,570 --> 01:06:56,050 And now I do echo $?, I should see a 0. 1336 01:06:56,050 --> 01:07:00,030 So just a lower level way of seeing what's going on underneath the hood. 1337 01:07:00,030 --> 01:07:04,310 Well, let's go ahead and do another example demonstrating 1338 01:07:04,310 --> 01:07:06,580 what also has changed for the better. 1339 01:07:06,580 --> 01:07:08,400 Let me go ahead and now do this. 1340 01:07:08,400 --> 01:07:11,710 In a file called compare1.py, which will line up, 1341 01:07:11,710 --> 01:07:14,770 you'll find, with compare1.c a few weeks back, 1342 01:07:14,770 --> 01:07:16,862 I'm going to go ahead and import the CS50 library. 1343 01:07:16,862 --> 01:07:18,820 I'm going to go ahead and print out just quote, 1344 01:07:18,820 --> 01:07:22,200 unquote s, and then kill the new line. 1345 01:07:22,200 --> 01:07:26,190 And then s gets CS50.getstring. 1346 01:07:26,190 --> 01:07:29,360 And then let me do this once more with a t variable, 1347 01:07:29,360 --> 01:07:31,860 also getting rid of the new line, just for aesthetics. 1348 01:07:31,860 --> 01:07:36,210 And then t gets CS50.getstring. 
1349 01:07:36,210 --> 01:07:38,451 And then let me go ahead and do a sanity check. 1350 01:07:38,451 --> 01:07:41,450 It turns out-- and you would only know this from reading our source code 1351 01:07:41,450 --> 01:07:44,220 or the documentation therefore-- turns out that get string 1352 01:07:44,220 --> 01:07:47,080 could return a special value. 1353 01:07:47,080 --> 01:07:52,560 It's not null because Python does not have pointers. 1354 01:07:52,560 --> 01:07:55,800 We don't have to worry about addresses anymore, per se. 1355 01:07:55,800 --> 01:07:59,230 But it does have special sentinel values like this one. 1356 01:07:59,230 --> 01:08:05,000 If s does not equal None with a capital N, and t does not equal None, 1357 01:08:05,000 --> 01:08:09,910 indeed None is a special value similar in spirit to null or similar 1358 01:08:09,910 --> 01:08:12,500 in spirit to false, but different from both. 1359 01:08:12,500 --> 01:08:15,560 It's not a pointer, as it is in C. And it's not a Boolean. 1360 01:08:15,560 --> 01:08:17,520 It's sort of the absence of a value. 1361 01:08:17,520 --> 01:08:20,100 And indeed we, in designing the CS50 library for Python, 1362 01:08:20,100 --> 01:08:22,120 decided that if something goes wrong with 1363 01:08:22,120 --> 01:08:25,691 get string-- maybe the computer or the interpreter is indeed out of memory, 1364 01:08:25,691 --> 01:08:28,149 even though there is no notion of allocating memory per se. 1365 01:08:28,149 --> 01:08:32,680 But something goes wrong inside of get string for whatever reason, 1366 01:08:32,680 --> 01:08:34,950 these calls could return None. 1367 01:08:34,950 --> 01:08:38,640 So I'm just for good measure checking that s is not None and t 1368 01:08:38,640 --> 01:08:42,071 is not None so that I can indeed trust that they're indeed strings, 1369 01:08:42,071 --> 01:08:43,779 so that I can now do something like this. 1370 01:08:43,779 --> 01:08:48,680 If s equals equals t, then print same. 
1371 01:08:48,680 --> 01:08:51,510 Else print different. 1372 01:08:51,510 --> 01:08:55,939 And you will recall, perhaps, that when we did this in C some time ago, 1373 01:08:55,939 --> 01:08:57,660 this did not work. 1374 01:08:57,660 --> 01:09:02,220 In the world of C, line 10 would not have worked as intended 1375 01:09:02,220 --> 01:09:07,149 because it would have been comparing two pointers, two memory addresses. 1376 01:09:07,149 --> 01:09:12,630 And insofar as in C, get string returns two distinct addresses. 1377 01:09:12,630 --> 01:09:15,720 Even if the user types the same word as we did a few weeks back, 1378 01:09:15,720 --> 01:09:20,240 it's going to use the heap via malloc to give you two separate strings somewhere 1379 01:09:20,240 --> 01:09:23,390 in memory whose first byte's address is going to be different. 1380 01:09:23,390 --> 01:09:26,859 And so s and t in the world of C were not the same. 1381 01:09:26,859 --> 01:09:28,657 But that was never really all that useful. 1382 01:09:28,657 --> 01:09:30,740 I didn't really care about those memory addresses. 1383 01:09:30,740 --> 01:09:32,073 I wanted to compare the strings. 1384 01:09:32,073 --> 01:09:34,479 And I had to resort back in the day to strcmp. 1385 01:09:34,479 --> 01:09:37,010 Well, as we've already seen, you don't need 1386 01:09:37,010 --> 01:09:38,750 to worry as much about that in Python. 1387 01:09:38,750 --> 01:09:43,930 If you want to compare s and t, just do it using equals equals as always. 1388 01:09:43,930 --> 01:09:51,460 So that when I run this program now and type in Python compare1.py, 1389 01:09:51,460 --> 01:09:54,500 something like Zamaila, something like Zamaila. 1390 01:09:54,500 --> 01:09:56,130 Those are indeed the same. 1391 01:09:56,130 --> 01:09:59,860 But if I instead type Zamaila and then my own name, 1392 01:09:59,860 --> 01:10:02,050 those are indeed different. 
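The comparison just described can be sketched without the CS50 library by passing the two strings in directly. The helper name `compare` and its return values are assumptions for illustration; the `None` check and the use of `==` follow the lecture.

```python
def compare(s, t):
    """Return 'same' or 'different', or None if either input is missing."""
    if s is None or t is None:
        return None
    # == on strings compares their characters, not their memory addresses
    return "same" if s == t else "different"


if __name__ == "__main__":
    # build the second string at run time so it is produced separately
    first = "Zamaila"
    second = "".join(["Zam", "aila"])
    print(compare(first, second))
    print(compare(first, "David"))
```

In C the equivalent check would have needed strcmp; here the language does the character-by-character comparison itself.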
1393 01:10:02,050 --> 01:10:07,470 And so this is as expected whereby, if I type two strings that 1394 01:10:07,470 --> 01:10:11,270 happen to be the same, and they're both retrieved by two different calls 1395 01:10:11,270 --> 01:10:13,920 to get string, they're nonetheless going to be compared 1396 01:10:13,920 --> 01:10:17,200 as expected for equality. 1397 01:10:17,200 --> 01:10:20,660 Let's do one other thing to demonstrate one other point of Python. 1398 01:10:20,660 --> 01:10:22,880 Let me go ahead and open up a new file. 1399 01:10:22,880 --> 01:10:25,620 I'm going to call this copy1.py. 1400 01:10:25,620 --> 01:10:27,430 And you'll see that it lines up in spirit 1401 01:10:27,430 --> 01:10:29,640 with copy1.c from a few weeks back. 1402 01:10:29,640 --> 01:10:31,240 Let me import the CS50 module. 1403 01:10:31,240 --> 01:10:34,820 Let me go ahead and print out s with no newline ending. 1404 01:10:34,820 --> 01:10:37,880 Let me go ahead and do CS50.getstring as before. 1405 01:10:37,880 --> 01:10:39,650 And let me go ahead and do a sanity check. 1406 01:10:39,650 --> 01:10:43,770 If s equals None, then let's just exit because this program's not 1407 01:10:43,770 --> 01:10:46,910 going to be useful if something bad happened underneath the hood. 1408 01:10:46,910 --> 01:10:51,920 And now let me go ahead and capitalize this thing, as I tried weeks ago. 1409 01:10:51,920 --> 01:10:54,580 Let me go ahead and do t gets s.capitalize. 1410 01:10:54,580 --> 01:10:57,100 1411 01:10:57,100 --> 01:11:01,160 And then print out s, and then a placeholder 1412 01:11:01,160 --> 01:11:03,200 that I can format with s itself. 1413 01:11:03,200 --> 01:11:06,660 Then let me go ahead and print out t colon, and a placeholder, and then 1414 01:11:06,660 --> 01:11:08,091 format t itself. 1415 01:11:08,091 --> 01:11:10,090 And then let me go ahead, just for good measure, 1416 01:11:10,090 --> 01:11:13,179 and exit with 0, even though that will be assumed to be the default. 
1417 01:11:13,179 --> 01:11:14,470 So what's going to happen here? 1418 01:11:14,470 --> 01:11:18,210 Let me run this program, Python copy1.py. 1419 01:11:18,210 --> 01:11:20,620 Type in something like Zamaila in all lowercase. 1420 01:11:20,620 --> 01:11:21,500 Enter. 1421 01:11:21,500 --> 01:11:25,490 And you'll see that it's now uppercase just t, and not s. 1422 01:11:25,490 --> 01:11:28,600 Let me go ahead and do another example with Andy's name. 1423 01:11:28,600 --> 01:11:31,770 And we've indeed capitalized Andy's name. 1424 01:11:31,770 --> 01:11:32,710 So what's going on? 1425 01:11:32,710 --> 01:11:34,270 And what's with all these dots? 1426 01:11:34,270 --> 01:11:36,550 The only time we ever really got into dots in C 1427 01:11:36,550 --> 01:11:39,800 was when we had structures or pointers thereto. 1428 01:11:39,800 --> 01:11:44,380 But it turns out that Python is an object oriented programming 1429 01:11:44,380 --> 01:11:46,660 language in the sense that it has support 1430 01:11:46,660 --> 01:11:50,180 for objects, full-fledged objects, really built into it. 1431 01:11:50,180 --> 01:11:51,660 C just has structs. 1432 01:11:51,660 --> 01:11:55,720 And structs, by definition, contain typically only data. 1433 01:11:55,720 --> 01:11:59,270 They will contain fields like dorm or house or name, 1434 01:11:59,270 --> 01:12:02,910 or whatever it is we're implementing, like a student structure in C. 1435 01:12:02,910 --> 01:12:06,610 But it turns out that in Python and in other object-oriented languages, 1436 01:12:06,610 --> 01:12:09,750 you can have inside of structures objects, as they're more 1437 01:12:09,750 --> 01:12:13,310 properly called, not only pieces of data, as we'll eventually see, 1438 01:12:13,310 --> 01:12:15,930 but also built-in functionality. 1439 01:12:15,930 --> 01:12:19,470 So the syntax, to be fair, has been very weird when we look at strings. 
1440 01:12:19,470 --> 01:12:25,660 But if you trust me when I say a string, or a str variable, is an object, 1441 01:12:25,660 --> 01:12:28,190 that object has inside of it somewhere underneath the hood 1442 01:12:28,190 --> 01:12:30,400 a sequence of characters, whatever I've typed. 1443 01:12:30,400 --> 01:12:32,740 But it also has apparently built-in functionality. 1444 01:12:32,740 --> 01:12:35,380 Among that functionality is a function, a.k.a. 1445 01:12:35,380 --> 01:12:37,300 a method called format. 1446 01:12:37,300 --> 01:12:40,700 Similarly do string objects in Python have 1447 01:12:40,700 --> 01:12:46,210 a built-in function called capitalize that does exactly as you would expect. 1448 01:12:46,210 --> 01:12:48,790 So in C, we had toupper. 1449 01:12:48,790 --> 01:12:50,710 But that operated on just a single character. 1450 01:12:50,710 --> 01:12:54,340 And the burden was entirely on me to figure out what character in a string 1451 01:12:54,340 --> 01:12:56,790 I wanted to make uppercase. 1452 01:12:56,790 --> 01:13:00,370 In Python, this built-in capitalize function for the string class 1453 01:13:00,370 --> 01:13:03,320 will do exactly what we intend, uppercasing 1454 01:13:03,320 --> 01:13:06,260 the first letter in a string and lowercasing the rest. 1455 01:13:06,260 --> 01:13:09,330 But it turns out that in Python, a string 1456 01:13:09,330 --> 01:13:15,240 is immutable, which is to say that once it's created, you can't change it. 1457 01:13:15,240 --> 01:13:16,860 And this is not the case in C. 1458 01:13:16,860 --> 01:13:20,980 In C, when we used getstring, or scanf, or malloc, 1459 01:13:20,980 --> 01:13:24,950 or created strings on the stack by allocating them effectively as arrays, 1460 01:13:24,950 --> 01:13:29,260 if we allocated memory on the heap or the stack and put strings there, 1461 01:13:29,260 --> 01:13:30,971 we could change those strings thereafter. 
1462 01:13:30,971 --> 01:13:33,220 And in fact, the earliest version of this program in C 1463 01:13:33,220 --> 01:13:39,420 was buggy insofar as it accidentally capitalized both s and t, 1464 01:13:39,420 --> 01:13:41,620 even though we only intended to capitalize t. 1465 01:13:41,620 --> 01:13:45,030 But it works right out of the box with Python, at least as implemented here. 1466 01:13:45,030 --> 01:13:48,220 Because it turns out once s exists as a string, that's it. 1467 01:13:48,220 --> 01:13:50,470 That's the sequence of characters you're going to get. 1468 01:13:50,470 --> 01:13:52,820 You can't go in and change just one of them. 1469 01:13:52,820 --> 01:13:54,570 And so what's really happening here when I 1470 01:13:54,570 --> 01:13:59,270 call s.capitalize is this function is designed underneath the hood 1471 01:13:59,270 --> 01:14:03,160 by the authors of Python to give you a copy of s 1472 01:14:03,160 --> 01:14:06,970 but quickly change the first letter to a capital letter, 1473 01:14:06,970 --> 01:14:09,340 and then return the resulting copy. 1474 01:14:09,340 --> 01:14:11,000 All of that happens for me. 1475 01:14:11,000 --> 01:14:12,630 I do not need to use malloc. 1476 01:14:12,630 --> 01:14:13,957 I do not need to do strcpy. 1477 01:14:13,957 --> 01:14:15,790 I don't need to iterate over the characters. 1478 01:14:15,790 --> 01:14:19,816 All of this we get for free, so to speak, with the language. 1479 01:14:19,816 --> 01:14:22,490 1480 01:14:22,490 --> 01:14:29,560 Let's look now at just where else we can go. 1481 01:14:29,560 --> 01:14:33,000 One of the biggest problems we ran into, recall, in C 1482 01:14:33,000 --> 01:14:35,720 was near the end of our focus on it. 1483 01:14:35,720 --> 01:14:40,500 And we started tripping over issues like memory. 1484 01:14:40,500 --> 01:14:44,400 You'll recall in C, we had this example here, noswap.c. 1485 01:14:44,400 --> 01:14:46,560 And this program was pretty arbitrary. 
1486 01:14:46,560 --> 01:14:49,080 It allocated an x and a y int and assigned 1487 01:14:49,080 --> 01:14:51,020 them the values 1 and 2 respectively. 1488 01:14:51,020 --> 01:14:54,200 It claimed to swap them by calling the swap function. 1489 01:14:54,200 --> 01:14:56,760 But then even though it said it swapped them, 1490 01:14:56,760 --> 01:14:59,914 it swapped only copies of those variables. 1491 01:14:59,914 --> 01:15:02,830 And indeed, the swap function, if we scroll down below the break here, 1492 01:15:02,830 --> 01:15:06,250 you'll see that it declares two parameters, a and b, 1493 01:15:06,250 --> 01:15:12,190 that by nature of how C argument passing happens become copies of x and y 1494 01:15:12,190 --> 01:15:15,440 such that a and b do get successfully swapped, 1495 01:15:15,440 --> 01:15:18,920 but there's no permanent effect on the caller's variables 1496 01:15:18,920 --> 01:15:22,890 in main's stack frame because that was fundamentally flawed. 1497 01:15:22,890 --> 01:15:26,660 And so we fundamentally fix that with this version here. 1498 01:15:26,660 --> 01:15:30,210 In swap.c some weeks ago, we instead started 1499 01:15:30,210 --> 01:15:34,310 passing an x and y by reference, by their addresses 1500 01:15:34,310 --> 01:15:37,530 using the ampersand operator to get their address in memory, 1501 01:15:37,530 --> 01:15:41,910 passing in effectively pointers, as declared here with the star operator. 1502 01:15:41,910 --> 01:15:45,270 And then we had to use the star operator inside here of swap 1503 01:15:45,270 --> 01:15:48,780 to dereference those pointers, those addresses, and to go to them 1504 01:15:48,780 --> 01:15:52,550 and actually change or get the values at those addresses. 1505 01:15:52,550 --> 01:15:54,560 So this worked. 1506 01:15:54,560 --> 01:15:59,710 But let me go ahead now and implement in Python something very similar. 1507 01:15:59,710 --> 01:16:03,710 I've already written this one up in advance in noswap.py. 
1508 01:16:03,710 --> 01:16:05,410 And it looks like the following. 1509 01:16:05,410 --> 01:16:06,454 I define main up top. 1510 01:16:06,454 --> 01:16:08,370 I'm not going to bother using the CS50 library 1511 01:16:08,370 --> 01:16:09,990 because everything is hard coded here. 1512 01:16:09,990 --> 01:16:12,200 x and y shall be 1 and 2 respectively. 1513 01:16:12,200 --> 01:16:15,340 Don't need to mention int again because this is a loosely typed language. 1514 01:16:15,340 --> 01:16:17,256 Now I'm going to go ahead and print x is this, 1515 01:16:17,256 --> 01:16:21,550 y is this, swapping dot, dot, dot, passing in x and y. 1516 01:16:21,550 --> 01:16:25,400 And then I do what's here swapped. 1517 01:16:25,400 --> 01:16:26,370 I claim it's swapped. 1518 01:16:26,370 --> 01:16:27,730 I print them out again. 1519 01:16:27,730 --> 01:16:29,360 Swap seems to be implemented. 1520 01:16:29,360 --> 01:16:31,010 I'm a little nervous about this. 1521 01:16:31,010 --> 01:16:35,980 This seems to really be just an implementation of literally noswap.c. 1522 01:16:35,980 --> 01:16:38,100 So let's try to confirm as much. 1523 01:16:38,100 --> 01:16:47,090 Let me go ahead now and go into this fourth week's directory in Python 1524 01:16:47,090 --> 01:16:50,320 noswap.py, Enter. 1525 01:16:50,320 --> 01:16:52,290 Indeed, it doesn't seem to work. 1526 01:16:52,290 --> 01:16:57,080 So it would seem that Python, too, passes these things in by value. 1527 01:16:57,080 --> 01:16:58,760 So how do I fix this? 1528 01:16:58,760 --> 01:17:03,960 Unfortunately, the fix isn't as-- and this is kind of an understatement-- 1529 01:17:03,960 --> 01:17:09,210 easy as it was in C to just change these arguments to be by reference, 1530 01:17:09,210 --> 01:17:11,530 and then use pointers to actually dereference them 1531 01:17:11,530 --> 01:17:15,880 and actually do the actual swap because we don't have pointers in Python. 
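The behavior just demonstrated can be reproduced with a sketch along these lines. It is not the lecture's noswap.py verbatim: here `main` returns the two values instead of printing them, purely so the result is easy to check. The point it illustrates is that reassigning a function's parameters only rebinds names local to that function.

```python
def swap(a, b):
    """Swap the local names a and b; the caller's variables are unaffected."""
    tmp = a
    a = b
    b = tmp
    # a and b are swapped here, but they vanish when the function returns


def main():
    x = 1
    y = 2
    swap(x, y)
    return x, y


if __name__ == "__main__":
    x, y = main()
    print("x is {}".format(x))
    print("y is {}".format(y))
```

Running it shows x still 1 and y still 2, just as in the C version that swapped only copies.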
1532 01:17:15,880 --> 01:17:19,546 So in some way, here's another tradeoff that's been thematic. 1533 01:17:19,546 --> 01:17:21,170 We were getting all these new features. 1534 01:17:21,170 --> 01:17:23,120 Things are relatively simpler syntactically, 1535 01:17:23,120 --> 01:17:25,620 even though it will take some getting used to, by all means, 1536 01:17:25,620 --> 01:17:26,710 and some practice. 1537 01:17:26,710 --> 01:17:31,910 But now we've given up that ability to look underneath the hood and change 1538 01:17:31,910 --> 01:17:34,100 what's going on underneath the hood. 1539 01:17:34,100 --> 01:17:36,080 So pointers were scary. 1540 01:17:36,080 --> 01:17:37,170 And pointers were hard. 1541 01:17:37,170 --> 01:17:39,945 And managing memory is risky because you risk seg faults, 1542 01:17:39,945 --> 01:17:42,320 and you might have memory leaks, and all of the headaches 1543 01:17:42,320 --> 01:17:46,710 you might have had with psets four or five or any number of the challenges we 1544 01:17:46,710 --> 01:17:47,894 had involving addresses. 1545 01:17:47,894 --> 01:17:50,060 You really start to bang your head against the wall, 1546 01:17:50,060 --> 01:17:53,300 potentially, because you have access to that level of detail. 1547 01:17:53,300 --> 01:17:56,010 Unfortunately, as soon as it's taken away, 1548 01:17:56,010 --> 01:17:59,570 we would seem to lose the ability to solve certain problems. 1549 01:17:59,570 --> 01:18:03,510 And indeed, in this case, can't really solve it in the same way. 1550 01:18:03,510 --> 01:18:06,110 There are multiple ways we could address this. 1551 01:18:06,110 --> 01:18:10,260 But let me propose one that has the advantage of introducing 1552 01:18:10,260 --> 01:18:14,600 a tiny piece of syntax that's pretty cool to see it the first time. 1553 01:18:14,600 --> 01:18:19,820 So in swap.py, let me go ahead and declare x is 1 and y is 2. 
1554 01:18:19,820 --> 01:18:24,390 Let me go ahead and print out x is this placeholder, and then plug in x there. 1555 01:18:24,390 --> 01:18:27,430 And then go ahead and print out y is this placeholder, 1556 01:18:27,430 --> 01:18:31,340 and then plug in y there. 1557 01:18:31,340 --> 01:18:35,220 And now let me go ahead and say print swapping dot, dot, dot. 1558 01:18:35,220 --> 01:18:37,790 And then we'll come back to this to-do. 1559 01:18:37,790 --> 01:18:44,000 And now I'm going to go ahead and say print swapped boldly, 1560 01:18:44,000 --> 01:18:51,150 and then print x is this placeholder, x, and then print y 1561 01:18:51,150 --> 01:18:54,660 is this placeholder, and then format y. 1562 01:18:54,660 --> 01:18:57,050 So all that remains to do is the interesting part. 1563 01:18:57,050 --> 01:18:59,590 So it turns out we could do something like this. 1564 01:18:59,590 --> 01:19:05,490 We could say temp gets x, and then x gets y, and y gets temp. 1565 01:19:05,490 --> 01:19:06,600 And that would work. 1566 01:19:06,600 --> 01:19:09,480 It's a little inelegant because now, the beauty 1567 01:19:09,480 --> 01:19:11,347 of having a swap function before in C was 1568 01:19:11,347 --> 01:19:12,930 that we were factoring out that logic. 1569 01:19:12,930 --> 01:19:14,320 We could use it in multiple places. 1570 01:19:14,320 --> 01:19:15,660 Made the code a little more readable. 1571 01:19:15,660 --> 01:19:18,034 And now, in the middle of these beautiful print statements, 1572 01:19:18,034 --> 01:19:19,250 I've got this mess here. 1573 01:19:19,250 --> 01:19:21,930 But it turns out that's the right spirit, at least 1574 01:19:21,930 --> 01:19:23,610 for keeping the solution simple. 1575 01:19:23,610 --> 01:19:25,630 But notice what you can do in Python. 1576 01:19:25,630 --> 01:19:30,460 It turns out that you can actually swap two things at once. 1577 01:19:30,460 --> 01:19:34,020 And it's because of a feature that's implicit in the syntax here.
1578 01:19:34,020 --> 01:19:37,990 These are actually data types on each side of the equals sign. 1579 01:19:37,990 --> 01:19:40,650 It turns out that Python supports not just lists, 1580 01:19:40,650 --> 01:19:43,280 which we've generally known thus far as arrays in C, 1581 01:19:43,280 --> 01:19:47,130 but it also supports, again, tuples, a data structure that 1582 01:19:47,130 --> 01:19:50,320 allows you a comma separated list of values, 1583 01:19:50,320 --> 01:19:53,960 the burden of which is entirely on you to remember what comes first, 1584 01:19:53,960 --> 01:19:55,675 what comes last, what's in the middle. 1585 01:19:55,675 --> 01:19:58,800 But by way of doing this-- and I can do this in a couple of different ways. 1586 01:19:58,800 --> 01:20:00,810 And I can do it not even just with tuples. 1587 01:20:00,810 --> 01:20:04,020 You can think of this a little more like this, like an xy coordinates, 1588 01:20:04,020 --> 01:20:05,790 Cartesian plane and so forth. 1589 01:20:05,790 --> 01:20:09,100 You can actually consider this as happening really simultaneously, 1590 01:20:09,100 --> 01:20:12,310 but letting the language, Python and its interpreter, 1591 01:20:12,310 --> 01:20:16,240 figure out how to do that switcheroo without losing 1592 01:20:16,240 --> 01:20:18,560 one or both of the variables in the process. 1593 01:20:18,560 --> 01:20:20,960 It doesn't matter to us the low level implementation 1594 01:20:20,960 --> 01:20:24,400 detail that that might actually require some kind of temporary storage. 1595 01:20:24,400 --> 01:20:26,700 That is now a feature of the language that we 1596 01:20:26,700 --> 01:20:32,040 get for free if we actually want to assign two values simultaneously. 1597 01:20:32,040 --> 01:20:34,370 And this is actually powerful for that same reason. 
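The on-screen code isn't captured in the transcript, so the following is a sketch of how swap.py could look with the tuple syntax just described. The exact wording of the printed strings is an assumption based on what is said aloud.

```python
# swap.py (sketch): swapping two variables with tuple assignment.

x = 1
y = 2
print("x is {}".format(x))
print("y is {}".format(y))
print("Swapping...")

# The right-hand side is evaluated first, producing the tuple (2, 1),
# which is then unpacked into x and y. No explicit temp variable is needed.
x, y = y, x

print("Swapped!")
print("x is {}".format(x))
print("y is {}".format(y))
```

The same assignment could also be written with explicit parentheses, (x, y) = (y, x), which makes the tuples on each side of the equals sign a little more visible.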
1598 01:20:34,370 --> 01:20:37,140 It turns out that if you have some function called 1599 01:20:37,140 --> 01:20:41,750 foo that returns a single value, you could do something like this 1600 01:20:41,750 --> 01:20:45,140 to get back that value, as we've been doing all throughout these examples. 1601 01:20:45,140 --> 01:20:48,980 But it turns out foo could potentially return two values, which 1602 01:20:48,980 --> 01:20:50,320 you could assign like this. 1603 01:20:50,320 --> 01:20:52,780 Or foo could return three values like this. 1604 01:20:52,780 --> 01:20:55,930 If foo was indeed implemented as returning a tuple, 1605 01:20:55,930 --> 01:20:58,465 a comma separated list of values like this. 1606 01:20:58,465 --> 01:21:00,840 So you don't want to take this necessarily to an extreme. 1607 01:21:00,840 --> 01:21:02,840 But in C, you might recall that we did not 1608 01:21:02,840 --> 01:21:06,050 have this capability of being able to return multiple values. 1609 01:21:06,050 --> 01:21:08,960 And that is now an option, although there's alternatives 1610 01:21:08,960 --> 01:21:12,460 to needing to do that altogether. 1611 01:21:12,460 --> 01:21:16,700 So we're almost caught up in time in Python vis-a-vis where we started 1612 01:21:16,700 --> 01:21:18,940 and where we ended with C. But let's introduce 1613 01:21:18,940 --> 01:21:23,460 one other feature of Python that allows us to translate something 1614 01:21:23,460 --> 01:21:25,110 from C as well. 1615 01:21:25,110 --> 01:21:28,310 Recall that we introduced structures some time ago. 1616 01:21:28,310 --> 01:21:30,870 And indeed, I'm going to go ahead here and save 1617 01:21:30,870 --> 01:21:37,050 a file called structs0.py, which is a bit misleading because these 1618 01:21:37,050 --> 01:21:38,300 aren't technically structures. 1619 01:21:38,300 --> 01:21:40,500 They're objects, as I'm about to use. 1620 01:21:40,500 --> 01:21:42,240 But we'll clarify that in a moment. 
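Returning to the foo example from a moment ago: the transcript doesn't show foo's body, so the three functions below are purely hypothetical stand-ins (foo_one, foo_two, and foo_three are made-up names) meant to sketch how one, two, or three return values can be assigned.

```python
# Hypothetical functions illustrating single and multiple return values.

def foo_one():
    return 1


def foo_two():
    # Technically this returns a single tuple, (1, 2).
    return 1, 2


def foo_three():
    return 1, 2, 3


a = foo_one()          # one value
b, c = foo_two()       # tuple unpacked into two variables
d, e, f = foo_three()  # tuple unpacked into three variables

print(a, b, c, d, e, f)
```

The number of names on the left has to match the number of values in the returned tuple; otherwise Python raises a ValueError at runtime.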
1621 01:21:42,240 --> 01:21:44,930 Let me go ahead here and import CS50. 1622 01:21:44,930 --> 01:21:50,680 And let me also import, using slightly different syntax, this. 1623 01:21:50,680 --> 01:21:56,950 In a moment, I'm going to create on the fly my own Python module, my own class, 1624 01:21:56,950 --> 01:22:00,850 if you will, called student, inside of which 1625 01:22:00,850 --> 01:22:04,290 is going to be a class called Student capital S. 1626 01:22:04,290 --> 01:22:08,110 And first, let's assume that it exists so that I can just 1627 01:22:08,110 --> 01:22:10,190 take on faith that it will soon exist. 1628 01:22:10,190 --> 01:22:15,580 And let me give myself a list of students like this, an empty array, 1629 01:22:15,580 --> 01:22:18,860 if you will, as implied by the square bracket notation here. 1630 01:22:18,860 --> 01:22:19,634 So new syntax. 1631 01:22:19,634 --> 01:22:21,300 But what's nice is it's pretty readable. 1632 01:22:21,300 --> 01:22:23,510 On the left is the variable's name, assigning 1633 01:22:23,510 --> 01:22:24,800 what's on the right hand side. 1634 01:22:24,800 --> 01:22:27,480 We've seen square brackets for arrays or lists more generally. 1635 01:22:27,480 --> 01:22:30,460 So this just means give me an empty list and assign it to students. 1636 01:22:30,460 --> 01:22:35,320 Unlike strings, a list in Python is mutable, changeable. 1637 01:22:35,320 --> 01:22:38,910 So this does not mean that students is forever going to be an empty list. 1638 01:22:38,910 --> 01:22:42,930 We can add and append things to it, much like a stack or a queue or a linked 1639 01:22:42,930 --> 01:22:44,962 list more generally. 1640 01:22:44,962 --> 01:22:46,420 So now let me go ahead and do this. 1641 01:22:46,420 --> 01:22:50,390 For i in range three-- I'm just going to arbitrarily do this three times, just 1642 01:22:50,390 --> 01:22:51,930 like we did a few weeks ago. 
1643 01:22:51,930 --> 01:22:57,110 I'm going to in here now print out name with no line ending, just 1644 01:22:57,110 --> 01:22:58,440 to keep things pretty. 1645 01:22:58,440 --> 01:23:00,810 Let me go ahead then and use cs50.get_string 1646 01:23:00,810 --> 01:23:03,030 to actually get a student's name. 1647 01:23:03,030 --> 01:23:06,270 Then let me say hey, give me your dorm with no line ending, 1648 01:23:06,270 --> 01:23:07,430 just to keep it clean. 1649 01:23:07,430 --> 01:23:10,290 And then dorm gets cs50.get_string. 1650 01:23:10,290 --> 01:23:17,110 And then down here, let me do students.append Student name dorm. 1651 01:23:17,110 --> 01:23:18,560 So this is new now. 1652 01:23:18,560 --> 01:23:20,970 And we'll come back to this in just a moment. 1653 01:23:20,970 --> 01:23:23,460 Then after this loop, let's just for good measure do this. 1654 01:23:23,460 --> 01:23:28,690 For student in students, print the following placeholder 1655 01:23:28,690 --> 01:23:31,200 is in placeholder. 1656 01:23:31,200 --> 01:23:36,580 Then format student.name, student.dorm. 1657 01:23:36,580 --> 01:23:40,120 So now things are getting a little more interesting. 1658 01:23:40,120 --> 01:23:42,920 I have now done a few things in this program. 1659 01:23:42,920 --> 01:23:45,040 I have imported something called Student, which 1660 01:23:45,040 --> 01:23:46,920 doesn't yet exist but will in a moment. 1661 01:23:46,920 --> 01:23:51,280 I have declared a variable, or a list, specifically, 1662 01:23:51,280 --> 01:23:53,530 called students, and assigned it an empty list. 1663 01:23:53,530 --> 01:23:55,290 Then I'm iterating three times arbitrarily 1664 01:23:55,290 --> 01:23:58,660 just so we have a demo to play with saying, give me your name, 1665 01:23:58,660 --> 01:24:01,250 give me your dorm, and then this. 1666 01:24:01,250 --> 01:24:04,910 So students is an object, as we say, a structure in C.
1667 01:24:04,910 --> 01:24:09,137 But now we call them objects, inside of which is going to be data. 1668 01:24:09,137 --> 01:24:10,220 There's not much data now. 1669 01:24:10,220 --> 01:24:11,450 It's just an empty list. 1670 01:24:11,450 --> 01:24:14,060 But it turns out, if you read the documentation for Python, 1671 01:24:14,060 --> 01:24:18,660 you'll see that a list has some built-in functions, or methods, as 1672 01:24:18,660 --> 01:24:22,650 well-- not just data, but also functionality-- one of which 1673 01:24:22,650 --> 01:24:24,069 is called append. 1674 01:24:24,069 --> 01:24:25,860 And if we read the documentation, we see we 1675 01:24:25,860 --> 01:24:30,450 can pass in an argument to append that is a variable or a value 1676 01:24:30,450 --> 01:24:33,440 that we want to append to the list, add to the end of it. 1677 01:24:33,440 --> 01:24:36,150 And we'll see in a moment what this syntax means. 1678 01:24:36,150 --> 01:24:42,880 It turns out this is similar in spirit to using malloc in C to malloc a struct 1679 01:24:42,880 --> 01:24:47,790 and then put inside of it two values, name and dorm. 1680 01:24:47,790 --> 01:24:51,390 But what's nice about Python and languages like PHP and Ruby 1681 01:24:51,390 --> 01:24:54,420 and Java, all of which support something similar in spirit, 1682 01:24:54,420 --> 01:24:58,940 is this single line gives me a new student object, inside of which 1683 01:24:58,940 --> 01:25:02,230 is that student's name and dorm as strings. 1684 01:25:02,230 --> 01:25:04,900 Later, outside of this loop, just for good measure, 1685 01:25:04,900 --> 01:25:07,490 we reiterate over this list as follows. 1686 01:25:07,490 --> 01:25:10,330 For student in students, well, what is this doing? 1687 01:25:10,330 --> 01:25:14,250 This, again, is an iterable list. 
1688 01:25:14,250 --> 01:25:21,180 So not irritable, iterable list, whereby you can iterate over this list, 1689 01:25:21,180 --> 01:25:25,760 calling each element inside of it temporarily student, 1690 01:25:25,760 --> 01:25:27,560 as in our previous use of for. 1691 01:25:27,560 --> 01:25:29,940 And then just print so-and-so is in this dorm, 1692 01:25:29,940 --> 01:25:35,430 formatting those two values using the same dot notation as we used in C. 1693 01:25:35,430 --> 01:25:37,424 So we need a Student class. 1694 01:25:37,424 --> 01:25:38,840 Otherwise, what's going to happen? 1695 01:25:38,840 --> 01:25:41,690 Let me go ahead and try to run this incorrectly as follows. 1696 01:25:41,690 --> 01:25:44,000 python structs0.py. 1697 01:25:44,000 --> 01:25:45,290 Enter. 1698 01:25:45,290 --> 01:25:46,100 Import error. 1699 01:25:46,100 --> 01:25:47,950 No module named student. 1700 01:25:47,950 --> 01:25:51,380 So creating a Python module, it turns out, is super simple. 1701 01:25:51,380 --> 01:25:53,390 I create a file called student.py. 1702 01:25:53,390 --> 01:25:55,450 I now have a module called student. 1703 01:25:55,450 --> 01:25:57,190 Of course, there's nothing in there. 1704 01:25:57,190 --> 01:25:59,260 So I need to actually populate it. 1705 01:25:59,260 --> 01:26:00,700 So let me go ahead and do this. 1706 01:26:00,700 --> 01:26:04,050 And we'll come back to this in the future with a bit more complexity. 1707 01:26:04,050 --> 01:26:06,840 But for now, let me introduce, with a bit of a wave 1708 01:26:06,840 --> 01:26:08,270 of the hand, the following. 1709 01:26:08,270 --> 01:26:12,440 If I want to create a structure called Student, technically in Python, 1710 01:26:12,440 --> 01:26:14,380 it's called a class.
1711 01:26:14,380 --> 01:26:16,810 And that class should be Student, the convention of which 1712 01:26:16,810 --> 01:26:19,420 is to call your structures in Python, your classes, 1713 01:26:19,420 --> 01:26:21,830 with a capital letter for the first letter. 1714 01:26:21,830 --> 01:26:25,610 And now I'm going to define a standard method called 1715 01:26:25,610 --> 01:26:30,720 init that takes as its first argument a parameter that's conventionally 1716 01:26:30,720 --> 01:26:35,250 called self, and then any number of other arguments that I want to pass it. 1717 01:26:35,250 --> 01:26:38,280 And then inside here, I'm going to do self.name 1718 01:26:38,280 --> 01:26:42,010 gets name and self.dorm gets dorm. 1719 01:26:42,010 --> 01:26:45,740 So this is perhaps the most new-looking piece of code 1720 01:26:45,740 --> 01:26:47,640 that we've seen thus far in Python. 1721 01:26:47,640 --> 01:26:49,920 And we'll explain it just at a high level for now. 1722 01:26:49,920 --> 01:26:53,360 But in line 1, we're saying, hey Python, give me a new structure. 1723 01:26:53,360 --> 01:26:57,600 Give me a class called Student, capital S. Line 2, hey Python, inside 1724 01:26:57,600 --> 01:27:01,340 of this class, there shall be a method, a function, 1725 01:27:01,340 --> 01:27:03,890 that's called init for initialization. 1726 01:27:03,890 --> 01:27:06,950 And it's going to take by convention three arguments, the first of which 1727 01:27:06,950 --> 01:27:09,950 you just have to do, let's say, for now, the second and third 1728 01:27:09,950 --> 01:27:12,280 and beyond of which are completely up to you. 1729 01:27:12,280 --> 01:27:14,380 Name and dorm is what I chose. 1730 01:27:14,380 --> 01:27:16,510 And what's neat is this. 
1731 01:27:16,510 --> 01:27:23,300 Lines 3 and 4 mean whatever the user passes into me as a student's name 1732 01:27:23,300 --> 01:27:29,720 and dorm when this class is instantiated, allocated as an object, 1733 01:27:29,720 --> 01:27:35,150 go ahead and remember their name and dorm inside of these instance variables 1734 01:27:35,150 --> 01:27:38,680 called self.name and self.dorm. 1735 01:27:38,680 --> 01:27:41,020 So if you think of the scenario as follows, 1736 01:27:41,020 --> 01:27:46,900 in structs0.py, we had this line of code toward the end. 1737 01:27:46,900 --> 01:27:51,000 Not only were we appending something to the list called students. 1738 01:27:51,000 --> 01:27:53,330 We had this highlighted portion of code. 1739 01:27:53,330 --> 01:27:57,930 Capital Student, open paren, name, dorm, closed paren. 1740 01:27:57,930 --> 01:28:01,400 That is similar in spirit, again, to calling malloc in C 1741 01:28:01,400 --> 01:28:05,590 and automatically, all in one breath, installing inside of it 1742 01:28:05,590 --> 01:28:07,110 two values, name and dorm. 1743 01:28:07,110 --> 01:28:10,260 So if this is similar in spirit to malloc, you can think of this line 1744 01:28:10,260 --> 01:28:16,120 here, this highlighted portion, as creating somewhere in memory, 1745 01:28:16,120 --> 01:28:20,270 in your computer-- doesn't matter where-- a structure like my fist here, 1746 01:28:20,270 --> 01:28:22,020 passing into it name and dorm. 1747 01:28:22,020 --> 01:28:27,920 And then what happens on those two lines of code in student.py, lines 3 and 4, 1748 01:28:27,920 --> 01:28:31,320 is if name and dorm are the two values that were passed in, 1749 01:28:31,320 --> 01:28:35,900 they get stored inside of this structure and saved permanently 1750 01:28:35,900 --> 01:28:39,746 in what are called instance variables inside of self. 1751 01:28:39,746 --> 01:28:43,820 Self just refers to the object that has been allocated.
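To make the pieces concrete, here is a sketch that puts the Student class and the loop from structs0.py into a single file. Two things differ from the lecture and are assumptions for the sake of a self-contained example: the class lives in the same file rather than in student.py, and the names and dorms are hard-coded rather than read with the CS50 library.

```python
# Sketch combining student.py and structs0.py in one file.

class Student:
    def __init__(self, name, dorm):
        # Remember the values passed in as instance variables on this object.
        self.name = name
        self.dorm = dorm


# Hard-coded stand-in for the three rounds of user input in the lecture.
inputs = [("David", "Mather"), ("Zamaila", "Courier"), ("Rob", "Kirkland")]

students = []  # an empty, mutable list
for name, dorm in inputs:
    # Student(name, dorm) allocates a new object and runs __init__ on it.
    students.append(Student(name, dorm))

for student in students:
    print("{} is in {}".format(student.name, student.dorm))
```

In the lecture's layout, the class would sit in student.py and structs0.py would pull it in with a line such as from student import Student.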
1752 01:28:43,820 --> 01:28:45,860 So we'll come back to that before long. 1753 01:28:45,860 --> 01:28:49,250 But just take on faith for now that init has to be the name of the method 1754 01:28:49,250 --> 01:28:50,100 that you use. 1755 01:28:50,100 --> 01:28:53,830 Self is conventionally used as the first argument there. 1756 01:28:53,830 --> 01:28:57,150 And this just ensures that we're remembering a student's 1757 01:28:57,150 --> 01:29:00,070 name and his or her dorm as well. 1758 01:29:00,070 --> 01:29:03,920 So if I now run this, you'll see I'm prompted for David. 1759 01:29:03,920 --> 01:29:11,930 And I'll say Mather and Zamaila and Courier and Rob and Kirkland. 1760 01:29:11,930 --> 01:29:12,740 Enter. 1761 01:29:12,740 --> 01:29:14,490 And the program doesn't do all that much. 1762 01:29:14,490 --> 01:29:17,170 But it manipulates and it creates these objects, 1763 01:29:17,170 --> 01:29:20,390 and ultimately does something useful with them. 1764 01:29:20,390 --> 01:29:22,620 But it throws the information away. 1765 01:29:22,620 --> 01:29:25,170 And so for our command line examples here, 1766 01:29:25,170 --> 01:29:29,310 let's do one final example that improves upon that as follows. 1767 01:29:29,310 --> 01:29:34,190 Let me go ahead and create a new file called structs1.py, similar in spirit 1768 01:29:34,190 --> 01:29:37,900 to what we did some time ago in structs1.c. 1769 01:29:37,900 --> 01:29:40,150 I'm going to start with that same code from before. 1770 01:29:40,150 --> 01:29:42,680 And I'm going to keep around student.py. 1771 01:29:42,680 --> 01:29:45,744 But instead of just printing it, you know what? 1772 01:29:45,744 --> 01:29:47,910 I'm going to get rid of the printing of these names. 1773 01:29:47,910 --> 01:29:49,430 I'm going to instead do this. 1774 01:29:49,430 --> 01:29:56,500 File gets open students.csv, w, quote, unquote. 1775 01:29:56,500 --> 01:30:03,130 Writer gets csv.writer file. For student in students, just as before.
1776 01:30:03,130 --> 01:30:11,320 Writer.writerow student.name, student.dorm, file.close. 1777 01:30:11,320 --> 01:30:14,840 1778 01:30:14,840 --> 01:30:17,680 Definitely a mouthful, and it's not perfect yet. 1779 01:30:17,680 --> 01:30:19,610 But let's try to glean what I'm doing. 1780 01:30:19,610 --> 01:30:22,230 Open turns out is similar in spirit to fopen from C. 1781 01:30:22,230 --> 01:30:25,520 And it takes two arguments just like in C, which is wonderful, 1782 01:30:25,520 --> 01:30:27,650 the name of the file to open and the mode 1783 01:30:27,650 --> 01:30:30,620 in which you want to open it-- writing, w, or reading, r. 1784 01:30:30,620 --> 01:30:32,320 And there's a few other options too. 1785 01:30:32,320 --> 01:30:36,760 This just returns to me a reference to that file somehow. 1786 01:30:36,760 --> 01:30:39,370 And indeed, all this time I've been describing variables 1787 01:30:39,370 --> 01:30:40,600 as just that, variables. 1788 01:30:40,600 --> 01:30:45,560 But technically speaking, all of these variables-- x and y, and now file and s 1789 01:30:45,560 --> 01:30:49,510 and t and others-- are references or symbols that 1790 01:30:49,510 --> 01:30:52,559 have been bound to objects in memory. 1791 01:30:52,559 --> 01:30:54,850 Which is just to say that you'll see online, especially 1792 01:30:54,850 --> 01:30:57,599 when reading up on Python, that there's certain terminology that's 1793 01:30:57,599 --> 01:30:58,850 associated with the language. 1794 01:30:58,850 --> 01:31:01,910 But at the end of the day, the ideas are no different fundamentally 1795 01:31:01,910 --> 01:31:05,470 from what we've been doing in Scratch and in C. These are just a variable 1796 01:31:05,470 --> 01:31:06,360 called file. 1797 01:31:06,360 --> 01:31:08,290 Here's another variable called writer. 1798 01:31:08,290 --> 01:31:13,300 And it is storing the return value of CSV.writer file. 1799 01:31:13,300 --> 01:31:13,966 So what's this? 
1800 01:31:13,966 --> 01:31:16,090 I only knew this by reading up on the documentation 1801 01:31:16,090 --> 01:31:19,820 because I was curious in Python, how do I actually save my data inside 1802 01:31:19,820 --> 01:31:23,420 of a CSV, Comma Separated Values file? 1803 01:31:23,420 --> 01:31:25,720 This is sort of a very super simple Excel 1804 01:31:25,720 --> 01:31:28,570 file that just uses commas to separate what are effectively 1805 01:31:28,570 --> 01:31:29,990 different columns in a file. 1806 01:31:29,990 --> 01:31:35,286 So my goal here is to ultimately print David, Mather, Enter. 1807 01:31:35,286 --> 01:31:37,680 Zamaila, Courier, Enter. 1808 01:31:37,680 --> 01:31:40,247 Rob, Kirkland, Enter. 1809 01:31:40,247 --> 01:31:40,830 And that's it. 1810 01:31:40,830 --> 01:31:43,260 And save it permanently on disk, if you will, 1811 01:31:43,260 --> 01:31:45,830 so that we actually keep this information around. 1812 01:31:45,830 --> 01:31:47,460 So what does this do for me? 1813 01:31:47,460 --> 01:31:52,230 It turns out that Python comes with a built-in feature called the CSV 1814 01:31:52,230 --> 01:31:55,880 Module, inside of which is a whole bunch of functionality, some of which 1815 01:31:55,880 --> 01:31:59,500 is this one here, a class called writer that 1816 01:31:59,500 --> 01:32:02,910 takes one argument when you instantiate it called file. 1817 01:32:02,910 --> 01:32:06,960 So this just means, hey Python, give me a writer for CSVs. 1818 01:32:06,960 --> 01:32:11,790 Give me an object whose purpose in life is to write CSV files to hard drives. 1819 01:32:11,790 --> 01:32:14,096 Iterate over my students in students. 1820 01:32:14,096 --> 01:32:15,970 And then just from reading the documentation, 1821 01:32:15,970 --> 01:32:21,610 I know that I can call writer.writerow, which is a bit hard to say quickly 1822 01:32:21,610 --> 01:32:23,790 several times, but writerow. 
1823 01:32:23,790 --> 01:32:27,130 And then it takes as an argument a tuple in this case. 1824 01:32:27,130 --> 01:32:28,890 That's why there's the double parentheses. 1825 01:32:28,890 --> 01:32:34,260 A tuple, a comma separated list of values, which in this case 1826 01:32:34,260 --> 01:32:37,750 I want to be student.name and student.dorm. 1827 01:32:37,750 --> 01:32:39,340 Then I close the file at the end. 1828 01:32:39,340 --> 01:32:43,480 So the net result here is kind of underwhelming to run. 1829 01:32:43,480 --> 01:32:45,690 And indeed, we're about to see a bug. 1830 01:32:45,690 --> 01:32:48,076 Python structs1.py. 1831 01:32:48,076 --> 01:32:48,576 Enter. 1832 01:32:48,576 --> 01:32:51,090 1833 01:32:51,090 --> 01:32:56,400 David Mather, Zamaila Courier, Rob Kirkland. 1834 01:32:56,400 --> 01:32:56,980 Damn it. 1835 01:32:56,980 --> 01:32:59,300 After all that input, then there's an error. 1836 01:32:59,300 --> 01:33:03,430 But this is actually illustrative of another feature, or design aspect, 1837 01:33:03,430 --> 01:33:04,880 of Python. 1838 01:33:04,880 --> 01:33:07,150 I'm not necessarily going to get compilation errors. 1839 01:33:07,150 --> 01:33:10,080 I might actually get runtime logical errors. 1840 01:33:10,080 --> 01:33:13,010 If I have made a mistake in my program that 1841 01:33:13,010 --> 01:33:16,420 isn't something super simple or dumb or frustrating, 1842 01:33:16,420 --> 01:33:20,510 like leaving off a parenthesis or a misplaced comma, 1843 01:33:20,510 --> 01:33:22,900 or something like that that's syntactically invalid, 1844 01:33:22,900 --> 01:33:26,097 Python might not notice that my program is buggy. 1845 01:33:26,097 --> 01:33:28,430 Because if it scans my code top to bottom, left to right 1846 01:33:28,430 --> 01:33:30,490 and doesn't notice some glaring syntax issue, 1847 01:33:30,490 --> 01:33:34,400 it might proceed to just run the program for me, that is, interpret the program. 
1848 01:33:34,400 --> 01:33:39,660 Only once the Python interpreter gets to a line of code that syntactically 1849 01:33:39,660 --> 01:33:45,550 is correct but confuses it might it bail out with a so-called runtime error, 1850 01:33:45,550 --> 01:33:47,840 or more properly, throw an exception. 1851 01:33:47,840 --> 01:33:50,200 This one's saying name CSV is not defined. 1852 01:33:50,200 --> 01:33:52,540 And indeed, if I scroll up, the first time 1853 01:33:52,540 --> 01:33:57,750 I mention CSV was indeed on this line with the x, undefined variable CSV. 1854 01:33:57,750 --> 01:33:58,520 You know what? 1855 01:33:58,520 --> 01:33:59,375 I messed up. 1856 01:33:59,375 --> 01:34:02,207 I should have imported the CSV module. 1857 01:34:02,207 --> 01:34:04,290 And I would only know that from the documentation. 1858 01:34:04,290 --> 01:34:08,382 But I can infer as much from the fact that CSV does not exist. 1859 01:34:08,382 --> 01:34:09,590 Let's try this one more time. 1860 01:34:09,590 --> 01:34:17,510 David Mather, Zamaila Courier, Rob Kirkland, and Enter. 1861 01:34:17,510 --> 01:34:19,340 Nothing seems to happen. 1862 01:34:19,340 --> 01:34:24,930 But notice students.csv has now appeared. 1863 01:34:24,930 --> 01:34:29,470 And indeed, I have David, Mather, Zamaila, Courier, Rob, Kirkland. 1864 01:34:29,470 --> 01:34:31,820 I have my own tiny little database. 1865 01:34:31,820 --> 01:34:34,370 It's not a database in a particularly fancy sense. 1866 01:34:34,370 --> 01:34:35,210 I can't query it. 1867 01:34:35,210 --> 01:34:36,600 I can't change it very easily. 1868 01:34:36,600 --> 01:34:39,140 I have to just rewrite the whole thing out essentially. 1869 01:34:39,140 --> 01:34:41,870 But I have now persisted this data. 1870 01:34:41,870 --> 01:34:44,740 And never before in these Python examples 1871 01:34:44,740 --> 01:34:46,490 have we kept any of the information around 1872 01:34:46,490 --> 01:34:50,930 until now, much like the equivalent C version. 
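A sketch of the CSV-writing step, with the import csv line that was initially missing. Several details here are assumptions made so that the example runs on its own: the rows are hard-coded tuples rather than Student objects, the file is written into a temporary directory rather than the current one, and newline="" is passed to open, as the csv module's documentation recommends.

```python
# Sketch of the file-writing part of structs1.py.
import csv
import os
import tempfile

# Hard-coded stand-in for the students gathered from user input.
rows = [("David", "Mather"), ("Zamaila", "Courier"), ("Rob", "Kirkland")]

# Write into a temporary directory so the example leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "students.csv")

file = open(path, "w", newline="")
writer = csv.writer(file)
for name, dorm in rows:
    # writerow takes one sequence, here a tuple, hence the double parentheses.
    writer.writerow((name, dorm))
file.close()

# Read the file back to confirm what was persisted.
with open(path, newline="") as f:
    contents = list(csv.reader(f))
print(contents)
```

Leaving out the import csv line reproduces the error from the lecture: the program runs up to the first mention of csv and only then raises a NameError.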
1873 01:34:50,930 --> 01:34:53,010 So guess what else we can do with Python. 1874 01:34:53,010 --> 01:34:55,520 Not only can we reimplement all of weeks 1 1875 01:34:55,520 --> 01:34:57,850 through 5 examples from C in Python. 1876 01:34:57,850 --> 01:35:01,330 So can we implement the entirety of our recent spell checker. 1877 01:35:01,330 --> 01:35:04,200 For instance, you may recall that the staff solution for speller 1878 01:35:04,200 --> 01:35:07,300 was run a little something as follows at the prompt, whereby 1879 01:35:07,300 --> 01:35:09,040 we specify optionally a dictionary. 1880 01:35:09,040 --> 01:35:10,956 But I'm going to go ahead and use the default. 1881 01:35:10,956 --> 01:35:14,080 And then I can spell check something like AustinPowers.txt, 1882 01:35:14,080 --> 01:35:17,600 which, in the CS50 staff solution, which this one happens to use a trie, 1883 01:35:17,600 --> 01:35:21,600 took me a total of 0.05 seconds to spell check a pretty 1884 01:35:21,600 --> 01:35:24,860 big dictionary with 19,190 words. 1885 01:35:24,860 --> 01:35:27,230 But it took me a long time to implement that trie. 1886 01:35:27,230 --> 01:35:29,188 It probably took you quite a while to implement 1887 01:35:29,188 --> 01:35:32,240 your trie, your hash table, your linked list, or other data structure. 1888 01:35:32,240 --> 01:35:35,160 But let me propose that today, we have in our speller 1889 01:35:35,160 --> 01:35:37,954 directory a reimplementation of speller in Python. 1890 01:35:37,954 --> 01:35:40,370 And this was the program you didn't need to worry too much 1891 01:35:40,370 --> 01:35:43,346 about in C. Speller.c we asked you to read through and understand. 1892 01:35:43,346 --> 01:35:44,720 But you didn't need to change it. 1893 01:35:44,720 --> 01:35:47,090 And so indeed today, we won't change it either.
1894 01:35:47,090 --> 01:35:54,070 But I'm going to go ahead and create a file called dictionary.py, 1895 01:35:54,070 --> 01:35:58,300 inside of which is going to be my very own implementation of this dictionary. 1896 01:35:58,300 --> 01:36:03,510 And it turns out in Python, we can implement speller as follows. 1897 01:36:03,510 --> 01:36:08,040 Class dictionary, thereby giving me really the equivalent of a structure, 1898 01:36:08,040 --> 01:36:11,050 much like we have in C. And I'm going to go ahead inside of this 1899 01:36:11,050 --> 01:36:15,300 and declare a function that's by default, and convention called init. 1900 01:36:15,300 --> 01:36:18,050 That takes in one argument, in this case called self. 1901 01:36:18,050 --> 01:36:22,810 And I'm going to simply do self.words gets set where set, it turns out, 1902 01:36:22,810 --> 01:36:24,830 is a function in Python that returns to me 1903 01:36:24,830 --> 01:36:29,870 an empty set, a collection of values that facilitate, generally, 1904 01:36:29,870 --> 01:36:33,210 on the average case, constant time lookups of whether something's there, 1905 01:36:33,210 --> 01:36:36,350 and constant time insertions of putting something into that set, 1906 01:36:36,350 --> 01:36:37,940 much like a mathematical set. 1907 01:36:37,940 --> 01:36:43,000 I'm now going to go ahead and implement my load function in Python as follows, 1908 01:36:43,000 --> 01:36:46,940 whereby I take in self as an argument as before, by convention, but then also 1909 01:36:46,940 --> 01:36:49,520 the name of the file to use as my dictionary. 1910 01:36:49,520 --> 01:36:52,330 And similar to C, I'm going to use a function like fopen, 1911 01:36:52,330 --> 01:36:56,180 but this time called open, where I simply pass in dictionary and quote, 1912 01:36:56,180 --> 01:36:57,480 unquote, r. 
1913 01:36:57,480 --> 01:37:04,040 And then for each line in that file, I am going to access the set called words 1914 01:37:04,040 --> 01:37:08,940 and add to it the line I've just encountered after stripping off 1915 01:37:08,940 --> 01:37:11,620 the trailing new line. 1916 01:37:11,620 --> 01:37:14,400 Then I am going to close the file. 1917 01:37:14,400 --> 01:37:16,190 And I'm going to return true. 1918 01:37:16,190 --> 01:37:18,800 And I'm going to have finished my homework for load. 1919 01:37:18,800 --> 01:37:20,820 With just those few lines of code, can we 1920 01:37:20,820 --> 01:37:24,020 reimplement the entirety of the load function 1921 01:37:24,020 --> 01:37:29,570 for problem set 5 speller dictionary in Python itself? 1922 01:37:29,570 --> 01:37:33,080 Now the check function, maybe that's where the price is paid. 1923 01:37:33,080 --> 01:37:35,745 Maybe the tradeoff is check's going to be really, really scary. 1924 01:37:35,745 --> 01:37:38,370 So I'm going to implement this one as a method inside here too, 1925 01:37:38,370 --> 01:37:40,820 taking in a word that we want to spellcheck. 1926 01:37:40,820 --> 01:37:46,960 And I'm going to return word.lower in self.words. 1927 01:37:46,960 --> 01:37:48,530 And that's it for the check method. 1928 01:37:48,530 --> 01:37:49,700 What is this doing? 1929 01:37:49,700 --> 01:37:55,750 This is saying, return, true or false, whether the lowercase version 1930 01:37:55,750 --> 01:38:00,570 of the given word is in my own word set. 1931 01:38:00,570 --> 01:38:03,820 So self.words just refers to this container that's initially empty 1932 01:38:03,820 --> 01:38:06,730 but that has just been populated by the load method 1933 01:38:06,730 --> 01:38:10,560 by adding in all of the words that we loaded from that file. 1934 01:38:10,560 --> 01:38:13,690 So this true or false is implemented as follows. 
1935 01:38:13,690 --> 01:38:17,550 Lowercase the given word and check whether it's in that set, 1936 01:38:17,550 --> 01:38:20,820 and return true or false in just one line. 1937 01:38:20,820 --> 01:38:21,550 Well, all right. 1938 01:38:21,550 --> 01:38:24,060 Maybe size is going to be where the price is paid. 1939 01:38:24,060 --> 01:38:26,400 Maybe size is what's really broken here. 1940 01:38:26,400 --> 01:38:28,600 So let's go ahead and implement size. 1941 01:38:28,600 --> 01:38:32,590 And let me return the length of self.words. 1942 01:38:32,590 --> 01:38:33,090 All right. 1943 01:38:33,090 --> 01:38:36,540 That one's perhaps not a surprise since size in C is also pretty easy. 1944 01:38:36,540 --> 01:38:37,830 But what about unload? 1945 01:38:37,830 --> 01:38:41,510 Well, how about in unload, we similarly declare it. 1946 01:38:41,510 --> 01:38:43,940 Well, there's nothing to unload because Python does 1947 01:38:43,940 --> 01:38:45,592 all of your memory management for you. 1948 01:38:45,592 --> 01:38:48,050 So even though you might be allocating more and more memory 1949 01:38:48,050 --> 01:38:50,980 as you use this set, there's nothing to actually unload 1950 01:38:50,980 --> 01:38:53,380 because the interpreter will do that for you. 1951 01:38:53,380 --> 01:38:57,550 So it turns out that all of these conversions from C to Python 1952 01:38:57,550 --> 01:38:59,602 are useful in part because clearly, you can 1953 01:38:59,602 --> 01:39:01,560 implement the same kinds of programs that we've 1954 01:39:01,560 --> 01:39:02,768 been implementing for a week. 1955 01:39:02,768 --> 01:39:07,920 And frankly, in many cases, more easily and quicker, or with fewer lines 1956 01:39:07,920 --> 01:39:11,240 of code, or in a way that's just much less painful to write.
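The dictionary.py walkthrough above might look roughly like the following. This is a sketch reconstructed from the spoken description, not the file shown on screen: the method bodies follow what is said, but the use of len in size and the True return value from unload are assumptions.

```python
class Dictionary:
    """A spell-checking dictionary backed by a Python set."""

    def __init__(self):
        # A set gives average-case constant-time insertion and lookup
        self.words = set()

    def load(self, dictionary):
        # Open the dictionary file for reading, one word per line
        file = open(dictionary, "r")
        for line in file:
            # Strip the trailing newline before storing the word
            self.words.add(line.rstrip("\n"))
        file.close()
        return True

    def check(self, word):
        # True if the lowercased word is in the set, else False
        return word.lower() in self.words

    def size(self):
        # Assumption: the number of words is the length of the set
        return len(self.words)

    def unload(self):
        # Nothing to free by hand; the interpreter manages memory
        return True
```

A caller would create one with Dictionary(), call load with the path to a word list, and then call check on each word to be spell-checked.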
1957 01:39:11,240 --> 01:39:14,380 All of that low level stuff where you're implementing hash tables or trees 1958 01:39:14,380 --> 01:39:17,760 or tries is wonderfully illustrative of how those things work, and hopefully 1959 01:39:17,760 --> 01:39:21,140 gives you a true understanding of what's going on underneath the hood. 1960 01:39:21,140 --> 01:39:22,020 But my god. 1961 01:39:22,020 --> 01:39:26,190 If you just wanted to store words in a dictionary, if you 1962 01:39:26,190 --> 01:39:29,670 had to implement dozens of lines of code to implement your own trie, 1963 01:39:29,670 --> 01:39:32,930 or your own hash table or linked list, programming very quickly 1964 01:39:32,930 --> 01:39:37,650 devolves into an incredibly mundane, frustrating profession. 1965 01:39:37,650 --> 01:39:41,650 But in this case do we begin to see hints of other languages, 1966 01:39:41,650 --> 01:39:44,730 Python among them, that allow us to solve the same problems much more 1967 01:39:44,730 --> 01:39:47,610 quickly, much more efficiently, much more effectively, much more 1968 01:39:47,610 --> 01:39:50,210 pleasurably, such that now we can start to stand 1969 01:39:50,210 --> 01:39:52,830 on the shoulders of even more people who have come before us, 1970 01:39:52,830 --> 01:39:54,720 start building on not only this language, 1971 01:39:54,720 --> 01:39:57,040 but on other APIs and libraries. 1972 01:39:57,040 --> 01:40:00,570 And indeed, that's now why we introduced Python. 1973 01:40:00,570 --> 01:40:02,540 No longer in the weeks to come are we going 1974 01:40:02,540 --> 01:40:06,930 to be focusing on the command line alone, but rather 1975 01:40:06,930 --> 01:40:09,140 on web-based interfaces.
1976 01:40:09,140 --> 01:40:13,390 Indeed, in Python do we have the ability to so much more easily than in C 1977 01:40:13,390 --> 01:40:17,470 write web-based software, actual websites that 1978 01:40:17,470 --> 01:40:20,140 are dynamic, not just built out of HTML and CSS, 1979 01:40:20,140 --> 01:40:24,160 but that have shopping carts and use databases and send emails or SMSes, 1980 01:40:24,160 --> 01:40:27,290 or any number of dynamic features, all of which, to be fair, 1981 01:40:27,290 --> 01:40:30,900 we could implement in C. But it would be the most painful experience 1982 01:40:30,900 --> 01:40:33,970 in the world to implement a dynamic website with all 1983 01:40:33,970 --> 01:40:38,760 of those features in a lower level language like C. But with Python can 1984 01:40:38,760 --> 01:40:42,100 we start to do this so much more readily. 1985 01:40:42,100 --> 01:40:45,690 So how do we go about using Python to generate websites? 1986 01:40:45,690 --> 01:40:49,190 A couple of weeks ago when we first looked at HTML and CSS 1987 01:40:49,190 --> 01:40:53,200 and talked more generally about HTTP, we hard coded everything we wrote. 1988 01:40:53,200 --> 01:40:55,570 We wrote HTML in our text editor. 1989 01:40:55,570 --> 01:40:57,840 We wrote CSS in our text editor. 1990 01:40:57,840 --> 01:40:59,160 We saved those files. 1991 01:40:59,160 --> 01:41:02,000 And then we loaded them using our browser. 1992 01:41:02,000 --> 01:41:03,720 But there was nothing dynamic about it. 1993 01:41:03,720 --> 01:41:07,420 There was no even hello world program that dynamically took my name. 1994 01:41:07,420 --> 01:41:09,770 But we did discuss, in the context of HTTP, 1995 01:41:09,770 --> 01:41:13,050 this ability of web browsers and web servers 1996 01:41:13,050 --> 01:41:18,760 to use HTTP parameters in order to transmit inputs in between the two.
1997 01:41:18,760 --> 01:41:20,580 For instance, we talked about get, whereby 1998 01:41:20,580 --> 01:41:25,060 you can pass in these key value pairs via the get string, the query string, 1999 01:41:25,060 --> 01:41:26,240 in the URL itself. 2000 01:41:26,240 --> 01:41:28,450 We talked a bit about post, whereby you could 2001 01:41:28,450 --> 01:41:31,520 transmit more sensitive information, or bigger things like photographs 2002 01:41:31,520 --> 01:41:33,450 and passwords and confidential information, 2003 01:41:33,450 --> 01:41:38,860 via post, which is still passing in key value pairs from browser to server. 2004 01:41:38,860 --> 01:41:41,130 But we didn't at the time have any ability 2005 01:41:41,130 --> 01:41:45,730 to actually read or parse those inputs and produce dynamic outputs. 2006 01:41:45,730 --> 01:41:48,410 In fact, the most dynamic we got a couple of weeks ago 2007 01:41:48,410 --> 01:41:52,100 was with those search examples whereby I reimplemented the front end 2008 01:41:52,100 --> 01:41:57,360 interface of Google, sort of our very low budget version of Google's website. 2009 01:41:57,360 --> 01:41:59,700 And then I just completely punted to their back end 2010 01:41:59,700 --> 01:42:05,430 using the action attribute of https://www.google.com/search, 2011 01:42:05,430 --> 01:42:10,220 pretty much deferring entirely to Google all of the interesting, dynamic output 2012 01:42:10,220 --> 01:42:11,740 for my search results. 2013 01:42:11,740 --> 01:42:14,770 So today, we won't generate those search results ourselves. 2014 01:42:14,770 --> 01:42:19,600 But we will give ourselves, now that we have a language and the environment 2015 01:42:19,600 --> 01:42:22,970 with which to handle those inputs, we will give ourselves 2016 01:42:22,970 --> 01:42:26,640 the capability to start creating websites more like that. 
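To make the idea of key-value pairs in a URL concrete, here is a minimal sketch using Python's standard urllib.parse module to pull those pairs out of a query string. The helper name query_parameters and the q=cats parameter are assumptions chosen for illustration; they are not taken from the lecture's code.

```python
from urllib.parse import parse_qs, urlsplit


def query_parameters(url):
    """Return the key-value pairs in a URL's query string as a dict of lists."""
    # urlsplit separates the URL into parts; .query is everything after the "?"
    return parse_qs(urlsplit(url).query)


# Hypothetical search URL; each key maps to a list because keys may repeat
params = query_parameters("https://www.google.com/search?q=cats")
print(params)
```

Values come back as lists because HTTP allows the same key to appear more than once in a query string.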
2017 01:42:26,640 --> 01:42:32,480 In fact, ultimately, the goal of creating web-based software 2018 01:42:32,480 --> 01:42:34,450 is to dynamically output stuff like this. 2019 01:42:34,450 --> 01:42:37,670 This, of course, is the simplest web page 2020 01:42:37,670 --> 01:42:39,670 we could perhaps implement in HTML. 2021 01:42:39,670 --> 01:42:41,057 But it's entirely hard coded. 2022 01:42:41,057 --> 01:42:43,390 Wouldn't it be nice if we could minimally, for instance, 2023 01:42:43,390 --> 01:42:45,930 add someone's name dynamically to that output 2024 01:42:45,930 --> 01:42:48,240 so that it actually interacts with them in some way? 2025 01:42:48,240 --> 01:42:52,600 And you can, of course, extrapolate from that kind of feature 2026 01:42:52,600 --> 01:42:55,000 to things like Gmail, where it's constantly, 2027 01:42:55,000 --> 01:42:57,342 dynamically interacting with your keyboard input 2028 01:42:57,342 --> 01:43:00,300 based on who you put in the To field, what you put in the subject line. 2029 01:43:00,300 --> 01:43:03,930 The website's going to do and behave differently in order to send that mail. 2030 01:43:03,930 --> 01:43:06,840 Facebook Messenger or Gchat or any number of tools 2031 01:43:06,840 --> 01:43:09,380 are constantly taking web-based input from users 2032 01:43:09,380 --> 01:43:11,160 and dynamically producing output. 2033 01:43:11,160 --> 01:43:13,320 But how do we get at that input and output? 2034 01:43:13,320 --> 01:43:18,430 Especially since at the end of the day, this is all HTTP boils down to. 2035 01:43:18,430 --> 01:43:20,500 Inside of those virtual envelopes, so to speak, 2036 01:43:20,500 --> 01:43:23,460 going between client and server or browser and server, 2037 01:43:23,460 --> 01:43:25,580 are requests like these from the client. 2038 01:43:25,580 --> 01:43:28,430 Get me the home page using this version of HTTP 2039 01:43:28,430 --> 01:43:30,280 specifically from this host name here.
2040 01:43:30,280 --> 01:43:33,280 And then maybe some other additional detail and maybe some parameters 2041 01:43:33,280 --> 01:43:35,096 in that URL string. 2042 01:43:35,096 --> 01:43:37,220 Meanwhile, the server is going to respond similarly 2043 01:43:37,220 --> 01:43:41,200 with something pretty simple-- a textual response, some HTTP headers like this 2044 01:43:41,200 --> 01:43:46,030 saying the content type is text HTML, if it indeed is, followed by the HTML 2045 01:43:46,030 --> 01:43:47,640 that the server has generated. 2046 01:43:47,640 --> 01:43:50,660 So it would seem that we need the ability, when 2047 01:43:50,660 --> 01:43:54,070 writing web-based software, to be able to, 2048 01:43:54,070 --> 01:43:57,440 one, dynamically generate HTML based on who the user is 2049 01:43:57,440 --> 01:43:59,470 or what he or she wants to see dynamically. 2050 01:43:59,470 --> 01:44:02,950 So we have the ability to write HTML, of course, per two weeks ago. 2051 01:44:02,950 --> 01:44:06,490 But we haven't yet printed it or generated it dynamically. 2052 01:44:06,490 --> 01:44:10,750 And we're also going to need a feature whereby, somehow or other, 2053 01:44:10,750 --> 01:44:15,140 any HTTP parameters coming to us from browsers 2054 01:44:15,140 --> 01:44:18,590 can be interpreted so that if a user is trying to add something 2055 01:44:18,590 --> 01:44:22,730 to their shopping cart, we can actually see what it is they've requested 2056 01:44:22,730 --> 01:44:24,460 to add to their shopping cart. 2057 01:44:24,460 --> 01:44:27,527 So it turns out we need just one mental model, if you will, 2058 01:44:27,527 --> 01:44:28,610 for this world of the web. 2059 01:44:28,610 --> 01:44:31,109 Back in the day, this mental model didn't necessarily exist.
2060 01:44:31,109 --> 01:44:35,480 But over time, we humans have come up with certain paradigms, or design 2061 01:44:35,480 --> 01:44:40,270 patterns, so to speak, that guide common implementations of web-based software 2062 01:44:40,270 --> 01:44:41,120 or mobile software. 2063 01:44:41,120 --> 01:44:44,200 Because the world realized over time that they adopted certain habits. 2064 01:44:44,200 --> 01:44:46,890 Or there are certain convenient ways to implement software. 2065 01:44:46,890 --> 01:44:49,620 And one such method, or one such design pattern, 2066 01:44:49,620 --> 01:44:53,490 is generally called MVC, Model View Controller. 2067 01:44:53,490 --> 01:44:56,430 And in this world, the controller is really 2068 01:44:56,430 --> 01:45:00,530 where the brains of your program or your website are-- all of the logic. 2069 01:45:00,530 --> 01:45:02,950 The logging in of users, logging out of users, 2070 01:45:02,950 --> 01:45:05,760 adding things to a shopping cart, removing things, checking out, 2071 01:45:05,760 --> 01:45:08,790 billing them, all of that sort of business logic so to speak. 2072 01:45:08,790 --> 01:45:11,610 And that exists in one or more files, typically, 2073 01:45:11,610 --> 01:45:14,720 on a web server that collectively are called the controller. 2074 01:45:14,720 --> 01:45:16,950 So it's not a technical term per se. 2075 01:45:16,950 --> 01:45:20,640 It's just a descriptor for what your code is ultimately doing. 2076 01:45:20,640 --> 01:45:25,270 View, meanwhile, the V in MVC, refers to the aesthetics of your site 2077 01:45:25,270 --> 01:45:29,150 typically-- the templates that you use for HTML, 2078 01:45:29,150 --> 01:45:32,710 or the CSS files that you use in order to style your website. 
2079 01:45:32,710 --> 01:45:36,260 In other words, while the thinking, all of the code logic 2080 01:45:36,260 --> 01:45:38,580 might be embedded in files called your controller, 2081 01:45:38,580 --> 01:45:41,140 all of the sort of fluffier but still important stuff, 2082 01:45:41,140 --> 01:45:44,120 the aesthetic stuff, might be in the view side of things. 2083 01:45:44,120 --> 01:45:48,780 And then lastly is the M in MVC, Model, which is where your data typically 2084 01:45:48,780 --> 01:45:49,690 comes from. 2085 01:45:49,690 --> 01:45:52,810 So we just did an example using a CSV file. 2086 01:45:52,810 --> 01:45:54,190 That's a model of some sort. 2087 01:45:54,190 --> 01:45:55,290 It's a super simple model. 2088 01:45:55,290 --> 01:45:59,440 But a model is just a general term describing where your data lives 2089 01:45:59,440 --> 01:46:00,396 and how you access it. 2090 01:46:00,396 --> 01:46:02,270 And before long, we're going to use a fancier 2091 01:46:02,270 --> 01:46:05,000 version of a model, an actual database server, 2092 01:46:05,000 --> 01:46:07,570 that we can query and insert into and delete from and edit, 2093 01:46:07,570 --> 01:46:09,770 and any number of other features as well. 2094 01:46:09,770 --> 01:46:16,450 But for now, today, let's just focus on the C and the V in MVC as follows. 2095 01:46:16,450 --> 01:46:19,200 I'm going to go ahead and open up CS50 IDE, where we have 2096 01:46:19,200 --> 01:46:22,240 a simple program here called serve.py. 2097 01:46:22,240 --> 01:46:25,870 And this is perhaps among the lowest level 2098 01:46:25,870 --> 01:46:29,550 ways we could go about implementing our own web server. 2099 01:46:29,550 --> 01:46:31,840 So again, CS50 IDE comes with its own web server. 2100 01:46:31,840 --> 01:46:33,000 And Google has its own web server. 2101 01:46:33,000 --> 01:46:34,450 And Facebook has its own web server.
2102 01:46:34,450 --> 01:46:37,574 And many of them are using, like us, open source software, freely available 2103 01:46:37,574 --> 01:46:39,130 software that's super popular. 2104 01:46:39,130 --> 01:46:42,570 But suppose we want to implement our own web 2105 01:46:42,570 --> 01:46:47,870 server that listens on TCP port 80 for HTTP requests 2106 01:46:47,870 --> 01:46:49,190 for those virtual envelopes. 2107 01:46:49,190 --> 01:46:50,920 In Python, we might do it as follows. 2108 01:46:50,920 --> 01:46:55,290 And a lot of the words on the screen might be new. 2109 01:46:55,290 --> 01:46:58,900 But the syntax is fundamentally the same as what we've been focusing on today. 2110 01:46:58,900 --> 01:47:04,040 So from some module that comes with Python called HTTP server imports 2111 01:47:04,040 --> 01:47:08,540 a class called base HTTP request handler and HTTP server. 2112 01:47:08,540 --> 01:47:13,594 So it turns out that Python comes with some built-in web server functionality. 2113 01:47:13,594 --> 01:47:15,510 It's not all that user friendly, as we'll see. 2114 01:47:15,510 --> 01:47:17,250 We have to do a lot of work to use it. 2115 01:47:17,250 --> 01:47:19,820 And the names are fairly verbose unto themselves. 2116 01:47:19,820 --> 01:47:22,750 But it comes with the ability, as a language, 2117 01:47:22,750 --> 01:47:27,020 to let you implement a web server, a piece of software 2118 01:47:27,020 --> 01:47:30,410 that when you run it just starts listening on the internet, 2119 01:47:30,410 --> 01:47:36,520 on your computer's IP address on TCP port 80 for incoming HTTP requests 2120 01:47:36,520 --> 01:47:38,800 and then responds to them as you see fit. 2121 01:47:38,800 --> 01:47:42,330 So we've defined a class here called HTTP server request 2122 01:47:42,330 --> 01:47:46,230 handler that descends from this parent class, so to speak. 2123 01:47:46,230 --> 01:47:48,020 But more on that in the days to come. 
2124 01:47:48,020 --> 01:47:51,990 On line 7 here, I'm defining a method conventionally called do_GET, 2125 01:47:51,990 --> 01:47:54,910 where GET is capitalized, thereby making super clear that this 2126 01:47:54,910 --> 01:47:57,260 is the function, the method, that's going 2127 01:47:57,260 --> 01:48:00,940 to be called if our server receives a request via HTTP 2128 01:48:00,940 --> 01:48:03,720 get, as opposed to post or something else. 2129 01:48:03,720 --> 01:48:06,900 Self is, again, the convention when implementing a class for methods 2130 01:48:06,900 --> 01:48:10,940 to take in a reference to themselves, so to speak. 2131 01:48:10,940 --> 01:48:14,800 A reference to the containing object we'll just call self. 2132 01:48:14,800 --> 01:48:18,590 Now inside here-- and you'd only know this from having read the documentation 2133 01:48:18,590 --> 01:48:20,950 or having done this before-- notice that we're going 2134 01:48:20,950 --> 01:48:22,610 to do a few things in this web server. 2135 01:48:22,610 --> 01:48:23,520 Super simple. 2136 01:48:23,520 --> 01:48:26,210 We're going to, no matter what, just send a response 2137 01:48:26,210 --> 01:48:27,670 code, a status code of 200. 2138 01:48:27,670 --> 01:48:29,570 Everything is always OK in this server. 2139 01:48:29,570 --> 01:48:30,410 It's not realistic. 2140 01:48:30,410 --> 01:48:32,760 Things could certainly go wrong, especially if the user asks us 2141 01:48:32,760 --> 01:48:34,310 for something that we don't have. 2142 01:48:34,310 --> 01:48:36,057 A 404 might be more appropriate. 2143 01:48:36,057 --> 01:48:38,640 But we're going to keep the example simple and no matter what, 2144 01:48:38,640 --> 01:48:40,620 send 200, OK. 2145 01:48:40,620 --> 01:48:44,600 Meanwhile, we're also going to send another HTTP header using this Python 2146 01:48:44,600 --> 01:48:46,800 call here of self.send_header.
2147 01:48:46,800 --> 01:48:50,380 And to be clear, these features-- send response, send header, 2148 01:48:50,380 --> 01:48:53,230 soon end headers-- are methods or functions 2149 01:48:53,230 --> 01:48:56,280 that come with Python's built-in web server 2150 01:48:56,280 --> 01:48:59,790 that we are simply extending the capabilities of at the moment. 2151 01:48:59,790 --> 01:49:01,520 What is the header that we want to send? 2152 01:49:01,520 --> 01:49:04,380 Content type colon text HTML. 2153 01:49:04,380 --> 01:49:08,130 So we're going to behave exactly like that canonical example I put up again 2154 01:49:08,130 --> 01:49:09,120 a moment ago. 2155 01:49:09,120 --> 01:49:12,040 Lastly, we're going to send a super simple message. 2156 01:49:12,040 --> 01:49:18,750 We're simply going to write essentially to the socket connection 2157 01:49:18,750 --> 01:49:21,870 that my server has with the browser, the internet connection that we have. 2158 01:49:21,870 --> 01:49:24,200 I'm going to write the following bytes. 2159 01:49:24,200 --> 01:49:25,630 Hello, world. 2160 01:49:25,630 --> 01:49:27,770 And I'm going to use an encoding called UTF-8, 2161 01:49:27,770 --> 01:49:30,229 which is a way of encoding Unicode, which, again, 2162 01:49:30,229 --> 01:49:32,270 is an encoding scheme that's a superset of ASCII, 2163 01:49:32,270 --> 01:49:35,170 as we discussed back in week 0. 2164 01:49:35,170 --> 01:49:35,850 That's it. 2165 01:49:35,850 --> 01:49:37,220 Return. 2166 01:49:37,220 --> 01:49:42,480 Now, this just defines a class, my own customization of a web server. 2167 01:49:42,480 --> 01:49:45,530 Python comes with a web server built in-- specifically, 2168 01:49:45,530 --> 01:49:49,850 that class called base HTTP request handler.
2169 01:49:49,850 --> 01:49:51,880 And I'm simply extending its capabilities 2170 01:49:51,880 --> 01:49:56,710 to specifically return hello world with content type text HTML 2171 01:49:56,710 --> 01:49:58,447 and with a status code of 200. 2172 01:49:58,447 --> 01:50:00,530 That wouldn't necessarily be the case by default-- 2173 01:50:00,530 --> 01:50:02,380 certainly not that generic message. 2174 01:50:02,380 --> 01:50:04,040 But I have to start this server. 2175 01:50:04,040 --> 01:50:10,447 And I could add a main function or implement this in any number of ways. 2176 01:50:10,447 --> 01:50:11,780 But I'm going to keep it simple. 2177 01:50:11,780 --> 01:50:14,530 At the bottom of the file, I'm going to configure the server here, 2178 01:50:14,530 --> 01:50:17,960 hard coding port 8080 to be the value of this variable. 2179 01:50:17,960 --> 01:50:20,090 A server address here is going to be a tuple. 2180 01:50:20,090 --> 01:50:22,589 And you would only know this, again, from the documentation. 2181 01:50:22,589 --> 01:50:26,120 This tuple, this comma separated list of values, 2182 01:50:26,120 --> 01:50:29,130 is going to be this weird-looking IP address, and then 2183 01:50:29,130 --> 01:50:30,889 that same value, 8080. 2184 01:50:30,889 --> 01:50:33,180 And this weird-looking IP address is a convention. 2185 01:50:33,180 --> 01:50:36,600 If you specify that you want a web server to listen, 2186 01:50:36,600 --> 01:50:41,310 to talk on IP address 0.0.0.0, that's generally 2187 01:50:41,310 --> 01:50:45,920 shorthand notation for saying, listen on all possible network interfaces that 2188 01:50:45,920 --> 01:50:50,060 are built into my computer, whether it's CS50 IDE, or an actual server, 2189 01:50:50,060 --> 01:50:51,670 or a Mac, or a PC.
2190 01:50:51,670 --> 01:50:53,530 This is sort of like the wildcard saying, 2191 01:50:53,530 --> 01:50:56,130 just listen on any one of your ethernet cables 2192 01:50:56,130 --> 01:51:00,900 or Wi-Fi connections for incoming requests, but specifically on this port 2193 01:51:00,900 --> 01:51:02,120 8080. 2194 01:51:02,120 --> 01:51:05,950 This last line here essentially instantiates an HTTP server, 2195 01:51:05,950 --> 01:51:10,660 passing into it our request handler, which is that customization of behavior 2196 01:51:10,660 --> 01:51:11,850 that I described earlier. 2197 01:51:11,850 --> 01:51:13,940 And then lastly, nicely enough, there's a method, 2198 01:51:13,940 --> 01:51:17,550 a function built into this Python server called serve forever, 2199 01:51:17,550 --> 01:51:20,390 which just turns the server on and never turns it off 2200 01:51:20,390 --> 01:51:23,110 unless I forcibly kill it with, say, Control-C. 2201 01:51:23,110 --> 01:51:26,320 So let's go ahead and actually run this. 2202 01:51:26,320 --> 01:51:33,150 I'm going to go ahead into the folder containing serve.py 2203 01:51:33,150 --> 01:51:37,670 and run Python serve.py, Enter. 2204 01:51:37,670 --> 01:51:40,210 And nothing seems to happen just yet. 2205 01:51:40,210 --> 01:51:44,020 But I'm going to go ahead and open up another tab in CS50 IDE. 2206 01:51:44,020 --> 01:51:53,270 And I'm going to go to http://127.0.0.1:8080. 2207 01:51:53,270 --> 01:51:54,724 So why this IP address? 2208 01:51:54,724 --> 01:51:57,390 Even though this is a little inconsistent with what I just said, 2209 01:51:57,390 --> 01:52:01,310 technically, 0.0.0.0 is not your actual IP address. 2210 01:52:01,310 --> 01:52:03,460 It's, again, just kind of a wildcard string 2211 01:52:03,460 --> 01:52:06,340 that represents all of your possible network interfaces.
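Pieced together from the description above, serve.py might look something like this sketch. The class name spelling and the exact header capitalization are assumptions, and the startup code is wrapped in a run function here (the lecture puts it at the bottom of the file instead) so that the class can be imported without the server starting.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HTTPServerRequestHandler(BaseHTTPRequestHandler):
    """Responds to every GET request with the same hard-coded message."""

    def do_GET(self):
        # Always report success, whatever was requested (not realistic)
        self.send_response(200)

        # Promise the browser that HTML follows
        self.send_header("Content-type", "text/html")
        self.end_headers()

        # Write the message as UTF-8 bytes to the connection with the browser
        self.wfile.write(bytes("hello, world", "utf8"))


def run(port=8080):
    # 0.0.0.0 means "listen on all network interfaces"
    server_address = ("0.0.0.0", port)
    httpd = HTTPServer(server_address, HTTPServerRequestHandler)

    # Runs until interrupted, for example with Control-C
    httpd.serve_forever()
```

Calling run() at the bottom of the file and then executing python serve.py would start the server listening on port 8080.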
2212 01:52:06,340 --> 01:52:08,750 Every computer on the internet, generally, 2213 01:52:08,750 --> 01:52:12,440 has a local host address-- not its public IP, 2214 01:52:12,440 --> 01:52:14,990 not even a private IP that's in your own home 2215 01:52:14,990 --> 01:52:24,980 network behind your own firewall-- but 127.0.0.1 2216 01:52:24,980 --> 01:52:28,560 represents your own local host address, an IP address 2217 01:52:28,560 --> 01:52:30,540 that by default every computer on the internet 2218 01:52:30,540 --> 01:52:33,470 has insofar as it refers to itself. 2219 01:52:33,470 --> 01:52:37,300 So we all have generally, in our own Macs and PCs, or CS50 IDEs, 2220 01:52:37,300 --> 01:52:41,170 access to this IP address, which just refers to myself. 2221 01:52:41,170 --> 01:52:43,350 And port 8080 after the colon. 2222 01:52:43,350 --> 01:52:45,950 Normally, using a browser, you don't specify the port number 2223 01:52:45,950 --> 01:52:48,670 by saying colon 80 or colon 443. 2224 01:52:48,670 --> 01:52:51,570 But in this case, because it's a nonstandard port, what 2225 01:52:51,570 --> 01:52:57,180 I want to do with Google Chrome here is talk to my computer on this local host 2226 01:52:57,180 --> 01:52:59,000 address on that port. 2227 01:52:59,000 --> 01:53:01,954 Now, if you play along at home using CS50 IDE on the web, 2228 01:53:01,954 --> 01:53:03,620 your address will actually be different. 2229 01:53:03,620 --> 01:53:09,010 I simply happen to be using a local version of CS50 IDE on my own Mac 2230 01:53:09,010 --> 01:53:12,076 here so that I don't have to combat with any Wi-Fi issues. 2231 01:53:12,076 --> 01:53:13,950 But the idea is going to be exactly the same. 2232 01:53:13,950 --> 01:53:17,910 Whatever your workspace's IP address is or host name, 2233 01:53:17,910 --> 01:53:21,430 the English version of it, colon 8080, is what you will type. 2234 01:53:21,430 --> 01:53:22,434 Let me hit Enter.
2235 01:53:22,434 --> 01:53:23,850 But it's not all that interesting. 2236 01:53:23,850 --> 01:53:27,600 Indeed, if I view the page source, as we have in the past, this is not HTML. 2237 01:53:27,600 --> 01:53:32,550 I've been super lazy right now, simply outputting a promise via that header 2238 01:53:32,550 --> 01:53:34,940 that I'm outputting a content type of text HTML. 2239 01:53:34,940 --> 01:53:36,480 But this isn't really HTML. 2240 01:53:36,480 --> 01:53:37,480 This is just text. 2241 01:53:37,480 --> 01:53:39,990 And so this really isn't a full-fledged web server. 2242 01:53:39,990 --> 01:53:45,470 It's certainly not dynamic in that I've literally hard coded hello world. 2243 01:53:45,470 --> 01:53:50,490 So let's do something a little better, a little more pleasurable to write. 2244 01:53:50,490 --> 01:53:55,030 And for that, we're actually going to need something called a framework. 2245 01:53:55,030 --> 01:53:58,700 And so it turns out that writing code like this-- totally possible, 2246 01:53:58,700 --> 01:54:00,109 and folks did it for some time. 2247 01:54:00,109 --> 01:54:02,150 But eventually did people realize, you know what? 2248 01:54:02,150 --> 01:54:05,000 We're doing the same kinds of lines of code again and again. 2249 01:54:05,000 --> 01:54:07,920 This isn't particularly fun to implement the website or the product 2250 01:54:07,920 --> 01:54:08,880 that I'm working on. 2251 01:54:08,880 --> 01:54:12,380 Let me actually start to borrow ideas from past projects 2252 01:54:12,380 --> 01:54:13,700 into current projects. 2253 01:54:13,700 --> 01:54:15,920 And thus were born things called frameworks, 2254 01:54:15,920 --> 01:54:18,840 collections of code written by other people that are often 2255 01:54:18,840 --> 01:54:22,840 free or open source that you can then use in your own projects 2256 01:54:22,840 --> 01:54:23,982 to make your life easier. 2257 01:54:23,982 --> 01:54:25,190 And indeed, this is thematic. 
2258 01:54:25,190 --> 01:54:28,660 Especially as we get farther and farther from C and lower level 2259 01:54:28,660 --> 01:54:32,540 languages toward Python, and eventually JavaScript and beyond, 2260 01:54:32,540 --> 01:54:34,420 you'll find that it's thematic for people 2261 01:54:34,420 --> 01:54:36,420 to sort of stand again on each other's shoulders 2262 01:54:36,420 --> 01:54:42,850 and use past problems solved to solve future problems more quickly. 2263 01:54:42,850 --> 01:54:44,220 So what do I mean by that? 2264 01:54:44,220 --> 01:54:46,790 Well, one of the very first things I did way back in the day 2265 01:54:46,790 --> 01:54:51,590 when learning web programming myself, after having taken CS50 and CS51, is I 2266 01:54:51,590 --> 01:54:53,170 taught myself a language called Perl. 2267 01:54:53,170 --> 01:54:56,110 It's not really in vogue these days, though still around and still 2268 01:54:56,110 --> 01:54:56,990 under development. 2269 01:54:56,990 --> 01:55:00,300 But it's similar in spirit to what we're talking about today in Python. 2270 01:55:00,300 --> 01:55:02,920 And I happened to use that language back in the day 2271 01:55:02,920 --> 01:55:06,310 to implement a website, the first ever website for the freshman 2272 01:55:06,310 --> 01:55:07,539 intramural sports program. 2273 01:55:07,539 --> 01:55:09,330 So all the freshmen or first years who want 2274 01:55:09,330 --> 01:55:12,710 to participate in sports just for fun, back in my day, 2275 01:55:12,710 --> 01:55:16,800 we would register for sports by walking across Harvard Yard, uphill 2276 01:55:16,800 --> 01:55:19,830 both ways in the snow, and then slide a piece of paper 2277 01:55:19,830 --> 01:55:22,210 under one of the proctor's or RA's doors saying, 2278 01:55:22,210 --> 01:55:25,380 I want to register for volleyball, or soccer, or whatever it was. 2279 01:55:25,380 --> 01:55:29,390 So this was an opportunity ripe for disruption with computers. 
2280 01:55:29,390 --> 01:55:31,960 So I taught myself web programming back in the day 2281 01:55:31,960 --> 01:55:34,120 and volunteered to make this website for the group 2282 01:55:34,120 --> 01:55:36,280 so that students like myself could just-- well, 2283 01:55:36,280 --> 01:55:39,950 maybe students not like myself could register for sports online. 2284 01:55:39,950 --> 01:55:41,424 And so what did I actually do? 2285 01:55:41,424 --> 01:55:43,090 Well, we won't look at the Perl version. 2286 01:55:43,090 --> 01:55:48,720 We'll look instead at a Python version using a very popular framework, 2287 01:55:48,720 --> 01:55:51,170 freely available code called Flask. 2288 01:55:51,170 --> 01:55:53,500 So Flask is technically a micro framework 2289 01:55:53,500 --> 01:55:56,050 in that it doesn't have a huge number of features. 2290 01:55:56,050 --> 01:55:58,350 But it's got relatively few features that people really 2291 01:55:58,350 --> 01:56:01,750 seem to like lately that help you get work done faster. 2292 01:56:01,750 --> 01:56:02,865 And by that I mean this. 2293 01:56:02,865 --> 01:56:06,200 This is how I might implement the simplest of websites for the freshman 2294 01:56:06,200 --> 01:56:07,730 intramural sports program. 2295 01:56:07,730 --> 01:56:10,120 Now, admittedly, it's lacking in quite a few features. 2296 01:56:10,120 --> 01:56:11,540 But let's see how it works. 2297 01:56:11,540 --> 01:56:14,740 And indeed, with some of our future web-based projects in CS50, 2298 01:56:14,740 --> 01:56:18,020 will we build upon Flask and borrow these same ideas.
2299 01:56:18,020 --> 01:56:21,150 So you'll notice that from Flask, am I importing 2300 01:56:21,150 --> 01:56:23,790 a whole bunch of potential features, none 2301 01:56:23,790 --> 01:56:26,580 of which I want to implement myself, all of which, pretty much, 2302 01:56:26,580 --> 01:56:31,650 I would have had to implement myself if I used that base HTTP web server that 2303 01:56:31,650 --> 01:56:32,930 comes with Python itself. 2304 01:56:32,930 --> 01:56:36,680 So Flask is built on top of that built-in functionality. 2305 01:56:36,680 --> 01:56:37,740 How does it work? 2306 01:56:37,740 --> 01:56:42,770 Once I've imported this module's components and classes, 2307 01:56:42,770 --> 01:56:44,610 I'm going to go ahead and instantiate, so 2308 01:56:44,610 --> 01:56:48,920 to speak, an application of type Flask, passing in the name of this file. 2309 01:56:48,920 --> 01:56:51,200 So this is just a special symbol, __name__, 2310 01:56:51,200 --> 01:56:55,160 that we've seen before in the context of main that just refers to this file. 2311 01:56:55,160 --> 01:57:00,310 So this says, hey Python, give me a Flask application based on this file. 2312 01:57:00,310 --> 01:57:03,190 So now notice on line 5, a slightly new syntax, 2313 01:57:03,190 --> 01:57:04,920 something we'll call a decorator. 2314 01:57:04,920 --> 01:57:10,570 And it's a one liner in this case that simply provides Python with a hint 2315 01:57:10,570 --> 01:57:15,470 that the following method should be called anytime the user 2316 01:57:15,470 --> 01:57:17,460 requests a particular route. 2317 01:57:17,460 --> 01:57:22,850 A route, typically, is something like /foo or /bar or /search or the like. 2318 01:57:22,850 --> 01:57:26,130 So a route is like the path that you are requesting on the web server, 2319 01:57:26,130 --> 01:57:28,150 slash generally being the default.
2320 01:57:28,150 --> 01:57:32,250 So this is saying to Python, anytime the user requests 2321 01:57:32,250 --> 01:57:36,069 slash, the default home page, go ahead and call this index method. 2322 01:57:36,069 --> 01:57:37,860 Technically, we could have called it anything. 2323 01:57:37,860 --> 01:57:39,580 But this is a good convention. 2324 01:57:39,580 --> 01:57:41,050 And return what? 2325 01:57:41,050 --> 01:57:43,190 The rendering of this template. 2326 01:57:43,190 --> 01:57:46,140 In other words, don't just return a few bytes, hello world. 2327 01:57:46,140 --> 01:57:48,180 Return this whole HTML file. 2328 01:57:48,180 --> 01:57:51,680 But it's a template in the sense that we can plug in values, as we'll soon see. 2329 01:57:51,680 --> 01:57:56,620 Meanwhile, hey Python, when you see a request for /register, 2330 01:57:56,620 --> 01:58:00,510 not using get by default, but by using post, 2331 01:58:00,510 --> 01:58:04,250 which might happen in a form submission, go ahead and call this method called 2332 01:58:04,250 --> 01:58:04,940 Register. 2333 01:58:04,940 --> 01:58:07,360 And just as a sanity check, let's do this. 2334 01:58:07,360 --> 01:58:13,010 If the request that we have received has a form in it 2335 01:58:13,010 --> 01:58:18,630 that has a Name input in it that's blank, equals quote, unquote, 2336 01:58:18,630 --> 01:58:25,820 or the request we've received has a form whose Dorm field is blank, 2337 01:58:25,820 --> 01:58:30,280 then return this template instead, failure.html. 2338 01:58:30,280 --> 01:58:32,300 Otherwise, return success. 2339 01:58:32,300 --> 01:58:36,800 So in other words, if the user has submitted a form to register for sports 2340 01:58:36,800 --> 01:58:40,090 and he or she has not given us their name or their dorm, 2341 01:58:40,090 --> 01:58:41,790 let's render this failure message.
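The sanity check just described can be sketched in plain Python, without Flask at all. In this sketch, a dict stands in for the submitted form, and the function returns the name of the template it would render rather than rendering anything. The helper name choose_template and the use of .get with a default of "" are assumptions made for the illustration, not the lecture's actual code.

```python
def choose_template(form):
    """Return the template name to render for a registration attempt.

    Assumption: `form` is a plain dict standing in for the submitted form,
    with string values under the keys "name" and "dorm".
    """
    # A missing field is treated the same as a blank one in this sketch.
    if form.get("name", "") == "" or form.get("dorm", "") == "":
        return "failure.html"
    return "success.html"


print(choose_template({"name": "", "dorm": ""}))
print(choose_template({"name": "David", "dorm": ""}))
print(choose_template({"name": "David", "dorm": "Matthews"}))
```

Only the third call, with both a name and a dorm, gets past the check.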
2342 01:58:41,790 --> 01:58:43,840 Don't let them register because we don't even know who they are 2343 01:58:43,840 --> 01:58:44,810 or where they're from. 2344 01:58:44,810 --> 01:58:47,780 So we're going to display failure.html. 2345 01:58:47,780 --> 01:58:51,130 Otherwise, by default, we'll display success.html. 2346 01:58:51,130 --> 01:58:52,680 So let's see what this looks like. 2347 01:58:52,680 --> 01:58:57,030 I'm going to go ahead and hit Control-C to get out of the old web server. 2348 01:58:57,030 --> 01:58:59,540 I'm going to go into this Frosh IMs directory. 2349 01:58:59,540 --> 01:59:01,750 And this time, instead of running Python, 2350 01:59:01,750 --> 01:59:04,697 I'm instead going to run flask run. 2351 01:59:04,697 --> 01:59:06,780 And then I'm going to be just super specific here. 2352 01:59:06,780 --> 01:59:10,940 I'm going to say the host I want to use is every possible interface. 2353 01:59:10,940 --> 01:59:12,732 And then the port I'm going to use is 8080, 2354 01:59:12,732 --> 01:59:14,773 though you can configure these in different ways. 2355 01:59:14,773 --> 01:59:15,920 And I'm going to hit Enter. 2356 01:59:15,920 --> 01:59:17,836 A whole bunch of stuff scrolled on the screen. 2357 01:59:17,836 --> 01:59:21,500 But the essence of it is that it's serving a Flask app called application. 2358 01:59:21,500 --> 01:59:25,010 Debug mode in CS50 IDE is turned on by default at the moment. 2359 01:59:25,010 --> 01:59:26,910 And now we're ready to go. 2360 01:59:26,910 --> 01:59:31,210 If I now go back to my web page and reload, I'm still at the same URL. 2361 01:59:31,210 --> 01:59:33,850 But a different web server is now responding to my requests. 2362 01:59:33,850 --> 01:59:40,186 And this is sort of in the spirit of 1996, '97, whenever I implemented this. 2363 01:59:40,186 --> 01:59:41,560 This is what the web looked like. 2364 01:59:41,560 --> 01:59:43,580 And in fact, this might be a little worse.
2365 01:59:43,580 --> 01:59:47,102 So now, suppose I'm kind of in a rush. 2366 01:59:47,102 --> 01:59:48,560 I just want to register for sports. 2367 01:59:48,560 --> 01:59:50,530 I don't think to provide my name or dorm. 2368 01:59:50,530 --> 01:59:52,410 Let me hit Register. 2369 01:59:52,410 --> 01:59:53,190 And I'm yelled at. 2370 01:59:53,190 --> 01:59:54,870 You must provide your name and dorm. 2371 01:59:54,870 --> 01:59:55,830 And notice, where am I? 2372 01:59:55,830 --> 01:59:59,760 If I look at the URL, I'm at that same IP and port 2373 01:59:59,760 --> 02:00:03,290 number, which will vary based on where you are in the world and what service 2374 02:00:03,290 --> 02:00:05,380 you're using, like CS50 IDE. 2375 02:00:05,380 --> 02:00:07,971 But I'm at /register, that route. 2376 02:00:07,971 --> 02:00:08,470 All right. 2377 02:00:08,470 --> 02:00:09,502 Let me go back. 2378 02:00:09,502 --> 02:00:11,460 Let me go ahead and give them my name at least. 2379 02:00:11,460 --> 02:00:11,960 David. 2380 02:00:11,960 --> 02:00:12,920 Register. 2381 02:00:12,920 --> 02:00:14,020 And voila. 2382 02:00:14,020 --> 02:00:17,620 I am being yelled at again because even though I provided my name, 2383 02:00:17,620 --> 02:00:19,130 I've not provided my dorm still. 2384 02:00:19,130 --> 02:00:22,780 So this seems to be a fairly lightweight error message. 2385 02:00:22,780 --> 02:00:23,740 But let me cooperate. 2386 02:00:23,740 --> 02:00:26,590 Let me provide both David and say Matthews, and click Register. 2387 02:00:26,590 --> 02:00:27,410 Aha. 2388 02:00:27,410 --> 02:00:28,630 You are registered. 2389 02:00:28,630 --> 02:00:29,720 Well, not really. 2390 02:00:29,720 --> 02:00:30,670 So why not really? 2391 02:00:30,670 --> 02:00:34,000 Well, that's because this particular website has no database yet. 2392 02:00:34,000 --> 02:00:35,520 There's no place to store the data. 2393 02:00:35,520 --> 02:00:36,790 There's not even a CSV file. 
2394 02:00:36,790 --> 02:00:38,081 There's no email functionality. 2395 02:00:38,081 --> 02:00:40,290 All it's being used for today is to demonstrate 2396 02:00:40,290 --> 02:00:44,160 how we can check for the presence of form submissions 2397 02:00:44,160 --> 02:00:47,590 properly to make sure the user is actually providing those values. 2398 02:00:47,590 --> 02:00:52,290 So if I actually go back into CS50 IDE, let's go into this Frosh IMs directory, 2399 02:00:52,290 --> 02:00:54,860 inside of which is a Templates directory, 2400 02:00:54,860 --> 02:00:58,550 and take a look at Failure, the first thing that we saw. 2401 02:00:58,550 --> 02:01:00,820 Now, this admittedly looks a bit cryptic. 2402 02:01:00,820 --> 02:01:05,060 But for a moment, notice that it's extending something called layout.html. 2403 02:01:05,060 --> 02:01:08,260 And actually, it looks like there's this special syntax here. 2404 02:01:08,260 --> 02:01:12,100 So it turns out that Flask supports a templating language. 2405 02:01:12,100 --> 02:01:12,990 It's not HTML. 2406 02:01:12,990 --> 02:01:14,120 It's not CSS. 2407 02:01:14,120 --> 02:01:15,210 It's not even Python. 2408 02:01:15,210 --> 02:01:17,560 It's sort of a mini language unto itself, 2409 02:01:17,560 --> 02:01:22,200 a templating language that gives you simple instructions and features that 2410 02:01:22,200 --> 02:01:25,720 allow you to dynamically plug in values to your HTML. 2411 02:01:25,720 --> 02:01:27,750 So you don't have to hard code everything. 2412 02:01:27,750 --> 02:01:31,390 And so this is saying, hey Flask, go and get 2413 02:01:31,390 --> 02:01:34,900 the template file called layout.html, and then 2414 02:01:34,900 --> 02:01:38,470 plug in this title, and then this body. 2415 02:01:38,470 --> 02:01:40,010 So block title here. 2416 02:01:40,010 --> 02:01:43,246 Notice the funky syntax, curly brace with percent sign, 2417 02:01:43,246 --> 02:01:45,350 percent sign with curly brace.
2418 02:01:45,350 --> 02:01:50,070 This is literally saying, hey Flask, the title of this page shall be Failure. 2419 02:01:50,070 --> 02:01:53,590 And the body of this page, as per this block here, 2420 02:01:53,590 --> 02:01:56,460 shall be, quote, unquote, "You must provide your name and dorm." 2421 02:01:56,460 --> 02:02:02,480 Meanwhile, if we open up success.html, it's similar in spirit. 2422 02:02:02,480 --> 02:02:05,299 But notice it has a title of Success. 2423 02:02:05,299 --> 02:02:07,090 And it has a body of, "You are registered." 2424 02:02:07,090 --> 02:02:07,850 Well, not really. 2425 02:02:07,850 --> 02:02:09,920 So nothing interesting is happening. 2426 02:02:09,920 --> 02:02:13,830 But this body and this title will be plugged into this layout, 2427 02:02:13,830 --> 02:02:14,969 this other template file. 2428 02:02:14,969 --> 02:02:16,135 So let's now look at layout. 2429 02:02:16,135 --> 02:02:19,620 2430 02:02:19,620 --> 02:02:21,420 This looks more familiar. 2431 02:02:21,420 --> 02:02:25,850 So layout.html is sort of the parent of these children-- 2432 02:02:25,850 --> 02:02:28,360 success.html and failure.html. 2433 02:02:28,360 --> 02:02:30,840 And this is because I realized, when designing 2434 02:02:30,840 --> 02:02:34,420 the website for the first time, I don't want to have to copy and paste 2435 02:02:34,420 --> 02:02:36,080 a whole lot of similar HTML. 2436 02:02:36,080 --> 02:02:39,950 I don't want to have HTML in every file, head in every file, 2437 02:02:39,950 --> 02:02:42,420 title in every file, body in every file. 2438 02:02:42,420 --> 02:02:45,730 There's a lot of redundancy, a lot of structure to these web pages. 
2439 02:02:45,730 --> 02:02:47,590 It would be nice if I could kind of come up 2440 02:02:47,590 --> 02:02:51,150 with a general layout, the aesthetics for my overarching website, 2441 02:02:51,150 --> 02:02:55,900 and then, on a per page basis, just plug in a custom title, 2442 02:02:55,900 --> 02:02:58,470 just plug in a custom body. 2443 02:02:58,470 --> 02:03:00,490 And that's all this funky syntax is doing. 2444 02:03:00,490 --> 02:03:04,890 It's saying, hey Flask, put the body of this page here. 2445 02:03:04,890 --> 02:03:07,270 And hey Flask, put the title of this page here. 2446 02:03:07,270 --> 02:03:09,510 So again, this has nothing to do with Python per se, 2447 02:03:09,510 --> 02:03:13,140 nothing to do with HTML or CSS per se, except that it 2448 02:03:13,140 --> 02:03:17,390 is a templating language, another language used for really 2449 02:03:17,390 --> 02:03:18,880 plugging in values in this way. 2450 02:03:18,880 --> 02:03:22,300 And it's conventionally used in exactly this context with Python, 2451 02:03:22,300 --> 02:03:25,150 with CSS, and with HTML. 2452 02:03:25,150 --> 02:03:28,100 It helps us keep things a little cleaner and avoid 2453 02:03:28,100 --> 02:03:30,030 a whole lot of copy, paste. 2454 02:03:30,030 --> 02:03:32,450 The form, meanwhile, that we originally saw 2455 02:03:32,450 --> 02:03:34,940 is perhaps even more familiar except for the block up top. 2456 02:03:34,940 --> 02:03:37,100 It too extends layout.html. 2457 02:03:37,100 --> 02:03:39,180 It has the title of Frosh IMs. 2458 02:03:39,180 --> 02:03:41,090 And then it's pretty much just got an H1 tag, 2459 02:03:41,090 --> 02:03:43,089 which you might recall from a couple weeks back. 2460 02:03:43,089 --> 02:03:44,080 It's got a form tag. 2461 02:03:44,080 --> 02:03:47,940 It's got some BR for line breaks, a select element, and more. 2462 02:03:47,940 --> 02:03:50,740 And the only thing that's a little interesting here is notice this.
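The idea of one shared layout with a per-page title and body can be sketched using nothing but Python's standard library. To be clear, string.Template is a stand-in here. It is not the templating language that Flask uses, and it has no notion of blocks or of one template extending another. LAYOUT and render_page are hypothetical names for the illustration.

```python
from string import Template

# Assumption: string.Template is only a standard-library stand-in for the
# templating language described above; LAYOUT and render_page are hypothetical.
LAYOUT = Template(
    "<!DOCTYPE html>\n"
    "<html>\n"
    "  <head><title>$title</title></head>\n"
    "  <body>$body</body>\n"
    "</html>"
)


def render_page(title, body):
    """Plug a page-specific title and body into the one shared layout."""
    return LAYOUT.substitute(title=title, body=body)


failure = render_page("Failure", "You must provide your name and dorm.")
success = render_page("Success", "You are registered.")

print(failure)
print(success)
```

The html, head, title, and body tags are written exactly once, in the layout, and each page supplies only the two pieces that differ.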
2463 02:03:50,740 --> 02:03:56,550 Whereas two weeks ago, I hard coded an action value like google.com/search 2464 02:03:56,550 --> 02:04:02,580 or to my own file, this is a nice abstraction, if you will, 2465 02:04:02,580 --> 02:04:08,990 whereby I can say in my template, give me the URL for my register route. 2466 02:04:08,990 --> 02:04:12,270 Now realistically, it's probably just /register because that's what we hard 2467 02:04:12,270 --> 02:04:13,440 coded into the file. 2468 02:04:13,440 --> 02:04:15,550 But it would be nice to not have to hard code 2469 02:04:15,550 --> 02:04:17,580 things that could change over time. 2470 02:04:17,580 --> 02:04:21,640 And so this url_for is a way of dynamically asking the templating 2471 02:04:21,640 --> 02:04:24,760 language, you go figure out what this route is called 2472 02:04:24,760 --> 02:04:29,430 and plug in the appropriate relative URL here. 2473 02:04:29,430 --> 02:04:32,520 So that's all Frosh IMs does on the front end. 2474 02:04:32,520 --> 02:04:33,670 What's the back end? 2475 02:04:33,670 --> 02:04:36,750 Well for that, we need to look at application.py. 2476 02:04:36,750 --> 02:04:38,680 Again, this is where we started. 2477 02:04:38,680 --> 02:04:44,940 When I submit via post that super simple HTML form to /register, that is, 2478 02:04:44,940 --> 02:04:47,480 this route, first, this if condition runs. 2479 02:04:47,480 --> 02:04:52,620 If the request form's Name field is blank or its Dorm is blank, 2480 02:04:52,620 --> 02:04:53,540 return failure. 2481 02:04:53,540 --> 02:04:56,370 Else, return success. 2482 02:04:56,370 --> 02:04:58,200 But this isn't especially dynamic. 2483 02:04:58,200 --> 02:05:01,730 It would be nice, if I keep saying "dynamic," that things actually 2484 02:05:01,730 --> 02:05:02,620 are dynamic. 2485 02:05:02,620 --> 02:05:04,970 So let's look at one final example.
2486 02:05:04,970 --> 02:05:09,000 So now let's rerun Flask in this Store subdirectory, still on the same IP, 2487 02:05:09,000 --> 02:05:13,580 still on the same port, but now serving a different application. 2488 02:05:13,580 --> 02:05:15,480 Indeed, when we now reload the browser, we 2489 02:05:15,480 --> 02:05:21,180 see not the Frosh IMs site, but a super, super simple e-commerce site, 2490 02:05:21,180 --> 02:05:25,560 a web storefront that allows us to buy apparently foos and bars and bazes. 2491 02:05:25,560 --> 02:05:27,370 This is just an HTML form. 2492 02:05:27,370 --> 02:05:28,614 These are text fields here. 2493 02:05:28,614 --> 02:05:30,030 These are just labels on the side. 2494 02:05:30,030 --> 02:05:31,222 And this is a Submit button. 2495 02:05:31,222 --> 02:05:32,930 And the shopping cart ultimately is going 2496 02:05:32,930 --> 02:05:36,410 to show me how many of these things I've added to my shopping cart already. 2497 02:05:36,410 --> 02:05:41,110 Indeed, let's try adding one foo, two bars, and three bazes, 2498 02:05:41,110 --> 02:05:42,470 and click Purchase. 2499 02:05:42,470 --> 02:05:44,950 I'm redirected automatically to my cart. 2500 02:05:44,950 --> 02:05:48,330 And I just see a reminder that I've got one foo, two bars, three bazes. 2501 02:05:48,330 --> 02:05:50,590 And let's just confirm that this is indeed the case. 2502 02:05:50,590 --> 02:05:52,440 Let me go ahead and continue shopping. 2503 02:05:52,440 --> 02:05:55,190 And let me go ahead and buy 10 more foos. 2504 02:05:55,190 --> 02:05:56,520 Click Purchase. 2505 02:05:56,520 --> 02:05:58,870 And indeed, it's incremented this properly. 2506 02:05:58,870 --> 02:06:02,030 And indeed, while you can't quite see what I'm doing on my keyboard, 2507 02:06:02,030 --> 02:06:03,770 I'm hitting Reload now. 2508 02:06:03,770 --> 02:06:05,620 And nothing's changing.
2509 02:06:05,620 --> 02:06:08,300 Indeed, if I close the window and then reopen it, 2510 02:06:08,300 --> 02:06:10,620 you'll see that it retains state. 2511 02:06:10,620 --> 02:06:13,830 In other words, no matter whether I close or open my browser, 2512 02:06:13,830 --> 02:06:16,720 it seems to be remembering that I've got 11 foos, two bars, and three 2513 02:06:16,720 --> 02:06:19,480 bazes in my shopping cart, so to speak. 2514 02:06:19,480 --> 02:06:22,230 So how did we implement this functionality? 2515 02:06:22,230 --> 02:06:26,110 Well first, notice that the store itself is super simple. 2516 02:06:26,110 --> 02:06:30,370 It's just some HTML and a template that has a whole bunch of input fields 2517 02:06:30,370 --> 02:06:33,560 textually for foos, for bars, and for bazes, 2518 02:06:33,560 --> 02:06:35,730 as well as that Submit button called Purchase. 2519 02:06:35,730 --> 02:06:38,460 And it, like before, extends layout.html, 2520 02:06:38,460 --> 02:06:41,020 which this application has its own copy of. 2521 02:06:41,020 --> 02:06:43,201 The cart, meanwhile, is actually pretty simple. 2522 02:06:43,201 --> 02:06:45,450 And notice what's nice about this templating language. 2523 02:06:45,450 --> 02:06:46,695 It lets us do this. 2524 02:06:46,695 --> 02:06:51,440 This is a file called cart.html that also descends from that layout. 2525 02:06:51,440 --> 02:06:54,110 And we have this H1 tag here that just says Cart. 2526 02:06:54,110 --> 02:06:58,160 And now notice this template language looks quite like Python here. 2527 02:06:58,160 --> 02:07:00,030 It has for item in cart. 2528 02:07:00,030 --> 02:07:03,500 And it allows me using this curly bracket notation, two of them 2529 02:07:03,500 --> 02:07:07,740 on the left, two of them on the right, to plug in 2530 02:07:07,740 --> 02:07:10,680 the Quantity field of this Item object.
2531 02:07:10,680 --> 02:07:13,170 So it seems that Cart is some kind of list, 2532 02:07:13,170 --> 02:07:17,680 and Item are the elements, the objects inside of that list. 2533 02:07:17,680 --> 02:07:21,290 And this is giving me the Quantity field inside of this structure 2534 02:07:21,290 --> 02:07:23,240 and the Name field inside of this structure. 2535 02:07:23,240 --> 02:07:29,490 And that's why I see 11 foo and two bar and three baz in my templating language 2536 02:07:29,490 --> 02:07:29,990 here. 2537 02:07:29,990 --> 02:07:33,320 I'm just iterating over what apparently is my shopping cart. 2538 02:07:33,320 --> 02:07:36,400 Now this invites the question, what is this shopping cart? 2539 02:07:36,400 --> 02:07:41,010 And for that last detail, we need to look at application.py. 2540 02:07:41,010 --> 02:07:43,790 So as before, we instantiate a Flask application up here. 2541 02:07:43,790 --> 02:07:48,250 But we also configure it with a property called secret key 2542 02:07:48,250 --> 02:07:49,760 per the documentation for Flask. 2543 02:07:49,760 --> 02:07:51,969 You probably shouldn't use a value of shh, 2544 02:07:51,969 --> 02:07:53,510 but for now we'll keep things simple. 2545 02:07:53,510 --> 02:07:56,910 But that key, long story short, is used to ensure with higher probability 2546 02:07:56,910 --> 02:08:00,300 the security of the sessions of the shopping cart that we're using. 2547 02:08:00,300 --> 02:08:02,300 This line here again declares a route for slash, 2548 02:08:02,300 --> 02:08:05,680 specifying we only want to accept get and post. 2549 02:08:05,680 --> 02:08:08,724 Turns out there's other verbs like put and patch and others, 2550 02:08:08,724 --> 02:08:10,390 but we're going to ignore those for now. 2551 02:08:10,390 --> 02:08:14,000 And if that route is requested, call this method store.
2552 02:08:14,000 --> 02:08:18,550 Meanwhile that method says, hey, if the request method is post-- that is, 2553 02:08:18,550 --> 02:08:21,670 if the user submitted a form not by get but by post-- 2554 02:08:21,670 --> 02:08:25,400 go ahead and iterate over the three possible items 2555 02:08:25,400 --> 02:08:28,630 that we sell in the store, foo, bar, and baz. 2556 02:08:28,630 --> 02:08:31,650 If the item is not already in the session-- 2557 02:08:31,650 --> 02:08:34,210 and you can think of session as our shopping cart. 2558 02:08:34,210 --> 02:08:37,490 It's a special object dictionary that allows us to store 2559 02:08:37,490 --> 02:08:40,330 keys and values, a hash table of sorts. 2560 02:08:40,330 --> 02:08:44,810 Then go ahead and add to the session, the shopping cart, that item-- foo 2561 02:08:44,810 --> 02:08:50,790 or bar or baz-- and then as an integer the number of foos or bars 2562 02:08:50,790 --> 02:08:53,340 or bazes that the user requested via the form. 2563 02:08:53,340 --> 02:08:57,380 We're calling int here, effectively casting whatever that value is. 2564 02:08:57,380 --> 02:09:01,010 Because it turns out HTTP, all of those messages going back and forth 2565 02:09:01,010 --> 02:09:03,030 all this time are purely textual. 2566 02:09:03,030 --> 02:09:08,340 So even though it looks like 10 or 1 or 2 or 3, those are actually strings. 2567 02:09:08,340 --> 02:09:11,260 So using int here converts that to the integer. 2568 02:09:11,260 --> 02:09:14,700 So we're actually storing numbers with which we can do simple arithmetic. 2569 02:09:14,700 --> 02:09:18,980 Otherwise, if a foo or bar or baz was already in the shopping cart, 2570 02:09:18,980 --> 02:09:22,380 go ahead with Python's plus equal operator, which C also 2571 02:09:22,380 --> 02:09:26,200 had, and just increment that count from 1 to 11, for instance, 2572 02:09:26,200 --> 02:09:27,580 as I did a moment ago.
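The cart-updating logic just described can be sketched in plain Python, with ordinary dicts standing in for both the session and the submitted form. The names ITEMS and add_to_cart are hypothetical, and this sketch assumes every field arrives as a string of digits, so a field the user left untouched is written as "0" here.

```python
# Assumption: plain dicts stand in for the session and the submitted form;
# ITEMS and add_to_cart are hypothetical names for this illustration.
ITEMS = ["foo", "bar", "baz"]


def add_to_cart(session, form):
    """Add the quantities in `form` to the running totals in `session`."""
    for item in ITEMS:
        quantity = int(form[item])  # form values arrive as text, so convert
        if item not in session:
            session[item] = quantity  # first time we've seen this item
        else:
            session[item] += quantity  # already in the cart, so increment
    return session


session = {}
add_to_cart(session, {"foo": "1", "bar": "2", "baz": "3"})
add_to_cart(session, {"foo": "10", "bar": "0", "baz": "0"})
print(session)  # {'foo': 11, 'bar': 2, 'baz': 3}
```

Without the call to int, the += in the second purchase would glue the strings "1" and "10" together into "110" rather than adding the numbers to get 11.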
2573 02:09:27,580 --> 02:09:33,520 And then redirect the user to whatever the URL is for the cart route. 2574 02:09:33,520 --> 02:09:35,780 In other words, after I add something to my cart, 2575 02:09:35,780 --> 02:09:39,370 let's just show me the cart right away rather than showing me the order form 2576 02:09:39,370 --> 02:09:40,160 instead. 2577 02:09:40,160 --> 02:09:43,350 Otherwise, if the user requested this page via get, 2578 02:09:43,350 --> 02:09:47,260 go ahead by default and just return store.html, which 2579 02:09:47,260 --> 02:09:51,700 is that simple form that lists the text fields and the numbers of foos and bars 2580 02:09:51,700 --> 02:09:54,310 and bazes that you might want to buy. 2581 02:09:54,310 --> 02:09:57,630 Meanwhile, the shopping cart is implemented in this application 2582 02:09:57,630 --> 02:09:58,300 as follows. 2583 02:09:58,300 --> 02:10:02,550 Here's a route for /cart, in which case a method called cart is called. 2584 02:10:02,550 --> 02:10:06,920 We then declare inside of this method an empty list called cart. 2585 02:10:06,920 --> 02:10:09,610 And then as before, we iterate over the available items 2586 02:10:09,610 --> 02:10:11,890 in our store-- foo and bar and baz. 2587 02:10:11,890 --> 02:10:13,480 And then what do we do? 2588 02:10:13,480 --> 02:10:20,590 We simply append to this cart, this list object, the following. 2589 02:10:20,590 --> 02:10:25,790 We append what's called a dictionary, a hash table, a collection of key value 2590 02:10:25,790 --> 02:10:34,890 pairs, simply by associating a name with an item capitalized properly, 2591 02:10:34,890 --> 02:10:37,820 and a quantity associated with the actual number 2592 02:10:37,820 --> 02:10:39,690 of those items in my session. 2593 02:10:39,690 --> 02:10:44,660 And then we return this time not just the template via its name, cart.html.
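The list-building step just described might look like this in plain Python, again with an ordinary dict standing in for the session. The names ITEMS and build_cart are hypothetical, and the final loop only mimics what a template would do when it iterates over the list and plugs in each quantity and name.

```python
# Assumption: a plain dict stands in for the session; ITEMS and build_cart
# are hypothetical names for this illustration.
ITEMS = ["foo", "bar", "baz"]


def build_cart(session):
    """Turn the session's counts into a list of small dictionaries."""
    cart = []  # start with an empty list
    for item in ITEMS:
        # Each entry pairs a capitalized name with that item's quantity.
        cart.append({"name": item.capitalize(), "quantity": session[item]})
    return cart


session = {"foo": 11, "bar": 2, "baz": 3}
cart = build_cart(session)

# Roughly what a template's for loop over the cart would display.
for entry in cart:
    print(entry["quantity"], entry["name"])
```

A list of dictionaries is a convenient shape to hand to a template because the list keeps the items in a fixed order while each dictionary carries the labeled fields for one row.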
2594 02:10:44,660 --> 02:10:51,070 We furthermore render cart.html, passing into that template 2595 02:10:51,070 --> 02:10:55,290 a variable called cart whose value is also cart. 2596 02:10:55,290 --> 02:10:57,860 In other words, this variable is going to be called cart. 2597 02:10:57,860 --> 02:11:00,186 And it's going to be equal to whatever this list is. 2598 02:11:00,186 --> 02:11:02,060 And it's these two lines here in the for loop 2599 02:11:02,060 --> 02:11:06,020 that are appending a set of key value pairs 2600 02:11:06,020 --> 02:11:09,890 so that we know how many foos we have, how many bars, and how many bazes. 2601 02:11:09,890 --> 02:11:14,610 And that's why in cart.html do we have access to, on line 9 2602 02:11:14,610 --> 02:11:20,300 here, a cart list over which we can iterate. 2603 02:11:20,300 --> 02:11:23,640 So there, we're just scratching the surface of what we can do. 2604 02:11:23,640 --> 02:11:25,810 But we now have a language with which we can 2605 02:11:25,810 --> 02:11:27,610 express these new kinds of features. 2606 02:11:27,610 --> 02:11:31,782 We now have a server environment that allows us to actually execute 2607 02:11:31,782 --> 02:11:33,490 Python code not only at the command line, 2608 02:11:33,490 --> 02:11:38,430 but also via HTTP and in turn TCP/IP, and in general, 2609 02:11:38,430 --> 02:11:39,760 over the internet itself. 2610 02:11:39,760 --> 02:11:42,940 So now, using this language-- and soon a database, 2611 02:11:42,940 --> 02:11:45,720 and soon a client side language called JavaScript and more-- can 2612 02:11:45,720 --> 02:11:48,678 we start to build the very kinds of websites with which you're probably 2613 02:11:48,678 --> 02:11:52,010 already familiar and using them every day on your laptops and phones. 2614 02:11:52,010 --> 02:11:55,530 We're just now beginning our foray into web programming. 
2615 02:11:55,530 --> 02:11:58,490 And next week, we'll add a back end so that we can actually 2616 02:11:58,490 --> 02:12:02,328 do all this and more. 2617 02:12:02,328 --> 02:12:02,994 [AUDIO PLAYBACK] 2618 02:12:02,994 --> 02:12:06,718 [MUSIC PLAYING] 2619 02:12:06,718 --> 02:12:07,967 -I never even got to know him. 2620 02:12:07,967 --> 02:12:15,430 2621 02:12:15,430 --> 02:12:17,805 I just-- I don't know what happened. 2622 02:12:17,805 --> 02:12:22,755 2623 02:12:22,755 --> 02:12:23,745 Please, please. 2624 02:12:23,745 --> 02:12:27,930 I-- I need to be alone. 2625 02:12:27,930 --> 02:12:31,427 The people need to know. 2626 02:12:31,427 --> 02:12:32,990 [INAUDIBLE] 2627 02:12:32,990 --> 02:12:33,490 No. 2628 02:12:33,490 --> 02:12:34,920 Please-- please go. 2629 02:12:34,920 --> 02:12:51,874 2630 02:12:51,874 --> 02:12:54,540 I never did get that dinner with him at his favorite restaurant. 2631 02:12:54,540 --> 02:12:58,440 2632 02:12:58,440 --> 02:12:59,990 [END PLAYBACK]