1 00:00:00,000 --> 00:00:00,996 [MUSIC PLAYING] 2 00:00:00,996 --> 00:00:23,920 3 00:00:23,920 --> 00:00:26,800 DAVID MALAN: All right, this is CS50's introduction 4 00:00:26,800 --> 00:00:28,150 to programming with Python. 5 00:00:28,150 --> 00:00:31,030 My name is David Malan, and this is our look at style. 6 00:00:31,030 --> 00:00:35,140 Now, up until now, we've been writing code that's hopefully at least correct. 7 00:00:35,140 --> 00:00:37,600 That is, the code does what you intended it to do. 8 00:00:37,600 --> 00:00:40,900 And hopefully, it's also well designed, that 9 00:00:40,900 --> 00:00:43,300 is, you're using relatively few lines of code 10 00:00:43,300 --> 00:00:46,120 to achieve some goal while still ensuring that it's readable. 11 00:00:46,120 --> 00:00:50,710 You're perhaps using functions that save you the trouble of reinventing wheels, 12 00:00:50,710 --> 00:00:51,520 so to speak. 13 00:00:51,520 --> 00:00:55,510 But it's possible that your code isn't necessarily manifesting the best style 14 00:00:55,510 --> 00:00:58,180 as well, which is another form of quality 15 00:00:58,180 --> 00:00:59,830 that you can ascribe to some code. 16 00:00:59,830 --> 00:01:03,430 Now style, or the right type of style to use, is rather subjective, 17 00:01:03,430 --> 00:01:06,370 and it depends typically on the programmer, on the company, 18 00:01:06,370 --> 00:01:09,430 on the course, or on the language that you're actually using. 19 00:01:09,430 --> 00:01:11,560 But within the Python community, it turns out 20 00:01:11,560 --> 00:01:13,780 that there are some pretty regimented standards 21 00:01:13,780 --> 00:01:16,450 that most all Python programmers adhere to, 22 00:01:16,450 --> 00:01:20,290 or rather, are expected to adhere to, because this particular language 23 00:01:20,290 --> 00:01:23,680 and community has tried to codify some of their preferences 24 00:01:23,680 --> 00:01:28,030 for how your code should look in the form of something called PEP 8. 
25 00:01:28,030 --> 00:01:29,770 So a PEP or P-E-P-- 26 00:01:29,770 --> 00:01:32,860 Python Enhancement Proposal-- is one of a set of proposals 27 00:01:32,860 --> 00:01:34,870 that the community within the world of Python 28 00:01:34,870 --> 00:01:38,470 typically generates to not only propose new ideas, but also to codify, 29 00:01:38,470 --> 00:01:40,240 ultimately, certain standards. 30 00:01:40,240 --> 00:01:45,310 And PEP 8 happens to be such a proposal that standardized, or rather tried 31 00:01:45,310 --> 00:01:48,790 to standardize, what our code should look like-- that is to say, 32 00:01:48,790 --> 00:01:51,610 it's quite possible to write code that's not only correct, 33 00:01:51,610 --> 00:01:54,790 it might even be well designed, but it's just really a mess. 34 00:01:54,790 --> 00:01:56,860 And it just doesn't look very good. 35 00:01:56,860 --> 00:01:58,000 It's not very pretty. 36 00:01:58,000 --> 00:01:59,770 It's therefore harder to read. 37 00:01:59,770 --> 00:02:01,240 It's harder for others to read. 38 00:02:01,240 --> 00:02:03,550 And therefore, it's just not as maintainable. 39 00:02:03,550 --> 00:02:06,970 And any time you make something harder to read or less maintainable, 40 00:02:06,970 --> 00:02:09,340 you're just increasing the probability, dare I say down 41 00:02:09,340 --> 00:02:11,200 the line, of introducing bugs. 42 00:02:11,200 --> 00:02:14,560 So it's a good thing for your code to be properly formatted, 43 00:02:14,560 --> 00:02:18,640 just like in the world of writing emails or essays or books or documents 44 00:02:18,640 --> 00:02:19,360 or beyond. 45 00:02:19,360 --> 00:02:21,490 It tends to be good practice to capitalize 46 00:02:21,490 --> 00:02:25,367 certain words, at least in English, use good punctuation, use paragraph 47 00:02:25,367 --> 00:02:26,200 breaks and the like. 
48 00:02:26,200 --> 00:02:28,870 So even if you're relatively new to programming, at least 49 00:02:28,870 --> 00:02:30,910 in English or your own human language, odds 50 00:02:30,910 --> 00:02:34,030 are you've had quite a bit of practice with writing language 51 00:02:34,030 --> 00:02:37,390 in the human world that also just looks good as well. 52 00:02:37,390 --> 00:02:40,750 Well, what does it mean to look good in the world of programming code? 53 00:02:40,750 --> 00:02:43,810 Well, let me propose that if you look at PEP 8 itself, 54 00:02:43,810 --> 00:02:46,660 which is available on the internet via this address here, 55 00:02:46,660 --> 00:02:48,880 it turns out that it tries to standardize 56 00:02:48,880 --> 00:02:52,187 a number of details that would be manifest in your own code 57 00:02:52,187 --> 00:02:53,770 once you've written a number of lines. 58 00:02:53,770 --> 00:02:56,470 And the overarching premise of PEP 8, and really 59 00:02:56,470 --> 00:02:59,080 the notion of style in the Python community especially, 60 00:02:59,080 --> 00:03:01,720 is that "readability counts." 61 00:03:01,720 --> 00:03:04,120 And typically, languages-- Python among them-- 62 00:03:04,120 --> 00:03:06,460 come with what's generally known as a style guide. 63 00:03:06,460 --> 00:03:09,220 A style guide, not unlike PEP 8, is some kind 64 00:03:09,220 --> 00:03:12,430 of guide-- a document, either printed or perhaps internet 65 00:03:12,430 --> 00:03:16,297 based-- that just tries to standardize what everyone's code should look like. 66 00:03:16,297 --> 00:03:18,880 So a course that you're taking might have its own style guide. 67 00:03:18,880 --> 00:03:21,400 A company you're working for might have its own style guide. 68 00:03:21,400 --> 00:03:24,130 You, down the road as a professional programmer, 69 00:03:24,130 --> 00:03:26,410 might have your own style guide for your own code. 
70 00:03:26,410 --> 00:03:28,870 But within the Python community, they've tried generally 71 00:03:28,870 --> 00:03:31,750 to standardize for the most part a lot of these details. 72 00:03:31,750 --> 00:03:33,670 And ultimately-- quote, unquote too-- 73 00:03:33,670 --> 00:03:37,870 a style guide is about consistency. Consistency with this style guide, 74 00:03:37,870 --> 00:03:39,850 in the context of Python, is important. 75 00:03:39,850 --> 00:03:42,460 Consistency within a project is more important. 76 00:03:42,460 --> 00:03:46,990 And consistency within one module or function is the most important. 77 00:03:46,990 --> 00:03:49,030 So that is to say, these aren't necessarily hard 78 00:03:49,030 --> 00:03:51,160 and fast rules, but rather guidelines that 79 00:03:51,160 --> 00:03:53,590 should guide the design of your own code. 80 00:03:53,590 --> 00:03:56,320 Now, how do you go about designing, or how do you 81 00:03:56,320 --> 00:03:58,150 go about styling your code well? 82 00:03:58,150 --> 00:04:01,600 Well, it boils down to things like this and many more-- 83 00:04:01,600 --> 00:04:04,480 indentation, using consistent indentation. 84 00:04:04,480 --> 00:04:06,730 Now, in some languages, it doesn't strictly 85 00:04:06,730 --> 00:04:09,490 have to be there, when you indent one line of code under another, 86 00:04:09,490 --> 00:04:12,640 or it could be one space, or two spaces, three spaces, or four-- maybe 87 00:04:12,640 --> 00:04:14,530 even eight, or an actual tab. 88 00:04:14,530 --> 00:04:17,410 In the world of Python, they tried to put an end to this debate 89 00:04:17,410 --> 00:04:23,020 and they prescribed that we all just agree to use four spaces consistently. 90 00:04:23,020 --> 00:04:26,560 Tabs versus spaces-- no tabs, spaces instead. 
91 00:04:26,560 --> 00:04:28,720 And indeed, typically in something like VS Code, 92 00:04:28,720 --> 00:04:30,850 when you hit Tab on your keyboard, depending 93 00:04:30,850 --> 00:04:33,640 on how the program is configured, it should generally 94 00:04:33,640 --> 00:04:37,870 convert even things like tabs to individual spaces, like four. 95 00:04:37,870 --> 00:04:39,580 What about maximum line length? 96 00:04:39,580 --> 00:04:41,620 Your code tends to get less and less readable 97 00:04:41,620 --> 00:04:43,660 the longer and longer your lines of code get, 98 00:04:43,660 --> 00:04:45,785 especially if they start to scroll over the screen. 99 00:04:45,785 --> 00:04:49,090 So Python too tries to standardize what the maximum line length is. 100 00:04:49,090 --> 00:04:51,730 You just shouldn't go past a certain number of characters 101 00:04:51,730 --> 00:04:53,470 to the right of your screen. 102 00:04:53,470 --> 00:04:56,320 Blank lines-- using some number of blank lines, 103 00:04:56,320 --> 00:04:59,020 for instance, between blocks of code, perhaps among comments 104 00:04:59,020 --> 00:05:01,550 also just lends itself to making your code more readable. 105 00:05:01,550 --> 00:05:02,050 Why? 106 00:05:02,050 --> 00:05:04,450 Because it's not just a wall of text, a wall 107 00:05:04,450 --> 00:05:08,232 of code that you or other programmers have to see by adding in blank lines. 108 00:05:08,232 --> 00:05:10,690 And whitespace, more generally, can make it a little easier 109 00:05:10,690 --> 00:05:13,060 to wrap your mind around what's going on. 110 00:05:13,060 --> 00:05:14,980 And then imports-- even something like this, 111 00:05:14,980 --> 00:05:18,700 when you're importing this library or that, or this module or package, 112 00:05:18,700 --> 00:05:21,970 Python too prescribes just how and where you should generally 113 00:05:21,970 --> 00:05:24,830 put those lines of code that say import or from. 
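To make those four conventions concrete, here is a minimal sketch of a small file laid out the way PEP 8 suggests. The function names `pick` and `main` and the list of names are hypothetical, chosen only for illustration; they are not from the lecture.

```python
# Imports go at the top of the file, one module per line.
import random
import sys


# Two blank lines separate top-level definitions from what surrounds them.
def pick(names):
    # Each level of indentation is exactly four spaces, never a tab.
    if not names:
        return None
    return random.choice(names)


def main():
    # A long call can be wrapped inside its parentheses so that no line
    # runs past PEP 8's suggested limit of 79 characters.
    name = pick(
        ["Hermione", "Harry", "Ron"],  # hypothetical sample data
    )
    print(name, file=sys.stdout)


if __name__ == "__main__":
    main()
```

None of this changes what the program does. It only changes how easily someone else can read it.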
114 00:05:24,830 --> 00:05:28,950 And Python, via PEP 8, also prescribes any number of details as well. 115 00:05:28,950 --> 00:05:29,450 And 116 00:05:29,450 --> 00:05:33,200 these aren't details that we necessarily preach along the way, 117 00:05:33,200 --> 00:05:36,680 but rather practice as we write each of these examples in class. 118 00:05:36,680 --> 00:05:39,350 And it turns out, too, that as you see more and more code, well, 119 00:05:39,350 --> 00:05:43,820 you just get accustomed to certain rules of proposals like these. 120 00:05:43,820 --> 00:05:46,640 So how do you go about, though, checking if your code is 121 00:05:46,640 --> 00:05:50,840 in conformance with something like PEP 8, or a style guide more generally? 122 00:05:50,840 --> 00:05:53,118 Well, you can certainly read the style guide itself 123 00:05:53,118 --> 00:05:55,910 and then look at your own code, and compare the two left and right. 124 00:05:55,910 --> 00:05:59,420 And decide, oh, I need to fix this and this other thing in my own code. 125 00:05:59,420 --> 00:06:01,403 But there are also tools, being programmers, 126 00:06:01,403 --> 00:06:03,320 that can help us solve these problems as well. 127 00:06:03,320 --> 00:06:05,403 And one of the most popular in the world of Python 128 00:06:05,403 --> 00:06:08,180 is a program called pylint, which is an example of what's 129 00:06:08,180 --> 00:06:11,870 generally known as a linter, which is a program that statically 130 00:06:11,870 --> 00:06:12,500 analyzes, 131 00:06:12,500 --> 00:06:15,050 that is, reads your code top to bottom, left to right, 132 00:06:15,050 --> 00:06:18,950 and tries to figure out if there are potentially mistakes therein, 133 00:06:18,950 --> 00:06:22,920 or at least inconsistencies with something like a prescribed style 134 00:06:22,920 --> 00:06:23,420 guide. 
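To give a feel for what "statically analyzes" means, here is a toy sketch of a style checker. It is not how pylint works internally; the function name `check_style` and the two rules it checks are assumptions made purely for illustration. The point is that it reads the source as text, top to bottom, without ever running it.

```python
LIMIT = 79  # the maximum line length that PEP 8 suggests


def check_style(source):
    """Return a list of style complaints about the given source code."""
    complaints = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > LIMIT:
            complaints.append(f"line {number}: over {LIMIT} characters")
        # Leading whitespace is whatever lstrip() would remove from the front.
        indentation = line[: len(line) - len(line.lstrip())]
        if "\t" in indentation:
            complaints.append(f"line {number}: tab used for indentation")
    return complaints
```

A real linter checks far more than this and understands Python's grammar rather than raw lines, but the shape is the same: source code in, list of complaints out.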
135 00:06:23,420 --> 00:06:26,030 This is something that you can install via the usual ways, 136 00:06:26,030 --> 00:06:27,110 using something like pip. 137 00:06:27,110 --> 00:06:28,760 And its documentation is here. 138 00:06:28,760 --> 00:06:32,090 But it turns out there are other tools out there as well that are perhaps 139 00:06:32,090 --> 00:06:33,740 a little less noisy than pylint. 140 00:06:33,740 --> 00:06:36,380 It turns out if you run pylint on most of the programs 141 00:06:36,380 --> 00:06:38,480 you've written thus far, odds are you're going 142 00:06:38,480 --> 00:06:41,660 to be overwhelmed with just how many things you apparently did wrong 143 00:06:41,660 --> 00:06:45,230 stylistically, even though your code may very well be both correct 144 00:06:45,230 --> 00:06:46,400 and well designed. 145 00:06:46,400 --> 00:06:49,160 So a little less noisy, at least initially, 146 00:06:49,160 --> 00:06:52,700 might be this program here, that's the de facto standard within the Python 147 00:06:52,700 --> 00:06:55,580 community for checking your code's style for you. 148 00:06:55,580 --> 00:06:58,010 Pycodestyle, formerly known as pep8-- 149 00:06:58,010 --> 00:07:01,580 a program as well-- is a program that you can not only 150 00:07:01,580 --> 00:07:04,340 run on your computer, documented at this URL here, 151 00:07:04,340 --> 00:07:07,670 it will actually take care of the process of checking 152 00:07:07,670 --> 00:07:09,890 your code for you and reporting where it's a bit messy. 153 00:07:09,890 --> 00:07:14,090 That is, if the style of your code, your indentation and blank lines 154 00:07:14,090 --> 00:07:17,330 and other details as well, are not in accordance with the style guide, 155 00:07:17,330 --> 00:07:20,360 something like pycodestyle will just point that out for you. 
156 00:07:20,360 --> 00:07:22,550 But another one, an alternative nowadays that's 157 00:07:22,550 --> 00:07:25,760 actually gaining steam, that's perhaps even more popular nowadays, 158 00:07:25,760 --> 00:07:27,230 is quite simply called black. 159 00:07:27,230 --> 00:07:30,290 And black is a program that you too can install with pip here, 160 00:07:30,290 --> 00:07:32,090 and it's documented at this URL. 161 00:07:32,090 --> 00:07:34,850 And the etymology of the name black is actually 162 00:07:34,850 --> 00:07:39,710 an allusion to Henry Ford, who mass-produced cars way back when, and 163 00:07:39,710 --> 00:07:42,260 sold quite a few black models of the same. 164 00:07:42,260 --> 00:07:46,070 And indeed, he's generally known for saying a quote along these lines, 165 00:07:46,070 --> 00:07:49,070 any customer can have a car painted any color that he 166 00:07:49,070 --> 00:07:51,800 wants, so long as it is black. 167 00:07:51,800 --> 00:07:54,590 So if you think about that for a moment, it's not quite a choice. 168 00:07:54,590 --> 00:07:58,400 And indeed, that's the spirit of this particular formatter called black, 169 00:07:58,400 --> 00:08:01,280 which is that it's opinionated, more so than others. 170 00:08:01,280 --> 00:08:05,000 With a lot of these formatters that exist out there, you tend to-- 171 00:08:05,000 --> 00:08:07,190 or your company tends to, or your course tends 172 00:08:07,190 --> 00:08:09,350 to configure them with certain rules. 173 00:08:09,350 --> 00:08:11,900 So there might be different ways to do indentation. 174 00:08:11,900 --> 00:08:15,407 There might be different ways to do imports or blank lines, 175 00:08:15,407 --> 00:08:17,240 and different companies and different people 176 00:08:17,240 --> 00:08:20,570 might reasonably disagree, and therefore have their own style guide here, 177 00:08:20,570 --> 00:08:22,010 their own style guide there. 
178 00:08:22,010 --> 00:08:26,300 And it just tends to waste a lot of time, is the thinking, if all of us 179 00:08:26,300 --> 00:08:28,190 don't even agree on some of these basics. 180 00:08:28,190 --> 00:08:32,360 So this particular formatter, called black, is opinionated in the sense 181 00:08:32,360 --> 00:08:34,669 that it just makes a lot of these decisions for you. 182 00:08:34,669 --> 00:08:38,090 And if you don't like the way black is formatting your code, tough-- 183 00:08:38,090 --> 00:08:40,549 tough, that's the way it's going to do it. 184 00:08:40,549 --> 00:08:43,640 And this is a trend, perhaps now, at least within the Python community, 185 00:08:43,640 --> 00:08:46,190 of quibbling a little less over these stylistic details 186 00:08:46,190 --> 00:08:50,990 so that you and I can ultimately focus all the more on writing good code 187 00:08:50,990 --> 00:08:52,490 and solving actual problems. 188 00:08:52,490 --> 00:08:54,020 And let's go ahead and do just this. 189 00:08:54,020 --> 00:08:56,990 Let me go ahead and open up VS Code here, where in advance I've 190 00:08:56,990 --> 00:09:00,020 created a program called students.py, whose purpose in life 191 00:09:00,020 --> 00:09:02,330 is just to create at the top of this program 192 00:09:02,330 --> 00:09:05,510 a dictionary containing key-value pairs, the keys of which 193 00:09:05,510 --> 00:09:08,030 are the names of some students, the values of which 194 00:09:08,030 --> 00:09:11,562 are the houses at Hogwarts in which they live, and then I've just got a for loop 195 00:09:11,562 --> 00:09:13,520 here that iterates over each of those students, 196 00:09:13,520 --> 00:09:15,988 printing out for now each of their names. 197 00:09:15,988 --> 00:09:18,530 You can imagine certainly doing more with the values as well, 198 00:09:18,530 --> 00:09:20,420 but for now, this is a simple program. 
199 00:09:20,420 --> 00:09:24,200 But it's already manifesting poor style, arguably-- certainly 200 00:09:24,200 --> 00:09:26,330 inconsistent with PEP 8 itself. 201 00:09:26,330 --> 00:09:29,510 And I know this from just experience, eyeballing it, and realizing, wow, 202 00:09:29,510 --> 00:09:31,880 this is not a good thing that this first line of code 203 00:09:31,880 --> 00:09:33,930 goes completely off my own screen. 204 00:09:33,930 --> 00:09:35,930 Feels like it's probably a little long-- and I'm 205 00:09:35,930 --> 00:09:38,240 going to have to scroll to even see what's going on. 206 00:09:38,240 --> 00:09:40,790 And this one's more subtle, but if you look at line three 207 00:09:40,790 --> 00:09:44,810 here, even though I've indented technically sufficiently-- so 208 00:09:44,810 --> 00:09:47,210 long as my code is indented underneath the for loop, 209 00:09:47,210 --> 00:09:50,090 and any subsequent indentation, if I had more lines of code, 210 00:09:50,090 --> 00:09:52,880 were similarly indented, the code would work. 211 00:09:52,880 --> 00:09:54,200 And it would be correct. 212 00:09:54,200 --> 00:09:58,040 But it just does not accord with PEP 8, which prescribes again, 213 00:09:58,040 --> 00:10:01,010 four spaces of indentation at each level. 214 00:10:01,010 --> 00:10:02,787 So how can I go about fixing this? 215 00:10:02,787 --> 00:10:05,120 Well, if I've already installed one of these formatters, 216 00:10:05,120 --> 00:10:07,610 for instance, something like black, I could literally just 217 00:10:07,610 --> 00:10:09,470 do this with my terminal window. 218 00:10:09,470 --> 00:10:13,820 I'm going to run the command black, space, students.py, and hit enter. 219 00:10:13,820 --> 00:10:16,910 And voila, you'll see that the students.py file 220 00:10:16,910 --> 00:10:18,740 has been automatically reformatted. 
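As a rough reconstruction of students.py before formatting, the file might look something like the sketch below. Only Padma in Ravenclaw is named in this lecture; the other four entries are assumptions filled in for illustration. The first line runs well past PEP 8's limit, and line three is indented by two spaces rather than four.

```python
students = {"Hermione": "Gryffindor", "Harry": "Gryffindor", "Ron": "Gryffindor", "Draco": "Slytherin", "Padma": "Ravenclaw"}
for student in students:  # names other than Padma are assumptions
  print(student)  # two spaces: this runs fine, but PEP 8 asks for four
```

The code is correct and Python runs it without complaint, which is exactly why style has to be checked separately. Running `black students.py` in the terminal rewrites the file in place.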
221 00:10:18,740 --> 00:10:21,470 And at the top of my file, I have now a dictionary-- 222 00:10:21,470 --> 00:10:24,780 same dictionary as before, but just so much more readable. 223 00:10:24,780 --> 00:10:27,840 Not only does it no longer run off the edge of the screen, 224 00:10:27,840 --> 00:10:32,250 you can also see each of the key-value pairs one line at a time. 225 00:10:32,250 --> 00:10:35,460 And you can see that it lends itself to even adding more down the road. 226 00:10:35,460 --> 00:10:40,650 It turns out it's not incorrect to have a final trailing comma like this, 227 00:10:40,650 --> 00:10:43,740 even though it's not strictly necessary here on the new line six. 228 00:10:43,740 --> 00:10:44,280 Why? 229 00:10:44,280 --> 00:10:48,060 Well, Padma in Ravenclaw is the very last student 230 00:10:48,060 --> 00:10:51,840 in this particular dictionary. 231 00:10:51,840 --> 00:10:54,630 But just in case, as is often the case, I 232 00:10:54,630 --> 00:10:57,180 might go in and start adding more key-value pairs later. 233 00:10:57,180 --> 00:11:00,355 A common source of mistakes is to accidentally forget that, oh, 234 00:11:00,355 --> 00:11:01,980 I didn't have a comma there previously. 235 00:11:01,980 --> 00:11:04,050 And here I am, adding more key-value pairs. 236 00:11:04,050 --> 00:11:07,800 So that is the kind of detail that black not only fixes for you, 237 00:11:07,800 --> 00:11:10,590 but again, has an opinion on as well. 
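Roughly, the reformatted students.py would look like the sketch below, with one key-value pair per line, a trailing comma after the last pair on line six, and four spaces under the for loop. As before, only Padma in Ravenclaw comes from the lecture; the other names are assumptions for illustration.

```python
students = {
    "Hermione": "Gryffindor",  # names other than Padma are assumptions
    "Harry": "Gryffindor",
    "Ron": "Gryffindor",
    "Draco": "Slytherin",
    "Padma": "Ravenclaw",
}
for student in students:
    print(student)
```

With the trailing comma already in place, adding another student later means adding one new line and nothing else, so there is no earlier line to forget to fix.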
238 00:11:10,590 --> 00:11:13,410 So moving forward, as you write code of your own, 239 00:11:13,410 --> 00:11:16,530 it's good to ingrain in yourself some of the lessons of, and some 240 00:11:16,530 --> 00:11:20,160 of the guidelines in, a style guide like PEP 8, but know that there are tools, 241 00:11:20,160 --> 00:11:23,610 be it pycodestyle, or black, or something else altogether, 242 00:11:23,610 --> 00:11:27,450 that you can use as a programmer to help you focus more on correctness, 243 00:11:27,450 --> 00:11:29,730 more on design, more on solving actual problems, 244 00:11:29,730 --> 00:11:35,420 while still ensuring that your code is now automatically formatted as well. 245 00:11:35,420 --> 00:11:37,000