1 00:00:00,000 --> 00:00:02,904 [MUSIC PLAYING] 2 00:00:02,904 --> 00:00:10,180 3 00:00:10,180 --> 00:00:11,200 DAVID MALAN: All right. 4 00:00:11,200 --> 00:00:15,880 So this is CS50, and this is lecture two, our continuation of C. 5 00:00:15,880 --> 00:00:18,380 And for the next several weeks, we're gonna keep using C. 6 00:00:18,380 --> 00:00:21,635 But we're gonna focus less on the language and the syntax, which we'll 7 00:00:21,635 --> 00:00:24,010 get experience with over time by way of the problem sets, 8 00:00:24,010 --> 00:00:26,676 and more and more on the ideas and more and more on the problems 9 00:00:26,676 --> 00:00:27,670 that we can solve. 10 00:00:27,670 --> 00:00:29,710 But before we forge ahead with anything new, 11 00:00:29,710 --> 00:00:32,080 let's take a quick look back at where we left off 12 00:00:32,080 --> 00:00:34,840 and what we'll sort of assume for today's comfort level, 13 00:00:34,840 --> 00:00:37,010 and ask any and all questions along the way. 14 00:00:37,010 --> 00:00:40,360 So in order to program last time, we needed a tool. 15 00:00:40,360 --> 00:00:42,605 And that tool was this thing here, CS50 IDE. 16 00:00:42,605 --> 00:00:45,730 If you haven't dived in already, you probably will this weekend for problem 17 00:00:45,730 --> 00:00:46,420 set one. 18 00:00:46,420 --> 00:00:49,000 And this will be a web-based programming environment 19 00:00:49,000 --> 00:00:51,250 that's got all the requisite tools you need in order 20 00:00:51,250 --> 00:00:55,210 to write code, compile code, and then, starting today, debug code 21 00:00:55,210 --> 00:00:57,550 or find mistakes in code. 22 00:00:57,550 --> 00:01:01,975 But this is requisite because it's not sufficient to just write this. 23 00:01:01,975 --> 00:01:04,769 What do we call this, generally speaking? 24 00:01:04,769 --> 00:01:06,010 Yeah, so this is source code. 25 00:01:06,010 --> 00:01:06,880 So this is code. 
26 00:01:06,880 --> 00:01:09,550 When someone says, I write code, they write stuff like this. 27 00:01:09,550 --> 00:01:12,400 And this is, particularly, a language called C. But, of course, 28 00:01:12,400 --> 00:01:15,580 computers don't speak C. And they don't speak Java, 29 00:01:15,580 --> 00:01:18,660 and they don't speak Python or C++ or any of the languages with which 30 00:01:18,660 --> 00:01:19,540 you're familiar. 31 00:01:19,540 --> 00:01:21,539 They only understand what at the end of the day? 32 00:01:21,539 --> 00:01:22,930 Yeah, so binary. 33 00:01:22,930 --> 00:01:25,300 So binary is, of course, zeros and ones, otherwise known 34 00:01:25,300 --> 00:01:29,520 as machine code in this context, insofar as it is code. 35 00:01:29,520 --> 00:01:33,029 It's instructions that implement some problem-solving techniques. 36 00:01:33,029 --> 00:01:35,320 But it's just zeros and ones that computers understand. 37 00:01:35,320 --> 00:01:39,723 So we needed a tool to get from A to B. And that was called what? 38 00:01:39,723 --> 00:01:41,410 Yeah, a compiler. 39 00:01:41,410 --> 00:01:43,960 So a compiler, of course, does this for us. 40 00:01:43,960 --> 00:01:45,520 Source code is the input. 41 00:01:45,520 --> 00:01:49,020 Compiler is the program or really the algorithm, 42 00:01:49,020 --> 00:01:50,770 albeit in the form of a piece of software. 43 00:01:50,770 --> 00:01:53,557 And then the output is machine code, zeros and ones. 44 00:01:53,557 --> 00:01:56,890 And for our purposes, we're not going to worry about how we get from step A to B 45 00:01:56,890 --> 00:01:57,520 per se. 46 00:01:57,520 --> 00:01:58,780 We'll want to use the tool. 47 00:01:58,780 --> 00:02:01,270 But this is another area unto itself in computer science. 
48 00:02:01,270 --> 00:02:03,310 If you want to understand how compilers work 49 00:02:03,310 --> 00:02:06,820 and how humans got from literally zeros and ones to something called assembly 50 00:02:06,820 --> 00:02:10,539 code to higher-level languages, you'll see a glimpse of that in CS50. 51 00:02:10,539 --> 00:02:14,500 But it unto itself is a whole field that might prove ultimately of interest. 52 00:02:14,500 --> 00:02:16,820 But this, more mechanically, is how we compiled code. 53 00:02:16,820 --> 00:02:17,770 Clang is a compiler. 54 00:02:17,770 --> 00:02:19,480 It stands for C language. 55 00:02:19,480 --> 00:02:22,076 And it's just software that some humans wrote some years ago. 56 00:02:22,076 --> 00:02:23,200 And there are alternatives. 57 00:02:23,200 --> 00:02:26,110 If you've ever used Visual Studio in the Windows world 58 00:02:26,110 --> 00:02:29,710 or GCC in the Linux and Unix world, there's bunches of other compilers. 59 00:02:29,710 --> 00:02:32,890 We just happened to use clang since it's pretty popular. 60 00:02:32,890 --> 00:02:35,440 And then that second command is even stranger looking. 61 00:02:35,440 --> 00:02:38,230 But it represents the act of doing what? 62 00:02:38,230 --> 00:02:39,430 ./a.out. 63 00:02:39,430 --> 00:02:40,900 Yes, over here? 64 00:02:40,900 --> 00:02:42,010 Yeah, running the program. 65 00:02:42,010 --> 00:02:42,509 Exactly. 66 00:02:42,509 --> 00:02:44,894 So ./a.out is a cryptic way of running a program. 67 00:02:44,894 --> 00:02:47,560 But it's like the textual equivalent of double-clicking an icon. 68 00:02:47,560 --> 00:02:49,660 And a.out is just like the default name you 69 00:02:49,660 --> 00:02:54,830 get, assembler output, when you compile a program without specifying its name. 70 00:02:54,830 --> 00:02:56,560 But we were able to specify a name. 71 00:02:56,560 --> 00:03:00,250 If you introduce a technique called command-line arguments, 72 00:03:00,250 --> 00:03:02,440 you can be a little more precise. 
73 00:03:02,440 --> 00:03:05,944 clang -o for output, then any word you want. 74 00:03:05,944 --> 00:03:07,360 In this case, I went with "hello." 75 00:03:07,360 --> 00:03:11,110 And then the name of the program or the file that you want to compile. 76 00:03:11,110 --> 00:03:13,060 But, this of course, gets pretty tedious. 77 00:03:13,060 --> 00:03:15,080 And, in fact, there's a missing step. 78 00:03:15,080 --> 00:03:18,070 Sometimes when you want to write a program, 79 00:03:18,070 --> 00:03:21,267 it suffices to compile it exactly as that. 80 00:03:21,267 --> 00:03:22,600 But let me go ahead and do this. 81 00:03:22,600 --> 00:03:26,350 Let me go ahead into CS50 IDE, and let me go ahead 82 00:03:26,350 --> 00:03:28,450 and briefly do the following. 83 00:03:28,450 --> 00:03:29,670 File, New. 84 00:03:29,670 --> 00:03:32,005 And let me go ahead and save this as "hello.c." 85 00:03:32,005 --> 00:03:35,710 And just from memory, I'll quickly recreate that same program. 86 00:03:35,710 --> 00:03:40,697 int main void, and then we had printf, hello, world. 87 00:03:40,697 --> 00:03:43,030 And then just for good measure, backslash n, which means 88 00:03:43,030 --> 00:03:44,930 move the cursor to the next line. 89 00:03:44,930 --> 00:03:46,900 And now I'm gonna go ahead and save that. 90 00:03:46,900 --> 00:03:52,300 If I go ahead now and run clang hello.c, looks good. 91 00:03:52,300 --> 00:03:54,310 And ./a.out. 92 00:03:54,310 --> 00:03:55,450 That looks good, too. 93 00:03:55,450 --> 00:03:58,690 But recall that we introduced some other functions the other day, 94 00:03:58,690 --> 00:04:00,400 as well, like get_string and get_int. 95 00:04:00,400 --> 00:04:02,650 And we'll see bunches more before long. 96 00:04:02,650 --> 00:04:06,290 And if I do that, notice that I have to do a couple of things. 
97 00:04:06,290 --> 00:04:12,250 So if I want to do, like, string name gets get string, 98 00:04:12,250 --> 00:04:14,530 and then, quote, unquote "name" to prompt the user, 99 00:04:14,530 --> 00:04:17,635 recall that the left-hand side says, hey, computer, 100 00:04:17,635 --> 00:04:19,510 give me a variable that's gonna store a string. 101 00:04:19,510 --> 00:04:20,241 Call it name. 102 00:04:20,241 --> 00:04:21,490 Could have called it anything. 103 00:04:21,490 --> 00:04:23,290 I could have called it s. 104 00:04:23,290 --> 00:04:26,440 But why might it be arguably better to call my variable name instead 105 00:04:26,440 --> 00:04:28,547 of string, or s, rather? 106 00:04:28,547 --> 00:04:29,046 Yeah? 107 00:04:29,046 --> 00:04:29,850 AUDIENCE: It's clearer. 108 00:04:29,850 --> 00:04:30,820 DAVID MALAN: It's just clearer, right? 109 00:04:30,820 --> 00:04:33,114 It might be a very marginal, nit-picky detail. 110 00:04:33,114 --> 00:04:34,030 But it's just clearer. 111 00:04:34,030 --> 00:04:35,590 And when our programs get bigger, it's just nicer 112 00:04:35,590 --> 00:04:37,673 to be able to read words and understand implicitly 113 00:04:37,673 --> 00:04:40,420 what they mean without having to think through what x or y or s 114 00:04:40,420 --> 00:04:41,860 or whatever actually is. 115 00:04:41,860 --> 00:04:45,174 On the right-hand side, meanwhile, we had this function get_string, 116 00:04:45,174 --> 00:04:47,590 whose purpose in life is to go get a string from the user, 117 00:04:47,590 --> 00:04:51,460 from his or her keyboard, prompting them with a word like "name," 118 00:04:51,460 --> 00:04:54,520 and then return it, just as Sam handed me back a slip of paper 119 00:04:54,520 --> 00:04:56,380 with a name on it. 120 00:04:56,380 --> 00:04:57,730 But there's a catch. 121 00:04:57,730 --> 00:05:03,560 It's now no longer sufficient to just adapt my code for this new approach. 122 00:05:03,560 --> 00:05:05,840 And I had, again, to change that second line. 
123 00:05:05,840 --> 00:05:08,036 I had to give a placeholder, which is %s. 124 00:05:08,036 --> 00:05:11,160 And if that's a little cryptic looking, just kind of think of it like a Mad 125 00:05:11,160 --> 00:05:12,910 Lib, if you're familiar with those, where there's just, like, 126 00:05:12,910 --> 00:05:14,540 a fill-in-the-blank here. 127 00:05:14,540 --> 00:05:17,200 And all %s is doing is saying, put a word there. 128 00:05:17,200 --> 00:05:18,070 What word? 129 00:05:18,070 --> 00:05:22,270 Well whatever comes after the comma, whatever variable or value is there. 130 00:05:22,270 --> 00:05:24,790 So that's good. 131 00:05:24,790 --> 00:05:26,620 But a couple of things can go wrong. 132 00:05:26,620 --> 00:05:28,990 And let me point those out so that you don't perhaps 133 00:05:28,990 --> 00:05:30,890 trip over it yourself on your own. 134 00:05:30,890 --> 00:05:35,200 Let me go ahead and do "clang hello.c" now that I've made these changes 135 00:05:35,200 --> 00:05:36,239 and saved them. 136 00:05:36,239 --> 00:05:38,530 Bunches of errors all of a sudden, even though I've not 137 00:05:38,530 --> 00:05:39,820 changed all that much code. 138 00:05:39,820 --> 00:05:41,570 Again, rule of thumb from last time should 139 00:05:41,570 --> 00:05:44,890 be even if there's lots of error messages, always look at the first one 140 00:05:44,890 --> 00:05:48,880 first because the rest might just be kind of a resulting cascade of errors, 141 00:05:48,880 --> 00:05:51,190 only one of which is important, which is the first. 142 00:05:51,190 --> 00:05:54,580 Now, it says "use of undeclared identifier string. 143 00:05:54,580 --> 00:05:57,610 Did you mean stdin," which is something else altogether. 144 00:05:57,610 --> 00:05:58,300 I didn't. 145 00:05:58,300 --> 00:05:59,330 I meant string. 146 00:05:59,330 --> 00:06:02,604 And it turns out that string is a feature of the so-called CS50 library. 
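The %s placeholder described above can be sketched in a few lines of C. This is a minimal illustration, not the lecture's own code: it assumes a hypothetical helper named format_greeting, and it uses snprintf rather than printf so that the filled-in result lands in a buffer that can be inspected. The placeholder rules are the same for both functions.

```c
#include <stdio.h>

// format_greeting is a hypothetical helper name used only in this sketch.
// It fills the %s placeholder with name and writes the result into out,
// never storing more than size bytes (including the terminating '\0').
// Like snprintf, it returns the number of characters the greeting needs.
int format_greeting(char *out, size_t size, const char *name)
{
    // "Put a word there. What word? Whatever comes after the comma."
    return snprintf(out, size, "hello, %s\n", name);
}
```

Calling format_greeting(buffer, sizeof buffer, "David") from main and then printing buffer should show the same greeting that printf("hello, %s\n", name) prints directly.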
147 00:06:02,604 --> 00:06:05,770 So this is one of these training wheels we're just gonna use for a few weeks 148 00:06:05,770 --> 00:06:08,440 until we dive in underneath the hood of strings, too. 149 00:06:08,440 --> 00:06:13,516 But in order to use anything from CS50, what did I need to add to my code, too? 150 00:06:13,516 --> 00:06:14,450 AUDIENCE: Source code. 151 00:06:14,450 --> 00:06:15,700 DAVID MALAN: Source code, yes. 152 00:06:15,700 --> 00:06:16,974 But what else? 153 00:06:16,974 --> 00:06:17,890 AUDIENCE: The library. 154 00:06:17,890 --> 00:06:18,931 DAVID MALAN: The library. 155 00:06:18,931 --> 00:06:20,607 And the library was the CS50 library. 156 00:06:20,607 --> 00:06:23,440 And that means there's a file somewhere on the computer, in the IDE, 157 00:06:23,440 --> 00:06:27,470 called cs50.h, a so-called header file-- more on those in a bit. 158 00:06:27,470 --> 00:06:28,990 And so I did forget that detail. 159 00:06:28,990 --> 00:06:31,362 But suppose you don't recall that yourselves. 160 00:06:31,362 --> 00:06:33,820 Well, you might recall from one of our orientation sessions 161 00:06:33,820 --> 00:06:36,434 that CS50 has tools with which to help for this, too. 162 00:06:36,434 --> 00:06:39,100 You don't have to turn to this course's online discussion forum. 163 00:06:39,100 --> 00:06:40,750 You don't have to go to office hours, necessarily, 164 00:06:40,750 --> 00:06:42,010 for error messages like this. 165 00:06:42,010 --> 00:06:45,160 If you can't quite wrap your mind around what's happening, 166 00:06:45,160 --> 00:06:46,660 go ahead and do this instead. 167 00:06:46,660 --> 00:06:49,570 Instead of just running clang hello.c, do something 168 00:06:49,570 --> 00:06:53,680 like help50 clang hello.c, where help50 is 169 00:06:53,680 --> 00:06:57,350 a CS50 specific command, sort of a virtual teaching fellow, if you will. 
170 00:06:57,350 --> 00:07:00,640 And if we recognize your error messages in yellow at the bottom, 171 00:07:00,640 --> 00:07:02,229 we'll highlight the first one. 172 00:07:02,229 --> 00:07:05,020 And then we'll try to give you advice like you might get in person. 173 00:07:05,020 --> 00:07:07,330 So "by undeclared identifier, clang means 174 00:07:07,330 --> 00:07:10,090 that you've used a name, string, on line 5 175 00:07:10,090 --> 00:07:12,820 of hello.c, which hasn't been defined. 176 00:07:12,820 --> 00:07:16,115 Did you forget to include cs50.h, in which string is defined, 177 00:07:16,115 --> 00:07:16,782 atop your file?" 178 00:07:16,782 --> 00:07:19,573 So we'll generally try to prompt you with rhetorical questions that 179 00:07:19,573 --> 00:07:22,640 hopefully are correct in leading you toward the right solution. 180 00:07:22,640 --> 00:07:26,140 So OK, that jogged my memory, at least, even if I'm not yet 100% 181 00:07:26,140 --> 00:07:29,230 comfortable with what these lines really are doing. 182 00:07:29,230 --> 00:07:31,330 And we'll tease that apart more today. 183 00:07:31,330 --> 00:07:34,060 But it feels like I'm trying to do this. 184 00:07:34,060 --> 00:07:40,120 But it turns out there's one other gotcha here. clang hello.c, Enter. 185 00:07:40,120 --> 00:07:40,950 Dang it. 186 00:07:40,950 --> 00:07:43,170 Well, actually, this is a net positive, right? 187 00:07:43,170 --> 00:07:44,830 Fewer error messages, it would seem. 188 00:07:44,830 --> 00:07:47,680 But now one we did not see the other day-- 189 00:07:47,680 --> 00:07:50,080 "undefined reference to get_string." 190 00:07:50,080 --> 00:07:52,000 So it's kind of similar in spirit. 191 00:07:52,000 --> 00:07:53,470 Something is not understood. 192 00:07:53,470 --> 00:07:55,120 But it's different wording, certainly. 193 00:07:55,120 --> 00:07:56,890 And I'm not quite sure what that means. 
194 00:07:56,890 --> 00:07:59,140 But it turns out that it's not sufficient 195 00:07:59,140 --> 00:08:04,390 just when using most libraries to use include cs50.h or other header 196 00:08:04,390 --> 00:08:06,070 files, as we'll eventually see. 197 00:08:06,070 --> 00:08:09,610 That just teaches the compiler that something exists. 198 00:08:09,610 --> 00:08:12,280 It was like briefly last time, when we talked about prototypes, 199 00:08:12,280 --> 00:08:14,740 where I put a little one-liner that just said, by the way, 200 00:08:14,740 --> 00:08:16,370 this function's eventually gonna exist. 201 00:08:16,370 --> 00:08:17,870 That's what the header file's doing. 202 00:08:17,870 --> 00:08:20,340 It's like a promise to clang, this shall exist. 203 00:08:20,340 --> 00:08:21,910 But there's a second step. 204 00:08:21,910 --> 00:08:25,720 You actually have to feed clang the zeros and ones that 205 00:08:25,720 --> 00:08:29,500 implement the CS50 library that you, of course, did not create yourself, 206 00:08:29,500 --> 00:08:31,720 but we did pre-install in the IDE. 207 00:08:31,720 --> 00:08:33,850 And there's a separate way of doing that. 208 00:08:33,850 --> 00:08:37,210 Rather than just do clang hello.c, you have 209 00:08:37,210 --> 00:08:40,294 to do what's called linking your code against that library, 210 00:08:40,294 --> 00:08:43,419 at least if it's a third-party library that doesn't come with the computer. 211 00:08:43,419 --> 00:08:45,070 It came from humans like us. 212 00:08:45,070 --> 00:08:49,180 So this now says, hey, clang, compile hello.c, but link it 213 00:08:49,180 --> 00:08:52,300 against the CS50 library, which means take my zeros and ones, 214 00:08:52,300 --> 00:08:55,810 take CS50's zeros and ones, combine them, and then give me 215 00:08:55,810 --> 00:08:58,339 my actual program to run. 216 00:08:58,339 --> 00:09:00,130 And so that's going to be a key ingredient. 
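The "promise" that a header file makes can be sketched with an ordinary prototype. This is a minimal example under stated assumptions: the names cube and cube_plus_one are invented for illustration and have nothing to do with the CS50 library. The prototype plays the role of the line in the header, and the definition further down plays the role of the zeros and ones that linking supplies.

```c
// Prototype: a promise to the compiler that cube shall exist.
// (cube is a hypothetical function name chosen for this sketch.)
int cube(int n);

// This caller appears before cube is defined. It compiles only
// because the prototype above has already made the promise.
int cube_plus_one(int n)
{
    return cube(n) + 1;
}

// The definition that keeps the promise. If it were missing, the
// compile step would still succeed, but linking would fail with an
// "undefined reference" style of error like the one seen for get_string.
int cube(int n)
{
    return n * n * n;
}
```

With a real library, the prototype comes from the header via #include and the definition comes from the library named after -l on the command line, as in the -lcs50 example.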
217 00:09:00,130 --> 00:09:02,129 And help50 could guide you toward that solution. 218 00:09:02,129 --> 00:09:04,570 But if I hit Enter now, now it seems to compile. 219 00:09:04,570 --> 00:09:06,520 ./a.out. 220 00:09:06,520 --> 00:09:07,900 My name shall be David. 221 00:09:07,900 --> 00:09:08,770 Enter. 222 00:09:08,770 --> 00:09:10,060 And it says, "hello, David." 223 00:09:10,060 --> 00:09:12,893 So again, don't get hung up, ultimately, in office hours and problem 224 00:09:12,893 --> 00:09:14,230 sets on these kinds of errors. 225 00:09:14,230 --> 00:09:16,540 You're gonna hit these bumps from the get-go. 226 00:09:16,540 --> 00:09:19,150 But just realize-- look for sort of familiar words. 227 00:09:19,150 --> 00:09:20,260 Use things like help50. 228 00:09:20,260 --> 00:09:21,552 Reach out to the course online. 229 00:09:21,552 --> 00:09:24,135 And just get over those hurdles because at the end of the day, 230 00:09:24,135 --> 00:09:26,170 the interesting stuff's gonna be the logic 231 00:09:26,170 --> 00:09:29,480 of the programs and the actual problems we're trying to solve. 232 00:09:29,480 --> 00:09:33,190 So what is it that's actually, then, going on here underneath the hood? 233 00:09:33,190 --> 00:09:35,516 And frankly, this is very quickly becoming tedious. 234 00:09:35,516 --> 00:09:37,390 So how do I automate some of these processes? 235 00:09:37,390 --> 00:09:39,580 Because it's very easy to forget this, and it's just very boring 236 00:09:39,580 --> 00:09:41,562 to continue to type so many commands. 237 00:09:41,562 --> 00:09:43,270 Well, recall that there was a shortcut we 238 00:09:43,270 --> 00:09:46,900 talked about the other day, which just kind of hides all of these details. 239 00:09:46,900 --> 00:09:48,190 You don't need to remember -o. 240 00:09:48,190 --> 00:09:51,182 You now don't need to remember -lcs50. 241 00:09:51,182 --> 00:09:52,890 What do you do instead to make a program? 242 00:09:52,890 --> 00:09:54,130 Yeah? 
243 00:09:54,130 --> 00:09:56,620 Yeah, so make, it's not a compiler itself. 244 00:09:56,620 --> 00:09:59,980 It's just kind of a helper program that knows how to run a compiler for you. 245 00:09:59,980 --> 00:10:03,400 So frankly, the simpler approach is just to do that-- make hello. 246 00:10:03,400 --> 00:10:06,340 And the reason it outputs so many more words is just because we have, 247 00:10:06,340 --> 00:10:08,550 in anticipation of teaching the semester, sort of 248 00:10:08,550 --> 00:10:12,112 preconfigured it with command-line arguments, additional words, 249 00:10:12,112 --> 00:10:14,070 that we expect you're gonna need at some point. 250 00:10:14,070 --> 00:10:17,010 And this just saves you the trouble of having to futz around with the manual 251 00:10:17,010 --> 00:10:18,090 to figure that kind of thing out. 252 00:10:18,090 --> 00:10:19,380 But that's why it looks cryptic. 253 00:10:19,380 --> 00:10:21,505 But notice this is really the most important word-- 254 00:10:21,505 --> 00:10:24,340 hello followed by hello.c. 255 00:10:24,340 --> 00:10:26,290 And those are your two ingredients. 256 00:10:26,290 --> 00:10:26,790 All right. 257 00:10:26,790 --> 00:10:30,400 So what's going on, then, underneath the hood there? 258 00:10:30,400 --> 00:10:33,750 Well, it turns out that even though we can simplify the command structure, 259 00:10:33,750 --> 00:10:35,640 it's actually doing quite a bit for us. 260 00:10:35,640 --> 00:10:37,470 And this is the process of compiling. 261 00:10:37,470 --> 00:10:41,310 But that was kind of an oversimplification, or, put more 262 00:10:41,310 --> 00:10:42,999 intelligently, like an abstraction. 263 00:10:42,999 --> 00:10:46,290 There's actually quite a few steps that go on underneath the hood, one of which 264 00:10:46,290 --> 00:10:50,100 is called preprocessing and compiling and assembling and linking. 265 00:10:50,100 --> 00:10:52,002 So let's do a quick dive-down here. 
266 00:10:52,002 --> 00:10:54,960 But then we'll abstract away, just so that you've seen what's going on. 267 00:10:54,960 --> 00:10:58,690 But henceforth, we can just take for granted that all of this is happening. 268 00:10:58,690 --> 00:11:01,500 So here is source code, same program as the simplest 269 00:11:01,500 --> 00:11:03,060 version we had a moment ago. 270 00:11:03,060 --> 00:11:05,370 And ultimately, I need to get this to machine code. 271 00:11:05,370 --> 00:11:09,150 Well, let's see if we can't just visualize how we get from point A to B 272 00:11:09,150 --> 00:11:12,600 without completely abstracting it away with just those big arrows. 273 00:11:12,600 --> 00:11:14,300 So this is my source code. 274 00:11:14,300 --> 00:11:17,100 And it turns out that the very first step 275 00:11:17,100 --> 00:11:21,690 of turning source code into machine code in the world of C 276 00:11:21,690 --> 00:11:23,760 is you first run what's called a preprocessor. 277 00:11:23,760 --> 00:11:25,509 You don't do this explicitly, although you 278 00:11:25,509 --> 00:11:28,260 could if you were really low-level and interested in it. 279 00:11:28,260 --> 00:11:30,690 But what the preprocessor does, essentially, 280 00:11:30,690 --> 00:11:35,010 is anytime there's a line of code that starts with a pound sign, 281 00:11:35,010 --> 00:11:39,540 or a hashtag these days, that's a special command that gets, essentially, 282 00:11:39,540 --> 00:11:42,420 replaced with the contents of the file, at least in this case. 283 00:11:42,420 --> 00:11:46,860 So somewhere on the IDE is a file called, literally, "stdio.h." 284 00:11:46,860 --> 00:11:52,900 And so #include means go get that file and essentially copy and paste it here. 285 00:11:52,900 --> 00:11:55,320 And so when you preprocess your code, this yellow line 286 00:11:55,320 --> 00:11:57,570 here becomes something like this. 287 00:11:57,570 --> 00:11:58,830 And I'm doing "..." 
288 00:11:58,830 --> 00:12:01,080 it's dozens if not hundreds of lines long. 289 00:12:01,080 --> 00:12:04,230 But there's one juicy line in it which is the little clue 290 00:12:04,230 --> 00:12:07,630 to clang that printf shall exist. 291 00:12:07,630 --> 00:12:09,540 And that's why you need stdio.h. 292 00:12:09,540 --> 00:12:12,750 So that's essentially, for our purposes today, all the preprocessor does. 293 00:12:12,750 --> 00:12:15,514 It does these kind of find and replace style operations 294 00:12:15,514 --> 00:12:17,430 so that now your file, without you knowing it, 295 00:12:17,430 --> 00:12:20,096 suddenly became much bigger because it's got other lines of code 296 00:12:20,096 --> 00:12:21,330 that someone else wrote. 297 00:12:21,330 --> 00:12:24,870 And then your code remains right there as it was. 298 00:12:24,870 --> 00:12:31,290 But the next step after preprocessing is something called compiling itself, 299 00:12:31,290 --> 00:12:34,410 which technically, the compiler, if we really 300 00:12:34,410 --> 00:12:36,720 want to be nit-picky and look at its formal definition, 301 00:12:36,720 --> 00:12:41,400 is actually taking these yellow lines, your source code and someone else's, 302 00:12:41,400 --> 00:12:44,820 perhaps, and converting it into something called assembly code. 303 00:12:44,820 --> 00:12:47,640 And this is a language that humans kind of sort of still 304 00:12:47,640 --> 00:12:50,820 do, but back in the day really did program in. 
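The preprocessing step described above can be imitated by hand. This is a minimal sketch, assuming the only line of stdio.h that the hello program truly depends on is a declaration of printf. Instead of using #include, the block writes that one declaration itself, which is roughly what the compiler sees after the header has been pasted in (the real header is far longer). The helper name say_hello is made up for this example.

```c
// Hand-written stand-in for the one "juicy" line of stdio.h:
// the clue to the compiler that printf shall exist.
int printf(const char *format, ...);

// say_hello is a hypothetical helper name used only in this sketch.
// It prints the greeting and passes along printf's return value,
// which is the number of characters printed.
int say_hello(void)
{
    return printf("hello, world\n");
}
```

In everyday code the #include line is still the right choice; the point here is only that the header contributes declarations, not the implementation of printf itself.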
305 00:12:50,820 --> 00:12:54,510 And in fact, if you have a computer with an Intel CPU, a brain made 306 00:12:54,510 --> 00:13:00,360 by Intel inside of your computer, there was and still is a big user's manual 307 00:13:00,360 --> 00:13:04,080 that tells programmers around the world that this Intel CPU understands 308 00:13:04,080 --> 00:13:07,650 the following instructions-- add, subtract, multiply, divide, 309 00:13:07,650 --> 00:13:11,250 all the basics, and then things like move numbers from here 310 00:13:11,250 --> 00:13:13,950 to here, read numbers from here to here, just move stuff 311 00:13:13,950 --> 00:13:15,430 around in the computer's memory. 312 00:13:15,430 --> 00:13:17,850 And so even though this really looks cryptic even 313 00:13:17,850 --> 00:13:21,460 to me, since I am by no means an expert at assembly language, 314 00:13:21,460 --> 00:13:24,510 certainly, all these years later, you can see words that kind of sound 315 00:13:24,510 --> 00:13:29,910 familiar, like "mov" suggests moving a value from one location to another. 316 00:13:29,910 --> 00:13:33,406 "sub" alludes to subtraction, so subtracting one number from another. 317 00:13:33,406 --> 00:13:35,530 And without really thinking this through carefully, 318 00:13:35,530 --> 00:13:37,630 I'm not really sure what's going on yet. 319 00:13:37,630 --> 00:13:40,680 But I do see a familiar word down there called "printf." 320 00:13:40,680 --> 00:13:44,040 And so long story short, what the computer or compiler specifically 321 00:13:44,040 --> 00:13:48,090 has done is it's taken my more user-friendly C code, converted it 322 00:13:48,090 --> 00:13:51,000 to something that's a little closer to what the machine understands. 323 00:13:51,000 --> 00:13:52,590 But it's not there yet because the machine only 324 00:13:52,590 --> 00:13:53,714 understands zeros and ones. 325 00:13:53,714 --> 00:13:56,240 So there's another step called assembling. 
326 00:13:56,240 --> 00:13:59,190 And the assembling process simply takes assembly code 327 00:13:59,190 --> 00:14:00,720 and converts it to zeros and ones. 328 00:14:00,720 --> 00:14:02,880 Now we're down to the zeros and ones. 329 00:14:02,880 --> 00:14:07,020 And what's amazing-- if it's interesting in the first place-- is 330 00:14:07,020 --> 00:14:10,422 when you run clang and hit Enter, all of this is just happening instantly. 331 00:14:10,422 --> 00:14:12,630 And you're getting these zeros and ones, this output. 332 00:14:12,630 --> 00:14:15,630 But I've left room on the other side because all we've done 333 00:14:15,630 --> 00:14:21,000 is convert my code from source code to assembly code to machine code. 334 00:14:21,000 --> 00:14:28,140 What needs to now be merged in, so to speak, for that "Hello World" program? 335 00:14:28,140 --> 00:14:32,432 Yeah, so still need, like, stdio, the standard I/O library that has printf. 336 00:14:32,432 --> 00:14:34,890 So the next step is to take a whole bunch of zeros and ones 337 00:14:34,890 --> 00:14:37,290 from somewhere else on the system, combine them 338 00:14:37,290 --> 00:14:41,490 until this is the file containing a.out or hello, 339 00:14:41,490 --> 00:14:42,900 whatever you called your program. 340 00:14:42,900 --> 00:14:47,230 And that, ultimately, is what the computer understands. 341 00:14:47,230 --> 00:14:49,560 So that is a very low-level detail. 342 00:14:49,560 --> 00:14:52,119 Thankfully, we learned in the very first lecture 343 00:14:52,119 --> 00:14:55,410 this notion of abstraction, which means even as you dive in underneath the hood 344 00:14:55,410 --> 00:14:57,510 and sort of understand how we're building up, 345 00:14:57,510 --> 00:15:01,680 now, henceforth-- literally every minute hereafter-- that whole mouthful just 346 00:15:01,680 --> 00:15:03,030 becomes compiling. 
347 00:15:03,030 --> 00:15:05,760 And indeed, that's what most people in the programming business 348 00:15:05,760 --> 00:15:07,990 refer to as compiling, is all of those several steps. 349 00:15:07,990 --> 00:15:09,323 But that's all that's happening. 350 00:15:09,323 --> 00:15:12,420 Feels like magic, but it's just one step after another 351 00:15:12,420 --> 00:15:14,400 gets us closer to our goal. 352 00:15:14,400 --> 00:15:17,046 353 00:15:17,046 --> 00:15:18,465 Questions? 354 00:15:18,465 --> 00:15:20,320 It's about as low-level as we'll get. 355 00:15:20,320 --> 00:15:22,024 Yeah? 356 00:15:22,024 --> 00:15:26,892 AUDIENCE: Why do you have to go through the assembly code and then the machine 357 00:15:26,892 --> 00:15:27,392 code? 358 00:15:27,392 --> 00:15:28,860 Why not just go straight to machine code? 359 00:15:28,860 --> 00:15:29,400 DAVID MALAN: Good question. 360 00:15:29,400 --> 00:15:32,220 Why do you have to go from one step to another, like from source 361 00:15:32,220 --> 00:15:33,803 code to assembly code to machine code? 362 00:15:33,803 --> 00:15:34,950 You absolutely could. 363 00:15:34,950 --> 00:15:38,310 It just happens to be the case that there's lots of humans in the world 364 00:15:38,310 --> 00:15:40,410 and lots of people working on different projects. 365 00:15:40,410 --> 00:15:42,600 And this notion of layering your software 366 00:15:42,600 --> 00:15:45,420 on top of someone else's on top of others' allows 367 00:15:45,420 --> 00:15:51,571 us to build more complex systems much more cleanly, if you will. 368 00:15:51,571 --> 00:15:53,820 And there's different types of computers in the world. 369 00:15:53,820 --> 00:15:54,540 There's Macs. 370 00:15:54,540 --> 00:15:56,100 There's PCs, which even though these days, 371 00:15:56,100 --> 00:15:58,058 they're a lot more similar underneath the hood, 372 00:15:58,058 --> 00:16:01,140 literally, than they used to be back in the day, there's different CPUs. 
373 00:16:01,140 --> 00:16:03,360 There's phones that have very different CPUs. 374 00:16:03,360 --> 00:16:07,290 And wouldn't it be nice if I could write my programs in one language 375 00:16:07,290 --> 00:16:11,550 and compile them into zeros and ones that do work on a Mac and on a PC 376 00:16:11,550 --> 00:16:13,870 and on an Android phone and an iPhone and so forth? 377 00:16:13,870 --> 00:16:16,230 And that's why by having these sort of different layers, 378 00:16:16,230 --> 00:16:20,202 one set of humans or one person can implement the process of converting C 379 00:16:20,202 --> 00:16:20,910 to assembly code. 380 00:16:20,910 --> 00:16:23,940 Then someone else can take it to the zeros and ones, in some sense. 381 00:16:23,940 --> 00:16:26,070 Or even-- there's even intermediate steps. 382 00:16:26,070 --> 00:16:29,109 Compilers have front ends and back ends and all of this complexity. 383 00:16:29,109 --> 00:16:30,900 But it gives us advantages because it means 384 00:16:30,900 --> 00:16:34,920 we can sort of decide which types of hardware to support more easily. 385 00:16:34,920 --> 00:16:36,750 Really good question. 386 00:16:36,750 --> 00:16:38,501 Other questions? 387 00:16:38,501 --> 00:16:39,000 OK. 388 00:16:39,000 --> 00:16:42,780 So with that said, let's now consider any number 389 00:16:42,780 --> 00:16:45,580 of ways in which things can go wrong. 390 00:16:45,580 --> 00:16:48,234 It's easy for me, certainly, to write "hello, world," 391 00:16:48,234 --> 00:16:49,650 and everything just kind of works. 392 00:16:49,650 --> 00:16:51,941 And even when it doesn't, I quickly know how to fix it. 393 00:16:51,941 --> 00:16:53,821 And it's only from experience and practice. 
394 00:16:53,821 --> 00:16:56,820 But let me just give you a teaser not just of help50 but two other tools 395 00:16:56,820 --> 00:17:00,420 that you'll see, particularly for the problem sets, that will not necessarily 396 00:17:00,420 --> 00:17:03,680 teach you how to write good code-- good, efficient code. 397 00:17:03,680 --> 00:17:06,930 That's where the humans come in and the teaching fellows' feedback and sections 398 00:17:06,930 --> 00:17:08,140 and office hours and more-- 399 00:17:08,140 --> 00:17:10,890 but at least to write correct code that meets our specifications 400 00:17:10,890 --> 00:17:12,970 and that's well-styled, at least looks good. 401 00:17:12,970 --> 00:17:18,599 But the third ingredient, recall, besides correctness and style 402 00:17:18,599 --> 00:17:20,550 is gonna be design, which is something we'll 403 00:17:20,550 --> 00:17:23,040 learn after practice and examples. 404 00:17:23,040 --> 00:17:26,160 So with check50, this is a tool that comes 405 00:17:26,160 --> 00:17:30,990 in CS50 IDE, recall, if unfamiliar, that allows me to essentially do this. 406 00:17:30,990 --> 00:17:33,360 Let me whittle this back down to my simplest 407 00:17:33,360 --> 00:17:35,250 hello, world program like this. 408 00:17:35,250 --> 00:17:37,440 I no longer need the CS50 library. 409 00:17:37,440 --> 00:17:39,090 I can run make hello. 410 00:17:39,090 --> 00:17:39,859 Seems to work. 411 00:17:39,859 --> 00:17:42,150 And how do you go about testing your programs if you've 412 00:17:42,150 --> 00:17:43,520 written this for a problem set? 413 00:17:43,520 --> 00:17:45,936 Well, the easiest and most straightforward way, of course, 414 00:17:45,936 --> 00:17:47,820 is just run it. 415 00:17:47,820 --> 00:17:49,350 Looks like it's correct. 416 00:17:49,350 --> 00:17:50,042 And it is. 417 00:17:50,042 --> 00:17:52,500 And there's not too much that can go wrong in this program.
418 00:17:52,500 --> 00:17:54,249 But soon, you'll see, with problem set one 419 00:17:54,249 --> 00:17:56,100 and beyond, anytime you start getting input 420 00:17:56,100 --> 00:17:58,680 from the user where he or she has to type their name 421 00:17:58,680 --> 00:18:02,310 or a number or other things, you can absolutely concoct scenarios 422 00:18:02,310 --> 00:18:03,600 where something goes wrong. 423 00:18:03,600 --> 00:18:07,750 But if you run a command in this case like check50, we can do the following. 424 00:18:07,750 --> 00:18:11,760 Let me go ahead and first make a directory called-- 425 00:18:11,760 --> 00:18:15,200 let me go ahead and do this and do mkdir-- 426 00:18:15,200 --> 00:18:16,530 for make directory-- hello. 427 00:18:16,530 --> 00:18:18,280 And then we didn't see this the other day. 428 00:18:18,280 --> 00:18:20,529 And you'll see more of this in today's super sections, 429 00:18:20,529 --> 00:18:22,980 or classwide sections, which will also be filmed. 430 00:18:22,980 --> 00:18:25,910 I'm just gonna move this file into a directory called hello. 431 00:18:25,910 --> 00:18:28,410 So that's like on a Mac or PC just dragging and dropping it. 432 00:18:28,410 --> 00:18:30,000 But I'm doing it with my keyboard. 433 00:18:30,000 --> 00:18:33,360 What's the command to change into another directory? cd. 434 00:18:33,360 --> 00:18:35,430 So that's like double-clicking on a directory, 435 00:18:35,430 --> 00:18:37,560 albeit with my keystrokes only. 436 00:18:37,560 --> 00:18:41,040 And now I'm gonna go ahead and run this. 437 00:18:41,040 --> 00:18:43,020 I can run make hello again. 438 00:18:43,020 --> 00:18:43,830 Seems to work. 439 00:18:43,830 --> 00:18:45,630 And I can run ./hello. 440 00:18:45,630 --> 00:18:46,330 Seems to work. 441 00:18:46,330 --> 00:18:48,450 But now let's see if CS50 agrees. 442 00:18:48,450 --> 00:18:49,800 So check50.
443 00:18:49,800 --> 00:18:54,240 And then I'm gonna type "cs50/2017/fall/hello," 444 00:18:54,240 --> 00:18:56,940 which looks like a bunch of folders, but it's not. 445 00:18:56,940 --> 00:19:00,060 It's just a unique identifier that has sort of some hierarchy to it. 446 00:19:00,060 --> 00:19:03,240 You would only know to type this by reading problem set 447 00:19:03,240 --> 00:19:05,160 specification online. 448 00:19:05,160 --> 00:19:07,740 And what this is gonna do, if you haven't seen it already, 449 00:19:07,740 --> 00:19:10,510 is actually connect to CS50's server. 450 00:19:10,510 --> 00:19:12,810 It's gonna authenticate you, if you haven't already. 451 00:19:12,810 --> 00:19:18,090 I'm gonna go ahead and log in as student50. 452 00:19:18,090 --> 00:19:20,104 And now hereafter it will remember my password, 453 00:19:20,104 --> 00:19:22,020 for at least some amount of time, so you don't 454 00:19:22,020 --> 00:19:23,730 have to type it in every darn time. 455 00:19:23,730 --> 00:19:25,080 Then it's preparing. 456 00:19:25,080 --> 00:19:26,100 It's uploading. 457 00:19:26,100 --> 00:19:28,230 And what's happening now is my "hello.c" file 458 00:19:28,230 --> 00:19:30,510 is somewhere in the cloud on CS50 servers. 459 00:19:30,510 --> 00:19:33,360 We are running the checks, the tests that the staff wrote. 460 00:19:33,360 --> 00:19:37,097 And hopefully, I'm gonna see a whole bunch of green smiley faces 461 00:19:37,097 --> 00:19:38,930 that look a little yellow on this projector, 462 00:19:38,930 --> 00:19:40,740 but those are, in fact, green smiley faces 463 00:19:40,740 --> 00:19:43,990 instead of frowny faces, which would suggest something is wrong. 464 00:19:43,990 --> 00:19:44,820 So that's all good. 465 00:19:44,820 --> 00:19:47,280 And don't be discouraged if you see a few frowny faces 466 00:19:47,280 --> 00:19:50,700 or a few flat, confused faces if something else is awry. 467 00:19:50,700 --> 00:19:52,890 But style50 does something different. 
468 00:19:52,890 --> 00:19:55,080 Right now, the style of my code, I'd argue, 469 00:19:55,080 --> 00:19:58,740 looks pretty good because it's kind of hard to go wrong when it's this short. 470 00:19:58,740 --> 00:19:59,820 But we'll see a way. 471 00:19:59,820 --> 00:20:03,960 And if I instead run "style50 hello.c," just the name of the file I want 472 00:20:03,960 --> 00:20:04,560 to check-- 473 00:20:04,560 --> 00:20:06,750 looks good, but consider adding more comments. 474 00:20:06,750 --> 00:20:09,875 And that's pretty compelling because there's zero at the moment. 475 00:20:09,875 --> 00:20:12,000 And so what kind of comments might you want to add? 476 00:20:12,000 --> 00:20:19,500 Well, in this program, it's not that compelling to add that many comments 477 00:20:19,500 --> 00:20:23,190 because the reality is this program's so short it probably takes me 478 00:20:23,190 --> 00:20:25,116 less time to read the code than the comments. 479 00:20:25,116 --> 00:20:27,240 But it's very common, as you'll see in the examples 480 00:20:27,240 --> 00:20:30,150 from lecture, to do something like this-- 481 00:20:30,150 --> 00:20:33,752 "says hello to user," just a quick one-line summary so that 482 00:20:33,752 --> 00:20:36,460 when you're skimming the file or looking at the code, OK, got it. 483 00:20:36,460 --> 00:20:37,376 I know what this does. 484 00:20:37,376 --> 00:20:40,379 And if I care to know how it does that, then I can read the code. 485 00:20:40,379 --> 00:20:41,670 And so that would be a comment. 486 00:20:41,670 --> 00:20:46,410 And that will probably make style50 happy in this case. 487 00:20:46,410 --> 00:20:49,230 But what if I'm getting a little sloppy? 488 00:20:49,230 --> 00:20:53,750 And I remember vaguely that I was in the habit of hitting Tab or the space bar 489 00:20:53,750 --> 00:20:54,270 in lecture. 490 00:20:54,270 --> 00:20:57,441 But I can't be bothered to do that when I'm working on my problem set.
491 00:20:57,441 --> 00:20:59,190 I just want to get the darn thing to work. 492 00:20:59,190 --> 00:21:02,880 It's not uncommon for code to eventually start to look like this, 493 00:21:02,880 --> 00:21:04,890 even though this, too, is a simple program. 494 00:21:04,890 --> 00:21:07,950 Now, good style, as you'll see and learn from practice, 495 00:21:07,950 --> 00:21:11,220 dictates that just like in Scratch there were those yellow puzzle 496 00:21:11,220 --> 00:21:14,910 pieces that kind of hug the code, similarly, inside of curly braces, 497 00:21:14,910 --> 00:21:16,890 you really should be indenting. 498 00:21:16,890 --> 00:21:22,065 And so if I go ahead and sort of forget that and now run style50 on "hello.c," 499 00:21:22,065 --> 00:21:25,440 I'll see my code outputted in the terminal 500 00:21:25,440 --> 00:21:27,150 window, the bottom of the screen. 501 00:21:27,150 --> 00:21:31,750 But green suggests hey, programmer, add the following characters. 502 00:21:31,750 --> 00:21:33,670 So green suggests add here. 503 00:21:33,670 --> 00:21:36,510 And if I go ahead now and reindent that by hitting 504 00:21:36,510 --> 00:21:39,960 Tab-- specifically four spaces, which is a human convention-- 505 00:21:39,960 --> 00:21:41,280 it should make it happy again. 506 00:21:41,280 --> 00:21:43,071 We can go in the reverse direction, though. 507 00:21:43,071 --> 00:21:46,010 Suppose that I got a little confused as to what I actually 508 00:21:46,010 --> 00:21:47,760 am supposed to indent-- and you might even 509 00:21:47,760 --> 00:21:49,551 see in textbooks and some online resources, 510 00:21:49,551 --> 00:21:51,900 some people write their code like this. 511 00:21:51,900 --> 00:21:54,922 Let me go ahead now and run style50 on this. 512 00:21:54,922 --> 00:21:56,130 It's gonna print out my code. 513 00:21:56,130 --> 00:21:59,190 And red in this case means remove those characters 514 00:21:59,190 --> 00:22:00,635 that you might not otherwise see.
515 00:22:00,635 --> 00:22:02,260 So it's not always going to be perfect. 516 00:22:02,260 --> 00:22:03,810 And especially when the programs get long, 517 00:22:03,810 --> 00:22:06,220 it might be a little nonobvious what changes you have to make. 518 00:22:06,220 --> 00:22:08,530 But just like with error messages, start at the top. 519 00:22:08,530 --> 00:22:10,230 Make one or few changes. 520 00:22:10,230 --> 00:22:13,386 Save it and rerun it, and see what the updated advice is. 521 00:22:13,386 --> 00:22:16,010 And I can't stress this enough, especially with problem set one 522 00:22:16,010 --> 00:22:18,510 and any problem set thereafter-- don't get 523 00:22:18,510 --> 00:22:20,820 into the habit of sitting down and trying 524 00:22:20,820 --> 00:22:23,020 to bite off the entirety of a problem. 525 00:22:23,020 --> 00:22:26,400 Odds are with Scratch, you didn't sit down and write the whole thing 526 00:22:26,400 --> 00:22:30,720 without once playing it or testing it or adding features to it. 527 00:22:30,720 --> 00:22:35,490 Don't get into that habit, then, in C. Take steps and steps, 528 00:22:35,490 --> 00:22:39,240 just as we've been doing with these examples so far. 529 00:22:39,240 --> 00:22:39,880 All right. 530 00:22:39,880 --> 00:22:42,770 Any questions, then, on those tools? 531 00:22:42,770 --> 00:22:45,900 And we'll come back in just a moment to more sophisticated debugging 532 00:22:45,900 --> 00:22:47,261 techniques. 533 00:22:47,261 --> 00:22:47,760 All right. 534 00:22:47,760 --> 00:22:52,770 So one of the problems that we were distracted by earlier 535 00:22:52,770 --> 00:22:54,810 is there's this old-school game, "Super Mario 536 00:22:54,810 --> 00:22:59,850 Bros.," wherein a character like this jumps around the screen quite a bit.
537 00:22:59,850 --> 00:23:01,887 And it's one from the very first Nintendo game, 538 00:23:01,887 --> 00:23:05,220 and there's lots of obstacles in the way of Mario as he's running left and right 539 00:23:05,220 --> 00:23:06,030 and jumping. 540 00:23:06,030 --> 00:23:08,010 And some of these obstacles can be represented 541 00:23:08,010 --> 00:23:12,420 with fairly simple constructs like bricks in this colorful world. 542 00:23:12,420 --> 00:23:15,810 And we can approximate this just by using characters on our screens, 543 00:23:15,810 --> 00:23:16,560 as well. 544 00:23:16,560 --> 00:23:19,200 So I actually poked around for far too much time last night 545 00:23:19,200 --> 00:23:21,960 looking at old "Super Mario Bros." maps, which if I had them in, 546 00:23:21,960 --> 00:23:24,900 like, the 1980s, would have made "Super Mario Bros." a lot easier. 547 00:23:24,900 --> 00:23:28,200 But people have captured all of the imagery from this game. 548 00:23:28,200 --> 00:23:31,734 And one snapshot from this game was a screen like this. 549 00:23:31,734 --> 00:23:33,900 So eventually, Mario's supposed to run through this. 550 00:23:33,900 --> 00:23:36,566 And he's supposed to bump his head up against the question marks 551 00:23:36,566 --> 00:23:37,870 and get coins and so forth. 552 00:23:37,870 --> 00:23:41,100 But for now, I'm gonna really, really simplify this and propose 553 00:23:41,100 --> 00:23:43,500 that all I care about for the sake of discussion 554 00:23:43,500 --> 00:23:45,840 is this line of question marks. 555 00:23:45,840 --> 00:23:49,290 How would a computer program, whether in "Super Mario Bros." 556 00:23:49,290 --> 00:23:53,670 or today here in Sanders, go about printing a line of question marks 557 00:23:53,670 --> 00:23:55,350 in a row like that? 558 00:23:55,350 --> 00:24:00,590 Well let me go ahead and open up CS50 IDE. 559 00:24:00,590 --> 00:24:03,570 I'm gonna go ahead and create a new file here. 
560 00:24:03,570 --> 00:24:07,770 And I'm just going to go ahead and call this, say, "mario0.c" 561 00:24:07,770 --> 00:24:11,310 because it's the first or the zero version of this program. 562 00:24:11,310 --> 00:24:13,590 And I just want to print, like, four question marks. 563 00:24:13,590 --> 00:24:15,090 So let me take a stab at this first. 564 00:24:15,090 --> 00:24:21,042 So #include stdio.h, which I think I need because why? 565 00:24:21,042 --> 00:24:21,930 AUDIENCE: Printf? 566 00:24:21,930 --> 00:24:22,860 DAVID MALAN: Yeah, I need printf. 567 00:24:22,860 --> 00:24:24,568 I need to be able to print the character. 568 00:24:24,568 --> 00:24:26,470 So int main void is what comes next. 569 00:24:26,470 --> 00:24:28,950 And we'll start to tease apart why that is today. 570 00:24:28,950 --> 00:24:31,080 And now I'm going to go ahead and print out "????." 571 00:24:31,080 --> 00:24:33,720 572 00:24:33,720 --> 00:24:34,731 And then semi-colon. 573 00:24:34,731 --> 00:24:35,230 All right. 574 00:24:35,230 --> 00:24:38,165 Let me go ahead now and make mario0. 575 00:24:38,165 --> 00:24:39,630 ./mario0. 576 00:24:39,630 --> 00:24:43,560 And I kind of sort of have a very ugly textual representation 577 00:24:43,560 --> 00:24:46,530 of a really fun-- at least 1980s style-- game. 578 00:24:46,530 --> 00:24:48,060 But there's a slight aesthetic bug. 579 00:24:48,060 --> 00:24:49,851 And I made this same mistake the other day. 580 00:24:49,851 --> 00:24:52,840 How do I move my cursor onto the next line? 581 00:24:52,840 --> 00:24:54,630 Yeah, so backslash n. 582 00:24:54,630 --> 00:24:58,344 So backslash is the one we're about to type, and forward slash or slash 583 00:24:58,344 --> 00:25:00,610 is what people would call just the other direction. 584 00:25:00,610 --> 00:25:02,050 So that's backslash n. 585 00:25:02,050 --> 00:25:05,880 And that's a special escape character, so to speak.
586 00:25:05,880 --> 00:25:09,430 For now, just know that this starts to confuse the computer 587 00:25:09,430 --> 00:25:11,530 if you just literally hit Enter. 588 00:25:11,530 --> 00:25:13,570 Now, your code's on two lines, when really it's 589 00:25:13,570 --> 00:25:15,590 just one idea or one function. 590 00:25:15,590 --> 00:25:19,311 So humans decided some time ago, let's just represent that special character 591 00:25:19,311 --> 00:25:21,310 that you would otherwise just hit on a keyboard. 592 00:25:21,310 --> 00:25:23,960 So now if I rerun make mario0. 593 00:25:23,960 --> 00:25:25,450 ./mario0. 594 00:25:25,450 --> 00:25:27,250 OK, now looks a little better. 595 00:25:27,250 --> 00:25:30,907 But we know from Scratch that we don't just need to do question mark, 596 00:25:30,907 --> 00:25:32,740 question mark, question mark, question mark, 597 00:25:32,740 --> 00:25:36,290 especially if I want even more coins to be available on the screen. 598 00:25:36,290 --> 00:25:39,180 What's the right programming construct to just give me more of these? 599 00:25:39,180 --> 00:25:39,980 AUDIENCE: A for loop. 600 00:25:39,980 --> 00:25:41,730 DAVID MALAN: Yeah, like a for loop, right? 601 00:25:41,730 --> 00:25:44,410 So let me go ahead and tweak this a little bit. 602 00:25:44,410 --> 00:25:48,910 Let me go ahead and in, let's say, "mario1.c"-- 603 00:25:48,910 --> 00:25:52,210 so "mario1.c"-- I'm going to instead do this. 604 00:25:52,210 --> 00:25:55,600 So for int-- to give me an integer-- 605 00:25:55,600 --> 00:25:59,650 i equals 0 by default, though it could be 1. 606 00:25:59,650 --> 00:26:02,176 But programmers tend to use 0. 607 00:26:02,176 --> 00:26:04,615 i is less than-- 608 00:26:04,615 --> 00:26:07,240 I'm not sure, so let's just put a big blank there for a moment. 609 00:26:07,240 --> 00:26:09,860 And then i plus plus, I remember, being the way to increment. 610 00:26:09,860 --> 00:26:15,280 And then inside of this loop, I'm going to do printf "?"
611 00:26:15,280 --> 00:26:16,900 semi-colon. 612 00:26:16,900 --> 00:26:18,910 And now let's answer this question. 613 00:26:18,910 --> 00:26:22,660 If this for loop, which, recall, has a very methodical process to it-- 614 00:26:22,660 --> 00:26:27,550 it initializes, checks the condition, does something, 615 00:26:27,550 --> 00:26:31,210 increments, checks the condition, repeat again and again. 616 00:26:31,210 --> 00:26:34,303 What number should I put on the otherwise blank line here? 617 00:26:34,303 --> 00:26:35,110 AUDIENCE: Four. 618 00:26:35,110 --> 00:26:35,984 DAVID MALAN: So four. 619 00:26:35,984 --> 00:26:40,720 But if I'm counting from 0 to 4, that feels like it's five numbers. 620 00:26:40,720 --> 00:26:44,050 So three might get me closer, but less than. 621 00:26:44,050 --> 00:26:45,880 We have this relational operator. 622 00:26:45,880 --> 00:26:48,430 Less than, could have been greater than in other contexts. 623 00:26:48,430 --> 00:26:50,680 So the less than actually saves us. 624 00:26:50,680 --> 00:26:55,090 If I do four here, think about logically what's happening. 625 00:26:55,090 --> 00:26:57,640 i gets initialized to 0 for the first time. 626 00:26:57,640 --> 00:26:59,470 And we get a question mark printed. 627 00:26:59,470 --> 00:27:03,610 Then it gets plus plussed, and so it becomes 1, which is less than 4. 628 00:27:03,610 --> 00:27:07,040 And so that's the first time I printed a question mark. 629 00:27:07,040 --> 00:27:08,800 Then i becomes 1 next. 630 00:27:08,800 --> 00:27:10,490 I print another one. 631 00:27:10,490 --> 00:27:13,600 i is now 1. 632 00:27:13,600 --> 00:27:14,770 I do another. 633 00:27:14,770 --> 00:27:15,560 i is now 2. 634 00:27:15,560 --> 00:27:16,690 I do another. 635 00:27:16,690 --> 00:27:19,510 i is now 3, which is not consistent with the number of fingers 636 00:27:19,510 --> 00:27:21,550 I'm holding up because I started at 0.
637 00:27:21,550 --> 00:27:26,830 But once i becomes 3, and therefore I've already printed my four question marks, 638 00:27:26,830 --> 00:27:30,340 the next value i is gonna take on is 4 itself. 639 00:27:30,340 --> 00:27:32,210 Is 4 less than 4? 640 00:27:32,210 --> 00:27:34,930 No, so I never get a chance to print another question mark 641 00:27:34,930 --> 00:27:36,490 or put up another finger. 642 00:27:36,490 --> 00:27:39,670 And honestly, this is a waste of intellectual capacity 643 00:27:39,670 --> 00:27:42,220 to think through, OK, how many numbers are between 0 and 4? 644 00:27:42,220 --> 00:27:45,370 We could have-- like most of us in this room just think-- 645 00:27:45,370 --> 00:27:48,640 could have just done i is less than or equal to 4, 646 00:27:48,640 --> 00:27:50,410 and that, too, would have worked. 647 00:27:50,410 --> 00:27:51,910 This is even more clear, perhaps. 648 00:27:51,910 --> 00:27:56,140 You start at 1, and you count up to and through the number 4. 649 00:27:56,140 --> 00:27:58,330 And that will give me four fingers, as well. 650 00:27:58,330 --> 00:27:59,920 Why do we start counting at 0? 651 00:27:59,920 --> 00:28:02,230 It's kind of just because, but more technically it's 652 00:28:02,230 --> 00:28:06,790 because it's easy to start counting with all 0 bits per our first lecture. 653 00:28:06,790 --> 00:28:07,780 So it's just a habit. 654 00:28:07,780 --> 00:28:09,910 And it's fine if you're more comfortable this way. 655 00:28:09,910 --> 00:28:12,400 But before long, get into this habit just 656 00:28:12,400 --> 00:28:15,250 because everyone else does it this way. 657 00:28:15,250 --> 00:28:17,740 OK, so now let's go ahead and print this out. 658 00:28:17,740 --> 00:28:20,570 Make mario1. 659 00:28:20,570 --> 00:28:22,630 ./mario1. 660 00:28:22,630 --> 00:28:25,480 Ah, still that bug. 661 00:28:25,480 --> 00:28:28,810 OK, is this gonna fix this? 662 00:28:28,810 --> 00:28:29,320 Why not? 
663 00:28:29,320 --> 00:28:31,840 664 00:28:31,840 --> 00:28:34,930 Yeah, that's gonna do question mark, new line, question mark, new line. 665 00:28:34,930 --> 00:28:35,710 That's not right. 666 00:28:35,710 --> 00:28:39,340 So what line number should the backslash n really go on or between? 667 00:28:39,340 --> 00:28:41,261 AUDIENCE: It should go outside the for loop. 668 00:28:41,261 --> 00:28:43,510 DAVID MALAN: OK, so it should go outside the for loop. 669 00:28:43,510 --> 00:28:45,030 So specifically-- I saw a hand in back, too. 670 00:28:45,030 --> 00:28:45,940 What line number? 671 00:28:45,940 --> 00:28:46,440 Yeah. 672 00:28:46,440 --> 00:28:47,120 AUDIENCE: Eight and nine? 673 00:28:47,120 --> 00:28:48,760 DAVID MALAN: Yeah, so between eight and nine. 674 00:28:48,760 --> 00:28:49,930 There's no room there at the moment. 675 00:28:49,930 --> 00:28:50,770 That's no big deal. 676 00:28:50,770 --> 00:28:52,360 We'll just hit Enter, printf. 677 00:28:52,360 --> 00:28:54,899 And I can certainly just do a single backslash n, 678 00:28:54,899 --> 00:28:56,440 even with no words to the left of it. 679 00:28:56,440 --> 00:28:57,490 That, too, is OK. 680 00:28:57,490 --> 00:28:58,480 Let me recompile this. 681 00:28:58,480 --> 00:29:01,450 And honestly, if you get bored retyping the same commands, 682 00:29:01,450 --> 00:29:04,110 know that you can also hit up and down on your keyboard, 683 00:29:04,110 --> 00:29:06,290 and it will go through your history, so to speak. 684 00:29:06,290 --> 00:29:08,510 And that, too, over time will start to save you time. 685 00:29:08,510 --> 00:29:10,080 So there's make mario1. 686 00:29:10,080 --> 00:29:10,750 Enter. 687 00:29:10,750 --> 00:29:15,354 ./mario1, or I could just scroll back up as I did before. 688 00:29:15,354 --> 00:29:17,020 And now I get those four question marks. 689 00:29:17,020 --> 00:29:20,930 But now let's actually create this a little more interestingly, as follows. 
690 00:29:20,930 --> 00:29:24,070 Let me go ahead and not just hard-code 4 into this program. 691 00:29:24,070 --> 00:29:28,390 Let me make one more version of Mario, call it mario2.c, 692 00:29:28,390 --> 00:29:30,820 and this time actually get some user input. 693 00:29:30,820 --> 00:29:34,420 How about I do int n because n is like a number, 694 00:29:34,420 --> 00:29:36,910 and it's just common convention to call it that. 695 00:29:36,910 --> 00:29:41,890 get_int, and then I can say number, semi-colon. 696 00:29:41,890 --> 00:29:44,560 And now instead of hard-coding 4, why don't I just put 697 00:29:44,560 --> 00:29:46,930 n there, which I can certainly do? 698 00:29:46,930 --> 00:29:50,110 So let me go ahead now and run make mario2. 699 00:29:50,110 --> 00:29:52,210 Uh-oh. 700 00:29:52,210 --> 00:29:53,780 Error. 701 00:29:53,780 --> 00:29:56,740 Yeah, I forgot cs50.h. 702 00:29:56,740 --> 00:29:58,300 So I have to go back up here. 703 00:29:58,300 --> 00:30:02,991 I'll just do a quick copy-paste, and then change the word, cs50.h. 704 00:30:02,991 --> 00:30:04,240 Now I'm gonna clear my screen. 705 00:30:04,240 --> 00:30:06,239 And to clear your screen, you can hit Control-L, 706 00:30:06,239 --> 00:30:09,580 for instance, which will just keep fewer characters on the screen for us. 707 00:30:09,580 --> 00:30:10,530 Make mario2. 708 00:30:10,530 --> 00:30:11,110 That worked. 709 00:30:11,110 --> 00:30:14,915 I didn't need to worry about the -lcs50 because, again, make does that for me. 710 00:30:14,915 --> 00:30:16,040 That's one of the features. 711 00:30:16,040 --> 00:30:18,070 And now I can do ./mario2. 712 00:30:18,070 --> 00:30:18,580 Number. 713 00:30:18,580 --> 00:30:20,039 How many question marks do we want? 714 00:30:20,039 --> 00:30:20,704 AUDIENCE: Seven. 715 00:30:20,704 --> 00:30:22,120 DAVID MALAN: I heard seven first. 716 00:30:22,120 --> 00:30:23,830 And now we have seven question marks.
717 00:30:23,830 --> 00:30:25,871 And it's not necessarily gonna look super pretty. 718 00:30:25,871 --> 00:30:28,390 If I do 700, now I'm gonna get a whole lot. 719 00:30:28,390 --> 00:30:30,580 But look how quickly it did that for me. 720 00:30:30,580 --> 00:30:32,870 And so we have this power now of loops. 721 00:30:32,870 --> 00:30:36,940 So that's good, but you know what? 722 00:30:36,940 --> 00:30:38,330 Let's see, what about this? 723 00:30:38,330 --> 00:30:43,210 What about -50 question marks? 724 00:30:43,210 --> 00:30:44,540 Is -50 an int? 725 00:30:44,540 --> 00:30:45,359 AUDIENCE: Yes. 726 00:30:45,359 --> 00:30:46,150 DAVID MALAN: It is. 727 00:30:46,150 --> 00:30:50,980 So we will get it for you via get_int. That's 728 00:30:50,980 --> 00:30:52,510 not really logically what we want. 729 00:30:52,510 --> 00:30:53,260 So think about it. 730 00:30:53,260 --> 00:31:00,744 On line seven if n equals -50, how many times will the for loop execute? 731 00:31:00,744 --> 00:31:02,360 AUDIENCE: None. 732 00:31:02,360 --> 00:31:04,102 DAVID MALAN: Why none? 733 00:31:04,102 --> 00:31:05,560 AUDIENCE: Because 0's greater than. 734 00:31:05,560 --> 00:31:08,960 DAVID MALAN: Yeah, because 0 is in this case greater than -50. 735 00:31:08,960 --> 00:31:12,640 So that condition never lets the loop actually proceed logically. 736 00:31:12,640 --> 00:31:14,430 So we're kind of OK. 737 00:31:14,430 --> 00:31:15,430 Nothing seems to happen. 738 00:31:15,430 --> 00:31:17,230 I get this sort of ugly blank line. 739 00:31:17,230 --> 00:31:18,880 And maybe that's arguably a bug. 740 00:31:18,880 --> 00:31:22,510 But at least it didn't freak out the computer and just kind of print things 741 00:31:22,510 --> 00:31:25,750 infinitely many times, as could actually happen. 742 00:31:25,750 --> 00:31:29,500 So let me go ahead and-- actually, at the risk 743 00:31:29,500 --> 00:31:33,850 of losing control over my computer, let's go ahead and change the logic.
744 00:31:33,850 --> 00:31:36,470 Suppose I change the less than to a greater than. 745 00:31:36,470 --> 00:31:38,800 And we initialize n to -50. 746 00:31:38,800 --> 00:31:42,370 And now, is 0 greater than -50? 747 00:31:42,370 --> 00:31:43,012 AUDIENCE: Yes. 748 00:31:43,012 --> 00:31:43,720 DAVID MALAN: Yes. 749 00:31:43,720 --> 00:31:47,620 And it's gonna be that way for a really long time, most likely, 750 00:31:47,620 --> 00:31:49,120 even as you increment it. 751 00:31:49,120 --> 00:31:54,580 And so let me go ahead and do make mario2 and then hold my breath 752 00:31:54,580 --> 00:31:57,280 and do -50. 753 00:31:57,280 --> 00:32:00,337 And even the internet and the computer can't really keep up. 754 00:32:00,337 --> 00:32:02,920 And that's why you're just kind of seeing it bursty like this. 755 00:32:02,920 --> 00:32:06,370 We're sending thousands, tens of thousands, millions, ultimately, 756 00:32:06,370 --> 00:32:07,870 of question marks across the screen. 757 00:32:07,870 --> 00:32:09,760 And that, too, you might do accidentally. 758 00:32:09,760 --> 00:32:13,630 And so just as I did, you can hit the secret keystroke, which usually works, 759 00:32:13,630 --> 00:32:15,700 which is Control-C for cancel. 760 00:32:15,700 --> 00:32:18,471 And that will stop a program in the window from running. 761 00:32:18,471 --> 00:32:18,970 All right. 762 00:32:18,970 --> 00:32:20,980 So I've gone ahead now and implemented kind 763 00:32:20,980 --> 00:32:23,470 of a very weak approximation of this. 764 00:32:23,470 --> 00:32:24,220 So that's great. 765 00:32:24,220 --> 00:32:27,669 Let's now take a step up and consider not just this construct, 766 00:32:27,669 --> 00:32:30,460 but if we fast forward in the game, to this part of the screen, now 767 00:32:30,460 --> 00:32:32,360 maybe we have a vertical block, as well. 
768 00:32:32,360 --> 00:32:34,690 And let's just consider for a moment what about my code 769 00:32:34,690 --> 00:32:40,720 needs to change if I want to print three or maybe any number of vertical blocks? 770 00:32:40,720 --> 00:32:44,215 Fundamentally, how do I want to change the code? 771 00:32:44,215 --> 00:32:45,590 How do I want to change the code? 772 00:32:45,590 --> 00:32:46,090 Yeah. 773 00:32:46,090 --> 00:32:47,050 AUDIENCE: [INAUDIBLE] 774 00:32:47,050 --> 00:32:50,216 DAVID MALAN: Yeah, I just need a line break, where I accidentally almost did 775 00:32:50,216 --> 00:32:50,770 earlier. 776 00:32:50,770 --> 00:32:52,561 But in this case, it would be a good thing. 777 00:32:52,561 --> 00:32:55,640 So let me go over there, and let me just quickly make mario3 778 00:32:55,640 --> 00:32:56,890 by starting at the same point. 779 00:32:56,890 --> 00:32:59,080 So mario3.c. 780 00:32:59,080 --> 00:33:03,154 And then let me go down here and change this as follows. 781 00:33:03,154 --> 00:33:06,070 Let's make this i is less than n, the way it's supposed to be and just 782 00:33:06,070 --> 00:33:09,040 so that if we upload these later, I don't forget. 783 00:33:09,040 --> 00:33:12,130 And now I'll do a hashtag just because it looks more brick-like. 784 00:33:12,130 --> 00:33:13,960 It's not one of those coin things. 785 00:33:13,960 --> 00:33:15,280 And now I do this here. 786 00:33:15,280 --> 00:33:17,780 I don't think I need this anymore. 787 00:33:17,780 --> 00:33:21,100 So let me save that and run make mario3. 788 00:33:21,100 --> 00:33:26,809 Seems to be OK compiling-wise. mario3, number 3. 789 00:33:26,809 --> 00:33:27,850 And I get three of those. 790 00:33:27,850 --> 00:33:29,590 And now I could do maybe five of those. 791 00:33:29,590 --> 00:33:31,340 That works, and so forth. 792 00:33:31,340 --> 00:33:34,990 So there's still an opportunity for improvement here. 
793 00:33:34,990 --> 00:33:40,630 In case I want to pester the user to actually cooperate such 794 00:33:40,630 --> 00:33:44,044 that if he or she types in -50, I don't want to just quit. 795 00:33:44,044 --> 00:33:46,210 I want to yell at them or somehow give them feedback 796 00:33:46,210 --> 00:33:48,880 and say, give me what I asked for. 797 00:33:48,880 --> 00:33:51,940 How do I continue to pester a user again and again 798 00:33:51,940 --> 00:33:55,481 and again until he or she actually gives me the value I want? 799 00:33:55,481 --> 00:33:55,980 I'm sorry? 800 00:33:55,980 --> 00:33:56,969 AUDIENCE: While. 801 00:33:56,969 --> 00:33:57,760 DAVID MALAN: While. 802 00:33:57,760 --> 00:33:59,450 So there's different looping constructs. 803 00:33:59,450 --> 00:34:02,890 While-- and it turns out we could use while or even for, 804 00:34:02,890 --> 00:34:04,580 or there's another one, as well. 805 00:34:04,580 --> 00:34:06,350 And let's consider how we might do this. 806 00:34:06,350 --> 00:34:09,909 It turns out that when you want to get user input from someone, 807 00:34:09,909 --> 00:34:11,170 you could use for. 808 00:34:11,170 --> 00:34:12,326 You could use while. 809 00:34:12,326 --> 00:34:15,159 But you'll find that it's a little annoying to use those constructs. 810 00:34:15,159 --> 00:34:19,630 Let me just jump to the better way first so that we see one other way. 811 00:34:19,630 --> 00:34:24,310 It turns out if you in a program want to do something at least one time, 812 00:34:24,310 --> 00:34:28,150 and maybe some more times, you could use for or while. 813 00:34:28,150 --> 00:34:32,830 But it's actually a little more straightforward to literally just do it 814 00:34:32,830 --> 00:34:37,120 while something is true. 815 00:34:37,120 --> 00:34:38,650 Now, this is just a placeholder. 816 00:34:38,650 --> 00:34:40,580 Let me start to fill in some logic here. 817 00:34:40,580 --> 00:34:46,449 So I want to do the following-- do the following while what? 
818 00:34:46,449 --> 00:34:49,870 If the user does not give me a positive number, 819 00:34:49,870 --> 00:34:52,210 I want to prompt him or her again. 820 00:34:52,210 --> 00:34:56,440 The curly braces on lines seven and nine at the moment connote exactly that. 821 00:34:56,440 --> 00:35:02,590 Do this, do this, do this while line 10 is true. 822 00:35:02,590 --> 00:35:06,670 So what Boolean expression, if you will, do I want to type in the parentheses 823 00:35:06,670 --> 00:35:11,790 here on line 10 to express the fact keep doing this 824 00:35:11,790 --> 00:35:13,231 until the number is positive? 825 00:35:13,231 --> 00:35:13,730 Yeah? 826 00:35:13,730 --> 00:35:15,690 AUDIENCE: While n is greater than 0. 827 00:35:15,690 --> 00:35:21,680 DAVID MALAN: So while n is greater than 0, keep doing-- 828 00:35:21,680 --> 00:35:22,180 which one? 829 00:35:22,180 --> 00:35:23,130 AUDIENCE: Less than. 830 00:35:23,130 --> 00:35:24,421 DAVID MALAN: I heard less than. 831 00:35:24,421 --> 00:35:25,680 OK, so let's rethink this. 832 00:35:25,680 --> 00:35:30,627 So while n is less than 0, ask the user for a number. 833 00:35:30,627 --> 00:35:31,710 Ask the user for a number. 834 00:35:31,710 --> 00:35:32,640 Ask the user for a number. 835 00:35:32,640 --> 00:35:33,310 And you know what? 836 00:35:33,310 --> 00:35:35,435 This is going to just confuse the heck out of them. 837 00:35:35,435 --> 00:35:37,440 Let's be even more clear with our prompt. 838 00:35:37,440 --> 00:35:38,970 Give me a positive number. 839 00:35:38,970 --> 00:35:42,880 But keep prompting him or her until we actually get a positive number. 840 00:35:42,880 --> 00:35:46,026 Now, if we really want to be nit-picky, it's actually not even less than. 841 00:35:46,026 --> 00:35:46,650 We're so close. 842 00:35:46,650 --> 00:35:47,700 AUDIENCE: Less than or equal to. 843 00:35:47,700 --> 00:35:49,366 DAVID MALAN: It's less than or equal to. 
844 00:35:49,366 --> 00:35:52,620 Unfortunately, I don't really remember having a key on my keyboard that's got, 845 00:35:52,620 --> 00:35:55,260 like, an angled bracket and then a line under it 846 00:35:55,260 --> 00:35:56,880 like you might write in math class. 847 00:35:56,880 --> 00:35:59,370 So there's a way to do that nonetheless on your keyboard. 848 00:35:59,370 --> 00:36:01,420 I actually just do them side by side. 849 00:36:01,420 --> 00:36:03,360 This is less than or equal to. 850 00:36:03,360 --> 00:36:05,159 This would be greater than or equal to. 851 00:36:05,159 --> 00:36:07,950 And so now just get comfortable reading these things left to right. 852 00:36:07,950 --> 00:36:10,620 There's no special symbol like you would have in a math book or a homework 853 00:36:10,620 --> 00:36:11,726 assignment on paper. 854 00:36:11,726 --> 00:36:13,350 So this, I think, says the right thing. 855 00:36:13,350 --> 00:36:20,190 Do this while n is less than or equal to 0, which is, of course, not positive. 856 00:36:20,190 --> 00:36:24,070 And then down here, the rest of my code, I think, can stay the same. 857 00:36:24,070 --> 00:36:26,730 I just have a block of code up here now that's doing something. 858 00:36:26,730 --> 00:36:27,480 And you know what? 859 00:36:27,480 --> 00:36:30,390 This is where comments start to get useful. 860 00:36:30,390 --> 00:36:35,130 Prompt user for a positive number. 861 00:36:35,130 --> 00:36:40,830 And now down here, print out that many bricks. 862 00:36:40,830 --> 00:36:44,730 So it's kind of obvious if you just read through the code what I just said. 863 00:36:44,730 --> 00:36:47,310 But this helps you if you sort of sleep on it and wake up, 864 00:36:47,310 --> 00:36:48,480 and you want to remember, why did I do this? 865 00:36:48,480 --> 00:36:49,140 Why did I do that? 866 00:36:49,140 --> 00:36:51,300 It helps the reader of your code, a colleague, a teaching fellow, 867 00:36:51,300 --> 00:36:51,930 and so forth. 
868 00:36:51,930 --> 00:36:54,640 That's how you kind of start to add comments to your code. 869 00:36:54,640 --> 00:36:57,060 Unfortunately, there's a bug, and we're about to hit it. 870 00:36:57,060 --> 00:36:58,260 So let me try. 871 00:36:58,260 --> 00:37:01,590 Let me go ahead and make mario3. 872 00:37:01,590 --> 00:37:02,820 Oh, my god. 873 00:37:02,820 --> 00:37:05,190 More errors than I have lines of code, it seems. 874 00:37:05,190 --> 00:37:06,750 And this one's weird. 875 00:37:06,750 --> 00:37:09,660 Error-- unused variable n. 876 00:37:09,660 --> 00:37:11,952 And now let me dive in deeper to these error messages 877 00:37:11,952 --> 00:37:13,660 just so you start to notice little clues. 878 00:37:13,660 --> 00:37:16,830 So over here on the left is, of course, the filename, 879 00:37:16,830 --> 00:37:18,960 as you might have noticed-- mario3.c. Then 880 00:37:18,960 --> 00:37:22,020 there's a colon, and then a number, and then a colon and another number. 881 00:37:22,020 --> 00:37:27,000 Turns out this is just a very succinct way of saying that in mario3.c 882 00:37:27,000 --> 00:37:32,250 on line nine at character or column, left to right, 13, 883 00:37:32,250 --> 00:37:34,710 you've got a problem, at least the compiler thinks. 884 00:37:34,710 --> 00:37:36,630 So generally, the character is kind of sort of helpful. 885 00:37:36,630 --> 00:37:39,671 It's really the line number that draws your attention to the right place. 886 00:37:39,671 --> 00:37:42,360 Somehow, this is buggy. 887 00:37:42,360 --> 00:37:46,740 And specifically, the bug is that I have an unused variable n. 888 00:37:46,740 --> 00:37:52,940 And then very inexplicably, on line 11, now I have a use of n. 889 00:37:52,940 --> 00:37:54,840 So it's unused here, but it's used here. 890 00:37:54,840 --> 00:37:58,890 And somehow, the computer doesn't like this. 891 00:37:58,890 --> 00:38:01,450 Why might this be? 892 00:38:01,450 --> 00:38:02,154 Yeah.
893 00:38:02,154 --> 00:38:04,972 AUDIENCE: [INAUDIBLE] 894 00:38:04,972 --> 00:38:06,680 DAVID MALAN: Yeah, that's the trick here. 895 00:38:06,680 --> 00:38:09,420 So it's a little different from Scratch, where when you make a variable, 896 00:38:09,420 --> 00:38:11,003 you can just use it anywhere you want. 897 00:38:11,003 --> 00:38:13,820 In C and some other languages, variables only 898 00:38:13,820 --> 00:38:16,070 exist in what's called a certain scope. 899 00:38:16,070 --> 00:38:19,040 And a scope you can generally think of as just the most recently 900 00:38:19,040 --> 00:38:22,080 opened and closed curly braces. 901 00:38:22,080 --> 00:38:24,170 So what does that imply here? 902 00:38:24,170 --> 00:38:28,100 Well, on line nine, I am on the left-hand side declaring a variable. 903 00:38:28,100 --> 00:38:30,110 Hey, computer, give me a variable called n. 904 00:38:30,110 --> 00:38:31,380 And it's gonna store an int. 905 00:38:31,380 --> 00:38:32,780 That's the story we keep telling. 906 00:38:32,780 --> 00:38:37,880 But the problem is I am doing that in between lines 8 and 10, curly braces. 907 00:38:37,880 --> 00:38:39,890 And I claimed today that that means, kind of 908 00:38:39,890 --> 00:38:43,909 like Scratch has the puzzle pieces that hug, in C, variables 909 00:38:43,909 --> 00:38:45,200 are treated a little different. 910 00:38:45,200 --> 00:38:48,590 If you declare a variable in here, it only exists in here, 911 00:38:48,590 --> 00:38:51,435 and you can't use it down here in your code. 912 00:38:51,435 --> 00:38:54,270 And so this would kind of seem to be a catch-22. 913 00:38:54,270 --> 00:38:55,970 I need a variable. 914 00:38:55,970 --> 00:38:57,470 And so I can declare it. 915 00:38:57,470 --> 00:39:01,360 But I can't declare the variable there if I want to use it later. 916 00:39:01,360 --> 00:39:04,200 It doesn't really seem to be a good situation.
917 00:39:04,200 --> 00:39:06,980 So just logically, even if you've never programmed before, 918 00:39:06,980 --> 00:39:10,430 if the fundamental problem is that this variable exists only 919 00:39:10,430 --> 00:39:14,790 in that scope of the curly braces, how intuitively could we solve this? 920 00:39:14,790 --> 00:39:15,928 Yeah. 921 00:39:15,928 --> 00:39:17,260 AUDIENCE: Move the curly brace? 922 00:39:17,260 --> 00:39:18,770 DAVID MALAN: Remove the curly-- 923 00:39:18,770 --> 00:39:19,990 oh, move the curly braces. 924 00:39:19,990 --> 00:39:23,290 Yeah, we could move the curly braces, which is essentially the idea. 925 00:39:23,290 --> 00:39:25,780 The catch is the do-while loop really kind of needs them. 926 00:39:25,780 --> 00:39:28,720 At least in general cases, you need those curly braces. 927 00:39:28,720 --> 00:39:30,760 But you know what? 928 00:39:30,760 --> 00:39:32,650 It's been a while since I typed them. 929 00:39:32,650 --> 00:39:35,740 But I do have another pair of curly braces 930 00:39:35,740 --> 00:39:40,340 that are sort of outside of, so to speak, my inner curly braces. 931 00:39:40,340 --> 00:39:42,460 So I have another scope here that's essentially 932 00:39:42,460 --> 00:39:44,680 the whole function called main. 933 00:39:44,680 --> 00:39:47,410 So what if I somehow declare my variable out there? 934 00:39:47,410 --> 00:39:49,182 And indeed, I can. 935 00:39:49,182 --> 00:39:51,140 I can go to, like, line seven-- or even higher, 936 00:39:51,140 --> 00:39:52,630 but generally you want to keep it as close to where 937 00:39:52,630 --> 00:39:54,340 you care about it as possible. 938 00:39:54,340 --> 00:39:56,290 I can type int n. 939 00:39:56,290 --> 00:39:59,075 And I don't think I want to prompt the user here 940 00:39:59,075 --> 00:40:02,200 because then I'm going to create the same problem as before, where I'm just 941 00:40:02,200 --> 00:40:04,030 prompting him or her once.
942 00:40:04,030 --> 00:40:07,600 I want that prompt in a loop, again and again and again, potentially. 943 00:40:07,600 --> 00:40:08,340 So that's OK. 944 00:40:08,340 --> 00:40:11,050 We've not seen this before, but you can declare a variable, 945 00:40:11,050 --> 00:40:12,880 and then do nothing with it yet. 946 00:40:12,880 --> 00:40:14,901 Just say, hey, computer, give me a variable. 947 00:40:14,901 --> 00:40:16,900 I'll deal with this later, just like in Scratch. 948 00:40:16,900 --> 00:40:18,858 You declare a variable if you did, and then you 949 00:40:18,858 --> 00:40:20,700 deal with it later as you want. 950 00:40:20,700 --> 00:40:22,750 Now, this would be a bug still. 951 00:40:22,750 --> 00:40:25,270 I can't say, hey, computer, give me a variable n. 952 00:40:25,270 --> 00:40:28,330 And then, oh, by the way, give me another variable n. 953 00:40:28,330 --> 00:40:32,410 So all I have to do to fix this issue is just don't redeclare it. 954 00:40:32,410 --> 00:40:34,330 Just use it. 955 00:40:34,330 --> 00:40:37,180 So line seven says, hey, computer, give me a variable called 956 00:40:37,180 --> 00:40:38,860 n that's going to store an int. 957 00:40:38,860 --> 00:40:42,850 Line 10, same story as always except it's slightly shorter. 958 00:40:42,850 --> 00:40:45,220 On the left-hand side, it says, here's my variable. 959 00:40:45,220 --> 00:40:47,880 Right-hand side says, here's a value we got from the user. 960 00:40:47,880 --> 00:40:50,390 Put it from right to left. 961 00:40:50,390 --> 00:40:55,150 And so now because n is declared or created on line seven, 962 00:40:55,150 --> 00:40:58,860 it exists within the scope of these outermost curly braces. 963 00:40:58,860 --> 00:41:00,970 And now I can use it kind of anywhere I want, 964 00:41:00,970 --> 00:41:06,700 including on line 12, which is great, and, most importantly, on line 15. 965 00:41:06,700 --> 00:41:11,050 So let me go ahead and save this and do make mario3, hold my breath. 
966 00:41:11,050 --> 00:41:13,840 Good, it actually worked. ./mario3. 967 00:41:13,840 --> 00:41:14,670 Positive number. 968 00:41:14,670 --> 00:41:15,190 Nuh-uh. 969 00:41:15,190 --> 00:41:16,881 I'm gonna give you -1. 970 00:41:16,881 --> 00:41:17,380 OK. 971 00:41:17,380 --> 00:41:18,664 I'm gonna give you zero. 972 00:41:18,664 --> 00:41:19,330 All right, fine. 973 00:41:19,330 --> 00:41:20,530 I'll give you 3. 974 00:41:20,530 --> 00:41:22,360 And now it actually cooperates. 975 00:41:22,360 --> 00:41:24,880 And so the do-while construct is still a loop, 976 00:41:24,880 --> 00:41:27,130 and Scratch doesn't really have an analog of this. 977 00:41:27,130 --> 00:41:31,390 But the do-while loop is still a loop, but it does something at least once. 978 00:41:31,390 --> 00:41:34,570 The difference fundamentally, though, is this-- 979 00:41:34,570 --> 00:41:41,080 if I did this, like, while n is less than or equal to 0, 980 00:41:41,080 --> 00:41:45,420 if I change this to a while loop, which we saw ever so briefly the other day 981 00:41:45,420 --> 00:41:51,970 as just an analog of the forever block in Scratch, if I do this, 982 00:41:51,970 --> 00:41:53,860 there's kind of a logical problem. 983 00:41:53,860 --> 00:41:55,790 Here's n being declared on line seven. 984 00:41:55,790 --> 00:41:59,230 So we're avoiding the scope issue this time from the get-go. 985 00:41:59,230 --> 00:42:02,590 But line eight is saying while n is less than or equal to zero. 986 00:42:02,590 --> 00:42:05,230 But what is n at this point? 987 00:42:05,230 --> 00:42:06,612 It's not yet defined. 988 00:42:06,612 --> 00:42:08,320 And, in fact, as we'll soon see in class, 989 00:42:08,320 --> 00:42:11,230 it has some garbage value, typically, some unknown value, 990 00:42:11,230 --> 00:42:15,200 remnants of whatever the computer used that RAM or memory for in the past. 991 00:42:15,200 --> 00:42:18,422 So this is literally undefined behavior, it would seem. 
992 00:42:18,422 --> 00:42:20,380 I don't know if the loop's gonna execute or not 993 00:42:20,380 --> 00:42:21,910 because I don't know what's in n. 994 00:42:21,910 --> 00:42:23,830 So you could hack around this, so to speak. 995 00:42:23,830 --> 00:42:26,650 Hacking generally means kind of sort of figuring out 996 00:42:26,650 --> 00:42:28,990 a solution to a problem that might not be the cleanest. 997 00:42:28,990 --> 00:42:32,440 And OK, let me just initialize this to, like, -1,000 because I 998 00:42:32,440 --> 00:42:34,900 know that's less than or equal to zero. 999 00:42:34,900 --> 00:42:38,785 So it's a hack in that it fixes the logical problem because now 1000 00:42:38,785 --> 00:42:43,160 on line eight, is -1,000 less than or equal to 0? 1001 00:42:43,160 --> 00:42:43,660 It is. 1002 00:42:43,660 --> 00:42:45,550 So now my loop will execute at least once, 1003 00:42:45,550 --> 00:42:47,350 and it will then change the value of n. 1004 00:42:47,350 --> 00:42:50,140 But what the heck is -1,000 coming from? 1005 00:42:50,140 --> 00:42:54,440 These kind of inelegant solutions would be horrible, horrible design, 1006 00:42:54,440 --> 00:42:57,640 even though it logically gets the job done and it's correct. 1007 00:42:57,640 --> 00:42:58,780 Bad, bad design. 1008 00:42:58,780 --> 00:43:02,225 And so that's why we started with a better design, with a do-while loop. 1009 00:43:02,225 --> 00:43:04,600 But you'll find there's many different ways to do things. 1010 00:43:04,600 --> 00:43:06,433 And you might not, certainly, in problem set 1011 00:43:06,433 --> 00:43:09,460 one or two do things always the right or best way the first time. 1012 00:43:09,460 --> 00:43:11,680 And that's OK because with practice and experience, 1013 00:43:11,680 --> 00:43:14,530 you'll begin to see patterns with which you 1014 00:43:14,530 --> 00:43:17,370 can solve these same kinds of problems. 1015 00:43:17,370 --> 00:43:21,790 Any questions on these approximations of Mario? 
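[Editor's sketch] The do-while pattern and the scope fix discussed above can be put together as follows. This is a sketch in standard C, not the lecture's exact code. `read_int` is a hypothetical stand-in for CS50's `get_int`, and `get_positive_int` and the `FILE *in` parameter are assumptions added so the loop can be exercised without the CS50 library.

```c
#include <stdio.h>

// Hypothetical stand-in for CS50's get_int: reads one integer from `in`.
// Returns 1 on success, 0 if no integer could be read.
static int read_int(FILE *in, int *value)
{
    return fscanf(in, "%d", value) == 1;
}

// Keep prompting until the user supplies a positive number.
// Returns that number, or 0 if the input ends or is not a number.
int get_positive_int(FILE *in)
{
    // Declared outside the loop's curly braces, so n is still in scope
    // in the while condition and after the loop.
    int n;
    do
    {
        printf("Positive number: ");
        if (!read_int(in, &n))
        {
            return 0;
        }
    }
    while (n <= 0); // not positive yet, so ask again

    return n;
}
```

From `main`, `get_positive_int(stdin)` would prompt at least once. Because the body runs before the condition is checked, `n` never needs a placeholder value like -1,000.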
1016 00:43:21,790 --> 00:43:24,370 Well, let me do one last one, one last one 1017 00:43:24,370 --> 00:43:26,590 involving Mario, and kind of like this. 1018 00:43:26,590 --> 00:43:29,290 I spent way too much time looking for parts of Mario 1019 00:43:29,290 --> 00:43:30,790 that kind of painted these pictures. 1020 00:43:30,790 --> 00:43:35,320 And I found this, these additional bricks underground in the fire level. 1021 00:43:35,320 --> 00:43:38,650 And suppose I wanted to print, like, a cube of hashtags, 1022 00:43:38,650 --> 00:43:42,100 so not just a horizontal line, not just a vertical line, 1023 00:43:42,100 --> 00:43:43,954 but kind of sort of both together. 1024 00:43:43,954 --> 00:43:46,370 And indeed, you can think of these bricks as exactly that. 1025 00:43:46,370 --> 00:43:49,660 It's like hashtag, hashtag, hashtag, hashtag, hashtag, hashtag, hashtag, 1026 00:43:49,660 --> 00:43:52,330 hashtag, and so forth, kind of like an old-school typewriter, 1027 00:43:52,330 --> 00:43:54,070 printing one line at a time. 1028 00:43:54,070 --> 00:43:56,050 And if you even remember typewriters, you 1029 00:43:56,050 --> 00:43:59,520 can actually think of computers and printf as behaving very similarly. 1030 00:43:59,520 --> 00:44:02,020 You can print something, then do the backslash n. 1031 00:44:02,020 --> 00:44:04,180 Print something else, do the backslash n. 1032 00:44:04,180 --> 00:44:06,832 So what is a square like this on the screen? 1033 00:44:06,832 --> 00:44:09,040 Well, it's really just the process of, like, painting 1034 00:44:09,040 --> 00:44:11,710 the screen, if you will, from left to right, moving down, 1035 00:44:11,710 --> 00:44:13,270 left to right, moving down. 1036 00:44:13,270 --> 00:44:15,230 And how do we do this? 1037 00:44:15,230 --> 00:44:18,740 Well, what was the type of code we used in order 1038 00:44:18,740 --> 00:44:22,129 to do something again and again and again? 1039 00:44:22,129 --> 00:44:23,420 The for loop was the first one.
1040 00:44:23,420 --> 00:44:26,390 We could use other constructs, but I'm going to go ahead and use the for loop 1041 00:44:26,390 --> 00:44:26,889 again. 1042 00:44:26,889 --> 00:44:31,130 Let me save this as mario4, as our fourth and final example. 1043 00:44:31,130 --> 00:44:33,200 I'm gonna keep this code up here because I still 1044 00:44:33,200 --> 00:44:37,395 want to prompt the user for some number of blocks, a positive number at that. 1045 00:44:37,395 --> 00:44:40,520 And now I don't want to just do this, but let me just see where I left off. 1046 00:44:40,520 --> 00:44:42,900 Let me go ahead and make mario4. 1047 00:44:42,900 --> 00:44:44,290 ./mario4. 1048 00:44:44,290 --> 00:44:46,520 And let's do, like, a 5-by-5 block. 1049 00:44:46,520 --> 00:44:47,150 OK, that's not. 1050 00:44:47,150 --> 00:44:48,740 That's just a column. 1051 00:44:48,740 --> 00:44:50,690 So I've got to do a little more. 1052 00:44:50,690 --> 00:44:52,880 Well, it turns out that just like in Scratch, 1053 00:44:52,880 --> 00:44:55,940 I can take one idea and kind of nest it inside of another. 1054 00:44:55,940 --> 00:44:57,650 Let me go ahead and do this. 1055 00:44:57,650 --> 00:45:02,000 How about inside of my for loop that's going from i to n, 1056 00:45:02,000 --> 00:45:05,630 let me do another one for int-- and I don't 1057 00:45:05,630 --> 00:45:08,532 want to reuse i because I feel like if I use i in two places, 1058 00:45:08,532 --> 00:45:09,990 something's going to get messed up. 1059 00:45:09,990 --> 00:45:12,156 So I'm gonna go with the next one alphabetically, j, 1060 00:45:12,156 --> 00:45:13,560 which is actually pretty common. 1061 00:45:13,560 --> 00:45:15,610 So int j gets 0. 1062 00:45:15,610 --> 00:45:17,300 j is less than n. 1063 00:45:17,300 --> 00:45:18,920 And j plus plus. 1064 00:45:18,920 --> 00:45:22,610 And now in here, I'm gonna put that brick. 1065 00:45:22,610 --> 00:45:25,310 I think I need to get rid of this here. 
1066 00:45:25,310 --> 00:45:26,630 And let's see now what happens. 1067 00:45:26,630 --> 00:45:28,850 So I've got a for loop inside of a for loop. 1068 00:45:28,850 --> 00:45:33,690 If I go ahead and do make mario4, Enter, code is compilable. 1069 00:45:33,690 --> 00:45:35,560 ./mario4. 1070 00:45:35,560 --> 00:45:37,986 Let's type in 5. 1071 00:45:37,986 --> 00:45:39,680 Hmm. 1072 00:45:39,680 --> 00:45:42,939 I think that's actually, like, 25, if I really count it out. 1073 00:45:42,939 --> 00:45:43,980 That's not what I wanted. 1074 00:45:43,980 --> 00:45:44,729 I wanted a square. 1075 00:45:44,729 --> 00:45:47,020 So what's obviously missing aesthetically? 1076 00:45:47,020 --> 00:45:47,720 A new line. 1077 00:45:47,720 --> 00:45:51,080 But I kind of thought that doesn't go here, right? 1078 00:45:51,080 --> 00:45:53,330 Because if I do this-- just real quick teaser. 1079 00:45:53,330 --> 00:45:56,460 If I rerun mario4 after making that change, 1080 00:45:56,460 --> 00:45:59,540 now I've just made the opposite problem. 1081 00:45:59,540 --> 00:46:00,780 So what needs to change? 1082 00:46:00,780 --> 00:46:01,280 Yeah. 1083 00:46:01,280 --> 00:46:02,280 Oh, just scratching? 1084 00:46:02,280 --> 00:46:02,780 OK. 1085 00:46:02,780 --> 00:46:04,174 What needs to change? 1086 00:46:04,174 --> 00:46:04,840 Yeah, over here. 1087 00:46:04,840 --> 00:46:08,607 AUDIENCE: Another line with a printf and a backslash n. 1088 00:46:08,607 --> 00:46:10,690 DAVID MALAN: Yeah, and what line would you propose 1089 00:46:10,690 --> 00:46:13,708 the printf with the backslash n? 1090 00:46:13,708 --> 00:46:14,569 AUDIENCE: 21? 1091 00:46:14,569 --> 00:46:15,360 DAVID MALAN: Sorry? 1092 00:46:15,360 --> 00:46:16,440 AUDIENCE: 21. 1093 00:46:16,440 --> 00:46:17,400 DAVID MALAN: 21. 1094 00:46:17,400 --> 00:46:19,754 So above or below it? 1095 00:46:19,754 --> 00:46:20,700 AUDIENCE: Above it. 1096 00:46:20,700 --> 00:46:21,140 DAVID MALAN: Above it. 
1097 00:46:21,140 --> 00:46:22,300 OK, so let me go there. 1098 00:46:22,300 --> 00:46:26,160 So let me go ahead and printf backslash n. 1099 00:46:26,160 --> 00:46:27,120 And now let's see. 1100 00:46:27,120 --> 00:46:29,050 So make mario4. 1101 00:46:29,050 --> 00:46:30,780 ./mario4, Enter. 1102 00:46:30,780 --> 00:46:31,540 5. 1103 00:46:31,540 --> 00:46:32,550 Beautiful, beautiful. 1104 00:46:32,550 --> 00:46:35,010 It's not quite a square on the screen because the hashtags are a little more 1105 00:46:35,010 --> 00:46:36,600 vertical than they are wide. 1106 00:46:36,600 --> 00:46:37,350 But that's OK. 1107 00:46:37,350 --> 00:46:39,990 We've built this sort of approximation of that level, too. 1108 00:46:39,990 --> 00:46:43,090 And now, just for good measure, let me just think about-- 1109 00:46:43,090 --> 00:46:46,290 this is kind of an oversimplification, print out that many bricks. 1110 00:46:46,290 --> 00:46:52,500 So print out this many rows or columns on the outside? 1111 00:46:52,500 --> 00:46:56,970 And then in here, where we're going with this, print out this many-- 1112 00:46:56,970 --> 00:47:02,220 what should my comment here be on the top? 1113 00:47:02,220 --> 00:47:04,110 Top one is rows? 1114 00:47:04,110 --> 00:47:07,170 And then down here, this should be columns. 1115 00:47:07,170 --> 00:47:11,250 And it is because on the outermost loop, you've got i. 1116 00:47:11,250 --> 00:47:13,590 And it's starting at 0, and it's eventually going to 5. 1117 00:47:13,590 --> 00:47:16,140 But whenever i is 0, at the beginning, it's 1118 00:47:16,140 --> 00:47:19,762 like the cursor is in the top left-hand location by default on the screen. 1119 00:47:19,762 --> 00:47:22,470 And then you've got this nested loop, which says, oh, by the way, 1120 00:47:22,470 --> 00:47:24,210 do the following five times. 1121 00:47:24,210 --> 00:47:25,680 What are you doing five times? 1122 00:47:25,680 --> 00:47:28,440 Hashtag, hashtag, hashtag, hashtag, hashtag.
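[Editor's sketch] The nested loops being traced here might be written as below. The function name `print_square` and the brick counter are assumptions added for illustration. In the lecture the loops live in `main` of mario4.c, after the prompt for `n`.

```c
#include <stdio.h>

// Print an n-by-n block of bricks, one row per line.
// Returns the total number of bricks printed.
int print_square(int n)
{
    int bricks = 0;

    // Print out this many rows
    for (int i = 0; i < n; i++)
    {
        // Print out this many columns
        for (int j = 0; j < n; j++)
        {
            printf("#");
            bricks++;
        }

        // End the row only after the inner loop has finished,
        // so the newline runs once per row rather than once per brick.
        printf("\n");
    }
    return bricks;
}
```

Putting the newline inside the inner loop gives a single column, and leaving it out entirely gives one long row. Those are the two intermediate mistakes made on the way to this version.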
1123 00:47:28,440 --> 00:47:29,620 Then a new line. 1124 00:47:29,620 --> 00:47:32,370 Then i becomes 1. 1125 00:47:32,370 --> 00:47:35,490 So it's like moving over-- 1126 00:47:35,490 --> 00:47:36,630 sorry. 1127 00:47:36,630 --> 00:47:40,200 Then i becomes 1, which means you're now on the second row 1128 00:47:40,200 --> 00:47:42,880 because you've printed out one of those newline characters. 1129 00:47:42,880 --> 00:47:45,220 So here, too, this is where comments would be helpful because, frankly, 1130 00:47:45,220 --> 00:47:46,370 even I had to think about that. 1131 00:47:46,370 --> 00:47:47,580 And you don't want to waste time thinking 1132 00:47:47,580 --> 00:47:48,996 about code you've already written. 1133 00:47:48,996 --> 00:47:52,500 Just give yourself the answer to why you made past decisions 1134 00:47:52,500 --> 00:47:54,690 as in a case like this. 1135 00:47:54,690 --> 00:47:55,230 All right. 1136 00:47:55,230 --> 00:47:56,820 So suppose something's going wrong. 1137 00:47:56,820 --> 00:47:59,430 And, in fact, we already solved the problem of, like, a lot of hashtags 1138 00:47:59,430 --> 00:48:01,480 going this way and a lot of hashtags going that way. 1139 00:48:01,480 --> 00:48:04,290 But suppose you want to wrap your mind around what your code is actually 1140 00:48:04,290 --> 00:48:04,890 doing. 1141 00:48:04,890 --> 00:48:07,950 It turns out that we have two other tools we can use. 1142 00:48:07,950 --> 00:48:10,680 It turns out we have in the CS50 library a function that's 1143 00:48:10,680 --> 00:48:15,810 almost identical to printf except we called it eprintf for, like, error printf 1144 00:48:15,810 --> 00:48:19,450 just to help you see what's going on inside of your code. 1145 00:48:19,450 --> 00:48:21,730 And you should use it as follows.
1146 00:48:21,730 --> 00:48:24,480 If you kind of want to wrap your mind more clearly around what 1147 00:48:24,480 --> 00:48:27,540 your own code is doing or, for that matter, even an example for class 1148 00:48:27,540 --> 00:48:30,480 that you downloaded, you can add, certainly, your own lines 1149 00:48:30,480 --> 00:48:32,870 of this-- like, "hello there. 1150 00:48:32,870 --> 00:48:38,490 I'm at home playing with this code" or something like that, right? 1151 00:48:38,490 --> 00:48:40,620 So something nonsensical, but at least now 1152 00:48:40,620 --> 00:48:44,610 when you see that sentence on the screen, you know on what line 1153 00:48:44,610 --> 00:48:46,810 the computer was executing your program. 1154 00:48:46,810 --> 00:48:48,930 So you can be a little more methodical than this. 1155 00:48:48,930 --> 00:48:51,610 And with eprintf, notice we can do the following. 1156 00:48:51,610 --> 00:48:54,716 I'm going to change this to just eprintf, and it works the same. 1157 00:48:54,716 --> 00:48:56,340 And I'm going to go ahead and do this-- 1158 00:48:56,340 --> 00:48:59,850 "about to prompt user for a number." 1159 00:48:59,850 --> 00:49:04,530 I just want to provide an explicit note to myself, temporarily, 1160 00:49:04,530 --> 00:49:06,287 what should be happening here. 1161 00:49:06,287 --> 00:49:07,620 And let me see now what happens. 1162 00:49:07,620 --> 00:49:10,230 If I do make mario4. 1163 00:49:10,230 --> 00:49:12,930 OK. ./mario4. 1164 00:49:12,930 --> 00:49:13,812 Ah. 1165 00:49:13,812 --> 00:49:16,020 I get a little ugly output, but it's just diagnostic. 1166 00:49:16,020 --> 00:49:16,680 It's temporary. 1167 00:49:16,680 --> 00:49:21,150 It says mario4.c on line 10 is giving the following message-- 1168 00:49:21,150 --> 00:49:23,220 "about to prompt user for a number." 1169 00:49:23,220 --> 00:49:27,030 That's just a note to self so that I'm comfortable understanding the flow 1170 00:49:27,030 --> 00:49:28,900 or the structure of my program. 
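[Editor's sketch] The idea behind a diagnostic like this, a message tagged with the file name and line number it came from, can be approximated in standard C. This is not the CS50 library's eprintf. `format_note` and `NOTE` are hypothetical names, and the exact "file:line: message" layout is an assumption.

```c
#include <stdio.h>

// Hypothetical helper, not CS50's eprintf: formats "file:line: message"
// into buf. Returns what snprintf returns.
int format_note(char *buf, size_t size, const char *file, int line,
                const char *message)
{
    return snprintf(buf, size, "%s:%d: %s", file, line, message);
}

// Print a temporary note to stderr, tagged with the file and line
// where the macro is used. __FILE__ and __LINE__ are standard C.
#define NOTE(message)                                       \
    do                                                      \
    {                                                       \
        char note_buf[256];                                 \
        format_note(note_buf, sizeof note_buf,              \
                    __FILE__, __LINE__, (message));         \
        fprintf(stderr, "%s\n", note_buf);                  \
    }                                                       \
    while (0)
```

Writing `NOTE("about to prompt user for a number");` just before the prompt prints that sentence along with its location each time the line runs. As with any temporary diagnostic, it should be removed before the code is submitted.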
1171 00:49:28,900 --> 00:49:30,300 I can still interact with it. 1172 00:49:30,300 --> 00:49:32,490 Let's type in something like -1. 1173 00:49:32,490 --> 00:49:36,625 And what should I see next on the screen if I type -1? 1174 00:49:36,625 --> 00:49:37,500 Yeah, another prompt. 1175 00:49:37,500 --> 00:49:38,970 "About to prompt user for number." 1176 00:49:38,970 --> 00:49:40,710 So it's just like a sanity check. 1177 00:49:40,710 --> 00:49:43,410 If you think something's going to happen, tell yourself 1178 00:49:43,410 --> 00:49:46,797 that it should in your code, and make sure you see what you expect to see. 1179 00:49:46,797 --> 00:49:48,630 And then once you're sure your code is good, 1180 00:49:48,630 --> 00:49:50,790 then don't submit it with this because this is not 1181 00:49:50,790 --> 00:49:52,620 correct per the specification. 1182 00:49:52,620 --> 00:49:54,840 You can just get rid of it at that point. 1183 00:49:54,840 --> 00:49:56,800 But frankly, that gets tedious very quickly. 1184 00:49:56,800 --> 00:50:00,660 Oh, and how do I kill my program if I don't want to keep playing? 1185 00:50:00,660 --> 00:50:02,657 Control-C will terminate the program. 1186 00:50:02,657 --> 00:50:04,740 There's one other tool, perhaps the most powerful. 1187 00:50:04,740 --> 00:50:06,570 And I can't stress stress this enough. 1188 00:50:06,570 --> 00:50:09,720 Get into the habit of using this as needed early on. 1189 00:50:09,720 --> 00:50:12,700 Even if it takes you an extra 10 minutes, half hour to play with it, 1190 00:50:12,700 --> 00:50:16,230 it will save you, potentially, hours over the course of the semester. 1191 00:50:16,230 --> 00:50:18,480 And that is a program called a debugger. 1192 00:50:18,480 --> 00:50:20,310 So a debugger is a program that helps you 1193 00:50:20,310 --> 00:50:22,650 remove bugs or mistakes from a program. 1194 00:50:22,650 --> 00:50:24,310 And it works like this. 
1195 00:50:24,310 --> 00:50:27,030 I'm going to go ahead and recompile mario4. 1196 00:50:27,030 --> 00:50:31,320 And now I would normally run it, of course, with ./mario4. 1197 00:50:31,320 --> 00:50:34,541 But suppose I have a bug, and I really want to understand what's going on. 1198 00:50:34,541 --> 00:50:35,790 I'm going to do the following. 1199 00:50:35,790 --> 00:50:37,706 You'll notice that all of my examples thus far 1200 00:50:37,706 --> 00:50:41,320 have line numbers in the so-called gutter of the program, left-hand side. 1201 00:50:41,320 --> 00:50:44,700 And it turns out you can actually click to the left of those numbers at, like, 1202 00:50:44,700 --> 00:50:46,140 this point here. 1203 00:50:46,140 --> 00:50:47,580 And you can put a red dot. 1204 00:50:47,580 --> 00:50:50,070 This shall be known as what's called a breakpoint. 1205 00:50:50,070 --> 00:50:53,290 This is like a little stop sign, only for yourself, that says, 1206 00:50:53,290 --> 00:50:57,450 hey, computer, pause my program here, or really stop my program here, 1207 00:50:57,450 --> 00:50:59,670 like a stop sign, temporarily. 1208 00:50:59,670 --> 00:51:02,250 And let me, the human, go at human speed, 1209 00:51:02,250 --> 00:51:04,890 not, like, billions of things per second speed. 1210 00:51:04,890 --> 00:51:07,000 And by this, I mean the following. 1211 00:51:07,000 --> 00:51:12,510 I'm going to now run not mario4 but debug50 space mario4, which, 1212 00:51:12,510 --> 00:51:16,400 again, is a program we wrote that invokes or starts 1213 00:51:16,400 --> 00:51:18,290 the IDE's built-in debugger. 1214 00:51:18,290 --> 00:51:21,290 So notice magically this right-hand panel just popped out. 1215 00:51:21,290 --> 00:51:22,950 And it's actually always been there. 1216 00:51:22,950 --> 00:51:26,330 It's always said "Debugger," and it just happened to open that window for me 1217 00:51:26,330 --> 00:51:26,960 automatically. 1218 00:51:26,960 --> 00:51:28,410 And let's see what's going on. 
1219 00:51:28,410 --> 00:51:32,510 There's a lot of words, but we're familiar with many of them already. 1220 00:51:32,510 --> 00:51:35,870 Notice that down here is the word local variables. 1221 00:51:35,870 --> 00:51:37,510 And then there's kind of a table here. 1222 00:51:37,510 --> 00:51:40,070 And it's not very big because I only have one local variable. 1223 00:51:40,070 --> 00:51:44,240 And at this point in the story, my variable n happens-- 1224 00:51:44,240 --> 00:51:45,140 I got lucky. 1225 00:51:45,140 --> 00:51:47,960 It has a default value, it would seem, of 0. 1226 00:51:47,960 --> 00:51:49,530 I shouldn't rely on that. 1227 00:51:49,530 --> 00:51:53,010 But it's just so early in my program that it seems safe-- 1228 00:51:53,010 --> 00:51:55,010 well rather, it's so early in my program that it 1229 00:51:55,010 --> 00:51:58,410 happened to have the value 0 in it for our purposes today. 1230 00:51:58,410 --> 00:51:59,510 And it's of type int. 1231 00:51:59,510 --> 00:52:01,560 But what's cool now is the following. 1232 00:52:01,560 --> 00:52:05,810 Now notice that my program is effectively paused on line seven, 1233 00:52:05,810 --> 00:52:08,450 or, specifically, line 10, which is the first interesting line. 1234 00:52:08,450 --> 00:52:10,220 That's why it's highlighted in yellow. 1235 00:52:10,220 --> 00:52:11,930 And what's cool here is this. 1236 00:52:11,930 --> 00:52:15,680 Up here in the top right, you have a play button which will just say, 1237 00:52:15,680 --> 00:52:16,850 play the rest of my program. 1238 00:52:16,850 --> 00:52:19,010 Just let it go through without pausing. 1239 00:52:19,010 --> 00:52:21,920 Or, if I hover over this thing, you can step 1240 00:52:21,920 --> 00:52:25,580 over this line, which means, hey, computer, execute this line, 1241 00:52:25,580 --> 00:52:28,727 but at my human pace, just one line of code at a time. 
1242 00:52:28,727 --> 00:52:31,310 If you're really curious, you can step into that line of code, 1243 00:52:31,310 --> 00:52:33,061 but more on that in just a moment. 1244 00:52:33,061 --> 00:52:36,060 Meanwhile, this is step out, which is if we've actually dived in deeper. 1245 00:52:36,060 --> 00:52:38,360 So what do I mean by all of this? 1246 00:52:38,360 --> 00:52:40,970 So I'm currently paused on line 10, which 1247 00:52:40,970 --> 00:52:43,520 was the first interesting line of code in my program, so far 1248 00:52:43,520 --> 00:52:45,200 as the debugger is concerned. 1249 00:52:45,200 --> 00:52:47,180 I'm going to go ahead at top right, and I'm 1250 00:52:47,180 --> 00:52:50,010 going to go ahead and click Step Over. 1251 00:52:50,010 --> 00:52:53,060 And notice my terminal window is now prompting me for a number. 1252 00:52:53,060 --> 00:52:53,690 Why? 1253 00:52:53,690 --> 00:52:57,620 Well because I've stepped over the get int line, which means execute it. 1254 00:52:57,620 --> 00:53:00,020 So let me go ahead and type in that number. 1255 00:53:00,020 --> 00:53:04,190 Let me go ahead and type in -50, Enter. 1256 00:53:04,190 --> 00:53:08,760 And keep an eye on the variable on the right-hand side. 1257 00:53:08,760 --> 00:53:13,490 Notice now in the debugger, even without printing it with printf or eprintf, 1258 00:53:13,490 --> 00:53:16,067 I can see that n has a value of -50. 1259 00:53:16,067 --> 00:53:17,650 It's just a sanity check, so to speak. 1260 00:53:17,650 --> 00:53:21,150 I can see what it is to be sure it's consistent with my expectations. 1261 00:53:21,150 --> 00:53:21,650 All right. 1262 00:53:21,650 --> 00:53:24,360 That's not right, so let me go ahead and step over. 1263 00:53:24,360 --> 00:53:26,840 And notice the yellow line moved because it's looping. 1264 00:53:26,840 --> 00:53:29,150 You can literally see what I keep doing with my hand. 1265 00:53:29,150 --> 00:53:30,000 Let me do it again. 
1266 00:53:30,000 --> 00:53:30,876 OK, positive number. 1267 00:53:30,876 --> 00:53:32,250 I'm going to cooperate this time. 1268 00:53:32,250 --> 00:53:34,280 42, Enter. 1269 00:53:34,280 --> 00:53:38,360 Notice at the right-hand side, the value n is indeed 42. 1270 00:53:38,360 --> 00:53:40,970 And notice the yellow line, if I keep stepping, 1271 00:53:40,970 --> 00:53:43,770 is about to jump to the next interesting line of code. 1272 00:53:43,770 --> 00:53:46,340 And if I keep doing this, keep doing this, 1273 00:53:46,340 --> 00:53:49,640 watch what's about to happen in the blue terminal window at the bottom. 1274 00:53:49,640 --> 00:53:51,920 There's the first hashtag. 1275 00:53:51,920 --> 00:53:53,540 There's the second hashtag. 1276 00:53:53,540 --> 00:53:57,050 So the sort of fake animation I did the other day with just my slides, 1277 00:53:57,050 --> 00:53:59,990 and what I try to do verbally and with my hand going back and forth, 1278 00:53:59,990 --> 00:54:02,030 you can now see much more methodically. 1279 00:54:02,030 --> 00:54:05,150 So even if it's a simple program, and even if it's code you wrote, 1280 00:54:05,150 --> 00:54:08,690 you can really see step by step what it is your program's doing. 1281 00:54:08,690 --> 00:54:10,730 And maybe it's not doing what you expect. 1282 00:54:10,730 --> 00:54:13,610 And if it's not, you'll see it visually. 1283 00:54:13,610 --> 00:54:14,110 All right. 1284 00:54:14,110 --> 00:54:16,620 Now I'm just gonna go ahead and say, OK, print the rest of the thing. 1285 00:54:16,620 --> 00:54:17,480 So I hit Play. 1286 00:54:17,480 --> 00:54:20,660 You see that the GDB, the GNU Debugger, server is exiting. 1287 00:54:20,660 --> 00:54:21,560 It's just quitting. 1288 00:54:21,560 --> 00:54:24,360 And now I'm back at my prompt, and the debugger goes away. 1289 00:54:24,360 --> 00:54:28,250 So do not undervalue those particular tools. 
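The loop being stepped through in the debugger can be sketched roughly as follows. This is not the lecture's mario4.c, whose source is not shown here: `first_positive` and `fill_column` are hypothetical helpers, and an array of canned attempts stands in for repeated calls to get_int so that the sketch needs no CS50 library. It also assumes one hash per line.

```c
#include <stddef.h>

// Hypothetical stand-in for the re-prompt loop: walks a list of canned
// "user inputs" and returns the first positive one, or -1 if none is.
// In the lecture, each attempt would come from get_int instead.
int first_positive(const int *attempts, size_t count)
{
    for (size_t i = 0; i < count; i++)
    {
        if (attempts[i] > 0)
        {
            return attempts[i];
        }
    }
    return -1;
}

// Writes n lines of "#\n" into out (which must hold 2 * n + 1 bytes)
// and returns the number of hashes written. A breakpoint on the first
// line inside the loop would pause once per hash.
int fill_column(char *out, int n)
{
    int written = 0;
    for (int i = 0; i < n; i++)
    {
        out[2 * i] = '#';
        out[2 * i + 1] = '\n';
        written++;
    }
    out[2 * written] = '\0';
    return written;
}
```

With the attempts -50 and then 42, as typed in the walkthrough, `first_positive` skips the negative value and returns 42. Stepping over the loop body in `fill_column` would show `i` and `written` changing in the local variables panel one iteration at a time.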
1290 00:54:28,250 --> 00:54:33,892 So before we forge ahead, I thought I'd introduce Abhishek here, 1291 00:54:33,892 --> 00:54:36,600 who you might have seen on the internet just a couple months ago. 1292 00:54:36,600 --> 00:54:37,516 He kind of went viral. 1293 00:54:37,516 --> 00:54:39,050 He's a recent grad from NYU. 1294 00:54:39,050 --> 00:54:40,640 And he did this extraordinary thing. 1295 00:54:40,640 --> 00:54:45,230 He took a device called the Microsoft Hololens, which is an augmented reality 1296 00:54:45,230 --> 00:54:48,380 device that puts sort of a goofy looking screen in front of your eyes. 1297 00:54:48,380 --> 00:54:50,507 But then it projects images in front of your eyes. 1298 00:54:50,507 --> 00:54:53,340 And it's really cool in that much like an Android phone or an iPhone 1299 00:54:53,340 --> 00:54:58,012 these days, it knows where you are in a three-dimensional space. 1300 00:54:58,012 --> 00:55:01,220 And what Abhishek actually did was he went to a very three-dimensional space, 1301 00:55:01,220 --> 00:55:02,990 Central Park in Manhattan. 1302 00:55:02,990 --> 00:55:07,220 And he had before that spent days recreating "Super Mario 1303 00:55:07,220 --> 00:55:11,090 Bros." in augmented reality by recreating one of those maps to which I 1304 00:55:11,090 --> 00:55:12,080 alluded earlier. 1305 00:55:12,080 --> 00:55:13,970 And the end result-- and I'll show you just a glimpse of it, 1306 00:55:13,970 --> 00:55:17,120 and we'll put it on the course's website for you to see later in detail-- 1307 00:55:17,120 --> 00:55:20,960 was this, which was pretty mind-blowing and a wonderful application of computer 1308 00:55:20,960 --> 00:55:23,034 science to the real world, literally. 1309 00:55:23,034 --> 00:55:23,700 [VIDEO PLAYBACK] 1310 00:55:23,700 --> 00:55:23,990 - Hi. 1311 00:55:23,990 --> 00:55:24,680 I'm Abhishek. 1312 00:55:24,680 --> 00:55:28,040 And I recreated the iconic first level of "Super Mario Bros." 
1313 00:55:28,040 --> 00:55:30,950 as a first-person, life-size, augmented-reality game 1314 00:55:30,950 --> 00:55:32,480 that I'm now going to play as Mario. 1315 00:55:32,480 --> 00:55:46,368 1316 00:55:46,368 --> 00:55:48,348 [MUSIC, "SUPER MARIO BROS. 1317 00:55:48,348 --> 00:55:48,848 THEME"] 1318 00:55:48,848 --> 00:56:18,027 1319 00:56:18,027 --> 00:56:18,610 [END PLAYBACK] 1320 00:56:18,610 --> 00:56:21,480 DAVID MALAN: Abhishek gave a tech talk in CS50 a couple of months ago. 1321 00:56:21,480 --> 00:56:23,840 And the funniest part, if you really look closely-- and it is Manhattan-- 1322 00:56:23,840 --> 00:56:24,964 is some people look at him. 1323 00:56:24,964 --> 00:56:28,309 But a lot of New Yorkers don't even look twice at what he's doing. 1324 00:56:28,309 --> 00:56:30,350 Let's go ahead here and take a five-minute break. 1325 00:56:30,350 --> 00:56:34,650 And when we come back, we'll begin to look at the world of cryptography. 1326 00:56:34,650 --> 00:56:35,480 So we are back. 1327 00:56:35,480 --> 00:56:38,750 And, of course, there are more functions than just printf. 1328 00:56:38,750 --> 00:56:41,420 And we've seen a glimpse of these by way of the CS50 library. 1329 00:56:41,420 --> 00:56:43,760 And there's many, many, many, many more that come with C itself 1330 00:56:43,760 --> 00:56:46,610 and that other people around the world have written over the years. 1331 00:56:46,610 --> 00:56:50,010 But implied in each of these CS50 functions, 1332 00:56:50,010 --> 00:56:53,470 notice, are these key words like string and int and float-- 1333 00:56:53,470 --> 00:56:55,220 which we talked about the other day, too-- 1334 00:56:55,220 --> 00:57:00,950 long, and long long, and double, as we saw the other day, too. 1335 00:57:00,950 --> 00:57:04,970 So it turns out that C, to be clear, has what are called data types. 1336 00:57:04,970 --> 00:57:06,590 And we glimpsed this the other day. 
1337 00:57:06,590 --> 00:57:10,054 Data types specify what type of data you can put inside of a variable. 1338 00:57:10,054 --> 00:57:11,970 And that's what's different from Scratch, too. 1339 00:57:11,970 --> 00:57:14,300 In C and a few other languages, too, you have 1340 00:57:14,300 --> 00:57:16,910 to decide in advance as the programmer what kind of data 1341 00:57:16,910 --> 00:57:19,460 are you going to put in this variable so that the computer-- 1342 00:57:19,460 --> 00:57:21,140 or, really, the compiler-- knows. 1343 00:57:21,140 --> 00:57:23,900 And so the compiler knows how to deal with it for you. 1344 00:57:23,900 --> 00:57:27,080 Well, it turns out that if you want to print these things out, 1345 00:57:27,080 --> 00:57:29,910 printf also comes with certain format codes. 1346 00:57:29,910 --> 00:57:32,882 And we've seen %s for strings and %i for integers. 1347 00:57:32,882 --> 00:57:34,340 And there's a bunch of others, too. 1348 00:57:34,340 --> 00:57:37,298 Perhaps the most common would be these, just so that you've seen them-- 1349 00:57:37,298 --> 00:57:38,180 %f for float. 1350 00:57:38,180 --> 00:57:39,410 We saw that the other day. 1351 00:57:39,410 --> 00:57:42,610 %lld for a long, long decimal number. 1352 00:57:42,610 --> 00:57:44,360 That's one I often have to look up myself. 1353 00:57:44,360 --> 00:57:46,070 And then there's even more of those, too. 1354 00:57:46,070 --> 00:57:48,590 So just realize that as you're getting input from the users, 1355 00:57:48,590 --> 00:57:50,930 whether for problem sets or any other purposes, 1356 00:57:50,930 --> 00:57:54,500 realize that sometimes you have to check the manual or the documentation, 1357 00:57:54,500 --> 00:57:59,030 so to speak, for functions that you're using. 1358 00:57:59,030 --> 00:58:01,610 And so that you know where to turn for those kinds of things, 1359 00:58:01,610 --> 00:58:03,410 let me just introduce one thing real quick. 
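As a small illustration of the format codes just listed, here is a sketch that pairs each data type with its placeholder. The function name `format_examples` and the sample values are made up for this example, and snprintf is used instead of printf only so that the result lands in a buffer that can be inspected.

```c
#include <stdio.h>

// Formats one value of each type with its matching format code.
// Returns what snprintf returns: the number of characters the full
// result needs, not counting the terminating '\0'.
int format_examples(char *out, size_t size)
{
    const char *s = "hi";         // %s: string
    int i = 42;                   // %i: int
    float f = 1.5f;               // %f: float, six digits after the point by default
    long long big = 9000000000LL; // %lld: long long, too large for a 32-bit int

    return snprintf(out, size, "%s %i %f %lld", s, i, f, big);
}
```

Called with a 32-byte buffer, this produces the text `hi 42 1.500000 9000000000`. A code that does not match the type of its argument (for example %i with a long long) is undefined behavior, which is one reason the documentation is worth checking.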
1360 00:58:03,410 --> 00:58:06,620 And you'll see more of this in super sections and sections and beyond. 1361 00:58:06,620 --> 00:58:10,010 If you forget, for instance, how certain functions work, 1362 00:58:10,010 --> 00:58:14,600 you can actually type the following-- "man get_string," 1363 00:58:14,600 --> 00:58:16,310 where man stands for manual. 1364 00:58:16,310 --> 00:58:19,910 And this is kind of an old-school command on Unix and Linux computers 1365 00:58:19,910 --> 00:58:22,650 that have this text-based keyboard environment. 1366 00:58:22,650 --> 00:58:26,840 And you'll see pretty much a standard, structured user's manual 1367 00:58:26,840 --> 00:58:28,794 for the function in which you're interested. 1368 00:58:28,794 --> 00:58:30,710 So if you forget what we talked about in class 1369 00:58:30,710 --> 00:58:32,844 or you're not really sure how else you can use it, 1370 00:58:32,844 --> 00:58:34,760 and the function is something like get_string, 1371 00:58:34,760 --> 00:58:36,680 you can simply read about it here. 1372 00:58:36,680 --> 00:58:39,650 But sometimes, frankly, it's going to look a little arcane. 1373 00:58:39,650 --> 00:58:43,700 I mean, we have not talked about what some of these symbols mean-- the ..., 1374 00:58:43,700 --> 00:58:47,360 the word const, the asterisk that I've highlighted on the screen. 1375 00:58:47,360 --> 00:58:50,400 So frankly, sometimes you will find the man pages, as they're called-- 1376 00:58:50,400 --> 00:58:51,380 the manual pages-- 1377 00:58:51,380 --> 00:58:54,830 just confusing unto themselves, which is a nasty situation to be in. 1378 00:58:54,830 --> 00:58:57,710 If you're already confused, and the documentation's not helping, 1379 00:58:57,710 --> 00:58:59,840 you kind of need a third option.
1380 00:58:59,840 --> 00:59:02,330 And so if you go to CS50's website, you'll 1381 00:59:02,330 --> 00:59:06,050 actually find that there's a link to a tool 1382 00:59:06,050 --> 00:59:09,170 that the staff has created over the years called CS50 Reference. 1383 00:59:09,170 --> 00:59:12,051 This is a more user-friendly version of those same man 1384 00:59:12,051 --> 00:59:14,300 pages, where we've gone through and sort of translated 1385 00:59:14,300 --> 00:59:18,270 the very arcane English into less comfortable English, if you will. 1386 00:59:18,270 --> 00:59:21,830 So if over here I scroll down to, say, printf-- 1387 00:59:21,830 --> 00:59:23,600 or, rather, let me just search for it-- 1388 00:59:23,600 --> 00:59:25,040 I can see printf here. 1389 00:59:25,040 --> 00:59:28,550 It's inside of this header file, this h file on the system. 1390 00:59:28,550 --> 00:59:30,770 And now I can actually read about it here. 1391 00:59:30,770 --> 00:59:33,860 And notice at top right, checked is the Less Comfortable box, 1392 00:59:33,860 --> 00:59:37,280 which means, hey, show me the language the TFs came up with as opposed 1393 00:59:37,280 --> 00:59:38,450 to the default language. 1394 00:59:38,450 --> 00:59:40,325 But it, too, is meant to be a training wheel. 1395 00:59:40,325 --> 00:59:42,380 So if and when you're ready to sort of take away 1396 00:59:42,380 --> 00:59:44,671 some of those simplifications, you can uncheck that box 1397 00:59:44,671 --> 00:59:46,884 and now see the much more verbose technical version 1398 00:59:46,884 --> 00:59:48,800 that you would actually see in the real world. 1399 00:59:48,800 --> 00:59:51,380 So keep in mind those kinds of things, too, especially 1400 00:59:51,380 --> 00:59:53,690 if it feels like we go through things quickly in class, 1401 00:59:53,690 --> 00:59:58,410 which we do, and you need to lean on something authoritative thereafter. 1402 00:59:58,410 --> 01:00:02,780 But let's tease apart what actually a string is. 
1403 01:00:02,780 --> 01:00:06,650 Let me go ahead and start actually, with Stelios here. 1404 01:00:06,650 --> 01:00:10,880 So Stelios, one of our head TAs in New Haven, has this name here. 1405 01:00:10,880 --> 01:00:14,690 And I've written it as a string, S-T-E-L-I-O-S. 1406 01:00:14,690 --> 01:00:16,910 But I've kind of drawn boxes, deliberately, 1407 01:00:16,910 --> 01:00:21,620 around his name to capture the fact that this thing we call string, 1408 01:00:21,620 --> 01:00:26,540 like "Stelios," is actually not really a string only. 1409 01:00:26,540 --> 01:00:29,150 It's really like an abstraction for something 1410 01:00:29,150 --> 01:00:32,450 a little lower-level, which is a character after a character 1411 01:00:32,450 --> 01:00:34,400 after a character, and so forth. 1412 01:00:34,400 --> 01:00:36,750 And so here, too, we see an example of an abstraction. 1413 01:00:36,750 --> 01:00:41,690 It's not that much fun to call Stelios S-T-E-L-I-O-S. We call him Stelios. 1414 01:00:41,690 --> 01:00:46,447 But we, in languages like C, would call that construct a string or, more 1415 01:00:46,447 --> 01:00:48,030 technically, a sequence of characters. 1416 01:00:48,030 --> 01:00:48,779 But it's a string. 1417 01:00:48,779 --> 01:00:49,830 It's a nice abstraction. 1418 01:00:49,830 --> 01:00:51,317 It's a nice simplification. 1419 01:00:51,317 --> 01:00:53,150 But it turns out there's an opportunity here 1420 01:00:53,150 --> 01:00:57,470 now to see how characters and numbers interrelate in a computer 1421 01:00:57,470 --> 01:01:01,040 and see how powerful computer programs and software are that we ourselves 1422 01:01:01,040 --> 01:01:01,730 can write. 1423 01:01:01,730 --> 01:01:04,970 But first, how do we access individual characters in a name? 1424 01:01:04,970 --> 01:01:08,960 I can easily get Stelios's name using the function get_string, as we've seen, 1425 01:01:08,960 --> 01:01:11,540 just like Sam did from the audience the other day. 
1426 01:01:11,540 --> 01:01:14,840 But how do I actually get at, like, the S or the T or the E? 1427 01:01:14,840 --> 01:01:18,230 Or if maybe he makes a typo or maybe he, like, doesn't type it very neatly, 1428 01:01:18,230 --> 01:01:21,170 how do I capitalize his name or sort of clean up his user input 1429 01:01:21,170 --> 01:01:23,420 like websites today very commonly do? 1430 01:01:23,420 --> 01:01:28,040 Well, let me go ahead and open up CS50 IDE again 1431 01:01:28,040 --> 01:01:32,370 and just do a pretty simple example that this time involves strings. 1432 01:01:32,370 --> 01:01:33,985 Let me go ahead and create a new file. 1433 01:01:33,985 --> 01:01:37,550 1434 01:01:37,550 --> 01:01:42,070 And I'm going to call this file string0.c. 1435 01:01:42,070 --> 01:01:45,970 And I'm going to go ahead now and write a short program-- 1436 01:01:45,970 --> 01:01:51,030 come on-- once I've lost control over my terminal window. 1437 01:01:51,030 --> 01:01:55,420 Now I've lost control of my menu. 1438 01:01:55,420 --> 01:01:56,860 This is my own fault for-- 1439 01:01:56,860 --> 01:01:59,149 oh, here we go. 1440 01:01:59,149 --> 01:02:00,440 Well, this is gonna look great. 1441 01:02:00,440 --> 01:02:03,342 Very inspiring here. 1442 01:02:03,342 --> 01:02:04,288 Where'd it go? 1443 01:02:04,288 --> 01:02:07,610 1444 01:02:07,610 --> 01:02:08,260 Oh, oh. 1445 01:02:08,260 --> 01:02:09,250 Here. 1446 01:02:09,250 --> 01:02:09,940 OK. 1447 01:02:09,940 --> 01:02:13,010 That's an example of bad design, so we will fix that. 1448 01:02:13,010 --> 01:02:15,235 And now I see that I've misspelled string as strig. 1449 01:02:15,235 --> 01:02:16,110 So we're just gonna-- 1450 01:02:16,110 --> 01:02:18,651 no one on the internet will ever know the following happened. 1451 01:02:18,651 --> 01:02:21,650 OK, so string0-- voila. 1452 01:02:21,650 --> 01:02:22,600 Here we go. 1453 01:02:22,600 --> 01:02:23,110 All right. 
1454 01:02:23,110 --> 01:02:26,730 So string0.c, and I'm gonna whip up a really quick program here as follows. 1455 01:02:26,730 --> 01:02:28,930 So int main void. 1456 01:02:28,930 --> 01:02:31,840 And now string s gets get_string. 1457 01:02:31,840 --> 01:02:34,810 And I'm just gonna ask for the user's input in this way. 1458 01:02:34,810 --> 01:02:37,180 And now I'm going to go ahead and print out-- 1459 01:02:37,180 --> 01:02:40,810 how about just say the word output here. 1460 01:02:40,810 --> 01:02:43,450 And just to be nice and tidy, let me put a couple of spaces 1461 01:02:43,450 --> 01:02:44,860 here in anticipation. 1462 01:02:44,860 --> 01:02:47,440 And now let me go ahead and do this-- on line five, 1463 01:02:47,440 --> 01:02:49,840 my intention is to get, like, Stelios's name from him 1464 01:02:49,840 --> 01:02:51,370 or whoever is playing this game. 1465 01:02:51,370 --> 01:02:54,160 But now I want to go ahead and not just print out, like, 1466 01:02:54,160 --> 01:02:58,490 hello Stelios, and plug in his value s, which we've been doing. 1467 01:02:58,490 --> 01:03:00,340 I want to do this character at a time. 1468 01:03:00,340 --> 01:03:03,010 And doing something one at a time kind of suggests a loop. 1469 01:03:03,010 --> 01:03:04,130 And indeed, I can do that. 1470 01:03:04,130 --> 01:03:10,150 So I'm going to do for int i gets 0, i is less than however 1471 01:03:10,150 --> 01:03:14,590 long his name is, and then i plus plus. 1472 01:03:14,590 --> 01:03:17,020 And now I can introduce one other trick that you 1473 01:03:17,020 --> 01:03:20,170 can kind of glimpse ever so quickly from the screen I had up before. 1474 01:03:20,170 --> 01:03:23,620 It turns out that %c is the placeholder for a character. 1475 01:03:23,620 --> 01:03:24,670 Perhaps no surprise. 1476 01:03:24,670 --> 01:03:29,400 But the catch is I only have access to s, the whole thing, the string s. 
1477 01:03:29,400 --> 01:03:31,810 But it turns out there's a new piece of syntax here. 1478 01:03:31,810 --> 01:03:34,690 And as is kind of sort of implied by our having used boxes 1479 01:03:34,690 --> 01:03:37,930 to flank Stelios's letters of his name there, 1480 01:03:37,930 --> 01:03:41,320 turns out that the equivalent in C is to kind of sort of do the same, 1481 01:03:41,320 --> 01:03:44,560 use a box of characters, by using the square brackets, which you might not 1482 01:03:44,560 --> 01:03:45,700 often use on your keyboard. 1483 01:03:45,700 --> 01:03:48,340 On a US keyboard, they're often just above the Enter key. 1484 01:03:48,340 --> 01:03:52,120 And here I can go ahead and type in s[i]. 1485 01:03:52,120 --> 01:03:57,740 And so to speak, this is going to print the i'th character, if you will, 1486 01:03:57,740 --> 01:03:58,700 of Stelios's name. 1487 01:03:58,700 --> 01:04:00,339 So i is going to start at 0. 1488 01:04:00,339 --> 01:04:02,380 And I keep doing plus plus, plus plus, plus plus. 1489 01:04:02,380 --> 01:04:04,504 And using the square bracket notation, so to speak, 1490 01:04:04,504 --> 01:04:08,930 I can dive into the individual letters in his name in this case. 1491 01:04:08,930 --> 01:04:12,520 So when I run this, what's going to be the net effect? 1492 01:04:12,520 --> 01:04:15,160 Let me go ahead and make string0. 1493 01:04:15,160 --> 01:04:16,060 Huh. 1494 01:04:16,060 --> 01:04:19,140 OK, that is not valid C code, however long his name is. 1495 01:04:19,140 --> 01:04:21,220 So I have a problem to solve here. 1496 01:04:21,220 --> 01:04:23,110 How do I actually get the length of his name? 1497 01:04:23,110 --> 01:04:25,060 Well I can kind of cheat. 1498 01:04:25,060 --> 01:04:28,551 OK, so one, two, three, four, five, six, seven. 1499 01:04:28,551 --> 01:04:29,050 All right. 1500 01:04:29,050 --> 01:04:32,010 So we can just write this program as follows-- 1501 01:04:32,010 --> 01:04:33,047 7. 
1502 01:04:33,047 --> 01:04:34,630 But this should rub you the wrong way. 1503 01:04:34,630 --> 01:04:36,860 Why is this not a good solution to the problem? 1504 01:04:36,860 --> 01:04:37,360 Yeah? 1505 01:04:37,360 --> 01:04:40,085 AUDIENCE: Because it's not changeable. 1506 01:04:40,085 --> 01:04:40,960 DAVID MALAN: Exactly. 1507 01:04:40,960 --> 01:04:41,793 It's not changeable. 1508 01:04:41,793 --> 01:04:44,440 I have this dynamism of get_string to get Stelios's name. 1509 01:04:44,440 --> 01:04:48,100 But seven is not going to be true of all the humans who might use this program. 1510 01:04:48,100 --> 01:04:49,360 I need something dynamic. 1511 01:04:49,360 --> 01:04:51,700 Well, it turns out there is a function for that. 1512 01:04:51,700 --> 01:04:56,830 I can call strlen, for string length, pass in as input 1513 01:04:56,830 --> 01:04:58,930 the variable whose length I want to get, and that 1514 01:04:58,930 --> 01:05:02,892 will return to me a number, which will be, in this case, it would seem, 7. 1515 01:05:02,892 --> 01:05:04,100 But it's going to be dynamic. 1516 01:05:04,100 --> 01:05:06,970 So if I type in, like, David, that should return 5, hopefully, 1517 01:05:06,970 --> 01:05:10,720 and any number for any number of other humans engaging in this. 1518 01:05:10,720 --> 01:05:12,460 So let me go ahead now and try again. 1519 01:05:12,460 --> 01:05:14,840 Make string0. 1520 01:05:14,840 --> 01:05:16,450 A lot of errors. 1521 01:05:16,450 --> 01:05:18,837 And "use of undeclared identifier string." 1522 01:05:18,837 --> 01:05:19,420 Wait a minute. 1523 01:05:19,420 --> 01:05:21,100 We've seen this before. 1524 01:05:21,100 --> 01:05:22,400 How did I solve this last time? 1525 01:05:22,400 --> 01:05:22,900 Yeah? 1526 01:05:22,900 --> 01:05:24,490 What's up above missing? 1527 01:05:24,490 --> 01:05:25,660 AUDIENCE: The libraries. 
1528 01:05:25,660 --> 01:05:27,040 DAVID MALAN: Yeah, the libraries or the header 1529 01:05:27,040 --> 01:05:28,623 files, so to speak, for the libraries. 1530 01:05:28,623 --> 01:05:34,030 So I need to include, I'm pretty sure, at least stdio.h for printf. 1531 01:05:34,030 --> 01:05:39,370 I need to include cs50.h for get_string. 1532 01:05:39,370 --> 01:05:40,390 And we're almost there. 1533 01:05:40,390 --> 01:05:41,556 Let me see if that's enough. 1534 01:05:41,556 --> 01:05:42,190 Make string0. 1535 01:05:42,190 --> 01:05:46,480 Oh, implicitly declaring library function strlen with type-- 1536 01:05:46,480 --> 01:05:48,490 I don't really know what that is. 1537 01:05:48,490 --> 01:05:51,250 But there's kind of an answer hinted there-- 1538 01:05:51,250 --> 01:05:54,470 include the header string.h and so forth. 1539 01:05:54,470 --> 01:05:56,094 So turns out this is true. 1540 01:05:56,094 --> 01:05:57,760 And there's different ways to know this. 1541 01:05:57,760 --> 01:06:03,430 If I actually go back to reference.cs50.net and do strlen, 1542 01:06:03,430 --> 01:06:04,869 there's that function. 1543 01:06:04,869 --> 01:06:06,910 Let me go back to the less comfortable-- whoops-- 1544 01:06:06,910 --> 01:06:08,680 to the less comfortable version. 1545 01:06:08,680 --> 01:06:13,180 Notice that under synopsis of a man page or reference.cs50.net 1546 01:06:13,180 --> 01:06:15,100 is always a quick summary of how you use it. 1547 01:06:15,100 --> 01:06:17,110 So just the prototype of the function that 1548 01:06:17,110 --> 01:06:20,470 gives you a sense of what it is-- size_t is essentially equivalent to an int, 1549 01:06:20,470 --> 01:06:22,450 just saying the size of something as a number. 1550 01:06:22,450 --> 01:06:25,420 But include string.h is the ingredient I wanted. 1551 01:06:25,420 --> 01:06:27,130 So let me go ahead and copy that. 1552 01:06:27,130 --> 01:06:28,570 Let me go back to the IDE.
1553 01:06:28,570 --> 01:06:30,460 I'm gonna be a little nit-picky, and I'm just 1554 01:06:30,460 --> 01:06:31,970 gonna keep things alphabetical at the top. 1555 01:06:31,970 --> 01:06:33,386 But that's not strictly necessary. 1556 01:06:33,386 --> 01:06:36,290 It just makes it easier to skim later on when the list gets long. 1557 01:06:36,290 --> 01:06:37,580 Make string0. 1558 01:06:37,580 --> 01:06:40,460 Seems good to go. ./string0. 1559 01:06:40,460 --> 01:06:41,070 Inputs. 1560 01:06:41,070 --> 01:06:43,640 Now I'm going to go ahead and type in Stelios's name. 1561 01:06:43,640 --> 01:06:45,354 And I got his output, as well. 1562 01:06:45,354 --> 01:06:47,770 Now, that was a lot of unnecessary work to print his name. 1563 01:06:47,770 --> 01:06:49,550 I could have just used %s. 1564 01:06:49,550 --> 01:06:51,320 But now I can make modifications. 1565 01:06:51,320 --> 01:06:54,200 What if I wanted to print it one per line? 1566 01:06:54,200 --> 01:06:55,820 I can add that. 1567 01:06:55,820 --> 01:06:59,750 I can make the program again, rerun it, and type in his name, 1568 01:06:59,750 --> 01:07:01,119 and now I get it one per line. 1569 01:07:01,119 --> 01:07:01,910 It's a little ugly. 1570 01:07:01,910 --> 01:07:03,230 Like, now it says output s. 1571 01:07:03,230 --> 01:07:04,820 But that's just an aesthetic bug. 1572 01:07:04,820 --> 01:07:06,170 I could go in and fix that. 1573 01:07:06,170 --> 01:07:09,110 But now I have control over the individual characters 1574 01:07:09,110 --> 01:07:10,670 in his actual name. 1575 01:07:10,670 --> 01:07:13,140 So that would seem to be progress in some form. 
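The string0 idea, visiting a string one character at a time with s[i] and strlen, can be sketched as below. This is an approximation rather than the lecture's file: the helper names are invented, the input is passed in as a parameter instead of coming from get_string, and a plain `const char *` replaces CS50's string type so that only standard headers are needed. The index is a size_t, the unsigned type that strlen returns, where the lecture uses int.

```c
#include <stdio.h>
#include <string.h>

// Copies s into out with one character per line, mirroring the
// lecture's loop over s[i]. out must hold 2 * strlen(s) + 1 bytes.
void one_per_line(const char *s, char *out)
{
    size_t k = 0;
    for (size_t i = 0; i < strlen(s); i++)
    {
        out[k++] = s[i]; // the i'th character of the string
        out[k++] = '\n';
    }
    out[k] = '\0';
}

// Same loop, printing directly with the %c placeholder.
void print_one_per_line(const char *s)
{
    for (size_t i = 0; i < strlen(s); i++)
    {
        printf("%c\n", s[i]);
    }
}
```

Because the bound comes from strlen rather than a hard-coded 7, the same loop handles "Stelios" (length 7) and "David" (length 5) without any change.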
1576 01:07:13,140 --> 01:07:15,690 But if I now have access to the individual letters, 1577 01:07:15,690 --> 01:07:19,280 we can kind of come full circle from the very first lecture where 1578 01:07:19,280 --> 01:07:21,980 we talked about zeros and ones, and then numbers, 1579 01:07:21,980 --> 01:07:26,030 and then letters, and now, in turn, words, otherwise known as strings, 1580 01:07:26,030 --> 01:07:27,824 by way of a topic called typecasting. 1581 01:07:27,824 --> 01:07:30,740 Types, of course, are the types of variables we've been talking about. 1582 01:07:30,740 --> 01:07:33,600 Casting means to convert from one to the other. 1583 01:07:33,600 --> 01:07:35,720 And you might recall from the first lecture 1584 01:07:35,720 --> 01:07:39,410 that capital A was the number we know in decimal as 65 1585 01:07:39,410 --> 01:07:41,540 and whatever pattern of zeros and ones that is. 1586 01:07:41,540 --> 01:07:44,430 Capital B is 66, and so forth. 1587 01:07:44,430 --> 01:07:46,910 So can I see that now for the first time? 1588 01:07:46,910 --> 01:07:48,230 Well it turns out I can. 1589 01:07:48,230 --> 01:07:50,270 Let me go back to the IDE. 1590 01:07:50,270 --> 01:07:55,940 And let me go ahead and create a new program called ascii0.c. 1591 01:07:55,940 --> 01:07:57,590 And ASCII, again, is just the standard. 1592 01:07:57,590 --> 01:08:00,423 It's an acronym, American Standard Code for Information Interchange, 1593 01:08:00,423 --> 01:08:04,136 which maps letters to numbers and numbers to letters. 1594 01:08:04,136 --> 01:08:06,260 So let me go ahead now and whip up a quick program. 1595 01:08:06,260 --> 01:08:09,740 Include stdio.h for printf. 1596 01:08:09,740 --> 01:08:11,840 Int main void. 1597 01:08:11,840 --> 01:08:14,797 And then let me go here and do the following. 1598 01:08:14,797 --> 01:08:15,380 You know what? 1599 01:08:15,380 --> 01:08:22,040 I'm gonna just go ahead and print out, let's say, string s gets get_string.
1600 01:08:22,040 --> 01:08:24,080 Let's just ask for someone's name. 1601 01:08:24,080 --> 01:08:27,870 And then let me go ahead and do the following-- for int i gets 0, 1602 01:08:27,870 --> 01:08:30,560 i is less than the length of that string-- 1603 01:08:30,560 --> 01:08:31,990 learning from last time-- 1604 01:08:31,990 --> 01:08:33,170 i plus plus. 1605 01:08:33,170 --> 01:08:35,490 So this is gonna iterate over the whole string. 1606 01:08:35,490 --> 01:08:36,990 And now what I want to do is this. 1607 01:08:36,990 --> 01:08:40,109 Let me go ahead and print out the following. 1608 01:08:40,109 --> 01:08:45,630 Let me print out the character itself, and then a space, 1609 01:08:45,630 --> 01:08:48,354 and then how about an integer, and then a new line. 1610 01:08:48,354 --> 01:08:50,270 And we'll see what this does in just a moment. 1611 01:08:50,270 --> 01:08:52,269 I want to plug in values for these placeholders. 1612 01:08:52,269 --> 01:08:57,520 So how do I get at the first character of the name if the string is called s? 1613 01:08:57,520 --> 01:09:01,340 Yeah, so s[i] for the i'th character. 1614 01:09:01,340 --> 01:09:05,450 And that's gonna plug in, literally, S-T-E-L if Stelios is the one playing 1615 01:09:05,450 --> 01:09:06,300 the game. 1616 01:09:06,300 --> 01:09:09,319 But now I put a comma to plug in a second placeholder. 1617 01:09:09,319 --> 01:09:11,000 And %i-- you know what I'm gonna do? 1618 01:09:11,000 --> 01:09:16,670 I'm going to do int in parentheses s[i] semi-colon. 1619 01:09:16,670 --> 01:09:18,229 So it looks a little cryptic. 1620 01:09:18,229 --> 01:09:20,729 But let me just remove this for a moment. 1621 01:09:20,729 --> 01:09:23,090 This is just the same thing twice-- 1622 01:09:23,090 --> 01:09:25,970 print the i'th character of the name, i'th character of the name. 1623 01:09:25,970 --> 01:09:28,760 But in parentheses, I'm doing what's called typecasting. 
1624 01:09:28,760 --> 01:09:32,390 I'm taking whatever that is, which is a char or character. 1625 01:09:32,390 --> 01:09:35,990 And I'm saying, parenthetically, make this an int instead. 1626 01:09:35,990 --> 01:09:38,029 So if it's capital A, it becomes 65. 1627 01:09:38,029 --> 01:09:40,439 Capital B, it becomes 66, and so forth. 1628 01:09:40,439 --> 01:09:45,050 And if I now compile this program after preemptively 1629 01:09:45,050 --> 01:09:48,200 fixing what would have been a mistake by adding the header-- 1630 01:09:48,200 --> 01:09:52,149 make ascii0.c. Whoops, sorry. 1631 01:09:52,149 --> 01:09:53,559 Oh, common mistake. 1632 01:09:53,559 --> 01:09:54,350 Nothing to be done. 1633 01:09:54,350 --> 01:09:55,680 I'm pretty sure there's something to be done. 1634 01:09:55,680 --> 01:09:56,555 I need to compile it. 1635 01:09:56,555 --> 01:09:58,640 What did I do wrong? 1636 01:09:58,640 --> 01:09:59,610 Yeah, don't put the .c. 1637 01:09:59,610 --> 01:10:02,443 It's a little counterintuitive, but when you want to make a program, 1638 01:10:02,443 --> 01:10:05,080 you type the name of the program, not the name of the file. 1639 01:10:05,080 --> 01:10:06,200 Now, in-- oh, damn it. 1640 01:10:06,200 --> 01:10:08,300 I almost learned from my mistakes. 1641 01:10:08,300 --> 01:10:09,339 What am I missing now? 1642 01:10:09,339 --> 01:10:10,130 AUDIENCE: String.h. 1643 01:10:10,130 --> 01:10:12,050 DAVID MALAN: String.h. 1644 01:10:12,050 --> 01:10:13,220 All right. 1645 01:10:13,220 --> 01:10:18,020 Include string.h. 1646 01:10:18,020 --> 01:10:19,280 Save it. 1647 01:10:19,280 --> 01:10:21,020 OK, so let me make ascii0. 1648 01:10:21,020 --> 01:10:22,970 Good. ./ascii0. 1649 01:10:22,970 --> 01:10:25,550 And now, Stelios, Enter. 1650 01:10:25,550 --> 01:10:28,760 And now we see the ASCII codes or the numbers that 1651 01:10:28,760 --> 01:10:30,966 correspond to the letters in his name. 1652 01:10:30,966 --> 01:10:32,090 They're pretty big numbers.
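The cast described for ascii0 can be sketched like this. The helper names `code_of` and `print_codes` are invented for the example, the name is passed in as a parameter rather than read with get_string, and the numeric values assume an ASCII-based system, which is what the lecture is describing.

```c
#include <stdio.h>
#include <string.h>

// Casting a char to int yields its numeric code (ASCII on typical systems).
int code_of(char c)
{
    return (int) c;
}

// Prints each character of s next to its code, one pair per line,
// using %c for the character and %i for the cast value.
void print_codes(const char *s)
{
    for (size_t i = 0; i < strlen(s); i++)
    {
        printf("%c %i\n", s[i], (int) s[i]);
    }
}
```

For example, `code_of('A')` is 65 and `code_of('B')` is 66, while a lowercase letter such as 's' gives 115. The cast is written out here to make the conversion visible. When a char is passed to printf for %i, C would promote it to int anyway.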
1653 01:10:32,090 --> 01:10:33,330 They're in the 100s now. 1654 01:10:33,330 --> 01:10:35,054 And that's because they're lowercase. 1655 01:10:35,054 --> 01:10:37,970 We've previously talked only about capital A, capital B, and so forth. 1656 01:10:37,970 --> 01:10:40,100 But it turns out that the lowercase letters also 1657 01:10:40,100 --> 01:10:44,900 have values associated with them, like some of those here, as well. 1658 01:10:44,900 --> 01:10:47,420 And now it turns out now that I know this, now I 1659 01:10:47,420 --> 01:10:50,809 can kind of do some low-level stuff that we all take for granted on our phones 1660 01:10:50,809 --> 01:10:53,600 and websites like when you just type in your name in all lowercase, 1661 01:10:53,600 --> 01:10:56,420 and the website just fixes it, or if you type in your phone number 1662 01:10:56,420 --> 01:10:59,450 with parentheses, without parentheses, with dashes, without, 1663 01:10:59,450 --> 01:11:03,110 the website just kind of fixes it and cleans it up into some cleaner format. 1664 01:11:03,110 --> 01:11:05,750 We now have kind of the low-level control to do this. 1665 01:11:05,750 --> 01:11:08,690 I won't type this one out manually just because it's a little longer. 1666 01:11:08,690 --> 01:11:10,100 Let me go ahead and open it up. 1667 01:11:10,100 --> 01:11:15,530 And among today's examples in source2, which is on the course's website, 1668 01:11:15,530 --> 01:11:17,420 is this example here-- 1669 01:11:17,420 --> 01:11:19,019 capitalize0. 1670 01:11:19,019 --> 01:11:20,810 So let me make a little more room for this. 1671 01:11:20,810 --> 01:11:21,750 It's a little longer. 1672 01:11:21,750 --> 01:11:24,470 But let's just focus on just a couple lines at a time. 1673 01:11:24,470 --> 01:11:26,920 Here's the beginning of my program main. 1674 01:11:26,920 --> 01:11:29,480 Here is a line of code where get_string before. 1675 01:11:29,480 --> 01:11:31,190 I just say, give me the before string. 
1676 01:11:31,190 --> 01:11:35,354 And then I claim, now print the string after making some changes to it. 1677 01:11:35,354 --> 01:11:36,270 So what am I gonna do? 1678 01:11:36,270 --> 01:11:40,170 On line 11, I seem to have used the same ideas a moment ago, 1679 01:11:40,170 --> 01:11:42,630 but with one change, actually. 1680 01:11:42,630 --> 01:11:44,700 I've done something a little different. 1681 01:11:44,700 --> 01:11:47,490 Line 11 is very similar to what I've been doing to iterate over 1682 01:11:47,490 --> 01:11:49,260 the characters in a string. 1683 01:11:49,260 --> 01:11:53,240 But I did something different, which is what? 1684 01:11:53,240 --> 01:11:56,695 What looks different now versus what was on the screen a little bit ago? 1685 01:11:56,695 --> 01:11:59,672 1686 01:11:59,672 --> 01:12:01,350 Anyone a little farther back? 1687 01:12:01,350 --> 01:12:02,072 Yeah. 1688 01:12:02,072 --> 01:12:03,960 AUDIENCE: [INAUDIBLE] 1689 01:12:03,960 --> 01:12:07,080 DAVID MALAN: Yeah, it looks like I'm declaring two variables all 1690 01:12:07,080 --> 01:12:08,970 in the same breath, so to speak. 1691 01:12:08,970 --> 01:12:12,190 I have my int i equals zero, and we've done this a bunch of times. 1692 01:12:12,190 --> 01:12:15,030 But then I have a comma here for the very first time. 1693 01:12:15,030 --> 01:12:17,089 n equals strlen of s. 1694 01:12:17,089 --> 01:12:19,380 But if you think about what these building blocks are-- 1695 01:12:19,380 --> 01:12:23,760 OK, the comma is new, but n is on the left, so that's a variable, apparently. 1696 01:12:23,760 --> 01:12:26,880 It's probably an int because the word int came before. 1697 01:12:26,880 --> 01:12:29,070 Equal sign is assignment from right to left. 1698 01:12:29,070 --> 01:12:31,750 strlen is a function that returns the length of a string. 1699 01:12:31,750 --> 01:12:37,310 So this would seem to be storing, just to be clear, what number in n? 
1700 01:12:37,310 --> 01:12:39,060 Yeah, the number of characters in whatever 1701 01:12:39,060 --> 01:12:41,470 the string is the user typed in, "Stelios" or "David" 1702 01:12:41,470 --> 01:12:42,900 or whatever the name is. 1703 01:12:42,900 --> 01:12:44,820 So then I have my condition. 1704 01:12:44,820 --> 01:12:46,920 Then there's the semi-colon, which we have seen. 1705 01:12:46,920 --> 01:12:48,720 And then there's plus plus. 1706 01:12:48,720 --> 01:12:53,760 So I claim that this is, in a sense, better design. 1707 01:12:53,760 --> 01:12:55,374 It's a little more complicated. 1708 01:12:55,374 --> 01:12:56,790 Like, I typed out more characters. 1709 01:12:56,790 --> 01:12:58,350 I added another variable. 1710 01:12:58,350 --> 01:13:01,800 But why might it be smart and good design 1711 01:13:01,800 --> 01:13:05,610 to have used an extra variable, using a little more space to keep 1712 01:13:05,610 --> 01:13:09,700 a number around, so as to then simply compare i against n? 1713 01:13:09,700 --> 01:13:13,890 1714 01:13:13,890 --> 01:13:15,480 Why did I jump through these hoops? 1715 01:13:15,480 --> 01:13:16,020 What do you think? 1716 01:13:16,020 --> 01:13:18,269 AUDIENCE: Because then it doesn't have to check what n 1717 01:13:18,269 --> 01:13:20,190 is each time it goes through the loop. 1718 01:13:20,190 --> 01:13:21,065 DAVID MALAN: Exactly. 1719 01:13:21,065 --> 01:13:22,259 AUDIENCE: [INAUDIBLE] 1720 01:13:22,259 --> 01:13:25,050 DAVID MALAN: It doesn't have to check what the length of the string 1721 01:13:25,050 --> 01:13:28,920 is on every iteration because after all, once I or Stelios or whoever types 1722 01:13:28,920 --> 01:13:31,030 in their name, it's not going to change. 1723 01:13:31,030 --> 01:13:34,800 It is D-A-V-I-D or Stelios's name or Maria's name or whoever is playing 1724 01:13:34,800 --> 01:13:35,430 the game. 
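The two-variable loop header discussed here can be sketched like this. The function `count_char` is a hypothetical example chosen only to give the loop something to do, and `const char *` again stands in for CS50's `string`.

```c
#include <string.h>

// Count how many times c occurs in s, asking for the length only once.
int count_char(const char *s, char c)
{
    int count = 0;

    // i and n are both declared in the initializer; strlen runs a single
    // time here, and the condition afterwards only compares two ints.
    for (int i = 0, n = (int) strlen(s); i < n; i++)
    {
        if (s[i] == c)
        {
            count++;
        }
    }
    return count;
}
```

The comparison is case-sensitive, so `count_char("Maria", 'a')` counts the two lowercase letters and ignores everything else.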
1725 01:13:35,430 --> 01:13:37,800 And so why would you keep calling a function saying, oh, 1726 01:13:37,800 --> 01:13:38,760 by the way, what's the length? 1727 01:13:38,760 --> 01:13:39,660 By the way, what's the length? 1728 01:13:39,660 --> 01:13:40,410 What's the length? 1729 01:13:40,410 --> 01:13:44,100 Just remember it the first time because odds are it takes a little bit of time 1730 01:13:44,100 --> 01:13:47,400 to do that computation and actually figure out the length. 1731 01:13:47,400 --> 01:13:50,370 And so here, we've simply kept that answer around in n 1732 01:13:50,370 --> 01:13:51,960 and can compare two variables. 1733 01:13:51,960 --> 01:13:55,509 Meanwhile, here's a pretty big if-else construct. 1734 01:13:55,509 --> 01:13:58,675 But if we break it down into pieces, it's doing something relatively simple. 1735 01:13:58,675 --> 01:14:06,990 On line 13, I am asking the question, if the i'th character of s is greater than 1736 01:14:06,990 --> 01:14:11,140 or equal to a lowercase A, &&, which we haven't seen before. 1737 01:14:11,140 --> 01:14:13,410 This means logically and. 1738 01:14:13,410 --> 01:14:20,970 So if it's greater than or equal to a and less than or equal to z-- 1739 01:14:20,970 --> 01:14:24,120 put another way, if it's between a and z inclusive in 1740 01:14:24,120 --> 01:14:28,450 lowercase-- what am I doing on line 15, which is super weird-looking? 1741 01:14:28,450 --> 01:14:31,460 1742 01:14:31,460 --> 01:14:35,780 I'm first printing out %c, which is my placeholder. 1743 01:14:35,780 --> 01:14:43,430 But then I'm printing out the result of s[i] minus whatever lowercase A minus 1744 01:14:43,430 --> 01:14:44,420 capital A is? 1745 01:14:44,420 --> 01:14:46,070 I mean, this is just strange now. 1746 01:14:46,070 --> 01:14:48,860 But let me just point out one clue. 1747 01:14:48,860 --> 01:14:50,360 Turns out there's a pattern here. 1748 01:14:50,360 --> 01:14:52,580 And humans did this deliberately. 
1749 01:14:52,580 --> 01:14:55,040 If you can do the arithmetic quickly, how far apart 1750 01:14:55,040 --> 01:14:58,820 is lowercase A from capital A? 1751 01:14:58,820 --> 01:14:59,809 It's 32, right? 1752 01:14:59,809 --> 01:15:01,100 And you could just do the math. 1753 01:15:01,100 --> 01:15:03,170 97 minus 65-- oh, 32. 1754 01:15:03,170 --> 01:15:05,555 How about capital B versus lowercase B? 1755 01:15:05,555 --> 01:15:06,360 It's 32. 1756 01:15:06,360 --> 01:15:06,860 32. 1757 01:15:06,860 --> 01:15:07,880 It follows this pattern. 1758 01:15:07,880 --> 01:15:10,550 So this is to say-- and it's sort of proof by example. 1759 01:15:10,550 --> 01:15:13,010 We're not even seeing all the way to Z, but trust me. 1760 01:15:13,010 --> 01:15:15,770 32 is invariant across all of the letters of the alphabet. 1761 01:15:15,770 --> 01:15:17,300 They're always 32 away. 1762 01:15:17,300 --> 01:15:20,510 And I could hard-code 32, but that feels a little inelegant. 1763 01:15:20,510 --> 01:15:22,700 Why don't I instead just arithmetically say 1764 01:15:22,700 --> 01:15:26,130 whatever the difference is between lowercase A and capital A, 1765 01:15:26,130 --> 01:15:28,530 and that's all I'm saying in parentheses here. 1766 01:15:28,530 --> 01:15:32,390 Whatever that numeric difference is in the computer's representation 1767 01:15:32,390 --> 01:15:37,380 of my numbers, just subtract that difference from the i'th character. 1768 01:15:37,380 --> 01:15:40,610 Now, what's nice is I kind of sort of should do this first. 1769 01:15:40,610 --> 01:15:43,082 Like, I should cast the character to an int. 1770 01:15:43,082 --> 01:15:44,540 But I don't need to be so explicit. 1771 01:15:44,540 --> 01:15:46,915 The computer knows that characters are integers. 1772 01:15:46,915 --> 01:15:49,040 And the computer knows that integers are characters. 1773 01:15:49,040 --> 01:15:50,081 There's this equivalence.
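The range check and the subtraction of `'a' - 'A'` can be sketched per character as below. The helper name `capitalize_manual` is hypothetical, and the sketch assumes ASCII, where the lowercase and uppercase letters are each contiguous and 32 apart.

```c
// Return the uppercase form of c if it is a lowercase ASCII letter;
// otherwise return c unchanged.
char capitalize_manual(char c)
{
    if (c >= 'a' && c <= 'z')
    {
        // 'a' - 'A' evaluates to 32 in ASCII. The chars are treated as
        // integers for the arithmetic, with no explicit cast needed.
        return c - ('a' - 'A');
    }
    return c;
}
```

Printing the result with `%c` displays it as a character again, while `%i` would show the same value as a number.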
1774 01:15:50,081 --> 01:15:52,340 I don't need to be so verbose as to even say that. 1775 01:15:52,340 --> 01:15:54,980 It just suffices to let the computer figure it out 1776 01:15:54,980 --> 01:15:58,130 implicitly that in this context, I'm doing arithmetic on numbers, 1777 01:15:58,130 --> 01:16:02,086 and then, in the context of printf, I'm displaying that number as characters. 1778 01:16:02,086 --> 01:16:02,960 Nothing is happening. 1779 01:16:02,960 --> 01:16:06,170 You're just telling the compiler what context in which 1780 01:16:06,170 --> 01:16:09,170 to treat these values, numeric or characters. 1781 01:16:09,170 --> 01:16:14,330 So long story short, what does the effect of these four lines of code 1782 01:16:14,330 --> 01:16:18,014 have on the characters in the user's input? 1783 01:16:18,014 --> 01:16:18,680 What does it do? 1784 01:16:18,680 --> 01:16:20,030 AUDIENCE: [INAUDIBLE] 1785 01:16:20,030 --> 01:16:21,322 DAVID MALAN: It capitalizes it. 1786 01:16:21,322 --> 01:16:21,821 Right? 1787 01:16:21,821 --> 01:16:22,610 It capitalizes it. 1788 01:16:22,610 --> 01:16:25,850 And that might have been implied, too, by the file name, in full disclosure. 1789 01:16:25,850 --> 01:16:29,840 But it's how you think about solving the problem of capitalization. 1790 01:16:29,840 --> 01:16:30,770 Here's the string. 1791 01:16:30,770 --> 01:16:32,810 Home in on the individual characters. 1792 01:16:32,810 --> 01:16:36,000 Figure out if they're within a range you want to deal with. 1793 01:16:36,000 --> 01:16:40,250 And if so, do some kind of mutation of them to change from one value 1794 01:16:40,250 --> 01:16:41,060 to another. 1795 01:16:41,060 --> 01:16:42,650 I could have done this a horrible way. 1796 01:16:42,650 --> 01:16:45,080 I could have had, like, an if-else-if-else-if-else-- 1797 01:16:45,080 --> 01:16:49,070 I could have, like, 26 conditions checking is the character A? 1798 01:16:49,070 --> 01:16:49,800 Is it B? 
1799 01:16:49,800 --> 01:16:50,300 Is it C? 1800 01:16:50,300 --> 01:16:52,700 And if so, make it capital A, capital B, capital C. 1801 01:16:52,700 --> 01:16:55,220 But that code would have been like this big or bigger. 1802 01:16:55,220 --> 01:16:59,270 This is now a more algorithmic way of solving the same problem. 1803 01:16:59,270 --> 01:17:02,390 And if it's not a lowercase letter between A and Z, just print it out. 1804 01:17:02,390 --> 01:17:04,890 There's no work to be done. 1805 01:17:04,890 --> 01:17:08,240 Now, just so you've seen it, it doesn't have to even be as verbose as this. 1806 01:17:08,240 --> 01:17:12,890 In capitalize1.c, which is available also on the course's website, 1807 01:17:12,890 --> 01:17:15,050 I've made my code a little better designed. 1808 01:17:15,050 --> 01:17:17,240 I'm now not reinventing as many wheels. 1809 01:17:17,240 --> 01:17:20,060 I'm standing on the shoulders of smart programmers before me. 1810 01:17:20,060 --> 01:17:22,850 And I've clearly changed at least one thing. 1811 01:17:22,850 --> 01:17:26,990 Instead of doing this manual process of comparing against lowercase A 1812 01:17:26,990 --> 01:17:30,710 and lowercase Z, I'm just punting and using 1813 01:17:30,710 --> 01:17:33,500 a function which, beautifully, is called islower, which 1814 01:17:33,500 --> 01:17:35,180 just literally answers that question. 1815 01:17:35,180 --> 01:17:37,940 Because another data type in C is not just 1816 01:17:37,940 --> 01:17:42,380 int and char and float and string in CS50's library, 1817 01:17:42,380 --> 01:17:44,330 but there's also something called a Boolean. 1818 01:17:44,330 --> 01:17:47,450 A Boolean, also named after Boole, is similar in spirit 1819 01:17:47,450 --> 01:17:49,370 to a Boolean expression, true or false. 1820 01:17:49,370 --> 01:17:54,890 But a Boolean variable is literally just the idea true or false. 1821 01:17:54,890 --> 01:17:58,470 And so islower you can think of as returning a Boolean value.
1822 01:17:58,470 --> 01:18:00,719 It returns true or false, yes or no. 1823 01:18:00,719 --> 01:18:03,260 And the name of this function, therefore is very appropriate. 1824 01:18:03,260 --> 01:18:04,140 Is lower? 1825 01:18:04,140 --> 01:18:06,080 That's a yes-no or a true-false question. 1826 01:18:06,080 --> 01:18:07,580 I don't know how it's implemented. 1827 01:18:07,580 --> 01:18:12,200 If I really care, I could go to CS50 Reference, 1828 01:18:12,200 --> 01:18:14,750 or I could use the man command on the IDE. 1829 01:18:14,750 --> 01:18:16,820 And I could actually check how this thing works. 1830 01:18:16,820 --> 01:18:18,500 But I do need one takeaway. 1831 01:18:18,500 --> 01:18:21,039 To use this, I need to use the ctype library. 1832 01:18:21,039 --> 01:18:24,080 So there's other libraries that we're now just scratching the surface of. 1833 01:18:24,080 --> 01:18:27,170 And you would only know they exist by reading documentation like that. 1834 01:18:27,170 --> 01:18:27,920 But you know what? 1835 01:18:27,920 --> 01:18:29,900 I can go even further. 1836 01:18:29,900 --> 01:18:30,530 You know what? 1837 01:18:30,530 --> 01:18:34,180 If some human years ago wrote a function to check if something is lower, 1838 01:18:34,180 --> 01:18:38,462 what did he or she probably do, as well, for me? 1839 01:18:38,462 --> 01:18:39,860 AUDIENCE: Uppercase? 1840 01:18:39,860 --> 01:18:42,480 DAVID MALAN: Yeah, isupper also does exist. 1841 01:18:42,480 --> 01:18:42,980 Yeah. 1842 01:18:42,980 --> 01:18:43,760 So spoiler here. 1843 01:18:43,760 --> 01:18:45,750 So isupper exists. 1844 01:18:45,750 --> 01:18:50,390 But if they checked if it's lower or is it upper, gonna just go out on a limb. 1845 01:18:50,390 --> 01:18:51,721 toupper? 1846 01:18:51,721 --> 01:18:52,220 Yeah. 1847 01:18:52,220 --> 01:18:54,650 So it turns out there's a function called toupper 1848 01:18:54,650 --> 01:18:56,990 that converts a letter to uppercase. 
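The ctype functions named here, `islower` and `toupper`, can be combined roughly as follows. This is a sketch and not the course's capitalize1.c: the function `capitalize` is a hypothetical name, and it modifies a plain C character array in place instead of printing, so that the result is easy to inspect.

```c
#include <ctype.h>
#include <string.h>

// Uppercase every lowercase letter of s in place.
void capitalize(char *s)
{
    for (int i = 0, n = (int) strlen(s); i < n; i++)
    {
        // The ctype functions take an int whose value must fit in an
        // unsigned char (or be EOF), hence the cast before calling them.
        unsigned char c = (unsigned char) s[i];

        if (islower(c))
        {
            s[i] = (char) toupper(c);
        }
    }
}
```

In standard C, `islower` returns a nonzero int rather than a `bool`, which an `if` treats as true. `toupper` returns its argument unchanged when no uppercase form exists, so the `islower` check is optional.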
1849 01:18:56,990 --> 01:19:00,170 And indeed, I can now leverage this in my third version of this program 1850 01:19:00,170 --> 01:19:01,130 as follows. 1851 01:19:01,130 --> 01:19:06,200 capitalize2.c gets even better designed still, if you will. 1852 01:19:06,200 --> 01:19:09,590 It's even shorter, fewer lines of code, easier to read, 1853 01:19:09,590 --> 01:19:11,300 fewer opportunities for bugs. 1854 01:19:11,300 --> 01:19:12,377 How do I solve it now? 1855 01:19:12,377 --> 01:19:14,210 I still iterate over each of the characters, 1856 01:19:14,210 --> 01:19:19,550 but I just blindly call toupper, toupper, toupper on every character 1857 01:19:19,550 --> 01:19:21,200 because I read the documentation. 1858 01:19:21,200 --> 01:19:24,064 If you pass a character to toupper that is already uppercase, 1859 01:19:24,064 --> 01:19:24,980 it just prints it out. 1860 01:19:24,980 --> 01:19:25,760 Doesn't change it. 1861 01:19:25,760 --> 01:19:28,680 If you pass in a punctuation symbol, it just passes it through. 1862 01:19:28,680 --> 01:19:32,100 But if you pass in a lowercase letter, it capitalizes it for you. 1863 01:19:32,100 --> 01:19:33,630 And so I can now kind of implement-- 1864 01:19:33,630 --> 01:19:36,584 I can lean on whoever implemented that before me. 1865 01:19:36,584 --> 01:19:37,500 It could have been me. 1866 01:19:37,500 --> 01:19:39,170 I could have wrote my own function called toupper. 1867 01:19:39,170 --> 01:19:41,503 But I don't need to because in the world of programming, 1868 01:19:41,503 --> 01:19:46,040 there exists libraries of code that other people have written for us 1869 01:19:46,040 --> 01:19:47,880 that we can leverage. 1870 01:19:47,880 --> 01:19:49,880 Any questions, then, on that? 1871 01:19:49,880 --> 01:19:50,642 Yeah. 1872 01:19:50,642 --> 01:19:54,585 AUDIENCE: So this method, you wouldn't be able to [INAUDIBLE]. 1873 01:19:54,585 --> 01:19:56,460 DAVID MALAN: This would be all of them, yeah.
1874 01:19:56,460 --> 01:20:00,470 So if I only wanted to capitalize Stelios's first-- 1875 01:20:00,470 --> 01:20:04,730 the first letter of his name, I probably wouldn't want the loop. 1876 01:20:04,730 --> 01:20:09,120 I would probably just want to capitalize [0], specifically, of the letters. 1877 01:20:09,120 --> 01:20:13,030 But I'd want to make sure that his name is at least one character long, lest he 1878 01:20:13,030 --> 01:20:15,860 just have hit, like, Enter accidentally or maliciously. 1879 01:20:15,860 --> 01:20:17,360 Absolutely. 1880 01:20:17,360 --> 01:20:21,270 So let's just dive in to one other detail here as follows. 1881 01:20:21,270 --> 01:20:25,550 Suppose that I want to actually know what the length of a string is. 1882 01:20:25,550 --> 01:20:28,610 I know that there exists this function called strlen. 1883 01:20:28,610 --> 01:20:31,880 But it turns out I can figure out lengths of strings for myself, too. 1884 01:20:31,880 --> 01:20:35,210 Let me go ahead and write a program called strlen itself. 1885 01:20:35,210 --> 01:20:38,970 But I'm not allowed in this example to use string length. 1886 01:20:38,970 --> 01:20:41,430 I'm going to go ahead and include the CS50 library. 1887 01:20:41,430 --> 01:20:44,150 Let me include stdio.h. 1888 01:20:44,150 --> 01:20:47,600 Let me go ahead and do int main void. 1889 01:20:47,600 --> 01:20:50,980 And now let me go ahead here and do string-- 1890 01:20:50,980 --> 01:20:54,440 bad style-- string s gets get_string. 1891 01:20:54,440 --> 01:20:55,700 Name. 1892 01:20:55,700 --> 01:20:59,330 And now let me go ahead and do int n equals 0. 1893 01:20:59,330 --> 01:21:02,600 Just give me a variable, call it n, set it equal to zero. 1894 01:21:02,600 --> 01:21:10,580 And then let me go ahead and while I'm not at the end of the string-- 1895 01:21:10,580 --> 01:21:12,380 also not valid code-- 1896 01:21:12,380 --> 01:21:13,550 n plus plus. 
1897 01:21:13,550 --> 01:21:17,090 I can use that plus plus trick that we've seen before for i plus plus. 1898 01:21:17,090 --> 01:21:21,680 And then I'm going to go ahead and print out whatever the value of that counter 1899 01:21:21,680 --> 01:21:23,780 is because I want in my loop to just count 1900 01:21:23,780 --> 01:21:26,120 the number of characters in Stelios's name 1901 01:21:26,120 --> 01:21:28,110 or whoever's name actually ran the program. 1902 01:21:28,110 --> 01:21:30,890 And just to be clear, this is what's called syntactic sugar, which 1903 01:21:30,890 --> 01:21:35,240 is a very sexy way of just saying this is shorthand notation for doing this, 1904 01:21:35,240 --> 01:21:38,090 which is just more boring-looking. 1905 01:21:38,090 --> 01:21:39,600 This does the exact same thing. 1906 01:21:39,600 --> 01:21:41,350 It's just a more succinct way of doing it. 1907 01:21:41,350 --> 01:21:43,190 And you'll see little features of languages like this just 1908 01:21:43,190 --> 01:21:44,960 to save us humans keystrokes. 1909 01:21:44,960 --> 01:21:47,279 This, of course, is not a solution to a problem. 1910 01:21:47,279 --> 01:21:49,070 How do I know I'm at the end of the string? 1911 01:21:49,070 --> 01:21:52,130 Well, it turns out we need to break the abstraction layer, so to speak, 1912 01:21:52,130 --> 01:21:54,410 of strings just a little bit. 1913 01:21:54,410 --> 01:21:58,050 So it turns out that in your computer, we have this piece of hardware-- 1914 01:21:58,050 --> 01:21:58,550 RAM. 1915 01:21:58,550 --> 01:21:59,630 And we saw this the other day. 1916 01:21:59,630 --> 01:22:01,820 And we talked a little bit about the limitations of computers 1917 01:22:01,820 --> 01:22:03,837 and the finite amount of memory that they have. 1918 01:22:03,837 --> 01:22:06,170 And if you think about all of the chips on this device-- 1919 01:22:06,170 --> 01:22:07,940 doesn't matter for today how this works. 
1920 01:22:07,940 --> 01:22:10,700 But just know that there's lots and lots of bytes that can 1921 01:22:10,700 --> 01:22:12,800 be stored in your computer's memory. 1922 01:22:12,800 --> 01:22:17,420 And you might have 1 billion bytes, 1 gigabyte, 2 billion bytes, 2 gigabytes. 1923 01:22:17,420 --> 01:22:21,650 But for our purposes today, just think of this RAM inside of your computer 1924 01:22:21,650 --> 01:22:25,940 as just a long list of available bytes-- lots of bits, zeroes and ones, 1925 01:22:25,940 --> 01:22:27,430 that you can change the values of. 1926 01:22:27,430 --> 01:22:29,721 And maybe it's kind of a grid, so there's lots of bytes 1927 01:22:29,721 --> 01:22:31,490 horizontally, lots of bytes vertically. 1928 01:22:31,490 --> 01:22:34,730 We can kind of number them all so that one of the bytes is 0, 1929 01:22:34,730 --> 01:22:37,880 and the other one way at the bottom is, like, the 2 billionth byte. 1930 01:22:37,880 --> 01:22:41,570 So just assume that we can number all of the bytes in our computer's memory. 1931 01:22:41,570 --> 01:22:44,810 Well, it turns out that when you type in Stelios's name, it of course 1932 01:22:44,810 --> 01:22:49,630 ends with an S. But it would probably be a stupid decision to just look for an S 1933 01:22:49,630 --> 01:22:51,946 when figuring out the length of someone's name 1934 01:22:51,946 --> 01:22:53,570 because it's not gonna work on my name. 1935 01:22:53,570 --> 01:22:57,600 It's not gonna work on Maria's name or any number of other people in the room. 1936 01:22:57,600 --> 01:23:00,470 So we don't know enough yet about what's going 1937 01:23:00,470 --> 01:23:03,260 on inside of the computer's memory. 1938 01:23:03,260 --> 01:23:07,130 It turns out that if you think of this grid now as your computer's RAM, 1939 01:23:07,130 --> 01:23:09,042 maybe top-left corner is byte zero. 
1940 01:23:09,042 --> 01:23:11,750 The one next to it is byte one, then byte two, then dot, dot, dot, 1941 01:23:11,750 --> 01:23:12,820 byte two billion. 1942 01:23:12,820 --> 01:23:18,110 So I'm just arbitrarily depicting it as a two-dimensional grid. 1943 01:23:18,110 --> 01:23:21,200 Turns out we need to know that there's this special character. 1944 01:23:21,200 --> 01:23:24,980 What C does for us even without our telling it to do, 1945 01:23:24,980 --> 01:23:29,240 it always puts a secret number at the end of any string the human types in. 1946 01:23:29,240 --> 01:23:32,030 It's specifically represented as backslash 0. 1947 01:23:32,030 --> 01:23:34,100 But that's just the special way, like backslash n 1948 01:23:34,100 --> 01:23:37,930 is special, of saying that is eight zero bits all together. 1949 01:23:37,930 --> 01:23:40,100 It's a special value, 0. 1950 01:23:40,100 --> 01:23:44,360 And so now that we have this so-called sentinel value, if you will-- 1951 01:23:44,360 --> 01:23:47,090 sentinel value means this is just special. 1952 01:23:47,090 --> 01:23:48,650 The human can't really type this. 1953 01:23:48,650 --> 01:23:52,010 Like, I can't actually type all zero bits easily on my keyboard 1954 01:23:52,010 --> 01:23:55,610 because honestly, even if you hit the number zero, that is technically 1955 01:23:55,610 --> 01:24:00,080 the character 0 because it turns out even numbers on your keyboard 1956 01:24:00,080 --> 01:24:01,760 map to different integers. 1957 01:24:01,760 --> 01:24:03,020 But more on that another time. 1958 01:24:03,020 --> 01:24:09,330 So 00000000 as bits are what that is. 1959 01:24:09,330 --> 01:24:12,470 And so if I write a program that calls get_string multiple times, 1960 01:24:12,470 --> 01:24:14,720 and Stelios is the first one to type in his name, 1961 01:24:14,720 --> 01:24:16,790 it might end up in memory looking like this. 1962 01:24:16,790 --> 01:24:19,582 But then suppose one other person types in their name, like Maria.
1963 01:24:19,582 --> 01:24:22,040 Her name is just going to fit in the next available memory, 1964 01:24:22,040 --> 01:24:24,530 but also be null terminated, so to speak. 1965 01:24:24,530 --> 01:24:28,520 The sentinel value is also called null, N-U-L. But that's just all zeros. 1966 01:24:28,520 --> 01:24:30,770 And then if someone else types in his or her name, 1967 01:24:30,770 --> 01:24:32,145 it's still going to fit in there. 1968 01:24:32,145 --> 01:24:33,499 So Zamyla, for instance-- 1969 01:24:33,499 --> 01:24:36,290 it wraps around, but again, this is an arbitrary artist's rendition 1970 01:24:36,290 --> 01:24:37,560 of my computer's memory. 1971 01:24:37,560 --> 01:24:40,970 Z-A-M-Y-L-A, backslash 0. 1972 01:24:40,970 --> 01:24:44,420 And I can keep typing in names until I'm just out of memory. 1973 01:24:44,420 --> 01:24:46,530 At that point, the program's going to crash, 1974 01:24:46,530 --> 01:24:49,700 or I'm gonna have an if condition that says too many things in memory. 1975 01:24:49,700 --> 01:24:52,550 Something's gonna have to stop at that point. 1976 01:24:52,550 --> 01:24:54,490 So what this means for my implementation, 1977 01:24:54,490 --> 01:24:56,500 ultimately, is the following. 1978 01:24:56,500 --> 01:25:02,080 I can now go ahead here and change this silly English to the following. 1979 01:25:02,080 --> 01:25:11,920 While the n'th character of the string does not equal, 1980 01:25:11,920 --> 01:25:14,590 quote, unquote, backslash 0. 1981 01:25:14,590 --> 01:25:18,070 And I'm using single quotes this time because recall from last time 1982 01:25:18,070 --> 01:25:21,370 that we use single quotes anytime we talk about single characters. 1983 01:25:21,370 --> 01:25:24,370 We use double quotes any time we're talking about strings. 1984 01:25:24,370 --> 01:25:29,920 And even though s is a string, s[n] is the n'th or i'th-- doesn't matter what 1985 01:25:29,920 --> 01:25:31,614 letter we use-- the n'th character. 
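The loop condition just stated, comparing the n'th character against `'\0'`, can be sketched as a small function. The name `my_strlen` is hypothetical (chosen to avoid clashing with the library's `strlen`), and `const char *` stands in for CS50's `string`.

```c
// Count the characters in s up to, but not including, the terminating '\0'.
int my_strlen(const char *s)
{
    int n = 0;

    // Single quotes: s[n] is one char, and '\0' is the all-zero-bits char.
    while (s[n] != '\0')
    {
        n++;
    }
    return n;
}
```

An empty string holds only the terminator, so the loop body never runs and the function returns 0.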
1986 01:25:31,614 --> 01:25:32,530 So that's a character. 1987 01:25:32,530 --> 01:25:34,640 And so we now need to use single quotes. 1988 01:25:34,640 --> 01:25:38,217 So this is really just doing the following-- it's initializing n to 0. 1989 01:25:38,217 --> 01:25:39,550 And then it's looking in memory. 1990 01:25:39,550 --> 01:25:41,590 And it's saying, is this backslash 0? 1991 01:25:41,590 --> 01:25:43,660 If not, increment n by 1. 1992 01:25:43,660 --> 01:25:44,740 Is this backslash 0? 1993 01:25:44,740 --> 01:25:45,450 No. 1994 01:25:45,450 --> 01:25:46,110 No. 1995 01:25:46,110 --> 01:25:46,720 No. 1996 01:25:46,720 --> 01:25:47,220 No. 1997 01:25:47,220 --> 01:25:48,490 Damn it. 1998 01:25:48,490 --> 01:25:49,480 No. 1999 01:25:49,480 --> 01:25:50,520 No. 2000 01:25:50,520 --> 01:25:51,500 Yes. 2001 01:25:51,500 --> 01:25:55,960 And at that point, I have 7 fingers up, or n is storing the value 7. 2002 01:25:55,960 --> 01:25:58,090 That's what my program is going to print out. 2003 01:25:58,090 --> 01:26:01,690 So now we have a complete program that counts 2004 01:26:01,690 --> 01:26:03,310 the number of characters in a string. 2005 01:26:03,310 --> 01:26:06,880 I don't need this program because strlen exists as a function. 2006 01:26:06,880 --> 01:26:13,090 But it's now a capability to which I have access. 2007 01:26:13,090 --> 01:26:15,850 Any questions, then, on what a string really 2008 01:26:15,850 --> 01:26:19,120 is underneath the hood as this sequence of characters 2009 01:26:19,120 --> 01:26:21,880 with a special null character at the end? 2010 01:26:21,880 --> 01:26:23,016 Yeah. 2011 01:26:23,016 --> 01:26:31,099 AUDIENCE: [INAUDIBLE] 2012 01:26:31,099 --> 01:26:32,390 DAVID MALAN: Ah, good question. 2013 01:26:32,390 --> 01:26:35,860 What about other data types, if I can rephrase it like ints and floats 2014 01:26:35,860 --> 01:26:36,610 and so forth? 2015 01:26:36,610 --> 01:26:38,230 Actually, strings are special. 
2016 01:26:38,230 --> 01:26:42,610 If I scroll back to the list of data types that C has, 2017 01:26:42,610 --> 01:26:46,230 for instance, most of these are of fixed length. 2018 01:26:46,230 --> 01:26:47,980 And this is why the compiler needs to know 2019 01:26:47,980 --> 01:26:51,021 what you're putting in them because the compiler and the computer in turn 2020 01:26:51,021 --> 01:26:52,414 need to know is it one byte? 2021 01:26:52,414 --> 01:26:53,080 Is it two bytes? 2022 01:26:53,080 --> 01:26:53,860 Is it four? 2023 01:26:53,860 --> 01:26:54,490 Is it eight? 2024 01:26:54,490 --> 01:26:55,870 How many bytes should I look at? 2025 01:26:55,870 --> 01:26:58,601 Strings have no predetermined length because, of course, 2026 01:26:58,601 --> 01:27:00,600 we don't know who's going to type in their name. 2027 01:27:00,600 --> 01:27:03,520 But an int, turns out, in most systems is always 2028 01:27:03,520 --> 01:27:07,480 going to be 32 bits or maybe 64 bits, or, equivalently, 2029 01:27:07,480 --> 01:27:11,290 four bytes or eight bytes because there's a one to eight ratio. 2030 01:27:11,290 --> 01:27:13,362 A bool is often one byte. 2031 01:27:13,362 --> 01:27:14,320 It's a little wasteful. 2032 01:27:14,320 --> 01:27:15,670 Even though technically you need one bit, 2033 01:27:15,670 --> 01:27:18,130 it's just easier to deal with eight-bit increments. 2034 01:27:18,130 --> 01:27:21,640 Chars are, by definition, eight bits or one byte. 2035 01:27:21,640 --> 01:27:24,050 So almost all of the data types are a fixed length. 2036 01:27:24,050 --> 01:27:26,920 So you don't need to have a special null character. 2037 01:27:26,920 --> 01:27:28,060 But strings you do. 2038 01:27:28,060 --> 01:27:29,830 Strings are special. 2039 01:27:29,830 --> 01:27:31,791 Other questions? 2040 01:27:31,791 --> 01:27:32,290 All right. 2041 01:27:32,290 --> 01:27:35,250 So what can we start to do with this? 
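The fixed sizes mentioned in the answer above can be checked on any given machine with `sizeof`. The sketch below is an assumption-laden illustration: `bits_in` and `print_sizes` are hypothetical helpers, and the exact numbers printed depend on the system, apart from `sizeof(char)`, which C defines to be 1.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

// Convert a size in bytes to a size in bits on this machine.
// CHAR_BIT is the number of bits per byte: 8 on common systems.
size_t bits_in(size_t bytes)
{
    return bytes * CHAR_BIT;
}

// Print how many bits a few fixed-length types occupy here.
void print_sizes(void)
{
    printf("char:  %zu bits\n", bits_in(sizeof(char)));
    printf("bool:  %zu bits\n", bits_in(sizeof(bool)));
    printf("int:   %zu bits\n", bits_in(sizeof(int)));
    printf("float: %zu bits\n", bits_in(sizeof(float)));
}
```

No such call exists for the length of a string's contents, which is why the terminating null character is needed there.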
2042 01:27:35,250 --> 01:27:39,730 Well, it turns out that this idea of thinking about things that are 2043 01:27:39,730 --> 01:27:45,490 back-to-back-to-back-to-back as being individually accessible is actually 2044 01:27:45,490 --> 01:27:46,447 a very powerful idea. 2045 01:27:46,447 --> 01:27:49,030 Because up until now, we've just had this list of data types-- 2046 01:27:49,030 --> 01:27:50,640 bool and float and char and int. 2047 01:27:50,640 --> 01:27:53,740 It's kind of a short list of very primitive things. 2048 01:27:53,740 --> 01:27:56,470 But it turns out if you want to write a program that doesn't just 2049 01:27:56,470 --> 01:28:00,550 keep asking for one name but asks for two people's names or 10 people's names 2050 01:28:00,550 --> 01:28:03,571 or asks for, as you asked earlier, the name or maybe their house 2051 01:28:03,571 --> 01:28:06,070 or their dorm or their phone number or their email address-- 2052 01:28:06,070 --> 01:28:07,900 a whole bunch of different values-- 2053 01:28:07,900 --> 01:28:11,230 it would be nice to kind of store multiple things together. 2054 01:28:11,230 --> 01:28:17,050 And one way you can store multiple strings is you could call one string s. 2055 01:28:17,050 --> 01:28:18,554 You can call the next string t. 2056 01:28:18,554 --> 01:28:21,220 You could call the next string whatever-- you could just come up 2057 01:28:21,220 --> 01:28:23,082 with arbitrary names for your strings. 2058 01:28:23,082 --> 01:28:24,790 But that's going to very quickly devolve. 2059 01:28:24,790 --> 01:28:27,730 Imagine, like, what the registrar uses here or at Yale 2060 01:28:27,730 --> 01:28:29,350 to actually keep track of students. 2061 01:28:29,350 --> 01:28:33,790 They don't have a computer program with thousands of variables inside of it. 2062 01:28:33,790 --> 01:28:35,980 They probably have a computer program for dealing 2063 01:28:35,980 --> 01:28:40,120 with course registrations with at least one variable called students.
2064 01:28:40,120 --> 01:28:43,390 And inside of that students variable can the registrar 2065 01:28:43,390 --> 01:28:46,450 fit one student, 10 students, thousands of students. 2066 01:28:46,450 --> 01:28:50,410 It can kind of grow to fill the number of values we actually care about. 2067 01:28:50,410 --> 01:28:52,540 And C isn't quite as powerful as that. 2068 01:28:52,540 --> 01:28:56,470 We'll need another language like Python or JavaScript to really get dynamism. 2069 01:28:56,470 --> 01:29:00,550 But for now, we do have the ability in C to represent multiple things 2070 01:29:00,550 --> 01:29:02,710 back to back to back to back to back in memory. 2071 01:29:02,710 --> 01:29:05,150 So not just characters in strings. 2072 01:29:05,150 --> 01:29:08,740 We can borrow that idea from strings and store, if we really want, 2073 01:29:08,740 --> 01:29:12,520 student, student, student, student like multiple strings 2074 01:29:12,520 --> 01:29:15,250 back to back instead of just individual characters. 2075 01:29:15,250 --> 01:29:18,480 And what that idea is called is an array. 2076 01:29:18,480 --> 01:29:21,700 An array is a contiguous chunk of memory, 2077 01:29:21,700 --> 01:29:24,280 something back to back to back-- literally physically 2078 01:29:24,280 --> 01:29:28,270 next to each other, typically, in the RAM that we've presented as hardware. 2079 01:29:28,270 --> 01:29:30,597 But it's not just character, character, character. 2080 01:29:30,597 --> 01:29:33,430 Maybe it's int, int int, int, int or string, string, string, string, 2081 01:29:33,430 --> 01:29:36,430 string or, more generally, student, student, student, student, student-- 2082 01:29:36,430 --> 01:29:38,230 multiple things back to back to back. 2083 01:29:38,230 --> 01:29:40,330 And so now we can actually give you a glimpse 2084 01:29:40,330 --> 01:29:45,010 of what this thing here is that we keep typing sort of on faith. 
2085 01:29:45,010 --> 01:29:48,790 Int main void literally says that your main programs 2086 01:29:48,790 --> 01:29:51,580 that you're about to start writing for pset one and beyond 2087 01:29:51,580 --> 01:29:55,180 will be returning an int, even if you don't do it yourself. 2088 01:29:55,180 --> 01:29:57,330 They're going to return by default 0, it turns out. 2089 01:29:57,330 --> 01:30:01,660 And we'll see before long why this is useful for a main function 2090 01:30:01,660 --> 01:30:05,479 to return a value, even though we humans will rarely, if ever, see that value. 2091 01:30:05,479 --> 01:30:07,270 But it is interesting to note that main can 2092 01:30:07,270 --> 01:30:12,880 take input, and not input in the sense of get_int and get_string and so forth. 2093 01:30:12,880 --> 01:30:15,280 You can actually provide your program with input 2094 01:30:15,280 --> 01:30:17,200 at the so-called command line. 2095 01:30:17,200 --> 01:30:21,490 All this time, I've been typing ./mario0, ./mario1, 2096 01:30:21,490 --> 01:30:23,170 and no words after that. 2097 01:30:23,170 --> 01:30:26,410 And yet we've shown you clang already, the compiler itself, 2098 01:30:26,410 --> 01:30:29,770 which can take in, like, -o and then -lcs50, 2099 01:30:29,770 --> 01:30:33,689 all of these additional key words that somehow influence its behavior. 2100 01:30:33,689 --> 01:30:35,980 So wouldn't it be nice if I could write a program where 2101 01:30:35,980 --> 01:30:38,680 I don't prompt the user eventually for his or her name. 2102 01:30:38,680 --> 01:30:41,350 Let me just let them type their name at the command line 2103 01:30:41,350 --> 01:30:44,380 and hit Enter once and be done with it, just like clang 2104 01:30:44,380 --> 01:30:46,480 is just one long command, and you're done with it. 2105 01:30:46,480 --> 01:30:47,770 There's no prompts. 2106 01:30:47,770 --> 01:30:51,500 Well, we can do this if we change void to this. 
2107 01:30:51,500 --> 01:30:55,690 And it's a mouthful, but there is an alternative version of main 2108 01:30:55,690 --> 01:30:57,400 that does not just take zero arguments. 2109 01:30:57,400 --> 01:30:59,650 That's what the key word void all this time has meant. 2110 01:30:59,650 --> 01:31:01,960 It just means main takes no input by default. 2111 01:31:01,960 --> 01:31:05,510 You have to prompt the user explicitly with get_int or get_string or whatever. 2112 01:31:05,510 --> 01:31:09,400 But there's an alternative second version of main in C 2113 01:31:09,400 --> 01:31:11,800 that takes two inputs. 2114 01:31:11,800 --> 01:31:13,960 And you don't have to provide them explicitly. 2115 01:31:13,960 --> 01:31:15,940 We'll see how to use this in a second. 2116 01:31:15,940 --> 01:31:17,920 Main can also be handed two inputs. 2117 01:31:17,920 --> 01:31:22,660 One is an int, and one is an array of strings. 2118 01:31:22,660 --> 01:31:25,210 The int is the total number of words that the human 2119 01:31:25,210 --> 01:31:27,820 has typed at their keyboard. 2120 01:31:27,820 --> 01:31:30,380 The argv, argument vector, by convention, 2121 01:31:30,380 --> 01:31:34,580 though we could call it anything we want, that is an array of words 2122 01:31:34,580 --> 01:31:38,660 that the user typed at the prompt before hitting Enter. 2123 01:31:38,660 --> 01:31:41,910 And so this is useful in the following way. 2124 01:31:41,910 --> 01:31:45,960 I'm going to go ahead and in today's source code open up an example called 2125 01:31:45,960 --> 01:31:47,510 argv-- for argument vector-- 2126 01:31:47,510 --> 01:31:51,170 0 as follows. 2127 01:31:51,170 --> 01:31:54,890 In argv0, there's not all that much going on. 2128 01:31:54,890 --> 01:31:57,680 And if you at least kind of take on faith the concept here, 2129 01:31:57,680 --> 01:31:59,360 you can perhaps infer what's going on. 
2130 01:31:59,360 --> 01:32:02,840 So I've changed what main looks like on line six, the signature of main, 2131 01:32:02,840 --> 01:32:03,990 so to speak. 2132 01:32:03,990 --> 01:32:05,330 And then I'm asking a question. 2133 01:32:05,330 --> 01:32:10,700 If argc equals equals 2, then print out "Hello, something." 2134 01:32:10,700 --> 01:32:13,940 Otherwise, just print out the hardcoded "hello, world." 2135 01:32:13,940 --> 01:32:18,860 So it looks like argv[1] is kind of being treated like we were treating 2136 01:32:18,860 --> 01:32:20,150 strings a moment ago. 2137 01:32:20,150 --> 01:32:22,640 But this is the special syntax that's new. 2138 01:32:22,640 --> 01:32:25,130 If you use square brackets like this, like I've done, 2139 01:32:25,130 --> 01:32:28,460 with no numbers inside, that's like telling the computer, hey, computer, 2140 01:32:28,460 --> 01:32:33,440 this variable argv is going to be an array of some length of strings. 2141 01:32:33,440 --> 01:32:34,040 Why strings? 2142 01:32:34,040 --> 01:32:36,873 Because string is the word immediately to the left-- string argv[]. 2143 01:32:36,873 --> 01:32:38,686 2144 01:32:38,686 --> 01:32:41,060 Now, I don't know how the strings are gonna get in there. 2145 01:32:41,060 --> 01:32:42,810 The computer's gonna do that for me. 2146 01:32:42,810 --> 01:32:44,180 But it gives me this capability. 2147 01:32:44,180 --> 01:32:49,240 Let me go ahead and compile this program as follows-- make argv0. 2148 01:32:49,240 --> 01:32:52,160 ./argv0. 2149 01:32:52,160 --> 01:32:52,880 Hello, world. 2150 01:32:52,880 --> 01:32:54,200 Uninteresting. 2151 01:32:54,200 --> 01:32:59,092 But if I now type in my name at the prompt and hit Enter, now it's dynamic. 2152 01:32:59,092 --> 01:33:00,050 So what must this mean? 2153 01:33:00,050 --> 01:33:02,180 Even if the syntax is a little new, we can kind of 2154 01:33:02,180 --> 01:33:04,420 infer now what this must be doing.
2155 01:33:04,420 --> 01:33:07,040 Argc happens to stand for argument count. 2156 01:33:07,040 --> 01:33:10,967 So argc equaling two apparently implies that the human typed two words 2157 01:33:10,967 --> 01:33:14,300 at the prompt-- the name of the program, and then whatever else he or she typed. 2158 01:33:14,300 --> 01:33:16,850 Meanwhile, argv-- argument vector-- 2159 01:33:16,850 --> 01:33:20,317 is the variable that you can use to go get the first word or the second word 2160 01:33:20,317 --> 01:33:22,400 or, if there are more, the third and fourth words. 2161 01:33:22,400 --> 01:33:27,410 In fact, if I kind of change this manually, what should probably be, 2162 01:33:27,410 --> 01:33:30,508 by that logic, in argv[0]? 2163 01:33:30,508 --> 01:33:31,467 AUDIENCE: [INAUDIBLE] 2164 01:33:31,467 --> 01:33:33,550 DAVID MALAN: Yeah, the name of the program, right? 2165 01:33:33,550 --> 01:33:34,280 So let me see. 2166 01:33:34,280 --> 01:33:35,220 So make argv0. 2167 01:33:35,220 --> 01:33:37,870 2168 01:33:37,870 --> 01:33:41,180 ./argv0 David. 2169 01:33:41,180 --> 01:33:41,970 Hello-- OK. 2170 01:33:41,970 --> 01:33:44,750 I mean, it's stupid-looking, but that's all I'm doing. 2171 01:33:44,750 --> 01:33:49,670 I could be a little bold and say what is in the 100th location of this array 2172 01:33:49,670 --> 01:33:51,980 or list, as you can also think of it? 2173 01:33:51,980 --> 01:33:53,640 Make argv0. 2174 01:33:53,640 --> 01:33:54,930 ./argv0 David. 2175 01:33:54,930 --> 01:33:56,870 Whoa. 2176 01:33:56,870 --> 01:33:57,710 That is bad. 2177 01:33:57,710 --> 01:34:01,040 And get used to this because it will start to happen with greater frequency. 2178 01:34:01,040 --> 01:34:05,840 Segmentation fault is a very cryptic way of saying you touched memory, RAM, 2179 01:34:05,840 --> 01:34:07,070 that you should not have. 2180 01:34:07,070 --> 01:34:08,945 And you can kind of think of what this means.
2181 01:34:08,945 --> 01:34:14,360 So if argv[0]-- let me pull up my picture of an array. 2182 01:34:14,360 --> 01:34:20,300 If my array looks like this, and argv[0] is here, and that was safe to print, 2183 01:34:20,300 --> 01:34:21,134 and argv[1] is here. 2184 01:34:21,134 --> 01:34:22,091 That was safe to print. 2185 01:34:22,091 --> 01:34:22,730 It was my name. 2186 01:34:22,730 --> 01:34:23,900 And argv-- what did I do-- 2187 01:34:23,900 --> 01:34:26,630 100, it's like way over here. 2188 01:34:26,630 --> 01:34:27,980 I don't know what's over here. 2189 01:34:27,980 --> 01:34:31,040 And indeed, touching that memory was very bad. 2190 01:34:31,040 --> 01:34:32,450 The program crashed. 2191 01:34:32,450 --> 01:34:35,869 And segmentation fault is an allusion to how computers lay out memory. 2192 01:34:35,869 --> 01:34:38,660 You've got like a segment of memory here, a segment of memory here, 2193 01:34:38,660 --> 01:34:39,701 a segment of memory here. 2194 01:34:39,701 --> 01:34:42,290 Segmentation fault means you touched a chunk of memory 2195 01:34:42,290 --> 01:34:46,310 that was not yours to use, to change or to even view. 2196 01:34:46,310 --> 01:34:48,620 So I got lucky, though-- well, I didn't get lucky. 2197 01:34:48,620 --> 01:34:50,180 I could sometimes see garbage values. 2198 01:34:50,180 --> 01:34:51,721 Let me be a little more conservative. 2199 01:34:51,721 --> 01:34:54,770 Let me put [2], which is just one past what I typed in. 2200 01:34:54,770 --> 01:34:56,910 It's sometimes undefined behavior. 2201 01:34:56,910 --> 01:34:58,670 I don't know what I'm gonna get. 2202 01:34:58,670 --> 01:34:59,240 Null. 2203 01:34:59,240 --> 01:35:02,110 So there's some funky characters there or zeros there. 2204 01:35:02,110 --> 01:35:04,760 But now you're playing with fire, so to speak. 2205 01:35:04,760 --> 01:35:07,130 These are logical bugs in my program. 
2206 01:35:07,130 --> 01:35:11,990 But it is OK to check if argc is two, then 2207 01:35:11,990 --> 01:35:16,370 it's OK to look at 0 and 1, two things and only two things. 2208 01:35:16,370 --> 01:35:18,910 Any questions on that? 2209 01:35:18,910 --> 01:35:19,410 All right. 2210 01:35:19,410 --> 01:35:22,850 So where, in what domain, is this kind of thing helpful? 2211 01:35:22,850 --> 01:35:26,270 And there's a couple more examples of argv that you can look at online. 2212 01:35:26,270 --> 01:35:29,300 Turns out that in the world of cryptography, 2213 01:35:29,300 --> 01:35:31,250 this stuff really starts to get interesting. 2214 01:35:31,250 --> 01:35:34,081 So the world of cryptography is all about scrambling information. 2215 01:35:34,081 --> 01:35:35,830 Maybe back in the day in grade school, you 2216 01:35:35,830 --> 01:35:38,870 might have passed notes to a friend or a crush that you had in the classroom. 2217 01:35:38,870 --> 01:35:41,870 And if you were really clever, or your teacher was really adversarial, 2218 01:35:41,870 --> 01:35:45,350 you might have to encode your message so that you're not just writing, 2219 01:35:45,350 --> 01:35:46,940 like, "I love you" or whatever. 2220 01:35:46,940 --> 01:35:50,090 But you instead change all the A's to B's, and all 2221 01:35:50,090 --> 01:35:53,510 the B's to C's or hopefully something a little more cryptic than that so that 2222 01:35:53,510 --> 01:35:57,050 the teacher can't just change all the B's to A's and all the C's to B's. 2223 01:35:57,050 --> 01:35:58,910 But you kind of scrambled the words. 2224 01:35:58,910 --> 01:36:03,020 But you scrambled the words, perhaps, in such a way 2225 01:36:03,020 --> 01:36:07,070 that it's reversible by the recipients, the recipient 2226 01:36:07,070 --> 01:36:08,480 of your encrypted message.
2227 01:36:08,480 --> 01:36:13,100 So to encrypt information means to convert it into some other format, 2228 01:36:13,100 --> 01:36:17,540 from what's called plaintext to ciphertext, which sounds really cool, 2229 01:36:17,540 --> 01:36:19,740 and it's just the scrambled version. 2230 01:36:19,740 --> 01:36:20,850 But it's not random. 2231 01:36:20,850 --> 01:36:22,724 It's got to follow a pattern or, if you will, 2232 01:36:22,724 --> 01:36:26,480 an algorithm so that he or she on the other end can reverse the algorithm 2233 01:36:26,480 --> 01:36:27,380 and undo it. 2234 01:36:27,380 --> 01:36:29,720 Now, in the simple example I proposed, A becomes 2235 01:36:29,720 --> 01:36:33,810 B. B becomes C. What is the secret that you and your crush know? 2236 01:36:33,810 --> 01:36:35,540 It's probably just the number one. 2237 01:36:35,540 --> 01:36:38,740 He or she has to just know, if you added 1 to the letters, 2238 01:36:38,740 --> 01:36:40,589 that they should subtract 1 from the letters. 2239 01:36:40,589 --> 01:36:43,880 And hopefully they know that if you hit Z, you should probably wrap around to A 2240 01:36:43,880 --> 01:36:46,379 and not get into weird punctuation or something like that. 2241 01:36:46,379 --> 01:36:48,750 So you can keep an algorithm as simple as that. 2242 01:36:48,750 --> 01:36:52,550 So we can think of cryptography, really, as just an example of problem-solving. 2243 01:36:52,550 --> 01:36:56,030 You want to send a message from someone, yourself, 2244 01:36:56,030 --> 01:36:58,479 to someone else, maybe over a very insecure medium 2245 01:36:58,479 --> 01:37:00,020 like passing a note through the room. 2246 01:37:00,020 --> 01:37:02,790 And you want only one person to know how to access it.
2247 01:37:02,790 --> 01:37:05,060 That's like providing inputs, and you want outputs-- 2248 01:37:05,060 --> 01:37:07,220 your plaintext and your ciphertext-- so that no one 2249 01:37:07,220 --> 01:37:09,320 can understand it except you and the recipient. 2250 01:37:09,320 --> 01:37:11,131 So it turns out that cryptography-- 2251 01:37:11,131 --> 01:37:14,130 there's different forms of it, but perhaps the simplest looks like this. 2252 01:37:14,130 --> 01:37:17,930 There's two inputs, the plaintext, the message you want to actually send, 2253 01:37:17,930 --> 01:37:22,119 and then the key, which might be a number like 1 or 2 or 25 or 26. 2254 01:37:22,119 --> 01:37:24,410 And more than that's probably silly because you're just 2255 01:37:24,410 --> 01:37:26,790 wrapping around the alphabet even more, so to speak. 2256 01:37:26,790 --> 01:37:29,960 But the output is going to be something called ciphertext. 2257 01:37:29,960 --> 01:37:31,880 And when your crush receives this message, 2258 01:37:31,880 --> 01:37:34,520 he or she really just needs to reverse the process. 2259 01:37:34,520 --> 01:37:35,669 They have to know the key. 2260 01:37:35,669 --> 01:37:38,960 Otherwise, they're going to be guessing all day long what your message actually 2261 01:37:38,960 --> 01:37:39,560 was. 2262 01:37:39,560 --> 01:37:42,920 But so long as you know the secret in advance, you can do this. 2263 01:37:42,920 --> 01:37:44,480 Now, of course, there's a gotcha. 2264 01:37:44,480 --> 01:37:47,750 You have to be on speaking terms with this person you're crushing on 2265 01:37:47,750 --> 01:37:51,660 because he or she needs to know what the key is in advance. 2266 01:37:51,660 --> 01:37:54,420 Otherwise, you're just sending them nonsensical values. 2267 01:37:54,420 --> 01:37:56,570 So that's kind of, too, a catch-22. 
2268 01:37:56,570 --> 01:38:00,080 In order to send a secret message from A to B, 2269 01:38:00,080 --> 01:38:04,231 A and B need to be able to confer in advance and agree on this secret. 2270 01:38:04,231 --> 01:38:06,230 But if you need to agree in advance on a secret, 2271 01:38:06,230 --> 01:38:09,660 why don't you just use that time to send the message directly to the person? 2272 01:38:09,660 --> 01:38:10,160 Right? 2273 01:38:10,160 --> 01:38:11,210 So there's this disconnect. 2274 01:38:11,210 --> 01:38:13,670 And we'll come back to this before long because most of us 2275 01:38:13,670 --> 01:38:16,487 probably don't know someone who works at, like, amazon.com. 2276 01:38:16,487 --> 01:38:18,320 And yet when I buy something on Amazon, I've 2277 01:38:18,320 --> 01:38:20,570 been told all these years that it's secure. 2278 01:38:20,570 --> 01:38:21,440 It's encrypted. 2279 01:38:21,440 --> 01:38:23,750 My credit card, my name, and all of that are somehow 2280 01:38:23,750 --> 01:38:27,440 encrypted between me and Amazon in Seattle or wherever their servers are. 2281 01:38:27,440 --> 01:38:29,030 But I don't know anyone there. 2282 01:38:29,030 --> 01:38:30,880 And yet somehow, cryptography still works. 2283 01:38:30,880 --> 01:38:34,910 So this type of cryptography is just one called secret-key cryptography. 2284 01:38:34,910 --> 01:38:37,500 But there's public-key cryptography and yet other things.
2285 01:38:37,500 --> 01:38:40,040 And so what you'll find in problem set two in particular 2286 01:38:40,040 --> 01:38:42,264 is you'll have an opportunity to explore this world, 2287 01:38:42,264 --> 01:38:44,930 whereby you'll write software that encrypts and then, hopefully, 2288 01:38:44,930 --> 01:38:48,680 decrypts information and even, if you're among those more comfortable, 2289 01:38:48,680 --> 01:38:51,470 an opportunity to try writing software that 2290 01:38:51,470 --> 01:38:53,900 takes passwords that are encrypted-- 2291 01:38:53,900 --> 01:38:55,760 or, more properly, hashed, so to speak. 2292 01:38:55,760 --> 01:38:57,020 More on that before long-- 2293 01:38:57,020 --> 01:39:00,440 and you try to crack those passwords, actually figure out 2294 01:39:00,440 --> 01:39:02,720 what the passwords actually were. 2295 01:39:02,720 --> 01:39:06,020 And it all boils down to, ultimately, in the context of C, 2296 01:39:06,020 --> 01:39:09,300 taking as input a message, like a plaintext, 2297 01:39:09,300 --> 01:39:11,750 and somehow converting it to ciphertext by manipulating 2298 01:39:11,750 --> 01:39:15,480 those individual characters, or, if you're the recipient, vice versa. 2299 01:39:15,480 --> 01:39:18,860 And I like to show a clip from, frankly, a film you can watch, like, 2300 01:39:18,860 --> 01:39:22,580 literally every hour on the hour around the holidays, "A Christmas 2301 01:39:22,580 --> 01:39:25,581 Story," because it has an example of a very simple form of cryptography. 2302 01:39:25,581 --> 01:39:27,705 If you ever saw this movie, this is little Ralphie. 2303 01:39:27,705 --> 01:39:30,080 And he's really excited because over months or whatever, 2304 01:39:30,080 --> 01:39:32,900 he saves up and sends in, like, all of these, like, 2305 01:39:32,900 --> 01:39:34,730 cereal box covers or something like that, 2306 01:39:34,730 --> 01:39:37,850 and gets back, finally, this secret decoder ring. 
2307 01:39:37,850 --> 01:39:40,610 And the secret decoder ring is kind of a nice mental model 2308 01:39:40,610 --> 01:39:43,350 to have for the type of cryptography I'm proposing here, 2309 01:39:43,350 --> 01:39:44,930 this sort of rotational idea-- 2310 01:39:44,930 --> 01:39:47,630 A becomes B. B becomes C. Because if you imagine 2311 01:39:47,630 --> 01:39:50,000 a ring that has another ring on the outside, 2312 01:39:50,000 --> 01:39:53,679 you can kind of line up the A's and Z's, so to speak, differently. 2313 01:39:53,679 --> 01:39:55,220 And that's what he was saving up for. 2314 01:39:55,220 --> 01:39:57,595 So I thought we'd take just a moment to look at this clip 2315 01:39:57,595 --> 01:39:59,464 to inspire one of the problems ahead. 2316 01:39:59,464 --> 01:40:00,130 [VIDEO PLAYBACK] 2317 01:40:00,130 --> 01:40:03,390 - Be it known to all and sundry that Ralph Parker is hereby 2318 01:40:03,390 --> 01:40:05,862 appointed a member of the Little Orphan Annie secret circle 2319 01:40:05,862 --> 01:40:08,570 and is entitled to all the honors and benefits occurring thereto. 2320 01:40:08,570 --> 01:40:09,642 Too 2321 01:40:09,642 --> 01:40:12,200 - Signed Little Orphan Annie! 2322 01:40:12,200 --> 01:40:14,890 Countersigned Pierre Andre! 2323 01:40:14,890 --> 01:40:16,340 In ink! 2324 01:40:16,340 --> 01:40:18,920 Honors and benefits already, at the age of nine. 2325 01:40:18,920 --> 01:40:22,364 2326 01:40:22,364 --> 01:40:25,711 - Let's go overboard! 2327 01:40:25,711 --> 01:40:26,210 - Come on. 2328 01:40:26,210 --> 01:40:27,200 Let's get on with it. 2329 01:40:27,200 --> 01:40:31,172 I don't need all that jazz about smugglers and pirates. 2330 01:40:31,172 --> 01:40:33,590 - Listen tomorrow night for the concluding adventure 2331 01:40:33,590 --> 01:40:35,840 of the black pirate ship. 2332 01:40:35,840 --> 01:40:41,810 Now it's time for Annie's secret message for you members of the secret circle. 
2333 01:40:41,810 --> 01:40:45,530 Remember, kids, only members of Annie's secret circle 2334 01:40:45,530 --> 01:40:48,160 can decode Annie's secret message. 2335 01:40:48,160 --> 01:40:52,080 Remember, Annie is depending on you. 2336 01:40:52,080 --> 01:40:54,860 Set your pins to B2. 2337 01:40:54,860 --> 01:40:56,892 Here is the message. 2338 01:40:56,892 --> 01:40:58,390 12, 11-- 2339 01:40:58,390 --> 01:40:59,600 - I am in. 2340 01:40:59,600 --> 01:41:01,570 My first secret meeting. 2341 01:41:01,570 --> 01:41:05,300 - --14, 11, 18, 16-- 2342 01:41:05,300 --> 01:41:07,730 - Oh, Pierre was in great voice tonight. 2343 01:41:07,730 --> 01:41:10,820 I could tell that tonight's message was really important. 2344 01:41:10,820 --> 01:41:12,680 - --3, 25. 2345 01:41:12,680 --> 01:41:14,720 That's a message from Annie herself. 2346 01:41:14,720 --> 01:41:15,960 Remember, don't tell anyone. 2347 01:41:15,960 --> 01:41:20,780 2348 01:41:20,780 --> 01:41:24,870 - 90 seconds later I'm in the only room in the house where a boy of nine 2349 01:41:24,870 --> 01:41:29,050 could sit in privacy and decode. 2350 01:41:29,050 --> 01:41:29,740 Aha! 2351 01:41:29,740 --> 01:41:31,490 B! 2352 01:41:31,490 --> 01:41:33,340 I went to the next. 2353 01:41:33,340 --> 01:41:36,596 E. The first word is "be!" 2354 01:41:36,596 --> 01:41:38,880 S. It was coming easier now. 2355 01:41:38,880 --> 01:41:40,614 U. 2356 01:41:40,614 --> 01:41:42,542 - Aw, come on, Ralphie! 2357 01:41:42,542 --> 01:41:43,988 I got to go! 2358 01:41:43,988 --> 01:41:46,398 - I'll be right down, Ma! 2359 01:41:46,398 --> 01:41:47,362 Gee, whiz. 2360 01:41:47,362 --> 01:41:50,254 2361 01:41:50,254 --> 01:41:53,510 - T. O! 2362 01:41:53,510 --> 01:41:55,900 "Be sure to"-- be sure to what? 2363 01:41:55,900 --> 01:41:57,870 What was Little Orphan Annie trying to say? 2364 01:41:57,870 --> 01:41:58,670 "Be sure to" what? 2365 01:41:58,670 --> 01:42:00,070 - Ralphie, Randy has got to go. 
2366 01:42:00,070 --> 01:42:01,504 Will you please come out? 2367 01:42:01,504 --> 01:42:02,810 - All right, Ma! 2368 01:42:02,810 --> 01:42:04,910 I'll be right out! 2369 01:42:04,910 --> 01:42:06,740 - I was getting closer now. 2370 01:42:06,740 --> 01:42:08,150 The tension was terrible. 2371 01:42:08,150 --> 01:42:09,652 What was it? 2372 01:42:09,652 --> 01:42:11,610 The fate of the planet may hang in the balance. 2373 01:42:11,610 --> 01:42:12,110 [KNOCKING] 2374 01:42:12,110 --> 01:42:15,450 - Ralph, Randy's got to go! 2375 01:42:15,450 --> 01:42:18,330 - I'll be right out, for crying out loud! 2376 01:42:18,330 --> 01:42:20,013 - Gee, almost there! 2377 01:42:20,013 --> 01:42:21,220 My fingers flew. 2378 01:42:21,220 --> 01:42:22,940 My mind was a steel trap. 2379 01:42:22,940 --> 01:42:24,650 Every pore vibrated. 2380 01:42:24,650 --> 01:42:26,744 It was almost clear! 2381 01:42:26,744 --> 01:42:27,244 Yes! 2382 01:42:27,244 --> 01:42:27,721 Yes! 2383 01:42:27,721 --> 01:42:28,221 Yes! 2384 01:42:28,221 --> 01:42:29,152 Yes! 2385 01:42:29,152 --> 01:42:34,890 - "Be sure to drink your Ovaltine." 2386 01:42:34,890 --> 01:42:35,520 Ovaltine? 2387 01:42:35,520 --> 01:42:39,532 2388 01:42:39,532 --> 01:42:41,380 A crummy commercial? 2389 01:42:41,380 --> 01:42:44,158 2390 01:42:44,158 --> 01:42:45,445 Son of a bitch! 2391 01:42:45,445 --> 01:42:46,740 [END PLAYBACK] 2392 01:42:46,740 --> 01:42:48,190 DAVID MALAN: That's it for CS50. 2393 01:42:48,190 --> 01:42:49,600 We'll see you next time. 2394 01:42:49,600 --> 01:42:52,350 [APPLAUSE] 2395 01:42:52,350 --> 01:42:54,649