1 00:00:00,000 --> 00:00:02,670 Section Problem Set 2: Hacker Edition 2 00:00:02,670 --> 00:00:04,910 Rob Bowden, Harvard University 3 00:00:04,910 --> 00:00:07,410 This is CS50. CS50.TV 4 00:00:07,410 --> 00:00:15,770 So, I'm Rob. I'm a senior in Kirkland. This is my third year TFing CS50. 5 00:00:15,770 --> 00:00:22,220 It is the first time that we are changing from the traditional-lecture-style section, 6 00:00:22,220 --> 00:00:25,610 where we just kind of review what happened in lecture and then you guys ask questions, 7 00:00:25,610 --> 00:00:32,250 now to being a lot more problem-based, where we use Spaces, and-- 8 00:00:32,250 --> 00:00:37,410 Oh, so the idea is to go to that link I sent you and then you'll be in my Space. 9 00:00:37,410 --> 00:00:42,410 Does anyone not have a laptop? Okay. 10 00:00:42,410 --> 00:00:47,050 So we're going to be using this, and we're going to be doing problems live in section 11 00:00:47,050 --> 00:00:50,740 and discussing them and figuring out what's wrong 12 00:00:50,740 --> 00:00:56,390 and I might pull up some of your code, and I might discuss your ideas. 13 00:00:56,390 --> 00:01:02,140 So has anyone had difficulty? 14 00:01:02,140 --> 00:01:07,000 You can chat on the side; I don't know if we'll have reason for that. 15 00:01:07,000 --> 00:01:12,270 Now, like the previous supersection, if you were at that class, you know what that's about. 16 00:01:12,270 --> 00:01:19,200 On all of P sets there's going to be these sections. 17 00:01:19,200 --> 00:01:22,550 So P-set 2, specifications, I guess you saw it on P-set 1 already. 18 00:01:22,550 --> 00:01:27,400 But we can look at P-set 2 for what we're going to be going over today. 19 00:01:27,400 --> 00:01:29,460 And you'll see a section of questions. 20 00:01:29,460 --> 00:01:37,530 So this will be in all of the P-sets; there'll be a section of questions. 21 00:01:37,530 --> 00:01:41,340 So far we've said, "Consider this an opportunity to practice." 22 00:01:41,340 --> 00:01:44,940 You won't be asked to submit this program. 23 00:01:44,940 --> 00:01:48,480 The idea is that these are supposed to kind of help you get started with the problem set. 24 00:01:48,480 --> 00:01:53,220 I guess on the Hacker edition, a lot of them are supposed to just be new, interesting things to learn. 25 00:01:53,220 --> 00:01:58,590 They may not be directly applicable to the problem set. 26 00:01:58,590 --> 00:02:01,810 And right now we're not having you submit them, but in theory, 27 00:02:01,810 --> 00:02:07,480 for later problem sets, you might submit them, and thus you can either come to section 28 00:02:07,480 --> 00:02:10,380 or watch the section to get the answers, or you can just get them on your own 29 00:02:10,380 --> 00:02:16,350 if you don't feel like enjoying my presence. 30 00:02:16,350 --> 00:02:21,010 So the--I think this is the first one. 31 00:02:21,010 --> 00:02:29,280 Oh. Also, under these sections of questions we also have you ask questions about the shorts. 32 00:02:29,280 --> 00:02:33,440 So I guess, in theory, you're supposed to watch these before coming to section, 33 00:02:33,440 --> 00:02:38,550 but it's fine if you don't; we'll go over them anyway. 34 00:02:38,550 --> 00:02:42,590 So we can start with these: "How does a while loop differ from a do-while loop? 35 00:02:42,590 --> 00:02:46,210 When is the latter particularly useful?" 36 00:02:46,210 --> 00:02:49,390 So anyone have any--? 37 00:02:49,390 --> 00:02:52,730 [Student] The do-while loop will always execute at least once. 38 00:02:52,730 --> 00:03:02,950 Yes. So that is the difference. A while loop--I'll just do it on here--while loop, we have the condition 39 00:03:02,950 --> 00:03:19,760 right here, whereas a do-while, you don't have a condition until we get down here. 40 00:03:19,760 --> 00:03:24,130 And so, when your program's executing, and it gets to the while loop, 41 00:03:24,130 --> 00:03:26,380 it immediately checks if this condition is true. 42 00:03:26,380 --> 00:03:30,710 If that condition is not true, it will just skip over the loop entirely. 43 00:03:30,710 --> 00:03:34,390 Do-while loop, as the program is executing, it gets to the "do." 44 00:03:34,390 --> 00:03:37,920 Nothing happens at this point, just continues executing. 45 00:03:37,920 --> 00:03:42,690 Then when it hits the "while," if the condition is true, it'll loop back and do it again 46 00:03:42,690 --> 00:03:46,730 and again and again until the condition is not true and then just falls through. 47 00:03:46,730 --> 00:03:50,600 So, the difference being, that this can skip right from the very start. 48 00:03:50,600 --> 00:03:56,770 This necessarily executes once and then may execute more times if the condition is still true. 49 00:03:56,770 --> 00:04:03,720 So the while loop will only do it once, or--the while loop--we may not need to do it at all, 50 00:04:03,720 --> 00:04:07,900 since as soon as we get to it, if the condition is false, we'll just skip right over it. 51 00:04:07,900 --> 00:04:11,770 Whereas do-while loop, we will execute it once, necessarily. 52 00:04:11,770 --> 00:04:14,560 Then, when we get to the condition, we check if it's true or false. 53 00:04:14,560 --> 00:04:19,790 If it's true, we'll do it again; if it's false, we'll just continue going. 54 00:04:19,790 --> 00:04:24,680 So when is the latter particularly useful? 55 00:04:24,680 --> 00:04:31,190 So I can say that in the entirety of the 4 years, 3 years, whatever, 56 00:04:31,190 --> 00:04:38,780 that I've been programming, I have used this, like, under 10 times. 57 00:04:38,780 --> 00:04:43,140 And probably 5 of them are in CS50 when we're introducing do-while loops. 58 00:04:43,140 --> 00:04:47,510 So when do you used do-while loops? 59 00:04:47,510 --> 00:04:49,510 When is the--yeah? 60 00:04:49,510 --> 00:04:53,180 [Student] When you're trying to get user input, or something you want to check-- 61 00:04:53,180 --> 00:04:59,700 Yeah. So do-while loops, user input is the big one. 62 00:04:59,700 --> 00:05:03,160 That's why on the first couple problem sets, when you want to ask the user, like, 63 00:05:03,160 --> 00:05:08,520 "give me a string," you can't continue until you get that string. 64 00:05:08,520 --> 00:05:12,980 And so you, necessarily, need to ask for the string at least once. 65 00:05:12,980 --> 00:05:16,950 But then if they answer something bad, then you need to loop back and ask again. 66 00:05:16,950 --> 00:05:20,810 But other than user input, it's very rare that I encounter a case 67 00:05:20,810 --> 00:05:27,170 where I want to loop "at least once" but possibly more. 68 00:05:27,170 --> 00:05:33,370 Questions or--? Has anyone used a do-while loop anywhere else? 69 00:05:33,370 --> 00:05:36,780 Okay. So the next one is, "What does undeclared identifier 70 00:05:36,780 --> 00:05:43,310 usually indicate if outputted by clang?" 71 00:05:43,310 --> 00:05:47,380 So what kind of code could I write to get 'undeclared identifier?' 72 00:05:47,380 --> 00:05:49,550 [Student] That x = 2? 73 00:05:49,550 --> 00:05:52,650 So we can just try it in here, x = 2. 74 00:05:52,650 --> 00:06:04,830 We'll run this--oh, I didn't click it. So here we get--all right. 75 00:06:04,830 --> 00:06:07,100 "Use of undeclared identifier x." 76 00:06:07,100 --> 00:06:11,610 So that's the undeclared identifier, a variable. 77 00:06:11,610 --> 00:06:13,910 It will frequently call a variable an identifier. 78 00:06:13,910 --> 00:06:17,300 So it might not know it's actually a variable; it doesn't know what it is. 79 00:06:17,300 --> 00:06:19,380 So it's an identifier. 80 00:06:19,380 --> 00:06:26,060 So why is it undeclared? Yeah. 81 00:06:26,060 --> 00:06:32,190 So to be clear on terminology, the declaration of a variable 82 00:06:32,190 --> 00:06:37,360 is when you say "int x," or "string y," whatever. 83 00:06:37,360 --> 00:06:41,910 The initialization of the variable, or the assignment of the variable, 84 00:06:41,910 --> 00:06:44,510 is whenever you say "x = 2." 85 00:06:44,510 --> 00:06:52,950 So we can do these in separate steps, int x, x = 2, and until--we can have a bunch of stuff in here-- 86 00:06:52,950 --> 00:07:00,350 but until this line happens, x is still uninitialized, but it has been declared. 87 00:07:00,350 --> 00:07:06,760 And so we can obviously do it in 1 line, and now we are declaring and initializing. 88 00:07:06,760 --> 00:07:10,730 Questions? 89 00:07:10,730 --> 00:07:18,390 And finally, "Why is the Caesar Cipher not very secure?" 90 00:07:18,390 --> 00:07:23,830 So first, does anyone want to say what the Caesar Cipher is? 91 00:07:23,830 --> 00:07:28,100 [Student] Caesar Cipher just is that you map, you shift every letter, 92 00:07:28,100 --> 00:07:34,420 a certain number of letters go over, and move back over, and it's not very secure because 93 00:07:34,420 --> 00:07:42,260 there's only 26 possible options and you just have to try each 1 of those until you get it. 94 00:07:42,260 --> 00:07:45,470 Oh. So, I should repeat? 95 00:07:45,470 --> 00:07:51,600 The Caesar Cipher, it's--I mean, you'll be dealing with it on the problems that you-- 96 00:07:51,600 --> 00:07:56,110 or I guess the standard edition of the problem set that's not on the hacker edition. 97 00:07:56,110 --> 00:08:01,550 So on the standard edition to the problem set, you get a message like, "Hello, world, " 98 00:08:01,550 --> 00:08:08,410 and you also have a number like 6, and you take that message, and each individual character, 99 00:08:08,410 --> 00:08:11,310 you rotate it by 6 positions in the alphabet. 100 00:08:11,310 --> 00:08:16,560 So the 'h' in hello would become h-i-j-k-l-m-n. 101 00:08:16,560 --> 00:08:19,600 So the first letter would be n. We do the same thing with e. 102 00:08:19,600 --> 00:08:23,530 If we have a, like, z or something, then we wrap back around to 'a.' 103 00:08:23,530 --> 00:08:29,280 But each character gets cycled 6 characters later in the alphabet, and it's not very secure 104 00:08:29,280 --> 00:08:35,440 since there are only 26 possibilities for how many ways you could wrap a single letter. 105 00:08:35,440 --> 00:08:42,919 So you can just try all 26 of them and, presumably, for a long enough message, 106 00:08:42,919 --> 00:08:46,860 only 1 of those possible 26 things is going to be legible, 107 00:08:46,860 --> 00:08:50,300 and the legible one is going to be the original message. 108 00:08:50,300 --> 00:08:56,240 So it's not a very good way of encrypting anything at all. 109 00:08:56,240 --> 00:08:59,070 Unrelated to those shorts, "What is a function?" 110 00:08:59,070 --> 00:09:03,370 So what is a function? Yes. 111 00:09:03,370 --> 00:09:11,640 [Student] It's like a separate piece of code that you can call to go through and then get the return value of whatever. 112 00:09:11,640 --> 00:09:18,160 Yeah. So I'll answer it by also answering the next--or repeat by also just answering the next one. 113 00:09:18,160 --> 00:09:22,410 You can use functions instead of just copying and pasting code over and over again. 114 00:09:22,410 --> 00:09:27,200 Just take that code, put it into a fuction, and then you could just call the function 115 00:09:27,200 --> 00:09:29,870 wherever you have been copying and pasting. 116 00:09:29,870 --> 00:09:33,350 So functions are useful. 117 00:09:33,350 --> 00:09:35,860 So now we'll do actual problems. 118 00:09:35,860 --> 00:09:46,490 The first one. So the idea of the first one is, you pass it a string, and regardless of the-- 119 00:09:46,490 --> 00:09:52,060 or does it say all lowercase? It does not say all lowercase. 120 00:09:52,060 --> 00:09:57,730 So the message can be anything, and--oh no. It does. 121 00:09:57,730 --> 00:10:01,610 "For simplicity, you may assume that the user will only input lowercase letters and spaces." 122 00:10:01,610 --> 00:10:08,180 So we pass it a message with only lowercase letters and then we alternate 123 00:10:08,180 --> 00:10:15,450 between capital and lowercase--we change the string to be capital and lowercase, alternating. 124 00:10:15,450 --> 00:10:22,920 So before we give you a second to even dive into the problem, 125 00:10:22,920 --> 00:10:32,420 what is the first thing that we need to do? 126 00:10:32,420 --> 00:10:36,900 Oh, what did I just click on? Oh, I just clicked on an email in here. 127 00:10:36,900 --> 00:10:42,870 So the first thing we need to do--am I looking at the wrong one? 128 00:10:42,870 --> 00:10:49,320 Is this part of this one? 129 00:10:49,320 --> 00:10:51,320 No, those are still in there, though. 130 00:10:51,320 --> 00:10:55,160 Okay, still here. 131 00:10:55,160 --> 00:11:03,160 Now we can not assume--? Yes. Here we cannot assume that it's only lowercase and spaces. 132 00:11:03,160 --> 00:11:07,770 So now we have to deal with the fact that the letters can be whatever we want them to be. 133 00:11:07,770 --> 00:11:11,910 And so the first thing we want to do is just get the message. 134 00:11:11,910 --> 00:11:19,790 We just need to get a string, string s = GetString, okay. 135 00:11:19,790 --> 00:11:24,890 Now this problem, there are a couple of ways of doing it. 136 00:11:24,890 --> 00:11:29,840 But we are going to want to use bitwise operators here. 137 00:11:29,840 --> 00:11:35,280 Are there people who either were not at the supersection, 138 00:11:35,280 --> 00:11:37,480 or something, and don't know what bitwise operators are? 139 00:11:37,480 --> 00:11:41,710 Or how they relate to ASCII in any way? 140 00:11:41,710 --> 00:11:45,650 [Student] I wasn't at the supersection, but I know what bitwise operators are. 141 00:11:45,650 --> 00:11:49,560 Okay. So then I don't have to go over the basics of them, but I'll explain 142 00:11:49,560 --> 00:11:51,830 what we're going to want to use here. 143 00:11:51,830 --> 00:11:59,680 So 'A': Binary representation of capital A, the number is 65. 144 00:11:59,680 --> 00:12:07,560 I'm just going to look at--41 is going to be 01000001. 145 00:12:07,560 --> 00:12:14,170 So that should be 65 in decimal; so this is the binary representation of the character capital A. 146 00:12:14,170 --> 00:12:19,440 Now, the binary representation of the character lowercase 'a' 147 00:12:19,440 --> 00:12:33,350 is going to be the same thing, almost. Is that--6, yeah. This is right. 148 00:12:33,350 --> 00:12:37,670 So binary capital A, binary lowercase 'a.' 149 00:12:37,670 --> 00:12:43,940 So notice that the difference between A and 'a' is this single bit. 150 00:12:43,940 --> 00:12:49,440 And this happens to be the 32 bit, the bit representing the number 32. 151 00:12:49,440 --> 00:12:53,910 And that makes sense since A is 65; 'a' is 97. 152 00:12:53,910 --> 00:12:56,610 The difference between them is 32. 153 00:12:56,610 --> 00:13:03,770 So now we know we can convert from A to 'a' by taking A 154 00:13:03,770 --> 00:13:09,710 and bitwise ORing it, with--that looks like a 1. 155 00:13:09,710 --> 00:13:20,900 This is a bitwise OR, with 00100000, and that'll give us 'a.' 156 00:13:20,900 --> 00:13:26,850 And we can get from 'a' to A by bitwise ANDing 157 00:13:26,850 --> 00:13:33,700 with 11, 0 in that place, 11111. 158 00:13:33,700 --> 00:13:43,840 So this will then give us exactly what 'a' was; but cancel out this individual bit, 159 00:13:43,840 --> 00:13:50,070 so we'll have 01000001; I don't know if I counted right. 160 00:13:50,070 --> 00:13:56,750 But this technique of bitwise ORing to get from capital to lowercase, 161 00:13:56,750 --> 00:14:02,080 and bitwise ANDing to get from lowercase to capital isn't exclusive to A. 162 00:14:02,080 --> 00:14:06,510 All of the letters, K vs k, Z vs z, 163 00:14:06,510 --> 00:14:10,080 all of them are just going to differ by this single bit. 164 00:14:10,080 --> 00:14:16,290 And so you can use this to change from any lowercase letter to any capital letter and vice versa. 165 00:14:16,290 --> 00:14:26,670 Okay. So an easy way of getting from this--so instead of having to 166 00:14:26,670 --> 00:14:32,170 write out whatever 1011111 is--an easy way of representing this number, and this isn't one 167 00:14:32,170 --> 00:14:39,710 that I went over in the supersection, but tilde (~) is another bitwise operator. 168 00:14:39,710 --> 00:14:42,520 What ~ does is it looks at the bit representation. 169 00:14:42,520 --> 00:14:45,630 Let's take any number. 170 00:14:45,630 --> 00:14:53,130 This is just some binary number, and what ~ does is it just flips all of the bits. 171 00:14:53,130 --> 00:15:00,630 So this was a 1, now a 0, this is a 0, now a 1, 010100. 172 00:15:00,630 --> 00:15:08,320 So that's all ~ does. So 32 is going to be the number--get rid of that-- 173 00:15:08,320 --> 00:15:23,320 so 32 is going to be the number 00100000, and so ~ of this is going to be 174 00:15:23,320 --> 00:15:29,980 this number up here that I ANDed 'a' with. 175 00:15:29,980 --> 00:15:35,600 Does everyone see that? This is pretty common, like when you want to figure out 176 00:15:35,600 --> 00:15:40,740 for later things that we might be seeing, when we want to see if-- 177 00:15:40,740 --> 00:15:44,710 or we want everything, every single bit set except for 1 178 00:15:44,710 --> 00:15:47,910 you tend to do ~ of the bit that we don't want set. 179 00:15:47,910 --> 00:15:53,090 So we don't want the 32 bit set, so we do ~ of 32. 180 00:15:53,090 --> 00:15:57,790 Okay. So we can use all of those here. 181 00:15:57,790 --> 00:16:03,000 All right, so it's fine if you're not done, we shall slowly walk over together, 182 00:16:03,000 --> 00:16:11,870 or walk over this, so--through this. Walk through this. 183 00:16:11,870 --> 00:16:20,790 So we have our string, and we want to loop over each character in that string and do something to it. 184 00:16:20,790 --> 00:16:26,710 So how do we loop over a string? What should we use? 185 00:16:26,710 --> 00:16:30,980 I'm not going to do it on here. Yeah. 186 00:16:30,980 --> 00:16:42,940 So I have my iterator, and he said it, but how do I know how many characters are in the string? 187 00:16:42,940 --> 00:16:47,030 Strlen (s), then i + +. 188 00:16:47,030 --> 00:16:49,860 So what I've done here isn't the best way of doing things. 189 00:16:49,860 --> 00:16:51,860 Does anyone know why? 190 00:16:51,860 --> 00:16:55,290 Because you're checking the language of the string every single time. 191 00:16:55,290 --> 00:17:06,859 So we are going to want to move strlen, I could say up here, int length = strlen (s), 192 00:17:06,859 --> 00:17:11,900 and then do i < length, and in case you haven't seen it before, 193 00:17:11,900 --> 00:17:20,410 I could also do int i = 0, length = strlen (s). 194 00:17:20,410 --> 00:17:25,010 And so this is somewhat preferable, since now I've restricted the scope 195 00:17:25,010 --> 00:17:29,150 of the variable length to just this 'for' loop, instead of declaring it before 196 00:17:29,150 --> 00:17:34,990 and that it always exists, and in case you didn't catch why that's bad, 197 00:17:34,990 --> 00:17:39,410 or why the original was bad, it's--start at the for loop. 198 00:17:39,410 --> 00:17:43,380 I checked the condition. Is i < the length of s? 199 00:17:43,380 --> 00:17:46,790 So the length of s, let's work with "hello" the entire time. 200 00:17:46,790 --> 00:17:49,670 So length of s, h-e-l-l-o. Length is 5. 201 00:17:49,670 --> 00:17:57,580 So i = 0, length is 5, so i is not < 5, so the loop continues. 202 00:17:57,580 --> 00:18:02,750 Then we go again. We check the condition. Is i < the length of hello? 203 00:18:02,750 --> 00:18:08,390 So let's check the length of hello. H-e-l-l-o. That's 5; i is not < 5, so we continue again. 204 00:18:08,390 --> 00:18:13,330 So we are calculating, we are counting h-e-l-l-o, for each iteration of the loop, 205 00:18:13,330 --> 00:18:17,380 even thought it's never going to change; it's always going to be 5. 206 00:18:17,380 --> 00:18:22,530 So we just remember 5 up front, and now everything's better. 207 00:18:22,530 --> 00:18:24,990 So iterating over the entire string. 208 00:18:24,990 --> 00:18:31,470 What do we want to do for each character of the string? 209 00:18:31,470 --> 00:18:38,510 [Student speaking, unintelligible] 210 00:18:38,510 --> 00:18:47,000 Yeah. So, if the character is non-alphabetic, then we just want to skip over it. 211 00:18:47,000 --> 00:18:52,300 Because we only care about alphabetic letters; we can't capitalize a number. 212 00:18:52,300 --> 00:19:10,850 So how can we do this? So our condition, so if we want something--check if it's alphabetical. 213 00:19:10,850 --> 00:19:14,060 So how do we check this? 214 00:19:14,060 --> 00:19:18,720 [Student] You can just use the function is alpha. 215 00:19:18,720 --> 00:19:23,160 Is that included in either of these, or any include like, char.h or something? 216 00:19:23,160 --> 00:19:32,710 Let's not use the is alpha function, and use the explicit--so we have s[i], 217 00:19:32,710 --> 00:19:40,460 that is the eighth character of s, remember that a string is an array of characters, 218 00:19:40,460 --> 00:19:43,180 so the eighth character of s. 219 00:19:43,180 --> 00:19:49,280 Now, if it is a capital letter, we know it has to be in a specific range. 220 00:19:49,280 --> 00:19:54,370 And what is that range? 221 00:19:54,370 --> 00:20:07,860 Yeah. So if s[i] is ≥ 65, and s[i] is ≤ 90, what should I do instead? 222 00:20:07,860 --> 00:20:18,470 Yeah. So you should absolutely never even need to know the ASCII values of anything ever. 223 00:20:18,470 --> 00:20:25,640 Never think of the numbers 65, 90, 97 and 102, or whatever it is. 224 00:20:25,640 --> 00:20:32,470 You don't need--112?--you don't need to know those at all. That's wrong too. 225 00:20:32,470 --> 00:20:41,940 Only use the single-quote characters, single quote constants. So 'A' and less than 90 is 'Z.' 226 00:20:41,940 --> 00:20:47,930 And this is significantly better--I would not know off the top of my head that Z is 90. 227 00:20:47,930 --> 00:20:52,690 I do know off the top of my head that 'Z' is capital Z. 228 00:20:52,690 --> 00:21:02,100 So as long as this is in the range of capital A to capital Z, or we can check for lowercase, 229 00:21:02,100 --> 00:21:17,010 Or if it's in the range ≥ 'a' and ≤ z. 230 00:21:17,010 --> 00:21:19,010 So that's our condition. 231 00:21:19,010 --> 00:21:22,520 The style for where to put these things varies. 232 00:21:22,520 --> 00:21:29,520 I'll do it like this. 233 00:21:29,520 --> 00:21:31,520 Now, what do we want to do? 234 00:21:31,520 --> 00:21:39,530 We know this letter is a character, an alphabetical character. 235 00:21:39,530 --> 00:21:46,270 So we need to alternate between whether this should now be a capital letter or a lowercase letter. 236 00:21:46,270 --> 00:21:48,820 How do we keep track of which one we want it to be? 237 00:21:48,820 --> 00:21:55,520 [Student voices, unintelligible] 238 00:21:55,520 --> 00:21:59,150 So yes, but let me check. 239 00:21:59,150 --> 00:22:04,910 Module 0-2 was said, was a suggestion thrown out, and I agree with that. 240 00:22:04,910 --> 00:22:11,780 Except notice that, like--is this the case? Yeah. 241 00:22:11,780 --> 00:22:18,270 It's every other one, but we can't module 2 of i, or i mod 2, since 242 00:22:18,270 --> 00:22:22,950 notice that E is capital and 'a' is lowercase? But there's a space separating them? 243 00:22:22,950 --> 00:22:27,150 So they're going to be the same mod 2, but they're different cases. 244 00:22:27,150 --> 00:22:29,150 [Student question, unintelligible] 245 00:22:29,150 --> 00:22:34,690 Yeah. So we're just going to keep a count. 246 00:22:34,690 --> 00:22:38,730 We could also do that in here if we wanted; that might get a little unwieldy 247 00:22:38,730 --> 00:22:41,300 in the for loop declarations; I'll put it up here. 248 00:22:41,300 --> 00:22:48,840 So int count = starts at 0. 249 00:22:48,840 --> 00:22:54,070 And so now, I'm going to count how many alphabetical characters we've had. 250 00:22:54,070 --> 00:22:59,550 So we're inevitably going to count + + since we found another alphabetical character. 251 00:22:59,550 --> 00:23:09,130 But, so now you're saying if count mod 2. 252 00:23:09,130 --> 00:23:12,590 So what if count mod 2? Oh. I'll do = = 0 for now. 253 00:23:12,590 --> 00:23:21,740 We'll also go over that. So if count mod 2 = = 0, then what? 254 00:23:21,740 --> 00:23:27,830 [Students answer, unintelligible] 255 00:23:27,830 --> 00:23:32,750 So we want it to end up uppercase. 256 00:23:32,750 --> 00:23:37,520 There are 2 cases; uppercase and lowercase are the 2 cases. 257 00:23:37,520 --> 00:23:40,990 So if we're in lowercase we need to make it uppercase. 258 00:23:40,990 --> 00:23:43,710 If it's uppercase we don't need to do anything. 259 00:23:43,710 --> 00:23:50,760 But, is there a way--shouldn't have flipped-- 260 00:23:50,760 --> 00:23:54,800 that we don't even need to check whether it's uppercase or lowercase? 261 00:23:54,800 --> 00:24:02,240 What can we do to always make sure that we always end up at uppercase? 262 00:24:02,240 --> 00:24:07,830 So notice what we did for lowercase 'a'; what if we did this same exact thing to uppercase A? 263 00:24:07,830 --> 00:24:11,900 Does uppercase A change, or does the value change? 264 00:24:11,900 --> 00:24:23,100 Yeah. So any capital letter bitwise ANDed with ~ 32 is going to be that same uppercase character 265 00:24:23,100 --> 00:24:29,220 because for any uppercase character the 32nd bit is not set. 266 00:24:29,220 --> 00:24:40,920 So if we want to bring the character s[i], we want it to become lowercase or uppercase. 267 00:24:40,920 --> 00:24:46,890 So if it was lowercase, it is now uppercase , if it was uppercase, it's still uppercase, and that's it. 268 00:24:46,890 --> 00:24:54,290 I said this in the supersection: You can use 32 if you want, but I tend to prefer doing 'a' - A, 269 00:24:54,290 --> 00:25:01,150 instead of just plain 32, because it can be any other bit. 270 00:25:01,150 --> 00:25:03,610 After the 32 bit, it can be any of these, or we wouldn't have enough 271 00:25:03,610 --> 00:25:05,840 numbers to represent all of the characters. 272 00:25:05,840 --> 00:25:09,110 So if you get the 32 bit, it could be the 64 bit, it could be the 128 bit. 273 00:25:09,110 --> 00:25:13,990 Any of those bits could be the bit that distinguishes between uppercase and lowercase. 274 00:25:13,990 --> 00:25:18,350 I shouldn't need to know that it's the 32 bit. 275 00:25:18,350 --> 00:25:27,130 I can use this 'a' - A to get the bit that differs between the two 276 00:25:27,130 --> 00:25:33,000 without needing to rely on the magic number that is 32. 277 00:25:33,000 --> 00:25:38,770 And so now, else count was odd, and so what do I want to do? 278 00:25:38,770 --> 00:25:43,920 [Student answers, unintelligible] 279 00:25:43,920 --> 00:25:45,920 [Student] What's that? 280 00:25:45,920 --> 00:25:49,850 I will do it in 1 second. 281 00:25:49,850 --> 00:25:55,690 So now if I want to--I want to make sure the character is now lowercase, 282 00:25:55,690 --> 00:26:04,140 and so I can OR by 32, and 32 meaning 'a' - A. 283 00:26:04,140 --> 00:26:06,510 But notice, by the same reasoning as the previous one, that if 284 00:26:06,510 --> 00:26:11,670 the letter was already lowercase, then ORing by 32 just keeps it lowercase. 285 00:26:11,670 --> 00:26:16,220 It hasn't changed the original character. 286 00:26:16,220 --> 00:26:19,910 But now I don't have to avoid saying, "If it is lowercase, just forget about it, 287 00:26:19,910 --> 00:26:23,650 if it's uppercase, then change it." 288 00:26:23,650 --> 00:26:26,900 It's much more convenient to do this. 289 00:26:26,900 --> 00:26:33,190 [Student] Would that strategy of subtracting the uppercase from the lowercase work if it weren't 32? 290 00:26:33,190 --> 00:26:35,330 If it was, like, 34 or something? 291 00:26:35,330 --> 00:26:41,840 So, you need to know that the difference between the 2 is--? >>1 bit. 292 00:26:41,840 --> 00:26:49,840 It could be more than 1 bit, as long as all of the bits below this position are the same. 293 00:26:49,840 --> 00:26:58,500 So we need at least 26 characters--or, there are 26 characters. 294 00:26:58,500 --> 00:27:04,590 So we need at least 26 numbers to represent the difference-- 295 00:27:04,590 --> 00:27:07,650 The difference between A and 'a' has to be at least 26, 296 00:27:07,650 --> 00:27:10,760 or else we wouldn't have represented all the capital numbers. 297 00:27:10,760 --> 00:27:18,630 That means that A, if we start at 1, it's going to use all of these bits, 298 00:27:18,630 --> 00:27:23,900 all of these first 5 bits, to represent everything through Z. 299 00:27:23,900 --> 00:27:32,170 That's why the next bit, or this bit, the next bit is the one that's chosen to distinguish between A and 'a.' 300 00:27:32,170 --> 00:27:40,930 That's also why, in ASCII table, there are 5 symbols separating capital letters from lowercase letters. 301 00:27:40,930 --> 00:27:49,050 Since those are the symbols, the extra 5 that brings up the 32 being the difference between them. 302 00:27:49,050 --> 00:27:51,840 [Student] So we could do it, because ASCII's designed that way. 303 00:27:51,840 --> 00:27:57,280 Yes. But ASCII--the difference could also be both of these bits. 304 00:27:57,280 --> 00:28:12,040 Like, if A were 10000001, and 'a' was 11100001--I forget, whatever. 305 00:28:12,040 --> 00:28:18,100 But if it were this, then we could still use 'a' - A. 306 00:28:18,100 --> 00:28:22,650 It's just now the difference between A and 'a' is still these 2 bits. 307 00:28:22,650 --> 00:28:32,240 I think it's written 48. Is it 32 + 64? I think it is? 308 00:28:32,240 --> 00:28:40,160 It would still be 2 bits; every single character, like, Z and z, K and k, 309 00:28:40,160 --> 00:28:45,160 they would still have the same exact bits set except for those 2 bits. 310 00:28:45,160 --> 00:28:48,870 So as long as that's always true, regardless of if we're using ASCII or some other system, 311 00:28:48,870 --> 00:28:53,050 as long as there's only a set number of bits that are different for each character, 312 00:28:53,050 --> 00:28:55,050 then that works fine. 313 00:28:55,050 --> 00:29:06,110 It's just that 32 was set up because it's the first one we could possibly use. >>Cool. 314 00:29:06,110 --> 00:29:14,520 I tend to prefer, in case you haven't seen, if the block is only a single line, 315 00:29:14,520 --> 00:29:24,280 you can get rid of the curly braces; so I tend to prefer doing this. 316 00:29:24,280 --> 00:29:34,010 Also, you know how we can do things like s[i] + = 1? 317 00:29:34,010 --> 00:29:41,090 You can also do s[i] bitwise AND = 32. 318 00:29:41,090 --> 00:29:46,400 And bitwise OR = 32. 319 00:29:46,400 --> 00:29:51,490 Also, count mod 2 = = 0. 320 00:29:51,490 --> 00:30:00,900 So remember that--I won't write it--any non-zero value is true, and 0 is false. 321 00:30:00,900 --> 00:30:07,880 So "if count mod 2 = = 0" is the same as saying "if not count mod 2." 322 00:30:07,880 --> 00:30:11,580 I probably would have just reversed the lines and said, "if count mod 2, 323 00:30:11,580 --> 00:30:15,350 do the OR 1, else do the AND 1," so that I didn't need the "not." 324 00:30:15,350 --> 00:30:18,650 But this works just as well. 325 00:30:18,650 --> 00:30:25,660 And what else can I do here? 326 00:30:25,660 --> 00:30:29,060 You could combine them with ternary if you wanted, but then that'd just make things messier 327 00:30:29,060 --> 00:30:33,770 and probably more difficult to read, so we won't do that. 328 00:30:33,770 --> 00:30:37,330 Anyone have any other suggestions? 329 00:30:37,330 --> 00:30:41,580 Is that all the problem asked for? Oh yeah. 330 00:30:41,580 --> 00:30:51,070 So get rid of these empty lines, now we'll print f , % s being the one for strings, 331 00:30:51,070 --> 00:30:56,620 We will print f, s. 332 00:30:56,620 --> 00:30:59,330 Now let's run it. Did I do anything wrong? 333 00:30:59,330 --> 00:31:03,200 That's a \ "; I want an n. 334 00:31:03,200 --> 00:31:07,840 Okay. Now we'll run it. It'll probably yell at me. 335 00:31:07,840 --> 00:31:11,250 Strlen is in string.h. 336 00:31:11,250 --> 00:31:14,290 So this is the nice thing about Clang is it tells you what it's in, 337 00:31:14,290 --> 00:31:19,140 instead of GCC which just says, "Hey, you forgot something, I don't know what it was." 338 00:31:19,140 --> 00:31:29,220 But this will tell me, "You meant to include string.h." 339 00:31:29,220 --> 00:31:32,130 So I didn't prompt for anything, so it's not saying anything. 340 00:31:32,130 --> 00:31:42,540 But we'll do their example, "Thanks 4 the add". 341 00:31:42,540 --> 00:31:47,880 That looks right. Hooray. 342 00:31:47,880 --> 00:31:52,370 So returning to your main, I almost never do it. 343 00:31:52,370 --> 00:31:57,110 It's optional. And main is the only function for which it is optional. 344 00:31:57,110 --> 00:32:07,140 If you do not return anything from main, it's assumed that you meant to return 0. 345 00:32:07,140 --> 00:32:13,070 Questions? 346 00:32:13,070 --> 00:32:20,980 Okay. So now the second problem. 347 00:32:20,980 --> 00:32:24,810 "Recall from week 2's second lecture that swapping 2 variables' values by passing 348 00:32:24,810 --> 00:32:30,780 those 2 variables to a function (even if called swap) doesn't exactly work, at least not without 'pointers.'" 349 00:32:30,780 --> 00:32:37,020 And ignore pointers until we get to them. 350 00:32:37,020 --> 00:32:40,070 We want to swap 2 variables; we're not using a function to do it. 351 00:32:40,070 --> 00:32:43,410 We're still going to do it in main like it says. 352 00:32:43,410 --> 00:32:48,360 But to use those 2 variables, we don't want to use a temporary variable. 353 00:32:48,360 --> 00:32:50,770 There are 2 ways to do this. 354 00:32:50,770 --> 00:32:56,310 You can do it using your traditional binary operators. 355 00:32:56,310 --> 00:33:00,180 So does anyone know a quick and dirty way of doing that? 356 00:33:00,180 --> 00:33:07,650 It might actually take a minute of thinking. If I have-- 357 00:33:07,650 --> 00:33:12,130 I'll set the problem up like they ask. So if I have 2 variables, A, which is just an integer 358 00:33:12,130 --> 00:33:17,800 that they give me, and sum variable B, which is another integer that I'm given. 359 00:33:17,800 --> 00:33:22,700 So if I have these 2 variables, now I want to swap them. 360 00:33:22,700 --> 00:33:31,550 The traditional, using your regular binary operators, I mean, like +, -, ÷. 361 00:33:31,550 --> 00:33:36,630 Not bitwise operators which act on binary. 362 00:33:36,630 --> 00:33:39,600 So using -, +, ÷, and all those. 363 00:33:39,600 --> 00:33:52,980 We could swap by doing something like a = a + b, and b = a - b, a = a - b. 364 00:33:52,980 --> 00:34:04,260 So, sanity check, and then we'll see why that works. 365 00:34:04,260 --> 00:34:13,320 Let's say a = 7, b = 3, then a + b is going to be 10. 366 00:34:13,320 --> 00:34:18,820 So we're now setting a = 10, and then we're doing b = a - b. 367 00:34:18,820 --> 00:34:30,250 So we're doing b = a - b, which is going to be 7, and b = a - b again, 368 00:34:30,250 --> 00:34:38,650 or a = a - b. Which is going to be 10 - 7 which is 3. 369 00:34:38,650 --> 00:34:44,850 So now, correctly, 'a' was 7, b was 3, and now b is 7 and 'a' is 3. 370 00:34:44,850 --> 00:34:48,679 So that kind of makes sense; 'a' is the combination of the 2 numbers. 371 00:34:48,679 --> 00:34:53,000 At this point, 'a' is the combination, and then we're subtracting out the original b, 372 00:34:53,000 --> 00:34:56,860 and then we're subtracting out what was the original 'a.' 373 00:34:56,860 --> 00:35:01,150 But this does not work for all numbers. 374 00:35:01,150 --> 00:35:08,880 To see this, let's consider a system; so we usually think of integers as 32 bits. 375 00:35:08,880 --> 00:35:13,050 Let's work on something that's only like 4 bits. 376 00:35:13,050 --> 00:35:15,450 Hopefully I come up with a good example right now. 377 00:35:15,450 --> 00:35:18,680 So, I know, this will be easy. 378 00:35:18,680 --> 00:35:26,720 Let's say our 2 numbers are 1111, and 1111; so we're in binary right now. 379 00:35:26,720 --> 00:35:34,630 In actual decimals, if you want to think of it that way, a = 15 and b = 15. 380 00:35:34,630 --> 00:35:37,630 And so we expect, after we swap them--they don't even have to be the same numbers, 381 00:35:37,630 --> 00:35:41,140 but I did it this way. 382 00:35:41,140 --> 00:35:47,100 Let's make them not the same numbers. Let's do 1111 and 0001. 383 00:35:47,100 --> 00:35:51,860 So a = 15 and b = 1. 384 00:35:51,860 --> 00:35:57,670 After we swap them, we expect 'a' to be 1 and b to be 15. 385 00:35:57,670 --> 00:36:01,780 So our first step is a = a + b. 386 00:36:01,780 --> 00:36:08,770 Our numbers are only 4 bits wide, so 'a,' which is 1111, + b, which is 0001, 387 00:36:08,770 --> 00:36:16,780 is going to end up being 10000, but we only have 4 bits. 388 00:36:16,780 --> 00:36:22,540 So now a = 0. 389 00:36:22,540 --> 00:36:34,080 And now we want to set b = a - b--actually, this still works out perfectly. 390 00:36:34,080 --> 00:36:39,630 a = a - b--let's see if this works out perfectly. 391 00:36:39,630 --> 00:36:53,720 So then b = 0 - 1, which would still be 15, and then a = a - b, which would be 1. 392 00:36:53,720 --> 00:36:56,210 Maybe this does work. 393 00:36:56,210 --> 00:36:59,020 I feel like there's a reason it doesn't work using regular. 394 00:36:59,020 --> 00:37:06,400 Okay, so working on the assumption that it doesn't work with regular binary operations, 395 00:37:06,400 --> 00:37:15,040 and I will look for--I will Google to see if that is true. 396 00:37:15,040 --> 00:37:23,490 So we want to do it using bitwise operators, and the clue here is XOR. 397 00:37:23,490 --> 00:37:28,780 So, introducing XOR (^) if you have not seen it yet. 398 00:37:28,780 --> 00:37:34,610 It's, again, a bitwise operator so it acts bit by bit, and it's-- 399 00:37:34,610 --> 00:37:39,910 If you have the bits 0 and 1, then this will be 1. 400 00:37:39,910 --> 00:37:45,230 If you have the bits 1 and 0, it'll be 1, you have the bits 0 and 0 it'll be 0, 401 00:37:45,230 --> 00:37:47,640 and if you have the bits 1 and 1 it'll be 0. 402 00:37:47,640 --> 00:37:56,180 So it's like OR. If either of the bits are true, it's 1, but unlike OR, it can't be both bits that are true. 403 00:37:56,180 --> 00:37:59,320 OR would have this be 1, XOR would have this be 0. 404 00:37:59,320 --> 00:38:02,250 So we're going to want to use XOR here. 405 00:38:02,250 --> 00:38:09,960 Think about it for a minute; I'm going to Google. 406 00:38:09,960 --> 00:38:16,230 Well, you can't read that; I'm currently on the XOR swap algorithm page. 407 00:38:16,230 --> 00:38:21,340 Hopefully this will explain why I can't-- 408 00:38:21,340 --> 00:38:34,190 This is exactly the algorithm that we just did. 409 00:38:34,190 --> 00:38:37,330 I still don't see why--I must have just picked a bad example, 410 00:38:37,330 --> 00:38:44,940 but this case where 'a' happened to become 0, after getting to 5 bits, so now 'a' is 0, 411 00:38:44,940 --> 00:38:48,730 that is what is called "integer overflow." 412 00:38:48,730 --> 00:38:54,370 According to Wikipedia, "Unlike the XOR swap, this variation requires that it uses some methods 413 00:38:54,370 --> 00:38:59,780 to guarantee that x + y does not cause an integer overflow." 414 00:38:59,780 --> 00:39:08,350 So this does have problems; this was integer overflow, but I did something wrong. 415 00:39:08,350 --> 00:39:10,520 I'm not sure. I'll try to come up with another one. 416 00:39:10,520 --> 00:39:13,640 [Student] Well, isn't integer overflow when you're trying to put a number in there 417 00:39:13,640 --> 00:39:16,640 bigger than the amount of bits you have allocated? 418 00:39:16,640 --> 00:39:23,730 Yeah. We have 4 bits. That's--we had 4 bits, we then try to add 1 to it, so we end up with 5 bits. 419 00:39:23,730 --> 00:39:26,690 But the fifth bit just gets cut off, yeah. 420 00:39:26,690 --> 00:39:28,970 It might actually-- 421 00:39:28,970 --> 00:39:33,010 [Student] Does that throw you an error, or does that--would that throw an error? 422 00:39:33,010 --> 00:39:40,720 No. So there's no error. When you get to the assembly level, a special bit 423 00:39:40,720 --> 00:39:47,020 somewhere is set that said there was an overflow, but in C you kind of just don't deal with that. 424 00:39:47,020 --> 00:39:55,160 You actually can't deal with it unless you use special assembly instructions in C. 425 00:39:55,160 --> 00:39:58,110 Let's think about XOR swap. 426 00:39:58,110 --> 00:40:02,220 And I think the Wikipedia article might have also been saying that-- 427 00:40:02,220 --> 00:40:07,310 So it also brought up modular arithmetic, so I guess I was, in theory, doing modular arithmetic 428 00:40:07,310 --> 00:40:11,160 when I said that 0 - 1 is 15 again. 429 00:40:11,160 --> 00:40:15,410 So that might actually--on a regular processor that does 0 - 1 = 15. 430 00:40:15,410 --> 00:40:20,430 Since we end up at 0, we subtract 1, so then it just wraps back around to 1111. 431 00:40:20,430 --> 00:40:28,930 So this algorithm might actually work, the a + b, the a - b, b - a; that might be fine. 432 00:40:28,930 --> 00:40:34,030 But there's some processors which don't do that, and so it would not be fine in those specific ones. 433 00:40:34,030 --> 00:40:39,880 XOR swap will work on any processor. Okay. 434 00:40:39,880 --> 00:40:42,280 The idea is that it's supposed to be the same, though. 435 00:40:42,280 --> 00:40:50,120 Where we are using XOR to somehow get the information of both into 1 of the variables, 436 00:40:50,120 --> 00:40:54,120 and then pull out the information of the individual variables again. 437 00:40:54,120 --> 00:41:04,330 So does anyone have ideas/the answer? 438 00:41:04,330 --> 00:41:14,540 [Student answer, unintelligible] 439 00:41:14,540 --> 00:41:22,220 So this should work, and also, XOR is commutative. 440 00:41:22,220 --> 00:41:27,620 Regardless of which order these 2 numbers happen to be in up here, 441 00:41:27,620 --> 00:41:30,100 this result is going to be the same. 442 00:41:30,100 --> 00:41:35,800 So a ^ b is b ^ a. 443 00:41:35,800 --> 00:41:51,860 You might also see this written as a ^ = b, b ^ = a, a ^ = b again. 444 00:41:51,860 --> 00:42:00,200 So this is right, and to see why this works, think of the bits. 445 00:42:00,200 --> 00:42:10,400 Using a smallish number, let's say 11001, and 01100. 446 00:42:10,400 --> 00:42:12,790 So this is 'a'; this is b. 447 00:42:12,790 --> 00:42:15,540 So a ^ = b. 448 00:42:15,540 --> 00:42:22,380 We're going to be setting 'a' = to the XOR of these 2 things. 449 00:42:22,380 --> 00:42:32,920 So 1 ^ 0 is 1; 1 ^ 1 is 0; 0 ^ 1 is 1, and 0 ^ 0 is 0; 1 ^ 0 is 1. 450 00:42:32,920 --> 00:42:37,380 So 'a,' if you look at the decimal number, it's going to be-- 451 00:42:37,380 --> 00:42:41,160 you're not going to see much of a relationship between the original 'a' and the new 'a,' 452 00:42:41,160 --> 00:42:45,600 but looking at the bits, 'a' is now like a mesh of the information 453 00:42:45,600 --> 00:42:49,970 of both the original 'a' and the original b. 454 00:42:49,970 --> 00:42:57,930 So if we take b ^ a, we see that we'll end up at the original 'a.' 455 00:42:57,930 --> 00:43:08,910 And if we take the original 'a' ^ the new 'a,' we see we end up at the original b. 456 00:43:08,910 --> 00:43:18,380 So (a ^ b) ^ b = the original 'a.' 457 00:43:18,380 --> 00:43:27,910 And (a ^ b) ^ a = the original b. 458 00:43:27,910 --> 00:43:37,010 There is--another way of seeing this is anything XOR itself is always 0. 459 00:43:37,010 --> 00:43:45,020 So 1101 ^ 1101, all the bits are going to be the same. 460 00:43:45,020 --> 00:43:47,920 So there's never going to be a case where 1 is a 0 and the other is 1. 461 00:43:47,920 --> 00:43:51,080 So this is 0000. 462 00:43:51,080 --> 00:43:57,240 The same with this. (a ^ b) ^ b is like a ^ (b ^ b). 463 00:43:57,240 --> 00:44:03,680 (b ^ b) is going to be 0; a ^ 0 is just going to be 'a,' since all the bits are 0. 464 00:44:03,680 --> 00:44:08,050 So the only ones that are going to be where 'a' was originally a 1--had ones. 465 00:44:08,050 --> 00:44:12,070 And the same idea here; I'm pretty sure it's also commutative. 466 00:44:12,070 --> 00:44:17,590 Yeah. I did say before that it was commutative. 467 00:44:17,590 --> 00:44:24,680 The ^ 'a,' and it's associative, so now (b ^ a) ^ a. 468 00:44:24,680 --> 00:44:28,970 And we can do b ^ (a ^ a). 469 00:44:28,970 --> 00:44:31,540 And so again, we get the original b. 470 00:44:31,540 --> 00:44:37,120 So 'a' is now the combination of 'a' and b together. 471 00:44:37,120 --> 00:44:49,660 Using our new combo 'a' we say b = combo 'a' ^ the original b, we get the original 'a.' 472 00:44:49,660 --> 00:45:05,170 And now a = combo 'a' ^ the new b, which was the original--or which is now what was 'a' or b. 473 00:45:05,170 --> 00:45:13,620 That's this case down here. This is = b, old b. 474 00:45:13,620 --> 00:45:16,550 So now everything is back in the swapped order. 475 00:45:16,550 --> 00:45:22,960 If we actually looked at the bits, b = a ^ b, is going to XOR these 2, 476 00:45:22,960 --> 00:45:33,920 and the answer is going to be this, and then a = a ^ b is XORing these 2 and the answer is this. 477 00:45:33,920 --> 00:45:41,090 Questions? Okay. So the last one is somewhat significantly more difficult. 478 00:45:41,090 --> 00:45:43,180 [Student] I think he has a question about it. >>Oh, sorry. 479 00:45:43,180 --> 00:45:49,380 [Student] What's actually faster? If you use this XOR, or is it if you declare a new variable? 480 00:45:49,380 --> 00:45:55,190 So what is actually faster, declaring a new variable or using XOR to swap? 481 00:45:55,190 --> 00:45:59,600 The answer is, in all likelihood, a temporary variable. 482 00:45:59,600 --> 00:46:05,780 And that is because once it's compiled down--so at the assembly level, 483 00:46:05,780 --> 00:46:12,320 there's no such thing as local variables or any temporary variables or any of this stuff. 484 00:46:12,320 --> 00:46:16,060 They're just like--there's memory, and there are registers. 485 00:46:16,060 --> 00:46:20,920 Registers are where things are actively happening. 486 00:46:20,920 --> 00:46:24,750 You don't add 2 things in memory; you add 2 things in registers. 487 00:46:24,750 --> 00:46:28,160 And you bring things from memory into registers to then add them, 488 00:46:28,160 --> 00:46:33,180 and then you might put them back into memory, but all the action happens in registers. 489 00:46:33,180 --> 00:46:38,750 So when you're using the temporary variable approach, usually what happens is 490 00:46:38,750 --> 00:46:42,810 these 2 numbers are already in registers. 491 00:46:42,810 --> 00:46:46,570 And then from that point on, after you've swapped them, 492 00:46:46,570 --> 00:46:51,540 it'll just start using the other register. 493 00:46:51,540 --> 00:46:56,510 Anywhere you had been using b, it'll just use the register that was already storing 'a.' 494 00:46:56,510 --> 00:47:02,180 So it doesn't need to do anything to actually do the swap. Yeah? 495 00:47:02,180 --> 00:47:05,690 [Student] But it also takes more memory, right? 496 00:47:05,690 --> 00:47:10,280 It will only take more memory if it needs to store that temporary variable. 497 00:47:10,280 --> 00:47:14,830 Like if you later use that temporary variable again somewhere, 498 00:47:14,830 --> 00:47:18,920 then--or you assign something to that temporary variable. 499 00:47:18,920 --> 00:47:24,630 So if at any point in time 'a,' b in temp have distinct values or something, 500 00:47:24,630 --> 00:47:30,680 then it's going to have distinct locations in memory, but it is true that 501 00:47:30,680 --> 00:47:34,800 there are many local variables which will only exist in registers. 502 00:47:34,800 --> 00:47:44,370 In which case, it's never put into memory, and so you're never wasting memory. 503 00:47:44,370 --> 00:47:58,620 Okay. Last question is a bit more. 504 00:47:58,620 --> 00:48:04,850 So here, in this CS50 appliance, there is a dictionary. 505 00:48:04,850 --> 00:48:12,390 And the reason for this is because [??B66] is a spell checker where you'll be writing 506 00:48:12,390 --> 00:48:15,780 using hash tables or tries or some data structure. 507 00:48:15,780 --> 00:48:22,660 You're going to be writing a spell checker, and you're going to be using this dictionary to do that. 508 00:48:22,660 --> 00:48:28,280 But for this problem, we are just going to look up to see if a single word is in the dictionary. 509 00:48:28,280 --> 00:48:31,250 So instead of storing the entire dictionary in some data structure 510 00:48:31,250 --> 00:48:35,180 and then looking over an entire document to see if anything's misspelled, 511 00:48:35,180 --> 00:48:38,490 we just want to find 1 word. So we can just scan over the entire dictionary 512 00:48:38,490 --> 00:48:44,300 and if we never find the word in the entire dictionary, then it wasn't in there. 513 00:48:44,300 --> 00:48:52,150 If we scan over the entire dictionary and do see the word, then we're good, we found it. 514 00:48:52,150 --> 00:48:56,580 It says here that we want to start looking at C's file-handling function, 515 00:48:56,580 --> 00:48:59,930 since we want to read the dictionary, 516 00:48:59,930 --> 00:49:07,680 but I will give the hint here as to which functions you should think of. 517 00:49:07,680 --> 00:49:11,510 I'll write them on Spaces. 518 00:49:11,510 --> 00:49:20,490 So the main ones you'll want to look at are f open and then, inevitably, f closed, 519 00:49:20,490 --> 00:49:26,540 which will go at the end of your program, and f scan f. 520 00:49:26,540 --> 00:49:31,060 You could also use f read, but you probably don't want to 521 00:49:31,060 --> 00:49:34,200 because that--you don't end up needing that. 522 00:49:34,200 --> 00:49:41,880 F scan f is what you're going to be using to scan over the dictionary. 523 00:49:41,880 --> 00:49:46,370 And so you don't need to code up the solution, just try and like pseudo-code your way 524 00:49:46,370 --> 00:50:05,200 to a solution, and then we'll discuss it. 525 00:50:05,200 --> 00:50:14,110 And actually, since I already gave you these, if you go into any terminal or your appliance's shell, 526 00:50:14,110 --> 00:50:18,250 I would--I usually--if you have not seen yet, I don't know if you did in class, 527 00:50:18,250 --> 00:50:23,490 but man, so the man pages, are pretty useful for looking at pretty much any function. 528 00:50:23,490 --> 00:50:27,330 So I can do, like, man f, scan f. 529 00:50:27,330 --> 00:50:32,300 This is now the info about the scan f family of functions. 530 00:50:32,300 --> 00:50:37,070 I could also do man f, open, and that'll give me the details of that. 531 00:50:37,070 --> 00:50:40,750 So if you know what function you are using, or you're reading code 532 00:50:40,750 --> 00:50:43,000 and you see some function and you're like, "What does this do?" 533 00:50:43,000 --> 00:50:45,280 Just man that function name. 534 00:50:45,280 --> 00:50:47,340 There are a couple of weird examples where you might have to say 535 00:50:47,340 --> 00:50:51,620 like. man 2 that function name, or man 3 that function name, 536 00:50:51,620 --> 00:50:58,230 but you only have to do that if man function name doesn't happen to work the first time. 537 00:50:58,230 --> 00:51:03,010 [Student] So I'm reading the man page for open, but I'm still confused on how to use it and the program. 538 00:51:03,010 --> 00:51:06,170 Okay. A lot of the man pages are less than helpful. 539 00:51:06,170 --> 00:51:08,470 They're more helpful if you already know what they do 540 00:51:08,470 --> 00:51:12,670 and then you just need to remember the order of the arguments or something. 541 00:51:12,670 --> 00:51:17,640 Or they can give you a general overview, but some of them are very overwhelming. 542 00:51:17,640 --> 00:51:22,220 Like f scan f, also. It gives you the information for all of these functions, 543 00:51:22,220 --> 00:51:28,120 and 1 line down here happens to say, "F scan f reads from the string point or stream." 544 00:51:28,120 --> 00:51:32,360 But f open. So, how would we use f open? 545 00:51:32,360 --> 00:51:38,470 The idea of a program which needs to do file I/O is that 546 00:51:38,470 --> 00:51:45,070 you first need to open the file you want to do things with, and inevitably, 547 00:51:45,070 --> 00:51:51,220 read things from that file and do stuff with them. 548 00:51:51,220 --> 00:51:55,350 F open is what we use to open the file. 549 00:51:55,350 --> 00:52:04,190 The thing we get back, so what file do we want to open, it gives us the-- 550 00:52:04,190 --> 00:52:11,970 in here it says "/user/share/dict/words." 551 00:52:11,970 --> 00:52:16,740 This is the file that we want to open, and we want to open it-- 552 00:52:16,740 --> 00:52:21,440 we have to explicitly specify whether we want to open it to read or if we want to open it to write. 553 00:52:21,440 --> 00:52:26,490 There's a couple of combinations and stuff, but we want to open this for reading. 554 00:52:26,490 --> 00:52:29,380 We want to read from the file. 555 00:52:29,380 --> 00:52:34,290 So what does this return? It returns a file star (*), 556 00:52:34,290 --> 00:52:37,260 and I'll just show everything in the variable f, so *, 557 00:52:37,260 --> 00:52:40,840 again, it's a pointer, but we don't want to deal with pointers. 558 00:52:40,840 --> 00:52:46,470 You can think of f as, f is now the variable you're going to use to represent the file. 559 00:52:46,470 --> 00:52:49,850 So if you want to read from the file, you read from f. 560 00:52:49,850 --> 00:52:54,820 If you want to close the file, you close f. 561 00:52:54,820 --> 00:53:00,350 So at the end of the program when we inevitably want to close the file, what should we do? 562 00:53:00,350 --> 00:53:06,750 We want to close f. 563 00:53:06,750 --> 00:53:12,600 So now the last file function that we're going to want to use is scan f, f scan f. 564 00:53:12,600 --> 00:53:20,930 And what that does is it scans over the file looking for a pattern to match. 565 00:53:20,930 --> 00:53:39,100 Looking at the man page here, we see int f scan f, ignore the return value for now. 566 00:53:39,100 --> 00:53:45,230 The first argument is the file * stream, so the first argument we're going to want to pass is f. 567 00:53:45,230 --> 00:53:47,900 We're scanning over f. 568 00:53:47,900 --> 00:53:53,680 The second argument is a format string. 569 00:53:53,680 --> 00:53:58,310 I will give you a format string right now. 570 00:53:58,310 --> 00:54:05,180 I think we happen to say, 127s\n, a lot of that's unnecessary. 571 00:54:05,180 --> 00:54:12,490 The idea of what that format string is, is you can think of scan f as the opposite of print f. 572 00:54:12,490 --> 00:54:17,160 So print f, print f we also use this type of format parameter, 573 00:54:17,160 --> 00:54:25,000 but in print f what we're doing is--let's look at an equivalent. 574 00:54:25,000 --> 00:54:32,550 So print f, and there's actually also f print f, where the first argument is going to be f. 575 00:54:32,550 --> 00:54:40,980 When you print f, we could say something like, "print 127s\n" and then if we pass it some string, 576 00:54:40,980 --> 00:54:44,050 it's going to print this string and then a new line. 577 00:54:44,050 --> 00:54:49,690 What 127 means, I'm pretty sure, but I've never restricted myself to it, 578 00:54:49,690 --> 00:54:52,470 You wouldn't even need to say '127' in the print f, 579 00:54:52,470 --> 00:54:57,090 but what it means is print the first 127 characters. 580 00:54:57,090 --> 00:54:59,350 So I'm pretty sure that's the case. You can Google for that. 581 00:54:59,350 --> 00:55:03,000 But in the next one I'm almost positive it means that. 582 00:55:03,000 --> 00:55:08,880 So this is print the first 127 characters, followed by a new line. 583 00:55:08,880 --> 00:55:14,680 F scan f now, instead of looking at a variable and printing it, 584 00:55:14,680 --> 00:55:22,620 it's going to look at some string, and store the pattern into the variable. 585 00:55:22,620 --> 00:55:26,360 Let's actually use scan f in a different example. 586 00:55:26,360 --> 00:55:31,670 So let's say we had some int, x = 4, 587 00:55:31,670 --> 00:55:41,110 and we wanted to create a string made of--wanted to create the string 588 00:55:41,110 --> 00:55:44,250 that was like, this will come up much later, 589 00:55:44,250 --> 00:55:49,020 something that's just like 4.jpg. 590 00:55:49,020 --> 00:55:51,870 So this might be a program where you will have sum counter, 591 00:55:51,870 --> 00:55:56,420 sum counter i, and you want to save a bunch of images. 592 00:55:56,420 --> 00:56:02,430 So you want to save i.jpg, where i is some iteration of your loop. 593 00:56:02,430 --> 00:56:05,500 So how do we make this string for that JPEG? 594 00:56:05,500 --> 00:56:11,720 If you wanted to print 4.jpg, we could just say print f, % d.jpg, 595 00:56:11,720 --> 00:56:14,410 and then it would print for that JPEG. 596 00:56:14,410 --> 00:56:20,050 But if we want to save the string 4.jpg, we use scan f. 597 00:56:20,050 --> 00:56:30,860 So string s--actually we can't--character, char s, let's go 100. 598 00:56:30,860 --> 00:56:35,400 So I just declared some array of 100 characters, 599 00:56:35,400 --> 00:56:39,830 and that's what we're inevitably going to be storing that JPEG in. 600 00:56:39,830 --> 00:56:47,920 So we're going to use scan f, and the format, how we would say % d.jpg 601 00:56:47,920 --> 00:56:54,980 in order to print 4.jpg, the format of this is going to be % d.jpg. 602 00:56:54,980 --> 00:57:04,020 So the format is % d.jpg, what we want to replace % d with is x, 603 00:57:04,020 --> 00:57:06,590 and now we need to store that string somewhere. 604 00:57:06,590 --> 00:57:12,500 And where we're going to store this string is in the array s. 605 00:57:12,500 --> 00:57:21,640 So after this line of code, s, if we print f, % s of the variable s, 606 00:57:21,640 --> 00:57:26,280 it's going to print 4.jpg. 607 00:57:26,280 --> 00:57:38,930 So f scan f is the same as scan f, except now it's looking over this file 608 00:57:38,930 --> 00:57:43,600 for what to store in s. 609 00:57:43,600 --> 00:57:46,160 That's what the last argument is going to be. 610 00:57:46,160 --> 00:57:54,170 We want to store--"Scan f family of functions scans in both according to format as tried below. 611 00:57:54,170 --> 00:58:02,450 If any are stored in the location points you might return--" 612 00:58:02,450 --> 00:58:12,910 No, we might be good. Let me think for a second. 613 00:58:12,910 --> 00:58:26,350 So scan f does not--what the heck is the function which does that? 614 00:58:26,350 --> 00:58:31,650 So scan f isn't going to take an integer and do dot jpg. 615 00:58:31,650 --> 00:58:43,490 It's going to [mumbles]. 616 00:58:43,490 --> 00:58:49,360 Save int variable in string int C. 617 00:58:49,360 --> 00:58:55,940 What is this variable, or what is this function called? 618 00:58:55,940 --> 00:59:04,950 Yes. That's--yes. So what I was defining to you before was s print f, 619 00:59:04,950 --> 00:59:09,820 which--that makes much more sense, why I said it was much more like print f. 620 00:59:09,820 --> 00:59:14,700 Scan f is still kind of like print f, but s print f is going to scan it over 621 00:59:14,700 --> 00:59:17,510 and replace the variables and now store it in a string. 622 00:59:17,510 --> 00:59:19,620 Instead of printing it, it stores it in a string. 623 00:59:19,620 --> 00:59:25,070 So ignore that entirely. You can still think of the format specifier as like that of print f. 624 00:59:25,070 --> 00:59:34,510 So now, if we wanted to do the 4.jpg thing, we would do s print f, x of this. 625 00:59:34,510 --> 00:59:38,520 So what scan f is doing--what was your question going to be? 626 00:59:38,520 --> 00:59:40,820 [Student] I'm just confused on what we're trying to do right here 627 00:59:40,820 --> 00:59:43,450 with that JPEG. Can you explain that 1 more time? 628 00:59:43,450 --> 00:59:52,710 So this was--it's less relevent to f scan f now; hopefully, it will tie back in some kind of way. 629 00:59:52,710 --> 01:00:02,240 But what I initially was intending to show was--this is actually directly relevant to these [?? F5] 630 01:00:02,240 --> 01:00:08,520 You're going to be using s print f, where, say we have 100 images, 631 01:00:08,520 --> 01:00:13,630 and you want to read image 1.jpg, 2.jpg, 3.jpg. 632 01:00:13,630 --> 01:00:21,520 So in order to do that, you need to f open, and then you have to pass in the string that you want to open. 633 01:00:21,520 --> 01:00:30,020 So we would want to open 1.jpg; in order to create the string that is 1.jpg, 634 01:00:30,020 --> 01:00:37,660 we do s print f of % d.jpg--we didn't do for int i = 0. 635 01:00:37,660 --> 01:00:46,580 i < 40, i + +. 636 01:00:46,580 --> 01:00:51,130 So s print f % d.jpg of i. 637 01:00:51,130 --> 01:00:56,320 So after this line, now the variable or the array s is going to 1.jpg. 638 01:00:56,320 --> 01:01:10,610 Or, 0.jpg, 1.jpg, 2.jpg. And so we can open, in turn, each image for reading. 639 01:01:10,610 --> 01:01:19,550 So that is what s print f does. Do you see what s print f is now doing? 640 01:01:19,550 --> 01:01:25,720 [Student] Okay, so it's taking--it creates a string, something.jpg, and then stores it. 641 01:01:25,720 --> 01:01:30,360 Yes. It creates--this is another format string, just like scan f and print f, 642 01:01:30,360 --> 01:01:37,530 where it inserts all of the variables into the second argument, might be s as opposed to i. 643 01:01:37,530 --> 01:01:42,280 Perhaps--I mean, that's the case. But whatever the order of arguments is. 644 01:01:42,280 --> 01:01:45,440 It's going to insert all of the variables into the format string 645 01:01:45,440 --> 01:01:52,250 and then store into our buffer; we call that a buffer, it's where we're storing the string. 646 01:01:52,250 --> 01:02:00,750 So we are storing inside of s the correctly-formatted string, % d having been replaced with 4. 647 01:02:00,750 --> 01:02:08,080 [Student] So if we did this, is the variable f just going to be reassigned? 648 01:02:08,080 --> 01:02:18,110 Yes. So we should close the original f before doing this. 649 01:02:18,110 --> 01:02:22,810 But--and then also, if there were not an f open up here, then we would need to say-- 650 01:02:22,810 --> 01:02:29,280 Yeah. But it would open a hundred different files. 651 01:02:29,280 --> 01:02:37,360 [Student] But we wouldn't be able to access or--okay. 652 01:02:37,360 --> 01:02:44,230 Okay. So scan f, f scan f, is kind of the same idea, 653 01:02:44,230 --> 01:02:53,610 but instead of, instead of storing it into a string, it's more like you are now 654 01:02:53,610 --> 01:03:02,420 going over a sting and pattern matching against that string and storing the results into variables. 655 01:03:02,420 --> 01:03:11,290 You can use scan f to parse over something like 4.jpg, and store the integer 4 into sum int x. 656 01:03:11,290 --> 01:03:13,430 That's what we can use scan f for. 657 01:03:13,430 --> 01:03:16,300 F scan f is going to do that at the command line. 658 01:03:16,300 --> 01:03:19,200 I'm actually pretty sure this is what the CS50 library does. 659 01:03:19,200 --> 01:03:29,050 So when you say, "get int," it's scan f-ing over--scan f is the way you get user input. 660 01:03:29,050 --> 01:03:34,670 F scan f is going to do the same thing but using a file to scan over. 661 01:03:34,670 --> 01:03:41,090 So here, we are scanning over this file. 662 01:03:41,090 --> 01:03:45,460 The pattern we are trying to match is some string that is 127 characters long 663 01:03:45,460 --> 01:03:48,100 followed by a new line 664 01:03:48,100 --> 01:03:54,770 So I'm pretty sure we could even just say "match s," since in the dictionary 665 01:03:54,770 --> 01:03:57,770 we happen to have, we're guaranteed no word is that long, 666 01:03:57,770 --> 01:04:03,310 and also f scan f, I think, will stop at the new line no matter what. 667 01:04:03,310 --> 01:04:06,970 But we'll include the new line in the match, and-- 668 01:04:06,970 --> 01:04:13,960 [Student] If we didn't include the new line, wouldn't it find parts of a word? 669 01:04:13,960 --> 01:04:22,900 It--each--looking at the dictionary-- 670 01:04:22,900 --> 01:04:26,200 So in the dictionary, these are all of our words. 671 01:04:26,200 --> 01:04:30,500 Each one is on a new line. 672 01:04:30,500 --> 01:04:32,510 The scan f is going to pick up this word. 673 01:04:32,510 --> 01:04:38,750 If we don't include the new line, then it's possible that the next scan f will just read the new line. 674 01:04:38,750 --> 01:04:44,180 But including new line then will just ignore the new line. 675 01:04:44,180 --> 01:04:49,440 But we'll never get part of a word, since we are always reading up to a new line, no matter what. 676 01:04:49,440 --> 01:04:54,530 [Student] But what if you search for the word "cissa," like c-i-s-s-a. 677 01:04:54,530 --> 01:04:57,380 Will it find that, and say it's a match? 678 01:04:57,380 --> 01:05:05,110 So here we--it will read in--this is actually a good point. 679 01:05:05,110 --> 01:05:10,660 We're never using the current--the word we're looking for is the first command line argument. 680 01:05:10,660 --> 01:05:16,460 So string, word = argv 1. 681 01:05:16,460 --> 01:05:20,020 So the string we're looking for is argv 1. 682 01:05:20,020 --> 01:05:23,290 We are not looking for a word at all in our scan f. 683 01:05:23,290 --> 01:05:28,030 What we were doing with scan f is getting each word in the dictionary, 684 01:05:28,030 --> 01:05:34,320 and then once we have that word we're going to use strcmp to compare them. 685 01:05:34,320 --> 01:05:39,210 We're going to compare our word and what we just read in. 686 01:05:39,210 --> 01:05:45,110 So inevitably, we're going to end up doing a bunch of scan fs 687 01:05:45,110 --> 01:05:52,130 until it just so happens that scan f will return-- 688 01:05:52,130 --> 01:05:54,800 it will return one, as long as it has matched a new word, 689 01:05:54,800 --> 01:06:01,360 and it will return something else as soon as it has failed to match the word. 690 01:06:01,360 --> 01:06:08,440 We are reading over the entire dictionary, storing line by line each word into the variable s. 691 01:06:08,440 --> 01:06:17,240 Then we are comparing word with s, and if the comparison = = 0, 692 01:06:17,240 --> 01:06:21,650 strcmp happens to bring 0 if a match was made. 693 01:06:21,650 --> 01:06:31,510 So if it was 0, then we can print f, matched, 694 01:06:31,510 --> 01:06:35,370 or word is in dictionary, or whatever you want to print f. 695 01:06:35,370 --> 01:06:41,450 And then--we don't want to f close over and over again. 696 01:06:41,450 --> 01:06:50,410 This is the kind of thing we want to do, and we are not just looking for word in the dictionary. 697 01:06:50,410 --> 01:06:56,660 So we could do that, if we wanted to look for their pattern, c-i-s-s-a, like you said before, 698 01:06:56,660 --> 01:07:00,260 if we wanted to look for that pattern, then it would fail in the case 699 01:07:00,260 --> 01:07:08,010 because that's not actually a word, but one of the words in the dictionary happens to have that in it. 700 01:07:08,010 --> 01:07:13,560 So it would match this word, but this subset of the word is not a word itself. 701 01:07:13,560 --> 01:07:17,250 But that's not how we're using it; we're reading in each word 702 01:07:17,250 --> 01:07:19,740 and then comparing the word we have with that word. 703 01:07:19,740 --> 01:07:25,780 So we're always comparing full words. 704 01:07:25,780 --> 01:07:29,620 I can send out the finalized solutions later. 705 01:07:29,620 --> 01:07:32,050 This is kind of nearly the right answer, I think. 706 01:07:32,050 --> 01:07:34,720 [Student comment, unintelligible] 707 01:07:34,720 --> 01:07:40,870 Oh, did I get rid of that before? Char s, I guess we said 127--I forget what the largest is. 708 01:07:40,870 --> 01:07:44,100 We'll just do 128; so now s is long enough. 709 01:07:44,100 --> 01:07:46,570 We don't need to print anything. 710 01:07:46,570 --> 01:07:56,440 We're also going to want to have to close our file, and that should be about the right answer. 711 01:07:56,440 --> 01:07:59,440 CS50.TV