1 00:00:00,000 --> 00:00:01,940 [Walkthrough - Problem Set 2] 2 00:00:01,940 --> 00:00:04,130 [Zamyla Chan - Harvard University] 3 00:00:05,170 --> 00:00:07,490 [This is CS50. CS50.TV] 4 00:00:07,490 --> 00:00:10,750 All right. Hello, everyone, and welcome to Walkthrough 2. 5 00:00:10,750 --> 00:00:14,330 First, I want to congratulate you for finishing pset 1. 6 00:00:14,330 --> 00:00:18,140 I know that it could have been a bit tough for some of you, 7 00:00:18,140 --> 00:00:20,460 could have been your first computer program that you wrote, 8 00:00:20,460 --> 00:00:24,500 but just remember that at the end of this, when you look back at the end of the semester, 9 00:00:24,500 --> 00:00:29,820 you'll look at pset 1 and you'll say, "Hey, I could have done that in 5 minutes." 10 00:00:29,820 --> 00:00:35,700 So know and trust that at the end of this you'll definitely find pset 1 quite simple. 11 00:00:35,700 --> 00:00:40,640 But for now it's a huge accomplishment, and congratulations for getting done. 12 00:00:40,640 --> 00:00:44,010 Now, also a quick note before we get into the meat of the walkthrough. 13 00:00:44,010 --> 00:00:48,340 I just want to make a quick note that I sometimes won't have enough time 14 00:00:48,340 --> 00:00:52,500 during the walkthroughs to go through every single way of doing the problem set 15 00:00:52,500 --> 00:00:56,140 and rather just maybe focus on 1 or 2 kind of implementations, 16 00:00:56,140 --> 00:00:57,750 ways that you could do this. 17 00:00:57,750 --> 00:01:01,970 But that isn't to say that you are forbidden from doing it another way. 18 00:01:01,970 --> 00:01:05,980 There are often, as with computer science, numerous ways of doing things, 19 00:01:05,980 --> 00:01:12,190 and so definitely feel free to use a different type of solution than I may have presented. 20 00:01:12,190 --> 00:01:14,520 [pset 2: Crypto - Zamyla Chan - zamyla@cs50.net] 21 00:01:14,520 --> 00:01:17,160 [pset2 - 0. A Section of Questions - 1. Caesar - 2. 
Vigenere] 22 00:01:17,160 --> 00:01:20,650 All right. So problem set 2: Crypto is a fun one. 23 00:01:20,650 --> 00:01:24,500 Again, with every pset you'll start with a section of questions 24 00:01:24,500 --> 00:01:29,600 that's going to be conducted in your sections with your assigned teaching fellow. 25 00:01:29,600 --> 00:01:31,670 We aren't going to go through these over the walkthrough, 26 00:01:31,670 --> 00:01:35,100 but they definitely will help you complete the pset. 27 00:01:35,100 --> 00:01:38,100 So the first part of the problem set is Caesar. 28 00:01:38,100 --> 00:01:43,470 And so in Caesar someone will pass you a key with an integer, 29 00:01:43,470 --> 00:01:48,420 and you will encrypt a string of text that they provide you 30 00:01:48,420 --> 00:01:50,670 and give them back an encrypted thing. 31 00:01:50,670 --> 00:01:56,050 If anyone watched A Christmas Story, there's an example of that there. 32 00:01:56,050 --> 00:01:59,090 Then the second part of the problem set is Vigenere, 33 00:01:59,090 --> 00:02:01,790 which is a more advanced encryption technique. 34 00:02:01,790 --> 00:02:05,640 And so we're going to encipher a piece of text, 35 00:02:05,640 --> 00:02:09,600 except instead with just a single integer, we're actually going to encode it 36 00:02:09,600 --> 00:02:13,340 with a keyword that the user will provide us. 37 00:02:16,270 --> 00:02:22,090 Okay, so the first tool in the toolbox today is actually going to be updating the appliance. 38 00:02:22,090 --> 00:02:26,430 On the discussion board we would see things like, "Why doesn't this work?" 39 00:02:26,430 --> 00:02:28,110 "Why doesn't Submit 50 work?" 40 00:02:28,110 --> 00:02:31,830 and often the solution is actually just to update your appliance. 
41 00:02:31,830 --> 00:02:36,730 And so if you just run sudo yum -y update in a terminal window in your appliance 42 00:02:36,730 --> 00:02:40,040 (-y is a flag saying yes, update everything), 43 00:02:40,040 --> 00:02:42,280 then your appliance will update if need be. 44 00:02:42,280 --> 00:02:46,960 And it doesn't hurt if you already are at the most recent version of the appliance. 45 00:02:46,960 --> 00:02:51,280 Then it will just say no new updates available and you can continue working along. 46 00:02:51,280 --> 00:02:55,800 But this is good to run every time that you open the appliance 47 00:02:55,800 --> 00:02:57,140 because we're still very much fixing things in the appliance 48 00:02:57,140 --> 00:03:00,320 whenever we come across a bug. 49 00:03:00,320 --> 00:03:03,180 So make sure that you have the most recent version of the appliance 50 00:03:03,180 --> 00:03:07,710 and run that update there. 51 00:03:07,710 --> 00:03:14,360 All right. So since we're dealing with letters and changing, enciphering things, 52 00:03:14,360 --> 00:03:20,410 we're going to really want to become best friends with our ASCII chart. 53 00:03:20,410 --> 00:03:24,350 There are numerous ones online that you can find. Maybe even make your own. 54 00:03:24,350 --> 00:03:29,950 Basically, with every letter and every number and every character 55 00:03:29,950 --> 00:03:32,210 there is a number associated with it, 56 00:03:32,210 --> 00:03:38,670 and so it's good to see their ASCII values alongside the actual letter. 57 00:03:38,670 --> 00:03:42,310 That will definitely help you in the problem set. 58 00:03:42,310 --> 00:03:45,750 One thing that really helped me in this problem set was to actually print it out, 59 00:03:45,750 --> 00:03:48,380 and as I was going through, I would actually draw on it, 60 00:03:48,380 --> 00:03:51,150 write, "If this has to go to there, then..." 
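For anyone who wants to make their own chart, here is a minimal sketch in C. It relies only on the fact that a character can be printed either as itself or as its number. The helper names ascii_row and print_uppercase_chart are made up for illustration and are not part of the problem set.

```c
#include <stdio.h>

/* Hypothetical helper: writes one chart row such as "A = 65" into buf.
   Returns the number of characters the row needs. */
int ascii_row(char c, char *buf, size_t size)
{
    /* %c prints the character itself, %d prints its numeric ASCII value. */
    return snprintf(buf, size, "%c = %d", c, c);
}

/* Hypothetical helper: prints one row for each uppercase letter. */
void print_uppercase_chart(void)
{
    char row[16];
    for (char c = 'A'; c <= 'Z'; c++)
    {
        ascii_row(c, row, sizeof row);
        printf("%s\n", row);
    }
}
```

The same loop with 'a' and 'z' as its bounds would cover the lowercase letters.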
61 00:03:51,150 --> 00:03:55,270 Kind of draw on it and mark it up, become best friends with your ASCII table. 62 00:03:57,240 --> 00:04:00,750 Then we have a few other tools at our disposal. 63 00:04:00,750 --> 00:04:03,750 This time instead of actually prompting the user for all of their input 64 00:04:03,750 --> 00:04:05,230 we're going to do a combination. 65 00:04:05,230 --> 00:04:06,880 We're going to prompt them for some input, 66 00:04:06,880 --> 00:04:11,350 but we're also going to just use the command line arguments. 67 00:04:11,350 --> 00:04:15,600 So when they run their program, usually you say ./hello, for instance, 68 00:04:15,600 --> 00:04:17,310 if your program was hello.c. 69 00:04:17,310 --> 00:04:22,500 But this time instead of just saying that, they can put words, arguments afterwards. 70 00:04:22,500 --> 00:04:27,210 And so we're going to use whatever they pass in to us as their input as well, 71 00:04:27,210 --> 00:04:31,720 so moving beyond just prompting for integer but also using command line arguments. 72 00:04:31,720 --> 00:04:36,590 And then we'll go into arrays and strings, which we'll be using a lot as well. 73 00:04:41,460 --> 00:04:44,810 Here's just an example of 1 mini ASCII chart. 74 00:04:44,810 --> 00:04:48,460 As I said, every letter corresponds to a number, 75 00:04:48,460 --> 00:04:52,510 and so familiarize yourself with that. It will come in handy. 76 00:04:52,510 --> 00:04:55,610 And later when we start doing some ASCIIMath dealing with the numbers-- 77 00:04:55,610 --> 00:05:00,110 adding, subtracting them--then definitely good to refer to this chart. 78 00:05:02,860 --> 00:05:06,920 So here's an example of a Caesar cipher--something that you may have played with. 79 00:05:06,920 --> 00:05:11,190 It is just a wheel. Essentially, there is an outer alphabet and then there is an inner alphabet. 80 00:05:11,190 --> 00:05:15,290 So right here is an example of the Caesar cipher but with a key of 0. 
81 00:05:15,290 --> 00:05:21,540 Essentially, A is aligned with A, B is aligned with B, all the way up to Z. 82 00:05:21,540 --> 00:05:26,590 But then say we wanted a key of 3, for instance. 83 00:05:26,590 --> 00:05:33,280 Then we would rotate the inner wheel so that A now aligns with D, etc. 84 00:05:33,280 --> 00:05:35,250 And so this is essentially what we're going to do. 85 00:05:35,250 --> 00:05:38,340 We don't have a wheel, but what we're going to do is make our program 86 00:05:38,340 --> 00:05:44,490 kind of shift the alphabet along by a certain number of positions. 87 00:05:44,490 --> 00:05:48,650 So as I said before, we're going to be dealing with command line arguments 88 00:05:48,650 --> 00:05:50,390 as well as getting an integer. 89 00:05:50,390 --> 00:05:55,050 So the way that a user will run your Caesar program is by saying ./caesar 90 00:05:55,050 --> 00:05:58,090 and then entering a number after that. 91 00:05:58,090 --> 00:06:01,130 And that number represents the key, the shift, 92 00:06:01,130 --> 00:06:06,740 how many times you're going to be rotating the inner wheel of your Caesar cipher. 93 00:06:06,740 --> 00:06:08,390 And so you see here an example. 94 00:06:08,390 --> 00:06:14,550 If we entered the letters from A to L in our Caesar cipher, 95 00:06:14,550 --> 00:06:19,520 then it would output D through O because that's every letter shifted over 3 places, 96 00:06:19,520 --> 00:06:22,080 just like the example of the wheel that I showed you. 97 00:06:22,080 --> 00:06:25,300 So then if you entered, for instance, This is CS50! 98 00:06:25,300 --> 00:06:27,960 then it would also move all of the letters. 99 00:06:27,960 --> 00:06:31,040 And an important thing in both Caesar and Vigenere 100 00:06:31,040 --> 00:06:34,890 is that we're going to skip over any non-letters. 101 00:06:34,890 --> 00:06:39,160 So any spaces, punctuation, numbers, etc., we're going to keep them the same. 
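The wheel example with a key of 3 can be sketched in C roughly as follows. This is one common way to write the shift, not the walkthrough's own code: the function name caesar_encipher, the output buffer, and the modulo-26 wraparound are all assumptions added for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: enciphers plain into out. The out buffer must be
   at least as large as plain, including its terminator. Assumes key >= 0. */
void caesar_encipher(const char *plain, int key, char *out)
{
    int shift = key % 26;   /* a full turn of the wheel changes nothing */
    size_t i;

    for (i = 0; plain[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) plain[i];

        if (isupper(c))
        {
            /* Map 'A'..'Z' to 0..25, rotate, then map back. */
            out[i] = (char) ('A' + (c - 'A' + shift) % 26);
        }
        else if (islower(c))
        {
            out[i] = (char) ('a' + (c - 'a' + shift) % 26);
        }
        else
        {
            /* Spaces, digits, and punctuation pass through unchanged. */
            out[i] = (char) c;
        }
    }
    out[i] = '\0';
}
```

With a key of 3, the letters A through L come back as D through O, and the digits and exclamation mark in This is CS50! are left alone.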
102 00:06:39,160 --> 00:06:42,920 We're only going to shift the letters in this case. 103 00:06:42,920 --> 00:06:45,870 So as you see in the wheel, we only have the letters available to us, 104 00:06:45,870 --> 00:06:50,150 so we only want to shift the letters and encrypt the letters. 105 00:06:51,370 --> 00:06:56,720 So the first thing to do, you saw that the usage for Caesar in problem set 2 106 00:06:56,720 --> 00:07:05,280 is to run Caesar and then enter a number when you run it in the terminal. 107 00:07:05,280 --> 00:07:10,940 So what we need to do is to somehow get that key and access it. 108 00:07:10,940 --> 00:07:14,730 And so we want to somehow see it's going to be the second command line argument. 109 00:07:14,730 --> 00:07:20,950 The first one is going to be ./caesar, and the next one is going to be the key number. 110 00:07:22,190 --> 00:07:29,200 So before we had int main (void) to start our C programs. 111 00:07:29,200 --> 00:07:31,790 We're going to peel back a layer a little bit 112 00:07:31,790 --> 00:07:34,720 and actually see that instead of passing in void to our main function 113 00:07:34,720 --> 00:07:37,920 we're actually dealing with 2 parameters. 114 00:07:37,920 --> 00:07:44,070 We have an int named argc and then an array of strings called argv. 115 00:07:44,070 --> 00:07:46,030 So argc is an integer, 116 00:07:46,030 --> 00:07:49,640 and it represents the number of arguments passed in to your program. 117 00:07:49,640 --> 00:07:53,590 And then argv is actually the list of the arguments passed. 118 00:07:53,590 --> 00:08:00,820 All of the arguments are strings, and so argv represents an array, a list, of strings. 119 00:08:01,830 --> 00:08:03,990 Let's talk about arrays a little bit. 120 00:08:03,990 --> 00:08:05,940 Arrays are essentially a new data structure. 121 00:08:05,940 --> 00:08:09,660 We have ints, we have doubles, we have strings, and now we have arrays. 
122 00:08:09,660 --> 00:08:13,820 Arrays are data structures that can hold multiple values of the same type, 123 00:08:13,820 --> 00:08:18,320 so essentially, a list of whatever type you want. 124 00:08:18,320 --> 00:08:24,400 Essentially, if you wanted a list of integers all in 1 variable, 125 00:08:24,400 --> 00:08:29,090 then you would create a new variable that was of type int array. 126 00:08:29,090 --> 00:08:34,450 So arrays are zero-indexed, meaning that the first element of the array is at index 0. 127 00:08:34,450 --> 00:08:41,799 If the array is of length 4, as in this example, then your last element would be at index 3, 128 00:08:41,799 --> 00:08:44,810 which is 4 - 1. 129 00:08:45,940 --> 00:08:48,420 So to create array, you would do something like this. 130 00:08:48,420 --> 00:08:51,440 Say you wanted a double array. 131 00:08:51,440 --> 00:08:56,520 This goes for any type of data type, though. 132 00:08:56,520 --> 00:09:00,210 So say you want a double array. Say you want to call it mailbox. 133 00:09:00,210 --> 00:09:04,760 Just like you would initialize any other double, 134 00:09:04,760 --> 00:09:09,760 you would say double and then the name, but this time we put the square brackets, 135 00:09:09,760 --> 00:09:13,570 and then the number there will be the length of the array. 136 00:09:13,570 --> 00:09:16,840 Note that in arrays we can't ever change the length, 137 00:09:16,840 --> 00:09:21,230 so you always have to define and choose how many boxes, 138 00:09:21,230 --> 00:09:25,440 how many values your array is going to hold. 139 00:09:25,440 --> 00:09:31,820 So to set different values in your array, you're going to use this following syntax, 140 00:09:31,820 --> 00:09:33,200 as you see on the slide. 141 00:09:33,200 --> 00:09:37,620 You have mailbox index 0 will be set to 1.2, 142 00:09:37,620 --> 00:09:42,180 mailbox index 1 set to 2.4, etc. 143 00:09:42,180 --> 00:09:47,910 So now that we've reviewed arrays a bit, let's go back to argc and argv. 
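The mailbox array from the slide might look like this in C. Only the first two values (1.2 and 2.4) come from the walkthrough. The length of 4, the last two values, and the function name mailbox_demo are placeholders chosen for illustration.

```c
/* Hypothetical demo: declares the array, fills it, and returns the last element. */
double mailbox_demo(void)
{
    double mailbox[4];   /* four boxes; the length cannot change afterwards */

    mailbox[0] = 1.2;    /* the first element lives at index 0 */
    mailbox[1] = 2.4;
    mailbox[2] = 3.6;    /* placeholder value */
    mailbox[3] = 4.8;    /* placeholder value; last valid index is 4 - 1 */

    return mailbox[3];
}
```

Writing to mailbox[4] here would be out of bounds, since the valid indices stop at 3.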
144 00:09:47,910 --> 00:09:52,220 We know that argv is now an array of strings. 145 00:09:52,220 --> 00:09:55,080 So when a user passes in--say they're running a program-- 146 00:09:55,080 --> 00:09:58,740 they say ./hello David Malan, 147 00:09:58,740 --> 00:10:05,160 what the program will do for you already is actually come up with what argc and argv are. 148 00:10:05,160 --> 00:10:07,620 So you don't need to worry about that. 149 00:10:07,620 --> 00:10:14,370 Argc in this case would be 3 because it sees 3 distinct words separated by spaces. 150 00:10:14,370 --> 00:10:18,850 And so then the array in this instance, the first index would be ./hello, 151 00:10:18,850 --> 00:10:21,770 the next one David, the next one Malan. 152 00:10:21,770 --> 00:10:25,640 Does anyone see right away what the relationship between argv, 153 00:10:25,640 --> 00:10:28,990 the array, and argc is? 154 00:10:32,820 --> 00:10:38,090 Yeah. We'll get into that in an example in args.c. 155 00:10:38,090 --> 00:10:42,880 Let's see if we can take advantage of the relationship between the 2. 156 00:10:42,880 --> 00:10:46,550 Here you might find that in the appliance the default application 157 00:10:46,550 --> 00:10:49,450 to open .c files is sometimes Emacs. 158 00:10:49,450 --> 00:10:54,660 But we want to deal with gedit, so what you can do is you can right click on your C file, 159 00:10:54,660 --> 00:11:04,580 go to Properties, Open With, and then choose gedit, Set as default, 160 00:11:04,580 --> 00:11:13,020 and now your program should open in gedit instead of Emacs. 161 00:11:14,710 --> 00:11:16,290 Perfect. 162 00:11:17,120 --> 00:11:25,520 So here I have a program that I want to print out each command line argument. 163 00:11:25,520 --> 00:11:32,050 So whatever the user inputs, I want to essentially return it back to them on a new line. 
164 00:11:32,050 --> 00:11:36,710 So what's a structure that we can use to iterate over something-- 165 00:11:36,710 --> 00:11:40,380 something that you probably used in your pset 1? 166 00:11:40,380 --> 00:11:45,840 If you want to go through a set number of things? >>[student] For loop. 167 00:11:45,840 --> 00:11:48,910 For loop. Exactly. So let's start with the for loop. 168 00:11:48,910 --> 00:11:56,900 We have for int i = 0. Let's just start with a standard initialization variable. 169 00:11:56,900 --> 00:12:02,370 I'm going to leave the condition for a sec and then say i++, and we're going to do things there. 170 00:12:02,370 --> 00:12:04,090 All right. 171 00:12:04,090 --> 00:12:11,590 So thinking back to argv, if argv is the list of arguments passed in to the program 172 00:12:11,590 --> 00:12:15,380 and argc is the number of arguments in the program, 173 00:12:15,380 --> 00:12:21,280 then that means that argc is essentially the length of argv, right, 174 00:12:21,280 --> 00:12:28,970 because there are going to be as many arguments as the value of argc. 175 00:12:28,970 --> 00:12:35,910 So if we want to iterate over each element in argv, 176 00:12:35,910 --> 00:12:43,290 we're going to want to each time access the variable in argv at the given index. 177 00:12:43,290 --> 00:12:49,060 That can be represented with argv[i], right? 178 00:12:49,060 --> 00:12:53,430 This variable here represents the particular string in this instance 179 00:12:53,430 --> 00:12:57,030 because it's a string array--the particular string at that given index. 180 00:12:57,030 --> 00:13:00,690 What we want to do, in this case we want to print it out, so let's say printf. 181 00:13:00,690 --> 00:13:04,680 And now argv[i] is a string, so we want to put the %s placeholder there. 182 00:13:04,680 --> 00:13:08,430 We want a new line just to make it look good. 183 00:13:08,430 --> 00:13:12,530 So here we have a for loop. We don't have the condition yet. 
184 00:13:12,530 --> 00:13:20,020 So i starts at 0, and then every time it's going to print the given string 185 00:13:20,020 --> 00:13:22,980 at that particular index in the array. 186 00:13:22,980 --> 00:13:28,410 So when do we want to stop printing out elements in the array? 187 00:13:28,410 --> 00:13:35,720 When we've finished, right? When we've reached the end of the array. 188 00:13:35,720 --> 00:13:38,870 So we don't want to exceed past the length of the array, 189 00:13:38,870 --> 00:13:43,700 and we already know we don't need to actually actively find out what the length of argv is 190 00:13:43,700 --> 00:13:47,520 because it's given to us, and what's that? Argc. Exactly. 191 00:13:47,520 --> 00:13:56,640 So we want to do this process argc number of times. 192 00:13:56,640 --> 00:13:59,550 I'm not in the right directory. 193 00:14:02,100 --> 00:14:03,490 All right. 194 00:14:03,490 --> 00:14:08,990 Now let's make args. No errors, which is great. 195 00:14:08,990 --> 00:14:11,430 So let's just run args. 196 00:14:11,430 --> 00:14:15,130 What is this going to return to us? It's just going to print it back. 197 00:14:15,130 --> 00:14:18,320 "You inputted args into the program; I'm going to give it back to you." 198 00:14:18,320 --> 00:14:23,170 So let's say we want to say args then foo bar. 199 00:14:23,170 --> 00:14:26,570 So then it prints it out back to us. All right? 200 00:14:26,570 --> 00:14:30,790 So there is an example of how you can use argc and argv 201 00:14:30,790 --> 00:14:33,460 knowing that argc represents the length of argv. 202 00:14:33,460 --> 00:14:42,750 Make sure that you do not ever with arrays access one beyond the length of the array 203 00:14:42,750 --> 00:14:45,140 because C will definitely shout at you. 
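The finished args.c is not shown in the transcript, so here is a sketch of what the loop built up above could look like. The helper name print_args and its return value are assumptions that keep the loop separate from main. In the walkthrough the loop sits directly inside int main(int argc, char *argv[]).

```c
#include <stdio.h>

/* Prints each command line argument on its own line.
   Returns how many arguments were printed. */
int print_args(int argc, char *argv[])
{
    int printed = 0;

    /* argc is the length of argv, so valid indices run from 0 to argc - 1. */
    for (int i = 0; i < argc; i++)
    {
        printf("%s\n", argv[i]);
        printed++;
    }
    return printed;
}
```

Called from main with the argc and argv it receives, running ./args foo bar would typically print ./args, foo, and bar on three separate lines. The condition i < argc is what keeps the loop from reading past the end of the array.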
204 00:14:45,140 --> 00:14:47,560 You'll get something called a segmentation fault, 205 00:14:47,560 --> 00:14:52,470 which is never fun, basically saying you're trying to access something 206 00:14:52,470 --> 00:14:55,000 that doesn't exist, doesn't belong to you. 207 00:14:55,000 --> 00:14:59,430 So make sure, and especially with the zero-indexing, we don't want to-- 208 00:14:59,430 --> 00:15:02,390 Like for instance, if we have an array of length 4, 209 00:15:02,390 --> 00:15:07,240 that array index 4 doesn't exist because we start at 0, at zero index. 210 00:15:07,240 --> 00:15:11,730 It will become second nature just like for loops when we start at 0. 211 00:15:11,730 --> 00:15:13,610 So just keep that in mind. 212 00:15:13,610 --> 00:15:22,590 You don't want to ever access the index of an array that's beyond your reach. 213 00:15:26,710 --> 00:15:32,560 So we can see now how we can kind of access 214 00:15:32,560 --> 00:15:35,930 the command line arguments that are passed in. 215 00:15:35,930 --> 00:15:41,330 But as you saw the string, the argv is actually a string array. 216 00:15:41,330 --> 00:15:45,740 So it's actually not an integer yet, but in Caesar we want to deal with integers. 217 00:15:45,740 --> 00:15:54,430 Luckily, there is a function created for us that can actually convert a string to an integer. 218 00:15:54,430 --> 00:15:58,710 Also in here we aren't dealing with user input where we're prompting them 219 00:15:58,710 --> 00:16:03,740 for input here for the key, so we can't actually reprompt and say, 220 00:16:03,740 --> 00:16:07,840 "Oh, give me another integer, say, if it's not valid." 221 00:16:07,840 --> 00:16:10,540 But we do still need to check for correct usage. 222 00:16:10,540 --> 00:16:13,520 In Caesar they are only allowed to pass in 1 number, 223 00:16:13,520 --> 00:16:18,030 and so they have to run ./caesar and then they have to give you a number. 224 00:16:18,030 --> 00:16:23,660 So argc has to be a certain number. 
225 00:16:23,660 --> 00:16:29,060 What number would that be if they have to pass you the ./caesar and then a key? 226 00:16:29,060 --> 00:16:32,920 What is argc? >>[student] 2. >>Two. Exactly. 227 00:16:32,920 --> 00:16:35,490 So you want to make sure that argc is 2. 228 00:16:35,490 --> 00:16:39,620 Otherwise you basically refuse to run the program. 229 00:16:39,620 --> 00:16:43,040 In main it's a function that says int main, 230 00:16:43,040 --> 00:16:47,360 so then we always in good practice return 0 at the end of a successful program. 231 00:16:47,360 --> 00:16:50,840 So if, say, they give you 3 command line arguments instead of 2 232 00:16:50,840 --> 00:16:54,350 or give you 1, for instance, then what you'll do is you'll want to check for that 233 00:16:54,350 --> 00:16:59,900 and then return 1 saying, no, I can't proceed with this program. 234 00:16:59,900 --> 00:17:03,190 [student] There can't be a space in your text. >>Pardon me? 235 00:17:03,190 --> 00:17:06,780 [student] There can't be a space in the text you're trying to encrypt. 236 00:17:06,780 --> 00:17:08,480 Ah! 237 00:17:08,480 --> 00:17:11,280 In terms of the text that we're trying to encrypt, that actually comes later 238 00:17:11,280 --> 00:17:13,970 when we give that text. 239 00:17:13,970 --> 00:17:18,260 So right now we're just accepting as command arguments the actual number, 240 00:17:18,260 --> 00:17:21,579 the actual shift for the Caesar encryption. 241 00:17:21,579 --> 00:17:27,569 [student] Why do you need 2 as opposed to just 1 argc? There's definitely 1 number. 242 00:17:27,569 --> 00:17:32,200 Right. The reason why we need 2 for argc instead of 1 243 00:17:32,200 --> 00:17:36,260 is because when you run a program and say ./caesar or ./hello, 244 00:17:36,260 --> 00:17:38,280 that actually counts as a command line argument. 245 00:17:38,280 --> 00:17:43,020 So then that already takes up 1 and so then we're inputting 1 extra. 
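A sketch of the usage check described above. The function name check_usage and the wording of the message are assumptions. The walkthrough only specifies that argc must be 2 and that main should return 1 otherwise.

```c
#include <stdio.h>

/* Returns 0 when exactly one argument follows the program name, 1 otherwise. */
int check_usage(int argc)
{
    if (argc != 2)
    {
        /* Hypothetical usage message; the exact text is up to you. */
        printf("Usage: ./caesar key\n");
        return 1;
    }
    return 0;
}
```

In main, a nonzero result from check_usage(argc) would be the cue to return 1 right away, before touching argv[1].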
246 00:17:45,030 --> 00:17:49,440 So you're inputting actually a string in the command line argument. 247 00:17:49,440 --> 00:17:52,730 What you want to do, for Caesar we want to deal with an integer, 248 00:17:52,730 --> 00:17:57,180 so you can use this atoi function. 249 00:17:57,180 --> 00:18:02,850 And basically, you pass it a string and then it will return you back an integer 250 00:18:02,850 --> 00:18:06,070 if it's possible to make that string into an integer. 251 00:18:06,070 --> 00:18:10,960 Now remember when we're dealing with printf or GetString, things like that, 252 00:18:10,960 --> 00:18:13,390 we include the libraries that are specific to us. 253 00:18:13,390 --> 00:18:19,450 So at the beginning we start with #include <stdio.h>, something like that. 254 00:18:19,450 --> 00:18:22,430 Well, atoi isn't within one of those libraries, 255 00:18:22,430 --> 00:18:26,600 so what we have to do is we have to include the right library for that. 256 00:18:26,600 --> 00:18:32,720 So recall back to Walkthrough 1 where I discussed the man command, the manual. 257 00:18:32,720 --> 00:18:37,110 You type man in your terminal, followed by the name of a function. 258 00:18:37,110 --> 00:18:39,720 And so that will bring up a whole list of its usage, 259 00:18:39,720 --> 00:18:42,890 but as well it will bring up which library that belongs to. 260 00:18:42,890 --> 00:18:47,000 So I'll leave that to you to use the manual with atoi 261 00:18:47,000 --> 00:18:53,360 and figure out which library you need to include to be able to use the atoi function. 262 00:18:54,450 --> 00:18:57,670 So we've got the key and now it comes to getting the plain text, 263 00:18:57,670 --> 00:19:01,820 and so that actually is going to be user input where you prompt. 264 00:19:01,820 --> 00:19:05,540 We dealt with GetInt and GetFloat, and so in the same vein 265 00:19:05,540 --> 00:19:07,670 we're going to be dealing with GetString. 
266 00:19:07,670 --> 00:19:12,440 But in this case we don't need to do any do while or while loops to check. 267 00:19:12,440 --> 00:19:14,480 GetString will definitely give us a string, 268 00:19:14,480 --> 00:19:17,630 and we're going to encrypt whatever the user gives us. 269 00:19:17,630 --> 00:19:23,770 So you can assume that all of these user inputted strings are correct. 270 00:19:23,770 --> 00:19:24,670 Great. 271 00:19:24,670 --> 00:19:27,270 So then once you've got the key and once you've got the text, 272 00:19:27,270 --> 00:19:31,660 now what's left is you have to encipher the plaintext. 273 00:19:31,660 --> 00:19:36,530 Just to quickly cover over lingo, the plaintext is what the user gives you, 274 00:19:36,530 --> 00:19:41,030 and the ciphertext is what you return to them. 275 00:19:42,450 --> 00:19:45,850 So strings, to be able to go through actually letter by letter 276 00:19:45,850 --> 00:19:48,550 because we have to shift every letter, 277 00:19:48,550 --> 00:19:51,390 we understand that strings, if we kind of peel back the layer, 278 00:19:51,390 --> 00:19:54,130 we see that they're just really a list of characters. 279 00:19:54,130 --> 00:19:55,930 One comes after the other. 280 00:19:55,930 --> 00:20:01,690 And so we can treat strings as arrays because they are arrays of characters. 281 00:20:01,690 --> 00:20:05,640 So say you have a string named text, 282 00:20:05,640 --> 00:20:09,400 and within that variable text is stored This is CS50. 283 00:20:09,400 --> 00:20:15,680 Then text at index 0 would be a capital T, index 1 would be h, etc. 284 00:20:17,530 --> 00:20:23,970 And then with arrays, in the argc example in args.c, 285 00:20:23,970 --> 00:20:27,090 we saw that we had to iterate over an array 286 00:20:27,090 --> 00:20:32,440 and so we had to iterate from i = 0 up until i is less than the length. 
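Because a string is an array of characters, individual letters can be read with the same square-bracket syntax used for any other array. A small sketch follows. It uses a plain pointer to char rather than the CS50 string type so that it needs no extra library, which is an assumption made for illustration, and char_at is a hypothetical helper.

```c
/* Returns the character stored at the given index of text.
   The caller must make sure the index is inside the string. */
char char_at(const char *text, int index)
{
    return text[index];
}
```

With text holding This is CS50, index 0 gives a capital T and index 1 gives h, matching the example above.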
287 00:20:32,440 --> 00:20:35,560 So we need some way of figuring out what the length of our string is 288 00:20:35,560 --> 00:20:37,090 if we're going to iterate over it. 289 00:20:37,090 --> 00:20:42,300 Luckily again, there is a function there for us, although later on in CS50 290 00:20:42,300 --> 00:20:45,860 you'll definitely be able to implement and make your own function 291 00:20:45,860 --> 00:20:48,260 that can calculate the length of a string. 292 00:20:48,260 --> 00:20:52,120 But for now we're going to use string length, so strlen. 293 00:20:52,120 --> 00:21:00,440 You pass in a string, and then it will return you an int that represents the length of your string. 294 00:21:00,440 --> 00:21:05,840 Let's look at an example of how we might be able to iterate over each character in a string 295 00:21:05,840 --> 00:21:08,470 and do something with that. 296 00:21:08,470 --> 00:21:13,250 What we want to do is iterate over each character of the string, 297 00:21:13,250 --> 00:21:19,150 and what we want to do is we print back each character 1 by 1 298 00:21:19,150 --> 00:21:22,060 except we add something next to it. 299 00:21:22,060 --> 00:21:27,020 So let's start with the for loop. Int i = 0. 300 00:21:27,020 --> 00:21:30,070 We're going to leave space for the condition. 301 00:21:32,700 --> 00:21:36,840 We want to iterate until we reach the end of the string, right? 302 00:21:36,840 --> 00:21:41,340 So then what function gives us the length of the string? 303 00:21:41,340 --> 00:21:43,160 [inaudible student response] 304 00:21:43,160 --> 00:21:46,420 That's the length of the command line arguments. 305 00:21:46,420 --> 00:21:50,650 But for a string we want to use a function that gives us the length of the string. 306 00:21:50,650 --> 00:21:53,090 So that's string length. 307 00:21:53,090 --> 00:21:57,130 And so then you have to pass in a string to it. 308 00:21:57,130 --> 00:21:59,760 It needs to know what string it needs to calculate the length of. 
309 00:21:59,760 --> 00:22:03,160 So then in this case we're dealing with string s. 310 00:22:04,790 --> 00:22:05,860 Great. 311 00:22:05,860 --> 00:22:10,770 So then what we want to do, let's printf. 312 00:22:10,770 --> 00:22:14,850 Now, we want to deal with characters. We want to print out each individual character. 313 00:22:14,850 --> 00:22:22,150 When you want it to print out a float, you would use the placeholder like %f. 314 00:22:22,150 --> 00:22:24,580 With an int you would use %d. 315 00:22:24,580 --> 00:22:30,890 And so similarly, with a character you use %c to say I'm going to be printing a character 316 00:22:30,890 --> 00:22:34,570 that's stored inside a variable. 317 00:22:34,570 --> 00:22:40,840 So we have this, and let's add a period and a space to it. 318 00:22:40,840 --> 00:22:45,430 Which character are we using? 319 00:22:45,430 --> 00:22:49,780 We're going to be using whatever character we're at of the string. 320 00:22:49,780 --> 00:22:52,890 So then we're going to be using something with string, 321 00:22:52,890 --> 00:22:56,420 but we want to be accessing the certain character there. 322 00:22:56,420 --> 00:23:02,740 So if a string is just an array, then how do we access elements of arrays? 323 00:23:02,740 --> 00:23:06,480 We have those square brackets, and then we put the index in there. 324 00:23:06,480 --> 00:23:11,820 So we have square brackets. Our index in this case we can just use i. Exactly. 325 00:23:15,290 --> 00:23:22,370 So here we're saying we're going to be printing a character followed by a dot and a space, 326 00:23:22,370 --> 00:23:30,870 and that character is going to be the ith letter in our string s. 327 00:23:32,920 --> 00:23:39,330 I'm just going to save that. Okay. 328 00:23:42,510 --> 00:23:46,840 Now I'm going to run string length. 329 00:23:46,840 --> 00:23:53,440 So we had a string called OMG, and now it's emphasized even more. 
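The loop assembled above, collected in one place as a sketch. The transcript does not show the finished file, so the function name emphasize, its return value, and the trailing newline are assumptions added for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Prints every character of s followed by a period and a space.
   Returns the number of characters taken from s. */
int emphasize(const char *s)
{
    int length = (int) strlen(s);   /* computed once, before the loop */

    for (int i = 0; i < length; i++)
    {
        /* %c is the placeholder for a single character. */
        printf("%c. ", s[i]);
    }
    printf("\n");
    return length;
}
```

Calling emphasize("OMG") prints O. M. G. with a period and a space after each letter.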
330 00:23:53,440 --> 00:23:57,870 Similarly, let's say we actually want to get a string from the user. 331 00:23:57,870 --> 00:23:59,580 How might we do this? 332 00:23:59,580 --> 00:24:01,610 Before, how did we get an int? 333 00:24:01,610 --> 00:24:08,040 We said GetInt, right? But this isn't int, so let's GetString. 334 00:24:11,780 --> 00:24:17,770 Let's make string length. Here we didn't enter a specific prompt. 335 00:24:17,770 --> 00:24:19,940 So I don't know. 336 00:24:19,940 --> 00:24:23,820 I'm going to put my name in here and so then I can do one of those things 337 00:24:23,820 --> 00:24:29,600 where I assign a word for every letter or something like that. Cool. 338 00:24:29,600 --> 00:24:31,900 So that's string length. 339 00:24:33,000 --> 00:24:34,640 So we're back to Caesar. 340 00:24:34,640 --> 00:24:38,620 We have a few tools on how we iterate over a string, 341 00:24:38,620 --> 00:24:41,250 how we access each individual element. 342 00:24:41,250 --> 00:24:44,720 So now we can get back to the program. 343 00:24:44,720 --> 00:24:48,650 As I mentioned before, in the ASCII table, your best friend, 344 00:24:48,650 --> 00:24:52,300 you're going to see the numbers that are associated with every letter. 345 00:24:52,300 --> 00:24:55,900 So here say our plaintext is I'm dizzy! 346 00:24:55,900 --> 00:25:01,090 Then each of these characters is going to have a number and ASCII value associated with it, 347 00:25:01,090 --> 00:25:04,710 even the apostrophe, even the space, even the exclamation mark, 348 00:25:04,710 --> 00:25:06,600 so you'll want to keep that in mind. 349 00:25:06,600 --> 00:25:12,360 So say our key that the user included in their command line argument is 6. 350 00:25:12,360 --> 00:25:17,770 That means for the first letter, which is I, which is represented by 73, 351 00:25:17,770 --> 00:25:25,610 you want to return to them whatever letter is represented by the ASCII value of 73 + 6. 
352 00:25:25,610 --> 00:25:29,020 In this case that would be 79. 353 00:25:30,840 --> 00:25:35,040 Now we want to go to the next character. 354 00:25:35,040 --> 00:25:40,960 So the next one, at index 1 of the plaintext, would be the apostrophe. 355 00:25:40,960 --> 00:25:46,780 But remember we only want to encipher the letters. 356 00:25:46,780 --> 00:25:50,040 So we want to make sure that the apostrophe actually stays the same, 357 00:25:50,040 --> 00:25:54,310 that we don't change it from 39 to whatever 45 is. 358 00:25:54,310 --> 00:25:57,150 We want to keep it as an apostrophe. 359 00:25:57,150 --> 00:26:00,780 So we want to remember to only encipher the letters 360 00:26:00,780 --> 00:26:04,560 because we want all of the other symbols to remain unchanged in our program. 361 00:26:04,560 --> 00:26:07,130 Another thing that we want is to preserve capitalization. 362 00:26:07,130 --> 00:26:10,250 So when you have an uppercase letter, it should stay as an uppercase. 363 00:26:10,250 --> 00:26:12,830 Lowercase should stay as lowercase. 364 00:26:13,620 --> 00:26:19,480 So some useful functions to be able to deal with only enciphering letters 365 00:26:19,480 --> 00:26:22,380 and preserving the capitalization of things 366 00:26:22,380 --> 00:26:25,130 are the isalpha, isupper, islower functions. 367 00:26:25,130 --> 00:26:29,270 And so these are functions that return you a Boolean value. 368 00:26:29,270 --> 00:26:34,180 Basically, true or false. Is this an uppercase? Is this alphabetic? 369 00:26:34,180 --> 00:26:37,180 Is this a letter, essentially. 370 00:26:37,180 --> 00:26:41,070 So here are 3 examples of how you would use those functions. 371 00:26:41,070 --> 00:26:47,060 Basically, you could test whether the value returned to you by that function is true or false 372 00:26:47,060 --> 00:26:49,400 based on that input. 373 00:26:49,400 --> 00:26:54,880 Either don't encipher something or encipher it or make sure that it's uppercase, etc. 
374 00:26:54,880 --> 00:27:01,080 [student] Can you just explain those a little more and how you use them? >>Yeah, for sure. 375 00:27:01,080 --> 00:27:08,470 So if we look back, here we have a capital I, right? 376 00:27:08,470 --> 00:27:14,550 So we know that I goes to O because I + 6 is O. 377 00:27:14,550 --> 00:27:18,740 But we want to make sure that that O is going to be a capital O. 378 00:27:18,740 --> 00:27:22,940 So basically, that is kind of going to change how we handle our input. 379 00:27:22,940 --> 00:27:26,870 So whether it's uppercase or not will kind of change the way that we deal with it. 380 00:27:26,870 --> 00:27:32,360 So then if we use the isupper function on that particular index, 381 00:27:32,360 --> 00:27:36,480 so isupper('I'), that returns for us true, so we know that it's uppercase. 382 00:27:36,480 --> 00:27:40,360 So then based on that, later we'll go into a formula 383 00:27:40,360 --> 00:27:42,750 that you'll be using to shift things in Caesar, 384 00:27:42,750 --> 00:27:46,560 so then basically, there's going to be a slightly different formula if it's uppercase 385 00:27:46,560 --> 00:27:50,670 as opposed to lowercase. Make sense? 386 00:27:51,020 --> 00:27:52,760 Yeah. No worries. 387 00:27:54,900 --> 00:27:58,990 I talked a bit about adding 6 to a letter, which doesn't quite make sense 388 00:27:58,990 --> 00:28:05,500 except when we kind of understand that these characters 389 00:28:05,500 --> 00:28:08,920 are kind of interchangeable with integers. 390 00:28:08,920 --> 00:28:11,250 What we do is we kind of use implicit casting. 391 00:28:11,250 --> 00:28:18,100 We'll go into casting a bit later on where you take a value and you turn it into a different type 392 00:28:18,100 --> 00:28:20,440 than it originally was. 393 00:28:20,440 --> 00:28:25,910 But with this pset we'll be able to kind of interchangeably use characters 394 00:28:25,910 --> 00:28:30,880 and their corresponding integer values. 
395 00:28:30,880 --> 00:28:35,140 So if you simply encase a character with just the single quotes, 396 00:28:35,140 --> 00:28:40,390 then you'll be able to work with it with integers, dealing with it as an integer. 397 00:28:40,390 --> 00:28:48,040 So the capital C relates to 67. Lowercase f relates to 102. 398 00:28:48,040 --> 00:28:51,480 Again, if you want to know these values, look at your ASCII table. 399 00:28:51,480 --> 00:28:56,160 So let's go into some examples of how you might be able to subtract and add, 400 00:28:56,160 --> 00:29:03,130 how you can actually really work with these characters, use them interchangeably. 401 00:29:03,870 --> 00:29:11,350 I say that ASCIIMath is going to calculate the addition of a character to an integer 402 00:29:11,350 --> 00:29:17,590 and then displays the resultant character as well as the resultant ASCII value. 403 00:29:17,590 --> 00:29:22,290 And so here I'm saying--we'll deal with this part later-- 404 00:29:22,290 --> 00:29:29,100 but basically, I'm saying that the user should say run ASCIIMath along with a key, 405 00:29:29,100 --> 00:29:30,880 and I'm saying that that key is going to be the number 406 00:29:30,880 --> 00:29:34,600 with which we're going to add this character. 407 00:29:34,600 --> 00:29:38,560 So here notice that since I'm demanding a key, 408 00:29:38,560 --> 00:29:40,590 since I'm demanding that they're giving me 1 thing, 409 00:29:40,590 --> 00:29:45,600 I only want to accept ./asciimath and a key. 410 00:29:45,600 --> 00:29:49,330 So I'm going to demand that argc is equal to 2. 411 00:29:49,330 --> 00:29:54,360 If it's not, then I'm going to return 1 and the program will exit. 412 00:29:55,070 --> 00:29:58,540 So I'm saying the key isn't going to be the first command line argument, 413 00:29:58,540 --> 00:30:05,080 it's going to be the second one, and as you see here, 414 00:30:05,080 --> 00:30:11,790 I'm going to turn that into an integer. 
415 00:30:15,740 --> 00:30:19,230 Then I'm going to set a character to be r. 416 00:30:19,230 --> 00:30:23,970 Notice that the type of the variable chr is actually an integer. 417 00:30:23,970 --> 00:30:30,480 The way that I'm able to use r as an integer is by encasing it with these single quotes. 418 00:30:33,850 --> 00:30:40,560 So back to our printf statement where we have a placeholder for a character 419 00:30:40,560 --> 00:30:43,590 and then a placeholder for an integer, 420 00:30:43,590 --> 00:30:49,450 the character is represented by the chr, and the integer is the key. 421 00:30:49,450 --> 00:30:54,320 And so then we're going to in result add the 2 together. 422 00:30:54,320 --> 00:30:58,420 So we're going to add r + whatever the key is, 423 00:30:58,420 --> 00:31:03,520 and then we're going to print the result of that. 424 00:31:06,210 --> 00:31:14,220 So let's make asciimath. It's up to date, so let's just run asciimath. 425 00:31:14,220 --> 00:31:18,290 Oh, but see, it doesn't do anything because we didn't actually give it a key. 426 00:31:18,290 --> 00:31:23,850 So when it just returned 1, our main function, it just returned back to us. 427 00:31:23,850 --> 00:31:29,250 So then let's pass in a key. Someone give me a number. >>[student] 4. 428 00:31:29,250 --> 00:31:30,920 4. Okay. 429 00:31:30,920 --> 00:31:39,280 So r increased by 4 is going to give us v, which corresponds to the ASCII value of 118. 430 00:31:39,280 --> 00:31:43,880 So then it kind of makes sense that-- 431 00:31:43,880 --> 00:31:51,250 Actually, can I ask you, what do you think the ASCII value of r is if r + 4 is 118? 432 00:31:53,070 --> 00:31:55,470 Then yeah, r is 114. 433 00:31:55,470 --> 00:32:03,010 So if you look on the ASCII table then, sure enough, you'll see that r is represented by 114. 434 00:32:03,010 --> 00:32:08,610 So now that we know that we can add integers to characters, this seems pretty simple. 
435 00:32:08,610 --> 00:32:12,740 We're just going to iterate over a string like we saw in an example before. 436 00:32:12,740 --> 00:32:17,170 We'll check if it's a letter. 437 00:32:17,170 --> 00:32:20,420 If it is, then we'll shift it by whatever the key is. 438 00:32:20,420 --> 00:32:23,650 Pretty simple, except when you get to like this, 439 00:32:23,650 --> 00:32:32,140 you see that z, represented by 122, then would give you a different character. 440 00:32:32,140 --> 00:32:37,770 We actually want to stay within our alphabet, right? 441 00:32:37,770 --> 00:32:43,180 So we need to figure out some way of kind of wrapping around. 442 00:32:43,180 --> 00:32:47,190 When you reach zed and you want to increase by a certain number, 443 00:32:47,190 --> 00:32:51,230 you don't want to go into beyond the ASCII alphabet section; 444 00:32:51,230 --> 00:32:54,140 you want to wrap back all the way to A. 445 00:32:54,140 --> 00:32:58,550 But keep in mind you're still preserving the case. 446 00:32:58,550 --> 00:33:00,980 So knowing that letters can't become symbols 447 00:33:00,980 --> 00:33:05,290 just like symbols aren't going to be changing as well. 448 00:33:05,290 --> 00:33:08,170 In the last pset you definitely didn't need to, 449 00:33:08,170 --> 00:33:14,310 but an option was to implement your greedy pset by using the modulus function. 450 00:33:14,310 --> 00:33:17,230 But now we're actually going to need to use modulus, 451 00:33:17,230 --> 00:33:19,900 so let's just go over this a little bit. 452 00:33:19,900 --> 00:33:26,920 Essentially, when you have x modulo y, that gives you the remainder of x divided by y. 453 00:33:26,920 --> 00:33:30,930 Here are some examples here. We have 27 % 15. 454 00:33:30,930 --> 00:33:36,200 Basically, when you subtract 15 from 27 as many times as possible without getting negative 455 00:33:36,200 --> 00:33:39,060 then you get 12 left over. 
456 00:33:39,060 --> 00:33:44,650 So that's kind of like in the math context, but how can we actually use this? 457 00:33:44,650 --> 00:33:47,100 It's going to be useful for our wrapover. 458 00:33:47,100 --> 00:33:55,420 For this, let's just say I asked you all to divide into 3 groups. 459 00:33:55,420 --> 00:33:58,010 Sometimes you do this in groups and something like that. 460 00:33:58,010 --> 00:34:01,320 Say I said, "Okay, I want you all to be divided into 3." 461 00:34:01,320 --> 00:34:04,240 How might you do that? 462 00:34:04,240 --> 00:34:06,810 [inaudible student response] Yeah, exactly. Count off. Okay. 463 00:34:06,810 --> 00:34:10,260 Let's actually do that. Do you want to start? 464 00:34:10,260 --> 00:34:13,810 [students counting off] 1, 2, 3, 4. 465 00:34:13,810 --> 00:34:16,620 But remember... >>[student] Oh, sorry. 466 00:34:16,620 --> 00:34:18,730 That's a really good point. 467 00:34:18,730 --> 00:34:24,130 You said 4, but we actually want you to say 1 because we only want 3 groups. 468 00:34:24,130 --> 00:34:30,159 So then, how-- No, that's a really good example because then how might you say 1? 469 00:34:30,159 --> 00:34:33,370 What's the relationship between 4 and 1? 470 00:34:33,370 --> 00:34:36,760 Well, 4 mod 3 is 1. 471 00:34:36,760 --> 00:34:41,460 So if you continue, you would be 2. 472 00:34:41,460 --> 00:34:44,540 So we have 1, 2, 3, 1, 2. 473 00:34:44,540 --> 00:34:49,420 Again, you're actually the 5th person. How do you know to say 2 instead of 5? 474 00:34:49,420 --> 00:34:53,760 You say 5 mod 3 is 2. 475 00:34:53,760 --> 00:34:59,100 I want to see how many groups of 3 are left over, then which order am I. 476 00:34:59,100 --> 00:35:02,860 And so then if we continued along the whole room, 477 00:35:02,860 --> 00:35:07,760 then we would see that we're always actually applying the mod function to ourselves 478 00:35:07,760 --> 00:35:09,990 to kind of count off. 
479 00:35:09,990 --> 00:35:14,490 That's a more kind of tangible example of how you might use modulo 480 00:35:14,490 --> 00:35:17,960 because I'm sure most of us have probably gone through that process 481 00:35:17,960 --> 00:35:19,630 where we've had to count off. 482 00:35:19,630 --> 00:35:21,840 Any questions on modulo? 483 00:35:21,840 --> 00:35:25,360 It will be pretty important to understand the concepts of this, 484 00:35:25,360 --> 00:35:28,640 so I want to make sure you guys understand. 485 00:35:28,640 --> 00:35:34,660 [student] If there is no remainder, does it give you the actual number? 486 00:35:34,660 --> 00:35:40,430 If one of the first 3 of them had done it, would it have given them what they actually were, 487 00:35:40,430 --> 00:35:43,310 or would it have given them [inaudible] >>That's a good question. 488 00:35:43,310 --> 00:35:48,750 When there is no remainder for the modulo--so say you have 6 mod 3-- 489 00:35:48,750 --> 00:35:52,340 that actually gives you back 0. 490 00:35:53,670 --> 00:35:57,290 We'll talk about that a bit later. 491 00:35:58,810 --> 00:36:07,720 Oh yeah, for instance, the 3rd person--3 mod 3 is actually 0 but she said 3. 492 00:36:07,720 --> 00:36:14,900 So that's kind of like an inner catch, for instance, 493 00:36:14,900 --> 00:36:17,620 like okay, if the mod is 0 then I'm going to be the 3rd person. 494 00:36:17,620 --> 00:36:22,740 But we'll get into kind of how we might want to deal with what 0 is later. 495 00:36:22,740 --> 00:36:32,750 So now we somehow have a way of mapping the zed to the right letter. 496 00:36:32,750 --> 00:36:34,920 So now we've gone through these examples, 497 00:36:34,920 --> 00:36:37,880 we kind of see how Caesar might work. 498 00:36:37,880 --> 00:36:42,640 You see the 2 alphabets and then you see them shifting. 499 00:36:42,640 --> 00:36:44,430 So let's try and express that in terms of formula. 
500 00:36:44,430 --> 00:36:46,940 This formula is actually given to you in the spec, 501 00:36:46,940 --> 00:36:52,070 but let's kind of look through what each variable means. 502 00:36:52,070 --> 00:36:55,000 Our end result is going to be the ciphertext. 503 00:36:55,000 --> 00:36:58,300 So this says that the ith character of the ciphertext 504 00:36:58,300 --> 00:37:02,500 is going to correspond to the ith character of the plaintext. 505 00:37:02,500 --> 00:37:08,130 That makes sense because we want to always be lining these things up. 506 00:37:08,130 --> 00:37:13,480 So it's going to be the ith character of the plaintext plus k, which is our key-- 507 00:37:13,480 --> 00:37:17,230 that makes sense--and then we have this mod 26. 508 00:37:17,230 --> 00:37:19,860 Remember back when we had the zed 509 00:37:19,860 --> 00:37:24,190 we didn't want to get into the non-letter characters, so we wanted to mod it 510 00:37:24,190 --> 00:37:26,540 and kind of wrap around the alphabet. 511 00:37:26,540 --> 00:37:33,430 After zed you would go to a, b, c, d, until you got to the right number. 512 00:37:33,430 --> 00:37:44,690 So we know that zed + 6 would give us f because after zed comes a, b, c, d, e, f. 513 00:37:44,690 --> 00:37:52,530 So let's remember we know for sure that zed + 6 is going to give us f. 514 00:37:52,530 --> 00:38:03,530 In ASCII values, z is 122 and f is 102. 515 00:38:03,530 --> 00:38:10,570 So we have to find some way of making our Caesar formula give us 102 516 00:38:10,570 --> 00:38:13,590 after taking in 122. 517 00:38:13,590 --> 00:38:19,550 So if we just apply this formula, the ('z' + 6) % 26, that actually gives you 24 518 00:38:19,550 --> 00:38:25,980 because 122 + 6 is 128; 128 % 26 gives you a remainder of 24. 519 00:38:25,980 --> 00:38:29,140 But that doesn't really mean f. That's definitely not 102. 520 00:38:29,140 --> 00:38:33,590 That's also not the 6th letter in the alphabet. 
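Written out, the formula and the naive attempt with raw ASCII values look roughly like this. The symbols $p_i$, $c_i$, and $k$ are assumed names for the ith plaintext character, the ith ciphertext character, and the key; the spec may typeset them differently.

```latex
c_i = (p_i + k) \bmod 26

% Naive attempt, plugging in the ASCII value of z directly:
(122 + 6) \bmod 26 = 128 \bmod 26 = 24
\qquad \text{since } 128 = 4 \cdot 26 + 24
```

The result 24 is neither 102 (the ASCII value of f) nor the position of f in the alphabet, which is why the raw ASCII value cannot be fed into the formula as-is.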
521 00:38:33,590 --> 00:38:41,550 So obviously, we need to have some way of tweaking this a little bit. 522 00:38:42,970 --> 00:38:51,340 In terms of the regular alphabet, we know that z is the 26th letter and f is the 6th. 523 00:38:51,340 --> 00:38:55,460 But we're in computer science, so we're going to index at 0. 524 00:38:55,460 --> 00:39:00,690 So then instead of z being the number 26, we're going to say it's number 25 525 00:39:00,690 --> 00:39:02,630 because a is 0. 526 00:39:02,630 --> 00:39:04,770 So now let's apply this formula. 527 00:39:04,770 --> 00:39:11,710 We have z represented by 25 + 6, which gives you 31. 528 00:39:11,710 --> 00:39:15,790 And 31 mod 26 gives you 5 as a remainder. 529 00:39:15,790 --> 00:39:20,500 That's perfect because we know that f is the 5th letter in the alphabet. 530 00:39:20,500 --> 00:39:26,400 But it still isn't f, right? It still isn't 102. 531 00:39:26,400 --> 00:39:32,730 So then for this pset, a challenge will be trying to find out the relationship 532 00:39:32,730 --> 00:39:36,910 between converting between these ASCII values and the alphabetical index. 533 00:39:36,910 --> 00:39:40,280 Essentially, what you'll want to do, you want to start out with the ASCII values, 534 00:39:40,280 --> 00:39:45,390 but then you want to somehow translate that into an alphabetical index 535 00:39:45,390 --> 00:39:52,610 then calculate what letter it should be--basically, what its alphabetical index is 536 00:39:52,610 --> 00:39:57,660 of the cipher character--then translate that back to the ASCII values. 537 00:39:57,660 --> 00:40:04,870 So if you whip out your ASCII table, then try and find relationships between, say, 102 and 5 538 00:40:04,870 --> 00:40:10,440 or the 122 and 25. 539 00:40:12,140 --> 00:40:15,690 We've gotten our key from the command line arguments, we've gotten the plaintext, 540 00:40:15,690 --> 00:40:17,520 we've enciphered it. 541 00:40:17,520 --> 00:40:19,820 Now all we have left to do is print it. 
542 00:40:19,820 --> 00:40:22,040 We could do this a couple of different ways. 543 00:40:22,040 --> 00:40:24,570 What we could do is actually print as we go along. 544 00:40:24,570 --> 00:40:28,250 As we iterate over the characters in the string, 545 00:40:28,250 --> 00:40:31,660 we could simply just print right then when we calculate it. 546 00:40:31,660 --> 00:40:36,030 Alternatively, you could also store it in an array and have an array of characters 547 00:40:36,030 --> 00:40:39,280 and at the end iterate over that whole array and print it out. 548 00:40:39,280 --> 00:40:40,980 So you have a couple of options for that. 549 00:40:40,980 --> 00:40:47,280 And remember that %c is going to be the placeholder for printing a character. 550 00:40:47,280 --> 00:40:50,420 So there we have Caesar, and now we move on to Vigenere, 551 00:40:50,420 --> 00:40:57,580 which is very similar to Caesar but just slightly more complex. 552 00:40:57,580 --> 00:41:03,310 So essentially, with Vigenere you're going to be passing in a keyword. 553 00:41:03,310 --> 00:41:06,510 So instead of a number, you're going to have a string, 554 00:41:06,510 --> 00:41:09,200 and so that's going to act as your keyword. 555 00:41:09,200 --> 00:41:14,440 Then, as usual, you're going to get a prompt for a string from the user 556 00:41:14,440 --> 00:41:19,050 and then encipher it and then give them the ciphertext back. 557 00:41:19,050 --> 00:41:24,650 So as I said, it's very similar to Caesar, except instead of shifting by a certain number, 558 00:41:24,650 --> 00:41:30,620 the number is actually going to change every time from character to character. 559 00:41:30,620 --> 00:41:34,890 That actual number to shift by is represented by the keyword letters. 560 00:41:34,890 --> 00:41:43,150 So if you enter in a shift of a, for instance, then that would correspond to a shift of 0. 561 00:41:43,150 --> 00:41:45,900 So it's again back to the alphabetical index. 
562 00:41:45,900 --> 00:41:49,100 What might be useful if you're seeing that we're actually dealing with ASCII values 563 00:41:49,100 --> 00:41:51,790 as well as the letters, as well as the alphabetical index, 564 00:41:51,790 --> 00:41:58,020 maybe find or make your own ASCII table that shows the alphabetical index of 0 through 25, 565 00:41:58,020 --> 00:42:03,750 a through z, and the ASCII values so that you can kind of see the relationship 566 00:42:03,750 --> 00:42:07,020 and sketch out and try and find some patterns. 567 00:42:07,020 --> 00:42:11,010 Similarly, if you were shifting at a certain instance by f-- 568 00:42:11,010 --> 00:42:21,110 and this is either lowercase or uppercase f--then that would correspond to 5. 569 00:42:21,110 --> 00:42:24,180 Are we good so far? 570 00:42:25,770 --> 00:42:30,050 The formula for Vigenere is a bit different. 571 00:42:30,050 --> 00:42:32,960 Basically, you see that it's just like Caesar, 572 00:42:32,960 --> 00:42:37,390 except instead of just k we have k index j. 573 00:42:37,390 --> 00:42:44,810 Notice that we're not using i because essentially, the length of the keyword 574 00:42:44,810 --> 00:42:49,850 isn't necessarily the length of our ciphertext. 575 00:42:49,850 --> 00:42:56,130 This will be a bit clearer when we see an example that I have a bit later on. 576 00:42:56,130 --> 00:43:03,160 Basically, if you run your program with a keyword of ohai, 577 00:43:03,160 --> 00:43:08,560 then that means that every time, ohai is going to be your shift. 578 00:43:08,560 --> 00:43:11,060 So depending on what position you are in your keyword, 579 00:43:11,060 --> 00:43:15,800 you're going to shift your certain plaintext character by that amount. 580 00:43:15,800 --> 00:43:19,630 Again, just like Caesar, we want to make sure that we preserve the capitalization of things 581 00:43:19,630 --> 00:43:22,900 and we only encipher letters, not symbols or spaces. 
582 00:43:22,900 --> 00:43:26,330 So look back to Caesar on the functions that you may have used, 583 00:43:26,330 --> 00:43:32,570 the way that you decided how to shift things, and apply that to your program here. 584 00:43:32,570 --> 00:43:35,260 So let's map this out. 585 00:43:35,260 --> 00:43:39,680 We have a plaintext that we've gotten from the user from GetString 586 00:43:39,680 --> 00:43:44,090 saying This... is CS50! 587 00:43:44,090 --> 00:43:47,090 Then we have a keyword of ohai. 588 00:43:47,090 --> 00:43:50,930 The first 4 characters are pretty simple. 589 00:43:50,930 --> 00:43:55,580 We know that T is going to be shifted by o, 590 00:43:55,580 --> 00:44:01,990 then h is going to be shifted by h, i is going to be shifted by a. 591 00:44:01,990 --> 00:44:04,610 Here you see that a represents 0, 592 00:44:04,610 --> 00:44:11,940 so then the end value is actually just the same letter as before. 593 00:44:11,940 --> 00:44:15,250 Then s is shifted by i. 594 00:44:15,250 --> 00:44:19,370 But then you have these periods here. 595 00:44:19,370 --> 00:44:25,960 We don't want to encipher that, so then we don't change it by anything 596 00:44:25,960 --> 00:44:31,280 and just print out the period unchanged. 597 00:44:31,280 --> 00:44:38,020 [student] I don't understand how you know that this is shifted by-- Where did you-- >>Oh, sorry. 598 00:44:38,020 --> 00:44:41,620 At the top here you see that the command line argument ohai here, 599 00:44:41,620 --> 00:44:43,740 that's going to be the keyword. 600 00:44:43,740 --> 00:44:49,550 And so basically, you're cycling over the characters in the keyword. 601 00:44:49,550 --> 00:44:52,020 [student] So o is going to be shifting the same-- 602 00:44:52,020 --> 00:44:56,260 So o corresponds to a certain number in the alphabet. 603 00:44:56,260 --> 00:44:58,400 [student] Right. But where did you get the CS50 part from? 604 00:44:58,400 --> 00:45:02,540 Oh. 
That's in GetString where you're like, "Give me a string to encode." 605 00:45:02,540 --> 00:45:07,510 [student] They're going to give you that argument to shift by 606 00:45:07,510 --> 00:45:09,380 and then you'll ask for your first string. >>Yeah. 607 00:45:09,380 --> 00:45:12,440 So when they run the program, they're going to include the keyword 608 00:45:12,440 --> 00:45:14,740 in their command line arguments when they run it. 609 00:45:14,740 --> 00:45:19,740 Then once you've checked that they've actually given you 1 and not more, not less, 610 00:45:19,740 --> 00:45:23,750 then you're going to prompt them for a string, say, "Give me a string." 611 00:45:23,750 --> 00:45:27,630 So that's where in this case they've given you This... is CS50! 612 00:45:27,630 --> 00:45:32,090 So then you're going to use that and use ohai and iterate over. 613 00:45:32,090 --> 00:45:38,200 Notice that here we skipped over encrypting the periods, 614 00:45:38,200 --> 00:45:51,660 but in terms of our position for ohai, the next one we used o. 615 00:45:51,660 --> 00:45:54,990 In this case it's a bit harder to see because that's 4, 616 00:45:54,990 --> 00:45:57,710 so let's continue a bit. Just stick with me here. 617 00:45:57,710 --> 00:46:02,960 Then we have i and s, which are then translated by o and h respectively. 618 00:46:02,960 --> 00:46:09,370 Then we have a space, and so then we know that we aren't going to encipher the spaces. 619 00:46:09,370 --> 00:46:18,930 But notice that instead of going to a in this spot right here, 620 00:46:18,930 --> 00:46:28,330 we're encrypting by a--I don't know if you can see that--right here. 621 00:46:28,330 --> 00:46:33,710 So it's not like you actually predetermined, say, o goes here, h goes here, 622 00:46:33,710 --> 00:46:39,200 a goes here, i goes here, o, h, a, i, o, h, a, i. You don't do that. 
623 00:46:39,200 --> 00:46:43,760 You only shift your position in the keyword 624 00:46:43,760 --> 00:46:51,020 when you know that you're actually going to be encrypting an actual letter. 625 00:46:51,020 --> 00:46:53,920 Does that kind of make sense? 626 00:46:53,920 --> 00:46:55,800 Okay. 627 00:46:56,490 --> 00:46:58,500 So just some reminders. 628 00:46:58,500 --> 00:47:03,760 You want to make sure that you only advance to the next letter in your keyword 629 00:47:03,760 --> 00:47:06,390 if the character in your plaintext is a letter. 630 00:47:06,390 --> 00:47:09,120 So say we're at the o. 631 00:47:09,120 --> 00:47:19,310 We notice that the next character, the i index of the plaintext, is a number, for instance. 632 00:47:19,310 --> 00:47:31,630 Then we don't advance j, the index for our keyword, until we reach another letter. 633 00:47:31,630 --> 00:47:36,230 Again, you also want to make sure that you wraparound to the beginning of the keyword 634 00:47:36,230 --> 00:47:37,770 when you're at the end of it. 635 00:47:37,770 --> 00:47:42,030 If you see here we're at i, the next one has to be o. 636 00:47:42,030 --> 00:47:47,690 So you want to find some way of being able to wraparound to the beginning of your keyword 637 00:47:47,690 --> 00:47:49,470 every time that you reach the end. 638 00:47:49,470 --> 00:47:55,040 And so again, what kind of operator is useful in that case for wrapping around? 639 00:47:56,630 --> 00:47:59,840 Like in the counting off example. 640 00:47:59,840 --> 00:48:03,710 [student] The percent sign. >>Yeah, the percent sign, which is modulo. 641 00:48:03,710 --> 00:48:11,250 So modulo will come in handy here when you want to wrap over the index in your ohai. 
642 00:48:11,250 --> 00:48:17,700 And just a quick hint: Try to think of wrapping over the keyword a bit like the counting off, 643 00:48:17,700 --> 00:48:23,590 where if there's 3 groups, the 4th person, 644 00:48:23,590 --> 00:48:30,610 their number that they said was 4 mod 3, which was 1. 645 00:48:30,610 --> 00:48:32,880 So try and think of it that way. 646 00:48:34,770 --> 00:48:42,740 As you saw in the formula, wherever you have ci and then pi but then kj, 647 00:48:42,740 --> 00:48:44,700 you want to make sure that you keep track of those. 648 00:48:44,700 --> 00:48:47,580 You don't need to call it i, you don't need to call it j, 649 00:48:47,580 --> 00:48:53,270 but you want to make sure that you keep track of the position that you're at in your plaintext 650 00:48:53,270 --> 00:48:55,790 as well as the position that you're at in your keyword 651 00:48:55,790 --> 00:48:59,840 because those aren't necessarily going to be the same. 652 00:48:59,840 --> 00:49:06,400 The keyword could be a completely different length than your plaintext. 653 00:49:06,400 --> 00:49:09,140 Also, your plaintext may contain numbers and symbols, 654 00:49:09,140 --> 00:49:14,450 so it's not going to perfectly match up together. Yes. 655 00:49:14,450 --> 00:49:19,280 [student] Is there a function to change case? 656 00:49:19,280 --> 00:49:24,530 Can you change a to capital A? >>Yeah, there definitely is. 657 00:49:24,530 --> 00:49:27,890 You can check out--I believe it's toupper, all 1 word. 658 00:49:30,650 --> 00:49:36,310 But when you're trying to encipher things and preserve the case, 659 00:49:36,310 --> 00:49:39,350 it's best basically to have separate cases. 
660 00:49:39,350 --> 00:49:42,040 If it's an uppercase, then you want to shift by this 661 00:49:42,040 --> 00:49:46,460 because in your formula, when you look back how we have to kind of go 662 00:49:46,460 --> 00:49:50,900 interchangeably between the ASCII way of representing the numbers 663 00:49:50,900 --> 00:49:55,020 and the actual alphabetical index, we want to make sure 664 00:49:55,020 --> 00:50:01,850 there's going to be some kind of pattern that you're going to use. 665 00:50:01,850 --> 00:50:04,580 Another note on the pattern, actually. 666 00:50:04,580 --> 00:50:07,250 You're going to definitely be dealing with numbers. 667 00:50:07,250 --> 00:50:11,280 Try not to use magic numbers, which is an example of style. 668 00:50:11,280 --> 00:50:18,470 So say you want to every time shift something by like-- 669 00:50:18,470 --> 00:50:22,400 Okay, so hint, another spoiler is when you're going to be shifting something 670 00:50:22,400 --> 00:50:26,310 by a certain amount, try not to represent that by an actual number 671 00:50:26,310 --> 00:50:32,810 but rather try and see if you can use the ASCII value, which will kind of make more sense. 672 00:50:32,810 --> 00:50:35,470 Another note: Because we're dealing with formulas, 673 00:50:35,470 --> 00:50:41,200 even though your TF will kind of know what pattern you might be using, 674 00:50:41,200 --> 00:50:44,430 best to in your comments kind of explain the logic, like, 675 00:50:44,430 --> 00:50:51,880 "I'm using this pattern because..." and kind of explain the pattern succinctly in your comments. 676 00:50:54,090 --> 00:50:58,990 [this was walkthrough 2] If there aren't any other questions, then I'll just stay here for a little bit. 677 00:50:58,990 --> 00:51:04,370 Good luck with your pset 2: Crypto and thanks for coming. 678 00:51:06,070 --> 00:51:08,620 [student] Thank you. >>Thanks. 679 00:51:09,220 --> 00:51:10,800 [Media Offline intro]