[Walkthrough - Problem Set 2] [Zamyla Chan - Harvard University] [This is CS50. CS50.TV] All right. Hello, everyone, and welcome to Walkthrough 2. First, I want to congratulate you for finishing pset 1. I know that it could have been a bit tough for some of you, could have been your first computer program that you wrote, but just remember that at the end of this, when you look back at the end of the semester, you'll look at pset 1 and you'll say, "Hey, I could have done that in 5 minutes." So know and trust that at the end of this you'll definitely find pset 1 quite simple. But for now it's a huge accomplishment, and congratulations for getting done. Now, also a quick note before we get into the meat of the walkthrough. I just want to make a quick note that I sometimes won't have enough time during the walkthroughs to go through every single way of doing the problem set and rather just maybe focus on 1 or 2 kind of implementations,  ways that you could do this. But that isn't to say that you are forbidden from doing it another way. There are often, as with computer science, numerous ways of doing things, and so definitely feel free to use a different type of solution than I may have presented. [pset 2: Crypto - Zamyla Chan - zamyla@cs50.net] [pset2 - 0. A Section of Questions - 1. Caesar - 2. Vigenere] All right. So problem set 2: Crypto is a fun one. Again, with every pset you'll start with a section of questions that's going to be conducted in your sections with your assigned teaching fellow. We aren't going to go through these over the walkthrough, but they definitely will help you complete the pset. So the first part of the problem set is Caesar. And so in Caesar someone will pass you a key with an integer, and you will encrypt a string of text that they provide you and give them back an encrypted thing. If anyone watched A Christmas Story, there's an example of that there. Then the second part of the problem set is Vigenere, which is a more advanced encryption technique. And so we're going to encipher a piece of text, except instead with just a single integer, we're actually going to encode it with a keyword that the user will provide us. Okay, so the first tool in the toolbox today is actually going to be updating the appliance. On the discussion board we would see things like, "Why doesn't this work?" "Why doesn't Submit 50 work?"  and often the solution is actually just to update your appliance. And so if you just run in a terminal window in your appliance sudo yum -y-- that's a flag saying yes, update everything--update, then your appliance will update if need be. And it doesn't hurt if you already are at the most recent version of the appliance. Then it will just say no new updates available and you can continue working along. But this is good to execute even every time that you open the appliance because we're still very much-- sometimes if we come into a bug--fixing it in the appliance. So make sure that you have the most recent version of the appliance and run that update there. All right. So since we're dealing with letters and changing, enciphering things, we're going to really want to become best friends with our ASCII chart. There are numerous ones online, if you find. Maybe even make your own. Basically, with every letter and every number and every character there is a number associated with them, and so it's good to see their ASCII values alongside the actual letter. That will definitely help you in the problem set. One thing that really helped me in this problem set was to actually print it out, and as I was going through, I would actually draw on it, write, "If this has to go to there, then..." Kind of draw on it and mark it up, become best friends with your ASCII table. Then we have a few other tools at our disposal. This time instead of actually prompting the user for all of their input we're going to do a combination. We're going to prompt them for some input,  but we're also going to just use the command line arguments. So when they run their program, usually you say ./hello, for instance, if your program was hello.c. But this time instead of just saying that, they can put words, arguments afterwards. And so we're going to use whatever they pass in to us as their input as well, so moving beyond just prompting for integer but also using command line arguments. And then we'll go into arrays and strings, which we'll be using a lot as well. Here's just an example of 1 mini ASCII chart. As I said, every letter corresponds to a number, and so familiarize yourself with that. It will come in handy. And later when we start doing some ASCIIMath dealing with the numbers-- adding, subtracting them--then definitely good to refer to this chart. So here's an example of a Caesar cipher--something that you may have played with. It is just a wheel. Essentially, there is an outer alphabet and then there is an inner alphabet. So right here is an example of the Caesar cipher but with a key of 0. Essentially, A is aligned with A, B is aligned with B, all the way up to Z. But then say we wanted a key of 3, for instance. Then we would rotate the inner wheel so that A now aligns with D, etc. And so this is essentially what we're going to do. We don't have a wheel, but what we're going to do is make our program kind of shift the alphabet along with us a certain amount of numbers. So as I said before, we're going to be dealing with command line arguments as well as getting an integer. So the way that a user will run your Caesar program is by saying ./caesar and then entering a number after that. And that number represents the key, the shift, how many times you're going to be rotating the inner wheel of your Caesar cipher. And so you see here an example. If we entered the letters from A to L in our Caesar cipher, then it would input D through O because that's every letter shifted over 3 times, just like the example of the wheel that I showed you. So then if you entered, for instance, This is CS50! then it would also move all of the letters. And that's an important thing in both Caesar and Vigenere is that we're going to skip over any non-letters. So any spaces, characters, etc, numbers, we're going to keep them the same. We're only going to shift the letters in this case. So as you see in the wheel, we only have the letters available to us, so we only want to shift the letters and encrypt the letters. So the first thing to do, you saw that the usage for Caesar in problem set 2 is to run Caesar and then enter a number when you run it in the terminal. So what we need to do is to somehow get that key and access it. And so we want to somehow see it's going to be the second command line argument. The first one is going to be ./caesar, and the next one is going to be the key number. So before we had int main (void) to start our C programs. We're going to peel back a layer a little bit  and actually see that instead of passing in void to our main function we're actually dealing with 2 parameters. We have an int named argc and then an array of strings called argv. So argc is an integer,  and it represents the number of arguments passed in to your program. And then argv is actually the list of the arguments passed. All of the arguments are strings, and so argv represents an array, a list, of strings. Let's talk about arrays a little bit. Arrays are essentially a new data structure. We have ints, we have doubles, we have strings, and now we have arrays. Arrays are data structures that can hold multiple values of the same type, so essentially, a list of whatever type you want. Essentially, if you wanted a list of integers all in 1 variable, then you would create a new variable that was of type int array. So arrays are zero-indexed, meaning that the first element of the array is at index 0. If the array is of length 4, as in this example, then your last element would be at index 3, which is 4 - 1. So to create array, you would do something like this. Say you wanted a double array. This goes for any type of data type, though. So say you want a double array. Say you want to call it mailbox. Just like you would initialize any other double,  you would say double and then the name, but this time we put the square brackets, and then the number there will be the length of the array. Note that in arrays we can't ever change the length, so you always have to define and choose how many boxes,  how many values your array is going to hold. So to set different values in your array, you're going to use this following syntax, as you see on the slide. You have mailbox index 0 will be set to 1.2, mailbox index 1 set to 2.4, etc. So now that we've reviewed arrays a bit, let's go back to argc and argv. We know that argv is now an array of strings. So when a user passes in--say they're running a program-- they say ./hello David Malan, what the program will do for you already is actually come up with what argc and argv are. So you don't need to worry about that. Argc in this case would be 3 because it sees 3 distinct words separated by spaces. And so then the array in this instance, the first index would be ./hello, the next one David, the next one Malan. Does anyone see right away what the relationship between argv,  the array, and argc is? Yeah. We'll get into that in an example in args.c. Let's see if we can take advantage of the relationship between the 2. Here you might find that in the appliance the default application  to open .c files is sometimes Emacs.   But we want to deal with gedit, so what you can do is you can right click on your C file, go to Properties, Open With, and then choose gedit, Set as default, and now your program should open in gedit instead of Emacs. Perfect. So here I have a program that I want to print out each command line argument. So whatever the user inputs, I want to essentially return it back to them on a new line. So what's a structure that we can use to iterate over something-- something that you probably used in your pset 1? If you want to go through a set number of things? >>[student] For loop. For loop. Exactly. So let's start with the for loop. We have for int i = 0. Let's just start with a standard initialization variable. I'm going to leave the condition for a set and then say i++, going to do things there. All right. So thinking back to argv, if argv is the list of arguments passed in to the program and argc is the number of arguments in the program, then that means that argc is essentially the length of argv, right, because there are going to be as many arguments as the value of argc. So if we want to iterate over each element in argv, we're going to want to each time access the variable in argv at the given index. That can be represented with this, right? This variable here represents the particular string in this instance  because it's a string array--the particular string at that given index. What we want to do, in this case we want to print it out, so let's say printf. And now argv is a string, so we want to put that placeholder there. We want a new line just to make it look good. So here we have a for loop. We don't have the condition yet. So i starts at 0, and then every time it's going to print the given string at that particular index in the array. So when do we want to stop printing out elements in the array? When we've finished, right? When we've reached the end of the array. So we don't want to exceed past the length of the array, and we already know we don't need to actually actively find out what the length of argv is because it's given to us, and what's that? Argc. Exactly. So we want to do this process argc number of times. I'm not in the right directory. All right. Now let's make args. No errors, which is great. So let's just run args. What is this going to return to us? It's just going to print it back. "You inputted args into the program; I'm going to give it back to you." So let's say we want to say args then foo bar. So then it prints it out back to us. All right? So there is an example of how you can use argc and argv knowing that argc represents the length of argv. Make sure that you do not ever with arrays access one beyond the length of the array because C will definitely shout at you. You'll get something called a segmentation fault, which is never fun, basically saying you're trying to access something that doesn't exist, doesn't belong to you. So make sure, and especially with the zero-indexing, we don't want to-- Like for instance, if we have an array of length 4,  that array index 4 doesn't exist because we start at 0, at zero index. It will become second nature just like for loops when we start at 0. So just keep that in mind.  You don't want to ever access the index of an array that's beyond your reach. So we can see now how we can kind of access  the command line arguments that are passed in. But as you saw the string, the argv is actually a string array. So it's actually not an integer yet, but in Caesar we want to deal with integers. Luckily, there is a function created for us that can actually convert a string to an integer. Also in here we aren't dealing with user input where we're prompting them  for input here for the key, so we can't actually reprompt and say, "Oh, give me another integer, say, if it's not valid." But we do still need to check for correct usage. In Caesar they are only allowed to pass in 1 number, and so they have to run ./caesar and then they have to give you a number. So argc has to be a certain number.  What number would that be if they have to pass you the ./caesar and then a key? What is argc? >>[student] 2. >>Two. Exactly. So you want to make sure that argc is 2. Otherwise you basically refuse to run the program. In main it's a function that says int main, so then we always in good practice return 0 at the end of a successful program. So if, say, they give you 3 command line arguments instead of 2 or give you 1, for instance, then what you'll do is you'll want to check for that and then return 1 saying, no, I can't proceed with this program. [student] There can't be a space in your text. >>Pardon me? [student] There can't be a space in the text you're trying to encrypt. Ah!  In terms of the text that we're trying to encrypt, that actually comes later when we give that text. So right now we're just accepting as command arguments the actual number, the actual shift for the Caesar encryption. [student] Why do you need 2 as opposed to just 1 argc? There's definitely 1 number. Right. The reason why we need 2 for argc instead of 1 is because when you run a program and say ./caesar or ./hello, that actually counts as a command line argument. So then that already takes up 1 and so then we're inputting 1 extra. So you're inputting actually a string in the command line argument. What you want to do, for Caesar we want to deal with an integer, so you can use this atoi function. And basically, you pass it in a string and then it will return you back an integer if it's possible to make that string into an integer. Now remember when we're dealing with printf or GetString, things like that, we include the libraries that are specific to us. So at the beginning we start with a hash tag standard I/O, .h, something like that. Well, atoi isn't within one of those libraries, so what we have to do is we have to include the right library for that. So recall back to Walkthrough 1 where I discussed the manual function. You type man in your terminal and then followed by the name of a function. And so that will bring up a whole list of its usage, but as well it will bring up which library that belongs to. So I'll leave that to you to use the manual function with atoi and figure out which library you need to include to be able to use the atoi function. So we've got the key and now it comes to getting the plain text, and so that actually is going to be user input where you prompt. We dealt with GetInt and GetFloat, and so in the same vein   we're going to be dealing with GetString. But in this case we don't need to do any do while or while loops to check. GetString will definitely give us a string,  and we're going to encrypt whatever the user gives us. So you can assume that all of these user inputted strings are correct. Great. So then once you've got the key and once you've got the text, now what's left is you have to encipher the plaintext. Just to quickly cover over lingo, the plaintext is what the user gives you, and the ciphertext is what you return to them. So strings, to be able to go through actually letter by letter because we have to shift every letter, we understand that strings, if we kind of peel back the layer, we see that they're just really a list of characters. One comes after the other. And so we can treat strings as arrays because they are arrays of characters. So say you have a string named text,  and within that variable text is stored This is CS50. Then text at index 0 would be a capital T, index 1 would be h, etc. And then with arrays, in the argc example in args.c, we saw that we had to iterate over an array and so we had to iterate from i = 0 up until i is less than the length. So we need some way of figuring out what the length of our string is if we're going to iterate over it. Luckily again, there is a function there for us, although later on in CS50 you'll definitely be able to implement and make your own function that can calculate the length of a string. But for now we're going to use string length, so strlen. You pass in a string, and then it will return you an int that represents the length of your string. Let's look at an example of how we might be able to iterate over each character in a string and do something with that. What we want to do is iterate over each character of the string, and what we want to do is we print back each character 1 by 1 except we add something next to it. So let's start with the for loop. Int i = 0. We're going to leave space for the condition. We want to iterate until we reach the end of the string, right? So then what function gives us the length of the string? [inaudible student response] That's the length of the command line arguments. But for a string we want to use a function that gives us the length of the string. So that's string length. And so then you have to pass in a string to it. It needs to know what string it needs to calculate the length of. So then in this case we're dealing with string s. Great.  So then what we want to do, let's printf. Now, we want to deal with characters. We want to print out each individual character. When you want it to print out a float, you would use the placeholder like %f. With an int you would use %d. And so similarly, with a character you use %c to say I'm going to be printing a character that's stored inside a variable. So we have this, and let's add a period and a space to it. Which character are we using?  We're going to be using whatever character we're at of the string. So then we're going to be using something with string, but we want to be accessing the certain character there. So if a string is just an array, then how do we access elements of arrays? We have those square brackets, and then we put the index in there. So we have square brackets. Our index in this case we can just use i. Exactly. So here we're saying we're going to be printing a character followed by a dot and a space, and that character is going to be the ith letter in our string s. I'm just going to save that. Okay. Now I'm going to run string length. So we had a string called OMG, and now it's emphasized even more. Similarly, let's say we actually want to get a string from the user. How might we do this? Before, how did we get an int? We said GetInt, right? But this isn't int, so let's GetString. Let's make string length. Here we didn't enter a specific prompt. So I don't know.  I'm going to put my name in here and so then I can do one of those things where I assign a word for every letter or something like that. Cool. So that's string length. So we're back to Caesar.  We have a few tools on how we iterate over a string,  how we access each individual element. So now we can get back to the program. As I mentioned before, in the ASCII table, your best friend, you're going to see the numbers that are associated with every letter. So here say our plaintext is I'm dizzy! Then each of these characters is going to have a number and ASCII value associated with it, even the apostrophe, even the space, even the exclamation mark, so you'll want to keep that in mind. So say our key that the user included in their command line argument is 6. That means for the first letter, which is I, which is represented by 73, you want to return to them whatever letter is represented by the ASCII value of 73 + 6. In this case that would be 79. Now we want to go to the next character. So the next in index 1 of the plaintext would be the apostrophe. But remember we only want to encipher the letters. So we want to make sure that the apostrophe actually stays the same, that we don't change from 39 to whatever 45 is. We want to keep it as an apostrophe. So we want to remember to only encipher the letters because we want all of the other symbols to remain unchanged in our program. Another thing that we want is to preserve capitalization. So when you have an uppercase letter, it should stay as an uppercase. Lowercases should stay as lowercase. So some useful functions to be able to deal with only enciphering letters and keep preserving the capitalization of things  is the isalpha, isupper, islower functions. And so these are functions that return you a Boolean value. Basically, true or false. Is this an uppercase? Is this alphanumeric?  Is this a letter, essentially. So here are 3 examples of how you would use that function. Basically, you could test whether the value returned to you by that function is true or false based on that input. Either don't encipher something or cipher it or make sure that it's uppercase, etc. [student] Can you just explain those a little more and how you use them? >>Yeah, for sure. So if we look back, here we have a capital I, right? So we know that I goes to O because I + 6 is O. But we want to make sure that that O is going to be a capital O. So basically, that is kind of going to change our input. So whether it's uppercase or not will kind of change the way that we deal with it. So then if we use the isupper function on that particular index, so isupper("I"), that returns for us true, so we know that it's upper. So then based on that, later we'll go into a formula  that you'll be using to shift things in Caesar, so then basically, there's going to be a slightly different formula if it's uppercase as opposed to lowercase. Make sense? Yeah. No worries. I talked a bit about adding 6 to a letter, which doesn't quite make sense except when we kind of understand that these characters  are kind of interchangeable with integers. What we do is we kind of use implicit casting. We'll go into casting a bit later on where you take a value and you turn it into a different type than it originally was. But with this pset we'll be able to kind of interchangeably use characters and their corresponding integer values. So if you simply encase a character with just the single quotes, then you'll be able to work with it with integers, dealing with it as an integer. So the capital C relates to 67. Lowercase f relates to 102. Again, if you want to know these values, look at your ASCII table. So let's go into some examples of how you might be able to subtract and add, how you can actually really work with these characters, use them interchangeably. I say that ASCIIMath is going to calculate the addition of a character to an integer and then displays the resultant character as well as the resultant ASCII value. And so here I'm saying--we'll deal with this part later-- but basically, I'm saying that the user should say run ASCIIMath along with a key, and I'm saying that that key is going to be the number with which we're going to add this character. So here notice that since I'm demanding a key, since I'm demanding that they're giving me 1 thing,  I only want to accept ./asciimath and a key. So I'm going to demand that argc is equal to 2. If it's not, then I'm going to return 1 and the program will exit. So I'm saying the key isn't going to be the first command line argument, it's going to be the second one, and as you see here, I'm going to turn that into an integer. Then I'm going to set a character to be r. Notice that the type of the variable chr is actually an integer. The way that I'm able to use r as an integer is by encasing it with these single quotes. So back to our printf statement where we have a placeholder for a character and then a placeholder for an integer, the character is represented by the chr, and the integer is the key. And so then we're going to in result add the 2 together. So we're going to add r + whatever the key is, and then we're going to print the result of that. So let's make asciimath. It's up to date, so let's just run asciimath. Oh, but see, it doesn't do anything because we didn't actually give it a key. So when it just returned 1, our main function, it just returned back to us. So then let's pass in a key. Someone give me a number. >>[student] 4. 4. Okay.  So r increased by 4 is going to give us v, which corresponds to the ASCII value of 118. So then it kind of makes sense that-- Actually, can I ask you, what do you think the ASCII value of r is if r + 4 is 118? Then yeah, r is 114. So if you look on the ASCII table then, sure enough, you'll see that r is represented by 114. So now that we know that we can add integers to characters, this seems pretty simple. We're just going to iterate over a string like we saw in an example before. We'll check if it's a letter. If it is, then we'll shift it by whatever the key is. Pretty simple, except when you get to like this, you see that z, represented by 122, then would give you a different character. We actually want to stay within our alphabet, right? So we need to figure out some way of kind of wrapping around. When you reach zed and you want to increase by a certain number, you don't want to go into beyond the ASCII alphabet section; you want to wrap back all the way to A. But keep in mind you're still preserving the case. So knowing that letters can't become symbols just like symbols aren't going to be changing as well. In the last pset you definitely didn't need to, but an option was to implement your greedy pset by using the modulus function. But now we're actually going to need to use modulus, so let's just go over this a little bit. Essentially, when you have x modulo y, that gives you the remainder of x divided by y. Here are some examples here. We have 27 % 15. Basically, when you subtract 15 from 27 as many times as possible without getting negative then you get 12 left over. So that's kind of like in the math context, but how can we actually use this? It's going to be useful for our wrapover. For this, let's just say I asked you all to divide into 3 groups. Sometimes you do this in groups and something like that. Say I said, "Okay, I want you all to be divided into 3." How might you do that? [inaudible student response] Yeah, exactly. Count off. Okay. Let's actually do that. Do you want to start? [students counting off] 1, 2, 3, 4. But remember... >>[student] Oh, sorry. That's a really good point. You said 4, but we actually want you to say 1 because we only want 3 groups. So then, how-- No, that's a really good example because then how might you say 1? What's the relationship between 4 and 1? Well, 4 mod 3 is 1. So if you continue, you would be 2. So we have 1, 2, 3, 1, 2. Again, you're actually the 5th person. How do you know to say 2 instead of 5? You say 5 mod 3 is 2. I want to see how many groups of 3 are left over, then which order am I. And so then if we continued along the whole room,  then we would see that we're always actually applying the mod function to ourselves to kind of count off. That's a more kind of tangible example of how you might use modulo because I'm sure most of us have probably gone through that process where we've had to count off. Any questions on modulo?  It will be pretty important to understand the concepts of this,  so I want to make sure you guys understand. [student] If there is no remainder, does it give you the actual number? If one of the first 3 of them had done it, would it have given them what they actually were, or would it have given them [inaudible] >>That's a good question. When there is no remainder for the modulo--so say you have 6 mod 3-- that actually gives you back 0. We'll talk about that a bit later. Oh yeah, for instance, the 3rd person--3 mod 3 is actually 0 but she said 3. So that's kind of like an inner catch, for instance, like okay, if the mod is 0 then I'm going to be the 3rd person. But we'll get into kind of how we might want to deal with what 0 is later. So now we somehow have a way of mapping the zed to the right letter. So now we've gone through these examples, we kind of see how Caesar might work. You see the 2 alphabets and then you see them shifting. So let's try and express that in terms of formula. This formula is actually given to you in the spec, but let's kind of look through what each variable means. Our end result is going to be the ciphertext. So this says that the ith character of the ciphertext is going to correspond to the ith character of the plaintext.  That makes sense because we want to always be lining these things up. So it's going to be the ith character of the ciphertext plus k, which is our key-- that makes sense--and then we have this mod 26. Remember back when we had the zed we didn't want to get into the character, so we wanted to mod it and kind of wrap around the alphabet. After zed you would go to a, b, c, d, until you got to the right number. So we know that zed, if + 6, would give us f because after zed comes a, b, c, d, e, f. So let's remember we know for sure that zed + 6 is going to give us f. In ASCII values, z is 122 and f is 102. So we have to find some way of making our Caesar formula give us 102 after taking in 122. So if we just apply this formula, the ('z' + 6) % 26, that actually gives you 24 because 122 + 6 is 128; 128 % 26 gives you 24 remainder. But that doesn't really mean f. That's definitely not 102. That's also not the 6th letter in the alphabet. So obviously, we need to have some way of tweaking this a little bit. In terms of the regular alphabet, we know that z is the 26th letter and f is the 6th. But we're in computer science, so we're going to index at 0. So then instead of z being the number 26, we're going to say it's number 25 because a is 0. So now let's apply this formula. We have z represented by 25 + 6, which gives you 31. And 31 mod 26 gives you 5 as a remainder. That's perfect because we know that f is the 5th letter in the alphabet. But it still isn't f, right? It still isn't 102. So then for this pset, a challenge will be trying to find out the relationship between converting between these ASCII values and the alphabetical index. Essentially, what you'll want to do, you want to start out with the ASCII values, but then you want to somehow translate that into an alphabetical index then calculate what letter it should be--basically, what its alphabetical index is of the cipher character--then translate that back to the ASCII values. So if you whip out your ASCII table, then try and find relationships between, say, 102 and 5 or the 122 and 25. We've gotten our key from the command line arguments, we've gotten the plaintext, we've enciphered it. Now all we have left to do is print it. We could do this a couple of different ways. What we could do is actually print as we go along. As we iterate over the characters in the string, we could simply just print right then when we calculate it. Alternatively, you could also store it in an array and have an array of characters and at the end iterate over that whole array and print it out. So you have a couple of options for that. And remember that %c is going to be the placeholder for printing a character. So there we have Caesar, and now we move on to Vigenere, which is very similar to Caesar but just slightly more complex. So essentially with Vigenere is you're going to be passing in a keyword. So instead of a number, you're going to have a string, and so that's going to act as your keyword. Then, as usual, you're going to get a prompt for a string from the user and then encipher it and then give them the ciphertext back. So as I said, it's very similar to Caesar, except instead of shifting by a certain number, the number is actually going to change every time from character to character. To represent that actual number to shift, it's represented by the keyboard letters. So if you enter in a shift of a, for instance, then that would correspond to a shift of 0. So it's again back to the alphabetical index. What might be useful if you're seeing that we're actually dealing with ASCII values as well as the letters, as well as the alphabetical index, maybe find or make your own ASCII table that shows the alphabetical index of 0 through 25, a through z, and the ASCII values so that you can kind of see the relationship and sketch out and try and find some patterns. Similarly, if you were shifting at the certain instance by f-- and this is either lowercase or uppercase f--then that would correspond to 5. Are we good so far? The formula for Vigenere is a bit different. Basically, you see that it's just like Caesar, except instead of just k we have k index j. Notice that we're not using i because essentially, the length of the keyword isn't necessarily the length of our ciphertext. This will be a bit clearer when we see an example that I have a bit later on. Basically, if you run your program with a keyword of ohai, then that means that every time, ohai is going to be your shift. So depending on what position you are in your keyword, you're going to shift your certain ciphertext character by that amount. Again, just like Caesar, we want to make sure that we preserve the capitalization of things and we only encipher letters, not characters or spaces. So look back to Caesar on the functions that you may have used, the way that you decided how to shift things, and apply that to your program here. So let's map this out. We have a plaintext that we've gotten from the user from GetString saying This... is CS50! Then we have a keyword of ohai. The first 4 characters are pretty simple. We know that T is going to be shifted by o, then h is going to be shifted by h, i is going to be shifted by a. Here you see that a represents 0,  so then the end value is actually just the same letter as before. Then s is shifted by i. But then you have these periods here. We don't want to encipher that, so then we don't change it by anything and just print out the period unchanged. [student] I don't understand how you know that this is shifted by-- Where did you-- >>Oh, sorry. At the top here you see that the command line argument ohai here, that's going to be the keyword. And so basically, you're cycling over the characters in the keyword. [student] So o is going to be shifting the same-- So o corresponds to a certain number in the alphabet. [student] Right. But where did you get the CS50 part from? Oh. That's in GetString where you're like, "Give me a string to encode." [student] They're going to give you that argument to shift by and then you'll ask for your first string. >>Yeah. So when they run the program, they're going to include the keyword in their command line arguments when they run it. Then once you've checked that they've actually given you 1 and not more, not less, then you're going to prompt them for a string, say, "Give me a string." So that's where in this case they've given you This... is CS50! So then you're going to use that and use ohai and iterate over. Notice that here we skipped over encrypting the periods, but in terms of our position for ohai, the next one we used o. In this case it's a bit harder to see because that's 4, so let's continue a bit. Just stick with me here. Then we have i and s, which are then translated by o and h respectively. Then we have a space, and so then we know that we aren't going to encipher the spaces. But notice that instead of going to a in this spot right here, we're encrypting by a--I don't know if you can see that--right here. So it's not like you actually predetermined, say, o goes here, h goes here, a goes here, i goes here, o, h, a, i, o, h, a, i. You don't do that. You only shift your position in the keyword  when you know that you're actually going to be encrypting an actual letter. Does that kind of make sense? Okay. So just some reminders. You want to make sure that you only advance to the next letter in your keyword if the character in your plaintext is a letter. So say we're at the o. We notice that the next character, the i index of the plaintext, is a number, for instance. Then we don't advance j, the index for our keyword, until we reach another letter. Again, you also want to make sure that you wraparound to the beginning of the keyword when you're at the end of it. If you see here we're at i, the next one has to be o. So you want to find some way of being able to wraparound to the beginning of your keyword every time that you reach the end. And so again, what kind of operator is useful in that case for wrapping around? Like in the counting off example. [student] The percent sign. >>Yeah, the percent sign, which is modulo. So modulo will come in handy here when you want to wrap over the index in your ohai. And just a quick hint: Try to think of wrapping over the keyword a bit like the counting off, where if there's 3 groups, the 4th person,  their number that they said was 4 mod 3, which was 1. So try and think of it that way. As you saw in the formula, wherever you have ci and then pi but then kj, you want to make sure that you keep track of those. You don't need to call it i, you don't need to call it j, but you want to make sure that you keep track of the position that you're at in your plaintext as well as the position that you're at in your keyword because those aren't necessarily going to be the same. Not only does the keyword--it could be a completely different length than your plaintext. Also, your plaintext, there are numbers and characters, so it's not going to perfectly match up together. Yes. [student] Is there a function to change case? Can you change a to capital A? >>Yeah, there definitely is. You can check out--I believe it's toupper, all 1 word. But when you're trying to cipher things and preserve the text, it's best basically to have separate cases. If it's an uppercase, then you want to shift by this because in your formula, when you look back how we have to kind of go  interchangeably between the ASCII way of representing the numbers and the actual alphabetical index, we want to make sure  there's going to be some kind of pattern that you're going to use. Another note on the pattern, actually. You're going to definitely be dealing with numbers. Try not to use magic numbers, which is an example of style. So say you want to every time shift something by like-- Okay, so hint, another spoiler is when you're going to be shifting something  by a certain amount, try not to represent that by an actual number but rather try and see if you can use the ASCII value, which will kind of make more sense. Another note: Because we're dealing with formulas,  even though your TF will kind of know what pattern you might be using, best to in your comments kind of explain the logic, like, "I'm using this pattern because..." and kind of explain the pattern succinctly in your comments. [this was walkthrough 2] If there aren't any other questions, then I'll just stay here for a little bit. Good luck with your pset 2: Crypto and thanks for coming. [student] Thank you. >>Thanks. [Media Offline intro]