[ Silence ] [ Inaudible Discussions ] >> Alright, welcome back, this is CS50 and this is the start of week 3. So in addition to this bit of geek humor today, I thought I'd share an e-mail that someone not even taking CS50, but nonetheless an undergrad, sent to me the other day that I thought you'd enjoy. [ Noise ] [ Laughter ] >> Okay. Anyhow, I don't normally share your e-mails and senders publicly, but this one was good. So just a handful of rapid-fire announcements. One, if you still have a Scratch board from the Hacker Edition of problem set 0, please sometime this week drop it off with one of the teaching fellows or CAs off in the corner. Sections were announced this weekend and are already in progress as of yesterday. Realize that if you have a conflict, and this always happens 'cause things get shuffled around on you, just go to this URL and submit a request if you haven't already, and if for some reason we seem to have forgotten about you, just e-mail sectioning at CS50.net and Matt and Rob will be sure to follow up. So another lunch: it actually went well last Friday, so we thought we'd do it again. If you would like to join me and some of the TFs and CAs, go to CS50.net/RSVP; that's this Friday at 1:15 p.m. And again, we'll vary the times and days over time. And then office hours. Wednesday office hours in particular have proved to be quite popular, so we thought we would expand the program to include our friends from Quincy House. So on Wednesdays now, office hours will be in Quincy House and then as usual [inaudible] tonight, Leverett tomorrow, Lowell House on Thursday. The schedule's always on the course's website. Also, to satiate the appetites of those 10 percent of you who are among those more comfortable, what we thought we would do is try an experiment whereby we fork off some hacker-specific office hours that we'll offer somewhere on campus. We're still sorting out the details.
But for those of you who are diving into the Hacker Edition of this week's P-set and next and so forth, realize that this will be an opportunity, in a more intimate setting, to just sit across the table from some of our hacker-type teaching fellows and ask any and all questions that might arise during the evening. But we'll focus on the standard edition at regularly scheduled office hours. So great, P-set 0 is now in progress, now that we've assigned TFs, and P-set 1 will also be returned. You'll hear about this from your teaching fellows via e-mail over the next few days and thereafter. Now that we have sections assigned, we'll be on track to return feedback typically within just 1 week. It's just at the start of term that we incur this delay because of lack of sectioning. And now a request. So help.CS50.net has hopefully proved actually helpful for you, but realize that we can do best by you if you provide us with more detail than is sometimes there. So writing us notes, however sweet, that pretty much boil down to "it doesn't work" isn't enough for us to go on. So honestly, please do get into the habit: anytime you pose a question to us, state exactly what the problem is. What are the symptoms you're seeing? What is the specific error message? Don't just say "I'm seeing an error message," 'cause it's enormously useful for us to know which error message. And then also, what have you tried, what has not worked, what operating system are you using, if it's germane to the appliance itself? In short, just try to provide us with as much detail as you can so that when we reply to you, we don't have a question mark, we actually have an answer, in the interest of efficiency. With that said, our goal is indeed to return e-mails within 24 hours as best we can, and we're doing okay. These are the statistics from last week, whereby over 200 questions were answered within 0 to 1 hours. The gray bars are last week, the green bars are this week.
So the number of questions is increasing. Some questions took us 1 to 4 hours, as denoted by the second green bar. Four to 8, 8 to 12, dot, dot, dot, and then we accidentally left one person hanging for like 2 days. But that was an anomaly, so realize our goal is to respond as quickly as possible. So today is again about security in this domain and then also about C, and what I thought I'd do is draw our attention to a file we actually included in last week's source code, really just to pique curiosity. It's way more sophisticated than the point at which we're at now. But back in 2007, it was the day or so before CS50 lecture when the first jailbreak code was released for the iPhone. Jailbreaking generally means to hack an iPhone in such a way that you don't have to use it on AT&T, or you can install unauthorized software on it, or just generally reclaim control of the phone that's otherwise pretty locked down by Apple. And what was fun at the time, and fast forward to today, is that that first hack was actually implemented in C code. And so if you're just curious to see how something like this might work, I've just opened in gedit a file called iunlock.c, and scrolling down through this file, you'll see that even though there's a lot of new stuff, you'll see some familiar syntax: sharp include, apparently not just standard I/O though, but some other files in there as well. And if you scroll down you'll see some things we'll get to in a few weeks, structs and so forth. But here we have void, here we have the name of some function, "int i," and so some of this should certainly be readable already. And it's probably not something worth understanding in its entirety, but perhaps it does relate the real world, media and things you hear about like jailbreaking, to the kinds of things that we are actually doing. Also, speaking of security:
I went through my spam folder, which is always fun, and I thought I would try to share not just our own student's e-mail but this one here. So it's not particularly helpful to read the entirety of this thing, but the immediate takeaway should be: do not click on any links in an e-mail like this. As you will see towards semester's end, it is trivial to write a program that actually sends e-mail. It's trivial to write a program that sends e-mail as though it's from your roommates. And so you see a lot of spam on the internet that's also customized in such a way that it seems to be from your own mail server's domain. This piece of spam here specifically mentions college.harvard.edu. Well, it's really not hard to spam every possible username within college.harvard.edu with something like this. This is generally known as a phishing attack, the idea being to try to hook unsuspecting users into clicking a link that says "Click Here." And what generally happens once you've clicked there? It could be a virus, some kind of file that gets downloaded onto your computer, especially if it's a PC. It could be some form that says, please provide your username and password, and that website that you end up at might even look identical to college.harvard.edu. As we'll see in a few weeks, it's really not that hard to go to college.harvard.edu, look at its HTML source code, the language in which webpages are written, and copy and paste it into your own bad guy's website, and voila, you now have your own copy of college.harvard.edu. So a lot of fun stuff there on the horizon. As an aside, I'm reminded of my own freshman year, when I first discovered this trick where you can send an e-mail as anyone on the internet. It doesn't have to be your own actual e-mail account.
And so of course, my proctor mates and I were engaged in some stupid argument over e-mail within Matthews, and I decided to be the ever so witty one and send a reply from the opposing argument's side, essentially conceding to whatever stupid argument we were having. Unfortunately, I wasn't so good with computers, and I accidentally left my signature enabled in the e-mail client, so there was this dear proctor mates, dot, dot, dot, I now agree with such and such, signed proctor mates, DJM. So, kind of an idiot. So in any case, we'll see how to do this, and actually how to do it well, so that it serves you for good purposes and not that. This one too, and this is really just a silly one. Anytime you see a domain name ending in 0LX.net, this too should be a tell. And later in this semester, we'll talk about other signs, frankly, of potential security breaches, but realize too, this is a fascinating numbers game, right? Even though most of you, all of you in this room, might laugh and think, I'm never gonna click on a link like that, that may be true, but if you send out an e-mail to a million people, and only 1 percent of people are not like you and they do actually click, that's 10,000 victims for whatever attack this actually is. So there is this notion of order of magnitude in computer science and in data analysis that we'll get to before long that just really speaks to the power of statistics, frankly, and even with something stupid like this, right, all you need is 1 percent of those people clicking on an e-mail, which is practically free to send, and voila, you've done something either particularly lucrative or malicious. And this is the last one. This one I received as an attachment, right? I get all sorts of ads for Viagra and such, but this one came in as--okay, wait a minute, this one came in as an advertisement, as an attachment, in my spam folder. So what's going on here, right?
Why, first of all, is this attachment all wavy? Why is it an image in the first place and not text? [ Inaudible Remark ] >> Yeah, so part of it is that spam filters hopefully won't actually catch this, because what most spam filters do is pattern matching, something else we'll talk about later in this semester: looking for keywords like Viagra and words that, you know, people might use in an e-mail but that statistically are more likely to appear in some kind of spam. And so they've used an image because you can't really do text analysis on this, at least not as easily. But you can do text analysis. You can do what's called OCR, optical character recognition. But making the image wavy in this way, which they probably did with Photoshop or something like that, just makes it harder for applications like that to actually interpret what this image is. Now just for fun, and also just to make sure it wasn't taking me someplace sketchy, I did go to dot--ed5.ru this morning, but it does seem to have been taken down. So it's now a dead end. So now, on a more serious note, we're at the point in this semester where folks need to make a decision between staying or leaving, and we always see comments like this screenshot here. So this is from a website with which you're probably very familiar. We see, though, comments that are a little more explicit. And this is not to make light of comments like these. This happens every semester to some subset of students here at 1-2. So, right, this is not meant to, like, elicit that emotion, frankly, but really-- [ Laughter ] >> Really to emphasize, first of all, these FMLs are from last year, so 12 months ago. So if a number of you in the audience are particularly among those less comfortable, the 55 percent demographic this semester, do realize that this is certainly early on, and for a lot of students for whom this is a very new world, these are very sort of normal and sort of cyclical emotions that you feel.
But it really is the sort of thing where once you get more acclimated to this, once you stop making what you will eventually call stupid mistakes, leaving off semicolons and forgetting parentheses and these little things that just become old hat to you before long, realize that the material, the ideas, your own programs will get more exciting. I mean, if you think back to the excessive use of videography and photography that we've shown you thus far, you didn't really see anyone crying at the CS50 fair. You saw some people asleep, maybe, at the CS50 hackathon, but there was no one in tears. It was really mostly smiles. And I can promise you this, as can our 40-some-odd TFs and 40-some-odd CAs, that that is precisely the emotion you will feel. And so if you're really at this point right now, honestly, I dare say this is more a function of strategy and your current work habits as they relate to 50, and not something fundamental to the material itself. As I said a week or so ago, honestly, if you feel that sort of stress building up in you, if you're working on some P-set and it's already 3 a.m., 4 a.m., well, just honestly call it a night, take a break. Because even I can't sort of code under pressure, and it's much better then to cash in that late day that you might have, or simply to take a break and come back to it, rather than waste some 3, 4 hours staring at a frustratingly blinking prompt only to get nowhere. Honestly, this happens to me to this day; it's just no fun and it's not easy to code under stress. And if anything, reach out to me, Rob, Matt, or any of your now assigned teaching fellows if you have any questions or concerns, but realize there have been many, many others before you. Not in tears, just before you. [ Laughter ] >> So, where are we going this week? Let's set the stage and let's dive back into some of that code. So in P-set 2, the problem domain is security and more specifically cryptography, the art of encrypting information.
And most anytime you encrypt information, you need some secret between you, say Alice, and the other person, Bob, because otherwise there's no way for the two of you to know how to interpret what might look like nonsense that you're sending across the internet or across a room on a piece of paper. So in cryptography there's this notion of plain text, the original message depicted there at left, and then there's the notion of cypher text, depicted there in the middle. And the cypher text is the scrambled stuff. It looks like nonsense, it looks like random letters, but there's some algorithm, some process for reversing that scrambled text, and that's of course known as decryption, and that yields back the plain text. And so you need some secret, and the simplest kind of secret might be, well, if I wanna write the letter A, let me actually write to my significant other in the class the letter B. And if I want a B, I write a C, and for a C a D, and so forth. You rotate the letters by perhaps one place, or you can be fancier and rotate them by 2 places. Or 3 places or 13 places, though hopefully not 26 places, okay, alright, because then it's not all that effective. And so this is generally known as secret-key cryptography, whereby you, Alice and Bob, need to know some secret, and that secret in this scenario is just a number, like the number 1 if you're gonna rotate all of those characters by 1 place. There's a well-known rotation called rot13, R-O-T 13, which was historically used on the internet to encrypt information, but in a very simplistic way. It was often used back in the day on message boards if you wanted to talk about movies and spoilers but you didn't wanna be the jerk who spoils it for everyone, so this way you had to click a button and it would unrotate the scrambled text by 13 places. Now just thinking about this rotation by 0 to 26 places, how secure would you say this is? Alright, it's not all that secure, alright.
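As a rough illustration of the rotation just described, here is a minimal C sketch; it is not the P-set's solution. The function names `rotate` and `caesar` are made up for this example, and it assumes an ASCII-compatible character set and leaves anything that isn't a letter alone, neither of which the lecture specifies.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Rotate one character by `key` places, wrapping around the alphabet.
// Non-letters pass through unchanged. Assumes an ASCII-compatible charset.
char rotate(char c, int key)
{
    key = ((key % 26) + 26) % 26;   // normalize so negative keys rotate backwards
    if (isupper((unsigned char) c))
    {
        return (char) ('A' + (c - 'A' + key) % 26);
    }
    if (islower((unsigned char) c))
    {
        return (char) ('a' + (c - 'a' + key) % 26);
    }
    return c;
}

// Encrypt a string in place; calling it again with the negated key decrypts.
void caesar(char *text, int key)
{
    assert(text != NULL);
    for (size_t i = 0; text[i] != '\0'; i++)
    {
        text[i] = rotate(text[i], key);
    }
}
```

With a key of 13 this is rot13, so applying `caesar` twice with 13 returns the original text, and a key of 26 (or 0) changes nothing, which is why rotating by 26 places is not all that effective.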
It might be tedious and time consuming to figure out what a handwritten note says, 'cause you'd have to redo it all by hand, and you could guess maybe A as B, B as C, and so forth. If you're wrong, maybe A as C, B as D, so you might have to do this some 25 times to actually figure out what the secret was. And then voila, the secret is cracked. Well, what you're gonna do in P-set 2 is actually automate all of that, so that all you have to do is run a command and bam, decrypted, or bam, encrypted. And certainly, a computer can do this much faster. But you can actually do better than this so-called Caesar cypher. So the Caesar cypher is the name given to any of these rotational cyphers, whereby you take an alphabet and you kinda rotate it conceptually this way so that A lines up with some different letter, and this is what little Ralphie was referring to in A Christmas Story when he was using that secret decoder ring to crack the "be sure to drink your Ovaltine" advertisement. So this is again named after Caesar, but then years later there came along another fellow, named Vigenere, who actually built upon this idea of a rotational cypher but didn't just use one secret, right? A recurring theme you'll see even in computer science is, well, if the first attempt at doing something isn't quite good enough or quite secure enough, maybe do it again, or do it again and try to increase the computational complexity of something. And so what this fellow proposed, the so-called Vigenere cypher, is simply a cypher where you don't use one key, you use maybe 2 keys or 3 keys or even 4, and then you use the first key on the first letter, the second key on the second letter, third on the third, fourth on the fourth, and then if your input, your plain text, has 5 letters, then you just wrap back around to the first key. Now it's a little tedious to remember multiple numbers like the number 2 and 3 and 13 and 14.
And so what Vigenere proposed is that rather than just remember a number, like you might for the Caesar cypher, you instead remember a word. Like my keyword might be ABC, and then you simply decide, sort of as a society, that anytime a keyword has the letter A, this is going to represent the number 0, the letter B is gonna represent the number 1, the letter C is gonna represent the number 2, and so forth. So if your secret, not key, but secret keyword is ABC, that's really like saying we have 3 keys and they are 0, 1, 2. And so, if I wanna encrypt something like hello, if this is the plain text, H-E-L-L-O, and I wanna encrypt it with A, B, C, well, it's 5 letters, so I actually have to repeat my keyword, so I just do this. And now, H plus A, well, A is 0, so H plus 0 is H. E plus B is actually E plus 1, so that goes to F. L plus 2, so L, M, N, so that's N. N, and then I'm back to B here, so O rotated by 1 place is P, and so--well, did I mess up? [ Inaudible Remark ] >> Oh, L, oh yes--yes, yes, thank you. This is plus 0, thank you. Plug in the code, alright, so that's L, L--no, no, no--hang on. You don't have 300 people staring at you. [ Laughter ] >> Okay. So plus 2: L, M, N. Okay, okay, thank you. [ Applause ] >> Really set the bar high here in 50, okay. So that would now be the cypher text, right? And now this is actually harder to decrypt, or to crack, in the sense of trying to brute force your way through it. When we say brute force, it means try all possible keys to figure out what it is. But now this is getting increasingly computationally complex. Whereas with the Caesar cypher, if you're just rotating by a letter of the English alphabet, you have 26 possible keys, 0 through 25 or 1 through 26 depending on where you wanna start counting, how many do you actually have in the Vigenere cypher? Well, for the first letter of your keyword, you have what? 26 possibilities.
Your second letter in the keyword, you have another 26, and then another 26 possibilities, and then another. So in general, the complexity of the Vigenere cypher is gonna be 26 raised to some power, where that power, let's call it K or whatever, is simply the length of that key. So 26 times 26 times 26 times 26, now this is starting to actually get pretty big. If we use, let's say, a 5-letter keyword alone, let me just pull up a little text-based calculator here. And if I do 26 raised to the 5th power and hit Enter, so that's already some 11 million possible keywords that the bad guy might actually have to try in order to figure out what keyword was actually used. Now here too, we have computers, and computers can churn through 11 million possible keywords pretty darn fast. Thankfully though, there exist some more sophisticated algorithms out there, but that's something we'll revisit before long. Now, just as a teaser for those of you who might be a little shy of tackling the Hacker Edition, realize that this week is actually pretty open ended. What we've given you, if you choose to elect this edition, is a password file. So, on most any Unix or Linux or Mac OS system, there's a text file, historically, that contains all the usernames who are authorized to use that computer and then all of their passwords, each separated by a colon. So for instance, Caesar, colon, and then his encrypted password, [inaudible] colon, and then his encrypted password, and so forth. So these are not in plain text, these are the cypher texts. And so the challenge of the Hacker Edition this week, if you've not yet dived in, is to take this as input and figure out what each of these people's passwords is, and you will find that they are of varying levels of difficulty. Rob's is not so difficult, Mailyn's [phonetic] is actually pretty difficult. And so you'll see that various heuristics are actually applicable here, right?
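To make the keyword walkthrough and the 26-to-the-K count from a moment ago concrete, here is a small C sketch. It rests on assumptions the lecture doesn't state: the keyword is non-empty and all letters, case in the keyword doesn't matter, and non-letters in the message are skipped without using up a keyword letter. The names `vigenere` and `keyspace` are hypothetical. Encrypting HELLO with ABC this way gives HFNLP, and a 5-letter keyword has 26^5 = 11,881,376 possibilities, the "some 11 million" on the calculator.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

// Encrypt text in place with a Vigenere keyword (A or a = 0, B or b = 1, ...).
// Assumptions for this sketch: keyword is non-empty and all letters, the
// charset is ASCII-compatible, and non-letters in text are left unchanged
// without using up a keyword letter.
void vigenere(char *text, const char *keyword)
{
    assert(text != NULL && keyword != NULL && keyword[0] != '\0');
    size_t klen = strlen(keyword);
    size_t used = 0;    // how many keyword letters have been consumed so far
    for (size_t i = 0; text[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) text[i];
        if (!isalpha(c))
        {
            continue;
        }
        // Wrap around to the start of the keyword when it runs out
        int key = toupper((unsigned char) keyword[used % klen]) - 'A';
        int base = isupper(c) ? 'A' : 'a';
        text[i] = (char) (base + (c - base + key) % 26);
        used++;
    }
}

// Count the keywords of length k over 26 letters, i.e. 26 to the power k.
// 26^13 still fits in 64 bits; 26^14 does not, hence the bound.
unsigned long long keyspace(unsigned int k)
{
    assert(k <= 13);
    unsigned long long n = 1;
    for (unsigned int i = 0; i < k; i++)
    {
        n *= 26;
    }
    return n;
}
```

A keyword of length 1 reduces this to the Caesar cypher, which is one way to see why the Caesar key space is just 26.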
If you were implementing a password-cracking program, even if at first you have no idea how to do this in C, what kinds of words or what kinds of passwords might you try if the goal is to brute-force figure out whose password is what? So you can use what's called a dictionary attack. So this is actually a pretty sophisticated approach, but it turns out in the appliance, as in most computers these days, there's a huge text file with like 150,000 English words. Well, that's actually not all that much to a computer. What you could try doing is open that dictionary file, read through it top to bottom, left to right, and try encrypting every one of those English dictionary words, and if you see a match in this file, as we discussed last week, voila, you've figured out what Rob's password is. So in other words, you can't really decrypt these things, 'cause they're what we call one-way hashes. But you can encrypt a guess and see if it matches what's actually in this file. Now, other things you might try, honestly, you could try some easy ones. Maybe it's all numbers, maybe it's the word password, maybe it's the numbers 1, 2, 3, 4, 5. So one of the challenges of the Hacker Edition is that, because there's actually a whole lot of possible passwords that any one of these people's could be, it turns out that your program could take days, weeks to actually run. So you're never actually gonna see, certainly before the deadline, if you're right. And so one of the challenges is going to be to write a password-cracking program that optimizes for what hopefully normal people might use as passwords, and then only eventually resorts to just trying all possible letters in the ASCII character set, all 256 possibilities. So do feel free to dive in, and even if you decide, I have to abort this particular edition, that's fine, you can then dive into the standard edition. You can choose week to week which one to do. Alright. Any questions on P-sets or otherwise? No, alright.
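The "encrypt a guess and compare" idea can be sketched in C as below. This is only an outline of the technique under loud assumptions: `toy_hash` is a hypothetical stand-in (the djb2 string hash, which is not secure) for whatever one-way function the real password file uses, and the word list is a tiny in-memory array rather than the appliance's dictionary file, whose location the lecture doesn't give. The names `toy_hash` and `dictionary_attack` are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

// HYPOTHETICAL stand-in for a real one-way password hash.
// This is the djb2 string hash: NOT secure, used only so the sketch is
// self-contained and needs no extra libraries.
unsigned long toy_hash(const char *s)
{
    unsigned long h = 5381;
    for (; *s != '\0'; s++)
    {
        h = h * 33 + (unsigned char) *s;
    }
    return h;
}

// Hash each candidate word and compare it with the stored hash.
// Returns the matching word, or NULL if no candidate matches.
const char *dictionary_attack(unsigned long stored, const char *words[], size_t count)
{
    assert(words != NULL);
    for (size_t i = 0; i < count; i++)
    {
        if (toy_hash(words[i]) == stored)
        {
            return words[i];
        }
    }
    return NULL;
}
```

Nothing here ever reverses the hash; the loop only hashes guesses forward, which is why ordering the candidates (likely passwords first, exhaustive search last) matters so much for running time.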
So let's go ahead and open up a blank file here. So let me close iUnlock. And recall that we spent some time with this fellow here last week, this notion of 100 bottles of beer on the wall and singing something very familiar to him. Had I actually planned ahead, it wouldn't have been a donut, it would have been a bottle of beer, but that's the first thing that came up on Google. So the connection here is to 100 bottles of beer on the wall, and recall that we implemented this with a lot of printfs and with a main function. But then we started factoring some of that code out. So if I recall, we started with something like this. Let me zoom in and do include stdio.h, and that gives me, quick sanity check, what function? printf, that's all. And now another quick sanity check: why is it black and white? Where is my color? Okay, alright. Well, we'll notch things up a bit now. That was too easy, alright. So let me go ahead and save this as, let's call it what we did before, beer1.c. Now that gedit knows that this is a C file, it starts syntax highlighting it for us. And then recall how we began this thing. We had int main, I didn't take any so-called command-line arguments, instead I simply asked the user how many bottles of beer will there be, and then a question mark. And then I'll put a newline character, semicolon, and now I actually need to get the input from the user, so the function we keep using for this is--okay, so let's call GetInt. Okay, feel free to yell at me if I'm making mistakes. This time that was unintentional. Okay, so a couple of mistakes already, right? So first, I do need the CS50 library so that I even have access to GetInt, and there's still a bug in here already; this won't compile. Yeah. [ Inaudible Remark ] >> I'm sorry? On the third line here, we need to actually do int, right? So we actually need to declare this variable, and let me see if I can--let's see, turn on, that's okay.
I was gonna turn on line numbers but we'll leave it off for now. Alright, so int and GetInt, so that's good. And now let's go ahead and forge ahead. So I'm gonna go ahead now and say for int i gets, let's say, n; i is greater than or equal to 0; and then i minus minus. So in other words, I want some kind of loop that, just like the song, is gonna count down from however many bottles I start with. And then on each iteration it's gonna decrement, decrement, decrement, and it's gonna stop once we actually hit 0. And then recall that the song was something along the lines of--I don't know what this is gonna be yet--bottles of beer on the wall, and then I'll do a newline in this, and let me just do some quick copy-paste so that we don't have to type out all of these lines, and now let's just change this. 100 bottles of beer, take 1 down, pass it around, so that comes next. So take 1 down, pass it around, that's pretty easy. And then something, something bottles of beer on the wall. So we're almost there. These question marks are illegitimate, so I should probably plug in a--okay, so percent d, 'cause we're talking about a decimal integer, like 99, 98, and so forth. So let me go ahead and plug that in now, and now, what value do I actually want to plug in here? 'Cause right now, printf can't just take one argument, it has to take two. Yeah, so we wanna just plug in i, not n? If I plugged in n, what would the song sing? The same number again and again. It would just say 99, 99, 99, so we do want i, 'cause i is the number that's being decremented on each iteration. So now I want i here, now I want i here, okay good. Right, there is no actual placeholder there, but down here, I want i as well. Okay, exactly. So even though on each iteration we are decrementing i, realize that within one stanza of this song, we have to say verbally 2 numbers, both where we started and where we're gonna end up.
So, according to that song, we're supposed to end with 98 bottles of beer on the wall. So i minus 1 will do that. Any objections now to how we're doing this? Some tweaks. [ Inaudible Remarks ] >> I'm sorry? [ Inaudible Remarks ] >> Ah yeah, so thus far we're not really enforcing any kind of boundaries on n. In fact, if you don't notice that quite yet, let me go ahead and zoom out, let me go ahead and open up my terminal window here, and let me go ahead and make beer1. That's good, so it did compile okay. So that's promising. As an aside, we were asked the other day how I keep clearing the screen. If you wanna do that, it's just Control-L; it just flushes what you see on the screen. So let me go ahead now and run beer1. How many bottles of beer will there be? Let's say 99, and unfortunately, this seems to err at least once we--oh, now it's just not scrolling back far enough, but that's okay. This is a bug, right, minus 1 bottles of beer on the wall? Let's fix that one first, 'cause that's the first one that jumps out. What's the easy fix here? [ Inaudible Remarks ] >> Yeah. So it should really be greater than 0, or you could say greater than or equal to 1; either way would be perfectly legitimate. Whatever seems most logical to you. So that should fix that bug, but there's this other one. Let me zoom in again, let me go ahead and run beer1, and now let's just be difficult: negative 3. Okay, so it didn't sing anything. So that actually worked okay, but why? What caught that error? [ Inaudible Remarks ] >> Yeah. So right, because we're initializing i to negative 3 and then we're checking that condition, only do the following so long as i is greater than or equal to 1, well, we're detecting this and effectively not executing any of those 4 printf lines, so we simply quit.
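The corrected loop can be sketched as follows. This is an approximation of what's on screen, not the actual beer1.c: `sing` is a hypothetical helper that takes n as a parameter instead of calling the CS50 library, it returns the number of stanzas printed purely so the behavior is easy to check, and the exact lyric wording is guessed from the spoken description.

```c
#include <stdio.h>

// Sing one stanza per bottle, counting down from n to 1.
// Returns how many stanzas were printed (0 if n is zero or negative).
int sing(int n)
{
    int stanzas = 0;

    // i > 0 rather than i >= 0: with i >= 0 the last stanza would start at 0
    // and promise "-1 bottles of beer on the wall".
    for (int i = n; i > 0; i--)
    {
        printf("%d bottles of beer on the wall,\n", i);
        printf("%d bottles of beer,\n", i);
        printf("Take one down, pass it around,\n");
        printf("%d bottles of beer on the wall.\n\n", i - 1);
        stanzas++;
    }
    return stanzas;
}
```

Passing negative 3 makes the condition false on the very first check, so the body never runs and nothing is sung, which matches what the terminal showed.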
However, it's not really clear to the user what's going on, so it would not be unreasonable to say something like, well, if n is less than, let's say, 1, then let's go ahead and say something like printf, please input a positive number, and then that alone could be enough. But generally, as a matter of good practice, what should I do after this printf? You probably wanna return something other than 0. Again, this is something you sort of should take on faith for now. You don't really see these numbers just yet, but we soon will. But you wanna return something other than 0, 'cause 0 generally denotes good and anything else denotes bad. Now what about this? I seem to be missing return. So do I strictly need something at the bottom? You don't really, but once your programs start getting more sophisticated, it's not a bad habit to just explicitly return 0. But what C does for you these days, for main specifically, is that 0 gets returned automatically. So that's why we've done that thus far. Alright, so not bad. Any questions on this thus far? [ Inaudible Remark ] >> Why can't you use percent i? Oh, so you actually can, I think. I actually never really use percent i; I think they're synonymous. But I'll double check. Other questions? Alright, so now, rather than jump into code that was prefab like last time, let me rip this apart in a manner that's a little more clear, which I feel like last week did not quite do. Some of these ideas, just as they were, were a bit confusing. So let me now fix that. So it feels like something like this, this printing of something in a loop that's taking as input some number like n, can kind of be factored out, so that conceptually I could have, as we did last week, a chorus function whose purpose in life is to just sing this song. This again is an example of the fancier term, hierarchical decomposition, where you simply take some chunk of code that conceptually does something and that's it.
It's very well defined what it does, and you factor it out into a function just to kind of clean up what's going on in your main function. And so by that, I mean this. Let me go ahead and copy this code temporarily, let me go down here now, and define a new function. It's gonna return void 'cause it's just gonna print. It's not gonna return any values or numbers. I'm gonna call it, like last week, chorus. Just to be clear, I'm going to go ahead and call this int bottles; I'm not gonna call it n this time, alright. And then I'm gonna go ahead and paste in that same code. And I just need to tweak it a little bit. I need to change this to bottles, and everything else looks pretty good. So let me now scroll back up. I definitely don't need this anymore, but what do I wanna put here instead? How do I call a function that I've written? Yeah, it's just a word, just like printf: chorus, and then I need to pass in the value n. So why do this? Well now, if you kind of scroll back up, and this is one of the aims of good design, it's actually now really super simple to read this program for the first time and realize, oh, I can totally wrap my mind around this quickly. First it's prompting for some input, it's getting that input, it's doing a sanity check on what value the user types in, and then it's printing a chorus. And what's nice now is that when you reach the end of this function main, the program is done, and so now, if you are curious, and once you've wrapped your mind around this program, you can dive in deeper and say, okay, what is chorus? And sure enough, if you scroll down, now you can focus on this more bite-size program.
So this notion of decomposition is partly for this reason here, so that frankly your programs don't become this and this and this long, where by the time you get halfway down or to the bottom of the function, you forget what you even did before. Rather, much like an English story or an outline for an essay, you can get a sense from main alone of what's going on, and only if you care how chorus works do you need to look lower in the file. Now, we can do this slightly differently but there's a bug first. This code will not compile, why? Yeah. [ Inaudible Remark ] >> Yeah, so I need this thing called the function prototype because again, C is pretty primitive in that if you don't tell it that something like a function exists, it's gonna assume it doesn't if it encounters it earlier in the file. So even though chorus absolutely exists down here, I need to provide a little clue at the very top of my file, and I don't strictly need to hit Enter here. It's typically more common just to put a line like that. I just need to copy and paste literally the function's prototype: its return type, so a word like void, but more on those to come; the name of the function, chorus; the type of its argument, int; and in this case, the argument's name. Though as an aside, the name is not strictly necessary; you might see in textbooks that people just do this, that's fine too. But it's somewhat simpler and more readable, I would say, to include the name of the argument. So now this should compile, let's take a look. Let me zoom in, let me go ahead and make beer1, it did indeed compile. Let me go ahead and run beer1, we'll do let's just say 99 bottles of beer, zooms by, 0 bottles of beer on the wall. But there's still a bug, not so much a logical bug now but really a grammatical one, which is what obviously? [ Inaudible Remarks ] >> Yeah, so now I'm kinda mispronouncing. This "1 bottles of beer on the wall" feels like we can do better, right.
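The prototype idea can be sketched as follows. The lyrics and the helper `sing_from` (a stand-in for main) are assumptions for illustration; only the name `chorus`, its `void` return type, and its `int bottles` argument come from the lecture.

```c
#include <stdio.h>

// Prototype: tells the compiler that chorus exists before it is used,
// even though its definition appears lower in the file.
void chorus(int bottles);

// Hypothetical stand-in for main: sanity-check the input, then sing.
int sing_from(int n)
{
    if (n < 1)
    {
        return 1; // bad input
    }
    chorus(n); // compiles only because of the prototype above
    return 0;
}

// The definition can now live below the code that calls it.
void chorus(int bottles)
{
    for (int i = bottles; i > 0; i--)
    {
        printf("%d bottles of beer on the wall,\n", i);
        printf("%d bottles of beer,\n", i);
        printf("Take one down, pass it around,\n");
        printf("%d bottles of beer on the wall.\n\n", i - 1);
    }
}
```

This sketch deliberately still has the grammatical bug ("1 bottles"), since it mirrors the program at this point in the lecture.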
And this might seem like a trivial little thing, but all of us, you know, notice these things in Windows or Mac programs, and eyebrows go up if you see something stupid like this in a program that you paid for or downloaded and it's just got grammatical mistakes in it. But this is actually really easy to fix, right. Conceptually, we just need to do what, inside of this loop? We just need to conditionally say bottle or bottles, so we can do this in any number of ways. In fact, the simplest might be to do this. So if I equals, equals, 1, let me go ahead and indent this. Let me go ahead and do a little copy-paste, but that should be your first warning sign. If you find that you're doing a lot of copy-paste, odds are you can do something a little differently. So let me do this here so if that [inaudible], I'm gonna do this, let me scroll down. Let me print this over here. So now I just need to change the grammar, so it's 1 bottle of beer, 1 bottle of beer. Technically, this is a little redundant now. Do I need to be doing the placeholder? No, you can leave it, but in short, there's a number of ways we can do this, right. I could hard code the number 1 in at this point, get rid of I and then also do the same in the second line. Or I can keep it the same, but it feels like we're doing a little too much work here, right. We're kind of copying and pasting 2 pretty ugly lines of code, making them almost identical except for the omission of 1 letter. So what could we do instead? [ Inaudible Remarks ] >> So use the conditional or okay. So we saw one clever approach with like a one liner whereby we factored out this condition. Let's actually not go there just yet. Let's see if we can't simplify at least what we're asking for. What if instead I do this? Where if I equals, equals 1, really the whole sentence doesn't have to change; rather, what aspect of it changes? It's the word.
So what if I said something like S gets bottle, otherwise, S gets bottles, and now I don't need these curly braces 'cause it's just one line of code. This isn't quite right yet, but notice where I could go with this. So let me actually rip out this, 'cause I've decided I don't really like this approach. Let me move this over here, get rid of this, so now we're back to the original version except for this part up top here. So this isn't gonna compile yet, and it's not even useful yet, because what do I need to change here? That could be percent S, that could be percent S, then I'm gonna have to plug in S here, then I'm gonna have to plug in S here, but what else am I still gonna have to do? [ Inaudible Remarks ] >> Yeah. So I still need to declare S. So let me do this, string S, but okay, wait a minute, I kinda need to do it here then, right. 'Cause only one of those branches is gonna execute. But still broken, why? [ Inaudible Remarks ] >> Yeah, scope. So again, this issue with scope, even though in this case we have just single lines of code and so we're allowed to cut a corner; we don't strictly need these curly braces. Remember that rule of thumb: if there are curly braces, or effectively there are, except in this special case where you can omit them, really that means S will exist here or S will exist here. But as soon as you get here, later in the program, S is gone. That memory is no longer accessible to you. So you can't say the word S. So I need a better solution. I can't declare the variable inside the curly braces or inside the condition, but I can declare it, for instance, here. So on occasion, you will encounter scenarios where you have to declare variables a little sooner than you actually wanna use them. But you need to declare them somewhere within the scope of the chunk of code that you wanna use them in. Now recall last week, we had what's called the ternary operator. It was that funky thing with parentheses and so forth.
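The scope fix described above, declaring the variable before the branches rather than inside them, might look like the sketch below. It uses `const char *` instead of the CS50 Library's `string` so that it compiles without that library, and the function names `word_for` and `print_line` are hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: picks the right noun for a given count.
const char *word_for(int i)
{
    // Declared before the if/else so that it is still in scope afterwards.
    // Declaring it inside either branch would make it vanish at the end
    // of that branch.
    const char *s;

    if (i == 1)
        s = "bottle";
    else
        s = "bottles";

    return s; // s is still in scope here
}

// Hypothetical helper: prints one grammatically correct line of the song.
void print_line(int i)
{
    printf("%d %s of beer on the wall\n", i, word_for(i));
}
```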
It turns out, that if this kind of strikes you as kind of ugly and my God, I just doubled the size of this function just to do something stupid like print out bottle or bottles. Well, realize that there's syntactic sugar in languages like C where you could actually say this. String S gets either the value bottle or bottles based on whether I equals, equals 1, and if I does equal 1, I wanted to get the word bottle. Otherwise I wanted to get the word bottles. And so what's nice is all this kind of ugly code, not wrong, wouldn't be penalized for this certainly, 'cause it is correct and it's nicely indented and so forth, but you can actually simplify this to something much more elegant. And so this is the so called ternary operator and opposed to binary or unary in the sense that it actually takes 3 values, one to the right, one to the middle, and then 1 to the left. But that's just unnecessary jargon. So with this work, is S now in scope? So it actually is. So this is nice--a nice one-liner for fixing the grammar. We haven't fixed everything, right. This is still broken and I can't quite use the same word there necessarily, but let me wave my hand at that final detail, but the idea hopefully is how we can fix the grammar is hopefully more clear now. Yeah. [ Inaudible Remark ] >> If else--if--and if else if, else if? So you could absolutely do that. To contrive a scenario here, if I equals, equals, 0 I could say, S gets no bottles and yell, for instance, else if I equals, equals 1, then I can do S gets 1 bottle for instance, else I can do S gets bottles. So in other words, I can have as many branches as I like and as before, I don't need those curly braces. [ Inaudible Remarks ] >> Oh, oh I'm sorry. I misinterpreted your question. No. So, that was the shorter answer. No, you can have--well actually, that's--that--okay, no, I'm wrong. So you can, it just starts to get less clear. 
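A minimal sketch of the one-liner just described, using C's conditional (ternary) operator. As an assumption for portability, it uses `const char *` rather than the CS50 Library's `string`, and the function name `plural_word` is hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: same decision as an if/else, written in one line.
// Read it as: "if i == 1, the value is "bottle", otherwise "bottles"".
const char *plural_word(int i)
{
    const char *s = (i == 1) ? "bottle" : "bottles";
    return s;
}

// Hypothetical helper: plugs the chosen word into the %s placeholder.
void print_count(int i)
{
    printf("%d %s of beer on the wall\n", i, plural_word(i));
}
```

The operator takes three operands: the condition before the `?`, the value used when the condition is true, and the value used when it is false.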
You can actually put another set of parentheses here, and another question mark and colon, and another set of parentheses and so forth. But I would argue against doing that, because honestly, it becomes much less readable if you're not sure which one lines up with which idea. So I reduce it just to this degree. Yeah. [ Inaudible Remarks ] >> Uh-huh. [ Inaudible Remarks ] >> Really good question. So if you declare a function in this way, with the prototype at the very top of your file, that seems to run the risk of colliding with other functions you wrote and might still be using, and that might run the risk of colliding with functions other people wrote, right. For instance, what if the person who wrote the standard I/O library, whose header file is called stdio.h, included a function called chorus in that file? Well, in C, you're kind of out of luck. Like, that happens. There's no notion of what are called namespaces or packages in C. So this is a problem that is solved by more modern languages which we'll get to later in this semester, among them PHP, JavaScript and the like. C++ handles this, as does Java, but in C you cannot, for instance, steal the name of a function that someone else wrote that you are using by way of including it with something like this. >> So you could not implement get int, you could not implement printf unless you were willing to sacrifice that other person's version of it. Good question, yeah. [ Inaudible Remarks ] >> Ah, good question. Would you want to declare the string S outside of the for loop? You could in theory declare it outside of the loop and you would then still have access to it, because it would still be in scope. But the problem is that the words are changing as this loop iterates. It might go from 3 to 2 to 1 to 0. It might go from 2 to 1 to 0. So, in other words, you need to make the decision at some point conditionally in the loop as to what word you're gonna spit out.
You can't do it based on N alone, right. Because it's not N's bottles, it's the I equals 1. [ Inaudible Remarks ] >> Oh I see, absolutely. So if you want it, you could do this here. However, I would argue as a matter of style. This gains you nothing. Oh, and I see what you're saying, in terms of efficiency. So this is--this would be recommended against these days. The version of C that we are using allows you to do exactly what I did the first time and the compiler, namely GCC is smart enough to realize that it doesn't need to reallocate a new 32 bits, 32 bits, 32 bits or whatever it is for that particular string. All it will do is update the assignment. So for that kind of detail, generally the compiler is smarter than us, so you don't have worry about that. Good question, yeah. [ Inaudible Remarks ] >> If you had the same--if you had 2 libraries with the same function name in them, you could not use them together. It would break. The compiler would yell that previously--it would yell that the function is previously declared. Good question. Alright, so that was bottles. Let's take a look now at 2 buggy programs just to see whether this starts to pop out a little more. This is buggy4.c. So this is a program that's supposed to increment a variable but I claim does not. All that made it at top is the comments. So let's look at the top. At the top, we have a function prototype for a function that's apparently called increment that takes no arguments and returns nothing, main does what? Alright. So X gets 1, it prints out the value, just says, dot, dot, dot, calls increment, then prints out the value of X again. So the que--this is buggy, this is not in fact going to increment X, but why? Let's look at the increment function. This is broken, and in fact won't even compile, why? Yeah. [ Inaudible Remarks ] >> Yeah, it doesn't declare the type of what? [ Inaudible Remarks ] >> Okay. So it hasn't declared a type for the variable X, okay, I'll fix that. Let me do in text. 
Better? So not quite. These two is now actually worse, you can't declare a variable like this and then do plus, plus. Frankly, it doesn't even have a value yet. So that's not quite right, so it's a good thought and it is related to this fundamental problem here, yeah? [ Inaudible Remarks ] >> Yeah, exactly, so X exists but it only exists up here. So it only exists in the scope of main whose scope is defined by its own curly braces. So okay, alright, let me fix, then let me do this. Now increment has its own X. Okay, yeah? [ Inaudible Remarks ] >> Okay. [ Inaudible Remarks ] >> Okay. [ Inaudible Remarks ] >> Okay, so we somehow need to pass from main the value of X into this increment function it it's actually gonna do anything if I can simplify. And let me just before I erase the second mistake here with this compile though. So this would actually compile, right. 'Cause if you go back to basic definitions what is increment now doing? Well, it's not taking any input, but that's fine, it is declaring a variable called X and it's of type int and that means what? 32 bits are being set aside inside of increment's frame. Remember when we drew RAM as a big rectangle, so a sliver of RAM has now been allocated to the increment function. And in there is some 32 bits that are gonna be used for this variable X, and then it's incrementing X. But there is a problem here. What is X at that point in the story? Like we don't know, and in fact, if you think back now to last week's picture when we drew RAM as a big rectangle. The interesting thing about this design of memory use in a computer is if this is main at the bottom. Well, main is kind of lucky in that his memory always sticks around until the program ends because his gets put into the computer's memory first. Well, what if you call a function foo, it ends up there but then foo returns. So what happens to it, this gets reclaimed. 
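The scoping point made above, that main's X and increment's X are different variables, can be sketched as follows. The lecture has not shown a fix at this point, so `incremented` is just one possible approach, and both function names are hypothetical.

```c
// Receives a COPY of the caller's value. Changing the copy has no effect
// on the caller's variable, because the copy lives in this function's own
// sliver of memory and disappears when the function returns.
void increment_copy(int x)
{
    x++;
}

// One possible fix (an assumption, not the lecture's code): hand the new
// value back and let the caller store it.
int incremented(int x)
{
    return x + 1;
}
```

With `int x = 1;` in the caller, `increment_copy(x);` leaves x at 1, whereas `x = incremented(x);` makes it 2.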
What if you then call a function bar, well, it might go there on top of main but then it return so its memory goes away. But it doesn't really go away per se, it's just that the computer forgets that those bits were once used by a function foo or those bits were once used by a function bar but all those 0s and 1s are still there. All the magnetic particles or all the electrons are still there in the same orientation you left them. And so what might happen when a function like increment is called, well, increment is gonna get a sliver of memory. But if you had 011, 011 whatever the 0s and 1s were when a function called foo or bar was using that chunk of memory, what value is gonna be an X at this point in the story. The answer before was this but really it's whatever junk was left over there by those previous function calls, foo or bar. So these two is going to be for bad guys an opportunity to do bad things, right? If you can imagine a program whose purpose in life is not to increment numbers but to ask you for your password, right, and you guys have been doing this, to submit 50 program for instance in the appliance asks you for a password. Well, that password gets stored in the appliance's RAM inside of some function, inside of some slice of RAM allocated to some function. But as soon as you're done running submit 50, okay, that memory is given back to the operating system and these slivers go away. But what about the 0s and 1s, what about the ASCII characters that were temporarily in RAM storing your password, where are they? They're still in RAM, right, the computer's forgotten where they are but if a bad guy has physical access to your computer or somehow hacks in via some network connection and has the ability and the savvy to poke around your computer's RAM, you might see 1, 2, 3, 4, 5 or whatever your password is just lying around there in memory and it's simply because, again, computer's use of RAM is the simplistic. 
So what's happening here? I'm saying give me 32 bits, call it X. Increment that value, but who knows what that value is, maybe it is 1, 2, 3, 4, 5. So what did I just do? I plus plus it to 1, 2, 3, 4, 6, but then this function returns, and so what effect does this have ultimately on this variable X here? Absolutely none. It's in a completely different sliver of RAM. So that was a lot. Why don't we go ahead and take our 5-minute break here. [ Noise ] >> Alright, we are back. So recall that last week we scratched the surface of this other topic known as an array. In English, as best you understand now, what is an array with regard to C? Anyone got something? So it's a list, right, it's some kind of variable, let's call it noun, that's not quite right, but it's some kind of variable that doesn't store one value but stores multiple values, and we did see something just like this in Scratch, and Scratch called them lists. And another word that gets tossed around for these things is vectors. But an array in C can be likened conceptually to a bunch of mailboxes, right; an array in C is a bunch of contiguous chunks of memory. Contiguous just means back to back to back in RAM. And so in this case here, this might be an array of mailboxes of size 5. So I could store for instance one thing inside each of these mailboxes and that's it. Now, what thing can I put in there? Well, it depends on how we declare the array. If we say this is an array of int, I can fit 32 bits in each of these mailboxes. If I declare the array as a char array, how many bits can I fit in each? So just 8, right, remember that a char is a byte and a byte is 8 bits, so if this is an array of chars, of characters, well I can only fit 8 bits in each, right? So that's like having smaller or bigger mailboxes. Now, if I wanted to store a word in these mailboxes, well, what's the longest word I could store in here? And so maximally 5 for sure, right, if there's only 5 mailboxes.
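The mailbox sizes mentioned above can be checked with `sizeof`. This is a small sketch with hypothetical function names; note that the 32-bit figure for int is an assumption that holds on the appliance and most current machines, but C itself does not guarantee it.

```c
#include <limits.h>
#include <stddef.h>

// Bits in one "mailbox" of a char array. CHAR_BIT is 8 on virtually
// every platform you are likely to meet.
size_t bits_per_char_box(void)
{
    return sizeof(char) * CHAR_BIT;
}

// Bits in one "mailbox" of an int array (32 where int is 4 bytes).
size_t bits_per_int_box(void)
{
    return sizeof(int) * CHAR_BIT;
}

// Total bytes taken by 5 back-to-back int mailboxes.
size_t bytes_in_five_ints(void)
{
    int boxes[5] = {0};
    return sizeof(boxes); // 5 * sizeof(int), since elements are contiguous
}
```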
But recall from last week, it's not quite 5. In fact, the upper bound is 4 because we actually want to keep around some kind of reminder that that string, that word ends here. And so that special character, recall, was the null character or the backslash zero, so all zero bits denotes the end of a string. So we can generalize this now as an array, it's just its placeholders, it's cubbies, it's a mailbox inside of which you can put a whole bunch of values one after the other but by the design they all have to be of the same type. So let's actually see this in action. Let me go back to my G edit window here, let me open a new file and I'm gonna go ahead and create something that just lets me play with a string. >> Let me go ahead and first declare this as let's say in John Harvard's directory as string 1.c. Because it turns out even though we just introduced this jargon last week of an array, we've been using arrays actually since week 1 when we first introduced a string 'cause what's a string? Well, it's a word, a phrase, sentence, paragraph whatever, but what is that? It's one character after another after another after another. So this thing we've been calling string casually is actually an array of character. So we can now peel back that layer. So let's go ahead and include, let's say the CS50 Library. Let's go ahead and include standard io.h. Let's go ahead and do int, main, void. I don't want any command line arguments just yet. And then in here let me go ahead and ask user for string. So I'm gonna do string S, get string and then what do I wanna do with this? Well, first let me figure out the length. Int, let's say N get strlen of S. So it turns out that there is a function called string length or srtlen and it happens to be declared in a file, actually I don't remember. So what are my solutions here? So let me actually go and open a terminal window and I can actually type. 
If I know the function is called strlen I can say man for manual strlen enter and you'll see a few things among which at the very top is a reminder as to what header file it's in. So I apparently need that and now this is a little cryptic at first glance but this will make sense before long, the function is indeed called strlen. It takes one argument but that argument is apparently not a string. What data type is it apparently? Const char asterisk for some reason. Well, as a teaser and a star here means a pointer. A pointer is a memory address and we'll actually come back to that. But for now, just assume for simplicity that everything highlighted here in white is just a "string." The keyword const char star is what we have simplified temporarily in the CS50 Library as "string." Now as for this here, size underscore T, this too is a teaser. It turns out that in C you can declare your own data types if you don't want to confine yourself to just int or float or double. You can actually create a student data structure. You can create a human data structure inside of which are multiple fields like a student's name, ID number and phone number and the like. But we'll get there too but for now just know that size T happens to be synonymous with int. So realize that when looking at man pages, it's sometimes they actually won't make much sense at first glance. And in fact not until mid to late in the semester will you look at this and be like, I remember what that is. For now, just kind of reason through as best you can that okay, this at least means I need this header file so let me go ahead and go up here and include, include string.H and then recall this reference in particular. On CS50's website we have under resources the following link at the very top. 
So to C reference, this is a relatively more user friendly reference guide to all of C's functions and even though many of them might seem new, in total there's not a huge number, certainly not as many as Java for those with backgrounds. But let's go to C string and character because it feels like I'm doing something with strings. And sure enough if I scroll down, strlen, there it is, returns the length of a string, let me click that. And now this isn't all that enlightening in this case but you'll find that a lot of this documentation includes examples, includes links to related function. So hopefully, it will at least answer some questions. But honestly, when in doubt, just try it. So let's see what happens if I pass strlen, an argument that's of type string a.k.a. const char star and store its value in ints. I'm not gonna bother with this size T and all this. I'm just gonna go with what feels simple to me. So now I have in end the length of the string. Let me go ahead and have a loop. So for int I get zero. Let me do I is less than N. Let me do I plus, plus then let me open some curly braces and what do I wanna do? Well, let me go ahead and treat this string as though it is array of characters. So let me go ahead and print F a single character. So percent C one at a time and then I'm gonna put a new line. So in short I wanna print 1 character per line out of whatever is in S. But S, I think I need to use I here somehow. How do I get the Ith character of S, yeah. So the next syntax we introduced was this, S open bracket I close bracket and that says go to the variable S then index into it, so to speak, to the Ith location, 0, 1, 2, 3 and print out that particular letter. So let's see what happens now. So this feels like it should iterate from I to N where N is the string length and then print out whatever character is at that location. Any syntax errors before I try to compile? Alright, let's try. 
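The loop just dictated can be sketched like this. To stay self-contained it takes the string as a parameter instead of calling the CS50 Library's get string function, and the function name `print_one_per_line` (and its return value) is an addition for illustration.

```c
#include <stdio.h>
#include <string.h>

// Prints each character of s on its own line, treating the string as an
// array of characters. Returns how many characters were printed.
int print_one_per_line(const char *s)
{
    // strlen returns a size_t; the lecture stores it in an int for
    // simplicity, so this sketch does the same with an explicit cast.
    int n = (int) strlen(s);

    for (int i = 0; i < n; i++)
    {
        // s[i] indexes into the string: position 0, 1, 2, ... n - 1.
        printf("%c\n", s[i]);
    }
    return n;
}
```

The condition is `i < n`, not `i <= n`, so the loop stops before the terminating backslash 0.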
So let me just go down here, this is make string 1, enter, okay, good sign so far. Let me go ahead and do string 1, enter, and now let me type in hello world, enter, okay. It feels like that actually did work. If I scroll back up, we see H-E-L-L-O one per line. We see the space bar character and then voila, we're at the very end. So what's the takeaway here? Well, apparently all this time we've actually been using arrays, we just didn't slap that label on it but in a string is just an array of characters. But here is where you need to be a little careful. Suppose I kind of messed up here and suppose I did this, alright? So you already know hopefully instinctively this is bad, unclear what's gonna happen but going beyond the length of something probably feels like it's at least gonna make some aesthetic bug. So let's see what happens. I'm now iterating from 0 up through N which means if the length of the string is N, this really means a total of N plus 1 iterations 'cause if you started 0, you get one extra 1 which is probably one too many. So let's recompile. So I'm gonna rerun make string 1, enter, rerun string 1 enter, hello world, enter, it doesn't feel so bad. So that doesn't seem to be too problematic. Why did I get a space? Well, that's how 0 in this case is being interpreted, so backslash 0 is being automatically inserted into S for me. So we'll see this before long but for now, know that what get string does is it actually reads one character at a time another, another, another from the keyboard and then when you are done hitting that last key on the keyboard and you hit Enter, when you are the user it then adds in a backslash 0 to the very end of the string. But when you call string length, it does not in fact--does not in fact count that backslash 0. So in other words, if I type in hello, H-E-L-L-O string length is gonna return what value, 5. It's not going to return 6 even though how many bytes are being used? 
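A small sketch of the distinction being drawn here between a string's length and the bytes it occupies. The function names are hypothetical; `strlen` and `sizeof` behave as shown for any char array initialized from a string literal.

```c
#include <stddef.h>
#include <string.h>

// Number of letters strlen reports: it stops at the backslash 0 and does
// not count it.
size_t letters_in_hello(void)
{
    char word[] = "hello";
    return strlen(word);
}

// Number of bytes the array actually occupies, terminator included.
size_t bytes_in_hello(void)
{
    char word[] = "hello";
    return sizeof(word);
}

// The byte sitting just past the last letter.
char byte_after_hello(void)
{
    char word[] = "hello";
    return word[5]; // still inside the array: this is the terminator
}
```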
Well, 6, 'cause there actually is that sixth character, the backslash 0, terminating it, but we'll see this before long. Now let's just do something crazy. I would like to go 100 times the length of this string just to see what's in memory, right? If I argued before that inside of RAM are these leftovers from previous function calls, let's see what's in there, let's print them out 1 character at a time. So just to keep things a little more readable, let me not do the new lines 'cause I'd like to just see it on one screen rather than scrolling, so let's get rid of the new line and now let me go ahead and scroll down. Let me rerun make string 1. String 1, enter, let's type in hello, enter, that's funky. What is that, right? So it turns out there's a lot of unprintable characters in ASCII, right? If you think back to Asciitable.com or the chart we had a while ago, if you go to Asciitable.com, I think Tommy mentioned this in the walkthrough last night. This is frankly an unnecessarily complex looking chart. But it just maps letters to numbers. If we scroll down to 65 over here in the top right, it maps to A. But notice that over here there's crazy things like form feed and shift out, shift in. I don't even know what some of these characters are. Well, neither does the terminal window. And so it prints them as these funky garbage characters. I mean frankly, let's really mess around here: a thousand times the length. Let's see what we can see. Alright, go down here, zoom in, remake. And again, remember this keyboard shortcut. If you're tired of typing the same thing again and again you can always hit up and down to scroll through your history, or you can type exclamation point followed by M, which matches whatever the last command was that you typed that started with M, and hit enter, and your prompt will figure out what it was. So I'm gonna now go ahead and rerun string 1. Type in hello, enter, still not doing much. What might it be doing though? Can I go further?
It's getting a little crazy. Make, rerun, hello, ah-huh. So some of you have seen this already, yes. If so, good job, very advance already. So this is bad, right? A segmentation fault just sounds bad and we did induce this last week when we I did what? Remember when I kept calling, I forget what the function was called. Maybe it was foo or something silly. Remember, I just kept having foo call foo which called foo which called foo, and what did that mean? It meant that more and more RAM is getting allocated, allocated, allocated, until finally it crossed over its segment so you get a fault and core dump actually is silly terminology which means you now literally have a file in your current folder called core, and what do you think is inside of that file? It's actually useful. It's a bad sign. It means you screwed up or I screwed up but what's inside of it? It's kind of some forensic information, inside of that core file is essentially the entire content of the appliance's RAM at the moment your program crashed. Now, it's not everything 'cause that would be huge. It's only as much as you need to see. But we'll see in future weeks a tool again called the Debugger whereby we can do a little forensic analysis of this file and figure out retroactively where and hopefully why my program fails. Now what happened here? Well, I iterated like what, 100,000 times farther than I was supposed to on this string. So what does that mean? Well, it doesn't mean that I kept calling functions again and again. It means I simply tried to touch or print memory that was way out of my boundaries. And so this is a bad thing. So hopefully that issue makes a little more sense. But it turns out we should be a little more careful still. I can make this program crash in other ways because it turns out that get string doesn't always return a string. 
>> It's hard to simulate this but there are ways where a bad guy or even we if we were really crafty here could tell get string to return without even giving it any characters whatsoever. And so if we actually read the documentation for the CS50 Library which we can do as follows. I'm gonna open up G edit. And in today's source code in the source directory is a file called CS50.H. So this has been in the appliance all this time. You've not have to care about or see it in person but you'll at the top some fancy stuff and some files we haven't seen. But let me scroll down and point out just a couple of details. So this again is CS50.H. Every time for the past 2 weeks you've been saying include CS50.H, what has GCC been doing for you? It's been opening this file and effectively copying and pasting its contents at the very top of your program and then proceeding with your own code. What is the point of that? Well, it means that no matter where you call get int or get string, the compiler already knows about it because if we scroll down further in this file, notice that we put all the function prototypes for all the CS50 functions in here with a whole bunch of comments explaining how they work for you the human so that no matter where you call get int, get string, get long, long, the function prototypes have already been declared for you because again, when you do sharp sign include, it's like copying and pasting this file at the top. Now what about string, how have we been simplifying that all this time? Well, again you can declare, you can define your own types in C and again we'll come back to this. But that one line is what makes possible the data type we have been calling string all of this time. But let me scroll down as another teaser to come with regard to pointers. If I scroll down to the documentation here, forget string. Notice that it says this. 
It reads a line of text from standard input, that just means keyboard, and returns it as a string, otherwise known as char star, sans trailing new line character. So, in other words, even though you've been hitting enter at the keyboard, we do not hand you as part of the string that backslash N. Otherwise every time you printed strings they would move the cursor to the next line. So we get rid of that. Returns, this is what's interesting. Returns null upon error or no input whatsoever. So it turns out there's a few scenarios in which get string cannot give you a string. Either somehow the bad guy has avoided hitting enter at all but he has nonetheless said I'm done executing get string. Or what's another situation in which a function like get string could fathomably fail? Like what could a bad guy do that's just really annoying and breaks a function like get string, whose purpose in life is to get a string from the keyboard and hand you back a chunk of memory with that string? Yeah, what if it's a super, super, super long string, right? The appliance, recall, only has by default like 768 megabytes of memory, and some of you with netbooks and the like have had to crank that memory down to only like 256. So what if the bad guy, you know, it's a little tedious, but typed in billions of characters on the keyboard, or really just did a whole lot of copy paste to paste in a huge string, or the whole King James Bible, which you can just copy from the web page and then paste at the prompt. What might happen? Well, get string might not be able to allocate enough memory from the operating system, and so what is get string gonna do? Well, according to its documentation it's not gonna return that string, it's gonna return a special value called null, in all caps. And this is the absence of any actual memory, the absence of any string. But we'll come back to that.
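The defensive habit hinted at above, checking for null before touching a string, might be sketched as follows. The helper `safe_length` is hypothetical; it takes the string as a parameter rather than calling the CS50 Library so that the sketch compiles on its own.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns the length of s, or -1 if there is no
// string at all. Calling strlen (or indexing with s[i]) on NULL would
// likely crash the program, so the check comes first.
int safe_length(const char *s)
{
    if (s == NULL)
    {
        return -1; // no string was provided
    }
    return (int) strlen(s);
}
```

A caller that got its string from a function that can return NULL would test the result the same way before looping over it.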
For now, know that this program is actually a little buggy because we're not checking if s is actually a legitimate value. We'll see in future weeks that in fact it could itself be flawed, or rather it could be nonexistent. And so we could get a segmentation fault just by trying to traverse it. So let me try another version here. Let me go ahead and open up a prefab file this time, string2.c, 'cause it turns out there's an inefficiency in this. So what I'll start to do now in lecture, so that all the answers are not always on the screen, is this: even though the comments are in the printout, in the PDFs online, and in the source code, which you're welcome to download in advance or bring to class on a laptop, I go ahead and, with a little program, strip out all of the comments, all of the one-line comments, so that the file doesn't just tell you what's going on here. So that's why there are some empty lines here. And now let me actually point out this. This is the key to solving the problem I just alluded to, checking that s actually does not equal NULL, but again, more on that in the future. Let's focus instead on just this part here, which is almost identical to the thing I just implemented but with one difference. I would argue that this implementation is worse than what we just did on the fly. This is more inefficient. This is bad design, so to speak. What's--damn it, I have to spoil the answer now. This is really good design. So what if--damn it--I opened the wrong version. Okay, here we go. Let's put this over here, okay. This is bad design, what should we do instead? Well, let's try--that question is clearly out of the bag now, why is this bad design, why don't you like this? [ Inaudible Remark ] >> Okay, good. So remember how the for loop works. You've got the initialization of some variables, and that's over here to the left of the first semicolon. You've got the increment over here to the right of the second semicolon, and that happens every time the loop goes through.
And then remember this part, the condition. How many times does this thing get checked? Every time you go through the loop. So what's the stupid design decision here, even though this is nice and clean and readable, right? This is very succinct, it's a one-liner, it's not ugly, but how many times am I calling strlen on s? Well, every time this loop goes around. So if I type in H-E-L-L-O, what is strlen of H-E-L-L-O gonna be on the first iteration? 5. What about the second iteration, what's the length of hello? 5, and then 5, and then 5, right? The fact that I'm doing this like an idiot just means that we are doing the same exact thing again and again and again, and that's a waste. So we can do better than that in a couple of ways. One, we could do it in the manner I first did, where I said, you know what, give me a variable n, give it the value of strlen of s, and then go ahead and do i is less than n here. So that fixes it. But if you don't like that syntax, you can actually do it inline, and this was that spoiler: n gets strlen of s. So notice now I have 2 initializations to the left. I can write a condition now based on that variable, and so this is just a marginally cleaner way of checking the length of s, but only once, because even though i is being incremented, we're not actually changing the length of that thing itself. So this is good design. In the bad version that I changed to a moment ago, we're calling strlen again and again. That's what we mean in the PDFs of the problem sets when we say we grade with respect to design: look for those kinds of inefficiencies that might not be glaringly obvious. They're not incorrect, they are just bad design. So let me go ahead and open up just one or so other things here. Let's take a look at this one. So this is a program, and this is actually slightly useful. It's kind of a calculator but just for quiz averages.
This is a program whose purpose in life is to ask the user for their quiz scores; it then computes the average of those quiz scores and tells you what your average is. So marginally useful, but it introduces a few new features that are now gonna be increasingly useful. So let's start at the top. In English, what would you wager this line is doing? Give me at least one verb, and the answer should be that it's declaring something. What's it declaring? It's declaring an array, and I say that the verb should be declare because any time you see a data type like float or int or string in front of a variable, you're declaring something. So you're declaring an array. How do you know it's an array? Well, the clue today is just these square brackets. Now, what is QUIZZES in all caps? We've not seen that before. Well, it turns out that a very good practice, also good design in a program, is this: if you need to use some constant value like the number 2, 2 quizzes per semester, it's best not to hard-code the number 2 all throughout your program but rather to factor that out using what's called sharp define, so similar to sharp include but the word is different, sharp define. You then give the constant a name, and by convention it is all capital letters; that's not required, but that's what programmers do. And then you give the number that you want to make QUIZZES synonymous with. So it creates like an alias, so that when your program is compiled, anywhere the word QUIZZES appears, the number 2 is gonna be substituted. Does it matter if it's an int or a double? No, we could actually do this, but you wanna make sure--sorry, when you define a constant you can make it almost any data type. This though would be bad here, because when you declare an array with square brackets, you must pass it an int; you cannot pass it a float. So what is going on here? We've not seen this syntax in a program yet, but this variable is gonna be called grades. How big is this array gonna be?
>> Well, when you wanna specify in advance how big an array should be, you just specify that number in square brackets. So I could do this, and in fact that is synonymous with what I just did. But again, the goal of a constant is to factor that out, because look just ahead: we don't know what this is doing yet, but notice I've already used QUIZZES 3 times. If you fast forward, if we ever had 3 quizzes, or God forbid 4 or 5, during the semester, you would then have to go change that number all over the place, and it's just too easy to make a mistake. Leave a 2 up here or a 3 down here and now the program is broken. So we can standardize the number at the very top of the program. So in short, this gives me an array of size 2, 2 mailboxes side by side, each of which can store a float as a value. Alright. What were your quiz scores? Then we iterate from i as 0 up to QUIZZES. This is sort of old school now. So quiz number 1 of 2: I'm just doing 0 plus 1 is 1, so this will literally say quiz 1 of 2. So I'm trying to make it more human friendly by counting from 1 rather than 0. And then what's this doing? Well, this is now the syntax. If you already have an array and it's a big enough size, if you wanna put something in the ith location, you just specify grades bracket i gets, in other words the assignment operator, whatever it is you wanna put there. So in short, what do these lines of code do? They declare an array of size 2, a mailbox with 2 slots. They then ask the user for their quiz scores and store those 2 floating-point values in grades bracket 0 and grades bracket 1, that's all. So one number here, one number here. Let's go ahead and scroll down now, and now I'm computing the average, and this is it. So how do you compute the average of numbers? Frankly, this is very easy with 2, but we wanna keep it general. You compute your average by adding all your quiz scores together and dividing by the total number of quizzes. So let's do that.
I have a sum that's initially equal to zero. I then iterate from 0 up to QUIZZES, so however many quizzes I have. Then I can use this fancy syntax, plus equals, which adds the ith grade to sum. So once I get to this line here, what have I just done? I've stored in a variable called sum the sum of my 2 quizzes. So something plus something; now I have a total, but now I need to get the average, so I need to divide by QUIZZES. And just so that I don't kind of screw you out of the A instead of the A minus, we wanna make sure we round up or round down, because otherwise we would be truncating. Remember what potentially happens when you divide something by an integer. So let's go ahead and round, so we round up or down. The round function is, it turns out, in the math.h header, so more about that online under cs50.net/resources, but this now stores in average an integer that's the result of rounding the sum divided by QUIZZES. So at the end of the day, this is just basic arithmetic that you could do with paper and pencil, on a calculator, or in your head even, but what we've introduced now is the ability to store multiple grades in an array. Why didn't I just use 2 variables, one for grade 1 and one for grade 2? Right, float grade 1, semicolon, float grade 2, semicolon, why is that arguably bad? [ Inaudible Question ] >> Yeah, exactly. What if I have a third quiz, a fourth quiz? Do you really wanna start copying and pasting your variables and have grade 1, grade 2, grade 3, grade 4, grade 5, grade 10? No, rather you can put them all in the same variable called grades and then just index into it at the ith location. And so it's with these primitives, modelling strings as arrays and being able to traverse strings or arrays with for loops, that you'll be able to take something like hello and encrypt it, whether you're doing the Hacker Edition this week or the standard edition. So with that, we will see you on Wednesday.