[MUSIC PLAYING] DAVID J MALAN: All right, welcome back to CS50. This is the start of week two. A word from one of our friends on campus-- if you are interested, possibly, either now or in some future term even, once more comfortable, teaching middle school students a little something about computer science, do head to that URL. They are in particular need right now of teachers, particularly if you have had some exposure to computer science. So recall that last time, we introduced a few data types in C, and you may have started to get your hands dirty with these thus far in problem set one. And we had a char. So in somewhat technical terms, what is a char as you know it today? So it's a character, but let's be more precise now. What do we mean by character or individual char? A non-numerical character-- so not necessarily. It turns out that even numbers, even punctuation and letters are represented with this data type known as a char. So it's not necessarily alphabetical. Yeah? So it's an ASCII character. So if you think back to week zero, when we had our byte of volunteers come up and either hold their hands up or not at all, they represented bits. But collectively as a group of eight, they represented a byte. And we introduced the notion of ASCII at that lecture, which simply is a mapping between numbers and letters. And ASCII uses, as those humans implied, eight bits to represent a character. So accordingly, if eight bits can each take on one of two values-- zero or one-- that means there were two possibilities for this person-- zero or one-- two for this person, two for this person, two for this one. So a total of two times two times two times two times two-- so two to the eighth in total. So there's a total of 256 possible characters that you can represent with eight bits. Now, those of you who speak Asian languages might know that there's more characters in the world than just As and Bs and Cs and Ds. 
And indeed, ASCII does not suffice for a lot of languages of the world. But more on that another time. For now, know that in C if you want to represent a letter, a piece of punctuation, or just something character in nature, we use a char. And it's one byte or eight bits. How about an int? Well, an int is an integer. How many bits, if you recall, was an integer typically? Anyone recall? So it's typically 32. It actually depends on the computer that you're using. But in the appliance, and in a lot of computers, it's 32 bits or four bytes-- eight times four. And ints are just used for storing numbers, either negative, positive, or zero. And if you've got 32 bits and you only care about positive numbers, can anyone ballpark how many possible integers a computer can represent from zero on up? So it would be two to the 32, which is roughly four billion. So these powers of two are going to be recurring themes in computer science. As we'll see, they're quite convenient to work with even if it's not quite easy to do the math in one's head. So we'll say roughly four billion. Now, a long long-- you can kind of guess. It's longer than an int. How many bits? So 64 bits or eight bytes. This just means you can represent even bigger numbers, bigger positive or bigger negative numbers. And how about float? That's a floating point value of 32 bits. This is just a real number, something with a decimal point. But if you instead need more places after the decimal point or you want to represent a bigger number with some fraction after it, you can use a double, which is 64 bits. But there's an interesting takeaway here. So if ints are limited by 32 bits and even long longs are limited by 64 bits, that sort of begs the question, what if you actually want to count higher than 4 billion for an int? Well, you just use a long long. But what if you want to count higher than two to the 64th, give or take? Now, that's a huge number. 
But eventually, you might actually care about these kinds of values, especially if you are using a database and starting to collect lots and lots and lots of data and assigning unique numbers to each piece of that data. So we kind of have a problem. And similarly, with floating point values-- floats or doubles-- if you've only got a finite number of bits, how many total numbers could you possibly represent? Well, it's less clear when you involve a decimal point. But it's surely finite. If you have a finite number of bits, a finite number of humans, a finite number of light bulbs, surely you can only represent a finite number of floating point values. But how many real numbers are there in the world? There's an infinite number. So that's kind of a problem because we don't have an infinite amount of memory or RAM inside of our computers. So some challenging things can happen. So let's go ahead and try to express this here. Let me go ahead and open up gedit. I'm going to go ahead and save a file called "floats0.c" just to be consistent with an example that is available online, if you would like. And I'm going to go ahead and define it as follows-- I'm going to go ahead and say, int main void, as we often do. And then in this program, I'm going to declare myself a float, so a 32-bit variable called f, arbitrarily. And then I'm going to store in it, I don't know, one tenth, so 0.1. So I'm going to express that as one divided by 10, which is perfectly legitimate in C. And then on the second line, I simply want to print out that value. So recall that we can use the familiar printf. We don't want to use %i for an int. We want to use %f for a float. And then I'm going to do backslash n, close quote, comma, f, semicolon. So here's my program. There's already one bug. Does someone for whom this clicked already want to point out at least one bug I've made? Yeah? Yeah. 
I forgot "#include <stdio.h>" at the top, the symptom of which if I try to compile this is going to be that the compiler is going to yell at me, saying undefined symbol or something to that effect. It doesn't understand something like printf. So I'm going to do "#include <stdio.h>", save the file. And now it's in better shape. But I'm also going to point out one new detail today. In addition to specifying placeholders like %f, %i, %s, you can sometimes influence the behavior of that placeholder. For instance, in the case of a floating point value, if I only want to display one decimal place after the period, I can actually do %0.1f. So in other words, I separate the f and the percent sign with 0.1, just telling printf, you might have a whole bunch of numbers after the decimal point for me. But I only want to see one of them. So I'm going to go ahead now and save this program, go into my terminal window, and I'm going to go ahead and type make floats0, enter. I see that somewhat cryptic line that will begin to make more sense as we tease it apart this week and next. Now I'm going to go ahead and run floats0. And, damn. So there's another bug here for some reason. I'm pretty sure that one tenth, or one divided by 10, is not 0.0. Maybe I'm just not looking at enough digits. So why don't I say .2 to see two decimal places instead of just one. Let me go back to my terminal window here and hit up a couple of times to see my history. Do make floats0 again, and then up again. And now enter. And now I'm pretty sure this is wrong. And I could do three and four, and I'm probably going to keep seeing zeros. So where is the bug? One divided by 10 should be 0.1. Someone want to take a stab at what the fundamental issue is? Yeah? They're both integers. So what? So with one divided by 10, that's what I do in arithmetic. And I get 0.1. Yeah. And so it is indeed that issue. 
When you take an integer in a computer and you divide it by another integer, the computer by default is going to assume that you want an integer. The problem though, of course, is that 0.1 is not an integer. It's a real number. And so what the computer does by default is it just throws away everything after the decimal point. It doesn't round down or up per se. It just throws away everything after the decimal point. And now that makes sense. Because now we're clearly left with zero. But wait a minute. I'm not seeing an int zero. I'm actually seeing 0.00. So how do I reconcile this now? If one divided by 10 is zero, but I'm seeing 0.00, where is it getting converted back to a real number? Yeah. Exactly. So up here in line five, when I actually store that 0.1, which is then truncated to zero, inside of a float, that's effectively equivalent to storing it not as an int but, indeed, as a float. Moreover, I'm then using printf to explicitly print that number to two decimal places even though there might not actually be any. So this kind of sucks, right? Apparently you can't do math, at least at this level of precision, in a computer. But surely there's a solution. What's the simplest fix we could maybe do, even just intuitively here to solve this? Yeah? Turn the integers into-- yeah. Even if I'm not quite sure what's really going on here, if it fundamentally has to do with these both being ints, well, why don't I make that 10.0, making this 1.0, resave the file. Let me go back down to the bottom and recompile. Let me now rerun. And there-- now, I've got my one tenth represented as 0.10. All right. So that's not bad. And let me point out one other way we could have solved this. Let me actually roll back in time to when we had this as one tenth a moment ago. And let me go ahead and resave this file as a different file name, just to have a little checkpoint. So that was version one. And now let me go ahead and do one more version. We'll call this version two zero indexed. 
And I'm going to instead do this-- you know what? Adding dot zero works in this case. But suppose one were a variable. Suppose 10 were a variable. In other words, suppose that I couldn't just hard-code .0 at the end of this arithmetic expression. Well, I can actually do something in parentheses called casting. I can cast that integer 10 to a float, and I can cast that integer one to a float, as well. Then the math that's going to be done is effectively 1.0 divided by 10.0, the result of which goes in f as before. So if I recompile this as make floats2, and now floats2, I get the same answer, as well. So this is a fairly contrived example, to solve this problem by introducing casting. But in general, casting's going to be a powerful thing, particularly for problem set two in a week's time, when you want to convert one data type to another that at the end of the day are represented in the same way. At the end of the day, every single thing we've talked about thus far is just ints underneath the hood. Or if that's too low-level for you, they're just numbers underneath the hood. Even characters, again, recall from week zero, are numbers underneath the hood. Which is to say, we can convert between different types of numbers if they're just bits. We can convert between numbers and letters if they're just bits, and vice versa. And casting in this way is a mechanism in programming that lets you forcibly change one data type to another. Unfortunately, this isn't as straightforward as I might have liked. I'm going to go back into floats1, which was the simpler, more straightforward one with .0 added on to each. And just as a quick refresher, let me go ahead and recompile this, make floats2-- sorry, this is make floats1. And now let's run floats1. And in the bottom, notice that I indeed get 0.1. So, problem solved. But not yet. I'm now going to get a little curious, and I'm going to go back into my printf statement and say, you know what? 
I'd like to confirm that this is really one tenth. And I'm going to want to see this to, say, five decimal places. It's not a problem. I change the two to a five, I recompile with make. I rerun it as floats1. Looking pretty good. My sanity checks might end there, but I'm getting a little more adventurous. I'm going to change 0.5 to 0.10. I want to see 10 digits after the decimal place. And I'm going to go ahead and recompile this and rerun floats1. I kind of regret having tested this further because my math is not so correct anymore, it seems. But wait a minute, maybe that's just a fluke. Maybe the computer is acting a little bit strange. Let me go ahead and do 20 decimal places and reassure myself that I know how to do math. I know how to program. Make floats1, recompile, and damn it. That is really, really getting far from the mark. So what's going on here? Intuitively, based on our assumptions earlier about the size of data types, what must be happening here underneath the hood? Yeah? Exactly. If you want this much precision, and that's a heck of a lot of precision-- 20 numbers after the decimal point. You can't possibly represent an arbitrary number unless you have an arbitrary number of bits. But we don't. For a float, we only have 32 bits. So if 32 bits can only be permuted in a way-- just like our humans on stage, hands up or down-- in a finite number of ways, there's only a finite number of real numbers you can represent with those bits. And so the computer eventually is going to have to start cutting corners. The computer can hide those details from us for a little bit of time. But if we start poking at the numbers and looking farther and farther at the trailing numbers in the whole number, then we start to see that it's actually approximating the idea of one tenth. And so it turns out, tragically, there's an infinite number of numbers we cannot represent precisely in a computer, at least with a finite number of bits, a finite amount of RAM. 
Now unfortunately, this sometimes has real-world consequences. If people don't quite appreciate this or sort of take for granted the fact that their computer will just do what they tell it to do and don't understand these underlying representation details-- which, frankly, in some languages are hidden from the user, unlike in C-- some bad things can happen. And what I thought we'd do is take a step back. And this is about an eight-minute video. It aired a few years ago, and it gives insights into actually what can go wrong when you under-appreciate these kinds of details in the very all-too real world. If we could dim the lights for a few minutes. SPEAKER 1: We now return to engineering disasters on Modern Marvels. Computers-- we've all come to accept the often frustrating problems that go with them. Bugs, viruses, and software glitches are small prices to pay for the convenience. But in high-tech and high-speed military and space program applications, the smallest problem can be magnified into disaster. On June 4, 1996, scientists prepared to launch an unmanned Ariane 5 rocket. It was carrying scientific satellites designed to establish precisely how the Earth's magnetic field interacts with solar winds. The rocket was built for the European Space Agency and lifted off from its facility on the coast of French Guiana. JACK GANSSLE: At about 37 seconds into the flight, they first noticed something was going wrong. The nozzles were swiveling in a way they really shouldn't. Around 40 seconds into the flight, clearly the vehicle was in trouble. And that's when they made a decision to destroy it. The range safety officer, with tremendous guts, pressed the button, blew up the rocket before it could become a hazard to public safety. SPEAKER 1: This was the maiden voyage of the Ariane 5, and its destruction took place because of a flaw embedded in the rocket's software. JACK GANSSLE: The problem on the Ariane was that there was a number that required 64 bits to express. 
And they wanted to convert to a 16-bit number. They assumed that the number was never going to be very big, that most of those digits in the 64-bit number were zeros. They were wrong. SPEAKER 1: The inability of one software program to accept the kind of number generated by another was at the root of the failure. Software development had become a very costly part of new technology. The Ariane 4 rocket had been very successful, so much of the software created for it was also used in the Ariane 5. PHILIP COYLE: The basic problem was that the Ariane 5 was faster, accelerated faster. And the software hadn't accounted for that. SPEAKER 1: The destruction of the rocket was a huge financial disaster, all due to a minute software error. But this wasn't the first time data conversion problems had plagued modern rocket technology. JACK GANSSLE: In 1991, with the start of the first Gulf War, the Patriot missile experienced a similar kind of a number conversion problem. As a result, 28 American soldiers were killed and about 100 others wounded when the Patriot, which was supposed to protect against incoming Scuds, failed to fire a missile. SPEAKER 1: When Iraq invaded Kuwait and America launched Desert Storm in early 1991, Patriot missile batteries were deployed to protect Saudi Arabia and Israel from Iraqi Scud missile attacks. The Patriot is a US medium-range surface-to-air system manufactured by the Raytheon company. THEODORE POSTOL: The size of the Patriot interceptor itself is roughly 20-feet long. And it weighs about 2000 pounds. And it carries a warhead of about-- I think it's roughly 150 pounds. And the warhead itself is a high explosive which has fragments around it. The casing of the warhead is designed to act like buckshot. SPEAKER 1: The missiles are carried four per container and are transported by a semi trailer. PHILIP COYLE: The Patriot anti-missile system goes back at least 20 years now. 
It was originally designed as an air defense missile to shoot down enemy airplanes. In the first Gulf War, when that war came along, the Army wanted to use it to shoot down Scuds, not airplanes. The Iraqi air force was not so much of a problem. But the Army was worried about Scuds. And so they tried to upgrade the Patriot. SPEAKER 1: Intercepting an enemy missile traveling at Mach five was going to be challenging enough. But when the Patriot was rushed into service, the Army was not aware of an Iraqi modification that made their Scuds nearly impossible to hit. THEODORE POSTOL: What happened is the Scuds that were coming in were unstable. They were wobbling. The reason for this was the Iraqis, in order to get 600 kilometers out of a 300-kilometer-range missile, took weight out of the front warhead. They made the warhead lighter. So now the Patriot's trying to come at the Scud. And most of the time, the overwhelming majority of the time, it would just fly by the Scud. SPEAKER 1: Once the Patriot system operators realized the Patriot missed its target, they detonated the Patriot's warhead to avoid possible casualties if it was allowed to fall to the ground. THEODORE POSTOL: That was what most people saw as big fireballs in the sky and misunderstood as intercepts of Scud warheads. SPEAKER 1: Although in the night skies Patriots appeared to be successfully destroying Scuds, at Dhahran there could be no mistake about its performance. There, the Patriot's radar system lost track of an incoming Scud and never launched due to a software flaw. It was the Israelis who first discovered that the longer the system was on, the greater the time discrepancy became due to a clock embedded in the system's computer. JACK GANSSLE: About two weeks before the tragedy in Dhahran, the Israelis reported to the Defense Department that the system was losing time. After about eight hours of running, they noticed that the system is becoming noticeably less accurate. 
The Defense Department responded by telling all of the Patriot batteries to not leave the systems on for a long time. They never said what a long time was. Eight hours? 10 hours? 1,000 hours? Nobody knew. SPEAKER 1: The Patriot battery stationed at the barracks at Dhahran and its flawed internal clock had been on over 100 hours on the night of February 25th. JACK GANSSLE: It tracked time to an accuracy of about a tenth of a second. Now, a tenth of a second is an interesting number because it can't be expressed in binary exactly, which means it can't be expressed exactly in any modern digital computer. It's hard to believe, but use this as an example. Let's take the number one third. One third cannot be expressed in decimal exactly. One third is 0.333 going on for infinity. There's no way to do that with absolute accuracy in decimal. That's exactly the same kind of problem that happened in the Patriot. The longer the system ran, the worse the time error became. SPEAKER 1: After 100 hours of operation, the error in time was only about one third of a second. But in terms of targeting a missile traveling at Mach five, it resulted in a tracking error of over 600 meters. It would be a fatal error for the soldiers at Dhahran. THEODORE POSTOL: What happened is a Scud launch was detected by early warning satellites. And they knew that the Scud was coming in their general direction. They didn't know where it was coming. SPEAKER 1: It was now up to the radar component of the Patriot system defending Dhahran to locate and keep track of the incoming enemy missile. JACK GANSSLE: The radar was very smart. It would actually track the position of the Scud and then predict where it probably would be the next time the radar sent a pulse out. That was called the range gate. THEODORE POSTOL: Then once the Patriot decides enough time has passed to go back and check the next location for this detected object, it goes back. So when it went back to the wrong place, it then sees no object. 
And it decides that there was no object, it was a false detection, and drops the track. SPEAKER 1: The incoming Scud disappeared from the radar screen, and seconds later it slammed into the barracks. The Scud killed 28 and was the last one fired during the first Gulf War. Tragically, the updated software arrived at Dhahran the following day. The software flaw had been fixed, closing one chapter in the troubled history of the Patriot missile. Patriot is actually an acronym for Phased Array TRacking Intercept Of Target. DAVID J MALAN: All right, so a sobering example, to be sure. And fortunately, these lower level bugs are not something that we'll typically have to appreciate, certainly not with some of our earliest of programs. Rather, most of the bugs you'll encounter will be logical in nature, syntactic in nature whereby the code just doesn't work right. And you know it pretty fast. But particularly when we get to the end of the semester, it's going to become more and more of a possibility to really think hard about the design of your programs and the underlying representation there, too, of the data. For instance, we'll introduce MySQL, which is a popular database engine that you can use with websites to store data on the back end. And you'll have to start to decide at the end of the semester not only what types of data along these lines to use but exactly how many bits to use, whether or not you want to store dates as dates and times as times, and also things like how big do you want the unique IDs to be for, say, the users in your database. In fact, if some of you have had Facebook accounts for quite some time, and you know how to get access to your User ID-- which sometimes shows up in your profile's URL unless you've chosen a nickname for the URL, or if you've used Facebook's Graph API, the publicly available API by which you can ask Facebook for raw data-- you can see what your numeric ID is. 
And some years ago, Facebook essentially had to change from using the equivalent of ints to using long long because over time as users come and go and create lots of accounts and fake accounts, even they very easily were able to exhaust something like the 4 billion possible values of an int. So more on those kinds of issues down the road, as well. All right, so that was casting. That was imprecision. A couple of quick announcements. So sections formally begin this coming Sunday, Monday, Tuesday. You'll hear via email later this week as to your section assignment. And you'll also hear at that point how to change your section if your schedule has now changed or your comfort level has now changed. Meanwhile P-set one and hacker one are due this Thursday with the option to extend that deadline per the specifications to Friday in a typical way. Realize that included with the problem set specifications are instructions on how to use the CS50 appliance, make, as well as some CS50 specific tools like style50, which can provide you with feedback dynamically on the quality of your code style and also check50, which can provide you with dynamic feedback as to your code's correctness. Forgive that we're still ironing out a few kinks with check50. A few of your classmates who did start around four AM on Friday night when the spec went up have noticed since then a few bugs that we are working through, and apologies for anyone who has experienced undue frustrations. The fault is mine. But we'll follow up on CS50 Discuss when that is resolved. So a word on scores themselves. So it'll be a week or two before you start to get feedback on problem sets because you don't yet have a teaching fellow. And even then, we will start to evaluate the C problem sets before we go back and evaluate Scratch so that you get more relevant feedback more quickly. 
But in general per the syllabus, CS50 problem sets are evaluated along the following four axes-- scope, correctness, design, and style. Scope is going to be a number typically between zero and five that captures how much of the piece that you bit off. Typically, you want this to be five. You at least tried everything. And notice it's a multiplicative factor so that doing only part of the problem set is not the best strategy. Meanwhile, more obvious is the importance of correctness-- just is your program correct with respect to the specification? This is weighted deliberately more heavily than the other two axes by a factor of three because we recognize that typically you're going to spend a lot more time chasing down some bugs, getting your code to work, than you are indenting it and choosing appropriate variable names and the like, which is on the other end of the spectrum of style. That's not to say style is not important, and we'll preach it over time both in lectures and in sections. Style refers to the aesthetics of your code. Have you chosen well-named variables that are short but somewhat descriptive? Is your code indented as you've seen in lecture and in a manner consistent with style50? Lastly is design right there in the middle. Design is the harder one to put a finger on because it's much more subjective. But it's perhaps the most important of the three axes in terms of pedagogical value over time in that this will be the teaching fellow's opportunity to provide you with qualitative feedback. Indeed, in CS50 even though we do have these formulas and scores, at the end of the day these are very deliberately very small buckets-- point values between zero and three and zero and five. We don't try to draw very coarse lines between problem sets or between students but rather focus as much as we can on qualitative, longhand feedback, either typed or verbal from your particular teaching fellow, whom you'll get to know quite well. 
But in general, those are the weights that the various axes will have. Meanwhile, too, it's worth keeping in mind that you should not assume that a three out of five is a 60% and therefore roughly failing. Three is deliberately meant to be sort of middle of the road good. If you're getting threes at the beginning of the semester, that's indeed meant to be a good place to begin. If you're getting twos, fairs, there's definitely some work to do, to pay a little more attention, to take advantage of sections and office hours. If you're getting fours and fives, great. But really, we hope to see trajectories among students-- very individualized per student, but starting the semester here in sort of the two to the three range but ending up here in the four to five range. That's what we're really looking for. And we do keep in mind the delta that you exhibit between week zero and week 12 when I'm doing grades. It doesn't matter to us absolutely how you fare at the beginning if your trajectory is indeed upward and strong. Academic honesty-- so let me put on my more serious voice for just a moment. So this course has the distinction of sending more students than any other in history to the ad board, I believe. We have sort of lost count at this point of how often this happens. And that's not because students in 50 are any more dishonest than their classmates elsewhere. But realize, too, that we are very good at detecting this sort of thing. And that is the advantage that a computer science class has in that we can and we do compare all students' problem sets pair-wise against every other, not only this year but all prior years. We have the ability, like students in the class, to Google and to find code on sites like GitHub and discussion forums. There are absolutely solutions to CS50's p-sets floating around there. But if you can find them, we can find them. And all of this is very much automated and easy and sad for us to find. 
But I want to emphasize, too, that the course's academic honesty policy is very much meant to be very much the opposite of that spirit. Indeed, this year we've rephrased things in the syllabus to be this, dot dot dot, with more detail in the syllabus. But the overarching theme in the course really is to be reasonable. We recognize that there is a significant amount of pedagogical value in collaborating, to some extent, with classmates, whereby you two or you three or you more are standing at a white board whiteboarding, so to speak, your ideas-- writing out pseudocode in pictures, diagramming what should Mario be if you were to write it first in pseudocode. What should the greedy algorithm-- how should it behave per problem set one? And so realize that behavior that we encourage is very much along those lines. And in the syllabus, you'll see a whole bunch of bullets under a reasonable category and a not reasonable category that helps us help you wrap your mind around where we do draw that line. And in general, a decent rule of thumb is that if you are struggling to solve some bug and your friend or classmate is sitting next to you, it is reasonable for you to show him or her your code and say, hey, can you help me figure out what's going wrong here? We don't typically embrace the opposite side. It is not a correct response for your friend or classmate here to say, oh, just look at mine and figure it out from that. That is sort of unreasonable. But having someone else, another brain, another pair of eyes look at your screen or look at your code and say, are you sure you want to have a loop here? Or are you sure you want that semicolon here? Or oh, that error message means this. Those are very reasonable and encouraged behaviors. The cases to which I was alluding earlier boil down to when students are late at night making poor judgment decisions and emailing their code to someone else or just saying, here, it's in Dropbox or Googling late at night. 
And so I would encourage and beg of you, if you do have those inevitable moments of stress, you're bumping up against the deadline, you have no late day since it's already Friday at that point, email the course's heads or myself directly. Say, listen, I'm at my breaking point here. Let's have a conversation and figure it out. Resorting to the web or some other not reasonable behavior is never the solution, and too many of your classmates are no longer here on campus because of that poor judgment. But it's very easy to skirt that line. And here is a little picture to cheer you up from Reddit so that now everything will be OK. So a quick recap, then, of where we left off. So last week, recall that we introduced conditions, not in Scratch but in C this time. And there was some new syntax but really no new ideas per se. We had Boolean expressions that we could or together with two vertical bars or and together with two ampersands, saying that both the left and the right must be true for this to execute. Then we had switches, which we looked at briefly, but I propose are really just different syntax for achieving the same kind of goal if you know in advance what your cases are going to be. We looked at loops. A for loop is maybe the most common, or at least the one that people typically reach for instinctively. Even though it looks a little cryptic, you'll see many, many examples of this before long, as you have already late last week. While loops can similarly achieve the same thing. But if you want to do any incrementation or updating of variables' values, you have to do it more manually than the for loop allows. And then there's the do-while loop, which allows us to do something at least once while something else is true. And this is particularly good for programs or for games where you want to prompt the user for something at least once. And then if he or she doesn't cooperate, you might want to do it again and again. 
With variables, meanwhile, we had lines of code like this, which could be two lines. You could declare an int called counter, semicolon. Or you can just declare and define it, so to speak. Give it a value at the same time. And then lastly, we talked about functions. And this was a nice example in the sense that it illustrates two types of functions. One is GetString(), which, again, gets a string from the user. But GetString() is kind of interesting, so far as we've used it, because we've always used it with something on the left-hand side of an equal sign. That is to say that GetString() returns a value. It returns, of course, a string. And then on the left-hand side, we're simply saving that string inside of a variable called name. This is different, in a sense, from printf because printf, at least in our usage here, does not return anything. As an aside, it does return something. We just don't care what it is. But it does have what's called a side effect. And what is that side effect in every case we've seen thus far? What does printf do? It prints something to the screen, displays text or numbers or something on the screen. And that's just considered a side effect because it's not really handing it back to me. It's not an answer inside of a black box that I can then reach into and grab. It's just doing it on its own, much like Colton was plugged into this black box last week, and he somehow magically was drawing on the board without me actually involved. That would be a side effect. But if I actually had to reach back in here and say, oh, here is the string from the user, that would be a return value. And thus far we've only used functions that other people have written. But we can actually do these kinds of things ourselves. So I'm going to go into the CS50 appliance again. Let me close the tab that we had open a moment ago. And let me go ahead and create a new file. And I'm going to go ahead and call this one positive.c. 
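The difference between a return value and a side effect can be sketched in a few lines of plain C. The names below (twice, greet, greet_and_count) are hypothetical and not from the lecture. The last function also illustrates the aside that printf does return something, namely the number of characters it printed.

```c
#include <stdio.h>

// Returns a value: the caller gets an answer back and can store it in a variable.
int twice(int x)
{
    return 2 * x;
}

// Side effect only: text appears on the screen and nothing is handed back.
void greet(const char *name)
{
    printf("hello, %s\n", name);
}

// printf does return something (the number of characters printed);
// most code simply ignores it, but here it is passed along to the caller.
int greet_and_count(const char *name)
{
    return printf("hello, %s\n", name);
}
```

A caller can write int answer = twice(21); and use answer afterward, whereas greet("David") leaves nothing to store, only text on the screen.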
So I want to do something with positive numbers here. So I'm going to go ahead and do int-- sorry-- #include <stdio.h>. Let's not make that same mistake as before. Then int main(void), open curly brace, close curly brace. And now I want to do the following. I want to write a program that insists that the user gives me a positive integer. So there is no GetPositiveInt function in the CS50 library. There's only GetInt(). But that's OK because I have the constructs with which I can impose a little more constraint on that value. I could do something like this. So int n-- and if you're typing along, just realize I'm going to go back and change some things in a moment-- so int n equals GetInt(). And that's going to put an int inside of n. And let me be a little more descriptive. Let me say something like I demand that you give me a positive integer. All right. So just a little bit of instruction. And now what can I do? Well, I already know from my simple conditions or branches, just like I had in Scratch, I could say something like if n is less than or equal to zero, then I want to print something like, that is not positive. And then I could do-- OK, but I really want to get that int. So I could go up here and I could kind of copy this and indent this. And then, OK. So if n is less than or equal to zero do this. Now, what if the user doesn't cooperate? Well, then I'm going to borrow this here. And then I go in here and here and here. So this is clearly not the solution, right? Because there's no end in sight. If I want to demand that the user gives me a positive integer, I can actually get the int. I can then check for that int. But then I want to check it again and check it again and check it again. So obviously, what's the better construct to be using here? All right, so some kind of loop. So I'm going to get rid of almost all of this. And I want to get this int at least once. So I'm going to say do-- and I'll come back to the while in just a moment-- now, do what?
I'm going to do int n gets GetInt(). OK. So that's pretty good. And now how often do I want to do this? Let me put the printf inside of the loop so I can demand again and again, if need be. And what do I want this while condition to do? I want to keep doing this while what is the case? Yeah, while n is less than or equal to zero. So already, we've significantly cleaned this code up. We've borrowed a very simple construct-- the do-while loop. I've stolen just the important lines of code that I started copying and pasting, which was not wise. And so now I'm going to actually paste it in here and just do it once. And now what do I want to do at the very end of this program? I'll just say something simple like, thanks for the-- and I'll do %i for int-- backslash n, comma, and then plug in n, semicolon. All right. So let's see what happens now when I run this program. I'm going to go ahead and do make positive. Damn. A few errors. So let me scroll back up to the first. Don't work through them backwards. Work through them from the top down, because errors can cascade and it may be that only one thing is wrong. Implicit declaration of function GetInt(). Yeah. So it wasn't enough. I kind of made the same mistake but a little different this time. I need to not only include stdio.h but also cs50.h, which includes the so-called declaration of GetInt(), which teaches the appliance, or teaches C, what GetInt() is. So let me resave. I'm going to ignore the other errors because I'm going to hope that they're somehow related to the error I already fixed. So let me go ahead and recompile with make positive, Enter. Damn. Three errors, still. Let me scroll up to the first. Unused variable n. We've not seen this before. And this, too, is a little cryptic. This is the output of the compiler. And what that highlighted line there-- positive.c:9:13-- is saying is that on line nine of positive.c, at the 13th character, 13th column, you made this mistake. And in particular, it's telling me unused variable n.
So let's see-- line nine. I'm using n in the sense that I'm giving it a value. But what the compiler doesn't like is that I'm seemingly not using it. But wait a minute, I am using it. In line 11, I'm using it here. But if I scroll down further at positive.c:11-- so at line 11, character 12, the compiler's telling me, use of undeclared identifier n. So undeclared means I have not specified it as a variable with a data type. But wait a minute. I did exactly that in line nine. So someone is really confused here. It's either me or the compiler because in line nine, again, I'm declaring an int n, and I'm assigning it the return value of GetInt(). Then I'm using that variable n in line 11 and checking if its value is less than or equal to zero. But this apparently is bad and broken. Why? Say it again? Ah, I have to declare n before entering the loop. But why? I mean, we just proposed a bit ago that it's fine to declare variables all on one line and then assign them some value. A global variable-- let's come back to that idea in just a moment. Why do you want me to put it outside of the loop? It is. Exactly. So, albeit somewhat counterintuitively, let me summarize. When you declare n inside of the do block there-- specifically inside of those curly braces-- that variable n has what's called a scope-- unrelated to our scoring system in the course-- but has a scope that's limited to those curly braces. In other words, typically if you declare a variable inside a set of curly braces, that variable only exists inside of those curly braces. So by that logic alone, even though I've declared n in line nine, it essentially disappears from scope, disappears from memory, so to speak, by the time I hit line 11. Because line 11, unfortunately, is outside of those curly braces. So I unfortunately can't fix this by going back to what I did before. You might at first do this. But what are you now not doing cyclically? You're obviously not getting the int cyclically.
So we can leave the GetInt(), and we should leave the GetInt() inside the loop because that's what we want to pester the user for again and again. But it does suffice to go up to line, say, six. Int n, semicolon. Don't give it a value yet because you don't need to just yet. But now down here, notice-- this would be a very easy mistake. I don't want to shadow my previous declaration of n. I want to use the n that actually exists. And so now in line 10, I assign n a value. But in line six, I declare n. And so can I or can I not use it in line 12 now? I can because between which curly braces is n declared now? The one up here on line five, through the one here on line 14. So if I now zoom out, save this file, go back in and run make positive, it compiled this time. So that's already progress. Dot, slash, positive, Enter. I demand that you give me a positive integer. Negative 1. Negative 2. Negative 3. Zero. One. And thanks for the 1 is what's now printed. Let me try something else, out of curiosity. I'm being told to input an integer. But what if I instead type in lamb? So you now see a different prompt-- retry. But nowhere in my code did I write retry. So where, presumably, is this retry prompt coming from, would you say? Yeah, from GetInt() itself. So one of the things CS50's staff does for you, at least in these first few weeks, is we have written some amount of error checking to ensure that if you call GetInt(), you will at least get back an int from the user. You won't get a string. You won't get a char. You won't get something else altogether. You'll get an int. Now, it might not be positive. It might not be negative. We make no guarantees around that. But we will pester the user to retry, retry, retry until he or she actually cooperates. Similarly, if I do 1.23, that is not an int. But if I do type in, say, 50, that gives me a value that I wanted. All right. So not bad. Any questions on what we've just done?
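The corrected loop can be sketched as follows. This is an approximation under stated assumptions rather than the file shown in lecture: so that it compiles without cs50.h or a keyboard, the CS50 library's GetInt() is replaced by a hypothetical stand-in, fake_get_int, which replays the same inputs typed above (-1, -2, -3, 0, 1). The function name get_positive_int is likewise made up for illustration.

```c
#include <stdio.h>

// Hypothetical stand-in for the CS50 library's GetInt(): replays a canned
// list of "user inputs" so the sketch runs without cs50.h or a keyboard.
int fake_get_int(void)
{
    static const int attempts[] = {-1, -2, -3, 0, 1};
    static int next = 0;
    const int count = (int) (sizeof(attempts) / sizeof(attempts[0]));

    int value = attempts[next];
    if (next < count - 1)
    {
        next++;  // once the list is exhausted, keep returning the last value
    }
    return value;
}

int get_positive_int(void)
{
    int n;  // declared outside the loop's braces, so the while condition can see it
    do
    {
        printf("I demand that you give me a positive integer: ");
        n = fake_get_int();  // assign only; writing "int n = ..." here would
                             // declare a second n scoped to these braces
    }
    while (n <= 0);  // still inside the function's braces, so n is in scope
    return n;
}
```

Calling get_positive_int() from main and printing the result with printf("thanks for the %i\n", n) would mirror the session above: four rejected values, then the 1 is accepted.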
The key takeaway being, to be clear, not so much the loop, which we've seen before even though we haven't really used it, but the issue of scope, where variables can only be used within some specified scope. All right, let me address the suggestion you made earlier, that of a global variable. As an aside, it turns out that another solution to this problem, but typically an incorrect solution or a poorly designed solution, is to declare your variable as what's called a global variable, outside of main altogether. Now I'm kind of violating my definition of scope because there are no curly braces at the very top and the very bottom of a file. But the implication of that is that now in line four, n is a global variable. And as the name implies, it's just accessible everywhere. Scratch actually has these. If you used a variable, you might recall you had to choose if it's for this sprite or for all sprites. Well, all sprites is just the clearer way of saying global. Yeah? Ah, really good question. So recall that in the very first version of my code, when I incorrectly declared and defined n in line nine-- I declared it as a variable and I gave it a value with the assignment operator-- this gave me two errors. One, the fact that n wasn't used, and two, that in line 11 it just wasn't declared. So the first one I didn't address at the time. It is not strictly an error to declare a variable but not use it. But one of the things we've done in the CS50 appliance, deliberately, pedagogically, is we've cranked up the expectations of the compiler to make sure that you're doing things not just correctly but really correctly. Because if you're declaring a variable like n and never using it, or not using it correctly, then what is it doing there? It truly serves no purpose. And it's very easy over time, if you don't configure your own computer in this way, to just have code that has little remnants here, remnants there.
And then months later you look back and you're like, why is this line of code there? And if there's no good reason, it doesn't benefit you or your colleagues down the road to have to stumble over it then. As an aside, where is that coming from? Well, recall that every time we compile a program, all of this stuff is being printed. So we'll come back to this. But again, make is a utility that automates the process of compiling by running the actual compiler, called clang. This thing, we'll eventually see, has to do with debugging with a special program called the debugger. This has to do with optimizing the code-- more on that in the future. -std=c99-- this just means use the 1999 version of C. C's been around even longer than that, but they made some nice changes 10 plus years ago. And here are the relevant ones. With -Werror, we are saying make anything that previously would have been a warning an error, preventing the student from compiling. And -Wall means do that for a whole bunch of things, not just related to variables. And then let me scroll to the end of this line. And this, too, we'll eventually come back to. This is obviously the name of the file I'm compiling. This, recall, is the name of the file I'm outputting as the name of my runnable program. This -lcs50 just means use the CS50 library, and any zeros and ones that the staff wrote and compiled earlier this year, integrate them into my program. And anyone know what -lm is? It's the math library, which is just there even if you're not doing any math. It's just automatically provided to us by make. Well, let me do one other example here by opening up a new file. And let me save this one as string.c. It turns out that as we talk about data types today, there's even more going on underneath the hood than we've seen thus far. So let me quickly write a program. Include stdio.h. And I'll save that. And you know, let me not make the same mistake again and again. Include cs50.h. And let me go ahead now and do int main(void).
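Before moving on to string.c, the global-variable suggestion and the unused-variable complaint discussed above can both be seen in one small sketch. This is an illustration only: the names counter, increment, and read_counter are hypothetical, and the offending declaration is left in a comment so that the file still compiles when warnings are treated as errors.

```c
// Global: declared outside every pair of curly braces, so every function
// below can see it without declaring it again.
int counter = 0;

void increment(void)
{
    counter++;  // refers to the global counter
}

int read_counter(void)
{
    // int leftover;  // declared but never used. The program would behave the
    //                // same, but a build using clang's -Wall and -Werror flags
    //                // would stop here with "unused variable" as an error.
    return counter;
}
```

Convenient as it looks, a global like counter can be changed by any function in the file, which is part of why the lecture calls it a typically poorly designed solution.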
And now I simply want to do a program that does this-- declare a string called s and get a string from the user. And let me do a little instructions here-- please give me a string-- so the user knows what to do. And then down here below this, I want to do the following-- for int i gets zero. Again, computer scientists typically start counting at zero, but we could make that one if we really wanted. Now I'm going to do i is less than the string length of s. So strlen-- S-T-R-L-E-N-- again, it's concise because it's easier to type, even though it's a little cryptic. That is a function we've not used before but literally does that-- return to me a number that represents the length of the string that the user typed. If they typed in hello, it would return five because there's five letters in hello. Then, on each iteration of this loop, i plus plus. So again, a standard construct even if you're not quite too comfortable or familiar with it yet. But now on each iteration of this loop, notice what I'm going to do. I want to go ahead and print out a single character-- so %c backslash n on a new line. And then, you know what I want to do? Whatever the word is that the user types in, like hello, I want to print H-E-L-L-O, one character per line. In other words, I want to get at the individual characters in a string, whereby up until now a string has just been a sequence of characters. And it turns out I can do s, bracket, i, close bracket, close parenthesis, semicolon. And I do have to do one more thing. It's in a file called string.h that strlen is declared. So if I want to use that function, I need to tell the compiler, expect to use it. Now let me go ahead and make the program called string. Dot, slash, string. Please give me a string. I'll go ahead and type it. Hello, in all caps, Enter. And now notice I've printed this one character after the other. 
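The loop just described can be sketched in plain C. Two assumptions to flag: the string arrives as a function parameter rather than from the CS50 library's GetString(), and the parameter is written as const char *, plain C's spelling of what the CS50 library calls string. The function name print_chars and its returned count are hypothetical additions for illustration.

```c
#include <stdio.h>
#include <string.h>

// Prints s one character per line and returns how many characters were printed.
int print_chars(const char *s)
{
    int length = (int) strlen(s);  // strlen is declared in string.h
    int printed = 0;
    for (int i = 0; i < length; i++)
    {
        printf("%c\n", s[i]);  // s[i] is the i-th character, counting from zero
        printed++;
    }
    return printed;
}
```

Passing "HELLO" prints H, E, L, L, and O on five separate lines, matching the run shown above.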
So the new detail here is that a string, at the end of the day, can be accessed by way of its individual characters by introducing the square bracket notation. And that's because a string underneath the hood is indeed a sequence of characters. But what's neat about them is in your computer's RAM-- Mac, PC, whatever it is-- they're literally back to back to back-- H-E-L-L-O-- at individual, adjacent bytes in memory. So if you want to get at each such byte, which in this loop would be bracket zero, bracket one, bracket two, bracket three, bracket four-- that's zero-indexed, up to but not including five-- that will print out H-E-L-L-O, each letter on its own line. Now, as a teaser, let me show you the sorts of things you'll eventually be able to understand, at least with some close looking. For one, what we included in today's examples, if you'd like, is actually one of the very first jailbreaks for the iPhone. Jailbreaking means cracking the phone so you can actually use it on a different carrier or install your own software. And you'll notice this looks completely cryptic, most likely. But look at this. The iPhone was apparently cracked with a for loop, an if condition, an else condition, a bunch of functions we've not seen. And again, you won't at first glance probably understand how this is working. But everything that we sort of take for granted in our modern lives actually tends to reduce even to some of these fundamentals we've been looking at. Let me go ahead and open one other program, holloway.c. So this, too, is something you shouldn't really know. Not even the staff or I could probably figure this out by looking at it because this was someone's code that was submitted to what's historically known as an obfuscated C contest, where you write a program that compiles and runs but is so damn cryptic no human can understand what it's going to do until they actually run it. So indeed, if you look at this code, I see a switch. I see main.
I see these square brackets implying some kind of an array. Does anyone want to guess what this program actually does if I run Holloway? Yes. OK. Well done. So only the staff and I cannot figure out what these things do. And now lastly, let me go ahead and open up one other program. This one-- again, we'll make the source code available online-- this one's just kind of pretty to look at. All they did is hit the space bar quite a bit. But this is real code. So if you think that's pretty, if we actually run this at the prompt, eventually you'll see how we might do things like this. So we'll leave you on that note and see you on Wednesday. [MUSIC PLAYING] SPEAKER 2: At the next CS50, the TFs stage a mutiny. SPEAKER 3: There he is. Get him! [MUSIC PLAYING]