## Week 1 Wednesday

Andrew Sellergren

### Announcements and Demos (0:00--3:00)

+ This is CS50.
+ Study cards are in and the total enrollment of CS50 is 745! Though we gather twice a week as a whole class in Sanders, we hope that your CS50 experience will be defined just as much by more intimate gatherings in section, office hours, and elsewhere.
+ Sections begin this Sunday and are held on Sundays, Mondays, and Tuesdays. You should have your section assignments by Saturday.
+ Office Hours are Mondays through Thursdays, 8 p.m. to 11 p.m. in Annenberg Hall. Problem sets are due on Thursdays at noon, but you have up to 5 late days to spend, so Office Hours on Thursday may still prove useful. You'll probably want to hang on to those late days for later, more challenging problem sets.
+ Problem Set 1 will be released on Friday, when the Walkthrough will also be held (2:30 p.m. in Harvard Hall 104).
+ We'll be manning [CS50 Discuss](http://cs50.net/discuss) during lecture, so if you're uncomfortable raising your hand in Sanders, try posting your question there with the label "lecture."

### From Last Time (5:00--34:00)

+ We discovered that 1 / 10 is not actually 0.1. Well, at least it's not stored as exactly 0.1 by a computer. With a finite number of bits, there's only a finite number of numbers we can represent.
+ Generally when we write numbers, we omit leading zeroes. Hence, we write 123, not 00123. However, when we write numbers in binary, we'll adopt the convention of including those leading zeroes to fill out 8 places to emphasize that 8 bits, or 1 byte, is a somewhat standard unit of measure. Thus, we write 0, 1, and 2 as follows:

        00000000
        00000001
        00000010

+ The protagonists in the movie [Office Space](http://www.youtube.com/watch?v=G_wiXgRWrIU) take advantage of floating point imprecision to rip off their company Initech. Consider that if banking software stores a number like 0.1 improperly, it could mean that there are fractions of a cent gained or lost.
  If you haven't seen Office Space, that's your homework for the weekend.
+ Several more engineering disasters have occurred as a result of floating point imprecision, as [this clip](http://www.youtube.com/watch?v=EMVBLg2MrLs) from Monday describes.
+ The CS50 Appliance is free software that allows you to run an instance of another operating system (Linux) within your own computer's native operating system. To run the Appliance, you also need a *hypervisor* like VMware, VirtualBox, or Parallels. The [specification](http://cdn.cs50.net/2012/fall/psets/1/pset1.pdf) for Problem Set 1 contains instructions for installing a hypervisor and the Appliance.
+ C is a programming language which is less graphical but perhaps more powerful than Scratch. We write source code in C in any text editor like Notepad, TextEdit, or gedit. In order to execute the programs we write, we need to compile our source code, or translate it to 0's and 1's. These 0's and 1's are interpreted by the CPU inside a computer as instructions.
+ To open gedit on the Appliance, we can go to Menu in the bottom left of the desktop, select Accessories, and choose gedit from the submenu that appears. Alternatively, we can click the icon directly next to the Menu in the bottom left of the desktop.
+ Within the gedit window, the top portion is a normal text editor and the bottom portion is a terminal window for executing commands like `ls` for list, `cd` for change directory, and `rm` for remove. When we click File → Save in gedit, we can choose the `jharvard` directory (our user's home directory) and name our file `hello.c`. Then we can write:

        int main(void)
        {
            printf("hello, world!");
        }

+ In order to use the `printf` function, though, we need to include the standard library header file that declares it:

        #include <stdio.h>

        int main(void)
        {
            printf("hello, world!");
        }

+ gedit highlights matching opening and closing brackets and curly braces when your cursor is near them.
+ Good practice is also to include an explicit return statement:

        #include <stdio.h>

        int main(void)
        {
            printf("hello, world!");
            return 0;
        }

+ Returning 0 means that all went well. In general, this number will not be visible to the user who runs the program unless something goes wrong and a non-zero return value is reported.
+ To make our program more aesthetically pleasing, we should probably include a newline character at the end of our printed line:

        #include <stdio.h>

        int main(void)
        {
            printf("hello, world!\n");
            return 0;
        }

+ Now, we turn our attention to the bottom portion of the gedit window, the terminal. Here we have our command prompt with `jharvard@appliance (~)` telling us that we are acting as user `jharvard` and we're currently in our home directory, denoted by the `~`. Here we can type `make hello`, which actually runs a compiler program named `clang` under the hood. Now we've created a file named `hello` in our home directory. We can run this by typing `./hello` and hitting Enter.
+ We can step up this program's complexity a notch by getting input from the user:

        #include <stdio.h>

        int main(void)
        {
            string s = GetString();
            printf("hello, world!\n");
            return 0;
        }

+ The `GetString` function is one written by the staff to obtain input from the user in the form of a string. To use it, we need to tell the compiler where to find its declaration:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            string s = GetString();
            printf("hello, world!\n");
            return 0;
        }

+ Getting input from the user is actually not that trivial in C, especially when you don't know how much data to expect. Thus, we've handled it for you so that you can focus on more interesting tasks.
+ Providing far too much input to a program is one method that hackers use to expose potential security flaws.
+ `GetString` takes 0 arguments, or inputs which alter a function's default behavior.
  The equals sign above doesn't stand for equality, but rather assignment: it says to take whatever's on the right (the output of `GetString`) and store it in whatever's on the left (`s`). Writing `string s` is known as *declaring* `s` as a variable, or allocating space in RAM to store it. If we want to print out the user's input instead of a hardcoded string, we need to pass `s` as an argument to `printf`:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            string s = GetString();
            printf("hello, %s!\n", s);
            return 0;
        }

+ The two arguments to `printf` are separated by a comma. `printf` knows to take the second argument and insert it into the first argument, in place of the `%s`, before printing out the whole string.
+ When we compile and run this program, we get a blank line and a blinking cursor. The program waits for us to type something and hit Enter. To make it a little more obvious that this is what the program is waiting for, we can print out a line giving instructions to the user:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            printf("Enter a string: ");
            string s = GetString();
            printf("hello, %s!\n", s);
            return 0;
        }

+ When we rerun our program, nothing appears to have changed. Oops, we forgot to recompile. When we recompile and rerun, we get a more user-friendly prompt for input.
+ Let's start thinking about corner cases. What if we type in a number? Our program seems to handle that okay. What if we type in a really big number or string? That, too, is handled well by our program (although technically if we typed in 5 billion or so characters we could overload it). As we'll see in a few weeks, that's because `GetString` is written to allocate more and more RAM as it's needed. What if we enter nothing? We can simulate this by hitting Enter without typing anything. Our program seems to work fine, but the output is a little silly, so maybe we should actually handle this corner case in our program explicitly. What if we enter a decimal number? Seems fine.
  As it turns out, even though we're typing numbers, our input is ultimately being processed by this program as a string.
+ Other primitive types in C include `char`, `float`, `double`, `long`, `long long`, and `int`. A `char` is a single character. A `float` is a floating-point number, i.e. one with a decimal point. An `int` is an integer. A `double` is similar to a `float` except it is stored with 64 bits instead of 32 bits. This means that we can use it to represent either larger numbers or more precise numbers (i.e. numbers with more digits after the decimal point). On the Appliance, a `long`, like an `int`, is stored with 32 bits. A `long long` is stored with 64 bits.
+ Within the CS50 Library, we've defined a few additional variable types, including `bool` and `string`. As in the context of Scratch, a `bool` or boolean variable is one that takes a value of either true or false.

### More Math (34:00--57:00)

+ Let's write a program that asks the user for an integer instead of a string:

        #include <stdio.h>

        int main(void)
        {
            printf("Give me a number: ");
            int n = GetInt();
            printf("Thanks for the %d!\n", n);
            return 0;
        }

+ `GetInt` is another function available to us in the CS50 Library. When we type `make math`, we get an error:

        math.c:6:13: implicit declaration of function 'GetInt' is invalid in C99

+ Oh right, we forgot to include (i.e. explicitly declare) the CS50 Library:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            printf("Give me a number: ");
            int n = GetInt();
            printf("Thanks for the %d!\n", n);
            return 0;
        }

+ What this `#include` actually does is tell `clang` to go and fetch the file named `cs50.h` and paste its code at the top of our file before compiling it.
+ If we want to start being a difficult user again, we can try providing non-integer inputs to `math.c`. If we enter 0.1, we see a prompt that says "Retry." We can hit Enter all we want, but we actually wrote `GetInt` to continue prompting the user until he or she provides an integer. How might we have implemented this continuous prompting?
  Perhaps an infinite loop.
+ Other functions that we implemented in the CS50 Library are as follows:
    + `GetChar`
    + `GetDouble`
    + `GetFloat`
    + `GetInt`
    + `GetLongLong`
    + `GetString`
+ In C, unlike some other programming languages, you have to tell the compiler what type a variable will have. These types provide context to the compiler so that it knows how to interpret the bits (e.g. as a number or as a character).
+ Question: if we take out `#include <stdio.h>`, what happens when we compile? We get an "implicit declaration" error similar to the one we saw when we left out `#include <cs50.h>`. If we take out all references to `printf` as well, then the compiler will throw a new error indicating an unused variable. This is a non-fatal mistake, but we've configured the Appliance to be really annoying when it compiles your code.
+ Another corner case we can try is a really large number. If we type in 1111111111111111111111111111111111111111111111111111111111111, the program outputs "Thanks for the 2147483647!" What's going on here? An `int` is only 32 bits, each of which can be either 0 or 1. That's 2 possibilities for each of 32 places, so we can store 2^32 possible numbers. But, wait, 2^32 is about 4 billion and the number that was printed was only around 2 billion. In fact, the number that was printed is 2^31 - 1. Turns out that although we have 32 bits with which to represent numbers in an `int`, we need at least one of those bits to denote whether the number is positive or negative. Thus, with an `int` you can represent the numbers between -2^31 and 2^31 - 1. When we provide a number larger than that to `GetInt`, it defaults to the maximum.
+ Question: why couldn't we represent the negative sign using a character? You could, but you'll still need at least a bit to do that.
+ Before we move on, we'll examine one more example that will illustrate a few of the more common stumbling points in basic C syntax. Recall the formula for converting temperatures from Fahrenheit to Celsius: C = (5/9) x (F - 32).
  Let's write a C program named `f2c.c` that will do this conversion for us:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for temperature
            printf("Temperature in F: ");
            float f = GetFloat();

            // convert F to C
            float c = 5 / 9 * F - 32;

            // display result
            printf("that number in C is %f\n", c);

            return 0;
        }

+ We probably want to use a `float` to store the temperature given by the user, because temperatures can have decimal places. We could use a `double`, but this feels like overkill. We should also get into the habit of commenting our code from now on. Comments empower you and anyone else reading your code (e.g. your TF) to skim it and know exactly what it does without necessarily diving into each line.
+ Of course, there's a bug in the code as written above. Recall "order of operations" from elementary school. This same concept applies in programming and is called *operator precedence*. Without parentheses, the multiplication and division happen before the subtraction. Let's fix it by inserting parentheses:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for temperature
            printf("Temperature in F: ");
            float f = GetFloat();

            // convert F to C
            float c = (5 / 9) * (f - 32);

            // display result
            printf("that number in C is %f\n", c);

            return 0;
        }

+ Note also that we needed to change `F` to `f` in our formula because variable names are case-sensitive. If we compile and run this program as is, we'll actually get 0 printed out for every input. Why? We have the same problem with integer division that we did when we divided 1 by 10 in `math3.c`. `5 / 9` will always evaluate to 0 because dividing one `int` by another can't give us a non-integer. If we write `5 / 9.0` instead, we get a floating-point result. The compiler knows that dividing an `int` by a floating-point value should yield a floating-point value to maintain precision. Alternatively, we could write `5 / (float) 9`, which will *cast* or convert 9 to a `float`.
+ Note that on the command line, you can use the up and down arrows to scroll through recent commands.
+ When we compile and run the fixed version, we get correct outputs!
  To make it a little prettier, let's change the format code so that only one digit is printed after the decimal point:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for temperature
            printf("Temperature in F: ");
            float f = GetFloat();

            // convert F to C
            float c = (5 / 9.0) * (f - 32);

            // display result
            printf("that number in C is %.1f\n", c);

            return 0;
        }

+ Know that operator precedence applies to more than just mathematical operations in programming. There are, in fact, whole charts that outline operator precedence, but you don't need to worry about memorizing them.

### Constructs of C (57:00--73:00)

#### Conditions and Boolean Expressions

+ `nonswitch.c` demonstrates the use of conditions and boolean expressions in C:

        /******************************************************************
         * nonswitch.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Assesses the size of user's input.
         *
         * Demonstrates use of Boolean ANDing.
         ******************************************************************/

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for an integer
            printf("Give me an integer between 1 and 10: ");
            int n = GetInt();

            // judge user's input
            if (n >= 1 && n <= 3)
                printf("You picked a small number.\n");
            else if (n >= 4 && n <= 6)
                printf("You picked a medium number.\n");
            else if (n >= 7 && n <= 10)
                printf("You picked a big number.\n");
            else
                printf("You picked an invalid number.\n");

            return 0;
        }

+ This program simply judges whether the number that the user picked is small, medium, or big. The `&&` is equivalent to the "and" operator in Scratch. Thus, the first condition requires that the number be both greater than or equal to 1 and less than or equal to 3. Greater than or equal to is written as `>=` and less than or equal to is written as `<=`.
+ Let's change our program to test for a single number:

        /******************************************************************
         * nonswitch.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Assesses the size of user's input.
         *
         * Demonstrates use of Boolean ANDing.
         ******************************************************************/

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for an integer
            printf("Give me an integer between 1 and 10: ");
            int n = GetInt();

            // judge user's input
            if (n == 42)
            {
                printf("You picked the right answer.\n");
            }
            else
            {
                printf("You picked the wrong answer.\n");
            }

            return 0;
        }

+ This program demonstrates use of the `==` operator. We use this rather than `=` to test for equality. In fact, if we use `=`, we'll actually be assigning the value 42 to `n`. By the way, when reading `n = 42` in plain English, David often says "n gets 42." That just means that 42 is assigned to `n`.
+ What will be the effect on the program if we write `n = 42` instead of `n == 42`? Then we'll be checking whether the value 42 evaluates to true. As it turns out, any non-zero value is considered "true" in C. Thus, if we write `n = 42`, the first condition will always evaluate to true and the program will always print "You picked the right answer."
+ One other thing to note: not all of the curly braces are necessary in the program above. We could rewrite it as follows:

        /******************************************************************
         * nonswitch.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Assesses the size of user's input.
         *
         * Demonstrates use of Boolean ANDing.
         ******************************************************************/

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for an integer
            printf("Give me an integer between 1 and 10: ");
            int n = GetInt();

            // judge user's input
            if (n == 42)
                printf("You picked the right answer.\n");
            else
                printf("You picked the wrong answer.\n");

            return 0;
        }

+ Note that this only works if there's a single statement following the `if` or `else`. If you do use the curly braces (probably good practice for now), there are multiple different ways of writing them that are considered correct style. In general, just be consistent.
#### Switch Statements

+ We can rewrite `nonswitch.c` using another piece of C syntax called the switch statement:

        /******************************************************************
         * switch1.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Assesses the size of user's input.
         *
         * Demonstrates use of a switch.
         ******************************************************************/

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // ask user for an integer
            printf("Give me an integer between 1 and 10: ");
            int n = GetInt();

            // judge user's input
            switch (n)
            {
                case 1:
                case 2:
                case 3:
                    printf("You picked a small number.\n");
                    break;

                case 4:
                case 5:
                case 6:
                    printf("You picked a medium number.\n");
                    break;

                case 7:
                case 8:
                case 9:
                case 10:
                    printf("You picked a big number.\n");
                    break;

                default:
                    printf("You picked an invalid number.\n");
            }

            return 0;
        }

+ `switch` takes a variable on whose value the behavior of the statement depends. Within the switch statement, you can define multiple `case` statements whose values are tested against the value of the variable. Each `case` statement falls through to the next unless there is a `break` statement. So, in the above, the cases where `n` is equal to 1, 2, or 3 are all treated the same, the cases where `n` is equal to 4, 5, or 6 are all treated the same, and the cases where `n` is equal to 7, 8, 9, or 10 are all treated the same. Finally, the `default` keyword defines behavior when no other `case` statement is executed.
+ Switch statements can be used on characters as well as integers. See [`switch2.c`](http://cdn-local.cs50.net/2012/fall/lectures/1/src1w/switch2.c) for an example.

#### Loops

+ To print the numbers from 10 to 0, we could write 11 separate `printf` statements. Better yet, let's do it by writing a loop.
+ for loops take the following general structure:

        for (initializations; condition; updates)
        {
            // do this again and again
        }

+ Within the parentheses after the `for` keyword, there are three parts.
  Before the first semicolon, we initialize a variable which will be our iterator or counter, often named `i` by convention. Between the two semicolons, we provide a condition which, if true, will cause another iteration of the loop to be executed. Finally, we provide code to update our iterator.
+ Our countdown program, then, looks like so:

        #include <stdio.h>

        int main(void)
        {
            for (int i = 10; i >= 0; i--)
            {
                printf("%d\n", i);
            }
            return 0;
        }

+ Writing this as a loop allows us to very easily change the number at which the countdown begins. If we used the method of copying and pasting `printf` statements, we'd have to add 90 lines of code to count down from 100 instead of 10. Using a loop, we only need to add one character.
+ The `--` operator means "subtract 1." If we change this to `++`, `i` only gets bigger, so the condition `i >= 0` stays true and we can see that our program executes seemingly forever.
+ Another type of loop is the do-while loop. One use case for the do-while loop is when we want to get input from the user:

        /******************************************************************
         * positive1.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Demands that user provide a positive number.
         *
         * Demonstrates use of do-while.
         ******************************************************************/

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            // loop until user provides a positive integer
            int n;
            do
            {
                printf("I demand that you give me a positive integer: ");
                n = GetInt();
            }
            while (n < 1);
            printf("Thanks for the %d!\n", n);

            return 0;
        }

+ The code within the `do` block will always be executed at least once. In this context, that means the user will always be prompted for a positive integer at least once. If the user fails to provide a positive integer, he or she will be reprompted. The loop continues to execute as long as the `while` condition evaluates to true.