### Week 1 Monday

Andrew Sellergren

#### Announcements and Demos (0:00-8:00)

+ This is CS50.
+ Next time you're milling about the Science Center, take a gander at the [Mark I](http://en.wikipedia.org/wiki/Harvard_Mark_I), one of the very first electromechanical computers capable of long self-sustained computation. Check out [this video](http://cdn.cs50.net/2012/fall/lectures/1/clips/Aiken%20Computer%20history.mpg) introducing it to the world in the 1940s! From this same Mark I computer comes the term "bug" that we take for granted. One of the engineers discovered an actual moth in the machine that was causing some incorrect calculations. The moth was then taped to a log book for posterity's sake.
+ Friday (9/14, 1:15 p.m.) will be the first [CS50 Lunch](http://cs50.net/lunch)! The goal is just to make this large class a little more intimate.
+ For your reading pleasure: the scribe notes! This is your canonical source for the happenings of lecture so that you don't have to have your head down while in Sanders. This is also your canonical source for jokes made at David's expense.
+ Problem Set 0 is now available. If you're not sure where to begin, begin with the [Walkthrough](http://cs50.tv/2012/fall/psets/0/walkthrough0-720p.mp4)!
+ [Sectioning](http://cs50.net/section) is in progress, so don't forget to fill out the form!
+ [Office Hours](http://cs50.net/ohs) are Monday through Thursday, 8 p.m. to 11 p.m. in Annenberg Hall. Bring your laptop and charger; outlets will be available along the south wall. Then post your question to [CS50 Discuss](http://cs50.net/discuss). When your question has been dispatched to a local staff member, the "Enter the Queue" button will start flashing. Click it and your name will appear on the CS50 Greeter's iPad to be matched with the next available staff member. Given the high attendance at Office Hours last year, we're trying this year to triage the frequently asked questions.
+ Don't forget that this course can be taken Pass/Fail!
You'll still have 5 weeks to decide if you want to take it for a letter grade after the initial feeling of being overwhelmed has subsided.
+ RSVP for [CS50 Lunch](http://cs50.net/lunch) if you'd like to attend!

#### From Scratch to C (8:00-19:00)

+ Recall our first C program from last week:

        #include <stdio.h>

        int main(void)
        {
            printf("hello, world!\n");
            return 0;
        }

+ The blue "say" puzzle piece from Scratch has now become `printf` and the orange "when green flag clicked" puzzle piece has become `main(void)`.
+ Statements are direct instructions, e.g. "say" in Scratch or `printf` in C. The "f" in `printf` stands for "formatted," which simply means you can modify what you're printing with some aesthetic details. In the above, "hello, world!\n" represents a *string*, a sequence of characters. The "\n", recall, is the newline character, the equivalent of hitting Enter at the end of the line. The parentheses around "hello, world!\n" are necessary because `printf` is actually a *function*, something like a miniature program. What we write inside this function's parentheses are its *arguments*, or things we want the function to use while it's executing. The semicolon denotes the end of the statement.
+ The "forever" loop from Scratch can be recreated with a `while (true)` block. This syntax purposefully induces an infinite loop. Whatever is within the parentheses is the `while` condition. As long as that condition evaluates to "true," the code within the `while` loop executes. Since `true` is always "true," the code within the loop always executes. (`true` is not built into C itself; it is defined in the header `stdbool.h`.) The body of a `while` loop is enclosed in curly braces like so:

        while (true)
        {
            printf("hello, world!\n");
        }

+ The "repeat" loop from Scratch is equivalent to a `for` loop in C that looks like so:

        for (int i = 0; i < 10; i++)
        {
            printf("hello, world!\n");
        }

+ This syntax declares an integer named `i` (a conventional name for variables that are only used for counting) which is set to 0 to begin with.
`i < 10` means that the code within the loop will execute as long as `i` is less than 10. Finally, at the end of each iteration of the loop, the statement `i++` increments `i` by one. All in all, this code causes "hello, world!" to be printed 10 times.
+ In C, a loop that increments a variable and announces its value would look like so:

        int counter = 0;
        while (true)
        {
            printf("%d\n", counter);
            counter++;
        }

+ Here we declare a variable named `counter` and then create an infinite loop that prints its value and then increments it. The `%d` is new syntax: it's a placeholder for the decimal integer that we pass as the second argument.
+ Boolean expressions are much the same in C as in Scratch. The less-than (`<`) and greater-than (`>`) operators are the same. One difference is that the "and" operator is represented as `&&` in C.
+ Conditions in C look much the same as they do in Scratch:

        if (x < y)
        {
            printf("x is less than y\n");
        }
        else if (x > y)
        {
            printf("x is greater than y\n");
        }
        else
        {
            printf("x is equal to y\n");
        }

+ The curly braces perform the same encapsulation that the orange block did in Scratch.
+ Recall that we used a variable called "inventory" in Scratch to store a series of related variables--fruits in the case of FruitcraftRPG. This inventory can be implemented as an array in C:

        string inventory[1];
        inventory[0] = "Orange";

#### hello, world! (19:00-25:00)

+ Our first program in C was one that simply printed "hello, world":

        #include <stdio.h>

        int main(void)
        {
            printf("hello, world!\n");
            return 0;
        }

+ A "hello world" program is the canonical example for introducing a new programming language.
+ What's the deal with the `#include` line? This line tells the compiler to make use of other libraries of code within our program. Here the file is `stdio.h`, the header for the standard input/output library, which allows us to use the `printf` function.
+ How about `int main(void)`?
We can gloss over this for now, but know that this is telling the compiler that our `main` function (which is comparable to the "when green flag clicked" puzzle piece in Scratch) will return an integer value when it finishes executing. The `void` means that our `main` function takes no arguments.
+ Within the curly braces, we have the `printf` statement and a `return` statement. Returning 0 means "all is well." Returning a non-zero number generally means that an error occurred. If you've ever seen a cryptic error message with a numeric code on your home computer, that is probably the return value of the function that failed.

#### Writing, Compiling, and Executing (25:00-74:00)

+ In the interest of standardizing tools across all the different operating systems used by students, we're introducing the CS50 Appliance. As we hinted earlier, this software is a virtual machine that allows you to run an instance of the Linux operating system (specifically Fedora) within whatever operating system your personal computer runs.
+ Detailed instructions for installing the CS50 Appliance are available [here](https://manual.cs50.net/Appliance#How_to_Install_Appliance). Once you've installed and launched the Appliance, you'll be presented with a desktop reminiscent of Windows. From the start menu, we can open gedit, a text editor similar to TextEdit and Notepad.
+ As before, we'll write the program above, this time saving it in our home directory (`~`) as `hello.c`. The `.c` extension is a convention for programs written in C. Within our gedit window, at the bottom, there is a line that begins with `jharvard@appliance (~):` and has a blinking prompt. This is the command line for Linux. The command line allows us to execute programs by typing their names and hitting Enter rather than double clicking them. By default, we've given everyone the username `jharvard`, short for John Harvard.
+ Now that we've saved our file with a `.c` extension, gedit will do some syntax highlighting for us, making keywords stand out. In the left column, gedit presents a summary of the file we're working on.
+ Typing `clang hello.c` into the bottom terminal window and hitting Enter will compile our source code into object code, i.e. translate it into 0's and 1's. This will create a file named `a.out` in our home directory. To execute this program, we type `./a.out` at the command line. The `./` is necessary to tell the computer to only look for this program in our current directory.
+ The bar at the bottom of our Appliance has some icons on the left. The black square represents a larger Terminal window that we can open separately from gedit. Opening Terminal and typing `ls` will give us a list of files and folders in our home directory.
+ To compile our program into a file with a more meaningful name, we use a *command line flag* to the `clang` program like so:

        clang -o hello hello.c

+ In our home directory, there are a few directories with seemingly cryptic names, among them `src1m`. This stands for "source code week 1 Monday." In order to open this directory, we type `cd src1m`. There's no double clicking on the command line! When we open this directory, the `(~)` on the left of the line will change to `(~/src1m)` to tell us what our current directory is. The tilde stands for our home directory.
+ For another example of a "hello world" program, check out [`holloway.c`](http://cdn.cs50.net/2012/fall/lectures/1/src1m/holloway.c). Note that this is not a demonstration of good style or design! Its author was a winner of the [International Obfuscated C Code Contest](http://www.ioccc.org/), a competition to write the most confusing code possible.
+ To get rid of our compiled programs, we use the `rm` command.
+ From now on, instead of typing `clang` directly, we can use the `make` command to compile our programs.
`make` will actually execute `clang` for us with a few other options specified.
+ As we mentioned earlier, the `printf` function can take many different formatting characters. Just a few of them are:
    + `%c` for `char`
    + `%d` for `int`
    + `%f` for `float`
    + `%lld` for `long long`
    + `%s` for `string`
+ A `float` is simply a number with a decimal point. A `long long` is an integer type that can store numbers much larger than an `int` or `long` can.
+ With this added knowledge of formatting characters, let's tweak our program a little bit:

        #include <stdio.h>

        int main(void)
        {
            printf("hello, %s!\n", "david");
            return 0;
        }

+ This change isn't all that interesting since the program now just prints "hello, david!" instead of "hello, world!" but it's a step in an interesting direction.
+ Let's now make use of a variable to store a value provided to us by the user:

        #include <stdio.h>

        int main(void)
        {
            string s = GetString();
            printf("hello, %s!\n", s);
            return 0;
        }

+ `GetString` is a function provided in the CS50 Library written by the staff. `GetString` takes in user input and passes it back to your program as a string. In these first few weeks, we want you concentrating on more interesting tasks than collecting user input, so we've provided `GetString` and a few other functions for you.
+ Now when we try to compile this program, we get all sorts of errors. When the compiler prints out this many errors, it's a good idea to work your way through them from top to bottom because the errors at the bottom might actually have been caused by the errors at the top. The topmost error is as follows:

        hello.c:5:5: error: use of undeclared identifier 'string'; did you mean 'stdin'?

+ No, we didn't mean `stdin`! The type `string` is actually not built in to C. It's available via the CS50 Library. To use this library, we need to tell our program to include it like so:

        #include <cs50.h>
        #include <stdio.h>

        int main(void)
        {
            string s = GetString();
            printf("hello, %s!\n", s);
            return 0;
        }

+ It compiles!
When we run it, though, nothing appears to happen. That's because it's waiting for us to type something as input. When we type in "david" and hit Enter, we again get "hello, david!" printed to the screen.
+ If we want to be annoying, we could type in thousands of characters as input instead of just a few. In the world of programming, it's cases like these, ones the programmer didn't necessarily anticipate, that lead to bugs. Fortunately, we did anticipate this case when writing the CS50 Library, so everything works as intended. If we hadn't, though, we might have caused a dreaded *segmentation fault*. Don't worry, you'll see one of these at some point in the next few weeks!
+ To recap, we installed the Appliance, opened it, and opened up gedit within it. We then wrote something like this:

        int main(void)
        {
            printf("take 2!!!\n");
        }

+ When we try to compile this, though, we get an error. Seems like we forgot to include the header that declares `printf`:

        #include <stdio.h>

        int main(void)
        {
            printf("take 2!!!\n");
        }

+ Let's take a closer look at this line from our more advanced version of `hello.c`:

        string s = GetString();

+ On the left side of the equals sign, we are creating a variable named `s` which will hold a string. Writing `string s` tells the computer to reserve space in its RAM to store a string. RAM is the memory that your computer uses in the short term while a program is executing. Compare this to the hard drive, which your computer uses in the long term. This is a bit of an oversimplification, though, since we still don't know how big the string is that we need to store. We'll reexamine this in the coming weeks to see how `GetString` handles this.
+ On the right side of the equals sign, we call the function `GetString` with no arguments (as implied by the empty parentheses). Every time you call `GetString` it will have the same behavior, in contrast to `printf`, which will behave differently depending on what arguments you pass it.
+ The equals sign itself tells the computer to take what's on the right and store it in what's on the left. This is called an *assignment*.
+ How many arguments does `printf` take in our usage above? Two. The comma outside the quotation marks separates the first argument `"hello, %s!\n"` from the second argument `s`.
+ This program only ever returns 0 to indicate that everything went well, but we could also return something non-zero if something went wrong.
+ Question: why did our take 2 of `hello.c` not return 0? No good reason, we probably should have included the return statement. Technically, 0 will be returned automatically for you if you don't specify a return statement.
+ Question: what if we wrote `void main(void)` instead of `int main(void)`? Then we wouldn't need the return statement at all because we defined `main` as returning nothing.
+ Question: do `%s` and `string` exist outside of the Appliance? `%s` is native to C, but `string` is only available through the CS50 Library.
+ Question: are commands case-sensitive? Commands are what we write in our terminal window on the command line. Statements are things like `return 0` that we write in our programs. Both commands and statements are case-sensitive.
+ Question: what if we named the file `hello.d`? There is, in fact, a programming language named D! However, to answer the question, we can just try it out. Turns out that `make` knows what to do and compiles our code as C anyway.
+ C has a number of *primitive types* built into it. These include `int`, `char`, `float`, `double` and more. A `char` is usually 8 bits in size. It is used to store a number that is mapped to a character via ASCII. An `int` is the integer type, which typically stores numbers using 4 bytes, or 32 bits, of memory. A `double` is 64 bits and is used to store numbers with greater precision.
+ Let's take a look at a program that uses the primitive type `int`:

        /******************************************************************
         * math1.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Computes a total but does nothing with it.
         *
         * Demonstrates use of variables.
         ******************************************************************/

        #include <stdio.h>

        int main(void)
        {
            int x = 1;
            int y = 2;
            int z = x + y;
            return 0;
        }

+ The chunk of text appearing between `/*` and `*/` is a *comment*. A comment is used to describe what's going on in plain language as a note to yourself or to others who might be reading your source code. Comments do not affect the execution of the program itself. In the case above, we have a file-level comment which we've dressed up with some extra asterisks.
+ Is the `#include <stdio.h>` necessary in this program? Actually, no, because we're not using any functions like `printf` from the standard library.
+ This program itself seems to add two numbers together, but does nothing with the sum. Writing `int x` asks the operating system for 32 bits of RAM in which to store an integer. Likewise with `int y` and `int z`.
+ Let's take a look at `math2.c` which actually does something with the sum it computes:

        /******************************************************************
         * math2.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Computes and prints an integral total.
         *
         * Demonstrates use of a format string.
         ******************************************************************/

        #include <stdio.h>

        int main(void)
        {
            int x = 1;
            int y = 2;
            int z = x + y;
            printf("%d\n", z);
            return 0;
        }

+ Now when we compile and run `math2.c`, we see the number 3 printed out.
+ Let's try some division:

        /******************************************************************
         * math3.c
         *
         * David J. Malan
         * malan@harvard.edu
         *
         * Computes and prints a floating-point total.
         *
         * Demonstrates loss of precision.
         ******************************************************************/

        #include <stdio.h>

        int main(void)
        {
            float answer = 1 / 10;
            printf("%.2f\n", answer);
            return 0;
        }

+ We're using a `float` to store the result of the division because we know it will be a decimal. `%.2f` is a variation on our familiar formatting characters which tells `printf` to print a floating-point number, but only to two decimal places.
+ When we compile and run `math3.c`, we get 0.00 printed out. Oops. Well, let's change our formatting character to `%.20f` to make sure we're not missing something. Nope, now we just get 0.00000000000000000000 printed out. 1 divided by 10 isn't 0, so what's going on? The problem here is that 1 and 10 as we've used them above are `int`s. When you ask a C program to divide two `int`s, the answer must also be an `int`. 0.1 is not an integer, so its fractional part is simply discarded (truncated), which leaves 0. We can fix this by changing that one line to the following:

        float answer = 1.0 / 10.0;

+ Now we're dividing two floating-point values, so the result keeps its fractional part when it is stored in the `float`. However, when we compile and run this program with the formatting character `%.20f`, we see another problem. The answer that is printed to the screen is 0.10000000149011611938. Have we been lying to you for years when we told you that 1 / 10 is 0.1? No. The problem is that we have a finite number of bits to represent an infinite number of real numbers. Thus, there's an inherent imprecision in the storing of numbers. If you don't think this is a serious issue, take a look at [the following](http://www.youtube.com/watch?v=EMVBLg2MrLs).
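The integer-division pitfall and its fix described above can be sketched side by side. This is only an illustration, not code from lecture: the function names `int_division`, `float_division` and `compare_divisions` are hypothetical, and the second helper uses a cast to `float` rather than the `1.0 / 10.0` literals shown in `math3.c`.

```c
#include <stdio.h>

/* Hypothetical helper: divides two ints. The fractional part is
   discarded before the result is ever stored in the float. */
float int_division(int a, int b)
{
    return a / b;
}

/* Hypothetical helper: casts one operand to float first, so the
   division itself is carried out in floating point. */
float float_division(int a, int b)
{
    return (float) a / b;
}

/* Prints both results to 20 decimal places, as in the lecture. */
void compare_divisions(void)
{
    printf("%.20f\n", int_division(1, 10));
    printf("%.20f\n", float_division(1, 10));
}
```

Calling `compare_divisions()` from `main` on a typical machine should print 0.00000000000000000000 followed by 0.10000000149011611938, the same two values seen above: the first shows truncation, the second shows the closest value to 0.1 that a `float` can hold.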