## Week 10 Monday

Andrew Sellergren

### Announcements and Demos (0:00-3:00)

+ Next week is our last lecture! We'll celebrate with cake.
+ If you've ever downloaded a piece of software from the internet (you have), you've placed a significant amount of trust in the makers of that software. There's almost nothing to stop that software from deleting all the files on your hard drive or reading all your e-mails. Today we talk about security in a world where computer programs have a tremendous amount of power.

### Trusting Trust (3:00-33:00)

+ Let's take a look at a simple login program in C:

        /****************************************************************************
         * login.c
         *
         * Rob Bowden
         * rob@cs.harvard.edu
         *
         * A silly program that validates a username and password combination
         ***************************************************************************/

        #include <cs50.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(void)
        {
            // get username
            printf("Please enter your username: ");
            char* username = GetString();
            if (username == NULL)
                return 1;

            // get password
            printf("Please enter your password: ");
            char* password = GetString();
            if (password == NULL)
                return 1;

            // check to see if the username/password combination
            // was valid
            if ((strcmp(username, "rob") == 0 && strcmp(password, "thisiscs50") == 0) ||
                (strcmp(username, "tommy") == 0 && strcmp(password, "i<3javascript") == 0))
            {
                printf("Success!! You now have access.\n");
            }

            // deny them access!
            else
            {
                printf("Invalid login.\n");
            }

            // free memory allocated by GetString
            free(username);
            free(password);

            // that's all folks!
            return 0;
        }

+ All this program does is check for two username and password combinations. Of course, in a real program, the usernames and passwords probably wouldn't be stored in plaintext within the source code itself.
+ Instead of using `clang` to compile this program, we're going to use a program named `compiler`. How do we compile `compiler` if we don't already have a compiler? This is a problem known as *bootstrapping*.
More on this later. Just this once, let's use `clang` to compile `compiler`.
+ Now that we have a working `compiler` binary, let's use it to re-compile `compiler.c`. We run:

        ./compiler compiler.c compile.c -o compiler

+ Finally, we use `compiler` to compile our login program:

        ./compiler login.c -o login -lcs50

+ When we run `login`, we see that we get access when we enter the username "rob" with the password "thisiscs50." We are denied access when we enter some random username and password. Nothing interesting so far.
+ Let's add one more username and password combination into our source code:

        else if (strcmp(username, "hacker") == 0 && strcmp(password, "LOLihackedyou") == 0)
        {
            printf("Hacked!! You now have access.\n");
        }

+ Although it's not a very clever one, this is an example of a *backdoor*. If you read the source code, it's obvious that this code grants access to one extra user. But what if you can't examine the source code? For all you know, that piece of software that you just downloaded and executed actually has some code like this which allows access to a user other than you.
+ In the source code for our original compiler, we simply read a `.c` file into memory and passed it to `clang`. But what if our compiler actually injected code into the source code we asked it to compile? Version 2 of `compiler` does exactly that. First, it checks if the file being compiled is `login.c`. Then, it looks for the "deny them access" comment and inserts the extra `else if` condition we just considered. Now, even though our login program's source code no longer has a backdoor which allows the user "hacker" to gain access, our compiled login program does.
+ This is still not very sneaky, though, because if someone were to look at the source code of the compiler, they might notice this strange injection of code. Let's take this one step further.
+ In version 3 of `compiler`, we check if the file being compiled is `compiler.c`.
Then, we inject the same code from version 2 which was responsible for injecting the backdoor into `login.c`. So let's recompile `compiler` to include this change:

        ./compiler compiler.c compile.c -o evil_compiler

+ Now we have an evil compiler that will inject malicious code when it's compiling itself. So we need the evil compiler to compile itself:

        ./evil_compiler compiler.c compile.c -o compiler

+ Finally, we have an evil compiler with which we can compile `login.c` such that it has a backdoor. Realize that this is different from version 2 because we can get rid of any trace of maliciousness from the source code. If we were distributing this, we could send the harmless `compiler.c` and `login.c` source code along with the evil compiler binary. Even if the user then re-compiles `compiler`, he'll be using the evil compiler, which will recreate its evil feature.
+ To disabuse you of the notion that this is a toy example, consider the speech that Ken Thompson gave, [Reflections on Trusting Trust](http://cm.bell-labs.com/who/ken/trust.html), when he accepted the Turing Award (more or less the Nobel Prize of computer science). In it, he describes this exact technique for compromising a compiler so that it would introduce a backdoor into a login program. The login program he refers to, however, is not some toy program, but rather the login program for all of UNIX. Since delivering this speech, Thompson has confirmed that this exploit was actually implemented and released to at least one company, BBN Technologies.
+ Can you spot the bug in the following program?

        #include <stdio.h>

        void divide(int a, int b)
        {
            if (b == 0)
            {
                if (a = 0)
                {
                    printf("Undefined\n");
                }
                else
                {
                    printf("Infinity\n");
                }
            }
            else
            {
                printf("%d\n", a / b);
            }
        }

        int main(void)
        {
            divide(13, 4);
            divide(0, 0);
            divide(1, 0);
        }

+ In our innermost `if` condition, we're assigning 0 to `a`, not checking if `a` equals 0, because we're using the `=` operator instead of the `==` operator.
In this case, the bug is not terribly harmful, although it does give us incorrect answers to our inputs. However, in 2003, someone tried to slip a similar bug into the Linux kernel source; it would have granted administrator privileges to a user in certain scenarios. Instead of `a = 0`, the code was something like `user = administrator`. Luckily, this change, which turned out to be a deliberate hacking attempt, was detected by a routine audit of source code changes before it was released.
+ Coming back to our evil compiler, it's possible that the backdoor could be detected by examining the binary of the compiler itself. For example, if you run `strings compiler`, you can see all of the strings that the `compiler` binary contains. Among them is our `else if` condition, which may stand out as strange. Of course, `strings` also had to be compiled at some point, so who's to say that our evil compiler doesn't also intentionally introduce bugs into it? The same goes for `objdump`, a program that is used to disassemble a binary's machine code back into assembly language.
+ Okay, fine, maybe we've checked and double-checked the compiler so we know we can trust it. What about the processor, though? It would be pretty complicated, but the processor itself could be bugged to have a backdoor as well. The moral of the story is that you can't trust anything that you didn't write yourself!

### Security (33:00-57:00)

+ Disclaimer: the things we're about to show you are meant to educate you, not to inspire you to do evil. We want you to be able to defend against attacks like this, not initiate them!
+ Back in the day, you had to plug in your computer in order to connect to the internet. Then in 1997, a new standard called IEEE 802.11 was introduced. You know this standard more commonly as the wireless internet or wifi.
+ Since 1997, the wireless standard has gone through a number of revisions, including 802.11b and 802.11g. Today, most computers adhere to 802.11n.
Soon, 802.11ac is coming out. With each of these versions, we gain some bandwidth.
+ In a typical wireless setup, the router is connected to the internet via a physical cable. Your computer is then connected to the router via radio waves. The closer the computer is to the router, the stronger the signal is. Both the computer and the router must be capable of transmitting and receiving these radio waves constantly.
+ Bad guys love wifi. They especially love unencrypted wifi. If a bad guy can park outside your house and use your unencrypted wifi to download the worst of the worst from the internet, then the FBI will be knocking at *your* door, not his.
+ Bad guys also love wifi because it allows them to intercept information that was meant for someone else. Doing so is actually quite easy using programs like `tcpdump`, Wireshark, and Firesheep. These programs sniff all the packets that are going over the wireless network you're connected to.
+ In addition to intercepting information, bad guys can transmit false information quite easily on unencrypted wifi connections. When you're connected to a wifi network and you make a request for www.harvard.edu, a bad guy can send a false response and navigate you instead to www.harvardsucks.org.
+ Bad guys love wifi because it makes it so easy to snoop. To do the same snooping on a wired connection, a bad guy would have to actually splice his connection into the cable that connects your computer to the internet.
+ How do we defend against these bad guys? When the 802.11 standard first came out, the WEP encryption standard came out along with it. WEP stands for Wired Equivalent Privacy. As its name implies, it tries to give you protection equivalent to what you would get from a wired connection. To do this, it requires a short password which is used to encrypt all the traffic sent between you and the router.
+ Because the WEP password is very short and everyone on a shared connection uses the same password, WEP-encrypted traffic is very easy to decrypt.
WPA and WPA2, which were released later, are more secure. Even these have their problems, though, as a bad guy can send a packet to your computer to disconnect it from the router. Then he can listen as you reconnect and use that information to guess the WPA or WPA2 password.
+ When you're setting up your wireless network, you can also choose not to broadcast the network's name, or SSID. Then anyone who wants to connect to the network will need to type in its name rather than choosing it from a dropdown menu.
+ Another protection you have against bad guys on wireless is HTTPS. This is an added layer of encryption for your traffic. Of course, this extra encryption doesn't help if you choose to click through the big red screen that warns you that the website you're trying to go to may be insecure.
+ A VPN, or virtual private network, is another defense against bad guys on wireless. A VPN is a secure connection between you and a trusted server, for example Harvard. When you're connected to it, all of your web traffic will be encrypted and sent through Harvard.
+ Smartphones these days are often GPS-enabled, which means they can track your location anytime they have a connection. Recently, a security researcher discovered that on iPhones, these saved locations were being backed up to iTunes without first being encrypted. With a free piece of software called [iPhone Tracker](http://petewarden.github.com/iPhoneTracker/), you could very easily slurp up this information and plot it on a map to see where a person spends his time. Apple has since encrypted this information, but the takeaway here is to be wary of how much information about us is being collected and how easily accessible it is if we aren't careful. With your newfound knowledge of internet security, hopefully you can at least make an informed decision as to how much you're exposing yourself by visiting a particular website or by using a particular wireless connection.