Andrew Sellergren
Let's take a look at a simple login program in C:
/****************************************************************************
* login.c
*
* Rob Bowden
* rob@cs.harvard.edu
*
* A silly program that validates a username and password combination
***************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cs50.h>

int main(void)
{
    // get username
    printf("Please enter your username: ");
    char* username = GetString();
    if (username == NULL)
        return 1;

    // get password
    printf("Please enter your password: ");
    char* password = GetString();
    if (password == NULL)
        return 1;

    // check to see if the username/password combination
    // was valid
    if ((strcmp(username, "rob") == 0 && strcmp(password, "thisiscs50") == 0) ||
        (strcmp(username, "tommy") == 0 && strcmp(password, "i<3javascript") == 0))
    {
        printf("Success!! You now have access.\n");
    }
    // deny them access!
    else
    {
        printf("Invalid login.\n");
    }

    // free memory allocated by GetString
    free(username);
    free(password);

    // that's all folks!
    return 0;
}
Instead of using clang directly to compile this program, we're going to use a program named compiler. How do we compile compiler if we don't already have a compiler? This is a problem known as bootstrapping. More on this later. Just this once, let's use clang to compile compiler.
Now that we have a working compiler binary, let's use it to re-compile compiler.c. We run:
./compiler compiler.c compile.c -o compiler
Finally, we use compiler to compile our login program:
./compiler login.c -o login -lcs50
When we run login, we see that we get access when we enter the username "rob" with the password "thisiscs50". We are denied access when we enter some random username and password. Nothing interesting so far.
Let's add one more username and password combination into our source code:
else if (strcmp(username, "hacker") == 0 &&
         strcmp(password, "LOLihackedyou") == 0)
{
    printf("Hacked!! You now have access.\n");
}
So far, compiler has simply read the .c file into memory and passed it to clang. But what if our compiler actually injected code into the source code we asked it to compile? Version 2 of compiler does exactly that. First, it checks if the file being compiled is login.c. Then, it looks for the "deny them access" comment and inserts the extra else if condition we just considered. Now, even though our login program's source code no longer has a backdoor which allows the user "hacker" to gain access, our compiled login program does.
In version 3 of compiler, we check if the file being compiled is compiler.c. If it is, we inject the same code from version 2 which was responsible for injecting the backdoor into login.c. So let's recompile compiler to include this change:
./compiler compiler.c compile.c -o evil_compiler
Now we have an evil compiler that will inject malicious code when it's compiling itself. So we need the evil compiler to compile itself:
./evil_compiler compiler.c compile.c -o compiler
After this step, the compiler binary will compile login.c such that it has a backdoor. Realize that this is different from version 2 because we can get rid of any trace of maliciousness from the source code. If we were distributing this, we could send the harmless compiler.c and login.c source code along with the evil compiler binary. Even if the user then re-compiles compiler, they'll be using the evil compiler, which will recreate its evil feature.
Can you spot the bug in the following program?
#include <stdio.h>

void divide(int a, int b)
{
    if (b == 0)
    {
        if (a = 0)
        {
            printf("Undefined\n");
        }
        else
        {
            printf("Infinity\n");
        }
    }
    else
    {
        printf("%d\n", a / b);
    }
}

int main(void)
{
    divide(13, 4);
    divide(0, 0);
    divide(1, 0);
}
The condition if (a = 0) assigns 0 to a rather than checking if a equals 0, because we're using the = operator instead of the == operator. In this case, the bug is not terribly harmful, although it does give us incorrect answers to our inputs. However, in 2003, a similar bug was introduced into the Linux kernel which assigned administrator privileges to a user in certain scenarios. Instead of a = 0, the code was something like user = administrator. Luckily, this bug, which turned out to have been intentionally introduced as a hacking attempt, was detected by a routine audit of source code changes before being released.
If you run strings compiler, you can see all of the strings that the compiler binary contains. Among them is our else if condition, which may stand out as strange. Of course, strings also had to be compiled at some point, so who's to say that our evil compiler doesn't also intentionally introduce bugs into it? The same goes for objdump, a program that is used to disassemble a binary's machine code back into assembly language.
Then there are tcpdump, Wireshark, and Firesheep. These programs sniff all the packets that are going over the wireless network you're connected to.
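Going back to strings for a moment, here is a rough sketch of the idea behind it, to show why an injected else if condition would show up in its output. This is not the real implementation: the function name print_strings and its parameters are hypothetical, and the sketch only looks for runs of printable ASCII bytes in a buffer that has already been read into memory.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

// Prints every run of at least min_len printable bytes found in buf,
// one run per line, and returns how many runs were printed.
size_t print_strings(const unsigned char *buf, size_t len,
                     size_t min_len, FILE *out)
{
    assert(buf != NULL && out != NULL);
    assert(min_len > 0);

    size_t found = 0;
    size_t start = 0;   // index where the current printable run began

    for (size_t i = 0; i <= len; i++)
    {
        // treat the end of the buffer like a non-printable byte
        int printable = (i < len) && isprint(buf[i]);
        if (printable)
            continue;

        size_t run = i - start;
        if (run >= min_len)
        {
            fwrite(buf + start, 1, run, out);
            fputc('\n', out);
            found++;
        }
        start = i + 1;
    }
    return found;
}
```

Any string literal that ends up in a binary, such as "LOLihackedyou", is stored as a run of printable bytes, so a scan like this will surface it. That is exactly why the evil compiler would also want to tamper with whatever tool we use to look.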