Merge sort can be written in pseudocode as follows:
On input of n elements:
If n < 2
Return.
Else:
Sort left half of elements.
Sort right half of elements.
Merge sorted halves.
Recursion can be used to elegantly calculate the summation of numbers. Given a number n, we want to calculate n + n - 1 + n - 2 + … 0. We’ll call this summation sigma after the Greek letter Σ used in mathematical notation. Let’s start by writing the basic structure of the program:
#incude <stdio.h>
int main(void)
{
int n;
do
{
printf("Positive integer please: ");
n = GetInt()
}
while (n < 1);
int sum = sigma(n);
printf("The sume is %d.\n");
return 0;
}
Now that we have the basic structure of the program, we’re only missing the implementation of sigma
. Let’s give it a shot using a non-recursive algorithm:
#incude <stdio.h>
int main(void)
{
int n;
do
{
printf("Positive integer please: ");
n = GetInt()
}
while (n < 1);
int sum = sigma(n);
printf("The sume is %d.\n");
return 0;
}
int sigma(int number)
{
int sum = 0;
for (int i = 0; i <= number; i++)
sum += 1;
return sum;
}
Pretty straightforward. One thing we forgot was the function prototype. We need that above our declaration of main
. We also forgot to include the CS50 Library.
#include <cs50.h>
#incude <stdio.h>
// prototype
int sigma(int number);
int main(void)
{
int n;
do
{
printf("Positive integer please: ");
n = GetInt()
}
while (n < 1);
int sum = sigma(n);
printf("The sume is %d.\n");
return 0;
}
int sigma(int number)
{
int sum = 0;
for (int i = 0; i <= number; i++)
sum += 1;
return sum;
}
sigma
above main
.Because we’ve structured our program in this way, we can complete rip out our definition of sigma
and rewrite it to use a recursive algorithm:
#include <cs50.h>
#incude <stdio.h>
// prototype
int sigma(int number);
int main(void)
{
int n;
do
{
printf("Positive integer please: ");
n = GetInt()
}
while (n < 1);
int sum = sigma(n);
printf("The sume is %d.\n");
return 0;
}
int sigma(int number)
{
return (number + sigma(number - 1));
}
sigma
calls itself infinitely. Last time we did this, we ran out of memory because there is a chunk of the stack allocated for each function call. We only have so many chunks to give out before we start trying to access memory that doesn’t belong to us, in which case our program fails with a segmentation fault. When this happens, there’s a file named core
that gets dumped which contains the state of the program at the time it failed. Soon, we’ll show you how to make use of this file to debug your program.To fix this problem, we add a base case:
#include <cs50.h>
#incude <stdio.h>
// prototype
int sigma(int number);
int main(void)
{
int n;
do
{
printf("Positive integer please: ");
n = GetInt()
}
while (n < 1);
int sum = sigma(n);
printf("The sume is %d.\n");
return 0;
}
int sigma(int number)
{
if (number <= 0)
return 0;
else
return (number + sigma(number - 1));
}
Recall our swap
function from a while back:
void swap(int a, int b)
{
int tmp = a;
a = b;
b = tmp;
}
What was the problem with this function? When you pass arguments to a function like this, you’re actually just passing copies of the arguments. Thus, when we modify a
and b
, we’re only modifying the copies, which are thrown away when the function returns. We saw this in the context of buggy3.c
:
/****************************************************************************
* buggy3.c
*
* David J. Malan
* malan@harvard.edu
*
* Should swap two variables' values, but doesn't!
* Can you find the bug?
***************************************************************************/
#include <stdio.h>
// function prototype
void swap(int a, int b);
int main(void)
{
int x = 1;
int y = 2;
printf("x is %d\n", x);
printf("y is %d\n", y);
printf("Swapping...\n");
swap(x, y);
printf("Swapped!\n");
printf("x is %d\n", x);
printf("y is %d\n", y);
return 0;
}
/**
* Swap arguments' values.
*/
void swap(int a, int b)
{
int tmp = a;
a = b;
b = tmp;
}
swap
function. Nothing within the main
function was actually changed.x
and y
to our swap
function, we can pass the memory addresses of x
and y
so that swap
can modify them. Although swap
and main
ostensibly have separate chunks of memory allocated to them, they can still touch any piece of memory in the operating system. Herein lies the power and the danger of C. This abuse of memory access is often used by serial number crackers and website hackers.printf
statements at key points. We’re going to show you a utility called GDB to make the process of debugging much less manual. GDB allows you to walk through your program line by line as it executes.buggy3.c
program, we type gdb -tui ./buggy3
. This brings up a split window, the top of which contains our source code and the bottom of which contains a prompt for GDB.run
to execute the program as normal. If beforehand we type break main
, we set a breakpoint on the function main
. That is, GDB will now know to pause execution as soon as the main
function is called. To mark this breakpoint, the characters b+
will appear in the left margin next to line 18 (the first line ofmain
) in the top half of our split window. Now we type run
and hit Enter.At this point in the story, if we want to see the value of x
we simply type print x
. This gives us some strange output:
$1 = 134514032
$1
, this just means that you can refer to back to this value using the term $1
. The value itself is somewhat confusing, though. Turns out it’s because x
hasn’t been initialized yet. Before it is initialized, x
contains a garbage value, the remnants of whatever was previously stored in memory there. As soon as we type next
to advance to the next line of code, we can print x
again to see that it’s value is now 1.x
and y
all the time, we can type display x
and display y
to show them after every command we execute. We can also type info locals
to show all the local variables in the current function.next
a few more times to advance past the printf
statements. Note that because we’re running our program within GDB, we won’t really see the terminal window output except awkwardly on the divider line between the top half and the bottom half of the split window.x
and y
printed after every command we execute, so when we type next
to advance through the call to swap
, we can see that their values truly do remain unchanged by this function.run
again and answer y
to start the program from the beginning. This time, we’ll hit next
till we get to the line on which swap
is called, at which point we’ll type step
. This command instructs GDB to actually step inside the next line. When we do this, the opening curly brace of the swap
function is highlighted to indicate that execution has paused there. We can next
our way to the last line of swap
and print the values of a
and b
to verify that the swap actually successfully occurred. However, as soon as we next
again, we end up back in main
and x
and y
haven’t changed values.-tui
flag to see just the bottom half of the split window without the textual user interface (TUI) display of the source code in the top half. Type list
to see the lines of source code around the current line.run
break
print
display
info
next
step
continue
backtrace
frame
s
is a string, just type print s
and you’ll get the value of the string as well as its address in memory.swap
again, but we forgot what the value of x
and y
were in main
. We can’t print them out because they’re not in scope. However, we can type backtrace
to see what frames are currently on the stack. We see that main
is frame number 1, so we can type frame 1
to enter main
again. We type frame 0
to return to where we were in swap
.next 10
is a shortcut for entering the command next
10 times. continue
will run your program until it hits the next breakpoint.We introduce the concept of structs in Problem Set 3 so that we can have our own custom variables. Consider for this next example that we want to store a bunch of related information about a student:
#include <cs50.h>
#include <stdio.h>
int main(void)
{
int id;
string name;
string house;
id = 123;
name = "David";
house = "Mather";
printf("%s, whose ID is %d, lives in %s.\n", name, id, house);
return 0;
}
Okay, so our boring program stores information about a single student. What if we want to store information about more than one student? Copying and pasting is really not ideal. We could maybe be a little more clever using arrays instead:
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// people
int ids[3];
string names[3];
string houses[3];
// David
ids[0] = 123;
names[0] = "David";
houses[0] = "Mather";
// Rob
ids[1] = 456;
names[1] = "Rob";
houses[1] = "Kirkland";
printf("%s, whose ID is %d, lives in %s.\n", names[0], ids[0], houses[0]);
printf("%s, whose ID is %d, lives in %s.\n", names[1], ids[1], houses[1]);
return 0;
}
emails
and make sure that the entry at index 0 in emails
corresponds to the same student as the entry at index 0 in all the other arrays. There has to be a better way.Indeed there is! We can create our own data type to encapsulate a student:
#include <cs50.h>
#include <stdio.h>
typedef struct
{
int id;
string name;
string house;
}
student;
int main(void)
{
// David
student david;
david.id = 123;
david.name = "David";
david.house = "Mather";
// Rob
student rob;
rob.id = 456;
rob.name = "Rob";
rob.house = "Kirkland";
return 0;
}
main
, we now have a new data type named student
. We can declare a variable of this type just like any other type. To assign values to its attributes, we use dot notation.This still feels a little clunky, though, since we have to hardcode values and repeat similar lines of code. Let’s again leverage arrays to store multiple variables of the same type:
#include <cs50.h>
#include <stdio.h>
typedef struct
{
int id;
string name;
string house;
}
student;
int main(void)
{
// people
student students[3];
// David
students[0].id = 123;
students[0].name = "David";
students[0].house = "Mather";
// Rob
students[1].id = 456;
students[1].name = "Rob";
students[1].house = "Kirkland";
return 0;
}
This too could use some cleaning up. We should probably #define
the number 3 so that we can easily change it if need be. Perhaps we could also use a for loop coupled with GetString
instead of hardcoding the student info. Indeed, we do so in our prefabbed example:
/****************************************************************************
* structs1.c
*
* Computer Science 50
* David J. Malan
*
* Demonstrates use of structs.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "structs.h"
// class size
#define STUDENTS 3
int main(void)
{
// declare class
student class[STUDENTS];
// populate class with user's input
for (int i = 0; i < STUDENTS; i++)
{
printf("Student's ID: ");
class[i].id = GetInt();
printf("Student's name: ");
class[i].name = GetString();
printf("Student's house: ");
class[i].house = GetString();
printf("\n");
}
// now print anyone in Mather
for (int i = 0; i < STUDENTS; i++)
if (strcmp(class[i].house, "Mather") == 0)
printf("%s is in Mather!\n\n", class[i].name);
// free memory
for (int i = 0; i < STUDENTS; i++)
{
free(class[i].name);
free(class[i].house);
}
return 0;
}
strcmp
function compares two strings. If it returns 0, the two strings are equal. If it returns -1, the first string comes before the second string in the alphabet and if it returns 1, the first string comes after the second string in the alphabet.GetString
thus far, you’ve introduced a small bug in your program. GetString
asks the operating system for memory to store a string, but it never gives it back. This results in what’s called a memory leak. In the real world, if you notice that your computer gets slower and slower the longer it’s been on, it may be because someone who wrote a program you’re running has introduced a memory leak.Like structs1.c
, the following program prompts the user for information on three students:
/****************************************************************************
* structs2.c
*
* David J. Malan
* malan@harvard.edu
*
* Demonstrates use of structs.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "structs.h"
// class size
#define STUDENTS 3
int main(void)
{
// declare class
student class[STUDENTS];
// populate class with user's input
for (int i = 0; i < STUDENTS; i++)
{
printf("Student's ID: ");
class[i].id = GetInt();
printf("Student's name: ");
class[i].name = GetString();
printf("Student's house: ");
class[i].house = GetString();
printf("\n");
}
// now print anyone in Mather
for (int i = 0; i < STUDENTS; i++)
if (strcmp(class[i].house, "Mather") == 0)
printf("%s is in Mather!\n\n", class[i].name);
// let's save these students to disk
FILE* fp = fopen("database", "w");
if (fp != NULL)
{
for (int i = 0; i < STUDENTS; i++)
{
fprintf(fp, "%d\n", class[i].id);
fprintf(fp, "%s\n", class[i].name);
fprintf(fp, "%s\n", class[i].house);
}
fclose(fp);
}
// free memory
for (int i = 0; i < STUDENTS; i++)
{
free(class[i].name);
free(class[i].house);
}
return 0;
}
database
. To do this, we open the file using the fopen
function. The second argument "w"
tells fopen
that we want to write to this file.if (fp != NULL)
is a sanity check that call to fopen
didn’t fail. More on this later.fprintf
takes a file pointer as its first argument. We’ll discuss pointers in more detail later, but for now think of it as a placeholder for the file we opened. Instead of printing to the screen, fprintf
prints to the file we pass to it.database
is indeed created on our file system. Inside this file is the information we inputted to structs2
.scramble.c
, the syntax might be a little more familiar. We use a struct to define a new word
type that includes room for both the word itself as well as a variable denoting whether the user has already found the word. If he has already found the word, we don’t want to give him points for finding it again.Notice that the keyword typedef
is missing from the following definition:
// defines a dictionary as having a size and an array of words
struct
{
int size;
word words[WORDS];
}
dictionary;
dictionary
.The program compare1.c
is an example of how not to compare two strings:
/****************************************************************************
* compare1.c
*
* David J. Malan
* malan@harvard.edu
*
* Tries (and fails) to compare two strings.
*
* Demonstrates strings as pointers to arrays.
***************************************************************************/
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// get line of text
printf("Say something: ");
string s = GetString();
// get another line of text
printf("Say something: ");
string t = GetString();
// try (and fail) to compare strings
if (s == t)
printf("You typed the same thing!\n");
else
printf("You typed different things!\n");
return 0;
}
string
data type is actually just a synonym for char *
, which is a pointer, or a memory address. Thus, the program above is actually comparing two memory addresses, not two strings.GetString
doesn’t actually return the string itself, but rather the value 0, the address of the first character. Knowing the address of the first character, the operating system can find the rest of the string simply by iterating until it hits the special null terminator character \0
which denotes the end of a string.