Implement several O(n2) sorting algorithms.
Experience principles of abstraction.
Write code that must conform into an existing framework.
Experience how real-world run time of algorithms differs from theoretical run time.
Page 17 of http://www.howstuffworks.com/c.htm.
Chapters 20 and 23 of Absolute Beginner’s Guide to C.
Chapters 13, 15, and 18 of Programming in C.
This course’s philosophy on academic honesty is best stated as "be reasonable." The course recognizes that interactions with classmates and others can facilitate mastery of the course’s material. However, there remains a line between enlisting the help of another and submitting the work of another. This policy characterizes both sides of that line.
The essence of all work that you submit to this course must be your own. Collaboration on problems is not permitted (unless explicitly stated otherwise) except to the extent that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking, when asking for help, you may show your code or writing to others, but you may not view theirs, so long as you and they respect this policy’s other constraints. Collaboration on quizzes and tests is not permitted at all. Collaboration on the final project is permitted to the extent prescribed by its specification.
Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to whether some act is reasonable, do not commit it until you solicit and receive approval in writing from your instructor. If a violation of this policy is suspected and confirmed, your instructor reserves the right to impose local sanctions on top of any disciplinary outcome that may include an unsatisfactory or failing grade for work submitted or for the course itself.
Communicating with classmates about problems in English (or some other spoken language).
Discussing the course’s material with others in order to understand it better.
Helping a classmate identify a bug in his or her code, such as by viewing, compiling, or running his or her code, even on your own computer.
Incorporating snippets of code that you find online or elsewhere into your own code, provided that those snippets are not themselves solutions to assigned problems and that you cite the snippets' origins.
Reviewing past years' quizzes, tests, and solutions thereto.
Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug.
Sharing snippets of your own solutions to problems online so that others might help you identify and fix a bug or other issue.
Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problems or your own final project.
Whiteboarding solutions to problems with others using diagrams or pseudocode but not actual code.
Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you.
Accessing a solution to some problem prior to (re-)submitting your own.
Asking a classmate to see his or her solution to a problem before (re-)submitting your own.
Decompiling, deobfuscating, or disassembling the staff’s solutions to problems.
Failing to cite (as with comments) the origins of code, writing, or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints.
Giving or showing to a classmate a solution to a problem when it is he or she, and not you, who is struggling to solve it.
Looking at another individual’s work during a quiz or test.
Paying or offering to pay an individual for work that you may submit as (part of) your own.
Providing or making available solutions to problems to individuals who might take this course in the future.
Searching for, soliciting, or viewing a quiz’s questions or answers prior to taking the quiz.
Searching for or soliciting outright solutions to problems online or elsewhere.
Splitting a problem’s workload with another individual and combining your work (unless explicitly authorized by the problem itself).
Submitting (after possibly modifying) the work of another individual beyond allowed snippets.
Submitting the same or similar work to this course that you have submitted or will submit to another.
Using resources during a quiz beyond those explicitly allowed in the quiz’s instructions.
Viewing another’s solution to a problem and basing your own solution on it.
Your work on this problem set will be evaluated along four axes primarily.
To what extent does your code implement the features required by our specification?
To what extent is your code consistent with our specifications and free of bugs?
To what extent is your code written well (i.e., clearly, efficiently, elegantly, and/or logically)?
To what extent is your code readable (i.e., commented and indented with variables aptly named)?
To obtain a passing grade in this course, all students must ordinarily submit all assigned problems unless granted an exception in writing by the instructor.
First, refresh your memory on the O(n2) sorting algorithms we’ve learned about so far in the course. Here are Jackson and Tommy to explain:
Before moving on, be sure you’re comfortable answering the following questions:
Why is bubble sort in O(n2)?
Why is insertion sort in Ω(n)?
How does selection sort work?
As always, first log into your CS50 IDE at cs50.io and execute
within a terminal window to make sure your workspace is up-to-date. Next, execute
at your prompt to ensure that you’re inside of the
unit3 directory within your
workspace directory. Then execute
to download a ZIP of this problem’s distro into your workspace. You should see a bunch of output followed by:
Confirm that you’ve indeed downloaded
race.zip by executing
and then run
to unzip the file. If you then run
ls again, you should see that you have a newly unzipped directory called
race as well. You can now delete the ZIP, with:
rm -f race.zip
Careful! We’ve added the
-f flag this time, so
rm will not confirm that you want to delete the file. If you like the comfort of having the system double-check with you, just omit
-f from your command. Lastly, execute
and you should see that the directory contains four files:
Makefile helpers.c helpers.h race.o
Off we go!
In this problem, you’ll be racing the three O(n2) sorting algorithms we’ve seen under a few different test conditions, to see how they perform against one another. Those test conditions will be:
arrays that are almost sorted, with two elements out of place,
arrays in reverse order,
arrays in completely random order, and
arrays that are already sorted.
The good news is that you don’t have to implement anything involving populating the arrays! You only have to do a tiny amount of command-line validation and implement the three sorting algorithms themselves.
But we’re getting a bit ahead of ourselves. First we need to deal with the contents of the directory you just unzipped. If you completed Problem 3-2 using the partial staff solution, this will be old news to you, but if not, have a peek at the
Makefile we’ve prepared for you. In particular, focus on this portion.
race: race.o helpers.c helpers.h clang -ggdb3 -O0 -std=c11 -Wall -Werror -o race race.o helpers.c -lcs50 -lm
Per the dependencies implied above (after the colon), any changes to either
helpers.h will compel
make to rebuild
race the next time it’s invoked for this target. In other words, this means that
race is not simply comprised of a single source file, but rather of three separate files.
helpers.h you probably can figure out. But what the heck is
race.o? A refresher on the compilation process might be in order here, first. Take it away, Rob:
Interesting… so it turns out that
race.o contains the object code that was generated from a source file (presumably called
race.c) that we wrote and partially compiled, but then chose not to include in the distro. We made this choice on the one hand because the code therein is a bit complicated at this stage of the course (thus allowing us to abstract some of the complex detail), but further because it provides you with an opportunity to write code that simply must conform to a precise specification, or it won’t work!
It turns out that
race.o contains the object code for, among other things,
main. And if you can’t change
main then you can’t change the way that
main calls any functions, including the functions you’ll be tasked with writing in this problem. Bummer!
If this seems unfair, know that it’s also a really good indicator of what you’ll experience in the real world if you continue with programming as a career. Often times large groups of people collaborate on a single project, and there are standards and specifications that must be adhered to so that everyone’s components interoperate smoothly. Veering from those standards will mean your code is incompatible with the project at large, which will mean you will have wasted some of your valuable time (and probably irritated your colleagues, too)!
Incidentally, because it is not a so-called "target" specified in the
Makefile, if when working on this problem you (inadvertently or intentionally) try to
you’ll actually default to using the standard
Makefile included with CS50 IDE which will just try to compile the
helpers.c file alone into its own program. Problem is, if you open up
helpers.c, there’s no
main function, so you’ll probably get a whole bunch of cryptic error messages concluding with one along these lines:
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start': (.text+0x20): undefined reference to `main'
So do be sure that whenever you try to compile this program, you do so with
or, in fact, because of the
all target specified in
Makefile, you could also just
or, in fact, because
race is the first target listed in
Makefile and absent any other command-line arguments supplied to
make it will simply default to compiling the first target listed in the
Makefile, you can even say just
All of the work you’ll be doing in this problem will be confined to
helpers.c and possibly
helpers.h. In particular, you have to implement the four functions prototyped therein:
If you try to compile
race from the distro and run it without any command line arguments, you’re immediately notified of the proper usage of the program.
Usage: ./race array-type size
And then, if you supply it with three command line arguments, regardless of what those arguments are, you’ll see the following:
Invalid array-type. Must be -a, -b, -r, or -s
Why? Because right now if you have a look at
helpers.c, you’ll see that it always returns
false. But eventually what
check_flag should do, per the comment atop its prototype, is return
true if the argument passed in (which happens to be
-s, and return
false if the argument passed in is anything other than that.
Does the format of those strings remind you of anything you’ve seen recently? Recall the command we recommended you use to get rid of
rm -f race.zip
In this case, we would term
-f a flag, which is just another way of describing a command-line argument to a particular program or Linux command (in this case,
rm) that modifies the behavior of that program. In the case of
rm, that flag tells
rm to not confirm with you whether you intend to delete the file(s) in question; it just deletes them right away.
check_flag is doing is confirming whether
argv is one of those four things. If doesn’t report out which one it is, just that it’s one of them. Odds are there’s a function that might help with checking that.
Incidentally, what do these four flags represent? They determine what type of array will be the test case for the various sorting algorithms:
-afor almost sorted arrays. These arrays are already sorted except for two elements which have been randomly switched.
-bfor backwards arrays. These arrays are sorted, but in reverse order: left-to-right, largest-to-smallest.
-rfor random arrays. These arrays have no particular order.
-sfor sorted arrays. These arrays are already properly sorted in order from left-to-right, smallest-to-largest.
Again, you needn’t worry about implementing the functionality of populating the arrays. That was done by us, and the object code resulting from that implementation lives in
In the functions
insertion you will be implementing, respectively, bubble sort, selection sort, and insertion sort. Remember that you are not allowed to modify the prototypes of
insertion, but you are welcome to create any additional "helper" functions that you wish, placing their prototypes in
helpers.h and their definitions in
We’re not going to give you much more than that! But do make sure to implement all three algorithms which, per the shorts atop this specification, behave quite differently even though all three have the same ultimate result. As a tip, you may want to start with fairly small
argv) arguments, and you may want to do some debugging to ensure that your sorting algorithms are actually sorting the array properly.
Once you’ve implemented
check_flag, if you try to run your program you’ll see that when it runs you get output like the following:
bubble sort benchmark: 0.000 seconds selection sort benchmark: 0.000 seconds insertion sort benchmark: 0.000 seconds
So this is where the "race" really happens. Time to see which of these algorithms is the fastest. But… wait? Aren’t they all O(n2) algorithms? Shouldn’t they all run at exactly the same speed? Well, not exactly.
Theoretically, as n gets larger and larger, yes, these three algorithms will tend to run at closer and closer speeds. But theoretical runtime is not the same as real-world runtime, and so under average and varying test conditions, the performance of these three algorithms will differ, sometimes substantially. For example, see the below wherein underlined text represents user input to the program.
username@ide50:~/workspace/unit3/race $ ./race -b 2000 bubble sort benchmark: 0.012 seconds selection sort benchmark: 0.008 seconds insertion sort benchmark: 0.004 seconds
Not too much of a difference. But what about
username@ide50:~/workspace/unit3/race $ ./race -r 100000 bubble sort benchmark: 52.032 seconds selection sort benchmark: 28.944 seconds insertion sort benchmark: 8.808 seconds
Ouch! Or lastly
username@ide50:~/workspace/unit3/race $ ./race -s 20000 bubble sort benchmark: 0.000 seconds selection sort benchmark: 0.592 seconds insertion sort benchmark: 0.000 seconds
Hmm… selection sort still took that much time to "sort" an already-sorted array? Is the difference between Ω(n2) and Ω(n) now a bit more clear?
And know that because of varying processor performance and system load, under otherwise-identical conditions from run-to-run the running time of these algorithms may vary somewhat.
check50 is not capable of detecting whether you are implementing bubble, selection, or insertion sort correctly. It is only capable of determining whether your output is indeed sorted. Because the crux of this problem lies in implementing these sorts correctly, we leave it to you (and GDB!) to ensure that your three functions are implemented properly.
To run the staff solution, simply execute:
passing in appropriate command-line arguments.
In no part of this problem are you expected to optimize your runtimes for any of these algorithms (beyond, of course, implementing them correctly). Rather, after you get them implemented you should test different arrays of different sizes and different configurations to see under which circumstances each algorithm "shines". So you can see actual differences between these algorithms, we recommend that your
size argument be at least
1000, as that way they’ll tend to take at least a few thousandths of a second to sort.
This was Problem 3-4.