BRIAN YU: In this lab, your task is going to be to determine which sorting algorithm various different programs use. We've been introduced, now, to three different sorting algorithms-- selection sort, bubble sort, and merge sort. Recall that in selection sort, the way the algorithm works is by taking an array of numbers and repeating some process-- the process of looking through that entire array of numbers to try and find the smallest item in that entire array, and then bringing that smallest item to the beginning of the array. 

Then we search through the remainder of the array, looking for the next smallest item, and when we find it, we bring that next smallest item back to the beginning of the array, and we repeat that process, continually looking through the remaining unsorted portion of the array, finding the smallest element, and swapping it with the element in the next position. Once we repeat that process through the entire array, we'll end up with a sorted array. 

Meanwhile, bubble sort works a little bit differently. It also makes passes through the array, but it compares two values at a time. It looks at every pair of values and tries to determine whether that pair of numbers is in the correct order. 

That is to say, if you're trying to sort from smallest to largest, making sure that for any pair of two numbers, the smaller number is to the left of the larger number. And what bubbles sort does is that if it finds two numbers that are out of order, then it swaps them. And bubble sort will continually go through this process of considering pairs of values and swapping them if necessary, until it gets to the end of the array. 

Finally, merge sort works fundamentally differently. It works by taking an array and dividing it into a left half and a right half, and sorting each of those halves first, and after we've recursively sorted each of those halves, we merge those halves back together. By repeating this process recursively again and again, we can build up an entire sorted array very quickly. 

Each of these algorithms can be analyzed in terms of their running time, in terms of big O-- the upper bound on the number of steps required to complete the sort-- and as well as big omega-- the lower bound on the number of steps required to complete the sort. In this case, we can see that for a selection sort, it has both a big O and a big omega of n squared. That is to say that if there are n numbers that we are trying to sort, it might take about n squared steps in order to actually complete that sorting algorithm. 

Bubble sort, meanwhile, also has a big O of n squared. As an upper bound, it still might take on the order of n squared steps to sort n numbers, but it has a big omega of n. That is to say, if we were to get lucky and we were given an array that's already sorted, for example, bubble sort could sort n items and using just n steps by making one pass through the array, concluding that no swaps are necessary, and then not continuing on with anything else, because the array is already sorted. 

Meanwhile, merge sort has a running time of big O of n log n, and big omega of n log n as well, meaning that if there are n numbers that we are trying to sort, this algorithm is going to take on the order of n times log n steps in order to sort all of those numbers. That's better than n squared, though not nearly as good as big omega of n, for example. 

Your task now, in this lab, is that we are going to give you three programs-- sort1, sort2, and sort3. One of them uses selection sort. One of them uses bubble sort. And one of them uses merge sort, but we're not telling you which is which. 

Your task is going to be to run these sorting algorithm programs on various different inputs-- lists of numbers of different sizes and different orders-- and to try to determine which algorithm corresponds to which program. The files we give you are going to be these. We'll give you a file called random5000.txt, for example, which will be a text file that contains 5,000 numbers in random order. Likewise, random10000.txt and random50000.txt will contain 10,000 50,000 numbers, respectively, all in random order. 

In addition to files in random order, we'll also give you files called reversed5000.txt and reverse10000.txt, and so forth, each of which will also contain numbers, but this time in reverse order, from largest to smallest. And finally, we also give you sorted5000 and sorted10000.txt, which again will be text files that just contain numbers, one on each line, but this time the numbers will already be in sorted order from smallest to largest. 

Let's take a look at what you might do with these files. Here's the distribution code for the sort lab. You'll notice that we have our three programs-- sort1, sort2, and sort3-- and we also have these files. Random5000.txt, for example, is going to be a file with 5,000 lines, each with numbers, where those numbers are not in any particular order. 

Meanwhile, reversed5000.txt is going to be a text file that also has 5,000 numbers, but this time they're in reverse order, from largest to smallest. And finally, files like sorted5000.txt will give you 5,000 lines of numbers, where they're in sorted order in ascending order from smallest, all the way to largest. 

You can then use these sorting programs-- sort1, sort2, and sort3-- to try to see how you would go about sorting, and how long it might take to sort any of these files. For example, I might run ./sort1 on, let's say, random5000.txt. And what you'll see when I run that program is the program will sort those numbers using some sorting algorithm, and then print those numbers in sorted order from smallest to largest. 

Now, how long did that take? It turns out there's also a program called time that can time how long it takes for some other program to run. So I might say time ./sort1, and then random5000.txt, and if I run that, after everything is sorted, I'll see that the amount of time that took in real time was 0.168 seconds. And depending on which sorting program I use, and depending on what file I run that sorting algorithm on, and the number of values that are in that file, as well as what order that file is already in, I might get different values for the amount of time it takes to sort and print out all of those numbers. 

Using that information, your challenge is to solve that puzzle, to figure out which algorithm does sort1 use, and which algorithm does sort2 use, and which algorithm does sort3 use? Again, you can test any of these sorting programs by running ./sort1 or ./sort2 sort3 on a text file that's provided as a command line argument. 

For example, in this case, we're running sort1 on reversed10000.txt. And to time how long any of these commands take, all you need to do is prepend the time command in front of whatever command you would otherwise want to run. For example, here we're timing how long it takes to use sort1 to sort and print out all of the values in reversed10000.txt. 

Ultimately, what you'll need to do is test and time each of these programs on some of the sample files that we give you. Don't have to test it on all of them, but you'll likely want to test on at least a few to get a sense for how these programs behave depending on the size of the input, and depending on how well-ordered the values inside of that text file already are. 

And using that information, you should be able to determine which program corresponds to which sorting algorithm by taking advantage of what you know about the running times of those sorting algorithms. If one sorting algorithm has a much better big omega then it does big O, then you might expect that it would be much faster on a text file that's already sorted, for example. 

And you might examine what happens with each of these programs as you try to run them on larger and larger inputs of inputs that have more and more values to see how long that takes in comparison to files that have fewer values. Using all of that as your clues and information, you should be able to piece together and draw conclusions about which of these sorting algorithms we're using in each of those programs. My name is Brian, and this was sort.