1 00:00:00,000 --> 00:00:08,532 >> [MUSIC PLAYING] 2 00:00:08,532 --> 00:00:12,060 >> ZAMYLA CHAN: The first thing you might notice about find is that we already 3 00:00:12,060 --> 00:00:13,450 have code written for us. 4 00:00:13,450 --> 00:00:15,160 This is called distribution code. 5 00:00:15,160 --> 00:00:18,000 So we're not just writing our own code from scratch anymore. 6 00:00:18,000 --> 00:00:22,800 Rather, we're filling in the voids in some pre-existing code. 7 00:00:22,800 --> 00:00:27,790 >> The find.c program prompts for numbers to fill the haystack, searches the 8 00:00:27,790 --> 00:00:32,189 haystack for a user submitted needle, and it does this by calling sort and 9 00:00:32,189 --> 00:00:35,590 search, functions defined in helpers.c. 10 00:00:35,590 --> 00:00:37,670 So find.c is written already. 11 00:00:37,670 --> 00:00:40,770 Your job is to write helpers. 12 00:00:40,770 --> 00:00:41,870 >> So what are we doing? 13 00:00:41,870 --> 00:00:44,210 We're implementing two functions. 14 00:00:44,210 --> 00:00:49,030 Search, which returns true if a value is found in the haystack, returning 15 00:00:49,030 --> 00:00:51,370 false if the value is not in the haystack. 16 00:00:51,370 --> 00:00:57,990 And then we're also implementing sort which sorts the array called values. 17 00:00:57,990 --> 00:00:59,960 >> So let's tackle search. 18 00:00:59,960 --> 00:01:04,560 Search is currently implemented as a linear search, but you can do much 19 00:01:04,560 --> 00:01:05,550 better than that. 20 00:01:05,550 --> 00:01:09,910 Linear search is implemented in O of n time, which is quite slow. 21 00:01:09,910 --> 00:01:13,850 Although, it can search any list given to it. 22 00:01:13,850 --> 00:01:20,130 Your job is to implement binary search, which has run time O of log n. 23 00:01:20,130 --> 00:01:21,130 That's pretty fast. 24 00:01:21,130 --> 00:01:23,170 >> But there's a stipulation. 25 00:01:23,170 --> 00:01:27,600 Binary search can only search through pre-sorted lists. 26 00:01:27,600 --> 00:01:30,370 Why is that? 27 00:01:30,370 --> 00:01:32,620 >> Well let's look at an example. 28 00:01:32,620 --> 00:01:36,280 Given an array of values, the haystack, we're going to be looking 29 00:01:36,280 --> 00:01:37,130 for a needle. 30 00:01:37,130 --> 00:01:40,460 And in this example, the integer three. 31 00:01:40,460 --> 00:01:44,130 The way that binary search works is that we compare the middle value of 32 00:01:44,130 --> 00:01:48,370 the array to the needle, much like how we opened a phonebook to the middle 33 00:01:48,370 --> 00:01:50,660 page in week zero. 34 00:01:50,660 --> 00:01:54,650 >> So after comparing the middle value to the needle, you can discard either the 35 00:01:54,650 --> 00:01:58,530 left or the right half of the array by tightening your bounds. 36 00:01:58,530 --> 00:02:03,390 In this case, since three, our needle, is less than 10, the middle value, the 37 00:02:03,390 --> 00:02:05,990 right bound can decrease. 38 00:02:05,990 --> 00:02:08,400 But try to make your bounds as tight as possible. 39 00:02:08,400 --> 00:02:11,630 If the middle value isn't the needle, then you know that you don't need to 40 00:02:11,630 --> 00:02:13,010 include it in your search. 41 00:02:13,010 --> 00:02:17,310 So you're right bound can tighten the search bounds just a tiny bit more, 42 00:02:17,310 --> 00:02:21,770 and so on and so forth until you find your needle. 43 00:02:21,770 --> 00:02:23,480 >> So what does the pseudocode look like? 44 00:02:23,480 --> 00:02:28,420 Well while we're still looking through the list and still have elements to 45 00:02:28,420 --> 00:02:33,690 look in, we take the middle of the list, and compare that middle value to 46 00:02:33,690 --> 00:02:34,950 our needle. 47 00:02:34,950 --> 00:02:37,310 If they're equal, then that means we've found the needle and we can 48 00:02:37,310 --> 00:02:38,990 return true. 49 00:02:38,990 --> 00:02:42,870 >> Otherwise, if the needle is less than the middle value, then that means we 50 00:02:42,870 --> 00:02:47,280 can discard the right half, and just search the left side of the array. 51 00:02:47,280 --> 00:02:51,090 Otherwise, we'll search the right side of the array. 52 00:02:51,090 --> 00:02:54,410 And at the end, if you don't have any more elements left to search but you 53 00:02:54,410 --> 00:02:58,050 haven't found your needle yet, then you return false because the needle 54 00:02:58,050 --> 00:03:01,890 definitely isn't in the haystack. 55 00:03:01,890 --> 00:03:05,270 >> Now a neat thing about this pseudocode in binary search is that it can be 56 00:03:05,270 --> 00:03:09,940 interpreted as either an iterative or recursive implementation. 57 00:03:09,940 --> 00:03:13,810 So it would be recursive if you called the search function within the search 58 00:03:13,810 --> 00:03:17,350 function on either half of the array. 59 00:03:17,350 --> 00:03:21,030 We'll cover recursion a bit later in the course, but do know that it is an 60 00:03:21,030 --> 00:03:24,190 option if you'd like to try. 61 00:03:24,190 --> 00:03:26,030 >> Now let's look at sort. 62 00:03:26,030 --> 00:03:30,750 Sort takes an array and the integer n, which is the size of the array. 63 00:03:30,750 --> 00:03:34,030 Now there are various different types of sorts, and you can look at some 64 00:03:34,030 --> 00:03:36,370 shorts for demos and explanations. 65 00:03:36,370 --> 00:03:39,580 The return type for our sort function is void. 66 00:03:39,580 --> 00:03:43,580 So that means that we're not going to return any array from sort. 67 00:03:43,580 --> 00:03:48,140 We're actually going to change the very array that was passed into us. 68 00:03:48,140 --> 00:03:52,290 >> And that's possible because arrays are passed by reference in C. Now we'll 69 00:03:52,290 --> 00:03:55,290 see more about this later, but the essential difference between passing 70 00:03:55,290 --> 00:03:59,340 in something like an integer and passing in an array, is that when you 71 00:03:59,340 --> 00:04:03,490 pass in an integer, C is just going to make a copy of that integer and pass 72 00:04:03,490 --> 00:04:04,450 it to the function. 73 00:04:04,450 --> 00:04:08,530 The original variable won't be changed once the function is finished. 74 00:04:08,530 --> 00:04:12,480 With an array, on the other hand, it's not going to make a copy, and you'll 75 00:04:12,480 --> 00:04:17,910 actually be editing the very array itself. 76 00:04:17,910 --> 00:04:21,269 >> So one type of sort is the selection sort. 77 00:04:21,269 --> 00:04:24,750 The selection sort works by starting at the beginning, and then you iterate 78 00:04:24,750 --> 00:04:26,820 over and find the smallest element. 79 00:04:26,820 --> 00:04:30,710 And then you swap that smallest element with the first one. 80 00:04:30,710 --> 00:04:34,360 And then you move to the second element , find the next smallest 81 00:04:34,360 --> 00:04:38,320 element, and then swap that with the second element in the array because 82 00:04:38,320 --> 00:04:41,100 the first element is already sorted. 83 00:04:41,100 --> 00:04:45,370 And so then you continue for every element in identifying the smallest 84 00:04:45,370 --> 00:04:47,690 value and swapping it out. 85 00:04:47,690 --> 00:04:53,460 >> For I equals 0, the very first element to n minus 1, you're going to compare 86 00:04:53,460 --> 00:04:57,820 every next value after that and find the index of the minimum value. 87 00:04:57,820 --> 00:05:02,520 Once you find the minimum value index, you can swap that value of array 88 00:05:02,520 --> 00:05:05,930 minimum and array I. 89 00:05:05,930 --> 00:05:09,760 >> Another type of sort that you can implement is bubble sort. 90 00:05:09,760 --> 00:05:14,380 So bubble sort iterates over the list comparing adjacent elements and 91 00:05:14,380 --> 00:05:17,720 swapping the elements that are in the wrong order. 92 00:05:17,720 --> 00:05:22,380 And this way, the largest element will bubble to the end. 93 00:05:22,380 --> 00:05:28,070 And the list is sorted once no more elements have been swapped. 94 00:05:28,070 --> 00:05:31,920 >> So those are two examples of sort algorithms that you can implement for 95 00:05:31,920 --> 00:05:33,230 the find program. 96 00:05:33,230 --> 00:05:37,350 Once you finish sort, and you've done search, you're finished. 97 00:05:37,350 --> 00:05:39,720 My name is Zamyla, and this is CS50. 98 00:05:39,720 --> 00:05:46,987 >> [MUSIC PLAYING]