1 00:00:00,000 --> 00:00:12,350 >> [MUSIC PLAYING] 2 00:00:12,350 --> 00:00:13,050 >> ROB BOWDEN: Hi. 3 00:00:13,050 --> 00:00:13,640 I'm Rob. 4 00:00:13,640 --> 00:00:16,210 And let's this solution out. 5 00:00:16,210 --> 00:00:20,070 So here we're going to implement a general table. 6 00:00:20,070 --> 00:00:24,090 We see that the struct node of our table is going to look like this. 7 00:00:24,090 --> 00:00:28,710 So it's going to have a char word array of size LENGTH + 1. 8 00:00:28,710 --> 00:00:32,259 Don't forget the + 1, since the maximum word in the dictionary is 45 9 00:00:32,259 --> 00:00:33,130 characters. 10 00:00:33,130 --> 00:00:37,070 And then we're going to need one extra character for the backslash zero. 11 00:00:37,070 --> 00:00:40,870 >> And then our hashtable in each bucket is going to store a 12 00:00:40,870 --> 00:00:42,320 linked list of nodes. 13 00:00:42,320 --> 00:00:44,420 We are not doing linear probing here. 14 00:00:44,420 --> 00:00:48,430 And so in order to link to the next element in the bucket, we need a 15 00:00:48,430 --> 00:00:50,390 struct node* next. 16 00:00:50,390 --> 00:00:51,110 OK. 17 00:00:51,110 --> 00:00:53,090 So that's what a node looks like. 18 00:00:53,090 --> 00:00:56,180 >> Now here is the declaration of our hashtable. 19 00:00:56,180 --> 00:00:59,640 It's going to have 16,834 buckets. 20 00:00:59,640 --> 00:01:01,910 But that number doesn't really matter. 21 00:01:01,910 --> 00:01:05,450 And finally, we're going to have the global variable hashtable size, which 22 00:01:05,450 --> 00:01:07,000 is going to start off as zero. 23 00:01:07,000 --> 00:01:10,760 And it's going to keep track of how many words are in our dictionary. 24 00:01:10,760 --> 00:01:13,710 >> So let's take a look at load. 25 00:01:13,710 --> 00:01:16,390 Notice that load, it returns a bool. 26 00:01:16,390 --> 00:01:20,530 You return true if it was successfully loaded, and false otherwise. 27 00:01:20,530 --> 00:01:23,990 And it takes a const char* dictionary, which is the dictionary 28 00:01:23,990 --> 00:01:25,280 that we want to open. 29 00:01:25,280 --> 00:01:27,170 So that's the first thing we're going to do. 30 00:01:27,170 --> 00:01:29,500 >> We're going to fopen the dictionary for reading. 31 00:01:29,500 --> 00:01:31,680 And we're going to have to make sure that it succeeded. 32 00:01:31,680 --> 00:01:35,920 So if it returned NULL, then we did not successfully open the dictionary. 33 00:01:35,920 --> 00:01:37,440 And we need to return false. 34 00:01:37,440 --> 00:01:41,580 But assuming that it did successfully open, then we want to read the 35 00:01:41,580 --> 00:01:42,400 dictionary. 36 00:01:42,400 --> 00:01:46,450 So keep looping until we find some reason to break out of this loop, 37 00:01:46,450 --> 00:01:47,570 which we'll see. 38 00:01:47,570 --> 00:01:48,920 So keep looping. 39 00:01:48,920 --> 00:01:51,780 >> And now we're going to malloc a single node. 40 00:01:51,780 --> 00:01:54,020 And of course we need to air check again. 41 00:01:54,020 --> 00:01:58,680 So if mallocing did not succeed, then we want to unload any node that we 42 00:01:58,680 --> 00:02:02,590 happened to malloc before, close the dictionary and return false. 43 00:02:02,590 --> 00:02:06,830 But ignoring that, assuming we succeeded, then we want to use fscanf 44 00:02:06,830 --> 00:02:12,400 to read a single word from our dictionary into our node. 45 00:02:12,400 --> 00:02:17,940 So remember that entry>word is the char word buffer of size LENGHTH + 1 46 00:02:17,940 --> 00:02:20,300 that we're going to store the word in. 47 00:02:20,300 --> 00:02:25,070 >> So fscanf is going to return 1, as long as it was able to successfully 48 00:02:25,070 --> 00:02:26,750 read a word from the file. 49 00:02:26,750 --> 00:02:30,460 If either an error happens, or we reach the end of the file, it 50 00:02:30,460 --> 00:02:31,950 will not return 1. 51 00:02:31,950 --> 00:02:35,180 In which case it does not return 1, we're finally going to break out of 52 00:02:35,180 --> 00:02:37,280 this while loop. 53 00:02:37,280 --> 00:02:42,770 So we see that once we have successfully read a word into 54 00:02:42,770 --> 00:02:48,270 entry>word, then we're going to that word using our hash function. 55 00:02:48,270 --> 00:02:49,580 >> Let's take a look at the hash function. 56 00:02:49,580 --> 00:02:52,430 57 00:02:52,430 --> 00:02:55,610 So you don't really need to understand this. 58 00:02:55,610 --> 00:02:59,460 And actually we just pulled this hash function from the internet. 59 00:02:59,460 --> 00:03:04,010 The only thing you need to recognize is that this takes a const char* word. 60 00:03:04,010 --> 00:03:08,960 So it's taking a string as input, and returning an unsigned int as output. 61 00:03:08,960 --> 00:03:12,360 So that's all a hash function is, is it takes in an input and gives you an 62 00:03:12,360 --> 00:03:14,490 index into the hashtable. 63 00:03:14,490 --> 00:03:18,530 >> Notice that we're moding by NUM_BUCKETS, so that value returned 64 00:03:18,530 --> 00:03:21,730 actually is an index into the hashtable and doesn't index beyond the 65 00:03:21,730 --> 00:03:24,320 bounds of the array. 66 00:03:24,320 --> 00:03:28,060 So given that function, we're going to hash the word that we read the 67 00:03:28,060 --> 00:03:29,390 dictionary. 68 00:03:29,390 --> 00:03:31,700 And then we're going to use that hash to insert the 69 00:03:31,700 --> 00:03:33,750 entry into the hashtable. 70 00:03:33,750 --> 00:03:38,520 >> Now hashtable hash is the current linked list in the table. 71 00:03:38,520 --> 00:03:41,410 And it's very possible that it's just NULL. 72 00:03:41,410 --> 00:03:44,960 We want to insert our entry at the beginning of this linked list. 73 00:03:44,960 --> 00:03:48,600 And so we're going to have our current entry point to what the hashtable 74 00:03:48,600 --> 00:03:50,380 currently points to. 75 00:03:50,380 --> 00:03:53,310 And then we're going to store, in the hashtable at the 76 00:03:53,310 --> 00:03:55,350 hash, the current entry. 77 00:03:55,350 --> 00:03:59,320 So these two lines successfully insert the entry at the beginning of the 78 00:03:59,320 --> 00:04:02,260 linked list at that index in the hashtable. 79 00:04:02,260 --> 00:04:04,900 >> Once we're done with that, we know that we found another word in the 80 00:04:04,900 --> 00:04:07,790 dictionary, and we increment again. 81 00:04:07,790 --> 00:04:13,960 So we keep doing that until fscanf finally returned something non-1 at 82 00:04:13,960 --> 00:04:16,950 which point remember that we need to free entry. 83 00:04:16,950 --> 00:04:19,459 So up here we malloced an entry. 84 00:04:19,459 --> 00:04:21,329 And we tried to read something from the dictionary. 85 00:04:21,329 --> 00:04:23,910 And we did not successfully read something from the dictionary, in 86 00:04:23,910 --> 00:04:26,650 which case we need to free the entry that we never actually put into the 87 00:04:26,650 --> 00:04:29,140 hashtable, and finally break. 88 00:04:29,140 --> 00:04:32,750 >> Once we break out we need to see, well, did we break out because there 89 00:04:32,750 --> 00:04:34,360 was an error reading from the file? 90 00:04:34,360 --> 00:04:37,120 Or did we break out because we reached the end of the file? 91 00:04:37,120 --> 00:04:39,480 If there was an error, then we want to return false. 92 00:04:39,480 --> 00:04:40,930 Because load did not succeed. 93 00:04:40,930 --> 00:04:43,890 And in the process we want to unload all the words that we read in, and 94 00:04:43,890 --> 00:04:45,670 close the dictionary file. 95 00:04:45,670 --> 00:04:48,740 >> Assuming we did succeed, then we just still need to close the dictionary 96 00:04:48,740 --> 00:04:53,040 file, and finally return true since we successfully loaded the dictionary. 97 00:04:53,040 --> 00:04:54,420 And that's it for load. 98 00:04:54,420 --> 00:04:59,020 So now check, given a loaded hashtable, is going to look like this. 99 00:04:59,020 --> 00:05:03,140 So check, it returns a bool, which is going to indicate whether the passed 100 00:05:03,140 --> 00:05:07,530 in char* word, whether the passed in string is in our dictionary. 101 00:05:07,530 --> 00:05:09,890 So if it is in the dictionary, if it is in our hashtable, 102 00:05:09,890 --> 00:05:11,170 we will return true. 103 00:05:11,170 --> 00:05:13,380 And if it's not, we will return false. 104 00:05:13,380 --> 00:05:17,740 >> Given this passed in word, we're going to hash the word. 105 00:05:17,740 --> 00:05:22,110 Now an important thing to recognize is that in load we knew that all of the 106 00:05:22,110 --> 00:05:23,820 words we're going to be lower case. 107 00:05:23,820 --> 00:05:25,820 But here we're not so sure. 108 00:05:25,820 --> 00:05:29,510 If we take a look at our hash function, our hash function actually 109 00:05:29,510 --> 00:05:32,700 is lower casing each character of the word. 110 00:05:32,700 --> 00:05:37,940 So regardless of the capitalization of word, our hash function is the return 111 00:05:37,940 --> 00:05:42,270 the same index for whatever the capitalization is, as it would have 112 00:05:42,270 --> 00:05:45,280 returned for a completely lowercase version of the word. 113 00:05:45,280 --> 00:05:46,600 Alright. 114 00:05:46,600 --> 00:05:49,790 That's our index is into the hashtable for this word. 115 00:05:49,790 --> 00:05:52,940 >> Now this for loop is going to iterate over the linked list 116 00:05:52,940 --> 00:05:55,000 that was at that index. 117 00:05:55,000 --> 00:05:59,610 So notice we are initializing entry to point to that index. 118 00:05:59,610 --> 00:06:02,750 We're going to continue while entry != NULL. 119 00:06:02,750 --> 00:06:07,770 And remember that updating the pointer in our linked list entry=entry>next. 120 00:06:07,770 --> 00:06:14,400 So have our current entry point to the next item in the linked list. 121 00:06:14,400 --> 00:06:19,250 >> So for each entry in the linked list, we're going to use strcasecmp. 122 00:06:19,250 --> 00:06:20,330 It's not strcomp. 123 00:06:20,330 --> 00:06:23,780 Because once again, we want to do things case insensitively. 124 00:06:23,780 --> 00:06:27,870 So we use strcasecmp to compare the word that was passed through this 125 00:06:27,870 --> 00:06:31,860 function against the word that is in this entry. 126 00:06:31,860 --> 00:06:35,570 If it returns zero, that means there was a match, in which case we want to 127 00:06:35,570 --> 00:06:36,630 return true. 128 00:06:36,630 --> 00:06:39,590 We successfully found the word in our hashtable. 129 00:06:39,590 --> 00:06:43,040 >> If there was not a match, then we're going to loop again and look at the 130 00:06:43,040 --> 00:06:43,990 next entry. 131 00:06:43,990 --> 00:06:47,640 And we'll continue looping while there are entries in this linked list. 132 00:06:47,640 --> 00:06:50,160 What happens if we break out of this for loop? 133 00:06:50,160 --> 00:06:55,110 That means we did not find an entry that matched this word, in which case 134 00:06:55,110 --> 00:07:00,220 we return false to indicate that our hashtable did not contain this word. 135 00:07:00,220 --> 00:07:02,540 And that's a check. 136 00:07:02,540 --> 00:07:04,790 >> So let's take a look at size. 137 00:07:04,790 --> 00:07:06,970 Now size is going to be pretty simple. 138 00:07:06,970 --> 00:07:11,080 Since remember in load, for each word we found, we incremented a global 139 00:07:11,080 --> 00:07:12,880 variable hashtable size. 140 00:07:12,880 --> 00:07:16,480 So the size function is just going to return global variable. 141 00:07:16,480 --> 00:07:18,150 And that's it. 142 00:07:18,150 --> 00:07:22,300 >> Now finally, we need to unload the dictionary once everything's done. 143 00:07:22,300 --> 00:07:25,340 So how are we going to do that? 144 00:07:25,340 --> 00:07:30,440 Right here we're looping over all buckets of our table. 145 00:07:30,440 --> 00:07:33,240 So there are NUM_BUCKETS buckets. 146 00:07:33,240 --> 00:07:37,410 And for each linked list in our hashtable, we're going to loop over 147 00:07:37,410 --> 00:07:41,070 the entirety of the linked list, freeing each element. 148 00:07:41,070 --> 00:07:42,900 >> Now we need to be careful. 149 00:07:42,900 --> 00:07:47,910 So here we have a temporary variable that's storing the pointer to the next 150 00:07:47,910 --> 00:07:49,730 element in the linked list. 151 00:07:49,730 --> 00:07:52,140 And then we're going to free the current element. 152 00:07:52,140 --> 00:07:55,990 We need to be sure we do this since we can't just free the current element 153 00:07:55,990 --> 00:07:59,180 and then try to access the next pointer, since once we've freed it, 154 00:07:59,180 --> 00:08:00,870 the memory becomes invalid. 155 00:08:00,870 --> 00:08:04,990 >> So we need to keep around a pointer to the next element, then we can free the 156 00:08:04,990 --> 00:08:08,360 current element, and then we can update our current element to point to 157 00:08:08,360 --> 00:08:09,550 the next element. 158 00:08:09,550 --> 00:08:12,800 We'll loop while there are elements in this linked list. 159 00:08:12,800 --> 00:08:15,620 We'll do that for all linked lists in the hashtable. 160 00:08:15,620 --> 00:08:19,460 And once we're done with that, we've completely unloaded the hashtable, and 161 00:08:19,460 --> 00:08:20,190 we're done. 162 00:08:20,190 --> 00:08:23,200 So it's impossible for unload to ever return false. 163 00:08:23,200 --> 00:08:26,470 And when we're done, we just return true. 164 00:08:26,470 --> 00:08:29,000 >> Let's give this solution a try. 165 00:08:29,000 --> 00:08:33,070 So let's take a look at what our struct node will look like. 166 00:08:33,070 --> 00:08:36,220 Here we see we're going to have a bool word and a struct node* children 167 00:08:36,220 --> 00:08:37,470 bracket ALPHABET. 168 00:08:37,470 --> 00:08:38,929 169 00:08:38,929 --> 00:08:42,020 So the first thing you might be wondering, why is ALPHABET 170 00:08:42,020 --> 00:08:44,660 ed defined as 27? 171 00:08:44,660 --> 00:08:47,900 Well, remember that we're going to need to be handling the apostrophe. 172 00:08:47,900 --> 00:08:51,910 So that's going to be somewhat of a special case throughout this program. 173 00:08:51,910 --> 00:08:54,710 >> Now remember how a trie actually works. 174 00:08:54,710 --> 00:08:59,380 Let's say we're indexing the word "cats." Then from the root of trie, 175 00:08:59,380 --> 00:09:02,610 we're going to look at the children array, and we're going to look at the 176 00:09:02,610 --> 00:09:08,090 index that corresponds to the letter C. So that will be indexed 2. 177 00:09:08,090 --> 00:09:11,530 So given that, that will give us a new node. 178 00:09:11,530 --> 00:09:13,820 And then we'll work from that node. 179 00:09:13,820 --> 00:09:17,770 >> So given that node, we're once again going to look at the children array. 180 00:09:17,770 --> 00:09:22,110 And we're going to look at index zero to correspond to the A in cat. 181 00:09:22,110 --> 00:09:27,170 So then we're going to go to that node, and given that node we're going 182 00:09:27,170 --> 00:09:31,090 to look at the end it's a corresponds to T. And moving on to that node, 183 00:09:31,090 --> 00:09:35,530 finally, we have completely looked through our word "cat." And now bool 184 00:09:35,530 --> 00:09:40,960 word is supposed to indicate whether this given word is actually a word. 185 00:09:40,960 --> 00:09:43,470 >> So why do we need that special case? 186 00:09:43,470 --> 00:09:47,700 Well what of the word "catastrophe" is in our dictionary, but the 187 00:09:47,700 --> 00:09:50,150 word "cat" is not? 188 00:09:50,150 --> 00:09:54,580 So and looking to see if the word "cat" is in our dictionary, we're 189 00:09:54,580 --> 00:09:59,970 going to successfully look through the indices C-A-T in region node. 190 00:09:59,970 --> 00:10:04,290 But that's only because catastrophe happened to create nodes on the way 191 00:10:04,290 --> 00:10:07,190 from C-A-T, all the way to the end of the word. 192 00:10:07,190 --> 00:10:12,020 So bool word is used to indicate whether this particular location 193 00:10:12,020 --> 00:10:14,310 actually indicates a word. 194 00:10:14,310 --> 00:10:15,140 >> All right. 195 00:10:15,140 --> 00:10:19,310 So now that we know what it trie is going to look like, let's look at the 196 00:10:19,310 --> 00:10:20,730 load function. 197 00:10:20,730 --> 00:10:24,610 So load is going to return a bool for whether we successfully or 198 00:10:24,610 --> 00:10:26,720 unsuccessfully loaded the dictionary. 199 00:10:26,720 --> 00:10:30,460 And this is going to be the dictionary that we want to load. 200 00:10:30,460 --> 00:10:33,930 >> So first thing we're to do is open up that dictionary for reading. 201 00:10:33,930 --> 00:10:36,160 And we have to make sure we didn't fail. 202 00:10:36,160 --> 00:10:39,580 So if the dictionary was not successfully opened, it will return 203 00:10:39,580 --> 00:10:42,400 null, in which case we're going to return false. 204 00:10:42,400 --> 00:10:47,230 But assuming that it successfully opened, then we can actually read 205 00:10:47,230 --> 00:10:48,220 through the dictionary. 206 00:10:48,220 --> 00:10:50,880 >> So first thing we're going to want to do is we have this 207 00:10:50,880 --> 00:10:52,500 global variable root. 208 00:10:52,500 --> 00:10:56,190 Now root is going to be a node*. 209 00:10:56,190 --> 00:10:59,760 It's the top of our trie that we're going to be iterating through. 210 00:10:59,760 --> 00:11:02,660 So the first thing that we're going to want to do is allocate 211 00:11:02,660 --> 00:11:04,140 memory for our root. 212 00:11:04,140 --> 00:11:07,980 Notice that we're using the calloc function, which is basically the same 213 00:11:07,980 --> 00:11:11,500 as the malloc function, except it's guaranteed to return something that is 214 00:11:11,500 --> 00:11:13,180 completely zeroed out. 215 00:11:13,180 --> 00:11:17,290 So if we used malloc, we would need to go through all of the pointers in our 216 00:11:17,290 --> 00:11:20,160 node, and make sure that they're all null. 217 00:11:20,160 --> 00:11:22,710 So calloc will do that for us. 218 00:11:22,710 --> 00:11:26,330 >> Now just like malloc, we need to make sure that the allocation was actually 219 00:11:26,330 --> 00:11:27,520 successful. 220 00:11:27,520 --> 00:11:29,990 If this returned null, then we need to close or dictionary 221 00:11:29,990 --> 00:11:32,100 file and return false. 222 00:11:32,100 --> 00:11:36,835 So assuming that allocation was successful, we're going to use a node* 223 00:11:36,835 --> 00:11:40,270 cursor to iterate through our trie. 224 00:11:40,270 --> 00:11:43,890 So our roots never going to change, but we're going to use cursor to 225 00:11:43,890 --> 00:11:47,875 actually go from node to node. 226 00:11:47,875 --> 00:11:50,940 >> So in this for loop we are reading through the dictionary file. 227 00:11:50,940 --> 00:11:53,670 And we're using fgetc. 228 00:11:53,670 --> 00:11:56,290 Fgetc is going to grab a single character from the file. 229 00:11:56,290 --> 00:11:59,370 We're going to continue grabbing characters while we do not reach the 230 00:11:59,370 --> 00:12:01,570 end of the file. 231 00:12:01,570 --> 00:12:03,480 >> There are two cases we need to handle. 232 00:12:03,480 --> 00:12:06,610 The first, if the character was not a new line. 233 00:12:06,610 --> 00:12:10,450 So we know if it was a new line, then we're about to move on to a new word. 234 00:12:10,450 --> 00:12:15,240 But assuming it was not a new line, then here we want to figure out the 235 00:12:15,240 --> 00:12:18,380 index we're going to index into in the children array that 236 00:12:18,380 --> 00:12:19,810 we looked at before. 237 00:12:19,810 --> 00:12:23,880 >> So, like I said before, we need to special case the apostrophe. 238 00:12:23,880 --> 00:12:26,220 Notice we're using the ternary operator here. 239 00:12:26,220 --> 00:12:29,580 So we're going to read this as, if the character we read in was an 240 00:12:29,580 --> 00:12:35,330 apostrophe, then we're going to set index= "ALPHABET"-1, which will 241 00:12:35,330 --> 00:12:37,680 be the index 26. 242 00:12:37,680 --> 00:12:41,130 >> Else, if it was not an apostrophe, there we're going to set the index 243 00:12:41,130 --> 00:12:43,760 equal to c - a. 244 00:12:43,760 --> 00:12:49,030 So remember back from previously p-sets, c - a is going to give us the 245 00:12:49,030 --> 00:12:53,410 alphabetical position of C. So if C is the letter A, this will 246 00:12:53,410 --> 00:12:54,700 give us index zero. 247 00:12:54,700 --> 00:12:58,120 For the letter B, it will give us the index 1, and so on. 248 00:12:58,120 --> 00:13:03,010 >> So this gives us the index into the children array that we want. 249 00:13:03,010 --> 00:13:08,890 Now if this index is currently null in the children, that means that a node 250 00:13:08,890 --> 00:13:11,830 does not currently exist from that path. 251 00:13:11,830 --> 00:13:15,160 So we need to allocate a node for that path. 252 00:13:15,160 --> 00:13:16,550 That's what we'll do here. 253 00:13:16,550 --> 00:13:20,690 >> So we're going to again use the calloc function, so that we don't have to 254 00:13:20,690 --> 00:13:22,880 zero out all the pointers. 255 00:13:22,880 --> 00:13:27,240 And we again need to check that calloc did not fail. 256 00:13:27,240 --> 00:13:30,700 If calloc did fail, then we need to unload everything, close our 257 00:13:30,700 --> 00:13:32,820 dictionary, and return false. 258 00:13:32,820 --> 00:13:40,050 So assuming that it did not fail, then this will create a new child for us. 259 00:13:40,050 --> 00:13:41,930 And then we will go to that child. 260 00:13:41,930 --> 00:13:44,960 Our cursor will iterate down to that child. 261 00:13:44,960 --> 00:13:49,330 >> Now if this was not null to begin with, then the cursor can just iterate 262 00:13:49,330 --> 00:13:52,590 down to that child without actually having to allocate anything. 263 00:13:52,590 --> 00:13:56,730 This is the case where we first happened allocate the word "cat." And 264 00:13:56,730 --> 00:14:00,330 that means when we go to allocate "catastrophe," we don't need to create 265 00:14:00,330 --> 00:14:01,680 nodes for C-A-T again. 266 00:14:01,680 --> 00:14:04,830 They already exist. 267 00:14:04,830 --> 00:14:06,080 >> What is this else? 268 00:14:06,080 --> 00:14:10,480 This is the condition where c was backslash n, where c was a new line. 269 00:14:10,480 --> 00:14:13,710 This means that we have successfully completed a word. 270 00:14:13,710 --> 00:14:16,860 Now what do we want to do when we successfully completed a word? 271 00:14:16,860 --> 00:14:21,100 We're going to use this word field inside of our struct node. 272 00:14:21,100 --> 00:14:23,390 We want to set that to true. 273 00:14:23,390 --> 00:14:27,150 So that indicates that this node indicates a successful 274 00:14:27,150 --> 00:14:29,250 word, an actual word. 275 00:14:29,250 --> 00:14:30,940 >> Now set that to true. 276 00:14:30,940 --> 00:14:35,150 We want to reset our cursor to point to the beginning of the trie again. 277 00:14:35,150 --> 00:14:40,160 And finally, increment our dictionary size, since we found another work. 278 00:14:40,160 --> 00:14:43,230 So we're going to keep doing that, reading in character by character, 279 00:14:43,230 --> 00:14:49,150 constructing new nodes in our trie and for each word in the dictionary, until 280 00:14:49,150 --> 00:14:54,020 we finally reach C != EOF, in which case we break out of the file. 281 00:14:54,020 --> 00:14:57,050 >> Now there are two cases under which we might have hit EOF. 282 00:14:57,050 --> 00:15:00,980 The first is if there was an error reading from the file. 283 00:15:00,980 --> 00:15:03,470 So if there was an error, we need to do the typical. 284 00:15:03,470 --> 00:15:06,460 Unload everything, close the file, return false. 285 00:15:06,460 --> 00:15:09,810 Assuming there was not an error, that just means we actually hit the end of 286 00:15:09,810 --> 00:15:13,750 the file, in which case, we close the file and return true since we 287 00:15:13,750 --> 00:15:17,330 successfully loaded dictionary into our trie. 288 00:15:17,330 --> 00:15:20,170 >> So now let's check out check. 289 00:15:20,170 --> 00:15:25,156 Looking at the check function, we see that check is going to return a bool. 290 00:15:25,156 --> 00:15:29,680 It returns true if this word that it's being passed is in our trie. 291 00:15:29,680 --> 00:15:32,110 It returns false otherwise. 292 00:15:32,110 --> 00:15:36,050 So how are you determine whether this word is in our trie? 293 00:15:36,050 --> 00:15:40,190 >> We see here that, just like before, we're going to use cursor to iterate 294 00:15:40,190 --> 00:15:41,970 through our trie. 295 00:15:41,970 --> 00:15:46,600 Now here we're going to iterate over our entire word. 296 00:15:46,600 --> 00:15:50,620 So iterating over the word we are past, we're going to determine the 297 00:15:50,620 --> 00:15:56,400 index into the children array that corresponds to word bracket I. So this 298 00:15:56,400 --> 00:15:59,670 is going to look exactly like load, where if word[i] 299 00:15:59,670 --> 00:16:03,310 is an apostrophe, then we want to use index "ALPHABET" - 1. 300 00:16:03,310 --> 00:16:05,350 Because we determined that that is where we're going to store 301 00:16:05,350 --> 00:16:07,100 apostrophes. 302 00:16:07,100 --> 00:16:11,780 >> Else we're going to use two lower word bracket I. So remember that word can 303 00:16:11,780 --> 00:16:13,920 have arbitrary capitalization. 304 00:16:13,920 --> 00:16:17,540 And so we want to make sure that we're using a lowercase version of things. 305 00:16:17,540 --> 00:16:21,920 And then subtract from that 'a' to once again give us the alphabetical 306 00:16:21,920 --> 00:16:23,880 position of that character. 307 00:16:23,880 --> 00:16:27,680 So that's going to be our index into the children array. 308 00:16:27,680 --> 00:16:32,420 >> And now if that index into the children array is null, that means we 309 00:16:32,420 --> 00:16:34,990 can no longer continue iterating down our trie. 310 00:16:34,990 --> 00:16:38,870 If that's the case, this word cannot possibly be in our trie. 311 00:16:38,870 --> 00:16:42,340 Since if it were, that would mean there would be a path 312 00:16:42,340 --> 00:16:43,510 down to that word. 313 00:16:43,510 --> 00:16:45,290 And you would never encounter null. 314 00:16:45,290 --> 00:16:47,850 So encountering null, we return false. 315 00:16:47,850 --> 00:16:49,840 The word is not in the dictionary. 316 00:16:49,840 --> 00:16:53,660 If it were not null, then we're going to continue iterating. 317 00:16:53,660 --> 00:16:57,220 >> So we're going out there cursor to point to that particular 318 00:16:57,220 --> 00:16:59,760 node at that index. 319 00:16:59,760 --> 00:17:03,150 We keep doing that throughout the entire word, assuming 320 00:17:03,150 --> 00:17:03,950 we never hit null. 321 00:17:03,950 --> 00:17:07,220 That means we were able to get through the entire word and find 322 00:17:07,220 --> 00:17:08,920 a node in our try. 323 00:17:08,920 --> 00:17:10,770 But we're not quite done yet. 324 00:17:10,770 --> 00:17:12,290 >> We don't want to just return true. 325 00:17:12,290 --> 00:17:14,770 We want to return cursor>word. 326 00:17:14,770 --> 00:17:18,980 Since remember again, is "cat" is not in our dictionary, and "catastrophe" 327 00:17:18,980 --> 00:17:22,935 is, then we will successfully we get through the word "cat." But cursor 328 00:17:22,935 --> 00:17:25,760 word will be false and not true. 329 00:17:25,760 --> 00:17:30,930 So we return cursor word to indicate whether this node is actually a word. 330 00:17:30,930 --> 00:17:32,470 And that's it for check. 331 00:17:32,470 --> 00:17:34,250 >> So let's check out size. 332 00:17:34,250 --> 00:17:37,350 So size is going to be pretty easy since, remember in load, we're 333 00:17:37,350 --> 00:17:41,430 incrementing dictionary size for each word that we encounter. 334 00:17:41,430 --> 00:17:45,350 So size is just going to return dictionary size. 335 00:17:45,350 --> 00:17:47,390 And that's it. 336 00:17:47,390 --> 00:17:50,590 >> So lastly we have unload. 337 00:17:50,590 --> 00:17:55,100 So unload, we're going to use a recursive function to actually do all 338 00:17:55,100 --> 00:17:56,530 of the work for us. 339 00:17:56,530 --> 00:17:59,340 So our function is going to be called on unloader. 340 00:17:59,340 --> 00:18:01,650 What is unloader going to do? 341 00:18:01,650 --> 00:18:06,580 We see here that unloader is going to iterate over all of the children at 342 00:18:06,580 --> 00:18:08,410 this particular node. 343 00:18:08,410 --> 00:18:11,750 And if the child node is not null, then we're going to 344 00:18:11,750 --> 00:18:13,730 unload the child node. 345 00:18:13,730 --> 00:18:18,010 >> So this is you recursively unload all of our children. 346 00:18:18,010 --> 00:18:21,080 Once we're sure that all of our children have been unloaded, then we 347 00:18:21,080 --> 00:18:25,210 can free ourselves, so unload ourselves. 348 00:18:25,210 --> 00:18:29,460 This will work recursively unload the entire trie. 349 00:18:29,460 --> 00:18:32,850 And then once that's done, we can just return true. 350 00:18:32,850 --> 00:18:34,210 Unload cannot fail. 351 00:18:34,210 --> 00:18:35,710 We're just freeing things. 352 00:18:35,710 --> 00:18:38,870 So once we're done freeing everything, return true. 353 00:18:38,870 --> 00:18:40,320 And that's it. 354 00:18:40,320 --> 00:18:41,080 My name is Rob. 355 00:18:41,080 --> 00:18:42,426 And this was speller. 356 00:18:42,426 --> 00:18:47,830 >> [MUSIC PLAYING]