1 00:00:00,000 --> 00:00:00,530 2 00:00:00,530 --> 00:00:03,070 >> SPEAKER 1: Let's give this solution a try. 3 00:00:03,070 --> 00:00:07,130 So let's take a look at what our Struct node will look like. 4 00:00:07,130 --> 00:00:11,040 Here, we see we're going to have a Bool Word and a Struct node star 5 00:00:11,040 --> 00:00:12,990 Children bracket alphabet. 6 00:00:12,990 --> 00:00:18,720 So first thing you might be wondering, why is alphabet hash defined as 27? 7 00:00:18,720 --> 00:00:22,540 Well, remember that we're going to need to be handling the apostrophe, so 8 00:00:22,540 --> 00:00:25,610 that's going to be somewhat of a special case throughout this program. 9 00:00:25,610 --> 00:00:28,780 >> OK, now, remember how a Trie actually works. 10 00:00:28,780 --> 00:00:33,420 Let's say we're indexing the word cats, then from the root of our Trie, 11 00:00:33,420 --> 00:00:36,670 we're going to look at the Children array, and we're going to look at the 12 00:00:36,670 --> 00:00:42,250 index that corresponds to the letter C. So that would be index two. 13 00:00:42,250 --> 00:00:46,400 So given that, that will give us a new node, and then we'll 14 00:00:46,400 --> 00:00:47,880 work from that node. 15 00:00:47,880 --> 00:00:51,830 >> So given that node, we're once again going to look at the Children array, 16 00:00:51,830 --> 00:00:56,170 and we're going to look at index zero to correspond to the A in cat. 17 00:00:56,170 --> 00:01:01,240 So then we're going to go to that node, and given that node, we're going 18 00:01:01,240 --> 00:01:05,170 to look at the index that corresponds to T. And moving on to that node, 19 00:01:05,170 --> 00:01:09,590 finally, we have completely looked through our word Cat, and now Bool 20 00:01:09,590 --> 00:01:15,020 Word is supposed to indicate whether this given word is actually a word. 21 00:01:15,020 --> 00:01:17,530 >> So why do we need that special case? 22 00:01:17,530 --> 00:01:21,680 Well, what if the word catastrophe is in our dictionary, but 23 00:01:21,680 --> 00:01:24,120 the word cat is not? 24 00:01:24,120 --> 00:01:29,030 So in looking to see if the word cat is in our dictionary, we're going to 25 00:01:29,030 --> 00:01:34,880 successfully look through the indices C-A-T and reach a node, but that's 26 00:01:34,880 --> 00:01:39,760 only because catastrophe happened to create nodes on the way from C-A-T all 27 00:01:39,760 --> 00:01:41,250 the way to the end of the word. 28 00:01:41,250 --> 00:01:46,520 So Bool Word is used indicate whether this particular location actually 29 00:01:46,520 --> 00:01:48,370 indicates a word. 30 00:01:48,370 --> 00:01:52,920 >> All right, so now that we know what a Trie is going to look like, let's look 31 00:01:52,920 --> 00:01:54,800 at the Load function. 32 00:01:54,800 --> 00:01:58,670 So Load is going to return a Bool for whether we successfully or 33 00:01:58,670 --> 00:02:03,020 unsuccessfully loaded dictionary and this is going to be the dictionary 34 00:02:03,020 --> 00:02:04,520 that we want to load. 35 00:02:04,520 --> 00:02:08,310 So first thing we're going to do is open up that dictionary for reading. 36 00:02:08,310 --> 00:02:12,060 We have to make sure we didn't fail, so if the dictionary was not 37 00:02:12,060 --> 00:02:15,280 successfully opened, it will return No, in which case we're going to 38 00:02:15,280 --> 00:02:16,340 return False. 39 00:02:16,340 --> 00:02:21,290 But assuming that it successfully opened, then we can actually read 40 00:02:21,290 --> 00:02:22,310 through the dictionary. 41 00:02:22,310 --> 00:02:24,940 >> So first thing we're going to want to do is we have this 42 00:02:24,940 --> 00:02:26,560 global variable root. 43 00:02:26,560 --> 00:02:30,250 Now, root is going to be a node star. 44 00:02:30,250 --> 00:02:33,830 It's the top of our Trie that we're going to be iterating through. 45 00:02:33,830 --> 00:02:38,200 So first thing we're going to want to do is allocate memory for our root. 46 00:02:38,200 --> 00:02:42,040 >> Notice that we're using the Calloc function, which is basically the same 47 00:02:42,040 --> 00:02:45,560 as the Malloc function, except it's guaranteed to return something that is 48 00:02:45,560 --> 00:02:47,240 completely zeroed out. 49 00:02:47,240 --> 00:02:51,350 So if we used Malloc, we would need to go through all of the pointers in our 50 00:02:51,350 --> 00:02:54,220 node and make sure that they're all null. 51 00:02:54,220 --> 00:02:56,780 So Calloc will do that for us. 52 00:02:56,780 --> 00:03:00,390 >> Now, just like Malloc, we need to make sure that the allocation's actually 53 00:03:00,390 --> 00:03:01,580 successful. 54 00:03:01,580 --> 00:03:04,060 If this returned null, then we need to close our dictionary 55 00:03:04,060 --> 00:03:06,170 file and return False. 56 00:03:06,170 --> 00:03:11,040 So assuming the allocation was successful, we're going to use a node 57 00:03:11,040 --> 00:03:14,340 star Cursor to iterate through our Trie. 58 00:03:14,340 --> 00:03:17,950 So our root's never going to change, but we're going to use Cursor to 59 00:03:17,950 --> 00:03:20,770 actually go from node to node. 60 00:03:20,770 --> 00:03:25,000 >> All right, so in this For loop, we are reading through the dictionary file, 61 00:03:25,000 --> 00:03:26,965 and we're using at fgetc. 62 00:03:26,965 --> 00:03:30,360 So fgetc is going to grab a single character from the file. 63 00:03:30,360 --> 00:03:33,430 We're going to continue grabbing characters while we do not reach the 64 00:03:33,430 --> 00:03:37,540 end of the file, so there are two cases we need to handle. 65 00:03:37,540 --> 00:03:41,640 The first, if the character was not a new line, so we know if it was a new 66 00:03:41,640 --> 00:03:44,480 line, then we're about to move on to a new word. 67 00:03:44,480 --> 00:03:49,300 But assuming it was not a new line, then here, we want to figure out the 68 00:03:49,300 --> 00:03:52,440 index we're going to index into in the Children array that 69 00:03:52,440 --> 00:03:53,890 we looked at before. 70 00:03:53,890 --> 00:03:57,950 >> So like I said before, we need to special case the apostrophe. 71 00:03:57,950 --> 00:04:01,040 Notice we're using the ternary operator here, so we're going to read 72 00:04:01,040 --> 00:04:05,500 this as if the character we read in was an apostrophe, then we're going to 73 00:04:05,500 --> 00:04:11,740 set index equal to alphabet minus 1, which will be the index 26. 74 00:04:11,740 --> 00:04:15,190 Else, if it was not an apostrophe, then we're going to set the index 75 00:04:15,190 --> 00:04:17,820 equal to c minus a. 76 00:04:17,820 --> 00:04:23,090 So remember back from previous p sets, c minus a is going to give us the 77 00:04:23,090 --> 00:04:27,470 alphabetical position of c, so if c is the letter A, this will 78 00:04:27,470 --> 00:04:28,770 give us index zero. 79 00:04:28,770 --> 00:04:32,180 For the letter B, it would give us the index 1, and so on. 80 00:04:32,180 --> 00:04:37,070 >> So this gives us the index into the Children array that we want. 81 00:04:37,070 --> 00:04:42,540 Now, if this index is currently null in the Children array, that means that 82 00:04:42,540 --> 00:04:47,470 a node does not currently exist from that path, so we need to allocate a 83 00:04:47,470 --> 00:04:49,220 node for that path. 84 00:04:49,220 --> 00:04:50,610 That's what we do here. 85 00:04:50,610 --> 00:04:54,650 So we're going to, again, use the Calloc function so that we don't have 86 00:04:54,650 --> 00:05:00,130 to zero out all of the pointers, and we, again, need to check that Calloc 87 00:05:00,130 --> 00:05:01,300 did not fail. 88 00:05:01,300 --> 00:05:04,760 If Calloc did fail, then we need to unload everything, close our 89 00:05:04,760 --> 00:05:06,880 dictionary, and return False. 90 00:05:06,880 --> 00:05:14,110 >> So assuming that it did not fail, then this will create a new child for us, 91 00:05:14,110 --> 00:05:16,000 and then we will go to that child. 92 00:05:16,000 --> 00:05:19,030 Our cursor will iterate down to that child. 93 00:05:19,030 --> 00:05:23,390 Now, if this was not null to begin with, then the cursor can just iterate 94 00:05:23,390 --> 00:05:26,650 down to that child without actually having to allocate anything. 95 00:05:26,650 --> 00:05:30,790 This is the case where we first happened to allocate the word cat, and 96 00:05:30,790 --> 00:05:34,390 that means when we go to allocate catastrophe, we don't need to create 97 00:05:34,390 --> 00:05:35,720 nodes for C-A-T again. 98 00:05:35,720 --> 00:05:37,620 They already exist. 99 00:05:37,620 --> 00:05:40,140 >> OK, so what is this Else? 100 00:05:40,140 --> 00:05:44,600 This is the condition where c was backslash n, where c was a new line. 101 00:05:44,600 --> 00:05:47,780 This means that we have successfully completed a word. 102 00:05:47,780 --> 00:05:51,020 Now, what do we want to do when we successfully completed a word? 103 00:05:51,020 --> 00:05:55,250 We're going to use this word field inside of our Struct node. 104 00:05:55,250 --> 00:06:00,570 >> We want to set that to True, so that indicates that this node indicates a 105 00:06:00,570 --> 00:06:03,320 successful word an actual word. 106 00:06:03,320 --> 00:06:05,050 Now, set that to True. 107 00:06:05,050 --> 00:06:09,210 We want to reset our cursor to point to the beginning of the Trie again. 108 00:06:09,210 --> 00:06:13,510 And finally, increment our dictionary size since we found another word. 109 00:06:13,510 --> 00:06:16,450 >> All right, so we're going to keep doing that, reading in character by 110 00:06:16,450 --> 00:06:21,960 character, constructing new nodes in our Trie and for each word in the 111 00:06:21,960 --> 00:06:26,810 dictionary, until we finally reach c equals EOF, in which case, we break 112 00:06:26,810 --> 00:06:28,100 out of the file. 113 00:06:28,100 --> 00:06:31,110 Now, there are two cases under which we might have hit EOF. 114 00:06:31,110 --> 00:06:35,680 The first is if there was an error reading from the file, so if there was 115 00:06:35,680 --> 00:06:39,280 an error, we need to do the typical unload everything, close the file, 116 00:06:39,280 --> 00:06:40,520 return False. 117 00:06:40,520 --> 00:06:43,870 Assuming there was not an error, that just means we actually hit the end of 118 00:06:43,870 --> 00:06:47,820 the file, in which case, we close the file and return True since we 119 00:06:47,820 --> 00:06:51,010 successfully loaded the dictionary into our Trie. 120 00:06:51,010 --> 00:06:54,240 >> All right, so now let's check out Check. 121 00:06:54,240 --> 00:06:58,780 Looking at the Check function, we see that Check is going to return a Bool. 122 00:06:58,780 --> 00:07:03,740 It returns True if this word that it's being passed is in our Trie. 123 00:07:03,740 --> 00:07:06,170 It returns False otherwise. 124 00:07:06,170 --> 00:07:10,110 >> So how are we going to determine whether this word is in our Trie? 125 00:07:10,110 --> 00:07:14,270 We see here that, just like before, we're going to use cursor to iterate 126 00:07:14,270 --> 00:07:16,010 through our Trie. 127 00:07:16,010 --> 00:07:20,650 Now, here, we're going to iterate over our entire word. 128 00:07:20,650 --> 00:07:24,680 So iterating over the word we are passed, we're going to determine the 129 00:07:24,680 --> 00:07:29,280 index into the Children array that corresponds to word bracket i. 130 00:07:29,280 --> 00:07:34,150 So this is going to look exactly like Load, where if word bracket i is an 131 00:07:34,150 --> 00:07:38,110 apostrophe, then we want to use index alphabet minus 1 because we determined 132 00:07:38,110 --> 00:07:41,160 that is where we're going to store apostrophes. 133 00:07:41,160 --> 00:07:44,440 >> Else we're going to use tolower word bracket i. 134 00:07:44,440 --> 00:07:48,270 So remember that word can have arbitrary capitalization, and so we 135 00:07:48,270 --> 00:07:51,590 want to make sure that we're using a lowercase version of things. 136 00:07:51,590 --> 00:07:55,300 And then subtract from that lowercase a to, once again, give us the 137 00:07:55,300 --> 00:07:57,940 alphabetical position of that character. 138 00:07:57,940 --> 00:08:01,740 So that's going to be our index into the Children array. 139 00:08:01,740 --> 00:08:06,480 >> And now, if that index into the Children array is null, that means we 140 00:08:06,480 --> 00:08:09,050 can no longer continue iterating down our Trie. 141 00:08:09,050 --> 00:08:13,320 If that's the case, this word cannot possibly be in our Trie, since if it 142 00:08:13,320 --> 00:08:18,000 were, that would mean there would be a path down to that word, and you would 143 00:08:18,000 --> 00:08:19,350 never encounter null. 144 00:08:19,350 --> 00:08:21,910 So encountering null, we return False. 145 00:08:21,910 --> 00:08:23,810 The word is not in the dictionary. 146 00:08:23,810 --> 00:08:28,200 If it were not null, then we're going to continue iterating, so we're going 147 00:08:28,200 --> 00:08:33,150 to update our cursor to point to that particular node at that index. 148 00:08:33,150 --> 00:08:36,659 >> So we keep doing that throughout the entire word. 149 00:08:36,659 --> 00:08:40,630 Assuming we never hit null, that means we were able to get through the entire 150 00:08:40,630 --> 00:08:44,840 world and find a node in our Trie, but we're not quite done yet. 151 00:08:44,840 --> 00:08:46,350 We don't want to just return True. 152 00:08:46,350 --> 00:08:51,400 We want to return cursor error word since, remember again, if cat is not 153 00:08:51,400 --> 00:08:55,140 in our dictionary and catastrophe is, then we will successfully get through 154 00:08:55,140 --> 00:08:59,810 the word cat, but cursor word will be False and not True. 155 00:08:59,810 --> 00:09:04,990 So we return cursor word to indicate whether this node is actually a word, 156 00:09:04,990 --> 00:09:06,530 and that's it for check. 157 00:09:06,530 --> 00:09:08,310 >> So let's check out Size. 158 00:09:08,310 --> 00:09:11,410 So Size is going to be pretty easy since, remember in Load, we're 159 00:09:11,410 --> 00:09:15,480 incrementing dictionary size for each word that we encounter. 160 00:09:15,480 --> 00:09:20,820 So Size is just going to return dictionary size, and that's it. 161 00:09:20,820 --> 00:09:24,650 >> All right, so lastly, we have Unload. 162 00:09:24,650 --> 00:09:29,050 So Unload, we're going to use a recursive function to actually do all 163 00:09:29,050 --> 00:09:33,390 of the work for us, so our function is going to be called Unloader. 164 00:09:33,390 --> 00:09:35,830 What is Unloader going to do? 165 00:09:35,830 --> 00:09:40,640 We see here that Unloader is going to iterate over all of the children at 166 00:09:40,640 --> 00:09:45,810 this particular node, and if the child node is not null, then we're going to 167 00:09:45,810 --> 00:09:47,760 unload the child node. 168 00:09:47,760 --> 00:09:52,070 >> So this is going to recursively unload all of our children. 169 00:09:52,070 --> 00:09:55,140 Once we're sure that all of our children have been unloaded, then we 170 00:09:55,140 --> 00:09:58,830 can free ourselves, so unload ourself. 171 00:09:58,830 --> 00:10:04,550 So this will recursively unload the entire Trie, and then once that's 172 00:10:04,550 --> 00:10:06,910 done, we can just return True. 173 00:10:06,910 --> 00:10:09,770 Unload cannot fail, we're just freeing things. 174 00:10:09,770 --> 00:10:12,985 So once we're done freeing everything, return True. 175 00:10:12,985 --> 00:10:14,380 And that's it. 176 00:10:14,380 --> 00:10:16,792 My name is Rob, and this was [INAUDIBLE]. 177 00:10:16,792 --> 00:10:21,888