[MUSIC PLAYING] ROB BOWDEN: Hi. I'm Rob. And let's this solution out. So here we're going to implement a general table. We see that the struct node of our table is going to look like this. So it's going to have a char word array of size LENGTH + 1. Don't forget the + 1, since the maximum word in the dictionary is 45 characters. And then we're going to need one extra character for the backslash zero. And then our hashtable in each bucket is going to store a linked list of nodes. We are not doing linear probing here. And so in order to link to the next element in the bucket, we need a struct node* next. OK. So that's what a node looks like. Now here is the declaration of our hashtable. It's going to have 16,834 buckets. But that number doesn't really matter. And finally, we're going to have the global variable hashtable size, which is going to start off as zero. And it's going to keep track of how many words are in our dictionary. So let's take a look at load. Notice that load, it returns a bool. You return true if it was successfully loaded, and false otherwise. And it takes a const char* dictionary, which is the dictionary that we want to open. So that's the first thing we're going to do. We're going to fopen the dictionary for reading. And we're going to have to make sure that it succeeded. So if it returned NULL, then we did not successfully open the dictionary. And we need to return false. But assuming that it did successfully open, then we want to read the dictionary. So keep looping until we find some reason to break out of this loop, which we'll see. So keep looping. And now we're going to malloc a single node. And of course we need to air check again. So if mallocing did not succeed, then we want to unload any node that we happened to malloc before, close the dictionary and return false. But ignoring that, assuming we succeeded, then we want to use fscanf to read a single word from our dictionary into our node. So remember that entry>word is the char word buffer of size LENGHTH + 1 that we're going to store the word in. So fscanf is going to return 1, as long as it was able to successfully read a word from the file. If either an error happens, or we reach the end of the file, it will not return 1. In which case it does not return 1, we're finally going to break out of this while loop. So we see that once we have successfully read a word into entry>word, then we're going to that word using our hash function. Let's take a look at the hash function. So you don't really need to understand this. And actually we just pulled this hash function from the internet. The only thing you need to recognize is that this takes a const char* word. So it's taking a string as input, and returning an unsigned int as output. So that's all a hash function is, is it takes in an input and gives you an index into the hashtable. Notice that we're moding by NUM_BUCKETS, so that value returned actually is an index into the hashtable and doesn't index beyond the bounds of the array. So given that function, we're going to hash the word that we read the dictionary. And then we're going to use that hash to insert the entry into the hashtable. Now hashtable hash is the current linked list in the table. And it's very possible that it's just NULL. We want to insert our entry at the beginning of this linked list. And so we're going to have our current entry point to what the hashtable currently points to. And then we're going to store, in the hashtable at the hash, the current entry. So these two lines successfully insert the entry at the beginning of the linked list at that index in the hashtable. Once we're done with that, we know that we found another word in the dictionary, and we increment again. So we keep doing that until fscanf finally returned something non-1 at which point remember that we need to free entry. So up here we malloced an entry. And we tried to read something from the dictionary. And we did not successfully read something from the dictionary, in which case we need to free the entry that we never actually put into the hashtable, and finally break. Once we break out we need to see, well, did we break out because there was an error reading from the file? Or did we break out because we reached the end of the file? If there was an error, then we want to return false. Because load did not succeed. And in the process we want to unload all the words that we read in, and close the dictionary file. Assuming we did succeed, then we just still need to close the dictionary file, and finally return true since we successfully loaded the dictionary. And that's it for load. So now check, given a loaded hashtable, is going to look like this. So check, it returns a bool, which is going to indicate whether the passed in char* word, whether the passed in string is in our dictionary. So if it is in the dictionary, if it is in our hashtable, we will return true. And if it's not, we will return false. Given this passed in word, we're going to hash the word. Now an important thing to recognize is that in load we knew that all of the words we're going to be lower case. But here we're not so sure. If we take a look at our hash function, our hash function actually is lower casing each character of the word. So regardless of the capitalization of word, our hash function is the return the same index for whatever the capitalization is, as it would have returned for a completely lowercase version of the word. Alright. That's our index is into the hashtable for this word. Now this for loop is going to iterate over the linked list that was at that index. So notice we are initializing entry to point to that index. We're going to continue while entry != NULL. And remember that updating the pointer in our linked list entry=entry>next. So have our current entry point to the next item in the linked list. So for each entry in the linked list, we're going to use strcasecmp. It's not strcomp. Because once again, we want to do things case insensitively. So we use strcasecmp to compare the word that was passed through this function against the word that is in this entry. If it returns zero, that means there was a match, in which case we want to return true. We successfully found the word in our hashtable. If there was not a match, then we're going to loop again and look at the next entry. And we'll continue looping while there are entries in this linked list. What happens if we break out of this for loop? That means we did not find an entry that matched this word, in which case we return false to indicate that our hashtable did not contain this word. And that's a check. So let's take a look at size. Now size is going to be pretty simple. Since remember in load, for each word we found, we incremented a global variable hashtable size. So the size function is just going to return global variable. And that's it. Now finally, we need to unload the dictionary once everything's done. So how are we going to do that? Right here we're looping over all buckets of our table. So there are NUM_BUCKETS buckets. And for each linked list in our hashtable, we're going to loop over the entirety of the linked list, freeing each element. Now we need to be careful. So here we have a temporary variable that's storing the pointer to the next element in the linked list. And then we're going to free the current element. We need to be sure we do this since we can't just free the current element and then try to access the next pointer, since once we've freed it, the memory becomes invalid. So we need to keep around a pointer to the next element, then we can free the current element, and then we can update our current element to point to the next element. We'll loop while there are elements in this linked list. We'll do that for all linked lists in the hashtable. And once we're done with that, we've completely unloaded the hashtable, and we're done. So it's impossible for unload to ever return false. And when we're done, we just return true. Let's give this solution a try. So let's take a look at what our struct node will look like. Here we see we're going to have a bool word and a struct node* children bracket ALPHABET. So the first thing you might be wondering, why is ALPHABET ed defined as 27? Well, remember that we're going to need to be handling the apostrophe. So that's going to be somewhat of a special case throughout this program. Now remember how a trie actually works. Let's say we're indexing the word "cats." Then from the root of trie, we're going to look at the children array, and we're going to look at the index that corresponds to the letter C. So that will be indexed 2. So given that, that will give us a new node. And then we'll work from that node. So given that node, we're once again going to look at the children array. And we're going to look at index zero to correspond to the A in cat. So then we're going to go to that node, and given that node we're going to look at the end it's a corresponds to T. And moving on to that node, finally, we have completely looked through our word "cat." And now bool word is supposed to indicate whether this given word is actually a word. So why do we need that special case? Well what of the word "catastrophe" is in our dictionary, but the word "cat" is not? So and looking to see if the word "cat" is in our dictionary, we're going to successfully look through the indices C-A-T in region node. But that's only because catastrophe happened to create nodes on the way from C-A-T, all the way to the end of the word. So bool word is used to indicate whether this particular location actually indicates a word. All right. So now that we know what it trie is going to look like, let's look at the load function. So load is going to return a bool for whether we successfully or unsuccessfully loaded the dictionary. And this is going to be the dictionary that we want to load. So first thing we're to do is open up that dictionary for reading. And we have to make sure we didn't fail. So if the dictionary was not successfully opened, it will return null, in which case we're going to return false. But assuming that it successfully opened, then we can actually read through the dictionary. So first thing we're going to want to do is we have this global variable root. Now root is going to be a node*. It's the top of our trie that we're going to be iterating through. So the first thing that we're going to want to do is allocate memory for our root. Notice that we're using the calloc function, which is basically the same as the malloc function, except it's guaranteed to return something that is completely zeroed out. So if we used malloc, we would need to go through all of the pointers in our node, and make sure that they're all null. So calloc will do that for us. Now just like malloc, we need to make sure that the allocation was actually successful. If this returned null, then we need to close or dictionary file and return false. So assuming that allocation was successful, we're going to use a node* cursor to iterate through our trie. So our roots never going to change, but we're going to use cursor to actually go from node to node. So in this for loop we are reading through the dictionary file. And we're using fgetc. Fgetc is going to grab a single character from the file. We're going to continue grabbing characters while we do not reach the end of the file. There are two cases we need to handle. The first, if the character was not a new line. So we know if it was a new line, then we're about to move on to a new word. But assuming it was not a new line, then here we want to figure out the index we're going to index into in the children array that we looked at before. So, like I said before, we need to special case the apostrophe. Notice we're using the ternary operator here. So we're going to read this as, if the character we read in was an apostrophe, then we're going to set index= "ALPHABET"-1, which will be the index 26. Else, if it was not an apostrophe, there we're going to set the index equal to c - a. So remember back from previously p-sets, c - a is going to give us the alphabetical position of C. So if C is the letter A, this will give us index zero. For the letter B, it will give us the index 1, and so on. So this gives us the index into the children array that we want. Now if this index is currently null in the children, that means that a node does not currently exist from that path. So we need to allocate a node for that path. That's what we'll do here. So we're going to again use the calloc function, so that we don't have to zero out all the pointers. And we again need to check that calloc did not fail. If calloc did fail, then we need to unload everything, close our dictionary, and return false. So assuming that it did not fail, then this will create a new child for us. And then we will go to that child. Our cursor will iterate down to that child. Now if this was not null to begin with, then the cursor can just iterate down to that child without actually having to allocate anything. This is the case where we first happened allocate the word "cat." And that means when we go to allocate "catastrophe," we don't need to create nodes for C-A-T again. They already exist. What is this else? This is the condition where c was backslash n, where c was a new line. This means that we have successfully completed a word. Now what do we want to do when we successfully completed a word? We're going to use this word field inside of our struct node. We want to set that to true. So that indicates that this node indicates a successful word, an actual word. Now set that to true. We want to reset our cursor to point to the beginning of the trie again. And finally, increment our dictionary size, since we found another work. So we're going to keep doing that, reading in character by character, constructing new nodes in our trie and for each word in the dictionary, until we finally reach C != EOF, in which case we break out of the file. Now there are two cases under which we might have hit EOF. The first is if there was an error reading from the file. So if there was an error, we need to do the typical. Unload everything, close the file, return false. Assuming there was not an error, that just means we actually hit the end of the file, in which case, we close the file and return true since we successfully loaded dictionary into our trie. So now let's check out check. Looking at the check function, we see that check is going to return a bool. It returns true if this word that it's being passed is in our trie. It returns false otherwise. So how are you determine whether this word is in our trie? We see here that, just like before, we're going to use cursor to iterate through our trie. Now here we're going to iterate over our entire word. So iterating over the word we are past, we're going to determine the index into the children array that corresponds to word bracket I. So this is going to look exactly like load, where if word[i] is an apostrophe, then we want to use index "ALPHABET" - 1. Because we determined that that is where we're going to store apostrophes. Else we're going to use two lower word bracket I. So remember that word can have arbitrary capitalization. And so we want to make sure that we're using a lowercase version of things. And then subtract from that 'a' to once again give us the alphabetical position of that character. So that's going to be our index into the children array. And now if that index into the children array is null, that means we can no longer continue iterating down our trie. If that's the case, this word cannot possibly be in our trie. Since if it were, that would mean there would be a path down to that word. And you would never encounter null. So encountering null, we return false. The word is not in the dictionary. If it were not null, then we're going to continue iterating. So we're going out there cursor to point to that particular node at that index. We keep doing that throughout the entire word, assuming we never hit null. That means we were able to get through the entire word and find a node in our try. But we're not quite done yet. We don't want to just return true. We want to return cursor>word. Since remember again, is "cat" is not in our dictionary, and "catastrophe" is, then we will successfully we get through the word "cat." But cursor word will be false and not true. So we return cursor word to indicate whether this node is actually a word. And that's it for check. So let's check out size. So size is going to be pretty easy since, remember in load, we're incrementing dictionary size for each word that we encounter. So size is just going to return dictionary size. And that's it. So lastly we have unload. So unload, we're going to use a recursive function to actually do all of the work for us. So our function is going to be called on unloader. What is unloader going to do? We see here that unloader is going to iterate over all of the children at this particular node. And if the child node is not null, then we're going to unload the child node. So this is you recursively unload all of our children. Once we're sure that all of our children have been unloaded, then we can free ourselves, so unload ourselves. This will work recursively unload the entire trie. And then once that's done, we can just return true. Unload cannot fail. We're just freeing things. So once we're done freeing everything, return true. And that's it. My name is Rob. And this was speller. [MUSIC PLAYING]