How would you represent all your family members in a computer? We could simply use a list, but there's a clear hierarchy here. Let's say we're beginning with your great-grandmother, Alice. She has 2 sons, Bob and your grandfather, Charlie. Charlie has 3 children, your uncle, Dave, your aunt, Eve, and your father, Fred. You are Fred's only child. Why would organizing your family members in this way be better than the simple list representation? One reason is that this hierarchical structure, called a 'tree,' contains more information than a simple list. We know the familial relationships between everyone just by examining the tree. Also, it can speed up look-up time tremendously, if tree data is sorted. We can't take advantage of that here, but we'll see an example of this soon. Each person is represented by a node on the tree. Nodes can have child nodes as well as a parent node. These are the technical terms, even when using trees for things besides families. Alice's node has 2 children and no parents, while Charlie's node has 3 children and 1 parent. A leaf node is one that does not have any children on the outside edge of the tree. The topmost node of the tree, the root node, has no parent. A binary tree is a specific type of tree, in which each node has, at most, 2 children. Here is the struct of a node of a binary tree in C. Every node has some data associated with it and 2 pointers to other nodes. In our family tree, the associated data was each person's name. Here it is an int, though it could be anything. As it turns out, a binary tree would not be a good representation for a family, since people frequently have more than 2 children. A binary search tree is a special, ordered type of binary tree that allows us to look at values quickly. You may have noticed that every node below the root of a tree is the root of another tree, called a 'subtree.' Here, the root of the tree is 6, and its child, 2, is the root of a subtree. In a binary search tree all the values of a node's right subtree are greater than the node's value. Here: 6. Well, the values in a node's left subtree are less than the node's value. If we need to handle duplicate values, we can change either of those to a loose inequality, meaning identical values can fall either on the left or right, as long as we are consistent about it throughout. This tree is a binary search tree because it follows these rules. This is how it would look if we turned all the nodes into C structs. Notice that if a child is missing, the pointer is null. How do we check to see if 7 is in the tree? We start at the root. Seven is greater than 6, so if it's in the tree, it must be to the right. Then, it is less than 8, so it must be left. Here, we have found 7. Now, we'll check for 5. Five is less than 6, so it must be to the left. Five is greater than 2, so it must be right, and it is also greater than 4, so it must be right again. However, there is no child here. The pointer is null. This means that 5 is not in our tree. We can search the binary tree with the following code: At every node, we check to see if we have found the value we are looking for. If we do not find it, we determine if it should be on the left or right and check that subtree. This loop will continue down the tree until there is no child node on either the left or right. Remember that 5 was not in the tree. How do we insert it? The process looks similar to search. We iterate down the tree starting from 6, left to 2, right to 4, and right again, but 4 has no child on this side. This will be the new position for 5, and it will start with no children. How fast are operations on a binary search tree? Remember that Bigohnotation seeks to provide an upper bound. In the worst case, our tree could simply be a linked list meaning that insertion, deletion, and search could take time proportional to the number of nodes in the tree. This is O(n). For example, the following is a valid binary search tree. However, if we try to find 9, we have to traverse every node. It is no better than a linked list. Ideally, we would want every node of our binary search tree to have 2 children. This way, insertion, deletion and search would take, at worst, O(log n) time. The tree from before could be more balanced, like this. Now, looking up any value takes, at most, 3 steps. This tree is balanced, meaning that is has a minimal depth relative to the number of nodes. Looking for a value in a balanced binary search tree is similar to binary search on a sorted array. In fact, if we don't need to insert or delete items, they behave exactly the same way. However, a tree structure is better for handling insertions and deletions My name is Bannus Van der Kloot. This is CS50.