1 00:00:00,000 --> 00:00:02,270 [Review: Quiz 1] 2 00:00:02,270 --> 00:00:04,620 [Ali Nahm, Oreoluwa Barbarinsa, Lucas Freitas, Rob Bowden] [Harvard University] 3 00:00:04,620 --> 00:00:07,660 [This is CS50.] [CS50.TV] 4 00:00:07,660 --> 00:00:11,610 [Lucas Freitas] Welcome everyone. This is review for quiz 1. 5 00:00:11,610 --> 00:00:15,040 Just as a disclaimer, this is--I mean, we're going to try to cover 6 00:00:15,040 --> 00:00:17,770 as much material as possible, but that doesn't mean that 7 00:00:17,770 --> 00:00:20,780 we're going to cover all of the things that can be in quiz 1. 8 00:00:20,780 --> 00:00:25,270 So be sure you also take a look at lecture, sections, everything that you can. 9 00:00:25,270 --> 00:00:28,240 Quiz 1 is going to be on Wednesday, next Wednesday. 10 00:00:28,240 --> 00:00:33,800 So be sure to study. It's going to be, pretty much, like the first quiz 11 00:00:33,800 --> 00:00:36,390 regarding its format, but it's probably going to be much harder. 12 00:00:36,390 --> 00:00:39,600 At least, last year when I took 50, I thought it was much harder. 13 00:00:39,600 --> 00:00:42,410 So study a lot. 14 00:00:42,410 --> 00:00:45,190 >> I'm going to cover data structures and Huffman coding. 15 00:00:45,190 --> 00:00:47,910 This is something that a lot of people think is complex, 16 00:00:47,910 --> 00:00:51,930 but I'm going to try to make it as easy as possible. 17 00:00:51,930 --> 00:00:56,330 First of all, what we want you guys to know for quiz 1 is to 18 00:00:56,330 --> 00:01:00,970 understand the conceptual descriptions of each of the data structures that I'm going to present. 19 00:01:00,970 --> 00:01:03,960 That means that you don't have to actually 20 00:01:03,960 --> 00:01:07,020 implement a hash table in your quiz 1. 21 00:01:07,020 --> 00:01:10,250 We don't want you to implement a whole hash table; maybe we'll try 22 00:01:10,250 --> 00:01:13,090 to make you implement some functions, 23 00:01:13,090 --> 00:01:16,940 the most common operations, but we're not going to make you implement everything. 24 00:01:16,940 --> 00:01:21,010 So it's important that you understand the concept behind each data structure 25 00:01:21,010 --> 00:01:23,510 and also that you are able to code in C, 26 00:01:23,510 --> 00:01:27,880 just the most common operations they have for each data structure. 27 00:01:27,880 --> 00:01:30,090 And also be able to review pointers and structs, 28 00:01:30,090 --> 00:01:33,470 because they appear a lot in these data structures. 29 00:01:33,470 --> 00:01:37,380 >> First, linked lists. Linked lists are actually very similar to arrays, 30 00:01:37,380 --> 00:01:39,930 but the difference between a linked list and an array, 31 00:01:39,930 --> 00:01:45,160 first of all, is that a linked list has a very flexible size, 32 00:01:45,160 --> 00:01:50,060 while in arrays you have to either choose a very large size for the array, 33 00:01:50,060 --> 00:01:53,710 so you know that you're going to be able to store all your data in that array, 34 00:01:53,710 --> 00:01:59,370 or you have to use malloc to have a flexible length of array. 35 00:01:59,370 --> 00:02:03,680 In linked lists it's very easy to just get more elements, 36 00:02:03,680 --> 00:02:07,210 put more elements in the linked list or remove elements. 37 00:02:07,210 --> 00:02:09,370 And actually, if you don't want the linked list to be sorted, 38 00:02:09,370 --> 00:02:13,950 you can search and remove elements in constant time, 39 00:02:13,950 --> 00:02:16,800 so O (1) time, so it's very convenient. 40 00:02:16,800 --> 00:02:20,660 You just have to be careful to always remember to malloc and free the nodes, 41 00:02:20,660 --> 00:02:25,510 just because if you don't, you'll have memory leaks. 42 00:02:25,510 --> 00:02:31,480 So linked lists--the definition of a node is just like what we have right there. 43 00:02:31,480 --> 00:02:35,110 I put int n, but you can store any data you want. 44 00:02:35,110 --> 00:02:37,280 So if you want to store a string, it's fine. 45 00:02:37,280 --> 00:02:41,690 If you want to store a struct, it's fine, a double, whatever you want. 46 00:02:41,690 --> 00:02:44,630 I just put int n for the examples here. 47 00:02:44,630 --> 00:02:46,800 And you have a pointer to the next node. 48 00:02:46,800 --> 00:02:51,940 So, basically, a linked list has some data, and then it points to the next node. 49 00:02:51,940 --> 00:02:56,710 If it's the last element in the linked list, it's going to point to NULL. 50 00:02:56,710 --> 00:02:59,060 So this is an example of a linked list. 51 00:02:59,250 --> 00:03:05,960 >> Okay, so now let's see what we should do if I want to insert an element in a linked list. 52 00:03:05,960 --> 00:03:08,810 First, a function insert will be of type void 53 00:03:08,810 --> 00:03:11,350 because I don't want to return anything. 54 00:03:11,350 --> 00:03:14,200 And I'm going to take an int as an argument, 55 00:03:14,200 --> 00:03:17,090 because I want to know what I want to insert. 56 00:03:17,090 --> 00:03:21,840 So what's the first thing I should do? Well, I should malloc on newnode, 57 00:03:21,840 --> 00:03:24,240 so that is the first line. 58 00:03:24,240 --> 00:03:27,580 I'm just creating a new node to put in a linked list. 59 00:03:27,580 --> 00:03:32,360 So what can I do? Well, we know that in our implementations of linked lists 60 00:03:32,360 --> 00:03:38,180 in class, we always put the head as a global variable. 61 00:03:38,180 --> 00:03:41,800 So what we can do is change the head. 62 00:03:41,800 --> 00:03:44,300 I can make this new node be the new head, 63 00:03:44,300 --> 00:03:46,670 and it's going to point to the previous head. 64 00:03:46,670 --> 00:03:50,390 How can we do that? The first thing I have to do 65 00:03:50,390 --> 00:03:54,770 is change the 'n' in the new node to value, 66 00:03:54,770 --> 00:03:57,530 which was passed to the function. 67 00:03:57,530 --> 00:04:01,050 Then newnode's next is going to be the head. 68 00:04:01,050 --> 00:04:05,800 The head is going to be newnode. So it's pretty simple. 69 00:04:05,800 --> 00:04:10,090 For deleting a node, we can do it like-- 70 00:04:10,090 --> 00:04:14,790 One way we could do that is to say, 71 00:04:14,790 --> 00:04:18,160 okay, if I wanted to delete, for example, 3, 72 00:04:18,160 --> 00:04:24,850 what I could do is just point the previous node 73 00:04:24,850 --> 00:04:27,580 to the next node of 3. 74 00:04:27,580 --> 00:04:29,400 So I would just do something like that. 75 00:04:29,400 --> 00:04:33,400 But what is the problem with doing that? 76 00:04:33,400 --> 00:04:37,400 I have a memory leak, so I don't have access to the number 3 anymore. 77 00:04:37,400 --> 00:04:42,480 The problem with that is that I'm not going to be able to free that node. 78 00:04:42,480 --> 00:04:45,360 I'm going to have memory leak and (unintelligible) is going to hate me. 79 00:04:45,360 --> 00:04:49,370 So instead of doing that, I should probably have a temporary pointer. 80 00:04:49,370 --> 00:04:53,210 So I put temp. It is going to point to the node that I want to delete. 81 00:04:53,210 --> 00:04:58,170 And then I can move the previous nodes to point to the next node 82 00:04:58,170 --> 00:05:00,390 of the node that I want to delete. 83 00:05:00,390 --> 00:05:02,730 And finally, I can free the pointer. 84 00:05:02,730 --> 00:05:07,480 Do I have to free the pointer that I created right there? 85 00:05:07,480 --> 00:05:09,560 I don't have to, just because-- 86 00:05:09,560 --> 00:05:13,430 the difference is that this node was created using malloc, 87 00:05:13,430 --> 00:05:17,280 so it's in the heap, while this one was just declared as a NULL switch in the stack. 88 00:05:17,280 --> 00:05:20,000 So I don't have to free it. 89 00:05:20,000 --> 00:05:22,030 >> Okay. So now let's talk about stacks. 90 00:05:22,030 --> 00:05:24,680 Stacks are pretty straightforward. 91 00:05:24,680 --> 00:05:29,540 We did stacks and queues in class just using arrays, 92 00:05:29,540 --> 00:05:32,820 but you should be familiar--just be aware 93 00:05:32,820 --> 00:05:40,740 that you can also do stacks in queues using linked lists as well. 94 00:05:40,740 --> 00:05:44,460 So if you have an array, what would be a stack? 95 00:05:44,460 --> 00:05:46,810 A stack, first, will have to have a size. 96 00:05:46,810 --> 00:05:49,950 You have to store what is the size of the stack that you have right now. 97 00:05:49,950 --> 00:05:52,980 And also you would have an array, in this case of numbers, 98 00:05:52,980 --> 00:05:55,120 but if you want, it can be an array 99 00:05:55,120 --> 00:06:00,380 of strings, an array of struct, anything that you want to store. 100 00:06:00,380 --> 00:06:03,240 About the stack: The difference between a stack and a linked list 101 00:06:03,240 --> 00:06:08,590 is that in the stack you only have access to the last element that was put in the stack. 102 00:06:08,590 --> 00:06:11,770 It's called last in, first out. 103 00:06:11,770 --> 00:06:15,090 Just like you have a stack of trays, 104 00:06:15,090 --> 00:06:17,670 if you put a tray on the top of the stack, 105 00:06:17,670 --> 00:06:22,670 you have to remove that tray first to have access to the other trays. 106 00:06:22,670 --> 00:06:26,310 It's the same thing with stacks. 107 00:06:26,310 --> 00:06:31,220 So if I want to, for example, add an element to a stack, what should I do? 108 00:06:31,220 --> 00:06:34,070 It's called push, and it's pretty straightforward. 109 00:06:34,070 --> 00:06:37,130 The first thing you have to do is check if the size of the stack 110 00:06:37,130 --> 00:06:40,150 is not greater or equal to the capacity of the stack. 111 00:06:40,150 --> 00:06:45,810 Because if you already are on full capacity, you cannot add anything else. 112 00:06:45,810 --> 00:06:51,140 And then if not, you just have to add the element to the stack. 113 00:06:51,140 --> 00:06:54,530 And finally, increment the size. So it's pretty straightforward. 114 00:06:54,530 --> 00:06:57,140 So I just add the number 2. 115 00:06:57,140 --> 00:07:00,350 And if I want to pop, which means that I want to remove 116 00:07:00,350 --> 00:07:03,870 the last element that was added and return the value of the element, 117 00:07:03,870 --> 00:07:09,180 the first thing I have to check is that the stack is not empty. 118 00:07:09,180 --> 00:07:11,510 Because if it's empty, I cannot return anything. 119 00:07:11,510 --> 00:07:14,820 In that case, I'm returning -1. 120 00:07:14,820 --> 00:07:18,960 Otherwise, I'm going to decrement the size of the spec, 121 00:07:18,960 --> 00:07:22,510 and return numbers (s.size). 122 00:07:22,510 --> 00:07:27,230 Why did I decrement the size and then return s.size? 123 00:07:27,230 --> 00:07:30,930 It's because, in this case, the spec has size 4, 124 00:07:30,930 --> 00:07:33,810 and I want to return the fourth element, right? 125 00:07:33,810 --> 00:07:36,030 But what is the index of the fourth element? Three. 126 00:07:36,030 --> 00:07:44,510 Since I do size - - is going to be 3, I can just return s.numbers(s.size) 127 00:07:44,510 --> 00:07:48,410 because it's 3. So it's just the index. 128 00:07:48,410 --> 00:07:50,380 >> Now queues. Queues are pretty much the same thing. 129 00:07:50,380 --> 00:07:54,950 The only difference is that instead of having last in, first out, 130 00:07:54,950 --> 00:07:57,480 you have first in, first out. 131 00:07:57,480 --> 00:07:59,460 Probably if you're waiting to go to a concert, 132 00:07:59,460 --> 00:08:04,260 you wouldn't be happy if you had a stack instead of a queue. 133 00:08:04,260 --> 00:08:07,730 Being the last person to come would be the first person to enter the concert. 134 00:08:07,730 --> 00:08:09,760 You probably wouldn't be happy. 135 00:08:09,760 --> 00:08:15,020 In the queue, the first person to get in is also the first person to get out. 136 00:08:15,020 --> 00:08:18,720 So in the definition of a queue, besides having the size in the array, 137 00:08:18,720 --> 00:08:23,360 you also have to have the head, which is the index to the head of the stack. 138 00:08:23,360 --> 00:08:29,000 So the first element right now. 139 00:08:29,000 --> 00:08:32,710 Enqueue is the same thing as push for stacks. 140 00:08:32,710 --> 00:08:34,980 If you were very naive, you would just say, 141 00:08:34,980 --> 00:08:39,289 well, I can just do exactly the same thing as I did for push. 142 00:08:39,289 --> 00:08:44,030 I can just check if it's not beyond the capacity. 143 00:08:44,030 --> 00:08:48,760 If it is, I return false, otherwise I can just export the new value 144 00:08:48,760 --> 00:08:50,630 and then increment the size. 145 00:08:50,630 --> 00:08:52,750 But why is this wrong? 146 00:08:52,750 --> 00:08:55,010 Let's see this example. 147 00:08:55,010 --> 00:08:57,020 I'm trying to enqueue a bunch of stuff, 148 00:08:57,020 --> 00:08:58,390 and then I'm going to dequeue and enqueue. 149 00:08:58,390 --> 00:09:00,550 There's a lot of commands, but it's very simple. 150 00:09:00,550 --> 00:09:04,790 I'm going to enqueue 5, so add 5, and then 7, 151 00:09:04,790 --> 00:09:09,310 1, 4, 6, and then I want to dequeue something, 152 00:09:09,310 --> 00:09:12,000 which means that I'm going to remove the first element. 153 00:09:12,000 --> 00:09:14,640 So I'm going to remove the number 3, right? 154 00:09:14,640 --> 00:09:17,320 The first element. Okay. 155 00:09:17,320 --> 00:09:21,450 Now if I try to enqueue something else, what is going to happen? 156 00:09:21,450 --> 00:09:24,290 According to my implementation, 157 00:09:24,290 --> 00:09:31,040 I was going to put the next number in the index q.size. 158 00:09:31,040 --> 00:09:35,140 In this case, the size is 8, 159 00:09:35,140 --> 00:09:38,640 so the index 8 will be right here in the last position. 160 00:09:38,640 --> 00:09:43,900 If I try to enqueue 1 right here, I would be overwriting the last position 161 00:09:43,900 --> 00:09:45,870 to the number 1, which is completely wrong. 162 00:09:45,870 --> 00:09:49,870 What I want to do is wrap around and go to the first position. 163 00:09:49,870 --> 00:09:52,870 Maybe you would just say, well, I just have to check 164 00:09:52,870 --> 00:09:55,600 if I can actually put something there. 165 00:09:55,600 --> 00:09:58,560 If not, I just say, oh, the new full capacity 166 00:09:58,560 --> 00:10:02,010 is actually capacity - 1, and you cannot put an element there. 167 00:10:02,010 --> 00:10:06,150 But what is the problem? The problem is that if I just dequeue everything right here 168 00:10:06,150 --> 00:10:08,240 and then I try to add something else, it would just say, 169 00:10:08,240 --> 00:10:11,210 well, you were at full capacity, which is 0. 170 00:10:11,210 --> 00:10:13,620 So your queue is gone. 171 00:10:13,620 --> 00:10:16,990 You have to wrap around, and a way of wrapping around 172 00:10:16,990 --> 00:10:22,040 that you guys learned in visionary and other psets was using mod. 173 00:10:22,040 --> 00:10:29,090 You can try it at home to understand why you would do q.size + q.head 174 00:10:29,090 --> 00:10:31,080 mod capacity, but if you check right here, 175 00:10:31,080 --> 00:10:34,760 we can see that it works. 176 00:10:34,760 --> 00:10:37,760 So in the last example, q.size was 8 177 00:10:37,760 --> 00:10:47,590 and the head was 1, because it was this position here of the array. 178 00:10:47,590 --> 00:10:51,970 So it will be 8 + 1, 9. Mod capacity 9 would be 0. 179 00:10:51,970 --> 00:10:56,640 It would go to the index 0. We'll be in the right position. 180 00:10:56,640 --> 00:10:59,750 And then try the queue at home. 181 00:10:59,750 --> 00:11:04,950 Some important things: try to understand the difference between a stack and a queue. 182 00:11:04,950 --> 00:11:11,620 At home, try to get very familiar with implementing enqueue, dequeue, push and pop. 183 00:11:11,620 --> 00:11:16,560 And also understand when you would use each of them. 184 00:11:16,560 --> 00:11:22,830 >> So let's relax for 10 seconds with a bunch of Pokemons. 185 00:11:22,830 --> 00:11:26,080 And now let's go back to data structures. 186 00:11:26,080 --> 00:11:29,770 Hash tables. A lot of people were scared of hash tables. 187 00:11:29,770 --> 00:11:33,650 in problem set 6, Spell Checker. 188 00:11:33,650 --> 00:11:35,980 Hash tables and tries, a lot of people get scared of them. 189 00:11:35,980 --> 00:11:38,540 They think they're so hard to understand. Yeah? 190 00:11:38,540 --> 00:11:41,490 [Rob Bowden] Problem set 5. >>Problem set 5, yeah. Thanks Rob. 191 00:11:41,490 --> 00:11:43,370 Yeah. Six was Huff n' Puff, yeah. 192 00:11:43,370 --> 00:11:49,340 Problem set 5 was Spell Checker, and you had to use either a hash table or a try. 193 00:11:49,340 --> 00:11:55,360 A lot of people thought that they were super hard to understand, but they're actually pretty simple. 194 00:11:55,360 --> 00:12:01,290 What is a hash table, basically? A hash table is an array of linked lists. 195 00:12:01,290 --> 00:12:06,730 The only difference between an array and a hash table 196 00:12:06,730 --> 00:12:09,730 is that in the hash table you have something called a hash function. 197 00:12:09,730 --> 00:12:12,080 What is a hash function? 198 00:12:12,080 --> 00:12:13,970 I don't know if you guys can read here. 199 00:12:13,970 --> 00:12:16,090 This is an example of a hash table. 200 00:12:16,090 --> 00:12:19,220 So you can see that you have an array with 31 elements. 201 00:12:19,220 --> 00:12:22,440 And what we do in a hash table is have a hash function 202 00:12:22,440 --> 00:12:26,660 that is going to translate a key, each int to an index. 203 00:12:26,660 --> 00:12:31,740 If, for example, if I want to choose for B. Harrison, 204 00:12:31,740 --> 00:12:34,190 I would put B. Harrison in my hash functions, 205 00:12:34,190 --> 00:12:36,960 and the hash function would return 24. 206 00:12:36,960 --> 00:12:40,930 So I know that I want to store B. Harrison in 24. 207 00:12:40,930 --> 00:12:46,580 So that's the difference between just having an array and having a hash table. 208 00:12:46,580 --> 00:12:48,740 In the hash table you'll have a function that is going to tell you 209 00:12:48,740 --> 00:12:54,740 where to store the data that you want to store. 210 00:12:54,740 --> 00:12:57,040 For the hash function, you want to look for a hash function 211 00:12:57,040 --> 00:13:00,600 that is deterministic and well-distributed. 212 00:13:00,600 --> 00:13:07,810 As you can see here, you see that a lot of the data that I wanted to store was actually 19 213 00:13:07,810 --> 00:13:12,470 instead of using 31 and 30 and 29, which were all free. 214 00:13:12,470 --> 00:13:16,920 So the hash function that I used was not very well-distributed. 215 00:13:16,920 --> 00:13:20,710 When we say well-distributed, it means that we want to have, 216 00:13:20,710 --> 00:13:26,520 roughly, at least 1 or 2 for each of the-- 217 00:13:26,520 --> 00:13:32,190 like, a difference of 1 or 2 for each of the indices in the arrays. 218 00:13:32,190 --> 00:13:43,950 You want to have, roughly, the same number of elements in each linked list in the array. 219 00:13:43,950 --> 00:13:48,600 And it's easy to check if it's valid in the hash table, view as hash tables. 220 00:13:48,600 --> 00:13:51,770 >> Then trees. This is a tree. 221 00:13:51,770 --> 00:13:56,400 Trees in computer science are upside down for some reason. 222 00:13:56,400 --> 00:14:00,150 So right here you have the root of the tree and then the leaves. 223 00:14:00,150 --> 00:14:05,630 You should just know the nomenclature for parents and child. 224 00:14:05,630 --> 00:14:12,880 Each node has its children, which are the nodes that are below the parent. 225 00:14:12,880 --> 00:14:19,660 So, for example, 2 is going to be the parent for 3 and for the other child right there, 226 00:14:19,660 --> 00:14:25,290 while 3 is going to be the parent for 1 and the other children that are there. 227 00:14:25,290 --> 00:14:29,990 And 1 is going to be 3's child, and so on. 228 00:14:29,990 --> 00:14:34,610 We have something much more interesting, called a binary search tree, 229 00:14:34,610 --> 00:14:39,040 in which all the values on the right of a node 230 00:14:39,040 --> 00:14:41,660 are going to be on the right, right here--on the right, 231 00:14:41,660 --> 00:14:46,780 are going to be greater than the element in the root. 232 00:14:46,780 --> 00:14:49,780 So if I have the number 5 right here, all the elements on the right 233 00:14:49,780 --> 00:14:51,940 are going to be greater than 5, and on the left 234 00:14:51,940 --> 00:14:56,770 all the elements are going to be less than 5. 235 00:14:56,770 --> 00:14:58,780 Why is this useful? 236 00:14:58,780 --> 00:15:01,660 Well, if I want to check if the number 7 is here, for example, 237 00:15:01,660 --> 00:15:05,960 I just go to 5 first and I'm going to see, is 7 greater or less than 5? 238 00:15:05,960 --> 00:15:09,540 It's greater, so I know it's going to have to be on the right of the tree. 239 00:15:09,540 --> 00:15:13,980 So I have much less stuff to look at. 240 00:15:13,980 --> 00:15:19,520 In implementation of a binary search tree, the node, I'm just going to have to have data, 241 00:15:19,520 --> 00:15:21,750 so int n; you could also have a string 242 00:15:21,750 --> 00:15:23,630 or anything you wanted. 243 00:15:23,630 --> 00:15:28,100 You just have to be careful on defining what is greater, what is less. 244 00:15:28,100 --> 00:15:30,390 So if you had strings, for example, you could define 245 00:15:30,390 --> 00:15:34,690 that all those things on the right are going to have larger length, 246 00:15:34,690 --> 00:15:40,940 the left are going to have lower lengths, so it's really up to you. 247 00:15:40,940 --> 00:15:44,930 >> How can I implement find for BST? 248 00:15:44,930 --> 00:15:47,840 The first thing we'll have to do is check if the root is NULL. 249 00:15:47,840 --> 00:15:50,920 If it's NULL, it means that the thing is not there 250 00:15:50,920 --> 00:15:53,330 because you don't even have a tree, right? 251 00:15:53,330 --> 00:15:55,790 So I return false. 252 00:15:55,790 --> 00:15:58,740 Otherwise, I'm going to check if the number is greater 253 00:15:58,740 --> 00:16:01,720 than the value in the root. 254 00:16:01,720 --> 00:16:04,250 I'm going to try to find the element on the right 255 00:16:04,250 --> 00:16:08,590 of the tree. 256 00:16:08,590 --> 00:16:11,310 You see that I'm using recursion here. 257 00:16:11,310 --> 00:16:14,150 And then if it's less, I'm going to look at the left. 258 00:16:14,150 --> 00:16:18,330 And finally, otherwise, if it's not less or not greater, 259 00:16:18,330 --> 00:16:20,660 it means that it's the value itself. 260 00:16:20,660 --> 00:16:23,010 So I just return true. 261 00:16:23,010 --> 00:16:26,360 You can see here that I used if, if, if. 262 00:16:26,360 --> 00:16:30,820 And remember, in quiz 0, we had a problem that had if, if, if, 263 00:16:30,820 --> 00:16:32,780 and you were supposed to find the inefficiency, 264 00:16:32,780 --> 00:16:35,180 and the inefficiency was that you used if. 265 00:16:35,180 --> 00:16:39,060 You should have used if, else if, else if, and else. 266 00:16:39,060 --> 00:16:44,240 So, should I use else if and else if and else here? 267 00:16:44,240 --> 00:16:46,200 Does anyone--yeah? 268 00:16:46,200 --> 00:16:51,140 [Student speaking, inaudible] 269 00:16:51,140 --> 00:16:53,480 That's perfect. So she's saying that it doesn't matter, 270 00:16:53,480 --> 00:16:55,930 just because the inefficiency that we had before 271 00:16:55,930 --> 00:16:59,550 was that because, maybe if some condition was satisfied, 272 00:16:59,550 --> 00:17:03,570 so you have performed an action, but then you were going to check all of the other conditions. 273 00:17:03,570 --> 00:17:06,319 But in this case, it returned right away, so it doesn't matter. 274 00:17:06,319 --> 00:17:09,220 So you don't have to use else if. 275 00:17:09,220 --> 00:17:11,740 >> And finally, let's talk about tries, 276 00:17:11,740 --> 00:17:13,800 which is everyone's favorite. 277 00:17:13,800 --> 00:17:15,980 A try is a tree of arrays. 278 00:17:15,980 --> 00:17:20,369 It's very fast to look up values, but it uses a lot of memory. 279 00:17:20,369 --> 00:17:22,530 And it's usually to filter words, so when you 280 00:17:22,530 --> 00:17:27,920 want to implement, for example, I don't know, like a phone book in your phone 281 00:17:27,920 --> 00:17:30,440 and you want to be able to type B 282 00:17:30,440 --> 00:17:32,510 and just have names of people who have B. 283 00:17:32,510 --> 00:17:37,960 It's very easy to implement that using a try, for example. 284 00:17:37,960 --> 00:17:39,820 How do you define a node in a try? 285 00:17:39,820 --> 00:17:43,910 You just have to have a bool that is going to be is_word. 286 00:17:43,910 --> 00:17:48,660 That represents that using all the characters before that node, 287 00:17:48,660 --> 00:17:51,920 you were able to form a word, 288 00:17:51,920 --> 00:17:57,230 and then you'll have an array of pointers to nodes. 289 00:17:57,230 --> 00:18:03,120 Can you see that we have an array of parent nodes, so node* array? Yeah? 290 00:18:03,120 --> 00:18:06,050 So let's see how that will work. For the spell check, 291 00:18:06,050 --> 00:18:08,230 we have an array of 27 elements, 292 00:18:08,230 --> 00:18:12,150 because we have all the letters plus the apostrophe. 293 00:18:12,150 --> 00:18:17,800 Before here I'm just going to use 2 because I want to be able to write on the board. 294 00:18:17,800 --> 00:18:20,230 Okay. So this is an example of a try. 295 00:18:20,230 --> 00:18:25,600 If I just define the first node, I'll have an array of 2 elements 296 00:18:25,600 --> 00:18:29,290 that are 2 pointers to NULL, so I just put 'a' and 'b'. 297 00:18:29,290 --> 00:18:32,430 And I'm going to have a bool that says is_word. 298 00:18:32,430 --> 00:18:34,420 It's going to be false for the first one, 299 00:18:34,420 --> 00:18:37,370 just because, before that you don't have any characters. 300 00:18:37,370 --> 00:18:40,900 So an empty word is not a word. So it's false. 301 00:18:40,900 --> 00:18:46,320 If I want to add 'a' to this dictionary, what would I have to do? 302 00:18:46,320 --> 00:18:49,760 I would just have to malloc a new node for 'a', 303 00:18:49,760 --> 00:18:54,630 and then add its word to true. 304 00:18:54,630 --> 00:19:00,180 So it just represents that having 'a' is going to be true. Make sense? 305 00:19:00,180 --> 00:19:04,120 Then if I want to add 'ba', I'll have to malloc 1 for 'b', 306 00:19:04,120 --> 00:19:07,550 and then I'm going to set up the boolean to false, 307 00:19:07,550 --> 00:19:10,160 because 'b' by itself is not a word. 308 00:19:10,160 --> 00:19:13,010 Then I'm going to malloc another one for 'a', so 'ba', 309 00:19:13,010 --> 00:19:16,290 and then I'm going to set up it's a word to true. 310 00:19:16,290 --> 00:19:18,950 Because 'ba' is a word. 311 00:19:18,950 --> 00:19:21,910 And then if I want to see if 'b' is in this dictionary, 312 00:19:21,910 --> 00:19:26,730 I can just go to the first one, 'b'. I go down, and I look at is word, and it says false. 313 00:19:26,730 --> 00:19:30,110 So it's not a word. If I want to check 'ba', 314 00:19:30,110 --> 00:19:38,010 I go to the first one, 'b', and then go to 'a', and I see true, so it is a word. Make sense? 315 00:19:38,010 --> 00:19:41,950 A lot of people get confused by tries. No? 316 00:19:41,950 --> 00:19:44,740 >> Finally, Huffman coding. Huffman coding is very useful 317 00:19:44,740 --> 00:19:47,550 to save memory and compress text files, 318 00:19:47,550 --> 00:19:52,270 just because a lot of times you use 'a' and 'e', for example, 319 00:19:52,270 --> 00:19:57,710 in your documents, but I don't know if you guys use 'q' or 'z' as much. 320 00:19:57,710 --> 00:20:02,040 Having just 1 byte for every single character, 321 00:20:02,040 --> 00:20:08,520 every single--the 256 characters that we have in the ASCII table is not very optimal, 322 00:20:08,520 --> 00:20:11,410 just because there are some characters that you use much more, 323 00:20:11,410 --> 00:20:15,180 so you should probably use less memory for those. 324 00:20:15,180 --> 00:20:17,560 How do I use Huffman coding? 325 00:20:17,560 --> 00:20:20,010 We have to do a Huffman tree. 326 00:20:20,010 --> 00:20:23,370 A Huffman tree has nodes 327 00:20:23,370 --> 00:20:27,760 that have a symbol that is going to be like, 'a', 'b', 'c', the letter, 328 00:20:27,760 --> 00:20:32,990 whatever letter you have, a frequency that is the frequency that the word appears in the text, 329 00:20:32,990 --> 00:20:36,280 that you were creating the Huffman tree for, 330 00:20:36,280 --> 00:20:41,800 and then a node that is going to point to the left of the Huffman tree 331 00:20:41,800 --> 00:20:47,210 and another node that is going to point to the right. So just like a tree. 332 00:20:47,210 --> 00:20:49,440 How do you build a Huffman tree? 333 00:20:49,440 --> 00:20:54,020 You're going to pick the 2 nodes that have the lowest frequencies. 334 00:20:54,020 --> 00:20:56,490 If you have a tie you're going to pick the 2 nodes 335 00:20:56,490 --> 00:20:59,870 that have the lowest ASCII values as well. 336 00:20:59,870 --> 00:21:02,420 Then you're going to create a new tree out of those 2 nodes 337 00:21:02,420 --> 00:21:08,030 that is going to have the combined frequency in the parent node. 338 00:21:08,030 --> 00:21:13,240 And then you're going to remove the 2 children from the forest 339 00:21:13,240 --> 00:21:15,570 and replace them with the parent. 340 00:21:15,570 --> 00:21:18,930 And you're going to repeat that until you only have 1 tree in the forest. 341 00:21:18,930 --> 00:21:23,840 So let's see how you would do a Huffman tree for ZAMYLA. 342 00:21:23,840 --> 00:21:29,220 You can see here that all the letters have frequency 1 except for 'A'; that has frequency 2. 343 00:21:29,220 --> 00:21:34,090 So I created nodes for all the letters I put in order of ASCII value and frequency. 344 00:21:34,090 --> 00:21:40,090 So if I want to create the first tree, it will be with 'L' and 'M'. 345 00:21:40,090 --> 00:21:43,100 So it's here. The frequency of the pair will be 2 346 00:21:43,100 --> 00:21:49,470 because it's 1 + 1, then the next 2 with the lowest frequencies are 'Y' and 'Z'. 347 00:21:49,470 --> 00:21:53,180 And then I have all of them being--have a frequency of 2. 348 00:21:53,180 --> 00:22:00,470 So which ones are the ones that have the lowest ASCII value for the next one? 349 00:22:00,470 --> 00:22:04,830 'A' and 'L'. So I create the new node, 350 00:22:04,830 --> 00:22:09,930 and finally, it's 4 and 2, so 2 is going to be on the left. 351 00:22:09,930 --> 00:22:12,430 And this is the Huffman tree. 352 00:22:12,430 --> 00:22:16,060 Then if I want to write some text, 353 00:22:16,060 --> 00:22:24,440 like in binary to convert to text, using the Huffman tree is very easy. 354 00:22:24,440 --> 00:22:30,220 For example, if I say that moving to the left is a 0 and moving to the right is a 1, 355 00:22:30,220 --> 00:22:32,410 What is that going to represent? 356 00:22:32,410 --> 00:22:35,530 So like 1, 1, so right, right, 357 00:22:35,530 --> 00:22:40,370 and then 0, so left would be L, and then 1, 0, 0. 358 00:22:40,370 --> 00:22:43,950 So 1, 0, so just 1, 0, 'A'. 359 00:22:43,950 --> 00:22:47,540 And then 0, 1, so 'Z'. 360 00:22:47,540 --> 00:22:52,170 And then 1, 0, 0--no. 361 00:22:52,170 --> 00:22:56,780 0, 0 will be 'Y', so Lazy. 362 00:22:56,780 --> 00:23:06,060 So that's all for me, Rob's going to take over. 363 00:23:06,060 --> 00:23:08,400 >> [Rob Bowden] So, week 7 stuff. 364 00:23:08,400 --> 00:23:11,390 We've got a lot to go over really fast. 365 00:23:11,390 --> 00:23:13,430 Bitwise operators, buffer overflow, 366 00:23:13,430 --> 00:23:16,760 CS50 library, then HTML, HTTP, CSS. 367 00:23:16,760 --> 00:23:20,990 All in like 15 to 20 minutes. 368 00:23:20,990 --> 00:23:24,330 Bitwise operators. There are 6 of them that you need to know. 369 00:23:24,330 --> 00:23:31,200 Bitwise and, bitwise or, XOR, left shift, right shift, and not. 370 00:23:31,200 --> 00:23:35,420 Right shift and not you barely saw in lecture at all. 371 00:23:35,420 --> 00:23:40,480 We'll go over it quickly here, but it's good to know that these are the 6 that exist. 372 00:23:40,480 --> 00:23:45,070 Remember that bitwise operators are like when you do 3 + 4. 373 00:23:45,070 --> 00:23:49,420 You aren't dealing with the binary of 3 and 4. 374 00:23:49,420 --> 00:23:56,550 With bitwise operators you are actually dealing with the individual bits of the numbers 3 and 4. 375 00:23:56,550 --> 00:23:59,120 >> So the first one that we'll say is bitwise not, 376 00:23:59,120 --> 00:24:02,340 and all it does is flip all the bits. 377 00:24:02,340 --> 00:24:05,500 So here, if you're writing this in C, you wouldn't write it 378 00:24:05,500 --> 00:24:09,380 as ~11011 or whatever, you would write it like ~4, 379 00:24:09,380 --> 00:24:12,970 and then it would flip the binary representation of 4. 380 00:24:12,970 --> 00:24:24,800 So here, ~ of some binary number 1101101 is going to exactly flip all 1's to 0's and all 0's to 1's. 381 00:24:24,800 --> 00:24:27,600 As I say there, the frequent use of this, 382 00:24:27,600 --> 00:24:30,830 and we'll see it in a bit, is like we want to come up with some number 383 00:24:30,830 --> 00:24:35,460 where all of the bits are 1, except for one of them. 384 00:24:35,460 --> 00:24:38,560 So it's usually easier to express the number 385 00:24:38,560 --> 00:24:40,630 where just that single bit is set, 386 00:24:40,630 --> 00:24:44,650 and then take the ~ of it, so every other bit is set except for that one. 387 00:24:44,650 --> 00:24:50,300 So that's what we're going to use more in a bit. 388 00:24:50,300 --> 00:24:58,220 >> Bitwise or. Here are 2 binary numbers, and these 2 numbers 389 00:24:58,220 --> 00:25:00,780 are pretty representative, since they represent every possible 390 00:25:00,780 --> 00:25:07,290 combination of bits you could need to operate on. 391 00:25:07,290 --> 00:25:13,540 Here, when I or'd each bit, we're just going to compare straight down. 392 00:25:13,540 --> 00:25:15,410 So on the left side we have a 1 and a 1. 393 00:25:15,410 --> 00:25:20,510 When I bitwise | those, what am I going to get? One. 394 00:25:20,510 --> 00:25:25,320 Then bitwise | 0 and 1 is going to give me? One. 395 00:25:25,320 --> 00:25:27,840 Bitwise 1 and 0 is going to be the same thing, one. 396 00:25:27,840 --> 00:25:31,880 Bitwise 0 | 0 is going to give me 0. 397 00:25:31,880 --> 00:25:37,300 So the only case where I get 0 is in the 0 | 0 case. 398 00:25:37,300 --> 00:25:40,020 And you can think of that just like your logical ors. 399 00:25:40,020 --> 00:25:44,830 So if you think of 1 as true and 0 as false, the same thing applies here. 400 00:25:44,830 --> 00:25:50,040 So true or true is true; true or false is true. 401 00:25:50,040 --> 00:25:57,150 False or true is true; false or false is the only thing that's actually false. 402 00:25:57,150 --> 00:26:00,100 Here's the example that you should know 403 00:26:00,100 --> 00:26:05,160 as a pretty good example of when bitwise operators are used. 404 00:26:05,160 --> 00:26:08,660 Here if we or capital 'A' with Ox20, 405 00:26:08,660 --> 00:26:11,830 and we'll look at these in a second, we get something. 406 00:26:11,830 --> 00:26:16,020 And if we or lowercase 'a' with Ox20, we get something. 407 00:26:16,020 --> 00:26:26,750 So let's pull up ASCII table. 408 00:26:26,750 --> 00:26:34,000 Okay. Here we see that 'A' is-- 409 00:26:34,000 --> 00:26:36,920 here we have 'A' is decimal 65. 410 00:26:36,920 --> 00:26:45,120 But I'll go with hexadecimal, which is Ox41. 411 00:26:45,120 --> 00:26:48,280 Pretty sure we saw it in class. I think we saw it in class 412 00:26:48,280 --> 00:26:52,730 that it's pretty easy to convert from hexadecimal to binary. 413 00:26:52,730 --> 00:26:55,280 So here, if I want to put 4 into binary, 414 00:26:55,280 --> 00:26:59,550 that's just going to be 0100. 415 00:26:59,550 --> 00:27:03,620 This is 1's place, 2's place, 4's place, so this is 4. 416 00:27:03,620 --> 00:27:08,550 Then I can split 1 into binary, which is going to be 0001. 417 00:27:08,550 --> 00:27:14,280 And so this is going to be the representation of 'A' in binary. 418 00:27:14,280 --> 00:27:22,720 Taking lowercase 'a', it's now going to be Ox61, 419 00:27:22,720 --> 00:27:27,050 where, splitting these up into its binary, so a 6-- 420 00:27:27,050 --> 00:27:37,830 Let's actually do it--is there no eraser? Eraser. 421 00:27:37,830 --> 00:27:48,220 Ox61. So splitting 6 into binary is going to be 0 + 4 + 2 + 0. 422 00:27:48,220 --> 00:27:54,610 And splitting 1 is going to be 0001. 423 00:27:54,610 --> 00:27:56,520 Looking at the difference between these 2, 424 00:27:56,520 --> 00:28:04,250 we see that the only difference between a lowercase and a capital 'A' is this single bit. 425 00:28:04,250 --> 00:28:11,810 So coming back to here--okay. 426 00:28:11,810 --> 00:28:15,920 Coming back to here, if we look at what the bit Ox20 is, 427 00:28:15,920 --> 00:28:22,210 so splitting Ox20 into its binary, 428 00:28:22,210 --> 00:28:27,310 is 0010, 0000. 429 00:28:27,310 --> 00:28:33,470 Ox20, the only bit that is set is this bit that we are concerned with, 430 00:28:33,470 --> 00:28:38,210 with switching between capital and lowercase 'a'. 431 00:28:38,210 --> 00:28:47,610 If I or 'A', which is this one, 'A', 432 00:28:47,610 --> 00:28:50,580 if I or 'A' with Ox20, 433 00:28:50,580 --> 00:28:53,490 what am I going to get? 434 00:28:53,490 --> 00:28:58,960 [Student, inaudible] >>Lowercase 'a', because it's going to flip this bit to a 1. 435 00:28:58,960 --> 00:29:04,170 And if I or 'a' with Ox20, what am I going to get? 436 00:29:04,170 --> 00:29:08,780 Lowercase a, because just oring 'a' with Ox20, 437 00:29:08,780 --> 00:29:14,580 I'm just going to be oring this single bit to a 1; it's already a 1, so it doesn't matter. 438 00:29:14,580 --> 00:29:17,960 So we get 'a' and 'a'. 439 00:29:17,960 --> 00:29:24,820 >> Bitwise and. Again, we can think of this as our logical and counterpart. 440 00:29:24,820 --> 00:29:28,180 On the left side we have true & true. 441 00:29:28,180 --> 00:29:31,160 It's going to be true, and for all of the cases, 442 00:29:31,160 --> 00:29:36,270 false & true or true & false, or false & false, 443 00:29:36,270 --> 00:29:38,550 none of those things are true. 444 00:29:38,550 --> 00:29:44,170 So what we end up getting is 1000. 445 00:29:44,170 --> 00:29:48,830 So now, here, here's where I've used the trusty bitwise not, 446 00:29:48,830 --> 00:29:52,230 where we had Ox20. 447 00:29:52,230 --> 00:29:54,350 So this is Ox20. 448 00:29:54,350 --> 00:29:59,570 Now what I want to do, bitwise ~ of Ox20. 449 00:29:59,570 --> 00:30:03,600 That is going to flip all the bits. 450 00:30:03,600 --> 00:30:09,330 So I have 1101, 1111. 451 00:30:09,330 --> 00:30:18,940 And so 'A' anded with ~Ox20 is going to give me what? 452 00:30:18,940 --> 00:30:22,430 The only bit we really need to think about is this one, 453 00:30:22,430 --> 00:30:26,020 since, if all of these bits are set to 1, 454 00:30:26,020 --> 00:30:29,000 then we're going to get exactly what 'A' was, 455 00:30:29,000 --> 00:30:31,260 except for, possibly, what this bit is. 456 00:30:31,260 --> 00:30:34,460 Because if it was a 1, now it's going to be set to a 0, 457 00:30:34,460 --> 00:30:39,810 because whatever this is, anded with this is going to be 0. 458 00:30:39,810 --> 00:30:43,280 So what is 'A'&~Ox20 going to give me? 459 00:30:43,280 --> 00:30:48,200 [Students answer, inaudible] >>And what is 'a' and--it's 'A'. 460 00:30:48,200 --> 00:30:52,170 And what is 'a'&~Ox20 going to give me? 461 00:30:52,170 --> 00:30:56,720 'A.' Because this is currently a 1. 462 00:30:56,720 --> 00:30:59,570 Anding with this 0 is going to make it a 0, 463 00:30:59,570 --> 00:31:02,530 and now we're going to get a 'A'. 464 00:31:02,530 --> 00:31:06,600 >> Both are 'A,' and last but not least of this type, 465 00:31:06,600 --> 00:31:10,830 we have XOR. It's very much like or, 466 00:31:10,830 --> 00:31:14,400 except it means exclusively or. 467 00:31:14,400 --> 00:31:18,420 This is like what you usually think of as or in the real world. 468 00:31:18,420 --> 00:31:23,190 So you do either 'x' or 'y', but not both. 469 00:31:23,190 --> 00:31:28,700 Here 1^1 is going to be 0. 470 00:31:28,700 --> 00:31:33,650 Because true, this is--it doesn't work as well with the logical true and false 471 00:31:33,650 --> 00:31:37,150 as bitwise & and or do, 472 00:31:37,150 --> 00:31:40,100 but true ^ true is false. 473 00:31:40,100 --> 00:31:44,810 Because we only want to return true if only one of them is true. 474 00:31:44,810 --> 00:31:50,950 So 1^1 is 0. What about 0^1? 475 00:31:50,950 --> 00:31:56,010 Is 1. 1^0 is 1, 0^0 is 0. 476 00:31:56,010 --> 00:32:03,890 So under all circumstances, 0 bitwise something 0 is going to be 0. 477 00:32:03,890 --> 00:32:10,270 1 bitwise something 0 or 0 bitwise 1, 478 00:32:10,270 --> 00:32:14,660 if it's | or ^, it'll be a 1, and if it's & it'll be 0. 479 00:32:14,660 --> 00:32:20,850 And the only case where 1 bitwise 1 is not 1 is with exclusive or. 480 00:32:20,850 --> 00:32:24,580 That's 0110. 481 00:32:24,580 --> 00:32:36,520 So here now, using XOR--so we're back at 20. 482 00:32:36,520 --> 00:32:43,480 'A'^Ox20 is these 2 bits we're comparing. 483 00:32:43,480 --> 00:32:50,020 So a 1 ^ 0 is going to give me a what? A one. 484 00:32:50,020 --> 00:32:58,430 'A'^Ox20 is going to give me? Lowercase a. 485 00:32:58,430 --> 00:33:04,010 'a'^Ox20 is going to give me? Capital A. 486 00:33:04,010 --> 00:33:09,310 Because whatever this is doing, this XORing with Ox20 487 00:33:09,310 --> 00:33:15,380 is effectively flipping whatever this bit is. 488 00:33:15,380 --> 00:33:21,240 If this is a 0, it's now going to become a 1. 489 00:33:21,240 --> 00:33:26,160 Since this is a 1, 1 ^ 1 is 0. 490 00:33:26,160 --> 00:33:33,280 So our 'a' has become 'A', and our 'A' has become 'a'. 491 00:33:33,280 --> 00:33:36,910 So XOR is a really convenient way of just flipping the case. 492 00:33:36,910 --> 00:33:39,960 You just want to iterate over a string of letters 493 00:33:39,960 --> 00:33:44,330 and alternate the case of every single character, 494 00:33:44,330 --> 00:33:50,680 you just XOR everything with Ox20. 495 00:33:50,680 --> 00:33:55,220 >> Now we have left shift. Left shift is just going to, basically, 496 00:33:55,220 --> 00:34:01,250 push all of the numbers into, or to the left, and insert 0's behind them. 497 00:34:01,250 --> 00:34:05,550 So here we have 00001101. 498 00:34:05,550 --> 00:34:08,560 We're going to push 3 0's in from the right, 499 00:34:08,560 --> 00:34:13,580 and we get 01101000. 500 00:34:13,580 --> 00:34:16,380 In nonbinary terms, 501 00:34:16,380 --> 00:34:24,699 we see that that's really dealing 13 left-shifted with 3, which gives us 104. 502 00:34:24,699 --> 00:34:32,530 So left shifting, we see here, x< 00:34:40,139 13 * 2^3, 2^3 is 8, so 13 * 8 is 104. 504 00:34:40,139 --> 00:34:45,679 If you just think about binary in general, how each digit, 505 00:34:45,679 --> 00:34:49,530 if we start from the right, it's the 1's place, then the 2's place, then the 4's place. 506 00:34:49,530 --> 00:34:51,330 So by pushing in 0's from the right, 507 00:34:51,330 --> 00:34:55,080 we're just pushing things that were in the 4's place to the 8's place, 508 00:34:55,080 --> 00:34:57,920 and things that were in the 8's place to the 16's place. 509 00:34:57,920 --> 00:35:01,280 Each shift just multiplies by 2. Yeah? 510 00:35:01,280 --> 00:35:05,210 [Student] What happens if you shifted by 5? 511 00:35:05,210 --> 00:35:10,790 [Bowden] If you shifted by 5 you would just lose digits. 512 00:35:10,790 --> 00:35:15,410 Inevitably, it's the same thing. Like, integers are only 32 bits, 513 00:35:15,410 --> 00:35:20,750 so if you add 2 really big integers, it just doesn't fit in an integer. 514 00:35:20,750 --> 00:35:23,660 So it's the same thing here. If you shifted by 5, 515 00:35:23,660 --> 00:35:25,650 we would just lose that one. 516 00:35:25,650 --> 00:35:28,820 And that's kind of what I mean by "roughly," 517 00:35:28,820 --> 00:35:37,470 where if you shift too far, you lose bits. 518 00:35:37,470 --> 00:35:39,830 >> Right shift is going to be the opposite, 519 00:35:39,830 --> 00:35:43,090 where we're going to shove 0's off the end, 520 00:35:43,090 --> 00:35:48,400 and for our purposes, fill in 0's from the left. 521 00:35:48,400 --> 00:35:52,910 So doing this, we're basically reversing what we had already done. 522 00:35:52,910 --> 00:35:57,780 And we see that the three 0's on the right have just fallen off, 523 00:35:57,780 --> 00:36:02,020 and we have pushed the 1101 all the way to the right. 524 00:36:02,020 --> 00:36:08,380 This is doing 104>>3, which is, effectively, x / 2^y. 525 00:36:08,380 --> 00:36:11,200 So now, here, it's a similar idea. 526 00:36:11,200 --> 00:36:18,720 Why is it just roughly x / 2^y, and not actually x / 2^y? 527 00:36:18,720 --> 00:36:22,240 Because if I had shifted by 4, I would have lost a 1. 528 00:36:22,240 --> 00:36:25,950 Basically, what you think of, just think of integer division in general. 529 00:36:25,950 --> 00:36:31,070 So, like 5 / 2 is 2. It's not 2.5. 530 00:36:31,070 --> 00:36:35,000 It's the same idea here. When we divide by 2, 531 00:36:35,000 --> 00:36:39,910 we can lose odd bits along the way. 532 00:36:39,910 --> 00:36:43,870 So now--that's it for bitwise. That's all you need to know. 533 00:36:43,870 --> 00:36:46,340 Remember the use cases we saw in class, 534 00:36:46,340 --> 00:36:49,340 like a bit mask is useful for bitwise operators, 535 00:36:49,340 --> 00:36:53,220 or you use them for bit masks. 536 00:36:53,220 --> 00:36:58,620 Capital letters and lowercase letters, conversions is a pretty prototypical example. 537 00:36:58,620 --> 00:37:01,640 >> Okay, so buffer overflow attacks. 538 00:37:01,640 --> 00:37:05,110 Anyone remember what was wrong with this function? 539 00:37:05,110 --> 00:37:10,140 Notice we declared an array of 12 bytes, 12 chars, 540 00:37:10,140 --> 00:37:18,510 and then we copy into our buffer of 12 chars the entire string bar. 541 00:37:18,510 --> 00:37:25,080 So what's the problem here? 542 00:37:25,080 --> 00:37:32,270 The magic number 12 should pretty much immediately pop out as--why 12? 543 00:37:32,270 --> 00:37:35,050 What if bar happens to be more than 12 characters? 544 00:37:35,050 --> 00:37:41,200 What if bar is millions of characters? 545 00:37:41,200 --> 00:37:46,010 Here the issue is memcpy. If bar is long enough, 546 00:37:46,010 --> 00:37:50,330 it will just completely--'c', 'c' doesn't care that it was only 12 characters; 547 00:37:50,330 --> 00:37:53,280 'c' doesn't care that it can't fit that many bytes. 548 00:37:53,280 --> 00:37:58,250 It will just completely overwrite char, the 12 bytes we've allocated for it, 549 00:37:58,250 --> 00:38:01,830 and everything past it in memory that doesn't actually belong to that buffer 550 00:38:01,830 --> 00:38:06,520 with whatever the string bar is. 551 00:38:06,520 --> 00:38:09,780 So this was the picture we saw in class 552 00:38:09,780 --> 00:38:12,220 where we have our stack growing up. 553 00:38:12,220 --> 00:38:16,040 You should be used to these pictures or get familiar with them again. 554 00:38:16,040 --> 00:38:21,260 We have our stack growing up, memory addresses start at 0 at the top 555 00:38:21,260 --> 00:38:26,270 and grow down to like 4 billion at the bottom. 556 00:38:26,270 --> 00:38:28,820 We have our array 'c' somewhere in memory, 557 00:38:28,820 --> 00:38:32,260 then we have our pointer to bar right underneath it, 558 00:38:32,260 --> 00:38:38,720 and then we have this saved frame pointer in our return address and our parent routine's stack. 559 00:38:38,720 --> 00:38:40,800 Remember what the return address is? 560 00:38:40,800 --> 00:38:45,360 It's when main calls a function foo, calls a function bar, 561 00:38:45,360 --> 00:38:48,100 inevitably, bar returns. 562 00:38:48,100 --> 00:38:52,610 So when bar returns, they need to know that it's going back to foo that called it. 563 00:38:52,610 --> 00:39:01,360 So the return address is the address of the function that it has to return to when the function returns. 564 00:39:01,360 --> 00:39:05,830 The reason that's important for buffer overflow attacks is because, conveniently, 565 00:39:05,830 --> 00:39:09,580 hackers like to change that return address. 566 00:39:09,580 --> 00:39:14,950 Instead of going back to foo, I'm going to go back to wherever the hacker wants me to go back to. 567 00:39:14,950 --> 00:39:17,760 And, conveniently, where the hacker frequently wants to go back to 568 00:39:17,760 --> 00:39:22,400 is the start of the buffer that we originally had. 569 00:39:22,400 --> 00:39:26,170 So notice, again, Little Indian. 570 00:39:26,170 --> 00:39:28,490 The appliance is an example of a Little Indian system, 571 00:39:28,490 --> 00:39:34,140 so an integer or a pointer is stored with the bytes reversed. 572 00:39:34,140 --> 00:39:38,980 So here we see--is this? Yeah. 573 00:39:38,980 --> 00:39:45,660 We see Ox80, OxC0, Ox35, OxO8. 574 00:39:45,660 --> 00:39:48,250 Remember the hexadecimal digits? 575 00:39:48,250 --> 00:39:50,640 We don't reverse the hexadecimal digits in Little Indian, 576 00:39:50,640 --> 00:39:56,110 because 2 hexadecimal digits make up a single byte, and we reverse the bytes. 577 00:39:56,110 --> 00:40:00,300 That's why we don't store, like, 80530CO8. 578 00:40:00,300 --> 00:40:07,520 We store, instead, each pair of 2 digits, starting from the right. 579 00:40:07,520 --> 00:40:10,880 That address refers to the address of the start 580 00:40:10,880 --> 00:40:15,190 of our buffer that we actually wanted to copy into in the first place. 581 00:40:15,190 --> 00:40:19,230 The reason that's useful is because, what if the attacker 582 00:40:19,230 --> 00:40:24,100 happened to, instead of having a string that was just 583 00:40:24,100 --> 00:40:27,060 a harmless string of like, their name or something, 584 00:40:27,060 --> 00:40:33,900 what if, instead, that string were just some arbitrary code 585 00:40:33,900 --> 00:40:38,610 that did whatever they wanted it to do? 586 00:40:38,610 --> 00:40:45,630 So they could--I can't think of any cool code. 587 00:40:45,630 --> 00:40:47,780 It could be anything, though. Any disastrous code. 588 00:40:47,780 --> 00:40:51,440 If they wanted, they could just do something at seg faults, but that would be pointless. 589 00:40:51,440 --> 00:40:54,950 They usually do it to hack your system. 590 00:40:54,950 --> 00:40:59,930 >> Okay. CS50 library. 591 00:40:59,930 --> 00:41:04,800 This is, basically, getInt, getString, all those functions we provided for you. 592 00:41:04,800 --> 00:41:10,630 So we have char* string, and that's the abstraction that we blew away 593 00:41:10,630 --> 00:41:12,450 at some point during the semester. 594 00:41:12,450 --> 00:41:18,220 Remember that a string is just an array of characters. 595 00:41:18,220 --> 00:41:23,240 So here we see an abridged version of getString. 596 00:41:23,240 --> 00:41:25,920 You should look back at it to remember how it's actually implemented. 597 00:41:25,920 --> 00:41:30,950 Key details are, notice we get in a single character at a time 598 00:41:30,950 --> 00:41:34,570 from standard in, which is just like us typing at the keyboard. 599 00:41:34,570 --> 00:41:37,890 So a single character at a time, and if we get too many characters, 600 00:41:37,890 --> 00:41:40,580 so if n + 1 is greater than capacity, 601 00:41:40,580 --> 00:41:44,140 then we need to increase the capacity of our buffer. 602 00:41:44,140 --> 00:41:47,780 So here we're doubling the size of our buffer. 603 00:41:47,780 --> 00:41:51,840 And that keeps going; we insert the character into our buffer 604 00:41:51,840 --> 00:41:56,220 until we receive a new line or end of file or whatever, 605 00:41:56,220 --> 00:41:59,380 in which case, we're done with the string and then the real getString 606 00:41:59,380 --> 00:42:05,120 shrinks the memory, like if we allocated too much memory it'll go back and shrink a bit. 607 00:42:05,120 --> 00:42:08,830 So we don't show that, but the main idea is 608 00:42:08,830 --> 00:42:11,960 it has to read in a single character at a time. 609 00:42:11,960 --> 00:42:17,140 It can't just read in a whole thing at once, 610 00:42:17,140 --> 00:42:19,550 because their buffer is only of a certain size. 611 00:42:19,550 --> 00:42:26,590 So if the string that it tries to insert into buffer is too big, then it would overflow. 612 00:42:26,590 --> 00:42:28,940 So here we prevent that by only reading in a single character 613 00:42:28,940 --> 00:42:33,750 at a time and growing whenever we need to. 614 00:42:33,750 --> 00:42:40,270 So getInt and the other CS50 library functions tend to use getString 615 00:42:40,270 --> 00:42:42,310 in their implementations. 616 00:42:42,310 --> 00:42:45,370 So I highlighted the important things here. 617 00:42:45,370 --> 00:42:49,460 It calls getString to get a string. 618 00:42:49,460 --> 00:42:51,710 If getString failed to return memory, 619 00:42:51,710 --> 00:42:54,270 remember that getString mallocs something, so whenever you call getString 620 00:42:54,270 --> 00:42:57,820 you shouldn't (unintelligible) free that string that you got. 621 00:42:57,820 --> 00:43:02,870 So here, if it failed to malloc something, we return INT_MAX as just a flag that, 622 00:43:02,870 --> 00:43:05,650 hey, we weren't actually able to get an integer. 623 00:43:05,650 --> 00:43:10,830 You should ignore whatever I return to you, or 624 00:43:10,830 --> 00:43:15,540 you should not treat this as a valid input. 625 00:43:15,540 --> 00:43:21,360 Finally, assuming that did succeed, we use sscanf with that special flag, 626 00:43:21,360 --> 00:43:23,820 which means, first match an integer, 627 00:43:23,820 --> 00:43:26,770 then match any characters after that integer. 628 00:43:26,770 --> 00:43:29,070 So notice we want it to equal 1. 629 00:43:29,070 --> 00:43:32,940 So sscanf returns how many matches if successfully made? 630 00:43:32,940 --> 00:43:37,010 It will return 1 if it successfully matched an integer, 631 00:43:37,010 --> 00:43:40,890 it will return 0 if it did not match an integer, and it will return 2 632 00:43:40,890 --> 00:43:45,920 if it matched an integer followed by some character. 633 00:43:45,920 --> 00:43:49,780 So notice we retry if we match anything but 1. 634 00:43:49,780 --> 00:43:55,230 So if we entered 1, 2, 3, C, or 1, 2, 3, X, 635 00:43:55,230 --> 00:43:57,400 then 1, 2, 3 would get stored in the integer, 636 00:43:57,400 --> 00:43:59,620 X would get stored at the character, 637 00:43:59,620 --> 00:44:06,410 sscanf would return 2, and we would retry, because we only want an integer. 638 00:44:06,410 --> 00:44:09,810 >> Quickly blowing through HTML, HTTP, CSS. 639 00:44:09,810 --> 00:44:15,340 HyperText Markup Language is the structure and semantics of the web. 640 00:44:15,340 --> 00:44:19,960 Here is the example from lecture where we have HTML tags. 641 00:44:19,960 --> 00:44:22,110 We have head tags, body tags, 642 00:44:22,110 --> 00:44:27,770 we have examples of empty tags where we actually don't have a start and close tag, 643 00:44:27,770 --> 00:44:30,820 we just have link and image. 644 00:44:30,820 --> 00:44:38,480 There is no closing image tag; there's just a single tag that accomplishes everything the tag needs to do. 645 00:44:38,480 --> 00:44:41,950 The link is an example; we'll see how you link to CSS, 646 00:44:41,950 --> 00:44:45,910 the script is an example of how you link to an external JavaScript. 647 00:44:45,910 --> 00:44:53,100 It's pretty straightforward, and remember, HTML is not a programming language. 648 00:44:53,100 --> 00:44:58,250 Here, remember how you would define a form or at least what this would do? 649 00:44:58,250 --> 00:45:01,740 Such a form has an action and a method. 650 00:45:01,740 --> 00:45:06,210 The methods you will only ever see are GET and POST. 651 00:45:06,210 --> 00:45:09,040 So GET is the version where the thing gets put in the URL. 652 00:45:09,040 --> 00:45:11,680 POST is where it is not put in the URL. 653 00:45:11,680 --> 00:45:18,520 Instead, any data from the form is inserted more hidden in the HTTP request. 654 00:45:18,520 --> 00:45:22,390 So here, action defines where the HTTP request goes. 655 00:45:22,390 --> 00:45:27,490 Where it's going is google.com/search. 656 00:45:27,490 --> 00:45:32,890 Method. Remember the differences between GET and POST, 657 00:45:32,890 --> 00:45:37,200 and, just say as an example, if you want to bookmark something. 658 00:45:37,200 --> 00:45:40,660 You will never be able to bookmark a POST URL 659 00:45:40,660 --> 00:45:44,970 because the data is not included in the URL. 660 00:45:44,970 --> 00:45:49,790 >> HTTP, now, is HyperText Transfer Protocol. 661 00:45:49,790 --> 00:45:54,080 The HyperText Transfer Protocol, you would expect it to transfer 662 00:45:54,080 --> 00:45:57,710 HyperText Markup Language, and it does. 663 00:45:57,710 --> 00:46:00,170 But it also transfers any images you find on the Web, 664 00:46:00,170 --> 00:46:05,400 any downloads you make start as an HTTP request. 665 00:46:05,400 --> 00:46:10,350 So HTTP is just the language of the World Wide Web. 666 00:46:10,350 --> 00:46:15,610 And here you need to recognize this kind of an HTTP request. 667 00:46:15,610 --> 00:46:19,300 Here HTTP/1.1 on the side just says that's the version 668 00:46:19,300 --> 00:46:21,570 of the protocol I'm using. 669 00:46:21,570 --> 00:46:25,770 It's pretty much always going to be HTTP/1.1, as you'll see it. 670 00:46:25,770 --> 00:46:30,110 Then we see that this was GET, the alternative being POST, that you might see. 671 00:46:30,110 --> 00:46:40,790 And the URL that I was trying to visit was www.google.com/search?q= blah, blah, blah. 672 00:46:40,790 --> 00:46:44,240 So remember that this, the question mark q= blah blah blah, 673 00:46:44,240 --> 00:46:49,040 is the sort of stuff that is submitted by a form. 674 00:46:49,040 --> 00:46:51,830 The response it might return to me would look something like this. 675 00:46:51,830 --> 00:46:54,050 Again, starting with the protocol, which is going to be that, 676 00:46:54,050 --> 00:46:59,190 followed by the status code. Here it's 200 OK. 677 00:46:59,190 --> 00:47:05,060 And finally, the web page that I actually asked for will be followed. 678 00:47:05,060 --> 00:47:08,210 The possible status code you might see, and you should know several of them. 679 00:47:08,210 --> 00:47:12,770 200 OK you have probably seen before. 680 00:47:12,770 --> 00:47:17,830 403 Forbidden, 404 Not Found, 500 Internal Server Error 681 00:47:17,830 --> 00:47:22,140 is usually if you go to a website and something's broken or their PHP code crashes, 682 00:47:22,140 --> 00:47:24,930 whereas in the appliance we have that big orange box 683 00:47:24,930 --> 00:47:27,830 that comes up and says, like, something is wrong, this code doesn't work 684 00:47:27,830 --> 00:47:30,380 or this function's bad. 685 00:47:30,380 --> 00:47:33,230 Usually websites don't want you knowing what functions are actually bad, 686 00:47:33,230 --> 00:47:37,880 so instead they'll just give you 500 Internal Server Errors. 687 00:47:37,880 --> 00:47:43,050 >> TCP/IP is 1 layer under HTTP. 688 00:47:43,050 --> 00:47:47,550 Remember that there is Internet outside of the World Wide Web. 689 00:47:47,550 --> 00:47:52,270 Like if you play an online game that doesn't go through HTTP, 690 00:47:52,270 --> 00:47:55,740 it's going through a different--it's still using the Internet, 691 00:47:55,740 --> 00:47:58,900 but it doesn't use HTTP. 692 00:47:58,900 --> 00:48:02,470 HTTP is just one example of protocol built on TCP/IP. 693 00:48:02,470 --> 00:48:07,820 IP literally means Internet Protocol. 694 00:48:07,820 --> 00:48:11,500 Every computer has an IP address; they are those 4-digit things 695 00:48:11,500 --> 00:48:16,510 like 192.168.2.1, or whatever; that tends to be a local one. 696 00:48:16,510 --> 00:48:23,390 But that is the pattern of an IP address. 697 00:48:23,390 --> 00:48:29,060 So the DNS, Domain Name Service, 698 00:48:29,060 --> 00:48:33,410 that's what translates things like google.com to an actual IP address. 699 00:48:33,410 --> 00:48:37,700 So if you type that IP address into a URL, 700 00:48:37,700 --> 00:48:40,850 that would bring you to Google, but you tend not to remember those things. 701 00:48:40,850 --> 00:48:45,470 You tend to remember google.com instead. 702 00:48:45,470 --> 00:48:51,560 The last thing we have is ports, where this is the TCP part of IP. 703 00:48:51,560 --> 00:48:54,880 TCP does more. Think about, like, you have your web browser running. 704 00:48:54,880 --> 00:48:58,670 Maybe you have some email application running; 705 00:48:58,670 --> 00:49:02,150 maybe you have some other program that uses the Internet running. 706 00:49:02,150 --> 00:49:05,090 They all need access to the Internet, 707 00:49:05,090 --> 00:49:08,100 but your computer only has 1 WiFi card or whatever. 708 00:49:08,100 --> 00:49:10,780 So ports are the way that we're able to split up 709 00:49:10,780 --> 00:49:13,550 how these applications are able to use the Internet. 710 00:49:13,550 --> 00:49:17,230 Each application gets 1 specific port that it can listen on, 711 00:49:17,230 --> 00:49:19,670 and by default, HTTP uses port 80. 712 00:49:19,670 --> 00:49:22,410 Some email services use 25. 713 00:49:22,410 --> 00:49:24,490 The low-numbered ones tend to be reserved. 714 00:49:24,490 --> 00:49:29,270 You are usually able to get higher-numbered ones for yourself. 715 00:49:29,270 --> 00:49:32,010 >> CSS, Cascading Style Sheets. 716 00:49:32,010 --> 00:49:36,030 We style web pages with CSS, not with HTML. 717 00:49:36,030 --> 00:49:38,440 There are 3 places you can put your CSS. 718 00:49:38,440 --> 00:49:46,300 It can be inline, between style tags, or in a completely separate file and then linked in. 719 00:49:46,300 --> 00:49:48,470 And here is just an example of CSS. 720 00:49:48,470 --> 00:49:50,450 You should recognize this pattern, 721 00:49:50,450 --> 00:49:54,310 where the first example is we're matching the body tag, 722 00:49:54,310 --> 00:49:56,680 and here we're centering the body tag. 723 00:49:56,680 --> 00:50:00,420 The second example, we are matching the thing 724 00:50:00,420 --> 00:50:04,740 with ID footer, and we're applying some styles to that. 725 00:50:04,740 --> 00:50:07,310 Notice that ID footer text-aligns to the left, 726 00:50:07,310 --> 00:50:09,840 whereas body text-aligns center. 727 00:50:09,840 --> 00:50:13,180 Footer is inside the body. 728 00:50:13,180 --> 00:50:16,470 It will, instead, text-align left, even though body says text-align center. 729 00:50:16,470 --> 00:50:18,880 This is the whole cascading part of it. 730 00:50:18,880 --> 00:50:22,110 You can have--you can specify styles for the body, 731 00:50:22,110 --> 00:50:25,320 and then things in the body you can specify more specific styles, 732 00:50:25,320 --> 00:50:28,160 and things work as you expect. 733 00:50:28,160 --> 00:50:34,420 More specific CSS specifiers take precedence. 734 00:50:34,420 --> 00:50:46,140 I think that's it. 735 00:50:46,140 --> 00:50:49,260 >> [Ali Nahm] Hi everyone. If I could just get your attention. 736 00:50:49,260 --> 00:50:53,990 I'm Ali and I'm going to go through PHP and SQL really fast. 737 00:50:53,990 --> 00:51:00,310 So we can begin. PHP is short for PHP: Hypertext Preprocessor. 738 00:51:00,310 --> 00:51:03,730 And as you all should know, it's a server-side scripting language, 739 00:51:03,730 --> 00:51:06,800 and we use it for the back end of websites, 740 00:51:06,800 --> 00:51:12,540 and how it does a lot of the computations, the behind-scenes part. 741 00:51:12,540 --> 00:51:17,510 Syntax. It's not like C, surprise, surprise. 742 00:51:17,510 --> 00:51:22,060 It always has to start with the, if you can see, the--I can't move ahead. 743 00:51:22,060 --> 00:51:31,340 You can see you need the new kinds of braces and then you also need the ?php. 744 00:51:31,340 --> 00:51:35,780 That's always how you have to frame your PHP text, your PHP code. 745 00:51:35,780 --> 00:51:39,180 So it can't just be like C, where you kind of put it on first. 746 00:51:39,180 --> 00:51:42,290 You need to always surround it. 747 00:51:42,290 --> 00:51:47,610 And now, the major syntax is that all variables need to start with the $ character. 748 00:51:47,610 --> 00:51:49,490 You need to do it when you're defining them; you need to do it 749 00:51:49,490 --> 00:51:51,860 when you're referring to to them later on. 750 00:51:51,860 --> 00:51:56,510 You always need that $. It's your new best friend, pretty much. 751 00:51:56,510 --> 00:52:01,690 You do not--unlike C, you do not need to put what kind of variable type it is. 752 00:52:01,690 --> 00:52:04,940 So while you do need the $, you do not need to put, like, 753 00:52:04,940 --> 00:52:09,470 int x or string y, etcetera, etcetera. 754 00:52:09,470 --> 00:52:11,490 So a slight difference. 755 00:52:11,490 --> 00:52:15,590 As a result of this, it means that PHP is a weakly type. 756 00:52:15,590 --> 00:52:19,310 PHP is a weakly type language, and it has weakly typed variables. 757 00:52:19,310 --> 00:52:24,020 In other words, that means that you can switch between different kinds of variable types. 758 00:52:24,020 --> 00:52:27,230 You can store your number 1 as an int, 759 00:52:27,230 --> 00:52:29,650 you can store it as a string, and you can store it as a float, 760 00:52:29,650 --> 00:52:33,550 and it will all be that number 1. 761 00:52:33,550 --> 00:52:36,080 Even though you're storing it in different forms, 762 00:52:36,080 --> 00:52:39,120 it's still--the variable types are still holding in the end. 763 00:52:39,120 --> 00:52:41,540 So if you look here, if you remember from pset 7, 764 00:52:41,540 --> 00:52:43,500 many of you probably had issues with this. 765 00:52:43,500 --> 00:52:47,280 Two equal signs, 3 equal signs, 4 equal signs. 766 00:52:47,280 --> 00:52:49,990 Okay, there are no 4 equal signs, but there are 2 and 3. 767 00:52:49,990 --> 00:52:53,320 You use 2 equal signs to check the values. 768 00:52:53,320 --> 00:52:55,830 It can check across types. 769 00:52:55,830 --> 00:52:58,770 So if you can see at the first example, 770 00:52:58,770 --> 00:53:02,210 I have num_int == num_string. 771 00:53:02,210 --> 00:53:06,710 So your int and your string are both, technically, 1, 772 00:53:06,710 --> 00:53:10,790 but they're different types. But for the double equals, it'll still pass. 773 00:53:10,790 --> 00:53:15,510 However, for the triple equals, it checks value as well as the different types. 774 00:53:15,510 --> 00:53:18,760 That means that it's not going to pass in that second case here, 775 00:53:18,760 --> 00:53:22,350 where you're using 3 equal signs instead. 776 00:53:22,350 --> 00:53:26,590 So that's a major difference that you should all have shown now. 777 00:53:26,590 --> 00:53:31,570 >> String concatenation is another powerful thing you can use in PHP. 778 00:53:31,570 --> 00:53:34,080 It's basically just this handy dot notation, 779 00:53:34,080 --> 00:53:36,230 and that's how you can bind strings together. 780 00:53:36,230 --> 00:53:40,800 So if you have Cat and you have Dog, and you want to put the 2 strings together, 781 00:53:40,800 --> 00:53:44,080 you can use the period, and that's kind of how it works. 782 00:53:44,080 --> 00:53:46,660 You can also just place them next to each other, 783 00:53:46,660 --> 00:53:49,030 as you can see here in the bottom example, 784 00:53:49,030 --> 00:53:51,610 where I have echo string 1, space string 2. 785 00:53:51,610 --> 00:53:56,930 PHP will know to replace them as such. 786 00:53:56,930 --> 00:53:59,780 Arrays. Now, in PHP, there are 2 different kinds of arrays. 787 00:53:59,780 --> 00:54:03,180 You can have regular arrays, and you can also have associative arrays, 788 00:54:03,180 --> 00:54:06,040 and we're going to go through them right now. 789 00:54:06,040 --> 00:54:08,280 Regular arrays are just this in C, 790 00:54:08,280 --> 00:54:11,240 and so you have indices that are numbered. 791 00:54:11,240 --> 00:54:13,160 Right now we're just going to create one and put-- 792 00:54:13,160 --> 00:54:15,500 so this is how we create an empty array, then we're going to 793 00:54:15,500 --> 00:54:17,310 put into the index number 0. 794 00:54:17,310 --> 00:54:19,200 We're going to put the number 6, the value 6. 795 00:54:19,200 --> 00:54:21,500 You can see it at the bottom here. 796 00:54:21,500 --> 00:54:24,240 Where's--at index number 1 we're going to put value number 4, 797 00:54:24,240 --> 00:54:26,720 and so you can see there's a 6, there's a 4, 798 00:54:26,720 --> 00:54:29,160 and then as we're printing things, 799 00:54:29,160 --> 00:54:33,550 when we try and print the value stored at index number 0, 800 00:54:33,550 --> 00:54:36,900 then we'll see the value 6 being printed out. Cool? 801 00:54:36,900 --> 00:54:40,160 So that's regular arrays for you. 802 00:54:40,160 --> 00:54:42,750 Another way you can also add things to regular arrays now 803 00:54:42,750 --> 00:54:44,780 is you can just append them at the end. 804 00:54:44,780 --> 00:54:47,240 That means that you don't have to specify the specific index. 805 00:54:47,240 --> 00:54:51,000 You can see number, and then in the square brackets there's no index specified. 806 00:54:51,000 --> 00:54:56,270 And it will know--PHP will know to just add it to the end of the list, the next free spot. 807 00:54:56,270 --> 00:54:59,190 So you can see the 1 right there at that 0 spot, 808 00:54:59,190 --> 00:55:02,690 the 2 went right there at the first spot. 809 00:55:02,690 --> 00:55:04,690 The 3 goes--is added there as well. 810 00:55:04,690 --> 00:55:06,720 So that kind of makes sense. You're just constantly adding it, 811 00:55:06,720 --> 00:55:09,360 and then when we're echoing the index of number 1, 812 00:55:09,360 --> 00:55:13,080 it will print out the value 2. 813 00:55:13,080 --> 00:55:16,800 >> Then we have arrays that are associative arrays. 814 00:55:16,800 --> 00:55:19,370 Associative arrays, instead of having numerical indices, 815 00:55:19,370 --> 00:55:23,630 what they do is, they have indices that are by string. 816 00:55:23,630 --> 00:55:25,670 You can see, instead of--I got rid of all those number indices, 817 00:55:25,670 --> 00:55:32,140 and now it's key1, key2, key3, and they're in double quotes to signify that they're all strings. 818 00:55:32,140 --> 00:55:34,470 So we can have an example of this. 819 00:55:34,470 --> 00:55:38,790 The example of this is that we have the tf, and that's the index name. 820 00:55:38,790 --> 00:55:42,030 We're going to put "Ali" as the name, at the index, calories eaten, 821 00:55:42,030 --> 00:55:47,640 we can put an int this time instead of a string, 822 00:55:47,640 --> 00:55:52,240 and then at the index likes, we can put an entire array inside of it. 823 00:55:52,240 --> 00:55:55,490 So this is kind of--it's a similar concept to how we had 824 00:55:55,490 --> 00:55:58,930 indices with numbers, but now we can change the indices around 825 00:55:58,930 --> 00:56:03,890 to have them as strings instead. 826 00:56:03,890 --> 00:56:06,070 You can also do this, besides just doing it individually, 827 00:56:06,070 --> 00:56:09,400 you can do it all in one chunk. So you can see that tf of that array, 828 00:56:09,400 --> 00:56:13,350 and then we set them all in one giant square bracket set. 829 00:56:13,350 --> 00:56:15,220 So that can speed things up. 830 00:56:15,220 --> 00:56:19,730 It's more of a stylistic choice than not. 831 00:56:19,730 --> 00:56:21,550 We also have loops. 832 00:56:21,550 --> 00:56:26,020 In C we have loops that work like this. 833 00:56:26,020 --> 00:56:29,690 We had our array, and we went from index 0 to the end of the list, 834 00:56:29,690 --> 00:56:31,740 and we print it all, right? 835 00:56:31,740 --> 00:56:33,880 Except the problem is, for associative arrays, 836 00:56:33,880 --> 00:56:36,610 we don't necessarily know those numerical indices 837 00:56:36,610 --> 00:56:39,610 because now we have the string indices. 838 00:56:39,610 --> 00:56:44,800 Now we use foreach loops, which, again, you hopefully used in pset 7. 839 00:56:44,800 --> 00:56:48,930 Foreach loops will just know every single part of the list. 840 00:56:48,930 --> 00:56:52,450 And it doesn't have to know exactly the numerical index that you have. 841 00:56:52,450 --> 00:56:56,490 So you have the foreach syntax, so it's foreach, you put the array. 842 00:56:56,490 --> 00:57:00,430 So my array is called pset, and then as, the word as, 843 00:57:00,430 --> 00:57:04,530 and then you put this local temporary variable that you're going to use 844 00:57:04,530 --> 00:57:10,690 just for the specific thing that's going to hold the specific-- 845 00:57:10,690 --> 00:57:14,770 one instance or one section of the array. 846 00:57:14,770 --> 00:57:18,350 Pset num will hold 1, and then maybe it will hold the number 6, 847 00:57:18,350 --> 00:57:20,410 and then it will hold the number 2. 848 00:57:20,410 --> 00:57:26,630 But it's guaranteed to go through every single value that's in the array. 849 00:57:26,630 --> 00:57:30,530 Useful functions that you should know in PHP are the require, 850 00:57:30,530 --> 00:57:35,880 so that makes sure that you're including certain files, echo, exit, empty. 851 00:57:35,880 --> 00:57:40,490 I highly recommend you look at pset 7 and look at those functions. 852 00:57:40,490 --> 00:57:42,810 You might have to know those, 853 00:57:42,810 --> 00:57:47,060 so I would definitely know what, exactly, those are all doing. 854 00:57:47,060 --> 00:57:50,080 >> And now we're going to go through scope really quickly. 855 00:57:50,080 --> 00:57:53,490 In scope, PHP is kind of a funky thing, unlike C, 856 00:57:53,490 --> 00:57:56,170 and so we're just going to go through it quickly. 857 00:57:56,170 --> 00:57:58,930 So let's say we start at that arrow that we have there. 858 00:57:58,930 --> 00:58:02,900 And we're going to start with $i. So the variable 'i' is going to be 0, 859 00:58:02,900 --> 00:58:06,730 and we're just going to keep printing it in that big white box over there. 860 00:58:06,730 --> 00:58:09,220 We're going to start with i0, and then we're going to echo it. 861 00:58:09,220 --> 00:58:12,670 So there's the 0. 862 00:58:12,670 --> 00:58:15,210 And then we're going to increment it by the for loop, 863 00:58:15,210 --> 00:58:17,810 and then it's going to be value of 1. 864 00:58:17,810 --> 00:58:20,070 One is less than 3, so it's going to pass through that for loop, 865 00:58:20,070 --> 00:58:23,230 and then we're going to see it printed again. 866 00:58:23,230 --> 00:58:25,520 We're going to increment it again to 2, 867 00:58:25,520 --> 00:58:29,860 and 2 is less than 3, so it'll pass the for loop, and it'll print the 2. 868 00:58:29,860 --> 00:58:35,100 Then you'll note that 3 is not less than 3, so we'll break out of the for loop. 869 00:58:35,100 --> 00:58:40,050 So now we've exited, and then we're going to go into aFunction. 870 00:58:40,050 --> 00:58:45,010 Okay. So you have to note that this variable that we've created, 871 00:58:45,010 --> 00:58:48,270 the 'i' variable, is not locally scoped. 872 00:58:48,270 --> 00:58:50,280 That means that it's not local to the loop, 873 00:58:50,280 --> 00:58:58,060 and that variable we can still access and change afterwards, and it will still be effective. 874 00:58:58,060 --> 00:59:02,160 So if you go into the function now, you'll see that we also use the 'i' variable, 875 00:59:02,160 --> 00:59:05,320 and we're going to increment 'i' ++. 876 00:59:05,320 --> 00:59:09,410 You would think, at first, based on C, that that's a copy of the 'i' variable. 877 00:59:09,410 --> 00:59:12,830 It's a totally different thing, which is correct. 878 00:59:12,830 --> 00:59:16,560 So when we print it, we're going to print 'i' ++, which is going to print out that 4, 879 00:59:16,560 --> 00:59:19,640 and then we're going to--sorry. 880 00:59:19,640 --> 00:59:22,030 Then we're going to end out of that function, 881 00:59:22,030 --> 00:59:24,820 and we're going to be where that arrow is right now. 882 00:59:24,820 --> 00:59:29,190 That means that then, however, even though the function changed the value of 'i', 883 00:59:29,190 --> 00:59:32,620 it didn't change outside of the function, 884 00:59:32,620 --> 00:59:35,060 because the function has a separate scope. 885 00:59:35,060 --> 00:59:38,960 That means that when we echo 'i', it hasn't changed in the scope of the function, 886 00:59:38,960 --> 00:59:43,660 and so then we're going to print 3 again. 887 00:59:43,660 --> 00:59:47,520 Different things about scope in PHP than in C. 888 00:59:47,520 --> 00:59:51,130 >> Now in PHP and HTML. 889 00:59:51,130 --> 00:59:53,510 PHP is used to make web pages dynamic. 890 00:59:53,510 --> 00:59:58,660 It kind of makes things different. 891 00:59:58,660 --> 01:00:02,090 We have it different from HTML. 892 01:00:02,090 --> 01:00:05,230 With HTML, we always just have the same static thing, like how Rob showed, 893 01:00:05,230 --> 01:00:09,370 whereas PHP, you can change things based on who the user is. 894 01:00:09,370 --> 01:00:11,830 So if I have this, I have, "You are logged in as--" and then the name, 895 01:00:11,830 --> 01:00:14,420 and I can change the name. So right now the name is Joseph, 896 01:00:14,420 --> 01:00:18,880 and it has the "about me," but then I can also change the name to have Tommy. 897 01:00:18,880 --> 01:00:21,700 And that would be a different thing. 898 01:00:21,700 --> 01:00:23,840 So then we can also change different things about him, 899 01:00:23,840 --> 01:00:27,070 and it will show different content based on the name. 900 01:00:27,070 --> 01:00:31,430 So PHP can kind of change what's going on in your website. 901 01:00:31,430 --> 01:00:33,540 Same here. Still, note that they have different content, 902 01:00:33,540 --> 01:00:38,870 even though you are technically still accessing that same web page on the surface. 903 01:00:38,870 --> 01:00:43,450 Generating HTML. There are 2 different ways that you can do this. 904 01:00:43,450 --> 01:00:48,980 So we'll go through that right now. The first way is, you have--yeah, sorry. 905 01:00:48,980 --> 01:00:51,150 So you just have your regular for loop in PHP, 906 01:00:51,150 --> 01:00:56,270 and then you echo in PHP and you echo out HTML. 907 01:00:56,270 --> 01:00:58,720 Using what Rob showed you of HTML script 908 01:00:58,720 --> 01:01:04,030 and then using the PHP print to just print it out to the web page. 909 01:01:04,030 --> 01:01:09,520 The alternative way is to do it as if you separate out the PHP and the HTML. 910 01:01:09,520 --> 01:01:11,940 So you can have a line of PHP that starts the for loop, 911 01:01:11,940 --> 01:01:16,020 then you can have the line of the HTML in a separate thing, 912 01:01:16,020 --> 01:01:19,700 and then you end the loop, again, with a PHP. 913 01:01:19,700 --> 01:01:21,800 So it's kind of separating it out. 914 01:01:21,800 --> 01:01:24,020 On the left side, you can that you have all the-- 915 01:01:24,020 --> 01:01:26,360 it's just 1 chunk of PHP. 916 01:01:26,360 --> 01:01:28,510 On the right you can see that you have a line of PHP, 917 01:01:28,510 --> 01:01:32,540 you have a line of HTML, and you have a line of PHP again. 918 01:01:32,540 --> 01:01:36,870 So separating it out into what they're doing. 919 01:01:36,870 --> 01:01:39,330 And you'll note that either way, for either of them, 920 01:01:39,330 --> 01:01:41,980 they still print out the image, the image, the image, 921 01:01:41,980 --> 01:01:44,540 so that HTML still is printed the same way. 922 01:01:44,540 --> 01:01:49,870 And then you'll still see the 3 images show up on your website. 923 01:01:49,870 --> 01:01:52,820 So it's 2 different ways of doing the same thing. 924 01:01:52,820 --> 01:01:55,060 >> Now we have forms and requests. As Rob showed you, 925 01:01:55,060 --> 01:01:59,400 there are forms of HTML, and we will just breeze through this. 926 01:01:59,400 --> 01:02:02,040 You have an action and you have a method, and your action 927 01:02:02,040 --> 01:02:04,350 kind of shows you where you're going to send it, and the method is whether 928 01:02:04,350 --> 01:02:06,960 it's going to be a GET or a POST. 929 01:02:06,960 --> 01:02:11,220 And a GET request, as Rob said, means that you're going to put it in a form 930 01:02:11,220 --> 01:02:15,760 and you'll see it as a URL, whereas a POST request you will not see in a URL. 931 01:02:15,760 --> 01:02:17,840 So a slight difference. 932 01:02:17,840 --> 01:02:19,950 However, one thing that's a similar thing 933 01:02:19,950 --> 01:02:22,560 is that POST and GET are equally insecure. 934 01:02:22,560 --> 01:02:26,430 So you may think that just because you don't see it in the URL, 935 01:02:26,430 --> 01:02:28,790 that means the POST is more secure, 936 01:02:28,790 --> 01:02:34,420 but you can still see it in your cookies in the information that you're sending. 937 01:02:34,420 --> 01:02:38,260 So don't think that about one or the other. 938 01:02:38,260 --> 01:02:42,160 Another thing to note is that you also have section variables. 939 01:02:42,160 --> 01:02:45,850 You guys used this in pset 7 to get your user ID information. 940 01:02:45,850 --> 01:02:48,550 What happened was that you can use this associative array, 941 01:02:48,550 --> 01:02:53,310 the $_SESSION, and then you're able to access different things 942 01:02:53,310 --> 01:02:57,720 and store different things across the pages. 943 01:02:57,720 --> 01:03:00,750 >> Last thing is that we have SQL, Structured Query Language, 944 01:03:00,750 --> 01:03:04,360 and this is a programming language to manage databases. 945 01:03:04,360 --> 01:03:08,220 What, exactly, are databases? They're collections of tables, 946 01:03:08,220 --> 01:03:10,630 and each table can have similar kinds of objects. 947 01:03:10,630 --> 01:03:14,990 So we had a table of users in your finance pset. 948 01:03:14,990 --> 01:03:20,610 And why are they useful? Because it's a way of permanently storing information. 949 01:03:20,610 --> 01:03:22,840 It's a way of tracking things and managing things 950 01:03:22,840 --> 01:03:25,890 and actually seeing it on different pages and keeping track. 951 01:03:25,890 --> 01:03:29,930 Whereas if you just store it at that one immediate moment 952 01:03:29,930 --> 01:03:33,720 and then use it later, you won't be able to access anything that you've saved. 953 01:03:33,720 --> 01:03:37,660 We have 4 major things that we use for SQL commands. 954 01:03:37,660 --> 01:03:40,190 We have select, insert, delete, and update. 955 01:03:40,190 --> 01:03:42,880 Those are really important for you guys to know for your quiz. 956 01:03:42,880 --> 01:03:45,990 >> We'll quickly go over select right now. 957 01:03:45,990 --> 01:03:48,540 Basically, you're selecting rows from a database. 958 01:03:48,540 --> 01:03:52,400 So if you have, right here-- 959 01:03:52,400 --> 01:03:56,740 we have these 2 different things, and we want to select from the classes table 960 01:03:56,740 --> 01:04:01,480 where awesome--where in the awesome column the value is 1. 961 01:04:01,480 --> 01:04:04,460 So you can see here, we have these 2 things of class name, 962 01:04:04,460 --> 01:04:08,490 CS50 and Stat110, and we have the class IDs and the slogan. 963 01:04:08,490 --> 01:04:13,150 So we want to select all of that information. 964 01:04:13,150 --> 01:04:17,480 Then you can see right here that it's kind of picking out of that awesome column, 965 01:04:17,480 --> 01:04:25,170 where all the things are 1, and then it has the class ID, class name and slogan that it can pick out. 966 01:04:25,170 --> 01:04:28,100 How exactly do you do this in code? You have to use PHP. 967 01:04:28,100 --> 01:04:33,830 So that's kind of how PHP and SQL are related to each other. 968 01:04:33,830 --> 01:04:38,130 Now we have our code, and we're going to use our query function 969 01:04:38,130 --> 01:04:41,370 as we did in pset 7, and we're going to run the SQL query. 970 01:04:41,370 --> 01:04:43,870 Then we're going to have-- 971 01:04:43,870 --> 01:04:46,280 we always have to check if row's triple equal if false. 972 01:04:46,280 --> 01:04:49,010 So again, you want to check the type and the value, 973 01:04:49,010 --> 01:04:53,880 and then if it doesn't work, then you want to apologize, as usual, as we did in pset 7. 974 01:04:53,880 --> 01:04:55,870 Otherwise, you want to loop through everything with those handy 975 01:04:55,870 --> 01:04:59,410 foreach loops that we just went over. 976 01:04:59,410 --> 01:05:01,280 Now that we're looping through and we've made it past, 977 01:05:01,280 --> 01:05:05,080 let's assume that our query passed, now we have our foreach loop. 978 01:05:05,080 --> 01:05:11,050 And the first row it has, so here's the row, right here; it's boxed. 979 01:05:11,050 --> 01:05:14,010 It's going to print out all the information that it's gotten. 980 01:05:14,010 --> 01:05:18,070 So it's going to print out at the bottom "Wanna Learn HTML?" 981 01:05:18,070 --> 01:05:23,370 Then it's going to go to the next row, because it's completed the first for loop, 982 01:05:23,370 --> 01:05:26,510 and so then it's going to print out the second line of it, 983 01:05:26,510 --> 01:05:32,120 which is going to be STAT110, Find all the Moments. 984 01:05:32,120 --> 01:05:34,290 >> One last thing is on SQL Vulnerabilities. 985 01:05:34,290 --> 01:05:37,300 I know David touched on this a little bit in lecture. 986 01:05:37,300 --> 01:05:40,730 You can read this later. It's really funny. 987 01:05:40,730 --> 01:05:45,320 SQL Injection is a kind of tricky thing. 988 01:05:45,320 --> 01:05:49,890 Let's say that you just stick those variables right into your query, 989 01:05:49,890 --> 01:05:52,290 as you can see in that first line. 990 01:05:52,290 --> 01:05:54,520 So it seems fine, right? You're just putting in the user name 991 01:05:54,520 --> 01:05:58,820 and password to your SQL query, and you want to ship it off and get whatever is in your data table. 992 01:05:58,820 --> 01:06:01,450 That seems pretty simple. So lets say someone puts in, 993 01:06:01,450 --> 01:06:04,910 for the password, this OR text right here-- 994 01:06:04,910 --> 01:06:06,780 should actually be in the red box. 995 01:06:06,780 --> 01:06:11,920 So let's say that they put that password into--that's what they enter. 996 01:06:11,920 --> 01:06:16,520 So they're putting OR "1" = 1. 997 01:06:16,520 --> 01:06:20,880 Kind of a silly password to have. 998 01:06:20,880 --> 01:06:25,070 Now let's just replace it in, and you'll note that in that SQL query now, 999 01:06:25,070 --> 01:06:29,090 it evaluates to always true, because you'll note that 1000 01:06:29,090 --> 01:06:32,240 you can SQL query select all of this information 1001 01:06:32,240 --> 01:06:35,420 or you can just have 1 = 1. 1002 01:06:35,420 --> 01:06:41,030 So that's always going to evaluate to true. 1003 01:06:41,030 --> 01:06:46,610 That's not going to really work, because that means that the hacker can break into your system. 1004 01:06:46,610 --> 01:06:49,300 The solution to this is that you have to use the PDO system, 1005 01:06:49,300 --> 01:06:51,360 which means that you have to use question marks, 1006 01:06:51,360 --> 01:06:53,350 which is what you guys used in pset 7, 1007 01:06:53,350 --> 01:06:57,620 where you're going to use a question mark in place of where you want to put something, 1008 01:06:57,620 --> 01:07:01,430 and then you're going to have a comma, and then you'll have afterwards, 1009 01:07:01,430 --> 01:07:07,610 after your string, the different variables that you want to replace into your question mark. 1010 01:07:07,610 --> 01:07:10,330 So you'll note here that now I have these red question marks. 1011 01:07:10,330 --> 01:07:15,420 Then I put the variables after my strings so I know to replace them in that order afterwards. 1012 01:07:15,420 --> 01:07:18,470 That will make sure that if someone does it like this, 1013 01:07:18,470 --> 01:07:24,050 and they have the or 1 = 1 situation, that will make sure, 1014 01:07:24,050 --> 01:07:30,490 in the back end, make sure that it won't actually break the SQL query. 1015 01:07:30,490 --> 01:07:33,660 Okay, so that's pretty much it, a whirlwind of PHP and SQL. 1016 01:07:33,660 --> 01:07:41,520 Best of luck to all of you, and now to Ore. 1017 01:07:41,520 --> 01:07:44,270 >> [Oreoluwatomiwa Babarinsa] Okay everyone. Time to go over some JavaScript 1018 01:07:44,270 --> 01:07:48,840 and some other things very quickly so we don't hold you up tonight. 1019 01:07:48,840 --> 01:07:56,930 JavaScript. Yes. JavaScript is kind of a cool thing, purportedly. 1020 01:07:56,930 --> 01:07:59,090 The things you really need to know about JavaScript, it's sort of like 1021 01:07:59,090 --> 01:08:03,810 the client-side end of what your web app is going to be doing. 1022 01:08:03,810 --> 01:08:08,280 There's some things you just don't want to take care of all the time on the server side. 1023 01:08:08,280 --> 01:08:12,880 All the little interactions, highlighting one thing, making something disappear. 1024 01:08:12,880 --> 01:08:15,340 You really don't want to have to talk to your server all the time for that. 1025 01:08:15,340 --> 01:08:18,069 And some of that is not even possible to do on the server side. 1026 01:08:18,069 --> 01:08:21,899 This is why we need something like JavaScript. 1027 01:08:21,899 --> 01:08:24,359 Cool things about JavaScript: It is dynamically typed. 1028 01:08:24,359 --> 01:08:27,149 What this means is that your program doesn't need to know 1029 01:08:27,149 --> 01:08:30,970 what, exactly, the variables are when you write it out. 1030 01:08:30,970 --> 01:08:34,510 It'll just sort of figure it out as it's running. 1031 01:08:34,510 --> 01:08:37,520 Other things that are cool about it: It's a curly brace language, 1032 01:08:37,520 --> 01:08:41,359 which means the syntax is similar to C and PHP. 1033 01:08:41,359 --> 01:08:47,050 You don't have to do much rework when you're learning JavaScript. 1034 01:08:47,050 --> 01:08:49,180 Here we have a little bit of JavaScript. 1035 01:08:49,180 --> 01:08:52,560 Interesting thing right here is that, if you look at it, 1036 01:08:52,560 --> 01:08:56,330 we have a bit of JavaScript right there in the head tag. 1037 01:08:56,330 --> 01:08:59,479 What is does is basically just include a JavaScript file. 1038 01:08:59,479 --> 01:09:02,260 This is one way you can include JavaScript into your program. 1039 01:09:02,260 --> 01:09:06,910 Then the second little bit is actually some inline JavaScript, 1040 01:09:06,910 --> 01:09:10,790 very similar to an inline style with CSS, 1041 01:09:10,790 --> 01:09:16,180 and you're just writing some code very quickly there. 1042 01:09:16,180 --> 01:09:18,120 JavaScript has arrays. 1043 01:09:18,120 --> 01:09:20,850 Just another way to keep data around, very useful. 1044 01:09:20,850 --> 01:09:25,180 Very nice and easy syntax. 1045 01:09:25,180 --> 01:09:29,870 You use square brackets to access everything and keep everything together. 1046 01:09:29,870 --> 01:09:35,020 Nothing too complex. 1047 01:09:35,020 --> 01:09:38,630 The cool thing about JavaScript and scripting languages in general 1048 01:09:38,630 --> 01:09:40,920 is that you don't have to worry about array sizes. 1049 01:09:40,920 --> 01:09:43,880 You can just use array.length and keep track of it, 1050 01:09:43,880 --> 01:09:46,960 and also the array can grow or shrink as you need it to. 1051 01:09:46,960 --> 01:09:49,279 So you don't even need to worry about any sort of, 1052 01:09:49,279 --> 01:09:57,050 oh no, I need to allocate more things, or anything like that. 1053 01:09:57,050 --> 01:10:00,090 >> The cool thing here is that JavaScript has something called objects. 1054 01:10:00,090 --> 01:10:04,800 It's an object-oriented language, so what it has is, essentially, 1055 01:10:04,800 --> 01:10:10,100 a way for you to group data together, somewhat similar to a struct, 1056 01:10:10,100 --> 01:10:17,280 but you can access it like a struct or in an associative array syntax. 1057 01:10:17,280 --> 01:10:22,520 It's pretty simple and what you can do with this is group data together 1058 01:10:22,520 --> 01:10:24,810 if you have a bunch of data that's related. 1059 01:10:24,810 --> 01:10:26,850 Because it's all the things you need to describe a car, 1060 01:10:26,850 --> 01:10:29,050 you don't need to have it in a bunch of different places. 1061 01:10:29,050 --> 01:10:35,300 You can just stick it into 1 object in JavaScript. 1062 01:10:35,300 --> 01:10:39,090 As you probably know, iterating is one of those tedious tasks. 1063 01:10:39,090 --> 01:10:43,810 You just do it over an over again. You need to talk to every object in the car, 1064 01:10:43,810 --> 01:10:47,340 or you need to go through every item in a list or something like that. 1065 01:10:47,340 --> 01:10:51,770 So JavaScript has, similar to PHP, a foreach syntax. 1066 01:10:51,770 --> 01:10:54,590 In this case, it's a for in loop. 1067 01:10:54,590 --> 01:10:57,300 You want to use this only on objects. 1068 01:10:57,300 --> 01:11:01,030 There are some problems that occur if you use this on arrays. 1069 01:11:01,030 --> 01:11:03,750 It generally is one of those things, though, that is very useful, 1070 01:11:03,750 --> 01:11:06,590 because you eliminate a lot of overhead 1071 01:11:06,590 --> 01:11:10,270 because you don't have to pull up everything in your object by yourself. 1072 01:11:10,270 --> 01:11:12,300 You don't have to remember all the key names. 1073 01:11:12,300 --> 01:11:18,270 You just sort of get them back in this syntax. 1074 01:11:18,270 --> 01:11:21,500 In this, with for, you just want to remember 1075 01:11:21,500 --> 01:11:27,180 that you're getting back all the keys, in a very similar way to hash table. 1076 01:11:27,180 --> 01:11:30,880 If you remember from that, when you would put in a string you could get something out 1077 01:11:30,880 --> 01:11:33,840 that would have an associated value with it. 1078 01:11:33,840 --> 01:11:36,360 What you can do with this is you can say, all right, 1079 01:11:36,360 --> 01:11:42,120 I put in a car, and I called it a Ferrari. 1080 01:11:42,120 --> 01:11:45,290 So you can put in the string Ferrari again later, and you can get that out. 1081 01:11:45,290 --> 01:11:50,000 And you can do that in a loop, with the for in loop. 1082 01:11:50,000 --> 01:11:53,320 So just more about objects. The key thing from this you need to remember 1083 01:11:53,320 --> 01:12:00,340 is that you can use the object struct like syntax whenever you want with these, 1084 01:12:00,340 --> 01:12:04,590 except if what your going to use as a string isn't a valid variable name. 1085 01:12:04,590 --> 01:12:07,650 So if you look at that there, we have key with spaces. 1086 01:12:07,650 --> 01:12:12,500 Well, if you were to put object.key, space, with, space, spaces, 1087 01:12:12,500 --> 01:12:15,320 that just wouldn't make sense syntactically. 1088 01:12:15,320 --> 01:12:22,730 So you only can do that with this sort of bracket syntax. 1089 01:12:22,730 --> 01:12:26,520 >> Also, JavaScript is very scope-wise to PHP. 1090 01:12:26,520 --> 01:12:29,050 You have 2 ways of addressing scope. 1091 01:12:29,050 --> 01:12:31,960 You can not have the var in front of a variable, 1092 01:12:31,960 --> 01:12:34,060 and that just means this is global. 1093 01:12:34,060 --> 01:12:37,050 You can see it from anywhere. Even if you were to put this in an if statement, 1094 01:12:37,050 --> 01:12:42,430 anywhere else in your code after that point you could see that variable. 1095 01:12:42,430 --> 01:12:46,730 Another thing, though, is with the var, it's limited to whatever function you're in. 1096 01:12:46,730 --> 01:12:48,870 If you're not in a function, well, it's global. 1097 01:12:48,870 --> 01:12:53,900 But if you are in a function it's only visible within that function. 1098 01:12:53,900 --> 01:12:56,420 I don't have an example, but, yeah. It's one of those things where 1099 01:12:56,420 --> 01:12:59,900 you can manage what variables you want to be global, 1100 01:12:59,900 --> 01:13:03,810 what variables you want to be local, but you do need to be careful about this, 1101 01:13:03,810 --> 01:13:06,890 because you don't have the type of fine grain control you do in C, 1102 01:13:06,890 --> 01:13:15,820 where if something is declared in a for loop, it's going to stay in that for loop. 1103 01:13:15,820 --> 01:13:18,790 The thing we actually care about using JavaScript for is manipulating web pages, right? 1104 01:13:18,790 --> 01:13:21,800 I mean, that's why we're doing this. 1105 01:13:21,800 --> 01:13:23,840 >> To do that, we use something called the DOM. 1106 01:13:23,840 --> 01:13:25,850 The Document Object Model. 1107 01:13:25,850 --> 01:13:29,430 Basically, what it does is it takes all your HTML 1108 01:13:29,430 --> 01:13:34,110 and models it out into a bunch of objects that are nested within each other. 1109 01:13:34,110 --> 01:13:37,080 You start out with something like this. 1110 01:13:37,080 --> 01:13:44,770 You have, on the right for me, a bunch of code out there that's sort of-- 1111 01:13:44,770 --> 01:13:46,640 You would think that'd be very hard to manipulate, 1112 01:13:46,640 --> 01:13:48,700 because you'd be parsing through a bunch of text 1113 01:13:48,700 --> 01:13:52,080 and having to piece apart things. And what if it wasn't correctly formatted? 1114 01:13:52,080 --> 01:13:54,880 Bad things would happen. 1115 01:13:54,880 --> 01:13:58,140 So JavaScript takes care of this for you, and you get a nice data structure, 1116 01:13:58,140 --> 01:14:01,390 like the one on my left, where you just have a document, 1117 01:14:01,390 --> 01:14:03,530 and inside that you have something called HTML, 1118 01:14:03,530 --> 01:14:05,600 and inside that you have a head and a body, 1119 01:14:05,600 --> 01:14:08,420 and inside that head you have a title, etcetera, etcetera, etcetera. 1120 01:14:08,420 --> 01:14:11,810 This simplifies manipulating a web page so that it's just, 1121 01:14:11,810 --> 01:14:14,190 oh, I just want to talk to this object. 1122 01:14:14,190 --> 01:14:21,340 Sort of a very similar way you would talk to another object you made yourself. 1123 01:14:21,340 --> 01:14:25,980 Like I said, all the DOM is in the document object. 1124 01:14:25,980 --> 01:14:29,290 Either it's just one place and then you can go within it to find things, 1125 01:14:29,290 --> 01:14:33,880 and you can do it--this is the old style of doing it, up there, 1126 01:14:33,880 --> 01:14:38,130 where you do document.getElementById, and then the name, 1127 01:14:38,130 --> 01:14:42,420 and as you can probably tell, this gets very unwieldy after a while. 1128 01:14:42,420 --> 01:14:44,480 So you probably don't want to do that. That's why we have 1129 01:14:44,480 --> 01:14:48,760 the next thing we're going to talk about after this. 1130 01:14:48,760 --> 01:14:52,510 The key thing here is that, all right, you have all these elements, right? 1131 01:14:52,510 --> 01:14:56,400 So maybe I can change the color of something when the page loads. 1132 01:14:56,400 --> 01:14:58,380 So what? What if my user clicks something? 1133 01:14:58,380 --> 01:15:00,540 I want it to do something interesting when they click something. 1134 01:15:00,540 --> 01:15:02,600 That's why we have events. 1135 01:15:02,600 --> 01:15:05,330 You can, basically, find any element in your DOM, 1136 01:15:05,330 --> 01:15:08,560 and then say, hey. When this loads or someone clicks it, 1137 01:15:08,560 --> 01:15:11,410 or when they mouse over it, do something with it. 1138 01:15:11,410 --> 01:15:15,330 And what you have is, you have functions that handle this for you. 1139 01:15:15,330 --> 01:15:17,980 These functions are event handlers. 1140 01:15:17,980 --> 01:15:20,440 What they're--it's just a fancy way of saying, 1141 01:15:20,440 --> 01:15:23,500 this function is only executed when this event happens. 1142 01:15:23,500 --> 01:15:28,070 So it handles the event that occurs. 1143 01:15:28,070 --> 01:15:30,810 This is how you would lay out an event handler. 1144 01:15:30,810 --> 01:15:34,750 I have some button, and when you click it, it explodes. 1145 01:15:34,750 --> 01:15:40,560 So don't click the button. 1146 01:15:40,560 --> 01:15:42,910 This is one way of approaching it, right? 1147 01:15:42,910 --> 01:15:46,430 You have a button tag, and on click you have a string that says, 1148 01:15:46,430 --> 01:15:50,460 oh, by the way, I do this exploding thing for me. 1149 01:15:50,460 --> 01:15:53,990 Otherwise, it's just like a regular button you just made. 1150 01:15:53,990 --> 01:15:56,550 You can also do this another way, 1151 01:15:56,550 --> 01:16:02,770 by grabbing the DOM element, but we'll save that after we talk about jQuery. 1152 01:16:02,770 --> 01:16:07,580 >> jQuery: It is a library that is cross-browser. 1153 01:16:07,580 --> 01:16:09,580 You can use it in pretty much anything. 1154 01:16:09,580 --> 01:16:12,090 And it just gives you a lot of tools to work with. 1155 01:16:12,090 --> 01:16:15,850 Because JavaScript, while powerful, doesn't have all the tools you need 1156 01:16:15,850 --> 01:16:20,550 out of the box to really tackle a web app you might want to do. 1157 01:16:20,550 --> 01:16:24,650 So it simplifies a lot of things, gives you a lot of functions 1158 01:16:24,650 --> 01:16:28,760 out of the box that you would normally have to write yourself, over and over and over again. 1159 01:16:28,760 --> 01:16:31,600 And just makes things very simple. 1160 01:16:31,600 --> 01:16:35,780 You also have selectors, which let you take out all those elements 1161 01:16:35,780 --> 01:16:42,800 from your DOM much more simply, instead of having to use these very long function calls. 1162 01:16:42,800 --> 01:16:46,630 More on these selectors. You have, up there you have, let's say 1163 01:16:46,630 --> 01:16:49,800 I want to get an element with the ID "rock." 1164 01:16:49,800 --> 01:16:56,450 Well, in jQuery, it's just $ and then a string that has a pound, and then "rock." 1165 01:16:56,450 --> 01:17:01,960 It's very simple and a lot faster than the traditional JavaScript way of tackling this problem. 1166 01:17:01,960 --> 01:17:06,120 And you have similar things for classes and element types. 1167 01:17:06,120 --> 01:17:08,140 jQuery is--one of the cool features is you can sort of compress 1168 01:17:08,140 --> 01:17:14,350 down your queries on your DOM very, very fast. 1169 01:17:14,350 --> 01:17:18,980 Now we're back to event handling, and this is how you would handle one event in jQuery. 1170 01:17:18,980 --> 01:17:23,090 So what we're going here is we're saying, all right. I have a script tag, right? 1171 01:17:23,090 --> 01:17:25,400 So I have this inline JavaScript. 1172 01:17:25,400 --> 01:17:27,750 What we're going to do is we're going to say, all right. 1173 01:17:27,750 --> 01:17:30,860 When the document is ready, which means the document's been loaded, 1174 01:17:30,860 --> 01:17:34,660 we are going to go in to that function, and we're going to say, all right, 1175 01:17:34,660 --> 01:17:37,060 this function's actually doing something else. 1176 01:17:37,060 --> 01:17:42,320 It's basically saying, all right, get me the element with the ID "myid." 1177 01:17:42,320 --> 01:17:47,960 And then give this a function handler that executes when you click it. 1178 01:17:47,960 --> 01:17:49,820 Basically what this does is, it says, all right. 1179 01:17:49,820 --> 01:17:52,630 The page is loaded, so I'm going to in, find this element, 1180 01:17:52,630 --> 01:17:56,420 give it this event handler, and it basically sets up your page for you. 1181 01:17:56,420 --> 01:18:00,520 And this is how you want to think about event handling. 1182 01:18:00,520 --> 01:18:06,310 You just want to think about, all right, when something occurs, what do I want to happen? 1183 01:18:06,310 --> 01:18:10,520 You don't want to think about, okay, I need to make sure this thing talks to this thing, 1184 01:18:10,520 --> 01:18:14,660 this thing blah blah blah, because you just want to talk thing in terms of events. 1185 01:18:14,660 --> 01:18:17,650 When this happens, this happens. When this happens, that happens. 1186 01:18:17,650 --> 01:18:20,240 And if things trigger other things, that's great. 1187 01:18:20,240 --> 01:18:22,150 But you don't want to try and do complicated code 1188 01:18:22,150 --> 01:18:24,130 where you're triggering multiple things at the same time, 1189 01:18:24,130 --> 01:18:28,860 because you're just going to give yourself a headache. 1190 01:18:28,860 --> 01:18:32,340 >> All right. Now we can get our page to handle events, 1191 01:18:32,340 --> 01:18:35,640 but let's say my user clicks a button. 1192 01:18:35,640 --> 01:18:38,040 What if I want to send that request back to the server, 1193 01:18:38,040 --> 01:18:41,100 but I don't want to reload the page, because having to reload a new page 1194 01:18:41,100 --> 01:18:44,390 every single time gets kind of tedious, and why do I need 1195 01:18:44,390 --> 01:18:47,430 to pull down the header again, and the footer again, 1196 01:18:47,430 --> 01:18:49,670 and all the elements of the page again 1197 01:18:49,670 --> 01:18:53,180 just to refresh the greeting or the time? 1198 01:18:53,180 --> 01:18:55,290 So that's why we have something like Ajax. 1199 01:18:55,290 --> 01:18:59,150 What we can do here with Ajax is we can say, all right, 1200 01:18:59,150 --> 01:19:01,290 I want to send some data to the server, 1201 01:19:01,290 --> 01:19:04,010 and I want to get a response back so I can update my page, 1202 01:19:04,010 --> 01:19:12,120 or maybe just do some algorithmic calculation that doesn't necessarily show anything to the user. 1203 01:19:12,120 --> 01:19:15,500 What do you need to do this? Well, you need a URL you need to talk to. 1204 01:19:15,500 --> 01:19:18,650 Your server can't just magically listen in from nowhere. 1205 01:19:18,650 --> 01:19:21,960 You need to have a specific place you're sending this data to. 1206 01:19:21,960 --> 01:19:26,240 And you also need some data to send, or maybe it's a dataless query. 1207 01:19:26,240 --> 01:19:31,380 You just want to ping back to the server and say, hey, I'm alive, or something like that. 1208 01:19:31,380 --> 01:19:35,150 And then you want a function that basically handles with success. 1209 01:19:35,150 --> 01:19:38,250 Let's say you get back some information from your server, 1210 01:19:38,250 --> 01:19:42,960 and you want to change the user's title on their page. 1211 01:19:42,960 --> 01:19:44,930 So you would get the information back, 1212 01:19:44,930 --> 01:19:48,860 and you would push that to the screen. 1213 01:19:48,860 --> 01:19:51,170 What happens is, when the page is ready, 1214 01:19:51,170 --> 01:19:56,500 you create an on click function for this button called greeter. 1215 01:19:56,500 --> 01:19:58,810 What this then does is, when that button is pushed, 1216 01:19:58,810 --> 01:20:03,700 you talk to greetings.php, you make a POST request, 1217 01:20:03,700 --> 01:20:07,290 and you say, hey, get me something from your page. 1218 01:20:07,290 --> 01:20:09,890 We don't really need to describe that, but greetings.php, 1219 01:20:09,890 --> 01:20:12,480 let's just say, gives back "hello world." 1220 01:20:12,480 --> 01:20:15,650 So we get back this "hello world," and on success of this, 1221 01:20:15,650 --> 01:20:20,730 assuming nothing goes wrong, then we just go to this target place 1222 01:20:20,730 --> 01:20:25,720 that we specified and we just stick the response in there. 1223 01:20:25,720 --> 01:20:31,560 And this is a very simple way of setting up an Ajax query. 1224 01:20:31,560 --> 01:20:34,340 >> Very quickly, Rob sort of mentioned this already, 1225 01:20:34,340 --> 01:20:37,170 things can go wrong, bad things can happen, 1226 01:20:37,170 --> 01:20:42,660 so you want to familiarize yourself with these HTTP response codes. 1227 01:20:42,660 --> 01:20:46,030 What these are are just, like, 200, everything went okay. 1228 01:20:46,030 --> 01:20:48,670 Something else, bad things happened. 1229 01:20:48,670 --> 01:20:50,790 It's generally the thing you want to remember. 1230 01:20:50,790 --> 01:20:53,440 But it's nice to know all of these. 1231 01:20:53,440 --> 01:20:55,970 And finally, once we've gone through all of that, 1232 01:20:55,970 --> 01:20:58,680 we need to talk very quickly about design, 1233 01:20:58,680 --> 01:21:00,620 and then we can let you all leave. 1234 01:21:00,620 --> 01:21:03,410 Design. Things you want to remember. 1235 01:21:03,410 --> 01:21:06,950 Ask yourself these questions: Who'll be using this? 1236 01:21:06,950 --> 01:21:09,580 What will they be using it for? What do my users care about? 1237 01:21:09,580 --> 01:21:11,750 What don't they care about? 1238 01:21:11,750 --> 01:21:14,500 You just don't want to make an app and let it just grow 1239 01:21:14,500 --> 01:21:18,270 and become this giant, all-consuming thing that you can't even finish. 1240 01:21:18,270 --> 01:21:23,900 You want to have discrete goals and plans and things you want to address. 1241 01:21:23,900 --> 01:21:29,000 Make it effortless. All of this says, basically, 1242 01:21:29,000 --> 01:21:34,950 make it easy for the user to use it; don't make it a giant blob of text like this slide is, actually. 1243 01:21:34,950 --> 01:21:38,020 You just want it to be something where it's very easy for someone to go in 1244 01:21:38,020 --> 01:21:40,800 and do what they want to do. 1245 01:21:40,800 --> 01:21:42,920 You don't want them to have to navigate 5 pages 1246 01:21:42,920 --> 01:21:45,460 to get to your prime function of your site. 1247 01:21:45,460 --> 01:21:49,290 If Google had 5 pages before you could even search something, 1248 01:21:49,290 --> 01:21:53,080 no one would use it. 1249 01:21:53,080 --> 01:21:55,890 And lastly, paper prototype, focus group. 1250 01:21:55,890 --> 01:21:59,220 Have good design and testing practices. 1251 01:21:59,220 --> 01:22:00,730 Just because you think it works for you, 1252 01:22:00,730 --> 01:22:04,860 doesn't mean anyone else thinks it works. 1253 01:22:04,860 --> 01:22:14,490 But yeah, that's it. 1254 01:22:14,490 --> 01:22:17,490 [CS50.TV]