1 00:00:00,506 --> 00:00:08,986 [ no speaking ] 2 00:00:09,486 --> 00:00:10,796 >> Unknown: All right, welcome to CS50. 3 00:00:10,796 --> 00:00:12,916 This is the start of week 9. 4 00:00:12,916 --> 00:00:15,256 So one announcement, if you haven't RSVP'd already and would 5 00:00:15,256 --> 00:00:18,196 like to join us, Wednesday, Mather [assumed spelling], 6 pm. 6 00:00:18,196 --> 00:00:18,766 [Inaudible] just RSVP. 7 00:00:18,896 --> 00:00:24,096 One new handout today and this is a bug of sort or a feature 8 00:00:24,096 --> 00:00:26,476 if you will, that's actually been circulated a bit 9 00:00:26,476 --> 00:00:28,066 on various blogs this past week. 10 00:00:28,066 --> 00:00:29,366 So it's been around for some time. 11 00:00:29,666 --> 00:00:31,166 I thought I'd replicate it here 12 00:00:31,166 --> 00:00:32,646 with this website called, Google. 13 00:00:32,936 --> 00:00:35,566 I was wanting to do some math the other day 14 00:00:35,956 --> 00:00:38,336 and it turns out Google does math. 15 00:00:38,336 --> 00:00:41,306 For instance, if I want to know what 3 - 2 is, hit enter, 16 00:00:41,546 --> 00:00:43,436 well we'll actually do that kind of math for you. 17 00:00:43,656 --> 00:00:46,406 But that I can do in my head, but I can't really do 18 00:00:46,406 --> 00:00:49,376 in my head very well something like this, this, 19 00:00:50,646 --> 00:00:53,736 that's a big number and then let's just minus, 20 00:00:53,736 --> 00:00:55,236 actually let's try to be creative here. 21 00:00:55,496 --> 00:00:59,786 Let's put almost the same number but 998. 22 00:00:59,786 --> 00:01:02,756 OK, so for anyone who's taken like Math 55, what's the answer? 23 00:01:03,256 --> 00:01:05,996 [ laughter ] 24 00:01:06,496 --> 00:01:08,156 Seven. Excellent. 25 00:01:08,386 --> 00:01:11,096 So it's pretty close, it should be 1 right. 
26 00:01:11,226 --> 00:01:13,876 Pretty big numbers, but one's just 1 bigger than the other, 27 00:01:13,876 --> 00:01:15,256 but if you do a little Google search 28 00:01:15,256 --> 00:01:18,206 for this, the Google fails us. 29 00:01:18,316 --> 00:01:18,946 So. 30 00:01:18,946 --> 00:01:19,166 [ booing ] 31 00:01:19,166 --> 00:01:20,656 I know. 32 00:01:20,656 --> 00:01:21,746 [ laughter ] 33 00:01:21,746 --> 00:01:22,936 So this went around a while. 34 00:01:22,936 --> 00:01:25,536 So this isn't a bug per se because otherwise someone 35 00:01:25,536 --> 00:01:26,606 at Google would have addressed it. 36 00:01:26,606 --> 00:01:28,976 Because it's been blogged about ad nauseam. 37 00:01:28,976 --> 00:01:32,816 So what might the explanation be? 38 00:01:33,766 --> 00:01:34,426 What's that? 39 00:01:34,886 --> 00:01:37,316 So rounding or there's some kind of imprecision here. 40 00:01:37,316 --> 00:01:39,936 So even though we talk mostly about this in the context 41 00:01:39,936 --> 00:01:42,506 of floating point values, so even with integers, 42 00:01:42,716 --> 00:01:45,346 we've seen in C at least, that there are upper bounds 43 00:01:45,346 --> 00:01:47,826 on the kinds of math that we can actually compute. 44 00:01:48,066 --> 00:01:49,586 So if you like this kind of thing, 45 00:01:49,586 --> 00:01:53,646 so there's really no big lesson here, this is in fact a feature 46 00:01:53,646 --> 00:01:56,476 of the way math is implemented at least by Google here, 47 00:01:56,716 --> 00:01:58,246 but there's actually a neat article that's 48 00:01:58,686 --> 00:01:59,756 been recirculated. 49 00:01:59,756 --> 00:02:01,686 It's called, Why Computers Suck at Math. 50 00:02:01,806 --> 00:02:04,596 It's an interesting article and it discusses this 51 00:02:04,856 --> 00:02:05,996 and other such things. 52 00:02:05,996 --> 00:02:08,156 So we've linked that on the course's website 53 00:02:08,156 --> 00:02:08,846 if you would like.
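The rounding explanation offered here can be reproduced with ordinary floating-point values. The sketch below is only an illustration: the numbers are assumptions chosen for the demo, not the ones typed into Google in lecture, and it says nothing about how Google actually implements its calculator.

```php
<?php
// Sketch: doubles near 1e16 are spaced 2 apart, so adding 1 is lost to rounding.
// The values are illustrative assumptions, not the numbers used in lecture.
$big = 1e16;              // exactly representable as a double
$bigger = $big + 1.0;     // the true sum falls between two doubles and rounds back to $big
$difference = $bigger - $big;

var_dump($difference);    // a float that is not the mathematically expected 1
```

Two numbers that differ by 1 on paper can therefore end up as the same value in memory once they are large enough.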
54 00:02:09,146 --> 00:02:11,176 Also on the course's website now, 55 00:02:11,736 --> 00:02:14,526 we've replaced the old big board with the new big board. 56 00:02:14,526 --> 00:02:15,926 The old big board is still linked there. 57 00:02:16,496 --> 00:02:19,296 You'll see that your classmate Charles has already figured 58 00:02:19,296 --> 00:02:24,586 out how to exploit financial opportunity perhaps, 59 00:02:24,796 --> 00:02:27,146 more likely bugs or weaknesses 60 00:02:27,146 --> 00:02:29,356 in our own implementation of this big board. 61 00:02:29,356 --> 00:02:30,976 This is sort of this funny thing, 62 00:02:30,976 --> 00:02:33,246 especially about teaching a Harvard class right, you know, 63 00:02:33,486 --> 00:02:35,406 you just want to do something nice and fun 64 00:02:35,406 --> 00:02:37,746 that gets the students excited about this and that. 65 00:02:37,746 --> 00:02:39,386 It's completely ancillary to the pset 66 00:02:39,386 --> 00:02:42,676 and then there's always one or more people who decide, 67 00:02:42,966 --> 00:02:44,396 let's see what we can break. 68 00:02:44,866 --> 00:02:47,096 And of course, there are weaknesses in this 69 00:02:47,096 --> 00:02:49,846 and that's fine, Charles is up 4000% 70 00:02:49,886 --> 00:02:52,526 since like 9 am this morning. 71 00:02:53,156 --> 00:02:56,026 If you actually look at his history by clicking his name, 72 00:02:56,026 --> 00:02:59,056 you'll see exactly how he did this and what he bought 73 00:02:59,056 --> 00:03:01,556 and things we probably shouldn't have let him buy. 74 00:03:01,856 --> 00:03:05,106 And again, you can essentially see into the future 75 00:03:05,106 --> 00:03:10,156 if you simply watch your own eTrade account or CNN.com 76 00:03:10,156 --> 00:03:13,006 and then make your purchases within the 15 minute window 77 00:03:13,316 --> 00:03:15,726 that is the delay that Yahoo Finance imposes. 78 00:03:15,726 --> 00:03:16,526 But that's OK.
79 00:03:16,836 --> 00:03:18,876 Why don't we just throw down the gauntlet and say 80 00:03:18,876 --> 00:03:20,826 that Charles is now the person to beat. 81 00:03:20,826 --> 00:03:24,946 So 433,000 is currently the number one position. 82 00:03:24,946 --> 00:03:27,536 I think I'm doing rather the opposite. 83 00:03:27,536 --> 00:03:29,856 Oh actually, we're only down 1%. 84 00:03:30,066 --> 00:03:31,926 But one takeaway that's worth noting here is 85 00:03:31,926 --> 00:03:33,836 and you may have glimpsed it really fast just now, 86 00:03:34,396 --> 00:03:36,786 is this is actually an Ajax implementation 87 00:03:36,786 --> 00:03:37,636 of this big board. 88 00:03:37,906 --> 00:03:40,686 Ajax is kind of a buzzword, but it refers to websites 89 00:03:40,686 --> 00:03:43,626 that are more dynamic in that, to update the page's content, 90 00:03:43,626 --> 00:03:45,256 the whole thing doesn't have to reload. 91 00:03:45,496 --> 00:03:47,946 You can just reload portions of the page, 92 00:03:47,946 --> 00:03:50,106 thereby creating a much more seamless interface 93 00:03:50,106 --> 00:03:52,376 and that's going to be one of the topics for this week. 94 00:03:52,376 --> 00:03:55,106 How can we move away, even just one week later, 95 00:03:55,416 --> 00:03:58,426 from this very mechanical but very reliable approach 96 00:03:58,426 --> 00:04:02,176 of implementing websites where one page leads to another leads 97 00:04:02,176 --> 00:04:04,906 to another and can we make this user experience a lot more 98 00:04:04,906 --> 00:04:05,936 seamless like this? 99 00:04:05,936 --> 00:04:08,076 And if you watch this over the course of several minutes, 100 00:04:08,346 --> 00:04:11,586 you'll see that these numbers do in fact fluctuate 101 00:04:11,586 --> 00:04:14,116 and I wouldn't be surprised if by the end of today's lecture, 102 00:04:14,676 --> 00:04:16,356 someone is already contesting Charles. 103 00:04:16,446 --> 00:04:18,026 So we shall, we shall see.
104 00:04:18,026 --> 00:04:20,546 Oh and also back is, Ceiling Cat. 105 00:04:20,546 --> 00:04:23,126 Ceiling Cat is one of the most famous of all cats. 106 00:04:23,126 --> 00:04:26,676 I know that the feelings on this topic in the course are mixed, 107 00:04:26,966 --> 00:04:29,426 but seeing as in the remaining surveys that were submitted 108 00:04:29,426 --> 00:04:31,766 for pset 5, we just finished reading through, 109 00:04:31,906 --> 00:04:34,466 one student actually said that he or she registered 110 00:04:34,466 --> 00:04:37,096 for the course because my god, the syllabus was online 111 00:04:37,096 --> 00:04:38,936 in advance and so few courses do that. 112 00:04:39,196 --> 00:04:41,786 And another student literally wrote that he or she enrolled 113 00:04:41,786 --> 00:04:44,936 in the course because he or she saw Ceiling Cat 114 00:04:44,936 --> 00:04:46,396 on version 1 of the website. 115 00:04:46,456 --> 00:04:49,796 So we actually removed that, worrying that it would work 116 00:04:49,796 --> 00:04:51,696 against us this year, but we'll put him back 117 00:04:51,836 --> 00:04:53,646 since this is now the web week of the course. 118 00:04:53,806 --> 00:04:56,976 And if you actually click him, you can read about the beginning 119 00:04:56,976 --> 00:04:59,596 of the earth from Ceiling Cat's perspective. 120 00:04:59,596 --> 00:05:02,146 So this is another famous link there. 121 00:05:02,356 --> 00:05:05,856 Anyhow, all right, so what have we got in store today? 122 00:05:05,856 --> 00:05:09,286 So last week we talked about, last week we talked about php 123 00:05:09,286 --> 00:05:13,566 and a little bit of MySQL and HTML and pset 7 is really going 124 00:05:13,566 --> 00:05:14,596 to reinforce this stuff. 125 00:05:14,846 --> 00:05:16,486 So we're going to forge ahead this week 126 00:05:16,696 --> 00:05:19,256 and give you the conceptual framework for both pset 8 127 00:05:19,256 --> 00:05:21,756 and probably for many of you, final projects.
128 00:05:21,756 --> 00:05:23,196 And as much as we've kind 129 00:05:23,196 --> 00:05:25,656 of promoted taking a web based approach for final projects, 130 00:05:25,656 --> 00:05:27,566 realize you absolutely do not have to. 131 00:05:27,836 --> 00:05:29,696 We just know historically in recent years, 132 00:05:29,696 --> 00:05:31,126 60% plus of students 133 00:05:31,126 --> 00:05:33,356 in the course tackle web based final projects, 134 00:05:33,356 --> 00:05:34,636 even though we spend but two weeks 135 00:05:34,636 --> 00:05:36,216 on it officially in the semester. 136 00:05:36,216 --> 00:05:38,746 But you are welcome to pursue most anything per the pset, 137 00:05:38,926 --> 00:05:40,956 per the final project specifications. 138 00:05:41,256 --> 00:05:43,696 So I thought I would either try to really excite you 139 00:05:44,276 --> 00:05:47,376 or really crush you with the following promise which is 140 00:05:47,376 --> 00:05:51,476 that we can implement the entirety of problem set 6, 141 00:05:51,476 --> 00:05:53,166 essentially in one line of code. 142 00:05:53,476 --> 00:05:55,316 So that's a bit of an oversimplification 143 00:05:55,316 --> 00:05:56,886 but it does speak to this issue 144 00:05:56,886 --> 00:05:58,726 of choosing the right tool for the job. 145 00:05:58,726 --> 00:06:00,856 So what I went ahead and did was the following 146 00:06:00,856 --> 00:06:02,336 and included among your printouts 147 00:06:02,336 --> 00:06:04,886 for today is this file called, speller. 148 00:06:05,306 --> 00:06:06,666 So notice a couple of things. 149 00:06:06,666 --> 00:06:08,996 Speller, one, has no file extension. 150 00:06:08,996 --> 00:06:11,886 So even in the course we've had this habit 151 00:06:12,066 --> 00:06:14,566 of using file extensions like .c and .h, 152 00:06:14,906 --> 00:06:17,046 for the most part those are just conventions.
153 00:06:17,246 --> 00:06:19,756 Now with that said, some operating systems like Windows, 154 00:06:19,996 --> 00:06:22,896 rely on file extensions so that when you double-click it, 155 00:06:23,146 --> 00:06:24,686 it knows what program to load. 156 00:06:24,686 --> 00:06:28,256 But operating systems like Linux and Unix and MacOS, 157 00:06:28,256 --> 00:06:30,236 are a bit more, a bit smarter than that 158 00:06:30,446 --> 00:06:32,396 and they actually look inside the file to figure 159 00:06:32,396 --> 00:06:33,726 out what program to launch. 160 00:06:33,726 --> 00:06:34,876 But there's still this convention 161 00:06:34,876 --> 00:06:36,456 of naming things with file extensions. 162 00:06:36,766 --> 00:06:37,896 But it's not necessary. 163 00:06:37,896 --> 00:06:40,956 I proceeded to write this program called, speller. 164 00:06:41,216 --> 00:06:44,316 No file extension but notice, no 0's and 1's. 165 00:06:44,396 --> 00:06:47,526 This is not a compiled program because I wrote this one in php. 166 00:06:48,036 --> 00:06:50,366 So the catch when you write a program in php, 167 00:06:50,366 --> 00:06:52,736 especially if you just want to run it at the command line 168 00:06:52,946 --> 00:06:54,886 like I did early last week when we did 169 00:06:54,886 --> 00:06:57,956 that very simple quote check, stock quote checking program. 170 00:06:58,376 --> 00:07:00,266 Four or so lines of code. 171 00:07:00,436 --> 00:07:06,756 I ran it at the command line as php quote.php, because I wanted 172 00:07:06,756 --> 00:07:08,226 to use the php interpreter, 173 00:07:08,226 --> 00:07:10,266 a program that executes php code 174 00:07:10,476 --> 00:07:11,686 and I gave it a command line argument, 175 00:07:11,686 --> 00:07:13,436 the name of the file I wanted to execute.
176 00:07:13,606 --> 00:07:16,146 But that's a little bit annoying because now your users have 177 00:07:16,176 --> 00:07:18,766 to realize, oh this program happens to be written 178 00:07:18,766 --> 00:07:21,556 in a language called php, even though I don't care. 179 00:07:21,826 --> 00:07:23,196 But just to run this program I need 180 00:07:23,196 --> 00:07:25,246 to know to run php filename. 181 00:07:25,446 --> 00:07:27,526 Well you can avoid that altogether on a Mac 182 00:07:27,526 --> 00:07:29,356 or on a typical Linux or Unix system 183 00:07:29,616 --> 00:07:31,516 by including this thing atop the file. 184 00:07:31,516 --> 00:07:33,966 So this is what's sort of goofily called a shebang, 185 00:07:34,456 --> 00:07:39,666 S H E B A N G, which just means put the sharp symbol at the top, 186 00:07:39,666 --> 00:07:42,616 followed by bang or exclamation point, followed by the path 187 00:07:43,036 --> 00:07:45,226 of the program that you want to use 188 00:07:45,226 --> 00:07:47,116 to interpret the following lines of code. 189 00:07:47,386 --> 00:07:50,326 So simply by including this at the top of my file 190 00:07:50,646 --> 00:07:54,586 and by running this command, chmod, change mode 700 191 00:07:54,586 --> 00:07:56,946 of speller and this command chmod is discussed, 192 00:07:56,946 --> 00:07:58,546 if you haven't read it already, in pset 7. 193 00:07:58,546 --> 00:08:02,636 This is going to make my file executable and what this means, 194 00:08:02,636 --> 00:08:05,206 if I do an ls and this is an aesthetic thing, 195 00:08:05,456 --> 00:08:08,786 speller is shown to me in green and with a star, 196 00:08:08,786 --> 00:08:09,946 at least within PuTTY, 197 00:08:09,946 --> 00:08:11,736 the program I'm using on this PC.
198 00:08:11,946 --> 00:08:13,346 So that just means it's executable, 199 00:08:13,566 --> 00:08:16,256 so that means I can run at the command line, speller, 200 00:08:16,506 --> 00:08:20,266 or more specifically just to be really security conscious, 201 00:08:20,266 --> 00:08:22,086 ./speller. Run this copy of speller. 202 00:08:22,376 --> 00:08:24,146 So what the OS, Linux in this case, 203 00:08:24,146 --> 00:08:25,446 is going to do is it's going to see ah, 204 00:08:25,596 --> 00:08:28,236 top line of this file says use the following 205 00:08:28,236 --> 00:08:30,316 interpreter /usr/bin/php. 206 00:08:30,546 --> 00:08:33,036 That's just where it is on the hard drive. 207 00:08:33,356 --> 00:08:34,506 Let me load that program 208 00:08:34,506 --> 00:08:36,236 and then feed the following lines to it. 209 00:08:36,236 --> 00:08:39,856 So the same goal is accomplished of executing the code, 210 00:08:39,856 --> 00:08:42,326 but it's a little bit more user friendly that now I don't have 211 00:08:42,356 --> 00:08:44,396 to know or care what language this is written in. 212 00:08:44,666 --> 00:08:47,066 So I implemented speller and if I run it like this, 213 00:08:47,696 --> 00:08:51,496 I see output identical to pset 6's framework 214 00:08:51,576 --> 00:08:52,676 that you guys were handed. 215 00:08:52,906 --> 00:08:54,866 So we won't walk in great detail through this. 216 00:08:54,866 --> 00:08:57,136 You're welcome to do it sort of at your leisure at home, 217 00:08:57,136 --> 00:09:00,266 but notice I pretty much tried to look at my C code 218 00:09:00,266 --> 00:09:01,676 on the left hand side in a window 219 00:09:01,866 --> 00:09:04,276 and then I had another window open for this file and I tried 220 00:09:04,276 --> 00:09:07,316 to literally translate C code to php code.
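The shebang mechanism just described might look like the following in a file saved without an extension. This is a minimal sketch rather than the actual speller file: the greeting function is a hypothetical placeholder, and the interpreter path is the one named in lecture, which may differ on other systems.

```php
#!/usr/bin/php
<?php
// Sketch of a script like speller: saved with no file extension and made
// executable with `chmod 700 speller`, it can then be run as `./speller`.
// The interpreter path follows the lecture; on other systems it may differ.

// Hypothetical helper, here only so the script has something to do.
function greeting($name)
{
    return "hello, " . $name . "\n";
}

echo greeting("world");
```

Running the same file as `php speller` would still work; the shebang line only spares the user from having to know which interpreter to invoke.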
221 00:09:07,316 --> 00:09:09,436 Partly so that you guys, if you want to curl up with this 222 00:09:09,436 --> 00:09:13,076 at some point with your old pset 6, or pset 6 still in progress, 223 00:09:13,346 --> 00:09:16,316 you can actually see how you would implement the same program 224 00:09:16,316 --> 00:09:17,316 but in a different language. 225 00:09:17,316 --> 00:09:18,876 And most of the syntax is familiar. 226 00:09:18,976 --> 00:09:21,976 So at the very top here, I have require. 227 00:09:22,346 --> 00:09:25,756 So require is similar in spirit to that sharp include. 228 00:09:25,756 --> 00:09:28,116 In C it means require the following file, 229 00:09:28,116 --> 00:09:29,816 copy and paste its contents here. 230 00:09:30,126 --> 00:09:33,126 So this thing here is a little php specific. 231 00:09:33,126 --> 00:09:34,266 Php, like a lot of languages, 232 00:09:34,266 --> 00:09:36,296 has different levels of error reporting. 233 00:09:36,296 --> 00:09:38,036 If you really want to get your hands dirty 234 00:09:38,036 --> 00:09:39,626 with the feedback from the program. 235 00:09:39,716 --> 00:09:43,796 So I'm just saying turn on, suppress things called notices 236 00:09:43,926 --> 00:09:45,776 and warnings and in a lot of languages there's 237 00:09:45,776 --> 00:09:48,346 at least three types of bad things that can happen. 238 00:09:48,346 --> 00:09:51,206 Notices, which are like eh, you really shouldn't do that. 239 00:09:51,346 --> 00:09:54,176 Warnings, which are something bad could probably happen, 240 00:09:54,176 --> 00:09:55,516 but I'm going to proceed nonetheless. 241 00:09:55,736 --> 00:09:58,816 And errors, which means sorry, really bad stuff happened, 242 00:09:58,816 --> 00:10:00,166 I can't even run your code. 243 00:10:00,166 --> 00:10:01,346 So there's different levels. 244 00:10:01,656 --> 00:10:04,186 In C, we've pretty much turned everything into errors 245 00:10:04,186 --> 00:10:06,476 for you guys, to force you to actually deal with them.
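The three levels described above map onto constants that php exposes, and a script can choose which ones get reported. The transcript does not show the exact mask used in the lecture's file, so the one below is an assumption that simply matches the spoken description of suppressing notices and warnings.

```php
<?php
// Sketch: report everything except notices and warnings.
// The exact mask used in the lecture's file is not shown in the transcript;
// this one is an assumption.
error_reporting(E_ALL & ~E_NOTICE & ~E_WARNING);

// error_reporting() with no argument returns the current level.
$level = error_reporting();
echo ($level & E_NOTICE) ? "notices on\n" : "notices off\n";
echo ($level & E_WARNING) ? "warnings on\n" : "warnings off\n";
echo ($level & E_ERROR) ? "errors on\n" : "errors off\n";
```

The level is a bitmask, which is why the individual categories are combined and tested with bitwise operators.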
246 00:10:06,476 --> 00:10:08,846 So for now you can just kind of take this one on faith. 247 00:10:09,056 --> 00:10:10,536 This is stolen from the problem sets. 248 00:10:10,536 --> 00:10:12,716 I defined a constant in php. 249 00:10:12,716 --> 00:10:13,606 Syntax is different. 250 00:10:13,606 --> 00:10:14,606 It's not sharp define. 251 00:10:14,606 --> 00:10:17,166 It's define and define is actually a function 252 00:10:17,166 --> 00:10:18,096 that takes two arguments. 253 00:10:18,146 --> 00:10:20,336 The name of the constant and then a value on [inaudible] 254 00:10:20,336 --> 00:10:21,916 but otherwise it's the same idea. 255 00:10:22,196 --> 00:10:24,246 Same deal for the second constant here, words. 256 00:10:24,636 --> 00:10:27,506 And then down here notice that just like in C, 257 00:10:27,686 --> 00:10:29,276 I can have command line arguments. 258 00:10:29,276 --> 00:10:31,786 So if argc does not equal 2 or 3, 259 00:10:32,096 --> 00:10:33,776 tell the user how to run this program. 260 00:10:33,776 --> 00:10:37,476 So it's pretty much been a copy paste, from C to php, 261 00:10:37,476 --> 00:10:39,676 but I've changed my variables to have dollar signs 262 00:10:39,676 --> 00:10:41,226 and I've changed how I declare constants. 263 00:10:41,496 --> 00:10:44,246 But pretty much it's a pretty good translation. 264 00:10:44,556 --> 00:10:47,146 Now this thing here I did just to be a little bit anal. 265 00:10:47,396 --> 00:10:50,686 So these are the variables we had in speller as well, ti_load, 266 00:10:50,686 --> 00:10:52,886 ti_check and I've initialized them just 267 00:10:52,886 --> 00:10:54,566 to be a little fancy to 0. 268 00:10:54,566 --> 00:10:57,346 Now this is just me thinking I'm being clever 269 00:10:57,346 --> 00:10:58,926 by just saving a character. 270 00:10:58,926 --> 00:11:01,486 By not doing 0.0, but it's the same thing.
271 00:11:01,486 --> 00:11:03,496 As soon as you append a period to a number, 272 00:11:03,496 --> 00:11:05,646 it becomes a floating point value so 273 00:11:05,646 --> 00:11:07,796 and I also just didn't want to waste 4 lines of code. 274 00:11:07,796 --> 00:11:09,356 So again, stylistic decision. 275 00:11:09,576 --> 00:11:12,246 I just put them all on the same line separated by semi-colons. 276 00:11:12,286 --> 00:11:14,976 But it's the same approach as in C. But notice I did not have 277 00:11:15,006 --> 00:11:17,006 to specify for any of these variables what? 278 00:11:18,306 --> 00:11:21,396 A type. So again, php is loosely typed, 279 00:11:21,396 --> 00:11:23,696 which means there are data types underneath the hood, 280 00:11:23,966 --> 00:11:27,106 but php gives you so much automatic conversion or casting 281 00:11:27,106 --> 00:11:29,016 from one to the other, that you the programmer don't have 282 00:11:29,016 --> 00:11:30,136 to worry as much about it. 283 00:11:30,406 --> 00:11:33,126 So this is our little ternary operator question mark 284 00:11:33,126 --> 00:11:36,996 and colon, which just says if argc is 3 assign dict 285 00:11:37,036 --> 00:11:39,296 to the value of argv[1] else assign 286 00:11:39,296 --> 00:11:41,386 it the default value in that constant. 287 00:11:41,566 --> 00:11:44,706 And then for the most part the rest of this you can play with 288 00:11:44,706 --> 00:11:46,186 or look through on your own if you want, 289 00:11:46,376 --> 00:11:47,716 but it's pretty much a translation 290 00:11:47,716 --> 00:11:50,606 of the benchmarking code that you guys used for problem set 6 291 00:11:50,866 --> 00:11:53,276 and the end result and the end of this program is 292 00:11:53,276 --> 00:11:54,666 that it prints this output, 293 00:11:54,716 --> 00:11:56,906 which should be vaguely familiar at this point.
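The constant, the argument check, the untyped float variables and the ternary operator described above could be pieced together roughly as follows. This is a sketch under stated assumptions: the constant name, its path, the helper function `choose_dictionary` and the last two variable names are all hypothetical, and the lecture's file works on the command-line variables directly rather than through a function.

```php
<?php
// Constant name and path are hypothetical stand-ins for the ones in speller.
define("DICTIONARY", "/path/to/default/words");

// Wrapped in a function (an assumption, to keep the sketch self-contained);
// the lecture's code checks the command-line arguments at the top level.
function choose_dictionary(array $argv)
{
    $argc = count($argv);

    // expect: ./speller [dict] file
    if ($argc != 2 && $argc != 3) {
        return false;
    }

    // ternary operator: explicit dictionary if given, else the default
    return ($argc == 3) ? $argv[1] : DICTIONARY;
}

// no declared types; a trailing period makes each literal a float
$ti_load = 0.; $ti_check = 0.; $ti_size = 0.; $ti_unload = 0.;

var_dump(choose_dictionary(array("./speller", "text.txt")));
var_dump(choose_dictionary(array("./speller", "small", "text.txt")));
var_dump(is_float($ti_load));
```

Unlike `#define`, which the C preprocessor expands before compilation, `define` here is an ordinary function call made while the script runs.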
294 00:11:57,096 --> 00:11:59,726 So in short, I reimplemented speller in php, 295 00:11:59,726 --> 00:12:01,986 but of course you didn't implement speller, 296 00:12:02,026 --> 00:12:05,606 you implemented dictionary.c and maybe .h, 297 00:12:05,606 --> 00:12:07,576 so you implemented only the data structure. 298 00:12:07,816 --> 00:12:09,576 So I thought we would kind of do that here. 299 00:12:09,576 --> 00:12:12,206 Let me go ahead and move this one aside for just a moment. 300 00:12:12,676 --> 00:12:14,776 I'm going to create a file called dictionary.php. 301 00:12:14,776 --> 00:12:17,986 It's automatically going to be included when I run speller, 302 00:12:17,986 --> 00:12:19,376 because of that require line. 303 00:12:19,606 --> 00:12:22,136 Every php file has to begin and end with this 304 00:12:22,206 --> 00:12:25,206 or anytime you write php code specifically, 305 00:12:25,366 --> 00:12:27,846 you need to encase it in these things, so it's not conflated 306 00:12:27,846 --> 00:12:29,806 for HTML or something like that. 307 00:12:30,156 --> 00:12:30,866 So let's see. 308 00:12:31,126 --> 00:12:33,446 I need to implement a whole dictionary 309 00:12:33,726 --> 00:12:34,806 and let's see, what was in there? 310 00:12:34,806 --> 00:12:37,186 So I had a function that I had to implement called, 311 00:12:37,326 --> 00:12:39,286 load and it has to return a value. 312 00:12:39,286 --> 00:12:42,046 I had a function called, check and it took a word 313 00:12:42,046 --> 00:12:43,236 and now I'm going to have to write that. 314 00:12:43,706 --> 00:12:45,686 I had a function called, there were two more, 315 00:12:46,966 --> 00:12:49,896 size and that's probably pretty easy, but let's see. 316 00:12:49,896 --> 00:12:51,856 And then finally a function called, unload. 
317 00:12:52,076 --> 00:12:53,936 So these were the four functions I implemented 318 00:12:53,936 --> 00:12:55,616 and pretty much we gave them to you as blank, 319 00:12:55,706 --> 00:12:57,296 maybe a return value here or there. 320 00:12:57,536 --> 00:13:00,426 But I need to implement the data structure first. 321 00:13:00,426 --> 00:13:02,716 In php and some of these high level languages 322 00:13:02,716 --> 00:13:05,636 and scripting languages, you can implement data 323 00:13:05,636 --> 00:13:06,556 structures yourself. 324 00:13:06,556 --> 00:13:09,786 You can implement trees and tries and hash tables. 325 00:13:10,076 --> 00:13:12,456 But the thing is you don't have to, because in a lot 326 00:13:12,456 --> 00:13:14,876 of higher level languages, you're handed this stuff 327 00:13:14,876 --> 00:13:16,256 for free, out of the box. 328 00:13:16,256 --> 00:13:17,556 It's a feature of the language. 329 00:13:17,846 --> 00:13:19,886 And in fact, we mentioned this briefly last time, 330 00:13:20,166 --> 00:13:25,046 that thing called dollar underscore, $_POST or $_GET, 331 00:13:25,266 --> 00:13:27,556 that superglobal as I labeled it last week, 332 00:13:27,846 --> 00:13:29,976 that's what's called an associative array. 333 00:13:30,206 --> 00:13:32,376 The syntax for that recall looked a little something 334 00:13:32,376 --> 00:13:32,766 like this. 335 00:13:32,836 --> 00:13:35,866 Post and then I could say something like name and what 336 00:13:35,966 --> 00:13:38,716 that variable gave me is the value that the user typed 337 00:13:38,716 --> 00:13:40,216 into a form called, name. 338 00:13:40,546 --> 00:13:43,696 Now we're mostly familiar with arrays in a numeric sense.
339 00:13:43,766 --> 00:13:47,306 In C we always did something like this or like this 340 00:13:47,306 --> 00:13:51,056 or like this, but what's nice about an associative array is 341 00:13:51,056 --> 00:13:53,526 that it's a generalization of the idea of an array 342 00:13:53,646 --> 00:13:57,976 and you can index into the array using things other than numbers. 343 00:13:58,166 --> 00:13:59,846 You can use arbitrary strings. 344 00:14:00,216 --> 00:14:02,666 So in fact, what an associative array is, 345 00:14:02,876 --> 00:14:05,096 it's kind of like a two column table in memory, 346 00:14:05,096 --> 00:14:06,816 where on the left hand column are keys 347 00:14:07,086 --> 00:14:10,066 and the right hand column are values, but what's really nice 348 00:14:10,066 --> 00:14:11,486 about the language itself is 349 00:14:11,486 --> 00:14:13,276 that it doesn't search these things linearly. 350 00:14:13,526 --> 00:14:16,156 What an associative array is implemented as, 351 00:14:16,156 --> 00:14:18,926 underneath the hood, is something like a hash table 352 00:14:19,146 --> 00:14:22,186 with chains, which is what many of you, but not all of you use, 353 00:14:22,186 --> 00:14:24,426 to implement your implementation of pset 6. 354 00:14:24,756 --> 00:14:27,886 So long story short, if I want an associative array, 355 00:14:27,886 --> 00:14:33,216 aka hash table, I simply have to do something like this. 356 00:14:33,676 --> 00:14:34,586 Give me an array. 357 00:14:35,106 --> 00:14:38,486 So this is going to be my hash table. 358 00:14:38,876 --> 00:14:39,746 It's as simple as that. 359 00:14:39,926 --> 00:14:42,736 So now I can put stuff into this array and I can get stuff 360 00:14:42,736 --> 00:14:45,706 out of this array, just by using that square bracket notation. 361 00:14:45,966 --> 00:14:47,486 So let's see, what else am I going to need?
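The idea of an array indexed by strings rather than numbers can be sketched in a few lines. The keys below are made-up examples, and the use of `isset` to test for a key is a choice made for this sketch rather than something shown on screen.

```php
<?php
// An array indexed by strings instead of numbers.
$dictionary = array();

// put stuff in with square bracket notation
$dictionary["cat"] = true;
$dictionary["dog"] = true;

// get stuff out the same way; isset() avoids a notice for a missing key
var_dump(isset($dictionary["cat"]));
var_dump(isset($dictionary["caterpillar"]));

// keys on the left, values on the right: the two-column table
foreach ($dictionary as $key => $value) {
    echo $key . " => " . var_export($value, true) . "\n";
}
```

No hash function, buckets or chains appear anywhere in this code, which is the point being made: the language supplies them.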
362 00:14:47,486 --> 00:14:50,186 I'm also going to declare a global variable called, 363 00:14:50,186 --> 00:14:54,676 hashtablesize and I'm just going to call this, size and I'm going 364 00:14:54,676 --> 00:14:55,956 to initialize it to 0. 365 00:14:56,346 --> 00:14:58,556 So notice this is outside the scope of those functions 366 00:14:58,556 --> 00:15:00,286 so now all I have to do is implement the 367 00:15:00,286 --> 00:15:01,126 remaining functions. 368 00:15:01,346 --> 00:15:03,676 Well let's kind of impress just 369 00:15:03,706 --> 00:15:06,096 by being quick here, all right, done. 370 00:15:06,326 --> 00:15:07,736 So we're done with that function there. 371 00:15:07,936 --> 00:15:09,336 So we have three functions left. 372 00:15:09,336 --> 00:15:10,156 What about load? 373 00:15:10,486 --> 00:15:12,786 Well to get in load, oh and actually I forgot one thing 374 00:15:12,786 --> 00:15:14,586 in load, it did take a command line argument. 375 00:15:14,616 --> 00:15:15,946 The name of the dictionary to load. 376 00:15:16,206 --> 00:15:18,956 My goal here has to be to load the dictionary into memory 377 00:15:19,296 --> 00:15:21,866 and then iterate over the files and then insert each 378 00:15:21,866 --> 00:15:23,966 of the files into my data structure. 379 00:15:24,266 --> 00:15:25,176 Well how can I do that? 380 00:15:25,646 --> 00:15:30,736 Well it turns out that in php, there is a function called file, 381 00:15:31,076 --> 00:15:35,446 which returns to you an array such that every element 382 00:15:35,446 --> 00:15:38,566 of the array is a word from the file. 383 00:15:38,876 --> 00:15:40,536 So it's literally gives you that. 384 00:15:40,536 --> 00:15:42,606 In fact, let me do a quick, little, let's see, 385 00:15:42,606 --> 00:15:44,746 what do I want to do here? 386 00:15:45,016 --> 00:15:46,346 Let's do a quick sanity check? 
387 00:15:46,346 --> 00:15:48,356 I'm going to go ahead and do this and I'm going to say, 388 00:15:48,356 --> 00:15:51,916 call this a variable called, lines equals file of dict, 389 00:15:52,156 --> 00:15:54,286 and this is going to get me again an array, 390 00:15:54,446 --> 00:15:57,736 whereby each element is a word from that file. 391 00:15:57,976 --> 00:15:59,826 And again, now this is sort of debugging mode. 392 00:15:59,826 --> 00:16:02,976 I'm going to use that recursive print function from last week, 393 00:16:02,976 --> 00:16:05,066 I'm going to print lines and then I'm just going to exit. 394 00:16:05,066 --> 00:16:07,026 So this is not a working implementation 395 00:16:07,286 --> 00:16:09,486 but I know speller is going to call load and I just want 396 00:16:09,486 --> 00:16:11,296 to see what's inside this variable temporarily 397 00:16:11,296 --> 00:16:13,706 and then we'll throw away these 2 lines of debugging code. 398 00:16:14,006 --> 00:16:16,666 So let me go ahead and save this and let me go ahead 399 00:16:16,666 --> 00:16:18,966 and run speller, on a very small file. 400 00:16:18,966 --> 00:16:20,776 Something like the Ralph Wiggum quote. 401 00:16:21,126 --> 00:16:21,886 And hit, enter. 402 00:16:21,886 --> 00:16:23,566 And in fact, that's what I see. 403 00:16:23,566 --> 00:16:25,556 So what you're seeing being spit out again and again, 404 00:16:25,556 --> 00:16:28,536 because I'm recursively printing the array, is a list of all 405 00:16:28,536 --> 00:16:30,476 of the words in the given dict file. 406 00:16:30,476 --> 00:16:31,606 In that default dictionary. 407 00:16:31,606 --> 00:16:33,056 You probably remember at least the last one. 408 00:16:33,056 --> 00:16:35,286 If you ever saw these things flow on the screen. 409 00:16:35,486 --> 00:16:37,716 Well there's a weird artifact here.
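The sanity check described above can be tried without the course's dictionary. In this sketch a two-word temporary file stands in for it, and `print_r` is assumed to be the recursive print function the lecture refers to.

```php
<?php
// Write a tiny stand-in dictionary to a temporary file (hypothetical contents).
$dict = tempnam(sys_get_temp_dir(), "dict");
file_put_contents($dict, "cat\ndog\n");

// file() returns an array with one element per line of the file
$lines = file($dict);

// debugging only: dump the array to see what load() would be working with
print_r($lines);

unlink($dict);
```

Each element comes back with its trailing newline still attached, which matters once the elements are used as keys.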
410 00:16:37,986 --> 00:16:41,896 Whereby every, there's a space between every line and that's 411 00:16:41,936 --> 00:16:44,346 because there are newlines in the file so, well we'll deal 412 00:16:44,386 --> 00:16:45,446 with that in just a moment. 413 00:16:45,656 --> 00:16:48,086 Well it turns out if I'm getting back an array here, 414 00:16:48,296 --> 00:16:50,646 I'm not even going to bother assigning it to a variable 415 00:16:50,646 --> 00:16:52,186 because I'm going to immediately launch 416 00:16:52,186 --> 00:16:53,536 into a foreach construct. 417 00:16:53,536 --> 00:16:56,586 For each word from the file and the syntax 418 00:16:56,586 --> 00:16:57,586 for doing this, is this. 419 00:16:58,216 --> 00:17:00,606 This is a nice little piece of syntax that doesn't exist 420 00:17:00,606 --> 00:17:03,336 in C. Not this user friendly like and it's going to say, 421 00:17:03,336 --> 00:17:06,626 for each of the words in that array, so it takes an array 422 00:17:06,666 --> 00:17:09,746 in parenthesis, then the keyword, as, and then the name 423 00:17:09,746 --> 00:17:11,156 of a variable that you want to update 424 00:17:11,156 --> 00:17:14,606 on every iteration being the current word from that array. 425 00:17:14,896 --> 00:17:16,836 So this is going to induce you know, the equivalent 426 00:17:16,836 --> 00:17:19,536 of a while loop or a for loop, but it's kind of nice 427 00:17:19,536 --> 00:17:21,376 and compact in what it's going to do for me. 428 00:17:21,636 --> 00:17:22,726 So what do I want to do? 429 00:17:23,026 --> 00:17:25,496 Well I need to remember that each of these words is 430 00:17:25,496 --> 00:17:28,966 in the dictionary, but let's see how I can do that. 431 00:17:28,966 --> 00:17:31,136 Here's my hash table, dictionary, 432 00:17:31,436 --> 00:17:33,166 well I pretty much just want to do this.
433 00:17:33,216 --> 00:17:35,616 I don't want to index into an i location, 434 00:17:35,616 --> 00:17:36,696 because there's no variable i 435 00:17:36,696 --> 00:17:38,116 and I don't want a hard coded number. 436 00:17:38,366 --> 00:17:41,746 What I want to use is the word, as a key and you know what, 437 00:17:41,826 --> 00:17:46,256 it suffices to say this, OK insert this word, word, 438 00:17:46,256 --> 00:17:48,386 into my dictionary but just say true. 439 00:17:48,706 --> 00:17:51,766 So I'm treating word as a key, true as a value, 440 00:17:51,946 --> 00:17:54,426 so that if you've now fast forwarded in your mind 441 00:17:54,426 --> 00:17:56,436 to the check function, how is check going 442 00:17:56,436 --> 00:17:58,246 to check whether there's a word in the dictionary? 443 00:17:58,506 --> 00:18:01,936 It's just going to check if the given word exists in dictionary, 444 00:18:01,936 --> 00:18:04,686 because if it gets back an answer of true, I put it there. 445 00:18:04,686 --> 00:18:07,066 And if nothing comes back that means I didn't put it there, 446 00:18:07,276 --> 00:18:09,266 which means it's not in fact a word in the dictionary. 447 00:18:09,496 --> 00:18:10,916 So before we clean up this function, 448 00:18:10,916 --> 00:18:12,906 let's fast forward and just do this. 449 00:18:12,976 --> 00:18:17,196 Return dictionary, bracket word 450 00:18:17,196 --> 00:18:19,736 and actually let's be a little more anal than this. 451 00:18:19,736 --> 00:18:25,336 So if I find a true value at the location in dictionary, go ahead 452 00:18:25,336 --> 00:18:30,406 and return true else go ahead and return, return false. 453 00:18:30,786 --> 00:18:32,236 So this function is actually done. 454 00:18:32,406 --> 00:18:34,266 If the given word is in the dictionary, 455 00:18:34,496 --> 00:18:35,806 return true else false. 456 00:18:36,016 --> 00:18:37,286 Now let me just clean this up.
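The idea of using each word as a key and true as the value can be sketched like this. It is only an illustration under stated assumptions: the three-word list is made up, the table is passed to check as a parameter rather than read from a global, and the lookup uses !empty() so that a missing key stays quiet instead of raising a notice, which differs slightly from the bare index lookup described in lecture.

```php
<?php
// The hash table: an ordinary PHP array used with string keys.
$dictionary = [];

// Hypothetical word list standing in for the lines of the dictionary file.
$words = ["cat", "dog", "zebra"];

// foreach takes an array, the keyword "as", and a variable that is
// updated on every iteration with the current element.
foreach ($words as $word) {
    // The word is the key; true is the value.
    $dictionary[$word] = true;
}

// Returns true if the word was inserted above, false otherwise.
// !empty() is an assumption here: it avoids a notice for missing keys.
function check($word, $dictionary)
{
    if (!empty($dictionary[$word])) {
        return true;
    } else {
        return false;
    }
}

var_dump(check("dog", $dictionary));
var_dump(check("bird", $dictionary));
```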
457 00:18:37,286 --> 00:18:40,216 Because I did say that there were these newlines 458 00:18:40,216 --> 00:18:41,156 at the end of the file, right. 459 00:18:41,156 --> 00:18:44,236 Because the default dictionary word, newline, word, newline. 460 00:18:44,506 --> 00:18:47,226 So there's this other function and this is really just useful. 461 00:18:47,286 --> 00:18:49,476 Chop, the behavior of chop is 462 00:18:49,476 --> 00:18:51,836 to eliminate the last character from a word. 463 00:18:51,836 --> 00:18:53,376 Or I could use something called trim, 464 00:18:53,376 --> 00:18:55,326 which eliminates white space from the end 465 00:18:55,326 --> 00:18:56,386 or the beginning of a word. 466 00:18:56,386 --> 00:18:58,126 So I'm just going to clean up my word in this way 467 00:18:58,316 --> 00:18:59,726 and actually let's go ahead and use trim, 468 00:18:59,726 --> 00:19:01,156 just because it's a little more thorough 469 00:19:01,356 --> 00:19:02,996 and I know it's just going to get rid of white space 470 00:19:02,996 --> 00:19:04,266 on the beginning and end of a word. 471 00:19:04,696 --> 00:19:06,866 All right, so I'm pretty much done. 472 00:19:06,866 --> 00:19:07,926 Actually you know what, let me do this. 473 00:19:08,176 --> 00:19:09,716 I now need to say, return true. 474 00:19:09,846 --> 00:19:12,046 OK. Turns out I don't really need these things. 475 00:19:12,046 --> 00:19:13,376 Let's make the code look even smaller 476 00:19:13,556 --> 00:19:15,526 and let me do one reasonable sanity check. 477 00:19:15,526 --> 00:19:20,006 So if file_exists, this variable, 478 00:19:20,006 --> 00:19:26,986 actually if this file does not exist or this file is not, 479 00:19:27,706 --> 00:19:32,586 is_readable, these are literally functions in php, what do I want 480 00:19:32,586 --> 00:19:34,556 to do if the file neither exists 481 00:19:34,636 --> 00:19:36,606 or is readable or is not readable? 482 00:19:37,666 --> 00:19:38,616 Yes, so return false.
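A short sketch of the cleanup and the sanity check just described. The function name load_words and the sample strings are assumptions for illustration; trim(), chop(), file_exists() and is_readable() are standard PHP functions. One detail worth knowing: in PHP, chop() is an alias of rtrim(), so it strips trailing whitespace rather than exactly one character.

```php
<?php
// trim() strips whitespace from both ends of a string.
var_dump(trim(" zebra\n"));

// chop() is an alias of rtrim(): it strips trailing whitespace only.
var_dump(chop(" zebra\n"));

// Hypothetical loader: returns a table of words, or false if the
// file is missing or cannot be read.
function load_words($path)
{
    if (!file_exists($path) || !is_readable($path)) {
        return false;
    }

    $table = [];
    foreach (file($path) as $word) {
        // Remove the trailing newline before using the word as a key.
        $table[trim($word)] = true;
    }
    return $table;
}

var_dump(load_words("/no/such/dictionary"));
```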
483 00:19:38,616 --> 00:19:41,506 So now I'm adding a modicum of error checking here. 484 00:19:41,796 --> 00:19:43,126 But I'm pretty thorough here. 485 00:19:43,356 --> 00:19:46,026 So if the file doesn't exist or it's not readable, return false 486 00:19:46,026 --> 00:19:47,186 because I can't load the dictionary. 487 00:19:47,386 --> 00:19:51,926 Otherwise, iterate over each word in the file and then insert 488 00:19:51,926 --> 00:19:54,806 into my hash table, every word, and just assign it a value 489 00:19:54,806 --> 00:19:57,396 of true so that later I can ask a billion questions. 490 00:19:57,396 --> 00:19:58,076 Is the word there? 491 00:19:58,216 --> 00:19:58,936 Yes or no? 492 00:19:58,986 --> 00:20:01,456 Yes or no, iteratively in my check function. 493 00:20:01,706 --> 00:20:03,096 Now I need to change one thing. 494 00:20:03,316 --> 00:20:04,436 There's one nuisance of php, 495 00:20:04,436 --> 00:20:06,886 which is that even though dictionary looks 496 00:20:06,886 --> 00:20:08,996 like it's a global variable at the top of the file, 497 00:20:09,626 --> 00:20:12,766 you actually can't access global variables inside 498 00:20:12,766 --> 00:20:15,746 of functions unless you say, hey php, 499 00:20:15,746 --> 00:20:18,366 dictionary is a global variable, just so you realize. 500 00:20:18,576 --> 00:20:20,056 And so I need to do this in here. 501 00:20:20,296 --> 00:20:21,866 I need to do it in here and I need 502 00:20:21,866 --> 00:20:24,106 to do one other, one other place. 503 00:20:24,946 --> 00:20:26,426 I need to do this here. 504 00:20:26,576 --> 00:20:27,816 So it's stupid frankly. 505 00:20:27,816 --> 00:20:29,226 This is an annoying feature of php, 506 00:20:29,596 --> 00:20:32,196 but at least now we have access to that global variable. 507 00:20:32,196 --> 00:20:33,576 Unload. Is there anything to do? 508 00:20:34,236 --> 00:20:35,116 No, not really. 509 00:20:36,226 --> 00:20:37,356 Already unload. 
510 00:20:37,356 --> 00:20:38,906 In php you don't have to manage your memory. 511 00:20:38,906 --> 00:20:40,546 The php interpreter does it for you. 512 00:20:40,546 --> 00:20:43,266 No malloc, no free, none of that. 513 00:20:43,326 --> 00:20:44,476 It's all done for you. 514 00:20:44,896 --> 00:20:45,756 So any concerns? 515 00:20:45,836 --> 00:20:46,996 Did I make any mistakes? 516 00:20:47,176 --> 00:20:49,316 Because I literally am doing this on the fly here. 517 00:20:49,956 --> 00:20:54,136 Good, it doesn't. 518 00:20:54,136 --> 00:20:56,376 It's always going to be 0 right now. 519 00:20:56,376 --> 00:21:01,966 So what do we want to do to fix this? 520 00:21:02,166 --> 00:21:04,026 Yes, might as well go in here. 521 00:21:04,026 --> 00:21:05,536 Guess I've got to put my brackets again 522 00:21:05,536 --> 00:21:08,836 and then I'm going to do size++ and then I just have 523 00:21:08,836 --> 00:21:10,016 to make one other change here. 524 00:21:10,156 --> 00:21:13,456 I have to specify that size is global. 525 00:21:13,826 --> 00:21:14,626 Yes? Good? 526 00:21:14,896 --> 00:21:16,426 OK and we can do this any number of ways. 527 00:21:16,426 --> 00:21:17,136 So let's save this. 528 00:21:17,476 --> 00:21:20,326 Let's run speller on text and now run it 529 00:21:20,326 --> 00:21:23,366 on the Ralph file and that's wrong isn't it. 530 00:21:23,996 --> 00:21:27,866 Let's do, do I want to do this right now on the fly? 531 00:21:27,866 --> 00:21:32,776 file_exists, foreach file as word, dictionary, trim, word. 532 00:21:32,776 --> 00:21:33,726 That's OK. 533 00:21:33,866 --> 00:21:35,206 Size gets true. 534 00:21:36,876 --> 00:21:40,336 Can you see the bug? 535 00:21:40,526 --> 00:21:43,356 What? Capitalization. 536 00:21:43,356 --> 00:21:45,416 Oh, yes. Thank you, thank you very much. 537 00:21:45,416 --> 00:21:46,316 Capitalization. 538 00:21:46,536 --> 00:21:48,186 OK, so that I can fix. 539 00:21:48,256 --> 00:21:50,046 strtolower.
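The point about declaring variables global inside each function, together with the size counter, might look roughly like this. The helper name add_word is hypothetical and exists only to keep the sketch short; the global keyword and its behavior are standard PHP.

```php
<?php
// Variables defined at the top of the file.
$dictionary = [];
$size = 0;

// Hypothetical helper that records one word.
function add_word($word)
{
    // Without this declaration, $dictionary and $size inside the
    // function would be brand-new local variables.
    global $dictionary, $size;

    $dictionary[$word] = true;
    $size++;
}

// Reports how many words have been recorded.
function size()
{
    global $size;
    return $size;
}

add_word("cat");
add_word("dog");
echo size(), "\n";
```

If the global line in size() were removed, the function would see an undefined local $size instead of the counter, which is one way a reported size can come out wrong.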
540 00:21:50,046 --> 00:21:50,113 [ laughter ] 541 00:21:50,113 --> 00:21:56,586 I mean, frankly you laugh, but this is one 542 00:21:56,586 --> 00:21:59,206 of the biggest sources of bugs in your C code right? 543 00:21:59,206 --> 00:22:01,686 Was given the check function, most of you probably made a copy 544 00:22:01,686 --> 00:22:04,056 of the word, then had to force it to lower case, then you had 545 00:22:04,056 --> 00:22:05,606 to chase down all these stupid memory bugs. 546 00:22:05,876 --> 00:22:07,736 But I mean, this is actually very reasonable, 547 00:22:07,736 --> 00:22:10,826 I'm passed the word, it could be in any type of capitalization. 548 00:22:10,826 --> 00:22:13,546 I want to force it to lowercase, so strtolower is 549 00:22:13,546 --> 00:22:14,766 in fact the function for that. 550 00:22:14,766 --> 00:22:18,036 Let's go ahead and rerun Ralph and yes, it actually works. 551 00:22:18,346 --> 00:22:19,366 So what was that? 552 00:22:19,366 --> 00:22:22,046 Eight minutes of coding to implement problem set 6 553 00:22:22,046 --> 00:22:24,046 in the better choice of languages. 554 00:22:24,366 --> 00:22:25,536 But surely there must be a catch. 555 00:22:25,746 --> 00:22:26,926 So in fact there is. 556 00:22:26,986 --> 00:22:28,976 So let me actually not run Ralph, 557 00:22:29,036 --> 00:22:31,686 because recall Ralph is a terribly short file. 558 00:22:32,086 --> 00:22:34,086 Right? This is and if you've never actually looked 559 00:22:34,086 --> 00:22:36,966 in this file, this is the, this is from the lab, 560 00:22:36,966 --> 00:22:38,286 so you can actually look in this file. 561 00:22:38,286 --> 00:22:38,926 Some of you have seen it. 562 00:22:38,926 --> 00:22:39,426 So that's OK. 563 00:22:39,846 --> 00:22:41,266 So let's take a look at speller though 564 00:22:41,266 --> 00:22:43,776 on something much longer like the Holmes file. 565 00:22:43,776 --> 00:22:45,566 So this is like 6 megabytes.
566 00:22:46,046 --> 00:22:47,696 So this was a very large file. 567 00:22:47,976 --> 00:22:50,446 It's going ahead and spell checking, spitting out all 568 00:22:50,446 --> 00:22:54,186 of the bogus words and let's give it a little bit here. 569 00:22:54,906 --> 00:22:57,356 There are a lot of misspelled, quote unquote, words, 570 00:22:57,356 --> 00:22:59,056 because this was an unfamiliar text. 571 00:22:59,336 --> 00:23:04,546 All right, so words misspelled, 17,000 out of 140, 572 00:23:04,736 --> 00:23:08,046 out of actually a million words in the file. 573 00:23:08,366 --> 00:23:12,166 So this is a big file and it took roughly 2.19 seconds 574 00:23:12,226 --> 00:23:13,906 to totally spell check that file. 575 00:23:14,186 --> 00:23:17,936 So now let me shift gears to my other window here. 576 00:23:17,936 --> 00:23:18,666 Same server. 577 00:23:18,766 --> 00:23:21,966 I'm logged in as myself and I'm actually going to run, 578 00:23:21,966 --> 00:23:25,606 not the php version but the staff solution 579 00:23:25,636 --> 00:23:29,686 from pset 6 itself, which was called, speller here. 580 00:23:29,686 --> 00:23:32,276 I'm going to pass it the same text file, 581 00:23:32,626 --> 00:23:34,946 which is misspellings, text, 582 00:23:35,176 --> 00:23:38,736 Holmes. So again I'm just running the C implementation 583 00:23:38,846 --> 00:23:42,086 of the staff's implementation of dictionary and I'm going 584 00:23:42,086 --> 00:23:44,516 to hit, enter and wow. 585 00:23:45,326 --> 00:23:48,966 So maybe it's that the staff is much better at writing code 586 00:23:48,966 --> 00:23:50,806 than I am right, especially since I just wrote this 587 00:23:50,806 --> 00:23:53,566 on the fly or maybe it also has a little something to do 588 00:23:53,566 --> 00:23:54,646 with the choice of language. 589 00:23:54,646 --> 00:23:56,726 So this is one of the tradeoffs and again this theme 590 00:23:56,726 --> 00:23:58,156 of you don't get anything for free.
591 00:23:58,396 --> 00:24:01,086 So yes, I really whittled down my development time from what? 592 00:24:01,426 --> 00:24:04,016 Ten hours, fifteen hours, twenty plus hours for some 593 00:24:04,016 --> 00:24:05,196 of you, took eight minutes. 594 00:24:05,196 --> 00:24:07,046 Which might be a little disheartening certainly, 595 00:24:07,396 --> 00:24:10,876 but look at the price I'm going to pay in the long run 596 00:24:10,876 --> 00:24:14,056 if I use this thing to spell check very large bodies of text. 597 00:24:14,276 --> 00:24:15,756 So development time went down, 598 00:24:15,976 --> 00:24:18,236 but running time significantly went up. 599 00:24:18,236 --> 00:24:19,626 You notice the difference here? 600 00:24:20,356 --> 00:24:21,666 Notice the bug here. 601 00:24:21,666 --> 00:24:21,733 [ laughter ] 602 00:24:21,733 --> 00:24:24,816 OK, well we'll ignore that little detail 603 00:24:24,816 --> 00:24:26,656 that the words misspelled is slightly different. 604 00:24:26,806 --> 00:24:28,656 Because there is in fact a little bug somewhere. 605 00:24:29,806 --> 00:24:32,716 But trust me that it's the choice of language. 606 00:24:32,716 --> 00:24:34,466 I should get credit for pointing out the bug 607 00:24:34,466 --> 00:24:36,016 and letting you realize it. 608 00:24:36,186 --> 00:24:40,866 Maybe? So notice the order of magnitude though, 2.19 seconds 609 00:24:40,866 --> 00:24:44,256 versus .4 seconds and this was still a relatively small file. 610 00:24:44,256 --> 00:24:46,136 It's only 6 megabytes and we deal 611 00:24:46,136 --> 00:24:48,076 with much larger data sets these days. 612 00:24:48,076 --> 00:24:49,006 So what's the takeaway? 613 00:24:49,236 --> 00:24:53,946 Well the theme in interpreted languages like php or JavaScript 614 00:24:53,946 --> 00:24:57,166 or Perl or Ruby or Python, there's a whole bunch 615 00:24:57,166 --> 00:24:59,416 of popular interpreted languages.
616 00:24:59,736 --> 00:25:02,866 They are very convenient and they give you a lot of features 617 00:25:02,866 --> 00:25:04,766 out of the box for free, so that you don't have 618 00:25:04,846 --> 00:25:07,056 to reinvent the wheels you guys have been implementing 619 00:25:07,056 --> 00:25:07,746 all semester. 620 00:25:08,076 --> 00:25:09,246 But you really pay for it. 621 00:25:09,246 --> 00:25:11,886 One, you have no idea how things are implemented underneath the 622 00:25:11,886 --> 00:25:13,596 hood, which may or may not be a bad thing. 623 00:25:13,836 --> 00:25:16,946 But certainly you can't fine tune your code very effectively 624 00:25:16,946 --> 00:25:19,506 and if the goal, especially you know, for a job 625 00:25:19,506 --> 00:25:22,566 or for your research is really to maximize performance, 626 00:25:22,826 --> 00:25:25,266 you really need to pick up the right tool for the job. 627 00:25:25,656 --> 00:25:25,766 Yes? 628 00:25:26,546 --> 00:25:31,086 >> So when you run a reference and place it in the table in C, 629 00:25:31,586 --> 00:25:35,156 there [inaudible] and here you're basically referencing 630 00:25:35,706 --> 00:25:39,396 this table and not there [inaudible]? 631 00:25:39,396 --> 00:25:40,296 >> That's correct. 632 00:25:40,386 --> 00:25:42,896 So and I'm also taking advantage of certain features 633 00:25:42,896 --> 00:25:44,966 of the language. So the point here is 634 00:25:44,966 --> 00:25:47,866 that in my own dictionary implementation 635 00:25:48,096 --> 00:25:49,956 and let me open my commented one 636 00:25:49,956 --> 00:25:51,206 which you guys have a printout of. 637 00:25:51,526 --> 00:25:53,376 So in my implementation notice 638 00:25:53,376 --> 00:25:56,496 that in my check function I'm kind of blindly checking 639 00:25:56,496 --> 00:25:59,716 if there is in fact a word at a given location in my dictionary, 640 00:26:00,006 --> 00:26:01,976 but there might not be anything there.
641 00:26:01,976 --> 00:26:04,566 And as you can point out in C, very bad things happen 642 00:26:04,566 --> 00:26:06,976 if you just index into an array, anywhere you want. 643 00:26:07,216 --> 00:26:08,856 So in php and in a lot of languages 644 00:26:08,936 --> 00:26:10,416 like this, that's not a problem. 645 00:26:10,416 --> 00:26:13,076 You will simply by default get back a value of false 646 00:26:13,336 --> 00:26:16,216 or it will trigger a warning or some kind of notice 647 00:26:16,456 --> 00:26:21,036 and in fact I'm being a little tricky here in that the line 648 00:26:21,036 --> 00:26:23,936 of code that I mentioned at the very beginning of speller 649 00:26:23,936 --> 00:26:26,226 that suppresses notices and warnings, 650 00:26:26,546 --> 00:26:28,756 what you just described, indexing into an array 651 00:26:28,756 --> 00:26:31,676 where you should not, that in php actually triggers usually 652 00:26:31,676 --> 00:26:32,476 what's called a notice. 653 00:26:32,736 --> 00:26:35,066 You will see printed on the screen, notice, 654 00:26:35,106 --> 00:26:38,446 indexing into an array, not wise or something to that effect. 655 00:26:38,666 --> 00:26:41,316 But with this line of code, I'm actually suppressing that notice 656 00:26:41,666 --> 00:26:43,496 because I'm very comfortable knowing, 657 00:26:43,496 --> 00:26:46,146 accepting that I know there might not be anything there. 658 00:26:46,526 --> 00:26:50,326 So with this case I'm telling php, quiet, I don't want to know 659 00:26:50,326 --> 00:26:51,926 about these notices, this is OK. 660 00:26:52,176 --> 00:26:53,956 But there are other ways to deal with this. 661 00:26:54,006 --> 00:26:55,586 You could actually do something like this. 662 00:26:55,686 --> 00:26:57,316 So let me make one last comment on this 663 00:26:57,986 --> 00:26:59,036 and then we'll forge ahead. 664 00:26:59,146 --> 00:27:00,346 You could, I could have done this.
665 00:27:00,606 --> 00:27:12,226 If dictionary, strtolower, strtolower word, isset. 666 00:27:12,606 --> 00:27:14,856 So there's a function in php called, isset, 667 00:27:15,136 --> 00:27:17,996 that solves exactly the problem you are worried 668 00:27:17,996 --> 00:27:20,366 about which is a check before actually going there, 669 00:27:20,366 --> 00:27:22,106 is there anything there and only 670 00:27:22,106 --> 00:27:23,626 if there is does it return a value. 671 00:27:24,766 --> 00:27:28,536 Yes? So I wouldn't. 672 00:27:28,536 --> 00:27:30,446 I would replace one with the other at that point. 673 00:27:30,616 --> 00:27:31,706 I would replace one with the other. 674 00:27:31,706 --> 00:27:34,106 So, but I made the conscious decision to kind of gloss 675 00:27:34,106 --> 00:27:36,316 over some of these details just in the interest of simplicity, 676 00:27:36,646 --> 00:27:39,206 but frankly it's reasonable to do 677 00:27:39,206 --> 00:27:42,146 such if you know what you're doing, as a design decision. 678 00:27:42,146 --> 00:27:42,866 Other questions? 679 00:27:43,926 --> 00:27:46,866 OK, so then a word on why did we just put you 680 00:27:46,866 --> 00:27:49,336 through problem set 6 and problem set 5 681 00:27:49,336 --> 00:27:50,576 and problem set 4 right? 682 00:27:50,736 --> 00:27:53,046 So one, there are different tools for the job. 683 00:27:53,046 --> 00:27:55,256 Like when you are doing various research things 684 00:27:55,256 --> 00:27:57,376 or you are trying to actually you know, 685 00:27:57,376 --> 00:27:59,816 implement the next best search engine or what not, 686 00:28:00,026 --> 00:28:02,686 you really do sometimes want to get very low level, 687 00:28:02,686 --> 00:28:04,596 closer to the operating system, into the hardware, 688 00:28:04,596 --> 00:28:05,826 so you can actually eke 689 00:28:05,826 --> 00:28:08,396 out as much performance as you might want.
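The isset alternative mentioned above can be sketched as follows. The two-word table and the choice to pass it as a parameter are assumptions made to keep the example self-contained; isset() and strtolower() are standard PHP functions. isset() asks whether a key is present (and not null) without reading a missing element, so no notice is triggered and nothing needs to be suppressed.

```php
<?php
// Hypothetical table standing in for the loaded dictionary.
$dictionary = ["cat" => true, "dog" => true];

// Returns true only if the lowercased word is a key in the table.
function check($word, $dictionary)
{
    // Normalize capitalization first, then test for the key's presence.
    return isset($dictionary[strtolower($word)]);
}

var_dump(check("CAT", $dictionary));
var_dump(check("bird", $dictionary));
```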
690 00:28:08,576 --> 00:28:10,006 But there's also something to be said 691 00:28:10,006 --> 00:28:12,626 for actually understanding what this thing is. 692 00:28:12,626 --> 00:28:15,776 I would argue that there are a lot of software developers out there 693 00:28:15,966 --> 00:28:17,266 who completely take for granted 694 00:28:17,266 --> 00:28:19,496 that there is this thing called an array, you can access 695 00:28:19,496 --> 00:28:21,176 into it, with no appreciation 696 00:28:21,176 --> 00:28:23,276 of how it's actually implemented underneath the hood 697 00:28:23,276 --> 00:28:26,886 and therefore when to use it and when not to use it as well. 698 00:28:27,156 --> 00:28:28,946 And also it's around this time of the semester, 699 00:28:28,946 --> 00:28:31,756 especially with pset 6, where late day usage really spikes, 700 00:28:31,756 --> 00:28:34,246 so not to worry if you're kind of still working hard 701 00:28:34,246 --> 00:28:35,646 on pset 6, it's typical. 702 00:28:35,646 --> 00:28:36,936 Especially at the end of the semester. 703 00:28:37,316 --> 00:28:39,216 But realize, the common sentiment at this point 704 00:28:39,216 --> 00:28:42,386 in the term is you know, my god like, thank god the end is 705 00:28:42,386 --> 00:28:44,046 in sight and you know, you might feel 706 00:28:44,046 --> 00:28:46,626 like you've learned something, you might feel fairly gratified 707 00:28:46,626 --> 00:28:48,526 at 4 am after finishing these things, 708 00:28:48,566 --> 00:28:51,476 but many of you might think eh, this isn't really for me. 709 00:28:51,476 --> 00:28:52,856 I don't want to go through that again. 710 00:28:52,856 --> 00:28:55,706 But realize, that is not necessarily what programming is 711 00:28:55,706 --> 00:28:58,456 and what programming computers is all about.
712 00:28:58,496 --> 00:29:00,736 So now that we're finally taking this step up, 713 00:29:01,056 --> 00:29:04,156 realize that there is this great element of fun, I think, 714 00:29:04,156 --> 00:29:07,916 of programming, that allows you to solve real world problems. 715 00:29:07,916 --> 00:29:10,366 Whether it's some new site, again or events site, 716 00:29:10,366 --> 00:29:13,566 the Twitter site, relatively easily including simple things 717 00:29:13,566 --> 00:29:14,716 like the spell checker. 718 00:29:15,046 --> 00:29:17,096 So you simply have to pick the right tool for the job. 719 00:29:17,096 --> 00:29:18,906 And that's why I think final projects around this time 720 00:29:18,906 --> 00:29:20,856 of the year tend to be particularly fun 721 00:29:21,086 --> 00:29:23,616 because you can finally, actually bite off something 722 00:29:23,616 --> 00:29:24,886 at your choosing and not us. 723 00:29:25,866 --> 00:29:26,546 So that was a lot. 724 00:29:26,546 --> 00:29:30,306 Why don't we take our five minute break now. 725 00:29:30,986 --> 00:29:32,396 All right. 726 00:29:35,606 --> 00:29:38,866 So it's kind of spooky that it's already time 727 00:29:38,866 --> 00:29:42,626 for this little announcement, but indeed, even though a month 728 00:29:42,626 --> 00:29:45,926 or so remains in the semester, do realize that we'll already be 729 00:29:45,926 --> 00:29:50,746 on the search for TF's and for CA's for next fall, 2010. 730 00:29:50,746 --> 00:29:53,486 Know that the role of TF involves leading section, 731 00:29:53,626 --> 00:29:55,726 working with your students, grading their problem sets, 732 00:29:55,726 --> 00:29:59,016 office hours and the like and CA'ing by contrast is a role 733 00:29:59,016 --> 00:30:02,386 that we've targeted particularly at alumni at CS50. 
734 00:30:02,386 --> 00:30:04,906 So almost all of the CA's that you may have met in the lab 735 00:30:04,906 --> 00:30:07,926 or on email lists this year, are former CS50 students 736 00:30:07,926 --> 00:30:11,356 who have offered to contribute 2 hours, only 2 hours 737 00:30:11,356 --> 00:30:14,056 of work per week, on a volunteer basis working 738 00:30:14,056 --> 00:30:17,226 with their fellow classmates and their successors in this class 739 00:30:17,436 --> 00:30:20,316 and the lab in what is pretty much the most intense, 740 00:30:20,316 --> 00:30:21,866 fun part of the course and office hours. 741 00:30:22,406 --> 00:30:25,176 Realize that if you do choose to join us next year, 742 00:30:25,456 --> 00:30:28,116 it's a little more fun being on the other side of things 743 00:30:28,116 --> 00:30:29,386 and actually running the white board 744 00:30:29,386 --> 00:30:30,886 and not running to the white board. 745 00:30:31,226 --> 00:30:34,086 So more, just go, we'll post a link 746 00:30:34,086 --> 00:30:36,686 on the course's homepage before long via which you can apply 747 00:30:36,686 --> 00:30:38,326 by telling us a little something about yourself 748 00:30:38,326 --> 00:30:40,016 and what you plan for next term. 749 00:30:40,286 --> 00:30:45,036 So in the time that has passed, Charles has made let's see 750 00:30:45,036 --> 00:30:51,636 about $60,000, so he's doing quite well. 751 00:30:51,636 --> 00:30:53,306 So we give him our official blessing 752 00:30:53,306 --> 00:30:54,866 and challenge you to best him. 753 00:30:54,866 --> 00:30:57,156 But realize, this thing exists one just for fun, 754 00:30:57,156 --> 00:30:58,406 but also so that you can actually play 755 00:30:58,406 --> 00:31:01,226 with the staff's own implementation of CS50 finance, 756 00:31:01,266 --> 00:31:02,046 which you can get 757 00:31:02,046 --> 00:31:03,896 to by following the appropriate link here. 
758 00:31:04,136 --> 00:31:06,866 But realize too, you need not implement your version 759 00:31:06,866 --> 00:31:09,366 of pset 7 exactly as the staff has. 760 00:31:09,596 --> 00:31:11,596 We have simply offered it for consideration. 761 00:31:11,596 --> 00:31:14,236 And you may have realized already, you can look at some 762 00:31:14,236 --> 00:31:15,556 of the code for the site. 763 00:31:15,556 --> 00:31:18,426 You're welcome to look at the HTML and the CSS. 764 00:31:18,426 --> 00:31:20,136 What you'll find is that you don't have access to the php 765 00:31:20,136 --> 00:31:22,676 and this is kind of a rule of thumb with web development. 766 00:31:23,016 --> 00:31:26,776 It's pretty reasonable and it's technologically completely 767 00:31:26,776 --> 00:31:30,186 possible these days to look at other people's source code 768 00:31:30,246 --> 00:31:34,386 with regard to HTML and CSS and even, unless they've jumped 769 00:31:34,386 --> 00:31:36,496 through some hoops, JavaScript as well. 770 00:31:36,496 --> 00:31:38,776 And I can't emphasize enough, if you like this kind of stuff, 771 00:31:38,916 --> 00:31:42,946 the best way to learn about how to do new things is to look 772 00:31:42,946 --> 00:31:44,926 at a site you like and even though it might be a little 773 00:31:44,926 --> 00:31:46,896 complicated at first glance, start poking 774 00:31:46,896 --> 00:31:50,036 around with its source code and learn from someone else's sites. 775 00:31:50,176 --> 00:31:52,126 There's really not much intellectual property 776 00:31:52,386 --> 00:31:54,656 when it comes to XHTML and the layout of the site. 777 00:31:54,656 --> 00:31:56,336 The juicy stuff is in the JavaScript, 778 00:31:56,586 --> 00:31:58,366 which we'll see today and also in the php. 779 00:31:58,366 --> 00:32:02,336 The former of which we can actually obfuscate in some way 780 00:32:02,336 --> 00:32:04,956 to make it harder for people to take your intellectual property.
781 00:32:05,246 --> 00:32:07,216 But let me bring your attention to this thing here. 782 00:32:07,316 --> 00:32:11,116 So pset 7 recommends that you install a few free tools 783 00:32:11,116 --> 00:32:13,536 and we encourage you to use Firefox only 784 00:32:13,536 --> 00:32:15,906 because it's really useful for development purposes, 785 00:32:16,036 --> 00:32:17,436 but ultimately you should be testing 786 00:32:17,436 --> 00:32:20,626 as the spec says your code on multiple browsers. 787 00:32:20,626 --> 00:32:23,356 Because this is a very common thing for, 788 00:32:24,196 --> 00:32:26,226 you will find the hard way unfortunately, 789 00:32:26,496 --> 00:32:28,946 that the web browser manufacturers have always been 790 00:32:28,946 --> 00:32:31,116 at odds for years, 10, 15 years now, 791 00:32:31,116 --> 00:32:33,566 where Microsoft interprets something in the specification 792 00:32:33,566 --> 00:32:37,576 for XHTML one way, Mozilla interprets it another way, 793 00:32:37,706 --> 00:32:38,886 Apple interprets it another way, 794 00:32:39,046 --> 00:32:41,056 so you will see slight differences, 795 00:32:41,176 --> 00:32:43,286 even on your relatively simple websites 796 00:32:43,476 --> 00:32:44,836 across multiple browsers. 797 00:32:44,836 --> 00:32:46,786 And one of the things that drives us nuts 798 00:32:46,786 --> 00:32:49,196 for the course's website is making it look 799 00:32:49,196 --> 00:32:51,966 as best we can the same, no matter the OS 800 00:32:51,966 --> 00:32:54,406 and no matter the browser that your user is using. 801 00:32:54,406 --> 00:32:57,076 So we do expect that you play with at least two browsers 802 00:32:57,076 --> 00:33:00,026 for your project, but with Firefox you have some really 803 00:33:00,116 --> 00:33:00,996 useful tools. 804 00:33:00,996 --> 00:33:04,306 So I had this little bug at the bottom of my window, 805 00:33:04,306 --> 00:33:05,756 because I installed an extension. 
806 00:33:05,756 --> 00:33:08,326 It's free software for Firefox called Firebug 807 00:33:08,576 --> 00:33:10,026 and it's a few different things. 808 00:33:10,106 --> 00:33:15,246 It is a, it allows you to view the websites, whether it's your 809 00:33:15,246 --> 00:33:17,706 or someone else's, XHTML or HTML, 810 00:33:17,966 --> 00:33:19,836 in a much more user friendly way. 811 00:33:19,836 --> 00:33:20,916 So notice what I've done. 812 00:33:20,916 --> 00:33:23,586 After clicking the bug, this little pop up opened up 813 00:33:23,626 --> 00:33:26,636 and what it's showing me with much nicer indentation 814 00:33:26,636 --> 00:33:29,036 and nesting, the structure of this webpage. 815 00:33:29,036 --> 00:33:31,596 By contrast, if you look at the course's website, 816 00:33:31,916 --> 00:33:34,416 it's an absolute mess underneath the hood, 817 00:33:34,496 --> 00:33:36,026 not because we were lazy 818 00:33:36,166 --> 00:33:40,166 and didn't really practice good style, but because a lot 819 00:33:40,166 --> 00:33:43,526 of the website is dynamically generated by php scripts 820 00:33:43,526 --> 00:33:46,796 that we wrote and even though we kept our php code 821 00:33:46,796 --> 00:33:49,106 and our JavaScript code pretty neat, when you start 822 00:33:49,106 --> 00:33:52,046 to combine all of these things, things just get messy 823 00:33:52,046 --> 00:33:55,476 and there's really no point trying to make this look clean, 824 00:33:55,626 --> 00:33:57,736 because for the most part it's only a browser 825 00:33:57,736 --> 00:33:58,976 that has to understand this. 
826 00:33:58,976 --> 00:34:02,676 So we elaborated on this in the pset spec, but for learning, 827 00:34:02,876 --> 00:34:04,836 this is frankly a bit of a nightmare trying 828 00:34:04,836 --> 00:34:06,366 to find your way around this and even 829 00:34:06,366 --> 00:34:08,076 when we reimplemented Google last week, 830 00:34:08,316 --> 00:34:11,426 I relied on control F, just to find things in the document. 831 00:34:11,636 --> 00:34:14,486 But Firebug is useful for XHTML and HTML for this reason. 832 00:34:14,696 --> 00:34:16,346 You get a much nicer view of the world. 833 00:34:16,346 --> 00:34:18,486 Which means if I want to see what's going on in here, 834 00:34:18,796 --> 00:34:23,266 I can actually dive in and click on these pluses. 835 00:34:23,266 --> 00:34:24,366 So it looks like the body 836 00:34:24,366 --> 00:34:27,466 of my page has this thing called the div, which I've given an ID 837 00:34:27,466 --> 00:34:30,566 of raprum, there's a table in there that's laying things out, 838 00:34:30,566 --> 00:34:32,856 a table body, a bunch of table rows 839 00:34:32,856 --> 00:34:34,116 and more interesting is this. 840 00:34:34,426 --> 00:34:38,536 If you start to hover over tags, you'll see top left, in blue, 841 00:34:38,536 --> 00:34:40,086 what it is I'm hovering over. 842 00:34:40,346 --> 00:34:42,756 So apparently we are actually using very deliberately, 843 00:34:42,756 --> 00:34:45,136 an invisible table whose border is 0, 844 00:34:45,216 --> 00:34:46,416 thereby making it invisible. 845 00:34:46,666 --> 00:34:48,956 So notice, this TD, this table data element, 846 00:34:49,226 --> 00:34:50,696 is the white space at top right. 847 00:34:50,696 --> 00:34:52,046 If I scroll down a little further, 848 00:34:52,046 --> 00:34:55,306 this guy here is apparently that bar in the top middle 849 00:34:55,306 --> 00:34:57,566 and as you might guess over here, that's the guy 850 00:34:57,566 --> 00:34:58,686 on the right hand side.
851 00:34:58,686 --> 00:35:01,456 If I dive into the next table row, notice ah, 852 00:35:01,506 --> 00:35:03,316 there's the left hand side of the page, 853 00:35:03,316 --> 00:35:04,936 here's the middle, here's the right. 854 00:35:04,986 --> 00:35:07,396 And so if you see something on a website that you like 855 00:35:07,396 --> 00:35:10,286 or you wonder, wow how did they implement this it's really cool, 856 00:35:10,536 --> 00:35:13,136 this is a wonderful way of actually wrapping your mind 857 00:35:13,136 --> 00:35:14,116 around how they did it. 858 00:35:14,406 --> 00:35:16,556 Now CSS isn't something you need to worry so much 859 00:35:16,556 --> 00:35:18,706 about for the course, but if you're coming to the course 860 00:35:18,706 --> 00:35:20,646 with some background, realize too 861 00:35:20,646 --> 00:35:23,546 that Firefox lets you see the CSS that's being applied 862 00:35:23,546 --> 00:35:24,016 to a site. 863 00:35:24,256 --> 00:35:26,426 If you click on an individual element like this one 864 00:35:26,426 --> 00:35:28,986 in the middle, on the right hand side you will see all 865 00:35:28,986 --> 00:35:32,186 of the CSS rules that have been applied to that element 866 00:35:32,456 --> 00:35:35,156 and from top to bottom you'll see how they cascade. 867 00:35:35,416 --> 00:35:36,966 So again, I won't spend much time on this, 868 00:35:36,966 --> 00:35:39,216 only because it's not all that enlightening intellectually 869 00:35:39,216 --> 00:35:40,456 for the course's purposes, 870 00:35:40,616 --> 00:35:42,746 but this is a very common technology, 871 00:35:42,746 --> 00:35:44,456 but for the most part it's not necessary 872 00:35:44,456 --> 00:35:45,726 for our two problem sets. 873 00:35:45,776 --> 00:35:46,856 But realize this is there. 
874 00:35:46,856 --> 00:35:50,366 And finally, which we'll start using on Wednesday this week, 875 00:35:50,566 --> 00:35:55,396 we have a Script tab which actually lets you debug things 876 00:35:55,936 --> 00:35:58,466 that happen to be written in a language called JavaScript. 877 00:35:58,466 --> 00:36:00,946 So this will be invaluable too; think of this 878 00:36:00,946 --> 00:36:04,926 as a GDB substitute. And also useful is one other thing. 879 00:36:04,926 --> 00:36:07,726 So on the course's website, we have this link here, to web. 880 00:36:08,346 --> 00:36:09,746 Firebug is what I just mentioned, 881 00:36:09,746 --> 00:36:11,006 Firefox is the browser. 882 00:36:11,286 --> 00:36:13,176 Then there's this other debugger, which we may talk 883 00:36:13,176 --> 00:36:15,216 about a bit on Wednesday, then there's this thing 884 00:36:15,216 --> 00:36:16,456 which we will talk about today. 885 00:36:16,606 --> 00:36:19,176 Live HTTP Headers and then Web Developer. 886 00:36:19,526 --> 00:36:22,906 This Web Developer toolbar is what gives me this menu here. 887 00:36:23,066 --> 00:36:24,926 And it also appears under your tools menu. 888 00:36:25,216 --> 00:36:29,046 So very important these days or very common, is for a website 889 00:36:29,046 --> 00:36:31,796 to be designed with a specific browser size 890 00:36:31,796 --> 00:36:33,256 or window size in mind. 891 00:36:33,756 --> 00:36:36,396 Most common these days is probably to assume 892 00:36:36,696 --> 00:36:45,176 that a user's screen is 1024 pixels wide by 768 pixels tall. 893 00:36:45,176 --> 00:36:46,346 And this wasn't always the case. 894 00:36:46,346 --> 00:36:47,836 Facebook, a couple of years ago, 895 00:36:47,996 --> 00:36:50,876 actually assumed a different monitor size 896 00:36:51,176 --> 00:36:54,176 which was 800 pixels by 600 pixels and a lot 897 00:36:54,176 --> 00:36:56,976 of websites did that, including maybe even CNN 898 00:36:57,136 --> 00:36:58,896 up until a couple of years ago.
899 00:36:58,896 --> 00:37:01,366 Well as technology proceeds and as a lot of us start 900 00:37:01,366 --> 00:37:04,266 to acquire widescreen laptops, your screen resolution, 901 00:37:04,266 --> 00:37:06,906 the number of pixels left to right, top to bottom, increases. 902 00:37:07,156 --> 00:37:09,216 So even Facebook a year or two ago, 903 00:37:09,446 --> 00:37:11,446 expanded its site somewhat to be wider. 904 00:37:11,536 --> 00:37:13,006 So why is this relevant? 905 00:37:13,256 --> 00:37:16,176 Well at home I'm sort of you know, indulgent enough 906 00:37:16,176 --> 00:37:17,356 to have a 30 inch LCD, 907 00:37:17,356 --> 00:37:20,766 which means if I designed CS50's website on my monitor, 908 00:37:21,066 --> 00:37:24,166 like you'd be scrolling left and right all day long. 909 00:37:24,166 --> 00:37:26,446 Because I would just assume that you have my monitor. 910 00:37:26,726 --> 00:37:28,486 But tools like this, Web Developer, 911 00:37:28,786 --> 00:37:30,166 let you do little tricks like this. 912 00:37:30,236 --> 00:37:33,696 I can click in the resize menu, 800x600 913 00:37:33,926 --> 00:37:36,836 and it will just make my window be the size 914 00:37:37,116 --> 00:37:39,196 of an older monitor 800x600. 915 00:37:39,196 --> 00:37:41,836 So realize these kinds of tools can just save your, 916 00:37:41,836 --> 00:37:45,736 help keep you sane, but actually in this pset 5 survey, 917 00:37:46,046 --> 00:37:48,846 one or more of you commented that you did have 918 00:37:48,876 --> 00:37:52,286 to scroll rightward on the course's website and only 919 00:37:52,286 --> 00:37:54,286 by having that pointed out to us did we realize it. 920 00:37:54,286 --> 00:37:56,466 Frankly I don't notice some of these bugs on my own monitor 921 00:37:56,636 --> 00:37:58,326 and it was because we had made a mistake 922 00:37:58,326 --> 00:37:59,736 down below and it was too wide. 923 00:37:59,736 --> 00:38:00,856 But look, we fixed it.
924 00:38:01,016 --> 00:38:01,756 Now it looks better. 925 00:38:02,306 --> 00:38:02,626 Anyhow. 926 00:38:02,626 --> 00:38:04,006 [ laughter ] 927 00:38:04,006 --> 00:38:05,786 So let's actually do something here. 928 00:38:05,786 --> 00:38:08,836 So last week we looked at XHTML and php 929 00:38:08,836 --> 00:38:12,136 and that gives us this ability to have a client interface 930 00:38:12,136 --> 00:38:13,526 with the server, back and forth 931 00:38:13,756 --> 00:38:15,786 and the server could generate dynamic output, 932 00:38:16,026 --> 00:38:18,196 thereby influencing the user's experience. 933 00:38:18,196 --> 00:38:19,946 And that's what pset 7 is all about. 934 00:38:20,146 --> 00:38:23,646 But we had no form of validation really, other than a couple 935 00:38:23,646 --> 00:38:25,796 of quick and dirty checks on the server side 936 00:38:26,096 --> 00:38:28,456 and in pset 7 you'll know that it's a little annoying 937 00:38:28,686 --> 00:38:32,016 that if the user provides a bogus input, 938 00:38:32,386 --> 00:38:35,766 we encourage you just to say, to apologize to them 939 00:38:35,766 --> 00:38:36,946 with the apology function. 940 00:38:37,216 --> 00:38:38,576 Where you're saying, the message saying, 941 00:38:38,636 --> 00:38:40,446 invalid username, go back. 942 00:38:40,966 --> 00:38:43,386 Well that's kind of annoying on a modern website if you have 943 00:38:43,386 --> 00:38:46,236 to again, per last week's discussion, you have to go back 944 00:38:46,686 --> 00:38:49,476 and risk your form not even being filled out for you. 945 00:38:49,476 --> 00:38:50,696 You lose all of that data. 946 00:38:51,076 --> 00:38:53,536 So one of the very common approaches these days is 947 00:38:53,536 --> 00:38:55,626 for websites to use a bit of JavaScript. 948 00:38:55,996 --> 00:38:59,696 So JavaScript is also an interpreted language. 
949 00:38:59,996 --> 00:39:03,256 So an interpreted language again is one that's executed line 950 00:39:03,256 --> 00:39:05,006 by line, top to bottom, left to right. 951 00:39:05,076 --> 00:39:07,056 It's not compiled into 0's and 1's. 952 00:39:07,056 --> 00:39:10,566 So it looks like English or it looks like actual source code. 953 00:39:10,976 --> 00:39:14,276 So in this case, JavaScript is a client side 954 00:39:14,276 --> 00:39:15,306 programming language. 955 00:39:15,336 --> 00:39:17,726 Php is typically server side, 956 00:39:17,726 --> 00:39:20,006 but there's no reason you can't run it on your own Mac or PC, 957 00:39:20,116 --> 00:39:21,716 but typically it's server side. 958 00:39:22,036 --> 00:39:23,546 JavaScript is client side. 959 00:39:23,776 --> 00:39:25,306 Which means when you visit a webpage 960 00:39:25,306 --> 00:39:27,416 that has some JavaScript code inside of it, 961 00:39:27,686 --> 00:39:31,036 that JavaScript code gets downloaded to your browser along 962 00:39:31,036 --> 00:39:34,446 with the gifs and the jpgs, along with the HTML and the CSS, 963 00:39:34,936 --> 00:39:36,326 all of it comes to your browser. 964 00:39:36,536 --> 00:39:39,566 So it's your browser's job to render, 965 00:39:39,646 --> 00:39:42,606 to display not only the XHTML and the graphics and all that, 966 00:39:42,836 --> 00:39:44,966 but also to execute the JavaScript code, 967 00:39:44,966 --> 00:39:46,426 top to bottom, left to right. 968 00:39:46,756 --> 00:39:47,936 And so it's with JavaScript 969 00:39:47,936 --> 00:39:50,706 that you can actually enhance the user's experience client 970 00:39:50,756 --> 00:39:54,086 side and many of you probably these days completely take 971 00:39:54,086 --> 00:39:55,836 for granted a website like Google Maps. 972 00:39:56,336 --> 00:39:59,886 Right? We can go up here and search for we're at 33, 973 00:39:59,886 --> 00:40:04,546 all right let's do the science center, 1 Oxford Street, 02138. 
974 00:40:04,766 --> 00:40:07,476 So this was a huge leap forward a few years ago 975 00:40:07,476 --> 00:40:10,036 when you didn't have to click these stupid up, down, left, 976 00:40:10,036 --> 00:40:12,346 right buttons, which then reload the whole page 977 00:40:12,346 --> 00:40:13,596 and show you a new rectangle. 978 00:40:13,636 --> 00:40:15,666 Reload the whole page, show you a new rectangle. 979 00:40:15,926 --> 00:40:18,736 I mean frankly most of us and rightfully so, probably take 980 00:40:18,736 --> 00:40:21,456 for granted the fact that I can click and drag like this. 981 00:40:21,686 --> 00:40:23,126 But notice if I do it fast enough, 982 00:40:23,126 --> 00:40:25,116 what seems to be happening in the top left corner? 983 00:40:25,696 --> 00:40:31,616 It's yes, so it's gray for a moment and then it downloads. 984 00:40:31,616 --> 00:40:33,506 Well this is what's called Ajax. 985 00:40:33,506 --> 00:40:36,256 So Ajax, asynchronous JavaScript and XML, 986 00:40:36,256 --> 00:40:38,426 it's more of a buzz phrase than anything, but it refers 987 00:40:38,426 --> 00:40:41,816 to the use of JavaScript, this interpreted client side language 988 00:40:42,096 --> 00:40:44,726 to fetch more information from a server 989 00:40:44,936 --> 00:40:47,666 and insert it inside of the webpage instead 990 00:40:47,666 --> 00:40:50,396 of forcing the user to reload their whole screen. 991 00:40:50,646 --> 00:40:54,036 So we actually do this again, on the course's homepage right now 992 00:40:54,036 --> 00:40:55,246 with regard to the big board. 993 00:40:55,456 --> 00:40:58,756 This thing updates itself every 30, Charles, doing well again. 994 00:40:58,756 --> 00:41:00,636 [ laughter ] 995 00:41:00,636 --> 00:41:03,206 This thing updates itself I think every 30 seconds 996 00:41:03,206 --> 00:41:03,546 or minute. 997 00:41:03,546 --> 00:41:05,426 I forget what timer I set for it.
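The Ajax idea described here, fetching a little data and folding it into the page rather than reloading everything, can be sketched roughly as follows. This is only an illustration under assumptions: the function name `applyUpdate`, the JSON shape, the `/bigboard.json` URL, and the 30-second interval are all hypothetical and are not the course site's actual code.

```javascript
// Hypothetical sketch: merge a server's JSON reply into the data behind a
// page, touching only the entries that actually changed.
function applyUpdate(board, jsonText) {
    const update = JSON.parse(jsonText);
    const changed = [];
    for (const [name, score] of Object.entries(update)) {
        if (board[name] !== score) {
            board[name] = score;   // only this entry would be redrawn
            changed.push(name);
        }
    }
    return changed;
}

// In a browser this could run on a timer instead of a full page reload
// (the URL and the interval are assumptions):
// setInterval(async () => {
//     const reply = await fetch("/bigboard.json");
//     applyUpdate(board, await reply.text());
// }, 30000);
```

The point of the design is that the request happens in the background and only the changed values are replaced, so the rest of the page stays put.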
998 00:41:05,606 --> 00:41:06,866 But I thought it would be annoying 999 00:41:06,866 --> 00:41:08,506 if a student visiting the webpage all 1000 00:41:08,506 --> 00:41:12,066 of a sudden has the whole page refresh and it redownloads all 1001 00:41:12,066 --> 00:41:14,216 of that content, which can be slow just 1002 00:41:14,216 --> 00:41:15,806 to update a couple of values. 1003 00:41:15,806 --> 00:41:17,996 So in fact, if we leave this page running, which I will 1004 00:41:17,996 --> 00:41:21,306 in the background, it's going to keep updating itself in line. 1005 00:41:21,306 --> 00:41:23,716 It's going to use Ajax and do this more seamlessly. 1006 00:41:23,966 --> 00:41:26,176 So this was a huge leap forward because Google, 1007 00:41:26,306 --> 00:41:28,756 when its competitors at the time were like MapQuest 1008 00:41:28,756 --> 00:41:31,626 and Yahoo, completely left them in the dust with this feature 1009 00:41:31,626 --> 00:41:32,896 and you're coming to expect this. 1010 00:41:32,896 --> 00:41:36,386 Facebook is much more laden with Ajax these days, for better 1011 00:41:36,386 --> 00:41:39,516 or for worse and what this means is that you don't have to click 1012 00:41:39,516 --> 00:41:41,006 and reload as many pages. 1013 00:41:41,006 --> 00:41:42,086 Things just kind of update. 1014 00:41:42,256 --> 00:41:45,016 A lot of people freaked out I'm told over this new live feed 1015 00:41:45,016 --> 00:41:45,826 or something like that. 1016 00:41:45,826 --> 00:41:46,776 Well that's Ajax. 1017 00:41:46,826 --> 00:41:49,706 The fact that you can see your friends' like status updates 1018 00:41:49,706 --> 00:41:52,366 in real time and they just kind of, I think they, 1019 00:41:52,716 --> 00:41:55,036 I'm told they get inserted into the page and then kind 1020 00:41:55,036 --> 00:41:57,396 of move the rest of the status updates downward.
1021 00:41:57,636 --> 00:42:00,456 That's all happening in line because some people 1022 00:42:00,456 --> 00:42:03,636 at Facebook wrote some code in JavaScript, they downloaded it 1023 00:42:03,636 --> 00:42:07,516 to your browser by embedding it in Facebook.com's webpages 1024 00:42:07,516 --> 00:42:09,726 and it's getting executed by your browser. 1025 00:42:10,006 --> 00:42:12,506 So let's see if we can't take a step toward that. 1026 00:42:12,706 --> 00:42:15,246 Because problem set 8 is going to have you play in exactly 1027 00:42:15,246 --> 00:42:19,396 that seamless world of Ajax and JavaScript with php and MySQL. 1028 00:42:19,656 --> 00:42:23,286 So here is a terribly simple form, really didn't care so much 1029 00:42:23,286 --> 00:42:25,336 about aesthetics this time, just functionality. 1030 00:42:25,616 --> 00:42:29,026 I want it to collect a user's email, their password, 1031 00:42:29,026 --> 00:42:31,816 I want it to get their password again and I wanted them to agree 1032 00:42:31,816 --> 00:42:33,006 to some terms and conditions. 1033 00:42:33,006 --> 00:42:35,836 Why? I wanted a few different form elements here. 1034 00:42:36,126 --> 00:42:37,726 But you'll notice if I fill this out, 1035 00:42:38,376 --> 00:42:41,116 it's going to lead me nowhere very interesting. 1036 00:42:41,116 --> 00:42:42,646 Let me go ahead and pull up Firebug, 1037 00:42:42,646 --> 00:42:44,766 just because it's a little cleaner than looking 1038 00:42:44,766 --> 00:42:46,516 at my own source code here. 1039 00:42:46,516 --> 00:42:49,116 I'm going to go into body, I'm going to go into form 1040 00:42:49,296 --> 00:42:50,896 and notice, where does the form go? 1041 00:42:50,896 --> 00:42:51,936 I'll zoom in here. 1042 00:42:52,146 --> 00:42:53,646 What file does it submit to? 1043 00:42:54,226 --> 00:42:57,366 What's the value of its action line? 1044 00:42:58,666 --> 00:43:01,486 [Inaudible].php, all right, so kind of a stupid name.
1045 00:43:01,486 --> 00:43:04,196 It's meant to mimic a function we wrote for you for pset 7. 1046 00:43:04,196 --> 00:43:05,076 So let me go into this. 1047 00:43:05,106 --> 00:43:06,176 This is dump.php. 1048 00:43:06,176 --> 00:43:08,566 This is not valid XHTML. 1049 00:43:08,566 --> 00:43:12,426 This is not a legitimate webpage per se, but for now I just want 1050 00:43:12,426 --> 00:43:14,436 to kind of experiment with this form submission 1051 00:43:14,436 --> 00:43:15,846 to really understand how it works. 1052 00:43:16,116 --> 00:43:18,626 So what I did is this quick and dirty, as we keep saying, 1053 00:43:18,626 --> 00:43:21,496 script that apparently outputs a pre tag 1054 00:43:21,766 --> 00:43:24,036 which means here comes some pre-formatted text. 1055 00:43:24,326 --> 00:43:26,356 Display it in a mono spaced font that looks 1056 00:43:26,356 --> 00:43:28,876 like a typewriter instead of like an English essay. 1057 00:43:29,226 --> 00:43:29,956 And then I have this 1058 00:43:29,956 --> 00:43:31,336 and actually I don't need this right now. 1059 00:43:31,676 --> 00:43:35,196 I have print recursively the contents of $_GET. 1060 00:43:35,196 --> 00:43:38,356 So recall there's two ways essentially to submit data 1061 00:43:38,356 --> 00:43:39,846 to a website from a form. 1062 00:43:40,066 --> 00:43:43,836 Via get, which puts all your data in the url and via post. 1063 00:43:44,066 --> 00:43:46,626 So what was a rule of thumb here? 1064 00:43:46,686 --> 00:43:51,396 When would you want to use post instead of get? 1065 00:43:51,636 --> 00:43:52,276 Any thoughts? 1066 00:43:53,816 --> 00:43:55,186 So for privacy. 1067 00:43:55,426 --> 00:43:58,266 Right? So if you use get, by definition of it, 1068 00:43:58,266 --> 00:44:00,856 remember that the queries end up in the url 1069 00:44:00,856 --> 00:44:04,036 and that's how we sort of bootstrapped ourselves 1070 00:44:04,036 --> 00:44:05,246 into reimplementing Google.
1071 00:44:05,446 --> 00:44:10,816 We combined our user's input with their get string. 1072 00:44:10,816 --> 00:44:11,516 With their url. 1073 00:44:11,846 --> 00:44:14,296 But that's not so good if you're typing in private information, 1074 00:44:14,296 --> 00:44:16,636 if you're typing in a username and password or credit card, 1075 00:44:16,876 --> 00:44:18,886 so post hides all of that information. 1076 00:44:18,886 --> 00:44:21,436 Post can also handle much larger submissions. 1077 00:44:21,436 --> 00:44:23,556 So if you're uploading a photo, it's kind of hard 1078 00:44:23,556 --> 00:44:26,556 to imagine uploading a photo to Facebook via just a url 1079 00:44:26,556 --> 00:44:28,306 and converting it to 0's and 1's, 1080 00:44:28,606 --> 00:44:30,426 that's also done by a post as well. 1081 00:44:30,676 --> 00:44:33,096 And as an aside, in case you've seen it in any of my code, 1082 00:44:33,276 --> 00:44:36,766 $_REQUEST is another super global variable as they're called. 1083 00:44:37,026 --> 00:44:39,916 This has both the contents of get and post in it. 1084 00:44:40,186 --> 00:44:44,116 So if you're being a little lazy or you want to support both get 1085 00:44:44,116 --> 00:44:46,786 and post, you can access the same variables inside 1086 00:44:46,786 --> 00:44:47,406 of $_REQUEST. 1087 00:44:47,406 --> 00:44:50,746 But it's better design generally to pick get or post, 1088 00:44:50,746 --> 00:44:51,816 so that you know what to expect. 1089 00:44:52,166 --> 00:44:53,456 So that's all dump.php is. 1090 00:44:53,496 --> 00:44:56,746 It's going to show me the contents of what was submitted 1091 00:44:56,996 --> 00:44:59,286 by the form and therefore automatically put 1092 00:44:59,286 --> 00:45:00,836 into that super global called $_GET. 1093 00:45:01,086 --> 00:45:02,236 So let's go ahead and do this.
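To make the get-versus-post point concrete, here is a small sketch of what a get submission does to the url. The helper name `buildGetUrl`, the host `example.com`, and the field values are made up for illustration; only `dump.php` and the idea of name/value pairs come from the lecture.

```javascript
// Sketch: a GET submission appends the form's name=value pairs to the URL.
// The host name and the values below are assumptions for illustration.
function buildGetUrl(action, fields) {
    const url = new URL(action);
    // URLSearchParams encodes the pairs the way a browser encodes a form.
    url.search = new URLSearchParams(fields).toString();
    return url.href;
}

const href = buildGetUrl("http://example.com/dump.php", {
    email: "user@example.com",
    password1: "12345",
});
console.log(href);

// With POST the same pairs travel in the request body instead, so they do
// not show up in the address bar, in bookmarks, or in the browser history.
```

Keep in mind that post only keeps the data out of the url; it does not by itself encrypt anything.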
1094 00:45:02,236 --> 00:45:02,686 I'm going to type 1095 00:45:02,686 --> 00:45:07,746 in mailin@post.harvard.edu 1234512345, 1096 00:45:08,026 --> 00:45:10,636 I'm going to check the box and I'm going to click submit 1097 00:45:10,636 --> 00:45:13,726 and what I get back here is this output. 1098 00:45:13,726 --> 00:45:17,246 So again, this sort of indented output is the result of print_r 1099 00:45:17,246 --> 00:45:18,226 and this is just what it does. 1100 00:45:18,226 --> 00:45:19,636 It dumps the contents of an array. 1101 00:45:20,006 --> 00:45:22,786 So what has been received server side is 1102 00:45:22,786 --> 00:45:25,126 four variables inside of $_GET. 1103 00:45:25,296 --> 00:45:28,296 So if I really wanted to get to do something more interesting, 1104 00:45:28,356 --> 00:45:30,296 rather dump to do something more interesting, 1105 00:45:30,586 --> 00:45:31,866 I could do something like this. 1106 00:45:31,866 --> 00:45:37,166 I could do let's say, name: and then let's duplicate this. 1107 00:45:37,226 --> 00:45:40,396 Let's say password 1. 1108 00:45:40,846 --> 00:45:43,376 Let's change this to password 2. 1109 00:45:43,376 --> 00:45:45,956 And then we'll change this to check box. 1110 00:45:46,346 --> 00:45:47,876 So just to make clear what I'm doing, 1111 00:45:48,096 --> 00:45:51,056 let's now put not the value of the whole array 1112 00:45:51,176 --> 00:45:56,446 but let's just print out $_GET quote unquote email. 1113 00:45:56,446 --> 00:45:59,246 Sorry I wrote something down that didn't exist. 1114 00:45:59,606 --> 00:46:00,616 I meant to say email. 1115 00:46:01,006 --> 00:46:06,786 And then here I want to say quote open bracket ?print$, 1116 00:46:07,076 --> 00:46:08,126 so this is a little tedious.
1117 00:46:08,126 --> 00:46:10,416 It turns out that if all you want to do is print a value, 1118 00:46:10,716 --> 00:46:14,516 php has a shorthand notation which is open bracket ?= 1119 00:46:14,866 --> 00:46:17,396 and then just put the variable that you want to put there 1120 00:46:17,396 --> 00:46:21,966 which in this case is going to be password 1, so quote unquote 1121 00:46:22,316 --> 00:46:23,426 and you probably get the idea. 1122 00:46:23,456 --> 00:46:24,876 So I won't bother finishing the rest. 1123 00:46:25,166 --> 00:46:27,016 But the point is, the top version is really 1124 00:46:27,016 --> 00:46:27,706 just debugging. 1125 00:46:27,706 --> 00:46:28,616 Dumping the whole array. 1126 00:46:28,616 --> 00:46:30,756 If you actually care about individual values, 1127 00:46:30,946 --> 00:46:33,996 you go after them using this associative array syntax. 1128 00:46:34,036 --> 00:46:35,576 This square bracket notation. 1129 00:46:35,826 --> 00:46:38,306 If I reload this page, thereby resubmitting the form, 1130 00:46:38,306 --> 00:46:42,916 after saving my file, notice that what I get down here, 1131 00:46:43,326 --> 00:46:45,156 oh I'm cheating, I'm on the wrong server. 1132 00:46:45,156 --> 00:46:47,906 Dammit. How best to fix? 1133 00:46:48,206 --> 00:46:53,926 Cloud.CS50.net, mailin source forms. 1134 00:46:56,446 --> 00:47:02,496 OK, let's resubmit, mailin@post, 12345 12345, 1135 00:47:02,536 --> 00:47:05,126 check the box and submit. 1136 00:47:06,166 --> 00:47:07,266 OK and it's a mess. 1137 00:47:07,646 --> 00:47:11,686 Why is it all on one line like this? 1138 00:47:11,886 --> 00:47:12,816 Yes so no line break. 1139 00:47:12,816 --> 00:47:13,846 So no BR's. 1140 00:47:13,846 --> 00:47:14,936 Right? So the quick fix here 1141 00:47:14,936 --> 00:47:18,076 and then we'll move forward is put this there, put this there, 1142 00:47:18,076 --> 00:47:20,996 this there, this there and now finally if I reload, OK. 
1143 00:47:21,276 --> 00:47:23,626 So again, sort of a refresher of what we did last week. 1144 00:47:23,626 --> 00:47:25,156 So now let's take things up a notch. 1145 00:47:25,156 --> 00:47:29,276 Because right now this program, if I actually go in here 1146 00:47:29,876 --> 00:47:34,276 and change things around like, if I just say mailin 1147 00:47:34,796 --> 00:47:38,096 or something like this or mailin@fast or mailin@post, 1148 00:47:38,096 --> 00:47:39,846 I don't give a valid email address, 1149 00:47:40,126 --> 00:47:42,556 I just submit this well nothing bad happens. 1150 00:47:42,556 --> 00:47:44,106 It just gets submitted to the server. 1151 00:47:44,106 --> 00:47:48,116 So now the server has to do all of the validation of this data. 1152 00:47:48,456 --> 00:47:50,936 So that might not necessarily be such a bad thing, 1153 00:47:51,136 --> 00:47:53,306 because at the end of the day that's the safest approach. 1154 00:47:53,306 --> 00:47:56,696 Actually having your server defend against bogus user input, 1155 00:47:56,906 --> 00:47:58,996 is by far the most robust approach. 1156 00:47:59,346 --> 00:48:00,316 But it's kind of annoying. 1157 00:48:00,316 --> 00:48:02,706 Because now the user has to go back, if I'm yelled 1158 00:48:02,706 --> 00:48:04,286 at then I have to fix my form. 1159 00:48:04,606 --> 00:48:07,586 Can't we give the user more dynamic, more immediate feedback 1160 00:48:07,826 --> 00:48:10,116 and save them the time and the trouble of going 1161 00:48:10,116 --> 00:48:11,786 to a whole new page, then going back 1162 00:48:11,786 --> 00:48:12,836 and forth and back and forth? 1163 00:48:13,166 --> 00:48:15,356 Well yes, we can. 1164 00:48:15,356 --> 00:48:18,456 So in form2.html, we have the same form 1165 00:48:18,666 --> 00:48:19,726 but a little bit of magic.
1166 00:48:19,936 --> 00:48:22,326 For instance, if I decide eh, you don't need to know this 1167 00:48:22,326 --> 00:48:24,196 about me, I am going to proceed nonetheless, 1168 00:48:24,776 --> 00:48:26,366 notice I immediately get yelled at. 1169 00:48:26,556 --> 00:48:28,516 So it's a little small here, so let me zoom in. 1170 00:48:28,516 --> 00:48:29,726 This is an alert window. 1171 00:48:29,896 --> 00:48:32,546 It says, you must provide an email address. 1172 00:48:32,576 --> 00:48:32,906 All right. 1173 00:48:32,906 --> 00:48:34,016 Well where did this come from? 1174 00:48:34,286 --> 00:48:36,606 Well let's take a look at the page's source in Firebug. 1175 00:48:36,606 --> 00:48:39,406 But again you can go to view source and see the same thing. 1176 00:48:39,406 --> 00:48:43,026 So open the body, open the form, oh this is interesting, 1177 00:48:43,686 --> 00:48:47,026 what is a new attribute on the form element here 1178 00:48:47,026 --> 00:48:48,306 that we didn't have a moment ago? 1179 00:48:50,456 --> 00:48:51,586 So onsubmit. 1180 00:48:52,536 --> 00:48:54,866 So onsubmit, kind of self explanatory. 1181 00:48:54,866 --> 00:48:56,606 Because it says, on submit. 1182 00:48:56,676 --> 00:48:59,336 So when this form gets submitted, do the following. 1183 00:48:59,566 --> 00:49:01,426 Well it turns out that what should be done 1184 00:49:01,426 --> 00:49:03,806 when the form is submitted is the following line 1185 00:49:03,806 --> 00:49:05,036 of code should be executed. 1186 00:49:05,036 --> 00:49:07,006 It's just long so it wraps on two lines, but the line 1187 00:49:07,006 --> 00:49:10,996 of code is, return validate();. 1188 00:49:10,996 --> 00:49:12,676 So that's a snippet of code. 1189 00:49:12,676 --> 00:49:15,836 It's not php code, it is in fact JavaScript code. 1190 00:49:15,836 --> 00:49:16,676 So let's take a look. 1191 00:49:16,676 --> 00:49:18,566 This is form2.html.
1192 00:49:18,906 --> 00:49:20,296 So I'm just using HTML files. 1193 00:49:20,296 --> 00:49:21,026 I'm not bothering with php. 1194 00:49:21,026 --> 00:49:24,566 This is purely client side static content right now. 1195 00:49:24,986 --> 00:49:27,286 So let me scroll down all the way to the bottom. 1196 00:49:27,506 --> 00:49:28,736 Here is the same form. 1197 00:49:29,016 --> 00:49:32,806 So that is my form tag, my email word, password, password again. 1198 00:49:33,036 --> 00:49:35,646 This is sort of, this is pretty much all old school now 1199 00:49:35,646 --> 00:49:38,946 from last week, but the only difference is I added this 1200 00:49:39,086 --> 00:49:41,726 attribute here and again from last week, 1201 00:49:41,726 --> 00:49:43,676 web browsers don't care about white space. 1202 00:49:43,676 --> 00:49:46,236 I decided ugh, this feels a little messy having it all 1203 00:49:46,236 --> 00:49:49,446 on one line, so I just decided stylistically to put it 1204 00:49:49,446 --> 00:49:50,556 on a new line and that's fine. 1205 00:49:50,556 --> 00:49:52,896 White space inside of tags like this is OK. 1206 00:49:53,416 --> 00:49:55,486 So what is this validate function? 1207 00:49:55,486 --> 00:49:56,376 Well let's take a look. 1208 00:49:56,686 --> 00:49:59,236 Well even though last week we said the only thing that belongs 1209 00:49:59,286 --> 00:50:01,796 in your head element is the title tag, 1210 00:50:02,546 --> 00:50:06,066 kind of a simplification because you can also put a script tag. 1211 00:50:06,386 --> 00:50:08,496 So this tag up here is kind of new. 1212 00:50:08,666 --> 00:50:10,206 It says, open bracket script. 1213 00:50:10,556 --> 00:50:11,596 It then says, 1214 00:50:11,966 --> 00:50:15,576 type="text/javascript". That's just telling the browser hey, 1215 00:50:15,826 --> 00:50:18,556 here comes some JavaScript. So JavaScript is kind 1216 00:50:18,556 --> 00:50:21,136 of the only language that people embed in webpages.
1217 00:50:21,136 --> 00:50:22,496 There are other languages. 1218 00:50:22,526 --> 00:50:25,576 There's VB script which is a Windows thing, 1219 00:50:25,796 --> 00:50:28,376 but pretty much these days people only use this tag 1220 00:50:28,376 --> 00:50:29,006 as follows. 1221 00:50:29,316 --> 00:50:34,436 This is just a bit of syntax to ward off confusion. 1222 00:50:34,436 --> 00:50:37,386 So long story short, there's certain characters 1223 00:50:37,386 --> 00:50:40,266 on the keyboard that can confuse browsers very easily. 1224 00:50:40,526 --> 00:50:44,546 For instance, if I wanted to say 2 less than 3, 1225 00:50:44,926 --> 00:50:46,946 well why is this kind of expression, 1226 00:50:46,946 --> 00:50:48,606 if I type this inside a webpage, 1227 00:50:48,606 --> 00:50:50,566 potentially confusing to a browser? 1228 00:50:51,096 --> 00:50:55,856 Just 2<3. What's the scary letter there? 1229 00:50:57,276 --> 00:50:58,206 So it's this thing right? 1230 00:50:58,206 --> 00:51:01,476 Because this is the special tag, this special symbol we seem 1231 00:51:01,476 --> 00:51:04,896 to use everywhere to tell the browser here comes a tag. 1232 00:51:04,896 --> 00:51:07,686 Interpret this as XHTML, not as actual text. 1233 00:51:08,016 --> 00:51:11,656 So for now just know that this crazy syntax here, 1234 00:51:11,836 --> 00:51:14,056 which is intentionally crazy because they figured 1235 00:51:14,056 --> 00:51:16,816 who is ever going to need to write a sentence 1236 00:51:16,816 --> 00:51:21,386 with these characters in it, so this thing here, 1237 00:51:21,386 --> 00:51:23,516 who is ever going to need to write this in a sentence. 1238 00:51:23,896 --> 00:51:25,796 So that's why it looks as crazy as it does. 1239 00:51:25,956 --> 00:51:29,006 This just means ignore the following stuff. 1240 00:51:29,096 --> 00:51:32,276 It is not XHTML, it's probably code or something else. 1241 00:51:32,566 --> 00:51:34,646 So here's my first JavaScript function. 
1242 00:51:34,646 --> 00:51:37,356 Just like php, I say literally function, 1243 00:51:37,596 --> 00:51:40,036 then I give the thing a name, then parentheses and inside 1244 00:51:40,036 --> 00:51:41,566 of those can be any parameters. 1245 00:51:41,626 --> 00:51:42,486 This time there are none. 1246 00:51:42,866 --> 00:51:43,726 So what am I doing? 1247 00:51:43,726 --> 00:51:45,516 Well this function's purpose in life is 1248 00:51:45,516 --> 00:51:47,136 to validate the user's input. 1249 00:51:47,406 --> 00:51:49,586 So I could have done any number of things but let's consider 1250 00:51:49,586 --> 00:51:51,246 for a second, what could go wrong. 1251 00:51:51,286 --> 00:51:53,486 What could the user do that might annoy me, 1252 00:51:53,486 --> 00:51:55,416 the person trying to trap this information? 1253 00:51:56,826 --> 00:51:57,576 They might do what? 1254 00:51:59,366 --> 00:52:01,366 Leave something blank. 1255 00:52:01,616 --> 00:52:04,576 So leaving their email address blank, kind of useless 1256 00:52:04,576 --> 00:52:06,596 if the point is to register them for something. 1257 00:52:06,816 --> 00:52:08,246 What might they also do that's wrong? 1258 00:52:10,076 --> 00:52:11,106 Passwords might differ. 1259 00:52:11,186 --> 00:52:12,146 They might be too short, 1260 00:52:12,216 --> 00:52:14,286 they might not have fancy enough characters 1261 00:52:14,286 --> 00:52:16,036 or more simply, they might not match. 1262 00:52:16,376 --> 00:52:18,836 Or third thing that we can pick on pretty easily? 1263 00:52:20,376 --> 00:52:21,866 They didn't check the box right? 1264 00:52:21,866 --> 00:52:22,756 They didn't check the box. 1265 00:52:22,816 --> 00:52:24,756 They need to check the box if I'm going to register them. 1266 00:52:24,936 --> 00:52:26,976 So can we detect these things client side? 1267 00:52:26,976 --> 00:52:28,056 We can server side.
1268 00:52:28,056 --> 00:52:30,476 And even last week, recall, I used a couple of lines 1269 00:52:30,476 --> 00:52:32,856 of PHP code to say, if this thing is empty, 1270 00:52:32,986 --> 00:52:33,856 yell at the user. 1271 00:52:33,856 --> 00:52:34,666 Make them go back. 1272 00:52:34,936 --> 00:52:36,896 Well here we're going to give them more immediate feedback. 1273 00:52:36,896 --> 00:52:37,886 So here's the syntax. 1274 00:52:37,886 --> 00:52:39,616 It's a little long, but that's OK. 1275 00:52:39,616 --> 00:52:43,356 So I say, if document. document is the webpage. 1276 00:52:43,356 --> 00:52:46,176 It's a special global variable if you will that refers 1277 00:52:46,176 --> 00:52:48,266 to the webpage itself, document. 1278 00:52:48,406 --> 00:52:53,276 Then forms: document.forms is an array, essentially, of all 1279 00:52:53,276 --> 00:52:55,076 of the forms on the webpage. 1280 00:52:55,076 --> 00:52:57,066 Right now there's just one so this is pretty simple. 1281 00:52:57,346 --> 00:53:00,386 registration is the name of the form that I care about. 1282 00:53:00,646 --> 00:53:01,516 Well, where does this come from? 1283 00:53:01,516 --> 00:53:03,346 Well if you fast forward here, 1284 00:53:03,346 --> 00:53:05,986 actually there was one other attribute I added to the form, 1285 00:53:05,986 --> 00:53:07,376 which we haven't used previously. 1286 00:53:07,636 --> 00:53:08,496 I gave it a name. 1287 00:53:08,496 --> 00:53:11,946 An arbitrary name, but one that's memorable to me. 1288 00:53:12,146 --> 00:53:12,916 So now notice, 1289 00:53:13,216 --> 00:53:17,376 document.forms.registration is simply the JavaScript way 1290 00:53:17,376 --> 00:53:20,366 of saying, go examine that specific form. 1291 00:53:20,366 --> 00:53:22,076 What do you want to examine inside of it? 1292 00:53:22,076 --> 00:53:25,176 Well that registration form has a field called email. 1293 00:53:25,486 --> 00:53:26,346 Where does this come from?
1294 00:53:26,346 --> 00:53:29,116 Well fast forward to the bottom and notice the name I gave 1295 00:53:29,116 --> 00:53:31,566 to the input for email, it's just email. 1296 00:53:32,036 --> 00:53:34,326 So careful, the casing is important, 1297 00:53:34,596 --> 00:53:36,336 but I did copy it literally so we're good. 1298 00:53:36,336 --> 00:53:38,746 And then finally, that's just the form field. 1299 00:53:38,746 --> 00:53:41,646 If I want to check the value, I do .value. 1300 00:53:41,986 --> 00:53:42,776 So document 1301 00:53:42,776 --> 00:53:43,976 .forms.registration 1302 00:53:44,146 --> 00:53:47,836 .email.value, it's kind of like stepping through a tree from top 1303 00:53:47,836 --> 00:53:51,316 to bottom, deeper and deeper and deeper until you hit this value. 1304 00:53:51,346 --> 00:53:53,236 And I'm just saying, if it is blank, 1305 00:53:53,236 --> 00:53:55,616 if it equals the empty string, what do I do? 1306 00:53:55,616 --> 00:53:58,266 Well this is exactly what I got yelled at for before. 1307 00:53:58,526 --> 00:54:00,766 JavaScript has a built-in alert function. 1308 00:54:00,766 --> 00:54:02,336 Looks a little different on different browsers 1309 00:54:02,376 --> 00:54:04,206 but at the end of the day, accomplishes the same task. 1310 00:54:04,476 --> 00:54:07,976 It triggers a little pop-up with an OK button or the equivalent 1311 00:54:08,296 --> 00:54:10,336 and then just says something to the user. 1312 00:54:10,666 --> 00:54:14,706 But I still want to ensure that the form does not get submitted.
1313 00:54:14,856 --> 00:54:18,836 And so I return false here and that's important 1314 00:54:18,836 --> 00:54:20,916 because if you look back at the form tag, 1315 00:54:21,166 --> 00:54:23,326 notice I didn't just call validate, 1316 00:54:23,486 --> 00:54:26,276 I returned the return value of validate, 1317 00:54:26,546 --> 00:54:28,056 so the browser realizes oh, 1318 00:54:28,056 --> 00:54:30,376 if validate, the function, returns false, 1319 00:54:30,636 --> 00:54:31,936 I should not submit this form. 1320 00:54:31,936 --> 00:54:33,786 Like let's just short circuit the operation 1321 00:54:33,786 --> 00:54:34,726 and not let things pass. 1322 00:54:35,306 --> 00:54:36,566 Well what else am I checking for? 1323 00:54:36,806 --> 00:54:37,946 Well here's one thing. 1324 00:54:37,946 --> 00:54:40,636 If the password is blank, I catch it there. 1325 00:54:40,876 --> 00:54:45,086 If the first password 1 does not equal the second password, 1326 00:54:45,146 --> 00:54:45,926 I catch that. 1327 00:54:46,296 --> 00:54:48,636 If this, notice this is another property, 1328 00:54:48,876 --> 00:54:53,576 so if the agreement input is not true for its checked property, 1329 00:54:53,816 --> 00:54:55,496 then I have to say you must agree 1330 00:54:55,496 --> 00:54:56,586 to our terms and conditions. 1331 00:54:56,586 --> 00:54:58,026 So it's all fairly straightforward. 1332 00:54:58,226 --> 00:55:00,316 You know, kind of long, these lines, so I've allowed them 1333 00:55:00,316 --> 00:55:02,436 to wrap but pretty simple in the end. 1334 00:55:02,436 --> 00:55:04,506 And finally, I return true. 1335 00:55:04,576 --> 00:55:05,996 None of these problems exist. 1336 00:55:06,306 --> 00:55:09,156 So if I go back here and I do provide an email address, 1337 00:55:09,156 --> 00:55:13,896 mailin@post.harvard.edu 12345 12345, 1338 00:55:14,406 --> 00:55:15,536 but oh I don't want to agree.
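The validate function walked through above can be sketched roughly as follows. This is a minimal sketch, not the lecture's actual source: the field names password1, password2, and agreement are assumptions (only registration and email are named explicitly), the alert messages are paraphrased, and the form and the alert function are passed in as parameters so the logic can also be tried outside a browser.

```javascript
// Sketch of the validation described in the lecture.
// Assumed field names: password1, password2, agreement.
// "form" stands in for document.forms.registration and
// "notify" stands in for the browser's built-in alert function.
function validate(form, notify) {
    // An empty email address makes registration pointless.
    if (form.email.value === "") {
        notify("You must provide an email address.");
        return false;
    }
    // The password may not be blank.
    if (form.password1.value === "") {
        notify("You must provide a password.");
        return false;
    }
    // The two password fields must match.
    if (form.password1.value !== form.password2.value) {
        notify("Your passwords do not match.");
        return false;
    }
    // The checkbox exposes a Boolean "checked" property.
    if (!form.agreement.checked) {
        notify("You must agree to our terms and conditions.");
        return false;
    }
    // None of the problems above exist, so allow the submission.
    return true;
}
```

In a page, the form tag would return this function's result from its submit handler, for example onsubmit="return validate(document.forms.registration, alert);", so that a false return value stops the browser from submitting the form.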
1339 00:55:15,856 --> 00:55:19,446 Submit, it still catches that, but I'm left on the same page 1340 00:55:19,446 --> 00:55:21,976 and notice, the URL has not changed. 1341 00:55:21,976 --> 00:55:25,626 I'm still at form2.html, I am not at dump.php 1342 00:55:26,186 --> 00:55:29,016 because I've short circuited that process. 1343 00:55:29,636 --> 00:55:30,616 So any gotchas? 1344 00:55:31,706 --> 00:55:33,986 Is there a problem with using JavaScript in this way? 1345 00:55:34,746 --> 00:55:36,876 What do you think? 1346 00:55:37,086 --> 00:55:38,976 Especially if you're among those more comfortable 1347 00:55:38,976 --> 00:55:40,476 or those familiar with web programming. 1348 00:55:41,066 --> 00:55:43,466 Can I get away with just JavaScript and skip all 1349 00:55:43,466 --> 00:55:46,796 that PHP stuff from last time? 1350 00:55:47,026 --> 00:55:48,046 You know, the answers 1351 00:55:48,046 --> 00:55:49,746 to these questions are the same, right? 1352 00:55:49,746 --> 00:55:52,526 No. Why? Right? 1353 00:55:52,526 --> 00:55:55,936 So it turns out that just because you tell the browser 1354 00:55:55,936 --> 00:55:58,036 to do something, doesn't mean it has to do it 1355 00:55:58,036 --> 00:55:59,816 and a very common way of attacking 1356 00:55:59,816 --> 00:56:02,256 or compromising websites is to take advantage 1357 00:56:02,256 --> 00:56:04,896 of foolish assumptions programmers have made. 1358 00:56:05,166 --> 00:56:09,366 If you only validate the user's input client side or rather 1359 00:56:09,366 --> 00:56:11,736 if you only validate the user's input client side using 1360 00:56:11,736 --> 00:56:16,336 JavaScript, well what if the user, you know, is mildly clever 1361 00:56:16,336 --> 00:56:19,686 and realizes they can go to Safari and they can go 1362 00:56:19,686 --> 00:56:24,406 to Disable JavaScript. They've essentially disabled all 1363 00:56:24,406 --> 00:56:25,896 of your validation completely.
1364 00:56:26,206 --> 00:56:30,436 So we can post on the bulletin, oh Charles, you're not doing, 1365 00:56:30,436 --> 00:56:32,656 oh wait, oh, my internet connection is dead there. 1366 00:56:33,006 --> 00:56:34,656 Let's see actually how Charles is doing. 1367 00:56:34,656 --> 00:56:36,386 [ laughter ] 1368 00:56:36,386 --> 00:56:37,146 Very nice. 1369 00:56:38,676 --> 00:56:43,106 OK. OK. So anyhow, this is not a menu that comes 1370 00:56:43,166 --> 00:56:44,116 by default with Safari. 1371 00:56:44,116 --> 00:56:44,846 You have to enable it. 1372 00:56:44,846 --> 00:56:45,556 It's very easy. 1373 00:56:45,556 --> 00:56:48,416 Just email [inaudible].net or I'll look it up at some point. 1374 00:56:48,416 --> 00:56:49,106 I forget how to do it. 1375 00:56:49,186 --> 00:56:49,736 But it's there now. 1376 00:56:49,976 --> 00:56:50,876 So this is bad. 1377 00:56:50,976 --> 00:56:53,866 Just using JavaScript not such a good idea and in fact, 1378 00:56:53,866 --> 00:56:54,846 some bad things happen. 1379 00:56:54,846 --> 00:56:57,276 If you disable JavaScript and try to visit any number 1380 00:56:57,276 --> 00:56:59,386 of popular websites, they might work, 1381 00:56:59,386 --> 00:57:01,066 though they might be a bit shot in the foot, 1382 00:57:01,156 --> 00:57:02,526 certain features might not work 1383 00:57:02,836 --> 00:57:05,016 or they might just not work at all. 1384 00:57:05,016 --> 00:57:07,436 So people have started making assumptions these days 1385 00:57:07,436 --> 00:57:10,316 when designing websites, if you don't have JavaScript enabled, 1386 00:57:10,606 --> 00:57:12,056 like to hell with you. 1387 00:57:12,056 --> 00:57:15,146 Like it's not worth frankly, the time trying 1388 00:57:15,146 --> 00:57:17,376 to implement two different versions of a website 1389 00:57:17,466 --> 00:57:19,126 and it's actually kind of a fun experiment. 
1390 00:57:19,126 --> 00:57:21,916 At least if you're so inclined, to disable JavaScript 1391 00:57:21,916 --> 00:57:24,296 and visit your favorite websites and see which ones break. 1392 00:57:24,296 --> 00:57:27,026 Which ones take for granted JavaScript. Now it's not all 1393 00:57:27,026 --> 00:57:29,766 that silly to ask this, because even though Safari 1394 00:57:29,766 --> 00:57:32,766 on the iPhone is actually pretty good, it is a real web browser. 1395 00:57:32,986 --> 00:57:36,406 Before this I had a BlackBerry whose web browser sucks frankly 1396 00:57:36,406 --> 00:57:39,126 and they didn't support JavaScript fully, so there were 1397 00:57:39,126 --> 00:57:40,936 so many websites I couldn't pull 1398 00:57:40,936 --> 00:57:42,756 up because they just didn't work. 1399 00:57:42,906 --> 00:57:45,376 So though I make fun of this, you know, this trade-off, 1400 00:57:45,566 --> 00:57:47,376 it's actually a very real world concern 1401 00:57:47,376 --> 00:57:48,966 because who do you want to cater to? 1402 00:57:48,966 --> 00:57:51,716 People who have JavaScript enabled and don't even know 1403 00:57:51,716 --> 00:57:54,076 that you can turn it off or do you actually want to cater 1404 00:57:54,076 --> 00:57:56,756 to other devices that might not support JavaScript fully 1405 00:57:56,756 --> 00:58:00,356 or do you want to deal with the really paranoid types out there 1406 00:58:00,546 --> 00:58:02,756 who intentionally turn JavaScript off and hope 1407 00:58:02,826 --> 00:58:05,586 and expect that the world continues to function properly? 1408 00:58:05,896 --> 00:58:08,166 So these are the kinds of things that evolve over time. 1409 00:58:08,166 --> 00:58:10,996 These mentalities and these features. 1410 00:58:11,206 --> 00:58:14,836 So realize that validation client side, 1411 00:58:14,836 --> 00:58:18,316 very useful in that it's saved me some cycles.
1412 00:58:18,316 --> 00:58:20,186 I don't have to go reload the page and all of this, 1413 00:58:20,426 --> 00:58:22,076 but it can be very easily circumvented 1414 00:58:22,076 --> 00:58:24,066 and you can take advantage of these assumptions. 1415 00:58:24,066 --> 00:58:25,716 So this other tool I mentioned 1416 00:58:25,716 --> 00:58:27,996 and showed very briefly last week is called 1417 00:58:27,996 --> 00:58:29,476 Live HTTP Headers. 1418 00:58:29,736 --> 00:58:31,326 So when you request a webpage, 1419 00:58:31,546 --> 00:58:34,446 what the browser essentially sends across the wire, 1420 00:58:34,796 --> 00:58:37,876 across the internet, is literally something like this. 1421 00:58:37,876 --> 00:58:40,116 Let me go ahead and use this program. 1422 00:58:40,496 --> 00:58:41,656 Just to type something big. 1423 00:58:41,936 --> 00:58:43,496 What a browser sends, if you go 1424 00:58:43,496 --> 00:58:49,876 to, let's say, www.facebook.com/home.php, right. 1425 00:58:49,876 --> 00:58:51,696 This is one of their common URLs. 1426 00:58:51,696 --> 00:58:55,166 Well what the browser really does when you hit enter, 1427 00:58:55,366 --> 00:58:58,116 is it opens an internet connection, a TCP/IP connection 1428 00:58:58,116 --> 00:59:01,426 to facebook.com and it sends it a request like this, 1429 00:59:01,706 --> 00:59:08,116 get me /home.php and use version like 1.1 of HTTP, the language 1430 00:59:08,116 --> 00:59:09,296 that browsers and servers speak. 1431 00:59:09,296 --> 00:59:10,706 There's 1.1, there's 1.0. 1432 00:59:10,886 --> 00:59:12,086 That doesn't matter so much. 1433 00:59:12,296 --> 00:59:13,796 So this is the message that's sent. 1434 00:59:14,166 --> 00:59:17,406 Well if you submit some variables to a website, 1435 00:59:17,586 --> 00:59:18,896 now let's consider Google. 1436 00:59:18,896 --> 00:59:24,196 So Google we've seen, so that was http://www.Google.com/search 1437 00:59:24,706 --> 00:59:28,126 and then ?q=foo.
1438 00:59:28,496 --> 00:59:30,166 So this is something I'm searching for. 1439 00:59:30,336 --> 00:59:32,466 Well what gets sent then to the website is this, 1440 00:59:32,666 --> 00:59:37,026 GET /search?q=foo HTTP/1.1. 1441 00:59:37,396 --> 00:59:38,316 Well notice this. 1442 00:59:38,686 --> 00:59:42,246 It looks like I can submit anything I want to the server, 1443 00:59:42,536 --> 00:59:45,476 just by sending that message to the server. 1444 00:59:45,706 --> 00:59:48,686 So if I don't care to execute JavaScript 1445 00:59:48,686 --> 00:59:51,096 or I outright disable it, nothing is stopping me 1446 00:59:51,096 --> 00:59:53,686 from still sending bogus data to a website. 1447 00:59:53,896 --> 00:59:55,726 So let's take a look at this registration form. 1448 00:59:55,986 --> 01:00:00,556 So this registration form lives at form2.html and actually, 1449 01:00:00,556 --> 01:00:02,246 let's actually go back to the first version, 1450 01:00:02,246 --> 01:00:04,046 because it won't get in our way 1451 01:00:04,046 --> 01:00:06,376 with JavaScript. I now have this tool up. 1452 01:00:06,376 --> 01:00:08,686 I'm going to clear its window and now notice. 1453 01:00:08,686 --> 01:00:10,246 Let me move one to the side here. 1454 01:00:10,346 --> 01:00:14,016 I'm going to go ahead and type in, let's clear this, 1455 01:00:14,206 --> 01:00:18,106 I'm going to go ahead and type in mailin@post, 12345, 12345, 1456 01:00:18,176 --> 01:00:20,426 I'm going to check the box and click submit. 1457 01:00:20,826 --> 01:00:22,396 Notice this page got updated 1458 01:00:22,396 --> 01:00:24,436 because what this little program is doing for me, 1459 01:00:24,436 --> 01:00:27,326 this Firefox plugin, it's sniffing my web traffic 1460 01:00:27,326 --> 01:00:29,766 and it's showing me what the browser is really sending 1461 01:00:29,766 --> 01:00:30,476 to the server.
1462 01:00:30,756 --> 01:00:33,866 So at the very top is the URL that was just submitted to 1463 01:00:34,076 --> 01:00:36,246 and as we expected, it ends in dump.php. 1464 01:00:36,246 --> 01:00:37,816 We knew it was going to go there. 1465 01:00:38,076 --> 01:00:39,156 Then there's a question mark. 1466 01:00:39,376 --> 01:00:44,486 So I'm at the very top middle of my screen, ?email=mailin@post, 1467 01:00:44,756 --> 01:00:47,336 the funky encoding is a browser's way 1468 01:00:47,336 --> 01:00:50,046 of escaping input, to make sure it's not confused 1469 01:00:50,046 --> 01:00:53,716 for special characters, %40 is just a special code 1470 01:00:53,716 --> 01:00:54,986 that means the @ sign. 1471 01:00:54,986 --> 01:00:57,576 So no worries there, but ampersand is important. 1472 01:00:57,726 --> 01:00:59,426 Because ampersand separates, remember, 1473 01:00:59,426 --> 01:01:01,116 parameters from other parameters. 1474 01:01:01,336 --> 01:01:04,916 So it looks like via GET, what's been requested is 1475 01:01:04,916 --> 01:01:07,746 that URL which means if I now look at the line below, 1476 01:01:07,746 --> 01:01:10,856 ah that is literally what the browser is sending 1477 01:01:10,856 --> 01:01:11,586 to the server. 1478 01:01:11,586 --> 01:01:12,766 It's this thing here. 1479 01:01:12,956 --> 01:01:16,166 GET /, up until the mailin and all of this and notice, 1480 01:01:16,506 --> 01:01:19,706 all of the form elements are submitted via the URL. 1481 01:01:19,986 --> 01:01:21,126 So what's the takeaway?
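The escaping just described, with %40 standing in for the @ sign and ampersands separating parameters, can be reproduced with JavaScript's built-in encodeURIComponent. The sketch below is illustrative only: buildQueryString is a hypothetical helper, and the field name password1 is an assumption rather than something read off the lecture's form.

```javascript
// Build the query string that follows dump.php in a GET submission.
// buildQueryString is a hypothetical helper written for illustration.
function buildQueryString(fields) {
    const pairs = [];
    for (const name of Object.keys(fields)) {
        // encodeURIComponent escapes characters with special meaning,
        // for example "@" becomes "%40" and "&" becomes "%26".
        pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]));
    }
    // An ampersand separates each parameter from the next.
    return "?" + pairs.join("&");
}

const query = buildQueryString({ email: "mailin@post", password1: "12345" });
console.log(query); // ?email=mailin%40post&password1=12345
```

One difference worth knowing: when a browser submits a form it typically encodes a space as a plus sign, whereas encodeURIComponent produces %20.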
1482 01:01:21,166 --> 01:01:24,326 Well if you have for instance a website, 1483 01:01:24,326 --> 01:01:27,116 a bank's website that's just checking people's usernames 1484 01:01:27,116 --> 01:01:28,756 and passwords with JavaScript and then 1485 01:01:28,756 --> 01:01:30,136 if they're correct they let them see 1486 01:01:30,136 --> 01:01:31,336 that person's account balance, 1487 01:01:31,636 --> 01:01:33,496 well clearly you could circumvent that either 1488 01:01:33,496 --> 01:01:37,046 by turning off JavaScript or just by sending data 1489 01:01:37,046 --> 01:01:39,836 to the server, actually I'm telling the wrong story. 1490 01:01:39,836 --> 01:01:40,856 I'm confusing my two stories. 1491 01:01:41,646 --> 01:01:44,226 You can still send anything you want to the server 1492 01:01:44,226 --> 01:01:46,086 and if the server just assumes 1493 01:01:46,316 --> 01:01:48,046 that the data has already been validated, 1494 01:01:48,286 --> 01:01:49,546 they're in for some trouble. 1495 01:01:49,676 --> 01:01:52,366 Because clearly you can send most anything you want 1496 01:01:52,536 --> 01:01:54,646 and in fact, even though I'm still just using a plugin 1497 01:01:54,646 --> 01:01:55,826 to look at this content, 1498 01:01:56,096 --> 01:01:58,336 I can very easily request a website. 1499 01:01:58,336 --> 01:01:59,476 In fact, let me do just this. 1500 01:02:00,026 --> 01:02:03,326 So minor aside, just to show you how simple the internet 1501 01:02:03,326 --> 01:02:03,866 really is. 1502 01:02:04,116 --> 01:02:06,936 We use SSH a lot, but there's also a program called Telnet. 1503 01:02:07,356 --> 01:02:09,926 So Telnet is like an unencrypted version of a browser. 1504 01:02:10,196 --> 01:02:15,106 I can actually do Telnet and then I can do a server's name. 1505 01:02:15,106 --> 01:02:17,076 Let's see, I haven't done this before, so let's just try it. 1506 01:02:17,076 --> 01:02:17,986 Facebook.com.
1507 01:02:18,356 --> 01:02:21,976 But Telnet, by default, has its own special port, it's 23, 1508 01:02:22,226 --> 01:02:25,296 but I want port 80, because port 80, this is an internet thing, 1509 01:02:25,436 --> 01:02:28,076 is a number that uniquely identifies the service we know 1510 01:02:28,076 --> 01:02:28,736 as HTTP. 1511 01:02:28,736 --> 01:02:31,226 Web traffic has a port number called 80, 1512 01:02:31,516 --> 01:02:34,566 SSL traffic is 443, SSH which some 1513 01:02:34,566 --> 01:02:36,236 of you have even noticed, is what number? 1514 01:02:37,316 --> 01:02:39,536 22. So we've seen these kinds of numbers before, 1515 01:02:39,536 --> 01:02:40,596 but just taken them for granted. 1516 01:02:40,596 --> 01:02:41,386 I hit enter. 1517 01:02:41,576 --> 01:02:44,916 Oh I am connected to www.facebook.com. 1518 01:02:44,916 --> 01:02:46,416 Now I have an interactive session. 1519 01:02:46,416 --> 01:02:47,516 So what do I want to do? 1520 01:02:47,666 --> 01:02:49,056 Well let me go ahead and GET /home.php 1521 01:02:49,056 --> 01:02:52,746 and I don't know much about the language. 1522 01:02:52,746 --> 01:02:55,256 I'm just going to use version 1 and I'm going 1523 01:02:55,256 --> 01:02:57,516 to hit enter twice and OK, interesting. 1524 01:02:57,766 --> 01:03:00,086 It looks like Facebook has responded with this. 1525 01:03:00,506 --> 01:03:02,256 So this is an HTTP thing. 1526 01:03:02,926 --> 01:03:04,766 If the server sends back that line, 1527 01:03:04,916 --> 01:03:07,026 Location:, that's a redirect. 1528 01:03:07,246 --> 01:03:09,296 That tells the browser, go to this URL. 1529 01:03:09,716 --> 01:03:10,896 Well I'm not a browser so I'm going 1530 01:03:10,896 --> 01:03:12,086 to have to do this manually. 1531 01:03:12,086 --> 01:03:16,186 So let me go ahead and copy and put that in my clipboard. 1532 01:03:16,346 --> 01:03:18,716 Now I'm going to rerun Telnet, so let's Telnet.
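What gets typed into the Telnet session, and the redirect that comes back, are just lines of text. As a rough sketch, the two helpers below compose such a request and pull the Location header out of a response. The names buildRequest and findLocation are hypothetical, the sample response is made up for illustration and is not Facebook's actual reply, and the sketch uses HTTP/1.1 with a Host header, which may differ from what was typed in the lecture.

```javascript
// Compose the text a client sends after connecting to port 80.
// Each line ends with CRLF, and a blank line ends the headers, which is
// why enter is pressed twice in the Telnet session.
function buildRequest(path, host) {
    return "GET " + path + " HTTP/1.1\r\n" +
           "Host: " + host + "\r\n" +
           "\r\n";
}

// Return the redirect target from a response's headers, or null if the
// response carries no Location header.
function findLocation(response) {
    const headerBlock = response.split("\r\n\r\n")[0];
    for (const line of headerBlock.split("\r\n")) {
        // Header names are case-insensitive, hence the "i" flag.
        const match = /^location:\s*(.*)$/i.exec(line);
        if (match) {
            return match[1];
        }
    }
    return null;
}

// A made-up redirect response, for illustration only.
const sample = "HTTP/1.1 302 Found\r\n" +
               "Location: http://www.example.com/common/browser.php\r\n" +
               "\r\n";

console.log(buildRequest("/home.php", "www.facebook.com"));
console.log(findLocation(sample));
```

A browser performs the second step automatically: on seeing a Location header in a redirect response, it issues a new request for that URL.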
1533 01:03:18,966 --> 01:03:22,076 This time I'm, oh wait let me, I've got to fix one thing. 1534 01:03:22,746 --> 01:03:24,376 Come on. Quit. 1535 01:03:25,296 --> 01:03:26,766 Let's see that once more. 1536 01:03:26,766 --> 01:03:27,896 I copied too much of it. 1537 01:03:28,136 --> 01:03:31,856 So I just want to GET /common/browser.php. 1538 01:03:32,416 --> 01:03:34,226 All right, so now I've copied that because that's 1539 01:03:34,226 --> 01:03:35,166 where it's telling me to go. 1540 01:03:35,166 --> 01:03:41,646 Let's rerun Telnet, GET this now, HTTP, oops, enter. 1541 01:03:41,646 --> 01:03:47,816 Ah, look at that, facebook.com's HTML. 1542 01:03:48,096 --> 01:03:50,526 So this is the mess that your website, 1543 01:03:50,526 --> 01:03:53,446 that your web browser would download and let's just try 1544 01:03:53,446 --> 01:03:55,996 and scroll up, right, like so Mark's really made a mess 1545 01:03:55,996 --> 01:03:56,826 of their website here. 1546 01:03:57,296 --> 01:03:58,916 So look at all this stuff. 1547 01:03:59,856 --> 01:04:01,866 Look, OK. Let's find something familiar. 1548 01:04:01,866 --> 01:04:02,846 Oh my god, OK. 1549 01:04:03,001 --> 01:04:05,001 [ laughter ] 1550 01:04:05,156 --> 01:04:07,326 But notice, so this is actually JavaScript. 1551 01:04:07,326 --> 01:04:08,746 So this is intellectual property. 1552 01:04:08,746 --> 01:04:10,216 Because Facebook is pretty sophisticated 1553 01:04:10,216 --> 01:04:12,486 with what it does client side, all the fancy animation 1554 01:04:12,486 --> 01:04:15,306 and all that, but they don't really want CS50 students 1555 01:04:15,306 --> 01:04:17,626 or anyone out there, just copying their website, 1556 01:04:17,626 --> 01:04:19,226 as has actually been done in other countries. 1557 01:04:19,546 --> 01:04:21,436 And taking their code. 1558 01:04:21,656 --> 01:04:23,886 So this actually is JavaScript code 1559 01:04:23,886 --> 01:04:26,546 but it's been quote unquote obfuscated or compressed.
1560 01:04:26,816 --> 01:04:30,076 So there are ways of not turning your JavaScript into 0s 1561 01:04:30,126 --> 01:04:32,486 and 1s, which is a lot harder to reverse engineer, 1562 01:04:32,776 --> 01:04:35,126 but you change all the variables to funky names 1563 01:04:35,126 --> 01:04:37,146 so that it's really confusing what they represent. 1564 01:04:37,326 --> 01:04:39,646 You eliminate obviously almost all the whitespace. 1565 01:04:39,676 --> 01:04:41,496 There's no newlines, there's no indentation. 1566 01:04:41,776 --> 01:04:44,696 Now frankly, this is only a slight measure of protection. 1567 01:04:45,036 --> 01:04:47,556 So you can obfuscate your code in this way, 1568 01:04:47,856 --> 01:04:50,636 but frankly a smart person who really knows their stuff 1569 01:04:50,636 --> 01:04:53,436 could certainly figure out what this code is doing. 1570 01:04:53,436 --> 01:04:55,596 And there's plenty of tools that will convert this back 1571 01:04:55,926 --> 01:04:57,566 to cleaner JavaScript, 1572 01:04:57,616 --> 01:04:59,596 even though they can't recover the variable names. 1573 01:04:59,896 --> 01:05:01,196 But this is the trade-off here. 1574 01:05:01,276 --> 01:05:04,346 If you're sort of smart enough and adroit enough 1575 01:05:04,346 --> 01:05:06,706 with JavaScript to figure out what this is doing, 1576 01:05:07,116 --> 01:05:09,956 odds are you could reimplement Facebook in less time, 1577 01:05:09,956 --> 01:05:11,266 just by starting from scratch. 1578 01:05:11,266 --> 01:05:11,976 So it's a trade-off. 1579 01:05:11,976 --> 01:05:13,146 It just raises the bar. 1580 01:05:13,146 --> 01:05:14,886 But this is another matter of security. 1581 01:05:15,196 --> 01:05:16,006 Ah, here we go. 1582 01:05:16,006 --> 01:05:17,316 Finally. So it looks 1583 01:05:17,316 --> 01:05:20,636 like Facebook is using XHTML 1.0 Strict, 1584 01:05:20,736 --> 01:05:22,106 which is a certain flavor of it.
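To make the idea of obfuscation concrete, here is a small made-up example, not taken from Facebook's code: the same function written readably and then by hand in the compressed style described above, with meaningless names and no whitespace. Both behave identically, which is why the technique only raises the bar rather than protecting the code.

```javascript
// Readable version: the names and layout explain the intent.
function passwordsMatch(firstPassword, secondPassword) {
    if (firstPassword === secondPassword) {
        return true;
    }
    return false;
}

// Hand-obfuscated equivalent: same behavior, short names, no whitespace.
function a(b,c){return b===c}

// Both versions give the same answer for the same inputs.
console.log(passwordsMatch("12345", "12345"), a("12345", "12345"));
console.log(passwordsMatch("12345", "54321"), a("12345", "54321"));
```

A formatting tool can restore the indentation of the second version, but nothing can recover that b and c were once called firstPassword and secondPassword.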
1585 01:05:22,386 --> 01:05:24,466 Here is the tag we've been telling you guys to use. 1586 01:05:24,466 --> 01:05:25,786 There's a couple more things there. 1587 01:05:26,016 --> 01:05:26,726 Here is the head. 1588 01:05:26,726 --> 01:05:27,906 There's some meta tags here 1589 01:05:27,906 --> 01:05:29,036 which were tags we don't care about. 1590 01:05:29,036 --> 01:05:30,486 Oh, here's actually today's topic. 1591 01:05:30,776 --> 01:05:33,246 So embedded in Facebook.com is all of this JavaScript 1592 01:05:33,476 --> 01:05:36,366 So long story short, what we sort of take for granted 1593 01:05:36,366 --> 01:05:38,256 because we're pointing and clicking all the time 1594 01:05:38,256 --> 01:05:40,936 with browser's these days is all very low level and all 1595 01:05:40,936 --> 01:05:42,636 of these details can be interesting 1596 01:05:42,756 --> 01:05:45,206 and can be again exploited if you know how 1597 01:05:45,206 --> 01:05:47,196 to send inputs to the server. 1598 01:05:47,486 --> 01:05:49,516 So let's see if we can't be a little more clever here. 1599 01:05:49,516 --> 01:05:51,926 That was form1.html, we saw form2, 1600 01:05:52,176 --> 01:05:53,606 let's take a look at form3. 1601 01:05:53,886 --> 01:05:57,066 So form3 looks like this, in source code. 1602 01:05:57,226 --> 01:05:58,156 This is the pretty code. 1603 01:05:58,686 --> 01:06:02,676 The only thing I did differently is actually I'm going 1604 01:06:02,676 --> 01:06:04,066 to skip back because it's not that interesting. 1605 01:06:04,146 --> 01:06:04,796 Let's look at 4. 1606 01:06:05,526 --> 01:06:08,386 I veto 3. All right, so form 4. 1607 01:06:08,676 --> 01:06:10,336 So the difference in form 4 is 1608 01:06:10,366 --> 01:06:11,976 that something just got grayed out, 1609 01:06:12,086 --> 01:06:13,766 even though it's a little subtle on the screen. 1610 01:06:14,186 --> 01:06:17,626 Yes, so this is kind of annoying. 1611 01:06:17,776 --> 01:06:18,816 Can't even submit. 
1612 01:06:18,816 --> 01:06:20,356 So right, this really annoys the user. 1613 01:06:20,356 --> 01:06:21,186 Oh but wait a minute. 1614 01:06:21,496 --> 01:06:23,526 Oh I checked that box and now I can submit. 1615 01:06:23,526 --> 01:06:24,666 So how can you do things like this? 1616 01:06:24,756 --> 01:06:27,066 Well let's introduce one more little building block 1617 01:06:27,376 --> 01:06:28,726 and then see where we can go with this. 1618 01:06:28,726 --> 01:06:29,946 So this is form 4. 1619 01:06:30,226 --> 01:06:32,156 It turns out that this is the XHTML. 1620 01:06:32,376 --> 01:06:34,176 It's pretty much copy and paste as before 1621 01:06:34,396 --> 01:06:37,586 and notice again here I am calling return validate 1622 01:06:37,846 --> 01:06:40,876 on submission, but there's this other one here and it's wrapping 1623 01:06:40,876 --> 01:06:42,106 because my font is a little big, 1624 01:06:42,376 --> 01:06:45,416 but notice you can also have an event handler as it's called, 1625 01:06:45,416 --> 01:06:48,076 a special attribute, that actually responds not 1626 01:06:48,076 --> 01:06:51,066 to form submission, but to clicking. 1627 01:06:51,296 --> 01:06:55,246 So this says, when the user clicks this check box, go ahead 1628 01:06:55,246 --> 01:06:56,786 and call a function called toggle. 1629 01:06:57,036 --> 01:06:57,996 Well what does toggle do? 1630 01:06:57,996 --> 01:06:59,396 Well let's scroll back up to the top. 1631 01:06:59,946 --> 01:07:00,966 There's a validate function. 1632 01:07:00,966 --> 01:07:02,906 We're going to ignore that for now, but here's toggle. 1633 01:07:03,276 --> 01:07:03,976 Interesting. 1634 01:07:03,976 --> 01:07:06,436 So again, pretty robust, but if document. 1635 01:07:06,646 --> 01:07:07,836 forms. registration. 
1636 01:07:07,836 --> 01:07:11,686 button, that's the name I gave to the submit button, .disabled, 1637 01:07:11,806 --> 01:07:14,046 so if that is true go ahead 1638 01:07:14,046 --> 01:07:17,296 and make it false else go ahead and make it true. 1639 01:07:17,506 --> 01:07:20,996 So it turns out that some form elements have properties, not 1640 01:07:20,996 --> 01:07:23,576 just value, which we used before for validation, 1641 01:07:23,826 --> 01:07:26,446 they also have Boolean properties like disabled. 1642 01:07:26,676 --> 01:07:29,216 If disabled is true, it means you cannot click 1643 01:07:29,216 --> 01:07:30,046 on that form field. 1644 01:07:30,046 --> 01:07:32,696 You cannot fill out that box because it's disabled. 1645 01:07:32,916 --> 01:07:34,436 So we can toggle it just 1646 01:07:34,436 --> 01:07:38,076 by reassigning it a different truth value, false or true. 1647 01:07:38,356 --> 01:07:40,376 So again, a tiny, little building block there, 1648 01:07:40,586 --> 01:07:42,936 but lets us create again a more robust 1649 01:07:42,936 --> 01:07:44,346 or more seamless interface. 1650 01:07:44,736 --> 01:07:47,116 So why does this then get interesting 1651 01:07:47,116 --> 01:07:49,456 that we can do things client side? 1652 01:07:49,696 --> 01:07:51,306 Well this was just form validation. 1653 01:07:51,306 --> 01:07:54,056 That's kind of an exercise in JavaScript syntax. 1654 01:07:54,056 --> 01:07:54,856 Nothing more than that. 1655 01:07:55,266 --> 01:07:57,166 Well it turns out we can use JavaScript 1656 01:07:57,166 --> 01:08:00,106 to also go fetch more information à la Google Maps. 1657 01:08:00,466 --> 01:08:01,686 So let's go ahead and type 1658 01:08:01,686 --> 01:08:05,726 in here our favorite stock quote like GOOG, and click get quote.
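The toggle function described a moment ago, which flips the disabled property each time the checkbox is clicked, might look roughly like this. It is a sketch under assumptions: the form is passed in as a parameter so the logic can be tried outside a browser, and button is taken to be the name of the element that gets grayed out.

```javascript
// Sketch of toggle: flip the Boolean "disabled" property of the form
// element named "button". "form" stands in for document.forms.registration.
function toggle(form) {
    if (form.button.disabled) {
        // Currently grayed out, so make it clickable.
        form.button.disabled = false;
    } else {
        // Currently clickable, so gray it out.
        form.button.disabled = true;
    }
}
```

In a page, the checkbox would call it from its click handler, for example onclick="toggle(document.forms.registration);". The body could also be shortened to a single assignment of the negated value, but the if/else form mirrors the explanation above.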
1659 01:08:05,936 --> 01:08:07,906 And every previous lecture and example, 1660 01:08:07,996 --> 01:08:10,436 this form would get submitted to a php file 1661 01:08:10,646 --> 01:08:12,636 and present the result and I'd see the answer. 1662 01:08:12,856 --> 01:08:14,956 Let me go ahead and click get quote. 1663 01:08:15,056 --> 01:08:16,206 Oh, a pop up. 1664 01:08:16,386 --> 01:08:18,606 So still not all that impressive, 1665 01:08:18,606 --> 01:08:20,136 so let's just look at version 2. 1666 01:08:20,426 --> 01:08:21,596 As a little teaser here. 1667 01:08:21,826 --> 01:08:24,166 Now OK, price to be determined. 1668 01:08:24,166 --> 01:08:26,406 Now again, I'm cutting corners on aesthetics but GOOG, 1669 01:08:26,406 --> 01:08:29,646 let me go down here click, get quote. 1670 01:08:29,646 --> 01:08:32,136 And now I'm updating the page dynamically 1671 01:08:32,136 --> 01:08:33,776 and my url has not changed. 1672 01:08:33,776 --> 01:08:36,096 And it's this very basic building block 1673 01:08:36,096 --> 01:08:39,146 that really drives the fanciest, sexiest of websites today, 1674 01:08:39,386 --> 01:08:41,326 whether it's maps or whether it's Facebook 1675 01:08:41,326 --> 01:08:42,846 or most any other site you visit. 1676 01:08:42,846 --> 01:08:48,856 So more on this on Wednesday.