1 00:00:00,000 --> 00:00:03,920 >> [MUSIC PLAYING] 2 00:00:03,920 --> 00:00:11,760 3 00:00:11,760 --> 00:00:13,800 >> DAVID J. MALAN: All right. 4 00:00:13,800 --> 00:00:15,640 This is CS50. 5 00:00:15,640 --> 00:00:17,620 This is the start of week eight. 6 00:00:17,620 --> 00:00:22,440 And you may recall that we left off last time looking at a new language 7 00:00:22,440 --> 00:00:23,240 altogether. 8 00:00:23,240 --> 00:00:25,905 In fact, one that's fairly lightweight and it's not even 9 00:00:25,905 --> 00:00:26,940 a programming language. 10 00:00:26,940 --> 00:00:31,100 It's a markup language that lets us actually structure and make web pages. 11 00:00:31,100 --> 00:00:33,350 And you use something else in conjunction with this-- 12 00:00:33,350 --> 00:00:35,670 or you soon will, if you haven't already. 13 00:00:35,670 --> 00:00:38,530 We're going to use Cascading Style Sheets, or CSS, which 14 00:00:38,530 --> 00:00:40,971 is another type of language with properties and values 15 00:00:40,971 --> 00:00:43,220 that's going to let us do things like change the color 16 00:00:43,220 --> 00:00:46,010 and change the position and these kinds of tweaks. 17 00:00:46,010 --> 00:00:49,940 But today and onward, we start to focus on more powerful languages, 18 00:00:49,940 --> 00:00:52,810 actual programming languages like PHP. 19 00:00:52,810 --> 00:00:54,880 >> So PHP has been around for some time. 20 00:00:54,880 --> 00:00:56,810 And as you'll see, it was designed primarily 21 00:00:56,810 --> 00:01:00,280 early on for actual use in web development 22 00:01:00,280 --> 00:01:02,360 and actually generating web pages. 23 00:01:02,360 --> 00:01:04,849 So what kinds of features does a language 24 00:01:04,849 --> 00:01:10,040 need in order to make web pages dynamically with it? 
25 00:01:10,040 --> 00:01:14,760 >> In other words, if you want to generate content dynamically-- like Facebook's 26 00:01:14,760 --> 00:01:19,480 Newsfeed, which changes constantly, or instant messages that pop up from time 27 00:01:19,480 --> 00:01:21,872 to time-- like what's the key piece of functionality 28 00:01:21,872 --> 00:01:24,580 you need in a programming language that would let you dynamically 29 00:01:24,580 --> 00:01:28,070 print new information to the screen? 30 00:01:28,070 --> 00:01:28,685 >> STUDENT: Code. 31 00:01:28,685 --> 00:01:29,560 DAVID J. MALAN: Code. 32 00:01:29,560 --> 00:01:30,440 OK. 33 00:01:30,440 --> 00:01:31,995 We'll take that. 34 00:01:31,995 --> 00:01:35,310 A little more precise. 35 00:01:35,310 --> 00:01:37,639 I mean, we could do this with C, frankly. 36 00:01:37,639 --> 00:01:38,930 It would be a pain in the neck. 37 00:01:38,930 --> 00:01:41,045 But-- is this commitment? 38 00:01:41,045 --> 00:01:41,895 >> STUDENT: Yeah. 39 00:01:41,895 --> 00:01:42,677 Variables, maybe? 40 00:01:42,677 --> 00:01:43,760 DAVID J. MALAN: Variables. 41 00:01:43,760 --> 00:01:44,160 OK, sure. 42 00:01:44,160 --> 00:01:45,740 Variables can certainly help us out. 43 00:01:45,740 --> 00:01:47,020 And even something simpler. 44 00:01:47,020 --> 00:01:50,640 We used it in the very first program of the very first day 45 00:01:50,640 --> 00:01:55,686 when we actually said "hello world." 46 00:01:55,686 --> 00:01:56,570 >> STUDENT: Print. 47 00:01:56,570 --> 00:01:57,778 >> DAVID J. MALAN: Print, right? 48 00:01:57,778 --> 00:02:01,050 Print, or printf in the world of C. So all this time, 49 00:02:01,050 --> 00:02:03,362 we've had at our disposal a language-- C, 50 00:02:03,362 --> 00:02:05,570 in particular-- and even Scratch for that matter that 51 00:02:05,570 --> 00:02:07,400 can generate strings of text. 
52 00:02:07,400 --> 00:02:11,090 >> Well, if HTML, as we saw last week, is just a whole bunch of strings of text 53 00:02:11,090 --> 00:02:14,692 albeit with open brackets and closed brackets and some kind of rhyme 54 00:02:14,692 --> 00:02:16,650 and reason behind it, well then we could really 55 00:02:16,650 --> 00:02:20,440 start generating web pages either manually by typing them out in gedit 56 00:02:20,440 --> 00:02:23,870 or in Microsoft Word, for that matter-- we just need a text editor. 57 00:02:23,870 --> 00:02:26,830 >> Or we could write code, to your suggestion 58 00:02:26,830 --> 00:02:30,435 earlier, that would let us dynamically generate HTML, 59 00:02:30,435 --> 00:02:32,560 and that's what we're going to start doing with PHP 60 00:02:32,560 --> 00:02:34,900 and ultimately even with a language called JavaScript, 61 00:02:34,900 --> 00:02:37,910 is use one language to generate another. 62 00:02:37,910 --> 00:02:40,720 And indeed, this is what Facebook and many, many other sites 63 00:02:40,720 --> 00:02:44,530 do to actually dynamically display new information to you. 64 00:02:44,530 --> 00:02:47,117 >> So let's begin with this-- a cryptic looking line, but one 65 00:02:47,117 --> 00:02:48,450 that's actually pretty powerful. 66 00:02:48,450 --> 00:02:51,210 Thus far, we've been using C, which is a compiled language. 67 00:02:51,210 --> 00:02:55,050 And just a quick recap-- a compiled language has what characteristic? 68 00:02:55,050 --> 00:02:59,050 You obviously need to compile it, but what does that mean? 69 00:02:59,050 --> 00:03:00,505 Yeah? 70 00:03:00,505 --> 00:03:02,940 >> STUDENT: It needs to be assembled into machine code. 71 00:03:02,940 --> 00:03:03,060 >> DAVID J. MALAN: OK. 72 00:03:03,060 --> 00:03:04,530 It needs to be assembled into machine code. 73 00:03:04,530 --> 00:03:07,340 So you take your source code, which is sort of English-like. 
74 00:03:07,340 --> 00:03:09,270 You convert that to something lower level, 75 00:03:09,270 --> 00:03:11,590 which is ultimately called object code-- 0's and 1's. 76 00:03:11,590 --> 00:03:14,830 And it's those 0's and 1's that a CPU, like those made by Intel, 77 00:03:14,830 --> 00:03:16,110 actually understands. 78 00:03:16,110 --> 00:03:19,690 >> Now, PHP and Python and Ruby and JavaScript and bunches of other 79 00:03:19,690 --> 00:03:23,190 languages are not compiled languages but interpreted languages, 80 00:03:23,190 --> 00:03:26,630 which means you just type them and then you don't turn them into 0's and 1's. 81 00:03:26,630 --> 00:03:30,790 You instead just provide them as input to someone else's program, 82 00:03:30,790 --> 00:03:32,080 called an interpreter. 83 00:03:32,080 --> 00:03:34,460 And that person's program has been designed 84 00:03:34,460 --> 00:03:38,280 to understand what each and every symbol in Python or PHP 85 00:03:38,280 --> 00:03:42,650 or Ruby or any number of other languages means. 86 00:03:42,650 --> 00:03:44,760 >> And so all we need is something like this. 87 00:03:44,760 --> 00:03:46,350 So in fact, I'm going to go over to the appliance 88 00:03:46,350 --> 00:03:48,100 here, just into any old window, and we're 89 00:03:48,100 --> 00:03:52,580 going to go ahead and open a file called, say, hello. 90 00:03:52,580 --> 00:03:55,780 Now previously, I might have saved this even with a file extension, 91 00:03:55,780 --> 00:03:57,910 but I'm going to do something even simpler here. 92 00:03:57,910 --> 00:04:02,450 I'm going to go ahead and start this file with this cryptic syntax. 93 00:04:02,450 --> 00:04:06,310 So "usr, bin, env, for environment, php." 
94 00:04:06,310 --> 00:04:10,670 >> This is simply one line of code that's going to tell my operating system, 95 00:04:10,670 --> 00:04:13,730 go find in your local environment whatever that is, 96 00:04:13,730 --> 00:04:18,149 wherever PHP is-- the interpreter-- and go ahead and use that interpreter 97 00:04:18,149 --> 00:04:20,589 to interpret the following code. 98 00:04:20,589 --> 00:04:22,760 Now, this is kind of an ugly feature of PHP. 99 00:04:22,760 --> 00:04:24,980 But in this language, any time you write PHP code, 100 00:04:24,980 --> 00:04:29,200 you need to have one of these ugly PHP tags demarcating the beginning 101 00:04:29,200 --> 00:04:32,220 of your code-- <?php. 102 00:04:32,220 --> 00:04:37,430 >> But below here, I can now do something quite simple, like printf hello comma 103 00:04:37,430 --> 00:04:40,922 world backslash n close quote, close parenthesis. 104 00:04:40,922 --> 00:04:42,630 And then just for good measure, I'm going 105 00:04:42,630 --> 00:04:45,380 to go ahead and close my php tag over here 106 00:04:45,380 --> 00:04:47,390 so that everything looks nicely pretty printed. 107 00:04:47,390 --> 00:04:50,780 >> And as soon as I click Save, gedit is actually smart enough 108 00:04:50,780 --> 00:04:54,620 to look at that very first line and realize, oh, you're writing PHP code. 109 00:04:54,620 --> 00:04:56,710 Let me syntax highlight it with the colors 110 00:04:56,710 --> 00:04:58,690 here so that it stands out a little more. 111 00:04:58,690 --> 00:05:01,300 But now I'm going to go down to my terminal window. 112 00:05:01,300 --> 00:05:02,340 I'll zoom in. 113 00:05:02,340 --> 00:05:06,860 >> This program was called "hello," so I'm going to do dot slash hello, 114 00:05:06,860 --> 00:05:07,990 but permission denied. 115 00:05:07,990 --> 00:05:08,490 And bash. 116 00:05:08,490 --> 00:05:10,610 We actually heard of that thing a couple weeks ago 117 00:05:10,610 --> 00:05:13,140 in the context of Shellshock, one of those bugs. 
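The file described above can be sketched roughly as follows. This is a reconstruction from the spoken description, not a copy of the lecture's actual file: the first line is the `#!/usr/bin/env php` line read aloud as "usr, bin, env, php," and the closing `?>` tag that the lecture adds "for good measure" is left out here, since it is optional when a file contains only PHP.

```php
#!/usr/bin/env php
<?php
    // The PHP command-line interpreter skips the #! line above;
    // everything after the opening tag is interpreted as PHP.
    printf("hello, world\n");
```

Saved as `hello`, this can be run either as `php hello` or, once the file has been made executable, as `./hello`.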
118 00:05:13,140 --> 00:05:16,240 >> But permission denied we've seen before, maybe in a different context. 119 00:05:16,240 --> 00:05:19,060 Does anyone recall how you might fix something 120 00:05:19,060 --> 00:05:22,100 where permission is denied like this? 121 00:05:22,100 --> 00:05:23,490 What's the command, at least? 122 00:05:23,490 --> 00:05:24,159 >> STUDENT: Chmod. 123 00:05:24,159 --> 00:05:26,700 DAVID J. MALAN: Yeah, chmod, for changing the mode of a file. 124 00:05:26,700 --> 00:05:30,171 And you'll get all the more used to this next week with a subsequent problem 125 00:05:30,171 --> 00:05:30,670 set. 126 00:05:30,670 --> 00:05:33,211 But for now, I'm going to change the mode not to be readable, 127 00:05:33,211 --> 00:05:36,650 but to give everyone executability privileges, the ability 128 00:05:36,650 --> 00:05:37,710 to run this file. 129 00:05:37,710 --> 00:05:40,360 And I'm going to assign that to the file hello. 130 00:05:40,360 --> 00:05:45,150 >> If I now do dot slash hello enter, you see, in fact, my program, hello world. 131 00:05:45,150 --> 00:05:48,760 And what step did I clearly skip altogether? 132 00:05:48,760 --> 00:05:49,520 Compiling. 133 00:05:49,520 --> 00:05:51,680 So I just ran this program quite simply. 134 00:05:51,680 --> 00:05:55,690 >> And it turns out you can do this with a lot of syntax reminiscent of C. 135 00:05:55,690 --> 00:06:03,400 Let me go into today's code, which I put into my vhost directory 136 00:06:03,400 --> 00:06:05,250 here, for real reasons we'll come back to. 137 00:06:05,250 --> 00:06:09,350 And I'm going to go into, let's say, conditions 1. 138 00:06:09,350 --> 00:06:12,450 >> And you'll see here, first and foremost, a whole bunch of comments. 
139 00:06:12,450 --> 00:06:15,240 But this is actually a re-creation in PHP 140 00:06:15,240 --> 00:06:18,960 of a program we did in week one called conditions 1.c 141 00:06:18,960 --> 00:06:20,690 where the purpose in life of this program 142 00:06:20,690 --> 00:06:22,950 is apparently to ask the user for an integer 143 00:06:22,950 --> 00:06:25,270 and then do some fluffy analysis on it whereby 144 00:06:25,270 --> 00:06:29,510 you say if it's positive or negative or equal to zero. 145 00:06:29,510 --> 00:06:34,220 And I bring this up only because, except for maybe one little detail, 146 00:06:34,220 --> 00:06:37,150 it's indistinguishable so far from C. 147 00:06:37,150 --> 00:06:39,930 >> What's the one characteristic here that maybe jumps out 148 00:06:39,930 --> 00:06:41,410 at you as a little different? 149 00:06:41,410 --> 00:06:42,160 Maybe two things. 150 00:06:42,160 --> 00:06:42,660 Yeah? 151 00:06:42,660 --> 00:06:44,070 >> STUDENT: Dollar sign n? 152 00:06:44,070 --> 00:06:44,944 >> DAVID J. MALAN: Yeah. 153 00:06:44,944 --> 00:06:46,210 So dollar sign n is present. 154 00:06:46,210 --> 00:06:48,120 And dollar signs, as we'll see, are going 155 00:06:48,120 --> 00:06:51,460 to be affixed to the beginning of any variable in PHP. 156 00:06:51,460 --> 00:06:54,250 It's both good and bad-- good in that it's sort of obvious what's 157 00:06:54,250 --> 00:06:56,797 a variable, bad in that it's yet another thing to type. 158 00:06:56,797 --> 00:06:58,630 And there's one other thing we haven't quite 159 00:06:58,630 --> 00:07:00,876 seen, at least by this spelling. Yeah? 160 00:07:00,876 --> 00:07:01,630 >> STUDENT: Readline. 161 00:07:01,630 --> 00:07:02,671 >> DAVID J. MALAN: Readline. 162 00:07:02,671 --> 00:07:06,550 Readline we didn't see, per se, in C, even though there exists something 163 00:07:06,550 --> 00:07:09,530 similar, but we've used GetString, and this is its counterpart. 
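A minimal sketch of the program being described, under some assumptions: the function name `classify` and the exact message wording are illustrative and may not match the lecture's file. The `readline` call is shown only in a comment so that the sketch runs without waiting for keyboard input; it also requires PHP's readline extension.

```php
<?php
// Hypothetical helper: reports whether an integer is positive, zero, or negative.
// Note the dollar sign on every variable and the absence of type declarations.
function classify($n)
{
    if ($n > 0) {
        return "You picked a positive number!";
    } else if ($n == 0) {
        return "You picked zero!";
    } else {
        return "You picked a negative number!";
    }
}

// Interactive use, as in the lecture (needs the readline extension):
//   $n = (int) readline("I'd like an integer please: ");
// Fixed sample values are used here instead.
foreach ([50, 0, -7] as $n) {
    printf("%d: %s\n", $n, classify($n));
}
```

Apart from `$n` and `readline`, the control flow reads the same as it would in C.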
164 00:07:09,530 --> 00:07:12,950 So if I go into this directory, which happens to be, 165 00:07:12,950 --> 00:07:18,030 as I'll explain in a bit in my vhost directory and my source A directory, 166 00:07:18,030 --> 00:07:22,730 and I go ahead and do dot slash conditions-- whoops-- dot slash 167 00:07:22,730 --> 00:07:26,710 conditions 1, you'll see, again, the same issue-- permission denied. 168 00:07:26,710 --> 00:07:33,610 >> So let me zoom in and do chmod a plus x on conditions, dot slash conditions. 169 00:07:33,610 --> 00:07:35,222 I'd like an integer, please, 50. 170 00:07:35,222 --> 00:07:36,930 And we could play this game all day long. 171 00:07:36,930 --> 00:07:39,140 It's going to behave exactly as it did in week one. 172 00:07:39,140 --> 00:07:42,860 >> OK so not all that different, except not just that slight bit of syntax, 173 00:07:42,860 --> 00:07:45,490 but at the top, I again had this line which 174 00:07:45,490 --> 00:07:49,760 allowed me to create something that looks like a C program called hello, 175 00:07:49,760 --> 00:07:51,150 called conditions 1. 176 00:07:51,150 --> 00:07:54,520 But it's not 0's and ones I'm executing directly. 177 00:07:54,520 --> 00:07:57,620 It's instead running this interpreter whose name 178 00:07:57,620 --> 00:07:59,440 happens to be identical to the language. 179 00:07:59,440 --> 00:08:04,970 The program is called PHP, and my code below line one is being passed into it. 180 00:08:04,970 --> 00:08:07,740 >> We can do another fairly simple example reminiscent of something 181 00:08:07,740 --> 00:08:09,240 we did weeks ago. 182 00:08:09,240 --> 00:08:12,020 Again, this is a sort of arbitrary chunk of code 183 00:08:12,020 --> 00:08:14,000 that apparently does what when you run it? 184 00:08:14,000 --> 00:08:15,625 What's this going to print, presumably? 185 00:08:15,625 --> 00:08:23,540 186 00:08:23,540 --> 00:08:28,250 >> So initially on line 16, it's going to say x is now 2, probably. 
187 00:08:28,250 --> 00:08:30,920 %d is the same as %i for printf. 188 00:08:30,920 --> 00:08:33,460 So then it's cubing, dot, dot, dot, in line 17. 189 00:08:33,460 --> 00:08:36,299 And then line 18 appears to call a function Cubed. 190 00:08:36,299 --> 00:08:37,600 And where is Cubed defined? 191 00:08:37,600 --> 00:08:40,319 >> Well, it looks like in line 25, so that's not all that different. 192 00:08:40,319 --> 00:08:42,610 I've got some comments above it, but for the most part, 193 00:08:42,610 --> 00:08:45,370 it's a fairly straightforward porting or conversion 194 00:08:45,370 --> 00:08:48,470 from the C program to the PHP version. 195 00:08:48,470 --> 00:08:52,670 But there are now a couple of differences that maybe should jump out. 196 00:08:52,670 --> 00:08:56,100 What else is different about how you might write this same program in C? 197 00:08:56,100 --> 00:08:57,900 >> STUDENT: [INAUDIBLE]. 198 00:08:57,900 --> 00:09:00,070 >> DAVID J. MALAN: There's no prototype up top. 199 00:09:00,070 --> 00:09:03,210 So PHP-- and frankly, a lot of modern languages-- 200 00:09:03,210 --> 00:09:06,920 are a lot smarter and more helpful than C compilers in that you 201 00:09:06,920 --> 00:09:09,740 can put the function up here, you can put a function down here, 202 00:09:09,740 --> 00:09:12,740 and the interpreter is going to do you the favor of reading 203 00:09:12,740 --> 00:09:16,010 the whole file before it decides that some function doesn't exist. 204 00:09:16,010 --> 00:09:17,970 So nice improvements years later. 205 00:09:17,970 --> 00:09:22,126 But there's also something else different or absent here. 206 00:09:22,126 --> 00:09:22,626 Yeah? 207 00:09:22,626 --> 00:09:25,084 >> STUDENT: [INAUDIBLE]. 208 00:09:25,084 --> 00:09:27,750 DAVID J. 
MALAN: We don't have to declare the types of variables, 209 00:09:27,750 --> 00:09:31,780 so we'll see before long that there are different types in PHP, 210 00:09:31,780 --> 00:09:34,970 but you don't need to specify them, which also is both good and bad. 211 00:09:34,970 --> 00:09:36,623 And there's one other thing missing. 212 00:09:36,623 --> 00:09:37,430 >> STUDENT: There's no libraries. 213 00:09:37,430 --> 00:09:38,630 >> DAVID J. MALAN: There is no libraries. 214 00:09:38,630 --> 00:09:39,350 OK, so that's nice. 215 00:09:39,350 --> 00:09:40,540 We get a lot more out of the box. 216 00:09:40,540 --> 00:09:43,373 So there's actually a lot more things than I thought were different. 217 00:09:43,373 --> 00:09:44,350 How about way in back? 218 00:09:44,350 --> 00:09:46,032 What's that? 219 00:09:46,032 --> 00:09:46,740 Say it once more? 220 00:09:46,740 --> 00:09:47,960 >> STUDENT: Pointer. 221 00:09:47,960 --> 00:09:49,270 >> DAVID J. MALAN: No pointers. 222 00:09:49,270 --> 00:09:51,280 OK, at least in this example, no. 223 00:09:51,280 --> 00:09:52,070 That's fair. 224 00:09:52,070 --> 00:09:55,090 So there are not pointers in PHP actually in general. 225 00:09:55,090 --> 00:09:58,730 There are something called references, but we won't spend too much time there. 226 00:09:58,730 --> 00:09:59,520 And what else? 227 00:09:59,520 --> 00:10:00,185 >> STUDENT: Main. 228 00:10:00,185 --> 00:10:01,060 DAVID J. MALAN: Main. 229 00:10:01,060 --> 00:10:02,768 So this was the biggie I was thinking of. 230 00:10:02,768 --> 00:10:04,660 Notice there's no main entry point. 231 00:10:04,660 --> 00:10:06,525 You simply start writing your code. 232 00:10:06,525 --> 00:10:08,400 And this is actually going to be advantageous 233 00:10:08,400 --> 00:10:10,560 when we transition momentarily to actually using 234 00:10:10,560 --> 00:10:13,980 this same language for web-based programming, for which we 235 00:10:13,980 --> 00:10:16,580 don't want to have just one entry point. 
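The differences listed above (no prototype, no declared types, no included libraries, no `main`) can be seen together in a short sketch. The function name `cube` is an assumption; the transcript only gives it as "Cubed," and the line numbers mentioned in the lecture will not match this version.

```php
<?php
// Execution simply starts at the top of the file; there is no main().
$x = 2;
printf("x is now %d\n", $x);
printf("Cubing...\n");

// cube() is called here even though it is defined further down:
// PHP reads the whole file before running it, so no prototype is needed.
$x = cube($x);
printf("x is now %d\n", $x);

// No return type and no parameter type have to be declared.
function cube($n)
{
    return $n * $n * $n;
}
```

The same program in C would need an `#include`, a prototype for the function, typed variables, and a `main` function wrapped around the first few lines.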
236 00:10:16,580 --> 00:10:19,980 We might want to have a bunch of URLs, a bunch of different files 237 00:10:19,980 --> 00:10:22,930 all taking in user input and producing output. 238 00:10:22,930 --> 00:10:27,130 >> But here is the very disheartening example that I promised last time, 239 00:10:27,130 --> 00:10:30,130 namely in this folder here, misspellings. 240 00:10:30,130 --> 00:10:33,680 So in this file speller, which we won't spend too much time on, there is 241 00:10:33,680 --> 00:10:37,870 essentially a porting-- P-O-R-T. It's just the word given when, say, 242 00:10:37,870 --> 00:10:40,800 you convert from one language to another manually usually. 243 00:10:40,800 --> 00:10:45,680 >> This is a porting of the C version of speller from PSET 5. 244 00:10:45,680 --> 00:10:48,856 And I essentially tried to convert it line by line as closely as I can. 245 00:10:48,856 --> 00:10:51,730 So if you like this kind of thing, it actually is worth at some point 246 00:10:51,730 --> 00:10:54,229 pulling them both up side by side and seeing what's the same 247 00:10:54,229 --> 00:10:55,230 and what's different. 248 00:10:55,230 --> 00:10:57,510 But they're pretty darn similar. 249 00:10:57,510 --> 00:11:00,110 If you remember what speller even looked like, 250 00:11:00,110 --> 00:11:02,110 even though you didn't have to change this file, 251 00:11:02,110 --> 00:11:04,860 it's pretty similar structurally with just a couple 252 00:11:04,860 --> 00:11:06,200 of changes here and there. 253 00:11:06,200 --> 00:11:10,140 >> So this is only to say that it's pretty straightforward to convert speller 254 00:11:10,140 --> 00:11:12,000 from C to PHP. 255 00:11:12,000 --> 00:11:15,390 But in dictionary, there's something even more compelling. 256 00:11:15,390 --> 00:11:19,270 Let me go ahead and create my own dictionary.php file. 257 00:11:19,270 --> 00:11:24,010 So slightly different in that we'll call it .php instead of .c. 
258 00:11:24,010 --> 00:11:26,980 Because this is a PHP file, I do-- slightly annoyingly-- have 259 00:11:26,980 --> 00:11:30,132 to start the file with a php tag like that. 260 00:11:30,132 --> 00:11:32,340 And I'm going to go ahead and define a few functions. 261 00:11:32,340 --> 00:11:35,770 Function called check, which is going to take in a word like before. 262 00:11:35,770 --> 00:11:37,520 But this argument's going to have a dollar 263 00:11:37,520 --> 00:11:39,840 sign because we're, again, using PHP. 264 00:11:39,840 --> 00:11:42,350 Another function from dictionary.c was load 265 00:11:42,350 --> 00:11:47,120 and it took in the name of a dictionary, so I'll get that function ready to go. 266 00:11:47,120 --> 00:11:50,920 >> Another one in dictionary.c was what? 267 00:11:50,920 --> 00:11:54,580 Size was one of the nicest ones, at least if you kept some variable around. 268 00:11:54,580 --> 00:11:57,830 So size just has to return a variable. 269 00:11:57,830 --> 00:11:59,090 And then there was unload. 270 00:11:59,090 --> 00:12:02,830 >> So there were these four functions in problem set 5 271 00:12:02,830 --> 00:12:06,770 that you needed to implement with some data structure or structures. 272 00:12:06,770 --> 00:12:10,170 So I promised that in PHP, we can declare 273 00:12:10,170 --> 00:12:14,490 a hash table, for instance, all the more easily. 274 00:12:14,490 --> 00:12:17,377 In fact, if I want a hash table, I'm just going to go like that 275 00:12:17,377 --> 00:12:18,460 and there's my hash table. 276 00:12:18,460 --> 00:12:21,555 And that's the note, disheartening, that we left off on last time. 277 00:12:21,555 --> 00:12:23,930 And you know what, if I wanted a variable for size, well, 278 00:12:23,930 --> 00:12:25,867 this one's not all that different from C, 279 00:12:25,867 --> 00:12:27,450 but I'm going to go ahead and do that. 280 00:12:27,450 --> 00:12:28,630 And notice no data type. 
281 00:12:28,630 --> 00:12:31,180 And I'll go back later and actually add some comments here. 282 00:12:31,180 --> 00:12:32,480 But what about load? 283 00:12:32,480 --> 00:12:35,780 >> If dollar sign dictionary is the name of my file 284 00:12:35,780 --> 00:12:39,600 and I actually want to load words into this table now, 285 00:12:39,600 --> 00:12:42,360 I can actually do something fairly simple. 286 00:12:42,360 --> 00:12:44,880 One-- and this is minorly annoying-- in PHP, 287 00:12:44,880 --> 00:12:47,710 you have to specify inside of a function if you 288 00:12:47,710 --> 00:12:51,060 want to access some global variable that's defined outside. 289 00:12:51,060 --> 00:12:53,530 >> But that's not particularly interesting right now. 290 00:12:53,530 --> 00:12:57,920 What's more interesting is this foreach construct that I mentioned last time. 291 00:12:57,920 --> 00:13:01,880 And it turns out that PHP has a function called file whose purpose in life 292 00:13:01,880 --> 00:13:05,550 is to open a file and read in all of its lines into an array 293 00:13:05,550 --> 00:13:06,840 and hand them back to you. 294 00:13:06,840 --> 00:13:12,170 >> Which is to say I can do dictionary so that now effectively when I call file, 295 00:13:12,170 --> 00:13:15,472 this is going to hand me back an array of words from the file. 296 00:13:15,472 --> 00:13:16,430 It's not all that good. 297 00:13:16,430 --> 00:13:20,130 It's still going to be a line of words, something linear. 298 00:13:20,130 --> 00:13:23,880 But I can go ahead and iterate over each of these words using 299 00:13:23,880 --> 00:13:25,710 that syntax we saw briefly last time. 300 00:13:25,710 --> 00:13:27,940 And you'll see it more in the upcoming PSET. 301 00:13:27,940 --> 00:13:32,070 >> But now I have a loop iterating over each word in the dictionary. 302 00:13:32,070 --> 00:13:36,100 And on each iteration, recall I'm calling the current word "word." 
303 00:13:36,100 --> 00:13:39,790 And all it's going to take to put a word into the dictionary is 304 00:13:39,790 --> 00:13:43,530 going to be word gets "true." 305 00:13:43,530 --> 00:13:44,740 That's my insert function. 306 00:13:44,740 --> 00:13:46,661 That's my load function for my dictionary. 307 00:13:46,661 --> 00:13:49,410 Now it's a bit of a cheat because, you know what, there's actually 308 00:13:49,410 --> 00:13:52,920 backslash n's at the end of the words that I should probably get rid of, 309 00:13:52,920 --> 00:13:56,380 but that's not a problem because PHP has a function called chop which literally 310 00:13:56,380 --> 00:13:58,480 chops off one character at the very end. 311 00:13:58,480 --> 00:13:59,400 So no problem there. 312 00:13:59,400 --> 00:14:02,199 We've gone ahead and actually shortened that to just this. 313 00:14:02,199 --> 00:14:05,240 And now I should probably keep track of size, so let's at least do this-- 314 00:14:05,240 --> 00:14:05,835 size++. 315 00:14:05,835 --> 00:14:07,339 I can do that as before. 316 00:14:07,339 --> 00:14:10,380 And then this is probably going to work just fine, so that's return true. 317 00:14:10,380 --> 00:14:10,930 Done. 318 00:14:10,930 --> 00:14:11,797 PSET 5. 319 00:14:11,797 --> 00:14:13,545 >> [LAUGHTER] 320 00:14:13,545 --> 00:14:14,420 >> DAVID J. MALAN: OK. 321 00:14:14,420 --> 00:14:16,628 We're going to do that again with the next PSET, too. 322 00:14:16,628 --> 00:14:18,730 So what about size? 323 00:14:18,730 --> 00:14:22,080 Well, this one hopefully is about as you would expect last time, 324 00:14:22,080 --> 00:14:24,460 although I have to do this stupid global thing. 325 00:14:24,460 --> 00:14:26,610 It's just an artifact from the language's design. 326 00:14:26,610 --> 00:14:28,450 >> But check is a little more interesting. 327 00:14:28,450 --> 00:14:31,420 So if I passed in dollar sign word, I first 328 00:14:31,420 --> 00:14:34,060 want to have access to that global variable table. 
329 00:14:34,060 --> 00:14:36,700 And now if I want to check if a word is there, 330 00:14:36,700 --> 00:14:44,350 I can simply say if it is true that the following is set in the table, 331 00:14:44,350 --> 00:14:49,957 then go ahead and return true; else, return false. 332 00:14:49,957 --> 00:14:51,180 Done. 333 00:14:51,180 --> 00:14:52,440 The other half of PSET 5. 334 00:14:52,440 --> 00:14:54,540 >> All right, so again, I'm cutting a few corners. 335 00:14:54,540 --> 00:14:56,831 In fairness, I should probably spend a few more seconds 336 00:14:56,831 --> 00:14:58,300 on this implementation. 337 00:14:58,300 --> 00:15:01,860 And I probably shouldn't mock all the hours you put on the PSET so much. 338 00:15:01,860 --> 00:15:04,045 So strtolower is a function. 339 00:15:04,045 --> 00:15:06,670 Something similar exists in C, at least for characters, 340 00:15:06,670 --> 00:15:08,560 but PHP's got a whole string version. 341 00:15:08,560 --> 00:15:11,226 >> That's going to force everything to lowercase, which some of you 342 00:15:11,226 --> 00:15:14,944 might have done to canonicalize what you were putting in your dictionary. 343 00:15:14,944 --> 00:15:16,360 And now you can do this in C, too. 344 00:15:16,360 --> 00:15:17,780 This has nothing to do with PHP. 345 00:15:17,780 --> 00:15:20,260 >> But any time you have a Boolean condition, 346 00:15:20,260 --> 00:15:22,680 like something on line 10 there, which is only 347 00:15:22,680 --> 00:15:27,145 going to evaluate to true or false, and your if else clearly 348 00:15:27,145 --> 00:15:33,620 is returning true or false, I could simply really make this sexier 349 00:15:33,620 --> 00:15:38,360 and just do something like this. 350 00:15:38,360 --> 00:15:40,500 So that there's my check function. 351 00:15:40,500 --> 00:15:42,560 Right, if the Boolean returns a true or a false, 352 00:15:42,560 --> 00:15:44,630 let's just return it straight away. 
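Pulling the pieces described so far into one file gives something like the sketch below. It is reconstructed from the spoken walkthrough, so details may differ from what was typed on screen; the check for `file()` returning `false` is an addition that was not mentioned in the lecture. One detail worth knowing: `chop()` is an alias of `rtrim()`, so it strips all trailing whitespace rather than exactly one character, which for a one-word-per-line dictionary has the same effect.

```php
<?php
// The "hash table" is just a PHP array used associatively (word => true).
$table = [];

// Number of words loaded so far.
$size = 0;

// Returns true if $word is in the dictionary, else false.
function check($word)
{
    global $table;
    // isset() already yields a Boolean, so it can be returned directly.
    return isset($table[strtolower($word)]);
}

// Loads every line of the file named $dictionary into $table.
function load($dictionary)
{
    global $table, $size;
    $lines = file($dictionary);
    if ($lines === false) {
        return false; // added guard: file could not be read
    }
    foreach ($lines as $word) {
        $table[chop($word)] = true; // chop() removes the trailing newline
        $size++;
    }
    return true;
}

// Returns the number of words loaded.
function size()
{
    global $size;
    return $size;
}

// Nothing to free by hand, so this only reports success.
function unload()
{
    return true;
}
```

Calling `load` with the path to a word list, then `check("Hello")`, would return `true` as long as the lowercase word `hello` appears on its own line in that file.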
353 00:15:44,630 --> 00:15:47,340 >> And there's a few other tweaks I could make here and there. 354 00:15:47,340 --> 00:15:51,380 Load-- unload, by the way, that's done. 355 00:15:51,380 --> 00:15:52,850 Nothing to do there. 356 00:15:52,850 --> 00:15:55,840 Since all of the memory in PHP and many other languages 357 00:15:55,840 --> 00:15:57,570 is actually managed for you. 358 00:15:57,570 --> 00:16:00,330 So whereas in C, as you've learned painfully, 359 00:16:00,330 --> 00:16:04,700 anything you malloc or calloc or realloc, you have to free yourself. 360 00:16:04,700 --> 00:16:08,770 Anything you fopen, you have to fclose, so that resources are ultimately freed 361 00:16:08,770 --> 00:16:11,690 and tools like Valgrind don't notice and don't complain, 362 00:16:11,690 --> 00:16:13,570 which is a good thing to run on them. 363 00:16:13,570 --> 00:16:16,190 >> But surely, there must be some catch, right? 364 00:16:16,190 --> 00:16:19,400 Otherwise, we kind of wasted a whole bunch of weeks. 365 00:16:19,400 --> 00:16:23,270 So there's any number of reasons why we sort of take this trajectory, 366 00:16:23,270 --> 00:16:24,440 but there is a trade-off. 367 00:16:24,440 --> 00:16:25,820 Right, this has been thematic. 368 00:16:25,820 --> 00:16:29,690 >> So what might be a trade-off here, moving from C to PHP? 369 00:16:29,690 --> 00:16:33,250 Feels like all win so far other than a bit of ugliness here or there. 370 00:16:33,250 --> 00:16:34,040 Yeah. 371 00:16:34,040 --> 00:16:34,700 What's that? 372 00:16:34,700 --> 00:16:36,064 >> STUDENT: [INAUDIBLE] memory. 373 00:16:36,064 --> 00:16:36,980 DAVID J. MALAN: Speed. 374 00:16:36,980 --> 00:16:37,479 OK. 375 00:16:37,479 --> 00:16:40,720 Well, my speed was pretty fast. 376 00:16:40,720 --> 00:16:42,020 Right? 377 00:16:42,020 --> 00:16:44,320 But speed of execution of the program? 378 00:16:44,320 --> 00:16:45,580 OK, so that's a fair point. 
379 00:16:45,580 --> 00:16:50,930 >> So as it would happen, I in advance cued up both my trie solution, the one 380 00:16:50,930 --> 00:16:53,510 I had on the big board was a trie-based solution, 381 00:16:53,510 --> 00:16:55,510 and I have that in this directory here. 382 00:16:55,510 --> 00:16:58,510 So in a moment, I can go ahead and run this on the King James Bible, 383 00:16:58,510 --> 00:16:59,657 hitting Enter. 384 00:16:59,657 --> 00:17:01,990 And this is hopefully correct implementation at the end, 385 00:17:01,990 --> 00:17:05,109 gives me time in total of 0.38 seconds for that 386 00:17:05,109 --> 00:17:07,270 one somewhat arbitrary example. 387 00:17:07,270 --> 00:17:09,270 >> And if I now go into this second terminal window 388 00:17:09,270 --> 00:17:14,569 here where I first opened gedit, let me go into today's code-- which, again, 389 00:17:14,569 --> 00:17:19,650 is in this directory here-- and let me go ahead and run speller. 390 00:17:19,650 --> 00:17:23,470 So just to be clear, this is the PHP version. 391 00:17:23,470 --> 00:17:25,170 I'm just showing the top of it here. 392 00:17:25,170 --> 00:17:32,020 >> So if I do speller of tilde CS50 PSET 5 texts, King James, enter. 393 00:17:32,020 --> 00:17:39,700 394 00:17:39,700 --> 00:17:43,050 It's still faster than writing it in C, but the total time 395 00:17:43,050 --> 00:17:47,650 is, notice, 0.93, whereas my C-based implementation was 0.38. 396 00:17:47,650 --> 00:17:49,110 So it's a non-trivial difference. 397 00:17:49,110 --> 00:17:51,100 >> And this is just on one file. 398 00:17:51,100 --> 00:17:53,480 If you were to run the two programs versus the big board 399 00:17:53,480 --> 00:17:56,510 and have a whole bunch of inputs tested, this would surely add up. 400 00:17:56,510 --> 00:18:00,310 And if we had even larger data sets, this, too, would add up all the more. 401 00:18:00,310 --> 00:18:04,820 So yes, paying some price of speed is indeed the case. 
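To put those two numbers side by side, and to try a rough measurement of your own, something like the following could be used. The two times are the ones quoted in the lecture; `time_it` is a hypothetical helper built on PHP's `microtime`, and the loop it times is an arbitrary stand-in, not the speller program itself.

```php
<?php
// Total times reported in the lecture for the same text.
$c_seconds = 0.38;
$php_seconds = 0.93;
printf("PHP took about %.1f times as long as C\n", $php_seconds / $c_seconds);

// Hypothetical helper: wall-clock seconds spent running any callable.
function time_it(callable $work)
{
    $start = microtime(true);
    $work();
    return microtime(true) - $start;
}

// Arbitrary workload: insert 100,000 keys into an associative array.
$elapsed = time_it(function () {
    $table = [];
    for ($i = 0; $i < 100000; $i++) {
        $table["word$i"] = true;
    }
});
printf("Inserted 100000 keys in %.3f seconds\n", $elapsed);
```

The ratio works out to roughly 2.4, which is the "non-trivial difference" referred to above; the second figure will vary from machine to machine.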
402 00:18:04,820 --> 00:18:05,470 What else? 403 00:18:05,470 --> 00:18:08,000 404 00:18:08,000 --> 00:18:08,860 Yeah? 405 00:18:08,860 --> 00:18:10,340 >> STUDENT: Amount of RAM use. 406 00:18:10,340 --> 00:18:11,756 >> DAVID J. MALAN: Amount of RAM use. 407 00:18:11,756 --> 00:18:15,380 So I didn't give one second thought when writing this PHP 408 00:18:15,380 --> 00:18:17,300 version as to how much memory I was using. 409 00:18:17,300 --> 00:18:22,080 I'm completely deferring that to PHP itself and whoever wrote that program. 410 00:18:22,080 --> 00:18:24,500 And that might be OK, but if I actually really 411 00:18:24,500 --> 00:18:28,420 care about squeezing as much performance out of my program or out of my website 412 00:18:28,420 --> 00:18:31,150 or out of whatever tool I'm building, maybe 413 00:18:31,150 --> 00:18:33,310 PHP, indeed, is not the right language. 414 00:18:33,310 --> 00:18:36,330 >> And in fact, that is why, for instance, many web servers-- 415 00:18:36,330 --> 00:18:38,980 the actual programs that serve up web content-- 416 00:18:38,980 --> 00:18:41,810 are not written in PHP or in Python or Ruby. 417 00:18:41,810 --> 00:18:44,630 They are written, like you'll now do with PSET 6, 418 00:18:44,630 --> 00:18:48,120 in C so that you can squeeze every bit of performance out of it 419 00:18:48,120 --> 00:18:50,780 and really exercise fine-grained control over what's 420 00:18:50,780 --> 00:18:52,980 going on underneath the hood and not just take 421 00:18:52,980 --> 00:18:54,890 for granted some higher-level data structure. 422 00:18:54,890 --> 00:18:58,071 >> Consider, after all, whoever in PHP implemented 423 00:18:58,071 --> 00:19:00,070 that notion of a hash table-- it's actually more 424 00:19:00,070 --> 00:19:04,260 properly called an associative array-- does he or she have any idea what kind 425 00:19:04,260 --> 00:19:07,090 of inputs you are going to be putting into the structure?
426 00:19:07,090 --> 00:19:08,260 So obviously not, right? 427 00:19:08,260 --> 00:19:10,340 It's a generic tool in the toolkit that's 428 00:19:10,340 --> 00:19:13,430 provided to anyone who wants to use it, and so surely it 429 00:19:13,430 --> 00:19:17,680 can't be optimized ultimately for exactly what you want to do. 430 00:19:17,680 --> 00:19:21,180 >> So trade-offs-- development time might differ, performance might differ, 431 00:19:21,180 --> 00:19:23,120 complexity or memory usage might differ. 432 00:19:23,120 --> 00:19:24,820 And so what you'll find increasingly is that there's 433 00:19:24,820 --> 00:19:26,570 going to be different tools for the trade. 434 00:19:26,570 --> 00:19:31,160 And in fact for a super majority of people's final projects in this class, 435 00:19:31,160 --> 00:19:34,360 believe it or not, C is not going to be the right language to use. 436 00:19:34,360 --> 00:19:37,880 >> And in fact, one of the takeaways ultimately for any class like this 437 00:19:37,880 --> 00:19:40,510 is to get you thinking about, well, what should you pull off 438 00:19:40,510 --> 00:19:42,710 the shelf when you want to solve some problem. 439 00:19:42,710 --> 00:19:46,720 And indeed, we'll cross this bridge even more as we look at more languages 440 00:19:46,720 --> 00:19:47,920 even beyond today. 441 00:19:47,920 --> 00:19:50,530 >> So let's transition now to perhaps a more familiar context 442 00:19:50,530 --> 00:19:52,480 for using a language like PHP. 443 00:19:52,480 --> 00:19:56,720 It's somewhat common to use at the command line, writing scripts 444 00:19:56,720 --> 00:19:59,050 like I did, but it's much, much more common. 445 00:19:59,050 --> 00:20:02,350 And it was intended to be used in the form of files that typically end 446 00:20:02,350 --> 00:20:05,060 in .php-- but that's not a prerequisite-- 447 00:20:05,060 --> 00:20:07,990 that themselves generate web content. 
448 00:20:07,990 --> 00:20:11,310 >> So let me go ahead and open a few examples I prepared in advance. 449 00:20:11,310 --> 00:20:15,100 And these are actually sort of true stories in that one of the first things 450 00:20:15,100 --> 00:20:18,200 I ever did myself after finishing CS50 and maybe, I think, 451 00:20:18,200 --> 00:20:21,350 CS51 years ago is my roommate and I were helping 452 00:20:21,350 --> 00:20:24,320 to run the freshman intramural sports program, which, at the time, 453 00:20:24,320 --> 00:20:28,610 had freshmen registering for various sports by filling out a piece of paper, 454 00:20:28,610 --> 00:20:31,800 as it was called, walking across the yard to Wigglesworth, 455 00:20:31,800 --> 00:20:34,030 and dropping it in some proctor's door drop. 456 00:20:34,030 --> 00:20:37,210 And then he or she would go through them and then actually email us manually 457 00:20:37,210 --> 00:20:39,140 that we were registered for some sport. 458 00:20:39,140 --> 00:20:41,166 >> So clearly, an opportunity for improvement. 459 00:20:41,166 --> 00:20:44,040 These days, you might turn to just Google Forms, but back in the day, 460 00:20:44,040 --> 00:20:46,914 we had to actually reach for-- this wasn't even that long ago-- reach 461 00:20:46,914 --> 00:20:49,410 for a programming language that wasn't PHP. 462 00:20:49,410 --> 00:20:51,200 At the time, it was something called Perl, 463 00:20:51,200 --> 00:20:52,890 which has gone out of vogue since. 464 00:20:52,890 --> 00:20:54,160 But the idea is the same. 465 00:20:54,160 --> 00:20:58,940 >> And I essentially sat down to try to port those Perl versions to PHP, 466 00:20:58,940 --> 00:21:03,710 but in full disclaimer, did not give any thought to the aesthetics just yet. 467 00:21:03,710 --> 00:21:04,960 So here is a web page. 468 00:21:04,960 --> 00:21:05,670 This is a file.
469 00:21:05,670 --> 00:21:09,470 If I zoom in, it's apparently called froshim0.php 470 00:21:09,470 --> 00:21:12,060 just because it's our first example in this series. 471 00:21:12,060 --> 00:21:15,970 And notice that it has what appears to be a very ugly HTML form, 472 00:21:15,970 --> 00:21:18,680 but a form is interesting because it allows 473 00:21:18,680 --> 00:21:21,910 me to provide user input to the browser. 474 00:21:21,910 --> 00:21:27,730 >> Now last time when we had a form, to whom did we submit our query parameter, 475 00:21:27,730 --> 00:21:30,450 the q parameter as it was called? 476 00:21:30,450 --> 00:21:31,330 So to Google, right? 477 00:21:31,330 --> 00:21:34,090 We totally punted on the idea of doing anything with that input. 478 00:21:34,090 --> 00:21:36,160 >> But today, we start producing output. 479 00:21:36,160 --> 00:21:39,420 And the behavior I'm going to see here initially is pretty trivial. 480 00:21:39,420 --> 00:21:42,980 David, I'll check off gender here, say Matthews here. 481 00:21:42,980 --> 00:21:43,800 I won't be captain. 482 00:21:43,800 --> 00:21:45,410 I'm going to click Register. 483 00:21:45,410 --> 00:21:50,720 And notice that the URL has changed to register-0.php, 484 00:21:50,720 --> 00:21:52,310 and then there's this ugly text here. 485 00:21:52,310 --> 00:21:54,460 I gave no thought to the formatting of this. 486 00:21:54,460 --> 00:21:59,900 >> But what is interesting is that three values were apparently passed in. 487 00:21:59,900 --> 00:22:02,960 This is PHP's sort of equivalent to printf-- 488 00:22:02,960 --> 00:22:06,330 we'll see what it's really called in a bit-- that just prints out 489 00:22:06,330 --> 00:22:08,300 what you passed into it. 490 00:22:08,300 --> 00:22:11,414 So this suggests that that form had at least three fields to it, 491 00:22:11,414 --> 00:22:12,580 and you saw me type them in. 492 00:22:12,580 --> 00:22:15,739 One was my name, one was gender, one was dormitory.
493 00:22:15,739 --> 00:22:18,780 And captain didn't even get sent to the server because I didn't check it. 494 00:22:18,780 --> 00:22:22,150 >> So this is to say apparently, when you submit things on the web, 495 00:22:22,150 --> 00:22:26,830 not only does the URL sometimes change-- sometimes it doesn't. 496 00:22:26,830 --> 00:22:30,330 In fact, the file name changed, but what is absent from the URL 497 00:22:30,330 --> 00:22:32,861 that we did see last time with Google. 498 00:22:32,861 --> 00:22:33,360 Yeah? 499 00:22:33,360 --> 00:22:34,380 >> STUDENT: No query string 500 00:22:34,380 --> 00:22:35,220 >> DAVID J. MALAN: There's no query string. 501 00:22:35,220 --> 00:22:37,270 There's no question mark something. 502 00:22:37,270 --> 00:22:40,050 There's no question mark q equals cats, as we did last time. 503 00:22:40,050 --> 00:22:42,170 And there's certainly no question mark name equals 504 00:22:42,170 --> 00:22:46,310 David or dorm equals Matthews, so where is that all going? 505 00:22:46,310 --> 00:22:51,290 >> Well, let me go back to gedit here and open up the first of those files 506 00:22:51,290 --> 00:22:57,020 in my vhost, local host, public directory here and go into froshim0. 507 00:22:57,020 --> 00:23:02,060 So it turns out that almost all of this page is just HTML. 508 00:23:02,060 --> 00:23:05,410 And this might be unfamiliar to you, but it soon will be more so with PSET 6 509 00:23:05,410 --> 00:23:07,370 and PSET 7 and PSET 8. 510 00:23:07,370 --> 00:23:09,160 But this is just an HTML page. 511 00:23:09,160 --> 00:23:12,400 >> And the interesting stuff seems to be over here. 512 00:23:12,400 --> 00:23:16,290 A form tag whose action attribute has a value of register 0. 513 00:23:16,290 --> 00:23:18,890 That's why when I submit this, it goes to that file. 514 00:23:18,890 --> 00:23:20,620 But method is different today-- post. 
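For contrast with the POST form just described, here is a minimal sketch of what a GET submission would have put into the URL. The helper name `get_style_url` is hypothetical and the field values simply mirror what was typed into the form; `http_build_query` is PHP's built-in function for URL-encoding key/value pairs.

```php
<?php
// Hypothetical helper: builds the URL a GET form would have produced.
// The field names mirror the ones typed into the form in the demo.
function get_style_url(string $action, array $fields): string
{
    // http_build_query URL-encodes each pair and joins them with "&".
    return $action . "?" . http_build_query($fields);
}

echo get_style_url("register-0.php", [
    "name" => "David",
    "gender" => "male",
    "dorm" => "Matthews",
]), "\n";
```

With method set to post, none of this appears after the file name, which is why the address bar showed no question mark.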
515 00:23:20,620 --> 00:23:23,120 So it turns out there's at least two methods on the web used 516 00:23:23,120 --> 00:23:24,911 to send information from browser to server. 517 00:23:24,911 --> 00:23:25,980 Get puts it in the URL. 518 00:23:25,980 --> 00:23:27,950 Post puts it elsewhere. 519 00:23:27,950 --> 00:23:30,570 And when and why might you actually want a website 520 00:23:30,570 --> 00:23:34,110 to use post then instead of get, just intuitively? 521 00:23:34,110 --> 00:23:37,080 Any website. 522 00:23:37,080 --> 00:23:42,010 What kind of data should be passed just by inference now via post as opposed 523 00:23:42,010 --> 00:23:45,184 to get, if we've seen the two differences? 524 00:23:45,184 --> 00:23:46,350 STUDENT: [INAUDIBLE] secure. 525 00:23:46,350 --> 00:23:47,790 DAVID J. MALAN: If you want something to be secure. 526 00:23:47,790 --> 00:23:50,360 So you might type a password into a website, a credit card 527 00:23:50,360 --> 00:23:53,030 into a website. It would be kind of suboptimal 528 00:23:53,030 --> 00:23:56,220 if the browser put that value inside of the URL. 529 00:23:56,220 --> 00:23:57,680 Why? 530 00:23:57,680 --> 00:24:00,059 You see it, which doesn't seem to be such a big deal, 531 00:24:00,059 --> 00:24:03,350 but odds are you pretty frequently walk away from your computer or use computer 532 00:24:03,350 --> 00:24:05,310 labs, and so someone else or even a roommate 533 00:24:05,310 --> 00:24:08,220 could easily walk up and see that private information. 534 00:24:08,220 --> 00:24:10,220 When you send an email via the web, you probably 535 00:24:10,220 --> 00:24:12,350 don't want that data ending up in the URL as well. 536 00:24:12,350 --> 00:24:15,266 And so there's any number of reasons why we might want to put it here. 537 00:24:15,266 --> 00:24:18,610 And photos-- right, I can't even quite imagine how you would take a graphic, 538 00:24:18,610 --> 00:24:21,480 like a JPEG, and put it into a URL.
539 00:24:21,480 --> 00:24:22,330 You could do it. 540 00:24:22,330 --> 00:24:25,840 There's ways of encoding it, but it's just not straightforward like that. 541 00:24:25,840 --> 00:24:29,030 >> So register 0 is actually very underwhelming. 542 00:24:29,030 --> 00:24:31,610 All it says literally is this. 543 00:24:31,610 --> 00:24:35,910 It prints out inside of some HTML tags the following. 544 00:24:35,910 --> 00:24:38,640 I've got a PHP tag here nested inside of a pre tag. 545 00:24:38,640 --> 00:24:42,300 "Pre" just means pre-formatted text, mono-spaced, like a typewriter. 546 00:24:42,300 --> 00:24:44,836 >> print_r is a print recursive function. 547 00:24:44,836 --> 00:24:46,710 And then there's this interesting thing here. 548 00:24:46,710 --> 00:24:48,835 And we'll come back to this because there's others, 549 00:24:48,835 --> 00:24:51,140 but dollar sign underscore post appears to be 550 00:24:51,140 --> 00:24:56,110 a variable in PHP in which anything you send from browser to server 551 00:24:56,110 --> 00:24:58,040 gets stored for you. 552 00:24:58,040 --> 00:25:00,930 And we'll see how to get at that information before long. 553 00:25:00,930 --> 00:25:04,000 >> But first, let's go back to a slightly different example. 554 00:25:04,000 --> 00:25:09,050 Going into register-- or rather, froshims1.php, 555 00:25:09,050 --> 00:25:10,470 which looks a little different. 556 00:25:10,470 --> 00:25:12,670 I took a little more effort with formatting, 557 00:25:12,670 --> 00:25:14,370 even though it's still pretty ugly. 558 00:25:14,370 --> 00:25:16,990 But I'm going to go ahead and type in "David" now. 559 00:25:16,990 --> 00:25:17,850 Male. 560 00:25:17,850 --> 00:25:19,360 We'll check "captain" this time. 561 00:25:19,360 --> 00:25:20,660 We'll do Matthews. 562 00:25:20,660 --> 00:25:22,430 And register. 563 00:25:22,430 --> 00:25:24,110 >> And this time it says, hm, not really. 564 00:25:24,110 --> 00:25:26,180 All right, so what's register 1?
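The register-0 behavior described above, print_r inside a pre tag, can be sketched roughly as follows. This is not the lecture's actual file: the `$post` array is a stand-in for `$_POST`, which PHP would populate on a real form submission, and the `htmlspecialchars` call is an added precaution rather than something the lecture mentions.

```php
<?php
// Stand-in for $_POST: on a real request PHP fills this in from the form.
$post = ["name" => "David", "gender" => "male", "dorm" => "Matthews"];

// Passing true as the second argument makes print_r return the text
// instead of printing it, so it can be escaped and wrapped in <pre> tags.
$dump = print_r($post, true);
echo "<pre>" . htmlspecialchars($dump) . "</pre>\n";
```

Because captain was left unchecked in the first demo, it would simply be missing from the array rather than present with an empty value.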
565 00:25:26,180 --> 00:25:30,509 Let me go into open register 1 and-- hm. 566 00:25:30,509 --> 00:25:32,300 All right, so this is interesting, and this 567 00:25:32,300 --> 00:25:34,880 is a stepping stone now toward more interesting programs. 568 00:25:34,880 --> 00:25:38,970 >> Notice the top of this file has a PHP tag as well as some comments. 569 00:25:38,970 --> 00:25:42,590 And these are, for now, a distraction so let's just get rid of those comments 570 00:25:42,590 --> 00:25:47,070 just like they're in C. And I claim with this chunk of code with a comment 571 00:25:47,070 --> 00:25:49,280 that this code is validating the submission. 572 00:25:49,280 --> 00:25:51,690 >> Well, it turns out that variables like dollar sign 573 00:25:51,690 --> 00:25:53,739 underscore post are called super globals. 574 00:25:53,739 --> 00:25:55,530 They're like these special global variables 575 00:25:55,530 --> 00:25:58,840 that are just omni-presently available within your program. 576 00:25:58,840 --> 00:26:03,870 And you can use square bracket notation to index into them not using numbers 577 00:26:03,870 --> 00:26:07,460 like 0, 1, 2, 3, but actual words. 578 00:26:07,460 --> 00:26:12,100 >> So you can think of dollar sign underscore post as sort of a hash table 579 00:26:12,100 --> 00:26:15,920 that you could pass a key into, a lookup word in-between the square brackets, 580 00:26:15,920 --> 00:26:19,370 and it's going to give you back the value that the user actually provided. 581 00:26:19,370 --> 00:26:21,210 PHP has a function called empty that just 582 00:26:21,210 --> 00:26:23,720 says yes or no, this variable is empty or not. 583 00:26:23,720 --> 00:26:27,250 We have these double bars, which just means or, like in C. 
584 00:26:27,250 --> 00:26:31,740 >> So in effect, this line 4 is just saying if the user didn't give a name 585 00:26:31,740 --> 00:26:36,540 or didn't give a gender or didn't give a dorm, go ahead and redirect him 586 00:26:36,540 --> 00:26:38,184 or her via this line here. 587 00:26:38,184 --> 00:26:40,600 So this is a little cryptic, but this just means literally 588 00:26:40,600 --> 00:26:43,330 go back to this location, so it punts the user 589 00:26:43,330 --> 00:26:45,420 back to wherever he or she came from. 590 00:26:45,420 --> 00:26:47,880 But it's a little inelegant in that I hard coded it. 591 00:26:47,880 --> 00:26:52,150 >> But what if this if condition does not evaluate to true? 592 00:26:52,150 --> 00:26:55,790 What if the user did give me his or her name and dorm and gender? 593 00:26:55,790 --> 00:26:58,540 That if condition's not going to evaluate to true, 594 00:26:58,540 --> 00:27:00,650 so I don't hit the exit in line 7. 595 00:27:00,650 --> 00:27:01,680 So what happens? 596 00:27:01,680 --> 00:27:03,880 And this is what's interesting about PHP. 597 00:27:03,880 --> 00:27:07,470 >> You can drop into and out of PHP mode, so to speak. 598 00:27:07,470 --> 00:27:10,985 If you want some code to execute, you can open and close a PHP tag 599 00:27:10,985 --> 00:27:13,010 and put code there like I've done here. 600 00:27:13,010 --> 00:27:16,810 As soon as you close the PHP tag, the server 601 00:27:16,810 --> 00:27:19,407 is just going to spit out whatever you put there. 602 00:27:19,407 --> 00:27:21,740 And indeed, this was part of the original design of PHP, 603 00:27:21,740 --> 00:27:25,280 for better or for worse, was this commingling of code and markup 604 00:27:25,280 --> 00:27:25,920 language. 605 00:27:25,920 --> 00:27:28,670 And we'll see that this very quickly devolves into a mess.
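The validation step walked through above can be sketched like this. The helper name `has_required_fields` is hypothetical, and the redirect target in the comment is an assumed file name, not taken from the lecture's source; `empty`, `header`, and `exit` are the built-ins the lecture refers to.

```php
<?php
// Hypothetical helper: true only if every required field is non-empty.
// empty() is safe to call on a key that was never submitted.
function has_required_fields(array $post): bool
{
    return !(empty($post["name"]) || empty($post["gender"]) || empty($post["dorm"]));
}

// In a real page this would run before any HTML is sent:
// if (!has_required_fields($_POST)) {
//     header("Location: froshims-1.php"); // assumed file name
//     exit;
// }
```

One caveat worth knowing: empty() also treats the string "0" as empty, so a field whose legitimate value could be "0" would need a different check.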
606 00:27:28,670 --> 00:27:31,280 And so we'll do better than this ultimately, but just 607 00:27:31,280 --> 00:27:35,620 notice the ease with which I'm actually able to execute some logic. 608 00:27:35,620 --> 00:27:37,440 >> But still a bit underwhelming. 609 00:27:37,440 --> 00:27:41,210 Let's open up version two of Frosh IMs, which 610 00:27:41,210 --> 00:27:44,270 apparently submits to register2.php. 611 00:27:44,270 --> 00:27:47,600 So this file's actually going to look almost the same. 612 00:27:47,600 --> 00:27:50,780 I'm going to go to Frosh IMs 2 . 613 00:27:50,780 --> 00:27:53,050 But in Frosh IMs 2, let's see what happens. 614 00:27:53,050 --> 00:27:58,110 >> David, click the radio button, as it's called; Matthews, no captain. 615 00:27:58,110 --> 00:27:59,230 Register. 616 00:27:59,230 --> 00:28:00,130 You are registered. 617 00:28:00,130 --> 00:28:00,700 Not really. 618 00:28:00,700 --> 00:28:02,574 Oh wait, we just did that example, didn't we? 619 00:28:02,574 --> 00:28:04,520 All right, stand by. 620 00:28:04,520 --> 00:28:06,602 We'll do the three. 621 00:28:06,602 --> 00:28:08,560 Clearly something's about to happen with Gmail. 622 00:28:08,560 --> 00:28:09,600 We'll get there. 623 00:28:09,600 --> 00:28:11,900 >> So Frosh IMs 3 looks like this. 624 00:28:11,900 --> 00:28:13,050 No different. 625 00:28:13,050 --> 00:28:19,850 But when I do David, male, Matthews, and register, this third and final version 626 00:28:19,850 --> 00:28:22,230 claims, quite simply, you are registered really. 627 00:28:22,230 --> 00:28:23,560 That's sort of immaterial. 628 00:28:23,560 --> 00:28:25,600 But I claim with this third and final version 629 00:28:25,600 --> 00:28:30,610 I have now recreated exactly what my roommate and I built for the Frosh IMs 630 00:28:30,610 --> 00:28:31,731 program years ago. 631 00:28:31,731 --> 00:28:32,480 And it was simple. 632 00:28:32,480 --> 00:28:34,330 There was no database, no Excel spreadsheet. 
633 00:28:34,330 --> 00:28:36,450 But more importantly, there was no more paper 634 00:28:36,450 --> 00:28:42,520 because what we did with this program was to actually email the proctor, who 635 00:28:42,520 --> 00:28:44,530 was previously receiving these things via forms. 636 00:28:44,530 --> 00:28:48,890 >> And apparently we've programmed this in such a way that when someone registers, 637 00:28:48,890 --> 00:28:52,470 John Harvard's account emails the proctor-- or himself in this case, 638 00:28:52,470 --> 00:28:55,960 John Harvard-- with the following text-- "This person just registered." 639 00:28:55,960 --> 00:29:00,560 Name is David, captain is blank; gender, male; and dorm, Matthews. 640 00:29:00,560 --> 00:29:01,560 >> So what happened there? 641 00:29:01,560 --> 00:29:05,360 Well, the file in question here is apparently register3.php. 642 00:29:05,360 --> 00:29:09,080 And if I open this, you'll see both the power of code like this 643 00:29:09,080 --> 00:29:12,380 and also, frankly, the insecurity of a system like email. 644 00:29:12,380 --> 00:29:16,290 I have just effectively pretended to be John Harvard in the following way. 645 00:29:16,290 --> 00:29:20,920 >> I have the open php tag up top, which just says here comes some PHP code. 646 00:29:20,920 --> 00:29:23,155 Down here, turns out there are libraries in PHP. 647 00:29:23,155 --> 00:29:26,410 You just don't need to include header files as much. 648 00:29:26,410 --> 00:29:28,900 You get more with the kitchen sink, so to speak. 649 00:29:28,900 --> 00:29:31,820 >> But this time in line 4, I do want a special library called 650 00:29:31,820 --> 00:29:36,087 PHPMailer, which is something you can install for free in many systems. 651 00:29:36,087 --> 00:29:37,920 Down here I'm validating the submission just 652 00:29:37,920 --> 00:29:40,540 by checking did the user give me a name, a gender, and a dorm. 653 00:29:40,540 --> 00:29:44,130 And if so, go ahead and instantiate a mailer.
654 00:29:44,130 --> 00:29:47,020 >> You can think of this as being a line of code that just allocates. 655 00:29:47,020 --> 00:29:48,950 It's like malloc, but it's a little sexier 656 00:29:48,950 --> 00:29:51,790 in that you mention not just malloc and some generic number. 657 00:29:51,790 --> 00:29:55,030 You say give me one of these, give me a new one of these. 658 00:29:55,030 --> 00:29:57,950 >> And if you've programmed in Java or C++ or other languages, 659 00:29:57,950 --> 00:29:59,130 you might have seen this. 660 00:29:59,130 --> 00:30:01,840 But the short of it, if unfamiliar, this line 661 00:30:01,840 --> 00:30:05,410 puts into dollar sign mail a special struct called 662 00:30:05,410 --> 00:30:08,731 an object that has built-in email functionality. 663 00:30:08,731 --> 00:30:10,355 And in fact, notice the similar syntax. 664 00:30:10,355 --> 00:30:11,900 >> This is not a pointer, per se. 665 00:30:11,900 --> 00:30:13,990 PHP just uses the same syntax. 666 00:30:13,990 --> 00:30:17,660 This line is saying use SMTP-- Simple Mail Transfer 667 00:30:17,660 --> 00:30:20,900 Protocol, which is just the protocol used to send mail. 668 00:30:20,900 --> 00:30:24,240 This is specifying use Harvard's SMTP server, which 669 00:30:24,240 --> 00:30:25,830 is somewhere here on campus. 670 00:30:25,830 --> 00:30:28,480 >> This is saying what TCP port number to talk to, 671 00:30:28,480 --> 00:30:31,650 and I just figured that out by googling or by asking the help desk. 672 00:30:31,650 --> 00:30:34,640 And then because Harvard uses some system security on the mail server-- 673 00:30:34,640 --> 00:30:37,060 at least to encrypt traffic between you and it, 674 00:30:37,060 --> 00:30:41,380 even though anyone can send to it-- I'm going to turn on the TLS protocol 675 00:30:41,380 --> 00:30:42,710 for keeping this secure. 676 00:30:42,710 --> 00:30:44,730 >> But this is where things get a little scary.
677 00:30:44,730 --> 00:30:47,970 I can just arbitrarily say that I am jharvard, 678 00:30:47,970 --> 00:30:51,930 and I can just arbitrarily email myself here. 679 00:30:51,930 --> 00:30:55,650 And then I can specify a subject with this line here. 680 00:30:55,650 --> 00:30:58,460 >> And this just looks ugly, but it's just a bunch of concatenation. 681 00:30:58,460 --> 00:31:04,480 Turns out PHP has a super useful symbol, like some languages, the dot operator, 682 00:31:04,480 --> 00:31:07,340 which just literally concatenates string after string after string, 683 00:31:07,340 --> 00:31:09,810 and you don't have to malloc or figure out the total length of the string. 684 00:31:09,810 --> 00:31:10,820 You just do it. 685 00:31:10,820 --> 00:31:15,220 And indeed, because I'm concatenating in all of these things with these dots, 686 00:31:15,220 --> 00:31:18,330 that's why the email I sent looked as it did. 687 00:31:18,330 --> 00:31:20,610 >> And then lastly here, I'm sending mail. 688 00:31:20,610 --> 00:31:22,580 So if this is false, I'm just going to die, 689 00:31:22,580 --> 00:31:25,680 which is a function that just prints to the screen some error message. 690 00:31:25,680 --> 00:31:29,170 But it is, in fact, calling the send function. 691 00:31:29,170 --> 00:31:31,780 Otherwise, if all of this fails, it redirects me back here. 692 00:31:31,780 --> 00:31:34,050 >> And why did I see that I'm registered really? 693 00:31:34,050 --> 00:31:36,110 Well, it happened right here. 694 00:31:36,110 --> 00:31:38,170 So I bring this up for a couple of reasons. 695 00:31:38,170 --> 00:31:41,542 >> One, this is exactly how if you build some website for a final project 696 00:31:41,542 --> 00:31:44,000 or for the real world, this is how you send email reminders 697 00:31:44,000 --> 00:31:45,924 to your customers or your subscribers. 698 00:31:45,924 --> 00:31:47,590 This is how you send password reminders. 
699 00:31:47,590 --> 00:31:50,760 This is how you send people messages that they have a new Facebook 700 00:31:50,760 --> 00:31:52,990 message pending or something like that. 701 00:31:52,990 --> 00:31:55,010 >> But it also speaks to the fact that this could 702 00:31:55,010 --> 00:31:58,160 have been very well from Davin or anyone else. 703 00:31:58,160 --> 00:32:00,567 And I say this kind of with a smile because I'm 704 00:32:00,567 --> 00:32:03,400 quite sure what's going through several of your minds at this point. 705 00:32:03,400 --> 00:32:11,910 But this is one of those do as I say, not as I do kind of things, 706 00:32:11,910 --> 00:32:14,480 because it is trivial to forge emails like this. 707 00:32:14,480 --> 00:32:16,480 But as you may have seen or read in the Crimson, 708 00:32:16,480 --> 00:32:18,271 of late it's also pretty trivial for people 709 00:32:18,271 --> 00:32:20,050 to trace them back to some origin. 710 00:32:20,050 --> 00:32:23,790 And ask me some time, perhaps at CS50 lunch, how I first 711 00:32:23,790 --> 00:32:27,080 got acquainted very closely almost to the Ad Board many years 712 00:32:27,080 --> 00:32:30,890 ago when I discovered how the internet worked. 713 00:32:30,890 --> 00:32:36,940 So in any case-- slightly after the Ad Board did. 714 00:32:36,940 --> 00:32:42,300 >> So in any case, there is a whole bunch of super globals, 715 00:32:42,300 --> 00:32:45,960 as they're called here, one of which we saw-- dollar sign underscore post. 716 00:32:45,960 --> 00:32:49,530 There's a counterpart called get, which is where stuff from a URL 717 00:32:49,530 --> 00:32:50,690 ends up going. 718 00:32:50,690 --> 00:32:54,051 And there's a whole bunch of others, too-- session and server and cookie.
719 00:32:54,051 --> 00:32:55,800 We'll come back to cookie some other time, 720 00:32:55,800 --> 00:33:01,340 but session is kind of cool because right now-- up until now-- 721 00:33:01,340 --> 00:33:06,350 everything we've done with a web browser is sort of stateless, so to speak. 722 00:33:06,350 --> 00:33:10,060 I can click around, access files on the server, something 723 00:33:10,060 --> 00:33:13,500 happens on the screen, but then the connection closes. 724 00:33:13,500 --> 00:33:17,450 The Internet Explorer or the Firefox icon stops spinning 725 00:33:17,450 --> 00:33:20,340 and you just see what that web page contains. 726 00:33:20,340 --> 00:33:23,530 >> So HTTP is stateless in that once it makes a connection, 727 00:33:23,530 --> 00:33:25,050 gets some data, that's it. 728 00:33:25,050 --> 00:33:29,940 No more connection, unlike Skype, unlike Facetime, unlike GChat, which 729 00:33:29,940 --> 00:33:32,180 maintains a constant connection to the server. 730 00:33:32,180 --> 00:33:34,650 The web is fundamentally disconnected, though we'll 731 00:33:34,650 --> 00:33:36,630 see before long how we can simulate things 732 00:33:36,630 --> 00:33:39,300 like Facebook chat and GChat, which maintain the illusion-- 733 00:33:39,300 --> 00:33:41,680 or actually do maintain a constant connection using 734 00:33:41,680 --> 00:33:43,270 more modern technology. 735 00:33:43,270 --> 00:33:49,000 >> But if I go to, say, counter.php, this is another simple example, 736 00:33:49,000 --> 00:33:52,700 as we'll see, that currently thinks I visited the site zero times. 737 00:33:52,700 --> 00:33:56,790 But if I simply reload the page, it somehow knows I was here before. 738 00:33:56,790 --> 00:33:58,840 If I reload again, it knows I was here before. 739 00:33:58,840 --> 00:34:01,100 And again and again and again and again.
740 00:34:01,100 --> 00:34:03,610 >> So there's some plus-plussing going on, but notice 741 00:34:03,610 --> 00:34:07,090 the little thing spins ever so briefly up top and then disconnects, 742 00:34:07,090 --> 00:34:11,179 so it's not like I have a constant connection to my appliance. 743 00:34:11,179 --> 00:34:16,929 Well, if I go into counter.php, notice how simple it is. 744 00:34:16,929 --> 00:34:19,080 I first call this special function that we'll soon 745 00:34:19,080 --> 00:34:21,513 start taking for granted called session_start. 746 00:34:21,513 --> 00:34:22,179 Start a session. 747 00:34:22,179 --> 00:34:25,095 >> And a session henceforth is just going to be a bucket, like a shopping 748 00:34:25,095 --> 00:34:28,120 cart in which you can put values and kind of trust as a programmer 749 00:34:28,120 --> 00:34:31,590 that they're going to be there when that user comes back-- a second 750 00:34:31,590 --> 00:34:35,670 later, an hour later, even a year later, so long as he or she doesn't clear 751 00:34:35,670 --> 00:34:37,602 their cookies, as we'll eventually see. 752 00:34:37,602 --> 00:34:39,310 And now I just have an if condition here. 753 00:34:39,310 --> 00:34:44,679 So if the following key, called counter, is set inside 754 00:34:44,679 --> 00:34:49,210 of this super global-- this hash table, if you will-- called session, 755 00:34:49,210 --> 00:34:53,350 then go ahead and grab the value from the session-- think 756 00:34:53,350 --> 00:34:55,250 of this as a shopping cart-- and store it 757 00:34:55,250 --> 00:34:57,680 in a temporary variable called counter. 758 00:34:57,680 --> 00:35:02,240 >> Otherwise, if that value counter was not set in the so-called shopping cart, 759 00:35:02,240 --> 00:35:04,430 just initialize it to 0. 760 00:35:04,430 --> 00:35:09,830 Lastly, down here, go and put back into the shopping cart, or the session, 761 00:35:09,830 --> 00:35:13,000 the value of counter +1.
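The counter logic just described can be sketched as below. The helper `record_visit` is hypothetical and takes any array by reference so it can be tried without a web server; on a real page the array would be `$_SESSION`, after a call to `session_start()`.

```php
<?php
// Hypothetical helper operating on any array; pass $_SESSION in a real page.
function record_visit(array &$session): int
{
    // Prior visits, or 0 if the key is not set yet.
    $counter = isset($session["counter"]) ? $session["counter"] : 0;
    // Put the incremented value back for next time.
    $session["counter"] = $counter + 1;
    return $counter;
}

// On a real page: session_start(); $visits = record_visit($_SESSION);
$fake = [];
echo record_visit($fake), "\n"; // first visit
echo record_visit($fake), "\n"; // second visit
```

The returned number is the count of previous visits, which matches the page claiming zero visits on the first load.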
762 00:35:13,000 --> 00:35:16,730 So it turns out that this special container here-- 763 00:35:16,730 --> 00:35:20,355 which, again, is one of these associative arrays, an array that you can index 764 00:35:20,355 --> 00:35:25,010 into with words instead of numbers-- persists even after the user goes away. 765 00:35:25,010 --> 00:35:26,510 Again, I'll go back to the page now. 766 00:35:26,510 --> 00:35:28,400 It's been a minute or so. 767 00:35:28,400 --> 00:35:31,300 But it remembers that I've been here 19 times before. 768 00:35:31,300 --> 00:35:32,740 This is my 20th visit. 769 00:35:32,740 --> 00:35:36,560 >> And so this is going to be key to implementing any website that remembers 770 00:35:36,560 --> 00:35:40,640 that you're logged in, that you put something literal in your shopping cart 771 00:35:40,640 --> 00:35:43,902 to buy or that you have some number of messages pending. 772 00:35:43,902 --> 00:35:45,610 Anytime you want to remember information, 773 00:35:45,610 --> 00:35:48,130 we'll see that PHP, like several other languages, 774 00:35:48,130 --> 00:35:53,640 provides us with this illusion of state even though, as you'll see in PSET 6, 775 00:35:53,640 --> 00:35:57,642 as you're making HTTP requests from client to server, that's it. 776 00:35:57,642 --> 00:35:59,850 Once you get back that response, there's nothing more 777 00:35:59,850 --> 00:36:01,790 coming back from the server by default. 778 00:36:01,790 --> 00:36:03,820 But we'll see how to work around that. 779 00:36:03,820 --> 00:36:07,430 >> Well now, let's try to clean this up a little bit. 780 00:36:07,430 --> 00:36:09,470 We've seen a few different examples there.
781 00:36:09,470 --> 00:36:12,250 Oh, and as an aside, for those familiar or unfamiliar, 782 00:36:12,250 --> 00:36:14,230 the reason that the Frosh IMs example went 783 00:36:14,230 --> 00:36:18,060 from looking really ugly to slightly-- well, 784 00:36:18,060 --> 00:36:23,160 still ugly-- to slightly less ugly though still ugly 785 00:36:23,160 --> 00:36:25,230 is because if we look at the source code here, 786 00:36:25,230 --> 00:36:28,240 it turns out that I have this at the very top of the file. 787 00:36:28,240 --> 00:36:32,570 >> Turns out that Bootstrap is one of many freely available libraries out there 788 00:36:32,570 --> 00:36:37,140 that exist not for programming languages always, but for CSS or for JavaScript 789 00:36:37,140 --> 00:36:39,190 or HTML or any number of languages. 790 00:36:39,190 --> 00:36:42,160 >> And these folks here-- it originally came out 791 00:36:42,160 --> 00:36:44,730 of Twitter-- just have a whole bunch of styles. 792 00:36:44,730 --> 00:36:47,360 It's a massive file here that someone wrote, 793 00:36:47,360 --> 00:36:51,020 or several someones wrote, over time that specifies colors and formatting 794 00:36:51,020 --> 00:36:53,740 and whatnot so that I can kind of borrow their syntax 795 00:36:53,740 --> 00:36:56,157 and not have to figure out how to lay out my form. 796 00:36:56,157 --> 00:36:57,990 This is also minified so that a computer can 797 00:36:57,990 --> 00:37:00,560 understand it but not necessarily a human. 798 00:37:00,560 --> 00:37:03,050 So that's just why the stylization there changed. 799 00:37:03,050 --> 00:37:05,450 >> But let's now do better in terms of design, 800 00:37:05,450 --> 00:37:07,490 because if we stay down this road too long, 801 00:37:07,490 --> 00:37:11,290 our code's going to get messier and messier. 802 00:37:11,290 --> 00:37:13,040 So let's focus on these examples here. 803 00:37:13,040 --> 00:37:15,090 The last for today.
804 00:37:15,090 --> 00:37:18,720 >> So here is a super simple version 1.0 of CS50's website. 805 00:37:18,720 --> 00:37:21,250 It only has links to lectures and syllabus, 806 00:37:21,250 --> 00:37:25,490 and it's using that unordered list tag-- the UL tag that we used last time. 807 00:37:25,490 --> 00:37:28,800 And in fact, if I open up View Page Source, 808 00:37:28,800 --> 00:37:31,710 you'll see that this is really, really simple HTML. 809 00:37:31,710 --> 00:37:35,460 And in fact, even though this is a PHP file underneath the hood, 810 00:37:35,460 --> 00:37:38,620 it's still just spitting out only HTML for now. 811 00:37:38,620 --> 00:37:41,312 >> So if I click on Lectures, we see this happen. 812 00:37:41,312 --> 00:37:43,020 And if I click on week zero, we see this. 813 00:37:43,020 --> 00:37:44,920 And if I click on Wednesday, we see this. 814 00:37:44,920 --> 00:37:47,900 And this apparently was the PDF of the slides from that day. 815 00:37:47,900 --> 00:37:52,020 All I've done is link with an anchor tag to this URL here. 816 00:37:52,020 --> 00:37:55,400 >> So this is only to say this is a pretty simple version of CS50's website. 817 00:37:55,400 --> 00:37:56,790 Let's see how it's implemented. 818 00:37:56,790 --> 00:38:01,240 If I go into the mvc0 directory, we'll see a few files. 819 00:38:01,240 --> 00:38:03,250 One is a README, so if some of this is too fast, 820 00:38:03,250 --> 00:38:05,166 you can just poke around more leisurely later. 821 00:38:05,166 --> 00:38:07,930 And notice in here is an index.php file. 822 00:38:07,930 --> 00:38:09,960 It turns out that if you yourself, the human, 823 00:38:09,960 --> 00:38:14,460 don't specify a file name in a URL, the web server usually 824 00:38:14,460 --> 00:38:17,010 infers some default name for you. 825 00:38:17,010 --> 00:38:20,060 An index dot something is generally the default.
826 00:38:20,060 --> 00:38:23,010 >> So that's why a moment ago when I visited this URL here, 827 00:38:23,010 --> 00:38:26,750 no file name, no file extension, no period in the URL. 828 00:38:26,750 --> 00:38:29,710 It just knew somehow magically to look for index.php. 829 00:38:29,710 --> 00:38:30,870 It's just a convention. 830 00:38:30,870 --> 00:38:32,360 Could be called anything. 831 00:38:32,360 --> 00:38:35,110 >> So if I now go into index.php, you'll see 832 00:38:35,110 --> 00:38:37,100 that, indeed-- let's get rid of the comments 833 00:38:37,100 --> 00:38:39,500 here because there's really nothing interesting to it-- 834 00:38:39,500 --> 00:38:41,579 this is just hard coded HTML. 835 00:38:41,579 --> 00:38:43,370 So that's consistent, though, with my claim 836 00:38:43,370 --> 00:38:45,230 that you can commingle HTML and PHP. 837 00:38:45,230 --> 00:38:48,060 There's no actual programming logic in here. 838 00:38:48,060 --> 00:38:51,030 >> And the other files are pretty much just as uninteresting. 839 00:38:51,030 --> 00:38:56,240 It's just hard-coded week one here to week one m and week one w, 840 00:38:56,240 --> 00:38:57,510 for Monday and Wednesday. 841 00:38:57,510 --> 00:39:01,890 And then if I open up week zero, notice it's almost identical. 842 00:39:01,890 --> 00:39:03,320 >> And that's kind of a key takeaway. 843 00:39:03,320 --> 00:39:06,180 Notice just how redundant this is. 844 00:39:06,180 --> 00:39:10,710 These files barely change, and yet I pulled one of these copy/paste jobs 845 00:39:10,710 --> 00:39:13,420 where I took one file-- presumably in week zero-- copied it 846 00:39:13,420 --> 00:39:16,320 when week one came around, and tweaked a few values. 847 00:39:16,320 --> 00:39:18,590 We should probably be able to do better than this. 848 00:39:18,590 --> 00:39:21,800 >> So let's go back up to mvc and go into version one. 
849 00:39:21,800 --> 00:39:24,810 And notice I've got a few files, because what 850 00:39:24,810 --> 00:39:29,870 was common to all of those files just a moment ago-- if I go back to version 0, 851 00:39:29,870 --> 00:39:32,600 let's go back into index, and just postulate-- 852 00:39:32,600 --> 00:39:36,090 once I get rid of the comments-- what part of this page 853 00:39:36,090 --> 00:39:40,072 is presumably in every one of my files? 854 00:39:40,072 --> 00:39:40,780 Just call it out. 855 00:39:40,780 --> 00:39:44,620 Which lines are duplicated probably across all of these pages? 856 00:39:44,620 --> 00:39:45,120 Yeah? 857 00:39:45,120 --> 00:39:46,110 >> STUDENT: [INAUDIBLE]. 858 00:39:46,110 --> 00:39:47,660 >> DAVID J. MALAN: 1 through 9. 859 00:39:47,660 --> 00:39:48,720 Yeah, absolutely. 860 00:39:48,720 --> 00:39:52,080 1 through 9, except maybe 8 changes a little bit because CS50 861 00:39:52,080 --> 00:39:54,650 becomes lectures or week zero or something. 862 00:39:54,650 --> 00:39:55,970 But almost identical. 863 00:39:55,970 --> 00:39:58,657 So all this stuff is just kind of copied and pasted. 864 00:39:58,657 --> 00:40:00,490 And there's a couple other lines I can think 865 00:40:00,490 --> 00:40:05,000 of that are probably identical across all the files. 866 00:40:05,000 --> 00:40:06,315 >> STUDENT: 12 and 13. 867 00:40:06,315 --> 00:40:07,190 DAVID J. MALAN: Yeah. 868 00:40:07,190 --> 00:40:11,220 Sure, 12, 13, and 14 probably, just because the interesting stuff 869 00:40:11,220 --> 00:40:15,460 is happening on lines 11 and 10, so it would seem. 870 00:40:15,460 --> 00:40:18,350 So let's look at version 1, which tries to improve on this. 871 00:40:18,350 --> 00:40:24,020 In version 1 of this mvc example-- we'll explain what mvc means in a moment-- 872 00:40:24,020 --> 00:40:27,420 if I go into index, it kind of looks a little confusing now. 873 00:40:27,420 --> 00:40:28,880 It's not quite as simple as before. 
874 00:40:28,880 --> 00:40:30,906 >> But once you start to read it carefully, it's 875 00:40:30,906 --> 00:40:32,530 pretty straightforward what it's doing. 876 00:40:32,530 --> 00:40:34,397 Apparently line 1 and line 8 have replaced 877 00:40:34,397 --> 00:40:37,230 all of the stuff you just identified-- though just for good measure, 878 00:40:37,230 --> 00:40:41,900 I left the ULs there just in case some days didn't have a list of things. 879 00:40:41,900 --> 00:40:47,860 And so require is kind of like pound include in C. It copies and pastes 880 00:40:47,860 --> 00:40:50,470 the contents effectively right here into this file. 881 00:40:50,470 --> 00:40:53,650 >> So header.php, as you might infer from its name, 882 00:40:53,650 --> 00:40:55,330 is going to be the header of the page. 883 00:40:55,330 --> 00:40:57,110 It's kind of orphaned here. 884 00:40:57,110 --> 00:41:01,820 It only has the top to it, but there's no more content below. 885 00:41:01,820 --> 00:41:05,070 >> And if I look at footer meanwhile, which was the other file mentioned-- 886 00:41:05,070 --> 00:41:08,830 this one's even less interesting, but again, it's common to everything. 887 00:41:08,830 --> 00:41:10,250 So this is the footer. 888 00:41:10,250 --> 00:41:11,300 This is the header. 889 00:41:11,300 --> 00:41:13,950 This is the file that's changing, so why not 890 00:41:13,950 --> 00:41:18,140 try to factor out the commonality with these two lines here? 891 00:41:18,140 --> 00:41:20,090 >> But we can clean this up a little further. 892 00:41:20,090 --> 00:41:23,260 I'm going to go ahead and open up version two where 893 00:41:23,260 --> 00:41:27,106 we'll see that there's a new file, helpers.php. 894 00:41:27,106 --> 00:41:28,610 We'll see what that is in a moment. 895 00:41:28,610 --> 00:41:30,930 Let's go to index, as the entry point as before. 896 00:41:30,930 --> 00:41:35,230 >> And now notice I'm requiring helpers.php, not header or footer.
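The require-based factoring in version 1 might look something like the sketch below. The file contents here are assumptions (the real header.php and footer.php are not shown in this transcript), and the two files are written to a temporary directory only so that the sketch is self-contained.

```php
<?php
// Assumption: these file contents are made up for illustration; the
// real header.php and footer.php from lecture are not shown here.
$dir = sys_get_temp_dir() . "/mvc_sketch_" . uniqid();
mkdir($dir);
file_put_contents("$dir/header.php",
    "<!DOCTYPE html>\n<html>\n<head><title>CS50</title></head>\n<body>\n");
file_put_contents("$dir/footer.php", "</body>\n</html>\n");

// Capture the output so the assembled page can be inspected as a string.
ob_start();
require "$dir/header.php";   // effectively pastes the header's contents here
echo "<ul>\n<li>Lectures</li>\n<li>Syllabus</li>\n</ul>\n";
require "$dir/footer.php";   // and the footer's contents here
$page = ob_get_clean();

echo $page;
```

Only the middle echo would differ from page to page; everything common lives in the two required files.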
897 00:41:35,230 --> 00:41:41,720 But helpers is kind of like helpers.c and helpers.h from PSET 2 898 00:41:41,720 --> 00:41:46,150 or PSET 3 long ago when you actually did search and find for that PSET, 899 00:41:46,150 --> 00:41:50,950 and you had all of your code for sorting and searching in a separate file. 900 00:41:50,950 --> 00:41:52,510 That's what's going on here. 901 00:41:52,510 --> 00:41:54,390 >> And now line 3 looks a little different. 902 00:41:54,390 --> 00:41:55,920 And it's just one line. 903 00:41:55,920 --> 00:41:57,950 To make this even more clear, I could just 904 00:41:57,950 --> 00:42:01,820 do this to be stylistically consistent with everything else we've done. 905 00:42:01,820 --> 00:42:04,130 But that's not really changing the functionality. 906 00:42:04,130 --> 00:42:05,880 It's just one line of real code. 907 00:42:05,880 --> 00:42:09,010 >> Apparently, there's a function somewhere called render header, 908 00:42:09,010 --> 00:42:11,420 and this is where things get pretty powerful. 909 00:42:11,420 --> 00:42:17,040 Notice that inside of its parentheses is what other piece of syntax? 910 00:42:17,040 --> 00:42:19,780 911 00:42:19,780 --> 00:42:23,350 It's probably a little hard to say, but notice there's-- I'll put some white 912 00:42:23,350 --> 00:42:24,300 space. 913 00:42:24,300 --> 00:42:25,530 There's square brackets. 914 00:42:25,530 --> 00:42:29,700 >> And square brackets we saw a bit ago in the context of associative arrays, 915 00:42:29,700 --> 00:42:31,580 which are, again, like hash tables. 916 00:42:31,580 --> 00:42:36,230 And if you think now to C, the order of arguments into a function 917 00:42:36,230 --> 00:42:37,570 has to always be the same. 918 00:42:37,570 --> 00:42:41,146 You have to remember what the order is-- x, y, z or z, y, x-- 919 00:42:41,146 --> 00:42:44,020 and you have to always provide them in the same order or look them up 920 00:42:44,020 --> 00:42:45,100 if you've forgotten. 
921 00:42:45,100 --> 00:42:51,140 >> But this seems to be a clever way of passing in arbitrary key value 922 00:42:51,140 --> 00:42:55,840 pairs whereby title is the name of an argument in this case 923 00:42:55,840 --> 00:42:58,334 and CS50 is its value. 924 00:42:58,334 --> 00:43:00,250 And the fact that I have these square brackets 925 00:43:00,250 --> 00:43:02,560 here means that I could also pass in something 926 00:43:02,560 --> 00:43:07,550 like a week is 1 or 0 or 2 or 3. 927 00:43:07,550 --> 00:43:10,550 So we've parameterized this function in such a way 928 00:43:10,550 --> 00:43:15,180 that it can take multiple inputs, but for now it's just the one. 929 00:43:15,180 --> 00:43:20,060 >> If I now go into helpers.php, notice what it's doing. 930 00:43:20,060 --> 00:43:22,030 This is a little bit of new functionality, 931 00:43:22,030 --> 00:43:24,190 but for now just take on faith that this is 932 00:43:24,190 --> 00:43:26,570 the syntax with which you define a function in PHP. 933 00:43:26,570 --> 00:43:27,840 You literally say function. 934 00:43:27,840 --> 00:43:30,090 You don't specify a return type, and that's consistent 935 00:43:30,090 --> 00:43:33,880 with the variable detail earlier where you don't really strongly type. 936 00:43:33,880 --> 00:43:35,650 >> This just specifies that, by default, this 937 00:43:35,650 --> 00:43:37,460 takes an associative array as an argument. 938 00:43:37,460 --> 00:43:38,210 And you know what? 939 00:43:38,210 --> 00:43:41,450 If the user doesn't pass one in, assume a default value. 940 00:43:41,450 --> 00:43:44,680 >> This is a feature that C doesn't have for us, which is nice, because now 941 00:43:44,680 --> 00:43:46,430 data, even if you don't give it anything, 942 00:43:46,430 --> 00:43:49,300 is going to be an array but an empty one.
943 00:43:49,300 --> 00:43:51,860 And as an aside, extract just does something funky 944 00:43:51,860 --> 00:43:56,380 where it takes all of the keys from this associative array, all of the things 945 00:43:56,380 --> 00:43:59,950 you could put in square brackets, and creates variables out of them 946 00:43:59,950 --> 00:44:06,270 so that we can ultimately have access to them in footer.php and header.php. 947 00:44:06,270 --> 00:44:08,950 That's a little abstract, so let me point this out. 948 00:44:08,950 --> 00:44:12,990 >> In index.php, notice that I'm passing in a key value pair of title 949 00:44:12,990 --> 00:44:14,850 with a value of CS50. 950 00:44:14,850 --> 00:44:18,660 If I now look at helpers.php, notice that RenderHeader 951 00:44:18,660 --> 00:44:23,870 is extracting that data that I'm passing in, and then requiring header.php. 952 00:44:23,870 --> 00:44:27,970 What I've done is sort of a poor man's implementation now of the following. 953 00:44:27,970 --> 00:44:31,720 >> If I open up header.php, notice that I've no longer hard 954 00:44:31,720 --> 00:44:34,890 coded the word CS50 in this header file. 955 00:44:34,890 --> 00:44:39,310 I've put this admittedly atrociously named function, HTML special chars, 956 00:44:39,310 --> 00:44:40,170 in there. 957 00:44:40,170 --> 00:44:41,640 But notice what I've done. 958 00:44:41,640 --> 00:44:44,240 I've got open HTML. 959 00:44:44,240 --> 00:44:47,420 I then have open head and open title. 960 00:44:47,420 --> 00:44:52,380 >> And then inside of the title's open and close tags, I have a bit of PHP code. 961 00:44:52,380 --> 00:44:56,670 And this is a nice bit of syntax, which just means echo out. 962 00:44:56,670 --> 00:44:59,840 It literally means this-- echo the following-- 963 00:44:59,840 --> 00:45:01,910 but this is sexier to write. 964 00:45:01,910 --> 00:45:05,000 Echo out the title that's been passed in.
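Putting those pieces together, the helper and its template could be sketched as below. This is a hedged reconstruction, not the lecture's helpers.php: the constant HEADER_TEMPLATE, the one-line template, and the lowercase-first spelling renderHeader are assumptions, and the template is written to a temporary file purely so the sketch runs on its own.

```php
<?php
// Hypothetical one-line template, saved to a temp file so this sketch is
// self-contained. It uses the short echo syntax around htmlspecialchars.
define("HEADER_TEMPLATE", sys_get_temp_dir() . "/header_sketch_" . uniqid() . ".php");
file_put_contents(HEADER_TEMPLATE,
    '<title><?= htmlspecialchars($title) ?></title>' . "\n");

// Assumed shape of the helper: $data defaults to an empty array, so the
// function can be called with or without key-value pairs.
function renderHeader($data = [])
{
    // extract() creates one local variable per key: "title" becomes $title.
    extract($data);

    // The required file runs in this function's scope, so it can see $title.
    require HEADER_TEMPLATE;
}

// Capture the output so it can be inspected as a string.
ob_start();
renderHeader(["title" => "CS50"]);
$html = ob_get_clean();

echo $html;
```

The design point is that the caller names its arguments by key ("title") rather than by position, and the template only ever sees plain variables.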
965 00:45:05,000 --> 00:45:07,560 >> But what do you think HTML special char is all about, 966 00:45:07,560 --> 00:45:10,590 especially if you have some prior HTML experience? 967 00:45:10,590 --> 00:45:14,050 What characters might be dangerous to pass in to a page 968 00:45:14,050 --> 00:45:17,980 where you're dynamically generating the web page with code like this? 969 00:45:17,980 --> 00:45:21,370 970 00:45:21,370 --> 00:45:24,650 Let me go to this file, version two, and see if I can't induce this. 971 00:45:24,650 --> 00:45:26,210 >> Version two is this. 972 00:45:26,210 --> 00:45:28,510 And notice everything is fine, working well. 973 00:45:28,510 --> 00:45:35,280 But suppose I go into index.php and I specified that the title of my page 974 00:45:35,280 --> 00:45:36,630 is not CS50. 975 00:45:36,630 --> 00:45:44,930 It is open bracket script alert hello world, close single quote, 976 00:45:44,930 --> 00:45:49,740 close parenthesis, semicolon, open bracket, slash script. 977 00:45:49,740 --> 00:45:51,897 >> Script, as we'll eventually see, is a tag 978 00:45:51,897 --> 00:45:54,480 that you can use to make use of another programming language called 979 00:45:54,480 --> 00:45:56,330 JavaScript inside of a web page. 980 00:45:56,330 --> 00:45:57,960 And now notice the logic here. 981 00:45:57,960 --> 00:45:59,840 Here is a key called title. 982 00:45:59,840 --> 00:46:02,690 Here is its crazy long value now. 983 00:46:02,690 --> 00:46:07,840 >> But if I go to the helpers page-- or rather, the header page, 984 00:46:07,840 --> 00:46:11,310 I'm calling this function on that title first. 985 00:46:11,310 --> 00:46:15,250 So if I now reload this page, I see this, which looks ridiculous, 986 00:46:15,250 --> 00:46:16,110 but it's safe. 987 00:46:16,110 --> 00:46:17,310 It just looks stupid. 988 00:46:17,310 --> 00:46:20,320 >> But suppose instead I had forgotten this.
989 00:46:20,320 --> 00:46:24,660 And mark my words, a nonzero number of you will forget to do this 990 00:46:24,660 --> 00:46:27,790 and you'll get some industrious student or friend coming up 991 00:46:27,790 --> 00:46:31,540 to you at the CS50 fair or anonymously at night poking around on your website 992 00:46:31,540 --> 00:46:35,300 and essentially injecting code unbeknownst to you into your site 993 00:46:35,300 --> 00:46:35,800 somehow. 994 00:46:35,800 --> 00:46:39,000 >> Because if I simply spit out title here and title 995 00:46:39,000 --> 00:46:44,330 there-- well, if title literally looks like this and PHP 996 00:46:44,330 --> 00:46:47,660 is a language that can spit out other languages' text, 997 00:46:47,660 --> 00:46:50,650 this is literally going to replace this tag with, 998 00:46:50,650 --> 00:46:53,010 of course, what I put elsewhere. 999 00:46:53,010 --> 00:46:57,640 >> So if I now go here and reload after undoing those safety mechanisms, 1000 00:46:57,640 --> 00:46:59,982 now I have hello world here. 1001 00:46:59,982 --> 00:47:02,690 Now that's not all that big of a deal, but you could do something 1002 00:47:02,690 --> 00:47:05,119 a little more malicious here, like there's 1003 00:47:05,119 --> 00:47:08,410 other tags-- as we'll see once we spend more time in JavaScript-- like location 1004 00:47:08,410 --> 00:47:14,910 dot href gets, quote, unquote, HTTP business.com, but the opposite of that 1005 00:47:14,910 --> 00:47:15,950 from the other day. 1006 00:47:15,950 --> 00:47:20,120 And now you can induce a web page to actually go immediately 1007 00:47:20,120 --> 00:47:21,190 to this web page here. 1008 00:47:21,190 --> 00:47:23,000 >> And actually, I don't want to even go to business.com 1009 00:47:23,000 --> 00:47:24,749 because I don't want to know what that is. 1010 00:47:24,749 --> 00:47:28,710 But this, too, will trigger code to be injected into this page.
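To make the difference concrete, here is a small sketch comparing the escaped and unescaped versions of that title. The exact alert string is an assumption (it was only dictated aloud above), and ENT_QUOTES is passed explicitly here so that single quotes are escaped the same way regardless of PHP version.

```php
<?php
// Assumed title value, modeled on the one dictated in lecture.
$title = "<script>alert('hello, world');</script>";

// Forgetting to escape: the browser would see a real script tag and run it.
$unsafe = "<title>" . $title . "</title>";

// Escaping first: the angle brackets and quotes become harmless entities,
// so the browser displays the text instead of executing it.
$safe = "<title>" . htmlspecialchars($title, ENT_QUOTES) . "</title>";

echo $unsafe . "\n";
echo $safe . "\n";
```

The second line printed is what "looks ridiculous but is safe": the characters are shown to the user rather than interpreted as markup.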
1011 00:47:28,710 --> 00:47:32,680 So this is only to say that even though we're introducing super early on some 1012 00:47:32,680 --> 00:47:36,800 of these more complex structures, it's all toward an end of making sure 1013 00:47:36,800 --> 00:47:39,320 that your code is not exploitable. 1014 00:47:39,320 --> 00:47:40,960 >> So now a third version here. 1015 00:47:40,960 --> 00:47:42,470 It's getting a little fancier. 1016 00:47:42,470 --> 00:47:44,875 I didn't really like-- the anal side of me 1017 00:47:44,875 --> 00:47:47,750 was getting a little annoyed by the fact that I had a function called 1018 00:47:47,750 --> 00:47:51,940 RenderHeader and RenderFooter that were almost identical. 1019 00:47:51,940 --> 00:47:55,400 So it occurred to me, why don't I parameterize these functions 1020 00:47:55,400 --> 00:47:59,180 into just one called render, have it take a second argument 1021 00:47:59,180 --> 00:48:04,420 like the name of the template, the file to render-- either header or footer? 1022 00:48:04,420 --> 00:48:07,160 And then optionally, if I want to pass in some key value pairs 1023 00:48:07,160 --> 00:48:10,580 like I do for the title for the header but not for the footer, 1024 00:48:10,580 --> 00:48:11,800 I could do that. 1025 00:48:11,800 --> 00:48:16,510 >> And so now if I go into helpers.php, it's a little more complex. 1026 00:48:16,510 --> 00:48:19,670 And I'll wave my hands at the details, but it's just one function. 1027 00:48:19,670 --> 00:48:21,890 So that's a step toward a better design. 1028 00:48:21,890 --> 00:48:23,360 >> We can take this one step further. 1029 00:48:23,360 --> 00:48:28,890 If I go into my fourth version of this, notice now 1030 00:48:28,890 --> 00:48:31,320 that I'm doing something even more kind of cryptic. 1031 00:48:31,320 --> 00:48:33,230 And I know this is a lot to absorb at once, 1032 00:48:33,230 --> 00:48:35,080 but we're just kind of cleaning things up.
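The single render function from the third version, whose details were waved at above, might be sketched like this. Everything specific here is an assumption rather than the lecture's code: the TEMPLATE_DIR constant, the temp-directory templates, the argument order, the exception on a missing template, and the EXTR_SKIP flag (used so that extracted keys cannot clobber the function's own variables).

```php
<?php
// Assumption: a throwaway templates directory stands in for the real one.
define("TEMPLATE_DIR", sys_get_temp_dir() . "/templates_sketch_" . uniqid());
mkdir(TEMPLATE_DIR);
file_put_contents(TEMPLATE_DIR . "/header.php",
    '<html><head><title><?= htmlspecialchars($title, ENT_QUOTES) ?></title></head><body>' . "\n");
file_put_contents(TEMPLATE_DIR . "/footer.php", "</body></html>\n");

// One function replaces the near-identical header and footer helpers.
// $template names the file to render; $data holds optional key-value pairs.
function render($template, $data = [])
{
    $path = TEMPLATE_DIR . "/" . $template . ".php";
    if (!file_exists($path)) {
        throw new RuntimeException("No such template: $template");
    }

    // Turn keys into local variables without overwriting $template or $path.
    extract($data, EXTR_SKIP);
    require $path;
}

// Capture the output so the assembled page can be inspected as a string.
ob_start();
render("header", ["title" => "CS50"]);
echo "<ul><li>Lectures</li><li>Syllabus</li></ul>\n";
render("footer");
$page = ob_get_clean();

echo $page;
```

The header call passes a title and the footer call passes nothing, which is exactly the case the default empty array is there to handle.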
1033 00:48:35,080 --> 00:48:38,550 Now I'm putting my helpers file into a folder called 1034 00:48:38,550 --> 00:48:41,190 includes-- just an arbitrary name where I want to put stuff 1035 00:48:41,190 --> 00:48:44,300 that I want to include-- and then the rest of this is the same. 1036 00:48:44,300 --> 00:48:47,140 >> But if I look now in gedit, notice that I've gotten rid 1037 00:48:47,140 --> 00:48:51,940 of all of those other files and I've moved them, for instance, into here. 1038 00:48:51,940 --> 00:48:55,110 And then in templates, I have this here, too. 1039 00:48:55,110 --> 00:48:59,292 And so this is all now toward a step of using a much better design pattern. 1040 00:48:59,292 --> 00:49:01,000 And we're very quickly going to move away 1041 00:49:01,000 --> 00:49:03,870 from PHP's default functionality, which we started here with, 1042 00:49:03,870 --> 00:49:07,655 where you just commingle PHP, and your HTML, and your CSS, 1043 00:49:07,655 --> 00:49:09,780 and you just spit it out and you go about your way. 1044 00:49:09,780 --> 00:49:11,404 It's not going to be very maintainable. 1045 00:49:11,404 --> 00:49:14,481 Just like in C, we started using multiple files and multiple functions 1046 00:49:14,481 --> 00:49:15,730 and factoring things out. 1047 00:49:15,730 --> 00:49:16,688 We'll do the same here. 1048 00:49:16,688 --> 00:49:19,970 And in fact, in the fifth and final version here, I did one other thing. 1049 00:49:19,970 --> 00:49:23,710 You can even use dot dot, which, again, is just the parent directory. 1050 00:49:23,710 --> 00:49:28,260 To be even more security conscious, because if I look at the listing 1051 00:49:28,260 --> 00:49:32,450 here for fifth and final version, notice that I have one directory here called 1052 00:49:32,450 --> 00:49:35,180 public, and then on the same level, so to speak, 1053 00:49:35,180 --> 00:49:38,490 I've got includes and templates and then that text file readme.
1054 00:49:38,490 --> 00:49:41,130 >> And the reason I've structured it like this-- and so many web 1055 00:49:41,130 --> 00:49:44,330 hosts, especially those $5 a month ones or $10 a month ones, 1056 00:49:44,330 --> 00:49:47,170 if you've ever had one of these services-- what so many of them do 1057 00:49:47,170 --> 00:49:50,690 is they just expect you to dump all of your files into one directory, 1058 00:49:50,690 --> 00:49:53,640 like we did already with this very first example. 1059 00:49:53,640 --> 00:49:56,740 >> But as soon as you start building more sophisticated sites that store 1060 00:49:56,740 --> 00:50:00,480 data you care about and files you care about, only by actually organizing things 1061 00:50:00,480 --> 00:50:05,060 correctly and with more security consciousness in mind can 1062 00:50:05,060 --> 00:50:07,927 we start to defend against all of the friends 1063 00:50:07,927 --> 00:50:10,135 that you have either in or outside of this class who, 1064 00:50:10,135 --> 00:50:12,510 as soon as you start making programs yourself on the web, 1065 00:50:12,510 --> 00:50:15,140 are going to start picking on you and on them. 1066 00:50:15,140 --> 00:50:17,420 >> And so we'll look ultimately at this design. 1067 00:50:17,420 --> 00:50:20,010 This is just a picture that depicts the following. 1068 00:50:20,010 --> 00:50:22,897 We're going to put all of our programming logic in one or more files, 1069 00:50:22,897 --> 00:50:25,230 and we're going to just start calling those controllers. 1070 00:50:25,230 --> 00:50:28,022 It's where the brains of our websites actually are. 1071 00:50:28,022 --> 00:50:29,730 Then we're going to have views, and views 1072 00:50:29,730 --> 00:50:32,480 are as simple as just separate files-- called templates, often.
1073 00:50:32,480 --> 00:50:34,410 They just have the aesthetics of my page, 1074 00:50:34,410 --> 00:50:37,020 what I want the page to look like-- the colors and the layout 1075 00:50:37,020 --> 00:50:38,870 and the positions of all of the variables. 1076 00:50:38,870 --> 00:50:41,120 >> And then more interesting that we'll eventually get to 1077 00:50:41,120 --> 00:50:45,420 is the model, which is going to be just the word we slap on other technologies 1078 00:50:45,420 --> 00:50:47,771 that we bring into the picture, like actual databases, 1079 00:50:47,771 --> 00:50:49,520 so that when you want to save information, 1080 00:50:49,520 --> 00:50:52,140 you don't just send an email to your proctor or to yourself, 1081 00:50:52,140 --> 00:50:57,350 you actually store it in a database using another language known as SQL. 1082 00:50:57,350 --> 00:51:00,450 And so we'll leave here today and pick up with this on Wednesday 1083 00:51:00,450 --> 00:51:02,990 and introduce databases then. 1084 00:51:02,990 --> 00:51:06,940 >> [MUSIC PLAYING] 1085 00:51:06,940 --> 00:54:24,555