1 00:00:00,000 --> 00:00:01,390 2 00:00:01,390 --> 00:00:04,890 >> [MUSIC PLAYING] 3 00:00:04,890 --> 00:00:10,955 4 00:00:10,955 --> 00:00:12,580 DAVID J MALAN: All right, welcome back. 5 00:00:12,580 --> 00:00:13,600 This is CS50. 6 00:00:13,600 --> 00:00:15,540 This is the end of week seven. 7 00:00:15,540 --> 00:00:18,180 And it's the end of that scavenger hunt from problem set four 8 00:00:18,180 --> 00:00:19,220 that you might recall. 9 00:00:19,220 --> 00:00:21,650 After recovering all of those JPEGs of staff, 10 00:00:21,650 --> 00:00:24,820 you were challenged, if you'd like, to photograph yourself with as many 11 00:00:24,820 --> 00:00:25,981 of those folks as you can. 12 00:00:25,981 --> 00:00:28,480 We got a whole bunch of submissions over the past few weeks, 13 00:00:28,480 --> 00:00:32,980 indeed, quite a few right before noon today, some of which are those here, 14 00:00:32,980 --> 00:00:37,670 caught here in-- looks like-- Annenberg Hall at office hours, one here 15 00:00:37,670 --> 00:00:39,530 in Lowell House with Nick. 16 00:00:39,530 --> 00:00:41,750 Here's Ramon being caught on the phone. 17 00:00:41,750 --> 00:00:43,870 This was at a CS50 lunch. 18 00:00:43,870 --> 00:00:46,840 This was Jason Skyping with a more creative classmate, 19 00:00:46,840 --> 00:00:48,280 who phoned him this way. 20 00:00:48,280 --> 00:00:49,690 We don't know what this was. 21 00:00:49,690 --> 00:00:51,940 >> [LAUGHTER] 22 00:00:51,940 --> 00:00:54,570 >> DAVID J MALAN: But that's worth a gigabyte. 23 00:00:54,570 --> 00:00:56,960 Here is Chang, who literally ran off the stage 24 00:00:56,960 --> 00:01:00,480 to avoid being photographed one day, but was eventually caught. 25 00:01:00,480 --> 00:01:02,050 Here is Nick. 26 00:01:02,050 --> 00:01:03,480 Here is Nick. 27 00:01:03,480 --> 00:01:04,080 Here is Nick. 28 00:01:04,080 --> 00:01:05,090 29 00:01:05,090 --> 00:01:07,670 And here is Alison down by the fields. 
30 00:01:07,670 --> 00:01:11,840 And Zamyla even was found at a ballroom competition. 31 00:01:11,840 --> 00:01:14,100 So we will go through these photos, figure out 32 00:01:14,100 --> 00:01:16,690 who submitted the most the earliest, and reward 33 00:01:16,690 --> 00:01:20,662 one fabulous prize, as promised in the spec. 34 00:01:20,662 --> 00:01:23,120 And we'll also follow up about the space that was involved. 35 00:01:23,120 --> 00:01:26,860 >> A couple of announcements-- so lunch is, again, this Friday at 1:15 PM. 36 00:01:26,860 --> 00:01:30,420 If you'd like to join us, RSVP at that URL here. 37 00:01:30,420 --> 00:01:33,730 Jason appears again here from one of the sections a couple of years 38 00:01:33,730 --> 00:01:35,510 back, which happened to fall on Halloween. 39 00:01:35,510 --> 00:01:38,950 And indeed, he dressed as a pumpkin that particular year. 40 00:01:38,950 --> 00:01:42,700 If you watch this section of his from 2011 section 41 00:01:42,700 --> 00:01:46,480 eight, if you are curious, at CS50.tv, I think 42 00:01:46,480 --> 00:01:49,730 this was the year in which his air pump was working. 43 00:01:49,730 --> 00:01:52,490 >> If you then watch the similar section in 2012, 44 00:01:52,490 --> 00:01:55,620 you'll see this Jason much deflated, since the suit no longer functioned, 45 00:01:55,620 --> 00:01:58,060 which is only to say this Friday, if you'd 46 00:01:58,060 --> 00:02:02,720 like to carve a pumpkin with Daven and Gabe and others, RSVP to the heads 47 00:02:02,720 --> 00:02:04,480 at cs50.harvard.edu address. 48 00:02:04,480 --> 00:02:06,200 It promises to be great fun. 49 00:02:06,200 --> 00:02:08,660 Daven, we're told, has carved pumpkins all of his life. 50 00:02:08,660 --> 00:02:11,930 Gabriel from Brazil has never carved a pumpkin for Halloween. 51 00:02:11,930 --> 00:02:14,700 So be there with them as he learns. 
52 00:02:14,700 --> 00:02:16,830 >> Seminars, meanwhile-- so you'll learn soon 53 00:02:16,830 --> 00:02:20,650 about what our expectations are for the final project, which essentially 54 00:02:20,650 --> 00:02:23,150 will boil down to designing and implementing 55 00:02:23,150 --> 00:02:26,440 most any project of interest to you, albeit subject to the approval 56 00:02:26,440 --> 00:02:28,490 and guidance from your teaching fellow. 57 00:02:28,490 --> 00:02:32,110 Toward the end of the semester, we introduce a number 58 00:02:32,110 --> 00:02:35,610 of seminars, which are optional classes led by the teaching fellows and Harvard 59 00:02:35,610 --> 00:02:38,570 staff, friends of the course across campus, on various topics that 60 00:02:38,570 --> 00:02:41,470 are tangential to the course's underlying syllabus 61 00:02:41,470 --> 00:02:45,590 but nonetheless applicable, fun, and different for potential final projects. 62 00:02:45,590 --> 00:02:49,530 >> For instance, first, if you'd like to register, head to that URL there. 63 00:02:49,530 --> 00:02:53,010 And this is the lineup for this year's seminars alone. 64 00:02:53,010 --> 00:02:56,060 But realize we have dozens of seminars from years past, all of which 65 00:02:56,060 --> 00:02:59,774 are linked in the Seminars menu option of the course's website. 66 00:02:59,774 --> 00:03:02,190 So if you're thinking about going beyond your comfort zone 67 00:03:02,190 --> 00:03:05,060 or picking up some new skills, for instance, programming iPhone 68 00:03:05,060 --> 00:03:08,100 apps with Swift, a new language from Apple, or Objective-C, 69 00:03:08,100 --> 00:03:11,230 or Android apps or programming [? cue ?] light bulbs, or any of the topics 70 00:03:11,230 --> 00:03:15,490 up here and more, do check out the registration page. 71 00:03:15,490 --> 00:03:19,730 >> So we began and concluded on Monday with looking at HTTP. 72 00:03:19,730 --> 00:03:22,675 So quick refresher-- HTTP, HyperText Transfer Protocol.
73 00:03:22,675 --> 00:03:24,045 But what does that really mean? 74 00:03:24,045 --> 00:03:26,805 75 00:03:26,805 --> 00:03:27,930 What does that really mean? 76 00:03:27,930 --> 00:03:30,665 77 00:03:30,665 --> 00:03:31,290 Is that a hand? 78 00:03:31,290 --> 00:03:33,074 79 00:03:33,074 --> 00:03:34,740 I know you're just scratching your head. 80 00:03:34,740 --> 00:03:36,400 But you want to propose what HTTP is? 81 00:03:36,400 --> 00:03:37,792 82 00:03:37,792 --> 00:03:40,576 >> AUDIENCE: How computers communicate with [INAUDIBLE]. 83 00:03:40,576 --> 00:03:41,517 84 00:03:41,517 --> 00:03:43,100 DAVID J MALAN: I missed the last part. 85 00:03:43,100 --> 00:03:45,774 How computers communicate with-- 86 00:03:45,774 --> 00:03:47,325 >> AUDIENCE: Internet servers. 87 00:03:47,325 --> 00:03:50,450 DAVID J MALAN: Good-- with internet servers, and specifically, web servers. 88 00:03:50,450 --> 00:03:53,533 Because recall, there's a bunch of services on the internet, some of which 89 00:03:53,533 --> 00:03:57,349 you use probably daily between chat and message, chat, and web, and email, 90 00:03:57,349 --> 00:03:57,890 and the like. 91 00:03:57,890 --> 00:04:00,900 And HTTP is just the protocol that web browsers 92 00:04:00,900 --> 00:04:03,750 speak when communicating with web servers, and vice versa. 93 00:04:03,750 --> 00:04:05,580 And the analog in the human world might be, 94 00:04:05,580 --> 00:04:08,730 I extend my hand to shake some other human's and he or she 95 00:04:08,730 --> 00:04:11,970 acknowledges by extending his or her hand as well. 96 00:04:11,970 --> 00:04:13,970 So that's just a protocol, a set of conventions. 97 00:04:13,970 --> 00:04:15,630 >> And what indeed are those conventions? 98 00:04:15,630 --> 00:04:18,640 Well, it just boils down to sending messages back and forth, 99 00:04:18,640 --> 00:04:19,770 as we depicted here. 100 00:04:19,770 --> 00:04:22,520 And there's a couple of ways in which you can send these messages. 
101 00:04:22,520 --> 00:04:24,360 And perhaps the most common is known as get. 102 00:04:24,360 --> 00:04:26,510 And we'll see a contrast to this before long. 103 00:04:26,510 --> 00:04:30,010 >> But a get request from a browser to server just looks like this. 104 00:04:30,010 --> 00:04:32,960 It's a bunch of text that it puts inside of a virtual envelope. 105 00:04:32,960 --> 00:04:35,854 On the outside of that envelope go a couple pieces of details. 106 00:04:35,854 --> 00:04:37,770 What needs to go on the envelope, so to speak, 107 00:04:37,770 --> 00:04:41,820 in order to get a request like this from me to a web server? 108 00:04:41,820 --> 00:04:42,320 Yeah. 109 00:04:42,320 --> 00:04:43,270 >> AUDIENCE: Your IP address. 110 00:04:43,270 --> 00:04:45,890 >> DAVID J MALAN: My IP address in the From field, so to speak, 111 00:04:45,890 --> 00:04:49,490 and of course, the recipient's IP address. 112 00:04:49,490 --> 00:04:52,710 But in the case of a web packet, we need a little more detail 113 00:04:52,710 --> 00:04:55,254 It's not sufficient just to send an envelope to a server, 114 00:04:55,254 --> 00:04:57,670 because that server might be listening for different types 115 00:04:57,670 --> 00:04:59,180 of internet traffic. 116 00:04:59,180 --> 00:05:01,370 So what else do we need besides the recipient's IP? 117 00:05:01,370 --> 00:05:02,723 118 00:05:02,723 --> 00:05:03,222 Yeah? 119 00:05:03,222 --> 00:05:04,241 >> AUDIENCE: Is it TCP? 120 00:05:04,241 --> 00:05:05,074 DAVID J MALAN: Good. 121 00:05:05,074 --> 00:05:06,470 TCP-- 122 00:05:06,470 --> 00:05:07,340 >> AUDIENCE: Address. 123 00:05:07,340 --> 00:05:09,340 >> DAVID J MALAN: Address, or port, as it's called. 124 00:05:09,340 --> 00:05:11,010 Close, but a TCP port number. 125 00:05:11,010 --> 00:05:12,220 And there's a bunch of these. 126 00:05:12,220 --> 00:05:14,310 But surely the most familiar should eventually 127 00:05:14,310 --> 00:05:17,590 be 80, which is the default one used for web traffic. 
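The envelope analogy above can be sketched in Python. This is only an illustration: the function names are hypothetical, the `Host` header is a standard HTTP/1.1 detail that has not come up in the lecture, and `192.0.2.1` is a documentation-only address standing in for a real server. Only the GET line and the default ports 80 and 443 come from the discussion.

```python
# Default TCP ports for web traffic: 80 for http, 443 for https.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_get_request(host, path="/"):
    """Return the text a browser would put inside the virtual envelope.

    The Host header is an assumption added here because HTTP/1.1
    requires it; the lecture only shows the GET line itself.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "",  # a blank line ends the request headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def envelope_address(scheme, server_ip):
    """Return what goes on the outside: recipient IP and TCP port."""
    return (server_ip, DEFAULT_PORTS[scheme])


if __name__ == "__main__":
    print(build_get_request("example.com").decode("ascii"))
    print(envelope_address("http", "192.0.2.1"))
```

The sender's own IP address is left out because the operating system normally fills it in when the connection is made.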
128 00:05:17,590 --> 00:05:20,040 And another familiar one soon will be 443, 129 00:05:20,040 --> 00:05:24,280 which is used for secure web traffic, URLs that start with https. 130 00:05:24,280 --> 00:05:26,650 >> So this is what goes inside of that envelope. 131 00:05:26,650 --> 00:05:29,780 And GET / just means, give me the default web page. 132 00:05:29,780 --> 00:05:32,700 Give me the root of the hard drive on that web server. 133 00:05:32,700 --> 00:05:36,050 And hopefully, the web server will respond with, OK 134 00:05:36,050 --> 00:05:39,630 and the number 200, which is just a convention saying, yes, all 135 00:05:39,630 --> 00:05:40,470 is indeed OK. 136 00:05:40,470 --> 00:05:41,680 Here's the page. 137 00:05:41,680 --> 00:05:45,510 The type of the web page is going to be text, but more specifically, HTML, 138 00:05:45,510 --> 00:05:47,010 which we're about to dive back into. 139 00:05:47,010 --> 00:05:49,877 And the dot dot dot just means, here is the HTML. 140 00:05:49,877 --> 00:05:51,710 And that's where we pick up the story today, 141 00:05:51,710 --> 00:05:55,740 actually writing HTML, HyperText Markup Language, which 142 00:05:55,740 --> 00:05:57,727 is the language in which web pages are written. 143 00:05:57,727 --> 00:05:59,060 It's not a programming language. 144 00:05:59,060 --> 00:06:01,270 There's no functions or loops or conditions. 145 00:06:01,270 --> 00:06:03,800 It's a markup language, as we'll again see today, 146 00:06:03,800 --> 00:06:07,240 that allows you to specify how to structure and stylize 147 00:06:07,240 --> 00:06:09,300 aesthetically a web page. 148 00:06:09,300 --> 00:06:11,470 >> So this was the one and only page we really 149 00:06:11,470 --> 00:06:13,930 looked at, if briefly, on Monday. 150 00:06:13,930 --> 00:06:16,250 And notice a few salient characteristics. 151 00:06:16,250 --> 00:06:20,170 There's a lot of open angled bracket and close angled bracket.
152 00:06:20,170 --> 00:06:23,160 In between those angled brackets are words. 153 00:06:23,160 --> 00:06:25,660 And we're going to start calling those words tags. 154 00:06:25,660 --> 00:06:28,800 So open bracket head and closed bracket head 155 00:06:28,800 --> 00:06:33,620 are the open and closed tags, or the start and end tags 156 00:06:33,620 --> 00:06:37,660 respectively, of an HTML element, as we'll call it, called head. 157 00:06:37,660 --> 00:06:41,760 And the same jargon applies to body in HTML and so forth. 158 00:06:41,760 --> 00:06:43,970 >> And what's nice is HTML-- and indeed, we'll 159 00:06:43,970 --> 00:06:47,187 spend terribly little time on it, because you'll mostly just figure out 160 00:06:47,187 --> 00:06:49,770 what features it has when you actually have a concrete problem 161 00:06:49,770 --> 00:06:52,820 to solve-- you'll find that a browser is pretty dumb. 162 00:06:52,820 --> 00:06:56,450 It's just going to do-- not unlike a computer-- what you tell it to do. 163 00:06:56,450 --> 00:06:59,279 And so when you have open bracket HTML at the very top 164 00:06:59,279 --> 00:07:01,320 there, that essentially just means, hey, browser, 165 00:07:01,320 --> 00:07:04,090 here comes a web page written in HTML. 166 00:07:04,090 --> 00:07:06,130 >> When it sees open bracket head, that just means, 167 00:07:06,130 --> 00:07:10,350 hey, browser, here comes the head, or the topmost portion of my web page. 168 00:07:10,350 --> 00:07:14,192 When it sees a closed bracket head, that just means, hey, 169 00:07:14,192 --> 00:07:15,150 that's it for the head. 170 00:07:15,150 --> 00:07:16,420 Standby for something else. 171 00:07:16,420 --> 00:07:18,878 And that something else is apparently going to be the body. 172 00:07:18,878 --> 00:07:22,630 And when you don't have a tag, like you have just hello, comma, world, 173 00:07:22,630 --> 00:07:26,610 that's just going to be raw text that ultimately is displayed in the screen. 
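To make the "hey, browser" reading of start tags, end tags, and raw text concrete, here is a rough sketch using Python's standard `html.parser` module. It is not how a real browser is implemented; the class name `TagLogger` and the helper `log_tags` are made up for illustration, and the page text is reconstructed from the description above.

```python
from html.parser import HTMLParser

# The page as described in the lecture (reconstructed, not copied from a slide).
PAGE = """<!DOCTYPE html>
<html>
    <head>
        <title>hello, world</title>
    </head>
    <body>
        hello, world
    </body>
</html>"""


class TagLogger(HTMLParser):
    """Record each start tag, end tag, and piece of text in reading order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))  # "here comes the head"

    def handle_endtag(self, tag):
        self.events.append(("end", tag))  # "that's it for the head"

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip the whitespace used only for indentation
            self.events.append(("text", text))


def log_tags(markup):
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


if __name__ == "__main__":
    for kind, value in log_tags(PAGE):
        print(kind, value)
```

Reading the printed events top to bottom mirrors how the page is processed: each start event is eventually matched by an end event, and anything that is not a tag shows up as text.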
174 00:07:26,610 --> 00:07:29,220 >> Now, you'll notice too the indentation here. 175 00:07:29,220 --> 00:07:32,160 You can probably infer how we're stylizing it. 176 00:07:32,160 --> 00:07:34,850 Every time I open a tag, so to speak, I indent. 177 00:07:34,850 --> 00:07:38,540 And every time I close a tag, I un-indent, 178 00:07:38,540 --> 00:07:40,690 similar in spirit to curly braces. 179 00:07:40,690 --> 00:07:43,470 And beyond that, I'm kind of using my judgment. 180 00:07:43,470 --> 00:07:48,380 Notice that I didn't bother hitting Enter inside of that title tag. 181 00:07:48,380 --> 00:07:48,990 Why? 182 00:07:48,990 --> 00:07:51,920 Well, I just decided it looked a little cleaner to me, the human, 183 00:07:51,920 --> 00:07:53,181 to just not bother doing that. 184 00:07:53,181 --> 00:07:54,930 So again, there's some judgment calls just 185 00:07:54,930 --> 00:07:57,670 like there is in C or any language. 186 00:07:57,670 --> 00:08:04,110 >> But notice too that this indentation lends itself to a mental model, 187 00:08:04,110 --> 00:08:05,670 not to overcomplicate it. 188 00:08:05,670 --> 00:08:07,020 But a tree, right? 189 00:08:07,020 --> 00:08:09,290 If you think of a web page, apparently written 190 00:08:09,290 --> 00:08:12,050 like this, as being nicely indented that way, 191 00:08:12,050 --> 00:08:17,390 you can almost think of the open bracket HTML closed bracket tag as demarcating 192 00:08:17,390 --> 00:08:21,380 the root node of a tree, a family tree style node in the style of the trees 193 00:08:21,380 --> 00:08:22,900 we looked at last Friday. 194 00:08:22,900 --> 00:08:27,630 >> And indeed, we have on the right here what we'll call a DOM, D-O-M, document 195 00:08:27,630 --> 00:08:31,680 object model, a fancy way of saying a tree that represents that HTML. 196 00:08:31,680 --> 00:08:36,140 And notice that HTML has, we'll say, like a family tree, two children. 197 00:08:36,140 --> 00:08:37,659 On the left is head.
198 00:08:37,659 --> 00:08:39,179 On the right is body. 199 00:08:39,179 --> 00:08:44,220 >> And just as a mindless thought exercise, head, of course, has how many children 200 00:08:44,220 --> 00:08:46,070 according to this structure? 201 00:08:46,070 --> 00:08:48,200 So just one, title-- and that's why we have 202 00:08:48,200 --> 00:08:50,580 the arrow going from head to the title. 203 00:08:50,580 --> 00:08:55,110 So it's as though that person in the family tree had just one offspring. 204 00:08:55,110 --> 00:08:58,230 And then title itself can be said to have a child too. 205 00:08:58,230 --> 00:09:01,780 >> Recall that the HTML had hello, comma, world beneath it. 206 00:09:01,780 --> 00:09:06,090 And I've simply drawn it within an oval instead of a rectangle just 207 00:09:06,090 --> 00:09:10,559 to convey semantically that even though it's a node in the tree, so to speak, 208 00:09:10,559 --> 00:09:12,100 it's sort of fundamentally different. 209 00:09:12,100 --> 00:09:12,800 It's not a tag. 210 00:09:12,800 --> 00:09:14,780 Or more properly, it's not an element. 211 00:09:14,780 --> 00:09:16,590 It's just a text node, if you will. 212 00:09:16,590 --> 00:09:18,990 But these are completely arbitrary human conventions. 213 00:09:18,990 --> 00:09:23,180 This is just now my way of representing what I'll, in the aggregate, 214 00:09:23,180 --> 00:09:24,340 call the document. 215 00:09:24,340 --> 00:09:27,750 >> And as an aside, the thing at the super top left hand corner, 216 00:09:27,750 --> 00:09:32,080 open bracket exclamation point DOCTYPE html, this looks like a tag, 217 00:09:32,080 --> 00:09:35,560 but it's the stupid corner case where that is just there, copied and pasted, 218 00:09:35,560 --> 00:09:38,460 to indicate to browsers that this is HTML version 5. 219 00:09:38,460 --> 00:09:41,540 The world keeps changing what the first line of code in a page should be. 220 00:09:41,540 --> 00:09:43,820 This just means version 5.
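The family-tree picture of the document can be approximated in a few lines of Python. This is a simplified sketch, not a real DOM implementation: nodes are plain dictionaries, the names `TreeBuilder`, `build_tree`, and `outline` are invented for this example, and elements without an end tag are not handled.

```python
from html.parser import HTMLParser

MARKUP = (
    "<!DOCTYPE html><html><head><title>hello, world</title></head>"
    "<body>hello, world</body></html>"
)


class TreeBuilder(HTMLParser):
    """Build a nested dictionary tree from well-formed, fully closed HTML."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "document", "children": []}
        self.stack = [self.root]  # path from the root to the open element

    def handle_starttag(self, tag, attrs):
        node = {"name": tag, "children": []}
        self.stack[-1]["children"].append(node)  # child of the open element
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()  # closing a tag moves back up to the parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Text nodes have no name and no children (the ovals in the diagram).
            self.stack[-1]["children"].append({"text": text})


def build_tree(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return one indented line per node, children below their parent."""
    label = node.get("name") or repr(node["text"])
    lines = ["  " * depth + label]
    for child in node.get("children", []):
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(build_tree(MARKUP))))
```

The stack plays the same role as the indentation: opening a tag goes one level deeper, closing it comes back up. The DOCTYPE line does not appear in the tree because the parser treats it as a declaration rather than an element.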
221 00:09:43,820 --> 00:09:45,950 So it doesn't quite look like the others. 222 00:09:45,950 --> 00:09:48,120 >> All right, so with that said, you'll now appreciate 223 00:09:48,120 --> 00:09:50,767 this fairly stupid tattoo someone got. 224 00:09:50,767 --> 00:09:51,990 >> [LAUGHTER] 225 00:09:51,990 --> 00:09:54,210 >> DAVID J MALAN: All right, and now let's actually dive 226 00:09:54,210 --> 00:09:55,710 into doing something with this. 227 00:09:55,710 --> 00:09:58,610 You'll recall that last time I opened up the CS50 Appliance 228 00:09:58,610 --> 00:10:01,650 and I did something as simple as opening up gedit. 229 00:10:01,650 --> 00:10:05,190 And I saved the file even on my desktop-- nowhere special-- 230 00:10:05,190 --> 00:10:05,870 as hello.html. 231 00:10:05,870 --> 00:10:07,100 232 00:10:07,100 --> 00:10:10,984 >> So let me do that again-- hello.html Enter. 233 00:10:10,984 --> 00:10:13,900 And now in this file, I'm going to go ahead and replicate what we just 234 00:10:13,900 --> 00:10:18,850 saw-- DOCTYPE html. Then I'm going to do open bracket html closed bracket. 235 00:10:18,850 --> 00:10:21,890 And then I'm going to preemptively open and close the tag. 236 00:10:21,890 --> 00:10:22,390 Why? 237 00:10:22,390 --> 00:10:23,598 Just so I don't forget later. 238 00:10:23,598 --> 00:10:26,850 It's just good practice, like opening and closing curly braces all at once. 239 00:10:26,850 --> 00:10:28,900 >> And then what came next? 240 00:10:28,900 --> 00:10:30,582 You can think of the tattoo. 241 00:10:30,582 --> 00:10:31,450 >> AUDIENCE: The head. 242 00:10:31,450 --> 00:10:32,500 >> DAVID J MALAN: The head. 243 00:10:32,500 --> 00:10:36,020 And then in here, I had the title, I think. 244 00:10:36,020 --> 00:10:39,886 And the title was arbitrarily, hello, world, close title. 245 00:10:39,886 --> 00:10:42,760 And then down here, the body, of course-- then we close the body tag.
246 00:10:42,760 --> 00:10:45,660 And then just somewhat redundantly, I had the same thing down here. 247 00:10:45,660 --> 00:10:47,150 >> So I claim that this is a web page. 248 00:10:47,150 --> 00:10:49,050 This is something that could now live on the web, 249 00:10:49,050 --> 00:10:51,925 even though of course, it's literally living on my desktop right now. 250 00:10:51,925 --> 00:10:55,837 But indeed, if I minimize gedit, I'll see on my desktop its icon. 251 00:10:55,837 --> 00:10:58,420 Even though this is the appliance, you could do this on Mac OS 252 00:10:58,420 --> 00:11:01,580 with TextEdit or Windows with Notepad even. 253 00:11:01,580 --> 00:11:06,115 >> And if I go ahead and double click that even, and select-- well, let's 254 00:11:06,115 --> 00:11:07,990 not select that because Chrome's not opening. 255 00:11:07,990 --> 00:11:09,281 Let's go ahead and open Chrome. 256 00:11:09,281 --> 00:11:10,160 257 00:11:10,160 --> 00:11:14,040 And then do Command-O for open. And navigate to my desktop 258 00:11:14,040 --> 00:11:15,320 and open that file. 259 00:11:15,320 --> 00:11:20,120 That is how a browser interprets HTML, top to bottom, left to right. 260 00:11:20,120 --> 00:11:21,314 Hey, browser, here's HTML. 261 00:11:21,314 --> 00:11:21,980 Here's the head. 262 00:11:21,980 --> 00:11:23,250 Here's the title. 263 00:11:23,250 --> 00:11:24,090 Here's the body. 264 00:11:24,090 --> 00:11:26,620 And indeed, this is how it renders that web page. 265 00:11:26,620 --> 00:11:27,800 >> But notice the URL. 266 00:11:27,800 --> 00:11:32,430 None of you could pull up this specific page on your laptops right now, 267 00:11:32,430 --> 00:11:34,910 even inside of your appliance via that URL, 268 00:11:34,910 --> 00:11:40,130 because file:// indicates it's actually on my file system, my hard drive, 269 00:11:40,130 --> 00:11:40,990 not yours. 270 00:11:40,990 --> 00:11:42,440 So this isn't all that useful.
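The difference between a file:// URL and one served over HTTP is visible in the URL itself. The sketch below uses Python's standard `urllib.parse`; the two example URLs are placeholders (the desktop path and `example.com` are assumptions, not the addresses shown on screen), and `served_over_http` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the parts that matter for this comparison."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}


def served_over_http(url):
    """True when the URL names a web server rather than a local file."""
    return urlsplit(url).scheme in ("http", "https")


if __name__ == "__main__":
    # Hypothetical path: a file on one particular machine's disk.
    print(describe("file:///home/user/Desktop/hello.html"))
    # Placeholder host: a page requested from a server by name.
    print(describe("http://example.com/hello.html"))
```

A file:// URL has no host at all, only a path on whichever machine opens it, which is why nobody else can load it.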
271 00:11:42,440 --> 00:11:44,940 >> Let's now move toward using an actual web server. 272 00:11:44,940 --> 00:11:48,309 And it turns out the CS50 Appliance is more than just an environment where 273 00:11:48,309 --> 00:11:51,100 you can write C code and compile and run it like you've been doing. 274 00:11:51,100 --> 00:11:55,500 It also has been configured by the staff to represent a typical web 275 00:11:55,500 --> 00:11:58,290 server that's on the internet, one that you might pay for 276 00:11:58,290 --> 00:12:00,210 or one that's in the so-called cloud. 277 00:12:00,210 --> 00:12:02,600 >> And it's running standard free open source 278 00:12:02,600 --> 00:12:06,160 software, for instance, something called Apache, which is perhaps 279 00:12:06,160 --> 00:12:08,700 still the most popular web server software in the world 280 00:12:08,700 --> 00:12:11,030 that thousands of websites use today. 281 00:12:11,030 --> 00:12:13,420 And it also even has software like MySQL, 282 00:12:13,420 --> 00:12:16,240 which is a database server that we'll eventually get to, 283 00:12:16,240 --> 00:12:18,330 which is only to say I can start treating 284 00:12:18,330 --> 00:12:22,040 my appliance as a full fledged server that I'm not paying for elsewhere. 285 00:12:22,040 --> 00:12:25,980 It just lives on my own laptop for development and convenience purposes. 286 00:12:25,980 --> 00:12:27,870 >> So let's go ahead and take advantage of this. 287 00:12:27,870 --> 00:12:30,120 I'm going to go ahead and open up a terminal window. 288 00:12:30,120 --> 00:12:33,030 And I'm going to go ahead and move-- actually, first I'm 289 00:12:33,030 --> 00:12:34,860 going to navigate to my desktop. 290 00:12:34,860 --> 00:12:36,400 If I do ls, there's hello.html. 291 00:12:36,400 --> 00:12:37,022 292 00:12:37,022 --> 00:12:38,730 And I'm going to go ahead and start using 293 00:12:38,730 --> 00:12:40,800 a new directory we've not used before today. 
294 00:12:40,800 --> 00:12:46,840 >> hello.html-- I'm going to move it to ../vhosts for virtual hosts-- 295 00:12:46,840 --> 00:12:50,940 more on that in the future-- and then into a directory called localhost, 296 00:12:50,940 --> 00:12:54,420 which is the nickname given to almost any computer, whether it's a Mac, PC, 297 00:12:54,420 --> 00:12:57,560 or Linux computer, and then specifically into a directory that we, 298 00:12:57,560 --> 00:13:01,260 the staff, already created for you when you downloaded the appliance called 299 00:13:01,260 --> 00:13:01,760 public. 300 00:13:01,760 --> 00:13:04,551 And as its name suggests, anything I put in this folder, in theory, 301 00:13:04,551 --> 00:13:07,790 is going to now be public, at least to people 302 00:13:07,790 --> 00:13:10,030 who have a direct connection to my computer. 303 00:13:10,030 --> 00:13:13,160 >> So now let me go ahead and do cd to that same directory 304 00:13:13,160 --> 00:13:15,490 so I can see what's going on and type ls. 305 00:13:15,490 --> 00:13:17,630 And indeed, that's the only thing in there. 306 00:13:17,630 --> 00:13:23,250 I claim now that because I have put this file hello.html inside of a directory 307 00:13:23,250 --> 00:13:26,940 called public inside of a directory called localhost inside of a directory 308 00:13:26,940 --> 00:13:29,810 called vhosts, which thanks to CS50 staff 309 00:13:29,810 --> 00:13:34,390 has been pre-configured to be the root of your web server, 310 00:13:34,390 --> 00:13:36,900 I can now hopefully do this. 311 00:13:36,900 --> 00:13:38,390 >> I'm going to open up a new tab. 312 00:13:38,390 --> 00:13:40,090 And I'm going to go not to file://. 313 00:13:40,090 --> 00:13:44,520 I'm going to use actual http://localhost, which 314 00:13:44,520 --> 00:13:47,470 again, is the nickname for my own server. 315 00:13:47,470 --> 00:13:51,085 And then I'm going to go to what file name, just to be clear?
316 00:13:51,085 --> 00:13:52,680 317 00:13:52,680 --> 00:13:54,320 Where is this story probably going? 318 00:13:54,320 --> 00:13:56,066 319 00:13:56,066 --> 00:13:56,565 hello.html. 320 00:13:56,565 --> 00:13:58,350 321 00:13:58,350 --> 00:14:04,270 >> So in other words, I want to now treat my own computer, my own appliance, 322 00:14:04,270 --> 00:14:05,660 as though it's an actual server. 323 00:14:05,660 --> 00:14:07,490 Its nickname is localhost. 324 00:14:07,490 --> 00:14:10,210 But think of localhost as like Facebook.com, google.com, whatever. 325 00:14:10,210 --> 00:14:11,600 It's just my local name. 326 00:14:11,600 --> 00:14:14,810 And then the file I want is in the root of the hard drive, so to speak, 327 00:14:14,810 --> 00:14:17,729 or the root of the web server, ergo the forward slash and then 328 00:14:17,729 --> 00:14:18,770 the file name hello.html. 329 00:14:18,770 --> 00:14:19,880 330 00:14:19,880 --> 00:14:21,930 >> Let me zoom out and hit Enter. 331 00:14:21,930 --> 00:14:24,266 And indeed, there is now my web page. 332 00:14:24,266 --> 00:14:25,390 So it's slightly different. 333 00:14:25,390 --> 00:14:26,880 And it's just as underwhelming. 334 00:14:26,880 --> 00:14:27,904 This is the old version. 335 00:14:27,904 --> 00:14:29,070 Let me shrink the font back. 336 00:14:29,070 --> 00:14:29,745 This is the old. 337 00:14:29,745 --> 00:14:30,890 This is the new. 338 00:14:30,890 --> 00:14:35,430 But what's fundamentally happening now is that HTTP is being used. 339 00:14:35,430 --> 00:14:39,344 >> Let's make this a little more clear or, if you will, a little more complicated. 340 00:14:39,344 --> 00:14:41,760 Let me go to the bottom right hand corner of my appliance. 341 00:14:41,760 --> 00:14:44,000 And notice that all this time, there's been a number. 342 00:14:44,000 --> 00:14:47,330 That is the unique address of your CS50 Appliance.
343 00:14:47,330 --> 00:14:50,800 It's a private address, as implied by the 172.16, 344 00:14:50,800 --> 00:14:53,860 which just means only you physically can access this web server. 345 00:14:53,860 --> 00:14:56,340 Everything is firewalled and nicely protected from the rest 346 00:14:56,340 --> 00:14:58,130 of the world because of this addressing. 347 00:14:58,130 --> 00:15:01,920 >> And now notice though if I go to this address, not in my appliance, 348 00:15:01,920 --> 00:15:04,340 but in Mac OS-- I'm going to go back over here. 349 00:15:04,340 --> 00:15:05,930 This is my Mac now. 350 00:15:05,930 --> 00:15:08,460 And now I'm going to open up this version of Chrome here. 351 00:15:08,460 --> 00:15:17,370 And I'm going to go to http://172.16.25 / and I forget the rest-- 133. 352 00:15:17,370 --> 00:15:25,210 >> So I'm going to visit from my Mac that IP address /hello.html Enter. 353 00:15:25,210 --> 00:15:29,850 And now I see from my Mac that my CS50 Appliance, whose 354 00:15:29,850 --> 00:15:32,600 IP address is that number, is indeed behaving 355 00:15:32,600 --> 00:15:34,320 like a web server on the internet. 356 00:15:34,320 --> 00:15:36,944 It doesn't have a nice easy to remember name like Facebook.com, 357 00:15:36,944 --> 00:15:40,370 but it's using HTTP apparently, even though Chrome 358 00:15:40,370 --> 00:15:43,560 is kind of simplifying the world for us by not showing us http://. 359 00:15:43,560 --> 00:15:46,210 But this is indeed exactly that. 360 00:15:46,210 --> 00:15:48,470 Chrome is just saving some keystrokes these days. 361 00:15:48,470 --> 00:15:50,530 And that's what we now see. 362 00:15:50,530 --> 00:15:51,890 >> So that's all fine and good. 363 00:15:51,890 --> 00:15:53,740 But it's a pretty underwhelming page. 364 00:15:53,740 --> 00:15:56,230 Let me go in and do something a little different now. 365 00:15:56,230 --> 00:15:57,910 So let me go back to gedit.
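Whether an address like the appliance's falls in a private range can be checked with Python's standard `ipaddress` module. This is a small sketch: `172.16.25.133` is pieced together from the digits read aloud above and is used only as an example, and `8.8.8.8` is included purely as a familiar public address for contrast.

```python
import ipaddress

# 172.16.0.0/12 is one of the address blocks reserved for private networks.
PRIVATE_172 = ipaddress.ip_network("172.16.0.0/12")


def is_private(address):
    """True if the address is reserved for private use."""
    return ipaddress.ip_address(address).is_private


def in_172_block(address):
    """True if the address sits inside the 172.16.0.0/12 private block."""
    return ipaddress.ip_address(address) in PRIVATE_172


if __name__ == "__main__":
    print(is_private("172.16.25.133"), in_172_block("172.16.25.133"))
    print(is_private("8.8.8.8"), in_172_block("8.8.8.8"))
```

Private addresses are not routed across the public internet, which is why only machines on the same local (here, virtual) network can reach the appliance at that number.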
366 00:15:57,910 --> 00:16:00,580 And instead of hello, world, let's put an image. 367 00:16:00,580 --> 00:16:05,880 And I claimed from before-- let me go into my localhost directory public. 368 00:16:05,880 --> 00:16:10,580 And let me go ahead and copy a whole bunch of files from today 369 00:16:10,580 --> 00:16:15,633 from my Dropbox folder into here. 370 00:16:15,633 --> 00:16:19,470 371 00:16:19,470 --> 00:16:21,680 >> Now if I type ls, look at all these files 372 00:16:21,680 --> 00:16:24,940 that I've distributed by the course's website in advance of today, 373 00:16:24,940 --> 00:16:26,830 one of which is still hello.html. 374 00:16:26,830 --> 00:16:27,830 So there's that one. 375 00:16:27,830 --> 00:16:30,730 And recall this silly one from last time-- cat.jpg . 376 00:16:30,730 --> 00:16:34,550 So let me try to embed cat.jpg inside of my web page. 377 00:16:34,550 --> 00:16:37,690 >> I'm going to go ahead and do cat.jpg, save. 378 00:16:37,690 --> 00:16:38,950 Let me go back to Chrome. 379 00:16:38,950 --> 00:16:41,140 And let me zoom in the font and now reload. 380 00:16:41,140 --> 00:16:43,090 381 00:16:43,090 --> 00:16:45,030 Oops, where I put this? 382 00:16:45,030 --> 00:16:48,210 383 00:16:48,210 --> 00:16:51,520 Standby-- I still have the old version from my desktop open. 384 00:16:51,520 --> 00:16:56,020 So let me go into my vhost, my localhost, my public, and hello.html. 385 00:16:56,020 --> 00:16:57,320 386 00:16:57,320 --> 00:17:00,670 So now let me go ahead and say cat.jpg inside of the body 387 00:17:00,670 --> 00:17:02,830 where I want it to be displayed and reload. 388 00:17:02,830 --> 00:17:04,560 Of course, this is not correct. 389 00:17:04,560 --> 00:17:08,050 >> So I need to tell the browser a little more deliberately what I want it to do. 390 00:17:08,050 --> 00:17:10,210 Simply typing the name is obviously not sufficient. 391 00:17:10,210 --> 00:17:15,134 So recall that there was another tag, image, img for short. 
392 00:17:15,134 --> 00:17:17,550 That's just because humans don't like to type full words. 393 00:17:17,550 --> 00:17:19,050 And then we can do src="cat.jpg". 394 00:17:19,050 --> 00:17:21,470 395 00:17:21,470 --> 00:17:23,550 >> And now I'm going to do one thing different here. 396 00:17:23,550 --> 00:17:25,390 Even though all of our tags thus far have 397 00:17:25,390 --> 00:17:28,086 had this notion of a start tag and an end tag, 398 00:17:28,086 --> 00:17:30,210 that doesn't really make sense for an image, right? 399 00:17:30,210 --> 00:17:32,430 An image is either there or not there. 400 00:17:32,430 --> 00:17:36,650 And so the humans have come up with a simpler convention. 401 00:17:36,650 --> 00:17:40,310 When you have a tag that can both start and end at the same time-- 402 00:17:40,310 --> 00:17:43,790 it can be empty, so to speak-- just put the forward slash inside of the tag 403 00:17:43,790 --> 00:17:44,710 at the very end. 404 00:17:44,710 --> 00:17:45,776 405 00:17:45,776 --> 00:17:47,150 Now let me go back to my browser. 406 00:17:47,150 --> 00:17:50,377 Hit Reload. Damn, something's wrong. 407 00:17:50,377 --> 00:17:52,460 You've probably seen this occasionally on the web, 408 00:17:52,460 --> 00:17:53,600 even if it's not been your fault. 409 00:17:53,600 --> 00:17:54,766 It's the web server's fault. 410 00:17:54,766 --> 00:17:56,240 What does this seem to indicate? 411 00:17:56,240 --> 00:17:57,450 412 00:17:57,450 --> 00:17:58,009 It's broken. 413 00:17:58,009 --> 00:17:59,300 That's where the image belongs. 414 00:17:59,300 --> 00:17:59,700 Yeah? 415 00:17:59,700 --> 00:18:01,560 >> AUDIENCE: But it doesn't have access to the image. 416 00:18:01,560 --> 00:18:03,070 >> DAVID J MALAN: It doesn't have access to the image. 417 00:18:03,070 --> 00:18:05,230 That, or even worse, maybe it doesn't even exist. 418 00:18:05,230 --> 00:18:06,729 Let's see if we can't diagnose that.
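The img element introduced a moment ago, with its src attribute and self-closing slash, can be illustrated with a short Python sketch. It only shows why a bare file name is treated as text while an img tag names a file to fetch; `ImageFinder` and `image_sources` are hypothetical names, and nothing here actually downloads an image.

```python
from html.parser import HTMLParser


class ImageFinder(HTMLParser):
    """Collect the src attribute of every img element in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img src="cat.jpg"/>.
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def image_sources(markup):
    finder = ImageFinder()
    finder.feed(markup)
    finder.close()
    return finder.sources


if __name__ == "__main__":
    # Just typing the name: the parser sees only text, no image.
    print(image_sources("<body>cat.jpg</body>"))
    # Using the tag: the parser finds a file the browser should request.
    print(image_sources('<body><img src="cat.jpg"/></body>'))
```

Each value found this way corresponds to a separate request the browser makes after loading the page itself, which is why the image can fail even when the page succeeds.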
419 00:18:06,729 --> 00:18:09,390 Recall from last time that if in Chrome, in the appliance, 420 00:18:09,390 --> 00:18:11,870 or even on your Mac or PC, you go to the Developer menu 421 00:18:11,870 --> 00:18:14,650 and go to the Developer Tools option, which probably you've 422 00:18:14,650 --> 00:18:16,850 not used much or ever. 423 00:18:16,850 --> 00:18:20,780 And if I go to Network and reload the page, 424 00:18:20,780 --> 00:18:24,110 let's actually look at the HTTP requests that are being made. 425 00:18:24,110 --> 00:18:28,400 >> It looks like hello.html is indeed OK, hence the 200. 426 00:18:28,400 --> 00:18:30,630 But cat.jpg is a 403. 427 00:18:30,630 --> 00:18:31,650 So it's not a 404. 428 00:18:31,650 --> 00:18:33,490 File probably exists. 429 00:18:33,490 --> 00:18:35,250 403 means forbidden. 430 00:18:35,250 --> 00:18:37,790 So this is a little confusing. 431 00:18:37,790 --> 00:18:42,340 I'm going to go back to my terminal window. 432 00:18:42,340 --> 00:18:43,700 Let me zoom in up here. 433 00:18:43,700 --> 00:18:44,750 And let me do an ls. 434 00:18:44,750 --> 00:18:46,430 There's those same files. 435 00:18:46,430 --> 00:18:49,410 >> Now let me do an ls -l, which you've probably 436 00:18:49,410 --> 00:18:53,350 used before to look at file sizes maybe or timestamps. 437 00:18:53,350 --> 00:18:55,590 And we see a whole bunch of overwhelming information. 438 00:18:55,590 --> 00:18:57,040 But notice a few details. 439 00:18:57,040 --> 00:19:01,660 Here's hello.html in this row here and here's cat.jpg. 440 00:19:01,660 --> 00:19:02,934 441 00:19:02,934 --> 00:19:05,850 And it's just the appliance being user friendly by highlighting JPEGs 442 00:19:05,850 --> 00:19:07,380 in purple like this. 443 00:19:07,380 --> 00:19:11,470 But what else is different beside the file size and the file name? 444 00:19:11,470 --> 00:19:13,438 445 00:19:13,438 --> 00:19:14,754 >> AUDIENCE: [INAUDIBLE].
446 00:19:14,754 --> 00:19:16,920 DAVID J MALAN: Yeah, there's two more R's over here. 447 00:19:16,920 --> 00:19:20,170 Notice what hello.html has going on. 448 00:19:20,170 --> 00:19:24,050 So it turns out that the name of this directory public is important. 449 00:19:24,050 --> 00:19:26,400 Anything in this directory is meant to be public. 450 00:19:26,400 --> 00:19:28,790 But it's not sufficient just to drop files in there. 451 00:19:28,790 --> 00:19:31,480 You also need to change the mode of the files, 452 00:19:31,480 --> 00:19:35,180 change the permissions of the file to proactively not 453 00:19:35,180 --> 00:19:37,650 be the default setting, which is that only I can read 454 00:19:37,650 --> 00:19:39,220 and write it, I being the owner. 455 00:19:39,220 --> 00:19:43,540 I want the whole world, everybody, to be able to read my file, so to speak. 456 00:19:43,540 --> 00:19:44,950 Read just means view it. 457 00:19:44,950 --> 00:19:49,780 >> And indeed, as you'll see in problem set seven, that's what these R's mean. 458 00:19:49,780 --> 00:19:53,160 These two R's mean let everyone else in the world also read it, 459 00:19:53,160 --> 00:19:55,300 especially now that it's in this directory. 460 00:19:55,300 --> 00:19:59,620 So the simplest way to fix this is to go to my prompt and do chmod for change 461 00:19:59,620 --> 00:20:05,580 mode and then do a+r, altogether, everyone, all, plus r for read, 462 00:20:05,580 --> 00:20:07,944 and then cat.jpg, Enter. 463 00:20:07,944 --> 00:20:10,360 Nothing seems to happen, which usually means a good thing. 464 00:20:10,360 --> 00:20:13,850 So ls -l again-- now let's look at cat.jpg. 465 00:20:13,850 --> 00:20:15,750 And these permissions seem to have changed. 466 00:20:15,750 --> 00:20:18,670 As an aside, if you make a mistake and you, for instance, 467 00:20:18,670 --> 00:20:23,210 just made your-- I don't know-- essay publicly accessible by accident, 468 00:20:23,210 --> 00:20:25,480 you can do the opposite, chmod a-r.
469 00:20:25,480 --> 00:20:25,909 470 00:20:25,909 --> 00:20:28,200 Though frankly, it shouldn't be in the public directory 471 00:20:28,200 --> 00:20:29,760 anyway if that's the concern. 472 00:20:29,760 --> 00:20:32,475 >> So now let's go back to my browser and reload. 473 00:20:32,475 --> 00:20:32,904 474 00:20:32,904 --> 00:20:34,820 And I'm going to click the little Ghostbusters 475 00:20:34,820 --> 00:20:38,030 symbol to clear that part of the screen so we can see new requests. 476 00:20:38,030 --> 00:20:40,630 And indeed, here is Grumpy Cat from before. 477 00:20:40,630 --> 00:20:43,010 But more importantly, technically, there is 478 00:20:43,010 --> 00:20:45,565 the number 200, which means we got it OK. 479 00:20:45,565 --> 00:20:47,190 All right, so that's all fine and good. 480 00:20:47,190 --> 00:20:48,940 But we're not making the best of websites, 481 00:20:48,940 --> 00:20:51,967 nor are we going to try too hard to make the fanciest of websites today. 482 00:20:51,967 --> 00:20:54,550 But let's at least do something super familiar before rattling 483 00:20:54,550 --> 00:20:56,030 off a few other tags. 484 00:20:56,030 --> 00:20:58,470 So suppose I don't just want a cat here. 485 00:20:58,470 --> 00:21:02,530 Suppose I actually want this cat to link to something. 486 00:21:02,530 --> 00:21:07,210 >> I might, for instance, do something like this. 487 00:21:07,210 --> 00:21:08,580 488 00:21:08,580 --> 00:21:12,890 a for anchor, href for hyper reference, equals-- 489 00:21:12,890 --> 00:21:17,440 and let's just do something like www.google.com close 490 00:21:17,440 --> 00:21:19,540 quote close bracket. 491 00:21:19,540 --> 00:21:22,000 And now search for cats. 492 00:21:22,000 --> 00:21:23,520 Close anchor tag. 493 00:21:23,520 --> 00:21:26,760 So this has only one sort of fundamentally new detail. 494 00:21:26,760 --> 00:21:28,190 The tag of course, is different. 495 00:21:28,190 --> 00:21:31,770 It's the name a for anchor, href for hyper reference.
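Written out, the link dictated above might look roughly like this. It is only a sketch: the http:// scheme and the page title are assumptions, since the lecture reads the tag aloud rather than showing it in the transcript.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- placeholder title, not from the lecture -->
        <title>hello</title>
    </head>
    <body>
        <!-- a is the anchor tag; href names where the link goes;
             the text between the start and end tags is what gets displayed -->
        <a href="http://www.google.com/">search for cats</a>
    </body>
</html>
```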
496 00:21:31,770 --> 00:21:35,269 >> But more importantly, there's this syntactical feature here. 497 00:21:35,269 --> 00:21:37,810 This is what we'll start calling not a tag, but an attribute. 498 00:21:37,810 --> 00:21:40,830 And an attribute is something that modifies the behavior of a tag. 499 00:21:40,830 --> 00:21:45,400 And this attribute, href, means modify the behavior of this anchor 500 00:21:45,400 --> 00:21:48,430 so that when it's clicked, it goes to this URL here. 501 00:21:48,430 --> 00:21:50,330 And of course, that URL is Google. 502 00:21:50,330 --> 00:21:53,951 >> Meanwhile, what is this text here going to be? 503 00:21:53,951 --> 00:21:55,950 Well, that's going to be what the human actually 504 00:21:55,950 --> 00:21:58,470 sees as the underlined link, as simple as that. 505 00:21:58,470 --> 00:21:59,220 So let's try this. 506 00:21:59,220 --> 00:21:59,980 Let me save it. 507 00:21:59,980 --> 00:22:01,650 I'm still in hello.html. 508 00:22:01,650 --> 00:22:05,360 But in the versions online, you'll see the actual file names we pre-prepared. 509 00:22:05,360 --> 00:22:06,805 Let me go ahead and reload. 510 00:22:06,805 --> 00:22:08,680 And now it's a very underwhelming page still. 511 00:22:08,680 --> 00:22:10,910 But if I hover over there-- and it's a little small, 512 00:22:10,910 --> 00:22:13,576 but-- you can see in the bottom left-hand corner of your screen, 513 00:22:13,576 --> 00:22:15,242 it's indeed going to google.com. 514 00:22:15,242 --> 00:22:19,280 And if I click that, it will whisk me away to the actual Google. 515 00:22:19,280 --> 00:22:22,610 >> But notice here an opportunity for exploitation, just as an aside. 516 00:22:22,610 --> 00:22:25,150 And we'll come back to other issues of security before long. 517 00:22:25,150 --> 00:22:29,290 Because there's this dichotomy between where you go and what you say, 518 00:22:29,290 --> 00:22:34,722 you could do something like this-- http://www.google.com.
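The dichotomy between where a link goes and what it says can be sketched as follows. The destination www.example.com is a stand-in chosen for this illustration, not an address from the lecture; the point is only that the visible text and the href are independent of each other.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- placeholder title, not from the lecture -->
        <title>hello</title>
    </head>
    <body>
        <!-- the visible text claims one address,
             but clicking follows the href, which points somewhere else -->
        <a href="http://www.example.com/">http://www.google.com</a>
    </body>
</html>
```

Hovering over the link is the only way to see the real destination before clicking, since the browser shows the href value, not the link text, in its status area.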
519 00:22:34,722 --> 00:22:37,134 OK, and now if I reload after saving that page, 520 00:22:37,134 --> 00:22:38,800 it looks like I'm going to go to Google. 521 00:22:38,800 --> 00:22:40,966 But there's no reason I have to go to Google, right? 522 00:22:40,966 --> 00:22:47,460 I could actually go to something like badguy.com, reload the page over here. 523 00:22:47,460 --> 00:22:49,750 And notice, it still looks like Google. 524 00:22:49,750 --> 00:22:52,020 And only if I'm sharp enough to hover over here 525 00:22:52,020 --> 00:22:54,770 do I see it's even going to go to a different location. 526 00:22:54,770 --> 00:22:57,400 >> So if you've ever gotten an email, especially 527 00:22:57,400 --> 00:22:59,610 one from Paypal, or seemingly from Paypal 528 00:22:59,610 --> 00:23:01,830 asking you to log in to your account, this 529 00:23:01,830 --> 00:23:06,380 is why you should never ever click links in emails, 530 00:23:06,380 --> 00:23:07,930 frankly, any links in emails. 531 00:23:07,930 --> 00:23:10,380 If you know you have actual money in Paypal or Bank 532 00:23:10,380 --> 00:23:14,250 of America or Fidelity or any website, manually type it in. 533 00:23:14,250 --> 00:23:17,530 Because look how easy it is to trick someone into presenting what 534 00:23:17,530 --> 00:23:18,526 looks like a link. 535 00:23:18,526 --> 00:23:20,400 But it actually could go absolutely anywhere. 536 00:23:20,400 --> 00:23:23,301 >> And there's far greater threats than this. 537 00:23:23,301 --> 00:23:25,300 In fact, this is a bit of a tangent now, but one 538 00:23:25,300 --> 00:23:28,430 of the best ones I ever saw which has since been closed, 539 00:23:28,430 --> 00:23:34,060 is someone led people to-- so this might say, 540 00:23:34,060 --> 00:23:37,660 click here to log into your account, a bank account. 541 00:23:37,660 --> 00:23:40,985 And this was Bank of the West. 542 00:23:40,985 --> 00:23:43,030 543 00:23:43,030 --> 00:23:44,250 >> So someone bought this. 
544 00:23:44,250 --> 00:23:47,090 And it's a little easier to see it in a monospaced font zoomed 545 00:23:47,090 --> 00:23:49,190 in on a 30-foot projector. 546 00:23:49,190 --> 00:23:51,720 But when it's small font in an email that you're receiving, 547 00:23:51,720 --> 00:23:54,690 this looks like bankofthewest.com, not bankofthevvest.com, 548 00:23:54,690 --> 00:23:58,230 which someone had paid $10 to buy. 549 00:23:58,230 --> 00:24:00,840 And then this led them to the equivalent of some bad website. 550 00:24:00,840 --> 00:24:05,540 >> And you'll see too-- actually we can do this-- if I go to the actual website, 551 00:24:05,540 --> 00:24:10,335 bankofthewest.com, again, recall from last time 552 00:24:10,335 --> 00:24:13,210 that if this is their web page and you're curious as to how it works, 553 00:24:13,210 --> 00:24:15,610 you can certainly go to Chrome's developer tools. 554 00:24:15,610 --> 00:24:18,890 And you can see all of the HTML nicely formatted there. 555 00:24:18,890 --> 00:24:20,890 >> But more to the point, you can-- let's close 556 00:24:20,890 --> 00:24:24,760 this-- you can go to View, Developer, View Source. 557 00:24:24,760 --> 00:24:25,770 558 00:24:25,770 --> 00:24:28,350 Why don't I just copy all of that? And then I 559 00:24:28,350 --> 00:24:31,630 can go into my little gedit window here and make my own web page. 560 00:24:31,630 --> 00:24:33,210 Save this in hello.html. 561 00:24:33,210 --> 00:24:36,770 And probably this is going to break, because it's not this easy usually. 562 00:24:36,770 --> 00:24:41,590 But now if I reload my own page on my own CS50 Appliance and hit reload, 563 00:24:41,590 --> 00:24:42,990 OK, some stuff broke. 564 00:24:42,990 --> 00:24:45,750 But I'm pretty close to having my own banking website, right?
565 00:24:45,750 --> 00:24:46,570 All of this HTML-- 566 00:24:46,570 --> 00:24:47,370 >> [LAUGHTER] 567 00:24:47,370 --> 00:24:49,210 >> DAVID J MALAN: --I didn't actually-- and you 568 00:24:49,210 --> 00:24:52,210 know there's someone out there who would actually click these links too. 569 00:24:52,210 --> 00:24:54,864 So clearly, some stuff broke. 570 00:24:54,864 --> 00:24:56,780 But that's going to lead us into a discussion, 571 00:24:56,780 --> 00:25:00,810 not necessarily right now, as to what CSS, cascading style sheets, are, 572 00:25:00,810 --> 00:25:03,410 and how you actually download the other HTML files 573 00:25:03,410 --> 00:25:06,140 and JPEG files and GIF files that the website might be using. 574 00:25:06,140 --> 00:25:07,960 But all of that is accomplishable. 575 00:25:07,960 --> 00:25:11,110 But it really boils down to these very simple heuristics. 576 00:25:11,110 --> 00:25:14,450 >> So now let's just skim through a couple of other examples of HTML 577 00:25:14,450 --> 00:25:16,680 just to give you a sense of what else you can do. 578 00:25:16,680 --> 00:25:18,670 For instance, this is list.html. 579 00:25:18,670 --> 00:25:23,240 Suppose I wanted to make a web page with a list of houses in the quad. 580 00:25:23,240 --> 00:25:28,960 I might use the ul tag for unordered list and then the list item child 581 00:25:28,960 --> 00:25:33,760 and then iterate over-- or list, rather-- the houses in question. 582 00:25:33,760 --> 00:25:36,080 >> And if I open this up, let's do this. 583 00:25:36,080 --> 00:25:40,670 Let's go not to hello.html, but to list.html. 584 00:25:40,670 --> 00:25:42,160 Damn it. 585 00:25:42,160 --> 00:25:43,000 How do I fix this? 586 00:25:43,000 --> 00:25:45,679 587 00:25:45,679 --> 00:25:47,220 It's the same issue as before, right? 588 00:25:47,220 --> 00:25:52,510 So let me do chmod-- oops-- chmod a+r of list.html.
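A file like list.html might look roughly like the sketch below. The transcript does not spell out the list items, so the three house names and the title are assumptions filled in for illustration.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- placeholder title, not from the lecture -->
        <title>list</title>
    </head>
    <body>
        <!-- ul is an unordered (bulleted) list; each li child is one list item -->
        <ul>
            <li>Cabot</li>
            <li>Currier</li>
            <li>Pforzheimer</li>
        </ul>
    </body>
</html>
```

Swapping both ul tags for ol would turn the bullets into numbers supplied by the browser, with no change to the li items.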
589 00:25:52,510 --> 00:25:54,610 590 00:25:54,610 --> 00:25:59,610 And now if I go back to my browser and click Reload, there it is. 591 00:25:59,610 --> 00:26:02,360 So if you've ever wanted to make a bulleted list, you can do that. 592 00:26:02,360 --> 00:26:06,210 If you want to be super fancy and make an ordered list, not an unordered list, 593 00:26:06,210 --> 00:26:10,170 change those to ol, reload the page, and now the browser will number it for you. 594 00:26:10,170 --> 00:26:11,241 >> What else can we do? 595 00:26:11,241 --> 00:26:13,990 Well, a couple of others-- if you've got long paragraphs of text-- 596 00:26:13,990 --> 00:26:15,698 for instance, some Latin text like this-- 597 00:26:15,698 --> 00:26:20,730 and you want it in separate paragraphs, open p, close p for the paragraph tag. 598 00:26:20,730 --> 00:26:22,010 And do it again and again. 599 00:26:22,010 --> 00:26:26,600 And if I now open up this file, paragraphs.html, well, this 600 00:26:26,600 --> 00:26:27,570 is getting annoying. 601 00:26:27,570 --> 00:26:34,320 So now let's just go back to my prompt, chmod a+r *.html-- 602 00:26:34,320 --> 00:26:36,099 a nice little wildcard, so to speak. 603 00:26:36,099 --> 00:26:37,890 It should fix all of these problems for me. 604 00:26:37,890 --> 00:26:38,990 Let's reload. 605 00:26:38,990 --> 00:26:40,500 There's three paragraphs. 606 00:26:40,500 --> 00:26:42,930 >> And now let's go ahead and open up one other. 607 00:26:42,930 --> 00:26:44,310 How about table? 608 00:26:44,310 --> 00:26:46,440 You'll notice table looks a little more complex. 609 00:26:46,440 --> 00:26:49,110 But it's the same idea-- open tag, open tag, 610 00:26:49,110 --> 00:26:51,360 open, open, open, close tag, open tag.
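The file on screen at this point might look roughly like the following sketch. Only the tag names and the border attribute come from the lecture; the cell contents and the title are made up for illustration.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- placeholder title, not from the lecture -->
        <title>table</title>
    </head>
    <body>
        <!-- table wraps everything; each tr is one row; each td is one cell -->
        <table border="1">
            <tr>
                <td>1</td>
                <td>2</td>
                <td>3</td>
            </tr>
            <tr>
                <td>4</td>
                <td>5</td>
                <td>6</td>
            </tr>
        </table>
    </body>
</html>
```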
611 00:26:51,360 --> 00:26:54,410 And these happen to stand for table, whose border is apparently 612 00:26:54,410 --> 00:26:58,500 going to be a thickness 1-- whatever that means-- table row, table 613 00:26:58,500 --> 00:27:00,320 data, which means a cell. 614 00:27:00,320 --> 00:27:03,840 And if I go back to my browser here and go to table.html, 615 00:27:03,840 --> 00:27:05,840 you can see something like this, hideous. 616 00:27:05,840 --> 00:27:07,840 But we'll get to the point where we can actually 617 00:27:07,840 --> 00:27:09,260 make things prettier than that. 618 00:27:09,260 --> 00:27:10,530 >> So let me stipulate for now. 619 00:27:10,530 --> 00:27:11,870 There's bunches of more tags. 620 00:27:11,870 --> 00:27:15,225 And HTML is wonderful to pick up because, frankly, all you need to do 621 00:27:15,225 --> 00:27:17,600 is look at existing web pages with which you're familiar. 622 00:27:17,600 --> 00:27:20,340 And you're like, oh, that's how they did this aesthetically. 623 00:27:20,340 --> 00:27:23,159 >> Or you can look up any online resource as to how HTML works, 624 00:27:23,159 --> 00:27:25,700 and you'll see that there's a whole vocabulary of other tags. 625 00:27:25,700 --> 00:27:30,110 But with the simple mental model alone that almost any tag you open 626 00:27:30,110 --> 00:27:33,620 has to be closed, it really does suffice to teach oneself 627 00:27:33,620 --> 00:27:36,950 HTML after understanding these basic ideas of tags 628 00:27:36,950 --> 00:27:40,520 and attributes and the well-formedness that we've talked about, 629 00:27:40,520 --> 00:27:44,697 closing anything that we might open so that we don't confuse a browser. 630 00:27:44,697 --> 00:27:46,780 So let's now take this to a more interesting level 631 00:27:46,780 --> 00:27:48,100 by going to the actual. 632 00:27:48,100 --> 00:27:51,095 And let's go to my Mac here, to google.com. 633 00:27:51,095 --> 00:27:52,280 634 00:27:52,280 --> 00:27:54,020 And now notice-- let's do this.
635 00:27:54,020 --> 00:27:57,280 I'm going to go to Settings, Search Settings. 636 00:27:57,280 --> 00:28:01,070 I want to turn off this annoying instant results thing where it immediately 637 00:28:01,070 --> 00:28:02,450 starts responding to your typing. 638 00:28:02,450 --> 00:28:05,300 Let's do this older school so we actually see what's going on. 639 00:28:05,300 --> 00:28:08,260 >> So I'm going to save my Google settings here. 640 00:28:08,260 --> 00:28:11,160 And now notice-- I'm going to search for something like cats. 641 00:28:11,160 --> 00:28:14,500 And it's still doing autocomplete here, but based on things 642 00:28:14,500 --> 00:28:15,970 people have typed in the past. 643 00:28:15,970 --> 00:28:17,490 But notice what's going to happen. 644 00:28:17,490 --> 00:28:20,272 >> In the URL at the moment is this, just google.com. 645 00:28:20,272 --> 00:28:22,650 And technically, it's slash. 646 00:28:22,650 --> 00:28:25,910 Google's just saving a character and not showing us that. 647 00:28:25,910 --> 00:28:30,400 They are showing us https, just to be super reassuring that we're 648 00:28:30,400 --> 00:28:32,850 at a secure or encrypted page. 649 00:28:32,850 --> 00:28:35,690 >> So let me go ahead and search for cats. 650 00:28:35,690 --> 00:28:37,670 Now this got really overwhelming quickly. 651 00:28:37,670 --> 00:28:39,470 Look at the length of this URL. 652 00:28:39,470 --> 00:28:43,070 But it turns out that most of this stuff in the URL is actually pretty useless. 653 00:28:43,070 --> 00:28:45,320 I'm going to start deleting things I don't understand. 654 00:28:45,320 --> 00:28:46,560 655 00:28:46,560 --> 00:28:47,360 I see cats. 656 00:28:47,360 --> 00:28:48,470 I understand cats. 657 00:28:48,470 --> 00:28:50,380 I don't know why cats are there again. 658 00:28:50,380 --> 00:28:52,620 I really don't know what this nonsense is.
659 00:28:52,620 --> 00:28:56,030 So I'm just going to keep highlighting and deleting stuff 660 00:28:56,030 --> 00:28:59,905 that I don't understand, distilling the URL into just this. 661 00:28:59,905 --> 00:29:00,920 662 00:29:00,920 --> 00:29:02,270 >> Now let me hit Enter again. 663 00:29:02,270 --> 00:29:03,814 It looks like Google still works. 664 00:29:03,814 --> 00:29:06,980 So for some reason, they're adding a lot of stuff to their URLs by default. 665 00:29:06,980 --> 00:29:09,000 But it's not strictly required. 666 00:29:09,000 --> 00:29:10,340 So what is nice about this? 667 00:29:10,340 --> 00:29:13,630 Well, let me go ahead and open up Chrome's Inspector. 668 00:29:13,630 --> 00:29:15,960 There's a little mouse shortcut for it. 669 00:29:15,960 --> 00:29:17,360 >> Go to the Network tab. 670 00:29:17,360 --> 00:29:19,340 And now let me reload this page once more. 671 00:29:19,340 --> 00:29:20,280 And I'm holding Shift. 672 00:29:20,280 --> 00:29:22,520 As an aside, browsers tend to cache or save 673 00:29:22,520 --> 00:29:24,697 information just for efficiency's sake. 674 00:29:24,697 --> 00:29:27,280 But usually, holding Shift and reloading will force everything 675 00:29:27,280 --> 00:29:28,994 to start over from the beginning. 676 00:29:28,994 --> 00:29:30,410 And that's what I want to do here. 677 00:29:30,410 --> 00:29:33,550 >> And notice all of these rows that just appeared. 678 00:29:33,550 --> 00:29:37,920 It turns out that in any given web page, there might be just one file 679 00:29:37,920 --> 00:29:43,500 involved-- hello.html-- or there might be 52, as in this case. 680 00:29:43,500 --> 00:29:45,820 When I visit google.com, apparently, my browser 681 00:29:45,820 --> 00:29:49,650 kicks off 52 separate HTTP requests. 682 00:29:49,650 --> 00:29:50,520 Why is that? 683 00:29:50,520 --> 00:29:53,380 >> Well, look at what's inside of this web page up top.
684 00:29:53,380 --> 00:29:55,620 There's not only text, but there's actual images 685 00:29:55,620 --> 00:29:57,130 of cats over to the right. 686 00:29:57,130 --> 00:29:59,110 There's a colorful logo up here at left. 687 00:29:59,110 --> 00:30:01,750 There's all of these icons for a microphone and so forth. 688 00:30:01,750 --> 00:30:05,130 There's a lot of pieces, building blocks, scratch pieces, if you will, 689 00:30:05,130 --> 00:30:06,250 to this web page. 690 00:30:06,250 --> 00:30:10,310 And what the browser is doing upon getting the very first file, which 691 00:30:10,310 --> 00:30:16,180 is this row here, it is essentially iterating over the HTML top 692 00:30:16,180 --> 00:30:19,880 to bottom, left to right, looking for things like image tags or other tags 693 00:30:19,880 --> 00:30:23,160 that are mentioning other files and when it sees them, goes and fetches them 694 00:30:23,160 --> 00:30:26,050 via HTTP, via that whole envelope metaphor, 695 00:30:26,050 --> 00:30:29,670 and then displays them in the appropriate location in the web page. 696 00:30:29,670 --> 00:30:33,370 >> But notice here if I focus on the first row, search cats, 697 00:30:33,370 --> 00:30:37,090 notice that, indeed it's using HTTP 1.1. 698 00:30:37,090 --> 00:30:41,690 And unfortunately, Google Chrome right now in version 39 699 00:30:41,690 --> 00:30:45,110 is kind of dumbing things down and not showing us the actual headers. 700 00:30:45,110 --> 00:30:49,680 But what was indeed sent is a request for not slash, but /search?q=cats. 701 00:30:49,680 --> 00:30:52,830 702 00:30:52,830 --> 00:30:54,340 >> Now, why is that important? 703 00:30:54,340 --> 00:30:57,110 Well, I'm going to infer from this that if Google 704 00:30:57,110 --> 00:31:01,520 supports queries of this form, why don't I implement my own search 705 00:31:01,520 --> 00:31:06,420 engine for CS50, but just the front end, just the graphical user interface.
706 00:31:06,420 --> 00:31:09,610 And we'll outsource the back end, the actual search results to Google. 707 00:31:09,610 --> 00:31:10,510 >> So how can I do this? 708 00:31:10,510 --> 00:31:13,820 Well, let me go into gedit over here. 709 00:31:13,820 --> 00:31:19,180 And let me go ahead and open up, let's say, a new file. 710 00:31:19,180 --> 00:31:22,280 And I'm going to save this temporarily as search-0.html. 711 00:31:22,280 --> 00:31:25,111 712 00:31:25,111 --> 00:31:27,860 And then eventually, we'll fast forward to the one I pre-prepared. 713 00:31:27,860 --> 00:31:30,190 >> And I'm going to quickly whip up doc type 714 00:31:30,190 --> 00:31:33,840 html open bracket html close bracket html. 715 00:31:33,840 --> 00:31:38,390 Then I'm going to do head close head open title CS50 716 00:31:38,390 --> 00:31:40,150 Search instead of Google search. 717 00:31:40,150 --> 00:31:43,480 Down here I'm going to have the body, down here close body. 718 00:31:43,480 --> 00:31:45,835 And now I need CS50 Search. 719 00:31:45,835 --> 00:31:47,710 And actually, let's build this incrementally. 720 00:31:47,710 --> 00:31:51,043 I'm going to go ahead and close this and actually put it in my public directory. 721 00:31:51,043 --> 00:31:52,730 So give me just one moment. 722 00:31:52,730 --> 00:31:55,390 search-0.html-- I'm going to temporarily call it search.html. 723 00:31:55,390 --> 00:31:56,600 724 00:31:56,600 --> 00:31:59,750 I'm going to chmod it a+r search.html. 725 00:31:59,750 --> 00:32:01,072 726 00:32:01,072 --> 00:32:02,280 And now I'm going to open it. 727 00:32:02,280 --> 00:32:03,224 728 00:32:03,224 --> 00:32:04,390 All right, so that was fast. 729 00:32:04,390 --> 00:32:06,800 But the goal simply was to get us to the point 730 00:32:06,800 --> 00:32:09,630 of having this text file called search.html. 731 00:32:09,630 --> 00:32:10,940 732 00:32:10,940 --> 00:32:12,790 So not much to look at yet.
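The skeleton dictated above might look roughly like this at this stage. It is a reconstruction from the spoken description, so the indentation and exact layout are assumptions.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- shown in the browser's tab or title bar -->
        <title>CS50 Search</title>
    </head>
    <body>
        <!-- plain text for now; no heading or form yet -->
        CS50 Search
    </body>
</html>
```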
733 00:32:12,790 --> 00:32:16,970 Indeed, if I go to my browser, and go to search.html, that's all it is. 734 00:32:16,970 --> 00:32:17,720 But you know what? 735 00:32:17,720 --> 00:32:19,000 I can be a little fancier. 736 00:32:19,000 --> 00:32:22,710 I read in a book that there's a heading tag called h1. 737 00:32:22,710 --> 00:32:26,100 And I'm going to go ahead and use that open h1 and close h1. 738 00:32:26,100 --> 00:32:27,220 Reload the page. 739 00:32:27,220 --> 00:32:29,600 And now it's bigger and bolder, not all that interesting, 740 00:32:29,600 --> 00:32:32,399 but at least it's structurally more interesting. 741 00:32:32,399 --> 00:32:33,940 But now let me introduce another tag. 742 00:32:33,940 --> 00:32:36,500 It turns out there's a form tag. 743 00:32:36,500 --> 00:32:38,400 And let me close that tag. 744 00:32:38,400 --> 00:32:40,830 And it turns out there's an input tag that 745 00:32:40,830 --> 00:32:44,600 has an attribute called type, which is the data type of the field, 746 00:32:44,600 --> 00:32:45,200 if you will. 747 00:32:45,200 --> 00:32:47,050 And it is going to be of type text. 748 00:32:47,050 --> 00:32:52,200 And its value is going to be CS50 Search. 749 00:32:52,200 --> 00:32:53,850 Close tag. 750 00:32:53,850 --> 00:32:57,100 And there's going to be no notion of opening and closing with separate tags. 751 00:32:57,100 --> 00:33:00,300 >> Let me go back over here and see what's going on, reload. 752 00:33:00,300 --> 00:33:01,380 Getting interesting. 753 00:33:01,380 --> 00:33:02,950 It looks like it's a text field. 754 00:33:02,950 --> 00:33:04,080 755 00:33:04,080 --> 00:33:06,999 And actually, I didn't want to put a value there yet. 756 00:33:06,999 --> 00:33:10,040 Let me go back here and actually get rid of this value to keep it simple. 757 00:33:10,040 --> 00:33:12,939 Instead of a value, what I wanted to give this thing was a name. 758 00:33:12,939 --> 00:33:15,230 And I don't know what it is, so I'll come back to that.
759 00:33:15,230 --> 00:33:18,270 >> But below that, I want to do input type=submit. 760 00:33:18,270 --> 00:33:19,840 761 00:33:19,840 --> 00:33:22,120 And this value will be CS50 Search. 762 00:33:22,120 --> 00:33:24,850 And we'll see why I moved the value to this. 763 00:33:24,850 --> 00:33:28,900 When I reload, I seem to now have the beginnings of my own search 764 00:33:28,900 --> 00:33:30,820 engine, super hideous, though frankly, it's 765 00:33:30,820 --> 00:33:34,260 not a far throw from what Google's default page looks like. 766 00:33:34,260 --> 00:33:37,950 >> If I go here now, I can type in cats and hopefully click Search. 767 00:33:37,950 --> 00:33:40,380 But I'm not quite done yet, because I haven't implemented, 768 00:33:40,380 --> 00:33:41,045 obviously, a database. 769 00:33:41,045 --> 00:33:42,940 I haven't crawled the web for search results. 770 00:33:42,940 --> 00:33:44,840 So I need to outsource that to Google. 771 00:33:44,840 --> 00:33:46,290 So how do I do this? 772 00:33:46,290 --> 00:33:49,170 >> Well, first of all I need to add an action 773 00:33:49,170 --> 00:33:58,460 attribute to my form tag that is http://www.google.com/search. 774 00:33:58,460 --> 00:34:01,180 And I know that only from having inferred by looking closely 775 00:34:01,180 --> 00:34:02,505 at their URLs. 776 00:34:02,505 --> 00:34:03,380 And now take a guess. 777 00:34:03,380 --> 00:34:09,090 What should this text field probably be called, based on where we came 778 00:34:09,090 --> 00:34:09,754 from before? 779 00:34:09,754 --> 00:34:11,896 780 00:34:11,896 --> 00:34:13,290 >> AUDIENCE: ?q. 781 00:34:13,290 --> 00:34:14,370 >> DAVID J MALAN: ?q. 782 00:34:14,370 --> 00:34:17,800 And we don't actually need question mark it turns out, but q is indeed it, 783 00:34:17,800 --> 00:34:20,489 q for query probably by default, just because that's 784 00:34:20,489 --> 00:34:23,060 what Larry and Sergey came up with years ago.
785 00:34:23,060 --> 00:34:24,739 So now let me reload this page. 786 00:34:24,739 --> 00:34:26,409 It doesn't look all that different. 787 00:34:26,409 --> 00:34:28,120 But now watch what happens. 788 00:34:28,120 --> 00:34:32,360 >> If I type in cats and click CS50 Search and let go, 789 00:34:32,360 --> 00:34:35,770 notice I get whisked away to actual Google. 790 00:34:35,770 --> 00:34:38,150 Now, Google is being a little annoying in that they're 791 00:34:38,150 --> 00:34:41,877 appending an additional parameter, if you will, to the URL. 792 00:34:41,877 --> 00:34:43,960 That's all happening automatically on Google's side. 793 00:34:43,960 --> 00:34:48,730 >> The important part is that I seem to have generated this request here. 794 00:34:48,730 --> 00:34:50,179 And indeed, that's what happens. 795 00:34:50,179 --> 00:34:53,040 When you have HTML that looks like this, this 796 00:34:53,040 --> 00:34:57,620 is sort of web developer's notation for saying, go ahead and create a form 797 00:34:57,620 --> 00:34:59,990 that when it's submitted, it's going to go to this URL. 798 00:34:59,990 --> 00:35:03,430 And when the user has provided values for things like q, 799 00:35:03,430 --> 00:35:05,440 don't go just to this URL. 800 00:35:05,440 --> 00:35:08,210 Actually, go to question mark and then q=cats. 801 00:35:08,210 --> 00:35:09,590 802 00:35:09,590 --> 00:35:13,060 Append the parameter, the HTTP parameter like that. 803 00:35:13,060 --> 00:35:15,590 >> And just to be super precise, what's being inferred here-- 804 00:35:15,590 --> 00:35:18,130 but I'll be more explicit-- is that the method I want to use 805 00:35:18,130 --> 00:35:22,270 is get, instead of something like post, which we'll eventually see.
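Putting the pieces described so far together, search.html might look roughly like the sketch below. The attribute values come from the lecture (the action URL, the name q, the method get, the button label); the exact layout and attribute order are assumptions.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>CS50 Search</title>
    </head>
    <body>
        <h1>CS50 Search</h1>
        <!-- action: the URL the browser goes to when the form is submitted;
             method="get": field values are appended to that URL after a question mark -->
        <form action="http://www.google.com/search" method="get">
            <!-- name="q" means the typed text is sent as q=..., e.g. q=cats -->
            <input name="q" type="text"/>
            <!-- for a submit button, value is the label shown on the button -->
            <input type="submit" value="CS50 Search"/>
        </form>
    </body>
</html>
```

Typing cats and submitting should therefore send the browser to the action URL followed by ?q=cats, which is the same shape as the distilled Google URL from earlier.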
806 00:35:22,270 --> 00:35:27,710 So in short, simply by understanding HTML and using some fairly simple tags, 807 00:35:27,710 --> 00:35:30,610 we can now begin to create our own front end user 808 00:35:30,610 --> 00:35:32,850 interface with a search engine behind it. 809 00:35:32,850 --> 00:35:34,800 >> But this of course, is pretty hideous. 810 00:35:34,800 --> 00:35:37,259 So let me actually open up a slightly better version. 811 00:35:37,259 --> 00:35:39,800 This is the one I prepared in advance that has some comments. 812 00:35:39,800 --> 00:35:41,900 But you'll see that I pretty much recreated it. 813 00:35:41,900 --> 00:35:44,150 So this is already available online. 814 00:35:44,150 --> 00:35:48,050 And I did happen to preemptively go to https just to keep it simple. 815 00:35:48,050 --> 00:35:50,610 >> And now let's open up a next iteration of this. 816 00:35:50,610 --> 00:35:52,510 This is version 1 instead of 0. 817 00:35:52,510 --> 00:35:55,315 What jumps out at you as slightly different in this example? 818 00:35:55,315 --> 00:35:59,480 819 00:35:59,480 --> 00:36:00,440 >> AUDIENCE: [INAUDIBLE]. 820 00:36:00,440 --> 00:36:03,020 >> DAVID J MALAN: Yeah, there's this text align center. 821 00:36:03,020 --> 00:36:04,590 This is a little weird up here. 822 00:36:04,590 --> 00:36:06,150 But this is indeed new. 823 00:36:06,150 --> 00:36:07,800 And maybe guess what's going to happen. 824 00:36:07,800 --> 00:36:11,730 If I go to my browser now and visit search-1.html, 825 00:36:11,730 --> 00:36:13,090 it's almost the same thing. 826 00:36:13,090 --> 00:36:15,705 But it's a step closer to being a little more pretty. 827 00:36:15,705 --> 00:36:19,150 It's still ugly, but prettier in that at least everything's now centered. 828 00:36:19,150 --> 00:36:23,470 >> So it turns out that what I'm using is another language altogether called 829 00:36:23,470 --> 00:36:25,680 CSS, cascading style sheets.
830 00:36:25,680 --> 00:36:28,310 And CSS, frankly, is kind of, in my personal opinion, 831 00:36:28,310 --> 00:36:29,775 an atrociously designed language. 832 00:36:29,775 --> 00:36:33,110 It is very annoying to remember all the various details. 833 00:36:33,110 --> 00:36:38,479 But it is what stylizes the entire World Wide Web today. 834 00:36:38,479 --> 00:36:39,270 I offended someone. 835 00:36:39,270 --> 00:36:39,769 All right. 836 00:36:39,769 --> 00:36:43,180 So let's go back here and see how we're actually using this. 837 00:36:43,180 --> 00:36:45,940 And it turns out, at least it's actually a pretty simple language. 838 00:36:45,940 --> 00:36:49,470 It's just key-value pairs, properties and values, properties and values. 839 00:36:49,470 --> 00:36:52,080 Indeed, here is one such property and value. 840 00:36:52,080 --> 00:36:55,890 >> Simply by using the style attribute on my body tag 841 00:36:55,890 --> 00:37:00,360 and giving it a value of a word, a colon, and another word, 842 00:37:00,360 --> 00:37:03,730 or a property and a value, I can affect the aesthetics 843 00:37:03,730 --> 00:37:06,210 of the web page, not necessarily the structure yet, 844 00:37:06,210 --> 00:37:07,550 but the aesthetics of it. 845 00:37:07,550 --> 00:37:10,960 And just by Googling around, I realize that CSS, cascading style sheets, 846 00:37:10,960 --> 00:37:14,170 supports a property called text-align, whose value can 847 00:37:14,170 --> 00:37:16,980 be left, right, or center, for instance. 848 00:37:16,980 --> 00:37:19,990 >> So now when I reload this page, what I did get 849 00:37:19,990 --> 00:37:22,730 was a centered page, but still pretty ugly. 850 00:37:22,730 --> 00:37:25,770 Let's go ahead and open up version 2 of Search. 851 00:37:25,770 --> 00:37:28,570 And now notice I've done a little more. 852 00:37:28,570 --> 00:37:33,760 Notice that up here inside of the head tag, there can be more than title. 853 00:37:33,760 --> 00:37:35,400 In fact, there's a style tag.
854 00:37:35,400 --> 00:37:38,630 And this is where it just gets a little messy seeing CSS sometimes. 855 00:37:38,630 --> 00:37:41,971 >> Notice that I seem to have something that structurally looks very different. 856 00:37:41,971 --> 00:37:44,095 But here is the name of the tag I want to stylize. 857 00:37:44,095 --> 00:37:47,570 Here are our old friends, open curly brace and closed curly brace. 858 00:37:47,570 --> 00:37:50,290 And then here is that property and its value. 859 00:37:50,290 --> 00:37:56,300 >> If I load this file, search-2.html, the end result is identical. 860 00:37:56,300 --> 00:37:59,300 But it's a step toward better design. 861 00:37:59,300 --> 00:38:04,560 By factoring out this CSS, I've not commingled it with my HTML. 862 00:38:04,560 --> 00:38:07,560 And indeed, as we'll see, I could reuse these properties and values. 863 00:38:07,560 --> 00:38:10,420 If I wanted to make bunches of parts of my web page centered, 864 00:38:10,420 --> 00:38:13,630 I don't have to type style="text-align: center" all over the place. 865 00:38:13,630 --> 00:38:16,580 I can put it in one place perhaps, like up at the top. 866 00:38:16,580 --> 00:38:18,210 >> But even this isn't the best design. 867 00:38:18,210 --> 00:38:21,720 In fact, one of the things you'll learn as you spend more and more time with 868 00:38:21,720 --> 00:38:25,730 web programming is that the more you can modularize things and factor things out, the better, 869 00:38:25,730 --> 00:38:30,610 much like .h files let us factor stuff out, and like helpers.c let us factor things out 870 00:38:30,610 --> 00:38:31,880 a few psets ago. 871 00:38:31,880 --> 00:38:34,200 Similarly, we might want to achieve this here. 872 00:38:34,200 --> 00:38:37,920 >> So notice in version three of search.html I've 873 00:38:37,920 --> 00:38:40,610 cleaned up the head of the page and just put 874 00:38:40,610 --> 00:38:43,320 in this, a link tag, which contrary to the name, 875 00:38:43,320 --> 00:38:44,700 does not give you a hyperlink.
876 00:38:44,700 --> 00:38:49,150 It links to another file by way of an href whose value, in this case, 877 00:38:49,150 --> 00:38:51,586 is search-3.css. 878 00:38:51,586 --> 00:38:52,960 So I realize we're going quickly. 879 00:38:52,960 --> 00:38:54,600 But all I'm doing is kind of moving things around. 880 00:38:54,600 --> 00:38:55,760 Let me open search-3.css. 881 00:38:55,760 --> 00:38:57,114 882 00:38:57,114 --> 00:38:58,530 There it is, nothing really to it. 883 00:38:58,530 --> 00:39:02,270 I just copied and pasted it into a new file, much like we factored stuff out 884 00:39:02,270 --> 00:39:03,509 into other files before. 885 00:39:03,509 --> 00:39:05,300 And the result-- completely underwhelming-- 886 00:39:05,300 --> 00:39:06,730 is going to be exactly the same. 887 00:39:06,730 --> 00:39:10,490 But we're moving toward-- no, it's not. 888 00:39:10,490 --> 00:39:11,930 Oh, I know why. 889 00:39:11,930 --> 00:39:13,790 >> So it seems to be a bug. 890 00:39:13,790 --> 00:39:15,010 And it is in some sense. 891 00:39:15,010 --> 00:39:17,730 But let me open up my Network tab. 892 00:39:17,730 --> 00:39:19,660 Let me reload the page. 893 00:39:19,660 --> 00:39:23,315 Ah, why is the CSS not being applied? 894 00:39:23,315 --> 00:39:26,920 Well, the CSS file, similarly, has to be world readable, so to speak. 895 00:39:26,920 --> 00:39:28,440 And it too is currently forbidden. 896 00:39:28,440 --> 00:39:33,760 So let me do a chmod a+r of *.css-- whoops-- 897 00:39:33,760 --> 00:39:37,067 where .css is just the file extension for CSS files. 898 00:39:37,067 --> 00:39:38,900 Now let me go back to my browser and reload. 899 00:39:38,900 --> 00:39:40,910 OK, a little better. 900 00:39:40,910 --> 00:39:42,282 >> Now let me do one last thing. 901 00:39:42,282 --> 00:39:42,990 In search-4.html,
902 00:39:42,990 --> 00:39:44,550 903 00:39:44,550 --> 00:39:48,220 I have a version that I just thought was way cooler, albeit way more 904 00:39:48,220 --> 00:39:48,980 complicated. 905 00:39:48,980 --> 00:39:50,690 Let's look at the result first. 906 00:39:50,690 --> 00:39:52,290 Close this to give us more room. 907 00:39:52,290 --> 00:39:54,275 Change this to search-4, Enter. 908 00:39:54,275 --> 00:39:55,430 909 00:39:55,430 --> 00:39:57,200 >> And now a bunch of things are broken. 910 00:39:57,200 --> 00:39:59,910 I'm going to go back into my directory here. 911 00:39:59,910 --> 00:40:04,190 And now I'm just going to do a chmod of a+r on a file-- 912 00:40:04,190 --> 00:40:07,450 because I know it exists-- called logo.gif, which is an image. 913 00:40:07,450 --> 00:40:08,590 And now reload. 914 00:40:08,590 --> 00:40:11,040 And wow-- so now I'm pretty close, frankly, 915 00:40:11,040 --> 00:40:15,860 to like the 1999 version of Google, and frankly, the 2014 version of Google, 916 00:40:15,860 --> 00:40:16,360 right? 917 00:40:16,360 --> 00:40:21,920 >> So it's now going to their website, ultimately, if I search for cats. 918 00:40:21,920 --> 00:40:23,900 And indeed it is. 919 00:40:23,900 --> 00:40:26,410 But what did I do differently in this version 4? 920 00:40:26,410 --> 00:40:28,020 So we won't dwell too much on it here. 921 00:40:28,020 --> 00:40:30,100 You'll see this in problem set seven eventually. 922 00:40:30,100 --> 00:40:31,350 But notice I did a few things. 923 00:40:31,350 --> 00:40:33,690 >> I introduced a div tag, which is division, 924 00:40:33,690 --> 00:40:35,450 similar in spirit to a paragraph tag. 925 00:40:35,450 --> 00:40:38,220 But a division is just like, here's a rectangular invisible region 926 00:40:38,220 --> 00:40:39,150 of the screen. 927 00:40:39,150 --> 00:40:41,680 Let's give it a unique identifier, a footer, just 928 00:40:41,680 --> 00:40:44,700 so that we can talk about it in our HTML elsewhere. 
929 00:40:44,700 --> 00:40:47,952 Here is another div of the page whose ID is going to be content. 930 00:40:47,952 --> 00:40:49,160 It's the content of the page. 931 00:40:49,160 --> 00:40:51,090 And up here is the header of the page. 932 00:40:51,090 --> 00:40:54,960 >> In other words, in my HTML I'm essentially mentally 933 00:40:54,960 --> 00:40:57,700 viewing this web page as three components, a header 934 00:40:57,700 --> 00:41:01,200 up here with this invisible rectangle, the content in the middle, and then 935 00:41:01,200 --> 00:41:04,800 the footer down below, even though we don't see those things. 936 00:41:04,800 --> 00:41:09,940 Because I want to, in the head of my page here, or in a .css file, 937 00:41:09,940 --> 00:41:11,460 I can use this syntax. 938 00:41:11,460 --> 00:41:13,070 >> Header is not a tag. 939 00:41:13,070 --> 00:41:17,060 It's an ID. So it turns out that by doing #header, 940 00:41:17,060 --> 00:41:20,840 I can now apply one or more properties to the header. 941 00:41:20,840 --> 00:41:24,130 I can do the same for content here. 942 00:41:24,130 --> 00:41:27,230 >> So for instance, in the footer, notice all of these properties I'm adding. 943 00:41:27,230 --> 00:41:30,660 And I know they exist just by reading up on the documentation for CSS. 944 00:41:30,660 --> 00:41:33,450 Font size is going to be smaller-- so some relative font size. 945 00:41:33,450 --> 00:41:34,741 The weight is going to be bold. 946 00:41:34,741 --> 00:41:37,340 Margin-- how many pixels around it-- is 20 pixels. 947 00:41:37,340 --> 00:41:38,590 And it's going to be centered. 948 00:41:38,590 --> 00:41:40,256 >> But right now, the page looks like this. 949 00:41:40,256 --> 00:41:42,840 If I'm not pleased with my copyright there, 950 00:41:42,840 --> 00:41:46,560 I could do something like color red. 951 00:41:46,560 --> 00:41:50,570 And then I can save this, reload, and now I've stylized the footer.
952 00:41:50,570 --> 00:41:54,130 So this is just hinting at the power of what you can do in a web page 953 00:41:54,130 --> 00:41:55,510 to change things around. 954 00:41:55,510 --> 00:41:59,080 >> And even cooler than this, if you want to poke around with actual websites, 955 00:41:59,080 --> 00:42:00,810 you can't permanently change them. 956 00:42:00,810 --> 00:42:03,640 But if I open up Chrome's Inspector again 957 00:42:03,640 --> 00:42:07,610 and I go not to the left hand side here, which shows Facebook's HTML, 958 00:42:07,610 --> 00:42:11,380 but to the right hand side, which shows all of its CSS, 959 00:42:11,380 --> 00:42:13,789 you can edit and change things on the fly. 960 00:42:13,789 --> 00:42:15,080 So let me go ahead and do this. 961 00:42:15,080 --> 00:42:18,670 >> Let me go ahead and control click on this random word here, 962 00:42:18,670 --> 00:42:21,230 Sign, and click Inspect Element. 963 00:42:21,230 --> 00:42:25,130 Chrome very conveniently jumps to the h1 tag that Facebook is using. 964 00:42:25,130 --> 00:42:27,290 And notice here Facebook has kind of lazily 965 00:42:27,290 --> 00:42:29,960 hard coded font size as a property here. 966 00:42:29,960 --> 00:42:33,530 >> So the cool thing though is that if I actually go in here 967 00:42:33,530 --> 00:42:39,560 and say, oh, Facebook, I don't like that 64 pixels, we can now change Facebook. 968 00:42:39,560 --> 00:42:42,590 Of course, we're only changing it for me personally at the moment. 969 00:42:42,590 --> 00:42:45,150 But this is just another tool in our tool kit 970 00:42:45,150 --> 00:42:48,360 that's going to allow us to tweak and figure out and also diagnose 971 00:42:48,360 --> 00:42:49,729 issues in our own web pages. 972 00:42:49,729 --> 00:42:52,270 And we could similarly go over here, which is the same thing. 973 00:42:52,270 --> 00:42:55,830 If you really want to get fancy, I mean, now you can really mutate the page 974 00:42:55,830 --> 00:42:57,380 and do crazy things.
975 00:42:57,380 --> 00:42:59,870 >> So why is this all useful? 976 00:42:59,870 --> 00:43:02,330 Well, ultimately, we're going to want to be 977 00:43:02,330 --> 00:43:07,110 able to create web pages that are driven by our own back ends, 978 00:43:07,110 --> 00:43:10,520 not by just Google and outsourcing the back end there. 979 00:43:10,520 --> 00:43:13,510 We actually want the value, for instance, 980 00:43:13,510 --> 00:43:18,830 of our search engine's action attribute to go not to someone else, 981 00:43:18,830 --> 00:43:24,270 but to something like search.php, where search.php is on our own server, 982 00:43:24,270 --> 00:43:25,670 not on someone else's. 983 00:43:25,670 --> 00:43:30,316 >> And so to get there, we actually need to introduce a new language. 984 00:43:30,316 --> 00:43:33,190 So we've already looked at one new language here, or two really, HTML 985 00:43:33,190 --> 00:43:33,700 and CSS. 986 00:43:33,700 --> 00:43:36,330 But they really are just structural and aesthetic languages. 987 00:43:36,330 --> 00:43:38,360 They're not programming languages per se. 988 00:43:38,360 --> 00:43:41,160 And that's about as much formal time as we'll spend on them. 989 00:43:41,160 --> 00:43:44,910 Because we'll begin now to transition to PHP. 990 00:43:44,910 --> 00:43:48,160 >> So PHP is an actual programming language. 991 00:43:48,160 --> 00:43:50,750 It's a scripting language in the sense that it's 992 00:43:50,750 --> 00:43:52,855 meant to be lighter weight than something like C. 993 00:43:52,855 --> 00:43:56,082 And it's an interpreted language, which means it's not compiled. 994 00:43:56,082 --> 00:43:58,790 So in a nutshell, what did it mean when we used a language like C 995 00:43:58,790 --> 00:44:00,290 and we had to compile it? 996 00:44:00,290 --> 00:44:02,120 What does it mean to compile C source code? 997 00:44:02,120 --> 00:44:03,864 998 00:44:03,864 --> 00:44:04,780 AUDIENCE: [INAUDIBLE].
999 00:44:04,780 --> 00:44:06,184 DAVID J MALAN: Say it again? 1000 00:44:06,184 --> 00:44:07,100 AUDIENCE: [INAUDIBLE]. 1001 00:44:07,100 --> 00:44:07,962 1002 00:44:07,962 --> 00:44:08,920 DAVID J MALAN: Perfect. 1003 00:44:08,920 --> 00:44:10,180 It turns it into binary. 1004 00:44:10,180 --> 00:44:14,200 It turns it into zeroes and ones from actual English-like source code. 1005 00:44:14,200 --> 00:44:16,424 And then we can actually run those zeroes and ones 1006 00:44:16,424 --> 00:44:18,840 by passing them through the CPU by double clicking an icon 1007 00:44:18,840 --> 00:44:19,980 or running a command. 1008 00:44:19,980 --> 00:44:23,770 >> PHP and Python and Ruby and Perl and JavaScript 1009 00:44:23,770 --> 00:44:26,250 and bunches of other languages are interpreted 1010 00:44:26,250 --> 00:44:29,290 languages, which is to say you do not compile them. 1011 00:44:29,290 --> 00:44:34,220 Rather, you feed them as input to a program called an interpreter. 1012 00:44:34,220 --> 00:44:36,640 And that interpreter, which someone else wrote, 1013 00:44:36,640 --> 00:44:40,930 reads your source code top to bottom, left to right and just interprets 1014 00:44:40,930 --> 00:44:43,000 those lines and does what you say. 1015 00:44:43,000 --> 00:44:45,360 >> So if it encounters a line that says print, 1016 00:44:45,360 --> 00:44:48,660 it doesn't necessarily convert print to the corresponding zeroes and ones. 1017 00:44:48,660 --> 00:44:51,910 The interpreter just has something like a big if condition that says, 1018 00:44:51,910 --> 00:44:56,110 if programmer's instruction is print, then do the following. 1019 00:44:56,110 --> 00:44:58,170 So it interprets it just by kind of reasoning 1020 00:44:58,170 --> 00:44:59,800 through what you're telling it to do. 1021 00:44:59,800 --> 00:45:01,320 >> And PHP is one of these languages. 1022 00:45:01,320 --> 00:45:05,310 And PHP years ago was designed precisely for web programming.
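The "big if condition" picture of an interpreter described above can be sketched as a toy program. This is a minimal illustration only, not how PHP's interpreter is actually built: the two-instruction language (`set` and `print`) and the function name `interpret` are made up for the example.

```python
def interpret(source: str) -> list:
    """Read a tiny made-up language top to bottom and act on each line."""
    output = []
    variables = {}
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        instruction, _, argument = line.partition(" ")
        # The "big if condition": decide what to do based on the instruction.
        if instruction == "print":
            # Print a variable's value if one exists, else the literal word.
            text = str(variables.get(argument, argument))
            print(text)
            output.append(text)
        elif instruction == "set":
            name, _, value = argument.partition(" ")
            variables[name] = value
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return output


program = """
set greeting hello
print greeting
print world
"""
result = interpret(program)  # prints hello, then world
```

Nothing here is translated into zeroes and ones ahead of time; each line is simply read and acted upon, which is the sense of "interpreted" in the lecture.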
1023 00:45:05,310 --> 00:45:08,160 And it was initially a very sloppy, messy language. 1024 00:45:08,160 --> 00:45:10,940 And indeed, there's a huge amount of bad PHP code out there. 1025 00:45:10,940 --> 00:45:13,520 But the language itself has matured over the years, 1026 00:45:13,520 --> 00:45:16,200 so much so that now it's actually a wonderful next step 1027 00:45:16,200 --> 00:45:19,970 pedagogically from C because it's so darned similar to everything 1028 00:45:19,970 --> 00:45:22,380 you've just seen in the past few weeks. 1029 00:45:22,380 --> 00:45:25,724 >> The one initial difference we'll see is there's no main function anymore. 1030 00:45:25,724 --> 00:45:28,890 When you start writing code, it's just going to get executed no matter what, 1031 00:45:28,890 --> 00:45:30,220 as we'll see in a moment. 1032 00:45:30,220 --> 00:45:33,320 Meanwhile, here's what a variable looks like in PHP. 1033 00:45:33,320 --> 00:45:35,840 It's a little different, but only barely. 1034 00:45:35,840 --> 00:45:39,380 >> In PHP, there's not strong typing. 1035 00:45:39,380 --> 00:45:41,430 There's weak typing, which just means there 1036 00:45:41,430 --> 00:45:44,030 are data types like strings and numbers and other things. 1037 00:45:44,030 --> 00:45:47,030 But you don't bother specifying what they are anymore. 1038 00:45:47,030 --> 00:45:48,980 PHP figures it out for you. 1039 00:45:48,980 --> 00:45:52,030 The dollar sign is just a decision that the PHP people made years 1040 00:45:52,030 --> 00:45:54,890 ago such that any variable in PHP just starts with a dollar sign. 1041 00:45:54,890 --> 00:45:58,130 It's actually kind of useful in that it jumps out at you a little more. 1042 00:45:58,130 --> 00:46:01,315 >> But after that, this is a condition in PHP. 1043 00:46:01,315 --> 00:46:03,140 1044 00:46:03,140 --> 00:46:04,730 What's different versus C? 1045 00:46:04,730 --> 00:46:07,180 1046 00:46:07,180 --> 00:46:09,600 Trick question-- nothing, which is actually really nice.
1047 00:46:09,600 --> 00:46:12,140 Boolean expressions in PHP-- the same. 1048 00:46:12,140 --> 00:46:19,354 Boolean expressions with and versus or, switches, loops, loops, loops-- OK, 1049 00:46:19,354 --> 00:46:20,270 this one is different. 1050 00:46:20,270 --> 00:46:22,660 >> So it turns out there's a couple of other features in PHP. 1051 00:46:22,660 --> 00:46:25,243 One of them is actually this, which is wonderfully convenient. 1052 00:46:25,243 --> 00:46:29,250 If $numbers is an array that you've declared previously in a program, 1053 00:46:29,250 --> 00:46:33,350 you have this fancy foreach construct, so that instead of doing all of that 1054 00:46:33,350 --> 00:46:37,020 annoying i equals 0, i is less than this, i++, you say 1055 00:46:37,020 --> 00:46:40,320 foreach $numbers as $number, where each of those dollar sign values is just 1056 00:46:40,320 --> 00:46:42,790 a variable, and the latter you can think of as taking the place of i. 1057 00:46:42,790 --> 00:46:44,290 You could call it anything you want. 1058 00:46:44,290 --> 00:46:45,770 I called it number. 1059 00:46:45,770 --> 00:46:48,825 This is going to iterate over the array called numbers. 1060 00:46:48,825 --> 00:46:51,200 And on each iteration, it's going to automatically update 1061 00:46:51,200 --> 00:46:54,340 for you the dollar sign number variable so that you constantly 1062 00:46:54,340 --> 00:46:58,210 have access to the variable you want without having to do any square bracket 1063 00:46:58,210 --> 00:47:00,980 notation or indexing into an array. 1064 00:47:00,980 --> 00:47:04,950 >> Beyond that, we even have things like arrays, which look almost the same, 1065 00:47:04,950 --> 00:47:08,210 except it's very common, as we'll see, both in PHP and JavaScript 1066 00:47:08,210 --> 00:47:10,750 to pre-initialize an array using square brackets. 1067 00:47:10,750 --> 00:47:12,040 C uses curly braces. 1068 00:47:12,040 --> 00:47:15,330 So it's slightly different, even though we didn't really use that trick much.
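The contrast between an index-based loop and a foreach-style loop can be sketched in Python, whose `for ... in` loop plays the same role as the foreach construct described above. The list contents and variable names are made up for illustration.

```python
# An array pre-initialized with square brackets, as is common in PHP too.
numbers = [1, 2, 3, 4]

# Index-based loop, in the style of C: i = 0; i < length; i++
squares_indexed = []
for i in range(len(numbers)):
    squares_indexed.append(numbers[i] ** 2)

# foreach-style loop: number is updated automatically on each iteration,
# so no square bracket notation or indexing is needed.
squares_foreach = []
for number in numbers:
    squares_foreach.append(number ** 2)

print(squares_indexed)  # [1, 4, 9, 16]
print(squares_foreach)  # [1, 4, 9, 16]
```

Both loops visit the same elements in the same order; the second just hides the bookkeeping.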
1069 00:47:15,330 --> 00:47:20,090 >> But even more powerfully, PHP has associative arrays, 1070 00:47:20,090 --> 00:47:23,100 which is a fancy way of saying hash tables. 1071 00:47:23,100 --> 00:47:31,610 In fact, if you want to declare a hash table in PHP, unlike in C-- how many 1072 00:47:31,610 --> 00:47:34,775 lines of code did it take to actually implement a hash table in C? 1073 00:47:34,775 --> 00:47:38,310 Or how many lines of code is it taking to implement a hash table in C? 1074 00:47:38,310 --> 00:47:39,820 So it's probably a lot, right? 1075 00:47:39,820 --> 00:47:41,680 It's a few dozen, maybe 100 or 200. 1076 00:47:41,680 --> 00:47:42,980 It's nontrivial. 1077 00:47:42,980 --> 00:47:45,420 Or it's about to be, as you'll soon see, nontrivial 1078 00:47:45,420 --> 00:47:48,080 to implement a hash table [INAUDIBLE] and also a trie. 1079 00:47:48,080 --> 00:47:50,580 But in PHP-- and frankly, I probably shouldn't tell you this 1080 00:47:50,580 --> 00:47:53,630 until Monday-- in PHP, if you want a table, done. 1081 00:47:53,630 --> 00:47:56,431 That's a hash table-- so with one line of code. 1082 00:47:56,431 --> 00:47:56,930 And 1083 00:47:56,930 --> 00:47:58,810 >> A lot of languages do that. 1084 00:47:58,810 --> 00:48:00,190 Have fun with pset five. 1085 00:48:00,190 --> 00:48:01,980 So a lot of languages do this. 1086 00:48:01,980 --> 00:48:03,050 1087 00:48:03,050 --> 00:48:06,140 They give you these abstractions that other people, other programmers, 1088 00:48:06,140 --> 00:48:09,870 have created for you so that you can stand on their shoulders 1089 00:48:09,870 --> 00:48:13,290 and start using ideas that are super compelling, like hash tables and trees 1090 00:48:13,290 --> 00:48:14,140 and tries. 1091 00:48:14,140 --> 00:48:17,790 But you don't necessarily have to implement those things yourself.
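The "hash table in one line" idea is not unique to PHP. As a rough sketch, Python's built-in `dict` offers the same abstraction that PHP's associative arrays do; the words and the variable name `counts` below are made up for illustration.

```python
# One line gives you a ready-made hash table.
counts = {}

# Insert or update entries by key, with no hash function or linked lists to write.
for word in ["cat", "dog", "cat"]:
    counts[word] = counts.get(word, 0) + 1

print(counts["cat"])     # 2
print("dog" in counts)   # True
print("fish" in counts)  # False
```

The hashing, collision handling, and resizing all still happen, but inside code that someone else already wrote.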
1092 00:48:17,790 --> 00:48:20,850 >> And so ultimately, what we're going to use PHP for 1093 00:48:20,850 --> 00:48:23,580 is potentially writing programs at the so-called command line. 1094 00:48:23,580 --> 00:48:26,600 We could recreate every program we've written this semester thus far, 1095 00:48:26,600 --> 00:48:30,410 except maybe Breakout, which uses SPL, which is specific to C at the moment. 1096 00:48:30,410 --> 00:48:33,100 But every other problem set, certainly Mario and Caesar 1097 00:48:33,100 --> 00:48:35,300 and Vigenere and Crack and onward, we 1098 00:48:35,300 --> 00:48:39,520 could re-implement in PHP, and probably a little more easily. 1099 00:48:39,520 --> 00:48:43,050 >> But what we're ultimately going to use PHP for is web programming. 1100 00:48:43,050 --> 00:48:46,420 And we're going to introduce next week a mental model, a paradigm called 1101 00:48:46,420 --> 00:48:49,610 MVC, model view controller, which if you've done programming 1102 00:48:49,610 --> 00:48:51,610 before in Python or Ruby or elsewhere, you 1103 00:48:51,610 --> 00:48:54,112 might know of this theme with Rails and Django and the like. 1104 00:48:54,112 --> 00:48:55,820 But if you're new to this too, you'll see 1105 00:48:55,820 --> 00:48:59,652 that this is actually a very natural extension of the factorization 1106 00:48:59,652 --> 00:49:01,360 and the sort of design of code that we've 1107 00:49:01,360 --> 00:49:04,670 been doing in C. We're going to now apply some of those lessons to PHP 1108 00:49:04,670 --> 00:49:07,190 so that ultimately, we are implementing our own websites.
1109 00:49:07,190 --> 00:49:09,080 And if you're sort of mesmerized or amazed 1110 00:49:09,080 --> 00:49:10,954 that we're going to do all of this so quickly, 1111 00:49:10,954 --> 00:49:13,410 realize that almost every semester, nearly 90% 1112 00:49:13,410 --> 00:49:16,560 of students in CS50, including those who have never programmed before, 1113 00:49:16,560 --> 00:49:20,329 end up making final projects that are based on web programming. 1114 00:49:20,329 --> 00:49:23,120 And so you will see that the returns are high in the weeks to come. 1115 00:49:23,120 --> 00:49:24,965 So we will see you then on Monday. 1116 00:49:24,965 --> 00:49:27,260 1117 00:49:27,260 --> 00:49:30,120 >> SPEAKER 1: And now, Deep Thoughts by Daven Farnham. 1118 00:49:30,120 --> 00:49:34,055 1119 00:49:34,055 --> 00:49:34,780 Hash tables. 1120 00:49:34,780 --> 00:49:37,180 1121 00:49:37,180 --> 00:49:38,402 >> [LAUGHTER] 1122 00:49:38,402 --> 00:49:38,902